y0news

#jd-commerce News & Analysis

1 article tagged with #jd-commerce. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

1 article
AI · Bullish · arXiv – CS AI · 6h ago · 7/10

AGPO: Asymmetric Group Policy Optimization for Verifiable Reasoning and Search Ads Relevance at JD

Researchers introduce Asymmetric Group Policy Optimization (AGPO), a reinforcement learning method that improves LLM reasoning by preventing capability collapse while focusing on rare correct solutions. The technique achieves state-of-the-art performance on mathematical benchmarks and has been deployed in JD's search ads relevance system, indicating that it is practical in an industrial setting.