#evolutionary-algorithm (1 article)
AI · Bullish · arXiv – CS AI · Feb 27 · 6/107

Duel-Evolve: Reward-Free Test-Time Scaling via LLM Self-Preferences

Researchers introduce Duel-Evolve, an optimization algorithm that improves LLM performance at test time without requiring external rewards or labels. Instead of a reward model, the method relies on pairwise comparisons that the LLM generates itself. It reportedly improved accuracy by 20 percentage points on MathBench and by 12 percentage points on LiveCodeBench.
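To make the idea of evolving candidates from pairwise preferences alone more concrete, here is a minimal Python sketch. It is not the paper's algorithm: the function names (`round_robin_wins`, `evolve_with_duels`, `toy_mutate`, `toy_prefer`), the round-robin tally, and the keep-the-top-half selection rule are all assumptions chosen for illustration. In a real setting, `mutate` and `prefer` would presumably be LLM calls. Here they are toy stand-ins over integers so the block runs on its own.

```python
import random
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def round_robin_wins(pop: List[T], prefer: Callable[[T, T], bool]) -> List[int]:
    """Count duel wins per candidate; prefer(a, b) is True when a beats b."""
    wins = [0] * len(pop)
    for i in range(len(pop)):
        for j in range(i + 1, len(pop)):
            if prefer(pop[i], pop[j]):
                wins[i] += 1
            else:
                wins[j] += 1
    return wins


def evolve_with_duels(
    seeds: List[T],
    mutate: Callable[[T, random.Random], T],
    prefer: Callable[[T, T], bool],
    rounds: int = 10,
    rng: Optional[random.Random] = None,
) -> T:
    """Hypothetical evolutionary loop driven only by pairwise comparisons.

    No scalar reward is ever computed: selection uses duel win counts.
    """
    rng = rng or random.Random(0)
    pop = list(seeds)
    size = len(pop)
    for _ in range(rounds):
        wins = round_robin_wins(pop, prefer)
        # Rank candidates by number of duels won (most wins first).
        order = sorted(range(size), key=lambda i: wins[i], reverse=True)
        survivors = [pop[i] for i in order[: max(1, size // 2)]]
        # Refill the population with mutated copies of survivors.
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(size - len(survivors))
        ]
        pop = survivors + children
    final_wins = round_robin_wins(pop, prefer)
    return pop[max(range(size), key=lambda i: final_wins[i])]


# Toy stand-ins (assumptions, not LLM calls): candidates are integers and
# the "judge" prefers whichever one is closer to a hidden target of 42.
def toy_mutate(x: int, rng: random.Random) -> int:
    return x + rng.choice([-3, -1, 1, 3])


def toy_prefer(a: int, b: int) -> bool:
    return abs(a - 42) <= abs(b - 42)


if __name__ == "__main__":
    best = evolve_with_duels(
        [0, 5, 90, 100], toy_mutate, toy_prefer, rounds=30, rng=random.Random(0)
    )
    print(best)
```

Because the top-ranked candidates survive each round unchanged, the best candidate never gets worse under a consistent judge. An LLM judging its own outputs can be noisy or intransitive, so a practical version would likely need a more robust way to aggregate comparisons than a raw win count.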