y0news

#bradley-terry-loss News & Analysis

1 article tagged with #bradley-terry-loss. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 7h ago · 7/10

When Distance Distracts: Representation Distance Bias in BT-Loss for Reward Models

Researchers identify a critical bias in Bradley-Terry (BT) loss, the standard objective for training reward models in LLM alignment, where gradient magnitudes are distorted by representation distance rather than prediction error. They propose NormBT, a lightweight normalization scheme that refocuses learning on actual ranking mistakes, and report improvements of more than 5% on fine-grained reasoning benchmarks.
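
To make the claimed bias concrete, here is a minimal sketch, assuming a reward model with a linear head `r(x) = w · h(x)` over a representation `h(x)`. That setup, the function names, and the toy vectors are illustrative assumptions, not taken from the paper, and the sketch does not implement NormBT itself. Under this assumption the BT gradient with respect to `w` factors into a prediction-error term times the representation difference, so two pairs with identical prediction error can still produce very different gradient norms.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bt_loss(r_chosen, r_rejected):
    # Standard Bradley-Terry loss: -log sigmoid(r_chosen - r_rejected)
    return -math.log(sigmoid(r_chosen - r_rejected))

def bt_grad_linear_head(w, h_chosen, h_rejected):
    """Gradient of the BT loss w.r.t. w for a hypothetical linear head r = w . h.

    Returns (gradient, prediction_error), where
    prediction_error = 1 - sigmoid(r_chosen - r_rejected).
    """
    diff = [c - r for c, r in zip(h_chosen, h_rejected)]
    margin = sum(wi * di for wi, di in zip(w, diff))
    error = 1.0 - sigmoid(margin)
    # d/dw [-log sigmoid(w . diff)] = -(1 - sigmoid(w . diff)) * diff
    grad = [-error * di for di in diff]
    return grad, error

def norm(v):
    return math.sqrt(sum(x * x for x in v))

# Toy example (made-up numbers): with w = 0 both pairs have the same
# prediction error (0.5), but their representations differ in distance.
w = [0.0, 0.0]
near_grad, near_err = bt_grad_linear_head(w, [1.0, 0.0], [0.9, 0.0])
far_grad, far_err = bt_grad_linear_head(w, [1.0, 0.0], [-1.0, 0.0])

print("prediction error:", near_err, far_err)
print("gradient norm:   ", norm(near_grad), norm(far_grad))
```

In this toy setting the gradient norm equals the prediction error multiplied by the distance between the two representations, which is the kind of distance-driven scaling the summary describes. A normalization scheme would aim to remove that second factor; how NormBT does this is not specified in the summary above.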