🧠 AI · ⚪ Neutral · Importance 7/10

What Scales in Cross-Entropy Scaling Law?

arXiv – CS AI | Junxi Yan, Zixi Wei, Qingyao Ai, Yiqun Liu, Jingtao Zhan
🤖 AI Summary

Researchers found that the traditional cross-entropy scaling law for large language models breaks down at very large scales because only one of its components, error-entropy, actually follows power-law scaling, while the other components stay roughly constant. This explains why performance improvements become less predictable as models grow larger, and it motivates a new error-entropy scaling law that describes LLM behavior more accurately at scale.

Key Takeaways
  • Cross-entropy scaling law fails at very large model scales, causing unpredictable performance improvements.
  • Cross-entropy can be decomposed into three components: Error-Entropy, Self-Alignment, and Confidence (sketched after this list).
  • Only error-entropy follows robust power-law scaling, while the other components remain largely invariant.
  • Error-entropy dominates in small models but diminishes proportionally as models grow larger.
  • The new error-entropy scaling law provides more accurate predictions of large language model behavior.
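
To make the decomposition concrete, here is a schematic form; the symbols N, E, S, C, a, and α are our shorthand for illustration, not the paper's notation or fitted values:

    L_CE(N) = E(N) + S + C,   with   E(N) ≈ a · N^(−α)   and   S, C ≈ constant,

where N is the model's parameter count, E(N) is the error-entropy term, and S (self-alignment) and C (confidence) are the components reported to stay largely invariant. On this reading, a single power law for total cross-entropy fits well only while E(N) dominates the sum; once E(N) shrinks to the same order as S + C, the total loss flattens and predictions from the old law drift, which matches the unpredictability at very large scales described above.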