#training-objectives News & Analysis

5 articles tagged with #training-objectives. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Jun 11 · 7/10

From Architecture to Output: Structural Origins of Hallucination in Large Language Models and the Amplifying Role of Data

Researchers identify three core architectural mechanisms in large language models that systematically produce hallucinations: self-attention's statistical confusion of entities, maximum likelihood training that rewards plausible-sounding falsehoods, and autoregressive decoding that cascades errors forward. Dataset quality issues amplify rather than originate these failures, suggesting that fixing hallucinations requires architectural redesign, not just better training data.

AI · Bullish · arXiv – CS AI · Jun 10 · 7/10

When Distance Distracts: Representation Distance Bias in BT-Loss for Reward Models

Researchers identify a critical bias in Bradley-Terry loss, the standard objective for training reward models in LLM alignment, where gradient magnitudes are distorted by representation distance rather than prediction error. They propose NormBT, a lightweight normalization scheme that refocuses learning on actual ranking mistakes, demonstrating 5%+ improvements on fine-grained reasoning benchmarks.
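For readers unfamiliar with the objective, here is a minimal sketch of the standard Bradley-Terry pairwise loss on scalar rewards. It does not implement NormBT, whose normalization is not described in this summary; the function names `bt_loss` and `bt_grad_wrt_margin` are hypothetical and chosen only for illustration.

```python
import math


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def bt_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry loss: -log(sigmoid(r_chosen - r_rejected))."""
    margin = r_chosen - r_rejected
    # -log(sigmoid(m)) = log(1 + exp(-m)), written to avoid overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def bt_grad_wrt_margin(r_chosen: float, r_rejected: float) -> float:
    """Derivative of the loss with respect to the reward margin."""
    margin = r_chosen - r_rejected
    # d/dm [-log(sigmoid(m))] = -(1 - sigmoid(m)) = -sigmoid(-m)
    return -_sigmoid(-margin)


if __name__ == "__main__":
    # A correctly ranked pair yields a smaller loss than a misranked one.
    print(bt_loss(2.0, 0.0), bt_loss(0.0, 2.0))
    print(bt_grad_wrt_margin(0.0, 0.0))
```

In this scalar form the gradient depends only on the reward margin. The bias the paper describes would arise further upstream, where the margin is computed from learned representations.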

AI · Neutral · arXiv – CS AI · Jun 2 · 7/10

Global Geometry Is Not Enough for Vision Representations

Researchers demonstrate that global embedding geometry, the standard metric for evaluating vision model representations, fails to predict compositional binding capabilities. Functional sensitivity measured through input-output Jacobians proves far more reliable. The results indicate that current training objectives optimize embedding geometry while leaving the local input-output mapping unconstrained, which suggests that representation learning needs an evaluation framework that also accounts for local sensitivity.
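To make "input-output Jacobian" concrete, the sketch below estimates a Jacobian by central finite differences and summarizes it with a Frobenius norm. This is an illustrative assumption, not the paper's measurement procedure: `numerical_jacobian`, `sensitivity`, and the toy function `toy_model` are hypothetical names, and a real vision model would normally use automatic differentiation instead.

```python
import math
from typing import Callable, List

Vector = List[float]


def numerical_jacobian(f: Callable[[Vector], Vector], x: Vector,
                       eps: float = 1e-6) -> List[Vector]:
    """Central-difference estimate of J[i][j] = d f_i / d x_j at x."""
    n_out = len(f(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        y_plus, y_minus = f(x_plus), f(x_minus)
        for i in range(n_out):
            jac[i][j] = (y_plus[i] - y_minus[i]) / (2.0 * eps)
    return jac


def sensitivity(jac: List[Vector]) -> float:
    """Frobenius norm of the Jacobian, one possible scalar summary."""
    return math.sqrt(sum(v * v for row in jac for v in row))


def toy_model(x: Vector) -> Vector:
    """Stand-in for an encoder: maps 2 inputs to 2 outputs."""
    return [x[0] * x[1], x[0] + x[1]]


if __name__ == "__main__":
    jac = numerical_jacobian(toy_model, [2.0, 3.0])
    print(jac, sensitivity(jac))
```

Two models can place embeddings in similar global arrangements while differing in how strongly each output responds to small input changes, which is the quantity a Jacobian captures.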

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

How Transformers Learn to Plan via Multi-Token Prediction

Researchers demonstrate that multi-token prediction (MTP) outperforms standard next-token prediction (NTP) for training language models on reasoning tasks like planning and pathfinding. Through theoretical analysis of simplified Transformers, they reveal that MTP enables a reverse reasoning process where models first identify end states then reconstruct paths backward, suggesting MTP induces more interpretable and robust reasoning circuits.
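The difference between the two objectives can be illustrated by the training targets each one builds from a sequence. The sketch below is a simplified assumption about how such targets might be laid out, not the paper's training setup; `prediction_targets` and the `horizon` parameter are hypothetical names.

```python
from typing import List, Sequence, Tuple


def prediction_targets(tokens: Sequence[str],
                       horizon: int) -> List[Tuple[List[str], List[str]]]:
    """Pair each prefix with the next `horizon` tokens.

    horizon=1 corresponds to next-token prediction (NTP);
    horizon>1 corresponds to multi-token prediction (MTP).
    Prefixes without a full horizon of future tokens are skipped.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    examples = []
    for t in range(len(tokens) - horizon):
        prefix = list(tokens[: t + 1])
        future = list(tokens[t + 1 : t + 1 + horizon])
        examples.append((prefix, future))
    return examples


if __name__ == "__main__":
    path = ["A", "B", "C", "D"]  # e.g. nodes along a route from A to D
    for prefix, future in prediction_targets(path, horizon=1):
        print("NTP", prefix, "->", future)
    for prefix, future in prediction_targets(path, horizon=2):
        print("MTP", prefix, "->", future)
```

With a horizon of 2, the prefix `["A"]` must already account for `"C"`, so the training signal at each position reaches further toward the end of the path than it does under NTP.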

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

LLMs Should Incorporate Explicit Mechanisms for Human Empathy

Researchers argue that Large Language Models lack explicit empathy mechanisms, systematically failing to preserve human perspectives, affect, and context despite strong benchmark performance. The paper identifies four recurring empathic failures—sentiment attenuation, granularity mismatch, conflict avoidance, and linguistic distancing—and proposes empathy-aware objectives as essential components of LLM development.