#synthetic-data-bias News & Analysis

2 articles tagged with #synthetic-data-bias. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · May 1 · 7/10

LLM Biases

Researchers identify four systematic bias channels in transformer-based AI recommenders: positional bias favoring recent events, popularity amplification creating echo chambers, latent driver bias from unobserved user motivations, and synthetic data bias from retraining on AI-generated logs. These mechanism-level risks can distort user exposure and choice at scale, potentially reducing reliability despite strong offline performance metrics.

AI · Bearish · arXiv – CS AI · Jun 10 · 6/10

RealMath-Eval: Why SOTA Judges Struggle with Real Human Reasoning

Researchers introduce RealMath-Eval, a benchmark showing that state-of-the-art LLM judges fail to accurately evaluate authentic student mathematical reasoning: their mean squared error (MSE) is significantly worse on real exam responses (~2.96) than on synthetic LLM-generated solutions (~1.17). The study attributes this "Evaluation Gap" to human errors occupying a more diverse semantic space than the predictable patterns found in synthetic errors.