AI · Bearish · arXiv – CS AI · 6h ago
Knowledge without Wisdom: Measuring Misalignment between LLMs and Intended Impact
Research shows that leading foundation models (LLMs) perform poorly on real-world educational tasks despite excelling on AI benchmarks. The study found that 50% of misalignment errors are shared across models due to common pretraining approaches, and that ensembling models actually worsens performance on learning outcomes.