AI · Bearish · arXiv – CS AI · Mar 36/106
Knowledge without Wisdom: Measuring Misalignment between LLMs and Intended Impact
Research shows that leading foundation models (LLMs) perform poorly on real-world educational tasks despite excelling on AI benchmarks. The study found that 50% of misalignment errors are shared across models because of common pretraining approaches, and that model ensembles actually worsen performance on learning outcomes.