AI · Neutral · Apple Machine Learning · 1d ago · 6/10
Cram Less to Fit More: Training Data Pruning Improves Memorization of Facts
Researchers present a data pruning technique that improves how large language models memorize factual knowledge by optimizing training data distribution. The work, grounded in information-theoretic analysis, addresses the gap between theoretical model capacity and actual factual accuracy, offering practical methods to reduce hallucinations in knowledge-intensive tasks.