AIBearish · arXiv CS AI · 7h ago · 6/10
Junk DNA Hypothesis: Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs
The researchers challenge the conventional wisdom that large language models carry a large fraction of redundant, safely prunable parameters, showing instead that small-magnitude pre-trained weights encode knowledge crucial to difficult downstream tasks. Pruning these weights degrades performance irreversibly: continued training cannot recover it, and the damage grows monotonically with task difficulty.
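For context, the operation in question is standard magnitude pruning, i.e. zeroing the smallest-magnitude weights. Below is a minimal sketch in PyTorch, assuming per-layer pruning of Linear weights; the function name and per-layer ratio are illustrative choices, not the paper's actual code.

```python
# Minimal magnitude-pruning sketch (illustrative, not the paper's code).
import torch
import torch.nn as nn

def magnitude_prune(model: nn.Module, sparsity: float = 0.5) -> nn.Module:
    """Zero out the `sparsity` fraction of smallest-magnitude weights in
    every Linear layer. The Junk DNA claim is that precisely these small
    weights carry the knowledge needed for hard downstream tasks."""
    with torch.no_grad():
        for module in model.modules():
            if isinstance(module, nn.Linear):
                w = module.weight
                k = int(sparsity * w.numel())
                if k == 0:
                    continue
                # The k-th smallest absolute value sets the pruning threshold.
                threshold = w.abs().flatten().kthvalue(k).values
                # Keep only weights strictly above the threshold; zero the rest.
                w.mul_((w.abs() > threshold).to(w.dtype))
    return model

# e.g. magnitude_prune(model, sparsity=0.6) zeroes 60% of each layer's weights.
```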