AI · Neutral · arXiv – CS AI · 5h ago · 6/10
The Fine-Tuning Trap: Evaluating Negative Transfer and the Role of PEFT in Sub-1B Mathematical Reasoning
Researchers benchmarked five sub-1B language models on mathematical reasoning and found that full fine-tuning degrades performance on models under 300M parameters, pushing accuracy below their zero-shot baselines. Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA and DoRA prove necessary for stable training. Each has task-specific strengths, and on the smallest architectures they outperform full fine-tuning and sometimes match in-context learning.