AI · Neutral · arXiv – CS AI · 7h ago · 6/10
Soft-Prompt Tuning for Fair and Efficient LLM Benchmark Evaluation
Researchers propose soft-prompt tuning, a parameter-efficient method that adapts large language models to benchmark formatting requirements by optimizing only 0.0006% of model parameters. This technique reveals that benchmark scores often underestimate base model knowledge due to formatting constraints, enabling fairer evaluation across different model architectures and pre-training approaches.
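To give a feel for how small a 0.0006% footprint is, here is a minimal arithmetic sketch. It assumes the soft prompt is a set of learned embedding vectors, one per virtual token, prepended to the input while every original model weight stays frozen. The model size, hidden size, and prompt length below are hypothetical values chosen for illustration, not figures from the paper, and `soft_prompt_fraction` is an illustrative helper name.

```python
def soft_prompt_fraction(num_prompt_tokens: int, hidden_size: int, total_params: int) -> float:
    """Return trainable soft-prompt parameters as a fraction of the frozen model's parameters."""
    # Assumption: one learned embedding vector of length hidden_size per virtual token.
    trainable = num_prompt_tokens * hidden_size
    return trainable / total_params


# Hypothetical configuration (not from the paper):
# an 8-billion-parameter model, hidden size 4096, 12 virtual prompt tokens.
fraction = soft_prompt_fraction(num_prompt_tokens=12, hidden_size=4096, total_params=8_000_000_000)
percent = fraction * 100

print(f"{12 * 4096} trainable parameters, {percent:.4f}% of the model")
# prints "49152 trainable parameters, 0.0006% of the model"
```

Under these assumed numbers, about a dozen virtual tokens is enough to land at the reported order of magnitude, which is consistent with the idea that the tuned prompt teaches the output format rather than new knowledge.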
🏢 Meta