🧠 AI · 🟢 Bullish · Importance 7/10

Memorize Theorems, Not Instances: Probing SFT Generalization through Mathematical Reasoning

arXiv – CS AI | Ruiying Peng, Mengyu Yang, Jing Lei, Xiaohui Li, Xueyu Wu, Xinlei Chen
🤖 AI Summary

Researchers propose Theorem-SFT, a supervised fine-tuning approach that teaches language models to apply mathematical rules explicitly rather than memorize surface-level correlations between problems and solutions. The method delivers significant performance improvements across benchmarks and indicates that generalization failures stem from memorizing the wrong targets, not from memorization itself, with reasoning capability concentrating in the feed-forward layers.

Analysis

This research addresses a fundamental challenge in large language model training: the brittleness of models fine-tuned on task-specific data. While conventional wisdom attributes generalization failures to memorization, this work reframes the problem as memorizing the wrong targets. Vanilla SFT encourages models to exploit spurious correlations in problem-solution pairs, making them vulnerable to minor input variations—a critical limitation for mathematical and scientific reasoning tasks.

Theorem-SFT redirects the learning objective toward explicit rule application, fundamentally altering what models encode during training. By teaching theorem invocation rather than answer patterns, the approach achieves substantial gains: 8.8% improvement on MATH benchmarks and 20.27% on GeoQA without domain-specific retraining. This cross-domain effectiveness suggests the method captures something more generalizable than surface-level pattern matching.
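
To make the contrast concrete, here is a minimal sketch of how a theorem-style training target might be assembled compared with a vanilla problem-to-answer target. The field names and formatting are illustrative assumptions, not the paper's actual data schema.

```python
# Hypothetical comparison of SFT target formats (illustrative only).

def vanilla_sft_target(problem: str, answer: str) -> str:
    # Vanilla SFT: the model learns problem -> answer, which invites
    # shortcut correlations between surface features and final answers.
    return f"Problem: {problem}\nAnswer: {answer}"

def theorem_sft_target(problem: str, theorem: str, application: str, answer: str) -> str:
    # Theorem-style SFT: the supervision explicitly names the rule being
    # invoked and shows how it is applied before stating the answer.
    return (
        f"Problem: {problem}\n"
        f"Theorem invoked: {theorem}\n"
        f"Application: {application}\n"
        f"Answer: {answer}"
    )

example = theorem_sft_target(
    problem="In a right triangle the legs are 3 and 4. Find the hypotenuse.",
    theorem="Pythagorean theorem: a^2 + b^2 = c^2 for a right triangle.",
    application="c = sqrt(3^2 + 4^2) = sqrt(25) = 5.",
    answer="5",
)
print(example)
```

Under this framing, the supervised string itself carries the rule, so the gradient signal rewards invoking the theorem rather than matching the answer pattern.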

The architectural findings are particularly significant for the AI community. Fine-tuning only MLP (feed-forward) layers achieves comparable performance to full-network training, suggesting that reasoning rules concentrate in these components while attention mechanisms handle other aspects. This mechanistic insight could streamline future training pipelines and reduce computational costs.
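
A minimal sketch of what MLP-only fine-tuning could look like in practice: freeze every parameter of a decoder-only transformer except the feed-forward blocks. The ".mlp." substring match is an assumption that holds for common Hugging Face architectures (e.g. GPT-2, Llama); verify the parameter names for your model. This illustrates the idea, not the paper's training code.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")  # any causal LM

trainable, frozen = 0, 0
for name, param in model.named_parameters():
    # Keep feed-forward sublayers trainable; freeze attention, embeddings, norms.
    if ".mlp." in name:
        param.requires_grad = True
        trainable += param.numel()
    else:
        param.requires_grad = False
        frozen += param.numel()

print(f"trainable params: {trainable:,} / frozen params: {frozen:,}")
# The model can now be handed to a standard SFT loop (e.g. transformers.Trainer);
# only the MLP weights receive gradient updates.
```

Because only a fraction of the weights receive gradients, optimizer state and memory shrink accordingly, which is where the potential training-cost savings come from.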

For developers building reasoning-dependent systems, this research implies that training methodology matters as much as data quality. The approach could enhance reliability of AI systems in mathematics, geometry, and logical reasoning—domains where generalization across problem variants is essential. The findings also inform broader discussions about scaling laws and model interpretability, suggesting that how we supervise learning determines not just performance but the robustness of learned capabilities.

Key Takeaways
  • Theorem-SFT teaches models explicit rule application rather than surface patterns, yielding gains of 8.8% on MATH and 20.27% on GeoQA.
  • Generalization failures stem from memorizing spurious correlations, not from memorization as a mechanism.
  • Feed-forward MLP layers are the primary locus of reasoning rules, with fine-tuning MLPs alone matching full-network performance.
  • The method generalizes across model families and modalities without domain-specific retraining.
  • Training methodology directly impacts reasoning robustness and generalization to input variations.