AI · Neutral · arXiv – CS AI · 9h ago · 6/10
CoT-Space: A Theoretical Framework for Internal Slow-Thinking via Reinforcement Learning
Researchers introduce CoT-Space, a theoretical framework that explains how Large Language Models (LLMs) improve their reasoning through multi-step Chain-of-Thought processes via reinforcement learning. The framework models reasoning as an optimization problem in a continuous semantic space and shows that an optimal reasoning length emerges naturally from the trade-off between underfitting and overfitting. This gives a principled foundation for understanding test-time scaling in modern LLMs.