Beyond Semantics: The Unreasonable Effectiveness of Reasonless Intermediate Tokens
A new arXiv study challenges the assumption that Chain of Thought reasoning traces in large language models reflect genuine internal reasoning processes. Researchers found that models trained on corrupted, semantically meaningless intermediate steps perform comparably to those trained on correct reasoning traces, suggesting that intermediate tokens function more as statistical patterns than transparent reasoning proxies.