y0news
#trajectory-control
1 article
AI · Bullish · arXiv – CS AI · 6h ago

Learning Structured Reasoning via Tractable Trajectory Control

Researchers propose Ctrl-R, a new framework that improves large language models' reasoning abilities by systematically discovering and reinforcing diverse reasoning patterns through structured trajectory control. The method enables better exploration of complex reasoning behaviors and shows consistent improvements across mathematical reasoning tasks in both language and vision-language models.