AI · Bullish · arXiv – CS AI · 6h ago · 7/10
ReFlect: An Effective Harness System for Complex Long-Horizon LLM Reasoning
ReFlect is a training-free harness that wraps around an LLM to detect and recover from reasoning failures in complex, multi-step tasks. Tests across six models show significant improvements in task success rates, with gains inversely correlated with baseline performance: the weaker a model's baseline, the more it benefits. The results also reveal limitations in how smaller models handle structured reasoning.
GPT-4 · Claude · Sonnet