Hindsight Credit Assignment for Long-Horizon LLM Agents
Researchers introduced HCAPO, a new framework that uses hindsight credit assignment to improve Large Language Model agents' performance in long-horizon tasks. The system leverages LLMs as post-hoc critics to refine decision-making, achieving 7.7% and 13.8% improvements over existing methods on WebShop and ALFWorld benchmarks respectively.