#complex-reasoning News & Analysis

2 articles tagged with #complex-reasoning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

EvoMAS: Learning Execution-Time Workflows for Multi-Agent Systems

Researchers introduce EvoMAS, a framework that constructs multi-agent workflows dynamically during task execution instead of relying on static, pre-optimized designs. A Planner-Evaluator-Updater pipeline assesses the task state and adapts agent coordination across execution stages. The authors report that it outperforms existing approaches on complex reasoning tasks.

AI · Neutral · arXiv – CS AI · Apr 14 · 7/10

The Amazing Agent Race: Strong Tool Users, Weak Navigators

Researchers introduce The Amazing Agent Race (AAR), a new benchmark showing that LLM agents excel at tool use but struggle with navigation. Across 1,400 complex, graph-structured puzzles tested on three agent frameworks, the best one reaches only 37.2% accuracy. Navigation errors account for 27-52% of failures, while tool-use failures stay below 17%, which exposes a critical blind spot in existing linear benchmarks.
