#agent-engineering News & Analysis

3 articles tagged with #agent-engineering. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Jun 23 · 7/10

From Question Answering to Task Completion: A Survey on Agent System and Harness Design

A comprehensive survey examines LLM-based agent systems through a model-harness lens, arguing that agent performance depends on the interaction between foundation models, execution infrastructure, and task structure rather than model capabilities alone. The research identifies six core runtime responsibilities and maps how different harness configurations affect long-horizon task completion, efficiency, and reliability.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories

SkillAdaptor introduces a training-free framework for refining the external skills used by LLM agents, relying on step-level failure attribution instead of trajectory-level feedback. The method shows consistent improvements across three evaluation benchmarks (WebShop, PinchBench, Claw-Eval), with gains of up to 1.8 points, and offers more stable and auditable skill maintenance for autonomous agent systems.

🧠 GPT-5

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Agent^2 RL-Bench: Can LLM Agents Engineer Agentic RL Post-Training?

Researchers introduce Agent^2 RL-Bench, a benchmark testing whether LLM agents can autonomously design and execute reinforcement learning pipelines to improve foundation models. Testing across multiple agent systems reveals significant performance variation, with online RL succeeding primarily on ALFWorld while supervised learning pipelines dominate under fixed computational budgets.