#agent-research News & Analysis

3 articles tagged with #agent-research. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research

MobileGym is a new browser-based simulation platform designed to accelerate mobile GUI agent research by enabling verifiable outcomes and scalable parallel training. The platform supports 416 parameterized tasks across 28 apps and demonstrates strong sim-to-real transfer, with a trained model retaining 95.1% of simulation gains on real devices.

AI · Bullish · arXiv – CS AI · Jun 4 · 6/10

Online Skill Learning for Web Agents via State-Grounded Dynamic Retrieval

Researchers introduce State-Grounded Dynamic Retrieval (SGDR), a new method that lets language agents dynamically reuse learned skills during web automation tasks. By matching skills to both the task goal and the current webpage state rather than relying on a fixed skill set, SGDR achieves a 10.6% relative performance gain over existing approaches on complex multi-step web tasks.

🧠 GPT-4

AI · Neutral · arXiv – CS AI · May 12 · 6/10

OPT-BENCH: Evaluating the Iterative Self-Optimization of LLM Agents in Large-Scale Search Spaces

Researchers introduce OPT-BENCH, a benchmark that evaluates whether large language models can self-improve through iterative feedback in complex problem spaces. Tests of 19 LLMs on machine learning and NP-hard problems show that while stronger models adapt better, even the most advanced systems remain constrained by their base capabilities and fall short of human expert performance.