y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#agent-research News & Analysis

1 article tagged with #agent-research. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

1 articles
AINeutralarXiv – CS AI · 10h ago6/10
🧠

OPT-BENCH: Evaluating the Iterative Self-Optimization of LLM Agents in Large-Scale Search Spaces

Researchers introduce OPT-BENCH, a benchmark evaluating whether large language models can self-improve through iterative feedback in complex problem spaces. Testing 19 LLMs across machine learning and NP-hard problems reveals that while stronger models adapt better, even the most advanced systems remain constrained by their base capabilities and fall short of human expert performance.