y0news
#benchmark-improvement
2 articles
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 5

Towards Autonomous Memory Agents

Researchers introduce U-Mem, an autonomous memory agent system that actively acquires and validates knowledge for large language models. The system uses cost-aware knowledge extraction and semantic Thompson sampling to improve performance, showing significant gains on benchmarks like HotpotQA and AIME25.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 8

AgentDropoutV2: Optimizing Information Flow in Multi-Agent Systems via Test-Time Rectify-or-Reject Pruning

Researchers propose AgentDropoutV2, a test-time framework that optimizes multi-agent systems by dynamically correcting or removing erroneous agent outputs, with no retraining required. The system acts as an active firewall with retrieval-augmented rectification, achieving an accuracy gain of 6.3 percentage points on math benchmarks while preventing errors from propagating between AI agents.