#arc-agi News & Analysis

6 articles tagged with #arc-agi. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

Workspace Optimization: How to Train Your Agent

Researchers propose workspace optimization, a novel training approach for AI agents that evolves external structured environments rather than model weights. The DreamTeam multi-agent system demonstrates the concept on the ARC-AGI-3 benchmark, achieving 38.4% accuracy, a 2.4-point improvement over the previous state of the art, while reducing computational actions by 31%.
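
A quick sketch of the arithmetic behind those figures, since a gain in percentage points is easy to confuse with a relative gain. The 36.0% baseline is derived from the two numbers in the summary and is not a figure reported here; the function names are hypothetical.

```python
def implied_baseline(new_score_pct: float, gain_points: float) -> float:
    """Previous score implied by a new score and a gain in percentage points."""
    return new_score_pct - gain_points


def relative_gain_pct(new_score_pct: float, gain_points: float) -> float:
    """The same gain expressed relative to the implied previous score."""
    return gain_points / implied_baseline(new_score_pct, gain_points) * 100


# Figures taken from the summary: 38.4% accuracy, 2.4 points above prior best.
baseline = implied_baseline(38.4, 2.4)
relative = relative_gain_pct(38.4, 2.4)
print(f"implied previous best: {baseline:.1f}%")
print(f"relative improvement:  {relative:.1f}%")
```

Under these assumptions, the 2.4-point gain corresponds to a relative improvement of roughly 6.7% over an implied 36.0% baseline.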

AI · Bearish · Decrypt · Mar 26 · 7/10

Is AGI Here? Not Even Close, New AI Benchmark Suggests

ARC-AGI-3, a new AI benchmark released the same week Jensen Huang claimed AGI had been achieved, shows leading AI models performing dramatically poorly. Humans scored 100% on the benchmark, while advanced models such as Gemini and GPT scored less than 0.4%, suggesting artificial general intelligence remains far from reality.

GPT-5 · Gemini

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI

Researchers introduced SOAR, a self-improving language model system that combines evolutionary search with hindsight learning for program synthesis tasks. The method achieved a 52% success rate on the challenging ARC-AGI benchmark by iteratively improving through search and refinement cycles.

AI · Neutral · arXiv – CS AI · May 7 · 6/10

Executable World Models for ARC-AGI-3 in the Era of Coding Agents

Researchers demonstrate a coding-agent system for ARC-AGI-3 that uses executable Python world models to solve abstract reasoning challenges without game-specific code. The agent achieved full solutions on 7 of 25 public games, establishing a generalizable baseline approach that relies on model verification and simplicity-driven refactoring rather than hand-coded logic.

AI · Bullish · arXiv – CS AI · Mar 16 · 6/10

Tiny Recursive Reasoning with Mamba-2 Attention Hybrid

Researchers developed a hybrid model combining Mamba-2 state space operators with Transformer blocks for recursive reasoning, achieving a 2% improvement in pass@2 performance on ARC-AGI-1 tasks with only 6.83M parameters. The study demonstrates that Mamba-2 operators can preserve reasoning capabilities while improving solution candidate coverage in tiny neural networks.
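
For readers unfamiliar with the metric, here is a minimal sketch of how a pass@2 score could be computed. It assumes pass@2 means a task counts as solved when either of the first two candidate output grids matches the target exactly. The function name, the grid representation, and the toy data are hypothetical, not taken from the paper.

```python
from typing import Sequence

Grid = list[list[int]]  # assumed representation: rows of integer color codes


def pass_at_k(
    candidates_per_task: Sequence[Sequence[Grid]],
    targets: Sequence[Grid],
    k: int = 2,
) -> float:
    """Fraction of tasks where any of the first k candidates equals the target."""
    if len(candidates_per_task) != len(targets):
        raise ValueError("need one candidate list per target")
    if not targets:
        return 0.0
    solved = sum(
        any(candidate == target for candidate in candidates[:k])
        for candidates, target in zip(candidates_per_task, targets)
    )
    return solved / len(targets)


# Toy data: three tasks with ranked candidate grids for each.
targets = [[[1, 0], [0, 1]], [[2, 2]], [[3]]]
candidates = [
    [[[1, 0], [0, 1]], [[0, 0], [0, 0]]],  # correct on the first attempt
    [[[2, 0]], [[2, 2]]],                  # correct on the second attempt
    [[[0]], [[1]], [[3]]],                 # correct only on a third attempt
]
print(f"pass@1 = {pass_at_k(candidates, targets, k=1):.3f}")
print(f"pass@2 = {pass_at_k(candidates, targets, k=2):.3f}")
```

In this toy data the second task is solved only on its second attempt, so it counts toward pass@2 but not pass@1. This is one way broader candidate coverage can raise pass@2.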

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Symbol-Equivariant Recurrent Reasoning Models

Researchers introduced Symbol-Equivariant Recurrent Reasoning Models (SE-RRMs), a new neural network architecture that solves reasoning problems like Sudoku and ARC-AGI more efficiently than existing models. SE-RRMs achieve competitive performance with only 2 million parameters and can generalize across different puzzle sizes without requiring extensive data augmentation.