Researchers applied psychometric analysis to large language model benchmarks and found that a general intelligence factor (g-factor) across AI models peaked around 2023-2024 before fragmenting as models specialized in reasoning tasks. The finding suggests AI development is shifting from unified capability improvement toward specialized, tool-using systems, challenging assumptions about monolithic AGI progress.
This research reframes how we measure artificial general intelligence by borrowing Spearman's g-factor framework from psychology, which looks for a single latent ability underlying correlated test scores rather than treating skills in isolation. The study analyzed 39 AI models across 14 benchmarks from 2019 to 2025 and found a strong positive manifold: nearly all benchmark correlations were positive, initially suggesting unified capability growth. In 2024, however, the narrative inverts: the variance explained by the first principal component collapsed from 92% to 64% as reasoning-specialized models emerged, indicating that the intelligence manifold is fragmenting into specialized systems rather than scaling uniformly.

This matters because it challenges the prevailing assumption that AGI progress follows a single ascending trajectory. Instead, models increasingly outsource complex reasoning to external tools, creating what the researchers term an 'AI-hedgehog to AI-fox' transition: from unified general intelligence to distributed specialized intelligences. Paradoxically, architectural complexity grows as performance specializes, inverting the classical scientific ideal of parsimonious explanation.

For the AI industry, this suggests that current AGI benchmarking frameworks may be fundamentally misaligned with how systems actually develop. The shift toward tool-use outsourcing indicates that future progress may manifest not as improved monolithic model performance but through ecosystem integration. Investors and developers should recognize that capability measurements based on traditional benchmark batteries may increasingly misrepresent true system competence. The fragmenting g-factor signals a structural transition in how AI capability will be assessed and deployed in production environments.
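The g-factor quantity at the center of the study can be illustrated with a small sketch: build a model-by-benchmark score matrix, correlate the benchmarks, and take the share of total variance captured by the first principal component. The scores below are synthetic placeholders generated from one latent ability plus noise (not the study's data), chosen so that a positive manifold emerges the way the article describes; the matrix dimensions (39 models, 14 benchmarks) are taken from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic illustration: 39 models x 14 benchmarks, with scores driven by a
# single latent ability plus noise. This produces the "positive manifold"
# (all-positive benchmark correlations) that the study reports through 2023.
latent = rng.normal(size=(39, 1))                 # one ability per model
loadings = rng.uniform(0.7, 1.0, size=(1, 14))    # each benchmark taps it
scores = latent @ loadings + 0.3 * rng.normal(size=(39, 14))

# Correlate benchmarks across models, then compute the fraction of total
# variance carried by the top eigenvalue -- the first-principal-component
# share that the study reports dropping from 92% to 64% in 2024.
corr = np.corrcoef(scores, rowvar=False)          # 14 x 14 correlation matrix
eigvals = np.linalg.eigvalsh(corr)                # eigenvalues, ascending
g_share = eigvals[-1] / eigvals.sum()             # PC1's share of variance
print(f"Variance explained by first component: {g_share:.0%}")
```

With a single strong latent factor, `g_share` comes out high; the study's finding corresponds to this number falling as reasoning-specialized models break the one-factor structure.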
- AI models exhibited a unified positive manifold through 2023-2024, but specialization is fragmenting this correlation structure as reasoning is externalized to tools
- Principal component analysis shows the variance explained by a general factor dropped from 92% to 64% with the arrival of reasoning-specialized models in 2024
- The shift represents movement from monolithic general intelligence to distributed specialized systems, inverting classical parsimony in model design
- Current AGI benchmarking frameworks may be misaligned with how systems actually develop and deploy in practice
- Increased architectural complexity accompanies capability specialization, creating what the researchers call an 'AI-hedgehog to AI-fox' evolution