AI | Neutral | Importance 6/10

The Variance Brain Foundation Models Forgot: Third-Order Statistics Predict Cognition Where Billion-Parameter Models Fail

arXiv – CS AI | Giovanni Marraffini, Gabriel Mahuas, Trinidad Borrell, Victoria Shevchenko, Demian Wassermann
🤖AI Summary

Researchers show that brain foundation models (BFMs), large Transformers pretrained on fMRI data, predict cognitive performance worse than simple linear regression on functional connectivity matrices. The study attributes this to a variance allocation problem: BFM pretraining captures the dominant fMRI variance but destroys the higher-order statistical structure (third-order co-skewness) that actually predicts cognition. The proposed remedy is a lightweight linear pipeline that requires no pretraining.

Analysis

This research exposes a limitation in scaling brain foundation models: bigger does not mean better when the pretraining objective is misaligned with the downstream task. The researchers tested three state-of-the-art BFMs across multiple cognitive prediction benchmarks and consistently found that linear regression on a simple functional connectivity matrix outperformed models with up to 650 million parameters. The performance gap widened with model scale, which suggests that, under the current pretraining objective, additional parameters hurt cognitive prediction rather than help it.
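
For readers unfamiliar with the baseline, here is a minimal sketch of what "linear regression on functional connectivity" can look like. It is not the paper's code: the ridge penalty, the array shapes, the random placeholder data, and the function names `functional_connectivity`, `fc_features`, and `fit_ridge` are all assumptions made for illustration.

```python
import numpy as np

def functional_connectivity(ts):
    """Pearson correlation between regions; ts has shape (timepoints, regions)."""
    return np.corrcoef(ts, rowvar=False)

def fc_features(ts):
    """Vectorize the upper triangle (diagonal excluded) of the FC matrix."""
    fc = functional_connectivity(ts)
    iu = np.triu_indices(fc.shape[0], k=1)
    return fc[iu]

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept (assumed model)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b

# Placeholder data: random "fMRI" time series and random "cognitive scores".
rng = np.random.default_rng(0)
n_subjects, n_timepoints, n_regions = 40, 200, 10
X = np.stack([
    fc_features(rng.standard_normal((n_timepoints, n_regions)))
    for _ in range(n_subjects)
])
y = rng.standard_normal(n_subjects)

w, b = fit_ridge(X, y, alpha=10.0)
pred = X @ w + b  # in-sample predictions, one per subject
```

With R regions the model has R(R-1)/2 weights, so 10 regions give 45 features here. As plain arithmetic, 400 regions would give 79,800, but the source does not state which parcellation was used.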

The authors trace the root cause to what they call the variance allocation problem: BFM pretraining optimizes for reconstructing the dominant variance patterns in fMRI signals, which are largely noise as far as cognitive prediction is concerned. Meanwhile, the third-order statistics (co-skewness tensors) that carry the cognitively relevant information are destroyed during reconstruction. This mirrors a broader challenge in self-supervised learning, where pretraining objectives diverge from downstream task requirements.
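
To make "third-order co-skewness" concrete, the sketch below computes the standardized co-skewness tensor, whose entry (i, j, k) is the average of z_i * z_j * z_k over time for z-scored signals. This is the textbook definition, assumed here to match what the paper means; the function name `coskewness` and the toy data are illustrative.

```python
import numpy as np

def coskewness(ts):
    """Standardized third-order co-moment tensor.

    ts has shape (timepoints, regions); the result has shape
    (regions, regions, regions) with entry [i, j, k] = mean_t(z_i * z_j * z_k).
    """
    z = (ts - ts.mean(axis=0)) / ts.std(axis=0)
    return np.einsum("ti,tj,tk->ijk", z, z, z) / ts.shape[0]

# Toy data: region 0 is skewed (exponential), regions 1 and 2 are Gaussian.
rng = np.random.default_rng(0)
ts = np.column_stack([
    rng.exponential(size=5000),
    rng.standard_normal(5000),
    rng.standard_normal(5000),
])
S = coskewness(ts)

# The diagonal entry S[i, i, i] is the ordinary skewness of region i.
skew_region0 = S[0, 0, 0]
```

A covariance or correlation matrix only records pairwise second-order structure, so it carries no direct record of the third-order quantities in `S`.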

The practical implications are significant for neuroscience and AI development. The proposed solution projects fMRI signals into a co-skewness-preserving subspace before computing functional connectivity, and it achieves state-of-the-art results without any pretraining, GPU requirements, or complex architecture. This suggests that architectural scale and computational resources are not the bottleneck; instead, the design of the objective function determines what models learn.
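
The summary does not say how the co-skewness-preserving subspace is built, so the following is only one plausible construction, not the authors' method. It assumes a truncated higher-order SVD: unfold the co-skewness tensor into a matrix, keep the top-k left singular vectors, project the z-scored signals onto them, and compute functional connectivity on the projected components. The names `coskew_subspace` and `projected_fc` and the value of `k` are hypothetical.

```python
import numpy as np

def coskewness(z):
    """Third-order co-moment tensor of already z-scored signals (timepoints, regions)."""
    return np.einsum("ti,tj,tk->ijk", z, z, z) / z.shape[0]

def coskew_subspace(ts, k):
    """Return z-scored signals and an orthonormal basis (regions, k).

    Assumed construction: top-k left singular vectors of the mode-1
    unfolding of the co-skewness tensor (a truncated HOSVD factor).
    """
    z = (ts - ts.mean(axis=0)) / ts.std(axis=0)
    S = coskewness(z)
    R = S.shape[0]
    unfolded = S.reshape(R, R * R)  # mode-1 unfolding of the tensor
    U, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    return z, U[:, :k]

def projected_fc(ts, k):
    """Functional connectivity computed on the k projected components."""
    z, basis = coskew_subspace(ts, k)
    comps = z @ basis  # shape (timepoints, k)
    return np.corrcoef(comps, rowvar=False)

# Toy data: skewed sources mixed linearly into 8 "regions".
rng = np.random.default_rng(0)
ts = rng.exponential(size=(500, 8)) @ rng.standard_normal((8, 8))
fc_k = projected_fc(ts, k=3)
```

Every step is linear algebra on small matrices, which is consistent with the claim that no pretraining or GPU is needed, although the actual pipeline may differ in its details.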

Finetuning BrainLM with a co-skewness-targeted loss brings its performance back up to that of raw functional connectivity, which indicates that the pretraining objective, not model capacity, is the bottleneck. This suggests that future brain foundation models need loss functions designed to preserve task-relevant statistical structure, rather than ever-larger parameter counts.

Key Takeaways
  • Brain foundation models with up to 650M parameters underperform an 80K-parameter linear regression on cognitive prediction tasks because their pretraining objectives are misaligned with the task
  • BFM pretraining preserves second-order covariance but destroys third-order co-skewness tensors that encode genuine cognitive information
  • A lightweight linear pipeline preserving co-skewness outperforms all tested foundation models without pretraining or GPU computation
  • Larger model scale (650M vs 111M parameters) worsens cognitive prediction performance, suggesting that scaling under the current objective hurts generalization
  • The bottleneck is pretraining objective design, not architecture or model capacity, so self-supervised learning objectives for neuroscience need rethinking