9 articles tagged with #state-space-models. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bullish · arXiv – CS AI · Mar 5 · 6/10
🧠 Researchers introduce STAR, a new autoregressive pretraining method for Vision Mamba that uses separators to quadruple input sequence length while maintaining image dimensions. The STAR-B model achieved 83.5% accuracy on ImageNet-1k, demonstrating improved performance through better utilization of long-range dependencies in computer vision tasks.
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10
🧠 Researchers introduce the Probability Navigation Architecture (PNA), a framework that trains State Space Models with thermodynamic principles. They report that SSMs develop 'architectural proprioception': the ability to predict when to stop computation based on internal state entropy. According to the authors, SSMs can achieve this computational self-awareness while Transformers cannot, with significant implications for efficient AI inference systems.
AI · Neutral · arXiv – CS AI · Mar 4 · 7/10 · 3
🧠 Researchers compare Transformers, State Space Models (SSMs), and hybrid architectures on in-context retrieval tasks, finding that hybrid models excel at information-dense retrieval while Transformers remain superior for position-based tasks. SSM-based models develop unique locality-aware embeddings that create interpretable positional structures, which explains their specific strengths and limitations.
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 6
🧠 Researchers propose Decision MetaMamba (DMM), a new AI model architecture that improves offline reinforcement learning by addressing information loss issues in Mamba-based models. The solution uses a dense layer-based sequence mixer and modified positional structure to achieve state-of-the-art performance with fewer parameters.
AI · Bullish · Synced Review · May 28 · 7/10 · 4
🧠 Adobe Research has developed a breakthrough approach to video generation that solves long-term memory challenges by combining State-Space Models (SSMs) with dense local attention mechanisms. The researchers used advanced training strategies including diffusion forcing and frame local attention to achieve coherent long-range video generation.
AI · Bullish · arXiv – CS AI · Mar 16 · 6/10
🧠 Researchers developed a hybrid model combining Mamba-2 state space operators with Transformer blocks for recursive reasoning, achieving a 2% improvement in pass@2 performance on ARC-AGI-1 tasks with only 6.83M parameters. The study demonstrates that Mamba-2 operators can preserve reasoning capabilities while improving solution candidate coverage in tiny neural networks.
AI · Bullish · arXiv – CS AI · Mar 5 · 5/10
🧠 Researchers have developed HealthMamba, a new AI framework that uses spatiotemporal modeling and uncertainty quantification to predict healthcare facility visits more accurately. The system achieved 6% better prediction accuracy and 3.5% improvement in uncertainty quantification compared to existing methods when tested on real-world datasets from four US states.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 8
🧠 Researchers introduce Mamba-CAD, a state space model built on the Mamba architecture for generating complex 3D CAD models from parametric sequences. The model addresses limitations in handling longer, fine-grained industrial CAD sequences through an encoder-decoder framework paired with GANs, and is trained on a new dataset of 77,078 CAD models.
AI · Neutral · arXiv – CS AI · Mar 2 · 6/10 · 15
🧠 Researchers conducted an in-depth analysis of in-context learning capabilities across different AI architectures including transformers, state-space models, and hybrid systems. The study reveals that while these models perform similarly on tasks, their internal mechanisms differ significantly, with function vectors playing key roles in self-attention and Mamba layers.