#mamba-architecture News & Analysis

10 articles tagged with #mamba-architecture. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

Zamba2-VL Technical Report

Zyphra released Zamba2-VL, a suite of vision-language models that combine Mamba2 state-space layers with transformer blocks, performing competitively with leading VLMs while delivering 10x faster time-to-first-token. The three released models (1.2B, 2.7B, and 7B parameters) represent a significant efficiency breakthrough for edge and on-device deployment.

🏢 Hugging Face

AI · Bullish · arXiv – CS AI · May 28 · 7/10

CaMBRAIN: Real-time, Continuous EEG Inference with Causal State Space Models

Researchers introduce CaMBRAIN, a causal state space model based on the Mamba architecture that enables real-time, continuous EEG signal processing with linear-time complexity. The model achieves state-of-the-art results across multiple datasets while processing signals more than 10x faster than existing attention-based methods, overcoming critical limitations in handling variable-length brain activity recordings.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 6

ViT-Linearizer: Distilling Quadratic Knowledge into Linear-Time Vision Models

Researchers developed ViT-Linearizer, a distillation framework that transfers Vision Transformer knowledge into linear-time models, addressing quadratic complexity issues for high-resolution inputs. The method achieves 84.3% ImageNet accuracy while providing significant speedups, bridging the gap between efficient RNN-based architectures and transformer performance.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

An approach with Visual and Tabular Mamba to multimodal medical data using Mixed Fusion

Researchers propose a Mamba-based architecture for multimodal medical data fusion that combines visual and tabular processing to improve cancer classification interpretability. Testing on skin and oral cancer datasets shows competitive performance with enhanced explainability through SHAP analysis, positioning state space models as viable alternatives to Transformers in medical AI applications.

AI · Bullish · arXiv – CS AI · Jun 19 · 6/10

Spatial-Aware Reduction Framework: Towards Efficient and Faithful Visual State Space Models

Researchers introduce STORM, a spatial-aware token reduction framework that addresses the performance collapse that visual state space models such as Mamba suffer when token reduction techniques are applied. By maintaining structural integrity and the two-dimensional grid topology during compression, STORM achieves significant accuracy recovery, with up to 63.3% improvement on VMamba, while operating as a training-free plug-and-play module.

AI · Neutral · arXiv – CS AI · Jun 4 · 6/10

Coarse-to-fine Hierarchical Architecture with Sequential Mamba for Brain Reconstruction

Researchers introduce CHASMBrain, a hierarchical neural architecture using Mamba models to predict brain activity from images by mimicking the visual cortex's functional organization. The model achieves state-of-the-art performance on brain imaging datasets and reveals that different neural pathways specialize in processing semantic versus spatial information, advancing understanding of how artificial and biological vision systems align.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Prediction Bottlenecks Don't Discover Causal Structure (But Here's What They Actually Do)

Researchers rigorously tested claims that Mamba state-space models can discover causal structure through prediction-only training, finding the method underperforms classical approaches like PCMCI and Granger causality. The apparent success in earlier experiments was largely attributable to sample-size confounds and non-standard intervention semantics rather than genuine architectural advantages.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 16

DiffuMamba: High-Throughput Diffusion LMs with Mamba Backbone

Researchers introduce DiffuMamba, a diffusion language model built on a Mamba backbone that achieves up to 8.2x higher inference throughput than Transformer-based models while maintaining comparable performance. The model scales linearly with sequence length and represents a significant advance in efficient AI text generation.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 6

MixerCSeg: An Efficient Mixer Architecture for Crack Segmentation via Decoupled Mamba Attention

Researchers have developed MixerCSeg, a new AI architecture for crack segmentation that combines CNN, Transformer, and Mamba-based approaches to achieve state-of-the-art performance with high efficiency. The model uses only 2.05 GFLOPs and 2.54M parameters while outperforming existing methods on crack detection benchmarks.

AI · Bullish · arXiv – CS AI · Mar 2 · 4/10 · 5

R2GenCSR: Mining Contextual and Residual Information for LLMs-based Radiology Report Generation

Researchers have developed R2GenCSR, a new AI framework for generating radiology reports that uses the Mamba architecture instead of Transformers to reduce computational complexity while maintaining performance. The system leverages context retrieval and large language models to produce high-quality medical reports from X-ray images.