y0news

#mamba-architecture News & Analysis

4 articles tagged with #mamba-architecture. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 6

ViT-Linearizer: Distilling Quadratic Knowledge into Linear-Time Vision Models

Researchers developed ViT-Linearizer, a distillation framework that transfers Vision Transformer knowledge into linear-time models, addressing the quadratic cost of self-attention on high-resolution inputs. The method achieves 84.3% ImageNet accuracy while providing significant speedups, narrowing the gap between efficient RNN-based architectures and Transformer performance.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 16

DiffuMamba: High-Throughput Diffusion LMs with Mamba Backbone

Researchers introduce DiffuMamba, a diffusion language model built on a Mamba backbone that achieves up to 8.2x higher inference throughput than Transformer-based models while maintaining comparable performance. The model scales linearly with sequence length and represents a significant advance in efficient text generation.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 6

MixerCSeg: An Efficient Mixer Architecture for Crack Segmentation via Decoupled Mamba Attention

Researchers have developed MixerCSeg, an architecture for crack segmentation that combines CNN, Transformer, and Mamba-based approaches to achieve state-of-the-art performance with high efficiency. The model uses only 2.05 GFLOPs and 2.54M parameters while outperforming existing methods on crack detection benchmarks.

AI · Bullish · arXiv – CS AI · Mar 2 · 4/10 · 5

R2GenCSR: Mining Contextual and Residual Information for LLMs-based Radiology Report Generation

Researchers have developed R2GenCSR, a framework for generating radiology reports that uses the Mamba architecture instead of Transformers to reduce computational complexity while maintaining performance. The system combines context retrieval with large language models to produce high-quality medical reports from X-ray images.