y0news

#linear-attention News & Analysis

2 articles tagged with #linear-attention. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 2

MiniCPM-SALA: Hybridizing Sparse and Linear Attention for Efficient Long-Context Modeling

MiniCPM-SALA introduces a 9B-parameter hybrid language model architecture that combines sparse and linear attention mechanisms to handle ultra-long contexts up to 1M tokens. The model achieves 3.5x faster inference than full-attention models while reducing training costs by 75% through a continual training framework that transforms existing Transformer models.

AI · Neutral · arXiv – CS AI · Mar 2 · 7/10 · 17

Test-Time Training with KV Binding Is Secretly Linear Attention

Researchers reveal that Test-Time Training (TTT) with KV binding, previously understood as online meta-learning for memorization, can actually be reformulated as a learned linear attention operator. This new perspective explains previously puzzling behaviors and enables architectural simplifications and efficiency improvements.