y0news

#int8 News & Analysis

2 articles tagged with #int8. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

SageBwd: A Trainable Low-bit Attention

Researchers have developed SageBwd, a trainable INT8 attention mechanism that can match full-precision attention performance during pre-training while quantizing six of seven attention matrix multiplications. The study identifies key factors for stable training including QK-norm requirements and the impact of tokens per step on quantization errors.
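The core idea of running attention matrix multiplications in INT8 can be illustrated with a minimal NumPy sketch (this is a generic symmetric quantize → int32-accumulate → dequantize scheme, not SageBwd's actual kernel; all function names here are illustrative):

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor INT8 quantization: returns int8 values and a scale."""
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matmul(a, b):
    """Quantize both operands to INT8, multiply with int32 accumulation,
    then dequantize back to float32."""
    qa, sa = quantize_int8(a)
    qb, sb = quantize_int8(b)
    acc = qa.astype(np.int32) @ qb.astype(np.int32)  # exact in int32
    return acc.astype(np.float32) * (sa * sb)

# Compare against full precision on attention-shaped random inputs
rng = np.random.default_rng(0)
q_mat = rng.standard_normal((8, 16)).astype(np.float32)
k_mat = rng.standard_normal((16, 8)).astype(np.float32)
ref = q_mat @ k_mat
approx = int8_matmul(q_mat, k_mat)
rel_err = np.linalg.norm(ref - approx) / np.linalg.norm(ref)
```

On well-conditioned inputs like these, the relative error of the INT8 product stays small; the paper's contribution is keeping that error controlled during training, where gradients flow through the quantized products.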

AI · Bullish · arXiv – CS AI · Apr 7 · 6/10

MUXQ: Mixed-to-Uniform Precision MatriX Quantization via Low-Rank Outlier Decomposition

Researchers propose MUXQ, a new quantization technique for large language models that addresses activation outliers through low-rank decomposition. The method enables efficient INT8 quantization while maintaining accuracy close to FP16, making it suitable for edge device deployment with NPU-based hardware.
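The effect of separating outliers via a low-rank term before uniform quantization can be sketched as follows (a simplified SVD-based stand-in for illustration, not MUXQ's actual decomposition; the rank choice and function names are assumptions):

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor INT8 quantization."""
    scale = np.max(np.abs(x)) / 127.0 + 1e-12
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def lowrank_outlier_decompose(w, rank=2):
    """Split w into a small low-rank float part (absorbing outlier energy)
    plus a residual that is friendlier to uniform INT8 quantization."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    lowrank = (u[:, :rank] * s[:rank]) @ vt[:rank]
    q, scale = quantize_int8(w - lowrank)
    return lowrank, q, scale

def reconstruct(lowrank, q, scale):
    return lowrank + q.astype(np.float32) * scale

rng = np.random.default_rng(1)
w = rng.standard_normal((32, 32)).astype(np.float32)
w[:, 0] *= 50.0  # inject an outlier channel, as seen in LLM activations

low, q, s = lowrank_outlier_decompose(w, rank=2)
err_split = np.linalg.norm(w - reconstruct(low, q, s)) / np.linalg.norm(w)

q_direct, s_direct = quantize_int8(w)  # naive: outlier inflates the scale
err_direct = np.linalg.norm(w - q_direct.astype(np.float32) * s_direct) / np.linalg.norm(w)
```

Because the outlier channel no longer dictates the quantization scale, the decomposed version reconstructs the matrix far more accurately than naive per-tensor INT8, which is the intuition behind keeping only a small low-rank correction in higher precision.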

๐Ÿข Perplexity