y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#neural-network-compression News & Analysis

3 articles tagged with #neural-network-compression. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

3 articles
AIBullisharXiv – CS AI · 4d ago7/10
🧠

InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization

Researchers introduce InfoQuant, a training-free method that optimizes activation distributions for low-bit quantization in large language models by using Peak Suppression Orthogonal Transformation. The technique achieves 97% accuracy preservation under W4A4KV4 quantization and reduces performance degradation by 42% compared to previous methods, advancing efficient LLM deployment.

AIBullisharXiv – CS AI · Apr 156/10
🧠

CLASP: Class-Adaptive Layer Fusion and Dual-Stage Pruning for Multimodal Large Language Models

Researchers introduce CLASP, a token reduction framework that optimizes Multimodal Large Language Models by intelligently pruning visual tokens through class-adaptive layer fusion and dual-stage pruning. The approach addresses computational inefficiency in MLLMs while maintaining performance across diverse benchmarks and architectures.