10 articles tagged with #pruning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
Researchers conducted the first systematic study of how weight pruning affects language model representations, using Sparse Autoencoders across multiple models and pruning methods. The study reveals that rare features survive pruning better than common ones, suggesting pruning acts as implicit feature selection that preserves specialized capabilities while removing generic features.
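The baseline operation the study builds on, magnitude pruning, can be sketched in a few lines. This is a generic illustration of unstructured pruning, not the paper's analysis pipeline; the function name and values are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    flat = sorted(abs(w) for w in weights)
    k = int(len(flat) * sparsity)                 # number of weights to remove
    threshold = flat[k - 1] if k > 0 else float("-inf")
    return [w if abs(w) > threshold else 0.0 for w in weights]

# At 50% sparsity, the three smallest-magnitude weights are zeroed.
pruned = magnitude_prune([0.9, -0.05, 0.4, 0.01, -0.7, 0.2], sparsity=0.5)
```

The study's question is what this zeroing does to the features a Sparse Autoencoder finds in the model's activations, which the sketch above does not touch.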
AI · Neutral · arXiv – CS AI · Mar 4 · 7/10
Research reveals an exponential gap between structured and unstructured neural network pruning methods. While unstructured weight pruning can approximate target functions with O(d log(1/ε)) neurons, structured neuron pruning requires Ω(d/ε) neurons, demonstrating fundamental limitations of structured approaches.
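The structural difference behind the gap is easy to see in code: unstructured pruning zeroes individual entries of a weight matrix, while structured pruning must discard whole rows (neurons). A minimal sketch with hypothetical magnitude-based criteria:

```python
def unstructured_prune(matrix, sparsity):
    """Zero individual entries by magnitude, anywhere in the matrix."""
    flat = sorted(abs(w) for row in matrix for w in row)
    k = int(len(flat) * sparsity)
    thr = flat[k - 1] if k > 0 else float("-inf")
    return [[w if abs(w) > thr else 0.0 for w in row] for row in matrix]

def structured_prune(matrix, n_keep):
    """Drop entire neurons (rows), ranked by L1 norm."""
    order = sorted(range(len(matrix)),
                   key=lambda i: -sum(abs(w) for w in matrix[i]))
    keep = set(order[:n_keep])
    return [row for i, row in enumerate(matrix) if i in keep]
```

Unstructured pruning can keep one important weight inside an otherwise weak neuron; structured pruning must keep or drop the neuron wholesale, which is the freedom the O(d log(1/ε)) versus Ω(d/ε) separation quantifies.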
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
Researchers developed HierarchicalPrune, a compression framework that reduces large-scale text-to-image diffusion models' memory footprint by 77.5-80.4% and latency by 27.9-38.0% while maintaining image quality. The technique enables billion-parameter AI models to run efficiently on resource-constrained devices through hierarchical pruning and knowledge distillation.
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10
Researchers introduce GUIPruner, a training-free framework that addresses efficiency bottlenecks in high-resolution GUI agents by eliminating spatiotemporal redundancy. The system achieves a 3.4x reduction in computational operations and a 3.3x speedup while maintaining 94% of original performance, enabling real-time navigation with minimal resource consumption.
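Eliminating redundancy among visual tokens is commonly done by dropping tokens that are nearly identical to ones already kept. The sketch below is a generic greedy deduplication by cosine similarity, not GUIPruner's actual algorithm; the threshold and token vectors are hypothetical.

```python
import math

def prune_redundant_tokens(tokens, threshold=0.95):
    """Greedily keep tokens whose cosine similarity to every kept token
    stays below the threshold; near-duplicates are discarded."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    kept = []
    for t in tokens:
        if all(cos(t, k) < threshold for k in kept):
            kept.append(t)
    return kept

# The second token is almost parallel to the first and is pruned.
kept = prune_redundant_tokens([[1.0, 0.0], [0.99, 0.01], [0.0, 1.0]], threshold=0.95)
```

In a GUI screenshot, large uniform regions produce many such near-duplicate patches, which is where most of the claimed compute reduction would come from.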
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10
The Tencent Hunyuan team introduces AngelSlim, a comprehensive toolkit for large model compression featuring quantization, speculative decoding, and pruning techniques. The toolkit includes the first industrially viable 2-bit large model (HY-1.8B-int2) and achieves 1.8x to 2.0x throughput gains while maintaining output quality.
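To make "2-bit" concrete: each weight is replaced by one of four representable levels plus a shared scale. The sketch below is a naive uniform quantizer for illustration only; production int2 schemes like the one behind HY-1.8B-int2 use far more sophisticated grouping and calibration.

```python
def quantize_2bit(weights):
    """Map weights to 4 uniform levels (2-bit codes 0..3) with one scale.
    Assumes at least one nonzero weight (scale would be zero otherwise)."""
    scale = max(abs(w) for w in weights) / 1.5   # outer levels sit at +/-1.5*scale
    codes = [min(3, max(0, round(w / scale + 1.5))) for w in weights]
    return codes, scale

def dequantize_2bit(codes, scale):
    """Recover the approximate weights from codes and scale."""
    return [(c - 1.5) * scale for c in codes]

codes, scale = quantize_2bit([1.5, -1.5, 0.4, -0.6])
```

Storage drops from 16 or 32 bits per weight to 2, which is why 2-bit models are attractive on memory-bound hardware and why preserving quality at that width is hard.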
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
Researchers introduce GPrune-LLM, a new structured pruning framework that improves compression of large language models by addressing calibration bias and cross-task generalization issues. The method partitions neurons into behavior-consistent modules and uses adaptive metrics based on distribution sensitivity, showing consistent improvements in post-compression performance.
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
Researchers developed SimCert, a probabilistic certification framework that verifies behavioral similarity between compressed neural networks and their original versions. The framework addresses critical safety challenges in deploying compressed DNNs on resource-constrained systems by providing quantitative safety guarantees with adjustable confidence levels.
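The general shape of such a probabilistic guarantee can be illustrated with a Monte Carlo agreement estimate plus a Hoeffding lower bound. This is a textbook construction standing in for SimCert's actual certificate; all names and the toy models are hypothetical.

```python
import math, random

def certified_agreement(ref_model, compressed_model, sampler, n, delta=0.05):
    """Estimate how often the compressed model matches the reference on n
    sampled inputs, with a one-sided Hoeffding lower bound on the true
    agreement rate that holds with probability >= 1 - delta."""
    agree = sum(ref_model(x) == compressed_model(x)
                for x in (sampler() for _ in range(n)))
    rate = agree / n
    margin = math.sqrt(math.log(1.0 / delta) / (2 * n))
    return rate, rate - margin

random.seed(0)
# Toy check: a model compared against an identical copy agrees everywhere.
rate, lower = certified_agreement(lambda x: x > 0, lambda x: x > 0,
                                  lambda: random.uniform(-1.0, 1.0), n=1000)
```

Shrinking `delta` or growing `n` tightens the certificate, which is the "adjustable confidence level" knob the summary refers to.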
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
Researchers introduce HiPP-Prune, a new framework for efficiently compressing vision-language models while maintaining performance and reducing hallucinations. The hierarchical approach uses preference-based pruning that considers multiple objectives including task utility, visual grounding, and compression efficiency.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10
Researchers developed a new mathematical framework called Curvature-Weighted Capacity Allocation that optimizes large language model performance by identifying which layers contribute most to loss reduction. The method uses the Minimum Description Length principle to make principled decisions about layer pruning and capacity allocation under hardware constraints.
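The allocation decision reduces to a budgeted selection problem: given a per-layer contribution score and a per-layer cost, keep the layers that buy the most loss reduction per parameter. A greedy knapsack sketch (a stand-in for the paper's MDL-based criterion; scores, costs, and the budget are hypothetical):

```python
def allocate_capacity(scores, costs, budget):
    """Greedily keep layers with the best score-per-cost ratio until the
    hardware budget is exhausted; the remaining layers are pruned."""
    order = sorted(range(len(scores)),
                   key=lambda i: scores[i] / costs[i], reverse=True)
    kept, spent = [], 0
    for i in order:
        if spent + costs[i] <= budget:
            kept.append(i)
            spent += costs[i]
    return sorted(kept)

# Layers 0 and 2 offer the best loss reduction per unit of cost.
kept = allocate_capacity(scores=[10.0, 1.0, 5.0], costs=[4, 4, 2], budget=6)
```

In the actual framework the scores would come from loss curvature rather than being given, but the budgeted-selection structure is the same.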
AI · Bullish · arXiv – CS AI · Mar 2 · 6/10
Researchers have developed a new method to extract interpretable causal mechanisms from neural networks using structured pruning as a search technique. The approach reframes network pruning as finding approximate causal abstractions, yielding closed-form criteria for simplifying networks while maintaining their causal structure under interventions.