y0news

#pruning News & Analysis

10 articles tagged with #pruning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Mar 27 · 7/10

How Pruning Reshapes Features: Sparse Autoencoder Analysis of Weight-Pruned Language Models

Researchers conducted the first systematic study of how weight pruning affects language model representations using Sparse Autoencoders across multiple models and pruning methods. The study reveals that rare features survive pruning better than common ones, suggesting pruning acts as implicit feature selection that preserves specialized capabilities while removing generic features.
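The weight pruning analyzed in such studies is often simple magnitude pruning. A minimal NumPy sketch of that baseline (the `magnitude_prune` helper is illustrative, not the authors' code):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Unstructured pruning: zero out the smallest-magnitude fraction of weights."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold            # keep only larger weights
    return weights * mask

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
W_pruned = magnitude_prune(W, 0.5)
print(np.mean(W_pruned == 0))  # ~0.5 of entries zeroed
```

A study like the one above would then compare Sparse Autoencoder feature activations on the original and pruned weights to see which features survive.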

🧠 Llama
AI · Neutral · arXiv – CS AI · Mar 4 · 7/10

Structured vs. Unstructured Pruning: An Exponential Gap

Research reveals an exponential gap between structured and unstructured neural network pruning methods. While unstructured weight pruning can approximate target functions with O(d log(1/ε)) neurons, structured neuron pruning requires Ω(d/ε) neurons, demonstrating fundamental limitations of structured approaches.
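The gap comes from what each regime is allowed to remove: unstructured pruning drops individual weights, structured pruning drops whole neurons. A toy NumPy sketch (hypothetical helpers, not the paper's construction) contrasts the two masks at equal sparsity:

```python
import numpy as np

def unstructured_mask(W, sparsity):
    """Keep the largest-magnitude individual weights (fine-grained)."""
    k = int(sparsity * W.size)
    thresh = np.partition(np.abs(W).ravel(), k - 1)[k - 1]
    return np.abs(W) > thresh

def structured_mask(W, sparsity):
    """Drop whole neurons (rows) with the smallest L2 norm (hardware-friendly)."""
    n_drop = int(sparsity * W.shape[0])
    norms = np.linalg.norm(W, axis=1)
    cutoff = np.partition(norms, n_drop - 1)[n_drop - 1]
    return np.repeat((norms > cutoff)[:, None], W.shape[1], axis=1)

rng = np.random.default_rng(1)
W = rng.normal(size=(32, 32))
err_u = np.linalg.norm(W * ~unstructured_mask(W, 0.5))  # weight mass discarded
err_s = np.linalg.norm(W * ~structured_mask(W, 0.5))
print(err_u < err_s)
```

At the same 50% sparsity the structured mask must throw away large weights that happen to sit in weak rows, which is the intuition behind its worse approximation rate.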

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

HierarchicalPrune: Position-Aware Compression for Large-Scale Diffusion Models

Researchers developed HierarchicalPrune, a compression framework that reduces large-scale text-to-image diffusion models' memory footprint by 77.5-80.4% and latency by 27.9-38.0% while maintaining image quality. The technique enables billion-parameter AI models to run efficiently on resource-constrained devices through hierarchical pruning and knowledge distillation.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10

Spatio-Temporal Token Pruning for Efficient High-Resolution GUI Agents

Researchers introduce GUIPruner, a training-free framework that addresses efficiency bottlenecks in high-resolution GUI agents by eliminating spatiotemporal redundancy. The system achieves 3.4x reduction in computational operations and 3.3x speedup while maintaining 94% of original performance, enabling real-time navigation with minimal resource consumption.
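Training-free token pruning of this kind typically scores tokens by importance and keeps a top-k subset. A framework-free sketch under that assumption (the `prune_tokens` helper and the random scores are illustrative, not GUIPruner's actual scoring):

```python
import numpy as np

def prune_tokens(tokens, scores, keep_ratio):
    """Keep the top-scoring fraction of tokens, preserving sequence order.

    In a real GUI agent the scores would come from attention or saliency;
    here they are an input so the sketch stays framework-free.
    """
    k = max(1, int(keep_ratio * len(tokens)))
    keep = np.sort(np.argsort(scores)[-k:])  # top-k indices, back in order
    return tokens[keep]

rng = np.random.default_rng(2)
tokens = rng.normal(size=(100, 64))  # 100 visual patch tokens, dim 64
scores = rng.random(100)             # stand-in importance scores
kept = prune_tokens(tokens, scores, 0.3)
print(kept.shape)  # (30, 64)
```

Dropping 70% of the patch tokens before the expensive transformer layers is where speedups of this magnitude come from.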

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10

AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression

Tencent Hunyuan team introduces AngelSlim, a comprehensive toolkit for large model compression featuring quantization, speculative decoding, and pruning techniques. The toolkit includes the first industrially viable 2-bit large model (HY-1.8B-int2) and achieves 1.8x to 2.0x throughput gains while maintaining output quality.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

GPrune-LLM: Generalization-Aware Structured Pruning for Large Language Models

Researchers introduce GPrune-LLM, a new structured pruning framework that improves compression of large language models by addressing calibration bias and cross-task generalization issues. The method partitions neurons into behavior-consistent modules and uses adaptive metrics based on distribution sensitivity, showing consistent improvements in post-compression performance.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

SimCert: Probabilistic Certification for Behavioral Similarity in Deep Neural Network Compression

Researchers developed SimCert, a probabilistic certification framework that verifies behavioral similarity between compressed neural networks and their original versions. The framework addresses critical safety challenges in deploying compressed DNNs on resource-constrained systems by providing quantitative safety guarantees with adjustable confidence levels.
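The certification idea can be sketched with a Hoeffding bound over sampled inputs (an illustrative stand-in that assumes i.i.d. samples; not SimCert's actual procedure):

```python
import math

def certify_similarity(model_a, model_b, samples, tol, delta):
    """Hoeffding-style certificate: if this returns True, then with confidence
    at least 1 - delta, the probability that the two models disagree on the
    sampling distribution is below tol."""
    n = len(samples)
    disagree = sum(model_a(x) != model_b(x) for x in samples)
    p_hat = disagree / n
    upper = p_hat + math.sqrt(math.log(1 / delta) / (2 * n))  # Hoeffding term
    return upper <= tol, upper

# Original vs "compressed" classifier (the compressed one shifts its threshold).
original = lambda x: x > 0.5
compressed = lambda x: x > 0.51
samples = [i / 2000 for i in range(2000)]
ok, bound = certify_similarity(original, compressed, samples, tol=0.1, delta=0.05)
print(ok, round(bound, 3))
```

Tightening `tol` or `delta` raises the number of samples needed, which is the "adjustable confidence levels" trade-off described above.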

AI · Bullish · arXiv – CS AI · Mar 9 · 6/10

HiPP-Prune: Hierarchical Preference-Conditioned Structured Pruning for Vision-Language Models

Researchers introduce HiPP-Prune, a new framework for efficiently compressing vision-language models while maintaining performance and reducing hallucinations. The hierarchical approach uses preference-based pruning that considers multiple objectives including task utility, visual grounding, and compression efficiency.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Curvature-Weighted Capacity Allocation: A Minimum Description Length Framework for Layer-Adaptive Large Language Model Optimization

Researchers developed a new mathematical framework called Curvature-Weighted Capacity Allocation that optimizes large language model performance by identifying which layers contribute most to loss reduction. The method uses the Minimum Description Length principle to make principled decisions about layer pruning and capacity allocation under hardware constraints.
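As a toy illustration of curvature-weighted allocation (a simple proportional rule, not the paper's MDL-derived criterion):

```python
import numpy as np

def allocate_keep_ratios(curvatures, global_keep):
    """Give each layer a keep-ratio proportional to its estimated loss
    curvature, scaled so the mean matches the global compression target:
    flat (low-curvature) layers are pruned hardest."""
    c = np.asarray(curvatures, dtype=float)
    ratios = c / c.mean() * global_keep
    return np.clip(ratios, 0.0, 1.0)  # a layer cannot keep more than 100%

curv = [4.0, 1.0, 1.0, 2.0]  # e.g. per-layer Hessian-trace estimates
print(allocate_keep_ratios(curv, 0.5))  # [1.0, 0.25, 0.25, 0.5]
```

The high-curvature first layer, where pruning would move the loss most, is left intact while the flat middle layers absorb most of the budget.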

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10

Efficient Discovery of Approximate Causal Abstractions via Neural Mechanism Sparsification

Researchers have developed a new method to extract interpretable causal mechanisms from neural networks using structured pruning as a search technique. The approach reframes network pruning as finding approximate causal abstractions, yielding closed-form criteria for simplifying networks while maintaining their causal structure under interventions.