y0news

#model-pruning News & Analysis

10 articles tagged with #model-pruning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · 2d ago · 7/10

Finding DoRI: Discovery of Retained Images in Diffusion Models

Researchers challenge the assumption that memorization in text-to-image diffusion models can be localized to specific weights, demonstrating that pruning-based mitigation can be bypassed through minor perturbations of the text embedding. The study reveals that memorization is distributed throughout embedding space, suggesting that current mitigation strategies are fundamentally fragile and that new approaches are needed to protect training data privacy.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

A Game Theoretic Free Energy Analysis of Higher Order Synergy in Attention Heads of Large Language Models

Researchers apply game-theoretic free energy principles to analyze attention head interactions in large language models, discovering that heads exhibit higher-order redundancy. Their framework enables principled pruning of low-contribution heads, achieving 18% FLOP reduction and 22% throughput improvement in GPT2 with minimal performance degradation.

🏢 Perplexity · 🧠 Llama
AI · Bullish · arXiv – CS AI · May 12 · 7/10

Hierarchical Attention-based Graph Neural Network with Relevance-driven Pruning

Researchers introduce HA-HeteroGNN, a Graph Neural Network framework that improves both interpretability and efficiency through hierarchical attention mechanisms and relevance-driven pruning. The approach removes 27% of graph edges while improving classification accuracy by up to 2.46% and reducing training time by 43.9%.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 4

When Reasoning Meets Compression: Understanding the Effects of LLMs Compression on Large Reasoning Models

Researchers analyzed compression effects on large reasoning models (LRMs) through quantization, distillation, and pruning methods. They found that dynamically quantized 2.51-bit models maintain near-original performance, while identifying critical weight components and showing that protecting just 2% of excessively compressed weights can improve accuracy by 6.57%.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4

HEAPr: Hessian-based Efficient Atomic Expert Pruning in Output Space

Researchers introduce HEAPr, a novel pruning algorithm for Mixture-of-Experts (MoE) language models that decomposes experts into atomic components for more precise pruning. The method achieves nearly lossless compression at 20-25% pruning ratios while reducing computational costs by approximately 20%.

AI · Neutral · arXiv – CS AI · 3d ago · 6/10

Resource-Constrained Affect Modelling via Variance Regularisation Pruning

Researchers introduce Variance-Regularised Pruning (VR), a neural network pruning technique that reduces model size while maintaining robust performance across diverse users. The method balances computational efficiency with cross-participant stability in affective computing systems, achieving 80% sparsity without sacrificing reliability on the AGAIN emotion recognition dataset.

AI · Bullish · arXiv – CS AI · 3d ago · 6/10

Extracting Small Translation Specialists from LLMs by Aggressively Pruning Experts

Researchers present a method for aggressively pruning expert modules from mixture-of-experts large language models to create specialized translation systems. The approach removes up to 90% of experts with minimal performance degradation, demonstrating that translation tasks require only a fraction of a full LLM's parameters, enabling substantial model compression.

AI · Bullish · arXiv – CS AI · Apr 7 · 6/10

REAM: Merging Improves Pruning of Experts in LLMs

Researchers propose REAM (Router-weighted Expert Activation Merging), a new method for compressing large language models that groups and merges expert weights instead of pruning them. The technique preserves model performance better than existing pruning methods while reducing memory requirements for deployment.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 15

FineScope: SAE-guided Data Selection Enables Domain Specific LLM Pruning and Finetuning

Researchers introduce FineScope, a framework that uses Sparse Autoencoder (SAE) techniques to create smaller, domain-specific language models from larger pretrained LLMs through structured pruning and self-data distillation. The method achieves competitive performance while significantly reducing computational requirements compared to training from scratch.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 7

CA-AFP: Cluster-Aware Adaptive Federated Pruning

Researchers propose CA-AFP, a new federated learning framework that combines client clustering with adaptive model pruning to address both statistical and system heterogeneity challenges. The approach achieves better accuracy and fairness while reducing communication costs compared to existing methods, as demonstrated on human activity recognition benchmarks.