Weight Pruning Amplifies Bias: A Multi-Method Study of Compressed LLMs for Edge AI
A comprehensive empirical study finds that weight pruning, a technique for compressing large language models to run on edge devices, amplifies bias even while standard performance metrics are preserved. Activation-aware pruning methods maintain perplexity yet increase stereotype reliance by up to 84%, which suggests that current evaluation methods fail to detect fairness degradation in compressed models.
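For readers unfamiliar with the term, the sketch below illustrates one common form of activation-aware pruning, in which each weight is scored by its magnitude multiplied by the L2 norm of the input activations it sees (a Wanda-style score). This is an assumption made for illustration: the summary does not say which pruning methods the study evaluated, and the function name, the toy weights, and the calibration activations are all hypothetical.

```python
import math


def activation_aware_prune(weight, activations, sparsity=0.5):
    """Zero out the lowest-scoring weights in each output row.

    weight:      list of rows, shape (out_features, in_features)
    activations: calibration inputs, shape (n_samples, in_features)
    sparsity:    fraction of weights to remove per row

    Assumed score (Wanda-style): |W[i][j]| * ||X[:, j]||_2.
    This is an illustrative choice, not necessarily the study's method.
    """
    n_in = len(weight[0])
    # L2 norm of each input feature across the calibration samples.
    act_norm = [
        math.sqrt(sum(sample[j] ** 2 for sample in activations))
        for j in range(n_in)
    ]
    n_prune = int(n_in * sparsity)

    pruned = []
    for w_row in weight:
        scores = [abs(w) * act_norm[j] for j, w in enumerate(w_row)]
        # Indices of the n_prune lowest-scoring weights in this row.
        drop = set(sorted(range(n_in), key=lambda j: scores[j])[:n_prune])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(w_row)])
    return pruned


# Hypothetical toy layer: one output neuron, four inputs.
weight = [[0.9, -0.1, 0.5, 0.2]]
# Hypothetical calibration activations: feature 1 is consistently large.
activations = [
    [0.1, 3.0, 1.0, 0.2],
    [0.1, 4.0, 1.0, 0.1],
]

pruned = activation_aware_prune(weight, activations, sparsity=0.5)
print(pruned)  # [[0.0, -0.1, 0.5, 0.0]]
```

In this toy case the largest weight (0.9) is removed because its input is almost always near zero, while the small weight (-0.1) survives because its input is large. Pure magnitude pruning would have made the opposite choice. Selecting weights by their effect on typical activations is consistent with perplexity holding steady after pruning, but this sketch says nothing about how bias changes.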