AIBullish · arXiv · CS AI · 8h ago · 6/10
GPrune-LLM: Generalization-Aware Structured Pruning for Large Language Models
Researchers introduce GPrune-LLM, a new structured pruning framework that improves compression of large language models by addressing calibration bias and cross-task generalization issues. The method partitions neurons into behavior-consistent modules and uses adaptive metrics based on distribution sensitivity, showing consistent improvements in post-compression performance.
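The summary's core idea (group neurons into modules, score each module with a metric sensitive to the activation distribution on calibration data, then prune whole low-scoring modules) can be sketched as follows. This is a minimal illustration, not the paper's method: the function name `prune_structured`, the fixed-size module grouping, and the specific score (weight norm scaled by activation variance, a stand-in for the distribution-sensitivity metric) are all assumptions for the sake of the example.

```python
import numpy as np

def prune_structured(weight, activations, keep_ratio=0.75, module_size=4):
    """Hypothetical sketch of structured pruning with an activation-aware score.

    weight:      (out_features, in_features) layer weight matrix
    activations: (n_samples, out_features) activations on calibration data
    Neurons are grouped into fixed-size modules; each module is scored by a
    metric mixing weight magnitude with the variance of its activations
    across calibration inputs, and the lowest-scoring modules are zeroed.
    """
    out_features = weight.shape[0]
    n_modules = out_features // module_size
    idx = np.arange(n_modules * module_size).reshape(n_modules, module_size)

    # Per-neuron score: L2 weight norm scaled by activation variance,
    # so neurons whose outputs barely vary on calibration data rank low.
    w_norm = np.linalg.norm(weight, axis=1)
    a_var = activations.var(axis=0)
    neuron_score = w_norm * (1.0 + a_var)

    # Module score = mean score of its member neurons.
    module_score = neuron_score[idx].mean(axis=1)

    # Keep the top-scoring modules; zero out the rest (structured pruning
    # removes whole groups of neurons, not individual weights).
    n_keep = max(1, int(round(keep_ratio * n_modules)))
    keep = np.argsort(module_score)[-n_keep:]
    mask = np.zeros(out_features, dtype=bool)
    mask[idx[keep].ravel()] = True

    pruned = weight.copy()
    pruned[~mask] = 0.0
    return pruned, mask
```

Because entire modules are removed, the pruned layer can be physically shrunk (dropping zeroed rows) rather than relying on sparse kernels, which is the practical appeal of structured over unstructured pruning.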