AIBearish · arXiv CS AI · 10h ago · 7/10
On the Limits of Layer Pruning for Generative Reasoning in Large Language Models
Research demonstrates that layer pruning, a compression technique for large language models, effectively reduces model size while maintaining classification performance, but it critically fails to preserve generative reasoning capabilities such as arithmetic and code generation. Even with extensive post-training on 400B tokens, pruned models cannot recover the lost reasoning abilities, revealing a fundamental limitation of current compression approaches.
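The core idea of layer pruning can be sketched minimally: treat the model as a sequence of layers and drop a contiguous block of them. The sketch below is a toy illustration only; the layer count, pruning position, and `prune` helper are hypothetical and not taken from the paper.

```python
# Toy illustration of layer pruning: a "model" as a list of layer
# functions applied in sequence. Dropping a contiguous block of
# layers shrinks the model while leaving the rest of the pipeline
# intact. All names and sizes here are illustrative.

def make_layer(i):
    # Each toy layer just adds its index; it stands in for a transformer block.
    return lambda x: x + i

layers = [make_layer(i) for i in range(12)]  # a 12-layer "model"

def prune(layers, start, count):
    """Remove `count` consecutive layers beginning at index `start`."""
    return layers[:start] + layers[start + count:]

pruned = prune(layers, start=8, count=3)  # drop layers 8, 9, 10

def forward(layers, x):
    for layer in layers:
        x = layer(x)
    return x

print(len(layers), len(pruned))  # 12 9
print(forward(layers, 0))        # 0+1+...+11 = 66
print(forward(pruned, 0))        # 66 - (8+9+10) = 39
```

The changed output of `forward` after pruning mirrors the paper's point: the smaller model computes something measurably different, and whether downstream capability survives depends on the task.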