AIBullish · arXiv · CS AI · 5d ago
HierarchicalPrune: Position-Aware Compression for Large-Scale Diffusion Models
Researchers developed HierarchicalPrune, a compression framework that reduces the memory footprint of large-scale text-to-image diffusion models by 77.5-80.4% and inference latency by 27.9-38.0% while preserving image quality. By combining position-aware hierarchical pruning with knowledge distillation, the technique enables billion-parameter models to run efficiently on resource-constrained devices.
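To make the idea of position-aware pruning concrete, here is a minimal illustrative sketch, not the paper's actual algorithm: given per-block importance scores, blocks are ranked with a bonus for earlier network positions (on the assumption that early blocks carry coarse structural information), and only the top fraction is kept. The function name, the `position_weight` blend, and the linear position prior are all hypothetical choices for illustration.

```python
# Hypothetical sketch of position-aware block pruning (NOT the
# HierarchicalPrune algorithm from the paper). Each block gets a score
# blending its measured importance with a position prior that favors
# earlier blocks; the lowest-scoring blocks are pruned.

def position_aware_prune(importances, keep_ratio=0.5, position_weight=0.3):
    """Return sorted indices of blocks to keep under a budget.

    importances    -- per-block importance scores in [0, 1]
    keep_ratio     -- fraction of blocks to retain
    position_weight -- how strongly earlier positions are favored
    """
    n = len(importances)
    scored = []
    for i, imp in enumerate(importances):
        # Linear position prior: earliest block -> 1.0, last block -> 0.0.
        pos_prior = 1.0 - i / max(n - 1, 1)
        score = position_weight * pos_prior + (1 - position_weight) * imp
        scored.append((score, i))
    k = max(1, int(n * keep_ratio))
    # Keep the k highest-scoring blocks, reported in network order.
    return sorted(i for _, i in sorted(scored, reverse=True)[:k])

# Example: 8 blocks; ties between importance and position resolve in
# favor of earlier blocks because of the position prior.
scores = [0.9, 0.4, 0.8, 0.3, 0.7, 0.2, 0.6, 0.1]
print(position_aware_prune(scores, keep_ratio=0.5))  # → [0, 1, 2, 4]
```

In this toy example, block 1 survives despite a low importance score because its early position boosts it past the later block 6; a real system would then fine-tune or distill the pruned model to recover quality.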