AI · Bullish · arXiv – CS AI · 8h ago · 7/10
Less is More: Lightweight Prompt Compression for Question Answering Applications on Edge Devices
Researchers introduce CORE, a lightweight prompt compression method for running large language models on edge devices that does not require an auxiliary smaller model. The approach achieves a 30% accuracy improvement while reducing memory usage by 50% and cutting energy consumption by 95% on smartphones, compared to existing methods.
🏢 Nvidia