AIBullish · arXiv CS AI · 4h ago
Task-Centric Acceleration of Small-Language Models
Researchers propose TASC (Task-Adaptive Sequence Compression), a framework for accelerating small language models via two methods: TASC-ft, which fine-tunes the model with an expanded vocabulary, and TASC-spec, which applies training-free speculative decoding. Both methods improve inference efficiency while preserving task performance on generation tasks with low output variability.
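The blurb does not detail how TASC-spec works internally; as a reference point, the general draft-then-verify loop behind speculative decoding can be sketched as follows. The toy deterministic `target` and `draft` "models" and the helper name `speculative_decode` are illustrative assumptions, not the paper's implementation:

```python
def speculative_decode(target, draft, prefix, k, steps):
    """Generic speculative-decoding sketch (not TASC-specific).

    A cheap `draft` model proposes k tokens per step; the expensive
    `target` model verifies them and accepts the longest agreeing
    prefix, then emits one token of its own. Both models are plain
    functions: context (list of tokens) -> next token.
    """
    out = list(prefix)
    for _ in range(steps):
        # Draft phase: propose k tokens autoregressively.
        ctx = list(out)
        proposal = []
        for _ in range(k):
            t = draft(ctx)
            proposal.append(t)
            ctx.append(t)

        # Verify phase: accept while the target model agrees.
        ctx = list(out)
        for t in proposal:
            if target(ctx) != t:
                break  # first disagreement; discard the rest
            out.append(t)
            ctx.append(t)

        # The target always contributes one token (correction or
        # continuation), so progress is guaranteed every step.
        out.append(target(out))
    return out


# Toy "models": next token = previous token + 1 (mod 10).
# With draft == target, every proposal is accepted, so each step
# advances k + 1 tokens -- the best case for speculative decoding.
def target(ctx):
    return (ctx[-1] + 1) % 10


tokens = speculative_decode(target, target, prefix=[0], k=3, steps=2)
print(tokens)  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

When the draft model disagrees with the target, only the agreeing prefix is kept, so output matches what the target model alone would produce; the speedup comes from verifying k drafted tokens in one target pass instead of k sequential passes.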