y0news
#small-language-models
2 articles
AI · Bullish · arXiv – CS AI · 4h ago · 5
🧠

Task-Centric Acceleration of Small-Language Models

Researchers propose TASC (Task-Adaptive Sequence Compression), a framework for accelerating small language models through two methods: TASC-ft, which fine-tunes models with expanded vocabularies, and TASC-spec, a training-free speculative-decoding variant. Both improve inference efficiency while maintaining task performance on generation tasks with low output variability.
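The summary does not detail TASC-spec's compression scheme, but the speculative-decoding idea it builds on can be sketched generically: a cheap draft model proposes several tokens, and the target model verifies them, accepting the longest matching prefix. The toy `target_model`/`draft_model` functions below are stand-ins, not the paper's models.

```python
import random

VOCAB = list(range(10))

def target_model(context):
    # Deterministic toy "target": next token is sum(context) mod vocab size.
    return sum(context) % len(VOCAB)

def draft_model(context):
    # Cheap toy "draft" that agrees with the target most of the time.
    guess = sum(context) % len(VOCAB)
    return guess if random.random() < 0.8 else random.choice(VOCAB)

def speculative_decode(context, num_tokens, k=4):
    out = list(context)
    while len(out) - len(context) < num_tokens:
        # 1) Draft proposes k tokens cheaply.
        draft, ctx = [], list(out)
        for _ in range(k):
            t = draft_model(ctx)
            draft.append(t)
            ctx.append(t)
        # 2) Target verifies the drafted prefix; accept until first mismatch,
        #    then emit the target's own token as the correction.
        ctx = list(out)
        for t in draft:
            if target_model(ctx) == t:
                out.append(t)
                ctx.append(t)
            else:
                out.append(target_model(ctx))
                break
    return out[len(context) : len(context) + num_tokens]

random.seed(0)
tokens = speculative_decode([1, 2, 3], num_tokens=8)
```

Because rejected draft tokens are always replaced by the target's own choice, the output is identical to plain greedy decoding with the target model; the draft only changes how many target verifications are amortized per step.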

AI · Neutral · arXiv – CS AI · 4h ago · 4
🧠

RooflineBench: A Benchmarking Framework for On-Device LLMs via Roofline Analysis

Researchers introduce RooflineBench, a framework for measuring the performance of small language models on edge devices via operational-intensity (roofline) analysis. The study finds that sequence length strongly affects performance, that added model depth degrades efficiency, and that structural changes such as Multi-head Latent Attention can unlock better hardware utilization.
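The roofline analysis the summary refers to caps attainable performance by either peak compute or peak memory bandwidth, depending on a kernel's operational intensity (FLOPs per byte moved). A minimal sketch of that calculation, with illustrative device numbers that are assumptions rather than figures from the paper:

```python
def attainable_gflops(intensity, peak_gflops, peak_gbps):
    # Roofline model: performance = min(compute roof, bandwidth * intensity).
    return min(peak_gflops, intensity * peak_gbps)

# Example: a matrix-vector product (typical of LLM decode) with d = 4096.
d = 4096
flops = 2 * d * d            # one multiply-add per weight
bytes_moved = 2 * d * d      # fp16 weights dominate memory traffic
intensity = flops / bytes_moved  # = 1.0 FLOP/byte: strongly memory-bound

# Hypothetical edge-device peaks (illustrative only).
peak_gflops, peak_gbps = 1000.0, 50.0
perf = attainable_gflops(intensity, peak_gflops, peak_gbps)
```

With an intensity of 1 FLOP/byte, the kernel sits well left of the ridge point (1000 / 50 = 20 FLOPs/byte), so it is bandwidth-bound and reaches only 50 GFLOPs of the 1000 GFLOPs compute roof, which is why decode-time LLM workloads on edge devices are usually limited by memory traffic rather than arithmetic.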