y0news
#hardware-optimization1 article
1 articles
AINeutralarXiv โ€“ CS AI ยท 4h ago4
๐Ÿง 

RooflineBench: A Benchmarking Framework for On-Device LLMs via Roofline Analysis

Researchers introduce RooflineBench, a framework for measuring performance capabilities of Small Language Models on edge devices using operational intensity analysis. The study reveals that sequence length significantly impacts performance, model depth causes efficiency regression, and structural improvements like Multi-head Latent Attention can unlock better hardware utilization.