AI · Neutral · arXiv – CS AI · 4h ago
RooflineBench: A Benchmarking Framework for On-Device LLMs via Roofline Analysis
Researchers introduce RooflineBench, a framework that uses roofline (operational intensity) analysis to measure how well Small Language Models perform on edge devices. The study finds that sequence length significantly affects performance, that model depth causes an efficiency regression, and that structural improvements such as Multi-head Latent Attention can unlock better hardware utilization.
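For readers unfamiliar with roofline analysis, the standard model bounds attainable throughput by the lower of two limits: the device's peak compute rate, and its memory bandwidth multiplied by the workload's operational intensity (FLOPs per byte moved). The sketch below illustrates only that general formula. The function names and device numbers are hypothetical and are not taken from RooflineBench or the paper.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gbs: float, intensity: float) -> float:
    """Roofline bound: min(compute roof, memory bandwidth * operational intensity).

    intensity is in FLOPs per byte, bandwidth in GB/s, so the product is GFLOP/s.
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops: float, bandwidth_gbs: float) -> float:
    """Operational intensity at which a workload stops being memory-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical edge-device figures, chosen only for illustration.
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 50.0

if __name__ == "__main__":
    ridge = ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS)
    for oi in (1.0, 20.0, 100.0):
        bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, oi)
        regime = "memory-bound" if oi < ridge else "compute-bound"
        print(f"intensity={oi:6.1f} FLOP/B  bound={bound:7.1f} GFLOP/s  ({regime})")
```

Under this model, a workload whose operational intensity sits below the ridge point is limited by memory traffic rather than compute, so any change that shifts its intensity also shifts the performance ceiling.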