AINeutralarXiv – CS AI · 15h ago6/10
🧠
Tracing Computation Density in LLMs
Researchers introduce the s-Trace method to analyze how transformer-based LLMs utilize their computational capacity, revealing that model computation organizes into two distinct phases: a sparse early-layer core providing rough predictions, refined through denser later-layer computations. The findings suggest LLMs operate with modular efficiency rather than fully exploiting their parameter capacity across all inputs.