AIBullisharXiv โ CS AI ยท 10h ago7/10
๐ง
Neurons Speak in Ranges: Breaking Free from Discrete Neuronal Attribution
Researchers introduce NeuronLens, a framework that interprets neural networks by analyzing activation ranges rather than individual neurons, addressing the widespread polysemanticity problem in large language models. The range-based approach enables more precise concept manipulation while minimizing unintended degradation to model performance.