AI · Neutral · arXiv – CS AI · 7h ago · 6/10
Neural FOXP2 -- Language Specific Neuron Steering for Targeted Language Improvement in LLMs
Researchers introduce Neural FOXP2, a technique that identifies and steers language-specific neurons in large language models (LLMs) to shift a model's default language from English to another language, such as Hindi or Spanish. The method uses sparse autoencoders and spectral analysis to isolate a compact set of control circuits that govern language preference, enabling safer, more targeted manipulation of multilingual model behavior.