y0news
🧠 AI · Neutral · Importance 6/10

Extracting and Steering Emotion Representations in Small Language Models: A Methodological Comparison

arXiv – CS AI | Jihoon Jeong
🤖 AI Summary

Researchers conducted the first comprehensive analysis of emotion representations in small language models (100M-10B parameters), finding that these models do possess internal emotion vectors similar to larger frontier models. The study evaluated 9 models across 5 architectural families and discovered that emotion representations localize at middle transformer layers, with generation-based extraction methods proving superior to comprehension-based approaches.

Key Takeaways
  • Small language models (100M-10B parameters) possess internal emotion representations similar to larger frontier models.
  • Generation-based extraction methods produce statistically superior emotion separation compared to comprehension-based methods.
  • Emotion representations consistently localize at middle transformer layers (~50% depth) across different architectures and scales.
  • Steering experiments revealed three behavioral regimes: surgical transformation, repetitive collapse, and explosive text degradation.
  • Cross-lingual emotion entanglement was discovered in Qwen models, raising safety concerns for multilingual AI deployment.
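The summary does not give the paper's exact extraction procedure, but the general technique it describes (finding an emotion direction in middle-layer activations and adding it back during generation) is commonly implemented as a difference-in-means steering vector. The sketch below is a minimal, hypothetical illustration using random numpy arrays in place of real model hidden states; `d_model`, the prompt counts, and `alpha` are all assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 64  # illustrative hidden size, not from the paper

# Hypothetical middle-layer hidden states, one row per prompt.
# In practice these would come from a forward hook at ~50% depth,
# where the study reports emotion representations localize.
h_emotion = rng.normal(0.5, 1.0, size=(32, d_model))  # e.g. "angry" prompts
h_neutral = rng.normal(0.0, 1.0, size=(32, d_model))  # neutral prompts

# Difference-in-means "emotion vector" at the chosen layer.
v = h_emotion.mean(axis=0) - h_neutral.mean(axis=0)
v = v / np.linalg.norm(v)  # unit-normalize so alpha alone sets strength

def steer(hidden, alpha):
    """Add the emotion direction to a hidden state during generation.
    Small alpha tends toward 'surgical' edits; large alpha pushes into
    the repetitive-collapse / degradation regimes the study describes."""
    return hidden + alpha * v

h = rng.normal(size=d_model)
h_steered = steer(h, alpha=4.0)

# The steered state projects onto the emotion direction by exactly alpha more.
print(round(float(h_steered @ v - h @ v), 6))  # → 4.0
```

Because `v` is unit-normalized, the projection onto the emotion direction shifts by exactly `alpha`, which makes the steering strength directly interpretable when sweeping it across the three behavioral regimes.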