y0news
🧠 AI | ⚪ Neutral | Importance 6/10

Extracting and Steering Emotion Representations in Small Language Models: A Methodological Comparison

arXiv – CS AI | Jihoon Jeong
🤖 AI Summary

Researchers conducted the first comprehensive analysis of emotion representations in small language models (100M-10B parameters), finding that these models do possess internal emotion vectors similar to larger frontier models. The study evaluated 9 models across 5 architectural families and discovered that emotion representations localize at middle transformer layers, with generation-based extraction methods proving superior to comprehension-based approaches.

Key Takeaways
  • Small language models (100M-10B parameters) possess internal emotion representations similar to larger frontier models.
  • Generation-based extraction methods produce statistically superior emotion separation compared to comprehension-based methods.
  • Emotion representations consistently localize at middle transformer layers (~50% depth) across different architectures and scales.
  • Steering experiments revealed three behavioral regimes: surgical transformation, repetitive collapse, and explosive text degradation.
  • Cross-lingual emotion entanglement was discovered in Qwen models, raising safety concerns for multilingual AI deployment.