AINeutralarXiv – CS AI · 7h ago6/10
🧠
RoVE: Rotary Value Embeddings Attention for Relative Position-dependent Value Pathways
Researchers introduce RoVE (Rotary Value Embeddings), a parameter-free modification to Rotary Position Embeddings (RoPE) that makes value tokens position-sensitive in attention mechanisms. Testing on GPT-2 models demonstrates consistent improvements in few-shot learning, out-of-distribution performance, and long-context retrieval tasks.
🏢 Perplexity