AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
Directional Routing in Transformers
Researchers introduce directional routing, a lightweight mechanism for transformer models that adds only 3.9% to the parameter count. The technique gives attention heads learned suppression directions controlled by a shared router. The authors report perplexity reductions of 31-56%, and the routing mechanism becomes the dominant computational pathway in the model.
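The summary does not give the exact formulation, so the following is only a minimal sketch of one plausible reading: each head's output has its component along a learned direction removed, scaled by a gate that a shared router computes from the input. The function names (`router_gates`, `suppress`), the sigmoid gating, and the single linear router layer are all assumptions made for illustration, not the paper's implementation.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale v to unit length (assumes v is non-zero)."""
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def router_gates(x, router_weights):
    """Hypothetical shared router: a single linear map followed by a sigmoid.

    router_weights[h][k] is the weight vector for head h, direction k.
    Returns one gate in (0, 1) per (head, direction) pair.
    """
    return [[sigmoid(dot(w, x)) for w in head_rows] for head_rows in router_weights]


def suppress(head_out, directions, gates):
    """Subtract the gated projection of head_out onto each direction.

    Assumed form: o' = o - sum_k g_k * (o . d_k) * d_k, with each d_k unit-normalized.
    A gate of 0 leaves the output unchanged; a gate of 1 removes the
    component along that direction entirely.
    """
    out = list(head_out)
    for d, g in zip(directions, gates):
        u = normalize(d)
        coeff = g * dot(head_out, u)
        out = [o - coeff * a for o, a in zip(out, u)]
    return out


# Usage: one head, one suppression direction along the first axis.
head_output = [3.0, 4.0]
direction = [[2.0, 0.0]]          # normalized internally to [1.0, 0.0]
print(suppress(head_output, direction, [1.0]))  # [0.0, 4.0]
print(suppress(head_output, direction, [0.0]))  # [3.0, 4.0]
```

In this sketch the router is shared in the sense that one function produces the gates for every head, while the directions themselves are per-head parameters. That split would keep the added parameter count small, though the 3.9% figure depends on details not given here.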