AI · Bullish · arXiv CS AI · 9h ago · 7/10
Directional Routing in Transformers
Researchers introduce directional routing, a lightweight mechanism for transformer models that adds only 3.9% to the parameter count while reducing perplexity by 31-56%. The technique gives attention heads learned suppression directions controlled by a shared router, and the routed pathway reportedly becomes the dominant computational pathway in the model.
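The summary does not spell out the mechanism, so the following is only a minimal sketch of one plausible reading: each head owns a few learned directions, a shared router maps the token input to a gate in [0, 1] per direction, and the gated component of the head output along each direction is subtracted. The function names (`route_gates`, `suppress`, `directional_routing`), the sigmoid gating, and the weight layout are all assumptions made for illustration, not the paper's actual formulation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    # Scale v to unit length so the projection coefficient is well defined.
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def route_gates(x, router_weights):
    # Assumed shared router: one sigmoid gate per (head, direction),
    # computed from the token input x. router_weights[h][k] is a
    # hypothetical weight vector with the same length as x.
    return [[sigmoid(dot(w, x)) for w in head_ws] for head_ws in router_weights]

def suppress(head_out, directions, gates):
    # Subtract the gated component of head_out along each direction.
    # Directions are applied one after another (not orthogonalized),
    # which is a simplification chosen for this sketch.
    out = list(head_out)
    for d, g in zip(directions, gates):
        u = normalize(d)
        coeff = g * dot(out, u)
        out = [o - coeff * ui for o, ui in zip(out, u)]
    return out

def directional_routing(x, head_outputs, directions, router_weights):
    # Apply router-controlled suppression to every head's output.
    gates = route_gates(x, router_weights)
    return [suppress(o, ds, gs)
            for o, ds, gs in zip(head_outputs, directions, gates)]
```

With a gate of 1 the component along a direction is removed entirely, and with a gate of 0 the head output passes through unchanged, so under this reading the router decides per token how strongly each head's direction is suppressed.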
Perplexity