AINeutralarXiv – CS AI · 9h ago6/10
🧠
MesaNet: Sequence Modeling by Locally Optimal Test-Time Training
Researchers introduce MesaNet, an improved recurrent neural network architecture that optimizes sequence modeling through test-time training, achieving better language modeling performance than previous RNNs while requiring additional inference-time compute. The work advances the trend toward linearized transformers that maintain constant memory costs during inference, positioning computational efficiency against performance gains.
🏢 Perplexity