🤖 AI Summary
Researchers introduce STAR, an autoregressive pretraining method for Vision Mamba that concatenates multiple images into a single sequence, inserting an identical separator token before each image. This quadruples the input sequence length without altering the images' dimensions. The pretrained STAR-B model reaches 83.5% accuracy on ImageNet-1k, suggesting that better use of long-range dependencies improves performance on computer vision tasks.
Key Takeaways
- Vision Mamba's causal mechanism makes it well-suited for autoregressive pretraining, but current methods are limited to short sequences.
- STAR inserts identical separators before each image to demarcate the images in a concatenated sequence, extending sequence length by 4x (see the sketch after this list).
- The method preserves the original dataset's image dimensions while significantly increasing input sequence length.
- STAR-B achieves 83.5% accuracy on ImageNet-1k, competitive among Vision Mamba models.
- The approach shows potential for improving vision models through better long-range dependency modeling.
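To make the separator mechanism concrete, here is a minimal PyTorch sketch of how such a sequence could be assembled: an identical learnable separator token is placed before each of four images, quadrupling the sequence length while each image keeps its original resolution. The class name, the Conv2d patch embedding, and the `sep_token` parameter are illustrative assumptions, not the paper's actual code.

```python
import torch
import torch.nn as nn

class StarSequenceBuilder(nn.Module):
    """Sketch of STAR-style input construction: [SEP] img1 [SEP] img2 ..."""

    def __init__(self, patch_size=16, embed_dim=768):
        super().__init__()
        # Standard patch embedding: non-overlapping patches -> tokens.
        self.patch_embed = nn.Conv2d(3, embed_dim,
                                     kernel_size=patch_size, stride=patch_size)
        # One learnable separator token, reused (identical) before every image.
        self.sep_token = nn.Parameter(torch.zeros(1, 1, embed_dim))

    def forward(self, images):
        # images: (B, N, 3, H, W) -- N images at their original resolution,
        # to be joined into one long token sequence (N = 4 in the paper's setup).
        B, N, _, _, _ = images.shape
        sep = self.sep_token.expand(B, 1, -1)
        pieces = []
        for i in range(N):
            patches = self.patch_embed(images[:, i])      # (B, D, H/p, W/p)
            patches = patches.flatten(2).transpose(1, 2)  # (B, L, D)
            pieces.append(sep)       # identical separator demarcates each image
            pieces.append(patches)
        # With N = 4, the result is ~4x the single-image sequence length,
        # while no image is resized.
        return torch.cat(pieces, dim=1)
```

A causal Mamba backbone would then be pretrained autoregressively over this concatenated sequence, so next-token prediction must model dependencies spanning image boundaries.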
#vision-mamba #autoregressive #pretraining #computer-vision #sequence-modeling #imagenet #deep-learning #state-space-models
Read Original → via arXiv – CS AI