AIBullisharXiv โ CS AI ยท 6h ago10
๐ง
Reason to Contrast: A Cascaded Multimodal Retrieval Framework
Researchers introduce TTE-v2, a new multimodal retrieval framework that achieves state-of-the-art performance by incorporating reasoning steps during retrieval and reranking. The approach demonstrates that scaling based on reasoning tokens rather than model size can significantly improve performance, with TTE-v2-7B reaching 75.7% accuracy on MMEB-V2 benchmark.