AI · Bullish · arXiv – CS AI · 3h ago · 6/10
VCap: Hypergeometric Rewards for Weak-to-Strong Visual Captioning
Researchers introduce VCap, a reinforcement learning reward mechanism that improves visual captioning in multimodal AI models by grounding caption verification in actual visual signals. An 8B-parameter model trained with VCap outperforms larger open- and closed-source competitors on image and video captioning benchmarks, demonstrating that better reward design can enable weak-to-strong generalization in AI training.