AI · Bullish · arXiv – CS AI · 5d ago · 7/10
VITA: Zero-Shot Value Functions via Test-Time Adaptation of Vision-Language Models
Researchers introduce VITA, a method for learning zero-shot value functions that enhances Vision-Language Models (VLMs) through test-time adaptation for robotic manipulation tasks. The system updates its parameters sequentially over each trajectory to improve temporal reasoning, generalizes across diverse environments, and outperforms existing autoregressive VLM methods.
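
The idea of updating parameters step by step along a trajectory can be illustrated with a toy sketch. This is not the paper's implementation: the reconstruction loss, the linear adapter, the frozen value head, and every name below (`adapt_over_trajectory`, `head`, `lr`) are assumptions chosen only to show the general pattern of test-time adaptation, where a small set of weights takes one gradient step per frame and so carries information about earlier frames into later value estimates.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def adapt_over_trajectory(features, head, lr=0.1):
    """Toy sequential test-time adaptation (illustrative, hypothetical names).

    features: per-frame feature vectors, standing in for frozen VLM embeddings.
    head:     frozen linear value head applied to the adapted features.
    Assumed self-supervised loss per frame: 0.5 * ||W x - x||^2.
    Returns (values, losses): the value estimate after each update and the
    loss measured before each update.
    """
    dim = len(features[0])
    W = [[0.0] * dim for _ in range(dim)]  # adapter weights, start at zero
    values, losses = [], []
    for x in features:
        # Residual of the adapter's reconstruction: W x - x.
        err = [p - xi for p, xi in zip(matvec(W, x), x)]
        losses.append(0.5 * sum(e * e for e in err))
        # One gradient step; the gradient w.r.t. W is outer(err, x).
        for i in range(dim):
            for j in range(dim):
                W[i][j] -= lr * err[i] * x[j]
        # Value estimate uses the adapter as updated so far in the trajectory.
        values.append(sum(h * a for h, a in zip(head, matvec(W, x))))
    return values, losses


if __name__ == "__main__":
    trajectory = [[1.0, 0.0]] * 3  # three identical toy frames
    values, losses = adapt_over_trajectory(trajectory, head=[1.0, 0.0], lr=0.5)
    print(values, losses)
```

Because the adapter weights persist across frames, the estimate for a frame depends on the frames that came before it, which is the property the summary attributes to sequential updating.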