AI | Bullish | Importance 7/10

VLA-Pro: Cross-Task Procedural Memory Transfer for Vision-Language-Action Models

arXiv – CS AI|Shengyu Si, Yuanzhuo Lu, Ruimeng Yang, Ziyi Ye, Zuxuan Wu, Yu-Gang Jiang|May 29, 2026 at 04:00 AM

🤖AI Summary

Researchers introduce VLA-Pro, a framework that enhances vision-language-action models for robotics by storing task-specific procedural memories and retrieving them during inference. The reported gains are large: up to 207% relative improvement in simulation, and a rise in real-world success rate from 5.8% to 65%. The results point to significant progress in cross-task generalization for robotic manipulation.

Analysis

VLA-Pro addresses a critical limitation in current vision-language-action models: they struggle to generalize to novel tasks by reusing experience across different objects, scenes, and action patterns. The framework operates as a modular plug-and-play system that stores task-specific LoRA adapters as procedural memories during training, then dynamically retrieves and fuses these memories during inference based on multi-modal context. This design keeps execution modular and stable while enabling knowledge transfer across tasks.
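The retrieve-and-fuse step described above can be illustrated with a minimal sketch. The summary does not say how VLA-Pro scores or combines memories, so everything here is an assumption made for illustration: the context embedding, the cosine-similarity lookup, the top-k selection, the softmax weighting, and all names (`retrieve_and_fuse`, `memory`, `temperature`). Only the idea of storing per-task LoRA adapters and fusing them at inference comes from the article.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def matmul(a, b):
    """Plain-Python matrix product of a (n x r) and b (r x m)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def retrieve_and_fuse(query, memory, top_k=2, temperature=0.1):
    """Pick the top_k adapters closest to the query and blend their LoRA updates.

    Each memory entry holds a hypothetical context key plus LoRA factors
    B (d x r) and A (r x k); the adapter's weight update is B @ A.
    """
    scored = sorted(memory, key=lambda m: cosine(query, m["key"]), reverse=True)[:top_k]
    sims = [cosine(query, m["key"]) for m in scored]
    # Softmax over similarities (assumed weighting scheme, not from the paper).
    peak = max(sims)
    exps = [math.exp((s - peak) / temperature) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Fused update = sum_i w_i * (B_i @ A_i).
    rows, cols = len(scored[0]["B"]), len(scored[0]["A"][0])
    delta = [[0.0] * cols for _ in range(rows)]
    for w, m in zip(weights, scored):
        update = matmul(m["B"], m["A"])
        for i in range(rows):
            for j in range(cols):
                delta[i][j] += w * update[i][j]
    return delta, [(m["task"], w) for m, w in zip(scored, weights)]

# Toy memory bank: three made-up tasks with rank-1 adapters.
memory = [
    {"task": "pick_cup", "key": [1.0, 0.0, 0.0], "A": [[1.0, 0.0]], "B": [[1.0], [0.0]]},
    {"task": "open_drawer", "key": [0.0, 1.0, 0.0], "A": [[0.0, 1.0]], "B": [[0.0], [1.0]]},
    {"task": "stack_blocks", "key": [0.0, 0.0, 1.0], "A": [[1.0, 1.0]], "B": [[1.0], [1.0]]},
]
query = [0.9, 0.1, 0.0]  # stand-in for an embedding of the current scene and instruction
delta, used = retrieve_and_fuse(query, memory)
print(used)
```

In this toy setup the fused update would be added to a base weight matrix, which is the standard way LoRA updates are applied. Keeping adapters separate and blending them only at inference is one plausible reading of the "plug-and-play" description.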

The advancement builds on growing recognition that language models and multimodal systems benefit from structured memory mechanisms. Rather than relying solely on end-to-end training, VLA-Pro uses explicit procedural memory storage, an approach that mirrors how humans transfer skills across related tasks. The testing across RoboTwin, RLBench, and real-world environments suggests the approach generalizes across different robotic platforms and task complexities.

For the robotics and AI industry, these results signal meaningful progress toward practical general-purpose manipulation systems. The 207% relative improvement in simulation and the roughly elevenfold increase in real-world success rates (5.8% to 65%) represent substantial performance gains that could accelerate deployment of robotic systems in manufacturing, logistics, and service sectors. The modularity of the approach should appeal to developers, who can build on existing VLA backbones without architectural changes.
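Relative improvement and fold change are easy to conflate, so a short worked calculation may help when reading these figures. The helper names below are illustrative. The summary gives no baseline for the simulation result, so only the multiplier implied by a 207% relative gain is computed there.

```python
def relative_improvement(baseline, improved):
    """Percentage gain relative to the baseline value."""
    return (improved - baseline) / baseline * 100.0

def fold_change(baseline, improved):
    """How many times larger the improved value is than the baseline."""
    return improved / baseline

# Real-world success rates reported in the summary (percent).
real_rel = relative_improvement(5.8, 65.0)
real_fold = fold_change(5.8, 65.0)
real_points = 65.0 - 5.8  # absolute gain in percentage points

# A 207% relative improvement means the new score is (1 + 2.07) times the baseline.
sim_fold = 1 + 207 / 100

print(f"real world: {real_fold:.1f}x, +{real_rel:.0f}% relative, +{real_points:.1f} points")
print(f"simulation: {sim_fold:.2f}x the baseline at the best reported gain")
```

The real-world jump is therefore about 11.2 times the baseline, or 59.2 percentage points, while the best simulation result is about 3.07 times its baseline.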

Looking ahead, research will likely focus on scaling procedural memory systems to larger task distributions, optimizing memory retrieval efficiency for real-time robotic control, and exploring how procedural memory transfers across entirely different domains beyond manipulation.

Key Takeaways

→VLA-Pro stores task-specific LoRA adapters as procedural memories to enable cross-task generalization in robotic manipulation
→Real-world manipulation success rate increased from 5.8% to 65%, demonstrating practical viability for deployed systems
→Framework achieves up to 207% relative improvement in simulation environments across multiple robotic platforms
→Modular plug-and-play design allows integration with existing VLA backbones without architectural modifications
→Dynamic memory fusion mechanism enables robots to transfer manipulation experience to novel tasks while maintaining execution stability

#robotics #vision-language-models #procedural-memory #manipulation #generalization #lora-adapters #ai-research #cross-task-transfer

