🧠 AI🟢 BullishImportance 7/10

VASO: Formally Verifiable Self-Evolving Skills for Physical AI Agents

arXiv – CS AI|Yunhao Yang, Neel P. Bhatt, Kevin Wang, Samuel Tetteh, Zhangyang Wang, Ufuk Topcu|June 5, 2026 at 04:00 AM

🤖AI Summary

Researchers introduce VASO, a framework that combines formal verification with self-evolving language model skills for robot control, achieving 97.2% specification compliance on physical tasks. The approach bridges formal methods and foundation models by using counterexamples from model checking as optimization feedback for skill contracts rather than modifying underlying model weights.

Analysis

VASO addresses a critical gap in autonomous robotics: while large language models have reduced the engineering burden of creating robot skills, they have introduced new trust problems. Traditional skill refinement relies on execution traces, unit tests, or reward signals—all of which provide limited evidence of correctness. A robot skill that works in sampled conditions may still fail under untested scenarios, posing safety risks in physical deployments.

The framework achieves this by decoupling verification from execution. Each skill exists as a semantic contract with two interfaces: a formal one that translates robot states into logical propositions suitable for model checking, and a planner-facing one that generates executable behavior. When a model checker identifies violations of temporal safety specifications, VASO converts the counterexample trace into a textual update signal. This gradient optimization refines the skill contract while keeping foundation model weights frozen—a computationally efficient approach that avoids expensive fine-tuning.

The practical implications are substantial. Reaching 97.2% specification compliance with fewer than 100 optimization samples suggests VASO scales better than execution-feedback or prompt-optimization baselines. For robotics deployment, this means engineers gain formal guarantees about temporal properties (e.g., "the robot must avoid obstacles before executing a turn") rather than statistical assurances based on test runs.

Looking forward, this work establishes a template for integrating formal methods into AI agent development. As robotics systems become more complex and mission-critical, the demand for formal verification will grow. The key question is whether this verification-guided approach generalizes beyond navigation and control tasks to more complex reasoning domains.

Key Takeaways

→VASO uses formal verification counterexamples as optimization feedback to evolve LLM-generated robot skill contracts
→The framework achieves 97.2% formal-specification compliance using fewer than 100 samples, outperforming execution and fine-tuning baselines
→Skills are represented as semantic contracts with formal and planner-facing interfaces, enabling both verification and execution
→Model checking filters logically inconsistent contracts before physical deployment, reducing safety risks
→Foundation model weights remain frozen during optimization, making the approach computationally efficient compared to fine-tuning