AURA: Action-Gated Memory for Robot Policies at Constant VRAM
Researchers introduce AURA-Mem, a memory management system for robot policies that maintains a constant memory footprint (4,224 bytes) regardless of episode length by using a learned gate to write only when an observation would change the action. The approach reduces memory writes by 5-9x compared to KV-cache methods while matching performance on robotic tasks, addressing the bandwidth constraints of edge hardware used in embodied AI systems.
AURA-Mem represents a practical engineering solution to a fundamental constraint in deploying large language models for robotics. While datacenter inference systems can distribute attention caches across many simultaneous requests, robots run extended single episodes on resource-constrained hardware, where memory writes, not compute, become the bottleneck. This distinction drives the core innovation: a learned gate that predicts whether an observation is relevant to the action, rather than storing everything indiscriminately.
The technical contribution lies in training this gate directly against closed-loop action-error signals, departing from reconstruction-based memory approaches that optimize for observation fidelity rather than task performance. In controlled benchmarks, AURA-Mem achieved 5-6x fewer memory writes than constant-memory baselines and 9x fewer than periodic writing schedules, demonstrating that the action-surprise signal provides genuine information beyond random or fixed schedules.
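The gating idea described above can be illustrated with a minimal sketch. It is not AURA-Mem's implementation: the learned gate is replaced here by a hand-set threshold on how much the action would change, and the names `ActionGatedMemory`, `toy_policy`, and `run_demo` are hypothetical. The point is only to show a fixed-size memory that is overwritten when, and only when, a new observation would move the action.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


class ActionGatedMemory:
    """Fixed-size memory written only when an observation would change the action.

    Assumption: a simple threshold on action change stands in for the
    learned gate described in the article.
    """

    def __init__(self, dim: int, policy: Callable[[Sequence[float]], Vector],
                 threshold: float) -> None:
        self.state: Vector = [0.0] * dim  # size never grows with episode length
        self.policy = policy
        self.threshold = threshold
        self.writes = 0

    def surprise(self, obs: Sequence[float]) -> float:
        """Largest per-dimension gap between the action implied by memory
        and the action implied by the new observation."""
        a_mem = self.policy(self.state)
        a_obs = self.policy(obs)
        return max(abs(x - y) for x, y in zip(a_mem, a_obs))

    def step(self, obs: Sequence[float]) -> bool:
        """Write obs into memory only if it would change the action enough."""
        if self.surprise(obs) > self.threshold:
            self.state = list(obs)  # overwrite in place: constant footprint
            self.writes += 1
            return True
        return False


def toy_policy(x: Sequence[float]) -> Vector:
    """Hypothetical policy: the action depends only on the first feature."""
    return [2.0 * x[0]]


def run_demo() -> Tuple[List[bool], int]:
    mem = ActionGatedMemory(dim=2, policy=toy_policy, threshold=0.5)
    # The second feature swings widely but never affects the action,
    # so it should never trigger a write on its own.
    observations = [
        [0.0, 5.0], [0.1, -3.0], [1.0, 0.0],
        [1.1, 9.0], [1.0, -9.0], [0.0, 0.0],
    ]
    flags = [mem.step(o) for o in observations]
    return flags, mem.writes


if __name__ == "__main__":
    print(run_demo())
```

In this toy run only two of the six observations are written, because the others differ from memory only in ways the policy ignores. A reconstruction-based criterion would instead react to the large swings in the second feature, which is the contrast the paragraph above draws.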
Validation on OpenVLA-OFT 7B with LIBERO-Long tasks showed that the method matches the success rate of the ungated policy (0.233) while reducing writes sevenfold. The constant memory requirement enables deployment on edge devices with limited VRAM and limited flash write endurance, both critical constraints for long-horizon robotics applications in manufacturing, logistics, and autonomous systems.
Looking forward, this work positions memory gating as a viable paradigm for embodied AI inference. The demonstrated 7x write reduction should translate into longer operational lifespans on flash storage and lower power consumption in bandwidth-limited environments. Future developments may explore multi-modal gating signals, adaptive threshold learning, and integration with hierarchical reasoning systems.
- AURA-Mem maintains constant 4,224-byte memory regardless of episode length versus KV-cache growing to 6MB+ at 100,000 steps
- Action-gated writing reduces memory writes 5-9x by learning when observations would change robot actions
- Matched performance on LIBERO-Long tasks (0.233 success) while using 7x fewer memory writes than baseline KV-cache approaches
- Directly addresses bandwidth constraints of edge hardware where memory writes, not compute, become the performance bottleneck
- Learned gate trained on closed-loop action-error signals outperforms random and periodic write schedules by isolating action-surprise information
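The footprint comparison in the first takeaway can be checked with simple arithmetic. The sketch below assumes a KV cache that grows linearly at 64 bytes per step; that per-step figure is not reported in the article and is chosen only so that 100,000 steps lands just above the quoted 6 MB. The function names are hypothetical.

```python
AURA_MEM_BYTES = 4_224          # constant footprint reported in the article
ASSUMED_KV_BYTES_PER_STEP = 64  # assumption, backed out of "6MB+ at 100,000 steps"


def kv_cache_bytes(steps: int, bytes_per_step: int = ASSUMED_KV_BYTES_PER_STEP) -> int:
    """Linear growth: every step appends a fixed number of bytes."""
    return steps * bytes_per_step


def gated_memory_bytes(steps: int) -> int:
    """Constant footprint: independent of how many steps have elapsed."""
    return AURA_MEM_BYTES


if __name__ == "__main__":
    for steps in (1_000, 10_000, 100_000):
        kv = kv_cache_bytes(steps)
        fixed = gated_memory_bytes(steps)
        print(f"{steps:>7} steps: KV cache {kv:>9,} B, gated memory {fixed:,} B")
```

Under this assumed growth rate the cache reaches 6,400,000 bytes at 100,000 steps, while the gated memory stays at 4,224 bytes throughout.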