PACT: Self-Evolving Physical Safety Alignment for Diffusion Policies in Embodied Manipulation
Researchers introduce PACT, a post-training framework that aligns diffusion policies for robotic manipulation with physical safety constraints without sacrificing task performance. The method reduces safety violations by 31% while improving task success by 30.7% across simulated and real-world benchmarks.
PACT addresses a critical bottleneck in deploying diffusion models for robotics: the tension between maintaining safety constraints and preserving model expressivity. Traditional approaches either enforce safety during training, which limits policy flexibility, or apply external guardrails at deployment, which reduces scalability. This new framework operates post-training, meaning it can refine already-learned policies without retraining from scratch.
The technical innovation centers on distilling constraint gradients into diffusion models using reverse-KL divergence with timestep-level supervision. Critically, PACT incorporates a curriculum that gradually tightens safety constraints while providing theoretical guarantees on bounded policy shift and monotonic improvement. This guards against catastrophic forgetting, in which updates made for safety erode previously learned task performance, a common problem in constraint-based policy refinement.
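To make the ideas in this paragraph more concrete, here is a minimal toy sketch in Python. It is not PACT's algorithm: it replaces the diffusion policy with a single 1-D Gaussian action, replaces the constraint with a simple upper limit on that action, and assumes a linear tightening schedule. All function names and parameters (`reverse_kl_gaussian`, `curriculum_limits`, `refine_mean`, `max_shift`) are hypothetical and chosen only for illustration. It shows three things under those assumptions: a reverse-KL objective, a curriculum that tightens the limit stage by stage, and a cap on how far the policy may move in each stage.

```python
import math


def reverse_kl_gaussian(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) for 1-D Gaussians; q is the policy being refined, p the target."""
    return (math.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sigma_p ** 2)
            - 0.5)


def curriculum_limits(loose, tight, stages):
    """Assumed schedule: tighten the action limit linearly from `loose` to `tight`."""
    if stages < 2:
        raise ValueError("need at least two stages")
    step = (loose - tight) / (stages - 1)
    return [loose - i * step for i in range(stages)]


def refine_mean(mu_pretrained, limits, sigma=0.1, lr=0.005, steps=200, max_shift=0.25):
    """Toy refinement of a Gaussian policy mean under a tightening limit.

    For each stage, the target is the pretrained mean clipped to the current
    limit. The mean follows the gradient of KL(q || p) with respect to mu_q
    (equal variances), and its movement within a stage is capped at
    `max_shift` (a stand-in for a bounded policy shift).
    """
    mu = mu_pretrained
    history = []
    for limit in limits:
        stage_start = mu
        target = min(mu_pretrained, limit)
        for _ in range(steps):
            grad = (mu - target) / sigma ** 2  # d/dmu_q of KL(q || p)
            mu -= lr * grad
            # Keep the mean within max_shift of where this stage started.
            mu = max(stage_start - max_shift, min(stage_start + max_shift, mu))
        history.append(mu)
    return history


if __name__ == "__main__":
    limits = curriculum_limits(loose=1.0, tight=0.4, stages=4)
    print("gradual:", refine_mean(0.9, limits))
    print("one shot:", refine_mean(0.9, [0.4]))
```

In this toy setting, the gradual schedule reaches the tightest limit because each stage asks for a shift smaller than `max_shift`, whereas jumping straight to the tightest limit is stopped by the shift cap. This only illustrates why a curriculum and a bounded shift can work together; the actual PACT objective, schedule, and guarantees are defined in the paper.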
For the robotics and embodied AI industry, this work has substantial implications. Safety remains a primary barrier to deploying autonomous systems in real-world environments. By achieving simultaneous safety and performance gains on both simulation and physical robots, PACT demonstrates practical feasibility rather than only theoretical promise. The framework's data-agnostic design, which requires no demonstration data or task rewards, increases its applicability across diverse robotic platforms and tasks.
The 31% reduction in safety violations paired with 30.7% task improvement suggests PACT genuinely mitigates the safety-performance trade-off rather than simply shifting it. Future research will likely explore extending this approach to multi-agent scenarios and more complex constraint hierarchies, potentially accelerating autonomous system adoption in manufacturing, healthcare, and other safety-critical domains.
- PACT reduces safety violations by 31% while improving task success by 30.7% on robotic manipulation benchmarks.
- The framework operates post-training on pretrained diffusion policies without requiring demonstration data or task rewards.
- A progressive curriculum tightens constraints while maintaining theoretical bounds on policy shift and monotonic improvement.
- PACT addresses the critical safety-performance trade-off that currently limits real-world deployment of learned policies.
- The method demonstrates effectiveness on both simulated and physical robot systems, indicating practical applicability.