#flow-policies News & Analysis

2 articles tagged with #flow-policies. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

2 articles

AIBullisharXiv – CS AI · May 277/10

🧠

Trust Region Q Adjoint Matching

Researchers introduce Trust Region Q-Adjoint Matching (TRQAM), a reinforcement learning algorithm that stabilizes off-policy fine-tuning of pretrained flow policies by adaptively controlling deviation through trust-region constraints. The method demonstrates significant performance improvements, achieving 68% success rate on offline RL tasks compared to 46% for previous approaches.

AINeutralarXiv – CS AI · Mar 174/10

🧠

Safe Flow Q-Learning: Offline Safe Reinforcement Learning with Reachability-Based Flow Policies

Researchers introduce Safe Flow Q-Learning (SafeFQL), a new offline safe reinforcement learning method that combines Hamilton-Jacobi reachability with flow policies for safety-critical real-time control. The method achieves better safety performance with lower inference latency compared to existing diffusion-based approaches, making it more suitable for real-time deployment.