#continuous-control News & Analysis

11 articles tagged with #continuous-control. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 16 · 7/10

Guided Policy Optimization under Partial Observability

Researchers introduce Guided Policy Optimization (GPO), a new reinforcement learning framework that addresses challenges in partially observable environments by co-training a guider with privileged information and a learner through imitation learning. The method demonstrates theoretical optimality comparable to direct RL and shows strong empirical performance across various tasks including continuous control and memory-based challenges.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

When (and How) to Trust the Expert: Diagnosing Query-Time Expert-Guided Reinforcement Learning

Researchers conduct a comprehensive benchmarking study of expert-guided reinforcement learning methods, revealing three critical failure modes that single-paper evaluations miss. They propose a decision rule based on pre-training observables to guide method selection, introducing EDGE as a new design point that exposes exploitable architectural dimensions.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Revisiting Mixture Policies in Entropy-Regularized Actor-Critic

Researchers propose a marginalized reparameterization (MRP) estimator to enable practical use of mixture policies in reinforcement learning, addressing a long-standing gap between theoretical potential and practical implementation. By reducing variance compared to likelihood-ratio methods, MRP mixture policies achieve performance parity with standard Gaussian policies while offering greater flexibility in continuous action spaces.

🏢 Google

AI · Neutral · arXiv – CS AI · May 9 · 6/10

Entropy-Regularized Adjoint Matching for Offline RL

Researchers introduce Maximum Entropy Adjoint Matching (ME-AM), a new framework for offline reinforcement learning that combines flow-matching generative policies with entropy regularization to overcome limitations in existing Q-learning approaches. The method addresses popularity bias and support binding issues that prevent agents from discovering high-reward actions in low-density regions, demonstrating competitive performance across continuous control benchmarks.

AI · Neutral · arXiv – CS AI · May 9 · 6/10

Operator-Guided Invariance Learning for Continuous Reinforcement Learning

Researchers propose VPSD-RL, a reinforcement learning framework that discovers value-preserving structures in continuous control tasks using Lie-group operators and diffusion models. The method improves data efficiency and robustness by identifying nonlinear transformations that preserve optimal value functions, addressing brittleness in RL systems under environmental variability.

AI · Bullish · arXiv – CS AI · Mar 16 · 6/10

FastDSAC: Unlocking the Potential of Maximum Entropy RL in High-Dimensional Humanoid Control

Researchers introduce FastDSAC, a new framework that successfully applies Maximum Entropy Reinforcement Learning to high-dimensional humanoid control tasks. The system uses Dimension-wise Entropy Modulation and continuous distributional critics to achieve 180% and 400% performance gains on challenging control tasks compared to deterministic methods.

AI · Bullish · arXiv – CS AI · Mar 12 · 6/10

Adaptive RAN Slicing Control via Reward-Free Self-Finetuning Agents

Researchers propose a novel self-finetuning framework for AI agents that enables continuous learning without handcrafted rewards, demonstrating superior performance in dynamic Radio Access Network slicing tasks. The approach uses bi-perspective reflection to generate autonomous feedback and distill long-term experiences into model parameters, outperforming traditional reinforcement learning methods.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 8

State-Action Inpainting Diffuser for Continuous Control with Delay

Researchers introduce State-Action Inpainting Diffuser (SAID), a new AI framework that addresses signal delay challenges in continuous control and reinforcement learning. SAID combines model-based and model-free approaches using a generative formulation that can be applied to both online and offline RL, demonstrating state-of-the-art performance on delayed control benchmarks.

AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 4

Distributions as Actions: A Unified Framework for Diverse Action Spaces

Researchers introduce a new reinforcement learning framework called Distributions-as-Actions (DA) that treats parameterized action distributions as actions, making all action spaces continuous regardless of original type. The approach includes a new policy gradient estimator (DA-PG) with lower variance and a practical actor-critic algorithm (DA-AC) that shows competitive performance across discrete, continuous, and hybrid control tasks.
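The core idea in the summary, that the agent emits distribution parameters and the sampling step happens afterwards, can be illustrated with a small sketch. This is not the paper's DA-PG or DA-AC algorithm; the function names (`softmax`, `realize_discrete`, `realize_continuous`) and the choice of categorical logits and a Gaussian mean and log standard deviation are assumptions made for illustration only.

```python
import math
import random


def softmax(logits):
    """Turn unconstrained real-valued logits into a probability vector."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def realize_discrete(logits, rng):
    """Hypothetical helper: the agent's 'action' is the continuous logit
    vector; the discrete action is only drawn here, after the agent acts."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def realize_continuous(mean, log_std, rng):
    """Hypothetical helper: the agent's 'action' is (mean, log_std);
    the executed continuous action is a Gaussian draw from those parameters."""
    return rng.gauss(mean, math.exp(log_std))


rng = random.Random(0)
# In both cases the agent's output is a real-valued vector, so the
# action space it optimizes over is continuous.
discrete_action = realize_discrete([2.0, 0.5, -1.0], rng)
continuous_action = realize_continuous(0.3, -1.0, rng)
```

In this sketch a discrete task and a continuous task both expose real-valued parameters to the agent, which is the sense in which the summary describes all action spaces becoming continuous.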

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 14

Actor-Critic for Continuous Action Chunks: A Reinforcement Learning Framework for Long-Horizon Robotic Manipulation with Sparse Reward

Researchers introduce AC3 (Actor-Critic for Continuous Chunks), a new reinforcement learning framework that addresses challenges in long-horizon robotic manipulation tasks with sparse rewards. The system uses continuous action chunks with stabilization mechanisms and achieves superior performance on 25 benchmark tasks using minimal demonstrations.
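For readers unfamiliar with action chunking, the sketch below shows the general pattern of a policy that emits several actions per decision, executed in sequence before the policy is queried again. It is a generic illustration under a sparse reward, not AC3 itself: the stabilization mechanisms mentioned in the summary are not modeled, and `rollout_with_chunks`, `toy_policy`, and `toy_step` are hypothetical names for a toy 1-D task.

```python
def rollout_with_chunks(policy, env_step, obs, horizon):
    """Run one episode where each policy call returns a chunk (list) of
    actions that are executed open-loop before the policy is queried again."""
    total_reward = 0.0
    t = 0
    while t < horizon:
        chunk = policy(obs)  # one decision yields several actions
        for action in chunk:
            if t >= horizon:
                break
            obs, reward, done = env_step(obs, action)
            total_reward += reward
            t += 1
            if done:
                return total_reward, t
    return total_reward, t


def toy_policy(obs):
    """Hypothetical policy: always emit a chunk of two small forward moves."""
    return [0.25, 0.25]


def toy_step(obs, action):
    """Hypothetical 1-D task with a sparse reward: 1.0 only on reaching 1.0."""
    new_obs = obs + action
    done = new_obs >= 1.0
    return new_obs, (1.0 if done else 0.0), done


episode_return, steps = rollout_with_chunks(toy_policy, toy_step, 0.0, horizon=10)
```

Here the policy is queried twice to cover four environment steps, and the only nonzero reward arrives on the final step.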

AI · Neutral · arXiv – CS AI · Apr 14 · 5/10

Enhanced-FQL(λ), an Efficient and Interpretable RL with novel Fuzzy Eligibility Traces and Segmented Experience Replay

Researchers propose Enhanced-FQL(λ), a fuzzy reinforcement learning framework that combines fuzzified eligibility traces and segmented experience replay to improve interpretability and efficiency in continuous control tasks. The method demonstrates competitive performance with neural network approaches while maintaining computational simplicity through interpretable fuzzy rule bases rather than complex black-box architectures.

$FET