#deep-reinforcement-learning News & Analysis

34 articles tagged with #deep-reinforcement-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 23 · 7/10

An LLM-Explainable DRL Framework for Passenger-Directed Autonomous Driving

Researchers developed a framework combining deep reinforcement learning (DRL) with large language models (LLMs) to make autonomous vehicles safer and more trustworthy by explaining driving decisions to passengers. The system was trained to handle three driving modes—fast, comfort, and stop—while generating safety-focused explanations for its actions, demonstrating effective mode switching and rule compliance.

AI · Neutral · arXiv – CS AI · Jun 23 · 7/10

A Differentiable Atari VCS: A Complex, Fully Known Ground Truth for Explainable AI

Researchers have created fully differentiable emulators of the Atari 2600 computer system in Julia and JAX, solving a fundamental problem in explainable AI by providing a complex system with complete ground truth. The emulators are bit-for-bit identical to the original hardware while remaining mathematically differentiable, enabling gradient-based analysis to understand how AI systems make decisions.

AI · Bullish · arXiv – CS AI · Jun 9 · 7/10

Addressing Market Regime Changes and Heavy-Tailed Returns in Portfolio Optimization via Bayesian VAR and Elliptical Black-Litterman

Researchers introduce BAVAR-BLED, a novel deep reinforcement learning algorithm that addresses critical limitations in portfolio optimization by incorporating fat-tailed return distributions and market regime awareness. The method combines Bayesian Vector Autoregression, Black-Litterman modeling with elliptical distributions, and transformer networks to achieve superior risk-adjusted returns compared to existing approaches.

AI · Bullish · arXiv – CS AI · Jun 8 · 7/10

Robust Driving Control for Autonomous Vehicles: An Intelligent General-sum Constrained Adversarial Reinforcement Learning Approach

Researchers introduce IGCARL, a novel deep reinforcement learning framework that trains autonomous driving agents against sophisticated, multi-step adversarial attacks rather than simple myopic threats. The approach improves robustness by 27.9% over existing methods, addressing critical safety vulnerabilities that could impact real-world autonomous vehicle deployment.

AI · Neutral · arXiv – CS AI · May 9 · 7/10

BehaviorGuard: Online Backdoor Defense for Deep Reinforcement Learning

Researchers propose BehaviorGuard, an online defense framework against backdoor attacks in deep reinforcement learning that detects malicious behavior by analyzing action distribution shifts rather than relying on reward anomalies or model fine-tuning. The approach works in both single and multi-agent DRL environments and demonstrates superior efficacy and efficiency compared to existing defense methods.
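The idea of flagging a backdoor through a shift in the action distribution can be illustrated with a small sketch. This is not BehaviorGuard's actual detector; it assumes a discrete action space, a trusted reference trace of actions, and a KL-divergence threshold, all of which are hypothetical choices made here for illustration.

```python
import math
from collections import Counter


def action_distribution(actions, num_actions, smoothing=1e-3):
    """Empirical action frequencies with additive smoothing (avoids log(0))."""
    counts = Counter(actions)
    total = len(actions) + smoothing * num_actions
    return [(counts.get(a, 0) + smoothing) / total for a in range(num_actions)]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def shift_detected(reference_actions, recent_actions, num_actions, threshold=0.5):
    """Flag a shift when recent behavior diverges from the reference trace.

    The threshold is a hypothetical tuning parameter, not a published value.
    """
    recent = action_distribution(recent_actions, num_actions)
    reference = action_distribution(reference_actions, num_actions)
    return kl_divergence(recent, reference) > threshold


if __name__ == "__main__":
    clean = [0, 1, 2, 3] * 25   # balanced reference behavior
    suspicious = [3] * 20       # agent suddenly locks onto one action
    print(shift_detected(clean, clean, num_actions=4))
    print(shift_detected(clean, suspicious, num_actions=4))
```

A check of this kind needs only the observed actions, which is consistent with the summary's point that detection does not rely on reward anomalies or model fine-tuning.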

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10

Learning Memory-Enhanced Improvement Heuristics for Flexible Job Shop Scheduling

Researchers propose MIStar, a memory-enhanced improvement search framework using heterogeneous graph neural networks for flexible job-shop scheduling problems in smart manufacturing. The approach significantly outperforms traditional heuristics and state-of-the-art deep reinforcement learning methods in optimizing production schedules.

AI · Bullish · arXiv – CS AI · Jun 23 · 6/10

Platooning Connected, Autonomous, and Human-Driven Vehicles: A Deep Reinforcement Learning-based Approach

Researchers propose a hybrid vehicle platooning system using deep reinforcement learning that allows non-connected vehicles to safely join autonomous platoons while managing traffic flow stability. The approach addresses real-world mixed traffic conditions by dynamically controlling platoon structures to suppress disturbance propagation, reduce fuel consumption, and improve safety—demonstrating significant improvements in balancing traffic capacity with stability.

AI · Bullish · arXiv – CS AI · Jun 19 · 6/10

Oranits: Mission Assignment and Task Offloading in Open RAN-based ITS using Metaheuristic and Deep Reinforcement Learning

Researchers introduce Oranits, a system for optimizing mission assignment and task offloading in Open RAN-based autonomous vehicle networks using metaheuristic algorithms and deep reinforcement learning. The proposed MA-DDQN framework achieves 11% improvement in mission completions and 12.5% improvement in overall benefit compared to baseline methods, advancing edge computing efficiency in intelligent transportation systems.

AI · Bullish · arXiv – CS AI · Jun 10 · 6/10

Robust Deep Reinforcement Learning Through Adversarial Attacks and Training: A Survey

A comprehensive survey examines adversarial attacks and training methodologies for improving Deep Reinforcement Learning robustness. The research addresses DRL's vulnerability to environmental perturbations and condition variations, proposing adversarial training as a key mechanism to enhance agent reliability in real-world deployments.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

Adversarial Instance Generation and Robust Training for Neural Combinatorial Optimization with Multiple Objectives

Researchers propose a framework for improving the robustness of deep reinforcement learning solvers for multi-objective combinatorial optimization problems by generating adversarial instances that expose weaknesses and training defenses using hardness-aware preference selection. The method demonstrates significant improvements in solver generalizability across traveling salesman, vehicle routing, and knapsack problems.

AI · Neutral · arXiv – CS AI · Jun 4 · 6/10

Learning Empirically Admissible Neural Heuristics for Combinatorial Search

Researchers introduce a framework for training neural networks to solve combinatorial puzzles optimally by enforcing admissibility constraints—ensuring heuristics never overestimate remaining costs. The method combines an underestimating Bellman operator with asymmetric loss functions and post-hoc calibration, achieving significant reductions in search node expansions while maintaining solution optimality.
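As a rough illustration of the asymmetric-loss idea, the sketch below penalizes overestimates of the remaining cost more heavily than underestimates, which nudges a learned heuristic toward admissibility. The function names and the `over_weight` value are hypothetical and are not taken from the paper; the Bellman operator and post-hoc calibration steps are not shown.

```python
def asymmetric_squared_loss(predicted, target, over_weight=10.0):
    """Squared error that weights overestimates more than underestimates.

    `over_weight` is an illustrative hyperparameter, not a published value.
    """
    error = predicted - target
    weight = over_weight if error > 0 else 1.0
    return weight * error * error


def is_admissible(heuristic_values, true_costs):
    """A heuristic is admissible on a sample if it never exceeds the true cost."""
    return all(h <= c for h, c in zip(heuristic_values, true_costs))


if __name__ == "__main__":
    # Overestimating by 1 costs ten times more than underestimating by 1.
    print(asymmetric_squared_loss(5.0, 4.0))
    print(asymmetric_squared_loss(3.0, 4.0))
    print(is_admissible([1.0, 2.0], [1.0, 3.0]))
```

Checking admissibility on held-out states, as `is_admissible` does, only gives empirical evidence rather than a guarantee, which matches the "empirically admissible" wording in the title.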

AI · Neutral · arXiv – CS AI · Jun 2 · 5/10

Deft Scheduling of Dynamic Cloud Workflows with Varying Deadlines via Mixture-of-Experts

Researchers introduce DEFT, a new deep reinforcement learning architecture using a mixture-of-experts approach to optimize cloud workflow scheduling with varying deadline constraints. The system uses a graph-adaptive gating mechanism to route scheduling decisions through specialized experts, demonstrating improved performance in reducing execution costs and deadline violations compared to existing DRL baselines.
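For readers unfamiliar with mixture-of-experts routing, a minimal generic sketch follows. It shows plain softmax gating over scalar expert outputs; it is not DEFT's graph-adaptive gate, and the gate scores and expert outputs are hypothetical placeholders supplied by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of gate scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_output(gate_scores, expert_outputs):
    """Combine expert outputs using softmax gate weights."""
    weights = softmax(gate_scores)
    return sum(w * y for w, y in zip(weights, expert_outputs))


if __name__ == "__main__":
    # Equal gate scores give each expert the same weight.
    print(mixture_output([0.0, 0.0], [1.0, 3.0]))
    # A much larger score routes almost everything to the second expert.
    print(mixture_output([0.0, 20.0], [1.0, 3.0]))
```

In a learned system the gate scores would come from a network conditioned on the input (here, the workflow graph), so different inputs are routed to different experts.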

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

DRL-Based Pose Control for Double-Ackermann Robots Under Actuation Uncertainties

Researchers extended the ManeuverNet deep reinforcement learning framework to achieve full pose control for double-Ackermann mobile robots while addressing the sim-to-real gap caused by actuation uncertainties. By incorporating Gazebo simulation dynamics into PyBullet training through multi-environment DRL, the team achieved 92% success rates in simulation and 69% under strict conditions, with successful real-world deployment without additional tuning.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

MViewRouter: Internalizing Geometric Equivariance via Multi-view Alternating Attention for Combinatorial Routing

Researchers propose MViewRouter, a deep reinforcement learning framework that solves combinatorial routing problems like TSP and CVRP by embedding geometric symmetries directly into the model architecture rather than relying on data augmentation. The approach uses multi-view alternating attention and collective policy gradient aggregation to achieve more consistent decision-making and improved generalization across problem variants.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Digital Twin-Assisted Adaptive Multi-Agent DRL for Intelligent Spectrum and Resource Management in Open-RAN UAV-Enabled 6G Networks

Researchers propose a digital twin-assisted deep reinforcement learning framework for optimizing spectrum and resource allocation in 6G networks powered by UAVs. The hybrid approach combines particle swarm optimization for UAV trajectory planning with multi-agent DRL for dynamic spectrum-power management, demonstrating improvements in spectral efficiency and energy utilization in simulated environments.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

PIRS: Physics-Informed Reward Shaping for SAC-Based Building Energy Management

Researchers introduce PIRS (Physics-Informed Reward Shaping), a method that improves deep reinforcement learning controllers for building energy management by replacing ad-hoc comfort metrics with ISO 7730 Predicted Mean Vote (PMV) standards. Tested on CityLearn v2.1.2, PIRS demonstrates competitive performance against manual baselines while substantially outperforming non-physics-grounded approaches in load ramping and peak demand metrics.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

Global Policy-Space Response Oracles for Two-Player Zero-Sum Games

Researchers introduce Global PSRO, an improved algorithm for computing Nash equilibria in two-player zero-sum games by using Population Exploitability metrics to guide strategy expansion more efficiently than existing methods. The approach reduces computational requirements while achieving better approximations of equilibrium solutions, advancing game-theoretic AI applications.

AI · Bullish · arXiv – CS AI · May 28 · 6/10

Modeling Vehicle-Type-Specific Pedestrian Crash Avoidance Behavior in Safety-Critical Interactions Using Smooth-Mamba Deep Reinforcement Learning

Researchers developed SMamba-DDPG, a deep reinforcement learning framework that models how pedestrians behave differently when interacting with autonomous vehicles versus human-driven vehicles. The study found that pedestrians react faster to AVs and adopt lower crossing speeds, with AV interactions showing lower conflict rates than HDV scenarios.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

Visualizing Latent Phase Structures in Locomotion Policies: A Multi-Environment Study with Temporal Feature Extension

Researchers propose a novel framework for visualizing latent motion phase structures in deep reinforcement learning locomotion policies by extending clustering features beyond state observations to include actions and next states. The method successfully identifies clearer phase transition patterns across three MuJoCo environments, advancing interpretability of neural network-based control policies.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

When Does Deep RL Beat Calibrated Baselines? A Benchmark Study on Adaptive Resource Control

A comprehensive benchmark study reveals that properly calibrated rule-based autoscalers outperform six mainstream deep reinforcement learning algorithms on cost in adaptive resource control tasks. The research challenges assumptions about DRL superiority, identifying baseline calibration and reward engineering as greater bottlenecks than algorithm selection.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

Towards Generalization-Oriented Models for Vehicle Routing Problems with Mixture-of-Experts

Researchers propose R2E-IG, a deep reinforcement learning model using mixture-of-experts architecture to improve vehicle routing problem solutions across different data distributions. The approach combines residual-refined expert modules with instance-level gating and dynamic weight adaptation training, achieving competitive performance on both standard and out-of-distribution test cases.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

Intelligent Offloading in Vehicular Edge Computing: A Comprehensive Review of Deep Reinforcement Learning Approaches and Architectures

This academic survey examines deep reinforcement learning (DRL) approaches for optimizing computational offloading in vehicular edge computing systems. The research classifies existing DRL strategies across learning paradigms, system architectures, and optimization objectives while identifying challenges in scalability and coordination for next-generation intelligent transportation systems.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

An End-to-End Learning Approach for Solving Capacitated Location-Routing Problems

Researchers propose DRLHQ, a deep reinforcement learning approach with heterogeneous query attention mechanisms to solve capacitated location-routing problems (CLRPs) and their open variants. This marks the first end-to-end learning framework for CLRPs, demonstrating superior performance over traditional and DRL-based baselines on benchmark datasets.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Distilling Deep Reinforcement Learning into Interpretable Fuzzy Rules: An Explainable AI Framework

Researchers developed a Hierarchical Takagi-Sugeno-Kang Fuzzy Classifier System that converts opaque deep reinforcement learning agents into human-readable IF-THEN rules, achieving 81.48% fidelity in tests. The framework addresses the critical explainability problem in AI systems used for safety-critical applications by providing interpretable rules that humans can verify and understand.

AI · Bullish · arXiv – CS AI · Mar 4 · 5/10

Enhancing User Throughput in Multi-panel mmWave Radio Access Networks for Beam-based MU-MIMO Using a DRL Method

Researchers developed a deep reinforcement learning approach to optimize beam management in millimeter-wave radio access networks, achieving up to 16% throughput improvements and 3-7x latency reduction. The method uses adaptive beam selection based on real-time observations to enhance multi-user MIMO performance in practical network setups.
