🧠 AI | ⚪ Neutral | Importance 4/10

Adaptive Correlation-Weighted Intrinsic Rewards for Reinforcement Learning

arXiv – CS AI | Viet Bac Nguyen, Phuong Thai Nguyen | March 2, 2026 at 05:00 AM

🤖AI Summary

Researchers propose ACWI (Adaptive Correlation-Weighted Intrinsic rewards), a reinforcement learning framework that dynamically balances intrinsic and extrinsic rewards through adaptive scaling coefficients. A lightweight Beta Network sets the weight on intrinsic rewards to guide exploration in sparse-reward environments. In MiniGrid experiments, the method shows improved sample efficiency and stability.
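
As a rough illustration of the balance described above, the combined reward can be written with a state-dependent coefficient. The additive form and the symbols $\beta_\phi$, $r^{\text{ext}}_t$, and $r^{\text{int}}_t$ are assumptions made here for illustration; this summary does not give the paper's exact formulation.

```latex
r_t = r^{\text{ext}}_t + \beta_\phi(s_t)\, r^{\text{int}}_t
```

Under this reading, a conventional fixed-coefficient method is the special case where $\beta_\phi(s_t)$ is a constant chosen by hand.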

Key Takeaways

→ACWI introduces adaptive intrinsic reward scaling that learns state-dependent coefficients online instead of relying on a fixed, manually tuned coefficient.
→The framework employs a Beta Network with an encoder-based architecture to predict intrinsic reward weights from agent states.
→A correlation-based objective aligns weighted intrinsic rewards with discounted future extrinsic returns for better exploration.
→Experimental results show consistent improvements in sample efficiency and learning stability with minimal computational overhead.
→The approach targets a key limitation of conventional intrinsic-reward methods in sparse-reward environments: their reliance on a fixed, hand-tuned intrinsic reward coefficient.
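
The takeaways above can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the linear-plus-sigmoid `beta_coefficients` function is a hypothetical stand-in for the Beta Network, the objective is assumed to be a negative Pearson correlation, and all function names and toy values are invented for illustration.

```python
import numpy as np

def discounted_returns(rewards, gamma=0.99):
    """Discounted future return G_t = r_t + gamma * G_{t+1} for one episode."""
    returns = np.zeros(len(rewards), dtype=float)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def beta_coefficients(states, weights, bias):
    """Hypothetical stand-in for the Beta Network (assumption): a linear
    layer plus sigmoid, giving one coefficient in (0, 1) per state."""
    logits = states @ weights + bias
    return 1.0 / (1.0 + np.exp(-logits))

def correlation_loss(beta, intrinsic, ext_returns, eps=1e-8):
    """Assumed objective: negative Pearson correlation between the
    beta-weighted intrinsic rewards and discounted extrinsic returns."""
    weighted = beta * intrinsic
    w = weighted - weighted.mean()
    g = ext_returns - ext_returns.mean()
    corr = (w * g).sum() / (np.sqrt((w ** 2).sum() * (g ** 2).sum()) + eps)
    return -corr

# Toy sparse-reward episode: extrinsic reward arrives only at the last step.
rng = np.random.default_rng(0)
states = rng.normal(size=(5, 3))            # 5 steps, 3 state features
intrinsic = np.array([0.5, 0.4, 0.3, 0.2, 0.1])
extrinsic = np.array([0.0, 0.0, 0.0, 0.0, 1.0])

returns = discounted_returns(extrinsic, gamma=0.9)
beta = beta_coefficients(states, weights=np.zeros(3), bias=0.0)
loss = correlation_loss(beta, intrinsic, returns)

# Reward the policy would be trained on (additive form is an assumption).
combined = extrinsic + beta * intrinsic
```

In this toy episode the intrinsic signal shrinks as the agent approaches the goal while the extrinsic return grows, so the loss is high. Training the coefficients to lower it would push the weights toward states where the intrinsic reward tracks future extrinsic return.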