y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#reward-learning News & Analysis

3 articles tagged with #reward-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

3 articles
AIBullisharXiv – CS AI · Apr 147/10
🧠

TimeRewarder: Learning Dense Reward from Passive Videos via Frame-wise Temporal Distance

TimeRewarder is a new machine learning method that learns dense reward signals from passive videos to improve reinforcement learning in robotics. By modeling temporal distances between video frames, the approach achieves 90% success rates on Meta-World tasks using significantly fewer environment interactions than prior methods, while also leveraging human videos for scalable reward learning.

AINeutralarXiv – CS AI · May 116/10
🧠

Active teacher selection for reward learning

Researchers introduce the Hidden Utility Bandit (HUB) framework to address a critical limitation in reward learning systems: their reliance on feedback from a single idealized teacher. The framework models teacher heterogeneity in rationality, expertise, and cost, enabling Active Teacher Selection (ATS) algorithms that strategically choose which teachers to query, demonstrating superior performance in paper recommendation and vaccine testing applications.

AIBullisharXiv – CS AI · Apr 146/10
🧠

Learning Preference-Based Objectives from Clinical Narratives for Sequential Treatment Decision-Making

Researchers propose Clinical Narrative-informed Preference Rewards (CN-PR), a machine learning framework that extracts reward signals from patient discharge summaries to train reinforcement learning models for treatment decisions. The approach achieves strong alignment with clinical outcomes, including improved organ support-free days and faster shock resolution, offering a scalable alternative to traditional reward design in healthcare AI.