y0news
#montezuma-revenge
2 articles
AI · Bullish · OpenAI News · Oct 31 · 7/10 · 8
🧠

Reinforcement learning with prediction-based rewards

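OpenAI researchers have developed Random Network Distillation (RND), a reinforcement learning method that uses prediction-based rewards to encourage agents to explore their environments through curiosity: the agent earns a bonus wherever a trained predictor fails to match the output of a fixed, randomly initialized network on the current observation. According to OpenAI, this is the first method to exceed average human performance on the notoriously difficult Atari game Montezuma's Revenge without using demonstrations or access to the underlying game state.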
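
To make "prediction-based rewards" concrete, here is a minimal NumPy sketch of the idea, not OpenAI's implementation. It assumes a toy setting: observations are small vectors, the fixed target is a single random layer with a tanh, and the predictor is a linear map trained by plain gradient descent. The class name `RNDBonus` and all parameter values are hypothetical.

```python
import numpy as np


class RNDBonus:
    """Toy prediction-error bonus in the spirit of RND (illustrative only)."""

    def __init__(self, obs_dim, feat_dim=16, lr=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed, randomly initialized target network: never trained.
        self.w_target = rng.normal(size=(feat_dim, obs_dim))
        # Predictor: trained to imitate the target on observations the agent visits.
        self.w_pred = np.zeros((feat_dim, obs_dim))
        self.lr = lr

    def _error(self, obs):
        return self.w_pred @ obs - np.tanh(self.w_target @ obs)

    def reward(self, obs):
        # Intrinsic reward = mean squared prediction error on this observation.
        return float(np.mean(self._error(obs) ** 2))

    def update(self, obs):
        # One gradient-descent step on the mean squared error for this observation.
        err = self._error(obs)
        self.w_pred -= self.lr * 2.0 * np.outer(err, obs) / err.size


bonus = RNDBonus(obs_dim=8)
familiar = np.eye(8)[0]  # an observation the agent keeps revisiting
novel = np.eye(8)[1]     # an observation it has never seen

reward_before = bonus.reward(familiar)
for _ in range(100):
    bonus.update(familiar)
reward_familiar = bonus.reward(familiar)
reward_novel = bonus.reward(novel)

print(f"familiar before training: {reward_before:.4f}")
print(f"familiar after training:  {reward_familiar:.2e}")
print(f"novel after training:     {reward_novel:.4f}")
```

In this toy run the bonus for the repeatedly visited observation shrinks toward zero while the unseen observation keeps a larger bonus, which is the signal that pushes an agent toward unexplored states. A real agent would add this bonus to the game reward and use image observations with deeper networks.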

AI · Bullish · OpenAI News · Jul 4 · 6/10 · 5
🧠

Learning Montezuma’s Revenge from a single demonstration

OpenAI researchers reached a score of 74,500 on Montezuma's Revenge using reinforcement learning from a single human demonstration. The agent plays a sequence of games starting from carefully chosen states in the demonstration and learns by optimizing the game score with PPO (Proximal Policy Optimization), the same reinforcement learning algorithm behind OpenAI Five.
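
The start-state selection can be sketched roughly as a backward curriculum. This is a simplified illustration, not OpenAI's code: it assumes episodes begin near the end of the demonstration and that the start point moves earlier once the agent matches the demonstrator often enough. The helper `move_start_back`, the 20% threshold default, and the rollout counts are hypothetical placeholders, and the PPO update itself is omitted.

```python
def move_start_back(start_idx, successes, rollouts, threshold=0.2, step=1):
    """Return the demonstration index to reset episodes from next.

    Hypothetical helper: the start point moves earlier in the demonstration
    once the agent succeeds from the current point in enough rollouts.
    """
    if rollouts > 0 and successes / rollouts >= threshold:
        return max(0, start_idx - step)
    return start_idx


demo_length = 10         # saved states in a hypothetical demonstration
start = demo_length - 1  # begin near the end, where little of the game remains
history = [start]

# Placeholder batch results: (successful rollouts, total rollouts).
# In a real run these would come from PPO rollouts started at `start`.
for successes, rollouts in [(0, 10), (3, 10), (1, 10), (2, 10), (5, 10)]:
    start = move_start_back(start, successes, rollouts)
    history.append(start)

print(history)
```

Starting close to the end keeps the first learning problem short; each time the agent succeeds often enough, it has to solve a slightly longer stretch of the game, until it eventually plays from the first state of the demonstration.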