AIBullishOpenAI News · Oct 317/108
🧠
Reinforcement learning with prediction-based rewards
OpenAI researchers have developed Random Network Distillation (RND), a reinforcement learning method that uses prediction-based rewards to encourage AI agents to explore environments through curiosity. This breakthrough represents the first time an AI system has exceeded average human performance on the notoriously difficult Atari game Montezuma's Revenge.