AI · Bullish · OpenAI News · Jul 4 · 6/10 · 5

Learning Montezuma’s Revenge from a single demonstration

OpenAI researchers trained an agent to reach a score of 74,500 on Montezuma's Revenge using reinforcement learning from a single human demonstration. The agent plays episodes that start from carefully selected states along the demonstration and learns by optimizing the game score with PPO (Proximal Policy Optimization), the same reinforcement learning algorithm used to train OpenAI Five.
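
The idea of starting episodes from selected demonstration states can be sketched in a few lines of Python. This is a minimal illustration, not OpenAI's implementation: it assumes the start point begins near the end of the demonstration and moves one step earlier whenever the agent's success rate reaches a threshold. The names `run_curriculum`, `rollout`, `threshold`, and `rollouts_per_stage` are hypothetical, and the PPO update is left as a comment rather than implemented.

```python
def run_curriculum(demo_states, rollout, rollouts_per_stage=20,
                   threshold=0.2, step=1, max_stages=1000):
    """Return the sequence of demonstration indices used as episode starts.

    demo_states: states recorded along a single demonstration (hypothetical format).
    rollout:     callable taking a start state and returning True when the
                 episode counts as a success (assumed success criterion).
    """
    start = len(demo_states) - 1  # begin near the end of the demonstration
    schedule = [start]
    for _ in range(max_stages):
        successes = sum(
            bool(rollout(demo_states[start])) for _ in range(rollouts_per_stage)
        )
        # A real agent would run a policy-gradient (e.g. PPO) update on
        # these rollouts here; this sketch only tracks the start point.
        if successes / rollouts_per_stage >= threshold:
            if start == 0:
                break  # the agent now succeeds from the very first state
            start = max(0, start - step)  # move the start point earlier
            schedule.append(start)
    return schedule


# Toy usage: an "agent" that always succeeds walks the start point back to 0.
always_succeeds = lambda state: True
print(run_curriculum(list(range(5)), always_succeeds))
```

With a rollout that never succeeds, the start point stays at the last demonstration state, which is the behavior one would want before the policy has learned anything.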