Reinforcement learning with prediction-based rewards

OpenAI Blog, 2018-10-31

Abstract

We’ve developed Random Network Distillation (RND), a prediction-based method for encouraging reinforcement learning agents to explore their environments through curiosity, which for the first time exceeds average human performance on Montezuma’s Revenge.
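
The abstract does not spell out the mechanism, so the following is only a minimal sketch of the general idea behind a prediction-based exploration bonus: a fixed, randomly initialised target network embeds each observation, a predictor is trained to match that embedding, and the prediction error serves as the intrinsic reward. The linear "networks", the dimensions, the learning rate, and all function names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, FEAT_DIM = 8, 4  # hypothetical sizes chosen for illustration

# Fixed, randomly initialised target "network" (assumed linear; never trained).
W_target = rng.normal(size=(OBS_DIM, FEAT_DIM))
# Predictor "network", trained to imitate the target's outputs.
W_pred = np.zeros((OBS_DIM, FEAT_DIM))


def intrinsic_reward(obs):
    """Squared error between predictor and target features for one observation."""
    return float(np.sum((obs @ W_pred - obs @ W_target) ** 2))


def train_predictor(obs, lr=0.01):
    """One gradient-descent step on the squared error for a single observation."""
    global W_pred
    err = obs @ W_pred - obs @ W_target      # shape (FEAT_DIM,)
    W_pred -= lr * 2.0 * np.outer(obs, err)  # gradient of sum(err ** 2) w.r.t. W_pred


seen = rng.normal(size=OBS_DIM)   # an observation the agent visits repeatedly
novel = rng.normal(size=OBS_DIM)  # an observation the agent has never visited

reward_before = intrinsic_reward(seen)
for _ in range(200):
    train_predictor(seen)
reward_after = intrinsic_reward(seen)
reward_novel = intrinsic_reward(novel)

print(f"seen, before training: {reward_before:.4f}")
print(f"seen, after training:  {reward_after:.4f}")
print(f"novel observation:     {reward_novel:.4f}")
```

In this toy setting the bonus for the repeatedly visited observation shrinks as the predictor learns it, while the unvisited observation keeps a larger bonus, which is the property that nudges an agent toward unfamiliar states.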
