Variational information maximisation for intrinsically motivated reinforcement learning (paper)
2015, Neural Information Processing Systems, cited by 226
Reinforcement Learning in Robotics; Advanced Bandit Algorithms Research; Stochastic Gradient Optimization Techniques