Near-optimal Regret Bounds for Reinforcement Learning · Paper

2010 · Citations: 711
Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Machine Learning and Algorithms

Near-optimal Regret Bounds for Reinforcement Learning · Authors