UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem (paper)
2010 · Periodica Mathematica Hungarica · Cited by 291
Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Optimization and Search Problems