Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis (Paper)

2012, Lecture Notes in Computer Science, cited by 364
Advanced Bandit Algorithms Research; Reinforcement Learning in Robotics; Smart Grid Energy Management