MAPLE: Multi-State Aggregated Policy Evaluation for AlphaZero in Imperfect-Information Games

arXiv cs.AI, 2026-05-26. Authors: Qian-Rong Li, Hung Guei, I-Chen Wu, Ti-Rong Wu

Abstract

arXiv:2605.24139v1

Imperfect-information games (IIGs) are challenging, as players must make decisions without fully observing the true game state. While AlphaZero has achieved remarkable success in perfect-information games, extending it to IIGs remains difficult. Among existing search-based approaches, Perfect Information Monte Carlo (PIMC) suffers from strategy fusion, while Information Set Monte Carlo Tree Search (IS-MCTS) incurs high computational cost when combined with neural networks. In this paper, we propose Multi-State Aggregated PoLicy Evaluation (MAPLE), a tree search method that aggregates policy and value evaluations from multiple sampled world states within a single search tree, combining the advantages of PIMC and IS-MCTS while maintaining a controllable computational cost. We further incorporate a Siamese-based sampling strategy to select informative world states from the information set.
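To make the idea of aggregating evaluations across sampled world states concrete, the sketch below averages per-state policy and value estimates into a single estimate for one search node. It is only an illustration: the abstract does not specify the aggregation rule, so the uniform mean used here is an assumption, and `aggregate_evaluations` and `toy_evaluate` are hypothetical names, with `toy_evaluate` standing in for a neural network.

```python
from typing import Callable, Dict, Sequence, Tuple

Policy = Dict[str, float]
Evaluator = Callable[[str], Tuple[Policy, float]]


def aggregate_evaluations(
    world_states: Sequence[str], evaluate: Evaluator
) -> Tuple[Policy, float]:
    """Combine per-state (policy, value) pairs with a uniform mean.

    ASSUMPTION: uniform averaging is chosen for illustration only;
    the abstract does not state how MAPLE aggregates evaluations.
    """
    if not world_states:
        raise ValueError("need at least one sampled world state")

    policy_sum: Dict[str, float] = {}
    value_sum = 0.0
    for state in world_states:
        policy, value = evaluate(state)
        for action, prob in policy.items():
            policy_sum[action] = policy_sum.get(action, 0.0) + prob
        value_sum += value

    total = sum(policy_sum.values())
    if total <= 0.0:
        raise ValueError("policies must carry positive probability mass")

    # Renormalize so the aggregated policy sums to 1.
    agg_policy = {action: p / total for action, p in policy_sum.items()}
    return agg_policy, value_sum / len(world_states)


def toy_evaluate(state: str) -> Tuple[Policy, float]:
    """Hypothetical stand-in for a policy-value network."""
    table = {
        "w1": ({"bet": 0.8, "fold": 0.2}, 0.5),
        "w2": ({"bet": 0.4, "fold": 0.6}, -0.1),
    }
    return table[state]


if __name__ == "__main__":
    policy, value = aggregate_evaluations(["w1", "w2"], toy_evaluate)
    print(policy, value)
```

The number of sampled world states passed in bounds the number of evaluator calls per node, which is one plausible reading of the "controllable computational cost" mentioned in the abstract.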