An AlphaZero-Inspired Approach to Solving Search Problems

Title

An AlphaZero-Inspired Approach to Solving Search Problems

Authors

Evgeny Dantsin, Vladik Kreinovich, Alexander Wolpert

Abstract

AlphaZero and its extension MuZero are computer programs that use machine-learning techniques to play at a superhuman level in chess, Go, and a few other games. They achieved this level of play solely through reinforcement learning from self-play, without any domain knowledge except the game rules. It is a natural idea to adapt the methods and techniques used in AlphaZero for solving search problems such as the Boolean satisfiability problem (in its search version). Given a search problem, how should it be represented for an AlphaZero-inspired solver? What are the "rules of solving" for this search problem? We describe possible representations in terms of easy-instance solvers and self-reductions, and we give examples of such representations for the satisfiability problem. We also describe a version of Monte Carlo tree search adapted for search problems.
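To make the terms "easy-instance solver" and "self-reduction" concrete, here is a minimal Python sketch for satisfiability. It is an illustration under stated assumptions, not the representation defined in the paper: the names `easy_solve`, `reduce_formula`, and `solve` are hypothetical, clauses use DIMACS-style integer literals, and the branching order is fixed, whereas an AlphaZero-inspired solver would presumably choose the next reduction with a learned policy and tree search.

```python
from typing import Dict, FrozenSet, List, Optional

# Assumed encoding: literal 3 means x3, literal -3 means NOT x3.
Clause = FrozenSet[int]
Formula = List[Clause]


def easy_solve(formula: Formula) -> Optional[bool]:
    """Hypothetical easy-instance solver: decides only trivial formulas."""
    if not formula:
        return True            # no clauses left: trivially satisfiable
    if any(len(clause) == 0 for clause in formula):
        return False           # an empty clause can never be satisfied
    return None                # not an easy instance


def reduce_formula(formula: Formula, literal: int) -> Formula:
    """Self-reduction step: set `literal` to true and simplify.

    Clauses containing the literal are satisfied and dropped; the
    opposite literal is removed from the remaining clauses.
    """
    return [clause - {-literal} for clause in formula if literal not in clause]


def solve(formula: Formula,
          assignment: Optional[Dict[int, bool]] = None) -> Optional[Dict[int, bool]]:
    """Depth-first search over self-reductions with a fixed branching order."""
    assignment = {} if assignment is None else assignment
    verdict = easy_solve(formula)
    if verdict is True:
        return assignment
    if verdict is False:
        return None
    # Branch on a variable from the first clause (nonempty at this point).
    var = abs(next(iter(formula[0])))
    for literal in (var, -var):
        result = solve(reduce_formula(formula, literal),
                       {**assignment, var: literal > 0})
        if result is not None:
            return result
    return None
```

In this sketch, each call to `reduce_formula` plays the role of a "move", and `easy_solve` recognizes terminal positions. Variables left out of the returned assignment can take either value.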
