在重复的游戏中，上一轮收敛和不变的遗憾，没有非对称信息

论文标题

在重复的游戏中，上一轮收敛和不变的遗憾，没有非对称信息

Last Round Convergence and No-Instant Regret in Repeated Games with Asymmetric Information

论文作者

Dinh, Le Cong, Tran-Thanh, Long, Nguyen, Tri-Dung, Zemkoho, Alain B.

论文摘要

本文考虑了重复的游戏，其中一个玩家比其他玩家拥有更多有关游戏的信息。特别是，我们调查了重复的两人零和游戏游戏，其中只有专栏播放器知道游戏的回报矩阵A。假设在反复玩这款游戏时，该行玩家通过使用无regret算法来最大程度地减少她（伪）的遗憾，在每轮比赛中选择了她的策略。我们为列播放器开发了一种无恒定的regret算法，以展示上一轮收敛到最小值平衡。我们表明，对于ROW播放器的大量流行的No-Regret算法，我们的算法是有效的，包括多重重量更新算法，在线镜像下降方法/关注型调查领导者，线性多重重量更新算法以及优化的多重重量更新。

This paper considers repeated games in which one player has more information about the game than the other players. In particular, we investigate repeated two-player zero-sum games where only the column player knows the payoff matrix A of the game. Suppose that while repeatedly playing this game, the row player chooses her strategy at each round by using a no-regret algorithm to minimize her (pseudo) regret. We develop a no-instant-regret algorithm for the column player to exhibit last round convergence to a minimax equilibrium. We show that our algorithm is efficient against a large set of popular no-regret algorithms of the row player, including the multiplicative weight update algorithm, the online mirror descent method/follow-the-regularized-leader, the linear multiplicative weight update algorithm, and the optimistic multiplicative weight update.

下载PDF全文

下载文献需遵守相关版权规定

论文标题