Policy Optimization for Linear-Quadratic Zero-Sum Mean-Field Type Games

Paper Title

Policy Optimization for Linear-Quadratic Zero-Sum Mean-Field Type Games

Authors

René Carmona, Kenza Hamidouche, Mathieu Laurière, Zongjun Tan

Abstract

In this paper, zero-sum mean-field type games (ZSMFTG) with linear dynamics and quadratic utility are studied under an infinite-horizon discounted utility function. ZSMFTG are a class of games in which two decision makers whose utilities sum to zero compete to influence a large population of agents. In particular, the case in which the transition and utility functions depend on the state, the actions of the controllers, and the means of the state and the actions is investigated. The game is analyzed and explicit expressions for the Nash equilibrium strategies are derived. Moreover, two policy optimization methods that rely on policy gradient are proposed for both model-based and sample-based frameworks. In the first case, the gradients are computed exactly using the model, whereas in the second case they are estimated using Monte-Carlo simulations. Numerical experiments show the convergence of the two players' controls as well as the utility function when the two algorithms are used in different scenarios.
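
To make the contrast between the model-based and sample-based settings concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a scalar zero-sum linear-quadratic game with the mean-field terms omitted, linear feedback controls u1 = -k1 x and u2 = -k2 x, and illustrative constants chosen here. The function names, the simultaneous gradient descent-ascent update, and the two-point random-perturbation gradient estimator are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical scalar toy problem; every constant below is an illustrative assumption.
A, B1, B2 = 0.9, 0.5, 0.3      # dynamics: x' = A x + B1 u1 + B2 u2
Q, R1, R2 = 1.0, 1.0, 2.0      # stage cost: Q x^2 + R1 u1^2 - R2 u2^2
GAMMA = 0.9                    # discount factor


def exact_cost_and_grad(k1, k2):
    """Model-based: closed-form discounted cost and gradient, assuming E[x0^2] = 1."""
    c = A - B1 * k1 - B2 * k2              # closed-loop multiplier
    n = Q + R1 * k1**2 - R2 * k2**2        # per-step cost coefficient
    d = 1.0 - GAMMA * c**2                 # geometric-series denominator (needs d > 0)
    g1 = (2 * R1 * k1 * d - 2 * GAMMA * c * B1 * n) / d**2
    g2 = (-2 * R2 * k2 * d - 2 * GAMMA * c * B2 * n) / d**2
    return n / d, np.array([g1, g2])


def rollout_cost(k1, k2, x0, horizon=100):
    """Truncated discounted cost along simulated trajectories (vectorized over samples)."""
    x = np.array(x0, dtype=float)
    total = np.zeros_like(x)
    for t in range(horizon):
        u1, u2 = -k1 * x, -k2 * x
        total += GAMMA**t * (Q * x**2 + R1 * u1**2 - R2 * u2**2)
        x = A * x + B1 * u1 + B2 * u2
    return total


def sampled_grad(k1, k2, rng, n_samples=2000, radius=0.05):
    """Sample-based: two-point zeroth-order estimate from randomly perturbed gains."""
    dim = 2
    angles = rng.uniform(0.0, 2.0 * np.pi, n_samples)
    v = np.stack([np.cos(angles), np.sin(angles)])   # unit directions, shape (2, n)
    x0 = rng.standard_normal(n_samples)              # random initial states, E[x0^2] = 1
    plus = rollout_cost(k1 + radius * v[0], k2 + radius * v[1], x0)
    minus = rollout_cost(k1 - radius * v[0], k2 - radius * v[1], x0)
    return (dim / (2 * radius)) * np.mean((plus - minus) * v, axis=1)


def descent_ascent(grad_fn, steps=500, lr=0.02):
    """Player 1 descends and player 2 ascends on the same cost, starting from zero gains."""
    k = np.zeros(2)
    for _ in range(steps):
        g = grad_fn(k[0], k[1])
        k += lr * np.array([-g[0], g[1]])
    return k


rng = np.random.default_rng(0)
k_exact = descent_ascent(lambda k1, k2: exact_cost_and_grad(k1, k2)[1])
k_sampled = descent_ascent(
    lambda k1, k2: sampled_grad(k1, k2, rng, n_samples=500), steps=200
)
print("model-based gains: ", k_exact)
print("sample-based gains:", k_sampled)
```

In this toy setting the two runs should end at nearby gains, since the sampled estimator targets the same gradient up to smoothing bias and Monte Carlo noise. The minimizing player's gain stabilizes the state while the maximizing player's gain works against it, and the weight R2 is chosen large enough that the maximizer's problem stays well posed.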
