Towards Multi-Agent Reinforcement Learning driven Over-The-Counter Market Simulations

Paper Title

Towards Multi-Agent Reinforcement Learning driven Over-The-Counter Market Simulations

Authors

Nelson Vadori, Leo Ardon, Sumitra Ganesh, Thomas Spooner, Selim Amrouni, Jared Vann, Mengda Xu, Zeyu Zheng, Tucker Balch, Manuela Veloso

Abstract

We study a game between liquidity provider and liquidity taker agents interacting in an over-the-counter market, for which the typical example is foreign exchange. We show how a suitable design of parameterized families of reward functions coupled with shared policy learning constitutes an efficient solution to this problem. By playing against each other, our deep-reinforcement-learning-driven agents learn emergent behaviors relative to a wide spectrum of objectives encompassing profit-and-loss, optimal execution and market share. In particular, we find that liquidity providers naturally learn to balance hedging and skewing, where skewing refers to setting their buy and sell prices asymmetrically as a function of their inventory. We further introduce a novel RL-based calibration algorithm which we found performed well at imposing constraints on the game equilibrium. On the theoretical side, we are able to show convergence rates for our multi-agent policy gradient algorithm under a transitivity assumption, closely related to generalized ordinal potential games.
