Finite-horizon Equilibria for Neuro-symbolic Concurrent Stochastic Games

Paper title

Finite-horizon Equilibria for Neuro-symbolic Concurrent Stochastic Games

Authors

Rui Yan, Gabriel Santos, Xiaoming Duan, David Parker, Marta Kwiatkowska

Abstract

We present novel techniques for neuro-symbolic concurrent stochastic games, a recently proposed modelling formalism to represent a set of probabilistic agents operating in a continuous-space environment using a combination of neural-network-based perception mechanisms and traditional symbolic methods. To date, only zero-sum variants of the model have been studied, which is too restrictive when agents have distinct objectives. We formalise notions of equilibria for these models and present algorithms to synthesise them. Focusing on the finite-horizon setting, and (global) social welfare subgame-perfect optimality, we consider two distinct types: Nash equilibria and correlated equilibria. We first show that an exact solution based on backward induction may yield arbitrarily bad equilibria. We then propose an approximation algorithm called frozen subgame improvement, which proceeds through iterative solution of nonlinear programs. We develop a prototype implementation and demonstrate the benefits of our approach on two case studies: an automated car-parking system and an aircraft collision avoidance system.
