Paper Title
Reliably Re-Acting to Partner's Actions with the Social Intrinsic Motivation of Transfer Empowerment
Paper Authors
Paper Abstract
We consider multi-agent reinforcement learning (MARL) for cooperative communication and coordination tasks. MARL agents can be brittle because they overfit to their training partners' policies. Such overfitting can produce policies that assume other agents will act in a certain way instead of reacting to what those agents actually do. Our objective is to bias the learning process toward strategies that react to other agents' behavior. Our method, transfer empowerment, measures the potential influence between agents' actions. Results from three simulated cooperation scenarios support our hypothesis that transfer empowerment improves MARL performance. We discuss how transfer empowerment could serve as a useful principle for guiding multi-agent coordination by ensuring reactiveness to one's partner.