Paper Title

Human-AI Shared Control via Policy Dissection

Authors

Quanyi Li, Zhenghao Peng, Haibin Wu, Lan Feng, Bolei Zhou

Abstract

Human-AI shared control allows humans to interact and collaborate with AI to accomplish control tasks in complex environments. Previous Reinforcement Learning (RL) methods attempt goal-conditioned designs to achieve human-controllable policies at the cost of redesigning the reward function and training paradigm. Inspired by the neuroscience approach to investigating the motor cortex in primates, we develop a simple yet effective frequency-based approach called Policy Dissection to align the intermediate representation of the learned neural controller with the kinematic attributes of the agent behavior. Without modifying the neural controller or retraining the model, the proposed approach can convert a given RL-trained policy into a human-interactive policy. We evaluate the proposed approach on the RL tasks of autonomous driving and locomotion. The experiments show that human-AI shared control achieved by Policy Dissection in the driving task can substantially improve performance and safety in unseen traffic scenes. With a human in the loop, the locomotion robots also exhibit versatile controllable motion skills even though they are only trained to move forward. Our results suggest a promising direction for implementing human-AI shared autonomy through interpreting the learned representation of autonomous agents. Demo video and code will be made available at https://metadriverse.github.io/policydissect.
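The abstract describes the method only at a high level: hidden units of a trained policy are associated with kinematic attributes by comparing their frequency content, and a human can then evoke a behavior by stimulating the associated unit at inference time. The sketch below is a minimal, illustrative reading of that idea in NumPy, not the authors' released implementation; the helper names (dominant_frequency, associate_units, stimulate) and the nearest-dominant-frequency matching rule are assumptions made here for illustration.

import numpy as np

def dominant_frequency(signal, dt=1.0):
    # Dominant (non-DC) frequency of a 1-D signal via the real FFT.
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(len(signal), d=dt)
    return freqs[np.argmax(spectrum[1:]) + 1]  # skip the DC bin

def associate_units(activations, attributes, dt=1.0):
    # activations: (T, num_units) hidden-unit traces recorded during a rollout.
    # attributes:  dict mapping an attribute name (e.g. "yaw_rate") to a (T,) signal.
    # Returns a dict mapping each attribute to the unit whose dominant activation
    # frequency is closest to the attribute's dominant frequency (illustrative rule).
    unit_freqs = np.array([dominant_frequency(activations[:, u], dt)
                           for u in range(activations.shape[1])])
    mapping = {}
    for name, signal in attributes.items():
        target = dominant_frequency(signal, dt)
        mapping[name] = int(np.argmin(np.abs(unit_freqs - target)))
    return mapping

def stimulate(hidden, unit_index, value=3.0):
    # Overwrite one hidden unit before it is fed to the next layer, so that the
    # behavior associated with that unit can be evoked by a human command.
    hidden = hidden.copy()
    hidden[unit_index] = value
    return hidden

Under this reading, the mapping returned by associate_units plays the role of the unit-to-kinematics correspondence mentioned in the abstract, and calling stimulate on a mapped unit during rollout is what would let a human steer the agent toward behaviors the policy was never explicitly trained to expose.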
