Paper Title
Towards Expedited Impedance Tuning of a Robotic Prosthesis for Personalized Gait Assistance by Reinforcement Learning Control
Paper Authors
Paper Abstract
Personalizing medical devices such as lower limb wearable robots is challenging. While the initial feasibility of automating the process of knee prosthesis control parameter tuning has been demonstrated in a principled way, the next critical issue is to improve tuning efficiency and speed it up for the human user, in clinical settings, while maintaining human safety. We therefore propose a policy iteration with constraint embedded (PICE) method as an innovative solution to the problem under the framework of reinforcement learning. Central to PICE is the use of a projected Bellman equation with a constraint that assures positive semidefiniteness of performance values during policy evaluation. Additionally, we developed both online and offline PICE implementations that provide additional flexibility for the designer to fully utilize measurement data, whether on-policy or off-policy, to further improve PICE tuning efficiency. Our human subject testing showed that PICE provided effective policies with significantly reduced tuning time. For the first time, we also experimentally evaluated and demonstrated the robustness of the deployed policies by applying them to different tasks and users. Taken together, our new approach to the problem has proven effective, as PICE has demonstrated its potential toward truly automating the process of control parameter tuning for robotic knee prosthesis users.
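The constrained policy evaluation described in the abstract can be illustrated with a small sketch. Everything below is an illustrative assumption: the linear-quadratic surrogate dynamics, the cost matrices, and the model-based Lyapunov solve (standing in for the paper's data-driven projected Bellman evaluation) are not from the paper; only the positive-semidefiniteness projection of the value matrix mirrors the constraint PICE embeds during policy evaluation.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def project_psd(P):
    """Project a symmetric matrix onto the PSD cone by clipping eigenvalues."""
    P = (P + P.T) / 2.0
    w, V = np.linalg.eigh(P)
    return V @ np.diag(np.clip(w, 0.0, None)) @ V.T

def pice_step(A, B, Q, R, K):
    """One policy-iteration step with a PSD constraint on the value matrix.

    Hypothetical LQ surrogate: policy evaluation solves the closed-loop
    Bellman (discrete Lyapunov) equation for the value matrix P, then P is
    projected onto the PSD cone before policy improvement.
    """
    Acl = A - B @ K                       # closed-loop dynamics under policy K
    Qk = Q + K.T @ R @ K                  # per-step cost under policy K
    # Policy evaluation: Acl' P Acl - P + Qk = 0.
    P = solve_discrete_lyapunov(Acl.T, Qk)
    # Constraint embedding: enforce positive semidefiniteness of P.
    P = project_psd(P)
    # Policy improvement (LQR-style greedy update).
    K_new = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K_new, P
```

Starting from any stabilizing gain, repeatedly calling `pice_step` improves the policy while the projection guarantees each evaluated value matrix stays PSD, which is the safety-oriented property the abstract attributes to PICE's policy-evaluation phase.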