Interactive Learning from Natural Language and Demonstrations using Signal Temporal Logic

Paper Title

Interactive Learning from Natural Language and Demonstrations using Signal Temporal Logic

Authors

Sara Mohammadinejad, Jesse Thomason, Jyotirmoy V. Deshmukh

Abstract

Natural language is an intuitive way for humans to communicate tasks to a robot. While natural language (NL) is ambiguous, real-world tasks and their safety requirements need to be communicated unambiguously. Signal Temporal Logic (STL) is a formal logic that can serve as a versatile, expressive, and unambiguous formal language to describe robotic tasks. On one hand, existing work on using STL in the robotics domain typically requires end-users to express task specifications in STL, a challenge for non-expert users. On the other hand, translating from NL to STL specifications is currently restricted to specific fragments. In this work, we propose DIALOGUESTL, an interactive approach for learning correct and concise STL formulas from (often) ambiguous NL descriptions. We use a combination of semantic parsing, pre-trained transformer-based language models, and user-in-the-loop clarifications aided by a small number of user demonstrations to predict the best STL formula to encode NL task descriptions. An advantage of mapping NL to STL is that there has been considerable recent work on the use of reinforcement learning (RL) to identify control policies for robots. We show that we can use Deep Q-Learning techniques to learn optimal policies from the learned STL specifications. We demonstrate that DIALOGUESTL is efficient, scalable, and robust, and has high accuracy in predicting the correct STL formula with a small number of demonstrations and a few interactions with an oracle user.
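
To make the idea of an unambiguous STL task description more concrete, the following is a minimal sketch of how a candidate STL formula might be checked against a single demonstration using the standard quantitative (robustness) semantics on a discrete-time signal. It is not taken from the paper: the task ("eventually get within 0.5 of the goal and always stay more than 1.0 away from the obstacle"), the thresholds, the helper names `always` and `eventually`, and the demonstration values are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def always(rho: Sequence[float], a: int, b: int) -> float:
    """Robustness of G_[a,b] phi at time 0: the worst value in the window."""
    return min(rho[a:b + 1])


def eventually(rho: Sequence[float], a: int, b: int) -> float:
    """Robustness of F_[a,b] phi at time 0: the best value in the window."""
    return max(rho[a:b + 1])


# Hypothetical demonstration: distances sampled at five time steps.
dist_to_goal = [4.0, 3.0, 2.0, 1.0, 0.2]
dist_to_obstacle = [2.0, 1.5, 1.2, 1.8, 2.5]

# Predicate robustness: a positive value means the predicate holds at that step.
reach = [0.5 - d for d in dist_to_goal]      # dist_to_goal < 0.5
avoid = [d - 1.0 for d in dist_to_obstacle]  # dist_to_obstacle > 1.0

# Candidate formula: F_[0,4](reach) AND G_[0,4](avoid).
# Conjunction takes the minimum of the two robustness values.
robustness = min(eventually(reach, 0, 4), always(avoid, 0, 4))

# A positive robustness means the demonstration satisfies the candidate formula.
print("satisfied" if robustness > 0 else "violated", robustness)
```

Under these assumptions, a candidate formula whose robustness is negative on a user demonstration would be inconsistent with that demonstration, which is one plausible way demonstrations could help narrow down ambiguous NL descriptions.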
