Paper Title
Show Us the Way: Learning to Manage Dialog from Demonstrations
Authors
Abstract
We present our submission to the End-to-End Multi-Domain Dialog Challenge Track of the Eighth Dialog System Technology Challenge. Our proposed dialog system adopts a pipeline architecture, with distinct components for Natural Language Understanding, Dialog State Tracking, Dialog Management and Natural Language Generation. At the core of our system is a reinforcement learning algorithm which uses Deep Q-learning from Demonstrations to learn a dialog policy with the help of expert examples. We find that demonstrations are essential to training an accurate dialog policy where both state and action spaces are large. Evaluation of our Dialog Management component shows that our approach is effective, beating both supervised and reinforcement learning baselines.
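As a rough illustration of how expert examples can shape a Q-learning policy, the sketch below computes the large-margin supervised term that Deep Q-learning from Demonstrations adds to the usual temporal-difference loss. The abstract does not give the authors' loss configuration, so the function name, the margin value, and the toy Q-values are assumptions made for illustration only.

```python
from typing import Sequence


def large_margin_loss(
    q_values: Sequence[float],
    expert_action: int,
    margin: float = 0.8,  # illustrative value, not taken from the abstract
) -> float:
    """Large-margin term: max_a [Q(s, a) + l(a_E, a)] - Q(s, a_E).

    l(a_E, a) is 0 for the expert action and `margin` for every other
    action, so the loss is zero only when the expert action's Q-value
    exceeds all other Q-values by at least `margin`.
    """
    augmented = [
        q + (0.0 if action == expert_action else margin)
        for action, q in enumerate(q_values)
    ]
    return max(augmented) - q_values[expert_action]


# Hypothetical Q-values for three dialog actions in one dialog state.
q = [1.0, 2.0, 0.5]

# Expert chose action 1, which already leads by at least the margin.
loss_agree = large_margin_loss(q, expert_action=1)

# Expert chose action 0, but the network currently prefers action 1.
loss_disagree = large_margin_loss(q, expert_action=0)
```

When the network already prefers the demonstrated action by a sufficient margin the term vanishes; otherwise it pushes the demonstrated action's value above the alternatives. This kind of signal is one way demonstrations can guide learning when the action space is too large to explore by trial and error alone.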