Information Extraction and Human-Robot Dialogue towards Real-life Tasks: A Baseline Study with the MobileCS Dataset

Paper Title

Information Extraction and Human-Robot Dialogue towards Real-life Tasks: A Baseline Study with the MobileCS Dataset

Authors

Liu, Hong; Peng, Hao; Ou, Zhijian; Li, Juanzi; Huang, Yi; Feng, Junlan

Abstract

Recently, a class of task-oriented dialogue (TOD) datasets collected through Wizard-of-Oz simulated games has emerged. However, Wizard-of-Oz data are in fact simulated and are therefore fundamentally different from real-life conversations, which are noisier and more casual. The SereTOD challenge was recently organized and released the MobileCS dataset, which consists of real-world dialog transcripts between real users and customer-service staff from China Mobile. Based on the MobileCS dataset, the SereTOD challenge has two tasks: one evaluates the construction of the dialogue system itself, and the other examines information extraction from dialog transcripts, which is crucial for building the knowledge base for TOD. This paper presents a baseline study of the two tasks with the MobileCS dataset. We describe how the two baselines are constructed, the problems encountered, and the results. We anticipate that the baselines can facilitate future research on building human-robot dialogue systems for real-life tasks.
