Paper Title
MuTual: A Dataset for Multi-Turn Dialogue Reasoning
Paper Authors
Paper Abstract
Non-task-oriented dialogue systems have achieved great success in recent years due to the wide availability of conversation data and advances in deep learning techniques. Given a context, current systems are able to yield a relevant and fluent response, but they sometimes make logical mistakes because of weak reasoning capabilities. To facilitate research on dialogue reasoning, we introduce MuTual, a novel dataset for Multi-Turn dialogue Reasoning, consisting of 8,860 manually annotated dialogues based on Chinese students' English listening comprehension exams. Compared to previous benchmarks for non-task-oriented dialogue systems, MuTual is much more challenging, since it requires a model that can handle various reasoning problems. Empirical results show that state-of-the-art methods reach only 71% accuracy, far behind the human performance of 94%, indicating ample room for improving reasoning ability. MuTual is available at https://github.com/Nealcly/MuTual.
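The abstract compares models and humans by accuracy on selecting the correct response for each dialogue context. A minimal sketch of that metric, assuming the task is multiple-choice response selection with one gold answer per context (the letter-choice format below is an illustrative assumption, not the repository's actual file layout):

```python
# Hedged sketch: accuracy for a multiple-choice response-selection task,
# i.e. the fraction of dialogue contexts where the model's top-ranked
# candidate matches the annotated gold answer. The letter-choice encoding
# is assumed for illustration only.

def accuracy(predicted: list[str], gold: list[str]) -> float:
    """Return the fraction of contexts answered correctly."""
    if len(predicted) != len(gold):
        raise ValueError("prediction/gold length mismatch")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

# Toy example (hypothetical choices, not real MuTual data):
model_preds = ["A", "C", "B", "D"]
human_preds = ["A", "B", "B", "D"]
gold        = ["A", "B", "B", "C"]

print(accuracy(model_preds, gold))  # 0.5 on this toy sample
print(accuracy(human_preds, gold))  # 0.75 on this toy sample
```

On the real benchmark, this metric is what yields the reported 71% (best model) versus 94% (human) gap.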