Paper Title

KddRES: A Multi-level Knowledge-driven Dialogue Dataset for Restaurant Towards Customized Dialogue System

Authors

Wang, Hongru; Li, Min; Zhou, Zimo; Fung, Gabriel Pui Cheong; Wong, Kam-Fai

Abstract

Existing datasets such as CrossWOZ (Chinese) and MultiWOZ (English) contain only coarse-grained information, and no dataset properly handles fine-grained, hierarchical information. In this paper, we publish the first Cantonese Knowledge-driven Dialogue Dataset for REStaurant (KddRES) in Hong Kong, which grounds the information in multi-turn conversations to one specific restaurant. Our corpus contains 0.8k conversations drawn from 10 restaurants with various styles in different regions. In addition, we design fine-grained slots and intents to better capture semantic information. Benchmark experiments and a statistical analysis of the data show the diversity and rich annotations of our dataset. We believe the release of KddRES can be a necessary supplement to current dialogue datasets, and that it is more suitable and valuable for small and medium-sized enterprises (SMEs), for example for building a customized dialogue system for each restaurant. The corpus and benchmark models are publicly available.
