Paper Title
Multi2WOZ: A Robust Multilingual Dataset and Conversational Pretraining for Task-Oriented Dialog
Paper Authors
Paper Abstract
Research on (multi-domain) task-oriented dialog (TOD) has predominantly focused on the English language, primarily due to the shortage of robust TOD datasets in other languages, preventing the systematic investigation of cross-lingual transfer for this crucial NLP application area. In this work, we introduce Multi2WOZ, a new multilingual multi-domain TOD dataset, derived from the well-established English dataset MultiWOZ, that spans four typologically diverse languages: Chinese, German, Arabic, and Russian. In contrast to concurrent efforts, Multi2WOZ contains gold-standard dialogs in target languages that are directly comparable with development and test portions of the English dataset, enabling reliable and comparative estimates of cross-lingual transfer performance for TOD. We then introduce a new framework for multilingual conversational specialization of pretrained language models (PrLMs) that aims to facilitate cross-lingual transfer for arbitrary downstream TOD tasks. Using such conversational PrLMs specialized for concrete target languages, we systematically benchmark a number of zero-shot and few-shot cross-lingual transfer approaches on two standard TOD tasks: Dialog State Tracking and Response Retrieval. Our experiments show that, in most setups, the best performance entails the combination of (i) conversational specialization in the target language and (ii) few-shot transfer for the concrete TOD task. Most importantly, we show that our conversational specialization in the target language allows for an exceptionally sample-efficient few-shot transfer for downstream TOD tasks.