Paper Title
Multi2WOZ: A Robust Multilingual Dataset and Conversational Pretraining for Task-Oriented Dialog
Paper Authors
Paper Abstract
Research on (multi-domain) task-oriented dialog (TOD) has predominantly focused on the English language, primarily due to the shortage of robust TOD datasets in other languages, preventing the systematic investigation of cross-lingual transfer for this crucial NLP application area. In this work, we introduce Multi2WOZ, a new multilingual multi-domain TOD dataset, derived from the well-established English dataset MultiWOZ, that spans four typologically diverse languages: Chinese, German, Arabic, and Russian. In contrast to concurrent efforts, Multi2WOZ contains gold-standard dialogs in target languages that are directly comparable with development and test portions of the English dataset, enabling reliable and comparative estimates of cross-lingual transfer performance for TOD. We then introduce a new framework for multilingual conversational specialization of pretrained language models (PrLMs) that aims to facilitate cross-lingual transfer for arbitrary downstream TOD tasks. Using such conversational PrLMs specialized for concrete target languages, we systematically benchmark a number of zero-shot and few-shot cross-lingual transfer approaches on two standard TOD tasks: Dialog State Tracking and Response Retrieval. Our experiments show that, in most setups, the best performance entails the combination of (i) conversational specialization in the target language and (ii) few-shot transfer for the concrete TOD task. Most importantly, we show that our conversational specialization in the target language allows for an exceptionally sample-efficient few-shot transfer for downstream TOD tasks.