Paper Title

Modelling Latent Skills for Multitask Language Generation

Paper Authors

Kris Cao, Dani Yogatama

Paper Abstract

We present a generative model for multitask conditional language generation. Our guiding hypothesis is that a shared set of latent skills underlies many disparate language generation tasks, and that explicitly modelling these skills in a task embedding space can help with both positive transfer across tasks and with efficient adaptation to new tasks. We instantiate this task embedding space as a latent variable in a latent variable sequence-to-sequence model. We evaluate this hypothesis by curating a series of monolingual text-to-text language generation datasets - covering a broad range of tasks and domains - and comparing the performance of models both in the multitask and few-shot regimes. We show that our latent task variable model outperforms other sequence-to-sequence baselines on average across tasks in the multitask setting. In the few-shot learning setting on an unseen test dataset (i.e., a new task), we demonstrate that model adaptation based on inference in the latent task space is more robust than standard fine-tuning based parameter adaptation and performs comparably in terms of overall performance. Finally, we examine the latent task representations learnt by our model and show that they cluster tasks in a natural way.
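The abstract only sketches the architecture at a high level. Below is a minimal illustrative sketch, not the authors' implementation: it assumes a GRU encoder and decoder and a diagonal Gaussian latent task variable z that conditions every decoder step, and for simplicity it infers z from each input sequence rather than from a set of task examples as the paper does. All names (LatentTaskSeq2Seq, to_mu, to_logvar, etc.) are hypothetical.

```python
# Minimal sketch (assumed, not the paper's code) of a latent-task-variable
# sequence-to-sequence model: a Gaussian latent z is inferred, sampled with
# the reparameterisation trick, and concatenated to the decoder inputs.
import torch
import torch.nn as nn

class LatentTaskSeq2Seq(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256, latent_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Amortised inference network: maps the encoder state to the
        # parameters of a diagonal Gaussian over the task variable z.
        self.to_mu = nn.Linear(hid_dim, latent_dim)
        self.to_logvar = nn.Linear(hid_dim, latent_dim)
        # The decoder is conditioned on z by concatenating z to every
        # decoder input embedding.
        self.decoder = nn.GRU(emb_dim + latent_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, src, tgt_in):
        _, h = self.encoder(self.embed(src))                   # h: (1, B, hid_dim)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterised sample
        z_rep = z.unsqueeze(1).expand(-1, tgt_in.size(1), -1)
        dec_in = torch.cat([self.embed(tgt_in), z_rep], dim=-1)
        dec_out, _ = self.decoder(dec_in, h)
        logits = self.out(dec_out)                              # (B, T, vocab_size)
        # KL term of the ELBO against a standard normal prior over z.
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return logits, kl
```

Under this kind of model, few-shot adaptation to a new task can be framed as inferring (or optimising) z on the new task's examples while keeping the sequence-to-sequence weights frozen, rather than fine-tuning all parameters; that is the general idea the abstract contrasts with standard fine-tuning, not the paper's exact procedure.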
