Paper Title
Advancing Semi-Supervised Task Oriented Dialog Systems by JSA Learning of Discrete Latent Variable Models
Paper Authors
Paper Abstract
Developing semi-supervised task-oriented dialog (TOD) systems by leveraging unlabeled dialog data has attracted increasing interest. For semi-supervised learning of latent state TOD models, variational learning is often used, but it suffers from the annoying high variance of gradients propagated through discrete latent variables and from the drawback of indirectly optimizing the target log-likelihood. Recently, an alternative algorithm, called joint stochastic approximation (JSA), has emerged for learning discrete latent variable models with impressive performance. In this paper, we propose to apply JSA to semi-supervised learning of latent state TOD models, which we refer to as JSA-TOD. To our knowledge, JSA-TOD represents the first work in developing JSA-based semi-supervised learning of discrete latent variable conditional models for long sequential generation problems such as those in TOD systems. Extensive experiments show that JSA-TOD significantly outperforms its variational learning counterpart. Remarkably, semi-supervised JSA-TOD using 20% labels performs close to the fully-supervised baseline on MultiWOZ2.1.
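The high-variance issue the abstract refers to can be seen in a minimal sketch (not from the paper): for a discrete latent variable, variational learning typically falls back on the score-function (REINFORCE) gradient estimator, which is unbiased but noisy. The toy objective `f` and all numbers below are hypothetical, chosen only to make the variance visible.

```python
import numpy as np

rng = np.random.default_rng(0)

theta = 0.3                       # Bernoulli parameter of the discrete latent z
f = lambda z: (z - 0.6) ** 2      # hypothetical toy objective

# Exact gradient of E[f(z)] w.r.t. theta for a Bernoulli latent:
# d/dtheta [(1-theta)*f(0) + theta*f(1)] = f(1) - f(0)
exact_grad = f(1.0) - f(0.0)

# Score-function estimator: f(z) * d log p(z; theta) / d theta,
# where d log p / d theta = 1/theta if z=1, and -1/(1-theta) if z=0.
z = rng.random(100_000) < theta
score = z / theta - (~z) / (1 - theta)
est = f(z.astype(float)) * score

# The estimator is unbiased (its mean matches exact_grad), but its
# per-sample standard deviation dwarfs the gradient itself.
print(exact_grad, est.mean(), est.std())
```

The per-sample noise here is more than twice the magnitude of the true gradient; in long sequential generation, where many such discrete choices are chained, this noise compounds, which is the motivation the abstract gives for moving from variational learning to JSA.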