Paper Title

Towards Diverse, Relevant and Coherent Open-Domain Dialogue Generation via Hybrid Latent Variables

Authors

Bin Sun, Yitong Li, Fei Mi, Weichao Wang, Yiwei Li, Kan Li

Abstract

Conditional variational models, using either continuous or discrete latent variables, are powerful for open-domain dialogue response generation. However, previous works show that continuous latent variables tend to reduce the coherence of generated responses. In this paper, we also find that discrete latent variables have difficulty capturing more diverse expressions. To tackle these problems, we combine the merits of both continuous and discrete latent variables and propose a Hybrid Latent Variable (HLV) method. Specifically, HLV constrains the global semantics of responses through discrete latent variables and enriches responses with continuous latent variables. Thus, we diversify the generated responses while maintaining relevance and coherence. In addition, we propose the Conditional Hybrid Variational Transformer (CHVT) to construct and utilize HLV with transformers for dialogue generation. Through fine-grained symbolic-level semantic information and additive Gaussian mixing, we construct the distribution of continuous variables, encouraging the generation of diverse expressions. Meanwhile, to maintain relevance and coherence, the discrete latent variable is optimized by self-separation training. Experimental results on two dialogue generation datasets (DailyDialog and OpenSubtitles) show that CHVT is superior to traditional transformer-based variational mechanisms w.r.t. diversity, relevance, and coherence metrics. Moreover, we demonstrate the benefit of applying HLV to fine-tuning two pre-trained dialogue models (PLATO and BART-base).
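The abstract describes the hybrid latent variable only at a high level. As a rough illustration, the sketch below shows one way such a latent could be assembled: a discrete code, sampled with Gumbel-softmax, anchors the global semantics, while a continuous vector formed by additively mixing per-token Gaussian components injects fine-grained variation. This is a minimal sketch assuming PyTorch; the module and layer names (HybridLatentSketch, disc_head, mu_head, logvar_head) and all dimensions are invented for illustration and do not reflect the authors' actual CHVT implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridLatentSketch(nn.Module):
    """Illustrative hybrid latent variable (HLV): a discrete code
    constrains global semantics; a continuous vector built by
    additive Gaussian mixing over token-level states adds
    fine-grained diversity. All names and sizes are assumptions."""

    def __init__(self, hidden_size: int = 768, n_codes: int = 20, z_size: int = 64):
        super().__init__()
        self.disc_head = nn.Linear(hidden_size, n_codes)    # logits over K discrete codes
        self.disc_embed = nn.Embedding(n_codes, z_size)     # code index -> latent vector
        self.mu_head = nn.Linear(hidden_size, z_size)       # per-token Gaussian mean
        self.logvar_head = nn.Linear(hidden_size, z_size)   # per-token Gaussian log-variance

    def forward(self, token_states: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # token_states: (B, T, H) encoder hidden states; mask: (B, T), 1 for real tokens
        mask = mask.float()
        pooled = (token_states * mask.unsqueeze(-1)).sum(1) / mask.sum(1, keepdim=True)

        # Discrete latent: choose one global semantic code.
        # Gumbel-softmax keeps the hard choice differentiable during training.
        code_onehot = F.gumbel_softmax(self.disc_head(pooled), tau=1.0, hard=True)
        z_disc = code_onehot @ self.disc_embed.weight       # (B, z_size)

        # Continuous latent via additive Gaussian mixing: the sum of independent
        # per-token Gaussians is itself Gaussian, so means and variances add.
        mu = (self.mu_head(token_states) * mask.unsqueeze(-1)).sum(1)
        var = (self.logvar_head(token_states).exp() * mask.unsqueeze(-1)).sum(1)
        z_cont = mu + var.sqrt() * torch.randn_like(mu)     # reparameterization trick

        # Hybrid latent: the discrete part anchors relevance and coherence,
        # the continuous part contributes diverse phrasing.
        return z_disc + z_cont
```

A decoder could then condition on the returned vector, e.g., by projecting it and adding it to the decoder input embeddings; sampling the continuous part multiple times for a fixed discrete code would yield varied responses that share the same global meaning.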
