论文标题
从连续的话语中学习对话表示
Learning Dialogue Representations from Consecutive Utterances
论文作者
论文摘要
学习高质量的对话表示对于解决各种面向对话的任务至关重要,尤其是考虑到对话系统通常会遇到数据稀缺。在本文中,我们介绍了对话句子嵌入(DSE),这是一种自我监视的对比学习方法,它学习有效的对话表示,适合各种对话任务。 DSE通过连续进行与对比度学习的积极对对话的连续对话,从对话中学习。尽管它很简单,但与其他对话表示和普遍的句子表示模型相比,DSE的表现能力明显更好。我们评估DSE的五个下游对话任务,这些任务检查了不同语义粒度的对话表示。几次射击和零射击设置的实验表明,DSE的表现优于基准的幅度很大。例如,它在6个数据集中的1-Shot意图分类中,比最强的无监督基线实现了13%的平均绩效提高。我们还提供了有关模型的好处和局限性的分析。
Learning high-quality dialogue representations is essential for solving a variety of dialogue-oriented tasks, especially considering that dialogue systems often suffer from data scarcity. In this paper, we introduce Dialogue Sentence Embedding (DSE), a self-supervised contrastive learning method that learns effective dialogue representations suitable for a wide range of dialogue tasks. DSE learns from dialogues by taking consecutive utterances of the same dialogue as positive pairs for contrastive learning. Despite its simplicity, DSE achieves significantly better representation capability than other dialogue representation and universal sentence representation models. We evaluate DSE on five downstream dialogue tasks that examine dialogue representation at different semantic granularities. Experiments in few-shot and zero-shot settings show that DSE outperforms baselines by a large margin. For example, it achieves 13% average performance improvement over the strongest unsupervised baseline in 1-shot intent classification on 6 datasets. We also provide analyses on the benefits and limitations of our model.