MCP: Self-supervised Pre-training for Personalized Chatbots with Multi-level Contrastive Sampling

Paper Title

MCP: Self-supervised Pre-training for Personalized Chatbots with Multi-level Contrastive Sampling

Authors

Zhaoheng Huang, Zhicheng Dou, Yutao Zhu, Zhengyi Ma

Abstract

Personalized chatbots aim to give a chatbot a consistent personality so that it behaves like a real user and can further act as a personal assistant. Previous studies have explored generating implicit user profiles from a user's dialogue history to build personalized chatbots. However, these studies train the entire model with only the response generation loss, which makes them prone to data sparsity. They also overemphasize the quality of the final generated response while ignoring the correlations and fusion within the user's dialogue history, leading to coarse data representations and degraded performance. To tackle these problems, we propose MCP, a self-supervised learning framework that captures better representations from users' dialogue history for personalized chatbots. Specifically, we apply contrastive sampling methods to leverage the supervision signals hidden in user dialogue history and to generate pre-training samples that enhance the model. We design three pre-training tasks based on three types of contrastive pairs drawn from user dialogue history: response pairs, sequence augmentation pairs, and user pairs. We pre-train the utterance encoder and the history encoder towards the contrastive objectives and use these pre-trained encoders to generate user profiles during personalized response generation. Experimental results on two real-world datasets show that MCP significantly outperforms existing methods.
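The abstract does not state the exact form of the contrastive objectives. As a rough illustration only, the sketch below uses InfoNCE, a common choice for contrastive pre-training, which may differ from the loss the authors actually use. The function names, the temperature value, and the toy embeddings are all hypothetical. Each anchor embedding is pulled towards its paired positive, and the other positives in the batch serve as negatives.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    positives[i] is the positive for anchors[i]; every other
    positives[j] is treated as an in-batch negative. This is a generic
    formulation assumed for illustration, not taken from the paper.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Hypothetical toy embeddings standing in for encoder outputs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each positive resembles its anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each positive resembles the other anchor

loss_aligned = info_nce(anchors, aligned)
loss_swapped = info_nce(anchors, swapped)
```

In this toy setup the loss is lower when each positive matches its own anchor than when the pairs are swapped, which is the behaviour a contrastive objective is meant to reward. Under this reading, response pairs, sequence augmentation pairs, and user pairs would each supply a different set of anchors and positives to a loss of this general shape.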
