Paper Title

Toward Interpretability of Dual-Encoder Models for Dialogue Response Suggestions

Authors

Yitong Li, Dianqi Li, Sushant Prakash, Peng Wang

Abstract

This work shows how to improve and interpret the commonly used dual-encoder model for response suggestion in dialogue. We present an attentive dual-encoder model that adds an attention mechanism on top of the word-level features extracted by the two encoders, one for the context and one for the label. To improve interpretability in dual-encoder models, we design a novel regularization loss, used alongside the original attention method, that minimizes the mutual information between unimportant words and the desired labels, so that important words are emphasized while unimportant words are de-emphasized. This not only helps model interpretability but also further improves model accuracy. We propose an approximation method that uses a neural network to estimate the mutual information. Furthermore, by adding a residual layer between the raw word embeddings and the final encoded context feature, word-level interpretability is preserved at the model's final prediction. We compare the proposed model with existing methods on the dialogue response task over two public datasets (Persona and Ubuntu). The experiments demonstrate the effectiveness of the proposed model in terms of better Recall@1 accuracy and visualized interpretability.
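The abstract's core architectural idea, attention pooling over word-level features with a residual connection from the raw embeddings so word-level signals survive into the final matching score, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the single learned attention vector per encoder, and the mean-embedding residual are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax for the attention weights.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attentive_encode(word_embeddings, attn_vector):
    """Attention-pool word-level features into one vector, then add a
    residual from the mean raw embedding so word-level information is
    preserved in the final representation (illustrative simplification)."""
    scores = word_embeddings @ attn_vector      # one score per word
    weights = softmax(scores)                   # attention over words
    pooled = weights @ word_embeddings          # attention-weighted pooling
    residual = word_embeddings.mean(axis=0)     # raw-embedding residual path
    return pooled + residual, weights

d = 8
context = rng.normal(size=(5, d))   # hypothetical 5 context-word features
label = rng.normal(size=(3, d))     # hypothetical 3 label-word features
attn_c = rng.normal(size=d)         # context-side attention parameters
attn_l = rng.normal(size=d)         # label-side attention parameters

ctx_vec, ctx_weights = attentive_encode(context, attn_c)
lbl_vec, lbl_weights = attentive_encode(label, attn_l)
score = float(ctx_vec @ lbl_vec)    # dual-encoder dot-product matching score
```

The per-word attention weights (`ctx_weights`, `lbl_weights`) are what make the model inspectable: they can be visualized directly to show which words drove the match.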
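The abstract also mentions approximating mutual information with a neural network for the regularization loss. One common neural estimator of this kind is a MINE-style Donsker-Varadhan lower bound, sketched below with a fixed (untrained) toy statistics network; the paper's exact estimator and training setup may differ, and all names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def mi_lower_bound(T, joint_pairs, marginal_pairs):
    """Donsker-Varadhan bound: I(X;Y) >= E_joint[T] - log E_marginal[exp(T)].
    `joint_pairs` are samples from p(x, y); `marginal_pairs` from p(x)p(y)."""
    joint_term = np.mean([T(x, y) for x, y in joint_pairs])
    exp_term = np.mean([np.exp(T(x, y)) for x, y in marginal_pairs])
    return joint_term - np.log(exp_term)

# Toy statistics network T(x, y): one fixed hidden layer. In practice T
# would be trained jointly to tighten the bound.
W1 = rng.normal(size=(4, 2)) * 0.5
w2 = rng.normal(size=4) * 0.5

def T(x, y):
    h = np.tanh(W1 @ np.array([x, y]))
    return float(w2 @ h)

# Correlated samples approximate the joint; shuffling y breaks the
# dependence and approximates the product of marginals.
x = rng.normal(size=200)
y = x + 0.1 * rng.normal(size=200)
joint = list(zip(x, y))
marginal = list(zip(x, rng.permutation(y)))

mi_estimate = mi_lower_bound(T, joint, marginal)
```

Minimizing such a bound between the features of unimportant words and the label is one way to realize the de-emphasis the abstract describes: the regularizer pushes those features toward carrying no label information.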
