Paper Title

Keyword-Attentive Deep Semantic Matching

Paper Authors

Changyu Miao, Zhen Cao, Yik-Cheung Tam

Paper Abstract

Deep semantic matching is a crucial component in various natural language processing applications such as question answering (QA), where an input query is compared to each candidate question in a QA corpus in terms of relevance. Measuring similarities between a query-question pair in an open-domain scenario can be challenging due to diverse word tokens in the query-question pair. We propose a keyword-attentive approach to improve deep semantic matching. We first leverage domain tags from a large corpus to generate a domain-enhanced keyword dictionary. Built upon BERT, we stack a keyword-attentive transformer layer to highlight the importance of keywords in the query-question pair. During model training, we propose a new negative sampling approach based on keyword coverage between the input pair. We evaluate our approach on a Chinese QA corpus using various metrics, including precision of retrieval candidates and accuracy of semantic matching. Experiments show that our approach outperforms existing strong baselines. Our approach is general and can be applied to other text matching tasks with little adaptation.
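To make the keyword-coverage-based negative sampling idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a precomputed keyword dictionary (`keyword_dict`, standing in for the paper's domain-enhanced dictionary) and a hypothetical helper `sample_negatives`; it reads "coverage-based sampling" as preferring candidates that share many keywords with the query, i.e. hard negatives, which is one plausible interpretation of the abstract.

```python
def extract_keywords(text, keyword_dict):
    """Return the set of dictionary keywords that appear in the text.
    `keyword_dict` is assumed to be a set of domain-enhanced keywords."""
    return {kw for kw in keyword_dict if kw in text}

def keyword_coverage(query, question, keyword_dict):
    """Fraction of the query's keywords that also occur in the candidate question."""
    q_kw = extract_keywords(query, keyword_dict)
    if not q_kw:
        return 0.0
    c_kw = extract_keywords(question, keyword_dict)
    return len(q_kw & c_kw) / len(q_kw)

def sample_negatives(query, candidate_pool, keyword_dict, k=5):
    """Pick the k candidates with the highest keyword coverage w.r.t. the query
    as hard negatives (an assumption about how coverage guides sampling)."""
    scored = sorted(candidate_pool,
                    key=lambda q: keyword_coverage(query, q, keyword_dict),
                    reverse=True)
    return scored[:k]

# Hypothetical usage with a toy dictionary and candidate pool:
keyword_dict = {"refund", "order", "shipping", "invoice"}
pool = ["How do I request a refund for my order?",
        "Where is my shipping invoice?",
        "What are your opening hours?"]
print(sample_negatives("Can I get a refund on this order?", pool, keyword_dict, k=2))
```

In this reading, candidates that overlap heavily with the query's keywords but are not true paraphrases give the model harder training signal than randomly sampled negatives.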
