Paper Title

Keyphrase Prediction With Pre-trained Language Model

Authors

Rui Liu, Zheng Lin, Weiping Wang

Abstract

Recently, generative methods have been widely used in keyphrase prediction, thanks to their capability to produce both present keyphrases, which appear in the source text, and absent keyphrases, which do not match any span of the source text. However, the absent keyphrases are generated at the cost of performance on present keyphrase prediction, since previous works mainly use generative models that rely on the copying mechanism and select words step by step. Moreover, extractive models, which directly extract text spans, are better suited to predicting present keyphrases. Considering the different characteristics of extractive and generative methods, we propose to divide keyphrase prediction into two subtasks, i.e., present keyphrase extraction (PKE) and absent keyphrase generation (AKG), to fully exploit their respective advantages. On this basis, a joint inference framework is proposed to make the most of BERT in both subtasks. For PKE, we tackle the task as a sequence labeling problem with the pre-trained language model BERT. For AKG, we introduce a Transformer-based architecture that fully integrates the present keyphrase knowledge learned from PKE through the fine-tuned BERT. Experimental results show that our approach achieves state-of-the-art results on both tasks on benchmark datasets.
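The abstract frames PKE as sequence labeling on top of BERT. The following is a minimal sketch of that idea, not the authors' implementation: it uses the Hugging Face transformers library with an assumed BIO label scheme (0 = O, 1 = B, 2 = I) and the bert-base-uncased checkpoint, and the token-classification head below is untrained, so in practice it would first be fine-tuned on keyphrase-annotated data.

```python
# Sketch: present keyphrase extraction (PKE) as BIO sequence labeling with BERT.
# Assumptions: label ids 0 = O, 1 = B (begin keyphrase), 2 = I (inside keyphrase).
import torch
from transformers import BertTokenizerFast, BertForTokenClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForTokenClassification.from_pretrained("bert-base-uncased", num_labels=3)
model.eval()

text = "keyphrase prediction with pre-trained language models"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits              # shape: (1, seq_len, 3)
labels = logits.argmax(dim=-1)[0].tolist()       # one predicted BIO tag per subword

# Collect contiguous B/I runs as candidate present keyphrase spans.
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
spans, current = [], []
for tok, lab in zip(tokens, labels):
    if lab == 1:                 # B: start a new keyphrase span
        if current:
            spans.append(current)
        current = [tok]
    elif lab == 2 and current:   # I: extend the current span
        current.append(tok)
    else:                        # O or special token: close any open span
        if current:
            spans.append(current)
        current = []
if current:
    spans.append(current)

print([tokenizer.convert_tokens_to_string(s) for s in spans])
```

With an untrained head the printed spans are arbitrary; the sketch only illustrates how span extraction reduces to per-token classification, which is the formulation the abstract attributes to the PKE subtask.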
