Paper Title
Textual Entailment Recognition with Semantic Features from Empirical Text Representation
Paper Authors
Paper Abstract
Textual entailment recognition is one of the basic natural language understanding (NLU) tasks. Understanding the meaning of sentences is a prerequisite for applying any natural language processing (NLP) technique to recognize textual entailment automatically. A text entails a hypothesis if and only if the truth value of the hypothesis follows from the text. Classical approaches generally represent sentences using the feature values of each word taken from word embeddings. In this paper, we propose a novel approach to identifying the textual entailment relationship between a text and a hypothesis, introducing a new semantic feature built on an empirical threshold-based semantic text representation. We employ a feature based on the element-wise Manhattan distance vector, which can identify the semantic entailment relationship between the text-hypothesis pair. We carried out several experiments on a benchmark entailment classification dataset (SICK-RTE). We train several machine learning (ML) algorithms on both semantic and lexical features to classify each text-hypothesis pair as entailment, neutral, or contradiction. Our empirical sentence representation technique enriches the semantic information of the texts and hypotheses and is found to be more efficient than the classical ones. Overall, our approach significantly outperforms known methods in understanding the meaning of sentences for the textual entailment classification task.
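To make the element-wise Manhattan distance vector concrete, the following is a minimal sketch and not the authors' implementation. It assumes each sentence is represented by averaging its word vectors (the abstract's threshold-based representation is not specified here, so it is not reproduced), and it uses a hypothetical toy embedding table, `TOY_EMBEDDINGS`, with made-up values. The function names `sentence_vector` and `manhattan_distance_vector` are likewise illustrative.

```python
import numpy as np

# Hypothetical 4-dimensional word vectors; the values are invented for illustration.
TOY_EMBEDDINGS = {
    "a": np.array([0.1, 0.0, 0.2, 0.1]),
    "man": np.array([0.9, 0.1, 0.3, 0.0]),
    "person": np.array([0.8, 0.2, 0.3, 0.1]),
    "sleeps": np.array([0.0, 0.9, 0.1, 0.4]),
    "rests": np.array([0.1, 0.8, 0.2, 0.4]),
}


def sentence_vector(sentence, embeddings):
    """Average the vectors of the known words (an assumed, simple representation)."""
    vectors = [embeddings[w] for w in sentence.lower().split() if w in embeddings]
    if not vectors:
        # No known words: fall back to a zero vector of the embedding dimension.
        dim = len(next(iter(embeddings.values())))
        return np.zeros(dim)
    return np.mean(vectors, axis=0)


def manhattan_distance_vector(text, hypothesis, embeddings):
    """Element-wise absolute difference between the two sentence vectors.

    Summing this vector gives the scalar Manhattan (L1) distance; keeping it
    as a vector preserves one feature per embedding dimension.
    """
    t = sentence_vector(text, embeddings)
    h = sentence_vector(hypothesis, embeddings)
    return np.abs(t - h)


features = manhattan_distance_vector("A man sleeps", "A person rests", TOY_EMBEDDINGS)
```

In a pipeline like the one the abstract describes, a vector such as `features` would be combined with lexical features and passed to a classifier that predicts entailment, neutral, or contradiction.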