Paper Title
Logiformer: A Two-Branch Graph Transformer Network for Interpretable Logical Reasoning
Paper Authors

Paper Abstract
Machine reading comprehension has attracted wide attention, since it explores the potential of models for text understanding. To further equip machines with reasoning capability, the challenging task of logical reasoning has been proposed. Previous works on logical reasoning have proposed strategies to extract logical units from different aspects. However, it remains challenging to model the long-distance dependencies among the logical units. It is also demanding to uncover the logical structures of the text and further fuse the discrete logic into the continuous text embeddings. To tackle the above issues, we propose an end-to-end model, Logiformer, which utilizes a two-branch graph transformer network for logical reasoning over text. Firstly, we introduce different extraction strategies to split the text into two sets of logical units, and construct the logical graph and the syntax graph respectively. The logical graph models the causal relations for the logical branch, while the syntax graph captures the co-occurrence relations for the syntax branch. Secondly, to model the long-distance dependencies, the node sequence from each graph is fed into a fully connected graph transformer structure. The two adjacency matrices are viewed as attention biases for the graph transformer layers, which map the discrete logical structures into the continuous text embedding space. Thirdly, a dynamic gate mechanism and a question-aware self-attention module are introduced before the answer prediction to update the features. The reasoning process provides interpretability by employing the logical units, which are consistent with human cognition. The experimental results show the superiority of our model, which outperforms the state-of-the-art single models on two logical reasoning benchmarks.
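To make the two central mechanisms of the abstract concrete, the following is a minimal PyTorch sketch of (a) a fully connected self-attention layer that adds a graph adjacency matrix as an attention bias, and (b) a gated fusion of the two branch features. It is an illustrative reading of the abstract only, not the authors' released code: the class names `BiasedGraphTransformerLayer` and `DynamicGate`, the single shared node set, and all hyperparameters are assumptions.

```python
import torch
import torch.nn as nn


class BiasedGraphTransformerLayer(nn.Module):
    """Fully connected self-attention whose scores are offset by a
    graph-derived bias, so discrete edges shape the continuous
    attention while every node pair still interacts (long-distance
    dependencies). Illustrative sketch, not the paper's code."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (batch, nodes, dim)   embeddings of the logical units
        # adj: (batch, nodes, nodes) adjacency matrix of the graph
        b, n, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5
        # Adjacency matrix as attention bias: connected pairs get a
        # boost, broadcast across heads via the unsqueezed dimension.
        scores = scores + adj.unsqueeze(1)
        attn = scores.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, -1)
        return self.norm(x + self.out(out))


class DynamicGate(nn.Module):
    """Gated fusion of logical-branch and syntax-branch features."""

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, logic_feat: torch.Tensor,
                syntax_feat: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(torch.cat([logic_feat, syntax_feat], dim=-1)))
        return g * logic_feat + (1 - g) * syntax_feat


if __name__ == "__main__":
    b, n, d = 2, 10, 64
    x = torch.randn(b, n, d)
    # Toy discrete graphs standing in for the causal (logical) and
    # co-occurrence (syntax) adjacency matrices.
    logic_adj = (torch.rand(b, n, n) > 0.7).float()
    syntax_adj = (torch.rand(b, n, n) > 0.7).float()
    logic_branch = BiasedGraphTransformerLayer(d)
    syntax_branch = BiasedGraphTransformerLayer(d)
    fused = DynamicGate(d)(logic_branch(x, logic_adj),
                           syntax_branch(x, syntax_adj))
    print(fused.shape)  # torch.Size([2, 10, 64])
```

In this reading, the bias keeps the attention fully connected rather than masking non-edges, which matches the abstract's claim that the layers both respect the discrete logical structure and model dependencies between distant units.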