Paper Title

RLET: A Reinforcement Learning Based Approach for Explainable QA with Entailment Trees

Paper Authors

Tengxiao Liu, Qipeng Guo, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang

Paper Abstract

Interpreting the reasoning process from questions to answers poses a challenge in approaching explainable QA. A recently proposed structured reasoning format, the entailment tree, offers explicit logical deductions as entailment steps arranged in a tree structure. To generate entailment trees, prior single-pass sequence-to-sequence models lack visible internal decision probabilities, while stepwise approaches are supervised with extracted single-step data and cannot model the tree as a whole. In this work, we propose RLET, a Reinforcement Learning based Entailment Tree generation framework, which is trained utilising cumulative signals across the whole tree. RLET iteratively performs single-step reasoning with sentence selection and deduction generation modules, and the training signal is accumulated across the tree with an elaborately designed reward function that is aligned with the evaluation. To the best of our knowledge, we are the first to introduce RL into the entailment tree generation task. Experiments on three settings of the EntailmentBank dataset demonstrate the strength of using the RL framework.
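
To make the training idea in the abstract concrete, below is a minimal Python sketch of the stepwise loop it describes: a selection module picks two premises, a deduction module produces an intermediate conclusion, and a REINFORCE-style loss weights each step by the reward accumulated over the rest of the tree. All names here (`select_premises`, `generate_deduction`, `step_reward`, `rollout`) are hypothetical stand-ins introduced for illustration, not the authors' implementation; in RLET the selection and deduction modules are learned models, and the reward aligns predicted steps with the gold tree.

```python
import math
import random


def select_premises(pool, rng):
    """Stand-in for the sentence selection module: a learned policy would
    score every premise pair; here we sample one uniformly at random and
    return the log-probability of that choice."""
    i, j = rng.sample(range(len(pool)), 2)
    n_pairs = len(pool) * (len(pool) - 1) / 2
    return (i, j), math.log(1.0 / n_pairs)


def generate_deduction(premise_a, premise_b):
    """Stand-in for the deduction generation module (a seq2seq model in
    the paper); here we simply join the two premises."""
    return f"{premise_a} + {premise_b}"


def step_reward(conclusion, gold_steps):
    """Simplified alignment reward: +1 if the step matches a gold-tree
    step, -1 otherwise (the paper aligns predicted steps with the gold
    tree in a way consistent with the evaluation metric)."""
    return 1.0 if conclusion in gold_steps else -1.0


def rollout(sentences, hypothesis, gold_steps, max_steps=5, seed=0):
    """Iteratively build an entailment tree bottom-up and compute a
    REINFORCE-style loss whose return accumulates rewards across the
    whole tree rather than supervising each step in isolation."""
    rng = random.Random(seed)
    pool, log_probs, rewards = list(sentences), [], []
    for _ in range(max_steps):
        if len(pool) < 2:
            break
        (i, j), log_prob = select_premises(pool, rng)
        conclusion = generate_deduction(pool[i], pool[j])
        log_probs.append(log_prob)
        rewards.append(step_reward(conclusion, gold_steps))
        # The two used premises are replaced by their intermediate conclusion.
        pool = [s for k, s in enumerate(pool) if k not in (i, j)] + [conclusion]
        if conclusion == hypothesis:
            break
    # Reward-to-go returns give every step a whole-tree training signal.
    returns = [sum(rewards[t:]) for t in range(len(rewards))]
    loss = -sum(lp * r for lp, r in zip(log_probs, returns))
    return loss
```

The key design choice this sketch mirrors is that each step's log-probability is weighted by the cumulative future reward, so early selection and deduction decisions are credited or penalised by how the rest of the tree turns out.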
