Paper Title


Exploring Question-Specific Rewards for Generating Deep Questions

Paper Authors

Yuxi Xie, Liangming Pan, Dongzhe Wang, Min-Yen Kan, Yansong Feng

Paper Abstract


Recent question generation (QG) approaches often utilize the sequence-to-sequence framework (Seq2Seq) to optimize the log-likelihood of ground-truth questions using teacher forcing. However, this training objective is inconsistent with actual question quality, which is often reflected by certain global properties such as whether the question can be answered by the document. As such, we directly optimize for QG-specific objectives via reinforcement learning to improve question quality. We design three different rewards that target to improve the fluency, relevance, and answerability of generated questions. We conduct both automatic and human evaluations in addition to a thorough analysis to explore the effect of each QG-specific reward. We find that optimizing question-specific rewards generally leads to better performance in automatic evaluation metrics. However, only the rewards that correlate well with human judgement (e.g., relevance) lead to real improvement in question quality. Optimizing for the others, especially answerability, introduces incorrect bias to the model, resulting in poor question quality. Our code is publicly available at https://github.com/YuxiXie/RL-for-Question-Generation.
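
To make the training objective described above concrete, here is a minimal sketch of reward-driven fine-tuning with a self-critical policy-gradient (REINFORCE) loss, where the sequence-level reward mixes fluency, relevance, and answerability scores. The reward weights, function names, and toy tensors are illustrative assumptions, not the authors' exact implementation (see the linked repository for that).

```python
# Minimal sketch: self-critical policy-gradient fine-tuning for question generation.
# Reward weights and toy inputs are assumptions for illustration only.
import torch


def combined_reward(fluency, relevance, answerability, weights=(0.3, 0.4, 0.3)):
    """Weighted mixture of the three QG-specific rewards (weights are assumed)."""
    w_f, w_r, w_a = weights
    return w_f * fluency + w_r * relevance + w_a * answerability


def self_critical_loss(sample_log_probs, sample_reward, greedy_reward):
    """REINFORCE loss with a greedy-decoding baseline to reduce variance.

    sample_log_probs: (batch, seq_len) token log-probabilities of a sampled question
    sample_reward:    (batch,) combined reward of the sampled question
    greedy_reward:    (batch,) combined reward of the greedy-decoded question
    """
    advantage = (sample_reward - greedy_reward).detach()  # rewards act as constants
    seq_log_prob = sample_log_probs.sum(dim=-1)           # log-prob of the whole sequence
    return -(advantage * seq_log_prob).mean()             # ascend the expected reward


if __name__ == "__main__":
    # Toy values standing in for decoder outputs and reward-model scores.
    batch, seq_len = 4, 20
    sample_log_probs = -torch.rand(batch, seq_len, requires_grad=True)
    sample_r = combined_reward(torch.rand(batch), torch.rand(batch), torch.rand(batch))
    greedy_r = combined_reward(torch.rand(batch), torch.rand(batch), torch.rand(batch))
    loss = self_critical_loss(sample_log_probs, sample_r, greedy_r)
    loss.backward()
    print(float(loss))
```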
