Paper Title
Stylistic Dialogue Generation via Information-Guided Reinforcement Learning Strategy
Paper Authors
Paper Abstract
Stylistic response generation is crucial for building engaging dialogue systems for industrial use. While it has attracted much research interest, existing methods often generate stylistic responses at the cost of content quality (relevance and fluency). To achieve a better balance between content quality and style, we introduce a new training strategy, known as Information-Guided Reinforcement Learning (IG-RL). In IG-RL, the training model is encouraged to explore stylistic expressions while being constrained to maintain its content quality. This is achieved by adopting a reinforcement learning strategy with statistical style information as guidance for quality-preserving exploration. Experiments on two datasets show that the proposed approach outperforms several strong baselines in terms of overall response performance.
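To make the idea of quality-preserving, style-guided exploration concrete, below is a minimal sketch of how a reward combining a statistical style signal with a content-quality constraint could drive a REINFORCE-style update. The functions style_score, content_quality_score, guided_reward, and reinforce_loss, as well as the quality_floor and style_weight parameters, are illustrative assumptions for this sketch and are not the paper's exact formulation.

```python
# Illustrative sketch (not the paper's implementation): an information-guided
# reward that encourages stylistic exploration only while content quality is
# preserved, plugged into a REINFORCE-style policy-gradient loss.

import torch


def style_score(response_tokens, style_lexicon):
    """Statistical style signal: fraction of tokens found in a style lexicon
    (a stand-in for the paper's statistical style information)."""
    if not response_tokens:
        return 0.0
    hits = sum(tok in style_lexicon for tok in response_tokens)
    return hits / len(response_tokens)


def content_quality_score(response_tokens, reference_tokens):
    """Content-quality proxy: unigram overlap with a reference response.
    In practice this could be a learned relevance/fluency scorer."""
    if not response_tokens or not reference_tokens:
        return 0.0
    overlap = len(set(response_tokens) & set(reference_tokens))
    return overlap / len(set(reference_tokens))


def guided_reward(response_tokens, reference_tokens, style_lexicon,
                  quality_floor=0.2, style_weight=1.0):
    """Reward stylistic exploration only when content quality stays above a
    floor, so that exploration remains quality-preserving."""
    quality = content_quality_score(response_tokens, reference_tokens)
    style = style_score(response_tokens, style_lexicon)
    if quality < quality_floor:
        return quality - quality_floor           # penalize degraded content
    return quality + style_weight * style        # otherwise reward style


def reinforce_loss(token_log_probs, reward, baseline=0.0):
    """REINFORCE-style loss: scale the sampled sequence's log-likelihood
    by the (baselined) reward."""
    advantage = reward - baseline
    return -advantage * token_log_probs.sum()


# Toy usage with a sampled response and its per-token log-probabilities.
sampled = ["thou", "art", "most", "welcome"]
reference = ["you", "are", "very", "welcome"]
lexicon = {"thou", "art", "thee", "thy"}

log_probs = torch.log(torch.tensor([0.4, 0.3, 0.5, 0.6]))
reward = guided_reward(sampled, reference, lexicon)
loss = reinforce_loss(log_probs, reward, baseline=0.5)
print(f"reward={reward:.3f}, loss={loss.item():.3f}")
```

In an actual training loop, the log-probabilities would come from the dialogue model's sampled response, so minimizing this loss increases the probability of responses whose guided reward exceeds the baseline.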