Paper title
Critic-Guided Decoding for Controlled Text Generation
Authors
Abstract
Steering language generation toward desired objectives or away from undesired content has been a long-standing goal in the use of language models (LMs). Recent work has demonstrated that reinforcement learning and weighted decoding are both effective approaches to achieving a higher level of language control and quality, each with its own pros and cons. In this work, we propose a novel critic-guided decoding method for controlled language generation (CriticControl) that combines the strengths of reinforcement learning and weighted decoding. Specifically, we adopt the actor-critic framework to train an LM-steering critic from non-differentiable reward models. As in weighted decoding, our method freezes the language model and manipulates the output token distribution using the trained critic, which improves training efficiency and stability. Evaluation on three controlled generation tasks, namely topic control, sentiment control, and detoxification, shows that our approach generates more coherent and better-controlled text than previous methods. In addition, CriticControl demonstrates superior generalization ability in zero-shot settings. Human evaluation studies also corroborate our findings.
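The idea of leaving the language model frozen and letting a critic reshape the output token distribution can be illustrated with a small sketch. The abstract does not give the reweighting formula, so the version below is an assumption: it uses a generic weighted-decoding rule in which each token's LM probability is multiplied by a positive critic score and the result is renormalized. The names `critic_reweight`, `lm_logits`, `critic_values`, and `strength` are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def critic_reweight(lm_logits, critic_values, strength=1.0):
    """Reweight a frozen LM's next-token distribution with critic scores.

    lm_logits:     next-token logits from the frozen LM (hypothetical input).
    critic_values: one positive critic score per candidate token (hypothetical).
    strength:      exponent applied to the critic score; 0 leaves the LM
                   distribution unchanged (an assumed knob, not from the paper).
    """
    if len(lm_logits) != len(critic_values):
        raise ValueError("lm_logits and critic_values must have the same length")
    if any(v <= 0 for v in critic_values):
        raise ValueError("critic_values must be strictly positive")

    probs = softmax(lm_logits)
    # Assumed rule: p'(x) is proportional to p_LM(x) * critic(x) ** strength.
    weighted = [p * (v ** strength) for p, v in zip(probs, critic_values)]
    total = sum(weighted)
    return [w / total for w in weighted]


if __name__ == "__main__":
    lm_logits = [2.0, 1.0, 0.0]       # toy three-token vocabulary
    critic_values = [0.1, 0.9, 0.5]   # critic prefers the second token
    print(softmax(lm_logits))
    print(critic_reweight(lm_logits, critic_values))
```

In this toy case the LM alone prefers the first token, while the reweighted distribution shifts probability mass toward the second token that the critic scores highly. Only the critic would need training in such a setup, which is consistent with the efficiency and stability benefit the abstract attributes to freezing the LM.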