Paper Title


Best-$k$ Search Algorithm for Neural Text Generation

Authors

Jiacheng Xu, Caiming Xiong, Silvio Savarese, Yingbo Zhou

Abstract


Modern natural language generation paradigms require a good decoding strategy to obtain quality sequences out of the model. Beam search yields high-quality but low diversity outputs; stochastic approaches suffer from high variance and sometimes low quality, but the outputs tend to be more natural and creative. In this work, we propose a deterministic search algorithm balancing both quality and diversity. We first investigate the vanilla best-first search (BFS) algorithm and then propose the Best-$k$ Search algorithm. Inspired by BFS, we greedily expand the top $k$ nodes, instead of only the first node, to boost efficiency and diversity. Upweighting recently discovered nodes accompanied by heap pruning ensures the completeness of the search procedure. Experiments on four NLG tasks, including question generation, commonsense generation, text summarization, and translation, show that best-$k$ search yields more diverse and natural outputs compared to strong baselines, while our approach maintains high text quality. The proposed algorithm is parameter-free, lightweight, efficient, and easy to use.
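The abstract's core idea — expanding the top $k$ frontier nodes per step rather than only the single best node, as vanilla best-first search would — can be sketched as follows. This is a minimal illustration over a toy scoring table, not the authors' implementation: the recency upweighting and heap pruning described above are omitted, and all names and numbers (`best_k_search`, `successors`, the log-probability values) are hypothetical.

```python
import heapq
import itertools

def best_k_search(successors, score, is_goal, root, k=2, max_steps=100):
    """Sketch of best-k search: each step pops the k best frontier nodes
    (vanilla best-first search pops only one), expands them all, and
    pushes their children back onto the heap."""
    counter = itertools.count()                   # tie-breaker for equal scores
    heap = [(-score(root), next(counter), root)]  # max-heap via negated scores
    finished = []
    for _ in range(max_steps):
        if not heap:
            break
        # Pop up to k nodes at once; this batch expansion is what
        # distinguishes best-k search from plain best-first search.
        batch = [heapq.heappop(heap) for _ in range(min(k, len(heap)))]
        for neg_s, _, node in batch:
            if is_goal(node):
                finished.append((-neg_s, node))
                continue
            for child in successors(node):
                heapq.heappush(heap, (-score(child), next(counter), child))
    return sorted(finished, key=lambda t: t[0], reverse=True)

# Toy "model": per-token log-probabilities (made-up numbers for illustration).
logprobs = {"a": -0.1, "b": -0.5, "</s>": -0.3}

def successors(seq):
    if seq[-1] == "</s>" or len(seq) >= 4:
        return []
    return [seq + (t,) for t in logprobs]

def score(seq):
    return sum(logprobs[t] for t in seq[1:])  # skip the <s> marker

def is_goal(seq):
    return seq[-1] == "</s>" or len(seq) == 4

hyps = best_k_search(successors, score, is_goal, ("<s>",), k=2)
```

Because several hypotheses are popped per step, the search surfaces multiple completed sequences rather than committing to one greedy path, which mirrors the quality/diversity balance the abstract claims.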
