Paper Title
Active Example Selection for In-Context Learning
Authors
Abstract
With a handful of demonstration examples, large-scale language models show a strong capability to perform various tasks by in-context learning from these examples, without any fine-tuning. We demonstrate that in-context learning performance can be highly unstable across samples of examples, indicating the idiosyncrasies of how language models acquire information. We formulate example selection for in-context learning as a sequential decision problem, and propose a reinforcement learning algorithm for identifying generalizable policies to select demonstration examples. For GPT-2, our learned policies demonstrate strong generalization to tasks unseen during training, with a $5.8\%$ improvement on average. Examples selected by our learned policies even achieve a small improvement on GPT-3 Ada. However, the improvement diminishes on larger GPT-3 models, suggesting emergent capabilities of large language models.
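To make the sequential-decision framing concrete, here is a minimal toy sketch (not the paper's actual algorithm or reward): the state is the set of demonstration examples chosen so far, each action appends one more candidate example, and a terminal reward stands in for the language model's in-context-learning accuracy with that prompt. A tabular Q-learning loop then learns a selection policy. All names, sizes, and the reward function below are illustrative assumptions.

```python
import random

POOL_SIZE = 6   # number of candidate demonstration examples (assumed)
K = 3           # demonstrations to place in the prompt (assumed)

def reward(selected):
    # Hypothetical stand-in for downstream LM accuracy; in the real setting
    # this would require querying the language model on a validation set.
    # Here we simply pretend lower-index examples are better demonstrations.
    return sum(1.0 / (i + 1) for i in selected)

def train_policy(episodes=2000, eps=0.2, alpha=0.5, seed=0):
    """Tabular Q-learning over (state, action) pairs, where a state is the
    sorted tuple of example indices selected so far."""
    rng = random.Random(seed)
    Q = {}
    for _ in range(episodes):
        state = ()
        while len(state) < K:
            actions = [a for a in range(POOL_SIZE) if a not in state]
            # Epsilon-greedy exploration over the remaining candidates.
            if rng.random() < eps:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda x: Q.get((state, x), 0.0))
            nxt = tuple(sorted(state + (a,)))
            r = reward(nxt) if len(nxt) == K else 0.0  # reward only at the end
            nxt_actions = [b for b in range(POOL_SIZE) if b not in nxt]
            target = r + max((Q.get((nxt, b), 0.0) for b in nxt_actions),
                             default=0.0)
            old = Q.get((state, a), 0.0)
            Q[(state, a)] = old + alpha * (target - old)
            state = nxt
    return Q

def greedy_selection(Q):
    """Roll out the learned policy greedily to pick K demonstrations."""
    state = ()
    while len(state) < K:
        actions = [a for a in range(POOL_SIZE) if a not in state]
        a = max(actions, key=lambda x: Q.get((state, x), 0.0))
        state = tuple(sorted(state + (a,)))
    return state

Q = train_policy()
chosen = greedy_selection(Q)
print(chosen)
```

The paper's contribution lies in making such a policy generalize across tasks and transfer between models (e.g. from GPT-2 to GPT-3 Ada); this sketch only illustrates the underlying decision process on a single toy task.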