Paper Title

kNN-Prompt: Nearest Neighbor Zero-Shot Inference

Paper Authors

Weijia Shi, Julian Michael, Suchin Gururangan, Luke Zettlemoyer

Paper Abstract

Retrieval-augmented language models (LMs) use non-parametric memory to substantially outperform their non-retrieval counterparts on perplexity-based evaluations, but it is an open question whether they achieve similar gains in few- and zero-shot end-task accuracy. We extensively study one such model, the k-nearest neighbor LM (kNN-LM), showing that the gains marginally transfer. The main challenge is to achieve coverage of the verbalizer tokens that define the different end-task class labels. To address this challenge, we also introduce kNN-Prompt, a simple and effective kNN-LM with automatically expanded fuzzy verbalizers (e.g. to expand terrible to also include silly and other task-specific synonyms for sentiment classification). Across nine diverse end-tasks, using kNN-Prompt with GPT-2 large yields significant performance boosts over strong zero-shot baselines (13.4% absolute improvement over the base LM on average). We also show that other advantages of non-parametric augmentation hold for end tasks; kNN-Prompt is effective for domain adaptation with no further training, and gains increase with the size of the retrieval model.
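The scoring rule the abstract describes can be sketched as follows: interpolate the parametric LM's next-token distribution with the kNN retrieval distribution, then score each class label by aggregating probability mass over its fuzzy (synonym-expanded) verbalizer set. This is a minimal illustrative sketch, not the authors' implementation; the function name, the toy probabilities, and the interpolation weight `lam` are assumptions for demonstration.

```python
def knn_prompt_score(lm_probs, knn_probs, verbalizers, lam=0.5):
    """Pick the class whose fuzzy verbalizer set receives the most
    interpolated probability mass.

    lm_probs:    next-token distribution from the parametric LM
    knn_probs:   next-token distribution from the kNN retrieval memory
    verbalizers: mapping from class label to its expanded token set
    lam:         interpolation weight on the kNN distribution
    """
    scores = {}
    for label, tokens in verbalizers.items():
        # Interpolate the two distributions per token, then sum over
        # the expanded verbalizer set (the "fuzzy verbalizer" step).
        scores[label] = sum(
            lam * knn_probs.get(t, 0.0) + (1 - lam) * lm_probs.get(t, 0.0)
            for t in tokens
        )
    return max(scores, key=scores.get)


# Toy sentiment example mirroring the abstract: "terrible" is expanded
# to also include "silly" for the negative class.
lm_probs = {"terrible": 0.02, "silly": 0.01, "great": 0.05}
knn_probs = {"terrible": 0.10, "silly": 0.04, "great": 0.03}
verbalizers = {"negative": ["terrible", "silly"], "positive": ["great"]}
prediction = knn_prompt_score(lm_probs, knn_probs, verbalizers)
```

With these toy numbers the negative class wins because the retrieval distribution concentrates mass on its expanded verbalizer tokens, even though the base LM alone slightly favors "great".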
