Paper Title

Can In-context Learners Learn a Reasoning Concept from Demonstrations?

Authors

Štefánik, Michal, Kadlčík, Marek

Abstract

Language models exhibit an emergent ability to learn a new task from a small number of input-output demonstrations. However, recent work shows that in-context learners largely rely on their pre-trained knowledge, such as the sentiment of the labels, instead of learning new associations from the input. We argue that the commonly used few-shot evaluation with a random selection of in-context demonstrations cannot disentangle models' reliance on such biases, as most of the randomly selected demonstrations do not present relations informative for prediction beyond exposing the task's input-output distribution. Therefore, to evaluate models' in-context learning ability independent of models' memory, we introduce a Concept-sharing few-shot learning method that chooses demonstrations sharing an underlying concept with the predicted sample. We extract a set of such concepts from available human explanations and measure how much models can benefit from presenting these concepts in few-shot demonstrations. We find that most of the recent in-context learners cannot consistently benefit from the demonstrated concepts, irrespective of the model size. However, we note that T0 models are more sensitive to exhibited concepts, benefiting from concept-sharing demonstrations in 7 out of 8 evaluation scenarios.
