Paper Title
Compositional generalization through abstract representations in human and artificial neural networks
Paper Authors
Paper Abstract
Humans have a remarkable ability to rapidly generalize to new tasks that is difficult to reproduce in artificial learning systems. Compositionality has been proposed as a key mechanism supporting generalization in humans, but evidence of its neural implementation and impact on behavior is still scarce. Here we study the computational properties associated with compositional generalization in both humans and artificial neural networks (ANNs) on a highly compositional task. First, we identified behavioral signatures of compositional generalization in humans, along with their neural correlates using whole-cortex functional magnetic resonance imaging (fMRI) data. Next, we designed pretraining paradigms aided by a procedure we term "primitives pretraining" to endow ANNs with compositional task elements. We found that ANNs with this prior knowledge had greater correspondence with human behavioral and neural compositional signatures. Importantly, primitives pretraining induced abstract internal representations, excellent zero-shot generalization, and sample-efficient learning. Moreover, it gave rise to a hierarchy of abstract representations that matched human fMRI data, where sensory rule abstractions emerged in early sensory areas and motor rule abstractions emerged in later motor areas. Our findings give empirical support to the role of compositional generalization in human behavior, implicate abstract representations as its neural implementation, and illustrate that these representations can be embedded into ANNs by designing simple and efficient pretraining procedures.
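To make the "primitives pretraining" idea concrete, below is a minimal, hypothetical PyTorch sketch, not the authors' task or code. Every detail is an assumption made for illustration: a toy trial concatenates a one-hot sensory-rule code, a one-hot motor-rule code, and a random stimulus; a sensory rule selects a stimulus feature, and a motor rule maps that feature's sign onto a response. The network is pretrained only on single-rule primitive trials (one rule code active at a time, with the other rule fixed to a default), then evaluated zero-shot on composed trials where both rule codes are active.

```python
# Hypothetical sketch of primitives pretraining -- not the paper's task or code.
import torch
import torch.nn as nn

torch.manual_seed(0)
N_SENSORY, N_MOTOR, STIM_DIM = 4, 4, 8

def make_trial(s_rule, m_rule, batch=64, drop_sensory=False, drop_motor=False):
    """Build one batch; optionally zero out a rule code for primitive trials."""
    stim = torch.randn(batch, STIM_DIM)
    x = torch.zeros(batch, N_SENSORY + N_MOTOR + STIM_DIM)
    if not drop_sensory:
        x[:, s_rule] = 1.0
    if not drop_motor:
        x[:, N_SENSORY + m_rule] = 1.0
    x[:, N_SENSORY + N_MOTOR:] = stim
    value = (stim[:, s_rule] > 0).long()   # sensory rule: pick a stimulus feature
    y = (m_rule + value) % N_MOTOR         # motor rule: map feature sign to response
    return x, y

model = nn.Sequential(
    nn.Linear(N_SENSORY + N_MOTOR + STIM_DIM, 128), nn.ReLU(),
    nn.Linear(128, N_MOTOR),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Phase 1, "primitives pretraining": expose each rule in isolation. Sensory
# primitives use a fixed default motor rule (0); motor primitives use a fixed
# default sensory rule (0). Both rule codes are never active together.
for step in range(3000):
    idx = step // 2
    if step % 2 == 0:   # sensory primitive: motor code zeroed out
        x, y = make_trial(s_rule=idx % N_SENSORY, m_rule=0, drop_motor=True)
    else:               # motor primitive: sensory code zeroed out
        x, y = make_trial(s_rule=0, m_rule=idx % N_MOTOR, drop_sensory=True)
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

# Phase 2: zero-shot evaluation on composed trials (both rule codes active,
# an input configuration never seen during pretraining).
with torch.no_grad():
    accs = []
    for s in range(N_SENSORY):
        for m in range(N_MOTOR):
            x, y = make_trial(s, m, batch=512)
            accs.append((model(x).argmax(1) == y).float().mean().item())
    print(f"zero-shot accuracy on composed tasks: {sum(accs) / len(accs):.2f}")
```

The point of the sketch is the training schedule, not the architecture: because rules are presented during pretraining as reusable input elements, zero-shot performance on composed trials probes whether the network has formed abstract rule representations rather than memorizing specific rule pairings.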