Paper Title

Compositional Generalization in Unsupervised Compositional Representation Learning: A Study on Disentanglement and Emergent Language

Paper Authors

Zhenlin Xu, Marc Niethammer, Colin Raffel

Paper Abstract

Deep learning models struggle with compositional generalization, i.e. the ability to recognize or generate novel combinations of observed elementary concepts. In hopes of enabling compositional generalization, various unsupervised learning algorithms have been proposed with inductive biases that aim to induce compositional structure in learned representations (e.g. disentangled representation and emergent language learning). In this work, we evaluate these unsupervised learning algorithms in terms of how well they enable compositional generalization. Specifically, our evaluation protocol focuses on whether or not it is easy to train a simple model on top of the learned representation that generalizes to new combinations of compositional factors. We systematically study three unsupervised representation learning algorithms - $\beta$-VAE, $\beta$-TCVAE, and emergent language (EL) autoencoders - on two datasets that allow directly testing compositional generalization. We find that directly using the bottleneck representation with simple models and few labels may lead to worse generalization than using representations from layers before or after the learned representation itself. In addition, we find that previously proposed metrics for evaluating the level of compositionality are not correlated with actual compositional generalization in our framework. Surprisingly, we find that increasing pressure to produce a disentangled representation yields representations with worse generalization, while representations from EL models show strong compositional generalization. Taken together, our results shed new light on the compositional generalization behavior of different unsupervised learning algorithms with a new setting to rigorously test this behavior, and suggest the potential benefits of developing EL learning algorithms for more generalizable representations.
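The evaluation protocol described above can be pictured as probing a frozen encoder: train a simple readout on representations of seen factor combinations, then test it on combinations that never appeared during readout training. Below is a minimal sketch of that idea, not the authors' code: the `encode` function (standing in for a frozen $\beta$-VAE or EL-autoencoder bottleneck), the 4x4 two-factor grid, and the diagonal hold-out are all hypothetical placeholders, and a scikit-learn linear probe plays the role of the "simple model".

```python
# Hedged sketch of the compositional-generalization probe; all names and the
# factor grid are illustrative assumptions, not the paper's implementation.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def encode(items):
    """Stand-in for a frozen unsupervised encoder (e.g. a beta-VAE or
    EL-autoencoder bottleneck). Returns random features, for illustration."""
    return rng.normal(size=(len(items), 16))

# Hypothetical 4x4 factor grid (shape, color). Diagonal combinations are
# held out: the probe never sees them during training.
factors = [(s, c) for s in range(4) for c in range(4)]
held_out = {f for f in factors if f[0] == f[1]}
train_f = [f for f in factors if f not in held_out]
test_f = [f for f in factors if f in held_out]

def split(fs, n_per=50):
    items = [f for f in fs for _ in range(n_per)]
    z = encode(items)                        # frozen representations
    y = np.array([s for (s, _c) in items])   # predict the "shape" factor
    return z, y

z_tr, y_tr = split(train_f)
z_te, y_te = split(test_f)

# "Simple model on top of the learned representation": a linear probe.
probe = LogisticRegression(max_iter=1000).fit(z_tr, y_tr)
print("accuracy on novel factor combinations:", probe.score(z_te, y_te))
```

With a real encoder substituted for `encode`, the test accuracy measures whether the representation supports recombining factors it saw only separately, which is the quantity the paper compares across $\beta$-VAE, $\beta$-TCVAE, and EL models.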
