Paper Title

HyperInvariances: Amortizing Invariance Learning

Paper Authors

Ruchika Chavhan, Henry Gouk, Jan Stühmer, Timothy Hospedales

Paper Abstract

Providing invariances in a given learning task conveys a key inductive bias that can lead to sample-efficient learning and good generalisation, if correctly specified. However, the ideal invariances for many problems of interest are often not known, which has led both to a body of engineering lore as well as attempts to provide frameworks for invariance learning. However, invariance learning is expensive and data intensive for popular neural architectures. We introduce the notion of amortizing invariance learning. In an up-front learning phase, we learn a low-dimensional manifold of feature extractors spanning invariance to different transformations using a hyper-network. Then, for any problem of interest, both model and invariance learning are rapid and efficient, requiring only the fitting of a low-dimensional invariance descriptor and an output head. Empirically, this framework can identify appropriate invariances in different downstream tasks and lead to comparable or better test performance than conventional approaches. Our HyperInvariance framework is also theoretically appealing as it enables generalisation bounds that provide an interesting new operating point in the trade-off between model fit and complexity.
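To make the amortization mechanism concrete, below is a minimal PyTorch sketch of the two-phase idea described in the abstract: a hypernetwork maps a low-dimensional invariance descriptor to the weights of a feature extractor, and downstream only the descriptor and an output head are fit while the hypernetwork stays frozen. Everything here is an illustrative assumption, not the authors' implementation: the `HyperNet` class, the toy linear feature extractor, all dimensions, and the training loop are invented for exposition.

```python
# Minimal sketch of amortized invariance learning, assuming a toy setup:
# a hypernetwork generates the weights of a small linear feature extractor
# from a low-dimensional invariance descriptor z. All names, shapes, and
# the training loop are illustrative, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

IN_DIM, FEAT_DIM, Z_DIM, NUM_CLASSES = 32, 16, 3, 5  # hypothetical sizes

class HyperNet(nn.Module):
    """Maps an invariance descriptor z to feature-extractor parameters."""
    def __init__(self):
        super().__init__()
        self.gen = nn.Sequential(
            nn.Linear(Z_DIM, 64), nn.ReLU(),
            nn.Linear(64, FEAT_DIM * IN_DIM + FEAT_DIM),
        )

    def forward(self, z, x):
        params = self.gen(z)
        W = params[: FEAT_DIM * IN_DIM].view(FEAT_DIM, IN_DIM)
        b = params[FEAT_DIM * IN_DIM:]
        # Apply the generated extractor to the input batch.
        return F.linear(x, W, b)

# Up-front phase (not shown): hyper would be trained across tasks with
# varying invariance requirements. Downstream, it is frozen; only the
# low-dimensional descriptor z and a task head are fit to the new task.
hyper = HyperNet()
for p in hyper.parameters():
    p.requires_grad_(False)

z = torch.zeros(Z_DIM, requires_grad=True)   # invariance descriptor
head = nn.Linear(FEAT_DIM, NUM_CLASSES)      # task-specific output head
opt = torch.optim.Adam([z, *head.parameters()], lr=1e-2)

x = torch.randn(128, IN_DIM)                 # placeholder task data
y = torch.randint(0, NUM_CLASSES, (128,))
for _ in range(100):
    opt.zero_grad()
    loss = F.cross_entropy(head(hyper(z, x)), y)
    loss.backward()
    opt.step()
```

Because only `z` (a handful of scalars) and the head are optimized downstream, model and invariance selection reduce to a small, cheap fitting problem, which is the source of both the efficiency claim and the tighter complexity term in the generalisation bounds.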
