Paper Title
Conditional Mutual Information-Based Generalization Bound for Meta Learning
Paper Authors
Paper Abstract
Meta-learning optimizes an inductive bias---typically in the form of the hyperparameters of a base-learning algorithm---by observing data from a finite number of related tasks. This paper presents an information-theoretic bound on the generalization performance of any given meta-learner, which builds on the conditional mutual information (CMI) framework of Steinke and Zakynthinou (2020). In the proposed extension to meta-learning, the CMI bound involves a training \textit{meta-supersample} obtained by first sampling $2N$ independent tasks from the task environment, and then drawing $2M$ independent training samples for each sampled task. The meta-training data fed to the meta-learner is modelled as being obtained by randomly selecting $N$ tasks from the available $2N$ tasks and $M$ training samples per task from the available $2M$ training samples per task. The resulting bound is explicit in two CMI terms, which measure the information that the meta-learner output and the base-learner output provide about which training data are selected, given the entire meta-supersample. Finally, we present a numerical example that illustrates the merits of the proposed bound in comparison to prior information-theoretic bounds for meta-learning.
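Below is a minimal sketch, in Python, of the meta-supersample construction described in the abstract: $2N$ tasks are drawn from the task environment, $2M$ samples are drawn per task, and the meta-training data is obtained by selecting $N$ tasks and $M$ samples per task. The functions `sample_task` and `sample_example` are hypothetical stand-ins for the task environment and per-task data distribution, which the abstract leaves abstract; the selection step here uses a simple uniform random subset, whereas the paper's CMI terms are defined with respect to whatever selection variables the authors specify (e.g., paired binary variables as in Steinke and Zakynthinou, 2020).

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 5, 10   # number of meta-training tasks and per-task training samples
d = 3          # illustrative feature dimension

def sample_task():
    """Hypothetical placeholder: draw a task (here, a mean vector) from the task environment."""
    return rng.normal(size=d)

def sample_example(task):
    """Hypothetical placeholder: draw one sample from the given task's data distribution."""
    return task + rng.normal(size=d)

# Step 1: meta-supersample -- 2N independent tasks, each with 2M independent samples.
tasks = [sample_task() for _ in range(2 * N)]
supersample = np.stack([[sample_example(t) for _ in range(2 * M)] for t in tasks])
# supersample has shape (2N, 2M, d)

# Step 2: selection variables -- pick N of the 2N tasks, and M of the 2M samples
# within each selected task.  Conditioned on the full meta-supersample, the two CMI
# terms in the bound measure how much the meta-learner and base-learner outputs
# reveal about these selections.
task_select = rng.permutation(2 * N)[:N]                    # which N tasks are used
sample_select = np.stack([rng.permutation(2 * M)[:M]        # which M samples per task
                          for _ in range(N)])

meta_training_data = np.stack([supersample[t, sample_select[i]]
                               for i, t in enumerate(task_select)])
print(meta_training_data.shape)  # (N, M, d)
```

The point of the construction is that, once the meta-supersample is fixed, the only remaining randomness in the meta-training data is the selection variables, so the generalization gap can be controlled by how much information the learned outputs leak about those selections.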