Paper Title
Deep Generative Modeling on Limited Data with Regularization by Nontransferable Pre-trained Models
Paper Authors
Paper Abstract
Deep generative models (DGMs) are data-hungry because learning a complex model on limited data suffers from large variance and easily overfits. Inspired by the classical perspective of the bias-variance tradeoff, we propose the regularized deep generative model (Reg-DGM), which leverages a nontransferable pre-trained model to reduce the variance of generative modeling with limited data. Formally, Reg-DGM optimizes a weighted sum of a certain divergence and the expectation of an energy function, where the divergence is between the data and the model distributions, and the energy function is defined by the pre-trained model w.r.t. the model distribution. We analyze a simple yet representative Gaussian-fitting case to demonstrate how the weighting hyperparameter trades off the bias and the variance. Theoretically, we characterize the existence and the uniqueness of the global minimum of Reg-DGM in a non-parametric setting and prove its convergence with neural networks trained by gradient-based methods. Empirically, with various pre-trained feature extractors and a data-dependent energy function, Reg-DGM consistently improves the generation performance of strong DGMs with limited data and achieves results competitive with state-of-the-art methods. Our implementation is available at https://github.com/ML-GSAI/Reg-ADA-APA.
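To make the objective described in the abstract concrete, a minimal sketch in assumed notation (the symbols $D$, $p_{\mathrm{data}}$, $p_\theta$, $f$, and $\lambda$ are chosen here for illustration and are not taken verbatim from the paper) is:

% Sketch of the Reg-DGM objective as described in the abstract (notation assumed):
% a divergence between the data and model distributions, plus a weighted
% expectation of the energy function f defined by the pre-trained model.
\begin{equation}
  \min_{\theta} \; D\!\left(p_{\mathrm{data}} \,\|\, p_{\theta}\right)
  \;+\; \lambda \, \mathbb{E}_{x \sim p_{\theta}}\!\left[ f(x) \right]
\end{equation}

Here $\lambda > 0$ is the weighting hyperparameter that, per the Gaussian-fitting analysis mentioned above, trades off the bias introduced by the pre-trained model against the variance incurred by fitting limited data.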