Paper Title
A Tale of Two Flows: Cooperative Learning of Langevin Flow and Normalizing Flow Toward Energy-Based Model
Paper Authors
Paper Abstract
This paper studies the cooperative learning of two generative flow models, in which the two models are iteratively updated based on jointly synthesized examples. The first flow model is a normalizing flow that transforms an initial simple density into a target density by applying a sequence of invertible transformations. The second flow model is a Langevin flow that runs a finite number of steps of gradient-based MCMC toward an energy-based model. We start by proposing a generative framework that trains an energy-based model with a normalizing flow as an amortized sampler to initialize the MCMC chains of the energy-based model. In each learning iteration, we generate synthesized examples by using a normalizing flow initialization followed by a short-run Langevin flow revision toward the current energy-based model. We then treat the synthesized examples as fair samples from the energy-based model and update the model parameters with the maximum likelihood learning gradient, while the normalizing flow learns directly from the synthesized examples by maximizing the tractable likelihood. Under the short-run non-mixing MCMC scenario, the estimation of the energy-based model is shown to follow a perturbation of maximum likelihood, and the short-run Langevin flow and the normalizing flow form a two-flow generator that we call CoopFlow. We provide an understanding of the CoopFlow algorithm through information geometry and show that it is a valid generator as it converges to a moment matching estimator. We demonstrate that the trained CoopFlow is capable of synthesizing realistic images, reconstructing images, and interpolating between images.
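The learning iteration described in the abstract can be sketched on a one-dimensional toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the energy-based model is assumed to have the quadratic form f(x) = theta[0]*x + theta[1]*x^2, the normalizing flow is assumed to be a single affine map x = mu + sigma*z, and every function name, hyperparameter, and the clamp on theta[1] is hypothetical and chosen only to keep the example short and stable.

```python
import math
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def energy_grad_x(x, theta):
    """Derivative in x of the assumed toy log-density f(x) = theta[0]*x + theta[1]*x*x."""
    return theta[0] + 2.0 * theta[1] * x


def flow_sample(flow, rng):
    """Toy normalizing flow: invertible affine map x = mu + sigma*z with z ~ N(0, 1)."""
    mu, log_sigma = flow
    return mu + math.exp(log_sigma) * rng.gauss(0.0, 1.0)


def langevin_revise(x, theta, steps, step_size, rng):
    """Short-run Langevin flow: a finite number of noisy gradient steps toward the EBM."""
    for _ in range(steps):
        drift = 0.5 * step_size ** 2 * energy_grad_x(x, theta)
        x = x + drift + step_size * rng.gauss(0.0, 1.0)
    return x


def update_ebm(theta, data, synth, lr):
    """Maximum likelihood gradient step: data statistics minus synthesized statistics."""
    g0 = mean(data) - mean(synth)
    g1 = mean(x * x for x in data) - mean(x * x for x in synth)
    # Hypothetical guard (not from the paper): keep the toy density normalizable.
    return (theta[0] + lr * g0, min(theta[1] + lr * g1, -0.05))


def update_flow(flow, synth, lr):
    """One gradient ascent step on the flow's exact log-likelihood of the synthesized examples."""
    mu, log_sigma = flow
    sigma = math.exp(log_sigma)
    g_mu = mean((x - mu) / sigma ** 2 for x in synth)
    g_log_sigma = mean(((x - mu) / sigma) ** 2 - 1.0 for x in synth)
    return (mu + lr * g_mu, log_sigma + lr * g_log_sigma)


def coopflow_train(data, iters=200, batch=32, steps=10, step_size=0.1, lr=0.05, seed=0):
    """Toy cooperative loop: flow proposes, Langevin revises, both models learn."""
    rng = random.Random(seed)
    theta = (0.0, -0.5)  # assumed initial EBM (standard normal shape)
    flow = (0.0, 0.0)    # assumed initial flow (mu = 0, log_sigma = 0)
    for _ in range(iters):
        observed = rng.choices(data, k=batch)
        # Flow initialization followed by short-run Langevin revision.
        synth = [
            langevin_revise(flow_sample(flow, rng), theta, steps, step_size, rng)
            for _ in range(batch)
        ]
        theta = update_ebm(theta, observed, synth, lr)  # EBM learns from data vs. synth
        flow = update_flow(flow, synth, lr)             # flow learns from synth only
    return theta, flow
```

Calling `coopflow_train(data)` on a list of floats returns the toy energy coefficients and the flow parameters `(mu, log_sigma)`. Because the energy update vanishes exactly when the first and second moments of the synthesized examples match those of the data, the sketch also illustrates, in miniature, why a fixed point of this loop is a moment matching estimator.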