Paper Title

Can Push-forward Generative Models Fit Multimodal Distributions?

Authors

Antoine Salmona, Valentin de Bortoli, Julie Delon, Agnès Desolneux

Abstract

Many generative models synthesize data by transforming a standard Gaussian random variable using a deterministic neural network. Among these models are the Variational Autoencoders and the Generative Adversarial Networks. In this work, we call them "push-forward" models and study their expressivity. We show that the Lipschitz constant of these generative networks has to be large in order to fit multimodal distributions. More precisely, we show that the total variation distance and the Kullback-Leibler divergence between the generated and the data distribution are bounded from below by a constant depending on the mode separation and the Lipschitz constant. Since constraining the Lipschitz constants of neural networks is a common way to stabilize generative models, there is a provable trade-off between the ability of push-forward models to approximate multimodal distributions and the stability of their training. We validate our findings on one-dimensional and image datasets and empirically show that generative models consisting of stacked networks with stochastic input at each step, such as diffusion models, do not suffer from such limitations.
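
For intuition, the following is a minimal numerical sketch (not code from the paper; the mixture_quantile helper and the chosen separations are illustrative assumptions). It builds the monotone quantile map g that pushes a standard Gaussian onto a balanced two-mode Gaussian mixture and reports a finite-difference estimate of its Lipschitz constant, which grows quickly with the mode separation, in line with the trade-off described in the abstract.

```python
# Illustrative sketch: the monotone transport (quantile) map that pushes N(0,1)
# onto a two-mode Gaussian mixture needs an increasingly large Lipschitz
# constant as the separation between the modes grows.
import numpy as np
from scipy.stats import norm

def mixture_quantile(u, delta, sigma=1.0):
    """Numerical quantile function of 0.5*N(-delta/2, sigma^2) + 0.5*N(+delta/2, sigma^2)."""
    grid = np.linspace(-delta / 2 - 5 * sigma, delta / 2 + 5 * sigma, 20001)
    cdf = 0.5 * norm.cdf(grid, -delta / 2, sigma) + 0.5 * norm.cdf(grid, delta / 2, sigma)
    return np.interp(u, cdf, grid)  # invert the CDF by interpolation

z = np.linspace(-4.0, 4.0, 4001)                    # latent Gaussian inputs
for delta in [2.0, 4.0, 6.0, 8.0]:                  # distance between the two modes
    g = mixture_quantile(norm.cdf(z), delta)        # push-forward map g(z)
    lip = np.max(np.abs(np.diff(g)) / np.diff(z))   # finite-difference Lipschitz estimate
    print(f"mode separation {delta:.0f} -> Lipschitz constant of g is at least {lip:.1f}")
```

The quantile map is only one particular push-forward of the Gaussian; the paper's lower bounds concern any Lipschitz generator, but the trend above illustrates why a generator with a small Lipschitz budget must leave probability mass between well-separated modes.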
