Paper Title
The Transitive Information Theory and its Application to Deep Generative Models
Paper Authors
Abstract
Paradoxically, a Variational Autoencoder (VAE) can be pushed in two opposite directions: utilizing a powerful decoder model to generate realistic images while collapsing the learned representation, or increasing the regularization coefficient to disentangle the representation while ultimately generating blurry examples. Existing methods narrow the issue to the rate-distortion trade-off between compression and reconstruction. We argue that a good reconstruction model does learn high-capacity latents that encode more details; however, their use is hindered by two major issues: the prior is random noise that is completely detached from the posterior and allows no controllability in generation; and mean-field variational inference does not enforce a hierarchical structure, which makes the task of recombining those units into plausible novel outputs infeasible. As a result, we develop a system that learns a hierarchy of disentangled representations together with a mechanism for recombining the learned representations for generalization. This is achieved by introducing a minimal amount of inductive bias to learn a controllable prior for the VAE. The idea is supported by the transitive information theory developed here: the mutual information between two target variables can alternatively be maximized through their mutual information with a third variable, thus bypassing the rate-distortion bottleneck in VAE design. In particular, we show that our model, named SemafoVAE (inspired by a similar concept in computer science), can generate high-quality examples in a controllable manner and perform smooth traversals of the disentangled factors and interventions at different levels of the representation hierarchy.
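For readers unfamiliar with the rate-distortion trade-off mentioned in the abstract, the following minimal sketch may help. It shows the standard beta-weighted VAE objective, not the SemafoVAE objective, which the abstract does not specify. The function names are hypothetical, and the sketch assumes a diagonal Gaussian posterior, a standard normal prior, and a unit-variance Gaussian decoder.

```python
import math


def gaussian_kl_to_standard_normal(mu, log_var):
    """Rate: KL(N(mu, diag(exp(log_var))) || N(0, I)), in nats."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


def half_squared_error(x, x_hat):
    """Distortion: negative log-likelihood of a unit-variance Gaussian
    decoder, up to an additive constant."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, x_hat))


def beta_vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Return (loss, rate, distortion) with loss = distortion + beta * rate.

    Assumption for illustration: `beta` plays the role of the
    regularization coefficient the abstract refers to.
    """
    rate = gaussian_kl_to_standard_normal(mu, log_var)
    distortion = half_squared_error(x, x_hat)
    return distortion + beta * rate, rate, distortion


if __name__ == "__main__":
    x, x_hat = [1.0, 2.0], [1.0, 0.0]
    # Posterior equal to the prior: the latent carries no information (rate 0).
    print(beta_vae_loss(x, x_hat, mu=[0.0, 0.0], log_var=[0.0, 0.0]))
    # Informative posterior, with a heavier penalty on the rate term.
    print(beta_vae_loss(x, x_hat, mu=[1.0, 0.0], log_var=[0.0, 0.0], beta=4.0))
```

In this sketch, a rate of zero corresponds to the collapsed representation the abstract describes, while a larger `beta` penalizes the rate more heavily at the expense of reconstruction quality.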