Dynamic-Pix2Pix: Noise Injected cGAN for Modeling Input and Target Domain Joint Distributions with Limited Training Data

Paper Title

Dynamic-Pix2Pix: Noise Injected cGAN for Modeling Input and Target Domain Joint Distributions with Limited Training Data

Authors

Mohammadreza Naderi, Nader Karimi, Ali Emami, Shahram Shirani, Shadrokh Samavi

Abstract

Learning to translate images from a source domain to a target domain, with applications such as converting a simple line drawing into an oil painting, has attracted significant attention. The quality of translated images is directly related to two crucial issues. First, the output distribution must be consistent with that of the target. Second, the generated output should be highly correlated with the input. Conditional Generative Adversarial Networks (cGANs) are the most common models for translating images. The performance of a cGAN drops when the training dataset is limited. In this work, we increase the target distribution modeling ability of Pix2Pix (a form of cGAN) with the help of dynamic neural network theory. Our model has two learning cycles. In the first cycle, the model learns the correlation between the input and the ground truth. In the second cycle, the model's architecture is refined to learn the target distribution from noise input. Both cycles are executed in each iteration of the training procedure. Helping the cGAN learn the target distribution from noise input results in better model generalization at test time and allows the model to fit the target domain distribution almost perfectly. As a result, our model surpasses the Pix2Pix model in segmenting HC18 images and Montgomery chest X-ray images. Both qualitative results and Dice scores show the superiority of our model. Although our proposed method does not use thousands of additional samples for pretraining, it produces results comparable to state-of-the-art methods for in-domain and out-of-domain generalization.
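
The abstract reports Dice scores as its quantitative segmentation metric. As a reminder of what that metric measures, here is a minimal sketch of the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), for two binary masks. The function name `dice_score`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; this is not the authors' evaluation code.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists.

    Hypothetical helper for illustration only; any truthy value counts
    as a foreground pixel.
    """
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked as foreground in both masks.
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)

    # Assumed convention: two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [[0, 1, 1],
                 [0, 1, 0]]
    ground_truth = [[0, 1, 0],
                    [0, 1, 1]]
    print(round(dice_score(predicted, ground_truth), 4))
```

In this toy case each mask has three foreground pixels and two of them overlap, so the score is 2·2 / (3 + 3) ≈ 0.667. A score of 1.0 means the predicted mask matches the ground truth exactly, and 0.0 means there is no overlap.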
