Paper Title
Laplacian Denoising Autoencoder
Paper Authors
Paper Abstract
While deep neural networks have been shown to perform remarkably well on many machine learning tasks, labeling large amounts of ground-truth data for supervised training is usually very costly to scale. Learning robust representations from unlabeled data is therefore critical for relieving human labeling effort and is vital for many downstream tasks. Recent advances in unsupervised and self-supervised learning for visual data have benefited greatly from domain-specific knowledge. Here we are interested in a more generic unsupervised learning framework that can easily generalize to other domains. In this paper, we propose to learn data representations with a novel type of denoising autoencoder, in which the noisy input is generated by corrupting the latent clean data in the gradient domain. This corruption naturally generalizes to multiple scales through a Laplacian pyramid representation of the input. In this way, the model learns more robust representations that exploit the underlying data structure across multiple scales. Experiments on several visual benchmarks demonstrate that the proposed approach learns better representations than its single-scale-corruption counterpart and other methods. Furthermore, we show that the learned representations transfer well to other downstream vision tasks.
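To make the multi-scale corruption step concrete, below is a minimal Python sketch of corrupting an image via its Laplacian pyramid, in the spirit of the abstract. The level count, the Gaussian noise model, and the helper names (laplacian_pyramid, corrupt_pyramid, collapse) are illustrative assumptions rather than the paper's exact formulation; OpenCV's pyrDown/pyrUp are used for the pyramid operations.

import cv2
import numpy as np

def laplacian_pyramid(img, levels=3):
    # Decompose an image into band-pass (Laplacian) levels plus a low-pass residual.
    pyramid, current = [], img.astype(np.float32)
    for _ in range(levels):
        down = cv2.pyrDown(current)
        up = cv2.pyrUp(down, dstsize=(current.shape[1], current.shape[0]))
        pyramid.append(current - up)  # band-pass detail at this scale
        current = down
    pyramid.append(current)  # low-pass residual
    return pyramid

def corrupt_pyramid(pyramid, sigma=0.1, rng=None):
    # Add Gaussian noise to each band-pass level (hypothetical noise model).
    rng = rng if rng is not None else np.random.default_rng()
    noisy = [lvl + rng.normal(0.0, sigma, lvl.shape).astype(np.float32)
             for lvl in pyramid[:-1]]
    noisy.append(pyramid[-1])  # leave the low-pass residual clean
    return noisy

def collapse(pyramid):
    # Collapse a (possibly corrupted) pyramid back into a single image.
    current = pyramid[-1]
    for lvl in reversed(pyramid[:-1]):
        current = cv2.pyrUp(current, dstsize=(lvl.shape[1], lvl.shape[0])) + lvl
    return current

# Usage: the denoising autoencoder takes the corrupted image as input and
# is trained to reconstruct the clean one.
img = cv2.imread("example.jpg", cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0
noisy = collapse(corrupt_pyramid(laplacian_pyramid(img)))

Because the noise is injected per pyramid level rather than directly on pixels, the reconstruction target forces the model to recover structure at every scale, which is the intuition behind the multi-scale variant described above.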