Paper Title

Distance Learner: Incorporating Manifold Prior to Model Training

Paper Authors

Aditya Chetan, Nipun Kwatra

Abstract

The manifold hypothesis (real-world data concentrates near low-dimensional manifolds) is suggested as the principle behind the effectiveness of machine learning algorithms in very high dimensional problems that are common in domains such as vision and speech. Multiple methods have been proposed to explicitly incorporate the manifold hypothesis as a prior in modern Deep Neural Networks (DNNs), with varying success. In this paper, we propose a new method, Distance Learner, to incorporate this prior for DNN-based classifiers. Distance Learner is trained to predict the distance of a point from the underlying manifold of each class, rather than the class label. For classification, Distance Learner then chooses the class corresponding to the closest predicted class manifold. Distance Learner can also identify points as being out of distribution (belonging to neither class), if the distance to the closest manifold is higher than a threshold. We evaluate our method on multiple synthetic datasets and show that Distance Learner learns much more meaningful classification boundaries compared to a standard classifier. We also evaluate our method on the task of adversarial robustness, and find that it not only outperforms a standard classifier by a large margin, but also performs on par with classifiers trained via state-of-the-art adversarial training.
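The abstract describes a simple inference rule: predict per-class manifold distances, classify by the nearest manifold, and flag a point as out-of-distribution when even the nearest manifold is too far away. Below is a minimal PyTorch sketch of that decision rule under stated assumptions; the `dist_model` interface (a trained network returning one predicted distance per class) and the `ood_threshold` parameter are hypothetical names for illustration, not the paper's actual API.

```python
import torch

def distance_learner_predict(dist_model, x, ood_threshold):
    """Classify inputs by their nearest predicted class manifold.

    dist_model: assumed trained network mapping a batch of inputs
        to a (batch, num_classes) tensor of per-class manifold distances.
    ood_threshold: assumed scalar cutoff; points whose closest manifold
        is farther than this are treated as out-of-distribution.
    Returns a tensor of class indices, with -1 marking OOD points.
    """
    with torch.no_grad():
        distances = dist_model(x)               # shape: (batch, num_classes)
    min_dist, pred_class = distances.min(dim=1) # nearest manifold per point
    pred_class[min_dist > ood_threshold] = -1   # flag out-of-distribution
    return pred_class
```

In this sketch, `ood_threshold` would be tuned on held-out data; points lying far from every class manifold, such as adversarially perturbed inputs, map to -1 rather than being forced into a class.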
