Rethinking Robust Representation Learning Under Fine-grained Noisy Faces

Paper title

Rethinking Robust Representation Learning Under Fine-grained Noisy Faces

Authors

Bingqi Ma, Guanglu Song, Boxiao Liu, Yu Liu

Abstract

Learning robust feature representations from large-scale noisy faces is one of the key challenges in high-performance face recognition. Recent work has tried to address this challenge by alleviating intra-class conflict and inter-class conflict. However, the unconstrained noise type in each conflict still makes it difficult for these algorithms to perform well. To better understand this, we reformulate the noise type of each class in a more fine-grained manner as N-identities|K^C-clusters. Different types of noisy faces can be generated by adjusting the values of N, K, and C. Based on this unified formulation, we find that the main barrier to noise-robust representation learning is the flexibility of the algorithm under different N, K, and C. To address this problem, we propose a new method, named Evolving Sub-centers Learning (ESL), to find optimal hyperplanes that accurately describe the latent space of massive noisy faces. More specifically, we initialize M sub-centers for each class, and ESL encourages them to be automatically aligned to N-identities|K^C-clusters faces via producing, merging, and dropping operations. Images belonging to the same identity in noisy faces can effectively converge to the same sub-center, and samples with different identities are pushed away. We inspect its effectiveness with an elaborate ablation study on a synthetic noisy dataset with different N, K, and C. Without any bells and whistles, ESL achieves significant performance gains over state-of-the-art methods on large-scale noisy faces.
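
The abstract names sub-centers per class together with merging and dropping operations but does not give their rules. The following is only a minimal sketch of what such operations could look like, not the authors' implementation: it assumes cosine similarity as the metric, and the function names, the greedy averaging merge rule, and the `threshold` and `min_count` parameters are all hypothetical. The producing operation is not sketched.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_sub_center(feature, sub_centers):
    """Return (index, similarity) of the sub-center closest to the feature."""
    sims = [cosine(feature, c) for c in sub_centers]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, sims[best]


def merge_similar(sub_centers, threshold):
    """Greedily group sub-centers whose similarity to a group's first member
    exceeds `threshold`, then replace each group by its mean.
    (Hypothetical merge rule, chosen only for illustration.)"""
    groups = []
    for center in sub_centers:
        for group in groups:
            if cosine(center, group[0]) > threshold:
                group.append(center)
                break
        else:
            groups.append([center])
    return [[sum(xs) / len(group) for xs in zip(*group)] for group in groups]


def drop_unused(sub_centers, features, min_count):
    """Keep only sub-centers that attract at least `min_count` features.
    (Hypothetical drop rule, chosen only for illustration.)"""
    counts = [0] * len(sub_centers)
    for feature in features:
        index, _ = nearest_sub_center(feature, sub_centers)
        counts[index] += 1
    return [c for c, n in zip(sub_centers, counts) if n >= min_count]


if __name__ == "__main__":
    centers = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
    merged = merge_similar(centers, threshold=0.9)
    print(len(merged))  # the two near-parallel centers collapse into one
    kept = drop_unused(merged, [[1.0, 0.0], [0.9, 0.1]], min_count=1)
    print(len(kept))  # the center that attracts no feature is removed
```

In this toy run, the first two sub-centers are nearly parallel and are merged, and the remaining sub-center along the second axis is dropped because no feature is assigned to it.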
