Paper Title
How does the degree of novelty impact semi-supervised representation learning for novel class retrieval?
Paper Authors
Paper Abstract
Supervised representation learning with deep networks tends to overfit the training classes, and generalization to novel classes is a challenging question. It is common to evaluate a learned embedding on held-out images of the same training classes. In real applications, however, data comes from new sources and novel classes are likely to arise. We hypothesize that incorporating unlabelled images of novel classes in the training set in a semi-supervised fashion would benefit the efficient retrieval of novel-class images compared to a vanilla supervised representation. To verify this hypothesis in a comprehensive way, we propose an original evaluation methodology that varies the degree of novelty of the novel classes by partitioning the dataset category-wise either randomly or semantically, i.e., by minimizing the shared semantics between base and novel classes. This evaluation procedure allows us to train a representation blind to any novel-class labels and to evaluate the frozen representation on the retrieval of base or novel classes. We find that a vanilla supervised representation falls short on the retrieval of novel classes, all the more so when the semantic gap is larger. Semi-supervised algorithms partially bridge this performance gap, but there is still much room for improvement.
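The following is a minimal sketch of the evaluation protocol outlined in the abstract, not the authors' implementation: classes are partitioned into base and novel sets (only the random split is shown; the semantic split would instead minimize shared semantics between the two sets), and a frozen embedding is scored with nearest-neighbour retrieval mean average precision. The function names, the use of NumPy, and the exact mAP definition are assumptions made for illustration.

    # Hypothetical sketch of the class-wise split and the frozen-representation
    # retrieval evaluation; names and metric details are assumptions.
    import numpy as np

    def split_classes(class_ids, num_novel, seed=None):
        """Randomly partition class ids into base and novel sets.

        A semantic split would instead group classes so that base and novel
        sets share as little semantics as possible (e.g., using a class
        hierarchy); only the random variant is sketched here.
        """
        rng = np.random.default_rng(seed)
        class_ids = np.asarray(class_ids)
        novel = rng.choice(class_ids, size=num_novel, replace=False)
        base = np.setdiff1d(class_ids, novel)
        return base, novel

    def retrieval_map(embeddings, labels):
        """Mean average precision of nearest-neighbour retrieval on a frozen embedding.

        Each image serves once as a query against all other images; an item is
        relevant if it shares the query's class label.
        """
        emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
        sims = emb @ emb.T
        np.fill_diagonal(sims, -np.inf)  # never retrieve the query itself
        labels = np.asarray(labels)
        aps = []
        for i in range(len(labels)):
            order = np.argsort(-sims[i])
            rel = (labels[order] == labels[i]).astype(float)
            if rel.sum() == 0:
                continue
            precision_at_k = np.cumsum(rel) / np.arange(1, len(rel) + 1)
            aps.append((precision_at_k * rel).sum() / rel.sum())
        return float(np.mean(aps))

In this setup, one would train the representation on base-class labels (optionally with unlabelled novel-class images for the semi-supervised variants), then compute retrieval_map separately on held-out base-class and novel-class images to quantify the generalization gap.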