Paper Title
Don't Wait, Just Weight: Improving Unsupervised Representations by Learning Goal-Driven Instance Weights
Paper Authors
Paper Abstract
In the absence of large labelled datasets, self-supervised learning techniques can boost performance by learning useful representations from unlabelled data, which is often more readily available. However, there is often a domain shift between the unlabelled collection and the downstream target problem data. We show that by learning Bayesian instance weights for the unlabelled data, we can improve downstream classification accuracy by prioritising the most useful instances. Additionally, we show that training time can be reduced by discarding unnecessary datapoints. Our method, BetaDataWeighter, is evaluated using the popular self-supervised rotation prediction task on STL-10 and Visual Decathlon. We compare against related instance weighting schemes, both hand-designed heuristics and meta-learning, as well as against conventional self-supervised learning. BetaDataWeighter achieves both the highest average accuracy and rank across datasets, and on STL-10 it prunes up to 78% of unlabelled images without significant loss in accuracy, corresponding to a reduction in training time of over 50%.
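The abstract combines two ingredients: a learnable per-instance weight on each unlabelled image (the name BetaDataWeighter suggests Beta-distributed weights) and the standard 4-way rotation-prediction pretext task. The sketch below is a minimal PyTorch illustration of that combination only; the paper's actual parameterisation, prior, and optimisation procedure are not given in the abstract, so the class and function names (`BetaInstanceWeights`, `rotation_batch`, `weighted_rotation_loss`) and the choice of using the Beta mean as the weight are hypothetical.

```python
# Illustrative sketch only -- NOT the authors' implementation.
# Each unlabelled instance i gets a Beta(alpha_i, beta_i) distribution with
# learnable parameters; its expected value scales that instance's
# self-supervised rotation-prediction loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BetaInstanceWeights(nn.Module):
    """One Beta(alpha_i, beta_i) per unlabelled instance; weight = Beta mean."""
    def __init__(self, num_instances: int):
        super().__init__()
        # Unconstrained parameters; softplus keeps alpha and beta positive.
        self.log_alpha = nn.Parameter(torch.zeros(num_instances))
        self.log_beta = nn.Parameter(torch.zeros(num_instances))

    def forward(self, idx: torch.Tensor) -> torch.Tensor:
        alpha = F.softplus(self.log_alpha[idx]) + 1e-4
        beta = F.softplus(self.log_beta[idx]) + 1e-4
        return alpha / (alpha + beta)  # E[w_i] for w_i ~ Beta(alpha_i, beta_i)

def rotation_batch(images: torch.Tensor):
    """Standard rotation-prediction task: rotate each image by 0/90/180/270
    degrees and ask the model to classify which rotation was applied."""
    rots = torch.randint(0, 4, (images.size(0),), device=images.device)
    rotated = torch.stack([torch.rot90(img, int(k), dims=(1, 2))
                           for img, k in zip(images, rots)])
    return rotated, rots

def weighted_rotation_loss(model, weighter, images, idx):
    """Per-instance cross-entropy on the pretext task, scaled by learned weights."""
    rotated, labels = rotation_batch(images)
    logits = model(rotated)            # model outputs 4 logits per image
    per_instance = F.cross_entropy(logits, labels, reduction="none")
    w = weighter(idx)                  # idx: dataset indices of this batch
    return (w * per_instance).mean()
```

Under this reading, instances whose learned weight collapses towards zero contribute almost nothing to the loss and could be dropped from later epochs, which is one plausible mechanism behind the abstract's reported pruning of up to 78% of STL-10's unlabelled images and the accompanying training-time savings.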