Paper title
A Benchmark of Long-tailed Instance Segmentation with Noisy Labels
Paper authors
Paper abstract
In this paper, we consider the instance segmentation task on a long-tailed dataset that contains label noise, i.e., some of the annotations are incorrect. Two main reasons make this setting realistic. First, datasets collected from the real world usually follow a long-tailed distribution. Second, in instance segmentation datasets, a single image contains many instances, some of them tiny, so noise is more easily introduced into the annotations. Specifically, we propose a new large-vocabulary, long-tailed dataset containing label noise for instance segmentation. Furthermore, we evaluate previously proposed instance segmentation algorithms on this dataset. The results indicate that noise in the training dataset hampers the model in learning rare categories and decreases overall performance, which inspires us to explore more effective approaches to address this practical challenge. The code and dataset are available at https://github.com/GuanlinLee/Noisy-LVIS.