Paper Title

Turath-150K: Image Database of Arab Heritage

Authors

Dani Kiyasseh, Rasheed El-Bouri

Abstract

Large-scale image databases remain largely biased towards objects and activities encountered in a select few cultures. This absence of culturally-diverse images, which we refer to as the hidden tail, limits the applicability of pre-trained neural networks and inadvertently excludes researchers from under-represented regions. To begin remedying this issue, we curate Turath-150K, a database of images of the Arab world that reflect objects, activities, and scenarios commonly found there. In the process, we introduce three benchmark databases, Turath Standard, Art, and UNESCO, specialised subsets of the Turath dataset. After demonstrating the limitations of existing networks pre-trained on ImageNet when deployed on such benchmarks, we train and evaluate several networks on the task of image classification. As a consequence of Turath, we hope to engage machine learning researchers in under-represented regions, and to inspire the release of additional culture-focused databases. The database can be accessed here: danikiyasseh.github.io/Turath.
