Paper Title
Families In Wild Multimedia: A Multimodal Database for Recognizing Kinship
Paper Authors
Paper Abstract
Kinship, a soft biometric detectable in media, is fundamental to a myriad of use cases. Despite the difficulty of detecting kinship, annual data challenges using still images have consistently improved performance and attracted new researchers. Systems now reach performance levels unforeseeable a decade ago and are closing in on performance acceptable for deployment in practice. As with other biometric tasks, we expect that systems can benefit from additional modalities. We hypothesize that adding modalities to FIW, which contains only still images, will improve performance. Thus, to narrow the gap between research and reality and to enhance the power of kinship recognition systems, we extend FIW with multimedia (MM) data (i.e., video, audio, and text captions). Specifically, we introduce the first publicly available multi-task MM kinship dataset. To build FIW MM, we developed machinery to automatically collect, annotate, and prepare the data, requiring minimal human input and no financial cost. The proposed MM corpus allows the problem statements to be posed as more realistic, template-based protocols. We show significant improvements in all benchmarks with the added modalities. The results highlight edge cases that point to different areas of improvement and can inspire future research. FIW MM supplies the data needed to increase the potential of automated systems to detect kinship in MM. It also allows experts from diverse fields to collaborate in novel ways.