Location Sensitive Image Retrieval and Tagging

Paper Title

Location Sensitive Image Retrieval and Tagging

Authors

Raul Gomez, Jaume Gibert, Lluis Gomez, Dimosthenis Karatzas

Abstract

People from different parts of the globe describe objects and concepts in distinct manners. Visual appearance can thus vary across geographic locations, which makes location relevant contextual information when analysing visual data. In this work, we address the task of retrieving images related to a given tag, conditioned on a certain location on Earth. We present LocSens, a model that learns to rank triplets of images, tags and coordinates by plausibility, and two training strategies to balance the location influence in the final ranking. LocSens learns to fuse textual and location information of multimodal queries to retrieve related images at different levels of location granularity, and successfully utilizes location information to improve image tagging.
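
To make the idea of ranking (image, tag, coordinates) triplets by plausibility more concrete, here is a minimal Python sketch. It is not LocSens: the abstract describes a learned model, whereas this toy uses a hand-written score that mixes tag-to-image cosine similarity with geographic proximity. Every name in it (`plausibility`, `rank`, `location_weight`, `scale_km`, the `IMAGES` data and its two-dimensional embeddings) is a hypothetical stand-in chosen for illustration, and `location_weight` only loosely mimics the notion of balancing location influence in the final ranking.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def plausibility(image, tag_vec, query_loc, location_weight=0.5, scale_km=1000.0):
    """Toy score for an (image, tag, location) triplet.

    Assumption: a fixed weighted sum stands in for a learned scoring model.
    """
    text_score = cosine(image["embedding"], tag_vec)
    loc_score = math.exp(-haversine_km(image["loc"], query_loc) / scale_km)
    return (1 - location_weight) * text_score + location_weight * loc_score

def rank(images, tag_vec, query_loc, **kwargs):
    """Return images sorted from most to least plausible for the query."""
    return sorted(images,
                  key=lambda im: plausibility(im, tag_vec, query_loc, **kwargs),
                  reverse=True)

# Hypothetical data: 2-d embeddings where axis 0 means "bridge", axis 1 "cat".
IMAGES = [
    {"id": "paris_bridge", "embedding": [1.0, 0.0], "loc": (48.86, 2.35)},
    {"id": "tokyo_bridge", "embedding": [0.9, 0.1], "loc": (35.68, 139.69)},
    {"id": "paris_cat", "embedding": [0.0, 1.0], "loc": (48.86, 2.35)},
]
BRIDGE_TAG = [1.0, 0.0]
TOKYO = (35.68, 139.69)

text_only = rank(IMAGES, BRIDGE_TAG, TOKYO, location_weight=0.0)
with_location = rank(IMAGES, BRIDGE_TAG, TOKYO, location_weight=0.5)
```

In this toy setup, querying "bridge" near Tokyo with `location_weight=0.0` ranks purely on the tag match, while `location_weight=0.5` moves the Tokyo image to the top. Changing that single weight is a crude analogue of controlling how strongly location shapes the ranking.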
