Paper Title
Collaborative Image Understanding
Paper Authors
Paper Abstract
Automatically understanding the contents of an image is a highly relevant problem in practice. In e-commerce and social media settings, for example, a common problem is to automatically categorize user-provided pictures. Nowadays, a standard approach is to fine-tune pre-trained image models with application-specific data. Besides images, however, organizations often also collect collaborative signals in the context of their applications, in particular how users interact with the provided online content, e.g., in the form of viewing, rating, or tagging. Such signals are commonly used for item recommendation, typically by deriving latent user and item representations from the data. In this work, we show that such collaborative information can be leveraged to improve the classification of new images. Specifically, we propose a multitask learning framework in which the auxiliary task is to reconstruct collaborative latent item representations. A series of experiments on datasets from e-commerce and social media demonstrates that considering collaborative signals helps to significantly improve the performance of the main task of image classification, by up to 9.1%.
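The multitask objective described in the abstract can be illustrated with a minimal sketch. The abstract does not state the loss functions or how the two tasks are weighted, so the choices here are assumptions made only for illustration: cross-entropy for the main classification task, mean squared error for reconstructing the collaborative latent item vector, and a hypothetical weighting parameter `aux_weight`. All function names and example values are likewise hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Main-task loss: negative log-probability of the true class."""
    return -math.log(softmax(logits)[label])


def mse(pred, target):
    """Auxiliary-task loss: mean squared reconstruction error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def multitask_loss(class_logits, label, pred_item_vec, cf_item_vec,
                   aux_weight=0.5):
    """Combine both tasks; aux_weight is an assumed hyperparameter."""
    main = cross_entropy(class_logits, label)
    # cf_item_vec stands in for a latent item representation that a
    # recommendation model derived from user interactions (assumption).
    aux = mse(pred_item_vec, cf_item_vec)
    return main + aux_weight * aux, main, aux


# Toy example: one image, three classes, a 3-dimensional item vector.
total, main, aux = multitask_loss(
    class_logits=[2.0, 0.5, -1.0],
    label=0,
    pred_item_vec=[0.1, -0.3, 0.8],
    cf_item_vec=[0.0, -0.5, 1.0],
    aux_weight=0.5,
)
print(f"main={main:.4f} aux={aux:.4f} total={total:.4f}")
```

In a real setup, both predictions would come from two output heads on a shared image backbone, and the combined loss would be minimized by gradient descent. Setting `aux_weight` to zero recovers plain image classification.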