Recognizing Characters in Art History Using Deep Learning

Paper Title

Recognizing Characters in Art History Using Deep Learning

Authors

Prathmesh Madhu, Ronak Kosti, Lara Mührenberg, Peter Bell, Andreas Maier, Vincent Christlein

Abstract

In the field of Art History, images of artworks and their contexts are core to understanding the underlying semantic information. However, the highly complex and sophisticated representation of these artworks makes it difficult, even for experts, to analyze the scene. From the computer vision perspective, the task of analyzing such artworks can be divided into sub-problems by taking a bottom-up approach. In this paper, we focus on the problem of recognizing the characters in Art History. From the iconography of the Annunciation of the Lord (Figure 1), we consider the representation of the main protagonists, Mary and Gabriel, across different artworks and styles. We investigate and present the findings of training a character classifier on features extracted from their face images. The limitations of this method, and the inherent ambiguity in the representation of Gabriel, motivated us to analyze their bodies (a bigger context) in order to recognize the characters. Convolutional Neural Networks (CNNs) trained on the bodies of Mary and Gabriel are able to learn person-related features and ultimately improve the performance of character recognition. We introduce a new technique that generates more data with similar styles, effectively creating data in a similar domain. We present experiments and analysis on three different models and show that the model trained on domain-related data gives the best performance for recognizing characters. Additionally, we analyze the localized image regions behind the network predictions. The code is open source and available at https://github.com/prathmeshrmadhu/recognize_characters_art_history, and the published peer-reviewed article is available at https://dl.acm.org/citation.cfm?id=3357242.
