MirrorNet: A Deep Bayesian Approach to Reflective 2D Pose Estimation from Human Images

Paper title

MirrorNet: A Deep Bayesian Approach to Reflective 2D Pose Estimation from Human Images

Authors

Takayuki Nakatsuka, Kazuyoshi Yoshii, Yuki Koyama, Satoru Fukayama, Masataka Goto, Shigeo Morishima

Abstract

This paper proposes a statistical approach to 2D pose estimation from human images. The main problems with the standard supervised approach, which is based on a deep recognition (image-to-pose) model, are that it often yields anatomically implausible poses and that its performance is limited by the amount of paired data. To solve these problems, we propose a semi-supervised method that can make effective use of images with and without pose annotations. Specifically, we formulate a hierarchical generative model of poses and images by integrating a deep generative model of poses from pose features with that of images from poses and image features. We then introduce a deep recognition model that infers poses from images. Given images as observed data, these models can be trained jointly in a hierarchical variational autoencoding (image-to-pose-to-feature-to-pose-to-image) manner. The results of experiments show that the proposed reflective architecture makes estimated poses anatomically plausible, and that the performance of pose estimation is improved by integrating the recognition and generative models and also by feeding non-annotated images.
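
The joint training described in the abstract can be read as maximizing a variational lower bound. The sketch below is only an illustration under assumed notation and factorizations; the symbols x (image), s (2D pose), z (pose features), and a (image features), and the recognition term q(a | x) in particular, are assumptions and do not appear in the abstract.

```latex
% Assumed notation: x = image, s = 2D pose, z = pose features, a = image features.
\begin{align}
% Generative side: poses from pose features, images from poses and image features.
p(x, s, z, a) &= p(x \mid s, a)\, p(s \mid z)\, p(z)\, p(a) \\
% Recognition side (assumed factorization): image-to-pose, then pose-to-feature.
q(s, z, a \mid x) &= q(s \mid x)\, q(z \mid s)\, q(a \mid x) \\
% Evidence lower bound for an image without a pose annotation.
\log p(x) &\ge \mathbb{E}_{q(s, z, a \mid x)}\big[
    \log p(x \mid s, a) + \log p(s \mid z) + \log p(z) + \log p(a) \notag \\
  &\qquad\qquad - \log q(s \mid x) - \log q(z \mid s) - \log q(a \mid x) \big]
\end{align}
```

Under these assumptions, sampling s from q(s | x), then z from q(z | s), and then scoring p(s | z) and p(x | s, a) traces the image-to-pose-to-feature-to-pose-to-image path named in the abstract. For an annotated image the pose s would be treated as observed rather than sampled.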
