Paper Title

Multi-view Human Pose and Shape Estimation Using Learnable Volumetric Aggregation

Authors

Soyong Shin, Eni Halilaj

Abstract

Human pose and shape estimation from RGB images is a highly sought-after alternative to marker-based motion capture, which is laborious, requires expensive equipment, and constrains capture to laboratory environments. Monocular vision-based algorithms, however, still suffer from rotational ambiguities and are not ready for translation in healthcare applications, where high accuracy is paramount. While fusion of data from multiple viewpoints could overcome these challenges, current algorithms require further improvement to obtain clinically acceptable accuracies. In this paper, we propose a learnable volumetric aggregation approach to reconstruct 3D human body pose and shape from calibrated multi-view images. We use a parametric representation of the human body, which makes our approach directly applicable to medical applications. Compared to previous approaches, our framework shows higher accuracy and greater promise for real-time prediction, given its cost efficiency.
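To make the core idea concrete, below is a minimal PyTorch sketch of what "learnable volumetric aggregation from calibrated multi-view images" can look like in general: per-view 2D features are unprojected into a shared voxel grid using the camera projection matrices, fused across views with learned per-voxel weights, and fed to a regressor that outputs parametric body-model coefficients. This is not the authors' implementation; the names (`unproject_to_volume`, `VolumetricAggregator`), the grid resolution, and the SMPL-style parameter counts (72 pose + 10 shape) are illustrative assumptions.

```python
# Hedged sketch of multi-view volumetric aggregation (NOT the paper's code).
# Assumptions: per-view 2D feature maps, known world->pixel projection matrices,
# a fixed voxel grid around the subject, and SMPL-style pose/shape outputs.
import torch
import torch.nn as nn
import torch.nn.functional as F


def unproject_to_volume(feats, proj, grid):
    """Sample each view's 2D features at the projection of every voxel center.

    feats: (V, C, H, W) per-view feature maps
    proj:  (V, 3, 4)   camera projection matrices (world -> pixel)
    grid:  (D, D, D, 3) voxel centers in world coordinates
    returns (V, C, D, D, D) per-view volumetric features
    """
    V, C, H, W = feats.shape
    D = grid.shape[0]
    pts = torch.cat([grid.reshape(-1, 3),
                     torch.ones(D ** 3, 1)], dim=1)        # (N, 4) homogeneous
    volumes = []
    for v in range(V):
        pix = (proj[v] @ pts.T).T                          # (N, 3)
        pix = pix[:, :2] / pix[:, 2:].clamp(min=1e-6)      # perspective divide
        # Normalize pixel coordinates to [-1, 1] for grid_sample.
        norm = torch.stack([2 * pix[:, 0] / (W - 1) - 1,
                            2 * pix[:, 1] / (H - 1) - 1], dim=1)
        sampled = F.grid_sample(feats[v:v + 1],
                                norm.view(1, 1, -1, 2),
                                align_corners=True)        # (1, C, 1, N)
        volumes.append(sampled.view(C, D, D, D))
    return torch.stack(volumes)                            # (V, C, D, D, D)


class VolumetricAggregator(nn.Module):
    """Fuse per-view volumes with learned weights, then regress body parameters."""

    def __init__(self, c=32, d=16, n_pose=72, n_shape=10):
        super().__init__()
        self.weight_head = nn.Conv3d(c, 1, 1)              # per-view, per-voxel score
        self.regressor = nn.Sequential(
            nn.Flatten(),
            nn.Linear(c * d ** 3, 256), nn.ReLU(),
            nn.Linear(256, n_pose + n_shape),              # SMPL-style theta, beta
        )

    def forward(self, view_volumes):                       # (V, C, D, D, D)
        scores = self.weight_head(view_volumes)            # (V, 1, D, D, D)
        weights = torch.softmax(scores, dim=0)             # normalize over views
        fused = (weights * view_volumes).sum(dim=0, keepdim=True)
        return self.regressor(fused)                       # (1, n_pose + n_shape)
```

The softmax over views makes the fusion fully differentiable and lets the network down-weight voxels that a given camera sees poorly (e.g., under occlusion); the paper's actual aggregation and regression heads may differ.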
