Paper Title

TriHorn-Net: A Model for Accurate Depth-Based 3D Hand Pose Estimation

Authors

Mohammad Rezaei, Razieh Rastgoo, Vassilis Athitsos

Abstract

3D hand pose estimation methods have made significant progress recently. However, the estimation accuracy is often far from sufficient for specific real-world applications, and thus there is significant room for improvement. This paper proposes TriHorn-Net, a novel model that uses specific innovations to improve hand pose estimation accuracy on depth images. The first innovation is the decomposition of the 3D hand pose estimation into the estimation of 2D joint locations in the depth image space (UV), and the estimation of their corresponding depths aided by two complementary attention maps. This decomposition prevents depth estimation, which is a more difficult task, from interfering with the UV estimations at both the prediction and feature levels. The second innovation is PixDropout, which is, to the best of our knowledge, the first appearance-based data augmentation method for hand depth images. Experimental results demonstrate that the proposed model outperforms the state-of-the-art methods on three public benchmark datasets. Our implementation is available at https://github.com/mrezaei92/TriHorn-Net.
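
Two points in the abstract lend themselves to a brief illustration: the decomposition of 3D pose estimation into UV estimation plus attention-aided depth estimation, and the PixDropout augmentation. The following is a minimal sketch, not the paper's actual implementation (see the linked repository for that). It assumes soft-argmax decoding of per-joint UV heatmaps, depth recovery via attention-weighted pooling of a per-pixel depth map with the two attention maps fused by summation, and a PixDropout-style augmentation that replaces a random fraction of depth pixels with a background value; all function names, tensor shapes, the fusion rule, and the drop probability are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def decode_pose(uv_heatmaps, depth_map, attn_a, attn_b):
    """Sketch of the UV/depth decomposition described in the abstract.

    uv_heatmaps:    (B, J, H, W) per-joint 2D heatmaps (UV branch)
    depth_map:      (B, 1, H, W) per-pixel depth predictions (depth branch)
    attn_a, attn_b: (B, J, H, W) two complementary attention maps
    Returns:        (B, J, 3) poses as (u, v, z) in normalized coordinates.
    """
    B, J, H, W = uv_heatmaps.shape
    # Soft-argmax over the spatial grid gives differentiable UV estimates.
    probs = F.softmax(uv_heatmaps.view(B, J, -1), dim=-1).view(B, J, H, W)
    xs = torch.linspace(0.0, 1.0, W, device=probs.device).view(1, 1, 1, W)
    ys = torch.linspace(0.0, 1.0, H, device=probs.device).view(1, 1, H, 1)
    u = (probs * xs).sum(dim=(2, 3))
    v = (probs * ys).sum(dim=(2, 3))
    # Joint depth: attention-weighted average of the depth map, with the
    # two attention maps fused by summation (an assumption made here).
    attn = F.softmax((attn_a + attn_b).view(B, J, -1), dim=-1).view(B, J, H, W)
    z = (attn * depth_map).sum(dim=(2, 3))
    return torch.stack([u, v, z], dim=-1)

def pix_dropout(depth_image, drop_prob=0.1, fill_value=0.0):
    """Hypothetical PixDropout variant: during training, replace a random
    fraction of pixels in a hand depth image with a background value."""
    mask = torch.rand_like(depth_image) < drop_prob
    return depth_image.masked_fill(mask, fill_value)
```

As a usage note, `pix_dropout(img, drop_prob=0.05)` would suppress roughly 5% of the pixels of a depth crop per call; applying it stochastically per training sample is the usual pattern for appearance-based augmentation.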
