Paper title
Real-time Detection of 2D Tool Landmarks with Synthetic Training Data
Authors
Abstract
This paper presents a deep learning architecture that detects, in real time, the 2D locations of certain landmarks on physical tools such as a hammer or screwdriver. To avoid the labor of manual labeling, the network is trained on synthetically generated data. Training computer vision models on computer-generated images while still achieving good accuracy on real images is challenging because of the domain gap between the two. The proposed method addresses this problem by combining an advanced rendering method with transfer learning and an intermediate supervision architecture. The resulting model, named the Intermediate Heatmap Model (IHM), is shown to generalize to real images when trained on synthetic data. To avoid the need for an exact textured 3D model of the tool in question, the model is also shown to generalize to an unseen tool when trained on a set of different 3D models of the same type of tool. IHM is compared with two existing keypoint detection approaches and, when trained on synthetic data, outperforms both at detecting tool landmarks.
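The abstract names two ingredients, heatmap-based landmark detection and intermediate supervision, without giving details. The following is a minimal sketch of how such pieces are commonly put together, not the paper's implementation. The Gaussian target, the mean-squared-error loss, the value of sigma, the heatmap size, and all function names are assumptions made for illustration.

```python
import math

def gaussian_heatmap(height, width, x, y, sigma=2.0):
    """Build a target heatmap: a 2D Gaussian centred on landmark (x, y).

    Assumption: Gaussian targets with a hand-picked sigma; the abstract
    does not specify how targets are generated.
    """
    return [
        [math.exp(-((col - x) ** 2 + (row - y) ** 2) / (2.0 * sigma ** 2))
         for col in range(width)]
        for row in range(height)
    ]

def mse(pred, target):
    """Mean squared error between two heatmaps of equal size."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += (p - t) ** 2
            count += 1
    return total / count

def intermediate_supervision_loss(stage_outputs, target):
    """Sum the loss over every stage's heatmap, not only the final one.

    Each intermediate stage is compared against the same target, so every
    stage receives a direct training signal (hypothetical loss choice).
    """
    return sum(mse(out, target) for out in stage_outputs)

def decode_landmark(heatmap):
    """Return the (x, y) pixel position of the heatmap's maximum."""
    best_value, best_xy = float("-inf"), (0, 0)
    for row, values in enumerate(heatmap):
        for col, value in enumerate(values):
            if value > best_value:
                best_value, best_xy = value, (col, row)
    return best_xy

# Usage: one landmark at x=10, y=20 on a 32x32 heatmap.
target = gaussian_heatmap(32, 32, x=10, y=20)
blank = [[0.0] * 32 for _ in range(32)]
loss = intermediate_supervision_loss([blank, target], target)
landmark = decode_landmark(target)
```

In this sketch the first stage predicts nothing and the second is perfect, so only the first stage contributes to the loss. Decoding the final heatmap recovers the landmark position used to build the target.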