Paper Title

Learning Motion-Robust Remote Photoplethysmography through Arbitrary Resolution Videos

Paper Authors

Jianwei Li, Zitong Yu, Jingang Shi

Paper Abstract


Remote photoplethysmography (rPPG) enables non-contact heart rate (HR) estimation from facial videos, which offers significant convenience compared with traditional contact-based measurements. In real-world long-term health monitoring scenarios, the distance between the camera and the participants, as well as their head movements, usually varies over time, resulting in inaccurate rPPG measurement due to varying face resolution and complex motion artifacts. Unlike previous rPPG models designed for a constant distance between camera and participants, in this paper we propose two plug-and-play blocks, a physiological signal feature extraction block (PFE) and a temporal face alignment block (TFA), to alleviate the degradation caused by changing distance and head motion. On one hand, guided by representative-area information, PFE adaptively encodes facial frames of arbitrary resolution into fixed-resolution facial structure features. On the other hand, leveraging the estimated optical flow, TFA counteracts the rPPG signal confusion caused by head movement, thus benefiting motion-robust rPPG signal recovery. In addition, we train the model with a cross-resolution constraint using a two-stream dual-resolution framework, which further helps PFE learn resolution-robust facial rPPG features. Extensive experiments on three benchmark datasets (UBFC-rPPG, COHFACE, and PURE) demonstrate the superior performance of the proposed method. One highlight is that with PFE and TFA, off-the-shelf spatio-temporal rPPG models can predict more robust rPPG signals under both varying face resolution and severe head movement scenarios. The code is available at https://github.com/LJW-GIT/Arbitrary_Resolution_rPPG.
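The core idea behind PFE, mapping facial frames of arbitrary resolution to a fixed-resolution feature grid, can be illustrated with a simple adaptive average pooling. This is a minimal stdlib-only sketch, not the authors' PFE block (which is additionally guided by representative-area information and learned end-to-end); the function name and bin-partition scheme below are illustrative assumptions.

```python
def adaptive_avg_pool(frame, out_h, out_w):
    """Pool a 2D frame (list of lists of numbers) of any size down to a
    fixed out_h x out_w grid by averaging over rectangular bins.

    Illustrates the arbitrary-resolution -> fixed-resolution mapping only;
    the real PFE encodes learned facial structure features, not raw pixels.
    """
    in_h, in_w = len(frame), len(frame[0])
    pooled = []
    for i in range(out_h):
        # Row span of the i-th output bin (at least one input row).
        r0 = i * in_h // out_h
        r1 = max(r0 + 1, (i + 1) * in_h // out_h)
        row = []
        for j in range(out_w):
            # Column span of the j-th output bin (at least one input column).
            c0 = j * in_w // out_w
            c1 = max(c0 + 1, (j + 1) * in_w // out_w)
            vals = [frame[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(vals) / len(vals))
        pooled.append(row)
    return pooled


# Frames captured at different camera distances arrive at different
# resolutions, but the pooled representation always has a fixed size.
near_frame = [[1.0] * 128 for _ in range(96)]   # higher-resolution face crop
far_frame = [[1.0] * 40 for _ in range(32)]     # lower-resolution face crop
fixed_near = adaptive_avg_pool(near_frame, 32, 32)
fixed_far = adaptive_avg_pool(far_frame, 32, 32)
```

In the paper's two-stream dual-resolution training, two such fixed-size encodings of the same clip at different resolutions can then be compared directly, which is what makes a cross-resolution consistency constraint straightforward to impose.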
