Paper Title
On Intra Video Coding and In-loop Filtering for Neural Object Detection Networks
Authors
Abstract
Classical video coding, which targets humans as the final users, is a widely investigated field of study, and common video codecs are all optimized for the human visual system (HVS). But are these assumptions and optimizations still valid when the compressed video stream is analyzed by a machine? To answer this question, we compare the performance of two state-of-the-art neural detection networks when fed with degraded input images coded with HEVC and VVC in an autonomous driving scenario using intra coding. Additionally, we examine the impact of the three VVC in-loop filters when coding images for a neural network. The results are compared using the mean average precision (mAP) metric to evaluate object detection performance on the compressed inputs. Throughout these tests, we found that the Bjøntegaard Delta Rate savings of 22.2 % with respect to PSNR obtained by using VVC instead of HEVC cannot be reached when coding for object detection networks, where the savings are only 13.6 % in the best case. Furthermore, disabling the VVC in-loop filters SAO and ALF yields bitrate savings of 6.4 % compared to the standard VTM at the same mean average precision.
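The Bjøntegaard Delta Rate figures quoted above compare two rate-quality curves at equal quality. The following is a minimal sketch of the commonly used cubic-fit formulation, not necessarily the exact procedure used in the paper. The function name `bd_rate` and all sample values are hypothetical and for illustration only; the quality axis can be PSNR in dB or mean average precision.

```python
import numpy as np


def bd_rate(anchor_rate, anchor_quality, test_rate, test_quality):
    """Average bitrate difference (in percent) of test vs. anchor at equal quality.

    Sketch of the classic Bjontegaard method: fit a cubic polynomial to
    log10(rate) as a function of quality for each codec, integrate both fits
    over the overlapping quality interval, and convert the mean log-rate
    difference back to a percentage. Negative values mean the test codec
    needs less bitrate than the anchor.
    """
    log_a = np.log10(np.asarray(anchor_rate, dtype=float))
    log_t = np.log10(np.asarray(test_rate, dtype=float))
    q_a = np.asarray(anchor_quality, dtype=float)
    q_t = np.asarray(test_quality, dtype=float)

    # Cubic fit of log-rate over quality (exact for four rate points).
    poly_a = np.polyfit(q_a, log_a, 3)
    poly_t = np.polyfit(q_t, log_t, 3)

    # Only the quality range covered by both curves is comparable.
    lo = max(q_a.min(), q_t.min())
    hi = min(q_a.max(), q_t.max())

    # Definite integrals of both fitted curves over [lo, hi].
    int_a = np.polyval(np.polyint(poly_a), hi) - np.polyval(np.polyint(poly_a), lo)
    int_t = np.polyval(np.polyint(poly_t), hi) - np.polyval(np.polyint(poly_t), lo)

    avg_log_diff = (int_t - int_a) / (hi - lo)
    return (10.0 ** avg_log_diff - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical data: the test codec needs 10 % less rate at every quality point.
    quality = [32.0, 35.0, 38.0, 41.0]          # e.g. PSNR in dB (made-up values)
    anchor = [1000.0, 2000.0, 4000.0, 8000.0]   # e.g. kbit/s (made-up values)
    test = [0.9 * r for r in anchor]
    print(f"BD-rate: {bd_rate(anchor, quality, test, quality):.2f} %")
```

With the made-up data above, the test curve is a uniform 10 % rate reduction, so the function should return approximately -10. Passing mAP values instead of PSNR as the quality arguments gives the detection-oriented variant of the metric discussed in the abstract.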