Paper Title
Real-Time Neural Light Field on Mobile Devices
Paper Authors
Paper Abstract
Recent efforts on Neural Radiance Fields (NeRF) have shown impressive results on novel view synthesis by utilizing implicit neural representations to represent 3D scenes. Due to the process of volumetric rendering, the inference speed of NeRF is extremely slow, limiting the application scenarios of NeRF on resource-constrained hardware, such as mobile devices. Many works have been conducted to reduce the latency of running NeRF models. However, most of them still require a high-end GPU for acceleration or extra storage memory, both of which are unavailable on mobile devices. Another emerging direction utilizes the Neural Light Field (NeLF) for speedup, as only one forward pass is performed per ray to predict the pixel color. Nevertheless, to reach a rendering quality similar to NeRF, the network in NeLF is designed with intensive computation, which is not mobile-friendly. In this work, we propose an efficient network that runs in real time on mobile devices for neural rendering. We follow the setting of NeLF to train our network. Unlike existing works, we introduce a novel network architecture that runs efficiently on mobile devices with low latency and small size, i.e., saving $15\times \sim 24\times$ storage compared with MobileNeRF. Our model achieves high-resolution generation while maintaining real-time inference for both synthetic and real-world scenes on mobile devices, e.g., $18.04$ ms (iPhone 13) for rendering one $1008\times756$ image of a real 3D scene. Additionally, we achieve image quality similar to NeRF and better than MobileNeRF (PSNR $26.15$ vs. $25.91$ on the real-world forward-facing dataset).
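The NeRF-vs-NeLF cost gap described in the abstract can be made concrete with a toy numpy sketch. This is purely illustrative, not the paper's architecture: the tiny two-layer MLP, the ray parameterization as (origin, direction), and the sample count are all assumptions. The point is only that NeRF-style volume rendering must query the network at many points along each ray and alpha-composite the results, while a NeLF-style renderer maps the whole ray to a color in a single forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, W1, W2):
    # Toy 2-layer ReLU MLP standing in for the scene network.
    return np.maximum(x @ W1, 0.0) @ W2

D_in, D_hid = 6, 32  # input: 3D point/origin + 3D direction
W1 = 0.1 * rng.normal(size=(D_in, D_hid))
W2_rgb_sigma = 0.1 * rng.normal(size=(D_hid, 4))  # NeRF head: (r, g, b, sigma)
W2_rgb = 0.1 * rng.normal(size=(D_hid, 3))        # NeLF head: (r, g, b)

ray_o = np.array([0.0, 0.0, 0.0])
ray_d = np.array([0.0, 0.0, 1.0])

# --- NeRF-style volume rendering: many network queries per ray ---
n_samples = 128
t = np.linspace(0.1, 4.0, n_samples)
pts = ray_o + t[:, None] * ray_d                       # sample points along the ray
inputs = np.concatenate([pts, np.broadcast_to(ray_d, pts.shape)], axis=1)
out = mlp_forward(inputs, W1, W2_rgb_sigma)            # n_samples network queries
rgb, sigma = out[:, :3], np.maximum(out[:, 3], 0.0)    # density must be non-negative
delta = np.diff(t, append=t[-1] + (t[1] - t[0]))       # distances between samples
alpha = 1.0 - np.exp(-sigma * delta)                   # per-sample opacity
T = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # accumulated transmittance
nerf_color = ((T * alpha)[:, None] * rgb).sum(axis=0)  # alpha compositing

# --- NeLF-style rendering: a single network query per ray ---
ray_feat = np.concatenate([ray_o, ray_d])[None, :]
nelf_color = mlp_forward(ray_feat, W1, W2_rgb)[0]

print("NeRF network queries per ray:", n_samples)  # 128
print("NeLF network queries per ray:", 1)
print("colors:", nerf_color.shape, nelf_color.shape)
```

Because the per-ray cost drops from `n_samples` queries to one, a NeLF network can afford to be wider or deeper for the same total compute; the abstract's point is that prior NeLF designs spent that budget so aggressively that they stopped being mobile-friendly.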