Paper Title

V4D: Voxel for 4D Novel View Synthesis

Authors

Wanshui Gan, Hongbin Xu, Yi Huang, Shifeng Chen, Naoto Yokoya

Abstract

Neural radiance fields have made a remarkable breakthrough in novel view synthesis for 3D static scenes. However, in the 4D case (e.g., dynamic scenes), the performance of existing methods is still limited by the capacity of the neural network, typically a multilayer perceptron (MLP). In this paper, we utilize 3D voxels to model the 4D neural radiance field, V4D for short, where the 3D voxels take two formats. The first regularly models the 3D space and then uses the sampled local 3D feature, together with the time index, to model the density field and the texture field with a tiny MLP. The second is a look-up-table (LUT) format for pixel-level refinement, where the pseudo-surface produced by volume rendering serves as guidance to learn a 2D pixel-level refinement mapping. The proposed LUT-based refinement module achieves a performance gain at little computational cost and can serve as a plug-and-play module in novel view synthesis tasks. Moreover, we propose a more effective conditional positional encoding for 4D data that yields a further performance gain with negligible computational overhead. Extensive experiments demonstrate that the proposed method achieves state-of-the-art performance at low computational cost.
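To make the first voxel format in the abstract concrete, the sketch below shows the general pattern of trilinearly sampling a local feature from a learnable 3D grid, concatenating the normalized time index, and decoding density and color with a tiny MLP. This is a minimal PyTorch illustration under assumed names and sizes (VoxelRadianceField, grid resolution, channel counts, MLP width); it is not the authors' released implementation, and it omits the separate density/texture volumes, the LUT refinement, and the conditional positional encoding described in the paper.

# Minimal sketch of "3D voxel features + time index -> tiny MLP -> (density, RGB)".
# All hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VoxelRadianceField(nn.Module):
    def __init__(self, grid_res=64, feat_dim=16, hidden=64):
        super().__init__()
        # Learnable 3D feature volume with shape (1, C, D, H, W).
        self.grid = nn.Parameter(torch.zeros(1, feat_dim, grid_res, grid_res, grid_res))
        # Tiny MLP decoding (local feature, time) -> (density, RGB).
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 1, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, 4),  # 1 density channel + 3 color channels
        )

    def forward(self, xyz, t):
        # xyz: (N, 3) sample points in [-1, 1]^3; t: (N, 1) normalized time index.
        # grid_sample on a 5D volume performs trilinear interpolation; it expects
        # query coordinates shaped (1, N, 1, 1, 3).
        coords = xyz.view(1, -1, 1, 1, 3)
        feat = F.grid_sample(self.grid, coords, align_corners=True)  # (1, C, N, 1, 1)
        feat = feat.view(self.grid.shape[1], -1).t()                 # (N, C)
        out = self.mlp(torch.cat([feat, t], dim=-1))                 # (N, 4)
        density = F.relu(out[:, :1])      # non-negative volume density
        rgb = torch.sigmoid(out[:, 1:])   # color in [0, 1]
        return density, rgb

# Usage: query 1024 random space-time samples.
model = VoxelRadianceField()
xyz = torch.rand(1024, 3) * 2 - 1
t = torch.rand(1024, 1)
density, rgb = model(xyz, t)

The returned per-sample density and color would then be composited along each camera ray by standard volume rendering, which is also the step that produces the pseudo-surface used to guide the LUT-based pixel-level refinement mentioned above.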
