Paper Title

MESH2IR: Neural Acoustic Impulse Response Generator for Complex 3D Scenes

Authors

Anton Ratnarajah, Zhenyu Tang, Rohith Chandrashekar Aralikatti, Dinesh Manocha

Abstract

We propose a mesh-based neural network (MESH2IR) to generate acoustic impulse responses (IRs) for indoor 3D scenes represented using a mesh. The IRs are used to create a high-quality sound experience in interactive applications and audio processing. Our method can handle input triangular meshes with arbitrary topologies (2K - 3M triangles). We present a novel training technique to train MESH2IR using energy decay relief and highlight its benefits. We also show that training MESH2IR on IRs preprocessed using our proposed technique significantly improves the accuracy of IR generation. We reduce the non-linearity in the mesh space by transforming 3D scene meshes to latent space using a graph convolution network. Our MESH2IR is more than 200 times faster than a geometric acoustic algorithm on a CPU and can generate more than 10,000 IRs per second on an NVIDIA GeForce RTX 2080 Ti GPU for a given furnished indoor 3D scene. The acoustic metrics are used to characterize the acoustic environment. We show that the acoustic metrics of the IRs predicted from our MESH2IR match the ground truth with less than 10% error. We also highlight the benefits of MESH2IR on audio and speech processing applications such as speech dereverberation and speech separation. To the best of our knowledge, ours is the first neural-network-based approach to predict IRs from a given 3D scene mesh in real-time.
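
The abstract mentions energy decay relief (EDR) without defining it. As a rough illustration only, and not the authors' implementation, the sketch below computes an EDR in the commonly used sense: the short-time spectral energy of an IR, integrated backward in time for each frequency bin. The function name `energy_decay_relief`, the frame length, the hop size, and the synthetic test IR are all assumptions made for this example.

```python
import numpy as np


def energy_decay_relief(ir, frame_len=256, hop=128):
    """Return an EDR in dB with shape (n_frames, n_bins).

    Hypothetical helper for illustration; frame_len and hop are
    arbitrary choices, not values taken from the paper.
    """
    ir = np.asarray(ir, dtype=float)
    if ir.size < frame_len:
        # Zero-pad very short responses so at least one frame exists.
        ir = np.pad(ir, (0, frame_len - ir.size))

    window = np.hanning(frame_len)
    n_frames = 1 + (ir.size - frame_len) // hop

    # Slice the IR into overlapping, windowed frames.
    frames = np.stack(
        [ir[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )

    # Short-time spectral energy per frame and frequency bin.
    energy = np.abs(np.fft.rfft(frames, axis=1)) ** 2

    # Backward integration: energy remaining from each frame to the end.
    edr = np.cumsum(energy[::-1], axis=0)[::-1]

    # Small offset avoids log(0) for silent bins.
    return 10.0 * np.log10(edr + 1e-12)


# Synthetic stand-in for an IR: exponentially decaying white noise.
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs // 2) / fs
ir = rng.standard_normal(t.size) * np.exp(-t / 0.05)

edr_db = energy_decay_relief(ir)
print(edr_db.shape)
```

Each column of `edr_db` is a decay curve for one frequency bin, so it can only fall or stay level as time advances. How the paper uses EDR during training is not specified in the abstract, so this sketch stops at computing the quantity.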
