Paper Title

Disentangling Content and Fine-grained Prosody Information via Hybrid ASR Bottleneck Features for Voice Conversion

Authors

Xintao Zhao, Feng Liu, Changhe Song, Zhiyong Wu, Shiyin Kang, Deyi Tuo, Helen Meng

Abstract

Voice conversion (VC) with non-parallel data has recently achieved considerable breakthroughs by introducing bottleneck features (BNFs) extracted by an automatic speech recognition (ASR) model. However, the choice of BNFs has a significant impact on VC results. For example, when BNFs are extracted from an ASR model trained with cross-entropy loss (CE-BNFs) and fed into a neural network to train a VC system, the timbre similarity of the converted speech is significantly degraded. If BNFs are instead extracted from an ASR model trained with connectionist temporal classification loss (CTC-BNFs), the naturalness of the converted speech may decrease. This phenomenon is caused by differences in the information contained in the two kinds of BNFs. In this paper, we propose an any-to-one VC method that uses hybrid bottleneck features, combining CTC-BNFs and CE-BNFs so that their advantages complement each other. A gradient reversal layer and instance normalization are used to extract prosody information from CE-BNFs and content information from CTC-BNFs. An auto-regressive decoder and a HiFi-GAN vocoder are used to generate high-quality waveforms. Experimental results show that the proposed method achieves higher similarity, naturalness, and quality than the baseline method, and reveal the differences between the information contained in CE-BNFs and CTC-BNFs as well as the influence they have on the converted speech.
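The abstract names two building blocks, instance normalization and a gradient reversal layer, without describing how they are wired into the model. The following is a minimal pure-Python sketch of what each operation does in general, not the authors' implementation. The names `instance_norm`, `GradientReversal`, and `weight`, as well as the frames-by-channels list layout, are assumptions made for illustration.

```python
import math


def instance_norm(features, eps=1e-5):
    """Normalize each channel of a single utterance over time.

    features: list of frames, each a list of channel values (T x C).
    Per-utterance channel statistics (mean and variance) are removed,
    which is the general effect of instance normalization.
    """
    num_frames = len(features)
    num_channels = len(features[0])
    normalized = [[0.0] * num_channels for _ in range(num_frames)]
    for c in range(num_channels):
        column = [frame[c] for frame in features]
        mean = sum(column) / num_frames
        var = sum((v - mean) ** 2 for v in column) / num_frames
        scale = 1.0 / math.sqrt(var + eps)
        for t in range(num_frames):
            normalized[t][c] = (column[t] - mean) * scale
    return normalized


class GradientReversal:
    """Identity in the forward pass; negated, scaled gradient in the backward pass."""

    def __init__(self, weight=1.0):
        # Hypothetical scaling factor applied to the reversed gradient.
        self.weight = weight

    def forward(self, x):
        # The activations pass through unchanged.
        return list(x)

    def backward(self, grad_output):
        # The gradient flowing back to earlier layers has its sign flipped.
        return [-self.weight * g for g in grad_output]
```

In typical adversarial setups, a gradient reversal layer sits between an encoder and an auxiliary classifier, so that the encoder is trained to make the classifier's task harder. Which classifier and which feature stream the paper pairs it with is not stated in the abstract.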
