Paper Title

Genre-conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music

Authors

Xiaoxue Gao, Chitralekha Gupta, Haizhou Li

Abstract

Lyrics transcription of polyphonic music is challenging not only because the singing vocals are corrupted by the background music, but also because the background music and the singing style vary across music genres, such as pop, metal, and hip hop, which affects the intelligibility of the lyrics in different ways. In this work, we propose to transcribe the lyrics of polyphonic music using a novel genre-conditioned network. The proposed network adopts pre-trained model parameters and incorporates genre adapters between layers to capture the genre peculiarities of lyrics-genre pairs, thereby requiring only lightweight genre-specific parameters to be trained. Our experiments show that the proposed genre-conditioned network outperforms existing lyrics transcription systems.
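The abstract does not spell out how the genre adapters are built. As a rough illustration only, the sketch below assumes a common bottleneck design (down-projection, ReLU, up-projection, residual connection) placed after a frozen layer, with one adapter per genre. The class names, dimensions, genre labels, and random stand-in weights are all hypothetical and are not taken from the paper.

```python
import numpy as np


class GenreAdapter:
    """Assumed bottleneck adapter: down-project, ReLU, up-project, residual add."""

    def __init__(self, dim, bottleneck, rng):
        self.w_down = rng.normal(0.0, 0.02, size=(dim, bottleneck))
        # Zero-initialised up-projection: the adapter starts as an identity map,
        # so the frozen layer's output is initially passed through unchanged.
        self.w_up = np.zeros((bottleneck, dim))

    def __call__(self, h):
        z = np.maximum(h @ self.w_down, 0.0)
        return h + z @ self.w_up

    def num_params(self):
        return self.w_down.size + self.w_up.size


class GenreConditionedLayer:
    """A frozen layer followed by one small adapter per genre (hypothetical layout)."""

    def __init__(self, dim, bottleneck, genres, seed=0):
        rng = np.random.default_rng(seed)
        # Random stand-in for pre-trained weights; these would stay frozen in training.
        self.w_frozen = rng.normal(0.0, 0.1, size=(dim, dim))
        self.adapters = {g: GenreAdapter(dim, bottleneck, rng) for g in genres}

    def __call__(self, x, genre):
        h = np.tanh(x @ self.w_frozen)
        # Only the adapter selected by the genre label is applied.
        return self.adapters[genre](h)


layer = GenreConditionedLayer(dim=256, bottleneck=16, genres=["pop", "metal", "hiphop"])
frames = np.random.default_rng(1).normal(size=(100, 256))  # 100 feature frames
out = layer(frames, genre="metal")

shared = layer.w_frozen.size                      # frozen, shared across genres
per_genre = layer.adapters["metal"].num_params()  # trainable, specific to one genre
print(out.shape, shared, per_genre)
```

In this toy configuration each genre adds 2 x 256 x 16 = 8192 trainable weights against 65536 frozen ones in the shared layer, which is the sense in which genre-specific parameters can be called lightweight.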
