Paper Title

g2pW: A Conditional Weighted Softmax BERT for Polyphone Disambiguation in Mandarin

Authors

Yi-Chang Chen, Yu-Chuan Chang, Yen-Cheng Chang, Yi-Ren Yeh

Abstract

Polyphone disambiguation is the most crucial task in Mandarin grapheme-to-phoneme (g2p) conversion. Previous studies have approached this problem using pre-trained language models, restricted output, and extra information from Part-Of-Speech (POS) tagging. Inspired by these strategies, we propose a novel approach, called g2pW, which adapts learnable softmax-weights to condition the outputs of BERT with the polyphonic character of interest and its POS tagging. Rather than using the hard mask as in previous works, our experiments show that learning a soft-weighting function for the candidate phonemes benefits performance. In addition, our proposed g2pW does not require extra pre-trained POS tagging models while using POS tags as auxiliary features since we train the POS tagging model simultaneously with the unified encoder. Experimental results show that our g2pW outperforms existing methods on the public CPP dataset. All codes, model weights, and a user-friendly package are publicly available.
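The contrast the abstract draws between a hard mask and a learned soft weighting can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the toy logits, and the way the weights are formed (a sigmoid over the sum of a hypothetical per-character score and a hypothetical per-POS score) are assumptions made only for illustration. In a real model the logits would come from BERT and the weights would be learned.

```python
import numpy as np


def hard_masked_softmax(logits, mask):
    """Softmax restricted to the candidates where mask == 1 (hard mask)."""
    masked = np.where(mask > 0, logits, -np.inf)
    masked = masked - masked.max()  # stabilise; at least one entry is finite
    exp = np.exp(masked)            # exp(-inf) == 0 for disallowed phonemes
    return exp / exp.sum()


def weighted_softmax(logits, weights):
    """Softmax whose exponentials are scaled by per-phoneme weights in (0, 1)."""
    shifted = logits - logits.max()
    scaled = weights * np.exp(shifted)
    return scaled / scaled.sum()


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


# Toy setup: 5 phonemes in the vocabulary, of which the polyphonic
# character of interest can only take the first two.
rng = np.random.default_rng(0)
logits = rng.normal(size=5)          # stand-in for encoder outputs
mask = np.array([1, 1, 0, 0, 0])     # hard candidate mask

hard = hard_masked_softmax(logits, mask)

# Hypothetical learned scores, conditioned on the character and its POS tag.
char_scores = np.array([4.0, 3.0, -4.0, -5.0, -6.0])
pos_scores = np.array([0.5, -0.5, 0.0, 0.0, 0.0])
weights = sigmoid(char_scores + pos_scores)

soft = weighted_softmax(logits, weights)

print("hard mask :", np.round(hard, 4))
print("soft weights:", np.round(soft, 4))
```

With weights that are exactly 0 or 1, `weighted_softmax` reduces to `hard_masked_softmax`, so in this sketch the soft weighting is a relaxation of the hard mask: non-candidate phonemes are strongly down-weighted rather than forbidden outright.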
