Paper Title

Decomposed Temporal Dynamic CNN: Efficient Time-Adaptive Network for Text-Independent Speaker Verification Explained with Speaker Activation Map

Paper Authors

Seong-Hu Kim, Hyeonuk Nam, Yong-Hwa Park

Paper Abstract

To extract accurate speaker information for text-independent speaker verification, the temporal dynamic CNN (TDY-CNN), which adapts its kernels to each time bin, was proposed. However, the model size of TDY-CNN is too large, and the degrees of freedom of its adaptive kernels are limited. To address these limitations, we propose the decomposed temporal dynamic CNN (DTDY-CNN), which forms a time-adaptive kernel by combining a static kernel with a dynamic residual based on matrix decomposition. The proposed DTDY-ResNet-34(x0.50), using attentive statistical pooling and no data augmentation, achieves an EER of 0.96%, better than other state-of-the-art methods. DTDY-CNN is a successful upgrade of TDY-CNN, reducing the model size by 64% while improving performance. We also show that DTDY-CNN extracts more accurate frame-level speaker embeddings than TDY-CNN. The detailed behavior of DTDY-ResNet-34(x0.50) in extracting speaker information is analyzed using speaker activation maps (SAMs) produced by a gradient-weighted class activation mapping (Grad-CAM) modified for speaker verification. DTDY-ResNet-34(x0.50) effectively extracts speaker information not only from formant frequencies but also from the high-frequency content of unvoiced phonemes, explaining its outstanding performance on text-independent speaker verification.
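
The abstract's key mechanism, a time-adaptive kernel formed as a static kernel plus a dynamic residual based on matrix decomposition, can be illustrated concretely. Below is a minimal PyTorch sketch assuming a low-rank residual of the form W(t) = W_static + P diag(phi(t)) Q^T, where phi(t) is predicted from the features of each time bin; the class name, rank, and phi predictor are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecomposedTemporalDynamicConv2d(nn.Module):
    """Sketch: per time bin t, W(t) = W_static + P @ diag(phi(t)) @ Q^T."""

    def __init__(self, in_ch, out_ch, kernel_size=3, rank=4):
        super().__init__()
        self.k = kernel_size
        self.static = nn.Conv2d(in_ch, out_ch, kernel_size,
                                padding=kernel_size // 2, bias=False)
        k2 = in_ch * kernel_size * kernel_size
        # Low-rank factors of the dynamic residual (matrix decomposition).
        self.P = nn.Parameter(torch.randn(out_ch, rank) * 0.01)
        self.Q = nn.Parameter(torch.randn(k2, rank) * 0.01)
        # Predicts per-time-bin coefficients phi(t) from frequency-pooled
        # features (a hypothetical design choice for this sketch).
        self.phi = nn.Sequential(nn.Linear(in_ch, rank), nn.Tanh())

    def forward(self, x):                      # x: (B, C, F, T) spectrogram
        y = self.static(x)                     # shared static convolution
        phi = self.phi(x.mean(dim=2).transpose(1, 2))      # (B, T, rank)
        # Residual kernels, one per time bin:
        # w_res[b, t] = P @ diag(phi[b, t]) @ Q^T -> (B, T, out_ch, k2)
        w_res = torch.einsum('or,btr,kr->btok', self.P, phi, self.Q)
        # im2col so each time bin is convolved with its own residual kernel.
        patches = F.unfold(x, self.k, padding=self.k // 2)
        patches = patches.view(x.size(0), -1, x.size(2), x.size(3))
        y_res = torch.einsum('btok,bkft->boft', w_res, patches)
        return y + y_res

# Usage: a batch of 2 utterances, 32 channels, 40 mel bins, 100 frames.
x = torch.randn(2, 32, 40, 100)
print(DecomposedTemporalDynamicConv2d(32, 64)(x).shape)  # (2, 64, 40, 100)
```

Because the residual is rank-limited rather than a full set of candidate kernels, the parameter count stays close to that of a static convolution, which is consistent with the reported 64% model-size reduction over TDY-CNN.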

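The speaker activation map is described as Grad-CAM modified for speaker verification. A plausible reading is that, instead of a class score, one backpropagates a similarity score between the test and enrollment embeddings; the sketch below follows that assumption, and the use of cosine similarity and the hook-based layer capture are our choices, not necessarily the paper's exact modification.

```python
import torch.nn.functional as F

def speaker_activation_map(model, x_test, emb_enroll, target_layer):
    """Grad-CAM-style map for verification: weight a layer's feature maps
    by gradients of the test/enrollment similarity score (sketch)."""
    cache = {}
    h_fwd = target_layer.register_forward_hook(
        lambda m, i, o: cache.update(act=o))
    h_bwd = target_layer.register_full_backward_hook(
        lambda m, gi, go: cache.update(grad=go[0]))

    emb_test = model(x_test)                          # (1, D) embedding
    score = F.cosine_similarity(emb_test, emb_enroll).sum()
    model.zero_grad()
    score.backward()                                  # d(score)/d(features)
    h_fwd.remove(); h_bwd.remove()

    act, grad = cache['act'], cache['grad']           # (1, C, F', T')
    weights = grad.mean(dim=(2, 3), keepdim=True)     # pool over freq/time
    sam = F.relu((weights * act).sum(dim=1))          # (1, F', T') map
    return sam / (sam.max() + 1e-8)                   # normalize to [0, 1]
```

Plotted over the input spectrogram, high values at formant frequencies and in high-frequency unvoiced regions would correspond to the behavior the abstract reports.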