Paper Title

Graph Convolutional Network Based Semi-Supervised Learning on Multi-Speaker Meeting Data

Authors

Tong, Fuchuan; Zheng, Siqi; Zhang, Min; Chen, Yafeng; Suo, Hongbin; Hong, Qingyang; Li, Lin

Abstract

Unsupervised clustering on speakers is becoming increasingly important for its potential uses in semi-supervised learning. In reality, we are often presented with enormous amounts of unlabeled data from multi-party meetings and discussions. An effective unsupervised clustering approach would allow us to significantly increase the amount of training data without additional costs for annotations. Recently, methods based on graph convolutional networks (GCN) have received growing attention for unsupervised clustering, as these methods exploit the connectivity patterns between nodes to improve learning performance. In this work, we present a GCN-based approach for semi-supervised learning. Given a pre-trained embedding extractor, a graph convolutional network is trained on the labeled data and clusters unlabeled data with "pseudo-labels". We present a self-correcting training mechanism that iteratively runs the cluster-train-correct process on pseudo-labels. We show that this proposed approach effectively uses unlabeled data and improves speaker recognition accuracy.
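
The cluster-train-correct loop described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, and several parts are assumptions made for illustration: the graph is a cosine k-nearest-neighbour graph, the "GCN" is reduced to a single parameter-free propagation step with the symmetrically normalized adjacency matrix D^{-1/2}(A + I)D^{-1/2}, and a nearest-centroid classifier stands in for the trained network. The function names (`knn_adjacency`, `normalized_adjacency`, `self_training`) and the parameters `k` and `rounds` are hypothetical.

```python
import numpy as np


def knn_adjacency(x, k):
    """Symmetric k-nearest-neighbour adjacency built from cosine similarity."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)             # a node is not its own neighbour
    idx = np.argsort(-sim, axis=1)[:, :k]      # k most similar nodes per row
    adj = np.zeros((len(x), len(x)))
    rows = np.repeat(np.arange(len(x)), k)
    adj[rows, idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)              # make the graph undirected


def normalized_adjacency(adj):
    """D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    a_hat = adj + np.eye(len(adj))
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def self_training(x, labels, n_classes, k=5, rounds=3):
    """Assign pseudo-labels to unlabeled nodes (label -1) and refine them.

    Assumes every class has at least one labeled node.
    """
    a_norm = normalized_adjacency(knn_adjacency(x, k))
    h = a_norm @ x                             # smooth embeddings over the graph
    labeled = labels >= 0
    pseudo = labels.copy()
    for _ in range(rounds):
        known = pseudo >= 0
        # "Train": estimate one centroid per class from current (pseudo-)labels.
        centroids = np.stack(
            [h[known & (pseudo == c)].mean(axis=0) for c in range(n_classes)]
        )
        # "Cluster": assign every node to its nearest centroid.
        dist = np.linalg.norm(h[:, None, :] - centroids[None, :, :], axis=2)
        pred = dist.argmin(axis=1)
        # "Correct": overwrite old pseudo-labels, but never the true labels.
        pseudo = np.where(labeled, labels, pred)
    return pseudo


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-in for speaker embeddings: two well-separated groups.
    x = np.vstack([rng.normal([5, 0], 0.3, (20, 2)),
                   rng.normal([0, 5], 0.3, (20, 2))])
    labels = np.full(40, -1)
    labels[0], labels[20] = 0, 1               # one labeled example per speaker
    print(self_training(x, labels, n_classes=2))
```

In a real system the embeddings would come from the pre-trained extractor, and the centroid step would be replaced by training a graph convolutional network on the labeled and pseudo-labeled nodes.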
