LSCALE: Latent Space Clustering-Based Active Learning for Node Classification

Title

LSCALE: Latent Space Clustering-Based Active Learning for Node Classification

Authors

Juncheng Liu, Yiwei Wang, Bryan Hooi, Renchi Yang, Xiaokui Xiao

Abstract

Node classification on graphs is an important task in many practical domains. It usually requires labels for training, which can be difficult or expensive to obtain in practice. Given a budget for labelling, active learning aims to improve performance by carefully choosing which nodes to label. Previous graph active learning methods learn representations using labelled nodes and select some unlabelled nodes for label acquisition. However, they do not fully utilize the representation power present in unlabelled nodes. We argue that the representation power in unlabelled nodes can be useful for active learning and for further improving performance of active learning for node classification. In this paper, we propose a latent space clustering-based active learning framework for node classification (LSCALE), where we fully utilize the representation power in both labelled and unlabelled nodes. Specifically, to select nodes for labelling, our framework uses the K-Medoids clustering algorithm on a latent space based on a dynamic combination of both unsupervised features and supervised features. In addition, we design an incremental clustering module to avoid redundancy between nodes selected at different steps. Extensive experiments on five datasets show that our proposed framework LSCALE consistently and significantly outperforms the state-of-the-art approaches by a large margin.
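
The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `combine_features` and `k_medoids`, the weight `alpha`, and the use of a weighted concatenation as the "dynamic combination" are all assumptions made for illustration. Keeping earlier picks as fixed medoids is likewise only one plausible reading of the incremental clustering idea. The pairwise distance matrix is dense, so the sketch suits small graphs only.

```python
import numpy as np


def combine_features(unsup, sup, alpha):
    """Hypothetical combination: weighted concatenation of two feature sets.

    `alpha` weights the unsupervised features and (1 - alpha) the supervised
    ones; the paper's actual combination rule may differ.
    """
    return np.hstack([alpha * unsup, (1.0 - alpha) * sup])


def k_medoids(z, k, fixed=(), n_iter=20, seed=0):
    """Alternating (Voronoi-iteration) K-Medoids on the rows of `z`.

    Indices in `fixed` act as medoids that are never moved (an assumed
    stand-in for incremental clustering). Returns the k new medoid indices.
    """
    rng = np.random.default_rng(seed)
    # Dense pairwise Euclidean distances between all points.
    dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=2)
    fixed = list(fixed)
    free = [i for i in range(len(z)) if i not in fixed]
    # Start from k random non-fixed points as the new medoids.
    medoids = fixed + [int(i) for i in rng.choice(free, size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest medoid.
        assign = np.argmin(dist[:, medoids], axis=1)
        new_medoids = list(medoids)
        # Update only the non-fixed medoids.
        for c in range(len(fixed), len(medoids)):
            members = np.flatnonzero(assign == c)
            if members.size == 0:
                continue  # keep the old medoid for an empty cluster
            # The new medoid minimises total distance to its cluster members.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = int(members[np.argmin(within)])
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids[len(fixed):]
```

In a labelling loop, one would call `k_medoids` once per step with the step's share of the budget as `k`, pass all previously selected nodes as `fixed`, and send the returned indices for labelling.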
