Paper Title
NCAGC: A Neighborhood Contrast Framework for Attributed Graph Clustering
Authors
Abstract
Attributed graph clustering is one of the most fundamental tasks in graph learning; its goal is to group nodes with similar representations into the same cluster without human annotations. Recent studies based on graph contrastive learning have achieved remarkable results in exploiting graph-structured data. However, most existing methods 1) do not directly address the clustering task, since representation learning and clustering are carried out as separate processes; 2) depend too heavily on data augmentation, which greatly limits the capability of contrastive learning; 3) ignore the contrastive message for clustering tasks, which degrades the clustering results. In this paper, we propose a Neighborhood Contrast Framework for Attributed Graph Clustering, namely NCAGC, to overcome the aforementioned limitations. Specifically, by leveraging the Neighborhood Contrast Module, the representations of neighboring nodes are 'pushed closer' and become clustering-oriented through the neighborhood contrast loss. Moreover, a Contrastive Self-Expression Module is built by minimizing the discrepancy between the node representations before and after the self-expression layer, which constrains the learning of the self-expression matrix. All the modules of NCAGC are optimized in a unified framework, so the learned node representations contain clustering-oriented messages. Extensive experimental results on four attributed graph datasets demonstrate the promising performance of NCAGC compared with 16 state-of-the-art clustering methods. The code is available at https://github.com/wangtong627/NCAGC.
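To make the two modules named in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a node's neighborhood is its k most similar nodes under cosine similarity, that the neighborhood contrast loss has an InfoNCE-like form with a temperature `tau`, and that the self-expression layer reconstructs each node as a linear combination `C z` of the other nodes. The function names, the parameters `k` and `tau`, and the use of a plain squared error in place of the paper's contrastive self-expression loss are all illustrative assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def neighborhood_contrast_loss(z, k=1, tau=0.5):
    """InfoNCE-style loss (assumed form): for each node, its k most similar
    nodes act as positives and every other node appears in the denominator."""
    n = len(z)
    sim = [[cosine(z[i], z[j]) for j in range(n)] for i in range(n)]
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Hypothetical neighborhood definition: top-k by cosine similarity.
        neighbors = sorted(others, key=lambda j: sim[i][j], reverse=True)[:k]
        pos = sum(math.exp(sim[i][j] / tau) for j in neighbors)
        denom = sum(math.exp(sim[i][j] / tau) for j in others)
        total += -math.log(pos / denom)
    return total / n

def self_expression(z, c):
    """Reconstruct every node from the others: z_hat = C z."""
    n, d = len(z), len(z[0])
    return [[sum(c[i][j] * z[j][t] for j in range(n)) for t in range(d)]
            for i in range(n)]

def reconstruction_error(z, z_hat):
    """Mean squared distance between representations before and after the
    self-expression layer (a stand-in for the paper's contrastive term)."""
    return sum(sum((a - b) ** 2 for a, b in zip(u, v))
               for u, v in zip(z, z_hat)) / len(z)

# Two tight groups of nodes versus four evenly scattered nodes.
clustered = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
scattered = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
# Hand-picked coefficient matrix: each node is expressed by its group partner.
c = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
```

Under these assumptions, `neighborhood_contrast_loss(clustered)` is smaller than `neighborhood_contrast_loss(scattered)`, which matches the intuition that the loss rewards representations whose neighbors lie close together. In a unified framework such as the one the abstract describes, both terms would be minimized jointly over the encoder and the matrix `C`, rather than evaluated on fixed inputs as here.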