论文标题
在表示形式感知的随机块模型下,光谱聚类的一致性
On consistency of constrained spectral clustering under representation-aware stochastic block model
论文作者
论文摘要
由于其灵活性,计算效率和理解的理论性能保证,光谱群集在实践中被广泛使用。最近,已经研究了光谱聚类,以在人口级别的约束下找到平衡的簇。这些约束是由以辅助分类节点属性形式可用的其他信息指定的。在本文中,我们考虑了一个场景,其中这些属性可能无法观察到,而是表现为辅助图的潜在特征。在此激励的情况下,我们研究了受约束的光谱聚类,目的是在给定的\ textIt {相似性图} $ \ Mathcal {g} $中找到平衡的簇,以便每个人都相对于辅助图$ \ Mathcal {r}(r} $(我们将其称为表示图形)。我们提出了一个个人级别的平衡约束,以形式化这一想法。我们的工作导致了一个有趣的随机块模型,该模型不仅以$ \ MATHCAL {G} $种植给定的分区,而且还种植了表示图中编码的辅助信息$ \ Mathcal {r} $。在这种情况下,我们开发了光谱聚类的非均衡和归一化变体。这些算法使用$ \ Mathcal {r} $在$ \ Mathcal {g} $中查找群集,以大约满足所提出的约束。我们还为从随机块模型的上述变体中采样的图表下,在单个级别约束下建立了第一个统计一致性结果。我们的实验结果证实了我们的理论发现。
Spectral clustering is widely used in practice due to its flexibility, computational efficiency, and well-understood theoretical performance guarantees. Recently, spectral clustering has been studied to find balanced clusters under population-level constraints. These constraints are specified by additional information available in the form of auxiliary categorical node attributes. In this paper, we consider a scenario where these attributes may not be observable, but manifest as latent features of an auxiliary graph. Motivated by this, we study constrained spectral clustering with the aim of finding balanced clusters in a given \textit{similarity graph} $\mathcal{G}$, such that each individual is adequately represented with respect to an auxiliary graph $\mathcal{R}$ (we refer to this as representation graph). We propose an individual-level balancing constraint that formalizes this idea. Our work leads to an interesting stochastic block model that not only plants the given partitions in $\mathcal{G}$ but also plants the auxiliary information encoded in the representation graph $\mathcal{R}$. We develop unnormalized and normalized variants of spectral clustering in this setting. These algorithms use $\mathcal{R}$ to find clusters in $\mathcal{G}$ that approximately satisfy the proposed constraint. We also establish the first statistical consistency result for constrained spectral clustering under individual-level constraints for graphs sampled from the above-mentioned variant of the stochastic block model. Our experimental results corroborate our theoretical findings.