Paper Title

Joint Spectral Clustering in Multilayer Degree-Corrected Stochastic Blockmodels

Authors

Joshua Agterberg, Zachary Lubberts, Jesús Arroyo

Abstract

Modern network datasets are often composed of multiple layers, either as different views, time-varying observations, or independent sample units, resulting in collections of networks over the same set of vertices but with potentially different connectivity patterns on each network. These data require models and methods that are flexible enough to capture local and global differences across the networks, while at the same time being parsimonious and tractable to yield computationally efficient and theoretically sound solutions that are capable of aggregating information across the networks. This paper considers the multilayer degree-corrected stochastic blockmodel, where a collection of networks share the same community structure, but degree-corrections and block connection probability matrices are permitted to be different. We establish the identifiability of this model and propose a spectral clustering algorithm for community detection in this setting. Our theoretical results demonstrate that the misclustering error rate of the algorithm improves exponentially with multiple network realizations, even in the presence of significant layer heterogeneity with respect to degree corrections, signal strength, and spectral properties of the block connection probability matrices. Simulation studies show that this approach improves on existing multilayer community detection methods in this challenging regime. Furthermore, in a case study of US airport data from January 2016 to September 2021, we find that this methodology identifies meaningful community structure and trends in airport popularity influenced by pandemic impacts on travel.
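
For readers unfamiliar with the model named in the abstract, the display below is a minimal sketch of how a multilayer degree-corrected stochastic blockmodel is commonly written. The notation ($n$, $L$, $K$, $z$, $\theta$, $B$, $A$) is an assumption chosen for illustration and is not taken from the abstract; the paper's own notation and exact conditions may differ.

```latex
% Illustrative notation only (assumed, not quoted from the paper).
% n vertices, L layers, K communities.
% z(i) in {1,...,K}: community of vertex i, shared by all layers.
% \theta^{(l)}_i > 0: degree-correction parameter of vertex i in layer l.
% B^{(l)}: K x K symmetric block connection probability matrix of layer l.
% A^{(l)}: adjacency matrix of layer l.
\[
  \mathbb{P}\bigl(A^{(l)}_{ij} = 1\bigr)
  = \theta^{(l)}_i \, \theta^{(l)}_j \, B^{(l)}_{z(i)\, z(j)},
  \qquad 1 \le i < j \le n, \quad l = 1, \dots, L .
\]
```

In this sketch only the membership map $z$ is common to all layers, while $\theta^{(l)}$ and $B^{(l)}$ carry a layer index. That matches the abstract's statement that the networks share the same community structure but may differ in degree corrections and block connection probability matrices.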
