Paper Title
ECoHeN: A Hypothesis Testing Framework for Extracting Communities from Heterogeneous Networks
Paper Authors
Paper Abstract
Community discovery is the general process of attaining assortative communities from a network: collections of nodes that are densely connected within yet sparsely connected to the rest of the network. While community discovery has been well studied, few such techniques exist for heterogeneous networks, which contain different types of nodes and possibly different connectivity patterns between the node types. In this paper, we introduce a framework called ECoHeN, which \textbf{e}xtracts \textbf{co}mmunities from a \textbf{he}terogeneous \textbf{n}etwork in a statistically meaningful way. Using a heterogeneous configuration model as a reference distribution, ECoHeN identifies communities that are significantly more densely connected than expected given the node types and connectivity of their members. Specifically, the ECoHeN algorithm extracts communities one at a time through a dynamic set of iterative updating rules, is guaranteed to converge, and imposes no constraints on the type composition of extracted communities. To our knowledge, this is the first discovery method that distinguishes and identifies both homogeneous and heterogeneous, possibly overlapping, community structure in a network. We demonstrate the performance of ECoHeN through simulation and in application to a political blogs network, where it identifies collections of blogs that reference one another more than expected considering the ideology of their members.
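To make the idea of a reference distribution concrete, the following is a minimal sketch of comparing a candidate community's observed internal edge count with the count expected under a degree-based, type-aware null model. The abstract does not state the model's formulas, so the expectation used here is an assumption: for a node u of type a and a node v of type b, the expected number of edges is taken to be d_u(b) * d_v(a) / m_ab when a and b differ, and d_u(a) * d_v(a) / (2 * m_aa) when they match, where d_u(b) counts u's neighbors of type b and m_ab counts edges between the two types. All function names and the toy data are hypothetical, and this is not the ECoHeN significance test or its updating rules.

```python
from collections import defaultdict
from itertools import combinations


def typed_degrees(edges, node_type):
    """Return deg[u][t]: the number of neighbors of u that have type t."""
    deg = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        deg[u][node_type[v]] += 1
        deg[v][node_type[u]] += 1
    return deg


def type_pair_edge_counts(edges, node_type):
    """Return m[(a, b)]: the number of edges joining type a to type b (key sorted)."""
    m = defaultdict(int)
    for u, v in edges:
        m[tuple(sorted((node_type[u], node_type[v])))] += 1
    return m


def observed_and_expected_internal_edges(edges, node_type, community):
    """Compare internal edges of `community` with the assumed null expectation."""
    deg = typed_degrees(edges, node_type)
    m = type_pair_edge_counts(edges, node_type)
    members = set(community)

    # Observed: edges with both endpoints inside the candidate community.
    observed = sum(1 for u, v in edges if u in members and v in members)

    # Expected: sum the assumed pairwise expectation over all member pairs.
    expected = 0.0
    for u, v in combinations(sorted(members), 2):
        a, b = node_type[u], node_type[v]
        m_ab = m[tuple(sorted((a, b)))]
        if m_ab == 0:
            continue  # no edges exist between these two types
        denom = 2 * m_ab if a == b else m_ab
        expected += deg[u][b] * deg[v][a] / denom
    return observed, expected


if __name__ == "__main__":
    # Toy network (hypothetical): three type-"L" nodes and two type-"C" nodes.
    node_type = {"l1": "L", "l2": "L", "l3": "L", "c1": "C", "c2": "C"}
    edges = [("l1", "l2"), ("l1", "l3"), ("l2", "l3"), ("c1", "c2"), ("l3", "c1")]
    print(observed_and_expected_internal_edges(edges, node_type, {"l1", "l2", "l3"}))
```

A candidate set whose observed count clearly exceeds the expected count is the kind of set the abstract calls more densely connected than expected. Judging whether the excess is statistically significant would require the full test, which this sketch does not attempt.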