Paper Title

Sampling Online Social Networks: Metropolis Hastings Random Walk and Random Walk

Author

Qi, Xiao

Abstract

Social network analysis (SNA) has drawn much attention in recent years, but one bottleneck is that network data are often too massive to handle. Furthermore, some network data are inaccessible because of privacy restrictions. We therefore need sampling methods that draw representative sample graphs from the population graph. This paper introduces two sampling strategies, Metropolis-Hastings Random Walk (MHRW) and Random Walk with Jumps (RWwJ), covering the procedure for collecting nodes, the underlying mathematical theory, and the corresponding estimators. Comparing our methods with existing research outcomes, we find that MHRW performs better when estimating the degree distribution (61% less error than RWwJ) and the graph order (0.69% less error than RWwJ), while RWwJ gives better estimates of the average follower-to-following ratio and of the proportion of mutual relationships among adjacent relationships, with 13% and 6% less error than MHRW, respectively. We analyze the reasons for these outcomes and suggest possible directions for future work.
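
The abstract does not spell out the node-collection procedure, so the following is only a minimal sketch of the textbook Metropolis-Hastings random walk on an undirected graph, not the author's implementation. The function name `mhrw_sample`, the adjacency-dictionary input, and the parameter names are all assumptions made for illustration. At each step the walk proposes a uniformly chosen neighbor and accepts it with probability min(1, deg(current) / deg(candidate)), which corrects the plain random walk's bias toward high-degree nodes and makes the stationary distribution uniform over nodes.

```python
import random


def mhrw_sample(adj, start, steps, rng=None):
    """Textbook MHRW sketch (hypothetical helper, not the paper's code).

    adj   : dict mapping each node to a non-empty list of its neighbors
            (undirected graph assumed)
    start : node at which the walk begins
    steps : number of transitions to attempt
    rng   : optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    current = start
    visited = [current]
    for _ in range(steps):
        # Propose a neighbor chosen uniformly at random.
        candidate = rng.choice(adj[current])
        # Accept with probability min(1, deg(current) / deg(candidate));
        # on rejection the walk stays put and the node is recorded again.
        if rng.random() < len(adj[current]) / len(adj[candidate]):
            current = candidate
        visited.append(current)
    return visited
```

Repeated entries in the returned list are intentional: rejected proposals count as another visit to the current node, and dropping them would reintroduce the degree bias the acceptance step removes.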
