论文标题

个性化图形摘要:配方,可扩展算法和应用程序

Personalized Graph Summarization: Formulation, Scalable Algorithms, and Applications

论文作者

Kang, Shinhwan, Lee, Kyuhan, Shin, Kijung

论文摘要

在线社交网络的用户是否对网络中的所有连接感兴趣?如果没有,我们如何获得个性化特定用户的网络摘要?我们可以将摘要用于近似查询回答吗? 随着大量图(例如在线社交网络,超链接网络和道路网络)的普遍性,图表压缩对于具有有限资源的有效处理的有效处理变得非常重要。图摘要是一种广泛研究的有损压缩方法。它提供了一个摘要图,其中将具有相似连接性的节点合并为超节点,并且可以从摘要图中回答各种图形查询。 在这项工作中,我们引入了一个新问题,即个性化的图形摘要,其目的是获得摘要图,其中更多重点放在接近给定的目标节点集的连接上。然后,我们提出了Pegasus,这是该问题的线性时间算法。 Through experiments on six real-world graphs, we demonstrate that PeGaSus is (a) Effective: node-similarity queries for target nodes can be answered significantly more accurately from personalized summary graphs than from non-personalized ones of similar size, (b) Scalable: it summarizes graphs with up to one billion edges, and (c) Applicable to distributed multi-query answering: it successfully replaces graph partitioning for无通信的多质量处理。

Are users of an online social network interested equally in all connections in the network? If not, how can we obtain a summary of the network personalized to specific users? Can we use the summary for approximate query answering? As massive graphs (e.g., online social networks, hyperlink networks, and road networks) have become pervasive, graph compression has gained importance for the efficient processing of such graphs with limited resources. Graph summarization is an extensively-studied lossy compression method. It provides a summary graph where nodes with similar connectivity are merged into supernodes, and a variety of graph queries can be answered approximately from the summary graph. In this work, we introduce a new problem, namely personalized graph summarization, where the objective is to obtain a summary graph where more emphasis is put on connections closer to a given set of target nodes. Then, we propose PeGaSus, a linear-time algorithm for the problem. Through experiments on six real-world graphs, we demonstrate that PeGaSus is (a) Effective: node-similarity queries for target nodes can be answered significantly more accurately from personalized summary graphs than from non-personalized ones of similar size, (b) Scalable: it summarizes graphs with up to one billion edges, and (c) Applicable to distributed multi-query answering: it successfully replaces graph partitioning for communication-free multi-query processing.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源