论文标题
较少但更好:与异常值进行社区搜索
Better Fewer but Better: Community Search with Outliers
论文作者
论文摘要
鉴于网络中的一组顶点,我们认为对正在分析的应用程序感兴趣,社区搜索是产生一个子图的问题,该子图可能解释了感兴趣的顶点之间存在的关系。在实践中,这意味着解决方案应该在查询的顶点中添加一些顶点,因此要创建一个具有“凝聚力”属性的连接子图。近年来,这个问题受到了越来越多的关注:尽管已经研究了几种凝聚力功能,但大部分文献都寻找包含所有查询顶点的解决方案子图。但是,在许多探索性分析中,我们可能只对感兴趣的顶点有一个合理的信念:如果只有一个人与其他人并不真正相关,迫使解决方案包括所有这些可能会隐藏存在更具凝聚力和有意义的子用品的存在,那么我们可以通过允许解决方案来检测和丢弃外距离偏见。在本文中,我们研究了与离群值的社区搜索问题,在该问题中,我们可以降低$ k $ Query的顶点,而$ k $是输入参数。我们考虑了三种最常用的凝聚力度量:最小程度,子图的直径和带有查询顶点的最大距离。通过优化一个并将其他一个作为约束,我们获得了三个优化问题:我们研究它们的硬度,并提出了不同的精确和近似算法。
Given a set of vertices in a network, that we believe are of interest for the application under analysis, community search is the problem of producing a subgraph potentially explaining the relationships existing among the vertices of interest. In practice this means that the solution should add some vertices to the query ones, so to create a connected subgraph that exhibits some "cohesiveness" property. This problem has received increasing attention in recent years: while several cohesiveness functions have been studied, the bulk of the literature looks for a solution subgraphs containing all the query vertices. However, in many exploratory analyses we might only have a reasonable belief about the vertices of interest: if only one of them is not really related to the others, forcing the solution to include all of them might hide the existence of much more cohesive and meaningful subgraphs, that we could have found by allowing the solution to detect and drop the outlier vertex. In this paper we study the problem of community search with outliers, where we are allowed to drop up to $k$ query vertices, with $k$ being an input parameter. We consider three of the most used measures of cohesiveness: the minimum degree, the diameter of the subgraph and the maximum distance with a query vertex. By optimizing one and using one of the others as a constraint we obtain three optimization problems: we study their hardness and we propose different exact and approximation algorithms.