论文标题
基于图的近似NN搜索:重新访问
Graph-based Approximate NN Search: A Revisit
论文作者
论文摘要
最近的邻居搜索在许多学科中起着基本作用,例如多媒体信息检索,数据挖掘和机器学习。在最近的研究中,基于图的搜索方法比其他类型的方法表现出优越的性能。在本文中,重新访问了基于图的NN搜索。我们优化了该方法中的两个关键组件,即搜索过程和支持搜索的图。对于图形构建,提出了两阶段的分散方案,这可以在基于其建立的搜索过程的效率和可达到性之间进行良好的权衡。此外,拟议的多元化方案允许搜索过程动态决定在一个节点的社区中应访问多少个节点。通过这种方式,当搜索在不同情况下进行搜索时,设备的计算能力将被充分利用。此外,针对GPU上的大小批次查询分别设计了两个NN搜索程序。优化的NN搜索在两阶段多元化图的支持下,在所有考虑的大规模数据集中都优于CPU和GPU上的所有最新方法。
Nearest neighbor search plays a fundamental role in many disciplines such as multimedia information retrieval, data-mining, and machine learning. The graph-based search approaches show superior performance over other types of approaches in recent studies. In this paper, the graph-based NN search is revisited. We optimize two key components in the approach, namely the search procedure and the graph that supports the search. For the graph construction, a two-stage graph diversification scheme is proposed, which makes a good trade-off between the efficiency and reachability for the search procedure that builds upon it. Moreover, the proposed diversification scheme allows the search procedure to decide dynamically how many nodes should be visited in one node's neighborhood. By this way, the computing power of the devices is fully utilized when the search is carried out under different circumstances. Furthermore, two NN search procedures are designed respectively for small and large batch queries on the GPU. The optimized NN search, when being supported by the two-stage diversified graph, outperforms all the state-of-the-art approaches on both the CPU and the GPU across all the considered large-scale datasets.