Paper Title
HQANN: Efficient and Robust Similarity Search for Hybrid Queries with Structured and Unstructured Constraints
Paper Authors
Paper Abstract
In-memory approximate nearest neighbor search (ANNS) algorithms have achieved great success in fast, high-recall query processing, but they are extremely inefficient when handling hybrid queries with unstructured (i.e., feature vectors) and structured (i.e., related attributes) constraints. In this paper, we present HQANN, a simple yet highly efficient hybrid query processing framework that can be easily embedded into existing proximity graph-based ANNS algorithms. We guarantee both low latency and high recall by leveraging navigation sense among attributes and fusing vector similarity search with attribute filtering. Experimental results on both public and in-house datasets demonstrate that HQANN is 10x faster than state-of-the-art hybrid ANNS solutions at the same recall quality, and that its performance is hardly affected by the complexity of the attributes. It can reach 99% recall@10 in just around 50 microseconds on GLOVE-1.2M with thousands of attribute constraints.
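To make the terms in the abstract concrete, the following is a minimal sketch of an exact (brute-force) hybrid query and of the recall@k measure. It is not HQANN and not the authors' code: the function names `hybrid_search` and `recall_at_k`, the exact-match attribute rule, the Euclidean distance, and the toy data are all assumptions chosen for illustration. A baseline like this is what an approximate method would typically be scored against.

```python
import math


def hybrid_search(query_vec, query_attrs, items, k):
    """Exact hybrid search (illustrative baseline, not HQANN).

    Keeps only items whose attributes equal query_attrs (structured
    constraint), then ranks them by Euclidean distance to query_vec
    (unstructured constraint) and returns the ids of the k closest.
    """
    matches = [
        (math.dist(query_vec, vec), item_id)
        for item_id, vec, attrs in items
        if attrs == query_attrs
    ]
    matches.sort()
    return [item_id for _, item_id in matches[:k]]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k ids that the approximate result recovered."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Hypothetical toy data: (id, feature vector, attribute tuple).
items = [
    (0, (0.0, 0.0), ("red", "small")),
    (1, (1.0, 0.0), ("red", "small")),
    (2, (0.1, 0.0), ("blue", "small")),  # close in vector space, wrong attributes
    (3, (3.0, 4.0), ("red", "small")),
]

exact = hybrid_search((0.0, 0.0), ("red", "small"), items, k=2)
print(exact)                        # [0, 1]
print(recall_at_k([0, 3], exact))   # 0.5
```

Item 2 is the nearest neighbor of item 0 by vector distance alone but is excluded because its attributes do not match, which is the situation that makes hybrid queries harder than plain vector search.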