Paper Title

The "AI+R"-tree: An Instance-optimized R-tree

Authors

Abdullah-Al-Mamun; Haider, Ch. Md. Rakin; Wang, Jianguo; Aref, Walid G.

Abstract

The emerging class of instance-optimized systems has shown potential to achieve high performance by specializing to specific data and query workloads. In particular, Machine Learning (ML) techniques have been applied successfully to build various instance-optimized components (e.g., learned indexes). This paper investigates leveraging ML techniques to enhance the performance of spatial indexes, particularly the R-tree, for given data and query workloads. Because the areas covered by the R-tree index nodes overlap in space, searching for a specific point may explore multiple paths from root to leaf. In the worst case, the entire R-tree could be searched. In this paper, we define and use the overlap ratio to quantify the degree of extraneous leaf node accesses required by a range query. The goal is to enhance the query performance of a traditional R-tree for high-overlap range queries, as they tend to incur long running times. We introduce a new AI-tree that transforms the search operation of an R-tree into a multi-label classification task to exclude the extraneous leaf node accesses. Then, we combine a traditional R-tree with the AI-tree to form a hybrid "AI+R"-tree. The "AI+R"-tree can automatically differentiate between high- and low-overlap queries using a learned model. Thus, the "AI+R"-tree processes high-overlap queries using the AI-tree, and low-overlap queries using the R-tree. Experiments on real datasets demonstrate that the "AI+R"-tree can enhance the query performance over a traditional R-tree by up to 500%.
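To make the notion of extraneous leaf node accesses concrete, the following is a minimal Python sketch. It is not the paper's implementation, and it does not reproduce the paper's exact definition of the overlap ratio, which the abstract does not state. The names `Rect`, `Leaf`, `leaf_access_stats`, and `extraneous_fraction` are hypothetical, and a flat list of leaves stands in for an R-tree traversal. The assumption is that a range query visits every leaf whose bounding rectangle intersects the query, while only some of those leaves actually hold matching points.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle, used for both leaf bounding boxes and range queries."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def intersects(self, other: "Rect") -> bool:
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax)

    def contains(self, p: Point) -> bool:
        return self.xmin <= p[0] <= self.xmax and self.ymin <= p[1] <= self.ymax


@dataclass
class Leaf:
    """Hypothetical leaf node: a bounding rectangle plus the points stored in it."""
    mbr: Rect
    points: List[Point]


def leaf_access_stats(leaves: List[Leaf], query: Rect) -> Tuple[int, int]:
    """Return (visited, useful) leaf counts for a range query.

    visited: leaves whose bounding rectangle intersects the query.
    useful:  visited leaves that hold at least one point inside the query.
    """
    visited = [leaf for leaf in leaves if leaf.mbr.intersects(query)]
    useful = [leaf for leaf in visited
              if any(query.contains(p) for p in leaf.points)]
    return len(visited), len(useful)


def extraneous_fraction(leaves: List[Leaf], query: Rect) -> float:
    """Illustrative measure (an assumption, not the paper's overlap ratio):
    the share of visited leaves that contribute no results."""
    visited, useful = leaf_access_stats(leaves, query)
    return 0.0 if visited == 0 else (visited - useful) / visited


# Three leaves; the bounding rectangles of the first two overlap.
leaves = [
    Leaf(Rect(0, 0, 4, 4), [(1, 1), (4, 4)]),
    Leaf(Rect(3, 3, 8, 8), [(3, 8), (8, 3)]),
    Leaf(Rect(10, 10, 12, 12), [(10, 10), (12, 12)]),
]
query = Rect(3.5, 3.5, 4.5, 4.5)

visited, useful = leaf_access_stats(leaves, query)
print(visited, useful, extraneous_fraction(leaves, query))
```

In this toy setup the query touches the bounding rectangles of the first two leaves, but only the first one holds a matching point, so the second access is extraneous. Accesses of this kind are what the abstract says the AI-tree aims to exclude by treating the search as a multi-label classification task.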
