Paper Title
GLIN: A (G)eneric (L)earned (In)dexing Mechanism for Complex Geometries
Paper Authors
Paper Abstract
Although spatial indexes shorten query response time, they rely on complex tree structures to narrow down the search space. Such structures in turn incur additional storage overhead and take a toll on index maintenance. Recently, there has been a flurry of efforts attempting to leverage machine-learning (ML) models to simplify index structures. However, existing learned geospatial indexes can only index point data rather than complex geometries such as polygons and trajectories, which are widely available in geospatial data. As a result, they cannot efficiently and correctly answer geometry relationship queries. This paper introduces GLIN, an indexing mechanism for spatial relationship queries on complex geometries. To achieve this, GLIN transforms geometries into Z-address intervals, and then harnesses an existing order-preserving learned index to model the cumulative distribution function between these intervals and the record positions. The lightweight learned index greatly reduces indexing overhead and provides faster or comparable query latency. Most importantly, GLIN augments spatial query windows to support exact query answers for common spatial relationships. Our experiments on real-world and synthetic datasets show that GLIN has 80%-90% lower storage overhead than Quad-Tree and 60%-80% lower than R-Tree, with 30%-70% faster queries at medium selectivity. Moreover, GLIN's maintenance throughput is 1.5 times higher on insertion and 3-5 times higher on deletion.
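The abstract's first step, mapping a geometry to a Z-address interval, can be illustrated with a minimal sketch. This is not GLIN's actual implementation: the grid resolution, the bit-interleaving helper, and the use of the bounding box's min/max corners are assumptions made for this example. Because the Morton (Z-order) code is monotone under componentwise dominance, the codes of the two corners bound the codes of every grid cell inside the box, yielding one sortable interval per geometry.

```python
# Hypothetical illustration of a geometry-to-Z-address-interval transform.
# Cell size and bit width are arbitrary choices for this sketch.

def interleave(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y into a single Z-address (Morton code)."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)      # x occupies even bit positions
        z |= ((y >> i) & 1) << (2 * i + 1)  # y occupies odd bit positions
    return z

def z_interval(min_x: float, min_y: float,
               max_x: float, max_y: float,
               cell: float = 1.0) -> tuple[int, int]:
    """Z-address interval covering a geometry's bounding box on a uniform grid."""
    lo = interleave(int(min_x / cell), int(min_y / cell))
    hi = interleave(int(max_x / cell), int(max_y / cell))
    return lo, hi

# A polygon's bounding box becomes one interval; a learned index can then
# model the CDF of interval endpoints to locate candidate records.
lo, hi = z_interval(2.0, 3.0, 5.0, 6.0)
```

Note that the interval is a covering, not an exact match: it may include Z-addresses of cells outside the box, so candidates still need a refinement check against the actual geometry, consistent with the query-window augmentation the abstract describes.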