论文标题

在插入密集型数据库上弥合理论与实践之间的差距

Bridging the Gap Between Theory and Practice on Insertion-Intensive Database

论文作者

Zeighami, Sepanta, Wong, Raymond Chi-Wing

论文摘要

随着在线平台的普遍性,如今,用户正在以非常高的速度生成和访问数据。此外,诸如股票交易或高频交易之类的应用程序需要保证在数据库上执行操作的较低延迟。对于设计数据库而言,这是可观的,这些数据库可以保证数据插入和查询的速度始终高,而无需在插入过程中引入任何长时间的延迟。在本文中,我们提出了嵌套的B树(NB-Trees),该指数可以在大量数据上达到一致的高插入率,同时提供渐近最佳的最佳查询性能,在实践中非常有效。嵌套的B树以高于LSM-Trees的速度支持插入,这是插入密集型工作负载的最新指数,同时避免了他们的长时间插入延迟并改善其查询性能。当与Bloom过滤器相辅相成时,他们接近B-Trees的查询性能。在我们的实验中,NB-Trees的延迟最大的延迟比LevelDB,RockSDB和BLSM高1000,通常使用的LSM-TREE数据存储可能比LevelDB快4倍,并且比BLSM和RockSDB快1.5倍以上,而在平均插入率上也超过了它们。

With the prevalence of online platforms, today, data is being generated and accessed by users at a very high rate. Besides, applications such as stock trading or high frequency trading require guaranteed low delays for performing an operation on a database. It is consequential to design databases that guarantee data insertion and query at a consistently high rate without introducing any long delay during insertion. In this paper, we propose Nested B-trees (NB-trees), an index that can achieve a consistently high insertion rate on large volumes of data, while providing asymptotically optimal query performance that is very efficient in practice. Nested B-trees support insertions at rates higher than LSM-trees, the state-of-the-art index for insertion-intensive workloads, while avoiding their long insertion delays and improving on their query performance. They approach the query performance of B-trees when complemented with Bloom filters. In our experiments, NB-trees had worst-case delays up to 1000 smaller than LevelDB, RocksDB and bLSM, commonly used LSM-tree data-stores, could perform queries more than 4 times faster than LevelDB and 1.5 times faster than bLSM and RocksDB, while also outperforming them in terms of average insertion rate.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源