Paper Title
Proteus: A Self-Designing Range Filter
Authors
Abstract
We introduce Proteus, a novel self-designing approximate range filter, which configures itself based on sampled data in order to optimize its false positive rate (FPR) for a given space requirement. Proteus unifies the probabilistic and deterministic design spaces of state-of-the-art range filters to achieve robust performance across a larger variety of use cases. At the core of Proteus lies our Contextual Prefix FPR (CPFPR) model - a formal framework for the FPR of prefix-based filters across their design spaces. We empirically demonstrate the accuracy of our model and Proteus' ability to optimize over both synthetic workloads and real-world datasets. We further evaluate Proteus in RocksDB and show that it is able to improve end-to-end performance by as much as 5.3x over more brittle state-of-the-art methods such as SuRF and Rosetta. Our experiments also indicate that the cost of modeling is not significant compared to the end-to-end performance gains and that Proteus is robust to workload shifts.
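To make the space/FPR trade-off in the abstract concrete, here is a minimal sketch of a generic prefix-based range filter. It is not the Proteus design or the CPFPR model. The class and function names (`PrefixRangeFilter`, `empirical_fpr`), the fixed-width integer keys, and the exact Python set standing in for a Bloom filter or trie are all assumptions made for illustration.

```python
class PrefixRangeFilter:
    """Toy prefix-based range filter over fixed-width unsigned integer keys.

    Hypothetical illustration only: an exact set of key prefixes stands in
    for the Bloom filter or trie that a real range filter would use.
    """

    def __init__(self, keys, key_bits=16, prefix_bits=8):
        if not 0 < prefix_bits <= key_bits:
            raise ValueError("prefix_bits must be in 1..key_bits")
        # Number of low-order bits dropped to obtain a prefix.
        self.shift = key_bits - prefix_bits
        self.prefixes = {k >> self.shift for k in keys}

    def may_contain_range(self, lo, hi):
        """Return False only when no stored key can lie in [lo, hi]."""
        first = lo >> self.shift
        last = hi >> self.shift
        # Any key in [lo, hi] has a prefix in [first, last], so probing
        # every prefix in that interval never yields a false negative.
        return any(p in self.prefixes for p in range(first, last + 1))


def empirical_fpr(filt, keys, queries):
    """Fraction of truly empty range queries that the filter fails to reject."""
    empty = [(lo, hi) for lo, hi in queries
             if not any(lo <= k <= hi for k in keys)]
    if not empty:
        return 0.0
    false_positives = sum(filt.may_contain_range(lo, hi) for lo, hi in empty)
    return false_positives / len(empty)


keys = [0x1234, 0x5678]
queries = [(0x1200, 0x1210), (0x3000, 0x3010), (0x1230, 0x1240)]

coarse = PrefixRangeFilter(keys, key_bits=16, prefix_bits=8)
fine = PrefixRangeFilter(keys, key_bits=16, prefix_bits=16)

print(empirical_fpr(coarse, keys, queries))  # 0.5
print(empirical_fpr(fine, keys, queries))    # 0.0
```

In this toy setting, storing longer prefixes costs more space but rejects more empty ranges. Choosing prefix lengths from sampled keys and queries under a space budget is the kind of configuration decision the abstract describes Proteus as automating.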