Paper Title
Selectivity Estimation of Inequality Joins in Databases
Authors
Abstract
Selectivity estimation refers to the ability of the SQL query optimizer to estimate the size of the result of a predicate in a query. It is the main calculation on which the optimizer bases its choice of the cheapest plan to execute. Although the problem has been known since the mid-1970s, we were surprised to find no solutions in the literature for the selectivity estimation of inequality joins. Testing four common database systems (Oracle, SQL Server, PostgreSQL, and MySQL), we found that the open-source systems PostgreSQL and MySQL lack this estimation. Oracle and SQL Server make fairly accurate estimations, yet their algorithms are undisclosed. This paper thus proposes an algorithm for inequality join selectivity estimation. The proposed algorithm has been implemented in PostgreSQL and submitted as a patch for inclusion in upcoming releases.
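To make the quantity being estimated concrete, the following is a minimal Python sketch of the selectivity of an inequality join R.a < S.b, defined here as the number of matching pairs divided by |R| · |S|. It compares the exact value with an estimate computed from equi-depth histogram bounds. This is an illustration only and is not the algorithm proposed in the paper, which the abstract does not describe. All function names, the choice of 10 buckets, the assumption of uniformity within a bucket, and the trapezoid approximation are assumptions made for this example.

```python
from bisect import bisect_left, bisect_right


def exact_lt_join_selectivity(r_values, s_values):
    """Fraction of (r, s) pairs with r < s, i.e. matches / (|R| * |S|)."""
    if not r_values or not s_values:
        return 0.0
    r_sorted = sorted(r_values)
    # For each s, bisect_left counts the r values strictly smaller than s.
    matches = sum(bisect_left(r_sorted, s) for s in s_values)
    return matches / (len(r_values) * len(s_values))


def equi_depth_bounds(values, buckets):
    """Boundaries chosen so that each bucket holds roughly the same number of values."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return [ordered[(i * last) // buckets] for i in range(buckets + 1)]


def cdf(bounds, x):
    """Approximate P(value < x), assuming values are uniform inside each bucket."""
    buckets = len(bounds) - 1
    if x <= bounds[0]:
        return 0.0
    if x >= bounds[-1]:
        return 1.0
    i = bisect_right(bounds, x) - 1  # bounds[i] <= x < bounds[i + 1]
    frac = (x - bounds[i]) / (bounds[i + 1] - bounds[i])
    return (i + frac) / buckets


def estimated_lt_join_selectivity(r_bounds, s_bounds):
    """Estimate P(r < s) by averaging R's approximate CDF over each S bucket."""
    buckets = len(s_bounds) - 1
    total = 0.0
    for lo, hi in zip(s_bounds, s_bounds[1:]):
        # Trapezoid rule: an approximation, exact only if R's CDF is linear on [lo, hi].
        total += (cdf(r_bounds, lo) + cdf(r_bounds, hi)) / 2
    return total / buckets


# Hypothetical data: two integer columns with partially overlapping ranges.
r_column = list(range(0, 1000))
s_column = list(range(500, 1500))

exact = exact_lt_join_selectivity(r_column, s_column)
estimate = estimated_lt_join_selectivity(
    equi_depth_bounds(r_column, 10), equi_depth_bounds(s_column, 10)
)
print(f"exact={exact:.5f} estimate={estimate:.5f}")
```

The exact computation needs both full columns, whereas the estimate uses only 11 boundary values per column. That is the kind of compact statistic an optimizer can afford to consult at planning time.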