论文标题

连接二进制关系的近乎最佳的平行算法

A Near-Optimal Parallel Algorithm for Joining Binary Relations

论文作者

Ketsman, Bas, Suciu, Dan, Tao, Yufei

论文摘要

我们在大规模平行计算(MPC)模型中提出了一种恒定算法,用于评估每个输入关系具有两个属性的自然连接。 Our algorithm achieves a load of $\tilde{O}(m/p^{1/ρ})$ where $m$ is the total size of the input relations, $p$ is the number of machines, $ρ$ is the join's fractional edge covering number, and $\tilde{O}(.)$ hides a polylogarithmic factor.该载荷与已知的下限匹配到聚类因子。所提出的算法的核心是一种新定理(我们将其命名为“孤立的笛卡尔产物定理”),它为问题的数学结构提供了新的见解。我们的结果意味着,目标是在MPC模型中最佳地解决恒定尺寸子图模式的所有发生的子图枚举问题。

We present a constant-round algorithm in the massively parallel computation (MPC) model for evaluating a natural join where every input relation has two attributes. Our algorithm achieves a load of $\tilde{O}(m/p^{1/ρ})$ where $m$ is the total size of the input relations, $p$ is the number of machines, $ρ$ is the join's fractional edge covering number, and $\tilde{O}(.)$ hides a polylogarithmic factor. The load matches a known lower bound up to a polylogarithmic factor. At the core of the proposed algorithm is a new theorem (which we name the "isolated cartesian product theorem") that provides fresh insight into the problem's mathematical structure. Our result implies that the subgraph enumeration problem, where the goal is to report all the occurrences of a constant-sized subgraph pattern, can be settled optimally (up to a polylogarithmic factor) in the MPC model.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源