Paper title
Connecting MapReduce Computations to Realistic Machine Models
Paper authors
Paper abstract
We explain how the popular, highly abstract MapReduce model of parallel computation (MRC) can be rooted in reality by showing how it can be simulated on realistic distributed-memory parallel machine models such as BSP. We first refine the model (MRC$^+$) to include parameters for total work $w$, bottleneck work $\hat{w}$, data volume $m$, and maximum object size $\hat{m}$. We then show matching upper and lower bounds for executing a MapReduce computation on a distributed-memory machine: $\Theta(w/p+\hat{w}+\log p)$ work and $\Theta(m/p+\hat{m}+\log p)$ bottleneck communication volume using $p$ processors.
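To make the shape of these bounds concrete, the following minimal sketch evaluates the three terms inside $\Theta(w/p+\hat{w}+\log p)$ for given parameter values and reports which one dominates. It is an illustration only, not part of the paper: the function names `bound_terms` and `dominant_term` are hypothetical, the logarithm base 2 is an assumption (the $\Theta$ notation hides both the base and all constant factors), and the numbers are invented. The same functions apply to the communication bound by passing $m$ and $\hat{m}$ in place of $w$ and $\hat{w}$.

```python
import math


def bound_terms(total, bottleneck, p):
    """Return the three terms inside Theta(total/p + bottleneck + log p).

    total      -- total work w (or data volume m)
    bottleneck -- bottleneck work w-hat (or maximum object size m-hat)
    p          -- number of processors
    Base-2 logarithm is an assumption; Theta hides the base and constants.
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    return {
        "balanced": total / p,      # perfectly divided share per processor
        "bottleneck": bottleneck,   # largest indivisible piece
        "log_p": math.log2(p),      # coordination overhead term
    }


def dominant_term(total, bottleneck, p):
    """Name the largest of the three terms (first one wins on ties)."""
    terms = bound_terms(total, bottleneck, p)
    return max(terms, key=terms.get)


if __name__ == "__main__":
    # Invented example values: w = 2**20 units of work, w-hat = 64.
    for p in (2**10, 2**16):
        print(p, bound_terms(2**20, 64, p), dominant_term(2**20, 64, p))
```

With these invented values, the evenly divided share $w/p$ dominates at $p = 2^{10}$, while at $p = 2^{16}$ the bottleneck term $\hat{w}$ takes over, so adding processors beyond that point cannot reduce the bound further.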