Paper Title
Specializing Coherence, Consistency, and Push/Pull for GPU Graph Analytics
Paper Authors
Abstract
This work provides the first study to explore the interaction of update propagation with and without fine-grained synchronization (push vs. pull), emerging coherence protocols (GPU vs. DeNovo coherence), and software-centric consistency models (DRF0, DRF1, and DRFrlx) for graph workloads on emerging integrated GPU-CPU systems with native unified shared memory. We study 6 graph applications with 6 graph inputs, for a total of 36 workloads, running on 12 system (hardware+software) configurations that reflect the above design space of update propagation, coherence, and memory consistency. We make three key contributions. First, we show that there is no single best system configuration for all workloads, motivating systems with flexible coherence and consistency support. Second, we develop a model to accurately predict the best system configuration; software designers can use this model to decide on push vs. pull and on the consistency model, and flexible hardware can use it to invoke the appropriate coherence and consistency configuration for a given workload. Third, we show that the design dimensions explored here are interdependent, reinforcing the need for software-hardware co-design across them. For example, software designers deciding on push vs. pull must consider the consistency model supported by hardware: in some cases, push may be better if hardware supports DRFrlx, while pull may be better if it does not.
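To make the push vs. pull distinction concrete, the following is a minimal sequential sketch in Python, not code from the paper. It assumes a PageRank-like update in which each vertex splits its value evenly among its out-neighbors; the function names `push_step` and `pull_step` and the toy graph are hypothetical and chosen only for illustration. The comments mark where a parallel GPU version of the push variant would need fine-grained synchronization.

```python
from collections import defaultdict

def push_step(out_edges, values):
    """Push: each source vertex scatters its contribution to its out-neighbors."""
    acc = defaultdict(float)
    for src, dsts in out_edges.items():
        if not dsts:
            continue
        share = values[src] / len(dsts)
        for dst in dsts:
            # In a parallel version, several sources may update the same dst
            # at once, so this accumulation would have to be an atomic update.
            acc[dst] += share
    return {v: acc[v] for v in values}

def pull_step(out_edges, values):
    """Pull: each destination vertex gathers contributions from its in-neighbors."""
    # Build the reverse adjacency list so each vertex can find its in-neighbors.
    in_edges = defaultdict(list)
    for src, dsts in out_edges.items():
        for dst in dsts:
            in_edges[dst].append(src)
    new_values = {}
    for dst in values:
        # Each vertex writes only its own entry, so a parallel version
        # would need no fine-grained synchronization on the output.
        new_values[dst] = sum(values[src] / len(out_edges[src])
                              for src in in_edges[dst])
    return new_values

# Hypothetical 3-vertex graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
graph = {0: [1, 2], 1: [2], 2: [0]}
vals = {0: 1.0, 1: 1.0, 2: 1.0}
pushed = push_step(graph, vals)
pulled = pull_step(graph, vals)
```

Both variants compute the same result on this input. They differ in who performs the write: the source in push (contended, scattered writes) or the destination in pull (private writes, but reads over all in-edges), which is why the preferable variant can depend on the coherence and consistency support available.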