Paper Title

FIGARO: Improving System Performance via Fine-Grained In-DRAM Data Relocation and Caching

Authors

Yaohua Wang, Lois Orosa, Xiangjun Peng, Yang Guo, Saugata Ghose, Minesh Patel, Jeremie S. Kim, Juan Gómez-Luna, Mohammad Sadrosadati, Nika Mansouri Ghiasi, Onur Mutlu

Abstract

DRAM main memory is a performance bottleneck for many applications due to its high access latency. In-DRAM caches work to mitigate this latency by augmenting regular-latency DRAM with small-but-fast regions of DRAM that serve as a cache for the data held in the regular-latency region of DRAM. While an effective in-DRAM cache can allow a large fraction of memory requests to be served from a fast DRAM region, the latency savings are often hindered by inefficient mechanisms for relocating copies of data into and out of the fast regions. Existing in-DRAM caches have two sources of inefficiency: (1) the data relocation granularity is an entire multi-kilobyte row of DRAM; and (2) because the relocation latency increases with the physical distance between the slow and fast regions, multiple fast regions are physically interleaved among slow regions to reduce the relocation latency, resulting in increased hardware area and manufacturing complexity. We propose a new substrate, FIGARO, that uses existing shared global buffers among subarrays within a DRAM bank to provide support for in-DRAM data relocation across subarrays at the granularity of a single cache block. FIGARO has a distance-independent latency within a DRAM bank, and avoids complex modifications to DRAM. Using FIGARO, we design a fine-grained in-DRAM cache called FIGCache. The key idea of FIGCache is to cache only small, frequently-accessed portions of different DRAM rows in a designated region of DRAM. By caching only the parts of each row that are expected to be accessed in the near future, we can pack more of the frequently-accessed data into FIGCache, and can benefit from additional row hits in DRAM. Our evaluations show that FIGCache improves the average performance of a system using DDR4 DRAM by 16.3% and reduces average DRAM energy consumption by 7.8% for 8-core workloads, over a conventional system without in-DRAM caching.
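
The abstract's central claim, that caching small row segments packs more frequently-accessed data into a fixed-size fast region than caching whole rows, can be illustrated with a toy model. The sketch below is not the paper's mechanism or its evaluation setup. It is a minimal LRU simulation under assumed parameters: the row size, segment size, cache capacity, class name, and synthetic access trace are all hypothetical, and it ignores relocation latency, row-buffer behavior, and energy.

```python
from collections import OrderedDict

# Assumed, illustrative parameters (not taken from the paper).
BLOCKS_PER_ROW = 128      # e.g. an 8 KiB row of 64 B cache blocks
BLOCKS_PER_SEGMENT = 16   # hypothetical row-segment size


class LRUCache:
    """Toy LRU cache whose entries each cover `blocks_per_entry` blocks of one row."""

    def __init__(self, capacity_blocks, blocks_per_entry):
        self.max_entries = capacity_blocks // blocks_per_entry
        self.blocks_per_entry = blocks_per_entry
        self.entries = OrderedDict()
        self.hits = 0
        self.accesses = 0

    def access(self, row, block):
        # The cached unit is identified by the row and the entry-sized chunk within it.
        key = (row, block // self.blocks_per_entry)
        self.accesses += 1
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
        else:
            if len(self.entries) >= self.max_entries:
                self.entries.popitem(last=False)  # evict least recently used
            self.entries[key] = True

    def hit_rate(self):
        return self.hits / self.accesses if self.accesses else 0.0


def hot_segment_trace(num_rows, rounds):
    """Synthetic trace: every row has a few hot blocks, visited round-robin."""
    trace = []
    for _ in range(rounds):
        for row in range(num_rows):
            for block in range(4):
                trace.append((row, block))
    return trace


def compare(capacity_blocks=512, num_rows=16, rounds=10):
    """Return (row-granularity hit rate, segment-granularity hit rate)."""
    row_cache = LRUCache(capacity_blocks, BLOCKS_PER_ROW)
    seg_cache = LRUCache(capacity_blocks, BLOCKS_PER_SEGMENT)
    for row, block in hot_segment_trace(num_rows, rounds):
        row_cache.access(row, block)
        seg_cache.access(row, block)
    return row_cache.hit_rate(), seg_cache.hit_rate()


if __name__ == "__main__":
    print(compare())
```

With these assumed parameters, `compare()` returns `(0.75, 0.975)`: the row-granularity cache holds only 4 rows and thrashes across the 16 hot rows, while the segment-granularity cache holds the hot segment of every row after the first round. The size of the gap depends entirely on the synthetic trace and says nothing about the paper's measured 16.3% speedup.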
