Seculator: A Fast and Secure Neural Processing Unit

Paper title

Seculator: A Fast and Secure Neural Processing Unit

Authors

Nivedita Shrivastava, Smruti R. Sarangi

Abstract

Securing deep neural networks (DNNs) is a problem of significant interest since an ML model incorporates high-quality intellectual property, features of data sets painstakingly collated by mechanical turks, and novel methods of training on large cluster computers. Sadly, attacks to extract model parameters are on the rise, and thus designers are being forced to create architectures for securing such models. State-of-the-art proposals in this field take the deterministic memory access patterns of such networks into cognizance (albeit partially), group a set of memory blocks into a tile, and maintain state at the level of tiles (to reduce storage space). For providing integrity guarantees (tamper avoidance), they don't propose any significant optimizations, and still maintain block-level state. We observe that it is possible to exploit the deterministic memory access patterns of DNNs even further, and maintain state information for only the current tile and current layer, which may comprise a large number of tiles. This reduces the storage space, reduces the number of memory accesses, increases performance, and simplifies the design without sacrificing any security guarantees. The key techniques in our proposed accelerator architecture, Seculator, are to encode memory access patterns to create a small HW-based tile version number generator for a given layer, and to store layer-level MACs. We completely eliminate the need for having a MAC cache and a tile version number store (as used in related work). We show that using intelligently-designed mathematical operations, these structures are not required. By reducing such overheads, we show a speedup of 16% over the closest competing work.
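
The two ideas named in the abstract, generating tile version numbers from the known access pattern and keeping one MAC per layer, can be illustrated with a small software sketch. The abstract does not give the actual construction, so everything here is an assumption made for illustration: the function names are hypothetical, the version number is taken to be the index of the write pass, HMAC-SHA256 stands in for the hardware MAC, and per-tile MACs are folded into a single layer MAC with XOR.

```python
import hashlib
import hmac
from typing import Iterator, Sequence, Tuple

MAC_BYTES = 32  # SHA-256 digest size


def tile_versions(num_tiles: int, num_passes: int) -> Iterator[Tuple[int, int]]:
    """Yield (tile_id, version) in the fixed order a layer writes its tiles.

    Assumption: every tile is rewritten once per pass, so the version is
    simply the pass index and no per-tile version store is needed.
    """
    for version in range(num_passes):
        for tile_id in range(num_tiles):
            yield tile_id, version


def tile_mac(key: bytes, layer: int, tile_id: int, version: int, data: bytes) -> bytes:
    """MAC over one tile, bound to its layer, position, and version."""
    header = (
        layer.to_bytes(4, "big")
        + tile_id.to_bytes(4, "big")
        + version.to_bytes(4, "big")
    )
    return hmac.new(key, header + data, hashlib.sha256).digest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def layer_mac(key: bytes, layer: int, version: int, tiles: Sequence[bytes]) -> bytes:
    """Fold the per-tile MACs of a layer into a single value with XOR.

    XOR is order-independent, so the reader may fetch tiles in a different
    order than the writer produced them and still obtain the same value.
    """
    acc = bytes(MAC_BYTES)
    for tile_id, data in enumerate(tiles):
        acc = xor_bytes(acc, tile_mac(key, layer, tile_id, version, data))
    return acc


def verify_layer(key: bytes, layer: int, version: int,
                 tiles: Sequence[bytes], stored_mac: bytes) -> bool:
    """Recompute the layer MAC on read and compare it in constant time."""
    return hmac.compare_digest(layer_mac(key, layer, version, tiles), stored_mac)
```

In this sketch only one MAC per layer is kept on chip. A tile that has been modified, or replayed from an earlier version, changes the recomputed layer MAC, so `verify_layer` returns `False`.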
