Paper Title
HMT: A Hardware-Centric Hybrid Bonsai Merkle Tree Algorithm for High-Performance Authentication
Authors
Abstract
The Merkle tree (MT) is a widely used tree structure for authenticating data and metadata in secure computing systems. Recent state-of-the-art secure systems use a smaller MT, the Bonsai Merkle Tree (BMT), to protect metadata such as encryption counters. Common BMT algorithms were designed for traditional von Neumann architectures with a software-centric implementation in mind; hence they rely heavily on recursion and are often sequential in nature. However, modern heterogeneous computing platforms that employ Field-Programmable Gate Array (FPGA) devices require concurrency-focused algorithms to fully utilize the versatility and parallel nature of such systems. Our goal in this work is to introduce HMT, a hardware-friendly BMT algorithm that enables the verification and update processes to function independently and provides the benefits of relaxed update while remaining comparable to eager update in terms of update complexity. The methodology of HMT contributes both novel algorithm revisions and innovative hardware techniques for implementing BMT. We introduce a hybrid BMT algorithm that is hardware-targeted and parallel, and that relaxes the update depending on BMT cache hits while making the update conditions more flexible than those of lazy update to save additional write-backs. Deploying this new algorithm, we have designed a new BMT controller with a dataflow architecture, speculative buffers, and parallel write-back engines that allows multiple concurrent relaxed authentications. Our empirical performance measurements demonstrate that HMT can achieve up to a 7x improvement in bandwidth and a 4.5x reduction in latency over the baseline in subsystem-level tests. In a real secure-memory system on a Xilinx U200 accelerator FPGA, HMT exhibits up to 14% faster execution on standard benchmarks compared to the state-of-the-art BMT solution on FPGA.
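For readers unfamiliar with the baseline that the abstract contrasts against, the following is a minimal sketch of a generic binary Merkle tree with path verification and an eager (immediate, leaf-to-root) update. It is a textbook illustration, not the HMT algorithm or the paper's controller design. The class name `MerkleTree`, the helper `h`, the heap-style array layout, the choice of SHA-256, and the power-of-two leaf count are all assumptions made for this example.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper (assumption: SHA-256 stands in for the system's hash/MAC)."""
    return hashlib.sha256(data).digest()


class MerkleTree:
    """Generic binary Merkle tree in a flat heap layout; node 1 is the root.

    Hypothetical illustration only: `nodes` models untrusted off-chip storage,
    while `root` models the trusted on-chip copy of the root hash.
    """

    def __init__(self, leaves):
        n = len(leaves)
        assert n > 0 and n & (n - 1) == 0, "leaf count must be a power of two"
        self.n = n
        self.nodes = [b""] * (2 * n)
        for i, leaf in enumerate(leaves):
            self.nodes[n + i] = h(leaf)
        # Build internal nodes bottom-up.
        for i in range(n - 1, 0, -1):
            self.nodes[i] = h(self.nodes[2 * i] + self.nodes[2 * i + 1])
        self.root = self.nodes[1]

    def verify(self, index: int, leaf: bytes) -> bool:
        """Recompute the leaf-to-root path from sibling hashes and compare
        the result against the trusted root."""
        digest = h(leaf)
        node = self.n + index
        while node > 1:
            sibling = self.nodes[node ^ 1]
            if node % 2 == 0:      # current node is a left child
                digest = h(digest + sibling)
            else:                  # current node is a right child
                digest = h(sibling + digest)
            node //= 2
        return digest == self.root

    def update(self, index: int, leaf: bytes) -> None:
        """Eager update: rewrite every ancestor up to the root immediately."""
        node = self.n + index
        self.nodes[node] = h(leaf)
        node //= 2
        while node >= 1:
            self.nodes[node] = h(self.nodes[2 * node] + self.nodes[2 * node + 1])
            node //= 2
        self.root = self.nodes[1]


# Usage: four counter blocks; verify one, update it, verify again.
counters = [b"ctr0", b"ctr1", b"ctr2", b"ctr3"]
tree = MerkleTree(counters)
ok_before = tree.verify(2, b"ctr2")
tree.update(2, b"ctr2+1")
ok_after = tree.verify(2, b"ctr2+1")
stale_rejected = not tree.verify(2, b"ctr2")
```

In this sketch both `verify` and `update` walk the full path sequentially, which is the software-centric pattern the abstract describes. Relaxed or lazy schemes, as the abstract uses the terms, defer part of the ancestor rewriting; how HMT does so is not shown here.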