Paper Title

Profile-Guided, Multi-Version Binary Rewriting

Authors

Xiaozhu Meng, Buddhika Chamith, Ryan Newton

Abstract

The static instrumentation of machine code, also known as binary rewriting, is a powerful technique, but it suffers from high runtime overhead compared to compiler-level instrumentation. Recent research has shown that tools can achieve near-zero overhead when rewriting binaries (excluding the overhead of the application-specific instrumentation). However, users of binary rewriting tools often have difficulty understanding why their instrumentation is slow and how to optimize it.

We are inspired by the traditional program optimization workflow, in which one can profile the program execution to identify performance hot spots, modify the source code or apply suitable compiler optimizations, and even apply profile-guided optimization. We present Profile-Guided, Multi-Version Binary Rewriting to enable this optimization workflow for static binary instrumentation. Our new techniques include three components. First, we augment existing binary rewriting to support call path profiling; one can interactively view instrumentation costs and understand the calling contexts in which those costs are incurred. Second, we present Versioned Structure Binary Editing, which is a general binary transformation technique. Third, we use call path profiles to guide the application of binary transformations.

We apply our new techniques to shadow stack and basic block code coverage. Our instrumentation optimization workflow helps us identify several optimization opportunities in code transformation and instrumentation data layout. Our evaluation on SPEC CPU 2017 shows that the geometric mean overhead of shadow stack and block coverage is reduced from 7.6% and 161.3% to 1.4% and 4.0%, respectively. We also achieve promising results on Apache HTTP Server, where the shadow stack overhead is reduced from about 20% to 3.5%.
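As a rough illustration of how a call path profile could guide where transformations are applied, the sketch below aggregates instrumentation cost by the innermost function of each sampled call path and keeps the functions whose share of the total cost passes a threshold. It is a toy model and not the authors' implementation: the sample format, the function name `rank_hot_functions`, the 10% threshold, and the example data are all hypothetical.

```python
from collections import defaultdict


def rank_hot_functions(samples, threshold=0.10):
    """Rank functions by their share of total instrumentation cost.

    `samples` is a list of (call_path, cost) pairs. Each call_path is a
    tuple of function names from the outermost caller to the instrumented
    function; cost is attributed to the innermost frame only.
    (Hypothetical format, chosen for illustration.)
    """
    total = sum(cost for _, cost in samples)
    if total == 0:
        return []

    # Sum the cost per innermost function across all calling contexts.
    per_function = defaultdict(float)
    for call_path, cost in samples:
        per_function[call_path[-1]] += cost

    # Keep only functions whose cost share reaches the threshold.
    hot = [
        (fn, cost / total)
        for fn, cost in per_function.items()
        if cost / total >= threshold
    ]
    return sorted(hot, key=lambda item: item[1], reverse=True)


# Made-up profile: the same function appears under two calling contexts.
samples = [
    (("main", "parse", "next_token"), 60.0),
    (("main", "eval", "next_token"), 20.0),
    (("main", "eval", "apply"), 15.0),
    (("main", "report"), 5.0),
]
hot = rank_hot_functions(samples)
for fn, share in hot:
    print(f"{fn}: {share:.0%} of instrumentation cost")
```

In this toy profile, `next_token` and `apply` would be the candidates for an optimized version, while `report` falls below the threshold and keeps its default instrumentation.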
