Paper Title
Diet Code Is Healthy: Simplifying Programs for Pre-trained Models of Code
Paper Authors
Paper Abstract
Pre-trained code representation models such as CodeBERT have demonstrated superior performance in a variety of software engineering tasks, yet they are computationally heavy, with cost that grows quadratically in the length of the input sequence. Our empirical analysis of CodeBERT's attention reveals that CodeBERT pays more attention to certain types of tokens and statements, such as keywords and data-relevant statements. Based on these findings, we propose DietCode, which aims at lightweight leverage of large pre-trained models for source code. DietCode simplifies the input program of CodeBERT with three strategies, namely, word dropout, frequency filtering, and an attention-based strategy that selects the statements and tokens receiving the most attention weight during pre-training. It thereby achieves a substantial reduction in computational cost without hampering model performance. Experimental results on two downstream tasks show that DietCodeBERT provides results comparable to CodeBERT with 40% less computational cost in fine-tuning and testing.
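The three simplification strategies in the abstract can be illustrated with a minimal sketch. The function names, parameters, and the direction of the frequency filter (dropping overly frequent, low-information tokens) are assumptions for illustration, not the paper's actual implementation:

```python
import random
from collections import Counter

def word_dropout(tokens, p=0.2, seed=0):
    # Strategy 1 (sketch): randomly remove a fraction p of the input tokens.
    rng = random.Random(seed)
    return [t for t in tokens if rng.random() >= p]

def frequency_filter(tokens, corpus_counts, max_count=100):
    # Strategy 2 (sketch): drop tokens whose corpus frequency exceeds a
    # threshold, on the assumption that very frequent tokens carry little
    # information. The threshold and direction are hypothetical here.
    return [t for t in tokens if corpus_counts[t] <= max_count]

def attention_prune(tokens, attn_weights, keep_ratio=0.6):
    # Strategy 3 (sketch): keep the keep_ratio fraction of tokens with the
    # highest attention weights, preserving their original order.
    k = max(1, int(len(tokens) * keep_ratio))
    top = sorted(range(len(tokens)),
                 key=lambda i: attn_weights[i], reverse=True)[:k]
    keep = set(top)
    return [t for i, t in enumerate(tokens) if i in keep]

tokens = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"]
attn   = [0.9, 0.8, 0.1, 0.5, 0.05, 0.5, 0.1, 0.05, 0.9, 0.6, 0.7, 0.6]
print(attention_prune(tokens, attn))
# Punctuation with low attention is pruned; keywords and operands survive.
```

In this sketch the shortened token sequence would then be fed to the pre-trained model, which is where the quadratic attention cost makes sequence-length reduction pay off.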