Paper Title

MetaTPTrans: A Meta Learning Approach for Multilingual Code Representation Learning

Authors

Weiguo Pian, Hanyu Peng, Xunzhu Tang, Tiezhu Sun, Haoye Tian, Andrew Habib, Jacques Klein, Tegawendé F. Bissyandé

Abstract

Representation learning of source code is essential for applying machine learning to software engineering tasks. Learning code representations from a multilingual source code dataset has been shown to be more effective than learning from single-language datasets separately, since the additional training data in a multilingual dataset improves the model's ability to extract language-agnostic information from source code. However, existing multilingual training overlooks the language-specific information that is crucial for modeling source code across different programming languages, focusing only on learning a unified model whose parameters are shared among languages for language-agnostic information modeling. To address this problem, we propose MetaTPTrans, a meta learning approach for multilingual code representation learning. MetaTPTrans generates different parameters for the feature extractor according to the programming language of the input code snippet, enabling the model to learn both language-agnostic and language-specific information with dynamic parameters in the feature extractor. We conduct experiments on the code summarization and code completion tasks to verify the effectiveness of our approach. The results demonstrate the superiority of our approach, with significant improvements over state-of-the-art baselines.
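
The idea of generating feature-extractor parameters from the programming language can be illustrated with a small sketch. This is not the MetaTPTrans architecture, which the abstract does not detail. It is a minimal, untrained toy in which a shared generator maps a per-language embedding to the weights of one linear layer. The class name `LanguageConditionedLinear`, the language list, and all dimensions are assumptions made for illustration.

```python
import random

# Hypothetical language set, chosen only for this illustration.
LANGUAGES = ["python", "ruby", "javascript", "go"]


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LanguageConditionedLinear:
    """Linear layer whose weights are generated from a language embedding.

    Assumption: the generator is shared by all languages (it can carry
    language-agnostic structure), while the embeddings differ per language
    (they carry language-specific information). Nothing here is trained.
    """

    def __init__(self, in_dim, out_dim, lang_dim=4, seed=0):
        rng = random.Random(seed)
        self.in_dim = in_dim
        self.out_dim = out_dim
        # One embedding vector per language (random stand-ins for learned ones).
        self.lang_embedding = {
            lang: [rng.uniform(-1.0, 1.0) for _ in range(lang_dim)]
            for lang in LANGUAGES
        }
        # Shared generator: maps an embedding to a flattened weight matrix.
        self.generator = make_matrix(in_dim * out_dim, lang_dim, rng)

    def weights_for(self, language):
        """Generate the out_dim x in_dim weight matrix for one language."""
        flat = matvec(self.generator, self.lang_embedding[language])
        return [
            flat[r * self.in_dim:(r + 1) * self.in_dim]
            for r in range(self.out_dim)
        ]

    def forward(self, features, language):
        """Project features with the weights generated for this language."""
        return matvec(self.weights_for(language), features)


layer = LanguageConditionedLinear(in_dim=3, out_dim=2)
token_features = [0.5, -1.0, 2.0]
out_python = layer.forward(token_features, "python")
out_go = layer.forward(token_features, "go")
```

The same input features yield different projections for different languages because the weights are produced on the fly from the language embedding, while the generator itself is shared across all languages.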
