Paper Title
Code Recommendation for Open Source Software Developers
Paper Authors
Paper Abstract
Open Source Software (OSS) forms the backbone of technology infrastructure and attracts millions of developers as contributors. Recommending appropriate development tasks to OSS developers is both critical and challenging, because it requires accounting for the developers' interests as well as the semantic features of the project code. In this paper, we formulate the novel problem of code recommendation, which aims to predict the future contribution behaviors of developers given their interaction history, the semantic features of source code, and the hierarchical file structures of projects. Considering the complex interactions among multiple parties within the system, we propose CODER, a novel graph-based code recommendation framework for open source software developers. CODER jointly models microscopic user-code interactions and macroscopic user-project interactions via a heterogeneous graph, and bridges the two levels of information through aggregation on file-structure graphs that reflect the project hierarchy. Moreover, because reliable benchmarks are lacking, we construct three large-scale datasets to facilitate future research in this direction. Extensive experiments show that the CODER framework achieves superior performance under various experimental settings, including intra-project, cross-project, and cold-start recommendation. We will release all the datasets, code, and utilities for data retrieval upon the acceptance of this work.
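The idea of bridging file-level and project-level information through aggregation on a file-structure graph can be illustrated with a small sketch. The abstract does not specify the operators CODER uses, so everything here is an assumption made for illustration: mean pooling as the aggregator, a dot product as the user-file score, and the names `aggregate`, `children`, `leaf_emb`, and `user_emb`, along with the toy embeddings.

```python
from typing import Dict, List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dot(a: Vector, b: Vector) -> float:
    """Dot product, used here as a hypothetical user-item score."""
    return sum(x * y for x, y in zip(a, b))


def aggregate(
    node: str,
    children: Dict[str, List[str]],
    leaf_emb: Dict[str, Vector],
    out: Dict[str, Vector],
) -> Vector:
    """Bottom-up mean pooling over the file tree (assumed aggregator).

    Leaves (files) keep their own embedding; each directory, and finally
    the project root, receives the mean of its children's embeddings.
    """
    kids = children.get(node, [])
    if not kids:
        out[node] = leaf_emb[node]
    else:
        out[node] = mean([aggregate(k, children, leaf_emb, out) for k in kids])
    return out[node]


# Toy project hierarchy: repo/ contains src/ and README.md.
children = {
    "repo": ["src", "README.md"],
    "src": ["src/a.py", "src/b.py"],
}
# Toy file embeddings standing in for semantic features of source code.
leaf_emb = {
    "src/a.py": [1.0, 0.0],
    "src/b.py": [0.0, 1.0],
    "README.md": [1.0, 1.0],
}

node_emb: Dict[str, Vector] = {}
project_emb = aggregate("repo", children, leaf_emb, node_emb)

# Hypothetical user embedding; rank files by descending dot-product score.
user_emb = [2.0, 0.5]
ranked = sorted(leaf_emb, key=lambda f: dot(user_emb, leaf_emb[f]), reverse=True)
project_score = dot(user_emb, project_emb)
```

In this sketch the same user embedding can be scored against individual files (the microscopic level) and against the pooled project embedding (the macroscopic level), which is the sense in which the file hierarchy connects the two levels. A learned, graph-based aggregator would replace the fixed mean in a real model.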