Do Code Review Measures Explain the Incidence of Post-Release Defects?

Paper Title

Do Code Review Measures Explain the Incidence of Post-Release Defects?

Authors

Andrey Krutauz, Tapajit Dey, Peter C. Rigby, Audris Mockus

Abstract

Aim: In contrast to studies of defects found during code review, we aim to clarify whether code review measures can explain the prevalence of post-release defects. Method: We replicate a study by McIntosh et al. that uses additive regression to model the relationship between defects and code reviews. To increase external validity, we apply the same methodology to a new software project. We discuss our findings with the first author of the original study, McIntosh. We then investigate how to reduce the impact of correlated predictors in the variable selection process and how to increase understanding of the inter-relationships among the predictors by employing Bayesian Network (BN) models. Context: For the Qt project, we use the same measures that the authors of the original study obtained. We also mine data from the version control system and issue tracker of Google Chrome and operationalize measures that are close analogs to the large collection of code, process, and code review measures used in the replicated study. Results: Both the data from the original study and the Chrome data showed high instability in the influence of code review measures on defects, with the results being highly sensitive to the variable selection procedure. Models without code review predictors had as good or better fit than those with review predictors. The replication, however, agrees with the bulk of prior work showing that prior defects, module size, and authorship have the strongest relationship to post-release defects. The application of BN models helped explain the observed instability by demonstrating that the review-related predictors do not affect post-release defects directly and by showing their indirect effects instead. For example, changes that have no review discussion tend to be associated with files that have had many prior defects, which in turn increase the number of post-release defects.
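
The indirect-effect pattern described in the results can be illustrated with a small simulation. The sketch below is not the paper's data, model, or method: it uses synthetic values and plain least squares rather than additive regression or a Bayesian network, and every variable name and coefficient in it is a hypothetical chosen for illustration. It assumes a review measure that is correlated with prior defects but has no direct effect on post-release defects.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def slope(x, y):
    """Least-squares slope of y on x (intercept included)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """What is left of y after removing its linear dependence on x."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def correlation(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical standardized measures; the coefficients are assumptions.
prior_defects = [rng.gauss(0, 1) for _ in range(n)]
# The review measure is related to prior defects...
no_discussion = [0.8 * p + rng.gauss(0, 1) for p in prior_defects]
# ...but post-release defects depend only on prior defects.
post_defects = [1.0 * p + rng.gauss(0, 1) for p in prior_defects]

# Looked at alone, the review measure appears related to the outcome.
marginal_corr = correlation(no_discussion, post_defects)

# Once prior defects are accounted for, its partial slope is close to zero.
partial_slope = slope(
    residuals(prior_defects, no_discussion),
    residuals(prior_defects, post_defects),
)

print(f"marginal correlation: {marginal_corr:.3f}")
print(f"partial slope given prior defects: {partial_slope:.3f}")
```

Under these assumptions, whether the review measure looks important depends entirely on whether prior defects are already in the model, which is one way a variable selection procedure can produce unstable conclusions about correlated predictors.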
