Paper Title
Duplicate Bug Report Detection: How Far Are We?
Paper Authors
Abstract
Many Duplicate Bug Report Detection (DBRD) techniques have been proposed in the research literature, and industry uses several others. Unfortunately, these techniques have not been adequately compared with one another, so it is unclear how far we have come. This work fills that gap by comparing them. A fair comparison first requires a benchmark that estimates how a tool would perform if applied in a realistic setting today. We therefore began by investigating potential biases that affect the fair comparison of the accuracy of DBRD techniques. Our experiments suggest that the age of the data and the choice of issue tracking system cause significant differences. Based on these findings, we prepared a new benchmark and used it to evaluate DBRD techniques, to better estimate how far we have come. Surprisingly, a simpler technique outperforms recently proposed sophisticated techniques on most projects in our benchmark. In addition, we compared the DBRD techniques proposed in research with those used in Mozilla and VSCode. Surprisingly, we observe that a simple technique already adopted in practice can achieve results comparable to those of a recently proposed research tool. Our study reflects on the current state of DBRD, and we share insights to benefit future DBRD research.