Paper title
Data-access performance anti-patterns in data-intensive systems
Paper authors
Paper abstract
Data-intensive systems handle varied, high-volume, and high-velocity data generated by humans and digital devices. Like traditional software, data-intensive systems are prone to technical debt, which developers introduce to cope with time pressure and resource constraints. Data access is a critical component of data-intensive systems because it determines the overall performance and functionality of such systems. While data-access technical debt is receiving attention from the research community, technical debt that affects performance has not been well investigated. Objective: Identify, categorize, and validate data-access performance issues in the context of NoSQL-based and polyglot-persistence data-intensive systems using a qualitative study. Method: We collect issues from NoSQL-based and polyglot-persistence open-source data-intensive systems, identify data-access performance issues using inductive coding, and build a taxonomy of their root causes. We then validate the perceived relevance of the newly identified performance issues using a developer survey.