Paper title

A survey study of success factors in data science projects

Authors

Iñigo Martinez, Elisabeth Viles, Igor G. Olaizola

Abstract

In recent years, the data science community has pursued excellence and made significant research efforts to develop advanced analytics, focusing on solving technical problems at the expense of organizational and socio-technical challenges. According to previous surveys on the state of data science project management, there is a significant gap between technical and organizational processes. In this article, we present new empirical data from a survey of 237 data science professionals on the use of project management methodologies for data science. We provide additional profiling of the survey respondents' roles and their priorities when executing data science projects. Based on this survey study, the main findings are: (1) The Agile data science lifecycle is the most widely used framework, but only 25% of the survey participants state that they follow a data science project methodology. (2) The most important success factors are precisely describing stakeholders' needs, communicating the results to end-users, and team collaboration and coordination. (3) Professionals who adhere to a project methodology place greater emphasis on the project's potential risks and pitfalls, version control, the deployment pipeline to production, and data security and privacy.
