Paper title
Detecting Privacy Requirements from User Stories with NLP Transfer Learning Models
Paper authors
Abstract
To provide privacy-aware software systems, it is crucial to consider privacy from the very beginning of development. However, developers do not have the expertise and knowledge required to embed the legal and social requirements for data protection into software systems. Objective: We present an approach to decrease privacy risks during agile software development by automatically detecting privacy-related information in user story requirements, a prominent notation in agile Requirements Engineering (RE). Methods: The proposed approach combines Natural Language Processing (NLP) and linguistic resources with deep learning algorithms to identify privacy aspects in user stories. NLP technologies are used to extract information about the semantic and syntactic structure of the text. This information is then processed by a pre-trained convolutional neural network, which paves the way for the use of a transfer learning technique. We evaluate the proposed approach in an empirical study on a dataset of 1680 user stories. Results: The experimental results show that deep learning algorithms yield better predictions than conventional (shallow) machine learning methods. Moreover, applying transfer learning improves prediction accuracy considerably, by about 10%. Conclusions: Our study encourages software engineering researchers to consider the opportunities to automate privacy detection in the early phase of design, also by exploiting transfer learning models.
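The Methods description (features extracted from the text, passed through a pre-trained convolutional network, then adapted via transfer learning) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the network, the features, or the training setup. The embedding table `PRETRAINED`, the frozen filters `FILTERS`, the six example stories in `TRAIN`, and all function names are hypothetical and chosen only to show the general idea of reusing a frozen feature extractor and training a small output layer on top of it.

```python
import math

# HYPOTHETICAL stand-in for pre-trained word embeddings (2-d, hand-written).
# A real setup would load vectors learned on a large corpus.
PRETRAINED = {
    "email": (1.0, 0.1), "address": (0.9, 0.2), "password": (1.0, 0.0),
    "birth": (0.8, 0.1), "phone": (0.9, 0.1), "name": (0.7, 0.2),
    "sort": (0.0, 1.0), "export": (0.1, 0.9), "chart": (0.0, 0.8),
    "theme": (0.1, 1.0), "zoom": (0.0, 0.9), "filter": (0.1, 0.8),
}
UNKNOWN = (0.0, 0.0)  # vector used for out-of-vocabulary tokens

# ASSUMED "pre-trained" convolution filters of width 2; they are never updated.
FILTERS = [(1.0, 0.0, 1.0, 0.0), (0.0, 1.0, 0.0, 1.0)]


def frozen_features(story):
    """Embed tokens, convolve over adjacent token pairs, ReLU, max-pool."""
    tokens = [w.strip(".,").lower() for w in story.split()]
    vecs = [PRETRAINED.get(t, UNKNOWN) for t in tokens]
    feats = []
    for f in FILTERS:
        best = 0.0  # starting at 0 acts as ReLU before max-over-time pooling
        for a, b in zip(vecs, vecs[1:]):
            act = f[0] * a[0] + f[1] * a[1] + f[2] * b[0] + f[3] * b[1]
            best = max(best, act)
        feats.append(best)
    return feats


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(examples, epochs=300, lr=0.5):
    """Fit only the logistic output layer; the feature extractor stays frozen."""
    data = [(frozen_features(s), y) for s, y in examples]
    w, b = [0.0, 0.0], 0.0
    n = len(data)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in data:
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        # Batch gradient-descent step on the cross-entropy loss
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b


# Toy labelled user stories (1 = privacy-related, 0 = not); invented examples.
TRAIN = [
    ("As a user, I want to save my email address so that I get receipts.", 1),
    ("As a user, I want to reset my password.", 1),
    ("As a patient, I want to enter my birth date.", 1),
    ("As a user, I want to sort the table.", 0),
    ("As an analyst, I want to export the chart.", 0),
    ("As a user, I want to change the theme.", 0),
]
WEIGHTS, BIAS = train_head(TRAIN)


def predict(story):
    """Return the estimated probability that a story is privacy-related."""
    x = frozen_features(story)
    return sigmoid(WEIGHTS[0] * x[0] + WEIGHTS[1] * x[1] + BIAS)


print(predict("As a customer, I want to update my phone number."))
print(predict("As a user, I want to zoom the chart."))
```

Only `train_head` touches trainable parameters, which mirrors the transfer-learning idea in miniature: knowledge encoded in the frozen embeddings and filters is reused, so a handful of labelled stories is enough to fit the small output layer. In practice the frozen part would be a network pre-trained on a much larger corpus.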