Paper Title
Data-Driven Network Intrusion Detection: A Taxonomy of Challenges and Methods
Paper Authors
Abstract
Data-driven methods have been widely used in network intrusion detection (NID) systems. However, a number of challenges arise from how the datasets are collected. Most attack classes in network intrusion datasets are minority classes compared to normal traffic, and many datasets are collected through virtual machines or other simulated environments rather than real-world networks. These challenges undermine the performance of machine learning models for intrusion detection, because models such as random forests or support vector machines end up being fit to unrepresentative "sandbox" datasets. This survey presents a carefully designed taxonomy highlighting eight main challenges and solutions and explores common datasets from 1999 to 2020. It analyzes trends in the distribution of challenges addressed over the past decade and proposes future directions: expanding NID into cloud-based environments, devising scalable models for larger amounts of network intrusion data, and creating labeled datasets collected in real-world networks.