Paper title
SparseIDS: Learning Packet Sampling with Reinforcement Learning
Paper authors
Paper abstract
Recurrent Neural Networks (RNNs) have been shown to be valuable for constructing Intrusion Detection Systems (IDSs) for network data. They make it possible to determine whether a flow is malicious before it has ended, so that action can be taken immediately. However, given the large number of packets that have to be inspected, for example in cloud/fog and edge computing, the question of computational efficiency arises. We show that by using a novel Reinforcement Learning (RL)-based approach called SparseIDS, we can reduce the number of consumed packets by more than three fourths while keeping classification accuracy high. To minimize the computational expense of the RL-based sampling, we show that a shared neural network can be used for both the classifier and the RL logic. Thus, the sampling consumes no additional resources in deployment. Compared to various other sampling techniques, SparseIDS consistently achieves higher classification accuracy by learning to sample only relevant packets. A major novelty of our RL-based approach is that it is not limited to skipping up to a predefined maximum number of samples, like other approaches proposed in the domain of Natural Language Processing, but can skip arbitrarily many packets in one step. This saves even more computational resources for long sequences. Inspecting how SparseIDS chooses packets shows that it adopts different sampling strategies for different attack types and network flows. Finally, we build an automatic steering mechanism that can guide SparseIDS in deployment to achieve a desired level of sparsity.
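The sampling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `sample_flow` and `toy_step`, the single-feature packet representation, and the hand-written skip rule are all assumptions made for illustration. The only point it shows is the control flow, in which one shared step function returns both a classification score and the number of packets to skip, and that skip count has no fixed upper bound.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical packet representation: a tuple of numeric features.
Packet = Tuple[float, ...]
# Hypothetical shared step: (state, packet) -> (new_state, score, skip).
StepFn = Callable[[float, Packet], Tuple[float, float, int]]


def sample_flow(packets: Sequence[Packet], step: StepFn) -> Tuple[List[int], float]:
    """Consume a flow sparsely; return consumed indices and the last score."""
    state = 0.0
    score = 0.0
    consumed: List[int] = []
    i = 0
    while i < len(packets):
        # One call yields both the classifier output and the sampling
        # decision, standing in for a shared network.
        state, score, skip = step(state, packets[i])
        consumed.append(i)
        # Move past the current packet plus however many are skipped;
        # the skip count is unbounded, so the rest of a flow can be dropped.
        i += 1 + max(0, skip)
    return consumed, score


def toy_step(state: float, packet: Packet) -> Tuple[float, float, int]:
    """Stand-in for a trained model (assumption): a fixed rule, not learned."""
    size = packet[0]
    new_state = 0.5 * state + 0.5 * size
    score = min(1.0, new_state / 1500.0)
    skip = 0 if size > 1000 else 3  # skip ahead after small packets
    return new_state, score, skip


flow = [(1200.0,), (100.0,), (100.0,), (100.0,),
        (100.0,), (1400.0,), (50.0,), (60.0,)]
consumed, score = sample_flow(flow, toy_step)
sparsity = 1 - len(consumed) / len(flow)
print(consumed, round(sparsity, 2))
```

In a real system the fixed rule in `toy_step` would be replaced by the learned policy, and the achieved sparsity would depend on the trained model and the traffic rather than on a threshold like the one assumed here.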