Paper Title

Adaptive Deep Forest for Online Learning from Drifting Data Streams

Authors

Łukasz Korycki, Bartosz Krawczyk

Abstract

Learning from data streams is among the most vital fields of contemporary data mining. The online analysis of information coming from those potentially unbounded data sources allows for designing reactive up-to-date models capable of adjusting themselves to continuous flows of data. While a plethora of shallow methods have been proposed for simpler low-dimensional streaming problems, almost none of them addressed the issue of learning from complex contextual data, such as images or texts. The former is represented mainly by adaptive decision trees that have been proven to be very efficient in streaming scenarios. The latter has been predominantly addressed by offline deep learning. In this work, we attempt to bridge the gap between these two worlds and propose Adaptive Deep Forest (ADF) - a natural combination of the successful tree-based streaming classifiers with deep forest, which represents an interesting alternative idea for learning from contextual data. The conducted experiments show that the deep forest approach can be effectively transformed into an online algorithm, forming a model that outperforms all state-of-the-art shallow adaptive classifiers, especially for high-dimensional complex streams.
