An Eager Splitting Strategy for Online Decision Trees

Paper Title

An Eager Splitting Strategy for Online Decision Trees

Authors

Chaitanya Manapragada, Heitor M. Gomes, Mahsa Salehi, Albert Bifet, Geoffrey I. Webb

Abstract

Decision tree ensembles are widely used in practice. In this work, we study, in ensemble settings, the effectiveness of replacing the split strategy of the state-of-the-art online tree learner, Hoeffding Tree (HT), with a rigorous but more eager splitting strategy that we previously published as Hoeffding AnyTime Tree (HATT). HATT uses the Hoeffding test to determine whether the current best candidate split is superior to the current split, with the possibility of revision, whereas HT aims to determine whether the top candidate is better than the second best and, if a test is selected, fixes it for all posterity. HATT converges to the ideal batch tree while HT does not. We find that HATT is an efficacious base learner for online bagging and online boosting ensembles. On UCI and synthetic streams, HATT as a base learner outperforms HT at a 0.05 significance level for the majority of tested ensembles on what we believe is the largest and most comprehensive set of testbenches in the online learning literature. Our results indicate that HATT is a superior alternative to Hoeffding Tree in a large number of ensemble settings.
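
The difference between the two split tests described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the use of a heuristic gain (such as information gain) per candidate attribute, and the convention that a leaf with no installed split has a current gain of 0.0 are all assumptions made for illustration. Tie-breaking and grace periods, which practical implementations typically add, are omitted.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the observed mean of n
    samples lies within this epsilon of the true mean."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def ht_should_split(gains, value_range, delta, n):
    """Hoeffding Tree style test (sketch): split only when the top candidate
    beats the second-best candidate by more than epsilon."""
    if len(gains) < 2:
        return False
    ranked = sorted(gains.values(), reverse=True)
    return ranked[0] - ranked[1] > hoeffding_bound(value_range, delta, n)


def hatt_should_split_or_revise(gains, current_gain, value_range, delta, n):
    """HATT style test (sketch): act when the best candidate beats the split
    currently in place by more than epsilon. current_gain is assumed to be
    0.0 for a leaf that has no split yet."""
    if not gains:
        return False
    best = max(gains.values())
    return best - current_gain > hoeffding_bound(value_range, delta, n)


# Hypothetical gains observed at a node after n = 1000 examples.
gains = {"a": 0.30, "b": 0.28, "c": 0.05}
params = dict(value_range=1.0, delta=1e-7, n=1000)

ht_decision = ht_should_split(gains, **params)
hatt_leaf_decision = hatt_should_split_or_revise(gains, current_gain=0.0, **params)
hatt_revise_decision = hatt_should_split_or_revise(gains, current_gain=0.28, **params)
```

With these hypothetical numbers, the top two candidates are too close for the HT style test to separate, while the HATT style test can already split the leaf because the best candidate clearly beats having no split at all. Once a split with gain 0.28 is in place, the same test would not yet justify revising it.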
