Paper Title
Combining Predictions under Uncertainty: The Case of Random Decision Trees
Paper Authors
Paper Abstract
A common approach to aggregating classification estimates in an ensemble of decision trees is to either use voting or to average the probabilities for each class. The latter takes uncertainty into account, but not the reliability of the uncertainty estimates (so to speak, the "uncertainty about the uncertainty"). More generally, much remains unknown about how to best combine probabilistic estimates from multiple sources. In this paper, we investigate a number of alternative prediction methods. Our methods are inspired by the theories of probability, belief functions, and reliable classification, as well as a principle that we call evidence accumulation. Our experiments on a variety of data sets are based on random decision trees, which guarantee a high diversity in the predictions to be combined. Somewhat unexpectedly, we found that taking the average over the probabilities is actually hard to beat. However, evidence accumulation showed consistently better results on all but very small leaves.
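To make the contrast between the two baseline combination schemes concrete, here is a minimal sketch (not from the paper; function names and the example data are illustrative) of majority voting versus probability averaging over an ensemble's per-tree class distributions:

```python
import numpy as np

def majority_vote(tree_probs):
    """Each tree votes for its most probable class; the class with
    the most votes wins. Confidence magnitudes are ignored.

    tree_probs: array of shape (n_trees, n_classes), one
    class-probability estimate per tree.
    """
    votes = np.argmax(tree_probs, axis=1)
    return int(np.bincount(votes, minlength=tree_probs.shape[1]).argmax())

def average_probabilities(tree_probs):
    """Average the class-probability vectors across trees, then
    predict the class with the highest mean probability."""
    return int(tree_probs.mean(axis=0).argmax())

# Three trees, two classes: two trees lean weakly toward class 0,
# one tree is highly confident in class 1.
probs = np.array([[0.55, 0.45],
                  [0.55, 0.45],
                  [0.10, 0.90]])
print(majority_vote(probs))          # -> 0 (voting ignores confidence)
print(average_probabilities(probs))  # -> 1 (averaging weighs confidence)
```

The example shows why averaging "takes uncertainty into account": a single highly confident tree can outweigh several weakly confident ones, which voting cannot capture.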