Paper Title
Fast Optimization of Weighted Sparse Decision Trees for use in Optimal Treatment Regimes and Optimal Policy Design
Paper Authors
Abstract
Sparse decision trees are one of the most common forms of interpretable models. While recent advances have produced algorithms that fully optimize sparse decision trees for prediction, that work does not address policy design, because the algorithms cannot handle weighted data samples. Specifically, they rely on the discreteness of the loss function, which means that real-valued weights cannot be directly used. For example, none of the existing techniques produce policies that incorporate inverse propensity weighting on individual data points. We present three algorithms for efficient sparse weighted decision tree optimization. The first approach directly optimizes the weighted loss function; however, it tends to be computationally inefficient for large datasets. Our second approach, which scales more efficiently, transforms weights to integer values and uses data duplication to transform the weighted decision tree optimization problem into an unweighted (but larger) counterpart. Our third algorithm, which scales to much larger datasets, uses a randomized procedure that samples each data point with a probability proportional to its weight. We present theoretical bounds on the error of the two fast methods and show experimentally that these methods can be two orders of magnitude faster than the direct optimization of the weighted loss, without losing significant accuracy.
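The two fast methods described in the abstract — converting weights to integers via data duplication, and sampling points with probability proportional to weight — can be sketched as below. This is a minimal illustration, not the paper's implementation; the function names and the `precision` parameter are assumptions for the sake of the example.

```python
import random

def duplicate_by_weight(X, y, weights, precision=10):
    """Round each weight to an integer multiple of 1/precision, then
    duplicate each sample that many times. The result is a larger but
    unweighted dataset usable by unweighted tree optimizers."""
    X_dup, y_dup = [], []
    for xi, yi, wi in zip(X, y, weights):
        k = max(1, round(wi * precision))  # integer copy count
        X_dup.extend([xi] * k)
        y_dup.extend([yi] * k)
    return X_dup, y_dup

def sample_by_weight(X, y, weights, n_samples, seed=0):
    """Draw n_samples points (with replacement) with probability
    proportional to each point's weight, yielding an unweighted
    dataset whose expected composition matches the weighted one."""
    rng = random.Random(seed)
    idx = rng.choices(range(len(X)), weights=weights, k=n_samples)
    return [X[i] for i in idx], [y[i] for i in idx]
```

Either transformation produces an unweighted dataset, after which any of the existing fully-optimized sparse decision tree algorithms can be applied unchanged; the paper's theoretical bounds concern the error introduced by the rounding and sampling steps.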