Paper Title
Decision Trees for Decision-Making under the Predict-then-Optimize Framework
Authors
Abstract
We consider the use of decision trees for decision-making problems under the predict-then-optimize framework. That is, we would like to first use a decision tree to predict unknown input parameters of an optimization problem, and then make decisions by solving the optimization problem using the predicted parameters. A natural loss function in this framework is to measure the suboptimality of the decisions induced by the predicted input parameters, as opposed to measuring loss using input parameter prediction error. This natural loss function is known in the literature as the Smart Predict-then-Optimize (SPO) loss, and we propose a tractable methodology called SPO Trees (SPOTs) for training decision trees under this loss. SPOTs benefit from the interpretability of decision trees, providing an interpretable segmentation of contextual features into groups with distinct optimal solutions to the optimization problem of interest. We conduct several numerical experiments on synthetic and real data including the prediction of travel times for shortest path problems and predicting click probabilities for news article recommendation. We demonstrate on these datasets that SPOTs simultaneously provide higher quality decisions and significantly lower model complexity than other machine learning approaches (e.g., CART) trained to minimize prediction error.
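The SPO loss described in the abstract measures the suboptimality of the decision induced by predicted parameters, rather than the prediction error itself. Below is a minimal sketch of this idea for a toy shortest-path setting with a small finite set of candidate paths; the function names (`solve`, `spo_loss`) and the toy data are illustrative assumptions, not code from the paper.

```python
import numpy as np

def solve(cost, feasible):
    """Return the feasible decision w minimizing cost^T w
    (here, by enumerating a small set of candidate paths)."""
    values = [cost @ w for w in feasible]
    return feasible[int(np.argmin(values))]

def spo_loss(c_pred, c_true, feasible):
    """SPO loss: suboptimality, under the true costs, of the decision
    induced by the predicted costs:
        c_true^T w*(c_pred) - c_true^T w*(c_true)."""
    w_pred = solve(c_pred, feasible)
    w_star = solve(c_true, feasible)
    return c_true @ w_pred - c_true @ w_star

# Two candidate paths over three edges, encoded as 0/1 edge-usage vectors.
paths = [np.array([1.0, 1.0, 0.0]), np.array([0.0, 1.0, 1.0])]
c_true = np.array([2.0, 1.0, 4.0])  # true travel times (path costs: 3 vs 5)
c_pred = np.array([5.0, 1.0, 1.0])  # predictions that favor the wrong path

print(spo_loss(c_pred, c_true, paths))  # → 2.0 (decision cost 5 vs optimal 3)
```

Note that `c_pred` here has large prediction error on every edge, but only the error that flips the chosen path matters to the SPO loss; a tree trained to minimize this loss can therefore be much simpler than one trained to minimize prediction error, as the abstract claims.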