Paper title
CD-split and HPD-split: efficient conformal regions in high dimensions
Authors
Abstract
Conformal methods create prediction bands that control average coverage assuming only that the data are i.i.d. Although the literature has mostly focused on prediction intervals, more general regions can often represent uncertainty better. For instance, a bimodal target is better represented by the union of two intervals. Such prediction regions are obtained by CD-split, which combines the split method with a data-driven partition of the feature space that scales to high dimensions. CD-split, however, contains many tuning parameters, and their role is not clear. In this paper, we provide new insights on CD-split by exploring its theoretical properties. In particular, we show that CD-split converges asymptotically to the oracle highest predictive density set and satisfies local and asymptotic conditional validity. We also present simulations that show how to tune CD-split. Finally, we introduce HPD-split, a variation of CD-split that requires less tuning, and show that it shares the same theoretical guarantees as CD-split. Across a wide variety of simulations, CD-split and HPD-split have better conditional coverage and yield smaller prediction regions than other methods.
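To make the remark about bimodal targets concrete, the following is a minimal sketch of a split-conformal region built from a conditional density, under loudly stated assumptions. It is not the authors' CD-split or HPD-split: there is no partition of the feature space, and the "fitted" density `density_hat` is simply assumed to equal a toy two-component Gaussian mixture. The names `density_hat`, `sample`, `conformal_threshold`, and `region_intervals`, along with the toy model itself, are hypothetical and chosen only for illustration.

```python
import math
import random

SIGMA = 0.5  # assumed noise scale of the toy model (not from the paper)


def density_hat(y, x):
    """Hypothetical 'fitted' conditional density of y given x:
    an equal mixture of N(-x, SIGMA^2) and N(x, SIGMA^2)."""
    norm = 1.0 / (SIGMA * math.sqrt(2.0 * math.pi))
    left = math.exp(-0.5 * ((y + x) / SIGMA) ** 2)
    right = math.exp(-0.5 * ((y - x) / SIGMA) ** 2)
    return 0.5 * norm * (left + right)


def sample(n, rng):
    """Draw n i.i.d. pairs (x, y) from the toy bimodal model."""
    pairs = []
    for _ in range(n):
        x = rng.uniform(1.5, 3.0)
        sign = rng.choice([-1.0, 1.0])
        pairs.append((x, sign * x + rng.gauss(0.0, SIGMA)))
    return pairs


def conformal_threshold(scores, alpha):
    """Return the floor(alpha * (n + 1))-th smallest calibration score.
    Keeping every y whose density is at least this value gives
    marginal coverage of at least 1 - alpha under exchangeability."""
    k = math.floor(alpha * (len(scores) + 1))
    if k < 1:
        return -math.inf  # too few calibration points: keep everything
    return sorted(scores)[k - 1]


def region_intervals(x_new, threshold, lo=-6.0, hi=6.0, steps=2400):
    """Scan a grid of y values and return the maximal runs where
    density_hat(y | x_new) >= threshold, as (start, end) pairs."""
    intervals, start, prev = [], None, None
    for i in range(steps + 1):
        y = lo + (hi - lo) * i / steps
        inside = density_hat(y, x_new) >= threshold
        if inside and start is None:
            start = y
        elif not inside and start is not None:
            intervals.append((start, prev))
            start = None
        prev = y
    if start is not None:
        intervals.append((start, prev))
    return intervals


rng = random.Random(0)
alpha = 0.1

# Calibration split: the conformity score is the density at the observed y.
calibration = sample(2000, rng)
threshold = conformal_threshold(
    [density_hat(y, x) for x, y in calibration], alpha
)

# Empirical marginal coverage on fresh data, and the region at one x.
test_pairs = sample(5000, rng)
coverage = sum(density_hat(y, x) >= threshold for x, y in test_pairs) / len(test_pairs)
intervals = region_intervals(2.0, threshold)

print(f"empirical coverage: {coverage:.3f}")
print("region at x = 2.0:", intervals)
```

In this toy setting the region at a given x is a union of two intervals, one around each mode, whereas a single prediction interval would also have to cover the low-density gap between them. The sketch only targets marginal coverage; the local and asymptotic conditional validity discussed in the abstract depends on the feature-space partition, which is omitted here.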