A review of predictive uncertainty estimation with machine learning

Paper title

A review of predictive uncertainty estimation with machine learning

Authors

Hristos Tyralis, Georgia Papacharalampous

Abstract

Predictions and forecasts of machine learning models should take the form of probability distributions, so as to increase the quantity of information communicated to end users. Although applications of probabilistic prediction and forecasting with machine learning models in academia and industry are becoming more frequent, related concepts and methods have not been formalized and structured under a holistic view of the entire field. Here, we review the topic of predictive uncertainty estimation with machine learning algorithms, as well as the related metrics (consistent scoring functions and proper scoring rules) for assessing probabilistic predictions. The review covers a time period spanning from the introduction of early statistical algorithms (linear regression and time series models, based on Bayesian statistics or quantile regression) to recent machine learning algorithms (including generalized additive models for location, scale and shape, random forests, boosting and deep learning algorithms) that are more flexible by nature. The review of progress in the field expedites our understanding of how to develop new algorithms tailored to users' needs, since the latest advancements are based on some fundamental concepts applied to more complex algorithms. We conclude by classifying the material and discussing challenges that are becoming hot topics of research.
