Predicting Regression Probability Distributions with Imperfect Data Through Optimal Transformations

Paper title

Predicting Regression Probability Distributions with Imperfect Data Through Optimal Transformations

Author

Friedman, Jerome H.

Abstract

The goal of regression analysis is to predict the value of a numeric outcome variable y given a vector of joint values of other (predictor) variables x. Usually a particular x-vector does not specify a repeatable value for y, but rather a probability distribution of possible y-values, p(y|x). This distribution has a location, scale and shape, all of which can depend on x, and are needed to infer likely values for y given x. Regression methods usually assume that training data y-values are perfect numeric realizations from some well-behaved p(y|x). Often actual training data y-values are discrete, truncated and/or arbitrarily censored. Regression procedures based on an optimal transformation strategy are presented for estimating location, scale and shape of p(y|x) as general functions of x, in the possible presence of such imperfect training data. In addition, validation diagnostics are presented to ascertain the quality of the solutions.
