Paper title
Predicting Regression Probability Distributions with Imperfect Data Through Optimal Transformations
Authors
Abstract
The goal of regression analysis is to predict the value of a numeric outcome variable y given a vector of joint values of other (predictor) variables x. Usually a particular x-vector does not specify a repeatable value for y, but rather a probability distribution of possible y-values, p(y|x). This distribution has a location, scale and shape, all of which can depend on x and all of which are needed to infer likely values for y given x. Regression methods usually assume that training data y-values are perfect numeric realizations from some well-behaved p(y|x). Often, actual training data y-values are discrete, truncated and/or arbitrarily censored. Regression procedures based on an optimal transformation strategy are presented for estimating the location, scale and shape of p(y|x) as general functions of x, in the possible presence of such imperfect training data. In addition, validation diagnostics are presented to ascertain the quality of the solutions.
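To make the notion of "imperfect" training data concrete, the following minimal sketch simulates a p(y|x) whose location and scale both depend on x, then right-censors the outcomes at a ceiling. It is an illustration only and not the procedure presented in the paper; the Gaussian form, the coefficients, the ceiling value and the function name `simulate` are all assumptions chosen for the example.

```python
import random
import statistics


def simulate(n=20000, ceiling=2.0, seed=0):
    """Draw y ~ Normal(loc=2x, scale=0.5+x) for x in [0, 1), then right-censor y.

    Hypothetical toy model: neither the Gaussian shape nor the coefficients
    come from the paper.
    """
    rng = random.Random(seed)
    xs, ys, ys_censored = [], [], []
    for _ in range(n):
        x = rng.random()
        y = rng.gauss(2.0 * x, 0.5 + x)      # location and scale both depend on x
        xs.append(x)
        ys.append(y)
        ys_censored.append(min(y, ceiling))  # values above the ceiling are recorded as the ceiling
    return xs, ys, ys_censored


xs, ys, ys_censored = simulate()
true_mean = statistics.fmean(ys)             # mean of the uncensored outcomes
naive_mean = statistics.fmean(ys_censored)   # mean when censored values are taken at face value
frac_censored = sum(y >= 2.0 for y in ys) / len(ys)

print(f"fraction censored: {frac_censored:.3f}")
print(f"true mean: {true_mean:.3f}, naive mean on censored data: {naive_mean:.3f}")
```

Treating the censored values as exact realizations pulls the estimated location downward, which is the kind of distortion that motivates handling such data explicitly.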