Paper Title
Time Series of Non-Additive Metrics: Identification and Interpretation of Contributing Factors of Variance by Linear Decomposition
Paper Authors
Paper Abstract
The research paper addresses the linear decomposition of time series of non-additive metrics, which allows the contributing factors (input features) of variance to be identified and interpreted. Non-additive metrics, such as ratios, are widely used in a variety of domains. Computing such a metric commonly requires prior aggregation of the underlying variables from which it is calculated. This aggregation poses a dimensionality challenge when the input features and underlying variables are formed as two-dimensional arrays indexed by elements, such as account or customer identifiers, and by time points: it rules out direct modeling of the time series of a non-additive metric as a function of the input features. The article discusses a five-step approach: (1) segmentation of the input features and of the underlying variables of the metric, supported by unsupervised autoencoders; (2) univariate or joint fitting of the metric by the aggregated input features on the segmented domains; (3) transformation of pre-screened input features according to the fitted models; (4) aggregation of the transformed features as time series; and (5) modeling of the metric time series as a sum of constrained linear effects of the aggregated features. Alternatively, approximation by numerical differentiation is considered as a way to linearize the metric, which allows element-level univariate or joint modeling in step (2). Together, these analytical steps yield a backward-looking explanatory decomposition of the metric as a sum of time series of the surviving input features. The paper includes a synthetic example that studies monthly loss-to-balance rates of a hypothetical retail credit portfolio. To validate that no latent factors other than the surviving input features have a significant impact on the metric, Statistical Process Control is introduced for the residual time series.
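To make the non-additivity and the linearization idea concrete, the following is a minimal Python sketch and not the paper's implementation. It assumes a loss-to-balance ratio computed from two element-by-month arrays, and it approximates the month-over-month change of the ratio by a first-order expansion whose partial derivatives are obtained by central finite differences. All data values and the names `ratio`, `numerical_gradient`, and `element_contributions` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical data: rows are elements (accounts), columns are months.
balance = np.array([[1000.0, 1010.0, 1020.0, 1015.0],
                    [2000.0, 1990.0, 2005.0, 2010.0],
                    [ 500.0,  505.0,  498.0,  502.0]])
loss = np.array([[10.0, 11.0, 10.5, 10.8],
                 [30.0, 29.0, 31.0, 30.5],
                 [ 2.0,  2.1,  1.9,  2.0]])


def ratio(total_loss, total_balance):
    """Non-additive metric: a ratio of two aggregated variables."""
    return total_loss / total_balance


def numerical_gradient(total_loss, total_balance, rel_step=1e-6):
    """Central-difference partial derivatives of the ratio.

    Assumes both totals are nonzero so that the step sizes are positive.
    """
    h_l = rel_step * np.abs(total_loss)
    h_b = rel_step * np.abs(total_balance)
    d_dl = (ratio(total_loss + h_l, total_balance)
            - ratio(total_loss - h_l, total_balance)) / (2.0 * h_l)
    d_db = (ratio(total_loss, total_balance + h_b)
            - ratio(total_loss, total_balance - h_b)) / (2.0 * h_b)
    return d_dl, d_db


def element_contributions(loss, balance):
    """First-order contribution of each element to the change of the ratio
    between consecutive months; shape is (n_elements, n_months - 1)."""
    total_loss = loss.sum(axis=0)
    total_balance = balance.sum(axis=0)
    # Derivatives are evaluated at the starting month of each interval.
    d_dl, d_db = numerical_gradient(total_loss[:-1], total_balance[:-1])
    return d_dl * np.diff(loss, axis=1) + d_db * np.diff(balance, axis=1)


# The metric needs aggregation first; averaging element ratios differs.
metric = ratio(loss.sum(axis=0), balance.sum(axis=0))
naive = (loss / balance).mean(axis=0)

contrib = element_contributions(loss, balance)
approx_change = contrib.sum(axis=0)   # additive over elements
exact_change = np.diff(metric)        # true month-over-month change

print("metric:       ", metric)
print("naive average:", naive)
print("approx change:", approx_change)
print("exact change: ", exact_change)
```

Once linearized, the change in the metric becomes a sum over elements, which is the property that makes element-level modeling possible. The approximation is only first order, so it degrades when month-over-month changes in the aggregated balance are large relative to its level.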