Paper title
Classification by estimating the cumulative distribution function for small data
Paper authors
Paper abstract
In this paper, we study the classification problem by estimating the conditional probability function of the given data. Unlike the traditional theory, which estimates the expected risk from empirical data, we compute the probability via a Fredholm equation, which leads to estimating the distribution of the data. Based on the Fredholm equation, we present a new expected risk estimation theory built on estimating the cumulative distribution function. Its main characteristic is that the risk is measured on the distribution of the input space. We also present the corresponding empirical risk estimation and, by introducing an insensitive loss, propose an $\varepsilon$-insensitive $L_{1}$ cumulative support vector machine ($\varepsilon$-$L_{1}VSVM$). Notably, the classification models and the classification evaluation indicators based on the new mechanism differ from the traditional ones. Experimental results show the effectiveness of the proposed $\varepsilon$-$L_{1}VSVM$ and the validity and interpretability of the corresponding cumulative distribution function indicator for small data classification.
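The abstract does not state the formulas behind the method, so the following is only a rough sketch of two standard building blocks it names: a multivariate empirical cumulative distribution function and an $\varepsilon$-insensitive loss. The function names `empirical_cdf` and `eps_insensitive_l1`, the sample data, and the textbook definitions used here are assumptions made for illustration; they are not taken from the paper and may differ from the authors' formulation.

```python
import numpy as np


def empirical_cdf(samples: np.ndarray, x: np.ndarray) -> float:
    """Standard empirical CDF (assumed form, not the paper's definition).

    Returns the fraction of rows in `samples` (shape: n_samples x n_features)
    that are <= x in every coordinate.
    """
    return float(np.mean(np.all(samples <= x, axis=1)))


def eps_insensitive_l1(residual: np.ndarray, eps: float) -> np.ndarray:
    """Textbook epsilon-insensitive loss: max(0, |r| - eps), elementwise."""
    return np.maximum(0.0, np.abs(residual) - eps)


# Hypothetical toy data: four points in a two-dimensional input space.
data = np.array([[0.1, 0.2], [0.4, 0.9], [0.5, 0.5], [0.8, 0.3]])

# Two of the four points lie below (0.5, 0.6) in both coordinates.
cdf_value = empirical_cdf(data, np.array([0.5, 0.6]))

# Residuals within the eps-tube contribute zero loss.
losses = eps_insensitive_l1(np.array([-0.3, 0.05, 0.2]), eps=0.1)
```

With only a handful of samples the empirical CDF is a coarse step function, which is one reason small-data settings call for care when estimating a distribution.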