Paper title
General stochastic separation theorems with optimal bounds
Paper authors
Paper abstract
The phenomenon of stochastic separability was revealed and used in machine learning to correct errors of Artificial Intelligence (AI) systems and to analyze AI instabilities. In high-dimensional datasets, under broad assumptions, each point can be separated from the rest of the set by a simple and robust Fisher's discriminant (i.e., it is Fisher separable). Errors or clusters of errors can likewise be separated from the rest of the data. The ability to correct an AI system also opens up the possibility of an attack on it, and the vulnerabilities induced by high dimensionality are caused by the same stochastic separability that holds the keys to understanding the fundamentals of robustness and adaptivity in high-dimensional data-driven AI. To manage errors and analyze vulnerabilities, stochastic separation theorems should evaluate the probability that a dataset is Fisher separable in a given dimensionality and for a given class of distributions. Explicit and optimal estimates of these separation probabilities are required, and this problem is solved in the present work. General stochastic separation theorems with optimal probability estimates are obtained for important classes of distributions: log-concave distributions, their convex combinations, and product distributions. The standard i.i.d. assumption is significantly relaxed. These theorems and estimates can be used both for the correction of high-dimensional data-driven AI systems and for the analysis of their vulnerabilities. A third area of application is the emergence of memories in ensembles of neurons, the phenomena of grandmother cells and sparse coding in the brain, and the explanation of the unexpected effectiveness of small neural ensembles in the high-dimensional brain.
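As a rough illustration of the Fisher separability notion summarized above (not from the paper itself), the following minimal sketch assumes the criterion commonly used in the stochastic separation literature: a point x is Fisher separable from a point y with margin alpha in [0, 1) if <x, y> < alpha <x, x>. The sketch empirically estimates the fraction of points in a random sample that are Fisher separable from all the others, and shows how this fraction grows with the dimension; the function name and parameters are illustrative assumptions, not the authors' code.

import numpy as np

def fisher_separable_fraction(X, alpha=0.8):
    """Fraction of rows x of X separable from all other rows y by the
    Fisher-type criterion <x, y> < alpha * <x, x> (assumed definition)."""
    gram = X @ X.T                       # pairwise inner products <x_i, x_j>
    norms_sq = np.diag(gram)             # squared norms <x_i, x_i>
    # mask[i, j] is True when x_i separates itself from x_j
    mask = gram < alpha * norms_sq[:, None]
    np.fill_diagonal(mask, True)         # a point need not separate from itself
    return mask.all(axis=1).mean()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for d in (10, 100, 1000):            # separable fraction grows with dimension
        X = rng.uniform(-1.0, 1.0, size=(1000, d))
        print(d, fisher_separable_fraction(X))

The uniform distribution on a cube used here is just one convenient log-concave example; the theorems in the paper cover log-concave distributions, their convex combinations, and product distributions with relaxed i.i.d. assumptions.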