Paper Title

Complexity from Adaptive-Symmetries Breaking: Global Minima in the Statistical Mechanics of Deep Neural Networks

Author

Li, Shawn W. M.

Abstract

An antithetical concept to conservative symmetry in physics, adaptive symmetry, is proposed to understand deep neural networks (DNNs). It characterizes the invariance of variance, where a biotic system explores different pathways of evolution with equal probability in the absence of feedback signals, and complex functional structure emerges from the quantitative accumulation of adaptive-symmetries breaking in response to feedback signals. Theoretically and experimentally, we characterize the optimization process of a DNN system as an extended adaptive-symmetry-breaking process. One particular finding is that a hierarchically large DNN would have a large reservoir of adaptive symmetries, and when the information capacity of the reservoir exceeds the complexity of the dataset, the system could absorb all perturbations of the examples and self-organize into a functional structure of zero training errors measured by a certain surrogate risk. More specifically, this process is characterized by a statistical-mechanical model that could be appreciated as a generalization of statistical physics to the DNN organized complex system, and that characterizes regularities in higher dimensionality. The model consists of three constituents that could be appreciated as the counterparts of the Boltzmann distribution, the Ising model, and conservative symmetry, respectively: (1) a stochastic definition/interpretation of DNNs as a multilayer probabilistic graphical model, (2) a formalism of circuits that perform biological computation, and (3) a circuit symmetry from which self-similarity between the microscopic and the macroscopic adaptability manifests. The model is analyzed with a method referred to as the statistical assembly method, which analyzes the coarse-grained behaviors (over a symmetry group) of the heterogeneous hierarchical many-body interaction in DNNs.
