Paper title
Short sighted deep learning
Paper authors
Paper abstract
A theory explaining how deep learning works is yet to be developed. Previous work suggests that deep learning performs a coarse graining, similar in spirit to the renormalization group (RG). This idea has been explored in the setting of a local (nearest neighbor interactions) Ising spin lattice. We extend the discussion to the setting of a long range spin lattice. Markov Chain Monte Carlo (MCMC) simulations determine both the critical temperature and the scaling dimensions of the system. The model is used to train both a single RBM (restricted Boltzmann machine) network and a stacked RBM network. Following earlier Ising model studies, the trained weights of a single layer RBM network define a flow of lattice models. In contrast to results for nearest neighbor Ising, the RBM flow for the long range model does not converge to the correct values of the spin and energy scaling dimensions. Further, correlation functions between visible and hidden nodes exhibit key differences between the stacked RBM and RG flows. The stacked RBM flow appears to move towards low temperature, whereas the RG flow moves towards high temperature. This again differs from the results obtained for nearest neighbor Ising.
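To illustrate the kind of MCMC simulation the abstract refers to, the following is a minimal Metropolis sketch (not the authors' code) for a one-dimensional Ising chain with power-law couplings J_ij = 1/|i-j|^(1+sigma); the lattice geometry, the exponent sigma, and the temperature are illustrative assumptions, not values taken from the paper.

```python
# Minimal Metropolis MCMC sketch for a long range Ising chain (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

def couplings(n, sigma=0.75):
    """Pairwise couplings J_ij = |i - j|^-(1+sigma), with J_ii = 0."""
    idx = np.arange(n)
    dist = np.abs(idx[:, None] - idx[None, :]).astype(float)
    np.fill_diagonal(dist, np.inf)        # inf ** negative power gives 0 on the diagonal
    return dist ** (-(1.0 + sigma))

def metropolis_sweep(spins, J, beta):
    """One sweep: propose a single-spin flip for every site in random order."""
    for i in rng.permutation(spins.size):
        # Energy change for flipping spin i: dE = 2 * s_i * sum_j J_ij s_j
        dE = 2.0 * spins[i] * np.dot(J[i], spins)
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i] = -spins[i]
    return spins

# Illustrative run at an arbitrary inverse temperature.
n, beta = 128, 0.3
J = couplings(n)
spins = rng.choice([-1.0, 1.0], size=n)
for sweep in range(2000):
    metropolis_sweep(spins, J, beta)
print(f"|magnetization| per spin: {abs(spins.mean()):.3f}")
```

In practice, estimates of the critical temperature and scaling dimensions would come from measuring observables such as magnetization and correlation functions over many such sweeps at a range of temperatures.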
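The "RBM flow" mentioned above iterates spin configurations through the trained weights of a single-layer RBM. Below is a minimal sketch of that idea (a hypothetical implementation, not the paper's), assuming {0,1}-encoded spin configurations as training data and plain CD-1 training; all shapes and hyperparameters are illustrative.

```python
# Minimal RBM + RBM-flow sketch (illustrative assumptions, not the authors' code).
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli RBM on {0,1} units, trained with one step of contrastive divergence."""
    def __init__(self, n_visible, n_hidden, lr=0.05):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible bias
        self.c = np.zeros(n_hidden)    # hidden bias
        self.lr = lr

    def sample_hidden(self, v):
        p = sigmoid(v @ self.W + self.c)
        return p, (rng.random(p.shape) < p).astype(float)

    def sample_visible(self, h):
        p = sigmoid(h @ self.W.T + self.b)
        return p, (rng.random(p.shape) < p).astype(float)

    def cd1_update(self, v0):
        # One CD-1 step: positive phase from the data, negative phase from one reconstruction.
        ph0, h0 = self.sample_hidden(v0)
        _, v1 = self.sample_visible(h0)
        ph1, _ = self.sample_hidden(v1)
        batch = v0.shape[0]
        self.W += self.lr * (v0.T @ ph0 - v1.T @ ph1) / batch
        self.b += self.lr * (v0 - v1).mean(axis=0)
        self.c += self.lr * (ph0 - ph1).mean(axis=0)

def rbm_flow(rbm, v, n_steps):
    """Generate a flow of configurations by repeatedly mapping
    visible -> hidden -> visible through the trained weights."""
    configs = [v]
    for _ in range(n_steps):
        _, h = rbm.sample_hidden(configs[-1])
        _, v_next = rbm.sample_visible(h)
        configs.append(v_next)
    return configs

# Illustrative usage: random binary data stands in for MCMC-generated Ising samples.
data = (rng.random((256, 64)) < 0.5).astype(float)
rbm = RBM(n_visible=64, n_hidden=32)
for epoch in range(50):
    rbm.cd1_update(data)
flow = rbm_flow(rbm, data, n_steps=5)
```

In the setting described by the abstract, the configurations produced at each step of such a flow would be compared against lattice models at different temperatures, which is how one asks whether the flow reproduces the spin and energy scaling dimensions or drifts towards low or high temperature.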