Paper title
Ising models of deep neural networks
Paper authors
Paper abstract
This work maps deep neural networks to classical Ising spin models, allowing them to be described using statistical thermodynamics. The density of states shows that structures emerge in the weights after training: well-trained networks span a much wider range of realizable energies than poorly trained ones. These structures propagate throughout the entire network and are not observed in individual layers. The energy values correlate with task performance, making it possible to distinguish networks by quality without access to data. Thermodynamic properties such as specific heat are also studied, revealing a higher critical temperature in trained networks.
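The abstract does not state how weights are mapped to spins, so the following is only a minimal sketch of the quantities it names, not the paper's method. It assumes a small symmetric matrix `J` stands in for network weights as Ising couplings, uses the textbook energy E = -sum_{i<j} J_ij s_i s_j with spins in {-1, +1}, and sets k_B = 1. All function names and the 3-spin example are hypothetical.

```python
import itertools
import math


def ising_energy(spins, couplings):
    """Textbook Ising energy E = -sum_{i<j} J_ij s_i s_j, spins in {-1, +1}."""
    n = len(spins)
    return -sum(
        couplings[i][j] * spins[i] * spins[j]
        for i in range(n)
        for j in range(i + 1, n)
    )


def density_of_states(couplings):
    """Count configurations per energy by exhaustive enumeration.

    Only feasible for a handful of spins (2**n configurations).
    """
    n = len(couplings)
    counts = {}
    for spins in itertools.product((-1, 1), repeat=n):
        # Round so that float noise does not split one level into several.
        e = round(ising_energy(spins, couplings), 12)
        counts[e] = counts.get(e, 0) + 1
    return counts


def specific_heat(dos, temperature):
    """C(T) = (<E^2> - <E>^2) / T^2, computed from the density of states."""
    e_min = min(dos)  # shift energies for numerical stability
    weights = {e: g * math.exp(-(e - e_min) / temperature) for e, g in dos.items()}
    z = sum(weights.values())
    mean_e = sum(e * w for e, w in weights.items()) / z
    mean_e2 = sum(e * e * w for e, w in weights.items()) / z
    return (mean_e2 - mean_e ** 2) / temperature ** 2


# Hypothetical 3-spin system with uniform couplings (assumption, not from the paper).
J = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
dos = density_of_states(J)
energy_range = max(dos) - min(dos)  # width of the realizable energy spectrum
c_at_unit_temperature = specific_heat(dos, 1.0)
```

In this toy setting, `energy_range` is the kind of quantity the abstract compares between well-trained and poorly trained networks, and a peak in `specific_heat` over a temperature sweep is the usual marker for a critical temperature. Exhaustive enumeration does not scale to real networks; a sampling method would be needed there.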