Paper Title
How to Train Unstable Looped Tensor Network
Paper Authors
Paper Abstract
A rising problem in the compression of Deep Neural Networks is how to reduce the number of parameters in convolutional kernels and the complexity of these layers by low-rank tensor approximation. Canonical polyadic tensor decomposition (CPD) and Tucker tensor decomposition (TKD) are two solutions to this problem and provide promising results. However, CPD often fails due to degeneracy, which makes the networks unstable and hard to fine-tune. TKD does not provide much compression if the core tensor is large. This motivates using a hybrid model of CPD and TKD: a decomposition into multiple Tucker models with small core tensors, known as block term decomposition (BTD). This paper proposes a more compact model that further compresses the BTD by enforcing the core tensors in the BTD to be identical. We establish a link between the BTD with shared parameters and a looped chain tensor network (TC). Unfortunately, such strongly constrained tensor networks (with loops) encounter severe numerical instability, as proved in (Landsberg, 2012) and (Handschuh, 2015a). We study the perturbation of chain tensor networks, provide an interpretation of the instability in TC, and demonstrate the problem. We propose novel methods to stabilize the decomposition results, keep the network robust, and attain better approximation. Experimental results confirm the superiority of the proposed methods in the compression of well-known CNNs and in TC decomposition under challenging scenarios.
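For readers unfamiliar with the decompositions named in the abstract, the following is a minimal sketch in standard tensor notation (the symbols $\mathcal{K}$, $\mathcal{G}$, $\mathbf{A}_r$, and the rank $R$ are illustrative assumptions, not taken from the paper), written for a 4-way convolutional kernel $\mathcal{K}$:

$$\text{CPD:}\quad \mathcal{K} \approx \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r \circ \mathbf{d}_r$$

$$\text{TKD:}\quad \mathcal{K} \approx \mathcal{G} \times_1 \mathbf{A} \times_2 \mathbf{B} \times_3 \mathbf{C} \times_4 \mathbf{D}$$

$$\text{BTD:}\quad \mathcal{K} \approx \sum_{r=1}^{R} \mathcal{G}_r \times_1 \mathbf{A}_r \times_2 \mathbf{B}_r \times_3 \mathbf{C}_r \times_4 \mathbf{D}_r$$

The shared-core model described in the abstract corresponds to the additional constraint $\mathcal{G}_1 = \cdots = \mathcal{G}_R = \mathcal{G}$, so that a single small core is stored once and reused across all Tucker terms, which is what yields the extra compression over plain BTD.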