Paper Title
Towards Disentangling Information Paths with Coded ResNeXt
Paper Authors
Paper Abstract
The conventional, widely used treatment of deep learning models as black boxes provides limited or no insight into the mechanisms that guide neural network decisions. Significant research effort has been dedicated to building interpretable models to address this issue. Most efforts either focus on the high-level features associated with the last layers, or attempt to interpret the output of a single layer. In this paper, we take a novel approach to enhance the transparency of the whole network's function. We propose a neural network architecture for classification in which the information relevant to each class flows through specific paths. These paths are designed before training by leveraging coding theory, without depending on the semantic similarities between classes. A key property is that each path can be used as an autonomous single-purpose model. This enables us to obtain, without any additional training and for any class, a lightweight binary classifier that has at least $60\%$ fewer parameters than the original network. Furthermore, our coding-theory-based approach allows the neural network to make early predictions at intermediate layers during inference, without requiring its full evaluation. Remarkably, the proposed architecture provides all the aforementioned properties while improving the overall accuracy. We demonstrate these properties on a slightly modified ResNeXt model tested on CIFAR-10/100 and ImageNet-1k.
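The core idea of pre-designed per-class paths can be illustrated with a toy sketch. This is not the paper's actual construction: the fixed-weight codeword scheme, the function names, and the branch/weight numbers below are illustrative assumptions. It shows how each class could be assigned a distinct binary codeword over a block's branches, and how keeping only one class's active branches yields a smaller single-purpose subnetwork.

```python
from itertools import combinations

def fixed_weight_codewords(num_classes, num_branches, weight):
    """Illustrative assignment (not the paper's algorithm): give each
    class a distinct binary codeword of fixed Hamming weight. A 1 means
    the branch carries that class's information; a 0 means the branch is
    dropped when that class's standalone path is extracted."""
    words = []
    for active in combinations(range(num_branches), weight):
        words.append(tuple(1 if b in active else 0
                           for b in range(num_branches)))
        if len(words) == num_classes:
            return words
    raise ValueError("not enough distinct codewords; increase num_branches")

def path_parameter_fraction(weight, num_branches):
    """Fraction of a coded block's branch parameters retained when only
    the branches in a single class's codeword are kept."""
    return weight / num_branches

# Example: 10 classes, blocks with 8 parallel branches, 3 active per class.
codes = fixed_weight_codewords(10, 8, 3)
assert len(set(codes)) == 10          # every class gets its own distinct path
assert all(sum(w) == 3 for w in codes)
print(path_parameter_fraction(3, 8))  # 0.375: >60% of branch params removed
```

With 3 of 8 branches active per class, the extracted per-class path keeps 37.5% of the coded blocks' branch parameters, matching the flavor of the "at least 60% fewer parameters" claim in the abstract (the real architecture's reduction depends on its actual code and which layers are coded).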