Paper Title
Spatial-temporal Concept based Explanation of 3D ConvNets
Authors
Abstract
Recent studies have achieved outstanding success in explaining 2D image recognition ConvNets. In contrast, because of the computational cost and complexity of video data, the explanation of 3D video recognition ConvNets has received relatively little study. In this paper, we present a 3D ACE (Automatic Concept-based Explanation) framework for interpreting 3D ConvNets. In our approach: (1) videos are represented using high-level supervoxels, which are straightforward for humans to understand; and (2) the interpreting framework estimates a score for each voxel, which reflects its importance in the decision procedure. Experiments show that our method can discover spatial-temporal concepts of different importance levels, and can therefore explore in depth the influence of these concepts on a target task such as action classification. The code is publicly available.