Paper Title
Towards a General Purpose CNN for Long Range Dependencies in $N$D
Paper Authors
Paper Abstract
The use of Convolutional Neural Networks (CNNs) is widespread in Deep Learning due to a range of desirable model properties which result in an efficient and effective machine learning framework. However, performant CNN architectures must be tailored to specific tasks in order to incorporate considerations such as the input length, resolution, and dimensionality. In this work, we overcome the need for problem-specific CNN architectures with our Continuous Convolutional Neural Network (CCNN): a single CNN architecture equipped with continuous convolutional kernels that can be used for tasks on data of arbitrary resolution, dimensionality and length without structural changes. Continuous convolutional kernels model long-range dependencies at every layer, and remove the need for the downsampling layers and task-dependent depths of current CNN architectures. We show the generality of our approach by applying the same CCNN to a wide set of tasks on sequential (1$\mathrm{D}$) and visual (2$\mathrm{D}$) data. Our CCNN performs competitively, and often outperforms the current state-of-the-art, across all tasks considered.
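To make the idea of a continuous convolutional kernel concrete, below is a minimal sketch in PyTorch, not the authors' implementation: a small MLP maps relative coordinates to kernel values, so the same layer can produce a kernel of any size and therefore handle inputs of arbitrary length without structural changes. The class name `ContinuousKernelConv1d`, the generator architecture, and all hyperparameters are illustrative assumptions; the 2$\mathrm{D}$ case would follow analogously by feeding 2D coordinates to the generator.

```python
# Hypothetical sketch of a continuous convolutional kernel, not the authors' code:
# an MLP maps relative positions in [-1, 1] to kernel weights, so the kernel can
# be sampled at any resolution and span the full input at every layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContinuousKernelConv1d(nn.Module):
    def __init__(self, in_channels: int, out_channels: int, hidden: int = 32):
        super().__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        # Kernel generator: relative position (scalar) -> all kernel weights
        # at that position. Architecture here is an illustrative assumption.
        self.kernel_net = nn.Sequential(
            nn.Linear(1, hidden),
            nn.GELU(),
            nn.Linear(hidden, out_channels * in_channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_channels, length). Sample the kernel at as many
        # positions as the input has, so a single layer sees the whole
        # sequence and models long-range dependencies directly.
        length = x.shape[-1]
        positions = torch.linspace(-1.0, 1.0, length, device=x.device).unsqueeze(-1)
        kernel = self.kernel_net(positions)                # (length, out*in)
        kernel = kernel.t().reshape(self.out_channels, self.in_channels, length)
        # Pad so output[i] depends only on x[.. i] (a causal-style crop).
        return F.conv1d(x, kernel, padding=length - 1)[..., :length]

x = torch.randn(2, 3, 100)   # the same module works unchanged for any length
y = ContinuousKernelConv1d(3, 8)(x)
print(y.shape)               # torch.Size([2, 8, 100])
```

Because the kernel is a function of continuous coordinates rather than a fixed weight grid, no downsampling layers are needed to grow the receptive field, which is what lets one architecture serve tasks of different resolution, dimensionality, and length.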