Paper Title

Understanding Deep Architectures with Reasoning Layer

Authors

Xinshi Chen, Yufei Zhang, Christoph Reisinger, Le Song

Abstract

Recently, there has been a surge of interest in combining deep learning models with reasoning in order to handle more sophisticated learning tasks. In many cases, a reasoning task can be solved by an iterative algorithm. This algorithm is often unrolled, and used as a specialized layer in the deep architecture, which can be trained end-to-end with other neural components. Although such hybrid deep architectures have led to many empirical successes, the theoretical foundation of such architectures, especially the interplay between algorithm layers and other neural layers, remains largely unexplored. In this paper, we take an initial step towards an understanding of such hybrid deep architectures by showing that properties of the algorithm layers, such as convergence, stability, and sensitivity, are intimately related to the approximation and generalization abilities of the end-to-end model. Furthermore, our analysis matches closely our experimental observations under various conditions, suggesting that our theory can provide useful guidelines for designing deep architectures with reasoning layers.
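
To make the idea of an unrolled algorithm layer concrete, the following is a minimal sketch and not the paper's implementation. It assumes, purely for illustration, that the reasoning task is to minimize a quadratic energy E(y) = 0.5 * y^T Q y - b^T y and that the algorithm is plain gradient descent run for a fixed number of steps. The names `unrolled_gd`, `Q`, `b`, and `step_size` are hypothetical; in a hybrid architecture, `Q` and `b` would typically be produced by a neural module rather than fixed by hand.

```python
def unrolled_gd(Q, b, num_steps, step_size):
    """Run num_steps gradient-descent iterations on
    E(y) = 0.5 * y^T Q y - b^T y, starting from y = 0.

    Each iteration plays the role of one "layer" of the unrolled algorithm.
    """
    n = len(b)
    y = [0.0] * n
    for _ in range(num_steps):
        # Gradient of the quadratic energy: Q y - b
        grad = [sum(Q[i][j] * y[j] for j in range(n)) - b[i] for i in range(n)]
        y = [y[i] - step_size * grad[i] for i in range(n)]
    return y


# Hypothetical, hand-picked energy parameters (assumption for illustration).
Q = [[2.0, 0.0], [0.0, 1.0]]
b = [2.0, 1.0]
y_star = [1.0, 1.0]  # exact minimizer, i.e. the solution of Q y = b


def error(y):
    """Largest coordinate-wise distance to the exact minimizer."""
    return max(abs(y[i] - y_star[i]) for i in range(len(y)))


# Deeper unrolling (more steps) gets closer to the exact minimizer.
errors = [error(unrolled_gd(Q, b, k, 0.4)) for k in (1, 5, 20)]
```

In this toy setting the error shrinks as the number of unrolled steps grows, which is the kind of convergence property of the algorithm layer that the abstract refers to. Stability and sensitivity would concern how the output `y` responds to perturbations of `Q` and `b`.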
