Neural Architecture Adaptation for Object Detection by Searching Channel Dimensions and Mapping Pre-trained Parameters

Paper title

Neural Architecture Adaptation for Object Detection by Searching Channel Dimensions and Mapping Pre-trained Parameters

Authors

Harim Jung, Myeong-Seok Oh, Cheoljong Yang, Seong-Whan Lee

Abstract

Most object detection frameworks use backbone architectures originally designed for image classification, conventionally with pre-trained parameters on ImageNet. However, image classification and object detection are essentially different tasks and there is no guarantee that the optimal backbone for classification is also optimal for object detection. Recent neural architecture search (NAS) research has demonstrated that automatically designing a backbone specifically for object detection helps improve the overall accuracy. In this paper, we introduce a neural architecture adaptation method that can optimize the given backbone for detection purposes, while still allowing the use of pre-trained parameters. We propose to adapt both the micro- and macro-architecture by searching for specific operations and the number of layers, in addition to the output channel dimensions of each block. It is important to find the optimal channel depth, as it greatly affects the feature representation capability and computation cost. We conduct experiments with our searched backbone for object detection and demonstrate that our backbone outperforms both manually designed and searched state-of-the-art backbones on the COCO dataset.
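
The abstract's point that channel depth strongly affects computation cost can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' method: it only counts parameters and multiply-accumulate operations (MACs) for a single bias-free 3x3 convolution. The helper name `conv2d_cost`, the candidate widths, and the 56x56 output size are all assumptions chosen for illustration.

```python
def conv2d_cost(c_in, c_out, kernel_size, out_h, out_w):
    """Return (parameters, MACs) for one bias-free 2D convolution layer."""
    # Each output channel owns one kernel_size x kernel_size filter per input channel.
    params = c_in * c_out * kernel_size * kernel_size
    # Every weight is applied once at each output spatial position.
    macs = params * out_h * out_w
    return params, macs


# Hypothetical candidate widths for one block; these are not taken from the paper.
candidate_widths = [128, 256, 512]
costs = {c: conv2d_cost(c, c, 3, 56, 56) for c in candidate_widths}

for width, (params, macs) in costs.items():
    print(f"width={width} params={params} MACs={macs}")
```

When the input and output widths are scaled together, doubling the width quadruples both parameters and MACs for the layer, which is why a per-block choice of output channels trades accuracy against cost so directly.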
