Recent Trends in 2D Object Detection and Applications in Video Event Recognition

Paper Title

Recent Trends in 2D Object Detection and Applications in Video Event Recognition

Authors

Prithwish Jana, Partha Pratim Mohanta

Abstract

Object detection is a significant step toward improving the performance of complex downstream computer vision tasks. It has been studied extensively for many years, and current state-of-the-art 2D object detection techniques deliver excellent results even on complex images. In this chapter, we discuss the pioneering geometry-based works in object detection, followed by the recent breakthroughs that employ deep learning. Some of these use a monolithic architecture that takes an RGB image as input and passes it to a feed-forward ConvNet or vision Transformer. These methods thereby predict class probabilities and bounding-box coordinates in a single unified pipeline. Two-stage architectures, on the other hand, first generate region proposals and then feed them to a CNN to extract features and predict the object category and bounding box. We also elaborate on the applications of object detection in video event recognition to achieve better fine-grained video classification performance. Further, we highlight recent datasets for 2D object detection in both images and videos, and present a comparative performance summary of various state-of-the-art object detection techniques.
