Paper Title
Text detection and recognition based on a lensless imaging system
Paper Authors
Abstract
Lensless cameras offer several advantages over conventional cameras, such as miniaturization, ease of manufacture, and low cost. However, they have not been widely adopted because of their poor image clarity and low resolution, especially for tasks with high requirements on image quality and detail, such as text detection and text recognition. To address this problem, a deep-learning-based pipeline framework was built to recognize text in three steps from raw data captured by lensless cameras. The pipeline consists of the lensless imaging model U-Net, the text detection model connectionist text proposal network (CTPN), and the text recognition model convolutional recurrent neural network (CRNN). Compared with methods that focus only on image reconstruction, the U-Net in the pipeline supplements imaging details by enhancing factors related to character categories during reconstruction, so the textual information can be detected and recognized more effectively by CTPN and CRNN from high-clarity reconstructed lensless images with fewer artifacts. Experiments on datasets of different complexities verified the applicability of text detection and recognition to lensless cameras. This study demonstrates text detection and recognition tasks in a lensless camera system and develops a basic method for novel applications.
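The three-step structure described in the abstract (U-Net reconstruction, CTPN detection, CRNN recognition) can be sketched as a simple function chain. This is a minimal illustrative sketch only: the stage functions below are hypothetical placeholders, not the authors' actual models or any real library API.

```python
# Hedged sketch of the three-stage lensless text-recognition pipeline.
# All three stage functions are hypothetical stand-ins for the trained
# U-Net, CTPN, and CRNN models described in the abstract.

def unet_reconstruct(raw_measurement):
    """Stage 1: reconstruct a viewable image from raw lensless sensor data.

    A real U-Net would map the blurred sensor measurement to a clear image
    while enhancing character-related details; here we just pass data through.
    """
    return {"image": raw_measurement}

def ctpn_detect(reconstructed):
    """Stage 2: locate text-line regions in the reconstructed image.

    A real CTPN would output bounding boxes of text lines; here we return
    one placeholder region covering the whole image.
    """
    return [{"box": (0, 0, 100, 20), "crop": reconstructed["image"]}]

def crnn_recognize(text_region):
    """Stage 3: transcribe one detected text region into a string.

    A real CRNN would decode a character sequence from the cropped region.
    """
    return "recognized text"

def lensless_text_pipeline(raw_measurement):
    """Chain the stages: reconstruction -> detection -> recognition."""
    reconstructed = unet_reconstruct(raw_measurement)
    regions = ctpn_detect(reconstructed)
    return [crnn_recognize(region) for region in regions]

if __name__ == "__main__":
    print(lensless_text_pipeline("raw sensor data"))
```

The design point the sketch captures is that each model consumes the previous stage's output directly, so improving reconstruction quality in stage 1 benefits both downstream stages without retraining them.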