Paper Title

Scene Text Image Super-Resolution via Content Perceptual Loss and Criss-Cross Transformer Blocks

Authors

Qin, Rui; Wang, Bin; Tai, Yu-Wing

Abstract

Text image super-resolution is a unique and important task to enhance readability of text images to humans. It is widely used as pre-processing in scene text recognition. However, due to the complex degradation in natural scenes, recovering high-resolution texts from the low-resolution inputs is ambiguous and challenging. Existing methods mainly leverage deep neural networks trained with pixel-wise losses designed for natural image reconstruction, which ignore the unique character characteristics of texts. A few works proposed content-based losses. However, they only focus on text recognizers' accuracy, while the reconstructed images may still be ambiguous to humans. Further, they often have weak generalizability to handle cross languages. To this end, we present TATSR, a Text-Aware Text Super-Resolution framework, which effectively learns the unique text characteristics using Criss-Cross Transformer Blocks (CCTBs) and a novel Content Perceptual (CP) Loss. The CCTB extracts vertical and horizontal content information from text images by two orthogonal transformers, respectively. The CP Loss supervises the text reconstruction with content semantics by multi-scale text recognition features, which effectively incorporates content awareness into the framework. Extensive experiments on various language datasets demonstrate that TATSR outperforms state-of-the-art methods in terms of both recognition accuracy and human perception.
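The abstract describes the CP Loss only at a high level: reconstruction is supervised by multi-scale features from a text recognizer. The sketch below illustrates that general idea and is not the authors' implementation. It assumes the recognizer features for the super-resolved and the ground-truth high-resolution image have already been extracted and flattened into lists of floats, one list per scale. The L1-style distance, the uniform default weights, and the names `mean_abs_diff` and `content_perceptual_loss` are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import List, Optional, Sequence

# One flattened feature map from a (hypothetical) pre-trained text recognizer.
FeatureMap = Sequence[float]


def mean_abs_diff(a: FeatureMap, b: FeatureMap) -> float:
    """Mean absolute difference between two equally sized feature maps."""
    if len(a) != len(b):
        raise ValueError("feature maps must have the same size")
    if len(a) == 0:
        raise ValueError("feature maps must be non-empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def content_perceptual_loss(
    sr_feats: Sequence[FeatureMap],
    hr_feats: Sequence[FeatureMap],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted sum of per-scale feature distances.

    sr_feats: recognizer features of the super-resolved image, one per scale.
    hr_feats: recognizer features of the high-resolution target, one per scale.
    weights:  per-scale weights (assumed uniform when omitted).
    """
    if len(sr_feats) != len(hr_feats):
        raise ValueError("need the same number of scales for both images")
    if weights is None:
        weights = [1.0] * len(sr_feats)
    if len(weights) != len(sr_feats):
        raise ValueError("need one weight per scale")
    return sum(
        w * mean_abs_diff(s, h) for w, s, h in zip(weights, sr_feats, hr_feats)
    )


# Toy example with two scales of made-up feature values.
sr: List[List[float]] = [[0.0, 1.0, 2.0, 3.0], [1.0, 1.0]]
hr: List[List[float]] = [[0.0, 1.0, 2.0, 5.0], [1.0, 2.0]]
loss = content_perceptual_loss(sr, hr)
```

In a training setup this term would typically be added to a pixel-wise reconstruction loss, with the recognizer kept frozen so that gradients flow only into the super-resolution network; that combination is likewise an assumption here rather than something the abstract states.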
