Paper Title
Geometric Rectification of Creased Document Images based on Isometric Mapping
Authors
Abstract
Geometric rectification of distorted document images is widely used in document digitization and Optical Character Recognition (OCR). Although smoothly curved deformations have been investigated extensively, the most challenging distortions, such as complex creases and large folds, have not been specifically studied. The performance of existing approaches on heavily creased or folded documents is far from satisfactory, leaving substantial room for improvement. To tackle this task, prior knowledge about document rectification should be incorporated into the computation; the most essential pieces are the developability of the 3D document model and particular textural features in the image, such as straight lines. For this purpose, we propose a general framework for document image rectification in which a computational isometric mapping model expresses a 3D document model and its flattening in the plane. Within this framework, both model developability and textural features are taken into account in the computation. Experiments and comparisons with state-of-the-art approaches demonstrate the effectiveness and outstanding performance of the proposed method. The method is also flexible: the rectification results can be further improved by any other method that extracts high-quality feature lines from the images.
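To make the isometry constraint mentioned in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy setting: a unit sheet sampled on a triangulated grid and folded along a single straight crease. All names (`fold_sheet`, `triangle_edges`, `max_edge_distortion`) are hypothetical and introduced only for illustration. The sketch checks that a developable (creased) 3D sheet and its planar flattening have matching mesh edge lengths, which is the discrete property an isometric mapping preserves.

```python
import math


def fold_sheet(nx, ny, crease_x, angle):
    """Sample a unit sheet on an (nx+1) x (ny+1) grid.

    Returns (flat, folded): 2D flattened vertices and 3D vertices of the
    same sheet folded by `angle` radians about the line x = crease_x.
    Assumption: crease_x coincides with a grid column, so no mesh cell
    straddles the crease.
    """
    flat, folded = [], []
    for j in range(ny + 1):
        for i in range(nx + 1):
            u, v = i / nx, j / ny
            flat.append((u, v))
            if u <= crease_x:
                folded.append((u, v, 0.0))
            else:
                d = u - crease_x  # distance past the crease, measured in the sheet
                folded.append((crease_x + d * math.cos(angle), v, d * math.sin(angle)))
    return flat, folded


def triangle_edges(nx, ny):
    """Edges of the grid with one diagonal per cell (a triangle mesh)."""
    edges = []
    for j in range(ny + 1):
        for i in range(nx + 1):
            k = j * (nx + 1) + i
            if i < nx:
                edges.append((k, k + 1))            # horizontal edge
            if j < ny:
                edges.append((k, k + nx + 1))       # vertical edge
            if i < nx and j < ny:
                edges.append((k, k + nx + 2))       # cell diagonal
    return edges


def max_edge_distortion(flat, folded, edges):
    """Largest absolute difference between 3D and 2D edge lengths."""
    return max(abs(math.dist(folded[a], folded[b]) - math.dist(flat[a], flat[b]))
               for a, b in edges)


if __name__ == "__main__":
    flat, folded = fold_sheet(4, 4, 0.5, math.pi / 3)
    edges = triangle_edges(4, 4)
    print("creased sheet distortion:", max_edge_distortion(flat, folded, edges))

    # Pulling one vertex off the sheet stretches it, so the map is no longer isometric.
    stretched = list(folded)
    x, y, z = stretched[-1]
    stretched[-1] = (x, y, z + 0.1)
    print("stretched sheet distortion:", max_edge_distortion(flat, stretched, edges))
```

In this toy case the fold is a rigid rotation of one half of the sheet, so edge lengths match up to floating-point error. A rectification method in the spirit of the abstract would instead solve for the flattening while penalizing this kind of distortion together with feature terms such as line straightness; that optimization is not shown here.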