Paper Title
SoftCTC -- Semi-Supervised Learning for Text Recognition using Soft Pseudo-Labels
Paper Authors
Abstract
This paper explores semi-supervised training for sequence tasks, such as Optical Character Recognition or Automatic Speech Recognition. We propose a novel loss function, SoftCTC, which is an extension of CTC that allows multiple transcription variants to be considered at the same time. This makes it possible to omit the confidence-based filtering step, which is otherwise a crucial component of pseudo-labeling approaches to semi-supervised learning. We demonstrate the effectiveness of our method on a challenging handwriting recognition task and conclude that SoftCTC matches the performance of a finely tuned filtering-based pipeline. We also evaluate SoftCTC in terms of computational efficiency, concluding that it is significantly more efficient than a naïve CTC-based approach for training on multiple transcription variants, and we make our GPU implementation public.
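To make the efficiency comparison concrete, the following is a minimal sketch of what a naïve CTC-based treatment of multiple transcription variants could look like. It is not the SoftCTC loss and not the authors' implementation. The function names `ctc_prob` and `naive_multi_variant_loss`, the weighted-sum combination rule, and the toy inputs are all assumptions made for illustration. The sketch runs the standard CTC forward recursion once per variant, so its cost grows linearly with the number of variants.

```python
import math


def ctc_prob(probs, label, blank=0):
    """Standard CTC forward recursion (plain probabilities, no log-space).

    probs: list of T frames, each a list of per-symbol probabilities.
    label: list of symbol indices without blanks.
    Returns the total probability of all alignments that collapse to `label`.
    """
    # Interleave blanks: [blank, y1, blank, y2, ..., blank]
    ext = [blank]
    for sym in label:
        ext += [sym, blank]
    n_states = len(ext)

    alpha = [0.0] * n_states
    alpha[0] = probs[0][blank]
    if n_states > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * n_states
        for s in range(n_states):
            acc = alpha[s]
            if s >= 1:
                acc += alpha[s - 1]
            # Skipping a blank is allowed only between two different symbols.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                acc += alpha[s - 2]
            new[s] = acc * probs[t][ext[s]]
        alpha = new

    return alpha[-1] + (alpha[-2] if n_states > 1 else 0.0)


def naive_multi_variant_loss(probs, variants):
    """Hypothetical naive baseline: confidence-weighted sum of per-variant
    CTC negative log-likelihoods. One full forward pass per variant.

    variants: list of (label, weight) pairs; weights are assumed to sum to 1.
    """
    return sum(-w * math.log(ctc_prob(probs, label)) for label, w in variants)


# Toy example: 2 frames, 3 symbols (0 = blank, 1 = "a", 2 = "b"), uniform outputs.
frames = [[1 / 3, 1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3]]
pseudo_labels = [([1], 0.5), ([1, 2], 0.5)]  # two variants with equal confidence
loss = naive_multi_variant_loss(frames, pseudo_labels)
```

A practical implementation would work in log-space for numerical stability. The sketch is only meant to show that the per-variant approach repeats the whole recursion for every candidate transcription.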