Paper Title


Leverage Unlabeled Data for Abstractive Speech Summarization with Self-Supervised Learning and Back-Summarization

Authors

Tardy, Paul, de Seynes, Louis, Hernandez, François, Nguyen, Vincent, Janiszek, David, Estève, Yannick

Abstract


Supervised approaches for Neural Abstractive Summarization require large annotated corpora that are costly to build. We present a French meeting summarization task where reports are predicted based on the automatic transcription of the meeting audio recordings. In order to build a corpus for this task, it is necessary to obtain the (automatic or manual) transcription of each meeting, and then to segment and align it with the corresponding manual report to produce training examples suitable for training. On the other hand, we have access to a very large amount of unaligned data, in particular reports without corresponding transcriptions. Reports are professionally written and well formatted, making pre-processing straightforward. In this context, we study how to take advantage of this massive amount of unaligned data using two approaches: (i) self-supervised pre-training using a target-side denoising encoder-decoder model; (ii) back-summarization, i.e., reversing the summarization process by learning to predict the transcription given the report, in order to align single reports with generated transcriptions, and using this synthetic dataset for further training. We report large improvements compared to the previous baseline (trained on aligned data only) for both approaches on two evaluation sets. Moreover, combining the two gives even better results, outperforming the baseline by a large margin of +6 ROUGE-1 and ROUGE-L and +5 ROUGE-2 on two evaluation sets.
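The back-summarization idea described in the abstract can be sketched as a simple data-augmentation loop: train a reverse (report-to-transcription) model on the aligned pairs, use it to synthesize transcriptions for the unaligned reports, and merge the synthetic pairs with the real ones for forward training. The sketch below is illustrative only; `train_seq2seq` is a hypothetical stand-in for a real encoder-decoder trainer, not the authors' implementation.

```python
# Hedged sketch of back-summarization as data augmentation.
# train_seq2seq is a placeholder: a real system would fit a
# neural encoder-decoder here instead of this trivial stand-in.

def train_seq2seq(pairs):
    """Placeholder 'trainer' returning a toy generation function.

    The returned model just reverses the source token list so the
    data flow can be demonstrated end to end.
    """
    def model(src_tokens):
        return list(reversed(src_tokens))
    return model

def back_summarize(aligned_pairs, unaligned_reports):
    """aligned_pairs: list of (transcription, report) token lists.
    unaligned_reports: reports with no corresponding transcription."""
    # 1. Train the reverse model: report -> transcription.
    reverse_model = train_seq2seq([(rep, tr) for tr, rep in aligned_pairs])
    # 2. Generate a synthetic transcription for each unaligned report.
    synthetic_pairs = [(reverse_model(rep), rep) for rep in unaligned_reports]
    # 3. Combine real and synthetic pairs for forward summarization training.
    return aligned_pairs + synthetic_pairs

aligned = [(["hello", "world"], ["greeting"])]
unaligned = [["minutes", "of", "meeting"]]
augmented = back_summarize(aligned, unaligned)
```

In the paper's setting, step 2 is what turns the large pool of report-only data into usable (transcription, report) training examples for the forward summarizer.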
