Paper Title

BARThez: a Skilled Pretrained French Sequence-to-Sequence Model

Authors

Moussa Kamal Eddine, Antoine J.-P. Tixier, Michalis Vazirgiannis

Abstract

Inductive transfer learning has taken the entire NLP field by storm, with models such as BERT and BART setting new state of the art on countless NLU tasks. However, most of the available models and research have been conducted for English. In this work, we introduce BARThez, the first large-scale pretrained seq2seq model for French. Being based on BART, BARThez is particularly well-suited for generative tasks. We evaluate BARThez on five discriminative tasks from the FLUE benchmark and two generative tasks from a novel summarization dataset, OrangeSum, that we created for this research. We show BARThez to be very competitive with state-of-the-art BERT-based French language models such as CamemBERT and FlauBERT. We also continue the pretraining of a multilingual BART on BARThez' corpus, and show our resulting model, mBARThez, to significantly boost BARThez' generative performance. Code, data and models are publicly available.
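
Since the abstract states that the models are publicly available, the following is a minimal sketch of loading a BARThez checkpoint for French abstractive summarization with the Hugging Face transformers library. The hub identifier moussaKam/barthez-orangesum-abstract is an assumption based on the authors' public releases and is not stated in the abstract; substitute whatever checkpoint name the official repository actually publishes.

```python
# Minimal sketch (not from the paper): running an assumed BARThez
# checkpoint fine-tuned on OrangeSum for abstractive summarization.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed Hugging Face hub id; replace with the officially released name.
model_name = "moussaKam/barthez-orangesum-abstract"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Any French news article works as input; this is just illustrative text.
article = (
    "Un fournisseur d'accès Internet de l'Idaho a décidé de bloquer "
    "Facebook et Twitter, citant les préoccupations de ses clients."
)

inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(
    inputs["input_ids"],
    num_beams=4,        # beam search, a common decoding choice for summarization
    max_length=64,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```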
