Paper Title


Benchmarking Transformers-based models on French Spoken Language Understanding tasks

Authors

Cattan, Oralie, Ghannay, Sahar, Servan, Christophe, Rosset, Sophie

Abstract


In the last five years, the rise of self-attentional Transformer-based architectures has led to state-of-the-art performance on many natural language tasks. Although these approaches are increasingly popular, they require large amounts of data and computational resources. There is still a substantial need for benchmarking methodologies on under-resourced languages in data-scarce application conditions. Most pre-trained language models have been massively studied on English, and only a few of them have been evaluated on French. In this paper, we propose a unified benchmark focused on evaluating model quality and ecological impact on two well-known French spoken language understanding tasks. In particular, we benchmark thirteen well-established Transformer-based models on the two available spoken language understanding tasks for French: MEDIA and ATIS-FR. Within this framework, we show that compact models can reach results comparable to bigger ones while their ecological impact is considerably lower. However, this conclusion is nuanced and depends on the compression method considered.
