NADI 2020: The First Nuanced Arabic Dialect Identification Shared Task

Paper title

NADI 2020: The First Nuanced Arabic Dialect Identification Shared Task

Authors

Muhammad Abdul-Mageed, Chiyu Zhang, Houda Bouamor, Nizar Habash

Abstract

We present the results and findings of the First Nuanced Arabic Dialect Identification Shared Task (NADI). The shared task includes two subtasks: country-level dialect identification (Subtask 1) and province-level sub-dialect identification (Subtask 2). The data for the shared task cover a total of 100 provinces from 21 Arab countries and were collected from Twitter. As such, NADI is the first shared task to target naturally occurring fine-grained dialectal text at the sub-country level. A total of 61 teams from 25 countries registered to participate in the tasks, reflecting the community's interest in this area. We received 47 submissions for Subtask 1 from 18 teams and 9 submissions for Subtask 2 from 9 teams.
