Paper title
MT-Adapted Datasheets for Datasets: Template and Repository
Paper authors
Paper abstract
In this report, we adopt the standardized model proposed by Gebru et al. (2018) to document two popular machine translation datasets: EuroParl (Koehn, 2005) and News-Commentary (Barrault et al., 2019). During this documentation process, we adapted the original datasheet to the particular case of data consumers in the machine translation area. We also propose a repository for collecting the adapted datasheets in this research area.