Construction of a Quality Estimation Dataset for Automatic Evaluation of Japanese Grammatical Error Correction

Paper title

Construction of a Quality Estimation Dataset for Automatic Evaluation of Japanese Grammatical Error Correction

Authors

Suzuki, Daisuke; Takahashi, Yujin; Yamashita, Ikumi; Aida, Taichi; Hirasawa, Tosho; Nakatsuji, Michitaka; Mita, Masato; Komachi, Mamoru

Abstract

In grammatical error correction (GEC), automatic evaluation is an important factor in the research and development of GEC systems. Previous studies on automatic evaluation have demonstrated that quality estimation models built from manually evaluated datasets can achieve high performance in the automatic evaluation of English GEC without using reference sentences. However, quality estimation models have not yet been studied for Japanese, because no dataset exists for constructing them. Therefore, in this study, we created a quality estimation dataset with manual evaluation to build an automatic evaluation model for Japanese GEC. Moreover, we conducted a meta-evaluation to verify the dataset's usefulness in building a Japanese quality estimation model.
