Paper Title
The Lab vs The Crowd: An Investigation into Data Quality for Neural Dialogue Models
Paper Authors
Paper Abstract
Challenges around collecting and processing quality data have hampered progress in data-driven dialogue models. Previous approaches have been moving away from costly, resource-intensive lab settings, where collection is slow but the data is deemed to be of high quality. The advent of crowd-sourcing platforms, such as Amazon Mechanical Turk, has provided researchers with an alternative, cost-effective, and rapid way to collect data. However, collecting fluid, natural spoken or textual interaction can be challenging, particularly between two crowd-sourced workers. In this study, we compare the performance of dialogue models trained on data for the same interaction task but collected in two different settings: in the lab vs. crowd-sourced. We find that fewer lab dialogues are needed to reach similar accuracy: less than half as much lab data as crowd-sourced data. We discuss the advantages and disadvantages of each data collection method.