HarperValleyBank: A Domain-Specific Spoken Dialog Corpus

Paper title

HarperValleyBank: A Domain-Specific Spoken Dialog Corpus

Paper authors

Mike Wu, Jonathan Nafziger, Anthony Scodary, Andrew Maas

Abstract

We introduce HarperValleyBank, a free, public domain spoken dialog corpus. The data simulate simple consumer banking interactions, containing about 23 hours of audio from 1,446 human-human conversations between 59 unique speakers. We selected intents and utterance templates to allow realistic variation while controlling overall task complexity and limiting vocabulary size to about 700 unique words. We provide audio data along with transcripts and annotations for speaker identity, caller intent, dialog actions, and emotional valence. The data size and domain specificity make for quick transcription experiments with modern end-to-end neural approaches. Further, we provide baselines for representation learning, adapting recent work to embed waveforms for downstream prediction tasks. Our experiments show that tasks using our annotations are sensitive to both the model choice and corpus size.
