Paper Title

Incremental Processing in the Age of Non-Incremental Encoders: An Empirical Assessment of Bidirectional Models for Incremental NLU

Paper Authors

Brielen Madureira, David Schlangen

Paper Abstract

While humans process language incrementally, the best language encoders currently used in NLP do not. Both bidirectional LSTMs and Transformers assume that the sequence that is to be encoded is available in full, to be processed either forwards and backwards (BiLSTMs) or as a whole (Transformers). We investigate how they behave under incremental interfaces, when partial output must be provided based on partial input seen up to a certain time step, which may happen in interactive systems. We test five models on various NLU datasets and compare their performance using three incremental evaluation metrics. The results support the possibility of using bidirectional encoders in incremental mode while retaining most of their non-incremental quality. The "omni-directional" BERT model, which achieves better non-incremental performance, is impacted more by the incremental access. This can be alleviated by adapting the training regime (truncated training), or the testing procedure, by delaying the output until some right context is available or by incorporating hypothetical right contexts generated by a language model like GPT-2.
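
To make the incremental interface concrete, here is a minimal sketch (not the authors' code) of re-running a bidirectional tagger on every growing prefix of an input, with the two mitigation strategies the abstract mentions: withholding the most recent outputs until some right context arrives (delayed output), and letting GPT-2 hypothesize a right context before encoding. The checkpoints (`bert-base-cased`, `gpt2`), the helper `label_prefix`, and the toy sentence are illustrative assumptions; the untrained classification head stands in for a model fine-tuned on an NLU task.

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForTokenClassification,
    AutoTokenizer,
)

# Bidirectional encoder used as a sequence tagger. NOTE: this checkpoint's
# classification head is randomly initialised; a real experiment would use
# a checkpoint fine-tuned on the target NLU task.
tagger_tok = AutoTokenizer.from_pretrained("bert-base-cased")
tagger = AutoModelForTokenClassification.from_pretrained("bert-base-cased")

# Causal LM that hypothesises a right context for incomplete prefixes.
lm_tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")


def label_prefix(prefix_words, delay=0, hyp_right=0):
    """Tag a prefix; withhold the last `delay` labels (delayed output).

    If hyp_right > 0, GPT-2 greedily continues the prefix, and the generated
    words act as a hypothetical right context for the bidirectional encoder.
    """
    words = list(prefix_words)
    if hyp_right > 0:
        ids = lm_tok(" ".join(words), return_tensors="pt").input_ids
        out = lm.generate(ids, max_new_tokens=hyp_right, do_sample=False,
                          pad_token_id=lm_tok.eos_token_id)
        words = lm_tok.decode(out[0], skip_special_tokens=True).split()

    enc = tagger_tok(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        logits = tagger(**enc).logits[0]

    # Keep one prediction per word (its first subword) and drop predictions
    # for the hypothetical right-context words.
    preds, seen = [], set()
    for pos, wid in enumerate(enc.word_ids()):
        if wid is not None and wid not in seen and wid < len(prefix_words):
            seen.add(wid)
            preds.append(int(logits[pos].argmax()))
    return preds[:max(len(prefix_words) - delay, 0)]


# Incremental interface: re-encode every growing prefix; outputs committed
# early may be revised once more (real or hypothetical) right context arrives.
sentence = "the old man the boats".split()
for t in range(1, len(sentence) + 1):
    print(sentence[:t], label_prefix(sentence[:t], delay=1, hyp_right=2))
```

Because each prefix is re-encoded from scratch, earlier hypotheses can change as input accumulates; quantifying how much they change, and how that trades off against final quality, is what the paper's incremental evaluation metrics are for.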
