Paper Title

Update Frequently, Update Fast: Retraining Semantic Parsing Systems in a Fraction of Time

Paper Authors

Vladislav Lialin, Rahul Goel, Andrey Simanovsky, Anna Rumshisky, Rushin Shah

Paper Abstract

Currently used semantic parsing systems deployed in voice assistants can require weeks to train. Datasets for these models often receive small and frequent updates, known as data patches. Each patch requires training a new model. To reduce training time, one can fine-tune the previously trained model on each patch, but naive fine-tuning exhibits catastrophic forgetting: the model's performance degrades on data not represented in the data patch. In this work, we propose a simple method that alleviates catastrophic forgetting and show that it is possible to match the performance of a model trained from scratch in less than 10% of the time via fine-tuning. The key to achieving this is supersampling and EWC (elastic weight consolidation) regularization. We demonstrate the effectiveness of our method on multiple splits of the Facebook TOP and SNIPS datasets.
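
The abstract credits supersampling and EWC regularization for avoiding catastrophic forgetting during fine-tuning on a patch. The following is a minimal illustrative sketch in PyTorch of how an EWC penalty is typically added to a fine-tuning loss; it is not the authors' released code, and names such as estimate_fisher, ewc_penalty, old_params, and fisher are assumptions made for illustration.

    import torch

    def estimate_fisher(model, data_loader, loss_fn):
        # Diagonal Fisher information, estimated from squared gradients
        # of the loss over the old (pre-patch) training data.
        fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
        model.eval()
        for inputs, targets in data_loader:
            model.zero_grad()
            loss_fn(model(inputs), targets).backward()
            for n, p in model.named_parameters():
                if p.grad is not None:
                    fisher[n] += p.grad.detach() ** 2
        return {n: f / len(data_loader) for n, f in fisher.items()}

    def ewc_penalty(model, old_params, fisher, lam=1.0):
        # Penalize deviation from the previously trained parameters,
        # weighted by how important each parameter was on the old data.
        penalty = 0.0
        for n, p in model.named_parameters():
            penalty = penalty + (fisher[n] * (p - old_params[n]) ** 2).sum()
        return lam * penalty

    # During fine-tuning on a data patch, the total loss would then be:
    #   loss = task_loss_on_patch + ewc_penalty(model, old_params, fisher)
    # Supersampling, as the term suggests, amounts to repeating the small
    # patch multiple times per epoch when mixing it with the old data.

Here old_params would be a snapshot of the previously trained model's parameters (detached copies), and lam trades off plasticity on the patch against stability on the old data.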
