Paper Title

Resource Constrained Dialog Policy Learning via Differentiable Inductive Logic Programming

Authors

Zhou, Zhenpeng, Beirami, Ahmad, Crook, Paul, Shah, Pararth, Subba, Rajen, Geramifard, Alborz

Abstract

Motivated by the needs of resource constrained dialog policy learning, we introduce dialog policy via differentiable inductive logic (DILOG). We explore the tasks of one-shot learning and zero-shot domain transfer with DILOG on SimDial and MultiWoZ. Using a single representative dialog from the restaurant domain, we train DILOG on the SimDial dataset and obtain 99+% in-domain test accuracy. We also show that the trained DILOG zero-shot transfers to all other domains with 99+% accuracy, proving the suitability of DILOG to slot-filling dialogs. We further extend our study to the MultiWoZ dataset achieving 90+% inform and success metrics. We also observe that these metrics are not capturing some of the shortcomings of DILOG in terms of false positives, prompting us to measure an auxiliary Action F1 score. We show that DILOG is 100x more data efficient than state-of-the-art neural approaches on MultiWoZ while achieving similar performance metrics. We conclude with a discussion on the strengths and weaknesses of DILOG.
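The abstract introduces an auxiliary Action F1 score to catch false-positive dialog actions that the inform and success metrics miss. A minimal sketch of how such a metric could be computed is below; the micro-averaging scheme, act-name format, and example data are assumptions for illustration, not the paper's exact evaluation code.

```python
# Hypothetical sketch of a dialog Action F1 score: micro-averaged
# precision/recall over predicted vs. reference dialog acts per turn.
# Act names and example dialogs are illustrative assumptions.

def action_f1(predicted_turns, gold_turns):
    """Micro-averaged F1 over per-turn sets of dialog actions."""
    tp = fp = fn = 0
    for pred, gold in zip(predicted_turns, gold_turns):
        pred, gold = set(pred), set(gold)
        tp += len(pred & gold)   # correctly predicted actions
        fp += len(pred - gold)   # spurious actions (false positives)
        fn += len(gold - pred)   # missed actions (false negatives)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: one spurious action in turn 1, one missed action in turn 2.
pred = [["inform(name)", "request(area)"], ["offer(booking)"]]
gold = [["inform(name)"], ["offer(booking)", "inform(phone)"]]
print(round(action_f1(pred, gold), 3))  # → 0.667
```

Because false positives enter through the precision term, a policy that emits extra unwarranted actions is penalized here even when it still satisfies the inform and success criteria.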
