Paper Title
Compressing Cross-Lingual Multi-Task Models at Qualtrics
Paper Authors
Paper Abstract
Experience management is an emerging business area where organizations focus on understanding the feedback of customers and employees in order to improve their end-to-end experiences. This results in a unique set of machine learning problems to help understand how people feel, discover issues they care about, and find which actions need to be taken on data that are different in content and distribution from traditional NLP domains. In this paper, we present a case study of building text analysis applications that perform multiple classification tasks efficiently in 12 languages in the nascent business area of experience management. In order to scale up modern ML methods on experience data, we leverage cross-lingual and multi-task modeling techniques to consolidate our models into a single deployment to avoid overhead. We also make use of model compression and model distillation to reduce overall inference latency and hardware cost to a level acceptable for business needs while maintaining model prediction quality. Our findings show that multi-task modeling improves task performance for a subset of experience management tasks in both XLM-R and mBERT architectures. Among the compressed architectures we explored, we found that MiniLM achieved the best compression/performance tradeoff. Our case study demonstrates a speedup of up to 15.61x with 2.60% average task degradation (or a 3.29x speedup with 1.71% degradation) and estimated savings of 44% over using the original full-size model. These results demonstrate a successful scaling up of text classification for the challenging new area of ML for experience management.
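The abstract mentions model distillation as one of the compression techniques. As a minimal illustrative sketch (not the paper's actual implementation, which distills full transformer models such as XLM-R into MiniLM), the standard temperature-scaled knowledge distillation loss on classifier logits can be written as:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; a higher temperature softens the
    # distribution, exposing the teacher's "dark knowledge".
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL(teacher || student) over temperature-softened distributions,
    # scaled by T^2 as in Hinton et al.'s formulation, so gradient
    # magnitudes stay comparable across temperatures.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

During training, this term is typically mixed with the ordinary cross-entropy loss on gold labels; the temperature and mixing weight here are illustrative hyperparameters, not values reported by the paper.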