Paper Title
Performance-Efficiency Trade-Offs in Adapting Language Models to Text Classification Tasks
Authors
Abstract
Pre-trained language models (LMs) obtain state-of-the-art performance when adapted to text classification tasks. However, when using such models in real-world applications, efficiency considerations are paramount. In this paper, we study how different training procedures that adapt LMs to text classification perform, as we vary model and train set size. More specifically, we compare standard fine-tuning, prompting, and knowledge distillation (KD) when the teacher was trained with either fine-tuning or prompting. Our findings suggest that even though fine-tuning and prompting work well to train large LMs on large train sets, there are more efficient alternatives that can reduce compute or data cost. Interestingly, we find that prompting combined with KD can reduce compute and data cost at the same time.
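The knowledge-distillation setup the abstract refers to can be sketched with the standard soft-label KD objective (temperature-softened teacher probabilities matched by the student via KL divergence). This is a minimal illustration, not the paper's exact training recipe; the function names and temperature value are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradients stay comparable across temperatures."""
    p = softmax(teacher_logits, temperature)  # teacher "soft labels"
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

In practice this term is typically mixed with the ordinary cross-entropy on gold labels; here the teacher producing `teacher_logits` would itself have been adapted by either fine-tuning or prompting, matching the two teacher variants the paper compares.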