Paper Title
Do You Have the Right Scissors? Tailoring Pre-trained Language Models via Monte-Carlo Methods
Paper Authors
Abstract
It has been a common approach to pre-train a language model on a large corpus and fine-tune it on task-specific data. In practice, we observe that fine-tuning a pre-trained model on a small dataset may lead to over- and/or under-estimation problems. In this paper, we propose MC-Tailor, a novel method to alleviate the above issue in text generation tasks by truncating and transferring the probability mass from over-estimated regions to under-estimated ones. Experiments on a variety of text generation datasets show that MC-Tailor consistently and significantly outperforms the fine-tuning approach. Our code is available at this url.
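
The abstract does not say which Monte-Carlo procedure is used, so the following is only a toy illustration of the general idea of moving probability mass from over-estimated to under-estimated regions. It assumes plain rejection sampling over a three-token vocabulary with a known target distribution. The function name `tailor_by_rejection` and both toy distributions are hypothetical and not taken from the paper; in a real setting the target distribution is unknown, so the ratio would have to be estimated rather than given.

```python
import random
from collections import Counter


def tailor_by_rejection(model_probs, target_probs, n_samples, seed=0):
    """Toy rejection sampler (hypothetical helper, not the paper's code).

    Draws tokens from model_probs and accepts each with probability
    ratio / bound, so accepted tokens follow target_probs.
    Assumes every token has non-zero probability under model_probs.
    """
    rng = random.Random(seed)
    tokens = list(model_probs)
    weights = [model_probs[t] for t in tokens]
    # ratio > 1: the model under-estimates the token; ratio < 1: it over-estimates.
    ratio = {t: target_probs[t] / model_probs[t] for t in tokens}
    bound = max(ratio.values())  # upper bound needed by rejection sampling
    accepted = []
    while len(accepted) < n_samples:
        token = rng.choices(tokens, weights=weights, k=1)[0]
        if rng.random() < ratio[token] / bound:
            accepted.append(token)
    return accepted


# Assumed toy distributions: the "fine-tuned" model over-estimates "a"
# and under-estimates "b" and "c" relative to the target.
model = {"a": 0.7, "b": 0.2, "c": 0.1}
target = {"a": 0.4, "b": 0.3, "c": 0.3}

samples = tailor_by_rejection(model, target, n_samples=20000)
counts = Counter(samples)
freqs = {t: counts[t] / len(samples) for t in model}
print(freqs)
```

Tokens the model over-estimates are rejected more often, so the empirical frequencies of the accepted samples approach the target distribution rather than the model's.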