Paper Title

Pathologies of Pre-trained Language Models in Few-shot Fine-tuning

Paper Authors

Hanjie Chen, Guoqing Zheng, Ahmed Hassan Awadallah, Yangfeng Ji

Paper Abstract

Although adapting pre-trained language models with few examples has shown promising performance on text classification, there is a lack of understanding of where the performance gain comes from. In this work, we propose to answer this question by interpreting the adaptation behavior using post-hoc explanations from model predictions. By modeling feature statistics of explanations, we discover that (1) without fine-tuning, pre-trained models (e.g. BERT and RoBERTa) show strong prediction bias across labels; (2) although few-shot fine-tuning can mitigate the prediction bias and demonstrate promising prediction performance, our analysis shows that models gain the performance improvement by capturing non-task-related features (e.g. stop words) or shallow data patterns (e.g. lexical overlaps). These observations warn that pursuing model performance with fewer examples may incur pathological prediction behavior, which calls for further sanity checks on model predictions and careful design of model evaluation in few-shot fine-tuning.
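As an illustration of the first finding, below is a minimal sketch (not the authors' code; the model name, label count, and example texts are assumptions) of how one might probe a pre-trained model for prediction bias across labels before any fine-tuning:

```python
# Illustrative sketch only: check whether a pre-trained model without
# fine-tuning skews its predictions toward particular labels.
import torch
from collections import Counter
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-uncased"  # assumption: any pre-trained encoder works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
# The classification head is randomly initialized because no fine-tuning is done.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
model.eval()

texts = [  # hypothetical unlabeled inputs
    "The movie was wonderful and moving.",
    "A dull, predictable plot with flat acting.",
    "An average film, neither good nor bad.",
]

with torch.no_grad():
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    preds = model(**batch).logits.argmax(dim=-1).tolist()

# A heavily skewed label distribution here would reflect the kind of
# prediction bias the abstract reports for models without fine-tuning.
print(Counter(preds))
```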
