Title

Multitask Pre-training of Modular Prompt for Chinese Few-Shot Learning

Authors

Tianxiang Sun, Zhengfu He, Qin Zhu, Xipeng Qiu, Xuanjing Huang

Abstract

Prompt tuning is a parameter-efficient approach to adapting pre-trained language models to downstream tasks. Although prompt tuning has been shown to match the performance of full model tuning when training data is sufficient, it tends to struggle in few-shot learning settings. In this paper, we present Multi-task Pre-trained Modular Prompt (MP2) to boost prompt tuning for few-shot learning. MP2 is a set of combinable prompts pre-trained on 38 Chinese tasks. On downstream tasks, the pre-trained prompts are selectively activated and combined, leading to strong compositional generalization to unseen tasks. To bridge the gap between pre-training and fine-tuning, we formulate upstream and downstream tasks into a unified machine reading comprehension task. Extensive experiments under two learning paradigms, i.e., gradient descent and black-box tuning, show that MP2 significantly outperforms prompt tuning, full model tuning, and prior prompt pre-training methods in few-shot settings. In addition, we demonstrate that MP2 can achieve surprisingly fast and strong adaptation to downstream tasks by merely learning 8 parameters to combine the pre-trained modular prompts.
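The abstract states that MP2 can adapt to a downstream task by learning only 8 parameters that combine the pre-trained modular prompts. The sketch below illustrates one plausible form such a combination could take: a frozen bank of K prompt modules is mixed into a single soft prompt by a small trainable routing vector. This is a minimal illustration under assumed details, not the paper's actual implementation; the class name ModularPromptRouter, the sigmoid gating, and the shapes (8 modules, prompt length 16, hidden size 768) are all made up for the example.

```python
import torch
import torch.nn as nn


class ModularPromptRouter(nn.Module):
    """Hypothetical sketch: combine K frozen pre-trained modular prompts
    with a learned routing vector of only K parameters."""

    def __init__(self, pretrained_prompts: torch.Tensor):
        super().__init__()
        # Frozen bank of pre-trained modular prompts, shape (K, L, H)
        self.register_buffer("prompt_bank", pretrained_prompts)
        k = pretrained_prompts.size(0)
        # The only trainable parameters: one routing logit per module (e.g. K = 8)
        self.router = nn.Parameter(torch.zeros(k))

    def forward(self) -> torch.Tensor:
        # Soft selection: sigmoid gates decide how strongly each module is activated
        gates = torch.sigmoid(self.router)              # (K,)
        # Weighted combination of the modules into a single soft prompt (L, H)
        return torch.einsum("k,klh->lh", gates, self.prompt_bank)


# Usage sketch with illustrative sizes: 8 modules, prompt length 16, hidden size 768
bank = torch.randn(8, 16, 768)          # stands in for the pre-trained prompt bank
router = ModularPromptRouter(bank)
soft_prompt = router()                  # would be prepended to the frozen PLM's input embeddings
print(soft_prompt.shape)                # torch.Size([16, 768])
```

In this reading, few-shot adaptation only updates the K routing logits while both the language model and the prompt bank stay frozen, which is consistent with the "merely learning 8 parameters" claim in the abstract; the actual selection and combination mechanism in MP2 may differ.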
