Paper Title
Learning Functions to Study the Benefit of Multitask Learning
Paper Authors
Paper Abstract
We study and quantify the generalization patterns of multitask learning (MTL) models for sequence labeling tasks. MTL models are trained to optimize a set of related tasks jointly. Although multitask learning has achieved improved performance in some problems, some tasks lose performance when trained together. These mixed results motivate us to study the factors that impact the performance of MTL models. We note that theoretical bounds and convergence rates for MTL models exist, but they rely on strong assumptions such as task relatedness and the use of balanced datasets. To remedy these limitations, we propose the creation of a task simulator and the use of Symbolic Regression to learn expressions relating model performance to possible factors of influence. For MTL, we study model performance against the number of tasks (T), the number of samples per task (n), and the task relatedness measured by the adjusted mutual information (AMI). In our experiments, we empirically find formulas relating model performance to factors of sqrt(n) and sqrt(T), which are equivalent to sound mathematical proofs in Maurer [2016], and we go beyond by discovering that performance also relates to a factor of sqrt(AMI).
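To make the idea of "learning expressions relating model performance to possible factors of influence" concrete, here is a minimal sketch (not the paper's code): a toy symbolic-regression-style search that fits measured performance against a small set of candidate basis functions and keeps the one with the lowest least-squares error. The function names (`fit_linear`, `best_basis`), the candidate set, and the synthetic performance data that follows a 1/sqrt(n) trend are all assumptions for illustration.

```python
import math
import random

def fit_linear(xs, ys):
    # Ordinary least squares for y = a*x + b on transformed inputs;
    # returns the coefficients and the sum of squared errors.
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    b = my - a * mx
    sse = sum((a * x + b - y) ** 2 for x, y in zip(xs, ys))
    return a, b, sse

def best_basis(ns, perf):
    # A tiny stand-in for symbolic regression: try each candidate
    # transformation of n and keep the one with the lowest fit error.
    candidates = {
        "1/sqrt(n)": lambda v: 1.0 / math.sqrt(v),
        "1/n": lambda v: 1.0 / v,
        "1/log(n)": lambda v: 1.0 / math.log(v),
    }
    scored = {name: fit_linear([f(v) for v in ns], perf)[2]
              for name, f in candidates.items()}
    return min(scored, key=scored.get)

random.seed(0)
ns = [10, 50, 100, 500, 1000, 5000]
# Synthetic "model performance" that truly follows 1 - c/sqrt(n), plus noise.
perf = [1 - 2.0 / math.sqrt(n) + random.gauss(0, 0.005) for n in ns]
print(best_basis(ns, perf))
```

Real symbolic-regression systems search a much larger expression space (e.g. via genetic programming), but the selection principle, scoring candidate functional forms against empirical performance curves, is the same.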