Paper title
Intra-domain and cross-domain transfer learning for time series data -- How transferable are the features?
Paper authors
Abstract
In practice, collecting a labeled dataset large enough to train a machine learning model successfully is very demanding and sometimes impossible; transfer learning is one possible solution to this problem. This study aims to assess how transferable the features are between different domains of time series data, and under which conditions. The effects of transfer learning are observed in terms of the predictive performance of the models and their convergence rate during training. In our experiments, we use reduced datasets of 1,500 and 9,000 data instances to mimic real-world conditions. Using the same reduced datasets, we trained two sets of machine learning models: those trained with transfer learning and those trained from scratch. Four machine learning models were used in the experiments. Knowledge was transferred within the same application domain (seismology), as well as between mutually different application domains (seismology, speech, medicine, finance). To confirm the validity of the obtained results, we repeated the experiments seven times and applied statistical tests to confirm their significance. The general conclusion of our study is that transfer learning is very likely to either improve, or at least not negatively affect, a model's predictive performance or its convergence rate. The collected data is analysed in more detail to determine which source and target domains are compatible for knowledge transfer. We also analyse how the target dataset size and the choice of model and its hyperparameters influence the effects of transfer learning.
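The repeat-and-test step mentioned in the abstract can be sketched as follows. The abstract does not name the statistical test that was applied, so the exact paired sign-flip permutation test below is an assumption, chosen because it needs only the standard library and suits seven paired runs. The function name and the accuracy values are hypothetical placeholders, not results from the paper.

```python
from itertools import product


def paired_sign_flip_p_value(with_transfer, from_scratch):
    """Two-sided exact sign-flip permutation test on paired differences.

    Assumption: one score per repeated run for each training regime,
    paired by run index. This is an illustrative choice of test only.
    """
    if len(with_transfer) != len(from_scratch):
        raise ValueError("both lists need one score per repeated run")
    diffs = [a - b for a, b in zip(with_transfer, from_scratch)]
    observed = abs(sum(diffs))
    # Under the null hypothesis each paired difference is equally likely to
    # have either sign, so enumerate all 2**n sign assignments (128 for n = 7).
    count = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for float rounding
            count += 1
    return count / total


# Hypothetical placeholder accuracies for seven repeats (not from the paper).
transfer_scores = [0.81, 0.83, 0.80, 0.84, 0.82, 0.85, 0.83]
scratch_scores = [0.78, 0.80, 0.79, 0.80, 0.79, 0.81, 0.80]

p_value = paired_sign_flip_p_value(transfer_scores, scratch_scores)
print(f"two-sided p-value: {p_value:.4f}")
```

With only seven paired runs, the smallest two-sided p-value this test can produce is 2/128, which is one reason a small number of repeats limits how strong a significance claim can be.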