Paper title
ImitAL: Learned Active Learning Strategy on Synthetic Data
Paper authors
Paper abstract
Active Learning (AL) is a well-known standard method for efficiently obtaining annotated data by first labeling the samples that contain the most information, as determined by a query strategy. In the past, a large variety of such query strategies have been proposed, with each generation of new strategies increasing the runtime and adding more complexity. However, to the best of our knowledge, none of these strategies excels consistently over a large number of datasets from different application domains. Most existing AL strategies are a combination of two simple heuristics, informativeness and representativeness, and the main differences lie in how these often conflicting heuristics are combined. In this paper, we propose ImitAL, a novel domain-independent query strategy, which encodes AL as a learning-to-rank problem and learns an optimal combination of both heuristics. We train ImitAL on large-scale simulated AL runs on purely synthetic datasets. To show that ImitAL was successfully trained, we perform an extensive evaluation comparing our strategy on 13 different datasets, from a wide range of domains, with 7 other query strategies.
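
To make the two heuristics named in the abstract concrete, the following is a minimal sketch of a hand-weighted query strategy. It is not ImitAL's learned model. The function names, the entropy-based informativeness score, the distance-based representativeness score, and the fixed `weight` are all assumptions chosen for illustration. The constant `weight` merely stands in for the combination that ImitAL is described as learning.

```python
import math


def entropy(probs):
    """Shannon entropy of a class-probability vector (informativeness proxy)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mean_similarity(i, points):
    """Average similarity of point i to all other points (representativeness proxy)."""
    others = [q for j, q in enumerate(points) if j != i]
    return sum(1.0 / (1.0 + math.dist(points[i], q)) for q in others) / len(others)


def rank_candidates(points, class_probs, weight=0.5):
    """Return candidate indices, best first, by a weighted sum of both heuristics.

    `weight` is a hand-picked constant (an assumption of this sketch), not a
    learned quantity.
    """
    max_entropy = math.log(len(class_probs[0]))  # normalizes entropy to [0, 1]
    scores = []
    for i in range(len(points)):
        informativeness = entropy(class_probs[i]) / max_entropy
        representativeness = mean_similarity(i, points)
        scores.append(weight * informativeness + (1 - weight) * representativeness)
    return sorted(range(len(points)), key=lambda i: scores[i], reverse=True)


# Toy pool: three clustered points and one outlier, with hypothetical
# classifier outputs for a two-class problem.
pool = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
probs = [(0.9, 0.1), (0.6, 0.4), (0.8, 0.2), (0.5, 0.5)]

ranking_info_only = rank_candidates(pool, probs, weight=1.0)
ranking_repr_only = rank_candidates(pool, probs, weight=0.0)
```

In this toy pool the outlier has the most uncertain prediction, so it ranks first when only informativeness counts (`weight=1.0`) and last when only representativeness counts (`weight=0.0`). This is the kind of conflict between the heuristics that the abstract refers to.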