Graph-based Reinforcement Learning for Active Learning in Real Time: An Application in Modeling River Networks

Paper Title

Graph-based Reinforcement Learning for Active Learning in Real Time: An Application in Modeling River Networks

Authors

Xiaowei Jia, Beiyu Lin, Jacob Zwart, Jeffrey Sadler, Alison Appling, Samantha Oliver, Jordan Read

Abstract

Training advanced ML models effectively requires large amounts of labeled data, which are often scarce in scientific problems because collecting labels takes substantial human labor and material cost. This makes it challenging to determine when and where measuring instruments (e.g., in-situ sensors) should be deployed to collect labeled data efficiently. The problem differs from traditional pool-based active learning in that each labeling decision must be made immediately after the input data, which arrive as a time series, are observed. In this paper, we develop a real-time active learning method that uses spatial and temporal contextual information to select representative query samples within a reinforcement learning framework. To reduce the need for large training data, we further propose transferring the policy learned from simulation data generated by existing physics-based models. We demonstrate the effectiveness of the proposed method by predicting streamflow and water temperature in the Delaware River Basin under a limited budget for collecting labeled data. We further study the spatial and temporal distribution of the selected samples to verify the method's ability to select informative samples over space and time.
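The difference between pool-based selection and real-time selection under a labeling budget can be illustrated with a minimal Python sketch. This is not the paper's method: a fixed threshold on a hypothetical per-sample informativeness score stands in for the learned reinforcement learning policy, and the names `stream_select`, `pool_select`, `scores`, `budget`, and `threshold` are assumptions introduced only for illustration.

```python
def stream_select(scores, budget, threshold):
    """Real-time setting: decide on each sample as it arrives.

    A decision cannot be revisited later, and labeling stops once the
    budget is spent. The threshold rule is an illustrative stand-in for
    a learned policy, not the method described in the abstract.
    """
    chosen = []
    for t, score in enumerate(scores):
        if len(chosen) >= budget:
            break  # budget exhausted, later samples go unlabeled
        if score >= threshold:
            chosen.append(t)
    return chosen


def pool_select(scores, budget):
    """Pool-based baseline: see every sample first, then label the top ones."""
    ranked = sorted(range(len(scores)), key=lambda t: scores[t], reverse=True)
    return sorted(ranked[:budget])


if __name__ == "__main__":
    # Hypothetical informativeness scores for five consecutive time steps.
    scores = [0.1, 0.9, 0.8, 0.95, 0.2]
    print(stream_select(scores, budget=2, threshold=0.7))
    print(pool_select(scores, budget=2))
```

With these example scores the streaming rule spends its budget on time steps 1 and 2 and never reaches the more informative step 3, whereas the pool-based rule picks steps 1 and 3. This gap is the reason a real-time policy has to weigh the current sample against what may still arrive.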
