Paper title
Batch-sequential design and heteroskedastic surrogate modeling for delta smelt conservation
Paper authors
Paper abstract
Delta smelt is an endangered fish species in the San Francisco estuary that has shown an overall population decline over the past 30 years. Researchers have developed a stochastic, agent-based simulator to virtualize the system, with the goal of understanding the relative contributions of the natural and anthropogenic factors suggested as playing a role in its decline. However, the input configuration space is high-dimensional, running the simulator is time-consuming, and its noisy outputs change nonlinearly in both mean and variance. Getting enough runs to effectively learn input-output dynamics requires both a nimble modeling strategy and parallel supercomputer evaluation. Recent advances in heteroskedastic Gaussian process (HetGP) surrogate modeling help, but little is known about how to appropriately plan experiments for highly distributed simulator evaluation. We propose a batch-sequential design scheme, generalizing one-at-a-time variance-based active learning for HetGP surrogates, as a means of keeping multi-core cluster nodes fully engaged with expensive runs. Our acquisition strategy is carefully engineered to favor selection of replicates, which boost statistical and computational efficiency when training surrogates to isolate signal in high-noise regions. Design and modeling performance are illustrated on a range of toy examples before embarking on a large-scale smelt simulation campaign and downstream high-fidelity input sensitivity analysis.
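To make the role of replicates concrete, the following is a minimal sketch of why a batch might concentrate repeated runs in high-noise regions. It is not the acquisition criterion of the paper: it swaps in a much simpler stand-in, greedily assigning each run in a batch to the existing input site where one more replicate most reduces the variance of the sample mean (noise variance divided by replicate count). The function name `allocate_batch`, the three sites, and their noise variances are all hypothetical, and the noise variances are assumed known, whereas a HetGP surrogate would have to estimate them.

```python
def allocate_batch(noise_var, counts, batch_size):
    """Greedily assign `batch_size` new runs to existing input sites.

    Toy stand-in criterion (an assumption, not the paper's method): each new
    run goes to the site where one more replicate gives the largest drop in
    the variance of the sample mean, noise_var[i] / counts[i].
    """
    counts = list(counts)  # copy so the caller's list is left untouched
    chosen = []
    for _ in range(batch_size):
        # Reduction in mean variance from adding one replicate at each site.
        gains = [v / n - v / (n + 1) for v, n in zip(noise_var, counts)]
        best = max(range(len(gains)), key=gains.__getitem__)
        counts[best] += 1
        chosen.append(best)
    return chosen, counts


# Hypothetical noise variances at three input sites, one run at each so far.
noise_var = [0.1, 1.0, 4.0]
chosen, counts = allocate_batch(noise_var, [1, 1, 1], batch_size=4)
print(chosen, counts)
```

Under these assumed values, most of the batch lands on the noisiest site, which mirrors the qualitative behavior described in the abstract: replication is most valuable where the signal is hardest to separate from noise.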