Greater Than the Sum of its Parts: Computationally Flexible Bayesian Hierarchical Modeling

Paper Title

Greater Than the Sum of its Parts: Computationally Flexible Bayesian Hierarchical Modeling

Authors

Johnson, Devin S., Brost, Brian M., Hooten, Mevin B.

Abstract

We propose a multistage method for making inference at all levels of a Bayesian hierarchical model (BHM) using natural data partitions to increase efficiency by allowing computations to take place in parallel form using software that is most appropriate for each data partition. The full hierarchical model is then approximated by the product of independent normal distributions for the data component of the model. In the second stage, the Bayesian maximum a posteriori (MAP) estimator is found by maximizing the approximated posterior density with respect to the parameters. If the parameters of the model can be represented as normally distributed random effects then the second stage optimization is equivalent to fitting a multivariate normal linear mixed model. This method can be extended to account for common fixed parameters shared between data partitions, as well as parameters that are distinct between partitions. In the case of distinct parameter estimation, we consider a third stage that re-estimates the distinct parameters for each data partition based on the results of the second stage. This allows more information from the entire data set to properly inform the posterior distributions of the distinct parameters. The method is demonstrated with two ecological data sets and models, a random effects GLM and an Integrated Population Model (IPM). The multistage results were compared to estimates from models fit in single stages to the entire data set. Both examples demonstrate that multistage point and posterior standard deviation estimates closely approximate those obtained from fitting the models with all data simultaneously and can therefore be considered for fitting hierarchical Bayesian models when it is computationally prohibitive to do so in one step.
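
The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; it is a scalar toy case under stated assumptions. Each data partition i is assumed to have been fit separately in stage one and summarized by a point estimate `theta_hat[i]` and a variance `s2[i]`, which stands in for the independent normal approximation. The partition-level parameters are assumed to be normal random effects with mean `mu` and a variance `tau2` that is treated as known, and `mu` is given a flat prior. The function name `stage_two_map` and all argument names are hypothetical. Under these assumptions the stage-two MAP estimate has a closed form: a precision-weighted mean for `mu` and shrinkage of each partition estimate toward it.

```python
import numpy as np

def stage_two_map(theta_hat, s2, tau2):
    """Toy stage-two MAP for a scalar normal random-effects model.

    Assumed model (illustrative only, not taken from the paper):
        theta_hat[i] ~ Normal(theta[i], s2[i])   # stage-one normal approximation
        theta[i]     ~ Normal(mu, tau2)          # random effects, tau2 assumed known
        mu           ~ flat prior
    Returns the joint posterior mode (mu, theta).
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    s2 = np.asarray(s2, dtype=float)

    # Precision weights from the marginal variance of each stage-one estimate.
    w = 1.0 / (s2 + tau2)
    mu = np.sum(w * theta_hat) / np.sum(w)

    # Each partition-level mode is shrunk toward mu in proportion to its
    # stage-one uncertainty relative to the random-effect variance.
    theta = (theta_hat * tau2 + mu * s2) / (s2 + tau2)
    return mu, theta

# Two hypothetical partitions with equal stage-one variances.
mu, theta = stage_two_map(theta_hat=[1.0, 3.0], s2=[1.0, 1.0], tau2=1.0)
```

In this toy case the stage-one fits can run in parallel with any software, since stage two only needs their estimates and variances. A realistic application would involve vector-valued parameters, estimated covariance matrices, and an unknown random-effect variance.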
