Paper Title
No Free Lunch for Approximate MCMC
Paper Authors
Abstract
It is widely known that the performance of Markov chain Monte Carlo (MCMC) can degrade quickly when targeting computationally expensive posterior distributions, such as when the sample size is large. This has motivated the search for MCMC variants that scale well to large datasets. One popular general approach is to look at only a subsample of the data at every step. In this note, we point out that well-known MCMC convergence results often imply that these "subsampling" MCMC algorithms cannot greatly improve performance. We apply these abstract results to realistic statistical problems and to previously proposed algorithms, and we discuss some design principles suggested by the results. Finally, we develop bounds on the singular values of random matrices that may be of independent interest.