Paper Title

Bayesian Hyperbolic Multidimensional Scaling

Authors

Liu, Bolun; Lubold, Shane; Raftery, Adrian E.; McCormick, Tyler H.

Abstract

Multidimensional scaling (MDS) is a widely used approach to representing high-dimensional, dependent data. MDS works by assigning each observation a location on a low-dimensional geometric manifold, with distance on the manifold representing similarity. We propose a Bayesian approach to multidimensional scaling when the low-dimensional manifold is hyperbolic. Using hyperbolic space facilitates representing tree-like structures common in many settings (e.g. text or genetic data with hierarchical structure). A Bayesian approach provides regularization that minimizes the impact of measurement error in the observed data and assesses uncertainty. We also propose a case-control likelihood approximation that allows for efficient sampling from the posterior distribution in larger data settings, reducing computational complexity from approximately $O(n^2)$ to $O(n)$. We evaluate the proposed method against state-of-the-art alternatives using simulations, canonical reference datasets, Indian village network data, and human gene expression data.
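To make the two computational ideas in the abstract concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say which model of hyperbolic space or which sampling scheme is used, so the hyperboloid (Lorentz) model and the uniform partner sampling below are assumptions chosen for illustration, and the function names are hypothetical. The sketch shows how a hyperbolic distance can be computed and how evaluating a fixed number of sampled pairs per observation gives $O(n)$ pair evaluations instead of all $n(n-1)/2$.

```python
import numpy as np


def lift_to_hyperboloid(z):
    """Map points z in R^d onto the hyperboloid in R^(d+1).

    Assumption: hyperboloid (Lorentz) model with curvature -1, where a
    point is (x0, z) with x0 = sqrt(1 + |z|^2).
    """
    z = np.asarray(z, dtype=float)
    x0 = np.sqrt(1.0 + np.sum(z ** 2, axis=-1, keepdims=True))
    return np.concatenate([x0, z], axis=-1)


def hyperbolic_distance(x, y):
    """Geodesic distance arccosh(-<x, y>_L) between hyperboloid points."""
    lorentz = -x[..., 0] * y[..., 0] + np.sum(x[..., 1:] * y[..., 1:], axis=-1)
    # Clip to the domain of arccosh to guard against rounding error.
    return np.arccosh(np.maximum(-lorentz, 1.0))


def sampled_pairs(n, m, rng):
    """For each index i, draw m partners j != i uniformly at random.

    Illustrative stand-in for a subsampled likelihood: n * m pairs are
    evaluated rather than all n * (n - 1) / 2.
    """
    i = np.repeat(np.arange(n), m)
    # Offsets in 1..n-1 guarantee that j differs from i.
    j = (i + rng.integers(1, n, size=n * m)) % n
    return i, j


rng = np.random.default_rng(0)
points = lift_to_hyperboloid(rng.normal(size=(1000, 2)))
i, j = sampled_pairs(len(points), m=5, rng=rng)
distances = hyperbolic_distance(points[i], points[j])  # 5000 pairs, not 499500
```

With `m` held fixed, the number of distance evaluations grows linearly in the number of observations, which is the scaling the abstract refers to. A real case-control scheme would typically weight the sampled terms so that the approximate likelihood stays close to the full one; that weighting is omitted here.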
