Paper Title

Log-Euclidean Signatures for Intrinsic Distances Between Unaligned Datasets

Paper Authors

Tal Shnitzer, Mikhail Yurochkin, Kristjan Greenewald, Justin Solomon

Paper Abstract

The need for efficiently comparing and representing datasets with unknown alignment spans various fields, from model analysis and comparison in machine learning to trend discovery in collections of medical datasets. We use manifold learning to compare the intrinsic geometric structures of different datasets by comparing their diffusion operators, symmetric positive-definite (SPD) matrices that relate to approximations of the continuous Laplace-Beltrami operator from discrete samples. Existing methods typically assume known data alignment and compare such operators in a pointwise manner. Instead, we exploit the Riemannian geometry of SPD matrices to compare these operators and define a new theoretically motivated distance based on a lower bound of the log-Euclidean metric. Our framework facilitates comparison of data manifolds expressed in datasets with different sizes, numbers of features, and measurement modalities. Our log-Euclidean signature (LES) distance recovers meaningful structural differences, outperforming competing methods in various application domains.
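
To make the idea of a lower bound on the log-Euclidean metric more concrete, the following is a minimal sketch and not the authors' implementation. It assumes two SPD matrices of equal size, built here from random data by the hypothetical helper `random_spd`. The log-Euclidean distance is the Frobenius norm of the difference of matrix logarithms. By the Hoffman-Wielandt inequality for symmetric matrices, the Euclidean distance between the sorted log-eigenvalues never exceeds it. Because eigenvalues do not change when rows and columns are permuted together, this spectral quantity does not depend on how the samples are ordered. The function names are illustrative, and the paper's actual LES descriptor may differ in its details.

```python
import numpy as np

def spd_log(mat):
    """Matrix logarithm of a symmetric positive-definite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    # Scale each eigenvector column by the log of its eigenvalue, then recombine.
    return (vecs * np.log(vals)) @ vecs.T

def log_euclidean_distance(a, b):
    """Frobenius norm of log(a) - log(b); depends on how rows/columns are ordered."""
    return np.linalg.norm(spd_log(a) - spd_log(b), ord="fro")

def log_spectrum_lower_bound(a, b):
    """Distance between sorted log-eigenvalues; a lower bound on the distance above."""
    # eigvalsh returns eigenvalues in ascending order, so the two spectra are matched by rank.
    log_a = np.log(np.linalg.eigvalsh(a))
    log_b = np.log(np.linalg.eigvalsh(b))
    return np.linalg.norm(log_a - log_b)

def random_spd(n, rng):
    """Hypothetical helper: a well-conditioned random SPD matrix for illustration only."""
    m = rng.standard_normal((n, n))
    return m @ m.T + n * np.eye(n)

rng = np.random.default_rng(0)
a = random_spd(6, rng)
b = random_spd(6, rng)

# Simulate unknown alignment by shuffling the sample order of b.
perm = rng.permutation(6)
b_shuffled = b[np.ix_(perm, perm)]

full = log_euclidean_distance(a, b)
full_shuffled = log_euclidean_distance(a, b_shuffled)
bound = log_spectrum_lower_bound(a, b)
bound_shuffled = log_spectrum_lower_bound(a, b_shuffled)

print("log-Euclidean distance (original order):", full)
print("log-Euclidean distance (shuffled order):", full_shuffled)
print("spectral lower bound (original order):  ", bound)
print("spectral lower bound (shuffled order):  ", bound_shuffled)
```

In this sketch the full log-Euclidean distance can change when the sample order is shuffled, whereas the spectral bound stays the same up to floating-point error, which is the property that makes a spectrum-based comparison usable without alignment.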
