Paper Title

Asymptotics of Lower Dimensional Zero-Density Regions

Authors

Hengrui Luo, Steve N. MacEachern, Mario Peruggia

Abstract

Topological data analysis (TDA) allows us to explore the topological features of a dataset. Among topological features, lower dimensional ones have recently drawn the attention of practitioners in mathematics and statistics due to their potential to aid the discovery of low dimensional structure in a data set. However, lower dimensional features are usually challenging to detect based on finite samples and using TDA methods that ignore the probabilistic mechanism that generates the data. In this paper, lower dimensional topological features occurring as zero-density regions of density functions are introduced and thoroughly investigated. Specifically, we consider sequences of coverings for the support of a density function in which the coverings are comprised of balls with shrinking radii. We show that, when these coverings satisfy certain sufficient conditions as the sample size goes to infinity, we can detect lower dimensional, zero-density regions with increasingly higher probability while guarding against false detection. We supplement the theoretical developments with the discussion of simulated experiments that elucidate the behavior of the methodology for different choices of the tuning parameters that govern the construction of the covering sequences and characterize the asymptotic results.
