论文标题
关于高斯数据库的相关检测和对齐恢复
On Correlation Detection and Alignment Recovery of Gaussian Databases
论文作者
论文摘要
在这项工作中,我们提出了一种有效的两阶段算法,以解决两个高斯数据库之间相关检测和部分比对恢复的联合问题。相关检测是一个假设测试问题。在零假设下,数据库是独立的,在替代假设下,它们在未知的行排列下是相关的。我们在类型I和II类型误差概率上开发界限,并表明分析的检测器的性能比最近提出的检测器更好,至少对于某些特定的参数选择。由于所提出的检测器依赖于统计量,该统计量是依赖指标随机变量的总和,因此为了绑定I型错误概率,我们开发了一种新颖的图理论技术来界定此类统计的$ K $ ther阶矩。当数据库被接受为相关时,该算法还恢复了给定数据库之间的某些部分比对。我们还提出了另外两种算法:(i)另一种用于部分比对恢复的算法,其可靠性和计算复杂性都高于第一个提出的算法的算法。 (ii)与最佳恢复程序相比,用于完全对齐恢复的算法,其计算量减少,误差概率降低。
In this work, we propose an efficient two-stage algorithm solving a joint problem of correlation detection and partial alignment recovery between two Gaussian databases. Correlation detection is a hypothesis testing problem; under the null hypothesis, the databases are independent, and under the alternate hypothesis, they are correlated, under an unknown row permutation. We develop bounds on the type-I and type-II error probabilities, and show that the analyzed detector performs better than a recently proposed detector, at least for some specific parameter choices. Since the proposed detector relies on a statistic, which is a sum of dependent indicator random variables, then in order to bound the type-I probability of error, we develop a novel graph-theoretic technique for bounding the $k$-th order moments of such statistics. When the databases are accepted as correlated, the algorithm also recovers some partial alignment between the given databases. We also propose two more algorithms: (i) One more algorithm for partial alignment recovery, whose reliability and computational complexity are both higher than those of the first proposed algorithm. (ii) An algorithm for full alignment recovery, which has a reduced amount of calculations and a not much lower error probability, when compared to the optimal recovery procedure.