论文标题
在动态时间扭曲下更快的二进制平均计算
Faster Binary Mean Computation Under Dynamic Time Warping
论文作者
论文摘要
许多共识的字符串问题基于锤距。我们用更灵活的(例如,用不同的输入字符串长度)替换了锤距离距离,这是动态时间扭曲距离,这是从时间序列挖掘中的应用中最著名的。这样做,我们研究了找到一个平均字符串的问题,该问题将(平方)动态时间翘曲距离的总和最小化,以达到给定的一组输入字符串。虽然已知此问题是NP-固定的(即使在三元素字母上的字符串),但我们解决了二进制字母案例,该二进制字母案例可溶解是多项式时间。从最差的运行时间角度来看,我们在先前已知的算法上有了显着改进。此外,我们还显示了我们一种算法在现实世界和合成数据实验中的实际实用性。最后,我们确定可在线性时间解决的特殊情况(例如,找到两个二进制输入字符串的平均值),并报告了有关最佳均值组合特性的一些经验发现。
Many consensus string problems are based on Hamming distance. We replace Hamming distance by the more flexible (e.g., easily coping with different input string lengths) dynamic time warping distance, best known from applications in time series mining. Doing so, we study the problem of finding a mean string that minimizes the sum of (squared) dynamic time warping distances to a given set of input strings. While this problem is known to be NP-hard (even for strings over a three-element alphabet), we address the binary alphabet case which is known to be polynomial-time solvable. We significantly improve on a previously known algorithm in terms of worst-case running time. Moreover, we also show the practical usefulness of one of our algorithms in experiments with real-world and synthetic data. Finally, we identify special cases solvable in linear time (e.g., finding a mean of only two binary input strings) and report some empirical findings concerning combinatorial properties of optimal means.