论文标题
通过Max-Sat计算NP硬重复措施
Computing NP-hard Repetitiveness Measures via MAX-SAT
论文作者
论文摘要
重复措施揭示了数据集的深刻特征,并引起了在压缩空间中工作的压缩数据结构和算法。 las,其中某些度量的计算是NP固定的,而直截了当的计算对于甚至小尺寸的数据集来说都是不可行的。三种这样的措施是弦吸引子的最小尺寸,双向宏观方案的最小尺寸,也是直线程序的最小尺寸。尽管存在多种启发式计算近似的实现,但这些措施的精确计算几乎没有引起关注。在本文中,我们提出了最高-SAT公式,该公式提供了第一个非平凡的实现,以精确计算最小的弦吸引子,最小的双向宏观方案和最小的直线程序。计算实验表明,我们的实现适用于直线程序和双向宏观方案的长度最高长度的文本,甚至对于弦乐吸引子的文本甚至超过一百万。
Repetitiveness measures reveal profound characteristics of datasets, and give rise to compressed data structures and algorithms working in compressed space. Alas, the computation of some of these measures is NP-hard, and straight-forward computation is infeasible for datasets of even small sizes. Three such measures are the smallest size of a string attractor, the smallest size of a bidirectional macro scheme, and the smallest size of a straight-line program. While a vast variety of implementations for heuristically computing approximations exist, exact computation of these measures has received little to no attention. In this paper, we present MAX-SAT formulations that provide the first non-trivial implementations for exact computation of smallest string attractors, smallest bidirectional macro schemes, and smallest straight-line programs. Computational experiments show that our implementations work for texts of length up to a few hundred for straight-line programs and bidirectional macro schemes, and texts even over a million for string attractors.