论文标题

通过Max-Sat计算NP硬重复措施

Computing NP-hard Repetitiveness Measures via MAX-SAT

论文作者

Bannai, Hideo, Goto, Keisuke, Ishihata, Masakazu, Kanda, Shunsuke, Köppl, Dominik, Nishimoto, Takaaki

论文摘要

重复措施揭示了数据集的深刻特征,并引起了在压缩空间中工作的压缩数据结构和算法。 las,其中某些度量的计算是NP固定的,而直截了当的计算对于甚至小尺寸的数据集来说都是不可行的。三种这样的措施是弦吸引子的最小尺寸,双向宏观方案的最小尺寸,也是直线程序的最小尺寸。尽管存在多种启发式计算近似的实现,但这些措施的精确计算几乎没有引起关注。在本文中,我们提出了最高-SAT公式,该公式提供了第一个非平凡的实现,以精确计算最小的弦吸引子,最小的双向宏观方案和最小的直线程序。计算实验表明,我们的实现适用于直线程序和双向宏观方案的长度最高长度的文本,甚至对于弦乐吸引子的文本甚至超过一百万。

Repetitiveness measures reveal profound characteristics of datasets, and give rise to compressed data structures and algorithms working in compressed space. Alas, the computation of some of these measures is NP-hard, and straight-forward computation is infeasible for datasets of even small sizes. Three such measures are the smallest size of a string attractor, the smallest size of a bidirectional macro scheme, and the smallest size of a straight-line program. While a vast variety of implementations for heuristically computing approximations exist, exact computation of these measures has received little to no attention. In this paper, we present MAX-SAT formulations that provide the first non-trivial implementations for exact computation of smallest string attractors, smallest bidirectional macro schemes, and smallest straight-line programs. Computational experiments show that our implementations work for texts of length up to a few hundred for straight-line programs and bidirectional macro schemes, and texts even over a million for string attractors.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源