Reconstructing Mixtures of Coded Strings from Prefix and Suffix Compositions

Paper title

Reconstructing Mixtures of Coded Strings from Prefix and Suffix Compositions

Authors

Ryan Gabrys, Srilakshmi Pattabiraman, Olgica Milenkovic

Abstract

The problem of string reconstruction from substring information has found many applications due to its relevance in DNA- and polymer-based data storage. One practically important and challenging paradigm requires reconstructing mixtures of strings based on the union of compositions of their prefixes and suffixes, generated by mass spectrometry readouts. We describe new coding methods that allow for unique joint reconstruction of subsets of strings selected from a code and provide matching upper and lower bounds on the asymptotic rate of the underlying codebooks. Under certain mild constraints on the problem parameters, one can show that the largest possible rate of a codebook that allows for all subcollections of $\leq h$ codestrings to be uniquely reconstructable from the prefix-suffix information equals $1/h$.
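
To make the notion of "prefix and suffix compositions" concrete, the following is a minimal sketch and not the authors' construction. It assumes a binary alphabet, takes the composition of a string to be its pair of symbol counts, and models the readout of a mixture as the multiset sum of the per-string compositions; the function names are hypothetical, and whether the paper treats the union as a set or a multiset is an assumption here.

```python
from collections import Counter

def composition(s):
    """Return (number of '0's, number of '1's) for a binary string."""
    return (s.count("0"), s.count("1"))

def prefix_suffix_compositions(s):
    """Multiset of compositions of every nonempty prefix and suffix of s."""
    out = Counter()
    for i in range(1, len(s) + 1):
        out[composition(s[:i])] += 1   # prefix of length i
        out[composition(s[-i:])] += 1  # suffix of length i
    return out

def mixture_readout(strings):
    """Assumed model: multiset sum of the readouts of all strings in the mixture."""
    out = Counter()
    for s in strings:
        out += prefix_suffix_compositions(s)
    return out

# A string and its reversal produce the same multiset of compositions.
same = prefix_suffix_compositions("0010") == prefix_suffix_compositions("0100")
```

In this model a string and its reversal are indistinguishable, which illustrates why strings must be drawn from a suitably constrained codebook before unique reconstruction is possible.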
