Paper Title

Efficient Real-Time Selective Genome Sequencing on Resource-Constrained Devices

Authors

Po Jui Shih, Hassaan Saadat, Sri Parameswaran, Hasindu Gamaarachchi

Abstract

Third-generation nanopore sequencers offer a feature called selective sequencing, or 'Read Until', that allows genomic reads to be analyzed in real time and abandoned partway through if they do not belong to a genomic region of interest. Selective sequencing opens the door to important applications such as rapid and low-cost genetic tests. For selective sequencing to be effective, the analysis latency should be as low as possible so that unnecessary reads can be rejected as early as possible. However, existing methods that employ the subsequence Dynamic Time Warping (sDTW) algorithm for this problem are so computationally intensive that a massive workstation with dozens of CPU cores still struggles to keep up with the data rate of a mobile phone-sized MinION sequencer. In this paper, we present Hardware Accelerated Read Until (HARU), a resource-efficient method based on hardware-software co-design that exploits a low-cost and portable heterogeneous MPSoC platform with an on-chip FPGA to accelerate the sDTW-based Read Until algorithm. Experimental results on a SARS-CoV-2 dataset show that HARU on a Xilinx FPGA embedded with a 4-core ARM processor is around 2.5X faster than a highly optimized multi-threaded software version (and around 85X faster than the existing unoptimized multi-threaded software) running on a sophisticated server with a 36-core Intel Xeon processor. The energy consumption of HARU is two orders of magnitude lower than that of the same application executing on the 36-core server. Source code for the HARU sDTW module is available as open source at https://github.com/beebdev/HARU, and an example application that utilises HARU is at https://github.com/beebdev/sigfish-haru.
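For readers unfamiliar with sDTW, the following is a minimal software sketch of the general technique, not HARU's hardware design or the sigfish implementation. It assumes an absolute-difference cost, the standard three-neighbour step pattern, and plain Python lists of numbers as signals; the function name `sdtw` is hypothetical and chosen only for illustration. Unlike ordinary DTW, the query may begin and end anywhere in the reference, which is what makes it suitable for matching a read prefix against a longer reference.

```python
def sdtw(query, reference):
    """Subsequence DTW: return (best_cost, end_index) for the best
    alignment of the whole query to any contiguous part of reference.

    Assumptions (illustrative only): absolute-difference cost and
    steps from the left, upper, and upper-left cells.
    """
    if not query or not reference:
        raise ValueError("query and reference must be non-empty")

    # First row: the query may start at any reference position for free,
    # so each cell holds only its own local cost.
    prev = [abs(query[0] - r) for r in reference]

    # Only two rows are kept in memory at a time.
    for q in query[1:]:
        # First column: the only predecessor is the cell directly above.
        curr = [prev[0] + abs(q - reference[0])]
        for j in range(1, len(reference)):
            best = min(prev[j], prev[j - 1], curr[j - 1])
            curr.append(best + abs(q - reference[j]))
        prev = curr

    # The query may end at any reference position: take the cheapest.
    end = min(range(len(prev)), key=prev.__getitem__)
    return prev[end], end


if __name__ == "__main__":
    cost, end = sdtw([1, 2, 3], [0, 0, 1, 2, 3, 0, 0])
    print(cost, end)
```

In a Read Until setting, a score like `best_cost` would typically be compared against a threshold to decide whether to keep or reject a read; the threshold and any signal normalisation are application-specific and are not shown here.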
