RUBICON: A Framework for Designing Efficient Deep Learning-Based Genomic Basecallers

Paper Title

RUBICON: A Framework for Designing Efficient Deep Learning-Based Genomic Basecallers

Authors

Gagandeep Singh, Mohammed Alser, Kristof Denolf, Can Firtina, Alireza Khodamoradi, Meryem Banu Cavlak, Henk Corporaal, Onur Mutlu

Abstract

Nanopore sequencing generates noisy electrical signals that need to be converted into a standard string of DNA nucleotide bases using a computational step called basecalling. The accuracy and speed of basecalling have critical implications for all later steps in genome analysis. Many researchers adopt complex deep learning-based models to perform basecalling without considering the compute demands of such models, which leads to slow, inefficient, and memory-hungry basecallers. Therefore, there is a need to reduce the computation and memory cost of basecalling while maintaining accuracy. Our goal is to develop a comprehensive framework for creating deep learning-based basecallers that provide high efficiency and performance. We introduce RUBICON, a framework to develop hardware-optimized basecallers. RUBICON consists of two novel machine-learning techniques that are specifically designed for basecalling. First, we introduce the first quantization-aware basecalling neural architecture search (QABAS) framework to specialize the basecalling neural network architecture for a given hardware acceleration platform while jointly exploring and finding the best bit-width precision for each neural network layer. Second, we develop SkipClip, the first technique to remove the skip connections present in modern basecallers to greatly reduce resource and storage requirements without any loss in basecalling accuracy. We demonstrate the benefits of RUBICON by developing RUBICALL, the first hardware-optimized basecaller that performs fast and accurate basecalling. Compared to the fastest state-of-the-art basecaller, RUBICALL provides a 3.96x speedup with 2.97% higher accuracy. We show that RUBICON helps researchers develop hardware-optimized basecallers that are superior to expert-designed models.
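
The abstract notes that QABAS searches for a bit-width precision per neural network layer. As a rough illustration of the trade-off such a search has to weigh, the sketch below applies plain symmetric uniform quantization to a handful of made-up weights at several bit-widths and measures the rounding error. This is not the QABAS algorithm; the function names (`quantize`, `mean_abs_error`) and the weight values are hypothetical and chosen only for this example.

```python
def quantize(values, bits):
    """Symmetric uniform quantization: map floats onto a signed
    `bits`-bit integer grid, then map them back to floats."""
    if bits < 2:
        raise ValueError("need at least 2 bits for a signed grid")
    qmax = 2 ** (bits - 1) - 1            # largest representable integer level
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return list(values)               # all zeros: nothing to quantize
    scale = max_abs / qmax                # float value of one integer step
    return [round(v / scale) * scale for v in values]


def mean_abs_error(original, approx):
    """Average absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(original, approx)) / len(original)


# Hypothetical layer weights, for illustration only.
weights = [0.82, -0.41, 0.07, -0.93, 0.25, 0.58]

# Rounding error introduced at each candidate bit-width.
errors = {bits: mean_abs_error(weights, quantize(weights, bits))
          for bits in (4, 8, 16)}

for bits, err in errors.items():
    print(f"{bits:>2}-bit mean abs error: {err:.6f}")
```

Fewer bits mean less storage and cheaper arithmetic but a larger rounding error, which is why a layer that tolerates coarse weights can be given a lower precision than one that does not.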
