Paper Title
Identifying Stress Responsive Genes using Overlapping Communities in Co-expression Networks
Paper Authors
Paper Abstract
This paper proposes a workflow to identify genes that respond to a specific treatment in plants. The workflow takes as input RNA sequencing read counts and phenotypic data for different genotypes, measured under control and treatment conditions. It outputs a reduced set of genes marked as relevant to the treatment response. Technically, the proposed approach is both a generalization and an extension of WGCNA. It aims to identify specific modules among the overlapping communities underlying the gene co-expression network. Module detection is achieved using Hierarchical Link Clustering. Such modules can capture the overlapping nature of the systems' regulatory domains that generate co-expression. LASSO regression is employed to analyze the phenotypic responses of modules to treatment. Results. The workflow is applied to rice (Oryza sativa), a major food source known to be highly sensitive to salt stress. The workflow identifies 19 rice genes that appear relevant to the salt stress response. They are distributed across 6 modules: 3 modules, each grouping together 3 genes, are associated with shoot K content; 2 modules of 3 genes each are associated with shoot biomass; and 1 module of 4 genes is associated with root biomass. These genes are candidate targets for improving salinity tolerance in rice. Conclusion. A more effective framework is introduced to reduce the search space for target genes that respond to a specific treatment. It facilitates experimental validation by restricting efforts to a smaller subset of genes of high potential relevance.
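To illustrate how link-based clustering can place one gene in several modules, the following is a minimal sketch and not the paper's implementation. It assumes an unweighted, simple co-expression graph (no self-loops or repeated edges), scores pairs of edges that share a node by the Jaccard index of the inclusive neighbourhoods of their unshared endpoints, and merges edges by single linkage at a fixed cut-off. The function name `link_communities`, the `threshold` parameter, and the gene labels `g1` to `g5` are hypothetical; a full Hierarchical Link Clustering would normally choose the cut from the dendrogram rather than take a fixed threshold.

```python
from itertools import combinations


def link_communities(edges, threshold):
    """Cluster edges by single linkage on Jaccard link similarity.

    Returns the node set of each edge cluster, so a node can appear in
    several communities. Assumes a simple undirected graph.
    """
    edges = [tuple(sorted(e)) for e in edges]

    # Inclusive neighbourhood: each node together with its neighbours.
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, {u}).add(v)
        nbrs.setdefault(v, {v}).add(u)

    # Union-find over edges (hypothetical helper, path-halving variant).
    parent = {e: e for e in edges}

    def find(e):
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for e1, e2 in combinations(edges, 2):
        shared = set(e1) & set(e2)
        if len(shared) != 1:
            continue  # only edges meeting at exactly one node are compared
        (i,) = set(e1) - shared
        (j,) = set(e2) - shared
        sim = len(nbrs[i] & nbrs[j]) / len(nbrs[i] | nbrs[j])
        if sim >= threshold:
            parent[find(e1)] = find(e2)

    # Convert edge clusters into (possibly overlapping) node communities.
    groups = {}
    for e in edges:
        groups.setdefault(find(e), set()).update(e)
    return sorted(sorted(g) for g in groups.values())


# Toy network: two triangles of genes that share the gene g3.
toy_edges = [
    ("g1", "g2"), ("g1", "g3"), ("g2", "g3"),
    ("g3", "g4"), ("g3", "g5"), ("g4", "g5"),
]
communities = link_communities(toy_edges, threshold=0.5)
print(communities)
```

In this toy network the gene `g3` belongs to both resulting communities, which is the kind of overlap that a node-partitioning method cannot express.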