wangyufeng's blog

Exon-Based Strategy to Identify Differentially Expressed Genes in RNA-Seq Experiments  

2015-01-08 19:55:09 | Category: Bioinformatics analysis


Abstract

RNA-sequencing (RNA-seq) has rapidly become the method of choice in many genome-wide transcriptomic studies. To meet the high expectations placed on this technology, powerful computational techniques are needed to translate the measurements into biological and biomedical understanding. A number of statistical procedures have already been developed to identify differentially expressed genes between distinct sample groups. With these methods, statistical testing is typically performed after the data have been summarized at the gene level. As an alternative strategy intended to improve the results, we demonstrate a method in which statistical testing is performed at the exon level before the results are summarized at the gene level. Using publicly available RNA-seq datasets as case studies, we illustrate how this exon-based strategy can improve the performance of widely used differential expression software packages compared to the conventional gene-based strategy. In particular, we show how it enables robust detection of moderate but systematic changes that are missed when relying on a single gene-level summary count only.


Schematic illustration of two alternative strategies (gene-based and exon-based) for detecting differential expression between two sample groups.

The RNA-seq data are from the MAQC dataset, which contains two types of biological samples: human brain reference (brain) and human universal reference RNA (uhr). (A) Exon structure of the gene DCUN1D5. (B) Separate read counts for the eight exons of the gene. (C) Normalized total read counts across all exons of the gene. (D) Log2 fold change between the sample groups, shown separately for each exon. The number of stars above a bar indicates whether one or both of the two software packages (limma, edgeR) identify that exon as significant at p < 0.05. (E) Gene-level log fold change between the sample groups, obtained either directly from the gene-level read counts (gene-based strategy; left bar) or by taking the median over the exon-level changes (exon-based strategy; right bar). The exon-based strategy supports differential expression (median p = 3.69e-06 and 1.64e-09 with limma and edgeR, respectively), whereas the conventional gene-based strategy suggests that the gene is equally expressed in both groups (p = 0.91 with both limma and edgeR). The fold changes were determined here using the limma software package.

doi:10.1371/journal.pone.0115964.g001
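To make the contrast in panel (E) concrete, here is a minimal Python sketch of the two gene-level summaries. It is only an illustration under stated assumptions, not the authors' pipeline: the counts are invented toy numbers, the function names and the pseudocount are hypothetical, and there is no normalization or statistical testing (the paper uses limma and edgeR for that step). It shows how one highly expressed exon can mask a consistent change in the remaining exons when counts are summed per gene, while the median of the exon-level log2 fold changes still reflects it.

```python
import math
from statistics import median


def log2_fold_change(group_a, group_b, pseudocount=0.5):
    """log2 ratio of mean counts (group_b over group_a).

    The pseudocount is an illustrative choice to avoid log(0).
    """
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return math.log2((mean_b + pseudocount) / (mean_a + pseudocount))


def gene_based_lfc(exons_a, exons_b):
    """Gene-based strategy: sum the exon counts per sample, then compare."""
    totals_a = [sum(sample) for sample in zip(*exons_a)]
    totals_b = [sum(sample) for sample in zip(*exons_b)]
    return log2_fold_change(totals_a, totals_b)


def exon_based_lfc(exons_a, exons_b):
    """Exon-based strategy: compare each exon, then take the median."""
    per_exon = [log2_fold_change(a, b) for a, b in zip(exons_a, exons_b)]
    return median(per_exon)


# Toy data (invented): one inner list per exon, one count per sample.
# Four lowly expressed exons double in group B; one dominant exon drops slightly.
exons_a = [[10, 10], [10, 10], [10, 10], [10, 10], [1000, 1000]]
exons_b = [[20, 20], [20, 20], [20, 20], [20, 20], [960, 960]]

gene_lfc = gene_based_lfc(exons_a, exons_b)
exon_lfc = exon_based_lfc(exons_a, exons_b)

print(f"gene-based log2 fold change: {gene_lfc:.3f}")
print(f"exon-based log2 fold change: {exon_lfc:.3f}")
```

In this toy case the summed counts are identical in both groups, so the gene-based fold change is zero, while the exon-based median picks up the roughly twofold increase seen in four of the five exons. In a real analysis the exon-level p-values from limma or edgeR would be summarized per gene in the same way, as the median p-values in the caption suggest.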

Citation: Laiho A, Elo LL (2014) A Note on an Exon-Based Strategy to Identify Differentially Expressed Genes in RNA-Seq Experiments. PLoS ONE 9(12): e115964. doi:10.1371/journal.pone.0115964

 

