wangyufeng's blog

ANNOVAR: Functional annotation of genetic variants from high-throughput sequencing data  

2011-10-27 09:18:55 | Category: Bioinformatics analysis


ANNOVAR is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes (including human genome builds hg18 and hg19, as well as mouse, worm, fly, yeast and many others). Given a list of variants with chromosome, start position, end position, reference nucleotide and observed nucleotides, ANNOVAR can perform:
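As a rough illustration of the five-field input described above, the following Python sketch parses one whitespace-separated variant line. The `Variant` class, the `parse_variant` function and the sample line are hypothetical; they only mirror the fields named in the text and are not ANNOVAR's own code or a confirmed file specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One variant with the five fields listed in the text."""
    chrom: str
    start: int
    end: int
    ref: str
    obs: str


def parse_variant(line: str) -> Variant:
    """Parse 'chromosome start end reference observed' from one line.

    Assumption: fields are whitespace-separated; extra trailing
    columns are ignored.
    """
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}")
    chrom, start, end, ref, obs = fields[:5]
    start_i, end_i = int(start), int(end)
    if end_i < start_i:
        raise ValueError("end position precedes start position")
    return Variant(chrom, start_i, end_i, ref, obs)


if __name__ == "__main__":
    # Made-up example line, for illustration only.
    print(parse_variant("1 1000 1000 C T"))
```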

  1. Gene-based annotation: identify whether SNPs or CNVs cause protein-coding changes and which amino acids are affected. Users can flexibly use RefSeq genes, UCSC genes, ENSEMBL genes, GENCODE genes, or many other gene definition systems.
  2. Region-based annotation: identify variants in specific genomic regions, for example, conserved regions among 44 species, predicted transcription factor binding sites, segmental duplication regions, GWAS hits, the Database of Genomic Variants, DNase I hypersensitivity sites, ENCODE H3K4Me1/H3K4Me3/H3K27Ac/CTCF sites, ChIP-Seq peaks, RNA-Seq peaks, or many other annotations on genomic intervals.
  3. Filter-based annotation: identify variants that are reported in dbSNP, or identify the subset of common SNPs (MAF>1%) in the 1000 Genomes Project, or identify the subset of non-synonymous SNPs with SIFT score>0.05, or many other annotations on specific mutations.
  4. Other functionalities: retrieve the nucleotide sequence at any user-specified genomic positions in batch, identify a candidate gene list for Mendelian diseases from exome data, identify a list of SNPs from 1000 Genomes that are in strong LD with a GWAS hit, and many other creative utilities.
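The idea behind region-based annotation in the list above can be sketched as an interval-overlap test: a variant is annotated with every region whose interval it overlaps. This is a minimal sketch, not ANNOVAR's implementation; the `annotate_regions` function, the linear scan and the region labels are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# (start, end, label) with 1-based inclusive coordinates (an assumption).
Region = Tuple[int, int, str]


def annotate_regions(
    regions_by_chrom: Dict[str, List[Region]],
    chrom: str,
    start: int,
    end: int,
) -> List[str]:
    """Return labels of all regions overlapping the variant [start, end].

    Uses a plain linear scan per chromosome; a real tool would likely
    use an index, which is omitted here for clarity.
    """
    hits = []
    for r_start, r_end, label in regions_by_chrom.get(chrom, []):
        # Two closed intervals overlap when each starts before the other ends.
        if r_start <= end and start <= r_end:
            hits.append(label)
    return hits


if __name__ == "__main__":
    # Hypothetical regions, for illustration only.
    regions = {"1": [(100, 200, "segdup_A"), (500, 600, "tfbs_B")]}
    print(annotate_regions(regions, "1", 150, 150))
    print(annotate_regions(regions, "1", 300, 300))
```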

On a modern desktop computer (3 GHz Intel Xeon CPU, 8 GB memory), for 4.7 million variants, ANNOVAR requires ~4 minutes to perform gene-based functional annotation, or ~15 minutes to perform a stepwise "variants reduction" procedure, making it practical to handle hundreds of human genomes in a day.
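The timing figures above can be turned into a quick back-of-the-envelope calculation. The sketch below assumes one genome corresponds to the 4.7 million variants quoted, and that genomes are processed one after another on a single machine; both are simplifying assumptions, not statements from the original text.

```python
VARIANTS_PER_GENOME = 4_700_000    # figure quoted in the text
GENE_ANNOTATION_MINUTES = 4        # approximate, from the text
VARIANT_REDUCTION_MINUTES = 15     # approximate, from the text
MINUTES_PER_DAY = 24 * 60


def variants_per_second(n_variants: int, minutes: float) -> float:
    """Average throughput for a run of the given length."""
    return n_variants / (minutes * 60)


def genomes_per_day(minutes_per_genome: int) -> int:
    """Whole genomes that fit in 24 hours when run serially."""
    return MINUTES_PER_DAY // minutes_per_genome


if __name__ == "__main__":
    rate = variants_per_second(VARIANTS_PER_GENOME, GENE_ANNOTATION_MINUTES)
    print(f"gene-based annotation: about {rate:.0f} variants/second")
    print("gene-based runs per day:", genomes_per_day(GENE_ANNOTATION_MINUTES))
    print("variant-reduction runs per day:", genomes_per_day(VARIANT_REDUCTION_MINUTES))
```

Under these assumptions, the gene-based step alone fits 360 runs into a day, while the longer stepwise procedure fits 96 runs on one machine.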


What’s new:

2011Oct02: The last version of ANNOVAR introduced some bugs related to ncRNA annotation, which in turn affected exonic/splicing annotation. An updated version has been released. Please report bugs to me if you still see problems.

2011Sep11: A new version of ANNOVAR is released with a significant speedup of filter operations for certain databases (dbSNP, SIFT, PolyPhen, 1000G, etc.), thanks to Ion Flux for the speed improvements. In the previous version of ANNOVAR, filter-based annotation for ex1.human (12 variants) required ~10 minutes for snp132, sift or polyphen. In the new version, it takes only 1 second! Performance improvements for larger query files will be less apparent. To use the new version, it is necessary to re-download the databases with -downdb. See details here. (Updated 2011Sep14: a user reported that the previously uploaded program could not download the index file correctly; this has been fixed. Please download the ANNOVAR program again.)

2011Jun18: A new version of ANNOVAR is released with some function enhancements. New mRNA FASTA files were uploaded for hg18 and hg19 (refseq, knowngene, ensgene), given recent updates in gene annotations.

2011Jun18: The 1000g2010nov file was updated to include indel calls. It now has 26.1 million SNPs (released by 1000G in Nov 2010 based on Aug 2010 alignments) and 3.7 million indels (released by 1000G in Feb 2011 based on Aug 2010 alignments). A new 1000g2011may file was provided with 39 million SNPs. Read details here.

2011May06: A new version of ANNOVAR is released with minor bug fixes and feature enhancements. Whole-exome pre-computed PolyPhen v2, MutationTaster, LRT and PhyloP scores are available as ANNOVAR annotation databases to give more detailed annotation of non-synonymous mutations in humans, in addition to SIFT. Use "-downdb ljb_pp2 -webfrom annovar", "-downdb ljb_lrt -webfrom annovar", "-downdb ljb_mt -webfrom annovar", "-downdb ljb_phylop -webfrom annovar" to download them. Add "-buildver hg19" to download them in hg19 coordinates. The annotation database name ljb refers to the Liu, Jian, Boerwinkle paper in Human Mutation (PubMed ID 21520341). Cite this paper if you use the scores; higher scores (0-1) represent functionally more deleterious predictions. (2011May11: There was a bug in the hg18_lrt_pp2 file which has now been fixed; if you downloaded it before this date, please download the file again. Please report other bugs.)
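For readers who want to script the four downloads, the option strings quoted above can be assembled programmatically. This sketch only builds the argument lists from flags that appear in the post; the name of the ANNOVAR script to run and the target database directory are not given here, so they are deliberately left out, and `downdb_args` is a hypothetical helper.

```python
from typing import List, Optional

# Database names exactly as quoted in the post.
LJB_DATABASES = ["ljb_pp2", "ljb_lrt", "ljb_mt", "ljb_phylop"]


def downdb_args(db: str, buildver: Optional[str] = None) -> List[str]:
    """Build the download options for one database.

    Only flags quoted in the post are used. The program name and any
    positional arguments must be supplied by the caller.
    """
    args = ["-downdb", db, "-webfrom", "annovar"]
    if buildver is not None:
        args += ["-buildver", buildver]
    return args


if __name__ == "__main__":
    for db in LJB_DATABASES:
        print(" ".join(downdb_args(db, buildver="hg19")))
```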

2011May03: Forty-six whole genomes (variant calls and allele frequency information) from Complete Genomics are now available as an ANNOVAR annotation database. Users need to use "-downdb cg46 -webfrom annovar" (with either '-buildver hg18' or '-buildver hg19') to download the file. For filter-based annotation, use "-dbtype generic -genericdbfile hg18_cg46.txt". The -score_threshold argument can be used to apply a MAF threshold.
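Conceptually, applying a MAF threshold during filter-based annotation means looking each variant up in an allele-frequency table and separating those at or above the threshold from the rest. The sketch below illustrates that idea only; the `split_by_frequency` function, the in-memory table and the "at or above counts as a match" rule are assumptions, and the exact behaviour of -score_threshold in ANNOVAR may differ.

```python
from typing import Dict, Iterable, List, Tuple

# (chromosome, start, end, reference, observed)
VariantKey = Tuple[str, int, int, str, str]


def split_by_frequency(
    variants: Iterable[VariantKey],
    freq_db: Dict[VariantKey, float],
    threshold: float,
) -> Tuple[List[VariantKey], List[VariantKey]]:
    """Split variants into (matched, remaining).

    A variant is 'matched' when it is present in freq_db with a
    frequency at or above the threshold (an assumed convention).
    Variants absent from freq_db always go to 'remaining'.
    """
    matched, remaining = [], []
    for v in variants:
        freq = freq_db.get(v)
        if freq is not None and freq >= threshold:
            matched.append(v)
        else:
            remaining.append(v)
    return matched, remaining


if __name__ == "__main__":
    # Made-up frequencies, for illustration only.
    db = {("1", 100, 100, "C", "T"): 0.25, ("1", 200, 200, "G", "A"): 0.005}
    query = [("1", 100, 100, "C", "T"), ("1", 200, 200, "G", "A"), ("2", 50, 50, "A", "G")]
    common, rest = split_by_frequency(query, db, threshold=0.01)
    print("common:", common)
    print("rest:", rest)
```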

2011Apr18: New mRNA FASTA files were uploaded for hg18 and hg19 (refseq, knowngene, ensgene), given recent updates in gene annotations. Users can always generate the latest files themselves using retrieve_seq_from_fasta.pl.

2011Mar25: dbSNP version 132 in hg19 coordinates is available, with >30 million SNPs (more than double the size of dbSNP131). Download the files from the download page, or use "-downdb -webfrom annovar" in ANNOVAR to download directly (as the file is from ANNOVAR, not UCSC).

2011Mar18: dbSNP versions 131 and 132 in hg18 coordinates! There is huge community demand for the latest dbSNP in hg18 (NCBI 36), but unfortunately dbSNP elected to work on hg19 only. Dr. Leparc lifted over the latest dbSNP files and provided the dbSNP131 and dbSNP132 files in hg18 coordinates for use in ANNOVAR. Download the files from the download page, or use "-downdb -webfrom annovar" in ANNOVAR to download directly (-webfrom is required as the file is from the ANNOVAR website).

2011Mar01: Small update to the AVSIFT database based on updated annotations at http://sift-dna.org/.


Reference:

If you have questions, comments or concerns, contact as
