wangyufeng's blog

SNVer: calling common and rare variants from NGS data

2013-08-07 17:39:42 | Category: Bioinformatics analysis

SNVer is a statistical tool for calling common and rare variants in the analysis of pooled or individual next-generation sequencing data. It reports a single overall p-value for evaluating the significance of a candidate locus being a variant, on which multiplicity control can be based. Loci with any coverage, including low coverage, can be tested, and depth of coverage is quantitatively factored into the final significance calculation. SNVer runs very fast, making it feasible for the analysis of whole-exome or even whole-genome sequencing data.
http://snver.sourceforge.net/

SNVer for Pooled Sequencing

Usage: java -jar SNVerPool.jar -i input_directory -r reference_file 
	[-c pool_info_file | -n number_of_haploids] [options]

Required:

 -i	Input directory: a directory containing a batch of 
	standard SAM/BAM files.
	
 -r	Reference file; it must be the same reference used to produce the 
	input BAM files. An inconsistent reference, such as one with different 
	chromosome names or chromosome lengths, will prevent SNVer from 
	running. See how to prepare a reference in Getting Started.
	
 -c	Pool info file, a tab-delimited file with up to five columns. The 
	first column is the file name, the second column is the number of 
	haploids in the pool, and the third column is the number of samples. 
	For example, assuming diploid individuals, the number of haploids 
	should be 2 * no. of individuals in the pool. This file is preferred 
	when pools differ in size, and the output vcf file follows the order 
	of the file names listed. The following is an example of the pool 
	info file; lines starting with # are ignored automatically. In the 
	new SNVer, the user can set different mapping quality and base 
	quality thresholds for different pools in the fourth and fifth 
	columns; if they are missing, -mq and -bq are used for the analysis.
	
	#names	no.haploids	no.samples	mq	bq
	test1.bam	2	1		17	30
	test2.bam	2	1		20	20
	
 -n	The number of haploids, similar to the second column of the pool 
	info file; required when option -c is not given. If a number 
	is given here, SNVer assigns every pool the same number of 
	haploids for the calculation. For example, if each pool has 5 diploid 
	samples, input 10 haploids here.
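
The pool info layout and the -c / -n choice described above can be sketched in Python. This is only an illustration under stated assumptions: the helper names `pool_info_text` and `snver_pool_command` are hypothetical and not part of SNVer, individuals are assumed diploid, and the sketch treats -c and -n as mutually exclusive, following the usage line. It only builds the file text and the argument list; it does not run SNVer.

```python
def pool_info_text(pools):
    """Build pool info file text from (bam_name, n_individuals, mq, bq) tuples.

    Assumes diploid individuals, so haploids = 2 * individuals.
    Column order follows the example header: names, haploids, samples, mq, bq.
    """
    lines = ["#names\tno.haploids\tno.samples\tmq\tbq"]
    for name, n_individuals, mq, bq in pools:
        n_haploids = 2 * n_individuals  # diploid assumption
        lines.append(f"{name}\t{n_haploids}\t{n_individuals}\t{mq}\t{bq}")
    return "\n".join(lines) + "\n"


def snver_pool_command(input_dir, reference, pool_info=None, n_haploids=None):
    """Build the SNVerPool argument list with exactly one of -c or -n."""
    if (pool_info is None) == (n_haploids is None):
        raise ValueError("give exactly one of pool_info (-c) or n_haploids (-n)")
    cmd = ["java", "-jar", "SNVerPool.jar", "-i", input_dir, "-r", reference]
    if pool_info is not None:
        cmd += ["-c", pool_info]
    else:
        cmd += ["-n", str(n_haploids)]
    return cmd
```

For example, `snver_pool_command("bams", "ref.fa", n_haploids=10)` corresponds to pools of 5 diploid samples each; the directory and file names here are placeholders.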

SNVer for Individual Sequencing


Usage: java -jar SNVerIndividual.jar -i input_file -r reference_file [options]

Required:

 -i	Input file must be a standard SAM/BAM file.
 
 -r	Reference file; it must be the same reference used to produce the input 
	BAM file. An inconsistent reference, such as one with different chromosome 
	names or chromosome lengths, will prevent SNVer from running.
	See how to prepare a reference in Getting Started.
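
As a rough illustration of the usage line above, the following Python sketch assembles a SNVerIndividual command line. The function name `snver_individual_command` and the file names in the example are hypothetical; the flags passed through `options` are expected to be the ones listed under More Options, and the sketch does not validate them.

```python
def snver_individual_command(input_file, reference, options=None):
    """Build the SNVerIndividual argument list.

    options maps a flag name without the leading dash (e.g. "mq") to its
    value; flags are appended in the order given. No validation is done.
    """
    cmd = ["java", "-jar", "SNVerIndividual.jar", "-i", input_file, "-r", reference]
    for flag, value in (options or {}).items():
        cmd += [f"-{flag}", str(value)]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` when SNVerIndividual.jar is in the working directory.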
   

More Options:


 -l	Target regions in bed file format. If absent, SNVer will pile up
	the entire reference. The default is null.
    
 -o	The prefix of the output files. SNVer generates two vcf files: 
	prefix.raw.vcf and prefix.filter.vcf, the latter filtered according 
	to the p-value cutoff used. The default is input_file.filter.vcf for 
	SNVerIndividual and input_directory.filter.vcf for SNVerPool.
	
 -n	The number of haploids. For SNVerIndividual, the default is 2, 
	assuming a diploid individual. For SNVerPool, see the description of 
	-n under the SNVerPool requirements.
	
 -mq	The mapping quality threshold: only reads with mapping quality above 
	the cutoff are considered. The default is 20.
	
 -bq	The base quality threshold: only bases with base quality above the 
	cutoff are considered. The default is 17.
	
 -s	The strand bias threshold, used to remove potential false positives 
	due to strand bias. SNVer applies a one-sided binomial test to the 
	alternative forward count and alternative reverse count. The default 
	p-value cutoff is 0.0001.
	
 -f	The Fisher's exact test threshold, used to remove potential false positives
	due to allele imbalance. SNVer applies a one-sided Fisher's exact 
	test to the contingency table of alternative forward count, alternative 
	reverse count, reference forward count, and reference reverse count. 
	The default p-value cutoff is 0.0001.
	
 -p	The SNVer p-value threshold for calling significant variants. The 
	default cutoff is based on Bonferroni correction, defined as 
	0.05 / the number of tests. If you specify a p-value cutoff, say 0.5, 
	loci with a p-value greater than the cutoff are filtered out.
	The p-value range is [0, 1].
	
 -a	Require at least this number of reads supporting the alternative 
	allele on each strand. For example, if the alternative forward count 
	is 0 and the alternative reverse count is 10, the locus is discarded. 
	The default is at least 1 supporting read.
	
 -b	Require the alt/ref ratio to be above this threshold, to filter out 
	loci affected by reference bias. The default ratio is 0.25. This applies 
	only to individual data.
	
 -t	Allele frequency threshold, used only in the analysis of pooled
	data. For example, to test all variants, set this to 0.
	The default is 0.01. The allele frequency range is [0, 1].
	
 -het	The heterozygosity, used as the prior when computing posterior 
	genotype probabilities. It is used only for genotyping individual data. 
	The default is 0.001.
	
 -u	Inactivate -s and -f at or above this threshold. The default is 30, which means 
	that if 30 or more alternative reads are observed, SNVer will not conduct these tests. 
	Decreasing -u will increase sensitivity but lower specificity.

 -db	Look up snp_id values from a user-provided dbSNP file, given the 
	column numbers of chr, pos, and snp_id. The format is [path to dbSNP, column number of 
	chromosome, position and snp_id]. The default is null, meaning that no 
	lookup is performed.
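
The threshold rules stated for -p, -a, and -u can be illustrated with a small Python sketch. It mirrors only the wording of the option descriptions above and is not SNVer's actual implementation; all function names are hypothetical, and the statistical tests behind -s and -f are deliberately left out.

```python
def bonferroni_cutoff(n_tests, alpha=0.05):
    """Default -p cutoff as described: 0.05 / the number of tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def keep_locus(p_value, cutoff):
    """-p: loci with a p-value greater than the cutoff are filtered out."""
    return p_value <= cutoff


def has_strand_support(alt_forward, alt_reverse, min_reads=1):
    """-a: require at least min_reads alternative reads on each strand."""
    return alt_forward >= min_reads and alt_reverse >= min_reads


def bias_tests_enabled(alt_count, u=30):
    """-u: the -s and -f tests are skipped once alt_count reaches u."""
    return alt_count < u
```

With the defaults, a locus with alternative forward count 0 and reverse count 10 fails `has_strand_support`, matching the -a example, and 1000 tests give a Bonferroni cutoff of 0.05 / 1000.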
Download: http://snver.sourceforge.net/manual.html
 