
wangyufeng's blog

CNV-seq: a method for detecting DNA copy number variation (CNV) using high-throughput sequencing

2010-11-14 18:34:20 | Category: Bioinformatics analysis


The CNV-seq method has been tested in the following configurations:

OS: Mac OS X Leopard, Ubuntu Hardy, CentOS 4.3, Debian 5

Perl: 5.8.8, 5.10.0

R: 2.10.1

ggplot2: 0.8.5

The package contains Perl scripts and one R package. To use CNV-seq, you need to install Perl and R first. You can then download the CNV-seq package from http://tiger.dbs.nus.edu.sg/CNV-seq. After downloading the package, open a command line console and run:

$ tar xzf cnv-seq.tar.gz

$ cd cnv-seq

$ ls

There are several Perl scripts (best-hit.*.pl and cnv-seq.pl) and one R package, cnv, in this directory. To install the Perl scripts, simply move or copy them to your desired location.

To install the R package, run:

$ R CMD INSTALL cnv/

Please note that the R package cnv requires the ggplot2 package for plotting CNV graphs. Install ggplot2 if you need the plotting functions in the cnv package.

 CNV-seq usage

 best-hit.*.pl

The only input CNV-seq requires is a best-hit location file that lists the best mapping location of each mapped sequence read, one read per line, in the following format:

1 1234

1 23456

chr2 999

chr2 999

X 1234
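To make the format concrete, here is a minimal Python sketch of a reader for such a file. It is not part of CNV-seq. The function name `parse_hits` is hypothetical, and the sketch assumes that each line holds a chromosome name and a position separated by whitespace (space or tab), as in the example above.

```python
from typing import Iterable, Iterator, Tuple


def parse_hits(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (chromosome, position) pairs from best-hit lines.

    Assumption: two whitespace-separated fields per line; blank lines are skipped.
    """
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 fields, got {len(fields)}"
            )
        chromosome, position = fields
        # Chromosome names stay as strings because "1", "chr2" and "X" all occur.
        yield chromosome, int(position)


example = ["1 1234", "1 23456", "chr2 999", "chr2 999", "X 1234"]
hits = list(parse_hits(example))
print(hits)
```

Note that duplicate lines such as the two `chr2 999` entries are kept: each line stands for one read, so two reads mapping to the same position count twice.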

The authors provide best-hit.*.pl scripts for extracting the best mapping locations from the output of several alignment tools.
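The idea of a "best hit" can be illustrated with a small Python sketch. This is not the logic of the best-hit.*.pl scripts, which is not described here. It assumes that each alignment carries a numeric score, that the highest score wins, and that reads whose best score is tied between locations are dropped. The function name `best_hits` and the tuple layout are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# (read_id, chromosome, position, score); this layout is an assumption
Alignment = Tuple[str, str, int, float]


def best_hits(alignments: Iterable[Alignment]) -> List[Tuple[str, int]]:
    """Return one (chromosome, position) per read, keeping the top-scoring hit.

    Reads whose top score is shared by several alignments are discarded,
    because their location is ambiguous (an assumption of this sketch).
    """
    # read_id -> (score, chromosome, position, tied)
    best: Dict[str, Tuple[float, str, int, bool]] = {}
    for read_id, chromosome, position, score in alignments:
        current = best.get(read_id)
        if current is None or score > current[0]:
            best[read_id] = (score, chromosome, position, False)
        elif score == current[0]:
            best[read_id] = (current[0], current[1], current[2], True)
    return [(chrom, pos) for _, chrom, pos, tied in best.values() if not tied]


alignments = [
    ("r1", "1", 100, 50.0),
    ("r1", "2", 200, 40.0),   # lower score, ignored
    ("r2", "X", 5, 30.0),
    ("r2", "3", 7, 30.0),     # tie with the previous hit, r2 is dropped
    ("r3", "chr2", 999, 20.0),
]
result = best_hits(alignments)
print(result)
```

The output pairs have the same chromosome and position layout as the best-hit location file shown above.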

 cnv-seq.pl

cnv-seq.pl calculates the sliding window size, counts the number of mapped hits in each window, and calls the cnv R package to calculate log2 ratios and annotate CNVs. To see the usage of cnv-seq.pl, run the script without any arguments:

$ ./cnv-seq.pl

usage: cnv-seq.pl [options]
    --test = test.hits.file
    --ref = ref.hits.file
    --log2-threshold = number
        (default=0.6)
    --p-value = number
        (default=0.001)
    --bigger-window = number
        (default=2, in order to use larger window than minimum)
    --genome = (human, chicken, chrom1, autosome, sex, chromX, chromY)
        (higher priority than --genome-size)
    --genome-size = number
        (in bases, overwritten by --genome option)
    --global-normalization
        (if used, normalization on whole genome,
        instead of on each chromosome)
    --annotate
    --no-annotate
        (default do annotation)
    --minimum-windows-required = number
        (default=4; only for annotation of CNV)
    --Rexe = path to your R program
        (default=R)
    --help

The options --test and --ref are required. Either --genome or --genome-size must be specified as well. All other options have the default values shown above. The --test and --ref options accept the best-hit files output by best-hit.*.pl for the test and reference individuals.
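The rules for required options can be sketched in Python with argparse. This is an illustration only, not the option handling of cnv-seq.pl itself. The names `GENOME_SIZES`, `build_parser` and `resolve_genome_size` are hypothetical, and the size table contains only chrom1, whose size (247249719) is the one value that appears in the sample output of this post.

```python
import argparse

# Only the chrom1 size is known from this post; other genomes are left out.
GENOME_SIZES = {"chrom1": 247249719}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cnv-seq-sketch")
    parser.add_argument("--test", required=True, help="best-hit file, test individual")
    parser.add_argument("--ref", required=True, help="best-hit file, reference individual")
    parser.add_argument("--genome", choices=sorted(GENOME_SIZES))
    parser.add_argument("--genome-size", type=int, help="genome size in bases")
    return parser


def resolve_genome_size(args: argparse.Namespace) -> int:
    """Apply the stated priority: --genome overrides --genome-size."""
    if args.genome is not None:
        return GENOME_SIZES[args.genome]
    if args.genome_size is not None:
        return args.genome_size
    raise SystemExit("either --genome or --genome-size must be specified")


# Both options given: the named genome wins over the explicit size.
args = build_parser().parse_args(
    ["--test", "test.hits", "--ref", "ref.hits",
     "--genome", "chrom1", "--genome-size", "1000"]
)
print(resolve_genome_size(args))
```

Leaving out --test or --ref makes argparse exit with an error, which mirrors the "required" rule. Leaving out both genome options triggers the explicit check in `resolve_genome_size`.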

 R package cnv

By default, cnv-seq.pl calls the R package cnv and writes a tab-delimited file containing the log2 ratios and (optionally) the CNV annotation. However, to get the full power of the cnv package, you are strongly recommended to run it from R yourself.
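To show what a per-window log2 ratio means, here is a simplified Python sketch. It is not the algorithm of cnv-seq.pl or of the cnv package. It assumes non-overlapping windows, normalizes by the total number of reads in each sample, and skips windows where either sample has no reads. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Hit = Tuple[str, int]      # (chromosome, position)
Window = Tuple[str, int]   # (chromosome, window index)


def count_per_window(hits: List[Hit], window_size: int) -> Counter:
    """Count hits in consecutive, non-overlapping windows (a simplification)."""
    return Counter((chrom, pos // window_size) for chrom, pos in hits)


def log2_ratios(test_hits: List[Hit], ref_hits: List[Hit],
                window_size: int) -> Dict[Window, float]:
    """log2 of the test/ref count ratio per window, scaled by total read counts."""
    if not test_hits or not ref_hits:
        raise ValueError("both samples need at least one hit")
    test_counts = count_per_window(test_hits, window_size)
    ref_counts = count_per_window(ref_hits, window_size)
    scale = len(ref_hits) / len(test_hits)  # corrects for unequal sequencing depth
    ratios: Dict[Window, float] = {}
    for window, ref_count in ref_counts.items():
        test_count = test_counts.get(window, 0)
        if test_count > 0:  # log2(0) is undefined, so empty windows are skipped
            ratios[window] = math.log2(test_count / ref_count * scale)
    return ratios


test_hits = [("1", 10), ("1", 20), ("1", 30), ("1", 40), ("1", 150), ("1", 160)]
ref_hits = [("1", 5), ("1", 15), ("1", 110), ("1", 120), ("1", 130), ("1", 140)]
ratios = log2_ratios(test_hits, ref_hits, window_size=100)
print(ratios)
```

In this toy data the first window has twice as many test reads as reference reads (log2 ratio 1) and the second has half as many (log2 ratio -1), so both would pass a threshold of 0.6 in absolute value.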

 The cnv package contains several functions:

cnv.cal <- function (file, log2.threshold = NA,
    chromosomal.normalization = TRUE,
    annotate = FALSE, minimum.window = 4)

cnv.print <- function (cnv, file = "")

cnv.summary <- function (cnv)

plot.cnv.all <- function (data, chrom.gap = 2e+07,
    colour = 5, title = NA, ylim = c(-2, 2),
    xlabel = "Chromosome")

plot.cnv.chr <- function (data, chromosome = NA,
    from = NA, to = NA, title = NA,
    ylim = c(-4, 4), glim = c(NA, NA),
    xlabel = "Position (bp)")

plot.cnv.cnv <- function (data, CNV, upstream = NA,
    downstream = NA, ...)

 Demonstration

The authors [39] provide sample data for demonstration.

You can download Sample1.tar.gz from their website. Sample1.tar.gz contains BLAT output for simulated Solexa reads with 1X coverage of human chromosome 1.

After downloading the file, open a command line console and run:

 $ cd DOWNLOAD_DIR

$ tar xzf Sample1.tar.gz

$ PATH-TO/best-hit.BLAT.pl ref.psl > ref.hits

$ PATH-TO/best-hit.BLAT.pl test.psl > test.hits

# NOTE: there are several other versions of
# best-hit.*.pl for different input formats

The above lines generate two output files, test.hits and ref.hits, which contain the genomic locations of the best BLAT hits. The authors also provide these two files on the website mentioned above as Sample2.tar.gz, which is much smaller than Sample1.tar.gz. After obtaining the two hits files, you can run cnv-seq.pl:

# the options on the last two lines are given at the values the authors mark as default
$ cnv-seq.pl --test test.hits --ref ref.hits --genome chrom1 \
    --log2 0.6 --p 0.001 --bigger-window 1.5 \
    --annotate --minimum-windows 4

 This will give you output like this:

 genome size used for calculation is 247249719

test.hits: 1874797 reads

ref.hits: 1878852 reads

 The minimum window size for detecting log2>= 0.6 should be 17676.9728733869

The minimum window size for detecting log2<=-0.6 should be 17692.0064154661

window size to use is 17692.0064154661 x 1.5 = 26538

window size to be used: 26538

read 1874797 test reads, out of 1874797 lines

read 1878852 ref reads, out of 1878852 lines

write read-counts into file: test.hits-vs-ref.hits.log2-0.6.pvalue-0.001.count

R package cnv output: test.hits-vs-ref.hits.log2-0.6.pvalue-0.001.minw-4.cnv

...

[1] "chromosome: 1"

[1] "cnv_id: 1 of 50"

[1] "cnv_id: 2 of 50"

[1] "cnv_id: 3 of 50"

[1] "cnv_id: 4 of 50"

...

The sliding window size used is 26.5Kb.
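The window-size arithmetic in this log can be reproduced with a few lines of Python. The sketch uses only numbers printed above. It assumes that the larger of the two minimum window sizes is multiplied by the --bigger-window factor and that the result is cut to a whole number of bases (truncation and rounding give the same value here). The function name `window_size_from_minimums` is hypothetical, and the formula behind the two minimum sizes is not reproduced.

```python
def window_size_from_minimums(minimum_gain: float, minimum_loss: float,
                              bigger_window: float) -> int:
    """Scale the larger minimum window size and drop the fractional part."""
    return int(max(minimum_gain, minimum_loss) * bigger_window)


# Values copied from the log output above.
minimum_gain = 17676.9728733869   # minimum window for log2 >= 0.6
minimum_loss = 17692.0064154661   # minimum window for log2 <= -0.6
genome_size = 247249719
test_reads = 1874797

window_size = window_size_from_minimums(minimum_gain, minimum_loss, 1.5)
print(window_size)  # 26538, matching "window size to be used" in the log

# Average number of test reads expected per window if reads were spread evenly.
mean_test_reads_per_window = test_reads * window_size / genome_size
print(round(mean_test_reads_per_window, 1))
```

With roughly 1.87 million reads on a 247 Mb chromosome, each 26.5 kb window holds about 200 reads on average.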

This produces a tab-delimited file, test.hits-vs-ref.hits.log2-0.6.pvalue-0.001.minw-4.cnv, which contains all the information about the CNV predictions from the analysis. To plot the log2 CNV graph:

 $ R

# in R command prompt

> library(cnv)

> data <- read.delim("test.hits-vs-ref.hits.log2-0.6.pvalue-0.001.minw-4.cnv")

> cnv.print(data)

# output ...

> cnv.summary(data)

# output ...

> plot.cnv(data, CNV=4, upstream=4e+6, downstream=4e+6)

 >ggsave("sam


For more information, see: http://lib.bioinfo.pl/courses/view/563
