-
Notifications
You must be signed in to change notification settings - Fork 3
Open
Description
Hi,
I'm trying for the first time to use the package. I want to analysis H3k27me3 CHIP-Seq data, and therefore I use the same parameters shown in the tutorial for this histone modification.
When I run the function filterWindowsGlobal for filtering of low-abundance windows, it takes more than 12 hours. Any idea way? my bam file contain ~5 million reads each.
My code:
library(BiocFileCache)
library(csaw)
## read the metadata file
h3k27me3data <-
read.csv("../CSAW_df.csv")
h3k27me3data <- data.frame(apply(h3k27me3data, 2, as.character), stringsAsFactors = F)
## define the blacklist using the predicted repeats from the RepeatMasker software
bfc <- BiocFileCache("local", ask=FALSE)
black.path <- bfcrpath(bfc, file.path("http://hgdownload.cse.ucsc.edu",
"goldenPath/mm10/bigZips/chromOut.tar.gz"))
tmpdir <- tempfile()
dir.create(tmpdir)
untar(black.path, exdir=tmpdir)
## Iterate through all chromosomes.
collected <- list()
for (x in list.files(tmpdir, full=TRUE)) {
f <- list.files(x, full=TRUE, pattern=".fa.out")
to.get <- vector("list", 15)
to.get[[5]] <- "character"
to.get[6:7] <- "integer"
collected[[length(collected)+1]] <- read.table(f, skip=3,
stringsAsFactors=FALSE, colClasses=to.get)
}
collected <- do.call(rbind, collected)
blacklist <- GRanges(collected[,1], IRanges(collected[,2], collected[,3]))
blacklist
## minimum mapping quality score to 10.
## We also restrict ourselves to the standard chromosomes
param <- readParam(minq=10, discard=blacklist,
restrict=paste0("chr", c(1:19, "X", "Y")))
## Counting reads into windows
win.data <- windowCounts(h3k27me3data$Path, param=param, width=2000,
spacing=500, ext=200)
bins <- windowCounts(h3k27me3data$Path, bin=TRUE, width=10000, param=param)
filter.stat <- filterWindowsGlobal(win.data, bins)
Thank you!!
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels