Skip to content

filterWindowsGlobal() function running is extremely long #16

@RacheliRap

Description

@RacheliRap

Hi,
I'm trying for the first time to use the package. I want to analysis H3k27me3 CHIP-Seq data, and therefore I use the same parameters shown in the tutorial for this histone modification.
When I run the function filterWindowsGlobal for filtering of low-abundance windows, it takes more than 12 hours. Any idea way? my bam file contain ~5 million reads each.
My code:

library(BiocFileCache)
library(csaw)


## read the metadata file
h3k27me3data <- 
  read.csv("../CSAW_df.csv")
h3k27me3data <- data.frame(apply(h3k27me3data, 2, as.character), stringsAsFactors = F)

## define the blacklist using the predicted repeats from the RepeatMasker software
bfc <- BiocFileCache("local", ask=FALSE)
black.path <- bfcrpath(bfc, file.path("http://hgdownload.cse.ucsc.edu",
                                      "goldenPath/mm10/bigZips/chromOut.tar.gz"))
tmpdir <- tempfile()
dir.create(tmpdir)
untar(black.path, exdir=tmpdir)

## Iterate through all chromosomes.
collected <- list()
for (x in list.files(tmpdir, full=TRUE)) {
  f <- list.files(x, full=TRUE, pattern=".fa.out")
  to.get <- vector("list", 15)
  to.get[[5]] <- "character"
  to.get[6:7] <- "integer"
  collected[[length(collected)+1]] <- read.table(f, skip=3, 
                                                 stringsAsFactors=FALSE, colClasses=to.get)
}

collected <- do.call(rbind, collected)
blacklist <- GRanges(collected[,1], IRanges(collected[,2], collected[,3]))
blacklist

## minimum mapping quality score to 10. 
## We also restrict ourselves to the standard chromosomes
param <- readParam(minq=10, discard=blacklist,
                   restrict=paste0("chr", c(1:19, "X", "Y")))

## Counting reads into windows
win.data <- windowCounts(h3k27me3data$Path, param=param, width=2000,
                         spacing=500, ext=200)

bins <- windowCounts(h3k27me3data$Path, bin=TRUE, width=10000, param=param)
filter.stat <- filterWindowsGlobal(win.data, bins)

Thank you!!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions