questions about speed and ONT reads #704
Hi, here are my questions about running FALCON. I'm wondering whether anyone has tested any of these.
- I'm assembling ONT reads using FALCON and FALCON-Unzip. Should I first correct the raw ONT reads with Canu or another correction program?
- Do you have any suggestions for speeding up FALCON? We have three plant genomes of 350 Mb, 3 Gb, and 17 Gb. So far, for 118x ONT reads of the 350 Mb genome, it took two weeks to finish the 0-rawreads/las-merge-runs stage, which is far too slow.
- What is an acceptable minimum coverage for a diploid genome to assemble primary contigs and haplotigs adequately? I wonder whether 50x coverage would break the assembly into small contigs, or perhaps lose some haplotigs while maintaining the assembled N50. Does anyone have experience with lower-coverage ONT reads in FALCON?
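To make the coverage question concrete, here is a minimal sketch of the arithmetic behind testing a 50x subset of the 118x data. The function names are hypothetical and not part of FALCON; the sketch only computes how many bases a given coverage implies and what fraction of reads a uniform random subsample would keep. It does not predict assembly quality.

```python
# Hypothetical helpers for coverage arithmetic; not part of FALCON.

def bases_for_coverage(genome_size_bp: int, coverage: float) -> int:
    """Total read bases needed to reach `coverage` on a genome of this size."""
    return int(genome_size_bp * coverage)


def subsample_fraction(current_cov: float, target_cov: float) -> float:
    """Fraction of reads to keep (uniformly at random) to go from
    current_cov to target_cov. Returns 1.0 if no downsampling is needed."""
    if target_cov >= current_cov:
        return 1.0
    return target_cov / current_cov


# Genome sizes taken from the post.
GENOMES = {"350 Mb": 350_000_000, "3 Gb": 3_000_000_000, "17 Gb": 17_000_000_000}

for name, size in GENOMES.items():
    gb = bases_for_coverage(size, 50) / 1e9
    print(f"{name}: 50x needs about {gb:.1f} Gb of reads")

frac = subsample_fraction(118, 50)
print(f"Keep about {frac:.1%} of reads to go from 118x to 50x")
```

A uniform random subsample keeps the read-length distribution roughly unchanged, so it is a reasonable way to test lower coverage on the 350 Mb genome before committing to the larger ones.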
Here is what I'm planning to do to speed up the assembly:
a. Increase the DBsplit_option -s value from 100 to 200 to reduce the number of tan-run jobs (currently 5461 jobs with -s 100).
b. I want to experiment with the njobs and NPROC options, but I'm unsure how they interact. My local server has 48 CPUs and 560 GB of memory.
Thank you in advance for any suggestions.
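One way to reason about item b is as a resource budget. The sketch below assumes that njobs caps how many jobs run at the same time and that NPROC is the number of CPUs each job of a step is expected to use; both are my reading of the FALCON config, not something confirmed in this thread. It also assumes that the -M value passed to HPC.daligner (16 and 32 in the config) is a per-job memory limit in GB. The helper `max_concurrent_jobs` is hypothetical.

```python
# Hypothetical budgeting helper; not part of FALCON.
from typing import Optional


def max_concurrent_jobs(total_cpus: int, nproc: int,
                        total_mem_gb: Optional[float] = None,
                        mem_per_job_gb: Optional[float] = None) -> int:
    """Largest job count that oversubscribes neither CPUs nor memory."""
    by_cpu = total_cpus // nproc
    if total_mem_gb is None or mem_per_job_gb is None:
        return max(1, by_cpu)
    by_mem = int(total_mem_gb // mem_per_job_gb)
    return max(1, min(by_cpu, by_mem))


TOTAL_CPUS = 48     # from the post
TOTAL_MEM_GB = 560  # from the post

# NPROC values copied from the [job.step.*] sections of the config.
NPROC = {"da": 8, "la": 8, "cns": 12, "pda": 8, "pla": 8, "asm": 24}
# Assumed per-job memory caps, taken from the -M16 / -M32 daligner options.
MEM_GB = {"da": 16, "pda": 32}

for step, nproc in NPROC.items():
    n = max_concurrent_jobs(TOTAL_CPUS, nproc, TOTAL_MEM_GB, MEM_GB.get(step))
    print(f"{step}: NPROC={nproc} -> up to {n} concurrent jobs")
```

Under these assumptions the machine is CPU-bound rather than memory-bound, so setting njobs above 48 // NPROC for a step would oversubscribe the CPUs.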
Here is my current run_falcon.cfg file for the 118x corrected ONT reads of the 350 Mb genome:
[General]
input_fofn = input_run1.fofn
input_type = raw
pa_DBsplit_option = -a -x500 -s200
ovlp_DBsplit_option = -a -x500 -s200
ovlp_HPCTANmask_option =
pa_REPmask_code = 0,300;0,300;0,300
genome_size = 350000000
seed_coverage = 80
length_cutoff = -1
length_cutoff_pr = 1500
pa_HPCdaligner_option = -v -B4 -M16
pa_daligner_option = -e.70 -l1000 -s100
falcon_sense_option = --output_multi --min_idt 0.70 --min_cov 4 --max_n_read 200 --n_core 12
ovlp_HPCdaligner_option = -v -B4 -M32
ovlp_daligner_option = -h60 -e.96 -l500 -s1000
overlap_filtering_setting = --max_diff 100 --max_cov 100 --min_cov 20 --bestn 10
[job.defaults]
use_tmpdir = ./tmp
stop_all_jobs_on_failure = true
pwatcher_type = blocking
job_type = local
JOB_QUEUE=default
submit = /bin/bash -c "${JOB_SCRIPT}" > "${JOB_STDOUT}" 2> "${JOB_STDERR}"
[job.step.da]
NPROC=8
[job.step.la]
NPROC=8
[job.step.cns]
NPROC=12
[job.step.pda]
NPROC=8
[job.step.pla]
NPROC=8
[job.step.asm]
NPROC=24
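The config above sets length_cutoff = -1 together with genome_size and seed_coverage. As I understand it, that asks FALCON to pick the seed-read length cutoff automatically so that the seed reads add up to roughly seed_coverage x genome_size bases. The sketch below illustrates that idea only; it is not FALCON's actual code. The function name and the error raised when there are too few bases are my own choices.

```python
# Illustrative sketch of an automatic seed-length cutoff; not FALCON's code.
from typing import Iterable


def auto_length_cutoff(read_lengths: Iterable[int],
                       genome_size: int, seed_coverage: float) -> int:
    """Return the shortest read length such that all reads at least that
    long total >= genome_size * seed_coverage bases."""
    target = genome_size * seed_coverage
    total = 0
    # Walk from the longest read down, accumulating bases.
    for length in sorted(read_lengths, reverse=True):
        total += length
        if total >= target:
            return length
    raise ValueError("reads do not contain enough bases for this seed coverage")


# Toy example: target is 10 * 3 = 30 bases.
toy_reads = [10, 9, 8, 7, 6, 5]
print(auto_length_cutoff(toy_reads, genome_size=10, seed_coverage=3))
```

With 118x of data and seed_coverage = 80, about 80/118 (roughly 68%) of the bases would be treated as seed reads under this scheme. A lower seed_coverage would reduce the pre-assembly workload, at the cost of using fewer reads as seeds.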