Using Canu v1.6, this stage has taken 2 months so far, with 351 jobs completed and 28 still to run on the University of Sydney HPC, Artemis. Each job requires around 10 GB of memory, 4 CPUs, and a progressively larger amount of walltime: the earliest jobs needed only about 5 hours, while the last 100 jobs each need up to about 150 hours.
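For anyone wanting to reproduce a similar setup, the per-job resources above translate into a PBS Pro request along these lines. This is only a sketch: the project code, job name, and module name are placeholders, not taken from our actual scripts, and in practice Canu's grid mode submits these jobs itself.

```shell
#!/bin/bash
# Hypothetical PBS Pro script matching the per-job resources described above.
# PROJECT_CODE and the module name are placeholders (assumptions).
#PBS -P PROJECT_CODE
#PBS -N canu_overlap
#PBS -l select=1:ncpus=4:mem=10GB
#PBS -l walltime=150:00:00

cd "$PBS_O_WORKDIR"
module load canu/1.6   # module name on your cluster may differ

# Canu in grid mode generates and submits array jobs like this itself;
# the block above just illustrates the resource request per job.
```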
I have detailed this previously, but briefly: the genome is predicted to be 1.3-1.4 Gbp, and we started with 50-60X coverage of PacBio Sequel and RSII data.
The trimmedReads.fasta.gz output was 12 GB.
Canu script (NB: the genome size was set to 1G for the script):
canu -p MR_1805a \
-d /scratch/directory/run-MR_1805a \
batOptions="-dg 3 -db 3 -dr 1 -ca 500 -cp 50" \
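The command above is truncated; a complete Canu 1.6 invocation also needs the genome size and the input reads. The sketch below shows where those would go, using the 1G genome size mentioned above and Canu's standard `genomeSize=` and `-pacbio-raw` options. The read file names are placeholders, not our actual inputs.

```shell
# Sketch of a complete invocation; read file names are placeholders.
canu -p MR_1805a \
 -d /scratch/directory/run-MR_1805a \
 genomeSize=1g \
 batOptions="-dg 3 -db 3 -dr 1 -ca 500 -cp 50" \
 -pacbio-raw sequel_reads.fastq.gz rsii_reads.fastq.gz
```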
A previous assembly, begun 6 months ago from the same raw data, is still incomplete. The same script was run except for one additional parameter:
The trimmedReads.fasta.gz output from that assembly run was 17 GB.