
Polishing Canu genome

Having assembled the genome with Canu, it is time to polish it. This can be done with Pilon, using Illumina reads aligned to the genome. It can also be a time- and compute-heavy job, running for days. For example:

#PBS -l select=1:ncpus=8:mem=160gb
#PBS -l walltime=64:00:00
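As a rough sketch of what a job script with those resources might contain, assuming bwa, samtools and a Pilon jar are available on the cluster. The file names (canu.contigs.fasta, illumina_R1.fastq.gz, illumina_R2.fastq.gz, pilon.jar), the output prefix pilon_round1 and the -Xmx150G heap size are placeholders of mine, not details from the actual run. The script only prints the commands unless RUN=1 is set, so the plan can be checked before committing to a multi-day job.

```shell
#!/bin/bash
#PBS -l select=1:ncpus=8:mem=160gb
#PBS -l walltime=64:00:00

# Hypothetical inputs: adjust to your own files and Pilon jar location.
GENOME=${GENOME:-canu.contigs.fasta}
R1=${R1:-illumina_R1.fastq.gz}
R2=${R2:-illumina_R2.fastq.gz}
PILON_JAR=${PILON_JAR:-pilon.jar}
THREADS=${THREADS:-8}

# Print each step; execute it only when RUN=1 (dry run by default).
step() {
    echo "$1"
    if [ "${RUN:-0}" = "1" ]; then
        bash -c "$1" || exit 1
    fi
}

polish_with_pilon() {
    # Index the assembly, align the Illumina pairs, sort and index the BAM.
    step "bwa index $GENOME"
    step "bwa mem -t $THREADS $GENOME $R1 $R2 | samtools sort -@ $THREADS -o illumina.sorted.bam -"
    step "samtools index illumina.sorted.bam"
    # Polish with Pilon; heap size is kept a little below the 160gb request.
    step "java -Xmx150G -jar $PILON_JAR --genome $GENOME --frags illumina.sorted.bam --output pilon_round1 --changes --threads $THREADS"
}

polish_with_pilon
```

Submitting it with RUN=1 in the environment (for example qsub -v RUN=1) would actually run the four steps in order.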

Now I am doing the polishing again. This time I aligned the raw subreads.bam files to the genome using pbalign from the SMRT analysis tools. After much confusion, I found a good explanation here:

https://www.biostars.org/p/273447/

The command to make an XML dataset of all the subreads.bam files for pbalign is below. Note that the subreads.bam files come from the Sequel data (one folder) and the RSII data (another folder):

~/bin/smrtlink/smrtcmds/bin/dataset create --type SubreadSet --name fungi fungi_set.xml ../RSII/*.subreads.bam ../sequel/*.subreads.bam

The main confusing bit is the name ‘bam’. Here we start with raw, unaligned BAMs (the subreads.bam files). Then, after aligning them with pbalign, we have aligned BAMs. These aligned reads are the ones to use for polishing with arrow from the SMRT tools.
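The unaligned-to-aligned-to-polished sequence can be sketched roughly as follows. This is only my outline, assuming pbalign and variantCaller sit in the same smrtcmds/bin folder as dataset and that samtools is on the PATH. The names canu.contigs.fasta, fungi_aligned.bam, fungi_arrow.fasta and fungi_arrow.gff are placeholders, and the flags should be checked against the help output of your SMRT Link version. As written it only prints the commands unless RUN=1 is set.

```shell
#!/bin/bash
# Hypothetical names; fungi_set.xml is the SubreadSet made with 'dataset create'.
SMRT_BIN=${SMRT_BIN:-$HOME/bin/smrtlink/smrtcmds/bin}
GENOME=${GENOME:-canu.contigs.fasta}
THREADS=${THREADS:-8}

# Print each step; execute it only when RUN=1 (dry run by default).
step() {
    echo "$1"
    if [ "${RUN:-0}" = "1" ]; then
        bash -c "$1" || exit 1
    fi
}

polish_with_arrow() {
    # The reference needs a .fai index next to it.
    step "samtools faidx $GENOME"
    # Unaligned subreads (via the XML dataset) -> aligned BAM.
    step "$SMRT_BIN/pbalign --nproc $THREADS fungi_set.xml $GENOME fungi_aligned.bam"
    # Aligned BAM -> polished consensus plus a GFF of the changes.
    step "$SMRT_BIN/variantCaller --algorithm=arrow -j $THREADS -r $GENOME -o fungi_arrow.fasta -o fungi_arrow.gff fungi_aligned.bam"
}

polish_with_arrow
```

The point to notice is that pbalign takes the unaligned dataset as input, while arrow takes the aligned BAM that pbalign writes.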

Software downloads