Polishing Canu genome

Having assembled the genome using Canu it is time to polish. This can be done with Pilon, using Illumina reads aligned to the genome. This can also be a time and compute heavy job (days). For example:

PBS -l select=1:ncpus=8:mem=160gb
PBS -l walltime=64:00:00

Now doing the polishing again. This time aligned the raw subread.bam files with the genome using pbalign from SMRT analysis tools. A good explanation was found here after much confusion.


The command to make an xml dataset (of all the subread.bam files for pbalign is this below. Note the subreads.bam files are from the sequel data (one folder) and RSII data (another folder):

~/bin/smrtlink/smrtcmds/bin/dataset create –type SubreadSet –name fungi  fungi_set.xml ../RSII/*.subreads.bam ../sequel/*.subreads.bam

The main confusing bit is the name ‘bam’. Here we have pre-aligned raw bams (subread.bams). Then, after aligning using pbalign, we have aligned bams. These aligned reads are the ones to use in the polishing with arrow from SMRT-tools.




Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s