RNA and de novo transcriptomes

About me

We live in incredible times, when the genes expressed at a single moment can be captured, read and studied. I am a PhD student at the Faculty of Agriculture and Environment, University of Sydney. I am interested in the molecular responses of plants to pathogens.

I had no prior experience running software in a Unix environment, but decided that the tools are all out there and perhaps I could have a go. Over the last few months I have been working through an RNA-seq pipeline, and I thought my failures and successes could help someone else who is starting out. I will blog each step and provide the scripts that I used to run the software.

There are many different ways to go about analysing RNA data. I present one approach but welcome friendly criticism, constructive advice or comments. I am really learning on the run and there is a lot of free software available for different tasks – thanks to the many software developers.

The experiment

I am working with Australian Myrtaceae plants and investigating host responses to a newly arrived fungal pathogen, myrtle rust. Below is a scanning electron microscope image of a myrtle rust spore germinating on a eucalypt leaf, made with the help of the Australian Microscopy & Microanalysis Research Facility (http://sydney.edu.au/acmm).

[Image: scanning electron micrograph of a myrtle rust spore germinating on a eucalypt leaf]

By way of background, the pathogen has a very extensive host range within the Myrtaceae, and most Australian natural vegetation communities are composed of Myrtaceae plants. These include eucalypts, paperbarks, bottlebrushes, tea trees, lemon-scented myrtle, waxflower and more. The potential for devastating outbreaks under the right environmental conditions therefore remains a looming threat.

For my experiment I took leaf samples from 4 resistant and 4 susceptible plants at three time points: pre-inoculation, 24 hours post-inoculation and 48 hours post-inoculation. I extracted and cleaned the RNA using commercial kits, then made libraries of around 300 base pairs for paired-end Illumina sequencing on a HiSeq 2500. Each sample was run on two lanes. The raw data from these runs are where I will begin the bioinformatic explanation in my next post.
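To make the size of the design concrete, here is a minimal Python sketch that lists every lane-level read set the experiment would produce. It rests on two assumptions that are not stated above: that the same eight plants were sampled at all three time points, and that each sample has one paired-end read set per lane. The sample labels (such as R1_24hpi) are hypothetical and are not the names used in the actual sequencing run.

```python
from itertools import product

# Design as described in the post; labels below are illustrative only.
PHENOTYPES = ("resistant", "susceptible")
PLANTS_PER_PHENOTYPE = 4
TIME_POINTS = ("pre-inoculation", "24hpi", "48hpi")
LANES = (1, 2)


def build_sample_sheet():
    """Return one row per sample per lane (assumes every plant is
    sampled at every time point and sequenced on both lanes)."""
    rows = []
    for phenotype, plant, time_point, lane in product(
        PHENOTYPES, range(1, PLANTS_PER_PHENOTYPE + 1), TIME_POINTS, LANES
    ):
        rows.append(
            {
                # Hypothetical label: R/S + plant number + time point
                "sample": f"{phenotype[0].upper()}{plant}_{time_point}",
                "phenotype": phenotype,
                "plant": plant,
                "time_point": time_point,
                "lane": lane,
            }
        )
    return rows


sheet = build_sample_sheet()
samples = {row["sample"] for row in sheet}
print(len(samples), "samples;", len(sheet), "lane-level paired-end read sets")
```

Under those assumptions the design comes to 24 samples and 48 lane-level read sets, each with a forward and a reverse read file. A table like this is handy later for checking that no file is missing before the analysis starts.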