I have been going back into the transcriptome for each plant and pulling out ‘genes’ that match Hidden Markov Models (HMMs) I made using HMMER (http://hmmer.org/). To find putative genes in my clustered Syzygium initially, I used the specific domains from resistance gene models previously identified in Eucalyptus grandis (http://journal.frontiersin.org/article/10.3389/fpls.2015.01238/full) and from chitinases (https://academic.oup.com/treephys/article-abstract/37/5/565/3067625/Identification-of-the-Eucalyptus-grandis-chitinase). I then used the aligned genes to build species-specific nucleotide HMMs for several defence-related genes.
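For anyone wanting to turn HMMER results into a list of candidate contigs, here is a minimal Python sketch. It assumes the search was run with hmmsearch and its `--tblout` option, where the first column is the target name and the fifth is the full-sequence E-value; nhmmer tables put the E-value in a different column, so the index would need changing. The function name `parse_tblout`, the 1e-5 cutoff, and the example rows are all made up for illustration and are not my real data or settings.

```python
def parse_tblout(text, max_evalue=1e-5):
    """Return unique target names with a full-sequence E-value <= max_evalue.

    Assumes hmmsearch --tblout layout: column 1 = target name,
    column 5 = full-sequence E-value. Comment lines start with '#'.
    """
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue and target not in hits:
            hits.append(target)
    return hits


# Hypothetical table rows (invented contig names, model name and scores).
example = """\
# target name  accession  query name  accession  E-value  score  bias
contig_0001    -          my_domain   -          1.2e-30  101.5  0.1
contig_0042    -          my_domain   -          3.0e-02   12.3  0.0
contig_0107    -          my_domain   -          4.5e-12   40.2  0.3
"""

print(parse_tblout(example))  # ['contig_0001', 'contig_0107']
```

The resulting list of names can then be used to pull the matching sequences out of the transcriptome FASTA file.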
Armed with lists of potential genes in each transcriptome, I now want to find out whether there are actually transcript variations between my resistant and susceptible plants.
This has meant aligning the raw reads for each plant, at each time point, against its own transcriptome. A big job again, but hopefully worth it.
Another thing that has been useful recently is regular expressions for sorting, extracting and so on in Notepad++ (http://www.rexegg.com/regex-quickstart.html). With large datasets, it seems there is always something that needs extracting or amending.
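As an example of the kind of extraction I mean, here is a small sketch that pulls the sequence IDs out of FASTA header lines. It uses Python's `re` module rather than Notepad++, and the contig names and the helper name `fasta_ids` are invented for illustration. The same pattern, `^>(\S+)`, should also work in Notepad++'s regular expression search mode.

```python
import re

# Match a FASTA header line and capture the ID: everything after '>'
# up to the first whitespace character.
HEADER = re.compile(r"^>(\S+)", re.MULTILINE)


def fasta_ids(text):
    """Return the sequence IDs from all header lines in FASTA-formatted text."""
    return HEADER.findall(text)


# Hypothetical FASTA snippet (made-up names and sequences).
sample = ">contig_0001 len=1520\nATGCATGC\n>contig_0042 len=830\nGGCATTAC\n"

print(fasta_ids(sample))  # ['contig_0001', 'contig_0042']
```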
I also have a bunch of primers ready to check against all samples. Back to the lab soon for much qRT-PCR.