Genomics for differentiation of Venturia sp.
Dan Jones, PhD Student: Cooperative Research Centre for National Plant Biosecurity / La Trobe University
biosecurity built on science
Cooperative Research Centre for National Plant Biosecurity
What's this talk about?
• The Biosecurity context
• The problem organisms: Apple, Pear and Nashi scab
• The Venturia spp. genome project, and how it relates to the biosecurity context
• Status of the project
• Outcomes of the project
• Future work
Biosecurity context, or: complications that happened during my Ph.D.
• WA was declared free of Apple scab (Venturia inaequalis), but lost this status in 2010
  • Partly due to the cost of maintaining area freedom
• Nashi pear is being imported from Asia
  • Susceptible to V. nashicola, which is absent from Australia
• Hybrids exist between European pear and Nashi pear
  • Host is no longer useful for species ID
Biosecurity context – what is the common problem?
• Identifying Apple, Pear and Nashi pear pathogens
  • Is it Venturia sp.?
• Distinguishing them from each other
  • Which one?
  • Which strain?
The problem organism(s): Venturia sp.
The Venturia spp. collaborative genome projects
• Set up to answer scientific questions
  • What infection-associated genes are unique to Venturia pirina? Or V. inaequalis?
    – Host-specificity genes?
  • What infection-associated genes are present across the Venturia spp.?
    – Important for infection?
• ...but also to answer biosecurity questions
  • Some existing at the start of the project
  • Some which arrived during the project
Comparative genomics of V. inaequalis and V. pirina
Complete genome sequence of V. inaequalis
Identify infection-associated genes
Complete genome sequence of V. pirina
Identify infection-associated genes
A fruitful collaboration that identifies:
– host-specificity genes
– genes important for infection
The V. pirina genome: the bioinformatics process
Raw data (e.g. NGS data)
Useful outputs (e.g. an assembled genome, gene identifications)
The V. pirina genome
• Important to set up a bioinformatics pipeline
  • To extract useful information from NGS data, automatically and in a consistent and documented way
  • To enable us to prove that we have followed best practice when producing the genome and identifying genes
  • To allow future revisions of the genome and gene identification to be re-analysed in a consistent manner
The Galaxy workflow system
• Not a flowchart of bioinformatics programs: these are the programs
• All processes are recorded, down to the lines of code used
• Galaxy workflows can be shared online, and referred to in publications
The V. pirina genome
• End products:
  • A genome, annotated with important genome features, in a browser that's accessible to end users
  • A bioinformatics pipeline (Galaxy)
    – Pipelines for genome assembly
    – Pipelines for genome annotation
    – Pipelines for gene mining
  • Wiki-style documentation of all bioinformatics work
Comparison of genomes

                      Venturia inaequalis   Venturia pirina
Number of scaffolds   364                   467
Total size            48 Mb                 37 Mb
Largest scaffold      1.1 Mb                1.5 Mb
N50                   306 kb                215 kb
L50                   62                    52

N50: a measure of assembly quality (higher = better)
L50: 50 % of the genome is on this many scaffolds
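To make the N50 and L50 rows concrete, here is a minimal sketch of how both values can be computed from a list of scaffold lengths. The function name and the toy lengths are illustrative assumptions, not part of the project pipeline: N50 is the length of the scaffold at which the running total (largest scaffolds first) reaches half of the assembly, and L50 is how many scaffolds that takes.

```python
def n50_l50(scaffold_lengths):
    """Return (N50, L50) for a list of scaffold lengths in bp.

    Illustrative helper, not taken from the Venturia pipeline.
    """
    lengths = sorted(scaffold_lengths, reverse=True)  # largest scaffolds first
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            # 'length' is the N50, 'count' is the L50
            return length, count
    raise ValueError("no scaffold lengths supplied")


# Toy data (made up): total 300 bp, so half of the assembly is 150 bp.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))  # (70, 2)
```

On real data the lengths would come from the assembled scaffold FASTA; a larger N50 together with a smaller L50 indicates a more contiguous assembly.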
The V. pirina genome: Searching for infection-associated genes (Effectors)
• Effectors: pathogen genes involved in infection
  • By blocking the plant immune system
    – e.g. chitin scavengers
  • Some function important to infection
    – e.g. appressoria formation
Case in point: the recent Pseudomonas syringae pv. actinidiae kiwifruit infections in NZ were caused by a new variant, which was found (by whole genome sequencing) to differ in only a few effector genes and in one toxin pathway. (Matt Templeton, Plant and Food Research, New Zealand)
The V. pirina genome: Searching for infection-associated genes (Effectors)
• Effectors share characteristics that can be identified in a group of gene predictions
  • Small
  • Secreted (contain signal peptide)
  • Contain disulphide bonds
  • Contain known effector motifs
  • Homology to known effectors
The Galaxy workflow system: it's easy to...
Predict genes and protein sequences, sort by size, sort by cysteine content, and pass to SignalP signal peptide prediction
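Outside Galaxy, the "sort by size, sort by cysteine content" step could look roughly like the sketch below. It is only an illustration: the function names, the length cut-off (500 amino acids) and the cysteine cut-off (4) are assumptions chosen for the example, not the thresholds used in the project, and the shortlisted proteins would still need to go through SignalP afterwards.

```python
def parse_fasta(text):
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the identifier.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def shortlist_small_cysteine_rich(proteins, max_length=500, min_cysteines=4):
    """Keep small, cysteine-rich proteins; thresholds are hypothetical."""
    hits = []
    for name, seq in proteins:
        seq = seq.rstrip("*").upper()  # drop a trailing stop-codon symbol
        cysteines = seq.count("C")
        if len(seq) <= max_length and cysteines >= min_cysteines:
            hits.append((name, len(seq), cysteines))
    # Smallest first; among equal sizes, most cysteines first.
    return sorted(hits, key=lambda hit: (hit[1], -hit[2]))


# Toy protein predictions (made up for illustration).
toy = ">protA example\nMKCCACLC\n>protB\nMKLLAAGG\n>protC\nMCCCCK*\n"
for hit in shortlist_small_cysteine_rich(parse_fasta(toy), max_length=10):
    print(hit)
```

Cysteine count is used here only as a rough proxy for the potential to form disulphide bonds; it does not show that the bonds actually exist.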
The Galaxy workflow system: gene predictions
Take a SignalP prediction on a group of proteins, and sort the proteins into:
1) Proteins with SignalP
2) Proteins with TM domain
3) Proteins with both SignalP and TM domain
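The sorting step above amounts to a simple binning of proteins by two yes/no predictions. The sketch below assumes the predictions have already been parsed into (protein ID, has signal peptide, has TM domain) tuples; that record layout, the function name and the extra "neither" bin are assumptions made for illustration, and the sketch does not read real SignalP output. It also treats the three groups as mutually exclusive, which is one possible reading of the slide.

```python
def bin_by_prediction(records):
    """Sort proteins into bins by signal-peptide and TM-domain predictions.

    'records' is an iterable of (protein_id, has_signal_peptide, has_tm_domain)
    tuples; this input layout is hypothetical.
    """
    bins = {"signal_peptide": [], "tm_domain": [], "both": [], "neither": []}
    for protein_id, has_sp, has_tm in records:
        if has_sp and has_tm:
            key = "both"
        elif has_sp:
            key = "signal_peptide"
        elif has_tm:
            key = "tm_domain"
        else:
            key = "neither"
        bins[key].append(protein_id)
    return bins


# Toy predictions (made up for illustration).
toy_records = [
    ("prot1", True, False),
    ("prot2", False, True),
    ("prot3", True, True),
    ("prot4", False, False),
]
print(bin_by_prediction(toy_records))
```

Proteins in the "signal_peptide" bin are the natural candidates for secreted effectors; those with a TM domain are more likely to stay anchored in a membrane.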
Comparison of genomes

                          Venturia inaequalis   Venturia pirina
Genes                     15,497                14,119
Genes >50 bp, <1500 bp    7,928                 7,813
Secreted                  ~300                  ~300
Case study 1: using the genome to answer biosecurity questions
Nashi pears imported from Asia. Nashi can carry Venturia nashicola: a potential biosecurity threat.
• Using the V. pirina genome, we designed PCR primers to a number of effector genes, to examine whether effectors vary between the Venturia spp.
  – Good targets for species-specific PCR
Case study 1: using the genome to answer biosecurity questions
10/10 primer sets amplified a band of the correct size and sequence in V. pirina
Only 2/10 primer sets amplified anything in V. nashicola
[Figure: gel image. Alternating lanes: Venturia pirina (Vp) and Venturia nashicola (Vn) genomic DNA; additional lanes labelled IV]
Case study 2: using the genome to ask biological questions
Some recent publications raise the possibility that there has been an expansion in effector gene families in oomycetes, and in some fungi but not others (Raffaele & Kamoun 2012, Nature Reviews Microbiology)
...are effectors in V. pirina / V. inaequalis members of expanded gene families?
Short answer: Yes, but not in all cases.
• ECP6: single copy in both Venturia spp.
• AvrLM6 family: expanded in both Venturia spp.
Ongoing work: infection model system for proteomics and transcriptomics
We have been growing and harvesting V. inaequalis and V. pirina on cellophane: an infection model system. We will be:
• Extracting the secreted proteome
  • LC-MS to obtain mass spectra of individual proteins
  • Identification of proteins by reference to the respective genomes (Ira Cooke, Gert Talbo, Suresh Mathivanan)
• Extracting the transcriptome
  • Jason Shiller
In each case, we will be comparing cultures grown with and without cellophane to identify genes / proteins present during stroma formation
Thanks for your attention!
We would like to acknowledge...
CRC for National Plant Biosecurity
Kim Plummer (Ph.D. supervisor), Jason Shiller ...and the Plummer Lab
Plant and Food Research New Zealand: Matt Templeton, Jo Bowen, Vincent Bus, Carl Mesarich, Ross Crowhurst, Cecelia Dung
DPI-Vic: Oscar Villalta
Kyeongho Won and KangHee Cho
Ongoing work: bioinformatics
• Refine gene predictions using different programs: GeneMark, Augustus
• Fix "roadblocks" in Galaxy pipeline:
  • Programs which currently are run outside Galaxy (Solution: write wrappers for Galaxy to integrate new tools into the system)
  • Programs too memory intensive for Galaxy (Solution: put Galaxy on a bigger computer!)
• Automated ID of effector motifs and further non-gene annotation of the genomes
The Galaxy workflow system: it's easy to...
Check the quality of a sequencing run (total sequencing, quality score boxplots, GC content and summary statistics)
The Galaxy workflow system: it's easy to...
Prepare three different sequencing libraries (one paired-end, two mate-pair) for assembly by:
1) Interleaving paired-end reads
2) Reverse-complementing mate-pairs
3) Individually trimming reads by quality using a sliding window
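For readers unfamiliar with these three preparation steps, the sketch below shows one simple way each could be written in plain Python. It is not the Galaxy tools themselves: the function names, the window size (4) and the minimum mean quality (20) are assumptions for illustration, reads are represented as a sequence string plus a list of Phred scores, and the trimming rule (cut at the first window whose mean quality falls below the threshold) is just one common variant.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq, quals):
    """Reverse-complement a read and reverse its quality scores to match."""
    return seq.translate(_COMPLEMENT)[::-1], quals[::-1]


def sliding_window_trim(seq, quals, window=4, min_mean=20):
    """Cut the read at the first window whose mean quality is below min_mean."""
    for start in range(max(len(quals) - window + 1, 1)):
        scores = quals[start:start + window]
        if scores and sum(scores) / len(scores) < min_mean:
            return seq[:start], quals[:start]
    return seq, quals


def interleave(forward_reads, reverse_reads):
    """Alternate forward and reverse reads so that mates sit next to each other."""
    for forward, reverse in zip(forward_reads, reverse_reads):
        yield forward
        yield reverse


# Toy reads (made up for illustration).
print(reverse_complement("AACG", [30, 31, 32, 33]))
print(sliding_window_trim("ACGTACGT", [30, 30, 30, 30, 10, 10, 10, 10]))
print(list(interleave(["fwd1", "fwd2"], ["rev1", "rev2"])))
```

Trimming each read individually can leave mates with different lengths, so in practice the pairing has to be tracked if a read is trimmed away entirely.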
Ongoing work: infection model system for proteomics and transcriptomics
V. pirina and V. inaequalis spores form stroma after penetrating inside a leaf
Eric Kemen, University of Konstanz
But they will also form pseudostroma on cellophane culture plates!