Current methods in retrotransposon-based marker production and utilization
Doc. Ruslan Kalendar
Visit Program to Egypt: FCRI-Institute of Field Crops Research, ARC, Egypt; 30.03 - 04.04.2014
RAPD (Random Amplified Polymorphic DNA)
Theoretically any primers: 10 to 18 bases, 40-80% GC
Single or several primer(s) per reaction
Simple reaction set-up
Convenient for studying many species
But: the reaction is sensitive to reaction conditions
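The primer limits listed above (10 to 18 bases, 40-80% GC) can be checked programmatically. The following is a minimal sketch; the function name and the two example sequences are illustrative assumptions, not primers taken from the slides.

```python
def is_rapd_candidate(primer: str, min_len: int = 10, max_len: int = 18,
                      min_gc: float = 40.0, max_gc: float = 80.0) -> bool:
    """Return True if the primer meets the length and GC% limits quoted on the slide."""
    seq = primer.upper()
    # Reject empty input and anything that is not plain A/C/G/T.
    if not seq or any(base not in "ACGT" for base in seq):
        return False
    gc_percent = 100.0 * sum(base in "GC" for base in seq) / len(seq)
    return min_len <= len(seq) <= max_len and min_gc <= gc_percent <= max_gc


# Hypothetical example sequences, for illustration only.
print(is_rapd_candidate("GGTGCGGGAA"))  # 10-mer, 70% GC -> True
print(is_rapd_candidate("ATATATATAT"))  # 10-mer, 0% GC  -> False
```

A check like this only screens length and base composition; it says nothing about how sensitive a given primer is to reaction conditions.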
ISSR (Inter-Simple Sequence Repeat)
Specialized RAPD primers: simple repeat with a selective base: 5'-CACACACACACACAG
Simple reaction set-up
Convenient for studying many species
Marker methods: SNPs, SSRs, RAPDs
Association genetics
TKW
Plant species display remarkable variations in genome size
Fritillaria davisii, 12.9 x 10^10 bp, largest diploid
Pinus edulis, 3 x 10^10 bp
http://www.hindawi.com/journals/jb/2010/527357.html
Zonneveld BJM 2010. J. Bot.
Plant species display remarkable variations in genome size
Arabidopsis thaliana, 170 Mbp
Brachypodium distachyon 272 Mbp
Genome size varies even within plant genera
Retrotransposons form the greatest proportion of the genome
Arabidopsis 14%
Drosophila 15%
Human 45%
Maize 50-60%
Lilium 99%
The C-value paradox
How much DNA do all the genes occupy?
30,000 genes x 4800 bp = 1.44 x 10^8 bp for the genic portion of an “average” genome
• 90 % of the Arabidopsis genome (1.3 x 10^8 bp, 25 498 genes)
• 51 % of the rice genome (42 653 genes)
• 4.8 % of the barley genome (50 000 genes?)
• 0.27 % of the Fritillaria genome !
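The arithmetic behind these percentages can be reproduced in a few lines. This is a sketch that assumes the slide's figure of 4800 bp per gene applies to every species; the function names are illustrative. Only the Arabidopsis genome size (1.3 x 10^8 bp) is given above, so it is the only species worked through here.

```python
AVG_GENE_LENGTH_BP = 4800  # average gene length used on the slide


def genic_bp(n_genes: int, gene_length_bp: int = AVG_GENE_LENGTH_BP) -> int:
    """Total base pairs occupied by genes under the fixed-gene-length assumption."""
    return n_genes * gene_length_bp


def genic_fraction(n_genes: int, genome_size_bp: float) -> float:
    """Share of the genome occupied by genes (0..1)."""
    return genic_bp(n_genes) / genome_size_bp


print(f"{genic_bp(30_000):.3g} bp")             # 1.44e+08 bp
print(f"{genic_fraction(25_498, 1.3e8):.0%}")   # 94%
```

With these inputs Arabidopsis comes out at about 94%, in the same range as the roughly 90% quoted above; the gap reflects rounding in the slide's inputs.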
What is the rest of the genome if not genes?
• Microsatellites and other simple repeats
• DNA transposons (Class II transposable elements)
• Retrotransposons (Class I transposable elements)
• Unidentified sequence
Retrotransposons form the greatest proportion of the genome by number and by bulk!
Two groups of RNA derived retrotransposons
Retrotransposition is inherently replicative
[Diagram: DNA transposons (Class II elements): DNA to DNA. Retrotransposons (Class I elements): DNA to polyadenylated RNA (AAAA(n)), then reverse transcription back to DNA.]
Retrotransposons often form nested insertions in large cereal genomes
[Diagram labels: Sukkula, Sabrina, LINE, BAGY-2, Nikita, MITE, BARE-1, En/Spm, gene]
Retrotransposon-based molecular marker methods
Multiplex products of various lengths from different loci are indicated by the bars beneath the diagrams of each reaction.
(a) The SSAP method. The geometry of amplification from a DNA fragment cut with two restriction enzymes (E1, E2), containing a retrotransposon LTR (shaded, labeled) and ligated to an adapter (stippled), is shown. Primers used for amplification match the adapter (PA) and the retrotransposon (PT).
(b) The IRAP method. Amplification takes place between retrotransposons (internal regions hatched) near each other in the genome (open bar), using retrotransposon primers (PT). The elements are shown oriented head-to-head, using a single primer.
(c) The REMAP method. Amplification takes place between a microsatellite domain (vertical bars) and a retrotransposon, using a primer anchored to the proximal side of the microsatellite (PM) and a retrotransposon primer (PT).
(d) RBIP. Full sites, depicted left, are scored by amplification between a primer in the flanking genomic DNA (here, primer PFL matches the left flank) and a retrotransposon primer (PT). The single product is shown as one bar beneath the diagram. The alternative reaction between the primers for the left and right flanks, PFL and PFR respectively, is inhibited in the full site by the length of the retrotransposon. The product that is not amplified is indicated by the grey bar beneath the diagram. The flanking primers are able to amplify the empty site, right, depicted as a bar beneath the diagram.
Kalendar R, Flavell A, Ellis THN, Sjakste T, Moisy C, Schulman AH 2011. Analysis of plant diversity with retrotransposon-based molecular markers. Heredity, 106: 520-530.
Kalendar R 2011. The use of retrotransposon-based molecular markers to analyze genetic diversity. Field and Vegetable Crops Research, 48(2): 261-274.
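The IRAP geometry in panel (b), a single LTR primer amplifying between two nearby elements in head-to-head orientation, can be illustrated with a small in silico sketch. It assumes exact primer matches only (no mismatches, no annealing model); the primer, the template and the `max_len` cut-off are hypothetical values chosen for illustration.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(template: str, motif: str) -> list[int]:
    """Start positions of every (possibly overlapping) exact occurrence of motif."""
    hits, start = [], template.find(motif)
    while start != -1:
        hits.append(start)
        start = template.find(motif, start + 1)
    return hits


def irap_products(template: str, primer: str, max_len: int = 3000) -> list[int]:
    """Predicted product sizes for single-primer PCR with exact matching.

    A product needs one site where the primer reads rightwards (primer sequence
    present in the template) and a downstream site where it reads leftwards
    (reverse complement of the primer present in the template).
    """
    forward_sites = find_all(template, primer)
    reverse_sites = find_all(template, revcomp(primer))
    sizes = []
    for f in forward_sites:
        for r in reverse_sites:
            size = r + len(primer) - f
            if r >= f + len(primer) and size <= max_len:
                sizes.append(size)
    return sorted(sizes)


primer = "GATTACAGG"  # hypothetical primer, not a real LTR primer
template = "TT" + primer + "A" * 20 + revcomp(primer) + "TT"  # two facing sites
print(irap_products(template, primer))  # [38]
```

Two sites in the same orientation give no product in this model, which is why nearby insertions in different orientations produce different band patterns.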
Retrotransposon-Based Marker Assays IRAP (Inter-Retrotransposon Amplified Polymorphism)
Retrotransposon-Based Marker Assays REMAP (REtrotransposon-Microsatellite Amplified Polymorphism)
IRAP - REMAP
Specialized primers from conserved regions in the LTR
Single primer or combination of primers per reaction
Simple reaction set-up
Suitable only for studying closely related species
Highest polymorphism and reliable amplification
But: cloned sequences of new retrotransposons are necessary for new species
Kalendar R, Schulman AH 2006. IRAP and REMAP for retrotransposon-based genotyping and fingerprinting. Nature Protocols, 1(5): 2478-2484.
Kalendar R, Grob T, Regina MT, Suoniemi A, Schulman AH 1999. IRAP and REMAP: Two new retrotransposon-based DNA fingerprinting techniques. Theoretical and Applied Genetics, 98: 704-711.
Retrotransposon-Based Marker Assays IRAP (Inter-Retrotransposon Amplified Polymorphism)
H. spontaneum lines
IRAP with Sukkula LTR primer (9900) with species of tribe Triticeae. 1 - H. perica, 2 - T. aestivum, 3 - T. durum, 4 - Ae. tauschii, 5 - T. dicoccoides, 6 - S. cereale, 7 - S. strictum, 8 - H. erectifolium, 9 - H. pusillum, 10 - H. marinum, 11 - H. murinum, 12 - H. spontaneum, 13 - H. patagonicum, 14 - H. muticum, 15 - H. roshevitzii, 16 - H. euclaston, 17 - H. brachyantherum, 18 - El. repens, 19 - Er. distans, 20 - Er. triticeum, 21 - L. elongatum, 22 - T. caput-medusae, 23 - Ps. spicata, 24 - H. piliferum, 25 - Am. muticum, 26 - C. comosum, 27 - S. speltoides, 28 - D. villosum, 29 - Cr. delileana, 30 - A. retrofractum, 31 - Th. bessarabicum
IRAP with single primer
[Two gel panels, each with lanes M and 1-16]
1. Sabrina LTR (489) 2. Wham LTR (515)
1-5: Brachypodium distachyon lines; 6: Triticum aestivum (ABD); 7: T. durum (AB); 8: Aegilops tauschii (1704) (D); 9: Aegilops tauschii (1691) (D); 10: Triticum dicoccoides (AB); 11: T. dicoccoides (138); 12: T. dicoccoides (156); 13: Aegilops peregrina (S); 14: Phleum pratense; 15: Avena sativa; 16: Secale strictum (H4342).
Single-primer IRAP and two-primer mixing
[Gel panels: Nikita; Nikita+Sukkula; Sukkula]
Retrotransposon-Based Marker Assays
Electrophoresis: 1.7% agarose gel, 70 V, 18 hours
IRAP PCR bands: restriction digestion
[Gel panels: IRAP (original); +TaqI; +TaiI]
Implemented IRAP & REMAP marker systems
Poaceae: Barley, Wheat, Rye, Maize, Oat, Spartina, Phleum, Brachypodium, etc.
Arecaceae: Oil Palm (Elaeis)
Musaceae: Banana
Brassicaceae: Brassica spp.
Rosaceae: Apple, Rose, Plum, Quince, Rubus spp.
Asteraceae: Sunflower
Fabaceae: Pisum sativum L., Lotus corniculatus, Medicago truncatula, etc.
Solanaceae: Tobacco, Tomato, Potato
Linaceae: Flax
Vaccinium: Blueberry
Vitaceae: Grape
Cucurbitaceae: Melon
Clusiaceae: Mangosteen (Garcinia mangostana)
Lithocarpus: Oak
Ranunculaceae: Adonis sp.
Campanulaceae: Adenophora liliifolia
Plantaginaceae: Digitalis grandiflora
Paeoniaceae: Paeonia anomala
Pinaceae: Pine, Spruce
Ferns: Nephrolepis exaltata, Sphaeropteris cooperi, Didymochlaena truncatula
Retrotransposon-based insertional polymorphism (RBIP)
Using three primers, RBIP yields co-dominant marker scores, which are particularly useful for phylogenetic studies because retrotransposon insertions are irreversible. A primer designed in the LTR is used together with a primer designed in the flanking region to amplify the occupied insertion site, whereas primers specific to both the 5′ and 3′ flanking regions are used to score the corresponding empty site. TE insertions are usually several thousand bases long, so the flanking primers do not generate an amplicon from the occupied site.
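The co-dominant scoring logic described above can be written as a small decision function. This is a sketch only: the function name, the boolean band calls as input and the genotype labels are illustrative assumptions, and real scoring would also need controls for failed reactions.

```python
def score_rbip(occupied_band: bool, empty_band: bool) -> str:
    """Call a genotype at one RBIP locus from two band observations.

    occupied_band: product seen with the LTR primer plus a flanking primer.
    empty_band:    product seen with the two flanking primers.
    """
    if occupied_band and empty_band:
        return "heterozygous"
    if occupied_band:
        return "homozygous occupied"
    if empty_band:
        return "homozygous empty"
    return "no call"  # neither product amplified


for bands in [(True, False), (False, True), (True, True), (False, False)]:
    print(bands, "->", score_rbip(*bands))
```

Because both the occupied and the empty allele give a positive signal, a heterozygote is distinguishable from either homozygote, which is what makes the marker co-dominant.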
Tagged microarray marker (TAM)
TAM fingerprinting of two RBIP markers
The basic RBIP method has been developed for high-throughput applications by replacing gel electrophoresis with array hybridization to a filter (Flavell et al., 1998; Jing et al., 2007).
Initially, PCR reactions detecting the occupied sites and unoccupied sites carried out together were spotted onto membranes, and probed with a locus-specific probe. TAM is an extension of this to a microarray format (Jing et al., 2007). TAM based on the PDR1, Cyclops and Tpv LTR retrotransposons of pea has been developed for scoring thousands of DNAs for a co-dominant molecular marker on a glass microarray slide.
A total of 3263 Pisum lines were scored for the RBIP markers (a) Birte-B1 (b) 1794-2 by the TAM approach (Flavell et al., 2003; Jing et al., 2007, 2010). Each spot represents a single sample (sample locations in the array are conserved between slides) and in these two cases a red spot indicates an occupied (retrotransposon insertion present) locus and the green spot an unoccupied locus. Yellow spots indicate an individual heterozygous for the retrotransposon insertion.
What is next for developing molecular markers based on repeated DNA?
Genomic DNA contains a huge number of short sequences that are conserved across all eukaryotes:
From the ancient RNA world: Pol III promoters (5S rRNA, tRNA and other small RNAs);
In silico search for short, conserved sequences.
The inter PBS amplification (iPBS) scheme
LTR retrotransposon structure: LTR and PBS sequence
Two nested LTR retrotransposons in inverted orientations are amplified from a single primer, or from two different primers, matching the primer binding sites (PBS). The PCR product contains both LTRs, with the PBS sequences (the PCR primers) at its termini. The figure shows schematically the general structure of the PBS and LTR sequences: between the 5′ LTR (5′-..CA) and the PBS (5′-TGG..3′) there is a spacer of a few nucleotides (0-5 bases).
Kalendar R, Antonius K, Smykal P, Schulman AH 2010. iPBS: A universal method for DNA fingerprinting and retrotransposon isolation. Theoretical and Applied Genetics, 121(8): 1419-1430.
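The junction geometry described above (an LTR ending in CA, a spacer of 0-5 bases, then a PBS starting with TGG) can be expressed as a simple pattern search. This is a sketch: the function name and the example sequence are illustrative assumptions, and a motif this short will match many unrelated places in real sequence, so it shows the geometry only and is not a retrotransposon finder.

```python
import re

# LTR end (CA), spacer of 0-5 bases (shortest match preferred), PBS start (TGG).
JUNCTION = re.compile(r"CA([ACGT]{0,5}?)TGG")


def find_ltr_pbs_junctions(seq: str) -> list[tuple[int, int]]:
    """Return (start position of the CA, spacer length) for each non-overlapping match."""
    return [(m.start(), len(m.group(1))) for m in JUNCTION.finditer(seq.upper())]


# Hypothetical example: ...CA, one spacer base (A), then TGG...
print(find_ltr_pbs_junctions("GGGTTCAATGGTATCAGAGC"))  # [(5, 1)]
```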
The inter PBS amplification Ginkgo biloba
Aegilops speltoides
Equisetum arvense
Detection of polymorphism in maize callus lines
Detection of polymorphism in animals
Cow: P14 P70 P69 P34 P88 P167 P161 P212 P213 P134 P137 P16; Yak; Lama; Chicken
Polymorphism between people
PCR with universal primers and detection of transcription polymorphism in barley cDNA (stress and tissues)
[Two gel panels, each with lanes 1-16]
1. CI9819, leaves, 2-day-old seedlings
2. CI9819, roots, 2-day-old seedlings
3. Adele, leaves, 2-day-old seedlings
4. Adele, roots, 2-day-old seedlings
5. Kristap, leaves, 2-day-old seedlings
6. Kristap, roots, 2-day-old seedlings
7. Rolfi, leaves, 2-day-old seedlings
8. Rolfi, roots, 2-day-old seedlings
9. Rolfi, spike, 30 days, greenhouse
10. Rolfi, soil, greenhouse, leaves
11. Rolfi, cold (7 °C), soil, leaves
12. Rolfi, cold (7 °C), soil, leaves
13. Rolfi, cold (7 °C), vermiculite, leaves
14. Rolfi, cold (7 °C), vermiculite, leaves
15. Rolfi, sand, dark, 1-week-old plant
16. Callus, K19 (barley Kymppi), old12 A1
Genomics toolbox for connecting traits to genes
[Diagram labels: crop improvement, crop traits, QTL mapping, fine mapping, saturation mapping, BAC contigs, bioinformatics matrix, phenotype, GMOs, sequence, genotype, TILLING, metabolomics, proteomics, microarrays]
Implementation of the genomics toolbox
[Diagram: same genomics toolbox labels as on the previous slide]
Genomics decodes quantitative trait variation
SNPs for barley on Illumina
9000 High throughput Illumina SNP markers with known map locations
Hybridized Agilent chip: 44 000 barley genes
Landscape of the barley gene space
Track a gives the seven barley chromosomes. Green/grey colour depicts the agreement of anchored fingerprint (FPC) contigs with their chromosome arm assignment based on chromosome-arm-specific shotgun sequence reads. For 1H only whole-chromosome sequence assignment was available. Track b, distribution of high-confidence genes along the genetic map; track c, connectors relate gene positions between the genetic map and the integrated physical map given in track d. Position and distribution of class I LTR retroelements (track e) and class II DNA transposons (track f) are given. Track g, distribution and positioning of sequenced BACs.
“A physical, genetic and functional sequence assembly of the barley genome”. The International Barley Genome Sequencing Consortium. Nature (2012) doi:10.1038/nature11543
Cereal genome synteny
15x change, 4.9 x 10^9 bp (140 bp/year)
Single nucleotide variation (SNV) frequency in barley Barley chromosomes indicated as inner circle of grey bars. Connector lines give the genetic/physical relationship in the barley genome. SNV frequency distribution displayed as five coloured circular histograms (scale, relative abundance of SNVs within accession; abundance, total number of SNVs in non-overlapping 50-kb intervals of concatenated ‘Morex’ genomic scaffold; range, zero to maximum number of SNVs per 50-kb interval). Selected patterns of SNV frequency indicated by coloured arrowheads. Colouring of arrowheads refers to cultivar with deviating SNV frequency for the respective region.
“A physical, genetic and functional sequence assembly of the barley genome”. The International Barley Genome Sequencing Consortium. Nature (2012) doi:10.1038/nature11543
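The caption above describes SNV abundance as the total number of SNVs in non-overlapping 50-kb intervals. A minimal sketch of that binning step follows; it assumes 0-based SNV positions along one concatenated scaffold, and the function name and example positions are illustrative.

```python
from collections import Counter


def snv_counts_per_window(positions: list[int], window_bp: int = 50_000) -> dict[int, int]:
    """Count SNVs in non-overlapping windows; keys are window indices (0 = first 50 kb)."""
    counts = Counter(pos // window_bp for pos in positions)
    return dict(sorted(counts.items()))


# Hypothetical SNV positions (0-based), for illustration only.
print(snv_counts_per_window([10, 49_999, 50_000, 120_000]))  # {0: 2, 1: 1, 2: 1}
```

Windows without any SNV are simply absent from the result; a plotting step would need to fill them in as zero.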