Genomic advancements for canola
Elodie Gazave1, Erica E. Tassone2, Megan Wingerson3, James B. Davis3, John M. Dyer2, Matthew A. Jenks5, Jack Brown3 & Michael A. Gore1
1 Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, USA
2 Plant Physiology and Genetics Research Unit, U.S. Arid Land Agricultural Research Center, USDA-ARS, Maricopa, AZ, USA
3 Department of Plant, Soil and Entomological Sciences, University of Idaho, Moscow, ID, USA
4 Keygene N.V., Wageningen, Netherlands
5 Division of Plant and Soil Sciences, West Virginia University, Morgantown, WV, USA
Genetically improve Brassica napus feedstocks to enhance:
• Oil yield and quality stability in western U.S. production conditions, and
• Compatibility with HRJ fuel conversion processes (reduce oil processing costs and optimize conversion efficiency).
Sequenced 882 Brassica napus accessions.
Identify genetic variants associated with our traits of interest (e.g. yield, oil composition, etc.)
Outline Building the puzzle. Develop a powerful data set. Assess phenotypic variation. Complete genotypic association studies.
Germplasm Base/Training Populations Source: USDA/Ames; Dutch Collection; UI Collection Spring V Winter
Winter Types 652 Lines G’house Seed & DNA
Spring Types 230 Lines G’house Seed & DNA
882 accessions sent to genotyping by sequencing (GBS): Generated 5 Terabytes of sequencing data
Our goal: Build 882 puzzles & find tiny variations that differentiate each puzzle (genetic variants, single nucleotide polymorphisms [SNPs])
Building the puzzle
[Figure: evolutionary tree of Arabidopsis thaliana, Brassica rapa, Brassica oleracea and Brassica napus. Labels: whole genome duplication; whole genome triplication; 20 MYA; 4.6 MYA; polyploidy 7,500 YA (Brassica napus)]
Building the puzzle Challenge 1:
• Ancestral whole genome duplication & triplication.
• Each individual puzzle contains 6 partially similar copies of the same piece.
5 genes, with 6 copies resembling each other
Building the puzzle Challenge 2:
As a result of polyploidization, each individual puzzle is made of 2 almost identical copies of the same image.
Goal: Build 882 such puzzles and find differences among them (SNPs)
[Figure: Accession 1, Accession 2, Accession 3, etc.]
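As a toy illustration of "finding differences among puzzles", the sketch below compares sequences that are assumed to be already aligned and reports the positions where accessions disagree. The sequences and the helper name `find_variant_sites` are made up for illustration; real GBS data would first go through read alignment and variant calling, which this does not attempt.

```python
def find_variant_sites(aligned):
    """Return {position: {accession: base}} for columns where bases differ."""
    names = list(aligned)
    length = len(aligned[names[0]])
    if any(len(seq) != length for seq in aligned.values()):
        raise ValueError("sequences must be aligned to the same length")
    sites = {}
    for pos in range(length):
        column = {name: aligned[name][pos] for name in names}
        # a site is variable when more than one distinct base is observed
        if len(set(column.values())) > 1:
            sites[pos] = column
    return sites


# Hypothetical aligned fragments, one per accession
toy = {
    "Accession 1": "ACGTACGTAC",
    "Accession 2": "ACGTTCGTAC",
    "Accession 3": "ACGTACGTGC",
}
variants = find_variant_sites(toy)
for pos, column in sorted(variants.items()):
    print(pos, column)
```

Positions are 0-based; each reported column shows which accession carries which base.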
Population Structure: Principal coordinate (PCO) analysis
• “Asian” winter cluster (Japan, China, S. Korea, etc.)
• Spring
• “Western Europe” winter cluster (Poland, Germany, Sweden, Latvia, UK, Netherlands, France, etc.)
PCOs based on 260 SNPs from SNP array data set
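A minimal sketch of classical principal coordinate analysis, for readers who want to see the mechanics: a distance matrix between accessions is double-centred and eigendecomposed, and the leading eigenvectors give the plotting coordinates. The genotype matrix, the Euclidean distance and the function name `pcoa` are assumptions for illustration; the slides do not say which distance or software was used.

```python
import numpy as np


def pcoa(dist, n_axes=2):
    """Classical principal coordinate analysis of a square distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (d ** 2) @ centering  # double-centred matrix
    eigvals, eigvecs = np.linalg.eigh(gower)         # eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:n_axes]       # keep the largest ones
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    coords = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return coords, eigvals


# Toy genotype matrix (made up): rows are accessions, columns are SNPs coded 0/1/2
genotypes = np.array([
    [0, 0, 2, 2, 0, 1],
    [0, 0, 2, 2, 0, 0],
    [2, 2, 0, 0, 2, 1],
    [2, 2, 0, 1, 2, 2],
])
diff = genotypes[:, None, :] - genotypes[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=2))  # Euclidean distance between accessions
coords, eigvals = pcoa(dist)
print(coords)
```

In this toy data the first axis separates accessions 1 and 2 from accessions 3 and 4, which is the kind of clustering the PCO plot displays.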
Genetic Diversity
C-subgenome (inherited from B. oleracea)
A-subgenome (inherited from B. rapa)
[Figure: genetic diversity in each subgenome by group; groups: Spring, Winter Europe, Winter Asia]
Germplasm Base/Training Populations Source: USDA/Ames; Dutch Collection; UI Collection Spring V Winter
Winter Types 652 Lines 2012-3 Field Data
652 Lines G’house Seed & DNA
652 Lines 2013-4 Field (ID) Data & Seed
652 Lines 2014-5 Field Data Iowa
Spring Types
652 Lines Field Data Minnesota
652 Lines G’house Seed Mx 652 Lines Field (x2) Data Idaho
230 Lines 2012 Field Data & Seed
230 Lines G’house Seed & DNA
230 Lines 2013 Field Data
232 Lines Field Data
230 Lines Field (x2) Data
230 Lines Iowa 2014 Field Data
232 Lines Field Data
230 Lines Field (x2) Data
230 Lines Iowa 2015 Field Data Ames
230 Lines Field Data Akron
230 Lines Field (x2) Data Idaho
Seed Yield – Spring Germplasm
Year-Site       Average   Maximum   Minimum
Moscow 2012         980     2,507         0
Genesee 2012      1,172     2,935         0
Moscow 2013         961     1,735         0
Genesee 2013        861     2,512         0
Moscow 2014         916     1,839        91
Genesee 2014      1,209     2,213       400
Moscow 2015       1,812     2,639       839
Genesee 2015      1,630     2,786       561
Heritabilities
            Mosc 2012  Gene 2012  Mosc 2013  Gene 2013  Mosc 2014  Gene 2014  Mosc 2015
Gene 2012      0.26
Mosc 2013      0.28       0.29
Gene 2013      0.21       0.21       0.11
Mosc 2014      0.11       0.16       0.03       0.11
Gene 2014      0.09       0.14       0.03       0.18       0.35
Mosc 2015      0.32       0.23       0.24       0.58       0.27       0.14
Gene 2015      0.13       0.25       0.23       0.43       0.22       0.36       0.36
Average h² = 0.226
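The summary figure can be checked by averaging the pairwise values above. The sketch below does this under one assumption that the slide does not state outright: that the columns run chronologically with Moscow before Genesee (Mosc 2012, Gene 2012, Mosc 2013, and so on). Under that reading, the mean of all 28 pairs comes to about 0.226, and the mean over pairs from the same site in different years comes to about 0.235.

```python
from statistics import mean

# Assumed environment order (chronological, Moscow before Genesee)
env = ["Mosc 2012", "Gene 2012", "Mosc 2013", "Gene 2013",
       "Mosc 2014", "Gene 2014", "Mosc 2015", "Gene 2015"]

# Lower triangle as listed on the slide; row k pairs env[k + 1] with env[0..k]
lower = [
    [0.26],
    [0.28, 0.29],
    [0.21, 0.21, 0.11],
    [0.11, 0.16, 0.03, 0.11],
    [0.09, 0.14, 0.03, 0.18, 0.35],
    [0.32, 0.23, 0.24, 0.58, 0.27, 0.14],
    [0.13, 0.25, 0.23, 0.43, 0.22, 0.36, 0.36],
]

pairs = {}
for i, row in enumerate(lower, start=1):
    for j, value in enumerate(row):
        pairs[(env[i], env[j])] = value

overall = mean(pairs.values())
# same site name (first token), necessarily different years
within_site = mean(v for (a, b), v in pairs.items()
                   if a.split()[0] == b.split()[0])

print(round(overall, 3), round(within_site, 3))
```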
Heritabilities
Sites within years h² = 0.236
Heritabilities
Within sites h² = 0.235
Heritabilities
Between sites h² = 0.123
Seed Yield – Spring Germplasm
Genotype         Average  Rank  Mosc-12  Gene-12  Mosc-13  Gene-13  Mosc-14  Gene-14  Mosc-15  Gene-15
03.IL.5.6.1        1,949     1       20        6       37       15        7        6        6        6
04.SC.28.4.3       1,898     2       49        2      102       45        2        1       42        3
3789.RR            1,881     3       21        4       33        .       22       21       36       14
Python.CL          1,871     4       11        3       98       17       24       53        5       13
05SC11A1.35.2      1,866     5       22        9       76       10       26       19        1       22
05SI13A5JB.8.16    1,832     6        5       14       13        9       44      163       12       16
05SC1A4.10.1       1,761     7       16       40        3       12       82       20       63       11
03.IH.4.12.2       1,751     8       18       60       97        .        3       40       32       17
07.SC.38.16        1,733     9        1        8       47      103       23       68       29       72
05SC11A1.2.6       1,733    10        9       10       54       38       90       83        8       27
Seed Yield – Spring Germplasm
Genotype         Average  Rank  Mosc-12  Gene-12  Mosc-13  Gene-13  Mosc-14  Gene-14  Mosc-15  Gene-15
Cara               1,949     1       20        6       37       15        7        6        6        6
Empire             1,898     2       49        2      102       45        2        1       42        3
3789.RR            1,881     3       21        4       33        .       22       21       36       14
Python.CL          1,871     4       11        3       98       17       24       53        5       13
05SC11A1.35.2      1,866     5       22        9       76       10       26       19        1       22
05SI13A5JB.8.16    1,832     6        5       14       13        9       44      163       12       16
05SC1A4.10.1       1,761     7       16       40        3       12       82       20       63       11
03.IH.4.12.2       1,751     8       18       60       97        .        3       40       32       17
07.SC.38.16        1,733     9        1        8       47      103       23       68       29       72
05SC11A1.2.6       1,733    10        9       10       54       38       90       83        8       27
WINTER GERMPLASM Yield (kg ha-1)
Identifier     Mean   2012-13   2014-15
Mean          1,476     1,701     1,251
Maximum       7,065     8,974     6,999
Minimum           0         0         0
WINTER GERMPLASM Yield (kg ha-1)
                 Mean           2012-13          2014-15
Identifier     Yield   Rank    Yield   Rank    Yield   Rank
PI-384.536     7,065      1    9,532      1    4,598     27
PI-612.846     6,539      2    8,047      4    5,031     15
Visby          6,448      3    8,974      2    3,923     61
PI-458.970     6,007      4    8,279      3    3,735     66
PI-535.862     5,595      5    7,718      5    3,472     75
Ames-061.00    5,545      6    4,305     65    6,784      2
PI-535.852     5,440      7    6,503      9    4,376     36
Ames-156.52    5,437      8    6,811      7    4,063     50
PI-537.302     5,218      9    4,754     47    5,683      7
PI-531.276     5,211     10    6,410     11    4,011     55
Genotype v Phenotype
Coincidence ≠ Causation
o Best Linear Unbiased Estimates (BLUEs) → observed yield values corrected for environment.
o Incorporated within-spring population structure and kinship as covariates in the GWAS.
o Top 20 low- and high-yielding accessions from 2014; stronger year and site effect. P-value = 0.07
[Figure: Q-Q plot, observed –log10(p) against expected –log10(p)]
[Figure: Manhattan plot, –log10(p)]
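For readers unfamiliar with the Q-Q plot axes, the sketch below shows one common way to build them: observed p-values are sorted and paired with the quantiles expected if all p-values were uniform on (0, 1). The plotting-position formula (rank − 0.5)/n and the p-values themselves are assumptions for illustration; the slides do not state which convention was used.

```python
import math


def qq_points(p_values):
    """Pair expected and observed -log10(p) values under a uniform null."""
    observed = sorted(-math.log10(p) for p in p_values)
    n = len(observed)
    # expected quantiles of Uniform(0, 1) at plotting positions (rank - 0.5) / n
    expected = sorted(-math.log10((rank - 0.5) / n) for rank in range(1, n + 1))
    return list(zip(expected, observed))


toy_p = [0.5, 0.01, 0.2, 0.9, 0.0001]  # made-up p-values
points = qq_points(toy_p)
for exp_val, obs_val in points:
    print(f"expected {exp_val:.2f}  observed {obs_val:.2f}")
```

Points that sit well above the diagonal at the upper end are the SNPs whose association is stronger than chance alone would predict.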
These 2 SNPs are located in a Brassica napus gene annotation: GSBRNA2G00091535001. Blasting the B. napus peptide against Arabidopsis thaliana:
Top 2 hits: CONTAINS InterPro DOMAIN/s: EF-Hand 1, calcium-binding site (InterPro:IPR018247), EF-HAND 2 (InterPro:IPR018249), EF-hand-like domain (InterPro:IPR011992), Calcium-binding EF-hand (InterPro:IPR002048)
Proteins with calcium-binding EF-hand structures have been shown to be key participants in oil body formation and stability in many plants.
Genetic & Phenotypic Combined
Optimum oil profile for biofuel production:
• High saturated & monounsaturated fatty acids (erucic acid, C22:1; oleic acid, C18:1).
• Low polyunsaturated fatty acids (C18:2 & C18:3) & low glucosinolates.
Measured 12 VLCFA compounds: Looked for SNPs associated with these traits (GWAS)
Retrieve known hits
Chromosome C03, C22:1: QTL interval for erucic acid; FAE1 gene
… and discover new ones?
Chromosome C02, C18:0: 3-ketoacyl-acyl carrier protein synthase I (KAS I)
[Figure: plastid fatty acid synthesis pathway; labels: C2:0-ACP, C4:0-ACP, Kas1, C16:0-ACP, Kas2, C18:]
Questions
USDA-NIFA/DOE, Title 9008 Biomass Research and Development Initiative (BRDI) Grant