The effects of training population design on genomic prediction accuracy in wheat

Stefan Edwards, Jaap Buntjer, Robert Jackson, Alison Bentley, Jacob Lage, Ed Byrne, Chris Burt, Peter Jack, Simon Berry, Edward Flatman, Bruno Poupard, Stephen Smith, Charlotte Hayes, Chris Gaynor, Gregor Gorjanc, Phil Howell, Eric Ober, Ian Mackay, John Hickey

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Genomic selection offers several routes for increasing the genetic gain or efficiency of plant breeding programmes. In various species of livestock, there is empirical evidence of increased rates of genetic gain from the use of genomic selection to target different aspects of the breeder’s equation. Accurate predictions of genomic breeding value are central to this, and the design of training sets is in turn central to achieving sufficient levels of accuracy. In summary, small numbers of close relatives and very large numbers of distant relatives are expected to enable predictions with higher accuracy. To quantify the effect of some of the properties of training sets on the accuracy of genomic selection in crops, we performed an extensive field-based winter wheat trial. This trial involved the construction of 44 F2:4 bi- and tri-parental populations, from which 2992 lines were grown at four field locations and yield was measured. For each line, genotype data were generated for 25 K segregating SNP markers. The overall heritability of yield was estimated to be 0.65, and estimates within individual families ranged between 0.10 and 0.85. Genomic prediction accuracies of yield BLUEs were 0.125–0.127 using two different cross-validation approaches and generally increased with training set size. Using related crosses in training and validation sets generally resulted in higher prediction accuracies than using unrelated crosses. The results of this study emphasise the importance of the training panel design in relation to the genetic material to which the resulting prediction model is to be applied.
Original language: English
Pages (from-to): 1943–1952
Number of pages: 10
Journal: Theoretical and Applied Genetics
Volume: 132
Issue number: 7
Early online date: 19 Mar 2019
DOIs: https://doi.org/10.1007/s00122-019-03327-y
Publication status: Print publication - Jul 2019
Externally published: Yes

Fingerprint

Livestock
Triticum
Breeding
Single Nucleotide Polymorphism
Genotype
genomics
wheat
prediction
marker-assisted selection
Population
Genes
genetic improvement
breeding value
plant breeding
winter wheat
heritability
livestock
Plant Breeding
genotype
crops

Keywords

  • Genomic selection
  • plant breeding

Cite this

Edwards, S., Buntjer, J., Jackson, R., Bentley, A., Lage, J., Byrne, E., ... Hickey, J. (2019). The effects of training population design on genomic prediction accuracy in wheat. Theoretical and Applied Genetics, 132(7), 1943–1952. https://doi.org/10.1007/s00122-019-03327-y
Edwards, Stefan; Buntjer, Jaap; Jackson, Robert; Bentley, Alison; Lage, Jacob; Byrne, Ed; Burt, Chris; Jack, Peter; Berry, Simon; Flatman, Edward; Poupard, Bruno; Smith, Stephen; Hayes, Charlotte; Gaynor, Chris; Gorjanc, Gregor; Howell, Phil; Ober, Eric; Mackay, Ian; Hickey, John. / The effects of training population design on genomic prediction accuracy in wheat. In: Theoretical and Applied Genetics. 2019; Vol. 132, No. 7. pp. 1943–1952.
@article{e40128b5fda94a10a32c0320b45b78ea,
title = "The effects of training population design on genomic prediction accuracy in wheat",
abstract = "Genomic selection offers several routes for increasing the genetic gain or efficiency of plant breeding programmes. In various species of livestock, there is empirical evidence of increased rates of genetic gain from the use of genomic selection to target different aspects of the breeder’s equation. Accurate predictions of genomic breeding value are central to this, and the design of training sets is in turn central to achieving sufficient levels of accuracy. In summary, small numbers of close relatives and very large numbers of distant relatives are expected to enable predictions with higher accuracy. To quantify the effect of some of the properties of training sets on the accuracy of genomic selection in crops, we performed an extensive field-based winter wheat trial. This trial involved the construction of 44 F2:4 bi- and tri-parental populations, from which 2992 lines were grown at four field locations and yield was measured. For each line, genotype data were generated for 25 K segregating SNP markers. The overall heritability of yield was estimated to be 0.65, and estimates within individual families ranged between 0.10 and 0.85. Genomic prediction accuracies of yield BLUEs were 0.125–0.127 using two different cross-validation approaches and generally increased with training set size. Using related crosses in training and validation sets generally resulted in higher prediction accuracies than using unrelated crosses. The results of this study emphasise the importance of the training panel design in relation to the genetic material to which the resulting prediction model is to be applied.",
keywords = "Genomic selection, plant breeding",
author = "Stefan Edwards and Jaap Buntjer and Robert Jackson and Alison Bentley and Jacob Lage and Ed Byrne and Chris Burt and Peter Jack and Simon Berry and Edward Flatman and Bruno Poupard and Stephen Smith and Charlotte Hayes and Chris Gaynor and Gregor Gorjanc and Phil Howell and Eric Ober and Ian Mackay and John Hickey",
year = "2019",
month = "7",
doi = "10.1007/s00122-019-03327-y",
language = "English",
volume = "132",
pages = "1943--1952",
journal = "Theoretical and Applied Genetics",
number = "7",

}

Edwards, S, Buntjer, J, Jackson, R, Bentley, A, Lage, J, Byrne, E, Burt, C, Jack, P, Berry, S, Flatman, E, Poupard, B, Smith, S, Hayes, C, Gaynor, C, Gorjanc, G, Howell, P, Ober, E, Mackay, I & Hickey, J 2019, 'The effects of training population design on genomic prediction accuracy in wheat', Theoretical and Applied Genetics, vol. 132, no. 7, pp. 1943–1952. https://doi.org/10.1007/s00122-019-03327-y

The effects of training population design on genomic prediction accuracy in wheat. / Edwards, Stefan; Buntjer, Jaap; Jackson, Robert; Bentley, Alison; Lage, Jacob; Byrne, Ed; Burt, Chris; Jack, Peter; Berry, Simon; Flatman, Edward; Poupard, Bruno; Smith, Stephen; Hayes, Charlotte; Gaynor, Chris; Gorjanc, Gregor; Howell, Phil; Ober, Eric; Mackay, Ian; Hickey, John.

In: Theoretical and Applied Genetics, Vol. 132, No. 7, 07.2019, p. 1943–1952.

TY  - JOUR
T1  - The effects of training population design on genomic prediction accuracy in wheat
AU  - Edwards, Stefan
AU  - Buntjer, Jaap
AU  - Jackson, Robert
AU  - Bentley, Alison
AU  - Lage, Jacob
AU  - Byrne, Ed
AU  - Burt, Chris
AU  - Jack, Peter
AU  - Berry, Simon
AU  - Flatman, Edward
AU  - Poupard, Bruno
AU  - Smith, Stephen
AU  - Hayes, Charlotte
AU  - Gaynor, Chris
AU  - Gorjanc, Gregor
AU  - Howell, Phil
AU  - Ober, Eric
AU  - Mackay, Ian
AU  - Hickey, John
PY  - 2019/7
Y1  - 2019/7
AB  - Genomic selection offers several routes for increasing the genetic gain or efficiency of plant breeding programmes. In various species of livestock, there is empirical evidence of increased rates of genetic gain from the use of genomic selection to target different aspects of the breeder’s equation. Accurate predictions of genomic breeding value are central to this, and the design of training sets is in turn central to achieving sufficient levels of accuracy. In summary, small numbers of close relatives and very large numbers of distant relatives are expected to enable predictions with higher accuracy. To quantify the effect of some of the properties of training sets on the accuracy of genomic selection in crops, we performed an extensive field-based winter wheat trial. This trial involved the construction of 44 F2:4 bi- and tri-parental populations, from which 2992 lines were grown at four field locations and yield was measured. For each line, genotype data were generated for 25 K segregating SNP markers. The overall heritability of yield was estimated to be 0.65, and estimates within individual families ranged between 0.10 and 0.85. Genomic prediction accuracies of yield BLUEs were 0.125–0.127 using two different cross-validation approaches and generally increased with training set size. Using related crosses in training and validation sets generally resulted in higher prediction accuracies than using unrelated crosses. The results of this study emphasise the importance of the training panel design in relation to the genetic material to which the resulting prediction model is to be applied.
KW  - Genomic selection
KW  - plant breeding
U2  - 10.1007/s00122-019-03327-y
DO  - 10.1007/s00122-019-03327-y
M3  - Article
C2  - 30888431
VL  - 132
SP  - 1943
EP  - 1952
JO  - Theoretical and Applied Genetics
JF  - Theoretical and Applied Genetics
IS  - 7
ER  - 