The feasibility of using low density marker panels for genotype imputation and genomic prediction of crossbred dairy cattle of East Africa

H Aliloo, R Mrode, AM Okeyo, G Ni, ME Goddard, JP Gibson

Research output: Contribution to journal > Article

2 Citations (Scopus)

Abstract

Cost-effective high density (HD) genotypes of livestock species can be obtained by genotyping a proportion of the population using an HD panel and the remainder using a cheaper low density panel, and then imputing the genotypes that are not directly assayed in the low density panel. The efficacy of genotype imputation can be largely affected by the structure and history of the specific target population, and it should be checked before incorporating imputation in routine genotyping practices. Here, we investigated the efficacy of imputation in crossbred dairy cattle populations of East Africa using 4 different commercial single nucleotide polymorphism (SNP) panels, 3 reference populations and 3 imputation algorithms. We found that Minimac and a reference population which included a mixture of crossbred and ancestral purebred animals provided the highest imputation accuracy compared to other imputation scenarios. The accuracies of imputation, measured as the correlation between real and imputed genotypes averaged across SNPs, were around 0.76 and 0.94 for 7K and 40K SNPs, respectively, when imputed up to a 770K panel. We also presented a method to maximize the imputation accuracy of low density panels which relies on the pairwise (co)variances between SNPs and the minor allele frequency of SNPs. The performance of the developed method was tested in a 5-fold cross validation process where various densities of SNPs were selected using the (co)variance method and also by alternative SNP selection methods and then imputed up to the HD panel. The (co)variance method provided the highest imputation accuracies at almost all marker densities, with accuracies being up to 0.19 higher than those from random selection of SNPs. The accuracies of imputation from 7K and 40K panels selected using the (co)variance method were around 0.80 and 0.94, respectively. The presented method also achieved higher accuracy of genomic prediction at lower densities of selected SNPs. The squared correlation between genomic breeding values estimated using imputed genotypes and those from the real 770K HD panel was 0.95 when the accuracy of imputation was 0.64. The presented method for SNP selection is straightforward in its application and can ensure high accuracies in genotype imputation of crossbred dairy populations in East Africa.
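
The accuracy measure named in the abstract (the correlation between real and imputed genotypes, averaged across SNPs) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `imputation_accuracy`, the 0/1/2 allele-count coding, the animals-by-SNPs matrix layout, and the choice to skip SNPs with no variation are all assumptions made for illustration.

```python
import numpy as np


def imputation_accuracy(true_geno, imputed_geno):
    """Mean per-SNP Pearson correlation between true and imputed genotypes.

    Both inputs are (animals x SNPs) arrays, assumed here to be coded as
    allele counts (0, 1, 2). SNPs with zero variance in either matrix have
    an undefined correlation and are skipped (an assumption of this sketch).
    """
    true_geno = np.asarray(true_geno, dtype=float)
    imputed_geno = np.asarray(imputed_geno, dtype=float)
    if true_geno.shape != imputed_geno.shape:
        raise ValueError("genotype matrices must have the same shape")

    # Centre each SNP (column) on its own mean.
    t = true_geno - true_geno.mean(axis=0)
    i = imputed_geno - imputed_geno.mean(axis=0)

    # Per-SNP covariance and product of standard deviations (unnormalised;
    # the normalising constants cancel in the ratio).
    cov = (t * i).sum(axis=0)
    denom = np.sqrt((t ** 2).sum(axis=0) * (i ** 2).sum(axis=0))

    valid = denom > 0
    if not valid.any():
        return float("nan")
    return float(np.mean(cov[valid] / denom[valid]))


if __name__ == "__main__":
    # Toy data: 4 animals x 3 SNPs, with one imputation error in the last SNP.
    real = np.array([[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2]])
    imputed = real.copy()
    imputed[0, 2] = 1
    print(imputation_accuracy(real, imputed))
```

Averaging per-SNP correlations, rather than pooling all genotypes into one correlation, keeps common SNPs from dominating the measure, which matters when low minor allele frequency SNPs are harder to impute.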
Original language: English
Pages (from-to): 9108 - 9127
Number of pages: 20
Journal: Journal of Dairy Science
Volume: 101
Issue number: 10
Early online date: 1 Aug 2018
DOIs: 10.3168/jds.2018-14621
Publication status: First published - 1 Aug 2018

Fingerprint

Eastern Africa
dairy cattle
crossbreds
genomics
prediction
genotype
single nucleotide polymorphism
methodology
purebreds
selection methods
breeding value
genotyping
dairies
livestock

Bibliographical note

20101043

Keywords

  • East African crossbred dairy cattle
  • Genomic selection
  • Genotype imputation
  • Low density marker panel design

Cite this

@article{a714f4b34a6a41fbb181587e4f649ad2,
title = "The feasibility of using low density marker panels for genotype imputation and genomic prediction of crossbred dairy cattle of East Africa",
keywords = "East African crossbred dairy cattle, Genomic selection, Genotype imputation, Low density marker panel design",
author = "H Aliloo and R Mrode and AM Okeyo and G Ni and ME Goddard and JP Gibson",
note = "20101043",
year = "2018",
month = "8",
day = "1",
doi = "10.3168/jds.2018-14621",
language = "English",
volume = "101",
pages = "9108 -- 9127",
journal = "Journal of Dairy Science",
issn = "0022-0302",
publisher = "American Dairy Science Association",
number = "10",

}

The feasibility of using low density marker panels for genotype imputation and genomic prediction of crossbred dairy cattle of East Africa. / Aliloo, H; Mrode, R; Okeyo, AM; Ni, G; Goddard, ME; Gibson, JP.

In: Journal of Dairy Science, Vol. 101, No. 10, 01.08.2018, p. 9108 - 9127.

TY - JOUR

T1 - The feasibility of using low density marker panels for genotype imputation and genomic prediction of crossbred dairy cattle of East Africa

AU - Aliloo, H

AU - Mrode, R

AU - Okeyo, AM

AU - Ni, G

AU - Goddard, ME

AU - Gibson, JP

N1 - 20101043

PY - 2018/8/1

Y1 - 2018/8/1

KW - East African crossbred dairy cattle

KW - Genomic selection

KW - Genotype imputation

KW - Low density marker panel design

U2 - 10.3168/jds.2018-14621

DO - 10.3168/jds.2018-14621

M3 - Article

C2 - 30077450

VL - 101

SP - 9108

EP - 9127

JO - Journal of Dairy Science

JF - Journal of Dairy Science

SN - 0022-0302

IS - 10

ER -