Cost-effective high density (HD) genotypes of livestock species can be obtained by genotyping a proportion of the population with a HD panel and the remainder with a cheaper low density panel, and then imputing the genotypes that are not directly assayed by the low density panel. The efficacy of genotype imputation can be strongly affected by the structure and history of the specific target population, and it should be checked before imputation is incorporated into routine genotyping practices. Here, we investigated the efficacy of imputation in crossbred dairy cattle populations of East Africa using 4 different commercial single nucleotide polymorphism (SNP) panels, 3 reference populations and 3 imputation algorithms. We found that Minimac, combined with a reference population that included a mixture of crossbred and ancestral purebred animals, provided the highest imputation accuracy of all imputation scenarios tested. The accuracies of imputation, measured as the correlation between real and imputed genotypes averaged across SNPs, were around 0.76 and 0.94 for 7K and 40K SNPs, respectively, when imputed up to a 770K panel. We also present a method to maximize the imputation accuracy of low density panels, which relies on the pairwise (co)variances between SNPs and the minor allele frequency of SNPs. The performance of the developed method was tested in a 5-fold cross validation process in which various densities of SNPs were selected using the (co)variance method and alternative SNP selection methods, and then imputed up to the HD panel. The (co)variance method provided the highest imputation accuracies at almost all marker densities, with accuracies up to 0.19 higher than those obtained with random selection of SNPs. The accuracies of imputation from 7K and 40K panels selected using the (co)variance method were around 0.80 and 0.94, respectively. The presented method also achieved higher accuracy of genomic prediction at lower densities of selected SNPs. The squared correlation between genomic breeding values estimated using imputed genotypes and those from the real 770K HD panel was 0.95 when the accuracy of imputation was 0.64. The presented method for SNP selection is straightforward to apply and can ensure high accuracies in genotype imputation of crossbred dairy populations in East Africa.
- East African crossbred dairy cattle
- Genomic selection
- Genotype imputation
- Low density marker panel design
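
The imputation accuracy measure described in the abstract (the correlation between real and imputed genotypes, averaged across SNPs) can be illustrated with a minimal sketch. The function name `imputation_accuracy`, the 0/1/2 allele-count coding, and the choice to skip SNPs with zero variance are assumptions made for this illustration, not details taken from the study.

```python
import numpy as np


def imputation_accuracy(true_geno, imputed_geno):
    """Per-SNP Pearson correlation between true and imputed genotypes.

    Both inputs have shape (n_animals, n_snps). Genotypes are assumed to be
    coded as allele counts (0/1/2) or as imputed dosages in [0, 2].
    Returns (per_snp_r, mean_r); SNPs with zero variance in either
    matrix get NaN and are ignored in the mean (an assumption of this sketch).
    """
    true_geno = np.asarray(true_geno, dtype=float)
    imputed_geno = np.asarray(imputed_geno, dtype=float)

    # Centre each SNP column on its own mean.
    t = true_geno - true_geno.mean(axis=0)
    m = imputed_geno - imputed_geno.mean(axis=0)

    # Pearson correlation per column: cross-product over product of norms.
    cross = (t * m).sum(axis=0)
    denom = np.sqrt((t ** 2).sum(axis=0) * (m ** 2).sum(axis=0))

    per_snp_r = np.full(true_geno.shape[1], np.nan)
    valid = denom > 0
    per_snp_r[valid] = cross[valid] / denom[valid]

    return per_snp_r, float(np.nanmean(per_snp_r))


# Toy example: 4 animals, 3 SNPs (the second SNP is monomorphic).
true_geno = np.array([[0, 1, 2],
                      [1, 1, 0],
                      [2, 1, 1],
                      [0, 1, 2]])
imputed_geno = true_geno.copy()
per_snp_r, mean_r = imputation_accuracy(true_geno, imputed_geno)
print(per_snp_r, mean_r)
```

Averaging per-SNP correlations, rather than pooling all genotypes into one correlation, keeps common and rare variants on the same scale, which is one reason a correlation-based measure is often preferred over a simple concordance rate.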