Major errors in data and their effect on response to selection

I. J. Mackay, P. D.S. Caligari

Research output: Contribution to journal, Article

2 Citations (Scopus)

Abstract

Outliers in data can usually be detected by data validation routines, but some major errors escape detection because they fall within an acceptable range of values. In a plant breeding program, although these errors may be rare, they could reduce response to selection by an amount disproportionate to their frequency. We used stochastic computer simulations to assess the effect of such errors on response to selection. Combinations of high (1%) and low (0.1%) error rates were simulated, with between 1 and 10 individuals selected from populations of size 100 or 1000. Four different error types were simulated by adjusting the means and variances of the simulated major errors. Major errors caused large reductions in response to selection, especially when present at an error rate of 1% with a population of size 1000. Under such circumstances response to selection may actually increase if selection intensity is reduced. At the 0.1% error rate, and in populations of size 100, the reduction in response to selection was less marked. Data validation methods, in which the most extreme observations were rejected prior to selection, usually reduced response to selection and therefore should not be used routinely. In addition to their effect on selection programs, major errors will also reduce the efficiency of bulked segregant analysis. These results confirm that vigilance and careful experimental technique repay their time and effort. Data on the frequency and distribution of major errors are required to achieve a better understanding of their effect and define the best procedure to handle their presence.
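The simulation design described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes a simple additive model with genetic variance fixed at 1, a hypothetical heritability of 0.5, and a single hypothetical error type in which an erroneous record is replaced by a draw from a normal distribution that is unrelated to the individual's genetic value. The function name `simulate_response` and all default parameter values are illustrative assumptions; the abstract does not give the means and variances of the paper's four error types.

```python
import random


def simulate_response(pop_size, n_selected, error_rate,
                      error_mean=0.0, error_sd=3.0,
                      heritability=0.5, n_reps=1000, seed=1):
    """Average true genetic value of the selected individuals.

    Assumptions (not taken from the paper): genetic values are N(0, 1),
    environmental noise is scaled to give the stated heritability, and a
    major error replaces the whole record with N(error_mean, error_sd).
    """
    rng = random.Random(seed)
    # Environmental SD implied by heritability when genetic variance is 1.
    env_sd = ((1.0 - heritability) / heritability) ** 0.5
    total = 0.0
    for _ in range(n_reps):
        records = []
        for _ in range(pop_size):
            genetic = rng.gauss(0.0, 1.0)
            if rng.random() < error_rate:
                # Major error: the recorded value carries no information
                # about the individual's genetic value.
                observed = rng.gauss(error_mean, error_sd)
            else:
                observed = genetic + rng.gauss(0.0, env_sd)
            records.append((observed, genetic))
        # Truncation selection on the recorded (possibly erroneous) value.
        records.sort(key=lambda rec: rec[0], reverse=True)
        selected = records[:n_selected]
        total += sum(genetic for _, genetic in selected) / n_selected
    return total / n_reps
```

Calling the function over the grid named in the abstract (error rates of 0, 0.001 and 0.01; population sizes of 100 and 1000; 1 to 10 individuals selected) gives a rough picture of how response changes with error rate and selection intensity under these assumed settings. The magnitudes depend entirely on the assumed error distribution and heritability, so they should not be read as reproducing the paper's results.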
Original language: English
Pages (from-to): 697-702
Number of pages: 6
Journal: Crop Science
Volume: 39
Issue number: 3
DOI: 10.2135/cropsci1999.0011183X003900020016x
Publication status: Print publication - 1999
Externally published: Yes

Fingerprint

population size
selection intensity
plant breeding
computer simulation
methodology

Cite this

Mackay, I. J.; Caligari, P. D.S. / Major errors in data and their effect on response to selection. In: Crop Science. 1999; Vol. 39, No. 3. pp. 697-702.
@article{dd0ef3fc8e1946d49e4ac95906e85fe2,
title = "Major errors in data and their effect on response to selection",
author = "Mackay, {I. J.} and Caligari, {P. D.S.}",
year = "1999",
doi = "10.2135/cropsci1999.0011183X003900020016x",
language = "English",
volume = "39",
pages = "697--702",
journal = "Crop Science",
issn = "0011-183X",
publisher = "Crop Science Society of America",
number = "3",
}

TY - JOUR
T1 - Major errors in data and their effect on response to selection
AU - Mackay, I. J.
AU - Caligari, P. D.S.
PY - 1999
Y1 - 1999
UR - http://www.mendeley.com/research/major-errors-data-effect-response-selection
U2 - 10.2135/cropsci1999.0011183X003900020016x
DO - 10.2135/cropsci1999.0011183X003900020016x
M3 - Article
VL - 39
SP - 697
EP - 702
JO - Crop Science
JF - Crop Science
SN - 0011-183X
IS - 3
ER -