Comparing genotyping algorithms for Illumina's Infinium whole-genome SNP BeadChips

Matthew E. Ritchie, Ruijie Liu, Benilton S. Carvalho, Rafael A. Irizarry, Melanie Bahlo, David R. Booth, Simon A. Broadley, Matthew A. Brown, Simon J. Foote, Lyn R Griffiths, Trevor J. Kilpatrick, Jeanette Lechner-Scott, Pablo Moscato, Victoria M. Perreau, Justin P. Rubio, Rodney J. Scott, Jim Stankovich, Graeme J. Stewart, Bruce V. Taylor, James Wiley, Glynnis Clarke, Mathew B. Cox, Peter A Csurhes, Patrick Danoy, Joanne L. Dickinson, Karen Drysdale, Judith Field, Judith M Greer, Preethi Guru, Johanna Hadler, Ella Hoban, Brendan J. McMorran, Cathy J. Jensen, Laura J. Johnson, Ruth McCallum, Marilyn Merriman, Tony Merriman, Andrea Polanowski, Karena Pryce, Lotti Tajouri, Lucy Whittock, Ella J. Wilkins, Brian L. Browning, Sharon R. Browning, Devindri Perera, Simon Broadley, Helmut Butzkueven, William M. Carroll, Allan G. Kermode, Mark Marriott, Deborah Mason, Robert N. Heard, Michael P. Pender, Niall Tubridy, Ernest Willoughby

Research output: Contribution to journal > Article > Research > peer-review

30 Citations (Scopus)

Abstract

Background: Illumina's Infinium SNP BeadChips are extensively used in both small and large-scale genetic studies. A fundamental step in any analysis is the processing of raw allele A and allele B intensities from each SNP into genotype calls (AA, AB, BB). Various algorithms that use different statistical models are available for this task. We compare four methods (GenCall, Illuminus, GenoSNP and CRLMM) on data where the true genotypes are known in advance and on data from a recently published genome-wide association study.

Results: In general, differences in accuracy between the methods evaluated are relatively small, although CRLMM and GenoSNP were found to consistently outperform GenCall. The performance of Illuminus is heavily dependent on sample size, with lower no-call rates and improved accuracy as the number of samples available increases. For X chromosome SNPs, methods with sex-dependent models (Illuminus, CRLMM) perform better than methods which ignore gender information (GenCall, GenoSNP). We observe that CRLMM and GenoSNP are more accurate at calling SNPs with low minor allele frequency than GenCall or Illuminus. The sample quality metrics from each of the four methods were found to have a high level of agreement at flagging samples with unusual signal characteristics.

Conclusions: CRLMM, GenoSNP and GenCall can be applied with confidence in studies of any size, as their performance was shown to be invariant to the number of samples available. Illuminus, on the other hand, requires a larger number of samples to achieve comparable levels of accuracy, and its use in smaller studies (50 or fewer individuals) is not recommended.
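The two headline metrics in the abstract, no-call rate and accuracy against known genotypes, can be illustrated with a small sketch. This is not the authors' evaluation code: the function names, the use of `None` to encode a no call, the toy data, and the choice to exclude no calls from the accuracy denominator are all assumptions made for illustration.

```python
from typing import Optional, Sequence

# Hypothetical encoding: "AA", "AB" or "BB" for a call, None for a no call.
Call = Optional[str]


def no_call_rate(calls: Sequence[Call]) -> float:
    """Fraction of genotypes that the caller declined to call."""
    if not calls:
        raise ValueError("calls must not be empty")
    return sum(c is None for c in calls) / len(calls)


def accuracy(calls: Sequence[Call], truth: Sequence[str]) -> float:
    """Fraction of called genotypes that match the known genotype.

    No calls are left out of the denominator (an assumption of this sketch).
    """
    if len(calls) != len(truth):
        raise ValueError("calls and truth must have the same length")
    called = [(c, t) for c, t in zip(calls, truth) if c is not None]
    if not called:
        raise ValueError("no genotypes were called")
    return sum(c == t for c, t in called) / len(called)


# Toy data: five genotypes at one SNP, one no call and one miscall.
truth = ["AA", "AB", "BB", "AB", "AA"]
calls = ["AA", "AB", None, "BB", "AA"]

print(no_call_rate(calls))     # 1 of 5 genotypes not called
print(accuracy(calls, truth))  # 3 of the 4 called genotypes are correct
```

Reporting the two numbers separately matters because a caller can raise its apparent accuracy simply by declining to call difficult genotypes.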

Original language: English
Article number: 68
Journal: BMC Bioinformatics
Volume: 12
DOI: https://doi.org/10.1186/1471-2105-12-68
Publication status: Published - 8 Mar 2011
Externally published: Yes

Cite this

Ritchie, M. E., Liu, R., Carvalho, B. S., Irizarry, R. A., Bahlo, M., Booth, D. R., ... Kilpatrick, T. J. (2011). Comparing genotyping algorithms for Illumina's Infinium whole-genome SNP BeadChips. BMC Bioinformatics, 12, [68]. https://doi.org/10.1186/1471-2105-12-68