Maximum Likelihood Pedigree Reconstruction using Integer Linear Programming

James Cussens, Mark Bartlett, Elinor Jones, Nuala Sheehan*

*Corresponding author for this work

Research output: Contribution to journal › Article (Academic Journal) › peer-review

34 Citations (Scopus)

Abstract

Large population biobanks of unrelated individuals have been highly successful in detecting common genetic variants affecting diseases of public health concern. However, they lack the statistical power to detect more modest gene-gene and gene-environment interaction effects or the effects of rare variants for which related individuals are ideally required. In reality, most large population studies will undoubtedly contain sets of undeclared relatives, or pedigrees. Although a crude measure of relatedness might sometimes suffice, having a good estimate of the true pedigree would be much more informative if this could be obtained efficiently. Relatives are more likely to share longer haplotypes around disease susceptibility loci and are hence biologically more informative for rare variants than unrelated cases and controls. Distant relatives are arguably more useful for detecting variants with small effects because they are less likely to share masking environmental effects. Moreover, the identification of relatives enables appropriate adjustments of statistical analyses that typically assume unrelatedness. We propose to exploit an integer linear programming optimisation approach to pedigree learning, which is adapted to find valid pedigrees by imposing appropriate constraints. Our method is not restricted to small pedigrees and is guaranteed to return a maximum likelihood pedigree. With additional constraints, we can also search for multiple high-probability pedigrees and thus account for the inherent uncertainty in any particular pedigree reconstruction. The true pedigree is found very quickly by comparison with other methods when all individuals are observed. Extensions to more complex problems seem feasible.
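The optimisation problem described in the abstract can be illustrated on a toy scale. The sketch below is not the authors' implementation: exhaustive enumeration stands in for an integer linear programming solver, but it optimises the same kind of objective (a sum of per-individual scores) under the same kind of validity constraints (at most two parents, of opposite sex, and no individual being their own ancestor). The function names, the three individuals, and the score table are all hypothetical.

```python
from itertools import combinations, product

def candidate_parent_sets(child, individuals, sex):
    """Founder (no parents), one known parent, or one male plus one female parent."""
    others = [i for i in individuals if i != child]
    sets = [frozenset()]
    sets += [frozenset([p]) for p in others]
    sets += [frozenset(pair) for pair in combinations(others, 2)
             if sex[pair[0]] != sex[pair[1]]]
    return sets

def is_acyclic(assignment):
    """Return True if no individual is their own ancestor."""
    remaining = dict(assignment)
    while remaining:
        # Individuals none of whose parents are still waiting to be placed.
        ready = [c for c, parents in remaining.items()
                 if not any(p in remaining for p in parents)]
        if not ready:
            return False  # every remaining individual depends on another: a cycle
        for c in ready:
            del remaining[c]
    return True

def best_pedigree(individuals, sex, local_score):
    """Enumerate every valid pedigree and keep the highest-scoring one.

    An ILP solver would search this space far more efficiently; brute force
    is used here only to keep the example dependency-free.
    """
    choices = [candidate_parent_sets(c, individuals, sex) for c in individuals]
    best, best_score = None, float("-inf")
    for combo in product(*choices):
        assignment = dict(zip(individuals, combo))
        if not is_acyclic(assignment):
            continue
        score = sum(local_score(c, parents) for c, parents in assignment.items())
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score

# Hypothetical data: three individuals and made-up log-likelihood scores.
individuals = ["dad", "mum", "kid"]
sex = {"dad": "M", "mum": "F", "kid": "M"}

def toy_score(child, parents):
    if child == "kid" and parents == frozenset({"dad", "mum"}):
        return -1.0   # marker data strongly support this trio
    if not parents:
        return -2.0   # founder
    return -5.0       # any other parent set fits poorly

pedigree, total = best_pedigree(individuals, sex, toy_score)
```

In a real analysis the local scores would be log-likelihoods computed from genetic marker data, and the number of candidate pedigrees grows far too quickly for enumeration, which is the motivation for an integer programming formulation.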
Original language: English
Pages (from-to): 69-83
Number of pages: 15
Journal: Genetic Epidemiology
Volume: 37
Issue number: 1
Early online date: 3 Oct 2012
Publication status: Published - 1 Jan 2013

Bibliographical note

Funding Information:
The authors acknowledge support from the Medical Research Council (Project Grant G1002312), the Leverhulme Trust (Research Fellowship RF/9/RFG/2009/0062), and the BioSHaRE-EU project (HEALTH-F4-2010-261433) funded by the European Commission under the Seventh Framework Program (FP7/2007-2013).

Publisher Copyright:
© 2012 Wiley Periodicals, Inc.

Keywords

  • constraint-based optimisation; genetic marker data; Bayesian networks; model uncertainty
