The analysis of record-linked data using multiple imputation with data value priors

Harvey Goldstein, Katie Harron, Angie Wade

Research output: Contribution to journal, Article (Academic Journal), peer-review

46 Citations (Scopus)


Probabilistic record linkage techniques assign match weights to one or more potential matches for those individual records that cannot be assigned ‘unequivocal matches’ across data files. Existing methods select the single record having the maximum weight, provided that this weight is higher than an assigned threshold. We argue that this procedure, which ignores all information from matches with lower weights and for some individuals assigns no match, is inefficient and may also lead to biases in subsequent analysis of the linked data. We propose that a multiple imputation framework be utilised for data that belong to records that cannot be matched unequivocally. In this way, the information from all potential matches is transferred through to the analysis stage. This procedure allows for the propagation of matching uncertainty through a full modelling process that preserves the data structure. For purposes of statistical modelling, results from a simulation example suggest that a full probabilistic record linkage is unnecessary and that standard multiple imputation will provide unbiased and efficient parameter estimates.
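The contrast drawn in the abstract can be illustrated with a small sketch. It is not the authors' procedure, which the abstract does not spell out. It assumes, purely for illustration, that match weights are non-negative and already scaled so they can serve as draw probabilities, and that the analysis of interest is a simple mean. The function names (`max_weight_link`, `impute_from_candidates`, `completed_means`, `rubin_combine`) and the toy data are hypothetical. The first function mimics the thresholded maximum-weight rule; the others draw each uncertain value from all of its candidate matches and pool the results with Rubin's rules.

```python
import random
import statistics


def max_weight_link(candidates, threshold):
    """Conventional rule: keep the top-weighted candidate only if it clears the threshold."""
    best = max(candidates, key=lambda c: c["weight"])
    return best["value"] if best["weight"] > threshold else None


def impute_from_candidates(candidates, m, rng):
    """Draw m values, each candidate chosen with probability proportional to its weight.

    Assumption: weights are non-negative and act as (unnormalised) probabilities.
    """
    values = [c["value"] for c in candidates]
    weights = [c["weight"] for c in candidates]
    return rng.choices(values, weights=weights, k=m)


def completed_means(known, uncertain, m, seed=0):
    """Build m completed data sets and return the mean and its variance for each."""
    rng = random.Random(seed)
    draws = [impute_from_candidates(c, m, rng) for c in uncertain]
    estimates, variances = [], []
    for i in range(m):
        data = known + [d[i] for d in draws]
        estimates.append(statistics.fmean(data))
        variances.append(statistics.variance(data) / len(data))
    return estimates, variances


def rubin_combine(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules; returns (estimate, total variance)."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates) if m > 1 else 0.0
    return q_bar, within + (1 + 1 / m) * between


# Toy data (hypothetical): values linked unequivocally, plus two records
# whose value depends on which candidate match is the true one.
known_values = [10.0, 11.0, 9.5, 10.5]
uncertain_records = [
    [{"value": 10.0, "weight": 0.5}, {"value": 12.0, "weight": 0.3}],
    [{"value": 9.0, "weight": 0.9}, {"value": 14.0, "weight": 0.1}],
]

# Thresholded rule: the first record gets no match at all with threshold 0.6.
linked = [max_weight_link(c, threshold=0.6) for c in uncertain_records]

# Imputation route: every candidate contributes across the m completed data sets.
est, var = completed_means(known_values, uncertain_records, m=20, seed=1)
pooled_mean, pooled_var = rubin_combine(est, var)
print(linked, round(pooled_mean, 3), round(pooled_var, 3))
```

With the threshold set at 0.6, the first uncertain record is left unmatched and its lower-weight candidate is discarded, whereas the imputation route lets both candidates influence the pooled estimate and lets the between-imputation variance reflect the matching uncertainty.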
Original language: English
Journal: Statistics in Medicine
Publication status: Published - Jul 2012


Keywords: linking errors; missing data; multiple imputation; prior informed imputation; record linkage



