Abstract
With the growing availability of large data sets derived from administrative and other sources, there is increasing demand for linking them successfully to provide rich sources of data for further analysis. Because the quality of the identifiers used for linkage varies, existing approaches are often based on ‘probabilistic’ models, which rest on a number of assumptions and can be computationally demanding. In this paper we suggest a new approach to classifying record pairs in linkage, based on weights (scores) derived using a scaling algorithm. The proposed method does not rely on training data, is computationally fast, requires only moderate amounts of storage and has intuitive appeal.
Original language | English |
---|---|
Pages (from-to) | 2514-2521 |
Number of pages | 6 |
Journal | Statistics in Medicine |
Volume | 36 |
Issue number | 16 |
Early online date | 16 Mar 2017 |
Publication status | Published - 20 Jul 2017 |
Keywords
- scaling
- record linkage
- correspondence analysis
- data linkage