OBJECTIVES: Data linkage combines information from several clinical data sets. The authors examined whether coding inconsistencies for cardiovascular disease between components of linked data sets result in differences in apparent population characteristics.
DESIGN: Retrospective cohort study.
SETTING: Routine primary care data from 40 Scottish general practitioner (GP) surgeries linked to national hospital records.
PARTICIPANTS: 240 846 patients, aged 20 years or older, registered at a GP surgery.
OUTCOMES: Cases of myocardial infarction, ischaemic heart disease and stroke (cerebrovascular disease) were identified from GP and hospital records. Patient characteristics and incidence rates were assessed for all three clinical outcomes, based on four sources: GP records, hospital records, paired GP/hospital records (similar diagnoses recorded simultaneously in both data sets) or pooled GP/hospital records (diagnosis recorded in either or both data sets).
RESULTS: For all three outcomes, the authors found evidence (p<0.05) that patient characteristics differed according to the method of case identification. Prescribing of cardiovascular medicines for ischaemic heart disease was greatest for cases identified using paired records (p≤0.013). For all conditions, 30-day case fatality rates were higher for cases identified using hospital data than for those identified using GP or paired data, most noticeably for myocardial infarction (hospital 20%, GP 4%, p=0.001). Incidence rates were highest using pooled GP/hospital data and lowest using paired data.
CONCLUSIONS: Differences exist in patient characteristics and disease incidence for cardiovascular conditions, depending on the data source. This has implications for studies using routine clinical data.
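ILLUSTRATION: The four case definitions described under OUTCOMES can be read as set operations on patient identifiers. The following is a minimal sketch under stated assumptions, not the authors' method: the function name `case_sets` and the patient identifiers are hypothetical, and the sketch ignores the requirement that paired diagnoses be similar and recorded simultaneously, treating "paired" simply as presence in both sources.

```python
def case_sets(gp_cases, hospital_cases):
    """Build the four case definitions from two lists of patient IDs.

    Assumption: a patient counts as 'paired' if present in both sources;
    matching on diagnosis similarity and recording date is not modelled.
    """
    gp = set(gp_cases)
    hospital = set(hospital_cases)
    return {
        "gp": gp,                    # diagnosis in GP records
        "hospital": hospital,        # diagnosis in hospital records
        "paired": gp & hospital,     # diagnosis in both data sets
        "pooled": gp | hospital,     # diagnosis in either or both data sets
    }


# Hypothetical patient identifiers, for illustration only.
gp_cases = ["p01", "p02", "p03", "p04"]
hospital_cases = ["p03", "p04", "p05"]

sets = case_sets(gp_cases, hospital_cases)
for name, ids in sets.items():
    print(name, len(ids))
```

Because the paired set is a subset of each single source and the pooled set is a superset of both, case counts over a common denominator are necessarily ordered paired ≤ GP or hospital ≤ pooled, which is consistent with the ordering of incidence rates reported under RESULTS.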