Assessing and predicting adolescent and early adulthood common mental disorders using electronic primary care data: analysis of a prospective cohort study (ALSPAC) in Southwest England

Daniel Smith*, Kathryn Willan, Stephanie Prady, Josie Dickerson, Gillian Santorelli, Kate M Tilling, Rosie P Cornish

*Corresponding author for this work

Research output: Contribution to journal; Article (Academic Journal); peer-review



Objectives: We aimed to examine agreement between common mental disorders (CMDs) from primary care records and repeated CMD questionnaire data from ALSPAC (the Avon Longitudinal Study of Parents and Children) over adolescence and young adulthood, explore factors affecting CMD identification in primary care records, and construct models predicting ALSPAC-derived CMDs using only primary care data.
Design and setting: Prospective cohort study (ALSPAC) in Southwest England with linkage to electronic primary care records.
Participants: Primary care records were extracted for 11 807 participants (80% of 14 731 eligible). Between 31% (3633; age 15/16) and 11% (1298; age 21/22) of participants had both primary care and ALSPAC CMD data.
Outcome measures: ALSPAC outcome measures were diagnoses of suspected depression and/or CMDs. Primary care outcome measures were Read codes for diagnosis, symptoms and treatment of depression/CMDs. For each time point, sensitivities and specificities of primary care CMD diagnoses were calculated against ALSPAC-derived measures of CMDs, and the factors associated with identification of primary care-based CMDs in those with suspected ALSPAC-derived CMDs were explored. Lasso (least absolute shrinkage and selection operator) models were used at each time point to predict ALSPAC-derived CMDs using only primary care data, with internal validation by randomly splitting the data into 60% training and 40% validation samples.
Results: Sensitivities for primary care diagnoses were low for CMDs (range: 3.5%–19.1%) and depression (range: 1.6%–34.0%), while specificities were high (nearly all >95%). The strongest predictors of identification in the primary care data for those with ALSPAC-derived CMDs were symptom severity indices. The lasso models had relatively low prediction rates, especially in the validation sample (deviance ratio range: −1.3 to 12.6%), but improved with age.
Conclusions: Primary care data underestimate CMDs compared to population-based studies. Improving general practitioner identification, and using free-text or secondary care data, are needed to improve the accuracy of models using clinical data.
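
The sensitivity and specificity figures reported above compare a primary care indicator against the ALSPAC-derived measure, which is treated as the reference. As a minimal sketch of that calculation, assuming each participant has one boolean flag per source, the function below computes both quantities. The function name, variable names and toy data are hypothetical and are not taken from the study.

```python
import math


def sensitivity_specificity(reference, test):
    """Return (sensitivity, specificity) of `test` against `reference`.

    Both arguments are equal-length sequences of booleans, one entry per
    participant. `reference` plays the role of the cohort-derived measure
    and `test` the role of the primary care record (an assumption made
    for this illustration).
    """
    if len(reference) != len(test):
        raise ValueError("reference and test must have the same length")

    pairs = list(zip(reference, test))
    tp = sum(1 for r, t in pairs if r and t)          # true positives
    fn = sum(1 for r, t in pairs if r and not t)      # false negatives
    tn = sum(1 for r, t in pairs if not r and not t)  # true negatives
    fp = sum(1 for r, t in pairs if not r and t)      # false positives

    # Sensitivity: share of reference cases that the test also flags.
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    # Specificity: share of reference non-cases that the test leaves unflagged.
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return sensitivity, specificity


# Toy data for illustration only; these are not study values.
cohort_cmd = [True, True, True, True, False, False, False, False, False, False]
primary_care_cmd = [True, False, False, False, False, False, False, False, False, True]

sens, spec = sensitivity_specificity(cohort_cmd, primary_care_cmd)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In this toy example only one of four reference cases is flagged in the second source, so sensitivity is low while specificity stays high, the same qualitative pattern the abstract describes.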
Original language: English
Article number: e053624
Number of pages: 13
Journal: BMJ Open
Issue number: 10
Early online date: 18 Oct 2021
Publication status: Published - 18 Oct 2021

Bibliographical note

Funding Information:
This work was funded by the Medical Research Council (MRC grant number: MC_PC_17210). The UK Medical Research Council and Wellcome (Grant ref: 217065/Z/19/Z) and the University of Bristol provide core support for ALSPAC. A comprehensive list of grants funding is available on the ALSPAC website (http:// pdf); collection of ALSPAC CMD data was funded by the NIH (grant references 5R01MH073842-04 and PD301198-SC101645; for the TF3 DAWBA and YPB MFQ data), Wellcome Trust (grant reference 08426812/Z/07/Z; for the TF4 CIS-R data), and a joint Wellcome Trust and MRC grant (grant reference 092731; for the CCS, CCT and YPA SMFQ data). This publication is the work of the authors, and Daniel Smith and Rosie Cornish will serve as guarantors for the contents of this paper.

Publisher Copyright:
© 2021 Author(s). Published by BMJ.
