Abstract
Background
Rapid population-level identification of language disorders could help provide care to young children and improve their outcomes. Two previous studies identified and replicated up to six parent-reported items that predicted 11-year language outcome with ≥71 % sensitivity and specificity. Here, we assess whether including genetic propensity for toddlerhood vocabulary improves predictive accuracy.
Method
The Early Language in Victoria Study (ELVS) recruited 1910 8-month-olds in Melbourne in 2003–2004. The Longitudinal Study of Australian Children (LSAC) recruited 5107 0–1-year-olds across Australia in 2004. Both cohorts collected parent-reported items at 2–3 years, biospecimens for genotyping, and a comparable 11-year language outcome: the Clinical Evaluation of Language Fundamentals (CELF-4) Core Language score or Recalling Sentences subtest. We derived polygenic scores capturing participants’ genetic propensity for parent-reported 24–38-month vocabulary and calculated their univariate associations with continuous language outcomes. We then used the ensemble method SuperLearner to estimate how accurately the parent-reported predictors and polygenic scores predict low 11-year language outcome (>1.5 standard deviations below the mean) in each cohort.
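As an illustration of the outcome definition and accuracy metrics named above, the following minimal Python sketch flags scores more than 1.5 standard deviations below the mean and computes sensitivity and specificity for a set of binary predictions. It is not the study's code: the function names are hypothetical, the example scores are invented, and the use of the sample mean and standard deviation (rather than test norms) is an assumption.

```python
from statistics import mean, stdev


def flag_low_outcome(scores, sd_cutoff=1.5):
    """Flag scores more than `sd_cutoff` sample SDs below the sample mean.

    Assumption: the cutoff is taken relative to the sample itself; the
    study may instead have used normative values.
    """
    threshold = mean(scores) - sd_cutoff * stdev(scores)
    return [s < threshold for s in scores]


def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for paired boolean labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(a and p for a, p in pairs)          # low outcome, predicted low
    fn = sum(a and not p for a, p in pairs)      # low outcome, missed
    tn = sum(not a and not p for a, p in pairs)  # typical, predicted typical
    fp = sum(not a and p for a, p in pairs)      # typical, predicted low
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Invented example data, for illustration only.
example_scores = [100, 105, 95, 110, 90, 102, 98, 60]
low = flag_low_outcome(example_scores)
```

In this toy data only the score of 60 falls below the cutoff. Comparing such flags against a model's predictions with `sensitivity_specificity` yields the two quantities reported for the parent-reported predictor sets.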
Results
Language outcome was available for 839 ELVS and 1441 LSAC participants. Polygenic scores accounted for little variance in continuous language outcomes (R² < 1.3 %). Adding polygenic scores to the predictor sets increased the accuracy of predicting language outcome by up to 7 %, but inconsistently between analyses.
Conclusions
Polygenic scores derived for toddlerhood vocabulary did not meaningfully improve predictive accuracy of individuals’ language outcome when added to the phenotypic predictor set. Presently, parent-reported measures or clinician observation appear best for predicting language outcome at this age.
| Original language | English |
|---|---|
| Article number | 116826 |
| Number of pages | 9 |
| Journal | Psychiatry Research |
| Volume | 354 |
| Early online date | 13 Nov 2025 |
| Publication status | Published - 1 Dec 2025 |
Bibliographical note
Publisher Copyright: © 2025 The Author(s)