Decline in attention-deficit hyperactivity disorder traits over the life course in the general population: trajectories across five population birth cohorts spanning ages 3 to 45 years

Abstract
Background: Trajectories of attention-deficit hyperactivity disorder (ADHD) traits spanning early childhood to mid-life have not been described in general populations across different geographical contexts. Population trajectories are crucial to better understanding typical developmental patterns.
Methods: We combined repeated assessments of ADHD traits from five population-based cohorts, spanning ages 3 to 45 years. We used two measures: (i) the Strengths and Difficulties Questionnaire (SDQ) hyperactive-inattentive subscale (175 831 observations, 29 519 individuals); and (ii) scores from DSM-referenced scales (118 144 observations, 28 685 individuals). Multilevel linear spline models allowed for non-linear change over time and differences between cohorts and raters (parent/teacher/self).
Results: Patterns of age-related change differed by measure, cohort and country: overall, SDQ scores decreased with age, most rapidly declining before age 8 years (-0.157, 95% CI: -0.170, -0.144 per year). The pattern was generally consistent using DSM scores, although with greater between-cohort variation. DSM scores decreased most rapidly between ages 14 and 17 years (-1.32%, 95% CI: -1.471, -1.170 per year). Average scores were consistently lower for females than males (SDQ: -0.818, 95% CI: -0.856, -0.780; DSM: -4.934%, 95% CI: -5.378, -4.489). This sex difference decreased over age for both measures, due to an overall steeper decrease for males.
Conclusions: ADHD trait scores declined from childhood to mid-life, with marked variation between cohorts. Our results highlight the importance of taking a developmental perspective when considering typical population traits. When interpreting changes in clinical cohorts, it is important to consider the pattern of expected change within the general population, which is influenced by cultural context and measurement.


Introduction
Attention-deficit hyperactivity disorder (ADHD) is a neurodevelopmental condition defined by a persistent and impairing pattern of inattentive, hyperactive and impulsive behaviours that typically starts in childhood. 1 Its estimated prevalence worldwide is 3.4% (95% CI: 2.6, 4.5) in children and adolescents 2 and 2.6% in adults (95% CI: 2.1, 3.1). 3 The developmental course of ADHD within clinical samples is well documented: traits typically decline with age, although not for everyone. 1,4 Meta-analyses suggest over 80% of those with a childhood ADHD diagnosis do not meet full diagnostic criteria in adulthood, although around 65% do experience residual traits and impairment. 4 Based upon the meta-analysed rate of decline, for an individual with ADHD there is an 83% chance of meeting full ADHD criteria 1 year later and a 96% chance of meeting residual criteria. However, trait trajectories among adults with ADHD are also highly heterogeneous. 5 Categorical ADHD diagnosis represents one extreme of an underlying continuous distribution of ADHD traits within the general population. 6,7 Compared with clinically ascertained samples, less is known about the developmental pattern of ADHD traits in the general population, especially into adult life. Previous cohort studies of ADHD trait trajectories suggest that for most individuals, traits are consistently low or decline across childhood/adolescence. 8,9 However, trajectory modelling is often disrupted by the transition to adulthood, because measures and raters typically change (e.g. from parent- to self-ratings). Trajectory modelling across the life course is needed to understand the developmental course of ADHD traits in the general population: this is an important first step towards delineating what is developmentally (in)appropriate at different ages.
Even less is known about how the developmental patterns differ across different countries and cultural contexts. In this study, we use repeated measures from five population cohorts in the UK, New Zealand and Brazil, to better understand the natural history of ADHD traits in the general population. We set out to describe typical trajectories from childhood (earliest age 3 years) into mid-life (latest age 45 years) and to examine how these vary by cohort, rater, sex and common risk factors. We included repeated measures across multiple cohorts and raters through multilevel modelling, to maximize the generalizability of results: an approach previously applied to height and weight, 10 blood pressure 11 and alcohol consumption. 12

Sample
We used data from five population-based birth cohorts: the Avon Longitudinal Study of Parents and Children (ALSPAC), 13

Measures of ADHD traits
Seven different measures of ADHD traits were available across cohorts and harmonized into two groups: (i) the hyperactive-inattentive subscale of the Strengths and Difficulties Questionnaire (SDQ); and (ii) ADHD scores based on the 18 ADHD diagnostic criteria in the Diagnostic and Statistical Manual (DSM percentage scores, see below). Parent-, teacher- and self-ratings of these measures were collected. For a detailed overview of the measures, see Supplementary Notes 2 and 3 (available as Supplementary data at IJE online).
Strengths and Difficulties Questionnaire (SDQ) measures were collected in three cohorts: ALSPAC (4-25 years), TEDS (3-21 years) and Pelotas (11 and 15 years). The hyperactive-inattentive subscale of the SDQ consists of five items capturing inattentive, hyperactive and impulsive traits. Possible scores range from 0 to 10, where higher scores represent higher ADHD trait levels. Further details of items and validations can be found in Supplementary Note 4 (available as Supplementary data at IJE online).
DSM percentage scores were measured in all five cohorts: ALSPAC (8-25 years), TEDS (8-21 years), E-Risk (5-18 years), Dunedin (9-45 years), Pelotas (21-22 years). DSM assessments ranged from 11 to 27 items and response categories from 0-1 to 0-3, resulting in considerable variation in possible scores across cohorts and across occasions (see Supplementary Notes 2, 3 and 6, available as Supplementary data at IJE online). To enable comparison despite this variation, scores were converted to a percentage of the total possible score for each cohort at each time point (additional information in Supplementary Note 4).
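The percentage conversion described above can be illustrated with a minimal sketch. It assumes that every item in an assessment is scored from 0 up to a common maximum and that the percentage is the summed score divided by the highest attainable sum; the exact procedure is given in the Supplementary Notes, so the function name `dsm_percentage_score` and the example responses below are hypothetical.

```python
def dsm_percentage_score(item_scores, max_item_score):
    """Convert raw item responses to a percentage of the total possible score.

    Hypothetical helper: assumes every item is scored from 0 to max_item_score.
    """
    if not item_scores:
        raise ValueError("at least one item response is required")
    if any(s < 0 or s > max_item_score for s in item_scores):
        raise ValueError("item score outside the 0..max_item_score range")
    total_possible = len(item_scores) * max_item_score
    return 100.0 * sum(item_scores) / total_possible


# Two made-up assessments with different formats end up on the same 0-100 scale.
binary_11_items = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]  # 11 items scored 0-1
likert_18_items = [3, 0, 1, 2, 0, 0, 1, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 0]  # 18 items scored 0-3

print(round(dsm_percentage_score(binary_11_items, 1), 1))
print(round(dsm_percentage_score(likert_18_items, 3), 1))
```

Rescaling in this way makes scores comparable in range, but it does not remove differences in item content or wording between assessments.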

Measures of covariates
The association between ADHD and related risk factors differs across cohorts, 21 so we examined cohort differences by including common risk factors for ADHD as covariates in our models and allowed for interactions between covariates and cohorts. The five covariates included were: sex, 22 birthweight (kg), 23 gestational age (weeks), 24 maternal age at delivery (years) 25 and standardized parental socioeconomic position (SEP). 26 Covariates chosen are common risk factors for ADHD and were measured across all cohorts (see additional details of covariate measures in Supplementary Note 5, available as Supplementary data at IJE online). We also modelled sex-stratified and SEP-stratified trajectories.

Statistical analysis
We used multilevel modelling (MLM) to estimate individual-specific and average trajectories of ADHD traits. We constructed separate trajectories for SDQ (3 to 25 years) and DSM (5 to 45 years) scores. We used cubic splines (smooth curves, joined at knot points, where model slope is allowed to change) and linear splines (linear periods of change, joined at knot points) to allow non-linear change over time. 10 Model complexity was built up incrementally, beginning with parent-rated single cohorts and gradually adding additional cohorts, raters, covariates and finally related individuals. Model fit was assessed using the Akaike information criterion (AIC) and by comparing observed and predicted values for 2-year groups of age across the trajectory (see Supplementary Note 11, available as Supplementary data at IJE online). The final MLM is presented in Figure 1.
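As an illustration of how a linear spline represents non-linear change, the sketch below splits age into the number of years spent in each linear segment and combines those terms with one slope per segment. It covers only the fixed-effect mean; the knot positions, intercept and slopes in the example are made up for illustration, and the function names are hypothetical rather than taken from the analysis scripts.

```python
def linear_spline_terms(age, knots, start_age):
    """Years spent in each linear segment up to `age`.

    Segments are [start_age, k1], [k1, k2], ..., [k_last, +inf).
    """
    bounds = [start_age] + list(knots)
    terms = []
    for i, lower in enumerate(bounds):
        upper = bounds[i + 1] if i + 1 < len(bounds) else float("inf")
        # Time in this segment: zero before it starts, capped at its length.
        terms.append(min(max(age - lower, 0.0), upper - lower))
    return terms


def spline_mean(age, intercept, slopes, knots, start_age):
    """Predicted mean: intercept plus slope-weighted time in each segment."""
    terms = linear_spline_terms(age, knots, start_age)
    return intercept + sum(b * t for b, t in zip(slopes, terms))


# Illustrative values only: two knots, three segment-specific slopes.
print(linear_spline_terms(10, knots=[8, 16], start_age=3))
print(spline_mean(10, intercept=5.0, slopes=[-0.2, -0.1, -0.05],
                  knots=[8, 16], start_age=3))
```

In a multilevel model the same spline terms would enter as covariates, with random effects allowing individual trajectories to deviate from the average.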

DSM benchmark model
Due to inconsistency in the assessment of DSM items (see Supplementary Notes 2 and 3), we first constructed a benchmark model that removed as much measurement variation as possible. We used only five parent-rated items that were consistent across cohorts. Differences between cohorts observed in this benchmark model can be used to help interpret the overall DSM percentage score model (see additional details in Supplementary Note 6).

Sensitivity analyses
We conducted MLM sensitivity analyses (i) allowing for autocorrelation, (ii) examining attrition (Supplementary Note 7, available as Supplementary data at IJE online), (iii) comparing centring by overall covariate mean with centring by the mean for each cohort separately and (iv) assessing the impact of the zero-inflated distribution using generalized estimating equations (GEE). Even though MLM fixed effects are robust to non-normal distributions, 27 we examined the sensitivity of our conclusions using GEE, which does not rely on normality for confidence interval estimation.

Model fitting
The best fitting model in the test cohort (ALSPAC) had linear splines with knot points at ages 8 and 16 years, where the rate of decrease changed at each knot point: the rate of decrease was shallower following the age 8 knot point and steeper again following the age 16 knot point (cubic splines, Supplementary Figure S1; fit comparisons, Supplementary Tables S3-S4, available as Supplementary data at IJE online). The model fit remained good after adding in additional raters and cohorts (Supplementary Figure S2 and Tables S5-S7, available as Supplementary data at IJE online). Mean and standard deviation for hyperactive-inattentive SDQ scores with age across cohorts were similar for ALSPAC and TEDS and slightly higher for Pelotas (Supplementary Table S8, available as Supplementary data at IJE online). The best fitting model was adjusted for rater, cohort, sex, birthweight, maternal age at delivery, and SEP (Supplementary Table S9, available as Supplementary data at IJE online). Additionally, fit was improved by interacting cohort with rater, age at delivery, SEP and slope (final model fit, Supplementary Table S10, available as Supplementary data at IJE online), suggesting that these can partially account for the observed differences in scores between cohorts.
Figure 1. The final hierarchical multilevel model, with repeated measures of attention-deficit hyperactivity disorder traits nested within individuals who are nested within families
The final model of hyperactive-inattentive SDQ scores comprised 175 831 observations from 29 519 individuals (Figure 2; Supplementary Table S11, available as Supplementary data at IJE online). This model estimated that a male, aged 3 years, from the ALSPAC cohort (with mean covariate values), would have an average parent-rated hyperactive-inattentive SDQ score of 4.46 (95% CI: 4.40, 4.53). The average hyperactive-inattentive SDQ score decreased with age: the estimated change per year was -0.16 (95% CI: -0.17, -0.14) between ages 3 and 8 years, -0.07 (95% CI: -0.08, -0.06) between ages 8 and 16 years, and -0.11 (95% CI: -0.12, -0.10) after age 16 years. Average hyperactive-inattentive SDQ scores were lower for females than males (-0.82, 95% CI: -0.86, -0.78) (sex-stratified results, Supplementary Note 8, available as Supplementary data at IJE online). Per SD increase in SEP, average hyperactive-inattentive SDQ scores were lower (-0.15, 95% CI: -0.18, -0.11) (SEP-stratified results, Supplementary Note 9, available as Supplementary data at IJE online). Results were similar when covariates were centred at the mean for each cohort (Supplementary Table S11) and fixed effects were consistent with a model allowing for autocorrelation. Figure 3 compares the trajectories across cohorts. Average scores for ALSPAC and TEDS were similar, with higher average scores for Pelotas. Average self-ratings were higher than parent-ratings (1.71, 95% CI: 1.62, 1.81) and teacher-ratings were lower than parent-ratings (-0.80, 95% CI: -0.84, -0.76).
Figure 2. The best-fitting model of Strengths and Difficulties Questionnaire hyperactive-inattentive subscale scores with knot points at 8 and 16 years, using data from three cohorts combined (the Avon Longitudinal Study of Parents and Children: ALSPAC; the Twins Early Development Study; the Pelotas 1993 birth cohort). Plotted average scores are parent-rated for a male from the ALSPAC cohort, with mean covariate values
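To make the reported fixed effects concrete, the following sketch traces the average parent-rated trajectory for a male from the ALSPAC cohort with mean covariate values, using only the rounded intercept and per-year slopes quoted above. It is a back-of-the-envelope illustration rather than the fitted model: it ignores random effects, interactions and confidence intervals, and the function name `predicted_sdq` and the upper age limit of 25 years (the oldest SDQ assessment) are assumptions made for this example.

```python
# Rounded fixed effects quoted in the text (parent-rated, male, ALSPAC, mean covariates).
INTERCEPT_AT_AGE_3 = 4.46
SEGMENTS = [          # (start age, end age, change in score per year)
    (3, 8, -0.16),
    (8, 16, -0.07),
    (16, 25, -0.11),  # 25 years is assumed as the end of the SDQ age range
]


def predicted_sdq(age):
    """Average hyperactive-inattentive SDQ score implied by the rounded estimates."""
    if not 3 <= age <= 25:
        raise ValueError("age outside the 3-25 year range covered by the SDQ model")
    score = INTERCEPT_AT_AGE_3
    for start, end, slope in SEGMENTS:
        # Add the slope for each year spent inside this segment.
        score += slope * max(0, min(age, end) - start)
    return score


for age in (3, 8, 16, 25):
    print(age, round(predicted_sdq(age), 2))
```

Under these rounded values the implied average falls from 4.46 at age 3 to roughly 3.66 at age 8, 3.10 at age 16 and 2.11 at age 25.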
There was an interaction between cohort and rater, such that self-ratings were higher than parent-ratings in TEDS and ALSPAC (intercept difference = 1.02 and 1.71 hyperactive-inattentive SDQ points, respectively), but self-rated ADHD traits were lower than parent-rated traits in Pelotas (intercept difference = -0.98 hyperactive-inattentive SDQ points). For full model coefficients, see Supplementary Table S11.

Between- and within-person variability
Of the total variation in hyperactive-inattentive SDQ scores at baseline, 53% was explained by level 1 (within-participant) variation, 33% was explained by level 2 (between-participant) variation and 14% was explained by level 3 (between-family) variation. In other words, most of the variability in scores is attributable to differences between repeated scores within the same person; approximately a third is explained by differences between people; and relatively little is explained by differences between families.

ADHD trait trajectories using DSM percentage scores
Benchmark model
Average scores were lowest for ALSPAC, then TEDS, Dunedin, E-Risk and finally Pelotas, which had the highest average DSM percentage scores (Supplementary Figure S4, available as Supplementary data at IJE online). All cohorts had similar trajectories despite different average scores, with the exception of Pelotas, which showed a steeper trajectory (additional details in Supplementary Note 6).

Model fitting
The best fitting model in the test cohort (TEDS) had knot points at ages 14, 17 and 21 years (cubic spline, Supplementary Figure S5; fit comparison, Supplementary Tables S16 and S17, available as Supplementary data at IJE online). The model fit remained good after adding in additional raters and cohorts (Supplementary Tables S18-S22, available as Supplementary data at IJE online). This model was adjusted for rater, cohort, sex, birthweight, gestational age, maternal age at delivery and SEP. Additionally, fit was improved by interacting cohort with

Cohort comparisons
A comparison of the trajectories across cohorts is shown in Figure 5. On average, self-ratings were higher than parent-ratings (10.79%; 95% CI: 10.51, 11.07) and teacher-ratings were lower than parent-ratings (-2.10%; 95% CI: -2.81, -1.39). Scores were the highest for E-Risk and the lowest for ALSPAC, especially for the first spline. Similar to the hyperactive-inattentive SDQ model, TEDS and ALSPAC had a similar slope, but average scores were higher for the TEDS cohort (intercept difference = 8.22%). Trajectories differed more across cohorts compared with the benchmark model (Supplementary Figure S4). This could suggest that some of the differences between cohorts are due to differences in DSM measurement rather than true differences in slope. For extrapolated DSM percentage scores, see Supplementary Figure S6 (available as Supplementary data at IJE online).

Between- and within-person variability
Of the total variation in DSM percentage scores at baseline, 37% was explained by level 1 (within-participant variation), 46% was explained by level 2 (between-participant) variation and 16% was explained by level 3 (between-family) variation. As with hyperactive-inattentive SDQ, similarity within families was low. However, here the variation between individuals was high and within individuals was lower than for the hyperactive-inattentive SDQ.
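The percentages above are shares of the total variance attributed to each level of the model. A minimal sketch of that calculation follows, assuming a simple three-level random-intercept structure; the variance components are hypothetical values chosen for illustration (the fitted components are reported in the Supplementary data), and `variance_partition` is not a function from the analysis scripts.

```python
def variance_partition(components):
    """Percentage of total variance attributed to each level of the model."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {level: 100.0 * value / total for level, value in components.items()}


# Hypothetical variance components, NOT the fitted estimates.
components = {
    "within_person": 3.7,   # level 1: occasions within an individual
    "between_person": 4.6,  # level 2: individuals within a family
    "between_family": 1.7,  # level 3: families
}

shares = variance_partition(components)
for level, pct in shares.items():
    print(level, round(pct, 1))

# Under a random-intercept model, the expected correlation between two scores
# from the same person is the share of variance not due to level 1.
same_person_icc = (shares["between_person"] + shares["between_family"]) / 100.0
print(round(same_person_icc, 2))
```

A larger level 1 share therefore corresponds to lower agreement between repeated scores from the same individual.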

Discussion
This is the most comprehensive investigation to date of the developmental course of ADHD traits from childhood to adulthood in the general population. There was an overall pattern of decreasing traits across development, which is consistent with findings from both single cohort studies spanning childhood/adolescence and clinical samples across the lifespan. 4,28,29 Average ADHD traits differed according to sex, rater, cohort, socioeconomic position, birthweight, maternal age at delivery and gestational age. Overall, males had higher average scores than females, consistent with the well-documented preponderance of ADHD traits in boys in clinical 22 and population samples. 30 For both measures, this sex difference decreased over age, reflecting an overall steeper decrease for males. Consequently, by approximately age 25 years, average scores were similar for males and females. There is mixed evidence for a reduction in sex differences by adult life, 31-33 which could reflect true differences in persistence or possibly that ADHD measures are better suited to detecting childhood traits in males than females. 34 Average ADHD traits were highest for self-ratings and lowest for teacher-ratings, for both hyperactive-inattentive SDQ and DSM. Higher self- compared with parent-ratings have been reported in young adults without ADHD. 35 However, this differs from clinical samples, in which children with an ADHD diagnosis tend to under-report traits compared with parents. 35-37 Non-corroboration between raters has implications for longitudinal measurement, because respondents typically change from parent to self during adolescence. Our multilevel modelling approach accounted for this by including a fixed effect for rater and by allowing interactions between rater and cohort.
Across cohorts there were differences in average ADHD trait scores, even after controlling for rater and other covariates. For both hyperactive-inattentive SDQ and DSM models, the 1993 Pelotas cohort had higher average scores compared with ALSPAC and TEDS. This is consistent with a previous cross-cohort comparison which found higher SDQ and DAWBA scores in the 2004 Pelotas cohort compared with the ALSPAC cohort. 21 Previous estimates of adolescent ADHD population prevalence from across Brazil were within the expected range, 38 but estimates for ADHD in adulthood 39 were higher than those found in the UK and New Zealand. 40-42 It is important to note that items were translated into Portuguese for the 1993 Pelotas cohort, which could have influenced their interpretation. The only cohort with higher average ADHD scores than Pelotas was E-Risk. This is likely due to the E-Risk sampling approach (a subsample of TEDS), where more young mothers were contacted and attrition was low, achieving a sample more representative of the UK population. 18 To avoid overlap in the current study, we removed all E-Risk participants from the TEDS cohort, making it likely less representative of the general population in the UK. 17 The higher ADHD scores observed in E-Risk likely better reflect the UK population. Furthermore, the majority of participants in the final model were from UK-based cohorts, with smaller numbers of participants from New Zealand and Brazil. Consequently, model inferences are likely to be most applicable to the UK population. Future investigations should incorporate data from additional countries to make wider generalizations.
Despite an overall decreasing trend, the average DSM percentage scores increased slightly between the ages of 17 and 21 years. This period of transition to adulthood is a particularly challenging time and a peak age for depression onset, 43 both of which could exacerbate, and affect the measurement of, ADHD traits. 36 However, only two of the contributing cohorts had observations of DSM-related items between these ages (TEDS and E-Risk), and both included additional self-reported items at these time points to capture age-related change in ADHD trait presentation (see Supplementary Note 10, available as Supplementary data at IJE online). These ADHD items were more frequently endorsed than the original items, which could suggest they are indeed more relevant to this developmental period or that they capture behaviours less specific to ADHD. 35 We did not see an increase at this age in either our benchmark DSM model, or our model of hyperactive-inattentive SDQ, suggesting it is most likely due to different measurement rather than a true increase in average scores. This highlights the complexity of longitudinal work spanning different developmental periods: consistent measures are needed for the robust investigation of ADHD traits across age, 44 but different measures are often needed to assess the same underlying construct in a developmentally appropriate manner. When items are adapted to be more developmentally appropriate, we recommend that the original measure is also included to allow for direct comparison.
Furthermore, caution is needed in interpreting average DSM percentage scores, given measurement variability both within and between cohorts (e.g. number of items, scoring of items, phrasing of items). These measurement differences meant that we were not able to use item counts, which would have enabled comparison between our general population trajectories and diagnostic thresholds. We converted DSM scores to the percentage of total possible scores to enable harmonization across cohorts and still observed greater cross-cohort variation for DSM percentage scores compared with hyperactive-inattentive SDQ scores. Triangulation of results with a benchmark model and the hyperactive-inattentive SDQ model enabled us to infer which changes might be due to measurement differences rather than true score change over time. Our findings highlight the importance of collecting consistent repeated measures in longitudinal cohorts to explore age-related change. This improves confidence in inferences from trajectory modelling and facilitates more effective meta-analyses across cohorts. 2 Finally, it is important to note that the contributing cohorts suffer from non-random attrition to varying degrees, 17,20,45 with those at the highest risk of psychopathology most likely to drop out. 45 MLMs are robust to bias from attrition when data are missing at random (i.e. observed variables predict dropout). Cohorts that have only collected measures later are therefore more likely to show bias because they do not have earlier observed scores. This could in part explain higher scores in the Pelotas cohort, as well as different sample compositions. Our inclusion of individuals with single observations will have reduced bias from attrition. Furthermore, results for individuals who had responded in early, middle and late age were very consistent with the main model, and results for E-Risk and Dunedin (where attrition was much lower) showed similar findings. However, we cannot rule out the possibility that the average reduction in ADHD traits over time could be in part due to non-random attrition.

Conclusions
There was an overall pattern of decreasing ADHD traits across childhood through to adulthood in the general population in three different countries (UK, Brazil, New Zealand). This is the most comprehensive investigation to date of the developmental course of ADHD traits in the general population. The pattern of non-linear change was influenced by several factors including rater, sex and cohort. Our trajectories, which span childhood to mid-life in the general population, are a valuable step towards determining what is developmentally typical. We also emphasize the need for greater consistency in measurement of ADHD traits both between and within cohorts, which will improve the interpretation of future longitudinal models that aim to combine data across cohorts.

Ethics approval
Ethical approval for the study was obtained from each cohort individually. For ALSPAC, the study was approved by the ALSPAC Ethics and Law Committee and the local research ethics committees. Informed consent for the use of data collected via questionnaires and clinics was obtained from participants following the recommendations of the ALSPAC Ethics and Law Committee at the time. TEDS and their consent procedures were approved by the King's College London Research Ethics Committee (ref: PNM/09/10-104). For E-Risk, the Joint South London and Maudsley and the Institute of Psychiatry Research Ethics Committee approved each phase of the study. Parents gave informed consent and twins gave assent between 5 and 12 years and then informed consent at age 18. For Pelotas, ethical approval for the study was obtained from the Ethics and Research Committee of the Faculty of Medicine of the Federal University of Pelotas. Informed consent was obtained from parents, and also cohort participants gave their consent when applicable. For Dunedin, the NZ-HDEC (Health and Disability Ethics Committee) approved the study and all study members provided written informed consent.

Data availability
The data underlying this article cannot be shared publicly. Researchers can apply for access to each of the cohorts.

Supplementary data
Supplementary data are available at IJE online.

Author contributions
A.T., K.T., E.S. and G.D.S. designed the study and obtained funding for the work. J.A.B., A.C., K.R., T.C.E., L.A.R., L.A., F.C.W., H.G., A.M.B.M. and T.E.M. provided datasets, cleaned variables and provided sample expertise. R.E.W. conducted the analysis and led the manuscript writing. L.R. and R.B. helped with analysis and script checking. T.C. helped with analysis and presenting results. K.T. planned the methodological approach and oversaw all analyses. All co-authors helped in interpreting the results and revising the manuscript. This publication is the work of the authors, but R.E.W. and K.T. will serve as guarantors for the contents of this paper.