Detection and attention for auditory, visual, and audiovisual speech in children with hearing loss

Susan Jerger, Markus Damian, Cassandra Karl, Hervé Abdi

Research output: Contribution to journal; Article (Academic Journal); peer-review


Efficient multisensory speech detection is critical for children who must quickly detect/encode a rapid stream of speech to participate in conversations and to have access to the audiovisual cues that underpin speech and language development, yet multisensory speech detection remains understudied in children with hearing loss (CHL). This research assessed detection, along with vigilant/goal-directed attention, for multisensory vs. uni-sensory speech in CHL vs. children with normal hearing (CNH).

Participants were 60 CHL who used hearing aids and communicated successfully aurally/orally and 60 age-matched CNH. Simple response times determined how quickly children could detect a pre-identified easy-to-hear stimulus (70 dB SPL, utterance “buh” presented in Auditory only (A), Visual only (V), or Audiovisual (AV) modes). The V mode formed two facial conditions: static vs. dynamic face. Faster detection for multisensory (AV) than uni-sensory (A or V) input indicates multisensory facilitation. We assessed mean responses as well as faster and slower responses, defined by the 1st and 3rd quartiles of the response-time distributions. Faster responses (1st quartile) were conceptualized as reflecting efficient detection with efficient vigilant/goal-directed attention, and slower responses (3rd quartile) as reflecting less efficient detection associated with attentional lapses. Lastly, we studied associations between these results and personal characteristics of CHL.
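The response-time measures described above can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names and the response times are hypothetical, and it assumes that "1st quartile" and "3rd quartile" refer to the quartile cut points of a response-time distribution, which the abstract does not spell out.

```python
from statistics import mean, quantiles


def summarize(rts_ms):
    """Return the mean and the 1st/3rd quartile cut points of response times (ms)."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, _median, q3 = quantiles(rts_ms, n=4, method="inclusive")
    return {"mean": mean(rts_ms), "q1": q1, "q3": q3}


def facilitation(unisensory_ms, multisensory_ms):
    """Uni-sensory minus multisensory time for each measure.

    A positive value means detection was faster with multisensory input.
    """
    uni = summarize(unisensory_ms)
    multi = summarize(multisensory_ms)
    return {measure: uni[measure] - multi[measure] for measure in uni}


# Hypothetical response times (ms) for one child; not data from the study.
a_rts = [420, 450, 470, 500, 530, 560, 610, 700]   # Auditory only (A)
av_rts = [390, 410, 440, 460, 490, 520, 570, 640]  # Audiovisual (AV)

result = facilitation(a_rts, av_rts)
print(result)
```

In this sketch, a positive difference at the 1st quartile would correspond to a benefit among the fastest responses, and a positive difference at the 3rd quartile to a benefit among the slower responses that the abstract associates with attentional lapses.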

Uni-sensory A vs. V Modes: Both Groups showed better detection and attention for A than V input. The A input more readily captured children's attention and minimized attentional lapses, which supports A-bound processing even by CHL who were processing low fidelity A input. CNH and CHL did not differ in ability to detect A input at conversational speech level.

Multisensory AV vs. A Modes: Both Groups showed better detection and attention for AV than A input. The advantage for AV input was a facial effect (both static and dynamic faces), a pattern suggesting that communication is a social interaction that is more than just words. Attention did not differ between Groups; detection was faster in CHL than CNH for AV input, but not for A input.

Associations Between Personal Characteristics/Degree of Hearing Loss of CHL and Results: CHL with the greatest deficits in detection of V input had the poorest word recognition skills, and CHL with the greatest reduction of attentional lapses from AV input had the poorest vocabulary skills. Both outcomes are consistent with the idea that CHL who are processing low fidelity A input depend disproportionately on V and AV input to learn to identify words and associate them with concepts. As CHL aged, attention to V input improved. Degree of HL did not influence results.

Understanding speech—a daily challenge for CHL—is a complex task that demands efficient detection of and attention to AV speech cues. Our results support the clinical importance of multisensory assessments in order to understand and advance spoken communication by CHL.
Original language: English
Number of pages: 13
Journal: Ear and Hearing
Early online date: 7 Oct 2019
Publication status: E-pub ahead of print - 7 Oct 2019

Structured keywords

  • Language
  • Developmental
  • Cognitive Science

