When listening to two competing speakers, normal-hearing (NH) listeners can take advantage of voice differences between the speakers. Users of cochlear implants (CIs) have difficulty in perceiving speech on speech. Previous literature has indicated that sensitivity to voice pitch (related to fundamental frequency, F0) is poor among implant users, while sensitivity to vocal-tract length (VTL; related to the height of the speaker and formant frequencies), the other principal voice characteristic, has not been directly investigated in CIs. A few recent studies evaluated F0 and VTL perception indirectly, through voice gender categorization, which relies on perception of both voice cues. These studies revealed that, contrary to prior literature, CI users seem to rely exclusively on F0, and not on VTL, to perform this task. The objective of the present study was to directly and systematically assess raw sensitivity to F0 and VTL differences in CI users in order to define the extent of their deficit in voice perception.
The just-noticeable differences (JNDs) for F0 and VTL were measured in 11 CI listeners using triplets of consonant-vowel syllables in an adaptive three-alternative forced-choice (3AFC) task.
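To make the adaptive procedure concrete, the sketch below implements a generic 2-down/1-up staircase in Python, a common rule that converges on roughly 70.7% correct. The study's actual rule, step sizes, and stopping criterion are not given here, so every parameter in this sketch is an illustrative assumption.

```python
# Hypothetical 2-down/1-up adaptive staircase for estimating a JND.
# All parameters (start value, step sizes, number of reversals) are
# illustrative assumptions, not the study's actual settings.
import random

def staircase_jnd(respond, start=12.0, step=2.0, min_step=0.25, n_reversals=8):
    """respond(delta) -> True if the listener picks the odd interval out."""
    delta, n_correct, direction, reversals = start, 0, 0, []
    while len(reversals) < n_reversals:
        if respond(delta):
            n_correct += 1
            if n_correct < 2:            # need 2 correct in a row to go down
                continue
            n_correct, move = 0, -1
        else:
            n_correct, move = 0, +1      # any error steps up
        if direction and move != direction:
            reversals.append(delta)      # direction change = reversal
            step = max(step / 2, min_step)
        direction = move
        delta = max(delta + move * step, 0.01)
    return sum(reversals[-4:]) / 4       # JND: mean of the last 4 reversals

# Toy listener: reliably hears differences above 7 st, guesses (1 in 3) below.
print(staircase_jnd(lambda d: d > 7 or random.random() < 1 / 3))
```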
The results showed that while NH listeners had average JNDs of 1.95 and 1.73 semitones (st) for F0 and VTL, respectively, CI listeners showed JNDs of 9.19 and 7.19 st. These JNDs correspond to differences of 70% in F0 and 52% in VTL. For comparison to the natural range of voices in the population, the F0 JND in CIs remains smaller than the typical male-female F0 difference. However, the average VTL JND in CIs is about twice as large as the typical male-female VTL difference.
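The semitone values above map onto percent differences through the equal-tempered ratio 2^(st/12); a quick check of the reported numbers (this is the standard conversion, not study-specific code):

```python
# Semitone difference -> percent difference via the ratio 2 ** (st / 12).
def st_to_percent(st):
    return (2 ** (st / 12) - 1) * 100

print(f"{st_to_percent(9.19):.0f}%")   # 70%  <- F0 JND of 9.19 st
print(f"{st_to_percent(7.19):.1f}%")   # 51.5%, i.e., ~52% <- VTL JND of 7.19 st
```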
These findings thus directly confirm that CI listeners do not have sufficient access to VTL cues, likely as a result of limited spectral resolution, and hence that CI listeners' voice perception deficit goes beyond poor perception of F0. These results provide a potential common explanation not only for a number of deficits observed in CI listeners, such as in voice identification and gender categorization, but also for their difficulty with competing speech perception.
Evidence for transfer of musical training to better perception of speech in noise has been mixed. Unlike speech in noise, speech-on-speech perception draws on many of the skills that musical training improves, such as pitch perception and stream segregation, as well as on higher-level auditory cognitive functions, such as attention. Indeed, although a few non-musicians performed as well as musicians, at the group level there was a strong musician benefit for speech perception in a speech masker. This benefit does not seem to result from better voice processing and could instead be related to better stream segregation or enhanced cognitive functions.
Speech is crucial for communication in everyday life. Speech-brain entrainment, the alignment of neural activity to the slow temporal fluctuations (envelope) of acoustic speech input, is a ubiquitous element of current theories of speech processing. Associations between speech-brain entrainment and the acoustic speech signal, the listening task, and speech intelligibility have been observed repeatedly. However, a methodological bottleneck has so far prevented clarifying whether speech-brain entrainment contributes functionally to (i.e., causes) speech intelligibility or is merely an epiphenomenon of it. To address this long-standing issue, we experimentally manipulated speech-brain entrainment without concomitant acoustic and task-related variations, using a brain stimulation approach that enables modulating listeners’ neural activity with transcranial currents carrying speech-envelope information. Results from two experiments involving a cocktail-party-like scenario and a listening situation devoid of aural speech-amplitude envelope input reveal consistent effects on listeners’ speech-recognition performance, demonstrating a causal role of speech-brain entrainment in speech intelligibility. Our findings imply that speech-brain entrainment is critical for auditory speech comprehension and suggest that transcranial stimulation with speech-envelope-shaped currents can be utilized to modulate speech comprehension in impaired listening conditions.
• Transcranial stimulation with speech-shaped currents influences auditory processing
• Speech-brain entrainment modulates speech intelligibility
• Speech-brain entrainment and speech intelligibility interact reciprocally
Riecke et al. study how humans can recognize speech. Using electric brain stimulation, they find that synchronization of ongoing brain activity with the rhythm of auditory speech modulates the intelligibility of this speech. This implies that the brain can employ its ongoing temporal activity as a critical instrument in speech recognition.
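To make the stimulation approach concrete, the sketch below shows one plausible way to derive a speech-envelope-shaped current waveform from a speech recording (Hilbert envelope followed by low-pass filtering). The cutoff frequency and the normalization are illustrative assumptions, not the study's exact pipeline.

```python
# Sketch: derive a speech-envelope-shaped stimulation waveform.
# Cutoff and normalization are illustrative assumptions.
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def envelope_current(speech, fs, cutoff=8.0):
    env = np.abs(hilbert(speech))                        # amplitude envelope
    sos = butter(2, cutoff, btype="low", fs=fs, output="sos")
    env = sosfiltfilt(sos, env)                          # keep slow fluctuations
    env = env - env.mean()                               # zero mean for tACS
    return env / np.max(np.abs(env))                     # unit range; the
                                                         # device sets actual mA

fs = 16000
current = envelope_current(np.random.randn(fs), fs)      # stand-in "speech"
```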
The brain, using expectations, linguistic knowledge, and context, can perceptually restore inaudible portions of speech. Such top-down repair is thought to enhance speech intelligibility in noisy environments. Hearing-impaired listeners with cochlear implants commonly complain about not understanding speech in noise. We hypothesized that the degradations in the bottom-up speech signals due to the implant signal processing may have a negative effect on the top-down repair mechanisms, which could partially be responsible for this complaint. To test this hypothesis, phonemic restoration of interrupted sentences was measured in young normal-hearing listeners using a noise-band vocoder simulation of implant processing. Decreasing the spectral resolution (by reducing the number of vocoder processing channels from 32 to 4) systematically degraded the speech stimuli. Supporting the hypothesis, the size of the restoration benefit varied as a function of spectral resolution. A significant benefit was observed only at the highest spectral resolution of 32 channels. With eight channels, which resembles the resolution available to most implant users, there was no significant restoration effect. Combined electric–acoustic hearing has previously been shown to provide better intelligibility of speech in adverse listening environments. In a second configuration, combined electric–acoustic hearing was therefore simulated by adding low-pass-filtered acoustic speech to the vocoder processing. There was a slight improvement in phonemic restoration compared to the first configuration; the restoration benefit was now observed at spectral resolutions of both 16 and 32 channels. However, restoration was still not observed at lower spectral resolutions (four or eight channels). Overall, the findings imply that degradations in the bottom-up signals alone (such as occur in cochlear implants) may reduce the top-down restoration of speech.
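For reference, a noise-band vocoder of the kind used above can be sketched as follows. Greenwood-spaced analysis bands and rectify-then-low-pass envelope extraction are common choices in CI simulations; the study's exact corner frequencies and filter orders may differ.

```python
# Sketch of a noise-band vocoder CI simulation (assumed parameters).
import numpy as np
from scipy.signal import butter, sosfiltfilt

def greenwood_edges(n_channels, f_lo=150.0, f_hi=7000.0):
    """Channel edges spaced evenly along Greenwood's (1990) cochlear map."""
    place = lambda f: np.log10(f / 165.4 + 0.88) / 2.1
    freq = lambda x: 165.4 * (10 ** (2.1 * x) - 0.88)
    return freq(np.linspace(place(f_lo), place(f_hi), n_channels + 1))

def noise_vocoder(signal, fs, n_channels=8, env_cutoff=160.0):
    edges = greenwood_edges(n_channels)
    env_sos = butter(2, env_cutoff, btype="low", fs=fs, output="sos")
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, signal)             # analysis band
        env = np.clip(sosfiltfilt(env_sos, np.abs(band)), 0.0, None)
        carrier = sosfiltfilt(band_sos, np.random.randn(len(signal)))
        out += env * carrier                             # modulated noise band
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    return out * rms(signal) / max(rms(out), 1e-12)      # match input level

fs = 16000
t = np.arange(fs) / fs
vocoded = noise_vocoder(np.sin(2 * np.pi * 440 * t), fs, n_channels=8)
```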
Perception of voice characteristics allows normal-hearing listeners to identify the gender of a speaker and to better segregate speakers from each other in cocktail-party situations. This benefit is largely driven by the perception of two vocal characteristics of the speaker: the fundamental frequency (F0) and the vocal-tract length (VTL). Previous studies have suggested that cochlear implant (CI) users have difficulties in perceiving these cues. The aim of the present study was to investigate possible causes for this limited sensitivity to VTL differences in CI users. Different acoustic simulations of CI stimulation were implemented to characterize the role of spectral resolution in VTL perception, both in terms of the number of channels and the amount of channel interaction. The results indicate that with 12 channels, channel interaction caused by current spread is likely to prevent CI users from perceiving the VTL differences typically found between male and female speakers.
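One common way to realize such a channel-interaction manipulation, sketched below, is to keep the analysis bands fixed while giving the noise carriers shallower filter slopes, so that each channel's output spreads spectrally into its neighbors. The specific filter orders here are illustrative assumptions, not the study's actual settings.

```python
# Sketch: shallower carrier-filter slopes as a proxy for current spread.
# Lower Butterworth order = shallower slopes = more overlap between
# channels, i.e., more simulated channel interaction.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def band_noise(n_samples, fs, lo, hi, order):
    sos = butter(order, [lo, hi], btype="band", fs=fs, output="sos")
    return sosfiltfilt(sos, np.random.randn(n_samples))

fs = 16000
sharp = band_noise(fs, fs, 500.0, 700.0, order=6)      # little interaction
shallow = band_noise(fs, fs, 500.0, 700.0, order=1)    # strong interaction
```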
Children's ability to distinguish speakers' voices continues to develop throughout childhood, yet it remains unclear how children's sensitivity to voice cues, such as differences in speakers' gender, develops over time. This so-called voice gender is primarily characterized by speakers' mean fundamental frequency (F0), related to glottal pulse rate, and vocal-tract length (VTL), related to speakers' size. Here we show that children's acquisition of adult-like performance for discrimination, a lower-order perceptual task, and categorization, a higher-order cognitive task, differs across voice gender cues. Children's discrimination was adult-like around the age of 8 for VTL but still differed from adults at the age of 12 for F0. Children's perceptual weighting of F0 in gender categorization was adult-like around the age of 6, whereas their weighting of VTL became adult-like around the age of 10. Children's discrimination and weighting of F0 and VTL were only correlated for 4- to 6-year-olds. Hence, the development of children's discrimination and weighting of voice gender cues is dissociated, i.e., adult-like performance for F0 and VTL is acquired at different rates and does not seem to be closely related. The different developmental patterns for auditory discrimination and categorization highlight the complexity of the relationship between perceptual and cognitive mechanisms of voice perception.
This study compares two response-time measures of listening effort that can be combined with a clinical speech test for a more comprehensive evaluation of the total listening experience: verbal response times to auditory stimuli (RTs(aud)) and response times to a visual task (RTs(vis)) in a dual-task paradigm. The listening task was presented in five masker conditions: no noise, and two types of noise at two fixed intelligibility levels. Both the RTs(aud) and the RTs(vis) showed effects of noise. However, only the RTs(aud) showed an effect of intelligibility. Because of its simplicity in implementation, RTs(aud) may be a useful effort measure for clinical applications.
Purpose: The current study investigates how individual differences in cochlear implant (CI) users' sensitivity to word-nonword differences, reflecting lexical uncertainty, relate to their reliance on sentential context for lexical access in processing continuous speech. Method: Fifteen CI users and 14 normal-hearing (NH) controls participated in an auditory lexical decision task (Experiment 1) and a visual-world paradigm task (Experiment 2). Experiment 1 tested participants' reliance on lexical statistics, and Experiment 2 studied how sentential context affects the time course and patterns of lexical competition leading to lexical access. Results: In Experiment 1, CI users had lower accuracy scores and longer reaction times than NH listeners, particularly for nonwords. In Experiment 2, CI users' lexical competition patterns were, on average, similar to those of NH listeners, but the patterns of individual CI users varied greatly. Individual CI users' word-nonword sensitivity (Experiment 1) explained differences in the reliance on sentential context to resolve lexical competition, whereas clinical speech perception scores explained competition with phonologically related words. Conclusions: The general analysis of CI users' lexical competition patterns showed merely quantitative differences with NH listeners in the time course of lexical competition, but our additional analysis revealed more qualitative differences in CI users' strategies to process speech. Individuals' word-nonword sensitivity explained different parts of individual variability than clinical speech perception scores. These results stress, particularly for heterogeneous clinical populations such as CI users, the importance of investigating individual differences in addition to group averages, as they can be informative for clinical rehabilitation.
Assessing effort in speech comprehension for hearing-impaired (HI) listeners is important, as effortful processing of speech can limit their hearing rehabilitation. We examined the capacity of pupil dilation to accommodate the heterogeneity present within clinical populations by studying lexical access in users with sensorineural hearing loss who perceive speech via cochlear implants (CIs). We compared the pupillary responses of 15 experienced CI users and 14 age-matched normal-hearing (NH) controls during auditory lexical decision. A growth curve analysis was applied to compare the responses between the groups. NH listeners showed a coherent pattern of pupil dilation that reflected the task demands of the experimental manipulation and a homogeneous time course of dilation. CI listeners showed more variability in the morphology of their pupil dilation curves, potentially reflecting variable sources of effort across individuals. In follow-up analyses, we examined how speech perception, a task that relies on multiple stages of perceptual analysis, poses multiple sources of increased effort for HI listeners, such that we might not be measuring the same source of effort for HI as for NH listeners. We argue that interindividual variability among HI listeners can be clinically meaningful, attesting not only the magnitude but also the locus of increased effort. Understanding individual variations in effort requires experimental paradigms that (a) differentiate the task demands during speech comprehension, (b) capture pupil dilation in its time course for individual listeners, and (c) investigate the range of individual variability present within clinical and NH populations.
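For readers unfamiliar with growth curve analysis, the sketch below fits orthogonal polynomial time terms in a mixed-effects model on toy pupil data. The data, variable names, and model formula are illustrative assumptions, not the study's actual specification.

```python
# Toy growth curve analysis of pupil time courses (statsmodels).
# Data and model are illustrative assumptions only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for i in range(12):
    subject, group = f"s{i}", ("CI" if i < 6 else "NH")
    for t in np.linspace(0.0, 3.0, 30):                # seconds after onset
        pupil = 0.2 * np.sin(np.pi * t / 3)            # toy dilation curve
        pupil += (0.05 * t if group == "CI" else 0.0)  # toy group difference
        rows.append((subject, group, t, pupil + rng.normal(0, 0.02)))
df = pd.DataFrame(rows, columns=["subject", "group", "time", "pupil"])

# Orthogonal linear and quadratic time terms via QR decomposition.
Q, _ = np.linalg.qr(np.vander(df["time"], 3, increasing=True).astype(float))
df["ot1"], df["ot2"] = Q[:, 1], Q[:, 2]

# Curve shape (ot1, ot2) by group, with a random intercept per subject.
fit = smf.mixedlm("pupil ~ (ot1 + ot2) * group", df, groups=df["subject"]).fit()
print(fit.summary())
```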