Purpose: Recent global, regional and country-level prevalence estimates for blindness and vision impairment will be important when designing future public health policies. The aim of this paper is to contribute to this discussion by estimating the productivity impact of known effective interventions to treat all preventable cases of vision impairment at the global, regional and country levels up to 2050. We also provide estimates of the potential reduction in the number of people with vision impairment, as well as averted vision-impaired years up to 2050.
Methods: We combined recent estimates of the prevalence of blindness, distance and near vision impairment with the World Bank's World Development Indicators (WDI) and estimated the global, regional and country-level productivity gains up to 2030, 2040 and 2050 from known effective interventions, primarily cataract surgery and treatment of uncorrected refractive errors. The magnitude of productivity gains relative to baseline depended on population size, estimated current and future prevalence of vision impairment, level of economic development, long-term wage growth, and long-term real interest rates.
Results: Globally, we estimate that the number of people affected by blindness could be reduced from an estimated 114.6 million to 58.3 million by 2050. This would be associated with over one billion blind life-years averted and US$ 984 billion in global productivity gains. These numbers are dwarfed by the impact of interventions to reduce the prevalence of moderate and severe vision impairment (MSVI; presenting acuity <20/60 to 20/400 in the better-seeing eye). We estimate that the number of people affected by MSVI could be reduced by 435.8 million people to 147.9 million by 2050. This reduction would translate to over 9 billion MSVI life-years avoided and US$ 17 trillion in productivity gains by 2050. While other causes of vision impairment (VI) cannot be eliminated completely with currently known effective treatments, low-cost interventions to eliminate VI from uncorrected presbyopia would avert 1.2 billion presbyopia life-years and achieve US$ 1.05 trillion in productivity gains by 2050. In total, the global productivity gains for all three categories are estimated to be US$ 19 trillion by 2050. East Asia makes up the greatest share of productivity gains due to the high number of people affected by VI and the region's continuing economic growth.
Conclusion: Implementation of currently known and effective treatments of avoidable blindness, MSVI and presbyopia would be expected to contribute significant productivity gains to the global economy at a fraction of the estimated costs to deliver them.
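The Methods above state that productivity gains depend on prevalence, level of economic development, long-term wage growth, and long-term real interest rates. One way these terms could interact is a present-value sum, sketched below. This is an illustration only and not the paper's model: the function name, the productivity-loss weight, and every input value are hypothetical placeholders.

```python
def discounted_productivity_gain(
    cases_averted_per_year,    # people no longer vision-impaired in each year t = 0, 1, ...
    annual_output_per_worker,  # hypothetical: output per worker in year 0 (US$)
    productivity_loss_weight,  # hypothetical: share of output lost to vision impairment
    wage_growth,               # long-term real wage growth rate per year
    discount_rate,             # long-term real interest rate per year
):
    """Present value of output regained by averting vision impairment."""
    total = 0.0
    for t, cases in enumerate(cases_averted_per_year):
        # Output per worker grows with wages over time.
        output_t = annual_output_per_worker * (1 + wage_growth) ** t
        # Output regained in year t by the people whose impairment was averted.
        gain_t = cases * productivity_loss_weight * output_t
        # Discount the year-t gain back to year 0.
        total += gain_t / (1 + discount_rate) ** t
    return total


# Hypothetical inputs, chosen only to exercise the function.
example_gain = discounted_productivity_gain(
    cases_averted_per_year=[100, 100, 100],
    annual_output_per_worker=1000.0,
    productivity_loss_weight=0.2,
    wage_growth=0.02,
    discount_rate=0.03,
)
```

When wage growth equals the discount rate, the two effects cancel and the result reduces to total cases averted times the loss weight times year-0 output. A higher interest rate lowers the present value of gains that arrive later.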
Prosodic features are important in achieving intelligibility, comprehensibility, and fluency in a second or foreign language (L2). However, research on the assessment of prosody as part of oral proficiency remains scarce. Moreover, the acoustic analysis of L2 prosody has often focused on fluency-related temporal measures, neglecting language-dependent stress features that can be quantified in terms of syllable prominence. Introducing the evaluation of prominence-related measures can be of use in developing both teaching and assessment of L2 speaking skills. In this study we compare temporal measures and syllable prominence estimates as predictors of prosodic proficiency in non-native speakers of English with respect to the speaker’s native language (L1).
The predictive power of temporal and prominence measures was evaluated for utterance-sized samples produced by language learners from four different L1 backgrounds: Czech, Slovak, Polish, and Hungarian. Firstly, the speech samples were assessed using the revised Common European Framework of Reference scale for prosodic features. The assessed speech samples were then analyzed to derive articulation rate and three fluency measures. Syllable-level prominence was estimated by a continuous wavelet transform analysis using combinations of F0, energy, and syllable duration.
The results show that the temporal measures serve as reliable predictors of prosodic proficiency in the L2, with prominence measures providing a small but significant improvement to prosodic proficiency predictions. The predictive power of the individual measures varies both quantitatively and qualitatively depending on the L1 of the speaker. We conclude that the possible effects of the speaker’s L1 on the production of L2 prosody, in terms of temporal features as well as syllable prominence, deserve more attention in applied research and in the development of teaching and assessment methods for spoken L2.
Chunking language has been proposed to be vital for comprehension, enabling the extraction of meaning from a continuous stream of speech. However, neurocognitive mechanisms of chunking are poorly understood. The present study investigated neural correlates of chunk boundaries intuitively identified by listeners in natural speech drawn from linguistic corpora using magneto- and electroencephalography (MEEG). In a behavioral experiment, subjects marked chunk boundaries in the excerpts intuitively, which revealed highly consistent chunk boundary markings across the subjects. We next recorded brain activity to investigate whether chunk boundaries with high and medium agreement rates elicit distinct evoked responses compared to non-boundaries. Pauses placed at chunk boundaries elicited a closure positive shift with the sources over bilateral auditory cortices. In contrast, pauses placed within a chunk were perceived as interruptions and elicited a biphasic emitted potential with sources located in the bilateral primary and non-primary auditory areas with right-hemispheric dominance, and in the right inferior frontal cortex. Furthermore, pauses placed at stronger boundaries elicited earlier and more prominent activation over the left hemisphere, suggesting that brain responses to chunk boundaries of natural speech can be modulated by the relative strength of different linguistic cues, such as syntactic structure and prosody.
This paper describes a hidden Markov model (HMM)-based speech synthesizer that utilizes glottal inverse filtering for generating natural-sounding synthetic speech. In the proposed method, speech is first decomposed into the glottal source signal and the model of the vocal tract filter through glottal inverse filtering, and thus parametrized into excitation and spectral features. The source and filter features are modeled individually in the framework of HMM and generated in the synthesis stage according to the text input. The glottal excitation is synthesized through interpolating and concatenating natural glottal flow pulses, and the excitation signal is further modified according to the spectrum of the desired voice source characteristics. Speech is synthesized by filtering the reconstructed source signal with the vocal tract filter. Experiments show that the proposed system is capable of generating natural-sounding speech, and the quality is clearly better compared to two HMM-based speech synthesis systems based on widely used vocoder techniques.
Finnmark North Sámi is a variety of the North Sámi language, an indigenous, endangered minority language spoken in the northernmost parts of Norway and Finland. The speakers of this language are bilingual, and regularly speak the majority language (Finnish or Norwegian) as well as their own North Sámi variety. In this paper we investigate possible influences of these majority languages on prosodic characteristics of Finnmark North Sámi, and associate them with prosodic patterns prevalent in the majority languages. We present a novel methodology that: (a) automatically finds the portions of speech (words) where the prosodic differences based on majority languages are most robustly manifested; and (b) analyzes the nature of these differences in terms of intonational patterns. For the first step, we trained convolutional WaveNet speech synthesis models on North Sámi speech material, modified to contain purely prosodic information, and used conditioning embeddings to find words with the greatest differences between the varieties. The subsequent exploratory analysis suggests that the differences in intonational patterns between the two Finnmark North Sámi varieties are not manifested uniformly across word types (based on part-of-speech category). Instead, we argue that the differences reflect phrase-level prosodic characteristics of the majority languages.
• Synthesis of speech on a wide vocal effort continuum and its perception is studied.
• Breathy, normal, and Lombard speech are recorded, analyzed, and synthesized.
• Speech is evaluated in silence, and in moderate and extreme street noise.
• Intelligibility, quality, and suitability of speech were evaluated.
• Results show that intelligibility and suitability are comparable to natural speech.
This paper studies the synthesis of speech over a wide vocal effort continuum and its perception in the presence of noise. Three types of speech are recorded and studied along the continuum: breathy, normal, and Lombard speech. Corresponding synthetic voices are created by training and adapting the statistical parametric speech synthesis system GlottHMM. Natural and synthetic speech along the continuum is assessed in listening tests that evaluate the intelligibility, quality, and suitability of speech in three different realistic multichannel noise conditions: silence, moderate street noise, and extreme street noise. The evaluation results show that the synthesized voices with varying vocal effort are rated similarly to their natural counterparts both in terms of intelligibility and suitability.
This paper introduces a cost-benefit analysis for future nuclear weapon possession using natural numbers in a simple discrete time model. In essence, I focus on the expected values (probability multiplied by magnitude of detonations) of deliberate and accidental nuclear wars among unitary states. I take the United Kingdom's current Trident renewal program as my case study. I seek to establish the expected value of a nuclear attack on the UK in the absence of nuclear weapons necessary to make the possession of nuclear weapons worthwhile. I find the net value of nuclear weapons to be negative even under generous parametric values in their favor. I also discuss how our cognitive biases may affect the interpretation of the results. The analysis and discussion are limited to the UK, but the implications are likely to apply to other small nuclear weapon states as well.
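The expected-value reasoning in this abstract (probability multiplied by magnitude, accumulated over discrete periods) can be made concrete with a small break-even calculation. The sketch below is a simplified illustration and not the paper's model. It assumes a constant per-period probability and a simple linear sum over periods, and all function names, parameters, and numbers are hypothetical.

```python
def expected_loss(probability_per_period, magnitude, periods):
    """Expected loss over a horizon: probability x magnitude, summed over periods.

    Assumes a constant per-period probability and counts every period
    independently (a linear approximation, labeled as an assumption).
    """
    return probability_per_period * magnitude * periods


def break_even_attack_probability(
    p_war_with_weapons,          # hypothetical: per-period probability of deliberate or accidental war
    magnitude_with_weapons,      # hypothetical: loss if such a war occurs
    magnitude_without_weapons,   # hypothetical: loss from an attack on a state without weapons
    periods,                     # number of discrete time steps considered
    program_cost=0.0,            # hypothetical: cost of possession, in the same units as the losses
):
    """Per-period attack probability (without weapons) at which possession breaks even.

    Possession is worthwhile in this sketch only if the real attack
    probability without weapons exceeds the returned value.
    """
    loss_with = expected_loss(p_war_with_weapons, magnitude_with_weapons, periods)
    return (loss_with + program_cost) / (magnitude_without_weapons * periods)


# Hypothetical numbers, for illustration only.
threshold = break_even_attack_probability(
    p_war_with_weapons=0.001,
    magnitude_with_weapons=100.0,
    magnitude_without_weapons=50.0,
    periods=40,
)
```

Framing the question as a threshold mirrors the abstract's wording: it asks how large the expected value of an attack in the absence of weapons would need to be for possession to pay off.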
• Phase perception of the glottal excitation is studied.
• Source-filter vocoder is used to modify pitch-synchronous excitation phase pattern.
• Natural-phase, zero-phase, and random-phase excitations are compared.
• Various speakers and speaking styles are utilized in subjective listening tests.
• Results show that using natural phase information results in improved speech quality.
While the characteristics of the amplitude spectrum of the voiced excitation have been studied widely both in natural and synthetic speech, the role of the excitation phase has remained less explored. This contradicts findings observed in sound perception studies indicating that humans are not phase deaf. Especially in speech synthesis, phase information is often omitted for simplicity. This study investigates the impact of phase information of the excitation signal of voiced speech and its relevance in statistical parametric speech synthesis. The experiments in the study involve, firstly, converting the pitch-synchronously computed original phase spectra of the excitation waveforms (either glottal flow waveforms or residuals) to either zero phase, cyclostationary random phase, or random phase. Secondly, the quality of synthetic speech in each case is compared in subjective listening tests to the corresponding signal excited with the original, natural phase. Experiments are conducted with natural, vocoded, and synthetic speech using voice material from various speakers with varying speaking styles, such as breathy, normal, and Lombard speech. The results indicate that the phase spectrum of the voiced excitation has a perceptually relevant effect in natural, vocoded, and synthetic speech, and utilizing the phase information in speech synthesis leads to improved speech quality.
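The phase conversions described above keep the amplitude spectrum of an excitation pulse and replace only its phase spectrum. A minimal sketch of the zero-phase and random-phase cases follows. It is not the study's vocoder: the function names are hypothetical, the cyclostationary random-phase variant is not covered, and a naive DFT is used only to keep the example dependency-free (a library FFT would be the practical choice).

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); fine for one short pulse."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]


def idft_real(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * math.pi * m * k / n)
                for m in range(n)).real / n
            for k in range(n)]


def replace_phase(pulse, mode, seed=0):
    """Keep the amplitude spectrum of `pulse`; set its phase to zero or random."""
    if mode not in ("zero", "random"):
        raise ValueError("mode must be 'zero' or 'random'")
    spectrum = dft(pulse)
    n = len(spectrum)
    rng = random.Random(seed)
    new_spectrum = [0j] * n
    for m in range(n // 2 + 1):
        magnitude = abs(spectrum[m])
        is_nyquist = (n % 2 == 0 and m == n // 2)
        if mode == "zero" or m == 0 or is_nyquist:
            phase = 0.0  # DC and Nyquist bins must stay real for a real signal
        else:
            phase = rng.uniform(-math.pi, math.pi)
        new_spectrum[m] = cmath.rect(magnitude, phase)
        if 0 < m < n - m:
            # Mirror with the complex conjugate so the output is real-valued.
            new_spectrum[n - m] = new_spectrum[m].conjugate()
    return idft_real(new_spectrum)


# Hypothetical asymmetric pulse standing in for one pitch-synchronous excitation frame.
example_pulse = [0.0, 1.0, 3.0, 2.0, 0.5, 0.0, 0.0, 0.0]
zero_phase_pulse = replace_phase(example_pulse, "zero")
random_phase_pulse = replace_phase(example_pulse, "random", seed=1)
```

Both outputs have the same amplitude spectrum as the input, so any perceived difference between them in a listening test would come from phase alone.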
• We introduce a wavelet based representation system for speech prosody.
• Emergent hierarchy from f0, intensity and duration.
• Prominences and boundaries are represented in one framework.
• System allows for efficient analysis and annotation of prosodic events.
• The unsupervised prosodic labelling scheme is comparable with supervised methods.
Prominences and boundaries are the essential constituents of prosodic structure in speech. They provide the means to chunk the speech stream into linguistically relevant units by providing them with relative saliences and demarcating them within utterance structures. Prominences and boundaries have been widely used both in basic research on prosody and in text-to-speech synthesis. However, there are no representation schemes that allow both estimating and modelling them in a unified fashion. Here we present an unsupervised unified account for estimating and representing prosodic prominences and boundaries using a scale-space analysis based on continuous wavelet transform. The methods are evaluated and compared to earlier work using the Boston University Radio News corpus. The results show that the proposed method is comparable with the best published supervised annotation methods.
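The scale-space idea behind a continuous wavelet transform can be illustrated on a single one-dimensional contour. The sketch below is not the authors' method. It assumes a Mexican-hat (Ricker) wavelet, a hand-picked set of scales, and a simple rule that sums the responses across scales and treats positive local maxima as prominence candidates. All names and parameter values are hypothetical, and the input is a synthetic contour, not a combination of f0, intensity and duration.

```python
import math


def ricker(width, scale):
    """Mexican-hat wavelet sampled at `width` points for a given scale."""
    centre = (width - 1) / 2
    norm = 2 / (math.sqrt(3 * scale) * math.pi ** 0.25)
    samples = []
    for i in range(width):
        t = (i - centre) / scale
        samples.append(norm * (1 - t * t) * math.exp(-t * t / 2))
    return samples


def cwt(signal, scales):
    """Continuous wavelet transform: one row of coefficients per scale."""
    n = len(signal)
    rows = []
    for scale in scales:
        width = int(10 * scale) | 1  # odd length so the wavelet has a centre sample
        wavelet = ricker(width, scale)
        half = width // 2
        row = []
        for i in range(n):
            acc = 0.0
            for j, w in enumerate(wavelet):
                k = i + j - half
                if 0 <= k < n:  # samples outside the signal are treated as zero
                    acc += signal[k] * w
            row.append(acc)
        rows.append(row)
    return rows


def prominence_candidates(signal, scales):
    """Sum coefficients across scales and return positive local maxima."""
    rows = cwt(signal, scales)
    combined = [sum(row[i] for row in rows) for i in range(len(signal))]
    peaks = [i for i in range(1, len(combined) - 1)
             if combined[i] > 0
             and combined[i] > combined[i - 1]
             and combined[i] >= combined[i + 1]]
    return combined, peaks


# Synthetic contour with a weaker bump at index 30 and a stronger one at index 70.
contour = [math.exp(-((i - 30) ** 2) / 32) + 2 * math.exp(-((i - 70) ** 2) / 32)
           for i in range(100)]
combined, peaks = prominence_candidates(contour, scales=[2, 4, 8])
```

In this toy setting, the value of `combined` at each peak gives a relative salience, so the stronger bump receives the larger score. Boundaries would need a complementary analysis (for example of minima), which is not sketched here.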
• A new wavelet-based method for assessing prosodic proficiency in L2 is proposed.
• Signal-based syllable prominence is a reliable predictor of prosodic proficiency.
• Wavelet-based analysis of prominence correlates significantly with expert ratings.
• The method has potential for fully automatic L2 proficiency assessment.
Prosodic characteristics, such as lexical and phrasal stress, are among the most challenging features for second language (L2) speakers to learn. The ability to quantify language learners’ proficiency in terms of prosody can be of use to language teachers and improve the assessment of L2 speaking skills. Automatic assessment, however, requires reliable automatic analyses of prosodic features that allow for the comparison between the productions of L2 speech and reference samples. In this paper we investigate whether signal-based syllable prominence can be used to predict the prosodic competence of Finnish learners of Swedish. Syllable-level prominence was estimated for 180 L2 and 45 native (L1) utterances by a continuous wavelet transform analysis using combinations of f0, energy, and duration. The L2 utterances were graded by four expert assessors using the revised CEFR scale for prosodic features. Correlations of prominence estimates for L2 utterances with estimates for L1 utterances and linguistic stress patterns were used as a measure of prosodic proficiency of the L2 speakers. The results show that the level of agreement conceptualized in this way correlates significantly with the assessments of expert raters, providing strong support for the use of the wavelet-based prominence estimation techniques in computer-assisted assessment of L2 speaking skills.
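The proficiency measure described above rests on correlating per-syllable prominence estimates for a learner's utterance with those for a native reference. A minimal sketch of that comparison follows, assuming the plain Pearson coefficient (the abstract does not name the specific coefficient used). The function name and all prominence values are hypothetical and do not come from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-syllable prominence estimates for the same four-syllable utterance.
l1_reference = [0.2, 1.0, 0.3, 0.8]   # native speaker
learner_a = [0.3, 0.9, 0.2, 0.7]      # stress pattern close to the reference
learner_b = [0.9, 0.3, 0.8, 0.2]      # stress placed on the wrong syllables

agreement_a = pearson(l1_reference, learner_a)
agreement_b = pearson(l1_reference, learner_b)
```

A higher coefficient indicates that the learner places prominence on the same syllables as the reference. Because correlation ignores overall scale, a learner who speaks more quietly but with the same relative pattern is not penalized.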