E-resources
Peer reviewed
Open access
-
Ellinas, Nikolaos; Christidou, Myrsini; Vioni, Alexandra; Sung, June Sig; Chalamandaris, Aimilios; Tsiakoulis, Pirros; Mastorocostas, Paris
Speech communication, January 2023, 2023-01-00, Volume: 146Journal Article
In this paper, we present a novel method for phoneme-level prosody control of F0 and duration using intuitive discrete labels. We propose an unsupervised prosodic clustering process which is used to discretize phoneme-level F0 and duration features from a multispeaker speech dataset. These features are fed as an input sequence of prosodic labels to a prosody encoder module which augments an autoregressive attention-based text-to-speech model. We utilize various methods in order to improve prosodic control range and coverage, such as augmentation, F0 normalization, balanced clustering for duration and speaker-independent clustering. The final model enables fine-grained phoneme-level prosody control for all speakers contained in the training set, while maintaining the speaker identity. Instead of relying on reference utterances for inference, we introduce a prior prosody encoder which learns the style of each speaker and enables speech synthesis without the requirement of reference audio. We also fine-tune the multispeaker model to unseen speakers with limited amounts of data, as a realistic application scenario and show that the prosody control capabilities are maintained, verifying that the speaker-independent prosodic clustering is effective. Experimental results show that the model has high output speech quality and that the proposed method allows efficient prosody control within each speaker’s range despite the variability that a multispeaker setting introduces. •Prosodic clustering for fine-grained phoneme-level prosody control.•Controllable end-to-end text-to-speech synthesis using intuitive discrete labels.•Multispeaker prosody control, with application to unseen speaker adaptation.
![loading ... loading ...](themes/default/img/ajax-loading.gif)
Shelf entry
Permalink
- URL:
Impact factor
Access to the JCR database is permitted only to users from Slovenia. Your current IP address is not on the list of IP addresses with access permission, and authentication with the relevant AAI accout is required.
Year | Impact factor | Edition | Category | Classification | ||||
---|---|---|---|---|---|---|---|---|
JCR | SNIP | JCR | SNIP | JCR | SNIP | JCR | SNIP |
Select the library membership card:
If the library membership card is not in the list,
add a new one.
DRS, in which the journal is indexed
Database name | Field | Year |
---|
Links to authors' personal bibliographies | Links to information on researchers in the SICRIS system |
---|
Source: Personal bibliographies
and: SICRIS
The material is available in full text. If you wish to order the material anyway, click the Continue button.