E-resources
-
Saito, Yuki; Takamichi, Shinnosuke; Saruwatari, Hiroshi
IEEE/ACM transactions on audio, speech, and language processing, 2018-Jan., 2018-1-00, Volume: 26, Issue: 1Journal Article
A method for statistical parametric speech synthesis incorporating generative adversarial networks (GANs) is proposed. Although powerful deep neural networks techniques can be applied to artificially synthesize speech waveform, the synthetic speech quality is low compared with that of natural speech. One of the issues causing the quality degradation is an oversmoothing effect often observed in the generated speech parameters. A GAN introduced in this paper consists of two neural networks: a discriminator to distinguish natural and generated samples, and a generator to deceive the discriminator. In the proposed framework incorporating the GANs, the discriminator is trained to distinguish natural and generated speech parameters, while the acoustic models are trained to minimize the weighted sum of the conventional minimum generation loss and an adversarial loss for deceiving the discriminator. Since the objective of the GANs is to minimize the divergence (i.e., distribution difference) between the natural and generated speech parameters, the proposed method effectively alleviates the oversmoothing effect on the generated speech parameters. We evaluated the effectiveness for text-to-speech and voice conversion, and found that the proposed method can generate more natural spectral parameters and F 0 than conventional minimum generation error training algorithm regardless of its hyperparameter settings. Furthermore, we investigated the effect of the divergence of various GANs, and found that a Wasserstein GAN minimizing the Earth-Mover's distance works the best in terms of improving the synthetic speech quality.
![loading ... loading ...](themes/default/img/ajax-loading.gif)
Shelf entry
Permalink
- URL:
Impact factor
Access to the JCR database is permitted only to users from Slovenia. Your current IP address is not on the list of IP addresses with access permission, and authentication with the relevant AAI accout is required.
Year | Impact factor | Edition | Category | Classification | ||||
---|---|---|---|---|---|---|---|---|
JCR | SNIP | JCR | SNIP | JCR | SNIP | JCR | SNIP |
Select the library membership card:
If the library membership card is not in the list,
add a new one.
DRS, in which the journal is indexed
Database name | Field | Year |
---|
Links to authors' personal bibliographies | Links to information on researchers in the SICRIS system |
---|
Source: Personal bibliographies
and: SICRIS
The material is available in full text. If you wish to order the material anyway, click the Continue button.