Non-Parallel Whisper-to-Normal Speaking Style Conversion Using Auxiliary Classifier Variational Autoencoder

Peer-reviewed, open access

Seki, Shogo; Kameoka, Hirokazu; Kaneko, Takuhiro; Tanaka, Kou

IEEE Access, 2023, Volume 11

Journal Article

This paper is concerned with non-parallel whisper-to-normal speaking-style conversion (W2N-SC), which converts whispered speech into normal speech without using parallel training data. Most relevant to this task is voice conversion (VC), which converts one speaker's voice to another. However, the W2N-SC task differs from the regular VC task in three main respects. First, unlike normal speech, whispered speech contains little or no pitch information. Second, whispered speech usually has significantly less energy than normal speech and is therefore more susceptible to external noise. Third, in the actual usage scenario of W2N-SC, users may suddenly switch voice modes from whispered to normal speech, or vice versa, meaning that the speaking style of the input speech cannot be assumed in advance. To clarify whether existing VC techniques can successfully handle these task-specific concerns and how they should be modified to better address them, we consider a variational autoencoder (VAE)-based VC method as a baseline and examine what modifications to this method would be effective for the current task. Specifically, we study the effects of 1) a self-supervised training scheme called filling-in-frames (FIF); 2) data augmentation (DA) using noisy speech samples; and 3) an architecture that allows for any-to-many conversions. Through experimental evaluation of the W2N-SC and speaker conversion tasks, we confirmed that, especially in the W2N-SC task, the version incorporating the above modifications works better than the baseline VC model applied as is.
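The abstract names two input-side modifications, noise-based data augmentation and filling-in-frames, without giving their details. The following is a minimal illustrative sketch of what such preprocessing might look like, not the authors' implementation. It assumes that DA mixes a noise recording into the speech at a chosen signal-to-noise ratio, and that FIF hides a contiguous block of feature frames which a model is then trained to reconstruct. The function names `mix_at_snr` and `mask_frames`, their parameters, and the zero-fill masking are all hypothetical choices made for this example.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the mixture has the requested SNR in dB.

    Assumption for illustration: the abstract does not state how noisy
    samples are produced; additive mixing at a target SNR is one common way.
    """
    # Repeat or truncate the noise so it matches the speech length.
    noise = np.resize(noise, speech.shape)
    speech_power = np.mean(speech ** 2)
    # Guard against an all-zero noise clip.
    noise_power = max(np.mean(noise ** 2), 1e-12)
    # Choose scale so that speech_power / (scale**2 * noise_power)
    # equals 10 ** (snr_db / 10).
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise


def mask_frames(features, start, length):
    """Zero out a contiguous block of frames in a (frames, dims) array.

    Assumption for illustration: a filling-in-frames style scheme would feed
    the masked features to the model and ask it to restore the hidden frames.
    Returns the masked copy and a boolean mask over the frame axis.
    """
    masked = features.copy()
    mask = np.zeros(features.shape[0], dtype=bool)
    mask[start:start + length] = True
    masked[mask] = 0.0
    return masked, mask
```

In a training loop built on these assumptions, each speech sample would be passed through `mix_at_snr` with a randomly drawn noise clip and SNR, and its acoustic features through `mask_frames` with a random start position, before being given to the conversion model.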
