Adapting multilingual speech representation model for a new, underresourced language through multilingual fine-tuning and continued pretraining

Peer-reviewed, open access

Nowakowski, Karol; Ptaszynski, Michal; Murasaki, Kyoko; Nieuważny, Jagna

Information Processing & Management, March 2023, Volume 60, Issue 2

Journal Article

In recent years, neural models learned through self-supervised pretraining on large-scale multilingual text or speech data have exhibited promising results for underresourced languages, especially when a relatively large amount of data from related language(s) is available. While the technology has the potential to facilitate tasks carried out in language documentation projects, such as speech transcription, pretraining a multilingual model from scratch for every new language would be highly impractical. We investigate the possibility of adapting an existing multilingual wav2vec 2.0 model to a new language, focusing on actual fieldwork data from a critically endangered language: Ainu. Specifically, we (i) examine the feasibility of leveraging data from similar languages in fine-tuning as well; (ii) verify whether the model's performance can be improved by further pretraining on target language data. Our results show that continued pretraining is the most effective method to adapt a wav2vec 2.0 model to a new language and leads to a considerable reduction in error rates. Furthermore, we find that if a model pretrained on a related speech variety or an unrelated language with similar phonological characteristics is available, multilingual fine-tuning using additional data from that language can have a positive impact on speech recognition performance when there is very little labeled data in the target language.

• Downstream performance of a multilingual speech representation model on a new, underresourced language can be improved through multilingual fine-tuning and additional pretraining.
• Continued pretraining on target language data leads to substantially lower error rates in automatic speech transcription.
• Multilingual fine-tuning with additional data from a related or similar language helps when labeled target language data is scarce.
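The abstract reports improvements as lower error rates but does not say which metric is used. As an illustration only, the sketch below computes the two measures most commonly used to evaluate speech transcription, word error rate (WER) and character error rate (CER), both defined as edit distance divided by reference length. The function names and the example strings are hypothetical and are not taken from the article.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions and
    deletions needed to turn `ref` into `hyp`.
    """
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference tokens into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance normalised by the reference length."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def wer(reference, hypothesis):
    """Word error rate: tokens are whitespace-separated words."""
    return error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    """Character error rate: tokens are single characters."""
    return error_rate(list(reference), list(hypothesis))


if __name__ == "__main__":
    # Placeholder strings, not data from the study.
    print(wer("a b c d", "a x c"))  # one substitution + one deletion over 4 words
    print(cer("abc", "abd"))        # one substitution over 3 characters
```

Because both rates are normalised by the reference length, they can exceed 1.0 when the hypothesis contains many insertions.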
