Crossing language identification: Multilingual ASR framework based on semantic dataset creation & Wav2Vec 2.0

Peer reviewed Open access

Anidjar, Or Haim; Yozevitch, Roi; Bigon, Nerya; Abdalla, Najeeb; Myara, Benjamin; Marbel, Revital

Machine learning with applications, 09/2023, Volume: 13

Journal Article

This study proposes a methodology to improve the performance of multilingual Automatic Speech Recognition (ASR) systems by exploiting the high semantic similarity between sentences across different languages and by eliminating the requirement for Language Identification (LID). To achieve this, special bilingual datasets were created from the Mozilla Common Voice datasets in Spanish, Russian, and Portuguese. The process involves computing sentence embeddings using Language-agnostic BERT and selecting sentence pairs based on high and low cosine similarity. Subsequently, we train the Wav2Vec 2.0 XLSR-53 model on these datasets and assess its performance using the Character Error Rate (CER) and Word Error Rate (WER) metrics. The experimental results indicate that models trained on high-similarity samples consistently outperform their low-similarity counterparts, which underlines the importance of selecting data with high semantic similarity for precise and dependable ASR performance. Furthermore, eliminating LID yields a simpler system with reduced computational costs and the capacity for real-time text output. The findings of this research offer insights for the development of more efficient and accurate multilingual ASR systems, particularly in real-time and on-device applications.

• MLASR data creation without LID to overcome accuracy loss & performance degradation.
• Handle various grammar rules & syntax in different languages for cross-lingual ASR.
• Semantic dataset creation incorporated into Wav2Vec prevents language output errors.
• Cope with real-life bilingual datasets for low-data languages using data augmentation.
• Improve CER for languages with limited datasets and similar alphabetic characters.
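The two computational steps named in the abstract, ranking cross-language sentence pairs by cosine similarity and scoring transcripts with WER and CER, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the embeddings are assumed to be precomputed (for example by Language-agnostic BERT), the two-dimensional toy vectors and the function names `rank_pairs`, `wer`, and `cer` are hypothetical, and the rule the paper uses to split pairs into high- and low-similarity sets is not given in this record.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_pairs(emb_a, emb_b):
    """Score every cross-language pair (i, j), highest similarity first."""
    scored = [
        (cosine_similarity(u, v), i, j)
        for i, u in enumerate(emb_a)
        for j, v in enumerate(emb_b)
    ]
    return sorted(scored, reverse=True)


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over reference length."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance over reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Toy embeddings standing in for real sentence embeddings (assumption).
emb_es = [[1.0, 0.0], [0.0, 1.0]]
emb_pt = [[0.9, 0.1], [-1.0, 0.0]]

ranked = rank_pairs(emb_es, emb_pt)
high_similarity = ranked[:1]   # hypothetical cut-off: top of the ranking
low_similarity = ranked[-1:]   # hypothetical cut-off: bottom of the ranking
```

With real data, the ranked list would be cut at whatever thresholds define the high- and low-similarity training sets, and `wer` and `cer` would be applied to each model's transcripts against the reference text.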
