Search results

hits: 206
11.
  • Fine-Grained Style Control In Transformer-Based Text-To-Speech Synthesis
    Chen, Li-Wei; Rudnicky, Alexander ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022-May-23
    Conference Proceeding
    Open access

    In this paper, we present a novel architecture to realize fine-grained style control on the transformer-based text-to-speech synthesis (TransformerTTS). Specifically, we model the speaking style by ...
12.
  • An LSTM-based model for the compression of acoustic inventories for corpus-based text-to-speech synthesis systems
    Rojc, Matej; Mlakar, Izidor Computers & electrical engineering, 20/May , Volume: 100
    Journal Article
    Peer reviewed
    Open access

    • Efficient compression of huge corpus-based TTS unit selection acoustic space. • A novel look-up of acoustic units' concatenation costs as a seq-2-seq problem. • Efficient compression of concatenation ...
13.
  • Deep Gaussian process based multi-speaker speech synthesis with latent speaker representation
    Mitsui, Kentaro; Koriyama, Tomoki; Saruwatari, Hiroshi Speech communication, September 2021, Volume: 132
    Journal Article
    Peer reviewed
    Open access

    This paper proposes deep Gaussian process (DGP)-based frameworks for multi-speaker speech synthesis and speaker representation learning. A DGP has a deep architecture of Bayesian kernel regression, ...
14.
  • Improving Speech Prosody of Audiobook Text-To-Speech Synthesis with Acoustic and Textual Contexts
    Xin, Detai; Adavanne, Sharath; Ang, Federico ... ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023-June-4
    Conference Proceeding
    Open access

    We present a multi-speaker Japanese audiobook text-to-speech (TTS) system that leverages multimodal context information of preceding acoustic context and bilateral textual context to improve the ...
15.
  • DNN-Based Full-Band Speech Synthesis Using GMM Approximation of Spectral Envelope
    KOGUCHI, Junya; TAKAMICHI, Shinnosuke; MORISE, Masanori ... IEICE Transactions on Information and Systems, 12/2020, Volume: E103.D, Issue: 12
    Journal Article
    Peer reviewed
    Open access

    We propose a speech analysis-synthesis and deep neural network (DNN)-based text-to-speech (TTS) synthesis framework using Gaussian mixture model (GMM)-based approximation of full-band spectral ...
16.
  • Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-to-Speech
    Saeki, Takaaki; Zen, Heiga; Chen, Zhehuai ... ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023-June-4
    Conference Proceeding

    This paper proposes Virtuoso, a massively multilingual speech-text joint semi-supervised learning framework for text-to-speech synthesis (TTS) models. Existing multilingual TTS typically supports ...
17.
  • Can Knowledge of End-to-End Text-to-Speech Models Improve Neural Midi-to-Audio Synthesis Systems?
    Shi, Xuan; Cooper, Erica; Wang, Xin ... ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023-June-4
    Conference Proceeding

    With the similarity between music and speech synthesis from symbolic input and the rapid development of text-to-speech (TTS) techniques, it is worthwhile to explore ways to improve the MIDI-to-audio ...
18.
  • Acoustic model-based subword tokenization and prosodic-context extraction without language knowledge for text-to-speech synthesis
    Aso, Masashi; Takamichi, Shinnosuke; Takamune, Norihiro ... Speech communication, December 2020, Volume: 125
    Journal Article
    Peer reviewed
    Open access

    • We propose unsupervised text-to-speech synthesis using subword tokenization and prosodic-context extraction. • The subword tokenization can determine language units suitable for prosody ...
19.
  • Joint Multiscale Cross-Lingual Speaking Style Transfer With Bidirectional Attention Mechanism for Automatic Dubbing
    Li, Jingbei; Li, Sipan; Chen, Ping ... IEEE/ACM transactions on audio, speech, and language processing, 2024, Volume: 32
    Journal Article
    Peer reviewed
    Open access

    Automatic dubbing, which generates a corresponding version of the input speech in another language, can be widely utilized in many real-world scenarios, such as video and game localization. In ...
20.
  • Text-To-Speech Synthesis Based on Latent Variable Conversion Using Diffusion Probabilistic Model and Variational Autoencoder
    Yasuda, Yusuke; Toda, Tomoki ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023-June-4
    Conference Proceeding

    Text-to-speech synthesis (TTS) is a task to convert texts into speech. Two of the factors that have been driving TTS are the advancements of probabilistic models and latent representation learning. ...