Search results

hits: 126
1.
  • SpeechLMScore: Evaluating Speech Generation Using Speech Language Model
    Maiti, Soumi; Peng, Yifan; Saeki, Takaaki ... ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 06/2023
    Conference Proceeding
    Open access

    While human evaluation is the most reliable metric for evaluating speech generation systems, it is generally costly and time-consuming. Previous studies on automatic speech quality assessment address ...
2.
  • Real-Time Full-Band Voice Conversion with Sub-Band Modeling and Data-Driven Phase Estimation of Spectral Differentials
    Saeki, Takaaki; Saito, Yuki; Takamichi, Shinnosuke ... IEICE Transactions on Information and Systems, 07/2021, Volume: E104.D, Issue: 7
    Journal Article
    Peer reviewed
    Open access

    This paper proposes two high-fidelity and computationally efficient neural voice conversion (VC) methods based on a direct waveform modification using spectral differentials. The conventional ...
3.
  • SelfRemaster: Self-Supervised Speech Restoration for Historical Audio Resources
    Saeki, Takaaki; Takamichi, Shinnosuke; Nakamura, Tomohiko ... IEEE Access, 01/2023, Volume: 11
    Journal Article
    Peer reviewed
    Open access

    Restoring high-quality speech from degraded historical recordings is crucial for the preservation of cultural and endangered linguistic resources. A key challenge in this task is the scarcity of ...
4.
  • Lifter Training and Sub-Band Modeling for Computationally Efficient and High-Quality Voice Conversion Using Spectral Differentials
    Saeki, Takaaki; Saito, Yuki; Takamichi, Shinnosuke ... ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 05/2020
    Conference Proceeding
    Open access

    In this paper, we propose computationally efficient and high-quality methods for statistical voice conversion (VC) with direct waveform modification based on spectral differentials. The conventional ...
5.
  • Incremental Text-to-Speech Synthesis Using Pseudo Lookahead With Large Pretrained Language Model
    Saeki, Takaaki; Takamichi, Shinnosuke; Saruwatari, Hiroshi. IEEE Signal Processing Letters, 2021, Volume: 28
    Journal Article
    Peer reviewed
    Open access

    This letter presents an incremental text-to-speech (TTS) method that performs synthesis in small linguistic units while maintaining the naturalness of output speech. Incremental TTS is generally ...
8.
  • Text-Inductive Graphone-Based Language Adaptation for Low-Resource Speech Synthesis
    Saeki, Takaaki; Maiti, Soumi; Li, Xinjian ... IEEE/ACM Transactions on Audio, Speech, and Language Processing, 01/2024, Volume: 32
    Journal Article
    Peer reviewed
    Open access

    Neural text-to-speech (TTS) systems have made significant progress in generating natural synthetic speech. However, neural TTS requires large amounts of paired training data, which limits its ...
9.
  • Duration-Aware Pause Insertion Using Pre-Trained Language Model for Multi-Speaker Text-To-Speech
    Yang, Dong; Koriyama, Tomoki; Saito, Yuki ... ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023-June-4
    Conference Proceeding
    Open access

    Pause insertion, also known as phrase break prediction and phrasing, is an essential part of TTS systems because proper pauses with natural duration significantly enhance the rhythm and ...
10.
  • Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-to-Speech
    Saeki, Takaaki; Zen, Heiga; Chen, Zhehuai ... ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023-June-4
    Conference Proceeding

    This paper proposes Virtuoso, a massively multilingual speech-text joint semi-supervised learning framework for text-to-speech synthesis (TTS) models. Existing multilingual TTS typically supports ...