
A high quality and phonetic balanced speech corpus for Vietnamese

2019-04-11
Pham Ngoc Phuong, Quoc Truong Do, Luong Chi Mai

Abstract

This paper presents a high-quality Vietnamese speech corpus that can be used for analyzing Vietnamese speech characteristics as well as for building speech synthesis models. The corpus consists of 5,400 clean-speech utterances spoken by 12 speakers (6 male and 6 female). It is designed with phonetic balance in mind so that it can be used for speech synthesis and, in particular, for speech adaptation approaches. Specifically, all speakers utter a common set of 250 phonetically balanced sentences. To increase the variety of speech contexts, each speaker also utters another 200 non-shared, phonetically balanced sentences. The speakers were selected to cover a wide range of ages and come from different regions of northern Vietnam. The audio was recorded in a soundproof studio at a sampling rate of 48 kHz, in 16-bit PCM, mono.
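The figures in the abstract can be cross-checked with a short sketch. The Python below is illustrative only and is not part of the paper: the function names are hypothetical, and the distinct-sentence count assumes that the 200 non-shared sentences of each speaker do not overlap with those of any other speaker, which the abstract implies but does not state outright. The format check uses only the standard-library `wave` module and tests a file against the stated recording format (48 kHz, 16-bit PCM, mono).

```python
import io
import wave

# Figures taken from the abstract.
NUM_SPEAKERS = 12
SHARED_SENTENCES = 250        # common set uttered by every speaker
PRIVATE_SENTENCES = 200       # non-shared sentences per speaker


def corpus_counts(speakers=NUM_SPEAKERS,
                  shared=SHARED_SENTENCES,
                  private=PRIVATE_SENTENCES):
    """Return utterance and sentence counts implied by the corpus design."""
    per_speaker = shared + private
    return {
        "utterances_per_speaker": per_speaker,
        "total_utterances": speakers * per_speaker,
        # Assumption: private sets are disjoint across speakers.
        "distinct_sentences": shared + speakers * private,
    }


def matches_corpus_format(wav_source):
    """Check a WAV file (path or file-like object) for 48 kHz, 16-bit, mono."""
    with wave.open(wav_source, "rb") as wav:
        return (wav.getframerate() == 48000
                and wav.getsampwidth() == 2      # 2 bytes = 16 bits
                and wav.getnchannels() == 1)


def make_silent_wav(rate=48000, sample_width=2, channels=1, frames=480):
    """Hypothetical helper: build a short silent WAV in memory for testing."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (frames * sample_width * channels))
    buffer.seek(0)
    return buffer
```

Calling `corpus_counts()` gives 450 utterances per speaker and 5,400 utterances in total, which agrees with the abstract, and 2,650 distinct sentences under the disjointness assumption above.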

URL

http://arxiv.org/abs/1904.05569

PDF

http://arxiv.org/pdf/1904.05569

