
Advanced Rich Transcription System for Estonian Speech

2019-01-11
Tanel Alumäe, Ottokar Tilk, Asadullah

Abstract

This paper describes the current TTÜ speech transcription system for Estonian speech. The system is designed to handle semi-spontaneous speech, such as broadcast conversations, lecture recordings and interviews recorded in diverse acoustic conditions. The system is based on the Kaldi toolkit. Multi-condition training using background noise profiles extracted automatically from untranscribed data is used to improve the robustness of the system. Out-of-vocabulary words are recovered using a phoneme n-gram based decoding subgraph and an FST-based phoneme-to-grapheme model. The system achieves a word error rate of 8.1% on a test set of broadcast conversations. The system also performs punctuation recovery and speaker identification. Speaker identification models are trained using a recently proposed weakly supervised training method.
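
For readers unfamiliar with the metric quoted in the abstract, word error rate is conventionally the number of word-level substitutions, deletions and insertions needed to turn the reference transcript into the system output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring setup used in the paper; the function name `word_error_rate` and the example sentences are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); whitespace tokenization is an
    assumption and real scoring tools usually normalize text first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from output
            insertion = curr[j - 1] + 1   # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# Made-up example: one substitution and one deletion against four words.
wer = word_error_rate("a b c d", "a x c")
```

Under this definition, a word error rate of 8.1% corresponds to roughly 8 word errors per 100 reference words.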

URL

http://arxiv.org/abs/1901.03601

PDF

http://arxiv.org/pdf/1901.03601

