Avoiding speaker variability in pronunciation verification of children's disordered speech

This paper addresses the problem of speaker variability in pronunciation verification for the speech therapy of children and young adults in Computer-Aided Pronunciation Training (CAPT) tools. The baseline system evaluates two score normalization techniques: traditional Test...

Bibliographic Details
Main Authors:	Saz, Oscar; Lleida, Eduardo; Rodríguez-Dueñas, William R.
Format:	Article
Language:	English
Published:	Association for Computing Machinery, 2009
Subjects:	Pronunciation evaluation; Children speech; Speech disorders
Online Access:	https://repository.urosario.edu.co/handle/10336/28304 https://doi.org/10.1145/1640377.1640388

id	ir-10336-28304
recordtype	dspace
spelling	ir-10336-28304; 2021-10-15T11:06:16Z; 2009-11; 2020-08-28T15:47:55Z; info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; ISBN: 978-1-60558-690-8; eng; info:eu-repo/semantics/restrictedAccess; application/pdf; WOCCI '09: Proceedings of the 2nd Workshop on Child, Computer and Interaction; ICMI-MLMI '09: International Conference on Multimodal Interfaces/Workshop on Machine Learning for Multimodal Interfaces, Cambridge, Massachusetts (November 2009)
institution	EdocUR - Universidad del Rosario
collection	DSpace
language	English
topic	Pronunciation evaluation; Children speech; Speech disorders
description	This paper addresses the problem of speaker variability in pronunciation verification for the speech therapy of children and young adults in Computer-Aided Pronunciation Training (CAPT) tools. The baseline system evaluates two score normalization techniques: traditional Test normalization (T-norm) and a novel N-best based normalization, which outperforms T-norm by normalizing to the log-likelihood score of the first alternative phoneme in an unconstrained N-best list. When performing speaker adaptation, using all the adaptation data from the speaker improves performance, measured in Equal Error Rate (EER), over the speaker-independent systems; this can in turn be outperformed by more precise models that adapt only to the phonetic units labeled as correctly pronounced by a set of human experts. The best EER obtained in all experiments is 15.63%, achieved when combining score normalization and speaker adaptation. Finally, the possibility of automating a more precise adaptation without human intervention is proposed and discussed.
format	Article
author	Saz, Oscar; Lleida, Eduardo; Rodríguez-Dueñas, William R.
title	Avoiding speaker variability in pronunciation verification of children's disordered speech
publisher	Association for Computing Machinery
publishDate	2009
url	https://repository.urosario.edu.co/handle/10336/28304 https://doi.org/10.1145/1640377.1640388
_version_	1723228439804116992
score	12,131701
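
The abstract names two score normalizations and the EER metric without giving formulas. The following is a minimal Python sketch of how they might look, not the authors' implementation. It assumes the standard T-norm definition (subtract the cohort mean, divide by the cohort standard deviation) and reads the N-best normalization as a log-domain difference between the target phoneme's score and the best competing phoneme's score. The function names, the shape of the N-best list, and the toy numbers are all hypothetical.

```python
from statistics import mean, pstdev


def t_norm(target_score, cohort_scores):
    """Standard T-norm: centre and scale a score by cohort statistics.

    Assumption: cohort_scores are scores of the same segment against
    competing models; the record does not say how the cohort is built.
    """
    sigma = pstdev(cohort_scores)
    if sigma == 0:
        raise ValueError("cohort scores have zero spread")
    return (target_score - mean(cohort_scores)) / sigma


def nbest_norm(target_phoneme, target_loglik, nbest):
    """Normalize to the first alternative phoneme in an N-best list.

    nbest is a hypothetical list of (phoneme, log_likelihood) pairs,
    best hypothesis first. The difference of log-likelihoods is an
    assumed reading of "normalizing to" in the abstract.
    """
    for phoneme, loglik in nbest:
        if phoneme != target_phoneme:
            return target_loglik - loglik
    raise ValueError("N-best list holds no alternative phoneme")


def equal_error_rate(correct_scores, incorrect_scores):
    """Approximate the EER by using every observed score as a threshold.

    A segment is accepted as correctly pronounced when score >= threshold.
    Returns the mean of the two error rates where they are closest.
    """
    best_gap, best_eer = None, None
    for thr in sorted(set(correct_scores) | set(incorrect_scores)):
        # False rejection: correct pronunciations scored below threshold.
        frr = sum(s < thr for s in correct_scores) / len(correct_scores)
        # False acceptance: mispronunciations scored at or above threshold.
        far = sum(s >= thr for s in incorrect_scores) / len(incorrect_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the paper.
    nbest = [("s", -41.0), ("th", -44.5), ("f", -47.0)]
    print(nbest_norm("s", -42.0, nbest))
    print(t_norm(-40.0, [-50.0, -60.0, -70.0]))
    print(equal_error_rate([2.0, 1.5, 1.0, 0.2], [0.5, -0.3, -1.0, -2.0]))
```

Under this reading, a positive N-best normalized score means the expected phoneme scored higher than every competitor, and a negative one means some other phoneme explained the audio better.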
