Automatic Speech Recognition of Quechua Language Using HMM Toolkit

Article Description

In this paper, we present the implementation of an Automatic Speech Recognition (ASR) system for the southern Quechua language. The software can recognize both continuous speech and isolated words. The ASR was developed using the Hidden Markov Model Toolkit (HTK) and the corpus collected by Siminchikkunaray...


Bibliographic Details
Authors: Zevallos R., Cordova J., Camacho L.
Format: article
Publication date: 2020
Institution: Consejo Nacional de Ciencia Tecnología e Innovación
Repository: CONCYTEC-Institucional
Language: English
OAI Identifier: oai:repositorio.concytec.gob.pe:20.500.12390/2613
Resource link: https://hdl.handle.net/20.500.12390/2613
https://doi.org/10.1007/978-3-030-46140-9_6
Access level: open access
Subject: Quechua
ASR
Endangered languages
HMM
HTK
http://purl.org/pe-repo/ocde/ford#2.11.02
description In this paper, we present the implementation of an Automatic Speech Recognition (ASR) system for the southern Quechua language. The software can recognize both continuous speech and isolated words. The ASR was developed using the Hidden Markov Model Toolkit (HTK) and the corpus collected by Siminchikkunarayku. A dictionary provides the system with a mapping of vocabulary words to sequences of phonemes; the audio files were processed to extract the speech feature vectors (MFCC), and the acoustic model was then trained on the MFCC files until convergence. The paper also describes in detail the architecture of an ASR system built with HTK library modules and tools. The ASR was tested on audio recorded by volunteers, obtaining a 12.70% word error rate. © Springer Nature Switzerland AG 2020.
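The word error rate quoted in the description is conventionally defined as (S + D + I) / N, where S, D and I are the substituted, deleted and inserted words needed to turn the recognizer output into the reference transcript, and N is the number of reference words. This record does not say how the authors scored their system, so the following is only a minimal sketch of that conventional definition; the function name `word_error_rate` and the placeholder tokens are illustrative assumptions, not taken from the paper or from HTK.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level edit distance divided by reference length.

    Hypothetical helper for illustration; not the paper's scoring code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# Placeholder tokens stand in for transcript words.
print(word_error_rate("w1 w2 w3 w4", "w1 x w3"))
```

In the usage line, one substitution and one deletion against a four-word reference give a rate of 0.5. A reported figure such as 12.70% would correspond to about 0.127 on this scale, aggregated over the whole test set.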
publishDate 2020
dc.date.accessioned.none.fl_str_mv 2024-05-30T23:13:38Z
dc.date.available.none.fl_str_mv 2024-05-30T23:13:38Z
dc.identifier.scopus.none.fl_str_mv 2-s2.0-85084804619
dc.relation.ispartof.none.fl_str_mv Communications in Computer and Information Science
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Springer
repository.name.fl_str_mv Repositorio Institucional CONCYTEC
repository.mail.fl_str_mv repositorio@concytec.gob.pe
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository holding this document or dataset. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA), the National Open Access Digital Repository of Science, Technology and Innovation.