Neural Network Strategies and Models for Voice Cloning in a Multi-speaker Mode: An Overview

Fura-Mendoza, Marco; Moscol-Albañil, Isabel; Rodriguez, Ciro; Lezama, Pedro; Rodriguez, Diego; Pomachagua, Yuri

Article Description

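The evolution of data science and the constant challenge of carrying out different processes using a few resources with simultaneous personalization has promoted interest in the development of voice cloning. Nowadays, different machine learning techniques are used, given their efficiency in generating relationships across multiple parameters. In this regard, we evaluated the best-performing models and the different process optimization strategies within this sector, where through neural network models separated modularly by their functionality, it is possible to generate independent processes taking into account the most significant number of linguistic factors in the generation of the voice, thus obtaining significant results of a clear improvement in the whole process of synthesizing the voice of a target speaker.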

Bibliographic Details
Authors:	Fura-Mendoza, Marco; Moscol-Albañil, Isabel; Rodriguez, Ciro; Lezama, Pedro; Rodriguez, Diego; Pomachagua, Yuri
Format:	article
Publication Date:	2023
Institution:	Universidad Peruana de Ciencias Aplicadas
Repository:	UPC-Institucional
Language:	English
OAI Identifier:	oai:repositorioacademico.upc.edu.pe:10757/669499
Resource link:	https://doi.org/10.1007/978-981-99-1912-3_21 http://hdl.handle.net/10757/669499
Access level:	embargoed access
Subject:	multi-speaker; neural networks; Voice cloning; https://purl.org/pe-repo/ocde/ford#3.00.00

id	UUPC_fb9c05cf37f86e8f09491abe1be9b26b
oai_identifier_str	oai:repositorioacademico.upc.edu.pe:10757/669499
network_acronym_str	UUPC
network_name_str	UPC-Institucional
repository_id_str	2670
spelling	Peer review: Revisión por pares (peer reviewed). Sustainable Development Goals: SDG 9: Industry, Innovation and Infrastructure; SDG 4: Quality Education; SDG 10: Reduced Inequalities. License bitstream: license.txt (text/plain; charset=utf-8). Record: oai:upc.dspace7.openrepository.com:10757/669499; 2026-02-17 17:46:17.067; metadata.only
dc.title.es_PE.fl_str_mv	Neural Network Strategies and Models for Voice Cloning in a Multi-speaker Mode: An Overview
author	Fura-Mendoza, Marco
author_facet	Fura-Mendoza, Marco; Moscol-Albañil, Isabel; Rodriguez, Ciro; Lezama, Pedro; Rodriguez, Diego; Pomachagua, Yuri
author_role	author
author2	Moscol-Albañil, Isabel; Rodriguez, Ciro; Lezama, Pedro; Rodriguez, Diego; Pomachagua, Yuri
author2_role	author author author author author
dc.contributor.author.fl_str_mv	Fura-Mendoza, Marco; Moscol-Albañil, Isabel; Rodriguez, Ciro; Lezama, Pedro; Rodriguez, Diego; Pomachagua, Yuri
dc.subject.es_PE.fl_str_mv	multi-speaker; neural networks; Voice cloning
topic	multi-speaker; neural networks; Voice cloning; https://purl.org/pe-repo/ocde/ford#3.00.00
dc.subject.ocde.none.fl_str_mv	https://purl.org/pe-repo/ocde/ford#3.00.00
description	The evolution of data science and the constant challenge of carrying out different processes using a few resources with simultaneous personalization has promoted interest in the development of voice cloning. Nowadays, different machine learning techniques are used, given their efficiency in generating relationships across multiple parameters. In this regard, we evaluated the best-performing models and the different process optimization strategies within this sector, where through neural network models separated modularly by their functionality, it is possible to generate independent processes taking into account the most significant number of linguistic factors in the generation of the voice, thus obtaining significant results of a clear improvement in the whole process of synthesizing the voice of a target speaker.
publishDate	2023
dc.date.accessioned.none.fl_str_mv	2023-11-28T15:38:34Z
dc.date.available.none.fl_str_mv	2023-11-28T15:38:34Z
dc.date.issued.fl_str_mv	2023-01-01
dc.type.es_PE.fl_str_mv	info:eu-repo/semantics/article
dc.type.version.none.fl_str_mv	http://purl.org/coar/version/c_970fb48d4fbd8a1738
format	article
dc.identifier.issn.none.fl_str_mv	23673370
dc.identifier.doi.none.fl_str_mv	https://doi.org/10.1007/978-981-99-1912-3_21
dc.identifier.uri.none.fl_str_mv	http://hdl.handle.net/10757/669499
dc.identifier.eissn.none.fl_str_mv	23673389
dc.identifier.journal.es_PE.fl_str_mv	Lecture Notes in Networks and Systems
dc.identifier.eid.none.fl_str_mv	2-s2.0-85171139460
dc.identifier.scopusid.none.fl_str_mv	SCOPUS_ID:85171139460
dc.identifier.isni.none.fl_str_mv	0000 0001 2196 144X
dc.identifier.ror.none.fl_str_mv	047xrr705
identifier_str_mv	23673370; 23673389; Lecture Notes in Networks and Systems; 2-s2.0-85171139460; SCOPUS_ID:85171139460; 0000 0001 2196 144X; 047xrr705
url	https://doi.org/10.1007/978-981-99-1912-3_21 http://hdl.handle.net/10757/669499
dc.language.iso.es_PE.fl_str_mv	eng
language	eng
dc.rights.es_PE.fl_str_mv	info:eu-repo/semantics/embargoedAccess
eu_rights_str_mv	embargoedAccess
dc.format.es_PE.fl_str_mv	application/html
dc.publisher.es_PE.fl_str_mv	Springer Science and Business Media Deutschland GmbH
dc.source.es_PE.fl_str_mv	Universidad Peruana de Ciencias Aplicadas (UPC); Repositorio Academico - UPC
dc.source.none.fl_str_mv	reponame:UPC-Institucional; instname:Universidad Peruana de Ciencias Aplicadas; instacron:UPC
instname_str	Universidad Peruana de Ciencias Aplicadas
instacron_str	UPC
institution	UPC
reponame_str	UPC-Institucional
collection	UPC-Institucional
dc.source.journaltitle.none.fl_str_mv	Lecture Notes in Networks and Systems
dc.source.volume.none.fl_str_mv	685 LNNS
dc.source.beginpage.none.fl_str_mv	229
dc.source.endpage.none.fl_str_mv	237
bitstream.url.fl_str_mv	https://upc.dspace7.openrepository.com/bitstreams/58962190-89ee-5da6-8b27-217481284d32/download
bitstream.checksum.fl_str_mv	8a4605be74aa9ea9d79846c1fba20a33
bitstream.checksumAlgorithm.fl_str_mv	MD5
repository.name.fl_str_mv	Repositorio académico upc
repository.mail.fl_str_mv	repositorioacademico@upc.edu.pe
_version_	1868262554705330176
score	13.077178

Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository hosting this document or dataset. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Digital Repository of Science, Technology and Innovation of Open Access (ALICIA).
