Data mining of DNA sequences submitted by Peruvian institutions to public genetic databases [Minería de datos de secuencias de DNA enviadas a bases de datos genéticas públicas por instituciones peruanas]

Romero P.E.; Castillo-Vilcahuaman C.


Article Description

Genetic diversity is an important component of biodiversity, and it is crucial for current efforts to protect and sustainably manage several organisms and habitats. As far as we know, there is only one work describing Peruvian genetic information stored in public databases. We aimed to update this p...


Bibliographic Details
Authors:	Romero P.E., Castillo-Vilcahuaman C.
Format:	article
Publication date:	2021
Institution:	Consejo Nacional de Ciencia Tecnología e Innovación
Repository:	CONCYTEC-Institucional
Language:	English
OAI Identifier:	oai:repositorio.concytec.gob.pe:20.500.12390/2399
Resource link:	https://hdl.handle.net/20.500.12390/2399 https://doi.org/10.15381/RPB.V28I1.17867
Access level:	open access
Subject:	Public databases Biodiversity Data mining Genetic diversity Peru http://purl.org/pe-repo/ocde/ford#3.04.03

id	CONC_c1d851658dd7f51148d77230c937f151
oai_identifier_str	oai:repositorio.concytec.gob.pe:20.500.12390/2399
network_acronym_str	CONC
network_name_str	CONCYTEC-Institucional
repository_id_str	4689
dc.title.none.fl_str_mv	Data mining of DNA sequences submitted by Peruvian institutions to public genetic databases [Minería de datos de secuencias de DNA enviadas a bases de datos genéticas públicas por instituciones peruanas]
author	Romero P.E.
author_facet	Romero P.E. Castillo-Vilcahuaman C.
author_role	author
author2	Castillo-Vilcahuaman C.
author2_role	author
dc.contributor.author.fl_str_mv	Romero P.E. Castillo-Vilcahuaman C.
dc.subject.none.fl_str_mv	Public databases
topic	Public databases Biodiversity Data mining Genetic diversity Peru http://purl.org/pe-repo/ocde/ford#3.04.03
dc.subject.es_PE.fl_str_mv	Biodiversity Data mining Genetic diversity Peru
dc.subject.ocde.none.fl_str_mv	http://purl.org/pe-repo/ocde/ford#3.04.03
description	Genetic diversity is an important component of biodiversity, and it is crucial for current efforts to protect and sustainably manage several organisms and habitats. As far as we know, there is only one work describing Peruvian genetic information stored in public databases. We aimed to update this previous work by searching four public databases that store digital sequence information: Nucleotide, BioProject, PATRIC, and BOLD. With this information, we comment on the contribution of Peruvian institutions during recent years. In Nucleotide, the largest database, Bacteria are the organisms most sequenced by Peruvian institutions (70.60%); pathogenic bacteria such as Pasteurella multocida, Neisseria meningitidis, and Vibrio parahaemolyticus were the most abundant. We found no sequence records from the Archaea domain. In BioProject, the most common sequence belongs to Salmonella enterica subsp. enterica serovar Infantis. In PATRIC, a database of pathogenic agents, Mycobacterium tuberculosis and Yersinia pestis had the highest number of entries. Finally, in BOLD, an exclusively Eukaryotic database, Chordata (Aves and Actinopterygii), Angiospermae, and Arthropoda (Insecta and Arachnida) were the most frequent records. Our results suggest the research preferences of Peruvian institutions, which focus on infectious diseases and some Eukaryotic phyla. Although there has been a significant increase in DNA information submitted by Peruvian institutions since the last report, the genetic diversity reflected in these databases remains inconsistent with the diversity in the country. More efforts must be made to obtain genetic information from more underestimated taxonomic groups and to promote more genetic research in regional Peruvian institutions. © Los autores.
publishDate	2021
dc.date.accessioned.none.fl_str_mv	2024-05-30T23:13:38Z
dc.date.available.none.fl_str_mv	2024-05-30T23:13:38Z
dc.date.issued.fl_str_mv	2021
dc.type.none.fl_str_mv	info:eu-repo/semantics/article
format	article
dc.identifier.uri.none.fl_str_mv	https://hdl.handle.net/20.500.12390/2399
dc.identifier.doi.none.fl_str_mv	https://doi.org/10.15381/RPB.V28I1.17867
dc.identifier.scopus.none.fl_str_mv	2-s2.0-85103044937
url	https://hdl.handle.net/20.500.12390/2399 https://doi.org/10.15381/RPB.V28I1.17867
identifier_str_mv	2-s2.0-85103044937
dc.language.iso.none.fl_str_mv	eng
language	eng
dc.relation.ispartof.none.fl_str_mv	Revista Peruana de Biologia
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess
dc.rights.uri.none.fl_str_mv	https://creativecommons.org/licenses/by-nc-nd/4.0/
eu_rights_str_mv	openAccess
rights_invalid_str_mv	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.publisher.none.fl_str_mv	Facultad de Ciencias Biologicas, Universidad Nacional Mayor de San Marcos
publisher.none.fl_str_mv	Facultad de Ciencias Biologicas, Universidad Nacional Mayor de San Marcos
dc.source.none.fl_str_mv	reponame:CONCYTEC-Institucional instname:Consejo Nacional de Ciencia Tecnología e Innovación instacron:CONCYTEC
instname_str	Consejo Nacional de Ciencia Tecnología e Innovación
instacron_str	CONCYTEC
institution	CONCYTEC
reponame_str	CONCYTEC-Institucional
collection	CONCYTEC-Institucional
repository.name.fl_str_mv	Repositorio Institucional CONCYTEC
repository.mail.fl_str_mv	repositorio@concytec.gob.pe
_version_	1844883045889343488


Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository holding this document or dataset. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).
