Ship-lemmatagger: Building an NLP toolkit for a Peruvian native language

Bibliographic Details
Authors: Pereira-Noriega J., Mercado-Gonzales R., Melgar A., Sobrevilla-Cabezudo M., Oncevay-Marcos A.
Format: conference object
Publication Date: 2017
Institution: Consejo Nacional de Ciencia Tecnología e Innovación
Repository: CONCYTEC-Institucional
Language: English
OAI Identifier: oai:repositorio.concytec.gob.pe:20.500.12390/773
Resource link: https://hdl.handle.net/20.500.12390/773
https://doi.org/10.1007/978-3-319-64206-2_53
Access level: open access
Subject: Text processing
Automation
Computational linguistics
Ships
Automatic identification
Core processing
Lemmatization
Low resource languages
Machine translations
Native language
Part of speech tagging
Shipibo-Konibo
Natural language processing systems
https://purl.org/pe-repo/ocde/ford#2.00.00
Description
Summary: Natural Language Processing deals with the understanding and generation of text by computer programs. Of the many functions used in this area, a few underpin all the others. These concern the core processing of a language's morphology (such as lemmatization) and the automatic identification of part-of-speech tags. This paper describes the implementation of a basic NLP toolkit for a new language, focusing on these two features and testing them on a corpus built for the purpose. The results exceeded expectations and could support more complex tasks such as machine translation.