Ship-lemmatagger: Building an NLP toolkit for a Peruvian native language
Article Description
Authors: | , , , , |
---|---|
Format: | conference object |
Publication date: | 2017 |
Institution: | Consejo Nacional de Ciencia Tecnología e Innovación |
Repository: | CONCYTEC-Institucional |
Language: | English |
OAI Identifier: | oai:repositorio.concytec.gob.pe:20.500.12390/773 |
Resource link: | https://hdl.handle.net/20.500.12390/773 https://doi.org/10.1007/978-3-319-64206-2_53 |
Access level: | open access |
Subject: | Text processing; Automation; Computational linguistics; Ships; Automatic identification; Core processing; Lemmatization; Low resource languages; Machine translations; Native language; Part of speech tagging; Shipibo-konibo; Natural language processing systems; https://purl.org/pe-repo/ocde/ford#2.00.00 |
Summary: | Natural Language Processing deals with the understanding and generation of texts through computer programs. Many different functionalities are used in this area, but some of them support all the others. These are the methods related to the core processing of the morphology of the language (such as lemmatization) and the automatic identification of the part-of-speech tag. This paper therefore describes the implementation of a basic NLP toolkit for a new language, focusing on the features mentioned above and testing them on a corpus built for this purpose. The results obtained exceeded expectations, and the toolkit could be used for more complex tasks such as machine translation. |
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).