ChAnot: An intelligent annotation tool for indigenous and highly agglutinative languages in Peru

Bibliographic Details
Authors: Mercado-Gonzales R., Pereira-Noriega J., Sobrevilla M., Oncevay A.
Format: conference object
Publication Date: 2019
Institution: Consejo Nacional de Ciencia Tecnología e Innovación
Repository: CONCYTEC-Institucional
Language: English
OAI Identifier: oai:repositorio.concytec.gob.pe:20.500.12390/547
Resource link: https://hdl.handle.net/20.500.12390/547
Access level: open access
Subject: Ships
Data mining
Learning algorithms
Learning systems
Natural language processing systems
Agglutinative language
Annotation tool
Computational tools
Corpus annotations
Linguistic annotations
https://purl.org/pe-repo/ocde/ford#6.02.06
Description
Summary: Linguistic corpus annotation is one of the most important phases in solving Natural Language Processing (NLP) tasks, as these methods rely heavily on corpus-based techniques. However, metadata annotation is a highly laborious manual task. A supportive alternative is the use of computational tools, which are likely to simplify some of these operations and can at the same time be adjusted to the needs of particular language features. This paper therefore presents ChAnot, a web-based annotation tool developed for Peruvian indigenous and highly agglutinative languages, with Shipibo-Konibo as the case study. The tool supports a diverse set of linguistic annotation tasks, such as word segmentation and POS-tag markup, among others. It also includes a suggestion engine based on historic and machine learning models, and a set of statistics about previous annotations.