WordNet-SHP: Towards the building of a lexical database for a Peruvian minority language
Article description
WordNet-like resources are lexical databases with highly relevant information and data that could be exploited in more complex computational linguistics research and applications. The building process requires manual and automatic tasks, which can be more arduous if the language is a minority one...
| Authors: | Maguiño-Valencia D., Oncevay-Marcos A., Sobrevilla Cabezudo M.A. |
|---|---|
| Format: | conference object |
| Publication date: | 2019 |
| Institution: | Consejo Nacional de Ciencia Tecnología e Innovación |
| Repository: | CONCYTEC-Institucional |
| Language: | English |
| OAI Identifier: | oai:repositorio.concytec.gob.pe:20.500.12390/819 |
| Resource link: | https://hdl.handle.net/20.500.12390/819 |
| Access level: | open access |
| Subject: | Wordnet; Computational linguistics; Database systems; Natural language processing systems; Ships; Bilingual dictionary; Digital resources; Lexical database; Machine translations; Minority languages; Research and application; Word Sense Disambiguation; Ontology; https://purl.org/pe-repo/ocde/ford#6.02.06 |
| id | CONC_8e5176666738abfc24e2c0d247c4afce |
|---|---|
| oai_identifier_str | oai:repositorio.concytec.gob.pe:20.500.12390/819 |
| network_acronym_str | CONC |
| network_name_str | CONCYTEC-Institucional |
| repository_id_str | 4689 |
| dc.title.none.fl_str_mv | WordNet-SHP: Towards the building of a lexical database for a Peruvian minority language |
| title | WordNet-SHP: Towards the building of a lexical database for a Peruvian minority language |
| spellingShingle | WordNet-SHP: Towards the building of a lexical database for a Peruvian minority language; Maguiño-Valencia D.; Wordnet; Computational linguistics; Database systems; Natural language processing systems; Ships; Bilingual dictionary; Digital resources; Lexical database; Machine translations; Minority languages; Research and application; Word Sense Disambiguation; Ontology; https://purl.org/pe-repo/ocde/ford#6.02.06 |
| title_short | WordNet-SHP: Towards the building of a lexical database for a Peruvian minority language |
| title_full | WordNet-SHP: Towards the building of a lexical database for a Peruvian minority language |
| title_fullStr | WordNet-SHP: Towards the building of a lexical database for a Peruvian minority language |
| title_full_unstemmed | WordNet-SHP: Towards the building of a lexical database for a Peruvian minority language |
| title_sort | WordNet-SHP: Towards the building of a lexical database for a Peruvian minority language |
| author | Maguiño-Valencia D. |
| author_facet | Maguiño-Valencia D.; Oncevay-Marcos A.; Sobrevilla Cabezudo M.A. |
| author_role | author |
| author2 | Oncevay-Marcos A.; Sobrevilla Cabezudo M.A. |
| author2_role | author; author |
| dc.contributor.author.fl_str_mv | Maguiño-Valencia D.; Oncevay-Marcos A.; Sobrevilla Cabezudo M.A. |
| dc.subject.none.fl_str_mv | Wordnet |
| topic | Wordnet; Computational linguistics; Database systems; Natural language processing systems; Ships; Bilingual dictionary; Digital resources; Lexical database; Machine translations; Minority languages; Research and application; Word Sense Disambiguation; Ontology; https://purl.org/pe-repo/ocde/ford#6.02.06 |
| dc.subject.es_PE.fl_str_mv | Computational linguistics; Database systems; Natural language processing systems; Ships; Bilingual dictionary; Digital resources; Lexical database; Machine translations; Minority languages; Research and application; Word Sense Disambiguation; Ontology |
| dc.subject.ocde.none.fl_str_mv | https://purl.org/pe-repo/ocde/ford#6.02.06 |
| description | WordNet-like resources are lexical databases with highly relevant information and data that could be exploited in more complex computational linguistics research and applications. The building process requires manual and automatic tasks, which can be more arduous if the language is a minority one with fewer digital resources. This study focuses on the construction of an initial WordNet database for a low-resourced indigenous language of Peru: Shipibo-Konibo (shp). First, the stages of development from a resource-scarce scenario (a bilingual shp-es dictionary) are described. Then, a synset alignment method is proposed that compares the definition glosses in the dictionary (written in Spanish) with the content of a Spanish WordNet. For this comparison, word2vec similarity was chosen as the proximity measure. Finally, the synsets are evaluated against a manually annotated Gold Standard in Shipibo-Konibo. The results obtained are promising, and this resource is expected to serve well in further applications, such as word sense disambiguation and even machine translation for the shp-es language pair. |
| publishDate | 2019 |
| dc.date.accessioned.none.fl_str_mv | 2024-05-30T23:13:38Z |
| dc.date.available.none.fl_str_mv | 2024-05-30T23:13:38Z |
| dc.date.issued.fl_str_mv | 2019 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/conferenceObject |
| format | conferenceObject |
| dc.identifier.isbn.none.fl_str_mv | urn:isbn:9791095546009 |
| dc.identifier.uri.none.fl_str_mv | https://hdl.handle.net/20.500.12390/819 |
| dc.identifier.scopus.none.fl_str_mv | 2-s2.0-85059915834 |
| identifier_str_mv | urn:isbn:9791095546009; 2-s2.0-85059915834 |
| url | https://hdl.handle.net/20.500.12390/819 |
| dc.language.iso.none.fl_str_mv | eng |
| language | eng |
| dc.relation.ispartof.none.fl_str_mv | LREC 2018 - 11th International Conference on Language Resources and Evaluation |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess |
| eu_rights_str_mv | openAccess |
| dc.publisher.none.fl_str_mv | European Language Resources Association (ELRA) |
| publisher.none.fl_str_mv | European Language Resources Association (ELRA) |
| dc.source.none.fl_str_mv | reponame:CONCYTEC-Institucional instname:Consejo Nacional de Ciencia Tecnología e Innovación instacron:CONCYTEC |
| instname_str | Consejo Nacional de Ciencia Tecnología e Innovación |
| instacron_str | CONCYTEC |
| institution | CONCYTEC |
| reponame_str | CONCYTEC-Institucional |
| collection | CONCYTEC-Institucional |
| repository.name.fl_str_mv | Repositorio Institucional CONCYTEC |
| repository.mail.fl_str_mv | repositorio@concytec.gob.pe |
| _version_ | 1844883043128442880 |
| score | 13.457506 |
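The abstract in this record describes aligning dictionary entries to Spanish WordNet synsets by comparing Spanish definition glosses with a word2vec similarity. The following is only a minimal sketch of that general idea, not the authors' implementation: the tiny `TOY_VECTORS` table stands in for real word2vec embeddings, the synset identifiers and glosses in `CANDIDATES` are hypothetical, and averaging word vectors followed by cosine similarity is an assumed way of turning word-level vectors into a gloss-level score.

```python
import math

# Hypothetical 3-dimensional vectors standing in for real word2vec embeddings.
TOY_VECTORS = {
    "pez": [0.9, 0.1, 0.0],
    "animal": [0.7, 0.3, 0.1],
    "agua": [0.8, 0.0, 0.2],
    "río": [0.7, 0.1, 0.3],
    "herramienta": [0.0, 0.9, 0.2],
    "madera": [0.1, 0.8, 0.3],
    "cortar": [0.0, 0.7, 0.5],
}


def gloss_vector(gloss):
    """Average the vectors of the known words in a gloss (None if none are known)."""
    vectors = [TOY_VECTORS[w] for w in gloss.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def align(dictionary_gloss, candidate_synsets):
    """Return (synset_id, score) for the most similar candidate gloss, or None."""
    query = gloss_vector(dictionary_gloss)
    if query is None:
        return None
    best = None
    for synset_id, synset_gloss in candidate_synsets.items():
        candidate = gloss_vector(synset_gloss)
        if candidate is None:
            continue
        score = cosine(query, candidate)
        if best is None or score > best[1]:
            best = (synset_id, score)
    return best


# Hypothetical candidate synsets, each with a Spanish gloss.
CANDIDATES = {
    "es-synset-fish": "animal agua",
    "es-synset-axe": "herramienta cortar madera",
}

# A made-up Spanish gloss for a dictionary entry.
best = align("pez río", CANDIDATES)
print(best)
```

In practice the toy table would be replaced by embeddings trained on a Spanish corpus, and a similarity threshold could be used to reject weak matches; both choices are assumptions here and are not stated in the record.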
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository holding this document or dataset. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).