On Semantic Solutions for Efficient Approximate Similarity Search on Large-Scale Datasets

Article Description

Approximate similarity search algorithms based on hashing were proposed to query high-dimensional datasets because of their fast retrieval speed and low storage cost. Recent studies promote the use of Convolutional Neural Networks (CNNs) with hashing techniques to improve search accuracy. However, ther...

Bibliographic Details
Authors: Ocsa, Alexander; Huillca, Jose Luis; López Del Alamo, Cristian
Format: article
Publication Date: 2018
Institution: Universidad La Salle
Repository: ULASALLE-Institucional
Language: English
OAI Identifier: oai:repositorio.ulasalle.edu.pe:20.500.12953/30
Resource link: http://repositorio.ulasalle.edu.pe/handle/20.500.12953/30
https://doi.org/10.1007/978-3-319-75193-1
Access level: restricted access
Subject: Research Subject Categories::TECHNOLOGY
id ULSA_b85651bec2f318c8222bb02d5e69a074
oai_identifier_str oai:repositorio.ulasalle.edu.pe:20.500.12953/30
network_acronym_str ULSA
network_name_str ULASALLE-Institucional
repository_id_str 3920
dc.title.es_ES.fl_str_mv On Semantic Solutions for Efficient Approximate Similarity Search on Large-Scale Datasets
title On Semantic Solutions for Efficient Approximate Similarity Search on Large-Scale Datasets
author Ocsa, Alexander
author_facet Ocsa, Alexander
Huillca, Jose Luis
López Del Alamo, Cristian
author_role author
author2 Huillca, Jose Luis
López Del Alamo, Cristian
author2_role author
author
dc.contributor.author.fl_str_mv Ocsa, Alexander
Huillca, Jose Luis
López Del Alamo, Cristian
dc.subject.es_ES.fl_str_mv Research Subject Categories::TECHNOLOGY
topic Research Subject Categories::TECHNOLOGY
Research Subject Categories::TECHNOLOGY
dc.subject.ocde.es_ES.fl_str_mv Research Subject Categories::TECHNOLOGY
description Approximate similarity search algorithms based on hashing were proposed to query high-dimensional datasets because of their fast retrieval speed and low storage cost. Recent studies promote the use of Convolutional Neural Networks (CNNs) with hashing techniques to improve search accuracy. However, challenges remain in finding a practical and efficient way to index CNN features, such as the need for a heavy training process to achieve accurate query results and the critical dependency on data parameters. To overcome these issues, we propose a new method for scalable similarity search, Deep frActal based Hashing (DAsH), which computes the best data-parameter values for optimal sub-space projection by exploring the correlations among CNN feature attributes using fractal theory. Moreover, inspired by recent advances in CNNs, we use not only the activations of lower layers, which are more general-purpose, but also prior knowledge of the semantic data in the last CNN layer to improve search accuracy. Thus, our method produces a better representation of the data space at a lower computational cost and with better accuracy. This significant gain in speed and accuracy allows us to evaluate the framework on a large, realistic, and challenging set of datasets.
publishDate 2018
dc.date.accessioned.none.fl_str_mv 2018-11-21T17:14:44Z
dc.date.available.none.fl_str_mv 2018-11-21T17:14:44Z
dc.date.issued.fl_str_mv 2018-07-04
dc.type.es_ES.fl_str_mv info:eu-repo/semantics/article
format article
dc.identifier.citation.es_ES.fl_str_mv Ocsa A., Huillca J.L., Lopez del Alamo C. (2018) On Semantic Solutions for Efficient Approximate Similarity Search on Large-Scale Datasets. In: Mendoza M., Velastín S. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2017. Lecture Notes in Computer Science, vol 10657. Springer, Cham
dc.identifier.isbn.none.fl_str_mv 978-3-319-75193-1
dc.identifier.uri.none.fl_str_mv http://repositorio.ulasalle.edu.pe/handle/20.500.12953/30
dc.identifier.journal.es_ES.fl_str_mv Iberoamerican Congress on Pattern Recognition
dc.identifier.doi.es_ES.fl_str_mv https://doi.org/10.1007/978-3-319-75193-1
identifier_str_mv Ocsa A., Huillca J.L., Lopez del Alamo C. (2018) On Semantic Solutions for Efficient Approximate Similarity Search on Large-Scale Datasets. In: Mendoza M., Velastín S. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2017. Lecture Notes in Computer Science, vol 10657. Springer, Cham
978-3-319-75193-1
Iberoamerican Congress on Pattern Recognition
url http://repositorio.ulasalle.edu.pe/handle/20.500.12953/30
https://doi.org/10.1007/978-3-319-75193-1
dc.language.iso.eng_US.fl_str_mv eng
language eng
dc.rights.es_ES.fl_str_mv info:eu-repo/semantics/restrictedAccess
eu_rights_str_mv restrictedAccess
dc.publisher.es_ES.fl_str_mv Universidad La Salle
dc.source.es_ES.fl_str_mv Universidad La Salle
Repositorio institucional - ULASALLE
dc.source.none.fl_str_mv reponame:ULASALLE-Institucional
instname:Universidad La Salle
instacron:ULASALLE
instname_str Universidad La Salle
instacron_str ULASALLE
institution ULASALLE
reponame_str ULASALLE-Institucional
collection ULASALLE-Institucional
bitstream.url.fl_str_mv http://repositorio.ulasalle.edu.pe/bitstream/20.500.12953/30/1/link_articulo.txt
http://repositorio.ulasalle.edu.pe/bitstream/20.500.12953/30/2/license.txt
http://repositorio.ulasalle.edu.pe/bitstream/20.500.12953/30/3/link_articulo.txt.txt
bitstream.checksum.fl_str_mv 0db83502828a9ee71f838dabf78ef098
8a4605be74aa9ea9d79846c1fba20a33
b5390a0d10c3af67678d607f261ad5ad
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Repositorio Institucional de la Universidad La Salle
repository.mail.fl_str_mv repositorio@ulasalle.edu.pe
_version_ 1764532734532780032
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the contents (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA), the national open-access digital repository for science, technology, and innovation.