Object detection in videos using principal component pursuit and convolutional neural networks
Article Description
Object recognition in videos is one of the main challenges in computer vision. Several methods have been proposed to achieve this task, such as background subtraction, temporal differencing, optical flow, and particle filtering, among others. Since the introduction of Convolutional Neural Networks (CNNs) for object detection in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), their use for image detection and classification has increased, becoming the state of the art for such tasks, with Faster R-CNN being the preferred model in the latest ILSVRC challenges. Moreover, the Faster R-CNN model, with minimal modifications, has been successfully used to detect and classify objects (either static or dynamic) in video sequences; in such a setup, the frames of the video are input "as is", i.e. without any pre-processing. In this thesis we propose to use Robust PCA (RPCA, a.k.a. Principal Component Pursuit, PCP) as a video background modeling pre-processing step before applying the Faster R-CNN model, in order to improve the overall detection and classification performance for, specifically, the moving objects. We hypothesize that such a pre-processing step, which segments the moving objects from the background, would reduce the number of regions to be analyzed in a given frame and thus (i) improve the classification time and (ii) reduce the classification error for the dynamic objects present in the video. In particular, we use a fully incremental RPCA/PCP algorithm that is suitable for real-time or online processing. Furthermore, we present extensive computational results carried out on three different platforms: a high-end server with a Tesla K40m GPU, a desktop with a Tesla K10m GPU, and the embedded system Jetson TK1. Our classification results attain competitive or superior performance in terms of F-measure, achieving an improvement ranging from 3.7% to 97.2%, with a mean improvement of 22%, when the sparse image is used to detect and classify the objects with the neural network, while at the same time reducing the classification time on all architectures by between 2% and 25%.
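For reference, the background/foreground separation mentioned in the abstract is usually posed, in its batch form, as the Principal Component Pursuit convex program below; the thesis relies on an incremental variant of this idea, so the formulation is only a sketch of the underlying model, with the standard choice of the regularization parameter assumed.

```latex
% Batch PCP model (standard formulation, shown for context only):
% M stacks the vectorized video frames as columns, L is the low-rank
% background, and S is the sparse foreground passed to the detector.
\min_{L,\,S} \; \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{s.t.} \quad M = L + S,
\qquad \lambda = \frac{1}{\sqrt{\max(m,n)}}, \quad M \in \mathbb{R}^{m \times n}.
```

A minimal sketch of the downstream detection step, assuming a modern PyTorch/torchvision Faster R-CNN rather than the implementation used in the thesis, could look as follows; the PCP step itself is omitted, and `detect_on_sparse_frame` and its threshold are illustrative names, not taken from the thesis.

```python
# Illustrative only: run a pretrained Faster R-CNN on the sparse (foreground)
# component of one video frame. Assumes torchvision >= 0.13.
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect_on_sparse_frame(sparse_frame: torch.Tensor, score_thresh: float = 0.5):
    """sparse_frame: float tensor in [0, 1], shape (3, H, W); e.g. |S| for one
    frame, replicated to three channels if the video is grayscale."""
    with torch.no_grad():
        (pred,) = model([sparse_frame])        # list in, list of dicts out
    keep = pred["scores"] > score_thresh       # drop low-confidence detections
    return pred["boxes"][keep], pred["labels"][keep], pred["scores"][keep]
```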
| Field | Value |
|---|---|
| Author: | Tejada Gamero, Enrique David |
| Format: | master's thesis |
| Publication date: | 2018 |
| Institution: | Pontificia Universidad Católica del Perú |
| Repository: | PUCP-Tesis |
| Language: | English |
| OAI Identifier: | oai:tesis.pucp.edu.pe:20.500.12404/11982 |
| Resource link: | http://hdl.handle.net/20.500.12404/11982 |
| Access level: | open access |
| Subjects: | Reconocimiento de imágenes; Redes neuronales--Aplicaciones; Visión por computadoras; https://purl.org/pe-repo/ocde/ford#2.02.05 |
Record metadata

| Field | Value |
|---|---|
| id | PUCP_0e381216751ceabdd610fdd23470f577 |
| dc.title | Object detection in videos using principal component pursuit and convolutional neural networks |
| dc.contributor.advisor | Rodríguez Valderrama, Paul Antonio |
| dc.contributor.author | Tejada Gamero, Enrique David |
| dc.subject | Reconocimiento de imágenes; Redes neuronales--Aplicaciones; Visión por computadoras |
| dc.subject.ocde | https://purl.org/pe-repo/ocde/ford#2.02.05 |
| dc.date.accessioned | 2018-05-03T16:08:58Z |
| dc.date.available | 2018-05-03T16:08:58Z |
| dc.date.created | 2018 |
| dc.date.embargoEnd | 2019-01-01 |
| dc.date.issued | 2018-05-03 |
| dc.type | info:eu-repo/semantics/masterThesis |
| dc.identifier.uri | http://hdl.handle.net/20.500.12404/11982 |
| dc.language.iso | eng |
| dc.relation.ispartof | SUNEDU |
| dc.rights | info:eu-repo/semantics/openAccess |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/2.5/pe/ |
| dc.publisher | Pontificia Universidad Católica del Perú |
| dc.publisher.country | PE |
| dc.source | reponame:PUCP-Tesis; instname:Pontificia Universidad Católica del Perú; instacron:PUCP |
| repository.name | Repositorio de Tesis PUCP |
| repository.mail | raul.sifuentes@pucp.pe |
| Degree | Maestro en Procesamiento de Señales e Imágenes Digitales |
| Degree level | Maestría |
| Degree grantor | Pontificia Universidad Católica del Perú. Escuela de Posgrado |
| Discipline | Procesamiento de Señales e Imágenes Digitales |
| renati.level | https://purl.org/pe-repo/renati/level#maestro |
| renati.type | https://purl.org/pe-repo/renati/type#tesis |
| Primary file | TEJADA_GAMERO_ENRIQUE_OBJECT_DETECTION_VIDEOS.pdf (full text, application/pdf, 3445409 bytes), https://tesis.pucp.edu.pe/bitstreams/c3eaface-986c-4a85-9b3d-2a68dc96111d/download |
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository hosting this document or dataset. CONCYTEC is not responsible for the contents (publications and/or data) accessible through the National Digital Repository of Open Access Science, Technology and Innovation (ALICIA).