Object detection in videos using principal component pursuit and convolutional neural networks
Article description
Object recognition in videos is one of the main challenges in computer vision. Several methods have been proposed to achieve this task, such as background subtraction, temporal differencing, optical flow, and particle filtering, among others. Since the introduction of Convolutional Neural Networks (CNNs) for object detection in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), their use for image detection and classification has grown, becoming the state of the art for this task, with Faster R-CNN the preferred model in the latest ILSVRC challenges. Moreover, the Faster R-CNN model, with minimal modifications, has been successfully used to detect and classify objects (either static or dynamic) in video sequences; in such a setup, the frames of the video are input "as is", i.e., without any pre-processing. In this thesis we propose to use Robust PCA (RPCA, a.k.a. Principal Component Pursuit, PCP) as a video background modeling pre-processing step before the Faster R-CNN model, in order to improve the overall detection and classification performance for, specifically, the moving objects.
| Author: | Tejada Gamero, Enrique David |
|---|---|
| Format: | Master's thesis |
| Publication date: | 2018 |
| Institution: | Pontificia Universidad Católica del Perú |
| Repository: | PUCP-Institucional |
| Language: | English |
| OAI Identifier: | oai:repositorio.pucp.edu.pe:20.500.14657/146521 |
| Resource link: | http://hdl.handle.net/20.500.12404/11982 |
| Access level: | open access |
| Subjects: | Image recognition; Neural networks--Applications; Computer vision; https://purl.org/pe-repo/ocde/ford#2.02.05 |
| id |
RPUC_fa0572be8a98c035fbc7e6d660458bfa |
|---|---|
| oai_identifier_str |
oai:repositorio.pucp.edu.pe:20.500.14657/146521 |
| network_acronym_str |
RPUC |
| network_name_str |
PUCP-Institucional |
| repository_id_str |
2905 |
| spelling |
Rodríguez Valderrama, Paul Antonio (advisor); Tejada Gamero, Enrique David (author); accessioned 2018-05-03T16:08:58Z; available 2018-05-03T16:08:58Z; created 2018; issued 2018-05-03; embargo end 2019-01-01; http://hdl.handle.net/20.500.12404/11982; abstract and classification results as given in the description field below; Tesis; eng; Pontificia Universidad Católica del Perú; PE; info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-nd/2.5/pe/; subjects: Reconocimiento de imágenes, Redes neuronales--Aplicaciones, Visión por computadoras, https://purl.org/pe-repo/ocde/ford#2.02.05; title: Object detection in videos using principal component pursuit and convolutional neural networks; info:eu-repo/semantics/masterThesis; Tesis de maestría; reponame: PUCP-Institucional; instname: Pontificia Universidad Católica del Perú; instacron: PUCP; degree: Maestro en Procesamiento de Señales e Imágenes Digitales (Maestría); Pontificia Universidad Católica del Perú, Escuela de Posgrado; program: Procesamiento de Señales e Imágenes Digitales; 07754238613077; https://purl.org/pe-repo/renati/level#maestro; http://purl.org/pe-repo/renati/type#tesis; 20.500.14657/146521; oai:repositorio.pucp.edu.pe:20.500.14657/146521; last updated 2024-06-10 10:10:36.793; metadata.only; https://repositorio.pucp.edu.pe; Repositorio Institucional de la PUCP; repositorio@pucp.pe |
| dc.title.es_ES.fl_str_mv |
Object detection in videos using principal component pursuit and convolutional neural networks |
| title |
Object detection in videos using principal component pursuit and convolutional neural networks |
| author |
Tejada Gamero, Enrique David |
| author_facet |
Tejada Gamero, Enrique David |
| author_role |
author |
| dc.contributor.advisor.fl_str_mv |
Rodríguez Valderrama, Paul Antonio |
| dc.contributor.author.fl_str_mv |
Tejada Gamero, Enrique David |
| dc.subject.es_ES.fl_str_mv |
Reconocimiento de imágenes Redes neuronales--Aplicaciones Visión por computadoras |
| dc.subject.ocde.es_ES.fl_str_mv |
https://purl.org/pe-repo/ocde/ford#2.02.05 |
| description |
Object recognition in videos is one of the main challenges in computer vision. Several methods have been proposed to achieve this task, such as background subtraction, temporal differencing, optical flow, and particle filtering, among others. Since the introduction of Convolutional Neural Networks (CNNs) for object detection in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), their use for image detection and classification has grown, becoming the state of the art for this task, with Faster R-CNN the preferred model in the latest ILSVRC challenges. Moreover, the Faster R-CNN model, with minimal modifications, has been successfully used to detect and classify objects (either static or dynamic) in video sequences; in such a setup, the frames of the video are input "as is", i.e., without any pre-processing. In this thesis we propose to use Robust PCA (RPCA, a.k.a. Principal Component Pursuit, PCP) as a video background modeling pre-processing step before the Faster R-CNN model, in order to improve the overall detection and classification performance for, specifically, the moving objects. We hypothesize that such a pre-processing step, which segments the moving objects from the background, reduces the number of regions to be analyzed in a given frame and thus (i) improves the classification time and (ii) reduces the classification error for the dynamic objects present in the video. In particular, we use a fully incremental RPCA/PCP algorithm that is suitable for real-time or on-line processing. Furthermore, we present extensive computational results carried out on three different platforms: a high-end server with a Tesla K40m GPU, a desktop with a Tesla K10m GPU, and the embedded system Jetson TK1.
Our classification results attain competitive or superior performance in terms of F-measure, achieving an improvement ranging from 3.7% to 97.2%, with a mean improvement of 22%, when the sparse image was used to detect and classify the objects with the neural network, while at the same time reducing the classification time on all architectures by between 2% and 25%. |
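For reference, the F-measure reported in these results is the harmonic mean of detection precision and recall. A minimal computation, with hypothetical detection counts (not taken from the thesis):

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """F-measure from detection counts: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for one video sequence: 80 true positives,
# 20 false positives, 20 missed detections.
baseline = f_measure(80, 20, 20)   # 0.8
improved = f_measure(90, 10, 10)   # 0.9
# Relative improvement, the kind of percentage quoted above:
gain = (improved - baseline) / baseline   # 0.125, i.e. 12.5%
```

The quoted 3.7%–97.2% figures are relative F-measure improvements of this form, computed per sequence.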
| publishDate |
2018 |
| dc.date.accessioned.es_ES.fl_str_mv |
2018-05-03T16:08:58Z |
| dc.date.available.es_ES.fl_str_mv |
2018-05-03T16:08:58Z |
| dc.date.created.es_ES.fl_str_mv |
2018 |
| dc.date.EmbargoEnd.none.fl_str_mv |
2019-01-01 |
| dc.date.issued.fl_str_mv |
2018-05-03 |
| dc.type.es_ES.fl_str_mv |
info:eu-repo/semantics/masterThesis |
| dc.type.other.none.fl_str_mv |
Tesis de maestría |
| format |
masterThesis |
| dc.identifier.uri.none.fl_str_mv |
http://hdl.handle.net/20.500.12404/11982 |
| url |
http://hdl.handle.net/20.500.12404/11982 |
| dc.language.iso.es_ES.fl_str_mv |
eng |
| language |
eng |
| dc.rights.es_ES.fl_str_mv |
info:eu-repo/semantics/openAccess |
| dc.rights.uri.*.fl_str_mv |
http://creativecommons.org/licenses/by-nc-nd/2.5/pe/ |
| eu_rights_str_mv |
openAccess |
| rights_invalid_str_mv |
http://creativecommons.org/licenses/by-nc-nd/2.5/pe/ |
| dc.publisher.es_ES.fl_str_mv |
Pontificia Universidad Católica del Perú |
| dc.publisher.country.es_ES.fl_str_mv |
PE |
| dc.source.none.fl_str_mv |
reponame:PUCP-Institucional instname:Pontificia Universidad Católica del Perú instacron:PUCP |
| instname_str |
Pontificia Universidad Católica del Perú |
| instacron_str |
PUCP |
| institution |
PUCP |
| reponame_str |
PUCP-Institucional |
| collection |
PUCP-Institucional |
| repository.name.fl_str_mv |
Repositorio Institucional de la PUCP |
| repository.mail.fl_str_mv |
repositorio@pucp.pe |
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository hosting this document or dataset. CONCYTEC is not responsible for the contents (publications and/or data) accessible through the National Digital Repository of Open Access Science, Technology and Innovation (ALICIA).