Underwater Plastic Waste Detection with YOLO and Vision Transformer Models

Bibliographic Details
Authors: Cárdenas Rondoño, Jonathan Bruce; Vasquez Espinoza, Ners Armando; Escobedo Cárdenas, Edwin Jonathan
Format: article
Publication date: 2025
Institution: Universidad de Lima
Repository: Revistas - Universidad de Lima
Language: English
OAI Identifier: oai:ojs.pkp.sfu.ca:article/7868
Resource link: https://revistas.ulima.edu.pe/index.php/Interfases/article/view/7868
Access level: open access
Subject: object detection
deep learning
plastic waste
object detection model
underwater images
detección de objetos
aprendizaje profundo
residuos plásticos
modelos de detección de objetos
imágenes submarinas
Description
Summary: This study addresses the global issue of marine pollution, with a particular focus on plastic bag contamination, by leveraging real-time object detection techniques powered by deep learning algorithms. A detailed comparison was carried out between the YOLOv8, YOLO-NAS, and RT-DETR models to assess their effectiveness in detecting plastic waste in underwater environments. The methodology encompassed several key stages, including data preprocessing, model implementation, and training through transfer learning. Evaluation was conducted using a simulated video environment, followed by an in-depth comparison of the results. Performance assessment was based on critical metrics such as mean average precision (mAP), recall, and inference time. The YOLOv8 model achieved an mAP50 of 0.921 on the validation dataset, along with a recall of 0.829 and an inference time of 14.1 milliseconds. The YOLO-NAS model, by contrast, reached an mAP50 of 0.813, a higher recall of 0.903, and an inference time of 17.8 milliseconds. The RT-DETR model obtained an mAP50 of 0.887, a recall of 0.819, and an inference time of 15.9 milliseconds. Notably, despite not having the highest mAP, the RT-DETR model demonstrated superior detection performance when deployed in real-world underwater conditions, highlighting its robustness and potential for practical environmental monitoring.
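The reported figures can be set side by side with a few lines of Python. This is a minimal sketch for reading the summary, not the authors' evaluation code: the names `Reported`, `RESULTS`, and `best_by` are hypothetical, the values are copied from the summary above, and the frames-per-second figure assumes the reported inference time is per frame and ignores pre- and post-processing, so it is only an upper bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reported:
    """Validation metrics as stated in the summary (hypothetical container)."""
    map50: float         # mean average precision at IoU threshold 0.50
    recall: float        # fraction of ground-truth objects that were detected
    inference_ms: float  # reported inference time in milliseconds

    @property
    def fps(self) -> float:
        # Assumption: inference_ms is the time for one frame, so
        # 1000 / ms is an upper bound on model-only throughput.
        return 1000.0 / self.inference_ms


# Values copied from the summary; nothing here is measured by this snippet.
RESULTS = {
    "YOLOv8": Reported(map50=0.921, recall=0.829, inference_ms=14.1),
    "YOLO-NAS": Reported(map50=0.813, recall=0.903, inference_ms=17.8),
    "RT-DETR": Reported(map50=0.887, recall=0.819, inference_ms=15.9),
}


def best_by(metric: str, higher_is_better: bool = True) -> str:
    """Return the model name with the best reported value for one metric."""
    pick = max if higher_is_better else min
    return pick(RESULTS, key=lambda name: getattr(RESULTS[name], metric))


if __name__ == "__main__":
    for name, r in RESULTS.items():
        print(f"{name:9s} mAP50={r.map50:.3f} recall={r.recall:.3f} "
              f"{r.inference_ms:.1f} ms (~{r.fps:.1f} FPS)")
    print("best mAP50:  ", best_by("map50"))
    print("best recall: ", best_by("recall"))
    print("fastest:     ", best_by("inference_ms", higher_is_better=False))
```

On these validation numbers no single model leads on every metric: YOLOv8 has the highest mAP50 and the lowest inference time, while YOLO-NAS has the highest recall. The summary's observation that RT-DETR performed best under deployment conditions is a separate finding that these three metrics alone do not capture.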
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository hosting this document or dataset. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).