Prediction of PM2.5 and PM10 Concentrations Using XGBoost and LightGBM Algorithms: A Case Study in Lima, Peru

Oblitas Mantilla, Johan Andrés; Escobedo Cárdenas, Edwin Jhonatan

Bibliographic Details
Authors:	Oblitas Mantilla, Johan Andrés; Escobedo Cárdenas, Edwin Jhonatan
Format:	article
Publication Date:	2024
Institution:	Universidad de Lima
Repository:	Revistas - Universidad de Lima
Language:	English
OAI Identifier:	oai:revistas.ulima.edu.pe:article/7417
Resource link:	https://revistas.ulima.edu.pe/index.php/Interfases/article/view/7417
Access level:	open access
Subject:	air pollution; air quality; meteorological data; machine learning; XGBoost; LightGBM

id	REVULIMA_5a808105d4f00e3a35c89d2219ce2472
oai_identifier_str	oai:revistas.ulima.edu.pe:article/7417
network_acronym_str	REVULIMA
network_name_str	Revistas - Universidad de Lima
repository_id_str
dc.title.none.fl_str_mv	Prediction of PM2.5 and PM10 Concentrations Using XGBoost and LightGBM Algorithms: A Case Study in Lima, Peru / Predicción de concentraciones de PM2.5 y PM10 utilizando los algoritmos XGboost y LightGBM: un estudio de caso en Lima, Perú
title	Prediction of PM2.5 and PM10 Concentrations Using XGBoost and LightGBM Algorithms: A Case Study in Lima, Peru
dc.creator.none.fl_str_mv	Oblitas Mantilla, Johan Andrés; Escobedo Cárdenas, Edwin Jhonatan
author	Oblitas Mantilla, Johan Andrés
author_facet	Oblitas Mantilla, Johan Andrés; Escobedo Cárdenas, Edwin Jhonatan
author_role	author
author2	Escobedo Cárdenas, Edwin Jhonatan
author2_role	author
dc.subject.none.fl_str_mv	air pollution; air quality; meteorological data; machine learning; XGBoost; LightGBM
topic	air pollution; air quality; meteorological data; machine learning; XGBoost; LightGBM
description	Air pollution is a major problem that affects both human health and the environment, causing millions of premature deaths worldwide each year and severely degrading the state of the planet. Fine particulate matter is highly hazardous: the particles penetrate deep into the lungs and lead to serious health issues, including a reduction in life expectancy of more than two years. In response to this problem, it is crucial to identify effective ways to monitor the levels of these pollutants in our daily surroundings. This article presents a case study conducted in the district of San Borja, Lima, Peru, where prediction models for PM2.5 and PM10 were implemented using the XGBoost and LightGBM algorithms. Using data from the SENAMHI portal and a correlation analysis of the variables, two scenarios were developed for training the models. In scenario 1, prediction models for PM2.5 and PM10 were trained using all available meteorological and pollution variables. In scenario 2, the PM2.5 models were trained without the PM10 variable, and the PM10 models without the PM2.5 variable. The results showed that both algorithms achieved high accuracy, as measured by the coefficient of determination, with no statistically significant difference indicating the superiority of either one. Furthermore, the analysis of the two scenarios revealed that excluding key variables can result in significantly less accurate predictions, potentially undermining the effectiveness of environmental management strategies.
publishDate	2024
dc.date.none.fl_str_mv	2024-12-26
dc.type.none.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion
format	article
status_str	publishedVersion
dc.identifier.none.fl_str_mv	https://revistas.ulima.edu.pe/index.php/Interfases/article/view/7417 10.26439/interfases2024.n020.7417
url	https://revistas.ulima.edu.pe/index.php/Interfases/article/view/7417
identifier_str_mv	10.26439/interfases2024.n020.7417
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	https://revistas.ulima.edu.pe/index.php/Interfases/article/view/7417/7473 https://revistas.ulima.edu.pe/index.php/Interfases/article/view/7417/7474
dc.rights.none.fl_str_mv	https://creativecommons.org/licenses/by/4.0 info:eu-repo/semantics/openAccess
rights_invalid_str_mv	https://creativecommons.org/licenses/by/4.0
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf text/html
dc.publisher.none.fl_str_mv	Universidad de Lima
publisher.none.fl_str_mv	Universidad de Lima
dc.source.none.fl_str_mv	Interfases; No. 020 (2024); 185-208 | 1993-4912 | 10.26439/interfases2024.n020 | reponame:Revistas - Universidad de Lima | instname:Universidad de Lima | instacron:ULIMA
instname_str	Universidad de Lima
instacron_str	ULIMA
institution	ULIMA
reponame_str	Revistas - Universidad de Lima
collection	Revistas - Universidad de Lima
repository.name.fl_str_mv
repository.mail.fl_str_mv
_version_	1844893192387821568
score	12.615219
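
The two training scenarios and the evaluation metric named in the description can be illustrated with a minimal Python sketch. It is not the authors' code: the column names (`temperature`, `humidity`, `wind_speed`, `no2`, `pm25`, `pm10`) and the helper functions are hypothetical, since this record does not list the actual SENAMHI variables, and no XGBoost or LightGBM model is trained here. The sketch only shows which predictors each scenario would keep and how the coefficient of determination is computed.

```python
from __future__ import annotations

from statistics import mean

# Hypothetical variable names; the record does not list the real SENAMHI columns.
METEOROLOGICAL = ["temperature", "humidity", "wind_speed"]
POLLUTANTS = ["pm25", "pm10", "no2"]
# For each particulate target, the other particulate variable dropped in scenario 2.
OTHER_PARTICULATE = {"pm25": "pm10", "pm10": "pm25"}


def scenario_features(target: str, scenario: int) -> list[str]:
    """Return the predictor columns for a PM target under scenario 1 or 2."""
    if target not in OTHER_PARTICULATE:
        raise ValueError(f"unknown target: {target}")
    if scenario not in (1, 2):
        raise ValueError(f"unknown scenario: {scenario}")
    # Scenario 1: every meteorological and pollution variable except the target.
    features = METEOROLOGICAL + [p for p in POLLUTANTS if p != target]
    if scenario == 2:
        # Scenario 2: additionally exclude the other particulate variable.
        features = [f for f in features if f != OTHER_PARTICULATE[target]]
    return features


def r_squared(y_true: list[float], y_pred: list[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    for scenario in (1, 2):
        print(scenario, scenario_features("pm25", scenario))
```

In a full experiment, the lists returned by `scenario_features` would select the training columns for each gradient-boosting regressor, and `r_squared` would be applied to held-out predictions to compare the two scenarios.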
