Machine Learning: Comparison of algorithms for determining water quality in the Rímac river

Article Description

Evaluating river water quality is necessary to manage its use efficiently. Doing so requires physicochemical and biological analyses to determine whether the water is safe, and these involve determining a series of parameters with various analytical methods...


Bibliographic Details
Author: Marroquín Peralta, Juan Miguel
Format: undergraduate thesis
Publication date: 2021
Institution: Universidad de Lima
Repository: ULIMA-Institucional
Language: Spanish
OAI Identifier: oai:repositorio.ulima.edu.pe:20.500.12724/14791
Resource link: https://hdl.handle.net/20.500.12724/14791
Access level: open access
Subject: Machine learning (Artificial intelligence)
Water quality
Rivers
Lima (Peru)
https://purl.org/pe-repo/ocde/ford#2.02.04
Advisor: García López, Yván Jesús
Description: Evaluating river water quality is necessary to manage its use efficiently. Doing so requires physicochemical and biological analyses to determine whether the water is safe, and these involve determining a series of parameters with various analytical methods that are often tedious and time-consuming. The present study compares three machine learning models, Multiple Linear Regression (MLR), Backpropagation Neural Network (BPNN), and Support Vector Regression (SVR), for estimating Dissolved Oxygen (DO) and Biochemical Oxygen Demand (BOD) in order to determine the water quality of the Rímac River. Water samples were collected from 26 stations and non-point sources of contamination along the Rímac River, yielding 624 records from 2010 to 2012. The physical and chemical parameters fed into the models include pH, turbidity, total dissolved solids, temperature, electrical conductivity, dissolved oxygen, biochemical oxygen demand, chemical oxygen demand, hardness, chloride, sulfate, calcium, magnesium, and nitrate. The dependent (output) variables of the models are BOD and DO. The independent variables selected for BOD were pH, electrical conductivity (EC), turbidity, nitrites, total organic carbon (TOC), chemical oxygen demand (COD), iron, and chlorides. For DO, they were temperature, nitrites, COD, nitrates, total dissolved solids (TDS), chlorides, and total solids. Both dependent variables have 8 independent variables, chosen for having the highest correlation coefficients. The models were trained on 70% of the data set and validated on the remaining 30%. For the estimation of BOD, the BPNN with 16 hidden nodes achieved R² = 0.857 in training and 0.481 in the test phase. For the estimation of DO, the BPNN with 8 hidden nodes achieved R² = 0.768 in training and 0.605 in the test phase. These values were higher than those of MLR and SVR, which indicated that the BPNN was the best choice.
Finally, the classification of water quality as Good, Fair, or Poor obtained a precision of 0.88, a sensitivity of 0.86, and an F1-score of 85%, which demonstrated the effectiveness of the approach for this task.
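The evaluation quantities named in the abstract (a 70%/30% split, R² for the regression models, and precision, sensitivity, and F1-score for the three-class quality label) can be sketched in plain Python as follows. This is only an illustrative sketch, not the thesis's code: the function names, the shuffling seed, the rounding of the split point, and the choice of macro averaging across classes are all assumptions, since the record does not say how the split or the averaging was done.

```python
import random


def split_70_30(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and split it into train and test parts.

    The seed and the rounding rule are assumptions made for reproducibility.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def macro_scores(y_true, y_pred, labels):
    """Return macro-averaged (precision, recall, F1) over the given labels.

    Recall is the same quantity the abstract calls sensitivity.
    Macro averaging is an assumption; the record does not name the scheme.
    """
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical usage: 624 placeholder record indices, as many as the study reports.
train, test = split_70_30(range(624))
print(len(train), len(test))

# Toy labels, invented purely to exercise the function.
truth = ["Good", "Good", "Fair", "Poor"]
guess = ["Good", "Fair", "Fair", "Poor"]
print(macro_scores(truth, guess, ["Good", "Fair", "Poor"]))
```

Comparing R² on the training part with R² on the held-out part, as the abstract does, is what exposes the gap between fit and generalization (for example 0.857 versus 0.481 for BOD).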
Citation: Marroquín Peralta, J. M. (2021). Machine Learning: Comparison of algorithms for determining water quality in the Rímac river [Thesis for the professional degree of Systems Engineer, Universidad de Lima]. Institutional repository of the Universidad de Lima. https://hdl.handle.net/20.500.12724/14791
License: https://creativecommons.org/licenses/by-nc-sa/4.0/