Essay on predictive Data Mining models for early-stage diabetes
Article Description
| Author: | |
|---|---|
| Format: | undergraduate thesis |
| Publication date: | 2023 |
| Institution: | Universidad Nacional de Trujillo |
| Repository: | UNITRU-Tesis |
| Language: | Spanish |
| OAI Identifier: | oai:dspace.unitru.edu.pe:20.500.14414/18558 |
| Resource link: | https://hdl.handle.net/20.500.14414/18558 |
| Access level: | open access |
| Subject: | Diabetes risk; Early stage; Data mining; Logistic regression; Support Vector Machine; Evaluation metrics |
| Summary: | Diabetes is a common and potentially deadly chronic health problem whose prevalence has increased dramatically in recent years. About 50% of all people with diabetes are undiagnosed because of its long asymptomatic phase, which is why detecting diabetes at an early stage is of vital importance. Health science has advanced to the point that data mining classification techniques are well accepted by the scientific community for building predictive models of disease risk. The present investigation used a data set of 520 records, collected through a direct survey of patients at the Sylhet Diabetes Hospital in Bangladesh. The analysis was carried out using two classification algorithms: Logistic Regression (a classical statistical technique) and Support Vector Machine (a machine learning technique). After fitting the models and evaluating them with accuracy, sensitivity and AUC (in that order), the Support Vector Machine model was found to have a better fit and predictive power (0.98, 0.98, 0.99) than the logistic regression model (0.92, 0.94, 0.97). Finally, recommendations were offered for controlling risk factors. |
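
The summary compares the two models on accuracy, sensitivity and AUC. As a rough illustration of what those three metrics measure, the following minimal Python sketch computes them from scratch. Everything in it is an assumption made for illustration: the function names, the 0.5 decision threshold, and the eight-row toy data set are hypothetical and are not taken from the thesis or from the Sylhet Diabetes Hospital data.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True positive rate: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive case scores higher than a randomly chosen
    negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = diabetic, 0 = non-diabetic.
# The scores stand in for a classifier's predicted risk; they are invented.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(y_true, y_pred))     # 0.75
print(sensitivity(y_true, y_pred))  # 0.75
print(auc(y_true, scores))          # 0.9375
```

Accuracy and sensitivity depend on the chosen threshold, while AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason the three metrics are commonly reported together.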
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or data set is hosted. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).