Showing 1 - 20 of 365 results for the search 'data mining process', query time: 0.99s
1
article
This article presents a methodology that applies natural language processing and classification algorithms using data mining techniques and incorporates procedures for validation and verification of significance. This is conducted according to the analysis and selection of data and results based on quality statistical analysis, which guarantees the effectiveness percentage in knowledge construction. The case study is the analysis of computer incidents within an educational institution, using a standardized database of historical computer incidents collected by the Service Desk area. This area is linked to all information technology processes and focuses on the support requirements for the performance of employee activities. As long as users’ requirements are not fulfilled in a timely manner, the impact of incidents may give rise to work problems at different levels, making it d...
2
article
This article presents a methodology that applies natural language processing and classification algorithms using data mining techniques and incorporates procedures for validation and verification of significance. This is conducted according to the analysis and selection of data and results based on quality statistical analysis, which guarantees the effectiveness percentage in knowledge construction. The case study is the analysis of computer incidents within an educational institution, using a standardized database of historical computer incidents collected by the Service Desk area. This area is linked to all information technology processes and focuses on the support requirements for the performance of employee activities. As long as users’ requirements are not fulfilled in a timely manner, the impact of incidents may give rise to work problems at different levels, making it d...
3
article
Technological advances have made it possible to collect and store large volumes of data over the years. It is also important that today's applications perform well and can analyze these large datasets effectively. It remains a challenge for data mining to keep its algorithms and applications efficient as data size and dimensionality increase [1]. To achieve this goal, many applications rely on parallelism, which reduces the execution time of algorithms by taking advantage of current computer architectures to run several processes concurrently [2]. This paper proposes a parallel version of the FuzzyPred algorithm based on the amount of data that can be processed within each of the processing threads, synchronously and independently.
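The chunk-per-thread idea in this abstract can be illustrated with a minimal sketch. It is not the FuzzyPred algorithm: the membership function, the names `chunked`, `truth_sum` and `parallel_mean_truth`, and the sample values are all assumptions made for illustration. Each thread processes one chunk independently, and the partial results are combined only after every chunk has finished.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(records, chunk_size):
    """Split records into consecutive chunks of at most chunk_size items."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def truth_sum(chunk):
    """Hypothetical per-chunk work: sum a fuzzy membership over the chunk.

    The membership (how 'high' a value in [0, 100] is) is an illustrative
    stand-in, not the FuzzyPred evaluation itself.
    """
    return sum(min(max(x / 100.0, 0.0), 1.0) for x in chunk)


def parallel_mean_truth(records, chunk_size=2, workers=4):
    """Evaluate chunks independently in threads, then combine the partial sums."""
    chunks = chunked(records, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(truth_sum, chunks))  # waits for every chunk
    return sum(partial) / len(records)


# Made-up sample values, not data from the paper.
data = [10, 50, 90, 100, 0, 50]
result = parallel_mean_truth(data)
```

In CPython, threads do not speed up CPU-bound work because of the global interpreter lock, so a process pool would be the usual substitute there; the structure of the sketch stays the same.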
4
article
Student academic performance at universities is crucial for education management systems. Many actions and decisions are made based on it, specifically the enrollment process. During enrollment, students have to decide which courses to sign up for. This research presents the rationale behind the design of a recommender system to support the enrollment process using the students’ academic performance record. To build this system, the CRISP-DM methodology was applied to data from students of the Computer Science Department at University of Lima, Perú. One of the main contributions of this work is the use of two synthetic attributes to improve the relevance of the recommendations made. The first attribute estimates the inherent difficulty of a given course. The second attribute, named potential, is a measure of the competence of a student for a given course based on the grades obtained i...
5
book chapter
This work proposes a semi-automated analysis and modeling package for Machine Learning problems. The library's goal is to reduce the steps involved in a traditional data science roadmap. To do so, Sparkmach takes advantage of Machine Learning techniques to build base models for both classification and regression problems. These models include exploratory data analysis, data preprocessing, feature engineering and modeling. The project is based on Pymach, a similar library that handles those steps for small and medium-sized datasets (about ten million rows and a few columns). Sparkmach's central task is to scale Pymach to handle big datasets by using Apache Spark, a distributed engine for large-scale data processing that tackles several data science problems in a cluster environment. Despite the software nature, Sparkmach can be of use for local ...
6
article
Currently, type 2 diabetes mellitus is one of the world's most prevalent diseases and has claimed millions of lives. The present research aims to determine the impact of machine learning on the diagnostic process of type 2 diabetes mellitus and to offer a tool that facilitates quick and easy diagnosis of the disease. Different machine learning models were designed and compared; random forest generated the model with the best performance (90.43% accuracy), which was integrated into a web platform, working with the PIMA dataset, which was validated by specialists from the Peruvian League for the Fight against Diabetes organization. The result was a decrease of (A) 88.28% in the information collection time, (B) 99.99% in the diagnosis time, (C) 44.42% in the diagnosis cost, and (D) 100% in the level of difficulty, concluding that the appl...
7
article
Currently, type 2 diabetes mellitus is one of the world's most prevalent diseases and has claimed millions of lives. The present research aims to determine the impact of machine learning on the diagnostic process of type 2 diabetes mellitus and to offer a tool that facilitates quick and easy diagnosis of the disease. Different machine learning models were designed and compared; random forest generated the model with the best performance (90.43% accuracy), which was integrated into a web platform, working with the PIMA dataset, which was validated by specialists from the Peruvian League for the Fight against Diabetes organization. The result was a decrease of (A) 88.28% in the information collection time, (B) 99.99% in the diagnosis time, (C) 44.42% in the diagnosis cost, and (D) 100% in the level of difficulty, concluding that the appl...
8
article
The full text of this work is not available in the UPC Academic Repository due to restrictions imposed by the publisher where it was published.
9
article
Information has become increasingly important to us. The present study gives an introduction to the concept of data mining as a tool that many businesses and organizations use to improve their decision-making processes.
10
article
Information has become increasingly important to us. The present study gives an introduction to the concept of data mining as a tool that many businesses and organizations use to improve their decision-making processes.
11
article
In this paper, we propose an optimization model for medical services processes to reduce waiting time using process mining. In medical services, there is a high percentage of dissatisfaction with medical care due to the processes related to appointment booking and waiting time for medical consultation. As a result, patients change medical services due to the urgency of the symptoms they suffer, generating distrust in health services in Peru. Through a medical information system, events of medical care processes are collected for analysis using the Celonis tool. The process mining discipline uses the discovery of the study process to identify existing bottlenecks in the process and violations that are included when monitoring process events. The proposed model is based on identifying the existing bottlenecks in the processes, which are appointment booking and office care, as these process...
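As a rough illustration of how bottlenecks can be read from an event log, the sketch below averages the waiting time between consecutive activities in each case and picks the slowest transition. It does not use Celonis or reproduce the paper's model; the function `mean_waits`, the activity names and the timestamps are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def mean_waits(events):
    """Average minutes between consecutive activities within each case."""
    by_case = defaultdict(list)
    for case_id, activity, timestamp in events:
        by_case[case_id].append((datetime.fromisoformat(timestamp), activity))
    waits = defaultdict(list)
    for steps in by_case.values():
        steps.sort()  # order each case's events by time
        for (t0, a0), (t1, a1) in zip(steps, steps[1:]):
            waits[(a0, a1)].append((t1 - t0).total_seconds() / 60)
    return {pair: sum(w) / len(w) for pair, w in waits.items()}


# Hypothetical event log: (case id, activity, ISO timestamp).
log = [
    ("p1", "booking", "2024-01-01T08:00"),
    ("p1", "check-in", "2024-01-01T09:00"),
    ("p1", "consultation", "2024-01-01T09:30"),
    ("p2", "booking", "2024-01-01T08:10"),
    ("p2", "check-in", "2024-01-01T10:10"),
    ("p2", "consultation", "2024-01-01T10:20"),
]
averages = mean_waits(log)
# The transition with the longest average wait is the candidate bottleneck.
bottleneck = max(averages, key=averages.get)
```

With this made-up log, the slowest transition is the one from booking to check-in.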
12
article
This paper reviews the most recent literature on experiments with different Machine Learning, Deep Learning and Natural Language Processing techniques applied to predict judicial and administrative decisions. Among the most outstanding findings, the most used data mining techniques are Support Vector Machine (SVM), K Nearest Neighbours (K-NN) and Random Forest (RF), and the most used deep learning techniques are Long Short-Term Memory (LSTM) and transformers such as BERT. An important finding in the papers reviewed was that the use of machine learning techniques has prevailed over that of deep learning. Regarding the place of origin of the research carried out, we found that 64% of the works belong to studies carried out in English-speaking countries, 8% in Portuguese and 28% in other languages (such as German, Chinese, Turkish, Spanish, etc.). Very few works of...
13
article
There are several varieties of respiratory diseases that mainly affect children between 0 and 5 years of age, and there is no complete report on the behavior of each of them. This research studies patterns in the respiratory diseases of children in Peru through data mining, using data generated by the health sector, organizations and research between 2015 and 2019. The K-Means clustering algorithm was used to analyze this data, identifying patterns in a total of 10,000 Peruvian clinical records from those years and revealing different behaviors. The resulting clusters show that most cases, across all the ages studied, presented diseases with codes in the range of approximately 000 to 060. This research was carried ...
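For readers unfamiliar with K-Means, the following is a minimal sketch of the standard Lloyd iteration on one-dimensional values: assign each value to its nearest centroid, then move each centroid to the mean of its cluster. The function `kmeans_1d`, the starting centroids and the numeric codes are assumptions for illustration and are not taken from the study's clinical records.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Lloyd's algorithm on scalars: assign to nearest centroid, then re-average."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Made-up numeric codes, not data from the study.
codes = [5, 10, 15, 200, 210, 220]
centroids, clusters = kmeans_1d(codes, centroids=[0, 100])
```

Fixed starting centroids keep the sketch deterministic; practical implementations usually pick them at random and repeat the run several times.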
14
undergraduate thesis
Big Data applied to mining involves combining algorithms embedded in advanced technological tools to process a large quantity of data. Power BI allows different data formats to interact, and for integration it supports Python scripts. In this article, Big Data was applied to essential activities such as drilling and blasting, analyzing parameters, standards, quantities and advances. The objective was to develop a system that integrates a large quantity of data for analysis and interpretation, contributing to decision making in the mining operation. The development of a dashboard for interactive, indicator-based reporting will allow the interested parties to visualize results more efficiently and virtually. Finally, the application of Big Data in the field of mining, mainly in the treatment of its data, will be the trend of the future, which will allow ...
15
article
The article presents the results of non-experimental, descriptive and explanatory research on text and sentiment mining. The corpus comprises all 12 official speeches of the former president of Peru, Pedro Castillo Terrones, which are in the public domain on the Peruvian state portal. The objective was to determine the message behind these speeches. The data were processed in R through RStudio, using the NRC lexicon. The results show messages aimed at transmitting the idea of a welfare state; therefore, they are not aimed at promoting entrepreneurial initiative in the population, but rather at the state's receipt of resources, especially economic resources, which is in line with the ideology of the party that brought him to power. As for sentiments, positive ones predominate. Being a topic...
16
article
This study presents Datalyzer, a system designed for data extraction, visualization, and prediction in the mining sector using advanced NLP and machine learning, specifically GPT-3.5 Turbo. The system enhances operational efficiency through rigorous data preprocessing and specialized fine-tuning, validated on a simulated mining dataset. Results show significant improvements: data extraction time reduced by 94% and visualization time by 97.6%. These improvements indicate a transformation in efficiency, usability, and user satisfaction. Despite limitations in data variability and complexity, this pioneering approach highlights the potential of NLP and machine learning in modernizing the mining industry and supporting data-driven decision-making.
17
article
Process mining enables organizations to discover, analyze, and enhance their business processes by extracting knowledge from event logs available in current information systems. This research proposes a process mining methodology for developing business and scientific-academic projects, consisting of nine phases: planning and scope, data preprocessing, data processing, flow control analysis, performance analysis, role analysis, results presentation, results publication, and transfer and monitoring. This methodology results from the researcher’s literature review, analysis of published methodologies, and their knowledge and experiences. Future research will apply this process mining methodology to various case studies in business and scientific-academic domains. This research aims to analyze the methodology’s efficiency, its application’s ease and relevance, and the congruence betwe...
18
article
Process mining enables organizations to discover, analyze, and enhance their business processes by extracting knowledge from event logs available in current information systems. This research proposes a process mining methodology for developing business and scientific-academic projects, consisting of nine phases: planning and scope, data preprocessing, data processing, flow control analysis, performance analysis, role analysis, results presentation, results publication, and transfer and monitoring. This methodology results from the researcher’s literature review, analysis of published methodologies, and their knowledge and experiences. Future research will apply this process mining methodology to various case studies in business and scientific-academic domains. This research aims to analyze the methodology’s efficiency, its application’s ease and relevance, and the congruence betwe...
19
article
The main objective of this research work was to optimize the recovery of zinc in the second stage of flotation of polymetallic minerals at the company Mines and Metals Trading Peru, where the average annual recovery of zinc is 82%. At the concentrator plant, two samplings were carried out: the first of fresh mineral from belt No. 01, which feeds the primary grinding, and the second of pulp obtained from the bulk Pb/Ag flotation tails. During the study, preliminary flotation tests were carried out to select the independent variables with the most influence on the dependent variable, using the MINITAB statistical program. Flotation tests were performed varying the dosage of copper sulfate (g/TM) and the regrind time (minutes). With the results obtained, the optimization was carried out with the Hexagonal Design, obtaining maximum values of copper sulfate of 351.06 g/TM and regrinding time ...
20
article
The main objective of this research work was to optimize the recovery of zinc in the second stage of flotation of polymetallic minerals at the company Mines and Metals Trading Peru, where the average annual recovery of zinc is 82%. At the concentrator plant, two samplings were carried out: the first of fresh mineral from belt No. 01, which feeds the primary grinding, and the second of pulp obtained from the bulk Pb/Ag flotation tails. During the study, preliminary flotation tests were carried out to select the independent variables with the most influence on the dependent variable, using the MINITAB statistical program. Flotation tests were performed varying the dosage of copper sulfate (g/TM) and the regrind time (minutes). With the results obtained, the optimization was carried out with the Hexagonal Design, obtaining maximum values of copper sulfate of 351.06 g/TM and regrinding time ...