Search results - data extraction process


1

article

Data Extraction, Visualization, and Prediction Through Natural Language Processing

Published by
Alvarado, Carlos, Velásquez, Gabriel, Mauricio, David

Published 2024

This study presents Datalyzer, a system designed for data extraction, visualization, and prediction in the mining sector using advanced NLP and machine learning, specifically GPT-3.5 Turbo. The system enhances operational efficiency through rigorous data preprocessing and specialized fine-tuning, validated on a simulated mining dataset. Results show significant improvements: data extraction time reduced by 94% and visualization time by 97.6%. These improvements indicate a transformation in efficiency, usability, and user satisfaction. Despite limitations in data variability and complexity, this pioneering approach highlights the potential of NLP and machine learning in modernizing the mining industry and supporting data-driven decision-making.

2

article

Parallel Algorithm for Reduction of Data Processing Time in Big Data

Published by
Silva, Jesús, Hernández Palma, Hugo, Niebles Núñez, William, Ovallos-Gazabon, David, Varela, Noel

Published 2020

Technological advances have made it possible to collect and store large volumes of data over the years. It is also important that today's applications perform well and can analyze these large datasets effectively. It remains a challenge for data mining to keep its algorithms and applications equally efficient in the face of increasing data size and dimensionality [1]. To achieve this goal, many applications rely on parallelism, which reduces the execution-time cost of algorithms by taking advantage of the characteristics of current computer architectures to run several processes concurrently [2]. This paper proposes a parallel version of the FuzzyPred algorithm based on the amount of data that can be processed within each of the processing threads, synchronously and independently.

3

article

Extracting and Retargeting Color Mappings from Bitmap Images of Visualizations

Published by
Poco J., Mayhua A., Heer J.

Published 2018

This work was supported by a Paul G. Allen Family Foundation Distinguished Investigator Award and the Moore Foundation Data-Driven Discovery Investigator program. The second author gratefully acknowledges CONCYTEC for a scholarship in support of graduate studies.

4

article

Application of the KDD Process for the Visualization of Integrated Geo-Referenced Textual Data from the Pre-processing Phase

Published by
Gomez, Flavio, Iquira, Diego, Cuadros Valdivia, Ana María

Published 2018

Geo-referenced textual data has been the subject of multiple investigations because it provides opportunities to better understand certain phenomena according to the content that is shared, either online, such as social networks, blogs, and news, or through repositories such as scientific research articles and geo-referenced virtual books, among others. However, the characteristics of this information are studied, analyzed, and processed separately, either through its textual components or its geo-spatial components, which yields a separate understanding of the results. In this paper, we propose an integration of textual and geo-spatial components from the pre-processing phase to the visualization stage, as part of the Document Mapping process based on the phases of Knowledge Discovery in Databases (KDD). This achieves two main results: (1) minimize the problems that arise in the visual phase, su...

5

master's thesis

High Accuracy GNSS Data Processing and Determination of Displacement by Earthquake

Published by
Mendoza del Águila, Mario César

Published 2021

Based on high-precision GNSS observation data and the coordinate changes of the CORS monitoring station before and after the 2019 magnitude 8.0 earthquake in Peru, the author developed surface deformation analysis software built on scientific processing software, which has scientific and practical value for research on the earthquake's epicenter, magnitude, and geodynamics. The results shown were obtained with PANDA, a scientific GNSS processing package for precise GNSS data analysis developed by Wuhan University, China. The results are highly precise, on the order of millimeters. They show a displacement of about 2 cm at the GNSS stations near the earthquake, to the northwest.

6

technical report

El modelo Data warehouse-OLAP (online analytical processing)

Published by
Sinti Cabrera, Paolo Héctor

Published 2015

This work systematizes the concepts inherent to the Data Warehouse model, addressing each in an orderly way within a clear conceptual framework that lays out their characteristics and qualities, always taking into account their relationship or interrelation with the other components of the environment. First, the general concepts related to the Data Warehouse are defined. Next, requirements definition and the business processes for modeling a Data Warehouse are introduced, and their most relevant and significant aspects are presented. Then, all the components involved in Data Integration are specified and detailed in an organized and intuitive manner, with attention to how they interrelate. After that, Dimensional Design for business processes is described. Finally, some concepts will be described that...

7

article

ELLAS Architecture and Process: Collecting and Curating Data on Women’s Presence in STEM

Published by
Berardi, Rita Cristina Galarraga, Auceli, Pedro Henrique Stolarski, Maciel, Cristiano, Fritoli, Rodgers, Dávila Calle, Guillermo Antonio, Guzman, Indira, Mendes, Luana

Published 2024

The underrepresentation of women in STEM fields needs to be highlighted through data to assist decisionmakers and public policy creators in addressing the issue effectively. However, the lack of structured, organized data published openly in this domain is still a reality. To address this problem, a Latin American research network called ELLAS was created. The project’s goal is to develop a platform with Semantic Web-based technologies to structure and concentrate data from Brazil, Peru, and Bolivia, initially. This paper presents the processes defined for the collection and curation of both unstructured and structured data, sourced from scientific articles, social networks, and existing open data. We explore the architecture design in a way that facilitates understanding of the details of the processes and the actors involved for each data source. We present the preliminary results fr...

8

article

JOINT PRODUCTION SECTOR, EXTRACTIVE, PROCESSING AND MARKETING NEEDED TO BOOST THE ECONOMIC DEVELOPMENT OF THE COUNTRY

Published by
García Z., Teonila

Published 2002

The article proposes creating a Ministry of Production to support a national production plan: a sector that formulates policy and directs production, processing, and extractive activity. It would promote, coordinate, and guide the industrialization and commercialization of aquatic, agricultural, and mining-metallurgical products, maximizing the productivity of each product needed by the domestic market and for export.

9

article

JOINT PRODUCTION SECTOR, EXTRACTIVE, PROCESSING AND MARKETING NEEDED TO BOOST THE ECONOMIC DEVELOPMENT OF THE COUNTRY

Published by
García Z., Teonila

Published 2002

The article proposes creating a Ministry of Production to support a national production plan: a sector that formulates policy and directs production, processing, and extractive activity. It would promote, coordinate, and guide the industrialization and commercialization of aquatic, agricultural, and mining-metallurgical products, maximizing the productivity of each product needed by the domestic market and for export.

10

article

Kimball data warehouse for the sales analysis process in a manufacturing business in Perú

Published by
Vidal Carlos, Palomino, Obregon Patricia, Condori

Published 2025

The main goal of this research is to demonstrate that the use of innovative technology like business intelligence (BI) in a specific type of business significantly impacts their sales processes, enhancing decision-making, promotional strategies, and consequently customer loyalty and sales growth. The case study is a manufacturing business located in Lima, Peru. The information requirements of this business were analyzed, and a data mart model was created using the Kimball methodology. This multidimensional model enabled the comparison of client sales trends to propose new promotions and marketing strategies. The data analysis used to evaluate the results included hypothesis testing, analysis of employee responses to questionnaires to measure the impact of technology use on sales processes, and data reviews to assess sales increases both before and after the implementation of this technol...

11

doctoral thesis

Forecasting volcanic eruptions based on massive seismic data processing. Application to Peruvian volcanoes

Published by
Machacca Puma, Roger

Published 2024

This dissertation investigates the potential improvement of volcanic eruption understanding and forecasting methods by using advanced data processing techniques to analyze large datasets at three target volcanoes (Piton de la Fournaise (PdlF) (France), Sabancaya, and Ubinas (Peru)). The central objective of this study is to search for possible empirical relationships between the pre-eruptive behavior of the accelerated increase in seismic activity using the Failure Forecast Method (FFM) and velocity variations measured by Coda Wave Interferometry (CWI), since both observations are reported to be independently associated with medium damage. The FFM is a deterministic method used to forecast volcanic eruptions using an empirical relationship of increased and accelerated evolution of an observable (e.g., volcano-seismic event rates). The event rates used with FFM in this study were generate...

12

article

Predictive machine learning applying cross industry standard process for data mining for the diagnosis of diabetes mellitus type 2

Published by
Garcia-Rios, Victor, Marres-Salhuana, Marieta, Sierra-Liñan, Fernando, Cabanillas-Carbonell, Michael

Published 2023

Currently, type 2 diabetes mellitus is one of the world's most prevalent diseases and has claimed millions of lives. The present research aims to determine the impact of using machine learning in the diagnostic process of type 2 diabetes mellitus and to offer a tool that makes diagnosing the disease quick and easy. Different machine learning models were designed and compared; random forest was the algorithm that produced the best-performing model (90.43% accuracy), which was integrated into a web platform, working with the PIMA dataset, and validated by specialists from the Peruvian League for the Fight against Diabetes organization. The result was a decrease of (A) 88.28% in the information collection time, (B) 99.99% in the diagnosis time, (C) 44.42% in the diagnosis cost, and (D) 100% in the level of difficulty, concluding that the appl...

13

article

Predictive machine learning applying cross industry standard process for data mining for the diagnosis of diabetes mellitus type 2

Published by
Garcia-Rios, Victor, Marres-Salhuana, Marieta, Sierra-Liñan, Fernando, Cabanillas-Carbonell, Michael

Published 2023

Currently, type 2 diabetes mellitus is one of the world's most prevalent diseases and has claimed millions of lives. The present research aims to determine the impact of using machine learning in the diagnostic process of type 2 diabetes mellitus and to offer a tool that makes diagnosing the disease quick and easy. Different machine learning models were designed and compared; random forest was the algorithm that produced the best-performing model (90.43% accuracy), which was integrated into a web platform, working with the PIMA dataset, and validated by specialists from the Peruvian League for the Fight against Diabetes organization. The result was a decrease of (A) 88.28% in the information collection time, (B) 99.99% in the diagnosis time, (C) 44.42% in the diagnosis cost, and (D) 100% in the level of difficulty, concluding that the appl...

14

undergraduate thesis

El modelo Data Warehouse – Olap (online analytical processing) la minería de datos de una empresa editorial

Published by
Perea Domper, Goldman Denis, Tiburcio Collantes, Hugo Clay

Published 2014

This report presents and describes a general OLAP model and a prototype Data Warehouse system for public-sector companies in general, implemented in the practical case of a publishing house. It reviews the background and how information is currently consolidated, manually or with the support of other systems; defines the problem; shows the current situation graphically; and sets out the justification for the work and the methods used. The general and specific objectives are detailed, and the concepts of Business Intelligence and Data Warehouse are explained. The general OLAP model for the publishing company is shown, along with the prototype developed to demonstrate part of the solution to the problem. Finally, the conclusions, recommendations, and future work are presented.

15

article

Enhancement of the distribution process on light logistics SMEs in times post-pandemic Covid-19 with Ukraine-Russia conflict by lean logistics and big data

Published by
Rojas-García, José Antonio, Elias-Giordano, Cynthia, Nallusamy, S., Quiroz-Flores, Juan Carlos

Published 2024

The objective of the work was to enhance the competitiveness of e-commerce and to mitigate the disadvantages associated with it, such as the untimely delivery of purchased products. This was to be achieved through the implementation of a proposed methodology in the processes related to the ‘Last Mile’, leveraging Big Data and Lean Logistics to boost the productivity of light logistics SMEs (small and medium-sized enterprises). To identify the conditions impacting the distribution processes, a study was conducted on a population of 750 SMEs, utilizing a sample of 255 companies through stratified probabilistic sampling. The research spanned the years 2022 and 2023. The methodology advocated in this study combines Lean Logistics and Big Data to enhance the supply chain's efficiency and profitability for SMEs engaged in light logistics, amidst the post-pandemic landscape and the conflic...

16

article

Data Mart Design to Increase Transactional Flow of Debit and Credit Card in Peruvian Bodegas

Published by
Morales-Arevalo, Juan Carlos, Aquise-Gonzales, Erick Manuel, Carpio-Ore, William Yohani, Sáenz, Emmanuel Victor Mendoza, Mazzarri-Rodriguez, Carlos Javier, Remotti-Becerra, Erick Enrique, Plata, Edison Humberto Medina La, Luque-Vega, Luis F.

Published 2025

The objective of this research is to design a Data Mart to identify tactical actions and increase the use of POS (points of sale) in the bodega business sector of Lima, Peru. A quantitative approach, using transaction history data, is applied using the Kimball methodology. This involves the ETL (Extract, Transform, Load) process to create a dimensional model and to develop a dashboard to visualize key indicators using Power BI. This solution is expected to improve the detection and analysis of transactional errors, categorized by geographic location and business sector while enhancing decision-making processes. This research improves the transactional flow and digital payment adoption in small businesses, fostering greater financial inclusion in the Peruvian market. Therefore, the methodology and tools to be applied in this research offer a framework as a model for similar contexts, espe...

17

article

Manipulation, analysis, and visualization of data from the demographic and family health survey with the R program

Published by
Hernández Vásquez, Akram, Chacón Torrico, Horacio

Published 2019

The Demographic and Family Health Survey (ENDES, in Spanish) is a national population-based survey with representation at the departmental level and area of residence, constituting a source of information on the health status of the Peruvian population. In order to standardize its processing and subsequent reuse by the academic community and other stakeholders, we documented the code for the manipulation, analysis, and visualization of data from the ENDES 2017 health questionnaire, through an example on the prevalence of hypertension and obesity, using the R statistical programming environment and language. The R code is presented and detailed sequentially, as well as the theoretical support of the survey structure for the manipulation of databases, considering that the complex structure of the ENDES could be a potential barrier faced by researchers. Finally, this example can serve as a ...

18

conference presentation

¿Big Data o Big Analytics?

Published by
Gray, Allan

Published 2017

No abstract available

19

undergraduate thesis

Implementación del módulo de ventas y distribución en el Enterprise Resource Planning Systems, Applications, and Products in Data Processing High Performance Analytic Appliance

Published by
Garriazo Gonzales, Oscar Jhonatan

Published 2025

This work aims to describe and analyze the benefits of implementing the sales and distribution module of the Systems, Applications, and Products in Data Processing (SAP) system at the company Cumbra Ingeniería S.A. It focuses on how this sales module has optimized the management of commercial processes, from quotation through invoicing. Adopting the system has improved operational efficiency and facilitated real-time integration for the accounting, project, and finance areas. Thanks to task automation and the implementation of the system, the company has obtained accurate data on completed sales, which has supported informed decision-making for the future. This work also addresses the challenges faced during implementation, the theoretical frameworks used, and the r...

20

conference object

Extracting Visual Encodings from Map Chart Images with Color-Encoded Scalar Values

Published by
Mayhua A., Gomez-Nieto E., Heer J., Poco J.

Published 2019

This work was supported by grant 234-2015-FONDECYT (Master Program) from Cienciactiva of the National Council for Science, Technology and Technological Innovation (CONCYTEC-PERU).
