Can artificial intelligence-based large language models pass the National Dentistry Examination in Peru?

Article Description

Objective: To determine which artificial intelligence (AI) large language model demonstrates the highest accuracy in answering the 2023 National Dentistry Examination (ENAO, by its acronym in Spanish) in Peru, compared with the official answer key. Material and methods: The 100 multiple-choice quest...

Bibliographic Details
Authors: Saravia-Rojas, Miguel Ángel; Mendiola-Aquino, Carlos; Orejuela-Ramirez, Francisco; Tunquipa-Chacón, Wanderley; Geng-Vivanco, Rocio
Format: article
Publication Date: 2025
Institution: Universidad Peruana Cayetano Heredia
Repository: Revistas - Universidad Peruana Cayetano Heredia
Language: English
OAI Identifier: oai:revistas.upch.edu.pe:article/6253
Resource Link: https://revistas.upch.edu.pe/index.php/REH/article/view/6253
Access Level: open access
Subject: inteligencia artificial
educación odontológica
evaluación educativa
modelos de lenguaje de gran tamaño
artificial intelligence
dental education
educational assessment
large language models
inteligência artificial
educação odontológica
avaliação educacional
modelos de linguagem de grande porte
id REVUPCH_936a6d6d1b5df85fa176e5e35d90cf6a
oai_identifier_str oai:revistas.upch.edu.pe:article/6253
network_acronym_str REVUPCH
network_name_str Revistas - Universidad Peruana Cayetano Heredia
dc.title.none.fl_str_mv Can artificial intelligence-based large language models pass the National Dentistry Examination in Peru?
¿Pueden los grandes modelos de lenguaje de inteligencia artificial aprobar el Examen Nacional de Odontología en el Perú?
Os grandes modelos de linguagem de inteligência artificial conseguem ser aprovados no Exame Nacional de Odontologia no Peru?
title Can artificial intelligence-based large language models pass the National Dentistry Examination in Peru?
dc.creator.none.fl_str_mv Saravia-Rojas, Miguel Ángel
Mendiola-Aquino, Carlos
Orejuela-Ramirez, Francisco
Tunquipa-Chacón, Wanderley
Geng-Vivanco, Rocio
dc.subject.none.fl_str_mv inteligencia artificial
educación odontológica
evaluación educativa
modelos de lenguaje de gran tamaño
artificial intelligence
dental education
educational assessment
large language models
inteligência artificial
educação odontológica
avaliação educacional
modelos de linguagem de grande porte
description Objective: To determine which artificial intelligence (AI) large language model demonstrates the highest accuracy in answering the 2023 National Dentistry Examination (ENAO, by its acronym in Spanish) in Peru, compared with the official answer key. Material and methods: The 100 multiple-choice questions from the 2023 ENAO were tested using ChatGPT-3.5, ChatGPT-4, Gemini, and Copilot. Responses were categorized by subject area and scored as correct or incorrect. Data were analyzed using the chi-square test (α = 0.05). Results: ChatGPT-4 achieved the highest overall accuracy (90.00%), followed by Gemini (82.00%), Copilot (79.00%), and ChatGPT-3.5 (76.00%). Across most models, the highest accuracy was observed in Public Health, Research, Health Services Management, and Ethics, whereas lower performance was observed in Anatomy and in Oral Medicine and Pathology. Pairwise comparisons revealed that ChatGPT-4 performed significantly better than ChatGPT-3.5 (difference: 14%; p = 0.0084) and Copilot (difference: 11%; p = 0.0316); no significant differences were found among the remaining model comparisons (p > 0.05). Conclusion: All AI language models demonstrated effectiveness in answering the 2023 ENAO questions, with ChatGPT-4 achieving the highest accuracy.
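The pairwise comparisons reported in the abstract can be checked from the published accuracies alone. The sketch below is not the authors' analysis script; it assumes each comparison is a Pearson chi-square test on a 2x2 table of correct versus incorrect answers, without continuity correction (the abstract only says "chi-square test"). The names `CORRECT`, `chi_square_2x2`, and `pairwise_results` are hypothetical.

```python
import math
from itertools import combinations

# Correct answers out of 100 questions, taken from the abstract.
CORRECT = {"ChatGPT-4": 90, "Gemini": 82, "Copilot": 79, "ChatGPT-3.5": 76}
N_QUESTIONS = 100


def chi_square_2x2(correct_a, correct_b, n=N_QUESTIONS):
    """Pearson chi-square for two proportions (assumption: no Yates correction)."""
    a, b = correct_a, n - correct_a  # model A: correct, incorrect
    c, d = correct_b, n - correct_b  # model B: correct, incorrect
    total = a + b + c + d
    stat = total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def pairwise_results(alpha=0.05):
    """Return (model_1, model_2, statistic, p_value, significant) for every pair."""
    rows = []
    for m1, m2 in combinations(CORRECT, 2):
        stat, p = chi_square_2x2(CORRECT[m1], CORRECT[m2])
        rows.append((m1, m2, stat, p, p < alpha))
    return rows


if __name__ == "__main__":
    for m1, m2, stat, p, significant in pairwise_results():
        flag = "significant" if significant else "not significant"
        print(f"{m1} vs {m2}: chi2 = {stat:.4f}, p = {p:.4f} ({flag})")
```

Under these assumptions the two comparisons flagged in the abstract come out at p ≈ 0.0084 (ChatGPT-4 vs ChatGPT-3.5) and p ≈ 0.0316 (ChatGPT-4 vs Copilot), and the other four pairs stay above 0.05. Because all four models answered the same 100 questions, a paired test such as McNemar's would be an alternative design, but it needs per-question results that this record does not include.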
publishDate 2025
dc.date.none.fl_str_mv 2025-12-30
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv https://revistas.upch.edu.pe/index.php/REH/article/view/6253
10.20453/reh.v35i4.6253
url https://revistas.upch.edu.pe/index.php/REH/article/view/6253
identifier_str_mv 10.20453/reh.v35i4.6253
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv https://revistas.upch.edu.pe/index.php/REH/article/view/6253/6842
dc.rights.none.fl_str_mv http://creativecommons.org/licenses/by/4.0
info:eu-repo/semantics/openAccess
rights_invalid_str_mv http://creativecommons.org/licenses/by/4.0
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidad Peruana Cayetano Heredia
publisher.none.fl_str_mv Universidad Peruana Cayetano Heredia
dc.source.none.fl_str_mv Revista Estomatológica Herediana; Vol. 35 No. 4 (2025): Octubre-diciembre; 305-311
Revista Estomatológica Herediana; Vol. 35 Núm. 4 (2025): Octubre-diciembre; 305-311
Revista Estomatológica Herediana; v. 35 n. 4 (2025): Octubre-diciembre; 305-311
2225-7616
1019-4355
10.20453/reh.v35i4
reponame:Revistas - Universidad Peruana Cayetano Heredia
instname:Universidad Peruana Cayetano Heredia
instacron:UPCH
instname_str Universidad Peruana Cayetano Heredia
instacron_str UPCH
institution UPCH
reponame_str Revistas - Universidad Peruana Cayetano Heredia
collection Revistas - Universidad Peruana Cayetano Heredia
spelling Copyright 2025 Miguel Ángel Saravia-Rojas, Carlos Mendiola-Aquino, Francisco Orejuela-Ramirez, Wanderley Tunquipa-Chacón, Rocio Geng-Vivanco; oai:revistas.upch.edu.pe:article/6253; 2025-12-30T19:30:37Z
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or data set is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Digital Repository of Science, Technology and Innovation of Open Access (ALICIA).