Can artificial intelligence-based large language models pass the National Dentistry Examination in Peru?
Article Description
Objective: To determine which artificial intelligence (AI) large language model demonstrates the highest accuracy in answering the 2023 National Dentistry Examination (ENAO, by its acronym in Spanish) in Peru, compared with the official answer key. Material and methods: The 100 multiple-choice quest...
| Authors: | Saravia-Rojas, Miguel Ángel; Mendiola-Aquino, Carlos; Orejuela-Ramirez, Francisco; Tunquipa-Chacón, Wanderley; Geng-Vivanco, Rocio |
|---|---|
| Format: | article |
| Publication date: | 2025 |
| Institution: | Universidad Peruana Cayetano Heredia |
| Repository: | Revistas - Universidad Peruana Cayetano Heredia |
| Language: | English |
| OAI Identifier: | oai:revistas.upch.edu.pe:article/6253 |
| Resource link: | https://revistas.upch.edu.pe/index.php/REH/article/view/6253 |
| Access level: | open access |
| Subject: | inteligencia artificial; educación odontológica; evaluación educativa; modelos de lenguaje de gran tamaño; artificial intelligence; dental education; educational assessment; large language models; inteligência artificial; educação odontológica; avaliação educacional; modelos de linguagem de grande porte |
| id | REVUPCH_936a6d6d1b5df85fa176e5e35d90cf6a |
|---|---|
| oai_identifier_str | oai:revistas.upch.edu.pe:article/6253 |
| network_acronym_str | REVUPCH |
| network_name_str | Revistas - Universidad Peruana Cayetano Heredia |
| repository_id_str | |
| dc.title.none.fl_str_mv | Can artificial intelligence-based large language models pass the National Dentistry Examination in Peru? ¿Pueden los grandes modelos de lenguaje de inteligencia artificial aprobar el Examen Nacional de Odontología en el Perú? Os grandes modelos de linguagem de inteligência artificial conseguem ser aprovados no Exame Nacional de Odontologia no Peru? |
| title | Can artificial intelligence-based large language models pass the National Dentistry Examination in Peru? |
| spellingShingle | Can artificial intelligence-based large language models pass the National Dentistry Examination in Peru? Saravia-Rojas, Miguel Ángel; inteligencia artificial; educación odontológica; evaluación educativa; modelos de lenguaje de gran tamaño; artificial intelligence; dental education; educational assessment; large language models; inteligência artificial; educação odontológica; avaliação educacional; modelos de linguagem de grande porte |
| title_short | Can artificial intelligence-based large language models pass the National Dentistry Examination in Peru? |
| title_full | Can artificial intelligence-based large language models pass the National Dentistry Examination in Peru? |
| title_fullStr | Can artificial intelligence-based large language models pass the National Dentistry Examination in Peru? |
| title_full_unstemmed | Can artificial intelligence-based large language models pass the National Dentistry Examination in Peru? |
| title_sort | Can artificial intelligence-based large language models pass the National Dentistry Examination in Peru? |
| dc.creator.none.fl_str_mv | Saravia-Rojas, Miguel Ángel; Mendiola-Aquino, Carlos; Orejuela-Ramirez, Francisco; Tunquipa-Chacón, Wanderley; Geng-Vivanco, Rocio |
| author | Saravia-Rojas, Miguel Ángel |
| author_facet | Saravia-Rojas, Miguel Ángel; Mendiola-Aquino, Carlos; Orejuela-Ramirez, Francisco; Tunquipa-Chacón, Wanderley; Geng-Vivanco, Rocio |
| author_role | author |
| author2 | Mendiola-Aquino, Carlos; Orejuela-Ramirez, Francisco; Tunquipa-Chacón, Wanderley; Geng-Vivanco, Rocio |
| author2_role | author; author; author; author |
| dc.subject.none.fl_str_mv | inteligencia artificial; educación odontológica; evaluación educativa; modelos de lenguaje de gran tamaño; artificial intelligence; dental education; educational assessment; large language models; inteligência artificial; educação odontológica; avaliação educacional; modelos de linguagem de grande porte |
| topic | inteligencia artificial; educación odontológica; evaluación educativa; modelos de lenguaje de gran tamaño; artificial intelligence; dental education; educational assessment; large language models; inteligência artificial; educação odontológica; avaliação educacional; modelos de linguagem de grande porte |
| description | Objective: To determine which artificial intelligence (AI) large language model demonstrates the highest accuracy in answering the 2023 National Dentistry Examination (ENAO, by its acronym in Spanish) in Peru, compared with the official answer key. Material and methods: The 100 multiple-choice questions from the 2023 ENAO were tested using ChatGPT-3.5, ChatGPT-4, Gemini, and Copilot. Responses were categorized by subject area and scored as correct or incorrect. Data were analyzed using the chi-square test (α = 0.05). Results: ChatGPT-4 achieved the highest overall accuracy (90.00%), followed by Gemini (82.00%), Copilot (79.00%), and ChatGPT-3.5 (76.00%). Across most models, the highest accuracy was observed in Public Health, Research, Health Services Management, and Ethics, whereas lower performance was observed in Anatomy and in Oral Medicine and Pathology. Pairwise comparisons revealed that ChatGPT-4 performed significantly better than ChatGPT-3.5 (difference: 14%; p = 0.0084) and Copilot (difference: 11%; p = 0.0316); no significant differences were found among the remaining model comparisons (p > 0.05). Conclusion: All AI language models demonstrated effectiveness in answering the 2023 ENAO questions, with ChatGPT-4 achieving the highest accuracy. |
| publishDate | 2025 |
| dc.date.none.fl_str_mv | 2025-12-30 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion |
| format | article |
| status_str | publishedVersion |
| dc.identifier.none.fl_str_mv | https://revistas.upch.edu.pe/index.php/REH/article/view/6253; 10.20453/reh.v35i4.6253 |
| url | https://revistas.upch.edu.pe/index.php/REH/article/view/6253 |
| identifier_str_mv | 10.20453/reh.v35i4.6253 |
| dc.language.none.fl_str_mv | eng |
| language | eng |
| dc.relation.none.fl_str_mv | https://revistas.upch.edu.pe/index.php/REH/article/view/6253/6842 |
| dc.rights.none.fl_str_mv | http://creativecommons.org/licenses/by/4.0; info:eu-repo/semantics/openAccess |
| rights_invalid_str_mv | http://creativecommons.org/licenses/by/4.0 |
| eu_rights_str_mv | openAccess |
| dc.format.none.fl_str_mv | application/pdf |
| dc.publisher.none.fl_str_mv | Universidad Peruana Cayetano Heredia |
| publisher.none.fl_str_mv | Universidad Peruana Cayetano Heredia |
| dc.source.none.fl_str_mv | Revista Estomatológica Herediana; Vol. 35 No. 4 (2025): Octubre-diciembre; 305-311; Revista Estomatológica Herediana; Vol. 35 Núm. 4 (2025): Octubre-diciembre; 305-311; Revista Estomatológica Herediana; v. 35 n. 4 (2025): Octubre-diciembre; 305-311; 2225-7616; 1019-4355; 10.20453/reh.v35i4; reponame:Revistas - Universidad Peruana Cayetano Heredia; instname:Universidad Peruana Cayetano Heredia; instacron:UPCH |
| instname_str | Universidad Peruana Cayetano Heredia |
| instacron_str | UPCH |
| institution | UPCH |
| reponame_str | Revistas - Universidad Peruana Cayetano Heredia |
| collection | Revistas - Universidad Peruana Cayetano Heredia |
| repository.name.fl_str_mv | |
| repository.mail.fl_str_mv | |
| _version_ | 1853128782080114688 |
| spelling | Can artificial intelligence-based large language models pass the National Dentistry Examination in Peru? ¿Pueden los grandes modelos de lenguaje de inteligencia artificial aprobar el Examen Nacional de Odontología en el Perú? Os grandes modelos de linguagem de inteligência artificial conseguem ser aprovados no Exame Nacional de Odontologia no Peru? Saravia-Rojas, Miguel Ángel; Mendiola-Aquino, Carlos; Orejuela-Ramirez, Francisco; Tunquipa-Chacón, Wanderley; Geng-Vivanco, Rocio; inteligencia artificial; educación odontológica; evaluación educativa; modelos de lenguaje de gran tamaño; artificial intelligence; dental education; educational assessment; large language models; inteligência artificial; educação odontológica; avaliação educacional; modelos de linguagem de grande porte; Objective: To determine which artificial intelligence (AI) large language model demonstrates the highest accuracy in answering the 2023 National Dentistry Examination (ENAO, by its acronym in Spanish) in Peru, compared with the official answer key. Material and methods: The 100 multiple-choice questions from the 2023 ENAO were tested using ChatGPT-3.5, ChatGPT-4, Gemini, and Copilot. Responses were categorized by subject area and scored as correct or incorrect. Data were analyzed using the chi-square test (α = 0.05). Results: ChatGPT-4 achieved the highest overall accuracy (90.00%), followed by Gemini (82.00%), Copilot (79.00%), and ChatGPT-3.5 (76.00%). Across most models, the highest accuracy was observed in Public Health, Research, Health Services Management, and Ethics, whereas lower performance was observed in Anatomy and in Oral Medicine and Pathology. Pairwise comparisons revealed that ChatGPT-4 performed significantly better than ChatGPT-3.5 (difference: 14%; p = 0.0084) and Copilot (difference: 11%; p = 0.0316); no significant differences were found among the remaining model comparisons (p > 0.05). Conclusion: All AI language models demonstrated effectiveness in answering the 2023 ENAO questions, with ChatGPT-4 achieving the highest accuracy. Objetivo: Determinar qué modelo de lenguaje de gran tamaño basado en inteligencia artificial (IA) presenta mayor precisión al responder el Examen Nacional de Odontología (ENAO) de 2023 en Perú, en comparación con el banco de respuestas oficiales. Materiales y métodos: Las 100 preguntas de opción múltiple del examen se probaron en ChatGPT-3.5, ChatGPT-4, Gemini y Copilot, y las respuestas se clasificaron por materias. Cada respuesta se marcó como correcta o incorrecta, y los datos se analizaron mediante la prueba de chi-cuadrado (α = 0,05). Resultados: ChatGPT-4 alcanzó la mayor precisión global (90,00 %), seguido de Gemini (82,00 %), Copilot (79,00 %) y ChatGPT-3.5 (76,00 %). Por área temática, Salud Pública, Investigación, Gestión de Servicios de Salud y Ética mostraron las mayores tasas de acierto en la mayoría de los modelos, mientras que Anatomía y Medicina Oral y Patología mostraron un desempeño inferior. Las comparaciones pareadas revelaron que ChatGPT-4 tuvo un rendimiento significativamente superior al de ChatGPT-3.5 (diferencia: 14 %; p = 0,0084) y al de Copilot (diferencia: 11 %; p = 0,0316), mientras que no se encontraron diferencias estadísticamente significativas entre los demás modelos (p > 0,05). Conclusión: Todos los modelos de lenguaje de gran tamaño basados en IA demostraron su eficacia al responder las preguntas del ENAO de 2023, siendo que ChatGPT-4 mostró la mayor precisión. Objetivo: Determinar qual modelo de linguagem de inteligência artificial (IA) apresenta maior precisão ao responder ao Exame Nacional de Odontologia (ENAO) de 2023 no Peru, em comparação com o banco de respostas oficial. Materiais e métodos: As 100 perguntas de múltipla escolha do exame foram testadas no ChatGPT-3.5, ChatGPT-4, Gemini e Copilot, e as respostas foram classificadas por matéria. Cada resposta foi marcada como correta ou incorreta, e os dados foram analisados através do teste qui-quadrado (α = 0,05). Resultados: O ChatGPT-4 alcançou a maior precisão geral (90,00%), seguido pelo Gemini (82,00%), Copilot (79,00%) e ChatGPT-3.5 (76,00%). Por assunto, Saúde Pública, Pesquisa, Gestão de Serviços de Saúde e Ética apresentaram a maior precisão na maioria dos modelos, enquanto um desempenho inferior foi observado em Anatomia e Medicina Oral e Patologia. Comparações pareadas revelaram que o ChatGPT-4 teve um desempenho significativamente melhor do que o ChatGPT-3.5 (diferença: 14%; p = 0,0084) e o Copilot (diferença: 11%; p = 0,0316), enquanto não foram encontradas diferenças significativas entre os demais modelos (p > 0,05). Conclusão: Todos os modelos de linguagem da IA demonstraram a sua eficácia ao responder às perguntas da ENAO de 2023, sendo que o ChatGPT-4 apresentou a maior precisão. Universidad Peruana Cayetano Heredia; 2025-12-30; info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; application/pdf; https://revistas.upch.edu.pe/index.php/REH/article/view/6253; 10.20453/reh.v35i4.6253; Revista Estomatológica Herediana; Vol. 35 No. 4 (2025): Octubre-diciembre; 305-311; Revista Estomatológica Herediana; Vol. 35 Núm. 4 (2025): Octubre-diciembre; 305-311; Revista Estomatológica Herediana; v. 35 n. 4 (2025): Octubre-diciembre; 305-311; 2225-7616; 1019-4355; 10.20453/reh.v35i4; reponame:Revistas - Universidad Peruana Cayetano Heredia; instname:Universidad Peruana Cayetano Heredia; instacron:UPCH; eng; https://revistas.upch.edu.pe/index.php/REH/article/view/6253/6842; Derechos de autor 2025 Miguel Ángel Saravia-Rojas, Carlos Mendiola-Aquino, Francisco Orejuela-Ramirez, Wanderley Tunquipa-Chacón, Rocio Geng-Vivanco; http://creativecommons.org/licenses/by/4.0; info:eu-repo/semantics/openAccess; oai:revistas.upch.edu.pe:article/6253; 2025-12-30T19:30:37Z |
| score | 13.941906 |
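
The abstract in this record reports pairwise chi-square comparisons between the four models' scores on the 100 ENAO questions. The sketch below illustrates one way such a comparison could be computed from the reported accuracies alone. It assumes an uncorrected Pearson chi-square test on a 2×2 table (correct/incorrect by model) with 1 degree of freedom; the record only says "chi-square test", so the exact variant, and the function names `chi_square_2x2` and `pairwise_p_values`, are assumptions made for illustration and not the authors' code.

```python
import math
from itertools import combinations

# Correct answers out of 100 questions, taken from the accuracies in the abstract.
CORRECT = {"ChatGPT-4": 90, "Gemini": 82, "Copilot": 79, "ChatGPT-3.5": 76}
N_QUESTIONS = 100


def chi_square_2x2(correct_a, correct_b, n=N_QUESTIONS):
    """Pearson chi-square (no continuity correction) for two models' scores.

    Each model answered the same number of questions, n.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    observed = [[correct_a, n - correct_a], [correct_b, n - correct_b]]
    total = 2 * n
    col_totals = [correct_a + correct_b, total - correct_a - correct_b]
    stat = 0.0
    for row in observed:
        for obs, col_total in zip(row, col_totals):
            # Expected count = row total * column total / grand total.
            expected = n * col_total / total
            stat += (obs - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def pairwise_p_values(scores=CORRECT):
    """Return {(model_a, model_b): p_value} for every pair of models."""
    return {
        (a, b): chi_square_2x2(scores[a], scores[b])[1]
        for a, b in combinations(scores, 2)
    }


if __name__ == "__main__":
    for (a, b), p in pairwise_p_values().items():
        flag = "significant" if p < 0.05 else "not significant"
        print(f"{a} vs {b}: p = {p:.4f} ({flag})")
```

Under these assumptions, the p-values for ChatGPT-4 versus ChatGPT-3.5 and versus Copilot come out close to the reported 0.0084 and 0.0316, and the remaining pairs stay above 0.05. Because every model answered the same 100 questions, a paired test such as McNemar's would need the per-question results, which this record does not include.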
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Open Access Digital Repository of Science, Technology and Innovation (ALICIA).