
ChatGPT as a global doctor: a rapid review of its performance on national licensing medical examination

Article Description

Objective: To evaluate ChatGPT's performance on NLMEs worldwide and determine whether it could achieve licensure to practice medicine across different countries. Methods: We searched PubMed, Scopus, and Google Scholar for studies evaluating ChatGPT's performance on NLMEs. Reference lists of inc...


Bibliographic Details

Authors: Flores-Cohaila, Javier A., Miranda Chavez, Brayan, Mayta-Tristán, Percy
Format: article
Publication date: 2025
Institution: Colegio Médico del Perú
Repository: Acta Médica Peruana
Language: English
OAI identifier: oai:amp.cmp.org.pe:article/3706
Resource link: https://amp.cmp.org.pe/index.php/AMP/article/view/3706
Access level: open access
Subjects: Medical education
Artificial Intelligence
ChatGPT
Generative Artificial Intelligence
id REVCMP_3b134ef1af9b41a05685edb74875c5aa
network_acronym_str REVCMP
title ChatGPT as a global doctor: a rapid review of its performance on national licensing medical examination
author Flores-Cohaila, Javier A.; Miranda Chavez, Brayan; Mayta-Tristán, Percy
topic Medical education; Artificial Intelligence; ChatGPT; Generative Artificial Intelligence
description Objective: To evaluate ChatGPT's performance on national licensing medical examinations (NLMEs) worldwide and determine whether it could achieve licensure to practice medicine across different countries. Methods: We searched PubMed, Scopus, and Google Scholar for studies evaluating ChatGPT's performance on NLMEs. Reference lists of included studies were also reviewed. Two reviewers independently screened studies and extracted the accuracy rates (performance) of GPT-3.5 and GPT-4, including those that passed thresholds, human examinee scores, and other study characteristics. The risk of bias was assessed using the JBI Critical Appraisal Checklist for Prevalence Studies. Results: We identified 37 studies evaluating ChatGPT's performance across 18 NLMEs. Most studies assessed the United States, Chinese, and Japanese examinations. While most studies used official datasets, others relied on unofficial third-party sources, and few employed advanced prompting techniques. GPT-4 was superior to GPT-3.5 in all NLMEs, with accuracy rates ranging from 67% to 89%. GPT-4 passed all 18 NLMEs (100%), while GPT-3.5 passed 10 of 15 (67%). Compared to human examinees, GPT-4 outperformed the average score in 6 of 7 NLMEs (86%); the sole exception was Japan, where examinees achieved 84.9% versus 81.5% for GPT-4. Conclusion: Current evidence demonstrates that GPT-4 can pass all 18 NLMEs evaluated, surpassing human examinees in most cases. However, this finding likely reflects low passing thresholds rather than AI superiority over physicians.
publishDate 2025
dc.date.none.fl_str_mv 2025-12-30
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
dc.identifier.none.fl_str_mv https://amp.cmp.org.pe/index.php/AMP/article/view/3706
10.35663/amp.2025.424.3706
dc.language.none.fl_str_mv eng
dc.relation.none.fl_str_mv https://amp.cmp.org.pe/index.php/AMP/article/view/3706/2040
dc.rights.none.fl_str_mv Copyright (c) 2025 Javier A. Flores-Cohaila, Brayan Miranda Chavez, Percy Mayta-Tristán
https://creativecommons.org/licenses/by/4.0
info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Colegio Médico del Perú
dc.source.none.fl_str_mv ACTA MEDICA PERUANA; Vol. 42 No. 4 (2025): October - December; 284-293
1728-5917
1018-8800
reponame:Acta Médica Peruana
instname:Colegio Médico del Perú
instacron:CMP
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository hosting this document or dataset. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Digital Repository of Open Access Science, Technology and Innovation (ALICIA).