DOI: 10.1055/s-0043-1774399
Improved Performance of ChatGPT-4 on the OKAP Examination: A Comparative Study with ChatGPT-3.5
Funding/Acknowledgment: No financial support was received for this research.
Abstract
Introduction: This study aims to evaluate the performance of ChatGPT-4, an advanced artificial intelligence (AI) language model, on the Ophthalmology Knowledge Assessment Program (OKAP) examination compared to its predecessor, ChatGPT-3.5.
Methods: Both models were tested on 180 OKAP practice questions covering various ophthalmology subject categories.
Results: ChatGPT-4 significantly outperformed ChatGPT-3.5 (81% vs. 57%; p<0.001), indicating improvements in medical knowledge assessment.
Discussion: The superior performance of ChatGPT-4 suggests potential applicability in ophthalmologic education and clinical decision support systems. Future research should focus on refining AI models, ensuring a balanced representation of fundamental and specialized knowledge, and determining the optimal method of integrating AI into medical education and practice.
Keywords
artificial intelligence - Ophthalmology Knowledge Assessment Program - OKAP - ChatGPT - medical education
Publication History
Received: 13 April 2023
Accepted: 10 August 2023
Article published online:
11 September 2023
© 2023. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed, or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Thieme Medical Publishers, Inc.
333 Seventh Avenue, 18th Floor, New York, NY 10001, USA