DOI: 10.1055/a-2327-8484
Evaluation of Current Artificial Intelligence Programs on the Knowledge of Glaucoma
Abstract
Background To measure how accurately three different artificial intelligence chatbots, ChatGPT, Bard, and Bing, answer questions about glaucoma types and treatment modalities, and to compare their performance against one another.
Materials and Methods Thirty-two questions about glaucoma types and treatment modalities were posed to the ChatGPT, Bard, and Bing chatbots. Each response was classified as correct or incorrect, and the accuracy rates were compared.
Results Of the questions asked, ChatGPT answered 56.3% correctly, Bard 78.1%, and Bing 59.4%. The difference in the rates of correct and incorrect answers between the three artificial intelligence chatbots was not statistically significant (p = 0.195).
Conclusion Artificial intelligence chatbots can be used as a tool to access accurate information regarding glaucoma types and treatment modalities. However, the information they provide is not always accurate and should be used with caution.
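The reported accuracies (56.3%, 78.1%, and 59.4% of 32 questions) correspond to 18, 25, and 19 correct answers per chatbot. These reconstructed counts are an assumption, since the abstract reports only percentages, and the published p = 0.195 presumably reflects the authors' own test choice; the sketch below runs a plain Pearson chi-square test of independence on the reconstructed 3 × 2 table, so its p-value will differ from the paper's.

```python
import math

# Correct-answer counts reconstructed from the reported percentages
# (56.3%, 78.1%, 59.4% of 32 questions each) -- an assumption, since
# the paper reports only percentages, not raw counts.
correct = {"ChatGPT": 18, "Bard": 25, "Bing": 19}
n_questions = 32

# 3x2 contingency table: (correct, incorrect) per chatbot.
table = [(c, n_questions - c) for c in correct.values()]

# Pearson chi-square test of independence, computed by hand.
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)
chi2 = sum(
    (obs - exp) ** 2 / exp
    for row, rt in zip(table, row_totals)
    for obs, ct in zip(row, col_totals)
    for exp in [rt * ct / grand_total]
)
df = (len(table) - 1) * (len(table[0]) - 1)  # = 2
# For df = 2 the chi-square survival function is exactly exp(-x/2).
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

On these assumed counts the test is likewise non-significant at the 0.05 level, consistent with the paper's conclusion that no chatbot was statistically superior.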
Already Known:
- Chatbots are new applications that have emerged with the development of artificial intelligence programs.
- Although the usability of artificial intelligence programs has been tested in various fields of ophthalmology, these three freely available artificial intelligence chatbots have not previously been compared with one another in accessing information about glaucoma diseases and treatment methods.
Newly described:
- Although none of the three artificial intelligence programs was statistically superior to the others in answering the questions correctly, the more up-to-date Bard and Bing chatbots achieved higher accuracy rates.
- Artificial intelligence programs, including current ones such as Bard and Bing, may face various obstacles (e.g., paid access) in reaching current and accurate information. Further development of these programs is needed to address such shortcomings.
Publication History
Received: 12 August 2023
Accepted: 12 May 2024
Article published online: 24 July 2024
© 2024. Thieme. All rights reserved.
Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany