DOI: 10.1055/a-2149-0447
Assessment of ChatGPT in the Prehospital Management of Ophthalmological Emergencies – An Analysis of 10 Fictional Case Vignettes
ChatGPT in der präklinischen Versorgung augenärztlicher Notfälle – eine Untersuchung von 10 fiktiven Fallvignetten
Abstract
Background The artificial intelligence (AI)-based platform ChatGPT (Chat Generative Pre-Trained Transformer, OpenAI LP, San Francisco, CA, USA) has gained impressive popularity in recent months. Its performance on case vignettes of general medical (non-ophthalmological) emergencies has been assessed, with very encouraging results. The purpose of this study was to assess the performance of ChatGPT on ophthalmological emergency case vignettes in terms of three main outcome measures: triage accuracy, appropriateness of recommended prehospital measures, and overall potential to inflict harm on the user/patient.
Methods We wrote ten short, fictional case vignettes describing different acute ophthalmological symptoms. Each vignette was entered into ChatGPT five times with the same wording, following a standardized interaction pathway. The answers were analyzed using a structured evaluation manual.
Results We observed a triage accuracy of 93.6%. Most answers contained only appropriate recommendations for prehospital measures. However, an overall potential to inflict harm on users/patients was present in 32% of answers.
Conclusion ChatGPT should presently not be used as a stand-alone primary source of information about acute ophthalmological symptoms. As AI continues to evolve, its safety and efficacy in the prehospital management of ophthalmological emergencies have to be reassessed regularly.
Zusammenfassung
Hintergrund Die auf künstlicher Intelligenz (KI) basierende Plattform ChatGPT (Chat Generative Pre-Trained Transformer, OpenAI LP, San Francisco, CA, USA) hat in den vergangenen Monaten rasant an Popularität gewonnen. Vorangegangene Studien zeigen ein vielversprechendes Abschneiden von ChatGPT in der Beantwortung allgemeinmedizinischer Notfallvignetten. Ziel dieser Studie war es, die Antworten von ChatGPT auf ophthalmologische Fallvignetten hinsichtlich Triagegenauigkeit, Angemessenheit empfohlener präklinischer Maßnahmen sowie Schadenspotenzial zu beurteilen.
Methoden Wir erstellten 10 kurze, fiktive Fallvignetten aus dem Bereich augenheilkundlicher Akutsymptomatik. Jede Vignette wurde entsprechend einem standardisierten Interaktionspfad 5-mal in ChatGPT eingegeben. Die Antworten wurden anhand eines strukturierten Evaluationsmanuals ausgewertet.
Ergebnisse Wir beobachteten eine Triagegenauigkeit von 93,6%. Die meisten Antworten enthielten nur angemessene Empfehlungen bezüglich präklinischer Maßnahmen. Insgesamt zeigte sich jedoch in 32% der Antworten ein Schadenspotenzial für den Nutzer/Patienten.
Schlussfolgerung ChatGPT sollte derzeit nicht als einzige Informationsquelle zur Beurteilung akuter ophthalmologischer Symptome herangezogen werden. Neuentwicklungen auf dem Gebiet der KI sollten regelmäßig im Hinblick auf Chancen und Risiken im Bereich der augenärztlichen Notfallversorgung evaluiert werden.
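The outcome measures in the abstract are proportions over rated answers (ten vignettes entered five times each gives 50 answers). The following is a minimal Python sketch of how such proportions could be tallied once each answer has been rated by hand. It is not the authors' analysis code: the class `RatedAnswer`, its field names, the helper functions, and all ratings in `build_demo_answers` are hypothetical and invented for illustration, and the sketch assumes that answers without ratable triage advice are simply left out of the triage denominator.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class RatedAnswer:
    """One ChatGPT answer after manual rating (all field names are hypothetical)."""
    vignette: int                    # vignette number, 1..10
    run: int                         # repetition number, 1..5
    triage_correct: Optional[bool]   # None if the answer gave no ratable triage advice
    potentially_harmful: bool        # rater judged the answer as potentially harmful


def proportion(flags: Iterable[bool]) -> float:
    """Share of True values; raises if there is nothing to count."""
    values = list(flags)
    if not values:
        raise ValueError("no rated answers")
    return sum(values) / len(values)


def summarize(answers: List[RatedAnswer]) -> dict:
    """Compute the summary proportions over all rated answers."""
    # Assumption: answers without a triage rating are excluded from the denominator.
    triage = [a.triage_correct for a in answers if a.triage_correct is not None]
    return {
        "n_answers": len(answers),
        "triage_accuracy": proportion(triage),
        "harm_rate": proportion(a.potentially_harmful for a in answers),
    }


def build_demo_answers() -> List[RatedAnswer]:
    """Invented ratings for 10 vignettes x 5 runs; NOT the study data."""
    answers = []
    for vignette in range(1, 11):
        for run in range(1, 6):
            answers.append(RatedAnswer(
                vignette=vignette,
                run=run,
                triage_correct=not (vignette == 10 and run <= 2),  # made up
                potentially_harmful=vignette <= 3,                 # made up
            ))
    return answers


if __name__ == "__main__":
    summary = summarize(build_demo_answers())
    print(f"answers rated:   {summary['n_answers']}")
    print(f"triage accuracy: {summary['triage_accuracy']:.1%}")
    print(f"harm rate:       {summary['harm_rate']:.1%}")
```

Keeping one record per answer, rather than one per vignette, makes it easy to check how strongly the results depend on the individual case description, for example by grouping the same records by `vignette` before calling `proportion`.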
Already known:
- ChatGPT has been reported to perform well on the Ophthalmic Knowledge Assessment Programme, to give useful information on several medical topics such as retinal diseases and cardiopulmonary resuscitation measures, and to perform well in triaging and diagnosing general medical emergencies.
- However, it can also give wrong information or harmful advice in a very confident and authoritative tone.
Newly described:
- While ChatGPT performed remarkably well in triaging ophthalmological emergencies and recommending prehospital measures, its performance strongly depended on the individual case description it was provided, and we identified 32% of its responses as potentially harmful.
- As the popularity of ChatGPT and other AI-based language models grows, it is important to educate the public as well as the medical community on their current limitations; at the moment, they should not be used for ophthalmological emergencies.
- However, as even the current versions of general-purpose language models already show an impressive performance in the medical domain, research should focus on developing more advanced language models specifically designed for medical purposes.
Supporting Information
- Supplementary material
The detailed evaluation manual for analysis of the answers generated by ChatGPT can be found in the supplements.
Publication History
Received: 31 July 2023
Accepted: 04 August 2023
Article published online: 27 October 2023
© 2023. Thieme. All rights reserved.
Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany
References
- 1 Antaki F, Touma S, Milad D. et al. Evaluating the Performance of ChatGPT in Ophthalmology: An Analysis of Its Successes and Shortcomings. Ophthalmol Sci 2023; 3: 100324 DOI: 10.1016/j.xops.2023.100324.
- 2 Teebagy S, Colwell L, Wood E. et al. Improved performance of ChatGPT-4 on the OKAP exam: A comparative study with ChatGPT-3.5. medRxiv 2023; DOI: 10.1101/2023.04.03.23287957.
- 3 van Dis EAM, Bollen J, Zuidema W. et al. ChatGPT: five priorities for research. Nature 2023; 614: 224-226 DOI: 10.1038/d41586-023-00288-7.
- 4 Shah SM, Khanna CL. Ophthalmic Emergencies for the Clinician. Mayo Clin Proc 2020; 95: 1050-1058 DOI: 10.1016/j.mayocp.2020.03.018.
- 5 Potapenko I, Boberg-Ans LC, Stormly Hansen M. et al. Artificial intelligence-based chatbot patient information on common retinal diseases using ChatGPT. Acta Ophthalmol 2023; DOI: 10.1111/aos.15661.
- 6 Hirosawa T, Harada Y, Yokose M. et al. Diagnostic Accuracy of Differential-Diagnosis Lists Generated by Generative Pretrained Transformer 3 Chatbot for Clinical Vignettes with Common Chief Complaints: A Pilot Study. Int J Environ Res Public Health 2023; 20: 3378 DOI: 10.3390/ijerph20043378.
- 7 Levine DM, Tuwani R, Kompa B. et al. The Diagnostic and Triage Accuracy of the GPT-3 Artificial Intelligence Model. medRxiv 2023; DOI: 10.1101/2023.01.30.23285067.
- 8 Ahn C. Exploring ChatGPT for information of cardiopulmonary resuscitation. Resuscitation 2023; 185: 109729 DOI: 10.1016/j.resuscitation.2023.109729.
- 9 Hopkins AM, Logan JM, Kichenadasse G. et al. Artificial intelligence chatbots will revolutionize how cancer patients access information: ChatGPT represents a paradigm-shift. JNCI Cancer Spectr 2023; 7: pkad010 DOI: 10.1093/jncics/pkad010.
- 10 Yang S. The Abilities and Limitations of ChatGPT. 10.12.2022 Accessed April 07, 2023 at: https://www.anaconda.com/blog/the-abilities-and-limitations-of-chatgpt
- 11 Deaner JD, Amarasekera DC, Ozzello DJ. et al. Accuracy of Referral and Phone-Triage Diagnoses in an Eye Emergency Department. Ophthalmology 2021; 128: 471-473 DOI: 10.1016/j.ophtha.2020.07.040.
- 12 Mehdi Y. Reinventing search with a new AI-powered Microsoft Bing and Edge, your copilot for the web. 2023. Accessed April 16, 2023 at: https://blogs.microsoft.com/blog/2023/02/07/reinventing-search-with-a-new-ai-powered-microsoft-bing-and-edge-your-copilot-for-the-web/