Am J Perinatol
DOI: 10.1055/a-2302-8604
Original Article

Identifying ChatGPT-written Patient Education Materials Using Text Analysis and Readability

Sophie Ulene
1   The Warren Alpert Medical School, Brown University, Providence, Rhode Island
2   Columbia University Vagelos College of Physicians and Surgeons, New York, New York
3   Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Women & Infants Hospital of Rhode Island, Alpert Medical School of Brown University, Providence, Rhode Island
Funding None.

Abstract

Objective Artificial intelligence (AI)-based text generators such as Chat Generative Pre-Trained Transformer (ChatGPT) have come to the forefront of modern medicine. Given the similarity between AI-generated and human-composed text, tools are needed to quickly differentiate the two. Previous work has shown that simple grammatical analysis can reliably differentiate AI-generated text from human-written text.

Study Design In this study, ChatGPT was used to generate 25 articles on obstetric topics similar to patient education materials published by the American College of Obstetricians and Gynecologists (ACOG). These AI-generated articles were then analyzed for readability and grammar using validated scoring systems and compared with real articles from ACOG.

Results Compared with the original articles, the 25 AI-generated articles had fewer total characters (mean 3,066 vs. 7,426; p < 0.0001), a greater average word length (mean 5.3 vs. 4.8; p < 0.0001), and a lower Flesch–Kincaid score (mean 46 vs. 59; p < 0.0001). With this knowledge, a new scoring system was developed to score articles based on their Flesch–Kincaid readability score, total character count, and average word length. This novel scoring system was tested on 17 new AI-generated articles related to obstetrics and 7 articles from ACOG, and it differentiated AI-generated articles from human-written articles with a sensitivity of 94.1% and specificity of 100% (area under the curve [AUC] 0.99).
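The three text features described above are straightforward to compute. The sketch below illustrates the general approach in Python, assuming a simple vowel-group syllable heuristic for the Flesch reading-ease formula and hypothetical classification cutoffs chosen for illustration; the study's actual fitted thresholds are not reported in this abstract.

```python
import re

def count_syllables(word: str) -> int:
    # Crude vowel-group heuristic; production tools use pronunciation dictionaries.
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    if word.lower().endswith("e") and n > 1:
        n -= 1  # roughly discount silent final "e"
    return max(n, 1)

def text_metrics(text: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Flesch reading-ease: higher scores indicate easier text.
    fk = (206.835
          - 1.015 * (len(words) / len(sentences))
          - 84.6 * (syllables / len(words)))
    return {
        "characters": len(text),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "flesch_kincaid": fk,
    }

def looks_ai_generated(m: dict) -> bool:
    # Hypothetical cutoffs, not the study's fitted thresholds: shorter
    # articles with longer words and lower readability resemble the
    # AI-generated group reported in the Results.
    votes = 0
    votes += m["characters"] < 5000
    votes += m["avg_word_length"] > 5.0
    votes += m["flesch_kincaid"] < 52
    return votes >= 2
```

In this toy form, an article is flagged when at least two of the three features fall on the "AI-like" side of its cutoff; the published scoring system would instead use thresholds derived from the 25-article training set.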

Conclusion As ChatGPT is more widely integrated into medicine, it will be important for health care stakeholders to have tools to separate originally written documents from those generated by AI. While more robust analyses may be required to determine the authenticity of articles written by complex AI technology in the future, simple grammatical analysis can accurately characterize current AI-generated texts with a high degree of sensitivity and specificity.

Key Points

  • More tools are needed to identify AI-generated text in obstetrics, for both doctors and patients.

  • Grammatical analysis is quick and easy to perform.

  • Grammatical analysis is a feasible and accurate way to identify AI-generated text.

Publication History

Received: 19 February 2024

Accepted: 24 March 2024

Accepted Manuscript online:
09 April 2024

Article published online:
02 May 2024

© 2024. Thieme. All rights reserved.

Thieme Medical Publishers, Inc.
333 Seventh Avenue, 18th Floor, New York, NY 10001, USA
