Hamostaseologie 2024; 44(06): 459-465
DOI: 10.1055/a-2407-7994
Review Article

Machine-Learning Applications in Thrombosis and Hemostasis

Henning Nilius
1   Department of Clinical Chemistry, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
2   Graduate School for Health Sciences, University of Bern, Bern, Switzerland
Michael Nagler
1   Department of Clinical Chemistry, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
Funding: M.N. is supported by a research grant from the Swiss National Science Foundation.
 

Abstract

The use of machine-learning (ML) algorithms in medicine has sparked a heated discussion. ML is considered one of the most disruptive general-purpose technologies in decades. It has already permeated many areas of our daily lives and produced applications we can no longer do without, such as navigation apps and translation software. However, many people are still unsure whether ML algorithms should be used in medicine in their current form. Doctors doubt to what extent they can trust the predictions of algorithms. Shortcomings in development and unclear regulatory oversight can lead to bias, inequality, applicability concerns, and nontransparent assessments. Past mistakes, however, have led to a better understanding of what is needed to develop effective models for clinical use. Physicians and clinical researchers must participate in all development phases and understand their pitfalls. In this review, we explain the basic concepts of ML, present examples from the field of thrombosis and hemostasis, discuss common pitfalls, and present a methodological framework for developing effective algorithms.



Introduction

Machine-learning algorithms are one of the most disruptive new technologies, but their use in medicine has been controversial.[1] They can handle multidimensional data, find patterns humans do not perceive, and model complex interactions.[2] This makes them ideal for many real-world applications. They are already part of our everyday lives, including navigation apps, content recommendation algorithms (e.g., YouTube or TikTok), and smartphone voice assistants. They hold particular promise for the future of medicine, as the volume of healthcare data steadily increases while the number of healthcare workers decreases.[3]

Still, many people are unsure whether machine-learning algorithms can and should be used in their current form in clinical medicine.[4] [5] Concerns include questions about privacy, unclear regulatory oversight, biases against certain genders and races, and dangerous implementations.[6] [7] [8] [9] [10] A prime example is the Epic sepsis prediction model, rolled out during the pandemic without regulatory approval. It was supposed to alert doctors to sepsis risk but often gave false alarms, leading to worse health outcomes.[11] Major mistakes, such as including antibiotic therapy as a predictor for sepsis, happened because clinical experts were not involved.[12]

However, such experiences have led to a better understanding of what is needed to develop safe and effective machine-learning algorithms. It requires more than technical skills; life scientists and doctors are essential in defining clear clinical use cases, guiding development, and testing models in appropriately designed clinical studies.[13] This also means that the ball is now in our court as physicians and clinical researchers. We need to understand how machine-learning algorithms work, their strengths and weaknesses, and how we can develop them. In this way, they can become useful tools that improve daily clinical practice and patient outcomes.

This review aims to introduce the fundamental principles of medical machine learning, outline potential use cases, and present common pitfalls in machine learning. We also discuss how to avoid these pitfalls using a methodological framework.



Fundamentals of Machine Learning

Machine-learning models are computer programs designed to perform tasks that would normally require human intelligence.[14] While they have not yet matched human experts in many situations, they offer key advantages: (1) they can handle multidimensional data from various sources, (2) they manage probabilities very well, and (3) they can find patterns in data that humans might miss by modeling complex interactions.[15] However, they lack other attributes of human intelligence, such as creativity and flexibility, which are also important when solving problems in medicine.[16] [Fig. 1] contrasts the strengths of artificial and human intelligence.

Fig. 1 Illustration of the strengths of human and artificial intelligence.

Machine-learning algorithms have three main capabilities useful for medical applications: classification, pattern recognition, and optimization.[17] Different models are available, ranging from simple ones, such as logistic regression, to complex ones, such as deep neural networks.[18] In the following section, we will describe common machine-learning approaches and models typically used ([Fig. 2]).

Fig. 2 Overview of key machine-learning capabilities.

Diagnosis and prognosis are classification problems.[14] Patients must be classified as having a disease or not, or as likely or unlikely to experience disease progression. Automated blood cell counting is another example of a classification problem.[19] Are the cells granulocytes or lymphocytes, and which subtypes? A typical approach for classification problems is supervised learning, where the training data are labeled according to a reference standard, such as expert panel ratings, a reference laboratory test, or follow-up data.[20] This means that even a perfectly trained model will only perform as well as the reference test. Standard models used for supervised learning include logistic regression, random forests, and support vector machines.[20] As an example, our group recently developed a machine-learning–based decision support tool for the diagnosis of heparin-induced thrombocytopenia (HIT; https://toradi-hit.dbmr.unibe.ch/).[21] We demonstrated that our model can accurately predict HIT as defined by the reference standard, thus solving a salient diagnostic dilemma.[21]
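To make the supervised workflow concrete, the following is a minimal sketch of training a diagnostic classifier on labeled data. The file "labeled_cohort.csv", the column "label", and the choice of a random forest are assumptions for illustration only; this does not reproduce the published TORADI-HIT model.

```python
# Minimal sketch of supervised learning for a diagnostic classifier.
# "labeled_cohort.csv" is a hypothetical dataset in which "label" holds
# the reference-standard diagnosis; all names are illustrative.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("labeled_cohort.csv")
X = df.drop(columns=["label"])   # predictors (features)
y = df["label"]                  # reference-standard labels

# Hold out a test set so performance is not judged on the training data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = RandomForestClassifier(n_estimators=500, random_state=42)
model.fit(X_train, y_train)

# Discrimination on the held-out set (internal validation only).
probs = model.predict_proba(X_test)[:, 1]
print(f"Hold-out AUROC: {roc_auc_score(y_test, probs):.2f}")
```

Note that the hold-out estimate above is still only internal validation; external validation is discussed later in this review.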

Another main task for machine-learning models is pattern recognition, which can be used for subgroup identification.[22] An unsupervised learning approach is often employed, meaning the model works with unlabeled data where the true diagnosis is unknown.[23] The model aims to group patients based on shared characteristics, but these groups do not necessarily correlate with the outcome in question. In this situation, researchers and experts must assign meaning to the identified clusters.[24] Typical models used for unsupervised learning include k-means clustering, hierarchical clustering, and Gaussian mixture models.[23] As an example, a group from Mainz used a hierarchical clustering algorithm to identify endotypes (based on clinical features at presentation) in patients with acute venous thromboembolism.[25] These endotypes were found to be associated with differences in recurrence and death rates.[25] However, validation is still pending.
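As a sketch of this approach, the code below applies hierarchical (agglomerative) clustering to synthetic stand-ins for clinical features. The data and the choice of three clusters are assumptions for illustration and do not reflect the cited study's method.

```python
# Minimal sketch of unsupervised subgroup identification using
# hierarchical clustering on synthetic "clinical" features.
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 5))   # stand-in for features at presentation

# Scale features so no single variable dominates the distance metric.
X = StandardScaler().fit_transform(features)

# Group patients into, e.g., three candidate endotypes.
labels = AgglomerativeClustering(n_clusters=3).fit_predict(X)
print(np.bincount(labels))             # cluster sizes

# The clusters carry no labels: experts must judge whether they are
# clinically meaningful, e.g., by comparing recurrence rates across groups.
```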

A classic optimization problem is treatment monitoring, typically tackled with a technique called reinforcement learning.[26] These algorithms interact with their environment and receive rewards when they achieve specific goals. Available models include Q-learning and Asynchronous Advantage Actor-Critic.[26] For example, using retrospective data from the “Multiparameter Intelligent Monitoring in Intensive Care II” (MIMIC-II) dataset, Nemati et al proposed a reinforcement learning–based algorithm for monitoring unfractionated heparin treatment.[27] The model provided dosing recommendations for heparin and was rewarded during training when the activated partial thromboplastin time (aPTT) was within 60 to 100 seconds. While the model gave sensible dosing recommendations in the validation dataset, it has yet to be validated prospectively or in live patients.
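The reward-driven principle can be sketched with a toy tabular Q-learning loop. The three-state "aPTT" environment, the action set, and the rewards below are invented for illustration and bear no relation to the published model's architecture or data.

```python
# Toy Q-learning loop for dose titration: reward is given when the
# (invented) aPTT state is "in range". Purely illustrative.
import numpy as np

n_states, n_actions = 3, 3       # aPTT low/in-range/high; dose down/keep/up
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.1
rng = np.random.default_rng(0)

def step(state, action):
    # Crude dynamics: raising the dose tends to raise the aPTT state.
    nxt = int(np.clip(state + (action - 1) + rng.integers(-1, 2), 0, 2))
    reward = 1.0 if nxt == 1 else -1.0   # reward aPTT in target range
    return nxt, reward

state = 0
for _ in range(5000):
    # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
    action = rng.integers(n_actions) if rng.random() < epsilon else int(Q[state].argmax())
    nxt, reward = step(state, action)
    # Q-learning update: move Q toward reward plus discounted best future value.
    Q[state, action] += alpha * (reward + gamma * Q[nxt].max() - Q[state, action])
    state = nxt

print(Q.round(2))   # learned state-action values
```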

Besides the main methods mentioned earlier, many other approaches and mixed methods are available. For example, semisupervised learning first uses a small set of labeled data to train the initial iteration of a classifier, then refines its predictions with unlabeled data.[28] The field is rapidly evolving, with many new algorithms developed every year.
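A minimal sketch of the self-training variant of semisupervised learning is given below, using scikit-learn's SelfTrainingClassifier on synthetic data. The 10% labeling fraction is an arbitrary assumption for illustration.

```python
# Minimal sketch of semisupervised self-training: a classifier trained
# on a few labeled cases iteratively labels the remaining data itself.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Pretend only ~10% of cases carry a reference-standard label;
# scikit-learn marks unlabeled samples with -1.
y_partial = y.copy()
rng = np.random.default_rng(0)
y_partial[rng.random(len(y)) > 0.1] = -1

model = SelfTrainingClassifier(LogisticRegression())
model.fit(X, y_partial)
print(f"Accuracy against all true labels: {model.score(X, y):.2f}")
```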

Another important development was made possible by the steady increase in computing power: generative artificial intelligence (AI) models. The most important difference from the previous methods is that generative AI creates new, previously nonexistent content. Most of these models are based on foundation models trained on enormous datasets, such as social media posts, internet articles, or code repositories.[29] For example, large language models generate text in response to user prompts by repeatedly predicting which token (an encoded word fragment) best continues the text.[30] The most well-known example is OpenAI's ChatGPT. The latest version, GPT-4, can handle multimodal prompt data, including files and images.[31] These models can then be further fine-tuned for specific applications.[32] While their performance is impressive, a significant risk with these large language models is “AI hallucinations,” where the models generate completely inaccurate answers due to issues like overfitting, extreme complexity, or biases in the training data.[33]
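The next-token principle can be sketched with an open model through the Hugging Face transformers library. The small general-purpose "gpt2" model used here is an assumption for illustration; it is not a medical model and will not produce clinically reliable text.

```python
# Minimal sketch of next-token text generation with an open model.
# "gpt2" is a small general-purpose model chosen for illustration only.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = "Heparin-induced thrombocytopenia is"
out = generator(prompt, max_new_tokens=30, do_sample=False)
print(out[0]["generated_text"])   # prompt plus model-predicted continuation
```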



Use-Cases of Machine Learning in Medicine

Currently, most machine-learning models focus on two main goals: (1) improving processes through automation or simplification and (2) improving quality or utility. While process-enhancing models sometimes operate in a legal gray area, such as administrative software, quality-enhancing models are typically considered part of a medical device or “software as a medical device” and therefore require regulatory approval.[34]

Models that improve processes include large language models that automatically generate discharge or handover notes. For this purpose, Epic, a prominent U.S.-based electronic health record vendor, has announced plans to integrate GPT-4 into its systems.[35] However, these developments mainly focus on the U.S. and English-language markets, raising questions about how a similar model would perform in other medical cultures. Another proposed process improvement involves using AI-powered chatbots to answer or triage patients' medical questions. A preliminary study by Ayers et al compared physicians' answers to questions posted on a public medical forum with those given by GPT-3.5.[36] An expert panel evaluated the empathy and quality of the answers and decided which they preferred. The panel favored the chatbot's answers in 78.6% of cases and rated the quality and empathy of the chatbot's responses higher.[36]

Machine-learning models that focus on quality improvement are mostly still in the premarket or research-use-only phase, and very few have made it to clinical practice.[37] As an example of a model that aims to improve quality, Nafee et al published a machine-learning model to improve venous thrombosis prediction in acutely ill patients using data from the phase 3 clinical trial for betrixaban.[38] Their ensemble model, which combines different architectures, outperformed the established IMPROVE score.[38] While their model was developed with high-quality clinical data, the description of the methods used is brief. Moreover, the model has not yet been externally validated, meaning its performance might differ in other populations.
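Since the publication does not describe the architecture in detail, the following is only a generic sketch of how an ensemble can combine different model families by soft voting; the data and model choices are assumptions for illustration.

```python
# Minimal sketch of an ensemble combining different model architectures
# via soft voting (averaging predicted probabilities). Illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression()),
        ("rf", RandomForestClassifier(random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
    ],
    voting="soft",   # average predicted probabilities across models
)
ensemble.fit(X, y)
print(f"Training accuracy: {ensemble.score(X, y):.2f}")
```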

As another example, Zaboras et al developed a classifier to predict bleeding in cancer patients on anticoagulation for cancer-associated thrombosis.[39] Their Extreme Gradient Boosting model outperformed the CAT-BLEED score, the only available score for this purpose, in predicting bleeding at 90, 365, and 365 + 90 days after VTE.[39] However, the authors noted that the model had limited sensitivity and would require refinement before clinical use. Additional methodological limitations include the reliance on registry data, limited calibration, and the lack of external validation.

Besides these clinical use cases, machine-learning models can also improve or simplify research.[40] One example is the automated detection of certain diseases in electronic health records for retrospective studies. While diagnosis codes are often available, their accuracy varies, especially since it is not always clear if the diagnosis is current or historical. A recent meta-analysis by Lam et al pooled data from eight studies on natural language processing for venous thromboembolism detection.[41] The sensitivity and specificity of these models for detecting venous thromboembolism from free-text radiology or narrative reports in electronic health records were high. However, most of these studies were conducted in English-speaking countries.
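As a deliberately simplified sketch of this idea, a bag-of-words classifier can flag free-text reports that describe thrombosis. The example reports and labels below are invented and far simpler than the natural language processing systems pooled in the cited meta-analysis.

```python
# Minimal sketch of text classification on free-text reports.
# Reports and labels are toy examples; 1 = VTE present, 0 = absent.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reports = [
    "filling defect in the right pulmonary artery consistent with embolism",
    "no evidence of deep vein thrombosis in either lower extremity",
    "acute thrombus within the left popliteal vein",
    "lungs are clear, no pulmonary embolism identified",
]
labels = [1, 0, 1, 0]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(reports, labels)
print(clf.predict(["nonocclusive thrombus in the femoral vein"]))
```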



The Implementation Gap

Despite their potential, few machine-learning models are currently used in clinical practice. In the European Union (EU) and the United States, machine-learning models must be registered as medical devices, except for those in certain legal gray areas as described earlier.[34] The EU's approval process is decentralized and handled by “Notified Bodies,” private institutions that perform conformity assessments.[42] Although the EU has had a centralized medical device register (EUDAMED) since 2011, it does not specifically list machine-learning–enabled devices.[43] Therefore, we will describe the landscape of machine-learning models for clinical use based on the U.S. Food and Drug Administration's (FDA) list.

As of May 13, 2024, the FDA has approved 882 AI or machine-learning–enabled devices.[44] Most of these devices (671; 76.1%) are in radiology, which adopted machine learning early. In contrast, the hematology section has only 17 devices (1.9%). These are mostly computer vision–based peripheral blood smear analyzers for large medical laboratories. One example is the Scopio X100HT, which, according to the manufacturer, processes 40 slides per hour.[45] [46] Other blood count devices with different intended purposes include the Sight OLO, a point-of-care device using cassettes for differential blood counts, and the Athelas Home and One, patient self-sampling devices that detect neutropenia in patients treated with clozapine.[47] These devices were validated by their respective manufacturers and found to perform similarly to established hematology analyzers. Additionally, 23andMe has registered a device estimating hereditary thrombophilia risk, focusing on mutations like factor V Leiden and prothrombin G20210A.[48]

Despite the large body of research on machine-learning models, very few have made it to clinical practice. The reasons for this are manifold. Often, models are developed not with a clear clinical question in mind but because data happen to be available, which leads to clinically useless models. Additionally, machine-learning experts and clinicians often do not work in the same teams. All of this points toward the need for a concise methodological framework for the development and validation of machine-learning models. In the next section, we outline such a framework.[49] [50] [51] [52]



A Methodological Framework to Avoid Common Pitfalls

Drawing on our experience with new biomarkers and other diagnostic tests, we proposed a step-by-step framework to ensure clear clinical use cases, validity, and efficiency.[49] [50] [51] [52] In the following paragraphs, we will discuss the most important development and implementation pitfalls and how to avoid them ([Fig. 3]).

Fig. 3 Areas of potential pitfalls in the development of medical machine learning.

Defining a Clinical Need and Research Question

Defining a clinical use case is the first step in developing any useful medical tool. Devices that do not meet a clear clinical need are not used, wasting scarce resources. These tools might even misinform physicians, potentially harming patients.[53] Therefore, the first phase of development should involve focus group discussions with relevant stakeholders, including patients and physicians.[54] The European Federation of Clinical Chemistry and Laboratory Medicine outlines four key questions to guide these discussions: (1) What clinical management problem needs solving? (2) Are there existing solutions? (3) What improvement or contribution will the new tool provide? (4) Is the new tool feasible in everyday clinical practice?[49] From the clinical need, a research question can be derived, clearly defining the study design, patient population, and desired outcomes.



Training Data Selection and Face Validity

One of the key advantages of machine-learning models is their ability to detect subtle data patterns, but this also makes them prone to overfitting, where they find patterns present only in the training data.[17] To avoid overfitting, selecting high-quality training data is crucial. The first consideration should be the appropriate patient population and study design.[55] Ideally, a model should be developed for the specific patient population it will serve. For diagnostic models, this means including patients suspected of having the disease, and for prognostic models, patients at risk.[55] Prospective studies or randomized clinical trials are generally preferable to retrospective ones because predictors are collected before outcomes are known, which inherently blinds the assessment. However, careful planning is necessary to avoid biases, such as selection or spectrum bias.[56] Retrospective studies, while less resource-intensive and able to generate larger datasets, are often biased.
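The gap between apparent and honestly estimated performance can be shown in a few lines. In the following sketch on synthetic data, accuracy measured on the training data itself is near-perfect, while cross-validated accuracy, estimated on data the model has not seen, is lower; all settings are illustrative.

```python
# Minimal sketch of why performance must be estimated on unseen data:
# apparent (training) accuracy is optimistic, cross-validation less so.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_informative=5, random_state=1)

model = RandomForestClassifier(random_state=1).fit(X, y)
print(f"Apparent (training) accuracy: {model.score(X, y):.2f}")   # near 1.0

cv_scores = cross_val_score(RandomForestClassifier(random_state=1), X, y, cv=5)
print(f"Cross-validated accuracy:     {cv_scores.mean():.2f}")    # more honest
```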

To ensure a clinically useful model, the selection of predictors (feature engineering and selection) requires special consideration. While, from a strictly machine-learning perspective, the features with the highest predictive value should be selected, additional factors are important in medicine.[57] Some features, like biopsies, may not be ethical or economically viable to collect from every patient. Therefore, it is important to involve focus groups in the feature selection process and consider these factors as well. Additionally, since physicians are ultimately responsible for patient care, face validity and transparency are essential.[58] Face validity, a psychological concept, indicates whether a test appears to measure what it is supposed to measure. Tests with low face validity risk not being used.[58] Surveys of U.S. physicians show that while most are open to using machine-learning models, they require an understanding of the model's inputs and how it arrives at its outputs.[59] This highlights the importance of using interpretable machine-learning techniques to trace how a model makes its predictions.[60]
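One such inspection technique is permutation importance: it asks how much performance drops when a feature's values are randomly shuffled. The sketch below uses synthetic data with invented feature names; it illustrates the technique, not any specific published model.

```python
# Minimal sketch of permutation importance as one interpretability aid.
# Feature names are hypothetical placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=4, random_state=0)
names = ["platelet_drop", "timing_score", "d_dimer", "age"]   # invented

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

result = permutation_importance(model, X_te, y_te, n_repeats=20, random_state=0)
for name, imp in sorted(zip(names, result.importances_mean), key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")   # larger drop = more influential feature
```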



Implementation

The implementation of a model is often overlooked but is crucial for clinical adoption. Even the most accurate model is useless if it is not used. Given the time constraints doctors face, the tool must integrate smoothly into the current workflow.[61] Ideally, it should be implemented within the electronic health record or laboratory information system. Another straightforward option is developing a web application using frameworks like Shiny for R or Flask for Python, which researchers can implement easily.[62] [63]
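A minimal sketch of such a web service with Flask follows; the pickled model file ("model.pkl") and the JSON input format are assumptions for illustration, not a production-grade or regulatory-compliant deployment.

```python
# Minimal sketch of serving a trained model as a prediction web service.
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

with open("model.pkl", "rb") as f:   # hypothetical pickled scikit-learn model
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body such as {"features": [[0.3, 1.2, 0.8]]}.
    features = request.get_json()["features"]
    prob = model.predict_proba(features)[0, 1]
    return jsonify({"probability": float(prob)})

if __name__ == "__main__":
    app.run(port=5000)
```

A request such as `curl -X POST -H "Content-Type: application/json" -d '{"features": [[0.3, 1.2, 0.8]]}' http://localhost:5000/predict` would then return the predicted probability.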



External Validation

Validating the model in an external cohort is essential to confirm its diagnostic performance.[64] Internal validation, often the final development phase, involves testing the model against a hold-out or time-displaced set from the original training cohort. However, this provides only a rough performance estimate.[17] External validation is necessary to obtain an unbiased estimate and identify potential overfitting.[64] This involves conducting a similar study to the development study, applying the same considerations. Additionally, the impact of the diagnostic tool on patient outcomes can be measured through a randomized controlled trial, although this is rarely done due to high costs.
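In code, external validation amounts to applying the frozen model to an independent cohort and reporting both discrimination and calibration, as in the following sketch; the file and column names are hypothetical.

```python
# Minimal sketch of external validation of a frozen model on an
# independent cohort; "model.pkl" and "external_cohort.csv" are
# hypothetical placeholders.
import pickle

import pandas as pd
from sklearn.calibration import calibration_curve
from sklearn.metrics import roc_auc_score

with open("model.pkl", "rb") as f:              # model frozen after development
    model = pickle.load(f)

external = pd.read_csv("external_cohort.csv")   # independent validation study
X_ext = external.drop(columns=["label"])
y_ext = external["label"]

probs = model.predict_proba(X_ext)[:, 1]
print(f"External AUROC: {roc_auc_score(y_ext, probs):.2f}")

# Calibration: do predicted probabilities match observed frequencies?
frac_pos, mean_pred = calibration_curve(y_ext, probs, n_bins=5)
for pred, obs in zip(mean_pred, frac_pos):
    print(f"predicted {pred:.2f} -> observed {obs:.2f}")
```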



Regulatory Approval

Regulatory approval is the basis for the legal use and insurance reimbursement of a machine-learning model in clinical practice. In the European Union, machine-learning models are classified as “software as a medical device” and are governed by the Medical Device Regulation (MDR).[34] The MDR uses a risk-based approach, requiring all but the lowest-risk categories to undergo systematic clinical evaluation and post-market surveillance.[34] The regulation also emphasizes the importance of prospective real-world data. Obtaining approval is a long and costly process, feasible only in collaboration with an industry partner. Without regulatory approval, models can effectively be used only as research tests that should not impact patient care.



Conclusion

Machine-learning algorithms hold the potential to transform healthcare by enhancing care and optimizing processes amid rising costs and personnel shortages. Despite the promising advancements, an implementation gap remains due to various methodological and practical challenges. Ensuring high-quality training data, appropriate feature selection, and robust validation, including external cohort validation, are critical steps to mitigate overfitting and confirm performance. Effective integration into clinical workflows and obtaining regulatory approval, which involves systematic evaluation and post-market surveillance, are essential for clinical adoption. By adhering to a comprehensive methodological framework, these challenges can be addressed, enabling machine-learning models to realize their full potential in clinical practice.



Conflict of Interest

The authors declare that they have no conflict of interest.


Address for correspondence

Henning Nilius, MD
Inselspital, University Hospital of Bern
3010 Bern
Switzerland   

Publication History

Received: 15 August 2024

Accepted: 19 September 2024

Article published online:
05 November 2024

© 2024. Thieme. All rights reserved.

Georg Thieme Verlag KG
Stuttgart · New York

