Pharmacopsychiatry 2023; 56(06): 209-213
DOI: 10.1055/a-2142-9325
Review

Challenges and Ethical Considerations to Successfully Implement Artificial Intelligence in Clinical Medicine and Neuroscience: a Narrative Review

Scott Monteith
1   Department of Psychiatry, Michigan State University College of Human Medicine, Traverse City Campus, Traverse City, MI, USA
,
Tasha Glenn
2   ChronoRecord Association, Fullerton, CA, USA
,
John R. Geddes
3   Department of Psychiatry, University of Oxford, Warneford Hospital, Oxford, UK
,
Eric D. Achtyes
4   Department of Psychiatry, Western Michigan University Homer Stryker M.D. School of Medicine, Kalamazoo, MI, USA
,
Peter C. Whybrow
5   Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles (UCLA), Los Angeles, CA, USA
,
Michael Bauer
6   Department of Psychiatry and Psychotherapy, University Hospital Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany

Abstract

This narrative review discusses how the safe and effective use of clinical artificial intelligence (AI) prediction tools requires recognition of the importance of human intelligence. Human intelligence, creativity, situational awareness, and professional knowledge are required for successful implementation. The implementation of clinical AI prediction tools may change the workflow in medical practice, resulting in new challenges and safety implications. Human understanding of how a clinical AI prediction tool performs in routine and exceptional situations is fundamental to successful implementation. Physicians must be involved in all aspects of the selection, implementation, and ongoing product monitoring of clinical AI prediction tools.



Introduction

The use of artificial intelligence (AI) to augment human intelligence in medicine is expected to reshape healthcare [1] [2]. The safe and effective use of clinical AI prediction tools requires recognition of the importance of human involvement and of the technical limitations of AI. Successful use of clinical AI prediction tools requires human intelligence, creativity, situational awareness, and professional knowledge to interpret and integrate results, and to identify and handle exceptions. The implementation of clinical AI prediction tools may change the workflow in medical practice, resulting in multiple new challenges and safety implications. For example, research is focused on developing AI tools to predict treatment response to specific drugs, such as antidepressants [3] [4] [5]. In the future, such tools will change the workflow, requiring the physician to agree or disagree with each recommendation. To help realize the potential benefits, this narrative review discusses some of the important and diverse challenges involved in the successful implementation of clinical AI prediction tools in medicine.

Fundamental differences between human intelligence and artificial intelligence

The successful implementation of AI tools in medicine requires recognition of the unique importance of human involvement. Although human brains and computers are often compared, there are many fundamental differences. A single human brain stores roughly the same amount of information as the entire Internet [6] [7]. A typical human brain has about 200 billion nerve cells connected via trillions of synapses, more than all the computers, routers, and Internet connections on Earth [8] [9]. In the human brain, neurons and their synapses perform both data storage and data processing [10]. In contrast, computers separate data storage from data processing, and must spend considerable energy moving data [10]. The adult brain is extraordinarily energy efficient, requiring only about 20 watts of power [10] [11]. A brain can perform an exaflop (a billion-billion mathematical operations) per second on about 20 watts, whereas an advanced supercomputer required 20 megawatts for the same computations, a million times more power [11].

Human intelligence is very different from AI. Human intelligence deals with uncertainty and responds to very small amounts of data [12]. Humans evaluate the trustworthiness of new information and integrate it with accumulated wisdom [13]. Humans frame decision making using mental models that allow us to understand and make abstractions [13] [14]. Causality is a fundamental aspect of human decision making, and causal knowledge underlies much of what humans do, even when the underlying mechanisms are not understood [15]. Human reasoning about cause and effect allows humans to ask why [16]. Human evaluation of information includes the creation of constraints, abstractions, and counterfactuals, so that different people given the same data may reach different answers. In contrast to human intelligence, AI systems do not understand causality [17]. AI does not capture the human understanding that if x causes y, it does not follow that y causes x [17]. AI cannot ask why [16]. AI cannot create constraints or counterfactuals, or generate abstractions [14]. AI assumes that the same inputs will always result in the same prediction [18]. Judea Pearl noted that although the achievements of modern deep learning are impressive, they can still be described today as “curve fitting” [17].



Artificial intelligence technical challenges

The physician should expect technical challenges during the implementation of clinical AI prediction tools. Currently, most AI, including in medicine, is based on data-intensive machine learning (ML) methods [19] [20]. ML uses very large training datasets to determine the best model (data variables and equations) for predicting an outcome, with the model remaining an opaque black box. The accuracy of clinical AI prediction tools is tied to the training data: better-quality data produces better-quality predictions. The electronic medical record (EMR) and claims data routinely used as training data in medicine have quality problems related to inaccuracy, missing data, biases, coding errors, lack of diversity, unrepresentative samples, and lack of vendor software interoperability [21]. There are additional data quality concerns in psychiatry due to the high frequency of missing behavioral health data in the EMR [21] [22]. Compared with nonmedical domains, the training data available for clinical AI prediction tools in psychiatry are much smaller [23], and a small training dataset will decrease the accuracy of predictions [24] [25]. It is harder to test ML applications than conventionally coded applications [26], and there is no standard for communicating the amount of uncertainty in an ML prediction [27].
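
To make the point about training data size concrete, the following is a minimal sketch using scikit-learn and purely synthetic data; the dataset, model, and sample sizes are illustrative assumptions, not taken from the cited studies. The same model is trained on progressively larger subsets of the training data, and its accuracy on held-out data generally improves as the training set grows.

```python
# Illustrative sketch with synthetic data (not from the cited studies):
# accuracy on held-out data generally improves as the training data grow.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic "patients": 20 features and a binary outcome.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Train the same model on increasingly large subsets of the training data.
for n in [50, 200, 1000, 2500]:
    model = LogisticRegression(max_iter=1000).fit(X_train[:n], y_train[:n])
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"training examples: {n:5d}  held-out accuracy: {acc:.3f}")
```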

Another area of concern with the data used to train ML clinical AI prediction tools is dataset shift, in which the population used to train the model differs from the population in which the model is deployed. When a clinical AI prediction tool is implemented in a setting where the patient population characteristics differ from the training data, the AI often does not perform well [28] [29] [30]. Many diverse factors contribute to dataset shift in medicine, including changes in patient demographics, standards of care, treatment practices, disease prevalence, and technology use [28]. Additionally, there is a reproducibility crisis in all scientific fields that use ML [31], including healthcare [32], and a need to address the reproducibility challenge for clinical AI prediction tools [33] [34] [35]. The problem of reproducibility of ML models emphasizes the need for validated clinical AI prediction tools that have received approval from appropriate regulatory bodies.
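
The following minimal sketch illustrates dataset shift with purely synthetic data; the “lab value,” population means, and model are illustrative assumptions, not a clinical model. The underlying rule linking the feature to the outcome is identical at both sites, but because the model was fitted only to the patient mix seen at the training site, its performance drops sharply at a deployment site with a different patient mix.

```python
# Illustrative sketch of dataset shift (synthetic data, not a clinical model).
# The true rule is identical at both sites; only the patient mix differs.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def outcome(x):
    # Fixed underlying rule at every site: high risk when a (hypothetical)
    # lab value is far from its midpoint of 5.0, in either direction.
    return (np.abs(x[:, 0] - 5.0) > 2.0).astype(int)

# Training site: lab values mostly above the midpoint.
X_train = rng.normal(loc=7.0, scale=2.0, size=(2000, 1))
# Deployment site: lab values mostly below the midpoint.
X_deploy = rng.normal(loc=3.0, scale=2.0, size=(2000, 1))

model = LogisticRegression(max_iter=1000).fit(X_train, outcome(X_train))

print("accuracy at training site:  ",
      round(accuracy_score(outcome(X_train), model.predict(X_train)), 3))
print("accuracy at deployment site:",
      round(accuracy_score(outcome(X_deploy), model.predict(X_deploy)), 3))
```

The model looks accurate where it was trained but fails on the shifted population, because it never saw, and cannot extrapolate to, the other half of the U-shaped relationship.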



Formal implementation process

A carefully considered, clearly defined process for implementing clinical AI technology in medical settings will improve the quality of the results. Despite the high expectations, there is a well-documented productivity paradox, a delay of years between the adoption of a new technology and increases in productivity, including in medicine [21]. The introduction of any new clinical technology, including AI prediction tools, will change the workflow, often in unexpected ways, and may result in new types of human error and failure paths [36]. Physicians, clinical support staff, management, and technical staff should all be involved in the selection, implementation, and ongoing monitoring and maintenance of clinical AI prediction tools [37]. This includes physician training, workflow changes and clinical impacts, product testing, integration into current systems, and ongoing monitoring and reporting of product performance and accuracy [37]. In medicine, many clinical AI prediction tools are developed internally. This is very expensive in the long term, as the ongoing costs to maintain reliable systems are much higher for ML than for traditional software [38]. Vendor contracts for clinical AI prediction tools should explicitly define responsibilities related to ML maintenance, administration, and enhancements.



Expect artificial intelligence errors

A fundamental assumption by physicians in an AI implementation should be that clinical AI prediction tools will make errors. There will be errors from any predictive model, whether based on AI or traditional statistics. The consequences of false positives, false negatives, and other errors will vary with the situation [24]. Much of the commercial use of AI is in low-risk situations, such as a product recommendation on Amazon, where the costs of errors are financial [24]. In contrast, the potential impacts of errors from clinical AI prediction tools in medicine emphasize the need for human oversight and error tracking. Clinical AI prediction tools must perform in exceptional situations and boundary cases, as well as during routine activities. AI tools are not good at predicting rare events [24]. The results of AI prediction tools may conflict with current practice guidelines [39]. The result of an AI prediction could be plausible but incorrect, and potentially dangerous for an individual patient [40]. Additionally, when the result from an opaque black-box algorithm is incorrect, it is not clear what went wrong or what should be fixed [41]. Physicians who learned their skills based on interpretation of raw data values may not function as well when only a prediction is presented [36]. AI failures may lead to unexpected safety hazards not seen previously, emphasizing the importance of physician training on potential ML dysfunction [42].
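
A minimal sketch, again with synthetic data and an assumed 2% event rate (an illustrative assumption, not a clinical figure), shows why a tool can appear accurate overall while missing most of the rare events it was meant to predict; these false negatives are exactly the kind of error that needs explicit tracking.

```python
# Illustrative sketch (synthetic data): high overall accuracy can conceal
# many false negatives when the outcome of interest is rare.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score

# Binary outcome in which only ~2% of cases are positive (the rare event).
X, y = make_classification(n_samples=20000, n_features=10, weights=[0.98],
                           class_sep=0.8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
pred = model.predict(X_test)

print("overall accuracy:", round(accuracy_score(y_test, pred), 3))  # looks high
print("sensitivity (recall) for the rare event:",
      round(recall_score(y_test, pred), 3))                         # much lower
print("confusion matrix [[TN, FP], [FN, TP]]:")
print(confusion_matrix(y_test, pred))
```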

A comprehensive implementation plan for AI will include error tracking and the implementation of enhancements as an integral part of using any ML tool. There must be clearly defined and documented methods for physicians who use clinical AI prediction tools to identify, record, and track errors. Physicians must understand what is expected of them in relation to error reporting and tracking. A framework for continuous tracking and reporting of errors in clinical AI prediction tools is especially important with ML due to the lack of model transparency, including for some Food and Drug Administration-approved ML tools [43].
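
As one hedged illustration of what clearly defined and documented error reporting might look like in software, the sketch below defines a hypothetical error report record; the field names and categories are assumptions made for illustration, not a published standard or any vendor's API.

```python
# Hypothetical sketch of a structured error report for a clinical AI
# prediction tool; fields are illustrative, not a standard.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AIPredictionErrorReport:
    tool_name: str             # which clinical AI prediction tool produced the output
    tool_version: str          # model/software version in use at the time
    prediction: str            # what the tool predicted or recommended
    clinician_assessment: str  # what the reviewing physician concluded instead
    error_type: str            # e.g., "false positive", "false negative", "implausible output"
    clinical_impact: str       # e.g., "no harm", "near miss", "harm"
    action_taken: str          # override, escalation, report to vendor, etc.
    reported_by: str           # role of the reporter (no patient-identifying data)
    reported_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# Example use: a physician overrides an implausible recommendation and logs it.
report = AIPredictionErrorReport(
    tool_name="treatment-response predictor",
    tool_version="1.4.2",
    prediction="high likelihood of response to drug A",
    clinician_assessment="prediction inconsistent with history of non-response",
    error_type="implausible output",
    clinical_impact="no harm",
    action_taken="prediction overridden; case forwarded for model review",
    reported_by="attending psychiatrist",
)
print(report)
```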



Central importance of humans

An AI implementation plan must recognize that humans are central to the successful use of clinical AI prediction tools. The diverse types of errors from clinical AI prediction tools, and the potential for serious consequences, highlight the importance of human intelligence in the process [24]. Increasing complexity in an automated system increases the need for human judgement and situational awareness when unexpected errors occur [36] [44]. The implementation of clinical AI prediction tools must ensure adequate physician review such that predictions can be overridden if necessary. Unlike ML, human clinical decision making is tied to context [36] [45]. The black-box nature of ML can make it difficult to detect biases, or to understand or trust the results [43] [46]. Although there are ongoing efforts to provide explainability for ML models, explainability techniques have limitations and drawbacks [47] [48]. There are also complex problems in medicine for which humans disagree on the best solution. Concern about the quality of evidence available for many clinical AI prediction tools is widespread, along with recognition of the need to improve and expand government regulation [49] [50].



Unintended consequences of artificial intelligence

The implementation of any new technology, including AI, results in unintended consequences [36] [51] [52]. The introduction of AI into routine clinical practice can lead to overreliance on the technology, automation bias, and automation complacency. Automation bias occurs when the user gives greater authority to automated advice than to other sources of advice [53]. The risk of automation bias increases when it is hard to verify whether the automation is performing correctly, as is the case in clinical medicine [54]. Automation complacency occurs when a user in a multitasking environment focuses on the manual tasks and does not notice errors in the automated tasks [53]. One concern is that most psychiatrists have no formal training in technology and may be unaware of the risks and drawbacks of AI [55]. Another possible consequence is that overreliance on AI will lead to deskilling, reducing the clinical knowledge and the patient communication and examination skills of a physician [45] [56].



Limitations

There are many limitations to this discussion. With a focus on implementation, the important topic of validation standards was not discussed. Unsolved technical issues with ML that may negatively impact safety, including biases, the presentation of uncertainty in results [57], and cybersecurity [58] [59], were not discussed. The use of interpretable models rather than black-box ML models was not discussed [60]. The unique challenges of the ongoing governance and regulation of ML in healthcare were not reviewed [49] [61]. Specific measures to mitigate the risks of implementing AI were not discussed. Legal issues, including physician responsibility and liability for errors made by AI products, were not included [62] [63].



Conclusion

The use of clinical AI prediction tools should emphasize the importance of human intelligence. The fundamental determinant of implementation success will be human understanding of how the AI technology performs in routine and exceptional situations. Human intelligence, creativity, situational awareness, and professional knowledge are required to interpret, integrate, and handle results, and to recognize exceptions. A clinical AI prediction tool may assist a physician after its results are evaluated in the appropriate context for the individual patient. Physicians must be involved in all aspects of the selection, implementation, and ongoing product monitoring of clinical AI prediction tools.



Author Contributions

SM and TG wrote the initial draft. All authors reviewed and approved the final manuscript.



Conflict of Interest

The authors have no conflicts of interest to declare.


Correspondence

Scott Monteith, MD
Michigan State University College of Human Medicine
Traverse City Campus
1400 Medical Campus Drive
Traverse City, MI 49684
USA   

Publication History

Received: 25 April 2023
Received: 13 June 2023

Accepted: 16 June 2023

Article published online:
29 August 2023

© 2023. Thieme. All rights reserved.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany