CC BY-NC-ND 4.0 · Yearb Med Inform 2023; 32(01): 215-224
DOI: 10.1055/s-0043-1768735
Section 9: Knowledge Representation and Management
Survey

Advancing Biomedicine with Graph Representation Learning: Recent Progress, Challenges, and Future Directions

Fang Li, Yi Nian, Zenan Sun, Cui Tao
McWilliams School of Biomedical Informatics, the University of Texas Health Science Center at Houston, Houston, TX, USA

Summary

Objectives: Graph representation learning (GRL) has emerged as a pivotal field that has contributed significantly to breakthroughs in various fields, including biomedicine. The objective of this survey is to review the latest advancements in GRL methods and their applications in the biomedical field. We also highlight key challenges currently faced by GRL and outline potential directions for future research.

Methods: We conducted a comprehensive search of multiple databases, including PubMed, Web of Science, IEEE Xplore, and Google Scholar, to collect relevant publications from the past two years (2021-2022). The studies selected for review were based on their relevance to the topic and the publication quality.

Results: A total of 78 articles were included in our analysis. We identified three main categories of GRL methods and summarized their methodological foundations and notable models. In terms of GRL applications, we focused on two main topics: drug and disease. We analyzed the study frameworks and achievements of the prominent research. Based on the current state-of-the-art, we discussed the challenges and future directions.

Conclusions: GRL methods applied in the biomedical field demonstrated several key characteristics, including the utilization of attention mechanisms to prioritize relevant features, a growing emphasis on model interpretability, and the combination of various techniques to improve model performance. Several challenges also need to be addressed, including mitigating model bias, accommodating the heterogeneity of large-scale knowledge graphs, and improving the availability of high-quality graph data. To fully leverage the potential of GRL, future efforts should prioritize these areas of research.



1 Introduction

The paradigm of evidence-based precision medicine has evolved toward the profound utilization of large volumes of data, driven by the rapid development of high technologies and the increasing availability of biomedical data [[1]]. A graph is a data structure that comprises nodes and edges, representing entities and relationships between them, respectively [[2]]. Graphs have emerged as a major form for describing ubiquitous real-life systems, owing to their ability to model complex temporal and spatial relationships between entities [[3], [4]]. Graph-structured data are pervasive in biomedicine and healthcare, representing information at the molecular level (such as chemical structure [[5]] and gene regulatory network [[6]]), the patient level (such as comorbidity network [[7]]), and the population level (such as epidemic network [[8]] and healthcare system [[9]]). Knowledge graphs (KGs), a type of heterogeneous graph, are used to represent networked entities and relationships [[10]], where entities can denote real objects or theoretical concepts, and relationships indicate their associations. Moreover, entities and relationships are endowed with types and properties that accurately convey their semantics [[11]]. Graphs, together with KGs, support many cutting-edge applications in healthcare, including drug repurposing [[12]], disease risk prediction [[13]], and protein-protein interaction (PPI) prediction [[14]], and can be used to generate new hypotheses that are ultimately translated into clinically actionable outcomes.

Over the past two decades, machine learning (ML), specifically deep learning (DL), has been successful in vast healthcare scenarios, such as medical imaging and diagnostics [[15]], drug discovery [[16]], and health insurance and fraud detection [[17]]. However, these techniques were mainly designed to process Euclidean data, such as electronic health records (EHRs), text, and images, and may not handle non-Euclidean graph data directly [[2]]. The distinction between Euclidean and non-Euclidean data is the underlying geometry used to represent the data: Euclidean geometry deals with flat spaces of any dimension, while non-Euclidean geometry studies curved spaces [[18]]. The key challenge in utilizing graph data in ML models is finding a way to represent graph structure that is easy for the models to learn [[19]]. Graph representation learning (GRL), which embeds raw graph data into a low-dimensional space while preserving graph topology and node properties, can make graph data more amenable to ML [[3]]. GRL is a cutting-edge field of graph algorithms that has attracted significant interest from diverse fields, including computer science and biomedicine. It has proven to be a valuable tool for understanding biological systems [[20]], accelerating drug discovery [[21]], and enhancing disease diagnosis and treatment [[22], [23]]. With the potential to transform biomedicine and healthcare, GRL provides new insights into complex disease mechanisms and facilitates the development of personalized health plans. However, despite its remarkable success, GRL still faces several challenges, such as the need for a better theoretical understanding of the methods, improving scalability and interpretability in real systems, and ensuring the soundness of methodology while maintaining optimal empirical performance in applications [[24]].

To capture the most recent progress in this active and fast-growing field, and to shed light on the direction of future efforts, we conduct this survey. Our survey focuses specifically on GRL methods and their applications. We also identify the key challenges that GRL faces and discuss the potential opportunities that can further advance the field.



2 Materials and Methods

This survey aimed to review the latest developments in GRL research in the healthcare field during the past two years (2021–2022). To ensure that the analysis was in-depth and up-to-date, we conducted a thorough search for relevant articles in multiple databases, including PubMed, Web of Science, IEEE Xplore, Google Scholar, and arXiv. The study selection criteria focused on topic relevance and publication quality, with preference given to high-impact-factor journals, top-tier conferences, and articles with a larger number of citations. By employing this rigorous approach, we included 78 representative articles, encompassing both original research and reviews. The selected studies were of high quality, ensuring a robust understanding of the latest advancements in GRL in the biomedical field.



3 Advances in GRL Methods

Over the last decade, GRL has emerged as a critical and pervasive research area, greatly improving the efficiency and flexibility of representation learning [[19]]. There are two settings for graph learning patterns: transductive learning and inductive learning (or reasoning) [[24], [25]]. Transductive learning involves observing all data, both training and test (with unknown labels), during training. The model learns from the observed training data and predicts the labels of the test data. On the other hand, inductive learning is more like traditional supervised learning: the model encounters only the training data during development, and the learned model is then applied to test data that it has never seen before. Transductive learning can generate node embeddings for existing nodes or suggest new relations (edges) in a fixed graph, while inductive learning generalizes to new graphs.

In this section, we will explore three fundamental categories of GRL methods, based on the classification defined in a few studies [[3], [26], [27]]. [Table 1] presents the principles, characteristics, and applicable tasks of these three GRL categories. Additionally, [Table 2] lists some notable GRL models that have been developed in recent years.

Table 1 Principles, characteristics, and applicable tasks of three categories of GRL methods.
Table 2 Recent notable GRL models.

3.1 Shallow Node Embedding

The purpose of shallow node embedding is to project nodes onto a latent space, which is a multi-dimensional vector space learned by a model based on the input data. This latent space serves as a summary of the local graph structure, and the original relations in the graph are then represented by the geometric relationships among the embedded nodes. Node embeddings are characterized by an encoding and decoding process [[26]], where the encoder maps each node to the latent embedding space, serving as an embedding lookup table, while the decoder reconstructs a graph statistic for a pair of embedded nodes. The encoder and decoder are optimized to minimize the loss between the decoded statistic and some node-based similarity metric. Notable shallow node embedding methods include DeepWalk [[28]], node2vec [[29]], struc2vec [[30]], and LINE [[31]].
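The encoder-decoder view described above can be sketched in a few lines of Python. This is a minimal illustration rather than the formulation of any cited method: the toy graph, the embedding dimension, the inner-product decoder, and the squared-error loss against the adjacency matrix are all assumptions chosen for brevity.

```python
import random

# Hypothetical toy graph: a 4-node path, stored as undirected edges.
EDGES = {(0, 1), (1, 2), (2, 3)}
NUM_NODES, DIM = 4, 2

def adjacency(i, j):
    """Target node-pair similarity: 1.0 if an edge exists, else 0.0."""
    return 1.0 if (i, j) in EDGES or (j, i) in EDGES else 0.0

def encode(table, node):
    """Encoder: a plain lookup in the embedding table."""
    return table[node]

def decode(z_i, z_j):
    """Decoder: inner product of two node embeddings."""
    return sum(a * b for a, b in zip(z_i, z_j))

def reconstruction_loss(table):
    """Squared error between decoded scores and adjacency over all node pairs."""
    loss = 0.0
    for i in range(NUM_NODES):
        for j in range(i + 1, NUM_NODES):
            diff = decode(encode(table, i), encode(table, j)) - adjacency(i, j)
            loss += diff * diff
    return loss

def train(steps=500, lr=0.05, seed=0):
    """Fit the embedding table with plain full-batch gradient descent."""
    rng = random.Random(seed)
    table = [[rng.uniform(-0.1, 0.1) for _ in range(DIM)] for _ in range(NUM_NODES)]
    initial = reconstruction_loss(table)
    for _ in range(steps):
        grads = [[0.0] * DIM for _ in range(NUM_NODES)]
        for i in range(NUM_NODES):
            for j in range(i + 1, NUM_NODES):
                err = decode(table[i], table[j]) - adjacency(i, j)
                for d in range(DIM):
                    grads[i][d] += 2 * err * table[j][d]
                    grads[j][d] += 2 * err * table[i][d]
        for i in range(NUM_NODES):
            for d in range(DIM):
                table[i][d] -= lr * grads[i][d]
    return table, initial, reconstruction_loss(table)

embeddings, loss_before, loss_after = train()
```

Because the encoder is only a lookup table, a node that was absent during training has no row to look up, which is one way to see why such methods are transductive.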

Shallow node embedding methods are relatively simple to implement and interpret. However, their transductive nature makes them less suitable for inductive reasoning, where the graph structure may change or not be pre-defined [[32], [33]]. Moreover, the shallow embedding methods only consider the topological structure of the graph as input and generate the embedding of nodes or edges, without considering any associated node and edge attributes.



3.2 Graph Neural Networks

Graph neural networks (GNNs) are neural networks designed to operate on graph data [[24]]. By learning compact representations of graph elements, their attributes, and supervised labels (if any), GNNs surpass shallow node embeddings in their ability to perform inductive reasoning and capture higher-order and nonlinear patterns through multi-hop propagation within several layers of neural message passing [[27], [34]].

Convolutional neural networks (CNNs) are among the most popular DL models used in computer vision applications [[35]], and have shown exceptional performance in tasks such as object detection [[36]] and image analysis [[37], [38]]. Although CNNs are traditionally used for structured Euclidean data, such as image pixels or text sequences, the concept of convolution to learn local connections can be adapted to non-Euclidean graphs using spectral and spatial approaches [[2]]. In the spectral approach, graph information is transformed to the spectral domain using the graph Fourier transform and the eigen-decomposition of the graph Laplacian [[39]], and convolution is performed on the graph spectrum. Graph convolutional network (GCN) [[40]], dual graph convolutional network (DGCN) [[41]], and Cluster-GCN [[42]] are typical GNN variants that use this approach. In the spatial approach, convolution is performed directly on the topological graph. However, unlike the convolution operation on image pixels, graph convolution lacks the weight-sharing property, and the number of neighbors varies from node to node. To address these challenges in the spatial domain, several models have been developed, such as diffusion-convolutional neural network (DCNN) [[43]], graph sample and aggregate (GraphSAGE) [[32]], and mixture model network (MoNet) [[44]].
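To make the idea of graph convolution concrete, the sketch below implements one layer using the normalized-adjacency propagation rule commonly associated with GCN [[40]], ReLU(D^-1/2 (A + I) D^-1/2 H W). It is an illustration under stated assumptions: the 3-node graph, the feature matrix, and the identity weight matrix are made up, and a real model would learn the weights and stack several layers.

```python
import math

def matmul(a, b):
    """Dense matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def gcn_layer(adj, features, weights):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 . H . W)."""
    n = len(adj)
    # Add self-loops so every node also keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalization handles neighborhoods of different sizes.
    norm = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]
    aggregated = matmul(norm, features)      # mix each node with its neighbors
    transformed = matmul(aggregated, weights)  # shared linear transform
    return [[max(0.0, v) for v in row] for row in transformed]

# Hypothetical toy inputs: a 3-node path graph with 2 features per node.
ADJ = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
FEATURES = [[1.0, 0.0],
            [0.0, 1.0],
            [1.0, 1.0]]
WEIGHTS = [[1.0, 0.0],
           [0.0, 1.0]]  # identity, so the output shows the aggregation alone

hidden = gcn_layer(ADJ, FEATURES, WEIGHTS)
```

Applying the layer k times lets information travel k hops, which corresponds to the multi-hop propagation mentioned at the start of this subsection.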

In addition to the aforementioned models, a number of other state-of-the-art neural networks are applicable to graphs. For instance, graph attention network (GAT) [[45]] employs the self-attention strategy to assign different weights to the neighbors of each node, allowing it to learn node representation on graphs with varying node degrees and enabling inductive learning. Gated-based models, such as Tree LSTM [[46]], gated graph neural network (GGNN) [[47]], and graph LSTM [[48]], utilize the gate mechanism to facilitate long-term information propagation. The gate operator allows information to be updated or discarded, which can help reduce the noise during the information propagation process.
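The neighbor weighting used by attention-based models can be sketched as follows. The code scores each node-neighbor pair with a LeakyReLU over a learned vector applied to the concatenated features and normalizes the scores with a softmax, in the spirit of GAT [[45]]. The toy features, the neighbor lists, and the attention vector `A` are assumptions; the shared linear transform and multi-head attention of the full model are omitted, so `H` stands for features that have already been transformed.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def attention_weights(h, neighbors, a):
    """Return {i: {j: alpha_ij}} with a softmax over each node's neighborhood."""
    alphas = {}
    for i, nbrs in neighbors.items():
        # Score each pair from the concatenation [h_i || h_j].
        scores = [leaky_relu(sum(w * x for w, x in zip(a, h[i] + h[j])))
                  for j in nbrs]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas[i] = {j: e / total for j, e in zip(nbrs, exps)}
    return alphas

def aggregate(h, alphas):
    """New representation of node i: attention-weighted sum of its neighbors."""
    dim = len(h[0])
    return [[sum(alpha * h[j][d] for j, alpha in alphas[i].items())
             for d in range(dim)] for i in sorted(alphas)]

# Hypothetical toy inputs; each neighbor list includes the node itself.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
NEIGHBORS = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
A = [0.5, -0.5, 1.0, 0.25]  # assumed attention vector of length 2 * feature dim

alphas = attention_weights(H, NEIGHBORS, A)
updated = aggregate(H, alphas)
```

Since the weights are computed from node features rather than from a fixed graph-wide normalization, the same parameters apply to nodes of any degree, including nodes unseen during training.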

Despite their success in a range of graph-based learning tasks, GNNs are often criticized for their lack of interpretability. As black box models, it can be challenging to discern how these networks make predictions or extract meaningful insights from the learned representations [[2]]. Additionally, the computational cost of GNNs can be prohibitive, particularly when dealing with large-scale biomedical graphs comprising millions of nodes and edges [[34]]. This constraint can limit their applicability in real-world scenarios where computational efficiency is critical.



3.3 Generative Graph Models

In recent years, generative graph models have emerged as a promising field in GRL. Unlike shallow embeddings and GNNs, which focus on learning embeddings of existing graphs, generative graph models leverage graph characteristics, such as graph structure, node and edge information, to generate new graphs that possess similar properties to the original graph.

Two popular generative graph models are variational autoencoders (VAEs) and generative adversarial networks (GANs). VAEs utilize stochastic variational inference to train an encoder and decoder that can generate graphs from a learned distribution based on a latent representation [[49]]. Models such as variational graph auto-encoder (VGAE) [[50]], GraphVAE [[51]], and junction tree variational autoencoder (JT-VAE) [[52]] are examples of VAE-based approaches. On the other hand, GANs consist of a generator that produces fake samples and a discriminator that distinguishes between real and fake data [[53]]. The discriminator is trained to increase the likelihood of identifying the true samples as real and the generated samples as fake, while the generator is trained to produce samples that the discriminator cannot tell apart from real ones. GraphGAN [[54]] and MolGAN [[55]] are two examples of GAN-based generative graph models. These generative graph models have demonstrated great potential in expediting biomedical discoveries, including drug development [[55]] and protein structure construction [[56]].
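For reference, the two training objectives mentioned above can be written compactly in their standard generic forms. The notation is an assumption introduced here, not taken from the cited graph-specific models: $G$ denotes an observed graph, $z$ a latent variable, $q_\phi$ the encoder, $p_\theta$ the decoder, and $\mathrm{Gen}$ and $D$ the GAN generator and discriminator.

```latex
% VAE: maximize the evidence lower bound (reconstruction term minus KL regularizer)
\mathcal{L}_{\mathrm{VAE}}(\theta, \phi)
  = \mathbb{E}_{q_\phi(z \mid G)}\bigl[\log p_\theta(G \mid z)\bigr]
  - \mathrm{KL}\bigl(q_\phi(z \mid G) \,\|\, p(z)\bigr)

% GAN: the discriminator maximizes and the generator minimizes the same value function
\min_{\mathrm{Gen}} \max_{D} \;
  \mathbb{E}_{G \sim p_{\mathrm{data}}}\bigl[\log D(G)\bigr]
  + \mathbb{E}_{z \sim p(z)}\bigl[\log\bigl(1 - D(\mathrm{Gen}(z))\bigr)\bigr]
```

Graph-specific variants differ mainly in how the decoder or generator emits discrete nodes and edges, not in the overall shape of these objectives.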

However, there are still challenges to overcome, such as scalability and interpretability, to make the generative graph models more applicable to real-world scenarios. Additionally, generative graph models can be challenging to replicate, primarily due to their high sensitivity to the initial random seed used during the graph generation process [[57]]. As a result, even minor variations in the seed value can lead to significant differences in the generated graph structure, making it difficult to reproduce the same results.



4 Advances in GRL Applications for Biomedicine

From a graph ML perspective, research on GRL applications can be divided into various tasks, including node, triple, and graph classification, link (relation) prediction, node and graph clustering, and graph generation [[26]]. Considering the extensive range of GRL application studies available, we selected two crucial healthcare topics, namely drug and disease, to summarize some noteworthy studies. [Table 3] outlines the key components of GRL applications in research related to drug and disease.

Table 3 Core elements of KG applications in drug and disease-related research.

4.1 Drug Development and Related Association Predictions

4.1.1 In Silico Drug Repurposing

In the field of drug discovery and development, in silico drug repurposing, which involves the computational identification of new indications and targets for already marketed drugs [[58]], continues to be an attractive proposition. Drug repurposing relies on de-risked drugs, which have the potential to offer lower development costs and shorter development life cycles [[12]]. The primary objective of drug repurposing is to identify candidate drugs that have a high probability of being associated with the therapeutic indication of interest [[59]]. This task can be framed as a link prediction challenge that aims to identify potential drug-target interactions (DTIs) or drug-disease associations with a high level of confidence.

The common approach for in silico drug repurposing involves predicting DTIs. A drug target is a protein or other biomolecule (such as DNA, RNA, and peptide) to which the drug directly binds and which is responsible for the drug's therapeutic efficacy [[60]]. Peng et al. [[61]] developed an end-to-end learning-based framework (EEG-DTI) that employed heterogeneous GCNs for DTI prediction. Specifically, a heterogeneous network was created by merging multiple biological networks. A three-layer GCN was then implemented to produce low-dimensional embeddings for drugs and proteins using information from their neighbors in the heterogeneous network. The drug and protein embeddings were concatenated, and the inner product was used to calculate the drug-protein interaction score (i.e., DTI prediction). Li et al. [[62]] introduced a multi-channel GCN and GAT-based framework (DTI-MGNN) for DTI prediction, utilizing a topology graph (contextual representation), a feature graph (semantic representation), and a common representation of drug and protein pairs (DPPs). Xuan et al. [[63]] proposed a graph convolutional and variational autoencoder-based approach (GVDTI), which encoded multiple pairwise (drug-protein) representations. The pairwise representations were then fused by convolutional and fully connected neural networks for DTI prediction. Similarly, Hsieh et al. [[64]] utilized variational graph autoencoders with GraphSAGE message passing to generate drug embeddings and selected the most potent drugs for COVID-19. Ding et al. [[65]] employed a relational graph convolutional network (RGCN) to predict the drug-protein interactions and further predict the blood-brain barrier permeability of drug molecules.

In addition to predicting DTIs, another approach to in silico drug repurposing is predicting drug-disease associations. A deep understanding of the mechanism of drug action (MDA) is required for drug repurposing, which is often explained through biological pathways — a series of biochemical and molecular steps to achieve a specific function or to produce a certain product. To capture MDA and identify the critical paths from drugs to diseases in the human body, Yang et al. [[66]] proposed an interpretable DL-based path-reasoning framework (iDPath) that employed a multilayer biological network and various modules, including a GCN module, an LSTM module, and two attention modules (the node and path attention). Experiments showed that iDPath could identify explicit critical paths that were consistent with clinical evidence. Nian et al. [[67]] utilized semantic triples in SemMedDB for KG construction and drug-disease link prediction. They filtered the most relevant semantic triples for Alzheimer's disease (AD) using a BERT-based classifier and some rule-based methods, and trained graph embedding algorithms, such as TransE [[68]], DistMult [[69]], and ComplEx [[70]], to predict drug/chemical/food supplement candidates that may be helpful for AD treatment or prevention. Cai et al. [[12]] proposed a heterogeneous information fusion GCN approach (DRHGCN) for drug repurposing, which applied graph convolution operations to three networks to learn the embedding of drugs and diseases. DRHGCN also designed inter- and intra-domain feature extraction modules, and a layer attention mechanism to further improve the prediction performance. The experiment results demonstrated that DRHGCN identified several novel approved drugs for AD and Parkinson's disease.
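The graph embedding algorithms named above differ mainly in how they score a (head, relation, tail) triple. The sketch below shows the three scoring functions in their usual textbook forms and uses one of them to rank candidate heads for a fixed relation and tail, which is how link prediction produces a candidate list. The entity names and vectors are hypothetical placeholders, not data or results from any cited study, and the L1 norm for TransE is an assumed choice.

```python
def transe_score(h, r, t):
    """TransE: negative L1 distance between h + r and t (higher is more plausible)."""
    return -sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

def distmult_score(h, r, t):
    """DistMult: trilinear product; symmetric in head and tail."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))

def complex_score(h, r, t):
    """ComplEx: real part of the trilinear product with the conjugated tail."""
    return sum(hi * ri * ti.conjugate() for hi, ri, ti in zip(h, r, t)).real

# Hypothetical 2-dimensional embeddings, for illustration only.
CANDIDATES = {
    "candidate_a": [0.9, 1.1],
    "candidate_b": [-1.0, 0.0],
}
TREATS = [0.1, -0.1]   # assumed relation embedding
DISEASE = [1.0, 1.0]   # assumed tail-entity embedding

# Rank candidate heads by plausibility of (candidate, treats, disease).
ranked = sorted(CANDIDATES,
                key=lambda name: transe_score(CANDIDATES[name], TREATS, DISEASE),
                reverse=True)
```

DistMult gives the same score when head and tail are swapped, so it cannot distinguish the direction of a relation; ComplEx uses complex-valued embeddings to model asymmetric relations, as the swapped-argument check in the test illustrates.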



4.1.2 Drug-Drug Interaction Prediction

Drug-drug interactions (DDIs) occur when two or more drugs interact with each other and can alter the absorption of one or both drugs, leading to delayed, decreased, or enhanced effects. These interactions can have significant consequences, including synergistic effects, where the total effect of the drugs is greater than the sum of their individual effects, or antagonistic effects, where the drugs have opposing effects on the body, potentially reducing or blocking the effectiveness of one or more of the drugs [[71]]. Adverse effects can also occur as a result of DDIs. Synergistic DDIs can be beneficial, particularly for cancer therapy, because they allow for the use of lower doses of chemotherapy drugs while maintaining or even enhancing their effectiveness. By contrast, antagonistic DDIs may reduce the efficacy of medications and require additional or alternative treatments.

Dai et al. [[72]] proposed a novel framework for DDI classification using an adversarial autoencoder-based embedding approach (AAE). To address the challenge of generating high-quality negative samples, the authors utilized an autoencoder which learned to produce plausible negative triplets for the discriminator while minimizing reconstruction errors via the decoder component. The discriminator was trained on both the generated negative triplets and the original positive triplets to produce a robust and effective graph representation model. To tackle vanishing gradient issues in the discrete representation, the authors employed the Gumbel-Softmax relaxation and the Wasserstein distance for training the embedding model, which provided a more stable and efficient training process, allowing for improved performance and faster convergence.

Identifying synergistic anticancer drug combinations is a common scenario in synergistic DDI prediction. To address this, Wang et al. [[73]] proposed a DL-based framework called DeepDDS. The framework utilized a multilayer perceptron (MLP) to obtain the feature embedding of gene expression profiles of the cancer cell line, and either GAT or GCN to obtain the feature embedding of the drug (represented as a graph of molecular structures, from SMILES). The embedding vectors of the drug and the cell line were concatenated and fed into a multilayer fully connected network to predict the synergistic effect. The study also explored the interpretability of the GAT and found that the correlation matrix of atomic features revealed important chemical substructures of drugs. Yang et al. [[74]] developed GraphSynergy, a GCN-based framework for predicting synergistic DDIs. GraphSynergy encoded the high-order topological relationships in the PPI network between proteins that were targeted by a pair of drugs and were associated with a specific cancer cell line. The pharmacological effects of drug combinations were evaluated by their therapy and toxicity scores. An attention component was incorporated to capture the pivotal proteins that played a part in both the PPI network and biomolecular interactions between drug combinations and cancer cell lines.

Bang et al. [[75]] developed a graph feature attention network (GFAN) for predicting polypharmacy side effects with enhanced interpretability. Polypharmacy refers to the concurrent use of two or more different drugs. The GFAN model emphasized target genes differently for each side-effect prediction, making it capable of sensitively extracting target genes and providing interpretability. The experiments conducted by the authors showed that the GFAN model was effective in predicting polypharmacy side effects.



4.2 Disease and Related Association Predictions

4.2.1 Disease Prediction

Disease prediction using EHRs has become an area of significant research interest due to their increasing availability. EHR-based prediction and classification include predicting clinical risks, disease subtyping, and chronic disease onset, among others. However, conventional ML approaches rely heavily on abundant data to train the models, which can impede their performance in predicting rare diseases with severe data scarcity. Additionally, most existing disease prediction approaches rely on sequential EHRs, making it difficult to handle new patients without historical records.

To overcome these challenges, Sun et al. [[13]] proposed a GNN-based graph encoder that leveraged GATs and graph isomorphism networks (GINs) to learn highly representative node embeddings for patients. This approach utilized both the external knowledge base (the Human Phenotype Ontology) and patients' EHRs represented in the graph structure. The well-learned graph encoder can inductively infer the embeddings for a new patient, enabling the prediction of both general and rare diseases. The study demonstrated promising results in addressing the scarcity of training data for rare disease prediction.

EHRs contain tens of thousands of medical concepts that are implicitly connected. A feasible approach to improving EHR representation learning is to associate relevant medical concepts and leverage these connections. To this end, Zhu et al. [[23]] proposed a variationally regularized encoder-decoder graph neural network (VGNN) for EHRs that achieved robustness in graph structure learning by regularizing node representations. Another approach to leveraging connections among medical concepts is to exploit diagnoses as relational information by connecting similar patients in a graph. Rocheteau et al. [[76]] proposed such a strategy by designing an LSTM-GNN model for patient outcome prediction. The model extracted temporal features using LSTM and extracted the patient neighborhood information using GNNs. The results showed that the LSTM-GNN outperformed the LSTM-only baseline on length of stay prediction tasks on the eICU database, indicating that exploiting information from neighboring patient cases using GNNs is a promising research direction in EHR-based supervised learning.

Xia et al. [[77]] developed a medical conversational question-answering system that utilized a multi-modal clinical KG as its knowledge base to support entity reasoning, such as diseases, medical examinations, and drugs based on the patient's symptoms collected by the system. The system was equipped with advanced natural language processing (NLP) techniques, such as contrastive learning, prompting, a bi-directional encoder, and an autoregressive decoder, which helped it achieve state-of-the-art performance. With the multi-modal clinical KG and advanced NLP techniques, the system can answer medical questions in a conversational manner, making it a promising tool for assisting clinical decision-making and patient care.



4.2.2 Disease-protein/RNA Association Prediction

MicroRNAs (miRNAs) are crucial in the development of human complex diseases. Discovering the associations between miRNAs and diseases is essential for both basic and translational medicine. To address this, Tang et al. [[78]] developed a multi-view multichannel attention GCN (MMGCN) to predict potential miRNA-disease associations. This approach utilized a GCN encoder that took multiple similarity graphs of miRNA and disease as input, fused their neighbor information, and generated their embeddings under different views (i.e., graphs). The multichannel attention mechanism on miRNA and disease prioritized important channel embeddings and produced normalized channel attention features. Additionally, a CNN combiner was used to convolve the multichannel attention features of miRNA and disease, respectively, to generate corresponding representations for association prediction. The MMGCN approach is effective in predicting miRNA-disease associations, which could aid in the development of novel therapies for complex human diseases.



4.2.3 Disease-microbe Association Prediction

Human microbes play a critical role in a wide range of complex diseases and have become a new target in precision medicine. In silico identification of microbe-disease associations can provide insights into understanding the pathogenic mechanism of complex human diseases and facilitate screening candidate targets for drug development. To this end, Long et al. [[79]] proposed a GAT-based framework, GATMDA, for human microbe-disease association prediction. The framework leveraged multiple similarity-based graphs to construct input features for microbes and diseases. GAT with talking heads was employed to learn the representations of microbe and disease nodes. To filter out noise and focus on more important neighbors, a bi-interaction aggregator was utilized to enforce the representation of the aggregation of similar neighbors. Finally, the inductive matrix completion (IMC) was combined to reconstruct a bipartite graph to predict microbe-disease associations. The proposed framework showed promising results in identifying potential microbe-disease associations, highlighting the potential for using GRL to facilitate precision medicine.



5 Challenges and Future Directions

5.1 Bias

Bias refers to the presence of prejudice or favoritism toward an individual or a group based on their inherent or acquired characteristics when making a decision [[80]]. When an algorithm's decisions favor or disfavor a specific group disproportionately, it is said to be biased. GRL methods are widely utilized in the biomedical field, but they can be susceptible to bias. For example, in the case of melanoma detection, ML models are trained using images from fair-skinned populations, primarily from the United States, Europe, and Australia. Consequently, these models may perform poorly in detecting lesions from individuals with different skin colors, indicating inherent model bias [[81]].

The issue of bias in algorithms can be mitigated at three stages of the ML pipeline [[82]]. At the pre-processing stage, bias can be addressed by generating non-discriminatory labeled data and obtaining fair data representations. However, generating non-discriminatory data can be challenging, especially for health records that often include sensitive features such as sexual identity, race, and social determinants of health. At the in-processing stage, algorithms could be modified to avoid bias, such as by changing the sampling strategy and adding regularization terms to the training process. For example, FairWalk [[83]], a graph embedding algorithm derived from node2vec, partitions neighbors into groups based on their sensitive attribute values and gives each group an equal probability of being chosen, thereby removing biases such as gender and race to a large extent. However, addressing multiple cross-attribute biases in networks with richer subgroup fairness still poses a challenge. At the post-processing stage, bias can be addressed by altering the classification threshold to ensure model fairness [[84]]. In the future, developing a sensitive information-oriented framework for GRL could be beneficial. This framework should integrate various modules for different subgroups and incorporate background information, such as data acquisition methods and the creation of training data to address bias in biomedical data.
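The neighbor-partitioning idea described for FairWalk [[83]] can be illustrated with a short sketch. At each step the walker first picks a sensitive-attribute group uniformly at random and then picks a neighbor within that group, so a minority group is not drowned out by its size. This is a simplified illustration under stated assumptions: the toy graph and attribute labels are hypothetical, and the return and in-out bias parameters that node2vec-style walks normally use are omitted.

```python
import random
from collections import defaultdict

def fair_step(graph, sensitive, node, rng):
    """Pick the next node: first a group uniformly, then a neighbor within it."""
    groups = defaultdict(list)
    for neighbor in graph[node]:
        groups[sensitive[neighbor]].append(neighbor)
    chosen_group = rng.choice(sorted(groups))  # every group is equally likely
    return rng.choice(groups[chosen_group])

def fair_walk(graph, sensitive, start, length, seed=0):
    """Generate one walk of at most `length` nodes starting from `start`."""
    rng = random.Random(seed)
    walk = [start]
    while len(walk) < length and graph[walk[-1]]:
        walk.append(fair_step(graph, sensitive, walk[-1], rng))
    return walk

# Hypothetical toy graph: node 0 has three neighbors in group "A", one in "B".
GRAPH = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
SENSITIVE = {0: "A", 1: "A", 2: "A", 3: "A", 4: "B"}

rng = random.Random(42)
samples = [fair_step(GRAPH, SENSITIVE, 0, rng) for _ in range(2000)]
minority_share = samples.count(4) / len(samples)
walk = fair_walk(GRAPH, SENSITIVE, start=0, length=5)
```

With uniform neighbor sampling, node 4 would be visited from node 0 about a quarter of the time; with group-first sampling its share is close to one half. The walks produced this way would then be fed to the same skip-gram training used by other random-walk embedding methods.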



5.2 Interpretability

The lack of interpretability of GRL models poses a challenge in trusting and safely utilizing them in sensitive domains such as healthcare [[85]]. To ensure the transparency and trustworthiness of graph algorithms, these models should offer both accurate predictions and human-intelligible explanations. To this end, several types of GNN explanation methods have been proposed to explain node and graph classification tasks. Here, we summarize three categories: mask-based methods, such as GNNExplainer [[86]], PGExplainer [[87]], and ZORRO [[88]]; perturbation-based methods, such as probabilistic graphical model (PGM)-Explainer [[89]]; and generative model-based methods, such as XGNN [[90]]. Mask-based approaches generate a new graph by combining masks with the original features/edges/nodes, enabling them to capture important information during backpropagation. Perturbation-based methods filter out unimportant features using data sampling, and then fit an explainable small model like a PGM on filtered data for a topological explanation. Generative model-based methods generate small explainable subgraphs in a node-by-node way. For instance, XGNN [[90]] uses a reinforcement learning framework to learn the probability of growing from a node to a subgraph for the explanation.

One major challenge in using explainability methods is determining how to assess their effectiveness. To address this issue, many studies have validated their models on synthetic data and real-world datasets, such as MUTAG [[91]] and MNIST [[92]]. However, these validation datasets are often relatively small and simple, raising concerns about whether explainable graph algorithms generalize to large-scale biomedical graph data. Recently, Agarwal et al. [[93]] proposed an approach for evaluating the explainability of GNNs. The authors developed a synthetic graph data generator, ShapeGGen, that can generate a variety of benchmark datasets together with ground-truth explanations.
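
When ground-truth explanations are available, one simple way to score an explainer is the overlap between the edges it marks as important and the ground-truth edges. The sketch below uses a Jaccard index for this purpose; the function name and the toy edge lists are assumptions for illustration and do not reproduce the benchmark code of [[93]].

```python
def explanation_jaccard(predicted_edges, ground_truth_edges):
    """Jaccard overlap between predicted and ground-truth explanation edges.

    Edges are treated as undirected, so (0, 1) and (1, 0) count as the same edge.
    """
    pred = {frozenset(edge) for edge in predicted_edges}
    truth = {frozenset(edge) for edge in ground_truth_edges}
    if not pred and not truth:
        return 1.0  # two empty explanations agree trivially
    return len(pred & truth) / len(pred | truth)


# Hypothetical example: the ground truth is a triangle motif; the explainer
# recovers two of its edges and adds one spurious edge.
ground_truth = [(0, 1), (1, 2), (2, 0)]
predicted = [(1, 0), (1, 2), (2, 3)]
score = explanation_jaccard(predicted, ground_truth)  # 2 shared / 4 total edges
```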

Furthermore, current approaches address interpretability mainly for classification tasks (i.e., node and graph classification); future efforts could explore novel training strategies to explain other tasks, such as link prediction [[94]]. Additionally, complementing node-based explanations with edge-based explanations would help human experts interpret model outputs [[95]].



5.3 Heterogeneity

Biomedical graph data are often heterogeneous, i.e., they contain diverse types of nodes such as diseases, drugs, and genes. Graph ML tasks have shown that GNNs perform better than traditional methods on diverse graph data. However, recent studies suggest that GNNs such as GCNs may perform worse on heterogeneous graphs than on homogeneous ones [[96]]. To address this issue, methods such as heterogeneous graph transformer (HGT) [[97]] and heterogeneous graph attention network (HAN) [[98]] have been developed. HGT uses meta-relations to parameterize the weight matrices for heterogeneous mutual attention, message passing, and propagation [[97]]. HAN, on the other hand, leverages both node-level and semantic-level attention to simultaneously consider the importance of nodes and meta-paths [[98]]. However, these methods have only been validated on datasets with fewer than five types of nodes, whereas common biomedical KGs tend to be larger, more heterogeneous, and more complex. Therefore, newer graph ML models specifically designed for large-scale heterogeneous biomedical KGs are still needed.
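
Semantic-level attention of the kind used in HAN can be sketched in a few lines: each meta-path yields its own node embeddings, a shared scoring function rates each meta-path by averaging over nodes, and a softmax over those scores weights the fusion. The pure-Python code below is a minimal sketch under assumptions: the meta-path names, the tiny two-dimensional embeddings, and the fixed parameters `W`, `b`, and `q` are hypothetical, and in practice these parameters are learned jointly with node-level attention.

```python
import math


def semantic_attention(metapath_embeddings, W, b, q):
    """Fuse per-meta-path node embeddings with softmax attention weights.

    metapath_embeddings: dict mapping a meta-path name to a list of node
    vectors (same node order and dimension for every meta-path).
    Returns (beta, fused): attention weight per meta-path and fused vectors.
    """
    scores = {}
    for name, Z in metapath_embeddings.items():
        total = 0.0
        for z in Z:
            # hidden = tanh(W z + b), then project onto the attention vector q
            hidden = [
                math.tanh(sum(w * x for w, x in zip(row, z)) + b_i)
                for row, b_i in zip(W, b)
            ]
            total += sum(q_i * h_i for q_i, h_i in zip(q, hidden))
        scores[name] = total / len(Z)  # average importance over all nodes

    # Numerically stable softmax over meta-paths.
    max_score = max(scores.values())
    exp_scores = {k: math.exp(v - max_score) for k, v in scores.items()}
    norm = sum(exp_scores.values())
    beta = {k: v / norm for k, v in exp_scores.items()}

    # Weighted sum of the meta-path-specific embeddings for each node.
    names = list(metapath_embeddings)
    n_nodes = len(metapath_embeddings[names[0]])
    dim = len(metapath_embeddings[names[0]][0])
    fused = [
        [sum(beta[k] * metapath_embeddings[k][i][d] for k in names) for d in range(dim)]
        for i in range(n_nodes)
    ]
    return beta, fused


# Hypothetical embeddings for two drug nodes under two meta-paths.
embeddings = {
    "drug-gene-drug": [[1.0, 1.0], [1.0, 1.0]],
    "drug-disease-drug": [[-1.0, -1.0], [-1.0, -1.0]],
}
W = [[1.0, 0.0], [0.0, 1.0]]  # identity projection, for illustration only
b = [0.0, 0.0]
q = [1.0, 1.0]
beta, fused = semantic_attention(embeddings, W, b, q)
```

The weights in `beta` indicate how much each meta-path contributes to the final embedding, which is also what makes this component useful for interpretation.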



5.4 Availability of High-quality Graph Data

GRL algorithms are computationally intensive and require large amounts of data to train effectively. The quality and quantity of graph data are critical for the performance of GRL algorithms. Incomplete or insufficient graph data can result in inaccurate embeddings, adversely affecting downstream tasks. Additionally, since KGs are usually constructed manually or semi-automatically, the process can be time-consuming, costly, and prone to errors, leading to a scarcity of high-quality graph data. Therefore, improving the quality and quantity of available graph data is crucial to unlock the full potential of GRL in biomedical applications.



6 Conclusions

In this survey, we have highlighted the significant advancements made in the field of GRL in biomedicine. GRL techniques have been extensively utilized to bridge major gaps in healthcare, enabling researchers to unravel complex disease mechanisms, accelerate drug discovery, and enhance personalized disease prediction and management. These breakthroughs are also a result of interdisciplinary collaborations among computer scientists, biologists, and health professionals, and their concerted efforts to integrate knowledge from diverse fields. Looking ahead, we anticipate that the development of more robust, interpretable, and trustworthy GRL algorithms, along with the availability of high-quality graph data, particularly well-curated KGs, will continue to play a critical role in advancing precision medicine. As GRL techniques continue to mature, they hold immense promise for boosting precision medicine by harnessing vast amounts of graph data in a meaningful and interpretable way.



No conflict of interest has been declared by the author(s).

Acknowledgments

The authors were supported by grants from the National Institutes of Health under Award Numbers RF1AG072799, R01AI130460, R56AG074604, and the American Heart Association under Award No.19GPSGC35180031.

References

  • 1 Santos A, Colaço AR, Nielsen AB, Niu L, Strauss M, Geyer PE, et al. A knowledge graph to interpret clinical proteomics data. Nat Biotechnol 2022;40(5):692–702. doi: 10.1038/s41587-021-01145-6.
  • 2 Zhou J, Cui G, Hu S, Zhang Z, Yang C, Liu Z, et al. Graph neural networks: A review of methods and applications. AI Open 2020;1:57–81. doi: 10.1016/j.aiopen.2021.01.001.
  • 3 Yi H-C, You Z-H, Huang D-S, Kwoh CK. Graph representation learning in bioinformatics: trends, methods and applications. Brief Bioinform 2022;23(1):1–16. doi: 10.1093/bib/bbab340.
  • 4 Leser U, Trißl S. Graph Management in the Life Sciences. In: Liu L, Özsu MT, editors. Encyclopedia of Database Systems. Springer US; 2009. p. 1266–71. doi: 10.1007/978-0-387-39940-9_1436.
  • 5 David L, Thakkar A, Mercado R, Engkvist O. Molecular representations in AI-driven drug discovery: a review and practical guide. J Cheminform 2020;12(1):56. doi: 10.1186/s13321-020-00460-5.
  • 6 Weighill D, Guebila MB, Lopes-Ramos C, Glass K, Quackenbush J, Platig J, et al. Gene regulatory network inference as relaxed graph matching. Proc Conf AAAI Artif Intell 2021;35:10263–72. doi: 10.1101/168419.
  • 7 Khan A, Uddin S, Srinivasan U. Comorbidity network for chronic disease: A novel approach to understand type 2 diabetes progression. Int J Med Inform 2018;115:1–9. doi: 10.1016/j.ijmedinf.2018.04.001.
  • 8 Gómez A, Oliveira G. New approaches to epidemic modeling on networks. Sci Rep 2023;13(1):468. doi: 10.1038/s41598-022-19827-9.
  • 9 Britto MT, Fuller SC, Kaplan HC, Kotagal U, Lannon C, Margolis PA, et al. Using a network organisational architecture to support the development of Learning Healthcare Systems. BMJ Qual Saf 2018;27(11):937–46. doi: 10.1136/bmjqs-2017-007219.
  • 10 Chandak P, Huang K, Zitnik M. Building a knowledge graph to enable precision medicine. Sci Data 2023;10(1):67. doi: 10.1038/s41597-023-01960-3.
  • 11 Ji S, Pan S, Cambria E, Marttinen P, Yu PS. A Survey on Knowledge Graphs: Representation, Acquisition, and Applications. IEEE Trans Neural Netw Learn Syst 2022; 33:494–514. doi: 10.1109/tnnls.2021.3070843.
  • 12 Cai L, Lu C, Xu J, Meng Y, Wang P, Fu X, et al. Drug repositioning based on the heterogeneous information fusion graph convolutional network. Brief Bioinform 2021;22(6):1–12. doi: 10.1093/bib/bbab319.
  • 13 Sun Z, Yin H, Chen H, Chen T, Cui L, Yang F. Disease Prediction via Graph Neural Networks. IEEE J Biomed Health Inform 2021;25(3):818–26. doi: 10.1109/JBHI.2020.3004143.
  • 14 Yuan Q, Chen J, Zhao H, Zhou Y, Yang Y. Structure-aware protein–protein interaction site prediction using deep graph convolutional network. Bioinformatics 2021;38(1):125–32. doi: 10.1093/bioinformatics/btab643.
  • 15 Aggarwal R, Sounderajah V, Martin G, Ting DSW, Karthikesalingam A, King D, et al. Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis. NPJ Digit Med 2021; 4(1): 65. doi: 10.1038/s41746-021-00438-z.
  • 16 Askr H, Elgeldawi E, Aboul Ella H, Elshaier YAMM, Gomaa MM, Hassanien AE. Deep learning in drug discovery: an integrative review and future challenges. Artif Intell Rev 2022;1–63. doi: 10.1007/s10462-022-10306-1.
  • 17 Zhang G, Zhang X, Bilal M, Dou W, Xu X, Rodrigues JJPC. Identifying fraud in medical insurance based on blockchain and deep learning. Future Gener Comput Syst 2022;130:140–54. doi: 10.1016/j.future.2021.12.006.
  • 18 Singh PK. Data with Non-Euclidean Geometry and its Characterization. Journal of Artificial Intelligence and Technology 2021;2(1):3–8. doi: 10.37965/jait.2021.12001.
  • 19 Hamilton WL, Ying R, Leskovec J. Representation Learning on Graphs: Methods and Applications. Bull Tech Comm Data Eng 2017;40(3):52–74.
  • 20 Hetzel L, Fischer DS, Günnemann S, Theis FJ. Graph representation learning for single-cell biology. Curr Opin Syst Biol 2021;28:100347. doi: 10.1016/j.coisb.2021.05.008.
  • 21 Pham T-H, Qiu Y, Zeng J, Xie L, Zhang P. A deep learning framework for high-throughput mechanism-driven phenotype compound screening and its application to COVID-19 drug repurposing. Nat Mach Intell 2021;3(3):247–57. doi: 10.1038/s42256-020-00285-9.
  • 22 Li L, Zhou J, Jiang Y, Huang B. Propagation source identification of infectious diseases with graph convolutional networks. J Biomed Inform 2021;116:103720. doi: 10.1016/j.jbi.2021.103720.
  • 23 Zhu W, Razavian N. Variationally regularized graph-based representation learning for electronic health records. Proc ACM Conf Health Inference Learn 2021:1–13. doi: 10.1145/3450439.3451855.
  • 24 Wu L, Cui P, Pei J, Zhao L, Song L. Graph Neural Networks: Foundations, Frontiers, and Applications. Singapore: Springer Nature; 2022. doi: 10.1007/978-981-16-6054-2.
  • 25 Yang Z, Cohen WW, Salakhutdinov R. Revisiting semi-supervised learning with graph embeddings. Proceedings of the 33rd International Conference on Machine Learning 2016;48:40–8.
  • 26 Hamilton WL. Graph Representation Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning. Cham: Springer; 2020. doi: 10.1007/978-3-031-01588-5.
  • 27 Li MM, Huang K, Zitnik M. Graph representation learning in biomedicine and healthcare. Nat Biomed Eng 2022; 6(12):1353-69. doi: 10.1038/s41551-022-00942-x.
  • 28 Perozzi B, Al-Rfou R, Skiena S. DeepWalk: Online Learning of Social Representations. KDD 2014;701–10. doi: 10.1145/2623330.2623732.
  • 29 Grover A, Leskovec J. node2vec: Scalable Feature Learning for Networks. KDD 2016;855–64. doi: 10.1145/2939672.2939754.
  • 30 Ribeiro LFR, Savarese PHP, Figueiredo DR. struc2vec: Learning Node Representations from Structural Identity. KDD 2017;385–94. doi: 10.1145/3097983.3098061.
  • 31 Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q. LINE: Large-scale Information Network Embedding. Proceedings of the 24th International Conference on World Wide Web 2015;1067–77. doi: 10.1145/2736277.2741093.
  • 32 Hamilton WL, Ying R, Leskovec J. Inductive representation learning on large graphs. Proceedings of the 31st International Conference on Neural Information Processing Systems 2017:1025–35. doi: 10.5555/3294771.3294869.
  • 33 Chami I, Abu-El-Haija S, Perozzi B, Ré C, Murphy K. Machine learning on graphs: A model and comprehensive taxonomy. J Mach Learn Res 2022;(89):1-64.
  • 34 Morris C, Ritzert M, Fey M, Hamilton WL, Lenssen JE, Rattan G, et al. Weisfeiler and Leman Go Neural: Higher-Order Graph Neural Networks. Proc AAAI Conf Artif Intell 2019;33(1):4602–9. doi: 10.1609/aaai.v33i01.33014602.
  • 35 Alzubaidi L, Zhang J, Humaidi AJ, Al-Dujaili A, Duan Y, Al-Shamma O, et al. Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. J Big Data 2021;8(1):53. doi: 10.1186/s40537-021-00444-8.
  • 36 Zhao Z-Q, Zheng P, Xu S-T, Wu X. Object Detection With Deep Learning: A Review. IEEE Trans Neural Netw Learn Syst 2019;30(11):3212–32. doi: 10.1109/TNNLS.2018.2876865.
  • 37 Yao G, Lei T, Zhong J. A review of Convolutional-Neural-Network-based action recognition. Pattern Recognit Lett 2019;118:14–22. doi: 10.1016/j.patrec.2018.05.018.
  • 38 Ker J, Wang L, Rao J, Lim T. Deep Learning Applications in Medical Image Analysis. IEEE Access 2018;6:9375–9389. doi: 10.1109/ACCESS.2017.2788044.
  • 39 Shuman DI, Narang SK, Frossard P, Ortega A, Vandergheynst P. The Emerging Field of Signal Processing on Graphs: Extending High-Dimensional Data Analysis to Networks and Other Irregular Domains. IEEE Signal Process Mag 2013;30(3):83-98. doi: 10.1109/MSP.2012.2235192.
  • 40 Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. Proceedings of the 5th International Conference on Learning Representations 2017.
  • 41 Zheng Y, Gao C, Chen L, Jin D, Li Y. DGCN: Diversified Recommendation with Graph Convolutional Networks. Proceedings of the Web Conference 2021;401–12. doi: 10.1145/3442381.3449835.
  • 42 Chiang W-L, Liu X, Si S, Li Y, Bengio S, Hsieh C-J. Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks. KDD 2019;257–66. doi: 10.1145/3292500.3330925.
  • 43 Li Y, Yu R, Shahabi C, Liu Y. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting. Proceedings of the 6th International Conference on Learning Representations 2018.
  • 44 Monti F, Boscaini D, Masci J, Rodolà E, Svoboda J, Bronstein MM. Geometric deep learning on graphs and manifolds using mixture model CNNs. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 2017;5425–34. doi: 10.1109/CVPR.2017.576.
  • 45 Veličković P, Cucurull G, Casanova A, Romero A, Liò P, Bengio Y. Graph Attention Networks. Proceedings of the 6th International Conference on Learning Representations 2018.
  • 46 Tai KS, Socher R, Manning CD. Improved Semantic Representations from Tree-Structured Long Short-Term Memory Networks. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing 2015;1556–66. doi: 10.3115/v1/p15-1150.
  • 47 Li Y, Zemel R, Brockschmidt M, Tarlow D. Gated graph sequence neural networks. Proceedings of the 4th International Conference on Learning Representations 2016.
  • 48 Peng N, Poon H, Quirk C, Toutanova K, Yih W. Cross-Sentence N-ary Relation Extraction with Graph LSTMs. Trans Assoc Comput Linguist 2017;5:101–15. doi: 10.1162/tacl_a_00049.
  • 49 Kingma DP, Welling M. Auto-Encoding Variational Bayes; 2013. doi: 10.48550/arxiv.1312.6114.
  • 50 Kipf TN, Welling M. Variational Graph Auto-Encoders. Proceedings of the NIPS Workshop on Bayesian Deep Learning 2016.
  • 51 Simonovsky M, Komodakis N. GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders. Proceedings of the International Conference on Artificial Neural Networks 2018;11139:412–22. doi: 10.1007/978-3-030-01418-6_41.
  • 52 Jin W, Barzilay R, Jaakkola T. Junction Tree Variational Autoencoder for Molecular Graph Generation. Proceedings of the International Conference on Machine Learning 2018;2323–32.
  • 53 Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial networks. Communications of the ACM 2020;63(11):139–44. doi: 10.1145/3422622.
  • 54 Wang H, Wang J, Wang J, Zhao M, Zhang W, Zhang F, et al. GraphGAN: Graph Representation Learning With Generative Adversarial Nets. Proc AAAI Conf Artif Intell 2018;32(1). doi: 10.1609/aaai.v32i1.11872.
  • 55 De Cao N, Kipf T. MolGAN: An implicit generative model for small molecular graphs. Proceedings of the ICML Workshop on Theoretical Foundations and Applications of Deep Generative Models 2018.
  • 56 Anand N, Huang P. Generative modeling for protein structures. Proceedings of the 32nd Conference on Neural Information Processing Systems 2018.
  • 57 Guo X, Zhao L. A Systematic Survey on Deep Generative Models for Graph Generation. IEEE Trans Pattern Anal Mach Intell 2023;45(5):5370-90. doi: 10.1109/TPAMI.2022.3214832.
  • 58 Park K. A review of computational drug repurposing. Transl Clin Pharmacol 2019;27:59–63. doi: 10.12793/tcp.2019.27.2.59.
  • 59 Su C, Hou Y, Wang F. GNN-based Biomedical Knowledge Graph Mining in Drug Development. Graph Neural Networks: Foundations, Frontiers, and Applications 2022;517–40. doi: 10.1007/978-981-16-6054-2_24.
  • 60 Santos R, Ursu O, Gaulton A, Bento AP, Donadi RS, Bologa CG, et al. A comprehensive map of molecular drug targets. Nat Rev Drug Discov 2017;16:19–34. doi: 10.1038/nrd.2016.230.
  • 61 Peng J, Cao H, Guan J, Li JJ, Han R, Hao J, et al. An end-to-end heterogeneous graph representation learning-based framework for drug–target interaction prediction. Brief Bioinform 2021;22(5):1–9. doi: 10.1093/bib/bbaa430.
  • 62 Li Y, Qiao G, Wang K, Wang G. Drug–target interaction predication via multi-channel graph neural networks. Brief Bioinform 2021;23(1):bbab346. doi: 10.1093/bib/bbab346.
  • 63 Xuan P, Fan M, Cui H, Zhang T, Nakaguchi T. GVDTI: graph convolutional and variational autoencoders with attribute-level attention for drug–protein interaction prediction. Brief Bioinform 2021;23(1):bbab453. doi: 10.1093/bib/bbab453.
  • 64 Hsieh K, Wang Y, Chen L, Zhao Z, Savitz S, Jiang X, et al. Drug repurposing for COVID-19 using graph neural network and harmonizing multiple evidence. Sci Rep 2021;11:23179. doi: 10.1038/s41598-021-02353-5.
  • 65 Ding Y, Jiang X, Kim Y. Relational graph convolutional networks for predicting blood–brain barrier penetration of drug molecules. Bioinformatics 2022;38(10):2826–31. doi: 10.1093/bioinformatics/btac211.
  • 66 Yang J, Li Z, Wu WKK, Yu S, Xu Z, Chu Q, et al. Deep learning identifies explainable reasoning paths of mechanism of action for drug repurposing from multilayer biological network. Brief Bioinform 2022;23(6):bbac469. doi: 10.1093/bib/bbac469.
  • 67 Nian Y, Hu X, Zhang R, Feng J, Du J, Li F, et al. Mining on Alzheimer's diseases related knowledge graph to identity potential AD-related semantic triples for drug repurposing. BMC Bioinformatics 2022;23:407. doi: 10.1186/s12859-022-04934-1.
  • 68 Bordes A, Usunier N, Garcia-Duran A, Weston J, Yakhnenko O. Translating Embeddings for Modeling multi-relational Data. Proceedings of the 26th Conference on Neural Information Processing Systems 2013.
  • 69 Yang B, Yih W, He X, Gao J, Deng L. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. Proceedings of the 3rd International Conference on Learning Representations 2015.
  • 70 Trouillon T, Welbl J, Riedel S, Gaussier E, Bouchard G. Complex Embeddings for Simple Link Prediction. Proceedings of the 33rd International Conference on Machine Learning 2016;48:2071–80.
  • 71 Drug Synergism [Internet]. clinicalinfo.hiv.gov. Available from: https://clinicalinfo.hiv.gov/en/glossary/drug-synergism.
  • 72 Dai Y, Guo C, Guo W, Eickhoff C. Drug–drug interaction prediction with Wasserstein Adversarial Autoencoder-based knowledge graph embeddings. Brief Bioinform 2020;22(4):bbaa256. doi: 10.1093/bib/bbaa256.
  • 73 Wang J, Liu X, Shen S, Deng L, Liu H. DeepDDS: deep graph neural network with attention mechanism to predict synergistic drug combinations. Brief Bioinform 2022;23(1):bbab390. doi: 10.1093/bib/bbab390.
  • 74 Yang J, Xu Z, Wu WKK, Chu Q, Zhang Q. GraphSynergy: a network-inspired deep learning model for anticancer drug combination prediction. J Am Med Inform Assoc 2021;28(11):2336–45. doi: 10.1093/jamia/ocab162.
  • 75 Bang S, Ho Jhee J, Shin H. Polypharmacy Side Effect Prediction with Enhanced Interpretability Based on Graph Feature Attention Network. Bioinformatics 2021;37(18):2955-62. doi: 10.1093/bioinformatics/btab174.
  • 76 Tong C, Rocheteau E, Veličković P, Lane N, Liò P. Predicting patient outcomes with graph representation learning. AI for Disease Surveillance and Pandemic Intelligence: Intelligent Disease Detection in Action. Springer International Publishing 2022;281–93. doi: 10.1007/978-3-030-93080-6_20.
  • 77 Xia F, Li B, Weng Y, He S, Liu K, Sun B, et al. MedConQA: Medical Conversational Question Answering System Based on Knowledge Graphs. Proceedings of the Conference on Empirical Methods in Natural Language Processing: System Demonstrations 2022;148–58.
  • 78 Tang X, Luo J, Shen C, Lai Z. Multi-view multichannel attention graph convolutional network for miRNA–disease association prediction. Brief Bioinform 2021;22(6):bbab174. doi: 10.1093/bib/bbab174.
  • 79 Long Y, Luo J, Zhang Y, Xia Y. Predicting human microbe–disease associations via graph attention networks with inductive matrix completion. Brief Bioinform 2020;22(3):bbaa146. doi: 10.1093/bib/bbaa146.
  • 80 Mehrabi N, Morstatter F, Saxena N, Lerman K, Galstyan A. A survey on bias and fairness in machine learning. ACM Comput Surv 2021;54(6):1-35. doi: 10.1145/3457607.
  • 81 Adamson AS, Smith A. Machine Learning and Health Care Disparities in Dermatology. JAMA Dermatol 2018;154:1247–8. doi: 10.1001/jamadermatol.2018.2348.
  • 82 Dai E, Wang S. Say No to the discrimination: Learning Fair Graph Neural Networks with Limited Sensitive Attribute Information. Proceedings of the 14th ACM International Conference on Web Search and Data Mining 2021;680–8. doi: 10.1145/3437963.3441752.
  • 83 Rahman T, Surma B, Backes M, Zhang Y. Fairwalk: Towards Fair Graph Embedding. Proceedings of the 28th International Joint Conference on Artificial Intelligence 2019;3289–95.
  • 84 Lohia PK, Ramamurthy N, Bhide M, Saha D, Varshney KR, Puri R. Bias Mitigation post-processing for Individual and Group Fairness. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing 2019;2847–51. doi: 10.1109/ICASSP.2019.8682620.
  • 85 Masoomi A, Hill D, Xu Z, Hersh CP, Silverman EK, Castaldi PJ, et al. Explanations of black-box models based on directional feature interactions. Proceedings of the 10th International Conference on Learning Representations 2022.
  • 86 Ying Z, Bourgeois D, You J, Zitnik M, Leskovec J. GNNExplainer: Generating Explanations for Graph Neural Networks. Proceedings of the 33rd Conference on Neural Information Processing Systems 2019:9244–55. doi: 10.5555/3454287.3455116.
  • 87 Luo D, Cheng W, Xu D, Yu W, Zong B, Chen H, et al. Parameterized Explainer for Graph Neural Network. Proceedings of the 34th Conference on Neural Information Processing Systems 2020;19620–31. doi: 10.1145/3447548.3467283.
  • 88 Funke T, Khosla M, Anand A. Hard Masking for Explaining Graph Neural Networks [Internet]. OpenReview.net 2021. Available from: https://openreview.net/forum?id=uDN8pRAdsoC.
  • 89 Vu M, Thai MT. PGM-Explainer: Probabilistic graphical model explanations for graph neural networks. Proceedings of the 34th Conference on Neural Information Processing Systems 2020;12225–35. doi: 10.5555/3495724.3496749.
  • 90 Yuan H, Tang J, Hu X, Ji S. XGNN: Towards model-level Explanations of Graph Neural Networks. KDD 2020;430–8. doi: 10.1145/3394486.3403085.
  • 91 Debnath AK, Lopez de Compadre RL, Debnath G, Shusterman AJ, Hansch C. Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity. J Med Chem 1991;34:786–97. doi: 10.1021/jm00106a046.
  • 92 Deng L. The MNIST Database of Handwritten Digit Images for Machine Learning Research [Best of the Web]. IEEE Signal Process Mag 2012;29:141–2. doi: 10.1109/MSP.2012.2211477.
  • 93 Agarwal C, Queen O, Lakkaraju H, Zitnik M. Evaluating explainability for graph neural networks. Sci Data 2023;10:144. doi: 10.1038/s41597-023-01974-x.
  • 94 Hu W, Cao K, Huang K, Huang EW, Subbian K, Leskovec J. TuneUp: A Training Strategy for Improving Generalization of Graph Neural Networks; 2022. doi: 10.48550/arxiv.2210.14843.
  • 95 Wang Q, Huang K, Chandak P, Zitnik M, Gehlenborg N. Extending the Nested Model for User-Centric XAI: A Design Study on GNN-based Drug Repurposing. IEEE Trans Vis Comput Graph 2022;29(1):1266-76. doi: 10.1109/TVCG.2022.3209435.
  • 96 Yan Y, Hashemi M, Swersky K, Yang Y, Koutra D. Two Sides of the Same Coin: Heterophily and Oversmoothing in Graph Convolutional Neural Networks. Proceedings of the IEEE International Conference on Data Mining 2022;1287-92. doi: 10.1109/icdm54844.2022.00169.
  • 97 Hu Z, Dong Y, Wang K, Sun Y. Heterogeneous Graph Transformer. Proceedings of the Web Conference 2020;2704–10. doi: 10.1145/3366423.3380027.
  • 98 Wang X, Ji H, Shi C, Wang B, Ye Y, Cui P, et al. Heterogeneous graph attention network. Proceedings of the World Wide Web Conference 2019;2022–32. doi: 10.1145/3308558.3313562.
  • 99 Frasca F, Rossi E, Eynard D, Chamberlain B, Bronstein M, Monti F. SIGN: Scalable inception graph neural networks; 2020. doi: 10.48550/arXiv.2004.11198.
  • 100 Cui G, Zhou J, Yang C, Liu Z. Adaptive Graph Encoder for Attributed Graph Embedding. KDD 2020;976–85. doi: 10.1145/3394486.3403140.
  • 101 Zeng H, Zhou H, Srivastava A, Kannan R, Prasanna V. GraphSAINT: Graph Sampling Based Inductive Learning Method. Proceedings of the 8th International Conference on Learning Representations 2020.
  • 102 Wang J, Ma A, Chang Y, Gong J, Jiang Y, Qi R, et al. scGNN is a novel graph neural network framework for single-cell RNA-Seq analyses. Nat Commun 2021;12:1882. doi: 10.1038/s41467-021-22197-x.
  • 103 Fu X, Zhang J, Meng Z, King I. MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding. Proceedings of the Web Conference 2020;2331–41. doi: 10.1145/3366423.3380297.
  • 104 Hu Z, Dong Y, Wang K, Sun Y. Heterogeneous Graph Transformer. Proceedings of the Web Conference 2020;2704–10. doi: 10.1145/3366423.3380027.
  • 105 Zhu Y, Xu Y, Yu F, Liu Q, Wu S, Wang L. Graph Contrastive Learning with Adaptive Augmentation. Proceedings of the Web Conference 2021;2069–80. doi: 10.1145/3442381.3449802.
  • 106 Shi C, Xu M, Zhu Z, Zhang W, Zhang M, Tang J. GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation. Proceedings of the 8th International Conference on Learning Representations 2020.
  • 107 Du Y, Guo X, Cao H, Ye Y, Zhao L. Disentangled spatiotemporal graph generative models. Proceedings of the AAAI Conference on Artificial Intelligence 2022;36(6):6541–9. doi: 10.1609/aaai.v36i6.20607.

Correspondence to:

Dr. Cui Tao
McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston
7000 Fannin, Suite 600, Houston, TX, 77030
USA   
Phone: +1 713 500 3981   
Fax: +1 713 500 3929   

Publication History

Article published online:
26 December 2023

© 2023. IMIA and Thieme. This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany

  • References

  • 1 Santos A, Colaço AR, Nielsen AB, Niu L, Strauss M, Geyer PE, et al. A knowledge graph to interpret clinical proteomics data. Nat Biotechnol 2022;40(5):692–702. doi: 10.1038/s41587-021-01145-6.
  • 2 Zhou J, Cui G, Hu S, Zhang Z, Yang C, Liu Z, et al. Graph neural networks: A review of methods and applications. AI Open 2020;1:57–81. doi: 10.1016/j.aiopen.2021.01.001.
  • 3 Yi H-C, You Z-H, Huang D-S, Kwoh CK. Graph representation learning in bioinformatics: trends, methods and applications. Brief Bioinform 2022;23(1):1–16. doi: 10.1093/bib/bbab340.
  • 4 Leser U, Triβl S. Graph Management in the Life Sciences. In: Liu L, Özsu MT, editors. Encyclopedia of Database Systems. Springer US; 2009. p. 1266–71. doi: 10.1007/978-0-387-39940-9_1436.
  • 5 David L, Thakkar A, Mercado R, Engkvist O. Molecular representations in AI-driven drug discovery: a review and practical guide. J Cheminform 2020;12(1):56. doi: 10.1186/s13321-020-00460-5.
  • 6 Weighill D, Guebila MB, Lopes-Ramos C, Glass K, Quackenbush J, Platig J, et al. Gene regulatory network inference as relaxed graph matching. Proc Conf AAAI Artif Intell 2021;35:10263–72. doi: 10.1101/168419.
  • 7 Khan A, Uddin S, Srinivasan U. Comorbidity network for chronic disease: A novel approach to understand type 2 diabetes progression. Int J Med Inform 2018;115:1–9. doi: 10.1016/j.ijmedinf.2018.04.001.
  • 8 Gómez A, Oliveira G. New approaches to epidemic modeling on networks. Sci Rep 2023;13(1):468. doi: 10.1038/s41598-022-19827-9.
  • 9 Britto MT, Fuller SC, Kaplan HC, Kotagal U, Lannon C, Margolis PA, et al. Using a network organisational architecture to support the development of Learning Healthcare Systems. BMJ Qual Saf 2018;27(11):937–46. doi: 10.1136/bmjqs-2017-007219.
  • 10 Chandak P, Huang K, Zitnik M. Building a knowledge graph to enable precision medicine. Sci Data 2023;10(1):67. doi: 10.1038/s41597-023-01960-3.
  • 11 Ji S, Pan S, Cambria E, Marttinen P, Yu PS. A Survey on Knowledge Graphs: Representation, Acquisition, and Applications. IEEE Trans Neural Netw Learn Syst 2022; 33:494–514. doi: 10.1109/tnnls.2021.3070843.
  • 12 Cai L, Lu C, Xu J, Meng Y, Wang P, Fu X, et al. Drug repositioning based on the heterogeneous information fusion graph convolutional network. Brief Bioinform 2021;22(6):1–12. doi: 10.1093/bib/bbab319.
  • 13 Sun Z, Yin H, Chen H, Chen T, Cui L, Yang F. Disease Prediction via Graph Neural Networks. IEEE J Biomed Health Inform 2021;25(3):818–26. doi: 10.1109/JBHI.2020.3004143.
  • 14 Yuan Q, Chen J, Zhao H, Zhou Y, Yang Y. Structure-aware protein–protein interaction site prediction using deep graph convolutional network. Bioinformatics 2021;38(1):125–32. doi: 10.1093/bioinformatics/btab643.
  • 15 Aggarwal R, Sounderajah V, Martin G, Ting DSW, Karthikesalingam A, King D, et al. Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis. NPJ Digit Med 2021; 4(1): 65. doi: 10.1038/s41746-021-00438-z.
  • 16 Askr H, Elgeldawi E, Aboul Ella H, Elshaier YAMM, Gomaa MM, Hassanien AE. Deep learning in drug discovery: an integrative review and future challenges. Artif Intell Rev 2022;1–63. doi: 10.1007/s10462-022-10306-1.
  • 17 Zhang G, Zhang X, Bilal M, Dou W, Xu X, Rodrigues JJPC. Identifying fraud in medical insurance based on blockchain and deep learning. Future Gener Comput Syst 2022;130:140–54. doi: 10.1016/j.future.2021.12.006.
  • 18 Singh PK. Data with Non-Euclidean Geometry and its Characterization. Journal of Artificial Intelligence and Technology 2021;2(1):3–8. doi: 10.37965/jait.2021.12001.
  • 19 Hamilton WL, Ying R, Leskovec J. Representation Learning on Graphs: Methods and Applications. Bull Tech Comm Data Eng 2017;40(3):52–74.
  • 20 Hetzel L, Fischer DS, Günnemann S, Theis FJ. Graph representation learning for single-cell biology. Curr Opin Syst Biol 2021;28:100347. doi: 10.1016/j.coisb.2021.05.008.
  • 21 Pham T-H, Qiu Y, Zeng J, Xie L, Zhang P. A deep learning framework for high-throughput mechanism-driven phenotype compound screening and its application to COVID-19 drug repurposing. Nat Mach Intell 2021;3(3):247–57. doi: 10.1038/s42256-020-00285-9.
  • 22 Li L, Zhou J, Jiang Y, Huang B. Propagation source identification of infectious diseases with graph convolutional networks. J Biomed Inform 2021;116:103720. doi: 10.1016/j.jbi.2021.103720.
  • 23 Zhu W, Razavian N. Variationally regularized graph-based representation learning for electronic health records. Proc ACM Conf Health Inference Learn 2021:1–13. doi: 10.1145/3450439.3451855.
  • 24 Wu L, Cui P, Pei J, Zhao L, Song L. Graph Neural Networks: Foundations, Frontiers, and Applications. Singarpore: Springer Nature; 2022. doi: 10.1007/978-981-16-6054-2.
  • 25 Yang Z, Cohen WW, Salakhutdinov R. Revisiting semi-supervised learning with graph embeddings. Proceedings of the 33rd International Conference on International Conference on Machine Learning 2016;48:40–8.
  • 26 Hamilton WL. Graph Representation Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning. Cham: Springer; 2020. doi: 10.1007/978-3-031-01588-5.
  • 27 Li MM, Huang K, Zitnik M. Graph representation learning in biomedicine and healthcare. Nat Biomed Eng 2022; 6(12):1353-69. doi: 10.1038/s41551-022-00942-x.
  • 28 Perozzi B, Al-Rfou R, Skiena S. DeepWalk: Online Learning of Social Representations. KDD 2014;701–10. doi: 10.1145/2623330.2623732.
  • 29 Grover A, Leskovec J. node2vec: Scalable Feature Learning for Networks. KDD 2016;855–64. doi: 10.1145/2939672.2939754.
  • 30 Ribeiro LFR, Savarese PHP, Figueiredo DR. struc2vec: Learning Node Representations from Structural Identity. KDD 2017;385–94. doi: 10.1145/3097983.3098061.
  • 31 Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q. LINE: Large-scale Information Network Embedding. Proceedings of the 24th International Conference on World Wide Web 2015;1067–77. doi: 10.1145/2736277.2741093.
  • 32 Hamilton WL, Ying R, Leskovec J. Inductive representation learning on large graphs. Proceedings of the 31st International Conference on Neural Information Processing Systems 2017:1025–35. doi: 10.5555/3294771.3294869.
  • 33 Chami I, Abu-El-Haija S, Perozzi B, Ré C, Murphy K. Machine learning on graphs: A model and comprehensive taxonomy. J Mach Learn Res 2022;23(89):1–64.
  • 34 Morris C, Ritzert M, Fey M, Hamilton WL, Lenssen JE, Rattan G, et al. Weisfeiler and Leman Go Neural: Higher-Order Graph Neural Networks. Proc AAAI Conf Artif Intell 2019;33(1):4602–9. doi: 10.1609/aaai.v33i01.33014602.
  • 35 Alzubaidi L, Zhang J, Humaidi AJ, Al-Dujaili A, Duan Y, Al-Shamma O, et al. Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. J Big Data 2021;8(1):53. doi: 10.1186/s40537-021-00444-8.
  • 36 Zhao Z-Q, Zheng P, Xu S-T, Wu X. Object Detection With Deep Learning: A Review. IEEE Trans Neural Netw Learn Syst 2019;30(11):3212–32. doi: 10.1109/TNNLS.2018.2876865.
  • 37 Yao G, Lei T, Zhong J. A review of Convolutional-Neural-Network-based action recognition. Pattern Recognit Lett 2019;118:14–22. doi: 10.1016/j.patrec.2018.05.018.
  • 38 Ker J, Wang L, Rao J, Lim T. Deep Learning Applications in Medical Image Analysis. IEEE Access 2018;6:9375–89. doi: 10.1109/ACCESS.2017.2788044.
  • 39 Shuman DI, Narang SK, Frossard P, Ortega A, Vandergheynst P. The Emerging Field of Signal Processing on Graphs: Extending High-Dimensional Data Analysis to Networks and Other Irregular Domains. IEEE Signal Process Mag 2013;30(3):83–98. doi: 10.1109/MSP.2012.2235192.
  • 40 Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. Proceedings of the 5th International Conference on Learning Representations 2017.
  • 41 Zheng Y, Gao C, Chen L, Jin D, Li Y. DGCN: Diversified Recommendation with Graph Convolutional Networks. Proceedings of the Web Conference 2021;401–12. doi: 10.1145/3442381.3449835.
  • 42 Chiang W-L, Liu X, Si S, Li Y, Bengio S, Hsieh C-J. Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks. KDD 2019;257–66. doi: 10.1145/3292500.3330925.
  • 43 Li Y, Yu R, Shahabi C, Liu Y. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting. Proceedings of the 6th International Conference on Learning Representations 2018.
  • 44 Monti F, Boscaini D, Masci J, Rodolà E, Svoboda J, Bronstein MM. Geometric deep learning on graphs and manifolds using mixture model CNNs. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 2017;5425–34. doi: 10.1109/CVPR.2017.576.
  • 45 Veličković P, Cucurull G, Casanova A, Romero A, Liò P, Bengio Y. Graph Attention Networks. Proceedings of the 6th International Conference on Learning Representations 2018.
  • 46 Tai KS, Socher R, Manning CD. Improved Semantic Representations from Tree-Structured Long Short-Term Memory Networks. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing 2015;1556–66. doi: 10.3115/v1/p15-1150.
  • 47 Li Y, Zemel R, Brockschmidt M, Tarlow D. Gated graph sequence neural networks. Proceedings of the 4th International Conference on Learning Representations 2016.
  • 48 Peng N, Poon H, Quirk C, Toutanova K, Yih W. Cross-Sentence N-ary Relation Extraction with Graph LSTMs. Trans Assoc Comput Linguist 2017;5:101–15. doi: 10.1162/tacl_a_00049.
  • 49 Kingma DP, Welling M. Auto-Encoding Variational Bayes; 2013. doi: 10.48550/arxiv.1312.6114.
  • 50 Kipf TN, Welling M. Variational Graph Auto-Encoders. Proceedings of the NIPS Workshop on Bayesian Deep Learning 2016.
  • 51 Simonovsky M, Komodakis N. GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders. Proceedings of the International Conference on Artificial Neural Networks 2018;11139:412–22. doi: 10.1007/978-3-030-01418-6_41.
  • 52 Jin W, Barzilay R, Jaakkola T. Junction Tree Variational Autoencoder for Molecular Graph Generation. Proceedings of the International Conference on Machine Learning 2018;2323–32.
  • 53 Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial networks. Communications of the ACM 2020;63(11):139–44. doi: 10.1145/3422622.
  • 54 Wang H, Wang J, Wang J, Zhao M, Zhang W, Zhang F, et al. GraphGAN: Graph Representation Learning With Generative Adversarial Nets. Proc AAAI Conf Artif Intell 2018;32(1). doi: 10.1609/aaai.v32i1.11872.
  • 55 De Cao N, Kipf T. MolGAN: An implicit generative model for small molecular graphs. Proceedings of the ICML Workshop on Theoretical Foundations and Applications of Deep Generative Models 2018.
  • 56 Anand N, Huang P. Generative modeling for protein structures. Proceedings of the 32nd Conference on Neural Information Processing Systems 2018.
  • 57 Guo X, Zhao L. A Systematic Survey on Deep Generative Models for Graph Generation. IEEE Trans Pattern Anal Mach Intell 2023;45(5):5370–90. doi: 10.1109/TPAMI.2022.3214832.
  • 58 Park K. A review of computational drug repurposing. Transl Clin Pharmacol 2019;27:59–63. doi: 10.12793/tcp.2019.27.2.59.
  • 59 Su C, Hou Y, Wang F. GNN-based Biomedical Knowledge Graph Mining in Drug Development. Graph Neural Networks: Foundations, Frontiers, and Applications 2022;517–40. doi: 10.1007/978-981-16-6054-2_24.
  • 60 Santos R, Ursu O, Gaulton A, Bento AP, Donadi RS, Bologa CG, et al. A comprehensive map of molecular drug targets. Nat Rev Drug Discov 2017;16:19–34. doi: 10.1038/nrd.2016.230.
  • 61 Peng J, Cao H, Guan J, Li JJ, Han R, Hao J, et al. An end-to-end heterogeneous graph representation learning-based framework for drug–target interaction prediction. Brief Bioinform 2021;22(5):1–9. doi: 10.1093/bib/bbaa430.
  • 62 Li Y, Qiao G, Wang K, Wang G. Drug–target interaction predication via multi-channel graph neural networks. Brief Bioinform 2021;23(1):bbab346. doi: 10.1093/bib/bbab346.
  • 63 Xuan P, Fan M, Cui H, Zhang T, Nakaguchi T. GVDTI: graph convolutional and variational autoencoders with attribute-level attention for drug–protein interaction prediction. Brief Bioinform 2021;23(1):bbab453. doi: 10.1093/bib/bbab453.
  • 64 Hsieh K, Wang Y, Chen L, Zhao Z, Savitz S, Jiang X, et al. Drug repurposing for COVID-19 using graph neural network and harmonizing multiple evidence. Sci Rep 2021;11:23179. doi: 10.1038/s41598-021-02353-5.
  • 65 Ding Y, Jiang X, Kim Y. Relational graph convolutional networks for predicting blood–brain barrier penetration of drug molecules. Bioinformatics 2022;38(10):2826–31. doi: 10.1093/bioinformatics/btac211.
  • 66 Yang J, Li Z, Wu WKK, Yu S, Xu Z, Chu Q, et al. Deep learning identifies explainable reasoning paths of mechanism of action for drug repurposing from multilayer biological network. Brief Bioinform 2022;23(6):bbac469. doi: 10.1093/bib/bbac469.
  • 67 Nian Y, Hu X, Zhang R, Feng J, Du J, Li F, et al. Mining on Alzheimer's diseases related knowledge graph to identify potential AD-related semantic triples for drug repurposing. BMC Bioinformatics 2022;23:407. doi: 10.1186/s12859-022-04934-1.
  • 68 Bordes A, Usunier N, Garcia-Duran A, Weston J, Yakhnenko O. Translating Embeddings for Modeling multi-relational Data. Proceedings of the 26th Conference on Neural Information Processing Systems 2013.
  • 69 Yang B, Yih W, He X, Gao J, Deng L. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. Proceedings of the 3rd International Conference on Learning Representations 2015.
  • 70 Trouillon T, Welbl J, Riedel S, Gaussier E, Bouchard G. Complex Embeddings for Simple Link Prediction. Proceedings of the 33rd International Conference on Machine Learning 2016;48:2071–80.
  • 71 Drug Synergism [Internet]. clinicalinfo.hiv.gov. Available from: https://clinicalinfo.hiv.gov/en/glossary/drug-synergism.
  • 72 Dai Y, Guo C, Guo W, Eickhoff C. Drug–drug interaction prediction with Wasserstein Adversarial Autoencoder-based knowledge graph embeddings. Brief Bioinform 2020;22(4):bbaa256. doi: 10.1093/bib/bbaa256.
  • 73 Wang J, Liu X, Shen S, Deng L, Liu H. DeepDDS: deep graph neural network with attention mechanism to predict synergistic drug combinations. Brief Bioinform 2022;23(1):bbab390. doi: 10.1093/bib/bbab390.
  • 74 Yang J, Xu Z, Wu WKK, Chu Q, Zhang Q. GraphSynergy: a network-inspired deep learning model for anticancer drug combination prediction. J Am Med Inform Assoc 2021;28(11):2336–45. doi: 10.1093/jamia/ocab162.
  • 75 Bang S, Ho Jhee J, Shin H. Polypharmacy Side Effect Prediction with Enhanced Interpretability Based on Graph Feature Attention Network. Bioinformatics 2021;37(18):2955–62. doi: 10.1093/bioinformatics/btab174.
  • 76 Tong C, Rocheteau E, Veličković P, Lane N, Liò P. Predicting patient outcomes with graph representation learning. AI for Disease Surveillance and Pandemic Intelligence: Intelligent Disease Detection in Action. Springer International Publishing 2022;281–93. doi: 10.1007/978-3-030-93080-6_20.
  • 77 Xia F, Li B, Weng Y, He S, Liu K, Sun B, et al. MedConQA: Medical Conversational Question Answering System Based on Knowledge Graphs. Proceedings of the Conference on Empirical Methods in Natural Language Processing: System Demonstrations 2022;148–58.
  • 78 Tang X, Luo J, Shen C, Lai Z. Multi-view multichannel attention graph convolutional network for miRNA–disease association prediction. Brief Bioinform 2021;22(6):bbab174. doi: 10.1093/bib/bbab174.
  • 79 Long Y, Luo J, Zhang Y, Xia Y. Predicting human microbe–disease associations via graph attention networks with inductive matrix completion. Brief Bioinform 2020;22(3):bbaa146. doi: 10.1093/bib/bbaa146.
  • 80 Mehrabi N, Morstatter F, Saxena N, Lerman K, Galstyan A. A survey on bias and fairness in machine learning. ACM Comput Surv 2021;54(6):1–35. doi: 10.1145/3457607.
  • 81 Adamson AS, Smith A. Machine Learning and Health Care Disparities in Dermatology. JAMA Dermatol 2018;154:1247–8. doi: 10.1001/jamadermatol.2018.2348.
  • 82 Dai E, Wang S. Say No to the discrimination: Learning Fair Graph Neural Networks with Limited Sensitive Attribute Information. Proceedings of the 14th ACM International Conference on Web Search and Data Mining 2021;680–8. doi: 10.1145/3437963.3441752.
  • 83 Rahman T, Surma B, Backes M, Zhang Y. Fairwalk: Towards Fair Graph Embedding. Proceedings of the 28th International Joint Conference on Artificial Intelligence 2019;3289–95.
  • 84 Lohia PK, Ramamurthy N, Bhide M, Saha D, Varshney KR, Puri R. Bias Mitigation post-processing for Individual and Group Fairness. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing 2019;2847–51. doi: 10.1109/ICASSP.2019.8682620.
  • 85 Masoomi A, Hill D, Xu Z, Hersh CP, Silverman EK, Castaldi PJ, et al. Explanations of black-box models based on directional feature interactions. Proceedings of the 10th International Conference on Learning Representations 2022.
  • 86 Ying Z, Bourgeois D, You J, Zitnik M, Leskovec J. GNNExplainer: Generating Explanations for Graph Neural Networks. Proceedings of the 33rd Conference on Neural Information Processing Systems 2019:9244–55. doi: 10.5555/3454287.3455116.
  • 87 Luo D, Cheng W, Xu D, Yu W, Zong B, Chen H, et al. Parameterized Explainer for Graph Neural Network. Proceedings of the 34th Conference on Neural Information Processing Systems 2020;19620–31. doi: 10.1145/3447548.3467283.
  • 88 Funke T, Khosla M, Anand A. Hard Masking for Explaining Graph Neural Networks [Internet]. OpenReview.net 2021. Available from: https://openreview.net/forum?id=uDN8pRAdsoC.
  • 89 Vu M, Thai MT. PGM-Explainer: Probabilistic graphical model explanations for graph neural networks. Proceedings of the 34th Conference on Neural Information Processing Systems 2020;12225–35. doi: 10.5555/3495724.3496749.
  • 90 Yuan H, Tang J, Hu X, Ji S. XGNN: Towards model-level Explanations of Graph Neural Networks. KDD 2020;430–8. doi: 10.1145/3394486.3403085.
  • 91 Debnath AK, Lopez de Compadre RL, Debnath G, Shusterman AJ, Hansch C. Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity. J Med Chem 1991;34:786–97. doi: 10.1021/jm00106a046.
  • 92 Deng L. The MNIST Database of Handwritten Digit Images for Machine Learning Research [Best of the Web]. IEEE Signal Process Mag 2012;29:141–2. doi: 10.1109/MSP.2012.2211477.
  • 93 Agarwal C, Queen O, Lakkaraju H, Zitnik M. Evaluating explainability for graph neural networks. Sci Data 2023;10:144. doi: 10.1038/s41597-023-01974-x.
  • 94 Hu W, Cao K, Huang K, Huang EW, Subbian K, Leskovec J. TuneUp: A Training Strategy for Improving Generalization of Graph Neural Networks; 2022. doi: 10.48550/arxiv.2210.14843.
  • 95 Wang Q, Huang K, Chandak P, Zitnik M, Gehlenborg N. Extending the Nested Model for User-Centric XAI: A Design Study on GNN-based Drug Repurposing. IEEE Trans Vis Comput Graph 2022;29(1):1266–76. doi: 10.1109/TVCG.2022.3209435.
  • 96 Yan Y, Hashemi M, Swersky K, Yang Y, Koutra D. Two Sides of the Same Coin: Heterophily and Oversmoothing in Graph Convolutional Neural Networks. Proceedings of the IEEE International Conference on Data Mining 2022;1287–92. doi: 10.1109/icdm54844.2022.00169.
  • 97 Hu Z, Dong Y, Wang K, Sun Y. Heterogeneous Graph Transformer. Proceedings of the Web Conference 2020;2704–10. doi: 10.1145/3366423.3380027.
  • 98 Wang X, Ji H, Shi C, Wang B, Ye Y, Cui P, et al. Heterogeneous graph attention network. Proceedings of the World Wide Web Conference 2019;2022–32. doi: 10.1145/3308558.3313562.
  • 99 Frasca F, Rossi E, Eynard D, Chamberlain B, Bronstein M, Monti F. SIGN: Scalable inception graph neural networks; 2020. doi: 10.48550/arXiv.2004.11198.
  • 100 Cui G, Zhou J, Yang C, Liu Z. Adaptive Graph Encoder for Attributed Graph Embedding. KDD 2020;976–85. doi: 10.1145/3394486.3403140.
  • 101 Zeng H, Zhou H, Srivastava A, Kannan R, Prasanna V. GraphSAINT: Graph Sampling Based Inductive Learning Method. Proceedings of the 8th International Conference on Learning Representations 2020.
  • 102 Wang J, Ma A, Chang Y, Gong J, Jiang Y, Qi R, et al. scGNN is a novel graph neural network framework for single-cell RNA-Seq analyses. Nat Commun 2021;12:1882. doi: 10.1038/s41467-021-22197-x.
  • 103 Fu X, Zhang J, Meng Z, King I. MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding. Proceedings of the Web Conference 2020;2331–41. doi: 10.1145/3366423.3380297.
  • 104 Hu Z, Dong Y, Wang K, Sun Y. Heterogeneous Graph Transformer. Proceedings of the Web Conference 2020;2704–10. doi: 10.1145/3366423.3380027.
  • 105 Zhu Y, Xu Y, Yu F, Liu Q, Wu S, Wang L. Graph Contrastive Learning with Adaptive Augmentation. Proceedings of the Web Conference 2021;2069–80. doi: 10.1145/3442381.3449802.
  • 106 Shi C, Xu M, Zhu Z, Zhang W, Zhang M, Tang J. GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation. Proceedings of the 8th International Conference on Learning Representations 2020.
  • 107 Du Y, Guo X, Cao H, Ye Y, Zhao L. Disentangled spatiotemporal graph generative models. Proceedings of the AAAI Conference on Artificial Intelligence 2022;36(6):6541–9. doi: 10.1609/aaai.v36i6.20607.

Table 1 Principles, characteristics, and applicable tasks of three categories of GRL methods.
Table 2 Recent notable GRL models.
Table 3 Core elements of KG applications in drug and disease-related research.