Key words
MR-diffusion/perfusion - neural networks - vascular - staging - diagnostic radiology
Introduction
The basic goal of all contrast agent (CA)-based perfusion measurement methods is to obtain detailed information about the structure and function of the vascular network by observing its dynamic response to a defined CA bolus. For this purpose, MR imaging is an ideal measurement method as both high temporal and spatial resolution may be achieved using current-generation scanners. If the relationship between contrast agent concentration and signal intensity is known, the contrast agent concentration can be calculated for each voxel and timestep. After this has been achieved, tissue models of different complexity may be fitted to the concentration time curves and the results used for diagnosis, treatment monitoring, or basic research.
The goal of this review is to explore the potential of novel deep learning-based processing methods which may be able to capture hitherto unknown perfusion parameters such as temporal curve shape or spatial enhancement patterns. It may be possible to identify, similarly to MR fingerprinting in structural MR imaging, perfusion signatures which carry information about the underlying microvascular architecture. As ample literature on the technical details of perfusion MRI exists, conventional acquisition and processing methods are only presented in brief.
In the first section, the influence of the vascular architecture and function on the CA dynamics is reviewed. This is followed by a brief recapitulation of conventional T1-weighted and T2*-weighted image acquisition methods and their relative strengths and weaknesses. In the main section, the potential of deep learning-based perfusion processing is reviewed and discussed, followed by an overview of current and potential future clinical applications. The review closes with a look at future research directions.
Background
Vascular networks and blood flow
The architecture and integrity of the tissue microvasculature determine the temporal and spatial signal dynamics in response to an external CA bolus. It must be emphasized that, while the signal changes are rapid and of large magnitude, the measured blood flow and diffusion effects are largely static over the measurement duration: the changes in contrast agent concentration do not represent the return of a perturbed system to its equilibrium. The CA bolus is merely the measurement vehicle with which static effects such as blood flow, contrast agent extravasation, and diffusion are measured. The relevant effects can be broadly classified into two categories: flow effects that take place inside the vascular system and exchange effects that take place between the intra- and extravascular spaces. As physiological microvascular flow is nearly always laminar and therefore deterministic, the path of each measured intravascular proton is theoretically predetermined by the vascular architecture and its flow patterns [1]. Due to resolution limits, however, the precise morphology and flow velocity of each capillary segment cannot be known. Recent models suggest [2] [3] that the capillary network is organized according to common principles, and that the microarchitecture can be described by a few organizational parameters [4] [5]. This makes it possible to simulate realistic microvascular networks relatively easily, which can in turn be used to explore the influence of these organizational parameters on the CA dynamics [6]. Concepts from statistical mechanics and graph theory may be of use to explore this space further [7]. Exchange effects arise from the permeability of the vessel walls, which allows the transport of fluid and solutes, including the CA itself, between blood and the extravascular space. This exchange is driven by concentration gradients, hydrostatic and oncotic pressure differentials and, in part, active transport. Tumor growth is associated with neoangiogenesis [8], with the newly developed vasculature being significantly more fragile and permeable than the physiological vasculature [9]. Imaging permeability effects requires much longer measurement times [10], as the exchange effects are at least two orders of magnitude slower than directed blood flow.
Basic measurement principles
For T1-weighted perfusion imaging, called dynamic contrast-enhanced (DCE) MRI, dynamic measurements using a heavily T1-weighted MR sequence with sufficient spatial and temporal resolution are necessary. This is possible using either spin echo or gradient echo-based sequences, although 3D gradient echo sequences have become the de facto standard. An example of a DCE acquisition can be seen in [Fig. 1C]. Recently, compressed sensing-based sequences have been introduced, allowing shorter acquisition times and adaptive readout schemes. The absolute contrast agent concentration can be calculated from the relative change in signal intensity before and after application of the CA. For this, the absolute T1 time of the tissue must be known beforehand, usually by applying a quantitative T1 mapping sequence such as a variable flip angle (VFA) or modified Look-Locker inversion recovery (MOLLI) sequence. The main advantage of T1w imaging-based perfusion measurements is that the relationship between signal intensity and contrast agent concentration is only weakly influenced by tissue effects, allowing the modeling of permeability effects in which contrast agent leaves the vasculature.
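The conversion from signal intensity to concentration described above can be sketched as follows. This is a minimal illustration, not the method of any particular scanner or study: it assumes a spoiled gradient echo sequence, negligible T2* effects, and a linear relaxivity relation R1(t) = 1/T10 + r1·C(t). The function names and the relaxivity value passed in by the caller are placeholders.

```python
import numpy as np

def spgr_signal(t1, tr, flip_rad, m0=1.0):
    """Steady-state spoiled gradient echo signal (T2* decay neglected)."""
    e1 = np.exp(-tr / t1)
    return m0 * np.sin(flip_rad) * (1.0 - e1) / (1.0 - e1 * np.cos(flip_rad))

def signal_to_concentration(signal, baseline, t10, tr, flip_deg, r1):
    """Invert the SPGR equation to obtain CA concentration.

    signal   : dynamic signal intensities S(t)
    baseline : pre-contrast signal S0
    t10      : pre-contrast T1 in seconds (e.g. from a T1 mapping sequence)
    tr       : repetition time in seconds
    flip_deg : flip angle in degrees
    r1       : assumed relaxivity in 1/(mM*s); value is agent- and field-dependent
    """
    flip = np.deg2rad(flip_deg)
    e10 = np.exp(-tr / t10)
    # Ratio S/S0 removes the unknown M0*sin(flip) scaling factor.
    a = (signal / baseline) * (1.0 - e10) / (1.0 - e10 * np.cos(flip))
    e1 = (1.0 - a) / (1.0 - a * np.cos(flip))
    r1_t = -np.log(e1) / tr          # dynamic relaxation rate R1(t)
    return (r1_t - 1.0 / t10) / r1   # concentration in mM
```

Because only the ratio S(t)/S0 enters, the result depends on the accuracy of the baseline T1 map, which is why the T1 mapping step matters in practice.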
Fig. 1 Comparison of DCE and DSC MRI. A Raw DSC MRI of the human brain using single-shot gradient echo planar imaging at time point t1 (left) and t10 (right). B Calculated CBF (left) and CBV (right) maps. C Raw DCE MRI of the human brain using a 3D gradient echo TWIST sequence at time point t1 (left) and t10 (right). D Fitted Tofts model with parameters ktrans (left), vep (right).
For T2*-weighted perfusion imaging, called dynamic susceptibility contrast (DSC) MRI, a gradient echo sequence is usually applied, most commonly echo planar imaging due to its uniquely high acquisition speed [11], as shown in [Fig. 1A]. As the vessel geometry and the susceptibility difference between vessel and parenchyma have a large influence on signal dephasing speed, it is common practice to saturate the interstitial space in regions of high permeability by applying a small pre-bolus of contrast agent before starting the dynamic acquisition. This attenuates the interstitial signal changes during passage of the main bolus, allowing correct quantification of cerebral blood flow (CBF) and cerebral blood volume (CBV). Still, additional postprocessing correction of contrast agent extravasation is highly advantageous [12]. The relative signal intensity of T2*-weighted spin echo and gradient echo sequences depends on the vessel geometry inside the voxel. This effect is used in vessel size imaging (VSI) [13] [14] or vessel architecture imaging (VAI) [15] to determine vessel diameters.
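For DSC data, the signal drop during bolus passage is commonly converted into a change in transverse relaxation rate, which is then taken as proportional to the CA concentration. The sketch below assumes a mono-exponential signal model, S(t) = S0·exp(-TE·ΔR2*(t)), and estimates S0 from the first few pre-bolus time points. The function name and the number of baseline points are illustrative choices.

```python
import numpy as np

def dsc_delta_r2s(signal, te, n_baseline=5):
    """Convert a DSC signal time course to Delta R2*(t) in 1/s.

    signal     : array with time along the last axis
    te         : echo time in seconds
    n_baseline : number of pre-bolus points averaged to estimate S0 (assumption)
    """
    signal = np.asarray(signal, dtype=float)
    s0 = signal[..., :n_baseline].mean(axis=-1, keepdims=True)
    return -np.log(signal / s0) / te
```

The proportionality constant between ΔR2* and concentration depends on tissue and vessel geometry, which is one reason DSC parameters are often reported as relative values.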
Quantitative analysis
The challenge of quantitative perfusion and permeability modelling is fitting complex nonlinear models to noisy measurement data while having to correctly describe the aberrant contrast agent dynamics created by pathological tissue. Therefore, care has to be taken when using regularization algorithms in order to not correct away important abnormalities. From a mathematical viewpoint, dynamic contrast agent curves are stochastic time series with overlying noise. Depending on the specific measurement sequence, the noise distribution of the signal intensity [16] and contrast agent concentration is not necessarily Gaussian. This has important implications for noise estimation and significance testing.
Perfusion modeling
The basis of perfusion modeling is the indicator-dilution theorem, which connects the local dynamics of centrally injected contrast agent with the local tissue blood flow and blood volume [17] [18]. These parameters can be determined directly from the concentration time curve without having to assume an underlying tissue model. In order to account for variations in the central circulation, the bolus curve of either a large perfusing artery or an interpolated “local” vessel is used as an arterial input function (AIF) [19]. The AIF is usually selected either manually or automatically [20] [21]. The AIF is deconvolved from the voxel CA concentration time curve, resulting in the tissue residue function, from which CBF and CBV can be directly calculated. Examples of the resulting maps are shown in [Fig. 1B]. While the calculation of conventional parameters like CBF, CBV, time to peak (TTP), and mean transit time (MTT) is well-established and a mainstay of neurooncological and neurovascular diagnosis, novel processing methods like wavelet-based analysis [22] [23], Bayesian vascular models [24], and control-point interpolation methods [25] may be able to capture pathological changes better. In addition, direct inference from the tissue residue function may be able to capture important features which are missed by calculated single parameters. Changes in the microvascular architecture affect the transit delay of the CA, leading to changes in the shape of the residue function.
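The deconvolution step can be illustrated with a Tikhonov-regularized singular value decomposition. This is a hedged sketch rather than a reference implementation: it assumes uniformly sampled curves, ignores bolus delay and dispersion, omits the tissue density and hematocrit scaling factors, and uses a hypothetical relative regularization parameter `rel_lambda` that would need tuning for real data.

```python
import numpy as np

def deconvolve_tikhonov(aif, tissue, dt, rel_lambda=0.1):
    """Estimate k(t) = CBF * R(t) from tissue(t) = dt * sum_j aif(t - t_j) k(t_j)."""
    aif = np.asarray(aif, dtype=float)
    tissue = np.asarray(tissue, dtype=float)
    n = len(aif)
    idx = np.arange(n)
    lag = idx[:, None] - idx[None, :]
    # Lower-triangular (causal) convolution matrix built from the AIF.
    a = dt * np.where(lag >= 0, aif[np.clip(lag, 0, n - 1)], 0.0)
    u, s, vt = np.linalg.svd(a)
    lam = rel_lambda * s[0]
    # Tikhonov filter damps small singular values instead of truncating them.
    filt = s / (s**2 + lam**2)
    return vt.T @ (filt * (u.T @ tissue))

def perfusion_parameters(aif, tissue, dt, rel_lambda=0.1):
    """Return (CBF, CBV, MTT) in arbitrary units; scaling factors are omitted."""
    k = deconvolve_tikhonov(aif, tissue, dt, rel_lambda)
    cbf = k.max()                          # peak of the scaled residue function
    cbv = np.sum(tissue) / np.sum(aif)     # ratio of areas under the curves
    mtt = cbv / cbf                        # central volume theorem
    return cbf, cbv, mtt
```

Note that the regularization biases the peak of the estimated residue function, so the resulting CBF depends on `rel_lambda`; CBV, being a ratio of areas, does not.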
Permeability modeling
For modeling of permeability effects, a dynamic tissue model with fixed exchange coefficients is assumed. Parameter fitting is usually accomplished by either nonlinear least squares (LS) fitting or Bayesian optimization [26] [27]. Since the introduction of the Tofts [28], Brix [29] and Patlak models [30], increasingly complex models have been proposed. Examples include the two-compartment exchange model (2CXM) [31], the compartmental tissue uptake (CTU) model [32], and the adiabatic approximation to the tissue homogeneity (TH) model [33]. The maps resulting from a Tofts model fit can be seen in [Fig. 1D]. While research on tissue models for DCE perfusion has enjoyed constant popularity among MRI researchers and robust fitting algorithms have been developed, clinical utilization of the derived parameters remains low. Most diagnostic guidelines still rely on qualitative descriptors of contrast agent bolus curves.
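As an illustration of such a fit, the sketch below uses the standard Tofts model, C_t(t) = Ktrans · ∫ C_p(τ) exp(-kep (t - τ)) dτ with kep = Ktrans / ve. Instead of a general nonlinear least squares or Bayesian optimizer, it exploits the fact that the model is linear in Ktrans for fixed kep and scans a grid of kep values. This is a simplification chosen to keep the example dependency-free; the grid and function names are assumptions.

```python
import numpy as np

def tofts_basis(cp, t, kep):
    """Convolution of the plasma curve with exp(-kep * t) on a uniform time grid."""
    dt = t[1] - t[0]
    kernel = np.exp(-kep * t)
    return dt * np.convolve(cp, kernel)[: len(t)]

def fit_tofts(cp, ct, t, kep_grid):
    """Least squares fit of the standard Tofts model; returns (ktrans, ve)."""
    best = None
    for kep in kep_grid:
        basis = tofts_basis(cp, t, kep)
        # For fixed kep the optimal Ktrans has a closed form (linear least squares).
        ktrans = (basis @ ct) / (basis @ basis)
        sse = np.sum((ct - ktrans * basis) ** 2)
        if best is None or sse < best[0]:
            best = (sse, ktrans, kep)
    _, ktrans, kep = best
    return ktrans, ktrans / kep
```

With time in minutes, Ktrans and kep are returned in 1/min and ve is dimensionless. For noisy clinical data, a gradient-based optimizer with parameter bounds or a Bayesian approach, as mentioned above, would usually be preferred over a grid.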
Deep learning
Deep learning algorithms are essentially complex nonlinear functions which can represent nearly any underlying distribution, provided that sufficient and sufficiently varied input data are available [34] [35]. This implies that most, if not all, conventional processing steps that are applied to the raw perfusion data can be learned by a neural network. This raises the question as to whether it is useful to train a neural network to provide known perfusion parameters as an output. The ultimate goal of perfusion imaging is to provide functional information not available from purely morphological imaging and to use it to improve detection rates, diagnostic accuracy, and outcome prediction. Neural networks can be trained to directly output this information by training them on data obtained from medical records or histopathological reports, e.g., tumor grading or staining density. The prediction of clinical outcome parameters directly based on raw perfusion data is not an easy task, however, and requires both a large amount of training data and enough training time to provide sensible results. As an intermediate step, the output parameters of conventional quantitative processing methods can be learned instead, essentially teaching a neural network the mathematical transformations behind the processing pipeline. This is far easier than predicting clinical or outcome parameters, as the quantitative parameter is usually known for each voxel separately, and there is a direct (quasi-)deterministic relationship between input and output variables. Therefore, less training data is required, and training converges faster. Furthermore, even if no new information is gained compared to conventional algorithms, implementing the processing in a common neural network may have other advantages, such as higher processing speed or better interoperability between data acquired by MR devices from different vendors. The raw data from T1- or T2*-weighted perfusion imaging take the form of a four-dimensional array: three spatial dimensions are acquired for each timestep.
Two different classes of neural networks have been employed to model this spatiotemporal data: convolutional neural networks and recurrent neural networks.
Convolutional neural networks are derived from fully connected networks and incorporate hidden layers which perform spatial convolution steps, helping them capture complex relationships at different resolution scales. For perfusion imaging, the 3D images captured at each timestep are usually assigned as different input channels. This has the advantage of easily capturing spatial relationships but requires that the input data always have the same number of timesteps. In addition, the results are not invariant under temporal shifts, such as when the acquisition was started earlier or later. A possible solution is four-dimensional convolutional neural networks with modified loss functions [36] [37]. Finally, as convolutional neural networks are commonly used for tissue segmentation, their incorporation into the perfusion processing workflow can help in extracting tissue parameters [38].
Recurrent neural networks, on the other hand, are natively designed for sequential input data: there is a one-to-one relationship between each temporal position in the input data and a network layer [39]. Each layer, in addition to input and output nodes, consists of several hidden nodes, with weight matrices shared between different timesteps. This makes them invariant under time shifts, an important advantage when considering perfusion data. Training recurrent networks poses specific challenges such as vanishing or exploding gradients [39]. Several network architectures were designed to deal with these problems, the most prominent being long short-term memory (LSTM) networks [40] [41]. The architecture of an LSTM network designed to learn CBV values from DSC data is shown in [Fig. 2]. LSTM networks have shown high promise in modeling a wide range of sequential problems in radiology, such as predicting IDH genotype in gliomas [42], classifying breast lesions [43], segmenting tissue or organs [44] [45], differentiating the origins of spinal metastases [46], and, recently, predicting DCE model parameters [47]. It is also possible to learn the necessary transformations for perfusion modelling of DSC data, as demonstrated in [Fig. 3] (own work). As can be seen in the right part of [Fig. 3B], the root mean squared error between the predicted and the conventionally calculated CBV is very small.
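To make the weight sharing between timesteps concrete, the following sketch implements the forward pass of a single-layer LSTM followed by a linear output node, applied voxel-wise to concentration time curves. It is not the network used for the figures in this review: the weights are random and untrained, there is only one layer, and the class name and hidden size are hypothetical. A practical implementation would use a deep learning framework with automatic differentiation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyLSTMRegressor:
    """Single-layer LSTM with a linear head; forward pass only, untrained."""

    def __init__(self, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        self.hidden = hidden
        # Gate weights stacked as [input, forget, candidate, output].
        self.w_x = rng.normal(0.0, 0.1, (1, 4 * hidden))
        self.w_h = rng.normal(0.0, 0.1, (hidden, 4 * hidden))
        self.b = np.zeros(4 * hidden)
        self.w_out = rng.normal(0.0, 0.1, (hidden, 1))
        self.b_out = np.zeros(1)

    def forward(self, curves):
        """curves: (n_voxels, n_timepoints) -> one scalar prediction per voxel."""
        n_vox, n_t = curves.shape
        h = np.zeros((n_vox, self.hidden))   # hidden state
        c = np.zeros((n_vox, self.hidden))   # cell state
        for n in range(n_t):
            x = curves[:, n:n + 1]           # concentration c(t_n) per voxel
            # The same weights are reused at every time point.
            z = x @ self.w_x + h @ self.w_h + self.b
            i, f, g, o = np.split(z, 4, axis=1)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
        # Only the last hidden state is passed to the linear output node.
        return (h @ self.w_out + self.b_out)[:, 0]
```

Because the loop reuses one set of weights, the same model accepts curves with any number of time points, in contrast to the channel-based convolutional approach.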
Fig. 2 Recurrent neural network architecture for the prediction of CBV from DSC contrast agent curves with N time points. The network consists of L layers with N LSTM cells, each with M hidden features, in each layer. For each voxel separately, the concentration at each time point c(tn) is processed by a separate LSTM cell with hidden state hl,n. The weighting function wl,n is the same inside each layer. The last output of the last layer is given as the input for a fully connected network (FCN) layer with (M, 1) nodes. The final output is the trained parameter, in this case the CBV.
Fig. 3 Demonstration of the capabilities of an exemplary LSTM network. A Mean squared error (MSE) loss, evaluated on a validation subset, over the training epochs, showing rapidly decreasing loss when using the Adam optimizer with a learning rate of 1e-6. B Comparison of the conventionally obtained CBV, as obtained by a Tikhonov-regularized singular value decomposition (TiSVD, left) and the learned CBV LSTMpred (middle) on a test case which was not in the training or validation cohort. The right image shows the root MSE between the CBV values obtained by the two different methods. Pseudocolor scale is identical across the methods with a range of [0, 30] ml/min.
Model interpretability and error estimation
In order to compare perfusion parameters from different voxels or measurements, it is necessary to have an estimate of the parameter errors. This error arises from two main sources: intrinsic MRI measurement noise and deviations between the real voxel tissue and the chosen tissue model. Only when the error is known is it possible to correctly assess the magnitude of differences and perform significance testing. There are several methods for error estimation, with the most common model-free method being bootstrapping [48]. This method treats the model algorithm as essentially a black box and considers the output error after artificially adding noise to the input parameter. For deep learning, different technique-specific methods have been proposed, such as dropout methods [49] or neural networks based on Bayesian reasoning [50] [51].
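The black-box procedure described above, perturbing the input with synthetic noise and observing the spread of the output, can be sketched as follows. Zero-mean Gaussian noise is assumed here purely for simplicity; as noted earlier, the actual noise distribution may differ depending on the sequence. The function name, the number of repetitions, and the example estimator are illustrative.

```python
import numpy as np

def noise_perturbation_error(estimator, curve, noise_sd, n_repeats=500, seed=0):
    """Estimate mean and standard deviation of a parameter under input noise.

    estimator : any callable mapping a time curve to a scalar (treated as a black box)
    curve     : measured or fitted concentration time curve
    noise_sd  : assumed standard deviation of the additive Gaussian noise
    """
    rng = np.random.default_rng(seed)
    estimates = np.array([
        estimator(curve + rng.normal(0.0, noise_sd, size=curve.shape))
        for _ in range(n_repeats)
    ])
    return estimates.mean(), estimates.std(ddof=1)
```

The estimator could be a conventional fit or a trained network alike, for example `noise_perturbation_error(lambda c: float(c.sum()), curve, noise_sd=0.1)` for the area under the curve.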
When using black-box machine learning algorithms, the outcome predictions must be taken at face value, as the algorithm does not natively provide a reason for its prediction. Recently, the field of neural network interpretability has made significant advances in providing measures which help with the interpretation of results [52]. Specifically for LSTM networks, gradient-based attribution methods [53] or structure modifications allowing direct variable importance output [54] have been proposed. These can be used to extract associations between contrast agent curve shape and outcome parameters [42].
Current and future clinical applicability
The quantitative evaluation of DSC MRI, that is, perfusion modeling, currently has far more clinical applications than permeability modeling using DCE MRI. CBF and CBV maps are used in the diagnosis of stroke, glioma, head-and-neck tumors, and sometimes in cardiac imaging. On the other hand, most clinical applications of DCE MRI rely on purely qualitative or semiquantitative assessment. In the PI-RADS guidelines for the diagnosis of prostate cancer, only the presence or absence of early enhancement is scored since the available evidence for pharmacokinetic modeling is deemed insufficient [55]. Similarly, breast cancer diagnosis using the BI-RADS criteria relies on a classification of the signal dynamics into one of three curve types according to rise speed and washout [56]. This does not imply, however, that there is no evidence for the usefulness of these models, and a large number of small-scale studies exist [57]. Widespread adoption of quantitative DCE is hindered by several factors such as insufficient standardization of acquisition and processing. The relatively poor current performance of quantitative DCE models in distinguishing healthy from malignant tissue may stem from deficits in the handling of the noise levels inherent in fast T1w imaging and in cleanly separating perfusion effects such as bolus delay and dispersion from permeability effects. This may change in the future, however, as deep learning-based modelling becomes commonplace. The optimal mathematical framework for correctly handling both intrinsic and extrinsic noise is given by stochastic analysis, in particular by stochastic differential equations.
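The semiquantitative three-type curve classification mentioned above can be expressed in a few lines. The sketch compares the delayed-phase enhancement with the early enhancement; the threshold of 10 % and the category labels follow a commonly used convention but are stated here as assumptions and should be checked against the guideline in use.

```python
def classify_delayed_phase(early_enhancement, late_enhancement, threshold=0.10):
    """Classify a DCE curve by its delayed-phase behaviour.

    early_enhancement : enhancement relative to baseline in the early phase (> 0)
    late_enhancement  : enhancement relative to baseline in the delayed phase
    threshold         : relative change separating the curve types (assumed 10 %)
    """
    change = (late_enhancement - early_enhancement) / early_enhancement
    if change > threshold:
        return "persistent"
    if change < -threshold:
        return "washout"
    return "plateau"
```

Such a descriptor discards most of the information in the time curve, which illustrates the gap between current clinical practice and the model-based or learned parameters discussed in this review.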
Outlook
Currently, CA-based perfusion MRI is a well-established functional imaging method with a multitude of appropriate processing methods. While new processing methods continue to be developed, progress has slowed somewhat in recent years. Machine learning in general, and deep learning in particular, offer promising new avenues for better and more reliable processing methods. The inherently stochastic nature of neural networks represents an ideal fit for modeling the inherently noisy CA dynamics. Due to their flexibility, these methods may be capable of modelling the complex dynamics inherent in aberrant blood flow patterns without having to specify a particular model beforehand.
The main challenge in applying deep learning algorithms to perfusion MRI data remains the necessity of high-quality and plentiful data for training. Not only does the raw perfusion data need to be acquired under standardized conditions using comparable sequence parameters, but the trained goal parameters, whether segmentations, clinical outcome parameters, or conventional perfusion parameters, need to be high-quality too. Particularly the prediction of clinical outcome parameters is demanding due to the often highly nonlinear and indirect relationship between perfusion data and final outcome.
A possible way to avoid always having to learn the complete problem set is to use physics-informed neural networks, which can learn complex tasks while preserving physical or heuristic relationships specified as differential equations [58]. These may be able to directly learn, for example, DCE parameters for a specified model. An even newer generalization of physics-informed neural networks, universal neural differential equations, provides an explicit way of doing this [59] [60]. The disadvantage, however, is that the explicit model-free nature of deep learning is partially lost.
In conclusion, deep learning-based processing of perfusion MRI data holds high promise for diagnosis and treatment monitoring in oncology. The novel methods may be uniquely suited for the inherently noisy time series obtained for each voxel and can learn almost any sensible parameter. A special focus should be on connecting architectural modeling and perfusion parameters, as this may allow monitoring of microstructural changes in the microvascular architecture induced by neoangiogenesis or as treatment response. Due to the rapid progress in the field, further research is urgently needed.