Introduction
Technological advances have substantially changed the field of radiology. The introduction
of computed tomography (CT) and magnetic resonance imaging (MRI) and their subsequent
steady improvement account for major changes in diagnostic medicine and in the measurement
of treatment response in oncology. The next step in radiological innovation is expected
to come mainly from computer-assisted analysis of imaging and clinical data, especially
radiomics and artificial intelligence. Radiological imaging plays a major role in clinical
decision-making in oncology, and computer-assisted image analysis promises to improve
this process where human readers cannot fully capture complex information beyond tumor
size and contrast behavior.
Radiological imaging in lung cancer patients is one of the fields in oncology in which
such applications are eagerly awaited. Lung cancer is the second most common cancer
type globally and the leading cause of cancer deaths, with non-small cell lung cancer (NSCLC) being the predominant subtype, accounting for approximately 85 % of cases,
with approximately 40 % being classified as adenocarcinomas [1] [2]. Early detection of malignant pulmonary lesions is highly relevant, as the 5-year
survival rate of patients treated for early-stage lung cancer is 57 %, while it is
only about 5 % in patients with generalized disease [3]. Because symptoms are limited in early stages, lung cancer is often diagnosed at a late
stage [4]. In recent years the identification of new molecular and genomic biomarkers has
provided new targets for therapeutic approaches in patients with late-stage NSCLC,
thereby increasing overall survival [5] [6]. Important examples include mutations of the epidermal growth factor receptor (EGFR) and anaplastic lymphoma kinase (ALK) rearrangements, which can be found in 15 % and 2 % of NSCLC
patients, respectively [7] [8]. For both alterations a number of targeted therapeutics are already being used in
the clinical routine (e. g., the tyrosine kinase inhibitors erlotinib, gefitinib,
and afatinib for EGFR mutations [9] [10] and crizotinib, alectinib, and brigatinib for ALK translocations [11] [12]). The histological subtypes of lung cancer and the molecular alterations in the largest
histological subgroup, i. e., adenocarcinomas, are summarized in [Fig. 1].
Fig. 1 Overview of histological subtypes of lung cancer (left) and molecular alterations
in the biggest histological subgroup of NSCLC, adenocarcinomas (right) [1 ].
To date, identifying these alterations usually requires the invasive
collection of tissue samples, e. g., by transbronchial biopsy or CT-guided biopsy
of the primary tumor or its metastases. Invasive procedures carry a risk of complications,
and not all patients can undergo them because of comorbidities.
In some cases tissue samples are inadequate, and re-biopsy cannot always be performed
when the disease progresses. To overcome these hurdles, new approaches are
being investigated to assess and monitor tumor mutation status, for example the examination
of circulating cell-free nucleic acids in the bloodstream, usually referred to as
liquid biopsy [13], or the use of surrogate markers derived from computer-assisted
analysis of radiological imaging or clinical data. Even though the identification
of the described molecular biomarkers as therapy targets has had a considerable impact
on patient outcome, not all patients with corresponding molecular alterations benefit
from targeted therapies and in many cases tumor recurrence is observed over time,
highlighting the need for close monitoring of therapy response as part of the diagnostic
process in lung cancer patients. In this review we give a short introduction to the
field of radiomics in the context of identification of imaging biomarkers in lung
cancer and provide an outlook on how radiomics could impact the management and treatment
of lung cancer patients in the future.
Basic Principles of Radiomics for the Identification of Imaging Biomarkers
In oncologic radiology the evaluation of imaging traditionally involves a mainly qualitative
assessment by the human reader, an approach described as semantic
[14]. This applies to cross-sectional imaging with both CT and MRI. The human eye can
capture the tumor phenotype only partially, and the result also depends on factors such as the experience
of the radiologist. Quantitative analysis in this context is often limited to one-dimensional
measurements of tumor manifestations, with follow-up examinations being evaluated according
to the RECIST (Response Evaluation Criteria in Solid Tumors) guidelines [15]. This approach potentially misses a large part of the information available
from imaging, as this information might not be easily accessible to the human eye.
[Fig. 2] shows three examples of pulmonary adenocarcinomas with different tumor mutation
status and no obvious features that allow them to be differentiated visually.
Fig. 2 Examples of pulmonary adenocarcinomas in soft tissue (a, c, e ) and lung tissue window (b, d, f ): CT examinations of three different patients at the time of initial diagnosis. Patients
later underwent tissue biopsies with molecular pathological analysis detecting no
tumor mutation (a, b ), EGFR mutation (c, d ), and ALK rearrangement (e, f ).
An increasing number of studies show that radiomics-based image analysis allows for
the extraction of otherwise missed features and their quantitative analysis [16 ], which in turn could improve diagnosis and might lead to a better prediction of
tumor response to therapy [17 ]. The term radiomics refers to the concept of large-scale analysis of radiological
images and the association with biological markers or clinical endpoints using mathematical
and machine learning methods [18 ]. The main steps of any radiomics workflow follow the same principles and can be
summarized as shown in [Fig. 3 ].
Fig. 3 Steps of radiomics workflows: After image acquisition, target structures are
manually or (semi-)automatically segmented for the feature extraction and selection
process. The association of radiomics features with
other endpoints is then analyzed to establish a prediction model, followed by performance testing of
the established model.
Image acquisition is the starting point in any radiological study. Differences in
imaged organ systems and image acquisition techniques can have a considerable impact
on the reproducibility of radiomics models as scanners, scanning protocols, reconstruction
algorithms, etc. can vary significantly in large data sets [19]. As most radiomic analyses are conducted retrospectively, this factor must be considered
and addressed by techniques such as feature normalization or harmonization
[20] [21]. Otherwise, the resulting model may be tied to the technical settings of the data on which it was trained.
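As a rough illustration of feature normalization, the following sketch standardizes one feature separately within each acquisition batch (for example, each scanner). It is a minimal sketch under stated assumptions, not the method of the cited studies: the function name, the batch labels, and the feature values are hypothetical, and a per-batch z-score is only the simplest form of location and scale adjustment.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_by_batch(values, batches):
    """Standardize one feature separately within each batch (e.g., scanner)."""
    grouped = defaultdict(list)
    for value, batch in zip(values, batches):
        grouped[batch].append(value)
    # Per-batch mean and population standard deviation.
    stats = {b: (mean(v), pstdev(v)) for b, v in grouped.items()}
    normalized = []
    for value, batch in zip(values, batches):
        mu, sigma = stats[batch]
        # A feature that is constant within a batch is mapped to 0.
        normalized.append((value - mu) / sigma if sigma > 0 else 0.0)
    return normalized


# Hypothetical feature values from two scanners with a systematic offset.
feature = [10.0, 12.0, 14.0, 110.0, 112.0, 114.0]
scanner = ["A", "A", "A", "B", "B", "B"]
normalized = zscore_by_batch(feature, scanner)
```

After this step the offset between the two hypothetical scanners is gone. The same step would also remove genuine biological differences if the batches differ in case mix, which is one reason why more elaborate harmonization methods exist.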
Following image acquisition, the region of interest (ROI) needs to be defined and
segmented for further analysis. This may not only include the tumor itself but also
its close pulmonary vicinity to study the interaction with surrounding tissues. Semi-automated
and fully automated segmentation techniques have improved in recent years, speeding
up the otherwise time-consuming work of manual segmentation by experts [22 ].
Radiomic features are subsequently calculated from the segmented structures. Many
different features have been described; some are highly standardized, while some
studies also include handcrafted features. Authors take different approaches to the
classification of features. Conventionally, four main groups can be distinguished:
tumor intensity-based features (also called first-order features because they are described by first-order statistics), shape features, texture
features, and wavelet features. Tumor intensity-based features are histogram-based
quantifications of all tumor voxel intensity values. Shape features describe the geometric
properties of the region of interest. Texture features quantify the heterogeneity
of the region of interest in terms of grayscale values. For example, a homogeneous
structure or tissue would show similar gray values, while heterogeneous structures would
exhibit large differences in gray levels. Examples of this feature class include features derived from the
Gray Level Co-occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), or Gray Level Size Zone Matrix (GLSZM). Lastly, wavelet features are calculated from wavelet decompositions of the
original images [16] [23] [24]. A wavelet transformation is a form of mathematical filter that decomposes
the initial image according to scale and orientation. [Table 1] gives an overview of these feature classes and provides relevant examples for each
class.
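To make the idea of a decomposition by scale and orientation more concrete, the following sketch performs a single-level two-dimensional Haar decomposition in plain Python. It is an illustrative sketch only: the function names and the toy image are hypothetical, the averaging normalization (division by 2) is one of several conventions, and the naming of the sub-bands differs between software packages.

```python
def haar_step(values):
    """One Haar level on a sequence of even length: pairwise means and half-differences."""
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low, high


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def filter_rows(matrix):
    """Apply haar_step to every row; return the low-pass and high-pass matrices."""
    lows, highs = [], []
    for row in matrix:
        low, high = haar_step(row)
        lows.append(low)
        highs.append(high)
    return lows, highs


def haar_2d(image):
    """Single-level 2D Haar decomposition of an image with even side lengths.

    Sub-band names here: first letter = filter along rows, second = along columns.
    """
    row_low, row_high = filter_rows(image)
    # Columns are filtered by transposing, filtering rows, and transposing back.
    ll, lh = (transpose(m) for m in filter_rows(transpose(row_low)))
    hl, hh = (transpose(m) for m in filter_rows(transpose(row_high)))
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}


# Toy image with horizontal stripes: gray values change only from row to row.
image = [[0, 0, 0, 0], [8, 8, 8, 8], [0, 0, 0, 0], [8, 8, 8, 8]]
bands = haar_2d(image)
```

For this striped toy image only the sub-band that is high-pass across rows of the image grid (here labeled LH) is non-zero, which shows how the decomposition separates structure by orientation. Radiomic features would then be computed on each sub-band in the same way as on the original image.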
Table 1 Radiomics feature classes and selected examples.
| Feature class | Description | Examples |
| --- | --- | --- |
| Tumor intensity-based features (first-order statistics) | Histogram-based quantifications of all voxel intensity values of ROI | Minimum, maximum, mean, median, range, etc. |
| Shape features | Used to describe geometric properties of ROI | Volume, surface area, sphericity, maximum diameter, elongation, etc. |
| Texture features | Used to quantify heterogeneity of ROI | Gray Level Co-occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Size Zone Matrix (GLSZM), etc. |
| Wavelet features | Calculated from wavelet decompositions of original imaging | Fourier, Gabor, Haar wavelet transforms |
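The first-order and texture classes can be illustrated with a small sketch in plain Python. It assumes a two-dimensional ROI whose gray values have already been discretized into integer bins from 0 to levels - 1; the function names and the two toy ROIs are hypothetical, and dedicated radiomics software computes many more features and handles three-dimensional masks.

```python
from statistics import mean, median


def first_order(roi):
    """Histogram-type statistics over all voxel values of a 2D ROI."""
    values = [v for row in roi for v in row]
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
    }


def glcm(roi, levels, dx=1, dy=0):
    """Normalized, symmetric gray level co-occurrence matrix for one offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(roi), len(roi[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                i, j = roi[y][x], roi[ny][nx]
                counts[i][j] += 1
                counts[j][i] += 1  # count each neighbor pair in both directions
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def glcm_contrast(p):
    """GLCM contrast: co-occurrence probabilities weighted by squared gray level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


# Two toy ROIs with four gray levels (0..3).
homogeneous = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
heterogeneous = [[0, 3, 0], [3, 0, 3], [0, 3, 0]]

contrast_homogeneous = glcm_contrast(glcm(homogeneous, levels=4))
contrast_heterogeneous = glcm_contrast(glcm(heterogeneous, levels=4))
stats_heterogeneous = first_order(heterogeneous)
```

The uniform ROI yields a contrast of zero, whereas the checkerboard ROI yields a high contrast, mirroring the distinction between homogeneous and heterogeneous tissue described in the text.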
Not all resulting features are equally useful for statistical correlation with clinical
endpoints, as some are redundant or only weakly associated with the classification
task [18]. Therefore, to identify which radiomic features should be used as imaging biomarkers
for the intended task, feature selection has to be performed to reduce the dimensionality
of the feature space [25]. Multiple studies have compared different feature selection approaches and have
thereby shown that feature selection is critical for the development of accurate radiomics
models [23 ]
[26 ]
[27]. The last and most challenging step is the development of a model that integrates
radiomics data with clinical data for tumor classification or
therapy prediction. This is usually achieved with a regression model for continuous outcomes
(e. g., survival) or a classification model for categorical outcomes, and can be greatly enhanced by machine
learning techniques [28].
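One simple way to reduce redundancy and favor features associated with the endpoint is a greedy correlation filter, sketched below in plain Python. This is only one of many possible selection strategies and is not claimed to be the approach of the cited studies; the function names, the redundancy threshold of 0.9, and the feature values are hypothetical. In practice, selection should be carried out on the training data only, so that no information from the test data leaks into the model.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def select_features(features, labels, max_redundancy=0.9):
    """Rank features by |r| with the label, then skip any feature that is
    highly correlated with a feature that has already been kept."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], labels)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        if all(abs(pearson(features[name], features[k])) < max_redundancy for k in kept):
            kept.append(name)
    return kept


# Hypothetical training data: six lesions, binary mutation label.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "volume": [1.0, 2.0, 1.5, 4.0, 5.0, 4.5],
    "diameter": [2.0, 4.0, 3.0, 8.0, 10.0, 9.0],  # carries the same information as volume here
    "entropy": [0.3, 0.9, 0.5, 0.4, 0.8, 0.6],
}
selected = select_features(features, labels)
```

In this toy example the two size features are perfectly correlated, so only one of them is retained, while the less redundant third feature is kept.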
Addressing Limitations of Radiomics
A major issue that remains to be solved in the field of radiomics is the reproducibility
of results. Many radiomics studies have only used the split-sample approach, meaning
that monocentric datasets are split into training and validation datasets before feature
extraction [29 ]. Such models usually show low performance when applied to an independent cohort
[30 ]. A lack of harmonization of data acquisition is still an important drawback for
high-throughput data methodologies such as radiomics, since different CT scanner manufacturers,
scanning protocols, contrast media, and image reconstruction methods impact image
features. Further validation and generalization of results requires large, multi-centric
datasets that incorporate imaging data and clinical data of good quality to allow
for the creation of robust models [30]. Moreover, in their recent review of radiomics approaches in
lung cancer, Ninatti et al. found that no radiomic features proved consistently reliable across different
studies [29]. Further standardization is therefore still required before radiomics approaches
can be introduced into clinical practice.
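The difference between a split-sample design and validation on an independent cohort can be sketched as a leave-one-center-out scheme, in which all cases from one center are held out together. The following plain-Python sketch uses hypothetical case identifiers and center labels and shows only how the cases are partitioned, not how a model is trained.

```python
def leave_one_center_out(cases):
    """Yield (held_out_center, training_ids, test_ids).

    All cases of the held-out center go to the test set, so the test data
    come from a scanner and protocol environment not seen during training.
    """
    centers = sorted({case["center"] for case in cases})
    for held_out in centers:
        train = [c["id"] for c in cases if c["center"] != held_out]
        test = [c["id"] for c in cases if c["center"] == held_out]
        yield held_out, train, test


# Hypothetical multi-center data set.
cases = [
    {"id": "p01", "center": "A"},
    {"id": "p02", "center": "A"},
    {"id": "p03", "center": "B"},
    {"id": "p04", "center": "B"},
    {"id": "p05", "center": "C"},
]
splits = list(leave_one_center_out(cases))
```

Each split keeps training and test cases strictly separated by center. A random split of a monocentric data set cannot provide this kind of separation.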
Radiogenomics in Lung Cancer: Predicting the Tumor Genotype
As mentioned, therapy of advanced non-small cell lung cancer has changed considerably
due to the identification of numerous molecular and genomic markers. Molecular
pathological testing has therefore become part of the clinical routine for NSCLC patients; alterations
in EGFR, the receptor tyrosine kinase ALK, and the oncogenes ROS1 and BRAF
are usually investigated, since potent therapeutics for these targets are in use
[31]. In addition, programmed death 1 (PD-1)/programmed death-ligand 1 (PD-L1) inhibitors
are another important group of novel therapeutics for which testing is routinely
performed [31]. These immune checkpoint inhibitors have had a significant impact on patient outcome
in cases in which no other targetable molecular alteration is found.
Identification of these biomarkers using radiomics approaches has proven to be promising
as it could reduce the need for biopsies or could help track changes in the tumor
mutation status over time without the need for re-biopsy. The integration of medical
imaging data derived by radiomics and genomic data is often referred to as radiogenomics
[32]. The number of publications on the identification of these biomarkers
in NSCLC patients is increasing ([Table 2]), and several reviews have compared these studies, usually assessing
the performance of the developed models by the area under the receiver operating characteristic curve (AUC).
Prediction of EGFR status accounts for the highest number of publications, in line
with the relatively high prevalence of EGFR mutations in NSCLC patients ([Table 2]), and recent papers have shown good results, with the highest AUC values achieved
by combinations of radiomics data, visual qualitative CT features, AI approaches using
convolutional neural networks, positron emission tomography parameters, and clinical/pathological
features [29 ]. Results in the validation cohorts in these studies ranged from AUCs of 0.73 [33 ] to 0.95 [34 ] ([Table 3 ]).
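Because the AUC is the central performance measure in these comparisons, a brief sketch of its meaning may help. The AUC equals the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The plain-Python function below computes it by direct pairwise comparison; the scores and labels are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical model scores for three mutation-positive and three mutation-negative tumors.
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(scores, labels)
```

In this toy example eight of the nine positive-negative pairs are ranked correctly. An AUC of 0.5 corresponds to chance-level ranking and an AUC of 1.0 to perfect separation.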
Table 2 Number of publications in the field of radiogenomics in NSCLC.
| Molecular marker | Frequency in patients [3] [10] | Number of publications |
| --- | --- | --- |
| EGFR, overall | 10–20 % | 20 + |
| EGFR, exon 19 deletion | ~ 45 % | |
| EGFR, exon 21 L858R mutation | ~ 40 % | |
| EGFR, others | ~ 15 % | |
| ALK | 2–5 % | 5 |
| ROS1 | 1–4 % | 1 |
| BRAF | ~ 2 % | 0 |
| PD-L1 expression | 30–40 % / 30–40 % / ~ 30 % | 2 |
Table 3 Summary of selected studies presented in this review.
| Authors | Approach | Modality | Results (in validation cohorts) |
| --- | --- | --- | --- |
| Zhao et al. [33] | Prediction model for EGFR gene mutation status in NSCLC patients | CT | AUC = 0.73 |
| Jiang et al. [34] | Prediction model for EGFR gene mutation status in NSCLC patients | PET/CT | AUC = 0.95 |
| Yamamoto et al. [35] | Model for prediction of ALK aberrations | CT | Sensitivity 83.3 %, specificity 77.9 %, accuracy 78.8 % |
| Jiang et al. [36] | Model for prediction of PD-L1 expression rates of ≥ 1 % or ≥ 50 % | PET/CT | ≥ 1 %: AUC = 0.97; ≥ 50 %: AUC = 0.91 |
| Xu et al. [45] | Model to stratify patients into low and high mortality-risk groups strongly correlating with 2-year overall survival | CT | AUC = 0.74 |
| Mu et al. [46] | Response prediction of advanced NSCLC patients to immunotherapy | PET/CT | AUC = 0.83 (retrospective validation); AUC = 0.81 (prospective validation) |
| Cucchiara et al. [49] | Integrating liquid biopsy and radiomics to monitor clonal heterogeneity in EGFR-positive NSCLC | CT (liquid biopsy) | R² = 0.447, p < 0.001 |
Only a small number of studies investigating ALK rearrangements have been published,
and even fewer for PD-L1 expression and the rarer ROS1 and BRAF mutations. These studies
showed promising results. For example, Yamamoto et al. reported a sensitivity of 83.3 %,
a specificity of 77.9 %, and an accuracy of 78.8 % for their prediction model of ALK
aberrations [35], and Jiang et al. developed a model with AUCs of 0.97 and 0.91 for the prediction
of PD-L1 expression rates of ≥ 1 % and ≥ 50 %, respectively [36]. These positive outcomes make radiogenomics a promising area for further research.
However, reliable prediction of tumor mutation status in individual patients based solely on imaging has
not yet been achieved, and the approach is not ready for routine clinical use.
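Sensitivity, specificity, and accuracy are all derived from the same confusion matrix, and accuracy is a prevalence-weighted average of the other two. The following sketch illustrates this relationship with hypothetical counts that are not taken from any of the cited studies; the function name is likewise an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Basic diagnostic metrics from the four cells of a confusion matrix."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),   # fraction of positives that are detected
        "specificity": tn / (tn + fp),   # fraction of negatives that are correctly ruled out
        "accuracy": (tp + tn) / total,   # fraction of all cases classified correctly
        "prevalence": (tp + fn) / total,
    }


# Hypothetical cohort of 100 patients in which 10 carry the alteration.
m = diagnostic_metrics(tp=9, fn=1, tn=60, fp=30)

# Accuracy as the prevalence-weighted average of sensitivity and specificity.
weighted = (
    m["prevalence"] * m["sensitivity"]
    + (1 - m["prevalence"]) * m["specificity"]
)
```

When the alteration is rare, as in this toy cohort, accuracy lies close to specificity and says little about how well positive cases are detected, which is why the three values are usually reported together.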
Multi-omics Approaches for Response Prediction in Lung Cancer and Therapy Monitoring
A deeper understanding of the tumor biology of lung cancer is emerging as increasing
attention is given to the heterogeneity and microenvironment of tumors. Even though
the identification of the described molecular biomarkers as therapy targets has had
a considerable impact on patient outcome, not all patients with corresponding molecular
alterations benefit from targeted therapies. Moreover, patients initially benefitting
from these therapies often experience recurrence or (hyper-)progression due to acquired
therapy resistance of the tumors [37 ]
[38]. Neoplastic transformation results from the accumulation of genetic and epigenetic
alterations, so that a variety of genetic alterations can coexist within the resulting
macroscopic tumor. In lung adenocarcinomas, for example, two or more histopathological
subtypes have been described, with regions showing different degrees of differentiation,
proliferation, vascularity, inflammation, and invasiveness [39] [40]. Other factors contributing to the heterogeneity of tumors are epigenetic alterations
and the tumor microenvironment [41]. The resulting heterogeneity within tumors, in the microenvironment,
and between patients has been found to contribute to differences in survival and tumor
recurrence, as cells that are not susceptible to the administered therapy can replace
cells that have been successfully destroyed [42]. Additionally, the accumulation of mutations is a dynamic process that does not
stop at the time of initial diagnosis but may lead to resistance mechanisms being
acquired during the course of therapy [43]. In clinical practice, this leads to the need for tissue re-biopsy of
progressing lesions at the time of tumor progression or recurrence.
Radiomics and artificial intelligence-based analysis might offer a noninvasive alternative
as it allows for the assessment of the entire tumor volume and can be applied to every
radiological follow-up examination, and in theory to every tumor manifestation throughout
the body. To further improve the success of targeted therapies, it will be crucial
to find better ways to take tumor heterogeneity into account to predict therapy response
and monitor tumor phenotypes during therapy. Some studies have already shown the feasibility
of predicting tumor response using radiomics approaches [44 ] ([Table 3 ]). For example, Xu et al. describe a model that predicts patient outcome by stratifying
patients into low and high mortality-risk groups that strongly correlate with 2-year
overall survival (HR = 6.16, 95 %CI [2.17,17.44], p < 0.001), achieving an AUC of
0.74 (p < 0.05) for their model [45 ].
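The reported hazard ratio, confidence interval, and p-value can be checked for internal consistency with a short calculation. The sketch below assumes that the 95 % CI was constructed symmetrically on the logarithmic scale, as is usual for Wald intervals from Cox models; this assumption and the function name are not taken from the cited study.

```python
from math import erfc, exp, log, sqrt


def wald_check(hr, ci_low, ci_high, z_crit=1.96):
    """Back-calculate the Wald z statistic and two-sided p-value from an HR and its 95 % CI.

    Assumes the interval was built as log(HR) +/- z_crit * SE.
    """
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = log(hr) / se
    p_two_sided = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    # For a log-symmetric interval the point estimate is the geometric mean of the bounds.
    implied_hr = exp((log(ci_low) + log(ci_high)) / 2)
    return {"se": se, "z": z, "p": p_two_sided, "implied_hr": implied_hr}


# Values as reported in the text: HR = 6.16, 95 % CI [2.17, 17.44].
check = wald_check(hr=6.16, ci_low=2.17, ci_high=17.44)
```

Under this assumption the geometric mean of the interval bounds lies close to the reported hazard ratio and the back-calculated p-value is below 0.001, so the reported figures are mutually consistent. The wide interval nevertheless indicates considerable uncertainty about the size of the effect.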
In addition to structural radiomics approaches, functional imaging can also be analyzed
using machine-learning approaches. Lung cancer patients receive 18F-FDG PET/CT examinations
in the clinical routine to assess tumor glucose metabolism as a parameter of tumor
activity. Studies have shown that PET-based radiomics can predict clinical outcomes
[30]. For example, Mu et al. developed a model identifying NSCLC patients who are most
likely to benefit from immunotherapy, with AUC values of 0.86 (95 % CI 0.79–0.94), 0.83
(95 % CI 0.71–0.94), and 0.81 (95 % CI 0.68–0.92) in the training, retrospective test,
and prospective test cohorts, respectively [46].
Integration of structural and functional imaging biomarkers in a multi-omics approach
could lead to further improvement of tumor phenotyping with the inclusion of formerly
uncorrelated data from new sources [47]. One such new data source is the quickly emerging field of liquid biopsies,
i. e., the analysis of peripheral blood samples. In lung cancer, liquid biopsies offer the possibility
to extract genetic information from tumors longitudinally without the need for tissue
(re-)biopsy. The genetic information is derived from tumor-specific cell-free DNA (cfDNA) that originates
from dying or circulating tumor cells and can give an overview of the genetic alterations
of the tumor [41], thus offering another way to take tumor heterogeneity into account and potentially
allowing for therapy response prediction. Janke et al., for example, report a highly
significant marker panel indicating therapeutic response (R² = 0.78, R² = 0.71, and R² = 0.71) in patients with advanced non-small cell lung cancer receiving chemotherapy
or targeted therapies [48]. Studies have already established multi-parametric approaches in smaller data sets.
Cucchiara et al. show that the combination of radiomics analyses and liquid biopsy
results can be used to monitor mutation status in NSCLC patients over the course of
treatment, with promising performance in predicting EGFR mutation status
(R² = 0.447, p < 0.001) [49].
Conclusion
The use of radiomics in combination with artificial intelligence-based approaches
allows for the identification of novel imaging biomarkers. In non-small cell lung
cancer patients, radiogenomics describes the combination of radiomic data with tumor
genome mutation status, which is a promising approach for the identification of therapy
targets, such as EGFR mutations, ALK rearrangements, or PD-L1 expression rates, with
several prediction models showing encouraging results. Furthermore, radiomics could
help to solve the pressing clinical need for early assessment and prediction of therapy
response, as not all patients with corresponding molecular alterations benefit from
targeted therapies and in many cases tumor recurrence is observed over time. To reach
this goal, advanced tumor phenotyping is required, which could be achieved by integrating
structural and functional imaging biomarkers with clinical data sources, such as genomics
approaches using liquid biopsy results, in a multi-omics approach. However, to allow
radiomics- and artificial intelligence-based approaches to be introduced into clinical
practice, further standardization using large, multi-center and multi-vendor datasets
is required.