Keywords Thyroid nodule characterization - US-Elastography - TIRADS - CEUS - AI
Introduction
Thyroid nodules are common findings in the general population. Depending on the population examined and the technique used, nodules are found in 20–70 % of people [1]. Individual risk factors include, in particular, a positive family history of nodules [2] as well as long-standing iodine deficiency [3], which continues to be a relevant health policy problem [4]. Programs to improve iodine supply, especially through iodization of table salt, have led to a decrease in nodules [5]. Nevertheless, barriers to iodine fortification persist, mainly in implementation and management, and a lack of political support seems to play an important role [6].
Ultrasound is the most sensitive method for detecting thyroid nodules [7] [8]. It is widely available, not burdensome for the patient, and free of ionizing radiation. However, the increasing availability and use of ultrasound has also led to an increase in the diagnosis of thyroid nodules [9]. While the vast majority of nodules are benign and of no relevance to the patient, a small proportion are malignant and require further diagnosis and treatment [10]. Data regarding the probability of malignancy vary greatly and range from < 1 % to 20 %. This variation is mainly due to differences in the pre-selection of thyroid nodules included in studies rather than to regional differences in malignancy rates. In a large cohort with a level of risk close to that of the general population, the malignancy rate of nodules in Central Europe is around 1 % or lower [11]. However, in many studies using ultrasound for the characterization of nodules, the pretest probability and, therefore, the malignancy rate is 10 % or even higher, which has a significant impact on the value of the testing procedures used.
Before using sonography as a method for the detection and characterization of thyroid nodules, the right indication for thyroid sonography should be ensured [12]. Autopsy studies suggest that papillary microcarcinomas of the thyroid are present in up to 30 % of older people without being clinically relevant [13]. Screening programs for thyroid nodules, particularly in South Korea, have shown that although screening leads to a massive increase in diagnosed thyroid carcinomas, surgery does not lead to a reduction in mortality [14]. On the other hand, as expected, there was a significant increase in surgical complications such as recurrent laryngeal nerve (RLN) palsy (about 2 %) and postoperative hypoparathyroidism (about 11 %). In addition, the increase in thyroid carcinomas was almost entirely due to the increase in papillary thyroid carcinomas, mostly papillary thyroid microcarcinomas (PTMC) [14]. In contrast, these screening programs did not detect other entities [15]. However, the clinical relevance of PTMC, especially in the elderly, is often low. Increasingly, even if the diagnosis of PTMC is confirmed, active surveillance rather than surgery is recommended [16]. In countries without screening programs, on the other hand, such an increase in the diagnosis of thyroid carcinoma has not been observed [17]. Also, the long-term prognosis seems to be similar for incidentally and non-incidentally diagnosed PTMC, as long as there are no symptoms suspicious for metastases [18]. Recommendations from the United States, therefore, discourage screening for thyroid carcinoma in asymptomatic adults [19].
Nevertheless, thyroid ultrasound is widely used in clinical practice. Thyroid nodules are frequently detected incidentally through various imaging modalities, with ultrasound being the most common (for example, in patients undergoing carotid duplex scanning), followed by computed tomography (CT), magnetic resonance imaging (MRI), and 2-[18F]fluoro-2-deoxy-D-glucose (FDG) positron emission tomography (PET) [20]. The task and challenge are therefore not primarily to detect thyroid nodules but to characterize them correctly on sonography [21], to identify the few malignant nodules among the large number of benign ones, and to direct these patients to adequate further diagnostic workup and, if necessary, treatment [22]. For this reason, the recommendations for comprehensive thyroid ultrasound evaluation in Germany, especially for the elderly, emphasize that imaging procedures should only be employed once the existence of hormonal disease has been confirmed. Additionally, sonographic screening for thyroid disease is not recommended in the elderly [23].
Individual sonographic criteria for malignancy such as irregular margins, microcalcifications, hypoechogenicity, and a taller-than-wide shape are not sufficiently sensitive and specific when used as single criteria [24]. In recent years, systems combining individual criteria, so-called TIRADS (Thyroid Imaging Reporting and Data Systems), have therefore become established for risk stratification of thyroid nodules [25]. Most TIRAD systems are based exclusively on B-mode sonography criteria. Newer tools for the sonographic assessment of thyroid nodules are the measurement of tissue stiffness using ultrasound, so-called elastography [26], and contrast-enhanced ultrasound (CEUS) [27]. The different TIRAD systems used today all have a similar sensitivity and a comparably high negative predictive value (NPV) [28]. However, they differ in how far they reduce unnecessary further diagnostic workup such as fine-needle aspiration (FNA) [29]. Compared to the practice patterns of experts without knowledge of TIRADS criteria, TIRAD systems outperformed experts, with a significant reduction in unnecessary FNAs [30].
However, TIRAD systems have some limitations. They have mainly been evaluated with respect to the detection of papillary thyroid cancer [31]. In addition, TIRAD systems are not applicable in the presence of an autonomous adenoma [32], which should be ruled out beforehand, e. g., by scintigraphy. TIRAD systems also do not appear to be sufficiently accurate for the detection of medullary thyroid carcinomas [33] and therefore do not replace calcitonin measurement. In addition to the sonographic criteria for malignancy, other risk factors such as age, sex, comorbidities, and location of the nodules within the thyroid gland should always be taken into account [34]. From a clinical perspective in particular, factors that strengthen the indication for FNA include male sex, young age, the presence of a solitary nodule, and the presence of compressive symptoms related to the nodule. Additionally, a family history of medullary thyroid cancer or multiple endocrine neoplasia type 2 (MEN2), childhood head and neck radiation exposure, planned thyroid or parathyroid surgery, and patient preferences can influence the decision to perform FNA.
From a genetic standpoint, the presence of syndromic monogenic thyroid susceptibility and a strong family history of thyroid cancer (more than two relatives affected) are considerations.
In terms of the diagnostic workup, elevated serum calcitonin levels can strengthen the indication for FNA or commonly also lead to a preference for surgery without the need for previous needle aspiration.
Finally, nuclear medicine imaging can serve as a diagnostic criterion, with 18-FDG and MIBI uptake influencing the decision to perform FNA [29] [35] [36].
Conversely, certain factors can diminish the need for FNA. A long personal history of a stable or slowly growing multinodular goiter (MNG) suggests a lower likelihood of malignancy. Limited life expectancy and significant comorbidities may also influence the decision against FNA, considering the potential risks and benefits in the context of the patient’s overall health. Additionally, patient preference plays a role, as individual choices and values can impact the decision-making process. Furthermore, a family history of benign nodular thyroid disease may contribute to a less aggressive approach.
In terms of biological tests, subnormal thyrotropin levels may provide further context, potentially indicating a less aggressive nature of the thyroid nodule.
Nuclear medicine imaging findings can also guide the decision. The presence of autonomous nodules on isotope scans may suggest a lower likelihood of malignancy, providing valuable information in the assessment process.
Therefore, a comprehensive evaluation involves considering both factors that strengthen and weaken the indication for FNA. This ensures a nuanced and individualized approach to managing thyroid nodules, taking into account the diverse characteristics and circumstances of each patient [37] [38].
On the other hand, the role of scintigraphy in the diagnosis of thyroid nodules, and in particular in the identification of malignant thyroid nodules using radionuclide imaging, is still controversial. A recent extensive meta-analysis by Xin Song et al. found that radionuclide scanning had some diagnostic value for thyroid nodules but poor overall diagnostic accuracy. Some studies considered in that meta-analysis argue in particular that, when radionuclide imaging is used for diagnostic purposes, serum TSH levels should be measured at the same time to improve the accuracy of the diagnosis. In fact, autonomous nodules can be considered benign [39].
This paper gives an up-to-date overview of standards and new developments in thyroid nodule sonography. In addition to the individual sonographic criteria, the so-called TIRAD systems for risk stratification and other techniques such as elastography, contrast-enhanced ultrasound (CEUS), and artificial intelligence applications are presented.
Ultrasound risk-stratification systems for thyroid nodules
When thyroid nodules are diagnosed and brought to our attention, our primary aim is to determine whether these lesions are benign or malignant. Indeed, thyroid ultrasound (US) is the first-line tool for estimating the risk of malignancy [40]. It is universally recommended by the international clinical practice guidelines to differentiate nodules that can be safely referred for periodic surveillance from those that require second-level examinations [i. e., cytology assessment through fine-needle aspiration biopsy (FNAB)]. When surveillance is selected, thyroid nodule US guides clinicians with respect to follow-up intervals and duration [41] [42].
The steps to describe thyroid nodules and to estimate their sonographic risk of malignancy are as follows:
Describe the composition: cystic, mixed, solid.
Describe the echogenicity: hyperechoic, isoechoic, mildly or markedly hypoechoic compared to the thyroid parenchyma.
Describe the shape: oval, non-oval (i. e., so-called “taller-than-wide”).
Describe the margins: regular, irregular (i. e., lobulated, spiculated).
Look for: echogenic foci/calcifications, signs of extrathyroid extension, suspicious loco-regional lymph nodes.
All the above-mentioned descriptors, combined in different ways, contribute to the definition of specific US categories characterized by increasing risks for malignancy, ranging from low (< 5 %) to intermediate (5–20 %) and high (> 20 %) ([Fig. 1]). These categories represent the core units of US-based risk stratification systems (RSSs), also referred to as Thyroid Imaging Reporting and Data Systems (TIRADS) [41] [42] [43] [44] [45] [46]. Several RSSs have been proposed by the international scientific societies and subsequently validated in longitudinal observational studies [29] [47]. No clear hierarchy has been reported, a fact that usually leads clinicians to choose a specific RSS based on the geographical area of origin or on their own familiarity with the system [48]. The logic and the rules applied in the risk estimation process are fundamentally the same across all RSSs. A number of cardinal sonographic features are consistently associated with the highest risks of malignancy. They include marked hypoechogenicity, taller-than-wide shape, irregular margins, punctate echogenic foci, signs of extrathyroid extension, mainly when associated with a solid composition and the presence of suspicious lymph nodes. The identification of any of these features is sufficient to classify the nodule as high risk. There are also low-risk lesions, characterized by a cystic or mixed composition, the latter having a solid component that is iso- or hyperechoic with respect to the surrounding thyroid tissue. All other nodules (solid or predominantly solid, mildly hypoechoic) fall into the intermediate risk category. The American College of Radiology TIRADS represents an exception to these rules [44]. In this system, the key sonographic features are assessed individually and rated with a numerical score, with the sum of the scores determining the thyroid nodule risk category. Within each RSS category, specific size cutoffs are used to select nodules to be referred for FNAB or simple surveillance.
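The generic pattern-based logic described above can be sketched in a few lines of Python. This is only an illustration of the rules as stated in this section, not the algorithm of any specific RSS: the class `NoduleDescription`, the function `risk_category`, and the string labels are hypothetical names chosen for the example, and size cutoffs for FNAB as well as the point-based ACR TIRADS are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional

# Malignancy risk ranges per category, as quoted in the text.
RISK_RANGES = {"low": "< 5 %", "intermediate": "5–20 %", "high": "> 20 %"}


@dataclass
class NoduleDescription:
    """Hypothetical container for the descriptors listed in the text."""
    composition: str                    # "cystic", "mixed", or "solid"
    echogenicity: Optional[str] = None  # solid part: "hyperechoic", "isoechoic",
                                        # "mildly hypoechoic", "markedly hypoechoic";
                                        # None for a purely cystic nodule
    taller_than_wide: bool = False
    irregular_margins: bool = False
    punctate_echogenic_foci: bool = False
    extrathyroid_extension: bool = False
    suspicious_lymph_nodes: bool = False


def risk_category(nodule: NoduleDescription) -> str:
    """Return "low", "intermediate", or "high" following the generic rules."""
    if nodule.composition not in ("cystic", "mixed", "solid"):
        raise ValueError(f"unknown composition: {nodule.composition!r}")

    # Any single cardinal feature is sufficient for the high-risk category.
    if (
        nodule.echogenicity == "markedly hypoechoic"
        or nodule.taller_than_wide
        or nodule.irregular_margins
        or nodule.punctate_echogenic_foci
        or nodule.extrathyroid_extension
        or nodule.suspicious_lymph_nodes
    ):
        return "high"

    # Low risk: cystic, or mixed with an iso- or hyperechoic solid component.
    if nodule.composition == "cystic":
        return "low"
    if nodule.composition == "mixed" and nodule.echogenicity in ("isoechoic", "hyperechoic"):
        return "low"

    # Every other nodule is treated as intermediate risk in this sketch.
    return "intermediate"


if __name__ == "__main__":
    example = NoduleDescription("solid", "mildly hypoechoic")
    category = risk_category(example)
    print(category, RISK_RANGES[category])
```

Because the sketch follows only the wording of this section, a solid iso- or hyperechoic nodule without suspicious features falls into the intermediate category; individual RSSs may classify such a nodule differently.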
Fig. 1 Diagnostic pathway to stratify the risk of thyroid nodule malignancy using neck ultrasound.
The introduction of RSSs into clinical practice has a number of advantages. First of all, it introduced a common and standardized language that facilitates communication between specialists and researchers. In addition, it has substantially improved the interobserver and even intraobserver agreement regarding thyroid nodule risk classification [49]. More importantly, RSSs have the potential to significantly reduce the rate of FNABs without missing a significant number of malignancies [29]. The false-negative rates range between 2 % and 4 %, depending on the RSS. It is noteworthy that these percentages are similar to the false-negative rates observed with benign FNAB results, making the “benign” sonographic pattern an equivalent of benign cytology. This is a significant achievement with respect to avoiding unnecessary or even potentially harmful procedures.
Despite these considerations, there is still much to be done in the process of improving thyroid nodule US risk stratification [48]. The proliferation of multiple RSSs may be confusing for users, as each system adopts different definitions of the US features being assessed, assigns different relative weights to individual sonographic characteristics, and applies different size cutoffs for FNAB recommendations. To address these specific issues, an International Thyroid Nodule Ultrasound Working Group is currently attempting to devise a unified RSS to be disseminated and used worldwide [50]. A further limitation lies in the fact that these systems have been specifically developed and validated to detect papillary thyroid carcinomas [51]. This limits their applicability to other types of neoplasia, such as follicular and medullary thyroid cancers.
US-elastography
Ultrasound elastography is a valuable software tool for complementing B-mode examinations and assessing the risk of malignancy of thyroid nodules based on their increased stiffness compared to the surrounding normal parenchyma. Its use has been incorporated into the WFUMB (World Federation for Ultrasound in Medicine and Biology) [26] and EFSUMB (European Federation of Societies for Ultrasound in Medicine and Biology) guidelines [52]. These guidelines underline the fact that the clinical use of strain elastography (SE) must be carefully considered. If nodules are suspicious on ultrasound (US), fine-needle aspiration (FNA) is recommended, while benign SE features without suspicious US findings may not warrant active intervention. SE is valuable in cases of non-diagnostic cytology, but operator experience is crucial to avoid false positives. Limitations include interference from nodule characteristics, dependence on operator experience, and motion artifacts. Transverse scans are less suitable for elastography due to carotid pulsations, while longitudinal scans provide better reference tissue.
There are two techniques:
SWE (shear wave elastography), which consists of quantitative evaluation of shear wave propagation, expressed as velocity in m/s or as stiffness in kPa, by placing a region of interest (ROI) in the suspicious nodule. It includes two methods which differ with respect to the ROI’s size: point shear wave elastography (p-SWE, with ROI < 1 cm³) and multidimensional shear wave elastography (2D-SWE and 3D-SWE) [5]. Three to 10 measurements in the thyroid nodule are required for p-SWE, while three measurements are taken for 2D-SWE. In SWE, the probe is positioned perpendicular to the lesion using controlled compression to minimize vertical movement artifacts. Experts also suggest minimizing carotid pulsation artifacts and avoiding areas with artifacts (calcifications or cysts);
SRE (strain ratio elastography), a semiquantitative evaluation with higher sensitivity and specificity than SWE, which describes the behavior of the relevant tissue before and after mechanical compression with the probe. As a result, the thyroid nodule is colored according to its stiffness on the elastogram (the colored map). Depending on the color scale of the equipment, stiff nodules are shown in red or blue, while soft nodules are shown in the opposite color and intermediate stiffness is represented by hues of green.
There are different ways to perform a semiquantitative SRE evaluation ([Fig. 2]). The most common is to place a circular ROI inside the nodule and a second one in the adjacent healthy parenchyma. Experts suggest using ROIs of similar size at similar depths. Strain ratio values are calculated automatically by the software as the ratio between the strain in the ROI in the healthy parenchyma and the strain in the ROI in the nodule [4] [6].
Fig. 2 Elasticity classification with colorimetric map according to Gürkan Dumlu E. et al. [54]
In detail, the WFUMB [27] and EFSUMB [52] guidelines recommend, respectively:
SE can be used with conventional US to improve specificity;
Ultrasound elastography of the thyroid could be used as part of nodule characterization, particularly when using semi-quantitative methods.
More recently, the accuracy of US-elastosonography (USE), especially with the strain technique as reported by Cantisani et al. [53], was confirmed by a meta-analysis published in March 2022 on the diagnostic performance of USE for assessing the malignancy risk of thyroid nodules [54]. It showed pooled sensitivity, specificity, and area under the curve (AUC) of 84 %, 81 %, and 0.89, respectively, for qualitative USE; 83 %, 80 %, and 0.93, respectively, for semi-quantitative USE; and 78 %, 81 %, and 0.87, respectively, for quantitative USE. In short, all the techniques have good accuracy, but qualitative and semiquantitative USE both have better accuracy than quantitative USE.
Even though no consensus has been reached on the cut-off value to be used for the strain ratio (SR) (< 1.5 for benign nodules and > 5 for malignant nodules have been suggested), SR has been shown to have less interobserver variability and to be easier to evaluate than the color map ([Fig. 3], [4]) [55] [56].
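As a minimal sketch of the semiquantitative evaluation described above, the strain ratio can be computed from the mean strain in the two ROIs and compared with the suggested cut-offs. The function names and the label "indeterminate" for values between the two cut-offs are assumptions made for this example; in practice the ratio is calculated by the vendor software, and the cut-offs are suggestions rather than consensus values.

```python
def strain_ratio(parenchyma_strain: float, nodule_strain: float) -> float:
    """Ratio of the strain in healthy parenchyma to the strain in the nodule.

    A stiff nodule deforms less than the surrounding tissue, so its strain
    is lower and the resulting ratio is higher.
    """
    if parenchyma_strain <= 0 or nodule_strain <= 0:
        raise ValueError("strain values must be positive")
    return parenchyma_strain / nodule_strain


def interpret_strain_ratio(sr: float,
                           benign_below: float = 1.5,
                           malignant_above: float = 5.0) -> str:
    """Apply the suggested (non-consensus) cut-offs quoted in the text."""
    if sr < benign_below:
        return "suggests benign"
    if sr > malignant_above:
        return "suggests malignant"
    # The text gives no rule for values between the cut-offs; the label
    # used here is an assumption of this sketch.
    return "indeterminate"


if __name__ == "__main__":
    sr = strain_ratio(parenchyma_strain=3.0, nodule_strain=0.5)
    print(sr, interpret_strain_ratio(sr))
```

Keeping the cut-offs as parameters reflects the fact that different equipment may yield different stiffness values and thresholds.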
Fig. 3 Elasticity classification with colorimetric map according to Asteria et al. [55]
Fig. 4 a, b On B-mode US, the lesion appeared well marginated, encapsulated, wider-than-tall, heterogeneous but mostly isoechoic; no internal microcalcifications were identified (ACR-TIRADS 2); c On color Doppler, the lesion showed peripheral vascular flow; d, e On CEUS evaluation, the lesion showed a similar contrast enhancement curve to that of the surrounding thyroid parenchyma. On FNAC, the lesion was classified as benign (Tir 2 or category II according to the Bethesda classification system).
CEUS
CEUS (contrast-enhanced US) is a valid technique for the characterization of lesions in various organs such as the liver and breast. It is performed after the intravenous administration of a bolus of contrast medium consisting of microbubbles that contain a sulfur derivative (sulfur hexafluoride). It has a very low incidence of adverse effects and can be used in patients with kidney insufficiency [57]. Furthermore, ultrasound contrast agents have the advantage of not containing iodine. This is particularly significant for patients with thyroid cancer, as radioiodine treatment is hindered by the prior administration of iodine-containing contrast agents.
With CEUS, both qualitative and quantitative evaluation is possible:
The qualitative evaluation considers the time of entry of the contrast medium into the affected nodule and the peak of enhancement, which can be higher than, lower than, or equal to that of the surrounding parenchyma (hyper-, hypo-, or iso-enhancement, respectively) or can be absent.
The quantitative evaluation is carried out by comparing the enhancement values obtained from an ROI placed within the affected nodule and an ROI placed in the healthy parenchyma and calculating the rise time, time to peak (TTP; time until peak intensity is reached), wash-in slope, peak intensity, mean transit time (time during which intensity values are higher than the mean), and area under the time-intensity curve.
In particular, some recent studies argue that time-intensity curve (TIC) analysis shows statistically significant differences between benign and malignant lesions in the time-to-peak (TTP) measurement. A study by Brandenstein et al. demonstrated that carcinomas exhibited an earlier peak after the injection of the contrast agent, approximately 6 seconds ahead of benign nodules, both centrally and marginally. The center of carcinomas, in particular, showed the fastest average TTP at 13.2 seconds. Notably, the relative TTP values for benign and malignant nodules were comparable, at around 90 % [58].
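To make the quantitative parameters more concrete, the following sketch derives peak intensity, time to peak, a wash-in slope, and the area under the curve from a sampled time-intensity curve. It is a simplified illustration under stated assumptions: the function name `tic_parameters` is hypothetical, the first sample is taken as the time of injection and as the baseline, the wash-in slope is approximated as the average rate of rise from baseline to peak, and the area is obtained with the trapezoidal rule. Commercial quantification software may define these parameters differently, and the sample values are invented for the example.

```python
from typing import Dict, Optional, Sequence


def tic_parameters(times: Sequence[float],
                   intensities: Sequence[float],
                   baseline: Optional[float] = None) -> Dict[str, float]:
    """Derive simple descriptors from a time-intensity curve (TIC).

    times: sample times in seconds, strictly increasing; the first sample
           is assumed to coincide with the contrast injection.
    intensities: enhancement values (arbitrary units) measured in one ROI.
    """
    if len(times) != len(intensities) or len(times) < 2:
        raise ValueError("need at least two (time, intensity) samples")
    if any(t1 <= t0 for t0, t1 in zip(times, times[1:])):
        raise ValueError("times must be strictly increasing")
    if baseline is None:
        baseline = intensities[0]

    # Index of the first occurrence of the maximum intensity.
    peak_index = max(range(len(intensities)), key=lambda i: intensities[i])
    peak_intensity = intensities[peak_index]
    time_to_peak = times[peak_index] - times[0]

    # Simplified wash-in slope: average rise per second from baseline to peak.
    wash_in_slope = (peak_intensity - baseline) / time_to_peak if time_to_peak > 0 else 0.0

    # Area under the curve by the trapezoidal rule.
    auc = sum((t1 - t0) * (y0 + y1) / 2.0
              for t0, t1, y0, y1 in zip(times, times[1:], intensities, intensities[1:]))

    return {
        "peak_intensity": float(peak_intensity),
        "time_to_peak": float(time_to_peak),
        "wash_in_slope": float(wash_in_slope),
        "auc": float(auc),
    }


if __name__ == "__main__":
    t = [0, 5, 10, 15, 20]                                  # seconds, invented values
    nodule = tic_parameters(t, [0, 10, 30, 20, 10])
    parenchyma = tic_parameters(t, [0, 5, 15, 25, 20])
    # A negative difference means the nodule peaks earlier than the parenchyma.
    print(nodule["time_to_peak"] - parenchyma["time_to_peak"])
```

Computing the same descriptors for the nodule ROI and the parenchyma ROI allows the kind of comparison described above, for example the difference in TTP between the two regions.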
As reported by Sorrenti et al. [59 ], CEUS can be used:
to identify variations in perfusion in lesions of interest and consequently differentiate benign lesions from malignant ones, especially when the cytological evaluation is indeterminate;
to define the region that will undergo minimally invasive treatment (e. g., thermal ablation), especially as a diagnostic imaging tool for the follow-up of lesions treated on a minimally invasive basis.
However, the use of CEUS for thyroid nodule characterization is controversial. Unfortunately, its parameters have not yet been standardized and do not have sufficient specificity or sensitivity to diagnose a nodule’s malignancy (pooled sensitivity, specificity, positive predictive value, and negative predictive value for CEUS of 85 %, 82 %, 83 %, and 85 %, respectively; Trimboli et al. 2020). Therefore, the technique is not yet recommended by EFSUMB for routine clinical practice, but it remains a valid tool for research [60] ([Fig. 4]).
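Predictive values such as those quoted above depend on the proportion of malignant nodules in the population studied. As a hedged illustration, the sketch below applies Bayes' rule to the pooled CEUS sensitivity and specificity (85 % and 82 %) at the two malignancy rates mentioned in the Introduction (about 1 % in a population close to the general population and about 10 % in many pre-selected study cohorts). The function name is hypothetical, and the calculation assumes that sensitivity and specificity remain unchanged across populations, which may not hold in practice.

```python
from typing import Tuple


def predictive_values(sensitivity: float,
                      specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given pretest probability."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")

    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    for prevalence in (0.01, 0.10):
        ppv, npv = predictive_values(0.85, 0.82, prevalence)
        print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

Under these assumptions, the positive predictive value is about 4.6 % at a malignancy rate of 1 % and about 34.4 % at 10 %, while the negative predictive value stays at about 99.8 % and 98.0 %, respectively. This illustrates why the pre-selection of nodules has such a strong impact on the apparent value of a test.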
Some preliminary studies suggest that dynamic tumor microvascularization is better evaluated with high-resolution techniques such as HiFR, and that the characterization of peri-tumoral microvascularization is enhanced when high-resolution contrast imaging such as CEUS is employed. Developments in this field are advancing steadily, enabling comprehensive characterization of tissues in terms of capillary blood flow and morphology with an unprecedented level of detail ([Table 1]) [61].
Table 1 Advantages and disadvantages of the techniques described above.
Technique | Advantages | Disadvantages
Strain elastography/shear wave elastography | | No univocal classification; undefined thyroiditis influence; calcific and cystic nodules can be assessed incorrectly; different equipment may give different values of stiffness and different thresholds [5]; elastography has low sensitivity for nodules smaller than 10 mm and low specificity for nodules larger than 20 mm [2]; operator-dependent
CEUS | | Sulfur allergy; not approved in clinical practice by the guidelines; microbubble contrast agent’s duration is only about 5–10 min and the image acquired during the injection of contrast has a low mechanical index; only one nodule can be evaluated for each injection of contrast agent; depends on the operator’s experience; high economic cost of contrast medium; undefined intra/inter-observer agreement
Artificial intelligence
To help physicians reduce their workload and improve the accuracy of their diagnoses, the use of artificial intelligence (AI) has increased over the last two decades [62] [63].
Machine learning is an AI subset that uses statistical methods to improve the performance of algorithms created by human programmers based on data extracted from regions of interest (ROI) with explicit parameters based on expert knowledge.
Deep learning (DL) is a part of machine learning that does not require predetermined features and regions of interest defined by specialists. It can automatically learn representations of information and gain experience from raw data. It includes several different algorithms, essentially belonging to two different groups: convolutional neural networks (CNNs), which are the most commonly utilized architecture, and non-neural networks [64]. The original image can be fed directly into the network, eliminating the need for preprocessing and complex feature extraction procedures that can lead to errors and classification biases [63].
These technologies can be transferred to software used directly by clinicians: computer-aided diagnosis (CAD). CAD systems are actually already available as commercial applications or as systems embedded in US equipment ([Fig. 5 ]).
Fig. 5 a On B-mode US, the lesion appeared markedly hypoechoic, with lobulated margins; some internal microcalcifications were seen (ACR-TIRADS 5); b On qualitative USE evaluation, the lesion appeared stiff under strain and c intermediate-stiff on 2D shear wave elastography; d The automatic evaluation software supported by artificial intelligence algorithms confirmed the malignancy of the nodule, characterized as Tir 5. On FNAC, the lesion was classified as Tir 5. The histological final diagnosis was papillary carcinoma.
Several studies and meta-analyses confirm the benefits of CAD systems for diagnosing thyroid nodules in terms of accuracy, in particular for less experienced sonographers, yielding more systematized results and reducing unnecessary FNAB procedures [62] [64].
In addition, a convolutional neural-network-based CAD program may help to predict the BRAFV600E genetic mutation and may also help to identify nodules with high-risk mutations [62 ].
Like CEUS, AI has not yet been recommended for routine clinical practice but can be used for research and educational purposes.
Currently, AI software is integrated into the diagnostic procedure for multiparametric assessment, complemented by techniques like elastosonography to achieve a more precise evaluation of nodules, although it is not part of standard protocols. Moreover, FNAB is a diagnostically significant test, prompting numerous studies aimed at enhancing its efficacy through the incorporation of artificial intelligence. This involves AI algorithms capable of improving its accuracy, thereby minimizing the margin of error, as suggested in the studies by Dov et al. and Elliott et al. [64] [65] [66]. In particular, further multicenter evaluation is required to assess the potential integration of additional AI systems for differentiation between TI-RADS III and TI-RADS V categories, as the current scientific evidence remains insufficient for validation in the clinical setting [67].
Conclusion
Ultrasound currently plays a crucial role in thyroid nodule management. There are still open issues regarding the use of TIRADS, whether US-elastography should always be used, and the role of AI-based techniques and CEUS.
However, the use of clinical information and laboratory information together with multiparametric ultrasound features is recommended to reduce unnecessary FNAB and treatments.
Scientifically responsible based on certification regulations
According to certification regulations, the person scientifically responsible for this article is Dr. Vincenzo Dolcetti, Roma, Italy.