Keywords
Breast cancer - mammography screening - digital breast tomosynthesis - breast density
- tumor grade
Introduction
Early-stage breast cancer detection aims to reduce advanced tumour stages by diagnosing
breast cancer earlier, thus enabling potential therapeutic advantages and reducing
breast cancer-specific mortality [1]. Mammography is an evidence-based method for systematic early-stage cancer detection
that has been proven to reduce breast cancer mortality [2] [3]. In Germany, a mammography screening programme (MSP) based on the European Guidelines
for women aged between 50 and 69 years has been implemented nationwide, starting in
2005 [4]. With breast cancer being the most common cause of cancer-related death in women,
research into innovative screening strategies is warranted [5].
By reducing superimpositions, which is made possible by moving the X-ray tube along an arc and reconstructing
layers parallel to the detector surface, digital breast tomosynthesis (DBT) achieves
higher breast cancer detection rates than digital mammography (DM), the current
standard in population-based early detection [6].
The large, randomized controlled TOSYMA study, which was embedded in the German mammography
screening programme, showed that DBT plus reconstructed, synthetic mammography (DBT+SM)
was superior to the current standard screening method (DM) in the detection of invasive
breast cancers [7] [8]. The higher cancer detection rate with DBT+SM was observed in particular
for non-advanced invasive breast cancers (UICC stage I) of histological
grade 2 or 3 [9].
It is known that the effect of screening on breast cancer mortality not only depends
on early-stage cancer detection but also on tumour biology. The latter determines
both the speed of tumour growth and the risk of metastasis.
The histological grade is an independent strong prognostic factor for breast cancer,
reflecting tumour biology; it is associated with breast cancer-specific survival and
disease-free survival [10] [11]. Gene expression profiling studies have identified additional useful factors of
breast cancer biology and significantly deepened our understanding of the biology
of the disease; the utility of these factors as prognostic and predictive tools is
currently being evaluated. At the same time, these studies have provided further evidence
of the high level of relevance of the biological characteristics reflected in the
histological grade [12] [13] [14].
The relative reduction in breast cancer-specific mortality attributable to tumour
detection in screening varies depending on tumour grade, as shown in the Swedish screening
programme [11]. Of the breast cancers detected in the UK screening programme that had a fatal outcome,
6%, 37% and 47% were classified as grade 1, grade 2 and grade 3, respectively [15].
Since the desired reduction in breast cancer mortality is largely based on the detection
of histological grade 2 or 3 breast cancers in a screening setting, the aim of this
study was to provide an explorative comparison of the rates of detection of invasive
breast cancers of grades 2 or 3, independent of the stage, between the DBT+SM arm
and the DM arm of the TOSYMA study, while also taking into account breast density.
Materials and Methods
Study design
The multicentre TOSYMA study was conducted in 17 screening units in the German federal
states of North Rhine-Westphalia and Lower Saxony from July 2018 to December 2020.
A total of 99689 women were randomly assigned (1:1) to the test arm (DBT+SM) or the
control arm (DM). The study protocol was approved by the responsible ethics committee
(2016–132-f-S) and assessed by two further ethics committees. Written consent was
obtained from all study participants [16]. The study is registered in the publicly accessible database ClinicalTrials.gov
(NCT03377036). The study protocol, the results for the first primary endpoint with
secondary endpoints, and subanalyses have already been published [7] [8] [9] [16] [17].
Study participants
In Germany, all women between the ages of 50 and 69 years receive an invitation letter
every two years to take part in the Mammography Screening Programme (MSP). In the
catchment areas of the study centres, they received a personal study invitation with
information material in addition to the regular MSP invitation letter. Women diagnosed
with breast cancer within the last 5 years or with a mammogram within the last 12
months were not eligible for MSP participation. Breast implants and repeated TOSYMA
participation were specific exclusion criteria for the TOSYMA study [7] [8] [9] [16] [17].
Set-up of the screening examination
The opportunity to participate in the study was offered in 17 screening units at 21
sites: Lower Saxony Northwest (Wilhelmshaven), Hannover, Lower Saxony North (Stade),
Lower Saxony Central (Vechta), Lower Saxony Northeast (Lüneburg), Duisburg, Krefeld/Mönchengladbach/Viersen,
Wuppertal/Solingen (Bergisches Land/Mettmann district), Aachen-Düren-Heinsberg, Cologne
(Bergisch Gladbach), Münster-South/Coesfeld, Bottrop, Gelsenkirchen, Recklinghausen,
Minden-Lübbecke/Herford, Bielefeld/Gütersloh, Hamm/Unna/Märkischer Kreis (Schwerte),
Höxter, Paderborn, Soest (Lippstadt), Münster-North/Warendorf.
Seven different mammography systems were used to provide the DBT+SM
or DM examination: Fujifilm Corporation, Amulet Innovality, Tokyo, Japan (n=10075);
IMS Giotto, Class Tomo, Sasso Marconi, Italy (n=7970); Hologic, Lorad Selenia 3Dimensions,
Marlborough, US (n=10955); Hologic, Lorad Selenia Dimensions, Marlborough, US (n=40645);
Siemens Healthineers, MAMMOMAT Inspiration, Erlangen, Germany (n=6759); Siemens Healthineers,
MAMMOMAT Revelation, Erlangen, Germany (n=12917); GE Healthcare, Senographe Essential,
Chicago, US (n=10237).
In both study arms, the examinations comprised the craniocaudal and the mediolateral
oblique views of each breast. In addition to synthetic, 2-dimensional mammograms (SM),
stacked slices of ≤1mm thickness were reconstructed to create the images for reading
(DBT) [7] [8] [9] [16] [17].
Reading of the screening examination and diagnostic work-up
As in the current MSP, independent double reading was performed in both study arms
by the same certified readers. The screening study comprised a total of 83 experienced
readers with at least 2 years of prior screening experience and more than 5000 screening
readings per year. DBT training was provided prior to the start of the TOSYMA study
in the Reference Centre for Mammography Münster. Based on the DM and SM images, breast
density was visually assigned to the categories A (fatty), B (fibroglandular), C (heterogeneously
dense) or D (extremely dense) [18] [19]. There were 4 to 8 readers at each site. They received their list of study examinations
in a mixed sequence of the two study arms without being able to identify the study
arm prior to selecting the examination in the screening software. In the case of suspicious
findings, the results were discussed in the consensus conference with the physician
responsible for the programme to decide whether a further diagnostic work-up was indicated.
The diagnostic work-up after study participation was not different from the established
procedure of the MSP and comprised, besides the clinical examination, additional mammography
views (e.g., magnification mammograms or DBT), ultrasonography, MRI scans, and invasive
diagnostic interventions.
Each of the 32 pathologists involved produced at least 100 screening diagnoses per
year and took part in a mandatory continuing education course every 2 years, in addition
to the self-auditing procedures. The training focused on the internationally recommended
Nottingham Grading System, based on semi-quantitative scoring (1 to 3) of glandular
differentiation, nuclear pleomorphism and mitotic rate per square millimetre (G1:
total score 3–5, G2: total score 6–7, G3: total score 8–9) [4] [20] [21].
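The score-to-grade mapping described above can be illustrated with a short Python sketch. The function name and argument names are hypothetical and chosen for illustration only; the thresholds (3–5, 6–7, 8–9) are those stated in the text, and the sketch is not the software used in the study.

```python
def nottingham_grade(glandular: int, pleomorphism: int, mitoses: int) -> int:
    """Map three Nottingham component scores (each 1 to 3) to grade 1, 2 or 3.

    Hypothetical helper for illustration; thresholds follow the text:
    total 3-5 -> G1, total 6-7 -> G2, total 8-9 -> G3.
    """
    scores = (glandular, pleomorphism, mitoses)
    if any(s not in (1, 2, 3) for s in scores):
        raise ValueError("each component score must be 1, 2 or 3")
    total = sum(scores)
    if total <= 5:
        return 1
    if total <= 7:
        return 2
    return 3


if __name__ == "__main__":
    # A tumour scored 3 (glandular), 2 (pleomorphism), 2 (mitoses) has total 7.
    print(nottingham_grade(3, 2, 2))
```

For the present subanalysis, grades 2 and 3 returned by such a mapping would be pooled into one stratum.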
All screening data were stored in the screening documentation system MaSc (KV-IT GmbH,
Dortmund, Germany) [8] [9] [16] [17].
Study data and statistical analyses
This subanalysis included 49,479 participants in the DBT+SM arm and 49,689 participants
in the DM arm with complete screening documentation, including visual density categorization
([Fig. 1]). Descriptive analyses with stratification of invasive breast cancer detection rates (iCDRs) by
histological grade (grade 1 vs. grade 2 or grade 3) and breast density (A+B: non-dense
breast vs. C+D: dense breast) were performed for each study arm [18] [19]. When the two breasts differed in density, the higher category was documented; when
independent double reading led to discordant categorization of breast density, the
highest density category was used [9] [16] [17] [18] [19].
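The density rule described above (the highest category across both breasts and both readers, then dichotomized into non-dense A+B vs. dense C+D) can be sketched in Python as follows. The function names and the flat list of readings are assumptions made for illustration; they do not reflect the data model of the screening documentation system.

```python
DENSITY_ORDER = "ABCD"  # A (fatty) < B < C < D (extremely dense)


def documented_density(readings):
    """Return the highest density category among all readings.

    `readings` is a hypothetical flat list of category letters, e.g. one
    entry per breast and per reader of a single examination.
    """
    cats = [r.upper() for r in readings]
    if not cats or any(c not in DENSITY_ORDER for c in cats):
        raise ValueError("readings must be a non-empty list of A, B, C or D")
    return max(cats, key=DENSITY_ORDER.index)


def is_dense(category):
    """Dichotomize: C or D counts as dense, A or B as non-dense."""
    return category in ("C", "D")


if __name__ == "__main__":
    # Reader 1: left B, right C; reader 2: left B, right B.
    category = documented_density(["B", "C", "B", "B"])
    print(category, is_dense(category))
```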
Fig. 1 Flowchart of randomization in the TOSYMA study and inclusion in this subanalysis.
TOSYMA: TOmosynthesis plus SYnthesized MAmmography trial; DBT+SM: digital breast tomosynthesis
and synthetic mammography; DM: digital mammography.
The findings are presented as the absolute number of invasive breast cancers and as
invasive breast cancer detection rates (iCDR, per 1000 women screened) in the two
study arms as well as their respective differences. The resulting estimates of the
risk difference are reported with a 95% Wald confidence interval (CI). Given the explorative
nature of these analyses, no adjustments for multiple comparisons were made and no
p-values are provided. The statistical analyses were performed using SAS version 9.4
(SAS Institute, Cary, NC, USA).
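As an illustration of the quantities described above, the following Python sketch computes a detection rate per 1000 women screened and an unadjusted risk difference with a 95% Wald confidence interval. It is a minimal sketch using the standard textbook Wald formula, not the SAS code of the study; the function names are hypothetical, and the resulting interval may differ slightly from the published one if the original procedure used other options.

```python
from math import sqrt

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def detection_rate_per_1000(cancers, screened):
    """Detection rate expressed per 1000 women screened."""
    return 1000.0 * cancers / screened


def wald_risk_difference(x_test, n_test, x_ctrl, n_ctrl, z=Z_95):
    """Risk difference (test minus control) with Wald CI, all per 1000.

    Returns a tuple (difference, lower bound, upper bound).
    """
    p_t = x_test / n_test
    p_c = x_ctrl / n_ctrl
    diff = p_t - p_c
    # Wald standard error: square root of the sum of the binomial variances
    se = sqrt(p_t * (1 - p_t) / n_test + p_c * (1 - p_c) / n_ctrl)
    return tuple(1000.0 * v for v in (diff, diff - z * se, diff + z * se))


if __name__ == "__main__":
    # Grade 2 or 3 cancers as reported in the Results:
    # DBT+SM 250/49479, DM 177/49689
    diff, lower, upper = wald_risk_difference(250, 49479, 177, 49689)
    print(f"{diff:.2f} per 1000 (95% CI {lower:.2f} to {upper:.2f})")
```

With the grade 2 or 3 counts from the Results, the difference is about 1.5 per 1000 women screened, in line with the reported value.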
Results
In the DM arm, 240 invasive breast cancers were detected among 49689 participants
included; in the DBT+SM arm, 354 invasive breast cancers were diagnosed among 49479
participants included.
Invasive breast cancer detection stratified by histological grade
In the DM arm, the iCDR of grade 1 breast cancers was 1.3 per 1000 women screened
(63/49689); the iCDR of grade 2 or grade 3 breast cancers was 3.6 per 1000 women screened
(177/49689).
In the DBT+SM arm, the iCDR of grade 1 tumours was 2.1 per 1000 women screened (104/49479);
the iCDR of grade 2 or grade 3 tumours was 5.1 per 1000 women screened (250/49479).
The number of examinations with detected breast cancers, the iCDRs and the resulting
differences (with confidence intervals) between the study arms are shown in [Table 1].
Table 1 Comparative invasive breast cancer detection rates with stratification by histological
grade for the two study arms of the TOSYMA RCT.
| | DM | DBT+SM | iCDR difference (DBT+SM – DM) (95% Wald confidence interval) |
| Invasive cancers | 240 | 354 | 114 |
| G1 | 63 | 104 | 41 |
| iCDR G1 | 1.3 ‰ | 2.1 ‰ | 0.8 ‰ (0.31–1.37) |
| G2+G3 | 177 | 250 | 73 |
| iCDR G2+3 | 3.6 ‰ | 5.1 ‰ | 1.5 ‰ (0.66–2.30) |
DM: digital mammography; DBT+SM: digital breast tomosynthesis and synthetic mammography;
iCDR: invasive breast cancer detection rate; G: histological grade
Invasive breast cancer detection stratified by histological grade and mammographic
density
In the DM arm, the iCDR of grade 1 breast cancers was 1.0 per 1000 women screened
for the categories A/B (28/28009) and 1.6 for the categories C/D (35/21680). The corresponding
rates were higher for grade 2 or grade 3 breast cancers with 3.2 per 1000 women screened
for the categories A/B (90/28009) and 4.0 for the categories C/D (87/21680).
In the DBT+SM arm, the iCDR of grade 1 tumours was 1.7 per 1000 women screened for
the categories A/B (46/26767) and 2.6 for the categories C/D (58/22712). The highest
iCDRs were found for grade 2 or grade 3 breast cancers with 4.5 per 1000 women screened
for the categories A/B (120/26767) and 5.7 for the categories C/D (130/22712).
The number of examinations with detected breast cancers, the iCDRs and the resulting
differences (with confidence intervals) between the study arms are shown in [Table 2].
Table 2 Comparative invasive breast cancer detection rates with stratification by histological
grade and breast density for the two arms of the TOSYMA RCT.
| | DM | DBT+SM | iCDR difference (DBT+SM – DM) (95% Wald confidence interval) |
| Invasive breast cancers G1 | | | |
| Invasive cancers A+B | 28 | 46 | 18 |
| iCDR A+B | 1.0 ‰ | 1.7 ‰ | 0.7 ‰ (0.06–1.38) |
| Invasive cancers C+D | 35 | 58 | 23 |
| iCDR C+D | 1.6 ‰ | 2.6 ‰ | 1.0 ‰ (0.05–1.80) |
| Invasive breast cancers G2+3 | | | |
| Invasive cancers A+B | 90 (69+21) | 120 (96+24) | 30 (27+3) |
| iCDR A+B | 3.2 ‰ | 4.5 ‰ | 1.3 ‰ (0.20–2.35) |
| Invasive cancers C+D | 87 (72+15) | 130 (106+24) | 43 (34+9) |
| iCDR C+D | 4.0 ‰ | 5.7 ‰ | 1.7 ‰ (0.38–3.06) |
DM: digital mammography; DBT+SM: digital breast tomosynthesis and synthetic mammography;
iCDR: invasive breast cancer detection rate; G: histological grade; A+B: non-dense
breast; C+D: dense breast
[Fig. 2] shows the iCDRs stratified by tumour grade and breast density for
the study arms DM and DBT+SM; [Fig. 3] shows an example case imaged with DBT.
Fig. 2 Comparative invasive breast cancer detection rates with stratification by histological
grade and breast density for the two arms of the TOSYMA study. DM: digital mammography;
DBT+SM: digital breast tomosynthesis and synthetic mammography; G: histological grade;
A+B non-dense breast; C+D: dense breast.
Fig. 3 Digital breast tomosynthesis of the right breast in (a) craniocaudal (CC) and (b) mediolateral oblique (MLO) views. In the individual slices, an architectural distortion
is noted in the right upper lateral aspect in both views. Histology: invasive lobular
breast cancer, pT2 (31 mm), pN0, cM0, G2.
Discussion
The first primary endpoint of the first phase of the randomized, controlled TOSYMA
trial investigated whether a clinically relevant increase in the detection rate of
invasive tumours is achieved when DBT+SM is used for breast cancer screening compared
to DM, the standard imaging modality [7].
After recruitment had closed in December 2020, 354 invasive breast cancers were documented
in 49,715 women of the DBT+SM arm (invasive detection rate: 7.1 per 1000 women screened)
and 240 invasive breast cancers in 49,762 women of the DM arm (invasive detection
rate: 4.8 per 1000 women screened). The invasive breast cancer detection rate was
significantly higher in the intervention arm than in the control arm (odds ratio
[OR] 1.48; 95% confidence interval [CI] 1.25–1.75; p<0.0001) [8]. The detection rate for invasive tumours up to 20 mm in diameter was substantially
higher in the intervention arm than in the control arm (OR 1.73; 95% CI 1.41–2.13)
[8]. These results were achieved with no marked difference in the recall rates between
the two study arms (DBT+SM: 4.9%; DM: 5.1%).
The positive predictive value of recall (PPV1) was higher with DBT+SM than with DM (DBT+SM: 17.2%, DM: 12.3%) [8] (TOSYMA-1).
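For readers who wish to relate the odds ratio quoted above to the underlying counts, the following Python sketch computes a crude (unadjusted) odds ratio with a Wald confidence interval on the log scale. This is an illustrative sketch with a hypothetical function name; the published analysis may have been based on a different model, so the interval obtained here need not match the published one exactly.

```python
from math import exp, log, sqrt


def odds_ratio_wald(x_test, n_test, x_ctrl, n_ctrl, z=1.959964):
    """Crude odds ratio (test vs. control) with a Wald CI on the log scale.

    Returns a tuple (odds ratio, lower bound, upper bound).
    """
    a, b = x_test, n_test - x_test  # test arm: cancers, no cancer detected
    c, d = x_ctrl, n_ctrl - x_ctrl  # control arm: cancers, no cancer detected
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Counts quoted in the text: DBT+SM 354/49715, DM 240/49762
    estimate, lower, upper = odds_ratio_wald(354, 49715, 240, 49762)
    print(f"OR {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With the counts quoted in the text, the crude odds ratio is approximately 1.48.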
As yet, there is no conclusive evidence in the literature that DBT screening
is more effective than DM screening, particularly in reducing breast cancer-specific
mortality. The increase in detection rates observed with DBT+SM screening could be
attributable to an increasing level of overdiagnosis, i.e. the diagnosis of cancers that would
not have progressed to a symptomatic or life-threatening disease during the patient's
lifetime [21] [22]. Therefore, the TOSYMA study was supplemented by a second phase, investigating the
incidence rates of invasive interval breast cancers diagnosed within 24 months after
the screening examination (TOSYMA-2). Interval cancers are considered an important
clinical surrogate endpoint for the evaluation of breast cancer screening [23]. Results for this second primary endpoint of the TOSYMA study are expected to become
available by 2024/2025 [7].
In addition, it would be useful to carry out a supplementary assessment closer to
screening based on prognostic tumour parameters [9]. Several studies have shown an association between histological grade, a well-established
indicator of tumour growth rates, and prognosis [11] [24] [25]. According to the results of the Swedish Two County Trial, the histological tumour
grade at the time of diagnosis has a long-term effect on later survival, similar to
nodal status and tumour size [26].
This prognostic significance of the histological grade is also reflected in the differences
in the reduction of breast cancer-specific mortality due to screening detection in the
Swedish Two County Trial. The mortality reduction was 35% for screening-detected grade 3 tumours,
32% for grade 2 tumours, but only 6% for grade 1 tumours [11]. This grade-related effect is long-term in nature and contributes to the fact
that the impact of screening programmes on breast cancer mortality can still be observed
many years later; while the effect is strongest in the first 5 years, it can last
for up to 15 years [26].
The results of the explorative TOSYMA subanalysis presented here show that in both
study arms the breast cancer detection rates were higher for grade 2 or 3 breast cancers
compared to grade 1 tumours. The difference in iCDR between DBT+SM and DM is greater
for tumours of grade 2 or 3 than for grade 1 tumours. Early grade 1 breast cancers
are more likely to contribute to overdiagnosis than grade 2 or grade 3 tumours; therefore,
their rates among early tumour diagnoses in UICC stage I are of interest.
Unlike the above-mentioned earlier TOSYMA subanalysis, which assessed grade-dependent
detection for the early UICC I tumour stage [9], this subanalysis includes screening-detected breast cancers of all tumour stages.
The results are consistent: Screening leads to a higher detection rate of grade 2
or 3 tumours compared to grade 1 tumours, both in the detection of the early tumour
stage and also when advanced tumour stages are included. DBT+SM achieved higher detection
rates than DM, with the highest rates being observed with dense breast parenchyma
[9]. Supplementary information of the TOSYMA study on prognostic parameters indicates
that in the DBT+SM arm the higher iCDR in women with dense parenchyma is primarily
based on the detection of screening-relevant grade 2 or grade 3 breast cancers and
not on the detection of grade 1 tumours [27] [28]. While DBT+SM also increased the detection of grade 1 cancers compared to DM, the
magnitude of the effect is smaller than that on the detection rate of more
prognosis-relevant grade 2 or 3 cancers.
With almost 100000 participants, TOSYMA is the largest randomized controlled screening
trial evaluating DBT+SM vs. DM conducted so far. It provides the opportunity to carry
out supplementary analyses on the basis of a successful randomization. The pragmatic
approach offers a high degree of external validity and also demonstrates its real-world
feasibility, especially due to the inclusion of numerous screening units and device
technologies. Radiology staff and physicians underwent special training prior to the
start of the study. All investigators were experienced, with no differences between
the two study arms or between the study and routine screening [17].
This study has limitations. TOSYMA analysed only one screening round; consequently,
it is possible that the differences between the study arms are influenced by an initial
prevalence screening effect with DBT+SM. In addition, there might be a learning curve
in reading tomosynthesis images. As the readers accessed the screening examinations via the
screening software, they were not blinded with regard to the study arm
[17].
Conclusion
The explorative analyses of this large, randomized trial indicate that DBT+SM screening
increases the detection of prognostically more relevant breast cancers (TOSYMA-1).
Once the follow-up data have been analysed in 2024/2025 (TOSYMA-2), we will be able
to evaluate whether the higher breast cancer detection rates achieved with DBT+SM
result in measurable differences in invasive interval cancer rates, an important surrogate
indicator, between the two study arms.