Introduction
In 2020, it is estimated that nearly 150,000 individuals will be newly diagnosed with
colorectal cancer (CRC) in the United States alone, with over 50,000 expected CRC-related
deaths [1]. CRC is well suited to population-based screening,
given that its natural history typically involves a slow progression from adenoma
to cancer [2]. Colonoscopy has been shown to reduce CRC incidence and mortality in a cost-effective
fashion [3] [4], given its capacity to both identify and remove adenomatous polyps. It can thus
act either as a primary screening modality or as the primary method of following
up on other abnormal screening tests.
To assess and optimize the overall quality of screening-related colonoscopy, several
surrogate indicators have been widely adopted, including cecal intubation rate (CIR)
and adenoma detection rate (ADR) [5] [6]. There is a well-established relationship between higher ADR and lower incidence
of post-colonoscopy CRC (PCCRC) [7]. Rates of PCCRC can vary depending on the methodology used for their calculation,
but are thought to range from 3 % to 13 % with an estimated average of 7.4 % [8] [9]. Several screening programs mandate minimum ADR benchmarks of ≥ 25 % for screening
colonoscopy [5] [10], with some advocating for higher targets depending on the population(s) being screened
[11].
Despite efforts to improve colonoscopy quality, wide variations in endoscopists’ ADRs
exist [12] [13] [14]. Suboptimal technique and inadequate withdrawal times (WTs) are considered to be
major factors responsible for this disparity [15]. Several interventions have been studied that aim to improve ADR, ranging from medical
devices [16] to optimized endoscopy curricula for trainees [17]. Of particular interest are educational interventions specifically targeted at independently
practicing endoscopists. Despite planned systematic performance improvement interventions
[18], early evidence failed to demonstrate significant improvements in ADR, leading some
to conclude that educational interventions do not improve colonoscopy quality. However,
more recent evidence reported that the implementation of a bundle of educational interventions
was associated with significant ADR improvements in the poorest performers [19].
We performed a systematic review and meta-analysis to determine whether there is an
association between educational interventions and improvements in ADR or any other
colonoscopy quality indicators.
Methods
Overview
Our systematic review was conducted and reported according to Preferred Reporting
Items for Systematic Reviews and Meta-Analyses (PRISMA) statement recommendations
[20] and Meta-analysis of Observational Studies in Epidemiology (MOOSE) statement recommendations.
A detailed PRISMA checklist is provided in Supplementary Table 1. The protocol for this review was registered on PROSPERO (CRD42019149683). Our primary
objective was to determine if there is an association between educational interventions
for endoscopists and improvement in ADR. Our secondary objectives were to evaluate
whether endoscopist educational interventions are associated with improvements in
other colonoscopy quality indicators: polyp detection rate (PDR), advanced neoplasm
detection rate (ANDR), proximal ADR (pADR), CIR and WT.
Search strategy
A comprehensive search strategy was developed by members of the study team in conjunction
with a health research librarian. We searched MEDLINE, EMBASE (Excerpta Medica Database),
Google Scholar, and CENTRAL (Cochrane Central Registry of Controlled Trials) from
inception of the databases through August 31, 2019. Our full search strategy is provided
along with additional gray literature searches performed in Appendix 1.
Eligibility criteria
A study was eligible for inclusion if it was a cohort study, quasi-experimental study
or clinical trial, it was published in English as either an abstract or manuscript,
it assessed the effect of an educational intervention targeting colonoscopy (including
live lectures, slide decks, video tutorials, online training modules, individualized
assessment and optimization, and skills enhancement and training courses), and it reported on at least one colonoscopy quality indicator (including ADR, PDR, ANDR,
pADR, CIR, or WT).
A study was excluded if it reported on data that overlapped with another published
study, in part or in whole (in these cases, the study with longer follow up or more
complete data was included), it assessed the effect of educational interventions on
the performance of trainees, or it assessed only the effect of audit and feedback
or other interventions without a targeted educational intervention.
Study selection
Following removal of duplicates, citations were imported into Rayyan (M Ouzzani, Qatar
Computing Research Institute, HBKU, Doha, Qatar). All abstracts were screened independently
by two reviewers (EGM, KB). In the case of disagreements, a third author (NF) reviewed
the study and consensus was achieved. The full-length texts of selected abstracts
were retrieved and reviewed.
Data extraction and study quality
A data abstraction form was designed a priori to collect data from each included study.
Two reviewers (EGM, KB) independently extracted pre-established data points, in addition
to performing assessments of bias and overall study quality. The risk of bias in individual
studies was determined using the Newcastle-Ottawa Scale (NOS) for non-randomized studies
[21]. Inter-reviewer discrepancies in data abstraction were resolved by consensus after
input of a third author (NF). When studies met inclusion criteria but had insufficient
data to be included in the quantitative meta-analysis, we attempted to contact corresponding
authors to obtain missing information; if unsuccessful, the study was summarized qualitatively
only. We used the Grading of Recommendations, Assessment, Development and Evaluations
(GRADE) system to assess the certainty of the evidence according to study design,
consistency, directness, imprecision and reporting bias [22].
Outcomes
The primary outcome of our study was ADR. Secondary outcomes of interest were PDR,
ANDR, pADR, WT and CIR.
Statistical analysis
Rate ratios (RR) with their respective 95 % confidence intervals (CI) were pooled
and presented in Forest plots to estimate the effect of educational interventions
on outcomes. Whenever a randomized controlled trial (RCT) also presented preintervention
and post-intervention data, we: a) pooled these data along with the meta-analysis
of other quasi-experimental observational studies; and b) analyzed the randomized
data separately. We defined RR as the ratio of the post-intervention quality indicator
divided by the pre-intervention value. For continuous variables such as WT, we calculated
mean differences in preintervention and post-intervention measures. We used DerSimonian
and Laird random effects models to account for expected heterogeneity across study
designs. In addition, the 95 % prediction interval for the primary outcome of interest
was calculated [23]. χ² tests and I² statistics were calculated as a measure of between-study heterogeneity. I² values of 0 % to 30 % were regarded as possibly unimportant, values of 30 % to 50 %
were regarded to represent moderate heterogeneity, values of 50 % to 75 % were regarded
to represent substantial heterogeneity, and values > 75 % were regarded to represent
considerable heterogeneity [24]. Funnel plots as well as Egger’s and Begg’s tests were used to assess publication
bias [25] [26].
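As a rough illustration of the pooling described above, the following Python sketch applies a DerSimonian and Laird random effects model to log rate ratios. It is not the STATA or RevMan code used for this review. The per-study inputs are transcribed from Tables 1 and 2; the event counts are an assumption (reconstructed by rounding ADR × number of colonoscopies), the variance is the usual large-sample approximation for the log ratio of two proportions, and all function names are hypothetical.

```python
import math

# (study, colonoscopies pre, colonoscopies post, ADR % pre, ADR % post),
# transcribed from Tables 1 and 2 of this review.
STUDIES = [
    ("Berger 2017", 1113, 849, 33.0, 41.9),
    ("Coe 2013", 602, 520, 36.0, 47.0),
    ("Corley 2019", 12266, 20897, 31.5, 37.4),
    ("Evans 2019", 833, 850, 31.8, 35.3),
    ("Hall 2010", 550, 413, 22.0, 34.0),
    ("Kaminski 2016", 14264, 10615, 18.4, 24.1),
    ("Keswani 2015", 2444, 3639, 28.0, 39.0),
    ("Wallace 2017", 7480, 8673, 31.0, 42.0),
]


def log_rr_and_variance(n_pre, n_post, adr_pre, adr_post):
    """Log rate ratio (post/pre) and its approximate variance."""
    # Assumption: event counts are rebuilt by rounding rate x N.
    events_pre = round(n_pre * adr_pre / 100)
    events_post = round(n_post * adr_post / 100)
    log_rr = math.log((events_post / n_post) / (events_pre / n_pre))
    variance = 1 / events_post - 1 / n_post + 1 / events_pre - 1 / n_pre
    return log_rr, variance


def dersimonian_laird(effects, variances):
    """Pool effects using the DerSimonian-Laird moment estimator of tau^2."""
    weights = [1 / v for v in variances]
    total = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))  # Cochran's Q
    df = len(effects) - 1
    c = total - sum(w * w for w in weights) / total
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at 0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "rr": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "tau2": tau2,
        "i2": i2,
    }


def interpret_i2(i2):
    """Heterogeneity bands from the text (handling of boundaries is assumed)."""
    if i2 <= 30:
        return "possibly unimportant"
    if i2 <= 50:
        return "moderate"
    if i2 <= 75:
        return "substantial"
    return "considerable"


pairs = [log_rr_and_variance(*row[1:]) for row in STUDIES]
result = dersimonian_laird([p[0] for p in pairs], [p[1] for p in pairs])
print(result, interpret_i2(result["i2"]))
```

With these reconstructed inputs the pooled estimate should fall close to the reported RR of 1.29, although rounding in the tabulated rates means an exact match is not expected.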
To assess other potential sources of heterogeneity, we performed several subgroup
analyses, including by study design (RCTs versus observational studies), number of
centers (single-center versus multicenter), and publication type (conference abstract
versus published manuscript). Subgroup analyses were also performed by indication
for colonoscopy (primarily screening-related versus mixed populations), education
type (hands-on versus didactic training programs), Endoscopic Quality Improvement
Program (EQUIP)-based [27] versus other strategies, and presence of lag time (any versus none). Lag time was
defined as any time period between the intervention and outcome measurement. Meta-regression
analyses were not performed given that there were fewer than 10 studies in the analysis
for the primary outcome [24].
We conducted sensitivity analyses whereby each study was removed in turn and whereby
fixed effects models were used rather than random effects models. Statistical analyses
were performed using STATA version 14.2 (StataCorp, College Station, Texas, United
States) and RevMan 5.3 (Cochrane Collaboration).
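The two sensitivity analyses named above (removing each study in turn, and using a fixed effects model) can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the three input studies are invented toy values rather than data from this review, and pooling uses plain inverse-variance fixed effects weights.

```python
import math


def pool_fixed(log_effects, variances):
    """Fixed effects (inverse-variance) pooled rate ratio."""
    weights = [1.0 / v for v in variances]
    pooled_log = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    return math.exp(pooled_log)


def leave_one_out(labels, log_effects, variances):
    """Re-pool the remaining studies after omitting each study in turn."""
    results = {}
    for i, label in enumerate(labels):
        kept_effects = log_effects[:i] + log_effects[i + 1:]
        kept_variances = variances[:i] + variances[i + 1:]
        results[label] = pool_fixed(kept_effects, kept_variances)
    return results


# Toy inputs (hypothetical studies, not taken from the review).
labels = ["Study A", "Study B", "Study C"]
log_effects = [math.log(1.2), math.log(1.3), math.log(1.4)]
variances = [0.01, 0.01, 0.01]

omitted = leave_one_out(labels, log_effects, variances)
for label, rr in omitted.items():
    print(f"without {label}: pooled RR = {rr:.3f}")
```

A result is usually described as robust when none of the leave-one-out estimates moves far from the overall pooled value.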
Results
Study selection
A PRISMA flowchart summarizing the overall search results and study selection process
is presented in Supplementary Fig. 1. A total of 2,253 citations were identified from the search strategy, without any
additional studies identified through manual searches. Of these, 30 full-text articles
were reviewed. Eight studies were included in the meta-analysis for the primary outcome.
An additional two studies did not contain sufficient data to be quantitatively analyzed
despite attempts to contact study authors; thus, these were summarized qualitatively.
Study characteristics and quality
Baseline characteristics of the studies included in the meta-analysis are summarized
in [Table 1]. Seven studies were performed in North America and one was performed in Europe.
Included studies were published between 2010 and 2019. Three were RCTs [27] [28] [29], four employed quasi-experimental designs (pre- and post-intervention comparisons)
[30] [31] [32] [33] and one was a retrospective cohort with pre- and post-intervention comparisons [34]. A summary of interventions and outcomes from studies included in the meta-analysis
is provided in [Table 2].
Table 1
Summary of characteristics of studies included in the meta-analysis.
Author, year | Study type | Country | Number of study sites | Endoscopists (N), specialty, practice type | Colonoscopies (N), pre-/post-intervention | Patient sex (% male) | Median patient age | Indication (% screening-related) | Study quality
Berger 2017 [33] | OBS | USA | 1 | 11, 100 % GI, academic | 1,113/849 | N/R | N/R | N/R | N/A[*]
Coe 2013 [27] | RCT | USA | 1 | 15, 100 % GI, academic | 602/520 | 51 | 63 | 42 | NOS-8
Corley 2019 [32] | OBS | USA | 20 | 85, specialty N/R, setting N/R | 12,266/20,897 | 49 | 63 | 22 | N/A[*]
Evans 2019 [34] | OBS | Canada | 1 | 14, specialty N/R, academic | 833/850 | | | |
Hall 2010 [31] | OBS | USA | 1 | 11, 100 % GI, academic | 550/413 | 48 | 53 | 100 | N/A[*]
Kaminski 2016 [29] | RCT | Poland | 38 | 38, specialty N/R, setting: National CRC Screening Program | 14,264/10,615 | 39 | 57 | 100 | NOS-9
Keswani 2015 [30] | OBS | USA | 1 | 20, specialty N/R, setting N/R | 2,444/3,639 | N/R | N/R | 100 | NOS-8
Wallace 2017 [28] | RCT | USA | 9 | N/R | 7,480/8,673 | 59 | 47 | 70 | NOS-8
OBS, observational study; RCT, randomized controlled trial; GI, gastroenterologist;
Sx, surgeon; NOS, Newcastle-Ottawa Scale [21]; N/R, not reported.
* Conference abstract.
Table 2
Summary of interventions and outcomes from studies included in the meta-analysis.
Author, year | Description of educational intervention | Number of sessions | Lag time after intervention[*] | Post-intervention observation period | Preintervention ADR (%) | Post-intervention ADR (%) | Other outcomes reported
Berger 2017 [33] | One-hour slide show presentation/lecture focusing on improving ADR (EQUIP I/II intervention). | One | None | N/R | 33.0 | 41.9 | PDR, SPDR
Coe 2013 [27] | EQUIP I: two slide show presentations that included videos, images, and reference material along with pre- and post-tests (each session lasting approximately 1 hour). First session: methods and technical aspects; lesion recognition (particularly flat lesions). Second session: pre- and post-test on neoplastic vs. non-neoplastic lesions and advanced imaging modalities. | Two | 4 months | 7 months | 36.0 | 47.0 | PDR, ANDR, pADR
Corley 2019 [32] | 30-minute interactive online training module on polyp identification, cleaning/washing techniques, and colonoscopy quality, combined with feedback on ADR. | One | None | 24 months | 31.5 | 37.4 | None
Evans 2019 [34] | Colonoscopy Skills Improvement (CSI) program, consisting of one-day live endoscopy sessions, with two certified faculty teaching up to three endoscopists per session. | One | None | 8 months | 31.8 | 35.3 | WT, CIR
Hall 2010 [31] | Departmental education regarding current national recommendations on withdrawal times and expected detection rates. | One | 28 months | 3 months | 22.0 | 34.0 | WT
Kaminski 2016 [29] | Train-Colonoscopy-Leaders (TCL) course, comprising three phases. Phase I: a 2-hour assessment visit by endoscopy nurses (10 colonoscopies) and two-day training by UK trainers (skills improvement, training the trainer, leadership training). Phase II: 2-day hands-on training. Phase III: repeat of previous nurse assessments (10 colonoscopies); evaluation of first 30 colonoscopies with feedback. | Two | None | 18 months | 18.4 | 24.1 | PDR, SPDR, CIR, pADR
Keswani 2015 [30] | Physician report cards containing endoscopists’ and institutional ADRs and WT, as well as the institutional mean ADR and ADRs of the 10th and 90th percentile. Concurrent educational meeting detailing the data supporting ADR and its relationship with interval CRC and the report card methodology. | One | 12 months | 6 months | 28.0 | 39.0 | WT
Wallace 2017 [28] | One-hour lecture focusing on improving adenoma detection (EQUIP I/II intervention) followed by a 1- to 2-hour review session, identification of low performers, and discussion of obstacles to high-quality colonoscopy. In addition, optional one-on-one proctoring is offered as well as telephone calls to discuss progress. EQUIP posters are posted in endoscopy units. | One | N/R | N/R | 31.0 | 42.0 | PDR, ANDR, CDR, WT
ADR, adenoma detection rate; CIR, cecal intubation rate; WT, withdrawal time; SPDR,
sessile polyp detection rate; PDR, polyp detection rate; ANDR, advanced neoplasia
detection rate; pADR, proximal adenoma detection rate; N/R, not reported.
* Lag time refers to time between intervention and start of post-intervention measurement
of outcome(s).
Summaries of quality assessments are provided in [Table 1], with full assessments provided in Supplementary Table 2 and Supplementary Fig. 2. Study quality was high for fully published studies, with a mean NOS of 8.25. Assessments
of the certainty of the evidence according to the GRADE approach [22] are summarized in [Table 3]. Summaries of studies included in the systematic review, but not quantitatively
analyzed via meta-analysis, are provided in Supplementary Table 3.
Table 3
GRADE summary of effects of educational interventions on colonoscopy quality indicators
[22].
Outcomes | Anticipated absolute effects[1] (95 % CI): risk with control | Anticipated absolute effects[1] (95 % CI): risk with educational interventions | Relative effect (95 % CI) | № of participants (studies) | Certainty of the evidence (GRADE) | Comments
Adenoma detection rate – all studies as observational (ADR – Obs) | 265 adenomas detected per 1,000 colonoscopies | 341 per 1,000 (323 to 362) | Rate ratio 1.29 (1.22 to 1.37) | 86008 (8 observational studies) | ⊕⊕○○ LOW | Educational interventions likely result in an increase in adenoma detection rate – all studies as observational.
Adenoma detection rate – only non-randomized studies (ADR) | 308 adenomas detected per 1,000 colonoscopies | 391 per 1,000 (357 to 431) | Rate ratio 1.27 (1.16 to 1.40) | 43854 (5 observational studies) | ⊕○○○ VERY LOW[2] | Educational interventions may result in an increase in adenoma detection rate.
Adenoma detection rate – only RCTs (ADR-RCTs) | 267 adenomas detected per 1,000 | 315 per 1,000 (283 to 350) | Rate ratio 1.18 (1.06 to 1.31) | 25791 (3 RCTs) | ⊕⊕⊕○ MODERATE | Educational interventions likely increase adenoma detection rate – only RCTs.
Polyp detection rate (PDR) | 494 polyps detected per 1,000 colonoscopies | 608 per 1,000 (593 to 628) | Rate ratio 1.23 (1.20 to 1.27) | 19237 (3 observational studies) | ⊕○○○ VERY LOW[2] | Educational interventions may increase polyp detection rate but the evidence is very uncertain.
Proximal adenoma detection rate (pADR) | 93 proximal adenomas detected per 1,000 colonoscopies | 130 per 1,000 (120 to 138) | Rate ratio 1.39 (1.29 to 1.48) | 26001 (2 observational studies) | ⊕○○○ VERY LOW[2] | Educational interventions may increase proximal adenoma detection rate but the evidence is very uncertain.
Withdrawal time (WT) | The mean withdrawal time was 10.5 minutes | MD 0.29 minutes higher (0.18 lower to 0.76 higher) | – | 48393 (4 observational studies) | ⊕○○○ VERY LOW | Educational interventions may increase withdrawal time but the evidence is very uncertain.
Cecal intubation rate (CIR) | 962 per 1,000 colonoscopies | 962 per 1,000 (962 to 971) | Rate ratio 1.00 (1.00 to 1.01) | 26562 (2 observational studies) | ⊕○○○ VERY LOW | Educational interventions may have little to no effect on cecal intubation rate but the evidence is very uncertain.
ADR, adenoma detection rate; RR, rate ratio; CI, confidence interval; MD, mean difference;
OBS, observational studies
GRADE Working Group grades of evidence:
High certainty: We are very confident that the true effect lies close to that of the
estimate of the effect
Moderate certainty: We are moderately confident in the effect estimate: The true effect
is likely to be close to the estimate of the effect, but there is a possibility that
it is substantially different
Low certainty: Our confidence in the effect estimate is limited: The true effect may
be substantially different from the estimate of the effect
Very low certainty: We have very little confidence in the effect estimate: The true
effect is likely to be substantially different from the estimate of effect
1 The risk in the intervention group (and its 95 % confidence interval) is based on
the assumed risk in the comparison group and the relative effect of the intervention
(and its 95 % CI).
2 a. All the observational studies are quasi-experimental with before and after comparisons.
b. The heterogeneity, both statistical and clinical, is substantial and cannot be
fully explained. c. The overlap between the confidence intervals is very limited.
d. The confidence in the estimate is low (the 95 % CI for the pooled estimate is wide
and/or crosses the line of no effect).
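Footnote 1 describes how the anticipated absolute effects are obtained: the assumed risk in the comparison group is multiplied by the relative effect and by each bound of its 95 % CI. A minimal Python sketch of that arithmetic is given below, using the rounded values from the first row of Table 3; the function name is hypothetical. Because the table was presumably built from unrounded estimates, results computed from the rounded RR can differ from the tabulated figures by about one per 1,000.

```python
def absolute_effect_per_1000(baseline_per_1000, rr, rr_low, rr_high):
    """Apply a rate ratio and its CI bounds to an assumed baseline risk."""
    point = round(baseline_per_1000 * rr)
    low = round(baseline_per_1000 * rr_low)
    high = round(baseline_per_1000 * rr_high)
    return point, low, high


# First row of Table 3: 265 per 1,000 at baseline, RR 1.29 (1.22 to 1.37).
point, low, high = absolute_effect_per_1000(265, 1.29, 1.22, 1.37)
print(f"{point} per 1,000 ({low} to {high})")
```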
Adenoma detection rate
Meta-analysis of eight studies compared the ADR pre-education and post-education as
RRs and included 86,008 colonoscopies. The pooled baseline ADR was 26.5 % and the
post-intervention ADR was 35.4 %. Educational interventions were associated with a
29 % relative increase in ADR (RR 1.29, 95 % CI 1.22 to 1.37, 95 % prediction interval
1.09 to 1.53) as shown in [Fig. 1]. There was considerable heterogeneity between the eight included studies, demonstrated
by an I² value of 82.96 % ([Fig. 1]).
Fig. 1 Forest plot comparing the primary outcome of adenoma detection rate (ADR) pre-educational
intervention and post-educational intervention. CI, confidence interval.
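To show how a 95 % prediction interval relates to the pooled estimate, the Python sketch below uses a commonly cited formula: pooled log RR ± t(k − 2) × sqrt(τ² + SE²), where k is the number of studies. Whether this is the exact variant used for the interval reported above is an assumption. The standard error is back-calculated from the reported 95 % CI, and the between-study variance τ² = 0.004 is an assumed, illustrative value that is not reported in this review; the function name and lookup table are hypothetical.

```python
import math

# Two-sided 95 % critical values of Student's t for small degrees of freedom.
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776,
         5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306}


def prediction_interval(rr, ci_low, ci_high, tau2, k):
    """Approximate 95 % prediction interval for the effect in a new setting."""
    log_rr = math.log(rr)
    # Standard error recovered from the width of the 95 % CI on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    t_crit = T_975[k - 2]  # k - 2 degrees of freedom; requires 3 <= k <= 10
    half_width = t_crit * math.sqrt(tau2 + se ** 2)
    return math.exp(log_rr - half_width), math.exp(log_rr + half_width)


# Reported pooled RR 1.29 (1.22 to 1.37) from k = 8 studies; tau2 is assumed.
pi_low, pi_high = prediction_interval(1.29, 1.22, 1.37, tau2=0.004, k=8)
print(f"95% prediction interval: {pi_low:.2f} to {pi_high:.2f}")
```

The prediction interval is wider than the confidence interval because it adds the between-study variance to the uncertainty of the pooled mean.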
Two additional studies met inclusion criteria for the primary outcome but did not
contain sufficient data to be included in the meta-analysis, even after attempts to
contact corresponding author(s) were made. Both studies reported significant improvements
in ADR following multi-level educational interventions [35] [36].
Polyp detection rate
Three studies representing 19,237 colonoscopies compared PDR pre- and post-educational
interventions. At baseline, PDR was 49.4 %, whereas after the intervention it increased
to 61.1 %. We found that educational interventions were associated with a 23 % relative
increase in PDR (RR 1.23, 95 % CI 1.19 to 1.27), as demonstrated in [Fig. 2a]. There was low heterogeneity between the included studies (I² value of 11.21 %).
Fig. 2 Forest plots comparing (a) polyp detection rate (PDR), (b) proximal adenoma detection rate (pADR), (c) withdrawal time (WT), and (d) cecal intubation rate (CIR) pre-educational intervention and post-educational intervention.
CI, confidence interval.
Proximal adenoma detection rate
Two studies reported pADR before and after educational interventions, representing
26,001 colonoscopies. Prior to the intervention, the pADR was 9.3 %, whereas after
the intervention it increased to 13.10 %. Educational interventions were associated
with a 39 % relative increase in pADR (RR 1.39, 95 % CI 1.29 to 1.48) as displayed
in [Fig. 2b]. There was no heterogeneity between the studies (I² value of 0 %).
Withdrawal time
Four studies representing 50,106 colonoscopies that compared WT before and after educational
interventions were included in the meta-analysis. WT was relatively homogeneously
defined across studies, being either calculated using negative procedures only, or
using all procedures with a timer used to remove period(s) spent on any intervention(s)
performed. There were no significant differences in WT (MD 0.29 minutes, 95 % CI −0.12
to 0.70 minutes) as shown in [Fig. 2c]. There was considerable heterogeneity between these studies (I² value of 93.96 %). Of note, the WTs in the majority of the included studies exceeded
those proposed by guidelines [5] [6], ranging from 8.6 to 12.1 minutes in the preintervention arm and from 8.4 to 12.5
minutes in the post-intervention arm.
Cecal intubation rate
Two studies reported on CIR preintervention and post-intervention, representing 25,568
colonoscopies. There were no significant changes in CIR before and after educational
interventions, as shown in [Fig. 2d] (RR 1.00, 95 % CI 1.00 to 1.01). There was no heterogeneity between the included
studies (I² value of 0.08 %). Of note, the pooled CIR met recommended targets [5] [6] in both the pre- and post-intervention periods, at 96.2 % and 96.4 %, respectively.
Subgroup analyses
Lag times (times between the intervention and the start of outcome measurement) were
clearly reported in five studies and varied between 6 and 28 months. The improvement
in ADR was slightly less pronounced in studies reporting any lag time following educational
interventions compared with studies with no lag time or no lag time reported (RR 1.28,
95 % CI 1.19 to 1.38 and RR 1.34, 95 % CI 1.28 to 1.41, respectively). However, this
difference was not statistically significant (p = 0.32). The first subgroup, however,
had substantial inter-study heterogeneity (I² = 82 %), whereas the second had low heterogeneity (I² = 11 %). There were no significant differences in ADR improvements between studies
in screening and mixed populations, hands-on versus didactic training programs, or
EQUIP-based versus other strategies. Interestingly, heterogeneity was absent between
the EQUIP-based studies (I² = 0 %) and when randomized trials were grouped together using before-and-after data
(I² = 0 %). There were also no significant differences in subgroup analyses by study design
(RCT versus observational), number of centers (single-center versus multicenter),
or publication type (published manuscript versus conference abstract). The inter-study
heterogeneity was slightly reduced within the subgroup containing only single-center
studies (I² = 64 %), and in the subgroup containing only full-text publications (I² = 59 %). When the three RCTs were analyzed using experimental and control groups, the
rate ratio was still significant (RR 1.18, 95 % CI 1.06 to 1.31), but less pronounced
than when these studies were analyzed using pre-intervention and post-intervention
groups in order to compare them with other observational studies (RR 1.33, 95 % CI
1.29 to 1.37). Subgroup analyses are summarized in [Table 4] and are provided in detail in Supplementary Fig. 3.
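The comparison between two subgroup estimates can be approximated from the reported rate ratios and confidence intervals alone. The Python sketch below does this with a simple z-test on the log scale, using the lag time subgroups quoted above (RR 1.28, 95 % CI 1.19 to 1.38 versus RR 1.34, 95 % CI 1.28 to 1.41). This is an assumption-laden approximation, not necessarily the test the statistical software applied, and the function names are hypothetical.

```python
import math


def se_from_ci(ci_low, ci_high, z=1.96):
    """Standard error of a log rate ratio, recovered from its 95 % CI."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * z)


def subgroup_difference(rr_a, ci_a, rr_b, ci_b):
    """Two-sided z-test for the difference between two log rate ratios."""
    diff = math.log(rr_a) - math.log(rr_b)
    se_diff = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# No lag time versus any lag time, as reported in the text.
z_stat, p_value = subgroup_difference(1.34, (1.28, 1.41), 1.28, (1.19, 1.38))
print(f"z = {z_stat:.2f}, p = {p_value:.2f}")
```

Under these assumptions the p-value comes out close to the reported 0.32, with small differences attributable to rounding of the published estimates.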
Table 4
Summary of subgroup analyses.
Subgroups | Pooled RR (95 % CI) | Inter-study heterogeneity (I²)
No lag time[*] [28] [33] [34] | 1.34 (1.28, 1.41) | 11 %
Any lag time specified[*] [27] [29] [30] [31] [32] | 1.28 (1.19, 1.38) | 82 %
Primarily screening colonoscopy [28] [29] [30] [31] [33] | 1.32 (1.25, 1.39) | 60 %
Mixed populations [27] [32] [34] | 1.21 (1.12, 1.30) | 37 %
EQUIP-based studies [27] [28] [33] | 1.34 (1.29, 1.39) | 0 %
Non-EQUIP-based studies [29] [30] [31] [32] [34] | 1.28 (1.18, 1.39) | 86 %
Hands-on skills training [29] [34] | 1.22 (1.04, 1.43) | 80 %
Didactic training [27] [28] [30] [31] [32] [33] | 1.32 (1.22, 1.42) | 86 %
Published manuscripts [27] [28] [29] [30] [34] | 1.32 (1.25, 1.38) | 59 %
Conference abstracts [31] [32] [33] | 1.28 (1.14, 1.43) | 71 %
RCT (pre-/post-) [27] [28] [29] | 1.33 (1.29, 1.37) | 0 %
RCT (control/intervention) [27] [28] [29] | 1.18 (1.06, 1.31) | 82 %
Observational studies [30] [31] [32] [33] [34] | 1.27 (1.16, 1.40) | 82 %
Multi-center studies [28] [29] [32] | 1.28 (1.17, 1.39) | 93 %
Single-center studies [27] [30] [31] [33] [34] | 1.31 (1.19, 1.43) | 64 %
RR, rate ratio of ADR post- versus preintervention; CI, confidence interval; RCT,
randomized controlled trial; EQUIP [27], Endoscopic Quality Improvement Program.
* Lag time refers to time between intervention and start of post-intervention measurement
of outcome(s).
Other sensitivity analyses and publication bias
The findings for our primary outcome of ADR were robust to sensitivity analysis, as
the RR did not change appreciably with exclusion of each study in turn or with analysis
using a fixed effects model. There was no evidence of publication bias for the primary
outcome by Egger’s or Begg’s tests, or by visual inspection of the funnel plot (Supplementary Fig. 4).
Discussion
In our meta-analysis of eight studies including 86,008 colonoscopies, educational
interventions were associated with a significant 29 % relative increase in ADR. We
also found that educational interventions are associated with significant improvements
in PDR and pADR. Other quality indicators such as WT and CIR remained unchanged after
educational interventions, though there was a trend toward increases in WT. Our results
suggest that educational interventions aimed at independently practicing endoscopists
contribute meaningfully to improvements in colonoscopy indicators.
ADR is the most widely endorsed colonoscopy quality indicator given its established
inverse relationship with PCCRC [7] and CRC-related death [37]. Various strategies to optimize ADR specifically aimed at the endoscopist have been
studied, including formalized audit and feedback [38], video-based assessments [39] [40], and combinations of these strategies [18], with mixed results. Short educational interventions targeting endoscopists, the
subject of this review, have also been studied. Though these interventions vary considerably,
they have in common the ultimate goal of improving the quality of colonoscopy performed
by independent non-trainee endoscopists. EQUIP training, for example, consists
of didactic presentations focusing on optimal withdrawal techniques and image recognition
of neoplastic versus non-neoplastic polyps [27]. Conversely, skills enhancement and ‘train-the-endoscopy-trainer’ programs take
a more holistic approach, providing a range of theoretical and hands-on training sessions,
from basic to advanced [41]. These sessions are primarily focused on navigating the transition from unconscious
competence to conscious competence [42], but also have the concurrent effect of optimizing technique and enabling more robust
teaching, ultimately improving quality indicators [29]. Our findings confirm that these interventions have a demonstrable effect on several
important colonoscopy quality indicators. Interestingly, hands-on
training interventions improved ADR to approximately the same degree as didactic interventions
in our study. While hands-on training, including simulation, is currently recommended
as part of endoscopy training curricula [43], its use among trained endoscopists has been relatively poorly studied.
Another interesting finding is that educational interventions were associated with
significantly increased proximal adenoma detection rates. While the mechanisms for
this change are not clearly explained by our results, several endoscopist-related
factors could have been influenced by educational interventions, including technical
skills, adequate air insufflation, washing and suctioning of debris and fluid, attention
to flexures and folds, repetitive segment viewing, and torqueing maneuvers to enhance
visualization. This finding is particularly important, given that missed proximal
adenomas play an important role in the development of PCCRC [44].
While our results are encouraging, one must interpret them with a degree of caution.
When all studies (including RCTs using pre-intervention and post-intervention groups)
were analyzed as having quasi-experimental designs, there were significant improvements
in ADR. When the RCTs were considered separately using experimental and control groups,
the improvement in ADR was less pronounced. This suggests a potential
contribution from the Hawthorne effect to our overall pooled results, whereby
study subjects change their behavior merely because they know they are
being observed [45]. Potential pitfalls of educational interventions also need close consideration.
Expert facilitators and instructors are often required to deliver such interventions,
and they are often not readily available. Furthermore, considerable preparation and
resources are required to successfully conduct such educational programs.
The cost to the system associated with these interventions thus needs to be considered,
even if one can ultimately make a strong argument for overall cost savings through
quality improvement and reduction of disease burden. In addition, it is unknown whether
there are differences in the degree of improvement when an endoscopist independently
seeks out additional training, versus when it is mandated. Finally, the durability
of any improvements in quality seen as a result of educational interventions is poorly
established, given that the length of post-intervention follow-up was not uniformly
reported across studies. Kaminski et al. included a long post-intervention phase of 18 months and found that ADR remained
higher than in the preintervention period, although the improvement was less pronounced than in the
first 6 months post-intervention [29]. Conversely, the long-term sustainability of ADR improvement with the EQUIP-3 intervention
is less clear [28].
Of note, in a recent meta-analysis, we reported a significant association between
endoscopist feedback and improvements in ADR, with a rate ratio of 1.21 [38]. The rate ratio of 1.29 we report in our present study reflects a potential added
impact of educational interventions. However, all but one study included in our analysis
employed endoscopist audit and feedback concurrently with educational interventions.
As endoscopist feedback has previously been independently associated with improved
colonoscopy quality [38], a potential confounding effect needs to be considered. The lone study assessing
educational interventions alone (without feedback) in our analysis demonstrated non-statistically
significant ADR improvements. Thus, further study is required to reliably determine
the effect of educational interventions without any form of audit and feedback, and
to assess the direct contributions of each of these measures on ADR and other quality
indicators.
Our review has a number of strengths. A separate meta-analysis also recently evaluated
the effect of educational interventions on ADR, but restricted inclusion to the 3
RCTs only, thereby missing several important studies assessing this question [46]. Furthermore, subgroup analyses were limited by the small number of input studies
[46]. Our comprehensive search strategy included both randomized and non-randomized study
designs as well as conference abstracts, thereby resulting in 13 studies included
in our systematic review, with eight ultimately included in our quantitative meta-analysis.
We also carried out several subgroup analyses to better understand potential sources
of heterogeneity between studies. Importantly, we identified that studies restricted
to primarily screening populations, the most common indication for colonoscopy, displayed
considerably less inter-study heterogeneity. Lastly, we employed the GRADE approach
to assess and summarize the certainty of the evidence leading to our conclusions.
Our study also has several limitations. For our overall analysis, we included only
the interventional arms of RCTs and considered them as quasi-experimental studies
in order to be able to compare them to the other included observational studies. To
mitigate this, we also conducted a sensitivity analysis pooling only RCT data;
the pooled interventional group still had ADR improvements compared to the control
group. However, one should be cognizant that the degree of improvement seen with RCTs
was somewhat lower than that seen with quasi-experimental studies. This is in part owing
to the fact that there was a (lesser) degree of ADR improvement in two of the RCT
control groups as well. Thus, the overall pooled magnitude of ADR improvement should
be interpreted with caution and in conjunction with the RCT-specific estimates. Secondly,
there is a present lack of understanding of the mechanism(s) linking educational interventions
with ADR improvement. This is important given that WT and CIR, whose improvements
would be plausible mechanisms, remained unchanged. Thirdly, we opted to include gray
literature. Although we believe this reduced publication bias, we acknowledge this
may represent a limitation as it increases the heterogeneity due to incomplete details
about the methodology and the lack of a peer review process. Encouragingly, in subgroup
analyses, estimates of the primary outcome remained unchanged based on publication
type. Finally, the majority of included studies were observational by design, and
though attempts were made to report results adjusted for known confounders, the inability
to adjust for unknown confounders must be acknowledged.
Conclusion
In conclusion, we found evidence that educational interventions improve ADR when conducted
among independent endoscopists. Furthermore, they are also associated with improvements
in pADR and overall PDR. Though the certainty of the evidence leading to conclusions
on the primary outcome was low, we believe our findings are important. Future research
should prioritize addressing important gaps, such as assessing the durability of the
intervention, the impact on low- versus high-performers, and whether hands-on training
or multimodal training are superior to didactic educational sessions.