Keywords
very preterm infants - neurodevelopmental outcome - Bayley Scales of Infant Development - perinatal risk factors
Introduction
Each year 15 million infants are born preterm worldwide, with the rate ranging between 5 and 18% depending on the country.[1] In Austria, 7% of live births are below 37 weeks gestational age (GA) and about 1% below 32 weeks GA.[2] While the survival rate and morbidity are improving, there is no corresponding reduction regarding disabilities within this population.[3] Very preterm infants are more likely to have behavioral and emotional difficulties, as well as learning disabilities.[4] The focus lies on motor impairment, language delay, personal-social immaturity, cognitive rigidity, and poor ability to manage practical situations. Extremely preterm infants more often score two standard deviations below the mean cognitive score,[5] while late preterm infants (34–36 weeks GA) have twice the risk for borderline intellectual function.[5]
The aim of this study is to evaluate the neurodevelopmental outcome of very and extremely preterm infants in a regional setting in Vorarlberg, Austria, at the corrected age of 24 months. First, we evaluate the outcome of Bayley Scales of Infant Development (BSID-II/Bayley-III) scores and assess adverse outcomes (scores <70). Second, these results are compared to corresponding national and international data. Third, we assess perinatal parameters and short-term morbidities as risk factors for poor neurological outcome.
Methods
Ethics
The data for this population-based study have been prospectively collected in an internal register in Vorarlberg, Austria, since 2007 and thereafter in the national quality assessment program registry named “Österreichisches Frühgeborenen Outcome Register, ÖFGOR”[2] and stored anonymously. The study was approved by the local ethics committee (EK No. 1828/2019) in compliance with the Helsinki Declaration.
Study Population
Every infant with a GA of less than 32 + 0 weeks who was born alive and admitted to the neonatal intensive care unit in Vorarlberg between 2007 and 2019 was included in the register. Infants were assessed with either the BSID-II or the Bayley-III test, depending on which version was in use at the time. The inclusion criteria for this study were a GA of less than 32 + 0 weeks and a fully completed BSID-II/Bayley-III evaluation. We calculated the data for this population, referred to as the total group. To increase the informative value of our study, we additionally divided this group into two subgroups based on GA, namely above (n = 185) or below (n = 79) 28 + 0 weeks GA.
Variables
Variables for this study were obtained from the internal register. They included demographic and perinatal items such as sex, birth weight, head circumference at birth, GA, antenatal corticosteroids, premature rupture of membranes (PROM), early- and late-onset sepsis, APGAR score at 1/5/10 minutes, and patent ductus arteriosus (PDA).[6][7][8][9][10][11] The short-term morbidities bronchopulmonary dysplasia (BPD), severe intraventricular hemorrhage (IVH), necrotizing enterocolitis (NEC), and severe retinopathy of prematurity (ROP) were also defined and analyzed as risk factors.[12][13][14][15][16]
Neurodevelopmental Assessment
Follow-up examinations of preterm infants are mandatory practice in Austria. These include at minimum a neurological examination, an eye examination, and a hearing screening. More thorough testing is performed at the corrected age of 24 months in the form of a clinical and neurological developmental test, the BSID-II/Bayley-III.[2][17] Testing was performed by trained clinical psychologists. The outcome parameters of the BSID-II are the mental developmental index (MDI) and the psychomotor developmental index (PDI). The BSID-II was used until 2016, after which the Bayley-III was used. The Bayley-III comprises five scales, namely the cognitive scale, receptive language scale, expressive language scale, fine motor scale, and gross motor scale.[18] Studies have shown that the Bayley-III tends to underestimate neurodevelopmental delay in comparison to the BSID-II.[19][20][21][22] For the purpose of comparison, we used the formula of Moore et al[20] to combine the language and cognitive composite scores of the Bayley-III into a predicted MDI (pMDI). Furthermore, in our analysis, we combined the mental results (MDI and pMDI) and the psychomotor results (PDI and Bayley-III motor score) of both test versions into one variable each (allMDIs and allPDIs) to summarize the results for the whole dataset and all infants. A score of less than 70 in any category was considered adverse, and scores lower than 45 were adjusted to a level of 45.[23]
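The scoring rules described above can be illustrated with a minimal Python sketch. Only the two thresholds (adverse below 70, floor of 45) come from the text; the function names, constant names, and example scores are hypothetical and are not taken from the study.

```python
ADVERSE_CUTOFF = 70  # scores below this value are considered adverse (from the text)
SCORE_FLOOR = 45     # scores below this value are adjusted to 45 (from the text)


def adjust_score(score: float) -> float:
    """Raise any score lower than the floor to the floor value."""
    return max(score, SCORE_FLOOR)


def is_adverse(score: float) -> bool:
    """Return True if the adjusted score counts as an adverse result."""
    return adjust_score(score) < ADVERSE_CUTOFF


def summarize(scores):
    """Return (mean of adjusted scores, number adverse, percent adverse)."""
    adjusted = [adjust_score(s) for s in scores]
    n_adverse = sum(s < ADVERSE_CUTOFF for s in adjusted)
    mean = sum(adjusted) / len(adjusted)
    return mean, n_adverse, 100.0 * n_adverse / len(adjusted)


# Hypothetical example scores, not study data.
example_scores = [40, 68, 70, 95, 110]
print(summarize(example_scores))
```

Applying the floor before averaging keeps very low scores from pulling the mean down further, whereas the adverse classification itself is unaffected because 45 is still below the cutoff of 70.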
Statistical Analysis
Descriptive sociodemographic and clinical data were analyzed for all registered patients meeting the inclusion criteria and are presented as means with standard deviations; for proportional data, a 95% confidence interval was used. Statistics are presented as percentages unless stated otherwise. To determine which variables were significantly associated with the outcome parameters of the BSID-II and Bayley-III, we used a multivariate linear regression model. However, only the statistically significant results are presented in the Results section. For all calculations, we used IBM SPSS Statistics (Version 28), and a p-value lower than 0.05 was considered statistically significant.
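As an illustration of a 95% confidence interval for proportional data, the following Python sketch computes a Wilson score interval. The text does not state which interval method was used in SPSS, so the choice of the Wilson method is an assumption made here for illustration; the function name and the example counts are hypothetical.

```python
import math


def wilson_ci(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical counts, not study data: 20 events among 100 infants.
low, high = wilson_ci(20, 100)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")
```

The Wilson interval is used in this sketch because it behaves reasonably for small groups and for proportions close to 0 or 1, which is relevant for rare morbidities.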
Results
In total, 623 datasets were registered (for visualized information see [Supplementary Fig. S1], available in the online version). Of those, 148 did not meet the eligibility criteria and 19 infants died before reaching 24 months of corrected age. A total of 456 infants were, therefore, eligible for this study. We invited each infant for a follow-up examination at 24 months of corrected age. A total of 264 infants (response rate 57.9%) completed the BSID-II/Bayley-III evaluation. Of these, 79 (29.9%) were born below 28 weeks of GA, while 185 (70.1%) belonged to the above 28 weeks of GA group. The excluded infants (n = 192) did not significantly differ from the included infants regarding sex (p = 0.49), birth weight (p = 0.06), APGAR at 10 minutes (p = 0.14), antenatal steroids (p = 0.11), GA (p = 0.36), BPD (p = 0.28), PDA (p = 0.72), NEC (p = 0.30), intracerebral hemorrhage (ICH) Grade 3-4 (p = 0.39), or ROP (p = 0.23). Excluded infants showed significantly lower APGAR scores at minute 1 (p = 0.02) and at minute 5 (p = 0.01).
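The cohort flow reported above can be retraced with a few lines of arithmetic. The Python sketch below uses only the counts given in the text; the variable names are hypothetical.

```python
registered = 623    # datasets in the register
not_eligible = 148  # did not meet the eligibility criteria
died = 19           # died before 24 months of corrected age
assessed = 264      # completed the BSID-II/Bayley-III evaluation

eligible = registered - not_eligible - died
not_assessed = eligible - assessed
response_rate = 100 * assessed / eligible

print(eligible, not_assessed, round(response_rate, 1))  # prints: 456 192 57.9
```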
At birth all included infants had a mean GA of 29 (± 2.1) weeks, a weight of 1,177 (± 328) g, and a head circumference of 26.5 ± 2.5 cm. The mean APGAR for 1, 5, and 10 minutes was 7 ± 2, 8 ± 1, and 9 ± 1, respectively. Of the total infants, 104 (53%) were male and 227 (86%) received antenatal steroids. For this overall group, 63 (23.9%) infants had a PDA, 39 (14.8%) BPD, 12 (4.5%) severe NEC, 14 (5.3%) IVH Grade 3-4, and 37 (14%) had severe ROP ([Table 1]). The corresponding data for the group less than 28 weeks GA (n = 79) are also shown in [Table 1].
Table 1
Sociodemographic data for the study population (n = 264) and the below (n = 79) and above (n = 185) 28 weeks GA groups

| Variable | Total: n | Total: % of total n | Total: Mean | Total: Std. deviation | GA below 28 + 0 weeks: n | GA below 28 + 0 weeks: Mean | GA below 28 + 0 weeks: Std. deviation | GA above 28 weeks: n | GA above 28 weeks: Mean | GA above 28 weeks: Std. deviation |
|---|---|---|---|---|---|---|---|---|---|---|
| Sex (1 = female; 2 = male) | 264 | 100.0 | 1.47 | | 79 | 1.46 | | 185 | 1.48 | |
| Birth weight (g) | 264 | 100.0 | 1177 | 328 | 79 | 864 | 206 | 185 | 1311 | 275 |
| Head circumference at birth (cm) | 258 | 97.7 | 26.5 | 2.5 | 79 | 24.0 | 2.3 | 179 | 27.6 | 1.8 |
| GA (weeks) | 264 | 100.0 | 29.0 | 2.1 | 79 | 26.4 | 1.2 | 185 | 30.2 | 1.1 |
| Steroids antenatal | 227 | 86.0 | | | 69 | | | 158 | | |
| Magnesium prenatal | 9 | 3.4 | | | 4 | | | 5 | | |
| Premature rupture of membranes | 85 | 32.2 | | | 28 | | | 57 | | |
| Early-onset sepsis | 38 | 14.4 | | | 16 | | | 22 | | |
| Late-onset sepsis | 81 | 30.7 | | | 47 | | | 34 | | |
| APGAR minute 1 | 258 | 98.9 | 7 | 2 | 78 | 6 | 2 | 183 | 7.2 | 1.8 |
| APGAR minute 5 | 250 | 97.3 | 8 | 1 | 76 | 7 | 1 | 181 | 8.6 | 1.2 |
| APGAR minute 10 | 242 | 95.8 | 9 | 1 | 73 | 8 | 1 | 180 | 9.2 | 1.0 |
| PDA | 63 | 23.9 | | | 37 | | | 26 | | |
| BPD | 39 | 14.8 | | | 29 | | | 10 | | |
| NEC | 12 | 4.5 | | | 10 | | | 2 | | |
| IVH Grade 3-4 | 14 | 5.3 | | | 10 | | | 4 | | |
| ROP | 37 | 14.0 | | | 29 | | | 8 | | |
Abbreviations: BPD, bronchopulmonary dysplasia; GA, gestational age; IVH, intraventricular hemorrhage; NEC, necrotizing enterocolitis; PDA, patent ductus arteriosus; ROP, retinopathy of prematurity.
Neurodevelopmental Results
Infants investigated with the BSID-II (n = 172; [Fig. 1]) had a mean PDI of 98.6 (± 13), and six (3.5%) of these infants had an adverse result. Mean MDI was 95.3 (± 15.2), with 14 (8.1%) of these infants showing an adverse result. Infants investigated with the Bayley-III (n = 92; [Fig. 1]) had a mean motor score of 101.5 (± 16.4), and five (5.4%) of them had an adverse result. Mean cognitive score was 97.6 (± 19.7), with eight (8.7%) of these infants showing an adverse result. Mean language score was 87.1 (± 22.0) and was adverse in 24 (26.1%) infants. The score calculated from the language and cognitive scores (predicted MDI, pMDI) was 82.8 (± 25.8) and was adverse in 27 (29.3%) infants.
Fig. 1 Neurodevelopmental outcome calculated and presented as boxplots for Bayley Scales of Infant Development (BSID-II) (n = 172) and Bayley-III (n = 92). MDI, mental developmental index; PDI, psychomotor developmental index.
Infants investigated with BSID-II or Bayley-III (all included infants, n = 264; [Table 2]) had a mean PDI (allPDIs) of 99.6 (± 14.4), of whom 11 (4.2%) infants had adverse results.
Table 2
BSID score results for the total group (n = 264) and the below (n = 79) and above (n = 185) 28 weeks GA groups
| Group | Statistic | PDI BSID-II | MDI BSID-II | Motor score Bayley-III | Cognitive score Bayley-III | Language score Bayley-III | pMDI | allPDIs | allMDIs |
|---|---|---|---|---|---|---|---|---|---|
| Total | n | 172 | 172 | 92 | 92 | 92 | 92 | 264 | 264 |
| | Mean | 98.6 | 95.3 | 101.5 | 97.6 | 87.1 | 82.8 | 99.6 | 91 |
| | Std. deviation | 13.1 | 15.2 | 16.4 | 19.7 | 22 | 25.8 | 14.4 | 20.4 |
| | Median | 100 | 98 | 106 | 97.5 | 91 | 85.9 | 103 | 94.1 |
| | Adv. results | 6 | 14 | 5 | 8 | 24 | 27 | 11 | 41 |
| | % of n | 3.5 | 8.1 | 5.4 | 8.7 | 26.1 | 29.3 | 4.2 | 15.5 |
| Below 28 + 0 weeks GA | n | 47 | 47 | 32 | 32 | 32 | 32 | 79 | 79 |
| | Mean | 100 | 93.2 | 100.1 | 96.9 | 85.4 | 81.4 | 100.1 | 88.4 |
| | Std. deviation | 14.7 | 16 | 19.7 | 20.7 | 24.9 | 28.2 | 16.8 | 22.4 |
| | Median | 103 | 94 | 104.5 | 95 | 94 | 87 | 103 | 92 |
| | Adv. results | 1 | 6 | 3 | 3 | 10 | 10 | 4 | 16 |
| | % of n | 2.1 | 12.8 | 9.4 | 9.4 | 31.3 | 31.3 | 5.1 | 20.3 |
| Above 28 weeks GA | n | 125 | 125 | 60 | 60 | 60 | 60 | 185 | 185 |
| | Mean | 98 | 96.1 | 102.3 | 97.9 | 88.1 | 83.5 | 99.4 | 92.1 |
| | Std. deviation | 12.4 | 14.9 | 14.4 | 19.3 | 20.5 | 24.6 | 13.2 | 19.5 |
| | Median | 100.00 | 100.00 | 106.00 | 100.00 | 91.00 | 85.85 | 102.00 | 96.00 |
| | Adv. results | 5 | 8 | 2 | 5 | 14 | 17 | 7 | 25 |
| | % of n | 4.0 | 6.4 | 3.3 | 8.3 | 23.3 | 28.3 | 3.8 | 13.5 |
Abbreviations: allMDIs, total of all MDIs across the database; allPDIs, total of all PDIs across the database; BSID, Bayley Scales of Infant Development; GA, gestational age; MDI, mental developmental index; PDI, psychomotor developmental index; pMDI, predicted MDI.
The mean allMDI (Bayley-III data corrected with the formula of Moore) was 91.0 (± 20.4), with an adverse result in 41 (15.5%) infants.
In the group of infants less than 28 weeks GA, the corresponding mean allPDI and allMDI were 100.1 (± 16.8) and 88.4 (± 22.4), respectively, with an adverse result in four (5.1%) and 16 (20.3%) infants, respectively ([Table 2]).
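The percentages of adverse results follow directly from the counts and group sizes. As a small consistency check, the Python sketch below recomputes them for allPDIs and allMDIs from the counts listed in Table 2; the function and variable names are hypothetical.

```python
def adverse_percent(n_adverse: int, n_total: int) -> float:
    """Percentage of adverse results, rounded to one decimal place."""
    return round(100 * n_adverse / n_total, 1)


# (adverse count, group size) for allPDIs and allMDIs, as listed in Table 2.
table2_counts = {
    "total": {"allPDIs": (11, 264), "allMDIs": (41, 264)},
    "below 28 + 0 weeks GA": {"allPDIs": (4, 79), "allMDIs": (16, 79)},
    "above 28 weeks GA": {"allPDIs": (7, 185), "allMDIs": (25, 185)},
}

for group, scores in table2_counts.items():
    for name, (n_adverse, n_total) in scores.items():
        print(group, name, adverse_percent(n_adverse, n_total))
```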
Multivariate Linear Regression Analysis
For better visibility, significant results are highlighted in [Table 3]. For the whole database (allPDIs and allMDIs), the statistically significant risk factors for poor outcome were BPD and IVH Grade 3-4, and BPD was the only factor that was significant for nearly all analyzed BSID-II/Bayley-III parameters (all except the Bayley-III motor score). In addition, male sex, head circumference at birth, PROM, PDA, and IVH Grade 3-4 were significantly associated with outcome in one or more neurodevelopmental parameters at the corrected age of 24 months.
Table 3
Multivariate regression results for each inspected variable and each BSID score (significant results shown in bold)
| Variable | PDI BSID-II: t | PDI BSID-II: Sig. | MDI BSID-II: t | MDI BSID-II: Sig. | Motor score Bayley-III: t | Motor score Bayley-III: Sig. | Cognitive score Bayley-III: t | Cognitive score Bayley-III: Sig. | Language score Bayley-III: t | Language score Bayley-III: Sig. | pMDI: t | pMDI: Sig. | allPDIs: t | allPDIs: Sig. | allMDIs: t | allMDIs: Sig. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (Constant) | 4.457 | 0.000 | 2.648 | 0.009 | 3.096 | 0.003 | 1.802 | 0.076 | 0.216 | 0.830 | 0.078 | 0.938 | 5.025 | 0.000 | 0.538 | 0.591 |
| Sex | 0.131 | 0.896 | 2.026 | **0.045** | 0.771 | 0.444 | −0.030 | 0.977 | −0.276 | 0.784 | −0.522 | 0.603 | 0.688 | 0.492 | 1.122 | 0.263 |
| Birth weight (g) | 0.879 | 0.381 | 1.061 | 0.290 | 1.329 | 0.188 | −0.496 | 0.621 | −0.556 | 0.580 | −0.915 | 0.364 | 1.077 | 0.283 | −0.834 | 0.405 |
| Head circumference at birth (cm) | 1.095 | 0.275 | 1.904 | 0.059 | −2.247 | **0.028** | −0.419 | 0.677 | −0.630 | 0.531 | −0.347 | 0.730 | 0.213 | 0.831 | 0.999 | 0.319 |
| Gestational age (weeks) | −1.617 | 0.108 | −1.535 | 0.127 | 0.645 | 0.522 | 0.188 | 0.852 | 1.485 | 0.143 | 1.215 | 0.229 | −1.060 | 0.290 | 0.840 | 0.402 |
| Steroids antenatal | −1.918 | 0.057 | −0.534 | 0.594 | 0.992 | 0.325 | 0.084 | 0.934 | 0.459 | 0.648 | 0.319 | 0.751 | −0.003 | 0.317 | −0.002 | 0.999 |
| Premature rupture of membranes | 2.056 | **0.042** | −0.991 | 0.324 | 0.214 | 0.831 | 2.241 | **0.029** | 2.606 | **0.011** | 2.686 | **0.009** | 1.949 | 0.053 | 1.428 | 0.155 |
| Early-onset sepsis | −0.676 | 0.500 | 0.047 | 0.963 | −1.179 | 0.243 | −0.579 | 0.564 | −0.399 | 0.691 | −0.488 | 0.627 | −1.577 | 0.116 | −0.300 | 0.764 |
| Late-onset sepsis | −0.031 | 0.976 | −1.159 | 0.248 | 0.760 | 0.450 | 0.833 | 0.408 | 1.721 | 0.090 | 1.668 | 0.100 | 0.341 | 0.733 | 0.622 | 0.535 |
| APGAR minute 1 | −0.667 | 0.506 | −0.199 | 0.842 | −0.646 | 0.521 | −1.937 | 0.057 | 0.073 | 0.942 | −0.925 | 0.359 | −0.650 | 0.516 | −0.092 | 0.927 |
| APGAR minute 5 | −1.059 | 0.291 | −0.699 | 0.486 | −1.507 | 0.137 | −0.120 | 0.905 | −1.054 | 0.296 | −0.789 | 0.433 | −1.415 | 0.158 | −1.003 | 0.317 |
| APGAR minute 10 | 1.528 | 0.129 | 1.211 | 0.228 | 0.879 | 0.383 | 0.646 | 0.521 | 1.085 | 0.282 | 1.358 | 0.179 | 1.354 | 0.177 | 1.694 | 0.092 |
| PDA | 1.495 | 0.137 | 2.067 | **0.041** | −1.997 | 0.050 | −1.790 | 0.078 | −0.774 | 0.442 | −1.174 | 0.245 | 0.532 | 0.595 | 0.642 | 0.522 |
| BPD | −2.766 | **0.007** | −2.154 | **0.033** | −0.795 | 0.430 | −2.480 | **0.016** | −2.238 | **0.029** | −2.296 | **0.025** | −2.529 | **0.012** | −3.494 | **0.001** |
| NEC | 1.112 | 0.268 | 0.136 | 0.892 | −0.173 | 0.863 | 0.372 | 0.711 | −0.112 | 0.911 | 0.343 | 0.733 | 0.505 | 0.614 | −0.007 | 0.994 |
| IVH Grade 3-4 | −0.735 | 0.464 | 0.193 | 0.847 | −2.851 | **0.006** | −1.571 | 0.121 | −1.102 | 0.275 | −0.945 | 0.348 | −2.971 | **0.003** | −0.462 | 0.645 |
| ROP | 0.196 | 0.845 | 0.007 | 0.994 | −0.907 | 0.368 | −0.171 | 0.865 | 0.932 | 0.355 | 0.430 | 0.669 | 0.141 | 0.888 | 1.808 | 0.072 |
Abbreviations: allMDIs, total of all MDIs across the database; allPDIs, total of all PDIs across the database; BPD, bronchopulmonary dysplasia; BSID, Bayley Scales of Infant Development; IVH, intraventricular hemorrhage; MDI, mental developmental index; NEC, necrotizing enterocolitis; PDA, patent ductus arteriosus; PDI, psychomotor developmental index; pMDI, predicted MDI; ROP, retinopathy of prematurity.
Discussion
Our state-wide population-based register study shows a fair and favorable neurodevelopmental outcome assessed with the BSID-II and Bayley-III in very and extremely preterm infants at the corrected age of 24 months. Overall, this research aims to describe and analyze the data without discussing the results in depth due to the relatively small sample size.
First, the neurodevelopmental assessments reveal the best results in the psychomotor categories (allPDI of 99.6 ± 14.4) with very low rates of adverse results (4.2%). In comparison, other national and international studies showed median psychomotor scores of around 90 and rates of abnormal results up to 20%.[24][25][26][27][28]
Second, our results for mental developmental outcome (allMDI of 91.0 ± 20.4) also tend to be better than those of other studies.[26][27] In a national study from Austria, a group of infants above 30 weeks GA, tested at a corrected age of 12 months, achieved a median MDI score of 102.[24] Our rate of adverse mental outcome of 15.5% does not differ from that of other studies.[25][26][27][28]
Third, the group less than 28 weeks GA offers encouraging results with a good psychomotor result (100.1, adverse rate 5.1%), although it has considerably lower mental scores (88.4, adverse rate 20.3%). This can be compared to a Swedish study[26] showing an adverse rate of 15% for psychomotor results and an adverse rate of 20% for mental developmental results. To our knowledge, only a few studies[25][26] report data about this vulnerable and therefore most interesting infant group.
We are very aware that a comparison of results with those of other groups is difficult because of differences in patient populations, follow-up rates, or tests performed[4][29] as well as languages used.[25][26] Also, it is important to note that the norms in use play an important role in clinical and research application when interpreting outcome results. The use of different norms might lead to different outcomes; for example, using US norms in an Austrian population can lead to underestimation of neurodevelopmental delay.[18] The application of culture-specific norms for clinical as well as research purposes has already been proposed[18] and would be desirable. However, especially with regard to quality control, we are pleased to report a performance very similar to that of the above-mentioned leading studies,[24][25][26][27] and we look forward to more recent national data from the ÖFGOR database.[30]
A comparison of the more recent results obtained with the Bayley-III after 2016 with the earlier results obtained with the BSID-II suggests that there might be no improvement over time ([Fig. 1]). This important finding might be due to bias and could be influenced by many factors such as patient numbers and test characteristics. In particular, the Bayley-III is the first version to define the mental outcome by two different parameters, namely cognitive and language capabilities. While our cognitive scores are higher than in other studies,[26][27] the language section shows an inhomogeneous result, which does not concur with the above-mentioned result obtained in the Swedish study.[26]
Lastly, we investigated different perinatal risk factors for poor outcome. In our study, BPD is the predominant risk factor for delayed mental and motor development, which corresponds to other studies.[25][28] In our study as well as in others,[24][25][26][27][28] sex, PROM, and severe IVH emerged as common risk factors associated with the results of the BSID-II and Bayley-III tests. Factors that negatively influenced outcome in our study but are not reported in the other studies are head circumference at birth and PDA.
Strengths and Limitations
The strength of our study is the population-based design with prospectively collected health care data in a very vulnerable group of patients. Some may argue that the number of study participants is small (n = 264) and that they come from a nonrepresentative region. Over the study period, we consistently applied a concept of patient care consisting of antenatal transport to, and care of these infants at, the only neonatal intensive care unit in Vorarlberg. For this reason, we were able to include every infant in the local and subsequently national register. This patient care system has been stable over time, the response rate (57.9%) is fair, and the nonincluded infants did not differ significantly from the included infants except for lower APGAR scores at 1 and 5 minutes. All these factors make us confident that our patient population is representative.
The sample size of 264 cases and the study period of 13 years permit only a descriptive delineation of results, and we feel that an attempt at a general deduction from these results may not be appropriate. Another limiting factor might be the use of different versions of the BSID-II/Bayley-III over the long course of the study. As recommended, we adjusted the Bayley-III cognitive and language scores using the formula of Moore et al[20] to make these scores comparable. Despite several attempts to harmonize them by using, for example, other conversion formulas,[19] changing cutoff scores to 80 or 85,[31] or renorming Bayley-III,[32] the possibility that BSID-II could under- or Bayley-III overestimate development remains the subject of discussion.[33]
The low language score observed with the Bayley-III does not correspond to the other results and may have socioeconomic reasons such as bilingualism, education, and other factors.[34][35] However, these parameters are not documented in the national minimal dataset that was agreed on by consensus for the ÖFGOR database.
Our data were collected over a long period of time. Changes in pre- and postnatal decision-making, modified strategies in delivery room management, high-end neonatal and intensive care medicine, and progressing postdischarge management have led to improved outcomes.[23][36] Survival rates have increased, but rates of survival free of major complications have not changed over time. Numerous studies[37][38] have examined factors influencing BPD, such as sustained inflation, mechanical ventilation versus continuous positive airway pressure (CPAP), oxygen saturation limits, or even nutrition, but its incidence has not changed significantly over time. Reducing neonatal morbidity will continue to be difficult but is probably the most important way to improve the outcome.
Our study showed, as does the literature,[39] that the presence of PDA was associated with a significantly poorer outcome. Our data assessed only whether a PDA was present or not. There was no grading or classification of hemodynamic relevance. In addition, we could not differentiate between spontaneous, pharmacological, and surgical closure.
Conclusion
Overall, the participants in this study had remarkably good neurodevelopmental outcomes compared to national and international data. The group of extremely preterm infants also shows encouraging results. However, contrary to what might be expected, the results did not improve over time. In line with the literature, this study identifies predictive factors for poor developmental outcome, especially BPD.