Keywords ontology - knowledge representation - postpartum - depression - women's health
Background and Significance
Importance and Prevalence of Postpartum Depression
While approximately 15 to 85% of women experience the “baby blues” or some form of
sadness in the 2 weeks following delivery,[1] postpartum depression (PPD) is a more
severe and longer lasting mental illness that is detrimental to both the mother and
newborn. PPD is classified as an episode of major depressive disorder (MDD) that can
occur up to 12 months after childbirth,[2] and it affects approximately 10 to 20% of
mothers.[3][4] While unexplained crying and general sadness are common symptoms, PPD
can produce more severe consequences such as feelings of hopelessness, intense anxiety,[5]
suicidal ideation, thoughts about harming the baby,[6] and mother–infant bonding challenges
that can affect the child's future development.[5]
Need for Increased Research on Risk Factors Underlying Postpartum Depression
The PPD field includes a considerable amount of risk factor research, with the strongest
identified factors being a history of depression, anxiety during pregnancy, and depression
during pregnancy;[7] however, due to the large number of variables involved in pregnancy
and birth, numerous factors are understudied or not researched at all. Many risk factors,
such as preeclampsia[7][8][9][10] and Cesarean section,[7][8][9][10][11] lack a consensus,
with separate studies reporting mixed levels of significance. Additionally, several
studies[12][13][14] do not adjust for important confounding variables, such as a history
of psychiatric illness, and therefore reduce the generalizability and usefulness of the
results. Furthermore, due to the stigma associated with mental illness disclosure, studies
may also suffer from recruitment or follow-up difficulties.[15] Despite electronic health
record (EHR) research promoting investigation of a variety of confounding variables,
assembly of large cohorts, assessment of associations among risk factor subgroups (e.g.,
types of Cesarean sections), and avoidance of challenges in study recruitment and
follow-up,[15] very few PPD papers have utilized EHRs. Thus, it is necessary to improve
PPD risk factor research by expanding the use of EHRs.
Ontologies Are Useful for Large-Research Consortiums
Large research consortiums exist, including the Observational Health Data Sciences
and Informatics consortium (OHDSI; https://www.ohdsi.org/) along with several recent
novel coronavirus disease 2019 (COVID-19)-specific consortiums such as N3C
(https://ncats.nih.gov/n3c). While a separate initiative, N3C utilizes the same Common
Data Model as OHDSI. These consortiums excel at providing standard methods to translate
individual hospital record systems into a shareable and easily computable framework
that enables queries and more complex scripts to run across multiple sites simultaneously.
However, they do not provide disease-specific ontology information or methods for
extracting relevant patients for particular diseases; they merely provide concept sets.
These concept sets include information that is largely derived from the Systematized
Nomenclature of Medicine–Clinical Terms (SNOMED CT) ontology framework, and they do not
contain links to relevant comorbidities or to diseases often confused with the disease
of interest. In contrast, a well-defined ontology would provide not only the disease
codes and concepts needed to extract relevant patients but also the links and
relationships between these concepts. Ideally, it would also contain relevant risk
factors specific to the disease of interest.
Purpose of This Study
To support clinicians in screening for and treating patients with PPD, we aim to characterize
the important clinical facets of PPD by demonstrating the relationships among confirmed
(known) and potential risk factors, symptoms, comorbidities, and treatments in an
ontology. We also link relevant International Classification of Diseases (ICD) code
sets corresponding to the PPD risk factors to enable researchers to easily identify
them in their EHR cohorts. We use a task-based approach to validate our ontology by
considering whether it can identify the PPD risk factors, symptoms, and treatments
present in 10 case studies of PPD patients derived from web sites,[16][17] magazines,[18]
and blogs.[19][20][21][22][23] The ontology is not limited to code sets: we also include
noncore information, such as age, parity, and other relevant comorbidities and risk
factors without diagnostic codes, that should be included in any PPD study.
Methods
The methods for this ontology include four parts as follows: (1) determination of
the ontology's scope and survey of the literature, (2) review of existing ontologies
related to PPD and evaluation of their usability in the ontology, (3) representation
of the PPD knowledge base, and (4) evaluation of the ontology for correctness and
usefulness. The target population for this postpartum depression ontology (PDO) includes
researchers interested in using EHRs to investigate maternal mental health and medical
professionals hoping to improve PPD screening and diagnosis.
Ontology Scope and Survey of the Literature
To determine the scope of the PDO, we conducted a survey of the PPD literature based
on the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA)
guidelines[24 ] using PubMed and Google Scholar ([Fig. 1 ]). Although the PRISMA format is used, numbers are not included in [Fig. 1 ] because the total number of reviewed papers varied by each risk factor investigated;
moreover, this was not a comprehensive systematic review or meta-analysis, but a survey
of the literature for the purposes of constructing our ontology. We screened titles
and abstracts of studies for conditions and their relationship to PPD or depression,
then assessed each condition against the criteria for determining confirmed versus
potential PPD risk factors, which are delineated in [Fig. 2]. Eligibility for consideration
as a confirmed PPD risk factor was determined by adjustment for important confounders;
unless the study was a meta-analysis, systematic review, or literature review, papers
had to adjust for psychiatric illness, either prior to or during pregnancy. Confirmed risk
factor papers were further required to find a statistically significant increased
risk of PPD at p < 0.05 and/or an effect size that was moderate (>0.30)[25 ] postadjustment, include a comparison group, and have a general consensus in the
field regarding the condition's relationship with PPD. In contrast, eligibility for
consideration as a potential PPD risk factor from the literature survey required a
statistically significant association between the condition and either PPD or depression.
We noted that postnatal surveys were often given to women, which introduced the potential
for recall bias; however, their reported symptoms and risk factors were corroborated
in other papers and case studies, leading to their inclusion in risk factor determination.
After the initial review, we retrieved full texts and conducted an additional round of
review. Studies were then synthesized to classify the risk factors.
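The screening logic described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' procedure: the `Study` fields, the function names, and the use of p < 0.05 as the threshold for a "statistically significant association" in the potential category are assumptions, and the judgment of field consensus is passed in as a plain boolean because it was made by human reviewers. The paper count cutoffs from [Fig. 2 ] are not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional

# Study designs exempt from the confounder-adjustment requirement (from the text).
REVIEW_DESIGNS = {"meta-analysis", "systematic review", "literature review"}


@dataclass
class Study:
    """Hypothetical record of one reviewed paper for a single condition."""
    design: str                       # e.g., "cohort" or "meta-analysis"
    adjusts_for_psych_illness: bool   # adjusted for psychiatric illness before/during pregnancy
    p_value: Optional[float]          # post-adjustment p value, if reported
    effect_size: Optional[float]      # post-adjustment effect size, if reported
    has_comparison_group: bool
    outcome: str                      # "PPD" or "depression"


def supports_confirmed(s: Study) -> bool:
    """True if a paper meets the per-paper criteria for a confirmed risk factor."""
    adjusted = s.design in REVIEW_DESIGNS or s.adjusts_for_psych_illness
    significant = (s.p_value is not None and s.p_value < 0.05) or (
        s.effect_size is not None and s.effect_size > 0.30
    )
    return adjusted and significant and s.has_comparison_group and s.outcome == "PPD"


def supports_potential(s: Study) -> bool:
    """True if a paper reports a significant association with PPD or depression.

    Assumption: "statistically significant" is taken to mean p < 0.05.
    """
    return s.outcome in {"PPD", "depression"} and s.p_value is not None and s.p_value < 0.05


def classify_condition(studies: List[Study], field_consensus: bool) -> str:
    """Label a condition as 'confirmed', 'potential', or 'not classified'."""
    if field_consensus and any(supports_confirmed(s) for s in studies):
        return "confirmed"
    if any(supports_potential(s) for s in studies):
        return "potential"
    return "not classified"
```

Under these assumptions, a condition backed by an adjusted, significant cohort study is labeled "confirmed" only when `field_consensus` is also true; otherwise it falls back to "potential".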
Fig. 1 Flow diagram of paper selection process for papers involved in postpartum depression
(PPD). There were many risk factors for PPD and therefore each risk factor (i.e.,
labeled as [Condition] in [Fig. 1]) was searched to identify further information with
regard to PPD risk factor status. This step was conducted primarily because research on
certain PPD risk factors is scant and the field can change over time; therefore, we
wanted the most up-to-date and relevant information. We have not included numbers in
[Fig. 1] because the total number of reviewed papers varies by each confirmed and potential
risk factor that we investigated. The criteria for delineating confirmed versus potential
PPD risk factors are further described in [Fig. 2] with paper count cutoffs.
Fig. 2 Criteria used to determine postpartum depression (PPD) risk factor status. In [Fig. 2], the eligibility criteria for a condition's consideration as a confirmed or potential
PPD risk factor are shown. All confirmed risk factor papers had to adjust for psychiatric
illness, either prior to or during pregnancy, excluding meta-analyses, systematic
reviews and literature reviews. In determining whether to include these latter studies,
three considerations were made: (1) at least one other source had to adjust for a
history or experience during pregnancy of psychiatric illness, (2) these studies had
a level of influence in their field as determined by citations numbering above and
often beyond 200, and (3) the first authors were influential in their field with h-indexes
above 40.
After identifying confirmed and potential PPD risk factors, we categorized them into
four different types as follows: (1) mental condition related, (2) pregnancy or birth
related, (3) instability mediated by outside factor related, and/or (4) other body
condition related. Mental condition–related risk factors included mental illness diagnoses,
as well as perceptions of self or others. Pregnancy- or birth-related risk factors
included any complications or difficulties related to the pregnancy and delivery,
while instability mediated by outside factor–related risk factors included conditions
at least partially outside the mother's control such as socioeconomic status and abuse.
Finally, unclassifiable risk factors were categorized as other body condition-related
which included nonpregnancy issues such as asthma. Polyhierarchical modeling was used
to create the ontology, allowing risk factors to be categorized as multiple risk types.
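A minimal sketch of what polyhierarchical categorization means in practice: each risk factor maps to a set of risk types rather than to a single parent. The type assignments below follow [Table 2 ]; the short type labels and the helper names are illustrative stand-ins, not identifiers from the PDO.

```python
from typing import Dict, List, Set

# Risk factor class -> risk types (assignments taken from Table 2;
# the short type labels are hypothetical stand-ins for the PDO type classes).
RISK_FACTOR_TYPES: Dict[str, Set[str]] = {
    "History_of_Postpartum_Depression": {"mental_condition", "pregnancy_or_birth"},
    "Abuse_Violence_Type": {"outside_factor"},
    "Traumatic_Brain_Injury": {"outside_factor", "other_body_condition"},
    "Preeclampsia": {"pregnancy_or_birth"},
}


def factors_of_type(risk_type: str) -> List[str]:
    """Return all risk factors categorized under the given risk type."""
    return sorted(
        name for name, types in RISK_FACTOR_TYPES.items() if risk_type in types
    )


def has_multiple_types(name: str) -> bool:
    """True if the risk factor belongs to more than one risk type."""
    return len(RISK_FACTOR_TYPES[name]) > 1
```

With this representation, `History_of_Postpartum_Depression` appears both when querying mental condition–related and pregnancy- or birth-related risk factors, which a strict single-parent tree could not express.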
Review of Existing Ontologies in the Postpartum Depression Domain
A useful ontology must be accessible for different groups and applications,[26] so we
first reviewed existing ontologies for reuse. We used the National Center for Biomedical
Ontology (NCBO) BioPortal (https://bioportal.bioontology.org/), Ontobee
(http://www.ontobee.org/), Open Biological and Biomedical Ontology Service
(http://www.obofoundry.org/), AberOWL (http://aber-owl.net/#/), Ontology Search
(https://www.ebi.ac.uk/ols/index), and Protégé Ontology Library
(https://protegewiki.stanford.edu/wiki/Protege_Ontology_Library) to examine ontologies
and ontology classes related to PPD. We evaluated these ontologies
for their coverage of the PPD domain, as well as their ability to characterize PPD
risk factors and symptoms.
More than 110 vocabulary resources were found that included depression-related classes
but were not specific to PPD at the ontology level. We included information from the
five most relevant sources in the PPD domain in [Table 1 ]. Of the vocabulary resources identified, few included PPD entries, and most entries
lacked detail, further demonstrating the need to develop a PDO.
Table 1 Relevance of existing ontologies to the PPD domain

| Resource | Type of resource | PPD knowledge | Suitability to characterize PPD risk factors and symptoms | Relevance to this ontology |
| --- | --- | --- | --- | --- |
| International Classification of Diseases, version 9-clinical modification (ICD-9-CM) | Clinical terminology | Limited coverage with no PPD-specific code, 16 MDD codes, and some symptom codes | Vague definitions and lack of codes designated for PPD limit the usefulness for characterizing PPD without clinical notes. No risk factor relationships are included | Terms in the ontology will map to ICD-9 codes |
| International Classification of Diseases, version 10-clinical modification (ICD-10-CM) | Clinical terminology | Reasonable coverage with 1 PPD-specific code, 22 MDD codes, and some symptom codes | Designated PPD code and many detailed MDD codes better characterize PPD as compared with ICD-9-CM. No risk factor relationships are included | Terms in the ontology will map to ICD-10 codes |
| Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) | Ontology | Comprehensive coverage of MDD, maternal mental disorders, and symptoms | Suitable for ontology terms related to PPD, but conditions and symptoms are spread across the ontology. No risk factor relationships are included | Terms in the ontology will map to SNOMED CT terms |
| Psychology Ontology (APAONTO) | Thesaurus | Limited coverage with 1 PPD entry and 1 MDD entry | Disorganized alphabetical list of psychology terms constitutes APA thesaurus. Includes definitions of depressive disorders. No risk factor relationships are included | Terms in the ontology will map to APAONTO terms |
| MedlinePlus Health Topics (MEDLINEPLUS) | Ontology | Limited coverage with 1 PPD entry and 28 mappings of that term | Disorganized hierarchy of health topics with annotated definitions. Does not provide more details for conditions beyond definitions. No risk factor relationships are included | Mappings to other ontologies suggest mappings for PDO |

Abbreviations: APA, American Psychological Association; MDD, major depressive disorder;
PDO, postpartum depression ontology; PPD, postpartum depression.
Given the lack of focus on PPD in these resources and our specific target population
of EHR researchers and medical professionals, we created an ontology separate from
other existing resources to best fit researchers' and medical professionals' needs.
The five most relevant sources in [Table 1] were used to organize the ontology and will
help to standardize future versions through mappings to established ontologies such as
SNOMED CT (https://www.nlm.nih.gov/healthit/snomedct/index.html), a source commonly used
in constructing ontologies representative of EHR data.[27]
Protégé-OWL Representation of the Postpartum Depression Knowledge Base
The PDO was written in the Web Ontology Language (OWL) using the application Protégé-OWL
v.5.5.0.[28 ] The initial ontology was built considering pregnancy- and mental health–related
ICD codes, as well as ICD codes for PPD risk factors. Symptoms of PPD included in
the initial ontology were obtained from the literature review by Stewart et al.[29 ] Updated versions of the ontology postevaluation included treatments and other PPD-related
variables.
[Supplementary Table S1] (available in the online version) includes the descriptions of the PDO's three main
superclasses, as well as examples of important pregnancy and mental health subclasses
in each hierarchy. In the PDO, each class that was a confirmed or potential risk factor
included the relevant ICD-9 and -10 codes as individuals. Although the ICD-10-clinical
modification (CM) clinical terminology defined ICD codes as classes, they had only
one logical parent class, whereas ICD codes in the PDO had relationships with multiple
parent classes and therefore necessitated multiple inheritance; thus, ICD codes were
designated as individuals in this ontology.
Evaluation of the Ontology for Correctness and Usefulness
The ontology was evaluated for correctness of ontology form, domain knowledge, and
usability. Ontology form was evaluated using the Pellet reasoner in Protégé-OWL, while
domain knowledge and usability were assessed through case review by two domain experts
(R.B.M. and M.R.B.). Ten patient case studies from eight online sources were
compiled,[16][17][18][19][20][21][22][23] describing women presenting with various PPD
phenotypes. These online sources were
chosen to provide a representative set of experiences including various risk factors
and symptoms that may not always be recorded in clinical notes, as well as to allow
widespread sharing of results without Health Insurance Portability and Accountability
Act (HIPAA) concerns. Recent work by Borland et al.[30] has also demonstrated the importance of considering patient experiences in building
accurate ontologies about specific conditions in addition to the traditional encyclopedic
knowledge included. The two domain experts independently reviewed the first five cases
to determine missing risk factors, as well as PPD symptoms and treatments. During
case review, terms and phrases (“chunks”) relevant to the mother and PPD were highlighted,
then compiled and categorized by their relevant features. For example, the chunk “down
feelings” was categorized by the relevant feature “symptoms of depression: depressed
mood.” The chunks chosen by the domain experts were compared to compile a list of
all relevant features identified. Then, duplicates were removed, and the unique relevant
features were labeled as symptoms, risk factors, treatments, or other. Finally, the
PDO was evaluated for its inclusion of the unique relevant features, so that necessary
changes and additions to the domain knowledge could be made. A second pass using five
more cases was then conducted to evaluate the updated ontology. [Fig. 3 ] illustrates an overview of the evaluation process, including the specific process
of determining the number of unique relevant features to be analyzed against the existing
ontology in each evaluation.
Fig. 3 Evaluation schema. Two postpartum depression (PPD) case study evaluations for the
postpartum depression ontology (PDO) were conducted. Both evaluations consisted of
extracting relevant terms and phrases from the case studies by two evaluators, comparing
relevant terms, and excluding features that were not PPD symptoms, risk factors, or
treatments. In [Fig. 3 ], the assessment of the PDO against these relevant features is shown.
Results
Postpartum Depression Ontology
The PDO was designed to formalize the PPD knowledge base in terms of ICD codes and
clinical notes, including its risk factors, symptoms, treatments, comorbidities, and
other related pregnancy or mental illness conditions. The ontology includes 734 classes,
13 object properties, and 4,844 individuals. The ontology has been made available
on the NCBO BioPortal at https://bioportal.bioontology.org/ontologies/PARTUMDO
for researchers to utilize and incorporate into their future work.
Postpartum Depression Risk Factor Identification
In total, 78 risk factors were identified, with 8 labeled as confirmed risk factors
and 70 as potential risk factors. The first PPD ontology was constructed with 7 confirmed
risk factors and 55 potential risk factors; however, after the initial evaluation,
one confirmed risk factor was identified by the first case study evaluation and added.
Furthermore, 11 potential risk factors were added, with 7 identified by the first
case study evaluation and 4 by clinical expertise. The final evaluation of case studies
revealed three more potential risk factors, and one potential risk factor was identified
by additional clinical expertise.
[Table 2] includes all confirmed risk factors and eight selected potential risk factors.
Confirmed risk factors were strongly supported in the literature, with at least two
independent groups finding a statistically significant increased risk of PPD and many
citations; the papers supporting the eight confirmed risk factors had between 47 and
2,492 citations as of April 12, 2021, with each risk factor supported by at least one
paper with 245 or more citations ([Table 2]). In contrast, conditions that were classified
as potential risk factors often failed to include some known confounding variables, had
a potentially bidirectional relationship with PPD,[31] or lacked extensive research in
the field, with few papers on the subject or an established association with depression
but not PPD. [Supplementary Table S2] (available in the online version) includes a complete
list of the 70 potential risk factors identified from the literature, case studies, and
clinical expertise. Since these additional risk factors are only “potential” due to the
existence of some disagreement in the literature with regard to their role in PPD, we
include them as a supplement for interested researchers.
Table 2 Classifications and sources for confirmed PPD risk factors and eight selected potential
PPD risk factors (out of 70 total potential PPD risk factors)

| Risk factor[a] | Class(es) in ontology[b][c] | Type(s) | Status[d] |
| --- | --- | --- | --- |
| History of depression[7][8][9][25][37][38] | History_of_Major_Depressive_Disorder; *Major_Depressive_Disorder | Mental condition | C |
| History of anxiety[25][37] | History_of_Generalized_Anxiety_Disorder; *Generalized_Anxiety_Disorder | Mental condition | C |
| History of postpartum depression[9][37][39] | History_of_Postpartum_Depression; *Postpartum_Depression | Mental condition; Pregnancy or birth | C |
| Anxiety during pregnancy[8][25][50] | *Anxiety_During_Pregnancy; *Generalized_Anxiety_Disorder | Mental condition; Pregnancy or birth | C |
| Depression during pregnancy[8][25][50] | *Depression_During_Pregnancy; *Major_Depressive_Disorder | Mental condition; Pregnancy or birth | C |
| Abuse[37][51] | Abuse_Violence_Type | Outside factor | C |
| Subjective lack of support post pregnancy[29][38][52][53] | Negative_Perception_of_Support_Postpregnancy; *Negative_Perception_of_Support | Outside factor; Mental condition; Pregnancy or birth | C |
| Relationship dissatisfaction[9][54] | Relationship_Dissatisfaction | Outside factor | C |
| Multiple gestation[55][56] | Multiple_Gestation | Pregnancy or birth | P |
| Preeclampsia[10][57] | Preeclampsia | Pregnancy or birth | P |
| Traumatic brain injury[58] | Traumatic_Brain_Injury | Outside factor; Other | P |
| Unplanned, mistimed, or unwanted pregnancy[59] | Unplanned_Pregnancy; Mistimed_Desire; *Unwanted_Pregnancy | Pregnancy or birth; Mental condition | P |
| Assisted delivery[7][11] | Emergency_Cesarean_Section; Instrument_Assisted_Delivery | Pregnancy or birth | P |
| Preterm delivery (<37 weeks)[60] | Moderate_to_Late_Preterm; *Preterm | Pregnancy or birth | P |
| **Breastfeeding intent different from reality[45] | Intent_to_Breastfeed_and_Did_Not_Breastfeed; No_Intent_to_Breastfeed_and_Did_Breastfeed | Pregnancy or birth; Mental condition | P |
| Gestational diabetes[7] | Gestational_Diabetes | Pregnancy or birth | P |

Abbreviation: PPD, postpartum depression.
a A double starred (**) row indicates risk factors for which there are no ICD codes
(no individuals) in the ontology.
b Since there are no International Classification of Diseases (ICD) codes with the
temporal designation “history of” for depression, anxiety, and PPD, or “post pregnancy”
for lack of support post pregnancy, starred (*) classes include the individuals for
these conditions. Similarly, potential risk factors with starred classes indicate
the ontology class under which ICD codes can be found.
c Classes that do not specifically describe the risk factor, but which include ICD
codes that could be used to identify patients with the risk factor after adjusting
for necessary temporal or other relationships, are designated by a star (*) and italics.
For example, a potential risk factor is Moderate_to_Late_Preterm delivery; however,
there are no ICD codes specific to this time period, so the ICD codes reside under
the less specific *Preterm class.
d Above, confirmed PPD risk factors are designated “C,” whereas potential PPD risk
factors are designated “P” under the Status column.
All 4,844 individuals in the ontology were instances of one or more of the 8 confirmed
risk factor classes, or of the classes for the 47 potential risk factors for which ICD
codes existed. For history of depression, anxiety, and PPD, as well as lack of support
postpregnancy, there were no ICD codes with the “history of” or “post pregnancy” temporal
designations. A star (*) in [Table 2 ] indicates the ontology class under which the ICD codes (individuals) for these conditions
are located; however, the temporal relationship would need to be determined by a researcher
or medical professional. For example, a patient with a history of PPD would be diagnosed
with one of the ICD codes that was an individual of the Postpartum_Depression class.
Researchers could then specify a time range when pulling patient codes from EHRs to
determine whether patients had a history of PPD or current PPD. Furthermore, there
were 23 potential risk factors for which there existed no ICD codes; thus, these risk
factors are designated by a double star (**).
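The time-range step mentioned above can be sketched as follows. This is an illustrative sketch under stated assumptions, not a validated phenotyping rule: the function name, the labels, and the 365-day approximation of the 12-month postpartum window are all assumptions, and a diagnosis date is assumed to refer to a code that is an individual of the Postpartum_Depression class.

```python
from datetime import date, timedelta

# Assumption: the "up to 12 months after childbirth" window is approximated as 365 days.
POSTPARTUM_WINDOW = timedelta(days=365)


def classify_ppd_code(diagnosis_date: date, delivery_date: date,
                      window: timedelta = POSTPARTUM_WINDOW) -> str:
    """Label a PPD diagnosis code relative to the delivery under study.

    - 'history_of_ppd': code recorded before the index delivery
      (i.e., related to an earlier pregnancy)
    - 'current_ppd': code recorded within the postpartum window
    - 'outside_window': code recorded after the window has closed
    """
    if diagnosis_date < delivery_date:
        return "history_of_ppd"
    if diagnosis_date <= delivery_date + window:
        return "current_ppd"
    return "outside_window"
```

For example, under these assumptions a code dated two years before the index delivery would count toward History_of_Postpartum_Depression, while a code dated three months after it would count as current PPD. A real study would need to choose the window and index event deliberately.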
Postpartum Depression Ontology Object Properties
The object properties within the PDO are included in [Table 3]. Object properties were
critical for demonstrating the relationships of risk factors with PPD, as well as detailing
injury relationships for the investigation of traumatic brain injury (TBI) as a potential
risk factor. There were 13 object properties in total, with seven classified as pregnancy-
or mental health–related and two specifically showing the relationships among PPD and
its risk factors. Nine object properties had the general domain class owl:Thing because
their domains spanned the entire ontology; for example, there were risk factors in all
three superclasses of the PDO.
Table 3 Ontology object properties

| Domain class | Object property | Conjunction | Range class | Related to mental health, pregnancy, or other |
| --- | --- | --- | --- | --- |
| owl:Thing | has_PPD_risk_factor_status | only | Postpartum_Depression_Risk_Status | Pregnancy; Mental health |
| owl:Thing | has_PPD_risk_type | only; some | Postpartum_Depression_Risk_Type | Pregnancy; Mental health |
| owl:Thing | has_ICD_version | NA | ICD_Code_Version | Pregnancy |
| owl:Thing | is_symptom_of | some | Postpartum_Depression | Pregnancy; Mental health |
| Inviable_Pregnancy_Condition | causes_living_status | only | Infant_or_Fetus_Inviability | Pregnancy |
| owl:Thing | has_postpartum_psychosis_risk_factor_status | only | Postpartum_Psychosis_Risk_Status | Pregnancy; Mental health |
| owl:Thing | has_psychotic_status | only | Psychotic_Status | Mental health |
| Injury_Type | has_injury_depth | only | Injury_Depth | Other |
| Injury_Type | has_injury_trauma_type | only | Injury_Trauma | Other |
| TBI_Related_Injury | has_injury_area | only | Injury_Area | Other |
| owl:Thing | is_during | only | Condition_Time | Other |
| owl:Thing | has_condition_type | only | Condition | Other |
| owl:Thing | has_procedure | only | Medical_Procedure_Encounter_or_Treatment | Other |

Abbreviations: ICD, International Classification of Diseases; PPD, postpartum depression.
Pregnancy- and Mental Health–Related Object Properties
The PPD risk factors spanned the ontology and were categorized as several distinct
types, leading to general owl:Thing domain classes for both PPD-related object properties.
To differentiate among the types of risk factors, the object property has_PPD_risk_type was created with a range of Postpartum_Depression_Risk_Type. This class was further
subdivided into the four risk factor type classes. To demonstrate whether the literature
supported classes as confirmed or potential risk factors, the object property has_PPD_risk_factor_status was formed with a range of Postpartum_Depression_Risk_Status. This class included
three subclasses identifying variables as confirmed, potential or not risk factors;
this latter categorization included the class Mistaken_for_PPD which contained conditions
with symptoms similar to those of PPD that could lead to an incorrect diagnosis.
The object property has_PPD_risk_factor_status required the use of the conjunction ‘only’ to relate it to a class and exclude the
possibility of a confirmed risk factor also being a potential risk factor or not a
risk factor. For example, Abuse_Violence_Type—the class representing all forms of
abuse—had the object property relationship has_PPD_risk_factor_status ‘only’ Confirmed_PPD_Risk_Factor. In contrast, the object property has_PPD_risk_type allowed the conjunctions ‘only’ and ‘some’ to relate classes due to the polyhierarchical
structure of the PDO. Abuse_Violence_Type had the object property relationship has_PPD_risk_type ‘only’ Instability_Mediated_by_Outside_Factor_Related_PPD_Risk_Type because it could
not be classified as one of the other three types. However, History_of_Postpartum_Depression
used the conjunction ‘some’ because it could be considered both a mental condition
and pregnancy- or birth-related risk type, and ‘some’ specifies an “at least one”
relationship.[32 ]
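The difference between ‘only’ and ‘some’ can be illustrated with a small closed-world sketch in Python. This is a simplification for intuition only: OWL reasoners such as Pellet work under the open-world assumption and on class expressions, whereas the helpers below (hypothetical names) simply check an explicit, complete set of fillers. The short risk type labels are stand-ins for the PDO risk type classes.

```python
from typing import Set


def holds_some(fillers: Set[str], target: Set[str]) -> bool:
    """'some' (existential): at least one filler belongs to the target class."""
    return any(f in target for f in fillers)


def holds_only(fillers: Set[str], target: Set[str]) -> bool:
    """'only' (universal): every filler belongs to the target class.

    Like the OWL restriction, this is trivially true when there are no fillers.
    """
    return all(f in target for f in fillers)


# Risk types attached to two classes discussed in the text (labels are stand-ins).
history_of_ppd_types = {"mental_condition", "pregnancy_or_birth"}
abuse_types = {"outside_factor"}

MENTAL = {"mental_condition"}
OUTSIDE = {"outside_factor"}

# History_of_Postpartum_Depression: 'some' mental condition holds, 'only' does not.
history_some_mental = holds_some(history_of_ppd_types, MENTAL)
history_only_mental = holds_only(history_of_ppd_types, MENTAL)

# Abuse_Violence_Type: 'only' outside factor holds because no other type applies.
abuse_only_outside = holds_only(abuse_types, OUTSIDE)
```

Here `history_some_mental` and `abuse_only_outside` are true while `history_only_mental` is false, mirroring why the PDO uses ‘some’ for risk factors with more than one type and ‘only’ for those with exactly one.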
[Fig. 4 ] shows an example of the relationships and individuals for the confirmed risk factor
abuse and, more specifically, sexual abuse.
Fig. 4 Individuals and the has_PPD_risk_factor_status object property showing abuse as an
example. In [Fig. 4], blue arrows show a “has subclass” relationship, the yellow arrow shows a has_PPD_risk_factor_status ‘only’ relationship with the Confirmed_PPD_Risk_Factor class, and the purple arrows
signify a “has individual” relationship. ICD codes for sexual abuse are shown here, with two overlapping the
Physical_Abuse class. ICD, International Classification of Diseases; PPD, postpartum
depression.
Three more pregnancy- or mental health–related object properties were particularly
important in the ontology. The object property has_ICD_version was used among the 4,844 individuals of the PPD risk factors to show whether the
codes were from version ICD-9 or -10; no specified conjunction was necessary because
the object property connected individuals. [Fig. 5 ] represents the other object property illustrating the variety of PPD symptoms included
in the final version of the PDO. The is_symptom_of object property had a range of Postpartum_Depression, and the conjunction ‘some’
was used because these symptoms are at least related to Postpartum_Depression, yet
they could also be related to other conditions. Any symptoms and signs with synonymous
or similar descriptions were included using the rdfs:label annotation.
Fig. 5 Symptoms of PPD as shown in the PDO. [Fig. 5] visualizes several PPD symptoms; the blue arrows
show a “has subclass” relationship and the orange arrows signify an is_symptom_of
‘some’ relationship with the Postpartum_Depression class. The yellow boxes show clustering
of synonyms and similar terms under the rdfs:label annotation. PDO, postpartum depression
ontology; PPD, postpartum depression.
Other Object Properties
There were six object properties classified as “Other.” Due to the relationship among
traumatic injury, TBI, physical abuse, and depression,[33][34][35] a clinical expert
(M.R.B.) recommended that we include TBI in the ontology as a potential PPD risk factor.
Thus, the injury class required three object properties to relate five important
subclasses of the injury superclass: Injury_Type, Injury_Area, Injury_Depth, Injury_Trauma,
and TBI_Related_Injury. The class Injury_Type was related to the classes Injury_Depth
and Injury_Trauma by the object properties has_injury_depth and has_injury_trauma_type,
respectively, while TBI_Related_Injury was related to Injury_Area by the object property
has_injury_area. All used the conjunction ‘only’ to exclude false injury descriptions;
for example, the class Traumatic_Brain_Injury used the conjunction ‘only’ to provide
closure to the object property has_injury_trauma_type ‘only’ Traumatic_Injury such that
a nontraumatic cause of a brain injury was excluded.
Of the remaining three object properties classified as “Other,” the is_during object property was particularly important. Since the ontology was created to represent
the PPD knowledge base, time was defined relative to pregnancy and delivery. The is_during object property was used to form many complex class expressions which have more than
one conjunction. For example, the class Gestational_Diabetes was equivalent to Diabetes
‘and’ is_during ‘only’ Time_During_Pregnancy. In other words, gestational diabetes is a type of diabetes
that only occurs during pregnancy. In another example, the object property defined
the meaning of a very preterm baby, which is a baby born from 28 weeks of gestation
up to 31 weeks 6 days of gestation.[36] In this case, the object property would be used as follows: Very_Preterm is_during ‘only’ (28_Weeks_Gestation ‘or’ 29_Weeks_Gestation ‘or’ 30_Weeks_Gestation ‘or’ 31_Weeks_Gestation).
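The Very_Preterm expression enumerates completed weeks of gestation, so a birth at 31 weeks 6 days still falls under 31_Weeks_Gestation. A minimal sketch of the same rule for numeric data is shown below; the function names and the representation of gestational age in days are assumptions made for illustration and are not part of the PDO.

```python
# Completed weeks covered by the Very_Preterm class expression in the text.
VERY_PRETERM_WEEKS = {28, 29, 30, 31}


def completed_weeks(gestational_age_days: int) -> int:
    """Convert gestational age in days to completed weeks of gestation."""
    return gestational_age_days // 7


def is_very_preterm(gestational_age_days: int) -> bool:
    """True for births from 28 weeks 0 days through 31 weeks 6 days."""
    return completed_weeks(gestational_age_days) in VERY_PRETERM_WEEKS
```

For example, 31 weeks 6 days is 223 days and is classified as very preterm, while 32 weeks 0 days (224 days) is not.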
Evaluation Results
Case Study Evaluation Results
The case study evaluation was performed in two rounds of five case studies by R.B.M.
and M.R.B. ([Supplementary Table S3 ], available in the online version). In the first set of 158 sentence chunks ([Fig. 3 ]), 60 chunks (38%) were identified as relevant features by the two evaluators, and
all 60 were annotated similarly. The remaining 98 chunks (62%) were only identified
by one evaluator due to varying expertise; these were discussed to reach a consensus
opinion for all. This process was repeated for the second set of case studies with
214 sentence chunks ([Fig. 3 ]). Of these, 106 chunks (49.5%) were identified as relevant features by both evaluators
with 96 annotated similarly and 10 without consensus. This lack of consensus was due
to differing interpretations of the sentence chunks. For example, “I wasn't eating
or drinking enough water, which meant my body wasn't making breastmilk” was interpreted
by R.B.M. as “breastfeeding difficulties,” whereas M.R.B. interpreted this as “nutrition
issues.” For these 10 chunks and 108 (50.5%) identified by only one evaluator, a second
discussion led to a consensus on classification of relevant features.
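The evaluator overlap reported for each round follows directly from the chunk counts. The short Python sketch below recomputes the percentages; the variable and function names are illustrative assumptions, and only the counts come from the text.

```python
# Evaluator overlap per round, using the chunk counts reported in the text.
# The names (rounds, overlap_summary) are illustrative only.
rounds = {
    "first": {"total_chunks": 158, "identified_by_both": 60},
    "second": {"total_chunks": 214, "identified_by_both": 106},
}


def overlap_summary(total_chunks, identified_by_both):
    """Return counts and percentages for chunks flagged by both vs. one evaluator."""
    by_one = total_chunks - identified_by_both
    return {
        "by_both_pct": round(100 * identified_by_both / total_chunks, 1),
        "by_one": by_one,
        "by_one_pct": round(100 * by_one / total_chunks, 1),
    }


summary = {name: overlap_summary(**counts) for name, counts in rounds.items()}
```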
[Table 4 ] shows the results of the analysis comparing the unique relevant features identified
through case studies to the PDO in both rounds of evaluation. During the initial evaluation,
79 unique relevant features were analyzed. In total, 30.4% of all unique relevant
features were explicitly included in the first PPD ontology; when similar classes
for unique relevant features without an explicit class in the ontology were considered,
45.6% of the unique relevant features were covered. All 14 PPD treatments in the initial
evaluation were not encapsulated, so they were added to a new class of features called
PPD_Treatment. Furthermore, of the 11 PPD risk factors that were not already included,
10 were potential risk factors and one—History_of_Anxiety—was identified as a confirmed
risk factor. A subsequent survey of the literature on History_of_Anxiety was performed
to fulfill the criteria required in [Fig. 2 ].
Table 4 Case study evaluation results for PPD symptoms, risk factors, and treatments
Unique relevant feature type | Initial evaluation (n = 79)[a ]: included in first version of PDO | Initial evaluation: total | Final evaluation (n = 88)[b ]: included in final version of PDO | Final evaluation: total
PPD symptoms | 13 (30.2%) | 43 | 31 (67.4%) | 46
PPD risk factors | 11 (50.0%) | 22 | 27 (90.0%) | 30
PPD treatments | 0 (0.0%) | 14 | 10 (83.3%) | 12
Abbreviations: PDO, postpartum depression ontology; PPD, postpartum depression.
a The initial evaluation involved assessing the first version of PDO against n = 79 unique relevant features obtained from the first set of five case studies.
b The final evaluation involved assessing the second version of PDO against n = 88 unique relevant features obtained from the second set of five case studies.
The final evaluation involved 88 unique relevant features of which 77.3% were already
explicitly included in the second PPD ontology and 86.4% were covered when similar
classes were evaluated. Only three new potential risk factors were identified among
the 30 unique risk factor features, so the ontology already covered 90.0% of them, consistent
with its goal of characterizing PPD risk factors. Most treatments were already included
(83.3%), whereas several new symptoms were identified.
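The explicit-coverage percentages can be recomputed from the Table 4 counts, as in the Python sketch below. The dictionary layout and the function name coverage_pct are assumptions made for illustration. The figures that also count similar classes (45.6% and 86.4%) cannot be recomputed this way because their underlying counts are not listed in the table.

```python
# Explicit coverage recomputed from the Table 4 counts: (included, total).
# coverage_pct and the dictionary layout are illustrative assumptions.
initial = {"symptoms": (13, 43), "risk_factors": (11, 22), "treatments": (0, 14)}
final = {"symptoms": (31, 46), "risk_factors": (27, 30), "treatments": (10, 12)}


def coverage_pct(counts, exclude=()):
    """Percentage of unique relevant features that have an explicit class."""
    kept = [pair for name, pair in counts.items() if name not in exclude]
    included = sum(inc for inc, _ in kept)
    total = sum(tot for _, tot in kept)
    return round(100 * included / total, 1)


initial_coverage = coverage_pct(initial)                      # 24 of 79
final_coverage = coverage_pct(final)                          # 68 of 88
initial_without_treatments = coverage_pct(initial, exclude=("treatments",))  # 24 of 65
```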
The Pellet Reasoner Evaluation Results
We used the Pellet reasoner to evaluate the final ontology and 154 inconsistencies
were found. All inconsistencies resulted from multiple inheritance of ICD code individuals
under the three disjoint superclasses: Condition, Maternal_Descriptor, and Medical_Procedure_Encounter_or_Treatment
([Supplementary Table S1 ], available in the online version).
One major source of inconsistencies was ICD codes describing high-risk pregnancies,
which are classified as a descriptor of pregnancy risk in the ontology. For example,
ICD-10 code O09.811, which is defined as “Supervision of pregnancy resulting from
assisted reproductive technology, first trimester,” is an individual of the High_Risk
and Assisted_Reproductive_Technology_Cycle classes. However, these are located under
the Maternal_Descriptor and Medical_Procedure_Encounter_or_Treatment superclasses,
respectively. ICD codes classified as multiple gestations also exhibited many inconsistencies.
The ICD-9 code V27.3, which is defined as “Outcome of delivery, twins, one liveborn
and one stillborn,” exhibited multiple inheritance of the classes Infant_Stillbirth
(Condition) and Multiple_Gestation (Maternal_Descriptor).
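The kind of conflict described in these two examples can be sketched in a few lines of Python. This is a simplified membership lookup, not a description logic reasoner such as Pellet; the function name disjointness_violations is hypothetical, and the sketch assumes each class maps directly to one of the three superclasses, whereas in the ontology the classes may sit deeper in the hierarchy.

```python
from itertools import combinations

# Superclass of each class mentioned in the two examples from the text.
# The direct class-to-superclass mapping is a simplifying assumption.
SUPERCLASS_OF = {
    "High_Risk": "Maternal_Descriptor",
    "Assisted_Reproductive_Technology_Cycle": "Medical_Procedure_Encounter_or_Treatment",
    "Infant_Stillbirth": "Condition",
    "Multiple_Gestation": "Maternal_Descriptor",
}

# Class memberships of the two ICD code individuals from the text.
INDIVIDUALS = {
    "O09.811": ["High_Risk", "Assisted_Reproductive_Technology_Cycle"],
    "V27.3": ["Infant_Stillbirth", "Multiple_Gestation"],
}


def disjointness_violations(individuals, superclass_of):
    """Map each individual to the pairs of disjoint superclasses it falls under."""
    violations = {}
    for code, classes in individuals.items():
        supers = sorted({superclass_of[c] for c in classes})
        pairs = list(combinations(supers, 2))
        if pairs:
            violations[code] = pairs
    return violations


violations = disjointness_violations(INDIVIDUALS, SUPERCLASS_OF)
```

Both individuals are flagged because each belongs to two superclasses that were declared disjoint; removing the disjointness axiom makes the same memberships consistent.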
The inconsistencies identified by the Pellet reasoner demonstrated that the three
superclasses of the PDO should not be disjoint; some ICD codes are both descriptive
in nature, as well as indicate a condition or procedure. All inconsistencies were
deemed to be valid from a logical and clinical perspective, and we modified the ontology
accordingly. The finalized ontology available on the NCBO BioPortal reflects the most
current, corrected version. Any changes made after publication of this manuscript
will also be shared with the community via the NCBO BioPortal at
https://bioportal.bioontology.org/ontologies/PARTUMDO. For an overview of the ontology's
hierarchy, see [Fig. 6 ].
Fig. 6 A graphical overview of the Postpartum Depression Ontology Superclasses and Direct
Subclasses of the Ontology.
Discussion
As the first PPD-specific ontology to our knowledge, the PDO was built inclusive of
ICD codes to represent the PPD knowledge domain in an EHR-accessible format for researchers
and medical professionals. After a literature search and two rounds of evaluation,
the ontology encompasses treatments, symptoms, risk factors, and other related conditions,
as well as personal descriptors and procedures. Most importantly, the PDO compiles
78 known and potential PPD risk factors that were identified through the literature,
case studies, and clinical expertise. To date, no studies have considered all of these
factors together, yet understanding the totality of risk factors is critical given
the high prevalence of PPD[3 ]
[4 ] and its serious consequences.[5 ]
[6 ] The PDO developed here designates eight variables as confirmed risk factors and
70 as potential risk factors in an effort not only to inform diagnoses but also to
identify and improve prevention strategies.
The literature search revealed many risk factors, yet very few had sufficient evidence
to support a causal relationship with PPD. Despite agreement among most papers reviewed
that a history of mental illness was one of the strongest risk factors for PPD development,[7 ]
[8 ]
[9 ]
[25 ]
[37 ]
[38 ]
[39 ] other variables investigated often had contradictory significance levels, indicating
the possibility of a bidirectional relationship or an absence of sufficient evidence.
Given the importance of understanding risk factors and the lack of agreement in the
field, we designed this ontology as a resource to discover which areas in the field
required further research. Thus, the object property and class combination of “has_PPD_risk_factor_status ‘only’ Potential_PPD_Risk_Factor” directs researchers to identify conditions that
have not been adequately studied and promotes investigation of these conditions to
add to the field through the inclusion of the relevant ICD codes.
The confirmed risk factor section of the ontology promotes adjustment for factors
as confounding variables in future risk factor studies, as well as helps medical professionals
to identify patients at risk. Since ICD codes were included in the ontology for each
of the confirmed PPD risk factors, medical professionals may offer more individualized
care by using these codes to supplement screening tools such as the PPD-specific Edinburgh
Postnatal Depression Scale (EPDS),[40 ] the Center for Epidemiologic Studies Depression (CES-D) scale,[41 ] the Beck Depression Inventory,[42 ] and the Patient Health Questionnaire-9 (PHQ-9).[43 ] Additionally, researchers can now more easily identify larger cohorts of women at
risk for PPD by using the ICD codes included in EHRs; this population is a critical
target for future intervention studies.
While the PDO incorporates ICD codes for EHR analyses, it was important to include
information beyond medical diagnoses and the data typically
recorded in EHRs, such as age and weight. Specifically, we added classes characterizing
the mother's intentions and perceptions of support, self, and the pregnancy that may
only be present in clinical notes or not at all. This type of information is rarely—if
ever—included in ontologies with mental health sections, yet negative perceptions
of reality and illusions often influence mental health unfavorably.[44 ] Furthermore, outcomes that differ from the mother's expectations, such as intent
to breastfeed but inability to do so, have been linked to an increased risk of PPD;[45 ] thus, this type of information is crucial to include. While perceptions of reality
may be difficult to diagnose, this PDO can improve the current self-assessment screening
tools[40 ]
[41 ]
[42 ]
[43 ] that may suffer from self-reporting bias,[46 ] as well as suggest topics about intentions and perceptions to discuss with patients
such as expecting to lose pregnancy weight quickly.
In addition to classes that were added to characterize intentions, perceptions, and
other variables missing from ICD codes, the use of polyhierarchical modeling and multiple
inheritance was crucial in building the PDO. The polyhierarchy was particularly important
for risk factor classification; simply creating a list of risk factors would have
been insufficient, as this would not allow the categorization of risk factors into
multiple risk types. In addition, multiple inheritance was critical for ICD codes
which often had more than one parent class. For example, the ICD-10 code T74.11XA,
which is defined as “Adult physical abuse, confirmed, initial encounter,” had the
parent classes Abuse_Violence_Type, Adult_Victim, Confirmed_Abuse, and Physical_Abuse.
It was necessary to identify each aspect of the code—confirmed abuse, abuse type,
and victim age—because these variables could play a role in risk factor research,
such as determining age of sustained abuse or type of sustained abuse as carrying
a greater risk of developing PPD. Thus, the development of the PDO through these modeling
choices allowed the characterization of the PPD knowledge domain's complexity.
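One practical consequence of multiple inheritance is that codes can be selected by any combination of their parent classes. The Python sketch below illustrates this with the single code described above; the function name codes_with_all and the class name Child_Victim used in the usage note are hypothetical and are not taken from the PDO.

```python
# Parent classes of one ICD-10 individual, as listed in the text.
PARENTS = {
    "T74.11XA": {
        "Abuse_Violence_Type",
        "Adult_Victim",
        "Confirmed_Abuse",
        "Physical_Abuse",
    },
}


def codes_with_all(parents, required):
    """Return the codes whose parent classes include every required class."""
    required = set(required)
    return sorted(code for code, classes in parents.items() if required <= classes)


adult_physical_abuse = codes_with_all(PARENTS, ["Physical_Abuse", "Adult_Victim"])
```

A query for a hypothetical Child_Victim class together with Physical_Abuse would return no codes from this example, since T74.11XA carries the Adult_Victim parent instead.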
During the evaluation of the PDO, clinical case studies provided a unique focus on
PPD treatments and symptoms that were often absent from the literature and ICD codes.
None of the treatments identified from the first set of case studies were included
in the ontology, which partially accounts for the low total coverage (30.4%);
when treatments were removed from analysis, there was a small increase to 36.9% coverage.
Further, comprehensive coverage of PPD symptoms was difficult because many of the
symptom names used in the case studies were similar to other symptoms described by
different women. Even though the coverage of symptoms doubled between evaluations,
there was only 67.4% coverage in the final evaluation, suggesting the need to continue
updating the ontology with more symptoms and to increase the use of the rdfs:label
annotation. Moreover, despite their absence from ICD codes, which limits EHR research,
these symptoms may be included in clinical notes or screening results that could aid
in the development of better screening tools or in the choice of treatments by medical
professionals.
To date, this PDO evaluation is limited by the relatively low number of case studies
reviewed and the use of two evaluators. Another limitation is that one of the evaluators
(R.B.M.) was the developer of the ontology; however, the second evaluator (M.R.B.)
produced similar results in the evaluation and a consensus opinion with regard to
correctness was derived from comparison of the evaluations, demonstrating the usability
and reliability of the ontology independent of the developer. Future work will involve
a secondary evaluation with additional case studies and new evaluators to assess the
strength of the ontology in its inclusion of all information relevant to a PPD diagnosis.
Newly constructed ontologies will be assessed for relevance to the PDO and new terms
will be included; one such example is the semantic-based verbal autopsy framework
for maternal death[47 ] which includes many pregnancy complications that could increase the risk of PPD
in cases of maternal survival. Additionally, further refinement of PPD symptoms and
treatments is required, as the number of relevant features in those categories added
after the two rounds of evaluation suggests that further PPD features remain to be captured. However,
we are confident that most treatments, symptoms, and risk factors are included either
explicitly or through a similar feature in the ontology due to the final evaluation
yielding 86.4% coverage. In fact, the inclusion of 90.0% of risk factors—which is
the most important category for the PDO's target population—demonstrates that these
are sufficiently characterized.
Beyond a deeper exploration and evaluation of the current PDO, our future work will
involve incorporating more relevant ICD codes into the PDO. Currently, the ICD-9 and
-10 codes included as individuals in the ontology are categorized under the eight
confirmed PPD risk factors and the 47 potential PPD risk factors for which there are
relevant ICD codes. As the literature expands, future iterations of the ontology will
be updated with additional risk factors and their ICD codes,[48 ] as well as include codes from ICD version 11. ICD-11 will be available to Member
States in 2022 to replace earlier versions;[49 ] thus, the relevant ICD codes from all versions must be included as individuals,
so that researchers can identify the most up-to-date codes to use in EHR studies.
Finally, the terms in the PDO will be mapped to SNOMED CT, ICD-9-CM, and ICD-10-CM
to improve accessibility and standardization of resources.
Conclusion
The PDO is a comprehensive ontology of the PPD knowledge base designed to include
information needed for a PPD diagnosis. We made our ontology readily accessible via
the NCBO BioPortal (available at: https://bioportal.bioontology.org/ontologies/PARTUMDO)
for researchers to utilize and incorporate into their future work. Our evaluation
focused on the use of case studies to demonstrate its coverage and usefulness. Interestingly,
the PDO adds a new dimension to the knowledge base by compiling researched risk factors
and designating them as confirmed (known) or potential PPD risk factors, with ICD
codes used in EHRs included for these risk factors. The PDO can therefore illuminate
areas of PPD that require further investigation, as well as supplement the current
PPD screening techniques employed, promoting more clarity in the field for researchers
and potentially improving the standard of care provided by medical professionals.
As an ontology, the PDO provides much more detail than would typically be available
in disease concept sets because it provides relationships between concepts and other
useful information for conducting research studies on PPD.
Clinical Relevance Statement
Postpartum depression (PPD) remains an understudied research area despite its high
prevalence. This study contributes an ontology to aid in the identification of patients
with PPD and to enable future analyses with electronic health record (EHR) data. In
addition to PPD, the ontology includes comorbidities reported in the literature
as related to PPD. The ontology is freely available on the NCBO
BioPortal web site and was constructed using Protégé-OWL. Our ontology will enable
future researchers to study PPD using EHR data as it contains important information
with regard to structured (e.g., billing codes) and unstructured data (e.g., synonyms
of symptoms not coded in EHRs) and also the connections between diseases and comorbidities.
The PDO is publicly available through the National Center for Biomedical Ontology
(NCBO) BioPortal (https://bioportal.bioontology.org/ontologies/PARTUMDO),
which will enable other informaticists to utilize the PDO to study PPD in other
populations.
Multiple Choice Questions
What public database houses publicly available ontologies, including our postpartum
depression ontology?
a. NCBO BioPortal
b. PubMed Central
c. Entrez
d. GitHub
Correct Answer: The correct answer is option a. Our ontology is made freely available on the NCBO
BioPortal web site. The NCBO was founded as one of the National Centers for Biomedical
Computing supported by the NHGRI, NHLBI, and the NIH Common Fund.
Which of the following International Classification of Diseases (ICD) versions are
included in terms of codes in the Postpartum Depression Ontology?
a. ICD-9
b. ICD-10
c. ICD-11
d. ICD-9 and ICD-10
Correct Answer: The correct answer is option d. Our ontology includes diagnostic codes for both ICD-9
and ICD-10 for researchers using those terminologies. These can be easily mapped to
SNOMED-CT using existing OHDSI Common Data Model mappings.
Postpartum depression is a diagnosis code in which clinical terminologies?
a. ICD-9
b. ICD-10
c. ICD-9, ICD-10
d. ICD-10, ICD-11
Correct Answer: The correct answer is option d. While ICD-9 has diagnosis codes for depression, there
is no explicit code for “postpartum depression.” However, in ICD-10, an explicit code
for postpartum depression appears, and ICD-11 also contains specific codes.