Keywords PACS - Data Warehouse - diagnostic radiology - Artificial Intelligence - Fast Healthcare
Interoperability Resources (FHIR)
Introduction
A medical data warehouse is an essential building block for implementing data-driven
medicine in hospitals and research [1 ].
In the clinical context, a data warehouse not only provides relevant patient and examination
data for clinical care but also serves as the basis for clinical decision support
systems (CDSS). A CDSS can help prevent medical errors and ensure efficient and safer
care [2 ].
In radiology, in particular, technological progress has led to a significant increase
in the volumes of data collected from every examination [3 ]. AI systems can help reduce the radiologist’s workload [4 ], but image-based AI systems usually provide the results in the form of DICOM secondary
captures, which have to be retrieved individually and assessed by the diagnostician
in the PACS. A radiology data warehouse not only enables users to clearly compile
AI results from a range of sources in a consistent way but also allows users to integrate
these results directly in a structured report template. A data warehouse is also necessary
for storing structured reports independently of the producer of the report.
Setting up a data warehouse not only supports the primary clinical use of data
but also allows for secondary use of data for research and quality improvement
studies [5 ], as well as for training robust AI algorithms [6 ]. Comprehensive, interoperable, and semantically annotated data sets enable the time-saving
testing of hypotheses in retrospective analyses, from which generalizable knowledge
can be generated that will benefit future patients [7 ]. Particularly for translational research and rare diseases, it is important to aggregate
data from across locations [8 ]. For this reason, the German Federal Ministry of Education and Research (BMBF) has
launched its medical informatics initiative (MII), which aims to improve patient care
and research through the establishment of data integration centers (DIC). A DIC stores
the core data set in Health Level 7 (HL7) Fast Healthcare Interoperability Resources
(FHIR) format [9 ]. Initially, this included mainly administrative data. However, the core data set
is currently being expanded to include an image core data set as part of the OMI project.
Fast Healthcare Interoperability Resources (FHIR, 2014) is the fourth and current
standard for data exchange from the Health Level 7 (HL7) organization; it was preceded
by HL7v1 (1987, proof of concept), HL7v2 (1988, current version HL7v2.9), and HL7v3 (never
properly established due to its complexity and lack of backward compatibility). Compared
to previous versions, it is based on modern and widely established concepts for data
exchange and data storage, such as the Hypertext Transfer Protocol (HTTP) and representational
state transfer application interfaces (REST APIs) ([Table 1 ]) [10 ]. In the FHIR standard, data are stored in resources, which are based on general
concepts in healthcare (e.g. patient, observation, encounter, diagnostic report, imaging
study, questionnaire, questionnaire response, see [Table 2 ]). This granular data storage reduces the complexity of the data without losing any
information. Via a REST API interface, different applications (desktop, browser, app)
and different user groups (doctor, nursing, controlling, research, patient) can access
and, if properly authorized, modify data. FHIR also enables semantic enrichment of
the data through the integration of medical ontologies and terminologies such as SNOMED CT,
LOINC, or RadLex.
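As a rough illustration of the resource concept, the following Python sketch builds a minimal Observation as plain JSON-compatible data and derives the REST address under which a server would expose a resource. The base URL, the patient ID, and the helper names are assumptions made for this example. The body weight code 29463-7 is used only because it is a familiar LOINC code, and the resource is not validated against the FHIR specification.

```python
import json

# Assumed base URL of a FHIR server (hypothetical, adjust to your setup).
FHIR_BASE = "http://localhost:8080/fhir"


def make_observation(patient_id: str, weight_kg: float) -> dict:
    """Build a minimal Observation that is semantically annotated with LOINC."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [
                {
                    "system": "http://loinc.org",
                    "code": "29463-7",
                    "display": "Body weight",
                }
            ]
        },
        # Resources point to each other by reference instead of nesting data.
        "subject": {"reference": f"Patient/{patient_id}"},
        "valueQuantity": {
            "value": weight_kg,
            "unit": "kg",
            "system": "http://unitsofmeasure.org",
            "code": "kg",
        },
    }


def resource_url(resource_type: str, resource_id: str) -> str:
    """Return the REST address of one resource: [base]/[type]/[id]."""
    return f"{FHIR_BASE}/{resource_type}/{resource_id}"


observation = make_observation("example-patient", 72.5)
print(json.dumps(observation, indent=2))
print(resource_url("Patient", "example-patient"))
```

Because the resource is ordinary JSON, any application that speaks HTTP can read or write it without a healthcare-specific parser.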
Table 1 Web standards.
HTTP(S)
Hypertext Transfer Protocol (Secure), a protocol for transmitting data over the Internet,
often used for web communications. With HTTPS, data exchange is encrypted.
REST-API
A REST API is a set of rules and conventions for creating and interacting with web
services. It enables communication between the application and the server, and supports
data manipulation using standard HTTP methods.
TLS
Transport Layer Security is a cryptographic protocol designed to ensure secure communication
over a computer network.
OAuth
Open Authorization is a framework for securely authorizing third-party applications
to access user data without sharing their passwords.
Table 2 Overview of FHIR resources.
Patient
Contains demographic and administrative information about a patient, such as identifiers,
name, date of birth, and contact information.
ImagingStudy
Describes a medical imaging procedure, such as an X-ray or CT scan, and includes information
about the patient, the imaging process, and the results.
Observation
Contains measurements or observations made during a medical examination or treatment,
such as radiation dose or bone density.
Questionnaire
Defines questions and possible answers used for collecting patient data or, for example,
structured reporting templates.
QuestionnaireResponse
Contains the responses by a patient or doctor to a questionnaire (see above).
ServiceRequest
Represents a request for a medical service, such as a laboratory test or radiology
imaging.
DiagnosticReport
Contains reports of diagnostic examinations or tests, including interpretations and
results.
FHIR therefore provides an ideal foundation for a radiology data warehouse. This article
presents an overview of FHIR and explains how to use FHIR to build a radiology data
warehouse.
Overview of current standards in radiology
Interoperability, as defined by the Healthcare Information and Management Systems
Society (HIMSS), is the ability of different information technology systems and software
applications to communicate with each other, exchange data, and use the exchanged
information [11 ]. Data interoperability plays a critical role in data-driven medicine, in general,
and data warehouses, in particular.
Interoperability is based on two basic concepts: syntax and semantics. Syntax refers to a system of rules according to which data are organized. In a linguistic
context, these rules correspond to a grammar. A shared syntax allows data to be processed in a defined way across different IT systems and underlies document standards ([Table 3 ]), such as Extensible Markup Language (XML), JavaScript Object Notation (JSON), Clinical
Document Architecture (CDA) for clinical documents, or DICOM Structured Reporting (DICOM-SR)
[12 ] [13 ]. How such data are exchanged between different systems is defined in a data exchange
standard, such as DICOM, HL7v2, HL7v3 or FHIR ([Table 4 ]).
Table 3 Document standards.
CDA
CDA is a specific implementation and subset of HL7v3 that focuses specifically on
the structure and exchange of clinical documents. The focus is on presenting patient
information in a consistent way, including patient history, observations, and other
health data. While CDA documents conform to the principles and structures defined
in HL7v3, HL7v3 actually covers a broader range of standards for healthcare communication
that goes beyond clinical documentation. In contrast to HL7v3, CDA is quite widespread
as a document standard.
DICOM-SR
DICOM Structured Reporting (DICOM-SR) is a standard for organizing and exchanging
structured information (e.g. text or numbers) in medical imaging [13 ]. DICOM-SR supports the use of ontologies and terminologies (LOINC, SNOMED CT, RadLex,
[Table 5 ]) to make the data semantically interpretable.
DICOM-SC
DICOM Secondary Capture (DICOM-SC) is a special data format in the DICOM standard
for medical image data. It is used to store image data derived from primary images,
often through image processing or conversion to another format such as JPEG. These
secondary images can be used for reference purposes or reporting.
Table 4 Standards in data exchange.
DICOM
In a radiology context, DICOM has established itself worldwide as the main standard
for exchanging, storing, and displaying medical images [14 ]. DICOM ensures interoperability between different imaging devices and health information
systems, and it enables collaborative diagnostics and treatment planning, as well
as seamless exchange of radiology data.
HL7v2
HL7v2 is currently the most widely used standard for the exchange of clinical and
administrative data between different healthcare systems [15 ]. However, its purely text-based exchange format makes it difficult to exchange complex
data sets with semantic information.
HL7v3
HL7v3 is the successor to HL7v2 and is based on the Reference Information Model (RIM),
which defines a standardized, abstract representation of health data and their relationships,
and thus adds semantic interpretability of the data to HL7v2 [16 ]. However, HL7v3 is known for its complexity and requires significant resources and
expertise to implement, which is why HL7v3 has never been adopted widely.
FHIR
The main difference between FHIR and its predecessor HL7v3 is the modular approach
to structuring data. FHIR breaks down healthcare information into individual components,
called resources. These resources can be flexibly combined and extended, allowing
adaptation to new healthcare requirements without disrupting existing implementations.
While syntax ensures the formal correctness of the data, semantics deals with the interpretation of the data, i.e. what the data elements actually mean
and how they are understood in a particular context. Semantic interoperability thus
ensures not only that data can be transmitted correctly but also that they can be
interpreted correctly in various (IT) systems. Medical terminologies or ontologies
(SNOMED CT, LOINC, RadLex, [Table 5 ]) define the semantics and thus form the basis for correctly interpreting medical
data [21 ].
Table 5 Relevant terminologies and ontologies.
SNOMED CT
SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms)
is an ontology used worldwide to encode clinical terms and concepts; it provides a common
language for the exchange of health information [17 ].
RadLex
RadLex is an ontology developed by the Radiological Society of North America (RSNA)
because SNOMED CT lacks many radiology-specific terms. RadLex provides
a common language for describing imaging findings, procedures, and anatomical structures.
RadLex is also available in German [18 ].
LOINC
LOINC (Logical Observation Identifiers Names and Codes) is a terminology and defines
terms and concepts related to the exchange of medical laboratory observations, clinical
measurements, and other health observations [19 ]. LOINC has now been integrated in the RadLex Playbook and provides a universal standard
for terminology related to radiology requirements and results [20 ].
In this context, a terminology is simply a list or collection of terms and their definitions. These
terms may include medical diagnoses, procedures, anatomy, diseases, symptoms, or other
relevant concepts [22 ].
An ontology is a formal, explicit specification of a common conceptualization in a particular
domain. It represents the entities (concepts) in this domain and the relations between
the entities in a structured and organized manner [23 ].
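The difference between the two definitions can be sketched in a few lines of Python. The codes T1 to T3 and the relation names are purely hypothetical and are not taken from SNOMED CT, LOINC, or RadLex. The terminology is a flat lookup table, whereas the ontology adds typed relations that can be followed programmatically.

```python
# Terminology: a flat list of codes and their terms (hypothetical codes).
terminology = {
    "T1": "lung",
    "T2": "upper lobe of lung",
    "T3": "organ",
}

# Ontology: the same concepts plus explicit relations between them.
relations = {
    ("T2", "part_of", "T1"),
    ("T1", "is_a", "T3"),
}


def ancestors(code: str) -> set:
    """Follow is_a and part_of edges transitively, starting at `code`."""
    seen = set()
    stack = [code]
    while stack:
        current = stack.pop()
        for subject, predicate, obj in relations:
            if subject == current and predicate in ("is_a", "part_of"):
                if obj not in seen:
                    seen.add(obj)
                    stack.append(obj)
    return seen


# The terminology can only translate a code into a term ...
print(terminology["T2"])
# ... while the ontology can also answer what the concept belongs to.
print(sorted(terminology[c] for c in ancestors("T2")))
```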
Integrating the Healthcare Enterprise (IHE)
Launched in 1998, Integrating the Healthcare Enterprise (IHE) is a global initiative
aimed at improving the interoperability of healthcare information systems. IHE does
not define its own standards but develops integration profiles based on existing standards
to enable seamless information exchange [24 ]. An overview of the FHIR profiles developed can be found at https://wiki.ihe.net/index.php/Category:FHIR
[25 ].
FHIR (Fast Healthcare Interoperability Resources)
Need for FHIR
HL7v2 was first released in 1988 and, despite continual further development
(current version: Version 2.9, released in 2019 [26 ]), has some methodological limitations.
In particular, it lacks consistent semantics, which leads to
variability in the interpretation of data and makes it difficult for systems to interpret
data consistently [27 ]. In simple terms, an HL7v2 message can be compared to an Excel spreadsheet:
the meaning of a cell depends largely on its position, so it is not necessarily clear how to interpret
later additions or changes to cells, or what exactly a given cell contains.
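The spreadsheet comparison can be made concrete with a short Python sketch that reads one HL7v2 PID segment. The segment content is made up for illustration, and the function name is an assumption of this example. Note that nothing in the message itself says what a field means; the reader has to know that position 5 holds the name and position 7 the date of birth.

```python
# Illustrative PID segment with made-up patient data.
SEGMENT = "PID|1||12345^^^HOSP^MR||Doe^John||19800101|M"


def parse_pid(segment: str) -> dict:
    """Extract a few fields from a PID segment purely by position."""
    fields = segment.split("|")        # fields are separated by "|"
    if fields[0] != "PID":
        raise ValueError("not a PID segment")
    name_parts = fields[5].split("^")  # components are separated by "^"
    return {
        "identifier": fields[3].split("^")[0],  # PID-3, first component
        "family": name_parts[0],                # PID-5, first component
        "given": name_parts[1] if len(name_parts) > 1 else "",
        "birth_date": fields[7],                # PID-7
        "sex": fields[8],                       # PID-8
    }


print(parse_pid(SEGMENT))
```

If a sending system places a value in a different or locally defined field, the parser still returns something, but the result no longer means what the receiver assumes.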
To remedy these shortcomings, the HL7v3 standard was developed. At the time of development,
more modern transfer protocols, such as Hypertext Transfer Protocol (Secure) (HTTP(S)),
Simple Mail Transfer Protocol (SMTP), or Minimal Lower Layer Protocol (MLLP), were
adopted for data exchange ([Table 1 ]) [28 ]. In addition, the reference information model (RIM) was intended to ensure the semantic
interpretability of the data [29 ]. However, even the internal documentation of the standard is inconsistent, which understandably
led to a lot of criticism [30 ]. The complexity of RIM requires a great deal of expertise for implementation, which
delayed projects and resulted in considerable costs [27 ]. These reasons, as well as the lack of backward compatibility with HL7v2, meant
that HL7v3 was never widely implemented by the industry and HL7v2, despite its weaknesses
described above, continues to be the most widely used technology for transmission
of clinical data.
Introduction to FHIR
These fundamental problems with HL7v2 and HL7v3 prompted HL7 to develop a new standard
for data exchange, which is simpler in its implementation, semantically consistent,
and based on modern web standards with established security concepts such as Transport
Layer Security (TLS) and Open Authorization (OAuth) ([Table 1 ]). This has made it much easier to develop new applications and has led to wider
acceptance in the IT industry and among healthcare providers [31 ].
The development of FHIR officially began in 2011, and since then it has continued
to evolve with iterative versions appearing regularly, including the current version
FHIR Version 5 released on March 26, 2023. In order not only to meet the current requirements
of the healthcare industry but also to provide a basis for future advances in the
exchange of health data, further development is taking place hand in hand with the
healthcare industry.
Main features and principles
The aim of developing Fast Healthcare Interoperability Resources (FHIR) was to create
a standard that is capable of handling the complexity of data exchange in the healthcare
sector.
The main difference between FHIR and its predecessor HL7v3 is the modular
approach to structuring data. Health information is broken down into individual components
known as resources. Examples of resources include Patient, ImagingStudy, and DiagnosticReport.
These resources ([Table 2 ], [Fig. 1 ]) can be combined and extended as needed, which provides flexibility within
a common standard and accommodates changing healthcare needs without affecting existing implementations.
Fig. 1 Example illustrating a potential combination of FHIR resources.
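How resources are combined can be sketched in Python with three deliberately minimal resources that point to each other by reference. The IDs are made up, the in-memory dictionary merely stands in for a FHIR server, and the resources are not complete or validated. The element name imagingStudy follows FHIR R4; as far as we are aware, Version 5 renamed it to study.

```python
patient = {"resourceType": "Patient", "id": "p1"}

imaging_study = {
    "resourceType": "ImagingStudy",
    "id": "s1",
    "status": "available",
    "subject": {"reference": "Patient/p1"},
}

report = {
    "resourceType": "DiagnosticReport",
    "id": "r1",
    "status": "final",
    "subject": {"reference": "Patient/p1"},
    "imagingStudy": [{"reference": "ImagingStudy/s1"}],  # R4 element name
}

# Stand-in for a server: resources are addressable as "[type]/[id]".
store = {
    f"{res['resourceType']}/{res['id']}": res
    for res in (patient, imaging_study, report)
}


def resolve(reference: dict) -> dict:
    """Look up the resource that a FHIR reference points to."""
    return store[reference["reference"]]


def studies_for_report(diagnostic_report: dict) -> list:
    """Return all imaging studies linked from a diagnostic report."""
    return [resolve(ref) for ref in diagnostic_report.get("imagingStudy", [])]


print(resolve(report["subject"])["id"])
print([study["id"] for study in studies_for_report(report)])
```

A new resource type can be linked in the same way without changing the existing ones, which is the practical benefit of the modular design.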
Another key aspect of FHIR is interoperability. This is ensured by the use of standardized
terminologies (e.g. LOINC) and ontologies (SNOMED CT, RadLex) ([Table 5 ]). In addition, FHIR applies widely used, vendor-independent web standards such as
HTTP(S), JSON and XML, which promotes seamless integration in existing web-based systems
and makes it easier to develop new applications. This allows data from different healthcare
applications, systems, and devices to be securely exchanged and interpreted.
In addition, FHIR has robust security and privacy features that meet regulatory requirements
for healthcare. It includes authentication, authorization, and encryption mechanisms
to protect patient data and ensure secure information exchange.
Choosing an open standard based on established technologies encourages continued development,
which will lead to continued improvement of FHIR and keep the standard relevant –
not only for the present but also for the future.
Considerations when setting up a radiology FHIR data warehouse
FHIR server
The key component of a FHIR data warehouse is the FHIR server, where data are stored
in the form of FHIR resources and can be retrieved as such (online transaction processing,
OLTP). There are a variety of providers for FHIR servers: free open source variants
[32 ], as well as solutions from commercial providers that are offered on premise (locally)
or in the cloud (e.g. SMILE-CDR, Google, Microsoft, Amazon, Apple).
One example of an open source option under the Apache Software License 2.0 is the
HAPI-FHIR server [33 ]. In addition to a public test server (http://hapi.fhir.org/ ), a local instance can be set up very easily. If a local Docker instance is installed
[34 ], the HAPI-FHIR server can be downloaded and started using simple commands [33 ].
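Once a server is running, it is addressed through ordinary HTTP requests. The following Python sketch only prepares such requests with the standard library and does not send them. The base URL is an assumption for a local installation and has to be adapted, and the function names are chosen for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of a locally running FHIR server (adjust as needed).
BASE_URL = "http://localhost:8080/fhir"


def build_create_request(resource: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare a POST [base]/[type] request that creates a new resource."""
    url = f"{base_url}/{resource['resourceType']}"
    body = json.dumps(resource).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/fhir+json"},
    )


def build_search_request(resource_type: str, params: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare a GET [base]/[type]?param=value search request."""
    query = urllib.parse.urlencode(params)
    return urllib.request.Request(
        f"{base_url}/{resource_type}?{query}",
        method="GET",
        headers={"Accept": "application/fhir+json"},
    )


create = build_create_request({"resourceType": "Patient", "gender": "unknown"})
search = build_search_request("ImagingStudy", {"patient": "123"})
print(create.get_method(), create.full_url)
print(search.get_method(), search.full_url)
```

With a server available, either request could be sent using urllib.request.urlopen; the response to a search is a Bundle resource containing the matching entries.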
Technical considerations
FHIR’s resource-oriented architecture enables radiology data to be organized systematically.
The FHIR resources that are most relevant for structured and interoperable representation
of radiology data in a data warehouse are Patient, ImagingStudy, Observation, Device,
Questionnaire, QuestionnaireResponse, ServiceRequest and DiagnosticReport ([Table 2 ], [Fig. 1 ]).
Integration with legacy systems
Before data can be stored on the FHIR server, they have to be converted into FHIR
resources. Since not all manufacturers offer a FHIR interface as standard, data from such
sources have to be converted explicitly before they can be included in the FHIR data
warehouse. There are several open source solutions for converting messages
from HL7 format to FHIR [35 ] [36 ].
However, because of the inconsistencies and semantic ambiguity of HL7v2 messages
described above, the conversion to FHIR resources has to be adapted to the specific
messages involved. The conversion from DICOM-SR is somewhat easier because DICOM-SR already
includes semantic coding.
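A minimal sketch of such a conversion, written in Python, might map an HL7v2 PID segment to a FHIR Patient resource as follows. The sample segment is made up, the function and table names are assumptions of this example, and the field positions reflect common HL7v2 usage that a real interface may deviate from, which is exactly why the mapping has to be checked per source system.

```python
# HL7v2 administrative sex codes mapped to FHIR gender values.
SEX_TO_GENDER = {"M": "male", "F": "female", "O": "other", "U": "unknown"}


def pid_to_patient(segment: str) -> dict:
    """Convert one HL7v2 PID segment into a minimal FHIR Patient resource."""
    fields = segment.split("|")
    if fields[0] != "PID":
        raise ValueError("expected a PID segment")
    fields += [""] * (9 - len(fields))  # pad so that PID-8 always exists

    identifier = fields[3].split("^")[0]  # PID-3, first component
    name = fields[5].split("^")           # PID-5: family^given
    raw_date = fields[7]                  # PID-7: YYYYMMDD

    patient = {"resourceType": "Patient"}
    if identifier:
        patient["identifier"] = [{"value": identifier}]
    if name[0]:
        patient["name"] = [
            {"family": name[0], "given": [g for g in name[1:2] if g]}
        ]
    if len(raw_date) >= 8:
        # FHIR expects dates as YYYY-MM-DD.
        patient["birthDate"] = f"{raw_date[:4]}-{raw_date[4:6]}-{raw_date[6:8]}"
    patient["gender"] = SEX_TO_GENDER.get(fields[8], "unknown")
    return patient


print(pid_to_patient("PID|1||12345^^^HOSP^MR||Doe^John||19800101|M"))
```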
Structured reporting
FHIR questionnaires are also ideal for structured reporting thanks to their modular
structure based on “items” (i.e. questions). The German Radiological Society (DRG)
is therefore in the process of expanding its report templates to include FHIR questionnaires.
An easy-to-use web-based platform for building FHIR-based report templates is available
at https://drg-befundvorlagen.uniklinik-freiburg.de .
Integration of AI results
If the results of AI algorithms have been stored in a FHIR data warehouse, it is possible
to assign them to a corresponding question in the FHIR questionnaire via a semantic
mapping by using the codes assigned in an ontology or terminology. This question field
can then be filled automatically when the report is created, and it only needs to
be validated by the radiologist.
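The mapping step can be sketched in Python as follows. The code system URL, the code, the link IDs, and the function name are hypothetical and serve only to illustrate the matching of an AI result to a question by its code. The resources are reduced to the elements needed here and are not validated.

```python
# Hypothetical code system and code used only for this illustration.
CODE_SYSTEM = "http://example.org/radiology-codes"

questionnaire = {
    "resourceType": "Questionnaire",
    "status": "active",
    "item": [
        {
            "linkId": "q1",
            "text": "Pulmonary nodule present?",
            "type": "boolean",
            "code": [{"system": CODE_SYSTEM, "code": "nodule-present"}],
        },
        {"linkId": "q2", "text": "Clinical impression", "type": "text"},
    ],
}

ai_observations = [
    {
        "resourceType": "Observation",
        "status": "preliminary",
        "code": {"coding": [{"system": CODE_SYSTEM, "code": "nodule-present"}]},
        "valueBoolean": True,
    }
]


def prefill(questionnaire: dict, observations: list) -> dict:
    """Fill questionnaire items whose code matches the code of an AI result."""
    by_code = {}
    for obs in observations:
        for coding in obs["code"]["coding"]:
            by_code[(coding["system"], coding["code"])] = obs

    items = []
    for item in questionnaire["item"]:
        for coding in item.get("code", []):
            obs = by_code.get((coding["system"], coding["code"]))
            if obs is not None:
                # Copy the value[x] element, e.g. valueBoolean, as the answer.
                answer = {k: v for k, v in obs.items() if k.startswith("value")}
                items.append({"linkId": item["linkId"], "answer": [answer]})
                break

    # "in-progress" signals that the radiologist still has to validate it.
    return {
        "resourceType": "QuestionnaireResponse",
        "status": "in-progress",
        "item": items,
    }


response = prefill(questionnaire, ai_observations)
print(response)
```

Questions without a matching code, such as the free-text impression in this example, remain empty and are completed by the radiologist.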
Outlook: Semantic web in the FHIR data warehouse
The resource-based architecture of FHIR allows data to be stored in the resource description
framework (RDF) format [37 ]. RDF is part of what is known as the “semantic web,” which Tim Berners-Lee described
in 2001 [38 ] [39 ]. RDF works together with SPARQL (SPARQL Protocol and RDF Query Language), a query
language, and OWL (Web Ontology Language), a language for defining ontologies
on the web. RDF uses a simple structure, also known as “triples”, which consists
of “subject”, “predicate”, and “object”, to represent data, concepts, and relationships
in a uniform manner. Subject and object represent nodes, and the predicate is the edge
that connects them. The subject is the starting point of the triple, e.g. a person,
and the predicate establishes a relationship between subject and object (similar to
the verb in a sentence). The combination of predicate and object describes the subject.
This structure of nodes connected by edges makes it possible to combine information
from different sources into a large, linked data set, which is also known as a knowledge
graph [40 ]. Although these knowledge graphs can be queried using SPARQL, such queries are quite
complicated and are not usually intended for end users [41 ]. AI systems that can convert
natural language to SPARQL queries may provide some relief in the future [42 ].
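The triple structure and the idea of a linked graph can be illustrated with plain Python, without an RDF library. All identifiers and predicate names below are made up and do not follow the FHIR RDF vocabulary, and the pattern matching is only a simplified stand-in for what a SPARQL engine does.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Triples from two hypothetical sources, e.g. the PACS and an AI system.
pacs_triples: Set[Triple] = {
    ("study:s1", "hasSubject", "patient:p1"),
    ("study:s1", "hasModality", "CT"),
}
ai_triples: Set[Triple] = {
    ("finding:f1", "derivedFrom", "study:s1"),
    ("finding:f1", "hasCode", "code:nodule"),
}

# Shared identifiers let both sets merge into one graph.
graph = pacs_triples | ai_triples


def match(
    triples: Set[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Set[Triple]:
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return {
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    }


def patients_with_finding(triples: Set[Triple], code: str) -> Set[str]:
    """Walk finding -> study -> patient along the edges of the graph."""
    patients = set()
    for finding, _, _ in match(triples, predicate="hasCode", obj=code):
        for _, _, study in match(triples, subject=finding, predicate="derivedFrom"):
            for _, _, patient in match(triples, subject=study, predicate="hasSubject"):
                patients.add(patient)
    return patients


print(patients_with_finding(graph, "code:nodule"))
```

The query crosses the boundary between the two sources only because both use the same identifier for the study, which is what makes consistent identifiers and codes so important for a knowledge graph.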
Knowledge graphs and FHIR RDF therefore also have great potential for secondary data
analysis (online analytical processing, OLAP) in the medical context, and they provide
a semantic basis for using artificial intelligence in healthcare, such as explainable
AI applications [43 ].
Conclusions
FHIR, as a standard for a radiology data warehouse, opens up perspectives that have
the potential to significantly improve primary data evaluation in clinical routine.
This is done by integrating heterogeneous data sources such as decision support systems
and AI results, which not only increases the quality of data analysis but also simplifies
workflows for radiologists. In addition, the high level of interoperability of FHIR
enables inter-institutional and translational data exchange, which
promotes the creation of a cross-institutional knowledge database in line with the
semantic web.
Although integrating data from legacy systems that do not support the FHIR standard
is challenging, the expected synergies from a semantically consistent and defragmented
data warehouse justify the effort with significant improvements in the quality of
patient care.
A data warehouse based on FHIR is therefore a major step on the important path towards
data-driven medicine.