Keywords: clinical research informatics, workflows and human interaction, interfaces and usability, user acceptance and resistance, evaluation, COVID-19
Background and Significance
The coronavirus epidemic is a challenge that urgently requires new strategies for action,
not only to stop the spread of the virus but also to ensure the best possible medical
care for patients. In this context, a rapid gain of knowledge and an exchange
of procedures and best practices have been a high priority. As a result, many new
applications have emerged that enable the analysis of various data sources[1][2][3] and support decision makers. In addition, medical research with routine data is
of particular importance; to gain knowledge about the novel coronavirus disease 2019
(COVID-19) as quickly as possible and to be able to develop approaches for new therapies,
it is necessary that researchers can access data from clinical care collectively and
across locations. Appropriate platforms for shared access to routine data have been
developed in various countries, although only a few national solutions have emerged.
In the United Kingdom, for example, these include the platform OpenSAFELY, the COVID-19 Research
Platform,[4] and C19, a COVID-19 research database[5] combining primary care electronic health records and patient-reported information.
In Germany, too, a national research data platform called “CODEX, Covid-19 Data Exchange
Platform,”[6] is to be developed within the German “Network University Medicine (NUM)”[7]; it will make data available to researchers nationwide in a standardized manner
and in compliance with data protection laws. Part of this platform is a so-called
“Feasibility Portal” (Fig. 1), which is intended to enable researchers to find out whether sufficient patient
data are available within the data integration centers of the NUM university hospitals
for conducting clinical research and, in a subsequent step, to be able to request
the use of the data centrally. Until now, requests for the availability of routine
data for research were made by telephone or e-mail and required the conclusion of
a data usage contract with each hospital, which is a very time-consuming process.
Fig. 1 Architecture of the National Research Data Platform.
In the development of this “Feasibility Portal,” a special focus should be placed
on the user friendliness of the portal, so that the portal can be used intuitively
and effectively. So far, only a few studies have addressed the usability of
research platforms[8][9][10][11][12]; their results vary widely, ranging from poor to good usability. Usability
problems that have been identified include, for example, confusing terms,[8][10][13] complexity of the user interface,[12][13] and lack of appropriate system feedback.[12][13] Published reports on the usability of COVID-19 research data platforms, and of feasibility
portals for COVID-19 research in particular, do not exist at present, which indicates
the need for further research. The present paper aims to fill this gap in the scientific
literature.
Objectives
The objective of the study was to evaluate the usability of a first prototype to provide
specific recommendations for the further development of the portal (formative usability
study). For this, we focused primarily on the usability criteria “effectiveness” and
“satisfaction”[14] to answer the following questions:
Can feasibility queries be entered completely and correctly with the current interface?
Which positive aspects of the interface design are mentioned?
Which usability problems occur when entering feasibility queries and how is the severity
of the problems rated?
What recommendations for improving the user interface can be derived from the usability
problems?
Methods
Study Design
The usability study was conducted as a moderated remote test via the communication
software “Zoom” (https://zoom.us/) in the period April to May 2021. A qualitative approach consisting of an explorative
usability walkthrough combined with the method of “thinking aloud” (i.e., testers
express their thoughts aloud while working on the task)[15] and additional interview questions was applied. We decided on a remote test to be
able to reach clinical researchers easily in times of the nationwide lockdown and
to collect user feedback within a short time. Regarding the identification of critical
problems and the complete processing of test items, remote tests can be considered
equivalent to laboratory tests.[16][17] Walkthroughs with the thinking aloud method and interviews are established methods
in usability research and have been used in this combination many times for the evaluation
of clinical systems.[18][19]
Evaluated Prototype of the CODEX Feasibility Portal
A first prototype of the Feasibility Portal was evaluated which enabled feasibility
queries of COVID-19 relevant data based on the uniform nationwide dataset “German
Corona Consensus Data Set” (GECCO).[20] The Feasibility Portal is aimed at scientists/medical researchers who want to search
for COVID-19 relevant data nationally, across multiple institutions, from one central
place. Inclusion and exclusion criteria can be searched for and added to the query
via a free-text search or via a category search (Fig. 2). For a free-text search, the criteria are (1) searched
for via the corresponding search field (there is one for inclusion criteria and one
for exclusion criteria), (2) selected from the search results displayed below, and (3)
added by clicking on the criterion and the option “Add.” For
a category search, (4) the folder icon next to the search field has to be clicked; (5)
the available categories are displayed, from which one can be selected by clicking; (6)
the corresponding criteria appear under the category, where a criterion is selected
by clicking on its checkbox and (7) added via the option “Add.” (8) Selected criteria appear in
the field “Selected characteristics” and can be linked with each other using the respective
toggle buttons “AND-OR.” Using a drag-and-drop function, already selected criteria
in the “Selected characteristics” area can be swapped (e.g., from inclusion criteria
to exclusion criteria) or moved/resorted (e.g., grouped with other criteria). (9)
After all characteristics have been entered, the query can be started with “Send”; (10) the
result of the search is displayed in the upper area under “Number of patients.” In
addition, the option “Details” can be used to view how many data records are available
at which location/university hospital.
Fig. 2 Screenshots of the CODEX Feasibility Portal (development status: April 2021). Explanation
of search paths for criteria: Free text search of criteria via: (1) entering the search
term, (2) displaying the search results and selection, (3) adding the search result;
Category search of criteria via: (4) selecting the icon, (5) selecting the category,
(6) selecting the search result, (7) adding the search result; Linking the entered
criteria via: (8) toggle buttons “AND–OR”; displaying the query results via: (9) sending,
(10) displaying details (respective clinics in which the data are available).
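The query logic described above (two groups of criteria, each linked by “AND” or “OR,” with the result being the number of matching patients) can be illustrated with a minimal Python sketch. This is not the portal's implementation: the class names `CriteriaGroup` and `FeasibilityQuery`, the patient fields, and the treatment of empty groups are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Patient = Dict[str, object]
Criterion = Callable[[Patient], bool]


@dataclass
class CriteriaGroup:
    """Criteria joined by one operator, "AND" or "OR" (hypothetical model)."""
    operator: str = "AND"
    criteria: List[Criterion] = field(default_factory=list)

    def matches(self, patient: Patient) -> bool:
        if not self.criteria:
            return False  # assumption: an empty group matches nobody
        results = [criterion(patient) for criterion in self.criteria]
        return all(results) if self.operator == "AND" else any(results)


@dataclass
class FeasibilityQuery:
    inclusion: CriteriaGroup
    exclusion: CriteriaGroup

    def count(self, patients: List[Patient]) -> int:
        """Count patients that are included and not excluded."""
        return sum(
            1 for p in patients
            if self.inclusion.matches(p) and not self.exclusion.matches(p)
        )


# Invented synthetic patients, for illustration only.
patients = [
    {"age": 70, "diabetes": True, "smoker": False},
    {"age": 45, "diabetes": True, "smoker": False},
    {"age": 82, "diabetes": True, "smoker": True},
]
query = FeasibilityQuery(
    inclusion=CriteriaGroup("OR", [lambda p: p["age"] >= 65,
                                   lambda p: p["diabetes"]]),
    exclusion=CriteriaGroup("OR", [lambda p: p["smoker"]]),
)
print(query.count(patients))
```

Switching the inclusion operator from “OR” to “AND” changes the count for the same criteria, which is why the linking type matters for the correctness of a feasibility query.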
For the test, data from synthetic patients were used. The categorization of the criteria
(e.g., the classification of “medication” into the category “other”) was predetermined
by the structure of the GECCO dataset. Apart from saving the entered query, all intended
functions for this version of the prototype were accessible to the test participants.
Participants
The target group of the study was medical researchers who need COVID-19-relevant patient
data and, therefore, need to define “their” cohort. All sites involved in the CODEX
project were approached to recruit participants. Eleven sites were willing to participate
in the study. A total of 16 test participants were approached who corresponded to
the target group of the Feasibility Portal; of these, 15 participants agreed to test
the prototype. A description of the test participants can be found in Table 1.
Table 1 Description of the sample

Variable                                                              n     %
Age group
  25–34 years                                                         5     33.33
  35–44 years                                                         8     53.33
  45–50 years                                                         2     13.33
Gender
  Male                                                                8     53.33
  Female                                                              7     46.67
Professional group
  Study manager                                                       1     6.67
  Medical researcher/clinician scientist                              9     60.00
  Research assistant                                                  2     13.33
  Other group (e.g., quality manager of a biobank, employee in the
  Coordination Centre for Clinical Trials)                            3     20.00
Professional experience
  Professional experience in years[a]                                 Mean: 4.96 years; SD: 5.983 years
Experiences with the query of case numbers for clinical studies
  No/little experience                                                8     53.33
  Some experience                                                     7     46.67
Previous experience with similar systems
  No                                                                  6     40.00
  Yes                                                                 9     60.00
Computer skills/knowledge
  Medium: I get along well with most systems.                         8     53.33
  High: I have a lot of experience and am technically proficient.     7     46.67

Abbreviation: SD, standard deviation.
Note: Absolute number and frequency per category.
[a] For work experience, the mean and standard deviation were calculated.
Testing Procedure
At the beginning of the study, consent of the participants was obtained via e-mail
(signed, scanned, and returned consent forms). If consent was given, participants
were sent an e-mail with a link for the access to the communication software “Zoom,”
as well as the task sheet (Supplementary Appendix A, available in the online version), with the request to have it ready for the test.
After dialing in via “Zoom,” the participants were welcomed by the test leader and
received the link to the prototype via the chat option in “Zoom.” After the participants
had opened up the prototype link, they received further instructions from the test
leader on how to process the tasks. At any time during the task, the participants
were asked to express their thoughts aloud. Positive comments, expressed usability
problems, as well as the correctness of the processing of tasks, were noted by the
test leader in a paper protocol prepared for this purpose. After completing the tasks,
the participants were interviewed on specific usability aspects and asked about demographic
characteristics and previous experience. The interview answers were written down by
the test leader in a structured record sheet. The test guide with the protocol sheets,
interview questions, and answer options can be found in Supplementary Appendix B (available in the online version). For backup reasons, the test sessions were additionally
recorded using “Zoom.” The duration of each test session was 30 to 45 minutes.
Test Tasks
In consultation with clinical researchers and developers, the evaluation team defined
two test tasks that (1) can be completed with the current prototype of the Feasibility
Portal, (2) are typical for a query as it is currently performed by researchers, and
(3) vary in their degree of complexity (Supplementary Appendix A, available in the online version). The determinant for successful completion of
each task was entering all criteria completely and linking them correctly. By triggering
the search, the task was considered completed.
Posttask Interviews
To obtain a final judgment on the usability aspects of completeness of functions,
ease of use, operating logic, navigation, and information presentation/esthetics,
a corresponding interview questionnaire was developed. In addition, an interview questionnaire
was constructed to collect demographic information such as age, gender, and professional
experience, as well as to determine expertise and previous experience with similar
systems (Supplementary Appendix B, available in the online version). Both interview questionnaires were developed
in accordance with the SPSS method of interview guideline development (collect, check, sort, subsume) according to
Helfferich[21] and checked in advance in a pretest.
Data Analysis
The handwritten paper protocols were transferred to MS Word and summarized in MS Excel;
the correctness of task completion was counted per task across all participants. The
named usability problems were summarized for all participants, deleting duplicate
problems and noting how many participants named the problem in total. Excluded from
this were problems caused by an intended functional limitation of the prototype or
the structure of the stored GECCO dataset (see also “Evaluated Prototype of the CODEX
Feasibility Portal”). The severity of the usability problems was assessed by two independent,
trained persons using the “Severity Scale” according to Nielsen: 0 = “I don't agree
that this is a usability problem at all,” 1 = “Cosmetic problem only: need not be
fixed unless extra time is available on project,” 2 = “Minor usability problem: fixing
this should be given low priority,” 3 = “Major usability problem: important to fix,
so should be given high priority,” and 4 = “Usability catastrophe: imperative to fix
this before product can be released.”[22] Rating differences between the two evaluators were discussed until consensus was reached.
For the interview protocols on usability aspects, it was counted across all participants
whether the respective aspect (e.g., ease of use, ease of navigation) was assessed
as “fulfilled” or in “need of improvement.” The suggestions for improvement named
by the participants were summarized as keywords. Interview responses related to demographic characteristics
and prior experience were evaluated according to the frequency of the given answer
categories. The results were used to describe the sample. In the follow-up to the
test sessions, the evaluation team worked out proposals for solutions to the identified
usability problems; the proposals named by the participants in the interviews were
taken into account for this.
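The counting steps described above can be sketched in a few lines of Python. The study itself used MS Word and MS Excel; the function names, participant identifiers, and problem descriptions below are hypothetical and serve only to illustrate merging duplicate problems, counting how many participants named each one, and flagging rating differences for the consensus discussion.

```python
from collections import defaultdict

# Invented example observations: (participant id, problem description).
observations = [
    ("P01", "free-text search is case sensitive"),
    ("P02", "free-text search is case sensitive"),
    ("P02", "no unit selectable for age"),
    ("P01", "free-text search is case sensitive"),  # same person, named twice
]


def count_participants_per_problem(observations):
    """Merge duplicate problems and count distinct participants per problem."""
    named_by = defaultdict(set)
    for participant, problem in observations:
        named_by[problem].add(participant)
    return {problem: len(people) for problem, people in named_by.items()}


def needs_consensus(ratings_a, ratings_b):
    """List problems whose two severity ratings (0 to 4) differ."""
    return sorted(
        problem for problem in ratings_a
        if ratings_a[problem] != ratings_b.get(problem)
    )


counts = count_participants_per_problem(observations)
to_discuss = needs_consensus(
    {"free-text search is case sensitive": 3, "no unit selectable for age": 2},
    {"free-text search is case sensitive": 3, "no unit selectable for age": 3},
)
print(counts)
print(to_discuss)
```

Using a set per problem ensures that a participant who mentions the same problem several times is counted only once.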
Results
Task Success
The results of the task success can be found in Fig. 3. It shows that the tasks could be completed successfully for the most part, but
that user errors still occur. In task 1, one participant tried to enter “COVID” in
the search box to find the corresponding medication. However, this did not work because
the medications are not tagged with “COVID.” In task 2a, one participant mistakenly
defined all criteria as exclusion criteria and two participants linked the inclusion
criteria with “AND” instead of “OR.”
Fig. 3 Correctness of task processing across all subjects (n = 15).
Positive Aspects
Positive aspects mentioned during the task processing were that the portal is easy
and intuitive to use. From the user's point of view, entering criteria works very
well, as it is simple and easy to navigate within the application. Features can be
found quickly via the free-text search, even if one does not know under which category
the criteria can be found. The search is already performed while typing and narrows
down with each additional letter, which is perceived as convenient. The way of presenting
the linkage (“bracketing”) of the criteria was well received by the participants;
the linkage type can be changed quickly. The drag-and-drop function for changing the
criterion type (exclusion/inclusion) in each case is a very elegant solution and works
well. The execution of the query and receiving a result works quickly. The interface
is not overloaded and, therefore, very clear.
The results of the interview questionnaire support this picture: The majority of the
participants (n > 11) consider the application to be visually appealing, easy to navigate, logical
in its operation, usable with little effort, and complete in terms of functionality
(Fig. 4). When asked about the intention to use/acceptance, all participants stated that
they would use the Feasibility Portal in their work.
Fig. 4 Results of the interview questionnaire on usability.
Usability Problems
A total of 26 usability problems were identified, of which 8 were rated as “cosmetic,”
6 as “minor,” 4 as “major,” and 8 as “catastrophic.” In the following, the serious
problems (“major usability problem” or “usability catastrophe”) are presented (Fig. 5). A complete overview of all identified problems can be found in Supplementary Appendix C (available in the online version).
Fig. 5 Visualization of usability problems with the most urgent need for revision.
One of the main problems was that there are different default settings for the linking
type for the inclusion criteria (“AND”) and exclusion criteria (“OR”) in the area
“Selected characteristics” and that the different areas for the inclusion and exclusion
criteria are not visually distinguishable enough. As a result, the participants assumed
that criteria to be linked with “OR” must always be sorted into the right-hand area
(this is actually the area for the “exclusion criteria”) or inadvertently selected
a wrong linkage for the exclusion criteria.
Problems also arose regarding the different visual design of the linkage in the inclusion
and exclusion criteria: the “OR” linkage of the inclusion criteria was presented as
“connecting” between the characteristics; for the exclusion criteria, however, it
was presented as “visually separating.” This led to confusion among the participants.
It is also problematic that the user expects to be able to change the type of link
from “AND” to “OR” before adding another feature. This, however, only works after
another criterion has been entered. Another problem is the case sensitivity of the
free-text search, which, for example, made ICD codes undetectable when searching with
lowercase letters. The user, therefore, receives no search result. Additionally, the
user expects that “smoking status” can be found with the search term “nicotine,” since
“nicotine abuse” is the usual term documented in routine; however, the search does
not find the term “nicotine.” Furthermore, problems occurred with the restriction
of the characteristic “age,” since no “unit” can be selected. In addition, there were
problems due to a delayed reaction of the system. In the category search, the subcategories
only folded out after several clicks on the arrows in front of the characteristics/the
characteristic designation. As a result, the participants initially assumed that the
entire upper category had to be added first. However, this does not work in the system.
Moreover, the portal “crashed” for some participants after clicking on the “Details”
option. It was observed that a medication search via the category tree is very time
consuming or is even aborted if the code is unknown, since the drugs are not sorted
alphabetically and a sorting option is missing. The individual levels of the category
tree are not easily distinguishable for the user; as it partly contains many subentries,
the selection option of the criterion “essential (primary) hypertension” is overlooked.
Due to the visual indentation of the criterion “Smoker status” under “Active tumour
disease,” the user assumes that this criterion is assigned to “Active tumour disease”
and overlooks the criterion or does not recognize it immediately.
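The two search problems reported above (case sensitivity and missing synonyms such as “nicotine” for “smoking status”) can be addressed together, as the following minimal Python sketch suggests. It is an illustration under stated assumptions, not the portal's search: the criteria list, the synonym table, and the function name `search` are hypothetical.

```python
# Hypothetical synonym table: search term -> criterion text it should find.
SYNONYMS = {
    "nicotine": "smoking status",
    "copd": "chronic obstructive pulmonary disease",
}

# Hypothetical excerpt of searchable criteria.
CRITERIA = [
    "Smoking status",
    "Chronic obstructive pulmonary disease",
    "I10 Essential (primary) hypertension",
]


def search(term, criteria=CRITERIA, synonyms=SYNONYMS):
    """Case-insensitive substring search that also follows synonyms."""
    needle = term.strip().casefold()
    if not needle:
        return []
    needles = {needle}
    for synonym, target in synonyms.items():
        # Prefix match, so that synonyms already apply while typing.
        if synonym.startswith(needle):
            needles.add(target)
    return [c for c in criteria if any(n in c.casefold() for n in needles)]


print(search("Nicotine"))
print(search("i10"))
```

Both the search term and the criteria are normalized with `str.casefold()`, so an ICD code typed in lowercase still matches its uppercase entry.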
Improvement Recommendations
The solutions developed for the usability problems can be found in Supplementary Appendix C (available in the online version). The severity rating of each usability problem
indicates the priority with which an adjustment should be implemented. The most urgent
revisions for the next iteration of the Feasibility Portal would be to (1) clearly
highlight the criteria in the “Selected Characteristics” as inclusion or exclusion
criteria, (2) eliminate the time delay in selecting characteristics from the “category
tree,” and (3) implement a comprehensive free-text search.
Discussion
To assess the quality of the interface design of the CODEX Feasibility Portal, a remote
usability test was conducted with clinical researchers during the development phase.
Task Success
With regard to the effectiveness of our portal and the question of whether tasks can
be successfully completed with the system, it became apparent that this is predominantly
the case, but mistakes are still made. In task 1, 1 out of 15 testers could not complete
the task correctly; in task 2, this was the case for 3 out of 15 testers. Direct comparative
studies regarding effectiveness for our type of national platform for COVID-19 research
do not exist. However, in comparison to usability studies of other research platforms
for the identification of cohorts for clinical trials, our platform leads to more
correct task completion; with the platforms “ATLAS” and “i2b2,” only 50% of the tasks
could be completed correctly[12]; with the platform “EHR4CR,” two tasks were completed correctly and completely in
10 out of 13 cases, and one task was completed correctly by 4 out of 12 testers.[10] From this, it can be cautiously concluded that the interface of our application
is more intuitive and self-explanatory than those of other research platforms. However, our
portal is far less complex, has a smaller range of functions than the named query
builders, and was tested with different, yet similar, query tasks.
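To make the fractions above easier to compare, they can be converted to percentages. The short Python sketch below uses only the counts reported in the text; the helper name `success_rate` is hypothetical, and because the tasks and platforms differ, the figures are descriptive rather than a like-for-like comparison.

```python
def success_rate(correct, total):
    """Share of correct task completions in percent, rounded to one decimal."""
    return round(100 * correct / total, 1)


rates = {
    "Feasibility Portal, task 1": success_rate(15 - 1, 15),  # 1 of 15 incorrect
    "Feasibility Portal, task 2": success_rate(15 - 3, 15),  # 3 of 15 incorrect
    "EHR4CR, 10 of 13 cases": success_rate(10, 13),
    "EHR4CR, 4 of 12 testers": success_rate(4, 12),
}
for label, rate in rates.items():
    print(f"{label}: {rate}%")
```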
Positive Aspects of the Interface Design
Positive design aspects refer to the simple and intuitive use of the portal and its
clearly designed user interface. This places our results in line with the usability
results of similar query tools; the “EHR4CR” platform was also rated as user friendly,
for example, in terms of user-friendly terminology, the easy-to-use drag-and-drop
function, and the layout, thus highlighting similar positive aspects.[10] However, that research platform has a different operating concept than ours (a purely
graphical operating concept and selection of criteria via building blocks). A usability
study of the “Sample Locator” also shows that it is clearly and intuitively designed.[12] The aspects of easy and fast input of queries were particularly emphasized, which
was also noted as a positive design aspect for our portal. However, compared with
our system, the “Sample Locator” has fewer functions (e.g., there is no query option
regarding anamnesis/risk factors, laboratory values, or therapy, and inclusion criteria
cannot be put into an “OR” relationship across all criteria).
Usability Problems and Suggestions for Improvement
Numerous usability problems could still be identified in our current version. The
usability problems rated as most serious relate mainly to a visually poor differentiation
of the inclusion/exclusion criteria in the “Selected characteristics” field and the
default setting of “OR” in the selected exclusion criteria area, which leads to confusion
errors. Other usability studies also show that unclear presentation of items is a
major cause of user dissatisfaction[13] and that good design of linking options is one of the problem areas of such query
systems; for example, Schüttler et al[12] found that all three query systems evaluated (ATLAS, i2b2, and Sample Locator)
had difficulties in use due to poor design of linking operators. The study
by Soto-Rey et al[10] also shows that confusion about the order in which criteria should be linked is
a major cause of user difficulties or incorrectly completed tasks. The design of
such links is not a simple undertaking. On one hand, they must consider all possible
combinations of inclusion and exclusion possibilities (and time constraints), and,
on the other hand, they should be as self-explanatory as possible in their linking
logic. The evaluation shows us that we are on the right track and that the basic presentation
is good, but that there is still a need to make the criteria more visually distinguishable,
to keep the default settings for inclusion and exclusion criteria the same, and to
keep the visual presentation of the linkage displays consistent.
In addition, we also found that a delayed response of the system in the category display
and a limited search function led to problems. Other studies have also identified
similar problems. The study by Schüttler et al,[12] for example, showed that due to the delayed display of the “ATLAS” query tool, participants
assumed that they had entered the query incorrectly and then took further detours
or changed their already correctly entered characteristics in such a way that the
query was ultimately incorrect. The importance of a good and functioning search concept
is shown by the study by Hultman et al,[23] which identifies this as an important problem area for lack of completeness in task
processing.
For each usability problem, we have developed corresponding solution proposals to
fix the problems in the next iteration. The severity rating helps us prioritize our
work in the pandemic period and address the most serious issues from
a user perspective first. After the solutions have been implemented, however, they must
be tested again to determine whether the design revisions provoke new problems.
Usability Key Aspects for Future Feasibility Portals
From our experience, we would like to share the following key aspects and lessons
learned that can serve as important input for the future development of similar national
portals:
User satisfaction is primarily influenced by a clear, minimalist design, and a simple,
quick, and instantaneous selection of criteria.
An intelligent search should be offered; it should take into account synonyms for
certain clinical pictures (e.g., COPD for chronic obstructive pulmonary disease)
and should not be case sensitive.
Selected inclusion and exclusion criteria should be clearly identified as such. If
criteria restrictions have been made, the type of restriction should
also be displayed textually in addition to the criterion name after the criterion has been added.
The operators for linking the inclusion and exclusion criteria should have identical
default settings. The respective linkage type should be visually displayed in the
same way for both the inclusion criteria and the exclusion criteria.
For linking characteristics, users take different paths: (1) select characteristic
1 and select link operator, enter characteristic 2; or (2) select characteristic 1,
select characteristic 2, and define link between characteristics. A design should
be flexible and support both approaches.
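The last two key aspects (identical default operators and support for both linking paths) can be sketched as follows. This is a hypothetical Python model, not the portal's code: the class `LinkedGroup`, its methods, and the choice of “AND” as the shared default are assumptions for illustration.

```python
class LinkedGroup:
    """Criteria group whose link operator can be set at any time (sketch)."""

    DEFAULT_OPERATOR = "AND"  # assumption: one shared default for all groups

    def __init__(self):
        self.criteria = []
        self.operator = self.DEFAULT_OPERATOR

    def add(self, criterion):
        self.criteria.append(criterion)
        return self

    def link(self, operator):
        if operator not in ("AND", "OR"):
            raise ValueError("operator must be 'AND' or 'OR'")
        self.operator = operator
        return self

    def describe(self):
        return f" {self.operator} ".join(self.criteria)


# Path 1: choose the operator before entering the second characteristic.
path_1 = LinkedGroup().add("fever").link("OR").add("cough")
# Path 2: enter both characteristics first, then define the link.
path_2 = LinkedGroup().add("fever").add("cough").link("OR")
print(path_1.describe())
print(path_2.describe())
```

Because the operator is stored on the group rather than between two already entered criteria, it can be chosen before or after the second characteristic is added, and inclusion and exclusion groups start from the same default.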
Limitations
Our study has some limitations. A disadvantage of our qualitative approach is that
the chosen methods are not suitable for detailed statistical analysis. Thus, we cannot
quantify the usability of our prototype. An alternative would have been to use quantitative
standardized usability questionnaires, for example, the “System Usability Scale.”[24] However, such standardized usability questionnaires are not suitable for detecting
specific usability problems and recording reasons for operating difficulties, so we
decided against this procedure. Another limitation is that due to the 45-minute time
slots per person, we had to choose a rather pragmatic approach and could only test
the prototype on two tasks with few interview questions. Yet, our results show that
we were able to identify many relevant operating problems. We only tested with 15
participants. Nevertheless, we found that already with this number of testers, a certain
saturation effect was reached, and the same problems were identified several times.
This is also confirmed by the literature: according to Nielsen,[25] 15 participants discover almost all usability problems. Furthermore, we have not
conducted a comparative study. However, a comparison with the conventional way of
working would not have made sense because the researcher would have had to request
all the data separately from the clinics, which would always have taken longer and
been more cumbersome than an electronic implementation.
Conclusion
Although the interface is already well designed in terms of functionality, navigation,
ease of use, logic of operation, and layout, several usability problems could be identified.
Improvement proposals were developed for these usability problems, which will guide further
development and adaptation of the Feasibility Portal to user needs. This is an important
prerequisite for ensuring that the portal can be used correctly in everyday clinical
work in the future. Our research will continue within the ABIDE_MI project,[26] where we will implement the revisions identified in the study and add further functionalities
to the portal (more datasets and a temporal linkage of the criteria). The results
of our study can help to avoid usability problems with similar portals in the future.
Our methodological approach can be used and adapted by other developers of similar
systems when only limited time is available for an evaluation and a pragmatic approach
is required.
Clinical Relevance Statement
Our interface concept can be used by other researchers and developers for further
developments of similar portals. Core aspects for usability which we have derived
from our results can serve as input for an adapted design. Furthermore, we present
a pragmatic procedure that is easily transferable to various other areas and similar
systems and with which prototypes can be evaluated well with clinical end users. This
applies especially in the COVID-19 situation, when distance is required
and participants can spare only very little time for an evaluation. With
our results, we contribute to filling the gap in the existing research literature:
To date, there are no studies that have evaluated the usability of national research
data platforms that are based on routine data and support COVID research.
Multiple Choice Questions
What combination of methods was used to test the interface of the developed feasibility
portal?
a. Logging and questionnaire
b. Video observation and questionnaire
c. Thinking-aloud method and questionnaire
d. Thinking-aloud method and posttask interviews
Correct Answer: The correct answer is option d. The Feasibility Portal was first tested with the method
of thinking aloud, then the test participants were asked about the usability and acceptance
of the portal in posttask interviews.
In which area did problems of use/usability occur most frequently?
a. Color scheme of the interface
b. Navigation within the portal
c. Correct linking of parameters
d. Labeling of options
Correct Answer: The correct answer is option c. The correct linking of the parameters was the task
with the most uncertainty and problems for the test participants. Design weaknesses
mainly relate to the poor visual distinguishability of selected features as inclusion
and exclusion criteria and the lack of uniform presentation of the linkage options.