Keywords internship and residency - requirements analysis and design - evaluation of impact - qualitative - quantitative - dashboard
Background and Significance
Each year residency programs across the country graduate a new class of board-eligible physicians. Nationally, these programs are accredited by the Accreditation Council for Graduate Medical Education (ACGME). The ACGME mandates that residency and fellowship programs must develop Clinical Competency Committees (CCCs),[1] whose role is to demonstrate "that graduates will provide high quality, safe care to patients while in training and be well prepared to do so once in practice."[2]
CCCs use methods such as programmatic assessment, a systematic approach in which robust amounts of data are collected from multiple sources in multiple ways.[3] [4] [5] CCCs consider multiple learners during a given meeting, causing time pressure for efficient data review and group processing. Information sharing is, therefore, crucial to defensible group decision-making.[6] [7] [8] [9] [10] Group decision-making in this context is fraught with challenges, however, especially effective and efficient sharing of information.[11] Dashboards are used for information sharing,[12] [13] but optimal design approaches are unclear.
While literature has described best practices for reviewing assessment data and for better integration of data into group decision-making,[11] [12] actual practice across institutions remains highly variable. One suggestion is to use interactive data dashboards. Dashboards combine several types of scientific data visualization[14] to help interpret large datasets. Dashboards have also been used to facilitate communication between program leadership and residents.[15] [16] Recently, medical education dashboards have integrated real-time, web-based systems with data visualizations to provide timely feedback to trainees and increase transparency of competency assessment.[17] [18] [19] [20] Few studies, however, describe how CCC dashboards should be designed in a learner-centered manner that meets the needs of end users.[21] Boscardin et al provided 12 practical tips to implement educational dashboards,[13] one of which suggests a team-based approach to codesign a dashboard with multiple stakeholders, including the learners. CCC dashboards should be designed using rigorous methods and user input to ensure maximum value.
Objectives
Our objective was to conduct a user-centered evaluation of an existing dashboard using a multimethod approach and generate design recommendations (DRs) that can be used to guide programs in the development of a competency assessment dashboard.
Methods
Current Clinical Competency Committee Dashboard
The CCC at the University of Cincinnati Internal Medicine (IM) residency program uses an Excel-based dashboard (Excel 2016, Microsoft Corporation, Redmond, WA) developed by the IM program director. The dashboard incorporates data from the program's workplace-based assessment system, utilizing frontline assessment elements called observable practice actions/activities (OPAs).[22] [23] OPAs are specific skills that residents perform in daily practice; they are rated by faculty members, resident peers, medical students, nurses, and allied health professionals using a five-point entrustment–supervision scale.[24] These OPA-based entrustment ratings are mapped to the ACGME IM subcompetencies in the electronic residency management system (MedHub Incorporated, Ann Arbor, MI) and inform each resident's subcompetency ratings.[25] A typical resident receives an annualized average of 83 assessment encounters, producing approximately 4,000 subcompetency assessments. The dashboard also includes a learning analytics overlay, an "expected entrustment score," generated from historical programmatic entrustment data using a generalized linear mixed model with random effects.[26]
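To make the idea of the expected-score overlay concrete, the sketch below approximates it with a linear mixed model in Python. This is an illustrative simplification, not the published model: the cited approach is a generalized linear mixed model with random effects, and the column names used here (resident_id, month_in_training, entrustment) are hypothetical.

```python
# Illustrative sketch only: approximates the "expected entrustment score" idea
# with a linear mixed model; the program's actual generalized linear mixed
# model and its covariates are not reproduced here.
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical export of historical entrustment ratings.
ratings = pd.read_csv("historical_entrustment.csv")

# Random intercept per resident, fixed effect of time in training.
model = smf.mixedlm(
    "entrustment ~ month_in_training",
    data=ratings,
    groups=ratings["resident_id"],
)
fit = model.fit()

# "Expected" score for each month of training, based on the population-level
# (fixed) effects; this is the curve a dashboard could overlay on actual scores.
expected = fit.predict(pd.DataFrame({"month_in_training": range(1, 37)}))
print(expected[:5])
```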
A mock-up of the current dashboard is displayed in [Fig. 1 ]. [Table 1 ] describes the various components that we considered for this study, along with their labels from [Fig. 1 ]. In addition to structured feedback, assessors also provide narrative comments describing each resident's strengths and opportunities for improvement. Narrative data are downloaded from MedHub into an Excel document that is separate from the entrustment dashboard and separately reviewed by the CCC members. Of note, the current dashboard does not display numerous other data points, such as practice exam scores or patient care data, that are collected about residents and must be accessed separately.
Fig. 1 (a) Mock-up of the front page of the dashboard with the following components: (A) Spider Graphs of the averaged faculty and peers/allied health subcompetency scores from the last 6 months, (B) Line Graphs demonstrating the trend over time of the data in (A), (C) Control Chart to assess special cause variation, (D) Main Heatmap of subcompetencies including counts and z-scores, (E) PivotTable Selectors. (b) Mock-up of the second page of the dashboard, with the following components: (F) Comments by Month table, (G) Ratings Count table, (H) Subcompetency Review Chart, (I) OPAs Past 6 Months, (J) OPAs rated >25 times during residency, (K) CCC Meeting Worksheet, (L) PivotTable Selectors. CCC, clinical competency committee; OPA, observable practice action/activity.
Table 1
Description of relevant dashboard components used in the ranking experiment
| Component | Description |
| --- | --- |
| Line Graph (A) | Presents actual versus expected average entrustment scores by month, along with a trendline |
| Spider Graph (B) | Shows average actual versus expected entrustment ratings by subcompetency over the past 6 months |
| Control Chart (C) | Designed to identify hidden variation in scores; alerts users when data for a given month are out of the ordinary |
| Main Heatmap (D) | Displays the average rating for each subcompetency, color coded by z-score |
| Comments by Month (F) | Worksheet in which committee members can paste relevant narrative data and leave their own comments about scores by month |
| Ratings Count (G) | Shows the number of ratings at each entrustment score (i.e., the number of 1's, 2's, 3's, etc.) |
| Subcompetency Review (H) | Shows data similar to the Main Heatmap, specifically average scores by subcompetency, but located on the second page |
| OPAs Past 6 Months (I) | Lists the OPAs that have been rated in the past 6 months |
| OPAs Rated <25 Times (J) | Lists OPAs that have been rated fewer than 25 times, meaning that they may not give an accurate depiction of a resident's skill in that area |
Abbreviation: OPA, observable practice action/activity.
Study Setting and Participants
This study was conducted with members of the IM residency program CCC. There were 87 categorical residents enrolled in the program at the time of this study. The IM CCC consisted of 22 members, including the program director, associate program directors, core education faculty, and chief residents who have completed their residency training. Not every member was present at every meeting. The CCC met monthly for 2 hours to discuss half of a single class as well as any specific resident competency issues as they arose. Each resident was reviewed at least semiannually.
All members of the CCC received an email from author B.K. requesting their participation and responded on a voluntary basis. After 11 interviews (50% of the CCC members), the study team felt that thematic saturation had been achieved. The final participants in this study included eight faculty physicians and three chief residents. Participants were offered a gift card to compensate for their time.
Study Design
We employed a user-centered evaluation approach to better understand how CCC members used the current system before our team began an iterative redesign process. As the first round of evaluation in a phased strategy,[27] our study utilized a multimethod approach[28] combining quantitative questions (ranking experiment) with the qualitative interviews. These methods included affinity diagramming,[29] [30] [31] a ranking experiment,[32] and process mapping for workflow analysis.[33] [34] The overall study flow is detailed in [Fig. 2]. The process began with an initial preinterview with a CCC associate program director (author E.W.). The first author, along with two research assistants, used this preinterview about dashboard function with author B.K. to develop a set of 12 "grand tour" style questions.[35] These questions, listed in [Table 2], were designed to allow participants to give an overview of their background, experience, workflow, and overall thoughts and opinions of the dashboard.
Fig. 2 Overview of how the study was conducted.
Table 2
Questions asked during the semistructured interview
| Category | Question | Text |
| --- | --- | --- |
| Participant background | 1 | What is your name and job title? |
| | 2 | How many years have you been with Internal Medicine? What is your other prior experience? |
| | 3 | What is your role on the competency committee? What responsibilities do you have? |
| | 4 | When do you use this dashboard? |
| Workflow and daily usage | 5 | Who collects and enters the data that are used in this dashboard? |
| | 6 | Please walk me through your process when you utilize the dashboard to review a learner, sharing your thoughts aloud. |
| Introduction of ranking session | | |
| Ranking | 7 | Rank the cards in order of value. In other words, order the cards by how valuable you feel each component of the dashboard is to help you review a learner. |
| | 8 | Rank the cards in order of frequency of use. In other words, order the cards by how often you find yourself looking at each individual component, regardless of how important the information it contains is to your process. |
| Individual interpretation and opinion | 9 | How much of the raw data are helpful to see? On one extreme (1), we would only display the raw data from the assessors with no backend analysis. On the other extreme (5), we would display no raw data and only display a recommendation based on backend analysis of the data. |
| | 10 | What issues have you encountered while trying to use the dashboard? |
| | 11 | Once you are finished using the dashboard, how do you save what you have learned for later? Are there other resources that you utilize to help you with your competency assessment decisions? |
| | 12 | Knowing that we are looking to redesign this dashboard to improve both competency assessment and resident education, what would be most important for us to consider about how you utilize this platform? Are there any additional features that you can think of that would be helpful to you? |
Questions (7) and (8) were specifically designed to accompany a ranking experiment. The interviewer asked follow-up questions at each step and allowed the participant to guide the conversation whenever possible. Most interviews were conducted through online video calls with screen sharing. These interviews lasted approximately 45 minutes and were facilitated, recorded, and transcribed verbatim by the interviewers. The interviewers then reviewed each other's transcriptions as a quality check to verify their accuracy and completeness. Participant privacy was protected by de-identifying the transcripts and referring to participants as "P00" through "P10." This study was reviewed and approved by the University of Cincinnati Institutional Review Board (IRB# 2019-1418).
Data Analysis
Using the transcribed interviews, three analyses were conducted. First, the transcripts were examined using the actor–action–artifact model to complete process mapping.[33] [34] This method helped determine the action (what is being done at each step), the actor (who is doing the action), and the artifact (items that enable the actor to complete the action) involved at each step. Individual process maps were used to generate a consolidated workflow diagram.
Second, the transcriptions were utilized to create affinity diagrams.[29] [30] [31] Affinity diagramming, also known as the K-J Method, was originally developed for groups to make sense of large amounts of data and helps organize information into groupings of related concepts.[30] To begin, the research team used a bottom-up approach to determine the categories used for the initial data coding and transcript highlighting. These categories included "dashboard background/context," "potential pain points," "expectations/personal opinions/suggestions" from the users, and "dashboard usage and mindset." After highlighting was completed, the quotes were exported to the online platform Miro (Miro, San Francisco, CA) and placed on "virtual sticky notes." In the critical next step,[30] the team worked together silently to form groupings of quotes based on their similarities. Once categories had been formed, names were assigned based on the theme of the quotes in each group for further analysis.
The third analysis technique combined a quantitative approach with follow-up qualitative interview questions. This technique, a ranking experiment,[32 ] assigned a numerical rank to the subjective value of nine dashboard components (highest rank being “1” and lowest being “9”). In interview questions (7) and (8), the participants were asked to rank components by “value” and by “frequency of use,” then to discuss their thought process. The average scores were calculated for each of the two questions, then combined to produce a final rank for each component. A preliminary analysis of the ranking experiment has been published elsewhere,[36 ] but has been expanded upon greatly in this work.
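As a minimal sketch of this rank aggregation, the snippet below averages value and frequency ranks for each component, computes standard deviations, and derives an overall rank; the rankings shown are invented for illustration and are not the study data.

```python
# Minimal sketch of the rank aggregation described above (invented example data).
import pandas as pd

# Rows = participants, columns = dashboard components, values = rank (1 = best).
value_ranks = pd.DataFrame({
    "Line Graph (A)":    [1, 1, 2, 1],
    "Spider Graph (B)":  [2, 3, 1, 4],
    "Control Chart (C)": [9, 8, 9, 7],
})
frequency_ranks = pd.DataFrame({
    "Line Graph (A)":    [2, 1, 1, 3],
    "Spider Graph (B)":  [3, 4, 2, 5],
    "Control Chart (C)": [8, 9, 7, 9],
})

summary = pd.DataFrame({
    "avg_value": value_ranks.mean(),
    "sd_value": value_ranks.std(),
    "avg_frequency": frequency_ranks.mean(),
    "sd_frequency": frequency_ranks.std(),
})
# Overall score is the mean of the two averages; lower means more important.
summary["overall_score"] = summary[["avg_value", "avg_frequency"]].mean(axis=1)
summary["overall_rank"] = summary["overall_score"].rank().astype(int)
print(summary.sort_values("overall_rank"))
```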
Results
Process Mapping
There are four major actors in the CCC assessment process: residents, frontline assessors, pre-reviewers, and CCC members. There are five phases in which actions can occur: patient care, assessment, data compilation, pre-review, and committee review. A summary of this workflow in the form of a swim lane diagram is presented in [Fig. 3 ].
Fig. 3 Condensed workflow diagram demonstrating the roles of the resident, assessor, pre-reviewer, and CCC members. Importantly, the pre-review process and the CCC meeting discussions do not share the same workflow but use the same visualizations on the dashboard. Pre-reviewers also do not follow a standard method for reviewing residents; they may look at visualizations in any order and highlight whatever commentary they feel is appropriate for the discussion. CCC, clinical competency committee.
First, resident physicians care for patients. Then, the frontline assessors provide entrustment ratings and narrative assessments for each resident. After assessments are completed, the data are processed and compiled into the Excel sheet by the dashboard design team (specific members of the CCC that help to create the dashboard).
At this point, the dashboard is sent to the CCC. Some members of the CCC serve as volunteer “pre-reviewers” and are assigned four to six residents to present to the other CCC members during the monthly meeting.
When pre-reviewers evaluate a resident, they first look at the opening screen of the dashboard to synthesize information found on the line and spider graphs, the subcompetency review chart, and the control chart. Just as critical to their review are the narrative data offered by assessors, which help the CCC members understand why a resident was rated at a certain entrustment level. To access this information, the CCC member must switch to a separate Excel file and filter manually to the correct resident. At this point, the pre-reviewer completes the "resident review sheet" by manually adding narrative comments that give context to the scores on the dashboard and synthesizes the data to create a set of recommendations for CCC review.
At the CCC meeting, the dashboard is utilized again but in a different manner. The pre-reviewer shares the recommendations they have made and navigates the dashboard, which is displayed on a screen for the committee to see. At this point, the narrative data have been manually added to the dashboard by the pre-reviewer and are heavily emphasized in the committee's discussion. The committee does not emphasize the numeric values but rather looks for signal using z-scores that are highlighted in green (above expected scores) or red (below expected scores).[26] The full committee shares its thoughts on the resident's performance and provides feedback to be shared with the resident by their faculty coach (a role outside of the committee filled by faculty members assigned to each resident).
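As a rough illustration of that signalling logic, the sketch below flags a hypothetical table of z-scores; the real dashboard applies this through Excel conditional formatting, and the ±1 threshold and column names here are assumptions for the example.

```python
# Hedged sketch of z-score signalling (hypothetical data and threshold);
# the actual dashboard does this with Excel conditional formatting.
import pandas as pd

z_scores = pd.DataFrame(
    {"PC1": [0.8, -1.4], "MK1": [0.1, 2.3]},
    index=["2024-01", "2024-02"],
)

def signal(z: float) -> str:
    # Green when a resident is above expectation, red when below.
    if z >= 1.0:
        return "green (above expected)"
    if z <= -1.0:
        return "red (below expected)"
    return "no flag"

print(z_scores.applymap(signal))
```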
Affinity Diagramming
Five themes were identified through affinity diagramming that help us further make sense of these workflows. [Table 3 ] contains selected quotations from the interviews with participants that support these findings, but each theme is summarized below.
Table 3
Theme and representative quotes
| Theme | Participant ID | Representative quotes |
| --- | --- | --- |
| Preprocessing of data is time-consuming | P02 | "It's a process that I work with [P04]. So first we download the data as a flat file from MedHub and [P04] does the analysis and I take the data and plug it and update the dashboard every month and then I send the dashboard out" |
| | P02 | "[we run] it through his program which sometimes takes three/four hours" |
| | P07 | "So having this Excel sheet that you have to download, it's not updated. By the time we actually get the data, we're probably at least a month behind; and then by the time I meet with my resident, there's another month of data that hasn't been uploaded." |
| | P00 | "As a pre-reviewer and as a coach, you must have both of these files open and toggle them back and forth and all that kinds of stuff and so it's really hard to connect the quantitative to the qualitative without doing all this toggling." |
| Integration of narrative and other data | P00 | "The narrative comes in a separate Excel file, and I would filter to whatever resident I was going to look at, and then the problem is that you try to match up which data point on the graph goes with which comments" |
| | P03 | "It bothers me that you know ITE scores[a] and MKSAP[b] and Long Block are nowhere to be found in the dashboard at all." |
| | P01 | "I don't think right now we have a good way of viewing narrative data…" |
| There is no agreed upon interpretation for some dashboard components | P03 | "In my opinion, [the control chart] has only been useful in probably two instances in my life" |
| | P07 | "And when I have seen it or seen significant standard deviation, I'm usually not able to tell why. So even looking through narrative data, it doesn't really give me a good explanation. So I think it was usually just two different attendings that have rated in two different ways. I couldn't really get anything out of that" |
| | P01 | "The heat map means nothing to me. I don't even have a framework in my head for what 'systems-based practice 2' is. I would have to go back to like the ACGME internal medicine [subcompetency] document to even know what that even stood for." |
| Data are not saved for future competency review | P06 | "The [ideal] dashboard would be more of sort of a longitudinal tool, where we could be able to quickly assess those graphs. Not going back in or not just looking at this dashboard for this particular month, but for this particular learner over the past three years, a certain way to collate the information where I can be able to quickly access that data within one master document or program if that makes sense." |
| | P10 | "I mean I think that being able to see all of the critical deficiencies in the last six months, being able to toggle through and see what performance evaluations and where residents were in their schedule are two big things and then being able to quickly access the comment, the qualitative comments and match them to the scores that people are getting are most important and then again being able to store that information. As you're kind of creating a live document and going through this process, such that you're not reinventing the wheel all the time, you don't have to like rename the Excel sheet with the residents name and all that business." |
| | P08 | "It would be really nice if it was just an interactive dashboard that we could go in at any time rather than having to rely on Excel spreadsheet data." |
| CCC meeting workflow is different from pre-reviewer workflow | P00 | "We have sharing issues with the committee too. Committee members get the dashboard ahead of time but anything that I fill out here, they can't see until I show up at committee and pull it up on the screen because it's saved on my OneDrive or whatever" |
| | P03 | "We have realized that there is some danger in having one person to review everything in depth, and then start sharing it to the group…there is a potential for bias there." |
| | P07 | "Unless it's a resident that's obviously struggling and I'm spending a lot more time looking through every single narrative, and going through month by month, even if it's a lot older data. But it usually takes me to do the preview around eight to ten minutes." |
Abbreviations: ACGME, Accreditation Council for Graduate Medical Education; CCC, Clinical Competency Committee.
a ITE, in training exam; an annual, standardized examination for learners.[37 ]
b MKSAP, American College of Physicians Medical Knowledge Self-Assessment Program.[38 ]
Preprocessing of Data is Time-Consuming
There are significant delays in the delivery of real-time data for CCC review. This is because the current dashboard is recompiled each month by using data from multiple sources. The Excel sheet is cumbersome to create and dissemination to reviewers can delay assessment and review by several days. Additionally, more data accrue between the time the spreadsheet is created and the date that the CCC meets. These data are often not reviewed at the meeting even if they could be helpful. This leads to the CCC using data that may not be the most up to date.
Integration of Narrative and Other Data
Narrative data play a critical role in understanding how residents are performing, yet these data are very difficult to combine with the existing dashboard. Currently, the narrative comments must be accessed separately from the dashboard and are manually copied into the “resident review sheet.” Reviewers struggle to navigate through the sheets to get a complete picture of the data. Other data sources like the annual “in training exam”,[37 ] the American College of Physicians Medical Knowledge Self-Assessment Program,[38 ] and the Long Block data (a unique 1-year ambulatory training program for the UC IM Residency Program)[39 ] are not included in this dashboard, but were useful to participants in their review tasks.
Users Lack Understanding of Certain Dashboard Components
When asked to describe each visualization, several participants were not certain how components like the Control Chart (C) were meant to work. For example, P03 only found the Control Chart (C) useful "in probably two instances." P07, however, uses the Control Chart (C) to "immediately jump to the narrative data" based on the information they gain from the chart. Because P07 understood how to properly interpret the chart, they did not discount it immediately.
Data are Not Saved for Future Competency Reviews
Even though data are added to the dashboard each month, the reviews for each resident from prior CCC meetings are not saved in the next month's file for later review. To access these previous reviews, CCC members must remember the last time the resident was reviewed and search through computer folders to find the appropriate file. This makes it difficult to use the tool to follow residents' longitudinal development.
Clinical Competency Committee Meeting Workflow is Different from Pre-reviewer Workflow
Pre-reviewers spend much more time reviewing resident data, including the narrative data described above, than is spent presenting those data at CCC meetings. The current system does not accommodate these differing workflows.
Ranking Experiment
The ranking experiment allowed the team to assess the dashboard components by their overall perceived value and frequency of use. [Table 4 ] displays the results of the ranking experiment. The components can be organized into the following five groups based on the similarity of their average scores.
Table 4
Results of the ranking experiment, ranks, and standard deviation of the scores (11 participants)
| Group[a] | Dashboard components[b] | Average perceived value | Standard deviation | Average frequency of use | Standard deviation | Average overall score | Overall rank |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | (A) Line Graph | 1.31 | 0.46 | 1.88 | 1.36 | 1.59 | 1 |
| 2 | (F) Comments by Month | 3.00 | 1.27 | 2.69 | 2.19 | 2.84 | 2 |
| | (B) Spider Graph | 2.63 | 2.46 | 3.31 | 2.07 | 2.97 | 3 |
| 3 | (H) Subcompetency Review | 5.88 | 2.01 | 4.88 | 1.91 | 5.38 | 4 |
| | (G) Ratings Count | 5.38 | 1.79 | 6.00 | 1.65 | 5.69 | 5 |
| | (I) Lowest OPA's Past 6 Months | 5.63 | 1.16 | 5.81 | 2.07 | 5.72 | 6 |
| 4 | (D) Main Heatmap | 6.44 | 0.93 | 6.44 | 1.71 | 6.44 | 7 |
| | (J) OPA's > 25 | 6.88 | 1.53 | 6.19 | 1.81 | 6.54 | 8 |
| 5 | (C) Control Chart | 7.81 | 1.98 | 7.81 | 1.98 | 7.81 | 9 |
Abbreviation: OPA, observable practice action/activity.
a The group number was assigned based on similarity of overall average scores to compare similarly ranked components.
b Dashboard components are labeled “(A) to (J)” based on their order in the interface (as seen in [Fig. 1 ]).
Group 1 (the highest ranked component) contained the Line Graph (A). This visualization was valued for its quick “at-a-glance” style information.
Group 2 contained the Comments by Month (F) and Spider Graph (B) components. These two ranked closely and were appreciated for being essential for narrative data collection and CCC decision-making, respectively.
Group 3 contained the widest variety of components, including the Subcompetency Review (H), Ratings Count (G), and Lowest OPAs Past 6 Months (I). These provided valuable information but were not applicable to every review.
Group 4 contained the Main Heatmap (D) and OPAs > 25 (J). These components were less favored by the committee members despite their potential utility.
Group 5 (the lowest ranked group) contained a single component, the Control Chart (C). This component was generally overlooked by users and was ranked low in both categories.
Calculating a standard deviation for the ranks of each component also helps place our quantitative results in context with our qualitative interviews. The highest deviation in ranks came from Comments by Month (F, SD = 2.19) for frequency of use and from the Spider Graph (B, SD = 2.46) for value. This echoes the finding from affinity diagramming that there is not one agreed upon interpretation of these dashboard components. The lowest deviation came from the Line Graph (A, SD = 0.46) for value, suggesting that users consistently considered the line graphs the most valuable part of the dashboard.
Discussion
Key Findings
This user-centered evaluation of the University of Cincinnati IM residency program's CCC dashboard employed a multimethod approach. The process maps revealed that individual users of this dashboard have highly varied approaches to their use of the system, implying that a good dashboard must cater to multiple user types with differing needs. Affinity diagramming yielded several qualitative themes that describe areas for improvement identified by current users of the system. These themes can be used both to improve our own system and to recommend design principles for others looking to create their own dashboards. Finally, the ranking experiment showed how users assign varying levels of importance to different dashboard components. For example, components like the Line Graph (A) and Comments by Month (F) should be prioritized due to their universal utility, while others, like the Control Chart (C), may require reevaluation or redesign to better align with user needs. These findings emphasize the importance of user-centered design in health care informatics and offer an opportunity for others to learn from our evaluation efforts.
Design Recommendations
Using these findings, this research team is conducting an iterative redesign process, making the following four DRs to improve upon the current CCC dashboard system. These recommendations can be used to guide other programs that may be looking to evaluate or design their own CCC dashboards.
Design Recommendation 1: Integrating Both Quantitative and Qualitative Data
CCC members require both quantitative scores and qualitative data to make defensible decisions.[40] [41] [42] [43] [44] According to the interviews, there is no integration between narrative data and dashboard information. Despite this, narrative comments are integral to understanding the scores provided by assessors. They also provide direct feedback to the CCC and, when combined with the quantitative scores, can help residents set specific goals for future improvement. An ideal dashboard should integrate both types of data.
Design Recommendation 2: Developing an Informatics Platform for Better Management of Data
Currently, the dashboard is formed by combining raw data from multiple sources into a document that is created each month and shared with the CCC for review. In addition to workplace-based entrustment data[22] [23] enhanced through the use of learning analytics[26] and narrative data,[23] multiple sources outside of the Excel-based dashboard are used to assess residents. These sources include an internal testing program,[45] multisource feedback from a year-long longitudinal ambulatory Long Block,[39] resident self-assessment,[46] and outpatient clinical care measures.[47] The current process leads to difficulty retrieving data for future reviews and to the time-consuming preprocessing workflow described above.
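As one illustration of what a consolidated store for these sources could look like, the sketch below defines a minimal relational schema in SQLite; all table and column names are hypothetical and do not describe the program's actual platform.

```python
# Illustrative sketch of DR2: one relational store for entrustment ratings,
# narrative comments, and exam scores. Schema names are hypothetical.
import sqlite3

conn = sqlite3.connect("ccc_dashboard.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS resident (
    resident_id TEXT PRIMARY KEY,
    class_year  INTEGER
);
CREATE TABLE IF NOT EXISTS entrustment_rating (
    rating_id     INTEGER PRIMARY KEY,
    resident_id   TEXT REFERENCES resident(resident_id),
    subcompetency TEXT,
    score         INTEGER,   -- five-point entrustment-supervision scale
    rated_on      DATE
);
CREATE TABLE IF NOT EXISTS narrative_comment (
    comment_id   INTEGER PRIMARY KEY,
    resident_id  TEXT REFERENCES resident(resident_id),
    comment_text TEXT,
    entered_on   DATE
);
CREATE TABLE IF NOT EXISTS exam_score (
    score_id    INTEGER PRIMARY KEY,
    resident_id TEXT REFERENCES resident(resident_id),
    exam_name   TEXT,        -- e.g., ITE, MKSAP
    percentile  REAL,
    taken_on    DATE
);
""")
conn.commit()
```

Centralizing the sources in this way would let a dashboard query a single store rather than re-merging spreadsheets each month.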
Design Recommendation 3: Improving the Interpretability and Actionability of Visualization
At the beginning of this project, it was clear that the assessment data used by the IM team were high quality and rooted in the best practices described in the literature. As the dashboard grew from a small set of graphs to a full assessment system, each user developed their own workflow for using the dashboard to evaluate resident progress. Alongside the redesign efforts, additions that may help users interpret and act on the visualizations include informational tooltips that describe the content of each figure and provide instructions for use.
Design Recommendation 4: Creating Multiple Views to Support Workflow with High Usability
A comprehensive dashboard would include multiple views of the data, depending upon which user logs in to the system. Workflow can vary greatly depending upon the goal of the user (e.g., a CCC pre-reviewer has a different workflow than a coach sharing reviews with a resident). Laying out the page differently based on the user's goal may help create a more holistic picture of each resident's competencies and make the system more usable. Dashboard visualizations ought to be customized to serve these disparate workflows.
Learner Evaluation
As described briefly above, the IM team uses the current dashboard as a tool to help learners understand their own progress. Although they receive basic training on the information contained within the system, most residents do not spend a significant amount of time reviewing the dashboard. A focused, resident-specific learning plan could substantially change the way residents interact with their own data to learn. Additionally, descriptive tooltips or pop-up windows that explain how the data are collected, what the data measure, and how each visualization should be interpreted will be a key strategy to support usability.
Limitations
The study has multiple limitations. First, two of the eleven participants did not take part in the card-sorting portion of the interview. One took a job at another institution and was unable to participate in a follow-up card-sorting interview. The other performs critical dashboard development tasks but does not spend time utilizing the dashboard for reviews; their input was therefore valuable for developing the initial qualitative themes but was not necessary for accurate card-sorting results. Second, the needs of the University of Cincinnati IM CCC may differ from those of CCCs in other fields and at other institutions, which may limit the generalizability of our specific workflow to other residency programs. Our DRs, however, are broad and could be widely applicable to any program looking to replicate or attempt similar work. Third, end users were defined as the members of the CCC who are tasked with providing summative assessments for residents. Ideally, the list of end users would also include learners, but this group was not considered in this initial evaluation. In the future, we could look to create a generalizable process and/or system with a common data model that can be utilized by multiple residency programs. Fourth, discomfort with and lack of understanding about how to use certain charts or calculations, such as the Control Chart (C), may have influenced participants' rankings. Training on how to use these tools and interpret the data would help ensure consistency in the evaluation of learner data.
Future Directions
Our team is currently completing a full redesign and implementation of the dashboard using the findings of this study. We will look to create a flexible and automatic system allowing for multiple users, different views, and drill-down analysis. We will also conduct thorough usability testing to validate our product.
To address each DR, we will take the following steps in the redesign. First, the new system will fully integrate the narrative data by allowing drill-down analysis of the narrative comments for each resident and subcompetency. Second, data from all sources will be stored in a relational database. As much as possible, this process will be automated using batched data downloads and real-time queries to MedHub, which stores the program's current IM education data. From there, the Flask library[48] in Python with a Representational State Transfer (REST) design will be used to create a web application enabling access control and increased extensibility and connectivity. This will require strong informatics and development support to build and maintain but will allow for more flexible use of the system and fuller control over the development of future features.
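To make this architecture concrete, the sketch below shows the kind of Flask REST endpoint such a system could expose, with a simple role check standing in for access control. The route, the get_subcompetency_scores helper, and the session structure are illustrative assumptions rather than the actual implementation.

```python
# Illustrative Flask/REST sketch; endpoint names and helpers are hypothetical.
from flask import Flask, abort, jsonify, session

app = Flask(__name__)
app.secret_key = "replace-with-a-real-secret"

def get_subcompetency_scores(resident_id: str) -> list[dict]:
    # Placeholder for a query against the relational database described above.
    return [{"subcompetency": "PC1", "month": "2024-01", "avg_entrustment": 3.4}]

@app.get("/api/residents/<resident_id>/subcompetencies")
def subcompetencies(resident_id: str):
    user = session.get("user")
    # Access control: residents may see only their own data; CCC members see all.
    if user is None or (user["role"] == "resident" and user["id"] != resident_id):
        abort(403)
    return jsonify(get_subcompetency_scores(resident_id))

if __name__ == "__main__":
    app.run(debug=True)
```

A role-aware endpoint of this kind is also what would support the user-specific views for residents and CCC members described next.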
Next, a cross-functional development team consisting of user interface/user experience designers, developers, stakeholders from the IM team, and researchers will work together to create and iteratively refine this new tool. Finally, the dashboard will be accessible by both the CCC members and the residents. User-specific views will be created so that residents can safely view their data, but not edit comments or see other resident evaluations, while CCC members can continue their reviews seeing all data.
An additional area of feedback that we received from the CCC members concerned how they utilize the dashboard during coaching sessions with the residents. Because of the data structure and the limitations of the Excel spreadsheet, there is no easy way for a resident to access their own data without having access to the data for the entire program. A future project will take this into account and work specifically to address the needs of coaches as they help residents understand their data using this system. After completing the first iteration of the dashboard, our team will interview current residents so that we can take their opinions into account.
Conclusion
We conducted a multimethod, user-centered evaluation of an existing competency assessment dashboard. We identified both strengths and weaknesses of the current dashboard and plan to use these findings in our efforts to redesign the system. We hope that the process we describe for analyzing and optimizing a dashboard for residency education will be helpful to others developing or refining similar tools. From our findings, we recommend that programs integrate both qualitative and quantitative data into their dashboards, develop a strong informatics pipeline to manage those data, create informative and useful visualizations to aid in the review process, and create multiple views of the data to support different workflows.
Clinical Relevance Statement
CCCs aim to assess resident progress and readiness to practice unsupervised. This user-centered evaluation study is clinically relevant because it helps to maintain and update a dashboard that is used to provide specific feedback about important core competencies for physicians-in-training.
Multiple Choice Questions
According to the process mapping and affinity diagramming, which of the following is a weakness of the current dashboard identified in the study?
(A) All users work with the same view of the dashboard.
(B) The assessment data used by the IM team were not of high quality.
(C) CCC members do not use quantitative scores and qualitative narrative data.
(D) The required OPAs are not indicated well in the dashboard.
The correct choice is (A). Both quantitative scores and qualitative narrative data are high quality when displayed in the dashboard. However, they were poorly integrated when considering how each member of the CCC reviews and ranks their residents. As mentioned in the manuscript, narrative data and quantitative scores were essential to rank a resident, and the required OPAs are clearly indicated on the dashboard. Option (A) remains the correct answer choice, as the manuscript showed that each CCC member uses the dashboard differently, depending on whether they are talking with a resident, preparing a review for the team, or reviewing the resident as part of the team.
In this multimethod approach, which of the following would be considered a quantitative approach taken in the study to understand the user behavior of the dashboard?
(A) Affinity diagramming
(B) Process mapping
(C) Ranking experiment
(D) Interviews
The correct answer choice is (C). Although most of the feedback received from dashboard users was qualitative, a ranking experiment was chosen as the third user-centered evaluation approach to allow a quantitative measure of the subjective value placed on dashboard components by each participant. Process mapping offered insight into the workflow of the review committee (presented in [Fig. 3]), and affinity diagramming allowed the identification of themes to make sense of the workflow. While these methods offered narrative data on the usefulness of dashboard components, the ranking experiment provided a quantitative measure that, combined with the qualitative interview results, allowed the study to create a more holistic picture.