Abstract
We investigated the applicability of the Delphi method for increasing the agreement
among multiple cardiologists on, firstly, their classifications of a set of electrocardiograms
and, secondly, their reasons for these classifications. Five cardiologists were requested
to judge the computer classifications of a set of thirty ECGs. If a cardiologist disagreed
with the computer classification, he had to provide a new classification and a reason
for this change. The results of this first round were compiled and anonymously fed
back to the cardiologists. In a second round the cardiologists were asked once again
to judge the ECGs and to rate the reasons provided in the first round. The level of
agreement was estimated by means of the kappa statistic. The Delphi procedure substantially
increased agreement among the cardiologists on the classifications. The final
agreement was very high and comparable to the intra-observer agreement. There was
also a high level of agreement on the reasons provided by the cardiologists. However,
using these reasons to improve the program's performance is hampered by the qualitative
nature of many of them. Suggestions are given for a more formalized elicitation of
knowledge.
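
For reference, the kappa statistic referred to above expresses observed agreement corrected for the agreement expected by chance. The abstract does not state which variant (pairwise Cohen or multi-rater Fleiss) was applied; in either case the general form, with P_o the observed proportion of agreement and P_e the proportion expected by chance, is

\[
  \kappa = \frac{P_o - P_e}{1 - P_e}
\]

As a purely illustrative calculation (the values are hypothetical, not taken from the study), P_o = 0.90 and P_e = 0.50 would give \kappa = (0.90 - 0.50)/(1 - 0.50) = 0.80.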
Key-Words
Kappa Statistic - Inter-Observer Agreement - Knowledge Validation