Methods Inf Med 2007; 46(05): 523-529
DOI: 10.1160/ME0317
Paper
Schattauer GmbH

Using T3, an Improved Decision Tree Classifier, for Mining Stroke-related Medical Data

C. Tjortjis 1, M. Saraee 2, B. Theodoulidis 3, J. A. Keane 1

1   School of Computer Science, University of Manchester, Manchester, UK
2   School of Computing, Science and Engineering, University of Salford, Salford, UK
3   Manchester Business School, University of Manchester, Manchester, UK

Publication History

Publication date:
22 January 2018 (online)

Summary

Objectives: Medical data are a valuable resource from which novel and potentially useful knowledge can be discovered by using data mining. Data mining can assist and support medical decision making and enhance clinical management and investigative research. The objective of this work is to propose a method for building accurate descriptive and predictive models based on classification of past medical data. We also aim to compare this method with other well-established data mining methods and identify strengths and weaknesses.

Method: We propose T3, a decision tree classifier which builds predictive models based on known classes, allowing for a certain amount of misclassification error in training in order to achieve better descriptive and predictive accuracy. We then experiment with a real medical data set on stroke, and various subsets of it, in order to identify strengths and weaknesses. We also compare its performance with that of C4.5, a very successful and well-established decision tree classifier.

Results: T3 demonstrated impressive performance when predicting unseen cases of stroke, with a classification error as low as 0.4%, whereas the state-of-the-art decision tree classifier yielded a classification error of 33.6%.
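
For readers unfamiliar with the metric, classification error on unseen cases is the share of test cases assigned the wrong class. The following is a minimal sketch of that calculation; the function name `classification_error` and the toy labels are illustrative assumptions and do not come from the stroke register data.

```python
def classification_error(actual, predicted):
    """Return the fraction of cases whose predicted class differs from the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one case is required")
    # Count the cases where the prediction disagrees with the known class.
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)


# Toy labels for illustration only (hypothetical, not from the study).
actual = ["stroke", "no stroke", "stroke", "no stroke"]
predicted = ["stroke", "no stroke", "no stroke", "no stroke"]
error = classification_error(actual, predicted)  # one of four cases is wrong
```

Multiplying the returned fraction by 100 gives the percentage form used in the results above.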

Conclusions: This paper presents and evaluates T3, a classification algorithm that builds decision trees of depth at most three, and results in high accuracy whilst keeping the tree size reasonably small. T3 demonstrates strong descriptive and predictive power without compromising simplicity and clarity. We evaluate T3 based on real stroke register data and compare it with C4.5, a well-known classification algorithm, showing that T3 produces significantly more accurate and readable classifiers.
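
The two ideas named in this summary, a depth bound of three and a tolerated amount of training error, can be illustrated with a generic depth-limited tree builder. This is a hedged sketch only and is not the authors' T3 implementation: the greedy threshold split, the parameter names `max_depth` and `max_error`, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def majority(labels):
    """Return the most common label and the fraction of labels that disagree with it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, 1 - count / len(labels)


def build(rows, labels, depth=0, max_depth=3, max_error=0.1):
    """Grow a tree of depth at most max_depth; a node whose training error
    is already within max_error is kept as a leaf (assumed stopping rule)."""
    label, error = majority(labels)
    if error <= max_error or depth == max_depth:
        return label
    best = None
    for f in range(len(rows[0])):
        # Candidate thresholds: every distinct value except the largest,
        # so that both sides of the split are non-empty.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            # Number of training cases misclassified by majority vote on each side.
            wrong = sum(majority([labels[i] for i in side])[1] * len(side)
                        for side in (left, right))
            if best is None or wrong < best[0]:
                best = (wrong, f, t, left, right)
    if best is None:  # no feature has two distinct values to split on
        return label
    _, f, t, left, right = best
    return (f, t,
            build([rows[i] for i in left], [labels[i] for i in left],
                  depth + 1, max_depth, max_error),
            build([rows[i] for i in right], [labels[i] for i in right],
                  depth + 1, max_depth, max_error))


def predict(tree, row):
    """Follow the splits until a leaf label is reached (labels must not be tuples)."""
    while isinstance(tree, tuple):
        f, t, low, high = tree
        tree = low if row[f] <= t else high
    return tree


# Toy data for illustration only (hypothetical, one numeric feature).
rows = [[1], [2], [3], [4]]
labels = ["a", "a", "b", "b"]
tree = build(rows, labels)
```

Raising `max_error` makes the builder stop earlier and return smaller trees, which is the trade-off between training error and tree size that the summary describes in general terms.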