Methods Inf Med 2005; 44(05): 639-646
DOI: 10.1055/s-0038-1634020
Original Article
Schattauer GmbH

Predicting Missing Values in a Home Care Database Using an Adaptive Uncertainty Rule Method

S. Konias (1), G. Gogou (1), P. D. Bamidis (1), I. Vlahavas (2), N. Maglaveras (1)

1   Laboratory of Medical Informatics, Medical School, Aristotle University of Thessaloniki, Thessaloniki, Greece
2   Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece

Publication History

Received: 06 April 2004

Accepted: 06 December 2004

Publication Date: 07 February 2018 (online)

Summary

Objectives: The literature offers an abundance of adaptive algorithms for mining association rules. However, most of these algorithms cannot handle peculiarities such as missing values and dynamic data creation, which are frequently encountered in fields like medicine. This paper proposes a new approach for mining uncertainty rules and using them, with an adaptive threshold, to fill missing values in newly added records. The approach is particularly suitable for dynamic databases, such as those used in home care systems.

Methods: This study presents a new data mining method, FiMV (Filling Missing Values), which is based on mined uncertainty rules. Uncertainty rules are similar in structure to association rules and are extracted by AURG (Adaptive Uncertainty Rule Generation), an algorithm proposed in previous work. The main goal was to implement a method for recovering missing values in a dynamic database, where new records are continuously added, without having to specify any thresholds beforehand.

Results: The method was applied to a home care monitoring system database. Multiple missing values were randomly introduced into the attributes of each record in the initial dataset, at rates from 5% to 20% in 5% increments. FiMV achieved 100% completion rates with over 90% success in each case, whereas conventional approaches, which either ignore all records with missing values or require thresholds, showed significantly lower completion and success rates.
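
The evaluation protocol described in the Results can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the toy records are made up, `mode_fill` is a hypothetical most-frequent-value baseline standing in for FiMV (whose details the summary does not give), cells are masked across the whole table rather than per record, and the definitions of completion rate (share of missing values that received a prediction) and success rate (share of predictions matching the hidden original value) are assumed.

```python
import random
from collections import Counter

MISSING = None  # sentinel for a masked value


def inject_missing(records, rate, rng):
    """Return a copy of records with about `rate` of all cells masked at random."""
    cells = [(r, c) for r in range(len(records)) for c in range(len(records[0]))]
    n_mask = round(rate * len(cells))
    masked = set(rng.sample(cells, n_mask))
    damaged = [
        [MISSING if (r, c) in masked else v for c, v in enumerate(row)]
        for r, row in enumerate(records)
    ]
    return damaged, masked


def mode_fill(damaged):
    """Hypothetical baseline (NOT FiMV): fill gaps with the column's most frequent value."""
    filled = [row[:] for row in damaged]
    for c in range(len(damaged[0])):
        observed = [row[c] for row in damaged if row[c] is not MISSING]
        if not observed:
            continue  # no observed values in this column; leave gaps unfilled
        mode = Counter(observed).most_common(1)[0][0]
        for row in filled:
            if row[c] is MISSING:
                row[c] = mode
    return filled


def evaluate(original, filled, masked):
    """Assumed metrics: completion = filled / masked, success = correct / filled."""
    done = [(r, c) for r, c in masked if filled[r][c] is not MISSING]
    correct = sum(filled[r][c] == original[r][c] for r, c in done)
    completion = len(done) / len(masked) if masked else 1.0
    success = correct / len(done) if done else 0.0
    return completion, success


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy categorical records, invented purely for illustration.
    records = [[rng.choice("AB"), rng.choice("XYZ"), rng.choice("LH")] for _ in range(200)]
    for rate in (0.05, 0.10, 0.15, 0.20):  # 5% to 20% in 5% increments
        damaged, masked = inject_missing(records, rate, rng)
        completion, success = evaluate(records, mode_fill(damaged), masked)
        print(f"rate={rate:.0%} completion={completion:.0%} success={success:.0%}")
```

Swapping `mode_fill` for any other imputation function with the same input and output shape would allow a like-for-like comparison at each missing-value rate.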

Conclusions: The proposed method is appropriate for the data-cleaning step of the knowledge discovery process in databases. This step strongly influences the output of any data mining technique, so improving it can improve the quality of the mined information.

 