Methods Inf Med 1983; 22(02): 87-92
DOI: 10.1055/s-0038-1635421
Original Article
Schattauer GmbH

How to Detect and Prevent Errors in Computer-Supported Statistical Analysis: An Example*

Wie Kann Man Fehler Bei Rechnergestützten Statistischen Auswertungen Entdecken Und Verhindern? — Ein Beispiel
R. Haux
From the Department of Medical Documentation, Statistics and Computer Science, University of Heidelberg, F.R.G.

Publication History

Publication Date:
20 February 2018 (online)

Statistical analysis systems store data in a database and procedures (analysis methods) in a method base. As this article shows, even for a simple statistical test, the detection of faulty data is deficient in well-known statistical analysis systems such as BMDP, SAS and SPSS. Suggestions are made for programming methods and for designing statistical analysis systems so that careful error detection becomes possible and the user receives more reliable results.

Statistische Auswertungssysteme speichern Daten in ihrer Datenbank und Methoden (Auswertungsprogramme) in ihrer Methodenbank. Anhand eines Beispiels wird gezeigt, daß, selbst im Falle eines einfachen statistischen Tests, die Überprüfung fehlerhafter Daten in den bekannten statistischen Auswertungssystemen BMDP, SAS und SPSS mangelhaft ist. Diese Arbeit enthält Vorschläge, wie Methoden zu programmieren und wie statistische Auswertungssysteme zu entwerfen sind, um eine bessere Fehlerprüfung zu gewährleisten, mit dem Ziel, dem Anwender vertrauenswürdigere Ergebnisse liefern zu können.
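The abstract does not say which test or which checks the article examines, so the following is only an illustrative sketch of the general idea: a method that verifies its own preconditions before it computes anything. The rank-sum statistic, the declared value range, and the names `DataError`, `check_sample` and `rank_sum` are all assumptions made for this example and are not taken from the article.

```python
import math
from typing import Sequence


class DataError(ValueError):
    """Raised when input data fail a precondition of the method (hypothetical name)."""


def check_sample(name: str, values: Sequence[float],
                 lo: float, hi: float, min_n: int = 2) -> None:
    """Reject a sample that is too small, non-numeric, missing, or out of range."""
    if len(values) < min_n:
        raise DataError(f"{name}: need at least {min_n} observations, got {len(values)}")
    for i, v in enumerate(values):
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            raise DataError(f"{name}[{i}]: non-numeric value {v!r}")
        if math.isnan(v) or math.isinf(v):
            raise DataError(f"{name}[{i}]: missing or infinite value")
        if not lo <= v <= hi:
            # Catches, for example, a missing-value code such as 999 read as data.
            raise DataError(f"{name}[{i}]: {v} outside declared range [{lo}, {hi}]")


def rank_sum(x: Sequence[float], y: Sequence[float], lo: float, hi: float) -> float:
    """Rank sum of sample x in the pooled sample, with midranks for ties.

    The statistic is computed only after both samples pass check_sample.
    """
    check_sample("x", x, lo, hi)
    check_sample("y", y, lo, hi)
    pooled = sorted(list(x) + list(y))

    def midrank(v: float) -> float:
        first = pooled.index(v) + 1   # 1-based position of the first occurrence
        count = pooled.count(v)       # number of tied observations
        return first + (count - 1) / 2

    return sum(midrank(v) for v in x)
```

In this sketch the checks live inside the method itself, so a caller cannot obtain a test statistic from data that violate the declared range; whether the range belongs to the method or to the database description of the variable is a design choice the sketch leaves open.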

* This article is an extended version of a talk given at the colloquium »Design Criteria for Statistical Analysis Systems« at the University of Heidelberg in 1981.


 