How to Detect and Prevent Errors in Computer-Supported Statistical Analysis: An Example

R. Haux

doi:10.1055/s-0038-1635421


Methods Inf Med 1983; 22(02): 87-92
DOI: 10.1055/s-0038-1635421

Original Article

Schattauer GmbH

How to Detect and Prevent Errors in Computer-Supported Statistical Analysis: An Example^*

Wie Kann Man Fehler Bei Rechnergestützten Statistischen Auswertungen Entdecken Und Verhindern? — Ein Beispiel

R. Haux

(From the Department of Medical Documentation, Statistics and Computer Science, University of Heidelberg, F.R.G.)


Publication Date:
20 February 2018 (online)


In statistical analysis systems, data are stored in a database, and procedures (analysis methods) are stored in a method base. As the present article shows, even for a simple statistical test the detection of faulty data is deficient in well-known statistical analysis systems such as BMDP, SAS and SPSS. Suggestions are made for programming methods and for designing statistical analysis systems so that careful error detection is possible and the user receives more reliable results.

Statistische Auswertungssysteme speichern Daten in ihrer Datenbank und Methoden (Auswertungsprogramme) in ihrer Methodenbank. Anhand eines Beispiels wird gezeigt, daß, selbst im Falle eines einfachen statistischen Tests, die Überprüfung fehlerhafter Daten in den bekannten statistischen Auswertungssystemen BMDP, SAS und SPSS mangelhaft ist. Diese Arbeit enthält Vorschläge, wie Methoden zu programmieren und wie statistische Auswertungssysteme zu entwerfen sind, um eine bessere Fehlerprüfung zu gewährleisten, mit dem Ziel, dem Anwender vertrauenswürdigere Ergebnisse liefern zu können.
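As an illustration of what "programming a method so that careful error detection is possible" might look like for a simple two-sample rank test, the following is a minimal Python sketch. It is not the article's procedure: the abstract does not say which checks the author proposes, so the specific preconditions here (non-empty samples, numeric and finite values) are assumptions, and the names `DataError`, `check_sample` and `mann_whitney_u` are hypothetical.

```python
import math
from numbers import Real


class DataError(ValueError):
    """Raised when a sample violates an (assumed) precondition of the test."""


def check_sample(name, values):
    """Validate one sample and report the first violation found."""
    if len(values) == 0:
        raise DataError(f"{name}: sample is empty")
    for i, v in enumerate(values):
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if isinstance(v, bool) or not isinstance(v, Real):
            raise DataError(f"{name}[{i}]: {v!r} is not a number")
        if math.isnan(v) or math.isinf(v):
            raise DataError(f"{name}[{i}]: {v!r} is not a finite value")


def mann_whitney_u(x, y):
    """Return the U statistic of sample x against sample y.

    U counts the pairs (a, b) with a from x and b from y where a > b;
    tied pairs contribute 0.5 each. Both samples are validated first,
    so faulty data stop the computation instead of producing a number.
    """
    check_sample("x", x)
    check_sample("y", y)
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u
```

With these checks, a call such as `mann_whitney_u([1.2, float("nan")], [0.8, 1.1])` raises `DataError` naming the offending element rather than silently returning a statistic. The pairwise loop is chosen for readability; a rank-based computation would be preferable for large samples.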

Key-Words

Statistical Analysis Systems - Error Detection - WILCOXON-MANN-WHITNEY Test - Semantic Integrity Constraints - Statistical Quality - Statistical Computing - BMDP - SAS - SPSS

Schlüssel-Wörter

Statistische Auswertungssysteme - Fehlerprüfung - WILCOXON-MANN-WHITNEY-Test - semantische Integritätsbedingungen - statistische Qualität - rechnergestütztes statistisches Auswerten - BMDP - SAS - SPSS

^* This article is an extended version of a talk given at the colloquium »Design Criteria for Statistical Analysis Systems« at the University of Heidelberg in 1981.
