Relevance Predictability in Information Retrieval Systems

A. Kent; J. Belzer; M. Kuhfeerst; E. D. Dym; D. L. Shirey; A. Bose

doi:10.1055/s-0038-1636254

Methods Inf Med 1967; 06(02): 45-51

Original Articles

Schattauer GmbH

Voraussagbarkeit Der Relevanz Bei Information-Retrieval-Systemen

Publication Date:
17 February 2018 (online)

An experiment is described that attempts to derive quantitative indicators of how well the intermediate stimuli used to represent documents in information retrieval systems predict relevance. Since the decision to peruse an entire document is often predicated upon the examination of one »level of processing« of the document (e.g., the citation and/or abstract), it became interesting to analyze the properties of what constitutes »relevance«. Prior to such an analysis, however, a more elementary step had to be taken, namely, to determine what portions of a document should be examined.

Intermediate response products (IRPs) function as cues to the information content of full documents. Their ability to predict the relevance determinations that motivated users of information retrieval systems would subsequently make on those documents was evaluated under controlled experimental conditions. The hypothesis was tested systematically that other intermediate response products (selected extracts from the document, i.e., first paragraph, last paragraph, and the combination of first and last paragraph) might be as representative of the full document as the traditional IRPs (citation and abstract). The results showed that:

1. there is no significant difference among the several IRP treatment groups in the number of cue evaluations of relevancy that match the subsequent user relevancy decision on the document;

2. first and last paragraph combinations consistently predicted relevancy to a higher degree than the other IRPs;

3. abstracts were undistinguished as predictors; and

4. the apparent high predictability rating for citations was not substantive.

Some of these results are quite different from what would be expected from previous work with unmotivated subjects.
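
The measure behind result 1, the number of cue evaluations that match the later decision on the full document, can be illustrated with a minimal Python sketch. It is not the authors' procedure: the record layout, the function name `match_rate_by_irp`, and every data value are hypothetical and chosen only to show the calculation.

```python
from collections import defaultdict

# Hypothetical toy records, NOT data from the study. Each record is:
# (IRP type, relevance judged from the cue, relevance judged from the full document)
judgments = [
    ("citation", True, True),
    ("citation", True, False),
    ("abstract", False, False),
    ("abstract", True, False),
    ("first_last_paragraph", True, True),
    ("first_last_paragraph", False, False),
]


def match_rate_by_irp(records):
    """Return, per IRP type, the fraction of cue judgments that
    agree with the judgment made on the full document."""
    matches = defaultdict(int)
    totals = defaultdict(int)
    for irp_type, cue_relevant, doc_relevant in records:
        totals[irp_type] += 1
        if cue_relevant == doc_relevant:
            matches[irp_type] += 1
    return {irp: matches[irp] / totals[irp] for irp in totals}


rates = match_rate_by_irp(judgments)
for irp, rate in sorted(rates.items()):
    print(f"{irp}: {rate:.2f}")
```

A per-group match rate of this kind is only a descriptive summary. Deciding whether the groups differ significantly, as the experiment did, would require a statistical test on top of it.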

An experiment is described in which an attempt was made to obtain quantitative indications of how well the degree of relevance can be predicted from »intermediate stimuli« (that is, any form of representation of a document used in place of the full document, e.g., title, abstract, or extract). Since the decision to read a document in full is often based on examining one »level of processing« of the document (e.g., a citation or an abstract), it is of interest to investigate the properties that constitute »relevance«. Before such an analysis, however, it first had to be established which parts of a document should be examined for this purpose.

Under controlled experimental conditions, the usefulness of intermediate products that serve as keys to the information content of the full document was tested for predicting the relevance decision subsequently made by users of the system with an interest in the topic after they had read the entire document. The hypothesis was systematically tested that other intermediate products (selected extracts from the document, e.g., the first and last paragraph of the paper or the combination of both) might be just as representative of the full text as the traditional intermediate products (citation and abstract).

The results showed:

1. there is no substantial difference among the various evaluation groups in the number of correct relevance assessments when compared with the user's subsequent relevance decision after reading the entire document;

2. the combination of the first and last paragraph of a paper yields a higher rate of predictable relevance than the other intermediate products;

3. abstracts are not sufficiently differentiated for predicting relevance;

4. the apparently high prediction rate of citations (title information) could not be confirmed.

Some of these results differ considerably from the results of earlier work with

* Supported by National Institutes of Health Grants FR-00202-01, FR-00202-02, and FR-00202-03.
