DOI: 10.3414/ME0561
Content-based Image Retrieval for Scientific Literature Access
Publication History
Received: 11 April 2008
Accepted: 12 January 2009
Publication Date: 17 January 2018 (online)
Summary
Objectives: An increasing number of articles are published electronically in the scientific literature, but access is limited to alphanumerical search on title, author, or abstract, and may disregard the numerous figures. In this paper, we estimate the benefits of using content-based image retrieval (CBIR) on article figures to augment traditional access to articles.
Methods: We selected four high-impact journals from the Journal Citation Reports (JCR) 2005. Figures were automatically extracted from the PDF article files and manually classified by their content and number of sub-figure panels. We make a quantitative estimate by projecting from data of the ImageCLEF campaigns of the Cross-Language Evaluation Forum (CLEF), and qualitatively validate it through experiments using the Image Retrieval in Medical Applications (IRMA) project.
Results: Based on 2077 articles with 11,753 pages, 4493 figures, and 11,238 individual images, the predicted accuracy for article retrieval may reach 97.08%.
Conclusions: CBIR therefore has the potential for high impact on medical literature search and retrieval.
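The corpus counts given in the Results can be turned into simple per-article averages, which help to gauge how much image material a figure-based search could draw on. The following is a minimal sketch using only the four counts stated above; the function name `corpus_ratios` and the dictionary keys are hypothetical, and the calculation is plain arithmetic, not the projection method that yields the predicted accuracy.

```python
def corpus_ratios(articles=2077, pages=11753, figures=4493, images=11238):
    """Derive average densities from the corpus counts reported in the Results.

    The default values are the counts stated in the summary; the ratio
    names are illustrative and do not come from the paper.
    """
    return {
        "pages_per_article": pages / articles,
        "figures_per_article": figures / articles,
        "images_per_figure": images / figures,
        "images_per_article": images / articles,
    }


if __name__ == "__main__":
    for name, value in corpus_ratios().items():
        print(f"{name}: {value:.2f}")
```

With the reported counts, this gives roughly 2.16 figures and 5.41 individual images per article, or about 2.50 image panels per figure.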