A Cognitive Architecture for Semantically Based Medical Image Retrieval

by John Moustakas, Socrates Dimitriadis and Kostas Marias

The automatic extraction of meaningful image semantics is an important step towards the development of intelligent systems for Content-Based Image Retrieval (CBIR). Such systems have the potential to become useful clinical decision support tools, by retrieving medical images with established diagnoses that are ‘similar’ to the images the clinician must read. At the Institute of Computer Science - FORTH, we are developing and implementing experimental platforms for the investigation of CBIR, based on biologically inspired multi-agent architectures. In this news article, we present a novel platform based on a two-level architecture inspired by human cognitive mechanisms. The two levels divide the work between them: one computes generic similarity and the other computes medical image semantics.

Humans undoubtedly possess the ability to process visual information efficiently and to identify images as being similar based on their visual content. However, computational approaches currently fall short of matching this ability. At FORTH-ICS we aim to develop and implement CBIR mechanisms that are perceptually motivated and based on biologically inspired architectures. Our recent work is concerned with medical image retrieval and aims to provide a reliable framework that can be customized for several imaging applications, and potentially be combined with DICOM functionality as well. Crucial to this goal is the automatic extraction of semantic information from medical images (eg asymmetry or pathology detection). Nevertheless, the semantic content of images is subjective, and depends on the specific image class. For this reason, generic similarity CBIR approaches often run into the ‘semantic gap’ problem, meaning that the computed features cannot always properly describe the real characteristics of the image. At FORTH-ICS we developed a novel two-tier CBIR platform inspired by the human cognitive architecture. The key idea underlying our work stems from psychological and neuroscientific studies which indicate that the human visual system processes information in several stages.

The visual system retains independent retinotopic maps for different primitive visual features (colour, form etc). In a pre-attentive or early stage of vision, the processing on these feature maps is undertaken independently and in parallel, whereas in the subsequent attentive stage the visual modalities engage in cooperative work. In other words, the pre-attentive level decomposes the optical scene into its primitive characteristics, which to a large extent are processed independently, in parallel and autonomously.

After the first stage of fixed-time pre-attentive processing, the human visual system performs a serial and selective examination of semantic objects that draw the subject's attention – that is, the attentive level of perception.

A User Interface generated by the authoring environment.

Two-tier architecture for medical image retrieval consisting of a pre-attentive (a), and an attentive level (b).

Based on this biological paradigm we developed a two-tier CBIR platform featuring both a pre-attentive and an attentive level of retrieval by extending our previous work on agent-based single-tier CBIR (ERCIM News No. 53, April 2003). In order to assess the value of the proposed architecture, our platform was customized for a specific domain, ie brain MRI image retrieval. The pre-attentive layer of the proposed architecture produces independent, parallel feature maps (A, B, C in the figure), each coding an independent visual feature. During the retrieval stage, each autonomous agent compares the feature values computed for the query with those of each database image. The comparison scores from all relevant agents are passed to the voting system, which produces the final score for the candidate retrieval image. The voting scheme is selected by the user.
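The pre-attentive stage described above can be illustrated with a minimal Python sketch. It is not the platform's Java implementation: the feature representation, the distance measure, the names `make_agent`, `VOTING_SCHEMES` and `retrieve`, and the three voting schemes are all assumptions chosen for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical representation: an image is a mapping from a feature name
# (e.g. "A", "B", "C") to that feature's vector of values.
Image = Dict[str, Sequence[float]]
Agent = Callable[[Image, Image], float]


def make_agent(feature: str) -> Agent:
    """Build an agent that scores a single feature map, ignoring all others."""
    def agent(query: Image, candidate: Image) -> float:
        q, c = query[feature], candidate[feature]
        # Mean absolute difference, mapped to a similarity in (0, 1].
        dist = sum(abs(a - b) for a, b in zip(q, c)) / len(q)
        return 1.0 / (1.0 + dist)
    return agent


# Assumed voting schemes; the article only states that the user selects one.
VOTING_SCHEMES: Dict[str, Callable[[List[float]], float]] = {
    "mean": lambda scores: sum(scores) / len(scores),
    "min": min,
    "max": max,
}


def retrieve(query: Image, database: Dict[str, Image],
             agents: List[Agent], scheme: str = "mean") -> List[Tuple[str, float]]:
    """Score every database image with every agent, vote, and rank."""
    vote = VOTING_SCHEMES[scheme]
    ranked = [(name, vote([agent(query, image) for agent in agents]))
              for name, image in database.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

Because each agent only reads its own feature, the agents could be run in parallel without coordination; only the voting step needs all of their scores.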

An additional, ‘attentive’ layer is designed for the agents to receive, one by one, semantic regions of interest (ROIs). A specialized group of ‘attentive’ agents then carefully examines and compares ROIs in a serial fashion (first ‘1’, then ‘2’, and so on). In our implementation, the attentive similarity of a given pair of MRI images is defined as the similarity of their ‘closest’ pair of ROIs. Implementing the attentive retrieval level requires that the semantics be defined for the specific application (here, brain MRI retrieval), since it is still difficult to define important semantics automatically for an arbitrary image class without incorporating prior knowledge.
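The ‘closest pair of ROIs’ definition can be sketched in a few lines of Python. This is only an illustration: describing each ROI by a numeric feature vector, using Euclidean distance, and mapping distance to similarity with 1 / (1 + d) are assumptions, as is the function name `attentive_similarity`.

```python
import math
from typing import Sequence

ROI = Sequence[float]  # hypothetical: one feature vector per region of interest


def attentive_similarity(query_rois: Sequence[ROI],
                         candidate_rois: Sequence[ROI]) -> float:
    """Similarity of two images, taken from their closest pair of ROIs."""
    if not query_rois or not candidate_rois:
        return 0.0  # assumed convention when an image has no ROI
    best = math.inf
    # ROIs are examined one by one, mirroring the serial attentive stage.
    for rq in query_rois:
        for rc in candidate_rois:
            best = min(best, math.dist(rq, rc))
    return 1.0 / (1.0 + best)
```

For example, with query ROIs `[(0, 0), (10, 10)]` and candidate ROIs `[(3, 4), (10, 11)]`, the closest pair is `(10, 10)` and `(10, 11)` at distance 1, so the sketch returns 0.5.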

The definition of semantic regions for brain MRI retrieval was based on novel algorithms for brain symmetry detection. It is well known that a normal human brain exhibits a remarkable degree of symmetry with respect to the mid-sagittal plane. In addition, the identification of regions of asymmetry is often indicative of diseases such as schizophrenia, epilepsy, and Alzheimer’s. We developed algorithms for segmenting and analysing asymmetrical regions for the ‘attentive’ level of CBIR. The authors will report initial retrieval results on publicly available data (‘The Whole Brain Atlas’ from http://www.harvard.edu) at the forthcoming IEEE International Conference on Multimedia & Expo (ICME2005).
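The article does not describe the symmetry detection algorithms themselves. Purely to illustrate the underlying idea, the Python sketch below marks asymmetric pixels in a single 2D slice by comparing it with its left-right mirror image. It assumes the slice is already aligned so that the mid-sagittal plane coincides with the central vertical axis, and the function name `asymmetry_map` and the fixed intensity threshold are hypothetical.

```python
from typing import List, Sequence


def asymmetry_map(slice_2d: Sequence[Sequence[float]],
                  threshold: float) -> List[List[bool]]:
    """Mark pixels whose intensity differs from the mirrored pixel by more than threshold.

    Assumes the mid-sagittal plane is the central vertical axis of the slice.
    """
    result = []
    for row in slice_2d:
        mirrored = row[::-1]  # reflect the row about the central vertical axis
        result.append([abs(a - b) > threshold for a, b in zip(row, mirrored)])
    return result
```

Connected groups of marked pixels could then serve as candidate ROIs for the attentive level. A real system would first have to estimate the mid-sagittal plane and cope with normal anatomical variation, which this sketch ignores.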

The CBIR system was developed in Java, and makes use of the Java Advanced Imaging package. It is fully scalable and can be easily extended with additional agents or voting schemes in order to take into account the specific requirements of different experiments and classes of images. The attentive level of retrieval can in principle be customized for any application, provided that semantic features can be defined automatically; doing so, however, is an extremely hard task. Researchers across disciplines have tried for many years to shed light on the mechanisms of decomposing any image into its primitive visual features, and extracting the true underlying semantics.

Please contact:
Kostas Marias, ICS-FORTH, Greece
Tel: +30 2810 391696
E-mail: kmarias@ics.forth.gr