Scene Understanding using Hierarchical Neural Networks

by Gabriele Pieri and Ovidio Salvetti

Scientists at ISTI-CNR are using Hierarchical Neural Networks to develop a methodology for the automatic identification of characteristic patterns, representative of particular phenomena, in a given scene. This approach can be adopted in a wide range of applications.

When a dynamic or 3-D scene, represented by sequences of two-dimensional images, changes with some frequency, a system that learns and adapts to these changes can be developed using neural networks (NN). Each image of the sequence is represented by a set of morphological structures with textural properties, and every image is analysed to compute a set of characteristic features that can then be used to understand the scene. To do this, we are adopting a strategy that uses Hierarchical Neural Networks (HNNs), which are global networks composed of a hierarchy of single neural networks.

The hierarchical approach provides both specialisation and adaptability. The single levels of the network can be finely tuned to the specific characteristics of the problem to be solved, and if the problem description changes, the architecture can be modified by retraining only those levels involved in the change.

The advantages of using an NN-based hierarchical architecture can be summarised in two main points: the modular organisation makes the networks easier to analyse, and the hierarchy lets the network tune itself towards the most promising directions in which to look for relevant patterns.

Two requirements are important when using this approach: it must be possible to exploit differences between the extracted features to improve classification capability, and it must be easy to change the number of features. The hierarchical NN architecture has been designed to meet these requirements.

Scene understanding can be achieved by combining K parallel neural networks, each trained to extract a specific property class from the source images. Using a set of specialised networks in parallel, instead of a single complex net, serves both efficiency and flexibility.
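The parallel arrangement can be illustrated with a minimal sketch. It is not the authors' implementation: the `TinyNet` class, the feature names, the layer sizes and the number of classes are all hypothetical, and the weights are random placeholders rather than trained values. The point is only that each sub-net receives its own feature vector and produces its own class scores.

```python
import math
import random


class TinyNet:
    """One-hidden-layer feed-forward net with a softmax output (untrained)."""

    def __init__(self, n_in, n_hidden, n_classes, rng):
        # Random placeholder weights; a real sub-net would be trained
        # on its own feature type.
        self.w1 = [[rng.uniform(-1, 1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)]
                   for _ in range(n_classes)]

    def forward(self, x):
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x)))
                  for row in self.w1]
        scores = [sum(w * h for w, h in zip(row, hidden)) for row in self.w2]
        peak = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]


rng = random.Random(0)
N_CLASSES = 4  # hypothetical number of property classes

# K = 3 hypothetical feature types, each with its own input size.
feature_sizes = {"texture": 6, "shape": 3, "density": 2}
sub_nets = {name: TinyNet(size, 5, N_CLASSES, rng)
            for name, size in feature_sizes.items()}

# One feature vector per feature type, standing in for values
# extracted from a single image.
features = {name: [rng.random() for _ in range(size)]
            for name, size in feature_sizes.items()}

# Each sub-net sees only its own feature vector.
first_level = {name: sub_nets[name].forward(features[name])
               for name in sub_nets}
```

Because the sub-nets share no weights, each one can be sized and trained separately, which is the flexibility the parallel design aims at.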

Furthermore, this model makes it possible to optimise the computational complexity of the single levels independently. In particular, adding a new feature has only a limited effect, confined to a single part of the HNN architecture.
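A small bookkeeping sketch shows why adding a feature can stay local. It assumes a two-level layout in which one fusion net receives the concatenated outputs of the sub-nets; the `HNNLayout` class, its method names and the feature names are hypothetical, and the rule that the fusion net must be retrained is an assumption of this sketch rather than something the article states.

```python
from dataclasses import dataclass, field


@dataclass
class HNNLayout:
    """Bookkeeping for a two-level layout: sub-net outputs feed one fusion net."""

    # Maps a feature name to the output size of its sub-net.
    sub_net_outputs: dict = field(default_factory=dict)

    @property
    def fusion_inputs(self):
        # The fusion net receives the concatenated sub-net outputs.
        return sum(self.sub_net_outputs.values())

    def add_feature(self, name, n_outputs):
        """Register a new sub-net and report which parts need training."""
        if name in self.sub_net_outputs:
            raise ValueError(f"feature {name!r} already present")
        self.sub_net_outputs[name] = n_outputs
        # Existing sub-nets keep their inputs and weights. Only the new
        # sub-net and the fusion net (whose input size changed) are affected.
        return [name, "fusion"]


layout = HNNLayout({"texture": 4, "shape": 4})
to_train = layout.add_feature("density", 4)
```

In this layout the existing `texture` and `shape` sub-nets are never touched, so their training effort is preserved when the feature set grows.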

In this context, scene understanding mainly consists of two phases (see Figure 1): classification of the single features extracted from each image used to define a scene (the first, lower level), and data fusion for a more complete understanding of the characteristic patterns (the second, higher level).
The global network does not have to be homogeneous, since its sub-nets can differ in type and topology in order to tackle complex problems with high flexibility.
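The two phases can be sketched end to end as follows. This is an illustration under stated assumptions, not the system described here: the function names, the two feature types, the layer sizes and the two output patterns are hypothetical, and all weights are untrained random placeholders. The two first-level sub-nets deliberately use different topologies, and a second-level net fuses their concatenated outputs.

```python
import math
import random


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(weights, x):
    """Multiply a weight matrix (a list of rows) by the vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(n_out, n_in, rng):
    return [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]


rng = random.Random(1)

# First level: two sub-nets with different topologies.
w_linear = random_matrix(3, 4, rng)  # single layer, 4 inputs -> 3 classes
w_hidden = random_matrix(6, 2, rng)  # hidden layer, 2 inputs -> 6 units
w_out = random_matrix(3, 6, rng)     # output layer, 6 units -> 3 classes


def classify_texture(x):
    return softmax(dense(w_linear, x))


def classify_shape(x):
    hidden = [math.tanh(h) for h in dense(w_hidden, x)]
    return softmax(dense(w_out, hidden))


# Second level: fusion net over the concatenated first-level outputs
# (3 + 3 inputs -> 2 hypothetical scene patterns).
w_fusion = random_matrix(2, 6, rng)


def understand_scene(texture_features, shape_features):
    level_one = classify_texture(texture_features) + classify_shape(shape_features)
    return softmax(dense(w_fusion, level_one))


pattern_scores = understand_scene([0.2, 0.7, 0.1, 0.9], [0.5, 0.3])
```

The fusion net depends only on the sizes of the first-level outputs, not on how each sub-net is built internally, which is what allows the global network to mix sub-net types.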


Figure 1: Scene understanding using a hierarchical neural network architecture.

Figure 2: Example of 3D neuro-image density analysis for cerebral anomalies diagnosis. Eight density classes are classified: bone, white matter, grey matter, liquor, air, blood, hypo- and hyper-density.

Many different application fields can be modelled using an approach of this type. In our study, we have applied HNN architectures to a number of problems, ranging from real-time monitoring of the oscillation states in the flame front of gas combustors in power plants to the three-dimensional classification of brain tissue densities for the diagnosis and follow-up of cerebral anomalies (see Figure 2).

Please contact:
Ovidio Salvetti, ISTI-CNR
Tel: +39 050 3153124
E-mail: o.salvetti@iei.pi.cnr.it