ERCIM News No.43 - October 2000

Intelligent Post-Genomics

by Francisco Azuaje

The Intelligent Post-Genomics initiative aims to support a number of knowledge discovery tasks with crucial applications in medicine and biology, such as diagnostic systems, therapy design and organism modelling. This research is mainly focused on the development of a new generation of genomic data interpretation systems based on Artificial Intelligence and data mining techniques.

The union of theoretical sciences such as mathematics, physics and computer science with biology is allowing scientists to model, predict and understand crucial mechanisms of life and to search for cures for diseases. The success of this scientific synergy will depend not only on the application of advanced information processing methods, but also on the development of a new multidisciplinary language. Researchers from the Artificial Intelligence Group and Centre for Health Informatics, together with the Departments of Microbiology, Biochemistry and Genetics of the University of Dublin, are convinced that this joint action will yield significant benefits in the understanding of key biological problems. The Figure illustrates the main aspects addressed in this project as well as some of its possible applications.

The need for higher levels of reliability, together with theoretical frameworks for performing complex inferences about phenomena, makes Artificial Intelligence (AI) particularly attractive for the development of the post-genomic era. The incorporation of techniques from AI and related research fields may change the way in which biological experiments and medical diagnoses are carried out. For example, the analysis of gene expression patterns allows us to combine advanced data mining approaches in order to relate genotype and phenotype.

Within the past few years, technologies have emerged to record the expression pattern of not one, but tens of thousands of genes in parallel. Recognising the opportunities that these technologies provide, the research groups mentioned above are launching a long-term initiative to apply AI, data mining and visualisation methods to important biological processes and diseases in order to interpret their molecular patterns. Initial efforts have already allowed us to explore and confirm the potential of these technologies for the development of tumour classification systems, detection of new classes of cancer and the discovery of associations between expression patterns.

Thus, one of our major goals is the automation of the genome expression interpretation process. This task should provide user-friendly, effective and efficient tools capable of organising complex volumes of expression data. It also aims to allow users to discover associations between apparently unrelated classes or expression patterns. This may represent a powerful approach not only to understanding genetic mechanisms in the development of a specific disease, but also to supporting the search for fundamental processes that differentiate multiple types of diseases. Furthermore, automated interpretation may support the identification of drug targets by making inferences about functions that are associated with sequences and genome patterns. A number of intelligent hybrid frameworks based on neural networks, fuzzy systems and evolutionary computation have been shown to be both effective and efficient for building such decision support systems.
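The idea of discovering associations between expression patterns can be illustrated with a very small sketch. It is not the hybrid neural, fuzzy or evolutionary approach mentioned above; it simply assumes that two genes are "associated" when their expression profiles across samples are strongly correlated (Pearson correlation). The gene names, expression values and the 0.9 threshold are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / sqrt(var_x * var_y)


def associated_pairs(profiles, threshold=0.9):
    """Return gene pairs whose profiles satisfy |r| >= threshold."""
    pairs = []
    for (g1, p1), (g2, p2) in combinations(sorted(profiles.items()), 2):
        r = pearson(p1, p2)
        if abs(r) >= threshold:
            pairs.append((g1, g2, round(r, 3)))
    return pairs


# Hypothetical expression levels for four genes across five samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}

for g1, g2, r in associated_pairs(profiles):
    print(g1, g2, r)
```

Using the absolute value of the correlation means that both co-expressed and inversely expressed genes are reported. Real expression data would call for normalisation and multiple-testing control, which this sketch omits.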

Other crucial research goals involve the combination of gene expression data with other sources of molecular data (such as drug activity patterns) and with morphological feature data. Similarly, we recognise the need to develop approaches to filter, interconnect and organise dispersed sources of genomic data.

Collaboration links with other European research institutions, such as the German Cancer Research Centre (Intelligent Bioinformatics Systems Group) and the Max Planck Institute for Astrophysics (Molecular Physics Group), are currently being established as part of our efforts to develop advanced data interpretation technologies for the life sciences.


A collection of representative links to genomics and bioinformatics resources:
Genomes & Machines: http://www.cs.tcd.ie/Francisco.Azuaje/genomes&machines.html

Please contact:
Francisco Azuaje - Trinity College Dublin
Tel: +353 1 608 2459
E-mail: Francisco.Azuaje@cs.tcd.ie