Integrative Biology - Exploiting e-Science to Combat Fatal Diseases
by Damian Mac Randal, David Gavaghan, David Boyd, Sharon Lloyd, Andrew Simpson and Lakshmi Sastry
Heart disease and cancer are the two biggest killer diseases and, consequently, the focus of intense work within the biomedical community. One aspect of this work is the computer simulation of whole organs based on molecular and cellular level models, which requires the application of large scale computational and data management resources. The Integrative Biology project, funded by EPSRC, will build a customized Grid framework for running large scale, whole organ HPC simulations, managing a growing database of simulation results and supporting collaborative analysis and visualization.
Approximately 60% of the UK population will die from either heart disease or cancer. Computer simulation of whole organs based on molecular and cellular level models offers the potential to understand better the causes of these conditions and eventually to develop new treatment regimes and drugs to reduce their threat to life. The Integrative Biology (IB) project brings together a team uniquely qualified to tackle this problem: the universities of Oxford, Auckland, Sheffield, Leeds, Birmingham, Nottingham and UCL, together with CCLRC and with the support of IBM. The project is a second generation UK e-Science project, funded by EPSRC, which will build on the output of first round projects and integrate these and other new developments into a customised Grid framework for running large scale, whole organ HPC simulations, managing a growing database of simulation results and supporting collaborative analysis and visualisation.
Three major cornerstones of the project are the cellular models of cardiac electrophysiology developed over many years by Denis Noble's group at Oxford, the extensive work underpinning computational modelling of the whole heart by Peter Hunter's group in Auckland, and Grid software already developed by several of the partners, particularly CCLRC.
Figure 1 shows a view of a whole heart model. The long term goal driving the project is development of an underpinning theory of biology and biological function capable of sufficiently accurate computational analysis that it can produce clinically useful results.
The e-Science challenges for this project are:
- to provide transparent, co-scheduled access to appropriate combinations of distributed HPC and database resources needed to run coupled multi-scale whole organ simulations
- to exploit these resources efficiently through application of computational steering, workflow, visualisation and other techniques developed in earlier e-Science projects
- to enable globally distributed biomedical researchers to collaboratively control, analyse and visualise simulation results in order to progress the scientific agenda of the project
- to maintain a secure environment for the resources used and information generated by the project without inhibiting scientific collaboration.
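To give a flavour of what co-scheduling involves, the following is a minimal sketch, not the project's scheduler: it finds the earliest time at which every required resource is simultaneously free for a given duration. The resource names, the hour-based time units and the function names are all illustrative assumptions, and each resource's free windows are assumed not to overlap one another.

```python
def intersect(a, b):
    """Intersect two lists of (start, end) intervals.

    Each list is assumed to contain non-overlapping intervals.
    """
    a, b = sorted(a), sorted(b)
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def earliest_common_slot(free, duration):
    """Return the earliest start at which all resources are free for
    `duration`, or None if no such slot exists."""
    common = None
    for windows in free.values():
        common = sorted(windows) if common is None else intersect(common, windows)
    for start, end in common or []:
        if end - start >= duration:
            return start
    return None


# Hypothetical free windows (in hours from now) for three resources.
free_windows = {
    "hpc_cluster": [(0, 4), (6, 12)],
    "results_database": [(2, 8), (9, 15)],
    "visualisation_server": [(3, 11)],
}

slot = earliest_common_slot(free_windows, duration=2)
```

A real Grid co-scheduler would also have to handle advance reservations, differing administrative policies and failures, none of which this sketch attempts.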
The scientific agenda being addressed includes:
- developing integrated whole organ models for some of the most complex biological systems in the clinical and life sciences
- using these models to begin to study the development cycle of cardiac disease and cancer tumours
- bringing together clinical and laboratory data from many sources to evaluate and improve the accuracy of the models
- understanding the fundamental causes of these life-threatening conditions and how to reduce their likelihood of occurrence
- identifying opportunities for intervention at the molecular and cellular level using customised drugs and novel treatment regimes.
In both cancer and heart disease, Integrative Biology will improve the design and understanding of new drugs as well as enabling optimisation of novel treatments such as gene therapy or cancer vaccines which might complement conventional cytotoxic drugs. The tools developed by the project will improve the productivity of clinical and physiological researchers in academia and the pharmaceutical and biotechnology sectors. The UK e-Science community will benefit from access to new tools developed by the project and from the example of an integrated computational framework that the project will develop. This will be useful in other areas requiring a total system approach such as understanding environmental change processes. But most importantly, the ultimate beneficiaries will be patients with heart disease, cancer and, eventually, other potentially fatal diseases.
Project Organisation and Software Architecture
The Integrative Biology project team is organised into three main groups charged with developing:
- the modelling and simulation codes
- the computational framework for simulation and interaction
- the security infrastructure required by the project.
Within these groups, cross-institutional teams are working on specific technical areas including heart modelling, cancer modelling, molecular and cellular modelling, testing, tuning and running simulations, data management, computational steering, workflow, data visualisation and user interfaces.
Portal technology will be used to provide users with a lightweight interface to the Integrative Biology front end services and will support collaborative access to ongoing simulations and results. The services available can be grouped into four main categories:
- job management (including deployment, co-scheduling and workflow management across heterogeneous resources)
- data management (from straightforward results data handling and storage to location and transformation of experimental data for model development and validation)
- computational steering (both interactive for simulation monitoring/control and pre-defined for parameter space searching)
- analysis and visualization (not only of results, but also of interim state, parameter spaces, etc for steering purposes).
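The pre-defined form of computational steering mentioned above, parameter space searching, can be pictured as enumerating every combination of a set of parameter values and running one simulation per combination. The sketch below is purely illustrative: the parameter names and values are invented for the example and are not taken from the project's models, and job submission itself is left out.

```python
from itertools import product


def parameter_grid(ranges):
    """Yield one dict per point in the Cartesian product of the value lists.

    `ranges` maps a parameter name to the list of values to try.
    """
    names = sorted(ranges)
    for values in product(*(ranges[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical sweep over two parameters of a cardiac cell model.
sweep = {
    "stimulus_period_ms": [300, 500, 1000],
    "conductance_scale": [0.5, 1.0, 1.5, 2.0],
}

# Each entry describes one simulation run that a job manager could launch.
jobs = list(parameter_grid(sweep))
```

With three values for one parameter and four for the other, the sweep yields twelve job descriptions; in practice each would be handed to the job management services for deployment on a suitable resource.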
Underpinning the entire system are three overriding considerations:
- standardization (in particular OGSA and/or Web Service compliance)
- security (covering confidentiality, integrity and accessibility of data and resources).
Many of the underlying components will be adopted from existing projects, and adapted if necessary in collaboration with their original developers. Projects currently being exploited include RealityGrid, gViz, myGrid and, of course, the middleware being developed by the OMII.
Damian Mac Randal, CCLRC
David Gavaghan, University of Oxford