Integrative Biology at CCLRC
by Daniel Hanlon, Lakshmi Sastry and Kerstin Kleese van Dam
The Integrative Biology project is a collaboration between researchers in diverse disciplines who are applying the developing 'Grid' metaphor to tackle two of medicine's biggest killers - cardiac disease and cancer.
Many physiological models of the heart exist, but they are often restricted in scope by a lack of available computing power. In the case of tumours, a systematic modelling framework has yet to emerge. The biological modelling community is not like that of particle physics, where the use of high performance computing is commonplace. It is not unheard of for a researcher's laptop to be the most powerful machine on which a simulation is ever run. The relative youth of the physiological simulation community also means that collaboration between researchers is not as routine as it might be - to the detriment of the science.
Integrative Biology (IB) is an inter-disciplinary project that aims to establish an e-Science/Grid based e-Infrastructure for the advancement of biological modelling in general, using studies of heart arrhythmia and tumour growth as test cases. IB uses simulation codes from teams around the world: the Universities of Oxford, Sheffield and Nottingham in the UK, and groups in Auckland, San Diego and elsewhere overseas. E-Scientists in the UK from the CCLRC e-Science Centre, Leeds and UCL are working with these numerical physiologists to develop an e-Infrastructure that facilitates the collaborative development of these codes and the secure exchange of research results.
The approach taken by Integrative Biology has two main themes:
- to facilitate the extension of existing physiological codes to cover biological processes on different scales
- to deploy codes within an Integrative Biology grid environment where scientists can experiment with the various physiological models to further their understanding of the systems under examination.
Key to these goals is the creation of a stable middleware environment which is compatible with existing codes but which is not held back from embracing the Grid philosophy. The IB software environment begins with a synthesis of proven technology components from a number of existing e-Science projects. This approach aims to maximise code re-use and to take full advantage of existing expertise.
The grid infrastructure is being built on a number of key pillars:
The Grid Security Infrastructure (http://www.globus.org/security/overview.html) is employed throughout IB. By combining this tried and tested architecture with X.509 certificates issued by the UK e-Science Certification Authority, the IB environment gains strong user authentication and encryption capabilities.
The Storage Resource Broker (originally developed by the San Diego Supercomputer Center) forms the basis of the IB data management provision by virtualising file location and providing sophisticated access control mechanisms. Once a file is put into 'SRB space', authorised users anywhere in the world can access it via a variety of client tools, from software APIs to web browsers. File owners retain complete control over who can see or alter their files. A metadata catalogue based on the CCLRC Scientific Metadata Format with extensions from the UK myGrid Project provides the information and annotations necessary to make the data accessible for reuse and sharing with fellow researchers. Finally, the CCLRC DataPortal will provide access and search capabilities to both the data and the accompanying annotations.
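The idea of virtualised file location with owner-controlled access can be illustrated with a toy sketch. This is not the SRB API: the class names, the logical path, the user names and the storage location string below are all hypothetical, chosen only to show how a logical name can hide the physical location of a file while the owner decides who may resolve it.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalFile:
    """One catalogue entry (hypothetical structure, not SRB's own)."""
    owner: str
    physical_location: str
    readers: set = field(default_factory=set)


class LogicalNamespace:
    """Toy catalogue mapping logical paths to physical locations."""

    def __init__(self):
        self._files = {}

    def put(self, logical_path, owner, physical_location):
        # Register a file under a location-independent logical name.
        self._files[logical_path] = LogicalFile(owner, physical_location)

    def grant_read(self, logical_path, owner, user):
        # Only the owner may extend read access to another user.
        entry = self._files[logical_path]
        if entry.owner != owner:
            raise PermissionError("only the owner can grant access")
        entry.readers.add(user)

    def resolve(self, logical_path, user):
        # Return the physical location only to the owner or a granted reader.
        entry = self._files[logical_path]
        if user != entry.owner and user not in entry.readers:
            raise PermissionError(f"{user} may not read {logical_path}")
        return entry.physical_location


# Illustrative usage with made-up names.
namespace = LogicalNamespace()
namespace.put("/ib/heart/results.dat", owner="alice",
              physical_location="storage-node-1:/vault/0001")
namespace.grant_read("/ib/heart/results.dat", owner="alice", user="bob")
location_for_bob = namespace.resolve("/ib/heart/results.dat", user="bob")
```

In this sketch a collaborator only ever handles the logical path; the file could be moved to another storage node by updating the catalogue entry, without any change on the client side.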
Visualisation, Interaction and Computational Steering
Fundamental to the successful analysis and extension of existing models is the scientist's ability to interactively control, or 'steer', simulations as they are executing. This experimentation increases their understanding of the parameters under investigation and provides an opportunity for collaboration with colleagues and peers, promoting the sharing of knowledge and understanding of the models and the results being produced. A useful dialogue can be established between those working on systems of different scales, from single cells to multi-cell and whole organ models, so that a holistic understanding of the chemical, biological and functional processes of diseases can be achieved. The application visualisation and user interface experts within the IB consortium are focusing on these issues, using a range of commercial and public domain problem solving environments such as Matlab and VTK. The methodologies being employed are based upon those developed for generic visualisation on the Grid as part of the CCLRC e-Science Centre's core activity, and upon libraries from the eViz and RealityGrid projects.
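As a rough illustration of what 'steering' means in practice, the sketch below shows a running simulation that checks for parameter changes between time steps. It is a minimal example under stated assumptions, not the IB or RealityGrid steering library: the one-variable model (dv/dt = stimulus - v), the parameter name "stimulus" and the function name are invented for this illustration.

```python
import queue


def steered_simulation(commands, dt=0.1, stimulus=1.0):
    """Yield successive states of a toy model, dv/dt = stimulus - v.

    Pending (name, value) steering commands are applied before each step.
    The model and parameter names are hypothetical.
    """
    params = {"stimulus": stimulus}
    v = 0.0
    while True:
        # Drain any steering commands that arrived since the last step.
        while True:
            try:
                name, value = commands.get_nowait()
            except queue.Empty:
                break
            if name in params:
                params[name] = value
        # Advance the model by one explicit Euler step.
        v += dt * (params["stimulus"] - v)
        yield v


# Illustrative usage: change a parameter while the run is in progress.
commands = queue.Queue()
simulation = steered_simulation(commands)
before = [next(simulation) for _ in range(3)]   # state rises towards 1.0
commands.put(("stimulus", 0.0))                 # the scientist steers the run
after = [next(simulation) for _ in range(3)]    # state now decays towards 0.0
```

A queue is used here because it lets a separate user interface or a remote collaborator submit changes without interrupting the simulation loop; a real steering framework would add validation, checkpointing and visual feedback.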
Current development is aimed at a middleware suite specifically tailored to the IB community. In the future, however, the Integrative Biology infrastructure will be deployed as a demonstrator within a more generic Virtual Research Environment. Its functionality will be exposed within the portlet development framework of the Open Grid Computing Environment (http://www.collab-ogce.org). The aim of this interface is to bring together the plethora of tools available from a host of disparate scientific disciplines, so that researchers can use any that are appropriate within their own experimental area, accessed with a simple web browser from any networked computer.
Daniel Hanlon, CCLRC e-Science Centre,
Daresbury Laboratory, UK
Tel.: +44 1925 603683