keynote

ERCIM News No.43 - October 2000

Michael Ashburner, Joint-Head of the European Bioinformatics Institute:

“What we so desperately need, if we are going to have any chance of competing with our American cousins over the long term in bioinformatics, genomics and science in general, is a European Science Council with a consistent and science-led policy with freedom from political and nationalistic interference.”

Some revolutions in science come when you least expect them. Others are forced upon us. Bioinformatics is a revolution forced by the extraordinary advances in DNA sequencing technologies, in our understanding of protein structures and by the necessary growth of biological databases. Twenty years ago, pioneers such as Doug Brutlag in Stanford and Roger Staden in Cambridge began to use computational methods to analyse the very small DNA sequences then determined. Pioneer efforts were made in 1974 by Bart Barrell and Brian Clarke to catalogue the first few nucleic acid sequences that had been determined. A few years later, in the early 1980s, first the European Molecular Biology Laboratory (EMBL) and then the US National Institutes of Health (NIH) established computerised data libraries for nucleic acid sequences. The first release of the EMBL data library contained 585,433 bases; it holds 9,678,428,579 bases on the day I write this, and is doubling every 10 months or so.
Bioinformatics is a peculiar trade since, until very recently, most in the field were trained in other fields: computer science, physics, linguistics, genetics, etc. The term covers database curators and algorithmists, software engineers and molecular evolutionists, graph theorists and geneticists. By and large their common characteristic is a desire to understand biology through the organisation and analysis of molecular data, especially those concerned with macromolecular sequence and structure. They rely absolutely on a common infrastructure of public databases and shared software. It has proven most effective, in the USA, Japan and Europe, to provide this infrastructure through a mix of major public-domain institutions, academic centres of excellence and industrial research. Indeed, such are the economies of scale for both data providers and data users that it has proved effective to collect the major data classes (nucleic acid and protein sequences, and protein structure co-ordinates) through truly global collaborative efforts.

In Europe the major public-domain institute devoted to bioinformatics is the European Bioinformatics Institute (EBI), an Outstation of the EMBL. Located adjacent to the Sanger Centre just outside Cambridge, this is the European home of the major international nucleic acid sequence and protein structure databases, as well as the world’s premier protein sequence database. Despite the welcome growth of national centres of excellence in bioinformatics in Europe, these major infrastructural projects must be supported centrally. The EBI is a major database innovator, e.g. its proposed ArrayExpress database for microarray data, and software innovator, e.g. its SRS system. Jointly with the Sanger Centre, the EBI produces the highest quality automatic annotation of the emerging human genome sequence (Ensembl).

To the surprise of the EBI, and many others, attempts to fund these activities at any serious level through the programmes of the European Commission were rebuffed in 1999. Under Framework IV the European Commission had funded databases at the EBI; despite increased funding for the area of ‘infrastructure’ generally, the EBI was judged ineligible for funding under Framework Programme V. Projects internationally regarded as excellent, such as the ArrayExpress database, simply lack funding. The failure of the EC to fund the EBI in 1999 led to a major funding crisis which remains to be resolved for the long term, although the Member States of EMBL have stepped in with emergency funds and are considering a substantial increase in funding for the longer term.

It is no coincidence that the number of ‘start-up’ companies in the fields of bioinformatics and genomics in the USA is many times that in Europe. In the USA there is a commitment to funding both national institutions (the budget of the US National Center for Biotechnology Information is three times that of the EBI) and academic groups. What we so desperately need, if we are going to have any chance of competing with our American cousins over the long term in bioinformatics, genomics and science in general, is a European Science Council with a consistent and science-led policy with freedom from political and nationalistic interference. The funding of science through the present mechanisms in place in Brussels is failing both the community and the Community.