GMD Successful at a World-Wide Protein Structure Prediction Experiment
by Bernd Kramer and Thomas Lengauer
Seventy-two research groups from all over the world participated in the Second Meeting on the Critical Assessment of Techniques for Protein Structure Prediction (CASP2) at Asilomar, California, 12-16 December 1996. This meeting was the culmination of a world-wide experiment to determine the effectiveness of current structure prediction methods. The two projects PROTAL (Proteins: Sequence, Structure, and Evolution) and RELIWE (Calculation and Prediction of Receptor Ligand Interactions) from the Institute for Algorithms and Scientific Computing at GMD participated in this experiment and exhibited successful predictions.
The aim of the CASP2 experiment was to test existing structure prediction methods and software tools on so-called blind predictions. A blind prediction is a structure prediction for which the actual structure is not known at the time of the prediction but becomes available soon thereafter for evaluation. The organizing team of CASP2 at Lawrence Livermore Laboratory, California, had collected 42 different prediction targets from protein crystallographers and Nuclear Magnetic Resonance (NMR) spectroscopists, who provided the sequences for prediction before the summer of 1996 and the resolved structures later on, as they became available in late 1996. During the intervening months the predictor teams submitted structure models based on their theoretical methods. By the final submission deadline, more than nine hundred models had been sent to the organizers.
There were four disciplines within the contest:
Homology Modeling
Here the given protein sequence displays a high degree of homology (above 40% sequence identity) to a protein of known structure. The goal is to generate a detailed atomic structure model of the protein.
Fold Recognition (Threading)
Here the given protein sequence displays a low or marginal similarity (less than 40% and down to well below 20% sequence identity) to a structurally known protein. The goal is twofold. The first goal is to detect a good structural model (the so-called template) for the protein in question (the so-called target) among the proteins whose structure is known. This task is called fold recognition. The second goal is to faithfully map the target sequence onto the template structure. This task is called threading.
Ab Initio Prediction
Here the structure prediction of the sequence in question is not based on a homologue among the structurally known proteins, most often because no such homologue exists. In this case, one should find out whatever one can about the protein structure, e.g., in terms of secondary structure, topology, or tertiary structure.
Docking
Here, the available information includes the structure of a protein and the structural formula of a ligand molecule that binds to the protein. The ligand can be another protein (in which case its structure is given as well) or a small molecule. The goal is to predict the structure of the molecular complex consisting of the protein and the bound ligand.
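The sequence identity thresholds mentioned above can be made concrete with a small sketch. It assumes that the target and a candidate template have already been aligned, with gaps written as "-", and that identity is counted only over columns where both sequences have a residue; other definitions of percent identity exist. The function names and the category labels are hypothetical and serve illustration only. The article does not say where a pair with exactly 40% identity belongs, so placing it in the lower class is an assumption.

```python
from typing import Optional


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns are not counted (assumption)
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def suggest_discipline(identity: Optional[float], threshold: float = 40.0) -> str:
    """Map the identity to the best known structure onto a prediction discipline.

    identity=None stands for "no homologue of known structure was found".
    """
    if identity is None:
        return "ab initio prediction"
    if identity > threshold:
        return "homology-based modeling"
    # Exactly 40% is not covered by the text; it is treated as the lower class here.
    return "fold recognition (threading)"


if __name__ == "__main__":
    # Toy aligned pair, not taken from any CASP2 target.
    pid = percent_identity("ACDE-FG", "ACDQWFG")
    print(f"{pid:.1f}% identity -> {suggest_discipline(pid)}")
```

In practice the hard part is producing the alignment itself, which is exactly what becomes unreliable at low identity and why fold recognition and threading are treated as a discipline of their own.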
Predictions (light grey) of a protein structure (left) and a ligand position (right), experimental structures in dark grey.
GMD groups in competition
The two groups from the GMD Institute for Algorithms and Scientific Computing entered this competition in the disciplines fold recognition (threading) and docking.
The PROTAL group (http://cartan.gmd.de/PROTAL/) has developed the program package ToPLign (http://cartan.gmd.de/ToPLign.html), which can be used to analyze and align protein sequences and to predict protein structures on the basis of their sequence. One of the tools developed within this suite is an improved successor to the predictor 123D, which a recent report in Nature Structural Biology mentioned as a prime resource for threading. Another tool called RDP, also included in the ToPLign package, can be used to refine these alignments in order to reach a model quality that is sufficient for docking studies. In the project RELIWE (http://www.gmd.de/SCAI/alg/reliwe/reliwe_home.html), the development of algorithms for docking flexible ligand molecules at GMD has led to the software tool FlexX, the fastest docking tool published so far, and apparently the only docking tool currently available on the internet (http://cartan.gmd.de/FlexX.html).
FlexX supports ligand flexibility, handles steric as well as chemical aspects of docking and produces models whose quality is comparable to that of other tools that use much more runtime.
With the help of these prediction tools the two GMD groups submitted models for most of the targets within disciplines (2) and (4) of CASP2, fold recognition and docking. At the meeting the methods and the results were presented, and a team of independent scientists assessed the quality of the predictions. PROTAL and RELIWE turned out to be well placed among the leading groups in both prediction areas. In particular, the reliability of the predictions was high, the number of false positives in fold recognition was low, and the docking tool FlexX had the shortest runtimes among all docking tools.
Bernd Kramer - GMD
Tel: +49 2241 14 2276
Thomas Lengauer - GMD
Tel: +49 2241 14 2777