New model completion capabilities of ARP/wARP

SX Cohen1, R Morris, FJ Fernandez, V Lamzin2, A Perrakis1
1Netherlands Cancer Institute, Amsterdam, Netherlands
2EMBL Hamburg Outstation, Hamburg, Germany

Automatic procedures for building protein structures into electron density maps (using software such as ARP/wARP, Resolve or MAID) have contributed greatly to automation. Still, crystallographers must spend a considerable amount of time in front of interactive modelling programs to bring the model to a reasonable level of completion. Depending on the quality and resolution of the data and the relative success of the automatic modelling, the user is left in the best case with a model lacking only poorly ordered loops and side chains, and in the worst case with a very partial model made of discontinuous poly-alanine stretches.

We introduce new modules of the ARP/wARP suite that will lead to the automatic generation of more complete models. A docking algorithm assigns each main-chain fragment to the available amino-acid sequence, and thus assigns an amino-acid type to each residue of the main chain. The latest version of the algorithm works in the presence of NCS and can derive the NCS operators that map each copy of the molecule onto another. After sequence assignment, the side chains are constructed and the best rotamers are fitted. Finally, the side chains are refined using a torsion-angle parametrisation and a pseudo-real-space fitting target. To incorporate anti-bumping restraints, the refined electron density map is altered in such a way that the restraints are included in the real-space correlation computation. At this stage, short, poorly defined loops can be modelled using an exhaustive search algorithm whose cost is greatly reduced by knowing the start and end points of the loop.
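The idea of docking a main-chain fragment onto the sequence can be illustrated with a minimal sketch. It is not the ARP/wARP implementation: the function name `dock_fragment`, the per-residue score tables and the simple additive score are all assumptions made for illustration. Each fragment residue is assumed to carry a score for every amino-acid type (for example, derived from how well that side chain would fit the local density), and the fragment is slid along the sequence to find the best-scoring ungapped placement.

```python
def dock_fragment(scores, sequence):
    """Return (best_offset, best_score) for placing a fragment on the sequence.

    scores:   list of dicts, one per fragment residue, mapping a one-letter
              amino-acid code to a hypothetical fit score (higher is better).
    sequence: the target sequence as a string of one-letter codes.

    Illustrative assumption: the placement score is the plain sum of the
    per-residue scores, and residue types missing from a table score 0.
    """
    n = len(scores)
    if n == 0 or n > len(sequence):
        raise ValueError("fragment must be non-empty and no longer than the sequence")

    best_offset, best_score = -1, float("-inf")
    # Try every ungapped placement of the fragment along the sequence.
    for offset in range(len(sequence) - n + 1):
        total = sum(
            position.get(sequence[offset + i], 0.0)
            for i, position in enumerate(scores)
        )
        if total > best_score:
            best_offset, best_score = offset, total
    return best_offset, best_score


# Hypothetical three-residue fragment whose density suggests Gly, Trp, Lys.
fragment_scores = [
    {"G": 1.0, "A": 0.2},
    {"W": 2.0},
    {"K": 1.5, "A": 0.1},
]
best_offset, best_score = dock_fragment(fragment_scores, "MAGWKAA")
```

In practice, a placement would only be accepted if its score stood out clearly from those of the alternative offsets. Handling NCS would additionally require comparing the placements of fragments from different copies of the molecule.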

These algorithms are implemented so that they run fast enough to be used iteratively: each is called as soon as it can provide a significant improvement to the current model, rather than only at the last cycle, which enables ARP/wARP to work on even poorer data than the current version can. To this end, we will use information provided by novel statistical validation tools (developed by G. Kleywegt, based on the EDS server) to discriminate between the parts of the model that should be rebuilt and the parts for which rebuilding is no longer necessary.
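How validation information might steer rebuilding can be sketched as follows. This is an assumption-laden illustration, not the validation tools referred to above: the function name `split_by_validation`, the use of a per-residue real-space correlation coefficient and the threshold of 0.8 are all hypothetical.

```python
def split_by_validation(rscc, threshold=0.8):
    """Partition residues into those to keep and those to rebuild.

    rscc:      dict mapping a residue identifier to a hypothetical
               per-residue real-space correlation coefficient.
    threshold: hypothetical cut-off; residues scoring below it are
               flagged for rebuilding in the next cycle.
    """
    keep = [res for res, cc in rscc.items() if cc >= threshold]
    rebuild = [res for res, cc in rscc.items() if cc < threshold]
    return keep, rebuild


# Hypothetical scores for three residues of chain A.
keep, rebuild = split_by_validation({"A10": 0.93, "A11": 0.55, "A12": 0.81})
```

In an iterative scheme, only the residues in `rebuild` would be passed back to the building modules, while those in `keep` would be left untouched.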