Assembling proteins from typical fragments

Zbyszek Otwinowski, Jan Zelinka, Mei Qi, Maga Rowicka-Kudlicka
Texas Southwestern Medical Center. 5323 Harry Hines Boulevard, Dallas,
Texas 75390-9038

For data at a resolution significantly lower than atomic, model building has to be based on assembling proteins from larger fragments. Even for a protein with no homology to solved structures, there are characteristic polypeptide shapes that are likely to occur. Four stages in writing a fragment-based model-building program are discussed. The first is the creation of a fragment library. Data-mining techniques were used to identify clusters of similar 3-D shapes in the length range of 5-18 connected alpha carbons. This analysis was done for single-stranded and compact two-stranded fragments. Because these shapes have strong sequence preferences, the resulting library can also be used for purposes other than model building. The second stage is a search with this library based on a fast rotation function, which is faster and more flexible than a search based on a translation function. Even for complete searches the rotation function is faster, checking up to sixty million structural hypotheses per second on a single CPU. In most applications the search space can be restricted using the solvent mask, limited to the volume of the NCS asymmetric unit, or reduced by excluding previously interpreted electron density. The high speed allows searching with a much larger library of fragments. In the third stage, the results of the fragment search need to be connected to create a protein main chain. This problem can only be solved heuristically, and experience is being accumulated to produce an algorithm that is both fast and does not miss unusual conformations. The fourth stage, which is the current work, is to build side chains using conformational preferences that depend on the main-chain structure. Bayesian analysis of known structures, based on a maximum-entropy approach, is being applied. The numerical algorithms of the program are developed together with a GUI and 3-dimensional graphics. The results of using the program in solving a number of recent structures will be presented.
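
The abstract does not state which similarity measure or clustering method was used to build the fragment library. As an illustration only, the sketch below groups equal-length alpha-carbon fragments by comparing their internal distance profiles (a distance RMSD, which needs no superposition) and applies a simple greedy "leader" clustering. The function names, the choice of measure, and the cutoff are all assumptions made for this example, not the authors' method. Note that a distance-based measure cannot distinguish a shape from its mirror image, so a real library would likely need a superposition-based check as well.

```python
import math
from itertools import combinations


def distance_profile(coords):
    """All pairwise CA-CA distances of one fragment, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def drmsd(p, q):
    """Root-mean-square difference between two distance profiles.

    Hypothetical similarity measure chosen for this sketch; both fragments
    must contain the same number of alpha carbons.
    """
    if len(p) != len(q):
        raise ValueError("fragments must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)) / len(p))


def leader_cluster(fragments, cutoff):
    """Greedy leader clustering (an assumed, illustrative scheme).

    Each fragment joins the first cluster whose representative lies within
    `cutoff` angstroms (dRMSD); otherwise it starts a new cluster.
    Returns a list of clusters, each a list of fragment indices.
    """
    leaders = []   # distance profiles of the cluster representatives
    clusters = []  # fragment indices belonging to each cluster
    for i, frag in enumerate(fragments):
        prof = distance_profile(frag)
        for k, lead in enumerate(leaders):
            if drmsd(prof, lead) <= cutoff:
                clusters[k].append(i)
                break
        else:
            leaders.append(prof)
            clusters.append([i])
    return clusters


# Toy data: an extended 5-residue fragment, a translated copy of it,
# and a bent fragment (coordinates in angstroms, 3.8 A CA-CA spacing).
straight = [(3.8 * i, 0.0, 0.0) for i in range(5)]
shifted = [(x + 10.0, y + 5.0, z - 2.0) for x, y, z in straight]
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0),
        (0.0, 3.8, 0.0), (0.0, 3.8, 3.8)]

clusters = leader_cluster([straight, shifted, bent], cutoff=0.5)
```

In this toy case the translated copy falls into the same cluster as the extended fragment, while the bent fragment starts a cluster of its own. Greedy leader clustering depends on input order and is used here only because it is short; the 0.5 A cutoff is likewise arbitrary.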