|Friday 3 January|
|Session 1: Introduction / Overview |
|The phase problem|
Given recent advances in phasing, those new to protein crystallography
may be forgiven for asking "What problem?" As many of those
attending the CCP4 meeting come from a biological background, struggling with
expression and crystallisation, this introductory lecture aims to introduce
some of the basics that will hopefully make the subsequent lectures penetrable.
What is the phase in crystallography?
What is the problem? How can we overcome the problem? The
lecture will emphasise that we can only discover the phase values through some
prior knowledge of the structure. The lecture will canter through direct
methods, isomorphous replacement, anomalous scattering and molecular
replacement. As phasing is the most acronymic realm of crystallography, MR,
SIR, SIRAS, MIR, MIRAS, MAD, BAD and SAD will be expanded and explained in
part. Along the way we will meet some of the heroes of protein
crystallography such as Perutz, Kendrew, Crick, Rossmann and Blow who
established many of the phasing methods in the UK. It is inevitable that we
meet some basic mathematics, but this will be done as gently as possible.
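To make the problem concrete, here is a minimal Python sketch (not part of the lecture material) for a hypothetical one-dimensional "crystal" of three point atoms. The atom positions, scattering weights and the function name `structure_factor` are illustrative assumptions. It shows that a structure and its mirror image give identical amplitudes, so measured intensities alone cannot distinguish them; the difference lies entirely in the phases.

```python
import cmath
import math

def structure_factor(atoms, h):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j) for a 1D cell (fractional x_j)."""
    return sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)

# Hypothetical 1D structure: (fractional coordinate, scattering weight)
atoms = [(0.10, 6.0), (0.35, 8.0), (0.70, 16.0)]
mirror = [(-x, f) for x, f in atoms]  # mirror image of the same structure

for h in range(1, 5):
    F = structure_factor(atoms, h)
    Fm = structure_factor(mirror, h)
    # The experiment records |F|^2 only; the phase angle is not measured.
    print(h, round(abs(F), 3), round(abs(Fm), 3),
          round(cmath.phase(F), 3), round(cmath.phase(Fm), 3))
```

The amplitude columns agree for every h while the phase columns have opposite signs, which is why some prior knowledge of the structure is needed to choose between the alternatives.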
|New ways of looking at experimental phasing|
In the original work by Blow and Crick, experimental phasing was
formulated as a least-squares problem. For good data on good
derivatives, this approach works reasonably well, but we now
attempt to extract more information from poorer data than in the past.
As in many other crystallographic problems, the assumptions underlying
the use of least squares for phasing are not satisfied, particularly for
poor derivatives. The introduction of maximum likelihood (and more
powerful computers) has led to substantial improvements. For
computational convenience these new methods still make many assumptions
about the independence of different measurements and sources of error.
We have been looking at a more general formulation for the probability
distributions underlying likelihood-based methods for both experimental
phasing and molecular replacement phasing. In the new formulation, all
the structure factors associated with a particular hkl are considered
to be related by a complex multivariate normal distribution. When the
appropriate assumptions are introduced (i.e. measurement errors and
lack-of-isomorphism errors of different derivatives are independent,
derivatives contain no common heavy-atom sites, isomorphous addition
rather than isomorphous replacement), the general formulation reduces
to current likelihood targets. But the new formulation makes the
necessary assumptions more explicit, and points the way to improving
phasing using both isomorphous and anomalous differences.
|Session 2: Making Derivatives |
|Heavy atom preparation (and introduction to the heavy atom databank)|
Several of the standard methods of solving macromolecular structures
involve making a protein crystal that is derivatised by an anomalous
scatterer or heavy atom (MIR, SIRAS, MAD, SAD...). The theoretical methodology
which underpins the extraction of phase information from such derivatives
is widely available in the literature. In addition there are comprehensive
sources of information on the chemistry of heavy atom compounds and the ligands
with which they are known to interact.
Thus this contribution to the Workshop will aim to provide some
information on the less well documented practical problems of first deciding on
an overall strategy and secondly performing the physical manipulations involved
in producing and then cryo-cooling heavy atom derivatives from native protein
crystals. Ways to optimise the chances of isomorphous unit cells will be
suggested. Methods of determining whether or not the heavy atom is bound will
be discussed, including the powerful technique of PIXE (Particle-Induced X-ray
Emission).
An introduction will be given to the Heavy Atom Databank. This is a
database of the heavy atom compounds and conditions used to derivatise known
protein structures which has been assembled from the literature.
The various considerations when making heavy atom derivatives will be
illustrated with examples from our own laboratory.
|A new class of lanthanide complexes to obtain high phasing-power
heavy-atom derivatives for macromolecular crystallography|
The current emphasis on high-throughput crystallography calls for heavy-atom
preparation methods that are more reliable and less disruptive than traditional
heavy-atom soaking. Seven gadolinium complexes have been tested and found to be
excellent candidates to obtain heavy-atom derivatives in macromolecular crystallography.
These highly soluble lanthanide complexes can be easily introduced at high
concentration (100 mM or higher) in protein crystals either by soaking or by
co-crystallisation, without changing significantly the crystallisation conditions,
as was already demonstrated for Gd-HPDO3A derivative crystals of hen egg-white
lysozyme [Girard et al., Acta Cryst. (2002), D58, 1-9]. These complexes, combined
with the Single-wavelength Anomalous Dispersion (SAD) method, are of special interest
for high-throughput macromolecular crystallography.
Using this new class of heavy-atom derivative crystals, de novo phasing by the SAD
method has been carried out on several proteins of known structures as well as of
unknown structures. Diffraction data have been collected either with a laboratory
source making use of the high anomalous signal (f" = 12 e-) of gadolinium for CuKα
radiation, or using synchrotron radiation at the white line in the gadolinium
absorption edge (λ = 1.711 Å, f" = 28 e-).
Using Gd-HPDO3A, one of these gadolinium complexes, we have determined the structure
of a chimeric form of OTCase from P. aeruginosa and E. coli, which is a dodecameric
protein of 450 kDa.
|Tri-iodide derivatization of macromolecules|
Methods for producing macromolecular derivatives using cryo-soak techniques
with triiodide solutions are described. The methods have been tested on
six different protein types. SAD/SIRAS phasing has been attempted for
each protein with data measured using conventional Cu-Kα x-ray equipment and long wavelength synchrotron radiation.
The results show varying degrees of success. Refinement of all six derivative
structures has shown that iodine is able to bind as I- (as observed
in standard halide soaks) and also as the polyiodide anions I3- and I5-.
The various species are able to bind through hydrogen bond interactions and to
more hydrophobic regions of the protein at surface pockets and in inter- and
intra-molecular cavities. On the whole the derivative displays a promiscuous
behaviour in terms of its binding to proteins and is capable of generating
sufficient phasing power from in-house Cu-Kα
data to permit structure solution by SAD. The results of the phasing experiments
and structure refinements will be presented.
|Generation of noble gas binding sites for phasing using mutagenesis|
The utility of noble gases for phase determination has been limited by the
lack of naturally occurring binding sites in proteins. Wild-type T4
lysozyme contains one such binding site. By mutating large hydrophobic
residues to alanine, additional noble gas binding sites were successfully
introduced into this protein. Using data from xenon derivatives of
wild-type, two single mutants, and the corresponding double mutant,
experimental phases for T4 lysozyme were determined using standard MIR
techniques. These phases, which were obtained from room-temperature data
collected on a rotating-anode source, are comparable in quality to phases
calculated using selenomethionine MAD on frozen crystals at a synchrotron.
In addition, this method of introducing noble-gas binding sites near
specific residues should provide useful information for determining the
register of amino acids within electron-density maps.
|Session 3: Data Collection|
|Optimizing data collection for structure determination|
The final purpose of diffraction data collection is to produce a data set
which provides enough structural information about the molecule of
interest. This usually entails collecting a complete and accurate set of
reflections to as high a resolution as possible. In practice, the
characteristics of the crystal and properties of the X-ray source can
be limiting factors to the data set quality that can be achieved, and a
reasonable strategy has to be used to extract the maximum amount of
information from the data within the experimental constraints.
In the particular case of data intended for phasing using anomalous
dispersion, the synchrotron beamline properties are relevant to
determine how many wavelengths (one or more) should be used and what
the wavelength values should be. In general, Multiwavelength Anomalous
Dispersion (MAD) experiments produce very accurate phases, but are
very demanding in terms of beamline spectral range, ease of tuning,
stability and reproducibility. When these conditions cannot be met, single
wavelength experiments may be a better option.
In addition, understanding the effect of crystal characteristics,
diffraction quality and anomalous scattering properties on the phasing is
critical to balance the benefits of increased phase accuracy resulting
from long exposures and redundant measurements with the increased risk
of radiation damage to the crystal during the experiment.
|Extension of Home Laboratory Phasing Capabilities Using Chromium Radiation|
A home laboratory high-intensity chromium X-ray source appears to be ideally
suited for use in enhancing the weak anomalous signals from sulfur, selenium,
calcium and other atoms found in protein crystals. Specifically, the f" for
sulfur is 1.14 electrons at CrKα, which is similar to the f" of calcium and
selenium at CuKα.
Since calcium anomalous scattering has been used to
phase trypsin from CuKα diffraction
data, we expect a high quality CrKα data
set to provide even more phasing power and allow for routine phasing of
macromolecular diffraction data without the need for synchrotron data or
In order to test this hypothesis we have commissioned Osmic, Inc. to design and
manufacture a Confocal Max-Flux optic optimized specifically for Cr
radiation. We have performed experiments with this optic using an RU-300
generator and R-AXIS IV image plate in order to determine how to maximize the
anomalous signal from sulfur and other light elements within different proteins
with the goal of solving the phase problem more readily. Special attention
will be given to the details required to offset the strong absorption of Cr
radiation by the experimental setup typically used for diffraction studies, and
the crystal decay caused by Cr radiation. We will also discuss the solution of
protein structures using only the Cr-radiation-enhanced anomalous signal from sulfur.
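As a rough guide to the size of the signal discussed above, the sketch below applies a widely used rule-of-thumb estimate of the expected Bijvoet ratio, sqrt(2 N_A / N_T) * f" / Z_eff, with Z_eff of about 6.7 electrons. The formula is not given in the abstract, and the function name and the protein composition (1000 non-hydrogen atoms, 10 of them sulfur) are hypothetical; only the f" value of 1.14 e- for sulfur at CrKα is taken from the text.

```python
import math

def bijvoet_ratio(n_anom, n_atoms, f_double_prime, z_eff=6.7):
    """Approximate <|delta F_anom|> / <|F|> at low resolution.

    n_anom          number of identical anomalous scatterers
    n_atoms         total number of non-hydrogen atoms (assumed average Z_eff)
    f_double_prime  imaginary scattering component f" in electrons
    """
    return math.sqrt(2.0 * n_anom / n_atoms) * f_double_prime / z_eff

# Hypothetical protein: 1000 non-hydrogen atoms, 10 sulfur atoms.
ratio_cr = bijvoet_ratio(10, 1000, 1.14)  # f" of S at CrK-alpha, from the text
print(f"expected anomalous signal at CrK-alpha: {100 * ratio_cr:.1f}%")
```

For this assumed composition the estimate is about 2.4%, which indicates the measurement accuracy a sulfur-only experiment would need.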
If two measurements of an X-ray amplitude are available,
with a description of their vector difference, then it is possible to obtain
an estimate of the phase of that amplitude, along with its reliability.
This is the case for the methods known as "Single Isomorphous
Replacement" (SIR),
where the vector difference is due to heavy atoms added to the crystal, and
"Single Anomalous Dispersion" (SAD), where the vector difference is due to
anomalous scattering of a few atoms in the lattice, e.g. Se in SeMet or
a bound metal.
The phase estimate can be refined by imposing prior knowledge of the [...]
a map for a macromolecule; e.g. a considerable fraction of the asymmetric unit
will be filled with disordered solvent, the map should show continuous [...]
These techniques work very well providing the initial measurements are [...]
first to position the anomalous scatterers then to give realistic
figures of merit.
In some cases the relative signal from the anomalous scatterers has been
as low as 1%.
I will examine several test cases which illustrate both success and
failure, to pinpoint the factors which govern the outcome.
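The idea of estimating a phase and its reliability from two amplitudes and their vector difference can be sketched as follows. This is a minimal illustration in the spirit of the classic Blow and Crick treatment with a Gaussian lack-of-closure error, not the speaker's actual procedure; the function names, the error value `sigma` and the reflection values are hypothetical.

```python
import cmath
import math

def sir_phase_distribution(fp, fph, fh, sigma, n_steps=360):
    """Phase probability for native amplitude fp, derivative amplitude fph,
    complex heavy-atom contribution fh and Gaussian closure error sigma."""
    phases = [2.0 * math.pi * k / n_steps for k in range(n_steps)]
    weights = []
    for phi in phases:
        fph_calc = abs(cmath.rect(fp, phi) + fh)   # |F_P exp(i phi) + F_H|
        closure = fph - fph_calc                   # lack of closure
        weights.append(math.exp(-closure ** 2 / (2.0 * sigma ** 2)))
    total = sum(weights)
    return phases, [w / total for w in weights]

def centroid_phase_and_fom(phases, probs):
    """Centroid ('best') phase in radians and figure of merit (0 to 1)."""
    centroid = sum(p * cmath.exp(1j * phi) for phi, p in zip(phases, probs))
    return cmath.phase(centroid), abs(centroid)

# Hypothetical reflection: true native phase 60 degrees, heavy-atom phase 0.
fp_true = cmath.rect(100.0, math.radians(60.0))
fh = cmath.rect(20.0, 0.0)
fph_obs = abs(fp_true + fh)

phases, probs = sir_phase_distribution(abs(fp_true), fph_obs, fh, sigma=2.0)
best, fom = centroid_phase_and_fom(phases, probs)
peak = max(range(len(probs)), key=probs.__getitem__)
print(f"most probable phase: {peak} deg, centroid: {math.degrees(best):.1f} deg, "
      f"figure of merit: {fom:.2f}")
```

With a single difference the distribution is bimodal (here peaking at 60 and 300 degrees), so the centroid phase sits between the two alternatives with a figure of merit near 0.5. That ambiguity is what prior knowledge about the map is used to resolve.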
|Practical Aspects of Sulfur ISAS Phasing|
Sulfur, which exists in almost all proteins, has been investigated as an
anomalous phasing probe for protein structure determination since the
early 1980s. However, during the past two decades only a few de novo
structures have been determined using this method. This is due to the
fact that sulfur's anomalous scattering signal is weak,
with ΔF" ranging from 0.124 to 1.42 e- over the
tunable range of most synchrotron X-ray sources. Using third generation
synchrotrons, improved CCD detector technology and special attention to
data collection we have shown that the weak sulfur anomalous scattering
signal can be recorded with the accuracy needed for successful de novo
structure determination. Based on these successes we have developed
data collection, data processing and phasing procedures for protein
structure determination using sulfur single-wavelength anomalous
scattering (SAS) data. Our results show that sulfur phasing can be
applied to many crystal structure determinations if the correct
experimental procedure is pursued. The practical aspects of this
approach will be presented.
|Radiation-Damage Induced Phasing|
Our brightest synchrotron sources have had an extraordinary impact in
biology during the past few years. However, they are also creating havoc with
crystalline biological samples by processes referred to as radiation damage.
In the course of the data collection, the diffraction power of the crystal is
reduced, the mosaicity and overall B-factor go up, and eventually one will lose
all higher order reflections. In addition to these general effects, some highly
specific changes might occur, such as breakage of disulphide bonds and loss of
definition of carboxyl groups. During data collection, the diffraction
intensities change and these changes can be measured accurately. Part of these
changes can be assigned to a generally large, but limited, number of highly
susceptible sites. In this paper it is shown how to extract reliable intensity
differences of the X-ray susceptible part of the structure. These can eventually
be used to obtain phase information by a method that we have named
Radiation-damage Induced Phasing (RIP).
|Saturday 4 January|
|Session 4: Finding the Sites |
|Heavy atom searches and their symmetries|
This presentation gives an overview of the various heavy atom search
procedures that are available to the macromolecular crystallographer
including Patterson methods and direct methods. To help in understanding
potential pitfalls, special emphasis is put on elucidating the
symmetries of the search spaces.
||Frank von Delft
|A Very Large Substructure: The 160 Seleniums of KPHMT|
We present the largest successful application of selenomethionine MAD reported to
date: the crystal structure of the decameric E. coli enzyme ketopantoate
hydroxymethyltransferase (KPHMT), with 160 ordered selenium atoms and 560 kDa of
protein in the asymmetric unit. Despite small (< 150 µm), irregular, weakly
diffracting (< 3.2 Å) crystals, the substructure was solved by SAD combined
with Direct Methods, using a 20-fold redundant peak
dataset. SnB produced the first correct solution after 2600 computing hours
and phases from SHARP and Solomon produced traceable maps, even before
20-fold NCS averaging. Subsequent analysis revealed that while data redundancy
was critical for success, careful selection of data was even more so; on the
other hand, speed and success rate vary considerably between programs. Apart
from a favourable ratio of selenium to scattering matter, the procedure was quite
general, suggesting that this is still a long way from the practical upper
limit of applicability, if that exists.
|Structure determination of the extracellular domain of the LDL Receptor:
a non-trivial case of MAD phasing|
We have solved the structure of the extracellular domain
of the LDL receptor at 3.7Å resolution.
A MAD experiment was carried out on our crystals at the tungsten edge.
The asymmetric unit of our crystals contained one protein molecule and
31 tungsten atoms arranged in 2 1/2 clusters. The MAD phases led to an
interpretable electron density map in which known fragments could easily
be placed. In diffraction experiments using energies at the tungsten edge
and above, the presence of so many anomalous scatterers together with
only 85 kDa protein, generated a tremendous anomalous signal. While
in principle useful for phasing, such a large anomalous signal proved in
practice problematic for data processing and structure determination.
In addition, the poor quality of the crystals and their radiation
sensitivity hindered progress. We will describe the methods used that
ultimately led to structure determination of the extracellular domain of
the LDL Receptor.
|Determination of accurate substructures with SHELXD|
Using the signal of naturally built-in or artificially introduced anomalous scatterers
to derive a starting phase set in a macromolecular crystal structure determination
has become routine in recent years. In particular in the context of high-throughput
crystallography, MAD and SAD (multiple and single wavelength
anomalous dispersion) methods are central tools. For both techniques, a crucial
step is the determination of the substructure of anomalous scatterers.
Due to the molecules investigated becoming larger and the use of soaking
techniques that may populate many sites, the size of typical substructures
is increasing and classical direct methods and Patterson techniques have hit
their limit. Although originally designed for the ab initio phasing
of entire macromolecular structures, real/reciprocal Fourier recycling methods
combined with Patterson-based seeding as implemented in SHELXD prove
very effective in finding substructure sites.
Choosing the right subset of the diffraction data for the substructure
determination can make the difference between success and failure. For example,
the inclusion of noisy high resolution data will generally do more harm than
good. Furthermore, subsequent phasing procedures such as SHELXE will profit
from starting with a substructure model that is as accurate as possible.
Using a computer program that allows the quantitative comparison of different
substructure models (SITCOM), we investigated how the most accurate substructure
can be obtained under different circumstances. The results of this study for
different scenarios in MAD and SAD phasing will be discussed.
|Session 5: Twinning |
|The Derivation of Non-Merohedral Twin Laws During Refinement by Analysis of Poorly-Fitting Intensity Data|
Data sets from non-merohedral twins contain large numbers of reflections
that are unaffected by twinning, and it is our experience that their structures
can be solved without difficulty. Problems such as large, inexplicable
difference peaks and a high R-factor may indicate that twinning is a problem
during refinement. Careful analysis of poorly fitting data reveals that they
belong predominantly to certain distinct zones in which |Fobs|² is
systematically larger than |Fcalc|².
If twinning is not taken into
account it is likely that these zones are being poorly modelled, and that
their indices may provide a clue as to a possible twin law. We have
written a computer program, called ROTAX, which makes use of this idea to
identify possible twin laws. A set of data with the largest
values of [Fo² - Fc²] is
identified and the indices transformed by two-fold rotations about
possible direct and reciprocal lattice directions. Matrices which
transform the indices of the poorly fitting data to integers are
identified as possible twin laws. The user then has a set of potential
matrices which might explain the source of the refinement problems
described above. If area detector intensity frames are available, then the
current orientation matrix may be transformed in an attempt to index
previously unindexed spots. Alternatively the twin laws can be used to
split affected zones of reflection data and a check made to see if this
improves the refinement statistics.
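The integer test at the heart of this idea can be sketched in a few lines. This is an illustrative simplification and not the ROTAX code: it takes one candidate matrix as given rather than generating two-fold rotations from the cell, and the matrix, reflection list, tolerance and function names are all hypothetical. A column-vector convention (new index = matrix times old index) is assumed.

```python
def transform_index(hkl, matrix):
    """Apply a 3x3 candidate twin matrix to one reflection index."""
    return tuple(sum(matrix[r][c] * hkl[c] for c in range(3)) for r in range(3))

def fraction_near_integer(reflections, matrix, tol=0.05):
    """Fraction of reflections whose transformed indices are all near integers."""
    hits = 0
    for hkl in reflections:
        new = transform_index(hkl, matrix)
        if all(abs(v - round(v)) <= tol for v in new):
            hits += 1
    return hits / len(reflections)

# Hypothetical candidate two-fold operation (its square is the identity).
candidate = [[1, 0, 0],
             [0, -1, 0],
             [-0.5, 0, -1]]

# Hypothetical list of the worst-fitting reflections from a refinement.
poorly_fitting = [(2, 1, 3), (4, 0, 1), (6, 5, 2), (2, 3, 7), (1, 1, 1)]

print(fraction_near_integer(poorly_fitting, candidate))
```

Here every reflection with even h maps onto another lattice point, so a matrix of this kind would be flagged as a plausible twin law if the poorly fitting data were concentrated in those zones.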
|MAD phasing on merohedrally twinned crystals|
Merohedrally or pseudomerohedrally twinned protein
crystals may occur more often than usually acknowledged.
Such crystals are often discarded as difficult, without good reason
and without further analysis. Several structures have been
solved from twinned crystals by Molecular Replacement and
MIR, but not so many by MAD. The crystals of gpd were
twinned at about the 35% level, but it was possible to
solve the protein structure by the classic SeMet MAD
approach. This case will be presented in detail and
various procedures for the evaluation of the twinning
ratio and detwinning discussed.
||Anke Terwisscha van Scheltinga
|MIR structure determination of Deacetoxycephalosporin C Synthase from twinned crystals|
Crystals of deacetoxycephalosporin C synthase (DAOCS) were found to be twinned by
merohedry by many diagnostic criteria, e.g. a distorted cumulative intensity distribution
and additional symmetry in the self rotation function. We determined the structure of
DAOCS from these twinned crystals, based on a combination of isomorphous replacement and
the use of a multiple wavelength diffraction data set.
To identify and use possible derivatives, we detwinned the data, applying twin fractions
estimated from Britton plots. The detwinned data resulting from these [...]
accurate enough to interpret Patterson maps, to refine the sites and to obtain
phases. The twin fractions were refined by using the phasing statistics as [...]
accuracy did not result in a relevant improvement of the phases, showing that [...]
obtained from Britton plots were sufficient for structure determination.
We found that merohedral twinning is not necessarily an obstacle for structure
determination by MIR; even crystals with twin fractions as high as 0.45 can [...]
information after detwinning. However, the use of crystals with lower twin fractions will
result in smaller errors when detwinning, and will produce better experimental phases.
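The detwinning step and its sensitivity to the twin fraction can be illustrated with the standard two-domain (hemihedral) relations I_obs1 = (1 - a) I1 + a I2 and I_obs2 = a I1 + (1 - a) I2. The sketch below is an assumption-laden illustration rather than the authors' procedure: function names and intensity values are hypothetical, and the error propagation assumes equal, independent sigmas on both observations.

```python
import math

def twin(i1, i2, alpha):
    """Simulate the observed intensities of a twin-related pair."""
    return (1 - alpha) * i1 + alpha * i2, alpha * i1 + (1 - alpha) * i2

def detwin(i_obs1, i_obs2, alpha):
    """Recover the true intensities for twin fraction alpha < 0.5."""
    if not 0.0 <= alpha < 0.5:
        raise ValueError("alpha must lie in [0, 0.5)")
    d = 1.0 - 2.0 * alpha
    return (((1 - alpha) * i_obs1 - alpha * i_obs2) / d,
            ((1 - alpha) * i_obs2 - alpha * i_obs1) / d)

def detwinned_sigma(sigma, alpha):
    """Sigma of a detwinned intensity, for equal independent input sigmas."""
    return sigma * math.sqrt((1 - alpha) ** 2 + alpha ** 2) / (1 - 2 * alpha)

def negative_fraction(observed_pairs, alpha):
    """Britton-style statistic: fraction of pairs giving a negative intensity."""
    bad = sum(1 for o1, o2 in observed_pairs if min(detwin(o1, o2, alpha)) < 0)
    return bad / len(observed_pairs)

# Hypothetical true intensities, twinned with a fraction of 0.3.
true_pairs = [(1.0, 100.0), (50.0, 60.0), (200.0, 5.0)]
observed = [twin(a, b, 0.3) for a, b in true_pairs]

for trial in (0.2, 0.3, 0.4):
    print(trial, negative_fraction(observed, trial))
print(detwinned_sigma(1.0, 0.10), detwinned_sigma(1.0, 0.45))
```

Negative intensities appear only once the trial fraction overshoots the true value, which is the behaviour a Britton plot exploits. The propagated sigma grows from roughly 1.1 at a twin fraction of 0.10 to roughly 7 at 0.45, in line with the remark that lower twin fractions give smaller detwinning errors.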
|Twinning in presence of multiple NCS: metals in a bacterial transferrin|
Merohedral twinning is usually detected and overcome by statistical methods,
which rely on the assumption that the intensity probability distributions for the
twin-related reflections are independent. They become correlated if a
non-crystallographic rotational symmetry element coincides with the twinning
symmetry. Presence of translational NCS seriously distorts the probability
distribution itself and biases our twinning estimates. Both rotational and
translational NCS complicated our twinned structure solution. Crystals of the
Ferric Binding Protein (FBP) contain 9 protein molecules in the asymmetric
unit of the space group P32. The self-Patterson map reveals two strong
translational peaks. Six protein molecules are related by a non-crystallographic
dyad axis parallel to the twinning axis, which lies perpendicular to the
crystallographic 3-fold axis and generates the P3212 symmetry of the
diffraction pattern. The structure was solved by the combination of molecular
replacement and manual analysis of anomalous difference maps phased with the
trial models from molecular replacement. The presence of heavy hafnium atoms
bound to the protein provided strong anomalous signal, which ensured the
structure solution. Interestingly, oxidized Hf4+ ions form clusters (of
3 to 5 metal atoms) in the iron-binding pocket of FBP.
|Session 6: Refinement and Phasing|
|Generation and flow of experimental phase information in structure determination: recent enhancements in SHARP 2.0|
The SHARP program for heavy-atom refinement and phasing, described in its initial
form in 1997, has been completely rewritten. New numerical methods, code restructuring and
the use of a powerful new optimiser have produced considerable gains in speed
and accuracy. These improvements have been accompanied by the development
of a new representation of phase information better suited to its transfer
from and towards other steps of structure determination and refinement.
|Errors and sigmas when crystals are changing|
Diffraction intensity measurements are obtained with some uncertainty,
described by sigma values. The sources for uncertainties in
measurements can be divided into two categories: systematic
phenomena and random noise. Systematic phenomena are mainly
associated with the crystal and instrument quality. Some of them, for
example non-uniform crystal rotation, can be accounted for by
multiplicative scale factors that apply to all reflections; others
require separate corrections for each unique reflection (for example
radiation-induced localized changes in the structure). If not
corrected for, systematic phenomena, together with random effects, will be
reflected in the final sigmas. Optimal treatment of all errors might be
crucial in phasing from anomalous signal, where the anomalous signal is of
similar magnitude to the uncertainty in the measurements.
Currently, there is no satisfactory theory for the consistent treatment of
uncertainties in crystallography. We will discuss theoretical and practical
problems, such as the role of personal bias, approximations and
assumptions of error correction procedures, propagation of errors
between different crystallographic procedures, and correlations
between parameters. In addition, the impact of some of the
systematic effects will be discussed with examples of uncertainty
treatment in popular crystallographic programs.
|Phasing the AP2 complex with Xe, Hg and Se|
The AP2 clathrin adaptor complex is a heterotetramer involved in the
formation of clathrin-coated vesicles. Crystals of the 200 kDa core complex
contain the trunk domains of the two large subunits α
and β2, the medium µ2 and small σ2 subunits. This
complex crystallises together with the lipid-headgroup mimic IP6 in
space group P3121, unit cell a = b = 122 Å, c = 258 Å, γ = 120°, with one complex
in the asymmetric unit. The crystals diffract at best to about 2.6 Å resolution, but the
diffraction is weak beyond about 3.2 Å (Wilson plot B-factor about 80 Å²).
The structure was solved using a series of Xe and mercury (EMTS)
derivatives. The most important phasing came from two Xe derivative
datasets, which were collected at long wavelengths to enhance the
anomalous signal (f" = 9.0 at λ = 1.74 Å, f" = 11.1 at
λ = 2 Å). SeMet protein could only be prepared in the presence of a
partially rich culture medium, so the incorporation of Se was less than
100%. Phases from two Se datasets (each at two wavelengths) were used
together with the Xe and EMTS datasets, but the improvement to the phases
was small. The main value of the Se data was in aiding the model-building.
Xe derivatives are valuable phasing tools even for large structures,
particularly with data collected at long wavelengths.
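The cell and contents quoted above can be sanity-checked with a Matthews-coefficient estimate. The sketch below is an illustration only: it assumes six asymmetric units per cell for P3121, a nominal mass of 200 kDa for the core complex, and the conventional factor of 1.23 for converting the Matthews coefficient into a solvent fraction. The function names are hypothetical.

```python
import math

def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume in cubic Angstroms (lengths in Angstroms, angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)

def matthews(volume, mass_da, n_asu, n_copies=1):
    """Return (Matthews coefficient in A^3/Da, estimated solvent fraction)."""
    vm = volume / (n_asu * n_copies * mass_da)
    return vm, 1.0 - 1.23 / vm

volume = cell_volume(122.0, 122.0, 258.0, 90.0, 90.0, 120.0)
vm, solvent = matthews(volume, 200_000, n_asu=6)   # one complex per asymmetric unit
print(f"V = {volume:.0f} A^3, Vm = {vm:.2f} A^3/Da, solvent = {100 * solvent:.0f}%")
```

Under these assumptions the estimate is about 2.8 Å³/Da, or roughly 56% solvent, which is consistent with one complex in the asymmetric unit.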
|Can three-beam interference be an alternative to anomalous dispersion?|
The recently developed reference-beam diffraction data-collection
technique makes it possible to collect large numbers of relative phases
(triplet phases) of Bragg reflections. Here we will demonstrate the
differences between the reference-beam diffraction method, the
conventional three-beam interference techniques, and the standard
oscillating-crystal method. With the reference beam technique it is
possible to collect hundreds of phase-sensitive three-beam interference
profiles on a time scale comparable to that of a multiple-wavelength
anomalous dispersion (MAD) experiment. Experimental results and analysis
will be presented, showing how the triplet phases are obtained from measured
reflections, and then how individual phases can be deduced to produce an
electron density map based on the measured triplet phases.
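What makes a triplet phase measurable is that the sum of the phases of reflections h, k and -(h+k) does not depend on the choice of origin, unlike the individual phases. The sketch below checks this numerically for a hypothetical set of point atoms; the coordinates, scattering weights, indices, origin shift and function names are all illustrative assumptions, and it says nothing about how the interference profiles themselves are analysed.

```python
import cmath
import math

def structure_factor(atoms, hkl, shift=(0.0, 0.0, 0.0)):
    """F(hkl) for point atoms after moving the origin by 'shift' (fractional)."""
    total = 0j
    for (x, y, z), f in atoms:
        pos = (x + shift[0], y + shift[1], z + shift[2])
        total += f * cmath.exp(2j * math.pi * sum(h * p for h, p in zip(hkl, pos)))
    return total

def triplet_phase(atoms, h, k, shift=(0.0, 0.0, 0.0)):
    """Phase of F(h) * F(k) * F(-h-k), i.e. the sum of the three phases."""
    minus_hk = tuple(-(a + b) for a, b in zip(h, k))
    product = (structure_factor(atoms, h, shift)
               * structure_factor(atoms, k, shift)
               * structure_factor(atoms, minus_hk, shift))
    return cmath.phase(product)

# Hypothetical structure: (fractional coordinates, scattering weight)
atoms = [((0.10, 0.20, 0.30), 6.0),
         ((0.45, 0.15, 0.80), 8.0),
         ((0.70, 0.60, 0.25), 16.0)]
h, k = (1, 2, 0), (2, -1, 3)
shift = (0.13, 0.27, 0.41)

single_0 = cmath.phase(structure_factor(atoms, h))
single_1 = cmath.phase(structure_factor(atoms, h, shift))
print("individual phase:", round(single_0, 4), "->", round(single_1, 4))
print("triplet phase:   ", round(triplet_phase(atoms, h, k), 4), "->",
      round(triplet_phase(atoms, h, k, shift), 4))
```

The individual phase changes when the origin moves, while the triplet phase is unchanged. Individual phases can therefore only be deduced from measured triplets once an origin has been fixed.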
|Phasing the 30S ribosomal subunit|
During the last couple of years, high-resolution crystal structures of both
subunits of the bacterial 70S ribosome have been determined, which means
that we now have a complete structural scaffold that can be used to ask
new biochemical and structural questions to try and further understand
the function of this important and complex enzyme.
In the case of the 30S ribosomal subunit from Thermus thermophilus,
phasing went hand in hand with efforts on several fronts to push the
resolution limit of the crystals to a level where individual protein
side chains and RNA bases could be distinguished in the electron
density. In spite of significant technical and methodological developments
in the field of macromolecular crystallography over the last decade, the
30S case proved particularly difficult for a number of reasons including
crystal variability and radiation damage. Crucial to the success of ab
initio phasing was the use of single wavelength anomalous data
collected at the LIII absorption edges of relevant heavy atom compounds
such as lanthanides and osmium complexes. Other important aspects of the
data collection and phasing protocol included the use of crystal alignment,
pre-screening of crystals, and the use of isostructural compounds to
compensate for non-isomorphism between native and derivative crystals.
Successful interpretation of the resulting electron density map was further
aided by vast amounts of prior biochemical data on the composition and
structure of the ribosome. A reexamination of the origin of the phasing
signal has indicated that very weak anomalous signal is more important for
successful phasing of very large complexes than one might initially expect.