**by Alexandre G. URZHUMTSEV ^{a*} and Vladimir Y. LUNIN^{ab}**

^{a} LCM3B, UPRESA
7036 CNRS, Faculte des Sciences, Universite Henri Poincare, Nancy I, 54506
Vandoeuvre-les-Nancy, France

^{b} Institute of
Mathematical Problems of Biology, Russian Academy of Sciences, Pushchino,
Moscow Region, 142290, Russia

*Email:sacha@lcm3b.u-nancy.fr*

Recently,
several groups (Murshudov *et al*.,
1997; Templton, 1999; Tronrud, 1999) reported some advances in calculation of
an approximate matrix of the second derivatives (Hessian or normal matrix) for
crystallographic criteria which, in particular, is used in optimisation
methods. In fact, the gradient methods need only the product of this matrix by
a given vector which can be calculated rapidly and directly, without explicit
matrix calculation (Lunin & Urzhumtsev, 1985). The second-order
minimisation methods need an inverted Hessian matrix with an inversion being
very sensible to approximations; as a consequence, algorithms for the
calculation of the *exact* Hessian
matrix [H] must be developed in this case. Moreover, it is important to know to
calculate such matrix for combined minimisation criteria, with a mixture of
different structure-factor depending criteria (crystallographic criterion in
what follows), geometric criteria and others. This knowledge can influence the
choice of the optimisation methods which vary in computation time per
iteration, in the number of iterations necessary to arrive to a similar result,
and in radius of convergence.

The direct calculation of the normal matrix of a crystallographic
criterion needs of order of MN² operations (derivative of the contribution of
every reflection with respect to every couple of parameters). Here N is the
number of atomic parameters, M is the number of structure factors. In order to
gain CPU time, some authors use a sparse matrix (Sussman *et al*., 1977; Hendrickson & Konnert, 1980; Dodson, 1981;
Guillot *et al*., 2001). To estimate the full matrix of the second
derivatives for the weighted least-squares fit of the magnitudes Templton
(1999) suggested a statistical approach. Murshudov *et al*. (1997) and Tronrud (1999) developed the FFT technique
suggested for this goal by Agarwal (1978). In particular, in the case of atomic
parameters **q** as independent
variables, Tronrud (1999) proposed a practical algorithm which needs

*T*_{HT} = C_{1}N² + C_{2}M
ln M (1)

operations to calculate the principal part of the matrix [H]

traditionally
neglecting the second term _{}

Similar
formulae were derived by Murshudov *et al*.
(1997) for the case of the maximum likelihood criterion. At the same time, the
use of Fast Differentiation Algorithm (Baur & Strassen, 1983; Kim *et al*., 1984; for crystallographic
applications see Lunin & Urzhumtsev, 1985) allows to calculate rapidly the *exact* matrix of the second derivatives
with respect to atomic parameters for both these criteria and to generalise
this result for other cases.

Traditionally,
it is considered that the gradient methods are much more time consuming than
the methods without derivatives, and that generally every computation of a
gradient needs about N times CPU more than a single value of the function, N
being the number of variables. However, several groups (*e.g.,* Baur & Strassen, 1983; Kim *et al*., 1984) have shown
the following result and its conclusions:

For any function
of N variables, calculation of a single value of which takes the time *T*, an algorithm exists to calculate its *exact* gradient faster than in 4*T* *independently
on the number N* of variables.

For any function
of N variables, calculation of a single value of which takes the time *T*, an algorithm exists to calculate its *exact* derivative along a given direction
faster than in 4*T* *independently on the number N* of
variables and *without* a gradient
calculation.

For any function
of N variables, calculation of a single value of which takes the time *T*, an algorithm exists to calculate the *exact* product of the matrix of the
second derivatives by a given direction faster than in 20*T* *independently on the number
N* of variables.

In practice, when some common expressions can be saved in memory,
the time to calculate all 4 objects, namely the function *f* itself, its gradient Ñ*f*, the function derivative
_{} along a given
direction **s** and the product [H]**s** of the Hessian matrix [H] by
the same direction, can be estimated rather by 4*T* where *T* stands for the
time of a single calculation of the function value (Urzhumtsev *et al*., 1989).

FDA shows that the crucial point in the fast optimisation of crystallographic functional is a fast calculation of structure factors. A fast way of their calculations (Sayre, 1951; Ten Eyck, 1977) needs about

*T _{f}* = C

operations (C_{1}
and C_{2} are some constants), instead of CNM operations for a direct
calculation.

FDA gives the same amount of operations for the
gradient calculation being applied to a least-squares criterion depending on
atomic parameters (Lunin, 1978, unpublished; Lifchitz, in Agarwal, 1981) or to
other crystallographic criteria (Lunin & Urzhumtsev, 1985; Urzhumtsev *et al*, 1989).

Moreover, because the n-th
column of the matrix [H] of the second derivatives represents the product of
this matrix by the direction (0, 0, … , 0, 1, 0, …, 0) where 1 is in the n-th
position, the whole *exact* matrix of
the second derivatives composed from N columns can be calculated by

*T*_{HE} = C_{1}N² + C_{2}
NM lnM (5)

operations. However, a faster and more general way to calculate the exact Hessian matrix can be suggested.

Let a function R(y_{1}(x_{1}, …, x_{N}), …,
y_{M}(x_{1}, …, x_{N})) depend on M variables y_{1},
…, y_{M} with every y_{m}
depending in turn on N variables x_{1}, …, x_{N} . The chain
rule formula:

can be rewritten as

Here
_{}
is a tensor composed of M matrices
_{},
_{}, ...
_{}. If the gradient [Ñ** _{y}**R] is supposed
to be known (the calculations are carried out accordingly to the FDA) then the
operations needed to get

Formula (7) can be simplified for a number of special cases.

*Special
case 1: *If
the criterion R is additive, *i.e.* can
be presented by a sum of individual contributions, not necessarily quadratic,
from the components of the vectors **y**

then the matrix
_{} becomes diagonal. An
example is the least-squares fit of the calculated data to the experimental
ones.

*Special
case 2: *When
the variables **y** depend linearly on the
variables **x** , **y** = [A] **x** , the second
term in the formula is absent and (7) becomes

An example of such linear dependence is the relation between the electron density and its structure factors.

*Special
case 3: *
Quite often variables** x**
contribute to variables **y** locally,
i.e. each of x_{1}, x_{2}, ... x_{N} contribute only to a
small number C_{y} of variables y_{k}, C_{y} << M . Complementary,
every y_{k} may depend only on a small number of C_{x}
parameters x_{j}, C_{x}
<< N . The matrix [d**y **/ d**x**] becomes sparse giving an essential
reduction in the number of calculations. An example of such dependence is the
calculation of any field (e.g., an
electron density) from an atomic model where atoms have a limited radius of
contribution and are separated each from others.

It is easy to see that the calculation of normal
matrix for crystallographic criteria can profit the features of all particular
cases discussed above and it can be shown that the total amount of operations
necessary to calculate the exact matrix of the second derivatives of the
crystallographic least-squares criterion with respect to the atomic parameters
is

*T*_{HC} = C_{1}M + C_{2}M
lnM + C_{3}N² ~ C_{12 }M lnM + C_{3}N² (10)

where C_{1},
C_{2}, C_{3 }and C_{12 }are some constants which do not
depend neither on the number of structure factors M nor on the number of atoms
N. This estimate is the same as (1) for an *approximate*
matrix where the second-order terms _{}
are neglected from the beginning and is better than (5) which
is obtained for the *exact *matrix by simple N-times calculation of the
product of a matrix by a direction following a coordinate axis.

The same amount of operations is enough to calculate the matrix for
the intensity least-squares criterion (Sheldrick & Schneider, 1997), for
the phase criterion (Lunin & Urzhumtsev, 1985; Pannu *et al*., 1998), for the maximum likelihood* *criterion (Bricogne & Irwin, 1996; Pannu & Read, 1996;
Read, 1997; Murshudov *et al*., 1997; Pannu *et al*.,1998).

In the general situation when the criterion cannot be represented by a sum of contributions from many small subsets of structure factors, the total number of computer operations becomes :

*T*_{HG} = C_{6}M² lnM + C_{3}N²
(11)

If the
independent parameters are not atomic ones (*e.g.,*
parameters for a rigid groups refinement), the matrix with respect to
independent parameters can be calculated in the same way.

The formula (7)
gives also an idea that for some special cases the inverted matrix of the
second derivatives can be obtained directly without calculation the Hessian
matrix itself. Indeed, if the transformation **y**(**x**) is linear, **y** = [A] **x**, then

If [A]
corresponds to the Fourier transformation, the inverse operation is the inverse
Fourier transform for which the matrix [A^{-1}] can be written
immediately. The matrix _{}
can be easily calculated for many crystallographic criteria,
in particular for such important criteria as the least-squares or maximum
likelihood functionals. Therefore, when the independent parameters are density
values at the grid points the inverse Hessian matrix can be easily and directly
calculated for these criteria as

where all
elements of the matrix (13) are presented by the same function calculated as a
Fourier series at different points **r**
. In order to calculate this function at a grid compatible with the number of
Fourier coefficients M, estimating K ~ M , the number of operations needed is
about C_{2}M lnM (Cooley & Tukey, 1965; Ten Eyck, 1973). This
results shows that the minimisation methods of simple iteration, usually
applied for density modification procedures, can be replaced not only by the
gradient methods (Sayre, 1972, and Sayre & Toupin, 1975, for the particular
Sayre criterion; Lunin, 1985, for the general case) but even by the methods of
the second order. In this case the computational expenses are practically the
same as those for the simple iteration methods.

In the case of
the crystallographic least-squares or maximum likelihood refinement of atomic
model, the *exact* matrix of second
derivatives can be calculated by

*T*_{HC} = C_{12 }M lnM + C_{3}N² (14)

Operations where
M is the number of structure factors, N is the number of atomic parameters and
C_{12} and C_{3 }are some constants which do not depend neither
on M nor on N. This algorithm suggests step-by-step recalculation of the matrix
with respect to variables of different levels of the molecular models
(structure factors, density, atomic parameters etc.). The same iterative
calculations are basic steps for the fast gradient calculation (Lunin &
Urzhumtsev, 1985). It should be noted that such way of calculation allows
easily to add the contribution from any other criteria of the same type or of
any other type of models, *e.g.,* phase
criterion, stereochemical criteria, criteria depending on the electron density
etc. Therefore, the formulae which give the expression of the gradient (or of
the Hessian matrix) of a crystallographic criterion directly in terms of atomic
parameters can be useful for understanding but may be rather misleading
algorithmically.

The full material with details of corresponding derivations will be reported elsewhere (paper in preparation for Acta Crystallographica).

The work was supported by RFBR grant 00-04-48175 (VYL). The authors thank A.G. Kushnirenko and K.V. Kim who attracted their attention to the fast differentiation algorithm and C. Lecomte and Centre Charles Hermite (Nancy) for their support of the work

**References**

Agarwal, R.C. (1978) *Acta Cryst*., **A34**, 791-809.

Agarwal, R.C. (1981) In *Refinement of Protein Stryctures*:
Proceeding of the Daresbury Study Weekend 15-16 November, 1980; compiled by
P.A.Machin, J.W.Campbell and M.Elder, pp. 24-28. Warrington : Science and
Engineering Research Council, Daresbury Laboratory.

Baur, W. &
Strassen, V. (1983). *Theoretical Computer
Science*, **22**, 317-330.

Bricogne,
G., Irwin, J. (1996) In *Macromolecular
Refinement*: Proceedingof the CCP4 Study Weekend, E.Dodson, M.Moore, A.Ralph
& S.Bailey, eds., pp.85-92. Warrington : Daresbury Laboratory.

Brünger, A.T. (1992) *Nature*,** 355**, 472-474.

Cooley, J.W. &
Tukey, J.W. (1965) Math.Comput., **19**,
297-301

Cowtan, K. & Ten
Eyck, L.F. (1998) *ECM-18 Abstracts,* *XVIIIth European Cryst.Meeting, 16-20 August
1998, Prague, Republic Czeck*, *E5-P7,*
*Bulletin of the Czeck and Slovak
Crystallographic Association, ***5A***, *136

Dodson, E. (1981) In *Macromolecular Refinement*: Proceeding of
the CCP4 Study Weekend, E.Dodson, M.Moore, A.Ralph & S.Bailey, eds., pp.
85-92. Warrington : Daresbury Laboratory.

Guillot, B., Viry,
L., Guillot, R., Lecomte, C., Jelsch, C. (2001) *J.Appl. Cryst*., **34**, in
press.

Hansen, N.K. &
Coppens, P. (1978) *Acta Cryst*., **A34,** 909-921.

Hendrickson, W.A.
& Konnert, J.H. (1980) In *Biomolecular
Structure, Function, Conformation and Evolution*, edited by R.S.Srinivasan,
Vol. 1, 43-57. Oxford : Pergamon.

Hestenes, M.R. &
Stiefel, E. (1952) *J.Res.Nat.Bur.Standards*,
**49**, 409-436.

Kalinin,
D.I. (1981) *Kristallographiya*, **25**, 535-544.

Kim, K.M., Nesterov,
Yu.E. & Cherkassky, B.V. (1984) *Dokl.Acad.Nauk
SSSR*, **275**, 1306-1309.

Lanczos, C. (1952) *J.Res.Nat.Bur.Standards*, **49**, 33-53.

Lunin, V.Yu. (1985). *Acta Cryst*. A**41**, 551-556.

Lunin, V.Yu. &
Skovoroda, T.P. (1995). *Acta Cryst*. A**51**, 880-887.

Lunin, V.Yu. &
Urzhumtsev, A.G. (1985) *Acta Cryst.*, **A41**, 327-333

Lunin, V.Yu. & Urzhumtsev,
A.G. (1999) *CCP4 Newsletter on Protein
Crystallography,* **37**, 14-28

Murshudov, G.N.,
Vagin, A.A. & Dodson, E.J. (1997) *Acta
Cryst*. D**53**, 240-255.

Pannu, N.S.,
Murshudov, G.N., Dodson, E.J. & Read, R.J. (1998) *Acta Cryst*. D**54**,
1285-1294

Pannu, N.S. &
Read, R.J. (1996) *Acta Cryst*. A**52**, 659-668.

Read, R.J. (1997) In *Methods in Enzymology*, Academic Press,
San Diego., C.W.Carter, Jr., R.M.Sweet, eds., **277B**, 110-128.

Sheldrick, G.M. &
Schneider, T.R. (1997). In *Methods in
Enzymology*, Academic Press, San Diego., C.W.Carter, Jr., R.M.Sweet, eds., **277B**, 319-343.

Sayre, D. (1951) *Acta Cryst*., **4**, 362-367

Sayre, D. (1972) *Acta Cryst*., A**28**, 210-212

Sayre, D. &
Toupin, R.A. (1975) *Acta Cryst*., A**31**, S20

Sussman, J.L.,
Holbrook, S.R., Church, G.M. & Kim, S.-H. (1977) *Acta Cryst*., **A33**,
800-804.

Templton, D. (1999) *Acta Cryst*., **A55**, 695-699.

Ten Eyck, L.F. (1973)
*Acta Cryst*., **A29**, 183-191.

Ten Eyck, L.F. (1977)
*Acta Cryst*., **A33**, 486-492.

Ten Eyck, L.F. (1999)
*IU**
Collected Abstracts,* *XVIIIth IUCr
Congress & General Assembly, 4-13 August 1999, Glasgow, Scotland*,* *97.

Tronrud, D.L. (1992) *Acta Cryst*., **A48**, 912-916.

Tronrud, D.L. (1999) *Acta Cryst*., **A55**, 700-703.

Urzhumtsev, A.G.,
Lunin, V.Yu. & Vernoslova, E.A. (1989) *J.Appl.Cryst.*,
**22**, 500-506

Newsletter contents...