(file) Return to fft.tex CVS log (file) (dir) Up to [CCP4] / ccp4 / manual

File: [CCP4] / ccp4 / manual / fft.tex (download) / (as text)
Revision: 1.6, Tue Dec 21 17:40:21 2004 UTC (5 years, 8 months ago) by ccb
Branch: MAIN
CVS Tags: series-6_1-root, series-6_1, series-6_0_99e-root, series-6_0_99e, series-6_0_99d-root, series-6_0_99d, release-6_1_3, release-6_1_24, release-6_1_2, release-6_1_13, release-6_1_1, release-6_1_0, release-6_0_patch, release-6_0_2, release-6_0_1, release-6_0, pre_xia2_remove, pre-merge-release-6_1_3, pre-merge-6_0_99e, pre-merge-6_0_99d, pre-merge-20_4_2009, pre-merge-13_08_2009, post_merge-13_08_2009, post-merge-release-6_1_3, post-merge-6_0_99d, post-merge-20_4_2009, merge-release_6_1_0, merge-6_0_patch_240407, merge-6_0_patch_100907, merge-5_8_2008, branch-merge-20_4_2009, branch-merge-13_08_2009, HEAD
Changes since 1.5: +2 -1 lines
replace fft by fftbig


\chapter[The Fast Fourier Transformation]
{The Fast Fourier
  Transformation%\protect\footnote{By Wojtek Rypniewski}
\\  --- or some of the things you always wanted to know about the
  FFT but were afraid to ask}

\begin{chapquote}
  ``Of course,'' the Mock Turtle said: ``advance twice, set to
  partners---''\\
  ``---change lobsters, and retire in same order,'' continued the
  Gryphon.\\
  --- {\em Alice's Adventures in Wonderland}
\end{chapquote}

\section{Introduction}\index{FFT@\acronym{fft}}
There must be very few algorithms that have made as great an impact on science
as the Fast Fourier Transform
(\acronym{fft}).\index{Gauss, C. F.@C.~F.~Gau\ss{}}%
\footnote{C.~F.~Gau\ss{} reputedly used the equivalent of real-valued
  FFTs in 1805, but no surprise there.  There were independent discoveries of
  FFT algorithms in modern times preceding that of Cooley and Tukey, often
  referred to as the discoverers.}
It was Lynn Ten Eyck
\shortcite{teneyck-73,teneyck-77} who introduced the \acronym{fft} to
crystallography and we have been using it ever since.  Whenever we do
something that involves a calculation of an electron density map or structure
factors i.e., moving between real and reciprocal spaces, it is almost certain
that the program we use has an \acronym{fft} algorithm embedded somewhere
inside it.  We can tell if it does not---the calculation is orders of
magnitude slower than what we have come to expect.  In addition to the above
uses, the \acronym{fft} has an important role in molecular replacement, in the
calculation of the rotation and translation functions.  This development is
mostly due to Tony Crowther \shortcite{crowther-72}, who showed how the
rotation function can be calculated by an \acronym{fft} if the Patterson
function is expanded in spherical harmonics, and David Blow
\cite{crowther/blow-67}, who applied the \acronym{fft} to translation
functions.  

\section[How it works]{How it works\footnote{It is `interesting' to note that there are various software \idx{patents}
    connected with the FFT and similar transforms which monopolies may
    restrict your freedom to use these algorithms; (this is of course
    impossible).\index{Adams, Douglas} Such software monopolies are a
    serious threat to \idx{free software} and quasi-free software such
    as \ccp4.  Oppose software patents and interface copyrights!}}

Although it is not necessary to know anything about the mechanism of the
\acronym{fft} in order to use it, it might still be interesting---especially
if the explanation is reasonably understandable.  A good description of the
\acronym{fft} is given in \cite{numerical-recipes}.  It goes like this:

If we have a discrete Fourier transform of length $N$ we can split it into
two transforms of length $N/2$ each.  One of them is formed from the
even-numbered terms and the other from the odd-numbered terms:
\begin{eqnarray}
  F_h & = & \sum_{j=0}^{N-1} {f_j e^{2\pi i j h/N}}\nonumber\\
      & = & {\sum_{j=0}^{N/2-1} {f_{2j} e^{2\pi i h (2j)/N}}} +
      {\sum_{j=0}^{N/2-1} {f_{2j+1}e^{2\pi i h(2j+1)/N}}} \nonumber\\
      & = & {\sum_{j=0}^{N/2-1} {f_{2j} e^{2\pi i j h/(N/2)}}} +
      W_h {\sum_{j=0}^{N/2-1} {f_{2j+1}e^{2\pi i h j/(N/2)}}}\nonumber\\
      & = & F_h^{\mr{even}} + {W_h F_h^{\mr{odd}}},
\end{eqnarray}       
where $W_h = e^{2\pi ih/N}$.  

Depending on $N$ we can continue dividing the original transform.  This is
known as {\sl factoring}.  We then get a series of terms like $F_k^{\mr{even,
    odd, even\dots}}$, in this case, for a Fourier transform of points that
are successively even, odd, even etc.\ in the successive subdivisions of the
data.  In the extreme case, when $N$ is some power of two, we can divide the
data all the way down to transforms of length one.  A transform of length one
is just an identity operation.  Now the transform consists of terms like
\begin{equation}
  F_k^{\mr{even, even, odd, even, odd, odd \ldots\ even}} = f_n.
\end{equation}
The problem now is with book-keeping---we need to know what $n$ is.  The
answer is as follows.  Let $\mr{even}=0$, $\mr{odd}=1$ and then read the
resulting patterns backwards i.e., reverse the bits, and you get in binary the
value of $n$.  It works because the successive subdivision of data into even
and odd are tests of successive least significant bits of $n$.  If we now
rearrange the data into the bit-reversed order, the transform can be
constructed by combining adjacent pairs of points to get two-point transforms,
then the adjacent pairs of pairs are combined to get four-point transforms and
so on until the first and the second half of the data is combined into the
final transform.

If you have a mathematical or functional programming bent and want a
deep understanding of the algorithm, consult \cite{jones-89} for a
{\em calculation\/} of it.

\subsection{Example}

Let's take a simple example of an 8-point transform.  The coefficients of $F$s
of the successive divisions of data are:
\begin{displaymath}
  \begin{array}{*{15}{c}}
           &   &&       e \:\rlap{(=even)\hss}   &&&&&&&                    o\:
           \rlap{(=odd)}\\

           & ee&       &&       &eo&     &&         &oe&      &&        &oo&\\

        eee  &&   eeo  &&   eoe  &&   eoo  &&   oee  &&   oeo  &&   ooe  &&   ooo
\end{array}
\end{displaymath}
In binary this becomes
\begin{displaymath}
  000\quad 001\quad 010\quad 011\quad 100\quad 101\quad 110\quad
  111,
\end{displaymath}
and with the bits reversed:
\begin{displaymath}
  000\quad 100\quad 010\quad 110\quad 001\quad 101\quad 011\quad
  111,
\end{displaymath}
which in decimal corresponds to elements numbered $ 0\: 4\: 2\: 6\: 1\: 5\:
3\: 7$.  This means that we first have to combine element 0 with 4 (adjacent
pairs on the above list), 2 with 6, 1 with 5 and 3 with 7 to get two-point
transforms.  (N.b., each element is normally a complex number).

Next we combine the resulting elements 0 with 2, 1 with 3, 4 with 6 and 5 with
7 to get 4-point transforms.  Finally we combine the resulting elements 0 with
1, 4 with 5, 2 with 3 and 6 with 7 to get the complete 8-point transform.
Done.

\subsection{Why is it fast?}

Let's now see how fast the \acronym{fft} is compared with the traditional,
`brute force' approach.  First the data have to be rearranged into the bit
reversed order.  That takes very little time and requires no extra
memory as it only involves swapping pairs of data.
The transform itself is constructed in $\log_2 N$ sweeps through the data
i.e., the number of times all the $N$ data points can be factored---three in
the above example.  At each level data points are combined pairwise.  This
means the number of operations in each sweep is proportional to $N$\@.  Thus
the whole algorithm is of order $N\log_2 N$.  Such $O(N\log N)$ asymptotic
complexity is typically associated with `divide-and-conquer' algorithms like
this.  The transform can be parallelised and vectorised.

What about the non-\acronym{fft} approach?  From the expression
\begin{equation}
  F_h = \sum_{j=0}^{N-1}{f_j e^{2\pi ijh/N}}
\end{equation}
we see that for every $F_h$, numbering $N$, we cycle over all $f_j$s, (also
numbering $N$)\@.  Thus the number of operations is proportional to $N^2$.

What is the difference between $N^2$ and $N\log_2 N$\@?  For $N=1000$ the
factor is 100.  For $N=1000000$ the factor is 50000---roughly the difference
between one minute and one month.  Moral: beware of algorithms with
quadratic (or higher) complexity.

\section{Crystallographic \acronym{fft}}

So far we have only discussed the case where the number of data points $N$ is
a power of two.  What if $N$ cannot be factored all the way to single points?
There are other, highly optimised, algorithms tailored for handling transforms
of various lengths.  In fact, the crystallographic \acronym{fft} does not
require $N$ to be a power of 2, but in general $N$ has to be a `nice' number.
No prime factors greater than 19 are permitted.  Additional requirements have
to be satisfied for specific space groups.

In crystallographic applications the \acronym{fft} is additionally optimised
to take advantage of crystallographic symmetry.  The application of the
\acronym{fft} in crystallography is explained in full by Ten Eyck
\shortcite{teneyck-73,teneyck-77,teneyck-85}.

Spacegroup-specific transforms use symmetry to speed up the
calculation but are more likely to have bugs, especially in uncommon
spacegroups.  Use of P1 is thus recommended if speed isn't a problem;
if it is, test the higher-symmetry version against P1.

\section{Programs}

The \ccp4 suite contains several programs making use of the \ac{fft}, e.g:
\cprog{fft} (based on Ten Eyck's original) calculates Fouriers, difference
Fouriers, Pattersons and difference Pattersons from reflection data;
\cprog{sfall} calculates structure factors and X-ray gradients for refinement
using inverse and forward \ac{fft}s; \cprog{almn} calculates rotation function
overlap values using \ac{fft} techniques; etc.~\dots{}

\cprog{fftbig} is an in-core version of \cprog{fft} which runs in P1
only.  It may be faster you have enough {\em real\/}
memory available---otherwise it will cause page thrashing.  This has replaced
the original fft program.

\endinput

% Local Variables: 
% mode: latex
% TeX-master: "manual"
% End: 

ccp4@ccp4.ac.uk
Powered by
ViewCVS 0.9.3