RSPS (CCP4: Supported Program)


rsps - heavy atom positions from derivative difference Patterson maps.


[Keyworded input]


Program name:
RSPS, Version 4.2
F77 with exceptions as listed below
Unix, Windows, MacOS X
All acentric spacegroups
Stefan D. Knight
Stefan Knight,
SLU, Dept. of Mol. Biol.,
BMC, Box 590,
S-751 24 Uppsala, Sweden.
Phone: + 46 - (0)18 471 45 54
Knight, S.D. (1999): RSPS version 4.0: a semi-interactive vector-search program for solving heavy-atom derivatives, Acta Cryst. D
Knight, S.D. (1989): Ribulose 1,5-Bisphosphate Carboxylase/ Oxygenase - A Structural Study, Thesis, Swedish University of Agricultural Sciences, Uppsala.


  • Description
  • Commands
  • Input and output files
  • Patterson map requirements
  • Summary of RSPS search options
  • Examples
  • Program structure
  • Miscellaneous comments on the use of the program
  • Input changes from 3.2 to 4.2


    RSPS is a command-driven program intended to help protein crystallographers solve their heavy atom derivatives. The program can also be used as an interactive tool to examine the fit of potential heavy atom sites to the difference Patterson map. The program can handle all acentric spacegroups. RSPS version 4.0 may also be used to locate molecules with NCS. The goal of RSPS is not to generate a complete solution to the heavy atom difference Patterson, but rather to find enough sites to allow initial phases to be calculated for difference Fourier analysis. The program as such will not provide absolute answers and it is therefore very useful if you have at least a rudimentary understanding of the Patterson function, and of how to solve it, before you start using RSPS. See for example Stout & Jensen (1989), chapter 12; Blundell & Johnson (1976), chapter 11.

    RSPS is a grid search program that provides search options as well as options for examining potential solutions. All options operate in real and vector space. Searches can be performed to locate either heavy atom positions, or, under certain conditions, to locate the position of molecules with internal symmetry. Searches are carried out by assigning trial positions on a grid covering the asymmetric unit of the crystal, and then computing a score for each trial position, based on the Patterson densities at the positions corresponding to the predicted vectors for each position. From the symmetry operators (crystallographic and/or non-crystallographic) all unique transformations that map a point in real (crystal) space to a point in vector (Patterson) space are generated. In other words, these transformations map a point in real space to the Patterson vectors associated with that point.
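
    The mapping from a real-space point to its Patterson vectors can be illustrated with a small sketch. It assumes P212121 as the example spacegroup and writes the symmetry operators by hand; the function names are hypothetical and are not part of RSPS.

```python
from fractions import Fraction as F

H = F(1, 2)
# Assumed example: the four general positions of P212121, written as
# (rotation matrix, translation) pairs acting on fractional coordinates.
P212121 = [
    (((1, 0, 0), (0, 1, 0), (0, 0, 1)), (0, 0, 0)),
    (((-1, 0, 0), (0, -1, 0), (0, 0, 1)), (H, 0, H)),
    (((-1, 0, 0), (0, 1, 0), (0, 0, -1)), (0, H, H)),
    (((1, 0, 0), (0, -1, 0), (0, 0, -1)), (H, H, 0)),
]

def apply(op, x):
    """Apply one symmetry operator to a fractional coordinate triple."""
    rot, trans = op
    return tuple(sum(rot[i][k] * x[k] for k in range(3)) + trans[i]
                 for i in range(3))

def harker_vectors(ops, x):
    """Vectors between all pairs of symmetry mates of x, reduced to [0, 1)."""
    mates = [apply(op, x) for op in ops]
    vectors = set()
    for i, a in enumerate(mates):
        for j, b in enumerate(mates):
            if i != j:
                vectors.add(tuple((a[k] - b[k]) % 1 for k in range(3)))
    return sorted(vectors)

site = (F(1, 10), F(1, 5), F(3, 10))
vecs = harker_vectors(P212121, site)
```

    In a search, the Patterson density would be looked up at each of these vectors for every trial position on the grid.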

    Search options are divided into six groups depending on the vector set to be used in the search. The available sets are of two main types: atom vector sets that are used to search for the position of heavy atoms, and molecule vector sets that are used to find the position of molecules with NCS when at least one NCS axis is closely parallel to a crystallographic symmetry axis. The two main types of vector sets each have three subcategories termed single, more, and translate. With single vector sets (selected by VECTORSET SINGLE), the position of one site at a time is determined by considering vectors between symmetry related positions. This type of search is referred to as a single site search. When only SGS (spacegroup symmetry) is used in a single site search, this corresponds to using only Harker vectors, i.e. vectors between SGS-related positions. When NCS is applied, cross vectors that may be termed pseudo-Harker vectors will be generated between NCS-related positions. The combination of SGS and NCS will in addition generate cross vectors between positions on different copies of the NCS protein molecule. Once at least one atom position (or NCS molecule) has been found, it may be fixed and used to search for further sites by considering cross vectors from trial sites to the fixed site(s) (using VECTORSET MORE). VECTORSET TRANSLATE is used to search simultaneously for two or more atoms provided that their inter-atomic vectors are known. Both SGS and NCS may be independently switched on and off, giving a very flexible means of controlling the type of vectors to be used in the search.

    With the search options, a volume of the unit cell (usually one asymmetric unit) is scanned, and test points assigned on a grid within this volume. In single and more site searches, the scan parameter is the coordinates of a heavy atom position whereas in translate mode the scan parameter is the translation of the rigid fragment. For each test point, the values of the Patterson function at the predicted Patterson vector positions are collected and combined in some way (presently sum, product and harmonic mean functions are available). All or only the minimum N peaks may be used in the scoring function. The value of the combining function (the "score") is stored on a CCP4 format map (the "scoremap", defined with the SCORFILE command). A rejection level for peaks may be specified (REJECT command) together with a limit for the maximum number of peaks allowed to be less than the rejection level (LOW command). When this limit is passed, further work on that test position is aborted and it is given a score of zero. In this way computation for obviously wrong solutions may be aborted at an early stage which may considerably reduce the time needed for the search. The combined use of this rejection scheme and judicious use of the minimum function leads to a flexible set of scoring options well suited to accommodate the varying degree of "vectorness" of a protein difference Patterson function.
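
    The scoring and rejection scheme described above might be sketched as follows. This is an illustration only: the function name and arguments are hypothetical, the separate min(n) function is omitted, and RSPS aborts a test point as soon as the LOW limit is passed whereas the sketch counts all low peaks at once.

```python
import math

def score_position(densities, function="sum", n_smallest=None,
                   reject=None, low=0):
    """Combine Patterson densities sampled at the predicted vectors
    of one trial point into a single score."""
    # Rejection: more than `low` peaks fall below the rejection level.
    if reject is not None and sum(d < reject for d in densities) > low:
        return 0.0
    used = sorted(densities)
    if n_smallest is not None:
        used = used[:n_smallest]      # keep only the n smallest peaks
    if function == "sum":
        return float(sum(used))
    if function == "product":
        return float(math.prod(used))
    if function == "harmonic":
        # Assumption: a non-positive density gives a zero score, since the
        # harmonic mean is not defined for it.
        if any(d <= 0 for d in used):
            return 0.0
        return len(used) / sum(1.0 / d for d in used)
    raise ValueError(f"unknown scoring function: {function}")
```

    For example, with densities 4, 2 and 1 and a rejection level of 1.5, the point is rejected when LOW is 0 but scored normally when LOW is 1.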

    The map resulting from a search will contain peaks at the positions of possible heavy atom (or molecule) sites and at positions related to these by a shift in origin or by inversion. The scoremap may be picked using the PICK SCOREMAP command to generate a list of potential heavy atom sites. Coordinates of (potential) heavy atom sites may also be read from an external file using the READ command. Coordinates are stored in the main coordinate array (see below). Picking or reading new coordinates will overwrite any previously stored coordinates. Coordinates for up to 800 positions may be stored. Each position is given a position number, which is just a sequential numbering of the stored coordinates, and a site number that groups together positions that generate the same set of Harker vectors. The site number thus groups positions related by inversion and/or origin shift (or spacegroup symmetry), i.e. different representations of the same solution. The assignment of site numbers may be quite slow for high-symmetry spacegroups. Also note that, due to rounding errors, positions with the same site number will not necessarily have exactly the same score, although in most cases they will. The position and site numbers may be used to reference positions for use with the VLIST and FIXXYZ commands.

    Given a list of potential heavy atom positions, the GETSETS option may be used to search for sets of positions. This is done by looking at the cross vectors between all pairs of atoms (and their symmetry related equivalents) in the list. If a pair of positions passes the rejection criteria specified by REJECT and LOW, the pair is flagged as connected; otherwise it is flagged as unconnected. The program then finds all sets where all pairs of positions are flagged as connected. The output from GETSETS consists of, for each set, the coordinates of the positions in the set and a score table giving the score for the vectors generated by these positions. The TABLE command may be used to write out the score table for any currently stored set, or for a user defined set. The command LIST SETS will give a summary listing of all stored sets.
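
    Finding sets in which every pair of positions is connected is a clique problem on the graph of connected pairs. The sketch below uses Bron-Kerbosch enumeration of maximal cliques; this is a technique swapped in for illustration, and nothing in this document says that RSPS works this way or reports only maximal sets. All names are hypothetical.

```python
def find_sets(n_positions, connected, minset=3):
    """Enumerate maximal sets in which every pair of positions is connected.

    `connected` holds frozenset({i, j}) pairs that passed the rejection test.
    """
    neighbours = {i: set() for i in range(n_positions)}
    for pair in connected:
        i, j = tuple(pair)
        neighbours[i].add(j)
        neighbours[j].add(i)

    found = []

    def expand(current, candidates, excluded):
        # A set is maximal when nothing can extend it any further.
        if not candidates and not excluded:
            if len(current) >= minset:
                found.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v},
                   candidates & neighbours[v],
                   excluded & neighbours[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(range(n_positions)), set())
    return sorted(found)

# Positions 0, 1, 2 are mutually connected; 3 and 4 hang off as a chain.
pairs = {frozenset(p) for p in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]}
sets_of_three = find_sets(5, pairs, minset=3)
```

    The minset argument mirrors the first argument of the getsets command (default 3).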

    A single site search in a polar spacegroup will result in a list of potential positions where one coordinate has not been determined (this will have been arbitrarily set to zero by the program). The POLARSCAN option may then be used to try and relate different solutions from the single site search to the same origin by fixing one position and translating the others, one at a time, along the polar axis. Scores based on all the cross vectors between the fixed and the translated position (and their symmetry-related equivalents) are computed as a function of displacement along the polar axis and stored. This type of search does not produce a scoremap, instead the coordinates are put directly in the main positions storage area where they will overwrite any previously stored coordinates.
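
    A toy version of such a scan along the polar axis is sketched below. It assumes P21 with b as the polar axis, builds a synthetic "Patterson" from the interatomic vectors of two known atoms, and scores each displacement by counting the cross vectors that land on a peak. The counting score and all names are assumptions made for the sketch; RSPS uses the scoring functions selected with the score command.

```python
from fractions import Fraction as F

H = F(1, 2)
# Assumed example spacegroup: P21, unique (polar) axis b.
P21 = [
    (((1, 0, 0), (0, 1, 0), (0, 0, 1)), (0, 0, 0)),
    (((-1, 0, 0), (0, 1, 0), (0, 0, -1)), (0, H, 0)),
]

def mates(ops, x):
    """All symmetry mates of a fractional position."""
    return [tuple(sum(r[i][k] * x[k] for k in range(3)) + t[i]
                  for i in range(3)) for r, t in ops]

def cross_vectors(ops, a, b):
    """Vectors between every mate of a and every mate of b, in [0, 1)."""
    return [tuple((p[k] - q[k]) % 1 for k in range(3))
            for p in mates(ops, a) for q in mates(ops, b)]

def polar_scan(ops, fixed, moving, peaks, steps=20, axis=1):
    """Score each displacement of `moving` along the polar axis."""
    scores = {}
    for step in range(steps):
        shift = F(step, steps)
        trial = tuple(c + shift if k == axis else c
                      for k, c in enumerate(moving))
        scores[shift] = sum(v in peaks
                            for v in cross_vectors(ops, fixed, trial))
    return scores

# Synthetic Patterson peaks from a two-atom "true" structure.
atom_a = (F(1, 10), F(0), F(1, 5))
atom_b = (F(3, 10), F(1, 4), F(2, 5))
peaks = (set(cross_vectors(P21, atom_a, atom_b))
         | set(cross_vectors(P21, atom_b, atom_a)))

# The single site search leaves y undetermined (set to zero) for atom b.
scores = polar_scan(P21, atom_a, (F(3, 10), F(0), F(2, 5)), peaks)
best = sorted(s for s, v in scores.items() if v == max(scores.values()))
```

    In this toy case two displacements (1/4 and 3/4) score equally well, because the fixed atom sits at y = 0 and the Patterson function cannot distinguish the two mirror-related placements.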

    The VLIST option is used to examine potential solutions. If the Patterson map has been picked (PICK PATTERSON) the list of stored peaks is searched to find peaks close to the predicted vectors. The largest peak within 2.5 grid divisions from a predicted vector is listed together with the distance (in Angstrom) from the predicted vector. Note that the Patterson peaks as such are never used in a search.

    There are two areas for storing coordinates: the main coordinate array and the FIXXYZ coordinate array. Coordinates from picking the scoremap, as well as coordinates read from a file using the READ command, are stored in the main coordinate array. When new positions are stored here, they overwrite any previously stored coordinates. Coordinates may be inserted in the FIXXYZ coordinate array by copying from the main coordinate array or by explicitly giving the fractional coordinates. Coordinates in the FIXXYZ area are stored until explicitly deleted (DELETE FIXXYZ). The number of positions that may be stored in each coordinate storage area is defined by the MXPICK and MXIPSN parameters. These parameters are set in the include file. Values of the parameters in the distributed program are listed under "Space limitations" below.


    Commands are listed below in bold face, arguments in italics, optional subcommands and arguments are given in parentheses (), and defaults in square brackets []. Angle brackets < > are used to enclose a list of alternative subcommands or arguments, separated by |. Curly brackets {} indicate that the enclosed subcommand may be repeated, a suffixed number sometimes indicating the maximum number of repetitions. Several commands may be entered on a single line separated by a semicolon (";") (exception: the macro command "@ filename" must be on a line by itself), and commands may be abbreviated. A command line may be continued over several input lines by appending a hyphen ("-") or ampersand ("&") at the end. Arguments and subcommands should be separated by tabs or spaces. Commands may be given in upper or lower case. Entering a command without its argument(s) echoes back the current value of the argument(s). Anything following an exclamation mark ("!") or a hash sign ("#") is treated as a comment and ignored, as are blank lines.

    The available commands are:

    Scoring commands:
    Score, Weight, Tolerance, Bump, Reject, Low.
    Symmetry commands:
    Cell, Spacegroup, Sgsymm, Ncsrot, Ncsymm.
    Patterson and score map commands:
    Patfile, Scorfile.
    Coordinate manipulation commands:
    Fixxyz, Rotate, Add/Subtract, Read, Write.
    Searching and checking commands:
    Vectorset, Scan, Polarscan, Getsets, Vlist, Table, Pick.
    Miscellaneous commands:
    @ filename, Print, Status, List, Delete, Help, Exit, Quit.

    Alphabetic listing:

    @ filename, add/subtract, bump, cell, delete, exit, fixxyz, getsets, help, list, low, ncsrot, ncsymm, patfile, pick, polarscan, print, quit, read, reject, rotate, scan, score, scorfile, sgsymm, spacegroup, status, table, tolerance, vectorset, vlist, weight, write

    Scoring commands

    Score (< [sum n [all]] | product n [all] | harmonic n [all] | minimum n [1] >)

    Selects scoring function:
    Sum = sum function, product = product function, harmonic = harmonic mean function. An argument n may be given to specify that only the n smallest peaks should be used in scoring (the default is to use all peaks). The command score minimum n may alternatively be given as a separate command to select the min(n) function.

    Weight (n [1])

    Determines how the peaks are weighted in the score function:
    n   = 1    w(i) = 1.0
    n   = 2    w(i) = mult(i) ; the multiplicity of peak i.
    The Patterson peaks are multiplied by w(i) before forming the score so that using weight 2 with the sum function results in a correlation function between the observed and predicted (but unscaled) Patterson vectors.

    Tolerance (pektol [0])

    When scoring, the peak density used is the maximum value found within a box of ±pektol grid units around the predicted vector position. Setting pektol = 1 allows for slight initial misplacement of positions in the getsets and polarscan options, where cross vectors between pairs of fixed positions are used, and also for small errors in non-crystallographic symmetry.
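
    The box lookup can be pictured with the following sketch. It assumes the map is held as a nested list covering a full unit cell, so that indices wrap around periodically; the function name and data layout are hypothetical.

```python
def peak_density(grid, index, pektol=0):
    """Largest density within +/- pektol grid units of a predicted vector.

    `grid` is indexed as grid[u][v][w] and is assumed to cover a full
    unit cell, so neighbouring indices wrap around.
    """
    nu, nv, nw = len(grid), len(grid[0]), len(grid[0][0])
    u0, v0, w0 = index
    best = None
    for du in range(-pektol, pektol + 1):
        for dv in range(-pektol, pektol + 1):
            for dw in range(-pektol, pektol + 1):
                value = grid[(u0 + du) % nu][(v0 + dv) % nv][(w0 + dw) % nw]
                if best is None or value > best:
                    best = value
    return best

# A 4 x 4 x 4 map of zeros with two isolated peaks.
grid = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
grid[1][1][2] = 5.0
grid[3][0][0] = 9.0
```

    With pektol = 0 only the grid point itself is used; with pektol = 1 the 27 points of the surrounding box are examined.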

    Bump (dist [3.5])

    Shortest allowed distance (in Angstrom) between symmetry related positions (using single and translate vectorsets), or between a fixed position and a new position (using more vectorsets). If a negative value is given then null vectors are still allowed, i.e. special positions will be included in the search (applies to single vectorsets only; when using more vectorsets no check is made for special positions).

    Reject (rlevel [1*sigma])

    Rejection level for peaks, in units of the standard deviation of the Patterson map.

    Low (n [0])

    Maximum number of peaks/search point (peaks/atom pair with getsets) allowed to be less than rlevel (the value specified with the reject command).

    Symmetry commands

    Cell (a b c alpha beta gamma)

    Unit cell parameters in Angstrom and degrees. Cell parameters are normally read from the Patterson map header or the coordinate file header but for some operations it may be necessary to specify them explicitly. The best thing to do is probably to stick them at the top of a "startup" macro together with the spacegroup definition, file names etc.

    Spacegroup (< spgnum | spgnam >)

    Read spacegroup symmetry from library file. The spacegroup may be specified by number or name. The spacegroup command must be specified before the patfile command.

    Sgsymm (< on | off >)

    Switch use of crystallographic symmetry on/off.

    Ncsrot (< matrix r11 r12 r13 r21 r22 r23 r31 r32 r33 t1 t2 t3 | polar omega phi kappa t1 t2 t3 | n-fold along < x | y | z | l m n > (through t1 t2 t3) | ortho code [1] >)

    Read non-crystallographic symmetry. NCS may be specified in one of three different formats: either

    1. as a matrix given row by row followed by three translational components (fractional),

    2. as polar angles + translations (translation applied after rotation: xrot = Rx + t ),

    3. or by defining the order of the rotation axis using the n-fold along construct, where n = 2, 3, 4, or 6. The direction of the rotation axis can be given as three direction cosines l, m, n, or in shorthand notation as x, y, or z if the rotation axis is parallel to one of the crystallographic axes. The location of the axis is specified by giving the fractional coordinates of a point on the axis with the through subcommand.

    Each symmetry operation except the identity must be explicitly given, even in the case of proper symmetry.

    Ortho code specifies the orthogonalized frame in which the given non-crystallographic symmetry operates:
    code = 1,  orthogonal x y z  along  a,c*xa,c*  (Brookhaven, default)
         = 2                            b,a*xb,a*
         = 3                            c,b*xc,b* 
         = 4                            a+b,c*x(a+b),c*
         = 5                            a*,cxa*,c  (Rollett)
    To initialize the NCS arrays use delete ncs. The cell dimensions must be known prior to reading non-crystallographic symmetry. It is useful to put the NCS definition in a macro.
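
    Since every non-identity operation must be given explicitly, it can help to see what the operations of an n-fold axis look like as matrices. The sketch below builds them from the direction cosines with the standard axis-angle (Rodrigues) formula in an orthogonal frame. It is an illustration with a hypothetical function name, not the code RSPS uses, and the translational part is left out.

```python
import math

def nfold_rotations(n, axis):
    """Rotation matrices for the n-1 non-identity operations of an
    n-fold axis with direction cosines `axis` (orthogonal frame)."""
    norm = math.sqrt(sum(a * a for a in axis))
    l, m, k = (a / norm for a in axis)
    matrices = []
    for step in range(1, n):
        theta = 2.0 * math.pi * step / n
        c, s = math.cos(theta), math.sin(theta)
        v = 1.0 - c
        matrices.append([
            [c + l * l * v,     l * m * v - k * s, l * k * v + m * s],
            [m * l * v + k * s, c + m * m * v,     m * k * v - l * s],
            [k * l * v - m * s, k * m * v + l * s, c + k * k * v],
        ])
    return matrices

fourfold_z = nfold_rotations(4, (0.0, 0.0, 1.0))
```

    Each matrix is listed row by row, which is the order expected by the matrix form of the ncsrot command.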

    Ncsymm (< on | off >)

    Switch use of non-crystallographic symmetry on/off.

    Patterson and score map commands

    Patfile (filename [patterson] (type < [ccp4] | protein >) (scale s1 [1.0] (s2 [0.0])) (truncate from to) ({reset < origin | u v w > radius level}))

    This command is used to read and (possibly) at the same time modify the Patterson map by scaling, truncation, and resetting peaks. The type subcommand specifies the type of map file to be read. At present ccp4 and protein format maps are recognized. The scale subcommand is used to specify a scale to be applied to the Patterson densities such that rho_scaled = s1 x rho + s2. The truncate subcommand sets a maximum for the densities in the map. Densities above from are then truncated to to. The subcommand reset is used to reset the densities within the specified radius (in Angstrom) of the origin or position u v w (fractional coordinates) to level. The default is to not reset any points in the map, but it is recommended that the origin peak, and any strong peaks arising from non-crystallographic translations, are reset to zero or a low value (in the background). The spacegroup must be known prior to reading the Patterson map (spacegroup command).
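
    The effect of the scale and truncate subcommands on individual densities can be sketched as below. Applying the scale before the truncation is an assumption made for the sketch, the reset subcommand is not covered, and the function name is hypothetical.

```python
def modify_densities(values, s1=1.0, s2=0.0, truncate=None):
    """Scale densities as s1 * rho + s2, then optionally truncate.

    `truncate` is a (from, to) pair: densities above `from` are set to `to`.
    """
    out = []
    for rho in values:
        scaled = s1 * rho + s2
        if truncate is not None and scaled > truncate[0]:
            scaled = truncate[1]
        out.append(scaled)
    return out

modified = modify_densities([1.0, 5.0, 20.0], s1=2.0, s2=1.0,
                            truncate=(30.0, 30.0))
```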

    Scorfile (filename [] (sect fast slow sect [Y X Z]) (title string [Real Space Patterson Search Map]))

    Define scan output file (the "scoremap"). Optionally, the sectioning of the scoremap may be specified here by using the sect subcommand. A title string (maximum length 80 characters) to be written on the scoremap header may also be specified using the title subcommand. The actual title written also contains a string stating the type of scan and the run date. If the title contains any words that are also RSPS command words, the title must be enclosed by quotes.

    Coordinate manipulation commands

    Fixxyz (< site n1 (n2) | position n1 (n2) | set n1 (n2) | peak n1 (n2) | x y z >)

    Up to 30 positions may be stored in the fixxyz area. These coordinates are kept in core until explicitly deleted (delete fixxyz). With more vectorsets the positions stored in this area are used as the fixed positions; in translate vectorsets they define the rigid fragment to be translated. Coordinates may be specified explicitly (fractional) or by reference to site, position, set or peak numbers. Giving just one number selects one position whereas giving two numbers selects a range.

    Rotate < position n1 (n2) | fixxyz n1 (n2) > < sgs m1 (m2) | ncs m1 (m2) | matrix r11...r33 (t1 t2 t3) > (fixxyz (k1))

    Rotate position(s) n1 through n2 from the main coordinate area (positions subcommand) or from the fixxyz area by spacegroup symmetry (sgs) or non-crystallographic symmetry (ncs) operation(s) m1 through m2, or by matrix explicitly given (row by row). The rotated coordinates may optionally be stored in the fixxyz area starting at location k1 (if no location is specified the first empty slot will be used).

    < Add | subtract > < fixxyz position n | fixxyz vector u v w >

    Add or subtract a vector to each of the stored fixxyz positions. The vector may be taken either from positions stored in a pick scoremap list, or can be explicitly given (in fractional coordinates) following the keyword vector. The main use of the add/subtract command is to add/subtract a translation vector found in a vectorset translate search.

    Read (filename [rsps.pdb])

    Read coordinates from PDB format file. The file must have a CRYST card defining the cell parameters and three SCALE cards specifying the transformation from the stored Angstrom orthogonal coordinates to fractional crystallographic coordinates. The ATOM cards should contain position number, (site number), x, y, z, and score in the atom number, residue name, x, y, z, and occupancy fields, respectively. If the site field is missing, the program will assign site numbers. The coordinates are stored in the main coordinate storage area and will overwrite any previously stored positions. A file that can be read with this command is produced by the write command described below.

    Write ( < [ positions (filename [positions.pdb]) ] | peaks (filename [peaks.pdb]) | set (iset [1] filename [set.pdb]) | fixxyz (filename [fixxyz.pdb]) > )

    Write position, peak, getsets set, or fixxyz coordinates to a pdb file.

    Searching and checking commands

    Vectorset (< [ single ] < [ atoms ] | molecules > | more < atoms | molecules > | translate < atoms | molecules > >)

    Determines the type of vectors to be computed. Single vectorsets consist of Harker vectors, and pseudo-Harker vectors if non-crystallographic symmetry is loaded (see ncsrot) and switched on (see ncsymm on). More vectorsets consist of cross-vectors to one or more fixed (fixxyz) positions. Translate vectorsets consist of Harker and cross-vectors for a several-atom fragment as defined by the fixxyz positions. These positions will be translated as a rigid body in a search using this vectorset. Atoms vectorsets use all relevant vectors whereas molecules vectorsets only use the subset of structure-invariant vectors.

    Scan (< [au] | limits ui uf vi vf wi wf > (grid nx ny nz [n*nu n*nv n*nw]))

    Scan asymmetric unit (au) or volume defined by limits ui uf vi vf wi wf along x, y, z; in fractions of the unit cell. Optionally the search grid along x, y, z may be specified. The default search grid is twice the Patterson grid when using single vectorsets, i.e. n = 2. When using more or translate vectorsets the default grid is identical to the Patterson grid, i.e. n = 1. In single vectorsets, directions that are indeterminate from the Harker sections (polar spacegroups) are identified by the program and the scan limits reset as necessary. A score is computed for each trial position, and the result stored as a score map which then has to be picked (pick scoremap) to get the positions. The score map may become very big, particularly when using single vectorsets where the default search grid is twice the Patterson grid.

    Polarscan < positions n1 (n2) | x y z >

    This is a special option for polar spacegroups. Positions selected from the main positions area using the positions subcommand, or explicitly given as fractional coordinates x y z, will be translated along the polar axis. Cross-vectors to one or more fixed positions stored in the fixxyz area are computed and used to form a score. In polar spacegroups the origin may be freely defined along the polar direction. One position from a single site search may then be fixed and the other positions translated to put them on the same origin as the fixed position. This type of search does not produce a scoremap; instead, the coordinates are put directly in the main positions storage area, where they will overwrite any coordinates previously stored.

    Getsets (minset [3] numuse [50])

    Using the positions stored in the main coordinate area, find sets of positions where the Patterson map at all predicted cross-vector positions satisfies the rejection criteria specified by the reject and low commands for all pairs of positions in the set. Minset is the minimum number of members in a set. The numuse top positions of those stored are used.

    Vlist < site n1 (n2) | set n1 (n2) | position n1 (n2) | fixxyz n1 (n2) | x y z >

    This command is used to inspect individual solutions. With single vectorsets, vectors between symmetry related positions (= Harker vectors if only SGS is being used) are listed for each vlist position. With more vectorsets, cross vectors to fixed positions are listed. With translate vectorsets, all vectors for the positions stored in the fixxyz area are listed for the specified translation(s). Coordinates may be specified explicitly (fractional) or by reference to site, set, position or fixxyz numbers. Giving just one number selects one position, whereas giving two numbers selects a range.

    Table < set (n1 (n2)) | positions n1 n2 ... nk | fixxyz >

    Print scoretable for getsets set(s) n1 (through n2), or for positions n1, n2 etc. in the main coordinate area, or for fixed positions.

    Pick < scoremap (n [50]) | patterson (n [50]) (harker) > (level rhomin) (limits xi xf yi yf zi zf) ({exclude {x|y|z}3 value})

    Command used to pick the Patterson map or the scoremap. The n highest peaks above rhomin are picked. If the level subcommand is omitted the program will try to set an appropriate minimum level. When the scoremap is picked, peaks found are stored in the main coordinate array and any previously stored coordinates are overwritten. Peaks reported by the pick patterson command may be limited to peaks on Harker sections by using the subcommand harker. The position and peak lists may be deleted with the delete command. The limits subcommand defines the part of the map to be picked (the default is to pick peaks from the whole map) and should be given as fractional coordinates. The exclude subcommand is used to specify areas of the map to be excluded from picking. For example, exclude xyz 0.0 excludes all points where any coordinate is zero, whereas exclude y 0.5 excludes points where y = 0.5.
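
    The two exclude examples can be read as a simple test on each candidate point, as in the sketch below. The rule representation, the function name and the comparison tolerance are assumptions made for illustration.

```python
def is_excluded(point, rules, eps=1e-6):
    """True if a fractional point matches any exclude rule.

    Each rule is (axes, value), e.g. ("xyz", 0.0) or ("y", 0.5).
    """
    for axes, value in rules:
        for axis in axes:
            if abs(point["xyz".index(axis)] - value) < eps:
                return True
    return False
```

    Here ("xyz", 0.0) corresponds to exclude xyz 0.0 and ("y", 0.5) to exclude y 0.5.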

    Miscellaneous commands

    @ filename

    Read commands from macro file. Must be entered on a line by itself. Nesting of macro commands is not allowed.

    Print (n [1])

    Determines the amount of output.
    n = 0   minimum output
    n = 1   a little more output (default)
    n = 2   prints out symmetry operations and score statistics
                 for each section during a scan. 
    n = 3   prints out real to vector space transformations


    Status

    Gives a summary of current parameter settings, number of stored positions etc.

    List < positions (n1 n2) | peaks (n1 n2) | sets (n1 n2) | symops | vectors >

    List positions, Patterson peaks, getsets sets, symmetry operators, or vector operators (defined by the vectorset command).

    Delete < peaks | fixxyz (n1) (n2) | sets | ncs >

    Delete picked Patterson peaks, fixxyz coordinates, getsets sets, or non-crystallographic symmetry operators. The subcommands can not be abbreviated. Specific fixxyz positions, or a range of positions, may be deleted by giving the (range of) fixxyz position number.

    Help (command)

    Simple online help. The environment variable rspshlp4 should point to the help file (rsps4.hlp).


    Exit

    Leave RSPS.


    Quit

    Leave RSPS.



    Patterson map
    Difference Patterson map in CCP4 format. The map must of course cover at least one asymmetric unit. File name defined with the patfile command. Default file name: patterson
    Coordinate file
    Positions may be read from a PDB format file using the read command. The file is expected to have a CRYST card defining the cell parameters and three SCALE cards specifying the transformation from the stored Angstrom orthogonal coordinates to fractional crystallographic coordinates. The ATOM cards should contain position number, (site number), x, y, z, and score in the atom number, residue name, x, y, z, and occupancy fields, respectively (format (6x,i5,11x,i4,4x,3f8.3,f6.2)). Default file name: rsps.pdb.
    Spacegroup symmetry library
    Library file of crystallographic symmetry operations (logical name SYMOP). For each entry in the library file there is a header line containing five entries: spacegroup number, total number of lines of symmetry operators, number of lines of primitive symmetry operators, spacegroup name, name of corresponding point group. After the header, symmetry operators follow line by line in the style of the International Tables for X-ray Crystallography. The identity must be included and should be first. A SYMOP library suitable for most purposes without modification is included in the CCP4 package.


    Score map
    Output map of scores in CCP4 map format. File name defined with the scorfile command. Note that this file can become VERY BIG. Default file name:
    Coordinate file
    Positions stored in the main coordinate array may be written to this file using the write command. Coordinates for getsets sets, positions stored in the fixxyz area, as well as Patterson peaks may alternatively be written. The file produced is in PDB format but a few of the ATOM card fields are missing, while others have non-standard entries. The currently stored RSPS title is written on a HEADER card at the top of the file. The file will also have a CRYST card defining the cell parameters and three SCALE cards specifying the transformation from the stored Angstrom orthogonal coordinates to fractional crystallographic coordinates. The ATOM cards contain position number, atom type ("ME"), residue type ("MTL"), site number, x, y, z, and score in the atom number, atom type, residue type, residue name, x, y, z, and occupancy fields, respectively (format (6x,i5,2x,2a,2x,a3,2x,i4,4x,3f8.3,f6.2)). Default file name: rsps.pdb
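
    The two Fortran formats quoted above translate into fixed column ranges, which the following sketch makes explicit by writing one ATOM card and reading it back. It assumes that the first six columns carry the record name "ATOM" and that the atom type "ME" fills the two columns described by 2a; the function names are hypothetical.

```python
def format_atom_card(position, site, x, y, z, score):
    """Write one ATOM card following
    (6x,i5,2x,2a,2x,a3,2x,i4,4x,3f8.3,f6.2)."""
    # Assumption: the six leading columns hold the record name "ATOM  ".
    return (f"ATOM  {position:5d}  ME  MTL  {site:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{score:6.2f}")

def parse_atom_card(line):
    """Read one ATOM card following (6x,i5,11x,i4,4x,3f8.3,f6.2)."""
    return {
        "position": int(line[6:11]),     # i5 after 6x
        "site": int(line[22:26]),        # i4 after 11x
        "xyz": tuple(float(line[30 + 8 * k:38 + 8 * k]) for k in range(3)),
        "score": float(line[54:60]),     # f6.2
    }

card = format_atom_card(3, 2, 12.345, -6.7, 0.125, 8.5)
record = parse_atom_card(card)
```

    The x, y, z fields hold orthogonal Angstrom coordinates; converting them to fractional coordinates requires the SCALE cards and is not shown here.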


    The volume of the Patterson function needed is listed below for each spacegroup. In all cases this volume is the smallest possible box that includes at least one full asymmetric unit. The limits correspond to the default limits used in the CCP4 FFT programs. If a map covering a full unit cell is provided, the program will recognize this and skip the reduction of peaks to the asymmetric unit, thus saving computing time at the cost of space.

    Patterson symmetry      Spacegroups                     Limits along x, y, z
    P-1                     P1                              0 1     0 1/2   0 1
    P2/m unique axis b      P2, P21                         0 1/2   0 1/2   0 1
    C2/m                    C2                              0 1/2   0 1/4   0 1
    Pmmm                    P222, P2221, P21212,            0 1/2   0 1/2   0 1/2
                            P212121
    Cmmm                    C2221, C222                     0 1/2   0 1/4   0 1/2
    Fmmm                    F222                            0 1/4   0 1/4   0 1/2
    Immm                    I222, I212121                   0 1/2   0 1/4   0 1/2
    P4/m                    P4, P41, P42, P43               0 1/2   0 1/2   0 1/2
    I4/m                    I4, I41                         0 1/2   0 1/2   0 1/4
    P4/mmm                  P422, P4212, P4122,             0 1/2   0 1/2   0 1/2
                            P41212, P4222, P42212,
                            P4322, P43212
    I4/mmm                  I422, I4122                     0 1/2   0 1/2   0 1/4
    P-3                     P3, P31, P32                    0 2/3   0 2/3   0 1/2
    R-3 hexagonal axes      R3 hexagonal axes               0 2/3   0 2/3   0 1/6
    P-31m                   P312, P3112, P3212              0 2/3   0 1/2   0 1/2
    P-3m1                   P321, P3121, P3221              0 2/3   0 1/3   0 1
    R-3m hexagonal axes     R32 hexagonal axes              0 2/3   0 2/3   0 1/6
    P6/m                    P6, P61, P65, P62, P64,         0 2/3   0 1/2   0 1/2
                            P63
    P6/mmm                  P622, P6122, P6522,             0 2/3   0 1/3   0 1/2
                            P6222, P6422, P6322
    Pm-3                    P23, P213                       0 1/2   0 1/2   0 1/2
    Fm-3                    F23                             0 1/2   0 1/2   0 1/4
    Im-3                    I23, I213                       0 1/2   0 1/2   0 1/2
    Pm-3m                   P432, P4232, P4332,             0 1/2   0 1/2   0 1/2
                            P4132
    Fm-3m                   F432, F4132                     0 1/2   0 1/4   0 1/4
    Im-3m                   I432, I4132                     0 1/2   0 1/2   0 1/4


    The table below is a summary of the search options available using the scan command. In addition, the special search options polarscan and getsets are available.

    VECTORSET               SGSYMM      NCSYMM

    SINGLE ATOMS            ON          ON/OFF
        Single-site search using vectors between symmetry-related positions. When only SGS is used, this will be a search using Harker vectors only. NCS symmetry in addition generates pseudo-Harker cross vectors between NCS-related positions, and cross vectors between different NCS copies of the protein molecule.

    SINGLE ATOMS            OFF         ON (using only rotational part)
        Locate positions related by NCS from the translation-independent cross vectors (pseudo-Harker vectors) between NCS-related positions. The positions will be displaced from their true position by a vector t which may be found in a TRANSLATE ATOMS scan.

    MORE ATOMS              ON/OFF      ON/OFF
        Given one or more fixed positions, find additional sites by looking at cross-vectors to the fixed sites. Harker vectors for potential solutions may then be examined by using the VLIST command.

    TRANSLATE ATOMS         ON/OFF      ON/OFF
        Translate two or more positions as a rigid body. These positions may come from
        1. single site search with SGSYMM OFF, NCSYMM ON
        2. cross-vector => two-site search
        3. known (oriented) fragment.

    SINGLE MOLECULES        ON          ON (using only rotational part)
        Find location of symmetric molecule using the structure-invariant subset of Harker vectors.

    MORE MOLECULES          ON          ON (using only rotational part)
        Given the position of one or more molecules with NCS, find the position of additional molecules using the structure-invariant subset of cross vectors.

    TRANSLATE MOLECULES     ON          ON (using only rotational part)
        Translate two or more NCS molecules with a fixed separation as a rigid body.


    Sample command procedures

    1) Command procedure to run single atom search, then more atom search to top position from single atom search:

    # Command procedure file starts here
    #!/bin/csh -f
    rsps04_2  << eof-rsps >& rsps_fre1.log
    # RSPS example script for flavin oxidase/reductase
    # Au anomalous (peak) data to 4.0 A
    spacegroup P212121
    patfile reset origin 8 0
    scorfile /nfs/scr_slu5/stefan/
    pick patterson 200
    # Single site scan of asymmetric unit
    # Only Harker vectors will be considered
    pick scoremap 
    vlist site 1 4
    write positions fre_single.pdb
    # Fix top site and look for more atoms
    # Now only cross vectors will be considered
    fix site 1
    vectorset more atoms
    pick scoremap
    vlist site 1 20
    write positions fre_more.pdb
    # Check Harker vectors for sites found in more atoms scan
    vec si at
    vlist site 1 20
    eof-rsps
    # The command procedure file ends here

    2) One could alternatively make the following macros:

    # RSPS example script for flavin oxidase/reductase
    # Au anomalous (peak) data to 4.0 A
    # spacegroup and file definitions
    spacegroup P212121
    patfile reset origin 8 0
    scorfile /nfs/scr_slu5/stefan/
    pick patterson 200
    # Single atom scan of asymmetric unit
    pick scoremap 
    vlist site 1 4
    write positions fre_single.pdb
    # Fix top site and look for more atoms
    # Now only cross vectors will be considered
    fix site 1
    vectorset more atoms
    pick scoremap
    vlist site 1 20
    write positions fre_more.pdb
    # List Harker vectors for top sites
    vectorset single atoms
    vlist site 1 20

    and then run the command procedure

    # Command procedure file starts here
    # This procedure for flavin oxidase/reductase Au anomalous
    rsps04_2 << eof-rsps
    @ fre
    @ sscan
    @ mscan
    @ hvlist
    eof-rsps
    # The command procedure file ends here

    In reality, one would run RSPS interactively rather than as a batch job, starting with a vectorset single atoms scan as above, and then keep adding atoms by repeatedly fixing sensible-looking sites and carrying out more atoms scans. As atoms are added to the fixxyz list, more vectors are considered in the search, and the rejection criteria may have to be relaxed in order to find additional sites. Remember that the goal is not to find all of the sites, just enough to start phasing.

    3) Command procedure to run single atom and polarscan search in P21

    # Command procedure file starts here
    # This procedure for Mutase spacegroup P21
    rsps04_2 << eof-rsps
    spacegroup P21
    patfile mmcm_hgac.pat reset origin 8 0
    scorfile /nfs/scr_slu5/stefan/ title "HgAc - M483 diff patt 15 - 6 A"
    # Run single site search - this will be over one section perpendicular
    # to the y axis for P21 second setting
    pick scoremap
    write positions harker.pdb
    # The top position is fixed and all positions translated along the
    # polar axis (b). The vectorset is automatically set to MORE ATOMS when 
    # issuing the  POLARSCAN command. The positions found are put 
    # directly into the main coordinate storage area and so no
    # PICK SCOREMAP command is needed.
    fix pos 1 ; polarscan pos 1 50
    write positions polarscan.pdb
    eof-rsps
    # The command procedure file ends here

    4) Often the direction of a non-crystallographic symmetry (NCS) axis may be found from e.g. a rotation function whereas the position of the NCS axis is harder to find. The vectors between heavy atom positions related by NCS are independent of the position of the NCS axis, and thus these vectors may be used to find such heavy atom positions. To do this in RSPS, the spacegroup symmetry is switched off and a single atom scan carried out using only the NCS. The positions related by the NCS may then be located in the cell by translating them as a rigid fragment and considering the vectors to positions in SGS related fragments. The following command procedure is an example of such a search for spinach Rubisco in spacegroup C2221 with a 4-fold NCS axis almost parallel to the c axis.

    # Command procedure file starts here
    rsps04_2 << eof-rsps
     CELL   157.20  157.20  201.30   90.00   90.00   90.00
     PATFILE merc.pat RESET ORIGIN 8 0 RESET 0.5 0.0 0.5 6 0
    # Define non-crystallographic symmetry.
    # Translations set to 0,0,0
     NCSROT POLAR   -1.8   0.0  90.0  0  0  0
     NCSROT POLAR   -1.8   0.0 180.0  0  0  0
     NCSROT POLAR   -1.8   0.0 270.0  0  0  0
    # Set scoring parameters
    # LOW is set very high to allow any number of low peaks
    # (alternatively REJECT could have been set at zero).
     LOW    100
     WEIGHT   2
    # First select single atom vectorset and switch spacegroup symmetry off
    # and non-crystallographic symmetry on; scan asymmetric unit.
       NCSYMM ON
    # Select top position and apply the four-fold rotation;
    # store the resulting four positions in the FIXXYZ area.
     ROTATE POS 1 NCS 1 4 FIX 1
    # Now select the translate atoms vectorset, and switch non-crystallographic
    # symmetry  off and spacegroup symmetry on. The four positions in the FIXXYZ 
    # area will be translated through the unit cell as a rigid body, 
    # and vectors (Harker and cross) to positions in symmetry related 
    # fragments used to form the score. High values in the resulting
    # scoremap will indicate possible translation vectors to be added to
    # the FIXXYZ positions to give the correct coordinates of the
    # four-fold related sites.
       SGSYMM ON
       SCORFILE ; SCAN LIMITS 0 1 0 1 0 1
    eof-rsps
    # Command procedure file ends here

    5) If a cross vector u can be identified in the Patterson function, a "two-site" search may be carried out. One position is then set to x (e.g. 0,0,0) and another is placed at x + u. The two positions are stored in the fixxyz area, and a vectorset translate atoms scan is carried out to find the translation that correctly locates the two sites in the unit cell. Note that the search is over the entire unit cell, since the atoms related by the selected cross vector could be anywhere in the cell. (By selecting a Patterson peak from the peak list we have arbitrarily chosen coordinates for the cross vector reduced to the asymmetric unit of the Patterson.)

    # Command procedure file starts here
    rsps04_2 << eof-rsps
     CELL   157.20  157.20  201.30   90.00   90.00   90.00
     PATFILE merc.pat RESET ORIGIN 8. 0. RESET 0.5 0 0.5 6. 0.
    # A cross vector at 0.3041 0.0541 0.000  was identified in the
    # Patterson function. Thus one site is fixed at 0,0,0 and another at
    # 0.3041 0.0541 0.000 (alternatively if the Patterson had been picked
    # the second position could have been fixed using the FIXXYZ PEAK
    # command).
    FIXXYZ 0. 0. 0.
    FIXXYZ 0.3041 0.0541 0.000 
    # The two sites are translated as a rigid fragment to
    # find their location in the unit cell
       SGSYMM ON
       SCORFILE ; SCAN LIMITS 0 1 0 1 0 1 
    eof-rsps
    # Command procedure file ends here

    Sample output

    Output generated by pick patterson (flavin reductase Au anomalous Patterson, spacegroup P212121). Only the top 10 peaks are shown.

    PICK >> The 200 highest peaks above      0.0 are listed in descending order
     Peak       Fractional coordinates     Angstrom coordinates         Grid coordinates       Value      S/N
     ----      ------------------------  ------------------------    ----------------------  ---------  -------
       1        0.0000  0.1532  0.2009      0.00   15.26   43.43         0      11      32        1.41     6.0
       2 H      0.5000  0.4499  0.1256     25.78   44.81   27.16        18      32      20        1.33     5.6
       3        0.1704  0.0000  0.0203      8.78    0.00    4.39         6       0       3        1.13     4.8
       4 H      0.2882  0.5000  0.1675     14.86   49.80   36.22        10      36      27        1.04     4.4
       5        0.1395  0.0000  0.0303      7.19    0.00    6.56         5       0       5        1.02     4.3
       6        0.1922  0.0300  0.0000      9.91    2.99    0.00         7       2       0        1.02     4.3
       7 H      0.2632  0.5000  0.4225     13.57   49.80   91.33         9      36      68        1.01     4.3
       8        0.0548  0.0990  0.0000      2.82    9.86    0.00         2       7       0        0.99     4.2
       9 H      0.5000  0.4175  0.3705     25.78   41.58   80.09        18      30      59        0.94     4.0
      10 H      0.5000  0.3715  0.1043     25.78   37.00   22.55        18      27      17        0.90     3.8

    An "H" after the peak number in the "Peak" column indicates that the peak is on a Harker section.

    Output generated by pick scoremap on a single site scoremap:

    PICK >>    50 peaks found; these are listed in descending order
      PosnN      Fractional coordinates     Angstrom coordinates      Score    Site
      -----     ------------------------  ------------------------  ---------  ----
        1        0.6322  0.5519  0.0385     32.60   54.97    8.31      3.39       1
        2        0.6322  0.9481  0.0385     32.60   94.43    8.31      3.39       1
        3        0.6322  0.0519  0.0385     32.60    5.17    8.31      3.39       1
        4        0.6322  0.4481  0.0385     32.60   44.63    8.31      3.39       1
        5        0.8678  0.5519  0.0385     44.74   54.97    8.31      3.39       1
        6        0.8678  0.9481  0.0385     44.74   94.43    8.31      3.39       1
        7        0.8678  0.0519  0.0385     44.74    5.17    8.31      3.39       1
        8        0.8678  0.4481  0.0385     44.74   44.63    8.31      3.39       1
        9        0.1322  0.5519  0.0385      6.82   54.97    8.31      3.39       1
       10        0.1322  0.9481  0.0385      6.82   94.43    8.31      3.39       1
       11        0.1322  0.0519  0.0385      6.82    5.17    8.31      3.39       1
       12        0.1322  0.4481  0.0385      6.82   44.63    8.31      3.39       1
       13        0.3678  0.5519  0.0385     18.96   54.97    8.31      3.39       1
       14        0.3678  0.9481  0.0385     18.96   94.43    8.31      3.39       1
       15        0.3678  0.0519  0.0385     18.96    5.17    8.31      3.39       1
       16        0.3678  0.4481  0.0385     18.96   44.63    8.31      3.39       1

    Only the 16 first positions (8 possible origins in P212121 x 2 hands), representing one unique site, are shown here.
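
    That these positions represent the same site can be checked by computing their Harker vectors. The following is a minimal sketch, not RSPS code: it assumes the standard general positions of P212121 and folds each vector component into the range 0 to 1/2, as allowed by the Pmmm symmetry of the Patterson. The names P212121_OPS, fold and harker_vectors are hypothetical helpers introduced for illustration only.

```python
# General positions of P212121 (identity omitted), written as (signs, shifts):
# image[i] = signs[i] * x[i] + shifts[i]
P212121_OPS = [
    ((-1, -1, 1), (0.5, 0.0, 0.5)),   # -x+1/2, -y,      z+1/2
    ((-1, 1, -1), (0.0, 0.5, 0.5)),   # -x,      y+1/2, -z+1/2
    ((1, -1, -1), (0.5, 0.5, 0.0)),   #  x+1/2, -y+1/2, -z
]


def fold(c):
    """Reduce one fractional vector component to the range 0..1/2."""
    c = c % 1.0
    return min(c, 1.0 - c)


def harker_vectors(site):
    """Harker vectors x - S(x) for one site, folded and rounded to 4 decimals."""
    vectors = []
    for signs, shifts in P212121_OPS:
        image = [s * x + t for s, x, t in zip(signs, site, shifts)]
        vectors.append(tuple(round(fold(x - xi), 4) for x, xi in zip(site, image)))
    return sorted(vectors)


site_1 = (0.6322, 0.5519, 0.0385)   # PosnN 1 in the listing above
site_2 = (0.6322, 0.9481, 0.0385)   # PosnN 2, another representation of the same site
print(harker_vectors(site_1))
print(harker_vectors(site_1) == harker_vectors(site_2))
```

    Both positions give the same three vectors, which agree with the vlist output for site 1 to within the rounding of the listed coordinates.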

    Output generated by vlist site 1 with positions above stored:

     Harker vectors for a heavy atom position at   0.6322  0.5519  0.0385: 
     Vec       U       V       W        Rho     Multiplicity     Peak   Distance
     ---    ------  ------  ------    -------   ------------     ----   --------
      1     0.2355  0.1038  0.5000      0.65          1           31       0.22
      2     0.2645  0.5000  0.4231      0.98          1            7       0.14
      3     0.5000  0.3962  0.0769      0.77          1           12       0.13
     Score               =     3.39 with     0 low peaks
     Rmsd peak positions =   0.1694
     Rmsd peak heights   =   0.6132
     Matching index      =   0.8297

    The fractional coordinates along the cell axes of the three Harker vectors are listed together with the value of the Patterson function and the relative multiplicity of each vector. The "Peak" column shows the number of the highest stored Patterson peak within 2.5 grid divisions from the calculated position and "Distance" is the actual distance (in Angstrom). (If the Patterson map hasn't been picked these columns are absent). The "matching index" is calculated as

    M = ( 1 + ihit ) / [ ( 1 + rmspsw*rmspos ) ( 1 + rmshtw*rmshgt ) ( 1 + ntrans ) ]

    where

    ihit            = number of predicted vectors with Patterson peaks within cutoff distance on peak list
    rmspos          = r.m.s.d. between predicted and observed peak positions
    rmshgt          = r.m.s.d. between predicted and observed peak heights
    rmspsw, rmshtw  = weights for the positional and height r.m.s. values, respectively (the higher the weight, the bigger the influence of the corresponding r.m.s.d. value on the matching index)
    ntrans          = total number of vectors predicted for this set

    In the distributed version rmspsw = 0.8 and rmshtw = 0.1.

    The idea for the matching index was stolen from G. Kleywegt's LSQMAN program. The matching index assumes values between 0 and 1, where "0" indicates a "perfect mis-match" and "1" a perfect match. Note that because the matching index is based on the match between predicted vectors, and peaks on the Patterson peak list, the value may depend on the number of peaks on the list.
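
    As a check on the formula, the matching index can be recomputed from the numbers printed in the sample outputs. This is only a sketch: the function name matching_index is hypothetical, and the hit counts are inferred from the listings (all three vectors in the vlist sample have a listed peak; a peak hit frequency of 0.9697 over 33 vectors in the scoretable sample is taken to mean 32 hits).

```python
def matching_index(ihit, ntrans, rmspos, rmshgt, rmspsw=0.8, rmshtw=0.1):
    """Matching index M; the weights default to the distributed values."""
    denominator = (1.0 + rmspsw * rmspos) * (1.0 + rmshtw * rmshgt) * (1.0 + ntrans)
    return (1.0 + ihit) / denominator


# vlist sample: 3 predicted Harker vectors, all assumed to count as hits
m_vlist = matching_index(ihit=3, ntrans=3, rmspos=0.1694, rmshgt=0.6132)

# scoretable sample: 33 predicted vectors, 32 assumed hits
m_table = matching_index(ihit=32, ntrans=33, rmspos=1.5657, rmshgt=0.9805)

print(f"{m_vlist:.3f} {m_table:.3f}")   # prints "0.830 0.392"
```

    These values agree with the printed matching indices (0.8297 and 0.3924) to within the rounding of the listed r.m.s.d. values.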

    PDB file generated by the write positions command:

    HEADER  RSPS MORE ATOMS SCAN  7/12/99 > Real Space Patterson Search Map                
    REMARK File written by RSPS on  7/12/99
    CRYST    51.560   99.600  216.166  90.00  90.00  90.00
    SCALE1       0.01939   0.00000   0.00000
    SCALE2       0.00000   0.01004   0.00000
    SCALE3       0.00000   0.00000   0.00463
    REMARK  POSNN   TYPE   SITE       X       Y       Z    SCORE
    ATOM      1  HG  MTL     1       7.897  11.561  46.892  2.11 20.00
    ATOM      2  HG  MTL     2       5.729   9.683  33.776  1.62 20.00
    ATOM      3  HG  MTL     3      17.187  47.033   5.404  1.51 20.00
    ATOM      4  HG  MTL     4      22.916  70.550  18.915  1.42 20.00
    ATOM      5  HG  MTL     5       0.000  13.833   6.755  1.37 20.00
    ATOM      6  HG  MTL     5      51.560  13.833   6.755  1.37 20.00
    ATOM      7  HG  MTL     6      50.128  94.067  25.670  1.05 20.00

    Sample scoretable. A scoretable is generated as part of the getsets output, or may be explicitly generated using the table command, as in the example below.

     Set number   0;    3 members , overall score     2.35
     PosnN      Fractional coordinates     Angstrom coordinates
     -----     ------------------------  ------------------------
       1        0.6322  0.5519  0.0385     32.60   54.97    8.31
       2        0.1567  0.1008  0.1698      8.08   10.04   36.70
       3        0.1532  0.1111  0.2188      7.90   11.07   47.29
     Score table
     PosnN    1     2     3   <Score>
       1   3.39  2.84  1.80     2.61
       2         2.92  2.18     2.62
       3               1.37     1.82
     Number of vectors      =         33 (all)         9 (Harker)        24 (Cross)
     Number of low vectors  =          0 (all)         0 (Harker)         0 (Cross)
     Score                  =       2.35 (all)      2.56 (Harker)      2.27 (Cross)
     Peak hit frequency     =     0.9697 (all)    0.8889 (Harker)    1.0000 (Cross)
     Rmsd peak positions    =     1.5657 (all)    1.3821 (Harker)    1.6223 (Cross)
     Rmsd peak heights      =     0.9805 (all)    1.0586 (Harker)    0.9496 (Cross)
     Matching index         =     0.3924

    The score table gives the scores for the Harker (and, in the case of NCS, pseudo-Harker) vectors for each position along the diagonal; the off-diagonal entries are pairwise cross-vector scores. If Patterson peaks have been picked (as in this example), details of the fit between predicted and observed vectors are also given. This is often a useful guide to the correctness of a solution. In particular, correct solutions tend to have a rather high peak hit frequency, in contrast to incorrect solutions. The matching index, in the author's limited experience, is usually above 0.3 for correct solutions.
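
    The summary lines of the sample scoretable can be related to the table entries with a little arithmetic. The sketch below is an observation about the sample numbers, not a description of how RSPS computes them: it assumes that the Harker and cross scores are plain means of the diagonal and off-diagonal entries, and that the overall score weights them by the number of vectors. The function name summarize is hypothetical.

```python
def summarize(diagonal, off_diagonal, n_harker, n_cross):
    """Return (Harker score, cross score, overall score) for a score table."""
    harker_score = sum(diagonal) / len(diagonal)
    cross_score = sum(off_diagonal) / len(off_diagonal)
    # Overall score: mean weighted by the number of vectors of each kind
    overall = (n_harker * harker_score + n_cross * cross_score) / (n_harker + n_cross)
    return harker_score, cross_score, overall


# Entries and vector counts taken from the sample scoretable above
harker, cross, overall = summarize(
    diagonal=[3.39, 2.92, 1.37],
    off_diagonal=[2.84, 1.80, 2.18],
    n_harker=9,
    n_cross=24,
)
print(f"{harker:.2f} {cross:.2f} {overall:.2f}")   # prints "2.56 2.27 2.35"
```

    Under these assumptions the sample values 2.56 (Harker), 2.27 (Cross) and 2.35 (all) are reproduced.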


    RSPS is written in Fortran 77 with a few commonly accepted extensions that are detailed below. The program has been implemented and successfully run on Digital/VAX systems as well as a host of Unix machines such as the Alliant FX 2800 and the SGI 4D series. RSPS is designed so that it can easily be run interactively, although, depending on the symmetry, the size of the cell and the computer, the response may be far from interactive. Thus, a vector search for heavy atom positions would normally be run as a batch job, whereas checking of results can in most cases be done interactively.

    The structure of the RSPS program is highly modular to allow for flexibility in debugging and future development. At the lowest level are a number of library routines that handle matrix and vector algebraic operations as well as elementary operations on positions and peaks in direct and vector space respectively. At the heart of the program is the command interpreter which is based on the CCP4 parser and terminal i/o routines from the library package FORLIB ((C) Per Kraulis 1990). Higher level routines carry out the various search and checking options available in RSPS. Definitions of default values for parameters, dimensioning statements, and common block statements have been collected in a number of include files and may thus easily be modified.

    Include files

    The following include files are used in the RSPS program:

        RSPS control variables
        RSPS definition of defaults
        RSPS dimensioning parameters
        RSPS file definitions
        RSPS grid information
        RSPS map information
        RSPS position information
        RSPS symmetry information

    Space limitations

    Dimensioning parameters are defined in the file ''. In the distributed version this gives the following space limitations:

    Maximum number of spacegroup symmetry operations                          48
    Maximum number of pseudo symmetry operations                              60
    Maximum number of points in Patterson map                             800000
    Maximum number of points along fast and slow axis in search map          600
    Maximum number of VLIST positions                                         30
    Maximum number of FIXXYZ positions                                        30
    Maximum number of input positions to GETSETS                             800
    Maximum number of PICKable peaks                                         800

    Known exceptions from Fortran 77

    Lower case code
    IMPLICIT NONE statement
    INCLUDE statement
    END DO statement
    CARRIAGECONTROL = 'LIST' element in open statement control list (s/r rswpdb)
    The $ format descriptor is used in the FORLIB library package.

    Known bugs



    The Patterson grid spacing should be chosen as ca 1/3 of the resolution (e.g. about 1.3 A for data to 4.0 A).
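
    As a rough illustration of this rule of thumb, the number of grid divisions along a cell edge can be estimated from the cell length and the resolution. This is a sketch only: grid_divisions is a hypothetical helper, and the program used to calculate the Patterson map may impose its own restrictions on allowed grid sizes, which are not taken into account here.

```python
import math


def grid_divisions(cell_edge, resolution, fraction=1.0 / 3.0):
    """Smallest number of divisions giving a spacing of at most fraction * resolution."""
    spacing = resolution * fraction
    return math.ceil(cell_edge / spacing)


# Cell edges (in Angstrom) from the sample PDB file above, data to 4.0 A
for edge in (51.560, 99.600, 216.166):
    print(edge, grid_divisions(edge, 4.0))
```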

    It is probably a good idea to initially run the program with rather strict rejection criteria (reject 1. ; low 0; this is the default) to see if anything shows up. If nothing is found, re-run the search with looser rejection criteria until a sufficient number of possible solutions is found. On the other hand, if too many solutions are found, re-run the search using a higher rejection level. It is worthwhile to use the vlist option to examine the Patterson peaks predicted by potential solutions. Note that by default special positions are not considered in a single atoms search, but may be included by specifying a negative bump argument.

    It is recommended to always use a search scheme that maximizes the number of vectors used for each trial position. Thus, if non-crystallographic symmetry is present, use it. If a cross vector can be identified, do a two-site search rather than a single-site search.

    Positions from a cross scan should be checked for Harker vectors using the vlist option before they are added to the list of fixed positions.

    In polar spacegroups where one coordinate is indeterminate from the Harker section it is only necessary to perform the search scan over one section. This means that positions that only differ in the polar coordinate will be unresolved. As long as at least one position can be found this should not be a big problem however, since that position can then be used to find further sites by doing a cross-vector (vectorset more atoms) or polarscan search.

    Tolerance > 0 gives an increased number of junk solutions.

    Densities around the origin, and any NCS translational peaks, should always be reset or a lot of junk solutions will appear.

    If vectors fall between grid points the nearest grid point will be used. If this is a serious problem the scoremap sectioning or the scan grid may be adjusted accordingly.

    In vectorset translate scans different representations of the same solution may not necessarily have the same site number.

    In a difference Patterson map between two heavy atom derivatives the cross-vectors between sites in different derivatives will appear as negative densities whereas the Harker peaks and the cross-vectors between sites within each derivative will be positive. This implies that a derivative could be solved by looking at cross-vectors between sites in this derivative and known sites in the second derivative. To do this apply a scale factor of -1 to the map and then perform a more atoms scan.
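
    The sign pattern follows from a short derivation, given here as an idealized sketch. It assumes that the map coefficients correspond to the difference between the two heavy atom substructures, written rho_A and rho_B (symbols introduced here for illustration), and ignores non-isomorphism and scaling errors.

```latex
\begin{align*}
P(\mathbf{u})
  &= \int \bigl[\rho_A(\mathbf{x}) - \rho_B(\mathbf{x})\bigr]
          \bigl[\rho_A(\mathbf{x}+\mathbf{u}) - \rho_B(\mathbf{x}+\mathbf{u})\bigr]\, d\mathbf{x} \\
  &= P_{AA}(\mathbf{u}) + P_{BB}(\mathbf{u}) - P_{AB}(\mathbf{u}) - P_{BA}(\mathbf{u}),
\qquad
P_{XY}(\mathbf{u}) = \int \rho_X(\mathbf{x})\, \rho_Y(\mathbf{x}+\mathbf{u})\, d\mathbf{x}.
\end{align*}
```

    The terms P_AA and P_BB contain the Harker vectors and the cross-vectors within each derivative and enter with a positive sign, whereas P_AB and P_BA contain the cross-vectors between the two derivatives and enter with a negative sign. Multiplying the map by -1 makes the latter positive.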

    When running RSPS interactively use a wide screen (132 characters) to avoid scrambled output.


    The most important change to the keyword input in RSPS 4.2 is that the old MODE keyword is no longer used; it has been replaced with the new vectorset command. The functionality of the old command can be recovered as follows:

    Old command:        New command:
    MODE HARKER         vectorset single atoms
    MODE CROSS          vectorset more atoms
    MODE TRANSLATE      vectorset translate atoms

    The old WRITEX command has also been changed slightly; it is now called write and has a different syntax (see main documentation).

    The old SAVE command is obsolete.


    1. Knight, S.D. (2000): RSPS version 4.0: a semi-interactive vector-search program for solving heavy-atom derivatives, Acta Cryst. D56, 42-47
    2. Knight, S.D. (1989): Ribulose 1,5-Bisphosphate Carboxylase/ Oxygenase - A Structural Study, Thesis, Swedish University of Agricultural Sciences, Uppsala.
    3. Blundell, T.L. & Johnson, L.N. (1976). "Protein Crystallography", Academic Press, London.
    4. Stout, G.H. & Jensen, L.H. (1989). "X-Ray Structure Determination. A Practical Guide", 2nd edition, John Wiley & Sons, New York.


    Stefan Knight.