File: [CCP4] / ccp4 / doc / cbuccaneer.doc
Revision: 1.5, Tue Dec 16 16:26:10 2008 UTC, by ccb

                      BUCCANEER (CCP4: Supported Program)

NAME

   buccaneer - Statistical protein chain tracing

SYNOPSIS

   cbuccaneer -mtzin-ref filename -pdbin-ref filename -mtzin-wrk filename
   -pdbin-wrk filename -seqin-wrk filename -pdbout-wrk filename
   -colin-ref-fo colpath -colin-ref-hl colpath -colin-wrk-fo colpath
   -colin-wrk-hl colpath -resolution resolution -find -grow -join -link
   -sequence -correct -filter -ncsbuild -prune -build -cycles number of
   cycles -fragments number of fragments -fragments-per-100-residues
   number of fragments -ramachandran-filter type
   -main-chain-likelihood-radius radius/A -side-chain-likelihood-radius
   radius/A -sequence-reliability reliability -new-residue-name type
   -new-residue-type type -correlation-mode -verbose verbosity -stdin
   [Keyworded input]

DESCRIPTION

   'buccaneer' performs statistical chain tracing by identifying connected
   alpha-carbon positions using a likelihood-based density target.

   The target distributions are generated by a simulation calculation
   using a known 'reference' structure for which calculated phases are
   available. The success of the method is dependent on the features of
   the reference structure matching those of the unsolved, 'work'
   structure. For almost all cases, a single reference structure can be
   used, with modifications automatically applied to the reference
   structure to match its features to the work structure.

HOW TO RUN BUCCANEER

   A set of reference structures is provided with the program.
   The structure 1TQW is good for typical protein problems at resolutions
   up to 1.25A, although in practice including data much beyond 2.0A
   makes little difference. For exotic cases you might want to
   provide your own reference structures.

   The calculation involves 10 stages:

   Finding C-alphas
          Candidate C-alpha positions are located by searching the
          electron density.

   Growing fragments
          The candidate C-alphas or input chains are grown by adding
          residues at either end, according to the density.

   Joining Fragments
          Overlapping fragments are joined to make longer chains. If this
          leads to a junction in a chain, the contested residue is
          removed.

   Linking Fragments
          Nearby N and C termini are examined to see if they can be linked
          by a short loop.

   Assigning Sequence
          Likelihood comparison between the density of each residue in the
          work structure and the residues of the reference structure
          allows sequence to be assigned to longer fragments.

   Correcting Sequence
          Insertions and deletions in the model building are fixed by
          rebuilding, where possible.

   Filtering fragments in poor density
          Residues in poor density are removed.

   Building NCS
          Any NCS relationships found in the model are used to augment the
          related chains.

   Pruning Fragments
          Clashing fragments are examined and the one with the worse
          density is removed. This stage can be disabled by the -no-prune
          keyword.

   Rebuilding
          Rebuilding allows side chain atoms and carbonyl oxygens to be
          rebuilt.
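
   Each of the ten stages above has a matching stage keyword in the
   SYNOPSIS (-find through -build). The following is a minimal Python
   sketch that pairs them up in pipeline order. The names STAGES and
   stage_flags are illustrative and not part of the program, and how
   cbuccaneer treats an omitted stage keyword is not stated in this
   document, so the helper only builds a list of keywords.

```python
# Stage names from this document paired with the stage keywords
# listed in the SYNOPSIS. The pairing follows the keyword descriptions.
STAGES = [
    ("Finding C-alphas", "-find"),
    ("Growing fragments", "-grow"),
    ("Joining fragments", "-join"),
    ("Linking fragments", "-link"),
    ("Assigning sequence", "-sequence"),
    ("Correcting sequence", "-correct"),
    ("Filtering fragments in poor density", "-filter"),
    ("Building NCS", "-ncsbuild"),
    ("Pruning fragments", "-prune"),
    ("Rebuilding", "-build"),
]


def stage_flags(skip=()):
    """Return the stage keywords in pipeline order, omitting any in `skip`."""
    return [flag for _, flag in STAGES if flag not in skip]


if __name__ == "__main__":
    # Print the stage keywords, here leaving out the NCS stage as an example.
    print(" ".join(stage_flags(skip=("-ncsbuild",))))
```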

INPUT/OUTPUT FILES

   -pdbin-ref
          Input PDB file containing the final model for the reference
          structure.

   -mtzin-ref
          Input 'reference' MTZ file. This contains the data for a known,
          reference structure. The required columns are F, sigF, and a set
          of Hendrickson-Lattman (HL) coefficients describing the
          calculated phases from the final model. Suitable reference
          structures can be constructed from the PDB using the 'Make
          Pirate reference' task.

   -mtzin-wrk
          Input 'work' MTZ file. This contains the data for the unknown,
          work structure. The required columns are F, sigF, and a set of
          HL coefficients from phasing improvement.

   -pdbin-wrk
          [Optional] Input PDB file containing an initial model.

   -seqin-wrk
          [Optional] Input sequence file in any common format, e.g. pir,
          fasta.

   -pdbout-wrk
          Output PDB file. This will contain the new chain trace.
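
   The file keywords above can be assembled into a command line
   programmatically. This is a minimal Python sketch under stated
   assumptions: the helper name buccaneer_command and all file names
   are hypothetical, and a real run would normally also need the
   column keywords (-colin-ref-fo and so on) described under KEYWORDED
   INPUT.

```python
import shlex


def buccaneer_command(mtzin_ref, pdbin_ref, mtzin_wrk, pdbout_wrk,
                      seqin_wrk=None, pdbin_wrk=None):
    """Assemble a cbuccaneer argument list from the file keywords."""
    args = [
        "cbuccaneer",
        "-mtzin-ref", mtzin_ref,
        "-pdbin-ref", pdbin_ref,
        "-mtzin-wrk", mtzin_wrk,
        "-pdbout-wrk", pdbout_wrk,
    ]
    # The sequence file and the initial model are optional inputs.
    if seqin_wrk is not None:
        args += ["-seqin-wrk", seqin_wrk]
    if pdbin_wrk is not None:
        args += ["-pdbin-wrk", pdbin_wrk]
    return args


# Hypothetical file names, for illustration only.
cmd = buccaneer_command("reference-1tqw.mtz", "reference-1tqw.pdb",
                        "work.mtz", "built.pdb", seqin_wrk="work.seq")
print(shlex.join(cmd))
```

   Returning a list rather than a single string means the arguments
   can be passed to a process launcher without any shell quoting.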

KEYWORDED INPUT

   See Note on keyword input.

  -colin-ref-fo colpath

     Observed F and sigma for reference structure. See Note on column
     paths.

  -colin-ref-hl colpath

     Hendrickson-Lattman coefficients for reference structure. If you do
     not have these, they can be generated using the accompanying
     chltofom program. See Note on column paths.

  -colin-wrk-fo colpath

     Observed F and sigma for work structure. See Note on column paths.

  -colin-wrk-hl colpath

     Hendrickson-Lattman coefficients for work structure. See Note on
     column paths.

  -resolution resolution/A

     [Optional] Resolution limit for the calculation. All data are
     truncated to this limit.

  -find

     [Optional] Enable finding of candidate C-alpha positions.

  -grow

     [Optional] Enable growing of fragments.

  -join

     [Optional] Enable joining of fragments.

  -link

     [Optional] Enable linking of nearby fragments.

  -sequence

     [Optional] Enable sequencing of fragments.

  -correct

     [Optional] Enable correction of any missing or extra residues
     uncovered during the sequencing process.

  -filter

     [Optional] Enable removal of residues in low density or linking
     disjoint sequence.

  -ncsbuild

     [Optional] Enable use of NCS to build related molecules, if present.

  -prune

     [Optional] Enable pruning of fragments.

  -build

     [Optional] Enable rebuilding of side chains and carbonyl oxygens.

  -cycles number of cycles

     [Optional] Number of cycles of building to run. Running multiple
     cycles leads to a more complete model, although it is not as
     effective as recycling with refmac.

  -fragments number of fragments

     [Optional] Maximum number of fragments to build.

  -fragments-per-100-residues number of fragments

     [Optional] Approximate number of fragments to build per 100 residues
     (assuming average solvent).

  -ramachandran-filter type

     [Optional] Only use particular types of residues when preparing the
     main chain likelihood search function. By selecting particular
     secondary structure types, it is possible to preferentially find
     different types of sequence. type may be one of all, helix, strand,
     nonhelix.

  -main-chain-likelihood-radius radius/A

     [Optional] Default 4.0A. For very low resolution maps it may be
     worth increasing this.

  -side-chain-likelihood-radius radius/A

     [Optional] Default 5.5A.

  -sequence-reliability reliability

     [Optional] Values between 0.5 and 1.0 vary the reliability cutoff
     for docking a sequence. The value is the probability at which the
     sequence will be accepted. 0.5 means every sequence will be docked,
     1.0 means that no sequences are docked. Default = 0.95.

  -new-residue-name type

     [Optional] Set the name which will be given to newly built residues.

  -new-residue-type type

     [Optional] Set the type of residue to be used when building new
     residues.

  -correlation-mode

     [Optional] Use the correlation target function for growing new
     chains and for sequencing. This is less effective for initial
     building, but better for model completion, especially after molecular
     replacement.

  -verbose verbosity

    Note on column paths:

   When using the command line, MTZ columns are described as groups using
   a slash-separated format including the crystal and dataset name. If
   your data was generated by another program that uses column groups,
   you can just specify the name of the group, for example
   '/native/peak/Fobs'.
   You can wildcard the crystal and dataset if the file does not contain
   any duplicate labels, e.g. '/*/*/Fobs'. You can also access individual
   non-grouped columns from existing files by giving a comma-separated
   list of names in square brackets, e.g. '/*/*/[FP,SIGFP]'.
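
   The column path forms described above can be generated with a small
   helper. This is a minimal Python sketch; the function name
   column_path is illustrative and not part of the program.

```python
def column_path(columns, crystal="*", dataset="*"):
    """Format an MTZ column path as /crystal/dataset/columns.

    `columns` is either a group name (a string) or a list of individual
    column names, which is written as a bracketed, comma-separated list.
    Crystal and dataset default to the '*' wildcard.
    """
    if isinstance(columns, str):
        label = columns
    else:
        label = "[" + ",".join(columns) + "]"
    return f"/{crystal}/{dataset}/{label}"


print(column_path("Fobs", "native", "peak"))  # /native/peak/Fobs
print(column_path("Fobs"))                    # /*/*/Fobs
print(column_path(["FP", "SIGFP"]))           # /*/*/[FP,SIGFP]
```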

    Note on keyword input:

   Keywords may appear on the command line or, if the '-stdin' flag is
   specified, on standard input. In the latter case, one keyword is given
   per line, the leading '-' is optional, and the rest of the line is the
   argument of that keyword (if required), so quoting is not needed.
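
   The relationship between the two input styles can be illustrated by
   converting a command-line argument list into standard-input keyword
   lines. This is a minimal Python sketch; the function name
   to_stdin_keywords and the file name are hypothetical, and it assumes
   that no argument value itself begins with '-'.

```python
def to_stdin_keywords(args):
    """Convert ['-key', 'value', '-flag', ...] into one keyword per line."""
    lines = []
    for token in args:
        if token.startswith("-"):
            # New keyword; the leading '-' is optional on standard input.
            lines.append(token[1:])
        elif lines:
            # Anything else is the argument of the preceding keyword.
            lines[-1] += " " + token
        else:
            raise ValueError("argument given before any keyword: " + token)
    return "\n".join(lines) + "\n"


text = to_stdin_keywords([
    "-mtzin-wrk", "work.mtz",
    "-colin-wrk-fo", "/*/*/[FP,SIGFP]",
    "-find",
    "-cycles", "3",
])
print(text, end="")
```

   The resulting text could then be supplied on standard input to a
   run started with the '-stdin' flag.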

Reading the Output:

   The program outputs a short list of statistics each cycle. The Free-E
   correlation is probably the most useful (larger is better). After the
   first cycle these may be biased in various ways. They are fairly useful
   for selecting a reference structure from a list of candidates or for
   selecting a radius. They can be used to control the likelihood
   weighting, but see the notes under the keyword for the appropriate
   protocol.

Problems:

AUTHOR

   Kevin Cowtan, York.

SEE ALSO
