*PROCHECK Operating Manual*

# Appendix E - *G*-factors

The *G*-factor provides a measure of how "**normal**", or
alternatively how "**unusual**", a given stereochemical property is.
In **PROCHECK** it is computed for the following properties:-

Torsion angles:-

- **phi-psi** combination
- **chi1-chi2** combination
- **chi1** torsion for those residues that do not have a **chi-2**
- combined **chi-3** and **chi-4** torsion angles
- **omega** torsion angles

Covalent geometry:-

- main-chain **bond lengths**
- main-chain **bond angles**

The *G*-factor is essentially just a **log-odds score**
based on the observed distributions of these stereochemical parameters.

When applied to a given residue, a **low** *G*-factor indicates
that the property corresponds to a low-probability conformation. So, for
example, residues falling in the disallowed regions of the **Ramachandran
plot** will have a **low** (or very negative)
*G*-factor. The same holds for unfavourable **chi1-chi2** and **chi1**
values.

Thus, if a protein has many residues with low *G*-factors it
suggests that something may be amiss with its overall geometry.

## Torsion angle *G*-factors

For the **torsion angle** *G*-factors
the standards of "**normality**" have been derived from an
analysis of **163** non-homologous, high-resolution protein chains
chosen from structures solved by **X-ray crystallography** to a
resolution of **2.0 Å** or better and an *R*-factor
no greater than **20%**. No two of the **163** chains shared a
sequence homology greater than **35%**, and all atoms having **zero
occupancy** were excluded from the analysis.

The analyses provided the **observed** distributions of **phi-psi**,
**chi1-chi2**, **chi-1**, **chi-3**, **chi-4** and **omega**
values for each of the **20** amino acid types. These distributions were
then divided into **cells**. For example, each residue type's
**Ramachandran plot** of **phi-psi** values was divided into
**45** x **45** cells. The numbers of observations in each cell were
used to calculate the **probability** of a given residue type having a
given **phi-psi** combination. The probabilities were, in turn, used to
compute a **log-odds score** for each cell. Unlike probabilities, which
have to be multiplied, log-odds scores can be summed, so meaningful
averages can be taken.
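
The binning and scoring steps described above can be sketched roughly as follows. This is a minimal illustration, not PROCHECK's own code: the text does not give the exact log-odds formula, so the sketch assumes each cell's probability is compared with a uniform distribution over all cells, and it adds a pseudocount so that empty cells do not give log(0). The function names and the pseudocount are hypothetical.

```python
import math

CELLS_PER_AXIS = 45                     # from the text: 45 x 45 cells
CELL_WIDTH = 360.0 / CELLS_PER_AXIS     # hence 8 degrees per cell


def cell_index(angle_deg):
    """Map a torsion angle in degrees to a cell index in 0..44."""
    wrapped = (angle_deg + 180.0) % 360.0          # shift into [0, 360)
    return min(int(wrapped // CELL_WIDTH), CELLS_PER_AXIS - 1)


def log_odds_table(phi_psi_pairs, pseudocount=1.0):
    """Build a 45 x 45 table of log-odds scores for one residue type.

    ASSUMPTION: the reference ("expected") probability is uniform over
    all cells, and a pseudocount is added to every cell. Neither detail
    is stated in the manual text.
    """
    counts = [[pseudocount] * CELLS_PER_AXIS for _ in range(CELLS_PER_AXIS)]
    for phi, psi in phi_psi_pairs:
        counts[cell_index(phi)][cell_index(psi)] += 1.0

    total = sum(sum(row) for row in counts)
    uniform = 1.0 / (CELLS_PER_AXIS * CELLS_PER_AXIS)
    return [[math.log((c / total) / uniform) for c in row] for row in counts]


def mean_score(table, phi_psi_pairs):
    """Average the per-residue log-odds scores (a sum divided by a count)."""
    scores = [table[cell_index(phi)][cell_index(psi)]
              for phi, psi in phi_psi_pairs]
    return sum(scores) / len(scores)


# Made-up observations clustered in two regions, for illustration only.
observed = [(-60.0, -45.0)] * 50 + [(-120.0, 130.0)] * 30
table = log_odds_table(observed)

favoured = table[cell_index(-60.0)][cell_index(-45.0)]
unpopulated = table[cell_index(60.0)][cell_index(-150.0)]
```

With this toy data, `favoured` is positive and `unpopulated` is negative, matching the statement that low-probability conformations receive low scores. Because the scores are logarithms, `mean_score` can average them directly over any set of residues.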

## Bond-length and bond-angle *G*-factors

For the main-chain **bond lengths** and **bond angles**, the
*G*-factors are computed using the Engh & Huber (1991) small-molecule means
and standard deviations.
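
The text does not say how the means and standard deviations are turned into a *G*-factor, so the sketch below shows only the natural first step: expressing an observed value as a deviation from the ideal mean, in units of the standard deviation. The function name is hypothetical, and the numbers in the example are illustrative placeholders that should be checked against the published Engh & Huber tables.

```python
def deviation_in_sd(observed, ideal_mean, ideal_sd):
    """Return how many standard deviations `observed` lies from the ideal.

    ASSUMPTION: this is only a z-score; the conversion PROCHECK applies
    to obtain the final G-factor is not described in this appendix.
    """
    if ideal_sd <= 0:
        raise ValueError("ideal_sd must be positive")
    return (observed - ideal_mean) / ideal_sd


# Illustrative values only (in Angstroms); not taken from this manual.
z = deviation_in_sd(observed=1.357, ideal_mean=1.329, ideal_sd=0.014)
```

Here `z` comes out at about 2, meaning the bond is roughly two standard deviations longer than the ideal value.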

