PreLysCar (__Pre__dictor of __Lys__ine __Car__boxylation) is a computational method for
the prediction of lysine carboxylation (KCX) in proteins with an available
three-dimensional structure.

The carboxylation of lysine residues is a post-translational modification that plays a critical role in the catalytic mechanisms of several important enzymes. It occurs spontaneously under certain physicochemical conditions, but is difficult to detect experimentally.

PreLysCar uses a Bayesian probabilistic classifier to distinguish carboxylated from non-carboxylated lysine residues. The classifier relies on two main quantities:

- Probability distribution of the features (F1, F2, ..., Fn), which is approximated with relative frequencies from the training set. The features are the frequencies of amino acids, ions, and water molecules found within 5 Å of carboxylated (KCX) and non-carboxylated (LYS) lysine residues.
- Prior probability, which can be arbitrarily selected as a best reasonable guess of the frequency of lysine carboxylation.
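To make the role of these two quantities concrete, here is a minimal sketch of how a Bayesian classifier could combine them. It assumes the features are treated as conditionally independent (a naive Bayes simplification that this README does not confirm), and the function name `posterior_kcx` and all frequency values are hypothetical, chosen only for illustration.

```python
import math

def posterior_kcx(observed, freq_kcx, freq_lys, prior):
    """Posterior probability of carboxylation (illustrative naive Bayes sketch).

    observed : features seen near the lysine (e.g. residue or ion names)
    freq_kcx : relative frequency of each feature around KCX in a training set
    freq_lys : relative frequency of each feature around LYS in a training set
    prior    : prior probability of carboxylation
    """
    # Accumulate log-probabilities to avoid underflow with many features.
    log_kcx = math.log(prior)
    log_lys = math.log(1.0 - prior)
    for feature in observed:
        log_kcx += math.log(freq_kcx[feature])
        log_lys += math.log(freq_lys[feature])
    # Normalize the two class scores into a posterior probability.
    top = max(log_kcx, log_lys)
    score_kcx = math.exp(log_kcx - top)
    score_lys = math.exp(log_lys - top)
    return score_kcx / (score_kcx + score_lys)

# Made-up frequencies, NOT taken from the PreLysCar training set.
freq_kcx = {"HIS": 0.30, "ZN": 0.20, "HOH": 0.50}
freq_lys = {"HIS": 0.05, "ZN": 0.01, "HOH": 0.94}
print(posterior_kcx(["HIS", "ZN"], freq_kcx, freq_lys, prior=0.009))
```

With no observed features the posterior equals the prior, which shows how the prior acts as the starting point that the evidence then updates.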

**-i=<PDB_FILE>** Input file: a PDB file (Example: *-i=3U4F.pdb*)

**-c=<PROTEIN_CHAIN>** Protein chain to analyze (Example: *-c=A*)

**-p=<PRIOR_PROBABILITY>** Any value between 0.0001 and 0.9999. (Recommended: *-p=0.009*)
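As an illustration of what the input file and chain are used for, the sketch below counts the residues, ions, and water molecules that have at least one atom within 5 Å of a lysine in the selected chain. It is not PreLysCar's own code: the function names are hypothetical, the distance is measured from the lysine NZ atom (an assumption), and only the standard fixed-column layout of PDB `ATOM`/`HETATM` records is relied on.

```python
import math
from collections import Counter

def parse_atoms(pdb_lines):
    """Read ATOM/HETATM records using the fixed PDB column layout."""
    atoms = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            atoms.append({
                "atom": line[12:16].strip(),    # atom name, columns 13-16
                "res": line[17:20].strip(),     # residue name, columns 18-20
                "chain": line[21],              # chain identifier, column 22
                "resseq": int(line[22:26]),     # residue number, columns 23-26
                "xyz": (float(line[30:38]),     # x, y, z in Angstroms
                        float(line[38:46]),
                        float(line[46:54])),
            })
    return atoms

def neighbors_of_lysines(atoms, chain, cutoff=5.0):
    """Map each lysine residue number to a Counter of nearby residue names."""
    result = {}
    for lys in atoms:
        # Assumption: distances are measured from the side-chain NZ atom.
        if lys["chain"] != chain or lys["atom"] != "NZ":
            continue
        if lys["res"] not in ("LYS", "KCX"):
            continue
        own = (lys["chain"], lys["resseq"], lys["res"])
        nearby = set()
        for other in atoms:
            key = (other["chain"], other["resseq"], other["res"])
            if key != own and math.dist(lys["xyz"], other["xyz"]) <= cutoff:
                nearby.add(key)  # count each neighboring residue once
        result[lys["resseq"]] = Counter(res for _, _, res in nearby)
    return result
```

A call such as `neighbors_of_lysines(parse_atoms(open("3U4F.pdb")), "A")` would mirror the example options above, assuming the file is present locally.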

Output: a list of predicted carboxylated lysine residues (*p*KCX), if any.

In Bayesian probability, the prior probability represents one's uncertainty about the probability of an event before some evidence is taken into account.

p = 0.5 Carboxylated lysine residues are expected to be as prevalent as unmodified lysine residues. Running PreLysCar under this unrealistic assumption might result in more false positives (lysine residues incorrectly predicted as carboxylated), i.e., lower specificity.

p = 0.0001 Lysine carboxylation is expected to occur very rarely. Running PreLysCar with this prior probability might result in lower sensitivity (fewer true carboxylated lysine residues predicted as carboxylated), but almost perfect specificity (virtually no false positives).

p = 0.009 This value gave the best performance on the training data set (see reference).
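The effect of the prior can be illustrated with Bayes' rule in odds form: posterior odds = prior odds × likelihood ratio. The sketch below holds the evidence fixed and varies only the prior. The likelihood ratio of 100 is an arbitrary illustrative value, not a quantity computed by PreLysCar, and `posterior_from_prior` is a hypothetical helper.

```python
def posterior_from_prior(prior, likelihood_ratio):
    """Update a prior probability with a fixed likelihood ratio (odds form)."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Same (made-up) evidence, three different priors.
for prior in (0.5, 0.009, 0.0001):
    posterior = posterior_from_prior(prior, likelihood_ratio=100.0)
    print(f"prior={prior}  posterior={posterior:.4f}")
```

For identical evidence, a larger prior always yields a larger posterior, which is why p = 0.5 tends to produce more positive predictions and p = 0.0001 fewer.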

Results on the training data set indicate that PreLysCar can provide predictions with robust positive and negative predictive values (93.1% and 99.4%, respectively; see reference); that is, the odds of correctly predicting both KCX and LYS sites are high. It also obtained excellent sensitivity (~87%) and specificity (~100%).
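For readers less familiar with these four measures, the following sketch shows how they are derived from a confusion matrix. The counts used in the example are invented for illustration and are not the PreLysCar training results; `classification_metrics` is a hypothetical helper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification measures from raw counts.

    tp: KCX predicted as KCX      fp: LYS predicted as KCX
    tn: LYS predicted as LYS      fn: KCX predicted as LYS
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of true KCX sites recovered
        "specificity": tn / (tn + fp),  # fraction of true LYS sites recovered
        "ppv": tp / (tp + fp),          # fraction of KCX predictions that are right
        "npv": tn / (tn + fn),          # fraction of LYS predictions that are right
    }

# Invented counts, for illustration only.
metrics = classification_metrics(tp=8, fp=2, tn=88, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe how well true sites are recovered, while the predictive values describe how much a given prediction can be trusted.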

A positive peak in the Fo-Fc map at the tip of the side chain of the predicted carboxylated lysine residue could be considered a strong supporting indication of the reliability of the prediction. However, a careful structural analysis of its microenvironment is highly recommended. If the result of the analysis supports the carboxylation, further experimental procedures for its validation should be performed.

PreLysCar is also available online at the following URL: http://tanto.bioe.uic.edu/prelyscar/

Please cite: