APL: Angle Probability List

The Angle Probability List (APL) represents the normalized frequency of observed pairs of amino acid residues and secondary structure in the Protein Data Bank (PDB). It combines the conformational preferences of amino acid residues (AA, torsion angles) in proteins with their secondary structure information (SS). We selected a set of 6,650 protein structures from the PDB. All 3-D protein structures were experimentally determined by X-ray diffraction with resolution ≤ 2.0 Å and deposited in the PDB up to December 2014. We removed all structures with an R-factor greater than 0.2. If homologous protein chains with sequence identity at most 30% were found, only one of them was retained. We selected only amino acid residues with B-factor ≤ 30 Å² and occupancy equal to 1. Similar parameters for filtering PDB data were used previously by Hovmoller and Ohlson (2002). For more details, please contact us: mdorn[at]inf.ufrgs.br.


Please cite APL as shown below:

BORGUESAN, B.; BARBACHAN e SILVA, M.; GRISCI, B. I.; INOSTROZA-PONTA, M.; DORN, M. APL: an Angle Probability List to improve knowledge-based metaheuristics for the three-dimensional protein structure prediction. Computational Biology and Chemistry (Print), v. 59, p. 142-157, 2015.

The 'APL.zip' file contains all APLs, arranged by the 20 standard amino
acid residues (3-letter code) and the eight secondary structure codes
assigned by STRIDE.
             Amino Acid      3-Letter-Code
             Alanine         Ala
             Arginine        Arg
             Asparagine      Asn
             Aspartic acid   Asp
             Cysteine        Cys
             Glutamic acid   Glu
             Glutamine       Gln
             Glycine         Gly
             Histidine       His
             Isoleucine      Ile
             Leucine         Leu
             Lysine          Lys
             Methionine      Met
             Phenylalanine   Phe
             Proline         Pro
             Serine          Ser
             Threonine       Thr
             Tryptophan      Trp
             Tyrosine        Tyr
             Valine          Val

             Secondary Structure         1-Letter-Code
             Alpha helix                 H
             3-10 helix                  G
             PI-helix                    I
             Extended conformation       E
             Isolated bridge             B or b
             Turn                        T
             Coil (none of the above)    C

For example, the file ALA_H_histogram.dat holds the APL for the amino acid
Alanine (ALA) with alpha-helix (H) secondary structure.
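
The naming convention above can be checked programmatically. The following is a minimal sketch, assuming every file follows the RESIDUE_SS_histogram.dat pattern of the example; the helper name split_apl_name and the 'APL/' directory are hypothetical and not part of the distribution.

```python
from pathlib import Path

# The 20 residue codes and the STRIDE codes listed in the tables above.
RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLU", "GLN", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
SS_CODES = {"H", "G", "I", "E", "B", "b", "T", "C"}


def split_apl_name(path):
    """Return (residue, ss_code) for a name like 'ALA_H_histogram.dat'."""
    # Hypothetical helper: assumes the RESIDUE_SS_histogram.dat pattern.
    residue, ss_code, suffix = Path(path).name.split("_", 2)
    if suffix != "histogram.dat":
        raise ValueError(f"not an APL histogram file: {path}")
    if residue.upper() not in RESIDUES or ss_code not in SS_CODES:
        raise ValueError(f"unknown residue or secondary structure: {path}")
    return residue.upper(), ss_code


print(split_apl_name("APL/ALA_H_histogram.dat"))  # ('ALA', 'H')
```

The secondary structure code is compared case-sensitively because STRIDE distinguishes 'B' from 'b'.
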
Each line of a *_histogram.dat file has four fields:
PHI PSI Frequency {OMEGA}

When the amino acid residue has CHI (side-chain) angles, they are appended after {OMEGA}:
PHI PSI Frequency {OMEGA} {CHI}'s

Examples:
PHI        PSI       Freq.    {OMEGA}
-94.000000 11.000000 0.000010 {0: [169.7]}
PHI        PSI       Freq.    {OMEGA}       {CHI-1}       {CHI-2}
-99.000000 11.000000 0.000240 {0: [-176.0]} {0: [-171.3]} {0: [48.1]}

When a (PHI, PSI) pair is observed more than once, the {OMEGA} and {CHI} lists grow accordingly.
PHI        PSI       Freq.    {OMEGA}
-95.000000 20.000000 0.000019 {0: [172.6, 173.5]}
PHI        PSI        Freq.    {OMEGA}               {CHI-1}             {CHI-2}
-95.000000 -16.000000 0.000480 {0: [-175.5, -159.3]} {0: [-68.3, -66.1]} {0: [-51.5, -33.8]}

When a (PHI, PSI) pair is observed more than once but the values of {OMEGA}
or {CHI} differ, they are split into numbered sub-groups (0, 1, ...).
PHI        PSI       Freq.    {OMEGA}
-95.000000 -5.000000 0.000019 {0: [-179.1], 1: [172.0]}
PHI        PSI        Freq.    {OMEGA}               {CHI-1}             {CHI-2}
-95.000000 -11.000000 0.000480 {0: [-176.6, -163.6]} {0: [-79.8, -74.1]} {0: [-23.1], 1: [131.0]}
PHI        PSI      Freq.    {OMEGA}                   {CHI-1}             {CHI-2}
-91.000000 8.000000 0.000480 {0: [-179.2], 1: [173.4]} {0: [-76.7, -75.1]} {0: [-13.8, -9.4]}
PHI        PSI        Freq.    {OMEGA}                   {CHI-1}                   {CHI-2}
-71.000000 -24.000000 0.000480 {0: [-176.1], 1: [175.3]} {0: [-170.2], 1: [-68.8]} {0: [-36.1], 1: [51.2]}
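
The line layout shown in the examples above can be read with a short parser. This is a minimal sketch, not an official reader: it assumes fields are separated by whitespace, that each braced group is a valid Python dict literal as in the samples, and that any header line (PHI PSI Freq. ...) should be skipped. The names AplEntry and parse_apl_line are hypothetical.

```python
import ast
import re
from typing import NamedTuple


class AplEntry(NamedTuple):
    phi: float
    psi: float
    freq: float
    omega: dict  # sub-group index -> list of OMEGA angles
    chis: list   # one dict per CHI angle, same layout as omega


# Matches one braced group such as {0: [-23.1], 1: [131.0]}.
_GROUP = re.compile(r"\{[^{}]*\}")


def parse_apl_line(line):
    """Parse one data line; return None for blank or header lines."""
    parts = line.split(None, 3)
    if len(parts) < 4:
        return None
    try:
        phi, psi, freq = (float(p) for p in parts[:3])
    except ValueError:
        return None  # non-numeric line, e.g. a column header
    groups = [ast.literal_eval(g) for g in _GROUP.findall(parts[3])]
    if not groups:
        return None
    # First group is OMEGA; any remaining groups are CHI-1, CHI-2, ...
    return AplEntry(phi, psi, freq, groups[0], groups[1:])


sample = ("-95.000000 -11.000000 0.000480 {0: [-176.6, -163.6]} "
          "{0: [-79.8, -74.1]} {0: [-23.1], 1: [131.0]}")
entry = parse_apl_line(sample)
print(entry.phi, entry.psi, entry.freq)  # -95.0 -11.0 0.00048
print(entry.chis[1])                     # {0: [-23.1], 1: [131.0]}
```

Using ast.literal_eval keeps the sub-group keys as integers and the angles as floats without evaluating arbitrary code.
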