sugar.data package

sugar.data – Use genetic code and substitution matrices

For reference, the IUPAC nucleotide codes:

IUPAC nucleotide code    Base
A                        Adenine
C                        Cytosine
G                        Guanine
T (or U)                 Thymine (or Uracil)
R                        A or G
Y                        C or T
S                        G or C
W                        A or T
K                        G or T
M                        A or C
B                        C or G or T
D                        A or G or T
H                        A or C or T
V                        A or C or G
N                        any base
. or -                   gap
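
The table above can be written down as a plain mapping, which is handy for expanding ambiguous sequences. This is a minimal sketch that does not use sugar itself: `IUPAC_NT` and `expand` are hypothetical names, U is treated as T by assumption, and gap characters are not handled.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes, transcribed from the table above.
# Assumption: U is mapped to T; gap characters ('.' and '-') are left out.
IUPAC_NT = {
    'A': 'A', 'C': 'C', 'G': 'G', 'T': 'T', 'U': 'T',
    'R': 'AG', 'Y': 'CT', 'S': 'CG', 'W': 'AT', 'K': 'GT', 'M': 'AC',
    'B': 'CGT', 'D': 'AGT', 'H': 'ACT', 'V': 'ACG', 'N': 'ACGT',
}


def expand(seq):
    """Return all unambiguous sequences matching an IUPAC string (hypothetical helper)."""
    choices = (IUPAC_NT[c] for c in seq.upper())
    return [''.join(p) for p in product(*choices)]


print(expand('AR'))  # ['AA', 'AG']
```

Note that the number of expansions grows quickly: a run of n `N` characters yields 4**n sequences.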

And the amino acid codes:

IUPAC amino acid code    Three letter code    Amino acid
A                        Ala                  Alanine
C                        Cys                  Cysteine
D                        Asp                  Aspartic Acid
E                        Glu                  Glutamic Acid
F                        Phe                  Phenylalanine
G                        Gly                  Glycine
H                        His                  Histidine
I                        Ile                  Isoleucine
K                        Lys                  Lysine
L                        Leu                  Leucine
M                        Met                  Methionine
N                        Asn                  Asparagine
P                        Pro                  Proline
Q                        Gln                  Glutamine
R                        Arg                  Arginine
S                        Ser                  Serine
T                        Thr                  Threonine
V                        Val                  Valine
W                        Trp                  Tryptophan
Y                        Tyr                  Tyrosine

sugar.data.gcode(tt=1)

Return a genetic code object

Parameters:

tt – number of the translation table (default: 1)

>>> from sugar.data import gcode
>>> gc = gcode()
>>> gc.tt['TAG']
'*'
>>> gc.starts
{'ATG', 'CTG', 'TTG'}
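
As the example shows, `gc.tt` can be indexed by codon. A mapping of that shape is enough to translate a sequence by hand. The following is a minimal sketch, not sugar's own translation routine: `translate` and `TT_EXCERPT` are hypothetical names, and the small excerpt of the standard genetic code only keeps the example runnable without sugar installed.

```python
# Excerpt of the standard genetic code (translation table 1).
# Assumption: with sugar installed, gcode().tt could be passed instead.
TT_EXCERPT = {'ATG': 'M', 'GCT': 'A', 'AAA': 'K', 'TAG': '*'}


def translate(seq, tt):
    """Translate the complete codons of seq with the codon mapping tt.

    Hypothetical helper; a trailing incomplete codon is ignored.
    """
    end = len(seq) - len(seq) % 3
    codons = (seq[i:i + 3] for i in range(0, end, 3))
    return ''.join(tt[codon] for codon in codons)


print(translate('ATGGCTAAATAG', TT_EXCERPT))  # MAK*
```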

sugar.data.scale_submat(sm, scale=1)

Return scaled substitution matrix

The matrix values are divided by the sum of all entries and multiplied by the scale factor.

Parameters:

sm – substitution matrix to scale

scale – scale factor (default: 1)

Warning

It is not clear if this function is useful. It might be removed in a later version of sugar without further notice.
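
The scaling described above can be illustrated in a few lines. This is a sketch of the stated rule only, not the actual implementation of `scale_submat`; it assumes the matrix is a dict of dicts, and `scale_sketch` and the toy values are made up for the example.

```python
def scale_sketch(sm, scale=1):
    """Divide every entry by the sum of all entries, then multiply by scale.

    Hypothetical helper; assumes sm is a dict of dicts of numbers.
    """
    total = sum(v for row in sm.values() for v in row.values())
    return {a: {b: v / total * scale for b, v in row.items()}
            for a, row in sm.items()}


# Toy matrix with made-up values; all entries sum to 8.
toy = {'A': {'A': 4, 'R': -1}, 'R': {'A': -1, 'R': 6}}
scaled = scale_sketch(toy, scale=2)
print(scaled['A']['A'])  # 4 / 8 * 2 = 1.0
```

After scaling, the entries sum to the scale factor. The sketch does not guard against a matrix whose entries sum to zero.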

sugar.data.submat(fname)

Return substitution matrix as a dict of dicts

>>> from sugar.data import submat
>>> bl = submat('blosum62')
>>> bl['A']['A']
4

Parameters:

fname – One of the following values: BLOSUM100, BLOSUM100.50, BLOSUM30, BLOSUM30.50, BLOSUM35, BLOSUM35.50, BLOSUM40, BLOSUM40.50, BLOSUM45, BLOSUM45.50, BLOSUM50, BLOSUM50.50, BLOSUM55, BLOSUM55.50, BLOSUM60, BLOSUM60.50, BLOSUM62, BLOSUM62.50, BLOSUM65, BLOSUM65.50, BLOSUM70, BLOSUM70.50, BLOSUM75, BLOSUM75.50, BLOSUM80, BLOSUM80.50, BLOSUM85, BLOSUM85.50, BLOSUM90, BLOSUM90.50, BLOSUMN, BLOSUMN.50, DAYHOFF, GONNET, IDENTITY, MATCH, NUC, NUC.4.2, NUC.4.4, PAM10, PAM100, PAM110, PAM120, PAM120H, PAM130, PAM140, PAM150, PAM160, PAM160H, PAM170, PAM180, PAM190, PAM20, PAM200, PAM200H, PAM210, PAM220, PAM230, PAM240, PAM250, PAM250H, PAM260, PAM270, PAM280, PAM290, PAM30, PAM300, PAM310, PAM320, PAM330, PAM340, PAM350, PAM360, PAM370, PAM380, PAM390, PAM40, PAM400, PAM40H, PAM410, PAM420, PAM430, PAM440, PAM450, PAM460, PAM470, PAM480, PAM490, PAM50, PAM500, PAM60, PAM70, PAM80, PAM80H, PAM90. Or use your own file.
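
Because the matrix is a dict of dicts, scoring two aligned sequences reduces to a sum of lookups. The sketch below is an illustration under assumptions: `score_pairs` is a hypothetical helper, and `BL62_EXCERPT` holds only four BLOSUM62 entries so that the example runs without sugar; the full matrix from `submat('blosum62')` would be used in the same way.

```python
# Four entries of BLOSUM62, for illustration only.
BL62_EXCERPT = {
    'A': {'A': 4, 'R': -1},
    'R': {'A': -1, 'R': 5},
}


def score_pairs(seq1, seq2, sm):
    """Sum substitution scores over two aligned, gap-free sequences.

    Hypothetical helper; sm is a dict of dicts as returned by submat.
    """
    if len(seq1) != len(seq2):
        raise ValueError('sequences must have equal length')
    return sum(sm[a][b] for a, b in zip(seq1, seq2))


print(score_pairs('ARA', 'AAR', BL62_EXCERPT))  # 4 + (-1) + (-1) = 2
```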