Rotamer libraries
Concept
As previously established in MMM, site-directed labeling is modelled in MMMx based on pre-computed rotamer libraries of the label side groups. Such libraries can be generated with classical atomic force fields via molecular dynamics simulations or Monte Carlo sampling of conformation space. Briefly, the label side group is represented by a moderate number, typically between 100 and 10000, of rotameric states and their associated reference populations for a state that is non-interacting with the protein. Upon attachment to a labeling site, interaction with the protein is computed for a static (ensemble) structure of the latter, assuming only non-bonded interaction via a Lennard-Jones potential parametrized in the Universal Force Field (UFF). Flexibility of the protein as well as of the label beyond rotameric states is roughly modelled by an f-factor (“forgive” factor) that scales the van-der-Waals radii of atoms in the underlying force field. A few libraries use an enhanced attraction term, i.e., the attractive part of the Lennard-Jones potential is multiplied by the f-factor and an enhancement factor. The default f-factor is 0.5 and the default enhancement factor is 1.0.
Attachment results in rotation and translation of the rotamer coordinates and in reweighting of populations, assuming a Boltzmann distribution at 298 K. Distributions of properties can then be computed by population-weighted averaging.
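The reweighting and averaging steps can be illustrated with a minimal sketch. It assumes that the attached populations are the reference populations multiplied by a Boltzmann factor of the attachment potential (in J/mol) and then normalized; all numerical values and variable names are hypothetical and do not come from an MMMx library.

```matlab
% Minimal sketch of Boltzmann reweighting at 298 K (hypothetical data).
gas_constant = 8.314462618;          % J/(mol K)
temperature = 298;                   % K
ref_pop = [0.5; 0.3; 0.2];           % reference populations of 3 rotamers
potentials = [0; 2000; -1000];       % assumed attachment potentials (J/mol)
positions = [10 0 0; 12 1 0; 9 0 2]; % hypothetical label positions, one row per rotamer

% Boltzmann-weighted, not yet normalized populations
weights = ref_pop .* exp(-potentials/(gas_constant*temperature));
partition_function = sum(weights);
populations = weights/partition_function;

% Population-weighted average of a property, here the label position
mean_position = populations.' * positions;
```

Rotamers with a favourable (negative) potential gain population relative to the reference state, whereas rotamers that clash with the protein lose population.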
A number of features in MMM were used only in method development and have not been used for years. These features are deprecated in MMMx. The concept of a “library temperature” has been deprecated as well, because libraries computed for the glass transition temperature of the medium were found to perform worse than those computed for ambient temperature (298 K).
Implementation
The labeling concept is implemented by providing a set of rotamer libraries, a package for generating additional libraries for new labels, and a function get_label for retrieving attributes of labels, such as distributions of label position and orientation.
Labeling is dynamic and virtual. Dynamic labeling means that a label L at residue R is computed at the time when the user of an entity first tries to retrieve attributes of this label at this particular site. The label attributes are then stored at residue level, whereas no atomic coordinates are generated at entity level (virtual labeling). Explicit coordinates of all atoms are generated only for clash tests, for saving label rotamers to a PDB file, for visualizing the label in all-atom graphics, or when requested. Once computed, the explicit coordinates are stored in the atom coordinate array of the entity, while their indices are stored in the labels field of the labelled residue.
The following chemical modifications are implicit (they do not need to be performed before labeling):
- mutation of any amino acid to cysteine or an unnatural amino acid for labeling
- conversion of uracil to 4-thiouracil
- conversion of phosphates to thiophosphates
The get_label function
Use this function if you want to label a residue with a particular label or want to retrieve attributes of a previously computed label.
argsout = get_label(entity,label,attributes)
[argsout,exceptions] = get_label(entity,label,attributes)
argsout = get_label(entity,label,attributes,address)
[argsout,exceptions] = get_label(entity,label,attributes,address)
- Parameters
entity - entity in MMMx:atomic format
label - label name
attributes - either a string (see table below) or a cell array of strings
address - MMMx residue address
- Returns
argsout - output arguments (M-element cell array for a single attribute)
exceptions - error message (1-element cell array)
Either a rotamer library for label must be registered with MMMx (see below) or the label name must be atom.<atname>, where <atname> is an existing atom name for this residue.
Requesting several attributes for the same site (residue) simultaneously can lead to a significant speedup in large entities. The gain in speed may be very large if the labels were already precomputed. To do this, arrange all attributes in a cell array of strings and split the output cell array:
site = sprintf('{%i}%s',conformer_number,residue);
[argsout,entity,exceptions] = get_label(entity,label,{'positions','populations'},site);
positions = argsout{1}{1};
populations = argsout{2}{1};
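Once positions and populations are available for two sites, a distance distribution between the labels can be obtained by population-weighted averaging over all rotamer pairs. The following is a minimal sketch under stated assumptions: the small arrays stand in for get_label output at two sites, coordinates are assumed to be in Ångström, and the variable names and the 1 Å bin width are hypothetical choices.

```matlab
% Stand-ins for positions (R,3) and populations (R,1) at two labelled sites
positions1 = [0 0 0; 1 0 0];
populations1 = [0.6; 0.4];
positions2 = [30 0 0; 30 4 0];
populations2 = [0.5; 0.5];

% All pairwise distances (R1 x R2); uses implicit expansion (R2016b or later)
delta = permute(positions1,[1 3 2]) - permute(positions2,[3 1 2]);
distances = sqrt(sum(delta.^2,3));

% The weight of each rotamer pair is the product of the two populations
pair_weights = populations1 * populations2.';

% Population-weighted mean distance
mean_distance = sum(distances(:).*pair_weights(:))/sum(pair_weights(:));

% Histogram with 1 Angstroem bins between 0 and 80 Angstroem
edges = 0:1:80;
bins = discretize(distances(:),edges);
distribution = accumarray(bins,pair_weights(:),[numel(edges)-1, 1]);
distribution = distribution/sum(distribution);
```

The outer product of the two population vectors treats the rotamer distributions at the two sites as independent, which is an assumption of this sketch.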
Attributes
| Variable | Explanation | Type |
|---|---|---|
| | rotamer library information | struct |
| | MMMx internal three-letter code | string |
| | label class, for instance nitroxide | string |
| | SMILES string representing label structure | string |
| | f-factor and attraction enhancement factor | (1,2) double |
| | reference populations for R rotamers | (R,1) double |
| | site address | string |
| | attachment partition function | double |
| | attachment potentials (J/mol) | (R,1) double |
| | T sidechain torsions for R rotamers | (R,T) double |
| orientations | Euler angles of molecular frame | (R,3) double |
| populations | populations for R rotamers | (R,1) double |
| positions | positions for R rotamers | (R,3) double |
| | numbers of the rotamers in the library | (R,1) int |
| | full atom coordinates of R rotamers | (R,1) cell |
| | affine matrix that transforms from the standard frame to the site frame | (4,4) double |
Note that SMILES strings for nitroxides tend to be interpreted as the corresponding hydroxylamines by some programs, notably by ChemDraw. For libraries that contain several stereoisomers, the SMILES string refers to only one of them.
The attribute orientations can be used for simulating orientation selection in pulsed dipolar spectroscopy for spin labels or averaging for FRET chromophores. In order to compute unit vectors along the axes of the label molecular frame in the entity frame (direction cosine matrix DCM, the unit vectors are matrix rows) for rotamer r, use the code (example):
entity = get_pdb('2lzm'); % load structure of T4 Lysozyme with ID 2lzm from PDB server
[argout,exceptions] = get_label(entity,'mtsl','orientations','(A)131'); % get MTSL label orientation at residue 131
orientations = argout{1}; % extract from cell output
r = 1;
DCM = Euler2DCM(orientations(r,:)); % compute direction cosine matrix
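Because the rows of the direction cosine matrix are the unit vectors of the molecular frame, individual axes can be extracted by row indexing. The sketch below does not call MMMx; a simple rotation about the z axis stands in for the output of Euler2DCM, and the reference direction is a hypothetical choice.

```matlab
% Stand-in for a direction cosine matrix: rotation by 30 degrees about z.
% Rows are the molecular-frame unit vectors expressed in the entity frame.
phi = 30*pi/180;
DCM = [ cos(phi), sin(phi), 0; ...
       -sin(phi), cos(phi), 0; ...
        0,        0,        1];

x_axis = DCM(1,:); % molecular x axis in the entity frame
y_axis = DCM(2,:); % molecular y axis in the entity frame
z_axis = DCM(3,:); % molecular z axis in the entity frame

% Angle (degrees) between the molecular x axis and a hypothetical
% reference direction given in the entity frame
direction = [1, 0, 0];
direction = direction/norm(direction);
theta = acosd(dot(x_axis,direction));
```

A valid direction cosine matrix is orthonormal, so DCM*DCM.' should equal the identity matrix within numerical precision.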
Set of libraries
MMMx uses a single type of rotamer libraries, built by hierarchical clustering of Monte-Carlo generated conformer ensembles. Ensemble generation assumes torsion potentials and the non-bonded interaction potential of UFF.
For the following labels, rotamer libraries are provided with MMMx:
| Label | Synonyms | Class | Attachment | Rotamers | |
|---|---|---|---|---|---|
| | | nitroxide | cysteine | 216 | 1 |
| | | nitroxide | cysteine | 216 | 1 |
| | | nitroxide | cysteine | 72 | 1 |
| | | nitroxide | cysteine | 108 | 1 |
| | | nitroxide | cysteine | 240 | 1 |
| | | nitroxide | cysteine | 108 | 2.0 |
| | | nitroxide | cysteine | 216 | 1 |
| | | nitroxide | cysteine | 216 | 1 |
| | | nitroxide | cysteine | 2461 | 2.0 |
| | | nitroxide | cysteine | 1367 | 2.0 |
| | | nitroxide | 4-thiouracil | 72 | 1 |
| | | nitroxide | 4-thiouracil | 192 | 1 |
| | | nitroxide | 5’-thiophosphate | 576 | 1 |
| | | nitroxide | thiophosphate | 360 | 1 |
| | | nitroxide | 5’-thiophosphate | 2048 | 1 |
| | | nitroxide | 3’-thiophosphate | 512 | 1 |
| | | nitroxide | unnatural aa | 288 | 1 |
| | | nitroxide | unnatural aa | 1024 | 1 |
| | | nitroxide | unnatural aa | 1024 | 1 |
| | | nitroxide | tyrosine | 128 | 1 |
| | | nitroxide | tyrosine | 256 | 1 |
| | | nitroxide | cofactor | 144 | 1 |
| | | gadolinium | cysteine | 648 | 1 |
| | | gadolinium | cysteine | 2430 | 1 |
| | | gadolinium | cysteine | 180 | 1 |
| | | gadolinium | cysteine | 180 | 1 |
| | | gadolinium | cysteine | 1944 | 1 |
| | | gadolinium | cysteine | 432 | 1 |
| | | trityl | cysteine | 3888 | 1 |
| | | chromophore | cysteine | 1024 | 1 |
| | | chromophore | cysteine | 1024 | 1 |
| | | histidine | any amino acid | 12 | 1 |
Label names (three-letter codes) and synonyms are case-insensitive. Note that gadolinium labels are sufficiently good approximations for other lanthanide labels with the same ligand, for instance, for pseudo-contact shift (PCS) and paramagnetic relaxation enhancement (PRE) computations.
Rotamer library format
Rotamer libraries are stored in a binary Matlab format as a struct variable rot_lib. The fields are defined as follows:
| Field | Content | Type |
|---|---|---|
| | three-letter code | string |
| | S synonyms for the label name | (1,S) cell string |
| | SMILES string defining the structure | string |
| | information on R rotamers (index r) | (1,R) struct |
| | Cartesian coordinates of A atoms | (A,3) double |
| | values of T torsion angles for R rotamers | (R,T) double |
| | atomic numbers for A atoms | (A,1) uint8 |
| | R populations for the non-attached rotamers | (R,1) double |
| | P atom numbers and densities that define the label position | (P,2) double |
| | structure element to which the label can be attached, for instance | string |
| | first side chain atom number, for instance | int |
| | “forgive” factor and attraction enhancement | (1,2) double |
| | A atom names | (1,A) cell string |
| | atoms that define the standard frame for attachment: origin, atom on x axis, atom in xy plane | (1,3) int |
| | atom types of the standard frame | (1,3) cell string |
| | atoms that define the label molecular frame: origin, atom on x axis, atom in xy plane | (1,3) int |
| | atom types of the standard frame | (1,3) cell string |
| | label class, for instance nitroxide | string |
| | definition of T torsion angles | (T,4) int |
| | bonding information for up to B bonds for A atoms | (A,B) int |
| | force field for protein attachment, usually | string |
| | pseudo-temperature factors for N atoms in R rotamers | (N,R) double |
| | method for library generation, for instance | string |
| | force field used in library generation, for instance | string |
| | A atom type numbers for the used force field | (A,1) uint16 |
| | solvation assumed in library generation, usually | string |
| | number of trials in a prerun of the MMMx native UFF Monte Carlo rotamer library generator | int |
| | true if hydrogen atoms were neglected in library generation, not recommended | boolean |
| | thresholds for conformer acceptance in the MMMx native generator | double |
| | minimum strain energy (kcal/mol) encountered | double |
| | maximum distance of the label position from the backbone (CA atom or origin of attachment) frame | double |
| | RGB color triplet (fraction) for display | (1,3) double |
| | sphere radius (Angstroem) corresponding to 100% rotamer population for display | double |
| | force field parameters for attachment | struct |
| | van-der-Waals radii for attachment indexed by atomic number | (1,103) double |
| | Lennard-Jones potentials for attachment indexed by atomic number | (1,103) double |
| | atom type tags for force field | string |
The pseudo-temperature factors relate to the variation of atom positions within the cluster of conformers that was projected onto a single rotamer.
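How such a factor could be derived from positional variation within a cluster can be sketched as follows. This section does not state the exact definition used by MMMx, so the sketch assumes the crystallographic relation for an isotropic temperature factor, B = (8π²/3)⟨Δr²⟩, where ⟨Δr²⟩ is the mean square deviation of the atom position from its cluster mean. The coordinates and variable names are hypothetical.

```matlab
% Hypothetical coordinates (Angstroem) of one atom in the 5 conformers
% of a cluster that was projected onto a single rotamer
cluster_coor = [1.0  0.0  0.0; ...
                1.2  0.1  0.0; ...
                0.8 -0.1  0.0; ...
                1.0  0.0  0.3; ...
                1.0  0.0 -0.3];

% Deviation of each conformer from the cluster mean position
% (implicit expansion, R2016b or later)
mean_coor = mean(cluster_coor,1);
deviation = cluster_coor - mean_coor;

% Mean square deviation (Angstroem^2) over the cluster
msd = mean(sum(deviation.^2,2));

% Isotropic pseudo-temperature factor (Angstroem^2), assumed relation
B_factor = 8*pi^2/3*msd;
```

Atoms far from the attachment point typically vary more within a cluster and would therefore receive larger factors under this assumption.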