Control files
Concept
Modelling with MMMx is controlled by an .mcx file:
MMMx control_file.mcx
If the control file does not contain a # log directive, output is written to the console. This is discouraged.
If the # log directive is present, a log file with the same name as the control file and extension .log is generated.
If the control file also contains a # report directive, this log file is opened in the Windows Notepad application after completion. We recommend this when working with the executable, where it signals completion. When working from within Matlab, completion is signalled by the prompt in the Matlab command window.
If MMMx is called without an argument, it opens a file browser window for selecting a control file.
Further output files may be generated depending on the modules that are called in the control file.
The control file specifies a modelling task by a sequence of module calls, each having its own run options and list of restraints.
Syntax
Keywords are case-insensitive. Empty lines are allowed.
The MMMx control file consists of blocks corresponding to individual modules. Outside such blocks, a few directives are supported. Directives have the form # directive.
The # log and # report directives do not have arguments. # logfile specifies a user-selected name for the logfile.
Comments are initiated by % and can appear anywhere. The rest of the line after a % character is ignored.
Module blocks are opened by !module, where module is the name of the module. They are closed by .module.
If a new block is opened or the file ends before the closing line of a block, MMMx proceeds, but raises a warning.
The control file ends with a #end statement. If it is missing, MMMx raises a warning. All lines after an #end statement are ignored.
Keywords and restraint specifications of individual modules are described in section Modules of the documentation.
Keywords inside blocks must be indented by at least two spaces. If a keyword contains a block of further arguments, such as restraints, these argument lines must be indented by at least two spaces with respect to the keyword. Such a block must end with .keyword.
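The syntax rules above can be illustrated with a small Python sketch. This is not MMMx code; the function name parse_control_file, the returned data layout, the warning texts, and the keyword some_keyword are all hypothetical and serve only to show how comments, directives, module blocks, and the end statement interact.

```python
def parse_control_file(text):
    """Split control-file text into directives and module blocks (illustration only)."""
    directives, modules, warnings = [], [], []
    current = None   # module block that is currently open, if any
    ended = False
    for raw in text.splitlines():
        line = raw.split("%", 1)[0].rstrip()   # the rest of a line after % is ignored
        stripped = line.strip()
        if not stripped:
            continue                            # empty lines are allowed
        if stripped.startswith("#"):
            words = stripped[1:].split()
            if words and words[0].lower() == "end":
                ended = True
                break                           # lines after the end statement are ignored
            if words:
                directives.append((words[0].lower(), words[1:]))
        elif stripped.startswith("!"):
            if current is not None:             # new block opened before the old one was closed
                warnings.append("block '%s' was not closed" % current["name"])
                modules.append(current)
            current = {"name": stripped[1:].strip().lower(), "lines": []}
        elif current is not None and stripped.lower() == "." + current["name"]:
            modules.append(current)             # regular closing line of the block
            current = None
        elif current is not None:
            current["lines"].append(line)       # keep indentation for later keyword parsing
    if current is not None:                     # input ended inside a block
        warnings.append("block '%s' was not closed" % current["name"])
        modules.append(current)
    if not ended:
        warnings.append("missing #end statement")
    return directives, modules, warnings


sample = """\
# log
# logfile my_run.log   % hypothetical file name

!prepare               % used here only as an example module name
  some_keyword value   % placeholder, not a real MMMx keyword
.prepare
# end
anything after the end statement is ignored
"""
directives, modules, warnings = parse_control_file(sample)
```

The sketch proceeds with a warning when a block is left open, mirroring the behaviour described above, rather than aborting.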
Restraint specification
The following restraint format applies to all modules. There can be multiple blocks of restraints of the same type within the same module section. Please note that there exist additional module-specific restraint types that are explained in keyword specifications of the modules.
Distance distribution restraints
Keyword: ddr
Legacy keyword: deer
The distance unit is Angstroem.
Syntax:
ddr label_1 [label_2]
  site_1 site_2 <r> stdr  % example for (multi)Gaussian restraints
  site_1 site_2 -lb lower_bound -ub upper_bound  % example for lower-bound/upper-bound restraints
  site_1 site_2 @distribution_data  % example for parameter-free distributions
  site_1 site_2 <r> stdr @distribution_data  % example for providing Gaussian restraints and distribution data
.ddr
Each combination of label types at the two sites requires its own ddr block. If both sites are labeled with the same label, it is sufficient to specify it once.
The allowed label types label_1 and label_2 correspond to existing rotamer libraries.
The syntax atom.<atname> specifies an atom instead of a label; for instance, atom.CA specifies the CA atom of the addressed site.
The two labelled sites are specified by MMMx residue addresses site_1 and site_2 . These addresses must refer to either a residue in an entity passed to the module or to a residue generated by the module.
For a Gaussian restraint, the mean value <r> and standard deviation stdr need to be specified.
If you wish to use multi-Gaussian restraints, please specify the full distance distribution (@ syntax).
Note that some modules can process only single-Gaussian restraints.
Upper/lower bound restraints require the option specifiers -lb and -ub. These are hard restraints, i.e., any violation rejects the model.
Use distributions if you expect that some restraints may be violated.
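The three forms of a ddr argument line can be told apart by their tokens. The following Python sketch is only an illustration under stated assumptions: the function name parse_ddr_line, the dictionary keys, and the residue addresses in the usage lines are hypothetical, and the sketch says nothing about how MMMx itself parses these lines.

```python
def parse_ddr_line(line):
    """Classify one argument line of a ddr block (illustration only)."""
    tokens = line.split("%", 1)[0].split()      # drop the comment, split on white space
    restraint = {"site_1": tokens[0], "site_2": tokens[1]}
    rest = tokens[2:]
    numbers = []
    i = 0
    while i < len(rest):
        token = rest[i]
        if token.lower() == "-lb":              # hard lower bound
            restraint["lower_bound"] = float(rest[i + 1])
            i += 2
        elif token.lower() == "-ub":            # hard upper bound
            restraint["upper_bound"] = float(rest[i + 1])
            i += 2
        elif token == "@":                      # '@' separated from the file name by white space
            restraint["distribution"] = rest[i + 1]
            i += 2
        elif token.startswith("@"):             # '@' directly followed by the file name
            restraint["distribution"] = token[1:]
            i += 1
        else:                                   # plain numbers: mean and standard deviation
            numbers.append(float(token))
            i += 1
    if numbers:
        restraint["mean"], restraint["stdr"] = numbers
    return restraint


# Hypothetical addresses and values, for illustration only
gaussian = parse_ddr_line("(A)58 (A)131 45.0 5.0  % Gaussian restraint")
bounds = parse_ddr_line("(A)58 (A)131 -lb 20 -ub 60")
distribution = parse_ddr_line("(A)58 (A)131 @my_distribution")
```

A line that combines a Gaussian restraint with distribution data yields the mean, stdr, and distribution entries together.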
Parameter-free distributions require an ASCII data file with name distribution_data.dat (extension may be included) that has at least two columns. The first column is the distance axis (Angstroem units) and the second column is the probability per distance bin. We advise providing lower and upper confidence limits for the bin probabilities in columns 3 and 4.
White space between @ and distribution_data is allowed, but not required. Bin probabilities are automatically normalized to unity sum, where necessary. A distance axis in units of nm is assumed if the longest distance value is smaller than 20.
Note that you can provide any parametrized distance distribution by first converting it to a binned distance distribution and saving it as distribution_data.dat.
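As a minimal sketch of such a conversion, the Python code below bins a single Gaussian on a distance axis, normalizes the bin probabilities to unity sum, and formats the result as two-column text. All function names are hypothetical, the white-space column separator is an assumption about the ASCII format, and the nm rule is implemented exactly as stated above (longest distance smaller than 20).

```python
import math


def to_angstrom(r_axis):
    """Treat the axis as nm if its longest distance is smaller than 20."""
    if max(r_axis) < 20:
        return [10.0 * r for r in r_axis]
    return list(r_axis)


def normalize(probabilities):
    """Scale bin probabilities to unity sum."""
    total = sum(probabilities)
    return [p / total for p in probabilities]


def binned_gaussian(r_axis, mean, stdr):
    """Evaluate a Gaussian at the bin centres and normalize the bins."""
    raw = [math.exp(-0.5 * ((r - mean) / stdr) ** 2) for r in r_axis]
    return normalize(raw)


def two_column_text(r_axis, probabilities):
    """Format distance and probability columns (separator is an assumption)."""
    return "\n".join("%.2f %.6e" % (r, p) for r, p in zip(r_axis, probabilities))


r_axis = [float(r) for r in range(10, 81)]        # 10 to 80 Angstroem in 1 Angstroem steps
p = binned_gaussian(r_axis, mean=45.0, stdr=5.0)  # example values only
text = two_column_text(r_axis, p)                 # could be saved as distribution_data.dat
```

The same pattern applies to any parametrized model: evaluate it on the distance axis, normalize, and write the columns.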
Paramagnetic relaxation enhancement (PRE) restraints
Keyword: pre
The distance unit is Angstroem.
Syntax:
pre label atom [taui [taur [taus]]]
  site_1 site_2 ratio  % example for ratio Ipara/Idia
  site_1 site_2 -Gamma2 Gamma2  % example for transverse relaxation enhancement
.pre
Possible spin label types label correspond to existing rotamer libraries. If the entity has explicit protons, they can be specified by atom. Otherwise, the heavy atom should be specified, for instance for a backbone NH group; MMMx will then attempt to generate the proton position.
Correlation times can be provided as additional arguments. The defaults are 250 ps for the correlation time of internal motion (taui), 3 ns for global tumbling of the protein (taur), and 1 µs for spin label relaxation (taus).
The experimental restraints can be specified either by the intensity ratio between the paramagnetically and diamagnetically labelled sample or by the transverse relaxation enhancement rate Gamma2.
SAXS restraints
Small-angle x-ray scattering fits can be specified for the whole entity or for a subset of chains.
Syntax:
saxs saxs_data [sm] [-v3]
  chain_1 [chain_2 …]
.saxs
Usually, SAXS restraints are specified by a single line, giving only the file name of the SAXS data and optionally a maximum scattering vector sm up to which the data are fitted.
SAXS fitting in MMMx uses crysol of the ATSAS package. Option -v3 specifies that crysol3 is used instead.
It is possible to specify only a subset of the chains of an entity for SAXS fitting.
For this, saxs is used as a block restraint with a single additional line that specifies the included chains by MMMx chain addresses.
SANS restraints
Small-angle neutron scattering fits can be specified for the whole entity or for a subset of chains. If an experimental selection of part of the entity was made by contrast matching, it is better to specify the deuterium content in the buffer than to specify the selected chains.
Syntax:
sans sans_data [illres [D2O]]
  chain_1 [chain_2 …]
.sans
Usually, SANS restraints are specified by a single line, giving only the file name of the SANS data and optionally the name of a resolution file and the fraction of D2O in the solution. If the second argument is a number instead of a string, it is interpreted as D2O content and the resolution file is considered to be missing (a warning is raised).
SANS fitting in MMMx uses cryson of the ATSAS package.
It is possible to specify only a subset of the chains of an entity for SANS fitting.
For this, sans is used as a block restraint with a single additional line that specifies the included chains by MMMx chain addresses.
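The interpretation of the optional arguments on the sans line can be sketched in Python as follows. The function name parse_sans_args, the dictionary keys, the warning text, and the file names in the usage lines are hypothetical; the sketch only mirrors the rule that a numeric second argument is taken as D2O content.

```python
def _as_number(token):
    """Return the token as a float, or None if it is not a number."""
    try:
        return float(token)
    except ValueError:
        return None


def parse_sans_args(args):
    """Interpret the arguments that follow the sans keyword (illustration only)."""
    result = {"data": args[0], "illres": None, "d2o": None, "warnings": []}
    if len(args) > 1:
        value = _as_number(args[1])
        if value is not None:
            # A number in second position is read as D2O content;
            # the resolution file is then considered to be missing.
            result["d2o"] = value
            result["warnings"].append("no resolution file specified")
        else:
            result["illres"] = args[1]
            if len(args) > 2:
                result["d2o"] = float(args[2])
    return result


# Hypothetical file names, for illustration only
with_resolution = parse_sans_args(["sans_curve.dat", "resolution.res", "0.42"])
without_resolution = parse_sans_args(["sans_curve.dat", "0.42"])
```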
Crosslink restraints
Crosslink restraints can be specified as a fraction of potentially crosslinkable residue pairs that are sufficiently close to be actually crosslinked. The maximum distance (in Angstroem) and fraction (0…1) apply to all crosslinks in one block.
Syntax:
crosslink maxdist fraction [atom_a [atom_b]]
  site_1_a site_1_b
  site_2_a site_2_b
  …
.crosslink
The distance is measured between CA atoms, unless the atom types in sites a and b of the crosslink are specified.
The crosslink line is followed by x lines specifying individual pairs of residues for which crosslinks were found.
The two linked sites are specified by MMMx residue addresses.
A conformer is rejected if more than fraction·x of the addressed atom pairs have a larger distance than maxdist.
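Read literally, this rejection rule amounts to counting the pairs that exceed maxdist and comparing the count with fraction·x. The Python sketch below implements that literal reading; the function name and the example distances are hypothetical, and the distances are assumed to have been computed beforehand for the x listed pairs.

```python
def reject_conformer(distances, maxdist, fraction):
    """Apply the rejection rule as stated in the text (illustration only).

    distances holds one distance in Angstroem per listed residue pair.
    """
    x = len(distances)
    too_long = sum(1 for d in distances if d > maxdist)
    return too_long > fraction * x


example_distances = [10.0, 30.0, 12.0, 40.0]   # made-up values for four pairs
kept = reject_conformer(example_distances, maxdist=20.0, fraction=0.5)
rejected = reject_conformer(example_distances, maxdist=20.0, fraction=0.25)
```

In the example, two of four pairs exceed 20 Angstroem, so the conformer passes with fraction 0.5 and is rejected with fraction 0.25.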
Immersion depth restraints
The depth of immersion of sites into a lipid bilayer can be specified with this restraint.
Syntax:
depth label [ox oy oz [dx dy dz]]
  site <r> fwhm  % example for Gaussian restraints
  site -lb lower_bound -ub upper_bound  % example for lower-bound/upper-bound restraints
.depth
The depth is measured as a distance from the center plane of the bilayer, i.e., large values correspond to low immersion depth or even positions outside the bilayer.
Possible label types label correspond to existing rotamer libraries.
The syntax atom.<atname> specifies an atom instead of a label; for instance, atom.CA specifies the CA atom of the addressed site.
For a Gaussian restraint, the mean value <r> and full width at half maximum fwhm need to be specified.
Upper/lower bound restraints require the option specifiers -lb and -ub. These are hard restraints.
Conformers are rejected if the simulated distance is outside the bounds. Use distributions if you expect that some restraints may be violated.
By default, the bilayer normal is assumed to be the z axis and the center plane is assumed to pass through z = 0 of the coordinate frame of the entity. It is possible to specify a point on the center plane by coordinates ox, oy, and oz and the direction of the bilayer normal by coordinates dx, dy, and dz.
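The geometry behind this restraint can be sketched in a few lines of Python. The sketch assumes that the depth is the unsigned distance of a point from the center plane, and it uses the standard relation between the full width at half maximum and the standard deviation of a Gaussian. The function names are hypothetical and the code is not taken from MMMx.

```python
import math


def immersion_depth(point, origin=(0.0, 0.0, 0.0), normal=(0.0, 0.0, 1.0)):
    """Unsigned distance of a point from the plane through origin with the given normal."""
    length = math.sqrt(sum(n * n for n in normal))   # the normal need not be a unit vector
    offset = sum((p - o) * n for p, o, n in zip(point, origin, normal))
    return abs(offset) / length


def fwhm_to_sigma(fwhm):
    """Convert the full width at half maximum of a Gaussian to its standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def within_bounds(depth, lower_bound, upper_bound):
    """Hard restraint: True if the depth lies inside the bounds."""
    return lower_bound <= depth <= upper_bound


default_depth = immersion_depth((1.0, 2.0, -7.0))    # default plane z = 0
shifted_depth = immersion_depth((1.0, 2.0, -7.0),
                                origin=(0.0, 0.0, 2.0), normal=(0.0, 0.0, 2.0))
```

With the default plane the example point has depth 7; moving the plane to z = 2 increases it to 9.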
Secondary structure and cis peptide propensities
The keywords alpha, beta, polypro, and cis allow you to specify the propensity of a residue to adopt α-helix, β-strand, polyproline-helix, or cis-peptide backbone torsion angles.
The following example is for α-helix propensities.
Syntax:
alpha
  site propensity  % example for a single site
  site_1-site_2 propensity  % example for a range of residues
.alpha
The sites site, site_1, and site_2 are specified by MMMx residue addresses and propensity is a value between 0 and 1. Use the range syntax with propensity 1 to strictly enforce secondary structure for a certain section of residues.
In ensemble building, specifying propensities is preferable to specifying the physical ensemble-mean restraints related to them (e.g. NMR chemical shifts and residual dipolar couplings), because it allows backbone torsion statistics to be adapted, which in turn improves sampling of suitable conformations.
In ensemble fitting, it is advisable to specify restraints as close as possible to primary experimental data.
Rigid bodies
In general, a rigid body can comprise one or more sections of one or more macromolecular chains, as well as cofactors or other ligands.
Builder modules, such as Rigi, may require that the complete chain behaves as a rigid body. Template entities have to be prepared to ensure this.
Syntax:
rigid chain_1 [chain_2 […]]
  refsite_1 reflabel_1  % origin
  refsite_2 reflabel_2  % point on x axis
  refsite_3 reflabel_3  % point in xy plane
.rigid
Rigid bodies are internally numbered in the sequence of the corresponding rigid blocks. This numbering is internal to a module.
The chain specifiers chain_1, chain_2, … are MMMx chain or residue-range addresses, such as (B).
If only part of a chain defines the rigid body, or if rigid bodies are derived from several structures, use keyword merge in module Prepare to generate a rigid-body file first.
The three reference sites refsite_1, refsite_2, and refsite_3 are obligatory and must not lie on a single line. They specify a local frame and can be used for computing rigid-body arrangements by distance geometry.
The labels reflabel_1, reflabel_2, and reflabel_3 either correspond to existing rotamer libraries or have the syntax atom.<atname>. The latter syntax specifies an atom; for instance, atom.CA specifies the CA atom of the reference site.
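How three reference points define a local frame can be sketched in Python. The construction below follows the roles given in the syntax comments (origin, point on the x axis, point in the xy plane) and assumes a right-handed frame; whether MMMx builds its frame in exactly this way is an assumption, and all function names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    if norm < 1e-9:
        raise ValueError("reference points coincide or lie on a single line")
    return tuple(x / norm for x in v)


def local_frame(p1, p2, p3):
    """Return origin and unit axes x, y, z from three reference points.

    p1 is the origin, p2 lies on the x axis, p3 lies in the xy plane.
    """
    x = _unit(_sub(p2, p1))
    z = _unit(_cross(x, _sub(p3, p1)))   # perpendicular to the plane of the three points
    y = _cross(z, x)                     # completes the right-handed frame
    return p1, x, y, z


# Made-up coordinates in Angstroem, for illustration only
origin, x_axis, y_axis, z_axis = local_frame((1.0, 1.0, 1.0), (3.0, 1.0, 1.0), (1.0, 4.0, 1.0))
```

The ValueError raised for collinear points shows why the three reference sites must not lie on a single line: the plane, and with it the z axis, would be undefined.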