
Action: CS2BACKBONE

Module isdb
Description: Calculates the backbone chemical shifts for a protein.
Usage: used in 0 tutorials, used in 6 eggs
Output value: the backbone chemical shifts (type: scalar)

Details and examples

Calculates the backbone chemical shifts for a protein.

The functional form is that of CamShift, which is discussed in the first paper cited below. The chemical shifts of the selected nuclei can be saved as components. Alternatively, one can calculate either the CAMSHIFT score (useful as a collective variable, as shown in the second paper cited below, or as a scoring function, as discussed in the third paper cited below) or a METAINFERENCE score (using DOSCORE). In these two latter cases, experimental chemical shifts must be provided.

A CS2BACKBONE calculation can be relatively heavy because it often involves a large number of atoms. It can be run in parallel using MPI and OpenMP (see here for more details).

As a general rule, when using CS2BACKBONE or other experimental restraints it may be better to increase the accuracy of the constraint algorithm due to the increased strain on the bonded structure. In the case of GROMACS it is safer to use lincs-iter=2 and lincs-order=6.

In general, the system for which chemical shifts are calculated must be completely included in ATOMS, and a TEMPLATE pdb file for the same atoms should be provided in the folder DATADIR. The system is made whole automatically unless NOPBC is used. In particular, if the system consists of multiple chains it is usually better to use NOPBC and make the molecule whole with WHOLEMOLECULES, selecting an appropriate order of the atoms. The pdb file is needed to generate a simple topology of the protein. For histidine residues in protonation states different from D, the HIE/HSE or HIP/HSP names should be used. GLH and ASH can be used for the alternative protonation of GLU and ASP. Non-standard amino acids and other molecules are not yet supported, but in principle they can be named UNK. If multiple chains are present, the chain identifier must be in the standard PDB format, together with the TER keyword at the end of each chain. Terminal groups such as ACE or NME should be removed from the TEMPLATE pdb because they are not recognized by CS2BACKBONE.

Atom indices in the TEMPLATE file should be numbered from 1 to N, where N is the number of atoms used in ATOMS. This is not a problem in simple cases where the atoms already run from 1 to N, but it requires care when a terminal group is removed from the PDB file.
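The renumbering step can be scripted. The following is a minimal sketch, not part of PLUMED: the function name `strip_caps_and_renumber` is hypothetical, and it assumes a fixed-column PDB file in which the atom serial occupies columns 7-11 and the residue name columns 18-20. It drops ACE and NME atoms and renumbers the remaining ATOM/HETATM records from 1. It only edits the template. Choosing the matching ATOMS selection in the PLUMED input remains a separate step.

```python
# Hypothetical helper: remove capping groups from a template PDB and
# renumber the remaining atom serials consecutively from 1.
CAP_RESIDUES = {"ACE", "NME"}


def strip_caps_and_renumber(pdb_lines):
    """Return a new list of PDB lines without ACE/NME atoms, serials 1..N.

    Assumes fixed-column PDB records: serial in columns 7-11 (line[6:11])
    and residue name in columns 18-20 (line[17:20]).
    Non-atom records (TER, END, REMARK, ...) are passed through unchanged.
    """
    out = []
    serial = 0
    for line in pdb_lines:
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            if line[17:20].strip() in CAP_RESIDUES:
                continue  # skip atoms that belong to a capping group
            serial += 1
            line = f"{line[:6]}{serial:5d}{line[11:]}"
        out.append(line)
    return out


if __name__ == "__main__":
    # File names here are placeholders chosen for illustration.
    with open("protein.pdb") as handle:
        cleaned = strip_caps_and_renumber(handle.read().splitlines())
    with open("template.pdb", "w") as handle:
        handle.write("\n".join(cleaned) + "\n")
```

After running such a script it is worth checking that the last serial in the template equals the number of atoms selected with ATOMS.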

In addition to a pdb file, one needs to provide a list of chemical shifts to be calculated, using one file per nucleus type (CAshifts.dat, CBshifts.dat, Cshifts.dat, Hshifts.dat, HAshifts.dat, Nshifts.dat). Add only the files for the nuclei you need, but make sure each file includes all protein residues. A chemical shift for a nucleus is calculated if a value greater than 0 is provided. For practical purposes the value can correspond to the experimental value. Residue numbers should match those used in the pdb file but must be positive, so double check the pdb. The first and last residue of each chain should be preceded by a # character.

CAshifts.dat:
#1 0.0
2 55.5
3 58.4
.
.
#last 0.0
#first of second chain
.
#last of second chain
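A file in this layout can be generated from a table of experimental values. The sketch below is only an illustration: the function name `format_shift_lines` is hypothetical, and it assumes, following the example above, that the first and last residue of a chain are written with a leading # and a value of 0.0, and that residues without data are written as 0.0 so that no shift is calculated for them.

```python
# Hypothetical helper: build the lines of one shift file (for example
# CAshifts.dat) for a single chain.
def format_shift_lines(first_res, last_res, shifts):
    """Return one line per residue from first_res to last_res inclusive.

    shifts maps residue number -> experimental chemical shift.
    Terminal residues get a leading '#'; missing residues get 0.0.
    """
    if first_res < 1:
        raise ValueError("residue numbers must be positive")
    lines = []
    for res in range(first_res, last_res + 1):
        if res in (first_res, last_res):
            lines.append(f"#{res} 0.0")
        else:
            lines.append(f"{res} {shifts.get(res, 0.0)}")
    return lines


if __name__ == "__main__":
    # Example values taken from the CAshifts.dat listing above.
    for line in format_shift_lines(1, 4, {2: 55.5, 3: 58.4}):
        print(line)
```

For a system with several chains, the lines of each chain would be concatenated in the same order as the chains appear in the template pdb.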

The default behavior is to store the values for the active nuclei in components (ca-#, cb-#, co-#, ha-#, hn-#, nh-# and expca-#, expcb-#, expco-#, expha-#, exphn-#, expnh-#), where # includes a chain and residue number. With NOEXP it is possible to store only the back-calculated values.

One additional file is always needed in the folder DATADIR: camshift.db. This file includes all the parameters needed to calculate the chemical shifts and can be found in regtest/isdb/rt-cs2backbone/data/ .
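Before starting a run it can help to check that DATADIR contains what is described above. The following is a small sketch under stated assumptions: the function name `check_datadir` is hypothetical, the template is assumed to be called template.pdb unless another name is passed, and a folder with no shift file at all is treated as a problem.

```python
from pathlib import Path

# Shift file names as listed in the documentation above.
SHIFT_FILES = (
    "CAshifts.dat", "CBshifts.dat", "Cshifts.dat",
    "Hshifts.dat", "HAshifts.dat", "Nshifts.dat",
)


def check_datadir(datadir, template="template.pdb"):
    """Return a list of human-readable problems found in datadir.

    An empty list means camshift.db, the template pdb and at least one
    shift file are present. File contents are not validated.
    """
    root = Path(datadir)
    problems = [
        f"missing {name}"
        for name in ("camshift.db", template)
        if not (root / name).is_file()
    ]
    if not any((root / name).is_file() for name in SHIFT_FILES):
        problems.append("no shift files found")
    return problems


if __name__ == "__main__":
    for problem in check_datadir("data/"):
        print(problem)
```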

Additional material and examples can also be found in the tutorials as well as in the cs2backbone regtests in the isdb folder.

Examples

In this first example the chemical shifts are used to calculate a collective variable for NMR-driven metadynamics, similar to what was done in the second paper cited below:

Tested on 2.11.

#SETTINGS
whole: GROUP ATOMS=2612-2514:-1,961-1:-1,2466-962:-1,2513-2467:-1
WHOLEMOLECULES ENTITY0=whole
cs: CS2BACKBONE ...
   ATOMS=1-2612
   DATADIR=regtest/isdb/rt-cs2backbone/data/
   TEMPLATE=template.pdb CAMSHIFT NOPBC
...

metad: METAD ARG=cs HEIGHT=0.5 SIGMA=0.1 PACE=200 BIASFACTOR=10
PRINT ARG=cs,metad.bias FILE=COLVAR STRIDE=100

In this second example the chemical shifts are used as replica-averaged restraints, as was done in the fourth and fifth papers cited below.

Tested on 2.11.

#SETTINGS NREPLICAS=2
cs: CS2BACKBONE ATOMS=1-174 DATADIR=regtest/isdb/rt-cs2backbone/data/
encs: ENSEMBLE ARG=(cs\.hn-.*),(cs\.nh-.*)
stcs: STATS ARG=encs.* SQDEVSUM PARARG=(cs\.exphn-.*),(cs\.expnh-.*)
RESTRAINT ARG=stcs.sqdevsum AT=0 KAPPA=0 SLOPE=24

PRINT ARG=(cs\.hn-.*),(cs\.nh-.*) FILE=RESTRAINT STRIDE=100
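
The arithmetic behind this input can be sketched in a few lines of Python. This is an illustration only and not PLUMED's implementation: the function names are hypothetical, uniform weights across replicas are assumed for the average, and the linear restraint is taken to contribute an energy of SLOPE times the distance of the value from AT, with no harmonic term because KAPPA=0.

```python
# Illustrative arithmetic for a replica-averaged restraint on chemical shifts.
def replica_average(per_replica_shifts):
    """Average each back-calculated shift over replicas (uniform weights)."""
    n_replicas = len(per_replica_shifts)
    return [sum(column) / n_replicas for column in zip(*per_replica_shifts)]


def sqdevsum(calculated, experimental):
    """Sum of squared deviations between two equally long lists."""
    return sum((c - e) ** 2 for c, e in zip(calculated, experimental))


def linear_restraint_energy(value, slope, at=0.0):
    """Energy of a purely linear restraint: slope * (value - at)."""
    return slope * (value - at)


if __name__ == "__main__":
    # Two replicas, two shifts each (made-up numbers for illustration).
    replicas = [[8.0, 120.0], [8.5, 122.0]]
    experimental = [8.25, 120.0]
    averaged = replica_average(replicas)
    deviation = sqdevsum(averaged, experimental)
    print(linear_restraint_energy(deviation, slope=24.0))
```

The point of the sketch is that the deviation is computed from the replica-averaged shifts rather than from each replica separately, so individual replicas may differ from the experimental values as long as their average agrees.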

This third example shows how to use chemical shifts to calculate a METAINFERENCE score.

Tested on 2.11.

cs: CS2BACKBONE ...
   ATOMS=1-174
   DATADIR=regtest/isdb/rt-cs2backbone/data/
   SIGMA_MEAN0=1.0 DOSCORE
...
csbias: BIASVALUE ARG=cs.score

PRINT ARG=(cs\.hn-.*),(cs\.nh-.*) FILE=CS.dat STRIDE=1000
PRINT ARG=cs.score FILE=BIAS STRIDE=100

Input

The arguments and atoms that serve as the input for this action are specified using one or more of the keywords in the following table.

Keyword Type Description
ARG scalar the labels of the values from which the function is calculated
ATOMS atoms The atoms to be included in the calculation, e

Output components

This action can calculate the values in the following table when the associated keyword is included in the input for the action. These values can be referenced elsewhere in the input by using this Action's label followed by a dot and the name of the value required from the list below.

Name Type Keyword Description
score scalar default the Metainference score
sigma scalar default uncertainty parameter
sigmaMean scalar default uncertainty in the mean estimate
neff scalar default effective number of replicas
acceptSigma scalar default MC acceptance for sigma values
acceptScale scalar SCALEDATA MC acceptance for scale value
acceptFT scalar GENERIC MC acceptance for general metainference f tilde value
weight scalar REWEIGHT weights of the weighted average
biasDer scalar REWEIGHT derivatives with respect to the bias
scale scalar SCALEDATA scale parameter
offset scalar ADDOFFSET offset parameter
ftilde scalar GENERIC ensemble average estimator
ha scalar default the calculated Ha hydrogen chemical shifts
hn scalar default the calculated H hydrogen chemical shifts
nh scalar default the calculated N nitrogen chemical shifts
ca scalar default the calculated Ca carbon chemical shifts
cb scalar default the calculated Cb carbon chemical shifts
co scalar default the calculated C' carbon chemical shifts
expha scalar default the experimental Ha hydrogen chemical shifts
exphn scalar default the experimental H hydrogen chemical shifts
expnh scalar default the experimental N nitrogen chemical shifts
expca scalar default the experimental Ca carbon chemical shifts
expcb scalar default the experimental Cb carbon chemical shifts
expco scalar default the experimental C' carbon chemical shifts

Full list of keywords

The following table describes the keywords and options that can be used with this action

Keyword Type Default Description
ARG input none the labels of the values from which the function is calculated
ATOMS input none The atoms to be included in the calculation, e
NOISETYPE compulsory MGAUSS functional form of the noise (GAUSS,MGAUSS,OUTLIERS,MOUTLIERS,GENERIC)
LIKELIHOOD compulsory GAUSS the likelihood for the GENERIC metainference model, GAUSS or LOGN
DFTILDE compulsory 0.1 fraction of sigma_mean used to evolve ftilde
SCALE0 compulsory 1.0 initial value of the scaling factor
SCALE_PRIOR compulsory FLAT either FLAT or GAUSSIAN
OFFSET0 compulsory 0.0 initial value of the offset
OFFSET_PRIOR compulsory FLAT either FLAT or GAUSSIAN
SIGMA0 compulsory 1.0 initial value of the uncertainty parameter
SIGMA_MIN compulsory 0.0 minimum value of the uncertainty parameter
SIGMA_MAX compulsory 10. maximum value of the uncertainty parameter
OPTSIGMAMEAN compulsory NONE Set to NONE/SEM to manually set sigma mean, or to estimate it on the fly
WRITE_STRIDE compulsory 10000 write the status to a file every N steps, this can be used for restart/continuation
DATADIR compulsory data/ The folder with the experimental chemical shifts
TEMPLATE compulsory template.pdb A PDB file of the protein system
NEIGH_FREQ compulsory 20 Period in steps for neighbor list update
NUMERICAL_DERIVATIVES optional false calculate the derivatives for these quantities numerically
DOSCORE optional false activate metainference
NOENSEMBLE optional false don't perform any replica-averaging
REWEIGHT optional false simple REWEIGHT using the ARG as energy
AVERAGING optional not used Stride for calculation of averaged weights and sigma_mean
SCALEDATA optional false Set to TRUE if you want to sample a scaling factor common to all values and replicas
SCALE_MIN optional not used minimum value of the scaling factor
SCALE_MAX optional not used maximum value of the scaling factor
DSCALE optional not used maximum MC move of the scaling factor
ADDOFFSET optional false Set to TRUE if you want to sample an offset common to all values and replicas
OFFSET_MIN optional not used minimum value of the offset
OFFSET_MAX optional not used maximum value of the offset
DOFFSET optional not used maximum MC move of the offset
REGRES_ZERO optional not used stride for regression with zero offset
DSIGMA optional not used maximum MC move of the uncertainty parameter
SIGMA_MEAN0 optional not used starting value for the uncertainty in the mean estimate
SIGMA_MAX_STEPS optional not used Number of steps used to optimise SIGMA_MAX, before that the SIGMA_MAX value is used
TEMP optional not used the system temperature - this is only needed if the code doesn't pass the temperature to plumed
MC_STEPS optional not used number of MC steps
MC_CHUNKSIZE optional not used MC chunksize
STATUS_FILE optional not used write a file with all the data useful for restart/continuation of Metainference
FMT optional not used specify format for HILLS files (useful to decrease the number of digits in regtests)
SELECTOR optional not used name of selector
NSELECT optional not used range of values for selector [0, N-1]
RESTART optional not used allows per-action setting of restart (YES/NO/AUTO)
NOPBC optional false ignore the periodic boundary conditions when calculating distances
SERIAL optional false Perform the calculation in serial - for debug purposes
CAMSHIFT optional false Set to TRUE if you want to calculate a single CamShift score
NOEXP optional false Set to TRUE if you don't want to have fixed components with the experimental values

References

More information about how this action can be used is available in the following articles: