Shortcut: PAMM
| Module | pamm |
|---|---|
| Description | Probabilistic analysis of molecular motifs. |
Details and examples
Probabilistic analysis of molecular motifs.
Probabilistic analysis of molecular motifs (PAMM) was introduced in the papers in the bibliography. The essence of this approach involves calculating some large set of collective variables for a set of atoms in a short trajectory and fitting this data using a Gaussian Mixture Model. The idea is that modes in these distributions can be used to identify features such as hydrogen bonds or secondary structure types.
The assumption within this implementation is that the fitting of the Gaussian mixture model has been done elsewhere by a separate code. You thus provide an input file to this action that contains the means, covariance matrices and weights for a set of Gaussian kernels, $\{\phi_k\}$. The values and derivatives of the following set of quantities are then computed:

$$
s_k = \frac{\phi_k}{\sum_i \phi_i}
$$

Each of the $\phi_k$ is a Gaussian function that acts on a set of quantities that might be calculated using a TORSION, DISTANCE or ANGLE action, for example. These quantities are inserted into the set of kernels that are defined in the input file. This is done for multiple sets of values of the input quantities, and a final quantity is calculated by summing the above $s_k$ values or some transformation of them. This sounds more complicated than it is and is best understood by looking through the example given below, which can be expanded to show the full set of operations that PLUMED is performing.
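The normalisation and summation described above can be sketched in a few lines of Python. This is only an illustration: the function names and the kernel values are invented, the kernel values are taken as given rather than evaluated from a Gaussian, and the REGULARISE keyword is ignored. It is not a description of how PLUMED implements the action.

```python
def pamm_fractions(kernel_values):
    """Divide each kernel value by the sum over all kernels for one input point."""
    total = sum(kernel_values)
    return [value / total for value in kernel_values]


def summed_fractions(rows):
    """Sum the normalised value of each kernel over several input points."""
    fractions = [pamm_fractions(row) for row in rows]
    # zip(*fractions) groups the values that belong to the same kernel.
    return [sum(column) for column in zip(*fractions)]


# Hypothetical kernel values: three input points (rows), two kernels (columns).
rows = [[0.2, 0.6], [0.5, 0.5], [0.9, 0.1]]
totals = summed_fractions(rows)
print(totals)
```

Because the normalised values for each input point add up to one, the entries of `totals` add up to the number of input points.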
Warning: Mixing input variables that are periodic with variables that are not periodic has not been tested.
Examples
In this example I will explain in detail what the following input is computing:
    #SETTINGS MOLFILE=regtest/pamm/rt-pamm-periodic/M1d.pdb INPUTFILES=regtest/pamm/rt-pamm-periodic/2D-testc-0.75.pammp
    MOLINFO MOLTYPE=protein STRUCTURE=regtest/pamm/rt-pamm-periodic/M1d.pdb
    psi: TORSION ATOMS1=@psi-2 ATOMS2=@psi-3 ATOMS3=@psi-4
    phi: TORSION ATOMS1=@phi-2 ATOMS2=@phi-3 ATOMS3=@phi-4
    p: PAMM ...
       ARG=phi,psi MEAN
       CLUSTERS=regtest/pamm/rt-pamm-periodic/2D-testc-0.75.pammp
    ...
    PRINT ARG=p-1_mean,p-2_mean FILE=colvar
The best place to start our explanation is with the contents of the 2D-testc-0.75.pammp file that is passed to the CLUSTERS keyword in the input above. This file contains the parameters of two two-dimensional Gaussian functions.
Each of these Gaussian kernels has a weight, $w_k$, a vector that specifies the position of its center, $\mathbf{c}_k$, and a covariance matrix, $\Sigma_k$.
The functions that we use to calculate our PAMM components are thus:

$$
\phi_k = \frac{w_k}{N_k} \exp\left( -\frac{1}{2} (\mathbf{s} - \mathbf{c}_k)^T \Sigma_k^{-1} (\mathbf{s} - \mathbf{c}_k) \right)
$$

In the above $N_k$ is a normalization factor that is calculated from $\Sigma_k$. The vector $\mathbf{s}$ holds the quantities that are calculated by the input TORSION actions. This vector must be two dimensional, and in this case each component is the value of a torsion angle. The two TORSION actions in the input above calculate the $\phi$ and $\psi$ backbone torsional angles of a protein (not to be confused with the kernels $\phi_k$; note also the use of MOLINFO to make the specification of atoms straightforward). We thus calculate the values of our 2 kernels 3 times. The first time we use the $\phi$ and $\psi$ angles of the second residue of the protein, the second time the $\phi$ and $\psi$ angles of the third residue, and the third time the $\phi$ and $\psi$ angles of the fourth residue. The final two quantities that are output by the PRINT command, p-1_mean and p-2_mean, are the averages over these three residues of the quantities:

$$
s_1 = \frac{\phi_1}{\phi_1 + \phi_2}
$$

and

$$
s_2 = \frac{\phi_2}{\phi_1 + \phi_2}
$$
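As a rough numerical illustration of the calculation just described, the Python sketch below evaluates two weighted two-dimensional Gaussian kernels at three pairs of torsion angles, normalises them, and averages over the three residues. Everything specific here is an assumption made for illustration: the kernel weights, centers and covariances are invented rather than read from 2D-testc-0.75.pammp, the torsion values are made up, the function names are hypothetical, the standard Gaussian normalisation $2\pi\sqrt{\det\Sigma_k}$ is assumed for the normalization factor, and both the periodicity of the angles and the REGULARISE keyword are ignored.

```python
import math


def gaussian_kernel(s, weight, center, cov):
    """Weighted, normalised two-dimensional Gaussian evaluated at the point s."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = s[0] - center[0], s[1] - center[1]
    # Quadratic form (s - c)^T cov^{-1} (s - c) using the closed-form 2x2 inverse.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return weight / (2.0 * math.pi * math.sqrt(det)) * math.exp(-0.5 * quad)


def pamm_means(points, kernels):
    """Average the normalised kernel values over all input points."""
    sums = [0.0] * len(kernels)
    for s in points:
        values = [gaussian_kernel(s, *kernel) for kernel in kernels]
        total = sum(values)
        for k, value in enumerate(values):
            sums[k] += value / total
    return [x / len(points) for x in sums]


# Invented kernel parameters: (weight, center, covariance matrix).
kernels = [
    (0.6, (-1.2, -0.8), ((0.20, 0.05), (0.05, 0.30))),
    (0.4, (-1.1, 2.4), ((0.25, 0.00), (0.00, 0.15))),
]

# Invented (phi, psi) torsions in radians for residues 2, 3 and 4.
torsions = [(-1.1, -0.7), (-1.0, 2.3), (-1.3, -0.9)]

means = pamm_means(torsions, kernels)
print(means)
```

With these made-up numbers, two of the three residues sit close to the center of the first kernel and one sits close to the second, so the first mean comes out near two thirds and the second near one third.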
There is a great deal of flexibility in this input. We can work with, and examine, any number of components; we can compute these PAMM variables from any set of collective variables; and we can transform the PAMM variables themselves in many different ways when computing these sums. Furthermore, by expanding the shortcuts in the example above we can obtain insight into how the PAMM method operates.
Output components
This action can calculate the values in the following table when the associated keyword is included in the input for the action. These values can be referenced elsewhere in the input by using this Action's label followed by a dot and the name of the value required from the list below.
| Name | Type | Keyword | Description |
|---|---|---|---|
| lessthan | scalar | LESS_THAN | the number of colvars that have a value less than a threshold |
| morethan | scalar | MORE_THAN | the number of colvars that have a value more than a threshold |
| altmin | scalar | ALT_MIN | the minimum value of the cv |
| min | scalar | MIN | the minimum colvar |
| max | scalar | MAX | the maximum colvar |
| between | scalar | BETWEEN | the number of colvars that have a value that lies in a particular interval |
| highest | scalar | HIGHEST | the largest of the colvars |
| lowest | scalar | LOWEST | the smallest of the colvars |
| sum | scalar | SUM | the sum of the colvars |
| mean | scalar | MEAN | the mean of the colvars |
Full list of keywords
The following table describes the keywords and options that can be used with this action
| Keyword | Type | Default | Description |
|---|---|---|---|
| ARG | compulsory | none | the vectors from which the pamm coordinates are calculated |
| CLUSTERS | compulsory | none | the name of the file that contains the definitions of all the clusters |
| REGULARISE | compulsory | 0.001 | don't allow the denominator to be smaller than this value |
| KERNELS | compulsory | all | which kernels are we computing the PAMM values for |
Deprecated keywords
The keywords in the following table can still be used with this action but have been deprecated
| Keyword | Description |
|---|---|
| LESS_THAN | calculate the number of variables that are less than a certain target value |
| MORE_THAN | calculate the number of variables that are more than a certain target value |
| ALT_MIN | calculate the minimum value |
| MIN | calculate the minimum value |
| MAX | calculate the maximum value |
| BETWEEN | calculate the number of values that are within a certain range |
| HIGHEST | this flag allows you to recover the highest of these variables |
| HISTOGRAM | calculate a discretized histogram of the distribution of values |
| LOWEST | this flag allows you to recover the lowest of these variables |
| SUM | calculate the sum of all the quantities |
| MEAN | calculate the mean of all the quantities |
References
More information about how this action can be used is available in the following articles:
- P. Gasparotto, M. Ceriotti, Recognizing molecular patterns by machine learning: An agnostic structural definition of the hydrogen bond. The Journal of Chemical Physics. 141 (2014)
- P. Gasparotto, R. H. Meißner, M. Ceriotti, Recognizing Local and Global Structural Motifs at the Atomic Scale. Journal of Chemical Theory and Computation. 14, 486–498 (2018)