Shortcut: PDB2CONSTANT
| Module | generic |
|---|---|
| Description | Usage |
| Create a constant value from a PDB input file | |
| output value | type |
| a value that is constructed from the information in the PDB file | scalar/vector/matrix |
Details and examples
This shortcut converts the contents of a PDB file to one or more CONSTANT actions. Converting PDB files to Values in this way is useful because methods that calculate the distance between two configurations, like those in the refdist or mapping modules or the RMSD action, can then assume that both configurations are stored in PLUMED Values. The same code can thus be used to calculate the difference between the instantaneous configuration and a constant reference configuration that was read from a file, or between two reference configurations.
The following example illustrates how this action can be used to read a set of reference atomic positions:
#SETTINGS INPUTFILES=regtest/basic/rt19/test0.pdb
ref: PDB2CONSTANT REFERENCE=regtest/basic/rt19/test0.pdb
You can see how the reference positions are converted to a CONSTANT action that outputs a vector by expanding the shortcut.
You can also use this command to read in multiple reference positions as illustrated below:
#SETTINGS INPUTFILES=regtest/mapping/rt-pathtools-3/all.pdb
ref: PDB2CONSTANT REFERENCE=regtest/mapping/rt-pathtools-3/all.pdb
The CONSTANT that is created by this action is a matrix. Each row of the output matrix contains one set of reference positions.
Notice also that if you have a PDB input which contains multiple reference configurations you can create a vector constant by using the NUMBER
keyword to specify the particular configuration that you would like to read in as shown below:
#SETTINGS INPUTFILES=regtest/mapping/rt-pathtools-3/all.pdb
ref: PDB2CONSTANT REFERENCE=regtest/mapping/rt-pathtools-3/all.pdb NUMBER=4
The input above reads in only the fourth configuration in the input PDB file.
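The relationship between frames, vectors and matrices described above can be sketched in Python. This is only an illustration, not PLUMED's implementation: the function names are hypothetical, frames are assumed to be separated by END lines, and the ordering of coordinates within a row (x, y, z for each atom in turn) is an assumption that may differ from the layout PLUMED uses internally.

```python
def atom_line(serial, x, y, z):
    """Build a fixed-column ATOM record (hypothetical helper for the example)."""
    return f"ATOM  {serial:>5}  CA  ALA     1    {x:8.3f}{y:8.3f}{z:8.3f}  1.00  1.00"


def split_frames(pdb_text):
    """Split PDB text into frames, treating each END line as a frame terminator."""
    frames, current = [], []
    for line in pdb_text.splitlines():
        if line.startswith("END"):
            if current:
                frames.append(current)
            current = []
        elif line.startswith(("ATOM", "HETATM")):
            current.append(line)
    if current:
        frames.append(current)
    return frames


def frame_to_row(atom_lines):
    """Flatten one frame into [x1, y1, z1, x2, ...] (assumed ordering)."""
    row = []
    for line in atom_lines:
        row.extend(float(line[a:b]) for a, b in ((30, 38), (38, 46), (46, 54)))
    return row


def reference_values(pdb_text, number=0):
    """Return a vector for one frame, or a matrix with one row per frame.

    number=0 keeps every frame; number=k (1-based) keeps only the k-th frame.
    """
    rows = [frame_to_row(frame) for frame in split_frames(pdb_text)]
    if number > 0:
        return rows[number - 1]
    return rows[0] if len(rows) == 1 else rows


# Two frames with two atoms each (made-up coordinates).
EXAMPLE = "\n".join([
    atom_line(1, 1.0, 2.0, 3.0), atom_line(2, 4.0, 5.0, 6.0), "END",
    atom_line(1, 1.5, 2.5, 3.5), atom_line(2, 4.5, 5.5, 6.5), "END",
])
matrix = reference_values(EXAMPLE)             # 2 rows of 6 numbers
vector = reference_values(EXAMPLE, number=2)   # second frame only
```

Here `matrix` plays the role of the matrix constant created when all frames are read, and `vector` the role of the vector constant obtained with NUMBER.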
The PDB file format
PLUMED uses the PDB file format here and in several other places:
- To read the molecular structure (MOLINFO).
- To read reference conformations (RMSD, but also in methods such as FIT_TO_TEMPLATE, etc).
The implemented PDB reader expects a file formatted correctly according to the PDB standard. In particular, the following columns are read from ATOM records:
| columns | content |
|---|---|
| 1-6 | record name (ATOM or HETATM) |
| 7-11 | serial number of the atom (starting from 1) |
| 13-16 | atom name |
| 18-20 | residue name |
| 22 | chain id |
| 23-26 | residue number |
| 31-38 | x coordinate |
| 39-46 | y coordinate |
| 47-54 | z coordinate |
| 55-60 | occupancy |
| 61-66 | beta factor |
The PLUMED parser is slightly more permissive than the official PDB format in that the format of real numbers is not fixed. In other words, any real number that can be parsed is OK and the dot can be placed anywhere within the field. However, columns are interpreted strictly. A sample PDB should look like the following:
ATOM      2  CH3 ACE     1      12.932 -14.718  -6.016  1.00  1.00
ATOM      5  C   ACE     1      21.312  -9.928  -5.946  1.00  1.00
ATOM      9  CA  ALA     2      19.462 -11.088  -8.986  1.00  1.00
Notice that serial numbers need not be consecutive. In the three-line example above, only the coordinates of three atoms are provided. This is perfectly legal and indicates to PLUMED that information is available for these atoms only. This applies both to structural information in MOLINFO, where the other atoms would have no name assigned, and to reference structures used in RMSD, where only the provided atoms would be used to compute the RMSD.
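The column layout above can be illustrated with a short Python sketch that slices an ATOM record by fixed columns. This is not the PLUMED reader; `parse_atom_record` is a hypothetical name, and keeping the serial number as text (so that non-decimal encodings survive) is a choice made for this example.

```python
def parse_atom_record(line):
    """Read one ATOM/HETATM record by fixed columns (1-based columns in comments)."""
    record = line[0:6].strip()                # columns 1-6
    if record not in ("ATOM", "HETATM"):
        raise ValueError(f"not an ATOM/HETATM record: {record!r}")
    return {
        "record": record,
        "serial": line[6:11].strip(),         # columns 7-11, kept as text
        "name": line[12:16].strip(),          # columns 13-16
        "resname": line[17:20].strip(),       # columns 18-20
        "chain": line[21:22].strip(),         # column 22
        "resnum": int(line[22:26]),           # columns 23-26
        "x": float(line[30:38]),              # columns 31-38
        "y": float(line[38:46]),              # columns 39-46
        "z": float(line[46:54]),              # columns 47-54
        "occupancy": float(line[54:60]),      # columns 55-60
        "beta": float(line[60:66]),           # columns 61-66
    }


SAMPLE = "ATOM      2  CH3 ACE     1      12.932 -14.718  -6.016  1.00  1.00"
atom = parse_atom_record(SAMPLE)

# The number format inside a field is free, as long as it stays in its columns:
# here the x field is rewritten with four decimals and still parses.
VARIANT = SAMPLE[:30] + " 12.9320" + SAMPLE[38:]
variant_atom = parse_atom_record(VARIANT)
```

Slicing by position rather than splitting on whitespace is what makes the columns strict: a value shifted into the neighbouring field would be read as part of a different quantity.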
Including arguments in PDB files
You can specify reference values for PLUMED Values in the REMARK lines of a PDB input file like this:
REMARK t1=-4.3345
REMARK t2=3.4725
END
You can read in these reference values by using the PDB2CONSTANT command as follows:
#SETTINGS INPUTFILES=regtest/mapping/rt-pathtools-4/epath.pdb
t1: TORSION ATOMS=1,2,3,4
t2: TORSION ATOMS=5,6,7,8
t1_ref: PDB2CONSTANT REFERENCE=regtest/mapping/rt-pathtools-4/epath.pdb ARG=t1
t2_ref: PDB2CONSTANT REFERENCE=regtest/mapping/rt-pathtools-4/epath.pdb ARG=t2
In this case the input must define values with the labels that are being read in from the reference file,
and separate PDB2CONSTANT commands are required for reading in t1 and t2. Furthermore, because the
input PDB file contains multiple frames, the CONSTANT actions created by the shortcuts in the above input
output vectors containing all the values for t1 and t2. If you want to read only one of the
configurations in the input PDB file, you can use a PDB file with a single frame or the NUMBER keyword described above.
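As a rough illustration of how one labelled value per frame becomes a vector, the following Python sketch collects `label=value` tokens from REMARK lines. It is not PLUMED's parser: `read_remark_values` is a hypothetical name, frames are assumed to end at END lines, and the second frame in the example uses made-up numbers.

```python
def read_remark_values(pdb_text, label, number=0):
    """Collect the value of `label` from the REMARK lines of every frame.

    number=0 returns one value per frame (a vector); number=k (1-based)
    returns the value from the k-th frame only (a scalar).
    """
    frames, current = [], {}
    for line in pdb_text.splitlines():
        if line.startswith("REMARK"):
            for token in line.split()[1:]:
                key, sep, value = token.partition("=")
                if not sep:
                    continue
                try:
                    current[key] = float(value)
                except ValueError:
                    pass  # skip non-numeric remarks
        elif line.startswith("END"):
            frames.append(current)
            current = {}
    if current:
        frames.append(current)
    values = [frame[label] for frame in frames]
    return values[number - 1] if number > 0 else values


EXAMPLE = """REMARK t1=-4.3345
REMARK t2=3.4725
END
REMARK t1=-4.1940
REMARK t2=3.3420
END
"""
t1_ref = read_remark_values(EXAMPLE, "t1")            # one entry per frame
t2_first = read_remark_values(EXAMPLE, "t2", number=1)
```

Each call reads a single label, which mirrors the need for one PDB2CONSTANT command per argument.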
If, for any reason, you want to read data from a PDB file that is not a reference value for one of the values defined in your PLUMED input file, you can use the NOARGS flag as shown below:
#SETTINGS INPUTFILES=regtest/mapping/rt-pathtools-4/epath.pdb
t1_ref: PDB2CONSTANT REFERENCE=regtest/mapping/rt-pathtools-4/epath.pdb NOARGS ARG=t1
t2_ref: PDB2CONSTANT REFERENCE=regtest/mapping/rt-pathtools-4/epath.pdb NOARGS ARG=t2
Occupancy and beta factors
PLUMED also reads the occupancy and beta factors from the input PDB files. However, these columns of data are given a very special meaning. In cases where the PDB structure is used as a reference for an alignment (that's the case for instance in RMSD and in FIT_TO_TEMPLATE), the occupancy column is used to provide the weight of each atom in the alignment. In cases where, perhaps after alignment, the displacement between running coordinates and the provided PDB is computed, the beta factors are used as weight for the displacement. Since setting the weights to zero is the same as not including an atom in the alignment or displacement calculation, the two following reference files would be equivalent when used in an RMSD calculation. First file:
ATOM      2  CH3 ACE     1      12.932 -14.718  -6.016  1.00  1.00
ATOM      5  C   ACE     1      21.312  -9.928  -5.946  1.00  1.00
ATOM      9  CA  ALA     2      19.462 -11.088  -8.986  0.00  0.00
Second file:
ATOM      2  CH3 ACE     1      12.932 -14.718  -6.016  1.00  1.00
ATOM      5  C   ACE     1      21.312  -9.928  -5.946  1.00  1.00
However, notice that many extra atoms with zero weight might slow down the calculation, so removing lines is better than setting their weights to zero. In addition, the weights for alignment need not be equal to the weights for displacement. Starting with PLUMED 2.7, if all the weights are set to zero they are normalized to be equal to the inverse of the number of involved atoms. This means that files with the weight columns set to zero can be used and give a meaningful result. In previous PLUMED versions, setting all weights to zero resulted in an error instead.
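The effect of the weights can be sketched with a small Python example. This is a simplified illustration rather than PLUMED's RMSD code: the alignment step is left out entirely, the function names are hypothetical, the displaced coordinates are made up, and rescaling non-zero weights so that they sum to one is an assumption of this sketch.

```python
import math


def normalise_weights(weights):
    """Rescale weights to sum to one; all-zero weights become 1/N each."""
    n = len(weights)
    if all(w == 0 for w in weights):
        return [1.0 / n] * n
    total = sum(weights)
    return [w / total for w in weights]


def weighted_msd(reference, positions, weights):
    """Weighted mean squared displacement, with no alignment performed."""
    norm = normalise_weights(weights)
    return sum(
        w * sum((a - b) ** 2 for a, b in zip(ref, pos))
        for w, ref, pos in zip(norm, reference, positions)
    )


reference = [(12.932, -14.718, -6.016), (21.312, -9.928, -5.946), (19.462, -11.088, -8.986)]
positions = [(13.932, -14.718, -6.016), (21.312, -7.928, -5.946), (24.462, -6.088, -3.986)]

# A zero weight on the third atom gives the same result as leaving it out.
with_zero_weight = weighted_msd(reference, positions, [1.0, 1.0, 0.0])
without_atom = weighted_msd(reference[:2], positions[:2], [1.0, 1.0])

# With every weight set to zero, each atom receives a weight of 1/3.
all_zero = weighted_msd(reference, positions, [0.0, 0.0, 0.0])

rmsd_like = math.sqrt(with_zero_weight)
```

The zero-weighted atom still takes part in every loop, which is why deleting its line from the file is the cheaper option.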
Systems with more than 100k atoms
Notice that it very likely does not make sense to compute the RMSD or any other structural deviation using a very large number of atoms. However, if the protein for which you want to compute the RMSD has atoms with large serial numbers (e.g. because it is located after the solvent in the sorted list of atoms), you might run into the limitations of the PDB format. Indeed, since there are only 5 columns available for the atom serial number, this number cannot be larger than 99999. In addition, providing MOLINFO with names associated with atoms that have a serial number larger than 99999 would be impossible.
Since PLUMED 2.4 we allow the hybrid 36 format to be used to specify atom numbers. This format is not particularly widespread, but it has the nice feature that it provides a one-to-one mapping between numbers up to approximately 80 million and strings with 5 characters, and it is backward compatible for numbers smaller than 100000. This is not true for notations like the hex notation exported by VMD. Using the hybrid 36 format, the ATOM records for atoms ranging from 99997 to 100002 would read as follows:
ATOM 99997 Ar X 1 45.349 38.631 15.116 1.00 1.00
ATOM 99998 Ar X 1 46.189 38.631 15.956 1.00 1.00
ATOM 99999 Ar X 1 46.189 39.471 15.116 1.00 1.00
ATOM A0000 Ar X 1 45.349 39.471 15.956 1.00 1.00
ATOM A0001 Ar X 1 45.349 38.631 16.796 1.00 1.00
ATOM A0002 Ar X 1 46.189 38.631 17.636 1.00 1.00
There are tools that can be found to translate from integers to strings and back using hybrid 36 format (a simple python script can be found here). In addition, as of PLUMED 2.5, we provide a command line tool that can be used to renumber atoms in a PDB file.
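For readers who want to see the mapping concretely, here is a minimal Python sketch of hybrid 36 encoding and decoding for 5-character fields. It follows the scheme as commonly described (decimal up to 99999, then base 36 with an upper-case leading letter, then base 36 with a lower-case leading letter) and is not the script or the command line tool mentioned above; the names `hy36_encode` and `hy36_decode` are chosen for this example, and negative numbers are not handled.

```python
DIGITS_UPPER = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS_LOWER = DIGITS_UPPER.lower()
WIDTH = 5
DECIMAL_LIMIT = 10 ** WIDTH          # 100000: first number that needs letters
BLOCK = 26 * 36 ** (WIDTH - 1)       # how many numbers each letter case covers
OFFSET = 10 * 36 ** (WIDTH - 1)      # base-36 value of "A0000"


def _to_base36(value, digits):
    """Write a positive integer in base 36 using the given digit set."""
    out = ""
    while value:
        value, rem = divmod(value, 36)
        out = digits[rem] + out
    return out


def hy36_encode(value):
    """Encode a non-negative integer as a 5-character hybrid 36 string."""
    if value < 0:
        raise ValueError("negative values are not handled in this sketch")
    if value < DECIMAL_LIMIT:
        return str(value).rjust(WIDTH)
    value -= DECIMAL_LIMIT
    if value < BLOCK:
        return _to_base36(value + OFFSET, DIGITS_UPPER)
    value -= BLOCK
    if value < BLOCK:
        return _to_base36(value + OFFSET, DIGITS_LOWER)
    raise ValueError("value too large for a 5-character field")


def hy36_decode(text):
    """Decode a 5-character hybrid 36 string back to an integer."""
    s = text.strip()
    if s.isdigit():
        return int(s)
    if len(s) == WIDTH and s.isalnum() and s[0].isalpha():
        if s.isupper():
            return int(s, 36) - OFFSET + DECIMAL_LIMIT
        if s.islower():
            return int(s, 36) - OFFSET + BLOCK + DECIMAL_LIMIT
    raise ValueError(f"invalid hybrid 36 string: {text!r}")


serials = [hy36_encode(n) for n in range(99997, 100003)]
```

Because numbers below 100000 are written in plain decimal, existing PDB files remain valid, which is the backward compatibility mentioned above.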
Input
The arguments that serve as the input for this action are specified using one or more of the keywords in the following table.
| Keyword | Type | Description |
|---|---|---|
| ARG | scalar/vector | read this single argument from the input rather than the atomic structure |
Full list of keywords
The following table describes the keywords and options that can be used with this action
| Keyword | Type | Default | Description |
|---|---|---|---|
| ARG | input | none | read this single argument from the input rather than the atomic structure |
| REFERENCE | compulsory | none | a file in pdb format containing the reference structure |
| NUMBER | compulsory | 0 | if there are multiple structures in the pdb file you can specify that you want the RMSD from a specific structure by specifying its place in the file here |
| NOARGS | optional | false | the arguments that are being read from the PDB file are not in the plumed input |