Lugano tutorial: Brief guide to PLUMED syntax and analyzing trajectories

Aims

The aim of this tutorial is to introduce you to the PLUMED syntax. We will go through the writing of input files to calculate and print simple collective variables on a pre-calculated trajectory.

Learning Outcomes

Once this tutorial is completed, students will be able to:

  • Write a simple PLUMED input file and use it with the PLUMED driver to analyze a trajectory.
  • Print collective variables such as distances. (DISTANCE), torsional angles (TORSION), gyration radius (GYRATION), and RMSD (RMSD) using the PRINT action.
  • Use MOLINFO shortcuts.
  • Define and use virtual atoms, such as CENTER.
  • Take care of periodic boundary conditions within PLUMED using WHOLEMOLECULES and be able to verify the result with DUMPATOMS.
  • Write their own CVs directly in the input file using the CUSTOM action.

Install PLUMED

You can find detailed instructions about how to install PLUMED here:

https://github.com/plumed/conda

Resources

Before starting the tutorial, please create a separate directory, named hands_on_1, and enter it:

mkdir hands_on_1
cd hands_on_1

The reference trajectories and other files needed for the exercises proposed in this tutorial can be downloaded from github using the following command:

wget https://github.com/plumed/lugano2019/raw/master/handson_1/handson_1.tgz 

The zip archive contains the following files:

  • GB1_native.pdb : A PDB file with the native structure of the GB1 protein.
  • traj-whole.xtc: A trajectory in xtc format. To make the exercise easier, GB1 has been made whole already.
  • traj-broken.xtc: The same trajectory as it was originally produced by GROMACS. Here GB1 is broken by periodic boundary conditions and should be fixed.

The archive can be opened with the following command:

tar xvzf handson_1.tgz

This tutorial has been tested on PLUMED version 2.6.0

Instructions

PLUMED is a library that can be incorporated into many MD codes by adding a relatively simple and well documented interface. Once it is incorporated you can use PLUMED to perform a variety of different analyzes on the fly and to bias the sampling in the molecular dynamics engine. PLUMED can also, however, be used as a standalone code for analyzing trajectories. If you are using the code in this way you can, once PLUMED is compiled, run the plumed executable by issuing the command:

plumed <instructions>

Let's start by getting a feel for the range of calculations that we can use PLUMED to perform. Issue the following command now:

plumed --help 

What is output when this command is run is a list of the tasks you can use PLUMED to perform. There are commands that allow you to patch an MD code, commands that allow you to run molecular dynamics and commands that allow you to build the manual. In this tutorial we will mostly be using PLUMED to analyze trajectories, however. As such most of the calculations we will perform will be performed using the driver tool. Let's look at the options we can issue to plumed driver by issuing the following command:

plumed driver --help

As you can see we can do a number of things with plumed driver. For all of these options, however, we are going to need to write a PLUMED input file. The syntax of the PLUMED input file is the same that we will use later to run enhanced sampling simulations, so all the things that you will learn now will be useful later when you will run PLUMED coupled to an MD code. In the following we are going to see how to write an input file for PLUMED.

The PLUMED internal units

By default the PLUMED inputs and outputs quantities in the following units:

  • Energy - kJ/mol
  • Length - nanometers
  • Time - picoseconds

If you want to change these units you can do this using the UNITS keyword.

The syntax of the PLUMED input file

The main goal of PLUMED is to compute collective variables, which are complex descriptors than can be used to analyze a conformational change or a chemical reaction. This can be done either on-the-fly during molecular dynamics or a posteriori, using PLUMED as a post-processing tool. In both cases one, should create an input file with a specific PLUMED syntax. A sample input file is below:

Click on the labels of the actions for more information on what each action computes
tested on master
# Compute distance between atoms 1 and 10.
# Atoms are ordered as in the trajectory files and their numbering starts from 1.
# The distance is called "d" for future reference.
d: DISTANCE 
ATOMS
the pair of atom that we are calculating the distance between.
=1,10 # Create a virtual atom in the center between atoms 20 and 30. # The virtual atom only exists within PLUMED and is called "center" for future reference. center: CENTER
ATOMS
the list of atoms which are involved the virtual atom's definition.
=20,30 # Compute the torsional angle between atoms 1, 10, 20, and center. # Notice that virtual atoms can be used as real atoms here. # The angle is called "phi" for future reference. phi1: TORSION
ATOMS
the four atoms involved in the torsional angle
=1,10,20,center # the same CV defined before can be split into multiple line phi2: TORSION ...
ATOMS
the four atoms involved in the torsional angle
=1,10,20,center ... # Print d every 10 step on a file named "COLVAR1". PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=d
STRIDE
compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be output
=10
FILE
the name of the file on which to output these quantities
=COLVAR1 # Print phi1 and phi2 on another file names "COLVAR2" every 100 steps. PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=phi1,phi2
STRIDE
compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be output
=100
FILE
the name of the file on which to output these quantities
=COLVAR2

In the input file above, each line defines a so-called action. An action could either compute a distance, or the center between two or more atoms, or print some value on a file. Each action supports a number of keywords, whose value is specified. Action names are highlighted in green and, clicking on them, you can go to the corresponding page in the manual that contains a detailed description for each keyword. Actions that support the keyword STRIDE are those that determine how frequently things are to be done. Notice that the default value for STRIDE is always 1. In the example above, omitting STRIDE keywords the corresponding COLVAR files would have been written for every frame of the analyzed trajectory. All the other actions in the example above do not support the STRIDE keyword and are only calculated when requested. That is, d will be computed every 10 frames, and phi1 and phi2 every 100 frames. In short, you can think that for every snapshot in the trajectory that you are analyzing PLUMED is going to execute all the listed actions, though some of them are optimized out when STRIDE is different from 1.

Variables should be given a name (in the example above, d, phi1, and phi2), which is then used to refer to these variables. Lists of atoms should be provided as comma separated numbers, with no space. Virtual atoms can be created and assigned a name for later use.

You can find more information on the PLUMED syntax at Getting Started page of the manual. The complete documentation for all the supported collective variables can be found at the Collective Variables page.

Exercises

Exercise 1: Computing and printing simple collective variables

In this exercise, we will make practice with computing and printing collective variables. To analyze the trajectories provided here, you should:

  • Create a PLUMED input file with a text editor (let us call it plumed.dat) similar to the one above.
  • Run the command plumed driver command (see below)

Notice that you can also visualize trajectories with VMD directly. For example, the trajectory traj-whole.xtc can be visualized with the command vmd GB1_native.pdb traj-whole.xtc.

Let's prepare a PLUMED input file that calculates

  • The gyration radius of all CA protein atoms (GYRATION). Look in the GB1_native.pdb file to retrieve the list of indexes of the CA atoms.
  • The total number of contacts (COORDINATION) between all protein CA atoms. For COORDINATION, set reference distance R_0 to 8.0 A (be careful with units!!).

Here you can find a sample plumed.dat file that you can use as a template. Whenever you see an highlighted FILL string, this is a string that you should replace.

Click on the labels of the actions for more information on what each action computes
tested on master
# Compute gyration radius on CA atoms:
r: GYRATION 
ATOMS
the group of atoms that you are calculating the Gyration Tensor for.
=__FILL__ # Compute number of contacts between CA atoms co: COORDINATION
GROUPA
First list of atoms.
=__FILL__
R_0
could not find this keyword
=__FILL__ # Print the collective variables on COLVAR file every step PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=r,co
FILE
the name of the file on which to output these quantities
=COLVAR
STRIDE
compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be output
=__FILL__

Once your plumed.dat file is complete, you can use it with the following command

> plumed driver --plumed plumed.dat --mf_xtc traj-broken.xtc

Scroll in your terminal to read the PLUMED log. As you can see, PLUMED gives a lot of feedback about the input that he is reading. There's the place where you can check if PLUMED understood correctly your input.

The command above will create a COLVAR file like this one:

#! FIELDS time r co
 0.000000 2.458704 165.184131
 1.000000 2.341932 164.546603
 2.000000 2.404708 162.606975
 3.000000 2.454297 143.850117
 4.000000 2.569342 147.110410
 5.000000 2.304027 163.608695
 6.000000 2.116676 177.549792
 7.000000 2.068599 183.177952
 8.000000 2.021605 181.929958
... more lines ...

Notice that the first line informs you about the content of each column.

In case you obtain different numbers, check your input, you might have made some mistake!

This file can then be visualized using the following python script:

import matplotlib.pyplot as plt
import plumed
colvar=plumed.read_as_pandas("COLVAR")
plt.plot(colvar.time,colvar.r)
plt.show()
plt.plot(colvar.time,colvar.co)
plt.show()

Now, look at what happens if you run the exercise twice. The second time, PLUMED will back up the previously produced file so as not to overwrite it. You can also concatenate your files by using the action RESTART at the beginning of your input file.

Finally, let's try to use the same input file with traj-whole.xtc and examine the results. Did you find the same results as with the previous trajectory? Why?

Exercise 2: MOLINFO shortcuts

PLUMED provides some shortcuts to select atoms with specific properties. To use this feature, you should specify the MOLINFO action along with a reference pdb file.

Let's try to use this functionality to calculate the backbone dihedral angle phi of residue 2 of GB1. This CV is defined by the action TORSION and a set of 4 atoms. For residue i, the dihedral phi is defined by these atoms: C(i-1),N(i),CA(i),C(i). After consulting the manual and inspecting GB1_native.pdb, let's define the dihedral angle in question in two different ways: using the MOLINFO shortcut to define psi of residue 2 and with an explicit list of 4 atoms.

Click on the labels of the actions for more information on what each action computes
tested on master
# activate MOLINFO functionalities
MOLINFO 
STRUCTURE
compulsory keyword a file in pdb format containing a reference structure.
=__FILL__ # define phi dihedral of residue 2 as a list of 4 atoms t1: TORSION
ATOMS
the four atoms involved in the torsional angle
=__FILL__ # define the same dihedral using MOLINFO shortcuts t2: TORSION
ATOMS
the four atoms involved in the torsional angle
=__FILL__ # print the value of t1 and t2 every 10 steps PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=__FILL__
FILE
the name of the file on which to output these quantities
=COLVAR
STRIDE
compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be output
=10

After completing the PLUMED input file above, let's use it to analyze the trajectory traj-whole.xtc that you downloaded at the start of this exercise. Now, let's plot the trajectory of the two CVs above. If you executed this exercise correctly, these two trajectories should be identical.

Note
In the manual you can also learn how to use selections done with MDtraj or MDAnalysis. For instance, the CA atoms that you needed in Exercise 1: Computing and printing simple collective variables could have been picked with {{@mda:name CA}}.

Exercise 3: Virtual atoms

Sometimes, when calculating a CV, you may not want to use the positions of a number of atoms directly. Instead you may want to define a virtual atom whose position is generated based on the positions of a collection of other atoms. For example you might want to use the center of mass (COM) or the geometric center (CENTER) of a group of atoms.

In this exercise, you will learn how to specify virtual atoms and later use them to define a CV. Let's start by having a look at the PLUMED input file below.

Click on the labels of the actions for more information on what each action computes
tested on master
# geometric center of first residue
first: CENTER 
ATOMS
the list of atoms which are involved the virtual atom's definition.
=1,2,3,4,5,6,7,8 # geometric center of last residue last: CENTER
ATOMS
the list of atoms which are involved the virtual atom's definition.
=427-436 # distance between centers of first and last residues, with PBCs d1: DISTANCE
ATOMS
the pair of atom that we are calculating the distance between.
=first,last # distance between centers of first and last residues, without PBCs d2: DISTANCE
ATOMS
the pair of atom that we are calculating the distance between.
=first,last
NOPBC
( default=off ) ignore the periodic boundary conditions when calculating distances
# print the two distances every step PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=d1,d2
STRIDE
compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be output
=1
FILE
the name of the file on which to output these quantities
=COLVAR

Make a PLUMED input containing the above input and execute it on the trajectory traj-broken.xtc by making use of the following command:

plumed driver --mf_xtc traj-broken.xtc --plumed plumed.dat

Before we turn to analyzing what is output from this calculation there are a few things to note about this input file. Firstly, I should describe what this file instructs PLUMED to do. It tells PLUMED to:

  1. calculate the position of the Virtual Atom 'first' as the CENTER of atoms from 1 to 8;
  2. calculate the position of the Virtual Atom 'last' as the CENTER of atoms from 427 to 436;
  3. calculate the distance between the two Virtual Atoms 'first' and 'last' and saves it in 'd1';
  4. calculate the distance (ignoring periodic boundary conditions) between the two Virtual Atom 'first' and 'last' and saves it in 'd2';
  5. print the content of 'd1' and 'd2' in the file COLVAR for every frame of the trajectory

Notice that in the input above we have used two different ways of writing the atoms used in the CENTER calculation:

  1. ATOMS=1,2,3,4,5,6,7,8 is the explicit list of the atoms we need
  2. ATOMS=427-436 is the range of atoms needed

Notice also that ranges of atoms can be defined with a stride which can also be negative as shown by the commands below, which are both equivalent:

  1. ATOMS=from,to:by (i.e.: 427-436:2)
  2. ATOMS=to,from:-by (i.e.: 436-427:-2)

Let's now analyze the output of the calculation by plotting the time series of the two CVs. Are they identical? What do you think is happening in those frames in which the two CVs are different?

You can repeat the same analysis on traj-whole.xtc and compare with the previous trajectory. Are the results identical?

Exercise 4: Taking care of periodic boundary conditions

As you should have noticed, in this tutorial we are working with two different trajectories of the GB1 protein:

  • traj-broken.xtc: the original GROMACS trajectory in which GB1 is broken by periodic boundary conditions
  • traj-whole.xtc: the trajectory processed by the gmx trjconv utility to fix discontinuities due to periodic boundary conditions

In many PLUMED CVs, periodic boundary conditions are automatically taken into account unless a special option (NOPBC) is used. In this way, the user can work directly with the raw trajectory traj-broken.xtc. Alternatively, PLUMED can reconstruct internally the coordinates of the system and thus fix discontinuities due to the periodic boundary conditions. In order to do so, the WHOLEMOLECULES action should be used.

In this exercise, we will learn how to use the WHOLEMOLECULES action. Let's ask PLUMED to internally make the structure of GB1 whole and print the new atoms positions on a new file, called dump.gro, every 10 steps.

Click on the labels of the actions for more information on what each action computes
tested on master
# use WHOLEMOLECULES to make the entire protein whole
# let's use the range syntax to specify all GB1 atoms
WHOLEMOLECULES 
ENTITY0
the atoms that make up a molecule that you wish to align.
=__FILL__ # print the positions of all atoms every 10 steps DUMPATOMS
FILE
compulsory keyword file on which to output coordinates; extension is automatically detected
=dump.gro
ATOMS
the atom indices whose positions you would like to print out.
=__FILL__
STRIDE
compulsory keyword ( default=1 ) the frequency with which the atoms should be output
=__FILL__

Once your plumed.dat file is complete, you can use it with the following command

> plumed driver --plumed plumed.dat --mf_xtc traj-broken.xtc

and look at the output file with vmd. Have the problems with the periodic boundary conditions been fixed?

At this point, we can repeat Exercise 3: Virtual atoms after adding the WHOLEMOLECULES action at the beginning of the PLUMED input file:

Click on the labels of the actions for more information on what each action computes
tested on master
# make protein whole
WHOLEMOLECULES 
ENTITY0
the atoms that make up a molecule that you wish to align.
=__FILL__ # geometric center of first residue first: CENTER
ATOMS
the list of atoms which are involved the virtual atom's definition.
=1,2,3,4,5,6,7,8 # geometric center of last residue last: CENTER
ATOMS
the list of atoms which are involved the virtual atom's definition.
=427-436 # distance between centers of first and last residues, with PBCs d1: DISTANCE
ATOMS
the pair of atom that we are calculating the distance between.
=first,last # distance between centers of first and last residues, without PBCs d2: DISTANCE
ATOMS
the pair of atom that we are calculating the distance between.
=first,last
NOPBC
( default=off ) ignore the periodic boundary conditions when calculating distances
# print the two distances every step PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=d1,d2
STRIDE
compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be output
=1
FILE
the name of the file on which to output these quantities
=COLVAR

We can now visualize the time line of the two CVs and compare it with the results of Exercise 3: Virtual atoms. Which CVs are identical and which are not? Why?

Exercise 5: Using CVs that measure the distance from a reference conformation

In many cases, it is useful to define a CV that quantifies the distance of the system from a reference conformation. In PLUMED, this can be achieved using a variety of different CVs. Please, have a look here for more info.

In this exercise, we will learn how to use the RMSD and DRMSD CVs. In order to do so, we need to edit a reference pdb file and identify:

  • for RMSD, the atoms that you want to use for alignment and RMSD calculation
  • for DRMSD, the atoms that you want to use for the DRMSD calculation

Please refer to the manual to understand how to specify the atoms needed to calculate the CA-RMSD and CA-DRMSD with respect to the native GB1 structure. After editing the reference pdb file, please complete the following input file:

Click on the labels of the actions for more information on what each action computes
tested on master
# RMSD on CA atoms, after optimal alignment
rmsd: RMSD 
REFERENCE
compulsory keyword a file in pdb format containing the reference structure and the atoms involved in the CV.
=__FILL__
TYPE
compulsory keyword ( default=SIMPLE ) the manner in which RMSD alignment is performed.
=OPTIMAL # DRMSD on CA atoms # Only pairs of atoms whose distance in the reference structure # is within 0.1 and 0.8 nm are considered drmsd: DRMSD
REFERENCE
compulsory keyword a file in pdb format containing the reference structure and the atoms involved in the CV.
=__FILL__
LOWER_CUTOFF
compulsory keyword only pairs of atoms further than LOWER_CUTOFF are considered in the calculation.
=0.1
UPPER_CUTOFF
compulsory keyword only pairs of atoms closer than UPPER_CUTOFF are considered in the calculation.
=0.8 # print both CVs to file PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=__FILL__
STRIDE
compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be output
=1
FILE
the name of the file on which to output these quantities
=COLVAR

Once your plumed.dat file is complete, you can use it with the following command

> plumed driver --plumed plumed.dat --mf_xtc traj-broken.xtc

and also on the trajectory file in which problems due to the periodic boundaries have been fixed:

> plumed driver --plumed plumed.dat --mf_xtc traj-whole.xtc

Can you comment the results obtained and the differences - if any - between the two trajectories? What is happening to the protein during the course of the simulation? Are the two metrics, RMSD and DRMSD, equivalent?

Exercise 6: Creating your own CV directly in the PLUMED input file

In PLUMED, you can define your own CV directly in the input file by writing it as a function of existing CVs using the CUSTOM action. PLUMED will then automatically calculate the derivatives with the respect to the atoms positions.

Let's look at the following example.

Click on the labels of the actions for more information on what each action computes
tested on master
# distance between atoms 10 and 12
dAB: DISTANCE 
ATOMS
the pair of atom that we are calculating the distance between.
=10,12 # distance between atoms 10 and 15 dAC: DISTANCE
ATOMS
the pair of atom that we are calculating the distance between.
=10,15 # custom CV defined as the difference between dAC->y and dAB->x diff: CUSTOM
ARG
the input for this action is the scalar output from one or more other actions.
=dAB,dAC
FUNC
compulsory keyword the function you wish to evaluate
=y-x
PERIODIC
compulsory keyword if the output of your function is periodic then you should specify the periodicity of the function.
=NO # print the 3 CVs to file every 10 steps PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=dAB,dAC,diff
FILE
the name of the file on which to output these quantities
=COLVAR
STRIDE
compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be output
=10

In this example, a custom CV is defined as the difference of two distances between pairs of atoms.

Please complete the following input file to calculate two new CVs from those defined in Exercise 5: Using CVs that measure the distance from a reference conformation:

  • the average of RMSD and DRMSD, calculated on all the CA atoms of GB1 (easy)
  • the minimum distance between RMSD and DRMSD (a bit more difficult). To define this function, you can creatively take inspiration from the path CV s (see PATH).
Click on the labels of the actions for more information on what each action computes
tested on master
# RMSD on CA atoms, after optimal alignment
rmsd: RMSD 
REFERENCE
compulsory keyword a file in pdb format containing the reference structure and the atoms involved in the CV.
=__FILL__
TYPE
compulsory keyword ( default=SIMPLE ) the manner in which RMSD alignment is performed.
=OPTIMAL # DRMSD on CA atoms # Only pairs of atoms whose distance in the reference structure # is within 0.1 and 0.8 nm are considered drmsd: DRMSD
REFERENCE
compulsory keyword a file in pdb format containing the reference structure and the atoms involved in the CV.
=__FILL__
LOWER_CUTOFF
compulsory keyword only pairs of atoms further than LOWER_CUTOFF are considered in the calculation.
=0.1
UPPER_CUTOFF
compulsory keyword only pairs of atoms closer than UPPER_CUTOFF are considered in the calculation.
=0.8 # average of RMSD and DRMSD ave: CUSTOM
ARG
the input for this action is the scalar output from one or more other actions.
=rmsd,drmsd
FUNC
compulsory keyword the function you wish to evaluate
=__FILL__
PERIODIC
compulsory keyword if the output of your function is periodic then you should specify the periodicity of the function.
=NO # minimum distance between RMSD and DRMSD min: CUSTOM
ARG
the input for this action is the scalar output from one or more other actions.
=rmsd,drmsd
FUNC
compulsory keyword the function you wish to evaluate
=__FILL__
PERIODIC
compulsory keyword if the output of your function is periodic then you should specify the periodicity of the function.
=NO # print all 4 CVs to file every step PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=__FILL__
STRIDE
compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be output
=1
FILE
the name of the file on which to output these quantities
=COLVAR

Once your plumed.dat file is complete, you can use it with the following command

> plumed driver --plumed plumed.dat --mf_xtc traj-broken.xtc

and then check that the custom CVs are indeed what they are expected by plotting the COLVAR file with matplotlib.