PLUMED Masterclass 22.11: Variationally enhanced sampling with PLUMED
Date
July 4, 2022

# Aims

In this Masterclass, we will discuss how to run variationally enhanced sampling simulations with PLUMED. We will also understand how to analyze the results.

# Objectives

Once you have completed this Masterclass you will be able to:

• Run variationally enhanced sampling simulations biasing one and two CVs
• Assess the convergence of the simulation
• Perform reweighting from a variationally enhanced sampling simulations

Use PLUMED to run and analyze

• Use PLUMED to reweight from

# Setting up PLUMED

For this masterclass you will need versions of PLUMED (with the VES module enabled) and GROMACS that are compiled using the MPI library. All the exercises were tested with PLUMED version 2.8.0 and GROMACS 2020.6. In order to obtain the correct versions, please follow the instructions at this link.

The data needed to execute the exercises of this Masterclass can be found on GitHub. You can clone this repository locally on your machine using the following command:

git clone https://github.com/valsson-group/masterclass-22-11.git


# Summary of theory

Here, we will briefly summarize the theory behind variationally enhanced sampling (VES). For an full overview of VES, you should read the original paper, a recent review book chapter on VES, or a recent enhanced sampling review that includes discussion about VES.

VES is based on the the following functional of the bias potential:

$\Omega [V] = \frac{1}{\beta} \log \frac {\int d\mathbf{s} \, e^{-\beta \left[ F(\mathbf{s}) + V(\mathbf{s})\right]}} {\int d\mathbf{s} \, e^{-\beta F(\mathbf{s})}} + \int d\mathbf{s} \, p_{\mathrm{tg}}(\mathbf{s}) V(\mathbf{s}),$

where $$\mathbf{s}$$ are the CVs that we are biasing, $$p_{\mathrm{tg}}(\mathbf{s})$$ is a predefined probability distribution that we will refer to as the target distribution, and $$F(\mathbf{s})$$ is the free energy surface. This functional can be shown to be convex and to have a minimum at:

$V(\mathbf{s}) = -F(\mathbf{s})-{\frac {1}{\beta}} \log {p_{\mathrm{tg}}(\mathbf{s})}.$

The last equation states that when we minimize the functional $$\Omega [V]$$, we can obtain the free energy surface from the bias potential (and the target distribution) We can choose the target distribution $$p_{\mathrm{tg}}(\mathbf{s})$$ at will and it is the CV distribution that we obtain when minimizing $$\Omega [V]$$.

We put the variational principle to practice by expanding $$V(\mathbf{s})$$ in some basis set:

$V(\mathbf{s}) = \sum\limits_{i} \alpha_i \, f_i(\mathbf{s}),$

where $$f_i(\mathbf{s})$$ are the basis functions and the $$\boldsymbol\alpha$$ are the coefficients in the expansion. We then need to find the coefficients $$\boldsymbol\alpha$$ that minimize $$\Omega [V]$$. In principle one could use any optimization algorithm. In practice the algorithm that has become the default choice for VES is the so-called averaged stochastic gradient descent algorithm [9]. In this algorithm the $$\boldsymbol\alpha$$ are evolved iteratively according to:

$\boldsymbol\alpha^{(n+1)} = \boldsymbol\alpha^{(n)}-\mu \left[ \nabla\Omega(\bar{\boldsymbol\alpha}^{(n)})+ H(\bar{\boldsymbol\alpha}^{(n)})[\boldsymbol\alpha^{(n)}-\bar{\boldsymbol\alpha}^{(n)}] \right]$

where $$\mu$$ is the step size, $$\bar{\boldsymbol\alpha}^{(n)}$$ is the running average of $$\boldsymbol\alpha^{(n)}$$ at iteration $$n$$, and $$\nabla\Omega(\bar{\boldsymbol\alpha}^{(n)})$$ and $$H(\bar{\boldsymbol\alpha}^{(n)})$$ are the gradient and Hessian of $$\Omega[V]$$ evaluated at the running average at iteration $$n$$, respectively. The behavior of the coefficients will become clear in the examples below.

As said above, we can choose the target distribution $$p_{\mathrm{tg}}(\mathbf{s})$$ at will. The most simple choice would be a uniform target distribution. However, it has found more optimal to employ the so-called well-tempered distribution [137] :

$p_{\mathrm{tg}}(\mathbf{s}) = \frac{[ P(\mathbf{s}) ]^{1/\gamma}} {\int d\mathbf{s}\, [ P(\mathbf{s}) ]^{1/\gamma}}$

where $$\gamma$$ is the so-called bias factor and $$P(\mathbf{s})$$ is the unbiased CV distribution. Therefore the well-tempered distribution is the unbiased distribution with enhanced fluctuations and lowered barriers. This is the same distribution as sampled in well-tempered metadynamics. The advantages of this distribution are that the features of the FES (metastable states) are preserved and that the system is not forced to sample regions of high free energy (that can represent un-physical configurations) as it would if we had chosen the uniform target distribution.

There is a caveat though, the well-tempered $$p_{\mathrm{tg}}(\mathbf{s})$$ depends on $$F(\mathbf{s})$$ that is the function that we are trying to calculate. One way to approach this problem is to calculate $$p_{\mathrm{tg}}(\mathbf{s})$$ self-consistently [137], for instance at iteration $$k$$:

$p^{(k+1)}(\mathbf{s})=\frac{e^{-(\beta/\gamma) F^{(k+1)}(\mathbf{s})}}{\int d\mathbf{s} \, e^{-(\beta/\gamma) F^{(k+1)}(\mathbf{s})}}$

where:

$F^{(k+1)}(\mathbf{s})=-V^{(k)}(\mathbf{s}) - \frac{1}{\beta} \log p^{(k)}(\mathbf{s})$

Normally $$p^{(0)}(\mathbf{s})$$ is taken to be uniform. Therefore the target distribution evolves in time until it becomes stationary when the simulation has converged. It has been shown that in some cases the convergence is faster using the well-tempered target distribution than using the uniform $$p(\mathbf{s})$$ [137].

# The system

In this tutorial, we will consider the association/dissociation of NaCl in aqueous solution. The system consists of 1 Na atom, 1 Cl atom, and 107 water molecules for a total of 323 atoms. In an effort to speed up the simulations, we employ a rather small water box, and thus need to employ smaller cutoffs than usually used. Therefore, this simulation setup should not be used in production runs. Typically, the run should take around 15-20 minutes to run on a laptop using two MPI processes. By running the simulations on a cluster, you reduce the simulation time.

The most relevant CV for this system is the distance between the Na and Cl atoms that is defined in PLUMED as

Click on the labels of the actions for more information on what each action computes
dist: DISTANCE ATOMSthe pair of atom that we are calculating the distance between. =322,323


Furthermore, the NaCl association/dissociation is coupled to the collective motion of the solvent. To measure that, we will use a CV that measures the solvation of the Na atom. For this, we employ the coordination number of the Na atom with respect to the oxygens of the water molecules that we define in PLUMED as

Click on the labels of the actions for more information on what each action computes
coord: COORDINATION ...
GROUPAFirst list of atoms. =322
GROUPBSecond list of atoms (if empty, N*(N-1)/2 pairs in GROUPA are counted). =1-321:3
SWITCHThis keyword is used if you want to employ an alternative to the continuous switching
function defined above. ={RATIONAL R_0=0.315 D_MAX=0.5 NN=12 MM=24}
NLIST( default=off ) Use a neighbor list to speed up the calculation
NL_CUTOFFThe cutoff for the neighbor list =0.55
NL_STRIDEThe frequency with which we are updating the atoms in the neighbor list =10

...


We will also limit CV space exploration by employing an upper wall on the distance between Na and Cl atoms that is defined in PLUMED as

Click on the labels of the actions for more information on what each action computes
uwall: UPPER_WALLS ...
ARGthe input for this action is the scalar output from one or more other actions. =dist
ATcompulsory keyword
the positions of the wall. =0.6
KAPPAcompulsory keyword
the force constant for the wall. =4000.0

...


# Exercise 1: Biasing with one collective variable

We will start by performing a simulation where we bias the Na-Cl distance.

Every VES simulation has three key ingredients

• Basis set
• Target distribution
• Optimization algorithm

For the basis set, we will use the recently introduced wavelet-based basis set that are localized basis functions that have been shown to perform better than the previously used Chebyshev or Legendre polynomials. We will employ the least asymmetric variant of these wavelets or so-called symlets (as indicated by the TYPE=SYMLETS keyword). We will use an order 10 of the symlets or Sym10 (as indicated by the ORDER=10 keyword). Furthermore information about the wavelets can be found in the reference above.

We need to select the range on which the bias potential is expanded. Here we will use the range from 0.2 nm to 0.7 nm (as indicated by the MINIMUM and MAXIMUM keywords). We also need to select the number of basis functions, and here we will use 26 basis functions (as indicated by the NUM_BF keyword). The PLUMED action corresponding to this setup is given by

Click on the labels of the actions for more information on what each action computes
# Basisset for Na-Cl distance
bf1: BF_WAVELETS ...

TYPEcompulsory keyword
Specify the wavelet type. =SYMLETS
ORDERcompulsory keyword
The order of the basis function expansion. =10
NUM_BFThe number of basis functions that should be used. =26
MINIMUMcompulsory keyword
The minimum of the interval on which the basis functions are defined. =0.2
MAXIMUMcompulsory keyword
The maximum of the interval on which the basis functions are defined. =0.7
TAILS_THRESHOLDThe threshold for cutting off tail wavelets as a fraction of the maximum value. =0.01
...


For the target distribution, we employ a well-tempered distribution with a bias factor of 10.

Click on the labels of the actions for more information on what each action computes
# Target distribution
td: TD_WELLTEMPERED BIASFACTORcompulsory keyword
The bias factor used for the well-tempered distribution. =10


Then we define the VES bias potential using the VES_LINEAR_EXPANSION action

Click on the labels of the actions for more information on what each action computes
ves: VES_LINEAR_EXPANSION ...

ARGthe input for this action is the scalar output from one or more other actions. =dist
BASIS_FUNCTIONScompulsory keyword
the label of the one dimensional basis functions that should be used. =bf1
TEMPthe system temperature - this is needed if the MD code does not pass the temperature
to PLUMED. =300.0
GRID_BINSthe number of bins used for the grid. =300
OPTIMIZATION_THRESHOLDThreshold value to turn off optimization of localized basis functions. =0.000001
TARGET_DISTRIBUTIONthe label of the target distribution to be used. =td
...


Finally, we need to define the optimization algorithm. The standard is the averaged stochastic gradient descent (OPT_AVERAGED_SGD). We need to define two parameters: the stride and the step size. The stride (given by the keyword STRIDE) is the number of MD steps in which samples are collected to calculate the gradient and hessian of $$\Omega [V]$$. The step size (given by the keyword STEPSIZE) is the step by which the coefficients are evolved at every optimization steps, given by $$\mu$$ in the equation above. It has become traditional to choose a stride of around 500-2000 MD steps. It must be noted that we are not looking for an accurate estimation of the gradient, since for this we would need to sample all the CV space. The step size in the optimization has a strong connection with the height of typical barriers in the system. The larger the barriers, the larger the step size needed such that the bias can grow fast enough to overcome them. For this example we have chosen a stride of 500 steps (i.e., 1 ps) and a step size of 5.0 kJ/mol. We also need to choose how often we update the target distribution (given by the keyword TARGETDIST_STRIDE) and we do this every 100 bias potential updates (i.e., every 100 ps in the current case).

Click on the labels of the actions for more information on what each action computes
opt: OPT_AVERAGED_SGD ...

BIAScompulsory keyword
the label of the VES bias to be optimized =ves
STRIDEcompulsory keyword
the frequency of updating the coefficients given in the number of MD steps. =500
STEPSIZEcompulsory keyword
the step size used for the optimization =5.0
FES_OUTPUThow often the FES(s) should be written out to file. =100
BIAS_OUTPUThow often the bias(es) should be written out to file. =500
COEFFS_OUTPUTcompulsory keyword ( default=100 )
how often the coefficients should be written to file. =10
TARGETDIST_STRIDEstride for updating a target distribution that is iteratively updated during the
optimization. =100
TARGETDIST_OUTPUThow often the dynamic target distribution(s) should be written out to file. =500
...


The other parameters are related to the outputting frequency of various output files.

Finally, we need to define the PRINT action to output all the variables

Click on the labels of the actions for more information on what each action computes
PRINT ARGthe input for this action is the scalar output from one or more other actions. =dist,coord,ves.*,uwall.* FILEthe name of the file on which to output these quantities =colvar.data STRIDEcompulsory keyword ( default=1 )
the frequency with which the quantities of interest should be output =125


The full PLUMED input file can be found in the Exercise-1 folder. There you also find all the relevant GROMACS input file. First, you need to run the generate-tpr-file.sh script that generates the GROMACS TPR file using the parameters defined in MD-NPT.mdp and the initial geometry defined using the StartingGeometry variable. You can then run the simulation using the run.sh script

./generate-tpr-file.sh
./run.sh


The run might take around 15-20 minutes to run using two MPI processes. You can adjust the number of MIP processes used for the simulation using the NumProcs variable in the run.sh script.

At the end of simulation, you will get several files:

• colvar.data: Colvar file
• coeffs.data : Values of the coefficients $$\boldsymbol\alpha$$ and $$\bar{\boldsymbol\alpha}$$ at different iterations.
• bias.<bias-label>.iter-<iteration-number>.data : Bias potential at iteration <iteration-number>.
• fes.<bias-name>.iter-<iteration-number>.data : FES at iteration <iteration-number>.
• targetdistribution.<bias-name>.iter-<iteration-number>.data : Target distribution at iteration <iteration-number>.

To assess the simulation and its convergence, you should first look at the time evolution of the biased CV and check that it is diffusive in CV space. Second, you should look at how the free energy surfaces behave as a function of time by looking at the fes.<bias-name>.iter-<iteration-number>.data files at different number of iterations (the minimum of the FES is always align to zero to facilitate comparison). To do this, you need to use your favorite way to plot datafiles (e.g., Matplotlib or Gnuplot).

You can also visualize the trajectory by opening it with VMD by using the command

vmd NaCl_StartingStructure-1.gro NaCl_VES_NPT-300K.pbc-whole.xtc


The NaCl_VES_NPT-300K.pbc-whole.xtc is the trajectory file with the periodic boundary conditions made whole. This is done with the run.sh script.

The coeffs.data file includes the values of coefficients $$\boldsymbol\alpha$$ and $$\bar{\boldsymbol\alpha}$$ at different iterations. To extract the time evolution of a given coefficient, you can use the ExtractCoeff.sh script, for example to extract coefficient 3:

./ExtractCoeff.sh 3 > coeff.3.data


This will create a file with the first column the iteration number, the second column the averaged coefficient $$\bar{\boldsymbol\alpha}$$, and the third column the instantaneous coefficients $$\boldsymbol\alpha$$. You should create files for different coefficient and visualize both the second and third column to understand how the coefficients converge.

# Exercise 2: Reweighting a VES simulation

Apart from estimating the FES directly from the bias potential, you can also estimate the FES through reweighting by histogramming where each configuration is weighted by the bias acting on it, $$e^{\beta V(\mathbf{s})}$$. The VES bias acting at each time step is given by the ves.bias variable in the colvar.dat file.

When doing performing reweighting, it is better to ignore the initial part of the simulation where the bias potential is changing more rapidly. You can use the trim-colvar-file.py python script in the Exercise-2 folder to do this

./trim-colvar-file.py --colvar-file ../Exercise-1/colvar.data --output-file colvar_reweight.data --time-min 400


where here we ignore the first 400 ps of the colvar.data file from the Exercise-1 and create a new file called colvar_reweight.data.

We can then perform the reweighting for the distance using the following PLUMED input (plumed_reweight.dat in the Exercise-2 folder)

Click on the labels of the actions for more information on what each action computes
dist: READ FILEcompulsory keyword
the name of the file from which to read these quantities =colvar_reweight.data IGNORE_TIME( default=off ) ignore the time in the colvar file.  VALUEScompulsory keyword
the values to read from the file =dist
the name of the file from which to read these quantities =colvar_reweight.data IGNORE_TIME( default=off ) ignore the time in the colvar file.  VALUEScompulsory keyword
the values to read from the file =coord
the name of the file from which to read these quantities =colvar_reweight.data IGNORE_TIME( default=off ) ignore the time in the colvar file.  VALUEScompulsory keyword
the values to read from the file =ves.bias
weights: REWEIGHT_BIAS TEMPthe system temperature. =300 ARGcompulsory keyword ( default=*.bias )
the biases that must be taken into account when reweighting =ves.bias
hg_dist: HISTOGRAM ...
ARGthe input for this action is the scalar output from one or more other actions. =dist
GRID_MINcompulsory keyword
the lower bounds for the grid =0.2
GRID_MAXcompulsory keyword
the upper bounds for the grid =0.7
GRID_BINthe number of bins for the grid =60
KERNELcompulsory keyword ( default=gaussian )
the kernel function you are using. =DISCRETE
LOGWEIGHTSlist of actions that calculates log weights that should be used to weight configurations
when calculating averages =weights

...
fes_dist: CONVERT_TO_FES GRIDcompulsory keyword
the action that creates the input grid you would like to use =hg_dist TEMPthe temperature at which you are operating =300 MINTOZERO( default=off ) set the minimum in the free energy to be equal to zero
DUMPGRID GRIDcompulsory keyword
the action that creates the grid you would like to output =fes_dist FILEcompulsory keyword ( default=density )
the file on which to write the grid. =fes-reweight.dist.data FMTthe format that should be used to output real numbers =%24.16e


You can run this input by using the PLUMED driver

plumed driver --plumed plumed_reweight.dat --noatoms


You should compare the resulting FES (the fes-reweight.dist.data file) to the results obtained directly from the bias potential in Exercise 1.

We can also obtained the FES for CVs that are not biased in the VES simulation. For example, we can obtain the two-dimensional FES for the distance and the solvation CV given by the coordination number CV. For this, you will need to add the following to the plumed_reweight.dat file and repeat the PLUEMD driver run

Click on the labels of the actions for more information on what each action computes
hg_dist_coord: HISTOGRAM ...
ARGthe input for this action is the scalar output from one or more other actions. =dist,coord
GRID_MINcompulsory keyword
the lower bounds for the grid =0.2,2.5
GRID_MAXcompulsory keyword
the upper bounds for the grid =0.7,7.5
GRID_BINthe number of bins for the grid =200,200
BANDWIDTHcompulsory keyword
the bandwidths for kernel density estimation =0.004,0.04
LOGWEIGHTSlist of actions that calculates log weights that should be used to weight configurations
when calculating averages =weights

...
fes_dist_coord: CONVERT_TO_FES GRIDcompulsory keyword
the action that creates the input grid you would like to use =hg_dist_coord TEMPthe temperature at which you are operating =300 MINTOZERO( default=off ) set the minimum in the free energy to be equal to zero
DUMPGRID GRIDcompulsory keyword
the action that creates the grid you would like to output =fes_dist_coord FILEcompulsory keyword ( default=density )
the file on which to write the grid. =fes-reweight.dist-coord.data FMTthe format that should be used to output real numbers =%24.16e


Note that here we use kernel density estimation with Gaussian kernels to obtain smoother results.

You can also try to obtain the one-dimensional FES for the solvation CV by adjusting the input above.

This will generate a two-dimensional FES that you can visualize.

# Exercise 3: Run another independent simulation

To check the results, it is a good practice to run another independent simulation using different initial conditions. You can achieve this here by changing the initial geometry in the generate-tpr-file.sh script

StartingGeometry=NaCl_StartingStructure-2.gro


and regenerating the GROMACS tpr file. Do this and rerun the simulation, check the convergence, and perform reweighting as before. Make sure that you do this in a new clean folder that is separate from the run in Exercise 1.

# Exercise 4: biasing with two collective variables

We will now bias also the solvation CV. To achieve this, we need first to setup a separate basis set for the solvation CV, where again we use the symlets but with a different range from 2.0 to 8.0

Click on the labels of the actions for more information on what each action computes
# Basisset for coordination number
bf2: BF_WAVELETS ...

TYPEcompulsory keyword
Specify the wavelet type. =SYMLETS
ORDERcompulsory keyword
The order of the basis function expansion. =10
NUM_BFThe number of basis functions that should be used. =22
MINIMUMcompulsory keyword
The minimum of the interval on which the basis functions are defined. =2.5
MAXIMUMcompulsory keyword
The maximum of the interval on which the basis functions are defined. =7.5
TAILS_THRESHOLDThe threshold for cutting off tail wavelets as a fraction of the maximum value. =0.01
...


We also need to change the relevant keywords in the VES_LINEAR_EXPANSION action, namely the ARG, BASIS_FUNCTIONS, and GRID_BINS keywords

Click on the labels of the actions for more information on what each action computes
ves: VES_LINEAR_EXPANSION ...

ARGthe input for this action is the scalar output from one or more other actions. =dist,coord
BASIS_FUNCTIONScompulsory keyword
the label of the one dimensional basis functions that should be used. =bf1,bf2
TEMPthe system temperature - this is needed if the MD code does not pass the temperature
to PLUMED. =300.0
GRID_BINSthe number of bins used for the grid. =300,300
OPTIMIZATION_THRESHOLDThreshold value to turn off optimization of localized basis functions. =0.000001
TARGET_DISTRIBUTIONthe label of the target distribution to be used. =td
PROJ_ARG1arguments for doing projections of the FES or the target distribution.. =dist
PROJ_ARG2arguments for doing projections of the FES or the target distribution.. =coord
...


Additionally, we have turned on the calculation of the one-dimensional projection of the FES on the two CVs (we also need to set the keyword FES_PROJ_OUTPUT=100 in OPT_AVERAGED_SGD). This is useful to assess the convergence as this is easier in one-dimension. We can also compare the projection to the results from Exercise 1. These changes should be sufficient to do the simulations using two CVs. You can see full input file in the Exercise-4 folder.

Once you have performed this simulation, you should also try to reweight from this simulations. Furthermore, if you have time, you should also try to perform another independent simulation.

# Optional exercises

The following three exercises are optional, but they will show you how different parameters effect the results. You should base these exercises on the files from the Exercise-1 folder and make the necessary changes. Make sure that you run these simulations in separate folders and start from clean files from the Exercise-1 folder.

# Optional exercise 5: Play with the optimization parameters

The main parameter in the optimization algorithm is the step size and it is not always easy to choose this parameter. Luckily, the algorithm is quite robust and will work for different step sizes.

Run different simulations using step sizes 0.5 and 50.0 and try to rationalize the behavior. Normally, when the step size is too large, the system gets stuck in CV space and coefficients oscillate wildly. When the step size is too small, the algorithm runs out of "steam" too fast and the simulation converges slowly. These two extreme cases should be avoided.

# Optional exercise 6: Uniform target distribution

Perform a simulation using an uniform target distribution and see how this changes the results.

In this case, you need to change the target distribution to

Click on the labels of the actions for more information on what each action computes
# Target distribution
td: TD_UNIFORM


and remove the TARGETDIST_STRIDE and TARGETDIST_OUTPUT keywords from the OPT_AVERAGED_SGD action.

# Optional exercise 7: Legendre polynomials basis function

Perform a simulation using Legendre polynomials (BF_LEGENDRE) basis functions instead of the wavelets and see how this will affect the result. As the Legendre polynomials are delocalized basis functions, this should lead to more fluctuations in the bias potential as has been observed in the paper introducing the wavelets.

In this case, you need to change the basis set action in the PLUMED input to

Click on the labels of the actions for more information on what each action computes
# Basisset
bf1: BF_LEGENDRE ...

ORDERcompulsory keyword
The order of the basis function expansion. =20
MINIMUMcompulsory keyword
The minimum of the interval on which the basis functions are defined. =0.2
MAXIMUMcompulsory keyword
The maximum of the interval on which the basis functions are defined. =0.7
...