Aims

The aim of this tutorial is to introduce the use of VES for obtaining kinetic information of activated processes. We will learn how to set up a biased simulation using a fixed flooding potential acting on a relevant slow degree of freedom. We can then scale the accelerated MD time in a post-processing procedure.

Learning Outcomes

Once this tutorial is completed students will:

Optimize a bias using VES with an energy cutoff to selectively fill low regions of the free energy surface.
Use the optimized bias to observe several rare event transitions
Post-process the accelerated trajectory to obtain an unbiased estimate of the transition rate
Compute the empirical cumulative distribution of first passage times and compare to a theoretical model.

Resources

The tarball for this project contains the following files:

CH.airebo : Force field parameters for LAMMPS
input : LAMMPS input script
data.start : Starting configuration in LAMMPS format
plumed.dat : Example PLUMED file
time-reweighting.py : Python script for post-processing trajectory
TRAJECTORIES-1700K : A directory containing many trajectories for post processing
get-all-fpt.py : A python script to extract transition times from the trajectories
cdf.analysis.py : A python script to compute the cumulative probability distribution and perform KS test

Requirements

python with numpy, scipy, and statsmodels
LAMMPS compiled with MANYBODY package
VMD for visualization

Instructions

Stone Wales Transformation

Exercise 1A. Preliminary investigation using VES

As an example of an activated process in materials science, we will work with the Stone-Wales transformation in a carbon nanotube. The tarball for this project contains the inputs necessary to run a simulation in LAMMPS for a 480 atom carbon nanotube. We use the AIREBO (Adaptive Intermolecular Reactive Empirical Bond Order) force field parameters which can approximately describe C-C bond breakage and formation at reasonable computational cost. For CVs we can use the coordination number in PLUMED to measure the number of covalent bonds among different groups of atoms. The transformation involves breaking two C-C bonds and forming two alternative C-C bonds.A definition of these CVs as well as the relevant C-C bonds are depicted in Figure stone-wales. We prepare a PLUMED input file as follows.

Click on the labels of the actions for more information on what each action computes

# set distance units to angstrom, time to ps, and energy to eV
UNITS LENGTHthe units of lengths. 
=A TIMEthe units of time. 
=ps ENERGYthe units of energy. 
=eV The UNITS action with label 
# define two variables
CV1: COORDINATION GROUPAFirst list of atoms. 
=229,219 GROUPBSecond list of atoms (if empty, N*(N-1)/2 pairs in GROUPA are counted). 
=238,207 R_0 could not find this keyword 
=1.8 NNcompulsory keyword ( default=6 )
The n parameter of the switching function 
=8 MMcompulsory keyword ( default=0 )
The m parameter of the switching function; 0 implies 2*NN 
=16 PAIR( default=off ) Pair only 1st element of the 1st group with 1st element in the second,
etc 
 The COORDINATION action with label CV1 calculates a single scalar value
CV2: COORDINATION GROUPAFirst list of atoms. 
=229,219 GROUPBSecond list of atoms (if empty, N*(N-1)/2 pairs in GROUPA are counted). 
=207,238 R_0 could not find this keyword 
=1.8 NNcompulsory keyword ( default=6 )
The n parameter of the switching function 
=8 MMcompulsory keyword ( default=0 )
The m parameter of the switching function; 0 implies 2*NN 
=16 PAIR( default=off ) Pair only 1st element of the 1st group with 1st element in the second,
etc 
 The COORDINATION action with label CV2 calculates a single scalar value
# the difference between variables
d1: COMBINE ARGthe input for this action is the scalar output from one or more other actions. 
=CV1,CV2 COEFFICIENTScompulsory keyword ( default=1.0 )
the coefficients of the arguments in your function 
=1,-1 POWERScompulsory keyword ( default=1.0 )
the powers to which you are raising each of the arguments in your function 
=1,1 PERIODICcompulsory keyword 
if the output of your function is periodic then you should specify the periodicity
of the function. 
=NO The COMBINE action with label d1 calculates a single scalar value

In the above, the first line sets the energy units. The second and third line define the two CVs for the C-C covalent bonds. (We have chosen atoms 238,207,229 and 219; however this choice is arbitrary and other atoms could equally well have been chosen.) Lastly, we define a simple approximate reaction coordinate given by the difference between CV1 and CV2 that we can use to monitor the transition.

Next we will use VES to drive the transformation at 1700 K. We bias the formation of bonds DB and AC shown in Figure stone-wales using CV2 which changes from 0 to 2 during the transformation. (Although a more rigorous treatment would bias both CVs, for this tutorial we will simplify things and work in only one dimension). We choose a Chebyshev polynomial basis set up to order 36

Click on the labels of the actions for more information on what each action computes

__FILL__  You cannot view the components that are calculated by each action for this input file. Sorry 
# The basis set to use
bf1: BF_CHEBYSHEV ORDERcompulsory keyword 
The order of the basis function expansion. 
=36 MINIMUMcompulsory keyword 
The minimum of the interval on which the basis functions are defined. 
=0.0 MAXIMUMcompulsory keyword 
The maximum of the interval on which the basis functions are defined. 
=2.0  You cannot view the components that are calculated by each action for this input file. Sorry

and we tell PLUMED to use VES acting on CV2 with a free energy cutoff

Click on the labels of the actions for more information on what each action computes

__FILL__  You cannot view the components that are calculated by each action for this input file. Sorry 
td_uniform: TD_UNIFORM  You cannot view the components that are calculated by each action for this input file. Sorry 
variational: VES_LINEAR_EXPANSION ...
   ARGthe input for this action is the scalar output from one or more other actions. =CV2 
   BASIS_FUNCTIONScompulsory keyword 
the label of the one dimensional basis functions that should be used. =bf1 
   
   TEMPthe system temperature - this is needed if the MD code does not pass the temperature
to PLUMED. =1700 
   BIAS_CUTOFFcutoff the bias such that it only fills the free energy surface up to certain level
F_cutoff, here you should give the value of the F_cutoff. =15.0 
   BIAS_CUTOFF_FERMI_LAMBDAthe lambda value used in the Fermi switching function for the bias cutoff (BIAS_CUTOFF),
the default value is 10.0. =10.0 
   TARGET_DISTRIBUTIONthe label of the target distribution to be used. 
=td_uniform 
... You cannot view the components that are calculated by each action for this input file. Sorry

Here we are biasing CV2 with the basis set defined above. The final two lines impose the cutoff at 15 eV. The cutoff is of the form

\[ \frac{1}{1+e^{\lambda [F(s)-F_c]}} \]

where \(\lambda\) (inverse energy units) controls how sharply the function goes to zero. Above we have set \(\lambda=10.0\) to ensure the cutoff goes sharply enough to zero.

The following input updates the VES bias every 200 steps, writing out the bias every 10 iterations. To enforce the energy cutoff we also need to update the target distribution which we do every 40 iterations with the TARGETDIST_STRIDE flag.

Click on the labels of the actions for more information on what each action computes

__FILL__  You cannot view the components that are calculated by each action for this input file. Sorry 
var-S: OPT_AVERAGED_SGD ...
   BIAScompulsory keyword 
the label of the VES bias to be optimized =variational 
   STRIDEcompulsory keyword 
the frequency of updating the coefficients given in the number of MD steps. =200 
   
   STEPSIZEcompulsory keyword 
the step size used for the optimization =0.1 
   COEFFS_FILEcompulsory keyword ( default=coeffs.data )
the name of output file for the coefficients =coeffs.dat 
   BIAS_OUTPUThow often the bias(es) should be written out to file. =10 
   TARGETDIST_STRIDEstride for updating a target distribution that is iteratively updated during the
optimization. =40 
   TARGETDIST_OUTPUThow often the dynamic target distribution(s) should be written out to file. =40 
   COEFFS_OUTPUTcompulsory keyword ( default=100 )
how often the coefficients should be written to file. =1 
... You cannot view the components that are calculated by each action for this input file. Sorry

Finally, we will stop the simulation when the transition occurs using the COMMITTOR in PLUMED

Click on the labels of the actions for more information on what each action computes

__FILL__  You cannot view the components that are calculated by each action for this input file. Sorry 
COMMITTOR ARGthe input for this action is the scalar output from one or more other actions. 
=d1 BASIN_LL1compulsory keyword 
List of lower limits for basin # You can use multiple instances of this keyword i.e.
=-2.0 BASIN_UL1compulsory keyword 
List of upper limits for basin # You can use multiple instances of this keyword i.e.
=-1.0 STRIDEcompulsory keyword ( default=1 )
the frequency with which the CVs are analyzed 
=600  You cannot view the components that are calculated by each action for this input file. Sorry

Run the simulation using LAMMPS

lmp_mpi < input

and plot the last few bias output files (bias.variational.iter-n.data) using gnuplot. What is the difference between the maximum and minimum values of the bias obtained during the simulation? Is the cutoff value sufficient to cross the barrier? Is the cutoff value too large?

Figure Figure1A shows an example bias potential after 90 iterations. Note that a cutoff of 15 eV is too large as the system transitions before the bias reaches the prescribed cutoff. From Figure Figure1A we see a barrier height of approximately 7.3 eV.

Figure-1A

The output also produces an movie.xyz file which can be viewed in vmd. For better visualization, first change atom id from 1 to C by typing

sed "s/^1 /C /" movie.xyz > newmovie.xyz

Then load the newmovie.xyz into vmd and choose Graphics --> Representations and set Drawing Method to DynamicBonds. Create a second representation by clicking Create Rep and set the Drawing Method to VDW. Change the sphere scale to 0.3 and play the movie. You should observe the Stone-Wales transformation right before the end of the trajectory.

Exercise 1B. Set up and run a VES bias imposing a cutoff

In this exercise we will run a VES simulation to fill the FES only up to a certain cutoff. This will be the first step in order to obtain kinetic information from biased simulations. In the previous section, we observed that a cutoff of 15 eV is too strong for our purpose. Change the cutoff energy from 15.0 to 6.0 eV by setting

BIAS_CUTOFF=6.0

and rerun the simulation from Exercise 1A. [Note: In practice one can use multiple walkers during the optimization by adding the flag MULTIPLE_WALKERS]

Plot some of the bias files (bias.variational.iter-n.data) that are printed during the simulation using gnuplot. At the end of the simulation, you should be able to reproduce something like Figure Figure1B. Is the bias converging? If so, how many iterations does it require to converge?

Figure Figure1B shows the bias potential after 70,80, and 90 iteration steps. Note that the bias has reached the cutoff and goes to zero at around 1 Angstrom.

Figure-1B

Exercise 2. Using a fixed bias as a flooding potential to obtain rates

In this exercise we will use the bias obtained above as a static umbrella potential. We will set up and run a new trajectory to measure the first passage time of escape from the well.

We can extract the coefficients that we need from the final iteration in Exercise-1B above.

tail -n 47 coeffs.dat > fixed-coeffs.dat

Now create a new directory from which you will run a new simulation and copy the necessary input files (including the fixed-coeffs.dat) into this directory. Modify the PLUMED file so that the optimized coefficients are read by the VES_LINEAR_EXPANSION

Click on the labels of the actions for more information on what each action computes

__FILL__  You cannot view the components that are calculated by each action for this input file. Sorry 
td_uniform: TD_UNIFORM  You cannot view the components that are calculated by each action for this input file. Sorry 
variational: VES_LINEAR_EXPANSION ...
   ARGthe input for this action is the scalar output from one or more other actions. =CV2 
   BASIS_FUNCTIONScompulsory keyword 
the label of the one dimensional basis functions that should be used. =bf1 
   
   TEMPthe system temperature - this is needed if the MD code does not pass the temperature
to PLUMED. =1700 
   BIAS_CUTOFFcutoff the bias such that it only fills the free energy surface up to certain level
F_cutoff, here you should give the value of the F_cutoff. =6.0 
   BIAS_CUTOFF_FERMI_LAMBDAthe lambda value used in the Fermi switching function for the bias cutoff (BIAS_CUTOFF),
the default value is 10.0. =10.0 
   TARGET_DISTRIBUTIONthe label of the target distribution to be used. 
=td_uniform 
   COEFFSread in the coefficients from files. =fixed-coeffs.dat 
... You cannot view the components that are calculated by each action for this input file. Sorry

The final line specifies the coefficients to be read from a file.

Make sure to remove the lines for the stochastic optimization (OPT_AVERAGED_SGD) as we no longer wish to update the bias.

We will also perform metadynamics with an infrequent deposition stride to ensure that the trajectory does not get stuck in any regions where the bias potential is not fully converged The following implements metadynamics on both CV1 and CV2 with a deposition stride of 4000 steps and a hill height of 0.15 eV.

Click on the labels of the actions for more information on what each action computes

__FILL__  You cannot view the components that are calculated by each action for this input file. Sorry 
metad: METAD ...
   ARGthe input for this action is the scalar output from one or more other actions. =CV1,CV2 
   SIGMAcompulsory keyword 
the widths of the Gaussian hills =0.2,0.2 
   HEIGHTthe heights of the Gaussian hills. =0.15 
   PACEcompulsory keyword 
the frequency for hill addition =4000 
   
... You cannot view the components that are calculated by each action for this input file. Sorry

Again we will use the COMMITTOR to stop the trajectory after the transition.

Click on the labels of the actions for more information on what each action computes

__FILL__  You cannot view the components that are calculated by each action for this input file. Sorry 
COMMITTOR ARGthe input for this action is the scalar output from one or more other actions. 
=d1 BASIN_LL1compulsory keyword 
List of lower limits for basin # You can use multiple instances of this keyword i.e.
=-2.0 BASIN_UL1compulsory keyword 
List of upper limits for basin # You can use multiple instances of this keyword i.e.
=-1.0  You cannot view the components that are calculated by each action for this input file. Sorry

Now run a trajectory with the fixed bias. What is the time (biased) to escape? Plot the trajectory of the approximate reaction coordinate (column 4 in the COLVAR) in gnuplot. An example is shown in Figure Figure2.

Figure-2

Also look at the metadynamics bias in the column labeled metad.bias. How does the magnitude compare to the bias from VES in column variational.bias?

The crossing time for a single event doesn't tell us much because we don't have any statistics on the transition events. To obtain the mean first passage time, we have to repeat the calculation many times. To generate statistically independent samples, we have to change the seed for the random velocities that are generated in the LAMMPS input file. In a new directory, copy the necessary files and edit the following line in the input file

velocity      all create 1700.  495920

Choose a different 6 digit random number and repeat Exercise 2. How does the escape time compare to what you obtained before? Repeat the procedure several times with different velocity seeds to get a distribution of first passage times. Make sure you launch each simulation from a separate directory and keep all COLVAR files as you will need them in the next section where we will analyze the transition times.

Exercise 3. Post processing to obtain unbiased estimate for the transition rate

In the previous section you generated several trajectories with different first passage times. However, these times need to be re-weighted to correct for the bias potential. We can scale the time according to the hyperdynamics formula

\[ t^{*} = \Delta t_{MD} \sum_i^{n} e^{\beta V(s)} \]

Note that we need to add the total bias at each step, coming from both the VES bias and metadynamics. The python script time-reweighting.py will read the COLVAR from Exercise 2 and print the final reweighted time (in seconds). Open the script to make sure you understand how it works.

Run the script using

python time-reweighting.py

Note that the output first passage time is converted to seconds. What is the acceleration factor of our biased simulation? (i.e. the ratio of biased to unbiased transition times) The script also produces a time-reweighted trajectory COLVAR-RW for a specified CV (here we choose the approximate reaction coordinate, d1). Plot the reweighted COLVAR-RW in gnuplot and compare the original vs. time-reweighted trajectories. In particular, what effect does rescaling have on the time step?

Rerun the script for each of the trajectories you have run with the fixed bias and compute the mean first passage time from your data. The Stone-Wales transformation at 1700 K is estimated in fullerene to be ~10 days. How does your average time compare to this value?

The distribution of first passage times for an activated processes typically follows an exponential distribution. Instead of directly making a histogram of the first passage times, we can look at the cumulative distribution function which maps a value x to the fraction of values less than or equal to x. Since we are computing the cumulative distribution from a data set, this is called the empirical cumulative distribution (ECDF). On the other hand, the theoretical cumulative distribution (CDF) of an exponentially distributed random process is

\[ P(t) = 1-e^{-t/\tau} \]

where \(\tau\) is the mean first passage time.

In order to calculate the ECDF we need many trajectories. Several COLVAR files are included in the TRAJECTORIES-1700K directory. The script get-all-fpt.py is a modified version of time-reweighting.py and will calculate the first passage time (fpt) from all the simulation data and will output the times to a file fpt.dat. Run the script from the TRAJECTORIES-1700K directory

python get-all-fpt.py

You should obtain the output file fpt.dat which has a list of all the times. You can append your own values you obtained from Exercise 2 to the end of the list to increase the number of data points.

The script cdf-analysis.py will compute the ECDF and fit the distribution to the theoretical CDF. The script uses the statsmodels Python module. Run the script with

python cdf-analyysis.py

from the same directory where the fpt.dat file is located. The script prints both the mean first passage time of the data as well as the fit parameter \(\tau\). How do these two values compare?

Figure-3

Figure Figure3 shows an example fit of the ECDF to the theoretical CDF for a Poisson process. To test the reliability of the fit, we can generate some data set according the theoretical CDF and perform a two-sample Kolmogorov-Smirnov (KS) test. The KS test provides the probability that the two sets of data are drawn from the same underlying distribution expressed in the so-called p-value. The null hypothesis is typically rejected for p-value < 0.05. The script cdf-analysis.py also performs the KS test and prints the p-value. What is the p-value we obtain from your dataset of transition times?