PLUMED Masterclass 22.6: EDS module + Coarse-Grained directed simulations
Authors
Glen Hocky, Andrew White
Date
April 26, 2022

Aims

This Masterclass describes how to bias simulations to agree with experimental data using experiment directed simulation.

Objectives

Once this Masterclass is completed, you will know how to:

  • How to bias collective variables to agree with set values
  • Develop a deeper feeling how parameters in method change learning efficiency

Prerequisites

We assume that you are familiar with PLUMED and enhanced sampling calculations. If you are not, the 2021 PLUMED Masterclass is a great place to start. In particular, you should be familiar with specifying collective variables.

Furthermore, some Python or other programming knowledge is required for this masterclass to generate plots and perform some analysis calculations.

Overview

Experiment directed simulation (EDS) is a maximum entropy method for biasing specific collective variables (CVs) to agree with set values. These are typically from experimental data, like a known radius of gyration or NMR chemical shift. Biasing a CV is an under-determined problem because there are many ways to change the systems' potential energy to agree with a set point. If we further maximize ensemble entropy while matching the set point, the problem has a unique solution of

\[ U'(x) = U(x) + \lambda f(x) \]

where \(U(x)\) is the potential energy of the system, \(f(x)\) is the CV, and \(\lambda\) is a fit parameter. EDS is a time-dependent method that finds \(\lambda\) while the simulation is running. It typically converges much faster than free-energy methods, but comes with the same caveats that insufficient sampling or rare events can affect the method. Another important detail is that EDS/maximum entropy biasing is for matching the set point on average (in expectation), rather than at every frame.

Software and data

The only data needed to complete the exercises of this Masterclass can be found on an the following github page data, with alanine inputs being borrowed from an earlier one GitHub-22-03. Simulations will be performed using PLUMED's pesmd module, and gromacs.

Exercises

The exercises are presented below.

Setup 1-dimensional system.

Set up a plumed file to use with pesmd that has a harmonic potential. Then run using the 1d input provided in the github. For example:

Click on the labels of the actions for more information on what each action computes
tested on v2.9
d1: DISTANCE 
ATOMS
the pair of atom that we are calculating the distance between.
=1,2
COMPONENTS
( default=off ) calculate the x, y and z components of the distance separately and store them as label.x,
ff: MATHEVAL
ARG
the input for this action is the scalar output from one or more other actions.
=d1.x
PERIODIC
compulsory keyword if the output of your function is periodic then you should specify the periodicity of the function.
=NO
FUNC
compulsory keyword the function you wish to evaluate
=0.5*10*(x^2) bb: BIASVALUE
ARG
the input for this action is the scalar output from one or more other actions.
=ff PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=d1.x
STRIDE
compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be output
=25
FILE
the name of the file on which to output these quantities
=_PREFIX_.colvars.dat
plumed pesmd < harmonic_1d.in

Histogram your data, and see if it fits the ideal Boltzmann distribution for this harmonic oscillator. Make sure to normalize your distribution properly!

Now try adding a harmonic bias using the RESTRAINT function, so that the data becomes centered near \(x=1\).

Finally, we will add an EDS bias. Make your EDS bias also be centered at 1.

Click on the labels of the actions for more information on what each action computes
tested on v2.9
d1: DISTANCE 
ATOMS
the pair of atom that we are calculating the distance between.
=1,2
COMPONENTS
( default=off ) calculate the x, y and z components of the distance separately and store them as label.x,
ff: MATHEVAL
ARG
the input for this action is the scalar output from one or more other actions.
=d1.x
PERIODIC
compulsory keyword if the output of your function is periodic then you should specify the periodicity of the function.
=NO
FUNC
compulsory keyword the function you wish to evaluate
=0.5*10*(x^2) bb: BIASVALUE
ARG
the input for this action is the scalar output from one or more other actions.
=ff eds: EDS
ARG
the input for this action is the scalar output from one or more other actions.
=d1.x
CENTER
The desired centers (equilibrium values) which will be sought during the adaptive linear biasing.
=__FILL__
PERIOD
Steps over which to adjust bias for adaptive or ramping
=__FILL__
OUT_RESTART
Output file for all information needed to continue EDS simulation.
=__FILL__
TEMP
The system temperature.
=1.0
BIAS_SCALE
A divisor to set the units of the bias.
=1 PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=*
STRIDE
compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be output
=100
FILE
the name of the file on which to output these quantities
=_PREFIX_.colvars.dat

Note, we have to add TEMP=1 because pesmd does not provide the temperature to the EDS module.

Position versus time, unbiased, harmonic bias, and EDS bias.
Histogram of position for 1d oscillator shows how RESTRAINT fails and EDS wins.

Now look at how the bias factor changes with time. Can you also compute the bias from this value, and the value of the position versus time? Look in the colvar file and the restart file to see what values are there.

Value of learned EDS linear coefficient and eds bias versus time in the simulation.

Setup and bias a 2-dimensional system.

Now we will move on to a 2d harmonic oscillator. This way we can see the effect of biasing x, y, or a combination. Note, there is a different pesmd input file for this.

Click on the labels of the actions for more information on what each action computes
tested on v2.9
d1: DISTANCE 
ATOMS
the pair of atom that we are calculating the distance between.
=1,2
COMPONENTS
( default=off ) calculate the x, y and z components of the distance separately and store them as label.x,
ff: MATHEVAL
ARG
the input for this action is the scalar output from one or more other actions.
=__FILL__
PERIODIC
compulsory keyword if the output of your function is periodic then you should specify the periodicity of the function.
=NO
FUNC
compulsory keyword the function you wish to evaluate
=__FILL__ bb: BIASVALUE
ARG
the input for this action is the scalar output from one or more other actions.
=ff PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=__FILL__
STRIDE
compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be output
=25
FILE
the name of the file on which to output these quantities
=_PREFIX_.colvars.dat

Try to bias either the x or y directions separately. Try biasing a combination of them. We biased the average of x*y to be = 1. Use the CUSTOM command to try that out. What do you think it should look like? Write a python program to histogram the x and y data and see. It should look like below.

Histogram of positions for 2d oscillator with various EDS biases applied.

Alanine dipeptide

Bias alanine dipeptide using gromacs and the input files provided in the github.

Click on the labels of the actions for more information on what each action computes
tested on v2.9
MOLINFO 
STRUCTURE
compulsory keyword a file in pdb format containing a reference structure.
=./input.ala2.pdb phi: TORSION
ATOMS
the four atoms involved in the torsional angle
=@phi-2 psi: TORSION
ATOMS
the four atoms involved in the torsional angle
=@psi-2 eds: EDS
ARG
the input for this action is the scalar output from one or more other actions.
=__FILL__
CENTER
The desired centers (equilibrium values) which will be sought during the adaptive linear biasing.
=__FILL__
PERIOD
Steps over which to adjust bias for adaptive or ramping
=__FILL__
OUT_RESTART
Output file for all information needed to continue EDS simulation.
=_PREFIX_.restart.dat
BIAS_SCALE
A divisor to set the units of the bias.
=1 PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=*
STRIDE
compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be output
=500
FILE
the name of the file on which to output these quantities
=_PREFIX_.colvars.dat

Try biasing phi and psi independently. Try biasing them at the same time. What is a good PERIOD to choose?

Histogram of phi and psi for adp from a very long unbiased simulation.
Histogram of phi and psi for adp from a very long unbiased simulation with <phi> biased to be 1.
Histogram of phi and psi for adp from a very long unbiased simulation with <psi> biased to be 1.

Free explore

Take a protein of interest, or alanine 2, and bias more than one variable. To try CGDS, choose the center of mass of some different parts of your protein. Can you bias the distance between these COMs and the angle between them at the same time? Does using COVAR or LM affect the speed of convergence?