Belfast tutorial: Umbrella sampling


In the previous lectures we learned how to compute collective variables (CVs) from atomic positions. We will now learn how one can add a bias potential to enforce the exploration of a particular region of the space. We will also see how it is possible to bias CVs so as to enhance the sampling of events hindered by large free-energy barriers and how to analyze this kind of simulation. This technique is known as "umbrella sampling" and can be used in combination with the weighted-histogram analysis method to compute free-energy landscapes. In this tutorial we will use simple collective variables, but the very same approach can be used with any kind of collective variable.

Summary of theory

Biased sampling

A system at temperature \( T\) samples conformations from the canonical ensemble:

\[ P(q)\propto e^{-\frac{U(q)}{k_BT}} \]

. Here \( q \) are the microscopic coordinates and \( k_B \) is the Boltzmann constant. Since \( q \) is a highly dimensional vector, it is often convenient to analyze it in terms of a few collective variables (see Belfast tutorial: Analyzing CVs , Belfast tutorial: Adaptive variables I , and Belfast tutorial: Adaptive variables II ). The probability distribution for a CV \( s\) is

\[ P(s)\propto \int dq e^{-\frac{U(q)}{k_BT}} \delta(s-s(q)) \]

This probability can be expressed in energy units as a free energy landscape \( F(s) \):

\[ F(s)=-k_B T \log P(s) \]


Now we would like to modify the potential by adding a term that depends on the CV only. That is, instead of using \( U(q) \), we use \( U(q)+V(s(q))\). There are several reasons why one would like to introduce this potential. One is to avoid that the system samples some un-desired portion of the conformational space. As an example, imagine that you want to study dissociation of a complex of two molecules. If you perform a very long simulation you will be able to see association and dissociation. However, the typical time required for association will depend on the size of the simulation box. It could be thus convenient to limit the exploration to conformations where the distance between the two molecules is lower than a given threshold. This could be done by adding a bias potential on the distance between the two molecules. Another example is the simulation of a portion of a large molecule taken out from its initial context. The fragment alone could be unstable, and one might want to add additional potentials to keep the fragment in place. This could be done by adding a bias potential on some measure of the distance from the experimental structure (e.g. on root-mean-square deviation).

Whatever CV we decide to bias, it is very important to recognize which is the effect of this bias and, if necessary, remove it a posteriori. The biased distribution of the CV will be

\[ P'(s)\propto \int dq e^{-\frac{U(q)+V(s(q))}{k_BT}} \delta(s-s(q))\propto e^{-\frac{V(s(q))}{k_BT}}P(s) \]

and the biased free energy landscape

\[ F'(s)=-k_B T \log P'(s)=F(s)+V(s)+C \]

Thus, the effect of a bias potential on the free energy is additive. Also notice the presence of an undetermined constant \( C \). This constant is irrelevant for what concerns free-energy differences and barriers, but will be important later when we will learn the weighted-histogram method. Obviously the last equation can be inverted so as to obtain the original, unbiased free-energy landscape from the biased one just subtracting the bias potential

\[ F(s)=F'(s)-V(s)+C \]

Additionally, one might be interested in recovering the distribution of an arbitrary observable. E.g., one could add a bias on the distance between two molecules and be willing to compute the unbiased distribution of some torsional angle. In this case there is no straightforward relationship that can be used, and one has to go back to the relationship between the microscopic probabilities:

\[ P(q)\propto P'(q) e^{\frac{V(s(q))}{k_BT}} \]

The consequence of this expression is that one can obtained any kind of unbiased information from a biased simulation just by weighting every sampled conformation with a weight

\[ w\propto e^{\frac{V(s(q))}{k_BT}} \]

That is, frames that have been explored in spite of a high (disfavoring) bias potential \( V \) will be counted more than frames that has been explored with a less disfavoring bias potential.

Umbrella sampling

Often in interesting cases the free-energy landscape has several local minima. If these minima have free-energy differences that are on the order of a few times \(k_BT\) they might all be relevant. However, if they are separated by a high saddle point in the free-energy landscape (i.e. a low probability region) than the transition between one and the other will take a lot of time and these minima will correspond to metastable states. The transition between one minimum and the other could require a time scale which is out of reach for molecular dynamics. In these situations, one could take inspiration from catalysis and try to favor in a controlled manner the conformations corresponding to the transition state.

Imagine that you know since the beginning the shape of the free-energy landscape \( F(s) \) as a function of one CV \( s \). If you perform a molecular dynamics simulation using a bias potential which is exactly equal to \( -F(s) \), the biased free-energy landscape will be flat and barrierless. This potential acts as an "umbrella" that helps you to safely cross the transition state in spite of its high free energy.

It is however difficult to have an a priori guess of the free-energy landscape. We will see later how adaptive techniques such as metadynamics (Belfast tutorial: Metadynamics) can be used to this aim. Because of this reason, umbrella sampling is often used in a slightly different manner.

Imagine that you do not know the exact height of the free-energy barrier but you have an idea of where the barrier is located. You could try to just favor the sampling of the transition state by adding a harmonic restraint on the CV, e.g. in the form

\[ V(s)=\frac{k}{2} (s-s_0)^2 \]

. The sampled distribution will be

\[ P'(q)\propto P(q) e^{\frac{-k(s(q)-s_0)^2}{2k_BT}} \]

For large values of \( k \), only points close to \( s_0 \) will be explored. It is thus clear how one can force the system to explore only a predefined region of the space adding such a restraint. By combining simulations performed with different values of \( s_0 \), one could obtain a continuous set of simulations going from one minimum to the other crossing the transition state. In the next section we will see how to combine the information from these simulations.

Weighted histogram analysis method

Let's now consider multiple simulations performed with restraints located in different positions. In particular, we will consider the \(i\)-th bias potential as \(V_i\). The probability to observe a given value of the collective variable \(s\) is:

\[ P_i({s})=\frac{P({s})e^{-\frac{V_i({s})}{k_BT}}}{\int ds' P({s}') e^{-\frac{V_i({s}')}{k_BT}}}= \frac{P({s})e^{-\frac{V_i({s})}{k_BT}}}{Z_i} \]


\[ Z_i=\sum_{q}e^{-\left(U(q)+V_i(q)\right)} \]

The likelyhood for the observation of a sequence of snapshots \(q_i(t)\) (where \(i\) is the index of the trajectory and \(t\) is time) is just the product of the probability of each of the snapshots. We use here the minus-logarithm of the likelihood (so that the product is converted to a sum) that can be written as

\[ \mathcal{L}=-\sum_i \int dt \log P_i({s}_i(t))= \sum_i \int dt \left( -\log P({s}_i(t)) +\frac{V_i({s}_i(t))}{k_BT} +\log Z_i \right) \]

One can then maximize the likelyhood by setting \(\frac{\delta \mathcal{L}}{\delta P({\bf s})}=0\). After some boring algebra the following expression can be obtained

\[ 0=\sum_{i}\int dt\left(-\frac{\delta_{{\bf s}_{i}(t),{\bf s}}}{P({\bf s})}+\frac{e^{-\frac{V_{i}({\bf s})}{k_{B}T}}}{Z_{i}}\right) \]

In this equation we aim at finding \(P(s)\). However, also the list of normalization factors \(Z_i\) is unknown, and they should be found selfconsistently. Thus one can find the solution as

\[ P({\bf s})\propto \frac{N({\bf s})}{\sum_i\int dt\frac{e^{-\frac{V_{i}({\bf s})}{k_{B}T}}}{Z_{i}} } \]

where \(Z\) is selfconsistently determined as

\[ Z_i\propto\int ds' P({\bf s}') e^{-\frac{V_i({\bf s}')}{k_BT}} \]

These are the WHAM equations that are traditionally solved to derive the unbiased probability \(P(s)\) by the combination of multiple restrained simulations. To make a slightly more general implementation, one can compute the weights that should be assigned to each snapshot, that turn out to be:

\[ w_i(t)\propto \frac{1}{\sum_j\int dt\frac{e^{-\beta V_{j}({\bf s}_i(t))}}{Z_{j}} } \]

The normalization factors can in turn be found from the weights as

\[ Z_i\propto\frac{\sum_j \int dt e^{-\beta V_i({\bf s}_j(t))} w_j(t)}{ \sum_j \int dt w_j(t)} \]

This allows to straighforwardly compute averages related to other, non-biased degrees of freedom, and it is thus a bit more flexible. It is sufficient to precompute this factors \(w\) and use them to weight every single frame in the trajectory.

Learning Outcomes

Once this tutorial is completed students will know how to:

  • Setup simulations with restraints.
  • Use multiple-restraint umbrella sampling simulations to enhance the transition across a free-energy barrier.
  • Analyze the results and compute weighted averages and free-energy profiles.


The tarball for this project contains the following files:

  • A gromacs topology (, configuration (conf.gro), and control file (grompp.mdp). They should not be needed.
  • A gromacs binary file (topol.tpr). This is enough for running this system.
  • A small C++ program that computes WHAM (wham.cpp) and a script that can be used to feed it (

By working in the directory where the topol.tpr file is stored, one can launch gromacs with the command

gmx_mpi mdrun -plumed plumed.dat -nsteps 100000

(notice that the -nsteps flag allows the number of steps to be changed).


The model system

We here use a a model system alanine dipeptide with CHARM27 all atom force field already seen in the previous section.

Restrained simulations

The simplest way in which one might influence a CV is by forcing the system to stay close to a chosen value during the simulation. This is achieved with a restraining potential that PLUMED provides via the directive RESTRAINT. In the umbrella sampling method a bias potential is added so as to favor the exploration of some regions of the conformational space and to disfavor the exploration of other regions [87] . A properly chosen bias potential could allow for example to favor the transition state sampling thus enhancing the transition state for a conformational transition. However, choosing such a potential is not trivial. In a later section we will see how metadynamics can be used to this aim. The simplest way to use umbrella sampling is that to apply harmonic constraints to one or more CVs.

We will now see how to enforce the exploration of a the neighborhood of a selected point the CV space using a RESTRAINT potential.

# set up two variables for Phi and Psi dihedral angles 
phi: TORSION ATOMS=5,7,9,15
psi: TORSION ATOMS=7,9,15,17
# Impose an umbrella potential on CV 1 and CV 2
# with a spring constant of 500 kjoule/mol
# at fixed points on the Ramachandran plot
restraint-phi: RESTRAINT ARG=phi KAPPA=500 AT=-0.3
restraint-psi: RESTRAINT ARG=psi KAPPA=500 AT=+0.3

# monitor the two variables and the bias potential from the two restraints
PRINT STRIDE=10 ARG=phi,psi,restraint-phi.bias,restraint-psi.bias FILE=COLVAR


The syntax for the command RESTRAINT is rather trivial. The directive is followed by a keyword ARG followed by the label of the CV on which the umbrella potential has to act. The keyword KAPPA determines the hardness of the spring constant and its units are [Energy units]/[Units of the CV ]. The additional potential introduced by the UMBRELLA takes the form of a simple Hooke’s law:

\[ U(s)=\frac{k}{2} (x-x_0)^2 \]


where \( x_0 \) is the value specified following the AT keyword. The choice of AT ( \( x_0 \)) is obviously depending on the specific case. KAPPA ( \( k \)) is typically chosen not to affect too much the intrinsic fluctuations of the system. A typical recipe is \( k \approx \frac{k_BT}{\sigma^2} \), where \( \sigma^2 \) is the variance of the CV in a free simulation). In real applications, one must be careful with values of \( k \) larger than \( \frac{k_BT}{\sigma^2} \) because they could break down the molecular dynamics integrator.

The CVs as well as the two bias potentials are shown in the COLVAR file. For this specific input the COLVAR file has in first column the time, in the second the value of \(\phi\), in the third the value of \(\psi\), in the fourth the the additional potential introduced by the restraint on \(\phi\) and in the fifth the additional potential introduced by the restraint on \(\psi\).

It may happen that one wants that a given CV just stays within a given range of values. This is achieved in plumed through the directives UPPER_WALLS and LOWER_WALLS that act on specific collective variables and limit the exploration within given ranges.

Reweighting the results

Now consider a simulation performed restraining the variable \(\phi \):

phi: TORSION ATOMS=5,7,9,15
psi: TORSION ATOMS=7,9,15,17
restraint-phi: RESTRAINT ARG=phi KAPPA=10.0 AT=-2
PRINT STRIDE=10 ARG=phi,psi,restraint-phi.bias FILE=COLVAR10

and compare the result with the one from a single simulation with no restraint

phi: TORSION ATOMS=5,7,9,15
psi: TORSION ATOMS=7,9,15,17
# we use a "dummy" restraint with strength zero here
restraint-phi: RESTRAINT ARG=phi KAPPA=0.0 AT=-2
PRINT STRIDE=10 ARG=phi,psi,restraint-phi.bias FILE=COLVAR0

Plot the time dependence of \(\phi \) in the two cases and try to understand the difference.

Now let's try to compute the probability that \(\psi \) falls within a given range, say between 1 and 2. This can be done e.g. with this shell script

> grep -v \# COLVAR0 | tail -n 80000 |
  awk '{if($3>1 && $3<2)a++; else b++;}END{print a/(a+b)}'

Notice that we here considered only the last 80000 frames in the average. Look at the time series for \(\psi \) and guess why. Also notice that the script is removing the initial comments. After this trivial preprocessing, the script is just counting how many times the third column ( \( \psi \)) lies between 1 and 2 and how many times it doesn't. At the end it prints the number of times the variable is between 1 and 2 divided by the total count. The result should be something around 0.40. Now try to do it on trajectories generated with different values of AT. Does the result depend on AT?

We can now try to reweight the result so as to get rid of the bias introduced by the restraint. Since the reweighting factor is just \(\exp(\frac{V}{k_BT} \) the script should be modified as

> grep -v \# COLVAR10 | tail -n 80000 |
awk '{w=exp($4/2.5); if($3>1 && $3<2)a+=w; else b+=w;}END{print a/(a+b)}'

Notice that 2.5 is just \(k_BT\) in kj/mol units.

Repeat this calculation for different values of AT. Does the result depend on AT?

A free-energy landscape

One can also count the probability of an angle to be in a precise bin. The logarithm of this quantity, in kbT units, is the free-energy associated to that bin. There are several ways to compute histograms, either with PLUMED or with external programs. Here I decided to use awk.

grep -v \# COLVAR10 | tail -n 80000 |
awk 'BEGIN{
  for(i1=0;i1<nb1;i1++) for(i2=0;i2<nb2;i2++) f[i1,i2]=0.0;
# we assume the potential is in the last column, and kbT=2.5 kj/mol
  for(i2=0;i2<nb2;i2++) print min1+i1/100.0*(max1-min1), min2+i2/100.0*(max2-min2), -2.5*log(f[i1,i2]);
  print "";
}}' > plotme

You can then plot the "plotme" file with

gnuplot> set pm3d map
gnuplot> splot "plotme"

Combining multiple restraints

In the last paragraph you have seen how to reweight simulations done with restraints in different positions to obtain virtually the same result. Let's now see how to combine data from multiple restraint simulations. A possible choice is to download and use the WHAM software here, which is well documented. This is probably the best idea for analyzing a real simulation.

For the sake of learning a bit, we will use a different approach here, namely we will use a short C++ program that implements the weight calculation. Notice that whereas people typically use harmonic restraints in this framework, PLUMED offers a very large variety of bias potentials. For this reason we will keep things as general as possible and use an approach that can be in principle used also to combine simulation with restraint on different variables or with complicated bias potential.

The first step is to generate several simulations with different positions of the restraint, gradually going from say -2 to +2. You can obtain them using e.g. the following script:

for AT in -2.0 -1.5 -1.0 -0.5 +0.0 +0.5 +1.0 +1.5 +2.0

cat >plumed.dat << EOF
phi: TORSION ATOMS=5,7,9,15
psi: TORSION ATOMS=7,9,15,17
# Impose an umbrella potential on CV 1 and CV 2
# with a spring constant of 500 kjoule/mol
# at fixed points on the Ramachandran plot
restraint-phi: RESTRAINT ARG=phi KAPPA=40.0 AT=$AT
# monitor the two variables and the bias potential from the two restraints
PRINT STRIDE=10 ARG=phi,psi,restraint-phi.bias FILE=COLVAR$AT

gmx_mpi mdrun -plumed plumed.dat -nsteps 100000 -x traj$AT.xtc


Notice that we are here saving separate trajectories for the separate simulation, as well as separate colvar files. In each simulation the restraint is located in a different position. Have a look at the plot of (phi,psi) for the different simulations to understand what is happening.

An often misunderstood fact about WHAM is that data of the different trajectories can be mixed and it is not necessary to keep track of which restraint was used to produce every single frame. Let's get the concatenated trajectory

gmx_mpi trjcat -cat -f traj*.xtc -o alltraj.xtc

Now we should compute the value of each of the bias potentials on the entire (concatenated) trajectory

for AT in -2.0 -1.5 -1.0 -0.5 +0.0 +0.5 +1.0 +1.5 +2.0

cat >plumed.dat << EOF
phi: TORSION ATOMS=5,7,9,15
psi: TORSION ATOMS=7,9,15,17
restraint-phi: RESTRAINT ARG=phi KAPPA=40.0 AT=$AT

# monitor the two variables and the bias potential from the two restraints
PRINT STRIDE=10 ARG=phi,psi,restraint-phi.bias FILE=ALLCOLVAR$AT

plumed driver --mf_xtc alltraj.xtc --trajectory-stride=10 --plumed plumed.dat


It is very important that this script is consistent with the one used to generate the multiple simulations above. Now, single files named ALLCOLVARXX will contain on the fourth column the value of the bias centered in XX computed on the entire concatenated trajectory.

Next step is to compile the C++ program that computes weights self-consistently solving the WHAM equations. This is named wham.cpp and can be compiled with

g++ -O3 wham.cpp -o wham.x

and can be then used through a wrapper script as

./ ALLCOLVAR* > colvar

The resulting colvar file will contain 3 columns: time, phi, and psi, plus the weights obtained from WHAM written in logarithmic scale. That is, the file will contain \(k_BT \log w \).

Try now to use this file to compute the unbiased free-energy landscape as a function of phi and psi. You can use the script that you used earlier to compute histogram.


How does PLUMED work

The fact that when you add a force on the collective variable PLUMED can force the atoms to do something depends on the fact that the collective variables implemented in PLUMED has analytical derivatives. By biasing the value of a single CV one turns to affect the time evolution of the system itself. Notice that some of the collective variables could be implemented without derivatives (either because the developers were lazy or because the CVs cannot be derived). In this case you might want to have a look at the NUMERICAL_DERIVATIVES option.

Further Reading

Umbrella sampling method is a widely used technique. You can find several resources on the web, e.g.: