Accumulate the average probability density along a few CVs from a trajectory.
When using this method it is supposed that you have some collective variable \(\zeta\) that gives a reasonable description of some physical or chemical phenomenon. As an example of what we mean by this suppose you wish to examine the following SN2 reaction:
The distance between the chlorine atom and the carbon is an excellent collective variable, \(\zeta\), in this case because this distance is short for the reactant, \(\textrm{CH}_3Cl\), because the carbon and chlorine are chemically bonded, and because it is long for the product state when these two atoms are not chemically bonded. We thus might want to accumulate the probability density, \(P(\zeta)\), as a function of this distance as this will provide us with information about the overall likelihood of the reaction. Furthermore, the free energy, \(F(\zeta)\), is related to this probability density via:
\[F(\zeta) = - k_B T \ln P(\zeta)
\]
Accumulating these probability densities is precisely what this Action can be used to do. Furthermore, the conversion of the histogram to the free energy can be achieved by using the method CONVERT_TO_FES.
We calculate histograms within PLUMED using a method known as kernel density estimation, which you can read more about here:
In PLUMED the value of \(\zeta\) at each discrete instant in time in the trajectory is accumulated. A kernel, \(K(\zeta-\zeta(t'),\sigma)\), centered at the current value, \(\zeta(t)\), of this quantity is generated with a bandwidth \(\sigma\), which is set by the user. These kernels are then used to accumulate the ensemble average for the probability density:
Here the sums run over a portion of the trajectory specified by the user. The final quantity evaluated is a weighted average as the weights, \(w(t')\), allow us to negate the effect any bias might have on the region of phase space sampled by the system. This is discussed in the section of the manual on Analysis.
A discrete analogue of kernel density estimation can also be used. In this analogue the kernels in the above formula are replaced by Dirac delta functions. When this method is used the final function calculated is no longer a probability density - it is instead a probability mass function as each element of the function tells you the value of an integral between two points on your grid rather than the value of a (continuous) function on a grid.
Additional material and examples can be also found in the tutorials lugano-1.
A note on block averaging and errors
Some particularly important issues related to the convergence of histograms and the estimation of error bars around the ensemble averages you calculate are covered in trieste-2. The technique for estimating error bars that is known as block averaging is introduced in this tutorial. The essence of this technique is that the trajectory is split into a set of blocks and separate ensemble averages are calculated from each separate block of data. If \(\{A_i\}\) is the set of \(N\) block averages that are obtained from this technique then the final error bar is calculated as:
If the simulation is biased and reweighting is performed then life is a little more complex as each of the block averages should be calculated as a weighted average. Furthermore, the weights should be taken into account when the final ensemble and error bars are calculated. As such the error should be:
where the sums runs over the trajectory, \(w_t\) is the weight of the \(t\)th trajectory frame, \(x_t\) is the value of the CV for the \(t\)th trajectory frame and \(K\) is a kernel function centered on \(x_t\) with bandwidth \(\sigma\). The quantity that is evaluated is the value of the normalized histogram at point \(x\). The following ensemble average will be calculated if you use the NORMALIZATION=true option in HISTOGRAM. If the ensemble average is calculated in this way we must calculate the associated error bars from our block averages using the second of the expressions above.
A number of works have shown that when biased simulations are performed it is often better to calculate an estimate of the histogram that is not normalized using:
instead of the expression above. As such this is what is done by default in HISTOGRAM or if the NORMALIZATION=ndata option is used. When the histogram is calculated in this second way the first of the two formula above can be used when calculating error bars from block averages.
Examples
The following input monitors two torsional angles during a simulation and outputs a continuous histogram as a function of them at the end of the simulation.
Click on the labels of the actions for more information on what each action computes
r1: TORSIONATOMSthe four atoms involved in the torsional angle =1,2,3,4 The TORSION action with label r1 calculates a single scalar valuer2: TORSIONATOMSthe four atoms involved in the torsional angle =2,3,4,5 The TORSION action with label r2 calculates a single scalar valuehh: HISTOGRAM ...
ARGthe quantity that is being averaged =r1,r2GRID_MINcompulsory keyword ( default=auto )
the lower bounds for the grid =-3.14,-3.14
GRID_MAXcompulsory keyword ( default=auto )
the upper bounds for the grid =3.14,3.14
GRID_BINthe number of bins for the grid =200,200
BANDWIDTHthe bandwidths for kernel density esimtation =0.05,0.05
...The HISTOGRAM action with label hh calculates a single scalar valueDUMPGRIDGRIDthe grid you would like to print (can also use ARG for specifying what is being printed)
=hhFILEcompulsory keyword ( default=density )
the file on which to write the grid. =histo The DUMPGRID action with label
r1: TORSIONATOMSthe four atoms involved in the torsional angle =1,2,3,4 The TORSION action with label r1 calculates a single scalar valuer2: TORSIONATOMSthe four atoms involved in the torsional angle =2,3,4,5 The TORSION action with label r2 calculates a single scalar valuehh_weight: ONESSIZEcompulsory keyword
the number of ones that you would like to create =1 The ONES action with label hh_weight calculates a single scalar valuehh_kde: KDEARGthe input for this action is the scalar output from one or more other actions. =r1,r2GRID_MINcompulsory keyword ( default=auto )
the lower bounds for the grid =-3.14,-3.14 GRID_MAXcompulsory keyword ( default=auto )
the upper bounds for the grid =3.14,3.14 GRID_BINthe number of bins for the grid =200,200 BANDWIDTHthe bandwidths for kernel density esimtation =0.05,0.05 The KDE action with label hh_kde calculates a single scalar valuehh_kdep: CUSTOMARGthe input to this function. =hh_kde,hh_weightFUNCcompulsory keyword
the function you wish to evaluate =x*y PERIODICcompulsory keyword
if the output of your function is periodic then you should specify the periodicity
of the function. =NO The CUSTOM action with label hh_kdep calculates a single scalar valuehh_u: ACCUMULATEARGthe input for this action is the scalar output from one or more other actions. =hh_kdepSTRIDEcompulsory keyword ( default=1 )
the frequency with which the data should be collected and added to the quantity being
averaged =1 CLEARcompulsory keyword ( default=0 )
the frequency with which to clear all the accumulated data. =0 The ACCUMULATE action with label hh_u calculates a single scalar valuehh_nsum: ACCUMULATEARGthe input for this action is the scalar output from one or more other actions. =hh_weightSTRIDEcompulsory keyword ( default=1 )
the frequency with which the data should be collected and added to the quantity being
averaged =1 CLEARcompulsory keyword ( default=0 )
the frequency with which to clear all the accumulated data. =0 The ACCUMULATE action with label hh_nsum calculates a single scalar valuehh: CUSTOMARGthe input to this function. =hh_u,hh_nsumFUNCcompulsory keyword
the function you wish to evaluate =x/y PERIODICcompulsory keyword
if the output of your function is periodic then you should specify the periodicity
of the function. =NO The CUSTOM action with label hh calculates a single scalar valueDUMPGRIDGRIDthe grid you would like to print (can also use ARG for specifying what is being printed)
=hhFILEcompulsory keyword ( default=density )
the file on which to write the grid. =histo The DUMPGRID action with label
The following input monitors two torsional angles during a simulation and outputs a discrete histogram as a function of them at the end of the simulation.
Click on the labels of the actions for more information on what each action computes
r1: TORSIONATOMSthe four atoms involved in the torsional angle =1,2,3,4 The TORSION action with label r1 calculates a single scalar valuer2: TORSIONATOMSthe four atoms involved in the torsional angle =2,3,4,5 The TORSION action with label r2 calculates a single scalar valuehh: HISTOGRAM ...
ARGthe quantity that is being averaged =r1,r2KERNELcompulsory keyword ( default=GAUSSIAN )
the kernel function you are using. =DISCRETE
GRID_MINcompulsory keyword ( default=auto )
the lower bounds for the grid =-3.14,-3.14
GRID_MAXcompulsory keyword ( default=auto )
the upper bounds for the grid =3.14,3.14
GRID_BINthe number of bins for the grid =200,200
...The HISTOGRAM action with label hh calculates a single scalar valueDUMPGRIDGRIDthe grid you would like to print (can also use ARG for specifying what is being printed)
=hhFILEcompulsory keyword ( default=density )
the file on which to write the grid. =histo The DUMPGRID action with label
r1: TORSIONATOMSthe four atoms involved in the torsional angle =1,2,3,4 The TORSION action with label r1 calculates a single scalar valuer2: TORSIONATOMSthe four atoms involved in the torsional angle =2,3,4,5 The TORSION action with label r2 calculates a single scalar valuehh_weight: ONESSIZEcompulsory keyword
the number of ones that you would like to create =1 The ONES action with label hh_weight calculates a single scalar valuehh_kde: KDEARGthe input for this action is the scalar output from one or more other actions. =r1,r2KERNELcompulsory keyword ( default=GAUSSIAN )
the kernel function you are using. =DISCRETE GRID_MINcompulsory keyword ( default=auto )
the lower bounds for the grid =-3.14,-3.14 GRID_MAXcompulsory keyword ( default=auto )
the upper bounds for the grid =3.14,3.14 GRID_BINthe number of bins for the grid =200,200 The KDE action with label hh_kde calculates a single scalar valuehh_kdep: CUSTOMARGthe input to this function. =hh_kde,hh_weightFUNCcompulsory keyword
the function you wish to evaluate =x*y PERIODICcompulsory keyword
if the output of your function is periodic then you should specify the periodicity
of the function. =NO The CUSTOM action with label hh_kdep calculates a single scalar valuehh_u: ACCUMULATEARGthe input for this action is the scalar output from one or more other actions. =hh_kdepSTRIDEcompulsory keyword ( default=1 )
the frequency with which the data should be collected and added to the quantity being
averaged =1 CLEARcompulsory keyword ( default=0 )
the frequency with which to clear all the accumulated data. =0 The ACCUMULATE action with label hh_u calculates a single scalar valuehh_nsum: ACCUMULATEARGthe input for this action is the scalar output from one or more other actions. =hh_weightSTRIDEcompulsory keyword ( default=1 )
the frequency with which the data should be collected and added to the quantity being
averaged =1 CLEARcompulsory keyword ( default=0 )
the frequency with which to clear all the accumulated data. =0 The ACCUMULATE action with label hh_nsum calculates a single scalar valuehh: CUSTOMARGthe input to this function. =hh_u,hh_nsumFUNCcompulsory keyword
the function you wish to evaluate =x/y PERIODICcompulsory keyword
if the output of your function is periodic then you should specify the periodicity
of the function. =NO The CUSTOM action with label hh calculates a single scalar valueDUMPGRIDGRIDthe grid you would like to print (can also use ARG for specifying what is being printed)
=hhFILEcompulsory keyword ( default=density )
the file on which to write the grid. =histo The DUMPGRID action with label
The following input monitors two torsional angles during a simulation and outputs the histogram accumulated thus far every 100000 steps.
Click on the labels of the actions for more information on what each action computes
r1: TORSIONATOMSthe four atoms involved in the torsional angle =1,2,3,4 The TORSION action with label r1 calculates a single scalar valuer2: TORSIONATOMSthe four atoms involved in the torsional angle =2,3,4,5 The TORSION action with label r2 calculates a single scalar valuehh: HISTOGRAM ...
ARGthe quantity that is being averaged =r1,r2GRID_MINcompulsory keyword ( default=auto )
the lower bounds for the grid =-3.14,-3.14
GRID_MAXcompulsory keyword ( default=auto )
the upper bounds for the grid =3.14,3.14
GRID_BINthe number of bins for the grid =200,200
BANDWIDTHthe bandwidths for kernel density esimtation =0.05,0.05
...The HISTOGRAM action with label hh calculates a single scalar valueDUMPGRIDGRIDthe grid you would like to print (can also use ARG for specifying what is being printed)
=hhFILEcompulsory keyword ( default=density )
the file on which to write the grid. =histo STRIDEcompulsory keyword ( default=0 )
the frequency with which the grid should be output to the file. =100000 The DUMPGRID action with label
r1: TORSIONATOMSthe four atoms involved in the torsional angle =1,2,3,4 The TORSION action with label r1 calculates a single scalar valuer2: TORSIONATOMSthe four atoms involved in the torsional angle =2,3,4,5 The TORSION action with label r2 calculates a single scalar valuehh_weight: ONESSIZEcompulsory keyword
the number of ones that you would like to create =1 The ONES action with label hh_weight calculates a single scalar valuehh_kde: KDEARGthe input for this action is the scalar output from one or more other actions. =r1,r2GRID_MINcompulsory keyword ( default=auto )
the lower bounds for the grid =-3.14,-3.14 GRID_MAXcompulsory keyword ( default=auto )
the upper bounds for the grid =3.14,3.14 GRID_BINthe number of bins for the grid =200,200 BANDWIDTHthe bandwidths for kernel density esimtation =0.05,0.05 The KDE action with label hh_kde calculates a single scalar valuehh_kdep: CUSTOMARGthe input to this function. =hh_kde,hh_weightFUNCcompulsory keyword
the function you wish to evaluate =x*y PERIODICcompulsory keyword
if the output of your function is periodic then you should specify the periodicity
of the function. =NO The CUSTOM action with label hh_kdep calculates a single scalar valuehh_u: ACCUMULATEARGthe input for this action is the scalar output from one or more other actions. =hh_kdepSTRIDEcompulsory keyword ( default=1 )
the frequency with which the data should be collected and added to the quantity being
averaged =1 CLEARcompulsory keyword ( default=0 )
the frequency with which to clear all the accumulated data. =0 The ACCUMULATE action with label hh_u calculates a single scalar valuehh_nsum: ACCUMULATEARGthe input for this action is the scalar output from one or more other actions. =hh_weightSTRIDEcompulsory keyword ( default=1 )
the frequency with which the data should be collected and added to the quantity being
averaged =1 CLEARcompulsory keyword ( default=0 )
the frequency with which to clear all the accumulated data. =0 The ACCUMULATE action with label hh_nsum calculates a single scalar valuehh: CUSTOMARGthe input to this function. =hh_u,hh_nsumFUNCcompulsory keyword
the function you wish to evaluate =x/y PERIODICcompulsory keyword
if the output of your function is periodic then you should specify the periodicity
of the function. =NO The CUSTOM action with label hh calculates a single scalar valueDUMPGRIDGRIDthe grid you would like to print (can also use ARG for specifying what is being printed)
=hhFILEcompulsory keyword ( default=density )
the file on which to write the grid. =histo STRIDEcompulsory keyword ( default=0 )
the frequency with which the grid should be output to the file. =100000 The DUMPGRID action with label
The following input monitors two torsional angles during a simulation and outputs a separate histogram for each 100000 steps worth of trajectory. Notice how the CLEAR keyword is used here and how it is not used in the previous example.
Click on the labels of the actions for more information on what each action computes
r1: TORSIONATOMSthe four atoms involved in the torsional angle =1,2,3,4 The TORSION action with label r1 calculates a single scalar valuer2: TORSIONATOMSthe four atoms involved in the torsional angle =2,3,4,5 The TORSION action with label r2 calculates a single scalar valuehh: HISTOGRAM ...
ARGthe quantity that is being averaged =r1,r2CLEARcompulsory keyword ( default=0 )
the frequency with whihc to clear the data that is being averaged =100000
GRID_MINcompulsory keyword ( default=auto )
the lower bounds for the grid =-3.14,-3.14
GRID_MAXcompulsory keyword ( default=auto )
the upper bounds for the grid =3.14,3.14
GRID_BINthe number of bins for the grid =200,200
BANDWIDTHthe bandwidths for kernel density esimtation =0.05,0.05
...The HISTOGRAM action with label hh calculates a single scalar valueDUMPGRIDGRIDthe grid you would like to print (can also use ARG for specifying what is being printed)
=hhFILEcompulsory keyword ( default=density )
the file on which to write the grid. =histo STRIDEcompulsory keyword ( default=0 )
the frequency with which the grid should be output to the file. =100000 The DUMPGRID action with label
r1: TORSIONATOMSthe four atoms involved in the torsional angle =1,2,3,4 The TORSION action with label r1 calculates a single scalar valuer2: TORSIONATOMSthe four atoms involved in the torsional angle =2,3,4,5 The TORSION action with label r2 calculates a single scalar valuehh_weight: ONESSIZEcompulsory keyword
the number of ones that you would like to create =1 The ONES action with label hh_weight calculates a single scalar valuehh_kde: KDEARGthe input for this action is the scalar output from one or more other actions. =r1,r2GRID_MINcompulsory keyword ( default=auto )
the lower bounds for the grid =-3.14,-3.14 GRID_MAXcompulsory keyword ( default=auto )
the upper bounds for the grid =3.14,3.14 GRID_BINthe number of bins for the grid =200,200 BANDWIDTHthe bandwidths for kernel density esimtation =0.05,0.05 The KDE action with label hh_kde calculates a single scalar valuehh_kdep: CUSTOMARGthe input to this function. =hh_kde,hh_weightFUNCcompulsory keyword
the function you wish to evaluate =x*y PERIODICcompulsory keyword
if the output of your function is periodic then you should specify the periodicity
of the function. =NO The CUSTOM action with label hh_kdep calculates a single scalar valuehh_u: ACCUMULATEARGthe input for this action is the scalar output from one or more other actions. =hh_kdepSTRIDEcompulsory keyword ( default=1 )
the frequency with which the data should be collected and added to the quantity being
averaged =1 CLEARcompulsory keyword ( default=0 )
the frequency with which to clear all the accumulated data. =100000 The ACCUMULATE action with label hh_u calculates a single scalar valuehh_nsum: ACCUMULATEARGthe input for this action is the scalar output from one or more other actions. =hh_weightSTRIDEcompulsory keyword ( default=1 )
the frequency with which the data should be collected and added to the quantity being
averaged =1 CLEARcompulsory keyword ( default=0 )
the frequency with which to clear all the accumulated data. =100000 The ACCUMULATE action with label hh_nsum calculates a single scalar valuehh: CUSTOMARGthe input to this function. =hh_u,hh_nsumFUNCcompulsory keyword
the function you wish to evaluate =x/y PERIODICcompulsory keyword
if the output of your function is periodic then you should specify the periodicity
of the function. =NO The CUSTOM action with label hh calculates a single scalar valueDUMPGRIDGRIDthe grid you would like to print (can also use ARG for specifying what is being printed)
=hhFILEcompulsory keyword ( default=density )
the file on which to write the grid. =histo STRIDEcompulsory keyword ( default=0 )
the frequency with which the grid should be output to the file. =100000 The DUMPGRID action with label
Glossary of keywords and components
Description of components
Quantity
Description
.#!value
the estimate of the histogram as a function of the argument that was obtained
Compulsory keywords
NORMALIZATION
( default=ndata ) This controls how the data is normalized it can be set equal to true, false or ndata. See above for an explanation
GRID_MIN
( default=auto ) the lower bounds for the grid
GRID_MAX
( default=auto ) the upper bounds for the grid
KERNEL
( default=GAUSSIAN ) the kernel function you are using. More details on the kernels available in plumed plumed can be found in kernelfunctions.
STRIDE
( default=1 ) the frequency with which to store data for averaging
CLEAR
( default=0 ) the frequency with whihc to clear the data that is being averaged
Options
UPDATE_FROM
Only update this action from this time
UPDATE_UNTIL
Only update this action until this time
ARG
the quantity that is being averaged
DATA
an alternative to the ARG keyword
BANDWIDTH
the bandwidths for kernel density esimtation
GRID_BIN
the number of bins for the grid
GRID_SPACING
the approximate grid spacing (to be used as an alternative or together with GRID_BIN)
LOGWEIGHTS
the logarithm of the quantity to use as the weights when calculating averages