
Action: KDE

Module: gridtools
Description: Create a histogram from the input scalar/vector/matrix using KDE
Usage: used in 2 tutorials, used in 1 eggs
Output value: a function on a grid that was obtained by doing a Kernel Density Estimation using the input arguments
Output type: grid

Details and examples

Create a histogram from the input scalar/vector/matrix using KDE

This action can be used to construct instantaneous distributions for quantities by using kernel density estimation. The input arguments must all have the same rank and size, but they can be scalars, vectors or matrices. The distribution of this quantity on a grid is then computed using kernel density estimation.

The following example demonstrates how this action can be used with a scalar as input:

tested on 2.11
d1: DISTANCE ATOMS=1,2
kde: KDE ARG=d1 GRID_MIN=0.0 GRID_MAX=1.0 GRID_BIN=100 BANDWIDTH=0.2
DUMPGRID ARG=kde STRIDE=1 FILE=kde.grid

This input outputs a different file on every time step. These files contain a function stored on a grid. The function output in this case consists of a single Gaussian that is centered on the instantaneous value of the distance between atoms 1 and 2. You are unlikely to use an input like the one above in practice. The more usual approach is to accumulate the histogram over the course of a few trajectory frames using the ACCUMULATE command, as has been done in the input below, which estimates a histogram as a function of two collective variables:

d1: DISTANCE ATOMS=1,2
d2: DISTANCE ATOMS=1,2
kde: KDE ARG=d1,d2 GRID_MIN=0.0,0.0 GRID_MAX=1.0,1.0 GRID_BIN=100,100 BANDWIDTH=0.2,0.2
histo: ACCUMULATE ARG=kde STRIDE=1
DUMPGRID ARG=histo FILE=histo.grid STRIDE=10000

Notice that you can also achieve something similar by using the HISTOGRAM shortcut.
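To make the difference between the instantaneous KDE and the accumulated histogram concrete, the following Python sketch evaluates a normalised Gaussian on a one-dimensional grid and then sums the grids from several frames. It is only an illustration of the idea and not PLUMED's implementation: the function name `gaussian_on_grid`, the placement of the grid points and the three distance values are assumptions made for this example.

```python
import math

def gaussian_on_grid(center, grid_min, grid_max, nbins, bandwidth):
    """Evaluate a normalised 1D Gaussian centred on `center` at each grid point.

    Assumption: grid points sit at grid_min + i * spacing for i in 0..nbins-1.
    """
    spacing = (grid_max - grid_min) / nbins
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * math.exp(-0.5 * ((grid_min + i * spacing - center) / bandwidth) ** 2)
        for i in range(nbins)
    ]

# One frame: a single Gaussian centred on the instantaneous value (what KDE outputs).
single = gaussian_on_grid(0.30, 0.0, 1.0, 100, 0.2)

# Several frames: sum the per-frame grids (the role played by ACCUMULATE).
distances = [0.30, 0.35, 0.50]  # hypothetical values of d1 on three frames
histo = [0.0] * 100
for d in distances:
    frame = gaussian_on_grid(d, 0.0, 1.0, 100, 0.2)
    histo = [h + f for h, f in zip(histo, frame)]
```

Here `single` plays the role of the grid written on one step by the first input, while `histo` corresponds to what is obtained once the per-frame grids have been summed.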

Controlling the grid

If you prefer to specify the grid spacing rather than the number of bins you can do so using the GRID_SPACING keyword as shown below:

d1: DISTANCE ATOMS=1,2
kde: KDE ARG=d1 GRID_MIN=0.0 GRID_MAX=1.0 GRID_SPACING=0.01 BANDWIDTH=0.2
DUMPGRID ARG=kde STRIDE=1 FILE=kde.grid

If $x$ is one of the input arguments to the KDE action and $x < g_{min}$ or $x > g_{max}$, where $g_{min}$ and $g_{max}$ are the minimum and maximum values on the grid for that argument that were specified using GRID_MIN and GRID_MAX, then PLUMED will crash.
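Because values below GRID_MIN or above GRID_MAX cause a crash, it can be worth checking values collected from an earlier run against the intended grid bounds before settling on them. The sketch below is one hypothetical way to do this outside PLUMED. The helper name and the sample values are assumptions made for this example.

```python
def values_outside_grid(values, grid_min, grid_max):
    """Return the values that fall below grid_min or above grid_max."""
    return [v for v in values if v < grid_min or v > grid_max]

# Hypothetical distances checked against GRID_MIN=0.0 and GRID_MAX=1.0.
bad = values_outside_grid([0.25, 0.80, 1.30], 0.0, 1.0)
print(bad)  # [1.3]
```

An empty list indicates that every sampled value lies inside the grid.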

Notice also that when you use Gaussian kernels to accumulate a density as in the input above you need to define a cutoff beyond which the Gaussian (which is a function with infinite support) is assumed not to contribute to the accumulated density. When setting this cutoff you set the value of $x$ in the expression $\sigma\sqrt{2x}$, where $\sigma$ is the bandwidth. By default $x$ is set equal to 6.25, but you can change this value by using the CUTOFF keyword as shown below:

d1: DISTANCE ATOMS=1,2
kde: KDE ARG=d1 GRID_MIN=0.0 GRID_MAX=1.0 GRID_SPACING=0.01 BANDWIDTH=0.2 CUTOFF=6.25
DUMPGRID ARG=kde STRIDE=1 FILE=kde.grid
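The cutoff distance implied by a given CUTOFF value follows directly from the expression sqrt(2*x)*bandwidth. The short Python sketch below evaluates it for the values used in the input above and also shows how small the Gaussian has become, relative to its peak, at that distance. The function names are chosen for this example only.

```python
import math

def cutoff_distance(x, bandwidth):
    """Distance from the kernel centre at which evaluation stops: sqrt(2*x)*bandwidth."""
    return math.sqrt(2.0 * x) * bandwidth

def relative_height_at_cutoff(x):
    """Gaussian value at the cutoff divided by its peak value.

    exp(-d^2 / (2 sigma^2)) with d = sqrt(2*x) * sigma reduces to exp(-x).
    """
    return math.exp(-x)

d_cut = cutoff_distance(6.25, 0.2)      # roughly 0.707 in the units of d1
rel = relative_height_at_cutoff(6.25)   # roughly 0.0019 of the peak height
```

With the default value, the kernel has therefore fallen to about 0.2% of its maximum at the point where it is truncated.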

Constructing the density

If you are performing a simulation in the NVT ensemble and wish to look at the density as a function of position in the cell you can use an input like the one shown below:

a: FIXEDATOM AT=0,0,0
dens: DISTANCES ATOMS=1-100 ORIGIN=a COMPONENTS
kde: KDE ARG=dens.x,dens.y,dens.z GRID_BIN=100,100,100 BANDWIDTH=0.05,0.05,0.05
DUMPGRID ARG=kde STRIDE=1 FILE=density

Notice that you do not need to specify GRID_MIN and GRID_MAX values with this input. In this case PLUMED gets the extent of the grid from the cell vectors during the first step of the simulation.

Specifying a non-diagonal bandwidth

If for any reason you want to use a bandwidth that is not diagonal when doing kernel density estimation you can do so by using an input similar to the one shown below:

d1: DISTANCE ATOMS=1,2
d2: DISTANCE ATOMS=1,2
kde: KDE ...
  ARG=d1,d2 GRID_MIN=0.0,0.0
  GRID_MAX=1.0,1.0 GRID_BIN=100,100
  BANDWIDTH=0.2,0.1,0.1,0.2 HEIGHTS=1
...
histo: ACCUMULATE ARG=kde STRIDE=1
DUMPGRID ARG=histo FILE=histo.grid STRIDE=10000

As there are two arguments for this KDE action the four numbers passed in the BANDWIDTH parameter are interpreted as a matrix. Notice that you can also pass the information for the bandwidth in from another argument as has been done here:

m: CONSTANT VALUES=0.2,0.1,0.1,0.2 NROWS=2 NCOLS=2

d1: DISTANCE ATOMS=1,2
d2: DISTANCE ATOMS=1,2
kde: KDE ...
  ARG=d1,d2 GRID_MIN=0.0,0.0
  GRID_MAX=1.0,1.0 GRID_BIN=100,100
  BANDWIDTH=m HEIGHTS=1
...
histo: ACCUMULATE ARG=kde STRIDE=1
DUMPGRID ARG=histo FILE=histo.grid STRIDE=10000

In this case the input is equivalent to the first input in this section and the bandwidth is a constant. You could, however, also use a non-constant value as input to the BANDWIDTH keyword.
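As a rough illustration of how four numbers can be read as a matrix for two arguments, the Python sketch below reshapes a flat list into a square matrix and checks that it is symmetric. The row-by-row ordering is an assumption made for this example (the values used here are symmetric, so the ordering does not change the result), and the helper names are hypothetical rather than part of PLUMED.

```python
import math

def as_square_matrix(values):
    """Reshape a flat list of n*n numbers into n rows of n numbers (row by row)."""
    n = math.isqrt(len(values))
    if n * n != len(values):
        raise ValueError("expected a square number of values")
    return [list(values[i * n:(i + 1) * n]) for i in range(n)]

def is_symmetric(matrix):
    """Check that matrix[i][j] equals matrix[j][i] for every pair of indices."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))

bandwidth = as_square_matrix([0.2, 0.1, 0.1, 0.2])
print(bandwidth)                # [[0.2, 0.1], [0.1, 0.2]]
print(is_symmetric(bandwidth))  # True
```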

Working with vectors and scalars

If the input to your KDE action is a set of scalars it appears odd to separate the process of computing the KDE from the process of accumulating the histogram. However, if you are using vectors as in the example below, this division can be helpful.

d1: DISTANCE ATOMS1=1,2 ATOMS2=3,4 ATOMS3=5,6 ATOMS4=7,8 ATOMS5=9,10
kde: KDE ARG=d1 GRID_MIN=0.0 GRID_MAX=1.0 GRID_BIN=100 BANDWIDTH=0.2

In the paper cited in the bibliography below, the KL_ENTROPY between the instantaneous distribution of CVs and a reference distribution was introduced as a collective variable. As is detailed in the documentation for that action, the ability to calculate the instantaneous histogram from an input vector is essential to reproducing these calculations.
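The idea of an instantaneous histogram built from a vector can be sketched in Python as follows: every element of the vector contributes one kernel to the same grid within a single frame. This is a simplified illustration under stated assumptions and not PLUMED's implementation. The grid-point placement, the absence of any division by the number of elements, the function name and the five distance values are all assumptions made for this example.

```python
import math

def kde_from_vector(values, grid_min, grid_max, nbins, bandwidth):
    """Sum one normalised 1D Gaussian per vector element on a shared grid.

    Assumption: grid points sit at grid_min + i * spacing for i in 0..nbins-1.
    """
    spacing = (grid_max - grid_min) / nbins
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    grid = [0.0] * nbins
    for v in values:
        for i in range(nbins):
            x = grid_min + i * spacing
            grid[i] += norm * math.exp(-0.5 * ((x - v) / bandwidth) ** 2)
    return grid

# Hypothetical values of the five distances in d1 on a single frame.
d1 = [0.22, 0.31, 0.35, 0.48, 0.60]
instantaneous = kde_from_vector(d1, 0.0, 1.0, 100, 0.2)
```

The result is available on every frame, which is what makes it usable as an input to further actions.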

Notice that you can also use one or multiple matrices in the input for a KDE object. The example below uses the angles between the z axis and a set of bonds around two atoms:

d1: DISTANCE_MATRIX GROUPA=1,2 GROUPB=3-10 COMPONENTS
phi: CUSTOM ARG=d1.z,d1.w FUNC=acos(x/y) PERIODIC=NO
kde: KDE ARG=phi GRID_MIN=0 GRID_MAX=pi GRID_BIN=200 BANDWIDTH=0.1

Using different weights

In all the inputs above the kernels that are added to the grid on each step are Gaussians that are normalised so that their integral over all space is one. If you want your Gaussians to have a particular height you can use the HEIGHTS keyword as illustrated below:

d1: CONTACT_MATRIX GROUPA=1,2 GROUPB=3-10 SWITCH={RATIONAL R_0=0.1} COMPONENTS
mag: CUSTOM ARG=d1.x,d1.y,d1.z FUNC=x*x+y*y+z*z PERIODIC=NO
phi: CUSTOM ARG=d1.z,mag FUNC=acos(x/sqrt(y)) PERIODIC=NO
kde: KDE ARG=phi GRID_MIN=0 GRID_MAX=pi HEIGHTS=d1.w GRID_BIN=200 BANDWIDTH=0.1

As indicated above, the HEIGHTS keyword should be passed a Value that has the same rank and size as the arguments that are passed using the ARG keyword. Each of the Gaussian kernels that is added to the grid in this case has a value at its maximum that is equal to the corresponding weight.

Notice that you can also use the VOLUMES keyword in a similar way as shown below:

d1: CONTACT_MATRIX GROUPA=1,2 GROUPB=3-10 SWITCH={RATIONAL R_0=0.1} COMPONENTS
mag: CUSTOM ARG=d1.x,d1.y,d1.z FUNC=x*x+y*y+z*z PERIODIC=NO
phi: CUSTOM ARG=d1.z,mag FUNC=acos(x/sqrt(y)) PERIODIC=NO
kde: KDE ARG=phi GRID_MIN=0 GRID_MAX=pi VOLUMES=d1.w GRID_BIN=200 BANDWIDTH=0.1

Now, however, the integrals of the Gaussians over all space are equal to the elements of d1.w.
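The three conventions described in this section (normalised kernels, kernels with a prescribed height and kernels with a prescribed volume) can be compared with a small one-dimensional Python sketch. The function names are hypothetical and the code only illustrates the scaling of a Gaussian. It is not taken from PLUMED.

```python
import math

def normalised_kernel(x, center, bandwidth):
    """1D Gaussian whose integral over all space is one (the default behaviour)."""
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * ((x - center) / bandwidth) ** 2)

def kernel_with_height(x, center, bandwidth, height):
    """1D Gaussian whose maximum value equals `height` (the HEIGHTS convention)."""
    return height * math.exp(-0.5 * ((x - center) / bandwidth) ** 2)

def kernel_with_volume(x, center, bandwidth, volume):
    """1D Gaussian whose integral over all space equals `volume` (the VOLUMES convention)."""
    return volume * normalised_kernel(x, center, bandwidth)

# Hypothetical weight of 0.7 attached to an angle of 1.0 rad with bandwidth 0.1.
peak_heights = kernel_with_height(1.0, 1.0, 0.1, 0.7)   # equals the weight itself
peak_volumes = kernel_with_volume(1.0, 1.0, 0.1, 0.7)   # weight / (0.1 * sqrt(2*pi))
```

The only difference between the two weighted forms is therefore the constant factor that multiplies the same Gaussian shape.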

Input

The arguments that serve as the input for this action are specified using one or more of the keywords in the following table.

Keyword | Type | Description
ARG | scalar/vector/matrix | the label for the value that should be used to construct the histogram

Full list of keywords

The following table describes the keywords and options that can be used with this action

Keyword | Type | Default | Description
ARG | input | none | the label for the value that should be used to construct the histogram
KERNEL | compulsory | GAUSSIAN | the kernel function you are using
GRID_MIN | compulsory | auto | the lower bounds for the grid
GRID_MAX | compulsory | auto | the upper bounds for the grid
CUTOFF | compulsory | 6.25 | the cutoff at which to stop evaluating the kernel functions is set equal to sqrt(2*x)*bandwidth in each direction where x is this number
BANDWIDTH | optional | not used | the bandwidths for kernel density estimation
VOLUMES | optional | not used | this keyword takes the label of an action that calculates a vector of values
HEIGHTS | optional | not used | this keyword takes the label of an action that calculates a vector of values
GRID_SPACING | optional | not used | the approximate grid spacing (to be used as an alternative or together with GRID_BIN)
GRID_BIN | optional | not used | the number of bins for the grid
SERIAL | optional | false | do the calculation in serial
USEGPU | optional | false | run this calculation on the GPU