Action: KDE
| Module | gridtools |
|---|---|
| Description | Create a histogram from the input scalar/vector/matrix using KDE |
| Output value | a function on a grid that was obtained by doing a kernel density estimation using the input arguments |
| Output type | grid |
Details and examples
Create a histogram from the input scalar/vector/matrix using KDE
This action can be used to construct instantaneous distributions for quantities by using kernel density estimation. The input arguments must all have the same rank and size, but they can be scalars, vectors or matrices. The distribution of this quantity on a grid is then computed using kernel density estimation.
The following example demonstrates how this action can be used with a scalar as input:
d1: DISTANCE ATOMS=1,2
kde: KDE ARG=d1 GRID_MIN=0.0 GRID_MAX=1.0 GRID_BIN=100 BANDWIDTH=0.2
DUMPGRID ARG=kde STRIDE=1 FILE=kde.grid
This input outputs a different file on every time step. These files contain a function stored on a grid. The function output in this case consists of a single Gaussian that is centered on the instantaneous value of the distance between atoms 1 and 2. You are unlikely to use an input like the one above. The more usual thing to do is to accumulate the histogram over the course of several trajectory frames using the ACCUMULATE command, as has been done in the input below, which estimates a histogram as a function of two collective variables:
d1: DISTANCE ATOMS=1,2
d2: DISTANCE ATOMS=1,2
kde: KDE ARG=d1,d2 GRID_MIN=0.0,0.0 GRID_MAX=1.0,1.0 GRID_BIN=100,100 BANDWIDTH=0.2,0.2
histo: ACCUMULATE ARG=kde STRIDE=1
DUMPGRID ARG=histo FILE=histo.grid STRIDE=10000
Notice that you can also achieve something similar by using the HISTOGRAM shortcut.
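The combination of a KDE step and an accumulation step can be pictured with the short Python sketch below. It is not PLUMED code. It assumes a one-dimensional grid with `nbins + 1` evenly spaced points and unit-integral Gaussian kernels, and the function name `gaussian_kernel_on_grid` and the three frame values are hypothetical, chosen only for illustration.

```python
import math

def gaussian_kernel_on_grid(value, grid_min, grid_max, nbins, bandwidth):
    """Evaluate a unit-integral Gaussian centred on `value` at every grid point."""
    spacing = (grid_max - grid_min) / nbins
    # Assumption: a non-periodic grid with nbins + 1 points, end points included.
    points = [grid_min + i * spacing for i in range(nbins + 1)]
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    kernel = [norm * math.exp(-0.5 * ((p - value) / bandwidth) ** 2) for p in points]
    return points, kernel

# Hypothetical instantaneous values of d1 taken from three trajectory frames.
frames = [0.30, 0.35, 0.50]

histo = None
for d1 in frames:
    # KDE-like step: one Gaussian per frame, stored on the grid.
    points, kde = gaussian_kernel_on_grid(d1, 0.0, 1.0, 100, 0.2)
    # ACCUMULATE-like step: add this frame's grid to the running sum.
    histo = kde if histo is None else [h + k for h, k in zip(histo, kde)]

peak = max(range(len(histo)), key=histo.__getitem__)
print(f"accumulated density peaks near d1 = {points[peak]:.2f}")
```

Each frame contributes one Gaussian, so the grid for a single frame is the instantaneous distribution and the running sum is the unnormalised histogram.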
Controlling the grid
If you prefer to specify the grid spacing rather than the number of bins you can do so using the GRID_SPACING keyword as shown below:
d1: DISTANCE ATOMS=1,2
kde: KDE ARG=d1 GRID_MIN=0.0 GRID_MAX=1.0 GRID_SPACING=0.01 BANDWIDTH=0.2
DUMPGRID ARG=kde STRIDE=1 FILE=kde.grid
If $x$ is one of the input arguments to the KDE action and $x < a$ or $x > b$, where $a$ and $b$ are the minimum and maximum values on the grid for that argument that were specified using GRID_MIN and GRID_MAX, then PLUMED will crash.
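The relationship between a grid spacing and a number of bins, and the range condition on the input values, can be sketched in Python as follows. This is an illustration only. The rounding rule in `bins_from_spacing` (the smallest number of bins whose width does not exceed the requested spacing) is an assumption, since the text only says the spacing is approximate, and both function names are hypothetical.

```python
import math

def bins_from_spacing(grid_min, grid_max, spacing):
    """Smallest number of bins whose width does not exceed `spacing` (assumed rule)."""
    # The small tolerance guards against floating-point noise in the division.
    return math.ceil((grid_max - grid_min) / spacing - 1e-9)

def in_grid_range(x, grid_min, grid_max):
    """True when x lies between the lower and upper bounds of the grid."""
    return grid_min <= x <= grid_max

nbins = bins_from_spacing(0.0, 1.0, 0.01)
print(nbins)                           # 100, the same grid as GRID_BIN=100
print(in_grid_range(0.45, 0.0, 1.0))   # True: this value can be placed on the grid
print(in_grid_range(1.20, 0.0, 1.0))   # False: this value lies outside the grid
```

Checking the expected range of a collective variable before choosing GRID_MIN and GRID_MAX avoids the out-of-range condition described above.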
Notice also that when you use Gaussian kernels to accumulate a density as in the input above you need to define a cutoff beyond which the Gaussian (which is a function with infinite support) is assumed not to contribute to the accumulated density. When setting this cutoff you set the value of $x$ in the expression $\sigma\sqrt{2x}$, where $\sigma$ is the bandwidth. By default $x$ is set equal to 6.25 but you can change this value by using the CUTOFF keyword as shown below:
d1: DISTANCE ATOMS=1,2
kde: KDE ARG=d1 GRID_MIN=0.0 GRID_MAX=1.0 GRID_SPACING=0.01 BANDWIDTH=0.2 CUTOFF=6.25
DUMPGRID ARG=kde STRIDE=1 FILE=kde.grid
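To get a feel for what the default means, the Python sketch below evaluates the cutoff distance sqrt(2*x)*bandwidth for the values used in the input above. It assumes the kernel has the form exp(-r^2 / (2*bandwidth^2)), in which case the kernel has decayed to exp(-x) of its peak value at the cutoff. The function name `kernel_cutoff` is hypothetical.

```python
import math

def kernel_cutoff(x, bandwidth):
    """Distance beyond which the kernel is ignored: sqrt(2*x)*bandwidth."""
    return math.sqrt(2.0 * x) * bandwidth

bandwidth = 0.2
x = 6.25  # default value of CUTOFF
r_cut = kernel_cutoff(x, bandwidth)

# Assumed kernel shape: exp(-r^2 / (2 sigma^2)), so the value at r_cut is exp(-x).
value_at_cutoff = math.exp(-r_cut**2 / (2.0 * bandwidth**2))

print(f"cutoff distance = {r_cut:.4f}")                            # 0.7071
print(f"relative kernel height at cutoff = {value_at_cutoff:.5f}") # 0.00193
```

Under these assumptions, a larger value of CUTOFF reduces the truncation error at the price of evaluating each kernel on more grid points.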
Constructing the density
If you are performing a simulation in the NVT ensemble and wish to look at the density as a function of position in the cell you can use an input like the one shown below:
a: FIXEDATOM AT=0,0,0
dens: DISTANCES ATOMS=1-100 ORIGIN=a COMPONENTS
kde: KDE ARG=dens.x,dens.y,dens.z GRID_BIN=100,100,100 BANDWIDTH=0.05,0.05,0.05
DUMPGRID ARG=kde STRIDE=1 FILE=density
Notice that you do not need to specify GRID_MIN and GRID_MAX values with this input. In this case PLUMED gets the extent of the grid from the cell vectors during the first step of the simulation.
Specifying a non-diagonal bandwidth
If for any reason you want to use a bandwidth that is not diagonal when doing kernel density estimation you can do so by using an input similar to the one shown below:
d1: DISTANCE ATOMS=1,2
d2: DISTANCE ATOMS=1,2
kde: KDE ...
  ARG=d1,d2 GRID_MIN=0.0,0.0 GRID_MAX=1.0,1.0
  GRID_BIN=100,100 BANDWIDTH=0.2,0.1,0.1,0.2
  HEIGHTS=1
...
histo: ACCUMULATE ARG=kde STRIDE=1
DUMPGRID ARG=histo FILE=histo.grid STRIDE=10000
As there are two arguments for this KDE action, the four numbers passed to the BANDWIDTH keyword are interpreted as a matrix. Notice that you can also pass the information for the bandwidth in from another argument as has been done here:
m: CONSTANT VALUES=0.2,0.1,0.1,0.2 NROWS=2 NCOLS=2
d1: DISTANCE ATOMS=1,2
d2: DISTANCE ATOMS=1,2
kde: KDE ...
  ARG=d1,d2 GRID_MIN=0.0,0.0 GRID_MAX=1.0,1.0
  GRID_BIN=100,100 BANDWIDTH=m
  HEIGHTS=1
...
histo: ACCUMULATE ARG=kde STRIDE=1
DUMPGRID ARG=histo FILE=histo.grid STRIDE=10000
In this case the input is equivalent to the first input in this section and the bandwidth is a constant. You could, however, also use a non-constant value as input to the BANDWIDTH keyword.
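The effect of the off-diagonal elements can be illustrated with the Python sketch below. It is not PLUMED's implementation. How PLUMED maps the four numbers onto the kernel is not stated here, so the sketch makes the loud assumption that the matrix acts as the covariance matrix of a bivariate Gaussian, and the function name `gaussian_2d` and the test points are hypothetical.

```python
import math

def gaussian_2d(point, centre, cov):
    """Unit-integral bivariate Gaussian; `cov` is a 2x2 covariance matrix (assumption)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Inverse of the 2x2 matrix written out explicitly.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = point[0] - centre[0]
    dy = point[1] - centre[1]
    quad = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

# The four numbers from the input, reshaped row by row into a 2x2 matrix.
values = [0.2, 0.1, 0.1, 0.2]
matrix = [values[0:2], values[2:4]]

centre = (0.5, 0.5)
along = gaussian_2d((0.7, 0.7), centre, matrix)   # displaced along the d1 = d2 diagonal
across = gaussian_2d((0.7, 0.3), centre, matrix)  # displaced across that diagonal
print(along > across)
```

With positive off-diagonal elements the kernel in this sketch is stretched along the diagonal, which a purely diagonal bandwidth cannot represent.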
Working with vectors and scalars
If the input to your KDE action is a set of scalars it appears odd to separate the process of computing the KDE from the process of accumulating the histogram. However, if you are using vectors as in the example below, this division can be helpful.
d1: DISTANCE ATOMS1=1,2 ATOMS2=3,4 ATOMS3=5,6 ATOMS4=7,8 ATOMS5=9,10
kde: KDE ARG=d1 GRID_MIN=0.0 GRID_MAX=1.0 GRID_BIN=100 BANDWIDTH=0.2
In the paper cited in the bibliography below, the KL_ENTROPY between the instantaneous distribution of CVs and a reference distribution was introduced as a collective variable. As is detailed in the documentation for that action, the ability to calculate the instantaneous histogram from an input vector is essential to reproducing these calculations.
Notice that you can also use one or more matrices in the input for a KDE object. The example below uses the angles between the z axis and a set of bonds around two atoms:
d1: DISTANCE_MATRIX GROUPA=1,2 GROUPB=3-10 COMPONENTS
phi: CUSTOM ARG=d1.z,d1.w FUNC=acos(x/y) PERIODIC=NO
kde: KDE ARG=phi GRID_MIN=0 GRID_MAX=pi GRID_BIN=200 BANDWIDTH=0.1
Using different weights
In all the inputs above the kernels that are added to the grid on each step are Gaussians that are normalised so that their integral over all space is one. If you want your Gaussians to have a particular height you can use the HEIGHTS keyword as illustrated below:
d1: CONTACT_MATRIX GROUPA=1,2 GROUPB=3-10 SWITCH={RATIONAL R_0=0.1} COMPONENTS
mag: CUSTOM ARG=d1.x,d1.y,d1.z FUNC=x*x+y*y+z*z PERIODIC=NO
phi: CUSTOM ARG=d1.z,mag FUNC=acos(x/sqrt(y)) PERIODIC=NO
kde: KDE ARG=phi GRID_MIN=0 GRID_MAX=pi HEIGHTS=d1.w GRID_BIN=200 BANDWIDTH=0.1
As indicated above, the HEIGHTS keyword should be passed a Value that has the same rank and size as the arguments that are passed using the ARG keyword. Each Gaussian kernel that is added to the grid in this case has a value at its maximum that is equal to the corresponding weight.
Notice that you can also use the VOLUMES keyword in a similar way as shown below:
d1: CONTACT_MATRIX GROUPA=1,2 GROUPB=3-10 SWITCH={RATIONAL R_0=0.1} COMPONENTS
mag: CUSTOM ARG=d1.x,d1.y,d1.z FUNC=x*x+y*y+z*z PERIODIC=NO
phi: CUSTOM ARG=d1.z,mag FUNC=acos(x/sqrt(y)) PERIODIC=NO
kde: KDE ARG=phi GRID_MIN=0 GRID_MAX=pi VOLUMES=d1.w GRID_BIN=200 BANDWIDTH=0.1
Now, however, the integral of each Gaussian over all space is equal to the corresponding element of d1.w.
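The difference between fixing the height and fixing the volume of a kernel can be checked numerically with the Python sketch below. It is an illustration for a one-dimensional Gaussian only, and the weight value and variable names are hypothetical. For such a Gaussian the integral equals the peak value multiplied by bandwidth*sqrt(2*pi), so the two conventions differ by that factor.

```python
import math

def gaussian(r, sigma, height):
    """One-dimensional Gaussian with the given peak height."""
    return height * math.exp(-0.5 * (r / sigma) ** 2)

sigma = 0.1    # bandwidth used in the inputs above
weight = 0.8   # hypothetical element of d1.w

# HEIGHTS-like convention: the peak value of the kernel equals the weight.
height_mode_peak = gaussian(0.0, sigma, weight)

# VOLUMES-like convention: the integral equals the weight, so the peak
# value must be weight / (sigma * sqrt(2 * pi)).
volume_mode_height = weight / (sigma * math.sqrt(2.0 * math.pi))

# Midpoint-rule integral over [-8 sigma, 8 sigma] to confirm the volume.
n = 4000
lower, upper = -8.0 * sigma, 8.0 * sigma
step = (upper - lower) / n
integral = step * sum(
    gaussian(lower + (i + 0.5) * step, sigma, volume_mode_height) for i in range(n)
)

print(f"peak with fixed height   = {height_mode_peak:.3f}")
print(f"volume with fixed volume = {integral:.3f}")
```

Both printed numbers reproduce the weight, which shows that the same weight vector produces kernels of different sizes depending on which convention is selected.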
Input
The arguments that serve as the input for this action are specified using one or more of the keywords in the following table.
| Keyword | Type | Description |
|---|---|---|
| ARG | scalar/vector/matrix | the label for the value that should be used to construct the histogram |
Full list of keywords
The following table describes the keywords and options that can be used with this action
| Keyword | Type | Default | Description |
|---|---|---|---|
| ARG | input | none | the label for the value that should be used to construct the histogram |
| KERNEL | compulsory | GAUSSIAN | the kernel function you are using |
| GRID_MIN | compulsory | auto | the lower bounds for the grid |
| GRID_MAX | compulsory | auto | the upper bounds for the grid |
| CUTOFF | compulsory | 6.25 | the cutoff at which to stop evaluating the kernel functions is set equal to sqrt(2*x)*bandwidth in each direction where x is this number |
| BANDWIDTH | optional | not used | the bandwidths for kernel density estimation |
| VOLUMES | optional | not used | this keyword takes the label of an action that calculates a vector of values |
| HEIGHTS | optional | not used | this keyword takes the label of an action that calculates a vector of values |
| GRID_SPACING | optional | not used | the approximate grid spacing (to be used as an alternative or together with GRID_BIN) |
| GRID_BIN | optional | not used | the number of bins for the grid |
| SERIAL | optional | false | do the calculation in serial. Further information about this flag can be found here. |
| USEGPU | optional | false | run this calculation on the GPU. Further information about this flag can be found here. |