CLASSICAL_MDS

This is part of the dimred module | |

It is only available if you configure PLUMED with ./configure –enable-modules=dimred . Furthermore, this feature is still being developed so take care when using it and report any problems on the mailing list. |

Create a low-dimensional projection of a trajectory using the classical multidimensional scaling algorithm.

Multidimensional scaling (MDS) is similar to what is done when you make a map. You start with distances between London, Belfast, Paris and Dublin and then you try to arrange points on a piece of paper so that the (suitably scaled) distances between the points in your map representing each of those cities are related to the true distances between the cities. Stating this more mathematically MDS endeavors to find an isometry between points distributed in a high-dimensional space and a set of points distributed in a low-dimensional plane. In other words, if we have \(M\) \(D\)-dimensional points, \(\mathbf{X}\), and we can calculate dissimilarities between pairs them, \(D_{ij}\), we can, with an MDS calculation, try to create \(M\) projections, \(\mathbf{x}\), of the high dimensionality points in a \(d\)-dimensional linear space by trying to arrange the projections so that the Euclidean distances between pairs of them, \(d_{ij}\), resemble the dissimilarities between the high dimensional points. In short we minimize:

\[ \chi^2 = \sum_{i \ne j} \left( D_{ij} - d_{ij} \right)^2 \]

where \(D_{ij}\) is the distance between point \(X^{i}\) and point \(X^{j}\) and \(d_{ij}\) is the distance between the projection of \(X^{i}\), \(x^i\), and the projection of \(X^{j}\), \(x^j\). A tutorial on this approach can be used to analyze simulations can be found in the tutorial Lugano tutorial: Dimensionality reduction and in the following short video.

- Description of components

By default the value of the calculated quantity can be referenced elsewhere in the input file by using the label of the action. Alternatively this Action can be used to calculate the following quantities by employing the keywords listed below. These quantities can be referenced elsewhere in the input by using this Action's label followed by a dot and the name of the quantity required from the list below.

Quantity | Description |

coord | the low-dimensional projections of the various input configurations |

In addition the following quantities can be calculated by employing the keywords listed below

Quantity | Keyword | Description |

gradient | GRADIENT | the gradient |

vmean | VMEAN | the norm of the mean vector. The output component can be referred to elsewhere in the input file by using the label.vmean |

vsum | VSUM | the norm of sum of vectors. The output component can be referred to elsewhere in the input file by using the label.vsum |

spath | SPATH | the position on the path |

gspath | GPATH | the position on the path calculated using trigonometry |

gzpath | GPATH | the distance from the path calculated using trigonometry |

zpath | ZPATH | the distance from the path |

altmin | ALT_MIN | the minimum value. This is calculated using the formula described in the description of the keyword so as to make it continuous. |

between | BETWEEN | the number/fraction of values within a certain range. This is calculated using one of the formula described in the description of the keyword so as to make it continuous. You can calculate this quantity multiple times using different parameters. |

highest | HIGHEST | the highest of the quantities calculated by this action |

lessthan | LESS_THAN | the number of values less than a target value. This is calculated using one of the formula described in the description of the keyword so as to make it continuous. You can calculate this quantity multiple times using different parameters. |

lowest | LOWEST | the lowest of the quantities calculated by this action |

max | MAX | the maximum value. This is calculated using the formula described in the description of the keyword so as to make it continuous. |

mean | MEAN | the mean value. The output component can be referred to elsewhere in the input file by using the label.mean |

min | MIN | the minimum value. This is calculated using the formula described in the description of the keyword so as to make it continuous. |

moment | MOMENTS | the central moments of the distribution of values. The second moment would be referenced elsewhere in the input file using label.moment-2, the third as label.moment-3, etc. |

morethan | MORE_THAN | the number of values more than a target value. This is calculated using one of the formula described in the description of the keyword so as to make it continuous. You can calculate this quantity multiple times using different parameters. |

sum | SUM | the sum of values |

- The data to analyze can be the output from another analysis algorithm

USE_OUTPUT_DATA_FROM | use the output of the analysis performed by this object as input to your new analysis object |

- Compulsory keywords

NLOW_DIM | number of low-dimensional coordinates required |

- Options

SERIAL | ( default=off ) do the calculation in serial. Do not use MPI |

LOWMEM | ( default=off ) lower the memory requirements |

- Examples

The following command instructs plumed to construct a classical multidimensional scaling projection of a trajectory. The RMSD distance between atoms 1-256 have moved is used to measure the distances in the high-dimensional space.

data: COLLECT_FRAMES ATOMS=1-256 mat: EUCLIDEAN_DISSIMILARITIES USE_OUTPUT_DATA_FROM=data mds: CLASSICAL_MDS USE_OUTPUT_DATA_FROM=mat NLOW_DIM=2 OUTPUT_ANALYSIS_DATA_TO_COLVAR USE_OUTPUT_DATA_FROM=mds FILE=rmsd-embed

The following section is for people who are interested in how this method works in detail. A solid understanding of this material is not necessary to use MDS.

The stress function can be minimized using a standard optimization algorithm such as conjugate gradients or steepest descent. However, it is more common to do this minimization using a technique known as classical scaling. Classical scaling works by recognizing that each of the distances $D_{ij}$ in the above sum can be written as:

\[ D_{ij}^2 = \sum_{\alpha} (X^i_\alpha - X^j_\alpha)^2 = \sum_\alpha (X^i_\alpha)^2 + (X^j_\alpha)^2 - 2X^i_\alpha X^j_\alpha \]

We can use this expression and matrix algebra to calculate multiple distances at once. For instance if we have three points, \(\mathbf{X}\), we can write distances between them as:

\begin{eqnarray*} D^2(\mathbf{X}) &=& \left[ \begin{array}{ccc} 0 & d_{12}^2 & d_{13}^2 \\ d_{12}^2 & 0 & d_{23}^2 \\ d_{13}^2 & d_{23}^2 & 0 \end{array}\right] \\ &=& \sum_\alpha \left[ \begin{array}{ccc} (X^1_\alpha)^2 & (X^1_\alpha)^2 & (X^1_\alpha)^2 \\ (X^2_\alpha)^2 & (X^2_\alpha)^2 & (X^2_\alpha)^2 \\ (X^3_\alpha)^2 & (X^3_\alpha)^2 & (X^3_\alpha)^2 \\ \end{array}\right] + \sum_\alpha \left[ \begin{array}{ccc} (X^1_\alpha)^2 & (X^2_\alpha)^2 & (X^3_\alpha)^2 \\ (X^1_\alpha)^2 & (X^2_\alpha)^2 & (X^3_\alpha)^2 \\ (X^1_\alpha)^2 & (X^2_\alpha)^2 & (X^3_\alpha)^2 \\ \end{array}\right] - 2 \sum_\alpha \left[ \begin{array}{ccc} X^1_\alpha X^1_\alpha & X^1_\alpha X^2_\alpha & X^1_\alpha X^3_\alpha \\ X^2_\alpha X^1_\alpha & X^2_\alpha X^2_\alpha & X^2_\alpha X^3_\alpha \\ X^1_\alpha X^3_\alpha & X^3_\alpha X^2_\alpha & X^3_\alpha X^3_\alpha \end{array}\right] \nonumber \\ &=& \mathbf{c 1^T} + \mathbf{1 c^T} - 2 \sum_\alpha \mathbf{x}_a \mathbf{x}^T_a = \mathbf{c 1^T} + \mathbf{1 c^T} - 2\mathbf{X X^T} \end{eqnarray*}

This last equation can be extended to situations when we have more than three points. In it \(\mathbf{X}\) is a matrix that has one high-dimensional point on each of its rows and \(\mathbf{X^T}\) is its transpose. \(\mathbf{1}\) is an \(M \times 1\) vector of ones and \(\mathbf{c}\) is a vector with components given by:

\[ c_i = \sum_\alpha (x_\alpha^i)^2 \]

These quantities are the diagonal elements of \(\mathbf{X X^T}\), which is a dot product or Gram Matrix that contains the dot product of the vector \(X_i\) with the vector \(X_j\) in element \(i,j\).

In classical scaling we introduce a centering matrix \(\mathbf{J}\) that is given by:

\[ \mathbf{J} = \mathbf{I} - \frac{1}{M} \mathbf{11^T} \]

where \(\mathbf{I}\) is the identity. Multiplying the equations above from the front and back by this matrix and a factor of a \(-\frac{1}{2}\) gives:

\begin{eqnarray*} -\frac{1}{2} \mathbf{J} \mathbf{D}^2(\mathbf{X}) \mathbf{J} &=& -\frac{1}{2}\mathbf{J}( \mathbf{c 1^T} + \mathbf{1 c^T} - 2\mathbf{X X^T})\mathbf{J} \\ &=& -\frac{1}{2}\mathbf{J c 1^T J} - \frac{1}{2} \mathbf{J 1 c^T J} + \frac{1}{2} \mathbf{J}(2\mathbf{X X^T})\mathbf{J} \\ &=& \mathbf{ J X X^T J } = \mathbf{X X^T } \label{eqn:scaling} \end{eqnarray*}

The fist two terms in this expression disappear because \(\mathbf{1^T J}=\mathbf{J 1} =\mathbf{0}\), where \(\mathbf{0}\) is a matrix containing all zeros. In the final step meanwhile we use the fact that the matrix of squared distances will not change when we translate all the points. We can thus assume that the mean value, \(\mu\), for each of the components, \(\alpha\):

\[ \mu_\alpha = \frac{1}{M} \sum_{i=1}^N \mathbf{X}^i_\alpha \]

is equal to 0 so the columns of \(\mathbf{X}\) add up to 0. This in turn means that each of the columns of \(\mathbf{X X^T}\) adds up to zero, which is what allows us to write \(\mathbf{ J X X^T J } = \mathbf{X X^T }\).

The matrix of squared distances is symmetric and positive-definite we can thus use the spectral decomposition to decompose it as:

\[ \Phi= \mathbf{V} \Lambda \mathbf{V}^T \]

Furthermore, because the matrix we are diagonalizing, \(\mathbf{X X^T}\), is the product of a matrix and its transpose we can use this decomposition to write:

\[ \mathbf{X} =\mathbf{V} \Lambda^\frac{1}{2} \]

Much as in PCA there are generally a small number of large eigenvalues in \(\Lambda\) and many small eigenvalues. We can safely use only the large eigenvalues and their corresponding eigenvectors to express the relationship between the coordinates \(\mathbf{X}\). This gives us our set of low-dimensional projections.

This derivation makes a number of assumptions about the how the low dimensional points should best be arranged to minimize the stress. If you use an interactive optimization algorithm such as SMACOF you may thus be able to find a better (lower-stress) projection of the points. For more details on the assumptions made see this website.