CLASSICAL_MDS
This is part of the dimred module. It is only available if you configure PLUMED with ./configure --enable-modules=dimred. Furthermore, this feature is still being developed so take care when using it and report any problems on the mailing list.

Create a low-dimensional projection of a trajectory using the classical multidimensional scaling algorithm.

Multidimensional scaling (MDS) is similar to what is done when you make a map. You start with the distances between London, Belfast, Paris and Dublin and you then try to arrange points on a piece of paper so that the (suitably scaled) distances between the points representing each of those cities resemble the true distances between the cities. Stated more mathematically, MDS endeavors to find an isometry between points distributed in a high-dimensional space and a set of points distributed in a low-dimensional plane. In other words, if we have \(M\) \(D\)-dimensional points, \(\mathbf{X}\), and we can calculate dissimilarities between pairs of them, \(D_{ij}\), we can, with an MDS calculation, try to create \(M\) projections, \(\mathbf{x}\), of the high-dimensional points in a \(d\)-dimensional linear space by arranging the projections so that the Euclidean distances between pairs of them, \(d_{ij}\), resemble the dissimilarities between the high-dimensional points. In short we minimize:

\[ \chi^2 = \sum_{i \ne j} \left( D_{ij} - d_{ij} \right)^2 \]

where \(D_{ij}\) is the distance between point \(X^{i}\) and point \(X^{j}\) and \(d_{ij}\) is the distance between the projection of \(X^{i}\), \(x^i\), and the projection of \(X^{j}\), \(x^j\). A tutorial on how this approach can be used to analyze simulations can be found in the lugano-5 tutorial.
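
To make the objective above concrete, here is a minimal NumPy sketch of the stress function; the array names are illustrative and this is not the code that PLUMED uses internally.

import numpy as np

def stress(high_dim_points, low_dim_points):
    # Pairwise Euclidean distances in the high- and low-dimensional spaces.
    D = np.linalg.norm(high_dim_points[:, None, :] - high_dim_points[None, :, :], axis=-1)
    d = np.linalg.norm(low_dim_points[:, None, :] - low_dim_points[None, :, :], axis=-1)
    # chi^2 = sum over pairs of (D_ij - d_ij)^2; the diagonal contributes zero.
    return np.sum((D - d) ** 2)

# Example: ten 5-dimensional points and a random 2-dimensional trial projection.
rng = np.random.default_rng(0)
chi2 = stress(rng.normal(size=(10, 5)), rng.normal(size=(10, 2)))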

Examples

The following command instructs plumed to construct a classical multidimensional scaling projection of a trajectory. The RMSD distance between the positions that atoms 1-256 take in the various stored frames is used to measure the distances in the high-dimensional space.

data: COLLECT_FRAMES ATOMS=1-256
mat: EUCLIDEAN_DISSIMILARITIES USE_OUTPUT_DATA_FROM=data
mds: CLASSICAL_MDS USE_OUTPUT_DATA_FROM=mat NLOW_DIM=2
OUTPUT_ANALYSIS_DATA_TO_COLVAR USE_OUTPUT_DATA_FROM=mds FILE=rmsd-embed

The following section is for people who are interested in how this method works in detail. A solid understanding of this material is not necessary to use MDS.

Method of optimization

The stress function can be minimized using a standard optimization algorithm such as conjugate gradients or steepest descent. However, it is more common to do this minimization using a technique known as classical scaling. Classical scaling works by recognizing that each of the distances \(D_{ij}\) in the above sum can be written as:

\[ D_{ij}^2 = \sum_{\alpha} (X^i_\alpha - X^j_\alpha)^2 = \sum_\alpha \left[ (X^i_\alpha)^2 + (X^j_\alpha)^2 - 2X^i_\alpha X^j_\alpha \right] \]

We can use this expression and matrix algebra to calculate multiple distances at once. For instance if we have three points, \(\mathbf{X}\), we can write distances between them as:

\begin{eqnarray*} \mathbf{D}^2(\mathbf{X}) &=& \left[ \begin{array}{ccc} 0 & D_{12}^2 & D_{13}^2 \\ D_{12}^2 & 0 & D_{23}^2 \\ D_{13}^2 & D_{23}^2 & 0 \end{array}\right] \\ &=& \sum_\alpha \left[ \begin{array}{ccc} (X^1_\alpha)^2 & (X^1_\alpha)^2 & (X^1_\alpha)^2 \\ (X^2_\alpha)^2 & (X^2_\alpha)^2 & (X^2_\alpha)^2 \\ (X^3_\alpha)^2 & (X^3_\alpha)^2 & (X^3_\alpha)^2 \\ \end{array}\right] + \sum_\alpha \left[ \begin{array}{ccc} (X^1_\alpha)^2 & (X^2_\alpha)^2 & (X^3_\alpha)^2 \\ (X^1_\alpha)^2 & (X^2_\alpha)^2 & (X^3_\alpha)^2 \\ (X^1_\alpha)^2 & (X^2_\alpha)^2 & (X^3_\alpha)^2 \\ \end{array}\right] - 2 \sum_\alpha \left[ \begin{array}{ccc} X^1_\alpha X^1_\alpha & X^1_\alpha X^2_\alpha & X^1_\alpha X^3_\alpha \\ X^2_\alpha X^1_\alpha & X^2_\alpha X^2_\alpha & X^2_\alpha X^3_\alpha \\ X^3_\alpha X^1_\alpha & X^3_\alpha X^2_\alpha & X^3_\alpha X^3_\alpha \end{array}\right] \\ &=& \mathbf{c 1^T} + \mathbf{1 c^T} - 2 \sum_\alpha \mathbf{X}_\alpha \mathbf{X}_\alpha^T = \mathbf{c 1^T} + \mathbf{1 c^T} - 2\mathbf{X X^T} \end{eqnarray*}

This last equation can be extended to situations when we have more than three points. In it \(\mathbf{X}\) is a matrix that has one high-dimensional point on each of its rows and \(\mathbf{X^T}\) is its transpose. \(\mathbf{1}\) is an \(M \times 1\) vector of ones and \(\mathbf{c}\) is a vector with components given by:

\[ c_i = \sum_\alpha (X^i_\alpha)^2 \]

These quantities are the diagonal elements of \(\mathbf{X X^T}\), which is a dot product or Gram matrix: element \(i,j\) contains the dot product of the vector \(\mathbf{X}^i\) with the vector \(\mathbf{X}^j\).
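
The identity above is easy to check numerically. The following NumPy sketch (with illustrative variable names, not PLUMED code) builds the matrix of squared distances both directly and from \(\mathbf{c 1^T} + \mathbf{1 c^T} - 2\mathbf{X X^T}\) and confirms that the two agree.

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(3, 4))                 # three 4-dimensional points, one per row

G = X @ X.T                                 # Gram matrix X X^T
c = np.diag(G).reshape(-1, 1)               # c_i = sum_alpha (X^i_alpha)^2
ones = np.ones((3, 1))

D2_identity = c @ ones.T + ones @ c.T - 2.0 * G
D2_direct = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
assert np.allclose(D2_identity, D2_direct)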

In classical scaling we introduce a centering matrix \(\mathbf{J}\) that is given by:

\[ \mathbf{J} = \mathbf{I} - \frac{1}{M} \mathbf{11^T} \]

where \(\mathbf{I}\) is the identity. Multiplying the equations above from the front and back by this matrix and by a factor of \(-\frac{1}{2}\) gives:

\begin{eqnarray*} -\frac{1}{2} \mathbf{J} \mathbf{D}^2(\mathbf{X}) \mathbf{J} &=& -\frac{1}{2}\mathbf{J}( \mathbf{c 1^T} + \mathbf{1 c^T} - 2\mathbf{X X^T})\mathbf{J} \\ &=& -\frac{1}{2}\mathbf{J c 1^T J} - \frac{1}{2} \mathbf{J 1 c^T J} + \frac{1}{2} \mathbf{J}(2\mathbf{X X^T})\mathbf{J} \\ &=& \mathbf{ J X X^T J } = \mathbf{X X^T } \label{eqn:scaling} \end{eqnarray*}

The first two terms in this expression disappear because \(\mathbf{1^T J}=\mathbf{J 1} =\mathbf{0}\), where \(\mathbf{0}\) is a vector of zeros. In the final step, meanwhile, we use the fact that the matrix of squared distances does not change when we translate all the points. We can thus assume that the mean value, \(\mu_\alpha\), of each of the components, \(\alpha\):

\[ \mu_\alpha = \frac{1}{M} \sum_{i=1}^M X^i_\alpha \]

is equal to 0 so the columns of \(\mathbf{X}\) add up to 0. This in turn means that each of the columns of \(\mathbf{X X^T}\) adds up to zero, which is what allows us to write \(\mathbf{ J X X^T J } = \mathbf{X X^T }\).
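
A short NumPy sketch of this double-centering step (again with illustrative names) shows that, once the columns of \(\mathbf{X}\) have been shifted to zero mean, \(-\frac{1}{2}\mathbf{J D^2 J}\) does indeed recover the Gram matrix \(\mathbf{X X^T}\).

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(6, 4))
X -= X.mean(axis=0)                         # each column of X now sums to zero

D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
M = X.shape[0]
J = np.eye(M) - np.ones((M, M)) / M         # centering matrix J = I - (1/M) 1 1^T

B = -0.5 * J @ D2 @ J
assert np.allclose(B, X @ X.T)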

The matrix \(\mathbf{X X^T}\) obtained through this double centering is symmetric and positive semi-definite. We can thus use the spectral decomposition to write it as:

\[ \mathbf{X X^T} = \mathbf{V} \Lambda \mathbf{V}^T \]

Furthermore, because the matrix we are diagonalizing, \(\mathbf{X X^T}\), is the product of a matrix and its transpose, we can use this decomposition to write:

\[ \mathbf{X} =\mathbf{V} \Lambda^\frac{1}{2} \]

Much as in PCA, there are generally a small number of large eigenvalues in \(\Lambda\) and many small eigenvalues. We can thus safely use only the largest \(d\) eigenvalues and their corresponding eigenvectors to approximate \(\mathbf{X}\). This gives us our set of low-dimensional projections.
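
Putting the pieces together, a compact sketch of the whole classical-scaling recipe (double-center the squared distances, diagonalize, keep the largest eigenvalues) might look as follows. This mirrors the derivation above but is only an illustration, not PLUMED's implementation.

import numpy as np

def classical_mds(D2, n_low_dim=2):
    # D2 is an M x M matrix of squared high-dimensional distances.
    M = D2.shape[0]
    J = np.eye(M) - np.ones((M, M)) / M
    B = -0.5 * J @ D2 @ J                   # equals X X^T for centered X
    eigval, eigvec = np.linalg.eigh(B)      # eigenvalues in ascending order
    idx = np.argsort(eigval)[::-1][:n_low_dim]
    lam = np.clip(eigval[idx], 0.0, None)   # guard against tiny negative eigenvalues
    return eigvec[:, idx] * np.sqrt(lam)    # x = V Lambda^(1/2), one projection per row

rng = np.random.default_rng(3)
X = rng.normal(size=(20, 5))
D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
proj = classical_mds(D2, n_low_dim=2)       # shape (20, 2)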

This derivation makes a number of assumptions about how the low-dimensional points should best be arranged to minimize the stress. If you use an iterative optimization algorithm such as SMACOF you may thus be able to find a better (lower-stress) projection of the points.
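
As an illustration of the iterative route, the sketch below applies a few SMACOF-style Guttman-transform updates (unit weights) to refine a starting projection such as the classical-scaling result. It is a textbook toy, not PLUMED's code.

import numpy as np

def smacof_refine(D, x, n_iter=100):
    # D: M x M matrix of high-dimensional distances; x: initial M x d projections.
    M = D.shape[0]
    x = x.copy()
    for _ in range(n_iter):
        d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
        ratio = np.divide(D, d, out=np.zeros_like(D), where=d > 0)
        B = -ratio                          # b_ij = -D_ij / d_ij for i != j
        np.fill_diagonal(B, 0.0)
        np.fill_diagonal(B, -B.sum(axis=1)) # b_ii = sum over j != i of D_ij / d_ij
        x = (B @ x) / M                     # Guttman transform update lowers the stress
    return x

# Typical use: refine the classical-scaling projections computed above.
# proj_refined = smacof_refine(np.sqrt(D2), proj)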

Glossary of keywords and components
Compulsory keywords
ARG the input arguments that you would like to analyze
NLOW_DIM number of low-dimensional coordinates required