This is part of the bias module

Calculate the weights for configurations using the weighted histogram analysis method.

Suppose that you have run multiple \(N\) trajectories each of which was computed by integrating a different biased Hamiltonian. We can calculate the probability of observing the set of configurations during the \(N\) trajectories that we ran using the product of multinomial distributions shown below:

\[ P( \vec{T} ) \propto \prod_{j=1}^M \prod_{k=1}^N (c_k w_{kj} p_j)^{t_{kj}} \label{eqn:wham1} \]

In this expression the second product runs over the biases that were used when calculating the \(N\) trajectories. The first product then runs over the \(M\) bins in our histogram. The \(p_j\) variable that is inside this product is the quantity we wish to extract; namely, the unbiased probability of having a set of CV values that lie within the range for the \(j\)th bin.

The quantity that we can easily extract from our simulations, \(t_{kj}\), however, measures the number of frames from trajectory \(k\) that are inside the \(j\)th bin. To interpret this quantity we must consider the bias that acts on each of the replicas so the \(w_{kj}\) term is introduced. This quantity is calculated as:

\[ w_{kj} = \]

and is essentially the factor that we have to multiply the unbiased probability of being in the bin by in order to get the probability that we would be inside this same bin in the \(k\)th of our biased simulations. Obviously, these \(w_{kj}\) values depend on the value that the CVs take and also on the particular trajectory that we are investigating all of which, remember, have different simulation biases. Finally, \(c_k\) is a free parameter that ensures that, for each \(k\), the biased probability is normalized.

We can use the equation for the probability that was given above to find a set of values for \(p_j\) that maximizes the likelihood of observing the trajectory. This constrained optimization must be performed using a set of Lagrange multipliers, \(\lambda_k\), that ensure that each of the biased probability distributions are normalized, that is \(\sum_j c_kw_{kj}p_j=1\). Furthermore, as the problem is made easier if the quantity in the equation above is replaced by its logarithm we actually chose to minimize

\[ \mathcal{L}= \sum_{j=1}^M \sum_{k=1}^N t_{kj} \ln c_k w_{kj} p_j + \sum_k\lambda_k \left( \sum_{j=1}^N c_k w_{kj} p_j - 1 \right) \]

After some manipulations the following (WHAM) equations emerge:

\[ \begin{aligned} p_j & \propto \frac{\sum_{k=1}^N t_{kj}}{\sum_k c_k w_{kj}} \\ c_k & =\frac{1}{\sum_{j=1}^M w_{kj} p_j} \end{aligned} \]

which can be solved by computing the \(p_j\) values using the first of the two equations above with an initial guess for the \(c_k\) values and by then refining these \(p_j\) values using the \(c_k\) values that are obtained by inserting the \(p_j\) values obtained into the second of the two equations above.

Notice that only \(\sum_k t_{kj}\), which is the total number of configurations from all the replicas that enter the \(j\)th bin, enters the WHAM equations above. There is thus no need to record which replica generated each of the frames. One can thus simply gather the trajectories from all the replicas together at the outset. This observation is important as it is the basis of the binless formulation of WHAM that is implemented within PLUMED.

Glossary of keywords and components
Compulsory keywords
ARG ( default=*.bias ) the biases that must be taken into account when reweighting
MAXITER ( default=1000 ) maximum number of iterations for WHAM algorithm
WHAMTOL ( default=1e-10 ) threshold for convergence of WHAM algorithm
TEMP the system temperature. This is not required if your MD code passes this quantity to PLUMED