Making lepton library faster

If you use many CUSTOM functions or switching functions, note that these commands depend on the Lepton library included in PLUMED. This library has replaced libmatheval since PLUMED 2.5 and is by itself significantly faster than libmatheval. However, you can make it even faster using a just-in-time compiler. This is currently an experimental feature, so use it with care.

To enable it, first install asmjit:

git clone https://github.com/asmjit/asmjit.git
cd asmjit
git checkout 673dcefa # this is a specific version
mkdir build
cd build
cmake -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX=$prefix ../
make -j 4
make install

Set the prefix correctly so that PLUMED can find asmjit at configure time. The example installs asmjit under $prefix (for instance /usr/local), but you may prefer to install it somewhere else. On a Mac, you might also have to use install_name_tool to fix its path:

install_name_tool -id $prefix/lib/libasmjit.dylib $prefix/lib/libasmjit.dylib

Also note that a specific version of asmjit is required. The version supported by PLUMED is more recent than the version originally supported by the Lepton library. If you run into trouble and want to experiment with older versions, write to the mailing list or check the Lepton implementation for older asmjit releases by running cd src/lepton; gitk. If a more recent version of the asmjit library is already installed on your system, you might have to make sure that PLUMED finds the correct version, both at compilation and at run time.
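One common way to make a specific asmjit copy take precedence is to point the compiler, the linker, and the dynamic loader at its install prefix before configuring. The following is only a sketch: it assumes the standard autoconf variables CPPFLAGS and LDFLAGS are honored by configure, that headers and libraries live in include and lib under the prefix, and it uses $HOME/opt as a hypothetical default location.

```shell
# Hypothetical install location; use the same value that was passed
# to CMAKE_INSTALL_PREFIX when building asmjit.
prefix="${prefix:-$HOME/opt}"

# Compile time: search this copy's headers and libraries first,
# keeping any flags that were already set.
export CPPFLAGS="-I$prefix/include${CPPFLAGS:+ $CPPFLAGS}"
export LDFLAGS="-L$prefix/lib${LDFLAGS:+ $LDFLAGS}"

# Run time (Linux): let the dynamic loader find the same copy.
# On a Mac the analogous variable is DYLD_LIBRARY_PATH.
export LD_LIBRARY_PATH="$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Prepending rather than overwriting keeps any paths that other dependencies already rely on.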

Then, configure PLUMED using this additional flag:

./configure --enable-asmjit
make
make install

You are done!

In some cases, using a custom expression is almost as fast as using a hard-coded function. For instance, with an input that contained the following lines:

c: COORDINATION GROUPA=1-108 GROUPB=1-108 R_0=1
d_fast: COORDINATION GROUPA=1-108 GROUPB=1-108 SWITCH={CUSTOM FUNC=1/(1+x2^3) R_0=1}

I (GB) obtained the following timings (on a Macbook laptop):

...
PLUMED: 4A  1 c                                          108     0.126592     0.001172     0.000701     0.002532
PLUMED: 4A  2 d_fast                                      108     0.135210     0.001252     0.000755     0.002623
...
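As a quick check of the relative cost, the totals in the two timing lines can be compared directly. The sketch below assumes that the sixth whitespace-separated field is the total time (consistent with the seventh field being that total divided by the 108 cycles); the variable names are illustrative.

```shell
# The two timing lines reported above, copied verbatim.
timings='PLUMED: 4A  1 c                                          108     0.126592     0.001172     0.000701     0.002532
PLUMED: 4A  2 d_fast                                      108     0.135210     0.001252     0.000755     0.002623'

# Field 4 is the action label; field 6 is assumed to be the total time.
overhead=$(printf '%s\n' "$timings" | awk '
  { total[$4] = $6 }
  END { printf "%.1f%%", (total["d_fast"] / total["c"] - 1) * 100 }
')

echo "d_fast overhead relative to c: $overhead"
```

For these two lines the computed overhead is 6.8%.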

Notice the use of x2 as the variable of the switching function (see switchingfunction), which avoids an unnecessary square root calculation (the hard-coded switching functions do this automatically when only even powers are used). The asmjit calculation (d_fast) takes less than 10% more than the hard-coded one (c): about 7% in total time, 0.135210 versus 0.126592.
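The reason the two COORDINATION lines compute the same quantity can be written out explicitly. This derivation assumes that COORDINATION, when given only R_0, uses its rational switching function with exponents 6 and 12 and no offset (these defaults are not stated in the text above), and writes x = r/R_0 and x2 = x^2:

```latex
s(r) = \frac{1 - x^{6}}{1 - x^{12}}
     = \frac{1 - x^{6}}{(1 - x^{6})(1 + x^{6})}
     = \frac{1}{1 + x^{6}}
     = \frac{1}{1 + (x^{2})^{3}}
```

Under that assumption the last form is exactly the custom expression 1/(1+x2^3), and it needs only the squared distance, not the distance itself.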