Command line tool: benchmark

Module	cltools
Description	Input
run a calculation with a fixed trajectory to find bottlenecks in PLUMED	command line args

Details

benchmark is a lightweight reimplementation of driver that can be used to run benchmark calculations.

The main difference between driver and benchmark is that benchmark generates a trajectory in memory rather than reading a trajectory from a file. This approach is better for timing the overhead of the plumed library: if you do similar benchmarking with driver, the timings you get are dominated by the I/O operations required to read the trajectory.

Basic usage

To use benchmark, first create a sample plumed.dat file for testing. For example:

WHOLEMOLECULES ENTITY0=1-10000
p: POSITION ATOM=1
RESTRAINT ARG=p.x KAPPA=1 AT=0

You can then run this benchmark using the following command:

plumed benchmark

Written out with the default values of its options, the command is:

plumed benchmark --plumed plumed.dat --kernel this --natoms 100000 --nsteps 2000 --sleep 0 --atom-distribution line --repeatX 1 --repeatY 1 --repeatZ 1

Notice that benchmark will read an input file called plumed.dat by default. You can specify a different name for your PLUMED input file by using the --plumed flag.

Running with a different PLUMED version

If you want to run a benchmark against a previous plumed version in a controlled setting you can do so by using the command:

plumed-runtime benchmark --kernel /path/to/lib/libplumedKernel.so

If you use this command, the version of PLUMED that is in your environment calls the version of the library that is specified using the --kernel flag. Running the benchmark in this way ensures that you are running in a controlled setting, where systematic errors in the comparison are minimized.

Using plumed-runtime

You use the plumed-runtime executable here to avoid conflicts between different plumed versions. You will find the plumed-runtime executable in your path if you are using the non-installed version of plumed, and in $prefix/lib/plumed if you installed plumed in $prefix.

Comparing multiple versions

The best way to compare two versions of plumed on the same input is to pass multiple colon-separated kernels as shown below:

plumed-runtime benchmark --kernel /path/to/lib/libplumedKernel.so:/path2/to/lib/libplumedKernel.so:this

Here this means the kernel of the version with which you are running the benchmark. This comparison runs the three instances simultaneously (alternating them) so that systematic differences in the load of your machine will affect them to the same extent.
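The reason alternating helps can be illustrated with a small Python sketch. It is not how benchmark is implemented; the function name run_alternating and the dictionary of workloads are hypothetical, chosen only to show that interleaving the steps of each instance exposes all of them to the same fluctuations in machine load.

```python
import time


def run_alternating(workloads, nsteps):
    """Time several workloads step by step, interleaving them.

    `workloads` maps a label to a zero-argument callable that performs one
    step. Illustrative only: this is not PLUMED's implementation.
    """
    timings = {name: [] for name in workloads}
    for _ in range(nsteps):
        # Within each step every workload runs once, one after the other,
        # so a temporary slowdown of the machine affects all of them.
        for name, step in workloads.items():
            start = time.perf_counter()
            step()
            timings[name].append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    result = run_alternating(
        {"kernel A": lambda: sum(range(1000)),
         "kernel B": lambda: sum(range(2000))},
        nsteps=5,
    )
    for name, times in result.items():
        print(name, sum(times))
```

Running one workload to completion before starting the next would instead attribute any change in machine load during the run to a single instance.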

In case the different versions require modified plumed.dat files, or if you simply want to compare two different plumed input files that compute the same thing, you can also use multiple plumed input files:

plumed-runtime benchmark --kernel /path/to/lib/libplumedKernel.so:this --plumed plumed1.dat:plumed2.dat

Similarly, you might want to run two different inputs using the same kernel by using an input like this:

plumed-runtime benchmark --plumed plumed1.dat:plumed2.dat
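If you script many comparisons, it can be convenient to assemble the colon-separated arguments programmatically. The following Python sketch does this; the helper name benchmark_command is hypothetical, and it only uses the --kernel and --plumed flags described on this page.

```python
import shlex


def benchmark_command(kernels=(), inputs=(), executable="plumed-runtime"):
    """Build the argument list for a benchmark comparison.

    `kernels` and `inputs` are sequences of paths that are joined with
    colons, as the --kernel and --plumed flags expect. A flag is omitted
    when its sequence is empty, so the tool falls back to its default.
    """
    argv = [executable, "benchmark"]
    if kernels:
        argv += ["--kernel", ":".join(kernels)]
    if inputs:
        argv += ["--plumed", ":".join(inputs)]
    return argv


if __name__ == "__main__":
    argv = benchmark_command(
        kernels=["/path/to/lib/libplumedKernel.so", "this"],
        inputs=["plumed1.dat", "plumed2.dat"],
    )
    # Print the command as it would be typed in a shell.
    print(shlex.join(argv))
```

The returned list can be passed directly to subprocess.run if you want to launch the benchmark from Python.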

Profiling

If you want to attach a profiler to the process on the fly, you might find it convenient to use --nsteps -1. This option ensures that the simulation runs forever unless interrupted with CTRL-C. When interrupted, the result of the timers should be displayed anyway. You can also set a maximum time for the calculation by using the --maxtime flag.

If you run a profiler when testing multiple PLUMED versions, it can be difficult to determine which version each function comes from. We therefore recommend that you recompile the separate PLUMED instances with separate C++ namespaces (-DPLMD=PLUMED_version_1) so that you will be able to distinguish them. In addition, compiling with CXXFLAGS="-g -O3" will make the profiling report more complete and will likely highlight lines of code that are particularly computationally demanding.

MPI runs

You can also run a benchmark that emulates a domain decomposition if plumed has been compiled with MPI and you run with mpirun and a command like the one shown below:

mpirun -np 4 plumed-runtime benchmark

If you load separate PLUMED instances as discussed above, they should all be compiled against the same MPI version. Notice that, when using MPI, signals (CTRL-C) might not work.

Since some of the data transfer could happen asynchronously, you might want to use the --sleep option to simulate a lag between the prepareCalc and performCalc actions. This part of the calculation will not contribute to the output timings, but will obviously slow down your test.

Output

In the output you will see the usual reports about timings produced by the internal timers of the tested plumed instances.

In addition, this tool monitors the timing externally, using slightly different criteria:

First, the initialization (construction of the input) will be shown with a separate timer, as well as the timing for the first step.
Second, the timer corresponding to the calculation will be split in three parts, reporting execution of the first 20% (warm-up) and the next two blocks of 40% each.
Finally, you might notice some discrepancies because some of the actions that are usually not expensive are not included in the internal timers. The external timer will thus provide a better estimate of the total elapsed time that includes everything.
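
The split described in the second point can be sketched in Python as follows. This is an illustration only: the function name split_into_blocks is hypothetical, and how benchmark rounds the block boundaries is an assumption made here for simplicity.

```python
def split_into_blocks(step_times):
    """Sum per-step timings into a warm-up block and two equal blocks.

    The first 20% of the steps form the warm-up; the remaining steps are
    divided into two halves (40% of the total each). The integer rounding
    used for the boundaries is an assumption of this sketch.
    """
    n = len(step_times)
    warmup_end = n // 5
    first_end = warmup_end + (n - warmup_end) // 2
    blocks = {
        "warm-up": step_times[:warmup_end],
        "first block": step_times[warmup_end:first_end],
        "second block": step_times[first_end:],
    }
    return {name: sum(times) for name, times in blocks.items()}


if __name__ == "__main__":
    # With the default of 2000 steps the blocks hold 400, 800 and 800 steps.
    print(split_into_blocks([0.001] * 2000))
```

Comparing the two 40% blocks gives a quick check of whether the timings have stabilized after the warm-up.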

The internal timers are still useful to monitor what happens at the different stages of the calculation. If you want more detailed information, you can also use a DEBUG action with the DETAILED_TIMERS flag to determine how much time is spent in each action.

When you run multiple versions, a comparative analysis of the time spent within PLUMED in the various instances is performed. For each PLUMED instance you run, this analysis reports the ratio between the total time that instance ran for and the total time the first PLUMED instance ran for. In other words, the total time of the first PLUMED instance is used as the basis for comparison. Errors on these estimates are calculated using bootstrapping, and the warm-up phase is discarded in the analysis.
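
To make the idea concrete, here is a minimal Python sketch of a bootstrap estimate for such a ratio. It is not the code used by benchmark: the function name bootstrap_ratio, the number of resamples, and the use of the standard deviation over resamples as the error are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_ratio(reference, other, n_resamples=1000,
                    warmup_fraction=0.2, seed=0):
    """Return (ratio, error) for total time of `other` over `reference`.

    Both arguments are lists of per-step timings. The leading
    `warmup_fraction` of each list is discarded before the analysis.
    """
    ref = reference[int(len(reference) * warmup_fraction):]
    oth = other[int(len(other) * warmup_fraction):]
    ratio = sum(oth) / sum(ref)

    rng = random.Random(seed)
    resampled_ratios = []
    for _ in range(n_resamples):
        # Resample the per-step timings with replacement and recompute
        # the ratio of total times on the resampled data.
        ref_sample = rng.choices(ref, k=len(ref))
        oth_sample = rng.choices(oth, k=len(oth))
        resampled_ratios.append(sum(oth_sample) / sum(ref_sample))

    # The spread of the resampled ratios estimates the error on the ratio.
    return ratio, statistics.stdev(resampled_ratios)


if __name__ == "__main__":
    rng = random.Random(1)
    first = [1.0 + 0.1 * rng.random() for _ in range(2000)]
    second = [1.5 + 0.1 * rng.random() for _ in range(2000)]
    print(bootstrap_ratio(first, second))
```

A ratio above 1 means the instance is slower than the first one, and the error indicates whether the difference is larger than the step-to-step noise.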

Syntax

The following table describes the command line options that are available for this tool.

Keyword	Description
--help/-h	print this help
--plumed	colon separated path(s) to the input file(s)
--kernel	colon separated path(s) to kernel(s)
--natoms	the number of atoms to use for the simulation
--nsteps	number of steps of MD to perform (-1 means forever)
--maxtime	maximum number of seconds (-1 means forever)
--sleep	number of seconds of sleep, mimicking MD calculation
--atom-distribution	the kind of possible atomic displacement at each step
--dump-trajectory	dump the trajectory to this file
--domain-decomposition	simulate domain decomposition, implies --shuffled
--shuffled	reshuffle atoms
--ixyz	the trajectory in xyz format
--igro	the trajectory in gro format
--idlp4	the trajectory in DL_POLY_4 format
--ixtc	the trajectory in xtc format (xdrfile implementation)
--itrr	the trajectory in trr format (xdrfile implementation)
--mf_dcd	molfile: the trajectory in dcd format
--mf_crd	molfile: the trajectory in crd format
--mf_crdbox	molfile: the trajectory in crdbox format
--mf_gro	molfile: the trajectory in gro format
--mf_g96	molfile: the trajectory in g96 format
--mf_trr	molfile: the trajectory in trr format
--mf_trj	molfile: the trajectory in trj format
--mf_xtc	molfile: the trajectory in xtc format
--mf_pdb	molfile: the trajectory in pdb format
--repeatX	number of times to align the read trajectory along the first box component, ignored with an atomic distribution
--repeatY	number of times to align the read trajectory along the second box component, ignored with an atomic distribution
--repeatZ	number of times to align the read trajectory along the third box component, ignored with an atomic distribution
