LCOV - code coverage report
Current view: top level - clusters - ClusterDistribution.cpp (source / functions) Hit Total Coverage
Test: plumed test coverage Lines: 68 75 90.7 %
Date: 2025-12-04 11:19:34 Functions: 6 9 66.7 %

          Line data    Source code
       1             : /* +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
       2             :    Copyright (c) 2015-2020 The plumed team
       3             :    (see the PEOPLE file at the root of the distribution for a list of names)
       4             : 
       5             :    See http://www.plumed.org for more information.
       6             : 
       7             :    This file is part of plumed, version 2.
       8             : 
       9             :    plumed is free software: you can redistribute it and/or modify
      10             :    it under the terms of the GNU Lesser General Public License as published by
      11             :    the Free Software Foundation, either version 3 of the License, or
      12             :    (at your option) any later version.
      13             : 
      14             :    plumed is distributed in the hope that it will be useful,
      15             :    but WITHOUT ANY WARRANTY; without even the implied warranty of
      16             :    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
      17             :    GNU Lesser General Public License for more details.
      18             : 
      19             :    You should have received a copy of the GNU Lesser General Public License
      20             :    along with plumed.  If not, see <http://www.gnu.org/licenses/>.
      21             : +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ */
      22             : #include "core/ActionWithValue.h"
      23             : #include "core/ActionWithArguments.h"
      24             : #include "core/ActionRegister.h"
      25             : #include "core/ActionShortcut.h"
      26             : #include "core/PlumedMain.h"
      27             : #include "multicolvar/MultiColvarShortcuts.h"
      28             : #include "ClusteringBase.h"
      29             : 
      30             : //+PLUMEDOC CONCOMP CLUSTER_DISTRIBUTION
      31             : /*
      32             : Calculate functions of the distribution of properties in your connected components.
      33             : 
      34             : This action allows you to calculate the number of atoms in each of the connected components that
      35             : were detected when you performed [DFSCLUSTERING](DFSCLUSTERING.md) on one of the adjacency matrices
      36             : computed using the [admat](module_adjmat.md).  The following example illustrates how you can use
      37             : this to compute the number of atoms in each of the identified clusters:
      38             : 
      39             : ```plumed
      40             : cm: CONTACT_MATRIX GROUP=1-100 SWITCH={CUBIC D_0=0.45  D_MAX=0.55}
      41             : dfs: DFSCLUSTERING ARG=cm
      42             : clust: CLUSTER_DISTRIBUTION CLUSTERS=dfs
      43             : PRINT ARG=clust FILE=colvar STRIDE=1
      44             : ```
      45             : 
      46             : The output from the CLUSTER_DISTRIBUTION action here is a vector with 100 elements.  The first element of this vector
      47             : is the number of atoms in the largest connected component, the second element of the vector is the number of atoms in the
      48             : second largest connected component and so on (many elements of the vector will be zero).
      49             : 
      50             : As illustrated in the inputs below there are multiple shortcuts that allow you to probe the distribution of cluster sizes.
      51             : For example, the following input calculates how many clusters with more than 10 atoms are present.
      52             : 
      53             : ```plumed
      54             : cm: CONTACT_MATRIX GROUP=1-100 SWITCH={CUBIC D_0=0.45  D_MAX=0.55}
      55             : dfs: DFSCLUSTERING ARG=cm
      56             : clust: CLUSTER_DISTRIBUTION CLUSTERS=dfs MORE_THAN={RATIONAL D_0=10 R_0=0.0001}
      57             : PRINT ARG=clust.morethan FILE=colvar STRIDE=1
      58             : ```
      59             : 
      60             : By a similar logic you can compute the number of clusters that were identified as follows:
      61             : 
      62             : ```plumed
      63             : cm: CONTACT_MATRIX GROUP=1-100 SWITCH={CUBIC D_0=0.45  D_MAX=0.55}
      64             : dfs: DFSCLUSTERING ARG=cm
      65             : clust: CLUSTER_DISTRIBUTION CLUSTERS=dfs LESS_THAN={RATIONAL R_0=0.0001}
      66             : nc: CUSTOM ARG=clust.lessthan FUNC=100-x PERIODIC=NO
      67             : PRINT ARG=nc FILE=colvar STRIDE=1
      68             : ```
      69             : 
      70             : This input calculates the number of zeros in the vector output by the CLUSTER_DISTRIBUTION action.  If we subtract the number of
      71             : non-zero elements from the size of the vector we then get the number of clusters.
      72             : 
      73             : Lastly, you can calculate the number of clusters that are within a certain range or a set of ranges using the BETWEEN and HISGTOGRAM keywords
      74             : as indicated below:
      75             : 
      76             : ```plumed
      77             : cm: CONTACT_MATRIX GROUP=1-100 SWITCH={CUBIC D_0=0.45  D_MAX=0.55}
      78             : dfs: DFSCLUSTERING ARG=cm
      79             : clust: CLUSTER_DISTRIBUTION ...
      80             :    CLUSTERS=dfs
      81             :    BETWEEN={GAUSSIAN LOWER=5 UPPER=6 SMEAR=0.5}
      82             :    HISTOGRAM={GAUSSIAN LOWER=6 UPPER=10 NBINS=4 SMEAR=0.5}
      83             : ...
      84             : PRINT ARG=clust.* FILE=colvar STRIDE=1
      85             : ```
      86             : 
      87             : This input will output 5 quantities:
      88             : 
      89             : - `clust.between` tells you the number of connected components that contain between 5 and 6 atoms.
      90             : - `clust.between-1` tells you the number of connected components that contain between 6 and 7 atoms.
      91             : - `clust.between-2` tells you the number of connected components that contain between 7 and 8 atoms.
      92             : - `clust.between-3` tells you the number of connected components that contain between 8 and 9 atoms.
      93             : - `clust.between-4` tells you the number of connected components that contain between 9 and 10 atoms.
      94             : 
      95             : If you expand the inputs above you can find more details on how these quantities are calculated.
      96             : 
      97             : The input provided below shows just how this can be used to perform quite complicated calculations.  The input
      98             : calculates the local q6 Steinhardt parameter on each atom.  The coordination number
      99             : that atoms with a high value for the local q6 Steinhardt parameter have with other atoms that have a high
     100             : value for the local q6 Steinhardt parameter is then computed.  A contact matrix is then computed that measures
     101             : whether atoms atoms $i$ and $j$ have a high value for this coordination number and if they are within
     102             : 3.6 nm of each other.  The connected components of this matrix are then found using a depth first clustering
     103             : algorithm on the corresponding graph. The number of components in this graph that contain more than 27 atoms is then computed.
     104             : An input similar to this one was used to analyze the formation of a polycrystal of GeTe from amorphous GeTe in the paper cited below
     105             : 
     106             : ```plumed
     107             : q6: Q6 SPECIES=1-300 SWITCH={GAUSSIAN D_0=5.29 R_0=0.01 D_MAX=5.3}
     108             : lq6: LOCAL_Q6 SPECIES=q6 SWITCH={GAUSSIAN D_0=5.29 R_0=0.01 D_MAX=5.3}
     109             : flq6: MORE_THAN ARG=lq6 SWITCH={GAUSSIAN D_0=0.19 R_0=0.01 D_MAX=0.2}
     110             : cc: COORDINATIONNUMBER SPECIES=1-300 SWITCH={GAUSSIAN D_0=3.59 R_0=0.01 D_MAX=3.6}
     111             : fcc: MORE_THAN ARG=cc SWITCH={GAUSSIAN D_0=5.99 R_0=0.01 D_MAX=6.0}
     112             : mat: CONTACT_MATRIX GROUP=1-300 SWITCH={GAUSSIAN D_0=3.59 R_0=0.01 D_MAX=3.6}
     113             : dfs: DFSCLUSTERING ARG=mat
     114             : nclust: CLUSTER_DISTRIBUTION ...
     115             :    CLUSTERS=dfs WEIGHTS=fcc
     116             :    MORE_THAN={GAUSSIAN D_0=26.99 R_0=0.01 D_MAX=27}
     117             : ...
     118             : PRINT ARG=nclust.* FILE=colvar
     119             : ```
     120             : 
     121             : Notice how the WEIGHTS keyword is used here in the input to CLUSTER_DISTRIBUTION. By using this keyword here we ensure that the size of each
     122             : cluster is the sum of the components of the vector `fcc` that are part of the connected component.  The `size` of each cluster that is output
     123             : is thus the number of atoms in that cluster that have a COORDINATIONNUMBER that is greater than 4.
     124             : 
     125             : */
     126             : //+ENDPLUMEDOC
     127             : 
     128             : namespace PLMD {
     129             : namespace clusters {
     130             : 
     131             : class ClusterDistribution :
     132             :   public ActionWithArguments,
     133             :   public ActionWithValue {
     134             : public:
     135             : /// Create manual
     136             :   static void registerKeywords( Keywords& keys );
     137             : /// Constructor
     138             :   explicit ClusterDistribution(const ActionOptions&);
     139             : /// The number of derivatives
     140             :   unsigned getNumberOfDerivatives() override ;
     141             : /// Do the calculation
     142             :   void calculate() override;
     143             : /// We can use ActionWithVessel to run all the calculation
     144           2 :   void apply() override {}
     145             : };
     146             : 
     147             : PLUMED_REGISTER_ACTION(ClusterDistribution,"CLUSTER_DISTRIBUTION_CALC")
     148             : 
     149           6 : void ClusterDistribution::registerKeywords( Keywords& keys ) {
     150           6 :   Action::registerKeywords( keys );
     151           6 :   ActionWithArguments::registerKeywords( keys );
     152           6 :   ActionWithValue::registerKeywords( keys );
     153           6 :   keys.setDisplayName("CLUSTER_DISTRIBUTION");
     154           6 :   keys.remove("NUMERICAL_DERIVATIVES");
     155          12 :   keys.addInputKeyword("compulsory","CLUSTERS","vector","the label of the action that does the clustering");
     156          12 :   keys.addInputKeyword("optional","WEIGHTS","vector","use the vector of values calculated by this action as weights rather than giving each atom a unit weight");
     157          12 :   keys.setValueDescription("vector","a vector containing the sum of a atomic-cv that is calculated for each of the identified clusters");
     158           6 :   keys.addDOI("https://doi.org/10.1021/acs.jctc.6b01073");
     159           6 : }
     160             : 
     161           2 : ClusterDistribution::ClusterDistribution(const ActionOptions&ao):
     162             :   Action(ao),
     163             :   ActionWithArguments(ao),
     164           2 :   ActionWithValue(ao) {
     165             :   // Read in the clustering object
     166             :   std::vector<Value*> clusters;
     167           4 :   parseArgumentList("CLUSTERS",clusters);
     168           2 :   if( clusters.size()!=1 ) {
     169           0 :     error("should pass only one matrix to clustering base");
     170             :   }
     171           2 :   ClusteringBase* cc = dynamic_cast<ClusteringBase*>( clusters[0]->getPntrToAction() );
     172           2 :   if( !cc ) {
     173           0 :     error("input to CLUSTERS keyword should be a clustering action");
     174             :   }
     175             :   std::vector<Value*> weights;
     176           4 :   parseArgumentList("WEIGHTS",weights);
     177           2 :   if( weights.size()==0 ) {
     178           1 :     log.printf("  using unit weights in cluster distribution \n");
     179           1 :   } else if( weights.size()==1 ) {
     180           1 :     if( weights[0]->getRank()!=1 ) {
     181           0 :       error("input weights has wrong shape");
     182             :     }
     183           1 :     if( weights[0]->getShape()[0]!=clusters[0]->getShape()[0] ) {
     184           0 :       error("mismatch between number of weights and number of atoms");
     185             :     }
     186           1 :     log.printf("  using weights from action with label %s in cluster distribution \n", weights[0]->getName().c_str() );
     187           1 :     clusters.push_back( weights[0] );
     188             :   } else {
     189           0 :     error("should have only one argument for weights \n");
     190             :   }
     191             :   // Request the arguments
     192           2 :   requestArguments( clusters );
     193             :   // Now create the value
     194           2 :   std::vector<std::size_t> shape(1);
     195           2 :   shape[0]=clusters[0]->getShape()[0];
     196           2 :   addValue( shape );
     197           2 :   setNotPeriodic();
     198           2 : }
     199             : 
     200           0 : unsigned ClusterDistribution::getNumberOfDerivatives() {
     201           0 :   return 0;
     202             : }
     203             : 
     204           2 : void ClusterDistribution::calculate() {
     205           2 :   plumed_assert( getPntrToArgument(0)->valueHasBeenSet() );
     206           2 :   if( getNumberOfArguments()>1 ) {
     207           1 :     plumed_assert( getPntrToArgument(1)->valueHasBeenSet() );
     208             :   }
     209           2 :   double csize = getPntrToArgument(0)->get(0);
     210          12 :   for(unsigned i=1; i<getPntrToArgument(0)->getShape()[0]; ++i) {
     211          10 :     if( getPntrToArgument(0)->get(i)>csize ) {
     212           4 :       csize = getPntrToArgument(0)->get(i);
     213             :     }
     214             :   }
     215           2 :   unsigned ntasks = static_cast<unsigned>( csize );
     216           8 :   for(unsigned i=0; i<ntasks; ++i) {
     217          42 :     for(unsigned j=0; j<getPntrToArgument(0)->getShape()[0]; ++j) {
     218          36 :       if( fabs(getPntrToArgument(0)->get(j)-i)<epsilon ) {
     219          10 :         if( getNumberOfArguments()==2 ) {
     220           5 :           getPntrToValue()->add( i, getPntrToArgument(1)->get(j) );
     221             :         } else {
     222           5 :           getPntrToValue()->add( i, 1.0 );
     223             :         }
     224             :       }
     225             :     }
     226             :   }
     227           2 : }
     228             : 
     229             : class ClusterDistributionShortcut : public ActionShortcut {
     230             : public:
     231             :   static void registerKeywords( Keywords& keys );
     232             :   ClusterDistributionShortcut(const ActionOptions&);
     233             : };
     234             : 
     235             : PLUMED_REGISTER_ACTION(ClusterDistributionShortcut,"CLUSTER_DISTRIBUTION")
     236             : 
     237           6 : void ClusterDistributionShortcut::registerKeywords( Keywords& keys ) {
     238           6 :   ActionShortcut::registerKeywords( keys );
     239           6 :   keys.add("compulsory","CLUSTERS","the label of the action that does the clustering");
     240           6 :   keys.add("optional","WEIGHTS","use the vector of values calculated by this action as weights rather than giving each atom a unit weight");
     241           6 :   multicolvar::MultiColvarShortcuts::shortcutKeywords( keys );
     242          12 :   keys.reset_style("MIN","hidden");
     243          12 :   keys.reset_style("MAX","hidden");
     244          12 :   keys.reset_style("ALT_MIN","hidden");
     245          12 :   keys.reset_style("HIGHEST","hidden");
     246          12 :   keys.reset_style("LOWEST","hidden");
     247          12 :   keys.reset_style("MEAN","hidden");
     248          12 :   keys.reset_style("SUM","hidden");
     249           6 :   keys.addActionNameSuffix("_CALC");
     250           6 : }
     251             : 
     252           2 : ClusterDistributionShortcut::ClusterDistributionShortcut(const ActionOptions&ao):
     253             :   Action(ao),
     254           2 :   ActionShortcut(ao) {
     255             :   std::map<std::string,std::string> keymap;
     256           2 :   multicolvar::MultiColvarShortcuts::readShortcutKeywords( keymap, this );
     257           4 :   readInputLine( getShortcutLabel() + ": CLUSTER_DISTRIBUTION_CALC " + convertInputLineToString() );
     258           4 :   multicolvar::MultiColvarShortcuts::expandFunctions( getShortcutLabel(),  getShortcutLabel(),  "", keymap, this );
     259           2 : }
     260             : 
     261             : }
     262             : }

Generated by: LCOV version 1.16