Line data Source code
1 : /* +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 : Copyright (c) 2011-2023 The plumed team
3 : (see the PEOPLE file at the root of the distribution for a list of names)
4 :
5 : See http://www.plumed.org for more information.
6 :
7 : This file is part of plumed, version 2.
8 :
9 : plumed is free software: you can redistribute it and/or modify
10 : it under the terms of the GNU Lesser General Public License as published by
11 : the Free Software Foundation, either version 3 of the License, or
12 : (at your option) any later version.
13 :
14 : plumed is distributed in the hope that it will be useful,
15 : but WITHOUT ANY WARRANTY; without even the implied warranty of
16 : MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
17 : GNU Lesser General Public License for more details.
18 :
19 : You should have received a copy of the GNU Lesser General Public License
20 : along with plumed. If not, see <http://www.gnu.org/licenses/>.
21 : +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ */
22 : #include "PDB.h"
23 : #include "Tools.h"
24 : #include "Log.h"
25 : #include "h36.h"
26 : #include <cstdio>
27 : #include <iostream>
28 : #include "core/GenericMolInfo.h"
29 : #include "Tensor.h"
30 :
31 : //+PLUMEDOC INTERNAL pdbreader
32 : /*
33 : PLUMED can use the PDB format in several places
34 :
35 : - To read molecular structure (\ref MOLINFO).
36 : - To read reference conformations (\ref RMSD, but also many other methods in \ref dists, \ref FIT_TO_TEMPLATE, etc).
37 :
38 : The implemented PDB reader expects a file formatted correctly according to the
39 : [PDB standard](http://www.wwpdb.org/documentation/file-format-content/format33/v3.3.html).
40 : In particular, the following columns are read from ATOM records
41 : \verbatim
42 : columns | content
43 : 1-6 | record name (ATOM or HETATM)
44 : 7-11 | serial number of the atom (starting from 1)
45 : 13-16 | atom name
46 : 18-20 | residue name
47 : 22 | chain id
48 : 23-26 | residue number
49 : 31-38 | x coordinate
50 : 39-46 | y coordinate
51 : 47-54 | z coordinate
52 : 55-60 | occupancy
53 : 61-66 | beta factor
54 : \endverbatim
55 : PLUMED parser is slightly more permissive than the official PDB format
56 : in the fact that the format of real numbers is not fixed. In other words,
57 : any real number that can be parsed is OK and the dot can be placed anywhere. However,
58 : __columns are interpret strictly__. A sample PDB should look like the following
59 : \verbatim
60 : ATOM 2 CH3 ACE 1 12.932 -14.718 -6.016 1.00 1.00
61 : ATOM 5 C ACE 1 21.312 -9.928 -5.946 1.00 1.00
62 : ATOM 9 CA ALA 2 19.462 -11.088 -8.986 1.00 1.00
63 : \endverbatim
64 :
65 : Notice that serial numbers need not to be consecutive. In the three-line example above,
66 : only the coordinates of three atoms are provided. This is perfectly legal and indicates PLUMED
67 : that information about these atoms only is available. This could be both for structural
68 : information in \ref MOLINFO, where the other atoms would have no name assigned, and for
69 : reference structures used in \ref RMSD, where only the provided atoms would be used to compute RMSD.
70 :
71 : \par Occupancy and beta factors
72 :
73 : PLUMED reads also occupancy and beta factors that however are given a very special meaning.
74 : In cases where the PDB structure is used as a reference for an alignment (that's the case
75 : for instance in \ref RMSD and in \ref FIT_TO_TEMPLATE), the occupancy column is used
76 : to provide the weight of each atom in the alignment. In cases where, perhaps after alignment,
77 : the displacement between running coordinates and the provided PDB is computed, the beta factors
78 : are used as weight for the displacement.
79 : Since setting the weights to zero is the same as __not__ including an atom in the alignment or
80 : displacement calculation, the two following reference files would be equivalent when used in an \ref RMSD
81 : calculation. First file:
82 : \verbatim
83 : ATOM 2 CH3 ACE 1 12.932 -14.718 -6.016 1.00 1.00
84 : ATOM 5 C ACE 1 21.312 -9.928 -5.946 1.00 1.00
85 : ATOM 9 CA ALA 2 19.462 -11.088 -8.986 0.00 0.00
86 : \endverbatim
87 : Second file:
88 : \verbatim
89 : ATOM 2 CH3 ACE 1 12.932 -14.718 -6.016 1.00 1.00
90 : ATOM 5 C ACE 1 21.312 -9.928 -5.946 1.00 1.00
91 : \endverbatim
92 : However notice that many extra atoms with zero weight might slow down the calculation, so
93 : removing lines is better than setting their weights to zero.
94 : In addition, weights for alignment need not to be equivalent to weights for displacement.
95 : Starting with PLUMED 2.7, if all the weights are set to zero they will be normalized to be equal to the
96 : inverse of the number of involved atoms. This means that it will be possible to use files with
97 : the weight columns set to zero obtaining a meaningful result. In previous PLUMED versions,
98 : setting all weights to zero was resulting in an error instead.
99 :
100 :
101 : \par Systems with more than 100k atoms
102 :
103 : Notice that it very likely does not make any sense to compute the \ref RMSD or any other structural
104 : deviation __using__ so many atoms. However, if the protein for which you want to compute \ref RMSD
105 : has atoms with large serial numbers (e.g. because it is located __after__ solvent in the sorted list of atoms)
106 : you might end up with troubles with the limitations of the PDB format. Indeed, since there are 5
107 : columns available for atom serial number, this number cannot be larger than 99999.
108 : In addition, providing \ref MOLINFO with names associated to atoms with a serial larger than 99999 would be impossible.
109 :
110 : Since PLUMED 2.4 we allow [hybrid 36](http://cci.lbl.gov/hybrid_36/) format to be used to specify atom numbers.
111 : This format is not particularly widespread, but has the nice feature that it provides a one-to-one mapping
112 : between numbers up to approximately 80 millions and strings with 5 characters, plus it is backward compatible
113 : for numbers smaller than 100000. This is not true for notations like the hex notation exported by VMD.
114 : Using the hybrid 36 format, the ATOM records for atom ranging from 99997 to 100002 would read like these:
115 : \verbatim
116 : ATOM 99997 Ar X 1 45.349 38.631 15.116 1.00 1.00
117 : ATOM 99998 Ar X 1 46.189 38.631 15.956 1.00 1.00
118 : ATOM 99999 Ar X 1 46.189 39.471 15.116 1.00 1.00
119 : ATOM A0000 Ar X 1 45.349 39.471 15.956 1.00 1.00
120 : ATOM A0000 Ar X 1 45.349 38.631 16.796 1.00 1.00
121 : ATOM A0001 Ar X 1 46.189 38.631 17.636 1.00 1.00
122 : \endverbatim
123 : There are tools that can be found to translate from integers to strings and back using hybrid 36 format
124 : (a simple python script can be found [here](https://sourceforge.net/p/cctbx/code/HEAD/tree/trunk/iotbx/pdb/hybrid_36.py)).
125 : In addition, as of PLUMED 2.5, we provide a \ref pdbrenumber "command line tool" that can be used to renumber atoms in a PDB file.
126 :
127 : */
128 : //+ENDPLUMEDOC
129 :
130 :
131 : namespace PLMD {
132 :
133 27 : void PDB::setAtomNumbers( const std::vector<AtomNumber>& atoms ) {
134 27 : positions.resize( atoms.size() );
135 27 : occupancy.resize( atoms.size() );
136 27 : beta.resize( atoms.size() );
137 27 : numbers.resize( atoms.size() );
138 210 : for(unsigned i=0; i<atoms.size(); ++i) {
139 183 : numbers[i]=atoms[i];
140 183 : beta[i]=1.0;
141 183 : occupancy[i]=1.0;
142 : }
143 27 : }
144 :
145 34 : void PDB::setArgumentNames( const std::vector<std::string>& argument_names ) {
146 34 : argnames.resize( argument_names.size() );
147 98 : for(unsigned i=0; i<argument_names.size(); ++i) {
148 : argnames[i]=argument_names[i];
149 64 : arg_data.insert( std::pair<std::string,double>( argnames[i], 0.0 ) );
150 : }
151 34 : }
152 :
153 1504260 : bool PDB::getArgumentValue( const std::string& name, double& value ) const {
154 : std::map<std::string,double>::const_iterator it = arg_data.find(name);
155 1504260 : if( it!=arg_data.end() ) {
156 1503824 : value = it->second;
157 1503824 : return true;
158 : }
159 : return false;
160 : }
161 :
162 493287 : void PDB::setAtomPositions( const std::vector<Vector>& pos ) {
163 493287 : plumed_assert( pos.size()==positions.size() );
164 519464 : for(unsigned i=0; i<positions.size(); ++i) {
165 26177 : positions[i]=pos[i];
166 : }
167 493287 : }
168 :
169 1502798 : void PDB::setArgumentValue( const std::string& argname, const double& val ) {
170 : // First set the value of the value of the argument in the map
171 1502798 : arg_data.find(argname)->second = val;
172 1502798 : }
173 :
174 : // bool PDB::hasRequiredProperties( const std::vector<std::string>& inproperties ){
175 : // bool hasprop=false;
176 : // for(unsigned i=0;i<remark.size();++i){
177 : // if( remark[i].find("PROPERTIES=")!=std::string::npos){ hasprop=true; break; }
178 : // }
179 : // if( !hasprop ){
180 : // std::string mypropstr="PROPERTIES=" + inproperties[0];
181 : // for(unsigned i=1;i<inproperties.size();++i) mypropstr += "," + inproperties[i];
182 : // remark.push_back( mypropstr );
183 : // }
184 : // // Now check that all required properties are there
185 : // for(unsigned i=0;i<inproperties.size();++i){
186 : // hasprop=false;
187 : // for(unsigned j=0;j<remark.size();++j){
188 : // if( remark[j].find(inproperties[i]+"=")!=std::string::npos){ hasprop=true; break; }
189 : // }
190 : // if( !hasprop ) return false;
191 : // }
192 : // return true;
193 : // }
194 :
195 17 : void PDB::addBlockEnd( const unsigned& end ) {
196 17 : block_ends.push_back( end );
197 17 : }
198 :
199 474 : unsigned PDB::getNumberOfAtomBlocks()const {
200 474 : return block_ends.size();
201 : }
202 :
203 14 : const std::vector<unsigned> & PDB::getAtomBlockEnds()const {
204 14 : return block_ends;
205 : }
206 :
207 23534547 : const std::vector<Vector> & PDB::getPositions()const {
208 23534547 : return positions;
209 : }
210 :
211 0 : void PDB::setPositions(const std::vector<Vector> &v ) {
212 0 : plumed_assert( v.size()==positions.size() );
213 0 : positions=v;
214 0 : }
215 :
216 38558 : const std::vector<double> & PDB::getOccupancy()const {
217 38558 : return occupancy;
218 : }
219 :
220 38558 : const std::vector<double> & PDB::getBeta()const {
221 38558 : return beta;
222 : }
223 :
224 1343 : void PDB::addRemark( std::vector<std::string>& v1 ) {
225 1343 : Tools::parse(v1,"TYPE",mtype);
226 1343 : Tools::parseVector(v1,"ARG",argnames);
227 3775 : for(unsigned i=0; i<v1.size(); ++i) {
228 2432 : if( v1[i].find("=")!=std::string::npos ) {
229 : std::size_t eq=v1[i].find_first_of('=');
230 1864 : std::string name=v1[i].substr(0,eq);
231 1864 : std::string sval=v1[i].substr(eq+1);
232 : double val;
233 1864 : Tools::convert( sval, val );
234 1864 : arg_data.insert( std::pair<std::string,double>( name, val ) );
235 : } else {
236 568 : flags.push_back(v1[i]);
237 : }
238 : }
239 1343 : }
240 :
241 8 : bool PDB::hasFlag( const std::string& fname ) const {
242 8 : for(unsigned i=0; i<flags.size(); ++i) {
243 0 : if( flags[i]==fname ) {
244 : return true;
245 : }
246 : }
247 : return false;
248 : }
249 :
250 :
251 722744 : const std::vector<AtomNumber> & PDB::getAtomNumbers()const {
252 722744 : return numbers;
253 : }
254 :
255 0 : const Vector & PDB::getBoxAxs()const {
256 0 : return BoxXYZ;
257 : }
258 :
259 0 : const Vector & PDB::getBoxAng()const {
260 0 : return BoxABG;
261 : }
262 :
263 12 : const Tensor & PDB::getBoxVec()const {
264 12 : return Box;
265 : }
266 :
267 6876294 : std::string PDB::getAtomName(AtomNumber a)const {
268 : const auto p=number2index.find(a);
269 6876294 : if(p==number2index.end()) {
270 : std::string num;
271 0 : Tools::convert( a.serial(), num );
272 0 : plumed_merror("Name of atom " + num + " not found" );
273 : } else {
274 6876294 : return atomsymb[p->second];
275 : }
276 : }
277 :
278 166259 : unsigned PDB::getResidueNumber(AtomNumber a)const {
279 : const auto p=number2index.find(a);
280 166259 : if(p==number2index.end()) {
281 : std::string num;
282 0 : Tools::convert( a.serial(), num );
283 0 : plumed_merror("Residue for atom " + num + " not found" );
284 : } else {
285 166259 : return residue[p->second];
286 : }
287 : }
288 :
289 151942 : std::string PDB::getResidueName(AtomNumber a) const {
290 : const auto p=number2index.find(a);
291 151942 : if(p==number2index.end()) {
292 : std::string num;
293 0 : Tools::convert( a.serial(), num );
294 0 : plumed_merror("Residue for atom " + num + " not found" );
295 : } else {
296 151942 : return residuenames[p->second];
297 : }
298 : }
299 :
300 247041234 : unsigned PDB::size()const {
301 247041234 : return positions.size();
302 : }
303 :
304 2092 : bool PDB::readFromFilepointer(FILE *fp,bool naturalUnits,double scale) {
305 : //cerr<<file<<endl;
306 : bool file_is_alive=false;
307 2092 : if(naturalUnits) {
308 : scale=1.0;
309 : }
310 : std::string line;
311 : fpos_t pos;
312 : bool between_ters=true;
313 290647 : while(Tools::getline(fp,line)) {
314 : //cerr<<line<<"\n";
315 290429 : fgetpos (fp,&pos);
316 2018653 : while(line.length()<80) {
317 1728224 : line.push_back(' ');
318 : }
319 290429 : std::string record=line.substr(0,6);
320 290429 : std::string serial=line.substr(6,5);
321 290429 : std::string atomname=line.substr(12,4);
322 290429 : std::string residuename=line.substr(17,3);
323 290429 : std::string chainID=line.substr(21,1);
324 290429 : std::string resnum=line.substr(22,4);
325 290429 : std::string x=line.substr(30,8);
326 290429 : std::string y=line.substr(38,8);
327 290429 : std::string z=line.substr(46,8);
328 290429 : std::string occ=line.substr(54,6);
329 290429 : std::string bet=line.substr(60,6);
330 290429 : std::string BoxX=line.substr(6,9);
331 290429 : std::string BoxY=line.substr(15,9);
332 290429 : std::string BoxZ=line.substr(24,9);
333 290429 : std::string BoxA=line.substr(33,7);
334 290429 : std::string BoxB=line.substr(40,7);
335 290429 : std::string BoxG=line.substr(47,7);
336 290429 : Tools::trim(record);
337 290429 : if(record=="TER") {
338 : between_ters=false;
339 231 : block_ends.push_back( positions.size() );
340 : }
341 290429 : if(record=="END") {
342 : file_is_alive=true;
343 : break;
344 : }
345 288703 : if(record=="ENDMDL") {
346 : file_is_alive=true;
347 : break;
348 : }
349 288555 : if(record=="REMARK") {
350 : std::vector<std::string> v1;
351 2686 : v1=Tools::getWords(line.substr(6));
352 1343 : addRemark( v1 );
353 1343 : }
354 288555 : if(record=="CRYST1") {
355 73 : Tools::convert(BoxX,BoxXYZ[0]);
356 73 : Tools::convert(BoxY,BoxXYZ[1]);
357 73 : Tools::convert(BoxZ,BoxXYZ[2]);
358 73 : Tools::convert(BoxA,BoxABG[0]);
359 73 : Tools::convert(BoxB,BoxABG[1]);
360 73 : Tools::convert(BoxG,BoxABG[2]);
361 73 : BoxXYZ*=scale;
362 73 : double cosA=std::cos(BoxABG[0]*pi/180.);
363 73 : double cosB=std::cos(BoxABG[1]*pi/180.);
364 73 : double cosG=std::cos(BoxABG[2]*pi/180.);
365 73 : double sinG=std::sin(BoxABG[2]*pi/180.);
366 292 : for (unsigned i=0; i<3; i++) {
367 219 : Box[i][0]=0.;
368 219 : Box[i][1]=0.;
369 219 : Box[i][2]=0.;
370 : }
371 73 : Box[0][0]=BoxXYZ[0];
372 73 : Box[1][0]=BoxXYZ[1]*cosG;
373 73 : Box[1][1]=BoxXYZ[1]*sinG;
374 73 : Box[2][0]=BoxXYZ[2]*cosB;
375 73 : Box[2][1]=(BoxXYZ[2]*BoxXYZ[1]*cosA-Box[2][0]*Box[1][0])/Box[1][1];
376 73 : Box[2][2]=std::sqrt(BoxXYZ[2]*BoxXYZ[2]-Box[2][0]*Box[2][0]-Box[2][1]*Box[2][1]);
377 : }
378 290635 : if(record=="ATOM" || record=="HETATM") {
379 : between_ters=true;
380 : AtomNumber a;
381 286475 : unsigned resno=0; // GB: when resnum string is not present, we set res number to zero
382 : double o,b;
383 286475 : Vector p;
384 : {
385 : int result;
386 286475 : auto trimmed=serial;
387 286475 : Tools::trim(trimmed);
388 286537 : while(trimmed.length()<5) {
389 124 : trimmed = std::string(" ") + trimmed;
390 : }
391 286475 : const char* errmsg = h36::hy36decode(5, trimmed.c_str(),trimmed.length(), &result);
392 286475 : if(errmsg) {
393 0 : std::string msg(errmsg);
394 0 : plumed_merror(msg);
395 : }
396 286475 : a.setSerial(result);
397 : }
398 :
399 : // allow skipping residue number
400 : {
401 286475 : auto trimmed=resnum;
402 286475 : Tools::trim(trimmed);
403 286475 : if(trimmed.length()>0) {
404 : int result;
405 286475 : while(trimmed.length()<4) {
406 0 : trimmed = std::string(" ") + trimmed;
407 : }
408 286475 : const char* errmsg = h36::hy36decode(4, trimmed.c_str(),trimmed.length(), &result);
409 286475 : if(errmsg) {
410 0 : std::string msg(errmsg);
411 0 : plumed_merror(msg);
412 : }
413 286475 : resno=result;
414 : }
415 : }
416 :
417 286475 : Tools::convert(occ,o);
418 286475 : Tools::convert(bet,b);
419 286475 : Tools::convert(x,p[0]);
420 286475 : Tools::convert(y,p[1]);
421 286475 : Tools::convert(z,p[2]);
422 : // scale into nm
423 286475 : p*=scale;
424 286475 : numbers.push_back(a);
425 286475 : number2index[a]=positions.size();
426 286475 : std::size_t startpos=atomname.find_first_not_of(" \t");
427 286475 : std::size_t endpos=atomname.find_last_not_of(" \t");
428 286475 : atomsymb.push_back( atomname.substr(startpos, endpos-startpos+1) );
429 286475 : residue.push_back(resno);
430 286475 : chain.push_back(chainID);
431 286475 : occupancy.push_back(o);
432 286475 : beta.push_back(b);
433 286475 : positions.push_back(p);
434 286475 : residuenames.push_back(residuename);
435 : }
436 : }
437 2092 : if( between_ters ) {
438 1908 : block_ends.push_back( positions.size() );
439 : }
440 2092 : return file_is_alive;
441 : }
442 :
443 346 : bool PDB::read(const std::string&file,bool naturalUnits,double scale) {
444 346 : FILE* fp=std::fopen(file.c_str(),"r");
445 346 : if(!fp) {
446 : return false;
447 : }
448 : // call fclose when exiting this function
449 : auto deleter=[](FILE* f) {
450 344 : std::fclose(f);
451 : };
452 : std::unique_ptr<FILE,decltype(deleter)> fp_deleter(fp,deleter);
453 344 : readFromFilepointer(fp,naturalUnits,scale);
454 : return true;
455 : }
456 :
457 285 : void PDB::getChainNames( std::vector<std::string>& chains ) const {
458 285 : chains.resize(0);
459 285 : chains.push_back( chain[0] );
460 561643 : for(unsigned i=1; i<size(); ++i) {
461 561358 : if( chains[chains.size()-1]!=chain[i] ) {
462 867 : chains.push_back( chain[i] );
463 : }
464 : }
465 285 : }
466 :
467 12006 : void PDB::getResidueRange( const std::string& chainname, unsigned& res_start, unsigned& res_end, std::string& errmsg ) const {
468 : bool inres=false, foundchain=false;
469 31282215 : for(unsigned i=0; i<size(); ++i) {
470 31270209 : if( chain[i]==chainname ) {
471 26701207 : if(!inres) {
472 12006 : if(foundchain) {
473 0 : errmsg="found second start of chain named " + chainname;
474 : }
475 12006 : res_start=residue[i];
476 : }
477 : inres=true;
478 : foundchain=true;
479 4569002 : } else if( inres && chain[i]!=chainname ) {
480 : inres=false;
481 11451 : res_end=residue[i-1];
482 : }
483 : }
484 12006 : if(inres) {
485 555 : res_end=residue[size()-1];
486 : }
487 12006 : }
488 :
489 235 : void PDB::getAtomRange( const std::string& chainname, AtomNumber& a_start, AtomNumber& a_end, std::string& errmsg ) const {
490 : bool inres=false, foundchain=false;
491 577918 : for(unsigned i=0; i<size(); ++i) {
492 577683 : if( chain[i]==chainname ) {
493 105949 : if(!inres) {
494 235 : if(foundchain) {
495 0 : errmsg="found second start of chain named " + chainname;
496 : }
497 235 : a_start=numbers[i];
498 : }
499 : inres=true;
500 : foundchain=true;
501 471734 : } else if( inres && chain[i]!=chainname ) {
502 : inres=false;
503 125 : a_end=numbers[i-1];
504 : }
505 : }
506 235 : if(inres) {
507 110 : a_end=numbers[size()-1];
508 : }
509 235 : }
510 :
511 11790 : std::string PDB::getResidueName( const unsigned& resnum ) const {
512 14628336 : for(unsigned i=0; i<size(); ++i) {
513 14628336 : if( residue[i]==resnum ) {
514 11790 : return residuenames[i];
515 : }
516 : }
517 : std::string num;
518 0 : Tools::convert( resnum, num );
519 0 : plumed_merror("residue " + num + " not found" );
520 : }
521 :
522 48822 : std::string PDB::getResidueName(const unsigned& resnum,const std::string& chainid ) const {
523 62798425 : for(unsigned i=0; i<size(); ++i) {
524 62847427 : if( residue[i]==resnum && ( chainid=="*" || chain[i]==chainid) ) {
525 48822 : return residuenames[i];
526 : }
527 : }
528 : std::string num;
529 0 : Tools::convert( resnum, num );
530 0 : plumed_merror("residue " + num + " not found in chain " + chainid );
531 : }
532 :
533 :
534 26220 : AtomNumber PDB::getNamedAtomFromResidue( const std::string& aname, const unsigned& resnum ) const {
535 32676780 : for(unsigned i=0; i<size(); ++i) {
536 32676780 : if( residue[i]==resnum && atomsymb[i]==aname ) {
537 26220 : return numbers[i];
538 : }
539 : }
540 : std::string num;
541 0 : Tools::convert( resnum, num );
542 0 : plumed_merror("residue " + num + " does not contain an atom named " + aname );
543 : }
544 :
545 2746 : AtomNumber PDB::getNamedAtomFromResidueAndChain( const std::string& aname, const unsigned& resnum, const std::string& chainid ) const {
546 1099780 : for(unsigned i=0; i<size(); ++i) {
547 1102526 : if( residue[i]==resnum && atomsymb[i]==aname && ( chainid=="*" || chain[i]==chainid) ) {
548 2746 : return numbers[i];
549 : }
550 : }
551 : std::string num;
552 0 : Tools::convert( resnum, num );
553 0 : plumed_merror("residue " + num + " from chain " + chainid + " does not contain an atom named " + aname );
554 : }
555 :
556 34246 : std::vector<AtomNumber> PDB::getAtomsInResidue(const unsigned& resnum,const std::string& chainid)const {
557 : std::vector<AtomNumber> tmp;
558 91444766 : for(unsigned i=0; i<size(); ++i) {
559 91931870 : if( residue[i]==resnum && ( chainid=="*" || chain[i]==chainid) ) {
560 521350 : tmp.push_back(numbers[i]);
561 : }
562 : }
563 34246 : if(tmp.size()==0) {
564 : std::string num;
565 0 : Tools::convert( resnum, num );
566 0 : plumed_merror("Cannot find residue " + num + " from chain " + chainid );
567 : }
568 34246 : return tmp;
569 : }
570 :
571 0 : std::vector<AtomNumber> PDB::getAtomsInChain(const std::string& chainid)const {
572 : std::vector<AtomNumber> tmp;
573 0 : for(unsigned i=0; i<size(); ++i) {
574 0 : if( chainid=="*" || chain[i]==chainid ) {
575 0 : tmp.push_back(numbers[i]);
576 : }
577 : }
578 0 : if(tmp.size()==0) {
579 0 : plumed_merror("Cannot find atoms from chain " + chainid );
580 : }
581 0 : return tmp;
582 : }
583 :
584 9174 : std::string PDB::getChainID(const unsigned& resnumber) const {
585 11372780 : for(unsigned i=0; i<size(); ++i) {
586 11372780 : if(resnumber==residue[i]) {
587 9174 : return chain[i];
588 : }
589 : }
590 0 : plumed_merror("Not enough residues in pdb input file");
591 : }
592 :
593 0 : std::string PDB::getChainID(AtomNumber a) const {
594 : const auto p=number2index.find(a);
595 0 : if(p==number2index.end()) {
596 : std::string num;
597 0 : Tools::convert( a.serial(), num );
598 0 : plumed_merror("Chain for atom " + num + " not found" );
599 : }
600 0 : return chain[p->second];
601 : }
602 :
603 0 : bool PDB::checkForResidue( const std::string& name ) const {
604 0 : for(unsigned i=0; i<size(); ++i) {
605 0 : if( residuenames[i]==name ) {
606 : return true;
607 : }
608 : }
609 : return false;
610 : }
611 :
612 0 : bool PDB::checkForAtom( const std::string& name ) const {
613 0 : for(unsigned i=0; i<size(); ++i) {
614 0 : if( atomsymb[i]==name ) {
615 : return true;
616 : }
617 : }
618 : return false;
619 : }
620 :
621 249 : bool PDB::checkForAtom( AtomNumber a ) const {
622 : const auto p=number2index.find(a);
623 249 : return (p!=number2index.end());
624 : }
625 :
626 0 : Log& operator<<(Log& ostr, const PDB& pdb) {
627 : const std::size_t bufferlen=1000;
628 : char buffer[bufferlen];
629 0 : for(unsigned i=0; i<pdb.positions.size(); i++) {
630 0 : std::snprintf(buffer,bufferlen,"ATOM %3u %8.3f %8.3f %8.3f\n",pdb.numbers[i].serial(),pdb.positions[i][0],pdb.positions[i][1],pdb.positions[i][2]);
631 0 : ostr<<buffer;
632 : }
633 0 : return ostr;
634 : }
635 :
636 38554 : Vector PDB::getPosition(AtomNumber a)const {
637 : const auto p=number2index.find(a);
638 38554 : if(p==number2index.end()) {
639 0 : plumed_merror("atom not available");
640 : } else {
641 38554 : return positions[p->second];
642 : }
643 : }
644 :
645 2539432 : std::vector<std::string> PDB::getArgumentNames()const {
646 2539432 : return argnames;
647 : }
648 :
649 10 : std::string PDB::getMtype() const {
650 10 : return mtype;
651 : }
652 :
653 497 : void PDB::print( const double& lunits, GenericMolInfo* mymoldat, OFile& ofile, const std::string& fmt ) {
654 497 : if( argnames.size()>0 ) {
655 476 : ofile.printf("REMARK ARG=%s", argnames[0].c_str() );
656 1744 : for(unsigned i=1; i<argnames.size(); ++i) {
657 1268 : ofile.printf(",%s",argnames[i].c_str() );
658 : }
659 476 : ofile.printf("\n");
660 476 : ofile.printf("REMARK ");
661 : }
662 : std::string descr2;
663 497 : if(fmt.find("-")!=std::string::npos) {
664 0 : descr2="%s=" + fmt + " ";
665 : } else {
666 : // This ensures numbers are left justified (i.e. next to the equals sign
667 497 : std::size_t psign=fmt.find("%");
668 497 : plumed_assert( psign!=std::string::npos );
669 994 : descr2="%s=%-" + fmt.substr(psign+1) + " ";
670 : }
671 2241 : for(std::map<std::string,double>::iterator it=arg_data.begin(); it!=arg_data.end(); ++it) {
672 1744 : ofile.printf( descr2.c_str(),it->first.c_str(), it->second );
673 : }
674 497 : if( argnames.size()>0 ) {
675 476 : ofile.printf("\n");
676 : }
677 497 : if( !mymoldat ) {
678 2263 : for(unsigned i=0; i<positions.size(); ++i) {
679 : std::array<char,6> at;
680 : {
681 1769 : const char* msg = h36::hy36encode(5,numbers[i].serial(),&at[0]);
682 1769 : plumed_assert(msg==nullptr) << msg;
683 1769 : at[5]=0;
684 : }
685 : std::array<char,5> res;
686 : {
687 1769 : const char* msg = h36::hy36encode(4,i,&res[0]);
688 1769 : plumed_assert(msg==nullptr) << msg;
689 1769 : res[4]=0;
690 : }
691 1769 : ofile.printf("ATOM %s X RES %s %8.3f%8.3f%8.3f%6.2f%6.2f\n",
692 : &at[0], &res[0],
693 1769 : lunits*positions[i][0], lunits*positions[i][1], lunits*positions[i][2],
694 : occupancy[i], beta[i] );
695 : }
696 : } else {
697 69 : for(unsigned i=0; i<positions.size(); ++i) {
698 : std::array<char,6> at;
699 : {
700 66 : const char* msg = h36::hy36encode(5,numbers[i].serial(),&at[0]);
701 66 : plumed_assert(msg==nullptr) << msg;
702 66 : at[5]=0;
703 : }
704 : std::array<char,5> res;
705 : {
706 66 : const char* msg = h36::hy36encode(4,mymoldat->getResidueNumber(numbers[i]),&res[0]);
707 66 : plumed_assert(msg==nullptr) << msg;
708 66 : res[4]=0;
709 : }
710 66 : ofile.printf("ATOM %s %-4s %3s %s %8.3f%8.3f%8.3f%6.2f%6.2f\n",
711 132 : &at[0], mymoldat->getAtomName(numbers[i]).c_str(),
712 66 : mymoldat->getResidueName(numbers[i]).c_str(), &res[0],
713 66 : lunits*positions[i][0], lunits*positions[i][1], lunits*positions[i][2],
714 : occupancy[i], beta[i] );
715 : }
716 : }
717 497 : ofile.printf("END\n");
718 497 : }
719 :
720 37562 : bool PDB::allowedResidue( const std::string& type, const std::string& residuename ) const {
721 37562 : if( type=="protein" ) {
722 37562 : if(residuename=="ALA") {
723 : return true;
724 35552 : } else if(residuename=="ARG") {
725 : return true;
726 33312 : } else if(residuename=="ASN") {
727 : return true;
728 31392 : } else if(residuename=="ASP") {
729 : return true;
730 29712 : } else if(residuename=="CYS") {
731 : return true;
732 29582 : } else if(residuename=="GLN") {
733 : return true;
734 26782 : } else if(residuename=="GLU") {
735 : return true;
736 25522 : } else if(residuename=="GLY") {
737 : return true;
738 23702 : } else if(residuename=="HIS") {
739 : return true;
740 23702 : } else if(residuename=="ILE") {
741 : return true;
742 21452 : } else if(residuename=="LEU") {
743 : return true;
744 17462 : } else if(residuename=="LYS") {
745 : return true;
746 14762 : } else if(residuename=="MET") {
747 : return true;
748 13162 : } else if(residuename=="PHE") {
749 : return true;
750 10522 : } else if(residuename=="PRO") {
751 : return true;
752 8542 : } else if(residuename=="SER") {
753 : return true;
754 7112 : } else if(residuename=="THR") {
755 : return true;
756 5762 : } else if(residuename=="TRP") {
757 : return true;
758 5762 : } else if(residuename=="TYR") {
759 : return true;
760 4382 : } else if(residuename=="VAL") {
761 : return true;
762 : }
763 : // Terminal groups
764 1492 : else if(residuename=="ACE") {
765 : return true;
766 1492 : } else if(residuename=="NME") {
767 : return true;
768 1492 : } else if(residuename=="NH2") {
769 : return true;
770 : }
771 : // Alternative residue names in common force fields
772 1492 : else if(residuename=="GLH") {
773 : return true; // neutral GLU
774 1492 : } else if(residuename=="ASH") {
775 : return true; // neutral ASP
776 1492 : } else if(residuename=="HID") {
777 : return true; // HIS-D amber
778 1492 : } else if(residuename=="HSD") {
779 : return true; // HIS-D charmm
780 1492 : } else if(residuename=="HIE") {
781 : return true; // HIS-E amber
782 1112 : } else if(residuename=="HSE") {
783 : return true; // HIS-E charmm
784 1112 : } else if(residuename=="HIP") {
785 : return true; // HIS-P amber
786 1112 : } else if(residuename=="HSP") {
787 : return true; // HIS-P charmm
788 1112 : } else if(residuename=="CYX") {
789 : return true; // disulfide bridge CYS
790 : }
791 : // Weird amino acids
792 1112 : else if(residuename=="NLE") {
793 : return true;
794 1112 : } else if(residuename=="SFO") {
795 : return true;
796 : } else {
797 1112 : return false;
798 : }
799 0 : } else if( type=="dna" ) {
800 0 : if(residuename=="A") {
801 : return true;
802 0 : } else if(residuename=="A5") {
803 : return true;
804 0 : } else if(residuename=="A3") {
805 : return true;
806 0 : } else if(residuename=="AN") {
807 : return true;
808 0 : } else if(residuename=="G") {
809 : return true;
810 0 : } else if(residuename=="G5") {
811 : return true;
812 0 : } else if(residuename=="G3") {
813 : return true;
814 0 : } else if(residuename=="GN") {
815 : return true;
816 0 : } else if(residuename=="T") {
817 : return true;
818 0 : } else if(residuename=="T5") {
819 : return true;
820 0 : } else if(residuename=="T3") {
821 : return true;
822 0 : } else if(residuename=="TN") {
823 : return true;
824 0 : } else if(residuename=="C") {
825 : return true;
826 0 : } else if(residuename=="C5") {
827 : return true;
828 0 : } else if(residuename=="C3") {
829 : return true;
830 0 : } else if(residuename=="CN") {
831 : return true;
832 0 : } else if(residuename=="DA") {
833 : return true;
834 0 : } else if(residuename=="DA5") {
835 : return true;
836 0 : } else if(residuename=="DA3") {
837 : return true;
838 0 : } else if(residuename=="DAN") {
839 : return true;
840 0 : } else if(residuename=="DG") {
841 : return true;
842 0 : } else if(residuename=="DG5") {
843 : return true;
844 0 : } else if(residuename=="DG3") {
845 : return true;
846 0 : } else if(residuename=="DGN") {
847 : return true;
848 0 : } else if(residuename=="DT") {
849 : return true;
850 0 : } else if(residuename=="DT5") {
851 : return true;
852 0 : } else if(residuename=="DT3") {
853 : return true;
854 0 : } else if(residuename=="DTN") {
855 : return true;
856 0 : } else if(residuename=="DC") {
857 : return true;
858 0 : } else if(residuename=="DC5") {
859 : return true;
860 0 : } else if(residuename=="DC3") {
861 : return true;
862 0 : } else if(residuename=="DCN") {
863 : return true;
864 : } else {
865 0 : return false;
866 : }
867 0 : } else if( type=="rna" ) {
868 0 : if(residuename=="A") {
869 : return true;
870 0 : } else if(residuename=="A5") {
871 : return true;
872 0 : } else if(residuename=="A3") {
873 : return true;
874 0 : } else if(residuename=="AN") {
875 : return true;
876 0 : } else if(residuename=="G") {
877 : return true;
878 0 : } else if(residuename=="G5") {
879 : return true;
880 0 : } else if(residuename=="G3") {
881 : return true;
882 0 : } else if(residuename=="GN") {
883 : return true;
884 0 : } else if(residuename=="U") {
885 : return true;
886 0 : } else if(residuename=="U5") {
887 : return true;
888 0 : } else if(residuename=="U3") {
889 : return true;
890 0 : } else if(residuename=="UN") {
891 : return true;
892 0 : } else if(residuename=="C") {
893 : return true;
894 0 : } else if(residuename=="C5") {
895 : return true;
896 0 : } else if(residuename=="C3") {
897 : return true;
898 0 : } else if(residuename=="CN") {
899 : return true;
900 0 : } else if(residuename=="RA") {
901 : return true;
902 0 : } else if(residuename=="RA5") {
903 : return true;
904 0 : } else if(residuename=="RA3") {
905 : return true;
906 0 : } else if(residuename=="RAN") {
907 : return true;
908 0 : } else if(residuename=="RG") {
909 : return true;
910 0 : } else if(residuename=="RG5") {
911 : return true;
912 0 : } else if(residuename=="RG3") {
913 : return true;
914 0 : } else if(residuename=="RGN") {
915 : return true;
916 0 : } else if(residuename=="RU") {
917 : return true;
918 0 : } else if(residuename=="RU5") {
919 : return true;
920 0 : } else if(residuename=="RU3") {
921 : return true;
922 0 : } else if(residuename=="RUN") {
923 : return true;
924 0 : } else if(residuename=="RC") {
925 : return true;
926 0 : } else if(residuename=="RC5") {
927 : return true;
928 0 : } else if(residuename=="RC3") {
929 : return true;
930 0 : } else if(residuename=="RCN") {
931 : return true;
932 : } else {
933 0 : return false;
934 : }
935 0 : } else if( type=="water" ) {
936 0 : if(residuename=="SOL") {
937 : return true;
938 : }
939 0 : if(residuename=="WAT") {
940 : return true;
941 : }
942 0 : return false;
943 0 : } else if( type=="ion" ) {
944 0 : if(residuename=="IB+") {
945 : return true;
946 : }
947 0 : if(residuename=="CA") {
948 : return true;
949 : }
950 0 : if(residuename=="CL") {
951 : return true;
952 : }
953 0 : if(residuename=="NA") {
954 : return true;
955 : }
956 0 : if(residuename=="MG") {
957 : return true;
958 : }
959 0 : if(residuename=="K") {
960 : return true;
961 : }
962 0 : if(residuename=="RB") {
963 : return true;
964 : }
965 0 : if(residuename=="CS") {
966 : return true;
967 : }
968 0 : if(residuename=="LI") {
969 : return true;
970 : }
971 0 : if(residuename=="ZN") {
972 : return true;
973 : }
974 0 : return false;
975 : }
976 : return false;
977 : }
978 :
979 : }
980 :
|