Line data Source code
1 : /* +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 : Copyright (c) 2011-2023 The plumed team
3 : (see the PEOPLE file at the root of the distribution for a list of names)
4 :
5 : See http://www.plumed.org for more information.
6 :
7 : This file is part of plumed, version 2.
8 :
9 : plumed is free software: you can redistribute it and/or modify
10 : it under the terms of the GNU Lesser General Public License as published by
11 : the Free Software Foundation, either version 3 of the License, or
12 : (at your option) any later version.
13 :
14 : plumed is distributed in the hope that it will be useful,
15 : but WITHOUT ANY WARRANTY; without even the implied warranty of
16 : MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
17 : GNU Lesser General Public License for more details.
18 :
19 : You should have received a copy of the GNU Lesser General Public License
20 : along with plumed. If not, see <http://www.gnu.org/licenses/>.
21 : +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ */
22 : #include "PDB.h"
23 : #include "Tools.h"
24 : #include "Log.h"
25 : #include "h36.h"
26 : #include <cstdio>
27 : #include "core/GenericMolInfo.h"
28 : #include "Tensor.h"
29 :
30 : //+PLUMEDOC INTERNAL pdbreader
31 : /*
32 : PLUMED can use the PDB format in several places
33 :
34 : - To read molecular structure (\ref MOLINFO).
35 : - To read reference conformations (\ref RMSD, but also many other methods in \ref dists, \ref FIT_TO_TEMPLATE, etc).
36 :
37 : The implemented PDB reader expects a file formatted correctly according to the
38 : [PDB standard](http://www.wwpdb.org/documentation/file-format-content/format33/v3.3.html).
39 : In particular, the following columns are read from ATOM records
40 : \verbatim
41 : columns | content
42 : 1-6 | record name (ATOM or HETATM)
43 : 7-11 | serial number of the atom (starting from 1)
44 : 13-16 | atom name
45 : 18-20 | residue name
46 : 22 | chain id
47 : 23-26 | residue number
48 : 31-38 | x coordinate
49 : 39-46 | y coordinate
50 : 47-54 | z coordinate
51 : 55-60 | occupancy
52 : 61-66 | beta factor
53 : \endverbatim
54 : PLUMED parser is slightly more permissive than the official PDB format
55 : in the fact that the format of real numbers is not fixed. In other words,
56 : any real number that can be parsed is OK and the dot can be placed anywhere. However,
57 : __columns are interpret strictly__. A sample PDB should look like the following
58 : \verbatim
59 : ATOM 2 CH3 ACE 1 12.932 -14.718 -6.016 1.00 1.00
60 : ATOM 5 C ACE 1 21.312 -9.928 -5.946 1.00 1.00
61 : ATOM 9 CA ALA 2 19.462 -11.088 -8.986 1.00 1.00
62 : \endverbatim
63 :
64 : Notice that serial numbers need not to be consecutive. In the three-line example above,
65 : only the coordinates of three atoms are provided. This is perfectly legal and indicates PLUMED
66 : that information about these atoms only is available. This could be both for structural
67 : information in \ref MOLINFO, where the other atoms would have no name assigned, and for
68 : reference structures used in \ref RMSD, where only the provided atoms would be used to compute RMSD.
69 :
70 : \par Occupancy and beta factors
71 :
72 : PLUMED reads also occupancy and beta factors that however are given a very special meaning.
73 : In cases where the PDB structure is used as a reference for an alignment (that's the case
74 : for instance in \ref RMSD and in \ref FIT_TO_TEMPLATE), the occupancy column is used
75 : to provide the weight of each atom in the alignment. In cases where, perhaps after alignment,
76 : the displacement between running coordinates and the provided PDB is computed, the beta factors
77 : are used as weight for the displacement.
78 : Since setting the weights to zero is the same as __not__ including an atom in the alignment or
79 : displacement calculation, the two following reference files would be equivalent when used in an \ref RMSD
80 : calculation. First file:
81 : \verbatim
82 : ATOM 2 CH3 ACE 1 12.932 -14.718 -6.016 1.00 1.00
83 : ATOM 5 C ACE 1 21.312 -9.928 -5.946 1.00 1.00
84 : ATOM 9 CA ALA 2 19.462 -11.088 -8.986 0.00 0.00
85 : \endverbatim
86 : Second file:
87 : \verbatim
88 : ATOM 2 CH3 ACE 1 12.932 -14.718 -6.016 1.00 1.00
89 : ATOM 5 C ACE 1 21.312 -9.928 -5.946 1.00 1.00
90 : \endverbatim
91 : However notice that many extra atoms with zero weight might slow down the calculation, so
92 : removing lines is better than setting their weights to zero.
93 : In addition, weights for alignment need not to be equivalent to weights for displacement.
94 : Starting with PLUMED 2.7, if all the weights are set to zero they will be normalized to be equal to the
95 : inverse of the number of involved atoms. This means that it will be possible to use files with
96 : the weight columns set to zero obtaining a meaningful result. In previous PLUMED versions,
97 : setting all weights to zero was resulting in an error instead.
98 :
99 :
100 : \par Systems with more than 100k atoms
101 :
102 : Notice that it very likely does not make any sense to compute the \ref RMSD or any other structural
103 : deviation __using__ so many atoms. However, if the protein for which you want to compute \ref RMSD
104 : has atoms with large serial numbers (e.g. because it is located __after__ solvent in the sorted list of atoms)
105 : you might end up with troubles with the limitations of the PDB format. Indeed, since there are 5
106 : columns available for atom serial number, this number cannot be larger than 99999.
107 : In addition, providing \ref MOLINFO with names associated to atoms with a serial larger than 99999 would be impossible.
108 :
109 : Since PLUMED 2.4 we allow [hybrid 36](http://cci.lbl.gov/hybrid_36/) format to be used to specify atom numbers.
110 : This format is not particularly widespread, but has the nice feature that it provides a one-to-one mapping
111 : between numbers up to approximately 80 millions and strings with 5 characters, plus it is backward compatible
112 : for numbers smaller than 100000. This is not true for notations like the hex notation exported by VMD.
113 : Using the hybrid 36 format, the ATOM records for atom ranging from 99997 to 100002 would read like these:
114 : \verbatim
115 : ATOM 99997 Ar X 1 45.349 38.631 15.116 1.00 1.00
116 : ATOM 99998 Ar X 1 46.189 38.631 15.956 1.00 1.00
117 : ATOM 99999 Ar X 1 46.189 39.471 15.116 1.00 1.00
118 : ATOM A0000 Ar X 1 45.349 39.471 15.956 1.00 1.00
119 : ATOM A0000 Ar X 1 45.349 38.631 16.796 1.00 1.00
120 : ATOM A0001 Ar X 1 46.189 38.631 17.636 1.00 1.00
121 : \endverbatim
122 : There are tools that can be found to translate from integers to strings and back using hybrid 36 format
123 : (a simple python script can be found [here](https://sourceforge.net/p/cctbx/code/HEAD/tree/trunk/iotbx/pdb/hybrid_36.py)).
124 : In addition, as of PLUMED 2.5, we provide a \ref pdbrenumber "command line tool" that can be used to renumber atoms in a PDB file.
125 :
126 : */
127 : //+ENDPLUMEDOC
128 :
129 :
130 : namespace PLMD {
131 :
132 0 : void PDB::setAtomNumbers( const std::vector<AtomNumber>& atoms ) {
133 0 : positions.resize( atoms.size() );
134 0 : occupancy.resize( atoms.size() );
135 0 : beta.resize( atoms.size() );
136 0 : numbers.resize( atoms.size() );
137 0 : for(unsigned i=0; i<atoms.size(); ++i) {
138 0 : numbers[i]=atoms[i];
139 0 : beta[i]=1.0;
140 0 : occupancy[i]=1.0;
141 : }
142 0 : }
143 :
144 0 : void PDB::setArgumentNames( const std::vector<std::string>& argument_names ) {
145 0 : argnames.resize( argument_names.size() );
146 0 : std::vector<double> tmp(1,0);
147 0 : for(unsigned i=0; i<argument_names.size(); ++i) {
148 : argnames[i]=argument_names[i];
149 0 : arg_data.insert( std::pair<std::string,std::vector<double> >( argnames[i], tmp ) );
150 : }
151 0 : }
152 :
153 2148 : bool PDB::getArgumentValue( const std::string& name, std::vector<double>& value ) const {
154 : std::map<std::string,std::vector<double> >::const_iterator it = arg_data.find(name);
155 2148 : if( it!=arg_data.end() ) {
156 2148 : if( value.size()!=it->second.size() ) {
157 : return false;
158 : }
159 4300 : for(unsigned i=0; i<value.size(); ++i) {
160 2152 : value[i] = it->second[i];
161 : }
162 : return true;
163 : }
164 : return false;
165 : }
166 :
167 0 : void PDB::setAtomPositions( const std::vector<Vector>& pos ) {
168 0 : plumed_assert( pos.size()==positions.size() );
169 0 : for(unsigned i=0; i<positions.size(); ++i) {
170 0 : positions[i]=pos[i];
171 : }
172 0 : }
173 :
174 0 : void PDB::setArgumentValue( const std::string& argname, const double& val ) {
175 : // First set the value of the value of the argument in the map
176 0 : arg_data.find(argname)->second.resize(1);
177 0 : arg_data.find(argname)->second[0] = val;
178 0 : }
179 :
180 : // bool PDB::hasRequiredProperties( const std::vector<std::string>& inproperties ){
181 : // bool hasprop=false;
182 : // for(unsigned i=0;i<remark.size();++i){
183 : // if( remark[i].find("PROPERTIES=")!=std::string::npos){ hasprop=true; break; }
184 : // }
185 : // if( !hasprop ){
186 : // std::string mypropstr="PROPERTIES=" + inproperties[0];
187 : // for(unsigned i=1;i<inproperties.size();++i) mypropstr += "," + inproperties[i];
188 : // remark.push_back( mypropstr );
189 : // }
190 : // // Now check that all required properties are there
191 : // for(unsigned i=0;i<inproperties.size();++i){
192 : // hasprop=false;
193 : // for(unsigned j=0;j<remark.size();++j){
194 : // if( remark[j].find(inproperties[i]+"=")!=std::string::npos){ hasprop=true; break; }
195 : // }
196 : // if( !hasprop ) return false;
197 : // }
198 : // return true;
199 : // }
200 :
201 0 : void PDB::addBlockEnd( const unsigned& end ) {
202 0 : block_ends.push_back( end );
203 0 : }
204 :
205 3 : unsigned PDB::getNumberOfAtomBlocks()const {
206 3 : return block_ends.size();
207 : }
208 :
209 6 : const std::vector<unsigned> & PDB::getAtomBlockEnds()const {
210 6 : return block_ends;
211 : }
212 :
213 23579899 : const std::vector<Vector> & PDB::getPositions()const {
214 23579899 : return positions;
215 : }
216 :
217 0 : void PDB::setPositions(const std::vector<Vector> &v ) {
218 0 : plumed_assert( v.size()==positions.size() );
219 0 : positions=v;
220 0 : }
221 :
222 153972 : const std::vector<double> & PDB::getOccupancy()const {
223 153972 : return occupancy;
224 : }
225 :
226 152263 : const std::vector<double> & PDB::getBeta()const {
227 152263 : return beta;
228 : }
229 :
230 4607 : void PDB::addRemark( std::vector<std::string>& v1 ) {
231 4607 : Tools::parse(v1,"TYPE",mtype);
232 4607 : Tools::parseVector(v1,"ARG",argnames);
233 12507 : for(unsigned i=0; i<v1.size(); ++i) {
234 7900 : if( v1[i].find("=")!=std::string::npos ) {
235 : std::size_t eq=v1[i].find_first_of('=');
236 7286 : std::string name=v1[i].substr(0,eq);
237 7286 : std::string sval=v1[i].substr(eq+1);
238 : // double val; Tools::convert( sval, val );
239 7286 : std::vector<std::string> words=Tools::getWords(sval,"\t\n ,");
240 7286 : std::vector<double> val( words.size() );
241 14580 : for(unsigned ii=0; ii<val.size(); ++ii) {
242 7294 : Tools::convert( words[ii], val[ii] );
243 : }
244 14572 : arg_data.insert( std::pair<std::string,std::vector<double> >( name, val ) );
245 7286 : } else {
246 614 : flags.push_back(v1[i]);
247 : }
248 : }
249 4607 : }
250 :
251 0 : bool PDB::hasFlag( const std::string& fname ) const {
252 0 : for(unsigned i=0; i<flags.size(); ++i) {
253 0 : if( flags[i]==fname ) {
254 : return true;
255 : }
256 : }
257 : return false;
258 : }
259 :
260 :
261 314654 : const std::vector<AtomNumber> & PDB::getAtomNumbers()const {
262 314654 : return numbers;
263 : }
264 :
265 0 : const Vector & PDB::getBoxAxs()const {
266 0 : return BoxXYZ;
267 : }
268 :
269 0 : const Vector & PDB::getBoxAng()const {
270 0 : return BoxABG;
271 : }
272 :
273 12 : const Tensor & PDB::getBoxVec()const {
274 12 : return Box;
275 : }
276 :
277 6947345 : std::string PDB::getAtomName(AtomNumber a)const {
278 : const auto p=number2index.find(a);
279 6947345 : if(p==number2index.end()) {
280 : std::string num;
281 0 : Tools::convert( a.serial(), num );
282 0 : plumed_merror("Name of atom " + num + " not found" );
283 : } else {
284 6947345 : return atomsymb[p->second];
285 : }
286 : }
287 :
288 166232 : unsigned PDB::getResidueNumber(AtomNumber a)const {
289 : const auto p=number2index.find(a);
290 166232 : if(p==number2index.end()) {
291 : std::string num;
292 0 : Tools::convert( a.serial(), num );
293 0 : plumed_merror("Residue for atom " + num + " not found" );
294 : } else {
295 166232 : return residue[p->second];
296 : }
297 : }
298 :
299 180036 : std::string PDB::getResidueName(AtomNumber a) const {
300 : const auto p=number2index.find(a);
301 180036 : if(p==number2index.end()) {
302 : std::string num;
303 0 : Tools::convert( a.serial(), num );
304 0 : plumed_merror("Residue for atom " + num + " not found" );
305 : } else {
306 180036 : return residuenames[p->second];
307 : }
308 : }
309 :
310 243528438 : unsigned PDB::size()const {
311 243528438 : return positions.size();
312 : }
313 :
314 4673 : bool PDB::readFromFilepointer(FILE *fp,bool naturalUnits,double scale) {
315 : //cerr<<file<<endl;
316 : bool file_is_alive=false;
317 4673 : if(naturalUnits) {
318 : scale=1.0;
319 : }
320 : std::string line;
321 : fpos_t pos;
322 : bool between_ters=true;
323 469356 : while(Tools::getline(fp,line)) {
324 : //cerr<<line<<"\n";
325 468846 : fgetpos (fp,&pos);
326 3377675 : while(line.length()<80) {
327 2908829 : line.push_back(' ');
328 : }
329 468846 : std::string record=line.substr(0,6);
330 468846 : std::string serial=line.substr(6,5);
331 468846 : std::string atomname=line.substr(12,4);
332 468846 : std::string residuename=line.substr(17,3);
333 468846 : std::string chainID=line.substr(21,1);
334 468846 : std::string resnum=line.substr(22,4);
335 468846 : std::string x=line.substr(30,8);
336 468846 : std::string y=line.substr(38,8);
337 468846 : std::string z=line.substr(46,8);
338 468846 : std::string occ=line.substr(54,6);
339 468846 : std::string bet=line.substr(60,6);
340 468846 : std::string BoxX=line.substr(6,9);
341 468846 : std::string BoxY=line.substr(15,9);
342 468846 : std::string BoxZ=line.substr(24,9);
343 468846 : std::string BoxA=line.substr(33,7);
344 468846 : std::string BoxB=line.substr(40,7);
345 468846 : std::string BoxG=line.substr(47,7);
346 468846 : Tools::trim(record);
347 468846 : if(record=="TER") {
348 : between_ters=false;
349 245 : block_ends.push_back( positions.size() );
350 : }
351 468846 : if(record=="END") {
352 : file_is_alive=true;
353 : break;
354 : }
355 464835 : if(record=="ENDMDL") {
356 : file_is_alive=true;
357 : break;
358 : }
359 464683 : if(record=="REMARK") {
360 : std::vector<std::string> v1;
361 9214 : v1=Tools::getWords(line.substr(6));
362 4607 : addRemark( v1 );
363 4607 : }
364 464683 : if(record=="CRYST1") {
365 80 : Tools::convert(BoxX,BoxXYZ[0]);
366 80 : Tools::convert(BoxY,BoxXYZ[1]);
367 80 : Tools::convert(BoxZ,BoxXYZ[2]);
368 80 : Tools::convert(BoxA,BoxABG[0]);
369 80 : Tools::convert(BoxB,BoxABG[1]);
370 80 : Tools::convert(BoxG,BoxABG[2]);
371 : BoxXYZ*=scale;
372 80 : double cosA=std::cos(BoxABG[0]*pi/180.);
373 80 : double cosB=std::cos(BoxABG[1]*pi/180.);
374 80 : double cosG=std::cos(BoxABG[2]*pi/180.);
375 80 : double sinG=std::sin(BoxABG[2]*pi/180.);
376 320 : for (unsigned i=0; i<3; i++) {
377 240 : Box[i][0]=0.;
378 240 : Box[i][1]=0.;
379 240 : Box[i][2]=0.;
380 : }
381 80 : Box[0][0]=BoxXYZ[0];
382 80 : Box[1][0]=BoxXYZ[1]*cosG;
383 80 : Box[1][1]=BoxXYZ[1]*sinG;
384 80 : Box[2][0]=BoxXYZ[2]*cosB;
385 80 : Box[2][1]=(BoxXYZ[2]*BoxXYZ[1]*cosA-Box[2][0]*Box[1][0])/Box[1][1];
386 80 : Box[2][2]=std::sqrt(BoxXYZ[2]*BoxXYZ[2]-Box[2][0]*Box[2][0]-Box[2][1]*Box[2][1]);
387 : }
388 470586 : if(record=="ATOM" || record=="HETATM") {
389 : between_ters=true;
390 : AtomNumber a;
391 458780 : unsigned resno=0; // GB: when resnum string is not present, we set res number to zero
392 : double o,b;
393 : Vector p;
394 : {
395 : int result;
396 458780 : auto trimmed=serial;
397 458780 : Tools::trim(trimmed);
398 458842 : while(trimmed.length()<5) {
399 124 : trimmed = std::string(" ") + trimmed;
400 : }
401 458780 : const char* errmsg = h36::hy36decode(5, trimmed.c_str(),trimmed.length(), &result);
402 458780 : if(errmsg) {
403 0 : std::string msg(errmsg);
404 0 : plumed_merror(msg);
405 : }
406 458780 : a.setSerial(result);
407 : }
408 :
409 : // allow skipping residue number
410 : {
411 458780 : auto trimmed=resnum;
412 458780 : Tools::trim(trimmed);
413 458780 : if(trimmed.length()>0) {
414 : int result;
415 458780 : while(trimmed.length()<4) {
416 0 : trimmed = std::string(" ") + trimmed;
417 : }
418 458780 : const char* errmsg = h36::hy36decode(4, trimmed.c_str(),trimmed.length(), &result);
419 458780 : if(errmsg) {
420 0 : std::string msg(errmsg);
421 0 : plumed_merror(msg);
422 : }
423 458780 : resno=result;
424 : }
425 : }
426 :
427 458780 : Tools::convert(occ,o);
428 458780 : Tools::convert(bet,b);
429 458780 : Tools::convert(x,p[0]);
430 458780 : Tools::convert(y,p[1]);
431 458780 : Tools::convert(z,p[2]);
432 : // scale into nm
433 : p*=scale;
434 458780 : numbers.push_back(a);
435 458780 : number2index[a]=positions.size();
436 458780 : std::size_t startpos=atomname.find_first_not_of(" \t");
437 458780 : std::size_t endpos=atomname.find_last_not_of(" \t");
438 458780 : atomsymb.push_back( atomname.substr(startpos, endpos-startpos+1) );
439 458780 : residue.push_back(resno);
440 458780 : chain.push_back(chainID);
441 458780 : occupancy.push_back(o);
442 458780 : beta.push_back(b);
443 458780 : positions.push_back(p);
444 458780 : residuenames.push_back(residuename);
445 : }
446 : }
447 4673 : if( between_ters ) {
448 4481 : block_ends.push_back( positions.size() );
449 : }
450 4673 : return file_is_alive;
451 : }
452 :
453 459 : bool PDB::read(const std::string&file,bool naturalUnits,double scale) {
454 459 : FILE* fp=std::fopen(file.c_str(),"r");
455 459 : if(!fp) {
456 : return false;
457 : }
458 : // call fclose when exiting this function
459 : auto deleter=[](auto f) {
460 458 : std::fclose(f);
461 : };
462 : std::unique_ptr<FILE,decltype(deleter)> fp_deleter(fp,deleter);
463 458 : readFromFilepointer(fp,naturalUnits,scale);
464 : return true;
465 : }
466 :
467 296 : void PDB::getChainNames( std::vector<std::string>& chains ) const {
468 296 : chains.resize(0);
469 296 : chains.push_back( chain[0] );
470 574777 : for(unsigned i=1; i<size(); ++i) {
471 574481 : if( chains[chains.size()-1]!=chain[i] ) {
472 756 : chains.push_back( chain[i] );
473 : }
474 : }
475 296 : }
476 :
477 11906 : void PDB::getResidueRange( const std::string& chainname, unsigned& res_start, unsigned& res_end, std::string& errmsg ) const {
478 : bool inres=false, foundchain=false;
479 31023905 : for(unsigned i=0; i<size(); ++i) {
480 31011999 : if( chain[i]==chainname ) {
481 26714341 : if(!inres) {
482 11906 : if(foundchain) {
483 0 : errmsg="found second start of chain named " + chainname;
484 : }
485 11906 : res_start=residue[i];
486 : }
487 : inres=true;
488 : foundchain=true;
489 4297658 : } else if( inres && chain[i]!=chainname ) {
490 : inres=false;
491 11340 : res_end=residue[i-1];
492 : }
493 : }
494 11906 : if(inres) {
495 566 : res_end=residue[size()-1];
496 : }
497 11906 : }
498 :
499 228 : void PDB::getAtomRange( const std::string& chainname, AtomNumber& a_start, AtomNumber& a_end, std::string& errmsg ) const {
500 : bool inres=false, foundchain=false;
501 549225 : for(unsigned i=0; i<size(); ++i) {
502 548997 : if( chain[i]==chainname ) {
503 112687 : if(!inres) {
504 228 : if(foundchain) {
505 0 : errmsg="found second start of chain named " + chainname;
506 : }
507 228 : a_start=numbers[i];
508 : }
509 : inres=true;
510 : foundchain=true;
511 436310 : } else if( inres && chain[i]!=chainname ) {
512 : inres=false;
513 110 : a_end=numbers[i-1];
514 : }
515 : }
516 228 : if(inres) {
517 118 : a_end=numbers[size()-1];
518 : }
519 228 : }
520 :
521 9900 : std::string PDB::getResidueName( const unsigned& resnum ) const {
522 12195264 : for(unsigned i=0; i<size(); ++i) {
523 12195264 : if( residue[i]==resnum ) {
524 9900 : return residuenames[i];
525 : }
526 : }
527 : std::string num;
528 0 : Tools::convert( resnum, num );
529 0 : plumed_merror("residue " + num + " not found" );
530 : }
531 :
532 50084 : std::string PDB::getResidueName(const unsigned& resnum,const std::string& chainid ) const {
533 64922933 : for(unsigned i=0; i<size(); ++i) {
534 64973197 : if( residue[i]==resnum && ( chainid=="*" || chain[i]==chainid) ) {
535 50084 : return residuenames[i];
536 : }
537 : }
538 : std::string num;
539 0 : Tools::convert( resnum, num );
540 0 : plumed_merror("residue " + num + " not found in chain " + chainid );
541 : }
542 :
543 :
544 22020 : AtomNumber PDB::getNamedAtomFromResidue( const std::string& aname, const unsigned& resnum ) const {
545 27242772 : for(unsigned i=0; i<size(); ++i) {
546 27242772 : if( residue[i]==resnum && atomsymb[i]==aname ) {
547 22020 : return numbers[i];
548 : }
549 : }
550 : std::string num;
551 0 : Tools::convert( resnum, num );
552 0 : plumed_merror("residue " + num + " does not contain an atom named " + aname );
553 : }
554 :
555 2898 : AtomNumber PDB::getNamedAtomFromResidueAndChain( const std::string& aname, const unsigned& resnum, const std::string& chainid ) const {
556 1116492 : for(unsigned i=0; i<size(); ++i) {
557 1119390 : if( residue[i]==resnum && atomsymb[i]==aname && ( chainid=="*" || chain[i]==chainid) ) {
558 2898 : return numbers[i];
559 : }
560 : }
561 : std::string num;
562 0 : Tools::convert( resnum, num );
563 0 : plumed_merror("residue " + num + " from chain " + chainid + " does not contain an atom named " + aname );
564 : }
565 :
566 35470 : std::vector<AtomNumber> PDB::getAtomsInResidue(const unsigned& resnum,const std::string& chainid)const {
567 : std::vector<AtomNumber> tmp;
568 95754470 : for(unsigned i=0; i<size(); ++i) {
569 96261470 : if( residue[i]==resnum && ( chainid=="*" || chain[i]==chainid) ) {
570 542470 : tmp.push_back(numbers[i]);
571 : }
572 : }
573 35470 : if(tmp.size()==0) {
574 : std::string num;
575 0 : Tools::convert( resnum, num );
576 0 : plumed_merror("Cannot find residue " + num + " from chain " + chainid );
577 : }
578 35470 : return tmp;
579 : }
580 :
581 0 : std::vector<AtomNumber> PDB::getAtomsInChain(const std::string& chainid)const {
582 : std::vector<AtomNumber> tmp;
583 0 : for(unsigned i=0; i<size(); ++i) {
584 0 : if( chainid=="*" || chain[i]==chainid ) {
585 0 : tmp.push_back(numbers[i]);
586 : }
587 : }
588 0 : if(tmp.size()==0) {
589 0 : plumed_merror("Cannot find atoms from chain " + chainid );
590 : }
591 0 : return tmp;
592 : }
593 :
594 7710 : std::string PDB::getChainID(const unsigned& resnumber) const {
595 9481600 : for(unsigned i=0; i<size(); ++i) {
596 9481600 : if(resnumber==residue[i]) {
597 7710 : return chain[i];
598 : }
599 : }
600 0 : plumed_merror("Not enough residues in pdb input file");
601 : }
602 :
603 0 : std::string PDB::getChainID(AtomNumber a) const {
604 : const auto p=number2index.find(a);
605 0 : if(p==number2index.end()) {
606 : std::string num;
607 0 : Tools::convert( a.serial(), num );
608 0 : plumed_merror("Chain for atom " + num + " not found" );
609 : }
610 0 : return chain[p->second];
611 : }
612 :
613 0 : bool PDB::checkForResidue( const std::string& name ) const {
614 0 : for(unsigned i=0; i<size(); ++i) {
615 0 : if( residuenames[i]==name ) {
616 : return true;
617 : }
618 : }
619 : return false;
620 : }
621 :
622 0 : bool PDB::checkForAtom( const std::string& name ) const {
623 0 : for(unsigned i=0; i<size(); ++i) {
624 0 : if( atomsymb[i]==name ) {
625 : return true;
626 : }
627 : }
628 : return false;
629 : }
630 :
631 249 : bool PDB::checkForAtom( AtomNumber a ) const {
632 : const auto p=number2index.find(a);
633 249 : return (p!=number2index.end());
634 : }
635 :
636 0 : Log& operator<<(Log& ostr, const PDB& pdb) {
637 : const std::size_t bufferlen=1000;
638 : char buffer[bufferlen];
639 0 : for(unsigned i=0; i<pdb.positions.size(); i++) {
640 0 : std::snprintf(buffer,bufferlen,"ATOM %3u %8.3f %8.3f %8.3f\n",pdb.numbers[i].serial(),pdb.positions[i][0],pdb.positions[i][1],pdb.positions[i][2]);
641 0 : ostr<<buffer;
642 : }
643 0 : return ostr;
644 : }
645 :
646 1038 : Vector PDB::getPosition(AtomNumber a)const {
647 : const auto p=number2index.find(a);
648 1038 : if(p==number2index.end()) {
649 0 : plumed_merror("atom not available");
650 : } else {
651 1038 : return positions[p->second];
652 : }
653 : }
654 :
655 0 : std::vector<std::string> PDB::getArgumentNames()const {
656 0 : return argnames;
657 : }
658 :
659 0 : std::string PDB::getMtype() const {
660 0 : return mtype;
661 : }
662 :
663 0 : void PDB::print( const double& lunits, GenericMolInfo* mymoldat, OFile& ofile, const std::string& fmt ) {
664 0 : if( argnames.size()>0 ) {
665 0 : ofile.printf("REMARK ARG=%s", argnames[0].c_str() );
666 0 : for(unsigned i=1; i<argnames.size(); ++i) {
667 0 : ofile.printf(",%s",argnames[i].c_str() );
668 : }
669 0 : ofile.printf("\n");
670 0 : ofile.printf("REMARK ");
671 : }
672 : std::string descr2;
673 0 : if(fmt.find("-")!=std::string::npos) {
674 0 : descr2="%s=" + fmt + " ";
675 : } else {
676 : // This ensures numbers are left justified (i.e. next to the equals sign
677 0 : std::size_t psign=fmt.find("%");
678 0 : plumed_assert( psign!=std::string::npos );
679 0 : descr2="%s=%-" + fmt.substr(psign+1) + " ";
680 : }
681 0 : for(std::map<std::string,std::vector<double> >::iterator it=arg_data.begin(); it!=arg_data.end(); ++it) {
682 0 : ofile.printf( descr2.c_str(),it->first.c_str(), it->second[0] );
683 : }
684 0 : if( argnames.size()>0 ) {
685 0 : ofile.printf("\n");
686 : }
687 0 : if( !mymoldat ) {
688 0 : for(unsigned i=0; i<positions.size(); ++i) {
689 : std::array<char,6> at;
690 : {
691 0 : const char* msg = h36::hy36encode(5,numbers[i].serial(),&at[0]);
692 0 : plumed_assert(msg==nullptr) << msg;
693 0 : at[5]=0;
694 : }
695 : std::array<char,5> res;
696 : {
697 0 : const char* msg = h36::hy36encode(4,i,&res[0]);
698 0 : plumed_assert(msg==nullptr) << msg;
699 0 : res[4]=0;
700 : }
701 0 : ofile.printf("ATOM %s X RES %s %8.3f%8.3f%8.3f%6.2f%6.2f\n",
702 : &at[0], &res[0],
703 0 : lunits*positions[i][0], lunits*positions[i][1], lunits*positions[i][2],
704 : occupancy[i], beta[i] );
705 : }
706 : } else {
707 0 : for(unsigned i=0; i<positions.size(); ++i) {
708 : std::array<char,6> at;
709 : {
710 0 : const char* msg = h36::hy36encode(5,numbers[i].serial(),&at[0]);
711 0 : plumed_assert(msg==nullptr) << msg;
712 0 : at[5]=0;
713 : }
714 : std::array<char,5> res;
715 : {
716 0 : const char* msg = h36::hy36encode(4,mymoldat->getResidueNumber(numbers[i]),&res[0]);
717 0 : plumed_assert(msg==nullptr) << msg;
718 0 : res[4]=0;
719 : }
720 0 : ofile.printf("ATOM %s %-4s %3s %s %8.3f%8.3f%8.3f%6.2f%6.2f\n",
721 0 : &at[0], mymoldat->getAtomName(numbers[i]).c_str(),
722 0 : mymoldat->getResidueName(numbers[i]).c_str(), &res[0],
723 0 : lunits*positions[i][0], lunits*positions[i][1], lunits*positions[i][2],
724 : occupancy[i], beta[i] );
725 : }
726 : }
727 0 : ofile.printf("END\n");
728 0 : }
729 :
730 59906 : bool PDB::allowedResidue( const std::string& type, const std::string& residuename ) const {
731 59906 : if( type=="protein" ) {
732 59906 : if(residuename=="ALA") {
733 : return true;
734 56756 : } else if(residuename=="ARG") {
735 : return true;
736 53172 : } else if(residuename=="ASN") {
737 : return true;
738 50100 : } else if(residuename=="ASP") {
739 : return true;
740 47412 : } else if(residuename=="CYS") {
741 : return true;
742 47204 : } else if(residuename=="GLN") {
743 : return true;
744 42844 : } else if(residuename=="GLU") {
745 : return true;
746 40828 : } else if(residuename=="GLY") {
747 : return true;
748 38036 : } else if(residuename=="HIS") {
749 : return true;
750 38036 : } else if(residuename=="ILE") {
751 : return true;
752 34436 : } else if(residuename=="LEU") {
753 : return true;
754 28052 : } else if(residuename=="LYS") {
755 : return true;
756 23732 : } else if(residuename=="MET") {
757 : return true;
758 21412 : } else if(residuename=="PHE") {
759 : return true;
760 17188 : } else if(residuename=="PRO") {
761 : return true;
762 14020 : } else if(residuename=="SER") {
763 : return true;
764 11810 : } else if(residuename=="THR") {
765 : return true;
766 9740 : } else if(residuename=="TRP") {
767 : return true;
768 9740 : } else if(residuename=="TYR") {
769 : return true;
770 7532 : } else if(residuename=="VAL") {
771 : return true;
772 : }
773 : // Terminal groups
774 3010 : else if(residuename=="ACE") {
775 : return true;
776 3010 : } else if(residuename=="NME") {
777 : return true;
778 3010 : } else if(residuename=="NH2") {
779 : return true;
780 : }
781 : // Alternative residue names in common force fields
782 3010 : else if(residuename=="GLH") {
783 : return true; // neutral GLU
784 3010 : } else if(residuename=="ASH") {
785 : return true; // neutral ASP
786 3010 : } else if(residuename=="HID") {
787 : return true; // HIS-D amber
788 3010 : } else if(residuename=="HSD") {
789 : return true; // HIS-D charmm
790 3010 : } else if(residuename=="HIE") {
791 : return true; // HIS-E amber
792 2402 : } else if(residuename=="HSE") {
793 : return true; // HIS-E charmm
794 2402 : } else if(residuename=="HIP") {
795 : return true; // HIS-P amber
796 2402 : } else if(residuename=="HSP") {
797 : return true; // HIS-P charmm
798 2402 : } else if(residuename=="CYX") {
799 : return true; // disulfide bridge CYS
800 : }
801 : // Weird amino acids
802 2402 : else if(residuename=="NLE") {
803 : return true;
804 2402 : } else if(residuename=="SFO") {
805 : return true;
806 : } else {
807 2402 : return false;
808 : }
809 0 : } else if( type=="dna" ) {
810 0 : if(residuename=="A") {
811 : return true;
812 0 : } else if(residuename=="A5") {
813 : return true;
814 0 : } else if(residuename=="A3") {
815 : return true;
816 0 : } else if(residuename=="AN") {
817 : return true;
818 0 : } else if(residuename=="G") {
819 : return true;
820 0 : } else if(residuename=="G5") {
821 : return true;
822 0 : } else if(residuename=="G3") {
823 : return true;
824 0 : } else if(residuename=="GN") {
825 : return true;
826 0 : } else if(residuename=="T") {
827 : return true;
828 0 : } else if(residuename=="T5") {
829 : return true;
830 0 : } else if(residuename=="T3") {
831 : return true;
832 0 : } else if(residuename=="TN") {
833 : return true;
834 0 : } else if(residuename=="C") {
835 : return true;
836 0 : } else if(residuename=="C5") {
837 : return true;
838 0 : } else if(residuename=="C3") {
839 : return true;
840 0 : } else if(residuename=="CN") {
841 : return true;
842 0 : } else if(residuename=="DA") {
843 : return true;
844 0 : } else if(residuename=="DA5") {
845 : return true;
846 0 : } else if(residuename=="DA3") {
847 : return true;
848 0 : } else if(residuename=="DAN") {
849 : return true;
850 0 : } else if(residuename=="DG") {
851 : return true;
852 0 : } else if(residuename=="DG5") {
853 : return true;
854 0 : } else if(residuename=="DG3") {
855 : return true;
856 0 : } else if(residuename=="DGN") {
857 : return true;
858 0 : } else if(residuename=="DT") {
859 : return true;
860 0 : } else if(residuename=="DT5") {
861 : return true;
862 0 : } else if(residuename=="DT3") {
863 : return true;
864 0 : } else if(residuename=="DTN") {
865 : return true;
866 0 : } else if(residuename=="DC") {
867 : return true;
868 0 : } else if(residuename=="DC5") {
869 : return true;
870 0 : } else if(residuename=="DC3") {
871 : return true;
872 0 : } else if(residuename=="DCN") {
873 : return true;
874 : } else {
875 0 : return false;
876 : }
877 0 : } else if( type=="rna" ) {
878 0 : if(residuename=="A") {
879 : return true;
880 0 : } else if(residuename=="A5") {
881 : return true;
882 0 : } else if(residuename=="A3") {
883 : return true;
884 0 : } else if(residuename=="AN") {
885 : return true;
886 0 : } else if(residuename=="G") {
887 : return true;
888 0 : } else if(residuename=="G5") {
889 : return true;
890 0 : } else if(residuename=="G3") {
891 : return true;
892 0 : } else if(residuename=="GN") {
893 : return true;
894 0 : } else if(residuename=="U") {
895 : return true;
896 0 : } else if(residuename=="U5") {
897 : return true;
898 0 : } else if(residuename=="U3") {
899 : return true;
900 0 : } else if(residuename=="UN") {
901 : return true;
902 0 : } else if(residuename=="C") {
903 : return true;
904 0 : } else if(residuename=="C5") {
905 : return true;
906 0 : } else if(residuename=="C3") {
907 : return true;
908 0 : } else if(residuename=="CN") {
909 : return true;
910 0 : } else if(residuename=="RA") {
911 : return true;
912 0 : } else if(residuename=="RA5") {
913 : return true;
914 0 : } else if(residuename=="RA3") {
915 : return true;
916 0 : } else if(residuename=="RAN") {
917 : return true;
918 0 : } else if(residuename=="RG") {
919 : return true;
920 0 : } else if(residuename=="RG5") {
921 : return true;
922 0 : } else if(residuename=="RG3") {
923 : return true;
924 0 : } else if(residuename=="RGN") {
925 : return true;
926 0 : } else if(residuename=="RU") {
927 : return true;
928 0 : } else if(residuename=="RU5") {
929 : return true;
930 0 : } else if(residuename=="RU3") {
931 : return true;
932 0 : } else if(residuename=="RUN") {
933 : return true;
934 0 : } else if(residuename=="RC") {
935 : return true;
936 0 : } else if(residuename=="RC5") {
937 : return true;
938 0 : } else if(residuename=="RC3") {
939 : return true;
940 0 : } else if(residuename=="RCN") {
941 : return true;
942 : } else {
943 0 : return false;
944 : }
945 0 : } else if( type=="water" ) {
946 0 : if(residuename=="SOL") {
947 : return true;
948 : }
949 0 : if(residuename=="WAT") {
950 : return true;
951 : }
952 0 : return false;
953 0 : } else if( type=="ion" ) {
954 0 : if(residuename=="IB+") {
955 : return true;
956 : }
957 0 : if(residuename=="CA") {
958 : return true;
959 : }
960 0 : if(residuename=="CL") {
961 : return true;
962 : }
963 0 : if(residuename=="NA") {
964 : return true;
965 : }
966 0 : if(residuename=="MG") {
967 : return true;
968 : }
969 0 : if(residuename=="K") {
970 : return true;
971 : }
972 0 : if(residuename=="RB") {
973 : return true;
974 : }
975 0 : if(residuename=="CS") {
976 : return true;
977 : }
978 0 : if(residuename=="LI") {
979 : return true;
980 : }
981 0 : if(residuename=="ZN") {
982 : return true;
983 : }
984 0 : return false;
985 : }
986 : return false;
987 : }
988 :
989 : }
990 :
|