manual

1. Figure 4.4: The correction potential $\chi(r, k_{cut}, \alpha)$ as a function of the distance $r$ (Å) for different values of the parameters $\alpha$ (left) and $k_{cut}$ (right). The solid line in the top right corner is the bare Coulomb potential $1/r$.

The potential $\chi(r, k_{cut}, \alpha)$ yields, in direct space, the reciprocal energy neglected because of the truncation of the reciprocal lattice sums, and must in principle be included for each atom pair distance in direct space. (Footnote 4: note that $\lim_{k_{cut} \to 0} \chi(r, k_{cut}, \alpha) = \mathrm{erf}(\alpha r)/r$.) The corrected direct space potential is then given by Eq. 4.48, a sum over atom pairs and lattice vectors involving $\chi(|\mathbf{r}_{ij} + \mathbf{r}_n|, k_{cut}, \alpha)$, which is then split as usual into short-, medium- and long-range parts according to Eq. 4.39. The correction is certainly more crucial for the excluded intramolecular contacts, because $V_{corr}$ is essentially a short-ranged potential which is non-negligible only for short intramolecular distances. For systems with hydr
2. (3.52)

So if the quantity in Eq. 3.52 remains bounded (which is true if the potential is not dissociative, since k and l refer to the same molecule i), the average in Eq. 3.52 is zero. (Footnote 8: the statement "the molecule or group does not dissociate" is even too restrictive. It is enough to say that the quantity in Eq. 3.52 remains bounded.) Thus we can rewrite the average of Eq. 3.49 as Eq. 3.53. The first term on the right-hand side of that equation can be further developed, obtaining the trivial identity of Eqs. 3.54 and 3.55. Substituting this identity into Eq. 3.53, we get Eq. 3.56. Substituting Eq. 3.56 into Eq. 3.42 leads directly to Eq. 3.41, which completes the proof. As a consequence of the above discussion, it seems likely that both the equilibrium and non-equilibrium properties of the MD system are not affected by coordinate scaling. We shall see later that this is actually the case.

3.4 Liouvillean Split and Multiple Time Step Algorithm for the NPT Ensemble

We have seen in an earlier section that the knowledge of the Liouvillean allows us to straightforwardly derive a multi-step integration algorithm. Thus for simulation in the NPT ensemble the Liouv
3. D. C. Rapaport, The Art of Molecular Dynamics Simulation, Cambridge University Press, Cambridge, UK, 1995.
S. Nosé, in M. Meyer and V. Pontikis, editors, Computer Simulation in Materials Science, page 21, Kluwer Academic Publishers, 1991.
S. Nosé, Prog. Theor. Phys. Supp. 103, 1 (1991).
M. Ferrario, in M. P. Allen and D. J. Tildesley, editors, Computer Simulation in Chemical Physics, page 153, Kluwer Academic Publishers, 1993.
G. J. Martyna, D. J. Tobias, and M. L. Klein, J. Chem. Phys. 101, 4177 (1994).
H. C. Andersen, J. Chem. Phys. 72, 2384 (1980).
M. Parrinello and A. Rahman, Phys. Rev. Letters 45, 1196 (1980).
S. Nosé, Mol. Phys. 52, 255 (1984).
M. Ferrario and J. P. Ryckaert, Mol. Phys. 78, 7368 (1985).
M. E. Tuckerman, C. J. Mundy, and M. L. Klein, Phys. Rev. Letters 78, 2042 (1997).
S. Melchionna, G. Ciccotti, and B. L. Holian, Mol. Phys. 18, 533 (1993).
H. J. C. Berendsen, Lecture notes, unpublished; reported by G. Ciccotti and J. P. Ryckaert, Comp. Physics Report 4, 345 (1986).
E. Paci and M. Marchi, J. Phys. Chem. 104, 3003 (1996).
S. Toxvaerd, Phys. Rev. B 47, 343 (1993).
J. P. Hansen, Molecular dynamics simulation of coulomb systems in two and three dimensions, in Molecular Dynamics Simulation of Statistical Mechanics Systems, Proceedings of the International School of Physics "Enrico Fermi", North Holland Physics, 1986.
H. G. Petersen, J. Chem. Phys. 103, 3668 (1995).
S. J. Stuart, R. Zhou, and B. J. Berne, J. Chem. Phy
4. (6.24)

Thus, for each pair of neighboring ensembles n and m, we generate two collections of instantaneous generalized dimensionless works: $W_1[m \to n]$, $W_2[m \to n]$, etc. and $W_1[n \to m]$, $W_2[n \to m]$, etc. Let us denote the number of elements of such collections with $N_{m \to n}$ and $N_{n \to m}$. $\Delta f_{n \to m}$ can be calculated by solving the equation (see Eq. 27 of Ref. I16)

$$\sum_{i=1}^{N_{n \to m}} \left[ 1 + \frac{N_{n \to m}}{N_{m \to n}} \, e^{W_i[n \to m] - \Delta f_{n \to m}} \right]^{-1} - \sum_{j=1}^{N_{m \to n}} \left[ 1 + \frac{N_{m \to n}}{N_{n \to m}} \, e^{W_j[m \to n] + \Delta f_{n \to m}} \right]^{-1} = 0 \qquad (6.25)$$

that just corresponds to the Bennett acceptance ratio for dimensionless quantities. It is important to point out that Eq. 6.25 is valid for nonequilibrium transformations, no matter how far from equilibrium, and is rigorous only if the initial microstates of the transformations are drawn from equilibrium. Therefore, care should be taken in verifying whether convergence (equilibrium) is reached in the adaptive procedure. It should be noted that Eq. 6.25 is a straightforward generalization of Eq. 8 of Ref. [65], which was specifically derived for systems subject to mechanical changes. Shirts et al. [65] proposed a way of evaluating the square uncertainty (variance) of $\Delta f_{n \to m}$ from maximum likelihood methods, also correcting the estimate for the restriction from a fixed probability of forward and backward work measurements to a fixed number of forward and backward work measurements. They provided a formula for s
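The solution of Eq. 6.25 can be sketched numerically as follows. This is a minimal illustration, not ORAC's implementation: it assumes Eq. 6.25 has the standard Bennett acceptance ratio form for dimensionless works, and the function name `bar_delta_f`, the bisection bracket and the tolerance are hypothetical choices made for the example.

```python
import math


def bar_delta_f(w_forward, w_reverse, lo=-50.0, hi=50.0, tol=1e-10):
    """Solve the Bennett acceptance ratio equation by bisection.

    w_forward: dimensionless works W_i[n->m]
    w_reverse: dimensionless works W_j[m->n]
    Returns the estimate of Delta f (n->m).
    The bracket [lo, hi] is an assumption suited to works of order unity.
    """
    n_f, n_r = len(w_forward), len(w_reverse)
    ratio = n_f / n_r

    def residual(df):
        # Forward sum minus reverse sum; increases monotonically with df.
        fwd = sum(1.0 / (1.0 + ratio * math.exp(w - df)) for w in w_forward)
        rev = sum(1.0 / (1.0 + math.exp(w + df) / ratio) for w in w_reverse)
        return fwd - rev

    if residual(lo) > 0.0 or residual(hi) < 0.0:
        raise ValueError("root not bracketed; widen [lo, hi]")

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For a reversible transformation every forward work equals the free energy difference and every reverse work equals its negative, so the solver should return that value exactly; this is a convenient sanity check before feeding it real work collections.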
5. Read a general formatted parameters file
SYNOPSIS
READ_PRM_ASCII filename
DESCRIPTION
Here filename is the ASCII parameter file. The general formatted force field parameters file is described in Sec. 10.3. In this file one must define each potential energy parameter of the given force field defined in Eq. 4.3. It must be consistent with the general topology file read by READ_TPG_ASCII. The same parameters file can be used for many different solute molecules. This is the reason for the word "general".
EXAMPLES
READ_PRM_ASCII forcefield.prm
Read the general formatted parameters file forcefield.prm.
WARNINGS
Must be used in conjunction with command READ_TPG_ASCII.

READ_TPG_ASCII
NAME
READ_TPG_ASCII - Read a general formatted topology file
SYNOPSIS
READ_TPG_ASCII filename
DESCRIPTION
Here filename is the ASCII topology file. The general topology file is described in Sec. 10.3. It must define each residue contained in the current solute molecule. The same topology file can be used for many different solute molecules. This is the reason for the word "general".
EXAMPLES
READ_TPG_ASCII forcefield.tpg
Read the formatted topology file forcefield.tpg.
WARNINGS
Must be used in conjunction with command READ_PRM_ASCII.

REPL_RESIDUE
NAME
REPL_RESIDUE - Replace or add the topology of a certain residue
SYNOPSIS
REPL_RESIDUE ... END
DESCRIPTION
The command REPL_RESIDUE opens
6. averaged ca: compute the average rms of α-carbons.
averaged heavy: compute the average rms of non-hydrogen atoms.

inst_xrms type_1 type_2 ...
Specifies which instantaneous rms's have to be computed. type_n can be any combination of the four keywords ca, heavy, backbone, allatoms. The keyword all stands for all the preceding keywords simultaneously. The inst_xrms keyword is mandatory when print inst_xrms is specified.

print type nprint OPEN filename
Print the rms's calculation as specified by type to file filename every nprint configurations (see also command DUMP(&INOUT)). The keyword type can be any of the following:
averaged: the full protein (solute) coordinates in pdb format are printed to the file filename with a constant orientation, so that atomic rms's are minimized. Feeding the file directly to rasmol gives a pictorial view of the atomic displacements.
avg_xrms: the time running averages of the rms's are printed, averaged over alpha carbons, backbone atoms and all heavy atoms.
inst_xrms: the instantaneous values of the rms's are printed, averaged over alpha carbons, backbone atoms and all heavy atoms. If this subcommand is specified along with the inst_xrms subcommand, orac also produces the file filename_atm, which contains the final values of the atomic rms's for the atom types (alpha carbon, backbone atoms, etc.) specified in the command inst_xrms.

EXAMPLES
STRUCTURES
print averaged 2 OPEN test.str
print avg_xrms 3 OPEN tes
7. When the time-dependent factor equals 1, the alchemical species disappears according to the mixing rules for the $\lambda_{ij}(t)$, $\eta_{ij}(t)$ factors specified in Table 9.1. These rules are such that the modified alchemical potential is enforced only when one of the two interacting atoms is alchemical, while atom-atom interactions within a given alchemical species are accounted for with the standard potential, or simply set to zero when they refer to atoms on different alchemical species. In general, the time protocols for the $\eta$ (Van der Waals) and $\lambda$ (electrostatic) atomic parameters may differ from each other and for different alchemical species. A simple and sufficiently flexible scheme would be, for example, to allow only two sets of alchemical species, i.e. the species to be annihilated and the species to be created, hence defining two different time protocols for the $\lambda$ and two more for the $\eta$ atomic parameters. Such a scheme allows, for example, the determination of the energy difference when one group in a molecule is replaced by another group in a single alchemical simulation. As remarked by others (Ref. I5), it is convenient in, e.g., an alchemical creation to switch on the Van der Waals parameters first, changing $\eta$ for the alchemical atoms from one to zero, and then charge the system, varying $\lambda$ from one to zero. While for the soft-core Lennard-Jones term and the direct lattice electrostatic term the combination rules described in Table 9.1 can be straightforwardly implemented at a very limited
8. whereas the other comes from the equation applied in the reverse direction (replace n with n + 1 and m with n in Eq. 6.27). The two estimates will be invoked in the acceptance ratio of $n \to n+1$ and $n+1 \to n$ ensemble transitions, respectively (see next point 4). In the former case we need to resort to additional arrays to store $N_{n \to n+1}$ and $W[n \to n+1]$. Separate arrays are necessary because they are subject to different manipulation during the simulation. Specifically, if the condition on the number of stored works is satisfied, then we calculate $\Delta f_{n \to n+1}$ via the corresponding equation. This estimate is employed as such in the acceptance ratio. Then we reset the counter to zero and remove the stored works $W[n \to n+1]$ from computer memory. The same protocol is used to calculate $\Delta f_{n+1 \to n}$ from the analogous quantities for the $n+1 \to n$ direction. The additional arrays introduced here are updated as described at point 2. Note that in this procedure the arrays of step 3a are neither used nor changed. Note also that the procedure described here corresponds to the way of calculating finite free energy differences in the free energy perturbation method [118].

c) If none of the above criteria is met, then optimal weights are not updated and conventional sampling continues. Storage of dimensionless works as described at point 2 continues as well.

We point out that if equilibrium is reached slowly (the case of large viscous systems or systems with very complex free energy landscape) t
9. (Footnote 2: The explicit, i.e. atomistic, solvent introduced in the MD cell is in fact the minimum amount required such that the distance between any two portions of different solute replicas is large enough to assume negligible interprotein interactions. Also, the shape of the MD cell is usually chosen so as to minimize the amount of explicit solvent, whose sole role, at an extremely demanding computational cost, is to provide the correct dielectric medium for the biomolecule, including microsolvation effects. For example, globular, i.e. quasi-spherical, proteins are usually simulated in a dodecahedral box. Such a system, a single solvated protein in PBC, is thus representative of a dilute solution of biomolecules, since the solute molecules in the periodic system can never come close to each other and thereby interact.)

difference. To give just a faint idea of the computational cost involved, for the simple system of decaalanine in vacuo, 1.0 microseconds of serial simulation takes about 10 days on a 2.5 MH processor. Because of these computational bounds, standard molecular dynamics simulations of even small biological systems are in general not ergodic in the accessible simulation time. Typically, the system remains trapped during the whole simulation time span in a local minimum, and the rare event of escaping the trap by surmounting a free energy barrier never happens. In order to overcome such a severe sampling problem, many
10. FREQUENCIES ... END
DESCRIPTION
The following subcommands may be specified within FREQUENCIES: dist_max, no_step, print
• dist_max hdist
The differential increment in Å for the numerical computation of the dynamical matrix. The default is 0.03 Å, which is OK for most systems and force fields.
• no_step steps
Order of the Chebyshev polynomial for the numerical computation of the dynamical matrix. The default is 6, which is OK for most systems and force fields.
• print OPEN filename
Write frequencies and eigenvectors to file filename. If not specified, frequencies are written to the main output file.
EXAMPLES
FREQUENCIES
print OPEN myfreq.out
END

MINIMIZE
NAME
MINIMIZE - Run steepest-descent-like or conjugate gradient minimization at constant volume or at a given pressure
SYNOPSIS
MINIMIZE ... END
DESCRIPTION
Run energy minimization using a method of choice (steepest descent or conjugate gradient). After minimization is done, the dynamical matrix is computed and diagonalized, and the normal frequencies are listed along with the eigenvectors. The following subcommands may be specified within MINIMIZE: CG, SD, WRITE_GRADIENT, AGBNP
CG eps_energy
Use conjugate gradient with energy tolerance eps_energy.
SD eps_energy
Use steepest descent with energy tolerance eps_energy.
WRITE_GRADIENT
Write the final gradient at each atom.
AGBNP
Minimization is done using an AGBNP model [157] for implicit solve
11. If irest = 0, the run is restarted from a previous one. This implies that the directories PARXXXX are present and are equal in number to nprocs, i.e. the number of replicas. If irest ≠ 0, then the run refers to a cold start (from scratch), and:
irest = 1: the scaling factors associated with the intermediate ensembles are derived according to a geometric progression, namely $scale_m = scale^{(m-1)/(nstates-1)}$, where $scale_m$ is the scaling factor for the potential of the ensemble m, with $1 \le m \le nstates$. For example, if scale = 0.6 and nstates = 4, then scale_1 = 1, scale_2 = 0.843433, scale_3 = 0.711379 and scale_4 = scale = 0.6. The nprocs replicas are initially distributed as described in Section 6.3.2 (note: m = 1 is assumed to correspond to the unscaled ensemble).
irest = 2: the scaling factors are read from an auxiliary file called SGE.set that must be present in the directory from which the program is launched (using the mpiexec/mpirun command). This ASCII file has two comment lines at the top and then as many lines as the number of ensembles (nstates), and on each line the three scale factors must be specified.

SGE simulations in the space of collective coordinates. If the parameters scale1, scale2 and scale3 are not specified in the SETUP command, then a SGE simulation in the space of collective coordinates is performed. In such a case the SETUP command is used to define the numbe
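The geometric progression of scaling factors described above can be checked with a few lines of Python. This is only an illustrative sketch of the formula quoted in the text; the function name `sge_scale_factors` is hypothetical and is not part of ORAC.

```python
def sge_scale_factors(scale, nstates):
    """Return the scaling factors for ensembles m = 1..nstates.

    Implements scale_m = scale ** ((m - 1) / (nstates - 1)), so that
    m = 1 is the unscaled ensemble and m = nstates has the lowest factor.
    """
    if nstates < 2:
        raise ValueError("at least two ensembles are needed")
    return [scale ** ((m - 1) / (nstates - 1)) for m in range(1, nstates + 1)]


if __name__ == "__main__":
    # Reproduces the worked example in the text (scale = 0.6, nstates = 4).
    for m, s in enumerate(sge_scale_factors(0.6, 4), start=1):
        print(f"scale_{m} = {s:.6f}")
```

Because the factors form a geometric series, the ratio between neighboring ensembles is constant, which is the usual motivation for this kind of spacing in tempering schemes.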
12. NB: In the last example, valid for parallel runs, the relative path is specified with respect to the actual pwd of the parallel processes.

TRAJECTORY
NAME
TRAJECTORY - Read history file
SYNOPSIS
TRAJECTORY filename
DESCRIPTION
The TRAJECTORY command instructs the program to read the history file produced at an earlier time (see command DUMP in this environment). The auxiliary file filename contains the names of the parameters and history file(s). See also environment &ANALYSIS for retrieving the history file and environment &PROPERTIES for computing properties from history files.
EXAMPLES
TRAJECTORY file.aux

10.2.3 &INTEGRATOR
This environment includes commands defining the integration algorithms to be used during the simulation run. The following commands are allowed: MTS_RESPA, TIMESTEP

MTS_RESPA
NAME
MTS_RESPA - Use a multiple time step integrator
SYNOPSIS
MTS_RESPA ... END
DESCRIPTION
The MTS_RESPA structured command opens an environment which includes several subcommands to define a multiple time step integrator. The MTS_RESPA directive can be specified for NVE simulations and extended system simulations (NPH, NPT and NVT). MTS_RESPA is also compatible with constraints. The following subcommands may be specified within MTS_RESPA: step, dirty, very_cold_start, energy_then_die, k ewald, test times, p_test, s_test
13. (1996).
W. B. Street, D. J. Tildesley, and G. Saville, Mol. Phys. 35, 639 (1978).
O. Teleman and B. Joensonn, J. Comput. Chem. 7, 58 (1986).
M. E. Tuckerman, G. J. Martyna, and B. J. Berne, J. Chem. Phys. 94, 6811 (1991).
M. E. Tuckerman and B. J. Berne, J. Chem. Phys. 95, 8362 (1991).
M. E. Tuckerman, B. J. Berne, and A. Rossi, J. Chem. Phys. 94, 1465 (1990).
H. Grubmuller, H. Heller, A. Winemuth, and K. Schulten, Mol. Simul. 6, 121 (1991).
M. E. Tuckerman, B. J. Berne, and G. J. Martyna, J. Chem. Phys. 97, 1990 (1992).
M. E. Tuckerman, B. J. Berne, and G. J. Martyna, J. Chem. Phys. 99, 2278 (1993).
D. D. Humphreys, R. A. Friesner, and B. J. Berne, J. Phys. Chem. 98, 6885 (1994).
P. Procacci and B. J. Berne, J. Chem. Phys. 101, 2421 (1994).
P. Procacci and M. Marchi, J. Chem. Phys. 104, 3003 (1996).
G. J. Martyna, M. E. Tuckerman, D. J. Tobias, and M. L. Klein, Mol. Phys. 87, 1117 (1996).
P. Procacci and B. J. Berne, Mol. Phys. 83, 255 (1994).
M. Marchi and P. Procacci, J. Chem. Phys. 109, 5194 (1998).
M. Saito, J. Chem. Phys. 101, 4055 (1994).
H. Lee, T. A. Darden, and L. G. Pedersen, J. Chem. Phys. 102, 3830 (1995).
J. A. Barker and R. O. Watts, Mol. Phys. 26, 789 (1973).
J. A. Barker, The problem of long range forces in the computer simulation of condensed matter, volume 9, page 45, NRCC Workshop Proceedings, 1980.
14. As discussed above, during a SGE simulation optimal weights are evaluated using Eq. 6.25, and only temporary values are obtained from Eq. 6.27. Therefore, for each optimal weight, the simulation produces a series of estimates $\Delta f_1, \Delta f_2, \ldots, \Delta f_P$. At a given time, the current value of P depends, on average, on the time and on the update frequency of optimal weights. In this section, for convenience, the subscript in $\Delta f_j$ labels independent estimates. We also know that each $\Delta f_j$ value is affected by an uncertainty quantified by the associated variance $\delta^2(\Delta f_j)$, calculated via Eq. 6.26. We can then write the optimal estimator of $\Delta f$ as a weighted sum of the individual estimates [128]:

$$\overline{\Delta f} = \frac{\sum_{j=1}^{P} [\delta^2(\Delta f_j)]^{-1} \, \Delta f_j}{\sum_{j=1}^{P} [\delta^2(\Delta f_j)]^{-1}} \qquad (6.28)$$

Note that independent estimates with smaller variances get greater weight, and if the variances are equal then the estimator $\overline{\Delta f}$ is simply the mean value of the estimates. The uncertainty in the resulting estimate can be computed from the variances of the single estimates as

$$\delta^2(\overline{\Delta f}) = \left[ \sum_{j=1}^{P} [\delta^2(\Delta f_j)]^{-1} \right]^{-1} \qquad (6.29)$$

The ORAC program allows one to calculate $\overline{\Delta f}$ using either all available estimates or a fixed number of estimates taken from the latest ones.

Chapter 7
Metadynamics Simulation: history-dependent algorithms in Non-Boltzmann sampling

If we are studying a prototypical elementary reaction in which two stable states are separated by a high free energy barrier $\Delta A \gg k_B T$ along the reaction coor
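The weighted estimator of Eqs. 6.28 and 6.29 is the standard inverse-variance average. A minimal sketch follows; the function name `combine_estimates` is hypothetical, and the code assumes the estimates are independent, as stated in the text.

```python
def combine_estimates(estimates, variances):
    """Inverse-variance weighted mean of independent estimates.

    estimates: list of Delta f_j values
    variances: list of the associated variances (all must be > 0)
    Returns (weighted mean, variance of the weighted mean).
    """
    if len(estimates) != len(variances) or not estimates:
        raise ValueError("need equally sized, non-empty lists")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    mean = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    return mean, 1.0 / total_weight
```

With equal variances the result reduces to the plain arithmetic mean, and the combined variance is the single-estimate variance divided by the number of estimates, matching the remark after Eq. 6.28.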
15. BPTI. Frequencies were calculated according to Eqs.; notice that almost all the proper torsions fall below 350 cm$^{-1}$. An efficient and simple separation of the intramolecular AMBER potential [26] assigns all bendings, stretchings and the improper or proper torsions involving hydrogen to a fast reference system labeled n0, and all proper torsions to a slower reference system labeled n1. The subdivision is then

$$V_{n0} = V_{stretch} + V_{bend} + V_{i\text{-}tors} + V^{(h)}_{p\text{-}tors}, \qquad V_{n1} = V_{p\text{-}tors} \qquad (4.19)$$

where with $V^{(h)}_{p\text{-}tors}$ we indicate proper torsions involving hydrogen. For the reference system $V_{n0}$, the hydrogen stretching frequencies are the fastest motions and the $\Delta t_{n0}$ time step must be set to 0.2-0.3 fs. The computational burden of this part of the potential is very limited, since it involves mostly two- or three-body forces. For the reference system $V_{n1}$, the fastest motion is around 300 cm$^{-1}$ and the time step $\Delta t_{n1}$ should be set to 1-1.5 fs. The computational effort for the reference system potential $V_{n1}$ is more important because of the numerous proper torsions of complex molecular systems, which involve more expensive four-body force calculations. One may also notice that some of the bendings which were assigned to the n0 reference system fall in the torsion frequency region and could therefore be integrated with a time step much larger than $\Delta t_{n0}$ = 0.2-0.3 fs. However, in a multiple time step integration this overlap is just inefficient, but certainly not dangerous. Indeed no
16. C. Micheletti and M. Parrinello, J. Phys. Chem. B 110, 3533 (2006).
B. Isralewitz, M. Gao, and K. Schulten, Curr. Op. Struct. Biol. 11, 224-230 (2001).
D. J. Evans and D. J. Searls, Phys. Rev. E 50, 1645-1648 (1994).
M. Sprik and G. Ciccotti, J. Chem. Phys. 109, 7737-7744 (1998).
S. Park and K. Schulten, J. Chem. Phys. 120, 5946-5961 (2004).
R. H. Wood and W. C. F. Muehlbauer, J. Phys. Chem. 95, 6670-6675 (1991).
R. Chelli and P. Procacci, Phys. Chem. Chem. Phys. 11, 1152-1158 (2009).
M. R. Shirts and V. S. Pande, Solvation free energies of amino acid side chain analogs for common molecular mechanics water models, J. Chem. Phys., page 134508 (2005).
M. R. Shirts, J. W. Pitera, W. C. Swope, and V. S. Pande, Extremely precise free energy calculations of amino acid side chain analogs: Comparison of common molecular mechanics force fields for proteins, J. Chem. Phys. 119, 5740-5761 (2003).
In the formulation of Eq. QI we have implicitly assumed the so-called tin-foil boundary conditions: the Ewald sphere is immersed in a perfectly conducting medium, and hence the dipole term on the surface of the Ewald sphere is zero. S. W. deLeeuw, J. W. Perram, and E. R. Smith, Proc. R. Soc. London A 373, 27 (1980).
See for example the GROMACS manual and the tutorial for alchemical calculations, "Hands-on tutorial: Solvation free energy of ethanol", available at http www gromacs o
17. DESCRIPTION
As described elsewhere in this manual, ORAC can handle the electrostatic interactions by Ewald summation. If the argument to the command is on, followed by alphal and rkcut, standard Ewald is used with $\alpha$ = alphal in Å$^{-1}$ and the cutoff in k space $k_{cut}$ = rkcut in Å$^{-1}$. The output will show what degree of convergence has been reached, reporting the numerical value of $\mathrm{erfc}(\alpha r_{cut})/r_{cut}$ and of the corresponding exponential convergence factor at $k_{cut}$. To choose PME instead, the argument pme alphal fft1 fft2 fft3 order must be given. Here alphal has the same meaning as before, while order is the order of the spline and fft1, fft2, fft3 define the three-dimensional grid in direct space, providing the number of bins along the a, b, c crystal axes, i.e. the dimensions of the 3-D FFT used in the PME method. For best efficiency fft1, fft2 and fft3 must be multiples of 2, 3 or 5. Of course, if the argument is off, no Ewald summation is used for the electrostatic interactions. When using PME, Newton's third law is not obeyed exactly, but only to the numerical accuracy of the interpolation. This leads to a small momentum of the MD cell, which can be removed by specifying the argument REMOVE_MOMENTUM.
EXAMPLES
1. EWALD on 0.4 2.0
Standard Ewald is used with parameters $\alpha$ = 0.4 Å$^{-1}$ and $k_{cut}$ = 2.0 Å$^{-1}$.
2. EWALD pme 0.4 45 32 45 6
EWALD REMOVE_MOMENTUM
The electrostatic interaction is handled by PME with $\alpha$ = 0.4 Å$^{-1}$, the order of the spline is 6, and the number of bins for defi
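The grid-size recommendation for fft1, fft2 and fft3 can be checked before launching a run. The sketch below is not part of ORAC; it assumes that "multiples of 2, 3 or 5" means grid sizes whose only prime factors are 2, 3 and 5 (the usual requirement for mixed-radix FFTs), and the function name `has_only_small_prime_factors` is hypothetical.

```python
def has_only_small_prime_factors(n, primes=(2, 3, 5)):
    """Return True if n is a product of powers of the given primes only."""
    if n < 1:
        return False
    for p in primes:
        # Divide out every factor of p.
        while n % p == 0:
            n //= p
    return n == 1


if __name__ == "__main__":
    # Grid sizes taken from the PME example in the text: 45 32 45.
    for size in (45, 32, 45):
        print(size, has_only_small_prime_factors(size))
```

Under this reading, the example grid 45 x 32 x 45 qualifies (45 = 3 x 3 x 5, 32 = 2^5), whereas a size such as 14 would not because of the factor 7.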
18. P. Ewald, Ann. Phys. 64, 253 (1921).
S. W. deLeeuw, J. W. Perram, and E. R. Smith, Proc. R. Soc. London A 373, 27 (1980).
T. Darden, D. York, and L. Pedersen, J. Chem. Phys. 98, 10089 (1993).
U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee, and L. G. Pedersen, J. Chem. Phys. 101, 8577 (1995).
R. W. Hockney, Computer Simulation Using Particles, McGraw-Hill, New York, 1989.
H. G. Petersen, D. Soelvanson, and J. W. Perram, J. Chem. Phys. 101, 8870 (1994).
L. Greengard and V. Rokhlin, J. Comput. Phys. 73, 325 (1987).
J. Shimada, H. Kaneko, and T. Takada, J. Comput. Chem. 15, 28 (1994).
R. Zhou and B. J. Berne, J. Chem. Phys. 103, 9444 (1996).
Y. Duan and P. A. Kollman, Pathways to a protein folding intermediate observed in a 1-microsecond simulation in aqueous solution, Science 282, 740-744 (1998).
R. H. Swendsen and J. S. Wang, Phys. Rev. Lett. 57, 2607 (1986).
C. G. Geyer, in Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface, edited by E. M. Keramidis, page 156, 1991.
E. Marinari and G. Parisi, Europhys. Lett. 19, 451 (1992).
K. Hukushima and K. Nemoto, J. Phys. Soc. Jpn. 65, 1604 (1996).
Y. Okamoto, J. Mol. Graphics Modell. 22, 425 (2004).
A. P. Lyubartsev, A. A. Martsinovski, S. V. Shevkunov, and P. N. Vorontsov-Velyaminov, J. Chem. Phys. 96, 1776 (1992).
S. Rauscher, C. Neale, and R. Pomès, J. Chem. Theory Comput. 5, 2640 (2009).
S. Park, Phys. Rev. E 77, 016709 (2008).
C. Zhang and J. Ma, J. Chem. P
19. Steer along an arbitrary curvilinear coordinate or perform alchemical transformation
SYNOPSIS
STEER_PATH OPEN filename
DESCRIPTION
This command runs an MD simulation by steering the system along an arbitrary curvilinear path, with an arbitrary time protocol, in an n-dimensional coordinate space (see Sec. 8.3). This curvilinear coordinate and time protocol can be given in terms of time-dependent added stretching, bending and torsion external potentials, to be specified in the file filename. The format of this file is shown in Table 8.3. Refer to Sec. 8.3 for more details.
or
STEER_PATH ALCHEMY filename
DESCRIPTION
This command allows alchemical transformations using a time protocol specified in the file filename. See the DEFINE_ALCHEMICAL_ATOM command for details on alchemical transformations.

STRETCHING
NAME
STRETCHING - Include stretching potentials
SYNOPSIS
STRETCHING [HEAVY]
DESCRIPTION
This command assigns a harmonic stretching potential between covalently bonded atoms in the solute. Without argument, stretching potentials are assigned to all possible covalently bonded pairs. If the argument HEAVY is provided, bonds involving hydrogens are kept rigid and stretching potentials are assigned only to bonded pairs involving non-hydrogen atoms.
DEFAULT
Constraints on all bonds is the default.

UPDATE
NAME
UPDATE - Assign parameters for the computation of the Verlet n
20. (5.22)

The solute potential includes all the proper torsions and the 1-4 non-bonded interactions involving the n atoms of the solute. The solute-solvent interaction involves all non-bonded interactions between the N - n solvent atoms and the n solute atoms. The solvent-solvent interaction involves all non-bonded interactions among the N - n solvent atoms. As one can see, the global fast bonded potential $V_{bd}(X_N) = V_{bonds} + V_{angle} + V_{i\text{-}tors}$ is assimilated to a solvent-solvent contribution. It should also be remarked that the reciprocal lattice contribution $V_{qr}$, i.e. the long-range electrostatics, is in any case assigned to the solvent-solvent term, even if it includes all kinds of non-bonded interactions (solute-solute, solvent-solute and solvent-solvent). The reason why $V_{qr}$ is not split into the solute-solute, solute-solvent and solvent-solvent components is both physical and practical. Firstly, the long-range potential associated with each of the three components of this term is expected in general to be rather insensitive along arbitrary reaction coordinates, such that a scaling of $V_{qr}$ does not correspondingly produce a significant heating of any conformational coordinate. Secondly, in the Particle Mesh approach, the solute-solute, solvent-solute and solvent-solvent contribution to V r
21. computational cost in a standardly written force routine, the same rules cannot be directly applied to the reciprocal lattice part. In common implementations of the Ewald method, for obvious reasons of computational convenience, the reciprocal lattice space double sum is rewritten in terms of squared charge-weighted structure factors as

$$V_{qr} = \frac{1}{2 \pi V} \sum_{\mathbf{m} \neq 0} \frac{\exp(-\pi^2 |\mathbf{m}|^2 / \alpha^2)}{|\mathbf{m}|^2} \, S(\mathbf{m}) S(-\mathbf{m}) \qquad (9.2)$$

In a system subject to a continuous alchemical transformation, the charge-weighted structure factor becomes a function of the atomic factors $\lambda_i(t)$:

$$S(\mathbf{m}, \lambda) = \sum_{i}^{N} [1 - \lambda_i(t)] \, q_i \exp(2 \pi i \, \mathbf{m} \cdot \mathbf{r}_i) \qquad (9.3)$$

In the PME method, the sum of Eq. 9.3 is done via FFT by smearing the atomic charges on a regular grid in the direct lattice [84]. In this approach, all charge-charge interactions between alchemical and non-alchemical species are almost inextricably mixed in the PME Ewald reciprocal lattice contribution, and the application of the rules reported in Table 9.1 requires an extra effort, indeed an effort that has apparently deterred many from using the full Ewald method for computing the work done during continuous alchemical transformations. To this end, and with no loss of generality, it is convenient to divide the system into an alchemical "solute" and a non-alchemical "solvent", with only the former being externally driven. We then label with $q_i(t)$ and $Q_i$ the time-dependent alchemical charges an
22. only entropy is produced in the non-equilibrium process, while in the former the system can also cross different thermodynamic states, i.e. the underlying PMF can also be non-zero, such that thermodynamic work can also be done. (Footnote 1: The potential of mean force is defined as $W(z) = -k_B T \ln P(z)$, where $P(z) = \langle \delta(z - z(\mathbf{r})) \rangle$ is the probability to find the system at the value $z(\mathbf{r}) = z$ of the reaction coordinate, independently of all the other coordinates.) The Crooks theorem (CT) reads

$$\frac{p(\Gamma_{z_0} \to \Gamma_{z_\tau})}{p(\Gamma_{z_0} \leftarrow \Gamma_{z_\tau})} = \exp[\beta (W_{\Gamma_{z_0} \to \Gamma_{z_\tau}} - \Delta F)] \qquad (8.2)$$

Figure 8.1: Physical significance of the Crooks theorem for a general driven process. For nearly reversible processes (left), the forward $P_F(W)$ and backward $P_B(-W)$ work distributions overlap significantly. The dotted line is the backward work distribution for the inverse process without changing the sign of the work. The crossing of the two solid distributions occurs at the free energy value for the forward process, $\Delta F = 18$. When the process is done faster (right panel), the dissipation $W_d$ both in the forward and in the backward process is larger, the overlap is negligible, and the crossing point of the two solid distributions can no longer be easily identified.

where $\tau$ is the duration time of the driven non equilibriu
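The footnote definition of the potential of mean force, $W(z) = -k_B T \ln P(z)$, translates directly into a histogram estimate from sampled values of the reaction coordinate. The sketch below is only an illustration: the function name `pmf_from_samples`, the binning scheme and the choice of reduced units ($k_B T = 1$ by default) are assumptions, and the probability per bin is used instead of a density, which shifts the PMF by an irrelevant additive constant.

```python
import math
from collections import Counter


def pmf_from_samples(z_values, bin_width, kT=1.0):
    """Histogram estimate of W(z) = -kT ln P(z).

    z_values: sampled values of the reaction coordinate
    bin_width: width of the histogram bins
    Returns a dict mapping bin centers to PMF values (in units of kT's unit).
    Bins that were never visited are absent (their PMF would be infinite).
    """
    counts = Counter(math.floor(z / bin_width) for z in z_values)
    total = len(z_values)
    return {
        (index + 0.5) * bin_width: -kT * math.log(count / total)
        for index, count in counts.items()
    }
```

Only differences between PMF values are physically meaningful here: a bin visited three times as often as another lies $k_B T \ln 3$ lower in free energy.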
23. real numbers) are specified in the SETUP command, then a Hamiltonian SGE simulation with total or partial scaling of the potential energy is performed (simulated-tempering-like and solute-tempering-like simulations, respectively). In such a case the SETUP command is used to define the number of ensembles, nstates (integer number), and the lowest scaling factor (i.e. the highest temperature), that of the last ensemble. The number of replicas in the SGE simulations is equal to the number of processors passed to the MPI routines (nprocs). At variance with REM, nprocs need not be equal to nstates. The restart option of a SGE simulation is controlled by irest (integer number). The three parameters scale1, scale2 and scale3 can be different and refer to scaling features of different parts of the potential energy: scale1 refers to the bending, stretching and improper torsional potentials; scale2 to the proper torsional potential and to the 1-4 non-bonded interactions; and scale3 refers to the non-bonded potential. IMPORTANT NOTE: when the Ewald summation is used together with the command SEGMENT(&SGE), scale3 scales only the direct (short-ranged) part of the electrostatic interactions, and the long-ranged (reciprocal) part has a scaling factor of 1, i.e. these interactions are not scaled. If scale1 = scale2 = scale3, then an equal scaling is applied to all parts of the potential (it corresponds to a simulated tempering simulation
24. step type n [r] [hl] [dr] [reciprocal]
The command step is used to define the potential subdivision and the corresponding time steps. The string type can be either intra or nonbond: in the former case the command defines an intramolecular shell, whereas in the latter a nonbonded shell is defined. If intra is specified, only one keyword is expected, i.e. the integer n. When two subcommands of the type step intra n are entered, the first is assumed to refer to the faster intramolecular subsystem (the $V_{n0}$ subsystem as defined in Eq. 4.3, with n = n0), and the second is assumed to define the slower intramolecular subsystem (the $V_{n1}$ subsystem, defined likewise with n = n1). If only one subcommand step intra n is entered, then n0 is set to 1 and n1 = n. If no step intra n subcommand is given, then n1 = n0 = 1. If the first argument of the step subcommand is the string nonbond, then at least an integer and a real are expected. The integer n is the time step dividing factor of the nonbonded shell, while the real argument equals the shell upper radius. Two more optional real arguments can be defined, i.e. the healing length at the upper shell radius and the corresponding neighbor list offset. The default values of the healing length are [...]. As for the intra shell, the more rapidly varying nonbonded shells are entered first. If three step nonbond subcommands are entered, then the first refers to $V_m$, the second to $V_l$, and t
25. the remaining direct space interactions from 7 3 8 3 A to cutoff distance As the simulations proceeds the particles seen by a target particle may cross from one region to an other while the number of two body contacts in one distance class 18 or reference system potential must be continuously updated Instabilities caused by this flow across potential shell boundaries are generally handled by multiplying the pair potential by a group based switching function PREI Thus at any distance r the direct space potential V can be written schematically 3Here the word group has a different meaning that in Sec B and stands for sub ensemble of contiguous atoms defined as having a total charge of approximately zero Electrostatic Interactions 38 as VEVEWAV 4 40 with H va 4 41 V2 V S2 S 4 42 V Wise S 4 43 4 44 where S is the switching function for the three shells 7 m l h defined as 1 Ryj 1 lt R lt R S R 4 SY Ry lt R lt Ry 4 45 0 R j lt h Here R is the intergroup distance and A is the healing interval for the j th shell While Ro is zero Table 4 1 Potential breakup and relative time steps for complex systems with interactions modeled by the AMBER B force field and electrostatic computed using the SPME method Component Contributions Spherical Shells Time step Vao VowetchtVbenat Atw 033 fs Vi_tors Ves Vn Vp tors Via Atni 1 0 fs Ve Vy F jay Dars Ai S20 fe Vi V
26. total energy, number of particles and volume are conserved. The derivation based on the Liouvillean and the corresponding propagator, however, lends itself to a straightforward generalization to non-microcanonical ensembles. Simulations of this kind are based on the concept of the extended system and generate trajectories that sample the phase space according to a target distribution function. The extended system method is reviewed in many excellent textbooks and papers, to which we refer for a complete and detailed description. Here it suffices to say that the technique relies on the clever definition of a modified or extended Lagrangian which includes extra degrees of freedom related to the intensive properties (e.g. pressure or temperature) one wishes to sample with a well-defined distribution function. The dynamics of the extended system is generated in the microcanonical ensemble with the true n degrees of freedom and, additionally, the extra degrees of freedom related to the macroscopic thermodynamic variables. With an appropriate choice, the equations of motion of the extended system produce trajectories in the extended phase space that generate the desired equilibrium distribution function upon integration over the extra (extended) variables. There are several extended system techniques corresponding to various ensembles, e.g. constant-pressure simulation in the NPH ensemble with isotropic and anisotropic [92] stress, constant-temperature simulation [93] i
27. we remark that the free energy differences estimated via Eq. 6.27 tend to give larger acceptance rates than the exact free energy differences, thus favoring transitions toward the ensemble that has not been visited. This is a well-known biasing effect of exponential averaging [127], which leads to an artificially low mean dissipated dimensionless work. As a matter of fact, this is a positive effect, since it makes ensemble transitions easier during the equilibration phase of the simulation. In the above discussion we have not mentioned the number M of independent replicas that may run in the space of the N ensembles. In principle, M can vary from one to infinity, depending on the available computing facilities. The best performance is obtained if there is a one-to-one correspondence between replicas and computing processors. A rough parallelization could be obtained by performing M independent simulations and then collecting the data from the replicas at the end of the simulation to get augmented statistics. However, the calculation of the optimal weights is much improved if they are periodically updated on the fly on the basis of the data drawn from all replicas. This is just what ORAC does. In this respect, we notice that our version of the multiple-replica SGE algorithm is also suited to work efficiently in distributed computing environments. The phase of the simulation where information is exch
28. 2 Properties of liquid water computed from a 200 ps simulation at 300 K and 0.1 MPa on a sample of 343 molecules in PBC: with accurate Ewald (α = 0.35 Å⁻¹, k_c = 2.8 Å⁻¹) and no correction (column EWALD); with inaccurate Ewald (α = 0.35 Å⁻¹, k_c = 0.9 Å⁻¹) but including the correction; and with no Ewald and a cutoff at 10.0 Å. R_O-O is the distance corresponding to the first peak in the oxygen-oxygen pair distribution function. Chapter 5: The Hamiltonian Replica Exchange Method. 5.1 Temperature REM. The Replica Exchange Method is based on multiple concurrent (parallel) canonical simulations that are allowed to occasionally exchange their configurations. For a system made of N atoms, by "configuration" we mean a state defined by a 3N-dimensional coordinate vector, independent of the momenta. Thus, in a replica exchange, only coordinates and not momenta are exchanged. In the standard implementation of the methodology, each replica, bearing a common interaction potential, is characterized by a given temperature, and configurations between pairs of replicas are tentatively exchanged at prescribed time intervals using a probabilistic criterion. The target temperature, i.e. the temperature corresponding to the thermodynamic state of interest, is usually the lowest among all replicas. In this manner, "hot" configurations from hot replicas, i.e. configurations where energy barriers are easily crossed, may be occasionally accepted at the target temper
29. 3 74 right for the rightmost propagator Note that Cana ez 3 75 left right Inserting these approximations into 8 71 the overall integrator is found to be time reversible and second order Time reversible integrators are in fact always even order and hence at least second order 67 Therefore the overall molecular and atomic or group discrete time propagators are given by eibmoiAt _ pile pilus pile pily p pile t ciGoAtoyn x xo pile SP ily AB ciL AP ilu AP pil AP 3 76 gilatomAt _ etls Ati cily H pile AA 2 x cibuAto 2 pil Ato 2 eiG0Ato ciLzAto 2ciLuAto 2 n X xo eile r pilya ib Sp 3 77 The propagator e 4 0 defined in Eq 8 62 is further split according to the usual velocity Verlet breakup of Eq 2 22 Note that in case of molecular scaling the slow coordinates S h 7 move with constant velocity during the n small times steps since there is no fast force acting on them in the inner integration The explicit integration algorithm may be easily derived for the two propagators in Eqs and using the rule in Eq 2 23 and its generalization ev f y flyet cP f y Ey 3 78 Where a and a are a scalar and a matrix respectively The exponential matrix a on the right hand side of Eq is obtained by diagonalization of a As stated before the dynamics generated by Eqs 8 233 27 or 8 353 38 in the NPT ensemble is not Hamiltonian and hence we cannot speak o
30. 5 26 k 1 F where we have collapsed the two indices k for the configurations and m for the replicas in one single index k running on all the configurations N X Nm produced by the REMD Except for an arbitrary multiplicative factor Eq 5 26 can be solved iteratively for the partition function Zn At the beginning of the process one sets Z 1 for all i At the iteration 7 1 we have that Zn i 1 2 Wil Lp 4 5 27 where the weights depend on the Z calculated at the previous iteration e ben v te Wn 2p i S Ne raO berver 5 28 and we have used the definition of for the replica distribution in ORAC Once the Z have been determined Z for an arbitrary distribution P X can be calculated using the configurations sampled in the REMD simulation yo Pale Z 3 SZ MP 2 W xx 5 29 Setting for example P P X A X where P X is the target distribution e Z i e that of the replica 1 and A x is an arbitrary configurational property we obtain PX A X dX _ f P X dX _ Z lt gt 1 Z Z Ay 5 30 From Eq 5 29 we have that Z x Wi a A a and that Z eu Wi 2 Substituting these results into Eq 5 30 we obtain N W A ZAS Lr Wiler Alea 5 31 Jp Wilze where the weights W for all sampled points in the REMD simulation are given by e 8V er Wi Lk i S Nie In Z i Ber v rK 5 32 In summary using the configurational energies fro
31. By specifying the directive corrected, ORAC corrects for the reciprocal lattice cutoff for all intermolecular interactions in the direct lattice, using the same oscillating potential that is used for correcting the intramolecular potential (see ERF_CORR in this environment). This allows the use of shorter cutoffs in reciprocal space or coarser grids in SPME. The argument rcut corresponds to the maximum distance for the spline table and must be larger than the current cutoff (see examples).
EXAMPLES
ERFC_SPLINE 0.01
A B-spline is used to evaluate the direct space sum. To evaluate the B-spline, the original function is computed on a grid of 0.01 bin size.
ERFC_SPLINE 0.01 corrected 14.0
The splined potential is now given by the standard direct lattice Ewald term plus the χ(r, α) potential (see also command ERF_CORR in this environment). The B-spline look-up table is built for distances 0 < r < 14.
WARNINGS
rcut is an atomic cutoff. Always define rcut large enough to ensure that all atoms are included within rcut for any molecular pair. E.g., if r_h is the largest cutoff defined in the structured command MTS_RESPA(&INTEGRATOR) and the molecule has a maximum extension in any possible direction of ΔR, choose rcut of at least r_h + ΔR.
EWALD
NAME
EWALD - Determine if standard Ewald or particle mesh Ewald sum must be used
SYNOPSIS
EWALD off
EWALD ON alphal rkcut
EWALD PME alphal fft1 fft2 fft3 order
EWALD REMOVE_MOMENTUM
32. Preface. This manual is for release 5.3 of the program ORAC. In this new release many improvements have been included. Here we only mention the most important new features:
• ORAC may now be run in parallel using the standard message passing interface libraries (OpenMPI, mpich2). The parallelism allows one to run Hamiltonian replica exchange simulations and multiple-walker metadynamics simulations. The REM algorithm may be implemented in a solute tempering fashion, allowing potential scaling only for a limited, user-selected part of the simulated system.
• ORAC can run steered molecular dynamics non-equilibrium trajectories with on-the-fly work evaluation. The driven coordinate can be any combination of intramolecular coordinates (stretching, bending and torsions). This feature allows one to compute the free energy profile (PMF) along the selected reaction coordinate using the non-equilibrium Jarzynski and Crooks theorems. Steered molecular dynamics can also be done by varying the temperature of the Nosé bath, doing on the system an adimensional work computed according to the generalized Crooks theorem.
• Minimization routines have been improved by providing the possibility of minimizing only part of the solute while keeping all other degrees of freedom frozen.
• Several ancillary programs are included in this distribution for post-analysis of steered molecular dynamics and replica exchange simulation data.
The present manu
33. Cm X Cn b B em en v X w X 5 16 A blem cn V X v k BY EP MX Vx 5 17 There is considerable freedom in the splitting of the potential Eq and in the selection of the corresponding scaling factors These factors are always positive and can be either smaller or greater than one meaning that the corresponding potential contributions for m gt 1 imply a heating and a cooling respectively of the involved degrees of freedom For example we could use c lt 1 for torsions and and c gt 1 for bending so that with increasing m torsional degrees of freedom are heated up while bending are frozen down Global scaling In the present implementation of ORAC one can do a global subdivision i e ignoring the distinction between solvent and solute of the overall atomistic interaction potential for biomolecular system according to the following Vin X ce VBonds Vangle Vi teie of Viors Via 5 18 e Veaw Vor Vaa 5 19 where the meaning of the subscripts is given in Sec 4 1 Typically one then sets of 1 Vm as there is little advantage for conformational sampling in exchanging configurations involving stiff degrees of freedom such as bending stretching and improper torsion On the other hand conformational transitions in proteins are mainly driven by torsional and intraprotein and protein solvent non bonded interactions It is thus convenient to heat up these degrees o
34. Input to ORAC: Force Field and Topology Files. Compared to molecular liquids, simulating any complex macromolecule poses additional problems due to the covalent structure of the system and to the related complexity of the potential force field. ORAC builds the covalent topology needed to evaluate the potential energy from the structure of its constituents. In the case of a protein, the constituents are the amino acids. ORAC also tries to minimize the size and complexity of the actual input needed to construct this topology. In practice, the minimal information to be provided in order to describe the residue topology is the constituent atoms, the covalent bonds and, in the case of polymers or biopolymers, the terminal atoms used to connect the unit to the rest of the chain. In addition, in order to assign the correct potential parameters to the bonds, bendings and torsions of the residue, the type of each atom needs to be specified. Finally, each atom type must correspond to a set of non-bonded parameters. When the bonding topology of the different residues contained in the solute molecule(s) is known, the units are linked together according to their occurrence in the sequence. In this fashion, the total bonding topology is obtained. From this information, all possible bond angles are collected by searching for all possible pairs of bonds which share one atom. Similarly, by selecting all pairs of bonds linked to each other by a distinct bond torsion
35. PMF in the segment zo z doing only the two work measurements from zo to z and back We first rewrite the Crooks equation Eq 8 2 as follows pr T pr e9 WA 8 7 where pr py are the probability to observe a particular trajectory I in the forward and reverse process respectively and T indicate the time trajectory taken with inverted time schedule Eq 8 7 trivially implies that lt For lt FfeilW 4F 5 8 8 lt F gt r aya Se 8 9 where F F T F is an arbitrary functional of the trajectory T and of its inverted time schedule counterpart I Using Eq B7 we thus can combine the direct estimate of pp T with the indirect estimate of the same quantity obtained from p T This latter according to Eq must be unbiased with the weight factor corresponding to the exponential of the dissipated work in the forward measurement If the direct and indirect Eq estimates are done with np forward measurements and np reverse measurements respectively the optimal minimum variance combination of these two estimates of pr T is done according to the WHAM formula 53 nepr l nrpr C prl 8 10 Here W is the work done in the full IT path from the end point at t 0 to the end point at t 7 We now calculate the average of the trajectory functional e7PWo at intermediate times 0 lt t lt T using the optimized above density Taking the average of this functional over forward T and reverse work measu
36. Concerning the temperature spacing, we have seen that the acceptance probability for an exchange is larger the larger the overlap of the two energy distributions of the two contiguous replicas, i.e. the closer the temperatures. Of course, the closer the temperatures, the larger the number of replicas to be simulated, i.e. the heavier the CPU cost of the simulation. For an optimal choice we thus set $E_{m+1} - E_m = \sigma_{E_m}$ (5.11), where $E_m$ and $\sigma_{E_m}$ are the mean energy and the standard deviation of the energy distribution for the m-th replica. Assuming then that the system can be described by an ensemble of N harmonic oscillators, we have that $E_m = N k_B T_m$ and $\sigma_{E_m} = c N^{1/2} k_B T_m$. Substituting these values in Eq. 5.11, we obtain the temperature spacing for optimal superposition: $T_{m+1} - T_m = c\, T_m / N^{1/2}$ (5.12). In the parallel implementation of the temperature REM, in order to keep the communication overhead at the lowest possible level, we exchange the temperatures and not the configurations. So the m-th slave process may explore the entire range of temperatures. When the m-th slave process periodically writes out the coordinates of the configuration (typically in pdb or xyz format), one must also keep track of the current temperature (the program does this automatically) in order to be able to reconstruct a posteriori the true m-th temperature configurational space of the m-th replica. In Fig. we show a typical parallel REM
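If the harmonic estimate above is read as a constant relative spacing, T_{m+1} = T_m (1 + c/√N), the replica temperatures form a geometric ladder. The following is a minimal sketch under that assumption; the function name and the default c = 1 are illustrative choices, not ORAC parameters.

```python
import math


def temperature_ladder(t_min, n_oscillators, n_replicas, c=1.0):
    """Geometric temperature ladder T_m = t_min * (1 + c / sqrt(N))**m.

    Assumes the constant-relative-spacing rule T_{m+1} - T_m = c * T_m / sqrt(N);
    c is an unspecified constant of order one (hypothetical default of 1.0).
    """
    if t_min <= 0 or n_oscillators <= 0 or n_replicas < 1:
        raise ValueError("t_min and n_oscillators must be positive, n_replicas >= 1")
    ratio = 1.0 + c / math.sqrt(n_oscillators)
    return [t_min * ratio ** m for m in range(n_replicas)]


if __name__ == "__main__":
    # Larger systems (larger N) need more closely spaced temperatures.
    for n in (100, 10000):
        print(n, [round(t, 2) for t in temperature_ladder(300.0, n, 4)])
```

The sketch makes the cost argument visible: the spacing shrinks as 1/√N, so covering a fixed temperature range requires more replicas as the system grows.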
37. THERMOS NAME THERMOS Run with Nos thermostats for NVT or NPT simulations SYNOPSIS THERMOS END DESCRIPTION For a faster and better energy equipartition ORAC uses three thermostats The first coupled to the center of mass momenta of all molecules in the system the second coupled to the momenta of the atoms of the solute if present and the third coupled to the momenta of solvent atoms if present The following subcommands may be specified within THERMOS cofm defaults solute solvent temp_limit e cofm freq mass Specify the mass of the barostat coupled to the centers of mass of the molecules This mass is also assigned to the barostat coupled to the box momenta in NPT simulation in case STRESS or ISOSTRESS have been specified Actually what is entered with the variable freq mass is the approximate frequency of oscillation of the thermostat The actual mass W in units of mass times a length to the power of two of the barostat may be recovered according to the relation freq mass 2NkpT W 2 79 defaults Use defaults value for mass variables The defaults are freq mass solute freq mass solvent freq mass 30 0 solute freq mass solute Specify mass units of cm7 1 of the barostat coupled to the momenta of the solute atoms solvent freq mass solvent Specify mass units of cm7 1 of the barostat coupled to the momenta of the solvent atoms temp limit maxtemp Specify maximum temperature allowed
38. a dielectric constant that matches that of the inner sphere. The technique has been proved to give results identical to those obtained with the exact Ewald method in Monte Carlo simulations of dipolar spherocylinders, where the dielectric constant that enters the reaction field is updated periodically according to the value found in the sphere. The reaction field method does, however, suffer from two major disadvantages that strongly limit its use in MD simulations of complex systems at the atomistic level. First, during time integration the system may experience instabilities related to the circulating dielectric constant of the reaction field and to the jumps into the dielectric of molecules in the sphere with explicit interactions. The other problem, maybe more serious, is that again the method requires a priori knowledge of the system, namely the dielectric constant. In pure phases this might not be a great problem, but in inhomogeneous systems such as solvated proteins the dielectric constant might not be easily available. Even with the circulating technique, an unreliable initial guess of the unknown dielectric constant can strongly affect the dielectric behavior of the system and, in turn, its dynamical and structural state. The electrostatic series can be computed, in principle exactly, using the Ewald re-summation technique [31, 32]. The Ewald method rewrites the electrostatic sum for the periodic system in terms of two absolutely convergent
39. adaptively. 1. At the beginning of the simulation we assign the system, i.e. the replica, to a randomly chosen ensemble and start the phase space sampling with the established simulation protocol (Monte Carlo or molecular dynamics). Note that several simulations may run in the generalized ensemble space, each yielding an independent trajectory. Analogously to REM, a single simulated system will be termed "replica". In the ORAC program we have arbitrarily decided to use the following criteria to distribute the replicas among the ensembles at the beginning of SGE simulations. In Hamiltonian tempering simulations, if we deal with M replicas, we assign them to different ensembles in increasing order, starting from Λ_1. If M > N, then the (N+1)-th replica is assigned to Λ_1 like the first replica, the (N+2)-th replica to Λ_2 like the second replica, and so on. In SGE simulations performed in the λ-space, all replicas are assigned to Λ_1 (see Section 0.2.11 for the definition of the Λ sequence). For the sake of simplicity, in the following presentation of the method we will take into account one replica alone. A discussion regarding multiple-replica simulations is reported in the final part of this section. 2. Every L_a steps and for each ensemble n, we store into memory the quantities W[n → n+1] and W[n → n−1], computed as described in Sec. 6.3.1. There is no well-established recipe in choos
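The initial distribution of replicas described in item 39 amounts to a round-robin assignment. The sketch below illustrates it, assuming 1-based ensemble indices and reading the λ-space rule as "every replica starts in the first ensemble"; the function name and the `tempering` flag are hypothetical and not part of ORAC.

```python
def assign_initial_ensembles(n_replicas, n_ensembles, tempering=True):
    """Return the 1-based starting ensemble index for each replica.

    tempering=True  : Hamiltonian tempering, replicas fill ensembles 1..N in
                      increasing order and wrap around when M > N.
    tempering=False : lambda-space SGE, all replicas start in ensemble 1
                      (an assumption based on the description above).
    """
    if n_replicas < 1 or n_ensembles < 1:
        raise ValueError("need at least one replica and one ensemble")
    if not tempering:
        return [1] * n_replicas
    return [(replica % n_ensembles) + 1 for replica in range(n_replicas)]


if __name__ == "__main__":
    # Five replicas over three ensembles: the 4th and 5th wrap around.
    print(assign_initial_ensembles(5, 3))
    print(assign_initial_ensembles(4, 3, tempering=False))
```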
40. and and reciprocal lattice Erf corrections terms pose no difficulties in A derivation with a moderate extra cost of the force routines the analytic derivation of reciprocal lattice energy Va Eq with respect to A implies the calculation of three gridded charge arrays i e one for the whole system and two more for the discharging and for the charging alchemical solutes 2 27 2 av 1 amp exp 7 m a Be a 5 s m Saje sit m S m Sa e sit m 9 8 4 m 0 where with the notation Saje sit m we refer to the gridded charge arrays obtained for the discharging 0 lt A lt 1 and charging alchemical species 1 lt lt 0 if they are both present The work can also by computed numerically observing that the differential work due to a 6A or 67 increment of the alchemical factors is given by oe iea tia BO Erina Basa 9 9 which is correct to order 0 6 and o Eq R 9 requires just one extra calculation of the energy within the direct space force loop using the A values at the previous step with no need for tagging annihilating and creating species For computing the work arising from the reciprocal lattice sum Eq 9 2 the gridded charge array must be computed at every step of the intermediate range shell using the current charges and those at the previous step with a very limited computational cost Both these array must then undergo FFT As for the direct lattice also for the reciprocal term there is no need for tagging
41. and are equal in number to nprocs, as specified in the mpiexec/mpirun command. If irest ≠ 0, the run refers to a cold start from scratch. If irest = 1, the scaling factors of the intermediate replicas are derived according to a geometric progression, namely scale_m = scale^(m/(nprocs−1)), where scale_m is the scaling factor for the potential of replica m, with 0 ≤ m ≤ nprocs − 1. For example, if scale = 0.6 and nprocs = 4, then replica m = 0 has scale_0 = 1, replica m = 1 has scale_1 = 0.843433, replica m = 2 has scale_2 = 0.711379, and replica m = 3 has scale_3 = scale = 0.6. If irest = 2, the scaling factors are read from an auxiliary file called REM.set that must be present in the directory from which the program is launched with the mpiexec/mpirun command. This ASCII file has as many lines as parallel processes, and on each line the three (or one) scale factors must be specified.
EXAMPLES
SETUP 1.0 1.0 0.6 1
Scales only the non-bonded potential (direct part) using a geometric progression.
DEFAULTS
SETUP 1.0 1.0 1.0 1
NAME
STEP - exchange time for REM
SYNOPSIS
STEP rtime
DESCRIPTION
Defines the time (in fs) for attempting an exchange between adjacent replicas.
EXAMPLES
STEP 5.0
Attempt replica exchanges every 5 fs.
DEFAULTS
STEP 0
WARNINGS
If STEP is not set, rtime is set to the time step of the m-th intermolecular shell.
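The geometric progression used for irest = 1 can be checked numerically. This is a small sketch, not ORAC code: the function name is hypothetical, and it only reproduces the rule scale_m = scale^(m/(nprocs−1)) stated in the SETUP description.

```python
def rem_scale_factors(scale, nprocs):
    """Scaling factor of each replica m = 0 .. nprocs-1.

    Implements scale_m = scale ** (m / (nprocs - 1)), so replica 0 is
    unscaled (factor 1) and the last replica has the lowest factor, `scale`.
    """
    if nprocs < 2:
        raise ValueError("a geometric progression needs at least two replicas")
    if scale <= 0:
        raise ValueError("scale must be positive")
    return [scale ** (m / (nprocs - 1)) for m in range(nprocs)]


if __name__ == "__main__":
    # Values for the manual's example: scale = 0.6, nprocs = 4.
    for m, factor in enumerate(rem_scale_factors(0.6, 4)):
        print(f"replica {m}: scale = {factor:.6f}")
```

For scale = 0.6 and nprocs = 4 this gives the intermediate factors quoted in the example, 0.843433 and 0.711379.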
42. angle bending force constant and the equilibrium angle, respectively. The units used for K_angle and r0 are kcal mol⁻¹ rad⁻² and degrees.
EXAMPLES
BENDINGS
  cb c na 70.00 111.30
  cb c o 80.00 128.80
  cm c o 80.00 125.30
  n c na 70.00 115.40
END
BOND
NAME
BOND - Read stretching potential parameters
SYNOPSIS
BOND
  typ1 typ2 K_stretch r0
END
DESCRIPTION
The command reads a sequence of stretching potential parameters. typ1 and typ2 are two character strings (not to exceed 7 characters) indicating the atom types of the two atoms involved in the stretching interaction. K_stretch and r0 are the stretching force constant and the stretching equilibrium distance, respectively. The units used for K_stretch and r0 are kcal mol⁻¹ Å⁻² and Å.
EXAMPLES
BOND
  c ca 469.00 1.409
  c cb 447.00 1.419
  c cm 410.00 1.444
END
NONBONDED
NAME
NONBONDED - Read Lennard-Jones parameters
SYNOPSIS
NONBONDED MIXRULE | NOMIXRULE
END
DESCRIPTION
The command reads the Lennard-Jones parameters for the solute non-bonded interactions. The arguments MIXRULE and NOMIXRULE indicate whether Lennard-Jones mixing rules are to be used by ORAC or, conversely, explicit mixed Lennard-Jones potentials are to be expected in input. The format of the nonbonded potential is different in the two alternative cases. If mixing rules are to be found, the input to NONBONDED looks like NONBOND
43. approach, since the temperature is the same for all ensembles, momentum rescaling (Eq. 6.14) must not be applied. We will see in Section 6.3 how $f_m$ and $f_n$ are determined. 6.2.2 SGE simulations in λ-space. In SGE simulations conducted in a generic λ-space at constant temperature, the dimensionless Hamiltonian is given by Eq. 6.3. In the ORAC program we use a Hamiltonian aimed at sampling (i) the distance between two target atoms, (ii) the angle formed by three established atoms, (iii) the torsion formed by four established atoms, or (iv) combinations of these coordinates. There are several ways to model such a Hamiltonian. Our choice is to use harmonic potential functions correlated to the given collective coordinates: $h_n(x, p, p_\eta) = \beta \left[ H(x, p, p_\eta) + k (r - \lambda_n)^2 \right]$ (6.16), where, as usual, $H(x, p, p_\eta)$ is the extended Hamiltonian. In Eq. 6.16, r is the instantaneous collective coordinate (bond, bending, torsion) and k is a constant. As in ST simulations, transitions from the n to the m ensemble occur at fixed configuration. However, in this case there is no need to rescale momenta, because they drop out of the detailed balance condition naturally. The resulting acceptance ratio is $\mathrm{acc}[n \to m] = \min\left(1, e^{\beta k [(r - \lambda_n)^2 - (r - \lambda_m)^2] + f_m - f_n}\right)$ (6.17). In this kind of simulation, the free energy as a function of λ corresponds to the biased PMF [50] along the coordinate associated with λ. Biasing arises from the harmonic potential added to the original Hamilto
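The λ-space transition test can be sketched as follows. The code assumes the acceptance ratio has the form min(1, exp{βk[(r − λ_n)² − (r − λ_m)²] + f_m − f_n}), which is what the harmonic bias of Eq. 6.16 gives if the ensemble weights enter as f_m − f_n; that form, the function names and the argument order are assumptions made for illustration, not the ORAC implementation.

```python
import math
import random


def sge_lambda_acceptance(r, lam_n, lam_m, k, beta, f_n, f_m):
    """Acceptance probability for an n -> m transition at fixed configuration.

    r          : instantaneous collective coordinate (bond, bending or torsion)
    lam_n/lam_m: centres of the harmonic bias in ensembles n and m
    k, beta    : bias force constant and inverse temperature (same for both)
    f_n/f_m    : dimensionless weight factors of the two ensembles
    """
    exponent = beta * k * ((r - lam_n) ** 2 - (r - lam_m) ** 2) + f_m - f_n
    # Guard the exponential: any non-negative exponent means certain acceptance.
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


def attempt_transition(r, lam_n, lam_m, k, beta, f_n, f_m, rng=random.random):
    """Metropolis test: True if the move to ensemble m is accepted."""
    return rng() < sge_lambda_acceptance(r, lam_n, lam_m, k, beta, f_n, f_m)


if __name__ == "__main__":
    # Coordinate sitting at the centre of ensemble n: moving away is penalised.
    print(sge_lambda_acceptance(1.0, 1.0, 2.0, k=1.0, beta=1.0, f_n=0.0, f_m=0.0))
```

Momenta do not appear anywhere in the test, which reflects the statement that they drop out of the detailed balance condition when all ensembles share the same temperature.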
44. array Q ki k2 k3 is defined as Q k k2 k3 5 qiMn uit k Mn uiz k2 Mn uiz k3 4 28 i 1 N Inserting the approximated structure factor of Eq into Eq and using the fact that F Q m1 m2 m3 K K2 K3F Q m1 m2 m3 the SPME reciprocal lattice energy can be then written as K Ky Kg Ve 2 5 5 B m m2 m3 C m1 M2 M3 X mi 1 m2 1m3 1 x F Q m1 m2 m3 F Q m1 m2 m3 4 29 1 ku Be Ks 5 gt X SS YS Free mi m2 m3 F Q mi M2 ms x x KK K3F Q m m2 m3 4 30 with B mi m2 m3 b1 m1 b2 m2 b3 ms 4 31 C m m2 m3 1 nV exp 12m a m 4 32 Orec F BC 4 33 Using the convolution theorem for FFT the energy 4 30 can be rewritten as K Ko K3 Vy gt So SS FO Ore Q m1 m2 ms F Q m1 m2 ma 4 34 mi 1m2 1m3 1 We now use the identity X n F A m B m mn 4 m F B m to arrive at Ve 5 do YE Cree Q m ma ma Q m ma ma m l m2 1 m3 1 4 35 Electrostatic Interactions 36 We first notice that Orec does not depend on the charge positions and that Mn Uia k is differentiable for n gt 2 which is always the case in practical applications Thus the force on each charge can be obtained by taking the derivative of Eq 35 namely r 0Q m1 M2 M peo ae gt DD Poma T2 TE O e x Q m1 m2 ma 4 36 m 1m2 lms 1 In practice the calculation is carried out according
45. as a higher rate of traversing the potential energy space. Moreover, SGE methods are well suited to distributed computing environments, because synchronization and communication between replicas/processors can be avoided. The potential of mean force [50] along a chosen collective coordinate can be computed a posteriori in REM and SGE simulations using multiple-histogram reweighting techniques [52, 53]. The potential of mean force can also be determined by performing SGE and REM simulations directly in the space of the collective coordinate [54]. In the ORAC program we have implemented SGE simulations either in a simulated-tempering-like fashion or in the space of bond, bending and torsion coordinates. These simulations exploit the adaptive method to calculate weight factors (i.e. free energies) proposed in Ref. [55]. The method is described in Chapter 6. The a priori identification of the unknown coordinates, along with their underlying free energy surface, is actually one of the outputs of the REM and SGE approaches. However, once these important coordinates are known, one can use less expensive techniques to study the associated essential free energy surface. Canonical reweighting or umbrella sampling methods [56], for example, modify (bias) the interaction Hamiltonian of the system in such a way as to facilitate barrier crossing between conformational basins. The canonical averages of the unperturbed system are then reconstructed by appropriately reweigh
46. barostat coordinate is no longer slow unless the parameter W is changed For standard values of W selected to obtain an efficient sampling of the NPT phase space 98 79 the barostat dependent Liouvilleans Eqs 3 60 8 59 have time scale dynamics comparable to that of the intramolecular Liouvillean iGo and therefore must be associated with this terni Thus the molecular split of the Liouvillean is hence given by iL iL ily iL ily iLs iLo iGo 3 69 whereas the atomic split is iL tLy ily ils iLo iGot iL ily 3 70 For both scaling a simple Hermitian factorization of the total time propagator e t yields the double time discrete propagator eilitiLo eibsAti 2 ethoAto n piliAti 2 3 71 9Similar considerations hold for the thermostat coordinate which in principle depends on the kinetic energy of all degrees of freedom modulated hence by the fast motion also In this case however the value of the thermostat inertia parameter Q can be chosen to slow down the time scale of the 7 coordinates without reducing considerably the sampling efficiency Multiple Time Steps Algorithms 27 where Ato the small time step must be selected according to the intramolecular time scale whereas At the large time step must be selected according to the time scale of the intermolecular motions We already know that the propagator 8 71 cannot generate a symplectic The alert reader may also have noticed that in this case the s
47. be used, in principle, without prior knowledge of the important reaction coordinates of the system, i.e., in the case of biological systems, those that define the accessible conformational space in the target thermodynamic conditions. The REM algorithm is described in detail in Chapter 5. A class of simulation algorithms closely related to REM are the so-called serial generalized-ensemble (SGE) methods [45]. The basic difference between SGE methods and REM is that in the former no pairs of replicas are necessary to make a trajectory in temperature space and, more generally, in the generalized ensemble space. In SGE methods only one replica can undergo ensemble transitions, which are realized on the basis of a Monte Carlo-like criterion. The best-known example of an SGE algorithm is the simulated tempering technique [43, 46], where weighted sampling is used to produce a random walk in temperature space. An important limitation of SGE approaches is that an evaluation of free energy differences between ensembles is needed as input to ensure equal visitation of the ensembles and, eventually, a faster convergence of structural properties [47]. REM was developed precisely to eliminate the need to know such free energy differences a priori. On the other hand, several studies [47] have reported that SGE in general, and simulated tempering in particular, consistently gives a higher rate of delivering the system between high-temperature and low-temperature states, as well
48. bending potential has force constant k (in kcal/mol/rad^2) and equilibrium bending angle α0 (in degrees). If α1 is also specified, then the added bending potential is time dependent and α1 is the equilibrium bending angle after the steering time τ (see the STEER(&RUN) command for the definition of the steering time in an SMD simulation).
WARNINGS
If the chosen α0 is very different from the actual value of the bending angle at time 0, a very large force is experienced by the atoms involved in the added bending, and the simulation may catastrophically diverge after a few steps.
EXAMPLES
Example 1:
ADD_STR_BENDS 1 50 104 400 180.0
Example 2:
&POTENTIAL
ADD_STR_BENDS 1 50 104 400 180.0 90.0
&END
&RUN
STEER 10000 50000
&END
In the first example a bending constraint is imposed between atom 1, atom 50 and atom 104 of the solute. In the second example a time-dependent driving potential is applied to the same atoms of the solute. The equilibrium bending angle of this harmonic driving potential moves at constant velocity in τ = 40 ps, starting at t = 10 ps, between α0 = 180.0 and α1 = 90.0.

ADD_STR_TORS
NAME
ADD_STR_TORS - Add a harmonic torsional potential between four target atoms
SYNOPSIS
ADD_STR_TORS iat1 iat2 iat3 iat4 k θ0 θ1
DESCRIPTION
This command can be used to impose an additional harmonic torsional constraint between atoms iat1, iat2, iat3 and iat4 o
49. bonded atoms. The parameters are chosen according to the AMBER protocol [3] by assigning the carbon and hydrogen atoms to the AMBER types ct and hc, respectively. For various dynamical and structural properties we compare three integrators, namely a triple time step r-RESPA (R3), a single time step integrator with bond constraints on X-H (S1), and a single time step integrator with all bonds kept rigid (S). These three integrators are tested, starting from the same phase space point, against a single time step integrator (E) with a very small time step generating the exact trajectory.
Figure 2.1: Time record of the torsional potential energy at about 300 K for a cluster of eight molecules of C24H50 obtained using three integrators: solid line, integrator E; circles, integrator R3; squares, integrator S1; diamonds, integrator S (see text).
In Fig. 2.1 we show the time record of the torsional potential energy. The R3 integrator generates a trajectory practically coincident with the exact trajectory for as long as 1.5 ps. The single time step integrator with rigid X-H bonds also produces a fairly accurate trajectory, whereas the trajectory generated by S quickly drifts away from the exact time record. In Fig. 2.2 we show the power spectrum of the velocity autocorrelation function obtained with R3, S1 and S. The spectra are compared to the exact spectrum computed usi
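50. by the well-known expression
$$\nu_s = \frac{1}{2\pi}\left(\frac{K_r}{\mu}\right)^{1/2} \tag{4.4}$$
where $\mu$ is the reduced mass.
Bending. We shall assume, for the sake of simplicity, that the uncoupled bending frequencies depend on the masses of atoms 1 and 3 only (see Fig. 4.1), that is, mass 2 is assumed to be infinite. This turns out to be, in general, an excellent approximation for bendings involving hydrogen and a good approximation for external bendings in large molecules involving masses of comparable magnitude. The frequency is obtained by writing the Lagrangian in polar coordinates for the mechanical system depicted in Fig. 4.1. The Cartesian coordinates are expressed in terms of the polar coordinates as
$$x_1 = -r_{12}\sin(\alpha/2), \qquad y_1 = r_{12}\cos(\alpha/2) \tag{4.5}$$
$$x_3 = r_{32}\sin(\alpha/2), \qquad y_3 = r_{32}\cos(\alpha/2) \tag{4.6}$$
where the distances $r_{32}$ and $r_{12}$ are constrained to the equilibrium values. The velocities are then
$$\dot x_1 = -r_{12}\cos(\alpha/2)\,\frac{\dot\alpha}{2}, \qquad \dot y_1 = -r_{12}\sin(\alpha/2)\,\frac{\dot\alpha}{2} \tag{4.7}$$
$$\dot x_3 = r_{32}\cos(\alpha/2)\,\frac{\dot\alpha}{2}, \qquad \dot y_3 = -r_{32}\sin(\alpha/2)\,\frac{\dot\alpha}{2} \tag{4.8}$$
The Lagrangian for the uncoupled bending is then
$$L = \frac{1}{2}\left(m_1\dot x_1^2 + m_1\dot y_1^2 + m_3\dot x_3^2 + m_3\dot y_3^2\right) - V_{bend} \tag{4.9}$$
$$\;\; = \frac{1}{8}\left(m_1 r_{12}^2 + m_3 r_{32}^2\right)\dot\alpha^2 - \frac{1}{2}K_\theta(\alpha - \alpha_0)^2 \tag{4.10}$$
The equation of motion $\frac{d}{dt}\frac{\partial L}{\partial\dot\alpha} - \frac{\partial L}{\partial\alpha} = 0$ for the $\alpha$ coordinate is given by
$$\ddot\alpha + \frac{4K_\theta}{I_b}(\alpha - \alpha_0) = 0 \tag{4.11}$$
where $I_b = m_1 r_{12}^2 + m_3 r_{32}^2$ is the moment of inertia about an axis passing through atom 2 and perpendicular to the bending plane. Finally, the uncoupled bending frequency is given by
$$\nu_b = \frac{1}{2\pi}\left(\frac{4K_\theta}{I_b}\right)^{1/2} \tag{4.12}$$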
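As a rough numerical illustration of the harmonic estimates in this excerpt (the stretching frequency of Eq. 4.4 and the uncoupled bending frequency of Eq. 4.12), the sketch below evaluates both expressions. It assumes the convention V = ½K(x − x0)² and a consistent unit system (for example SI); the function and argument names are hypothetical and are not part of ORAC.

```python
import math


def reduced_mass(m1, m2):
    """Reduced mass of a two-body oscillator."""
    return m1 * m2 / (m1 + m2)


def stretching_frequency(k_r, mu):
    """Harmonic stretching frequency, nu = (1/2pi) * sqrt(K_r / mu)."""
    return math.sqrt(k_r / mu) / (2.0 * math.pi)


def bending_frequency(k_theta, m1, r12, m3, r32):
    """Uncoupled bending frequency with the central atom taken as infinitely heavy.

    I_b = m1*r12^2 + m3*r32^2 is the moment of inertia about the central atom,
    and nu = (1/2pi) * sqrt(4*K_theta / I_b).
    """
    i_b = m1 * r12**2 + m3 * r32**2
    return math.sqrt(4.0 * k_theta / i_b) / (2.0 * math.pi)
```

With K_r = 4π² and μ = 1 the stretching frequency is exactly 1 in the chosen units, which is a convenient sanity check before plugging in force-field constants.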
51. can no longer be easily separated and this term must be thus arbitrarily assigned to one of the three components Given the subdivision Eq the local scaling for replica m in ORAC is implemented as Vm X _ cf mye x x J Caer Xn XN n J Yee gn 5 23 The solvent solvent interactions including the global bonded potential and the long range electrostatic interactions are not scaled in the local approach Solute solute interactions and solute solvent interactions as defined in Eq 5 22 are scaled independently thereby generalizing the so called solute tempering approach recently proposed by Liu et al 105 This generality allows a complete freedom in the choice of the scaling protocol For example one can choose to set ae gt 1 i e to progressively freeze the solute solvent interaction as the replica index m grows m Slt Slv while at the same time setting m 1 for all replicas thereby favouring at large ute Siv the solvation of the solute i e for example favouring the unfolding The global REM algorithm i e uniform scaling of the full interaction potential as implemented in ORAC works also for constant pressure simulation see ISOSTRESS directive In that case the selected external pressure pressure refers to that of the target replica m 1 Since the PV is a configurational term and is not scaled in the current implementation the non target replicas sample coordinate configurations according to a higher
52. coordinates of free energy 6 FREQUENCIES 139 fudge factor L07 generalized Born solvent model GENERATE 150 glycine GOFR group scaling Liouvillean split for 26 GROUP_CUTOFF Hamilton equations 9 harmonic constraints harmonic frequencies 139 HBONDS 113 heavy_atoms 116 Hermitian operator histogram 13 history file auxiliary file 82 H MASS hydrogen bond acceptor and donor 163 imphd 161 implicit solvent 140 improper torsion 32 95 22 C61 definition of in the parameter file 057 amp INOUT 80 INSERT 151 inst_xrms 115 amp INTEGRATOR integrator reversible symplectic 8 I interchange matrix 147 ISEED 137 ISOSTRESS 138 isothermal isobaric ensemble I TORSION 1 06 jacobian 10 Jarzynski identity JOIN 94 JORGENSEN KEEP_BONDS k ewald leap frog algorithm 12 Legendre transformation Lennard Jones cutoff parameters Lennard Jones potential Soft core variant for alchemical transformations 63 linked cell 108 LINKED_CELL Liouville formalism Liouvillean split of Parrinello Rahman Nos liquid water LJ FUDGE 107 Lucy s functions 59 lysozyme Markovian process 61 Martyna G mass of the Nos thermostat specifying the type atomic 155 maximum likelihood 65 MAXRUN 122 MBAR MDSIM mean square displacement membrane 169 simulation at constant pressure in the NPT en semb
53. determine as many nested reference systems as we wish. The first step in defining a general protocol for the subdivision of the bonded potential for complex molecular systems consists in identifying the various time scales and their connection to the potential. The bonded interaction potential in almost all popular force fields is given as a function of the stretching, bending and torsion internal coordinates and has the general form
$$V_{bnd} = V_{stretch} + V_{bend} + V_{tors} \tag{4.2}$$
where
$$V_{stretch} = \frac{1}{2}\sum_{\rm Bonds} K_r (r - r_0)^2, \qquad V_{bend} = \frac{1}{2}\sum_{\rm Angles} K_\theta (\theta - \theta_0)^2, \qquad V_{tors} = \sum_{\rm Dihedrals} V_\phi\left[1 + \cos(n\phi - \gamma)\right] \tag{4.3}$$
Here, $K_r$ and $K_\theta$ are the bonded force constants associated with bond stretching and angle bending, respectively, while $r_0$ and $\theta_0$ are their respective equilibrium values. In the torsional potential $V_{tors}$, $\phi$ is the dihedral angle, while $V_\phi$, $n$ and $\gamma$ are constants. The characteristic time scale of a particular internal degree of freedom can be estimated by assuming that this coordinate behaves like a harmonic oscillator, uncoupled from the rest of the internal degrees of freedom. Thus, the criterion for guiding the subdivision of the potential in Eq. 4.2 is given by the characteristic frequency of this uncoupled oscillator. We now give, for each type of degree of freedom, a practical formula to evaluate the harmonic frequency from the force field constants given in Eq. 4.3.
Stretching. The stretching frequencies are given
54. employed. Default values are Le = tstep, La = tstep, Ly = 1000 × tstep and na = 0, where tstep is the simulation time step (in fs) of the h-th intermolecular shell (see Section 4.3).

TRANSITION_SCHEME
NAME
TRANSITION_SCHEME - Choose scheme for replica transitions
SYNOPSIS
TRANSITION_SCHEME scheme
DESCRIPTION
This command defines the replica transition scheme used during an SGE simulation. The allowed values of the keyword scheme are:
- SEO: Use the so-called Stochastic Even/Odd (SEO) transition scheme. At each transition step, the trajectory in ensemble n attempts a transition towards ensemble n+1 or n-1 with equal probability.
- DEO: Use the so-called Deterministic Even/Odd (DEO) transition scheme. If at the s-th transition step the trajectory is in ensemble n, a transition is attempted towards ensemble $n + (-1)^{n+s}$, that is, toward ensemble n+1 at even steps and to ensemble n-1 at odd steps if n is even; the opposite if n is odd. This scheme is the same as the coupling scheme used in Replica Exchange and is expected to give better diffusion in temperature space than SEO [113].
EXAMPLE
TRANSITION_SCHEME SEO
DEFAULTS
The default for scheme is DEO.

ZERO_FREE_ENERGY
NAME
ZERO_FREE_ENERGY - Set up input for zeroing the accumulated free energy averages
SYNOPSIS
ZERO_FREE_ENERGY
DESCRIPTION
The presence of this command in the input establishes that the wei
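The two transition schemes described for TRANSITION_SCHEME can be sketched as follows. This is only an illustration of the selection rule, not ORAC code: the function names are hypothetical, and the sign convention for DEO (towards n+1 when n and s have the same parity) is an assumption about the ordering intended in the description.

```python
import random


def propose_seo(n, rng=random):
    """Stochastic Even/Odd: try n+1 or n-1 with equal probability."""
    return n + rng.choice((-1, 1))


def propose_deo(n, s):
    """Deterministic Even/Odd: try n + (-1)**(n + s).

    Assumed convention: n+1 when n and s have the same parity, n-1 otherwise.
    """
    return n + (1 if (n + s) % 2 == 0 else -1)


def is_valid_ensemble(target, n_ensembles):
    """Hypothetical helper: proposals outside 0..n_ensembles-1 cannot be accepted."""
    return 0 <= target < n_ensembles
```

In either scheme the proposed move would still have to pass the acceptance test of the generalized-ensemble algorithm; the functions above only pick the target ensemble.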
55. for all Nosé thermostats when the argument of the command REJECT(&RUN) is different from zero. In principle, for a system out of equilibrium, no temperature scaling should be enforced when using Nosé thermostatting. Actually, when equilibrating systems in the NVT or NPT ensembles, it is strongly recommended to specify the subcommand temp_limit along with a rejection time [REJECT(&RUN)], as normally done for conventional scaling in NVE dynamics. In an NV(P)T system out of equilibrium, while the temperature of the system remains close to the selected temperature, the temperature of the thermostat coordinates, which are not themselves thermostatted, may rise dramatically if not scaled.
EXAMPLES
&SIMULATION
TEMPERATURE 300.0 25.0
MDSIM
THERMOS
cofm 30.0
solute 30.0
solvent 30.0
END
&END
Run a simulation in the NVT ensemble at T = 300 K.

WRITE_PRESSURE
NAME
WRITE_PRESSURE - Write the pressure of the system during a simulation
SYNOPSIS
WRITE_PRESSURE
DESCRIPTION
This command is used to print the system pressure and stress tensor to the simulation output. It has no argument.
WARNINGS
Make sure that, when simulations at constant pressure are run, ORAC has been compiled with the appropriate PRESSURE option in the config.h file (see Chapter I).

10.2.13 &SOLUTE
The &SOLUTE environment includes commands which are concerned
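56. good time discretization for the time evolution of the perturbation, that is, for the slowly varying intermolecular potential. The discrete $e^{iL\Delta t_1} \equiv e^{(iL_0 + iL_1)\Delta t_1}$ time propagator can be factorized as
$$e^{iL\Delta t_1} \simeq e^{iL_1\Delta t_1/2}\left(e^{iL_0\Delta t_1/n}\right)^n e^{iL_1\Delta t_1/2} = e^{iL_1\Delta t_1/2}\left(e^{iL_0\Delta t_0}\right)^n e^{iL_1\Delta t_1/2} \tag{2.31}$$
where we have used Eq. 2.21 and we have defined
$$\Delta t_0 = \frac{\Delta t_1}{n} \tag{2.32}$$
as the time step for the fast reference system with Hamiltonian $T + V_0$. The propagator is unitary and hence time reversible. The external propagators, depending on the Liouvillean $L_1$ acting on the state vectors, define a symplectic mapping, as can be easily proved by using Eq. 2.8. The full factorized propagator is therefore symplectic as long as the inner propagator is symplectic. The Liouvillean $iL_0 = \dot q\,\frac{\partial}{\partial q} - \frac{\partial V_0}{\partial q}\frac{\partial}{\partial p}$ can be factorized according to the Verlet symplectic and reversible breakup described in the preceding section, but with a Hamiltonian $T + V_0$. Inserting the result into Eq. 2.31 and using the definition 2.30, the resulting double time step propagator is then
$$e^{iL\Delta t_1} \simeq e^{-\frac{\partial V_1}{\partial q}\frac{\partial}{\partial p}\frac{\Delta t_1}{2}} \left[ e^{-\frac{\partial V_0}{\partial q}\frac{\partial}{\partial p}\frac{\Delta t_0}{2}}\, e^{\dot q\frac{\partial}{\partial q}\Delta t_0}\, e^{-\frac{\partial V_0}{\partial q}\frac{\partial}{\partial p}\frac{\Delta t_0}{2}} \right]^n e^{-\frac{\partial V_1}{\partial q}\frac{\partial}{\partial p}\frac{\Delta t_1}{2}} \tag{2.33}$$
This propagator is unfolded straightforwardly using the rule 2.23, generating the following symplectic and reversible integrator from step $t = 0$ to $t = \Delta t_1$:
$p(\Delta t_1/2) = p(0) + F_1(0)\,\Delta t_1/2$
DO i = 1, n
$p(\Delta t_1/2 + i\Delta t_0 - \Delta t_0/2) = p(\Delta t_1/2 + [i-1]\Delta t_0) + F_0([i-1]\Delta t_0)\,\Delta t_0/2$
$q(i\Delta t_0) = q([i-1]\Delta t_0) + p(\Delta t_1/2 + i\Delta t_0 - \Delta t_0/2)\,\Delta t_0/m$
$p(\Delta t_1/2 + i\Delta t_0) = p(\Delta t_1/2 + i\Delta t_0 - \Delta t_0/2) + F_0(i\Delta t_0)\,\Delta t_0/2$ (2.34)
ENDDO
$p(\Delta t_1) = p(\Delta t_1/2 + n\Delta t_0) + F_1(n\Delta t_0)\,\Delta t_1/2$
Note that the slowly varying forces $F_1$ ar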
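The double time step scheme of Eq. 2.34 can be sketched for a single degree of freedom as below. This is a minimal illustration under stated assumptions, not ORAC's implementation: `f_fast` and `f_slow` are hypothetical callables returning the forces derived from V_0 and V_1, and `n` is the number of inner steps per outer step.

```python
def respa_step(q, p, m, f_fast, f_slow, dt1, n):
    """One outer step of a double time step (r-RESPA-like) integrator.

    The slow force is applied as a half kick of length dt1/2 at the start and
    at the end; in between, n velocity-Verlet steps of length dt0 = dt1/n are
    taken with the fast force only.
    """
    dt0 = dt1 / n
    p = p + 0.5 * dt1 * f_slow(q)          # outer half kick, slow force
    for _ in range(n):
        p = p + 0.5 * dt0 * f_fast(q)      # inner half kick, fast force
        q = q + dt0 * p / m                # drift
        p = p + 0.5 * dt0 * f_fast(q)      # inner half kick, fast force
    p = p + 0.5 * dt1 * f_slow(q)          # outer half kick, slow force
    return q, p
```

Because the sequence of sub-steps is symmetric, running one step, flipping the sign of the momentum, and running another step should bring the system back to its starting point up to round-off, which is a quick practical check of time reversibility.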
57. handle this situation. Sec. 10.3 describes how the inter- and intramolecular Lennard-Jones parameters are read by ORAC.
WARNINGS
Experimental - Unsupported

KEEP_BONDS
NAME
KEEP_BONDS - Constrain bond lengths to starting values
SYNOPSIS
KEEP_BONDS
DESCRIPTION
This command should be specified when bond constraints are imposed on the system (see commands STRETCHING and CONSTRAINT in this environment). If specified, all bonds to be constrained are constrained to the initial length found in the starting PDB file.
DEFAULTS
KEEP_BONDS is FALSE

LJ-FUDGE
NAME
LJ-FUDGE - Set the fudge factor of the Lennard-Jones interaction
SYNOPSIS
LJ-FUDGE lj-fudge
DESCRIPTION
The argument to this command, lj-fudge, is the multiplicative factor of the 1-4 Lennard-Jones interaction.
EXAMPLES
LJ-FUDGE 0.5
DEFAULTS
LJ-FUDGE 1.0

LINKED_CELL
NAME
LINKED_CELL - Compute linked cell neighbor lists
SYNOPSIS
LINKED_CELL ncx ncy ncz nupdte
DESCRIPTION
The LINKED_CELL command switches to linked cell neighbor lists in place of conventional Verlet lists. The command can also be used for non-orthogonal MD boxes. The integers ncx, ncy, ncz define the three-dimensional grid by providing the number of bins along the a, b, c crystal axes, respectively. The optimum fineness of the cell grid depends on the density of the sample. For normal density, a grid spacing of 3.0-3.5 Å along each axis is recommend
58. imphd a list of improper torsions ended by the keyword end must be provided Each improper torsion is defined by a quadruplet of atom labels see synopsis Labels starting with a or a refer to atoms belonging to the preceding and following residue in the solute sequence EXAMPLES imphd ca n hn ca n c o end WARNINGS The keyword bonds must appear before imphd Input to ORAC Force Field amp Topology 162 backbone NAME backbone Define the backbone atoms for the residue SYNOPSIS backbone lab1 lab2 lab3 DESCRIPTION With backbone a list of atom labels lab1 lab2 lab3 is provided which belong to the biomolecule backbone The corresponding atoms are uniquely identified The backbone atoms are only used by ORAC in the calculation of run time properties The command can be repeated as many times as necessary WARNINGS The keyword bonds must appear before backbone termatom acc NAME termatom Define a pair of atoms which are covalently bound to other residues SYNOPSIS termatom labi lab2 DESCRIPTION termatom is used to define two atoms whose labels are labi and lab2 which are connecting the residue to the rest of the biopolymer If the residue has only one connecting atom or has none one of the labels or both must be replaced by a EXAMPLES 1 Connecting atoms for an amino acid termatom n c EXAMPLES 2 Connecting atoms for a residue not covalently connected with the others residues of an
59. molecules in the gas phase, along with some ancillary codes for the analysis of the program output.

Chapter 8
Steered Molecular Dynamics
Steered molecular dynamics (SMD) simulation is a technique mimicking the principle of atomic force microscopy (AFM). In practice, one applies a time-dependent mechanical external potential that obliges the system to perform some prescribed motion in a prescribed simulation time. SMD has been widely used to explore, at the atomic level, the mechanical functions of biomolecules such as ligand-receptor binding/unbinding and the elasticity of muscle proteins during stretching [144]. SMD has also been used in the past to approximately estimate the potential of mean force (PMF) along a given mechanical coordinate (for example, a distance or an angle). The model upon which this technique for estimating the PMF relies was based on the assumption that the driven motion along the reaction coordinate z could be described by an over-damped one-dimensional Langevin equation of the kind
$$\gamma\dot z = -\frac{dW}{dz} + F_{ext}(z, t) + \xi(t)$$
where $\gamma$ is the friction coefficient, $W$ is the underlying potential of mean force, $F_{ext}(z, t)$ is the external force due to the driving potential, and $\xi(t)$ is a stochastic force related to the friction through the second fluctuation-dissipation theorem. The PMF $W(z)$ can then be determined only if one knows, or can somehow estimate, the friction coefficient, so as to evaluate the frictional force that discounts the irreversible work done in the
60. of this angle.
EXAMPLES
ADD_TORS 1 5 8 11 4.0
Add the torsional angle between atom 1, atom 5, atom 8 and atom 11 to the list of the reaction coordinates.

NAME
RATE - Define the deposition rate of a metadynamics run
SYNOPSIS
RATE mtime mheight
DESCRIPTION
This command defines the deposition interval mtime (in fs) and the height mheight (in kJ/mol) of the repulsive potential terms deposited during a metadynamics run. If mheight is not specified, then a zero height is assumed.
EXAMPLES
RATE 100.0 0.05
Deposit a hill of height 0.05 kJ/mol every 100 fs.

NAME
READ - Read a trajectory from a previous metadynamics run
SYNOPSIS
READ filename
DESCRIPTION
When present, the program reads a trajectory filename from a previous metadynamics run.
EXAMPLES
READ old_traj.out
Read trajectory from file old_traj.out.
WARNINGS
The metadynamics parameters of the previous and of the new simulation must be the same in order to obtain meaningful results.

TEMPERED
NAME
TEMPERED - During a metadynamics simulation, add a hill to the biasing potential with a decreasing probability
SYNOPSIS
TEMPERED T
DESCRIPTION
When present, the program adds a hill to the current biasing potential with a probability given by $P_{acc} = \exp\left[-V_{max}(t)/k_B T\right]$, where $V_{max}(t)$ is the maximum value of the potential $V(s, t)$ at time $t$ and $T$ is a user-defined temperature.
EXAMPLES
TEMPERED 1000.0
Run a t
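The deposition rule described for TEMPERED can be illustrated with a short sketch. It assumes that the bias potential is expressed in kJ/mol (the unit used for mheight in RATE) and that the probability decreases as exp(−V_max/k_B T); the function names and the random-number interface are hypothetical and do not reflect ORAC internals.

```python
import math
import random

# Molar Boltzmann constant (gas constant) in kJ/(mol K)
KB_KJ_PER_MOL_K = 0.0083144626


def hill_acceptance_probability(v_max, temperature):
    """Probability of depositing a hill, exp(-V_max / (k_B * T)).

    v_max is the current maximum of the bias potential in kJ/mol and
    temperature is the user-defined tempering temperature in K.
    """
    return math.exp(-v_max / (KB_KJ_PER_MOL_K * temperature))


def accept_hill(v_max, temperature, rng=random):
    """Draw a uniform random number and decide whether to deposit a hill."""
    return rng.random() < hill_acceptance_probability(v_max, temperature)
```

As the bias grows, the acceptance probability decays, so hills are added less and less often; a larger tempering temperature slows this decay.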
61. order for finite step size. Using Eq. 2.8, it is easy to show that the t-flow defined by this factorized propagator is symplectic, being the product of two successive symplectic transformations. Unfortunately, the propagator Eq. 2.20 is not unitary, and therefore the corresponding algorithm is not time reversible. Again, the non-unitarity is due to the fact that the two factorized exponential operators are non-commuting. We can overcome this problem by halving the time step and using the approximant
$$e^{(A+B)t} \simeq e^{At/2}\,e^{Bt/2}\,e^{Bt/2}\,e^{At/2} = e^{At/2}\,e^{Bt}\,e^{At/2} \tag{2.21}$$
The resulting propagator is clearly unitary, therefore time reversible, and is also correct to the second order [74]. Thus, requiring that the product of the exponential operators be unitary automatically leads to more accurate approximations of the true discrete time propagator [75, 74]. Applying the same argument to the propagator, we have
$$e^{iL\Delta t} = e^{(iL_1 + iL_2)\Delta t} = e^{iL_2\Delta t/2}\,e^{iL_1\Delta t}\,e^{iL_2\Delta t/2} + O(\Delta t^3) \tag{2.22}$$
The action of an exponential operator $e^{a\,\partial/\partial x}$ on a generic function $f(x)$ trivially corresponds to the Taylor expansion of $f(x)$ around the point $x$ at the point $x + a$, that is
$$e^{a\,\partial/\partial x} f(x) = f(x + a) \tag{2.23}$$
Using Eq. 2.23, the time reversible and symplectic integration algorithm can now be derived by acting with our Hermitian operator Eq. 2.22 onto the state vector at $t = 0$ to produce updated coordinates and momenta at a later time $\Delta t$. The resulting algorithm is completely equ
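The shift rule of Eq. 2.23 can be checked numerically on a polynomial, for which the Taylor series of the exponential operator terminates after finitely many terms. The sketch below is only an illustration; the coefficient-list representation and the helper names are choices made here for the example.

```python
from math import factorial


def derivative(coeffs):
    """Derivative of a polynomial given as [c0, c1, c2, ...] (c_k multiplies x**k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))


def exp_shift(coeffs, a, x):
    """Apply exp(a d/dx) to the polynomial and evaluate the result at x.

    The operator is expanded as sum_k a**k / k! * d^k/dx^k; the loop stops
    once all remaining derivatives vanish.
    """
    total, term, k = 0.0, list(coeffs), 0
    while term:
        total += a**k / factorial(k) * evaluate(term, x)
        term = derivative(term)
        k += 1
    return total
```

For f(x) = 1 + 2x + 3x², applying the operator with a = 0.5 and evaluating at x = 1.3 gives the same number as evaluating f directly at x + a = 1.8.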
62. principle, the initial and final microstates can be defined by different coordinates and/or momenta ($x' \neq x$ and/or $p' \neq p$), though the condition $x' = x$ is usually adopted. The transition probabilities for moving from $(x, p)_n$ to $(x', p')_m$ and vice versa have to satisfy the detailed balance condition
$$P_n(x, p)\,P(n \to m) = P_m(x', p')\,P(m \to n) \tag{6.6}$$
where $P_n(x, p)$ is the probability of the microstate $(x, p)_n$ in the extended canonical ensemble (Eq. 6.5):
$$P_n(x, p) = Z^{-1}\,e^{-h_n(x, p) + g_n} \tag{6.7}$$
In Eq. 6.6, $P(n \to m)$ is a shorthand for the conditional probability of the transition $(x, p)_n \to (x', p')_m$, given that the system is in the microstate $(x, p)_n$ (with analogous meaning for $P(m \to n)$). Using Eq. 6.7 together with the analogous expression for $P_m(x', p')$ in the detailed balance, and applying the Metropolis criterion, we find that the transition $(x, p)_n \to (x', p')_m$ is accepted with probability
$$acc[n \to m] = \min\left(1,\; e^{h_n(x, p) - h_m(x', p') + g_m - g_n}\right) \tag{6.8}$$
The probability of sampling a given ensemble is
$$P_n = \int P_n(x, p)\,dx\,dp = Z_n\,Z^{-1}\,e^{g_n} \tag{6.9}$$
Uniform sampling sets the condition $P_n = N^{-1}$ for each ensemble ($n = 1, \ldots, N$), which leads to the equality
$$g_n = -\ln Z_n + \ln\frac{Z}{N} \tag{6.10}$$
Equation 6.10 implies that, to get uniform sampling, the difference $g_m - g_n$ in Eq. 6.8 must be replaced with $f_m - f_n$, where $f_n$ is the dimensionless free energy, related to the actual free energy of the ensemble $n$ by the relation $f_n = \beta F_n = -\ln Z_n$, where $\beta$ is the inverse
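The acceptance rule of Eq. 6.8 and the uniform-sampling condition of Eqs. 6.9 and 6.10 can be illustrated with a small sketch. It assumes the sign convention in which the microstate probability is proportional to exp(−h_n + g_n); all names are hypothetical, and the partition functions are passed as logarithms purely for numerical convenience.

```python
import math


def sge_acceptance(h_n, h_m, g_n, g_m):
    """Metropolis acceptance min(1, exp(h_n - h_m + g_m - g_n)) for n -> m.

    h_n is the dimensionless Hamiltonian of the current microstate in ensemble n,
    h_m that of the proposed microstate in ensemble m.
    """
    exponent = h_n - h_m + g_m - g_n
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


def ensemble_probabilities(log_z, g):
    """Probability of visiting each ensemble, P_n proportional to Z_n * exp(g_n)."""
    weights = [math.exp(lz + gn) for lz, gn in zip(log_z, g)]
    total = sum(weights)
    return [w / total for w in weights]
```

Choosing g_n = −ln Z_n (up to a common additive constant) makes every ensemble equally likely, which is the condition the text derives for uniform sampling.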
63. recent MD techniques has been devised. The Replica Exchange Method (REM) [41] provides an elegant and simple solution to quasi-ergodic sampling. In REM, several independent trajectories, called replicas, are simultaneously generated in different thermodynamic conditions. The production of these simultaneous trajectories usually occurs on an array of parallel processors. The thermodynamic conditions of these replicas are chosen so as to span homogeneously the thermodynamic space from the ensemble of interest to a different ensemble with enhanced transition rates, where the sampling is ergodic. During the simulation, neighbouring replicas are allowed to exchange their configurations, subject to specific acceptance criteria. In this fashion, a trajectory is no longer bound to a unique given equilibrium ensemble but can randomly walk in a thermodynamic space of different equilibrium conditions, visiting ensembles where an ergodic sampling is possible and then going back to the quasi-ergodic ensemble of interest. Therefore, REM is an algorithm which employs an extended ensemble formalism in order to overcome slow relaxation. The gain in sampling efficiency with respect to a series of uncoupled parallel trajectories comes from the exchange of information between trajectories, and the replica exchange process is the tool by which information (e.g. a particular configuration) is carried, for example, from a high to a low temperature. The REM algorithm can
64. resistant to time step increase and generate stable long-time trajectories, i.e. they do not show drifts of the total energy. Popular MD algorithms like Verlet, leap-frog and velocity Verlet are all symplectic, and their robustness is now understood to be due in part to this property [69, 9, 22, 70].

2.2 Liouville Formalism: a Tool for Building Symplectic and Reversible Integrators
In the previous paragraphs we have seen that it is highly beneficial for an integrator to be symplectic. We may now wonder if there exists a general way of obtaining symplectic and possibly reversible integrators from first principles. To this end, we start by noting that for any property $A$ which depends on time implicitly through $p, q \equiv x$ we have
$$\frac{dA(p, q)}{dt} = \sum_{q,p}\left(\dot q\,\frac{\partial A}{\partial q} + \dot p\,\frac{\partial A}{\partial p}\right) = \sum_{q,p}\left(\frac{\partial H}{\partial p}\frac{\partial A}{\partial q} - \frac{\partial H}{\partial q}\frac{\partial A}{\partial p}\right) = iLA \tag{2.13}$$
where the sum is extended to all $n$ degrees of freedom in the system. $L$ is the Liouvillean operator defined by
$$iL \equiv \sum_{q,p}\left(\dot q\,\frac{\partial}{\partial q} + \dot p\,\frac{\partial}{\partial p}\right) = \sum_{q,p}\left(\frac{\partial H}{\partial p}\frac{\partial}{\partial q} - \frac{\partial H}{\partial q}\frac{\partial}{\partial p}\right) \tag{2.14}$$
Eq. 2.13 can be integrated to yield
$$A(t) = e^{iLt}A(0) \tag{2.15}$$
If $A$ is the state vector itself, we can use Eq. 2.15 to integrate Hamilton's equations:
$$\begin{pmatrix} q(t) \\ p(t) \end{pmatrix} = e^{iLt}\begin{pmatrix} q(0) \\ p(0) \end{pmatrix} \tag{2.16}$$
The above equation is a formal solution of Hamilton's equations of motion. The exponential operator $e^{iLt}$ times the state vector defines the t-flow of the Hamiltonian system w
65. series, one in the direct lattice and the other in the reciprocal lattice. This method, in its standard implementation, is extremely CPU demanding and scales like $N^2$, with $N$ being the number of charges, with the unfortunate consequence that even moderately large simulations of inhomogeneous biological systems are not within its reach. The rigorous Ewald method, which suffers from none of the inconveniences experienced by the reaction field approach, has however regained interest very recently, after the publication by Darden, York and Pedersen of the Particle Mesh technique and, later on, by Essmann, Darden et al. [34] of the variant termed Smooth Particle Mesh Ewald (SPME). SPME is based on an older idea of Hockney and is essentially an interpolation technique, with a charge smearing onto a regular grid and evaluation via fast Fourier transform (FFT) of the interpolated reciprocal lattice energy sums. The performance of this technique, both in accuracy and efficiency, is astonishing. Most importantly, the computational cost scales like $N\log N$, that is, essentially linearly for any practical application. Other algorithms, like the Fast Multipole Method (FMM), scale exactly like $N$, even better than SPME. However, FMM has a very large prefactor, and the break-even point with SPME is on the order of several hundred thousand particles, that is, as of now, beyond any reasonable practical limit. The combination of the multiple time step algorithm and of
66. setting up the unit cell 45 topology 126 total charge 146 SOLUTE 128 amp SOLUTE solute thermostatting solute atoms 143 solvent thermostatting solvent atoms 43 solute tempering 119 SOLVENT solvent generating the coordinates input examples 128 reading the coordinates of 49 setting up the unit cell 50 amp SOLVENT 149 space group SPACE_GROUP 147 spherical cutoff 37 SPME 6 accuracy B spline interpolation 171 in multiple time scales integrators 86 memory demand B6 performances 36 setting work arrays dimensions 166 START STOP steepest descent STEER steered molecular dynamics 8 61 adding a time dependent bending adding a time dependent stretching adding a time dependent torsion L01 along a curvilinear coordinate 109 printing out the work 84 restart thermal changes STEER_PATH 109 step 86 STEP amp REM 120 STEP amp SGE 134 s_test STRESS stress tensor stretching 154 printing out 95 STRETCHING for the solute 109 stretching potential structure factor structured commands definition of STRUCTURES VORONOT 116 symplectic building integrators DI condition 12 condition for canonical transformations integrators notation of the equations of motion 9 TEMPERATURE temperature scaling with Nos thermostats 143 TEMPERED RI TEMPLATE 129 termatom 162 test times thermal changes thermal wo
67. symmetry of the Hamilton equations one ends up adiabatically in zo Under these assumptions the ratio of the two probabilities on the left hand side of Eq 8 2 can be written as pT zo T z _ eG p T zo lt lt S z E Zo e bH z z7 eb He z7 7 He z0 AF exp B Wrizo sr zr AF 8 3 equation where we have used the facts that 6F z zo ln Zo GF z z ln Z and that the energy difference Hp Hz in the forward adiabatic trajectory equals to the external work done on the systems Equation 8 2 refers to the probability of a single forward or backward trajectory Suppose now to perform a large number of forward trajectories all with a give time schedule but each started from a different initial phase point sampled according to the canonical equilibrium distribution characterized by the Hamiltonian H z zo and a large and not necessarily equal number of backward trajectories with reverse time schedule Steered Molecular Dynamics 63 and starting from initial phase points this time sampled according to the canonical equilibrium distribution characterized by the Hamiltonian H z x Pl By collecting all trajectories yielding the work W in 8 2 the CT may compactly be written as Pr W PRW exp 8 W AF 8 4 where Pr W and P W are the normalized forward and backward distribution functions note that due to the time reversal symmetry for the backward distribution the work is taken with the minus si
68. the extra degrees of freedom whose time scale dynamics can be controlled by varying the parameter Q and W Large values of Q and W slow down the time dynamics of the barostat and thermostat coordinates The potential V determines the time scale of the iGo term the fast component and of the iL contribution the slow component All other sub Liouvilleans either handle the coupling of the true coordinates to the extra degrees of freedom iL expresses the coupling of all momenta including barostat momenta to the thermostat momentum while iL is a coupling term between the center of mass momenta and the barostat momentum or drive the evolution of the extra coordinates of the barostat and thermostat iL and iL The time scale dynamics of these terms depends not only on the potential subdivision and on the parameters W and Q but also on the type of scaling 26 When the molecular scaling is adopted the dynamics of the virial term V contains contributions only from the intermolecular potential since the barostat is coupled only to the center of mass coordinates see Eq 3 29 Indeed the net force acting on the molecular center of mass is independent on the intramolecular potential since the latter is invariant under rigid translation of the molecules When atomic scaling or group i e sub molecular scaling is adopted the virial see Eq 8 39 depends also on the fast intramolecular such as stretching motions In this case the time scale of the
69. to input specifications needed when retrieving the file see amp ANALYSIS The file filename looks like system_file_O traject_file_1i traject_file_n system_file is the parameters file where the time steps and and the CO matrix are specified All other files are reserved for the trajectory Partitioning a very long trajectories in many files allows to overcome e g OS set file size limits or filesystem limits EXAMPLES DUMP write 30 0 OPEN alk 1 aux occupy atom_record 30 END Writes history file and parameters file as specified in auxiliary file alk 1 aux every 30 0 fs After execution the file alk 1 aux is rewritten by the program and looks like Rewritten by Program system_file 66 0 0 traject_file_1 1320 20 30 The numbers in columns 1 are the length of the file in records The numbers in the second columns and second row are the number of records per point calculated by the program and the number of atoms per record given in input see subcommand atom_record in the trajectory file traject_file_1 In the above example the record length is 30 4 3 360 bytes the total size of the file of bytes allocated at simulation start is given by 30 3 4 1230 442800 and the total number of bytes dumped per phase space point is given by 20 30 3 4 7200 Input to ORAC amp INOUT 83 WARNINGS Work only during the acquisition phase see TIME in environment amp RUN PLOT NAME PLOT Write solute coordinates and connect
70. to the following scheme:
i) At each simulation step, one computes the (grid) scaled fractional coordinates $u_{\alpha i}$ and fills an array with $Q$ according to Eq. 4.28. At this stage, the derivatives of the $M_n$ functions are also computed and stored in memory.
ii) The array containing $Q$ is then overwritten by $F[Q]$, i.e. $Q$'s 3-D Fourier transform.
iii) Subsequently, the electrostatic energy is computed via Eq. 4.30. At the same time, the array containing $F[Q]$ is overwritten by the product of itself with the array containing $BC$ (computed at the very beginning of the run).
iv) The resulting array is then Fourier transformed to obtain the convolution $\theta_{rec} \star Q$.
v) Finally, the forces are computed via Eq. 4.36, using the previously stored derivatives of the $M_n$ functions to recast $\partial Q/\partial r_{\alpha i}$.
The memory requirements of the SPME method are limited. $2K_1K_2K_3$ double precision real numbers are needed for the grid charge array $Q$, while the calculation of the functions $M_n(u_{\alpha i} - j)$ and their derivatives requires only $6 \times n \times N$ double precision real numbers. The $K_\alpha$ integers determine the fineness of the grid along the $\alpha$-th lattice vector of the unit cell. The output accuracy of the energy and forces depends on the SPME parameters: the $\alpha$ convergence parameter, the grid spacing and the order $n$ of the B-spline interpolation. For a typical $\alpha \simeq 0.4$ Å$^{-1}$, relative accuracies between $10^{-4}$ and $10^{-5}$ for the electrostatic energy are obtained when the grid spacing is around 1 Å along each
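The memory figures quoted in this excerpt (2·K1·K2·K3 double precision numbers for the grid charge array and 6·n·N for the spline functions and their derivatives) can be turned into a quick estimate. The sketch below assumes 8 bytes per double precision number; the function name and its arguments are hypothetical.

```python
BYTES_PER_DOUBLE = 8  # assumed size of a double precision real number


def spme_memory_bytes(grid, spline_order, n_atoms):
    """Estimate SPME working memory from the counts given in the text.

    grid is the triple (K1, K2, K3) of grid points along the lattice vectors,
    spline_order is the B-spline order n, and n_atoms is the number of charges N.
    """
    k1, k2, k3 = grid
    grid_doubles = 2 * k1 * k2 * k3               # charge array Q
    spline_doubles = 6 * spline_order * n_atoms   # M_n functions and derivatives
    return BYTES_PER_DOUBLE * (grid_doubles + spline_doubles)
```

For a 64 × 64 × 64 grid, fourth-order splines and 10000 charges this gives about 6.1 MB, dominated by the grid array.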
71. topology of the actual solute molecules. This information is provided through a series of free-format keywords and their corresponding input data, as done in the main input file sys.mddata. In this way, ORAC reads the solute connectivity, the atomic charges, the atomic labels (corresponding to those found in the PDB file) and the atomic types according to the chosen force field (i.e. AMBER, CHARMM or others). Moreover, the atomic groups and the improper torsions are also defined. As for the main input file, the file field.tpg is parsed and the composing substrings of each line are interpreted. Comment lines must have the character # in column 1. Each residue (or unit) definition starts with the keyword RESIDUE residue_name, where residue_name is a character label which must match labels found in the command JOIN of the environment &PARAMETERS, and must end with the keyword RESIDUE_END. These residue-delimiting keywords are the only ones in capital letters in field.tpg (see the valine example later on in this section). Atom type definitions and charges are read in between the keywords atom and end. For each atom, three strings must be entered: the PDB atom label, the potential type according to the selected force field as specified in the parameter file (see Sec. 10.3.1) and the point charge in electron units. Groups are composed of all atoms entered between two successive group keywords. The PDB labels must all be different from each other since they
72. which are not marked unsupported in this manual. ORAC is simultaneously an ancient code and a new code which is still in the developing stage. ORAC has indeed a stable core, i.e. the part which is officially maintained, but it also has some obsolete options, some features used for diagnostic or debugging purposes, and some other experimental features not yet fully tested, i.e. the unsupported material. Unsupported features are by no means essential to the functioning of ORAC and may belong to three categories:
Experimental. These features have generally been tested on only one or two Unix platforms (usually OSF1 or HP-UX); some of them can be used only while running in single time step mode, some others can be used only while running with r-RESPA. These features are documented normally and in the WARNINGS section they are referred to as Experimental - Unsupported.
Diagnostic. These features were introduced in the developing stage for diagnostics and debugging purposes. In the current version the diagnostic features are kept, since they may turn out to be useful to the programmer when modifying the code. These features are documented normally and in the WARNINGS section they are referred to as Diagnostic - Unsupported.
Obsolete features. These features are no longer used and will be eliminated in the next ORAC release. They are poorly documented and they are referred to as Obsolete - Unsupported.
10.2.1 &ANALYSIS
T
73. will denote the momentum conjugated to the dynamical variable associated with the thermostat. Also in this case Eq. 6.2 holds, but it takes the form
$h_n(x, p, p_t) = \beta_n H(x, p, p_t)$ (6.13)
In this equation $H(x, p, p_t) = V(x) + K(p) + K(p_t)$ is the extended Hamiltonian of the system, where $V(x)$ is the potential energy, while $K(p)$ and $K(p_t)$ are the kinetic energies of the particles and of the thermostat, respectively. As in the Monte Carlo version, transitions from the n to the m ensemble are realized at fixed configuration, while the particle and thermostat momenta are rescaled as
$p' = p \, (T_m / T_n)^{1/2}, \qquad p_t' = p_t \, (T_m / T_n)^{1/2}$ (6.14)
As in temperature REM [123], the scaling drops the momenta out of the detailed balance and the acceptance ratio takes the form of Eq. Note that if more thermostats are adopted [122], then all additional momenta must be rescaled according to Eq. 6.14.
ST is implemented in the ORAC program exactly as it has been done for REM (see Section 5.2). In particular, global and local scalings of the potential energy can be realized by keeping the temperature of the system fixed. A generic ensemble n is therefore defined by a coefficient $c_n$ (see Eq.) that scales the potential energy $v(x)$ of the replica (the vectorial form of the potential energy, $v(x)$, is used because of possible local scaling, i.e. $V(x) = c_n \cdot v(x)$). In this sort of Hamiltonian tempering, the transition from the n to the m ensemble is accepted with probability
$\mathrm{acc}[n \to m] = \min\left(1, e^{\beta (c_n - c_m) \cdot v(x) + g_m - g_n}\right)$ (6.15)
In this
74. work spent during the switching will be dissipated in the system, resulting in a non-equilibrium, non-canonical distribution and in a systematic error in the free energy estimate. In particular, it is assumed that during a metadynamics simulation all the microscopic variables different from the macroscopic reaction coordinate s are always in the equilibrium state corresponding to the value of s [134]. This property is known as the Markov property, and it summarizes the main assumption of the algorithm: all the slow modes of the system coupled to the reaction under study have to be known a priori and have to be included among the reaction coordinates. Therefore, at variance with the methods presented in the previous chapters, metadynamics should be considered a quasi-equilibrium method, in which the knowledge about the variables that capture the mechanism of a reaction is exploited to gain insight on the transition states and, more generally, to compute the free energy landscape along the relevant reaction coordinates.
7.1 Implementation in ORAC
From the practical point of view, a metadynamics simulation consists of two steps. In the first one, a set of reaction coordinates is chosen whose dynamics describes the process under study. As we said, such a procedure requires a high degree of chemical and physical intuition for its application to complex molecular systems, since these variables are not obviously determined from a molecula
75. 0 TIME 10000.0
&END
&INOUT
  PLOT STEER_ANALYTIC 100.0 OPEN WRK.out
&END
&INOUT
  RESTART
    read file.rst
    write 30.0 OPEN new.rst
  END
&END
In this example the simulation starts from the restart file file.rst and goes from the time found in that file to 10000.0 fs. The total steering time is 18000.0 fs. In the next restarted run, the configuration of the system at t = 10000 fs is found in the file new.rst. The next restarted simulation could thus be of the kind
&RUN
  CONTROL 1
  STEER 0.0 18000.0
  TIME 18000.0
&END
&INOUT
  PLOT STEER_ANALYTIC 100.0 OPEN WRK_10000_18000.out
&END
&INOUT
  RESTART
    read new.rst
  END
&END
In this example the steering is complete, and in the output file WRK_10000_18000.out the work is calculated from t = 10000 fs to t = 18000 fs.
TIME
NAME
TIME - Length of the simulation, not including the rejection phase
SYNOPSIS
TIME time
DESCRIPTION
This command gives the length of the acquisition run, which is to be carried out after the rejection (equilibration) phase. The unit of its real argument time is femtoseconds. During the acquisition run averages are accumulated.
EXAMPLES
TIME 100000.0
10.2.10 &SETUP
The environment &SETUP includes commands concerned with the simulation box setup. In this environment the simulation cell parameters, dimensions and symmetry can be initializ
76. 1 and num2. Here residue numbers are the sequential numbers of the residues as given in input to command JOIN.
• torsion 1ata 1atb 2atc 2atd residue num1 num2
Add a proper torsion to the topology of the current solute molecule. The prefixes 1 and 2 refer to residue num1 and num2, respectively: atoms ata and atb belong to residue number num1, while atoms atc and atd are on residue number num2. Torsions having three atoms on one residue and one atom on the other residue are also allowed. Here residue numbers are the sequential numbers of the residues as given in input to command JOIN. If the command AUTO_DIHEDRAL of the environment &SOLUTE is used, no extra torsions need to be added to the current topology.
• i_torsion 1ata 1atb 2atc 2atd residue num1 num2
Add an improper torsion to the topology of the current solute molecule. The prefixes 1 and 2 refer to residue numbers num1 and num2, respectively: atoms ata and atb belong to residue number num1, while atoms atc and atd are on residue number num2. Improper torsions having three atoms on one residue and one atom on the other residue are also allowed. Here residue numbers are the sequential numbers of the residues as given in input to command JOIN.
EXAMPLES
ADD_TPG SOLUTE
  bond 1sg 2sg residue 6 127
  bond 1sg 2sg residue 30 115
  bond 1sg 2sg residue 64 80
  bond 1sg 2sg residue 76 94
END
Add extra bonds to the current topology. In this exa
77. 10.2.9 &RUN
Define run time parameters which concern output printing and run averages. The following commands are allowed:
CONTROL, DEBUG, OPTION, MAXRUN, PRINT, PROPERTY, REJECT, STEER, TIME
CONTROL
NAME
CONTROL - Indicate initial conditions
SYNOPSIS
CONTROL icontrol
DESCRIPTION
ORAC can run simulations or minimizations reading the system initial momenta and/or coordinates from different sources. If the integer argument icontrol is zero, the simulation run must commence from coordinates either stored entirely in a PDB file or generated by ORAC itself from some initial configuration (see CELL or SPACE_GROUP in &SETUP). CONTROL 0 implies that all system momenta are initialized from the Boltzmann distribution at the wanted temperature. With CONTROL 2 the run is started from the restart file defined by the command SAVE or RESTART in &INOUT, and all system averages are zeroed. With CONTROL 1 the run is likewise started from the restart file, but the averages are not reset to zero.
EXAMPLES
CONTROL 2
Run a simulation from a restart file and set all averages to zero.
DEFAULTS
CONTROL 0
WARNINGS
When restarting a run with an integration scheme different from the one used in the restart file, CONTROL should be set to 2. If not, unpredictable behavior may occur.
DEBUG
NAME
DEBUG - Print debug information
SYNOPSIS
DEBUG all
DEBUG debug_type
DESCRIPTION
Print various arrays to the standard output for debugging. Informatio
78. 10000.0 traj.out
Print trajectory in file traj.out every 10 ps.
10.2.5 &PARAMETERS
This environment includes commands which read the topology and force field parameter files of the solute. These files, described in Sec. 10.3, contain sufficient information to define the solute topology and to assign potential parameters to the solute molecules. The following commands are allowed:
ADD_TPG, JOIN, PRINT_TOPOLOGY, READ_TPGPRM, READ_PRM_ASCII, READ_TPG_ASCII, REPL_RESIDUE, WRITE_TPGPRM_BIN
ADD_TPG SOLUTE
NAME
ADD_TPG SOLUTE - Add topology components to the current solute molecule
SYNOPSIS
ADD_TPG SOLUTE
...
END
DESCRIPTION
The structured command ADD_TPG SOLUTE opens an environment including commands which add extra bonds, proper torsions and improper torsions to the topology of the current solute molecule(s). The command is closed by END. This command must be used to connect atoms belonging to different residues of the current molecule, for instance to connect two cysteine residues through a sulphur bridge or to bind ligands to a metal atom.
• bond 1ata 2atb residue num1 num2
Add a bond to the topology of the current solute molecule which connects atom ata of residue number num1 with atom atb of residue number num2. The prefixes 1 and 2 refer to residue num1 and num2, respectively. The atom labels ata and atb must be defined in the general formatted topology file as labels of actual atoms of residue number num
79. [Flow chart of the alchemical MD driver; recoverable steps, in order: erf correction in the zero cell; subtract the self energy; update the alchemical work; update velocities using bonded forces (two entries); inner loop ends; compute the direct-space non-bonded forces; update the alchemical work; update velocities using bonded forces; outer loop ends.]
The computational overhead after the inclusion of the alchemical code in the MD driver is mostly due to the evaluation of the alchemical work during the non-equilibrium driven experiment. As the simulation proceeds, the alchemical work must be computed in the direct lattice as well as in the reciprocal lattice, with a frequency identical to that of the energy terms. For the reciprocal lattice contribution, one extra Fast Fourier Transform is required in order to evaluate the reciprocal lattice Particle Mesh Ewald energy at the previous step. Moreover, the erf correction $V_{\rm alch}$ (Eq. 9.5) is an entirely new energy term due to the alchemical species. The efficiency loss of the alchemical code with respect to a non-alchemical code is around 30%, as measured in a short serial simulation of ethanol in water in standard conditions (see the Methods section of the main paper for the simulation parameters).
Chapter 10
Input to ORAC
10.1 General Features
Input File
At execution time O
80. 2_1 2 1 00 00 00 00 1 00 00 00 00 1 00 00 00 00 1 00 00 00 00 1 00 00 00 00 1 00 00 50 00
Space Group Symmetry P 2 c 4 1 00 00 00 00 1 00 00 00 00 1 00 00 00 00 1 00 00 00 00 1 00 00 00 00 0 00 50 00 50 1 00 00 00 00 1 00 00 00 00 1 00 00 50 50 1 00 00 00 00 1 00 00 00 00 1 00 50 50 00
The space group file is parsed by ORAC as usual, by interpreting the composing tokens of each line string. The space group name is taken to begin after the third word (Symmetry) in the first line and may be composed of more than one word. The number of inequivalent molecules nmol in the cell is read in the immediately following line. Then, for each molecule, four lines must be provided, from which the interchange matrix and the fractional translations are read in. No comment lines may be included. In the present example, for the first molecule the identity matrix and the zero translation are given in lines 3-6, while, e.g. in the P2_1 group, for the second molecule a C2 rotation (lines 7-9) and a 0.5 fractional translation (line 10) along the same axis are given. The coordinates of the asymmetric unit must be provided in input through the command READ_PDB. The command REPLICATE is used to generate a simulation box larger than the unit cell. Note that the cell parameters of the simulation box are input to the command CRYSTAL.
EXAMPLES
SPACE_GROUP sgroup da
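81. 39
Figure 4.1: Bending and dihedral angles
Torsion
We limit our analysis to a purely torsional system (see Fig. 4.1b) where atoms 2 and 3 are held fixed and all bond distances and the angle $\theta$ are constrained to their equilibrium values. The system has only one degree of freedom, the dihedral angle $\phi$, driven by the torsional potential $V_{\rm tors}$. Again we rewrite the kinetic energy in terms of the bond distances, the dihedral angle and the constant bend angle $\theta$. For the kinetic energy, the only relevant coordinates are now those of atoms 1 and 4:
$x_1 = -d_{23}/2 + d_{12} \cos\theta, \qquad x_4 = d_{23}/2 - d_{34} \cos\theta$
$y_1 = d_{12} \sin\theta \cos(\phi/2), \qquad y_4 = d_{34} \sin\theta \cos(\phi/2)$
$z_1 = d_{12} \sin\theta \sin(\phi/2), \qquad z_4 = -d_{34} \sin\theta \sin(\phi/2)$ (4.13)
Since $x_1$ and $x_4$ are constant, only the $y$ and $z$ components contribute to the kinetic energy. The Lagrangian in terms of the dihedral angle coordinate is then
$L = \frac{1}{8} I_t \dot\phi^2 - V_{\rm tors}(\phi)$ (4.14)
where
$I_t = \sin^2\theta \, (m_1 d_{12}^2 + m_4 d_{34}^2)$ (4.15)
Assuming small oscillations, the potential may be approximated by a second order expansion around the corresponding equilibrium dihedral angle $\phi_0$:
$V_{\rm tors} = \frac{1}{2} \left( \frac{\partial^2 V_{\rm tors}}{\partial \phi^2} \right)_{\phi = \phi_0} (\phi - \phi_0)^2$ (4.16)
Substituting Eq. 4.16 into Eq. 4.14 and then writing the Lagrange equation of motion for the coordinate $\phi$, one obtains again the differential equation of a harmonic oscillator, namely
$\ddot\phi + \frac{4}{I_t} \left( \frac{\partial^2 V_{\rm tors}}{\partial \phi^2} \right)_{\phi = \phi_0} (\phi - \phi_0) = 0$ (4.17)
Thus, the uncoupled torsional frequency is given by
$\nu_t = \frac{1}{2\pi} \left[ \frac{4 \left( \partial^2 V_{\rm tors} / \partial \phi^2 \right)_{\phi = \phi_0}}{\sin^2\theta \, (m_1 d_{12}^2 + m_4 d_{34}^2)} \right]^{1/2}$ (4.18)
For many all-atom force fields imp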
82. Corrections for the Multiple Time Step Simulation
In flexible molecular systems of large size, the Ewald summation presents computational problems which are crucial to constructing efficient and stable multiple time step integrators. We have seen that intra-molecular Coulomb interactions between bonded atoms, or between atoms bonded to a common atom, are excluded in most of the standard force fields for protein simulation. In any practical implementation of the Ewald method, the intra-molecular energy $V_{\rm intra}$ is automatically included in the reciprocal space summation and is subtracted in direct space (see Eqs. 4.23 and 4.21). In actual simulations, the reciprocal space sum is computed with a finite accuracy, whereas the intra-molecular term $V_{\rm intra}$, due to the excluded Coulombic interactions, is computed exactly. This clearly prevents the exact cancellation of the intra-molecular forces and energies. When the stretching and bending forces are integrated explicitly, the intra-molecular term due to the excluded contacts varies rapidly with time, and so does the cancellation error. Consequently, instability may be observed when integrating the reciprocal lattice forces in reference systems with large time steps. The correction due to the truncation can be evaluated by approximating the reciprocal lattice sum for the excluded contacts in Eq. 4.21 with an integral in the 3-dimensional k-space, and evaluating this integral from the cutoff
83. ED MIXRULE
  typ1 $r_{\rm min}$ $\epsilon$ $\gamma$ mass
  ...
END
Here typ1 is a character string, not to exceed 7 characters, labeling the atom type; $r_{\rm min}$ is the radius corresponding to the minimum of the Lennard-Jones potential; $\epsilon$ is the Lennard-Jones well depth; $\gamma$ is reserved for later usage and should be set to zero; mass is the atom mass. The non-bonded potential format changes if different Lennard-Jones potentials must be used for the 1-4 interactions in which atom type typ1 is involved:
NONBONDED MIXRULE
  typ1 $r_{\rm min}$ $\epsilon$ $\gamma$ $r^{14}_{\rm min}$ $\epsilon^{14}$ mass
  ...
END
Here the parameters $r^{14}_{\rm min}$ and $\epsilon^{14}$ are used only for 1-4 interactions. In case the argument NOMIXRULE is used, the input to NONBONDED looks like
NONBONDED NOMIXRULE
  typ1 $r_{\rm min}$ $\epsilon$ $\gamma$ mass
  ...
END
  $B_{ij}$ $A_{ij}$
  ...
First, the sequence of the $N_{\rm type}$ force field atom types and Lennard-Jones parameters is read, terminated by the keyword END at the beginning of a new line. Second, a list of the $N_{\rm type}(N_{\rm type} + 1)/2$ interaction potential parameters $B$ and $A$ must be provided in input. For most of the biomolecular force fields, non-bonded mixing rules are commonly used.
EXAMPLES 1
NONBONDED MIXRULE
  h4 1.409 0.015 0.000 1.008
  o  1.661 0.210 0.000 16.000
  ca 1.908 0.086 0.000 12.010
END
EXAMPLES 2
NONBONDED NOMIXRULE
  h 0.000 0.000 0.000 1.008
  o 1.700 0.120 0.000 15.999
    2.000 0.110 0.000 12.011
END
  0.0 0.0 Interaction type h - h
  0.0 0.0 Interaction type h - o
  0.0 0.0 Int
84. ER ALCHEMY
RESTART
NAME
RESTART - Write or read an unformatted file from which a simulation might be restarted
SYNOPSIS
RESTART
...
END
DESCRIPTION
The RESTART command may include the following subcommands:
• read filename
Read the restart configuration from file filename. When this subcommand is active, CONTROL (&RUN) must be non-zero and the command READ_TPGPRM (&PARAMETERS) must have been entered.
• read_multiple_restart filename_prefix num
This command works only if the code is compiled using the MPI libraries and is not recognized when running in serial. Each of the nprocs processes will read a restart file named filename_prefix<iproc+num>.rst. So if filename_prefix is /u/foo/restarts/ala and num is 0, then process 0 will read the file /u/foo/restarts/ala0000.rst, process 1 will read the file /u/foo/restarts/ala0001.rst, and so on. This command is useful when running multiple steered molecular dynamics trajectories in parallel (see also commands ADD_STR_BONDS, ADD_STR_BENDS, ADD_STR_TORS of the environment &POTENTIAL).
• rmr filename_prefix num
Same as above.
• write fprint OPEN filename
Write the restart configuration to file filename every fprint fs.
• write fprint SAVE_ALL_FILES filename
Write the restart configuration to files filename<i>.rst every fprint fs (see also command read_multiple_restart).
EXAMPLES
RESTART
  read file1.rst
  write 1000.0 OPEN file2.rst
END
RESTART
  rmr RESTARTS/ala 0
END
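The file naming used by the multiple-restart subcommand can be illustrated with a short Python sketch. It is not part of ORAC: the helper name restart_filenames is hypothetical, and the four-digit zero padding is an assumption inferred from the ala0000.rst and ala0001.rst examples given in the description.

```python
from __future__ import annotations


def restart_filenames(prefix: str, nprocs: int, num: int = 0, width: int = 4) -> list[str]:
    """List the restart file that each MPI process (iproc = 0 .. nprocs-1) would read.

    Assumption: the index iproc + num is zero-padded to `width` digits,
    as suggested by the ala0000.rst / ala0001.rst examples in the manual.
    """
    return [f"{prefix}{iproc + num:0{width}d}.rst" for iproc in range(nprocs)]


if __name__ == "__main__":
    # Three processes, offset num = 0, prefix taken from the manual's example
    for name in restart_filenames("/u/foo/restarts/ala", nprocs=3):
        print(name)
```

Such a helper can be handy to check, before launching a parallel steered run, that one restart file exists for every process.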
85. Hamilton's equation. A stepwise integration defines a t-flow mapping which may or may not retain these properties. Non-symplectic and/or non-reversible integrators are generally believed [67, 68, 69, 70] to be less stable in the long-time integration of Hamiltonian systems. In this section we shall illustrate the concept of reversible and symplectic mapping in relation to the numerical integration of the equations of motion.
2.1 Canonical Transformation and Symplectic Conditions
Given a system with n generalized coordinates q, n conjugated momenta p and Hamiltonian H, the corresponding Hamilton's equations of motion are
$\dot q_i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q_i}, \qquad i = 1, \ldots, n$ (2.1)
These equations can be written in a more compact form by defining a column matrix x with 2n elements such that
$x = \begin{pmatrix} q \\ p \end{pmatrix}$ (2.2)
In this notation, the Hamilton's equations (2.1) can be compactly written as
$\dot x = J \frac{\partial H}{\partial x}, \qquad J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ (2.3)
where J is a 2n x 2n matrix, 1 is an n x n identity matrix and 0 is an n x n matrix of zeroes. Eq. 2.3 is the so-called symplectic notation for the Hamilton's equations. Using the same notation, we may now define a transformation of variables from $x \equiv \{q, p\}$ to $y \equiv \{Q, P\}$ as
$y = y(x)$ (2.4)
For a restricted canonical transformation [72], we know that the function H(x), expressed in the new coordinates y, serves as the Hamiltonian function for the new coordinates y, that is, the Hamilton's equations
Footnote 1: Symplectic means intertwined in Greek and
86. IN ala10_A.prmtpg
&END
&POTENTIAL
  ADD_STR_BONDS 1 104 400 31.5 15.5
&END
&RUN
  CONTROL 2
  REJECT 0.0
  STEER 10000 50000
  TIME 50009.0
&END
&INOUT
  RESTART
    rmr RESTART/ala10 0
&END
In the first example, a stretching constraint is imposed between atom 1 and atom 104 of the solute. In the second example, a time-dependent driving potential is applied to the same atoms of the solute. The equilibrium distance of this harmonic driving potential moves at constant velocity in $\tau$ = 40 ps, starting at t = 10 ps, between $r_0$ = 31.5 and $r_1$ = 15.5. Since the directive rmr (or read_multiple_restart) is issued in the RESTART (&INOUT) command, the example is assumed to run in parallel. Each process reads a different restart file, named ala10iproc.rst, in the RESTART directory. Note that the path of the restart files is specified with respect to the actual value of the pwd command when the parallel version is executed, i.e. in the PARxxxx directories.
ADD_STR_BENDS
NAME
ADD_STR_BENDS - Add a bending potential between three target atoms
SYNOPSIS
ADD_STR_BENDS iat1 iat2 iat3 k a0 alpha
DESCRIPTION
This command can be used to impose an additional bending constraint between atoms iat1, iat2 and iat3 of the solute. The numeric order of the solute atom indices iat1, iat2, iat3 is that specified in the topology file (see Sec. 10.3). The central atom of the bending is iat2. The added
87. In the extended systems formulation we always deal with real and virtual variables. The virtual variables in the Hamiltonian (3.13) are the scaled coordinates and momenta, while the unscaled variables (e.g. $R = hS$ or $p'_{ik\alpha} = p_{ik\alpha}/s$) are the real counterparts. The variable s in the Nosé formulation plays the role of a time scaling [93]. The above Hamiltonian is given in terms of virtual variables and of a virtual time; it is indeed a true Hamiltonian function, and the corresponding equations of motion can be obtained in a standard fashion by applying Eq. 2.3, with x collecting the virtual coordinates and their conjugate momenta. Nonetheless, the equations of motion in terms of these virtual variables are inadequate for several reasons since, for example, one would deal with a fluctuating time step [85, 93]. It is therefore convenient to work in terms of real momenta and real time. The real momenta are related to their virtual counterparts through division by s (Eqs. 3.14-3.17): this holds for the center-of-mass momenta, for the atomic momenta, $p_{ik\alpha} \to p_{ik\alpha}/s$, for the barostat momenta, and for the thermostat momentum, $p_s \to p_s/s$. It is also convenient to introduce new center-of-mass momenta (Eq. 3.19) such that the corresponding velocities may be obtained directly, without the knowledge of the coordinates h (Eq. 3.20). Finally, a real time formulation and a new dynamical variable $\eta$ are adopted:
$t \to t/s$ (3.21)
$\eta = \ln s$ (3.22)
The equations of motions for the newly adopted set of dynamical variables a
88. In the serial version of ORAC this file is written in the working directory. In the parallel version it is written in the PARXXXX directories. The file reports the energies of the system (see top of the file), including the ensemble index corresponding to the current replica (e.g. the number n, if the current ensemble of the replica is $\Lambda_n$). The file is updated in time intervals defined by the parameter Le of the command STEP (see below).
SGE_HISTOG
In the serial version of ORAC this file is written in the working directory. In the parallel version it is written in the PAR0000 directory. The file reports a histogram related to the replica population of the various ensembles. The file is updated in time intervals defined by the parameter L of the command STEP (see below).
The following commands are allowed in the &SGE environment:
FIX_FREE_ENERGY, PRINT_ACCEPTANCE_RATIO, PRINT_WHAM, SEGMENT, SETUP, STEP, TRANSITION_SCHEME, ZERO_FREE_ENERGY
FIX_FREE_ENERGY
NAME
FIX_FREE_ENERGY - Set up input for performing a SGE simulation with fixed weight factors
SYNOPSIS
FIX_FREE_ENERGY OPEN PATH/filename
DESCRIPTION
The presence of this command in the input establishes that user-defined weight factors (the $\Delta g_{n \to m} = g_m - g_n$ difference factors in Eq.), instead of self-updating free energy differences (the $\Delta f_{n \to m} = f_m - f_n$ free energy differences in Eq. 6.22), must be used in the SGE simulation. Such factors are kept constant during the simulation
89. M can also be applied to a specific part of the potential, thereby localizing the effect of the configurational exchanges to specific parts of the system. Given a potential made up of a sum of various ($i = 1, \ldots, k$) contributions (e.g. stretching, bending, torsional, solute-solvent, solute-solute, solvent-solvent non-bonded, etc.), one can define in a general way the m-th replica of the extended system as
$V_m(X) = \sum_{i=1}^{k} c_i^{(m)} v_i(X)$ (5.13)
where $c_i^{(m)}$ is the scaling factor for the i-th contribution $v_i(X)$ of the potential of the m-th replica. So each replica is characterized by a k-dimensional scaling vector $c_m = (c_1^{(m)}, \ldots, c_k^{(m)})$ whose components are the scaling factors of the k contributions of the interaction potential for that replica. The target replica (replica 1) is such that $V_1(X) = V(X)$, the unscaled potential corresponding to the target system, for which $c_1 = (1, 1, 1, \ldots, 1)$. In vector notation we may compactly write Eq. 5.13 as
$V_m(X) = c_m \cdot v(X)$ (5.14)
Using this formalism, the probability of a configuration X in the m-th replica may be written as
$P_m(X) = \frac{1}{Z_m} e^{-\beta c_m \cdot v(X)}$ (5.15)
with $Z_m = \int e^{-\beta c_m \cdot v(X)} \, dX$. The detailed balance condition for the exchange of configurations between replica m, characterized by the scaling vector $c_m$, and replica n, characterized by the scaling vector $c_n$, is then given by W X Cm X Cn B Again, the detailed balance is implemented through Eq. 5.6 with W X
90. ORAC 5 User Manual
Release 5.4
ORAC: A Molecular Dynamics Program to Simulate Complex Molecular Systems at the atomistic level
Authors and copyright holders: Piero Procacci, Massimo Marchi
Contributors: Simone Marsili, Tom Darden, Marc Souaille, Giorgio Federico Signorini, Riccardo Chelli, Emilio Gallicchio
Contents
1 Atomistic simulations: an introduction
7.1 Implementation in ORAC
9.0.1 Production of the MD trajectory with an externally driven alchemical process
10 Input to ORAC
91. OUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A general version of the GPL may be requested at: Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
Contributors to ORAC
Main contributors and license holders:
Piero Procacci, Dipartimento di Chimica, Università di Firenze, Via della Lastruccia 3, I-50019 Sesto Fiorentino, Italy. E-mail: piero.procacci@unifi.it
Massimo Marchi, Commissariat à l'Energie Atomique, DSV/IBITEC-S/SB2SM, Centre d'Etudes de Saclay, 91191 Gif sur Yvette Cedex, France. E-mail: massimo.marchi@cea.fr
Other contributors:
Simone Marsili, Dipartimento di Chimica, Università di Firenze, Via della Lastruccia 3, I-50019 Sesto Fiorentino, Italy. E-mail: simone.marsili@unifi.it. Role in development: Replica Exchange Method and Metadynamics.
Giorgio Federico Signorini, Dipartimento di Chimica, Università di Firenze, Via della Lastruccia 3, I-50019 Sesto Fiorentino, Italy. E-mail: giorgio.signorini@unifi.it. Role in development: Tests, tools, package, distribution and version management.
Riccardo Chelli, Dipartimento di Chimica, Università di Firenze, Via della Lastruccia 3, I-50019 Sesto Fiorentino, Italy. E-mail: riccardo.chelli@unifi.it. Role in development: Serial Generalized Ensemble simulations.
Marc Souaille, Medit SA, 2 rue du Belvédère, 91120 Palaiseau, France. Role in devel
92. RAC reads an input file in free format from the standard input. Each line of the input is read as an 80-character string and interpreted. If the first character of the input line is a #, the line is ignored. In order to be interpreted, the input line is parsed into its composing words, which are sequences of characters separated by blanks or commas. Each word represents an instruction which must be interpreted by the program.
Instructions Set
The instruction set of ORAC includes environments, commands and sub-commands. An input file is made out of a series of environments. Each environment allows a series of commands, which might use a few sub-commands. Environments resemble Fortran NAMELISTs, but have not been programmed as NAMELISTs. The environment name is a string which always starts with a & followed by capital letters. Each environment ends with the instruction &END. Command names are character strings, all in capital letters. Each command reads a variable set of parameters, which can be characters and/or numbers (real or integer). There are also commands (structured commands) which are composed of more than one input line. A structured command ends with the instruction END and allows a series of sub-commands inside it. Sub-commands are in lower case and can read sub-strings of characters and/or real or integer numbers. In the following section we will describe in detail all the supported instructions allowed by ORAC.
Handling External Files
Many ORAC c
93. REM, because replica exchanges occur between microstates of the same extended thermodynamic ensemble. To achieve rapid sampling of the ensemble space through high acceptance rates, we need to choose ensembles appropriately, so that neighboring ensembles overlap significantly. As stated above, the most critical aspect in SGE schemes is the determination of weight factors (viz. dimensionless free energy differences between neighboring ensembles). This issue has been the subject of many studies, especially addressed to ST simulations. The first attempts are based on short trial simulations [46, 107, 108]. The proposed procedures are however quite complicated and computationally expensive for systems with many degrees of freedom. Later, Mitsutake and Okamoto suggested performing a short REM simulation to estimate ST weight factors [109] via multiple-histogram reweighting [52, 53]. A further approximated, but very simple, approach to evaluate weight factors is based on average energies calculated by means of conventional molecular dynamics simulations [110]. The weight factors obtained by the average-energy method of Ref. [110] were later demonstrated to correspond to the first term of a cumulant expansion of free energy differences [48]. Huang et al. used approximated estimates of potential energy distribution functions from short trial molecular dynamics simulations to equalize the acceptance rates of forward and
94. RIPTION
Save data necessary for reweighting by the weighted histogram analysis method [106] (WHAM) every freq_print fs in the file SGE_WHAM. In the serial version of ORAC this file is written in the working directory. In the parallel version it is written in the PARXXXX directories. If a Hamiltonian SGE simulation is performed, then the file reports the 3 unscaled potential energy terms (the $v(x)$ vector of Eq.) that are subject to scaling (see command SETUP above). In a SGE simulation in the space of collective coordinates, instead of $v(x)$, the file reports the index of the bond, bending or torsion coordinate, the equilibrium value corresponding to the current ensemble ($\Lambda_n$ in Eq. 6.16) and the current value of the coordinate ($r$ in Eq. 6.16).
EXAMPLES
PRINT_WHAM 1000.
Print data every 1000 fs.
DEFAULTS
No data are printed.
SEGMENT
NAME
SEGMENT - Define the solute in Hamiltonian SGE simulations
SYNOPSIS
SEGMENT
...
END
DESCRIPTION
This structured command is used to define the solute in a Hamiltonian SGE simulation and to assign the scaling factors to the intrasolute, solute-solvent and solvent-solvent interactions. The following subcommands may be specified within SEGMENT: define, kind.
• define n1 n2
The define command is used to crop a piece of solute for Hamiltonian scaling in a SGE simulation. One can use up to a maximum of 10 define commands, cropping 10 disconnected (non-overlapping) part
95. Ras 3 2 lira likas Here the indices $i$ and $k$ refer to molecules and atoms, respectively, while Greek letters label the Cartesian components. $r_{ik\alpha}$ is the $\alpha$ component of the coordinates of the k-th atom belonging to the i-th molecule; $R_{i\alpha}$ is the center-of-mass coordinate; $S_{i\beta}$ is the scaled coordinate of the i-th molecular center of mass; $l_{ik\alpha}$ is the coordinate of the k-th atom belonging to the i-th molecule, expressed in a frame parallel at any instant to the fixed laboratory frame but with origin on the instantaneous molecular center of mass. The set of $l_{ik\alpha}$ coordinates satisfies 3N constraints of the type $\sum_{k} m_{ik}\, l_{ik\alpha} = 0$. The matrix $h$ and the variable $s$ control the pressure and the temperature of the extended system, respectively. The columns of the matrix $h$ are the Cartesian components of the cell edges with respect to a fixed frame. The elements of this matrix allow the simulation cell to change shape and size and are sometimes called the barostat coordinates. The volume of the MD cell is related to $h$ through the relation
$$\Omega = \det(h) \qquad (3.3)$$
$s$ is the coordinate of the so-called Nosé thermostat and is coupled to the intramolecular and center-of-mass velocities. We define the potentials depending on the thermodynamic variables $P$ and $T$:
$$V_P = P \det(h), \qquad V_T = \frac{g}{\beta} \ln s \qquad (3.4)$$
where $P$ is the external pressure of the system, $\beta = 1/(k_B T)$, and $g$ is a constant related to the total number of degrees of freedom in the system. This const
96. SIS
  STEER tinit tfinal
  STEER temperature temp0 tempt tinit tfinal
DESCRIPTION
  Steer the system with the time dependent mechanical potential defined in the namelist &POTENTIAL, starting the SMD at time tinit and ending at time tfinal. This command can also be used to change the temperature gradually, in conjunction with the command PLOT STEER_TEMPERATURE (&INOUT), where the adimensional thermal work done on the thermostat is printed at regular time intervals (see the PLOT (&INOUT) command). Steered molecular dynamics can be restarted automatically. To do this, one sets the steering time tfinal to the desired value once and for all, updating at each restart only the simulation time given by the TIME (&RUN) directive.
EXAMPLES
  &RUN
    STEER 1000.0 10000.0
  &END
  &INOUT
    PLOT STEER_ANALYTIC 100.0 OPEN WRK.out
  &END
  Start to apply the time dependent potential (see Eq. 8.13) at 1 ps and switch it off at 10 ps. Print out the accumulated work every 100 fs to the file WRK.out. The accumulated work is calculated according to Eq. 8.16.

  &RUN
    STEER temperature 300 1500 1000.0 11000.0
  &END
  &INOUT
    PLOT STEER_TEMPERATURE 100.0 OPEN WRKTEMP.out
  &END
  Raise the temperature from 300 to 1500 K, starting at 1 ps and ending at 11 ps, with the constant speed of 120 K/ps. The thermal work is printed every 100 fs to the file WRKTEMP.out.

  &RUN CONTROL 1 STEER 0.0 18000
97. a Atn 1 a At When the large step size at which the intermittent impulses are computed matches the period of natural oscillations in the system, one can detect instabilities of the numerical integration due to resonance effects. Resonances occur for pathological systems, such as fast harmonic oscillators in the presence of strong, albeit slowly varying, forces, and can be cured easily by tuning the time steps in the multilevel integration. However, for large and complex molecules it is unlikely that an artificial resonance could be sustained for any length of time [69].

3 In the original force breakup [18, 19] the energy is not generally conserved during the unperturbed motion of the inner reference systems, but only at the end of the full macro-step. Force breakup and potential breakup have been proved to produce identical trajectories [22]. With respect to the force breakup, the implementation of the potential breakup is slightly more complicated when dealing with intermolecular potential separation, but the energy conservation requirement in any defined reference system makes the debugging process easier.

The integration algorithm that can be derived from the above propagator was first proposed by Tuckerman, Martyna and Berne and called r-RESPA (reversible reference system propagation algorithm) [19].

2.4 Constraints and r-RESPA

The r-RESPA approach makes it unnecessary to resort to the SHAKE procedure [9, 10] to freeze s
98. a flexible molecular systems we have fast degrees of freedom, governed by the stretching, bending and torsional potentials, and slow intermolecular motions, driven by the intermolecular potential. As we shall discuss in greater detail in section 4, in real systems there is no clear-cut separation between intra- and intermolecular motions, since their time scales may well overlap in many cases. The conditions of Eq. are hence never fully met for any of the possible potential subdivisions. Given a potential subdivision, Eq. 2.26, we now show how a multi-step scheme can be built with the methods described in section 2.2. For the sake of simplicity, we subdivide the interaction potential of a molecular system into two components only: one intramolecular, $V_0$, generating mostly fast motions, and the other intermolecular, $V_1$, driving slower degrees of freedom. Generalization of the forthcoming discussion to an n-fold subdivision, Eq. 2.26, is then straightforward. For the 2-fold inter/intra subdivision, the system with Hamiltonian $H = T + V_0$ is called the intramolecular reference system, whereas $V_1$ is the intermolecular perturbation to the reference system. Correspondingly, the Liouvillean may be split as
$$iL_0 = \dot{q}\,\frac{\partial}{\partial q} - \frac{\partial V_0}{\partial q}\,\frac{\partial}{\partial p}, \qquad iL_1 = -\frac{\partial V_1}{\partial q}\,\frac{\partial}{\partial p} \qquad (2.30)$$
Here $L_0$ is the Liouvillean of the 0-th reference system with Hamiltonian $T + V_0$, while $L_1$ is a perturbation Liouvillean. Let us now suppose that $\Delta t$ is a
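The two-fold split above can be illustrated with a minimal sketch of a reversible two-level time step, in which the slow force from $V_1$ is applied as two half kicks around an inner velocity Verlet loop driven by $V_0$. This is only a one-dimensional illustration under stated assumptions, not the ORAC implementation: the function names, the harmonic test forces and the step sizes are all hypothetical.

```python
def respa_step(q, p, mass, fast_force, slow_force, dt_outer, n_inner):
    """One reversible two-level multiple time step (r-RESPA-style) update.

    fast_force(q) plays the role of -dV0/dq and slow_force(q) of -dV1/dq.
    All names here are illustrative assumptions.
    """
    dt_inner = dt_outer / n_inner
    p = p + 0.5 * dt_outer * slow_force(q)       # opening half kick from V1
    for _ in range(n_inner):                     # velocity Verlet under T + V0
        p = p + 0.5 * dt_inner * fast_force(q)
        q = q + dt_inner * p / mass
        p = p + 0.5 * dt_inner * fast_force(q)
    p = p + 0.5 * dt_outer * slow_force(q)       # closing half kick from V1
    return q, p


def total_energy(q, p, mass=1.0, k_fast=100.0, k_slow=1.0):
    """Energy of the toy model: stiff spring (V0) plus soft spring (V1)."""
    return 0.5 * p * p / mass + 0.5 * (k_fast + k_slow) * q * q


def run(q, p, n_steps, dt_outer=0.01, n_inner=5,
        mass=1.0, k_fast=100.0, k_slow=1.0):
    """Propagate the toy model for n_steps outer steps."""
    def fast(x):
        return -k_fast * x

    def slow(x):
        return -k_slow * x

    for _ in range(n_steps):
        q, p = respa_step(q, p, mass, fast, slow, dt_outer, n_inner)
    return q, p
```

Because the update is a symmetric composition, running it forward, flipping the sign of the momentum and running it again for the same number of steps should bring the toy system back to its starting point up to round-off.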
99. adynamics approach 142 the third choice is given by P 4 add e7 V s t keT to where the probability depends parametrically both on time t and on position s of the system along the reaction coordinate through the biasing potential V In this case the biasing potential does not converge to the free energy inverted in sign as in the previous case since in general w turns out to be coordinate dependent even when the potential has flatten the free energy profile However as shown in 142 the relation A s Tvet 7 9 can be used to recover the original free energy from the biasing potential The multiple walkers version of metadynamics algorithm 143 was implemented in the parallel version of the code through the MPI library This approach is based on running simultaneously multiple replicas of the system contributing equally to the same history dependent potential and therefore to the same free energy surface reconstruction For N replicas V s t can be written as a double sum Vist X J eee 7 10 EST Tatte N where s w is the position at time t of the i th replica along s In particular the enhanced efficiency of this algorithm with respect to uncoupled simulations contributes to make the calculation of FESs in high dimensions more accessible In the ORAC distribution at http www chim unifi it orac we provide some example of metadynamics simulations using Lucy s functions on multi dimensional surfaces of simple
100. al is organized as follows. The first seven chapters constitute the ORAC theoretical background. Chapter 1 contains general and introductory remarks. Chapter 2 deals with symplectic and reversible integrators and introduces the Liouvillean formalism. Chapter 3 extends the Liouvillean formalism to the extended Lagrangian methods, and Chapter 4 describes how to deal with long range electrostatic interactions and how to combine the SPME method with the multilevel integration of the equations of motion in order to obtain efficient simulation algorithms. Chapters 5 to 7 have been added in the present release. Chapter 5 contains an introduction to replica exchange techniques and a description of how this technique has been implemented in the ORAC program. Chapter 6 deals with metadynamics simulations. Chapter 7 treats steered molecular dynamics simulations and the theory of non-equilibrium processes. Chapter 8 is the command reference of the ORAC program. Chapter 9 contains instructions on how to compile and run ORAC in a serial and parallel environment.

1 The ORAC program has been copyrighted (C) by Massimo Marchi and Piero Procacci, 1995-2008. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 2 of the License or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITH
101. al motions is identical for all integrators, within statistical error, when evaluated over 20 ps of simulation. From these results it can be concluded that S1 and R3 are very likely to produce essentially the same dynamics for all relevant degrees of freedom. We are forced to state that the integrator S also appears to predict accurately the structure and overall dynamics of the torsional degrees of freedom, at least for the 20 ps time span of this specific system.

Figure 2.2: Power spectra of the velocity autocorrelation function (left) and of the torsional internal coordinates (right) at 300 K for a cluster of 8 C24H50 molecules, calculated with integrators E, R3, S1 and S (see text), starting from the same phase space point. [Axes: wavenumbers; normalized intensity.]

4 For example, our conclusions on the effect of SHAKE on torsional motions for highly flexible systems differ from the

Since torsions are not normal coordinates and couple to higher frequency internal coordinates such as bending and stretching, the ability of the efficient S integrator to predict correctly low frequency dynamics and structural properties cannot be assumed a priori and must, in principle, be verified for each specific case. We also do not know how the individual eigenvectors are affected by th
102. al transformations in a MD driver code with multiple time step MTS integrators and Particle Mesh Ewald treatment of long range electrostatics in ORAC The modification due the alchemical transformations are highlighted in red Steered Molecular Dynamics Alchemical MD pseudo code Read coordinates and velocities and compute forces at zero time Simulation begins N long ranged non bonded loop begins Update velocities at Aty 2 using N forces Update velocities at Aty 2 using N forces continued N intermediate ranged non bonded loop begins Update velocities at Aty 2 using N2 forces N3 Short ranged non bonded loop begins Update velocities at Aty 2 using N3 forces N Slow bonded energy shell loop begins torsion Update velocities at Aty 2 using N4 forces Update velocities and coordinates at Aty 2 using N5 forces compute Ns bonded forces at Aty update velocities bonded forces at Aty using ty forces Ns Loop ends Compute N4 bonded forces at Aty Update velocities bonded forces at Aty using ty forces N4 Loop ends Update externally driven A and 7 Compute N3 direct space non bonded forces energy and work at Aty Compute N3 erf forces energies and work due to Vaich Update the alchemical work using the N3 contribution Update velocities at Aty using ty forces N3 Loop ends Compute N direct space non bonded forces energy and work at Aty Compute the Reciprocal lattice forces at Aty Compute the 12 13 1nd
103. an environment which includes the same series of commands and subcommands accepted by the general formatted topology file described in Sec. 10.3.
EXAMPLES
  REPL_RESIDUE
  RESIDUE gly ( Total Charge = 0.0 )
    atoms
      group
        n    n   -0.41570
        hn   h    0.27190
      group
        ca   ct  -0.02520
        ha1  h1   0.06980
        ha2  h1   0.06980
      group
        c    c    0.59730
        o    o   -0.56790
    end
    bonds
      n hn    n ca    o c
      c ca    ca ha1  ca ha2
    end
    imphd
      -c ca n hn    ca +n c o
    end
    termatom n c
    backbone n ca c
  RESIDUE_END
  END
  Replace or add the topology for residue gly.

WRITE_TPGPRM_BIN
NAME
  WRITE_TPGPRM_BIN - Write an unformatted parameter and topology file
SYNOPSIS
  WRITE_TPGPRM_BIN filename
DESCRIPTION
  This command must be used in combination with READ_TPG_ASCII and READ_PRM_ASCII. It produces the binary file filename containing the force field and topology tables associated with the current solute molecule(s), which can be reread in subsequent runs by the command READ_TPGPRM.
EXAMPLES
  WRITE_TPGPRM_BIN molecule1.prmtpg
  Write the unformatted topology and parameter file for the current solute molecule, which can be read by READ_TPGPRM.
WARNINGS
  Must be used in conjunction with the commands READ_TPG_ASCII and READ_PRM_ASCII.

10.2.6 &POTENTIAL

The environment &POTENTIAL includes commands which define the general features of the system interacting potentials. These features are common to both solute and
104. and is active both for solute and solvent molecules. It writes a history PDB file containing the system coordinates. The centers of mass of all molecules are always inside the simulation cell. The dumping frequency in fs is fplot. At each writing, the system coordinates in PDB format are appended to the history file filename.
EXAMPLES
  ASCII 10.0 OPEN test.pdb
  Write system coordinates to the history PDB file test.pdb every 10 fs.
WARNINGS
  Works only during the acquisition phase (see TIME in environment &RUN).

ASCII_OUTBOX
NAME
  ASCII_OUTBOX - Write solute and solvent coordinates to a history PDB file
SYNOPSIS
  ASCII_OUTBOX fplot OPEN filename
DESCRIPTION
  This command is active both for solute and solvent molecules. It writes a history PDB file containing the system coordinates. The dumping frequency in fs is fplot. The centers of mass of the molecules are at the positions given by the simulation and may hence also lie outside the simulation box. At each writing, the system coordinates in PDB format are appended to the history file filename.
EXAMPLES
  ASCII_OUTBOX 10.0 OPEN test.pdb
  Write system coordinates to the history PDB file test.pdb every 10 fs.
WARNINGS
  Works only during the acquisition phase (see TIME in environment &RUN).

DCD
NAME
  DCD - Write solute and solvent coordinates to a trajectory DCD file
SYNOPSIS
  DCD fplot OPEN filename
  DCD fplot OPEN filename NOH
DESCRIPTION
  This command is a
105. anged is that described at point 3 free energy calculation It should be noted that when a free energy estimate is performed the work arrays stored for each replica processor see point 2 do not need to be communicated to all other replicas processors Only the sums yo Loar case of Eq 6 25 pagal ie ae Ci case of Eq 6 26 and so exp Wi n gt m case of Eq 6 27 together with Nnm and Nim n must be exchanged for all N 1 ensemble transitions Then each replica processor will think by itself to reassemble the global sums Exchanging one information implies to send M M 1 N 1 real integer numbers through the net 60 kB of information using 20 replicas and slightly less than 1 MB of information using 50 replicas Only in the case of the iterative procedure of Eq 6 25 one information has to be sent several times per free energy calculation i e the number of iterations needed for solving the equation The computational cost arising from computer communications can however be reduced updating the free energy rarely Furthermore in order to improve the first free energy estimate and hence to speed up the convergence the M simulations should be started by distributing the replicas among neighboring ensembles namely replica 1 to Ay replica 2 to Ag and so on see also the discussion at the beginning of the current section 6 3 3 Free energy evaluation from independent estimates and associated vari ances
106. ant is chosen to correctly sample the NPT distribution function The extended NPT Lagrangian is then defined as N 1 2 tptp 1 21t 1 2 ti L 3 Mis S h hS 3 a Mins ltiglig zs tr h h 3 5 lQ Pai 3 6 58 V Pert B ns The arbitrary parameters W and Q are the masses of the barostat and of the thermostats respectively They do not affect the sampled distribution function but only the sampling efficiency 94 For a detailed discussion of the sampling properties of this Lagrangian the reader is referred to Refs Bg 3 2 The Parrinello Rahman Nos Hamiltonian and the Equations of Motion In order to derive the multiple time step integration algorithm using the Liouville formalism described in the preceding sections we must switch to the Hamiltonian formalism Thus we evaluate the conjugate momenta of the coordinates Sia lika Rag and s by taking the derivatives of the Lagrangian in Eq 8 6 with respect to corresponding velocities i e T MGs 3 7 Pik mixslix 3 8 Pa Wh 3 9 Ps Qi 3 10 3W has actually the dimension of a mass while Q has the dimension of a mass time a length squared Multiple Time Steps Algorithms 21 Where we have defined the symmetric matrix G h h 3 11 The Hamiltonian of the system is obtained using the usual Legendre transformation X ap Lla 4 3 12 One obtains H ee T G T pt ikPik ltr Pt Ph pA Ms y Mi ih 3 s2 Ww 2Q l Vero 3 13
107. antaneous, i.e. if it is done at infinite speed, then the work done on the system is simply equal to $W = H_1 - H_0$, with $H_0$ and $H_1$ being the Hamiltonians of the initial and final state, respectively. The JI reduces in this case to the famous free energy perturbation (Zwanzig [118]) formula
$$\left\langle e^{-\beta (H_1 - H_0)} \right\rangle_0 = e^{-\beta \Delta F}$$
with the subscript 0 indicating that the canonical average must be taken according to the equilibrium distribution of the system with Hamiltonian $H_0$. For fast non-equilibrium experiments, a large amount of the work, rather than advancing the reaction coordinate, is dissipated in heat that is in turn only partly assimilated by the thermal bath. A consequence of this is that the maxima of the two work distributions $P_F(W)$ and $P_R(-W)$ tend to move further apart from each other, so that the determination of $\Delta F$ becomes less accurate. The faster the non-equilibrium experiments are performed, the larger is the average dissipation and the smaller is the overlap between the two work distributions (see Fig.). The reason why CT and JI can be so useful in evaluating the free energies along given reaction paths in the molecular dynamics simulation of complex biological systems lies in the fact that these methodologies are inherently more accurate the smaller is the sample. Let us see why. As one can see from Fig., $\Delta F$ can be determined with accuracy if the two work distributions overlap appreciably or, stated in other terms, if there are sufficient trajectories th
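The exponential average that appears in the JI and in its instantaneous limit, the Zwanzig formula, can be evaluated from a list of work values as sketched below. This is a minimal illustration, not ORAC code: the function name is hypothetical, and the shift by the smallest work value is an assumed log-sum-exp safeguard against overflow that does not change the result.

```python
import math


def exponential_average_free_energy(work_values, beta):
    """Estimate dF = -(1/beta) * ln < exp(-beta * W) > from work samples.

    The smallest work value is factored out of the average so that every
    exponent is <= 0 and the exponentials cannot overflow.
    """
    n = len(work_values)
    if n == 0:
        raise ValueError("at least one work value is required")
    w_min = min(work_values)
    shifted_sum = sum(math.exp(-beta * (w - w_min)) for w in work_values)
    return w_min - math.log(shifted_sum / n) / beta
```

By Jensen's inequality the estimate never exceeds the arithmetic mean of the work values, which is the sample counterpart of the dissipation discussed above.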
108. are used to establish the topology and connectivity of the solute. The bond connectivity is specified between the keywords bonds and end by providing the series of bonds present in the residue. Each bond is specified by two atom labels, corresponding to the atoms participating in the bond. All possible bendings and proper torsions are computed by ORAC from the bond connectivity and need not be specified. Improper torsions must instead be provided. Improper torsions are used to impose geometrical constraints on specific quadruplets of atoms in the solute. In modern all-atom force fields, improper torsions are generally used to ensure the planarity of an sp2 hybridized atom. The convention in ORAC for computing the proper or improper torsion dihedral angle is the following: if $r_1, r_2, r_3, r_4$ are the position vectors of the four atoms identifying the torsion, the dihedral angle $\chi$ is defined as
$$\chi = \arccos \left\{ \frac{\left[ (r_2 - r_1) \times (r_3 - r_2) \right] \cdot \left[ (r_3 - r_2) \times (r_4 - r_3) \right]}{\left| (r_2 - r_1) \times (r_3 - r_2) \right| \, \left| (r_3 - r_2) \times (r_4 - r_3) \right|} \right\} \qquad (10.2)$$

RESIDUE
NAME
  RESIDUE - Read covalent topology of the residue
SYNOPSIS
  RESIDUE res1 ... END
DESCRIPTION
  The command RESIDUE reads the covalent topology for the residue labeled res1. res1 must be a character string not exceeding 8 characters. The environment generated by this command can accept the following keywords: atoms, bonds, rigid, dihed, imphd, omit_angle, backbone, termato
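The dihedral angle convention of Eq. 10.2 can be checked numerically with a short sketch. The helper names below are hypothetical and the code is not taken from ORAC; it evaluates the arccos of the normalized scalar product of the two plane normals, so the result is an unsigned angle between 0 and pi.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral_angle(r1, r2, r3, r4):
    """Unsigned dihedral angle (radians) from four position vectors."""
    n1 = _cross(_sub(r2, r1), _sub(r3, r2))   # normal of the plane (1, 2, 3)
    n2 = _cross(_sub(r3, r2), _sub(r4, r3))   # normal of the plane (2, 3, 4)
    norm = math.sqrt(_dot(n1, n1) * _dot(n2, n2))
    if norm == 0.0:
        raise ValueError("three consecutive atoms are collinear")
    cosine = max(-1.0, min(1.0, _dot(n1, n2) / norm))  # guard against round-off
    return math.acos(cosine)
```

With this convention a planar cis arrangement gives 0 and a planar trans arrangement gives pi.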
109. as $W = H(t) - H(0)$, where $H(t)$ is the total energy of the microcanonical extended system, i.e. it includes the energy of the thermostat and/or of the barostat. If the integration time steps are too large and the simulation shows an energy drift, then the accumulated work includes the dissipation due to the energy drift of the integrator.

  PLOT STEER_ANALYTIC 50.0 OPEN WRK.out
  Write the accumulated work (see Eq. 8.16) to the file WRK.out every 50 fs. The accumulated work at time t is calculated analytically according to Eq. This option is slightly more computationally demanding than the previous one, but in this case the accumulated work is not affected by the energy drift. The last two commands are to be used in conjunction with the STEER (&RUN) command and with the commands ADD_STR_BONDS, ADD_STR_BENDS, ADD_STR_TORS (namelist &POTENTIAL) for defining an external steering potential for SMD.

  PLOT STEER_TEMPERATURE 50.0 OPEN WRKTEMP.out
  In a steered temperature simulation [149], write the accumulated adimensional thermal work every 50 fs to the file WRKTEMP.out. This command must be used in conjunction with the STEER (&RUN) command for steered molecular dynamics simulations and with the THERMOS (&SIMULATION) command for running NVT simulations.

  PLOT ALCHEMY 50.0 OPEN alchemic.wrk
  Print to the file alchemic.wrk the work done during an alchemical transformation. See also commands DEFINE_ALCHEMICAL_ATOM and STE
110. at in both directions transiently violate the second law, i.e. trajectories for which $W < \Delta F$. This is clearly not in contrast with the second law, which states that $\langle W \rangle \geq \Delta F$, where $\langle W \rangle = \int P(W)\, W\, dW$ is the mean irreversible work. In general, the probability of an overlap of the two work distributions, i.e. the probability of transiently violating the second law, is clearly larger the smaller is the system. Suppose we simultaneously and irreversibly unfold N identical proteins in a dilute solution, starting from their native states. In the assumption that the

2 The Hamiltonian $H(z = z_t)$ may be imposed in practice in steered molecular dynamics using constraints or adding a stiff harmonic potential that keeps the system at $z = z_t$. Both these methods require small corrections when reconstructing the PMF. In particular, the use of constraints on $z$ also sets $\dot{z} = 0$, a condition that is not present in the definition of the PMF (see previous footnote). The correction to the PMF due to this extra artificial condition, imposed through a generic constraint, is discussed in Ref. [146]. Stiff harmonic potentials (stiff in the sense that the associated stretching motion is decoupled from the degrees of freedom of the system) behave essentially like constraints [147]. The removal of the non-stiff harmonic driving potential from the PMF in AFM experiments has been proposed by Hummer and Szabo.

3 During the non-equilibrium experiment the instantaneous temperature of the system a
111. at may modify part of the density of states of the system. The Liouvillean approach to multiple time step integrators lends itself to the straightforward, albeit tedious, application to extended Lagrangian systems for simulation in the canonical and isobaric ensembles: once the equations of motion are known, the Liouvillean, and hence the scheme, is de facto available. Many efficient multilevel schemes for constant pressure or constant temperature simulation are available in the literature [25, 24, 26]. As already outlined, long range interactions are the other stumbling block in the atomistic MD simulation of complex systems. The problem is particularly acute in biopolymers, where the presence of distributed net charges makes the local potential oscillate wildly while summing, e.g., onto spherical non-neutral shells. The conditionally convergent nature of the electrostatic energy series for a periodic system, such as the MD box in periodic boundary conditions (PBC), makes any straightforward truncation method essentially unreliable [28]. The reaction field is in principle a physically appealing method that correctly accounts for long range effects and requires only limited computational effort. The technique assumes explicit electrostatic interactions within a cutoff sphere surrounded by a dielectric medium, which exerts back on the sphere a polarization or reaction field. The dielectric medium has
112. atement of the CT is the following:
$$\sum_{i=1}^{n_F} \frac{1}{1 + \frac{n_F}{n_R}\, e^{\beta (W[F_i] - \Delta F)}} - \sum_{i=1}^{n_R} \frac{1}{1 + \frac{n_R}{n_F}\, e^{\beta (W[R_i] + \Delta F)}} = 0 \qquad (8.6)$$
where $n_F$ and $n_R$ are the numbers of forward and backward non-equilibrium experiments, and $W[F_i]$, $W[R_i]$ indicate the outcome of the i-th forward and backward work measurement. This equation has only one solution for $\Delta F$, i.e. the MLE.

Bennett was the first researcher to clearly recognize and formalize, through the BAR, the superiority of bidirectional methods in the computation of free energy differences. We cite verbatim from his paper [117]: "The best estimate of the free energy difference is usually obtained by dividing the available computer time approximately equally between the two ensembles; its efficiency (variance x computer time)^-1 is never less, and may be several orders of magnitude greater, than that obtained by sampling only one ensemble, as is done in perturbation theory."

As such, however, the Crooks theorem, through the MLE based on bidirectional work measurements, only allows one to compute the free energy difference $\Delta F$ between the end points, i.e. between thermodynamic states at two fixed and given values of the reaction coordinate $z$. In principle, to reconstruct the full PMF along the reaction coordinate $z$, in the spirit of thermodynamic integration, one should provide a series of equilibrium ensembles of configurations at intermediate values of $z$. Here we briefly sketch out a methodology for reconstructing the full
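Since the left-hand side of Eq. 8.6 increases monotonically with $\Delta F$, its single root can be located by bisection. The sketch below is only an illustration under stated assumptions, not the ORAC routine: all function names are hypothetical, and the synthetic Gaussian work samples are an assumed test case constructed so that they satisfy the Crooks relation for a known $\Delta F$.

```python
import math
import random


def _fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    if x > 0.0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def bar_residual(delta_f, w_forward, w_reverse, beta):
    """Left-hand side of Eq. 8.6 for a trial value of delta_f."""
    log_ratio = math.log(len(w_forward) / len(w_reverse))
    fwd = sum(_fermi(beta * (w - delta_f) + log_ratio) for w in w_forward)
    rev = sum(_fermi(beta * (w + delta_f) - log_ratio) for w in w_reverse)
    return fwd - rev


def solve_bar(w_forward, w_reverse, beta, tol=1e-10):
    """Bisection on the monotonically increasing residual of Eq. 8.6."""
    bound = max(abs(w) for w in list(w_forward) + list(w_reverse)) + 50.0 / beta
    lo, hi = -bound, bound
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bar_residual(mid, w_forward, w_reverse, beta) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def synthetic_crooks_works(delta_f, sigma, beta, n, seed=0):
    """Gaussian forward/reverse work samples consistent with the CT.

    Illustrative data only: both distributions have width sigma and mean
    dissipated work beta * sigma**2 / 2.
    """
    rng = random.Random(seed)
    dissipation = 0.5 * beta * sigma * sigma
    w_f = [rng.gauss(delta_f + dissipation, sigma) for _ in range(n)]
    w_r = [rng.gauss(-delta_f + dissipation, sigma) for _ in range(n)]
    return w_f, w_r
```

For example, `solve_bar(*synthetic_crooks_works(2.0, 1.0, 1.0, 2000), beta=1.0)` should return a value close to the input free energy difference of 2.0.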
113. ates

ADD_BEND
NAME
  ADD_BEND - Add the bending angle between three atoms to the list of the reaction coordinates
SYNOPSIS
  ADD_BEND iat1 iat2 iat3 w
DESCRIPTION
  This command adds to the list of the reaction coordinates of a metadynamics simulation the bending angle between atoms iat1, iat2 and iat3. The central atom of the bending is iat2. The numeric order of the atom indices iat1, iat2, iat3 is that specified in the topology file (see 10.3). The repulsive potential terms deposited in the space of the reaction coordinates during the simulation (see 6.3.3) will have a width w, in arc degrees, in the direction of this angle.
EXAMPLES
  ADD_BEND 1 7 12 4.0
  Add the bending angle between atom 1, atom 7 and atom 12 to the list of the reaction coordinates.

ADD_TORS
NAME
  ADD_TORS - Add the torsional angle between four atoms to the list of the reaction coordinates
SYNOPSIS
  ADD_TORS iat1 iat2 iat3 iat4 w
DESCRIPTION
  This command adds to the list of the reaction coordinates of a metadynamics simulation the torsional angle between atoms iat1, iat2, iat3 and iat4. The axis of the torsion is defined by the atoms iat2 and iat3. The numeric order of the atom indices iat1, iat2, iat3, iat4 is that specified in the topology file (see 10.3). The repulsive potential terms deposited in the space of the reaction coordinates during the simulation (see 6.3.3) will have a width w, in arc degrees, in the direction
114. ature The canonical probability of a coordinate configuration $X$ for the m-th replica is given by
$$P_m(X) = \frac{1}{Z_m}\, e^{-\beta_m V(X)} \qquad (5.1)$$
where $m$ is the replica index, $\beta_m^{-1} = k_B T_m$, $V(X)$ is the potential of the system and $Z_m = \int e^{-\beta_m V(X)}\, dX$ is the configurational partition function for the m-th replica. Since the M replicas are independent, the probability distribution for a generic configuration of the M-fold extended system, $\mathbf{X} = (X_1, \ldots, X_M)$, is
$$P(\mathbf{X}) = \prod_{m=1}^{M} P_m(X_m) \qquad (5.2)$$
As stated above, the global state $\mathbf{X}$ of the extended system may evolve in two ways: (i) by evolving each replica independently (i.e. via MC or MD simulation protocols), and (ii) by exchanging the configurations of two replicas. Regarding the second mechanism, we introduce the transition probability $W(X, \beta_m; X', \beta_n)$ for the exchange between the configuration $X$ of the replica at $T_m$ and the configuration $X'$ of the replica at $T_n$. The probability for the inverse exchange is clearly given by $W(X', \beta_m; X, \beta_n)$. The detailed balance condition on the extended system for this kind of move is given by
$$P(\ldots, X, \beta_m; \ldots; X', \beta_n; \ldots)\, W(X, \beta_m; X', \beta_n) = P(\ldots, X', \beta_m; \ldots; X, \beta_n; \ldots)\, W(X', \beta_m; X, \beta_n) \qquad (5.3)\text{-}(5.4)$$
which, using the expressions 5.2 and 5.1 for the global probability, is satisfied if the transition probability satisfies the equation
$$\frac{W(X, \beta_m; X', \beta_n)}{W(X', \beta_m; X, \beta_n)} = e^{(\beta_m - \beta_n)\left[ V(X) - V(X') \right]} \qquad (5.5)$$
The exchange of configurations of replicas obeying the detailed balance condition can be, as usual, implemented by using the Metropolis al
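The ratio of transition probabilities in Eq. 5.5 can be realized with the usual Metropolis choice, accepting the exchange with probability min(1, exp[(β_m - β_n)(V(X) - V(X'))]). The sketch below only illustrates that acceptance rule: the function names and the optional random number generator argument are assumptions, not part of the ORAC program.

```python
import math
import random


def exchange_probability(beta_m, beta_n, v_x, v_xprime):
    """Metropolis acceptance probability for swapping X (at beta_m) and X' (at beta_n)."""
    exponent = (beta_m - beta_n) * (v_x - v_xprime)
    if exponent >= 0.0:
        return 1.0
    return math.exp(exponent)


def attempt_exchange(beta_m, beta_n, v_x, v_xprime, rng=random):
    """Return True if the exchange is accepted for one uniform random draw."""
    return rng.random() < exchange_probability(beta_m, beta_n, v_x, v_xprime)
```

With this choice the ratio between the forward and the inverse acceptance probabilities reproduces the exponential factor of Eq. 5.5, so detailed balance holds for the exchange move.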
115. axis and the order n of the B-spline interpolation is 4 or 5. An error analysis and a comparison with standard Ewald summation can be found in Refs. [84] and . For further reading on the PME and SPME techniques we refer to the original papers [33, OI, 28].

Figure 4.3: CPU time versus number of particles for the SPME algorithm as measured on a 43P/160MH IBM workstation. [Plot title: PME tests on 5CB; horizontal axis: number of atoms, 0 to 12000.]

The power of the SPME algorithm, compared to the straightforward implementation of the standard Ewald method, is indeed astonishing. In Fig. 4.3 we report the CPU timing obtained on a low-end 43P/160MH IBM workstation for the evaluation of the reciprocal lattice energy and forces via SPME for cyanobiphenyl, as a function of the number of atoms in the system. Public domain 3-D FFT routines were used. The algorithm is practically linear, and for 12000 particles SPME takes only 2 CPU seconds to perform the calculation. A standard Ewald simulation for a box of 64 x 64 x 64 Å$^3$ (i.e. with a grid spacing in k space of $\Delta k = 2\pi/64 \approx 0.1$ Å$^{-1}$), for the same sample and at the same level of accuracy, would have taken several minutes.

4.3 Subdivision of the Non-Bonded Potential

In addition to the long range electrostatic contributions $V_{qr}$ and $V_{qd}$ given in Eqs. 4.20 and 4.21, more short range forc
116. backward transitions between neighboring temperatures ultimately leading to a uniform temperature sampling in ST I11 The techniques illustrated above have been devised to determine weight factors to be used without further refinement 109 or as an initial guess to be updated during the simulation 11 MIO In the former case these approximate factors should hopefully ensure an almost random walk through the ensemble space However as remarked in Ref 47 the estimate of accurate weight factors may be very difficult for complex systems Inaccurate estimates though unaffecting the basic principles of SGE methods do affect the sampling performances in terms of simulation time needed to achieve convergence of structural properties 47 As discussed above dimensionless free energy differences between ensembles viz weight factors may also be the very aim of the simulation 54 since they correspond to the PMF along the chosen coordinate In such cases accurate determination of weight factors is not simply welcome but necessary This can be done a posteriori using multiple histogram reweighting techniques 52 53 or with more or less efficient updating protocols applied during the simulation 12 g In the ORAC program we have implemented SGE simulations either in a ST like fashion or in the space of bond bending and torsional coordinates These simulations exploit the adaptive method to calculate weight factors developed in Ref 55 Such
117. ciple be assigned to each of 9 extra degrees of freedom of the barostat Setting for example Wap W fora lt g 3 80 Wag fora gt B 3 81 3 82 one inhibits cell rotations 26 This trick does not work unfortunately to change to isotropic stress tensor In this case there is only one independent barostat degrees of freedom namely the volume of the system In order to simulate isotropic cell fluctuations a set of five constraints on the h matrix are introduced which correspond to the conditions hep ha _ oP _ 9 hat hy l no hop Ehn 0 for a lt 3 83 hiy with h being some reference h matrix These constraints are implemented naturally in the framework of the multi time step velocity Verlet using the RATTLE algorithm which evaluates iteratively the constraints force to satisfy the constraints on both coordinates h and velocities h 26 In Ref it is proved that the phase space sampled by the NPT equations with the addition of the constraints Eq correspond to that given by NPT distribution function Chapter 4 Multiple Time Steps Algorithms For Large Size Flexible Systems with Strong Electrostatic Interactions In the previous sections we have described how to obtain multiple time step integrators given a certain potential subdivision and have provided simple examples of potential subdivision based on the inter intra molecular separation Here we focus on the time scale separation of model potentials of complex mole
118. creating or annihilating species. The different means to access the alchemical work can be used as a powerful check to test the coherency of the trajectories and of the computed numerical work Eq. The alchemical work evaluated indirectly, by monitoring the changes of the total energy of the system, must closely follow the profile of the numerical work computed using Eq. 9.9. Such a test is reported in Figure 9.2 (right) for the discharging of ethanol in water. In a multiple time step scheme the alchemical work must be computed exactly as the energy is computed, hence evaluating the contributions arising from the fast shells more often than the terms evolving more slowly. In the scheme reported in the Supporting Information we succinctly describe the implementation of the alchemical process and of the associated work calculation in a molecular dynamics code, highlighting the parts of the code that must be modified, with respect to a normal MD code, because of the presence of alchemical species. In Figure 9.2 we report the behavior of the various contributions to
Footnote: alchemical species. The work done by an alchemical species through this term is simply given by W ta X T, depending on whether the alchemical species has been charged or discharged.
[Figure 9.2 legend: Total; Direct space erf (0 cell); Reciprocal Lattice; Self term; Direct space erfc; Numerical work]
119. ctive both for solute and solvent molecules. It writes a trajectory file containing the system coordinates and the simulation box parameters. The centers of mass of all molecules are always inside the simulation cell. The dumping frequency in fs is fplot. At each writing, the system coordinates and the simulation box parameters are appended to the trajectory file filename. During a REM simulation this file is automatically complemented by the file filename.rem, where the energy terms involved in the REM exchanges, along with the time step and the replica index, are printed with the same frequency. The file filename.rem contains the same information as the one created by the command PRINT_ENERGY (&REM).
EXAMPLES
DCD 10.0 OPEN test.dcd
Write atomic coordinates to the dcd trajectory file test.dcd every 10 fs.
DCD 10.0 OPEN test.dcd NOH
Write coordinates of non-hydrogen atoms to the dcd trajectory file test.dcd every 10 fs.
WARNINGS
Works only during the acquisition phase (see TIME in environment &RUN).

DYNAMIC
NAME
DYNAMIC - Write force field parameters in extended format (see also DEBUG (&RUN)).
SYNOPSIS
DYNAMIC OPEN filename
DESCRIPTION
This command prints out to the file filename the parameters of the force field, for the solute only, in a verbose format.
EXAMPLES
DYNAMIC OPEN ff.out

DUMP
NAME
DUMP - Write coordinates to a direct access unformatted file with a given frequency. The file is written in a pa
120. cular systems. Additionally, we provide a general potential subdivision that applies to biological systems as well as to many other interesting chemical systems, including liquid crystals. Systems of this type are typically characterized by high flexibility and strong Coulombic intermolecular interactions. Schematically, we can then write the potential V as due to two contributions:
$V = V_{bnd} + V_{nbn}$ (4.1)
Here the bonded or intramolecular part $V_{bnd}$ is fast and is responsible for the flexibility of the system. The non-bonded (or intermolecular, or intergroup) term $V_{nbn}$ is dominated by Coulombic interactions. The aim of the following sections is to describe a general protocol for the subdivision of such forms of the interaction potential and to show how to obtain reasonably efficient and transferable multiple time step integrators valid for any complex molecular system.

4.1 Subdivision of the Bonded Potential
As we have seen in Sec. 2.3, the idea behind the multiple time step scheme is that of the reference system, which propagates for a certain amount of time under the influence of some unperturbed reference Hamiltonian and then undergoes an impulsive correction brought by the remainder of the potential. The exact trajectory spanned by the complete Hamiltonian is recovered by applying this impulsive correction onto the reference trajectory. We have also seen in the same section that by subdividing the interaction potential we can
121. d in the minimization routine only.

DEFINE_ALCHEMICAL_ATOM
NAME
DEFINE_ALCHEMICAL_ATOM
SYNOPSIS
DEFINE_ALCHEMICAL_ATOM iat1 iat2 on|off
DESCRIPTION
Define an alchemical segment of the solute. N.B.: only solute atoms can be of alchemical type. iat1 and iat2 are the indices of the first and last atom of the alchemical segment. Alchemical segments can either be switched on or switched off. The alchemical atoms must be part of the starting PDB file, whether or not they interact with the actual atoms. This command is used along with the command STEER_PATH ALCHEMY, which defines the time protocol of the transformation, and the command PLOT ALCHEMY, which prints out the work done during the transformation.
EXAMPLES
Example 1
DEFINE_ALCHEMICAL_ATOM 1 10 on
STEER_PATH ALCHEMY alchemy time on
Atoms from 1 to 10 of the solute will be switched on with a time protocol specified in the file alchemy time on.
Example 2
DEFINE_ALCHEMICAL_ATOM 1 10 off
STEER_PATH ALCHEMY alchemy time off
Atoms from 1 to 10 of the solute will be switched off with a time protocol specified in the file alchemy time off.
Example 3
DEFINE_ALCHEMICAL_ATOM 1 10 on
DEFINE_ALCHEMICAL_ATOM 10 20 off
STEER_PATH ALCHEMY alchemy time on off
Atoms from 1 to 10 of the solute will be switched on and atoms from 10 to 20 of the solute will be switched off, each with a time protocol specified in the common file alchemy ti
122. d the full time-invariant atomic charges of the solute, respectively, and with $Q_s$ the charges on the solvent. The alchemical $q_i(t)$ and full $Q_i$ solute charges are related by $q_i(t) = [1 - \lambda_i(t)] Q_i$. When evaluating the reciprocal lattice energy via Eq. 9.2, the situation for the charge-charge electrostatic interactions is represented in Table 9.2.
Table 9.2: Charge-charge interactions in alchemical transformations using the Ewald summation. The atomic charges labeled $q(t)$, $Q$ and $Q_s$ refer to the alchemical charge, to the full time-invariant solute charge and to the solvent (non-alchemical) charge. Columns: Direct Lattice (Erfc), only interactions beyond 1-4; Reciprocal Lattice (Erf), all interactions.
In the direct lattice, the rules reported in the table can be implemented straightforwardly by excluding in the double atomic summation of Eq. all the so-called 1-2 and 1-3 contacts. These atom-atom contacts involve directly bonded atoms, or atoms bound to a common atom, for which no electrostatic charge-charge contribution should be evaluated. In the reciprocal lattice, however, because of the structure of Eq. 9.2, all intra-solute interactions are implicitly of the kind $q_i(t) q_j(t)\,\mathrm{erf}(\alpha r_{ij})/r_{ij}$, and 1-2 and 1-3 pairs are automatically considered in the sum of Eq. The latter terms may be removed in the standard way in the zero cell by subtracting from the energy the quantity
$V_{intra} = \sum_{ij\text{-excl}} q_i(t) q_j(t) \frac{\mathrm{erf}(\alpha r_{ij})}{r_{ij}}$ (9.4)
Regarding the 1-4 interactions, these ar
123. dinate s configurations corresponding to the free energy maximum (the transition state) can be sampled by adding a restraining potential to the original Hamiltonian of the system, so as to obtain a frequency histogram for the value of the reaction coordinate s centered around the transition state itself. If the transition state has been located well enough and the curvature of the potential has been matched, this distribution will overlap with the two distributions obtained by starting two different simulations from the two metastable states. The free energy difference between the metastable states, as well as the height of the free energy barrier at the transition state, can then be computed using the sampling from this bridging distribution. This solution is known as Umbrella Sampling [56]. More generally, if the transition state can be identified and located at some value of the reaction coordinate, the procedure of modifying the energetics of a system in order to balance the activation barrier and flatten the free energy profile is known as Non-Boltzmann sampling. The original free energy can be computed from the free energy of the modified ensemble through the formula
$A(s) = A'(s) - V(s)$ (7.1)
where $A'(s)$ denotes the free energy computed by simulating the modified (ergodic) system. As in the Umbrella Sampling algorithm, the hardest part of the Non-Boltzmann sampling approach is the construction of a good biasing potential, since t
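As an illustration of Eq. 7.1, the following minimal Python sketch recovers A(s) on a grid from a histogram collected in a biased run. It is not taken from ORAC: the function name `unbias_free_energy`, the grid representation and the choice of shifting the minimum to zero are assumptions made only for this example.

```python
import math

def unbias_free_energy(biased_hist, bias, kT):
    """Return A(s) = A'(s) - V(s) on a grid, shifted so that min(A) = 0.

    biased_hist : counts of the reaction coordinate sampled with the bias V(s)
    bias        : values of V(s) on the same grid (same energy units as kT)
    kT          : thermal energy k_B * T
    """
    total = sum(biased_hist)
    free = []
    for count, v in zip(biased_hist, bias):
        if count == 0:
            free.append(math.inf)  # bin never visited: no estimate available
            continue
        a_biased = -kT * math.log(count / total)  # A'(s) from the biased run
        free.append(a_biased - v)                 # Eq. 7.1
    lowest = min(free)
    return [a - lowest for a in free]
```

If the bias exactly cancels the free energy profile, the biased histogram is flat and the function returns the profile itself, up to the additive constant removed by the shift.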
124. driven process. The method also relies on the strong assumption that the friction along z is local in time, i.e. that the underlying equilibrium process is Markovian.

8.1 The Crooks theorem
Recent developments in non-equilibrium thermodynamics have clarified that the PMF along the given reaction coordinate z can actually be reconstructed exactly using an ensemble of steered molecular dynamics simulations, without resorting to any assumption on, or having any knowledge of, the frictional behaviour of the system along the reaction coordinate. These developments date back to a paper by Evans and Searles [145], where the first example of a transient fluctuation theorem for a system driven out of equilibrium was formulated, demonstrating the connection between the time integral of the phase compression factor in Liouville space along an arbitrary time interval and the probability ratio of producing the entropy A and -A along a deterministic trajectory of a many-particle non-equilibrium steady state system. Gavin Crooks, in his PhD thesis, proposed [63], in the context of Monte Carlo simulations in the canonical ensemble (NVT), a transient fluctuation formula [145] (from now on indicated with CT) involving the dissipative work for systems driven out of equilibrium by varying some arbitrary mechanical parameter. The CT is actually even more general than the Evans and Searles fluctuation theorem [145], since in the latter the driven z coordinate has an underlying zero PMF, i.e.
125. due 97 158 definition of in the tpg file 58 sequence residue 113 residue sequence 121 RESTART 84 restart file parallel simulation restricted canonical transformation 9 reversible integrator rigid 59 root mean square displacement 113 L4 T30 amp RUN Ryckaert J P SAVE 91 saving coordinates to disk SCALE SCALING 141 SCALE_CHARGES scaling equivalence of atomic and group scaling method for constant pressure simulation 141 SD SEGMENT 19 SEGMENT amp SGE 132 SELECT_DIHEDRAL Serial Generalized Ensemble simulations BAR SGE method General theory 50 Input of see also amp SGE P31 Simulated tempering BIJ Simulations in collective coordinate space 52 SETUP amp REM replica exchange method amp SETUP SETUP amp SGE 133 amp SGE FIX_FREE_ENERGY 131 PRINT_ACCEPTANCE_RATIO PRINT_WHAM SEGMENT SETUP STEP 134 TRANSITION_SCHEME ZERO_FREE_ENERGY Description of the method see also Serial Gen eralized Ensemble simulations 31 SHAKE 5 05 103 Simulated tempering see also Serial Generalized En semble simulations amp SIMULATION 137 simulation box 126 smooth particle mesh Ewald see also SPME 35 Soft core Lennard Jones potential 68 solute defining a fragment of input examples input topology from ASCII file input topology from binary file 96 inserting in solvent pair correlation function 112
126. e da 0 2 41 da 0 2 42 equation where a runs over all constrained bonds and a are constants In the double time integration 2 34 velocities are updated four times i e two times in the inner loop and two times in the outer loop To satisfy 2 41 SHAKE must be called to correct the position in the inner loop To satisfy 2 42 RATTLE must be called twice once in the inner loop and the second time in the outer loop according to the following scheme p S p 0 AO p Ati RATTLE P DO i 1 n p 52 652 e p 2h 188 RATTLE p 44 iA 2 43 q iAto q li 1 Ato p 42 122 amp f q ito RATTLE q iAto p 42 iAto p AB 122 Fo iAto 4 ENDDO p At p Fy nAty 22 Where RATTLE and RATTLE represent the constraint procedure on velocity and coordinates respec tively 2 5 Applications As a first simple example we apply the double time integrator 2 34 to the NVE simulation of flexible nitrogen at 100 K The overall interaction potential is given by V Vintra Vinter Where Vinter is the intermolecular potential described by a Lennard Jones model between all nitrogen atoms on different molecules 79 Vintra is instead the intramolecular stretching potential holding together Symplectic and Reversible Integrators 16 Table 2 1 Energy conservation ratio R for various integrators see text The last three entries refer to a velocity Verlet with bond constraint
127. e optimal solution would be to use a faster rate at the beginning of the simulation, so as to produce a rough estimate of the free energy, and then to reduce $\omega$ to refine this estimate [139]. This problem corresponds to finding an optimal protocol for the evolution of the modification factor in the original Wang-Landau algorithm. Various solutions have been proposed [140] in which the energy h is time dependent. We propose instead to add a term to the biasing potential with a given probability $P_{add}$ depending parametrically on time. For example, for $P_{add} \propto 1/t$ the evolution of the rate would be given by $\omega(t) = P_{add}\,\omega_0 \propto \omega_0/t$. This procedure can be seen, on average, as an increasing deposition interval $\tau(t)$ such that $\omega(t) = h/\tau(t)$ decreases in time. In the present implementation of ORAC, three different choices are available for the probability $P_{add}$. The default one is simply $P_{add} = 1$ and corresponds to the standard metadynamics algorithm. The second one is given by
$P_{add} = e^{-V_{max}(t)/k_B T}$ (7.7)
where $V_{max}(t)$ is the maximum value of the potential $V(s)$ at time t. During the simulation, the effective rate $\omega(t)$ decreases as $V_{max}(t)$ increases. When $V_{max} > k_B T$, the deposition rate $\omega(t)$ is so slow that the transformation can be considered adiabatic, and the biasing potential converges to the free energy inverted in sign, $A(s) = -V(s, t)$. The slowdown of $\omega$ can be tuned through the parameter T. Finally, following the well tempered met
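The probabilistic deposition rule of Eq. 7.7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ORAC code: the names `add_probability` and `effective_rate` are hypothetical, energies are taken in kJ/mol, and a non-positive T is assumed here to stand for the default choice P_add = 1.

```python
import math

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def add_probability(v_max, temperature):
    """P_add = exp(-V_max(t) / (k_B T)) for a user-defined temperature T.

    Assumption of this sketch: T <= 0 selects the standard
    metadynamics behaviour, P_add = 1.
    """
    if temperature <= 0.0:
        return 1.0
    return math.exp(-v_max / (KB * temperature))

def effective_rate(w0, v_max, temperature):
    """Average deposition rate w(t) = P_add * w0."""
    return add_probability(v_max, temperature) * w0
```

As the accumulated bias grows, `effective_rate` decays exponentially, which is the slowdown described above; a larger T delays the decay.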
128. e box according to the P 2 c space group along with 64 replicas of solvent molecules Again the overlapping solvent molecules say no will be discarded If we comment the line GENERATE RANDOMIZE 4 4 4 and uncomment the lines GENERATE RANDOMIZE 8 8 8and REPLICATE 2 2 2 we double the size of the sample we will have 8 cell of 20 x 20 x 20 A each with 4 molecules of solute and 8 x 8 x 8 512 solvent molecules minus 8 X no overlapping molecules Input to ORAC amp SOLUTE 146 DEF_SOLUTE NAME DEF_SOLUTE Define a solute molecule SYNOPSIS DEF_SOLUTE begin end DESCRIPTION This command is used in conjunction with the command STRUCTURES in amp PROPERTIES and TEMPLATE in amp INOUT It defines the solute atoms from which mean square displacements are to be computed The arguments indicate the ordinal numbers of the first begin and the last atom end of a solute molecule These numbers may be deduced by inspection of the Template file The command DEF_SOLUTE can appear more than one time in the environment The atoms of different solute molecules defined with this command may overlap EXAMPLES amp SETUP DEF SOLUTE 1 10 DEF SOLUTE 31 57 END amp ANALYSIS UPDATE 3 2 0 START 1 STOP 199 amp END amp PROPERTIES STRUCTURES inst_xrms heavy print inst_xrms 1 OPEN isnt xrms END amp END Computes instantaneous mean square displacements for heavy atoms for the solute chunks 1 10 and 31 57 WARNINGS This command has no ac
129. e felt only at the beginning and the end of the macro step Ati In the inner n steps loop the system moves only according to the Hamiltonian of the reference system H T Vo When using the potential breakup the inner reference system is rigorously conservative and the total energy of the reference system i e T Vo Vp is conserved during the P micro steps The integration algorithm given an arbitrary subdivision of the interaction potential is now straight forward For the general subdivision 2 26 the corresponding Liouvillean split is gee i ps wo iL eee 2 35 q q Op q Op q Op We write the discrete time operator for the Liouville operator iL Lo Ln and use repeatedly the Hermitian approximant and Trotter formula to get a hierarchy of nested reference systems propagator viz iLo ciao LDA thn S AE iea clin ege 2 36 Aty Aty 1Pp_1 cD Li Atn o etln e Dice Li Atn an etln a 2 37 Atn 1 Atyn 2Pn 2 ei LotLitla Atz _ pile 32 cittr toran eile 2 38 Atg AtP ei Lot Liat _ cla e iLo Ato gin 2 39 At AtoPo where At is the generic integration time steps selected according to the time scale of the i th force F We now substitute Eq 2 39 into Eq 2 38 and so on climbing the whole hierarchy until Eq 2 36 The resulting multiple time steps symplectic and reversible propagator is then At cil tn ofa Btn mone A 2 40 Pasi a Ato 2 Ato Po
130. e fully included in the reciprocal lattice sum, while in popular force fields only a portion of them is considered, via the so-called fudge factors f. What must be subtracted in this case is the complementary interaction $q_i(t) q_j(t) (1 - f)\,\mathrm{erf}(\alpha r_{ij})/r_{ij}$. It should be stressed here that, when the reciprocal lattice sum is computed using Eq. 9.2, the zero cell Erf contribution of the 1-2, 1-3 and 1-4 (1 - f) interactions must be removed whether the two charges are alchemical or not. So alchemically driven simulations imply no changes in the subtraction of these peculiar self interactions with respect to a normally implemented program with no alchemical changes. The routines that implement Eq. 9.4 must therefore be called using the atomic charges $q_i = [1 - \lambda_i(t)] Q_i$, whether alchemical or not, i.e. whether $\lambda_i$ is different from zero or not. In the same spirit, the self interaction in the zero cell, i.e. the term $-\frac{\alpha}{\sqrt{\pi}} \sum_i [1 - \lambda_i(t)]^2 Q_i^2$, must be computed using the same charges. We have seen in Table 9.2 that in the direct lattice the intrasolute non-bonded electrostatic interactions are computed using the full time-invariant solute charges $Q_i$, as alchemical changes affect only solute-solvent interaction energies. To recover the bare Coulomb potential for intrasolute interactions in a system subject to an alchemical transformation, one must then subtract, as done for the 1-2, 1-3 and 1-4 (1 - f) pairs, the Erf $q(t) q(t)$ contribution and add a $QQ$ Erf term to the total energ
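The zero-cell subtraction described above can be sketched as follows. This is a hedged illustration, not the ORAC routine: the function names, the pair-list arguments and the use of reduced units (no Coulomb prefactor) are assumptions. The charges passed in are meant to be the scaled charges q_i = [1 - lambda_i(t)] Q_i.

```python
import math

def erf_pair(qi, qj, r, alpha):
    """Reciprocal-space pair term q_i q_j erf(alpha r) / r (reduced units)."""
    return qi * qj * math.erf(alpha * r) / r

def intra_correction(charges, coords, excluded_pairs, pairs_14, alpha, fudge):
    """Energy to add to the Ewald sum to remove unwanted intramolecular terms.

    excluded_pairs : 1-2 and 1-3 contacts, removed entirely
    pairs_14       : 1-4 contacts, of which only the fraction (1 - fudge)
                     is removed
    """
    energy = 0.0
    for i, j in excluded_pairs:
        r = math.dist(coords[i], coords[j])
        energy -= erf_pair(charges[i], charges[j], r, alpha)
    for i, j in pairs_14:
        r = math.dist(coords[i], coords[j])
        energy -= (1.0 - fudge) * erf_pair(charges[i], charges[j], r, alpha)
    return energy
```

With a fudge factor of 1 the 1-4 pairs are kept in full and contribute nothing to the correction, which matches the text: only the complementary fraction is subtracted.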
131. e integrators and, although the overall density for S and S1 appears to be the same, there might be considerable changes in the torsional dynamics. R3 does not require any assumption, is accurate everywhere in the spectrum (see Fig) and is as efficient as S. For these reasons R3, or a multi-step version of the equally accurate S1, must be the natural choice for the simulation of complex systems using all-atom models.
results published by Watanabe and Karplus for another flexible system, i.e. met-enkephalin in vacuo. They compared SHAKE on X-H against full flexibility and found that the power spectrum of the torsional degrees of freedom differs significantly. For met-enkephalin their spectrum, evaluated on a 10 ps time span, shows a single strong peak at 10 or 40 wavenumbers, with and without constraints respectively. The different behavior of the constrained and totally flexible system might be ascribed in their case to the specificity of the system and/or the potential, although this seems unlikely [23]. In their study, on the other hand, we must remark the unusual shape of the spectral torsional profile, with virtually no frequencies above 100 wavenumbers and with strong peaks suspiciously close to the minimum detectable frequency according to their spectral resolution.

Chapter 3
Multiple Time Steps Algorithms for the Isothermal-Isobaric Ensemble
The integrators developed in the previous section generate dynamics in the microcanonical ensemble, where
132. e system; ii) atomic, and the barostat is coupled to the coordinates of the atoms; iii) group, with the barostat coupled to the smallest groups which are not connected by a constraint. If no constraints have been imposed on the system (see STRETCHING (&POTENTIAL)), SCALING GROUP and SCALING ATOMIC have the same behavior.
EXAMPLES
SCALING MOLECULAR
Run with molecular scaling.
SCALING GROUP
Run with group scaling.

STRESS
NAME
STRESS - Run MD simulations at constant pressure with non-isotropic volume changes.
SYNOPSIS
STRESS PRESS-EXT pext BARO-MASS wpr COMPR compressibility
DESCRIPTION
This command allows one to run simulations and minimizations at a given pressure with non-isotropic volume changes, according to the Parrinello-Rahman equations of motion. If the command is used alone, ORAC runs simulations in the NPH ensemble. Simulations in the NPT ensemble can instead be carried out if STRESS is used in conjunction with the command THERMOS. The external pressure in MPa is read in by the keyword PRESS-EXT. The keyword BARO-MASS expects the mass of the barostat in cm (see ISOSTRESS). The system compressibility in MPa is read in as input to the keyword COMPR (see ISOSTRESS).
EXAMPLES
&SIMULATION
MDSIM
TEMPERATURE 300.0 25.0
STRESS PRESS-EXT 0.1 BARO-MASS 10.0 COMPR 1.0e-4
&END
Run a simulation in the NPH ensemble at a pressure of 0.1 MPa (atmospheric pressure) with a ba
133. eans that H in terms of the new coordinates Eq 8 28 is only a constant of motion but is no longer a true Hamiltonian application of Eq does not lead to Eqs 3 23 3 27 Simulations using the real variables are not Hamiltonian in nature in the sense that the phase space of the real variables is compressible and that Liouville theorem is not satisfied 90 This strangeness in the dynamics of the real variables in the extended systems does not of course imply that the sampling of the configurational real space is incorrect To show this it suffices to evaluate the partition function for a microcanonical distribution of the kind 6 H E with H being given by Eq 8 28 The Jacobian of the transformation of Eqs 3 1443 22 must be included in the integration with respect to the real coordinates when evaluating the partition function for the extended system If the equations of motion in terms of the transformed coordinates are known this Jacobian J can be readily computed from the relation 72 dF a S y 3 31 Where y has the usual meaning of phase space vector containing all independent coordinates and momenta of the systems Inserting the equations of motion of Eq 8 27 into Eq 3 31 and integrating by separation of variables yields J eX det hy 3 32 Using 8 32 and integrating out the thermostat degrees of freedom the partition function can be easily shown 96 to be equivalent to that that of NPT ens
134. ecord of the intra-solute electrostatic energy during the discharging of a molecule of ethanol in water in standard conditions. In spite of the huge changes in the contributing energy terms, the total intrasolute energy remains approximately constant during the transformation, modulated by the intramolecular motion, exactly as it should. The changes in the solute self term $-\frac{\alpha}{\sqrt{\pi}} \sum_i [1 - \lambda_i(t)]^2 Q_i^2$ compensate at all time steps the variation of the direct lattice and of the Erf intrasolute corrections. This balance does occur provided that all terms in the energy of Eq. 9.6 are accounted for, including the intrasolute alchemical Erf correction $V_{alch}$ of Eq. 9.5.
[Figure 9.1 plot. Legend: Total; Direct space erfc; Direct space erf (0 cell); Self term. Axes: Intrasolute Energy (kJ/mol) versus Time (ps).]
Figure 9.1: Time record for the intrasolute energy arising from electrostatic interactions during the alchemical discharging of ethanol in water at T = 300 K and P = 1 Atm. The simulation went on for 15 ps. The red curve is due to the self term. The green curve is due to the direct lattice contribution. The magenta curve includes the terms $V_{intra}$ (Eq.) and $V_{alch}$ (Eq. 9.5).
In a multiple time scheme, the individual contributions to the non-bonded forces evolve in time with disparate time scales and must hence be partitioned into appropriately defined integration shells, as described in detail in Chapter 3. So in condensed
135. ectic condition (2.8). An important consequence of the symplectic condition is the invariance under canonical (or symplectic) transformations of many properties of the phase space. These invariant properties are known as Poincare invariants or canonical invariants. For example, transformations or t-flow mappings obeying Eq. 2.8 preserve the phase space volume. This is easy to see, since the infinitesimal volume elements in the y and x bases are related by
$dy = \det M \, dx$ (2.10)
where $\det M$ is the Jacobian of the transformation. Taking the determinant of the symplectic condition, Eq. 2.8, we see that $\det M = 1$ and therefore
$dy = dx$ (2.11)
For a canonical or symplectic t-flow mapping this means that the total phase space volume is invariant, and therefore the Liouville theorem is automatically satisfied. A stepwise numerical integration scheme defines a $\Delta t$-flow mapping or, equivalently, a coordinate transformation, that is
$Q(\Delta t) = Q(q(0), p(0), \Delta t)$, $P(\Delta t) = P(q(0), p(0), \Delta t)$ (2.12)
We have seen that the exact solution of the Hamilton equations has a t-flow mapping satisfying the symplectic condition (2.8). If the Jacobian matrix of the transformation (2.12) satisfies the symplectic condition, then the integrator is termed symplectic. The resulting integrator therefore exhibits properties identical to those of the exact solution; in particular, it satisfies Eq. 2.11. Symplectic algorithms have also been proved to be robust, i.e.
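The volume-preservation property (2.11) can be checked numerically on a one-dimensional harmonic oscillator, where the symplectic condition reduces to a unit Jacobian determinant. The sketch below is illustrative only: the velocity Verlet and explicit Euler steps, the finite-difference Jacobian and all function names are choices made for this example, not part of ORAC.

```python
def velocity_verlet_step(q, p, dt, mass=1.0, k=1.0):
    """One velocity Verlet step for H = p^2/(2 m) + k q^2 / 2."""
    p_half = p - 0.5 * dt * k * q
    q_new = q + dt * p_half / mass
    p_new = p_half - 0.5 * dt * k * q_new
    return q_new, p_new

def euler_step(q, p, dt, mass=1.0, k=1.0):
    """One explicit Euler step (not symplectic), for comparison."""
    return q + dt * p / mass, p - dt * k * q

def jacobian(step, q, p, dt, eps=1e-6):
    """Jacobian d(Q, P)/d(q, p) of a one-step map by central differences."""
    q_plus, q_minus = step(q + eps, p, dt), step(q - eps, p, dt)
    p_plus, p_minus = step(q, p + eps, dt), step(q, p - eps, dt)
    return [
        [(q_plus[0] - q_minus[0]) / (2 * eps), (p_plus[0] - p_minus[0]) / (2 * eps)],
        [(q_plus[1] - q_minus[1]) / (2 * eps), (p_plus[1] - p_minus[1]) / (2 * eps)],
    ]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]
```

For unit mass and force constant, the determinant of the velocity Verlet map is 1 for any time step, while the explicit Euler map gives 1 + dt^2, so each Euler step inflates the phase space volume.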
136. ed. Moreover, files containing the system coordinates in an appropriate format can be provided. The following commands are incorporated in &SETUP: CRYSTAL, READ_PDB, RECONSTRUCT, REPLICATE, TEMPLATE.

CHANGE_CELL
NAME
CHANGE_CELL - Recompute the atomic coordinates according to input.
SYNOPSIS
CHANGE_CELL
DESCRIPTION
This command has an effect only when the run is restarted (see commands RESTART (&INOUT) and CONTROL (&RUN)). It must be specified when one wishes to change the MD cell parameters from those dumped in the available restart file to those specified in the CRYSTAL directive in this environment. If CHANGE_CELL is not specified and the run is restarted, the CRYSTAL directive is ignored and the cell parameters are taken from the last configuration of the restart file. If CONTROL 0 is entered in the environment &RUN, this command has no effect.
EXAMPLES
CHANGE_CELL

CRYSTAL
NAME
CRYSTAL - Read the cell parameters defining the shape of the simulation box.
SYNOPSIS
CRYSTAL a b c α β γ
DESCRIPTION
The arguments α, β and γ to this command are defined using the usual crystallographic conventions: α is the angle between the b and c axes, β is the angle between a and c, and γ is the angle between a and b.
EXAMPLES
CRYSTAL 12.3 14.5 12.3 90.0 95.0 90.0
CRYSTAL 15.0
DEFAULTS
α = β = γ = 90.0

READ_PDB
NAME
READ_PDB - Read input system coord
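For readers who want to see how six cell parameters map onto a cell matrix, here is a minimal Python sketch using the common crystallographic orientation (a along x, b in the xy plane). The orientation actually used inside ORAC is not stated in this excerpt, so that choice, the function names, and the rule that omitted b and c default to a (suggested by the `CRYSTAL 15.0` example) are assumptions.

```python
import math

def cell_matrix(a, b=None, c=None, alpha=90.0, beta=90.0, gamma=90.0):
    """Return the 3x3 matrix h whose columns are the cell vectors a, b, c.

    Lengths are in Angstrom, angles in degrees. Assumed orientation:
    a along x, b in the xy plane.
    """
    b = a if b is None else b
    c = a if c is None else c
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    cy = (ca - cb * cg) / sg
    cz = math.sqrt(1.0 - cb * cb - cy * cy)
    return [
        [a, b * cg, c * cb],
        [0.0, b * sg, c * cy],
        [0.0, 0.0, c * cz],
    ]

def cell_volume(h):
    """Volume of the cell: determinant of an upper triangular h."""
    return h[0][0] * h[1][1] * h[2][2]
```

For example, `cell_matrix(12.3, 14.5, 12.3, 90.0, 95.0, 90.0)` corresponds to the first CRYSTAL example above, and `cell_matrix(15.0)` to the cubic one.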
137. ed. The cost of the Verlet neighbor list computation scales as $N^2$, where N is the number of particles in the system. The linked cell neighbor algorithm [155] scales linearly with N, but it has a large prefactor. The break-even point for the two methods is at about 7000 atoms for scalar machines. The frequency of updating of the index cell list is controlled by the argument nupdate and by the command UPDATE in this environment. If fupdate is the updating time specified in the command UPDATE, the updating time for the linked list is fupdate × nupdate.
EXAMPLES
&SETUP
CELL 54.0 72.0 41.0 90.0 102.0 90.0
&END
&POTENTIAL
LINKED_CELL 15 20 12 1
&END
Here a grid spacing of about 3.5 Å along each crystal axis is selected.
DEFAULTS
nupdate = 1

QQ-FUDGE
NAME
QQ-FUDGE - Set the fudge factor of the electrostatic interaction.
SYNOPSIS
QQ-FUDGE qq-fudge
DESCRIPTION
The argument to this command, qq-fudge, is the multiplicative factor of the 1-4 electrostatic interaction.
EXAMPLES
QQ-FUDGE 0.5
DEFAULTS
QQ-FUDGE 1.0

SELECT_DIHEDRAL
NAME
SELECT_DIHEDRAL - Include only selected torsion angles in the potential.
SYNOPSIS
SELECT_DIHEDRAL
DESCRIPTION
In old force fields only selected torsion angles were included. This command handles this situation.
DEFAULTS
The action taken by the command AUTO_DIHEDRAL is the default.
WARNINGS
Diagnostic - Unsupported

STEER_PATH
NAME
STEER_PATH
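The numbers in the LINKED_CELL example can be reproduced with a short Python sketch. The helper names below are hypothetical and the binning uses fractional (scaled) coordinates, which is one common way to build an index cell list; ORAC's internal implementation may differ.

```python
import math

def grid_spacing(cell_lengths, ncells):
    """Grid spacing along each crystal axis: cell length / number of cells."""
    return [length / n for length, n in zip(cell_lengths, ncells)]

def linked_list_update_time(fupdate, nupdate):
    """Updating time of the linked list: fupdate x nupdate (in fs)."""
    return fupdate * nupdate

def assign_cells(frac_coords, ncells):
    """Map each atom to a grid cell from its fractional coordinates.

    Returns a dict {(ix, iy, iz): [atom indices]}; coordinates are
    wrapped into [0, 1) before binning.
    """
    cells = {}
    for atom, s in enumerate(frac_coords):
        index = tuple(
            min(n - 1, int(math.floor((x % 1.0) * n)))
            for x, n in zip(s, ncells)
        )
        cells.setdefault(index, []).append(atom)
    return cells
```

With cell lengths 54.0, 72.0 and 41.0 Å and a 15 × 20 × 12 grid, `grid_spacing` gives roughly 3.6, 3.6 and 3.4 Å, consistent with the "about 3.5 Å" quoted in the example.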
138. ed by eliminating the kinetic energy which depends on the lika velocities from the starting Lagrangian 3 6 and replacing the term yy M s7SththS with Dip Mips st hths g The corresponding equations of motions for atomic scaling are then _ Dik _ Fa _ Pn fp S h y gt a 3 3 35 De hpi G Qpa opie 3 36 Pee V 1 h Pon det J T Ph 3 37 tn F 3 38 where the quantities V K Fy depend now on the atomic coordinates N y 5 fi 8 i 1k N x Zm hs x s 3 39 N ni 1 PikPik Fy AS ERE gep 3 40 i 1 k 1 In case of atomic Eq 3 34 or molecular scaling Eq 8 1 the internal pressure entering in Eqs 3 26 3 37 is then Pint Patom 7 ey Ee tne efi 3 41 Pint Pmot a GE r 3 42 7 Actually in ref is pointed out that the virial theorem implied by the distribution is slightly different from the exact virial in the NPT ensemble Martyna et al proposed an improved set of equations of motion that generates a distribution satisfying exactly the virial theorem Multiple Time Steps Algorithms 24 respectively Where the molecular quantities can be written in term of the atomic counterpart according to iL P XO pa 3 44 k F J fn 3 45 k The equation of motion for the barostat in the two cases Eqs 88 3 B 26 has the same form whether atomic or molecular scaling is adopted The internal pressure in the former case is given by Eq and in the latter is give
139. efaults 143 DEF_FRAGMENT 111 DEF INE_ALCHEMICAL_ATOM 103 DEF_SOLUTE density of states dielectric constant diffusion 115 diffusion coefficient TTI dihed dihedral angle dihedral angle in torsions 157 dimensions changing the in ORAC 165 dipole HI direct lattice potential direct potential subdivision of dirty discrete time propagator TJ 13 B8 DIST_FRAGMENT dist_max 139 dived_step 116 don 163 driven thermal changes 84 124 driving external potential DUMP dumping the restart file 84 DYNAMIC BI dynamical matrix 139 eigenvectors electrostatic correction 05 electrostatic corrections electrostatic potential 35 subdivision of 37 energy equipartition energy_then_die 87 168 enhanced sampling 6 equations of motion 9 for Parrinello Rahman Nos Hamiltonian 20 equilibration 25 ERF_CORR ERFC_SPLINE 104 error function EWALD Ewald method 6 B4 electrostatic corrections in multiple time scales integrators 86 intramolecular correction 104 intramolecular self term 34 self energy 39 setting work array dimensions smooth particle mesh excess charge extended Lagrangian 19 FIX_FREE_ENERGY amp SGE 131 fluctuation theorem force breakup force field BOJ 153 input parameters from ASCII file input parameters from binary file 96 force field printout FORCE_FIELD 12 fractional translations fragment writing
140. eighbor list.
SYNOPSIS
UPDATE fupdte rspcut
DESCRIPTION
ORAC computes Verlet neighbor lists for the atomic groups of both the solvent and the solute. There exist three different neighbor lists: a solvent-solvent, a solute-solute and a solvent-solute list. During the run, the calculation is carried out with a period equal to fupdte fs. All the group-group interactions within a radial cutoff of $r_{cut}$ + rspcut are included in the neighbor lists. The dimensions of the three lists are printed at run time. In the ORAC output, nnlww, nnlpp and nnlpw refer to the solvent-solvent, solute-solute and solute-solvent neighbor lists. The current version of ORAC can also use linked cells in place of the conventional Verlet neighbor list (see command LINKED_CELL).
EXAMPLES
UPDATE 65.0 1.4
Update the neighbor lists every 65.0 fs and use a cutoff of $r_{cut}$ + 1.4 Å.
DEFAULTS
UPDATE 100.0 1.0
WARNINGS
The neighbor list cutoff must not be chosen larger than half of the simulation box size. The calculation of the neighbor list is performed by default. Only for solvent-only simulations, if the radial cutoff is equal to half of the box size, the force calculation is carried out without the use of neighbor lists. When using r-RESPA, the value of rspcut in the UPDATE directive is ignored and is taken as an argument of the last step nonbond command in the MTS_RESPA structured command.

VERLET_LIST
NAME
VERLET_LIST - Compute Verlet neighbor list.
SYNOPSIS
VERLET_LIST T
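The role of the list radius r_cut + rspcut and of the half-box warning can be illustrated with a brute-force Python sketch. It is a simplification under stated assumptions: it works on single atoms rather than atomic groups, assumes a cubic box with the minimum image convention, and the function name is hypothetical.

```python
def build_neighbor_list(coords, box, r_cut, rspcut):
    """Return all pairs (i, j), i < j, closer than r_cut + rspcut.

    coords : list of (x, y, z) positions in a cubic box of side `box`
    Raises ValueError if the list radius exceeds half of the box size,
    where the minimum image convention is no longer valid.
    """
    radius = r_cut + rspcut
    if radius > 0.5 * box:
        raise ValueError("neighbor list cutoff larger than half of the box")
    pairs = []
    for i in range(len(coords) - 1):
        for j in range(i + 1, len(coords)):
            d2 = 0.0
            for a in range(3):
                d = coords[i][a] - coords[j][a]
                d -= box * round(d / box)  # minimum image
                d2 += d * d
            if d2 <= radius * radius:
                pairs.append((i, j))
    return pairs
```

The extra shell rspcut is what allows the list to be reused for several steps: pairs slightly outside r_cut are already listed when they drift inside it before the next update.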
141. emble i e Anpr X J dhe Prest det h O h 3 33 5 In presence of bond constraints and if the scaling is group based instead of molecular based these expression should contain a contribution from the constraints forces Complications due to the constraints can be avoided altogether by defining groups so that no two groups are connected through a constrained bond 26 In that case vy does not include any constraint contribution 6The thermostat degree of freedom must be included 85 90 in the count when working in virtual coordinates Indeed in Eq 3 13 we have g Ny 1 Multiple Time Steps Algorithms 23 a being the canonical distribution of a system with cell of shape and size define by the columns of h 3 3 Equivalence of Atomic and Molecular Pressure The volume scaling defined in Eq B I is not unique Note that only the equation of motion for the center of mass momentum Eq 8 25 has a velocity dependent term that depends on the coordinates of the barostat through the matrix G defined in Eq 8 11 The atomic momenta Eq 8 24 on the contrary are not coupled to the barostat This fact is also reflected in the equations of motion for the barostat momenta Eq 3 26 which is driven by the internal pressure due only to the molecular or group center of masses In defining the extended Lagrangian one could as well have defined an atomic scaling of the form Tika 5 hapSiak 3 34 Atomic scaling might be trivially implement
142. emical work Eq could be computed simply by montoring the changes in the total energy of the systems that includes the real potential and kinetic energy of system and the potential and kinetic energies of the barostat and the thermostats This energy if no velocity scaling is implemented i e no heat is artificially transferred to or absorbed from the extended system is a constant of the motion and hence any variation of it must correspond to the work done on the system 154 Alternatively the work can be computed by analytically evaluating the and 7 derivatives of the non bonded energy Eq 9 6 Both these methods have counter indications The total energy method suffers form the finite precision of energy conservation in the numerical integration of the equations of motion usually in multiple time step schemes the oscillations of the total energy are the order of 1 50 1 100 of the mean fluctuation of the potential energy of the system 12 Also small drifts in the total energy adds up in the work as a spurious extra dissipation term that may reduce the accuracy in the free energy determination via the Crooks theorem The method based the derivatives if alchemical species are annihilated and created within the same process requires the constant tagging of the two creation and annihilation works as the increments 6Ag 4 or ngja have opposite signs for creation G species and annihilation process A species Besides while all direct lattice Erfc
143. empered metadynamics simulation, adding new potential terms with a probability that depends on the ratio $V_{max}(t)/(k_B \cdot 1000.0)$. DEFAULTS: T = 0.0. WTEMPERED. NAME: WTEMPERED. During a metadynamics simulation, adds a hill to the biasing potential with a decreasing probability, following the well-tempered metadynamics algorithm. SYNOPSIS: WTEMPERED T. DESCRIPTION: When present, the program adds a hill to the current biasing potential with a probability given by $P_{acc} = \exp[-V(s,t)/k_B T]$, where $V(s,t)$ is the value of the biasing potential and T is a user-defined temperature. EXAMPLES: WTEMPERED 1000.0 (run a tempered metadynamics simulation, adding new potential terms with a probability that depends on the ratio $V(s,t)/(k_B \cdot 1000.0)$). DEFAULTS: T = 0.0. SAVE. NAME: SAVE. Periodically save a trajectory file during a metadynamics run. SYNOPSIS: SAVE fprint filename. DESCRIPTION: When present, the program writes the trajectory in the space of the reaction coordinates, sampled with the frequency defined through the command RATE, to file filename every fprint fs. The first line of the file contains the number of hills deposited when the file was dumped, and the height and the width along each reaction coordinate. If at the beginning of the run a trajectory file from a previous metadynamics simulation was read through the command READ, then the program prints the whole trajectory. EXAMPLES: SAVE
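The acceptance rule quoted in the WTEMPERED description can be sketched in a few lines of Python. This is only an illustration of the formula $P_{acc} = \exp[-V(s,t)/k_B T]$, not ORAC code: the function names, the choice of kcal/mol for the bias and the rejection of non-positive temperatures are all assumptions made for the example.

```python
import math
import random

# Boltzmann constant in kcal/(mol K); the unit choice is an assumption of this sketch.
K_B = 0.0019872041


def hill_acceptance_probability(bias, temperature):
    """Return exp(-V / (k_B T)) for a non-negative bias V given in kcal/mol."""
    if temperature <= 0.0:
        # How ORAC treats the default T = 0.0 is not stated in the text above,
        # so this hypothetical helper simply refuses non-positive temperatures.
        raise ValueError("temperature must be positive")
    return math.exp(-bias / (K_B * temperature))


def should_add_hill(bias, temperature, rng=random):
    """Decide stochastically whether a new hill is deposited."""
    return rng.random() < hill_acceptance_probability(bias, temperature)
```

With T = 1000.0, as in the WTEMPERED example, a bias of 2 kcal/mol gives an acceptance probability of roughly one third, so hills are added less and less often as the bias grows.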
144. ending is defined by the atoms 25 33 and 67 The ensembles are defined by 2 parameters An A2 4 Abend where the bond related parameters are A30 4 10 Abend 11 5 bord 13 bone 14 5 in A and the bending related parameters is 9 4 100 Abed 110 Ab 4 120 ARP 130 in degrees Therefore the transition of a replica from the ensemble A to the ensemble A 41 involves a synchronous change of both parameters i e 2 4 bond and Abend Abend Finally the harmonic force constants see Eq 6 24 are 1 and 2 kcal mol for bond and bending respectively NAME STEP Set up input information on the frequency of the ensemble transitions and on the free energy updating options Input to ORAC amp SGE 135 SYNOPSIS STEP Le La Ls nav DESCRIPTION This command defines the following parameters Le real number time interval in fs used to attempt a transition of a replica between adjacent ensembles see point 4 in Section 6 3 2 La real number time interval in fs used to store the dimensionless works W n gt n 1 and W n gt n 1 see point 2 in Section 6 3 2 La real number time interval in fs used to try a free energy update see point 3 in Section 6 3 2 nav integer number number of independent free energy estimates used to update the weighted free energy averages see Section 6 3 3 The parameter na is optional If nay 0 or not reported in the input then all free energy estimates st
145. eraction type h c; 1200.0 600700.0 (interaction type o o); 1000.0 800000.0 (interaction type o c); 2000.0 500100.0 (interaction type c c). WARNINGS: If the 1-4 interaction parameters are not provided in input to NONBONDED MIXRULE, the regular non-bonded parameters, multiplied by the 1-4 factor in input to LJ-FUDGE of environment &SOLUTE, are used instead. For interactions involving one atom for which the 1-4 parameters are provided and another for which they are not, the regular non-bonded parameters for the interaction are used, multiplied by the LJ-FUDGE factor, if any. TORSION PROPER. NAME: TORSION. Read proper torsion potential. SYNOPSIS: TORSION PROPER typ1 typ2 typ3 typ4 $K_\phi$ n $\gamma$ END. DESCRIPTION: typ1, typ2, typ3 and typ4 are four character strings (each not exceeding 7 characters) indicating the atom types of the four atoms involved in the torsion interaction. An x string is taken to be a wild card, indicating any atom. The torsional axis, according to the ORAC convention, is the one connecting typ2 and typ3. The parameters $K_\phi$, n and $\gamma$ are defined in Eq. $K_\phi$ is in units of kcal/mol; n is an integer indicating the number of minima/maxima for a 360 degree rotation about the torsional axis; $\gamma$ is given in degrees and can be either 0.0 or 180.0. EXAMPLE: TORSION PROPER; x C ca x 3.6250 2.0 180.0; x cw na x 1.5000 2.0 180.0; ct ct os ct 0.3830 3.0 0.0; ct ct os c
146. erated by using the correct atomic scaling. From a computational standpoint, molecular scaling is superior to atomic scaling. The fast varying Liouvillean in Eq. 3.70 for the atomic scaling contains two terms that are slowly varying when molecular scaling is adopted and that are then assigned to the slow part of the Liouvillean in Eq. 3.69. The inner part of the time propagation is therefore expected to be more expensive for the multiple time step integration with atomic scaling than with molecular scaling. Generally speaking, given the equivalence between the molecular and atomic pressure, molecular scaling should be the preferred choice for maximum efficiency in the multiple time step integration. For large molecules such as proteins, molecular scaling might be inappropriate. The size of the molecule clearly restricts the number of particles in the MD simulation box, thereby reducing the statistics on the instantaneous calculated molecular pressure, which may show nonphysically large fluctuations. Group scaling is particularly convenient for handling the simulation of macromolecules. A statistically significant number of groups can be selected in order to avoid all problems related to the poor statistics on molecular pressure calculation for samples containing a small number of large particles. Notwithstanding, for solvated biomolecules, and provided that enough solvent molecules are included, molecular scaling again yield
147. erating systems The source code is distributed along with a Makefile which has been tested on several Linux platforms You must have the Gnu version of make to make the executable As a configure file is not provided in this release for other UNIX platforms the Makefile may need some hacking The ORAC distribution file is a tar archive containing the ORAC source code and a few examples which illustrate most of the important features of the program The untarring of the distribution file using the command tar xvf orac5 1 tar gz will create a directory with the following sub directories ORAC ORAC doc ORAC etc ORAC 1ib ORAC pdb ORAC src ORAC tests ORAC tools The directory ORAC doc contains this manual in pdf and HTML format The directory ORAC etc contains material for developers The directory ORAC 1ib contains the force field parameters AMBERO3 and topology files see sec 10 3 The directory ORAC pdb contains The Protein Data Bank format coordinate files for running the input examples in ORAC tests The directory ORAC src contains the source code Read the copyright agreement COPYRIGHT_NOTICE before modifying or distributing the code The directory ORAC tools contains ancillary codes for analyzing MD data In order to see the list of the available compilation targets do make show make with no arguments will show the main targets with a short help To compile ORAC just do Compiling the Pr
148. es of liquid water using the SPC model 104 from a 200 ps MD simulation in the NPT ensemble at temperature of 300 K and pressure of 0 1 MPa with i a very accurate Ewald sum column EWALD in Table 4 2 ii with inaccurate Ewald but corrected in direct space using Eq 448 CORRECTED and iii with simple cutoff truncation of the bare Coulomb potential and no Ewald CUTOFF Results are reported in Table 4 2 We notice that almost all the computed properties of water are essentially independent within statistical error of the truncation method The dielectric properties on the contrary appear very sensitive to the method for dealing with long range tails Accurate and inaccurate Ewald corrected in direct space through 4 48 yields within statistical error comparable results whereas the dielectric constant predicted by the spherical cutoff method is more than order of magnitude smaller We should remark that method ii CORRECTED is almost twice as efficient as the exact method i Electrostatic Interactions Al EWALD CORRECTED CUTOFF Coulomb energy KJ mole 55 2 0 1 55 1 0 1 56 4 0 1 Potential energy KJ mole 46 2 0 1 46 1 0 1 47 3 0 1 Heat Capacity KJ mole K 74 24 5 94 22 0 87 23 2 Volume cm 18 2 0 1 18 3 0 1 18 1 0 1 Volume Fluctuation 3 136 943 5 147 0 3 5 138 7 3 5 Ro o A 2 81 0 01 2 81 0 01 2 81 0 01 Dielectric constant 59 25 8 47 27 3 3 2 Table 4
149. es play a significant role in the total non bonded potential energy The latter can be written as Vabn Voaw Vor Vga Via 4 37 Where Vyaw is the Lennard Jones potential namely N T 12 Tij 6 i lt j Here the prime on the sum indicates that interactions between atoms separated by less than three consec utive bonds must be omitted The term Vj4 is typical for force fields of complex molecular systems 2 4 While non bonded forces between atoms involved in the same covalent bond or angle bending interaction are generally excluded the potential between atoms separated by three covalent bonds is retained and readjusted in various ways In all cases the Vj term remains in general a very stiff and hence a fast varying term The computational cost of the V14 contribution is very small compared to other non bonded interactions Thus it is safer to assigns this potential term to the slowest intramolecular reference system potential Va of Eq 4 19 The Vor reciprocal lattice term including the correction due to the excluded or partially excluded i e the electrostatic part of V14 interactions cannot be split when using SPME and must be assigned altogether to only one reference system The time scale of the potential Vg depends on the convergence parameter a Indeed this constant controls the relative weights of the reciprocal lattice energy Var and of the direct lattice energy Vga By increasing a one increases the weight of
150. external pressure i e Pm P Cm where Cm is the scaling factor of replica m This choice is done in order to avoid through an increase of the external pressure a catastrophic expansion of the simulation box for low scaling factors or high temperatures 5 3 Calculating Ensemble Averages Using Configurations from All Ensembles MBAR estimator As recently shown by Shirts and Chodera 06 all the configurations produced by a REM simulation of M replicas each characterized by a distribution function P X can be effectively used to obtain equilibrium averages for any target distribution P X using the so called Multistate Bennett Acceptance Ratio MBAR estimator which is illustrated in the following With this definition Visit Xn may also depend on the coordinate of few solvent atoms Being the definition of the solute atom based rather than potential based it may be necessary to include in ved Xn e g torsional terms that involve boundary solvent atoms Replica exchange 48 In the ORAC REM implementation the most general distribution function for replica m is given by Eq 5 15 Given that for each replica m one has saved Nm configurations of the kind a7 77 it can be easily shown that M Nm m m ae 1m Nin pres j Opmlt P a 5 24 Na pa Poe 1 nm XE Pm a7 Eq holds for any arbitrary bridge function Qnm X In particular choosing Zn Net Onm X NZ PX 5 25 Eq transforms as N r
151. f freedom by scaling with c lt 1 and c lt 1 the corresponding potential functions With this choice the quantity A in Eq is given by A B e e Viors X Via X Viors X Vial X BC cP Veaw X Var X Vaa X Veaw X Vor X Vea X 5 20 n Local scaling Hamiltonian REM in ORAC can work also by tempering only a user defined solute Unlike standard implementation of the solute tempering techniques 105 the solute in the present version can be any portion of the system including solvent molecules Once the solute has been defined the complementary non solute portion of the system is by definition the solvent In this manner the scaling i e the heating or freezing can be localized in a specific part of the system with the remainder the solvent of the system behaving normally i e with the target interaction potential In order to clarify how local scaling work we illustrate the technique with a working general example Suppose to choose a subset of atoms n in the system that define the solute This subset can be chosen arbitrarily and may include disconnected portions of the protein as well as selected solvent molecules The solvent is then made up of the remaining N n atoms According to this subdivision the global potential of the system may be written as V X VS Xa VES X Xn_n VEY Xy_n 5 21 Replica exchange 47 where
152. f symplectic integrators [99] for the t-flows defined by Eqs. 3.77. The symplectic condition is violated at the level of the transformation, which is not canonical. However, the algorithms generated by Eqs. 3.76-3.77 are time reversible and second order, like the velocity Verlet. Several recent studies have shown [25, 24, 26] that these integrators for the non-microcanonical ensembles are also stable for long-time trajectories, as in the case of the symplectic integrators for the NVE ensemble. 3.5 Group Scaling and Molecular Scaling. We have seen in section 3.3 that the center of mass, or molecular, pressure is equivalent to the atomic pressure. The atomic pressure is the natural quantity that enters in the virial theorem, irrespective of the form of the interaction potential among the particles. So, in principle, it is safer to adopt atomic scaling in the extended system constant pressure simulation. For systems in confined regions, the equivalence between atomic (or true) pressure and molecular pressure (see Sec. 3.3) holds for any definition of the molecular subsystem, irrespective of the interaction potentials. In other words, we could have defined virtual molecules made up of atoms selected on different real molecules. We may expect that the system, no matter how its units or particles are defined, as long as it contains a sufficiently large number of particles, generates a distribution function identical to that gen
153. f the integration. Subdivision of the reciprocal lattice contribution with standard Ewald, although technically possible, is not recommended. 2. The directive dirty makes fast integrators stable but may severely affect dynamical properties. TIMESTEP. NAME: TIMESTEP. Define the simulation time step. SYNOPSIS: TIMESTEP time. DESCRIPTION: The argument time represents the integration time step used during the run. As integration of the equations of motion is always done with the r-RESPA algorithm, time is the outermost time step. time must be given in units of femtoseconds. EXAMPLES: TIMESTEP 9.0. 10.2.4 &META. Define run time parameters concerning metadynamics simulation. The following commands are available: ADD_BOND, ADD_BEND, ADD_TORS, RATE, READ, SAVE. ADD_BOND. NAME: ADD_BOND. Add the distance between two atoms to the list of reaction coordinates. SYNOPSIS: ADD_BOND iat1 iat2 w. DESCRIPTION: This command adds to the list of the reaction coordinates of a metadynamics simulation the distance between atoms iat1 and iat2. The numeric order of the atom indices iat1, iat2 is that specified in the topology file (see Sec. 10.3). The repulsive potential terms deposited in the space of the reaction coordinates during the simulation (see 6.3.3) will have a width w (in Å) in the direction of this distance. EXAMPLES: ADD_BOND 1 12 0.2. Add the distance between atom 1 and atom 12 to the list of the reaction coordin
154. f the solute. The axis of the torsion is defined by the atoms iat2 and iat3. The numeric order of the solute atom indices iat1, iat2, iat3, iat4 is that specified in the topology file (see Sec. 10.3). The added torsional potential has force constant k in kcal/mol/rad^2 and equilibrium dihedral angle $\theta_0$ in degrees. If $\theta_\tau$ is also specified, then the added torsional potential is time dependent and $\theta_\tau$ is the equilibrium dihedral angle after the steering time $\tau$ (see the STEER (&RUN) command for the definition of the steering time in an SMD simulation). WARNINGS: If the chosen $\theta_0$ is very different from the actual value of the dihedral angle at time 0, a very large force is experienced by the atoms involved in the added torsion, and the simulation may catastrophically diverge after a few steps. EXAMPLES: Example 1: ADD_STR_TORS 1 50 70 104 400 60.0. Example 2: &POTENTIAL ADD_STR_TORS 1 50 70 104 400 60.0 90.0 &END &RUN STEER 10000 50000 &END. In the first example, a torsional constraint is imposed between atom 1, atom 50, atom 70 and atom 104 of the solute. In the second example, a time-dependent driving potential is applied to the same atoms of the solute. The equilibrium dihedral angle of this harmonic driving potential moves at constant velocity in $\tau$ = 40 ps, starting at t = 10 ps, between $\theta_0$ = 60 and $\theta_\tau$ = 90 degrees. ADJUST_BONDS. NAME: ADJUST_BONDS. Constraints bond leng
155. factor. EXAMPLES: ANNEALING 2.0. WARNINGS: Diagnostic, unsupported. ISEED. NAME: ISEED. Provide a seed for the random number generator. SYNOPSIS: ISEED seed. EXAMPLES: ISEED 34567. DEFAULTS: ISEED 12345667. WARNINGS: Diagnostic, unsupported. ISOSTRESS. NAME: ISOSTRESS. Run MD simulations at constant pressure with an isotropic volume variable. SYNOPSIS: ISOSTRESS PRESS-EXT pext BARO-MASS wpr COMPR compressibility. DESCRIPTION: This command allows one to run simulations and minimizations at a given pressure with isotropic volume changes. If the command is used alone, ORAC runs simulations in the NPH ensemble. Simulations in the NPT ensemble can instead be carried out if ISOSTRESS is used in conjunction with the command THERMOS. The external pressure, in MPa, is read in by the keyword PRESS-EXT. The keyword BARO-MASS expects the mass of the barostat in cm$^{-1}$. The system compressibility, in MPa$^{-1}$, is read in as input to the keyword COMPR. According to the relation given in the cited reference, compressibility and frequency should be consistent. If COMPR is not specified, the default value is used. EXAMPLES: ISOSTRESS PRESS-EXT 0.1 BARO-MASS 10.0 COMPR 5.3e-4. Run a simulation at pressure 0.1 MPa (atmospheric pressure) with a barostat mass corresponding to 10.0 cm$^{-1}$. The compressibility is set to $5.3 \times 10^{-4}$ MPa$^{-1}$. DEFAULTS: ORAC uses the water compressibility at 300 K (i.e. $5.3 \times 10^{-4}$ MPa$^{-1}$) as the default comp
156. ferent final state t T is given by Na Veat t T ri riol H Kila ai Qio t y Ki i o t 8 13 i 1 where r a and 6 represents the actual i th stretching bending and torsional driven coordinate defined by arbitrarily selecting in the corresponding input definition the involved atoms So a driven torsion or a stretching may be defined using arbitrarily chosen atoms of the solute that are not connected by any real bond rio t aio t and io t are time dependent parameters that defines the non equilibrium trajectory in the space of the coordinates In ORAC each of these parameters given the duration 7 of the non equilibrium experiment is varied at constant speed from an initial value at time t 0 defining the reactants to a final value at time t 7 defining the products Tir fi ri rio t rio virt ay ai t aig t aio Viat Oir i Olt bio t bio viot 8 14 As all the steering velocities are constant during the experiments the above equations define a line z t ri t r2 t ar 4 amp 2 t 01 t 02 t 8 15 in a reaction coordinate space at Ny Na Nog dimensions The work done by the external potential Eq 8 13 in the time 7 of the non equilibrium driven process along the coordinate z is calculated as EN Na No Wo f z Ki ri rio t vir gt Klai ajo t via 2 K 0i io ve dt 8 16 The equilibrium distribution of the starting points fo
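The constant-speed steering described above, and the work accumulated along it, can be illustrated with a short Python sketch. It assumes a single driven coordinate restrained by a harmonic potential of the form V = 0.5 k (x - x_c(t))^2 (the prefactor 0.5 is an assumption, since the original equation is not legible here) and evaluates the work as the time integral of the explicit time derivative of V. All function names are hypothetical.

```python
def steered_center(x0, x_tau, tau, t):
    """Restraint center at time t when it moves at constant speed from x0 to x_tau."""
    speed = (x_tau - x0) / tau
    return x0 + speed * t


def steering_work(coords, times, k, x0, x_tau):
    """Trapezoidal estimate of W = integral of dV/dt for V = 0.5*k*(x - x_c(t))**2.

    coords[i] is the value of the driven coordinate at times[i].
    """
    tau = times[-1] - times[0]
    speed = (x_tau - x0) / tau
    # Explicit time derivative of V at fixed x: -k * (x - x_c(t)) * speed
    power = [
        -k * (x - steered_center(x0, x_tau, tau, t - times[0])) * speed
        for x, t in zip(coords, times)
    ]
    work = 0.0
    for i in range(1, len(times)):
        work += 0.5 * (power[i] + power[i - 1]) * (times[i] - times[i - 1])
    return work
```

Two limiting cases are easy to check by hand: a coordinate that follows the restraint center exactly does no work, while a coordinate frozen at x0 accumulates 0.5 k (x_tau - x0)^2, the full final restraint energy.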
157. for computational reasons on a single solvated biomolecule, i.e. in the conditions where the non-equilibrium techniques, for the reason explained above, are deemed to be more successful. 8.2 Determination of the potential of mean force via bidirectional non-equilibrium techniques. The Jarzynski identity is seemingly a better route than the CT to evaluate the full potential of mean force $F(z_t)$ along z in the interval $[z_0, z_\tau]$ with $0 < t < \tau$. However, the exponential average in Eq. 8.5 is known to be strongly biased, i.e. it contains a systematic error [148] that grows with decreasing number of non-equilibrium experiments. This can be qualitatively explained by the fact that, for dissipative (fast) non-equilibrium experiments, the forward work distribution $P_F(W)$ has its maximum where the exponential factor $e^{-\beta W}$ is negligibly small, so that the size of the integrand $P_F(W) e^{-\beta W}$ is de facto controlled by the left tail of the $P_F(W)$ distribution [64]. An unfortunate consequence of this is that the PMF calculated through the JI becomes more and more biased as the reaction coordinate z is advanced, since the accumulated dissipation work shifts the maximum of the $P_F(W)$ distribution. The CT is far more precise than the JI for evaluating free energy differences. Shirts and Pande [65] have restated the CT, showing that the maximum likelihood estimate (MLE) of the free energy difference exactly corresponds to the so-called Bennett acceptance ratio [174]. The MLE rest
158. ghted averages of the free energy differences (see Eq. 6.28) are reset, i.e. the averages accumulated in a previous simulation are discarded in the new one. EXAMPLES: ZERO_FREE_ENERGY. An SGE simulation is performed by resetting the averages of the free energy differences (weight factors). DEFAULTS: The absence of the ZERO_FREE_ENERGY command in the input implies that the estimates of the free energy differences performed during the simulation are accumulated to those of a previous simulation. 10.2.12 &SIMULATION. The environment includes commands which define the type of simulation that is to be carried out. In particular, commands are available to run steepest descent energy minimizations and molecular dynamics simulations in various ensembles. The environment &SIMULATION allows the following commands: ANDERSEN, ANNEALING, ISEED, ISOSTRESS, ISOSTRESSXY, MINIMIZE, MDSIM, SCALE, STRESS, TEMPERATURE, WRITE_PRESSURE. ANDERSEN. NAME: ANDERSEN. The simulation is performed in the NVT ensemble using the stochastic collision method by Andersen. SYNOPSIS: ANDERSEN time. DESCRIPTION: Implement the Andersen thermostat with a period for random collisions of time femtoseconds. EXAMPLES: ANDERSEN 1000.0. WARNINGS: Diagnostic, unsupported. ANNEALING. NAME: ANNEALING. Velocities are multiplied by a factor to speed up. SYNOPSIS: ANNEALING scale factor. DESCRIPTION: Velocities are multiplied by scale
159. gn, i.e. $P_R(-W)$ is mirror symmetric with respect to $P_R(W)$. According to the CT, $\Delta F$ may thus be evaluated by constructing the two work distribution functions: $\Delta F$ is the work value where the two distributions cross, i.e. $P_F(W = \Delta F) = P_R(-W = \Delta F)$. We point out in passing that the famous Jarzynski identity [62] (JI), $\langle e^{-\beta W} \rangle_F = e^{-\beta \Delta F}$ (8.5), is actually a trivial consequence of the CT, being derived from the latter by integrating out the work variable and using the fact that the work distribution functions $P_F(W)$ and $P_R(-W)$ are normalized. The physical meaning of the Crooks equation sounds indeed very reasonable, and it can even be considered a probabilistic restatement of the second law or a generalization of the Boltzmann H-theorem. Given a forward deterministic non-equilibrium trajectory, starting from equilibrium and producing a work W, the probability to observe a trajectory for the reverse process, again starting from equilibrium and producing the work $-W$, is a factor $e^{-\beta W_d}$ smaller than the former, where $W_d = W - \Delta F$ is the dissipated work in the forward process. When the dissipated work is zero, i.e. when the driven process is quasi-static and is always done at equilibrium, the two probabilities are identical. In this regard, one important point to stress is that the CT and the JI hold for all systems and for any kind of arbitrary non-equilibrium process, no matter how fast it is performed. In particular, if the non equilibrium process is inst
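As a small illustration of the Jarzynski identity quoted above, the following Python sketch estimates $\Delta F = -\beta^{-1} \ln \langle e^{-\beta W} \rangle$ from a list of forward work values. The function name and the use of a shifted (log-sum-exp) average for numerical stability are choices made for this example and are not taken from ORAC.

```python
import math


def jarzynski_free_energy(works, beta):
    """Estimate dF = -(1/beta) * ln( mean( exp(-beta * W) ) ) from forward works.

    The smallest work is factored out before exponentiating, so that the
    largest exponent is exactly zero and no overflow can occur.
    """
    shift = min(works)
    total = sum(math.exp(-beta * (w - shift)) for w in works)
    return shift - math.log(total / len(works)) / beta
```

By Jensen's inequality the estimate never exceeds the mean work, and it is dominated by the smallest work values in the sample, which is the origin of the bias of the exponential average when only a few non-equilibrium experiments are available.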
160. gorithm: $P_{acc} = \min(1, e^{-\Delta})$ (5.6), with $\Delta = (\beta_m - \beta_n)(E(X_n) - E(X_m))$. [Figure 5.1: Overlapping configurational energy distributions for two replicas (probability versus energy). The shaded area is the acceptance probability for the configuration exchange. The overlap between the two distributions is a lower bound for the acceptance probability.] As in a standard MC technique, because of the detailed balance condition for the extended system, the sampling in the X multi-configuration space in REM evolves towards a global equilibrium defined by the multi-canonical probability distribution of the extended system, Eq. 5.2. In principle, Eq. 5.6 refers to the probability of an exchange between any two replicas. In practice, the exchanges are attempted between replicas that are contiguous in temperature. Let's see why. For any two replicas m, n, the total number of accepted exchanges between them is given by $N_{acc} = N(\Delta E < 0) + N(\Delta E > 0)$ (5.7), where $\Delta E = E(X_n) - E(X_m)$ and $N(\Delta E < 0)$, $N(\Delta E > 0)$ are the numbers of accepted exchanges for which $\Delta E < 0$ and $\Delta E > 0$, respectively. When the extended system is at equilibrium, we clearly must have that $N(\Delta E < 0) = N(\Delta E > 0)$ (5.8). Inserting the above equation into Eq. 5.7 we obtain $N_{acc} = 2N(\Delta E < 0)$ (5.9). Since, according to the prescription 5.6, the probability for accepting the move when $\Delta E < 0$ is unitary, we may write t
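The exchange criterion of Eq. 5.6 can be written as a short Python sketch. It assumes the convention $\Delta = (\beta_m - \beta_n)(E(X_n) - E(X_m))$, with $\beta = 1/k_B T$ in units consistent with the energies; the function names are hypothetical and the sketch is not the ORAC implementation.

```python
import math
import random


def exchange_probability(beta_m, beta_n, energy_m, energy_n):
    """Metropolis probability min(1, exp(-delta)) for swapping replicas m and n."""
    delta = (beta_m - beta_n) * (energy_n - energy_m)
    if delta <= 0.0:
        return 1.0
    return math.exp(-delta)


def attempt_exchange(beta_m, beta_n, energy_m, energy_n, rng=random):
    """Return True if the configuration swap is accepted."""
    return rng.random() < exchange_probability(beta_m, beta_n, energy_m, energy_n)
```

When the hotter replica happens to hold the lower-energy configuration, delta is negative and the swap is always accepted; otherwise the acceptance decays exponentially with the product of the inverse-temperature gap and the energy gap.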
161. hat $N_{acc}/N_{tot} = 2P(\Delta E < 0)$ (5.10), where $N_{tot}$ is the total number of attempted exchanges and $P(\Delta E < 0)$ is the cumulative probability that $E(X_n) < E(X_m)$. Eq. 5.10 states that, if the two normalized configurational energy distributions $P_m(E)$ of replica m and $P_n(E)$ of replica n are identical, then the probability for a successful exchange between the two replicas is equal to the area of the overlap of the two distributions (i.e. the shaded area in Fig. 5.1). If $P_m(E)$ and $P_n(E)$ are not identical, we have in general that the overlap of the two distributions is a lower bound for the acceptance probability (footnote: the standard deviation $\delta E$ generally increases with the mean energy E). Based on the above, and assuming that M, the total number of replicas, is even, one can then set up an exchange protocol, periodically attempting M/2 simultaneous contiguous replica exchanges $m \leftrightarrow m+1$ with m odd, or M/2 - 1 simultaneous contiguous replica exchanges $m \leftrightarrow m+1$ with m even, accepting each of them with the probability given by 5.6. Given the above scheme, what is the optimal spacing in temperatures for enhanced sampling of the configuration space at the target temperature? First of all, the hottest temperature $T_M$, defining the full temperature range $\Delta T = T_M - T_1$ of the extended system, must clearly be selected such that $k_B T_M$ is of the order of the maximum height of the free energy barriers that must be overcome at the target temperature
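The alternating exchange protocol described above (M/2 pairs with m odd, then M/2 - 1 pairs with m even) can be sketched as follows. Replica indices are 1-based as in the text; which parity is used on which sweep is an assumption of this example, and the function name is hypothetical.

```python
def exchange_pairs(n_replicas, sweep):
    """List the contiguous pairs (m, m + 1) attempted on a given sweep.

    Even-numbered sweeps use odd m (1-2, 3-4, ...); odd-numbered sweeps use
    even m (2-3, 4-5, ...). Replica indices start at 1.
    """
    first = 1 if sweep % 2 == 0 else 2
    return [(m, m + 1) for m in range(first, n_replicas, 2)]
```

For M = 8 this yields four pairs on even sweeps and three on odd sweeps, so that over two consecutive sweeps every contiguous pair is attempted once and no replica is involved in two simultaneous swaps.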
162. he conventional Verlet list computation is the default. WARNINGS: Obsolete, unsupported. 10.2.7 &PROPERTIES. The &PROPERTIES directive is used to compute statistical properties on the fly or a posteriori, once the trajectory file has been produced (see command DUMP (&INOUT) and the &ANALYSIS environment). ORAC can compute radial distribution functions and structure factors (GOFR), and velocity autocorrelation functions (TIME_CORRELATIONS). The &PROPERTIES environment is still in the developing stage in the current version of ORAC. Thus, none of the &PROPERTIES features is officially supported. Some properties can no longer be computed on the fly in the current version and have to be computed using the &ANALYSIS environment once the trajectory file has been produced. The following commands are available: DEF_FRAGMENT, DIST_FRAGMENT, FORCE_FIELD, GOFR, HBONDS, PRINT_DIPOLE, STRUCTURES, TIME_CORRELATIONS, VORONOI, WRITE_GYR. DEF_FRAGMENT. NAME: DEF_FRAGMENT. Define a fragment of a solute. SYNOPSIS: DEF_FRAGMENT begin end. DESCRIPTION: This command is used in conjunction with the command PLOT FRAGMENT in &INOUT or in conjunction with the command DIST_FRAGMENT in this environment. The arguments indicate the ordinal numbers of the first (begin) and the last (end) atom of a solute fragment. These numbers may be deduced by inspection of the PDB file including the hydrogen atoms (see command ASCII for generating a PDB file
163. he third to the V subsystems with n being m 1 1 such that At At At At l Atm At m see Table 4 3 n for the last nonbonded shell is set automatically to 1 disregarding its actual value If two shells are entered then only two intermolecular time steps are used i e n m and l 1 If one shell is entered only one time step is defined and m 1 1 When using Ewald the Vj term Eq in the reciprocal lattice is assigned by entering the string reciprocal as the last argument of a step nonbond directive k ewald kl lambdakl km lambdakm Obsolete Unsupported kl and km define the shells in reciprocal space Wave vectors k k such that rkcut gt k gt kl kl gt k gt km and km gt k gt 0 are assigned to the h shell shell and m shell respectively lambdakm lambdakl are the upper healing lengths for the reciprocal space m and shells and the lower healing length for the reciprocal space h and I shells respectively Input to ORAC amp INTEGRATOR 87 Warning To be used only when on is specified in the directive EWALD environment amp POTENTIAL rkcut must be defined in the directive EWALD The reciprocal lattice assignment is best done via the keyword reciprocal of the command step nonbond e test times OPEN filename Diagnostic Unsupported Produce the time record of the potential and kinetic energies at the end of the propagation step i e at intervals of Ata fs The following is the format used for dumpi
164. hen the replicas may tend to get trapped in limited regions of the ensemble space at the early stages of the simulation This is basically due to initially inaccurate determination of Afn n41 from Eq point 3a If such an event occurs then subsequent free energy estimates from Eq 6 25 may become very rare or even impossible However we can prevent this unwanted situation by passing to the updating criteria of point 3b when the criteria of point 3a are not met for a given prior established number of consecutive times 10 times in ORAC When equilibrium will be approached the criteria of point 3b will favor transitions of the replicas between neighboring ensembles and eventually the conditions to apply again the criteria of point 3a 4 Every Le steps a transition x p n gt p n 1 is attempted on the basis of the acceptance ratio of Eq and of the current value of A fn n 1 properly reweighted according to the equations reported in Sec 6 3 3 If the estimate of Afp n 1 is still not available from the methods described at points 3a and 3b then the transition is not realized The upward and downward transitions are chosen with equal probability It is worthwhile stressing again that the procedures of point 3b are only aimed to furnish a reliable evaluation of optimal weights when such factors are still not available from the bidirectional algorithm point 3a or when the system is get trapped in one or few ensembles point 3c Moreover
165. which brings the system phase space point from the initial state $(q_0, p_0)$ to the state $(p(t), q(t))$ at a later time $t$. We already know that this transformation obeys Eq. 2.8. We may also note that the adjoint of the exponential operator corresponds to the inverse, that is, $e^{iLt}$ is unitary. This implies that the trajectory is exactly time reversible. In order to build our integrator, we now define the discrete time propagator $e^{iL\Delta t}$ as

$$e^{iLt} = \left[ e^{iL\Delta t} \right]^{n}, \qquad \Delta t = t/n \tag{2.17}$$

$$e^{iL\Delta t} = e^{\left( \dot q \frac{\partial}{\partial q} + \dot p \frac{\partial}{\partial p} \right) \Delta t} \tag{2.18}$$

In principle, to evaluate the action of $e^{iL\Delta t}$ on the state vector $(p, q)$ one should know the derivatives of all orders of the potential $V$. This can be easily seen by Taylor expanding the discrete time propagator $e^{iL\Delta t}$ and noting that the operator $\dot q\,\partial/\partial q$ does not commute with $-\partial V/\partial q\;\partial/\partial p$ (i.e. $\dot p\,\partial/\partial p$) when the coordinates and momenta refer to the same degree of freedom. We therefore seek approximate expressions of the discrete time propagator that retain both the symplectic and the reversibility property. For any two linear operators $A$, $B$ the Trotter formula holds:

$$e^{(A+B)t} = \lim_{n\to\infty} \left( e^{At/n}\, e^{Bt/n} \right)^{n} \tag{2.19}$$

We recognize that the propagator Eq. 2.18 has the same structure as the left hand side of Eq. 2.19; hence, using Eq. 2.19, we may write for $\Delta t$ sufficiently small

$$e^{iL\Delta t} = e^{\left( \dot q \frac{\partial}{\partial q} + \dot p \frac{\partial}{\partial p} \right) \Delta t} \simeq e^{\dot q \frac{\partial}{\partial q} \Delta t}\, e^{\dot p \frac{\partial}{\partial p} \Delta t} + O(\Delta t^{2}) \tag{2.20}$$

where, for simplicity of discussion, we have omitted the sum over $q$ and $p$ in the exponential. Eq. 2.20 is exact in the limit that $\Delta t \to 0$ and is first
166. This environment includes commands which define the starting and ending record for reading the trajectory file (see also TRAJECTORY and DUMP in &INOUT). The following are allowed commands:

    START, STOP, UPDATE

START
NAME
    START
SYNOPSIS
    START nconf
DESCRIPTION
    The trajectory file specified with the command TRAJECTORY(&INOUT) is read in starting from configuration nconf.
EXAMPLES
    START 1

STOP
NAME
    STOP
SYNOPSIS
    STOP nconf
DESCRIPTION
    The trajectory file specified with the command TRAJECTORY(&INOUT) is read stopping at configuration nconf.
EXAMPLES
    STOP 1000

UPDATE
NAME
    UPDATE - Update neighbor list for analysis
SYNOPSIS
    UPDATE nconf rcut
DESCRIPTION
    Update the neighbor lists for, e.g., radial distribution function calculations every nconf configurations, using a cut-off of rcut Å.
EXAMPLES
    UPDATE 2 10.0
WARNINGS
    Diagnostic - Unsupported

10.2.2 &INOUT

The environment &INOUT contains commands concerning input/output operations which can be carried out during run time. The commands within the &INOUT environment allow one to write history files in different formats and to dump restart files. The following commands are available:

    ASCII, ASCII_OUTBOX, DCD, DYNAMIC, DUMP, PLOT, RESTART, SAVE, TRAJECTORY

ASCII
NAME
    ASCII - Write solute and solvent coordinates to a history PDB file
SYNOPSIS
    ASCII fplot OPEN filename
DESCRIPTION
    This comm
167. This task can be performed only iteratively. Given a rough (because of some free energy barrier) estimate of $A(s)$ from an old simulation, the simplest way to know how good this estimate is consists in performing a new simulation using this estimate, inverted in sign, as a bias potential. If the free energy profile of the modified system is flat, $A'(s) = \text{constant}$, then $V(s) = -A(s)$ is the free energy inverted in sign. Otherwise, from this simulation we can compute an improved estimate for $A(s)$ through Eq. The effectiveness of this tedious approach is due to the fact that each correction to the biasing potential makes the system more ergodic, and therefore each successive simulation is statistically more accurate than the former. This iterative approach to the problem [129] led to the development of adaptive biasing potential methods that improve the potential on the fly [132], i.e. while the simulation is performed. All these methods share the common basic idea, namely to introduce the concept of memory [131] during a simulation by changing the potential of mean force perceived by the system, in order to penalize conformations that have already been sampled. The potential becomes history-dependent, since it is now a functional of the past trajectory along the reaction coordinate. Among these algorithms, the Wang-Landau and the metadynamics algorithms have received most attention in the fields of the Monte Carlo (MC) and Molecular Dynamic
168. hys. 129, 134112 (2008).
[50] J. G. Kirkwood, J. Chem. Phys. 3, 300 (1935).
[51] D. A. McQuarrie, Statistical Mechanics, HarperCollins Publishers, New York, USA, 1976.
[52] S. Kumar, D. Bouzida, R. H. Swendsen, P. A. Kollman and J. M. Rosenberg, J. Comput. Chem. 13, 1011 (1992).
[53] A. M. Ferrenberg and R. H. Swendsen, Phys. Rev. Lett. 63, 1195 (1989).
[54] C. J. Woods, J. W. Essex and M. A. King, J. Phys. Chem. B 107, 13703 (2003).
[55] R. Chelli, J. Chem. Theory Comput. 6, 1935 (2010).
[56] G. M. Torrie and J. P. Valleau, Chem. Phys. Lett. 28, 578-581 (1974).
[57] A. Laio and M. Parrinello, Escaping free energy minima, Proc. Natl. Acad. Sci. USA 99, 12562-12566 (2002).
[58] S. Marsili, A. Barducci, R. Chelli, P. Procacci and V. Schettino, J. Phys. Chem. B 110, 14011-14014 (2006).
[59] F. Wang and D. P. Landau, Phys. Rev. Lett. 86, 2050-2053 (2001).
[60] J. Henin and C. Chipot, J. Chem. Phys. 121, 2904-2914 (2004).
[61] A. Laio, A. Rodriguez-Fortea, F. L. Gervasio, M. Ceccarelli and M. Parrinello, Assessing the accuracy of metadynamics, J. Phys. Chem. B 109, 6714-6721 (2005).
[62] C. Jarzynski, Nonequilibrium equality for free energy differences, Phys. Rev. Lett. 78, 2690-2693 (1997).
[63] G. E. Crooks, J. Stat. Phys. 90, 1481-1487 (1998).
[64] G. Hummer and A. Szabo, Proc. Natl. Acad. Sci. USA 98, 3658-3661 (2001).
[65] M. R. Shirts, E. Bair, G. Hooker and V. S. Pande, P
169. hys. Rev. Lett. 91, 140601 (2003).
[66] D. Chandler, Introduction to Modern Statistical Mechanics, Oxford University Press, 1987.
[67] J. M. Sanz-Serna, Acta Numerica 1, 243 (1992).
[68] S. K. Gray, D. W. Noid and B. G. Sumpter, J. Chem. Phys. 101, 4062 (1994).
[69] J. J. Biesiadecki and R. D. Skeel, J. Comp. Physics 109, 318 (1993).
[70] P. J. Channell and C. Scovel, Nonlinearity 3, 231 (1990).
[71] H. Goldstein, Classical Mechanics, Addison-Wesley, Reading, MA, 1980.
[72] V. I. Arnold, Mathematical Methods of Classical Mechanics, Springer-Verlag, Berlin, 1989.
[73] H. F. Trotter, Proc. Am. Math. Soc. 10, 545 (1959).
[74] H. de Raedt and B. De Raedt, Phys. Rev. A 28, 3575 (1983).
[75] H. Yoshida, Phys. Letters A 150, 262 (1990).
[76] S. J. Toxvaerd, J. Chem. Phys. 87, 6140 (1987).
[77] H. C. Andersen, J. Comput. Phys. 52, 24 (1983).
[78] M. E. Tuckerman and M. Parrinello, J. Chem. Phys. 101, 1302 (1994).
[79] S. Nose and M. L. Klein, Mol. Phys. 50, 1055 (1983).
[80] G. Herzberg, Spectra of Diatomic Molecules, Van Nostrand, New York, 1950.
[81] M. Watanabe and M. Karplus, J. Phys. Chem. 99, 5680 (1995).
[82] J. K. Kjems and G. Dolling, Phys. Rev. B 11, 1639 (1975).
[83] F. D. Medina and W. B. Daniels, J. Chem. Phys. 64, 150 (1976).
[84] G. Cardini and V. Schettino, Chem. Phys. 146, 147 (1990).
[85] D. Frenkel and B. Smit, Understanding Molecular Simulations, Academic Press, San Diego, 1996.
170. ibution due to the atomic charges. Furthermore, nearby atoms interact through special two-body, three-body and four-body functions representing the valence bond, bending and torsional interaction energy surfaces. The validity of such an approach, as well as the reliability of the various potential models proposed in the literature [2, 3, 4, 5], is not the object of the present book. For reading on this topic we refer to the extensive and ever-growing literature [3, 6, 7, 8]. Here we want only to stress the general concept that atomistic simulations usually have more predictive power than simplified models, but are also computationally much more expensive than the latter. This predictive power stems from the fact that, in principle, simulations at the atomistic level do not introduce any uncontrolled approximation besides the obvious assumptions inherent in the definition of the potential model, and do not assume any a priori knowledge of the system except, of course, its chemical composition and topology. Therefore, the failure of an atomistic simulation to predict specific properties of the system is due only to the inadequacy of the adopted interaction potential. We may define this statement as the brute force postulate. In practice, however, in order to reduce the computational burden, severe and essentially uncontrolled approximations such as neglect of long range interactions, suppression of degrees of freedom, dubious therm
171. ied and all output of the replicas are written. The only two files that need to be in the directory from which ORAC is launched are the main input and the REM set file (the latter only if a REM simulation is started from scratch and the scaling factors of the replicas are assigned manually and not automatically; see SETUP(&REM)).

11.2 How to set dimensions in ORAC: the config.h file

Being written mostly in Fortran 77, the ORAC program does not dynamically allocate the required memory. Memory allocation is done statically, and dimensions throughout the code are given in a single file named config.h. To adapt the size of the program to other problems, the config.h file needs to be changed and the program recompiled. In the current distribution, an ancillary awk script that builds the config.h file has been provided. This script is called configure and can be found in the tests directory. configure parses a general input file for ORAC and writes the corresponding config.h file to the standard output. A certain number of ORAC routines contain INCLUDE statements. The corresponding include files, which may contain PARAMETER, COMMON and general dimension statements (REAL, INTEGER, etc.), have by convention a .h suffix and are generated by the standard preprocessor /lib/cpp from .inc files and the config.h file. The .inc files are templates of include files where constants are initialized to character symbols (some are listed below). When making the e
172. illean iL yVy is readily available from the equations of motion in 3 23I3 27 For sake of simplicity to build our NPT multiple time step integrator we assume that the system potential contains only a fast intramolecular Vo term and a slow intermolecular term V as discussed in Sec Generalization to multiple intra and inter molecular components is straightforward We define the following components of the NPT Liouvillean Le gt Pi2Ve gt pia Vpn X Prog VPadas 3 57 i Q ik Q a Q ily FyVpp 3 58 iL J E pyyp 3 59 in gt _h P det hn a VPa ag 3 60 n i ETA hig S 5 JiVp 5 ik Vpir 5 V P VP Jag 3 61 i ik a P Pik Pr ag iGo 23 M Vs 2 mi V F 3 w Vn ag T ov Vlis VOV pix 3 62 where in Eq the scaled forces F have been replaced by its real space counterparts i e J h7 F The atomic scaling version of this Liouvillean breakup is derived on the basis of Eqs 8 38 One Multiple Time Steps Algorithms 26 obtains p P iL Pik V Par X Palas ral VP Jag 3 63 ik a iLy FnVp 3 64 iL G i Cas it 3 65 p k ik ih X h Poze deth a PPro 3 66 a iL Vie Ven 5 Vv ap Ver ag 3 67 ik a Pik Ph att it 2a Mik Vis D W Vn ag ik a ov Vie VoV awe 3 68 where jip h fj and V K F are given in Eqs 8 39840 For the time scale breakup in the NPT ensemble we have the complication of
173. inates from a PDB file
SYNOPSIS
    READ_PDB filename
DESCRIPTION
    This command indicates the name of a file in the Protein Data Bank format which contains the solute and/or solvent coordinates. The name of this file, filename, must be provided. The coordinates of the solvent molecules, if present, must follow those of the solute in the PDB file. The atom labels for solute and/or solvent must correspond with those defined in the topology file (see description in Sec. 10.3). The order of the atoms within a solute residue or a solvent molecule specified in the PDB file is unimportant: the ORAC order corresponds to that specified in the topology file. If the system contains hydrogens, the PDB file need not include the hydrogen coordinates. If hydrogen atoms are not present in the PDB file but are included in the topological specification of the residue or solvent, their coordinates are generated by ORAC according to geometry considerations.
EXAMPLES
    READ_PDB test.pdb
WARNINGS
    This command has no action if CONTROL in &RUN is different from zero, i.e. if the system coordinates are read from a restart file (see RESTART in &INOUT).

REPLICATE
NAME
    REPLICATE - Replicate the unit cell generated by SPACE_GROUP(&SOLUTE)
SYNOPSIS
    REPLICATE icl icm icn
DESCRIPTION
    The integer arguments icl, icm, icn indicate how many times along the three axes the unit cell must be replicated. The cell parameters of the replicated structure are
174. ing $L_a$, apart from the requirement that it should make the work values as uncorrelated as possible. During the simulation we must also record the number of stored $W$ elements, $N_{n\to n+1}$ and $N_{n+1\to n}$.

3. Every $L_b$ steps, such that $L_b \gg L_a$ (three orders of magnitude at least), we try a free energy update on the basis of Eq. 6.25 or Eq. 6.27. The scheme we propose for $\Delta f_{n\to n+1}$ follows.

   a) First of all, we check if the conditions $N_{n\to n+1} > N'$ and $N_{n+1\to n} > N'$ are met. In such a case Eq. 6.25 is applied (setting $m = n+1$) using the stored dimensionless works (see point 2). The threshold $N'$ is used as a control parameter for the accuracy of the calculation. In the ORAC program we have set $N' = \mathrm{int}(L_b/L_a)$. Once $\Delta f_{n\to n+1}$ is known, its square uncertainty is computed according to Eq. 6.26. Then we set $N_{n\to n+1} = 0$ and $N_{n+1\to n} = 0$ and remove $W[n\to n+1]$ and $W[n+1\to n]$ from computer memory. Whenever a free energy estimate and the correlated uncertainty are computed, the optimal weight to be used in the acceptance ratio (Eq. 6.22) is determined by applying standard formulas from maximum likelihood considerations (see Sec. 6.3.3). This step is realized for $n = 1, 2, \dots, N-1$.

   b) If the criteria needed to apply Eq. are not met and no $\Delta f_{n\to n+1}$ estimate is available yet from point 3a, then we try to apply Eq. 6.27. In particular, two independent estimates of $\Delta f_{n\to n+1}$ are attempted. One comes from Eq. by setting m n
175. input to the command CRYSTAL.
EXAMPLES
    REPLICATE 4 4 5
    Replicate the unit cell 4, 4 and 5 times along the a, b and c crystal axes, respectively.
WARNINGS
    This command has no action if CONTROL in &RUN is different from zero, i.e. if the system coordinates are read from a restart file (see RESTART in &INOUT).

RESET_CM
NAME
    RESET_CM - Reset to zero the position of the center of mass of the solute atoms
SYNOPSIS
    RESET_CM
DESCRIPTION
    This command is active only if the solute coordinates are read from a PDB file. Before the run starts, RESET_CM sets the center of mass of the solute to zero.

READ_CO
NAME
    READ_CO - Read Crystal to Orthogonal (CO) matrix
SYNOPSIS
    READ_CO
      ax bx cx
      ay by cy
      az bz cz
    END
DESCRIPTION
    This command is active only if the simulation is restarted, and overwrites the CO matrix retrieved from the restart file.

SOLUTE
NAME
    SOLUTE - assume solute
SYNOPSIS
    SOLUTE ON
    SOLUTE OFF
DESCRIPTION
    This command is active only if the solute coordinates are read from a PDB file. If ON is specified, ORAC assumes that a solute is present and its coordinates are read in from the PDB file specified by the directive READ_PDB in this environment. When SOLUTE ON is specified, the namelist &SOLUTE may be omitted. When SOLUTE OFF is specified, the namelist &SOLUTE must be omitted.
EXAMPLES
    &SETUP
       READ_PDB solute.pdb
       SOLUTE ON
    &END
    A solute is prese
176. instability may derive from integrating slow degrees of freedom with exceedingly small time steps.

4.2 The smooth particle mesh Ewald method

Before we discuss the non-bonded multiple time step separation, it is useful to describe in some detail one of the most advanced techniques to handle long range forces. Indeed, this type of non-bonded force is the most cumbersome to handle and deserves closer scrutiny. In the recent literature a variety of techniques are available to handle the problem of long range interactions in computer simulations of charged particles at different levels of approximation [29, 30, OI]. In this section we shall focus on the Ewald summation method for the treatment of long range interactions in periodic systems [31, 32, 100]. The Ewald method gives the exact result for the electrostatic energy of a periodic system consisting of an infinitely replicated neutral box of charged particles. The method is the natural choice in MD simulations of complex molecular systems with PBC. The Ewald potential is given by

$$V_{qd} = \frac{1}{2} \sum_{ij}^{N} q_i q_j {\sum_{\mathbf{n}}}' \frac{\mathrm{erfc}\left( \alpha |\mathbf{r}_{ij} + \mathbf{r}_n| \right)}{|\mathbf{r}_{ij} + \mathbf{r}_n|} \tag{4.20}$$

$$V_{qr} = \frac{1}{2\pi V} \sum_{\mathbf{m} \neq 0} \frac{\exp\left( -\pi^2 |\mathbf{m}|^2 / \alpha^2 \right)}{|\mathbf{m}|^2}\, S(\mathbf{m})\, S(-\mathbf{m}) - \frac{\alpha}{\pi^{1/2}} \sum_i q_i^2 - V_{intra} \tag{4.21}$$

with

$$S(\mathbf{m}) = \sum_i^{N} q_i \exp\left( 2\pi i\, \mathbf{m} \cdot \mathbf{r}_i \right) \tag{4.22}$$

$$V_{intra} = \sum_{ij-\mathrm{excl}} q_i q_j \frac{\mathrm{erf}(\alpha r_{ij})}{r_{ij}} \tag{4.23}$$

where $\mathbf{r}_i$ is the vector position of the atomic charge $q_i$, $\mathbf{r}_{ij} = \mathbf{r}_i - \mathbf{r}_j$, $\mathbf{r}_n$ is a vector of the direct lattice, $\mathrm{erfc}(x) = 2\pi^{-1/2} \int_x^{\infty} e^{-t^2}\, dt$ is the compleme
177. ion table to a history file in Protein Data Bank (PDB) format
SYNOPSIS
    PLOT fplot OPEN filename
    PLOT FRAGMENT fplot OPEN filename
    PLOT ALCHEMY fplot OPEN filename
    PLOT CENTER fplot OPEN filename
    PLOT STEER fplot OPEN filename
    PLOT STEER_ANALYTIC fplot OPEN filename
    PLOT STEER_TEMPERATURE fplot OPEN filename
DESCRIPTION
    It writes a formatted history file containing the coordinates of a selected part of the solute and the solvent coordinates. The dumping frequency (in fs) is fplot.
EXAMPLES
    PLOT 10.0 OPEN test.pdb
    Write coordinates of the backbone atoms of the solute in PDB format every 10 fs to file test.pdb.

    PLOT CENTER 10.0 OPEN test.pdb
    Write coordinates of all atoms of the system in PDB format every 10 fs to file test.pdb. Identical to ASCII_OUTBOX(&INOUT).

    PLOT FRAGMENT 10.0 OPEN test.xyz
    Write coordinates of a fragment of the solute in xyz format, selected according to the DEF_FRAGMENT(&PROPERTIES) directive, every 10 fs to file test.xyz. The fragment is defined as follows:

    &PROPERTIES
       DEF_FRAGMENT 1 38
    &END

    This defines a fragment consisting of the first 38 atoms of the solute. The numerical order of the atoms corresponds to that specified in the topology file (Sec. 10.3). The file test.xyz can be animated using the XMOL public domain molecular graphics program.

    PLOT STEER 50.0 OPEN wrk.out
    Write the accumulated work (see Eq.) to the file wrk.out every 50 fs. The accumulated work at time t is calculated
178. ions are NOT scaled if inter is speci fied If the subcommand kind is not specified the ORAC assumes that both solute solvent and solute solute interactions are scaled EXAMPLES SEGMENT define 1 10 define 1300 1325 kind inter END SETUP NAME SETUP Define the scaling in a REM simulation SYNOPSIS SETUP scale scale scales irest DESCRIPTION The SETUP command is used to define the lowest scaling factor s i e the highest temperature of the last replica The number of replicas in the REMD simulations are equal to the number of processors passed to the MPI routines nprocs The spacing bewteen the replicas is controlled by the irest integer If only the scale real parameter is specified an equal scaling is applied to all parts of the potential If the three parameters scale scalez scalez are specified then scale refers to the bending stretching and improper torsional potential scalez to the proper torsional potential and Input to ORAC amp REM 120 STEP to the 14 non bonded interactions and finally scale3 refers to the non bonded potential NB when the Ewald summation is used together with the command SEGMENT amp REM scale3 scales only the direct short ranged part of the electrostatic interactions and the long ranged reciprocal part has a scaling factor of 1 0 i e these interaction are not scaled If irest 0 the run is restarted from a previous one This implies that the directories PARXXXX are present
179. ip Va aat 4557 lt 75A A 40fs Vor a 0 43 Ri Rm Ro Ri and R3 Rp are the short medium long range shell radius respectively The switching SY R is 1 at Rj and goes monotonically to 0 at Rj A Provided that sv and its derivatives are continuous at R and R A the analytical form of 3 in the healing interval is arbitrary 509 B3102 The full breakup for an AMBER type force field along with the integration time steps valid for any complex molecular system with strong electrostatic interactions is summarize in table II The corresponding five time steps integration algorithm for the NVE ensemble is given by iLAth _ OVn _O Ath _ ov At 7 Vm 0 Atm e exp Or Opi 2 i fexp Or Opi 2 exp Or Opi 2 2Var 3 Ata 2Vn0 3 Atno b 2 exp Or Opi 2 exp Or Opi 2 exp Yj Or Ato a Atn MnO F Ata Nn exp oe fe aa a Nm OVm Atm oV At OV At exp Or Opi 2 exp Dr Opi exp a Opi St where n Atp At Mm Ati Atm Nn Atm Atni Mno Atni Atno The explicit integration algorithm can be easily derived applying the five fold discrete time propagator 4 46 to the state vector p q at time 0 using the rule Eq 2 23 The efficiency and accuracy for energy conservation of this r RESPA symplectic and reversible integrator have been discussed extensively in Refs 12 Extension of this subdivision to non NVE simulation is described in Ref 26 4 4 Electrostatic
180. ivalent to the well known velocity Verlet:

$$\begin{aligned}
p(\Delta t/2) &= p(0) + F(0)\,\Delta t/2 \\
q(\Delta t) &= q(0) + \frac{p(\Delta t/2)}{m}\,\Delta t \\
p(\Delta t) &= p(\Delta t/2) + F(\Delta t)\,\Delta t/2
\end{aligned} \tag{2.24}$$

We first notice that each of the three transformations obeys the symplectic condition Eq. 2.8 and has a Jacobian determinant equal to one. The product of the three transformations is also symplectic and thus phase volume preserving. Finally, since the discrete time propagator is unitary, the algorithm is time reversible. One may wonder what is obtained if the operators $\dot q\,\partial/\partial q$ and $-\partial V/\partial q\;\partial/\partial p$ are exchanged in the definition of the discrete time propagator 2.22. If we do so, the new integrator is

$$\begin{aligned}
q(\Delta t/2) &= q(0) + \frac{p(0)}{m}\,\Delta t/2 \\
p(\Delta t) &= p(0) + F\left[ q(\Delta t/2) \right] \Delta t \\
q(\Delta t) &= q(\Delta t/2) + \frac{p(\Delta t)}{m}\,\Delta t/2
\end{aligned} \tag{2.25}$$

This algorithm has been proved to be equivalent to the so-called Leap-frog algorithm [76]. Tuckerman et al. [19] called this algorithm position Verlet, which is certainly a more appropriate name in the light of the exchanged role of positions and velocities with respect to the velocity Verlet. Also, Eq. 2.21 clearly shows that the position Verlet is essentially identical to the Velocity Verlet. A shift of the time origin by $\Delta t/2$ of either Eq. or Eq. would actually make both integrators perfectly equivalent. However, as pointed out in Ref. 20, half time steps are not formally defined, the right hand side of Eq. 2.21 being an approximation of the discrete time propagator for the full step $\Delta t$. Ve
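The two update rules of Eqs. 2.24 and 2.25 can be sketched in a few lines of Python. This is only an illustrative sketch, not ORAC code: the one-dimensional harmonic force, the function names and the parameter values are all assumptions made here for the example. It integrates the same initial condition with both schemes, then reverses the momentum and integrates back to illustrate time reversibility.

```python
def force(q, k=1.0):
    """Illustrative harmonic force F = -dV/dq for V = k q^2 / 2 (an assumption)."""
    return -k * q


def velocity_verlet(q, p, dt, m=1.0):
    """One step of Eq. 2.24: half kick, full drift, half kick."""
    p = p + force(q) * dt / 2.0
    q = q + p / m * dt
    p = p + force(q) * dt / 2.0
    return q, p


def position_verlet(q, p, dt, m=1.0):
    """One step of Eq. 2.25: half drift, full kick, half drift."""
    q = q + p / m * dt / 2.0
    p = p + force(q) * dt
    q = q + p / m * dt / 2.0
    return q, p


def energy(q, p, m=1.0, k=1.0):
    """Total energy of the illustrative harmonic oscillator."""
    return p * p / (2.0 * m) + k * q * q / 2.0


def run(step, q, p, dt, nsteps):
    """Apply the one-step map `step` nsteps times."""
    for _ in range(nsteps):
        q, p = step(q, p, dt)
    return q, p


if __name__ == "__main__":
    q0, p0, dt, n = 1.0, 0.0, 0.05, 2000
    for step in (velocity_verlet, position_verlet):
        q, p = run(step, q0, p0, dt, n)
        # Reverse the momentum and integrate again: the start is recovered.
        qb, pb = run(step, q, -p, dt, n)
        print(step.__name__, abs(energy(q, p) - energy(q0, p0)), abs(qb - q0))
```

For this toy system both schemes keep the energy error bounded at the order of $\Delta t^2$, and both retrace their trajectory to within round-off when the momentum is reversed.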
181. ive written for SPME simulations these parameters are all defined to be 1.

• _NAT_WW_, _NAT_WP_, _NAT_PP_
  These parameters control the neighbor list dimensions. E.g., the three neighbor lists for the solvent (since a maximum of three shells for r-RESPA are allowed) are integer arrays of dimensions _NAT_WW_ × _MOL_SOLV_.

• _FFT1_, _FFT2_, _FFT3_, _MORD_
  These parameters control the dimensions of the Q charge array and of the M polynomials for PME computation (see Section 4.1).

The total size of the code depends on the number of particles in the system and on the kind of calculation to be carried out. To give an idea, an 8000-atom system running with PME, linked cell, and computing e.g. the VACF requires about 25 MB of memory. The equilibration of the solvated reaction center (33000 atoms) requires around 85 MB.
182. $k_{cut} \equiv 2\pi |\mathbf{m}|_{max}$ to infinity in polar coordinates. The neglected reciprocal lattice intra-molecular energy is then [103]

$$V_{corr} = \frac{\alpha}{\pi^{1/2}}\, \mathrm{erfc}\left( \frac{k_{cut}}{2\alpha} \right) \sum_i q_i^2 + \sum_{ij-\mathrm{excl}} q_i q_j\, \chi(r_{ij}, k_{cut}, \alpha) \tag{4.46}$$

with

$$\chi(r, k_{cut}, \alpha) = \frac{2}{\pi} \int_{k_{cut}}^{\infty} e^{-k^2/4\alpha^2}\, \frac{\sin(kr)}{kr}\, dk \tag{4.47}$$

The first, constant term refers to the self energy, while the second accounts for the intra-molecular excluded interaction. This correction must be included in the same reference system to which $V_{qr}$ is assigned in our potential separation (see Table 2). In principle, the correction in Eq. 4.46 applies only to standard Ewald and not to the reciprocal lattice energy computed via SPME. We can still, however, use the correction Eq. 4.46 if a spherical cutoff $k_{cut}$ is applied to SPME. This can be done easily by setting $\exp(-\pi^2 |\mathbf{m}|^2/\alpha^2)/|\mathbf{m}|^2 = 0$ for $2\pi |\mathbf{m}| > k_{cut} \equiv f \pi N_f / L$, where $L$ is the side length of the cubic box and $N_f$ is the number of grid points in each direction. The factor $f$ must be chosen slightly less than unity. This simple device decreases the effective cutoff in reciprocal space while maintaining the same grid spacing, thus reducing the B-spline interpolation error (the error in the B-spline interpolation of the complex exponential is indeed maximum precisely at the tail of the reciprocal sums [84]). In Ref. 103 the effect of including or not such a correction in electrostatic systems using multiple time step algorithms is studied and discussed thoroughly.
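As a numerical sanity check on the correction potential, the sketch below evaluates $\chi(r, k_{cut}, \alpha)$ by Simpson quadrature, assuming it has the form $\chi = \frac{2}{\pi}\int_{k_{cut}}^{\infty} e^{-k^2/4\alpha^2}\,\frac{\sin kr}{kr}\,dk$. It then compares two limits: $k_{cut} \to 0$, where $\chi$ should reduce to $\mathrm{erf}(\alpha r)/r$, and $r \to 0$, where it should give the self-energy coefficient $(2\alpha/\pi^{1/2})\,\mathrm{erfc}(k_{cut}/2\alpha)$. The function names, the quadrature settings and the sample values of $\alpha$, $r$ and $k_{cut}$ are assumptions chosen for illustration; none of this is taken from the ORAC source.

```python
import math


def sinc(x):
    """sin(x)/x with the x -> 0 limit handled explicitly."""
    return 1.0 if abs(x) < 1e-12 else math.sin(x) / x


def chi(r, k_cut, alpha, n_intervals=4000):
    """Composite Simpson estimate of the assumed integral form of chi.

    The upper limit is truncated at k_cut + 20*alpha, where the Gaussian
    factor exp(-k^2 / 4 alpha^2) is negligible. n_intervals must be even.
    """
    k_max = k_cut + 20.0 * alpha
    h = (k_max - k_cut) / n_intervals

    def integrand(k):
        return math.exp(-k * k / (4.0 * alpha * alpha)) * sinc(k * r)

    total = integrand(k_cut) + integrand(k_max)
    for i in range(1, n_intervals):
        weight = 4.0 if i % 2 else 2.0
        total += weight * integrand(k_cut + i * h)
    return (2.0 / math.pi) * total * h / 3.0


if __name__ == "__main__":
    alpha, r, k_cut = 0.4, 3.0, 1.0  # illustrative values (1/Angs, Angs, 1/Angs)
    print(chi(r, 0.0, alpha), math.erf(alpha * r) / r)
    print(chi(0.0, k_cut, alpha),
          2.0 * alpha / math.sqrt(math.pi) * math.erfc(k_cut / (2.0 * alpha)))
```

The first comparison reproduces the full reciprocal-space pair potential that is lost when no reciprocal sum is computed at all; the second, multiplied by $\frac{1}{2}\sum_i q_i^2$, gives the self-energy term of the correction.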
183. l formatted topology file JOIN SOLVENT hoh END Defines the topology of the solvent Input to ORAC amp PARAMETERS 95 WARNINGS The command is inactive when used in conjunction with READ_TPGPRM To have the desired effect the JOIN environment must be used in conjunction with READ_TPG_ASCII READ_PRM_ASCITI and op tionally the REPL_RESIDUE environment PRINT_TOPOLOGY NAME PRINT_TOPOLOGY Print topology components of the current solute molecule SYNOPSIS PRINT_TOPOLOGY END DESCRIPTION PRINT_TOPOLOGY is a structured command to be used for printing out part of the topology and potential information for the solute molecule The following subcommands may be specified within PRINT_TOPOLOGY atoms bendings bonds constraints I torsions P torsions sequence e bonds Print the bonds list e bendings Print the bendings list e constraints Print the bond constraints list e torsions Print the proper torsion list e P torsions Print the improper torsion list e sequence Print info on the units sequence of both solvent and solute EXAMPLES PRINT_TOPOLOGY bonds P torsions END Input to ORAC amp PARAMETERS 96 READ_TPGPRM NAME READ_TPGPRM Read an unformatted parameter and topology file SYNOPSIS READ_TPGPRM filename no_warning DESCRIPTION The command reads the binary force field parameters and topology file filename This file contains the topology and force field parameters tables It i
185. ll 64 cells are generated, four in each direction.

INSERT
NAME
    INSERT - Insert solute molecules in the solvent
SYNOPSIS
    INSERT radius
DESCRIPTION
    This command is designed to insert solute molecules in a simulation box containing solvent molecules. The solvent molecules which overlap with the solute are discarded. ORAC assumes that two molecules overlap if their distance is less than the sum of their respective Lennard-Jones radii multiplied by radius. There is no optimal value for radius; however, reasonable values are between 0.6 and 0.8.
EXAMPLES
    INSERT 0.6
WARNINGS
    This command has no action if CONTROL in &RUN is different from zero, i.e. if the system coordinates are read from a restart file (see RESTART in &INOUT).

READ_SOLVENT
NAME
    READ_SOLVENT - Read solvent molecules
SYNOPSIS
    READ_SOLVENT nmol
DESCRIPTION
    This command is a synonym of ADD_UNITS.

REDEFINE
NAME
    REDEFINE - Reassign a solute unit to the solvent
SYNOPSIS
    REDEFINE unit_name
DESCRIPTION
    This command is used for deleting the unit unit_name from the solute list and assigning it to the solvent molecules. As far as energies and properties are concerned, the unit unit_name will pertain to the solvent.
EXAMPLES
    &PARAMETERS
       READ_TPGPRM_BIN benz.prmtpg
    &END
    &SOLVENT
       REDEFINE po4
    &END
    We redefine the solute unit po4 as a solvent unit.

10.3
186. locity Verlet and Position Verlet therefore do not generate numerically identical trajectories, although of course the trajectories are similar. We conclude this section by noting that, remarkably, different long-known schemes can be derived using the same Liouville formalism. The Liouville approach therefore represents a unifying treatment for understanding the properties of, and relationships between, stepwise integrators.

2.3 Potential Subdivision and Multiple Time Steps Integrators for NVE Simulations

The ideas developed in the preceding sections can be used to build multiple time step integrators. Multiple time step integration is based on the concept of reference system. Let us now assume that the system potential $V$ is subdivided in $n$ terms such that

$$V = V_0 + V_1 + \dots + V_n \tag{2.26}$$

Additionally, we suppose that the corresponding average values of the square modulus of the forces $F_k = -\partial V_k/\partial x$ and of their time derivatives $\dot F_k = -\frac{d}{dt}\left( \partial V_k/\partial x \right)$ satisfy the following condition:

$$\begin{aligned}
\langle F_0^2 \rangle &\gg \langle F_1^2 \rangle \gg \dots \gg \langle F_n^2 \rangle \\
\langle \dot F_0^2 \rangle &\gg \langle \dot F_1^2 \rangle \gg \dots \gg \langle \dot F_n^2 \rangle
\end{aligned} \tag{2.27}$$

These equations express the situation where different time scales of the system correspond to different pieces of the potential. Thus, the Hamiltonian of the $k$-th reference system is defined as

$$H_k = T + V_0 + \dots + V_k \tag{2.28}$$

with a perturbation given by

$$P_k = V_{k+1} + V_{k+2} + \dots + V_n \tag{2.29}$$

For a general subdivision of the kind given in Eq. there exist $n$ nested reference systems. In the general case of
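The reference-system idea can be illustrated for the simplest case of two terms, $V = V_0 + V_1$, where the reference system $H_0 = T + V_0$ is propagated with a small inner step and the perturbation $P_0 = V_1$ only acts at the beginning and end of each outer step. The following Python sketch is a toy model written for this purpose, not the ORAC integrator: the one-dimensional potentials, the force constants and all function names are assumptions.

```python
K_FAST = 100.0  # stiff force constant for V0 (illustrative value)
K_SLOW = 1.0    # soft coupling for V1 (illustrative value)


def fast_force(q):
    """Force from the fast term V0 = K_FAST * q**2 / 2."""
    return -K_FAST * q


def slow_force(q):
    """Force from the slow term V1 = K_SLOW * q**4 / 4."""
    return -K_SLOW * q ** 3


def total_energy(q, p, m=1.0):
    """T + V0 + V1 for the toy model."""
    return p * p / (2.0 * m) + K_FAST * q * q / 2.0 + K_SLOW * q ** 4 / 4.0


def respa_step(q, p, dt_outer, n_inner, m=1.0):
    """One outer step of a two-level multiple time step scheme.

    The slow force kicks the momentum for half an outer step, the reference
    system T + V0 is advanced with n_inner velocity Verlet steps, and the
    slow force kicks again for the second half of the outer step.
    """
    p += slow_force(q) * dt_outer / 2.0
    dt = dt_outer / n_inner
    for _ in range(n_inner):
        p += fast_force(q) * dt / 2.0
        q += p / m * dt
        p += fast_force(q) * dt / 2.0
    p += slow_force(q) * dt_outer / 2.0
    return q, p


def propagate(q, p, dt_outer, n_inner, n_steps):
    """Apply respa_step n_steps times."""
    for _ in range(n_steps):
        q, p = respa_step(q, p, dt_outer, n_inner)
    return q, p


if __name__ == "__main__":
    q, p = propagate(1.0, 0.0, dt_outer=0.05, n_inner=10, n_steps=1000)
    e0 = total_energy(1.0, 0.0)
    print("relative energy error:", abs(total_energy(q, p) - e0) / e0)
```

In this sketch the slow force is evaluated once per outer step while the fast force is evaluated `n_inner` times, which is where the savings come from when the slow term is the expensive one. Because the step is a symmetric composition of kicks and drifts, it remains time reversible.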
187. long the trajectory of the system. This sum, inverted in sign, is used during the simulation as a biasing potential $V(s, t)$ that depends explicitly on time:

$$V(s, t) = \sum_{t' = \tau, 2\tau, \dots}^{t} G(s; s_{t'}, h, \sigma) \tag{7.2}$$

where $G(s; s_t, h, \sigma) = h \exp\left( -(s - s_t)^2 / 2\sigma^2 \right)$ is a Gaussian function centered in $s_t$ with height $h$ and variance $\sigma^2$. During a metadynamics simulation, the potential $V(s, t)$ will grow faster for states with a higher probability, pushing the system out of minima in the free energy landscape. If the rate of deposition $\omega = h/\tau$ is sufficiently slow, the system can be considered in equilibrium with the biased Hamiltonian $H'(x, t) = H(x) + V(s, t)$, and therefore the probability of visiting state $s$ at time $t$ is the equilibrium canonical distribution $p(s, t) \propto \exp\left[ -\beta \left( A(s) + V(s, t) \right) \right]$. Once all the free energy minima have been filled by the biasing potential, and therefore $V(s, t) = -A(s)$, such a probability is uniform along $s$ and the potential will grow uniformly. The thermodynamical work spent in changing the potential from the original Hamiltonian $H(x)$ to $H'(x, t) = H(x) + V(s, t)$ can be computed through the relation $W = \int_0^t d\tau \left( \partial V/\partial t \right)_{\tau}$. In the limit of an adiabatic transformation, this quantity is equal to the Helmholtz free energy difference $\Delta A = A' - A_0$ between two systems with energy functions $H'$ and $H$, where $A' = -\beta^{-1} \ln \int dx\, \exp(-\beta H')$ and $A_0 = -\beta^{-1} \ln \int dx\, \exp(-\beta H)$ [133]. However, if the process is too fast with respect to the ergodic time scale, a part of the
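The history-dependent bias described above, a running sum of Gaussians of height $h$ and width $\sigma$ centered on previously visited values of the reaction coordinate, can be sketched as a small Python class. This is a minimal illustration under stated assumptions: the class name, its methods and the sample parameter values are hypothetical and do not correspond to the ORAC implementation.

```python
import math


class GaussianBias:
    """History-dependent bias V(s, t) stored as a list of Gaussian centers."""

    def __init__(self, height, sigma):
        self.height = height  # h, height of each deposited Gaussian
        self.sigma = sigma    # sigma, width of each deposited Gaussian
        self.centers = []     # visited values s_t' where Gaussians were placed

    def deposit(self, s_now):
        """Add one Gaussian centered at the current reaction coordinate."""
        self.centers.append(s_now)

    def _gauss(self, s, center):
        return self.height * math.exp(-(s - center) ** 2 / (2.0 * self.sigma ** 2))

    def value(self, s):
        """Bias potential V(s, t): sum of all Gaussians deposited so far."""
        return sum(self._gauss(s, c) for c in self.centers)

    def force(self, s):
        """Bias force -dV/ds, pointing away from already visited regions."""
        return sum(self._gauss(s, c) * (s - c) / self.sigma ** 2
                   for c in self.centers)


if __name__ == "__main__":
    bias = GaussianBias(height=0.5, sigma=0.2)
    for s_visited in (1.0, 1.0, 1.1):
        bias.deposit(s_visited)
    print(bias.value(1.0), bias.force(1.3))
```

In an actual simulation `deposit` would be called every $\tau$ time units and `force` would be added to the physical force along the reaction coordinate; storing only the centers keeps the memory cost linear in the number of deposited Gaussians.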
188. lts of integrators for rigid nitrogen using SHAKE are also shown for comparison. The data in Table 1 refer to a 3.0 ps run without velocity rescaling. They were obtained starting all runs from coordinates corresponding to the experimental Pa3 structure [82, 83] of solid nitrogen and from velocities taken randomly according to the Boltzmann distribution at 100 K. The entry in bold refers to the "exact" result, obtained with a single time step integrator with a very small step size of 0.3 fs. Note that R increases quadratically with the time step for single time step integrators, whereas r-RESPA is remarkably resistant to outer time step size increase. For example, r-RESPA with $\Delta t = 9.0$ fs and $P = 30$ (i.e. $\Delta t_0 = 0.3$ fs) yields better accuracy on energy conservation than single time step Velocity Verlet with $\Delta t = 0.6$ fs does, while being more than six times faster. Moreover, r-RESPA integrates all degrees of freedom of the system and is almost as efficient as Velocity Verlet with constraints on bonds. It is also worth pointing out that the energy averages for all r-RESPA integrators are equal to the exact value, while with a single time step even a moderate step size increase results in appreciably different average intra-molecular energies. As a more complex example, we now study a cluster of eight single-chain alkanes C$_{24}$H$_{50}$. In this case the potential contains stretching, bending and torsional contributions, plus the intermolecular Van der Waals interactions between non
189. m. Phys. 97, 2635 (1992).
Y. Sugita and Y. Okamoto, Chem. Phys. Lett. 314, 141 (1999).
D. D. Minh and A. B. Adib, Phys. Rev. Lett. 100, 180602 (2008).
P. Nicolini, P. Procacci, and R. Chelli, J. Phys. Chem. B 114, 9546 (2010).
S. R. Williams, D. J. Searles, and D. J. Evans, Phys. Rev. Lett. 100, 250601 (2008).
J. Gore, F. Ritort, and C. Bustamante, Proc. Natl. Acad. Sci. USA 100, 12564 (2003).
G. Cowan, Statistical Data Analysis (Oxford University Press, 1998).
M. Mezei, J. Comput. Phys. 68, 237 (1987).
G. H. Paine and H. A. Scheraga, Biopolymers 24, 1391 (1985).
T. Huber, A. E. Torda, and W. F. van Gunsteren, J. Comput.-Aided Mol. Des. 8, 695 (1994).
S. Marsili, A. Barducci, R. Chelli, P. Procacci, and V. Schettino, J. Phys. Chem. B 110, 14011 (2006).
M. Watanabe and W. P. Reinhardt, Phys. Rev. Lett. 65, 3301 (1990).
N. G. Van Kampen, Stochastic Processes in Physics and Chemistry (North-Holland, 1992).
M. Iannuzzi, A. Laio, and M. Parrinello, Phys. Rev. Lett. 90, 238302 (2003).
V. Babin, C. Roland, T. A. Darden, and C. Sagui, J. Chem. Phys. 125, 204909 (2006).
L. B. Lucy, Astron. J. 82, 1013 (1977).
W. G. Hoover and C. G. Hoover, Phys. Rev. E 73, 016702 (2006).
D. J. Earl and M. W. Deem, J. Phys. Chem. B 109, 6701 (2005).
C. Zhou and R. N. Bhatt, Phys. Rev. E 72, 025701(R) (2005).
R. E. Belardinelli and V. D. Pereira, Phys. Rev. E 75, 046701 (2007).
A. Barducci, G. Bussi, and M. Parrinello, Phys. Rev. Lett. 100, 020603 (2008).
P. Raiteri, F. L. Gervasio
190. m acc don. These are described in the following paragraphs.
EXAMPLES
Residue topology of amino acid valine:
    RESIDUE val Total Charge 0.0
    atoms
    group
    n    n   0.41570
    hn   h   0.27190
    ca   ct  0.08750
    ha   h1  0.09690
    group
    cb   ct  0.29850
    hb   hc  0.02970
    group
    cg1  ct  0.31920
    hg11 hc  0.07910
    hg12 hc  0.07910
    hg13 hc  0.07910
    group
    cg2  ct  0.31920
    hg21 hc  0.07910
    hg22 hc  0.07910
    hg23 hc  0.07910
    group
    c    c   0.59730
    o    o   0.56790
    end
    bonds
    cb ca    cg1 cb    cg2 cb    n hn    n ca    o c    c ca    ca ha    cb hb
    cg1 hg11    cg1 hg12    cg1 hg13    cg2 hg21    cg2 hg22    cg2 hg23
    end
    imphd
    ca n hn ca n o
    end
    termatom n c
    backbone n ca c
    END
Input to ORAC: Force Field & Topology

atoms
NAME
atoms - Read the list of atoms forming the residue
SYNOPSIS
atoms
group
lab1 typ1 charge
...
group
...
end
DESCRIPTION
The command reads the list of atoms, and the corresponding charges (charge, in electron units), forming the residue. The list must contain the keyword group to define atomic groups and is terminated by end. lab1 and typ1 are both character strings not exceeding 7 characters and correspond to the atom label and type, respectively. While each atom type listed by atoms must be defined in the parameters file, each atom label uniquely defines a particular atom of the residue. ORAC expects the labels found in atoms to be consistent with those used in the input coordinates (i.e. in the PDB file). Atoms in between two consecutive group or betwee
191. m all M replicas, we first solve iteratively the system for all $Z_n$ (except for a multiplicative factor) with $1 < n \le M$. In doing this, the weights $W_n$ (including $n = 1$) are also determined. Finally, configurational averages at, e.g., the target distribution can be determined using all the REMD configurations by means of Eq.

Chapter 6
Serial generalized ensemble simulations

6.1 Introduction
A class of simulation algorithms closely related to REM (see Chapter 5) are the so-called serial generalized ensemble (SGE) methods [45]. The basic difference between SGE methods and REM is that in the former no pairs of replicas are necessary to make a trajectory in temperature space or, more generally, in the generalized ensemble space. In SGE methods only one replica can undergo ensemble transitions, which are realized on the basis of a Monte Carlo like criterion. The best-known example of an SGE algorithm is the simulated tempering (ST) technique [43, 46], where weighted sampling is used to produce a random walk in temperature space. An important limitation of SGE approaches is that an evaluation of free energy differences between ensembles is needed as input to ensure equal visitation of the ensembles, and eventually a faster convergence of structural properties [47]. REM was developed precisely to eliminate the need to know such free energy differences a priori. ST and temperature REM yield an extensive exploration of the phase space without configurational re st
192. m process. $W_{\Gamma(z_0)\to\Gamma(z_\tau)}$ is the work done on the system during the driven trajectory $\Gamma(z_0) \to \Gamma(z_\tau)$; $p(\Gamma(z_0) \to \Gamma(z_\tau))$ is the joint probability of taking the microstate $\Gamma(z_0)$ from a canonical distribution with a given initial Hamiltonian $H(z = z_0)$ and of performing the forward transformation to the microstate $\Gamma(z_\tau)$ corresponding to a different Hamiltonian $H(z = z_\tau)$; $p(\Gamma(z_0) \leftarrow \Gamma(z_\tau))$ is the analogous joint probability for the time-reversed path, producing the work $W_{\Gamma(z_0)\leftarrow\Gamma(z_\tau)} = -W_{\Gamma(z_0)\to\Gamma(z_\tau)}$; and $\Delta F = F(z = z_\tau) - F(z = z_0)$ is the free energy difference between the thermodynamic states associated with the Hamiltonians $H(z = z_\tau)$ and $H(z = z_0)$. Although the CT can be stated in a more general formulation (see Gavin Crooks' PhD thesis), the essential assumptions are that i) the system is deterministic and satisfies time reversal symmetry, and ii) the reverse trajectory is done following a reversed time schedule, such that $W_{\Gamma(z_0)\leftarrow\Gamma(z_\tau)} = -W_{\Gamma(z_0)\to\Gamma(z_\tau)}$. The first assumption is satisfied by any kind of standard MD equations of motion (Newtonian, Nosé-Hoover, Parrinello-Rahman), while the second condition can be easily imposed in an SMD experiment. A very simple proof of Eq. 8.2 goes as follows: suppose that $\Gamma(z_0)$ is drawn from a canonical distribution and that the driven trajectory that brings the system to $\Gamma(z_\tau)$ is done adiabatically, i.e. removing the thermal bath. For the reverse trajectory, drawing $\Gamma(z_\tau)$ from a canonical distribution, due to the time reversal
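A consequence of the theorem discussed here is its work-distribution form, $P_F(W)/P_R(-W) = \exp[\beta(W - \Delta F)]$, obtained by collecting all trajectories that yield the same work. The Python sketch below checks this relation for the special case of Gaussian work distributions, where it fixes both $\Delta F$ and the mean of the reverse distribution. The numerical values (temperature, mean and spread of the forward work) are hypothetical and serve only as an illustration.

```python
import math

def log_gauss(x, mean, sigma):
    """Logarithm of a normalized Gaussian density."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

beta = 1.0 / 2.494        # 1/(k_B T) in mol/kJ near 300 K (illustrative value)
mu_f, sigma = 12.0, 3.0   # hypothetical forward work mean and spread, kJ/mol

# For Gaussian work, the theorem fixes Delta F and the reverse distribution.
delta_f = mu_f - 0.5 * beta * sigma ** 2
mu_r = -(mu_f - beta * sigma ** 2)   # mean of the reverse work W_R

def crooks_log_ratio(w):
    """ln[P_F(W) / P_R(-W)] for the two Gaussians defined above."""
    return log_gauss(w, mu_f, sigma) - log_gauss(-w, mu_r, sigma)

# Left column: ln of the probability ratio; right column: beta * (W - Delta F).
for w in (5.0, 10.0, 15.0):
    print(w, crooks_log_ratio(w), beta * (w - delta_f))
```

The two printed columns agree for every W, and the forward and reverse work distributions cross at W = ΔF, which is the property exploited when estimating free energies from bidirectional SMD experiments.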
193. me on off

ERF_CORR
NAME
ERF_CORR - Implements intramolecular Ewald correction
SYNOPSIS
ERF_CORR nbin rlow rup
DESCRIPTION
Adds the correction of Eq. 4.47, evaluated only for excluded intramolecular contacts (stretching, bending and the fudged part of 1-4 interactions), to account for the reciprocal space cutoff error. The function $\chi(r, \alpha)$ is B-splined using nbin points in the range rlow < r < rup.
EXAMPLES
    ERF_CORR 2000 0.8 4.5
WARNINGS
Choose rlow and rup carefully. If an intramolecular distance outside this range is found during execution, unpredictable results may occur.

ERFC_SPLINE
NAME
ERFC_SPLINE - Use a spline to compute the complementary error function used for electrostatics in direct space
SYNOPSIS
ERFC_SPLINE erfc_bin
ERFC_SPLINE erfc_bin corrected rcut
Input to ORAC: &POTENTIAL
DESCRIPTION
By default ORAC uses a 5-parameter expansion to compute the complementary error function required by the direct space electrostatic potential $V_{qd}$ in Eq. 2.20. With the command ERFC_SPLINE this expansion is replaced by a B-spline. The function erfc(x) is splined from x = 0 to x = 1.1 $\alpha r_{cut}$, where $\alpha$ and $r_{cut}$ are the Ewald sum parameter and the radial cutoff, respectively. The argument erfc_bin is the bin size of the spline. The ERFC_SPLINE option is useful when running on workstations, where a saving of 10-15% in CPU time is usually obtained. ERFC_SPLINE may also be used to speed up the Ewald method
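The idea behind tabulating the complementary error function on a uniform grid of bin size erfc_bin can be sketched in Python as follows. This is not ORAC's implementation: for brevity the sketch replaces the B-spline with plain linear interpolation, and the Ewald parameter, the cutoff and the upper limit of the table are assumed values.

```python
import math

def build_erfc_table(x_max, bin_size):
    """Tabulate erfc on a uniform grid that covers [0, x_max]."""
    n = int(math.ceil(x_max / bin_size)) + 1
    return [math.erfc(i * bin_size) for i in range(n + 1)]

def erfc_lookup(x, table, bin_size):
    """Linear interpolation between the two neighbouring table entries."""
    i = int(x / bin_size)
    frac = x / bin_size - i
    return (1.0 - frac) * table[i] + frac * table[i + 1]

alpha, rcut = 0.4, 10.0   # hypothetical Ewald parameter (1/A) and cutoff (A)
bin_size = 0.01           # plays the role of erfc_bin
table = build_erfc_table(1.1 * alpha * rcut, bin_size)

# Largest deviation from the library erfc over a fine scan of [0, 4).
worst = max(abs(erfc_lookup(0.001 * k, table, bin_size) - math.erfc(0.001 * k))
            for k in range(4000))
print(worst)
```

A smaller bin size reduces the interpolation error at the cost of a larger table; a cubic spline, as used by ORAC, reaches a given accuracy with far fewer points than linear interpolation.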
194. mentation, the charges and the Lennard-Jones potential can be switched on and off independently by setting up different time protocols for the $\eta$ and $\lambda$ alchemical coordinates. Such an approach is much more flexible and powerful than one based on the definition of a single alchemical parameter, which implies the simultaneous variation of Lennard-Jones and electrostatic interactions. If the $\eta$ and $\lambda$ factors are varied coherently (i.e. only one type of alchemical coordinate $\lambda$ is defined), catastrophic numerical instabilities may arise, especially in complex solutes with competing conformational structures. One way to circumvent this problem is to switch electrostatic and Lennard-Jones interactions separately, as we do here. For the evaluation of the solvation free energy via alchemical transformations, the target end states are i) the decoupled solute in the gas phase and the pure solvent in the liquid state, and ii) the solution. For the decoupled state i), in principle two independent standard simulations are needed, one for the isolated solute and the other for the pure solvent. However, the decoupled state can be sampled in one single simulation using the non-bonded energy of Eq. by setting the alchemical solute $\lambda$ and $\eta$ factors all equal to one. In fact, according to Eq. 9.6 and to the rules of Table, when the alchemical solute $\lambda$ and $\eta$ terms are all equal to one, the solute is not felt by any means by the solvent and evolves in time independently, subject only t
195. method, called BAR-SGE, is based on a generalized expression [115] of the Bennett Acceptance Ratio [117] (BAR) and free energy perturbation [118]. It is asymptotically exact and requires a low computational time per updating step. The algorithm is suited not only to calculate the free energy on the fly during the simulation, but also as a possible criterion to establish whether equilibration has been reached.

6.2 Fundamentals of serial generalized ensemble methods
SGE methods deal with a set of N ensembles associated with different dimensionless Hamiltonians $h_n(x,p)$, where $x$ and $p$ denote the atomic coordinates and momenta of a microstate, and $n = 1, 2, \ldots, N$ denotes the ensemble. Each ensemble is characterized by a partition function expressed as
$$Z_n = \int e^{-h_n(x,p)}\, dx\, dp \qquad (6.1)$$
In ST simulations we have temperature ensembles, and therefore the dimensionless Hamiltonian is
$$h_n(x,p) = \beta_n H(x,p) \qquad (6.2)$$
where $H(x,p)$ is the original Hamiltonian and $\beta_n = (k_B T_n)^{-1}$, with $k_B$ being the Boltzmann constant and $T_n$ the temperature of the nth ensemble. If we express the Hamiltonian as a function of $\lambda$, namely a parameter correlated with an arbitrary collective coordinate of the system (or even corresponding to the pressure), then the dimensionless Hamiltonian associated with the nth $\lambda$-ensemble is
$$h_n(x,p) = \beta H(x,p;\lambda_n) \qquad (6.3)$$
Here all ensembles have the same temperature. It is also possible to construct a generalized ensemble for multiple parameters [119] as In
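How the dimensionless Hamiltonians of Eq. 6.2 enter an ensemble transition can be sketched in Python. The sketch assumes the standard simulated-tempering Metropolis rule, acc = min(1, exp[h_n - h_m + f_m - f_n]), with weight factors f_n set to the dimensionless free energies; this rule is an assumption of the example, since the acceptance ratio itself is not part of this passage, and the function name and numerical values are hypothetical.

```python
import math

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def st_acceptance(energy, t_n, t_m, f_n, f_m):
    """Acceptance probability for a move from ensemble n to ensemble m.

    Uses h_k = beta_k * H, so that h_n - h_m = (beta_n - beta_m) * H,
    and the weight factors f_n, f_m (assumed to be dimensionless free energies).
    """
    beta_n = 1.0 / (KB * t_n)
    beta_m = 1.0 / (KB * t_m)
    exponent = (beta_n - beta_m) * energy + (f_m - f_n)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)

# Without weights, a low-energy state almost never moves up in temperature.
print(st_acceptance(-1000.0, 300.0, 310.0, 0.0, 0.0))
# A suitable weight difference restores a sizeable acceptance.
print(st_acceptance(-1000.0, 300.0, 310.0, 0.0, 12.0))
```

The comparison of the two calls shows why the weight factors are needed as input: with f_m - f_n = 0 the walk in temperature space is essentially frozen.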
196. mmand defines the Bravais lattice type to be used when generating a solvent lattice with GENERATE. type may be BCC, FCC or SC, corresponding to Body Centered Cubic, Face Centered Cubic and Simple Cubic lattices, respectively.
EXAMPLES
    &SOLVENT
    CELL SC
    GENERATE RANDOMIZE 4 4 4
    &END

COORDINATES
NAME
COORDINATES - Define the coordinates of a solvent molecule
SYNOPSIS
COORDINATES filename
DESCRIPTION
Read the coordinates of the solvent molecule in PDB format from file filename.
EXAMPLES
    &SETUP
    CRYSTAL 20.00 20.00 20.00 90.0 90.0 90.0
    &END
    &SOLVENT
    CELL SC
    INSERT 1.5
    COORDINATES solvent.pdb
    GENERATE RANDOMIZE 4 4 4
    &END
In this example the coordinates of the solvent are read in from the file solvent.pdb (see SOLVENT). This input would produce 64 solvent molecules in a box of 20 x 20 x 20 Å. For generating solvent in the presence of the solute, see COORDINATES (&SOLUTE).

GENERATE
NAME
GENERATE - Replicate solvent molecules
SYNOPSIS
GENERATE [RANDOMIZE] ia ib ic
DESCRIPTION
This command is used to generate a lattice of ia x ib x ic cells belonging to the Bravais lattice specified in the command CELL. The optional string RANDOMIZE is used for assigning a random rotation to each solvent molecule in the lattice.
EXAMPLES
Input to ORAC: &SOLVENT
    &SOLVENT
    CELL SC
    GENERATE RANDOMIZE 4 4 4
    &END
The elementary cell is simple cubic with one molecule per unit ce
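The molecule count quoted in the example (64 molecules for a 4 x 4 x 4 simple cubic lattice) follows directly from the number of lattice sites per conventional cell. The Python sketch below generates the centre-of-mass sites for the three lattice types; it is only an illustration of the geometry, not ORAC's generator, and it assumes an orthogonal box and omits the random rotation applied by RANDOMIZE.

```python
from itertools import product

# Fractional positions of the molecules inside one conventional cubic cell.
BASIS = {
    "SC":  [(0.0, 0.0, 0.0)],
    "BCC": [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)],
    "FCC": [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)],
}

def lattice_sites(cell_type, ia, ib, ic, box):
    """Centre-of-mass sites for ia x ib x ic cells in an orthogonal box (A)."""
    a, b, c = box[0] / ia, box[1] / ib, box[2] / ic
    sites = []
    for i, j, k in product(range(ia), range(ib), range(ic)):
        for fx, fy, fz in BASIS[cell_type]:
            sites.append(((i + fx) * a, (j + fy) * b, (k + fz) * c))
    return sites

sites = lattice_sites("SC", 4, 4, 4, (20.0, 20.0, 20.0))
print(len(sites))  # 64
```

With the same 4 x 4 x 4 replication, a BCC cell would give 128 sites and an FCC cell 256, since they hold two and four lattice points per conventional cell.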
197. &REM

SEGMENT
NAME
SEGMENT - Define the "solute" in solute tempering simulations
SYNOPSIS
SEGMENT
...
END
DESCRIPTION
This structured command is used to define the "solute" in a solute tempering REM simulation and to assign the scaling factors for the Hamiltonian REM simulation (see Sec.) to the intrasolute, solute-solvent and solvent-solvent interactions. The following subcommands may be specified within SEGMENT: define, kind.
- define n1 n2
  The define command is used to crop a piece of solute for Hamiltonian scaling in a REMD simulation. Up to 10 define commands can be used, cropping 10 disconnected (non-overlapping) parts of the solute. n1 and n2 are the atom indices of the selected solute part. The numeric order of the atoms is that specified in the topology file (see Sec. 10.3).
- kind inter_type
  Once the solute has been defined using the define subcommand, the subcommand kind is used to scale the solute-solute and solute-solvent interactions. Possible choices for the string inter_type are intra and inter. intra means that the non-bonded energy scaling (see SETUP command) is applied to the intrasolute non-bonded interactions only, i.e. solute-solvent interactions are not scaled, where by "solvent" we mean the actual solvent and the solute atoms which were not selected using the define subcommand. inter scales only solute-non-solute (i.e. solvent) non-bonded interactions. Intrasolute interact
198. mple the four sulphur bridges of hen egg lysozyme are given.
WARNINGS
This command is inactive when used in conjunction with READ_TPGPRM. To have the desired effect, the ADD_TPG environment must be used in conjunction with READ_TPG_ASCII and READ_PRM_ASCII.

JOIN
NAME
JOIN - Provide the list of residues forming the current solute or solvent molecule(s)
SYNOPSIS
JOIN [SOLUTE | SOLVENT]
...
END
DESCRIPTION
The structured command JOIN reads the sequential list of labels corresponding to the residues forming the solute molecule(s). The list of residues begins at the line following JOIN. The end of this list is signaled by END on the line following the last residue label. Each residue label must have been defined in the general formatted topology file read by READ_TPG_ASCII. See Sec. 10.3 for explanations.
EXAMPLES
    JOIN SOLUTE
    lys h val phe gly arg cys glu leu ala ala ala met lys arg hsd gly leu asp asn
    tyr arg gly tyr ser leu gly asn trp val cys ala ala lys phe glu ser asn phe asn
    thr gln ala thr asn arg asn thr asp gly ser thr asp tyr gly ile leu gln ile asn
    ser arg trp trp cys asn asp gly arg thr pro gly ser arg asn leu cys asn ile pro
    cys ser ala leu leu ser ser asp ile thr ala ser val asn cys ala lys lys ile val
    ser asp gly asn gly met asn ala trp val ala trp arg asn arg cys lys gly thr asp
    val gln ala trp ile arg gly cys arg leu o
    END
Sequence of residues for hen egg lysozyme. All labels must have been defined in the genera
199. n Eq. 6.19 we have only reported the explicit time dependence of the temperature. Moreover, we have considered thermal changes alone, using constant-volume, constant-temperature equations of motion. Extending the treatment to constant-pressure, constant-temperature algorithms and to systems subject to generic $\lambda$ (e.g. mechanical) changes is straightforward [116]. Note that, when no changes are externally applied to the system, $H'$ is exactly the quantity conserved during an equilibrium constant-volume, constant-temperature simulation. Accordingly, the work W is zero. The above definition of generalized dimensionless work is valid for arbitrary values of $\tau$. In the special case of instantaneous thermal changes and instantaneous variations of the microstate variables, as occurs in ST simulations, the times 0 and $\tau$ in Eq. 6.18 refer to the states instantaneously before and after the $(x,p)_n \to (x',p')_m$ transition, respectively. Therefore, according to the notation introduced above, Eq. 6.18 can be rewritten as Eq. 6.20, where $x_t$ and $x_t'$ are the values of the configurational thermostat variables before and after the $(x,p)_n \to (x',p')_m$ transition, respectively. In the first two terms of the right-hand side of Eq. 6.20 we can recognize the dimensionless Hamiltonians $h_m(x',p',p_t')$ and $h_n(x,p,p_t)$. It is important to observe that in generalized ensemble simulations an arbitrary change of $x_t$ during a transition does not affect the acceptance ra
200. n a group and the final end form the atomic group.
EXAMPLES
    atoms
    group
    n   n   0.41570
    hn  h   0.27190
    ca  ct  0.08750
    ha  h1  0.09690
    group
    cb  ct  0.29850
    hb  hc  0.02970
    end
WARNINGS
The keyword atoms must appear at the beginning of the RESIDUE environment.

rigid
NAME
rigid - Define a rigid unit
SYNOPSIS
rigid
DESCRIPTION
Not supported.

bonds
NAME
bonds - Read list of bonds
SYNOPSIS
bonds
lab1 lab2 lab3 lab4 ...
end
DESCRIPTION
The keyword is used to define a list of covalent bonds among the atoms forming the residue. The list is terminated by end. On the lines following bonds, a series of pairs of atom labels is expected. In the synopsis, atom lab1 is covalently bound to atom lab2, and lab3 to lab4. The labels appearing in input to bonds must be defined in the atom list given with the command atoms.
EXAMPLES
    bonds
    n ca    o c    c ca    ca ha    cg2 hg22    cg2 hg23
    end
WARNINGS
The keyword atoms must appear before bonds.

omit_angles
NAME
omit_angles - Provide a list of angle bendings to omit
SYNOPSIS
omit_angles
lab1 lab2 lab3 lab4 lab5 lab6 ...
end
DESCRIPTION
Given the list of bonds for the solute molecule(s), ORAC generates all possible angle bendings. The keyword omit_angles allows the deletion of any angle bending from the residue angle bendings list. Following the line with omit_angles, a series of triplets of atom labels is expected. In the
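201. n by Eq. 3.42. The two pressures, Eqs. 3.41 and 3.42, differ instantaneously. Should the difference persist after averaging, it would be obvious that the equilibrium thermodynamic state in the NPT ensemble depends on the scaling method. The two formulas are fortunately equivalent. To prove this statement, we closely follow the route proposed by H. Berendsen and reported by Ciccotti and Ryckaert, and use Eqs. to rearrange Eq. 3.42. We obtain
$$\sum_i \mathbf{R}_i \cdot \mathbf{F}_i = \sum_i \frac{1}{M_i} \sum_{kl} m_{ik}\, \mathbf{r}_{ik} \cdot \mathbf{f}_{il} \qquad (3.46)$$
Adding and subtracting $m_{ik}\, \mathbf{r}_{il} \cdot \mathbf{f}_{il}$ we get
$$\sum_i \mathbf{R}_i \cdot \mathbf{F}_i = \sum_i \frac{1}{M_i} \sum_{kl} \left[ m_{ik} (\mathbf{r}_{ik} - \mathbf{r}_{il}) \cdot \mathbf{f}_{il} + m_{ik}\, \mathbf{r}_{il} \cdot \mathbf{f}_{il} \right] \qquad (3.47)$$
which can be rearranged as
$$\sum_i \mathbf{R}_i \cdot \mathbf{F}_i = \frac{1}{2} \sum_i \frac{1}{M_i} \sum_{kl} (\mathbf{r}_{ik} - \mathbf{r}_{il}) \cdot (m_{ik} \mathbf{f}_{il} - m_{il} \mathbf{f}_{ik}) + \sum_{il} \mathbf{r}_{il} \cdot \mathbf{f}_{il} \qquad (3.48)$$
Using Newton's law $\mathbf{f}_{ik} = m_{ik} \mathbf{a}_{ik}$, where $\mathbf{a}_{ik}$ is the acceleration, we obtain
$$\sum_i \mathbf{R}_i \cdot \mathbf{F}_i = \frac{1}{2} \sum_i \sum_{kl} \frac{m_{ik} m_{il}}{M_i} (\mathbf{r}_{ik} - \mathbf{r}_{il}) \cdot (\mathbf{a}_{il} - \mathbf{a}_{ik}) + \sum_{il} \mathbf{r}_{il} \cdot \mathbf{f}_{il} \qquad (3.49)$$
The first term in the above equation can be decomposed according to
$$(\mathbf{r}_{ik} - \mathbf{r}_{il}) \cdot (\mathbf{a}_{il} - \mathbf{a}_{ik}) = \frac{d}{dt} \left[ (\mathbf{r}_{ik} - \mathbf{r}_{il}) \cdot (\mathbf{v}_{il} - \mathbf{v}_{ik}) \right] + (\mathbf{v}_{il} - \mathbf{v}_{ik})^2 \qquad (3.50)$$
The first (derivative) term on the right-hand side is rigorously zero for rigid molecules or rigid groups, and is zero on average for flexible molecules or groups, assuming that the flexible molecules or groups do not dissociate. This can be readily seen, in the case of ergodic systems, by directly evaluating the average of this derivative as
$$\left\langle \frac{d}{dt} \left[ (\mathbf{r}_{ik} - \mathbf{r}_{il}) \cdot (\mathbf{v}_{il} - \mathbf{v}_{ik}) \right] \right\rangle = \lim_{\tau \to \infty} \frac{1}{\tau} \int_0^{\tau} \frac{d}{dt} \left[ (\mathbf{r}_{ik} - \mathbf{r}_{il}) \cdot (\mathbf{v}_{il} - \mathbf{v}_{ik}) \right] dt \qquad (3.51)$$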
202. n by a temperature. In its simplest implementation, the potential energies of the replicas differ by a scaling factor $c_m$, with $c_1 = 1$ for the target replica. Clearly, as long as the exchanged states differ only in the coordinates (i.e. momenta are not exchanged), the scaling of the potential energy of a canonical system (NVT) is equivalent to an inverse temperature scaling. Thus, Hamiltonian REM with full potential energy scaling and temperature REM are perfectly equivalent in an extended Monte Carlo simulation. When the replica simulations are done by numerically integrating the Nosé-Hoover equations of motion at constant volume, Hamiltonian REM with full potential energy scaling and temperature REM are clearly no longer equivalent. Since momenta are not exchanged, Eq. 5.6 is valid for both full Hamiltonian REM and temperature REM, but in the latter technique both the kinetic and the potential energy are scaled, while in the former, as implemented in ORAC, only the potential energy is scaled. The advantage of using Hamiltonian REM is two-fold: i) as all the replicas have the same operating temperature, one does not have to reinitialize the velocities after a successful configuration exchange, as in temperature REM, and ii) since the mean atomic velocities are the same throughout the extended system, one does not have to adapt the time step size to preserve the quality of the r-RESPA integrator, as should be done in temperature REM. Hamiltonian RE
204. n regarding the solute topology and force field is written only if the solute topology and parameter list is actually computed and not read from a binary file, i.e. READ_TPGPRM_BIN in &PARAMETERS must be inactive. The debug_type string can be:
residue_sequence  The residue sequence is printed.
Input to ORAC: &RUN
bond_table  Details about bonds and corresponding stretching parameters are printed.
bend_table  Details about bendings and corresponding parameters are printed.
ptors_table  Details about proper torsions and corresponding parameters are printed.
pitors_table  Details about improper torsions and corresponding parameters are printed.
EXAMPLES
    DEBUG all
print out all tables.
    DEBUG bond_table
    DEBUG bend_table
    DEBUG residue_sequence
print out the bond and bending tables and the residue sequence.

MAXRUN
NAME
MAXRUN - Provide the maximum simulation length (in fs)
SYNOPSIS
MAXRUN fmaxrun
DESCRIPTION
This command controls the total length of the direct access file. The number of records nrec initialized by the DUMP (&INOUT) command is determined by fmaxrun, natoms and atom_rec, where natoms and atom_rec are the total number of atoms in the system and the atoms per record, respectively. fmaxrun cannot be less than ftime (see command TIME in this environment).
EXAMPLES
    MAXRUN 500000.0

PRINT
NAME
PRINT - Print instantaneous results
SYNOPSIS
PRINT fprint
DESCRIPTION
ORAC writes the instantaneou
205. n the NVT ensemble and isothermal-isobaric simulation in the NPT ensemble. As we shall see, the dynamics of the real system generated by the extended system method is never Hamiltonian. Hence, symplecticness is no longer an inherent property of the equations of motion. Nonetheless, the Liouvillean formalism developed in the preceding section turns out to be very useful for the derivation of multiple time step reversible integrators for a general isothermal-isobaric ensemble with anisotropic stress, or $N\mathbf{P}T$ (1). This extended system is the most general among all non-microcanonical simulations. The NPT, NPH, NVT and even NVE ensembles may be derived from this Lagrangian by imposing special constraints and/or choosing appropriate parameters.

3.1 The Parrinello-Rahman-Nosé Extended Lagrangian
The starting point of our derivation of the multilevel integrator for the $N\mathbf{P}T$ ensemble is the Parrinello-Rahman-Nosé Lagrangian for a molecular system with N molecules or groups (2), each containing $n_i$ atoms and subject to a potential V. In order to construct the Lagrangian we define a coordinate scaling and a velocity scaling, i.e.
$$r_{ik\alpha} = R_{i\alpha} + l_{ik\alpha} = \sum_\beta h_{\alpha\beta} S_{i\beta} + l_{ik\alpha} \qquad (3.1)$$

(1) When P is not in boldface, we imply that the stress is isotropic.
(2) For large molecules it may be convenient to further subdivide the molecule into groups. A group therefore encompasses a conveniently chosen subset of the atoms of the molecule.
206. nd so on. In a similar manner, many different contributions can be identified in the diffusional regime. In a standard integration of the Newtonian equations, all these motions, irrespective of their time scales, are advanced using the same time step, whose size is inversely proportional to the frequency of the fastest degree of freedom present in the system and is therefore on the order of the femtosecond or even less. This constraint on the step size severely limits the accessible simulation time. One common way to alleviate the problem of the small step size is to freeze some supposedly irrelevant and fast degrees of freedom in the system. This procedure relies on the so-called SHAKE algorithm [9, 70], which implements holonomic constraints while advancing the Cartesian coordinates. Typically, bonds and/or bendings are kept rigid, thus removing most of the high frequency density of states and allowing a moderate increase of the step size. The SHAKE procedure changes the input potential and therefore the output density of states. Freezing degrees of freedom therefore requires, in principle, an a priori knowledge of the dynamical behavior of the system. SHAKE is in fact fully justified only when the suppressed degrees of freedom do not mix with the relevant degrees of freedom. This might be almost true for fast stretchings involving hydrogen, which approximately define an independent subspace of internal coordinates in almost all complex molecular system
207. nfinity, the NPH equations of motion are obtained. Finally, setting both W and Q to infinity, the NVE equations of motion are recovered. Switching to the NPT isotropic stress ensemble is less obvious. One may define the kinetic term associated with the barostat in the extended Lagrangian as
$$\frac{1}{2} \sum_{\alpha\beta} W_{\alpha\beta}\, \dot{h}_{\alpha\beta}^2 \qquad (3.79)$$
such that a different inertia may in prin

(10) There are also other, less material, reasons to prefer molecular scaling: atomic scaling and molecular scaling yield different dynamical properties because the equations of motion are different. Dynamical data computed via extended system simulations should always be taken with caution. With respect to pure Newtonian dynamics, however, the NPT dynamical evolution is only slightly modified by a barostat coupled to the molecular centers of mass, but is brutally damaged when the barostat is coupled to the fast degrees of freedom. For example, in liquid flexible nitrogen at normal pressure and 100 K, atomic scaling changes the internal frequency by 20 cm$^{-1}$, while no changes are detected when the barostat is coupled to the centers of mass.
(11) The value of W which works as "infinity" depends on the force that is acting on the barostat coordinate, expressed by Eq. 3.25, i.e. on how far the system is from thermodynamic equilibrium. For a system near thermodynamic equilibrium with $N_f$ = 10000, a value of W = $10^7$ a.m.u. is sufficient to prevent cell fluctuations.
208. ng the energies:
    WRITE(ktest,300) tim,utot,ustot,uptot,upstot,ektot,pottot
    300 FORMAT(' TotalEnergy',f12.3,6f15.3)
where tim, utot, ustot, uptot, upstot, ektot, pottot are the values of the time, total energy, solvent potential energy, solute potential energy, solvent-solute potential energy, total kinetic energy and total potential energy. Time is given in fs and all energies in kJ/mol. The energy conservation ratio R = $\Delta E / \Delta K$ and the drift D are printed periodically (every 1000 $\Delta t_n$) and at the end of the simulation onto the file filename.
- dirty
  Obsolete, unsupported. Scales velocities to the initial total energy E(0) during the production stage. The scaling is done randomly with a Monte Carlo algorithm.
- p_test n1 n2 n3 n4 n5
  Diagnostic, unsupported. To be used in conjunction with subcommand test_times: prints out the time record of the subsystem potentials and forces for the protein, for atoms n1 n2 n3 n4 n5.
- s_test n1 n2 n3
  Diagnostic, unsupported. To be used in conjunction with subcommand test_times: prints out the time record of the subsystem potentials and forces for the solvent, for atoms n1 n2 n3.
- very_cold_start rmax
  This option is useful when minimizing a protein in a highly unfavorable configuration. The real argument rmax is the maximum allowed displacement (in Å) for any atom when integrating the equations of motion, irrespective of the intensity of the force on that atom. This constraint avoids blowing up of the sim
209. ng the trajectories generated by the accurate integrator E. We see that R3 and S1 generate the same spectral profile within statistical error. In contrast, especially in the region above 800 wavenumbers, S generates a spectrum which differs appreciably from the exact one. This does not mean, of course, that S is unreliable for the relevant torsional degrees of freedom. Simply, we cannot a priori exclude that keeping all bonds rigid will have an impact on the equilibrium structure of the alkane molecules and on torsional dynamics. Actually, in the present case, as far as torsional motions are concerned, all three integrators produce essentially identical results. In 20 picoseconds of simulation, R3, S1 and S predicted 60, 61 and 60 torsional jumps, respectively, against the 59 jumps obtained with the exact integrator E. According to the prescription of Ref. [84], in order to avoid period doubling, we compute the power spectrum of torsional motion from the autocorrelation function of the vector product of two normalized vectors perpendicular to the dihedral planes. Rare events such as torsional jumps produce large-amplitude, long-time-scale oscillations in the time autocorrelation function, and therefore their contribution overwhelms the spectrum, which appears as a single broadened peak around zero frequency. For this reason, all torsions that underwent a barrier crossing were discarded in the computation of the power spectrum. The power spectrum of the torsion
210. nian (see Eq. 6.16). However, reweighting schemes are available to recover the unbiased PMF along the real coordinate [52, 53, 125]. We will see later how $f_m$ and $f_n$ are determined.

6.3 The algorithm for optimal weights
6.3.1 Tackling free energy estimates
The algorithm used to calculate the optimal weight factors, namely the dimensionless free energy differences between ensembles (see Sec. 6.2), is based on the Bennett acceptance ratio [117, 65] and on the free energy perturbation formula [118]. We start by showing that the difference between the dimensionless Hamiltonians appearing in the acceptance ratio (see Eq. 6.8) can be viewed as the generalized dimensionless work done on the system during the transition $(x,p)_n \to (x',p')_m$. The concept of generalized dimensionless work in systems subject to mechanical and thermal nonequilibrium changes has been extensively discussed in the literature [115, 116]. In particular, it has been shown (see Eq. 45 of Ref. [116]) that, in a nonequilibrium realization performed with extended-Lagrangian molecular dynamics [90], the generalized dimensionless work is
$$W = \beta_\tau H'(\tau) - \beta_0 H'(0) \qquad (6.18)$$
where $\tau$ is the duration of the realization and
$$H'(\tau) = H(x,p,p_t) + k_B T_\tau V(x_t) \qquad (6.19)$$
where $H(x,p,p_t)$ is defined in Eq. 6.13 and $V(x_t)$ is a linear function of the configurational variables $x_t$ associated with the thermostat (see Eq. 42 of Ref. [116]). For simplicity, i
211. ning the grid are 45, 32, 45 along the a, b, c crystal axes, respectively. Typically, an acceptable relative accuracy ($10^{-4}$ - $10^{-5}$) on electrostatic energies and forces is obtained with a grid spacing of about 1 - 1.2 Å along each dimension. In this example the second invocation of the EWALD command is used in order to remove the linear momentum of the MD cell.

GROUP_CUTOFF
NAME
GROUP_CUTOFF
WARNINGS
Obsolete, unsupported.

H-MASS
NAME
H-MASS - Change the hydrogen mass
SYNOPSIS
H-MASS hdmass
DESCRIPTION
The command H-MASS changes the mass of all solute hydrogens to hdmass, given in a.m.u. This allows the use of larger time steps during equilibration.
EXAMPLES
    H-MASS 10.0

I-TORSION
NAME
I-TORSION - Set the type of improper torsion potential
SYNOPSIS
I-TORSION itor_type
DESCRIPTION
This command defines which improper torsion potential must be used. If the argument string itor_type is HARMONIC, a harmonic CHARMM-like potential function is chosen. Conversely, if the argument is COSINE, a sinusoidal AMBER-like potential function is chosen.
EXAMPLES
    I-TORSION HARMONIC

JORGENSEN
NAME
JORGENSEN - Allow Jorgensen-type interaction potentials
SYNOPSIS
JORGENSEN
DESCRIPTION
If the system is composed of solute molecules of the same type, it is sometimes useful to use different interaction parameters for intermolecular and intramolecular interactions. The command JORGENSEN is designed to
212. nly with the parallel version.
&RUN defines run time parameters which concern output printing and run averages.
&SETUP includes commands concerned with the simulation box setup. In this environment, the simulation cell parameters, dimensions and symmetry can be initialized. Moreover, files containing the system coordinates in appropriate format can be provided.
&SIMULATION includes commands which define the type of simulation that is to be carried out. In particular, commands are available to run steepest descent energy minimization and molecular dynamics simulations in various ensembles.
&SOLUTE includes commands which are concerned with specific aspects of the solute force field and structure.
&SOLVENT includes commands which are concerned with specific aspects of the solvent force field and structure.
&ST sets up the Serial Generalized Ensemble simulation (works with both serial and parallel versions).
&PARAMETERS includes commands which read the topology and force field parameter files of the solute. These files contain sufficient information to define the solute topology and to assign potential parameters to the solute molecules.

Commands supporting policy
Since ORAC is free of charge, no professional support is provided. Bugs are fixed, upon request to the E-mail address oracmaster cecdec cecam fr, at the authors' earliest convenience, and support may be requested only for environments, commands or sub-commands
213. nt. A file named agbnp param file must be in the current directory. The dielectric constant of the solvent continuum is set in that file. In the present release, AGBNP works only for constant volume minimization and with no &SOLVENT specification. EXAMPLES: MINIMIZE CG 0.00001 WRITE_GRADIENT END. MDSIM. NAME: MDSIM, run molecular dynamics simulations. SYNOPSIS: MDSIM. DESCRIPTION: Use this command to run a molecular dynamics simulation in any ensemble. It has no argument. DEFAULTS: MDSIM is the default. SCALE. NAME: SCALE, periodic temperature scaling. SYNOPSIS: SCALE fscale. DESCRIPTION: Use this command to periodically rescale the temperature with frequency fscale, in units of femtoseconds. Scaling stands here for random initialization of the system velocities at temperature temp according to a Gaussian distribution. EXAMPLES: SCALE 100.0 (reinitialize the system velocities every 100 fs). WARNINGS: Works only during the rejection phase (see REJECT in environment &RUN). SCALING. NAME: SCALING, choose scaling methods for constant pressure simulations. SYNOPSIS: SCALING MOLECULAR; SCALING GROUP; SCALING ATOMIC. DESCRIPTION: This command allows you to switch between scaling methods when running with a barostat (see the STRESS and ISOSTRESS directives in this environment). The scaling can be (i) molecular, with the barostat coupled to the center of mass of the molecules in th
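The velocity reinitialization described for SCALE can be pictured with the following minimal sketch. It is not ORAC's implementation: it simply assumes that each Cartesian velocity component is drawn from a Gaussian of variance k_B T / m, and the function names, the unit system (a.m.u., nm/ps, kJ/mol) and the random seed are illustrative assumptions.

```python
import math
import random

# Boltzmann constant in kJ/(mol K); with masses in a.m.u. velocities are in nm/ps.
# This unit choice is an assumption made for the sketch only.
KB = 0.0083144626


def draw_velocities(masses, temp, rng):
    """Draw one 3D velocity per atom from a Gaussian (Maxwell-Boltzmann) law."""
    velocities = []
    for m in masses:
        sigma = math.sqrt(KB * temp / m)  # standard deviation of each component
        velocities.append(tuple(rng.gauss(0.0, sigma) for _ in range(3)))
    return velocities


def kinetic_temperature(masses, velocities):
    """Instantaneous temperature from the kinetic energy (no constraints assumed)."""
    twice_ke = sum(m * sum(c * c for c in v) for m, v in zip(masses, velocities))
    return twice_ke / (3 * len(masses) * KB)


rng = random.Random(12345)          # fixed seed so the sketch is reproducible
masses = [12.011] * 20000           # hypothetical system of identical atoms
vels = draw_velocities(masses, 300.0, rng)
t_inst = kinetic_temperature(masses, vels)
print(f"instantaneous temperature after reinitialization: {t_inst:.1f} K")
```

For a large system the instantaneous temperature printed after the draw fluctuates around the target value, which is why such a reinitialization is only useful during equilibration.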
214. nt and the coordinates are read in from the file PDB. The residue sequence found in the PDB must match that given in the JOIN SOLUTE (&PARAMETERS) directive. If the environment &SOLUTE is entered, the solute is assumed present anyway and this command has no effect. SOLVENT. NAME: SOLVENT, reset to zero the position of the center of mass of the solvent atoms. SYNOPSIS: SOLVENT ON; SOLVENT OFF. DESCRIPTION: If ON is specified, ORAC assumes that a solvent is present and its coordinates are read in from the file PDB specified by the directive READ_PDB in this environment. This command is not mandatory: if the solvent is present, the environment &SOLVENT, which has the same effect as SOLVENT ON, must be entered anyway in order to specify how to generate the solvent or the number of solvent molecules in the PDB file. When SOLVENT OFF is specified, the namelist &SOLVENT must be omitted. EXAMPLES: &SETUP READ_PDB solvent.pdb SOLVENT ON &END &SOLVENT ADD_UNITS 432 &END &PARAMETERS JOIN SOLVENT hoh END &END. A solvent of 432 molecules is present and the coordinates are read in from the file PDB. The residue sequence for the solvent found in the PDB must match that given in the JOIN SOLVENT (&PARAMETERS) directive. If a solute is also present and its coordinates are given in the PDB file specified by the READ_PDB command, then the coordinates of the solvent molec
215. ntary error function (erf(x) = 1 - erfc(x)), V the unit cell volume, m a reciprocal lattice vector and α is the Ewald convergence parameter. In the direct lattice part, Eq. 4.20, the prime indicates that intramolecular excluded contacts are omitted. In addition, in Eq. the term V_intra subtracts, in direct space, the intramolecular energy between bonded pairs which is automatically included in the right hand side of that equation. Consequently, the summation on i and j in Eq. goes over all the excluded intramolecular contacts. We must point out that in the Ewald potential given above we have implicitly assumed the so-called tin-foil boundary conditions: the Ewald sphere is immersed in a perfectly conducting medium and hence the dipole term on the surface of the Ewald sphere is zero [32]. For increasingly large systems, the computational cost of standard Ewald summation, which scales with N^2, becomes too large for practical applications. Alternative algorithms which scale with a smaller power of N than standard Ewald have been proposed in the past. Among the fastest algorithms designed for periodic systems is the particle mesh Ewald algorithm (PME) [33, 34], inspired by the particle mesh method of Hockney and Eastwood [35]. Here, a multidimensional piecewise interpolation approach is used to compute the reciprocal lattice energy, V_qr of Eq., while the direct part V_qd is computed straightforwardly. The low computational cost of the PME method allo
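The identity erf(x) = 1 - erfc(x) quoted above is what allows the bare Coulomb interaction to be split into a short-ranged direct-space part and a smooth part handled in reciprocal space. The sketch below only illustrates this splitting for a single pair distance; it does not reproduce any ORAC routine, and the value of α and the function name are assumptions chosen for illustration.

```python
import math


def split_coulomb(r, alpha):
    """Split 1/r into erfc(alpha*r)/r (direct part) and erf(alpha*r)/r (smooth part)."""
    direct = math.erfc(alpha * r) / r
    smooth = math.erf(alpha * r) / r
    return direct, smooth


alpha = 0.4  # hypothetical convergence parameter, in inverse Angstrom
for r in (1.0, 3.0, 10.0):
    direct, smooth = split_coulomb(r, alpha)
    # The two parts always add up to the bare 1/r interaction.
    print(f"r = {r:5.1f}  direct = {direct:.3e}  smooth = {smooth:.3e}  1/r = {1.0 / r:.3e}")
```

The direct part decays fast enough to be truncated at a finite cutoff, which is why only the smooth part needs the reciprocal lattice sum or its PME interpolation.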
216. nticipate the discharging process. In Figure 9.3 we report the work computed in the alchemical creation of ethanol in water conducted with two different time protocols. In the red non-equilibrium trajectory, the Lennard-Jones η parameters for ethanol are prudently brought from 1 to 0 in 30 ps, and in the next 20 ps the solute is charged. In the black trajectory, lasting for 30 ps, in the first 10 ps the η coordinates alone are brought from 1 to 0.5 and then, in the last 20 ps, they are brought to zero (fully switched-on ethanol) together with the charging process, which is started at 10 ps. As one can see, both trajectories are regular with no instabilities, yielding negative and comparable works with limited dissipation with respect to the reversible work (16 17 kJ mol, see next section), in spite of the short duration of the non-equilibrium alchemical transformations. We must stress here that in the fast switching non-equilibrium method, with determination of the free energy difference between end states via the CFT, once the equilibrium configurations of the starting end states have been prepared, the simulation time per trajectory does indeed correspond to the wall-clock time if the independent non-equilibrium trajectories are performed in parallel. For the creation of ethanol in water, the CPU time amounts to a few minutes on a low-end desktop computer for both time protocols. In the following scheme we succinctly describe the implementation of alchemic
217. o the intramolecular interactions with no contribution from the solute lattice images. The intrasolute electrostatic energy, in particular, has no contribution from the reciprocal lattice sum, as the λ referring to the solute are all equal to 1 in Eq. It has indeed a direct lattice contribution for non-bonded intrasolute interactions, evaluated in the zero cell according to the rules specified in Table, plus the alchemical correction term that simply corresponds, with all solute λ set to 1, to the complementary Erf part, thus recovering the bare intrasolute Coulomb energy. At the other extreme end of the alchemical transformation (λ = 0, η = 0), according to Eq. 9.6 the solute is fully charged, interacting normally with the solvent and with the solute images via the term Eq. 9.2. We now come to the issue of the efficiency of a code with distinct Lennard-Jones and charge alchemical parameters. [Figure 9.3: Alchemical work produced in the creation of ethanol in water (T = 300 K and P = 1 Atm) using two different time protocols (represented by the black and red horizontal lines); horizontal axis: Time (ps).] Of course, also in this case simultaneous switching of λ and η remains perfectly possible. To avoid numerical instabilities at the early stage of the creation process or at the end of the annihilation, it is sufficient in the first case to slightly delay the charge switching and in the last case to a
218. of the solute. n1 and n2 are the atom indices of the selected solute parts. The numeric order of the atoms is that specified in the topology file (see Section 10.3). kind inter_type: Once the solute has been defined using the define subcommand, the subcommand kind is used to scale the solute-solute and solute-solvent interactions. Possible choices for the string inter_type are intra and inter. intra means that the non-bonded energy scaling (see the SETUP command) is applied to the intrasolute non-bonded interactions only, i.e. solute-solvent interactions are not scaled, where by solvent we mean the actual solvent and the solute atoms which were not selected using the define subcommand. inter scales only the solute/non-solute (i.e. solvent) non-bonded interactions. Intrasolute interactions are NOT scaled if inter is specified. If the subcommand kind is not specified, ORAC assumes that both solute-solvent and solute-solute interactions are scaled. EXAMPLES: SEGMENT define 1 10 define 1300 1325 kind inter END. SETUP. NAME: SETUP. This is the basic command to decide which kind of simulation (Hamiltonian SGE simulation or SGE simulation in the space of collective coordinates) one wants to carry out. This command also defines the number of ensembles, the scaling options and the restart option. SYNOPSIS: SETUP nstates scale1 scale2 scale3 irest. DESCRIPTION: Hamiltonian SGE simulations: If the parameters scale1, scale2 and scale3
219. ogen bonds, however, the correction is also important for intermolecular interactions. In Fig. 4.4 the correction potential is compared to the Coulomb potential (solid line in the top right corner) for different values of the reciprocal space cutoff k_c and of the convergence parameter α. For practical values of α and k_c, the potential is short ranged and small compared to the bare 1/r Coulomb interaction. In the asymptotic limit, V_corr goes to zero as sin(ar)/r, where a is a constant. This oscillatory long-range behavior of the correction potential V_corr is somewhat nasty. In Fig. 4.5 we show the integral $I(r, k_c, \alpha) = \int_0^r \chi(x, k_c, \alpha)\, x^2\, dx$ (4.49) as a function of the distance. If this integral converges, then χ(r, k_c) is absolutely convergent in 3D. [Figure 4.5: The integral I(r) of Eq. 4.49 as a function of the distance for different values of the parameters α (left) and k_c (right); horizontal axes: r (Angs).] We see that the period of the oscillations in I(r) increases with k_c while α affects only the amplitude. The total energy is hence again conditionally convergent, since the limit of I(r) for r going to infinity does not exist. However, unlike for the 1/r bare potential, the energy integral remains in this case bounded. Due to this, a cutoff on the small potential V_corr is certainly far less dangerous than a cutoff on the bare 1/r term. In order to verify this, we have calculated some properti
220. ogram: cd HOME ORAC; make default. In this case the FORTRAN compiler is by default gfortran. The current release of the Makefile also supports the Intel FORTRAN compiler and the IBM xlf90 compiler. To compile ORAC with the Intel FORTRAN compiler, do: cd HOME ORAC; make Intel. To compile ORAC with the IBM xlf90 FORTRAN compiler, do: cd HOME ORAC; make IBM. 11.1.2 Parallel version. The parallel version of ORAC has been written using the message passing interface library in its Open MPI version, which has full MPI-2 standard conformance. ORAC must be compiled with the MPI extension for running replica exchange simulations (see Chapter 5). In order to do this, you must have the Open MPI package installed on your multiprocessor computer or on your computer cluster. To compile the parallel version of ORAC, starting from the directory where you have untarred the distribution, just do: cd HOME ORAC; make PARALLEL. The default underlying Fortran compiler is that implied in your local mpif90 wrapper. In order to know which compiler mpif90 is actually using, just do: mpif90 compile info. To compile the parallel version of the executable using the Intel Fortran compiler, starting from the directory where you have untarred the distribution, do: cd HOME ORAC; make Intel_PARALLEL. When launched in parallel, ORAC creates, in the directory from which it was launched, nprocs PARXXXX new directories where the main input file is cop
221. ome fast degrees of freedom. However, the SHAKE and r-RESPA algorithms are not mutually exclusive, and sometimes it might be convenient to freeze some degrees of freedom while simultaneously using a multi-step integration for all other freely evolving degrees of freedom. Since r-RESPA consists of a series of nested velocity Verlet-like algorithms, the constraint technique RATTLE, used in the past for the single time step velocity Verlet integrator, can be straightforwardly applied. In RATTLE, both the constraint conditions on the coordinates and on their time derivatives must be satisfied. The resulting coordinate constraint is upheld by a SHAKE iterative procedure, which corrects the positions exactly as in a standard Verlet integration, while a similar iterative procedure is applied to the velocities at the half time step. In a multi-time-step integration, whenever velocities are updated using part of the overall forces (e.g. the intermolecular forces), they must also be corrected for the corresponding constraint forces with a call to RATTLE. This combined RATTLE/r-RESPA procedure was first described by Tuckerman and Parrinello [78] in the framework of the Car-Parrinello simulation method. To illustrate the combined RATTLE/r-RESPA technique in a multi-step integration, we assume a separation of the potential into two components deriving from intramolecular and intermolecular interactions. In addition, some of the covalent bonds are supposed rigid, i
222. ommands provide instructions to open external files. No unit number needs to be provided, as ORAC opens the required files sequentially, assigning to each file a unit number according to its order of occurrence in the input file. The file units begin at unit 10 and are incremented by one for each new file. 10.2 Environments, Commands and Sub-commands. The following environments are available: 1. &ANALYSIS retrieves the history file. 2. &INOUT contains commands concerning input/output operations which can be carried out during run time. The commands allowed within the &INOUT environment write history files in different formats and dump the restart files. 3. &INTEGRATION includes commands defining the integration algorithms to be used during the simulation run. 4. &META includes commands defining the metadynamics simulation. 5. &POTENTIAL includes commands which define the general features of the system interacting potentials. These features are common to both solute and solvent and concern only the non-bonded interactions. 6. &PROPERTIES includes commands which make ORAC compute run-time observables. The commands allowed within the &PROPERTIES environment can compute on the fly pair correlation functions, static structure factors and velocity auto-correlation functions. This environment is not supported. 7. &REM sets up the Replica Exchange simulation; works o
223. onment. Experimental. Unsupported. WRITE_GYR. NAME: WRITE_GYR, print gyration radius. SYNOPSIS: WRITE_GYR ngyr OPEN filename. EXAMPLES: WRITE_GYR 10.0 OPEN test.gyr. WARNINGS: Works only at single time step. Experimental. Unsupported. 10.2.8 &REM. Define run-time parameters concerning the Replica Exchange Simulation. Works only with the parallel version (see Chapter), i.e. this namelist is not recognized when the serial program is compiled. The following commands are allowed: PRINT, PRINT_ENERGY, SEGMENT, SETUP, STEP. PRINT. NAME: PRINT, print out info on REM. SYNOPSIS: PRINT iprint. DESCRIPTION: Controls intermediate printing of the acceptance ratio between adjacent replicas. EXAMPLES: PRINT 1000 (print info on the current acceptance ratios every 1000 fs). DEFAULTS: No info is printed. PRINT_ENERGY. NAME: PRINT_ENERGY, print out unscaled energy terms. SYNOPSIS: PRINT_ENERGY fplot OPEN filename. DESCRIPTION: Controls intermediate printing of the unscaled energy terms: 1) stretching, bending, improper torsions; 2) proper torsions, 1-4; 3) real space electrostatic, Lennard-Jones. The energy terms are appended to the history file filename along with the time step and the replica index. The dumping frequency (in fs) is fplot. EXAMPLES: PRINT 60.0 OPEN test.ene (print energies to the file test.ene every 60 fs). DEFAULTS: No info is printed.
224. opment; linked cell neighbor listing routines. 2 Author to whom comments and bug reports should be sent. Literature citation. The current version of ORAC represents a further development of the release published in 1997 [1]. The required citations are: P. Procacci, T. A. Darden, E. Paci, M. Marchi, ORAC: a molecular dynamics program to simulate complex molecular systems with realistic electrostatic interactions, J. Comput. Chem. 1997, Volume 18, Pages 1848-1862. S. Marsili, G. F. Signorini, R. Chelli, M. Marchi, P. Procacci, ORAC: a molecular dynamics simulation program to explore free energy surfaces in biomolecular systems at the atomistic level, J. Comput. Chem. 2010, Volume 31, Pages 1106-1116. In general, in addition to the above citations, we recommend citing the original references describing the theoretical methods used when reporting results obtained from ORAC calculations. These references are given in the description of the theory throughout the user guide, as well as in the description of the relevant keywords. Chapter 1: Atomistic simulations: an introduction. In this manual we describe ORAC, a program for the molecular dynamics (MD) simulation of atomistic models of complex molecular systems. In atomistic models, the coordinates of all atomic nuclei, including hydrogen, are treated explicitly, and interactions between distant atoms are represented by a pairwise additive dispersive-repulsive potential and a Coulomb contr
225. ordinates. For example, the potential of mean force calculated with Eq. or Eq. 8.12 along a driven distance for a freely rotating object includes the additional contribution J(t) = 2 k_B T ln(r/r_0), arising from the fact that the configurational probability P(r) for two non-interacting particles grows with the square of the distance. [Table 8.1: General format of the file defining an arbitrary time protocol for a curvilinear path in a reaction coordinate space of N_r + N_α + N_θ dimensions in ORAC. Each row k has the form: t_k, r_1(t_k) ... r_{N_r}(t_k), α_1(t_k) ... α_{N_α}(t_k), θ_1(t_k) ... θ_{N_θ}(t_k). For a generic coordinate ζ = r, α, θ, the steering velocity between times t_k and t_{k+1} is constant and equal to v_ζ(t_k, t_{k+1}) = [ζ(t_{k+1}) - ζ(t_k)] / (t_{k+1} - t_k).] Moreover, the PMF calculated using the driving potential given in Eq. 8.13 is in principle affected by the so-called stiff spring approximation L47, i.e. if the constants K_r, K_α, K_θ in Eq. 8.13 are not large enough, then one actually computes the free energy associated with the Hamiltonian H' = H + V_ext(z - z_t) rather than that associated with the Hamiltonian H(z = z_t). However, the impact of the strength of the force constant on the computed non-equilibrium average, especially if the reaction coordinate is characterized by inherently slow dynamics and/or the underlying unbiased potential of mean force is m
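The time protocol table described above amounts to a piecewise-linear schedule for each steered coordinate, with a constant steering velocity on each interval. The following sketch shows one way to evaluate such a schedule; the function names and the sample protocol (a single distance driven through three time points) are illustrative assumptions and do not reproduce the ORAC file reader.

```python
def steering_velocities(times, values):
    """Constant steering velocity on each interval [t_k, t_{k+1}] of the protocol."""
    return [
        (values[k + 1] - values[k]) / (times[k + 1] - times[k])
        for k in range(len(times) - 1)
    ]


def coordinate_at(t, times, values):
    """Target value of the steered coordinate at time t (piecewise linear)."""
    if t <= times[0]:
        return values[0]
    for k in range(len(times) - 1):
        if t <= times[k + 1]:
            v = (values[k + 1] - values[k]) / (times[k + 1] - times[k])
            return values[k] + v * (t - times[k])
    return values[-1]


# Hypothetical protocol: one distance column r_1(t_k) sampled at three times.
times = [0.0, 10.0, 30.0]
distance = [4.0, 6.0, 7.0]
velocities = steering_velocities(times, distance)
print(velocities, coordinate_at(20.0, times, distance))
```

Assigning a separate column, and hence a separate set of velocities, to each coordinate is what makes a non-linear path in a multidimensional reaction coordinate space possible.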
226. ored during the run are used. IMPORTANT NOTE: it is also possible to change n_av on the fly during the simulation. In such a case, a file called SGE_DF_FLY.set must be created by the user in the working directory when using the serial version of ORAC, or in the parent directory of the PARXXXX directories when using the parallel version of ORAC. Such a file must contain an integer number alone, which corresponds to n_av (additional characters will be ignored). Note also that, if this option is employed, an additional working file called SGE_DF_FLY.dat will be created by the program in the same directory. This file contains information related to the single estimates of free energy differences (do not remove it when restarting from a previous run). If the file SGE_DF_FLY.set is removed after a simulation and a new simulation is restarted, then this latter simulation continues as if the former simulation had been launched with the STEP command specified in the input file. EXAMPLES: STEP 5 10 2000 40. Ensemble transitions are attempted every 5 fs; dimensionless works are stored every 10 fs; free energy updates are attempted every 2000 fs; the last 40 free energy estimates are used in the weighted free energy average of Eq. DEFAULTS: The only allowed default value is related to n_av: n_av = 0. In such a case, all free energy estimates are used in the weighted free energy average. WARNINGS: If STEP is not set in the input, then default values are
227. ostatting or constant pressure schemes are often undertaken. These approximations, however, lessen the predictive power of the atomistic approach, and incorrect results may follow due to the inadequacy of the potential model, baseless approximations or combinations of the two. Also, due to their cost, the predictive capability of atomistic-level simulations might often only be on paper, since in practice only a very limited phase space region can be accessed in an affordable way, thereby providing only biased and unreliable statistics for determining the macroscopic and microscopic behavior of the system. It is therefore of paramount importance in atomistic simulations to use computational techniques that do not introduce uncontrolled approximations and at the same time are efficient in the sampling of the complex and rugged conformational space of biological systems. Regarding this last issue, much progress has been made recently by devising new non-Boltzmann and Boltzmann techniques for extended sampling of complex systems. Chapter 6 and Chapter 7 are devoted to these aspects of atomistic molecular simulations. 1.1 Multiple time steps integration schemes and electrostatic interactions in complex biomolecular systems. As stated above, simulations of complex systems at the atomistic level, unlike simplified models, have the advantage of representing with realistic detail the full flexibility of the system and the potential energy surface according
228. othing to zero of the derivatives of the Lennard-Jones function as r tends to zero [152]. [Table 9.1: Combination rules for alchemical and non-alchemical species. Columns: i, j, λ_ij(t), η_ij(t). Rows: Alchemical, Solvent, λ_i(t), η_i(t); Solvent, Alchemical, λ_j(t), η_j(t); Solvent, Solvent, 0, 0; Alchemical A, Alchemical A, 0, 0; Alchemical A, Alchemical B, 1, 1; Alchemical B, Alchemical A, 1, 1. The alchemical systems may contain three species: (i) alchemical growing subsystems, (ii) alchemical annihilating subsystems and (iii) the non-alchemical solvent. The λ_i(t), η_i(t) atomic factors within each of these species are all identical and equal to λ_{G/A/S}(t), η_{G/A/S}(t), where the indices G, A, S label the growing, annihilating and solvent species.] In the present general formulation, according to Eq., all atoms of the system, whether alchemical or not, are characterized by additional time-dependent and externally driven coordinates: the λ_i(t) parameter controlling the charging/discharging of the system and the η_i(t) parameter for switching on or off the atom-atom Lennard-Jones potential. The time dependence of the η_i(t), λ_i(t) atomic factors is externally imposed using an appropriately selected time protocol. The non-bonded potential energy of Eq. coincides with the standard potential energy of a system with no alchemical species when all the alchemical atomic factors λ_i(t), η_i(t), referring to electrostatic and Van der Waals interactions, are constant and equal to zero. At the other extreme, when t
229. pectively. If only one argument is specified, the two cutoffs are equal. histogram fbin: define the bin size (in Å) for hydrogen bond histograms. print nprint OPEN filename: print hydrogen bond output to file filename every nprint configurations. The output format depends on the READ_PDB (&SETUP) directive. If this directive is specified, the output contains details concerning atomic types, hydrogen bond distances and angles. print_histo nprint OPEN filename: print the hydrogen bond histogram to file filename every nprint configurations. radial_cutoff cutoff: define the radial cutoff (in Å) for the hydrogen bond. residue: print out hydrogen bonds per residue. total: print out the total number of hydrogen bonds (the default). use_neighbor nconf rcut: compute a neighbor list for hydrogen bonds; nconf defines how frequently the neighbor list must be computed, rcut defines the radius of the neighbor list sphere. EXAMPLES: HBONDS total residues radial_cutoff 2.5 angular_cutoff 200.0 200.0 print 10 OPEN test.hbnd histogram 0.1 use_neighbors 5 5.0 print_histo 2 OPEN test.hst END. WARNINGS: residue and total are ineffective when READ_PDB is also specified. Experimental. Unsupported. PRINT_DIPOLE. NAME: PRINT_DIPOLE, print out dipole. SYNOPSIS: PRINT_DIPOLE fdipole OPEN filename. DESCRIPTION: Print out the components of the total instantaneous dipole M (in debye Å) of the basic cell each fdipole fs and
230. phases the direct lattice term is integrated in the fast short-ranged non-bonded shell, while the reciprocal lattice summations (including the Erf intramolecular correction terms in V_intra) are usually assigned, with an appropriate choice of the Gaussian parameter α, to the intermediate non-bonded shell. The Lennard-Jones term, finally, is split among the short-range, intermediate-range and long-range integration shells. The potential subdivision for condensed phases is basically unaffected by the implementation of the alchemical transformation, except for the intrasolute self term Va and for the now time-dependent self term $-\frac{\alpha}{\sqrt{\pi}} \sum_i [1 - \lambda_i(t)]^2 q_i^2$. The latter can be safely included in the intermediate shell, while the former, a true direct lattice term, must be integrated in the short-range shell. [Footnote 1: This last term does not contribute to the atomic forces but only to the alchemical work, and is constant for all non] The λ_i(t) and η_i(t) factors, finally, must be updated according to the predefined time protocol before the force computation of the fast short-ranged non-bonded shell. 9.0.2 Calculation of the alchemical work. The work done on the system by the driven alchemical coordinates during a simulation of length τ can be written as $W = \int_0^{\tau} \left[ \frac{\partial H(\lambda, \eta)}{\partial \lambda} \dot{\lambda} + \frac{\partial H(\lambda, \eta)}{\partial \eta} \dot{\eta} \right] dt$ (9.7). In a NVT or NPT extended Lagrangian simulation with an ongoing alchemical process, the alch
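The work accumulated along a driven alchemical coordinate is a time integral of the Hamiltonian derivative multiplied by the rate of change of the coordinate. The sketch below accumulates only the λ contribution with the trapezoid rule (the η contribution would be added in the same way); it is an illustration under stated assumptions, not the ORAC integrator. The toy Hamiltonian H = k λ²/2, the value of k and the sampling of the protocol are all hypothetical.

```python
def alchemical_work(lam, dh_dlam):
    """Trapezoid estimate of W = integral of (dH/dlambda) * (dlambda/dt) dt.

    lam and dh_dlam are samples taken at the same successive time steps;
    the product (dlambda/dt) * dt is simply the increment of lambda per step.
    """
    work = 0.0
    for k in range(len(lam) - 1):
        mean_derivative = 0.5 * (dh_dlam[k] + dh_dlam[k + 1])
        work += mean_derivative * (lam[k + 1] - lam[k])
    return work


# Toy example (assumption): H = 0.5 * k * lambda**2, so dH/dlambda = k * lambda.
k_toy = 2.0
lam = [1.0 - step / 100.0 for step in range(101)]   # lambda driven from 1 to 0
dh_dlam = [k_toy * value for value in lam]
work = alchemical_work(lam, dh_dlam)
print(f"work along the protocol: {work:.6f}")
```

For this toy Hamiltonian the exact result is H(0) - H(1) = -k/2, which the trapezoid rule reproduces because the integrand is linear in λ.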
231. ques in biological systems converge rather slowly, since the convergence rate depends crucially on the inherent slow diffusion along the conformational coordinates. So, even if the potential is relatively flattened, the diffusion along a nearly free reaction coordinate can still be slow due to the friction of the orthogonal coordinates. The metadynamics algorithm is described in detail in Chapter 6 3. 3. Non-equilibrium techniques [62] use an additional driving potential, acting on an appropriate reaction coordinate, to rapidly steer the system from a given equilibrium initial state to a given final state and vice versa, producing a series of forward and reverse non-equilibrium trajectories. The driven coordinate is strictly mono-dimensional but can be defined as a trajectory in a multidimensional reaction coordinate space. The free energy difference between the initial and final states (the reactants and the products) is connected, through the Crooks fluctuation theorem [63], to the ratio of the distribution functions of the work spent in these trajectories. Free energy reconstruction of the potential of mean force along one arbitrary reaction coordinate, using non-equilibrium steered molecular dynamics, is described in detail in Chapter. Chapter 2: Symplectic and Reversible Integrators. In a Hamiltonian problem, the symplectic condition and microscopic reversibility are inherent properties of the true time trajectories which, in turn, are the exact solution of
232. r independent work measurements can be determined either by a standard equilibrium molecular dynamics simulation or by some enhanced simulation technique, by constraining the system with the harmonic constraint $V_{ext}(t=0) = \frac{1}{2}\sum_{i=1}^{N_r} K_i (r_i - r_{i0})^2 + \frac{1}{2}\sum_{i=1}^{N_\alpha} K_i (\alpha_i - \alpha_{i0})^2 + \frac{1}{2}\sum_{i=1}^{N_\theta} K_i (\theta_i - \theta_{i0})^2$ (8.17) for the reactants state and $V_{ext}(t=\tau) = \frac{1}{2}\sum_{i=1}^{N_r} K_i (r_i - r_{i\tau})^2 + \frac{1}{2}\sum_{i=1}^{N_\alpha} K_i (\alpha_i - \alpha_{i\tau})^2 + \frac{1}{2}\sum_{i=1}^{N_\theta} K_i (\theta_i - \theta_{i\tau})^2$ (8.18) for the products state. Having produced the work in a series of bidirectional experiments, one can then either apply the Bennett formula, Eq., to compute the free energy difference between the reactants and the products states or, using the intermediate work values W, apply Eq. or Eq. 8.12 to reconstruct the entire potential of mean force along the mono-dimensional driven trajectory in a multidimensional reaction coordinate space defined in Eq. 8.14. In order to define a not necessarily linear trajectory in a multidimensional reaction coordinate space (e.g. a putative minimum free energy path), one must be able to assign to each steered coordinate a different steering time protocol. This can be done in ORAC by providing an auxiliary file defining the path in coordinate space. The file has the general form shown in Table 8.1. The free energy or potential of mean force obtained with the described protocols is not corrected for the Jacobian terms arising from the definition of the reaction co
233. r of ensembles nstates and the restart option irest. Their meaning has been explained above. The collective coordinates are defined using the ADD_STR_BONDS (bond coordinates), ADD_STR_BENDS (bending coordinates) and ADD_STR_TORS (torsional coordinates) commands. These commands are defined in the &POTENTIAL environment and must be used in the following form: ADD_STR_BONDS iat1 iat2 k_s r_i r_f; ADD_STR_BENDS iat1 iat2 iat3 k_b α_i α_f; ADD_STR_TORS iat1 iat2 iat3 iat4 k_t θ_i θ_f. These expressions define the additional harmonic potential entering into Eq. 6.24. For example, if we perform a SGE simulation in the space of a distance between two atoms, then ADD_STR_BONDS must be used. The parameters iat1 and iat2 are the atom numbers, k_s corresponds to k of Eq., and r_i and r_f define the intermediate ensembles as follows: λ_n = r_i + (n - 1)(r_f - r_i)/(nstates - 1), where λ_n is the parameter characteristic of the ensemble n, with n = 1, 2, ..., nstates (see Eq. 6.24). EXAMPLES: SETUP 5 1. 1. 0.6 1. A Hamiltonian SGE simulation is performed. The non-bonded potential (direct part) is scaled using a geometric progression, while the other potential terms are unscaled. The number of ensembles is 5. SETUP 4 1 ADD_STR_BONDS 22 143 1. 10. 14.5 ADD_STR_BENDS 25 33 67 2. 100. 130. A SGE simulation in the space of collective coordinates is performed using 4 ensembles. The collective coordinates are one bond and one bending. The bond is related to the atoms 22 and 143. The b
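The rule that places the intermediate ensembles evenly between the initial and final values of a collective coordinate can be checked with a few lines of code. The sketch below is only an illustration of that formula; the function name is hypothetical, and the numbers are read from the ADD_STR_BONDS example above under the assumption that its last two fields are the initial and final distances 10 and 14.5, with 4 ensembles.

```python
def ensemble_parameters(first, last, nstates):
    """Return lambda_n = first + (n - 1) * (last - first) / (nstates - 1), n = 1..nstates."""
    if nstates < 2:
        raise ValueError("at least two ensembles are needed to define a progression")
    step = (last - first) / (nstates - 1)
    return [first + (n - 1) * step for n in range(1, nstates + 1)]


# Bond coordinate of the example: r_i = 10, r_f = 14.5, nstates = 4 (assumed reading).
bond_targets = ensemble_parameters(10.0, 14.5, 4)
print(bond_targets)
```

The first and last ensembles coincide with the initial and final values, so nstates also fixes the spacing between neighboring ensembles.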
234. r structure. The second step is the metadynamics simulation itself, during which a history-dependent potential is constructed by summing, at regular time intervals, repulsive potential terms centered on the current position of the system in the space of the reaction coordinates. In its standard implementation, the history-dependent potential of metadynamics is given by a sum of small repulsive Gaussians (Eq. 7.2). Some variants have been introduced with the intent of improving the accuracy or the efficiency of the method [135, 136]. In the ORAC program we have used Lucy's function [137] as a very efficient alternative to the use of Gaussians. It is defined as $L(s; s_0, h, w) = h\,(1 + 2|s - s_0|/w)\,(1 - |s - s_0|/w)^2$ if $|s - s_0| \le w$, and 0 if $|s - s_0| > w$ (7.3), with the origin at s_0. The symbols h and w denote the height and the width, respectively. Such a function is normalizable, $\int_{-\infty}^{+\infty} ds\, L(s; s_0, h, w) = hw$, has a finite range w, has a maximum at the origin and is differentiable everywhere. A Lucy's function can be compared with a Gaussian function with the same value at the origin and at $|s - s_0| = w/2$, such that $2\sigma = w/\sqrt{2 \ln 2}$ (7.4). A Lucy's function can be regarded as a Gaussian function with σ as in Eq. 7.4 but without the long tails of the Gaussian, as can be seen in Fig. 7.1, where a Lucy's function with h = w = 1 and a Gaussian function with the same height and $\sigma = w/(2\sqrt{2 \ln 2})$ are shown. The parameters h, w and τ affect the accuracy of the f
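The properties listed for Lucy's function (normalization to h times w, and agreement with a Gaussian at the origin and at half the width) can be verified numerically. The sketch below does so with a simple midpoint rule; the function names and the sample values of h and w are illustrative assumptions, not part of the ORAC code.

```python
import math


def lucy(s, s0, h, w):
    """Lucy's function of height h and width w centered at s0."""
    u = abs(s - s0) / w
    if u >= 1.0:
        return 0.0
    return h * (1.0 + 2.0 * u) * (1.0 - u) ** 2


def gaussian(s, s0, h, sigma):
    """Gaussian of height h and standard deviation sigma centered at s0."""
    return h * math.exp(-((s - s0) ** 2) / (2.0 * sigma**2))


h, w = 0.5, 2.0                                       # hypothetical hill parameters
sigma = w / (2.0 * math.sqrt(2.0 * math.log(2.0)))    # matching Gaussian width

# Midpoint-rule integral of Lucy's function over its finite support [-w, w].
n_points = 20000
dx = 2.0 * w / n_points
integral = sum(lucy(-w + (i + 0.5) * dx, 0.0, h, w) for i in range(n_points)) * dx

print(f"integral = {integral:.6f} (h*w = {h * w})")
print(f"L(w/2) = {lucy(w / 2, 0.0, h, w):.6f}, G(w/2) = {gaussian(w / 2, 0.0, h, sigma):.6f}")
```

Both functions take the value h/2 at a distance w/2 from the center, while Lucy's function vanishes identically beyond w.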
235. raints. This allows one to recover not only the global minimum energy state, but also any equilibrium thermodynamic quantity as a function of temperature. The potential of mean force (PMF) [50, 51] along a chosen collective coordinate can also be computed a posteriori by multiple-histogram reweighting techniques [52, 53]. The PMF can also be determined by performing generalized-ensemble canonical simulations in the space of the collective coordinate [54] (for example, the space of the end-to-end distance of a biopolymer). Comparisons between ST and temperature-REM have been reported [47, 48, 49]. The overall conclusions of these studies are that ST consistently gives a higher rate of delivering the system between high-temperature states and low-temperature states, as well as a higher rate of traversing the potential energy space. Moreover, ST is well suited to distributed computing environments because synchronization and communication between replicas/processors can be avoided. On the other hand, an effective application of ST and, in general, of SGE methods requires a uniform exploration of the ensemble space. In order to satisfy this criterion, acceptance rates must be not only high but also symmetric between forward and backward directions of the ensemble space. This symmetry can be achieved by performing weighted sampling, where weights are correlated with the dimensionless free energies of the ensembles. The knowledge of such free energies is not needed in
236. re easily obtained from the true Hamiltonian in Eq 3 13 and then using Eqs 3 1443 22 to rewrite the resulting equations in terms of the new momenta In so doing we obtain Pik P Ph Dn lik ni TA h w Ne 3 23 Pik fik opi 3 24 py ho Ro er oP 3 25 Pre Vv K k Pgs det h a oP 3 26 Pn Fn 3 27 4This allows to maintain Verlet like breakup while integrating the equation of motions 25 Multiple Time Steps Algorithms 22 It can be verified that the conserved quantity H is associated with the above equations of motion namely ssl N ni ie an 1 tr P Ph Dor wt 2 o ext 3 28 a EF includes a constraint force contribution which guarantees that the Orika center of mass in the intramolecular frame of the lika coordinates remains at the origin V and K are The atomic force ff the virial and ideal gas contribution to the internal pressure tensor Pint and they are defined af N Ys XFS i 1 7 Pm ns Si 3 29 i 1 Finally F is the force driving the Nos thermostat at x PikPik ik Fy 5 3 MSGS 5 2 y ana gksT 3 30 with g equal to the number of all degrees of freedom Ny including those of the barostat Eqs 8 14 3 21 define a generalized coordinates transformation of the kind of Eq 2 4 This transfor mation is non canonical i e the Jacobian matrix of the transformation from the virtual coordinates does not obey Eq 2 8 This m
237. ree energy reconstruction in a similar manner to the height and the width of Gaussian functions, and a comprehensive review on the analysis of the error during a metadynamics run can be found in [6].

[Figure 7.1: Lucy's function L with $h = w = 1$ along with a Gaussian function G with the same height and $2\sigma = w/\sqrt{2 \ln 2}$.]

The history-dependent potential used during an ORAC simulation can therefore be written as

$$V(s, t) = \sum_{t' = \tau, 2\tau, \ldots;\ t' < t} L(s; s(t'), h, w) \qquad (7.5)$$

During a simulation, forces from this biasing potential are computed in the shell n1 as a sum of derivatives of L functions (see footnote 1):

$$\frac{\partial L(s; s_0, h, w)}{\partial s} = \begin{cases} -\dfrac{6h\,(s - s_0)}{w^2} \left(1 - \dfrac{|s - s_0|}{w}\right) & \text{if } |s - s_0| \le w \\ 0 & \text{if } |s - s_0| > w \end{cases} \qquad (7.6)$$

Such a derivative is computationally attractive, since it does not require the evaluation of an exponential function, as in the case of the derivative of a Gaussian function. Moreover, since L has a finite range by definition, it does not need to be smoothly truncated [136], as there are no contributions to the forces from hills farther than the width $w$.

Footnote 1: Lucy's function can be defined for a generic order $n$ such that it has $n - 1$ continuous derivatives everywhere [138]. The original definition was given for $n = 3$; here it is employed with $n = 2$.

Using the standard metadynamics approach, during a simulation the algorithm keeps on adding terms to the history-dependent potential (the sum in Eq. 7.5) with the same constant rate $\omega = h/\tau$. However th
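The biasing potential and its force can be sketched as sums over deposited hills. This is a minimal one-dimensional illustration, assuming Lucy hills of the form $h(1 + 2u)(1 - u)^2$ with $u = |s - s_0|/w$; the names `bias_potential` and `bias_force` and the list of hill centers are hypothetical and do not correspond to ORAC routines.

```python
def lucy(s, s0, h, w):
    """Lucy hill of height h and width w centered at s0."""
    u = abs(s - s0) / w
    if u > 1.0:
        return 0.0
    return h * (1.0 + 2.0 * u) * (1.0 - u) ** 2


def lucy_derivative(s, s0, h, w):
    """Analytical dL/ds; no exponential is needed and it vanishes beyond w."""
    x = s - s0
    if abs(x) > w:
        return 0.0
    return -6.0 * h * x / (w * w) * (1.0 - abs(x) / w)


def bias_potential(s, hill_centers, h, w):
    """History-dependent potential: sum of the hills deposited so far."""
    return sum(lucy(s, c, h, w) for c in hill_centers)


def bias_force(s, hill_centers, h, w):
    """Force on the collective variable: minus the derivative of the bias."""
    return -sum(lucy_derivative(s, c, h, w) for c in hill_centers)


if __name__ == "__main__":
    hills = [0.0, 0.1, 0.15, 2.0]  # hypothetical positions visited at t' = tau, 2*tau, ...
    print("V(0.2) =", bias_potential(0.2, hills, h=0.5, w=0.4))
    print("F(0.2) =", bias_force(0.2, hills, h=0.5, w=0.4))
```

Hills farther than `w` from the current position contribute nothing, so in practice only nearby hills need to be visited when evaluating the force.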
238. refers to the interlaced role of coordinates and momenta in Hamilton's equations.

of motion in the $y$ basis have exactly the same form as in Eq. 2.3:

$$\dot{y} = J \frac{\partial H}{\partial y} \qquad (2.5)$$

If we now take the time derivative of Eq. 2.4, use the chain rule relating $x$ and $y$ derivatives, and use Eq. 2.5, we arrive at

$$\dot{y} = M J M^t \frac{\partial H}{\partial y} \qquad (2.6)$$

Here $M$ is the Jacobian matrix with elements

$$M_{ij} = \frac{\partial y_i}{\partial x_j} \qquad (2.7)$$

and $M^t$ is its transpose. By comparing Eqs. 2.5 and 2.6, we arrive at the conclusion that a transformation is canonical if, and only if, the Jacobian matrix $M$ of the transformation Eq. 2.4 satisfies the condition

$$M J M^t = J \qquad (2.8)$$

Eq. 2.8 is known as the symplectic condition for canonical transformations and represents an effective tool to test whether a generic transformation is canonical. Canonical transformations play a key role in Hamiltonian dynamics. For example, consider the transformation

$$z(t) = \phi(t, z(0)) \qquad (2.9)$$

where $\{p_0, q_0\} \equiv z(0)$ and $\{P, Q\} \equiv z(t)$, i.e. one writes the coordinates and momenta at time $t$, obtained from the solution of the Hamiltonian equations of motion, as a function of the coordinates and momenta at the initial time zero. This transformation, which depends on the scalar parameter $t$, is trivially canonical since both $\{p_0, q_0\}$ and $\{P, Q\}$ satisfy the Hamilton equations of motion. Hence the above transformation defines the $t$-flow mapping of the system and, being canonical, its Jacobian matrix obeys the sympl
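The symplectic condition $M J M^t = J$ can be tested directly on small Jacobian matrices. The sketch below is an illustration under stated assumptions: a one-dimensional harmonic oscillator with phase-space ordering $(q, p)$, whose exact flow is compared with a plain explicit Euler step. The helper names are hypothetical.

```python
import math

# Symplectic matrix J for a single degree of freedom, ordering (q, p).
J = [[0.0, 1.0], [-1.0, 0.0]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def is_symplectic(m, tol=1e-12):
    """Check M J M^t = J element by element."""
    r = matmul(matmul(m, J), transpose(m))
    return all(abs(r[i][j] - J[i][j]) <= tol for i in range(2) for j in range(2))


def exact_flow_jacobian(t, omega=1.0, mass=1.0):
    """Jacobian of the exact harmonic-oscillator flow (q0, p0) -> (q(t), p(t))."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return [[c, s / (mass * omega)], [-mass * omega * s, c]]


def euler_jacobian(dt, omega=1.0, mass=1.0):
    """Jacobian of one explicit Euler step, which is not a canonical map."""
    return [[1.0, dt / mass], [-dt * mass * omega ** 2, 1.0]]


if __name__ == "__main__":
    print("exact flow symplectic:", is_symplectic(exact_flow_jacobian(0.7, 2.0, 1.5)))
    print("Euler step symplectic:", is_symplectic(euler_jacobian(0.1)))
```

For a single degree of freedom the condition reduces to $\det M = 1$, which the exact flow satisfies for any $t$ while the Euler step has $\det M = 1 + \omega^2 \Delta t^2$.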
239. rements exploiting the Jarzynski identity 8 5 in the form lt e7PWo gt e B F 2 2t F 2 20 using the fact that W is odd under time reversal and that W W lT we obtain the following estimate for the free energy at intermediate t with 0 lt t lt T eot P a PMN fine 8 11 np npeBW AF np npes WtAF F R This equation due to Minh and Adib 124 allows to reconstruct the entire potential of mean force F Fo along the reaction coordinate spanned during the bidirectional non equilibrium experiments of duration T no matter how fast the driven processes are done Note that AW F Fo and W in Eq 8 11 are the forward free energy difference and work relative the end points respectively For fast pulling experiments i e when the dissipated work is large it can be shown 149 that Eq reduces to eB Fo lt lt e PWE gt p pe BAF lt e PWS gt R 8 12 In both Eq 8 12 and Eq B I one needs to know the free energy difference between the end points AF An unbiased estimate of AF is easily available through the Bennett acceptance ratio Eq 8 3 Implementation in ORAC Steered molecular dynamics in ORAC is implemented by adding an external driving potential depending on user defined internal coordinates in the form of stretching bending torsions The general form of the Steered Molecular Dynamics 66 time dependent external potential that bring the system from an initial state at t 0 to a dif
240. ressibility.

WARNINGS
1. ORAC can carry out constant pressure runs with isotropic volume changes only for orthogonal cells.
2. Make sure that, when simulations at constant pressure are run, ORAC has been compiled with the appropriate PRESSURE option in the config.h file (see Chapter 11).

ISOSTRESSXY

NAME
ISOSTRESSXY - Run MD simulations at constant pressure with an isotropic surface variation (a, b cell parameters) and independent c cell parameter variation. This protocol is engineered for membrane simulations.

SYNOPSIS
ISOSTRESSXY PRESS-EXT pext BARO-MASS wpr COMPR compressibility

DESCRIPTION
See command ISOSTRESS.

EXAMPLES
    &SETUP
      CRYSTAL 40.0 40.0 60.0 90.0 90.0 90.0
    &END
    ISOSTRESSXY PRESS-EXT 0.1 BARO-MASS 10.0 COMPR 5.3e-4

The a, b cell parameters vary isotropically, independently of the c cell parameter, under atmospheric pressure.

DEFAULTS
ORAC uses the water compressibility at 300 K (i.e. 5.3 × 10^-4 MPa^-1) as the default compressibility.

WARNINGS
1. ORAC can carry out constant pressure runs with isotropic surface changes only for orthogonal cells.
2. Make sure that, when simulations at constant pressure are run, ORAC has been compiled with the appropriate PRESSURE option in the config.h file (see Chapter 11).

FREQUENCIES

NAME
FREQUENCIES - Compute harmonic frequencies of the system. All atoms (solute and solvent) are included in the dynamical computation.

SYNOPSIS
241. rg For NAMD See the tutorial In silico alchemy A tutorial for alchemical free energy perturbation calculations with NAMD available at http www ks uiuc edu P Procacci S Marsili A Barducci G F Signorini and R Chelli J Chem Phys 125 164101 2006 C Brot B Quentrec J Comp Phys 13 430 1975 M P Allen and D J Tildesley Computer Simulation of Liquids Oxford University Press Walton Street Oxford OX2 6DP 1989 R M Levy E Gallicchio Agbnp An analytic implicit solvent model suitable for molecular dynamics simulations and high resolution modeling J Comput Chem 25 479 499 2004
242. rk thermalization 25 THERMOS 143 thermostat Andersen 137 TIME time dependent bending 100 time dependent stretching 99 time dependent torsion I01 TIME_CORRELATIONS 115 TIMESTEP 88 topology adding extra topology from ASCII file from binary file printing 95 torsion definition of dihedral angle 157 improper TORSION IMPROPER 157 printing out 95 proper 156 torsional potential TORSION IMPROPER 156 TORSION PROPER 156 total TRAJECTORY 85 trajectory file auxiliary file 82 TRANSITION_SCHEME amp SGE 135 Trotter formula I Tuckerman M 5 unit cell 145 replicating along selected directions 127 unitary transformation UPDATE use_neighbor 112 use_neighbors for hydrogen bonds 13 vacf valine 158 velocity rescaling 41 velocity autocorrelation function velocity Verlet Verlet neighbor list VERLET_LIST very_cold_start virtual variables Volume calculation 116 Voronoi 28 Voronoi Polihedra 16 Wang Landau algorithm water 28 properties of work in a SMD simulation in alchemical tranformation 172 write WRITE_GRADIENT 40 WRITE_GYR 117 WRITE_PRESSURE 144 WRITE_TPGPRM_BIN WTEMPERED Xmol animation 111 X_RMS xyz format ZERO_FREE_ENERGY amp SGE Bibliography 10 11 12 13 14 15 16 17 18 19 20 21 22 P Procacci T Darden E Paci and M Marchi J Compu
243. roper torsions [2, 5] are modeled using a potential identical to that of the proper torsion in Eq. 4.3, and hence in these cases Eq. 4.18 applies also to the improper torsion uncoupled frequency, provided that indices 1 and 4 refer to the lighter atoms. In Figure 4.2 we report the distribution of frequencies for the hydrated protein bovine pancreatic trypsin inhibitor (BPTI) using the AMBER [8] force field. The distributions might be thought of as a density of the uncoupled intramolecular states of the system. As we can see in the figure, there is a relevant degree of overlap for the various internal degrees of freedom. For example, "slow" degrees of freedom such as torsions may be found up to 600 wavenumbers, well inside the bending region; these are usually improper or proper torsions involving hydrogen. It is then inappropriate to assign such "fast" torsions involving hydrogen to a slow reference system. We recall that in a multiple time step simulation the integration of a supposedly slow degree of freedom with an excessively large time step is enough to undermine the entire simulation. In Fig. 4.2 we also

[Figure 4.2: Density of the uncoupled (see text) states for stretching, bending, proper and improper torsion obtained with the AMBER force field on the protein bovine pancreatic trypsin inhibitor. Panels: Improper Torsion, Torsion, Stretching. Horizontal axis: wavenumbers, 0 to 4000.]
244. rostat mass corresponding to 10.0 cm^-1. The compressibility is set to 1.0 × 10^-4 MPa^-1. Velocities are initialized and optionally scaled according to a temperature of 300 K.

    &SIMULATION
      MDSIM
      TEMPERATURE 300.0 25.0
      STRESS PRESS-EXT 0.1 BARO-MASS 10.0 COMPR 1.0e-4
      THERMOS
      END
    &END

Same as before, but with a Nosé thermostat. The simulation is hence in the NPT ensemble with T = 300 K.

DEFAULTS
ORAC uses the water compressibility at 300 K (i.e. 5.3 × 10^-4 MPa^-1) as the default compressibility.

WARNINGS
Make sure that, when simulations at constant pressure are run, ORAC has been compiled with the appropriate PRESSURE option in the config.h file (see Chapter 11).

TEMPERATURE

NAME
TEMPERATURE - Set the system temperature for the run.

SYNOPSIS
TEMPERATURE temp dt

DESCRIPTION
The argument temp is the target temperature for the simulation run. dt is used only during the rejection phase (see command REJECT of environment &RUN) and indicates the temperature window, in Kelvin, outside which temperature scaling occurs. Scaling stands here for random initialization of the system velocities at temperature temp according to a Gaussian distribution. System scaling in the rejection phase occurs also during constant temperature simulations (see command THERMOS in &SIMULATION).

EXAMPLES
    TEMPERATURE 300.0 50.0

WARNINGS
Works only during the rejection phase (see REJECT in environment &RUN).
245. rticular format such that it can be easily retrieved at analysis time, by time and by atoms.

SYNOPSIS
DUMP ... END

DESCRIPTION
The DUMP structured command stores the coordinates of the system during a simulation run with a selected frequency. The coordinates are stored in single precision to save disk space. The following subcommands may be specified within DUMP: atom_record, occupy, write.

• atom_record natom_rec
Defines the number of atoms per record. Atomic coordinates are dumped to disk as REAL*4. The record length is defined as lrecl = natom × 3 × 4.

• occupy
Allocates disk storage for the history file before the simulation is started. occupy fills with zeroes the entire direct access history file(s), whose dimensions are controlled by the command MAXRUN (&RUN) and by the number of atoms in the system. If occupy is not specified, the history file is expanded at each write request during the simulation. This command is useful when sharing disk resources with others, preventing the simulation from dying because of a sudden lack of disk space.

• write ftime OPEN filename
Defines the dumping frequency and the trajectory auxiliary filename. Coordinates are dumped to disk every ftime femtoseconds. The auxiliary file filename contains the names for parameters and trajectory files and must be provided by the user. At execution time this file is rewritten by the program, which supports extra information computed according
246. run. They are defined in the file named filename. The absolute path PATH must be specified to localize filename. If one needs to use the relative path in a many-replica SGE simulation (parallel run), then the working directories of the replicas must be considered (the PARXXXX ones). The weight factors $\Delta g_{n \to m} = g_m - g_n$ are dimensionless and, in filename, must be reported one per line, from $\Delta g_{1 \to 2}$ to $\Delta g_{nstates-1 \to nstates}$.

EXAMPLES
    FIX_FREE_ENERGY OPEN weight_factors.dat

An SGE simulation is performed with fixed weight factors read from file weight_factors.dat.

DEFAULTS
The absence of the FIX_FREE_ENERGY command in the input implies the use of the BAR-SGE method (see Section 6.3.2) to update the weight factors during the simulation.

PRINT_ACCEPTANCE_RATIO

NAME
PRINT_ACCEPTANCE_RATIO - Print out the acceptance ratio of the SGE simulation.

SYNOPSIS
PRINT_ACCEPTANCE_RATIO iprint

DESCRIPTION
Print the acceptance ratio between adjacent ensembles of the SGE simulation every iprint fs. The ratio is printed to the standard output.

EXAMPLES
    PRINT_ACCEPTANCE_RATIO 1000

Print the acceptance ratios every 1000 fs.

DEFAULTS
The acceptance ratio is printed with a frequency corresponding to that of free energy updating (see Ly in command STEP).

PRINT_WHAM

NAME
PRINT_WHAM - Print out data needed for reweighting the configurations of all ensembles on a target state.

SYNOPSIS
PRINT_WHAM freq_print

DESC
247. s 105 1426 1996 P Procacci M Marchi and G J Martyna J Chem Phys 108 8799 1998 A Rahman and F H Stillinger J Chem Phys 55 3336 1971 P Liu B Kim R A Friesner and B J Berne Proc Acad Sci 102 13749 13754 2005 M R Shirts and J D Chodera J Chem Phys 129 124105 2008 U H E Hansmann and Y Okamoto J Comput Chem 18 920 1997 A Irb ck and F Potthast J Chem Phys 103 10298 1995 A Mitsutake and Y Okamoto Chem Phys Lett 332 131 2000 S Park and V S Pande Phys Rev E 76 016703 2007 X Huang G R Bowman and V S Pande J Chem Phys 128 205106 2008 C Zhang and J Ma Phys Rev E 76 036708 2007 R Denschlag M Lingenheil P Tavan and G Mathias J Chem Theory Comput 5 2847 2009 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 177 S Park D L Ensign and V S Pande Phys Rev E 74 066703 2006 R Chelli S Marsili A Barducci and P Procacci Phys Rev EF 75 050101 2007 R Chelli J Chem Phys 130 054102 2009 C H Bennett J Comp Phys 22 245 1976 R W Zwanzig J Chem Phys 22 1420 1954 A Mitsutake and Y Okamoto J Chem Phys 130 214105 2009 W G Hoover Phys Rev A 31 1695 1985 W G Hoover Phys Rev A 34 2499 1986 G L Martyna M L Klein and M E Tuckerman J Che
248. s lt V gt and lt Vm gt are the average value of the intra molecular and intermolecular energies in KJ mole respectively CPU is given in seconds per picoseconds of simulation and At in fs Single time step velocity Verlet with At 4 5 fs is unstable At n R CPU lt V gt lt Vn gt 0 3 1 0 005 119 0 1912 4 75 0 6 1 0 018 62 0 1937 4 75 1 5 1 0 121 26 0 2142 4 75 4 5 1 0 6 2 0 004 59 0 1912 4 75 1 5 5 0 004 28 0 1912 4 75 3 0 10 0 005 18 0 1912 4 75 4 5 15 0 006 15 0 1912 4 75 6 0 20 0 008 12 0 1912 4 74 9 0 30 0 012 10 0 1911 4 74 3 0 0 001 14 4 74 60 0 004 8 4 75 90 0 008 6 4 74 the two nitrogen atoms of each given molecule We use here a simple harmonic spring depending on the molecular bond length rm namely 1 Vi 2 2 El ro with ro and r the equilibrium and instantaneous distance between the nitrogen atoms and k the force constant tuned to reproduce the experimental gas phase stretching frequency 80 As a measure of the accuracy of the numerical integration we use the adimensional energy conservation ratio 22 BD 23 mA 2 2 fy ee a tt 2 44 lt K gt K gt where E and K are the total and kinetic energy of the system respectively In table 1 we show the energy conservation ratio R and CPU timings on a IBM 43P 160MH RS6000 obtained for flexible nitrogen at 100 K with the r RESPA integrator as a function of n and At in Eq and also for single time step integrators Resu
249. s MD simulations, respectively. This success is mainly due to the clarity and the ease of implementation of the algorithm, which is basically the same for the two methods. The Wang-Landau algorithm was initially proposed as a method to compute the density of states $g(E)$, and therefore the entropy $S(E) = \ln g(E)$, of a simulated discrete system. During a Wang-Landau MC simulation, $S(E)$ is estimated as a histogram, incrementing by a fixed quantity the frequency of the visited energy levels, while moves are generated randomly and accepted with a Metropolis probability $\mathrm{acc}(E \to E') = \min\{1, \exp(-\Delta S)\}$, where $\Delta S = S(E') - S(E)$ is the current estimate of the entropy change after the move. While for a random walk in energy the system would have been trapped in entropy maxima, the algorithm, which can be easily extended to the computation of any entropy-related thermodynamic potential along a generic collective variable, helps the system in escaping from these maxima and reconstructs the entropy $S(E)$. The metadynamics algorithm extends this approach to off-lattice systems and to Molecular Dynamics. Metadynamics has been successfully applied in the computation of free energy profiles in disparate fields, ranging from chemical physics to biophysics and material sciences. For a system in the canonical ensemble, metadynamics reconstructs the free energy along some reaction coordinate $s$ as a sum of Gaussian functions deposited a
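The Wang-Landau scheme described here (fixed increment of $S$ at the visited level, acceptance $\min\{1, \exp(-\Delta S)\}$) can be sketched on a toy discrete system. The example below is an assumption made for illustration only: the "energy" is the sum of two dice, whose exact density of states is known (1, 2, ..., 6, ..., 2, 1 for sums 2 to 12). The fixed increment `f` is kept constant, as in the description above, rather than being reduced during the run.

```python
import math
import random


def wang_landau_two_dice(n_steps=200000, f=0.05, seed=0):
    """Estimate S(E) = ln g(E) (up to a constant) for E = sum of two dice."""
    rng = random.Random(seed)
    dice = [1, 1]
    energy = sum(dice)
    entropy = {e: 0.0 for e in range(2, 13)}  # running estimate of S(E)
    for _ in range(n_steps):
        # Symmetric random move: re-roll one die chosen at random.
        i = rng.randrange(2)
        new_face = rng.randint(1, 6)
        new_energy = energy - dice[i] + new_face
        # Metropolis acceptance on the current entropy estimate.
        delta_s = entropy[new_energy] - entropy[energy]
        if delta_s <= 0.0 or rng.random() < math.exp(-delta_s):
            dice[i] = new_face
            energy = new_energy
        # Increment the entropy of the level occupied after the move.
        entropy[energy] += f
    return entropy


if __name__ == "__main__":
    s = wang_landau_two_dice()
    reference = s[2]
    for e in range(2, 13):
        exact = math.log(6 - abs(e - 7))
        print(e, round(s[e] - reference, 2), round(exact, 2))
```

The printed columns compare the estimated $S(E) - S(2)$ with the exact $\ln g(E)$; with a constant increment the estimate fluctuates around the exact profile instead of converging to it, which is why practical implementations reduce the increment over time.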
250. s but may turn out to be wrong in certain cases, even for fast stretching between heavy atoms. In any case, the degree of mixing of the various degrees of freedom of a complex system is not known a priori and should be, on the contrary, considered one of the targets of atomistic simulations. The SHAKE algorithm allows only a moderate increase of the step size while introducing, if used without caution, essentially uncontrolled approximations. In other words, indiscriminate use of constraints violates the brute force postulate. A more fruitful approach to the multiple time scale problem is to devise a more efficient "multiple time step" integration of the equations of motion. Multiple time step integration in MD is a relatively old idea [13, 14, 15, 16, 17, 18], but only in recent times, due to the work of Tuckerman, Martyna, Berne and coworkers [19, 20, 21, 22, 23, 24], is it finding widespread application. These authors introduced a very effective formalism for devising multilevel integrators based on the symmetric factorization of the Liouvillean classical time propagator. The multiple time step approach allows integration of all degrees of freedom at an affordable computational cost. In the simulation of complex systems, for a well-tuned multilevel integrator, the speed-up can be appreciably larger than that obtained by imposing bond constraints. Besides its efficiency, the multiple time step approach has the advantage of not introducing any a priori assumption th
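The idea of a multilevel integrator obtained from a symmetric factorization of the propagator can be illustrated with a two-level, reversible scheme on a toy model. The sketch below is not the ORAC integrator: it assumes a single one-dimensional particle subject to a stiff ("fast") and a soft ("slow") harmonic force, and the names `respa_step` and `max_relative_energy_error` are hypothetical.

```python
def respa_step(x, v, dt, n_inner, k_fast, k_slow, mass=1.0):
    """One symmetric two-level step: slow half kick, n_inner fast
    velocity-Verlet substeps, slow half kick."""
    v += 0.5 * dt * (-k_slow * x) / mass
    small_dt = dt / n_inner
    for _ in range(n_inner):
        v += 0.5 * small_dt * (-k_fast * x) / mass
        x += small_dt * v
        v += 0.5 * small_dt * (-k_fast * x) / mass
    v += 0.5 * dt * (-k_slow * x) / mass
    return x, v


def total_energy(x, v, k_fast, k_slow, mass=1.0):
    return 0.5 * mass * v * v + 0.5 * (k_fast + k_slow) * x * x


def max_relative_energy_error(n_steps=2000, dt=0.05, n_inner=5,
                              k_fast=100.0, k_slow=1.0):
    """Largest relative deviation of the total energy along the run."""
    x, v = 1.0, 0.0
    e0 = total_energy(x, v, k_fast, k_slow)
    worst = 0.0
    for _ in range(n_steps):
        x, v = respa_step(x, v, dt, n_inner, k_fast, k_slow)
        e = total_energy(x, v, k_fast, k_slow)
        worst = max(worst, abs(e - e0) / e0)
    return worst


if __name__ == "__main__":
    print("max relative energy error:", max_relative_energy_error())
```

In this scheme the slow force is evaluated once per outer step while the fast force is evaluated `n_inner` times, which is where the saving comes from when the slow force is the expensive one.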
251. s can be obtained. The following sections describe the format of the topology and force field parameters files read by ORAC. The reading of the two files is carried out immediately after the commands READ_TPG_ASCII and READ_PRM_ASCII in the environment &PARAMETERS are encountered in the input file. The topology and force field parameters files are strongly dependent on each other and together fully define the molecular force field of the solute molecule(s). In the ORAC distribution archive the most recent AMBER [3] force field and topology files are provided.

10.3.1 Force Field Parameters

The force field parameters must be placed in the file defined by the command READ_PRM_ASCII of the environment &PARAMETERS. This file can contain the directives defining the stretching, angle bending, proper and improper torsion, and Lennard-Jones potential parameters. Each directive is terminated by the keyword END subsequent to the last line of input. The allowed commands are the following: BENDINGS, BOND, NONBONDED [MIXRULE | NOMIXRULE], TORSION [PROPER | IMPROPER].

BENDINGS

NAME
BENDINGS - Read angle bending potential parameters.

SYNOPSIS
    BENDINGS
    typ1 typ2 typ3 K_angle θ0
    END

DESCRIPTION
The command reads a sequence of angle bending potential parameters. typ1, typ2 and typ3 are three character strings, not to exceed 7 characters, indicating the atom types of the three atoms involved in the angle bending interaction. K_angle and θ0 are the
252. s created with the commands WRITE_TPGPRM_BIN READ_TPG_ASCII and READ_PRM_ASCII The tables contained in file filename are associated only with the current solute molecule s and can only be used for that those molecule s In alternative to the command READ_TPGPRM READ_TPG_ASCII and READ_PRM_ASCII which read the general formatted topology and parameters files can be used Since the use of the latter commands implies the calcu lation of the topology and parameters tables for the current solute molecule it is advisable to use them only a first time to create the unformatted file read by READ_TPGPRM When READ_TPGPRM is entered all the topology of the system is read in from the specified binary file and the topology com mands such as JOIN_SOLUTE JOIN_SOLVENT or ADD_TPG are ignored Also the environments amp SOLUTE amp SOLVENT amp SETUP need not to be specified EXAMPLES amp PARAMETERS READ_TPGPRM_BIN benz prmtpg amp END amp SIMULATION amp END amp INOUT RESTART 50 0 OPEN benz rst amp END amp INTEGRATOR amp END amp POTENTIAL amp END amp RUN CONTROL al amp END In this example all topology information and the coordinates of all atoms in the system are taken in care by only three directives READ_TPGPRM_BIN RESTART CONTROL The files benz prmtpg and benz rst which contains the topology and the coordinates respectively must have been produced with a previous run READ_PRM_ASCII NAME READ_PRM_ASCII
253. s energies of the system to standard output The real argument fprint indicates the chosen printing frequency in fs EXAMPLES PRINT 5 0 Input to ORAC amp RUN 123 PROPERTY NAME PROPERTY Print averages with a given frequency SYNOPSIS PROPERTY fprop DESCRIPTION ORAC writes to the standard output the running averages of the current run The real argument fprop is the frequency of printing in femtoseconds EXAMPLES PROPERTY 500 0 Write averages every 500 0 fs WARNINGS An error condition will occur if this command is not included in the input to ORAC or if the argument fprop is zero The command is not active only in the rejection phase see command REJECT REJECT NAME REJECT Provide the length of the rejection phase SYNOPSIS REJECT freject DESCRIPTION During the equilibration or rejection phase only instantaneous results are printed while averages are discarded The real argument freject indicates the time lag in femtoseconds of the rejection phase EXAMPLES REJECT 1000 0 Does not accumulate averages during the initial 1000 0 fs of the run WARNINGS This command is inactive during a minimization run see command MDRUN in amp SIMULATION STEER NAME STEER Provide the starting and final time in fs for a steered molecular dynamics The time depen dent harmonic potential is defined in the namelist POTENTIAL using the command ADD_STR_BONDS ADD_STR_BENDS ADD_STR_TORS SYNOP
254. s measured by the kinetic energy may well exceed that of the thermal bath. Actually, the temperature cannot even be defined for a system that is not at equilibrium, as the part of it near the reaction path can be warmer than other parts that are far from the reaction coordinate. This has clearly no consequences whatsoever on the CT, since the temperature in Eq. 8.2 is that of the system at the initial points, which are drawn by hypothesis at equilibrium.

[Figure 8.2: Effect of the size of the system on the overlap of the forward and backward work distributions. In the left panel, the non-equilibrium processes are done in a given time $\tau$ on a single molecule. In the right panel, the processes (as in the left panel, of duration $\tau$) are done independently on three identical molecules. This implies a factor 3 on energies and a factor $3^{1/2}$ on widths. As a result of the increased size, the overlap between $P_F(W)$ and $P_R(-W)$ decreases significantly.]

intraprotein interactions are negligible, the mean work for this system will be simply $N$ times the mean work done on a single molecule, while the width of the work distribution for the $N$-molecule system will be only $N^{1/2}$ larger than that of the single-molecule system. This effect is illustrated in Fig. 8.2. Now, biomolecular simulations of biosystems are usually done
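The scaling argument (mean work growing as $N$, width only as $N^{1/2}$) follows from summing independent contributions and can be checked by sampling. The sketch below assumes, purely for illustration, that the work done on a single molecule is Gaussian with hypothetical mean 5.0 and width 1.0 in arbitrary energy units; `sample_total_work` is not an ORAC routine.

```python
import random
import statistics


def sample_total_work(n_molecules, n_samples=20000, mean=5.0, width=1.0, seed=1):
    """Total work on n_molecules independent molecules, each contributing
    a Gaussian-distributed single-molecule work."""
    rng = random.Random(seed)
    return [sum(rng.gauss(mean, width) for _ in range(n_molecules))
            for _ in range(n_samples)]


if __name__ == "__main__":
    w1 = sample_total_work(1)
    w3 = sample_total_work(3)
    print("mean ratio :", statistics.mean(w3) / statistics.mean(w1))
    print("width ratio:", statistics.stdev(w3) / statistics.stdev(w1))
```

The mean ratio comes out close to 3 and the width ratio close to $3^{1/2}$, so the distance between the forward and backward mean works grows faster than the widths and the overlap shrinks.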
255. s reliable results. In Ref. [26], Marchi and Procacci showed that the scaling method in the NPT ensemble affects neither the equilibrium structural and dynamical properties nor the kinetics of non-equilibrium MD. For group-based and molecular-based scaling methods, in a system of one single molecule of BPTI embedded in a box of about 1000 water molecules, they obtained identical results for the system volume, the Voronoi volumes of the proteins, and for the mean square displacement of both solvent and protein atoms under normal and high pressure.

3.6 Switching to Other Ensembles

The NPT extended system is the most general among all possible extended Lagrangians. All other ensembles can in fact be obtained within the same computational framework. We must stress [26] that the computational overhead of the extended system formulation, due to the introduction and handling of the extra degrees of freedom of the barostat and thermostat variables, is rather modest and is negligible with respect to an NVE simulation for large samples (N > 2000) [26]. Therefore, a practical albeit inelegant way of switching among ensembles is simply to set the inertia of the barostat and/or thermostat to a very large number. This must of course be equivalent to decoupling the barostat and/or the thermostat from the true degrees of freedom. In fact, by setting W to infinity in Eqs. 3.23-3.27 we recover the NVT canonical ensemble equations of motion. Putting instead Q to i
256. simulation for a general system with 8 processes In the x axis we report the simulation time in the left y axis the process index and in the right y axis the replica index which is bound to the actual temperature Each color represents a process running in parallel with other processes with different colors As it can be seen on each process the temperature i e the replica index changes continuously So for example the configurational sampling of the replica at the lowest temperature in the given time interval must be reconstructed combining the data for the slave processes 1 2 3 4 6 If the algorithm is working properly i e if the temperature spacing is chosen correctly and if there are no phase transition between T and Tm the temperature in each parallel process must perform a random walk in the temperature domain T Tm Going back to equation 5 12 two important issues must be stressed i the temperature spacing for optimal overlap between contiguous replicas while keeping the total number of replicas not too high is not uniform but grows with the replica temperature ii the temperature spacing between contiguous replicas must be decreased with increasing number of degrees of freedom The latter is indeed a severe limitation of the standard REM technology since as the size of the system grows a larger number of replicas must be employed for preserving a significant exchange acceptance probability This is due to the inescapable fact
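The two remarks on temperature spacing can be made concrete with a small sketch. Since Eq. 5.12 is not reproduced in this passage, the code assumes the standard replica exchange acceptance probability $\min\{1, \exp[(\beta_i - \beta_j)(E_i - E_j)]\}$ and a geometric temperature ladder, a common choice for which the spacing grows with temperature. The function names and the numerical values are illustrative assumptions.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def geometric_ladder(t_min, t_max, n_replicas):
    """Temperatures with a constant ratio between neighbours, so that the
    spacing grows with the replica temperature."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** i for i in range(n_replicas)]


def exchange_probability(t_i, t_j, e_i, e_j):
    """Acceptance probability for swapping the configurations of two
    replicas at temperatures t_i, t_j with potential energies e_i, e_j."""
    delta = (1.0 / (K_B * t_i) - 1.0 / (K_B * t_j)) * (e_i - e_j)
    if delta >= 0.0:
        return 1.0
    return math.exp(delta)


if __name__ == "__main__":
    ladder = geometric_ladder(300.0, 600.0, 8)
    print([round(t, 1) for t in ladder])
    # Hypothetical instantaneous energies (kcal/mol) of two neighbouring replicas.
    print(exchange_probability(ladder[0], ladder[1], -105.0, -100.0))
```

Because the energy gap between neighbouring replicas grows with the number of degrees of freedom, the same temperature ratio gives a smaller acceptance probability for a larger system, which is the limitation noted in point ii.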
257. solvent and concern both bonded and non-bonded interactions. The following are allowed commands: ADD_STR_BONDS, ADD_STR_BENDS, ADD_STR_TORS, ADJUST_BONDS, AUTO_DIHEDRAL, BENDING, CONSTRAINT, CUTOFF, ERF_CORR, ERFC_SPLINE, DEFINE_ALCHEMICAL_ATOM, EWALD, GROUP_CUTOFF, I-TORSION, JORGENSEN, KEEP_BONDS, LJ-FUDGE, LINKED_CELL, QQ_FUDGE, SELECT_DIHEDRAL, STEER_PATH, STRETCHING, UPDATE, VERLET_LIST.

ADD_STR_BONDS

NAME
ADD_STR_BONDS - Add a stretching potential between two target atoms.

SYNOPSIS
ADD_STR_BONDS iat1 iat2 k r0 r_τ

DESCRIPTION
This command can be used to impose an additional stretching constraint between atoms iat1 and iat2 of the solute. The numeric order of the solute atom indices iat1, iat2 is that specified in the topology file (see 10.3). The added stretching potential has force constant k (in kcal/mol/Å²) and equilibrium distance r0 (in Å). If r_τ is also specified, then the added stretching potential is time dependent and r_τ is the equilibrium distance after the steering time τ (see the STEER (&RUN) command for the definition of the steering time in an SMD simulation).

WARNINGS
If the chosen r0 is very different from the actual value of the distance |r_iat1 - r_iat2| at time 0, a very large force is experienced by the atoms involved in the added stretching, and the simulation may catastrophically diverge after a few steps.

EXAMPLES
Example 1
    ADD_STR_BONDS 1 104 400 31 5

Example 2
    &PARAMETERS READ_TPGPRM_B
258. synopsis lab1 lab2 and lab are the three atoms involved in one angle bending to be deleted from the residue list Labels starting with a or a correspond to atoms belonging to the preceding and following residue in the solute sequence EXAMPLES omit_angles ncac c ca ha end WARNINGS The keyword bonds must appear before omit_angles Input to ORAC Force Field amp Topology 161 dihed NAME dihed Define proper torsions list for the residue Obsolete Unsupported SYNOPSIS dihed lab1 lab2 lab3 labs labd lab6 lab7 lab end DESCRIPTION In more modern biomolecular force fields all possible torsion angles are included in the interaction potential see AUTO DIHEDRAL of the environment amp SOLUTE dihed includes only selected proper torsions in the potential as it was required by earlier force fields Each proper torsion is defined by a quadruplet of atom labels see synopsis Labels starting with a or a refer to atoms belonging to the preceding and following residue in the solute sequence EXAMPLES dihed c n ca cb n ca cb cgi ncac n end WARNINGS The keyword bonds must appear before dihed If AUTO_DIHEDRAL of the environment amp SOLUTE is selected the keyword dihed has no effect imphd NAME imphd Define improper torsions list for the residue SYNOPSIS imphd lab1 lab2 lab3 labs lab5 lab6 lab7 lab end DESCRIPTION The keyword includes only selected improper torsions Following
259. t Chem 18 1848 1997 S J Wiener P A Kollmann D T Nguyen and D A Case J Comput Chem 7 230 1986 W D Cornell P Cieplak C I Bavly I R Gould K M Merz Jr D M Ferguson D C Spellmeyer T Fox J W Caldwell and P Kollmann J Am Chem Soc 117 5179 1995 B R Brooks R E Bruccoeri B D Olafson D J States S Swaminanthan and M Karplus J Comput Chem 4 187 1983 W F van Gunsteren and H J C Berendsen Groningen Molecular Simulation GROMOS Library Manual Biomos Groningen 1987 A D MacKerrel J Wirkeiwicz Kuczera and M Karplus J Am Chem Soc 117 11946 1995 J J Pavelites P A Gao and A D MacKerrel Biophysical J 18 221 1997 A D MacKerell Jr D Bashford M Bellott R L Dunbrack J D Evanseck M J Field S Fischer J Gao H Guo S Ha D Joseph McCarthy L Kuchnir K Kuczera F T K Lau C Mattos S Michnick T Ngo D T Nguyen B Prodhom W E Reiher III B Roux M Schlenkrich J C Smith R Stote J Straub M Watanabe J Wiorkiewicz Kuczera D Yin and M Karplus J Phys Chem B 102 3586 1998 J P Ryckaert G Ciccotti and H J C Berendsen J Comput Phys 23 327 1977 G Ciccotti and J P Ryckaert Comp Phys Report 4 345 1986 M P Allen and D J Tildesley Computer Simulation of Liquids Oxford University Press Walton Street Oxford OX2 6DP 1989 P Procacci T Darden and M Marchi J Phys Chem 100 10464
260. t 0 1000 2 0 180 0 END

TORSION IMPROPER
NAME
    TORSION IMPROPER - Read improper torsion potential.
SYNOPSIS
    AMBER form (cosine):
      TORSION IMPROPER
        typ1 typ2 typ3 typ4 K_phi n gamma cosine
      END
    CHARMM form (harmonic):
      TORSION IMPROPER
        typ1 typ2 typ3 typ4 K_phi angle harmonic
      END
DESCRIPTION
    typ1, typ2, typ3 and typ4 are four character strings (not to exceed 7 characters each) indicating the atom types of the four atoms involved in the torsion interaction; an "x" string is taken as a wild card indicating any atom. For improper torsions ORAC allows either the CHARMM-like form (a simple harmonic potential) or the AMBER-like form (a torsional potential). For the CHARMM form, K_phi must be given in kcal mol^-1 rad^-2, while angle is the equilibrium angle of the improper torsion in degrees. For the AMBER form, the meaning of the symbols is identical to that described in the TORSION PROPER directive.
EXAMPLE
    TORSION IMPROPER
      x  x   ca  h4    1.1000  2.0  180.0  cosine
      x  x   ca  h5    1.1000  2.0  180.0
      ck cb  n   ct    1.0000  2.0  180.0  cosine
      cm c   n   ct    1.0000  2.0  180.0  cosine
      ha cpa cpa cpm  29.40    0.0  harmonic
      ha cpb c   Cc   20.00    0.0  harmonic
      ha ha  Cc  c    20.00  180.0  harmonic
    END

10.3.2 Topology
ORAC is instructed to read the topology file by the command READ_TPG_ASCII field.tpg of the &PARAMETERS environment. File field.tpg contains information on the series of residues needed to define the
261. t P 2_1 The symmetry transformations of the space group P2_1 are applied to the asymmetric unit in order to generate the coordinates of the other molecules contained in the unit cell Input to ORAC amp SOLVENT 149 10 2 14 amp SOLVENT The amp SOLVENT environment includes commands which are concerned with specific aspects of the solvent structure In the present version of ORAC force field and topology specifications are given in the same Force fields and topology files used for the solute The following commands are allowed ADD_UNITS CELL COORDINATES GENERATE INSERT READ_SOLVENT REDEFINE ADD_UNITS NAME ADD_UNITS Add solvent molecules SYNOPSIS ADD_UNITS nmol DESCRIPTION Reads nmol molecules form PDB file specified in the READ_PDB amp SETUP command This command must be entered when starting from a PDB file which includes both solute and solvent coordinates EXAMPLES amp SETUP CRYSTAL 20 00 20 00 20 00 90 0 90 0 90 0 READ_PDB solute 342solvent pdb amp END amp PARAMETERS READ_TPG_ASCII tpg prm amber95 tpg READ_PRM_ASCII tpg prm amber95 prm JOIN SOLUTE ala h ala ala ala ala o END JOIN SOLVENT hoh END amp END amp SOLVENT ADD UNITS 342 amp END The file solute 342solvent pdb contains the coordinates of a penta alanine along with 342 water molecules CELL NAME CELL Define the initial lattice for the solvent SYNOPSIS CELL type Input to ORAC amp SOLVENT 150 DESCRIPTION This co
262. t arms print inst_xrms 3 OPEN test irms inst_xrms ca backbone averaged ca print rms 2 OPEN test rms END
WARNINGS
    STRUCTURES commands work only in conjunction with the &ANALYSIS environment. Experimental/Unsupported.

TIME_CORRELATIONS
NAME
    TIME_CORRELATIONS - Compute velocity autocorrelation functions and mean square displacements.
SYNOPSIS
    TIME_CORRELATIONS
      ...
    END
DESCRIPTION
    The command TIME_CORRELATIONS opens an environment which includes a series of subcommands to define the parameters used in the calculation.
    diffusion OPEN filename: Compute the mean square displacements $\langle |r(t) - r(0)|^2 \rangle$.
    divide_step nspline: Provide a number, equal to nspline, of interpolated points between data points.
    vacf OPEN filename: Compute velocity autocorrelation functions and print out the results to file filename.
EXAMPLES
    TIME_CORRELATIONS
      vacf OPEN vacf.test2
      divide_step 2
      diffusion OPEN diff.test2
    END
WARNINGS
    When the subcommand use_neighbor is used, cutoff cannot exceed the neighbor list cutoffs. Experimental/Unsupported.

VORONOI
NAME
    VORONOI - Compute the Voronoi polyhedra of atoms, residues and molecules.
SYNOPSIS
    VORONOI
      ...
    END
DESCRIPTION
    The command VORONOI opens an environment which includes a series of subcommands for computing average and instantaneous properties related to the Voronoi polyhedra of the solute and of the solvent.
    compute accessibility: Comp
263. t the calculation of the g(r)'s at a distance equal to fcut Å.
    delta delrg: Set the bin size of the g(r)'s to delrg Å.
    intra: Include intramolecular contacts in the solvent-solvent g(r)'s.
    print fconf OPEN filename: The g(r)'s are printed to the file filename every fconf fs.
    use_neighbor: Use the neighbor list to compute the g(r)'s. Radial distribution functions can then be computed on the fly.
EXAMPLES
    GOFR
      print 1000.0 OPEN test.gofr
      use_neighbor
      average 1000.0
      compute 10.0
      cutoff 12.0
      delta 0.02
    END
WARNINGS
    When the subcommand use_neighbor is used, cutoff cannot exceed the neighbor list cutoffs.

HBONDS
NAME
    HBONDS - Compute solute H-bond structural properties.
SYNOPSIS
    HBONDS
      ...
    END
DESCRIPTION
    The command HBONDS opens an environment which includes a series of subcommands for computing hydrogen bond related properties. The hydrogen bond donor-acceptor pairs must be defined in the topology file (see section 10.3). If these definitions were not included when generating the trajectory file, and if READ_TPGPRM is specified in the &PARAMETERS environment, HBONDS produces no output. These definitions may be provided at analysis time by the READ_TPG (&PARAMETERS) directive. In the following we indicate with the letters A and D the acceptor and the donor of the pair, respectively.
    angular_cutoff cutoff1 cutoff2: Defines two angular cutoffs (in degrees) for A-H-D and A-D-H, res
264. [Figure 9.2: two panels, both plotted against Time (ps); the right panel shows the total energy change.]
Figure 9.2: Left: Time record for the intrasolute reciprocal lattice contributions to the differential work (Eq.) arising from electrostatic interactions during the alchemical discharging of ethanol in water at T = 300 K and P = 1 atm. The simulation went on for 15 ps. The red curve is due to the self term. The green curve is due to the direct lattice contribution and to $V_{alch}$. The magenta curve includes the terms $V_{intra}$ (Eq. 9.4). The blue curve is due to the full reciprocal lattice PME term (Eq. 9.2). Right: Total energy change (red line) and numerical work (black line), computed using Eq. 9.9, for the discharging of ethanol in water in an alchemical trajectory lasting 9 ps.

the intra-solute differential work computed during the transformation. In the reciprocal lattice term (blue curve), the intrasolute and solute-solvent contributions are mixed. Hence the integrated total differential work (black curve) is, as expected, slightly positive, due to the loss of long range electrostatic energy caused by the discharging of ethanol. Again, paralleling the situation seen for the intrasolute energy, the work due to the self term approximately cancels the Erfc intrasolute contributions. We conclude this section with some comments on the time protocol that drives the alchemical transformation. In our imple
265. temperature of the ensemble. Here we are interested in determining such free energy differences, which will be referred to as optimal weight factors, or simply optimal weights. Accordingly, in the acceptance ratio we will use $f_n$ instead of $g_n$.

6.2.1 SGE simulations in temperature space (simulated tempering) and its implementation in the ORAC program

In SGE Monte Carlo simulations conducted in temperature space (ST simulations), Eq. 6.2 holds. Specifically, since only configurational sampling is performed, we have

$$h_n(x) = \beta_n V(x), \qquad (6.11)$$

where $V(x)$ is the energy of the configuration $x$. Inserting Eq. 6.11 into Eq. 6.8, we find that transitions from the $n$-th to the $m$-th ensemble, realized at fixed configuration, are accepted with probability

$$\mathrm{acc}[n \to m] = \min\left(1, e^{(\beta_n - \beta_m) V(x) + f_m - f_n}\right). \qquad (6.12)$$

Here we implicitly assume that the indexes $n$ and $m$ belong to an ordered list such that $T_1 < T_2 < \dots < T_N$ or $\lambda_1 < \lambda_2 < \dots < \lambda_N$.

When the system evolution is performed with molecular dynamics simulations, the situation is slightly more complicated. Suppose we deal with canonical ensembles (to simplify the treatment and the notation we consider constant-volume, constant-temperature ensembles, though the extension to constant-pressure, constant-temperature ensembles is straightforward). Usually, constant temperature is implemented through the Nosé-Hoover method [120] or extensions of it [122]. With the symbol p we
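The acceptance rule of Eq. 6.12 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ORAC code: the function name `st_acceptance`, the Boltzmann constant in kcal/(mol K) and the numerical values are hypothetical choices made for the example.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit choice is an assumption


def st_acceptance(beta_n, beta_m, potential_energy, f_n, f_m):
    """Acceptance probability of Eq. 6.12 for a move n -> m at fixed configuration.

    acc[n -> m] = min(1, exp((beta_n - beta_m) * V(x) + f_m - f_n))
    """
    exponent = (beta_n - beta_m) * potential_energy + (f_m - f_n)
    if exponent >= 0.0:
        return 1.0  # avoids overflow in exp() for very favourable moves
    return math.exp(exponent)


if __name__ == "__main__":
    beta_300 = 1.0 / (KB * 300.0)
    beta_310 = 1.0 / (KB * 310.0)
    # Hypothetical potential energy (kcal/mol) and weights, for illustration only.
    print(st_acceptance(beta_300, beta_310, -1200.0, 0.0, -65.0))
```

With equal weights and equal temperatures the move is always accepted, which is a quick sanity check of the sign convention.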
266. that the energy fluctuations grow with $N^{1/2}$ while the energy grows with $N$. Moreover, in many important cases one has to effectively sample reaction coordinates that are rather localized in the protein, as e.g. in the case of substrate-active site interactions. In the standard temperature REM, the extra heat in the hot replicas is distributed among all the degrees of freedom of the system, and therefore most of this heat is wasted in exchanging uninteresting configurations (e.g. solvent configurations).

Footnote: In the latter equation, $c$ is a constant that depends on the density of states of the $N$ harmonic oscillators, and $cN = C_V$, with $C_V$ being the constant volume heat capacity of the system.

Figure 5.2: Typical REM simulation with 8 replicas (horizontal axis: simulation time). Each process bears a particular color, and the color follows the right scale, i.e. the replica index, which is connected to the temperature. To reconstruct a trajectory at a given temperature one must combine the data from several processes.

5.2 Hamiltonian REM
In this program we adopt a variant of the replica exchange called Hamiltonian REM, which is far more flexible than the standard temperature REM technique illustrated above. In the Hamiltonian REM each replica is characterized by a different potential energy rather tha
267. the SPME makes the simulation of large-size complex molecular systems, such as biopolymers, polar mesogenic molecules, organic molecules in solution, etc., extremely efficient and therefore affordable even for long time spans. Further, this technique does not involve any uncontrolled approximation (see footnote 1) and is perfectly consistent with standard PBC.

1.2 Enhanced sampling in atomistic simulations
Standard equilibrium simulations of complex biosystems are usually done on a single solvated biomolecule in PBC because of computational limits. In these conditions, the only means to measure e.g. the free energy difference between two given protein conformations is to record the number of times the protein molecule in the MD cell is observed in each of the two conformations. Swaps between these conformers can take, on average, as long as 0.1-1 microseconds [40], even for small proteins. One then needs to run extremely long equilibrium simulations in order to observe a sufficient number of swaps between conformational states, so as to determine with good accuracy a stationary (equilibrium) ratio of the conformational probabilities, and hence the free energy

Footnote 1: Of course, SPME is itself an approximation of the true electrostatic energy. This approximation is, however, totally under control, since the energy can be determined to any given accuracy and the effect of finite accuracy on any computed property of the system can be easily controlled. The approximation is not uncontrolled
268. the reciprocal lattice contribution $V_{qr}$ to the total Coulomb energy. When using SPME, the cost of the reciprocal lattice sums is cut down dramatically, and therefore the use of large $\alpha$'s becomes helpful to reduce the computational burden of the direct lattice calculation. For a value of $\alpha$ increased beyond a certain limit, there is no longer a computational gain, since the pair distances must in any case be evaluated in direct space until convergence of the Lennard-Jones energy (usually occurring at a 10 Å cutoff). Furthermore, the larger $\alpha$ is, the more short-ranged and fast-varying the potential $V_{qr}$ becomes, thus requiring short time steps to integrate the equations of motion correctly. A good compromise for $\alpha$, valid for cells of any shape and size, is 0.4-0.5 Å$^{-1}$. The direct space potential is separated into three contributions according to the interaction distance. The overall non-bonded potential breakup is therefore

$$V = V_{n0} + V_{n1} + \underbrace{V_{vdw}^{(m)} + V_{qd}^{(m)}}_{V_m} + \underbrace{V_{vdw}^{(l)} + V_{qd}^{(l)} + V_{qr}}_{V_l} + \underbrace{V_{vdw}^{(h)} + V_{qd}^{(h)}}_{V_h}, \qquad (4.39)$$

where the superscripts $m$, $l$, $h$ of the direct space terms $V_{vdw}$ and $V_{qd}$ refer to the short, medium and long range non-bonded interactions, respectively. The $m$-th reference system includes non-bonded direct space interactions at short range, typically from 0 to 4.3-5.3 Å. $V_l$ contains both the medium range direct space potential, with a typical range of 4.3-5.3 to 7.3-8.5 Å, and the reciprocal space term $V_{qr}$. Finally, the $h$-th reference system, which is the most slowly varying, contains
269. the running average of the dielectric constant (relative permittivity).
EXAMPLES
    PRINT_DIPOLE 10.5 OPEN dipole.out
The file dipole.out looks like the following:
    399115 500 0 50455E 02 0 29885E 02 0 46023E 01 11 330
    399126 000 0 48858E 02 0 40479E 02 0 80527E 01 12 520
    399136 500 0 52146E 02 0 35302E 02 0 70597E 01 13 023
    399147 000 0 57283E 02 0 32666E 02 0 52314E 01 13 372
    399157 500 0 62705E 02 0 36044E 02 0 76743E 01 13 913
The first column reports the current simulation time. Columns 2-4 contain the instantaneous values of the x, y, z components of the cell dipole (in Debye). Column 5 reports the running average of the dielectric constant. The dielectric constant is computed under the assumption of tin-foil boundary conditions [156] (i.e. no surface dipole term at the sphere boundary) using the formula $\epsilon = 1 + 4\pi\left(\langle M^2 \rangle - \langle M \rangle^2\right) / (3VRT)$ (50).
WARNINGS
    Diagnostic - Unsupported.

STRUCTURES
NAME
    STRUCTURES - Compute the root mean square deviations from a given solute reference structure.
SYNOPSIS
    STRUCTURES
      ...
    END
DESCRIPTION
    The command STRUCTURES opens an environment which includes a series of subcommands for computing average and instantaneous root mean square displacements (rms) of the solute for various atom types (alpha-carbons, heavy atoms, backbone atoms, etc.). The reference structure for the solute is entered with the command TEMPLATE (&SETUP).
270. this example two parameters, $T$ and $\lambda$, are employed. However, no restriction is actually placed on the number of ensemble spaces. Generalized ensemble algorithms have a different implementation depending on whether the temperature is included in the collection of sampling spaces (Eqs. 6.2 and 6.4) or not (Eq. 6.3). Here we adhere to the most general context, without specifying any form of $h_n(x, p)$. In SGE simulations, the probability of a microstate $(x, p)$ in the $n$-th ensemble (from now on denoted as $(x, p)_n$) is proportional to $\exp[-h_n(x, p) + g_n]$, where $g_n$ is a factor, different for each ensemble, that must ensure almost equal visitation of the $N$ ensembles. (Footnote 1: In Monte Carlo generalized ensemble simulations, momenta are dropped out.) The extended partition function of this system of ensembles is

$$Z = \sum_{n=1}^{N} \int e^{-h_n(x, p) + g_n} \, dx \, dp = \sum_{n=1}^{N} Z_n e^{g_n}, \qquad (6.5)$$

where $Z_n$ is the partition function of the system in the $n$-th ensemble (Eq. 6.1). In practice, SGE simulations work as follows. A single simulation is performed in a specific ensemble, say $n$, using Monte Carlo or molecular dynamics sampling protocols, and after a certain interval an attempt is made to change the microstate $(x, p)_n$ to another microstate of a different ensemble, $(x', p')_m$. Since high acceptance rates are obtained when the ensembles $n$ and $m$ overlap significantly, the final ensemble $m$ is typically close to the initial one, namely $m = n \pm 1$. In
271. ths to starting values.
SYNOPSIS
    ADJUST_BONDS
DESCRIPTION
    This command should be specified when bond constraints are imposed on the system (see commands STRETCHING and CONSTRAINT in this environment). If specified, all bonds to be constrained are constrained to the lengths specified in the force field parameter file (see Sec. 10.3.1) PDB file.
DEFAULTS
    ADJUST_BONDS is TRUE.

AUTO_DIHEDRAL
NAME
    AUTO_DIHEDRAL - Include all the proper torsion angles in the interaction potential.
SYNOPSIS
    AUTO_DIHEDRAL
WARNINGS
    Obsolete - Unsupported.

BENDING
NAME
    BENDING - Constrain bendings.
SYNOPSIS
    BENDING on
    BENDING off
DESCRIPTION
    With the argument on, this command includes harmonic bending potentials in the total solute potential. Conversely, if the argument is off, all the bendings of the solute molecules are constrained.
DEFAULTS
    BENDING off
WARNINGS
    Obsolete - Unsupported.

CONSTRAINT
NAME
    CONSTRAINT - Constrain bendings.
SYNOPSIS
    CONSTRAINT SHAKE
    CONSTRAINT MIM mimlim
DESCRIPTION
    Select the procedure for fulfilling constraints. With the argument SHAKE, ORAC uses SHAKE. With the argument MIM, ORAC uses the matrix inversion method (MIM). In the latter case, the maximum physical dimension of the constraint matrix, mimlim, must be specified. MIM is best used in conjunction with STRETCHING HEAVY.
DEFAULTS
    BENDING off

CUTOFF
NAME
    CUTOFF
SYNOPSIS
    CUTOFF rspoff
WARNINGS
    Use
272. ting the biased averaged. Quasi-equilibrium techniques [57, 58, 59] build a biasing potential that favours barrier crossing by periodically adding a small perturbation to the system Hamiltonian, so as to progressively flatten the free energy surface along selected reaction coordinates. For example, in the so-called standard metadynamics simulation method [57], a history-dependent potential, made of accumulated Gaussian functions deposited continuously at the instantaneous values of the given reaction coordinates, is imposed on the system. The history-dependent potential disfavors configurations in the space of the reaction coordinates that have already been visited, and it has been shown, by appropriately adjusting system-dependent parameters, to converge numerically to the inverse of the free energy surface [61]. In the present version of ORAC, the metadynamics technique has been implemented in the parallel version, whereby multiple metadynamics simulations (walkers) are run in parallel, cooperatively building a common history-dependent potential which is passed among all processes. The history-dependent potential is generally defined over a multidimensional domain involving several reaction coordinates. Metadynamics can be used, e.g., to identify the minimum free energy path between two metastable protein states defining the reactants and products of an elementary chemical reaction. Quasi-equilibrium techni
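The history-dependent potential described above can be illustrated with a one-dimensional sketch. This is a generic textbook form of standard metadynamics (a sum of fixed-height Gaussians), not the ORAC implementation; the class name `MetadynamicsBias1D` and the parameters `height` and `width` are hypothetical names chosen for the example.

```python
import math


class MetadynamicsBias1D:
    """Sum of Gaussians deposited along one reaction coordinate s (illustrative only)."""

    def __init__(self, height, width):
        self.height = height   # Gaussian height (energy units), assumed constant
        self.width = width     # Gaussian width (units of s), assumed constant
        self.centers = []      # values of s at which Gaussians were deposited

    def deposit(self, s):
        """Add one Gaussian centred at the current value of the reaction coordinate."""
        self.centers.append(s)

    def value(self, s):
        """Bias potential V(s) = sum_k height * exp(-(s - s_k)^2 / (2 width^2))."""
        two_var = 2.0 * self.width ** 2
        return sum(self.height * math.exp(-((s - c) ** 2) / two_var)
                   for c in self.centers)

    def force(self, s):
        """Bias force -dV/ds; it pushes the system away from visited regions."""
        var = self.width ** 2
        return sum(self.height * (s - c) / var * math.exp(-((s - c) ** 2) / (2.0 * var))
                   for c in self.centers)


if __name__ == "__main__":
    bias = MetadynamicsBias1D(height=0.1, width=0.2)
    for visited in (0.0, 0.05, -0.03):
        bias.deposit(visited)
    print(bias.value(0.0), bias.force(0.3))
```

In a multiple-walker setting such as the one mentioned in the text, all walkers would append to, and evaluate, the same list of Gaussian centers.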
273. tio nor the dynamics of the system. Therefore, by setting $x' \equiv x$ and generalizing to $\lambda$ changes, we recover the equality

$$W[n \to m] = h_m(x, p', p_t') - h_n(x, p, p_t). \qquad (6.21)$$

Using $W[n \to m]$, the acceptance ratio of Eq. 6.8 becomes

$$\mathrm{acc}[n \to m] = \min\left(1, e^{\Delta f_{n \to m} - W[n \to m]}\right), \qquad (6.22)$$

where $\Delta f_{n \to m} = f_m - f_n$. The quantity $W[n \to m] - \Delta f_{n \to m}$ can be interpreted as the generalized dimensionless work dissipated in the transition (see Eq. 17 of Ref. [116]). Until now we have simply restated the acceptance ratio of SGE simulations in terms of the generalized dimensionless work $W[n \to m]$. The truly important aspect of this treatment is that the knowledge of $W[n \to m]$ and $W[m \to n]$, stored during the sampling, gives us the possibility of evaluating the optimal weights $\Delta f_{n \to m}$ using the Bennett method [117], reformulated with maximum likelihood arguments [65, 116]. For example, in ST simulations we must keep a record of the quantities $W[n \to m] = (\beta_m - \beta_n) V_n(x)$ and $W[m \to n] = (\beta_n - \beta_m) V_m(x)$, where the subscripts of the potential energy indicate the ensemble at which sampling occurs. The extension to Hamiltonian tempering, implemented in the ORAC program, is straightforward:

$$W[n \to m] = \beta (c_m - c_n) V_n(x), \qquad (6.23)$$

with an analogous expression for $W[m \to n]$. In the case of SGE simulations in the $\lambda$-space, we have to substitute Eq. 6.16 into Eq. 6.27 with fixed coordinates and momenta: Win gt m Bk r Am r An
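As a rough illustration of how stored forward and reverse works could be turned into an estimate of $\Delta f_{n \to m}$, the sketch below solves the standard maximum-likelihood form of the Bennett acceptance ratio equation by bisection. It is an assumption-laden example, not the ORAC routine: the function names `fermi` and `bar_delta_f`, the bracketing strategy and the tolerance are all hypothetical, and the works are assumed to be dimensionless, as $W[n \to m]$ and $W[m \to n]$ are in the text.

```python
import math


def fermi(x):
    """Fermi function 1 / (1 + exp(x)), written to avoid overflow."""
    if x > 0.0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def bar_delta_f(forward_work, reverse_work, tol=1e-10):
    """Estimate delta_f = f_m - f_n from dimensionless works.

    forward_work: samples of W[n -> m] collected in ensemble n
    reverse_work: samples of W[m -> n] collected in ensemble m
    Solves sum_i fermi(M + W_F,i - df) = sum_j fermi(-M + W_R,j + df),
    with M = ln(n_F / n_R), by bisection (the left side minus the right
    side increases monotonically with df).
    """
    n_f, n_r = len(forward_work), len(reverse_work)
    if n_f == 0 or n_r == 0:
        raise ValueError("both work lists must be non-empty")
    m = math.log(n_f / n_r)

    def imbalance(df):
        fwd = sum(fermi(m + w - df) for w in forward_work)
        rev = sum(fermi(-m + w + df) for w in reverse_work)
        return fwd - rev

    # Bracket wide enough that imbalance(lo) < 0 < imbalance(hi).
    span = max(abs(w) for w in list(forward_work) + list(reverse_work)) + abs(m) + 50.0
    lo, hi = -span, span
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical work samples, for illustration only.
    print(bar_delta_f([2.1, 1.8, 2.4, 2.0], [-1.7, -2.2, -1.9]))
```

For a perfectly reversible transformation (all forward works equal to $a$, all reverse works equal to $-a$) the estimate reduces to $a$, which is a convenient check of the sign conventions.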
274. tion while running a simulation. It works only during the analysis stage (see &ANALYSIS).

SCALE_CHARGES
NAME
    SCALE_CHARGES - Scale the total charge on the solute to zero.
SYNOPSIS
    SCALE_CHARGES nmol i1 i2 ... inmol
DESCRIPTION
    If Q is the excess charge on the solute, electroneutrality is imposed by equally distributing the charge Q over the atoms of nmol disconnected molecules of the solute, specified by the indices i1 ... inmol. Disconnected molecules are ordered according to the sequence given in the structured command JOIN.
EXAMPLES
    SCALE_CHARGES 4 1 5 7 11
    The excess charge is distributed over 4 molecules: the 1st, the 5th, the 7th and the 11th molecule, as specified in the sequence given in JOIN.
WARNINGS
    This command is active only if the solute topology and parameter list is actually computed and not read from a binary file, i.e. READ_TPGPRM in &PARAMETERS must be inactive.

SPACE_GROUP
NAME
    SPACE_GROUP - Generate a simulation box by applying symmetry operations to an input asymmetric unit.
SYNOPSIS
    SPACE_GROUP OPEN filename group
DESCRIPTION
    Read the parameters of the space group group (inequivalent molecules and the corresponding interchange matrices and fractional translations) from the ASCII file filename. The file filename is a user database which may contain many entries corresponding to different space groups. The following is an example of an entry of this file: Space Group Symmetry P
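The charge redistribution described for SCALE_CHARGES can be sketched as follows. This is a guess at the arithmetic implied by the description (the excess charge is removed in equal amounts from every atom of the selected molecules), not the ORAC source; the function name `scale_charges` and the data layout (one molecule index per atom, 1-based as in JOIN) are assumptions made for the example.

```python
def scale_charges(charges, molecule_of_atom, selected_molecules):
    """Return a neutralised copy of `charges`.

    charges:            list of atomic charges of the solute
    molecule_of_atom:   for each atom, the 1-based index of its disconnected molecule
    selected_molecules: molecule indices over which the excess charge is spread
    """
    excess = sum(charges)                      # total charge Q on the solute
    selected = set(selected_molecules)
    targets = [i for i, mol in enumerate(molecule_of_atom) if mol in selected]
    if not targets:
        raise ValueError("no atoms belong to the selected molecules")
    shift = excess / len(targets)              # equal share of Q per selected atom
    scaled = list(charges)
    for i in targets:
        scaled[i] -= shift
    return scaled


if __name__ == "__main__":
    # Hypothetical 5-atom solute made of 3 disconnected molecules.
    q = [0.5, 0.5, -0.2, 0.2, 0.0]
    mol = [1, 1, 2, 2, 3]
    print(scale_charges(q, mol, [1, 3]))
```

Atoms of molecules that are not listed keep their original charges, so the correction can be confined to, for instance, counterions or a few chosen chains.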
275. to which the atoms move. Both these physically significant ingredients of the atomistic approach unfortunately pose severe computational problems. On one hand, the inclusion of full flexibility necessarily implies the selection of a small step size, thereby reducing, in an MD simulation, the sampling power of the phase space. On the other hand, the evaluation of inter-particle long range interactions is an extremely expensive task when conventional methods are used: its computational cost typically scales like $N^2$ (with $N$ being the number of particles), quickly exceeding any reasonable limit. In this book we shall devote our attention to the methods, within the framework of classical MD simulations, that partially overcome the difficulties related to the presence of flexibility and long range interactions when simulating complex systems at the atomistic level.

Complex systems experience different kinds of motion with different time scales. Intramolecular vibrations have a period not exceeding a few hundred femtoseconds, while reorientational motions and conformational changes have much longer time scales, ranging from a few picoseconds to hundreds of nanoseconds. In the intramolecular dynamics one may also distinguish between fast stretching involving hydrogen, with period smaller than 10 fs; stretching between heavy atoms and bending involving hydrogen, with double that period or more; slow bending and torsional movements, a
276. ttice, and in a slowly decaying term (the Erf term), due to the added Gaussian spherical distributions, evaluated in the reciprocal lattice. Thanks to this trick, the conditionally convergent electrostatic energy sum is split into two absolutely convergent series. In standard implementations of the Ewald resummation technique, as we will see later on, the electrostatic potential at the atomic position $r_i$ is actually not available, since the interactions between alchemical and non-alchemical species are mixed in the so-called Ewald reciprocal lattice contribution (i.e. the Erf part). The Smooth Particle Mesh Ewald method (see Chapter 3) is no exception, with the additional complication that the atomic point charges (including the alchemical charges) are now smeared over nearby grid points to produce a regularly gridded charge distribution, to be evaluated using the Fast Fourier Transform (FFT). Owing to its extraordinary efficiency (see Figure 4.3), the Particle Mesh Ewald method is still an unrivaled methodology for the evaluation of electrostatic interactions in complex systems. Moreover, PME can be straightforwardly incorporated in fast multiple time step schemes, producing extremely efficient algorithms for, e.g., systems of biological interest. For these reasons, it is highly desirable to devise rigorous and efficient approaches to account for alchemical effects in a system treated with PME.

9.0.1 Production of the MD trajectory with an externally dri
277. uch less stiff than the harmonic driving potential, is generally rather small even at relatively low values of the force constant. In this respect, it has been shown that [147]

$$\Phi(z) = F(z) - \frac{1}{2k} F'(z)^2 + \frac{1}{2\beta k} F''(z) + O(1/k^2), \qquad (8.19)$$

where $\Phi(z)$ is the PMF of the unbiased system with the Hamiltonian $H(z)$, while $F(z)$ is the PMF that is actually measured in the SMD experiments, i.e. that corresponding to the biased Hamiltonian $H(z) + V_{ext}(z)$. From Eq. 8.19 one sees that if the derivatives of $F$ are not too high, or $k$ is chosen large enough, then one can safely assume that $\Phi(z) \simeq F(z)$.

Chapter 9
Alchemical Transformations

In the following we shall describe in detail the theory of continuous alchemical transformations, with focus on the issues and technicalities regarding the implementation in a molecular dynamics code using the Ewald method. As we will see, running a simulation of a system where atomic charges are varying, using a standard implementation of the Ewald method, implies the appearance of non-trivial terms in the energy and forces that must be considered in order to produce correct trajectories. In a nutshell, Ewald resummation consists in adding to and subtracting from the atomic point charges a spherical Gaussian charge distribution bearing the same charge, so that the electrostatic potential is split into a fast dying term (the Erfc term), due to the sum of the point charge and the neutralizing charge distribution and evaluated in the direct la
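The splitting of the Coulomb interaction into a fast dying Erfc term and a slowly decaying Erf term rests on the identity $1/r = \mathrm{erfc}(\alpha r)/r + \mathrm{erf}(\alpha r)/r$. The short sketch below checks it numerically for a single pair distance; the function name `coulomb_split` and the sample values of `r` and `alpha` are hypothetical, and unit charges are assumed.

```python
import math


def coulomb_split(r, alpha):
    """Split the bare Coulomb kernel 1/r for unit charges.

    Returns (short_range, long_range) with
      short_range = erfc(alpha * r) / r   (summed in the direct lattice)
      long_range  = erf(alpha * r) / r    (summed in the reciprocal lattice)
    """
    if r <= 0.0:
        raise ValueError("pair distance must be positive")
    short_range = math.erfc(alpha * r) / r
    long_range = math.erf(alpha * r) / r
    return short_range, long_range


if __name__ == "__main__":
    alpha = 0.45  # convergence parameter, assumed to be in inverse Angstrom
    for r in (1.0, 3.0, 6.0, 10.0):
        short, long_ = coulomb_split(r, alpha)
        # The two parts always add up to 1/r; the short-range part dies quickly.
        print(r, short, long_, short + long_ - 1.0 / r)
```

Increasing `alpha` makes the short-range part vanish at smaller distances, at the price of a more expensive reciprocal lattice sum.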
278. ulation.
    energy_then_die: Print out energies and then stop.
EXAMPLES
    step intra 2
    step intra 2
    step nonbond 4 4.2
    step nonbond 4 7.3 reciprocal
    step nonbond 1 9.7
Here five time steps are defined: three for nonbonded potentials and two for intramolecular potentials. The largest time step $\Delta t_h$ is defined by the command TIMESTEP in this environment (see above) and refers to the nonbonded subsystem with shell in the range 7.3-9.7 Å. We then have $\Delta t_l = \Delta t_h / 4$, referring to the 4.2-7.3 Å shell, and $\Delta t_m = \Delta t_h / (4 \cdot 4)$, referring to the 0-4.2 Å shell. The reciprocal potential is assigned to the intermediate 4.2-7.3 Å shell. The two intramolecular shells have time steps $\Delta t_{n1} = \Delta t_h / (4 \cdot 4 \cdot 2)$ and $\Delta t_{n0} = \Delta t_h / (4 \cdot 4 \cdot 2 \cdot 2)$.
    step intra 2
    step nonbond 3 6.5 reciprocal
    step nonbond 1 9.5
    test_times OPEN file.tests
Here only one intramolecular and two intermolecular time steps are defined. The reciprocal (PME or standard) contribution is assigned to the fastest intermolecular shell. Energy records are printed onto the file file.tests every $\Delta t$ femtoseconds.
DEFAULTS
    step intra 1 step intra 1 step nonbond step nonbond step nonbond erer 4 1 0 3 0 3 7 3 0 3 0 45 reciprocal 9 70 3 1 5
WARNINGS
    1. When standard Ewald is used and the reciprocal space contribution is subdivided in k-shells, the intramolecular term of Eq. 2 is always assigned to the fastest k-shell. This may cause instability o
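The relation between the step dividers and the resulting time steps in the first example can be made explicit with a short sketch. The rule used here (each level's step is the next slower level's step divided by that level's divider, with the slowest level set by TIMESTEP) is inferred from the worked example above, and the function name `shell_timesteps` and the 12 fs value are hypothetical.

```python
def shell_timesteps(timestep_fs, dividers):
    """Compute the time step of every level of the multiple time step hierarchy.

    timestep_fs: value given to TIMESTEP (time step of the slowest level), in fs
    dividers:    the integer following each `step` directive, listed in input
                 order, i.e. from the fastest level to the slowest one
    Returns the time steps in the same (fastest to slowest) order.
    """
    steps_slowest_first = []
    dt = timestep_fs
    for n in reversed(dividers):
        if n < 1:
            raise ValueError("dividers must be positive integers")
        dt = dt / n                     # this level runs n times per slower step
        steps_slowest_first.append(dt)
    return list(reversed(steps_slowest_first))


if __name__ == "__main__":
    # step intra 2 / step intra 2 / step nonbond 4 4.2 /
    # step nonbond 4 7.3 reciprocal / step nonbond 1 9.7
    labels = ["n0", "n1", "m (0-4.2 A)", "l (4.2-7.3 A)", "h (7.3-9.7 A)"]
    for label, dt in zip(labels, shell_timesteps(12.0, [2, 2, 4, 4, 1])):
        print(f"{label}: {dt} fs")
```

With a hypothetical TIMESTEP of 12 fs this gives 3 fs and 0.75 fs for the two inner nonbonded shells and 0.375 fs and 0.1875 fs for the intramolecular levels, matching the ratios 4, 4·4, 4·4·2 and 4·4·2·2 quoted in the text.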
279. ulation 138 simulation with isotropic and anisotropic stress tensor 19 138 PRESSURE parameter in the config h file 165 PRINT_ENERGY replica exchange method 118 PRINT replica exchange method 118 print for harmonic calculations 139 print 113 PRINT_DIPOLE 114 print_histo 13 printing the force field parameters printing topology information 121 PRINT_ACCEPTANCE_RATIO amp SGE 132 PRINT_WHAM amp SGE PRINT_TOPOLOGY 95 propagator discrete time stepwise I proper torsion BI 95 22 definition of in the parameter file 56 frequency range 33 amp PROPERTIES 111 PROPERTY 123 protein printing out the sequence 95 giving the input sequence in ORAC 94 p_test QQ FUDGE r RESPA energy conservation for NPT ensemble input examples performances 16 use in ORAC 86 with Parrinello Rahman Nos Hamiltonian radial_cutoff 13 radial distribution function 1I RATE RATTLE 5 170 reaction coordinate 7 61 reaction field 5 READ READ_CO 128 reading the restart file READ_PDB READ_PRM_ASCII READ_SOLVENT READ_TPG_ASCIT 97 READ_TPGPRM reciprocal lattice reciprocal lattice potential REDEFINE 151 reference system 12 37 REJECT amp REM replica exchange method Hamiltonian REM local scaling and global scaling 120 temperature REM 42 REPLICATE 127 REPL_RESIDUE RESET_CM RESIDUE 158 resi
280. ules must follow those of the solute in the PDB file An example is the following amp SETUP READ_PDB solute solvent pdb SOLVENT ON SOLUTE ON amp END amp SOLVENT ADD_UNITS 432 amp END amp PARAMETERS JOIN SOLVENT hoh END JOIN SOLUTE ala h ala ala ala o END amp END TEMPLATE NAME TEMPLATE Define a template or reference structure SYNOPSIS TEMPLATE filename Input to ORAC amp SETUP 130 DESCRIPTION This command defines a template PDB file filename which contains reference solute coordinates used during run time analysis for computing root mean square displacements see command X_RMS in amp PROPERTIES for instance EXAMPLES TEMPLATE test_template pdb Input to ORAC amp SGE 131 10 2 11 amp SGE Define run time parameters concerning Serial Generalized Ensemble simulations see Chapter 6 It works with both serial and parallel versions of the ORAC program see Chapter I When reporting SGE sim ulations obtained by BAR SGE method please cite Ref 55 OUTPUT FILES SGE_DF In the serial version of ORAC this file is written in the working directory In the parallel version it is written in the PAR0000 directory The file reports the average dimensionless free energy differences between ensembles see Eq 6 28 along with the errors calculated by Eq 6 29 see top of the file The file is updated in time intervals defined by the parameter La of the command STEP see below SGE_ENERGY
281. ute the area of the Voronoi polyhedron for all residues of the solute (computed as the sum of the Voronoi volumes of the individual atoms) and evaluate, for each residue, the fraction of the surface that is accessible to the solvent (solute and solvent as defined in the command JOIN (&PARAMETERS)).
    compute contact_solute int1 int2: Compute the contact surface among selected solute residues with residue numbers int1 and int2, as in the PDB file.
    compute neighbors: Compute the Voronoi coordination number relative to the whole solute, solute-solute, solvent-solvent and solute-solvent contacts.
    compute volume: Compute the Voronoi volumes of all residues in the solute.
    cutoff value: value is the cutoff (Å) for
    heavy_atoms: Use only non-hydrogen atoms for evaluating Voronoi polyhedra.
    print nprint OPEN filename: Print all output to file filename every nprint configurations.
EXAMPLES
    VORONOI
      print 5 OPEN 6.vor
      cutoff 8.0
      heavy_atoms
      compute contact_solute 1 2
      compute contact_solute 5 6
      compute volume
      compute neighbors
      compute accessibility
    END
    In this example we compute the Voronoi volumes, areas, accessibility and neighbors for the residues of a protein every 5 configurations. The contact surfaces between residues 1 and 2 and between residues 5 and 6 are also evaluated. All output is printed to the file 6.vor.
WARNINGS
    VORONOI commands work only in conjunction with the &ANALYSIS envir
282. ven alchemical process In a system of N particles subject to a continuous alchemical transformations only the non bonded poten tial energy function is modified because of the presence of alchemical species The full non bonded energy of the system is given by Il Venem An FOL A 6 ZL erfelary a Si MHPO Fis ij J 2 2 Sar D PEA St AONO eplir ry m0 ij deyli n 9 1 ymis t rig oij yms t rig oij where V the unit cell volume m a reciprocal lattice vector and a is the Ewald convergence parameter related to the width of the Gaussian spherical charge distribution The first term in the non bonded energy RQR Steered Molecular Dynamics 69 Eq PIl is limited to the zero cell and corresponds to the electrostatic interactions in the direct lattice the second term refers to the self interactions of the Gaussian charge distributions and the third term corresponds to the interactions between Gaussian distributions in the zero cell as well as in the infinite direct lattice reformulated as an absolutely convergent summation in the reciprocal lattice The last term in Eq finally corresponds to the modified atom atom Van der Waals interaction introduced in Ref 150 incorporating a soft core parameterization where the infinity in the Lennard Jones interaction is smoothed to zero as a function of the n The parameter y is a positive constant usually set I5I to 0 5 that controls the smo
283. ver nearby grid points to produce a regularly gridded charge distribution. The PME method accomplishes this task by interpolation. Thus the complex exponentials $\exp(2\pi i m_\alpha u_{i\alpha} / K_\alpha)$, computed at the position of the $i$-th charge in Eq. 4.24, are rewritten as a sum of interpolation coefficients multiplied by their values at the nearby grid points. In the smooth version of PME (SPME) [34], which uses cardinal B-splines in place of the Lagrangian coefficients adopted by PME, the sum is further multiplied by an appropriate factor, namely

$$\exp(2\pi i m_\alpha u_{i\alpha} / K_\alpha) \approx b_\alpha(m_\alpha) \sum_k M_n(u_{i\alpha} - k) \exp(2\pi i m_\alpha k / K_\alpha), \qquad (4.26)$$

where $n$ is the order of the spline interpolation and $M_n(u_{i\alpha} - k)$ defines the coefficients of the cardinal B-spline interpolation at the scaled coordinate $u_{i\alpha}$. In Eq. 4.26 the sum over $k$, representing the grid points, is only over a finite range of integers, since the functions $M_n(u)$ are zero outside the interval $0 < u < n$. It must be stressed that the complex coefficients $b_\alpha(m_\alpha)$ are independent of the charge coordinates $u_i$ and need be computed only at the very beginning of a simulation. A detailed derivation of the $M_n(u)$ functions and of the $b_\alpha$ coefficients is given in Ref. [34]. By inserting Eq. 4.26 into Eq. 4.24, $S(\mathbf{m})$ can be rewritten as

$$S(\mathbf{m}) = b_1(m_1)\, b_2(m_2)\, b_3(m_3)\, F[Q](m_1, m_2, m_3), \qquad (4.27)$$

where $F[Q](m_1, m_2, m_3)$ stands for the discrete FT at the grid point $(m_1, m_2, m_3)$ of the array $Q(k_1, k_2, k_3)$, with $1 \le k_i \le K_i$, $i = 1, 2, 3$. The gridded charge
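The cardinal B-spline coefficients $M_n(u)$ entering Eq. 4.26 can be generated with the usual recursion, $M_2(u) = 1 - |u - 1|$ on $[0, 2]$ and $M_n(u) = [u\, M_{n-1}(u) + (n - u)\, M_{n-1}(u - 1)]/(n - 1)$. The sketch below is a plain, unoptimized illustration of that recursion and of how a charge at scaled coordinate $u$ is shared among $n$ grid points along one axis; the function names `cardinal_bspline` and `spline_weights` and the 0-based periodic grid indexing are assumptions of the example, not ORAC conventions.

```python
import math


def cardinal_bspline(n, u):
    """Cardinal B-spline M_n(u) of order n >= 2; zero outside 0 < u < n."""
    if n < 2:
        raise ValueError("spline order must be at least 2")
    if n == 2:
        return 1.0 - abs(u - 1.0) if 0.0 <= u <= 2.0 else 0.0
    return (u * cardinal_bspline(n - 1, u)
            + (n - u) * cardinal_bspline(n - 1, u - 1.0)) / (n - 1)


def spline_weights(n, u, grid_size):
    """Weights M_n(u - k) on the grid points k that receive a share of the charge.

    u is the scaled coordinate along one axis (0 <= u < grid_size); grid indices
    are wrapped periodically. Returns a dict {grid index: weight}.
    """
    weights = {}
    base = math.floor(u)
    for j in range(n):
        k = base - j                      # u - k lies in [j, j + 1)
        w = cardinal_bspline(n, u - k)
        index = k % grid_size
        weights[index] = weights.get(index, 0.0) + w
    return weights


if __name__ == "__main__":
    w = spline_weights(4, 3.7, 16)
    print(w, sum(w.values()))             # the weights sum to 1
```

Because the weights sum to one for any `u`, the total gridded charge equals the total point charge, whatever the spline order.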
284. with hydrogens. The command DEF_FRAGMENT can appear more than once in the environment. The atoms of different solute molecules defined with this command can overlap.
EXAMPLES
  DEF_FRAGMENT 1 80
  DEF_FRAGMENT 81 90
  DEF_FRAGMENT 1001 1256

DIST_FRAGMENT
NAME
  DIST_FRAGMENT - Print out distances between solute fragments
SYNOPSIS
  DIST_FRAGMENT ffragm OPEN filename
DESCRIPTION
  Write the distances between the centroids of the fragments defined in the command DEF_FRAGMENT to the file filename. This command works only while retrieving the trajectory file by specifying the &ANALYSIS environment.
EXAMPLES
  DIST_FRAGMENT 10.0 OPEN file_dist.frg
WARNINGS
  This command has no action while running a simulation.

FORCE_FIELD
NAME
  FORCE_FIELD - Print force field parameters
SYNOPSIS
  FORCE_FIELD
WARNINGS
  Works only in the production simulation stage. It has no effect when reading the trajectory file.

GOFR
NAME
  GOFR - Compute solvent and/or solute pair correlation function g(r) and structure factors S(k)
SYNOPSIS
  GOFR ... END
DESCRIPTION
  The command GOFR opens an environment which includes a series of subcommands to define the parameters used in the calculation of the radial distribution functions:
  average favg
    Average the g(r)'s over length favg, given in units of femtoseconds.
  compute fcomp
    Compute the g(r)'s with a frequency of fcomp femtoseconds.
  cutoff fcut
    Cu
285. with specific aspects of the solute force field and structure. The following commands are allowed: COORDINATES; DEF_SOLUTE; SCALE_CHARGES; SPACE_GROUP.

COORDINATES
NAME
  COORDINATES - Define the coordinates of a solute
SYNOPSIS
  COORDINATES filename
DESCRIPTION
  Read the coordinates of the solute in PDB format from file filename. This command is best used when the solvent atoms must also be read in.
EXAMPLES
  &SETUP
    CRYSTAL 20.00 20.00 20.00 90.0 90.0 90.0
    REPLICATE 2 2 2
  &END
  &SOLUTE
    COORDINATES solute.pdb
    SPACE_GROUP OPEN benz.group P 2 c
  &END
  &SOLVENT
    CELL SC
    INSERT 1.5
    COORDINATES solvent.pdb
    GENERATE RANDOMIZE 4 4 4
    GENERATE RANDOMIZE 8 8 8
  &END
  In this example the coordinates of the solute are read in from the file solute.pdb, while the coordinates of the solvent molecule (see &SOLVENT) are read in from the file solvent.pdb. As it stands, this input would produce, in a box of 20 x 20 x 20, 1 solute along with 64 replicas (see command GENERATE (&SOLVENT)) of the solvent molecule. Of these 64 molecules, those that overlap with the solute molecule (see command INSERT (&SOLVENT)) are discarded. If the second line in the environment &SOLUTE is uncommented, the solute is assumed to be arranged in the MD box according to the space group specified by the SPACE_GROUP directive. In the present example the group contains 4 molecules per unit cell. So 4 molecules of solute are arranged in th
286. ws the choice of large values of the Ewald convergence parameter α as compared to those used in conventional Ewald. Correspondingly, shorter cutoffs in the direct space Ewald sum $V_{qd}$ may be adopted. If $\mathbf{u}_j$ is the scaled fractional coordinate of the j-th particle, the charge-weighted structure factor S(m) in Eq. 4.22 can be rewritten as
$$S(\mathbf{m}) = \sum_{j=1}^{N} q_j \exp\left[ 2\pi i \left( \frac{m_1 u_{1j}}{K_1} + \frac{m_2 u_{2j}}{K_2} + \frac{m_3 u_{3j}}{K_3} \right) \right] \qquad (4.24)$$
where N is the number of particles, and $K_1, K_2, K_3$ and $m_1, m_2, m_3$ are integers. The α component of the scaled fractional coordinate for the i-th atom can be written as [footnote 2]
$$u_{i\alpha} = K_\alpha\, \mathbf{k}_\alpha \cdot \mathbf{r}_i \qquad (4.25)$$
where $\mathbf{k}_\alpha$, α = 1, 2, 3, are the reciprocal lattice basis vectors.
Footnote 1: By excluded contacts we mean interactions between charges on atoms connected by bonds or two bonds apart.
Footnote 2: The scaled fractional coordinate is related to the scaled coordinates in Eqs. 3.1, 3.34 by the relation $s_{i\alpha} = 2u_{i\alpha}/K_\alpha$.
S(m) in Eq. 4.24 can be looked at as a discrete Fourier transform (FT) of a set of charges placed irregularly within the unit cell. Techniques have been devised in the past to approximate S(m) with expressions involving Fourier transforms on a regular grid of points. Such approximations of the weighted structure factor are computationally advantageous because they can be evaluated by fast Fourier transforms (FFT). All these FFT-based approaches involve, in some sense, a smearing of the charges o
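As a point of comparison for the FFT-based approximations, the structure factor of Eq. 4.24 can be evaluated directly, one reciprocal vector at a time. The sketch below is illustrative only: it assumes the sign and normalization conventions written in Eq. 4.24, and the function name `structure_factor` and its argument layout are hypothetical.

```python
import cmath


def structure_factor(charges, scaled_coords, grid, m):
    """Direct sum S(m) = sum_j q_j exp[2*pi*i * sum_a m_a * u_aj / K_a].

    charges       -- list of charges q_j
    scaled_coords -- list of (u_1j, u_2j, u_3j) scaled fractional coordinates
    grid          -- (K_1, K_2, K_3), grid points per direction
    m             -- (m_1, m_2, m_3), integer reciprocal indices
    """
    total = 0j
    for q, u in zip(charges, scaled_coords):
        phase = sum(m_a * u_a / k_a for m_a, u_a, k_a in zip(m, u, grid))
        total += q * cmath.exp(2j * cmath.pi * phase)
    return total


if __name__ == "__main__":
    # Two opposite charges half a cell apart along the first axis.
    q = [1.0, -1.0]
    u = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
    print(structure_factor(q, u, (8, 8, 8), (1, 0, 0)))
```

The cost of this direct sum grows as the number of particles times the number of reciprocal vectors, which is the scaling the gridded FFT approach is meant to avoid.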
287. xecutable, these character symbols are replaced by the standard preprocessor with the numeric values assigned in the config.h file. The meaning of most of the character symbols contained in config.h is explained in the file itself. A few are worth mentioning here:
- PRESSURE: The statement #define PRESSURE is found in the distribution config.h file. It implies that the single time step non-bonded force routines will be generated including the pressure computation section. Since force routines that do not include the pressure calculation are about 10-20% faster, it may be useful in simulations at constant volume to replace the statement with #undef PRESSURE. With the current version of ORAC, after this change all the CPP f files must be removed by hand and the program recompiled.
- _SIT_SOLU_: This is the maximum number of atoms in the system; it includes the solvent and solute atoms. The highly misleading name is due to historical reasons.
- _TYP_SOLU_: This is the maximum number of different unit types, as coded in the topology database.
- _NRES_: This is the maximum number of units in the solute, i.e. the number of entries in the JOIN structured command.
- _TGROUP_: This is the maximum number of groups in the system.
- _LMAX_, _MMAX_, _NMAX_: These parameters control the dimension of the sine/cosine work arrays in the standard Ewald method. In the config.h provided in the ORAC distribution arch
288. y of the system producing the alchemical correction to the electrostatic energy as [Eq. 9.5], where the summation is extended to all non-bonded intrasolute interactions. It should be stressed that the energy of Eq. 9.5 is a non-trivial additive term that must be included in simulations of continuous alchemical transformations. This term stems from the time-dependent alchemical charges $q_i(t)$ and is due to the peculiar implementation of the Ewald method. $V_{alch}$ is indeed a large contribution (10-15 kJ mol$^{-1}$ per solute atom), and its neglect may lead to severe errors in the electrostatic energies and to incorrect MD trajectories. We can finally rewrite the total energy of a system subject to an alchemical transformation as [Eq. 9.6], where $V_{qr}$, $V_{intra}$, and $V_{alch}$ are defined in Eqs. 9.2, 9.4, and 9.5, respectively. All terms in Eq. 9.6 except for the self term contribute to the atomic forces, which can be computed in the standard way by taking the derivatives of the energy, Eq. 9.6, with respect to the atomic positions $\mathbf{r}_i$, producing the correct trajectories for alchemically driven systems under periodic boundary conditions and treated with the Ewald sum. In Figure 9.1 we report the time r
289. y solute sequence termatom
WARNINGS
  This keyword must always be present in any RESIDUE environment.

acc
NAME
  acc - List the hydrogen bond acceptor atoms (Experimental - Unsupported)
SYNOPSIS
  acc lab1 lab2
DESCRIPTION
  The labels lab1, lab2 are character strings indicating the atom types (see command atom). If only one label is specified, lab1 refers to the hydrogen bond acceptor. If lab2 is also specified, the latter is the acceptor, while lab1 refers to the conjugate acceptor-bonded atom (e.g. N and H acceptor in the C O bond).

don
NAME
  don - List the hydrogen bond donor atoms (Experimental - Unsupported)
SYNOPSIS
  don lab1 lab2
DESCRIPTION
  The labels lab1, lab2 are character strings indicating the atom types (see command atom). If only one label is specified, lab1 refers to the hydrogen bond donor. If lab2 is also specified, the latter is the donor, while lab1 refers to the conjugate acceptor-bonded atom (e.g. C acceptor and O donor in the C O bond).

Chapter 11 Compiling and Running ORAC
11.1 Compiling the Program
11.1.1 Serial version
ORAC has been written mostly in FORTRAN 77. The present release (5.1) includes some FORTRAN 90 code and can no longer be compiled with the g77 compiler. However, ORAC 5.1 can be compiled with gfortran, the GNU FORTRAN compiler for GCC (the GNU Compiler Collection). ORAC 5.1 is currently supported only for Linux op
290. ymmetric form of the multiple time step propagator, Eq. 3.71, does not necessarily imply time reversibility. Some of the operators appearing in the definition of L, for both the molecular scaling and the atomic scaling, are in fact non-commuting. We have seen previously that a first-order approximation of non-commuting propagators yields time-irreversible algorithms. We can render the propagator in Eq. 3.71 time reversible by using a second-order symmetric approximant (i.e. the Trotter approximation) for any two non-commuting operators. For example, in the case of the molecular scaling, when we propagate the slow propagator in Eq. 3.71 for half a time step, we may use the following second-order split: [Eq. 3.72]. An alternative, simpler, and equally accurate approach when dealing with non-commuting operators is simply to preserve unitarity by reversing the order of the operators in the first-order factorization of the right and left operators of Eq. 3.71, without resorting to a locally second-order approximation like that in Eq. 3.72. Again, for the molecular scaling, this is easily done by using the approximant [Eq. 3.73] for the left propagator and
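The difference between a first-order factorization and a symmetric one can be checked numerically on a toy problem. The sketch below is not the ORAC propagator: it assumes a one-dimensional harmonic oscillator with unit mass and force constant, with two non-commuting updates (a position "drift" and a momentum "kick"), and all names are hypothetical. It runs forward, flips the momentum, runs again with the same factorization, and measures how far the system ends from its starting point.

```python
def drift(state, dt):
    """Position update: x <- x + p*dt (unit mass)."""
    x, p = state
    return (x + p * dt, p)


def kick(state, dt):
    """Momentum update: p <- p + F*dt with harmonic force F = -x."""
    x, p = state
    return (x, p - x * dt)


def propagate(state, sequence, dt, n_steps):
    """Apply an ordered list of (operator, fraction of dt) for n_steps."""
    for _ in range(n_steps):
        for op, frac in sequence:
            state = op(state, frac * dt)
    return state


def reversal_error(sequence, dt=0.05, n_steps=200, start=(1.0, 0.0)):
    """Forward run, momentum flip, second run; distance from the start."""
    x, p = propagate(start, sequence, dt, n_steps)
    x, p = propagate((x, -p), sequence, dt, n_steps)
    return abs(x - start[0]) + abs(-p - start[1])


# First-order factorization: full kick followed by full drift.
FIRST_ORDER = [(kick, 1.0), (drift, 1.0)]
# Symmetric (Trotter) factorization: half kick, full drift, half kick.
SYMMETRIC = [(kick, 0.5), (drift, 1.0), (kick, 0.5)]

if __name__ == "__main__":
    print("first order:", reversal_error(FIRST_ORDER))
    print("symmetric:  ", reversal_error(SYMMETRIC))
```

The symmetric sequence reads the same forwards and backwards, which is the property that both approaches described above restore; the first-order sequence does not, and its reversal error stays well above round-off.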
291. ystems subject only to mechanical work. However, by following the arguments of Ref. 116, it is straightforward to generalize the variance:
$$\sigma^2(\Delta f_{n\to m}) = 2\left[ \sum_{i=1}^{N_{n\to m}} \frac{1}{1 + \cosh\left(W_i[n\to m] - \Delta f'\right)} + \sum_{j=1}^{N_{m\to n}} \frac{1}{1 + \cosh\left(W_j[m\to n] + \Delta f'\right)} \right]^{-1} - N_{n\to m}^{-1} - N_{m\to n}^{-1} \qquad (6.26)$$
where $\Delta f' = \Delta f_{n\to m} + \ln(N_{m\to n}/N_{n\to m})$. The quantity $\sigma^2(\Delta f_{n\to m})$ can be calculated once $\Delta f_{n\to m}$ is recovered from Eq. 6.25. It is obvious that, in order to employ Eq. 6.25, both the n and m ensembles must be visited at least once. If statistics are instead retrieved from one ensemble alone, say n, then we have to resort to a different approach. The one we employ is consistent with the previous treatment. In fact, in the limit that only one work collection (specifically, the n → m collection) is available, Eq. 6.25 becomes [65] (compare with Eq. 21 of Ref. 176) [Eq. 6.27], thus recovering the well-known fact that the free energy is the expectation value of the work exponential average [62].

6.3.2 Implementation of adaptive free energy estimates in the ORAC program: the BAR-SGE method
We now describe how the machinery introduced in Section 6.3.1 can be employed in SGE simulation programs such as ORAC. Suppose we deal with N ensembles of a generic Λ-space, be it a temperature space, a λ-space, or even a multiple-parameter space. Without loss of generality, we order the ensembles as $\Lambda_1 < \Lambda_2 < \dots < \Lambda_N$. Thus N - 1 optimal weights $\Delta f_{1\to 2}, \Delta f_{2\to 3}, \dots, \Delta f_{N-1\to N}$ have to be estimated
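The single-collection limit mentioned above, where the free energy difference is obtained from an exponential average of the work, can be sketched as follows. This is a generic estimator and not the ORAC implementation: it assumes dimensionless work values (already divided by $k_B T$) and the convention $\Delta f = -\ln \langle e^{-W} \rangle$; the function name is hypothetical.

```python
import math


def exponential_average_free_energy(works):
    """Return -ln< exp(-W) > for dimensionless work values W.

    The smallest work is factored out before exponentiating, so that
    large |W| values do not overflow or underflow.
    """
    if not works:
        raise ValueError("at least one work value is required")
    w_min = min(works)
    total = sum(math.exp(-(w - w_min)) for w in works)
    return w_min - math.log(total / len(works))


if __name__ == "__main__":
    sample = [1.0, 3.0, 2.2, 1.7]
    estimate = exponential_average_free_energy(sample)
    # By Jensen's inequality the estimate never exceeds the mean work.
    print(estimate, sum(sample) / len(sample))
```

Because the average is dominated by the smallest work values, this one-sided estimate converges slowly when the work distribution is broad, which is one reason a two-sided estimate is preferable whenever both collections are available.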
