User Guide - The Cambridge Crystallographic Data Centre

Contents

1. covalent_substructure = <option>
   default: 0 (off); options: 0 (off), 1 (on)
   GOLD supports two types of covalent link: a covalent link for use with individual ligands, and a substructure-based covalent link for use with multiple ligands that have a common functional group. Set covalent_substructure = 1 to define a substructure-based covalent link. During docking, the link will then be applied to any ligand that contains the specified substructure (see covalent_substructure_filename).
   Related instructions: covalent, covalent_protein_atom_no, covalent_substructure_filename, covalent_substructure_atom_no, covalent_topology
   GOLD Configuration File User Guide
   12.7 covalent_topology
   covalent_topology = <option>
   default: 0 (off); options: 0 (off), 1 (on)
   When setting up a substructure-based covalent link (see covalent_substructure) in which the specified substructure atom, and therefore ligand atom, is topologically equivalent to other atoms (e.g. it is one of the oxygen atoms of an ionised carboxylate group), it is possible to instruct GOLD to use whichever of the equivalent atoms gives the best result. Set covalent_topology = 1 to consider topologically equivalent ligand atoms.
   Related instructions: covalent, covalent_protein_atom_no, covalent_substructure_filename, covalent_substructure_atom_no, covalent_substructure, divsol_cluster_size
   13 Parallel Op
2. 9.4 flip_amide_bonds
   flip_amide_bonds = <option>
   default: 0 (off); options: 0 (off), 1 (on)
   During initialisation of the ligand, amides (including thioamides, ureas and thioureas) will automatically be set to the preferred trans conformation. Setting flip_amide_bonds = 1 will allow amides, thioamides, ureas and thioureas in the ligand to subsequently flip between cis and trans during docking.
   Note: In order to flip between cis and trans conformations, the CO-NRR torsion is first made planar at the initialised trans conformation. N,N-disubstituted amides are not made planar. CO-NH2 will be set so that the NH2 group is in plane with the CO; care must be taken that the input RNH2 group itself is planar, since GOLD will not change this.
   Related instructions: postprocess_bonds, rotatable_bond_override_file
   9.5 flip_planar_n
   flip_planar_n = <option> <flip_ring_NRR | fix_ring_NRR | rot_ring_NRR> <flip_ring_NHR | fix_ring_NHR | rot_ring_NHR>
   default: 1 flip_ring_NRR flip_ring_NHR; options: 0 (off), 1 (on)
   Set flip_planar_n = 1 to allow planar trigonal nitrogens in the ligand that are bound to sp2 carbons to flip between cis and trans conformations during docking; otherwise they will be held fixed at the input geometry. In addition, it is possible to specifically control the behaviour of ring-NHR and ring-NRR groups by using the following keywords: fli
3. and m is the number of the ligand, i.e. m1 for the first ligand in the input file, m2 for the second, etc.
   If you do not wish to retain these individual solution files (e.g. when writing all solutions to a single concatenated file, see concatenated_output), then specify clean_up_option delete_all_solutions in order to delete the unwanted solution files.
   Related instructions: concatenated_output, clean_up_option delete_redundant_log_files, clean_up_option delete_all_initialised_ligands, clean_up_option delete_empty_directories
   14.5 clean_up_option save_fitness_better_than
   clean_up_option save_fitness_better_than <value>
   Set clean_up_option save_fitness_better_than in order to filter out all solutions with fitness scores lower than <value>, i.e. solutions with fitness scores lower than that specified will be rejected. For example, setting clean_up_option save_fitness_better_than 50 will mean that any solution with a fitness lower than 50 will not be kept.
   Related instructions: clean_up_option save_top_n_solutions, clean_up_option save_best_ligands
   14.6 clean_up_option save_top_n_solutions
   clean_up_option save_top_n_solutions <value>
   By default, all docked solutions will be kept at the end of a docking run. However, GOLD can produce a lot of output and you may wish to cut this down. Set clean_up_option save_top_n_solutions in order to r
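The filtering rule described for save_fitness_better_than can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text (solutions scoring lower than the threshold are rejected); the function name and the (name, fitness) tuple layout are hypothetical and are not part of GOLD.

```python
from typing import List, Tuple


def filter_by_fitness(
    solutions: List[Tuple[str, float]], threshold: float
) -> List[Tuple[str, float]]:
    """Keep only solutions whose fitness is not lower than `threshold`.

    `solutions` is a hypothetical list of (solution_name, fitness) pairs.
    """
    return [(name, fitness) for name, fitness in solutions if fitness >= threshold]


if __name__ == "__main__":
    docked = [("sol_1", 49.9), ("sol_2", 50.0), ("sol_3", 72.3)]
    # With a threshold of 50, only the solution scoring below 50 is dropped.
    print(filter_by_fitness(docked, 50.0))
```

A solution scoring exactly the threshold is kept here, because the text only rejects scores lower than the value.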
4. options: 0 (off), 1 (on)
   This option allows control over whether or not diverse solutions are generated for a docking run.
   Note: If diverse solutions have never been generated for a particular GOLD run, the diverse_solutions = 0 line may not be present in the gold.conf.
   The diverse_solutions instruction in the gold.conf should be accompanied by divsol_cluster_size and divsol_rmsd instructions.
   Related instructions: divsol_cluster_size, divsol_rmsd
   9.13 divsol_cluster_size
   divsol_cluster_size = <value>
   default: 1
   When generating diverse docking solutions, use the divsol_cluster_size tag to specify how many ligand diverse solutions are contained in a cluster.
   Related instructions: diverse_solutions, divsol_rmsd
   9.14 divsol_rmsd
   divsol_rmsd = <value>
   default: 1.5
   When generating diverse docking solutions, use this setting to define the RMSD cut-off (in Å) for determining whether diverse solutions are in the same cluster or not.
   Related instructions: diverse_solutions
   9.15 solvate_all
   solvate_all = <option>
   default: 0 (off); options: 0 (off), 1 (on)
   When the binding site is generated, potential donor and acceptor fitting points are added to solvent-accessible atoms; the potential fitting points are themselves tested for solvent accessibility, and only those fitting points that are accessible are used. It is possible to remove this requirement for fitting points to be solvent acc
5. To use these features you will therefore need to set up the GOLD job as normal in the graphical interface, save the configuration file, then manually edit this file.
   Once a configuration file has been created it can be re-used, either as a quick way of reading program settings into the GOLD front end or to run GOLD from the command line.
   Running GOLD Using a Configuration File
   To load a previously created configuration file into the GOLD interface, enter the file name into the Conf file entry box at the top of the GOLD Setup window. Alternatively, click on the Load button and use the file selection window to choose the file. The parameters read in from the configuration file will overwrite any parameters that have already been set in the GOLD front end.
   To run GOLD from the command line using a configuration file, issue the following command.
   Unix platforms: GOLD can be run directly in the background by using a simple command available in $GOLD_DIR/bin/:
      gold_auto gold.conf &
   where gold.conf is the name of a configuration file.
   Windows: GOLD can be run on Windows by starting a command prompt, navigating to the directory containing the gold.conf file and running the following command:
      "C:\Program Files\CCDC\gold_suite\GOLD\gold\d_win32\bin\gold_win32.exe"
   The above command assumes that GOLD is installed in the default installation directory and that the config
6. rds_metal
   17.2 rds_use_protein_coords
   rds_use_protein_coords = <option>
   default: 1 (on); options: 0 (off), 1 (on)
   Use the protein atom position (coordinates) when calculating the depth of the atom involved in the interaction. Set rds_use_protein_coords = 0 to use the ligand atom position instead.
   Required instructions: receptor_depth_scaling, rds_use_donor_coords, rds_use_exact_count, rds_protein_distance, rds_hbond, rds_lipo, rds_clash, rds_metal
   17.3 rds_use_donor_coords
   rds_use_donor_coords = <option>
   default: 1 (on); options: 0 (off), 1 (on)
   When the receptor depth is calculated for a hydrogen bond donor, the heavy atom position (typically N or O) is used by default. Set to 0 (off) to use the position of the hydrogen for the calculation.
   Required instructions: receptor_depth_scaling, rds_use_protein_coords, rds_use_exact_count, rds_protein_distance, rds_hbond, rds_lipo, rds_clash, rds_metal
   17.4 rds_use_exact_count
   rds_use_exact_count = <option>
   default: 1 (on); options: 0 (off), 1 (on)
   Receptor depth (RD) values need to be pre-calculated on a grid to speed up the rescore when ligand atoms are involved. For hydrogen bond interactions where rds_use_protein_coords = 1, the exact values of the RD may be pre-calculated for the protein atom positions and used instead.
   Note: Lipophilic interactions will still require the grid based cou
7. 15 per_atom_scores
   per_atom_scores = <option>
   default: 0 (off); options: 0 (off), 1 (on)
   By including the line per_atom_scores = 1, GOLD will save the scoring contributions of individual ligand and protein atoms to the docked solution output files. For each atom, its contribution to the total fitness score and also to the constituent scoring terms will be written.
   Related instructions: per_atom_scores
   14.16 save_per_atom_scores_to_charge_field
   save_per_atom_scores_to_charge_field = <option>
   default: 0 (off); options: 0 (off), 1 (on)
   By including the line per_atom_scores = 1, GOLD will save the scoring contributions of individual ligand and protein atoms to the docked solution output files. By also providing the line save_per_atom_scores_to_charge_field, the ligand atom scores will be saved to the charge field of the solution mol2 files.
   Related instructions: per_atom_scores
   15 Fitness Function Settings
   15.1 initial_virtual_pt_match_max
   initial_virtual_pt_match_max = <value>
   default: 4.0
   When Goldscore is being used, the annealing parameters (van der Waals and Hydrogen Bonding) allow poor hydrogen bonds to occur at the beginning of a genetic algorithm run, in the expectation that they will evolve to better solutions. The parameter initial_virtual_pt_match_max is used to set the starting values of max_distance the
8. 59 0 59 059 0 59 1 4 2 1 4 3 3 3 9
   Related instructions: interaction_restraint_weight
   11.10 force_constraints
   force_constraints = <option>
   default: 0 (off); options: 0 (off), 1 (on)
   Setting force_constraints = 1 will instruct GOLD not to dock ligands when the specified constraint(s) are physically impossible to satisfy, e.g. if no suitable group is present in the ligand to form the required H-bond constraint.
   12 Covalent Bonding
   12.1 covalent
   covalent = <option>
   default: 0 (off); options: 0 (off), 1 (on)
   Set covalent = 1 in order to dock covalently bound ligands. GOLD assumes that there is just one atom linking the ligand to the protein (e.g. the O in a serine residue). Both protein and ligand files should be set up with the link atom included, so if the serine O is the link atom it will appear in both the protein and ligand input files.
   Ideally, the link atom in both the ligand and the protein will have a free valence available through which the link can be made. If the link atom on the ligand does not have a free valence (having a hydrogen instead), then the docking will proceed and the hydrogen will be ignored in terms of its contribution to the fitness score. It will, however, still be displayed when docking poses are visualised.
   It is necessary to specify which ligand atom (see covalent_ligand_atom_no) is bonded to which protein atom (see covalent_pr
9. docking_param_file = Z:\GOLD\datafiles\my_sf.params
   The format of the scoring function parameter file is quite strict; incorrect editing may cause GOLD to behave in unexpected ways, or even to crash. Because of the large number of parameters, no guarantee can be given that the program will behave reliably with anything other than the default parameterisation.
   Specific parameter files for use with heme-containing proteins are also available for both GoldScore and ChemScore. For further information see S. B. Kirton, C. W. Murray, M. L. Verdonk and R. D. Taylor, Proteins: Structure, Function, and Bioinformatics, 58, 836-844, 2005. The parameters are derived from contact statistics obtained from the CSD and PDB databases. These parameters can be used by specifying the appropriate params file from those that have been supplied with the GOLD installation. The following params files are available within the $GOLD_DIR/gold directory:
      goldscore.p450_csd.params
      goldscore.p450_pdb.params
      chemscore.p450_csd.params
      chemscore.p450_pdb.params
   A specific parameter file for use with protein kinases is available for ChemScore. For further information see M. L. Verdonk, V. Berdini, M. J. Hartshorn, W. T. M. Mooij, C. W. Murray, R. D. Taylor and P. Watson, J. Chem. Inf. Comput. Sci., 44, 793-806, 2004. This allows weak CH...O interactions to be accounted for by inclusion of a ChemScore term that calculates a contribution for
10. during docking.
   Setting the trans_spin option will make GOLD spin and translate the water molecule to optimise the orientation of the hydrogen atoms as well as the water molecule's position within a user-defined radius. Note that the translation value must be between 0 and 2, e.g.
      water 1 toggle trans_spin 1.5
   To predict whether a specific water molecule should be bound or displaced, GOLD estimates the free energy change, ΔG, associated with transferring a water molecule from the bulk solvent to its binding site in a protein-ligand complex. Further details can be found in: Modeling Water Molecules in Protein-Ligand Docking Using GOLD, Marcel L. Verdonk, Gianni Chessari, Jason C. Cole, Michael J. Hartshorn, Christopher W. Murray, J. Willem M. Nissink, Richard D. Taylor and Robin Taylor, J. Med. Chem., 48, 6504-6515, 2005.
   For each key water molecule you will need to specify:
   - The atom number of the water oxygen atom, as defined in the protein input MOL2 file.
   - The state of the water. Available options are:
     on: use the water for docking, i.e. present
     off: do not use the water for docking, i.e. absent
     toggle: have GOLD decide whether the water should be present or absent (i.e. bound or displaced) during the docking run
   - The orientation of the water hydrogen atoms. Available options are:
     fix: use the orientation specified in the input file
     spin: have GOLD automatically optimise the orientation of the hydrogen atoms
     tr
11. file and is used automatically if tordist_file is set to DEFAULT.
   mimumba.tordist: this contains all the torsional distributions used in the MIMUMBA program (Klebe and Mietzner, J. Comput.-Aided Mol. Des., 8, 583-606, 1994).
   The torsion angle distribution file can be customised by copying it, editing the copy, and instructing GOLD to use the edited file, e.g.
      tordist_file = Z:\GOLD\datafiles\custom.tordist
   Note: The format of entries in the torsion angle distribution file is strict; incorrect editing of the file may cause GOLD to behave in unexpected ways.
   Related instructions: use_tordist
   8.8 make_subdirs
   make_subdirs = <option>
   default: 0 (off); options: 0 (off), 1 (on)
   When more than one ligand is being docked, set make_subdirs = 1 to have results for each ligand written to a separate sub-directory.
   8.9 save_lone_pairs
   save_lone_pairs = <option>
   default: 1 (on); options: 0 (off), 1 (on)
   Some third-party programs have difficulty reading files which contain lone pairs. You can stop GOLD including lone pairs when it writes docked solution files by switching off this option.
   8.10 bestranking_list_filename
   bestranking_list_filename = <filename>
   A file called bestranking.lst is written for batch jobs on multiple ligands. This gives a continuous summary of the best solution that has been obtained for each completed ligand. The file contains the t
12. from an alternative source, for example from a known crystal structure or a solution from another docking program.
   To perform a rescoring run, include the line:
      run_flag = RESCORE
   One or more of the following keywords may also be specified:
   no_simplex: By default, the docked ligand pose will be minimised before rescoring. Simplexing is important if you are to obtain meaningful scores: due to the nature of scoring functions, small changes in the location or conformation of the pose can have large effects on the calculated score. To disable simplexing, specify the keyword no_simplex.
   retrieve: When rescoring a GOLD solution file, it is possible to use the optimised positions of the polar protein hydrogen atoms that were generated during the original docking. To do this, include the keyword retrieve. If this keyword is not used, or no rotatable H positions are available, then the default hydrogen atom positions specified in the protein input file will be used.
   no_file: By default, GOLD will write out docked ligand solutions after rescoring. Solutions will be written to the file rescore.mol2 (to specify an alternative filename, see concatenated_output). To disable writing of this file, include the keyword no_file. If writing of this file is switched off, only the rescore.log file will be written.
   no_strip: By default, when rescoring a GOLD solution file, the list of active re
13. in its docked position, i.e. expressed with respect to the same coordinate frame as the protein and with the coordinates required to place it in the correct pose. <filename> is used to provide GOLD with the location of the template file. The template must be supplied as a MOL2 file or PDB file.
   The similarity constraint can be applied in three ways that differ in the way that the overlap between ligand and template is calculated. The similarity can be evaluated:
   - by using the overlap between all donor atoms in the template and the ligand being docked:
        constraint similarity donor <filename> <weight>
   - by using the overlap between all acceptor atoms in the template and the ligand being docked:
        constraint similarity acceptor <filename> <weight>
   - by using the overlap of all atoms of the template (this can be regarded as a ligand shape constraint):
        constraint similarity all <filename> <weight>
   The value of <weight> determines the maximum energy term that would be added to the score in the case of perfect overlap between ligand and template, i.e. the energy term to be added is calculated as similarity times weight. The similarity value is between 0 and 1, where 1 indicates an identical match between template and ligand. As an initial value for this term we suggest a value between 5 and 30.
   When using constraints, GOLD will be biased towards find
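As a small worked illustration of the statement that the added energy term is "similarity times weight", the sketch below computes that product in Python. The function name is hypothetical, and how GOLD actually evaluates the overlap-based similarity is not shown here; the similarity is simply taken as an input between 0 and 1.

```python
def similarity_term(similarity: float, weight: float) -> float:
    """Return the score contribution similarity * weight.

    `similarity` must lie in [0, 1]; 1 means an identical match between
    template and ligand, so `weight` is the maximum possible contribution.
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be between 0 and 1")
    return similarity * weight


if __name__ == "__main__":
    # Perfect overlap with a weight of 30 adds the full 30 to the score.
    print(similarity_term(1.0, 30.0))
    # Half overlap with a weight of 10 adds 5.
    print(similarity_term(0.5, 10.0))
```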
14. ligand
   The early termination criterion must be specified. GOLD will stop docking a ligand when the specified number of top solutions (see n_top_solutions) are all within <value> rmsd of each other.
   Related instructions: early_termination, n_top_solutions
   11 Constraints
   11.1 constraint distance
   constraint distance <protein|ligand> <atom id> <protein|ligand> <atom id> <max distance> <min distance> <spring constant> <on|off>
   A distance between a specified ligand atom and protein atom, or between two ligand atoms, can be constrained to lie between minimum and maximum distance bounds. During a GOLD run, if a constrained distance is found to lie outside its bounds, a spring energy term is used to reduce the fitness score, i.e.
      E = k x^2
   where x is the difference between the distance and the closest constraint bound, and k is a user-defined spring constant.
   When using a distance constraint, GOLD will therefore be biased towards finding solutions in which the specified constraint is satisfied. However, it is important to remember that such a solution is not guaranteed, i.e. it is not possible to force a constraint to be satisfied in the final solution. It is possible to instruct GOLD not to dock ligands when the specified constraint(s) are physically impossible to satisfy, e.g. if n
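The spring term for a distance constraint can be sketched as follows. This assumes the penalty takes the quadratic form E = k·x², with x the distance to the closest violated bound; the function and parameter names are hypothetical and the code is not GOLD's implementation.

```python
def distance_penalty(
    distance: float, min_distance: float, max_distance: float, k: float
) -> float:
    """Return the spring penalty k * x**2 for a constrained distance.

    x is how far `distance` lies outside [min_distance, max_distance];
    inside the bounds the penalty is zero.
    """
    if distance < min_distance:
        x = min_distance - distance
    elif distance > max_distance:
        x = distance - max_distance
    else:
        return 0.0
    return k * x * x


if __name__ == "__main__":
    # Bounds 2.0 to 3.0 with spring constant 5.0 (illustrative values only).
    for d in (2.5, 4.0, 1.5):
        print(d, distance_penalty(d, 2.0, 3.0, 5.0))
```

Because the penalty only reduces the score and never forbids a pose, a large violation can still survive if the rest of the fitness is good enough, which matches the text's warning that satisfaction is not guaranteed.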
15. part of the ensemble. The best ligand conformation found in any of the ensemble structures is returned.
   For each protein involved in the ensemble, it is necessary to add an ensemble_structure block of commands to the gold.conf. Within this block a number of parameters can be defined, such as the protein filename, any score offset values, any constraints, flexible sidechains or customised metal geometries, e.g.
      ensemble_structure
      protein_datafile = 1QPD_protein.mol2
      protein_score_offset = 5.0
      CONSTRAINTS
      constraint protein_h_bond 10.0000 0.005000 1207
      end_ensemble_structure
   The protein_datafile instruction specifies the input protein file.
   The protein_score_offset instruction must be accompanied by a value, which will be subtracted from the overall fitness score if a ligand is docked into this protein structure. In this way, the selection of certain protein structures can be biased. If using this feature, these scores are reported as DE(Protein) in the GOLD log files.
   Constraints that are specific to the protein (e.g. a protein H-bond constraint) should be specified within the ensemble_structure block, as above. The same applies to flexible side chains and customised metal geometries. Instructions for specifying constraints, flexible side chains and customised metal geometries are detailed elsewhere in this document.
   Related instructions: constraint distance, constraint h_bond, constraint protein_h_bond, constraint substructure, protein_datafi
16. the template that gives the best fit based on RMSD.
   The geometry templates used for given metals are defined in the gold.params file, in the section headed Metals. For example, for a Zn atom GOLD will attempt to match coordination geometries 4, 5 and 6 (tetrahedral, trigonal bipyramidal and octahedral templates) onto the coordinating atoms found in the protein. The template that gives the best match will then be used to generate coordination fitting points.
   In addition to the templates listed above, it is possible to specify custom metal coordination geometries, which can subsequently be used to derive ligand binding points around particular metal atoms.
   A custom metal polyhedron may contain up to nine points. Each point in the custom polyhedron must be specified using a vector, assuming the centre of your polyhedron is at the origin.
   For example, to set up a custom square planar geometry you must specify four points using the following instructions:
      metal_coordination_spec
      point 0 1 0
      point 1 0 0
      point -1 0 0
      point 0 -1 0
      end_metal_coordination_spec
   Assuming the metal is on the origin (0, 0, 0), GOLD will then attempt to match the specified vectors onto the metal-to-protein-atom vectors found in the protein (vectors are normalised to a metal-to-chelator distance of 2.0).
   Once defined, it is necessary to explicitly instruct GOLD to consider custom metal coordination geometries wh
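To illustrate what "normalised to a metal-to-chelator distance of 2.0" means for a custom polyhedron, the Python sketch below rescales each user-supplied point so that it lies 2.0 units from the origin. The function name is hypothetical, the square planar points use the sign convention assumed here (two opposite pairs along x and y), and this is not GOLD's own code.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, float, float]


def normalise_points(points: Sequence[Vector], length: float = 2.0) -> List[Vector]:
    """Rescale each point (metal assumed at the origin) to the given length."""
    if len(points) > 9:
        raise ValueError("a custom polyhedron may contain up to nine points")
    scaled = []
    for x, y, z in points:
        norm = math.sqrt(x * x + y * y + z * z)
        if norm == 0.0:
            raise ValueError("a point cannot coincide with the metal position")
        factor = length / norm
        scaled.append((x * factor, y * factor, z * factor))
    return scaled


if __name__ == "__main__":
    # Square planar geometry: four points in the xy-plane around the origin.
    square_planar = [(0, 1, 0), (1, 0, 0), (-1, 0, 0), (0, -1, 0)]
    for vector in normalise_points(square_planar):
        print(vector)
```

Since only the direction of each vector survives the rescaling, the points can be entered with any convenient length.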
17. thereby making the most efficient use of search time. When using automatic settings, the search efficiency (see autoscale) can be used to control the speed of docking and the predictive accuracy, i.e. the reliability of the results.
   Related instructions: popsiz, select_pressure, n_islands, niche_siz, pt_crosswt, allele_mutatewt, migratewt, autoscale
   5.5 niche_siz
   niche_siz = <value | auto>
   default: 2
   Niching is a common technique used in genetic algorithms to preserve diversity within the population. In GOLD, two individuals share the same niche if the rmsd between the coordinates of their donor and acceptor atoms is less than 1.0 Å. When adding a new individual to the population, a count is made of the number of individuals in the population that inhabit the same niche as the new chromosome. If there are more than niche_siz individuals in the niche, then the new individual replaces the worst member of the niche rather than the worst member of the total population.
   Optimum values of the genetic algorithm parameters are highly correlated; you are therefore recommended to use automatic (ligand-dependent) GA parameter settings (see below) or one of the default parameter sets offered via the front end.
   Setting all population and genetic operators to auto will instruct GOLD to automatically calculate the optimal number of operations for each ligand, thereby making the most efficient use of search time. When using automatic settings
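The niching rule described for niche_siz can be sketched in Python as below. This is a simplified illustration under several assumptions: higher fitness is better, every individual stores the same number of donor/acceptor coordinates in the same order, and the class and function names are hypothetical rather than GOLD's.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Coords = List[Tuple[float, float, float]]


@dataclass
class Individual:
    fitness: float
    coords: Coords  # donor and acceptor atom coordinates


def rmsd(a: Coords, b: Coords) -> float:
    """Root-mean-square deviation between two equally sized coordinate lists."""
    total = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(total / len(a))


def insert_with_niching(
    population: List[Individual],
    new: Individual,
    niche_siz: int = 2,
    niche_rmsd: float = 1.0,
) -> int:
    """Insert `new`, replacing the worst niche member if the niche is full.

    Returns the index of the individual that was replaced.
    """
    niche = [
        i for i, ind in enumerate(population)
        if rmsd(ind.coords, new.coords) < niche_rmsd
    ]
    # More than niche_siz neighbours: compete only within the niche.
    candidates = niche if len(niche) > niche_siz else range(len(population))
    worst = min(candidates, key=lambda i: population[i].fitness)
    population[worst] = new
    return worst
```

Replacing within a crowded niche stops one region of the search space from taking over the whole population, which is the diversity-preserving effect the text describes.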
18. weak hydrogen bonds. This term can be useful when dealing with particular proteins, e.g. most kinases contain weak N-heterocycle CH...O hydrogen bonds. The following params file is available within the $GOLD_DIR/gold directory:
      chemscore.kinase.params
   Related instructions: gold_fitfunc_path, docking_fitfunc_path, rescore_fitfunc_path, rescore_param_file, run_flag
   15.6 rescore_fitfunc_path
   rescore_fitfunc_path = <goldscore | chemscore | asp | plp | <filename>>
   It is possible to perform automatic rescoring on docked poses using a different scoring function to that used during the docking. In order for GOLD to know which scoring functions to use in such a consensus scheme, the options docking_fitfunc_path and rescore_fitfunc_path need to be defined. Further, the gold_fitfunc_path option should be set to consensus_score and the run_flag should be set to CONSENSUS. The options docking_param_file, docking_fitfunc_path and rescore_param_file are also required to be set when performing automatic rescoring.
   The rescore_fitfunc_path specifies the scoring function to be used for the rescoring part of the consensus scoring. GOLD offers a choice of scoring functions: GoldScore, ChemScore, Astex Statistical Potential, Piecewise Linear Potential, and User Defined Score, which allows users to modify an existing function or implement their own scoring function via a Scoring Function Application Programming Interface (API). Scor
19. Table of contents (fragment): 7.4 floodfill_atom_no; 7.5 cavity_file; 8.1 ligand_data_file; 8.2 ligand_reference_file; 8.4 set_protein_atom_types; 8.5 set_ligand_atom_types; 8.8 make_subdirs; 8.9 save_lone_pairs; 8.10 bestranking_list_filename; 8.11 ligands_from_socket; 8.12 ligands_to_socket; 8.14 fit_points; 9.1 internal_ligand_h_bonds; 9.2 flip_free_corners; 9.3 match_ring_templates; 9.4 flip_amide_bonds; 9.5 flip_planar_n; 9.6 flip_pyramidal_n; 9.7 rotate_carboxylic_oh; 9.8 use_tordist; 9.9 fix_rotatable_bond
20. 17.9 rds_metal
   rds_metal = <min no atoms> <max no atoms> <scale factor 1> <scale factor 2> <output>
   default: rds_metal 13 105 0 1 8 0
   The rds_metal setting controls how the metal interaction term is scaled within RDS. If the number of atoms surrounding an interaction point between the ligand and the protein is <min no atoms> or less, it is scaled by <scale factor 1>. Between <min no atoms> and <max no atoms> it is scaled linearly to <scale factor 2>. All interactions with more than <max no atoms> are scaled by <scale factor 2>. The variable <output> is 0 for default output, while 1 is for verbose output.
   Note: Output set to 1 will give large amounts of information.
   The receptor depth scaling has been validated for the default values only; any changes to these values can lead to unpredictable results and should be made with caution.
   Required instructions: receptor_depth_scaling, rds_use_protein_coords, rds_use_donor_coords, rds_use_exact_count, rds_protein_distance, rds_hbond, rds_lipo, rds_clash
   18 Water Data
   18.1 water
   water <atom id> <on|off|toggle> <spin|fix|trans_spin>
   GOLD allows waters to switch on and off (i.e. to be bound or displaced) and to rotate around their three principal axes to optimise hydrogen bonding
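The piecewise scaling described for rds_metal can be sketched as below. This is a reading of the text only: a constant factor at or below the minimum atom count, a constant factor above the maximum, and straight-line interpolation in between. The function name and the scale factors used in the example are hypothetical and are not GOLD defaults.

```python
def rds_scale(
    n_atoms: int, min_atoms: int, max_atoms: int, scale1: float, scale2: float
) -> float:
    """Return the scale factor for an interaction surrounded by `n_atoms` atoms."""
    if n_atoms <= min_atoms:
        return scale1
    if n_atoms >= max_atoms:
        return scale2
    # Linear interpolation between the two scale factors.
    fraction = (n_atoms - min_atoms) / (max_atoms - min_atoms)
    return scale1 + fraction * (scale2 - scale1)


if __name__ == "__main__":
    # Illustrative values: counts 13 and 105, factors 1.0 and 2.0 (assumed).
    for n in (10, 59, 200):
        print(n, rds_scale(n, 13, 105, 1.0, 2.0))
```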
21. GOLD Configuration File User Guide
   A Component of the GOLD Suite
   5.3 Release
   Copyright © 2015 Cambridge Crystallographic Data Centre
   Registered Charity No 800579
   Conditions of Use
   The GOLD suite of programs (the "Program"), comprising all or some of the following: Hermes (including as Relibase client and as SuperStar interface), GOLD, GoldMine, associated documentation and software, are copyright works of CCDC Software Limited and its licensors, and all rights are protected. Use of the Program is permitted solely in accordance with a valid Software Licence Agreement or a valid Licence and Support Agreement with CCDC Software Limited, or a valid Licence of Access to the CSD System with CCDC, and the Program is proprietary. All persons accessing the Program should make themselves aware of the conditions contained in the Software Licence Agreement or Licence and Support Agreement or Licence of Access Agreement.
   In particular:
   - The Program is to be treated as confidential and may NOT be disclosed or re-distributed in any form, in whole or in part, to any third party.
   - No representations, warranties, or liabilities are expressed or implied in the supply of the Program by CCDC Software Ltd., its servants or agents, except where such exclusion or limitation is prohibited, void or unenforceable under governing law.
   GOLD 2015 CCDC Software Ltd.
   Hermes 2015 CCDC Software Ltd.
   GoldMine 2015 CCDC Software Ltd.
   Implementation of ChemSc
22. _ligand_h_bonds
   internal_ligand_h_bonds = <option>
   default: 0 (off); options: 0 (off), 1 (on)
   Switching this option on will allow intramolecular hydrogen bonds in the ligand to be formed during docking.
   9.2 flip_free_corners
   flip_free_corners = <option>
   default: 0 (off); options: 0 (off), 1 (on)
   Switch this option on to allow free corners of ligand rings to flip. This will result in GOLD performing a limited conformational search of cyclic systems by allowing free corners of rings to flip above or below the plane of their neighbouring atoms. If flip_free_corners = 0, then rings will be held rigid at the input conformation during docking. The rules governing flipping of ring corners in GOLD are given in A. W. R. Payne and R. C. Glen, J. Mol. Graphics, 1993, 10, 74-91.
   Related instructions: match_ring_templates
   9.3 match_ring_templates
   match_ring_templates = <option>
   default: 0 (off); options: 0 (off), 1 (on)
   Switch this option on to allow ring conformational searching using the ring template library. If the ligand contains a ring that is defined in the ring template library, the conformation of that ring will vary within the genetic algorithm run. Each time the current chosen ring conformation changes (it is mutated), the ligand ring conformation is driven to the newly matched conformation by changing the bond lengths, internal ring angles and torsions.
   Related instructions: flip_free_corners
23. aced at an explicitly defined position (using x, y, z coordinates) within the binding site. Each sphere is assigned a user-defined radius, so a sphere can be adjusted if required, e.g. to fill an entire pocket in the binding site.
   A contribution, determined according to a user-specified weighting, is then added to the score for each specified non-hydrogen ligand atom that lies within the designated sphere.
   Note: the contribution is added to the score for each atom located within the sphere, i.e. the total contribution will depend on the number of atoms found in the region of interest, and ultimately on the ligand-accessible volume of the region.
   <x value> <y value> <z value> are used to specify the position of the sphere within the binding site.
   The sphere must also be assigned a radius, <radius>. The minimum settable value of <radius> is 0.5.
   The value of <score> is the contribution that will be added to the score for each specified non-hydrogen ligand atom that lies within the designated sphere.
   The ligand atoms used in the constraint can be specified explicitly from a list of atom numbers. Atoms should be specified using list <atom ids>. The atom indices, as defined in the ligand input file, must be used. Atom indices should be separated by a single space.
   Alternatively, it is possible to use all hydrophobic ligand atoms, or to use only those hydrop
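The per-atom contribution of a sphere-based region constraint can be illustrated with the Python sketch below. It assumes that an atom lying exactly on the sphere surface counts as inside, and it takes the selected non-hydrogen ligand atoms as a plain list of coordinates; the function name is hypothetical and the code is not GOLD's implementation.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def region_score(
    atoms: Sequence[Point], centre: Point, radius: float, score_per_atom: float
) -> float:
    """Sum `score_per_atom` over every selected atom inside the sphere."""
    if radius < 0.5:
        raise ValueError("the minimum settable radius is 0.5")
    inside = sum(1 for atom in atoms if math.dist(atom, centre) <= radius)
    return inside * score_per_atom


if __name__ == "__main__":
    ligand_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
    # Two of the three atoms fall inside a sphere of radius 2.0 at the origin.
    print(region_score(ligand_atoms, (0.0, 0.0, 0.0), 2.0, 3.0))
```

Because the total grows with the number of atoms inside the sphere, a larger radius or a bulkier ligand fragment increases the bias, as the note in the text points out.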
24. ach atom in the selected residue are found; these atoms, plus the atoms of their associated residues, are then used for the active site definition. Related instructions: do_cavity, floodfill_center, radius. 7.5 cavity_file. cavity_file = <filename>. cavity_file is used to provide GOLD with a binding site definition and must be followed by a filename, e.g. cavity_file = Z:\GOLD\datafiles\ligand_reference.mol2. When used in combination with floodfill_center = cavity_from_ligand (see floodfill_center), this option will define the binding site from a specified reference ligand. This could be a ligand in a known binding mode or the co-crystallised ligand. By default, all protein atoms that lie within 5.0 Å of each ligand atom are found; these atoms, plus the atoms of their associated residues, are then used for the active site definition. To use only those protein atoms within a user-specified cavity distance threshold of each ligand atom (i.e. without also including all atoms of their associated residues), use the following: floodfill_center = cavity_from_ligand <distance> atom, where <distance> is the cavity distance threshold and the keyword atom instructs GOLD not to include all atoms of each residue found. When used in combination with floodfill_center = file (see floodfill_center), this option will define the binding site from a list of protein atoms. The file specified should contain the atom id as it a
25. ans_spin: have GOLD optimise the orientation of the water H atoms as well as the location of the water O atom, within a user-defined radius of less than 2 Å. The directory in which the water file is stored: because waters are able to move and are specified outside the protein, they need their own file path. An example of acceptable input is: water 1 toggle trans_spin 2 /home/gold/waters/H20126.mol2. Any unspecified waters that are part of the protein are considered to be on automatically, and their orientation will not be optimised during docking. 19 Metal Data. 19.1 metal_coordination_spec. metal_coordination_spec; point <value> <value> <value>; point <value> <value> <value>; point <value> <value> <value>; end_metal_coordination_spec. By default GOLD will automatically determine metal coordination geometries. The following geometries are recognised (template: geometry, coordination number): TETR: tetrahedral, n = 4; TBP: trigonal bipyramidal, n = 5; OCT: octahedral, n = 6; CTP: capped trigonal prism, n = 7; PBP: pentagonal bipyramidal, n = 7; SQAP: square prism, n = 8; ICO: icosahedral, n = 10; DOD: dodecahedral, n = 12. In order to determine the coordination geometry of a particular metal atom, GOLD performs a permuted superimposition of coordination geometry templates onto the coordinating atoms found in the protein. Coordination fitting points are then generated using
26. arameter settings (see below) or one of the default parameter sets offered via the front end. Setting all population and genetic operators to auto will instruct GOLD to automatically calculate the optimal number of operations for each ligand, thereby making the most efficient use of search time. When using automatic settings, the search efficiency (see autoscale) can be used to control the speed of docking and the predictive accuracy, i.e. the reliability of the results. Related Instructions: popsiz, select_pressure, match_ring_templates, niche_siz, pt_crosswt, allele_mutatewt, migratewt, autoscale. 5.4 maxops. maxops = <value> | auto; default: 100000. The genetic algorithm starts off with a random population (each value in every chromosome is set to a random number). Genetic operations (crossover, migration, mutation) are then applied iteratively to the population. maxops is the number of operators that are applied over the course of a GA run. It is the key parameter in determining how long a GOLD run will take. Optimum values of the genetic algorithm parameters are highly correlated; you are therefore recommended to use automatic, ligand-dependent GA parameter settings (see below) or one of the default parameter sets offered via the front end. Setting all population and genetic operators to auto will instruct GOLD to automatically calculate the optimal number of operations for each ligand
27. ax operations. Similarly, the minimum number of operations can be specified (see autoscale_nops_min). Related Instructions: autoscale, autoscale_nops_min, popsiz, select_pressure, n_islands, match_ring_templates, niche_siz, pt_crosswt, allele_mutatewt, migratewt. 4.3 autoscale_nops_min. autoscale_nops_min = <value>; default: 0 (off). When using automatic, ligand-dependent GA parameter settings, the search efficiency (see autoscale) can be used to control the speed of docking and the predictive accuracy, i.e. the reliability of the results. When using autoscale, the minimum number of GA operations performed during the docking run will be updated automatically according to the autoscale value that is set. The automatic preset can be overridden to ensure that every ligand is subjected to at least autoscale_nops_min operations. Similarly, the maximum number of operations can be specified (see autoscale_nops_max). Related Instructions: autoscale, autoscale_nops_max, popsiz, select_pressure, n_islands, match_ring_templates, niche_siz, pt_crosswt, allele_mutatewt, migratewt. 5 Population. 5.1 popsiz. popsiz = <value> | auto; default: 100. The genetic algorithm maintains a set of possible solutions to the problem. Each possible solution is known as a chromosome and the set of solutions is termed a population. The value of popsiz (population size) is the number of chromosomes in th
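One way to picture how the two overrides bound the automatically chosen operation count is the small Python sketch below. It is an illustration under stated assumptions, not GOLD's code: the function name is hypothetical, a value of 0 is assumed to mean "off" for both limits (the text only states this for autoscale_nops_min), and the order in which the two limits are applied is a guess.

```python
def effective_operations(auto_ops, nops_min=0, nops_max=0):
    """Bound an automatically derived operation count.

    auto_ops stands for the ligand-dependent value GOLD would pick itself.
    A limit of 0 is treated as 'off' (assumption for nops_max).
    """
    ops = auto_ops
    if nops_min > 0:
        ops = max(ops, nops_min)  # never fewer than the requested minimum
    if nops_max > 0:
        ops = min(ops, nops_max)  # never more than the requested maximum
    return ops

print(effective_operations(8000, nops_min=20000))
print(effective_operations(250000, nops_max=100000))
print(effective_operations(50000))
```

A small ligand that would receive few operations is lifted to the minimum, a large flexible ligand is capped at the maximum, and with both limits off the automatic value passes through unchanged.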
28. by GOLD. The minimum H-bond geometry weight takes a value in the range 0 to 1; by default this value is set at 0.005. During docking, GOLD assesses the geometry of each required hydrogen bond on a scale of 0 to 1, with 1 denoting perfect. If this geometry weight for the constrained H-bond falls below the minimum H-bond geometry weight specified, a penalty, <constraint weight>, will be applied to the score for the unfulfilled hydrogen bond, i.e. it will not be considered to be an H-bond and will therefore contribute a penalty to the fitness score. The protein atom(s) to be constrained should be specified using <atom ids>. The atom indices as defined in the protein input file must be used. Either a donatable hydrogen atom (you must give the number of the hydrogen atom, not the O or N atom to which it is attached) or an acceptor can be specified. The protein atom(s) should also be available for ligand binding, i.e. solvent accessible. For a given protein H-bond constraint, more than one protein atom number can be specified. This will instruct GOLD to use an either/or type of constraint during docking. For example, specifying two protein atoms, acceptor m and acceptor n, separated by a space will result in the constraint being satisfied if an H-bond is formed to either m or n during docking. This is of use when defining constraints involving, for example, carboxylates, where it is not important which oxygen atom forms an H-bond provided that o
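The either/or logic and the geometry-weight threshold described above can be summarised in a short Python sketch. It is a hedged illustration only: the function name, the dictionary of per-atom geometry weights and the atom numbers are hypothetical, and treating a weight exactly equal to the minimum as satisfied is an assumption.

```python
def protein_hbond_penalty(geometry_weights, constrained_atoms,
                          constraint_weight, min_geometry_weight=0.005):
    """Return the penalty for one protein H-bond constraint.

    geometry_weights maps a protein atom number to the 0-1 geometry weight
    of its best hydrogen bond to the ligand (0.0 when no bond is formed).
    The constraint is satisfied if ANY listed atom reaches the minimum.
    """
    best = max((geometry_weights.get(atom, 0.0) for atom in constrained_atoms),
               default=0.0)
    return 0.0 if best >= min_geometry_weight else constraint_weight

# Hypothetical carboxylate oxygens 2125 and 2126; only 2126 forms an H-bond.
weights = {2125: 0.0, 2126: 0.4}
print(protein_hbond_penalty(weights, [2125, 2126], constraint_weight=10.0))
print(protein_hbond_penalty(weights, [2125], constraint_weight=10.0))
```

Listing both oxygens means the constraint is met by whichever one bonds, whereas constraining only the non-bonded oxygen incurs the full penalty.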
29. 14.9 clean_up_option save_clustered_solutions; 14.10 clean_up_option delete_empty_directories; 14.11 clean_up_option delete_rank_file; 14.12 clean_up_option delete_all_log_files; 14.13 clean_up_option delete_all_initialised_ligands; 14.14 output_file_format; 14.15 per_atom_scores; 14.16 save_per_atom_scores_to_charge_field; 15 Fitness Function Settings; 15.1 initial_virtual_pt_match_max; 15.2 relative_ligand_energy; 15.3 gold_fitfunc_path; 15.4 docking_fitfunc_path; 15.5 docking_param_file; 15.6 rescore_fitfunc_path; 15.7 rescore_param_file; 15.8 score_param_file; 15.9 start_vdw_linear_cutoff
30. covalent, covalent_ligand_atom_no, covalent_substructure_filename, covalent_substructure_atom_no, covalent_substructure, covalent_topology. covalent_ligand_atom_no. covalent_ligand_atom_no = <atom id>. When docking covalently bound ligands (see covalent), GOLD will assume that there is just one atom linking the ligand to the protein, e.g. the O in a serine residue. Both protein and ligand files should be set up with the link atom included. covalent_ligand_atom_no is used to define the link atom in the ligand file. The atom id as it appears in the ligand input file must be specified. The link atom as it appears in the protein input file must also be specified (see covalent_protein_atom_no). GOLD supports two types of covalent link: a covalent link for use with individual ligands, and a substructure-based covalent link for use with multiple ligands which have a common functional group (see covalent_substructure). covalent_ligand_atom_no should only be set when defining a covalent link to an individual ligand. Related Instructions: covalent, covalent_protein_atom_no. covalent_substructure_filename. covalent_substructure_filename = <filename>. It is possible to apply a covalent link to multiple ligands which have a common functional group (see covalent_substructure). During docking, the link will be applied to any ligands which contain a specified substructure. To use a substructure-based covalent link, first create a fi
31. d_reference_file = <filename>. ligand_reference_file is used to provide GOLD with a file containing a reference ligand, e.g. a crystallographically observed ligand pose. ligand_reference_file must be followed by a filename, e.g. ligand_reference_file = Z:\GOLD\datafiles\cryst_observed_pose.mol2. The ligand reference file will be used to perform automated RMSd calculations against GOLD solution(s). For each GOLD solution, the resultant RMSd with respect to the reference ligand will be written to the files containing the fitness function rankings, i.e. the ligand rank file (.rnk) and the bestranking.lst file. param_file. param_file = <filename> | DEFAULT; default: DEFAULT. param_file is used to provide GOLD with the location of the parameter file. The parameter file contains all of the parameters used by GOLD, e.g. hydrogen bond energies, atom radii and polarisabilities, torsion potentials, hydrogen bond directionalities, etc. It also contains parameters that control the general behaviour of GOLD, e.g. whether the final solution from a genetic algorithm run is to be minimised via a Simplex procedure before being saved. If param_file is set to DEFAULT, then the standard parameter file, gold.params, supplied in the GOLD distribution is used. The parameter file can be customised by copying it, editing the copy and instructing GOLD to use the edit
32. d terms or both. To include weighted scoring terms only, specify the keyword weighted; to include non-weighted terms only, specify the keyword unweighted; or to include both weighted and non-weighted terms, specify the keyword all. To prevent SD-style tags being written to comment blocks in MOL2 solution files, specify the keyword no_sdtags_in_mol2. If your input file contained MOL2 file tags, then these can be preserved in a MOL2 comment field in the output file by specifying the keyword comments. save_protein_torsions. save_protein_torsions = <option>; default: 1 (on); options: 0 (off), 1 (on). It is possible to specify that one or more protein side chains are to be treated as flexible (see rotamer_lib). Each flexible side chain will be allowed to undergo torsional rotation around one or more of its acyclic bonds during docking. In addition, the torsion angles of Ser, Thr and Tyr hydroxyl groups in the protein will be automatically optimised by GOLD. Specifically, each Ser, Thr and Tyr OH will be allowed to rotate to optimise its hydrogen bonding to the ligand. Lysine NH3 groups are similarly optimised. The optimised protein torsions generated during docking (these will usually be different for each docked ligand pose) can be written to docked solution files. This information is written to SD file tags; for MOL2 files these tags are written to comment blocks. 14.3 concatenated_out
33. d with the GOLD installation. The following params files are available within the $GOLD_DIR/gold directory: goldscore.p450_csd.params, goldscore.p450_pdb.params, chemscore.p450_csd.params, chemscore.p450_pdb.params. A specific parameter file for use with protein kinases is available for ChemScore. For further information see M. L. Verdonk, V. Berdini, M. J. Hartshorn, W. T. M. Mooij, C. W. Murray, R. D. Taylor and P. Watson, J. Chem. Inf. Comput. Sci., 44, 793-806, 2004. This allows weak CH...O interactions to be accounted for by inclusion of a ChemScore term that calculates a contribution for weak hydrogen bonds. This term can be useful when dealing with particular proteins, e.g. most kinases contain weak N-heterocycle CH...O hydrogen bonds. The following params file is available within the $GOLD_DIR/gold directory: chemscore.kinase.params. Related instructions: gold_fitfunc_path, docking_fitfunc_path, docking_param_file, rescore_fitfunc_path, run_flag. 15.8 score_param_file. score_param_file = <<filename> | DEFAULT>; default: DEFAULT. The scoring function parameter file contains all of the parameters required by the scoring function. If score_param_file = DEFAULT, then the appropriate standard scoring function parameter file provided with the GOLD distribution will be used during the docking run, i.e. either goldscore.params or chemscore.params will be used, depending
34. distance between donor hydrogen and fitting point must be less than max_distance for the bond to count towards the fitness score. This allows poor hydrogen bonds to occur at the beginning of a GA run. relative_ligand_energy. relative_ligand_energy = <option>; default: 1 (on); options: 0 (off), 1 (on). relative_ligand_energy = 1 is the default setting and results in the internal energy terms (internal torsion, internal vdW and internal H-bond) being corrected according to the best energy encountered for these terms during the run. By applying this correction, the internal energy is calculated with respect to that of a close-to-optimal non-bound structure, thereby taking into account any irreducible internal energy. gold_fitfunc_path. gold_fitfunc_path = <goldscore | chemscore | asp | plp | <filename> | consensus_score>. GOLD offers a choice of scoring functions: GoldScore, ChemScore, Astex Statistical Potential, Piecewise Linear Potential and User Defined Score, which allows users to modify an existing function or implement their own scoring function via a Scoring Function Application Programming Interface (API). Scoring functions are implemented in GOLD using shared objects or dynamically loadable libraries. gold_fitfunc_path defines which scoring function is to be used by specifying the path to the relevant dynamically loadable shared object library. By default the GoldScore scoring function will be used
35. e population. If the number of islands (see n_islands) is greater than one, i.e. the genetic algorithm is split over two or more islands, then popsiz is the population on each island. Optimum values of the genetic algorithm parameters are highly correlated; you are therefore recommended to use automatic, ligand-dependent GA parameter settings (see below) or one of the default parameter sets offered via the front end. Setting all population and genetic operators to auto will instruct GOLD to automatically calculate the optimal number of operations for each ligand, thereby making the most efficient use of search time. When using automatic settings, the search efficiency (see autoscale) can be used to control the speed of docking and the predictive accuracy, i.e. the reliability of the results. Related Instructions: select_pressure, n_islands, match_ring_templates, niche_siz, pt_crosswt, allele_mutatewt, migratewt, autoscale. 5.2 select_pressure. select_pressure = <value> | auto; default: 1.1. Each of the genetic operations (crossover, migration, mutation) takes information from parent chromosomes and assembles this information in child chromosomes. The child chromosomes then replace the worst members of the population. The selection of parent chromosomes is biased towards those of high fitness, i.e. a fit chromosome is more likely to be a parent than an unfit one. The selection pressure is defined as the ratio between the probabili
36. ed file, e.g. param_file = Z:\GOLD\datafiles\custom.params. Note: editing the parameter file may cause GOLD to behave in unexpected ways. No guarantee can be given that the program will behave reliably with anything other than the default parameterisation. For more information, see the comments in the parameter file itself (gold.params). set_protein_atom_types. set_protein_atom_types = <option>; default: 0 (off); options: 0 (off), 1 (on). Each protein atom must be assigned an atom type, which is used, for example, to determine whether the atom is capable of forming hydrogen bonds. GOLD atom typing is based on SYBYL atom types. Setting set_protein_atom_types = 1 will instruct GOLD to set atom types automatically. The atom types will be assigned from the information about element types and bond orders in the protein input file, so it is important that these are correct. When using automatic atom type assignment, you still need to input the protein structure correctly, e.g. with correct bond orders and appropriate protonation states. Structure input files should be prepared in accordance with the guidelines provided, using a good modelling package. set_ligand_atom_types. set_ligand_atom_types = <option>; default: 1 (on); options: 0 (off), 1 (on). Each ligand atom must be assigned an atom type, which is used, for example, to determine whether the atom is capable of forming hydrogen bonds. GOLD atom typing is based on SYBYL atom ty
37. ed to a given rotamer, e.g. as follows: rotamer_lib; name tyr370; chi1 497 498 501 502; chi2 498 501 502 503; rotamer 62(11) 90(11) energy 10; rotamer 65(11) 85(18); end_rotamer_lib. This will penalise (i.e. reduce) the GoldScore value by 10 units if the Tyr370 side chain is placed in the chi1 62, chi2 90 conformation. In other words, it makes this conformation less favourable. Had the command energy -10 been included instead, its effect would have been to improve (i.e. increase) the GoldScore value. Related instructions: rotamer_lib. 21.3 penalise_protein_clashes. penalise_protein_clashes = <option>; default: 1 (on); options: 0 (off), 1 (on). By default, when a flexible side chain is moved during docking (see rotamer_lib), GOLD checks whether any of its atoms clash with atoms in neighbouring residues. This gives rise to an extra Protein Energy term which contributes to the total GoldScore value. The term is computed by summing the van der Waals interactions of all pairs of protein atoms which satisfy the following conditions: (a) at least one of the protein atoms is in a flexible side chain; (b) the van der Waals term for that pair of atoms is repulsive. The van der Waals interactions will be estimated using the same potential as is used for the protein-ligand vdW term. Setting penalise_protein_clashes = 0 will switch off calculation of the protein-protein clash term for all
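The two conditions that select atom pairs for the protein clash term can be expressed as a short filter-and-sum in Python. This is a minimal sketch under assumptions: the function name and tuple layout are hypothetical, "repulsive" is taken to mean a positive pair energy, and the sign with which the sum enters GoldScore is not modelled.

```python
def protein_clash_term(pair_terms):
    """Sum repulsive vdW pair energies involving a flexible side chain.

    pair_terms is an iterable of
    (vdw_energy, first_atom_is_flexible, second_atom_is_flexible).
    """
    return sum(
        energy
        for energy, first_flexible, second_flexible in pair_terms
        if (first_flexible or second_flexible)  # condition (a)
        and energy > 0.0                        # condition (b): repulsive
    )

pairs = [
    (2.0, True, False),   # counted: flexible atom, repulsive
    (-1.5, True, True),   # skipped: attractive
    (3.0, False, False),  # skipped: neither atom is flexible
    (0.5, False, True),   # counted: flexible atom, repulsive
]
print(protein_clash_term(pairs))
```

Only pairs that meet both conditions contribute, so rigid-rigid contacts and favourable contacts never enter the term.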
38. elated instructions: gold_fitfunc_path, docking_param_file, rescore_fitfunc_path, rescore_param_file, run_flag. docking_param_file. docking_param_file = <<filename> | DEFAULT>; default: DEFAULT. It is possible to perform automatic rescoring on docked poses using a different scoring function to that used during the docking. In order for GOLD to know which scoring function parameter files to use in such a consensus scheme, the options docking_param_file and rescore_param_file need to be defined. Further, the gold_fitfunc_path option should be set to consensus_score and the run_flag should be set to CONSENSUS. The options docking_fitfunc_path and rescore_fitfunc_path are also required to be set when performing automatic rescoring. docking_param_file specifies the scoring function parameter file to be used for the docking part of the consensus scoring. The scoring function parameter file contains all of the parameters required by the scoring function. If docking_param_file = DEFAULT, then the appropriate standard scoring function parameter file provided with the GOLD distribution will be used during the docking run, i.e. either goldscore.params or chemscore.params will be used, depending on whichever scoring function is specified (see docking_fitfunc_path). The scoring function parameter file can be customised by copying it, editing the copy and instructing GOLD to use the edited file, e.g.
39. en matching templates onto the coordinating atoms found in the protein (see overrule_metal_coordination). Related Instructions: overrule_metal_coordination. 19.2 overrule_metal_coordination. overrule_metal_coordination <atom id> <coordination geometries>. By default GOLD will automatically determine metal coordination geometries. In order to determine the coordination geometry of a particular metal atom, GOLD performs a permuted superimposition of coordination geometry templates (tetrahedral, octahedral, etc.) onto the coordinating atoms found in the protein. Coordination fitting points are then generated using the template that gives the best fit, based on RMSd. The geometry templates used for given metals are defined in the gold.params file in the section headed 4 Metals. For example, for a Zn atom GOLD will attempt to match coordination geometries 4, 5 and 6 (tetrahedral, trigonal bipyramidal and octahedral templates) onto the coordinating atoms found in the protein. The template that gives the best match will then be used to generate coordination fitting points. It is possible to manually specify coordination geometries for particular metal atoms. This can be used to allow non-standard metal coordination geometries or to limit the number of possible geometries that GOLD checks, i.e. it is possible to overrule the default geometries for the corresponding metal type defined in the gold.params file. The ato
40. es the atom numbers of the atoms defining the first rotatable torsion. The atom indices as they appear in the input file must be specified. In the example, this corresponds to rotation around Cα-Cβ, so the atoms will be the backbone N atom (497), CA (498), CB (501) and CG (502). It is necessary to specify the atoms from the backbone outwards, i.e. chi1 502 501 498 497 would be invalid. The chi2 command specifies the second rotatable torsion. In this example, this corresponds to rotation around Cβ-Cγ, so the atoms are CA (498), CB (501), CG (502) and CD1 (503). You may specify up to 8 chi commands in a given rotamer_lib block. Each rotamer command describes one allowed conformation for the side chain. Thus, the first rotamer command specifies the first set of allowed values for chi1 and chi2. In the example this is chi1 60, chi2 90. The second rotamer command specifies the second set of allowed values. The format x(y) specifies the range x-y to x+y, while x(y,z) specifies the range x-y to x+z. In the example, therefore, chi1 is allowed to have any value between -70 and -50 degrees, and chi2 is allowed to vary continuously between -95 and -70 degrees. In summary, the effect of this rotamer_lib command block is therefore to allow Tyr370 to adopt the conformation chi1 60, chi2 90, or any conformation in the range chi1 -70 to -50, chi2 -95 to -70. You can have up
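The range rule for rotamer values can be checked with a few lines of Python. This sketch rests on assumptions: the punctuation and minus signs of the original example were lost in extraction, so the values -60 with tolerance 10 and -85 with tolerances 10 and 15 are inferred from the quoted ranges rather than taken from the text, and the helper names are hypothetical. Angle periodicity (wrap-around at 180 degrees) is deliberately ignored.

```python
def rotamer_range(x, y, z=None):
    """Return (low, high) for a rotamer value.

    One tolerance y gives x - y to x + y; two tolerances y and z
    give x - y to x + z.
    """
    upper_delta = y if z is None else z
    return (x - y, x + upper_delta)

def is_allowed(angle, x, y, z=None):
    """True if a torsion angle in degrees lies inside the allowed range."""
    low, high = rotamer_range(x, y, z)
    return low <= angle <= high

print(rotamer_range(-60, 10))      # symmetric tolerance
print(rotamer_range(-85, 10, 15))  # asymmetric tolerance
print(is_allowed(-72, -85, 10, 15))
```

The symmetric form reproduces the chi1 range and the asymmetric form reproduces the chi2 range quoted for the second rotamer command.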
41. 9.10 postprocess_bonds; 9.11 rotatable_bond_override_file; 9.12 [illegible]; 9.13 divsol_cluster_size; 9.14 [illegible]; 9.15 solvate_all; 9.16 fix_all_protein_rotatable_bonds; 10 Termination; 10.1 early_termination; 10.2 n_top_solutions; 10.3 rms_tolerance; 11 Constraints; 11.1 [illegible]; 11.2 constraint h_bond; 11.3 constraint protein_h_bond; 11.4 constraint [illegible]; 11.5 constraint substructure; 11.6 constraint similarity; 11.7 constraint scaffold; 11.8 interaction_restraint_weight; 11.9 constraint interaction_restraint
42. essible. In this case, fitting points would be generated for all solvent-accessible donor and acceptor atoms within the binding site. Note that these atoms are already deemed to be solvent accessible, but it is their potential fitting points that may have been desolvated by neighbouring atoms. This option can be used, e.g., to avoid problems with solvent accessibility of backbone carbonyls in kinases, where one of the carbonyl lone pairs is typically desolvated by a neighbouring atom. To generate fitting points for all solvent-accessible donor and acceptor atoms, set solvate_all = 1 (the default is 0, off). Related Instructions: radius, origin, do_cavity, floodfill_atom_no, cavity_file, floodfill_center. fix_all_protein_rotatable_bonds. fix_all_protein_rotatable_bonds = <option>; default: 0 (off); options: 0 (off), 1 (on). During all dockings, serine, threonine and tyrosine hydroxyl groups are optimised (i.e. rotated), as are lysine NH3 groups. If this is undesirable, the rotation can be switched off by setting fix_all_protein_rotatable_bonds = 1. 10 Termination. 10.1 early_termination. early_termination = <option>; default: 1 (on); options: 0 (off), 1 (on). Setting early_termination = 1 will instruct GOLD to terminate docking runs on a given ligand as soon as a specified number of runs have given essentially the same a
43. etain just the <value> best solutions for each ligand. In order for clean_up_option save_top_n_solutions to take effect, the options clean_up_option delete_empty_directories and clean_up_option delete_redundant_log_files also need to be set. Related Instructions: clean_up_option delete_empty_directories, clean_up_option delete_redundant_log_files, clean_up_option save_fitness_better_than. 14.7 clean_up_option save_best_ligands. clean_up_option save_best_ligands <value>. By default, all docked solutions will be kept at the end of a docking run. However, GOLD can produce a lot of output and you may wish to cut this down. Set clean_up_option save_best_ligands in order to retain just the top solution, and only for those <value> ligands with the best fitness scores. Related Instructions: clean_up_option save_fitness_better_than. 14.8 clean_up_option delete_redundant_log_files. clean_up_option delete_redundant_log_files. By default, a solution log file (<ligand file name> m log) is written for each ligand that is docked (see clean_up_option delete_all_log_files). However, under certain circumstances you may wish to use clean_up_option delete_redundant_log_files in order to delete unwanted log files, e.g. when choosing not to retain all solutions from a docking run (see clean_up_option save_best_ligands) you may also wish to remove log files that corresp
44. 21.3 penalise_protein_clashes; 22 Internal; 22.1 seed_file. GOLD Configuration File User Guide. 1 Overview: Using Configuration Files with GOLD. The configuration file is a text file which specifies the GOLD calculation that is to be run, including details of the ligand, the protein binding site, the fitness function parameter file to be used, the torsion distribution file to be used and the genetic algorithm parameters. Although the file can be generated with a standard text editor, the easiest way to create it is to use the GOLD graphical user interface. Any settings that have been defined in the GOLD interface can be saved as a configuration file by selecting the Save button located next to the Conf file entry box at the top of the GOLD Setup window. Alternatively, you will be prompted to save the file if you start a GOLD job from the interface by selecting either Run GOLD or Run GOLD in the background. By default, the configuration file will be saved in the directory from which GOLD was opened and will be called gold.conf. Use the Conf file entry box at the top of the GOLD Setup window to change the file name and/or directory (any file name can be used). Certain advanced functionality in GOLD is only available by directly editing the GOLD configuration file, i.e. some functionality is not exposed in the GOLD graphical user interface
45. even if the instruction gold_fitfunc_path is not present in the configuration file. To use the ChemScore scoring function, it is necessary to include the line gold_fitfunc_path = chemscore. To use a new or modified scoring function, set gold_fitfunc_path to specify the path to the appropriate shared objects or dynamically loadable libraries, e.g. gold_fitfunc_path = Z:\GOLD\my_score.dll. Full documentation for the GOLD Scoring Function Application Programming Interface (API) is provided with the GOLD distribution. It is possible to perform automatic rescoring on docked poses using a different scoring function to that used during the docking. In order for GOLD to know which scoring functions to use in such a consensus scheme, the options docking_fitfunc_path and rescore_fitfunc_path need to be defined. Further, the gold_fitfunc_path option should be set to consensus_score and the run_flag should be set to CONSENSUS. The options docking_param_file and rescore_param_file are also required to be set when performing automatic rescoring. Related instructions: docking_fitfunc_path, docking_param_file, rescore_fitfunc_path, rescore_param_file, run_flag. 15.4 docking_fitfunc_path. docking_fitfunc_path = <goldscore | chemscore | asp | plp | <filename>>. It is possible to perform automatic rescoring on docked poses using a different scoring function to that used during the docking. In order fo
46. flexible side chains, not just the one corresponding to the rotamer_lib block in which you have placed the penalise_protein_clashes = 0 command. Related Instructions: rotamer_lib. 22 Internal. 22.1 seed_file. seed_file = <filename>. It is possible to supply the seed for the random number generator used by the genetic algorithm in GOLD. By default, the file gold.seed_log will be generated automatically during a GOLD run. This file will be written to your specified output directory. However, it is possible to instruct GOLD to use a gold.seed_log file from a previous calculation, or to use a customised gold.seed_log file. seed_file is used to provide GOLD with the location of the gold.seed_log file. This provides a mechanism for introducing a non-random start facility. Normally a new GOLD calculation will be seeded in this way in order to reproduce identical results for repeat runs. You could also use this mechanism to manually specify a seed for a new GOLD calculation.
47. function parameter file to be used for the rescoring part of the consensus scoring. The scoring function parameter file contains all of the parameters required by the scoring function. If rescore_param_file = DEFAULT, then the appropriate standard scoring function parameter file provided with the GOLD distribution will be used during the docking run, i.e. either goldscore.params or chemscore.params will be used, depending on whichever scoring function is specified (see rescore_fitfunc_path). The scoring function parameter file can be customised by copying it, editing the copy and instructing GOLD to use the edited file, e.g. rescore_param_file = Z:\GOLD\datafiles\my_sf.params. The format of the scoring function parameter file is quite strict: incorrect editing may cause GOLD to behave in unexpected ways, or even to crash. Because of the large number of parameters, no guarantee can be given that the program will behave reliably with anything other than the default parameterisation. Specific parameter files for use with heme-containing proteins are also available for both GoldScore and ChemScore. For further information see S. B. Kirton, C. W. Murray, M. L. Verdonk and R. D. Taylor, Proteins: Structure, Function, and Bioinformatics, 58, 836-844, 2005. The parameters are derived from contact statistics obtained from the CSD and PDB databases. These parameters can be used by specifying the appropriate params file from those that have been supplie
48. gorithm has sufficient alternatives for placement of hydrophobic ligand atoms within the cavity. GOLD uses gridded points that are spaced by 0.25 Å; for a speed-up in calculation, higher values could be used. Related Instructions: read_fitpts. 8.15 read_fitpts. Syntax: read_fitpts <option>; default: 0 (off); options: 0 (off), 1 (on). By default GOLD automatically calculates a list of hydrophobic fitting points in the binding site. These are used during the generation of trial docking solutions to map hydrophobic ligand atoms into favourable regions of the binding site. GOLD generates its hydrophobic fitting points by placing a fine grid over the binding site. At each grid position the van der Waals interaction energy between a bare carbon atom and the protein is evaluated. Positions at which the interaction energy is below -2.5 kcal/mol are added to the list of fitting points. In this way a map is constructed that contains positions onto which the placement of a hydrophobic ligand atom should be favourable. The ligand fitting points are used for the matching of hydrophobic regions. To instruct GOLD to use customised hydrophobic fitting points, set read_fitpts to 1. fit_points_file is used to provide GOLD with the location of the customised fitting points file (see fit_points_file). Related Instructions: fit_points_file. 9 Flags. 9.1 internal
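The grid-and-threshold procedure described for the hydrophobic fitting points can be sketched in Python as follows. This is not GOLD's implementation: toy_energy is a hypothetical stand-in for the carbon-probe van der Waals energy, the function and constant names are invented for illustration, and the cutoff of -2.5 kcal/mol assumes that the threshold quoted in the text is negative (a favourable interaction).

```python
import itertools

GRID_SPACING = 0.25    # grid spacing in Angstrom, as quoted in the text
ENERGY_CUTOFF = -2.5   # kcal/mol; the negative sign is an assumption


def toy_energy(point):
    """Hypothetical probe energy: most favourable at the origin."""
    x, y, z = point
    return 10.0 * (x * x + y * y + z * z) - 5.0


def fitting_points(origin, n_per_axis, spacing=GRID_SPACING,
                   energy=toy_energy, cutoff=ENERGY_CUTOFF):
    """Keep every grid position whose probe energy is below the cutoff."""
    kept = []
    for i, j, k in itertools.product(range(n_per_axis), repeat=3):
        point = (origin[0] + i * spacing,
                 origin[1] + j * spacing,
                 origin[2] + k * spacing)
        if energy(point) < cutoff:
            kept.append(point)
    return kept


# A 5 x 5 x 5 grid centred on the origin of the toy energy well.
points = fitting_points(origin=(-0.5, -0.5, -0.5), n_per_axis=5)
print(len(points))
```

A coarser spacing visits fewer grid positions, which is why larger values speed up the calculation at the cost of a sparser map.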
49. hobic atoms in aromatic rings <hydrophobic_atoms | arom_ring_atoms>. Atoms considered to be hydrophobic include: carbon atoms bound to at least two H or C atoms; atoms typed C.cat; atoms typed S.3 and bound to two carbons; H atoms bound to an sp, sp or aromatic carbon. Note: only heavy atoms found within the sphere will contribute to the score. When using constraints, GOLD will be biased towards finding solutions in which the specified constraint is satisfied. However, it is important to remember that such a solution is not guaranteed, i.e. it is not possible to force a constraint to be satisfied in the final solution. It is possible to instruct GOLD not to dock ligands when the specified constraint(s) are physically impossible to satisfy, e.g. if no suitable group is present in the ligand to form the required constraint (see force_constraints). Related Instructions: force_constraints. constraint substructure. Syntax: constraint substructure protein <atom id> <filename> <atom id> <max distance> <min distance> <spring constant> ring_center. It is possible to apply a distance constraint to multiple ligands which have a common functional group or fragment. The constraint will bias the distance between a protein atom and one atom of the fragment towards a specified distance range. During a GOLD run, if a constrained distance is found to lie outside its bounds, a spring energy term is used to
50. Contents (fragment): 15.10 run_flag. 16 Protein Data: 16.1 protein_datafile. 17 Receptor Depth Scaling: 17.1 receptor_depth_scaling; 17.2 rds_use_protein_coords; 17.3 rds_use_donor_coords; 17.4 (title illegible); 17.5 rds_protein_distance; 17.6 rds_hbond; 17.7 rds_lipo; 17.8 rds_clash; 17.9 (title illegible). 18 Water Data: 18.1 (title illegible). 19 Metal Data: 19.1 metal_coordination_spec; 19.2 overrule_metal_coordination. 20 Protein Data: 20.1 ensemble_structure. 21 Flexible Sidechains: 21.1 rotamer_lib.
51. ij, C. W. Murray, R. D. Taylor and P. Watson, J. Chem. Inf. Comput. Sci., 44, 793-806, 2004. This allows weak CH...O interactions to be accounted for by inclusion of a ChemScore term that calculates a contribution for weak hydrogen bonds. This term can be useful when dealing with particular proteins, e.g. most kinases contain weak N-heterocycle CH...O hydrogen bonds. The following params file is available within the $GOLD_DIR/gold directory: chemscore.kinase.params. 15.9 start_vdw_linear_cutoff. Syntax: start_vdw_linear_cutoff <value>; default: 3.0. When GoldScore is being used, the annealing parameters (van der Waals and Hydrogen Bonding) allow poor hydrogen bonds to occur at the beginning of a genetic algorithm run, in the expectation that they will evolve to better solutions. At the start of a GOLD run, external van der Waals (vdw) energies are cut off when Eij > van der Waals × kij, where kij is the depth of the vdw well between atoms i and j. At the start of the run the cut-off value is start_vdw_linear_cutoff. This allows a few bad bumps to be tolerated at the beginning of the run. 15.10 run_flag. Syntax: run_flag RESCORE no_simplex retrieve no_file no_strip CONSENSUS no_simplex. It is possible to rescore a single ligand or a set of ligands in one or more files. Typically a user will rescore GOLD solution files with an alternative scoring function. However, it is also possible to score a known ligand pose
52. ilic interactions within the RDS are scaled by 0.8. Required instructions: receptor_depth_scaling, rds_use_protein_coords, rds_use_donor_coords, rds_use_exact_count, rds_protein_distance, rds_hbond, rds_clash, rds_metal. 17.8 rds_clash. Syntax: rds_clash <min no atoms> <max no atoms> <scale factor 1> <scale factor 2> <output>; default: rds_clash 0 0 1 1 0. The rds_clash setting controls how the clash term is scaled within RDS. If the number of atoms surrounding an interaction point between the ligand and the protein is <min no atoms> or less, it is scaled by <scale factor 1>. Between <min no atoms> and <max no atoms> it is scaled linearly to <scale factor 2>. All interactions with more than <max no atoms> are scaled by <scale factor 2>. The variable <output> is 0 for default output, while 1 is for verbose output. Note: Output set to 1 will give large amounts of information. The receptor depth scaling has been validated for the default values only; any change to these values can lead to unpredictable results and should be made with caution. The default settings, i.e. <min no atoms> = 0 and <max no atoms> = 0, mean that the clash term within the RDS is scaled by 1. Required instructions: receptor_depth_scaling, rds_use_protein_coords, rds_use_donor_coords, rds_use_exact_count, rds_protein_distance, rds_hbond, rds_lipo, rds_metal.
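The piecewise scaling rule described for rds_clash (and, in the same words, for rds_hbond and rds_lipo) can be written out as a short Python sketch. The function name rds_scale and its argument names are hypothetical; only the rule itself (constant below the minimum count, linear in between, constant above the maximum) is taken from the text, and the exact interpolation GOLD uses is assumed to be a straight line between the two scale factors.

```python
def rds_scale(n_atoms, min_atoms, max_atoms, scale1, scale2):
    """Scale factor for an interaction surrounded by `n_atoms` atoms.

    n_atoms <= min_atoms : scale1
    n_atoms >  max_atoms : scale2
    otherwise            : linear interpolation from scale1 to scale2
    """
    if n_atoms <= min_atoms:
        return scale1
    if n_atoms > max_atoms:
        return scale2
    # Reached only when min_atoms < n_atoms <= max_atoms, so the
    # denominator is always positive.
    fraction = (n_atoms - min_atoms) / (max_atoms - min_atoms)
    return scale1 + fraction * (scale2 - scale1)


# Default rds_clash values "0 0 1 1": every interaction is scaled by 1.
print(rds_scale(25, 0, 0, 1, 1))

# Illustrative non-default values: halfway between the bounds gives the
# midpoint of the two scale factors.
print(rds_scale(10, 0, 20, 1.0, 2.0))
```

With both atom counts set to 0, as in the rds_clash default, the interpolation branch is never reached and the result is a constant factor, which matches the statement that the clash term is scaled by 1.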
53. ing functions are implemented in GOLD using shared objects or dynamically loadable libraries. docking_fitfunc_path defines which scoring function is to be used by specifying the path to the relevant dynamically loadable shared object library. To use a new or modified scoring function, set rescore_fitfunc_path to specify the path to the appropriate shared objects or dynamically loadable libraries, e.g. rescore_fitfunc_path Z:\GOLD\my_score.dll. Full documentation for the GOLD Scoring Function Application Programming Interface (API) is provided with the GOLD distribution. Related instructions: gold_fitfunc_path, docking_fitfunc_path, docking_param_file, rescore_param_file, run_flag. 15.7 rescore_param_file. Syntax: rescore_param_file <<filename> | DEFAULT>; default: DEFAULT. It is possible to perform automatic rescoring on docked poses using a different scoring function to that used during the docking. In order for GOLD to know which scoring function parameter files to use in such a consensus scheme, the options docking_param_file and rescore_param_file need to be defined. Further, the gold_fitfunc_path option should be set to consensus_score and the run_flag should be set to CONSENSUS. The options docking_fitfunc_path, rescore_fitfunc_path and docking_param_file are also required to be set when performing automatic rescoring. The rescore_param_file specifies the scoring
54. ing solutions in which the specified constraint is satisfied. However, it is important to remember that such a solution is not guaranteed, i.e. it is not possible to force a constraint to be satisfied in the final solution. It is possible to instruct GOLD not to dock ligands when the specified constraint(s) are physically impossible to satisfy, e.g. if no suitable group is present in the ligand to form the required constraint (see force_constraints). Related Instructions: force_constraints. 11.7 constraint scaffold. Syntax: constraint scaffold <filename> <weight> list <atom ids>. The scaffold match constraint can be used to place a fragment at an exact specified position in the binding site; the geometry of the fragment will not be altered during docking. The scaffold can, for example, be a common core or fragment, or it may just be a substructure known to adopt a certain binding position. The constraint is enforced at the mapping stage in GOLD. Ligand placements are generated using a best least-squares fit with the scaffold heavy atom positions, i.e. this constraint forces all atoms on the matching portion of the ligand to lie very close to, or coincident with, the corresponding scaffold atoms. There is no S(con) contribution to the fitness score to bias dockings. The scaffold file should contain the scaffold fragment in its docked position, i.e. expressed in the same coordinate frame as the protein and with the coordinates required t
55. ing the most efficient use of search time. When using automatic settings, the search efficiency (see autoscale) can be used to control the speed of docking and the predictive accuracy, i.e. the reliability of the results. Related Instructions: popsiz, select_pressure, n_islands, match_ring_templates, niche_siz, allele_mutatewt, migratewt, autoscale. allele_mutatewt. Syntax: allele_mutatewt <value | auto>; default: 95. The operator weights for the parameters mutate, migrate and crossover govern the relative frequencies of the three types of operations that can occur during a genetic optimisation: point mutation of the chromosome, migration of a population member from one island to another, and crossover (sexual mating) of two chromosomes. Each time the genetic algorithm selects an operator, it does so at random. Any bias in this choice is determined by the operator weights. For example, if Mutate is 40 and Crossover is 10, then on average four mutations will be applied for every crossover. Optimum values of the genetic algorithm parameters are highly correlated; you are therefore recommended to use automatic, ligand-dependent GA parameter settings (see below) or one of the default parameter sets offered via the front end. Setting all population and genetic operators to auto will instruct GOLD to automatically calculate the optimal number of operations for each ligand, thereby making the most efficie
56. int h_bond ligand <atom id> protein <atom id>. A ligand atom may be constrained to form a hydrogen bond to a particular protein atom. One atom should be a donatable hydrogen atom (you must give the atom id of the hydrogen atom, not the O or N atom to which it is attached) and the other should be an acceptor. The atom ids as they appear in the structure input files must be specified. The protein atom should be available for ligand binding, i.e. solvent accessible. The constraint is incorporated into the least-squares fitting routine used by GOLD. Thus, when least-squares fitting is used to dock the ligand by attempting to form hydrogen bonds encoded within the chromosome, the constraint is added to the least-squares mapping. The constraint has a weight of 5 relative to a normal hydrogen bond taken from the chromosome. The hydrogen bond constraint weighting can be altered within the FITNESS FUNCTION section of the GOLD parameters file by changing the value of the parameter CONSTRAINT_WT. When using constraints, GOLD will be biased towards finding solutions in which the specified constraint is satisfied. However, it is important to remember that such a solution is not guaranteed, i.e. it is not possible to force a constraint to be satisfied in the final solution. It is possible to instruct GOLD not to dock ligands when the specified constraint(s) are physically impossible to satisfy, e.g. if no suitable group is present in the
57. is to be docked and must also be specified, e.g. ligand_data_file Z:\GOLD\datafiles\ligand.mol2 10. Acceptable ligand file formats are MOL2 (i.e. Tripos format) or MOL (i.e. MDL SD format). Before being used with GOLD, the ligand input file(s) must have been prepared in accordance with the guidelines provided, using a good modelling package. Any number of ligands can be docked, either by specifying a directory containing several ligand files, by specifying a single file containing several ligands (i.e. a multi-MOL2 or SD file), or by specifying several individual files. When specifying several individual files, multiple lines must be used, e.g. ligand_data_file Z:\GOLD\datafiles\ligand1.mol2 10 followed by ligand_data_file Z:\GOLD\myfiles\ligand2.mol2 20. Note: By specifying multiple individual files on separate lines, it is possible to have different values of <no. of GA runs> for individual ligands. When using a single file containing several ligands (i.e. a multi-MOL2 or SD file), it is possible to dock only specific ligands in that file. The ligands you wish to start and finish docking at can be specified using the option start_at_ligand <value> finish_at_ligand <value>, where <value> is the number relating to the position of the ligand within the file. Unless specified otherwise, GOLD will by default start at the first ligand and finish at the last ligand in the file. 8.2 ligand_reference_file. ligan
58. le rotamer_lib. 21 Flexible Sidechains. 21.1 rotamer_lib. Syntax: rotamer_lib name <identifier> chi<value> <value> <value> <value> <value> rotamer <value value value value value value> end_rotamer_lib. You may specify that one or more protein side chains are to be treated as flexible. Each flexible side chain will be allowed to undergo torsional rotation around one or more of its acyclic bonds during docking. This option is only available if you are using the GoldScore scoring function. For each side chain that you want to make flexible, you should add a rotamer_lib block of commands to the gold.conf file. This specifies the name of the side chain, the torsion angles that are permitted to vary, and the allowed values or ranges of values for those torsion angles. You can have up to 10 rotamer_lib blocks in a given configuration file, each one pertaining to a particular protein side chain. For example, consider the following rotamer_lib command block: rotamer_lib name tyr370 chi1 497 498 501 502 chi2 498 501 502 503 rotamer 60 90 rotamer 65 10 85 10 15 end_rotamer_lib. name is used to specify a unique identifier for the rotamer_lib command block. Any text can be used, but the obvious choice is the name of the side chain that the command block refers to, in this case Tyr370. The chi1 command specifi
59. le containing the substructure in MOL2 format, e.g. substructure.mol2. It is recommended that you set atom types manually, since an incomplete fragment can cause problems with automatic atom typing. The actual conformation of the group in this file is not important, as only the atom types and 2D connectivity will be used for matching. covalent_substructure_filename is used to provide GOLD with the location of the substructure file and must be followed by a filename. The substructure atom number to which the covalent link applies must also be specified (see covalent_substructure_atom_no). Related Instructions: covalent, covalent_ligand_atom_no, covalent_substructure, covalent_substructure_atom_no, covalent_topology. 12.5 covalent_substructure_atom_no. Syntax: covalent_substructure_atom_no <atom id>. It is possible to apply a covalent link to multiple ligands which have a common functional group (see covalent_substructure). During docking, the link will be applied to any ligands which contain a specified substructure (see covalent_substructure_filename). covalent_substructure_atom_no is used to define the substructure atom number to which the covalent link applies. The atom id as it appears in the substructure input file must be specified. Related Instructions: covalent, covalent_protein_atom_no, covalent_substructure_filename, covalent_substructure, covalent_topology. 12.6 covalent_substructure
60. ligand to form the required constraint (see force_constraints). For example, the following instruction defines a constraint between a carboxylate oxygen (atom id 3401) of an aspartate residue in the protein and a donatable hydrogen (atom id 17) in the ligand: constraint h_bond ligand 17 protein 3401. Related Instructions: force_constraints, constraint protein_h_bond. 11.3 constraint protein_h_bond. Syntax: constraint protein_h_bond <constraint weight> <min geometry weight> <atom id>. A protein hydrogen bond constraint can be used to specify that a particular protein atom should be hydrogen bonded to the ligand, but without specifying to which ligand atom. GOLD will be biased towards finding solutions in which the specified protein atom(s) form hydrogen bonds. The fitness score of a given docking will be penalised for every protein H-bond constraint that is not satisfied, i.e. for every protein atom that you have specified should form a hydrogen bond but does not. The magnitude of this penalty is equal to the <constraint weight> that is specified. The <constraint weight> is also the strength of bias applied to the formation of a specified hydrogen bond in the least-squares mapping algorithm within GOLD. The <min geometry weight> is a user-defined value that determines how good a hydrogen bonding interaction has to be in order for it to be considered a hydrogen bond
61. m number of the metal, as defined in the protein input file, must be specified. This should be followed by a comma-separated list of the allowed coordination numbers. Only the templates that correspond to these specified coordination numbers will be used for matching. For example, the instruction overrule_metal_coordination 4049 4,6 will specify that metal atom 4049 (a Zn atom) must be matched against tetrahedral (4) and octahedral (6) coordination geometries only, i.e. excluding 5 (trigonal bipyramid). To specify a custom or non-standard metal coordination geometry (see metal_coordination_spec) you must use a negative coordination number. For example, the instruction overrule_metal_coordination 4049 4,-4 will specify that metal atom 4049 must be matched against tetrahedral (4) and a custom square planar (-4) geometry only. Related Instructions: metal_coordination_spec. 20 Protein Data. 20.1 ensemble_structure. Syntax: ensemble_structure protein_datafile <filename> protein_score_offset <value> end_ensemble_structure. It is possible to dock ligands into multiple proteins in one docking run, i.e. to carry out an ensemble docking. Starting from a superimposed set of protein structures, GOLD evolves a separate population of individuals (representing ligand conformations) for each protein structure that is
62. me of the GoldMine dock set and host machine. Related instructions: ligands_from_socket, ligands_to_socket. 8.14 fit_points_file. Syntax: fit_points_file <filename>; default: fit_pts.mol2. GOLD automatically calculates a list of hydrophobic fitting points in the binding site. These are used during the generation of trial docking solutions to map hydrophobic ligand atoms into favourable regions of the binding site. GOLD generates its hydrophobic fitting points by placing a fine grid over the binding site. At each grid position the van der Waals interaction energy between a bare carbon atom and the protein is evaluated. Positions at which the interaction energy is below -2.5 kcal/mol are added to the list of fitting points. In this way a map is constructed that contains positions onto which the placement of a hydrophobic ligand atom should be favourable. The ligand fitting points are used for the matching of hydrophobic regions. It is possible to instruct GOLD to use customised hydrophobic fitting points (see read_fitpts). fit_points_file is used to provide GOLD with the location of the customised fitting points, e.g. fit_points_file Z:\GOLD\datafiles\custom_fit_pts.mol2. Customised fitting points must be supplied in a MOL2 format file that contains a list of dummy atoms at the desired fitting point locations. The supplied fitting points should sample all regions of interest in the cavity so that the docking al
63. mt log is written for each ligand that is docked (m refers to the position of the ligand in the input file). The log file contains information on the progress of each docking run, a comparison of the various docking solutions found, and information on clustering of ligand poses for identification of solutions with different binding modes. To instruct GOLD not to save ligand log files, use the instruction clean_up_option delete_all_log_files. 14.13 clean_up_option delete_all_initialised_ligands. Syntax: clean_up_option delete_all_initialised_ligands. By default, each initialised ligand is written to a file with a name of the type gold_<ligand filename>_m1.mol2. If you do not wish to retain the initialised ligand file for every docked ligand, then specify clean_up_option delete_all_initialised_ligands so that each initialised ligand file is deleted at the end of the docking run. Related Instructions: clean_up_option delete_rank_file, clean_up_option delete_redundant_log_files. 14.14 output_file_format. Syntax: output_file_format <MOL2 | MACCS>. By default, docking solutions will be written out in the same format as was used for input. To instruct GOLD to write out solution files in an alternative file format, use the instruction output_file_format <MOL2 | MACCS>. MOL2 should be used in order to write out files in Tripos MOL2 format, and MACCS for MDL SD format. 14
64. ne of them does. When using constraints, GOLD will be biased towards finding solutions in which the specified constraint is satisfied. However, it is important to remember that such a solution is not guaranteed, i.e. it is not possible to force a constraint to be satisfied in the final solution. It is possible to instruct GOLD not to dock ligands when the specified constraint(s) are physically impossible to satisfy, e.g. if no suitable group is present in the ligand to form the required constraint (see force_constraints). For example, the following instruction defines a constraint where either oxygen atom (atom ids 241 and 242) of a carboxylate group should form a hydrogen bond to the ligand. A constraint weight of 10.0 has been specified and a minimum geometry weight of 0.05 is used. The constraint will be satisfied if either of the carboxylate oxygen atoms forms the required hydrogen bond: constraint protein_h_bond 10.000000 0.005000 241 242. Related Instructions: constraint h_bond, force_constraints. 11.4 constraint sphere. Syntax: constraint sphere <x value> <y value> <z value> <radius> <score> <hydrophobic_atoms | arom_ring_atoms | list <atom ids>>. This constraint can be used to bias the docking towards solutions in which particular regions of the binding site (e.g. a hydrophobic pocket) are occupied by specific ligand atoms or types of ligand atom. For each constraint specified, a sphere is pl
65. nswer. In this situation it is probable that the answer is correct, and GOLD will just be wasting time if it performs more docking runs on that ligand. The early termination criterion must also be specified: GOLD will stop docking a ligand when a specified number of top solutions (see n_top_solutions) are all within a specified rmsd (see rms_tolerance) of each other. Related Instructions: n_top_solutions, rms_tolerance. 10.2 n_top_solutions. Syntax: n_top_solutions <value>; default: 3. GOLD can be instructed to terminate docking runs on a given ligand as soon as a specified number of runs have given essentially the same answer (see early_termination). In this situation it is probable that the answer is correct, and GOLD will just be wasting time if it performs more docking runs on that ligand. The early termination criterion must be specified: GOLD will stop docking a ligand when the specified number of top solutions <value> are all within a specified rmsd (see rms_tolerance) of each other. Related Instructions: early_termination, rms_tolerance. 10.3 rms_tolerance. Syntax: rms_tolerance <value>; default: 1.5. GOLD can be instructed to terminate docking runs on a given ligand as soon as a specified number of runs have given essentially the same answer (see early_termination). In this situation it is probable that the answer is correct, and GOLD will just be wasting time if it performs more docking runs on that
66. nt. Note: Only possible if rds_use_protein_coords is set to 1 (see rds_use_protein_coords). Required instructions: receptor_depth_scaling, rds_use_protein_coords, rds_use_donor_coords, rds_protein_distance, rds_hbond, rds_lipo, rds_clash, rds_metal. 17.5 rds_protein_distance. Syntax: rds_protein_distance <value>; default: 8. This is the radius (Å) of the sphere used to calculate the receptor depth. Atoms within this radius are counted in order to evaluate the depth of the interaction. Required instructions: receptor_depth_scaling, rds_use_protein_coords, rds_use_donor_coords, rds_use_exact_count, rds_hbond, rds_lipo, rds_clash, rds_metal. 17.6 rds_hbond. Syntax: rds_hbond <min no atoms> <max no atoms> <scale factor 1> <scale factor 2> <output>; default: rds_hbond 13 105 0 1 8 0. This parameter controls the scaling of the hydrogen bond interactions in the RDS calculation. If the number of atoms surrounding a hydrogen bond between the ligand and the protein is <min no atoms> or less, it is scaled by <scale factor 1>. Between <min no atoms> and <max no atoms> it is scaled linearly to <scale factor 2>. All hydrogen bonds with more than <max no atoms> are scaled by <scale factor 2>. The variable <output> is 0 for default output, while 1 is for verbose output. Note: Output set to 1 will give large amou
67. nt use of search time. When using automatic settings, the search efficiency (see autoscale) can be used to control the speed of docking and the predictive accuracy, i.e. the reliability of the results. Related Instructions: popsiz, select_pressure, n_islands, match_ring_templates, niche_siz, pt_crosswt, migratewt, autoscale. 6.3 migratewt. Syntax: migratewt <value | auto>; default: 10. The operator weights for the parameters mutate, migrate and crossover govern the relative frequencies of the three types of operations that can occur during a genetic optimisation: point mutation of the chromosome, migration of a population member from one island to another, and crossover (sexual mating) of two chromosomes. Each time the genetic algorithm selects an operator, it does so at random. Any bias in this choice is determined by the operator weights. For example, if Mutate is 40 and Crossover is 10, then on average four mutations will be applied for every crossover. The migrate weight should be zero if there is only one island; otherwise migration should occur about 5% of the time. Optimum values of the genetic algorithm parameters are highly correlated; you are therefore recommended to use automatic, ligand-dependent GA parameter settings (see below) or one of the default parameter sets offered via the front end. Setting all population and genetic operators to auto will instruct GOLD to automatically calculate the optimal number of operations f
68. nts of information. The receptor depth scaling has been validated for the default values only; any change to these values can lead to unpredictable results and should be made with caution. Required instructions: receptor_depth_scaling, rds_use_protein_coords, rds_use_donor_coords, rds_use_exact_count, rds_protein_distance, rds_lipo, rds_clash, rds_metal. 17.7 rds_lipo. Syntax: rds_lipo <min no atoms> <max no atoms> <scale factor 1> <scale factor 2> <output>; default: rds_lipo 0 0 0 52 0 52 0. The rds_lipo setting controls how the lipophilic interactions are scaled within RDS. If the number of atoms surrounding an interaction point between the ligand and the protein is <min no atoms> or less, it is scaled by <scale factor 1>. Between <min no atoms> and <max no atoms> it is scaled linearly to <scale factor 2>. All interactions with more than <max no atoms> are scaled by <scale factor 2>. The variable <output> is 0 for default output, while 1 is for verbose output. Note: Output set to 1 will give large amounts of information. The receptor depth scaling has been validated for the default values only; any change to these values can lead to unpredictable results and should be made with caution. The default settings, i.e. <min no atoms> = 0 and <max no atoms> = 0, mean that all lipoph
69. o suitable group is present in the ligand to form the required constraint (see force_constraints). For each atom involved in the constraint, it is necessary to specify whether the atom belongs to the protein or the ligand file: <protein | ligand>. The atom id as it appears in the structure input file must also be specified. The minimum <min distance> and maximum <max distance> separation of the constrained atoms must be entered (distances are in Å), and the spring constant <spring constant> must also be specified (by default this is set to 5.0). If an atom specified in the constraint is topologically equivalent to other atoms (e.g. it is one of the oxygen atoms of an ionised carboxylate group), then it is possible to instruct GOLD to automatically compute the constraint term using whichever of the equivalent atoms gives the best value. Use <on | off> to control whether or not topologically equivalent atoms are considered. For example, the following instruction defines a constraint between a zinc atom in the protein (atom id 2041) and a sulphonamide nitrogen in the ligand (atom id 8). A maximum separation of 3.50 Å and a minimum separation of 1.50 Å have been specified, and a spring constant of 5.0 is used: constraint distance protein 2041 ligand 8 1.5 3.5 5.0 off. Related Instructions: force_constraints, constraint substructure. 11.2 constraint h_bond. constra
70. o place it in the correct pose. <filename> is used to provide GOLD with the location of the scaffold file. The scaffold must be supplied as a MOL2 file. The value of <weight> determines how closely ligand atoms fit onto the scaffold. Setting a higher weight will force the ligand to be placed onto the scaffold locations more strictly. A default weight of 5.0 is used. Values below 1 can be used to achieve a more lenient overlay. By default, all heavy atoms in the supplied scaffold structure file will be used for matching. However, it is possible to specify only a subset of those atoms in the scaffold structure (these may include non-heavy atoms). Atoms should be specified using list <atom ids>. The atom indices as defined in the scaffold structure file must be used. Atom indices should be separated by a single space. Related Instructions: force_constraints. 11.8 interaction_restraint_weight. Syntax: interaction_restraint_weight <value>; default: 50. When using the constraint interaction_restraint option, the docking contribution added to the fitness score of ligand poses in which a motif is matched (i.e. poses in which all the interactions defined as part of a motif are satisfied) is based upon the accumulated hydrogen bonding and lipophilic interactions defined as part of that motif, and the interaction_restraint_weight. The interaction_restraint_weight can be used to cu
71. on whichever scoring function is specified (see gold_fitfunc_path). The scoring function parameter file can be customised by copying it, editing the copy and instructing GOLD to use the edited file, e.g. score_param_file Z:\GOLD\datafiles\my_sf.params The format of the scoring function parameter file is quite strict; incorrect editing may cause GOLD to behave in unexpected ways or even to crash. Because of the large number of parameters, no guarantee can be given that the program will behave reliably with anything other than the default parameterisation. Specific parameter files for use with heme-containing proteins are also available for both GoldScore and ChemScore. For further information see S. B. Kirton, C. W. Murray, M. L. Verdonk and R. D. Taylor, Proteins: Structure, Function, and Bioinformatics, 58, 836-844, 2005. The parameters are derived from contact statistics obtained from the CSD and PDB databases. These parameters can be used by specifying the appropriate params file from those that have been supplied with the GOLD installation. The following params files are available within the $GOLD_DIR/gold directory: goldscore.p450_csd.params, goldscore.p450_pdb.params, chemscore.p450_csd.params, chemscore.p450_pdb.params. A specific parameter file for use with protein kinases is available for ChemScore. For further information see M. L. Verdonk, V. Berdini, M. J. Hartshorn, W. T. M. Moo
72. ond to solutions which have not been retained, or when writing all solutions to a single concatenated file (see concatenated_output), then you might not wish to retain log files for individual solutions. Related Instructions: concatenated_output, clean_up_option save_best_ligands, clean_up_option delete_empty_directories 14.9 clean_up_option save_clustered_solutions clean_up_option save_clustered_solutions <value> GOLD clusters docked solutions according to how similar the poses are in terms of their RMSd. A link can be generated to the top-ranked solution from each distinct cluster. This can be useful in identifying different ligand binding modes. To generate a link to the top-ranked solution from each cluster use the instruction clean_up_option save_clustered_solutions <value>, where <value> is the RMSd clustering distance; this determines how similar the poses are in each cluster of solutions. By default the clustering distance is 0.75 Å. A clustering report will be given at the end of the ligand log file. The clusters themselves, and the individual solutions within each cluster, are listed in ranked order. Symbolic links will also be generated in the output directory which will link to the top-ranked solution in each cluster: cluster ligand_ mf n mol2 14.10 clean_up_option delete_empty_directories clean_up_option delete_empty_directories When more than one ligand is being docked it is possible to have res
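To make the idea of keeping the top-ranked solution from each RMSd cluster concrete, here is a minimal sketch. It uses a simple leader-style clustering, which is an assumption made for illustration; the guide does not state which clustering algorithm GOLD uses. The function names are hypothetical and the `rmsd` callable is supplied by the caller.

```python
from typing import Callable, List, Sequence, TypeVar

Pose = TypeVar("Pose")


def cluster_ranked_poses(
    ranked_poses: Sequence[Pose],
    rmsd: Callable[[Pose, Pose], float],
    cutoff: float = 0.75,
) -> List[List[Pose]]:
    """Group poses (best-ranked first) around cluster leaders.

    A pose joins the first cluster whose leader lies within `cutoff`;
    otherwise it starts a new cluster and becomes its leader.
    """
    clusters: List[List[Pose]] = []
    for pose in ranked_poses:
        for cluster in clusters:
            if rmsd(cluster[0], pose) <= cutoff:
                cluster.append(pose)
                break
        else:
            clusters.append([pose])
    return clusters


def top_ranked_per_cluster(clusters: List[List[Pose]]) -> List[Pose]:
    # Input is ranked, so the first member of each cluster is its best pose.
    return [cluster[0] for cluster in clusters]


# Toy example: "poses" are numbers and the "RMSd" is their absolute difference.
toy_poses = [0.0, 0.5, 2.0, 2.6, 0.7]
toy_clusters = cluster_ranked_poses(toy_poses, lambda a, b: abs(a - b))
print(top_ranked_per_cluster(toy_clusters))
```

Because the input is already in ranked order, each cluster leader is automatically the best-scoring member, which mirrors the links to top-ranked solutions described above.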
73. or each ligand, thereby making the most efficient use of search time. When using automatic settings, the search efficiency (see autoscale) can be used to control the speed of docking and the predictive accuracy, i.e. the reliability of the results. Related Instructions: popsiz, select_pressure, n_islands, match_ring_templates, niche_siz, allele_mutatewt, pt_crosswt, autoscale 7 Flood Fill 7.1 radius radius <value>; default 10. • The binding site can be defined as all atoms within <value> Å of a specified central point (e.g. this could be a protein atom close to the center of the active site, or a point defined by X, Y, Z coordinates). The radius should be large enough to contain any possible binding mode of the ligand. Related instructions: • floodfill_atom_no • origin 7.2 origin origin <x value> <y value> <z value>; default 0.0 0.0 0.0. • When used in combination with floodfill_center point (see floodfill_center), this option will define the binding site from a single point. The orthogonal x, y, z coordinates of a single solvent-accessible point approximately at the centre of the active site should be specified. The binding site will subsequently be defined as all atoms that lie within a given radius (see radius) of the specified point. Related instructions: • floodfill_center • radius
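The radius-based definition amounts to a distance test against the chosen point. The sketch below shows only that geometric step under simple assumptions: atoms are given as a hypothetical mapping from atom id to coordinates, and an atom exactly at the radius counts as inside. GOLD's own binding-site detection (solvent accessibility, cavity detection) does more than this.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def atoms_within_radius(
    atoms: Dict[int, Point], origin: Point, radius: float = 10.0
) -> List[int]:
    """Return the ids of atoms lying within `radius` Angstrom of `origin`."""
    return [
        atom_id
        for atom_id, xyz in atoms.items()
        if math.dist(xyz, origin) <= radius
    ]


# Hypothetical coordinates, for illustration only.
example_atoms = {1: (0.0, 0.0, 0.0), 2: (3.0, 4.0, 0.0), 3: (0.0, 0.0, 10.5)}
print(atoms_within_radius(example_atoms, (0.0, 0.0, 0.0), radius=10.0))
```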
74. ore Heme, Kinase and Astex Statistical Potential scoring functions and the Diverse Solutions code within GOLD, 2001-2015 Astex Therapeutics Ltd. All rights reserved. Licences may be obtained from: CCDC Software Ltd., 12 Union Road, Cambridge CB2 1EZ, United Kingdom. Web: www.ccdc.cam.ac.uk Telephone: +44 1223 336408 Email: admin@ccdc.cam.ac.uk GOLD Configuration File User Guide Contents: 1 Overview: Using Configuration Files with GOLD; 2 Running GOLD Using a Configuration File; 3 Format of the GOLD Configuration File; 4 Automatic Settings (4.1 autoscale, 4.2 autoscale_nops_max, 4.3 autoscale_nops_min); 5 Population (5.1 popsiz, 5.2 select_pressure, 5.3 n_islands, 5.4 maxops, 5.5 niche_siz); 6 Genetic Operators (6.1 pt_crosswt, 6.2 allele_mutatewt, 6.3 migratewt); 7 Flood Fill (7.1 radius, 7.2 origin, 7.3
75. ossible binding of a pre-determined ligand geometry. The following options are available: Use fix_rotatable_bond <atom number 1> <atom number 2> to fix a single rotatable bond between two atoms. The atom ids as they appear in the ligand input file must be specified. Use fix_rotatable_bond all to fix all rotatable bonds in the ligand at their input conformation. Use fix_rotatable_bond all_but_terminal to fix all non-terminal rotatable bonds (i.e. not CH3, OH, etc.) at their input conformation. • Note: When fixing all rotatable bonds at their input conformation (i.e. performing a rigid ligand docking) GOLD will not perform a local optimisation (simplex) on the final solution. This may lead to penalisation of near-optimal conformations. 9.10 postprocess_bonds postprocess_bonds <option>; default 1 (on); options 0 (off), 1 (on). • By default certain bonds are treated in a specific manner at ligand initialisation in order to prepare them for docking. However, if a bond is, for example, desired to rotate freely rather than flip during docking, then fine-grained control over specific substructures can be achieved by post-processing bonds after ligand initialisation. • Set postprocess_bonds to 1 to allow post-processing of bonds. Control over specific substructures can then be achieved by using the rotatable_bond_override.mol2 file found in the $GOLD_DIR/gold directory (see rota
76. otal fitness scores and a breakdown of the fitness into its constituent energy terms. • An alternative filename can be specified using the following instruction: bestranking_list_filename 8.11 ligands_from_socket ligands_from_socket <host> <port> <no GAs> • ligands_from_socket is used to retrieve ligands from a GoldMine. The host and network port of the available GoldMine must be supplied, together with the number of GAs for the ligands. Related Instructions: • goldmine_parameters • ligands_to_socket 8.12 ligands_to_socket ligands_to_socket <host> <port> • ligands_to_socket can be used to send each docked ligand to the GoldMine located on <host>; a network port (<port>) must also be specified. Related Instructions: • goldmine_parameters • ligands_from_socket 8.13 goldmine_parameters For an SQLite database: goldmine_parameters SQLite <PATH> <name> <dock_set> For a PostgreSQL database: goldmine_parameters PostgreSQL <name> host <host> user <user> password <passwd> <dock_set> • With either ligands_from_socket or ligands_to_socket, the goldmine_parameters need to be set. The goldmine_parameters control the access to the GoldMine where the relevant ligands are stored. The database type (SQLite or PostgreSQL) must be specified, together with the relevant parameters such as the na
77. otein_atom_no when setting up the docking. Inside the GOLD least-squares fitting routine the link atom in the ligand will be forced to fit onto the link atom in the protein. In order to make sure that the geometry of the bound ligand is correct, the angle-bending potential from the Tripos Force Field has been incorporated into the fitness function. On evaluating the score for the docked ligand, the angle-bending energy for the link atom is included in the calculation of the fitness score. Related Instructions: covalent_protein_atom_no, covalent_ligand_atom_no, covalent_substructure_filename, covalent_substructure_atom_no, covalent_substructure, covalent_topology 12.2 covalent_protein_atom_no covalent_protein_atom_no <atom id> When docking covalently bound ligands (see covalent) GOLD will assume that there is just one atom linking the ligand to the protein (e.g. the O in a serine residue). Both protein and ligand files should be set up with the link atom included. covalent_protein_atom_no is used to define the link atom in the protein file. The atom id, as it appears in the protein input file, must be specified. The link atom as it appears in the ligand input file (see covalent_ligand_atom_no), or the substructure file for use with multiple ligands which have a common functional group (see covalent_substructure), must also be specified. Related Instructions
78. ound 30,000 GA operations. Note: the exact number of GA operations contributed, e.g. for each rotatable bond in the ligand, is defined in the gold.params file. If the search efficiency is set to 0.5 then GOLD will perform around 15,000 operations, thereby speeding up the docking by a factor of two; however, the search space will be less well explored. Similarly, by setting a search efficiency greater than 1.0 it is possible to make the search more exhaustive, but slower. When using autoscale it is further possible to ensure that every ligand is subjected to a user-specified minimum and/or maximum number of operations (see autoscale_nops_min and autoscale_nops_max). Related instructions: autoscale_nops_min, autoscale_nops_max, popsiz, select_pressure, n_islands, match_ring_templates, niche_siz, pt_crosswt, allele_mutatewt, migratewt 4.2 autoscale_nops_max autoscale_nops_max <value>; default 0 (off). When using automatic ligand-dependent GA parameter settings, the search efficiency (see autoscale) can be used to control the speed of docking and the predictive accuracy, i.e. the reliability of the results. When using autoscale, the maximum number of GA operations performed during the docking run will be updated automatically according to the autoscale value that is set. The automatic preset can be overridden to ensure that every ligand is subjected to no more than autoscale_nops_m
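The arithmetic behind the search efficiency can be sketched as follows. This is an illustration of the scaling described above, not GOLD's implementation: the function name is hypothetical, the ligand-dependent base count is passed in by the caller, a value of 0 is taken to mean "off" for both limits (the guide states this default for autoscale_nops_max; it is assumed here for autoscale_nops_min), and applying the minimum before the maximum is also an assumption.

```python
def scaled_operations(
    base_operations: int,
    autoscale: float = 1.0,
    nops_min: int = 0,
    nops_max: int = 0,
) -> int:
    """Scale a ligand's base number of GA operations by the search efficiency.

    `nops_min` / `nops_max` of 0 mean "no limit" (assumed convention).
    """
    if autoscale <= 0:
        raise ValueError("autoscale must be positive")
    operations = round(base_operations * autoscale)
    if nops_min > 0:
        operations = max(operations, nops_min)
    if nops_max > 0:
        operations = min(operations, nops_max)
    return operations


# A ligand that would receive about 30,000 operations at efficiency 1.0:
print(scaled_operations(30_000, autoscale=0.5))                    # halved
print(scaled_operations(30_000, autoscale=2.0, nops_max=50_000))   # capped
```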
79. 7.3 do_cavity do_cavity <option>; default 1 (on); options 0 (off), 1 (on). • If this option is switched on, a cavity detection algorithm will be used to restrict the binding site definition to concave, solvent-accessible surfaces. The algorithm LIGSITE is used for the automatic detection of potential small-molecule binding sites in proteins; see M. Hendlich, F. Rippmann and G. Barnickel, J. Mol. Graph. Model., 15, 359-63, 389, 1997. Related instructions: • cavity_file • floodfill_atom_no • origin 7.4 floodfill_atom_no floodfill_atom_no <atom id>; default 0 (off). • When used in combination with floodfill_center atom (see floodfill_center), this option will define the binding site from a single protein atom. The atom id (as it appears in the protein input file) of a single solvent-accessible atom close to the centre of the active site of the protein should be specified. The binding site will subsequently be defined as all atoms that lie within a given radius (see radius) of the specified protein atom. • When used in combination with floodfill_center residue (see floodfill_center), this option will define the binding site from a single residue. The atom id (as it appears in the protein input file) of any atom within the residue from which you want to define the active site should be specified. All protein atoms that lie within a given radius (see radius) of e
80. p_ring_NRR: use to allow NRR groups to flip (i.e. rotate 180 deg) during docking. fix_ring_NRR: use to fix NRR groups at their input conformation. rot_ring_NRR: use to allow free rotation of NRR groups. flip_ring_NHR: use to allow NHR groups to flip (i.e. rotate 180 deg) during docking. fix_ring_NHR: use to fix NHR groups at their input conformation. rot_ring_NHR: use to allow free rotation of NHR groups. For example, setting flip_planar_n 1 fix_ring_NRR will allow all planar R3N groups to flip but will fix ring-NRR groups at their input conformation. Related Instructions: postprocess_bonds, rotatable_bond_override_file flip_pyramidal_n flip_pyramidal_n <option>; default 0 (off); options 0 (off), 1 (on). Switch this option on to allow pyramidal (i.e. non-planar sp3) nitrogens to invert during docking; otherwise they will be held fixed at the input geometry. Given a non-planar group RR'R''N or tetrahedrally surrounded RR'R''NH, setting flip_pyramidal_n to 1 will enable flipping of the local stereochemistry around the nitrogen (the energy barrier for this umbrella-like change of geometry around the nitrogen is low). Flipping only changes the stereochemistry around RR'R''N and RR'R''NH nitrogens. It does not affect other chiral centers. 9.7 rotate_carboxylic_oh rotate_carboxylic_oh <flip|rotate|fix>; default fix. • The instruction rota
81. pes. Setting set_ligand_atom_types to 1 will instruct GOLD to set atom types automatically. The atom types will be assigned from the information about element types and bond orders in the ligand input file(s), so it is important that these are correct. When using automatic atom type assignment you still need to input the ligand structure(s) correctly, e.g. with correct bond orders and appropriate protonation states. Structure input files should be prepared in accordance with the guidelines provided, using a good modelling package. 8.6 directory directory <filename> • directory is used to specify the directory to which output files will be written, e.g. directory Z:\GOLD\datafiles\output • Unless specified otherwise, output will be written to the current working directory. Related Instructions: • make_subdirs 8.7 tordist_file tordist_file <filename>|DEFAULT; default DEFAULT. • Torsion angle distributions extracted from the Cambridge Structural Database (CSD) can be input to GOLD. These distributions can be used (see use_tordist) to restrict the ligand conformational space sampled by the genetic algorithm to those torsion angle ranges commonly observed in crystal structures. • tordist_file is used to provide GOLD with the location of the torsion angle distribution file. Three torsion angle distribution files are provided: gold.tordist: this is the standard torsion
82. ppears in the protein input file) of every solvent-accessible atom which is required to explicitly define the protein active site (all acceptor and donor hydrogen atoms available to the ligand are taken from the file). Multiple atom numbers are permitted on each line in the file. • When used in combination with floodfill_center list_of_residues (see floodfill_center), this option will define the binding site from a specified list of residues. The residues can be extracted from any text file, including a standard GOLD solution file. (GOLD writes the active site residues list to the solution files if output of rotatable hydrogens is turned on.) The following formatting restrictions apply: The list must begin with the following tag on its own line: > <Gold.Protein.ActiveResidues> The list must end with a blank line or the end of the text file. GOLD will read multiple residue names from one line, but lines must not exceed 250 characters in length. Residue names must be separated by a space, for example: > <Gold.Protein.ActiveResidues> HIS69 ARG71 GLU72 ARG127 ASN144 ARG145 GLY155 ALA156 GLU163 THR164 HIS196 SER197 TYR198 SER199 LEU201 LEU203 ILE243 ILE244 ILE247 TYR248 GLN249 ALA250 GLY253 SER254 ILE255 THR268 GLU270 PHE279 ZN309 The list should contain all the residues which are required to explicitly define the protein active site (all accep
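The formatting rules for the residue list (a tag on its own line, space-separated names, lines of at most 250 characters, termination at a blank line or end of file) can be checked with a small reader such as the sketch below. The function name is hypothetical, and the exact tag spelling, "> <Gold.Protein.ActiveResidues>", is an assumption based on the usual SD-file tag layout.

```python
from typing import List

ACTIVE_RESIDUES_TAG = "> <Gold.Protein.ActiveResidues>"  # assumed spelling
MAX_LINE_LENGTH = 250


def read_active_residues(text: str, tag: str = ACTIVE_RESIDUES_TAG) -> List[str]:
    """Collect residue names that follow the tag line in `text`."""
    residues: List[str] = []
    in_list = False
    for line in text.splitlines():
        if not in_list:
            # Skip everything until the tag appears on its own line.
            in_list = line.strip() == tag
            continue
        if not line.strip():
            break  # a blank line ends the list
        if len(line) > MAX_LINE_LENGTH:
            raise ValueError("residue line exceeds 250 characters")
        residues.extend(line.split())
    if not in_list:
        raise ValueError("active residues tag not found")
    return residues


sample = "> <Gold.Protein.ActiveResidues>\nHIS69 ARG71 GLU72\nZN309\n\nother data\n"
print(read_active_residues(sample))
```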
83. put concatenated_output <filename> • By default the result of each docking attempt is written out to a separate file, gold_soln_structure_mi_n.mol2, where n is the solution number (1, 2, 3, ...) and m is the number of the ligand (i.e. m1 for the first ligand in the input file, m2 for the second, etc.). • Alternatively, it is possible to specify that all saved docking solutions for all ligands are to be concatenated and written to a single file in MOL2 or SD format. • concatenated_output is used to provide GOLD with the location of the file to which all solutions are to be written. concatenated_output must be followed by a filename. Note: When performing a rescoring run (see run_flag) GOLD will, by default, write out docked ligand solutions after rescoring. Solutions will be written to the file rescore.mol2. A log file, rescore.log, which summarises the outcome of the rescoring run, will also be written. An alternative filename for both the rescore solution and log files can also be specified using the concatenated_output instruction. Related Instructions: • clean_up_option delete_all_solutions • clean_up_option delete_redundant_log_files • clean_up_option delete_empty_directories • run_flag 14.4 clean_up_option delete_all_solutions clean_up_option delete_all_solutions • By default the result of each docking attempt is written out to a separate file, gold_soln_structure_mi_n.mol2, where n is the solution number (1, 2, 3
84. r GOLD to know which scoring functions to use in such a consensus scheme, the options docking_fitfunc_path and rescore_fitfunc_path need to be defined. Further, the gold_fitfunc_path option should be set to consensus_score and the run_flag should be set to CONSENSUS. The options docking_param_file, rescore_fitfunc_path and rescore_param_file are also required to be set when performing automatic rescoring. The docking_fitfunc_path specifies the scoring function to be used for the docking part of the consensus scoring. GOLD offers a choice of scoring functions (GoldScore, ChemScore, Astex Statistical Potential, Piecewise Linear Potential) and User Defined Score, which allows users to modify an existing function or implement their own scoring function via a Scoring Function Application Programming Interface (API). Scoring functions are implemented in GOLD using shared objects or dynamically loadable libraries. docking_fitfunc_path defines which scoring function is to be used by specifying the path to the relevant dynamically loadable shared object library. To use a new or modified scoring function, set docking_fitfunc_path to specify the path to the appropriate shared objects or dynamically loadable libraries, e.g. docking_fitfunc_path Z:\GOLD\my_score.dll Full documentation for the GOLD Scoring Function Application Programming Interface (API) is provided with the GOLD distribution. R
85. reduce the fitness score, i.e. E = kx², where x is the difference between the distance and the closest constraint bound and k is a user-defined spring constant. During docking the constraint will be applied to any ligands which contain the specified fragment (matching is performed on the basis of the atom types and 2D connectivity), and the resulting solutions will be biased towards the specified distance range. • <filename> is used to provide GOLD with the location of the fragment file. The fragment must be supplied as a MOL2 file or PDB file. • The protein atom number and fragment atom number must be specified. The atom ids as they appear in the structure input file must be used. The minimum <min distance> and maximum <max distance> separation of the constrained atoms must be entered (distances are in Å), and the spring constant <spring constant> must also be specified (by default this is set to 5.0). • It is possible to define a distance constraint from a centroid of a ring in the ligand. To do this, specify a fragment atom within the ring of interest and use the keyword ring_center. The closest ring center to the selected atom will be used. Note: When defining a distance constraint involving a ring center, ensure that the maximum and minimum separations are adjusted accordingly. • If the constraint refers to a fragment atom (and therefore a ligand atom) which is
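The spring term described above can be evaluated in a few lines. The sketch assumes the quadratic form E = k·x², with x the distance by which the constraint bounds are violated; that form is a reading of the text, and the function name is hypothetical, so treat this as an illustration rather than GOLD's own code.

```python
def distance_penalty(
    distance: float,
    min_distance: float,
    max_distance: float,
    spring_constant: float = 5.0,
) -> float:
    """Penalty for a distance outside [min_distance, max_distance].

    Assumes a quadratic spring term, E = k * x**2, where x is the gap
    between the distance and the closest constraint bound.
    """
    if min_distance > max_distance:
        raise ValueError("min_distance must not exceed max_distance")
    if distance < min_distance:
        x = min_distance - distance
    elif distance > max_distance:
        x = distance - max_distance
    else:
        x = 0.0  # inside the allowed range: no penalty
    return spring_constant * x * x


# Bounds of 1.5 and 3.5 Angstrom with the default spring constant of 5.0:
print(distance_penalty(2.0, 1.5, 3.5))   # within bounds
print(distance_penalty(4.5, 1.5, 3.5))   # 1.0 Angstrom too long
```

Under this assumed form the penalty grows with the square of the violation, so a pose 2 Å outside the range is penalised four times as heavily as one 1 Å outside.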
86. 11.10 force_constraints; 12 Covalent Bond (12.1 covalent, 12.2 covalent_protein_atom_no, 12.3 covalent_ligand_atom_no, 12.4 covalent_substructure_filename, 12.5 covalent_substructure_atom_no, 12.6 covalent_substructure, 12.7 covalent_topology); 13 Parallel Options (13.1 hostfile); 14 Save Options (14.1 save_score_in_file, 14.2 save_protein_torsions, 14.3 concatenated_output, 14.4 clean_up_option delete_all_solutions, 14.5 clean_up_option save_fitness_better_than, 14.6 clean_up_option save_top_n_solutions, 14.7 clean_up_option save_best_ligands, 14.8 clean_up_option delete_redundant_log_files
87. sidues and the rotated protein hydrogen atom positions generated during the original docking will be overwritten with those resulting from the rescoring run. Include the keyword no_strip if you wish not to replace relevant tags. If no_strip is specified then rescore.mol2 will contain both the binding site definition of the original docking and that of the subsequent rescoring run. Note that rescoring, like docking, requires a fully defined binding site, preferably the same definition that was used for the original docking. The ligand file, scoring function and output preferences must also all be specified. It is possible to perform automatic rescoring on docked poses using a different scoring function to that used during the docking. In order for GOLD to know which scoring functions to use in such a consensus scheme, the options docking_fitfunc_path and rescore_fitfunc_path need to be defined. Further, the gold_fitfunc_path option should be set to consensus_score and the run_flag should be set to CONSENSUS. The options docking_param_file and rescore_param_file are also required to be set when performing automatic rescoring. Related Instructions: concatenated_output, gold_fitfunc_path, docking_fitfunc_path, docking_param_file, rescore_fitfunc_path, rescore_param_file 15.11 alt_residues alt_residues <1|2> <residues> GoldScore uses Lennard-Jones functional forms for both the External and Internal Van der Waals contribu
88. stomize the overall importance of the constraint interaction_restraint. Related Instructions: • constraint interaction_restraint 11.9 constraint interaction_restraint constraint interaction_restraint <interaction type> <residue number> <chain id> <residue name> <atom name> <motif 1> <motif 2> ... <motif n> • The interaction motif constraint can be used to bias the docking towards solutions that form particular motifs of interactions. • During docking a contribution will be added to the fitness score of ligand poses in which a motif is matched (i.e. poses in which all the interactions defined as part of a motif are satisfied). This contribution is based upon the accumulated hydrogen bonding and lipophilic interactions defined as part of that motif and the interaction restraint weight (see interaction_restraint_weight). Therefore, docking will be biased towards ligand poses which form interactions to the protein atoms of interest, matching one of the uniquely defined motifs. • Typically, several lines of the gold.conf file would be used to describe all the interactions of interest in the interaction motif, using the format outlined above. • The available <interaction type> values are: H-bond acceptor (1), H-bond donor (2), lipophilic interaction (3), CHO donor (4). • The values <residue number> <chain id> <residue name> <atom name> are used to identify the protein atom tha
89. t forms the interaction. • In the case of H-bond donors, H-bond acceptors and CHO donors, the values for <motif 1> <motif 2> ... <motif n> are set to either 1 or 0, depending on whether or not the interaction is observed in that particular motif. For the lipophilic interactions, the values are set to the frequency of that interaction in a set of complexes and are added to all the motifs. • For example, the following instruction defines an interaction motif consisting of 11 motifs. Each line represents a unique interaction that the protein can form, i.e. the first two lines define that the carbonyl group of GLU81 can act as either a hydrogen bond donor or a CHO donor. Columns 8 to 18 represent unique interaction motifs. [Example listing not recoverable from the extracted text: eight constraint interaction_restraint lines for chain A residues GLU 81 (two lines), LEU 83 (three lines), ALA 31, residue 134 and PHE 80; the interaction types, atom names and per-motif values are garbled.]
90. table_bond_override_file). Related Instructions: • rotatable_bond_override_file 9.11 rotatable_bond_override_file rotatable_bond_override_file <filename>; default $GOLD_DIR/gold/rotatable_bond_override.mol2. • By default certain bonds are treated in a specific manner at ligand initialisation in order to prepare them for docking. However, control over specific substructures can be achieved by post-processing bonds after ligand initialisation. • postprocess_bonds must be set to 1 (see postprocess_bonds) to allow post-processing of bonds. Fine-grained control can then be achieved over specific substructures by using the rotatable_bond_override.mol2 file found in the $GOLD_DIR/gold directory. Some fragments are already provided, which can be edited; however, user-specific ones may also be added. Instructions on how to do this, as well as further information, can be found in the file itself. This is useful if further control is sought over more than one ligand with a common substructure in a ligand library file. • rotatable_bond_override_file is used to provide GOLD with the location (i.e. full path and filename) of the rotatable_bond_override.mol2 file. The file itself can be customised by copying it, editing the copy and instructing GOLD to use the edited file. Related Instructions: • postprocess_bonds 9.12 diverse_solutions diverse_solutions <option>; default 0 (off)
91. te_carboxylic_oh can be used to control the behaviour of protonated carboxylic acids during docking. The possible settings are: flip: protonated carboxylic acids will be allowed to flip (i.e. rotate 180 deg) during docking; rotate: protonated carboxylic acids will be allowed to rotate freely during docking; fix: protonated carboxylic acids will be held rigid at their input conformation. 9.8 use_tordist use_tordist <option>; default 1 (on); options 0 (off), 1 (on). • Torsion angle distributions extracted from the Cambridge Structural Database (CSD) can be input to GOLD. When use_tordist is switched on, these distributions are used to restrict the ligand conformational space sampled by the genetic algorithm. Using torsion angle distributions in this way may improve the chances of GOLD finding the correct answer by biasing the search towards ligand torsion angle values that are commonly observed in crystal structures. • The instruction tordist_file is used to provide GOLD with the location of the torsion angle distribution file that is to be used (see tordist_file). Related Instructions: • tordist_file 9.9 fix_rotatable_bond fix_rotatable_bond <<atom number 1> <atom number 2>|all|all_but_terminal> • GOLD was designed to dock flexible ligands into protein binding sites. However, sometimes it can be useful to fix the geometry of specific bonds, or of part of the ligand, e.g. in order to study the p
92. th potential forms can be used in the same GOLD run. 16 Protein Data 16.1 protein_datafile protein_datafile <filename> • protein_datafile is used to provide GOLD with a file containing the protein, or the part of a protein, into which the ligand is to be docked. protein_datafile must be followed by a filename, e.g. protein_datafile Z:\GOLD\datafiles\protein.mol2 • Acceptable protein file formats are PDB and MOL2. Before being used with GOLD the protein input file must have been prepared in accordance with the guidelines provided, using a good modelling package. 17 Receptor Depth Scaling 17.1 receptor_depth_scaling receptor_depth_scaling <option>; default 0 (off); options 0 (off), 1 (on). • Receptor depth scaling (RDS) can be chosen when performing a rescore using ChemScore as the scoring function. RDS will reward hydrogen bonds deep in protein pockets with an increased score, while the scores of those closer to the solvent-exposed surface are decreased. Simultaneously, the scores attributed to lipophilic interactions are reduced. • To enable the use of RDS in GOLD, set receptor_depth_scaling to 1; all RDS required instructions need to be set in order for RDS to work properly. Required instructions: • rds_use_protein_coords • rds_use_donor_coords • rds_use_exact_count • rds_protein_distance • rds_hbond • rds_lipo • rds_clash
93. the search efficiency (see autoscale) can be used to control the speed of docking and the predictive accuracy, i.e. the reliability of the results. Related Instructions: • popsiz • select_pressure • n_islands • match_ring_templates • pt_crosswt • allele_mutatewt • migratewt • autoscale 6 Genetic Operators 6.1 pt_crosswt pt_crosswt <value>|auto; default 95. The operator weights for the parameters mutate, migrate and crossover govern the relative frequencies of the three types of operations that can occur during a genetic optimisation: point mutation of the chromosome, migration of a population member from one island to another, and crossover (sexual mating) of two chromosomes. Each time the genetic algorithm selects an operator, it does so at random. Any bias in this choice is determined by the operator weights. For example, if Mutate is 40 and Crossover is 10 then, on average, four mutations will be applied for every crossover. Optimum values of the genetic algorithm parameters are highly correlated; you are therefore recommended to use automatic ligand-dependent GA parameter settings (see below) or one of the default parameter sets offered via the front end. Setting all population and genetic operators to auto will instruct GOLD to automatically calculate the optimal number of operations for each ligand, thereby mak
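The effect of operator weights on how often each operation is chosen can be illustrated with weighted random selection. The sketch below is not GOLD's genetic algorithm; the function names are hypothetical and the weights are the ones from the example in the text (Mutate 40, Crossover 10).

```python
import random
from collections import Counter
from typing import Dict


def operator_probabilities(weights: Dict[str, float]) -> Dict[str, float]:
    """Convert operator weights into selection probabilities."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("operator weights must sum to a positive value")
    return {name: weight / total for name, weight in weights.items()}


def sample_operators(weights: Dict[str, float], n: int, seed: int = 0) -> Counter:
    """Draw n operators at random, biased by their weights."""
    rng = random.Random(seed)
    names = list(weights)
    picks = rng.choices(names, weights=[weights[name] for name in names], k=n)
    return Counter(picks)


example_weights = {"mutate": 40, "crossover": 10}
print(operator_probabilities(example_weights))
counts = sample_operators(example_weights, 20_000, seed=1)
print(counts["mutate"] / counts["crossover"])  # close to 4 on average
```

Only the ratio of the weights matters in this scheme: weights of 40 and 10 behave the same as 4 and 1.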
94. 13 Parallel Options

13.1 hostfile

hostfile <filename>

• hostfile is used to provide GOLD with the location of a host configuration file. The file must have been created previously when using parallel GOLD. When hostfile is specified, GOLD will read hosts and numbers of processes from this file and attempt to add these hosts to your configuration.

14 Save Options

14.1 save_score_in_file

save_score_in_file <option> weighted unweighted all no_sdtags in mol2 comments
default: 1 (on); options: 0 (off), 1 (on)

It is possible to write additional information to docked solution files. This information is written to SD file tags; for MOL2 files, these tags are written to comment blocks. This information can be used in the post-processing of docking results, e.g. with SILVER or GoldMine.

Set save_score_in_file to 1 in order to include the docking score terms (i.e. the total GoldScore or ChemScore value for each docking and its components, such as protein-ligand H-bond energy, internal ligand strain energy, etc.) in the docked solution files.

Certain docking scoring function terms are the product of a term dependent on the magnitude of a particular physical contribution (e.g. hydrogen bonding) and a scale factor determined, e.g., by a regression coefficient. The docking scoring function terms included in the output file can therefore consist of weighted terms, non-weighted…
95. …tions to the Fitness Function.

By default, a 6-12 potential is applied to the Internal Van der Waals contribution and a 4-8 potential is applied to the External Van der Waals contribution. The 4-8 potential form for the External contribution is selected as being optimum for general use. However, there are cases where this potential form may be too severe in the short-contact (i.e. clash) component. This would arise, for instance, where part of the binding site is made up of a loop which is known to be able to move aside slightly to accommodate large ligands. In such cases it is possible to apply a softer Split Van der Waals Potential for certain selected residues.

Two alternative soft Split Potential forms are parameterised in the gold.params file:

[Table: Form 1 (EXTERNAL POTENTIAL 1) and Form 2 (EXTERNAL POTENTIAL 2), each defined by two exponent terms]

The first term of each form describes long-range interactions; the second term describes short-range interactions.

One of these two soft potentials can be applied to a single residue using the instruction:

alt_residues form <residue>

where form is the Split Potential form to be applied (i.e. 1 or 2) and <residue> is the residue to which the split potential is to be applied, e.g. specifying alt_residues 1 ALA148 will apply the split potential of form 1 to the residue Ala 148. More than one residue can be specified and both potential forms can be used in the same GOLD run.
96. to 50 rotamer commands in a rotamer_lib block.

• Quite often a side-chain rotation is accompanied by a small change in the local backbone conformation, primarily affecting the position of the Cα atom. Although minor, this movement is extremely important because it alters the vector direction Cα-Cβ, and this can have a big leverage effect on the positions of atoms further down the side chain. The backbone movement can be mimicked by allowing the Cα atom and the attached side chain to rotate around the N-C vector, where N and C are the backbone atoms on either side of the Cα atom. This is defined as a rotation of the improper torsion defined by the atom sequence CA N C CA.
• The file <GOLD_DIR>/gold/rotamer_library.txt contains information taken from the paper "The Penultimate Rotamer Library", S. C. Lovell, J. M. Word, J. S. Richardson & D. C. Richardson, Proteins, 40, 389-408, 2000. This is a compilation of the most commonly observed side-chain conformations for the naturally occurring amino acids. To make use of the rotamer information for a given residue, copy and paste the relevant rotamer_lib section into the GOLD configuration file and specify the residue name and atom numbers as required. Note that the library settings are simply a starting point; users are encouraged to generate their own rotamers for optimal results.

Related Instructions:
• energy
• per_atom_scores

21.2 energy

energy <value>

• An energy may be assigned…
97. topologically equivalent to other atoms (e.g. it is one of the oxygen atoms of an ionised carboxylate group), GOLD will automatically compute the constraint term using whichever of the equivalent atoms gives the best value.

• When using constraints, GOLD will be biased towards finding solutions in which the specified constraint is satisfied. However, it is important to remember that such a solution is not guaranteed, i.e. it is not possible to force a constraint to be satisfied in the final solution. It is possible to instruct GOLD not to dock ligands when the specified constraint(s) are physically impossible to satisfy, e.g. if no suitable group is present in the ligand to form the required constraint (see force_constraints).

Related Instructions:
• constraint distance
• force_constraints

11.6 constraint similarity

constraint similarity <donor | acceptor | all> <filename> <weight>

• This constraint will bias the conformation of docked ligands towards a given solution. This solution, or template, can for example be another ligand in a known conformation, a common core, or it may just be a large substructure that is expected or known to bind in a certain way.
• An energy term will be added to the score based on the similarity between the ligand being docked and the template provided. The similarity between the two is evaluated as a Gaussian overlap term.
• The template file should contain the template molecule or fragment…
98. acceptor and donor hydrogen atoms available to the ligand are taken from the list.

Related instructions:
• do_cavity
• floodfill_center

7.6 floodfill_center

floodfill_center <atom | cavity_from_ligand | file | list_of_residues | point | residue>
default: point

• floodfill_center determines the method used to define the binding site. The possible settings are:
  atom: used to define the binding site from a single protein atom (see floodfill_atom_no)
  cavity_from_ligand: used to define the binding site from a reference ligand (see cavity_file)
  file: used to define the binding site from a list of protein atoms (see cavity_file)
  list_of_residues: used to define the binding site from a list of residues (see cavity_file)
  point: used to define the binding site from a single point using orthogonal x, y, z coordinates (see origin)
  residue: used to define the binding site from a specified residue (see floodfill_atom_no)

Related instructions:
• cavity_file
• floodfill_atom_no
• origin

8 Data Files

8.1 ligand_data_file

ligand_data_file <filename> <no of GA runs> start_at_ligand <value> finish_at_ligand <value>

• ligand_data_file is used to provide GOLD with the ligand file(s) to be docked and must be followed by a filename. The value of <no of GA runs> is the number of times each ligand…
99. probability that the most fit member of the population is selected as a parent to the probability that an average member is selected as a parent.

Optimum values of the genetic algorithm parameters are highly correlated; you are therefore recommended to use automatic (ligand-dependent) GA parameter settings (see below) or one of the default parameter sets offered via the front end. Setting all population and genetic operators to auto will instruct GOLD to automatically calculate the optimal number of operations for each ligand, thereby making the most efficient use of search time. When using automatic settings, the search efficiency (see autoscale) can be used to control the speed of docking and the predictive accuracy, i.e. the reliability of the results.

Related Instructions:
• popsiz
• n_islands
• match_ring_templates
• niche_siz
• pt_crosswt
• allele_mutatewt
• migratewt
• autoscale

5.3 n_islands

n_islands <value> | auto
default: 5

Rather than maintaining a single population, the genetic algorithm can maintain a number of populations that are arranged as a ring of islands. Specifically, the algorithm maintains n_islands populations, each of size popsiz (see popsiz). Individuals can migrate between adjacent islands using the migration operator (see migratewt).

Optimum values of the genetic algorithm parameters are highly correlated; you are therefore recommended to use automatic (ligand-dependent) GA parameter settings (see below) or one of the default parameter sets offered via the front end.
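The ring-of-islands arrangement described in this excerpt can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not GOLD's algorithm: the function names are hypothetical, the population size of 4 is an arbitrarily small value, and copying (rather than moving or replacing) the migrating individual is an assumption. Only the ring topology, migration between adjacent islands, and the default of 5 islands come from the text.

```python
import random


def ring_neighbours(island, n_islands):
    """Return the indices of the two islands adjacent to `island` in the ring."""
    return ((island - 1) % n_islands, (island + 1) % n_islands)


def migrate(populations, rng):
    """Copy one randomly chosen individual from a random island to an adjacent one.

    Copying (instead of moving or replacing) is an assumption of this sketch.
    """
    n = len(populations)
    source = rng.randrange(n)
    target = rng.choice(ring_neighbours(source, n))
    individual = rng.choice(populations[source])
    populations[target].append(individual)
    return source, target


n_islands = 5  # default value quoted in the text
popsiz = 4     # hypothetical, deliberately small population size

# Each island holds its own population of placeholder individuals.
populations = [
    [f"island{i}_member{j}" for j in range(popsiz)] for i in range(n_islands)
]

source, target = migrate(populations, random.Random(1))
```

Because the islands form a ring, island 0 is adjacent to island n_islands - 1, which the modulo arithmetic in ring_neighbours handles.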
100. results for each ligand written to a separate sub-directory (see make_subdirs). However, under certain circumstances you may wish to use clean_up_option delete_empty_directories in order to delete any empty output directories, e.g.:
• When choosing not to retain all solutions from a docking run (see clean_up_option save_best_ligands), you may also wish to remove directories that correspond to solutions which have not been retained; or
• When writing all solutions to a single concatenated file (see concatenated_output), you might not wish to retain empty directories intended for individual solutions.

Related Instructions:
• make_subdirs
• concatenated_output
• clean_up_option delete_redundant_log_files
• clean_up_option save_best_ligands

14.11 clean_up_option delete_rank_file

clean_up_option delete_rank_file

• By default, a file called <ligand file name>_m.rnk is written for each ligand (m refers to the position of the ligand in the input file). This file contains a summary of the fitness scores for all the docking attempts on that ligand. The docking attempts are listed in decreasing order of fitness score, so the best solution is placed first.
• To instruct GOLD not to save ligand .rnk files, use the instruction clean_up_option delete_rank_file.

14.12 clean_up_option delete_all_log_files

clean_up_option delete_all_log_files

• By default, a solution log file <ligand file name>…
101. configuration file is called gold.conf. If another name has been used for the gold.conf file, e.g. new_conf_filename.conf, this will have to be specified:

C:\Program Files\CCDC\gold suite\GOLD\gold\d_win32\bin\gold_win32.exe new_conf_filename.conf

3 Format of the GOLD Configuration File

• The GOLD configuration file is a plain text file containing instructions, with one instruction per line. Instructions are case sensitive.
• Typically, the order in which instructions are provided is not important.
• Any text that is not recognised will cause a warning to be issued (warning message: can't process configuration line, followed by details).
• Text following the comment marker is treated as a comment and ignored.

4 Automatic Settings

4.1 autoscale

autoscale <value>
default: 0 (off)

Setting all population and genetic operators to auto will instruct GOLD to automatically calculate the optimal number of operations for each ligand, thereby making the most efficient use of search time. When using automatic settings, the search efficiency (autoscale) can be used to control the speed of docking and the predictive accuracy, i.e. the reliability of the results.

The value of autoscale can be set between 0.01 and 5.0. When set at 1.0 (search efficiency 100%), GOLD will attempt to apply optimal settings for each ligand. For a ligand with five rotatable bonds this will be around…
