COSMOquick User Manual
Contents
1. [Screenshot: file dialog with the file name cocrystal_cyanophenol.smi selected; available file types are SMILES files (smi), SD files (sdf, sd), CQ fragmentation (frg) and text files (txt).]

You will now find a list of SMILES strings and compound names in the lower area of the compound setup screen. You can add a compound by adding a new line and typing a name or a SMILES string in the text area above. For example, type "tartaric acid" and "glycerine" there. You may also add a compound by drawing it with the 2D structure editor. The editor automatically generates a SMILES string, which you can add to the compound setup. After you have created a suitable list of molecules, select the Next button at the bottom. A fragmentation is now initiated and the CFDB is accessed, which may take a while. After it has finished, the screen should look like this:

[Screenshot: Compound details tab with the columns No., Compound, SMILES and molweight, listing among others 3-cyanophenol (119.12), 1,2-bis(4-pyridyl)ethane (184.24), 4,4'-bipyridine (156.18), 4-phenylpyridine (155.20), 4-pyridinecarbonitrile (104.11), 3-cyanopyridine (104.11) and 1-cyanonaphtalene (153.18).]
2. [Screenshot: COSMOfrag input generator with the options CTCALC and External COSMOtherm (CT_EXT), the compounds butane and isobutane entered as SMILES, and a file dialog listing cosmo files. Selectable file types are COSMO files (cosmo), compressed COSMO files (ccf), SMILES files (smi) and SDF files.]

COSMOsim calculations

The COSMOfrag input generator can also be used to submit molecular similarity calculations based on σ-profiles (COSMOsim). Specify the SMILES or the molecular structures and tick the COSMOsim checkbox, where you can define the number of target molecules (ntarget) and the maximal number of closest hits (nbest). Please refer to the COSMOfrag manual for details.

[Screenshot: COSMOfrag input generator with a list of solvents entered as SMILES, among them methanol, methyl isobutyl ketone, methylene dichloride, n-butyl acetate, N-methyl pyrrolidone, propyleneglycol monomethylether, propyleneglycol methylether acetate and propylene carbonate.]
3. The JChempaint 2D structure editor may display some compounds incorrectly, for example cis/trans isomers. The NIH web service (Chemical Identifier Resolver) is in the public domain, and we cannot guarantee that it remains continuously available.

1.5 License

Currently the license is checked via COSMOfrag, which is called internally by COSMOquick. Please provide a valid license file at the first startup of the software. Please note that the COSMOfrag executable shipped with COSMOquick can only use parameterizations at the BP-SVP-COSMO level. For higher-level calculations we recommend using COSMOtherm instead.

1.6 Overview on Currently Predictable Properties

COSMOquick predicts several thermodynamic properties. The following table summarizes those properties and lists where they can be found.

Property; Quantity; Module
Solubility; log10(x) (x in mole fraction), S in mol/L, S in g/L, w in g/g; Solubility Prediction
Free energy of fusion (as computed from experimental solubilities); ΔGfus in kcal/mol; Solubility Prediction
Free energy of fusion (as QSPR estimate); ΔGfus in kcal/mol; QSPR & ADME
Activity coefficient; ln γ; Solubility Prediction, Henry constant & gas solubility
Excess enthalpy of compounds A and B; Hex in kcal/mol; Cocrystal and Solvate Screening
Free energy of mixing of A and B; ΔGmix in kcal/mol; Cocrystal and Solvate Screening
Henry constant; H in bar; Henry constant & gas solubility
Vapor pressure
4. You may scroll down and choose, for example, a 50:50 mixture (mole fraction) of diethyl ether and dioxane as an additional solvent. Scroll up and click Add solvent to add this mixture to your solvent list. After you have finished your input, select the Run button, which starts the solubility calculation. The calculation may take a few seconds. Afterwards you will find some new tabs at the bottom of the window with the results of the calculation: a table and a plot window.

[Screenshot: Results tab with the columns No., Solvent, Status, Quality, S [g/L] (exp) and log10(x) for the screened solvents, among them ethanol, propanone, toluene, chloroform, water, 1-octanol, 1-pentanol, ethyleneglycol, 1-butanol, 1-propanol, 2-propanol, methanol, acetic acid, dimethylsulfoxide, butanone and acetonitrile.]
5. COSMOlogic Predicting Solutions

COSMOquick User Guide, Version 1.3

Copyright by COSMOlogic GmbH & Co. KG, Imbacher Weg 46, 51379 Leverkusen, Germany
cosmotherm@cosmologic.de
www.cosmologic.de

Contents

1 Introduction
1.1 Fragmentation Approach: COSMOfrag
1.2 What is a SMILES string and how to get them
1.3 Installation
1.4 Current COSMOquick Limitations
1.5 License
1.6 Overview on Currently Predictable Properties
1.7 COSMOquick File Menu
2 COSMOquick Tutorial
2.1 Solubility Calculation and Solvent Screening with COSMOquick
2.2 Cocrystal/Solvate Screening with COSMOquick
2.3 Sorption & Solubility in Polymers
2.4 Exporting mcos Files
2.5 COSMOfrag Input Generator
2.6 Other Available Options
6. (CHx)n separated by non-alkyl atoms
rotatable bonds: number of effectively rotatable bonds
internal hbonds: number of internal hydrogen bonds
conjugated bonds: number of conjugated bonds
rotbsdmod: general flexibility parameter including rings
tmult: topological multiplicity (2D symmetry)
nbr11: linear chain rotational bonds
rbwring: ring flexibility parameter
fragments: number of fragments necessary to compose the molecule
frag quality: the average similarity of atomic spheres as compared to the CFDB database (1 = bad, 9 = perfect hit; see also the COSMOfrag maxstring keyword)
zwitterion in water: molecule can form a zwitterion in water (1 = true, 0 = false)
sulfoxide, quarternary, mixed, ammonium, ...: number of functional groups as computed by CDK (Chemistry Development Kit)

3.6 QSPR Builder

The QSPR builder module allows the creation of simple QSPR models based on multiple linear regression. It may be started from the main menu (Tools > Create new QSPR model) or via the usual workflow from the compound details tab. It is possible to load semicolon-separated files (csv) containing any kind of descriptor, or one may use COSMOquick-based descriptors. The latter allows those models to be deployed for later calls from within the program. Linear regression models are a linear combination of variables and may look like this, for example for the enthalpy of fusion: ΔHfus = 2.85 … 1.07 … 0.45 h_int … 0
7. [Screenshot: COSMOfrag command line input generator with the sections COSMOfrag keywords, COSMOsim, CTCALC and External COSMOtherm (CT_EXT), a compound list containing isobutane, H2O and H2S as SMILES, and the buttons Add files (smi, cosmo etc.), Load inp, Save inp, Reset input, Open run directory and Start calculation.]

For details of how to run a COSMOfrag calculation, please consult the manual (Help > COSMOfrag Reference manual).

Addition of cosmo files to the database (CFDB)

The COSMOfrag interface may be used to add new molecules to the underlying database. Please note that this requires a quantum chemistry program that is able to create cosmo files at the SVP level of theory, e.g. TURBOMOLE. Choose "Really add molecules to database" from the pulldown menu and select the corresponding cosmo files via the Add files button. Sometimes it may be useful to choose "Virtually add molecules to database", which leaves the database untouched but reports which molecules would be added with the current setup. In this respect the MINSADD keyword may be modified. It specifies the threshold value of the minimum similarity in a molecule for CFDB addition (default is 2; values can range from 1 to 7). When you finally press Start calculation, the molecules in question are added and converted into a compressed format (ccf). The temporary directory can be accessed via Open run directory in order to look at the COSMOfrag output.
8. Please select Henry constant & Gas Solubility from the first screen. Choose the Import molecules from file button and load the file pvc_sorption.smi from the exampledata directory.

[Dialog "Polymer detection": Potential polymer detected. Do you want to switch ON polymer detection? (Uses halide in SMILES to mark a repeat unit, i.e. uses the POLYMER I COSMOfrag keyword.) Yes / No]

Choose Yes to switch on the polymer treatment within COSMOquick. For details of the polymer treatment, please refer to section 3.9. A dataset containing PVC and some small molecules is now loaded. If you proceed to the compound details window by clicking Next, the PVC compound is now labeled as a polymer (green entry). Continue by choosing the screening type Henry constant. You should now have a solvent defined (PVC) and see several solutes in the table. If you continued now without further adjustment, you would compute the relative solubility constant from the gas phase into the solvent. To get absolute values, it is necessary to specify a reference experiment from which a material-specific shifting constant for the polymer is computed. In this case we select the solubility of N2 in PVC as the reference, with S = 0.023 cm³/(cm³ bar). First we have to select a suitable input from Units Reference Solub
9. SMILES string. Additionally, SDF files may be used as input for COSMOquick.

1.3 Installation

COSMOquick is shipped with an installer for Windows, Linux and MacOS. The COSMOfrag database (CFDB) needs to be installed separately. Extract the COSMOfrag database archive (CFDB.zip) to a folder of your choice. Please note that you need an up-to-date unzipping program (e.g. 7-zip); some older versions of Winzip may cause problems here. Furthermore, because the database is about 2.4 GB in size, unzipping may take several minutes. All subdirectories are created automatically. At the first startup of the software you are asked to specify the location of the CFDB. Please choose an appropriate directory. Access to the CFDB over the network may slow down the fragmentation significantly.

Proxy Server

Using the NIH web service requires direct access to the internet. If you want to use this service and have to access the internet via a proxy server, you will have to adapt the Java configuration file COSMOquick.vmoptions, which can be found in the COSMOquick subdirectory of the installation directory. Simply uncomment the respective line there and enter the proxy settings of your company or institution.

1.4 Current COSMOquick Limitations

The COSMOquick approach of generating approximate σ-profiles leads to certain limitations in the application of the method:

No conformer treatment is possible with COSMOquick.
For most common ionic compounds σ-profile can be
10. Technically, COSMOquick performs three calculations to obtain Hex: one for each of the pure components A and B, and one mixture calculation for A and B with the given stoichiometry in the subcooled liquid consisting of the mixture of A and B. Sorting the results by excess enthalpy gives a list with the compounds having the highest propensity to cocrystallize at the top. Based on recent work we have introduced a partially empirical function f_fit to improve the results of the cocrystal screening. It takes into account the flexibility of the API and the coformer via the number of rotatable bonds nrot:

f_{fit} = H_{ex} + a \left[ \max(1, nrot_{API}) + \max(1, nrot_{coformer}) \right]

with the constant a = 0.5102, which has been determined on a set of about 300 API-coformer pairs from the literature. Highly flexible compounds are thus penalized in a screening. We have not fully understood this effect yet. It is probably of kinetic nature, as more flexible compounds may have a higher barrier for crystallization.

3.4 Solute Backfitting

The aim of this approach is to find a description, i.e. a composed or meta COSMO file, of a compound whose structure is not well defined (like a residue or a polymer), based on its solubility in different solvents. In other words, based on given experimental data, a meta COSMO file (a so-called mcos file) is generated via an iterative algorithm so that it reproduces those experimental data as well as possible. This can subseque
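The empirical screening function from section 3.3 can be illustrated with a minimal Python sketch. The additive form and the positive sign of the flexibility term are assumptions: they follow from the statement that flexible compounds are penalized and are consistent with the H_ex and f_fit value pairs shown in the cocrystal tutorial. The function and argument names are hypothetical and not part of COSMOquick.

```python
A_FLEX = 0.5102  # constant a from section 3.3, same energy unit as H_ex (kcal/mol)


def f_fit(h_ex, nrot_api, nrot_coformer, a=A_FLEX):
    """Flexibility-corrected screening score (assumed additive form).

    h_ex          -- excess enthalpy of the API/coformer mixture in kcal/mol
    nrot_api      -- number of rotatable bonds of the API
    nrot_coformer -- number of rotatable bonds of the coformer
    """
    # Each partner contributes at least one "bond", hence max(1, nrot).
    return h_ex + a * (max(1, nrot_api) + max(1, nrot_coformer))


def rank_pairs(pairs):
    """Sort (name, h_ex, nrot_api, nrot_coformer) tuples, best candidate first."""
    return sorted(pairs, key=lambda p: f_fit(p[1], p[2], p[3]))
```

A lower (more negative) score indicates a higher propensity to cocrystallize, so a rigid coformer with a moderately negative H_ex can rank above a very flexible one with a similar H_ex.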
11. cocrystal with a molecule, typically an active pharmaceutical ingredient (API). This workflow can also be used to identify possible solvate-forming solvents for the specific drug. Please have a look at section 3.3 for details of the procedure. Please select Cocrystal/Solvate Screening from the start window.

[Screenshot: COSMOquick start window with entries such as Henry Constant & Gas Solubility, QSPR & ADME Properties, Generate Hansen parameter and Watch online introduction.]

You now arrive at the compound setup, where you can specify the molecules you want to study. Please select Import molecules from file and open the file cocrystal_cyanophenol.smi from the directory exampledata.

[Screenshot: Compound input tab with a file dialog showing the exampledata directory, which contains among others cocrystal_paracetamol.smi, nutraceutical.sdf, compoundlist_cimetidine.smi, compoundlist_diuron.smi, compoundlist_ibuprofen.smi, compoundlist_meloxicam.smi, compoundlist_paracetamol.smi, compoundlist_sulfadiazine.smi, solvents.smi, top_sales_drugs.smi, wermuth_coformers.smi, ethylcellulose_sorption.smi, polymers.smi and pvc_sorption.smi.]
12. (e.g. chloroform) and D/A donor-acceptor (e.g. water). To cover the potential solvent space broadly and to obtain good predictivity, it is recommended to include at least one reference solvent of each type, i.e. you should have an unpolar, an acceptor and a donor-acceptor solvent. Please note that hovering the mouse over a field of interest displays some additional information (tooltip) on that variable. A second window is available with plots of the computed solubilities. If you have specified experimental solubilities, they are plotted as well.

[Screenshot: Results tab with the plot window "Graph for Screening No. 1", showing the computed solubilities per solvent; selectable quantities include S [g/L], S [mol/L], w [g/g] and their experimental counterparts, displayed as scatterplot or barplot.]

You may now extract the results either by using copy and paste on the tables (Ctrl+C, Ctrl+V) or by using the export to Excel (csv) function.

2.2 Cocrystal/Solvate Screening with COSMOquick

This section explains how to carry out a screening for potential coformers which can form a
13. mcos file as generated by COSMOfrag, but the location of original cosmo files may also be given here.

Error code: The error code of a COSMOfrag fragmentation run. If the error code is > 0, the fragmentation has failed and the corresponding row is marked red. Those compounds cannot be used for a subsequent property prediction. Please consult the COSMOfrag manual for an explanation of the error codes. The most common reasons for an error code > 0 are that the system is charged or that an invalid SMILES string was given in the input.

Warn code: The warning code of a COSMOfrag fragmentation run. If the warning code is > 0, the corresponding row is marked yellow. Such compounds can be used for subsequent property predictions but should be inspected more closely. Please consult the COSMOfrag manual for an explanation of the warning codes.

Polymer: Gives a 1 for a molecule which has been fragmented according to the POLYMER X options of COSMOfrag and a 0 for normal molecules.

Charge: Gives the formal integer charge of a molecule as taken from the SMILES.

3.9 Treatment of Polymers

Because there is no official encoding of polymers as SMILES, COSMOquick uses a workaround to mark a polymer repeat unit. Head and tail of a monomer are labeled with a SMILES character usually reserved for halides. For example, for polychloroprene the corresponding SMILES is C C CI CI CI I. In this case head and tail of the repeat unit are marked by iodine, but F, Cl or Br are
14. …08 M2 … 0.14 Mdon2 … 0.13 alkylatoms … 0.59 nbr11

Please refer to section 3.5 for the meaning of the variables. Models that have been built can be saved and used subsequently for other systems. The linear regression models are evaluated using the root mean squared error (RMSE) between predicted and experimental values:

RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - f(x_i) \right)^2}

Here N is the number of samples, i.e. molecules, y_i is the experimental property and f(x_i) is the quantity predicted by the model f(x) for a molecule with variables x_i. To avoid the problem of overfitting, the RMSE is evaluated within a 5-fold cross-validation.

Automatic feature selection can be carried out by a so-called greedy forward selection. Starting with a single variable, the one with the lowest RMSE is selected within a cross-validation and added to the model. In the next step the best variable among the remaining ones is selected and added to the model. This is repeated until the RMSE cannot be improved significantly. It is very important that this is done within a cross-validation loop; otherwise feature selection may induce quite severe overfitting, leading to useless models. Additionally, variables with zero variance (i.e. which are basically constant) and highly correlated variables are discarded automatically. There need to be at least as many molecules as variables for the linear regression in order to have a unique solution for the coefficients of the model.

3.7 Prediction of Hansen Solubility Parameter

Hansen s
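The cross-validated RMSE and the greedy forward selection described in section 3.6 can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the COSMOquick implementation: the function names, the random fold assignment and the stopping rule (a relative RMSE improvement below min_gain) are hypothetical choices, since the manual does not define what counts as a significant improvement.

```python
import numpy as np


def cv_rmse(X, y, n_folds=5, seed=0):
    """RMSE of a least-squares model with intercept, pooled over k folds."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    sq_err = 0.0
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        sq_err += np.sum((y[test] - A_test @ coef) ** 2)
    return float(np.sqrt(sq_err / n))


def greedy_forward_selection(X, y, min_gain=0.1, n_folds=5):
    """Add one variable at a time while the CV-RMSE improves by > min_gain."""
    selected, best = [], np.inf
    remaining = list(range(X.shape[1]))
    while remaining:
        # Score every candidate variable together with the ones already chosen.
        scores = {j: cv_rmse(X[:, selected + [j]], y, n_folds) for j in remaining}
        j_best = min(scores, key=scores.get)
        # Assumed stopping rule: relative improvement must exceed min_gain.
        if selected and best - scores[j_best] < min_gain * best:
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best = scores[j_best]
    return selected, best
```

Because every candidate is scored on held-out folds only, a descriptor that merely fits noise in the training data does not lower the score and is not selected.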
16. Some of them are COSMOquick specific (solubility calculation with several references, cocrystal screening) and some of them can also be carried out with COSMOfrag at the command line. For those calculations, please have a look at the COSMOfrag manual (e.g. available via the help menu within COSMOquick).

3.1 Solubility Calculation

COSMOquick is able to use multiple experimental solubilities as references to refine its solubility prediction. The procedure is outlined below, and more details can be found in reference [8]. First, a number of reference solvents is chosen for which the solubility is known, e.g. from an experimental measurement. From those n reference solubilities the free energy of fusion ΔGfus is calculated by the following equation (see also reference [4]):

\Delta G_{fus}^{i} = \mu^{pure} - \mu_{i}^{solvent} - RT \ln(10) \log_{10}(x_i)

The chemical potentials of the pure liquid solute, μ^pure, and of the solute in the solvent at infinite dilution, μ_i^solvent, are calculated by COSMOquick. The experimental solubility x_i is given as mole fraction in mol/mol. Thus, for every solvent we obtain a free energy of fusion, and these values will differ slightly. In a perfect model ΔGfus would of course be the same for any solvent. The basic idea is now to use those differences in the free energy of fusion to correct the chemical potentials within the solvent, where the correction term is adapted to the similarity of the

[Equation: correction term ΔG_cor in terms of the ΔG_fus^i, i = 1 … n]
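As a worked illustration of the first step in section 3.1, the following Python sketch computes one free energy of fusion per reference solvent. It assumes the relation ΔGfus = μ(pure) − μ(solvent) − RT ln(10) log10(x) with all energies in kcal/mol; the function name and the numerical chemical potentials are hypothetical values chosen only for the example.

```python
import math

R_KCAL = 0.0019872036  # gas constant in kcal/(mol K)


def delta_g_fus(mu_pure, mu_solvent, log10_x, temperature=298.15):
    """Free energy of fusion (kcal/mol) implied by one reference solubility.

    mu_pure    -- chemical potential of the pure liquid solute, kcal/mol
    mu_solvent -- chemical potential of the solute in the solvent, kcal/mol
    log10_x    -- decadic logarithm of the experimental mole fraction solubility
    """
    rt_ln10 = R_KCAL * temperature * math.log(10.0)
    return mu_pure - mu_solvent - rt_ln10 * log10_x


# Hypothetical reference data: (mu_solvent, log10_x) per solvent.
references = {"ethanol": (-6.0, -1.0), "toluene": (-4.5, -2.2)}
per_solvent = {name: delta_g_fus(-5.0, mu_s, lx) for name, (mu_s, lx) in references.items()}
```

The spread of the values in per_solvent is what the reference-solvent correction described above works with: in a perfect model all entries would coincide.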
17. [Screenshot, continued: Coformer setup table with coformers No. 13 to 20 (4,4'-biphenol, 1,3-dihydroxybenzene, bicalutamide, 3-hydroxypyridine, 5-hydroxyisoquinoline, 4-cyanophenol, 1-hexadecanol, tartaric acid), each with mole fraction 0.500 and status UNKNOWN.]

Once we have a compound set of our choice (this cocrystal setup is taken from Bis et al., Mol. Pharm. 2007, 4, 401), we proceed by pressing the Run button at the lower left corner, and the screening starts. After a few seconds the results of the calculation are presented in the next window. To order the API-coformer pairs according to their propensity to form a cocrystal, we select the column showing the excess enthalpy H_ex and sort it.

[Screenshot: Results tab listing the pairs of 3-cyanophenol with each coformer, their status and the computed H_ex and f_fit values.]
18. acceptor solvent like water, a pure donor solvent like chloroform and an acceptor solvent like acetone. Thus the solvent space would be well represented and predictions may become more balanced.

Correction for σ-potentials of alkanes

Currently, solubility trends for a solute in a homologous series of alkanes are not reproduced correctly. To overcome deficiencies of the current COSMO-RS approach concerning the solubility in pure alkanes, the following correction for the pseudo-chemical potential is used in COSMOquick (only for alkanes):

[Equation: correction Δμ in terms of f(ε)_QSPR, f(ε), E_dielec and A]

A is a constant determined by fitting to experimental data (activity coefficients and solubilities in homologous alkanes) and amounts to A = 1.2. E_dielec is the dielectric energy of the solute in the virtual conductor of the COSMO approach. f(ε)_QSPR and f(ε) are the scaling factors for the dielectric surrounding. The constant scaling factor of a COSMOtherm calculation, f(ε), is corrected with a new scaling factor f(ε)_QSPR which has been adapted to reproduce the behavior of alkanes correctly. This scaling factor is obtained from a QSPR for a set of dielectric constants of alkanes:

f(\varepsilon)_{QSPR} = \frac{\varepsilon_{QSPR} - 1}{\varepsilon_{QSPR} + 0.5}

The corresponding empirical QSPR equations for linear and branched alkanes are:

\varepsilon_{linear} = 2.108 - 0.550 \exp(-0.157\, n)

[Equation: ε_branched as a function of rb, n, n_aa and n_ag, with the coefficients 0.03756, 0.08011 and 0.002]

where n is the number of alkane C atoms, rb is the
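The QSPR-based scaling factor for linear alkanes can be illustrated with a short Python sketch. The minus signs in eps_linear are an assumption, because the operators were lost in the extracted equation; this choice yields dielectric constants of roughly 1.9 to 2.0 for medium-sized n-alkanes. The function names are hypothetical, and the branched-alkane term is left out because its form cannot be recovered from the text.

```python
import math


def eps_linear(n_carbon):
    """QSPR estimate of the dielectric constant of a linear alkane.

    Assumed form: 2.108 - 0.550 * exp(-0.157 * n), n = number of C atoms.
    """
    return 2.108 - 0.550 * math.exp(-0.157 * n_carbon)


def scaling_factor(eps):
    """COSMO dielectric scaling factor f(eps) = (eps - 1) / (eps + 0.5)."""
    return (eps - 1.0) / (eps + 0.5)


# Scaling factors for a homologous series from pentane to hexadecane.
series = {n: scaling_factor(eps_linear(n)) for n in range(5, 17)}
```

The scaling factor grows monotonically with chain length in this sketch, which is the trend the correction is meant to introduce for solubilities in a homologous alkane series.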
19. also possible. The molecule is treated internally as an infinite cyclic chain, and no molecular weight effects or structural effects are taken into account. COSMOquick automatically detects SMILES which have an even number of I characters. Alternatively, different halides can be chosen within the global options menu. For very small repeat units it is recommended to define a dimer or trimer for a more balanced σ-profile composition. For calculations involving COSMO-RS properties, the combinatorial contribution to the chemical potential should be switched OFF (i.e. use the Treat solvent as a polymer option for Henry constants or polymer solubilities).

[Figure: polychloroprene]

3.10 Treatment of Charged Molecules

The COSMOquick database (CFDB) by now contains the most common charged functional groups, and therefore most charged molecules and zwitterions can be used. This may be useful, for example, for the creation of fcos files (approximated cosmo files from 3D structures) for a subsequent COSMOsim3D or COSMOsar3D calculation. However, we currently do not recommend using charged species for property prediction. If you try to use a charged molecule for such a task, a warning message is shown, which has to be switched off in the global options menu.

3.11 Scripting in COSMOquick

A still somewhat experimental feature is the use of scripting to access internal COSMOquick routines COSMOfrag
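The polymer detection rule from section 3.9 (an even number of halide marker atoms in a SMILES string) can be sketched in Python as follows. This is only an illustration of the stated rule and not the COSMOquick code: the function names and the example SMILES ICC(Cl)I for a PVC-like repeat unit are hypothetical, and the counting is a plain text scan rather than a full SMILES parse.

```python
import re


def count_marker(smiles, marker="I"):
    """Count atoms of the marker halide (F, Cl, Br or I) in a SMILES string."""
    # Bracket atoms such as [Ir] or [In] must not be mistaken for iodine,
    # so they are inspected separately from the unbracketed part.
    bracket_atoms = re.findall(r"\[([^\]]*)\]", smiles)
    outside = re.sub(r"\[[^\]]*\]", "", smiles)
    count = outside.count(marker)
    for atom in bracket_atoms:
        # The element symbol follows an optional isotope number.
        match = re.match(r"\d*([A-Z][a-z]?)", atom)
        if match and match.group(1) == marker:
            count += 1
    return count


def looks_like_repeat_unit(smiles, marker="I"):
    """True if the SMILES has a non-zero, even number of marker atoms."""
    n = count_marker(smiles, marker)
    return n > 0 and n % 2 == 0
```

A genuine diiodo compound such as Ic1ccc(I)cc1 also satisfies the rule, which is presumably why the program asks for confirmation before switching the polymer treatment on.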
20. created with the help of the QSPR builder (see section 3.6). Please contact COSMOlogic if you are interested in more details on the generation of those models and their deployment within COSMOquick. For the creation of the QSPR models a rich set of descriptors from either COSMOfrag or COSMOtherm has been used. Please note that, in order to use the variable names with external software packages like R, any special characters have been removed. This ensures that the variable names stay unchanged after they have been processed externally. They are briefly summarized in the following; you may also hold the mouse over the variable names within COSMOquick in order to obtain this information.

Total q: total charge sum from the σ-profile
n0 030 e A2 to X0 030_e A2: p(σ) values, with σ ranging from -0.030 e/Å² to 0.030 e/Å²
mu self: chemical potential of the pure compound in kcal/mol
h hb: enthalpy due to hydrogen bonding of the pure compound in kcal/mol
h int: internal enthalpy of the pure compound in kcal/mol
e dielec: dielectric energy
area: surface area of the molecule in Å²
M2 to M6: σ-moments
volume: molecular volume in Å³
avratio: ratio of surface area to volume
Macc1 to Macc3: σ-acceptor moments
Mdon1 to Mdon3: σ-donor moments
molweight: molecular weight in g/mol
ringbonds: number of bonds in closed rings
alkylatoms: number of pure carbon atoms belonging to alkyl groups (CHx)
alkygroups: number of alkyl groups
21. 3 Technical Details of COSMOquick
3.1 Solubility Calculation
3.2 Solubility Definitions and Unit Conversion
3.3 Cocrystal Screening
3.4 Solute Backfitting
3.5 ADME & QSPR Calculations
3.6 QSPR Builder
3.7 Prediction of Hansen Solubility Parameter
3.8 Generation of σ-Profiles / Fragmentation Calculation
3.9 Treatment of Polymers
3.10 Treatment of Charged Molecules
3.11 Scripting in COSMOquick
References

1 Introduction

COSMOquick is a graphical user interface (GUI) and a driver for COSMOfrag [1]. The program is particularly suited for solubility calculations and screening of large data sets, e.g. cocrystal screening or partitioning coefficients. The COSMOquick/COSMOfrag approach all
22. generated with COSMOquick, but property prediction is currently not recommended.
A few complex drugs may not be properly represented in the COSMOquick database, and no valid σ-profile may be generated. In those cases an Error/Warning message is shown, and cosmo files have to be generated and added to the database.
Known SMILES issue: implicit H inside square brackets is not supported (e.g. write C or NH4 instead of C or N).
COSMOquick has been tested to run with 20000 medium-sized organic compounds. Higher numbers may be feasible with the GUI, but for performance reasons we recommend using the command-line-based COSMOfrag for large sets of compounds. Input files for COSMOfrag may be created, loaded or modified via a graphical user interface from TOOLS > COSMOfrag calculation.
There is currently the restriction to use a parameterization at the BP-SVP-COSMO level.
For larger sets of compounds, make sure that sufficient disk space is available. A computation of 10000 compounds currently needs roughly 500 MB for temporary data.
In case the GUI runs out of memory, additional memory can be allocated by changing the -Xmx1024m option in the COSMOquick.vmoptions file in the COSMOquick directory.
The length of an input SMILES is limited to a total of 222 atoms.

Limitations due to third-party software used within COSMOquick:

Limited support for inorganic compound SMILES
23. number of ringbonds, n_aa the number of alkylatoms and n_ag the number of alkylgroups as given by COSMOfrag. The regression coefficients for those two equations as compared with experimental data are r = 0.998 for linear alkanes and r = 0.96 for the branched alkanes. The final dielectric constant is then obtained via

\varepsilon_{QSPR} = \varepsilon_{linear} + \varepsilon_{branched}

The regression coefficient for the QSPR scaling factor f(ε)_QSPR as compared with the experimentally obtained factor is r = 0.977. This alkane correction is only used for solubility calculations with reference solvents within COSMOquick.

Dissociation correction

In the advanced options menu of a solubility calculation it is also possible to switch on a simple Henderson-Hasselbalch dissociation correction term (Diss. Correct) for aqueous solutions, which may be used to correct the solubilities of strongly dissociating solutes.

3.2 Solubility Definitions and Unit Conversion

There are currently many different solubility definitions in the literature. COSMOquick uses the decadic logarithm of the mole fraction, log10(x), internally for its calculations. To ease the conversion between different units, a solubility converter can be found under Tools > Solubility converter. The same converter can be reached via the context menu when specifying a mixture solvent for a solubility run. Currently the following solubility definitions can be used (definitions are according to the ones used in the COSMOth
24. [Table, continued] Property; Quantity; Module
Vapor pressure; p_vapor; Henry constant & gas solubility
Free energy of solvation; ΔGsolv in kcal/mol; Henry constant & gas solubility
Gas solubility; S in cm³/(cm³ bar); Henry constant & gas solubility
Melting point; Tm [K]; QSPR & ADME
Enthalpy of fusion; ΔHfus in kcal/mol; QSPR & ADME
Water solubility; logS_water (S in mol/L), w in g/g, log10(x) (x in mole fraction); QSPR & ADME, Solubility Prediction
Octanol-water partitioning coefficient; logKow; QSPR & ADME
Blood-brain partitioning coefficient; logBB; QSPR & ADME
Plasma protein (human serum albumin) partitioning; logKHSA; QSPR & ADME
Intestinal absorption coefficient; logKIA; QSPR & ADME
Organic carbon (soil)-water partition coefficient; logKOC; QSPR & ADME
Abraham parameters; E, S, A, B, V; QSPR & ADME
Hansen parameters; δD, δP, δH; Hansen parameter estimation

1.7 COSMOquick File Menu

The following options are available in the COSMOquick file menu:

FILE
NEW JOB: Starts a new job and closes all results windows.
LOAD: Loads either a file containing SMILES strings and compound names (smi) or a previous fragmentation run (frg).
QUICKLOAD: Loads the last fragmentation run.
OPEN TEMPORARY DIRECTORY: Opens the temporary directory used for calculations.

EXTRAS
GLOBAL OPTIONS: Options for COSMOfrag and internal COSMOtherm runs can be set here.
GENERAL SETTINGS: Here you can specify, for example, the location of the COSMOfrag executabl
25. [Screenshot, continued: Compound details table with compounds No. 9 to 20 and their molweight, among them 1-naphthol (144.17), 1,3-dicyanobenzene (128.13), 1,4-dicyanobenzene (128.13), 1,3,5-trihydroxybenzene (126.11), 4,4'-biphenol (186.21), 1,3-dihydroxybenzene (110.11), bicalutamide (430.38), 3-hydroxypyridine (95.10), 5-hydroxyisoquinoline (145.16), 4-cyanophenol (119.12), 1-hexadecanol (242.44) and tartaric acid (150.09).]

Compounds for which the fragmentation has failed are marked red, as in this case glycerine. This may have several reasons: either the compound name was not found within the delivered database and therefore no valid SMILES was found, or a SMILES was provided but contains an atomic environment which is not available in the CFDB. The checkbox Extended info may reveal the reason for a failed fragmentation. In this case the name glycerine was simply not found in the delivered database. Therefore we would have to provide a SMILES string for this compound ourselves in the Compound input screen. This could be done either by using the Manage compounds button or by selecting the right row and calling the context menu with a right mouse button click. Now we just remove the compound by selecting either Remove or Remove ALL fragmentation failures. The context men
26. [Screenshot: cocrystal screening results for 3-cyanophenol paired with each coformer, with a status column and the quantities H_ex (kcal/mol), H_hb (kcal/mol) and G_mix (kcal/mol); screening temperature 298.15 K.]
We should now find all pairs which have a low excess enthalpy at the top of the list; those are compounds which have a high probability to form a cocrystal (see also section 3.3). It is also possible to display quantities which describe the part of the enthalpy that is due to hydrogen bonding (H_hb) and the free energy of mixing (G_mix) of the cocrystal liquid. The column denoted f_fit contains the results of an empirical screening function which takes into account the excess enthalpy and the molecular flexibility of the drug and the coformers (see also section 3.3). The trends of those q
27. We now proceed to the next window, where all of our compounds are listed and where one can set the API, the temperature and the stoichiometry of the system under scrutiny. For unknown systems it is recommended to keep the 1:1 stoichiometry, as most cocrystals crystallize in either a 1:1 or a 2:1 ratio, where the latter would not significantly change the results within the given frame of accuracy. If we have experimental knowledge about an API-coformer system, we may also mark a pair as being either a cocrystal or no cocrystal by clicking with the left mouse button on the specific table entry in the status column. This just results in a coloring of the entry, which may be useful if we screen a large list of compounds.
[Screenshot: Coformer setup tab with 3-cyanophenol selected as active pharmaceutical ingredient (API); each coformer is listed with a mole fraction of 0.500 and the status UNKNOWN.]
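The relation between the stoichiometry chosen in this window and the mole fractions shown in the table can be illustrated with a minimal sketch. The function name `mole_fractions` is hypothetical and not part of COSMOquick; it only assumes that an m:n API:coformer ratio corresponds to the mole fractions m/(m+n) and n/(m+n).

```python
def mole_fractions(m, n):
    """Return (x_api, x_coformer) for an m:n API:coformer stoichiometry.

    Hypothetical helper for illustration only; not a COSMOquick function.
    """
    if m <= 0 or n <= 0:
        raise ValueError("stoichiometric coefficients must be positive")
    total = float(m + n)
    return m / total, n / total


# 1:1 stoichiometry, as recommended for unknown systems
x_api_11, x_coformer_11 = mole_fractions(1, 1)

# 2:1 stoichiometry
x_api_21, x_coformer_21 = mole_fractions(2, 1)
```

For the 1:1 case both mole fractions are 0.5, which corresponds to the value 0.500 displayed for every coformer in the setup table.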
28. C cimetidine

    f = open(exampledir + "solvents.smi", "r")
    solvents = f.read()
    f.close()
    solList = []
    nameList = []
    CQInterface.useGfusionQSPR = True
    for solute in soluteList:
        # one solute followed by the complete solvent list
        molset = solute + "\n" + solvents
        cqModel = CQModel()
        cqModel.startFragmentation(molset, False)
        cqModel.setupSolubScreening()
        cqModel.startRefSolubCalculation()
        for i, m in enumerate(cqModel.getMixtures()):
            # the first entry is the solute itself
            if i == 0:
                solutename = m.getLabel()
                continue
            solList.append(m.getSol_g_p(1))
            nameList.append(solutename + " in " + m.getLabel())
    for name, solubility in zip(nameList, solList):
        print "%-64s %4.2f" % (name, solubility)

The script iterates over 3 solutes and computes the solubility in a set of different solvents, using a QSPR for the free energy of fusion. Prerequisites for such scripting are:
• Installation of the COSMOquick GUI, in order to get a settings.xml file with actual paths and directories.
• Download of a recent Jython version (e.g. 2.7) from sourceforge.
• Adaptation of the paths for the jar archive locations and settings.xml in the Jython script (use the sys.path.append command as indicated in the example script, or set the Java CLASSPATH environment variable).
• Call the Jython script with a java call, e.g. COSMOlogic/COSMOquick14/jre/bin/java -jar jython-standalone-2.7-b3.jar screening.py

References
1. Hornig, M. & Klamt, A. COSMOfrag: A Novel Tool for High-Throughput ADME Property P
29. Calculation and Solvent Screening with COSMOquick
This section describes how to perform a COSMOquick solubility calculation with reference solubilities. Please have a look at chapter 3.1 for details of the procedure. After the first startup, please provide a location for the COSMOfrag database (CFDB) and also for a valid license file. If the CFDB location and the license are OK, you arrive at the start screen and may choose the calculation type; please choose Solubility Prediction.
[Screenshot: start screen with the available calculation types, including Henry Constant & Gas Solubility, Cocrystal/Solvate Screening, QSPR & ADME Properties, Solute Backfitting, Generate Hansen parameter and Generate σ-Profiles.]
Now you arrive at the compound setup, where you can specify the molecules you want to study. Please select Import molecules from file and open the .smi file compoundlist_paracetamol.smi from the directory exampledata.
[Screenshot: Molecule input with the file dialog showing the contents of the exampledata directory.]
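To make the structure of such a compound list concrete, the following minimal sketch reads .smi-style text into (SMILES, name) pairs. It assumes that each line holds a SMILES string optionally followed by whitespace and a compound name; the function `parse_smi` and the example entries are illustrative and do not reproduce COSMOquick's own reader.

```python
def parse_smi(text):
    """Parse .smi-style text into a list of (smiles, name) tuples.

    Assumption: one compound per line, SMILES first, then an optional name
    which may itself contain spaces. Empty lines are skipped.
    """
    compounds = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # split only once so that multi-word names stay intact
        parts = line.split(None, 1)
        smiles = parts[0]
        name = parts[1].strip() if len(parts) > 1 else ""
        compounds.append((smiles, name))
    return compounds


# Illustrative input; the last line has no compound name
example = "CCCC butane\nCC(C)C isobutane\nCC(C)CC(C)=O methyl isobutyl ketone\nCCO\n"
compounds = parse_smi(example)
```

Splitting only at the first run of whitespace keeps names such as "methyl isobutyl ketone" in one piece.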
30. [Screenshot: Compound details table for the paracetamol example, listing compound name, SMILES and molweight, with the context menu opened. The menu offers Show 2D structure, View σ-profile/potential, Show mcos file(s), Save mcos file(s), Use COSMO file, Remove, Remove duplicates, Remove ALL fragmentation failures, Add experimental data and Export compound details.]
Compounds for which the fragmentation has failed are marked red, as in this case glycerine. This may have several reasons: either the compound name was not found within the delivered database and therefore no valid SMILES was found, or a SMILES was provided but contains an element which is not available in the CFDB. The checkbox Extended info may reveal the reason for a failed fragmentation. In this case the nam
31. [Screenshot: results of the reference solubility calculation at 298.15 K; the selectable fields in the right column include Closest_ref, CQ exponent, ΔGfus, ΔGcorr, ln γ and Max_sim.]
You will also find a red mark in the row of CCl4, which means that the computed correction for this reference is significantly larger than one would expect (the threshold is currently set at 1.5 kcal/mol). A large correction term is a strong hint that this experimental value is inaccurate and should be checked. Indeed, as a personal communication from the authors of this experiment confirmed, the experimental value of log10(x) = -3.04 is most probably much too high, and the true solubility of paracetamol in CCl4 is about log10(x) = -5. A more detailed discussion of this issue can be found in reference 8. You will find a lot of useful additional information on the calculation by selecting the corresponding field in the right column. For example, if you inspect the last column of this view, you find that each solvent has been assigned a type according to its similarity with some standard solvents. The codes represent the following solvent types: NONP nonpolar (e.g. hexane), ACC acceptor (e.g. acetonitrile), DON donor
33. e glycerine was simply not found in the delivered database. Therefore we have to provide a SMILES string for this compound in the Compound input screen. This can be done either by using the Manage compounds button at the right or by selecting the corresponding row and calling the context menu with a right mouse button click. In this tutorial we just remove the compound by selecting either Remove or Remove ALL fragmentation failures. We now proceed to the next tab, where we have to select the reference solubilities and specify experimental values for them. Paracetamol is now automatically selected as solute, as it was the first molecule in the list. Please select Load solubility setup and choose the file paracetamol_pure.mix from the exampledata directory. The window should look like:
[Screenshot: Solubility setup tab with paracetamol as solute and a table of solvents with the columns Reference and S_exp; the options at the right include Input Units, Temperature, Extended options, Add solvent mixture, Load solubility setup and Save solubility setup.]
34. e, the COSMOfrag database (CFDB) and the license file.
• SHOW LOG: Opens a log window with additional information on what is currently happening, i.e. it basically makes the standard output (stdout) available.
TOOLS
• CREATE NEW QSPR MODEL: Builds a QSPR model via linear regression based on the available COSMOquick descriptors.
• COSMOFRAG CALCULATION: A user interface for starting individual COSMOfrag jobs and COSMOsim jobs, and for loading and saving COSMOfrag input files. This allows for additional flexibility as compared to the standard COSMOquick workflow.
• REQUEST SMILES: Retrieves SMILES strings from an NIH web service (CIR, Chemical Identifier Resolver). Please note that this web service is in the public domain and no guarantee can be given for its correct functionality.
• SOLUBILITY CONVERTER: Converts between the different definitions of solubility which can be found in the literature.
• CREATE FCOS FILES: Creates approximate 3D cosmo files (fcos) from xyz or sdf input files.
• AUTOMATICALLY CREATE 3D STRUCTURES: Uses the UFF or the MMFF94 force field to create 3D structures from SMILES.
LICENSE
• IMPORT LICENSE: Use this button to import a new license file (license.ctd) into the program.
HELP
• COSMOquick USER GUIDE: Opens the COSMOquick manual as a PDF document.
• COSMOfrag REFERENCE MANUAL: Opens the COSMOfrag manual as a PDF document.
• ONLINE SOURCES: Watch
35. erm code:
• mole fraction x in mol/mol
• decadic logarithm of the mole fraction, log10(x)
• normalized mass fraction c in g/g: c = x_solute MW_solute / (x_solute MW_solute + (1 - x_solute) MW_solvent)
• decadic logarithm of the normalized mass fraction, log10(c)
• solute mass based solubility w in g/g (definition 2 from the COSMOtherm manual): w = x_solute MW_solute / ((1 - x_solute) MW_solvent)
• solubility S in mol/L solution, computed from x_solute and the molar volumes V_solute and V_solvent
• solubility S in g/L solution, computed from x_solute, MW_solute and the molar volumes V_solute and V_solvent

3.3 Cocrystal Screening
COSMOquick allows for the screening of coformers which may form a cocrystal with a given API. A detailed benchmark study of COSMO-RS predictions for cocrystal formation can be found in reference 5. To compute the likelihood of cocrystal formation, we start from a virtually subcooled liquid of the cocrystallization components and neglect the long-range order in the crystal. An important quantity in this respect is the excess enthalpy H_ex (mixing enthalpy ΔH_mix) obtained when mixing the pure components A and B to yield the subcooled cocrystal liquid A_m B_n:

$H_{ex} = H_{AB} - x_m H_{pure,A} - x_n H_{pure,B}$

H_pure and H_AB represent the molar enthalpies in the pure reference state and in the m:n mixture, respectively, with mole fractions x_m = m/(m+n) and x_n = n/(m+n). The excess enthalpy H_ex of an API and coformer pair gives a good estimate of the propensity to cocrystallize
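The two mass-based solubility definitions and the excess enthalpy above can be evaluated with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical and not part of COSMOquick, the mole fraction and the enthalpy values are made-up numbers, and the molecular weights are those listed for paracetamol (151.16) and for a solvent of molweight 46.07 in the compound tables of the tutorial.

```python
def normalized_mass_fraction(x_solute, mw_solute, mw_solvent):
    """c in g/g: solute mass divided by the total mass of the solution."""
    solute_mass = x_solute * mw_solute
    solvent_mass = (1.0 - x_solute) * mw_solvent
    return solute_mass / (solute_mass + solvent_mass)


def mass_based_solubility(x_solute, mw_solute, mw_solvent):
    """w in g/g: solute mass divided by the solvent mass."""
    return (x_solute * mw_solute) / ((1.0 - x_solute) * mw_solvent)


def excess_enthalpy(h_mix, h_pure_a, h_pure_b, m=1, n=1):
    """H_ex = H_AB - x_m * H_pure,A - x_n * H_pure,B for an m:n mixture."""
    x_m = m / float(m + n)
    x_n = n / float(m + n)
    return h_mix - x_m * h_pure_a - x_n * h_pure_b


# Hypothetical mole fraction solubility
x = 0.1
c = normalized_mass_fraction(x, 151.16, 46.07)
w = mass_based_solubility(x, 151.16, 46.07)

# Hypothetical molar enthalpies in kcal/mol for a 1:1 mixture
h_ex = excess_enthalpy(-10.0, -9.0, -10.0, m=1, n=1)
```

Both mass-based quantities follow from the same mole fraction and are related by w = c / (1 - c). A negative H_ex, as in the made-up example, corresponds to a pair that is ranked towards the top of the screening list.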
36. he number of freely rotatable bonds of a molecule. The higher the number, the more flexible the molecule is.
• Internal hbonds: Number of potential internal hydrogen bonds.
• Conjugated bonds: Number of conjugated bonds.
• Rotbsdmod: Quantifies the general flexibility, including rings.
• Tmult: A measure for the topological (2D) symmetry due to identical connectivity.
• Nbr11: Rotatable bonds of linear chains.
• Rbwring: Molecular flexibility due to rings.
• Fragments: The number of fragments used to create the approximated σ-profile. Zero fragments means the molecule was taken directly out of the CFDB.
• Frag quality: A number which gives the average similarity of the atoms as compared to the database (COSMOfrag maxstring variable; 0 = lowest, 9 = highest). It can be used to identify those compounds which are possibly not represented reasonably by the compounds currently within the CFDB. From our point of view, a similarity value ≥ 2 can always be regarded as adequate. Compounds with a similarity of 0, on the other hand, should be replaced in any case; COSMOfrag therefore marks these molecules with error code 38.
• USMILES: A unique SMILES code as generated by COSMOfrag.
• Alkane: The number of C atoms of a pure alkane. If there are heteroatoms, the value is 1. This number is used to apply the alkane correction for solubility calculations (section 3.1).
• cosmo file: The name of the cosmo file used for this compound. Usually this will be a
37. hylene
smi O1CCCC1 tetrahydrofuran
smi Cc1ccccc1 toluene
[Screenshot: COSMOfrag input generator window with the buttons Add files (smi, cosmo, etc.), Load inp, Save inp, Reset input, Open run directory and Start calculation.]

2.6 Other Available Options
There are a few useful tools available for different purposes within COSMOquick.
3D structure generation: Once valid SMILES have been created within the compound input panel, they may be converted into 3D structures (sdf format) using RDKit (www.rdkit.org). Just select the compounds to be converted via Manage compounds in the Compound input. Please note that those 3D structures should always be checked for correctness.
fcos file generation: Based on 3D structures (sdf, xyz or COSMO format), COSMOquick is able to generate approximate 3D COSMO files. To distinguish them from true cosmo files, they have the file suffix fcos. They may be used for COSMOsim3D/COSMOsar3D calculations. The fcos generation option can be found under Tools. It needs previously calculated 3D structures and is a stand-alone option.
Additional QSPR descriptors: Additional QSPR descriptors and SMARTS for functional group analysis can be selected at the ADME & QSPR panel. Those descriptors are based on the open source CDK (Chemistry Development Kit).

3 Technical Details of COSMOquick
Currently there are several types of calculations possible with COSMOquick
38. ility the selection box, e.g. Solubility in cm^3/(cm^3 bar). Then mark N2 as reference within the table and type in the solubility.
[Screenshot: Henry Constant setup tab at 298.15 K with the solutes N2, carbon monoxide, ethane, propyne, propene, propane, SF6, ethine, butane, isobutane, H2O and H2S.]
After starting the calculation via the Run button, the results are presented in the next window. A polymer shifting constant is computed and all solubilities are modified correspondingly with this shift. Comparison with the experimental data from the Polymer Handbook (Pauly, S. Polymer Handbook, Permeability and Diffusion Data, Wiley, 2005, 543) gives a squared correlation coefficient of R² = 0.9 for the logarithmic solubility log10(S).

2.4 Exporting mcos Files
The result of a COSMOquick fragmentation calculation for a specific compound is saved in a so-called mcos file. Those mcos files basically contain links from all involved fragments, which build up the decomposed molecule, to their respective compressed cosmo files (ccf) within the CFDB. They can be used like any other cosmo file for subsequent COSMOtherm calculations. To generate them with COSMOquick, please activate Manage compounds or the context menu within the Fragment status pa
39. itself can be scripted at the command line, but in some cases it may be useful to apply the specific workflows which are implemented in COSMOquick. Because COSMOquick is Java based, a natural choice for scripting access is the Python implementation Jython (http://www.jython.org). Jython is a fully functional Java-based Python implementation and allows access to any Java libraries. The following code gives an example of how to screen several solutes with Jython and COSMOquick:

    """
    Jython based solubility screening script using COSMOquick libraries.
    Computes solubility of drugs in different solvents.
    author: Christoph Loschen
    copyright: COSMOlogic GmbH & Co. KG
    """
    import sys
    # make the COSMOquick jar archives available to Jython
    sys.path.append("/home/Loschen/COSMOlogic/COSMOquick14/COSMOquick/COSMOquick.jar")
    sys.path.append("/home/Loschen/COSMOlogic/COSMOquick14/extLib/COSMObasics.jar")
    sys.path.append("/home/Loschen/COSMOlogic/COSMOquick14/extapps/JChempaint/cdk-1.4.18.jar")
    sys.path.append("/home/Loschen/COSMOlogic/COSMOquick14/extLib/jfreechart-1.0.17.jar")
    from de.cosmologic.cosmoquick.model import CQInterface
    from de.cosmologic.cosmoquick.model import CQModel

    if __name__ == "__main__":
        # read the paths and directories written by the COSMOquick GUI
        CQInterface.readSettings("/home/Loschen/COSMOlogicAppData/COSMOquick14/config/settings.xml")
        exampledir = "/home/Loschen/COSMOlogicAppData/COSMOquick14/exampledata/"
        soluteList = N(C(=O)C)C1=CC=C(O)C=C1 paracetamol N C 0 C C1 CC C O C C1 sulfadiazine C1 NC C NH 1 C CSCCNC NC N N
40. n The parameters a and b have been optimized on a grid over a set of 29 reference solvents in order to minimize the Hansen distance between predicted and original values.

3.8 Generation of σ-Profiles (Fragmentation Calculation)
A fragmentation is the basis for each subsequent calculation. Instead of carrying out a quantum mechanical calculation to get the σ-surface of a novel compound, COSMOfrag initiates a look-up in the COSMOfrag database (CFDB) for similar molecules or fragments. The novel molecule is then decomposed into a set of fragments, each of which is represented with its σ-profile within the CFDB. For details of the algorithm please consult reference 1. Thus an approximated σ-profile of the novel molecule is created, which may now be used like any other COSMO file to carry out COSMO-RS calculations. Additionally, COSMOfrag carries out a detailed analysis of the molecules. The fragmentation window contains a lot of useful information, which is shown by selecting Extended info:
• Compound: The name of the compound, which may be changed by selecting the cell.
• SMILES: The SMILES string of the compound (see section 1.2).
• Molweight: The molecular weight in g/mol, which is calculated by COSMOfrag.
• UNIQUECODE: A unique 12-letter code for the compound as created by COSMOfrag.
• Ringbonds: The number of bonds within rings.
• Alkylatoms: The number of alkyl atoms of the compound.
• Alkylgroups: The number of alkyl groups of the compound.
• Rotatable bonds: T
41. n (3.7)
• Generation of approximate σ-profiles for COSMOtherm calculations (3.8)
COSMOquick and COSMOfrag are based on COSMO-RS theory, which has become an efficient and versatile tool for the prediction of a large variety of physicochemical properties, especially in its efficient implementation within the COSMOtherm program. Based on quantum chemical DFT/COSMO calculations for the individual molecules, it allows for physically sound estimations of general vapour-liquid and liquid-liquid equilibria and of related properties like solubilities and partition coefficients. In addition, it has been extended to properties like drug and pesticide solubility, blood-brain partition coefficients, intestinal absorption, soil sorption coefficients, etc., which are of importance in the design and development of drugs, pesticides and other physiological agents. For more information on the COSMOtherm program suite please contact info@cosmologic.de.
All publications resulting from use of this program must acknowledge the following: C. Loschen, A. Hellweg, A. Klamt, COSMOquick, Version 1.3, COSMOlogic GmbH & Co. KG, Leverkusen, Germany, 2014. In addition, reference 8 should be cited.

1.1 Fragmentation Approach: COSMOfrag
COSMOquick internally calls COSMOfrag for the generation of σ-profiles and for the calculation of properties; detailed information on COSMOfrag can be found in reference 1. The basic idea f
42. nel. Select Save mcos file and choose a directory where you want to save the files. A directory mcos will be created, where all the files are saved.
[Screenshot: Compound details tab of a Henry constant prediction with the opened context menu, which includes the entries Show mcos file(s) and Save mcos file(s).]
To use them within COSMOthermX, you have to use the File manager and choose those previously saved mcos files. PLEASE NOTE: Within COSMOthermX a valid path to the COSMOfrag database (CFDB) has to be specified. In General Settings, change Fragment directory (CFDB) accordingly.

2.5 COSMOfrag Input Generator
It is now possible with COSMOquick to generate input files for COSMOfrag which can be submitted from the command line. This offers some performance advantages and may be useful for high-throughput computations which cannot be run and parsed via the graphical user interface. By choosing Tools > COSMOfrag calculation, a new window opens with a layout closely resembling the
43. ntly be used to predict other properties, like solubilities in other solvents, to find replacements, or to predict any other property predictable with COSMO-RS. The general idea is to create a probe compound consisting of several functional groups or fragment molecules, compute the solubility in M solvents, compare with the experimental data points in those solvents, and subsequently adapt the probe compound until a convergence threshold is reached. In detail, the workflow is as follows. As input, M experimental solubilities in M different solvents are needed.
1. Define N diverse functional groups (FG) or molecules and store them in an mcos file.
2. Get the molecular weight, volume and area for all FG solutes and all solvents.
3. Create a real-valued starting guess for the weight vector (row weights r).
4. Compute MW, V and A for the pseudo-solute x according to the starting guess r, e.g. $MW_x = \sum_{j}^{N} r_j MW_j$.
5. Compute M combinatorial terms for the pseudo-solute in each solvent.
6. Compute one chemical potential of the pure pseudo-solute x and M chemical potentials of x in all solvents (infinite dilution), and add the combinatorial terms from above.
7. Convert the experimental solubilities into mole fractions using MW or V.
8. Determine the squared deviation between experimental and predicted solubility: $SSE = \sum_{i}^{M} \left( \mu_x^{0} - \mu_x^{(i)} - \Delta G_{fus} - RT \ln x_i \right)^2$
9. Embed 3-8 into optimi
44. olubility parameters are a useful concept for the characterization of solutes and solvents. They describe the solubility characteristics in terms of 3 parameters, δD, δP and δH, representing dispersion interaction, permanent dipole-dipole interaction and hydrogen bonding, respectively. The parameters for a new solute are usually determined experimentally by measuring its solubility in a set of different reference solvents with known parameters. COSMOquick allows for the estimation of those parameters by carrying out COSMO-RS solubility calculations, without the need for an experiment. The workflow is as follows. First, a solute x is defined via its 2D topology, e.g. with the editor or by directly specifying its SMILES code. Then a COSMO-RS computation of the activity coefficient ln γ is carried out for a set of reference solvents. An initial guess is made for the Hansen parameters δD, δP and δH, and an activity coefficient for solute x in solvent i is computed via an equation based on the Hansen distance between solute and solvent. The activity coefficients as computed via the Hansen distance and via COSMO-RS are plugged into a sigmoid function f in order to differentiate between good (f(x) = 1) and bad solvents (f(x) = 0). Then an optimization procedure varies the Hansen parameters such that the squared difference between those two functions becomes minimal: $\sum_i \left[ f(\ln \gamma_i^{Hansen}) - f(\ln \gamma_i^{COSMO-RS}) \right]^2 \rightarrow$ mi
45. online introduction into COSMOquick.
• ABOUT COSMOQUICK: Gives information on COSMOquick and on the currently used license.
• LICENSE AGREEMENTS USED: Shows all currently used external licenses of COSMOquick.

2 COSMOquick Tutorial
Before starting with a specific tutorial, it is helpful to have a look at the typical COSMOquick workflow:
1. Definition of molecules & compounds: draw 2D structures; import structures from file; search for compound names.
2. Generation of σ-profile & compound details: calls COSMOfrag; access to CFDB database; compounds are created; compounds can be analyzed.
3. Selection & setup of screening type: input reference measurements; mixtures are defined; define temperature, stoichiometry etc.
4. Calculation & analysis of results: visualisation of data; export results.
The first step consists of defining the molecules under scrutiny; this is usually done by loading a file, drawing a structure or defining a SMILES. Afterwards the compounds are analyzed and the database (CFDB) is accessed for the generation of the COSMO-RS σ-profiles. Then usually the type of calculation is specified and specific parameters (stoichiometry, temperature) can be chosen. In most cases a COSMOtherm calculation is then carried out internally, based on the σ-profiles generated before, and the results are presented in tabulated and in graphical form.

2.1 Solubility
46. or the fragmentation approach is the composition of the σ-profile of a new molecule from existing σ-profiles of molecules that have already been pre-calculated. Currently there are more than 111 000 diverse molecules stored within the CFDB. Thus there is no need for quantum chemical calculations prior to COSMO-RS calculations of a new molecule. The drawback is a small loss of accuracy for molecules which are composed of several fragments from the CFDB. If a new molecule is fragmented into many CFDB molecules, it may be badly represented. Therefore the number and quality of the fragments used for a fragmentation (i.e. σ-profile generation) calculation should be monitored (see section 3.8).

1.2 What is a SMILES string and how to get one?
COSMOquick relies to a large extent on SMILES strings, which are used as molecular input for all calculations. SMILES stands for Simplified Molecular Input Line Entry Specification. It allows the structure of molecules to be described using comparatively short ASCII codes. Examples for some simple compounds are:
Propane: CCC
Ethanol: CCO
Oxalic acid: C(C(=O)O)(=O)O
Within COSMOquick they may be obtained with the 2D structure editor, which automatically creates a SMILES string for the user, or via the web service which can be found under TOOLS in the menu. Molecules encoded in the InChI (IUPAC International Chemical Identifier) format can be loaded with the 2D structure editor, which will convert them into a
47. [Screenshot, continued: file dialog listing the example files of the exampledata directory, with compoundlist_paracetamol.smi selected; the supported file types are SMILES files (smi), SD files (sdf, sd), CQ fragmentation (frg) and text files (txt).]
You will now find a list of SMILES strings and compound names in the lower area of the compound input. You can add a compound by adding a new line in the text area and typing a name or a SMILES string. For example, type diethylether and glycerine there. In the case of glycerine no SMILES is found in the internal database and the entry is marked red. If you are connected to the web, the button Manage compounds allows you to use a web service to look up the SMILES automatically. You may also add a compound by drawing it with the 2D structure editor. The editor will automatically generate a SMILES string for you, which you can add to the compound setup. After you have created a suitable list of molecules, select the Next button at the bottom. Now a fragmentation is initiated and the CFDB is accessed, which may take a while. After it is finished the screen should look like:
48. ows for quick generation of σ-profiles, avoiding costly quantum chemical calculations. It relies on a database of previously computed σ-profiles for a set of about 111 000 compounds (COSMOfrag database, CFDB). Those instantaneously generated σ-profiles can be used to perform COSMOtherm-like calculations with only little loss of accuracy. COSMOquick is a shortcut tool mainly designed for the screening of large data sets. For high quality results and accurate predictions we recommend using COSMOtherm together with quantum mechanically derived σ-profiles. COSMOtherm is a full implementation of COSMO-RS theory and is also distributed by COSMOlogic. Currently the following calculation modes can be carried out with COSMOquick:
• Prediction of solubilities with multiple reference solvents and relative solubilities (3.1)
• Cocrystal screening, i.e. fast calculation of excess enthalpies (3.3)
• Prediction of the sorption of small molecules in polymers or solvents (2.3 & 3.9)
• Creation of the σ-profile of an unknown or undetermined compound (could be anything) by using reference solubilities in several solvents (3.4)
• ADME property calculations, i.e. different partition coefficients & water solubility (3.5)
• QSPR calculations using multi-linear regression or random forest based models (3.5)
• Generation and deployment of QSPR models using COSMOquick derived descriptors (3.6)
• Generation of Hansen solubility parameters via solubility predictio
49. rediction and Similarity Screening Based on Quantum Chemistry. J. Chem. Inf. Model. 2005, 45, 1169-1177.
2. Eckert, F. & Klamt, A. Fast solvent screening via quantum chemistry: COSMO-RS approach. AIChE J. 2002, 48, 369-385.
3. Klamt, A. The COSMO and COSMO-RS solvation models. Wiley Interdisciplinary Reviews: Computational Molecular Science 2011, 1, 699-709.
4. Klamt, A., Eckert, F., Hornig, M., Beck, M. E. & Bürger, T. Prediction of aqueous solubility of drugs and pesticides with COSMO-RS. J. Comput. Chem. 2002, 23, 275-281.
5. Abramov, Y. A., Loschen, C. & Klamt, A. Rational coformer or solvent selection for pharmaceutical cocrystallization or desolvation. J. Pharm. Sci. 2012, 101, 3687.
6. Breiman, L. Random Forests. Machine Learning 2001, 45, 5.
7. Freund, Y. & Schapire, R. E. A Decision-Theoretic Generalization of On-line Learning and an Application to Boosting. Journal of Computer and System Sciences 1997, 55, 119.
8. Loschen, C. & Klamt, A. COSMOquick: A Novel Interface for Fast σ-Profile Composition and Its Application to COSMO-RS Solvent Screening Using Multiple Reference Solvents. Ind. & Eng. Chem. Res. 2012, 51, 14303.
9. Hansen, C. M. The three dimensional solubility parameter - key to paint component affinities: Solvents, plasticizers, polymers, and resins. J. Paint Technol. 1967, 39, 104.
50. sation algorithm to update row weights of population and minimize SSE. For the optimization, constraints are used keeping the r ≥ 0. If SSE < threshold, then stop the procedure.
3.5 ADME & QSPR Calculations
The following ADME (Absorption, Distribution, Metabolism, and Excretion) property predictions can currently be carried out with COSMOquick:
• logS (water): calculation of the solubility of a molecule in water
• logKow: calculation of the octanol-water partition coefficient of a molecule
• logKOC: calculation of the organic carbon (soil)-water partition coefficient
• logBB: calculation of the blood-brain partitioning coefficient, i.e. the penetration of the blood-brain barrier
• logKHSA: plasma protein (human serum albumin) partitioning, i.e. the binding to human serum albumin
• logKIA: calculation of the intestinal absorption coefficient
Whereas the water solubility and logKow are calculated on the basis of COSMO-RS theory, the other coefficients are computed via QSPR equations from so-called σ-moments. This set of descriptors is derived from the σ-profile of a compound and can be used to regress almost any kind of partition property. σ-moments may also be useful descriptors for regressing other physico-chemical properties and are printed out in the results tab of those QSPR calculations. For more information on performing ADME calculations with COSMOfrag, please consult reference 1.
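To make the idea of σ-moments as QSPR descriptors concrete, the following minimal sketch computes moments from a discretized σ-profile and feeds them into a linear model. It assumes the commonly used definition M_i = Σ p(σ) σ^i. The function names, the toy profile, and the regression coefficients are hypothetical and do not reproduce the models shipped with COSMOquick.

```python
import numpy as np


def sigma_moments(sigma, profile, max_order=2):
    """Return M_i = sum_k p(sigma_k) * sigma_k**i for i = 0..max_order.

    Assumption: the sigma-profile is given on a discrete sigma grid,
    so the integral over sigma becomes a plain sum.
    """
    sigma = np.asarray(sigma, dtype=float)
    profile = np.asarray(profile, dtype=float)
    return np.array([np.sum(profile * sigma**i) for i in range(max_order + 1)])


def linear_qspr(moments, coefficients, intercept):
    """Evaluate a multilinear model: intercept + sum_i c_i * M_i."""
    return float(intercept + np.dot(coefficients, moments))


# Toy, symmetric sigma-profile on a grid from -0.02 to 0.02 (hypothetical data).
sigma_grid = np.linspace(-0.02, 0.02, 5)
toy_profile = np.array([1.0, 2.0, 4.0, 2.0, 1.0])
moments = sigma_moments(sigma_grid, toy_profile, max_order=2)

# Hypothetical coefficients, for illustration only.
toy_prediction = linear_qspr(moments, coefficients=[0.1, 0.0, 100.0], intercept=-1.0)
```

The zeroth moment is the total surface area represented in the profile, and the first moment vanishes for this symmetric toy profile, which is a quick sanity check when preparing descriptor tables.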
51. We have just loaded an experimental setup from the publication: Granberg, R. A. & Rasmuson, A. C. Solubility of Paracetamol in Pure Solvents. Journal of Chemical & Engineering Data 1999, 44, 1391-1395. Four solvents are now marked as references: CCl4, ethanol, dichloromethane, and propanone. This means that their respective solubilities are used to improve the computed solubility of similar solvents. Please note that you may specify additional solubilities for the other solvents, but only solvents which are marked are considered as references. If you do not specify any reference, a relative solubility calculation is carried out, where all results are related to the solvent which shows the highest solubility. Please keep in mind that this quantity is not an absolute value and may only be used to compare relative solubilities. To add a solvent to this experimental setup, select the checkbox "Add Solvent mixture". An additional area then becomes visible where you can select one or several compounds, choose the composition in mole or mass fraction, and specify an experimental solubility in case there is one.
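The notion of a relative solubility can be illustrated with a short sketch. It assumes that "related to the solvent which shows the highest solubility" means subtracting the best solvent's log10 solubility from all others, so that the best solvent gets 0 and all others negative values. The function name and the numbers are hypothetical, and the exact convention used by the program may differ.

```python
def relative_log_solubilities(log10_x):
    """Shift log10 solubilities so that the best solvent is at 0.

    Assumption: 'relative' means a difference of decadic logarithms
    with respect to the solvent of highest predicted solubility.
    """
    best = max(log10_x.values())
    return {solvent: value - best for solvent, value in log10_x.items()}


# Hypothetical predicted log10(x) values for one solute in three solvents.
predicted = {"ethanol": -1.0, "hexane": -4.0, "thf": -0.5}
relative = relative_log_solubilities(predicted)
ranking = sorted(relative, key=relative.get, reverse=True)
```

Only the ordering and the differences between solvents carry information here; the absolute level is arbitrary by construction.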
52. In addition to ADME properties, a set of physicochemical properties can be computed via QSPR based on COSMOfrag- and COSMOtherm-based descriptors. COSMOquick can interpret QSPR models based on a multilinear regression, on a Random Forest model, or on gradient boosting models (GBM). Those models can be generated, for example, with the statistics program suite R and be deployed in the PROP directory. Due to their inherent size, tree-based model structures like Random Forests or GBMs are saved internally in a compressed format (rfz or gbmz) and unzipped into RAM upon use.
• T_melting.rfz: An empirical random forest model for the prediction of melting points $T_m$ with a cross-validated RMSE accuracy of about 40 K.
• H_fusion.mlr: A multivariate linear regression model for the enthalpy of fusion $\Delta H_{fus}$. It has a cross-validated RMSE accuracy of 2.2 kcal/mol.
• S_fusion.mlr: A multivariate linear regression model for the entropy of fusion $\Delta S_{fus}$. It has a cross-validated RMSE accuracy of 5.81 cal/(mol K).
• G_fusion.rf: A model for the prediction of the free energy of fusion $\Delta G_{fus}$ out of the melting point and the enthalpy of fusion, with an RMSE of 0.8 kcal/mol:
$$\Delta G_{fus} = \Delta H_{fus} \left( 1 - \frac{T}{T_m} \right)$$
The melting point, $\Delta H_{fus}$, and $\Delta G_{fus}$ QSPR models may be used, for example, for the generation of reference data for a solubility calculation. In principle, arbitrary QSPRs may be generated and deployed within COSMOquick. Linear regression based models can also be
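As a worked illustration of how a free energy of fusion can be obtained from a melting point and an enthalpy of fusion, the sketch below uses the common textbook approximation ΔG_fus(T) ≈ ΔH_fus (1 − T/T_m), which neglects the heat capacity change on melting. Whether the G_fusion model uses exactly this form is an assumption, and the function name and input values are hypothetical.

```python
def free_energy_of_fusion(h_fus, t_melt, temperature=298.15):
    """Estimate dG_fus (same unit as h_fus) at the given temperature in K.

    Assumption: dG_fus = dH_fus * (1 - T / T_m), i.e. the entropy of
    fusion is taken as dH_fus / T_m and heat capacity terms are ignored.
    """
    if t_melt <= 0.0:
        raise ValueError("melting point must be a positive temperature in K")
    return h_fus * (1.0 - temperature / t_melt)


# Hypothetical compound: dH_fus = 6.0 kcal/mol, T_m = 596.3 K, evaluated at 298.15 K.
dg_fus_example = free_energy_of_fusion(6.0, 596.3)
```

Because the expression is linear in ΔH_fus, an error in the predicted enthalpy of fusion propagates proportionally, scaled by the factor (1 − T/T_m).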
53. tions reference solvent and the solvent under scrutiny. Thus, the average free energy of fusion $\langle \Delta G_{fus} \rangle$ is calculated from the references and a correction term is obtained:
$$\Delta G_{cor,i} = \Delta G_{fus,i} - \langle \Delta G_{fus} \rangle, \quad i = 1, \ldots, n$$
Then the σ-potential similarity of each new solvent with each reference is computed, and the solvent-specific free energy corrections are calculated:
$$\Delta G_{cor,j} = \sum_{i}^{\text{references}} w_{ij} \, \Delta G_{cor,i}, \quad j = 1, \ldots, m$$
The normalized weighting factors $w_{ij}$ (with $\sum_i w_{ij} = 1$) are determined by the σ-potential similarity of solvent $j$ and reference $i$, where $\mu_i(\sigma)$ and $\mu_j(\sigma)$ are the σ-potentials of reference $i$ and solvent $j$, respectively. To avoid the dominance of just one reference, the weighting factor is smoothed with an exponent λ = 0.5 (CQ exponent). Finally, we obtain the solubility of our solute in solvent $j$ by the following equation:
$$\log_{10} x_j = \frac{\mu^{pure} - \mu^{(j)} - \langle \Delta G_{fus} \rangle - \Delta G_{cor,j}}{RT \ln 10}$$
Please note that the approach will NOT give back the experimental solubilities for the references themselves. Rather, they might get a slightly adapted solubility. COSMOquick checks the correction term $\Delta G_{cor}$: if this correction is too large (currently the threshold is 1.5 kcal/mol), the program gives a warning message. This is a strong hint that the corresponding experimental value is inaccurate and should be checked. It is recommended to use a balanced set of reference solvents. For example, one could use an unpolar solvent like hexane, a donor
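The multiple-reference-solvent scheme described above can be sketched in a few lines. The sketch assumes that the similarities between a new solvent and each reference are already available as positive numbers, that the smoothing is applied as S**λ followed by normalization, and that the solubility follows the usual COSMO-RS form log10 x = (μ_pure − μ_solv − ΔG_fus) / (RT ln 10). All function names and numerical values are hypothetical; only the exponent 0.5 and the 1.5 kcal/mol warning threshold are taken from the text.

```python
import numpy as np

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol K)
WARN_THRESHOLD = 1.5   # kcal/mol, warning threshold quoted in the manual


def reference_corrections(dg_fus_refs):
    """Return the mean dG_fus of the references and each reference's deviation."""
    dg = np.asarray(dg_fus_refs, dtype=float)
    mean = float(dg.mean())
    return mean, dg - mean


def solvent_correction(similarities, dg_cor_refs, exponent=0.5):
    """Similarity-weighted correction for one new solvent.

    Assumption: weights are similarity**exponent, normalized to sum to 1.
    """
    smoothed = np.asarray(similarities, dtype=float) ** exponent
    weights = smoothed / smoothed.sum()
    return float(np.dot(weights, dg_cor_refs)), weights


def log10_solubility(mu_pure, mu_solv, dg_fus_mean, dg_cor, temperature=298.15):
    """log10 of the mole fraction solubility (energies in kcal/mol)."""
    return (mu_pure - mu_solv - dg_fus_mean - dg_cor) / (R_KCAL * temperature * np.log(10.0))


# Hypothetical back-calculated dG_fus values for three reference solvents.
dg_mean, dg_cor_refs = reference_corrections([3.0, 4.0, 5.0])
suspicious = [i for i, c in enumerate(dg_cor_refs) if abs(c) > WARN_THRESHOLD]

# Hypothetical similarities of one new solvent to the three references.
dg_cor_new, weights = solvent_correction([0.25, 0.25, 1.0], dg_cor_refs)
log_x_new = log10_solubility(-10.0, -14.25, dg_mean, dg_cor_new)
```

With an exponent below 1 the most similar reference still dominates, but less strongly than with linear weights: here it receives a weight of 0.5 instead of about 0.67.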
54. u may also be used to specify a cosmo file for the compound, to show the structure, the σ-profile, or the σ-potential, to remove duplicates, etc. The quality of a fragmentation can be assessed by the column "fragments", which becomes visible if the checkbox "Extended info" is selected. It displays the number of fragments which had to be used to generate the σ-profile of a molecule. A large number of fragments is a hint that no similar molecule is available in the CFDB. For a good cocrystal screening, the number of fragments for the API itself should not be too large, otherwise the results may not be accurate. Another indicator for the quality of the fragmentation is the column labeled "frag quality". It contains the average similarity of each atom of the molecule with a similar environment from an entry of the CFDB, ranging from 0 (no similarity) to 9 (identity). Low values indicate a bad fragmentation, and those compounds should be considered only with care for further calculations. A similarity of 9 means that the compounds have been taken in a 1:1 fashion out of the database.
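When the compound details are exported, the two quality indicators can be screened automatically. The sketch below flags compounds whose fragmentation looks unreliable. The record layout and both cut-off values are hypothetical choices for illustration, since the manual gives no numerical limits beyond the 0 to 9 range of the frag quality column.

```python
from dataclasses import dataclass


@dataclass
class CompoundInfo:
    name: str
    fragments: int       # number of fragments used to build the sigma-profile
    frag_quality: float  # average atom similarity, 0 (none) to 9 (identity)


def flag_poor_fragmentations(compounds, max_fragments=5, min_quality=7.0):
    """Return names of compounds that exceed the fragment limit or fall
    below the quality limit. Both limits are hypothetical defaults."""
    return [
        c.name
        for c in compounds
        if c.fragments > max_fragments or c.frag_quality < min_quality
    ]


# Hypothetical exported rows.
rows = [
    CompoundInfo("3-cyanophenol", fragments=1, frag_quality=9.0),
    CompoundInfo("compound_x", fragments=8, frag_quality=8.2),
    CompoundInfo("compound_y", fragments=2, frag_quality=5.5),
]
flagged = flag_poor_fragmentations(rows)
```

A compound with a single fragment and a quality of 9.0 was taken directly from the database and passes any reasonable choice of limits.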
55. uantities should be the same, but the best ranking is usually obtained by the empirical function f_fit. Note that sometimes cocrystal formation is mainly due to an efficient packing in the solid state. Such special cases cannot be predicted by the COSMO-RS approach, which relies solely on liquid-phase interactions. Furthermore, it can never be ruled out that one of the predicted cocrystals was just missed in the chosen experimental setup. A detailed study of coformer screening with COSMO-RS can be found in reference 5. There is a second window available with plots of the computed energies. You may now extract the results either by using copy & paste on the tables (Ctrl+C, Ctrl+V) or by using the export to Excel function.
2.3 Sorption & Solubility in Polymers
This section explains how to compute the sorption of small molecules from the gas phase into a polymer or any other solvent. This property is usually equivalent to the Henry constant of the molecule within the polymer/solvent system. As a byproduct, the vapor pressure and the solvation free energy are computed. If the solvent is a polymer, its repeat unit is described by using halide SMILES characters (see section 3.9).
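The link between the Henry constant, the vapor pressure, and gas solubility can be illustrated with the standard textbook relation H = γ∞ · p_vap (mole fraction basis, ideal gas phase) and Henry's law x = p / H. This is a generic thermodynamic sketch and not necessarily how COSMOquick evaluates these quantities internally. The function names and all numbers are hypothetical.

```python
def henry_constant(gamma_inf, p_vap):
    """Henry constant in the unit of p_vap.

    Assumption: H = gamma_inf * p_vap, with gamma_inf the activity
    coefficient at infinite dilution and an ideal gas phase.
    """
    if gamma_inf <= 0.0 or p_vap <= 0.0:
        raise ValueError("gamma_inf and p_vap must be positive")
    return gamma_inf * p_vap


def gas_solubility(partial_pressure, henry):
    """Mole fraction of dissolved gas from Henry's law, x = p / H."""
    return partial_pressure / henry


# Hypothetical sorbate: gamma_inf = 2.0, vapor pressure 0.5 bar.
h_example = henry_constant(2.0, 0.5)        # in bar
x_example = gas_solubility(0.1, h_example)  # at 0.1 bar partial pressure
```

Henry's law in this form is only meaningful at low sorbate concentrations, so results with mole fractions approaching 1 indicate that the linear regime has been left.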