Structural - codessa pro project
Contents
1. CODESSA PRO: COmprehensive DEscriptors for Structural and Statistical Analysis
User's Manual
by Alan R. Katritzky, Ruslan Petrukhin, Inna Petrukhina, Andre Lomaka, and Douglas B. Tatham (University of Florida) and Mati Karelson (University of Tartu, Estonia)
Gainesville, 2001

Contents
Introduction
Concepts and Definitions
Starting a New Project
Creating Structure Files
Creating a Property File
Workspace
Work Area
Structure View Window
Correlation View Window
Property Window
Log Window
Manipulating Lists
Calculating Correlations
Viewing Correlations
2. Property Window
The purpose of the window is to log all operations on the storage.

Tool Bar
Shortcuts to commonly performed tasks. The task will be displayed if the pointer is positioned over the icon.

Workspace
The workspace area contains an on-screen presentation of the current cache file, which is essentially a directory tree. Clicking on the minus will collapse a folder (branch of the tree), and clicking on a plus will expand a folder. When a folder is expanded (note the Properties folder), it will show the lists it contains (note the All list). The same icon is used to represent lists as folders because they are also used to contain files. If a list is empty, there will be no plus or minus beside it.

[Screenshot: workspace tree with callouts "Click here to expand folder" and "Click here to collapse folder". The Structures folder and its All list are expanded, showing the structure artifacts 4-Nitropyridine N-oxide, Acenaphthene, Anthracene, Benzil, Carbazole, Diphenyl sulfone, Diuron, Ferrocene, Fluoranthene, Monouron, Naphthalene, Phenanthrene, Pyrene, and S0000005.mol.]

The Structures folder is expanded to show the list All. The list All is also expanded to show the structure artifacts. Notice the icon beside the structure artifact. Each type of artifact will have a different icon. The name that is listed will be the same as the name given
3. [Log window screenshot, continued. Columns as extracted: B, s, t, Ic, name of descriptor.]
3.90515, 0.0914958, Vib heat capacity (300K)
Outliers are selected. Number of outliers is 2.

Correlation: N = 37, n = 2, R2 = 0.770255, R2cv = 0.719299, F = 56.995, s2 = 0.0706459
Comments: R2 = 0.7703, R2cv = 0.7193, F = 57.00, s = 0.0437, N = 2, n = 37
Ranges: Observed 7.35 to 9.17; Predicted 7.38056 to 9.37866
B: 12.4111, s: 0.515973, t: 24.0538, name: Intercept
B: 0.297124, s: 0.0389097, t: 7.63623, Ic: 0.152585, name: HOMO-LUMO energy gap
B: 0.00796042, s: 0.00204845, t: 3.88607, Ic: 0.152585, name: Complementary Information content (order 2)
Outliers are selected. Number of outliers is 2.

The Edit pull-down menu has options to clear the log, paste to the log, select all of the log, and copy from the log. To transfer the information for several correlations to a word processing program, clear the log, double-click on each of the correlations, click Select All, then Copy.

Manipulating Lists
CODESSA PRO automatically creates one or more lists in each folder, but when analyzing a property or correlation it is sometimes helpful to create a user-defined list (e.g., a list of structures containing phenyl rings). The simplest method for creating a list in CODESSA PRO is to select several artifacts from one or more existing lists (e.g., All), right-click to open the context pop-up menu, and left-click on Create List. A list can also be formed from a group of artifacts that are linked to a common artifact. If a descriptor i
4. corresponds to the point to open in the work area. Right-clicking inside the window will open a context pop-up menu that can open the structure view windows for each of the structures used in the correlation, open the correlations for each descriptor, create lists, and mark structures. Double-clicking on a descriptor artifact will also open a correlation view window. The window will show the relationship between the descriptor and the selected property dimension.

Property Window
The item selected determines the contents of the property window. If a folder or list is selected in the workspace, the number of artifacts contained is given. If an artifact is selected, the window provides varying information depending on the type of artifact.

[Property window screenshot]
Name: Ferrocene
Comment:
File name: S0000001.mol
Number of atoms: 21
Number of bonds: 30
Number of calculated descriptors: 220
Number of experimental properties: 45

Pictured is the information provided when a descriptor artifact is selected in the workspace.

[Property window screenshot]
Name: logL for trans-stilbene
Comment: logL for trans-stilbene
Defined for structures: 52

The picture on the left shows information given in the property window when a structure artifact is selected.

[Property window screenshot]
Name: 1X BETA polarizability (DIP)
Comment:
Short name: DO_MNO_POL_B
5. on the first line of the molfile. The last structure artifact gives the filename of the molfile because the name was not entered on the first line of the molfile. Double-clicking on an artifact launches a structure view window in the work area.

[Screenshot: workspace tree with the Descriptors folder expanded, showing the All, External, and Constitutional lists and the descriptor artifacts Average atom weight, Molecular weight, Number of benzene rings, Number of Br atoms, Number of double bonds, Number of F atoms, and Number of H atoms.]

The Descriptors folder is expanded to show the All, External, and Constitutional lists as well as several Constitutional descriptor artifacts. The All list is in boldface font because it is the currently selected analysis dimension for descriptors. Notice that the descriptor icon is different from the structure icons pictured above. If an artifact is selected, its color will change from black to blue, as can be seen with the Molecular Weight artifact.

[Screenshot: workspace tree with the Properties folder and its All list expanded, showing property artifacts named "logL for ...", for example logL for 1,1,1-trichloroethane, logL for 1,1,2,2-tetrachloroethane, and logL for 1,1,2-trichloroethane.]
6. 1
Group: Semi-empirical
Type: Molecular typed
SubType 1: N/A
SubType 2: N/A
Defined for structures: 231

When a property artifact is selected, the information from the prp file is shown; this includes the name, the comments, and the number of structures.

[Property window screenshot]
Name: N = 52, n = 5, R2 = 0.916359
Comment: For property logL for trans-stilbene
Number of descriptors: 5
Matched sublist of structures: logL for trans-stilbene
Number of structures: 52
Squared correlation coefficient: 0.916359
Squared correlation coefficient: 0.870163
Fisher criteria: 100.793776
Standard error: 0.024032

The property window for correlation artifacts lists the details of the correlation. Details include the properties used in the name of the correlation and the property name.

Log Window
The log window describes each operation that has been performed on the storage. Double-clicking on a correlation in the workspace will give a full description of the correlation, as pictured below. The text is in RTF format and can be cut and pasted directly into a word processing program.

[Log window screenshot. Columns as extracted: B, s, t, Ic, name of descriptor.]
Comments: R2 = 0.7709, R2cv = 0.7105, F = 57.22, s = 0.0436, N = 2, n = 37
Ranges: Observed 7.35 to 9.17; Predicted 7.5001 to 9.40447
B: 12.8032, s: 0.495462, t: 25.8399, name: Intercept
B: 0.311865, s: 0.037522, t: 8.31153, Ic: 0.0914958, name: HOMO-LUMO energy gap
B: 0.0196488, s: 0.0050315
7. [Screenshot, continued: the Result box of the List Logic dialog lists descriptors named "Max partial charge (Zefirov) for ..." atoms.]

The List Logic module in CODESSA PRO compares the artifacts contained in two lists. Open it by clicking the List pull-down menu, then left-clicking on Logic. Choose the type of artifact that the list will contain by clicking the appropriate button in the Sub List Type box (Descriptors is selected in the example above). Choose the desired lists for comparison and the appropriate Operation. The artifacts from the two lists that meet the criteria of the operation will be displayed in the Results box. The results can then be copied to a new list (Create) or be selected (Select) for further comparisons. When comparing three or more lists, first compare two of the lists, click on Select, restart List Logic, select Current Selection in The First List box, then select the remaining list in The Second List box.

Calculating Descriptors
One of the major advantages of CODESSA is its tremendous pool of descriptors, which are calculated for each of the structures listed in the prp file. The descriptors are divided into ten lists. The first list contains e
8. al storage area for many projects or separate storage areas for each individual project. A general storage area allows a research group to avoid repeating the calculations of structures and descriptors and minimizes the amount of storage space used. If a general storage area is to be used, it will be necessary to keep an index of the structure and molecule names. To prevent confusion, exact IUPAC names should be used. Additionally, the names of the molecules should be inserted in the name section of the molfile by opening the file with a text editor and entering the name on the first line and comments on the third line.

Creating structure files

Molfiles
Structures can be created using a chemical drawing program that will convert 2-D structures to 3-D structures, e.g., HyperChem, ISIS Draw, ChemDraw, or PCModel. The 3-D geometries must be optimized in a two-step process using MOPAC. Each step involves adding appropriate keywords to the structure file, running MOPAC, and then converting the optimized output file into a MOPAC internal file. The conversion can be performed manually or by using a utility such as Babel (http://www.ccl.net/cca/software/MS-WIN95-NT/babel/index.shtml). The first optimization uses the keyword AM1, and the second optimization uses the keywords AM1 GNORM=0.01 PRECISE. The additional keyword MMOK should be used for any molecule with peptide linkages. If either optimization fails, it should be repeated using additional keywo
9. cern. Each property value must be associated with a structure located in the Structures folder.

Descriptors
Descriptors are defined as numerical characteristics associated with chemical structures. They are derived on the basis of the structure's chemical constitution, topology, geometry, and inherent wavefunction and potential energy surface. The values of a particular descriptor can be provided by the user or calculated by the CODESSA program. Each descriptor value must be associated with a previously defined structure. Descriptors calculated by the CODESSA program are named automatically; renaming descriptors is not recommended.

Correlation
A correlation represents the results of the multi-linear regression between a property of interest (y) and one or more selected descriptors (x_i): $y = a_0 + \sum_i a_i x_i$. Correlations are composed of regression coefficients (a_i), a squared correlation coefficient (R2), a standard error (s), and a Fisher criterion (F) value for the set of structures used in its derivation. By default, each correlation is named by its number of structures (N), squared correlation coefficient (R2), cross-validated squared correlation coefficient (R2cv), Fisher criterion value (F), and standard error (s).

Artifacts
Items used by or created by CODESSA PRO. The four types of artifacts are structure, property, descriptor, and correlation.

Starting a new project
It is important to decide whether to use a gener
10. Work Area

Structure View Window

[Screenshot: structure view window showing Diphenyl sulfone in wire-frame mode.]

Double-clicking on a structure artifact in the Structures folder opens a structure view window in the work area. Structures are 3-dimensional and can be viewed as wire frame, ball and stick, CPK surface filled, or solvent-accessible surface. Right-clicking inside the window opens the View and Label context pop-up menu. Selecting View provides the four options mentioned earlier, and Label identifies the number of each atom or the atom type. When the pointer is positioned over the view window, it changes to a four-headed arrow that can be dragged to rotate the structure.

Correlation View Window

[Screenshot: correlation view window for the correlation N = 35, n = 5, R2 = 0.950961, R2cv = 0.931197, F = 112.473, s2 = 0.00862279. Plot title: Observed vs predicted logL for thioxanthen-9-one. One axis is labeled Observed.]

Double-clicking on a correlation artifact opens the correlation view window. The graph shows the observed (experimental) values versus the predicted values for the property. When the window is first opened, the points for all outliers will be blue rather than yellow. The color of the points will change to blue to indicate they have been selected. Clicking once on a point will cause its properties and their associated values to be displayed in the property window. Double-clicking on a point will cause a structure view window for the structure that
11. e sequence you would accomplish them. This manual also provides further information concerning the features and methods available in the CODESSA PRO program, their purpose, and the interpretation of the results. For a more detailed description of the program and the techniques employed, please refer to the CODESSA PRO User's Manual and Reference Manual.

The CODESSA program is designed to operate in the following Microsoft Windows environments: Windows 2000, Windows NT, and Windows 9x. To start CODESSA, double-click on its icon (shown above). When the program starts, the CODESSA PRO Visual Interface (CVI) window will open. Click View on the menu bar to open a pull-down menu, then click on the Refresh option to refresh the current snapshot, which is then displayed on the screen. Alternatively, the single keystroke F5 will refresh the snapshot. Before attempting to use CODESSA PRO, it is necessary to understand some of the concepts and terminology used by the program.

Concepts and Definitions
The use of CODESSA PRO requires an understanding of the following concepts.

Artifacts
The individual files or items used by CODESSA PRO. There are four types of artifacts: structures, descriptors, properties, and correlations. Using the CODESSA PRO program, you will frequently manipulate single artifacts or lists of artifacts. The names of properties are declared in the property (prp) file, and each must have a uniquely defined na
12. Introduction
CODESSA PRO is an entirely new software package that performs tasks similar to CODESSA but with many distinct advantages over the previous software package. In particular, its add-in mechanism makes CODESSA PRO expandable; its calculation engine has been optimized at the assembly language level for Pentium, Pentium Pro, Pentium II, Pentium III, and Pentium IV processors; and it was designed to run in 32-bit Windows environments.

CODESSA (Comprehensive Descriptors for Structural and Statistical Analysis) is a comprehensive program for developing quantitative structure-property relationships (QSPR). It integrates all necessary mathematical and computational tools to:
- calculate a large variety of molecular descriptors on the basis of the 3D geometrical and/or quantum-chemical structural input of chemical compounds;
- develop multi-linear and non-linear QSPR models of the chemical, physical, or biological properties of individual compounds;
- perform cluster analyses of the experimental data and molecular descriptors;
- interpret the developed models;
- predict properties for compounds previously unknown or unavailable.

This manual presents the guidelines for the successful development of QSAR/QSPR models using the CODESSA PRO program. The execution of each step of the program normally requires knowledge of the preceding steps, and thus they are discussed in th
13. is generated automatically by the system when a property item is created. When correlations are calculated, 50 correlation items (by default) will be stored in the list that corresponds to the property.

List
A list is a collection of chemical structures, descriptors, physico-chemical properties, or models (correlations). Lists can be either system type or user type. System lists cannot be deleted by the user.

Current Analysis Dimensions
The dimensions used for an analysis by CODESSA PRO are the property, the list of descriptors, and the list of structures. All the dimensions are selected from the Dimensions pull-down menu. The default dimensions for a new snapshot file are:

Dimension: Default value
Property: Not selected
List of descriptors: All group
List of structures: All group

The selected dimensions are used in the formation of the descriptor-property matrix.

Descriptor-Property Matrix
The descriptor-property matrix consists of descriptor values and property values (the last column). The horizontal dimension of the matrix is the descriptor/property ID sequence, and the vertical dimension is the structure ID sequence. The matrix has two presentations: binary and text. The binary presentation is for internal use, while the text presentation is optimized for import into the STATISTICA software package.

Property
A property is a physical or chemical characteristic, biological activity, or other characteristic of con
14. le system in the storage. To change a storage location, click Option on the main menu bar and select Storage. The box that opens allows the user to control the storage location of structure files, correlations, descriptors, and lists.

Snapshot (Cache File)
The snapshot (the CODESSA PRO cache file) is a binary, compressed representation of all items in storage. This file is held in memory whenever work is being performed in the CVI module. The cache file reloads whenever the CVI detects a change in the storage contents. The snapshot can be reloaded manually at any time by opening the View pull-down menu on the menu bar and selecting Refresh, or by using the F5 keyboard shortcut.

Workspace
The workspace is a window within the CVI that represents the snapshot. It is located in the workspace area on the top left side of the CVI frame.

Folder
A folder is used to store lists and is similar to a directory. CODESSA PRO uses four folders: structure, descriptor, property, and model (correlation). Each of the first three folders has a system-generated list named All. This list will contain all of the structures, descriptors, properties, or models for the corresponding folder. The Descriptors folder also includes system-defined descriptor lists according to group and type. A particular descriptor item may be contained in several lists. The Correlations folder contains lists for each property item in the property folder. The list
15. me. CODESSA PRO names most artifacts (e.g., descriptors and correlations). Structures are currently named by the operator, but when the CMol3D module is complete it will automatically create and name structures.

Structure
A representation of an individual chemical object with a precise chemical constitution. Examples of structures include a single molecule, a monomeric unit of a polymer, or a molecular complex of a definite composition. The minimum information that a structure must include is the types of atoms involved and their connectivity. Each structure must be linked with three files containing 3-D structure, SCF, and Force information. Before a 3-dimensional structure can be input in CODESSA PRO, it must be converted to MDL molfile format and be optimized with a gradient norm of 0.01 or better. The molfile structures are stored in the mol3dmop directory, SCF output files are stored in the mopscf directory, and Force output files are stored in the moptherm directory. SCF and Force structures will be created and properly stored automatically by the CMol3D module. See Starting a New Project for instructions to create each file type. It is vital that the name for a particular structure be exactly the same in each directory. Structure names must be in the form S0000001.xxx and should be added in sequential order, e.g., S0000001.xxx, S0000002.xxx.

Storage
All items that are connected with a project are stored in a single location within the fi
16. rds until successful. The following list contains the progression of additional keywords to add:
EF
XYZ
EF XYZ
EF XYZ GEO-OK
If none of the keywords or combinations above work, select a different starting point for the z-matrix, rearrange atoms accordingly, or add a dummy atom and create a new z-matrix. Dummy atoms must be deleted prior to performing SCF and thermodynamic calculations. The optimized structures should be converted to molfile format, given the extension mol, and stored in the mol3dmop directory. The first line of the molfile will be blank after MOPAC optimization; insert the name of the molecule in the first line and comments on the third line.

SCF files
Convert the molfile structures to MOPAC internal files and add the keywords AM1 VECTORS BONDS PI POLAR PRECISE ENPART EF. Perform the MOPAC calculation and save the files with an mno extension in the mopscf directory.

Force Files
Convert the molfile structures to MOPAC internal files and add the keywords AM1 FORCE PRECISE THERMO ROT=1. Perform the MOPAC calculation and save the files with an mno extension in the moptherm directory.

Creating a property file
Once the property value of interest has been collected and carefully checked for each structure, the data must be assembled in a text (prp) file. Spreadsheet programs may be used to simplify the process of assembling data, but the file should be saved as text. The format of the file mu
17. s chosen to be the common artifact, then all the structures that are defined for that descriptor could be selected for a list. To create the list from a common artifact, select the common artifact, open the context pop-up menu, click on Select Linked, and then select the type of artifacts that you want to add to the list. If multiple common artifacts are selected, the list that is created corresponds to creating a list for each common artifact and then performing the AND operation (see below).

[Screenshot: List Logic dialog. Sub List Type options: Structures, Descriptors, Properties, Models. The First List and The Second List boxes each offer the descriptor lists All, External, Constitutional, Topological, Geometrical, Charge Related, Semi-Empirical, Thermodynamical, Molecular Typed, Atomic Typed, and Bond Typed. Operation options: OR, AND, NOT. The Result box lists descriptors named "Max partial charge (Zefirov) for ...", and each box reports the number of selected items.]
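The AND, OR, and NOT operations of List Logic behave like set operations on artifact names. The following is a minimal sketch in Python, not part of CODESSA PRO: the function name list_logic is hypothetical, the descriptor names are taken from the workspace screenshots for illustration only, and NOT is assumed to mean "in the first list but not in the second", which the manual does not spell out.

```python
def list_logic(first, second, operation):
    """Combine two lists of artifact names with AND, OR, or NOT.

    Hypothetical helper for illustration; CODESSA PRO does this in its
    List Logic dialog, not through a Python API.
    """
    a, b = set(first), set(second)
    if operation == "AND":
        result = a & b          # artifacts present in both lists
    elif operation == "OR":
        result = a | b          # artifacts present in either list
    elif operation == "NOT":
        result = a - b          # assumption: in the first list, not the second
    else:
        raise ValueError(f"unknown operation: {operation!r}")
    return sorted(result)


# Illustrative lists (names borrowed from the screenshots, membership invented).
constitutional = ["Average atom weight", "Molecular weight", "Number of F atoms"]
user_list_1 = ["Molecular weight", "Number of F atoms", "Number of double bonds"]
user_list_2 = ["Number of F atoms", "Number of benzene rings"]

# Comparing three lists: combine two first, then reuse the result as the
# first list, mirroring the "Current Selection" step described in the manual.
selection = list_logic(constitutional, user_list_1, "AND")
print(list_logic(selection, user_list_2, "AND"))
```

Chaining the calls mirrors the manual's procedure for three or more lists: the intermediate result plays the role of Current Selection in The First List box.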
18. st be exactly as follows:

    Name of property file
    Comments
    Structure1 Value1
    Structure2 Value2
    Structure3 Value3

Name of property file: The name that will be assigned to the property artifact. The artifact will be contained within one or more lists in the property folder. A list with the same name will also be generated automatically by CODESSA PRO and stored in the correlations folder.
Comments: Any comments written in this location will be listed in the comments section of the property window whenever the property is selected.
Structure1: The number of the first structure (not the filename). Leading zeros in the structure number are ignored; the structure S0000023.mol could be listed as 00023, 023, or just 23.
Value1: The property value associated with Structure1. There must be white space (space or tab) between the structure number and the value. The type or number of white spaces is irrelevant so long as some type of space exists.

CODESSA PRO Visual Interface

[Screenshot: the CVI window with callouts for the Menu Bar (File, Edit, View, Lists, Dimensions, Calculate, Option, Window, Help), Tool Bar, Workspace, Property Window, and Status Bar.]

Workspace: An on-screen presentation of the current cache file.
Work Area: The work area is the screen space for the various view windows.
Property Window: The property window shows information about the properties of the selected object and is available for almost all objects.
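The property file layout described above (name line, comments line, then one "structure number, white space, value" pair per line) can be read with a few lines of code. This is a hedged sketch rather than CODESSA PRO's own reader: the function names parse_prp and structure_filename are hypothetical, blank lines are skipped as an assumption, and the sample values are invented for illustration.

```python
def parse_prp(text):
    """Parse property-file text into (name, comment, {structure number: value}).

    Assumes the layout given in the manual: line 1 is the property name,
    line 2 is the comment, and every later line is "<number> <value>".
    """
    lines = text.splitlines()
    if len(lines) < 2:
        raise ValueError("a prp file needs a name line and a comments line")
    name, comment = lines[0].strip(), lines[1].strip()
    values = {}
    for line_no, line in enumerate(lines[2:], start=3):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        fields = line.split()  # any run of spaces or tabs separates the fields
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected '<number> <value>'")
        # int() drops leading zeros, so 00023, 023 and 23 are the same structure
        values[int(fields[0])] = float(fields[1])
    return name, comment, values


def structure_filename(number, extension="mol"):
    """Build the S0000001.xxx style filename for a structure number."""
    return f"S{number:07d}.{extension}"


# Invented sample data in the documented layout.
sample = "logL for ferrocene\nexample comment\n00023 7.35\n7\t9.17\n"
name, comment, values = parse_prp(sample)
for number in sorted(values):
    print(structure_filename(number), values[number])
```

Mapping each number back to a filename makes it easy to check that every listed structure actually exists in the mol3dmop directory before running a calculation.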
19. tors.
6. To calculate descriptors for a subset of structures, select the user-defined structures list, click the Dimensions pull-down menu, then List of structures.
7. Click the Calculate pull-down menu, then Descriptors.

The order for the Calculate commands is Descriptors, Form matrix, then HMPRO. If HMPRO is selected without first running the prior commands, CODESSA PRO will automatically perform the preceding operations.

Calculating Correlations
Calculating correlations in CODESSA PRO is even easier than calculating descriptors.
1. Ensure that the property of interest is in boldface. If not, select the property so that it is highlighted and click the Dimensions pull-down menu, then Property.
2. Click on the Calculate pull-down menu and then Form matrix.
3. Click on the Calculate pull-down menu and then HMPRO.

Viewing Correlations
1. Expand the Correlations folder.
2. Expand the correlation list with the same name as the property of interest. The default number of correlation artifacts in each correlation list is 50, but this can be altered by editing the settings in the HMPRO par file.
3. Double-click on a correlation artifact to open the correlation view window in the work area. A description of the correlation will be displayed in the property window, with the complete information in the log window.
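The statistics that name each correlation (R2, R2cv, F, and the standard error) can be illustrated for the simplest case of one descriptor. The sketch below is plain Python and makes several assumptions: it fits a single descriptor only, it takes R2cv to be a leave-one-out cross-validated value (the manual does not state which scheme HMPRO uses), and the function names and data are invented for illustration.

```python
def fit_one_descriptor(x, y):
    """Ordinary least squares for y = a0 + a1 * x; returns (a0, a1)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def correlation_summary(x, y):
    """Return intercept, slope, R2, R2cv, F and s2 for a one-descriptor model."""
    n = len(x)
    a0, a1 = fit_one_descriptor(x, y)
    mean_y = sum(y) / n
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum((yi - (a0 + a1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = ss_res / (n - 2)            # residual variance, two fitted parameters
    f_value = (ss_tot - ss_res) / s2  # one regression degree of freedom
    # Leave-one-out cross-validation (assumed scheme): refit without each
    # structure and predict the structure that was left out.
    press = 0.0
    for i in range(n):
        b0, b1 = fit_one_descriptor(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (b0 + b1 * x[i])) ** 2
    return {
        "intercept": a0,
        "slope": a1,
        "R2": 1.0 - ss_res / ss_tot,
        "R2cv": 1.0 - press / ss_tot,
        "F": f_value,
        "s2": s2,
    }


# Invented descriptor and property values for five structures.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
observed = [2.1, 3.9, 6.2, 7.8, 10.1]
print(correlation_summary(descriptor, observed))
```

A cross-validated R2 that falls well below R2 suggests that the model depends heavily on a few structures, which is one reason to inspect the outliers marked in the correlation view window.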
20. [Screenshot, continued: further property artifacts in the Properties folder, such as logL for 1,1-dichloroethane.]

The Correlations folder has been expanded to display several lists and artifacts. A list is created automatically for each property artifact. Notice that there is no plus or minus beside the first list. This indicates that no model for the property has been calculated and the list contains no artifacts. If the pointer is held over a correlation artifact, the details of the artifact are displayed. Double-clicking on an artifact launches the correlation view.

The Properties folder is expanded to show the All list and several property artifacts. Notice that the fourth property is in boldface font, which indicates that it is the currently selected property dimension. Notice the icon used to represent property artifacts. The name of the property artifact is the same as the first line of the property (P0000004.prp) file. The artifacts are created automatically for each property file stored in the props subdirectory.

[Screenshot: workspace tree with the Structures, Descriptors, Properties, and Correlations folders. The Correlations folder shows the lists "For logL for ferrocene" (no plus or minus) and "For logL for trans-stilbene" (expanded), the latter containing one-descriptor correlations with N = 52, n = 1 and R2 values from about 0.57 down to about 0.54. The tooltip for one artifact reads N = 52, n = 1, R2 = 0.554681, R2cv = 0.522693, F = 62.279, s2 = 0.117717.]
21. xternal descriptors. The remaining lists can be divided into two categories. The first category divides descriptors into lists based on the type of calculation, while the second category divides the descriptors into lists according to the portion of the molecule being described. The first category contains the following lists: constitutional, topological, geometrical, charge-related, semi-empirical, and thermodynamic. The second contains the remaining lists: molecular type, atomic type, and bond type. For more detailed information on the descriptors available in the program, refer to the CODESSA PRO Reference Manual.

At this point it is assumed that the structures and property files have been saved in the appropriate directories. Calculation of descriptors is considerably faster than with the previous versions of CODESSA. Calculating descriptors in CODESSA is simple:
1. Expand the list in the properties folder that contains the property artifact of interest.
2. Select the property artifact (it should be highlighted).
3. Click the Dimensions pull-down menu and click on Property (the property artifact should now be in boldface).
4. The default setting is to calculate all descriptors for all structures listed in the prp file. Skip to step 7 if using default settings.
5. To calculate a subset of the possible set of descriptors, select the automatically generated or user-defined descriptors list, click the Dimensions pull-down menu, then List of descrip