ISIDA USER MANUAL

1. Using a MSK file: to create input files for ASNN, WEKA or ISIDA QSPR, the user has to save the work in a MSK file with the SAVE MSK FILE button.
Using SD files: to use ISIDA KNN or ISIDA Cluster, the user has to save the work in two SD files, one containing the compounds of the training set and one for the test set, with the Training Set export SDF and Test Set export SDF buttons. The External Set export SDF button is not finished.
Clusters: not finished.
o Export data and convert into different formats
[Figure: DESCRIPTORS CALCULATION window for the dataset Ionic Liquids ImBr NR4Br PyBr_TRAI, with the Fragmentation Settings, Variables selection and Files convertor panels]
Click on Files convertor and select the pr
2. -> big distances. TANIMOTO MODE: red -> blue corresponds to non-similar -> very similar. The histograms on the right show the distribution of distances over all pairs of compounds. In the example below there are few minimal and maximal distances but a large number of intermediate distances.
o Property explorer
[Figure: ISIDA CLUSTER Clusters explorer, Property Explorer page]
The user can search for a cluster with a homogeneous activity. The property explorer shows the property classes inside each cluster: select the property and a number of classes, then click on Show Property. This module is not finished.
[Figure: ISIDA k Nearest Neighbors window with the DATASETS panel (TRAINING set, TEST set, Use an external set) and the RUN NEW CALCULATION and LOAD MODELS buttons]
Fig X. Control panel of the ISIDA KNN program
o Initial parameters
Three sets are required for the modelling: (i) the training set, (ii) the internal test set and (iii) the external set. Three Open dialog boxes can be opened by clicking on the Open buttons to select the corresponding datasets in the appropriate directory. The program then analyses the SD file to find all available fields, so that the user can choose the property to model in the combobox. To launch the KNN calculations with the default parameters, click the Run new calculation button.
o
3. Recommendation: use the Files convertor together with the Variables selection.
o Splitting data into training/test sets
Click on the Files convertor button to access all available options, then click on the Mask Editor button.
[Figure: MASK Builder window with the grid of numbered compounds and the Test Set export SDF and External Set export SDF buttons]
In the left part of the window, a grid of white squares represents the compounds of the entire set. The Mask Editor lets the user split the data into subsets. The MANUAL SELECTION ON/OFF button activates or deactivates the manual selection of compounds with the mouse. A left click on any compound colors the corresponding square gray and puts the selected compound into the test set (WHITE = training set, GRAY = test set).
Radiobutton 1: put every i-th compound in the test set, starting from compound number j.
Radiobutton 2: put the first i compounds in the test set.
Radiobutton 3: put the last i compounds in the test set.
Random selection: the user can also give the number of compounds in the test set, or the percentage of compounds of the entire set. From this number, the program randomly selects the desired number of compounds. The random selection for the external set is not finished.
When the splitting is finished, two saving modes are available
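The four splitting modes of the Mask Editor can be pictured with a minimal Python sketch. It is not ISIDA code: the function names, the 1-based compound numbering and the `seed` argument are assumptions made only for illustration.

```python
import random

def every_ith(n, i, j):
    """Radiobutton 1: every i-th compound, starting from compound number j."""
    return set(range(j, n + 1, i))

def first_i(n, i):
    """Radiobutton 2: the first i compounds."""
    return set(range(1, i + 1))

def last_i(n, i):
    """Radiobutton 3: the last i compounds."""
    return set(range(n - i + 1, n + 1))

def random_test(n, count=None, percent=None, seed=0):
    """Random selection from a number of compounds or a percentage of the set."""
    if count is None:
        count = round(n * percent / 100)
    rng = random.Random(seed)  # hypothetical seed, only to make the sketch repeatable
    return set(rng.sample(range(1, n + 1), count))

def mask(n, test_ids):
    """One label per compound, like the white/gray grid of the Mask Editor."""
    return ["test" if k in test_ids else "training" for k in range(1, n + 1)]
```

For example, `every_ith(10, 3, 2)` puts compounds 2, 5 and 8 in the test set, and `mask(10, first_i(10, 3))` labels the first three compounds as test and the rest as training.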
4. logS value; a numeric field containing the predicted logP value; a textual field containing Soluble or Insoluble.
8 ISIDA DC models (in progress)
5. 040  0.002  2.136  11  14
(Rows of the example statistics file, in the order N, Descriptor, Classes, R, R2, Freq, Nb Mols, Nb Frags. Bond symbols in the fragment names were lost in extraction; "?" marks an illegible cell.)
 3  CTCTNTC    4   0.430  0.185  ?        130  260
 4  S C N C    3   0.005  0.000  0.971    5    ?
 5  ?          2   0.003  0.000  0.583    3    3
 6  c c c o    4   0.002  0.000  11.456   59   68
 7  C C C N    7   0.074  0.005  59.417   306  441
 8  C C C O    3   0.167  0.028  16.505   85   124
 9  ClCCBr     2   0.089  0.008  0.194    1    4
10  c c C C    4   0.061  0.004  4.854    23   32
11  N c c      3   0.058  0.003  0.388    2    3
12  C C N C    10  0.165  0.027  33.204   171  994
13  C C C O    3   0.276  0.076  21.359   110  118
14  N C C C C  2   0.184  0.034  28.544   147  147
15  NC C N     2   0.078  0.006  ?        4    ?
16  NCCF       3   0.081  0.006  0.583    3    13
17  C C C N    3   0.053  0.003  2.524    13   23
18  CC         42  0.109  0.012  99.806   514  6953
19  CF         3   0.051  0.003  0.971    3    9
20  C C N N    2   0.078  0.006  ?        4    ?
21  NCCN       4   0.155  0.024  4.660    24   28
22  NCCO       6   0.150  0.023  39.806   205  287
23  oco        3   0.006  0.000  19.223   99   113
24  clcccl     3   0.129  0.017  3.883    20   27
25  c c        3   0.036  0.001  2.136    11   13
26  c c N N    2   0.065  0.004  0.971    5    3
27  CN         13  0.388  0.151  100.000  ?    ?
The file contains: the name of the set; the number of compounds; the number of fragments; the name of the studied property; and, for each descriptor: the symbol of the fragment (ex: C C C), the number of descriptor classes (number of different occurrences taken by the descriptor), the correlation coefficients R and R2 between the descriptor and the property, the frequency of the fragment i
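The per-descriptor criteria listed above can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the ISIDA implementation: R is taken to be the Pearson correlation coefficient, the frequency is taken to be the percentage of compounds containing the fragment, and the function name is hypothetical.

```python
import numpy as np

def descriptor_statistics(counts, prop):
    """Statistics of one fragment descriptor.

    counts: occurrences of the fragment in each compound
    prop:   experimental property of each compound
    """
    counts = np.asarray(counts, dtype=float)
    prop = np.asarray(prop, dtype=float)
    # The Pearson coefficient is undefined for a constant column; report 0 then.
    if counts.std() > 0 and prop.std() > 0:
        r = float(np.corrcoef(counts, prop)[0, 1])
    else:
        r = 0.0
    return {
        "classes": len(np.unique(counts)),  # number of different occurrences
        "R": r,
        "R2": r * r,
        # Assumption: frequency = percentage of compounds having the fragment.
        "freq": 100.0 * np.count_nonzero(counts) / counts.size,
    }
```

With `counts = [0, 1, 2, 2]` and `prop = [1, 2, 3, 3]`, the descriptor has 3 classes, is perfectly correlated with the property, and occurs in 75 % of the compounds.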
6. [Figure: Variables selection panel of the DESCRIPTORS CALCULATION window, with the checkboxes described here and the Calculate Fragments, Variables selection and Files convertor buttons]
Checkbox 1: remove fragments that have a positive occurrence in fewer than m molecules, m being a user-defined threshold.
Checkbox 2: keep only fragments present in the CFR file selected by the user, i.e. the list of fragments involved in a linear QSPR model built with ISIDA QSPR.
Checkbox 3: keep only fragments selected by the Unsupervised Forward Selection (UFS).
Checkbox 4: remove highly correlated fragments (R > user-defined threshold).
Checkbox 5: remove fragments weakly correlated with the studied property (R2 < user-defined threshold). Click on Files convertor to select the property.
Checkbox 6: calculate the p principal components of the variables and keep only these p variables in the SMF file. Available in ISIDA v1, not finished in ISIDA v2.
Click on the Create Files button to launch the selection of variables. The new SMF file is created automatically and has a new extension: SEL SMF
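Checkboxes 1, 4 and 5 amount to three successive column filters on the compounds x fragments matrix. The sketch below shows one way they could work. The order of the filters, the choice of which member of a correlated pair is kept (the first one met) and the function name are assumptions; the default thresholds 0.85 and 0.10 are the values visible in the screenshot.

```python
import numpy as np

def select_variables(X, y, m=2, r_max=0.85, r2_min=0.10):
    """Return the indices of the fragment columns of X that survive the filters."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Checkbox 1: drop fragments occurring in fewer than m molecules.
    keep = [j for j in range(X.shape[1]) if np.count_nonzero(X[:, j]) >= m]
    # Constant columns have no defined correlation, so they are dropped too.
    keep = [j for j in keep if X[:, j].std() > 0]

    # Checkbox 5: drop fragments weakly correlated with the property.
    keep = [j for j in keep if np.corrcoef(X[:, j], y)[0, 1] ** 2 >= r2_min]

    # Checkbox 4: drop fragments highly correlated with one already kept.
    selected = []
    for j in keep:
        if all(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) <= r_max for k in selected):
            selected.append(j)
    return selected
```

A greedy pass is used for checkbox 4 because it is simple to follow; other strategies, such as keeping the member of each pair that correlates better with the property, would be equally compatible with the description.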
7. Mode Advanced user
This mode lets the user modify the settings used for the KNN calculations. The full window appears once the Advanced user checkbox has been checked.
[Figure: ISIDA KNN window in Advanced user mode, with the pages Descriptors, KNN Settings, Variable Selection, Property and Compounds]
The Descriptors page is dedicated to the type of molecular fragments taken into account for the calculations, and to the minimal and maximal number of variables that models can contain.
The KNN settings page lets the user select the type of normalization to apply to descriptors (none by default), the range of nearest neighbors to examine (from 2 to 5 by default), the randomization key (0 by default) and the metric between compounds (Euclidean by default).
The Variable selection page is split in two parts. The left part concerns the deletion of variables before the KNN calculations (correlated fragments, constant fragments, fragments with low variance, etc.). The right part concerns the variable selection for the kNN calculations: the user can modify the maximal number of iterations between two steps of the forward stepwise variable selection (1000 by default).
The Property and Compounds pages are only accessible at the end of the calculations.
To launc
8. [Figure: correlation plot of two descriptors (R2 = 0.833) with the DESCRIPTORS Filtering and SEARCH controls, threshold 0.850]
To search for the variables correlated with a given descriptor, ISIDA can filter the variables and display the correlation (X axis: studied descriptor; Y axis: correlated descriptor). Results can be saved in the Correlated Descs List txt file.
4 Principal Components Analysis (PCA)
o Settings / Parameters of calculations
The SD and SMF files have to be in the same directory; use the Open SDF/SMF button to select one of them. If nothing appears in the Available Data combobox, no previous PCA calculations are available for this set in the work directory. The Calculate button can be clicked once the user has entered a number of components (3 by default). If previous calculations are available, select the corresponding line in the combobox and click on the DISPLAY button.
[Figure: Principal Component Analysis window with the Open FCL file, VAR and DISTRIB buttons and the Parameters, Components and Compounds panels (color mode, number of property classes, property distribution, size of spheres)]
To rotate the points, hold the left mouse button down and move the mouse. To zoom in or out of the points the right button of the mous
9. USER MANUAL
FOURCHES Denis, 2006
1 Data Preparation
  o Data randomization
  o Duplicates search
2 Calculation of descriptors
  o Calculation of molecular fragments
  o Variables selection
  o How to split an entire dataset into training/test sets
  o Export data and convert them into different file formats
3 Data and descriptors analysis
  o According to property classes
  o According to descriptors classes
  o Statistical criteria of descriptors
  o Correlation matrix of descriptors
  o Replacement of a descriptor by another correlated one
4 Principal Components Analysis (PCA)
  o Settings / Parameters of calculations
  o 3D view and tools
  o Color compounds according to a property or a cluster analysis
5 Cluster Analysis: clustering using ISIDA Cluster
  o Settings / Parameters of calculations
  o Dendrogram representation and tools
  o Clusters explorer
  o Proximity maps (IPM, FPM)
  o Export clusters and data
6 QSPR modelling using ISIDA KNN
  o Initial parameters
  o Mode Advanced user
  o Selection of predictive models
  o Property predictions of external set compounds using selected models
  o Load / Save kNN models
7 Calculation of aqueous solubility and octanol/water partition coefficient using ISIDA LogPS
8 ISIDA DC models (in progress)
  o Modelling architect
1 Data Preparation
o Data randomization
Most of the SD files containing libraries of molecules are sorted, i.e. the compounds inside a SD file are ranked according to a
10. aining the IDs of compounds for each cluster, and a COF file containing the different parameters used to perform the cluster analysis. The user can also save the contents of clusters in SD files using the Export SDF page of the Clusters explorer.
[Figure: Clusters explorer, Export SDF page, with the Analysis, Composition and Export SDF tabs]
To save one cluster, write the ID of the cluster (example: write 1 for cluster 1) and the name of the SD file, then click Ok.
Proximity map
[Figure: proximity map]
Click on the First Proximity Matrix Explorer button to launch the viewer of proximity maps. This viewer displays the proximity matrix of compounds before the clustering (IPM, Initial Proximity Matrix, when compounds are not sorted) and after the clustering (FPM, Final Proximity Matrix, when compounds are sorted according to the final arrangement given by the cluster analysis).
[Figure: viewer controls: Zoom out, Zoom in, Mol i, Mol j, Distance, Tanimoto(i,j), Prop]
Each point of the map represents the distance between the compounds i and j. The IPM is displayed first. Click on the Clustering effects button to display the FPM, and click on Tanimoto Visualisation to color the FPM with a Tanimoto similarity color pattern. DISTANCE MODE: red -> blue corresponds to small distances
11. e has to be down and the zoom coefficient is calculated according to the mouse move. The user can assign any of the calculated components to one or more axes. By default, the X axis corresponds to the first principal component, the Y axis to the second and the Z axis to the third. For each component, the program calculates the percentage of the total variance of the variables expressed by that component. A graphical representation can be displayed by clicking on the VAR button, and the percentages are written in green near the selected components.
o Coloring points according to a property value or to clusters
[Figure: Principal Component Analysis window, COLOR settings set to Color by PROPERTY, dataset Ionic Liquids ImBr NR4Br PyBr_TRAINING SDF, with the Open SDF / SMF, DISPLAY, Open FCL file and CALCULATE buttons and the Available data, Use reduced SMF, Number of components and Select property fields]
Color by property: a dedicated combobox under the Select property label is used to select the property. Then select Color by property in the Color settings combobox. The user can adjust the number of property classes with the appropriate
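The percentage of total variance expressed by each component can be illustrated with a short NumPy sketch. It assumes that the descriptors are mean-centered but not scaled, which the manual does not state, and the function name is hypothetical.

```python
import numpy as np

def pca(X, n_components=3):
    """Project compounds on the first principal components.

    Returns the coordinates (one row per compound) and the percentage of
    the total variance carried by each returned component.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # center each descriptor
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variance = s ** 2                          # proportional to the eigenvalues
    percent = 100.0 * variance / variance.sum()
    scores = Xc @ Vt[:n_components].T          # X, Y, Z coordinates of the 3D view
    return scores, percent[:n_components]
```

The three columns of `scores` play the role of the default X, Y and Z axes, and `percent` corresponds to the values displayed near the selected components.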
12. h KNN calculations the Run new calculation button has to be clicked.
o KNN calculations / displaying of results
The program starts by calculating the molecular fragments for all compounds. Some of these descriptors are then deleted because of their high correlation coefficient or their low variance. The kNN procedure begins with the generation of a user-defined number of models involving m descriptors. Among them, the stepwise variable selection algorithm selects the best ones according to Q2 (LOO procedure) or, optionally, according to the R2 obtained for the internal test set.
[Figure: Experimental vs Predicted plot and list of HIT models with the columns Subset ID, Nb Variables and k parameter]
During the calculations, the user has access at any time to the available models: a list of the ten best models according to their Q2 is displayed and updated each time a new model with the best Q2 so far is discovered. The number of models is also displayed. This list of hit models contains the statistical criteria of each model: the number of variables involved, the number of neighbors, Q2 and RMSE for the LOO procedure on the training set, and R2, RMSE and the coefficients a and b of the linear regression between PRED and EXP for the test set.
On the BATCH Curves page, some graphical repre
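The leave-one-out (LOO) criteria used to rank kNN models can be sketched as follows. The manual does not give the formulas, so this sketch assumes the usual definition Q2 = 1 - PRESS / SS, a Euclidean metric and a prediction equal to the plain mean of the k nearest neighbors; the function name is hypothetical.

```python
import numpy as np

def loo_knn(X, y, k=3):
    """Leave-one-out kNN predictions with the resulting Q2 and RMSE."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Euclidean distances between all pairs of compounds.
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    np.fill_diagonal(d, np.inf)          # a compound is never its own neighbor
    nearest = np.argsort(d, axis=1)[:, :k]
    pred = y[nearest].mean(axis=1)       # assumption: unweighted mean
    press = ((y - pred) ** 2).sum()
    q2 = 1.0 - press / ((y - y.mean()) ** 2).sum()
    rmse = np.sqrt(press / len(y))
    return pred, q2, rmse
```

Running this for each k in the examined range (2 to 5 by default in the Advanced user mode) and for each candidate subset of descriptors gives the kind of Q2 values by which the hit models are ranked.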
13. has to be checked and then Analyse again.
o According to descriptors classes
A click on the Analysis of descriptors button opens another part of the data analysis module of ISIDA, which splits the data according to descriptor classes. The SMF file and the SD file, which have the same name, have to be in the same directory; otherwise a SMF file has to be created.
[Figure: DATA ANALYSIS window with the tree view of fragments, the Compounds histograms (X axis: 0 to 6 occurrences) and the SELECT MODE, HIDE ZERO FRAGMENT, SAVE LIST, SORT DESCRIPTORS, QSAR VIEW MODE and CALCULATE STATISTICS buttons]
DISTRIBUTION MODE: the program reads the SMF file and fills the left tree view with the list of descriptors calculated for the set. If the user clicks on a particular fragment, compounds are sorted according to the occurrence of the fragment: several histograms are drawn, one for each occurrence taken by the studied fragment (X axis), showing the number of compounds having that occurrence (Y axis). A given histogram can be selected using the Select Mode button. The compounds having this occurrence of the studied fragment are then displayed in the right listbox with their experimental proper
14. ilarity coefficients. Thus the user can assess whether or not it is reasonable to predict the studied property of the given compound from the properties of its neighbors.
[Figure: Compounds page with COMPOUND ID, PROPERTY and TANIMOTO DISTANCE columns; buttons SAVE Model, LOAD Model, SAVE All Models, LOAD All Models, Apply models to test set, Apply models to external set; worksheet of predicted values with one column per model (Mod 5285, Mod 4989, Mod 4711, Mod 4560)]
o Prediction of external set using selected models
The selected models can be applied to screen the external set with a click on the Apply models to external set button. A new worksheet is filled in, with one line per compound and one column per model. Press the Mean SD button to calculate, for each compound, the mean (consensus model) of the values predicted by all models and the standard deviation.
o Load / Save kNN models
The user can save models one by one or all together. Each model is saved individually in a MOD file. Saving models is instantaneous. Loading models is also easy: restart the program and press the LOAD MODELS button; models can then be reloaded one by one or all together. Be careful: the SD files of the training, test and external sets and the MOD files have to be in the same work directory.
o PCA / Cluster Anal
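The Mean SD step reduces the worksheet (one line per compound, one column per model) to two numbers per compound. A minimal sketch, assuming the sample standard deviation (n - 1 denominator), which the manual does not specify:

```python
import numpy as np

def consensus(predictions):
    """Mean and standard deviation of the model predictions of each compound.

    predictions: one row per compound, one column per selected model.
    """
    p = np.asarray(predictions, dtype=float)
    # ddof=1 gives the sample standard deviation (an assumption here).
    return p.mean(axis=1), p.std(axis=1, ddof=1)
```

A large standard deviation for a compound signals that the selected models disagree on it, so its consensus value deserves less trust.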
15. ine cluster. When a cluster is defined, all the selected compounds are assigned to this cluster. The user can also define sub-clusters, which are clusters included in bigger clusters (non-exclusive clustering): a particular compound can therefore belong to several clusters.
[Figure: dendrogram window for the dataset Ionic Liquids ImBr NR4Br PyBr_TRAINING SMF, with the list of compounds and their clusters, the Distance scale and the Define Cluster, Delete Cluster, Show CLUSTERS, DRAW and 3D VIEW controls]
The list of selected clusters is shown in the left listview, near the color pattern. To explore the contents of the clusters in detail, click on Show clusters after selecting the clusters.
[Figure: ISIDA CLUSTER Clusters explorer with the First Proximity Matrix Explorer and Second Proximity Matrix Explorer buttons and the Analysis, Composition and Export SDF pages]
The clusters explorer shows the contents of the selected clusters in detail, with: histograms of the cluster sizes; two control panels to explore the clusters and the molecules inside a given cluster; a Tanimoto similarity calculator for clusters.
o Export clusters
The clustering program saves the results in a FCL file cont
16. n the Fragmentation settings. The user can select and/or unselect a given type of fragments and modify the range of their length (from 2 to 6 by default). The All Fragments checkbox lets the user select ALL types of molecular fragments in one click. The External descriptors checkbox activates the groupbox of the same name, in order to select a file containing columns of external descriptors for this dataset. At this time, this option is only available in ISIDA Cluster v1 and not in ISIDA Cluster v2. The Calculate Fragments button launches the fragmentation of compounds and creates a SMF (Substructural Molecular Fragments) file. At the end of the calculations, the numbers of compounds and descriptors are written in the memo and two buttons become visible: Variables selection and Files convertor.
o Variables selection
The variables selection suite of ISIDA includes several approaches and options which can be used successively.
[Figure: DESCRIPTORS CALCULATION window for the dataset Ionic Liquids ImBr NR4Br PyBr_TRAI, Variables selection panel (Variables with non zero smaller than, ISIDA QSPR fragments, ISIDA CFR file, UFS Unsupervised Forward Selection)]
17. n the set, the number of compounds having this fragment, the total number of occurrences of this fragment, and a warning if fewer than 5 compounds in the set have this fragment. The Sort descriptors procedure is not finished. Click on DESCRIPTORS CORRELATION to see the correlation matrix between descriptors.
o Correlation matrix of descriptors
[Figure: ISIDA Analysis of the Descriptors Correlation window, CORRELATION MATRIX page, with the Descriptor 1 and Descriptor 2 axes and the navigation and zoom buttons]
ISIDA builds the correlation matrix between descriptors after a click on Analyse. The user can easily detect pairs of highly correlated variables.
o Replacement of a given descriptor by another correlated one
[Figure: ISIDA Analysis of the Descriptors Correlation window, SEARCH DESCRIPTORS page, listing the descriptors correlated with the selected descriptor (for example 0.961 and 0.913) and the Save as Correlated Descs List txt button]
18. operty, the mask file and the formats of the output files (a SMF file is always created). Press Create Files. The new files are written in the same directory as the initial SD and SMF files.
3 Data and descriptors analysis
o According to property classes
The Data Analysis module of ISIDA splits the data according to a numerical property (property classes). This feature is useful to visualize the distribution of a dataset of compounds according to the studied property or activity. To perform the analysis, select the dataset and the property, specify the number of classes (10 by default) and click on Analyse.
[Figure: data analysis window with the pages Analysis of PROPERTY, Analysis of DESCRIPTORS and DESCRIPTORS CORRELATIONS, the property MP_exp and the ANALYSE, SAVE DATA > FILE and EXPORT BMP buttons]
The X axis corresponds to the property and the Y axis to the number of compounds. Each histogram bar represents a group of compounds having similar property values. The scale of the X axis is derived automatically from the desired number of classes. If the user wants classes with a given property range, he has to enter this value in the edit box under the Fixed scaling checkbox, check the checkbox and click on Analyse again. If the user wants to change the minimal and the maximal values of the X axis, he has to enter the corresponding values in the two edit boxes under the Fixed Min Max Property checkbox
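The splitting into property classes is a histogram computation. The sketch below mirrors the three options described above (number of classes, fixed class width, fixed minimum and maximum) with NumPy; the function and argument names are hypothetical.

```python
import numpy as np

def property_classes(values, n_classes=10, class_width=None, vmin=None, vmax=None):
    """Count the compounds falling into each property class.

    Returns the counts (Y axis) and the class boundaries (X axis).
    """
    v = np.asarray(values, dtype=float)
    lo = v.min() if vmin is None else vmin
    hi = v.max() if vmax is None else vmax
    if class_width is not None:
        # "Fixed scaling": the class width decides the number of classes.
        n_classes = int(np.ceil((hi - lo) / class_width))
        hi = lo + n_classes * class_width
    counts, edges = np.histogram(v, bins=n_classes, range=(lo, hi))
    return counts, edges
```

With the defaults, the X scale is derived from the data range and the 10 classes; passing `class_width` or `vmin`/`vmax` reproduces the Fixed scaling and Fixed Min Max Property options.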
19. parameter. ISIDA includes a randomization option to avoid any problem.
[Figure: Open SD FILE and Randomize buttons]
To randomize your data, click on Descriptors in the ISIDA wizard. Select your SD file with the Open SD FILE button and click on Randomize. A new SD file is created, with the same name as the previous one plus the suffix _R.sdf.
o Duplicates search
Detecting duplicates in a large database is not a common task and cannot be done without computational help.
[Figure: DATA Filtering window with the Previous / Next control panel and the property value]
To search for duplicates, make sure that the SD file and a corresponding SMF file containing descriptors are in the same directory. Select the SMF file with the Open button and the name of the property. Click on Analyse to launch the procedure. To go from one duplicate to another, use the control panel. Be careful: the program also finds isomers.
o Scrambling: not finished and not available outside ISIDA DC models.
2 Calculation of descriptors
o Calculation of molecular fragments
Click on the Descriptors button at the top of the wizard. Select your SD file in the Descriptors window.
[Figure: DESCRIPTORS CALCULATION window for the dataset Ionic Liquids ImBr NR4Br PyBr_TRAI, Fragmentation Settings panel (Atoms only, Length of sequences)]
All molecular fragments are available i
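Because the duplicates search needs the SMF file and also reports isomers, one plausible reading is that two compounds are flagged when their fragment descriptors are identical. The sketch below illustrates that idea only; it is an assumption about the criterion, not the documented ISIDA procedure, and the function name is hypothetical.

```python
from collections import defaultdict

def find_duplicates(descriptors):
    """Group compounds that share exactly the same fragment counts.

    descriptors: dict mapping a compound ID to its sequence of fragment counts.
    Returns the groups containing more than one compound.
    """
    groups = defaultdict(list)
    for compound_id, counts in descriptors.items():
        groups[tuple(counts)].append(compound_id)
    return [ids for ids in groups.values() if len(ids) > 1]
```

Under this criterion, isomers that are built from the same fragments land in the same group as true duplicates, which would explain the warning above.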
20. sentations of results are available: curves of statistical criteria versus the number of iterations; R2 for the test set versus Q2 for the training set, for all available models; the calculated LOO property values versus the experimental ones for the training set; the predicted property values versus the experimental ones for the test set. Calculations stop when the maximal number of variables has been reached or when the STOP button has been clicked.
o Selection of predictive models
ISIDA KNN produces a huge number of models with high or low predictive abilities. To select the most predictive ones, the user can choose adequate values of Q2 and R2 in the FILTER MODELS panel. The corresponding models appear in yellow on the graphical representation. On the ListMODELS page, a worksheet is filled in with the selected models.
[Figure: FILTER MODELS panel (Q2 LOO and R2 test thresholds) and worksheet of selected models with the columns Subset ID, Nb Variables, k parameter, LOO and RMSE]
If the user clicks on any model in the list, the Compounds page becomes available: for each compound of the test or the external set, it is possible to visualize its neighbors, the experimental properties of its neighbors, the sim
21. trackbar.
Color by clusters: ISIDA Cluster creates FCL files (File of Classes) to save the results (contents of clusters) of each clustering. Click on Open FCL file to select the FCL file corresponding to the studied set, then select Color by cluster in the Color settings combobox.
Remark: the Use reduced SMF checkbox can be used to calculate the PCA with the selected variables of a SEL SMF file (see Variables selection).
5 Cluster Analysis: clustering using ISIDA Cluster
o Settings / Parameters of calculations
[Figure: ISIDA Cluster control panel for the dataset Ionic Liquids ImBr NR4Br PyBr TRAINING SMF, with the Open SMF file button, the Compounds, Type of fragments and Fragments fields and the Launch button]
If the SMF file is available: click on the Clustering button in the ISIDA wizard and select the SMF file with the Open SMF file button. The number of compounds, the type of fragments and the number of fragments are displayed under the name of the SMF file.
If the SMF file is not available: click on the Descriptors button in the ISIDA wizard to calculate fragments for the studied dataset. Once the SMF file has been created (see 2 Calculation of descriptors), click on the Clustering button in the ISIDA wizard and select the SMF file with the Open SMF file button. The number of compounds, the t
22. ty. The user can also look at structures by clicking on a compound in the list. The Hide zero fragment option hides the compounds that do not contain the studied fragment. This option leads to a new scale of the Y axis.
[Figure: distribution histograms with the zero-occurrence class hidden]
QSAR VIEW MODE: the experimental property is projected on the X axis and the number of occurrences of the selected fragment on the Y axis. Each red point is a compound whose structure can be seen by moving the mouse over it.
[Figure: QSAR view of compounds]
This QSAR view is a graphical way to detect correlations between an experimental property and variables. The program can also handle external descriptors. If they are not fragments and take real values, the occurrences are replaced by ranges in the DISTRIBUTION MODE and by real values in the QSAR VIEW MODE. Available in ISIDA v1, not finished in ISIDA v2.
o Statistical criteria of descriptors
Click on Calculate Statistics and Save File. A text file is created in the work directory, including:
DATASET: F THESEITonic_Liquids NEW_DATA TR_TS_EXT_OL Ionic Liquids ImBr NR4Br PyBr_TRAINING SDF
Number of compounds: 515
Number of descriptors: 413
Property: MP_exp
N  Descriptor  Classes  R      R2     Freq    Nb Mols  Nb Frags  WARNING
1  C C C C     25       0.235  0.055  36.505  188      1394
2  N C C       3        0.
23. ype of fragments and the number of fragments are displayed under the name of the SMF file. Several parameters are required for the cluster analysis.
[Figure: ISIDA Cluster menus (Methods, Neighbors, Normalizations, Options) and the choice of link between clusters: single link, complete link, ward link]
By default, the clustering is entirely hierarchical, with a Euclidean metric between compounds and a complete link between clusters; descriptors are not normalized and no modified metric is employed. The user can modify these parameters using the menu at the top of the ISIDA window. In the metric between compounds menu, the Tanimoto similarity corresponds to the distance calculated as Dist(i,j) = 1 - TANIMOTO(i,j). The calculations begin when the Launch button is pressed.
o Dendrogram representation and tools
When the calculations stop, click on the View Dendogram button. A new maximized window opens and the resulting dendrogram is drawn. Any part of the dendrogram can easily be explored with the small dynamic circles placed on the dendrogram itself. A left click on any circle activates it (blue color) and displays, in the left listview of the window, the list of compounds deriving from this circle. If the circle is activated (blue), a right click on it opens a popup window including the option to def
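The relation Dist(i,j) = 1 - TANIMOTO(i,j) can be made concrete with a small Python sketch. The manual does not say which form of the Tanimoto coefficient is applied to fragment counts; the continuous form used here, T = a.b / (a.a + b.b - a.b), is an assumption, as are the function names.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two vectors of fragment counts."""
    ab = sum(x * y for x, y in zip(a, b))
    aa = sum(x * x for x in a)
    bb = sum(y * y for y in b)
    denominator = aa + bb - ab
    # Two empty descriptor vectors are treated as identical.
    return ab / denominator if denominator else 1.0

def tanimoto_distance(a, b):
    """Dist(i,j) = 1 - TANIMOTO(i,j), as stated in the manual."""
    return 1.0 - tanimoto(a, b)
```

Identical compounds get a distance of 0 and compounds with no fragment in common get a distance of 1, so the distance stays within the range [0, 1] for non-negative counts.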
24. ysis (in progress)
These two options are available when some models have been filtered or when one model has been selected.
7 Calculation of aqueous solubility and octanol/water partition coefficient using ISIDA LogPS
[Figure: ISIDA logPS window with the Open SD / mol file, Edit Structure, Register, Show Results and Update SD file buttons, the number of compounds, and the Options panel (Show distribution according to logP or logS)]
ISIDA LogPS reads mol and SD files. The user can launch the calculation of logP and logS by clicking the Calculate button. During the calculations, the distribution of compounds according to their calculated logS or logP is shown with histograms and updated after every compound. The user can choose to view the histograms of logP or logS. Once the calculations have ended, a click on Show Results displays the predicted values for each compound. The Update SD file button automatically inserts these predicted values as new fields in the studied SD file: a numeric field containing the predicted
