Contents

1. [Screenshot: Comparison of phenotypes sheet (Between-population analysis of original data). Current pair of files: POP2.VAT and POP3.VAT; dissimilarity measure: Simple mismatch. Top: matrix of dissimilarities between phenotypes. Bottom: Phenotype characterization tables for P2 and P3 with the columns Phenotype, Binary pattern of phenotype, Octal, Hexadecimal, Binary Decimal, Frequency.] Figure 4.7: Comparison of phenotypes sheet. Two files are compared: pop2.vat and pop3.vat. The top matrix contains the dissimilarity values between the 8 phenotypes of pop2.vat and the 10 phenotypes of pop3.vat. At the bottom right an
2. Let us consider a dichotomous pattern of any individual $I$ tested on a set of $k$ differentiating factors, with positive responses (1s) on $n$ of them. Then the numbers of 1s and 0s in the binary vector representing $I$ equal $n$ and $k - n$, respectively. C: individual Complexity. The complexity of individual $I$ is determined as the number of 1s (positive responses) in its binary representation: $C(I) = n$ (A6). RC: Relative individual Complexity. The relative complexity of individual $I$ is determined as its complexity normalized by the total number of differentiating factors: $RC(I) = C(I)/k = n/k$ (A7). The relative complexity varies between 0 and 1: $0 \le RC(I) \le 1$. U: individual Uniformity. The uniformity of individual $I$ is determined as follows (Kosman 2003b): $U(I) = 1 - 2\min\{RC(I),\, 1 - RC(I)\}$ (A8). This measure ranges from 0 to 1: $0 \le U(I) \le 1$. If there are only 1s or only 0s in the response pattern, then the uniformity is maximal and equals 1. The uniformity reaches its minimum value 0 if the numbers of positive (1s) and negative (0s) responses are equal. $U_P$: sample Uniformity. The measure of uniformity for a sample of $n$ individuals $I_j$ ($j = 1, 2, \ldots, n$) from population $P$ is defined as the average of the uniformity values across individuals: $U(P) = \frac{1}{n}\sum_{j=1}^{n} U(I_j)$ (A9). This average uniformity of individuals is called the sample (population) uniformity. It ranges from 0 to 1: $0 \le U(P) \le 1$. $U_{Ph}$: average phenotype Uniformity. Average unifor
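The definitions (A6) to (A9) can be sketched in a few lines of Python. This is an illustrative sketch, not VAT code: the function names are hypothetical, and each individual is assumed to be given as a string of 0s and 1s of length k.

```python
def complexity(pattern: str) -> int:
    """C(I): number of 1s (positive responses) in the binary pattern (A6)."""
    if not pattern or set(pattern) - {"0", "1"}:
        raise ValueError("pattern must be a non-empty string of 0s and 1s")
    return pattern.count("1")


def relative_complexity(pattern: str) -> float:
    """RC(I) = C(I) / k, where k is the number of differentiating factors (A7)."""
    return complexity(pattern) / len(pattern)


def uniformity(pattern: str) -> float:
    """U(I) = 1 - 2 * min(RC(I), 1 - RC(I)) (A8)."""
    rc = relative_complexity(pattern)
    return 1 - 2 * min(rc, 1 - rc)


def sample_uniformity(patterns: list[str]) -> float:
    """U(P): average of U(I_j) over all individuals in the sample (A9)."""
    return sum(uniformity(p) for p in patterns) / len(patterns)
```

For instance, a pattern consisting only of 1s has uniformity 1, while a pattern with equally many 1s and 0s has uniformity 0, in line with the bounds stated above.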
3. Transfer only those files which actually have to be processed in this section. The work segment accepts only binary data. If a non-binary data file is selected for transfer, it will automatically be converted into binary before being transferred from the source segment (left) to the work segment (middle). For Regular data, a user-defined Cut Off Value is requested in a window that pops up when a file is transferred from the source segment into the work segment. The Cut Off Value window (Fig. 3.2) mainly consists of the two boxes Replace with 0 and Replace with 1, preceded by the name of the data file to be converted. The program automatically reads the minimum and maximum values from this data file and displays them under the headings From (Min) and To (Max), respectively. If a Cut Off Value was already provided in an earlier session, this value is displayed in the box labeled To (Cut Off), and a message at the bottom alerts you to its existence. You may then either change the old value or leave it unchanged. If no value was ever provided, the To (Cut Off) box is empty and the desired Cut Off Value must be entered into it. Next, click the Apply button, which tells the program to convert all values up to the Cut Off Value into 0s and all values above the Cut Off Value into 1s. Finally, invoke the conversion by clicking OK. As a consequence the name of
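The conversion rule described here (values up to the Cut Off Value become 0, values above it become 1) can be illustrated as follows. This is a minimal sketch under the assumption that the Regular data are held as a list of numeric rows; the function name `binarize` is hypothetical and not part of VAT.

```python
def binarize(table: list[list[float]], cut_off: float) -> list[list[int]]:
    """Convert Regular data to binary: values <= cut_off become 0, values > cut_off become 1."""
    return [[0 if value <= cut_off else 1 for value in row] for row in table]


# Example with a 0..9 assessment scale and a Cut Off Value of 3:
# the value 3 itself maps to 0, anything above 3 maps to 1.
rows = [[0, 3, 4, 9], [6, 2, 3, 8]]
binary_rows = binarize(rows, cut_off=3)
```

Note that the Cut Off Value itself falls on the 0 side, which matches the Replace with 1 range starting just above the cut-off in the Cut Off Value window.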
4. Comments are in the back. Next we explain the new worksheets in more detail. The Phenotype Characterization Sheet. This sheet shows, in two tables, the analysis of Phenotype frequency and Virulence complexities (see Fig. 5.2). In the Current file(s) field above the table you can select from all analyzed files. To the right of this field are three buttons: Average SE, Average, and Excel. The Average button provides the arithmetic means (A45) calculated over all the computer-generated random samples originating from the previous resampling runs (see 3.3). The Average SE button provides the mean (A45) and the standard error SE (A46) of the estimated parameters. Clicking the Excel button will open an Excel file and copy the contents of both tables from the VAT sheet into Excel sheets. We now describe the two tables of the Phenotype Characterization Sheet in more detail. [Screenshot: VAT Inferential Statistics window (Virulence Data), Phenotype characterization sheet, Within-population analysis of resampled data; Phenotype frequencies table with the columns Phenotype, Binary pattern of phenotype, Octal, Hexadecimal, Binary Decimal, Frequency.]
5. [Screenshot: Virulence frequency sheet, Within-population analysis of resampled data; one row per differential and the columns P1 (pop1.vat), P2 (pop2.vat), P3 (pop3.vat).] Figure 5.3: Virulence frequency sheet. Virulence frequencies of the sampled isolates on the set of 16 differentials (columns) are calculated for three different populations: P1 (pop1.vat), P2 (pop2.vat), P3 (pop3.vat). The Average option is chosen at the top right corner. The Comparison of Differentials Sheet. This sheet is subdivided into two matrices (Fig. 5.4). The top matrix displays Average values (A45) of Associations (A5) or Correlations (A4) between virulences and avirulences for all pairs of the differentials (columns in the original data table) from the given differential set (see Fig. 5.4), in the form of an upper or lower triangular matrix, respectively. The bottom matrix displays the Standard Errors SE (A46) of the corresponding parameters. By checking the Square box at the top, the corresponding triangular matrices will be displayed as symmetric square matrices. The option Both allows you to combine the upper and lower triangular matrices in one square matrix. The Current file field allows you to select among all analyzed files. [Screenshot: VAT Inferential Statistics window (Virulence Data).]
6. determined as a table of k rows and one column of binary decimal codes. 2.3 Original Data Sheet. The Original Data Sheet is the main instrument for entering, editing and modifying data and comments, as well as for displaying data, data parameters (type of data, size of data table, etc.), and the name and location of the data file. Data always appear in the grid, while information about the data and data file is placed below the grid. Note that only values valid for the declared type of data can appear or be entered in the grid; invalid values cause error messages to pop up. For example, for regular data with an assessment scale between specified minimum and maximum values of, say, 1 and 10, neither numbers smaller than 1 or greater than 10 nor letters, symbols, etc. can be entered. Pencil Mode. Once you enter a cell (by clicking in the grid cell and typing there), the regular row marker (a triangle pointing to the right) turns into a pencil with three dots (see Fig. 2.3, row 11). This mode is called the Pencil Mode. In this mode the program enables entering data. When typing within the cell is completed, quit the cell (click anywhere outside the cell) in order to return to the Regular Mode. Note that in the Pencil Mode some functions may be executed incorrectly. Therefore it is always recommended to leave the Pencil Mode before continuing processing or saving. [Screenshot: Data Entry window.]
7. 1000 0111 <-> 001000101111010010000111. If the number of individuals in the data table equals k, then Hexadecimal data are determined as a table of k rows and one column of hexadecimal codes of a fixed length. The first sixteen consonants of the English alphabet (B, C, D, F, G, H, J, K, L, M, N, P, Q, R, S, T) are the only valid symbols in the case of Hexadecimal data, and all entries should have the same number of symbols (fixed Code length). Binary Decimal data. If a Differential Set consists of $n$ differentials, then a binary reaction pattern of an individual on the given set of differentials has the general form $v = (v_1, v_2, \ldots, v_n)$, where $v_i = 0$ or $v_i = 1$ for all $i = 1, 2, \ldots, n$. A one-to-one correspondence of such patterns to integers between 0 and $2^n - 1$ is naturally established by matching the integer $I(v) = v_1 2^{n-1} + v_2 2^{n-2} + \cdots + v_{n-1} 2^{1} + v_n 2^{0}$ to the binary pattern $v$. This integer $I(v)$ is called the Binary Decimal Code of $v$. Obviously, any Binary Decimal Code can be transformed back to a single binary vector of length $n$ according to the same rule. For example, in the case of six differentials ($n = 6$), the following binary vectors are represented by the corresponding Binary Decimal Codes: 010101 <-> 21 = $0 \cdot 2^5 + 1 \cdot 2^4 + 0 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0$; 101101 <-> 45 = $1 \cdot 2^5 + 0 \cdot 2^4 + 1 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0$; 000011 <-> 3 = $0 \cdot 2^5 + 0 \cdot 2^4 + 0 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0$. If the number of individuals in the data table equals k, then Binary Decimal data are
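The Binary Decimal correspondence can be checked with a short Python sketch. The function names are hypothetical (not part of VAT), and patterns are assumed to be strings of 0s and 1s with the first differential as the most significant bit, as in the examples above.

```python
def to_binary_decimal(pattern: str) -> int:
    """Binary Decimal Code I(v): read v_1 ... v_n as a base-2 number, v_1 most significant."""
    code = 0
    for bit in pattern:
        if bit not in "01":
            raise ValueError("pattern must contain only 0s and 1s")
        code = code * 2 + int(bit)  # shift the running value left and add the next bit
    return code


def from_binary_decimal(code: int, n: int) -> str:
    """Inverse mapping: the binary vector of length n for a code between 0 and 2**n - 1."""
    if not 0 <= code < 2 ** n:
        raise ValueError("code out of range for n differentials")
    return format(code, f"0{n}b")  # zero-padded binary string of width n
```

The length n has to be supplied for the inverse mapping, because leading zeros (as in 000011) are not recoverable from the integer alone.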
8. [Screenshot: Comparison of individuals sheet; lower triangular matrix of pairwise dissimilarities between the individuals of the current file.] Figure 4.3: Comparison of individuals sheet. Here all the 20 individuals of a population (pop1.vat) are compared pairwise, and the Simple mismatch measure is selected. The matrix is a lower triangular matrix (Square box not checked). Clicking the Excel button at the top right will open an Excel file and copy the contents into an Excel sheet. The Virulence Frequency Sheet. This sheet displays the proportion of the sampled isolates with a virulent reaction on each differential from the given differential set (Fig. 4.4). In other words, the frequency of 1s in the corresponding column of the original data table is displayed. The leftmost column li
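Both quantities mentioned in this excerpt can be sketched briefly in Python. The virulence frequency follows the description above (frequency of 1s per column). For the Simple mismatch measure, whose definition (A1) is not reproduced in this excerpt, the sketch assumes the usual form, the proportion of differentials on which two binary patterns disagree. Function names are hypothetical.

```python
def virulence_frequencies(isolates: list[str]) -> list[float]:
    """Per differential (column): proportion of isolates showing a virulent reaction (1)."""
    n_rows = len(isolates)
    n_cols = len(isolates[0])
    return [sum(int(row[col]) for row in isolates) / n_rows for col in range(n_cols)]


def simple_mismatch(a: str, b: str) -> float:
    """Assumed form of (A1): share of positions at which the two patterns differ."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)
```

With 16 differentials, this form of the simple mismatch can only take multiples of 1/16 (0.063, 0.125, 0.188, ...), which agrees with the values visible in the figures.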
9. (A41) $G_{ST}$: Nei coefficient of differentiation among populations. The Nei coefficient of differentiation $G_{ST}$ among populations (Nei 1973) is equivalent to the definition of the $F_{ST}$ measure (Wright 1951) and, being applied to two populations $P_1$ and $P_2$, is used for measuring the distance between populations: $G_{ST}(P_1, P_2) = \frac{1}{k}\sum_{i=1}^{k} G_{ST}^{i}(P_1, P_2)$ (A42), where $G_{ST}^{i}(P_1, P_2)$ is calculated as follows. For the $i$th differentiating factor, $G_{ST}^{i}(P_1, P_2) = \frac{H_T^{i} - H_S^{i}}{H_T^{i}}$ with $H_S^{i} = \frac{1}{2}\left[H_S^{i}(P_1) + H_S^{i}(P_2)\right]$, where $H_S^{i}(P_1) = 1 - q_{1i}^2 - (1 - q_{1i})^2$, $H_S^{i}(P_2) = 1 - q_{2i}^2 - (1 - q_{2i})^2$, and $H_T^{i} = 1 - \bar{q}_i^2 - (1 - \bar{q}_i)^2$ for $\bar{q}_i = (q_{1i} + q_{2i})/2$. Values of the $G_{ST}$ index vary between 0 and 1: $0 \le G_{ST}(P_1, P_2) \le 1$. One can show that the Nei coefficient of differentiation may be represented in the form $G_{ST}(P_1, P_2) = \frac{1}{k}\sum_{i=1}^{k} \frac{(q_{1i} - q_{2i})^2}{2 H_T^{i}}$ (A43). MCD: Mean Character Difference between populations. The mean character difference MCD (pp. 122-123 in Sneath and Sokal 1973) is identical to the Rogers traits-based distance (Rogers 1972) in the case of dichotomous characters (binary data). The mean character difference between two populations $P_1$ and $P_2$ tested at $k$ binary differentiating characters $D_1, D_2, \ldots, D_k$ is determined as follows: $MCD(P_1, P_2) = \frac{1}{k}\sum_{i=1}^{k} CD_i(P_1, P_2) = \frac{1}{k}\sum_{i=1}^{k} |q_{1i} - q_{2i}|$ (A44), where $q_{1i}$ and $q_{2i}$ are the frequencies of positive response (appearance of 1) at the $i$th character in populations $P_1$ and $P_2$, respectively ($i = 1, 2, \ldots, k$), and $CD_i(P_1, P_2) = |q_{1i} - q_{2i}|$ is the distance of character difference between the pop
10. Application allows a user to switch to another VAT application or to return to the VAT Main Window. 2 Data Entry. Figure 2.1: Data Entry window. Three squares represent three different options, namely Create Differential Set, Enter New Data and Open Existing File. 2.1 Create Differential Set. This option allows you to customize your own Differential Set or modify an existing Differential Set. Notice that the term Differential Set means an ordered set of differentials: if the same differentials are placed in a different order, then the corresponding differential sets are different. The Differential Set window is shown in Fig. 2.2. [Screenshot: Differential Set window with the UPDATE existing set option, the Number of differentials (columns) field, the fields Name, Country, Pathogen, Reference, Host and Author, the Column names list (Name of Differential: Diff1, Diff2, ...) with a comment box, and the Back to Data Entry button.] Figure 2.2: Differential Set window. Differential set DiffSet 6 was chosen; DiffSet 6.xml is the name of the corresponding system file which contains DiffSet 6. This differential set has 7 columns named Diff1, Diff2, etc. Each differential can be commented (comment1, comment2, etc.). The marker here is set to Diff7. The field in the top center allows either to browse and select an already existi
11. 2.3 Original Data Sheet
    2.4 Enter New Data
    2.5 Open Existing File
    3 RESAMPLING AND CODING
    3.1 The Files Selection Sheet
        Conversion By Transfer
    3.2 The Coding Function
        The Original Input Sheet
        The Binary Representation Sheet
        The Coded Representation Sheet
        The Comments Sheet
    3.3 Resampling Function
    4 DESCRIPTIVE STATISTICS
    4.1 The Files Selection Sheet
    4.2 Within population Analysis
        The Phenotype Characterization Sheet
        The Phenotype Frequencies Table
        The T
12. Resistance Complexity (RCR), Resistance Uniformity (UR) and Resistance Frequency in the Resistance Analysis correspond to Virulence Complexity (VC), Relative Virulence Complexity (RVC), Virulence Uniformity (VU) and Virulence Frequency in the Virulence Analysis, respectively. System files for Resistance Data have the extension .rat (filename.rat), similar to the extension .vat (filename.vat) for the Virulence Data. The most significant differences between Resistance Analysis and Virulence Analysis appear in the Resampling and Coding application. 6 Resampling and Coding for Resistance Data. When an individual is tested for resistance to a pathogen isolate, resistance is usually designated by 0 (i.e., the isolate is avirulent on the given individual). Susceptibility of an individual to a given pathogen isolate is designated by 1 in the case of Binary Data (virulent reaction of the isolate), or by a number according to an assessment scale in the case of Regular Data. Therefore, in the binary form of resistance data, 0 rather than 1 is the valuable characteristic, which is unusual. To avoid this, in the case of Resistance Analysis the Binary representation sheet (Fig. 7.2; see also 3.2 and compare with Fig. 3.4 for Virulence Analysis) displays the original data (Fig. 7.1) after conversion to the binary form and an additional 0/1 transformation, so that in the Binary representation
13. [Screenshot: Comparison of differentials sheet, Within-population analysis of resampled data (Relationship between differentials), with the options Square matrix, Association, Correlation and Both, and the Excel button; a top and a bottom matrix over the columns Col1 to Col16.] Figure 5.4: Comparison of differentials sheet. Virulence/avirulence reactions of isolates to 16 differentials are analyzed with respect to a given population (pop1.vat file). The Association measure is selected and displayed in an upper triangular matrix. The Square matrix box is unchecked here. The top and bottom matrices display the Average based on t
14. for all data tables in the files selected in the work segment. Otherwise an error message will appear. The Between procedure is further explained in 5.3. Before running the Between or Within population analysis, the Dissimilarity coefficients must be chosen. For that purpose, check one or more of the associated boxes on the right, labeled Simple mismatch (A1), Jaccard (A2) and Dice (A3). 5.2 Within Population Analysis. Within includes a series of population analyses applied separately to each population (file) listed in the work segment. The Within population analysis is based on the corresponding collections of previously resampled sets. Clicking the Within button performs this analysis. Once Within is activated, it may take a few minutes. A special information window opens, displaying the name of the file currently being analyzed and the elapsed running time, as well as a blue bar conveying graphically the approximate time still required for completion. As soon as the analysis is completed, five new sheets appear behind the Files selection sheet; only the name tags of the five new worksheets are visible, right above the upper edge of the front sheet. Clicking such a name tag brings the corresponding sheet to the foreground (in Fig. 5.2, for example, Phenotype characterization is the visible front worksheet, while Files selection and the other four sheets (Virulence frequency, Comparison of differentials, Diversity parameters and
15. random samples of size 15 were drawn from each of the two files. [Screenshot: Cut Off Value window. File Name: pop4.vat. Replace with 0: From (Min) 0, To (Cut Off) 3. Replace with 1: From 3.001, To (Max) 9. Buttons: OK, Cancel.] Figure 3.2: Cut Off window. The Cut Off Value of the file pop4.vat was set to 3; then the Apply button was clicked, which automatically set the From value in the Replace with 1 box to 3.001. Once the specified files are successfully transferred to the work segment, two functions can be applied to them, namely Coding or Resampling. These functions appear as buttons on the top right side, above the parameter list in the Resampling and Coding window (Fig. 3.1). 3.2 The Coding function. The Coding function translates and displays the binary data in the three major nomenclatures: (1) Habgood's Binary Decimal code, (2) Gilmore's Octal (triplet) code, and (3) Roelfs' Hexadecimal code. To invoke this feature, just click the Coding button above the parameter list (Fig. 3.1), and all files in the work window will be processed. Behind the Files Selection sheet four new sheets will appear, labeled Original input, Binary representation, Coded representation and Comments. Next, the four new worksheets are explained in more detail. The Original Input Sheet. This sheet shows the original input data matrix Fig 3
16. representation sheet displaying the original Resistance Data of p1.rat (Fig. 7.1) after conversion to the binary form, including the 0/1 transformation. [Screenshot: Coded representation sheet. Current file: p1.rat; Binary 0/1; No. cols: 12; No. rows: 15.] Figure 7.3: Coded representation sheet. Codes of the rows in the Binary representation sheet of the Resistance Data of p1.rat (Fig. 7.2) are represented. For example, if the original Resistance Data are of Binary type, then the following correspondence between binary (original and transformed) vectors and codes exists. Table 1 (Resistance Analysis): Original data 111000101010; Binary representation 000111010101; Coded representation: Octal 0725, Hexadecimal CSH, Binary Decimal 469. For comparison, the same binary vector in the case of original Virulence Data has the following representations. Table 2 (Virulence Analysis): Original data 111000101010; Binary representation 111000101010; Coded representation: Octal 7052, Hexadecimal QDN, Binary Decimal 3626. Similar to the Virulence Analysis, the Binary representation of the original data is in fact used in the Descriptive Statistics and Inferential Statistics applications in the case of analyzing Resistance Data. 7 Appendi
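The 0/1 transformation and the Octal and Binary Decimal columns of Tables 1 and 2 can be reproduced with a short Python sketch. The function names are hypothetical; the Octal code is assumed to be formed by reading each consecutive triplet of bits as one octal digit, which is consistent with the values in both tables.

```python
def flip_bits(pattern: str) -> str:
    """0/1 transformation used for Resistance Data: every 0 becomes 1 and every 1 becomes 0."""
    return pattern.translate(str.maketrans("01", "10"))


def octal_code(pattern: str) -> str:
    """Octal (triplet) code: one octal digit per group of three consecutive bits."""
    if len(pattern) % 3 != 0:
        raise ValueError("pattern length must be divisible by three")
    return "".join(str(int(pattern[i:i + 3], 2)) for i in range(0, len(pattern), 3))


original = "111000101010"
transformed = flip_bits(original)  # binary representation used in Resistance Analysis
```

Here `octal_code(transformed)` gives the Octal entry of Table 1 and `int(transformed, 2)` its Binary Decimal entry, while applying the same two steps to `original` gives the corresponding entries of Table 2.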
17. set of differentials can be represented by the Hexadecimal Code as follows. The ordered set of differentials is divided into separate groups of four consecutive differentials each. Then the binary patterns of individuals are represented by ordered sets of 0/1 four-tuples (a four-tuple means four ordered symbols) according to the division of differentials into groups. Now each four-tuple can be encoded by a single letter from the set of the first sixteen consonants of the English alphabet (B, C, D, F, G, H, J, K, L, M, N, P, Q, R, S, T) according to the following rule (four-tuple: Hexadecimal Code): 0000: B; 0001: C; 0010: D; 0011: F; 0100: G; 0101: H; 0110: J; 0111: K; 1000: L; 1001: M; 1010: N; 1011: P; 1100: Q; 1101: R; 1110: S; 1111: T. This is a one-to-one correspondence between the sixteen letters and all possible 0/1 four-tuples. Substituting the corresponding letter for each four-tuple in the binary pattern of an individual, the individual's Hexadecimal Code is obtained, with one letter per four-tuple. Obviously, any Hexadecimal Code can be transformed back to the binary vector according to the same rule. For example, the following pairs of binary vectors of length 24 and hexadecimal codes of length 6 are results of the corresponding transformations: BNJHGR <-> 0000 1010 0110 0101 0100 1100 <-> 000010100110010101001100; DDTGLK <-> 0010 0010 1111 0100
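The letter substitution can be sketched in Python as follows. The sketch assumes the letter order given in the table above (B for 0000 through T for 1111 in alphabetical order); the function names are hypothetical and not part of VAT.

```python
# Sixteen consonants in the order of the table: index 0 (0000) -> B, ..., index 15 (1111) -> T.
CONSONANTS = "BCDFGHJKLMNPQRST"


def hexadecimal_code(pattern: str) -> str:
    """Encode a binary pattern, four bits at a time, as one consonant per four-tuple."""
    if len(pattern) % 4 != 0:
        raise ValueError("pattern length must be divisible by four")
    return "".join(CONSONANTS[int(pattern[i:i + 4], 2)] for i in range(0, len(pattern), 4))


def binary_pattern(code: str) -> str:
    """Decode a consonant code back into the binary pattern (four bits per letter)."""
    return "".join(format(CONSONANTS.index(letter), "04b") for letter in code)
```

For example, `hexadecimal_code("001000101111010010000111")` yields DDTGLK, and `binary_pattern("DDTGLK")` returns the original 24-bit vector.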
18. tag brings the corresponding sheet to the foreground (in Fig. 4.6, for example, Pairwise common phenotype is the visible front worksheet, while Files selection and the other five sheets (Comparison of phenotypes, Overall common phenotypes, Virulence frequency, Distance parameters between samples, Comments) are in the back). Next we explain the new worksheets in more detail. The Pairwise Common Phenotype Sheet. This sheet displays, for any selected pair of populations, the phenotypes common to both. The Current pair of files field at the top allows you to select among the analyzed file pairs. Once a pair is chosen, a table is displayed with the binary phenotype vectors and the corresponding Octal, Hexadecimal and Binary Decimal codes (see 2.2) of each of the common phenotypes. Note that only the common phenotypes are shown. The two Count columns of the table (CountP2 and CountP3 in Fig. 4.6) report the number of individuals with a common phenotype encountered in the respective population. For example, the third phenotype (3 in Fig. 4.6), which is found in both populations P2 and P3, appears 2 times in the first population (P2); therefore the number 2 is displayed in the cell for phenotype 3 and CountP2. The column Mincounts gives the minimum value (1) of CountP2 (2) and CountP3 (1) for phenotype 3. The two Freq columns (FreqP2 and FreqP3 in Fig. 4.6) contain the relative frequencies of the common phenotypes (A18) with respect to the sample si
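The contents of the Pairwise Common Phenotype table can be sketched in Python with `collections.Counter`. This is an illustrative sketch only: the function name and dictionary keys are hypothetical, each population is assumed to be a list of binary pattern strings, and the relative frequency is assumed to be the count divided by the sample size.

```python
from collections import Counter


def pairwise_common_phenotypes(pop_a: list[str], pop_b: list[str]) -> list[dict]:
    """One row per phenotype found in both samples: counts, their minimum, relative frequencies."""
    counts_a, counts_b = Counter(pop_a), Counter(pop_b)
    rows = []
    for phenotype in sorted(counts_a.keys() & counts_b.keys()):  # only common phenotypes
        rows.append({
            "phenotype": phenotype,
            "count_a": counts_a[phenotype],
            "count_b": counts_b[phenotype],
            "min_count": min(counts_a[phenotype], counts_b[phenotype]),
            "freq_a": counts_a[phenotype] / len(pop_a),  # assumed: count / sample size
            "freq_b": counts_b[phenotype] / len(pop_b),
        })
    return rows


p2 = ["1100", "1100", "0011", "1111"]
p3 = ["1100", "0000", "1111", "1111", "1010"]
table = pairwise_common_phenotypes(p2, p3)
```

In this toy example the phenotype 1100 occurs twice in the first sample and once in the second, so its minimum count is 1, mirroring the phenotype 3 example described above.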
19. the Insert, Add and Del buttons, which appear next to the field Number of differentials (columns). Insert: Mark any row in the list of differentials in the Column names box. An arrow will appear next to the marked row. Clicking the Insert button inserts a new row, for entering the name and comment of the new differential, above the marked one. Add: Clicking Add appends a new row, for entering the name and comment of the new differential, at the bottom of the Column names box. Del: Clicking Del deletes the marked differential from the list in the Column names box. The entries Name, Pathogen, Host, Country, Reference and Author can be modified by clicking into the appropriate box. The name and comments of each member of the differential set can be modified by clicking the appropriate cell in the Column names box. Once editing and modification of the differential set is completed, click the Save button at the bottom. The differential set is now saved by VAT under the chosen Name (original or modified) and is available in the Enter New Data section (see 2.4). Clicking the Excel icon at the top right transfers the current data to an Excel file and opens this file. The Back to Data Entry button returns from the Differential Set window to the Data Entry window. 2.2 Types of Data. The following five types of data are admissible. Regular data: If number of individuals and differentials in data table equ
20. unformatted way. This representation allows you to recognize whether the data contain comments and names of columns and rows; a column delimiter can also be recognized. In order to upload the text file properly, the corresponding parameters should be specified in the field Number of lines reserved for comments and by checking the boxes First column is reserved for names of rows and/or First row is reserved for names of columns, if there is a header column and/or a header row, respectively, in the given text file. Once the appropriate parameters are chosen, continue by clicking Next. Clicking Back returns to the Open Existing File window (Fig. 2.15). [Screenshot: Preview file contents (unformatted) window with the options Number of lines reserved for comments, First column is reserved for names of rows, and First row is reserved for names of columns; the file shown is pop_1_reg.txt.] Figure 2.16: The Preview of file content (unformatted) window displays Regular data delimited by tabs, with no comments and no names of rows and columns. The next window is the View of Tabulated Data Set one (Fig. 2.17). This window displays the input data in a grid according to the parameters determined in the previous Preview of file window (Fig. 2.16) and an automatically recognized column delimiter used in the given txt file. VAT is able to recognize four commonly used standard delimiters: Comma, Tab, Semicolon
21. [Screenshot: Comparison of differentials sheet; upper triangular matrix of association values between the columns of the current file.] Figure 4.5: Comparison of differentials sheet. Virulence/avirulence reactions of isolates to 16 differentials are analyzed with respect to a given population (pop1.vat file). The Association measure is selected and displayed in an upper triangular matrix. The Square matrix box is unchecked here. The Diversity Parameters Sheet. This sheet provides different measures of diversity within populations, namely Nei Diversity Hs (A27), Simpson Si (A21), Normalized Shannon Sh (A25), Kosman Index K (A29), Stoddart St (A22), Shannon SH (A23), Evenness E (A24), Gleason G (A20), Average dissimilarity within ADW (A16), and Kosman diversity within KW (A16). The indices ADW and KW can be calculated with regard to the three commonly used dissimilarity measures: Simple Mismatch m (A1), Jaccard j (A2) and Dice d (A3). Values of all relevant diversity parameters are displayed in a table where each column correspond
22. 1.vat, pop2.vat, pop3.vat. The Square matrix box is not checked. The Comments Sheet. This sheet (Fig. 4.11) displays the available comments about each analyzed population; these comments must have been entered beforehand in the Data Entry section (see 2.4 and Fig. 2.4). Since some tables (e.g., Fig. 4.10) use general labels like P1, P2 for the analyzed populations (files), the Comments sheet may be useful to learn which file is associated with each label. [Screenshot: VAT Descriptive Statistics window (Virulence Data), Comments sheet, Between-population analysis of original data; File comments table with the columns Population, File Name, Comments: P1, pop1.vat, 20 individuals, 16 differentials; P2, pop2.vat, 12 individuals, 16 differentials; P3, pop3.vat, 15 individuals, 16 differentials.] Figure 4.11: Comments sheet. Three different populations, including file names, are shown (pop1.vat, pop2.vat, pop3.vat). 5 Inferential Statistics. This VAT application allows statistical analysis of individual populations (Within) or groups of populations (Between), and also estimation of the statistical significance of calculated parameters based on resampled data. Only resampled binary files (original or converted) obtained in the Resampling and Coding application s
23. 2 Since each isolate in the sample is expressed by a binary vector, isolates with identical binary vectors are defined to exhibit the same phenotype. The second column displays, for each phenotype, the associated binary vector. Recall that the dimension of the vector (its length) equals the number of underlying differentials. The next three columns contain the corresponding Octal, Hexadecimal and Binary Decimal race codes (see 2.2). The last column of the Phenotype frequencies table shows the Phenotype Frequency (A18), with or without standard error SE (A19), depending on the choice between the FreqPhen SE and FreqPhen buttons. The Three Implemented Race Codes. Octal (or triplet) code (see 2.2) requires a binary vector of length divisible by three; otherwise VAT leaves the Octal column empty. Hexadecimal code (see 2.2) requires a binary vector of length divisible by four; otherwise VAT leaves the Hexadecimal column empty. See also the VAview tool. Binary Decimal code (see 2.2) is available only for binary vectors up to a length of 63; otherwise VAT leaves the Binary Decimal column empty. The Virulence Complexities Table. The first two columns from the left are the same as in the Phenotype Frequencies table: on the left is a list of all the different phenotypes, and the second column displays the associated binary vectors. The next three columns, labeled VC, RVC and VU, contain va
24. 3 The small Current file field on top of the data table provides the name of the vat file containing the displayed data. The browse option (click the downward arrow to the right of the Current file field) allows choosing between all files that have just been processed by the Coding function. Clicking the Excel button at the top will open an Excel file and copy the contents of all four Coding sheets into four Excel sheets. [Screenshot: VAT Resampling and Coding window (Virulence Data), Original input sheet; data grid of the current file with the parameters Regular, No. cols: 16, No. rows: 12, Min value: 0, Max value: 9, Cut Off Value: 3, and the current file path of pop4.vat.] Figure 3.3: Original input sheet displaying the input data matrix of pop4.vat. Below the table all major data parameters are listed. Note: The data type here is Regular and the Cut Off Value is 3. This causes all values up to 3 to become 0 and all values above 3 to become 1 after conversion to binary form (for example, see col 7, rows 1 and 2 in Fig. 3.4). The Binary Representation Sheet. This sheet disp
25. A2, etc., and the corresponding references to them appear in the manual.

4.1 The Files Selection Sheet

By clicking the Descriptive statistics square in the VAT Main window, a new window will appear with one open sheet called Files selection. This sheet is subdivided into three parts (Fig. 4.1): two segments, left and middle, named source and work segment, respectively, and, to the right of these two segments, a list of parameter values.

The General Management Bar, found above most VAT windows, contains the options File, Change Application and Help. Under File there is the Exit option, which allows leaving the program and terminating the current VAT session. With the Change Application option one can switch to other sections of the program or return to the VAT Main Window. The Help button provides information on how to work with VAT.

The source segment on the left, with the legend "Binary files (original or converted)", provides a list of data files to select from for further analysis. The small path line field labeled Folder, above the source segment, displays the full path of the active folder that holds the files listed in the source segment. Note that only .vat files are listed, since they are the only ones that can be processed in this section. To select another folder, click the Browse button next to the path line field and use the browse option; the listed .vat files in the source segment will change accordingly. One may t
26. Figure 2.6a Regular data with one empty cell (Row12, Diff4).

[Screenshot: Validation window. List of Invalid Cells: Row 12; Row Name Row12; Col Name Diff4; Col 4.]

Figure 2.6b Validation window corresponding to the data from Fig. 2.6a with one empty cell (Row12, Diff4). The List of Invalid Cells contains information about the cell with missing data: row number, row name, column name and column number.

Find next invalid cell: By clicking this function or by pressing the F3 button, one can scroll through the invalid cells.

Reset: This option clears all data in the grid of the Original Data Sheet. All cells will become empty.

Replace: This function is available only for Regular data. It replaces all entries that have a specific value with another valid value. The Replace window will be opened (Fig. 2.7). Enter the value to be replaced into the Find field, and enter the new value into the Replace with field. All entries with the Find value will be substituted by the Replace with value.

Figure 2.7 Original data (a), Replace window (b) and the modified data after replacement (c). In the Replace window (b) the Find and Replace with fields contain 9 and 3, respectively. The modified data (c) have all the cells with the val
27. Figure 2.4 Save as window (standard). The folder's path ends in the folder test VAT. A File name has to be entered.

Text (.txt) file: Selecting Text File saves the data in the standard text file format under any name with the standard extension .txt (filename.txt). This option is mainly for backup or for importing/exporting data. Note that .txt files cannot be read or used by any VAT application except Data Entry. Click the Browse button to choose the file name and path. The standard Microsoft Save as window will appear (Fig. 2.5a) so that one can browse to an appropriate folder and save the data under any name, but only with the extension .txt (filename.txt). Only folders and/or text (.txt) files will be shown in the Save as window (Fig. 2.4). Determining a Column delimiter is mandatory in the case of Binary or Regular data (see 2.2).

[Screenshot: Save as window (system) with the fields Type of file, File Name, Row Delimiter, Column Delimiter and Special Column Delimiter.]

Figure 2.5 Save as window (system). Text File is chosen and the file path is C:\VAT\work\text_file.txt; the Row delimiter is Tab and the Column Delimiter is a Comma.

Change Application: This menu option provides a switch among the five VAT Applications: Data entry, Resampling and Coding, Descriptive Statistics, Inferential Statistics, Miscellaneo
28. Virulence Analysis Tool (VAT) User Manual

Antje Herrmann, Christian-Albrechts-Univ. Kiel, Institute of Crop Science and Plant Breeding
Amos Dinoor, The Hebrew Univ. of Jerusalem, Faculty of Agricultural, Food and Environmental Quality Sciences
Gabriel A. Schachtel, Justus-Liebig-Univ. Giessen, Biometrie & Pop. Genetik, FB 09
Evsey Kosman, Tel Aviv University, Institute for Cereal Crops Improvement

March 2009

This project was supported by the German-Israeli Foundation (GIF).

1 INTRODUCTION
1.1 General information about VAT
1.2 Using VAT
Getting started
Basic tools and display ...
1.3 VAT Main Window
Type of Analysis sector in the VAT Main Window
Applications sector in the VAT Main Window
Part I Virulence Analysis applications
2 DATA ENTRY
2.1 Create Differential Set
2.2 Types of Data
29. al k and n, respectively, then Regular data are determined as a k × n table of nonnegative real numbers with k rows and n columns. Each row represents a reaction pattern (vector pattern) of an individual on the given set of differentials according to the assessment scale underlying the data. Nonnegative real numbers within the range of the assessment scale are the only valid entries in the case of Regular data.

Binary data: If the numbers of individuals and differentials in the data table equal k and n, respectively, then Binary data are determined as a k × n table of 0s and 1s with k rows and n columns. Each row represents a reaction pattern (binary vector pattern) of an individual on the given set of differentials, where 0 or 1 at the i-th position corresponds to an avirulence (negative) or virulence (positive) reaction, respectively, of the individual on the i-th differential. The two symbols 0 and 1 are the only valid entries in the case of Binary data.

Octal data: If the number n of differentials in a Differential Set is a multiple of three, then a binary reaction pattern of an individual on the given set of differentials can be represented by the Octal Code as follows. The ordered set of differentials is divided into separate groups of three consecutive differentials each. Then the binary patterns of individuals are represented by ordered sets of 0-1 triplets (a triplet means three ordered symbols) according to the division of differentials
30. als D1 and D2 is defined by the formula

$\mathrm{Cor}(D_1, D_2) = 1 - 2\,m(D_1, D_2)$ (A4)

where $m(D_1, D_2)$ is the value of the simple mismatch dissimilarity (A1) between the binary vector columns corresponding to differentials D1 and D2 (Kosman 2003b). This measure ranges from -1 to 1: $-1 \le \mathrm{Cor}(D_1, D_2) \le 1$. $\mathrm{Cor}(D_1, D_2) = 1$ if the two vector columns are identical, and $\mathrm{Cor}(D_1, D_2) = -1$ if the binary representation of each individual is different on D1 and D2 (01 or 10). Differentiating characters are strongly correlated if $\mathrm{Cor}(D_1, D_2)$ is relatively close to 1 or to -1, whereas there is no correlation between them if the values of $\mathrm{Cor}(D_1, D_2)$ are close to zero.

g, Association between differentiating characters

The coefficient of association g between pairs of differentiating characters D1 and D2 is determined by formula 17.5 in Sokal and Rohlf (1995):

$g(D_1, D_2) = \ldots$ (A5)

where α is the number of individuals with shared positive responses to both differentials D1 and D2 (11), β is the number of individuals with a positive response to D1 but a negative response to D2 (10), γ is the number of individuals with a negative response to D1 but a positive response to D2 (01), and δ is the number of individuals with shared negative responses to both differentials D1 and D2 (00). Obviously, $\alpha + \beta + \gamma + \delta = n$, where n is the total number of individuals tested. The g coefficient is related to Fisher's exact test (Sokal and Rohlf 1995).

3 Characterization of individual patterns
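Formula (A4) can be checked numerically with a short Python sketch. The function names are hypothetical. The sketch assumes that the simple mismatch dissimilarity (A1), which is not reproduced in this excerpt, is the proportion of positions at which two binary vectors differ; that reading is consistent with the stated range of Cor from -1 to 1.

```python
def simple_mismatch(u, v):
    """Proportion of positions at which two equally long 0/1 vectors differ.

    ASSUMPTION: this is the usual simple mismatch coefficient; formula (A1)
    itself is not shown in the excerpt.
    """
    if len(u) != len(v) or len(u) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    return sum(a != b for a, b in zip(u, v)) / len(u)


def differential_correlation(table, col_a, col_b):
    """Cor(D_a, D_b) = 1 - 2 * m(D_a, D_b) for two columns of a binary table.

    table is a list of rows (one row per individual); col_a and col_b are
    zero-based column indices of the two differentials.
    """
    column_a = [row[col_a] for row in table]
    column_b = [row[col_b] for row in table]
    return 1 - 2 * simple_mismatch(column_a, column_b)
```

Two identical columns give 1, two columns that differ for every individual give -1, and columns that differ for exactly half of the individuals give 0.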
31. an be activated: Within or Between.

Within includes a series of population analyses that are applied separately to each file (population) selected in the work segment. By clicking the Within button, all these analyses will be performed. The Within population analysis is further explained in 4.2.

Between includes a series of analyses for pairwise comparison of populations (files). In the case of more than two files, all possible pairs will be analyzed. Note that the Between procedure can only be run if the number and names of columns are identical for all data tables in the files selected in the work segment. Otherwise, an error message will appear. If the number of rows is not identical in all selected data files, Kosman's KBm, KBj and KBd (equation A31 in the Appendix) cannot be calculated; the program will still run. The Between population analysis is further explained in 4.3.

Before running the Between or Within population analysis, one should choose among three Dissimilarity coefficients. For that purpose, check one or more of the associated boxes on the right, labeled Simple mismatch (A1), Jaccard dissimilarity (A2) and Dice dissimilarity (A3). Note that A1, A2, etc. are designations of equations that appear in the Appendix.

4.2 Within population analysis

Within includes a series of population analyses that are applied separately to each file (population) selected in the work segment. By clicking the Withi
32. and Space. These delimiters, together with an option Other, are listed above the grid. The delimiter recognized by VAT, if one is identified, is automatically marked in the circle next to the delimiter name. If the delimiter was correctly recognized and the parameters were accurately determined in the previous Preview of file window, then the input data appear properly in the grid.

By clicking the Next button at the bottom of the View of Tabulated Data Set window, the Original Data sheet (Fig. 2.3) will be opened, and the requested data will appear in the grid after automatic testing for validity according to the Data Type declared in the Open Existing File window (Fig. 2.15). In the case of invalid data, an error message will pop up.

The view of the data may be inappropriate if the program could not find the correct delimiter and/or wrong parameters were determined in the Preview of file window (Fig. 2.16). One can return to the Preview of file window (click Back) to change the parameters. In order to fit the data, another column delimiter can be specified, either among the optional standard ones or using the Other option (click in the circle next to the corresponding option). By choosing Other, any delimiter can be entered into the adjacent field. After choosing the delimiter, click the Go button to get a new view of the data. If a proper data appearance in the View of Tabulated Data Set window cannot be reached by means of the pr
33. ations (pop1.vat, pop2.vat, pop3.vat). At the top right the FreqVir option is chosen.

The Between Sample Distance Sheet

Matrices of pairwise distances between the analyzed populations are provided. Seven different distance measures are available, namely: Nei distance N (A38), Nei's Gst (A42), Kosman's Gst (A43), Rogers distance (A32), Mean Characters difference MCD (A44), DADm (A30) and Kosman distance KBm (A31). For each measure, a matrix in a separate sheet can be accessed by clicking the corresponding sheet name tag visible at the upper edge of the front sheet (Fig. 4.10). By checking the Square box on the top, the corresponding triangular matrix will be displayed as a symmetric square matrix. Clicking the Excel button at the top right opens an Excel file and copies the contents into an Excel sheet.

[Screenshot: Between sample distances sheet of the Descriptive Statistics window, Nei distance N tab. Lower-triangular matrix: P1: 0; P2: 0.076, 0; P3: 0.056, 0.013, 0.]

Figure 4.10 Between sample Distances sheet. The Nei distance N sheet is selected and three populations are compared: pop
34. atistics, (4) Inferential Statistics and (5) Miscellaneous (under development), represented by differently sized squares. Each application is activated by clicking on the desired square. In principle, these applications are arranged in the logical succession of steps of analysis, starting from the biggest square (Data entry) in the left

Part I Virulence Analysis applications

2 Data Entry

This VAT application allows the user to perform three operations, namely:
1. Create new and/or modify existing differential sets. Differential sets can be saved in the program and used for further data entry.
2. Enter new data and save them for further analysis. New data appear in a grid (table) where rows represent virulence patterns (original or encoded) of isolates for a given differential set. The data type can be Regular, Binary, Octal, Hexadecimal or Binary Decimal.
3. Open an existing file and modify it for further analysis. Two file types are allowed: either the text (.txt) file or the system (.vat) file.

By choosing the Data Entry square in the VAT Main Window, the Data Entry window will open. The Data Entry window (Fig. 2.1) displays the three options Create Differential Set, Enter New Data and Open Existing File, each represented by a square. Above these windows is the General Management Bar with the buttons File, Change Application and Help. Under File there is the Exit option, which allows terminating the program. Change
35. 8 REFERENCES

1 Introduction

The analysis of plant pathogen populations is commonly based on experimental data that are organized in large two-way tables. The Virulence Analysis Tool (VAT) is user-friendly software for processing this kind of data. VAT aims at supporting a comprehensive, effective and logically consistent evaluation and presentation of virulence data of pathogen populations and of resistance data of host populations. The package can also be applied to molecular marker data. VAT offers the following features:
1. Tools to facilitate basic routine steps such as data entry and transformation (dichotomization, identification of phenotypes). A tool to translate phenotype (race) names from one nomenclature to another (e.g. from the binary-octal Gilmour code to the binary-hexadecimal Roelfs-McVey code) is implemented to make the results of different researchers compatible.
2. Descriptive tools for the characterization of isolate and host samples (e.g. by distribution of phenotypes, virulence/resistance frequencies and complexities, associations, diversities, distances, etc.), displaying the results in a clear and organized fashion by histograms (under development), frequency tables and indices.
3. Inference statistical procedures that estimate various diversity and distance indices and o
36. d left are the Phenotype characterization of pop2.vat and pop3.vat, respectively.

The Overall Common Phenotypes Sheet

This sheet (Fig. 4.8) displays the phenotypes common to all analyzed files simultaneously (not just pairwise common, as in Fig. 4.6). The table lists the binary vector of each common phenotype; the next three columns give the corresponding Octal, Hexadecimal and Binary Decimal codes (see 2.2) for each. The following columns (FreqP1, FreqP2, FreqP3, ...) show the frequencies of each phenotype in each population. For example, if population P1 has sample size 12 and a common phenotype is found 3 times in P1, then its FreqP1 value is 3/12 = 0.25. The column Minfreq gives the minimum value of FreqP1, FreqP2, FreqP3, .... Clicking the Excel button at the top right opens an Excel file and copies the contents into an Excel sheet.

[Screenshot: Overall common phenotypes sheet of the Descriptive Statistics window. Single table row: binary pattern 1111001110111011; Octal empty; Hexadecimal TFPP; Binary Decimal 62395; Freq P1 0.050; Freq P2 0.250; Freq P3 0.200; Min Freq 0.050.]

Figure 4.8 Overall common phenotype sheet. One common p
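The frequency columns and the Minfreq column of the Overall Common Phenotypes sheet can be reproduced with a short Python sketch. The function name and the data layout (each population as a list of 0/1 strings, one string per isolate) are assumptions made for illustration, not VAT's internal format.

```python
from collections import Counter


def overall_common_phenotypes(populations):
    """List phenotypes present in every population with their frequencies.

    populations: list of samples, each a non-empty list of 0/1 strings.
    Returns a list of (phenotype, [freq in P1, freq in P2, ...], min freq),
    sorted by phenotype.
    """
    if not populations or any(len(sample) == 0 for sample in populations):
        raise ValueError("need at least one population, none of them empty")
    counts = [Counter(sample) for sample in populations]
    # A phenotype is "overall common" if it occurs in every population.
    common = set(counts[0]).intersection(*counts[1:])
    rows = []
    for phenotype in sorted(common):
        freqs = [
            count[phenotype] / len(sample)
            for count, sample in zip(counts, populations)
        ]
        rows.append((phenotype, freqs, min(freqs)))
    return rows
```

A phenotype found 3 times in a sample of 12 isolates gets the frequency 3/12 = 0.25, as in the example from the text.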
37. e Octal and Hexadecimal codes can both be obtained. Clicking the Excel button at the top will open an Excel file and copy the contents of all four Coding sheets into four Excel sheets.

The Comments Sheet

This sheet displays the available comments about each analyzed population (file); these comments had to be included beforehand in the Data entry section (see 2.4 and Fig. 2.2). Clicking the Excel button at the top will open an Excel file and copy the contents of all four Coding sheets into four Excel sheets.

Note: To view the data of a .vat file after Coding, there are three sheets available: Original input, Binary representation and Coded representation. However, only the Current file box in the Original input sheet allows choosing which file to view.

3.3 Resampling Function

Resampling is a necessary function for the Inferential Statistics application (5), which runs only with resampled binary data. The Resampling function takes a binary matrix from the work segment (Fig. 3.1) and generates a number of fictive samples in which rows are randomly drawn with replacement from the binary matrix. The number and size of the computer-generated fictive samples are user defined. These samples, based on the original data, provide a data pool that allows the assessment of the statistical significance of population parameters and other statistical inferences in the Inferential Statistics application. After clicking the Resampling button at the top right, a Resampling window will p
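Drawing rows with replacement, as the Resampling function is described to do, can be sketched as follows. This is a minimal illustration with hypothetical names (`resample`, `number_of_samples`, `sample_size`, `seed`); it says nothing about how VAT generates its random numbers or stores the resampled files.

```python
import random


def resample(binary_matrix, number_of_samples, sample_size, seed=None):
    """Generate fictive samples by drawing rows with replacement.

    binary_matrix: non-empty list of rows (each row a list of 0s and 1s).
    Returns a list of number_of_samples samples, each with sample_size rows.
    The optional seed makes the draw reproducible.
    """
    if not binary_matrix:
        raise ValueError("binary_matrix must contain at least one row")
    if number_of_samples < 1 or sample_size < 1:
        raise ValueError("number_of_samples and sample_size must be positive")
    rng = random.Random(seed)
    return [
        # Each row is copied so that samples do not share row objects.
        [list(rng.choice(binary_matrix)) for _ in range(sample_size)]
        for _ in range(number_of_samples)
    ]
```

Because rows are drawn with replacement, the sample size may be larger than the number of rows in the original matrix, and the same isolate may appear several times in one fictive sample.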
38. e analysis of pathogenic data in both directions, namely Virulence Analysis and Resistance Analysis. For each type of analysis there are five applications: (1) Data entry, (2) Resampling and Coding, (3) Descriptive Statistics, (4) Inferential Statistics and (5) Miscellaneous (under development), which will be described in more detail in the following chapters.

VAT displays data, comments and results of calculations mainly in grids, tables and matrices. In the Data Entry section you can enter and edit your data and comments in grids and tables. In all other sections, the contents of grids, tables and matrices cannot be modified by the user. The format of grid cells (height of rows and width of columns) can be changed by dragging the corresponding borders. Most windows or sheets in VAT have an Excel icon, which allows a user to copy data and results to an Excel worksheet.

The General Management Bar can be found above most VAT windows. It contains the options File, Change Application and Help (under development). Under File there is the Exit option to leave the program and terminate the current VAT session. With the Change Application option one can switch to other VAT applications or return to the VAT Main Window. The Help button provides information on how to work with VAT.

1.3 VAT Main Window

The VAT Main Window is divided into two sectors: Type of Analysis at the top and Applications (the main square) (Fig. 1.1). It is possible to ac
39. ee 2.2), if an underlying differential set consists of 24 differentials, the corresponding Hexadecimal Code length should be 6, and the Hexadecimal Code data should be strings of 6 letters from the set of the first sixteen consonants of the English alphabet (B, C, D, F, G, H, J, K, L, M, N, P, Q, R, S, T), e.g. BNJHGR, DDTGLK, TTTTTT, etc. The Code length is a mandatory parameter. Note that a Hexadecimal data table will possess only a single column of codes, so the Number of columns is an irrelevant parameter and the corresponding field will be disabled.

Binary Decimal data: Check the Binary Decimal box. The Binary vector length field will then be available to specify the length of the binary vectors corresponding to the Binary Decimal Code input values (if None is selected in the Differential Set field). For example (also see 2.2), if an underlying differential set consists of 4 differentials, the corresponding Binary vector length should be 4, and the Binary Decimal Code data should be the $2^4 = 16$ integers from 0 to 15 representing the following sixteen binary vectors: 0 ↔ (0,0,0,0), 1 ↔ (0,0,0,1), 2 ↔ (0,0,1,0), 3 ↔ (0,0,1,1), ..., 15 ↔ (1,1,1,1). The Binary vector length is a mandatory parameter and its value must not exceed 63 ($2^{63} - 1$ is the maximum integer allowed by the operating system of a PC). Note that a Binary Decimal data table will possess only a single column of codes, so the Number of columns is an irrelevant para
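The correspondence between codes and binary vectors described above can be illustrated by decoding in the opposite direction. The Python sketch below uses hypothetical function names. The letter order B C D F G H J K L M N P Q R S T (values 0 to 15) is taken from the worked examples elsewhere in the manual, where 1111001110111011 corresponds to TFPP and to the Binary Decimal value 62395.

```python
# Consonants for the 4-bit groups, indexed by value 0..15.
HEX_LETTERS = "BCDFGHJKLMNPQRST"
MAX_BINARY_VECTOR_LENGTH = 63


def binary_decimal_to_vector(code, length):
    """Expand a Binary Decimal code into a 0/1 list of the given length."""
    if not 1 <= length <= MAX_BINARY_VECTOR_LENGTH:
        raise ValueError("binary vector length must be between 1 and 63")
    if not 0 <= code < 2 ** length:
        raise ValueError("code does not fit into a vector of this length")
    # Zero-padded binary representation, most significant bit first.
    return [int(bit) for bit in format(code, f"0{length}b")]


def hexadecimal_to_vector(code):
    """Expand a Hexadecimal (letter) code into a 0/1 list, 4 bits per letter."""
    bits = []
    for letter in code.upper():
        value = HEX_LETTERS.index(letter)  # raises ValueError for other letters
        bits.extend(int(bit) for bit in format(value, "04b"))
    return bits
```

For 4 differentials, the code 3 expands to 0, 0, 1, 1 and the code 15 to 1, 1, 1, 1, matching the list in the text.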
40. ee 3) are acceptable and can be read in the Inferential Statistics application. One can analyze Within a single file (population) or Between different files (populations). This application is similar to the Descriptive Statistics application. The main difference between Inferential Statistics and Descriptive Statistics is that the former application provides Average values of parameters (A45) and the corresponding values of the Standard Error SE (A46) on the basis of the resampled data.

5.1 The Files Selection Sheet

By clicking the Inferential Statistics square in the VAT Main window, a new window will appear with one open sheet called Files selection. This sheet is subdivided into three parts (Fig. 5.1): two segments, left and middle, named source and work segment, respectively, and, to the right of these two segments, a list of parameter values.

The General Management Bar, found above most VAT windows, contains the options File, Change Application and Help. Under File there is the Exit option, which allows leaving the program and terminating the current VAT session. With the Change Application option one can switch to other sections of the program or return to the VAT Main Window. The Help button provides information on how to work with VAT.

The source segment on the left, with the legend "Binary files (after resampling)", provides a list of data files to select from for further analysis. The small path line field labeled Folder, above the source segment, di
41. egative (e.g. avirulence, resistance) responses, respectively. We denote by $q_i$ the frequency of appearance of 1 at the i-th differentiating factor $D_i$. For example, if the differentiating factors comprise a typical set of differential host lines used in virulence tests of plant pathogens, $q_i$ would be the frequency of virulence in population P on the i-th differential line. (A13) For dominant molecular markers or any diallelic loci, $q_i$ would be the frequency of the dominant allele at the i-th locus in population P. We denote by $T_r$ a group of individuals of type r. The frequency of individuals of type r in population P is denoted by $p_r$. Depending on the nature of the data, types of individuals may mean pathotypes, races, phenotypes, genotypes, etc.

Diversity within population: pattern-based methods

Let population P consist of n individuals $x_1, x_2, \ldots, x_n$, and let the dissimilarity between individuals be assessed using any measure ρ (e.g. the simple mismatch m, Jaccard j or Dice d coefficients of dissimilarity).

ADW, Average Difference Within population

The Average Difference Within population P with respect to dissimilarity ρ, $\mathrm{ADW}_\rho(P)$, is determined as follows:

$\mathrm{ADW}_\rho(P) = \frac{1}{n^2} \sum_{i,j=1}^{n} \rho(x_i, x_j)$ (A16)

KW, Kosman diversity Within population

The Kosman diversity $\mathrm{KW}_\rho(P)$ within population P of n individuals is defined as follows:

$\mathrm{KW}_\rho(P) = \frac{\mathrm{Ass}^{\max}_\rho(P, P)}{n}$ (A17)

(Kosman 1996; Kosman and Leonard 2007). For given popula
42. erted to the Binary one in Resampling and Coding (see 3).

Regular data: Check the Regular box. The Min value and Max value fields will then be available to specify the range of the data by entering a minimum value (Min value) and a maximum value (Max value) according to the assessment scale underlying the data. Notice that the Min value and Max value are mandatory parameters; they can be arbitrary nonnegative numbers such that Min value is less than Max value.

Binary data: Check the Binary box. The Min value and Max value fields will then automatically be set to 0 and 1, respectively.

Octal data: Check the Octal box. The Code length field will then be available to specify the length of the Octal Code input values (if None is selected in the Differential Set field). For example (also see 2.2), if an underlying differential set consists of 15 differentials, the corresponding Octal Code length should be 5, and the Octal Code data should be strings of 5 digits from the set of eight integers (0, 1, 2, 3, 4, 5, 6, 7), e.g. 35164, 13704, 34520, 02250, etc. The Code length is a mandatory parameter. Note that an Octal data table will possess only a single column of codes, so the Number of columns is an irrelevant parameter and the corresponding field will be disabled.

Hexadecimal data: Check the Hexadecimal box. The Code length field will then be available to specify the length of the Hexadecimal Code input values (if None is selected in the Differential Set field). For example (also s
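The Octal Code constraints above (code length equal to one third of the number of differentials, digits 0 to 7 only) lend themselves to a simple validity check. The Python sketch below is illustrative only; both function names are hypothetical and the check is not taken from VAT itself.

```python
def octal_code_length(number_of_differentials):
    """Code length for a differential set whose size is a multiple of three."""
    if number_of_differentials <= 0 or number_of_differentials % 3 != 0:
        raise ValueError("number of differentials must be a positive multiple of 3")
    return number_of_differentials // 3


def is_valid_octal_code(code, code_length):
    """True if code is a string of exactly code_length digits from 0..7."""
    return len(code) == code_length and all(ch in "01234567" for ch in code)
```

For 15 differentials the code length is 5, so 35164 and 02250 pass the check, while a string containing the digit 8 or a string of the wrong length does not.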
43. [Screenshot: Resampled Files sheet with the List of Resampled Files, the Number of samples and the Sample size.]

Figure 3.8 Resampled Files sheet. Four files (pop1.vat, pop2.vat, pop3.vat and pop4.vat) were resampled; the Number of samples and the Sample size are 10 and 15, respectively.

Once Resampling is run, the Resampling parameters will be displayed at the bottom right corner of the Files Selection sheet in the Resampling and Coding window (Fig. 3.1). Clicking the Excel button at the top will open an Excel file and copy the List of Resampled files into an Excel sheet.

4 Descriptive Statistics

This VAT application allows you to characterize individual populations (within) or groups of populations (between) by purely descriptive statistical methods. You can analyze all data that were previously prepared with the applications Data Entry (2) and Conversion (3.1) and are available as .vat type files. No other files will be accepted for the following descriptive analysis procedures. It is possible to perform a Within analysis of single populations or a Between analysis of pairs of populations. Inferential Statistics can be performed in the corresponding application (see 5), however only after Resampling (see 3).

Note: Formulae of all parameters and indices calculated by the VAT, brief explanations and the list of relevant literature can be found in the Appendix. The Appendix formulae are designated A1
44. Figure 5.1 File selection sheet in the Inferential Statistics section. Three files were transferred to the work segment (middle) from the J:\VAT folder: pop1.vat, pop2.vat, pop3.vat. The rest of the folder's suitable files are displayed on the left in the source segment (pop4.vat). Only Binary and Resampled .vat files are displayed. On the right there is a list of Dissimilarity coefficients, where the Simple mismatch box is checked.

Save to File: Before running the Between or Within options, one can choose to save the results as a text file. In order to do this, check the Save to File box on the right and select a folder and file name. After running Between or Within, the results of the calculations will be saved in the corresponding .txt file.

Once the desired files are in the work segment (middle), two types of population analyses can be activated: Within or Between.

Within includes a series of population analyses that are applied separately to each file (population) selected in the work segment. By clicking the Within button, all these analyses will be performed. The Within population analysis is further explained in 5.2.

Between includes a series of analyses for pairwise comparison of populations (files). If more than two files are selected, then all possible pairs will be analyzed. Note that the Between procedure can only be run if the number and names of columns are identical
45. file and copy the contents of both tables from the VAT sheet into Excel sheets.

[Screenshot: Phenotype characterization sheet of the Descriptive Statistics window (Within population analysis of original data), with the Phenotype frequencies table on top, the Virulence complexities table below, and the buttons FreqPhen SE, FreqPhen and Excel.]

Figure 4.2 Phenotype characterization window. The Phenotype frequency table (upper table) contains the list of 5 different phenotypes revealed in the chosen file (pop1.vat) with their parameters. These phenotypes are also analyzed in the Virulence complexities table (bottom table). At the top right corner the FreqPhen option is chosen.

The Phenotype Frequencies Table

This table lists all phenotypes found in the underlying sample. The first column assigns names to each phenotype (ph1, ph
46. he resampled data.

The Diversity Parameters Sheet

This sheet provides the resampling-based estimates (A45, A46) of different measures of diversity within populations, namely: Nei Diversity Hs (A27), Simpson Si (A21), Normalized Shannon Sh (A25), Kosman Index K (A29), Stoddart St (A22), Shannon SH (A23), Evenness E (A24), Gleason G (A20), Average dissimilarity within ADW (A16) and Kosman diversity within KW (A17). The indices ADW and KW can be calculated with regard to the three commonly used dissimilarity measures: Simple Mismatch m (A1), Jaccard j (A2) and Dice d (A3). Values of all relevant diversity parameters are displayed in a table where each column corresponds to one of the analyzed populations. Average values (A45) of the parameters, alone or together with the Standard Error SE (A46), can be displayed by clicking the appropriate button, Average or Average SE, respectively. Clicking the Excel button at the top right will open an Excel file and copy the contents into an Excel worksheet.

The Comments Sheet

This sheet displays the available comments about each analyzed population; these comments had to be included beforehand in the Data entry section (see 2.4 and Fig. 2.14).

5.3 Between populations analysis

Clicking the Between button on the top right in the Files selection sheet (see 5.1 and Fig. 5.1) of the Inferential Statistics section activates several procedures for comparisons between
47. henotypes was found in all 3 populations (pop1.vat, pop2.vat, pop3.vat).

The Virulence Frequency Sheet

This sheet displays the proportion of the sampled isolates with a virulent reaction on each differential from the given differential set (Fig. 4.9). In other words, the frequency of 1s in the corresponding column of the original data table is displayed. The leftmost column lists all members (host plants) of the underlying differential set, labeled by default as Col1, Col2, etc. The first two rows serve as the table's header and provide the designations and file names of the analyzed populations. Values of virulence frequencies, alone or together with the Standard Error (SE), can be displayed by clicking the appropriate button, FreqVir or FreqVir SE, respectively. Clicking the Excel button at the top right opens an Excel file and copies the contents into an Excel sheet.

[Screenshot: Virulence frequency sheet of the Descriptive Statistics window (Between population analysis of original data), with one row per differential and one column per population (pop1.vat, pop2.vat, pop3.vat), and the buttons FreqVir SE, FreqVir and Excel.]

Figure 4.9 Virulence frequency sheet. Virulence frequencies on the set of 16 differentials (columns) are analyzed with respect to 3 different popul
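The virulence frequency on a differential, described above as the frequency of 1s in the corresponding column of the data table, can be computed with a few lines of Python. The function name is hypothetical, and the sketch covers only the frequencies themselves, not the Standard Error, whose formula is not given in this excerpt.

```python
def virulence_frequencies(binary_matrix):
    """Proportion of 1s in each column of a binary data table.

    binary_matrix: non-empty list of rows (one row per isolate), all rows
    of equal length, one column per differential.
    """
    if not binary_matrix:
        raise ValueError("binary_matrix must contain at least one row")
    width = len(binary_matrix[0])
    if any(len(row) != width for row in binary_matrix):
        raise ValueError("all rows must have the same number of columns")
    n = len(binary_matrix)
    # zip(*rows) iterates over the columns of the table.
    return [sum(column) / n for column in zip(*binary_matrix)]
```

Running the function once per population file gives one column of the sheet per population.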
48. hows the data in the three major nomenclatures, if applicable: the Octal, Hexadecimal, and Binary Decimal codes. The Octal and Hexadecimal codes require suitable numbers of columns (multiples of three and four, respectively) and will only be displayed if these conditions are met. Note that if the number of columns exceeds 63, then the Binary Decimal code will not be calculated. [Screenshot: Resampling and Coding window, Coded representation sheet for pop4.vat; Regular data, No cols 16, No rows 12, Min value 0, Max value 9, Cut Off Value 3.] Figure 3.5 Coded representation sheet. Codes of rows in pop4.vat are represented. Only the Hexadecimal and Binary Decimal codes are available. The Octal code cannot be calculated since 16 (the number of columns) is not a multiple of three, and the corresponding column is empty. [Screenshot: Coded representation sheet for p3.vat; No cols 12, No rows 21.] Figure 3.6 Coded representation sheet. Codes of rows in p3.vat are represented. Since the number of columns is 12, it is a multiple of 3 as well as 4, henc
49. hree Implemented Race Codes; The Virulence Complexities Table; The Comparison of Individuals Sheet; The Virulence Frequency Sheet; The Comparison of Differentials Sheet; The Diversity Parameters Sheet; The Comments Sheet; 4.3 Between Populations Analysis; The Pairwise Common Phenotype Sheet; The Comparison of Phenotypes Sheet; The Overall Common Phenotypes Sheet; The Virulence Frequency Sheet; The Between Sample Distance Sheet; The Comments Sheet; 5 INFERENTIAL STATISTICS; 5.1 The Files Selection Sheet
50. ile This type of file is the system oriented file that is acceptable and can be used in all applications of the program Once the System VAT type of file chosen an input file must be determined using the standard browsing tools by clicking the Browse button Only folders and available vat files will appear Once the file is found and entered in the File name field click the Next button at the bottom of the Open existing file window The Original Data sheet Fig 2 3 will be opened and the requested data appear in the grid Clicking the Back button at the bottom will return to the Data Entry window Open existing Text File txt file This type of file is acceptable and can be used only in the Data Entry applications Once the Text type of file chosen an input file must be selected using the standard browsing tools by clicking the Browse button Only folders and available txt files will appear Once the file is found and entered in the File name field the Data Type must be determined Click proper circle to select among the five listed data types Regular Binary Octal Hexadecimal or Binary Decimal A wrongly specified type of data will result in an invalid data error message at the next stages Click Next to continue Clicking the Back button at the bottom will return to the Data Entry window Clicking Next will open a new window Preview of file content unformatted Fig 2 16 This window displays the data of the selected txt file in an
51. important results of calculations obtained in the various sections e g Descriptive Statistics Inferential Statistics and Miscellaneous are not saved in the corresponding vat rat file The major results of the calculations as well as original and transformed encoded data can be exported to Excel at every step in the program for further study Data can also be saved as a txt file Another option Save to File is available within the sections By clicking this option you can open a new file and save the results of calculations as a standard text file outputfilename txt i e with extension txt However the readability of the outputfilename txt might be poor and need some additional formatting by the user For example if one of the analyses did not develop results then a headline without any corresponding results may appear in the text file Moreover sometimes columns and titles will not be aligned Therefore this option should be used mainly for backup purposes 1 2 Using VAT Getting started Startup In order to run the VAT go into a folder where VAT software VATsoftware folder was installed downloaded and look for the executable file VAT exe according to the following route VATsoftware VAT bin VAT exe Click VAT exe and the VAT Main Window should appear It is recommended to create a shortcut of VAT exe and include the corresponding icon on the desktop to start the VAT Basic tools and displays The VAT program allows th
52. into groups. Now each triplet can be encoded by a single digit from 0 to 7 according to the following rule:
Triplet   Octal Code   Rule
000       0            $0 = 0\cdot 2^2 + 0\cdot 2^1 + 0\cdot 2^0$
001       1            $1 = 0\cdot 2^2 + 0\cdot 2^1 + 1\cdot 2^0$
010       2            $2 = 0\cdot 2^2 + 1\cdot 2^1 + 0\cdot 2^0$
011       3            $3 = 0\cdot 2^2 + 1\cdot 2^1 + 1\cdot 2^0$
100       4            $4 = 1\cdot 2^2 + 0\cdot 2^1 + 0\cdot 2^0$
101       5            $5 = 1\cdot 2^2 + 0\cdot 2^1 + 1\cdot 2^0$
110       6            $6 = 1\cdot 2^2 + 1\cdot 2^1 + 0\cdot 2^0$
111       7            $7 = 1\cdot 2^2 + 1\cdot 2^1 + 1\cdot 2^0$
This is a one-to-one correspondence between the eight digits and all possible 0/1 triplets. Substituting one of the digits from 0 to 7 for the corresponding triplet in the binary pattern of an individual, the individual's Octal Code of length n/3 is obtained. Obviously, any Octal Code can be transformed to the binary vector according to the same rule. For example, the following pairs of binary vectors of length 15 and octal codes of length 5 are results of the corresponding transformations:
35164 <-> 011 101 001 110 100 <-> 011101001110100
13704 <-> 001 011 111 000 100 <-> 001011111000100
If the number of individuals in the data table equals k, then Octal data are determined as a table of k rows and one column of octal codes of a fixed length. Digits 0, 1, 2, 3, 4, 5, 6, 7 are the only valid symbols in the case of Octal data, and all entries should have the same number of symbols (fixed Code length). Hexadecimal data: If the number n of differentials in a Differential Set is a multiple of four, i.e. $n = 4l$, then a binary reaction pattern of an individual on the given
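The triplet rule above can be sketched in a few lines of Python. This is only an illustration of the encoding described in the text, not VAT's own code; the function names `binary_to_octal` and `octal_to_binary` are hypothetical.

```python
def binary_to_octal(pattern: str) -> str:
    """Encode a 0/1 pattern as octal digits, one digit per triplet."""
    if len(pattern) % 3 != 0:
        raise ValueError("pattern length must be a multiple of three")
    if set(pattern) - {"0", "1"}:
        raise ValueError("pattern may contain only the symbols 0 and 1")
    # Each triplet b2 b1 b0 becomes the digit b2*4 + b1*2 + b0.
    return "".join(
        str(int(pattern[i:i + 3], 2)) for i in range(0, len(pattern), 3)
    )


def octal_to_binary(code: str) -> str:
    """Decode an octal code back to the 0/1 pattern (three bits per digit)."""
    if set(code) - set("01234567"):
        raise ValueError("only the digits 0 to 7 are valid in an Octal code")
    return "".join(format(int(digit), "03b") for digit in code)


print(binary_to_octal("011101001110100"))
print(octal_to_binary("13704"))
```

Both examples from the text round-trip through these two functions, which is a quick way to check a hand-coded data file before import.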
53. lays the binary data matrix after conversion (Fig. 3.4). For non-binary input data (i.e. Regular, Binary Decimal, Octal, or Hexadecimal), a conversion by transfer took place; in the case of Regular data, this conversion was performed according to a user-defined Cut Off Value. For original input of type Binary, the tables in the Original input and the Binary representation sheets coincide. [Screenshot: Resampling and Coding window, Binary representation sheet for pop4.vat with columns Col1 to Col16; Regular data, No cols 16, No rows 12, Min value 0, Max value 9, Cut Off Value 3.] Figure 3.4 Binary representation sheet displaying the data of pop4.vat after conversion to the binary form, applying a Cut Off Value of 3 (compare with original Regular input in Fig. 3.3, for example col 7 rows and 2). Clicking the Excel button at the top will open an Excel file and copy the contents of all four Coding sheets into four Excel sheets. The Coded Representation Sheet: This sheet s
54. lues of the Virulence Complexity (A6), the Relative Virulence Complexity (A7), and the Virulence Uniformity (A8), respectively. The last two rows of the Virulence Complexities table contain the average values of VC, RVC, and VU calculated over all phenotypes and over all individuals, respectively. The Comparison of Individuals Sheet: This sheet displays all pairwise dissimilarities between individuals (isolates) of the population (Fig. 4.3) in the form of a lower triangular matrix. By checking the Square box on the top right, the dissimilarities will be displayed in a symmetric square matrix. The Current file field allows selecting among all analyzed files. The Dissimilarity measures field allows choosing from all measures that have already been selected for analysis in the Files selection sheet (see Fig. 4.1, right side). VAT allows the choice among the following three commonly used measures: Simple mismatch (A1), Jaccard dissimilarity (A2), and Dice dissimilarity (A3). [Screenshot: Descriptive Statistics window, Comparison of individuals sheet showing the matrix of dissimilarities between individuals i1, i2, ... under the Simple mismatch measure.]
55. mal and Binary Decimal data into Binary data Conversion 3 To translate a Binary data set to all possible codes Octal Hexadecimal and Binary Decimal and one type of encoded data to another Coding 4 To draw random samples with replacement from a given data set Resampling for Inferential Statistics The Conversion of non binary Regular Octal Hexadecimal and Binary Decimal to Binary data see 3 1 is the first mandatory and crucial task of the Resampling and Coding section This conversion step is an essential precondition for further analysis since the Descriptive Statistics as well as the Inferential Statistics applications both can only operate on binary data sets Conversion Regular data is conducted as soon as a user defined cut off value is devised The Coding function allows viewing the original input data as well as the converted binary data Furthermore it displays the Octal Hexadecimal and Binary Decimal representations nomenclatures of the Binary data The Resampling function generates a user defined number of fictive samples where rows are randomly drawn with replacement from a Binary data set The size of the computer generated fictive samples is also user defined Notice that the Inferential Statistics tool relies completely on these collections of randomly resampled sets 3 1 The Files Selection Sheet By clicking the Resampling and Coding square in the VAT Main window a new window will appear with
56. meter and the corresponding field will be disabled Number of rows No rows It is mandatory to enter number of rows individuals in a new data table Number of columns No cols It is mandatory to enter number of columns differentials in a new data table if none of the existing differential sets is selected None is chosen in the Differential Set field There are two buttons Back and Next at the bottom of Data Parameters window Fig 2 13 Click Back to return to the main Data entry window If values of all mandatory parameters have already been selected or entered by clicking Next a new window with two sheets Original Data Fig 2 14 see 2 3 and Comments will appear The Original Data sheet Fig 2 14 also see 2 3 is composed of an empty grid table with dimension according to the specified data parameters Comments about the data can be entered in the Comments sheet The General Management Bar can be found above most VAT windows contains File Change Application and Help options Under File there is the Exit option which allows user to leave the program and terminate current VAT session Change Application provides a switch to other VAT Applications or return to the VAT Main Window The Help function under development is for information about 24 2 Data Entry VAT In the Original Data sheet the General Management Bar contains an important additional option namely Edit as well as an Excel button ei Data En
57. minimum genetic distance (A33) can be represented as the distance of average differences between populations $P_1$ and $P_2$ (A30) with respect to the simple mismatch dissimilarity: $N_M(P_1,P_2) = \frac{J_1 + J_2}{2} - J_{12} = ADB_m(P_1,P_2) - \frac{ADW_m(P_1) + ADW_m(P_2)}{2} = DAD_m(P_1,P_2)$. $N_S$: Nei standard genetic distance between populations. Nei's standard genetic distance between two populations $P_1$ and $P_2$ (Nei 1972, 1978) is defined as follows: $N_S(P_1,P_2) = -\ln\frac{J_{12}}{\sqrt{J_1 J_2}}$ (A38), where $J_1$, $J_2$ and $J_{12}$ are calculated according to equations (A34)-(A36). The Nei standard genetic distance (A38) can be expressed as the following function of the measures of average difference within and between populations $P_1$ and $P_2$: $N_S(P_1,P_2) = -\ln\frac{1 - ADB_m(P_1,P_2)}{\sqrt{\left(1 - ADW_m(P_1)\right)\left(1 - ADW_m(P_2)\right)}}$ (A39) (Kosman and Leonard 2007). NSE: Normalized Squared Euclidean distance between populations. The normalized squared Euclidean distance between two populations $P_1$ and $P_2$ is determined on the basis of frequencies of positive responses of individuals on all differentiating factors $D_s$ ($s = 1, 2, \dots, k$): $NSE(P_1,P_2) = \frac{1}{k}\sum_{s=1}^{k}\left(q_{1s} - q_{2s}\right)^2$ (A40). One can prove that the distance of average differences between populations $P_1$ and $P_2$ (A30) with respect to the simple mismatch dissimilarity equals Nei's minimum genetic distance (A33) and the normalized squared Euclidean distance (A40): $DAD_m(P_1,P_2) = N_M(P_1,P_2) = NSE(P_1,P_2)$.
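The equality between Nei's minimum distance and the normalized squared Euclidean distance can be checked numerically. The sketch below assumes the usual Nei identities for two-allele loci, $J_1 = \frac{1}{k}\sum_i [q_{1i}^2 + (1-q_{1i})^2]$ and $J_{12} = \frac{1}{k}\sum_i [q_{1i}q_{2i} + (1-q_{1i})(1-q_{2i})]$; the function names are hypothetical and the inputs are lists of per-differential frequencies of 1s.

```python
import math


def _identity(q1, q2):
    """Average identity J over k diallelic loci (J1 when q1 is q2, else J12)."""
    if len(q1) != len(q2) or not q1:
        raise ValueError("frequency vectors must be non-empty and equally long")
    return sum(a * b + (1 - a) * (1 - b) for a, b in zip(q1, q2)) / len(q1)


def nei_minimum(q1, q2):
    """Nei's minimum genetic distance: (J1 + J2) / 2 - J12."""
    return (_identity(q1, q1) + _identity(q2, q2)) / 2 - _identity(q1, q2)


def nei_standard(q1, q2):
    """Nei's standard genetic distance: -ln(J12 / sqrt(J1 * J2)).

    Undefined (math.log raises ValueError) when J12 is zero.
    """
    j1, j2 = _identity(q1, q1), _identity(q2, q2)
    return -math.log(_identity(q1, q2) / math.sqrt(j1 * j2))


def nse(q1, q2):
    """Normalized squared Euclidean distance between frequency vectors."""
    if len(q1) != len(q2) or not q1:
        raise ValueError("frequency vectors must be non-empty and equally long")
    return sum((a - b) ** 2 for a, b in zip(q1, q2)) / len(q1)


q_a = [0.4, 0.667, 0.7, 1.0]
q_b = [0.7, 0.2, 0.733, 0.0]
print(nei_minimum(q_a, q_b), nse(q_a, q_b), nei_standard(q_a, q_b))
```

The first two printed values agree up to floating-point rounding, which mirrors the identity stated in the text.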
58. mity of s phenotypes genotypes ph J 2 s sampled from population P is determined as follows l s U p P 7 HULPE A10 It ranges from 0 to 1 0 lt U P lt 1 Designations for virulence data Virulence complexity VC relative virulence complexity RVC virulence uniformity of individual VU and average virulence uniformities of individuals sample population uniformity VU and phenotypes genotypes VU are determined according to formulae A6 A7 A8 A9 and A10 respectively Designations and formulae for resistance data For binary resistance data 1 and 0 usually represent susceptibility and resistance respectively of an individual for any given pathogen Resistance complexity CR relative resistance complexity RCR resistance uniformity of individual UR and average resistance uniformities of individuals sample population uniformity UR and phenotypes genotypes URp are determined according to formulae A11 A12 A13 A14 and A15 respectively CRU n A11 RCR F A12 UR 1 2 min RCR J 1 RCRU UR P e a UR I A14 UR P SUR Ph A15 j l 4 Diversity within and among populations Consider a sample from population P which consists of n individuals We assume that all individuals are tested on k differentiating factors D D2 Dr and represented by binary patterns of 1s and Os for positive e g virulence susceptibility and n
59. mments sheet. The Comparison of Phenotypes Sheet: This sheet displays the dissimilarity between phenotypes of the two populations (files) under comparison, and it comprises three tables (Fig. 4.7). The Current pair of files field at the top allows selecting among the analyzed file pairs. The Dissimilarities between phenotypes matrix displays the pairwise dissimilarity values between phenotypes of the two populations. The field Dissimilarity measure at the top allows choosing from all those measures (Simple Mismatch, Jaccard, Dice) which were previously selected for analysis in the Files selection sheet (see Fig. 4.1, right side). The rows are linked to the phenotypes of the first population (the file name on the left in the Current pair of files field) and the columns to the second one. The Phenotype characterization tables in the lower part of the sheet list all phenotypes for the current pair of populations. The first column assigns a number to each phenotype (ph1, ph2, ...). The second column displays the associated binary vector for each phenotype. Recall that the dimension of the vector (its length) equals the number of underlying differentials. The next three columns contain the corresponding Octal, Hexadecimal, and Binary Decimal race codes (see 2.2). Finally, the Frequency column gives the frequency of each phenotype.
60. n button all these analyses will be performed Once Within is activated it could take a few minutes At the bottom of the window the file name currently analyzed and the elapsed running time are displayed as well as a blue bar conveying graphically the approximate time still required for completion As soon as the analysis is completed six new sheets will appear behind the Files selection sheet only the name tags of the six new worksheets will be visible right above the upper edge of the front sheet Clicking such name tag brings the corresponding sheet to the foreground in Fig 4 2 for example Phenotype characterization is the visible front worksheet while Files selection and the other five sheets Comparison of Individuals Virulence frequency Comparison of differentials Diversity parameters and Comments are in the back Next we explain the new worksheets in more detail 40 4 Descriptive Statistics The Phenotype Characterization Sheet This sheet shows in two tables the analysis of Phenotype frequency and Virulence complexities see Fig 4 2 In the current file s field above the table one can select from all analyzed files On the right to this field are three buttons FreqPhen SE FreqPhen and Excel While FreqPhen provides the proportion of each phenotype in the underlying sample A18 FreqPhen SE provides also the standard error SE A19 calculated under the Binomial model Clicking the Excel button will open an Excel
61. ng differential set or to choose the option New and to enter a new differential set New Differential Set Once the New option is chosen the field Number of differentials columns will be available Enter the number of differentials hosts isogenic lines etc in the new set and click Go A grid in the Columns Name box will be created where number of rows equals the number of differentials number of columns in tables of regular and binary data Attributes of the differential set Name Pathogen Host Country Reference and Author must can be entered in the appropriate slots Differential names and comments can be inserted in the corresponding grid cells in the Column name box Once the differential set characterization is completed click the Save button at the bottom in order to save your differential set under the given name Notice that a 2 Data Entry user defined Name of the differential set is mandatory in order to save it Otherwise once you attempt to save an error message will appear After saving the differential set will be available for use in the Enter New Data section see 2 4 Existing Differential Set Selection of an existing differential set in the top box of the Differential Set window will automatically display all available information about the set attributes names of differentials and comments In addition to editing the existing information the possibility to modify the differential set is facilitated by
62. [Screenshot: Data Entry window with the Original Data and Comments sheets; the grid holds Regular data.] Figure 2.3 Original Data Sheet. Regular data is filled in here. In cell (Row11, Diff3) the number 6 has just been entered; therefore the program is in the Pencil Mode (marked next to row 11). The General management bar at the top of the Data Entry window includes the following five options: File, Change Application, Edit, Excel, and Help. File: Under this label there are three alternatives, namely Save, Save as, and Exit. Save: This option is for saving data as a system .vat file (filename.vat; see 1.1). Once the Save option is chosen, two different modes of action are possible. If the original data were imported from an already existing .vat file, then a message window will appear asking whether to overwrite the existing file. If new data are to be saved in a new .vat file, then the standard Microsoft Save as window will appear (Fig. 2.4) to choose the file name and path. Note once again that only .vat files can be saved here. In all VAT Applications except Data Entry, only .vat files can serve as input and be processed. To save data as a text file (filename.txt), choose the Save As option. Save As: This option is f
63. nnon and Weaver 1949) is defined as follows: $SH(P) = -\sum_{r=1}^{s} p_r \ln p_r$ (A23). Values of Shannon's diversity index range between 0 and $\ln s$: $0 \le SH(P) \le \ln s$. E: Evenness of population. The measure of population evenness is expressed by the ratio of the Shannon index (A23) to its maximum value $\ln s$, $2 \le s \le n$ (Sheldon 1969): $E(P) = \frac{SH(P)}{\ln s} = -\frac{1}{\ln s}\sum_{r=1}^{s} p_r \ln p_r$ (A24). This evenness parameter ranges between 0 and 1: $0 \le E(P) \le 1$. Sh: Normalized Shannon diversity within population. The Normalized Shannon index of diversity within population P is defined as follows: $Sh(P) = \frac{SH(P)}{\ln n} = -\frac{1}{\ln n}\sum_{r=1}^{s} p_r \ln p_r$ (A25). This diversity index also ranges between 0 and 1: $0 \le Sh(P) \le 1$. The Sh index reflects both the evenness and richness of the population because $Sh(P) = \frac{SH(P)}{\ln n} = \frac{SH(P)}{\ln s}\cdot\frac{\ln s}{\ln n} = E(P)\cdot\frac{\ln s}{\ln n}$, where the second factor measures the population richness ($0 \le \frac{\ln s}{\ln n} \le 1$), the minimum and maximum values being obtained in a population in which all individuals are of the same type ($s = 1$) and of different types ($s = n$), respectively. Diversity within population: trait-based methods. Frequency of positive response: Let population P consist of n individuals $x_1, x_2, \dots, x_n$ tested on a set of k binary differentiating factors $D_j$ ($j = 1, 2, \dots, k$), positive and negative responses being represented by 1 and 0, respectively. Frequencies of appearance of 1 at the j-th differentiating factor $D_j$ (proportion
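As a minimal sketch of formulae (A23)-(A25), the three Shannon-type indices can be computed from a list of phenotype labels, one label per sampled individual. The function name `shannon_indices` and the choice to return `None` where an index is undefined (s = 1 for E, n = 1 for Sh) are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_indices(phenotypes):
    """Return (SH, E, Sh) for a sample given as one phenotype label per individual."""
    n = len(phenotypes)
    if n == 0:
        raise ValueError("the sample must contain at least one individual")
    counts = Counter(phenotypes)
    s = len(counts)  # number of distinct phenotypes in the sample
    # SH = -sum p_r ln p_r, written as sum p_r ln(1/p_r) with p_r = count / n.
    sh = sum((c / n) * math.log(n / c) for c in counts.values())
    evenness = sh / math.log(s) if s > 1 else None   # E is defined for 2 <= s <= n
    normalized = sh / math.log(n) if n > 1 else None  # Sh = SH / ln n
    return sh, evenness, normalized


sample = ["ph1", "ph1", "ph2", "ph2"]
print(shannon_indices(sample))
```

For this sample of two equally frequent phenotypes among four individuals, SH equals ln 2, the evenness is 1, and Sh equals ln 2 / ln 4, so the richness factor ln s / ln n is what pulls Sh below E.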
64. of 1s in the j-th column of the binary matrix for population P) is denoted by $q_j$ ($j = 1, 2, \dots, k$). The theoretical standard error of the $q_j$ estimate is expressed by the formula $SE(q_j) = \sqrt{\frac{q_j(1 - q_j)}{n}}$ (A26) for $j = 1, 2, \dots, k$. Hs: Nei diversity within population. Nei's measure of the average gene diversity per locus (Hs) in population P (Nei 1973) is determined by the formula $H_S(P) = \frac{1}{k}\sum_{j=1}^{k} H_j(P) = \frac{2}{k}\sum_{j=1}^{k} q_j(1 - q_j)$ (A27), where k is the total number of loci (differentiating factors), $H_j(P) = 1 - q_j^2 - (1 - q_j)^2$, and $q_j$ and $1 - q_j$ are the frequencies of the two alleles at the j-th diallelic locus (e.g. $q_j$ virulence frequency, $1 - q_j$ resistance frequency). Nei's diversity Hs ranges between 0 and 0.5 for binary data: $0 \le H_S(P) \le \frac{1}{2}$. It was proved (Kosman 2003a) that Nei's measure of the average gene diversity per locus $H_S(P)$ and the index of average difference (A16) with respect to the simple mismatch coefficient are identical measures of diversity within population: $ADW_m(P) = H_S(P)$ (A28). K: diversity within population. The K index for measuring diversity within population is determined as follows: $K(P) = \frac{1}{k}\sum_{j=1}^{k} K_j(P) = \frac{1}{k}\sum_{j=1}^{k} \min\{2q_j,\ 2(1 - q_j)\}$ (A29) (Manisterski et al. 2000), where $q_j$ is the frequency of positive response (appearance of 1) at the j-th character in population P ($j = 1, 2, \dots, k$), and $K_j(P) = \min\{2q_j,\ 2(1 - q_j)\}$ is the diversity within population at differentiating character $D_j$. The K index was designated $H_{kp}$ in Manister
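The two trait-based indices can be sketched directly from a binary data matrix. The code below assumes that $H_j = 1 - q_j^2 - (1-q_j)^2 = 2q_j(1-q_j)$ and $K_j = \min\{2q_j, 2(1-q_j)\}$ are averaged over the k differentials, as the surrounding definitions state; the function names and the list-of-lists input format are hypothetical.

```python
def positive_frequencies(matrix):
    """Frequency of 1s in each column of a binary matrix (rows are individuals)."""
    n = len(matrix)
    if n == 0 or not matrix[0]:
        raise ValueError("the matrix must have at least one row and one column")
    k = len(matrix[0])
    if any(len(row) != k for row in matrix):
        raise ValueError("all rows must have the same number of columns")
    return [sum(row[j] for row in matrix) / n for j in range(k)]


def nei_diversity(q):
    """Nei's Hs: the average of 2 * q_j * (1 - q_j) over all differentials."""
    return sum(2 * x * (1 - x) for x in q) / len(q)


def kosman_k(q):
    """Kosman's K: the average of min(2 * q_j, 2 * (1 - q_j)) over all differentials."""
    return sum(min(2 * x, 2 * (1 - x)) for x in q) / len(q)


data = [
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
]
q = positive_frequencies(data)
print(q, nei_diversity(q), kosman_k(q))
```

A column that is all 0s or all 1s contributes nothing to either index, while a column split evenly contributes the maximum (0.5 to Hs, 1 to K).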
65. one open sheet called Files selection This sheet is subdivided into three parts Fig 3 1 two segments left and middle named source and work segment respectively and right of these two segments a list of parameter values The General Management Bar found above most VAT windows contains the options File Change Application and Help Under File there is the Exit option which allows leaving the program and terminating current VAT session With the Change Application option one can switch to other sections of the program or return to the VAT Main Window The Help button provides information how to work with VAT 29 3 Resampling and Coding The source segment on the left with legend All valid and complete system files provides a list of data files to select from for further analysis The small path line field labeled Folder above the source segment displays the full path of the active folder which holds the files listed in the source segment Note that exclusively vat files are listed since they are the only ones to be processable in this section To select another folder click the Browse button next to the path line field and use the browse option the listed vat files in the source segment will change accordingly Conversion by Transfer One may transfer some specific files separately by clicking gt or all listed files by clicking gt gt from the left source segment into the work segment middle
66. op up (Fig. 3.7). In this window, the Number of samples that will be generated (default is 100) and the Sample size (number of rows in each sample) must be determined. Note that the scale of the resampled data correlates with the level of significance of the statistical analysis in the Inferential Statistics section (5). On the other hand, large samples and a high number of samples lead to longer processing times. This is especially critical when running several files under the Between population analysis (5.3), where completing the analysis could take up to several hours. Figure 3.7 Resampling window. Number of samples is set to 10 and Sample size is set to 15. When the Resampling parameters are determined, click OK in the Resampling window. Resampling with these parameters will be done for all files in the work segment (middle). If one of the files was already resampled in a previous analysis, then a message will appear and the user is required to choose whether to run Resampling again under the new parameters. After Resampling is accomplished, the label of a new sheet, Resampled Files, will appear. Clicking Resampled Files will display a list of all processed files and their resampling parameters, Number of samples and Sample size (Fig. 3.8).
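Drawing rows at random with replacement, as the Resampling window describes, can be sketched as follows. This illustrates the general technique only and is not VAT's implementation; the function name `resample`, the `seed` parameter, and the fallback of using the original number of rows when no sample size is given are assumptions.

```python
import random


def resample(rows, number_of_samples=100, sample_size=None, seed=None):
    """Draw `number_of_samples` samples of rows, each with replacement."""
    if not rows:
        raise ValueError("the data set must contain at least one row")
    rng = random.Random(seed)  # a seed makes the draw reproducible
    size = len(rows) if sample_size is None else sample_size
    return [rng.choices(rows, k=size) for _ in range(number_of_samples)]


binary_rows = ["1110", "0010", "1010", "0011"]
samples = resample(binary_rows, number_of_samples=10, sample_size=15, seed=1)
print(len(samples), len(samples[0]))
```

With the settings shown in Figure 3.7 (10 samples of size 15), every generated sample holds 15 rows, so rows from a smaller data set necessarily repeat.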
67. oposed changes of parameters and delimiters, then the only way of importing correct original data is to modify the data format in the requested Text File (.txt file). [Screenshots: View of tabulated data set window with the delimiter options Comma, Tab, Semicol, Space and Other; panel (a) shows the tabulated text file, panel (b) the same Regular data in a grid with columns Col1 to Col16 and rows Row1 to Row10.] Figure 2.17 View of tabulated data set window. (a) View of Regular data from the existing Text File. The delimiter Tab was automatically recognized. (b) The data appearance in the Original Data sheet (Fig. 2.3). 3 Resampling and Coding: This VAT application allows performing four operations, namely: 1. To assign a cut-off and transform Regular into Binary data (Conversion). 2. To transform Octal, Hexadeci
68. or $r = 1, 2, \dots, s$ will denote the frequency of individuals of type $T_r$ in populations $P_1$ and $P_2$, respectively, so that $p_{i1} + p_{i2} + \dots + p_{is} = 1$ ($i = 1, 2$). R: Rogers distance between populations. The Rogers distance between two populations $P_1$ and $P_2$ is determined as follows (Rogers 1972): $R(P_1,P_2) = \frac{1}{2}\sum_{r=1}^{s}\left|p_{1r} - p_{2r}\right|$ (A32). Values of the Rogers distance range between 0 and 1: $0 \le R(P_1,P_2) \le 1$. Distance between populations: trait-based methods. Let populations $P_1$ and $P_2$ consist of $n_1$ and $n_2$ individuals ($x_{11}, \dots, x_{1n_1}$ and $x_{21}, \dots, x_{2n_2}$, respectively); all individuals are tested on a set of k binary differentiating factors $D_j$ ($j = 1, 2, \dots, k$), and positive (e.g. virulence, susceptibility) and negative (e.g. avirulence, resistance) responses are represented by 1 and 0, respectively. $N_M$: Nei minimum genetic distance between populations. Nei's minimum genetic distance between two populations $P_1$ and $P_2$ (Nei 1972) is determined as follows: $N_M(P_1,P_2) = \frac{J_1 + J_2}{2} - J_{12}$ (A33), where $J_1 = \frac{1}{k}\sum_{i=1}^{k}\left[q_{1i}^2 + (1 - q_{1i})^2\right]$ (A34), $J_2 = \frac{1}{k}\sum_{i=1}^{k}\left[q_{2i}^2 + (1 - q_{2i})^2\right]$ (A35), and $J_{12} = \frac{1}{k}\sum_{i=1}^{k}\left[q_{1i}q_{2i} + (1 - q_{1i})(1 - q_{2i})\right]$ (A36), for k dimorphic loci with frequencies $q_{1i}$ and $1 - q_{1i}$, and $q_{2i}$ and $1 - q_{2i}$, of the two alleles at the i-th locus in populations $P_1$ and $P_2$, respectively. For binary data, $q_{1i}$ and $q_{2i}$ are the frequencies of appearance of 1 at the i-th differentiating factor $D_i$ for populations $P_1$ and $P_2$, respectively. It was proved (Kosman and Leonard 2007) that the Nei's
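A small sketch of the Rogers distance, assuming it is half the sum of absolute differences between the phenotype frequencies of the two populations (which gives the stated range from 0 to 1). The dictionary input format and the function name `rogers_distance` are illustrative choices; a phenotype missing from one population is treated as having frequency 0 there.

```python
def rogers_distance(p1, p2):
    """Half the summed absolute difference of phenotype frequencies.

    p1 and p2 map phenotype labels to frequencies that each sum to 1.
    """
    types = set(p1) | set(p2)  # all phenotypes seen in either population
    return 0.5 * sum(abs(p1.get(t, 0.0) - p2.get(t, 0.0)) for t in types)


pop_a = {"ph1": 0.5, "ph2": 0.5}
pop_b = {"ph1": 1.0}
print(rogers_distance(pop_a, pop_b))
```

Two populations with identical phenotype frequencies are at distance 0, and two populations that share no phenotype are at distance 1.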
69. or saving data as a system .vat file (filename.vat; see 1.1) under a new name or as a text file (filename.txt). By choosing this option, a system-specific Save as window will appear (Fig. 2.5). It is mandatory to select the Type of File (system .vat file or text .txt file), to choose the file name and path by clicking the Browse button, and to select a Column delimiter in the case of a text (.txt) file with Binary or Regular data (see 2.2). Once finished, click OK to save the data under the file name in the File Name box. Type of File selection: System .vat file. Selecting System VAT saves the data in a special system-oriented file format under any name with the fixed extension .vat (filename.vat). The .vat files are readable and can be used in all VAT Applications. Click the Browse button to choose the file name and path. The standard Microsoft Save as window will appear (Fig. 2.4) in order to browse to an appropriate folder and save the data under any name, but only with the extension .vat (filename.vat). Only folders and/or the system .vat files will be shown in the Save as window (Fig. 2.4). [Screenshot: Save As window listing folders and .vat files; Save as type: system VAT files (.vat).]
70. ransfer some specific files separately by clicking > or all listed files by clicking >> from the left source segment into the work segment (middle). Transfer only those files which actually have to be processed in this section. [Screenshot: Files selection sheet of the Descriptive Statistics window with the Folder field, the Browse button, the Within and Between buttons, and the Dissimilarity coefficients checkboxes (Simple mismatch, Jaccard dissimilarity, Dice dissimilarity).] Figure 4.1 Files selection sheet in the Descriptive Statistics section. Two files (pop1.vat and pop4.vat) were transferred to the work segment (middle) from the J VAT folder. The rest of the folder's files (pop2.vat and pop3.vat) are displayed in the source segment (left). Only .vat files with an already existing Binary representation of the original data are displayed. On the right there is a list of Dissimilarity coefficients, where the Simple mismatch box is checked. Save to File: Before running the Between or Within options, one can choose to save the results as a text file. In order to do this, check the Save to File box on the right and select a folder and file name. After running Between or Within, the results of the calculations will be saved in the corresponding .txt file. Once the desired files are in the work segment (middle), two types of population analyses c
71. rentials The next three columns contain the corresponding Octal Hexadecimal and Binary Decimal race codes see 2 2 Note The Octal or triplet code requires a binary vector of length divisible by three otherwise the VAT leaves the Octal column empty The Hexadecimal code requires a binary vector of length divisible by four otherwise the VAT leaves the Hexadecimal column empty The Binary Decimal code can be calculated only for binary vectors up to a length of 63 otherwise the VAT leaves the Binary Decimal column empty The last column of the Phenotype frequency table shows the Average A45 or Average SE A45 A46 depending on which button was activated one may toggle between both buttons The Virulence Complexities Table The left part of the Virulence Complexities table is labeled Original Data and contains exactly the same values as the corresponding table in Descriptive Statistics see Fig 4 2 The first two columns from the left are the same as in the Phenotype Frequencies table On the left is a list of all the different phenotypes the second column displays the associated binary vectors The next three columns labeled VC RVC and VU contain values of the Virulence Complexity A6 the Relative Virulence Complexity A7 the Virulence Uniformity A8 respectively The last two rows of the Virulence Complexities table contain the average values of VC RVC and VU calculated over all phenotypes and over all individuals respec
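The three columns VC, RVC and VU can be illustrated for a single binary pattern. The sketch assumes the Appendix definitions (A6)-(A8): complexity is the number of 1s, relative complexity divides it by the number of differentials k, and uniformity is 1 - 2 min(RVC, 1 - RVC). The function name `virulence_measures` is hypothetical.

```python
def virulence_measures(pattern: str):
    """Return (VC, RVC, VU) for a 0/1 reaction pattern given as a string."""
    k = len(pattern)
    if k == 0 or set(pattern) - {"0", "1"}:
        raise ValueError("pattern must be a non-empty string of 0s and 1s")
    vc = pattern.count("1")            # number of positive responses
    rvc = vc / k                       # complexity relative to k differentials
    vu = 1 - 2 * min(rvc, 1 - rvc)     # 1 for all-0 or all-1, 0 for an even split
    return vc, rvc, vu


print(virulence_measures("111100111011101"))
```

The averages in the last two rows of the table would then be plain means of these values, taken once over the distinct phenotypes and once over all individuals.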
72. rentials (Col1, Col2, Col6: columns in the original data table). Here the FreqVir SE option is chosen, so the virulence frequencies together with the corresponding values of the standard error are displayed. The Comparison of Differentials Sheet: This sheet displays Associations (A5) or Correlations (A4) between virulences and avirulences for all pairs of the differentials (columns in the original data table) from the given differential set (see Fig. 4.5) in the form of an upper or lower triangular matrix, respectively. When the Square box at the top is checked, the corresponding triangular matrix is displayed as a symmetric square matrix. The option Both combines the Associations and Correlations results (upper and lower triangular matrix, respectively) in one square matrix. The Current file field allows selecting among all analyzed files.
73. rices display the Average values and the Standard Errors (SE), respectively, of distances between populations (look in the Comments sheet for the file name of each population P1, P2, etc.). The Comments Sheet: This sheet (Fig. 5.7) displays the available comments about each analyzed population. These comments must have been entered beforehand in the Data Entry section (see 2.4 and Fig. 2.4). Since some tables use general labels like P1, P2, etc. for the analyzed populations (files), the Comments sheet may be useful to learn which file is associated with each label. Figure 5.7: Comments sheet. Three populations and their file names are displayed with the available comments: P1, pop1.vat, 20 individuals, 16 differentials; P2, pop2.vat, 12 individuals, 16 differentials; P3, pop3.vat, 15 individuals, 16 differentials. Part II: Resistance Analysis applications. Data entry and computational tools for Resistance Analysis are nearly identical to those for Virulence Analysis, described in detail in Part I. The differences are mainly in terminology and designations, where Resistance (R) is used instead of Virulence (V). For example: Resistance Complexity (CR), Relative
74. rski, J., Eyal, Z., Ben-Yehuda, P. and Kosman, E. 2000. Comparative analysis of indices in the study of virulence diversity between and within populations of Puccinia recondita f. sp. tritici in Israel. Phytopathology 90: 601-607. 8. Nei, M. 1972. Genetic distance between populations. Amer. Naturalist 106: 283-292. 9. Nei, M. 1973. Analysis of gene diversity in subdivided populations. Proc. Natl. Acad. Sci. USA 70: 3321-3323. 10. Nei, M. 1978. Estimation of average heterozygosities and genetic distance from a small number of individuals. Genetics 89: 583-590. 11. Nei, M. and Li, W. H. 1979. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl. Acad. Sci. USA 76: 5269-5273. 12. Pielou, E. C. 1974. Population and Community Ecology: Principles and Methods. Gordon and Breach, New York. 13. Roelfs, A. P. and McVey, D. V. 1973. Races of Puccinia graminis f. sp. tritici in the USA during 1972. Plant Dis. Rep. 57: 880-884. 14. Rogers, J. S. 1972. Measures of genetic similarity and genetic distance. Pages 145-153 in: Studies in Genetics VII. University of Texas Publication 7213, Austin. 15. Shannon, C. E. and Weaver, W. 1949. The Mathematical Theory of Communication. University of Illinois Press, Urbana. 16. Sheldon, A. L. 1969. Equitability indices: Dependence on the species count. Ecology 50: 466-467. 17. Simpson, E. H. 1949. Meas
75. rulence complexities. Resampled Data (resampling estimates): VC, Average for phenotypes: 6.761. Visible rows of the Virulence complexities table (binary pattern of phenotype: VC, RVC, VU): 1000000000011000: 3, 0.188, 0.625; 0011000000010011: 5, 0.313, 0.375; 1001000110111011: 9, 0.563, 0.125; Average for individuals: 6.45, 0.403, 0.319. Figure 5.2: Phenotype characterization sheet of Inferential Statistics. The Phenotype frequency table (upper table) contains a list of the 11 different phenotypes that were revealed and analyzed for the chosen population (file pop1.vat). The possible codes and the frequency of each phenotype are displayed. Note that the Octal representation is impossible for the set of 16 differentials. The results for the same phenotypes also appear in the Virulence complexities table for original data (bottom left table). The results for the resampled data appear in the bottom right table. The Phenotype Frequency Table: This table lists all phenotypes found in the underlying sample. The first column assigns names to each phenotype (ph1, ph2, ...). Since each isolate in the sample is expressed by a binary vector, isolates with identical binary vectors are defined to exhibit the same phenotype. The second column displays, for each phenotype, the associated binary vector. Recall that the dimension of the vector (its length) equals the number of underlying diffe
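The VC, RVC and VU values listed in the figure can be reproduced with a short sketch. It assumes the Appendix definitions (A6 to A8) read as follows: VC is the number of 1s in the binary pattern, RVC is VC divided by the number of differentials, and VU = 1 - 2 min(RVC, 1 - RVC). The function name `virulence_measures` is hypothetical.

```python
def virulence_measures(pattern: str):
    """Return (VC, RVC, VU) for a binary virulence pattern given as a string of 0s and 1s."""
    k = len(pattern)                # number of differentials
    vc = pattern.count("1")         # complexity: number of positive responses
    rvc = vc / k                    # relative complexity, between 0 and 1
    vu = 1 - 2 * min(rvc, 1 - rvc)  # uniformity: 1 for all-0 or all-1 patterns, 0 for an even split
    return vc, rvc, vu
```

Applied to the pattern 1000000000011000, the sketch returns VC = 3, RVC = 0.1875 and VU = 0.625, which agrees with the rounded figure values 3, 0.188 and 0.625.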
76. s (A45) of virulence frequencies alone or together with the Standard Error SE (A46) can be displayed by clicking the appropriate button, Average or Average SE, respectively. Clicking the Excel button at the top right opens an Excel file and copies the contents into an Excel sheet. Figure 5.5: Virulence frequency sheet. Virulence frequencies of the sampled isolates on the set of 16 differentials (columns) are calculated for three different populations (P1: pop1.vat, P2: pop2.vat and P3: pop3.vat). The Average option is chosen at the top right corner. The Between Population Distances Sheet: Matrices of the resampling-based estimates (A45, A46) of pairwise distances between the analyzed populations are provided. Seven different distance measures are available, namely Nei distance N (A38), Nei's Gst (A42), Kosman's Gst (A43), Rogers distance (A32), Mean Character Difference MCD (A44), DADm (A30) and Kosman di
77. s to one of the analyzed populations. Clicking the Excel button at the top right opens an Excel file and copies the contents into an Excel worksheet. The Comments Sheet: This sheet displays the available comments about each analyzed population; these comments must have been entered beforehand in the Data Entry section (see 2.4 and Fig. 2.4). 4.3 Between Populations Analysis: Clicking the Between button at the top right of the Files selection sheet (see 4.1 and Fig. 4.1) of the Descriptive Statistics section activates several procedures for comparisons between the selected data sets (populations), as listed by their file names in the work segment (central part of the sheet). Note: To be compatible, those data need to be based on identical differential sets, i.e. the number and names of columns in the original data tables must be equal in all selected files. For the indices KBm, KBj and KBd, the number of rows must also be equal in all these files. Once Between is activated, it may run for several minutes. The bottom line reports the file currently being processed and the elapsed running time. A blue bar conveys graphically the approximate time still required for completion. As soon as the analysis is completed, several new sheets will appear behind the Files selection sheet; only the name tags of these six new worksheets will be visible, right above the upper edge of the front sheet. Clicking such name
78. 5.2 Within Population Analysis: The Phenotype Characterization Sheet; The Phenotype Frequency Table; The Virulence Complexities Table; The Virulence Frequency Sheet; The Comparison of Differentials Sheet; The Diversity Parameters Sheet; The Comments Sheet. 5.3 Between populations analysis: The Virulence Frequency Sheet; The Between Population Distances Sheet; The Comments Sheet. Part II: Resistance Analysis applications. 6 Resampling and Coding for Resistance Data.
79. se of Open Existing File. Once this option is chosen, the Missing Data window opens to display all the rows with incomplete code (Fig. 2.11b for the corresponding data in Fig. 2.11a). Click Yes to proceed and to get the modified data with no incomplete codes (Fig. 2.11c). Figure 2.11: Delete all rows with incomplete code. (a) Original Octal data with a specified code length of 4 (i.e. with 12 differentials). (b) The Missing Data window displays two rows, Row2 and Row5, with detected incomplete codes 344 and 65, respectively. (c) Modified data table: the two rows were deleted. Excel: This option allows data transfer to an Excel worksheet. 2.4 Enter New Data: Enter New Data is the main section in the Data Entry application. It provides all the tools necessary for generating a new data file, which can be saved and used for subsequent analysis. Clicking the Enter New Data square in the Data Entry window (see Fig. 2.1) opens a new window, Data Parameters (Fig. 2.13). Three or four parameters must be set, namely (1) Differential Set, (2) Type of Data and (3) number of rows (No rows) if an exis
80. sheet 1 already corresponds to resistance, while 0 means susceptibility of the individual to the given isolate. The Coded representation sheet (Fig. 7.3) displays Octal, Hexadecimal and Binary Decimal codes of the binary resistance patterns of individuals that appear in the Binary representation sheet (Fig. 7.2). Figure 7.1: Original input sheet displaying the input Resistance Data matrix of p1.rat (Binary data, 0/1; 12 columns, Col1 to Col12; 15 rows). Figure 7.2: Binary
81. sing Data window displays all rows with empty cells (missing data). (c) Modified data table: five rows were deleted. Delete all columns with empty cells: This function detects all columns with empty cells (at least one missing entry in a column) and deletes them. When this option is chosen, the Missing Data window opens to display all the columns with at least one empty cell. Click Yes to proceed and to get the modified data (see Fig. 2.9 for rows). Fill all empty cells with: This function fills all the empty cells with a specific value. Once it is chosen, the Fill Empty Cells window opens (Fig. 2.10). Insert a valid value in the field Load empty cells with. After OK is clicked, all the empty cells in the grid are filled with this value. Figure 2.10: Fill Empty Cells window. The value 3 is inserted in the field Load empty cells with. All empty cells in the grid will be filled with 3. Delete all rows with incomplete code: This function is available only when using Octal data and Hexadecimal data. It deletes all rows with incomplete codes. An incomplete code is an Octal or Hexadecimal code with a code length shorter than declared in the Data Parameters window (Fig. 2.12) in the case of Enter New Data, or shorter than the code of maximum length in the ca
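The three Missing Data functions described here can be sketched as follows. This is an illustration under stated assumptions rather than VAT's implementation: an empty cell is represented by `None`, a table is a dictionary mapping row names to lists of values, and all function names are hypothetical. The code values in the usage note are those visible in Fig. 2.11.

```python
EMPTY = None  # assumption: an empty grid cell is represented by None


def delete_columns_with_empty_cells(columns, rows):
    """Drop every column that has at least one empty cell; return (columns, rows)."""
    keep = [j for j in range(len(columns))
            if all(values[j] is not EMPTY for values in rows.values())]
    kept_columns = [columns[j] for j in keep]
    kept_rows = {name: [values[j] for j in keep] for name, values in rows.items()}
    return kept_columns, kept_rows


def fill_empty_cells(rows, value):
    """Replace every empty cell with one fixed value."""
    return {name: [value if v is EMPTY else v for v in values]
            for name, values in rows.items()}


def delete_rows_with_incomplete_code(codes, code_length):
    """Keep only rows whose Octal or Hexadecimal code has the declared length."""
    return {name: code for name, code in codes.items() if len(code) >= code_length}
```

With a declared code length of 4, the codes 3625, 344, 3674, 65 and 7653 reduce to the three complete ones, and the rows holding 344 and 65 are removed.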
82. ski et al. 2000. Distance between populations: pattern-based methods. Let two populations $P_1$ and $P_2$ consist of $n_1$ and $n_2$ individuals, $x_1, x_2, \ldots, x_{n_1}$ and $y_1, y_2, \ldots, y_{n_2}$, respectively, and let the dissimilarity between individuals be assessed using any measure $\rho$ (e.g. simple mismatch $m$, Jaccard $j$ or Dice $d$ coefficients of dissimilarity). DAD: Distance of Average Differences between populations. The Distance of Average Differences between populations $P_1$ and $P_2$ with respect to dissimilarity $\rho$ is defined as follows: $DAD_\rho(P_1, P_2) = ADB_\rho(P_1, P_2) - \frac{ADW_\rho(P_1) + ADW_\rho(P_2)}{2}$ (A30), where $ADB_\rho(P_1, P_2) = \frac{1}{n_1 n_2} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} \rho(x_i, y_j)$, $ADW_\rho(P_1) = \frac{1}{n_1^2} \sum_{i=1}^{n_1} \sum_{j=1}^{n_1} \rho(x_i, x_j)$ and $ADW_\rho(P_2) = \frac{1}{n_2^2} \sum_{i=1}^{n_2} \sum_{j=1}^{n_2} \rho(y_i, y_j)$. KB: Kosman Distance between populations. The Kosman distance $KB_\rho(P_1, P_2)$ between two populations $P_1$ and $P_2$ is defined for two samples with an equal number of individuals ($n_1 = n_2 = n$) as follows: $KB_\rho(P_1, P_2) = \frac{1}{n} \min_{\pi} \sum_{i=1}^{n} \rho(x_i, y_{\pi(i)})$ (A31), where the minimum is taken over all one-to-one pairings $\pi$ of the individuals of $P_1$ with those of $P_2$ (Kosman 1996; Kosman and Leonard 2007). For given populations $P_1$ and $P_2$ and dissimilarity measure $\rho$, $KB_\rho(P_1, P_2) \le ADB_\rho(P_1, P_2)$. Distance between populations: type-based methods. Let populations $P_1$ and $P_2$ consist of $n_1$ and $n_2$ individuals, respectively; let $n_{1r}$ and $n_{2r}$ be the number of individuals of type $T_r$ from populations $P_1$ and $P_2$, respectively; and let $s$ be the total number of types of individuals observed in both populations. Then p and p P
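The pattern-based distances can be illustrated with a small sketch. It assumes that the simple mismatch dissimilarity is the proportion of positions at which two binary patterns differ, that ADW averages over all $n^2$ ordered pairs, and that KB is found by brute force over all pairings, which is feasible only for very small samples. All function names are hypothetical.

```python
from itertools import permutations


def simple_mismatch(x, y):
    """Proportion of positions at which two equal-length binary patterns differ."""
    return sum(a != b for a, b in zip(x, y)) / len(x)


def adb(p1, p2, rho=simple_mismatch):
    """Average dissimilarity between individuals of two populations."""
    return sum(rho(x, y) for x in p1 for y in p2) / (len(p1) * len(p2))


def adw(p, rho=simple_mismatch):
    """Average dissimilarity within one population (all ordered pairs, assumed n^2 normalization)."""
    return sum(rho(x, y) for x in p for y in p) / len(p) ** 2


def dad(p1, p2, rho=simple_mismatch):
    """Distance of Average Differences: ADB minus the mean of the two ADW values."""
    return adb(p1, p2, rho) - (adw(p1, rho) + adw(p2, rho)) / 2


def kb(p1, p2, rho=simple_mismatch):
    """Kosman distance: minimum average dissimilarity over all one-to-one pairings."""
    if len(p1) != len(p2):
        raise ValueError("KB requires samples of equal size")
    n = len(p1)
    best = min(sum(rho(p1[i], p2[pairing[i]]) for i in range(n))
               for pairing in permutations(range(n)))
    return best / n
```

For `p1 = ["1100", "1010"]` and `p2 = ["1100", "0011"]` the sketch gives ADB = 0.5, DAD = 0.125 and KB = 0.25, so KB does not exceed ADB. For realistic sample sizes the pairing search would need an assignment solver, for example `scipy.optimize.linear_sum_assignment`, instead of enumerating permutations.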
83. splays the full path of the active folder, which holds the files listed in the source segment. Note that only .vat files after Resampling (3.3) are listed, since they are the only ones that can be processed in this section. To select another folder, click the Browse button next to the path line field and use the browse option; the listed .vat files in the source segment will change accordingly. Now one may transfer some specific files separately by clicking > or all listed files by clicking >> from the left source segment into the work segment (middle, with the legend Compatible files). Transfer into the work segment only those files that are actually to be processed in the current session. The Inferential Statistics can only be run if the number and names of columns are identical in the original data tables for all files listed in the work segment. It is also required that the collections of resampled sets are compatible, i.e. the number of computer-generated samples as well as their sample sizes must be the same (see Fig. 3.7). Otherwise an error message will appear.
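The compatibility requirements above amount to a simple comparison of file metadata. The sketch below is purely illustrative: the `ResampledFile` record and its field names are hypothetical and do not describe the actual .vat file layout.

```python
from dataclasses import dataclass


@dataclass
class ResampledFile:
    """Hypothetical summary of one resampled data file."""
    name: str
    columns: tuple    # column names of the original data table
    n_samples: int    # number of computer-generated samples
    sample_size: int  # number of individuals per sample


def incompatible_files(files):
    """Return the names of files that differ from the first file in the list.

    An empty result means all files may be analyzed together.
    """
    if not files:
        return []
    ref = files[0]
    reference = (ref.columns, ref.n_samples, ref.sample_size)
    return [f.name for f in files[1:]
            if (f.columns, f.n_samples, f.sample_size) != reference]
```

A file resampled with a different sample size than the others would be reported by name, which corresponds to the situation in which the program shows an error message.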
84. stance KBm (A31). For each measure, a separate sub-sheet can be accessed by clicking the corresponding name tag, which is visible at the upper edge of the front sheet (see Fig. 5.6). These index-specific sub-sheets are subdivided into two matrices. The top and bottom matrices display the Average values (A45) of the distance measure and the corresponding Standard Errors SE (A46), respectively. All values are based on the corresponding resampling data sets. When the Square box at the top is checked, the corresponding triangular distance and SE matrices are displayed in the form of symmetric square ones. Clicking the Excel button at the top right opens an Excel file and copies the contents into an Excel sheet. Figure 5.6: Between population Distances sheet. The Nei distance N sheet is displayed and three populations are compared (pop1.vat, pop2.vat, pop3.vat). The Square matrix box is not checked. The top and bottom mat
85. sts all members (host plants) of the underlying differential set, labeled by default as Col1, Col2, etc. The first two rows serve as the table's header and provide the designations and file names of the analyzed populations. Values of virulence frequencies alone or together with the Standard Error SE can be displayed by clicking the appropriate button, FreqVir or FreqVir SE, respectively. Clicking the Excel button at the top right opens an Excel file and copies the contents into an Excel sheet. Figure 4.4: Virulence frequency sheet. Isolates from four populations (pop1.vat, pop2.vat, pop3.vat, pop4.vat) were tested on the set of 16 diffe
86. the processed file will appear in the work window and its conversion parameters will be listed on the right side of the Files selection sheet (Fig. 3.1). Before transferring any selected file from the source to the work segment, the program will convert each non-binary file step by step. Figure 3.1: Files selection sheet in Resampling and Coding. The folder D test data contained four .vat files, two of which are listed in the source segment (left): pop2.vat and pop3.vat. Two files were transferred from the source to the work segment (middle): pop1.vat and pop4.vat. Settings visible in the figure: Coding, Type Regular; values from Min 0 to Cut Off 3 are replaced with 0, and values from 3.001 to Max 9 are replaced with 1; Resampling, Number of Samples 10, Sample Size 15; Resampling operates only on Binary data. The parameters for the highlighted file pop4.vat, with Regular data, are listed on the right of the work segment, with Min 0, Max 9 and a Cut Off Value of 3. Moreover, the two files in the work segment were resampled, namely 10
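The cut-off coding shown in the figure can be sketched in a few lines. This is a minimal illustration, assuming that values up to and including the cut-off are replaced with 0 and values above it with 1, as the figure settings suggest; the function name and argument names are hypothetical.

```python
def code_regular_to_binary(rows, cut_off, min_value=0, max_value=9):
    """Convert Regular data to Binary data with a cut-off value.

    Values from min_value up to cut_off become 0; values above cut_off
    up to max_value become 1. Values outside the declared range are rejected.
    """
    coded = []
    for row in rows:
        coded_row = []
        for value in row:
            if not min_value <= value <= max_value:
                raise ValueError(f"value {value} outside [{min_value}, {max_value}]")
            coded_row.append(0 if value <= cut_off else 1)
        coded.append(coded_row)
    return coded
```

With a cut-off of 3, a row such as `[0, 3, 4, 9]` is coded as `[0, 0, 1, 1]`.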
87. the selected data sets (populations), as listed by their file names in the work segment (central part of the sheet). Once Between is activated, it may run for several minutes or even hours. A special information window opens, in which reports about the current pair of files being processed and about the elapsed running time are displayed, as well as a blue bar conveying graphically the approximate time still required for completion. As soon as the analysis is completed, three new sheets will appear behind the Files selection sheet; only the name tags of these three new worksheets will be visible, right above the upper edge of the front sheet. Clicking such a name tag brings the corresponding sheet to the foreground; in Fig. 5.5, for example, Virulence frequency is the visible front worksheet, while Files selection and the other two sheets (Between population distances and Comments) are in the back. Next we explain the three new worksheets in more detail. The Virulence Frequency Sheet: This sheet displays the resampling-based estimates (A45, A46) of the proportion of the sampled isolates with virulent reaction on each differential from the given differential set (Fig. 5.5). The leftmost column lists all members (host plants) of the underlying differential set, labeled by default as Col1, Col2, etc. The first two rows serve as the table's header and provide the designations and file names of the analyzed populations. Average value
88. ther parameters for sexually and asexually reproducing populations. These estimates are obtained by resampling methods, allowing further statistical evaluation (e.g. significance tests and confidence intervals). 4. Sample size recommendations for reliable estimation in specific experimental situations are provided. VAT output is compatible with all major statistical analysis tools and suitable for direct input into MS Excel and most other commonly used packages (SAS, NTSYS, SPSS, etc.), facilitating additional analyses (clustering, dendrograms, PCA, etc.). 1.1 General information about VAT. Input files: The program accepts only standard text files (inputfilename.txt, with extension .txt) or the system-oriented files (filename.vat, with extension .vat, for virulence data and filename.rat, with extension .rat, for resistance data), which are created by the VAT program itself. The .txt files are accepted only in the Data Entry section. The system .vat and .rat files can be read at any phase or section of the program for virulence and resistance data, respectively, and data analysis is possible only with .vat and .rat files. Output files: The basic data output of the program is a system .vat (.rat) file. The user should create the filename.vat (filename.rat) file in order to allow data processing. This file contains the original data, the results of data transformation and resampling, and some other information on the original data. However, the most
89. ting differential set is selected, and an additional parameter, (4) number of columns (No cols), if none of the existing differential sets is selected. Figure 2.13: Data Parameters window. Differential set DiffSet 6 xml is selected and the Regular type of data is chosen. The Min value is 0 and the Max value is 9. The number of columns (No cols) is 7 and the number of rows (No rows) is 16. Differential Set: Scroll down the field Differential Set and select either any differential set from the list of available ones (see 2.1) or None. If an existing Differential Set is selected, then the number of differentials in this set will automatically appear in the No cols (number of columns) and the Binary vector length fields. If a derivative of the number of differentials is determined for Octal or Hexadecimal data (see 2.2), it will automatically appear in the Code length field. Type of Data: The next segment of the Data Parameters box requires specifications concerning the type of data. One of the following five types should be selected: Regular, Binary, Octal, Hexadecimal or Binary Decimal (see 2.2). In general, the applications Descriptive Statistics (see 4) and Inferential Statistics (see 5) will only operate on Binary data. Data of other types should be conv
90. tion $P$ and dissimilarity measure $\rho$, $KW_\rho(P) \ge ADW_\rho(P)$. Diversity within population: type-based methods. Frequency of individuals of a fixed type. Let $n_r$ be the number of individuals of type $T_r$ from population $P$, which consists of $n$ individuals $x_1, x_2, \ldots, x_n$, and let $s$ be the total number of types of individuals observed in this population (e.g. number of pathotypes, races, phenotypes, genotypes). Then the frequency of individuals of type $T_r$ ($r = 1, 2, \ldots, s$) in population $P$ is determined as follows: $p_r = \frac{n_r}{n}$ (A18), so that $p_1 + p_2 + \cdots + p_s = 1$. The theoretical standard error of the $p_r$ estimate (A18) is expressed by the formula $SE(p_r) = \sqrt{\frac{p_r (1 - p_r)}{n}}$ (A19) for $r = 1, 2, \ldots, s$. G: Gleason richness within population. The Gleason index of richness within population $P$ is defined as follows: $G(P) = \frac{s - 1}{\ln n}$ (A20). Si: Simpson diversity within population. The Simpson index of diversity within population $P$ (Simpson 1949) is defined as follows: $Si(P) = 1 - \sum_{r=1}^{s} p_r^2$ (A21). Its values range between 0 and $1 - \frac{1}{s}$: $0 \le Si(P) \le 1 - \frac{1}{s}$. St: Stoddart diversity within population. Stoddart's index of diversity within population $P$ (Stoddart 1983; Stoddart and Taylor 1988) is defined as follows: $St(P) = \frac{1}{\sum_{r=1}^{s} p_r^2}$ (A22). Its values range between 1 and $s$: $1 \le St(P) \le s$. SH: Shannon diversity within population (Shannon-Wiener entropy). The Shannon index of diversity within population $P$ (Shannon-Wiener entropy; Sha
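The type-based quantities (A18) to (A22) can be computed directly from a list of type labels. The sketch below assumes the formulas read as $p_r = n_r / n$, $SE(p_r) = \sqrt{p_r (1 - p_r) / n}$, $G = (s - 1) / \ln n$, $Si = 1 - \sum p_r^2$ and $St = 1 / \sum p_r^2$; the function names are hypothetical.

```python
import math
from collections import Counter


def type_frequencies(types):
    """Frequency p_r of each type among n individuals (A18)."""
    n = len(types)
    return {t: count / n for t, count in Counter(types).items()}


def frequency_se(p, n):
    """Theoretical standard error of a frequency estimate (A19)."""
    return math.sqrt(p * (1 - p) / n)


def gleason(types):
    """Gleason richness (A20); requires more than one individual."""
    return (len(set(types)) - 1) / math.log(len(types))


def simpson(types):
    """Simpson diversity (A21)."""
    return 1 - sum(p * p for p in type_frequencies(types).values())


def stoddart(types):
    """Stoddart diversity (A22)."""
    return 1 / sum(p * p for p in type_frequencies(types).values())
```

For four individuals of types ph1, ph1, ph2 and ph3, the frequencies are 0.5, 0.25 and 0.25, so that Si = 0.625 and St is about 2.667, both within the stated ranges for s = 3. With p = 0.4 and n = 20, `frequency_se` gives about 0.110.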
91. tivate any of the different applications in the Applications sector, combined with each of the two types of analysis. Figure 1.1: VAT Main Window. At the top is the Type of Analysis sector, where the Virulence Analysis option is chosen. In the Applications sector there are five squares representing the available applications. Type of Analysis sector in the VAT Main Window: VAT provides two types of analysis, Virulence Analysis and Resistance Analysis. These options are available in the top sector, Type of Analysis, and can be chosen by clicking the appropriate circle. Virulence Analysis: Virulence patterns of isolates with respect to a given differential set of hosts are analyzed. One can characterize and compare individual isolates and isolate populations tested for virulence on a given differential set. Resistance Analysis: Resistance patterns of host plants with respect to a given differential set of isolates are analyzed. One can characterize and compare individual hosts and host populations tested for resistance on a given differential set. Applications sector in the VAT Main Window: There are five different applications: (1) Data entry, (2) Resampling and Coding, (3) Descriptive St
92. tively. The second table, in the lower right (see Fig. 5.2), is labeled Resampled Data and contains the Average for phenotypes and Average for individuals for VC, RVC and VU. However, in contrast to the left Original Data table, the Resampled Data table contains the averages calculated over all computer-generated resampled data sets. Depending on which button was activated (one may toggle between Average and Average SE), only the Average (A45) or additionally the corresponding standard error SE (A46), respectively, will appear. The Virulence Frequency Sheet: This sheet displays the resampling-based estimates (A45, A46) of the proportion of the sampled isolates with virulent reaction on each differential from the given differential set (Fig. 5.3). The leftmost column lists all members (host plants) of the underlying differential set, labeled by default as Col1, Col2, etc. The first two rows serve as the table's header and provide the designations and file names of the analyzed populations. Average values (A45) of virulence frequencies alone or together with the Standard Error SE (A46) can be displayed by clicking the appropriate button, Average or Average SE, respectively. Clicking the Excel button at the top right opens an Excel file and copies the contents into an Excel sheet.
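A resampling-based estimate of this kind can be sketched as follows. The example draws fictive samples with replacement and averages a parameter over them. Two points are assumptions and not taken from the manual: the spread across samples is computed as their sample standard deviation, which may differ from the exact form of (A46), and the function names and the fixed random seed are hypothetical.

```python
import random
import statistics


def virulence_frequency(sample, column):
    """Proportion of isolates in the sample with a virulent reaction (1) in one column."""
    return sum(row[column] for row in sample) / len(sample)


def resampling_estimate(population, parameter, n_samples, sample_size, seed=0):
    """Average a parameter over fictive samples drawn with replacement.

    Returns (average, spread); spread is the standard deviation of the
    per-sample values, used here as a stand-in for the standard error.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        sample = [rng.choice(population) for _ in range(sample_size)]
        values.append(parameter(sample))
    return statistics.fmean(values), statistics.stdev(values)
```

A typical call would be `resampling_estimate(isolates, lambda s: virulence_frequency(s, 0), n_samples=10, sample_size=15)`, where `isolates` is a list of binary rows; a differential on which every isolate is virulent yields an average of 1 with zero spread.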
93. Figure 2.14: The empty grid table that was created by clicking Next in the Enter New Data window. In accordance with Fig. 2.13, there are 7 columns (named Diff1, Diff2, ...) and 16 rows. 2.5 Open Existing File: The Open Existing File tool makes it possible to import data from an already existing file (the system .vat file or the text .txt file). This file can be uploaded and is available for modification. Clicking the Open Existing File square in the Data Entry window (Fig. 2.1) opens a new window, Open existing file (Fig. 2.15). Figure 2.15: Open existing file window. The selected Type of file is Text File. The Regular data are declared to be imported from the selected existing file E Ariel TEST VAT pop_1_reg.txt. Two mandatory parameters must always be determined in the Open existing file window: Type of File and File name. In the case of text (.txt) files, the third parameter, Data Type, is also mandatory. Type of File: One of the two possible file types, System VAT File or Text File, should be selected in the corresponding field. No other types of files can be imported. Open existing System VAT File (vat f
94. ue 9 replaced by the value 3 (for example, cell Row2, Diff4). 0-1 Transformation: This function is available only for Binary data. It switches all 0s to 1 and all 1s to 0. Add: There are two choices under Add: add Row or add Column. Each automatically adds a new row or column after the last one, at the bottom or to the right of the current grid, respectively. Insert: There are two choices under Insert: insert Row and insert Column. To insert a new column (row) to the left of (above) a target column (row), click in any cell of the target column (row) and choose the Column (Row) option under Insert. A new column (row) will be inserted to the left of (above) the marked column (row). Delete: There are two choices under Delete: delete Row and delete Column. To delete a target column (row), click in any cell of the column (row) and choose the Column (Row) option under Delete. The marked column (row) will be deleted. Rename: There are two choices under Rename: rename Row and rename Column. Once the Column (Row) option is chosen, the Rename window opens, displaying the list of columns (rows) of the original data and their names in the current order (Fig. 2.8). The name of each column (row) can be modified. Click OK to apply the changes. Figure 2.8: Rename window. Names of the 16 rows of original data (Fig. 2.7a) are displayed. The current row names (Row1, Ro
95. ulations at differentiating character D. 5 Resampled data: Let $N$ fictive samples of a fixed size (number of individuals) be generated by means of resampling (drawing randomly with replacement) from an original set of individuals. If $Par_i$ ($i = 1, 2, \ldots, N$) is any parameter calculated for each fictive sample, then the mean estimator $\overline{Par}$ of the population parameter $Par$ and its standard error $SE$ are obtained as follows: $\overline{Par} = \frac{1}{N} \sum_{i=1}^{N} Par_i$ (A45), $SE(Par)$ (A46). 8 References: 1. Gilmour, J. 1973. Octal notation for designating physiologic races of plant pathogens. Nature 242: 620. 2. Habgood, R. M. 1970. Designation of physiological races of plant pathogens. Nature 227: 1268-1269. 3. Kosman, E. 1996. Difference and diversity of plant pathogen populations: A new approach for measuring. Phytopathology 86: 1152-1155. 4. Kosman, E. 2003a. Nei's gene diversity and the index of average differences are identical measures of diversity within populations. Plant Pathology 52: 533-535. 5. Kosman, E. 2003b. Measure of multilocus correlation as a new parameter for study of plant pathogen populations. Phytopathology 93: 1464-1470. 6. Kosman, E. and Leonard, K. J. 2007. Conceptual analysis of methods applied to assessment of diversity within and distance between populations with asexual or mixed mode of reproduction. New Phytologist 174: 683-696. 7. Maniste
96. urement of diversity. Nature 163: 688. Sneath, P. H. A. & Sokal, R. R. (1973). Numerical Taxonomy. W. H. Freeman & Co., San Francisco. Sokal, R. R. and Rohlf, F. J. (1995). Biometry. W. H. Freeman, New York. Stoddart, J. A. (1983). A genotypic diversity measure. J. Hered. 74: 489-490. Stoddart, J. A. and Taylor, J. F. (1988). Genotypic diversity: estimation and prediction in samples. Genetics 118: 705-711. Wright, S. (1951). The genetic structure of populations. Ann. of Eugenics 15: 323-354. W
97. us and the Main Window. Edit: This menu option provides various tools (functions) for editing, modifying, and validating original data: Transposition, Validate, Reset, Replace, 0-1 Transformation, Add, Insert, Delete, Rename, and Missing Data. Transposition: This function transposes the data table in the grid of the Original Data Sheet (rows become columns and vice versa). Validate: This tool checks the validity of data in the grid of the Original Data Sheet. It is mainly used to find empty cells in large arrays of data and to detect incomplete codes in the case of Octal and Hexadecimal data. Note that all other cases of invalid data are automatically prevented at the stage of entering new data or importing data from existing files. Missing data (empty cells) are allowed (valid) in the Data Entry application, but processing files with missing (incomplete) data in all other VAT Applications is impossible. The Validate function has two options: List invalid cells and Find next invalid cell. List invalid cells: This tool provides information about all invalid cells. The Validation window opens and displays the coordinates (row number, row name, column name, and column number) of cells with missing data or incomplete codes (see Figs. 2.6a and 2.6b, with one empty cell at Row12, Diff4). To save the displayed List of Invalid Cells, click the Save button and the standard Save As window will appear (Fig. 2.4).
98. w2, Row3, etc. can be changed. Missing Data: This tool is mainly aimed at modifying original data in order to make them acceptable to the computational VAT Applications (Resampling and Coding, Descriptive Statistics, Inferential Statistics, and Miscellaneous). This can be achieved by means of the following four functions: Delete all rows with empty cells, Delete all columns with empty cells, Fill all empty cells with, and Delete all rows with incomplete codes (for encoded data). Delete all rows with empty cells: This function detects all rows with empty cells (at least one missing entry in a row) and deletes them. When this option is chosen, the Missing Data window opens and displays all the rows with at least one empty cell (Fig. 2.9b for the corresponding data in Fig. 2.9a), together with the question "Are you sure you want to delete the above listed rows with empty cells?" Click Yes to proceed and to get the modified data (Fig. 2.9c). Figure 2.9. Delete all rows with empty cells. (a) Original data with empty cells. (b) Mis
99. x 1. Dissimilarity between individuals. Let us consider binary patterns (0-1 vectors) of two individuals x and y tested on k differentiating factors $D_s$ ($s = 1, 2, \ldots, k$). We denote: a, the number of factors with shared positive responses (1s) for both individuals; b, the number of factors where individual x has a positive response but y does not; c, the number of factors where individual y has a positive response but x does not. m, Simple mismatch dissimilarity: The simple mismatch coefficient of dissimilarity between two individuals x and y is determined as follows: $m(x, y) = \frac{b + c}{k}$ (A1). The simple mismatch coefficient varies between 0 and 1. j, Jaccard dissimilarity: The Jaccard dissimilarity between two individuals x and y is determined as follows: $j(x, y) = \frac{b + c}{a + b + c}$ (A2). The Jaccard coefficient of dissimilarity varies between 0 and 1. Note: $j(x, y)$ is also known as the Tanimoto distance. d, Dice dissimilarity: The Dice dissimilarity between two individuals x and y is determined as follows: $d(x, y) = \frac{b + c}{2a + b + c}$ (A3). The Dice coefficient of dissimilarity varies between 0 and 1. Note: $1 - d(x, y)$ is also known as the Nei and Li (1979) genetic similarity measure and the Sørensen measure of similarity for composition of species in ecology. 2. Comparison of differentials (columns of binary data matrix). Cor, Correlation between differentials: The measure of correlation between two differenti
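The three dissimilarity coefficients (A1)-(A3) defined in the excerpt above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the VAT implementation: the function names are hypothetical, patterns are plain lists of 0s and 1s of equal length k, and the value 0.0 returned when two patterns contain no 1s at all (where the Jaccard and Dice denominators vanish) is an assumption, since the excerpt does not say how that case is handled.

```python
from typing import Sequence, Tuple


def abc_counts(x: Sequence[int], y: Sequence[int]) -> Tuple[int, int, int]:
    """Return (a, b, c): shared 1s, 1s only in x, 1s only in y."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("patterns must be non-empty and of equal length k")
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    return a, b, c


def simple_mismatch(x: Sequence[int], y: Sequence[int]) -> float:
    """m(x, y) = (b + c) / k, as in (A1)."""
    _, b, c = abc_counts(x, y)
    return (b + c) / len(x)


def jaccard(x: Sequence[int], y: Sequence[int]) -> float:
    """j(x, y) = (b + c) / (a + b + c), as in (A2)."""
    a, b, c = abc_counts(x, y)
    denom = a + b + c
    # ASSUMPTION: two all-zero patterns are treated as identical (distance 0).
    return (b + c) / denom if denom else 0.0


def dice(x: Sequence[int], y: Sequence[int]) -> float:
    """d(x, y) = (b + c) / (2a + b + c), as in (A3)."""
    a, b, c = abc_counts(x, y)
    denom = 2 * a + b + c
    # ASSUMPTION: same convention as above for two all-zero patterns.
    return (b + c) / denom if denom else 0.0
```

With the hypothetical patterns `x = [1, 1, 0, 0, 1, 0]` and `y = [1, 0, 1, 0, 1, 0]`, the counts are a = 2, b = 1, c = 1 on k = 6 factors, so m = 2/6, j = 2/4 and d = 2/6. Shared negative responses enter only through k in the simple mismatch coefficient, which is why m can be smaller than j for the same pair.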
100. ze of the corresponding population. For example, the third phenotype (No. 3, Fig. 4.6) occurs 2 times among a total of 12 isolates in sample P2 and only once among a total of 15 isolates in sample P3; FreqP2 and FreqP3 then equal 2/12 = 0.167 and 1/15 = 0.067, respectively. Clicking the Excel button at the top right opens an Excel file and copies the contents into an Excel sheet. [Screenshot: List of pairwise common phenotypes for the current pair of files POP2.VAT and POP3.VAT, with columns for the binary pattern of each phenotype, CountP2, CountP3, Min Counts, Freq P2, and Freq P3.] Figure 4.6. Pairwise common phenotype sheet. The 7 common phenotypes are revealed and analyzed in pairwise comparison of two populations, pop2.vat and pop3.vat. Note: for file names P1, P2, P3 see co
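The counts and frequencies reported on the pairwise common phenotype sheet can be sketched in Python as follows. This is a minimal illustration, not the VAT implementation: the function name, the dictionary keys, and the toy patterns are hypothetical, and each isolate is assumed to be represented by its binary pattern as a string.

```python
from collections import Counter
from typing import Dict, List, Sequence


def pairwise_common(sample_a: Sequence[str], sample_b: Sequence[str]) -> List[Dict]:
    """List the phenotypes present in both samples with counts and frequencies."""
    counts_a, counts_b = Counter(sample_a), Counter(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    rows = []
    # Only phenotypes that occur in both samples are "common".
    for pattern in sorted(counts_a.keys() & counts_b.keys()):
        ca, cb = counts_a[pattern], counts_b[pattern]
        rows.append({
            "pattern": pattern,
            "count_a": ca,
            "count_b": cb,
            "min_count": min(ca, cb),
            "freq_a": ca / n_a,  # count divided by the sample size
            "freq_b": cb / n_b,
        })
    return rows


# Hypothetical samples of 12 and 15 isolates sharing one phenotype.
p2 = ["1100"] * 2 + ["0011"] * 10
p3 = ["1100"] * 1 + ["0101"] * 14
common = pairwise_common(p2, p3)
```

Here the shared pattern occurs 2 times among 12 isolates and once among 15, so its frequencies are 2/12 and 1/15, matching the arithmetic of the worked example in the text.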
