
OptiMAS V1.5 Manual - Integrated Breeding Platform


Contents

1. Organization and description of the files that have been installed or supplied within the software package:
README -> this file
INSTALL -> building and installation instructions
COPYING -> license
AUTHORS
optimas/ -> source code to build the optimas command-line executable
optimas_gui/ -> sources and other files to build the OptiMAS GUI
doc/ -> user manual / tutorial in PDF format
input/ -> sample input data (genetic map and genotype/pedigree)
    moreau.dat, moreau.map -> biparental input example (old format map)
    blanc.dat, blanc.map, .qtlpos, .qtll -> multiparental input example
    allelic effects text file (.txt) -> allelic effects for the blanc multiparental example
output/ -> results data obtained from a further analysis (can be reloaded in the GUI)
    moreau -> biparental example, ready to be analyzed
    blanc -> multiparental example, ready to be analyzed
optimas and optimas_gui -> OptiMAS command-line and GUI executables
website/ -> local HTML version of the documentation / tutorial
install_optimas_on_linux.sh -> installation shell script for Linux systems
3 Data preparation
To run OptiMAS v1.5 you must supply data containing all information about your MAS design, i.e. genotype/pedigree data, a classic genetic map and information on QTLs. Quantitative traits and allelic
2. The structure of this file (output/date/tab_homo_hetero.txt) can be represented as follows (first columns only; "..." marks unreadable cells):
Id           MS       All(+/+)  All(-/-)  All(+/-)  QTL1(+/+)  QTL1(-/-)  QTL1(+/-)  QTL2(+/+)  Note
IL1          0.33333  0.33333   0.66666   0.00000   1.00000    0.00000    0.00000    0.00000    (2)
IL2          0.33333  0.33333   0.66666   0.00000   0.00000    1.00000    0.00000    1.00000
IL1_IL2_F1   0.33333  0.00000   0.33333   0.66666   0.00000    0.00000    1.00000    0.00000
IL3_IL4_F1   0.33333  0.00000   0.33333   0.66666   0.00000    1.00000    0.00000    0.00000
IL1_IL2_F2   0.66137  0.65614   0.33333   0.01047   0.98658    0.00003    0.01335    0.98184
IL3_IL4_F2   ...      ...       ...       0.16919   0.00000    0.99996    0.00000    0.89854    (1)
Ind1         0.58804  ...       0.13700   0.54989   0.00000    0.00671    0.99328    0.93928    (3)
Columns of this file correspond to:
QTLx(+/+): probability to be homozygous for a favorable allele at the QTL position (or to be heterozygous with two different favorable alleles, when several parental alleles are considered as favorable).
QTLx(-/-): probability to be homozygous for an unfavorable allele at the QTL position (or to be heterozygous with two different unfavorable alleles, when several parental alleles are considered as unfavorable).
QTLx(+/-): probability to be heterozygous at the QTL position (one favorable allele with one unfavorable).
All(+/+), All(-/-), All(+/-): mean of the previous probabilities for all QTL together.
MS: Molecular Score, expected
3. [Figure 31: Reloading a previous analysis. Select the directory of results to reload, e.g. 28_08_2012_15_05_13]
5.6 FillMD Mk: a tool to replace genotyping errors with missing data
Genotyping errors coming from the genotypes/pedigree file (.dat file) and recorded in the events_summary.log file (if present) can be replaced with missing data, once the consistency of marker genotyping information has been checked along generations of selection. Select Tools > FillMD Mks from the menu bar (see below).
[Screenshot: FillMD Mks dialog (fill genotyping errors with missing data)]
A new map file (.cmap) has been created to run OptiMAS marker by marker, cycle by c
4. 4.2.1 _diplotypes_set: probabilities of phased genotypes
4.2.2 _gametes_set: probabilities of gametes
4.2.3 tab_homo_hetero: probabilities to be homozygous or heterozygous at the QTL position(s)
4.2.4 tab_scores: prediction of genetic value
4.2.5 tab_parents: estimated probabilities of parental alleles
4.2.6 tab_check_diplo: sum of phased genotypes probabilities (with the cut-off)
4.2.7 events_summary.log: questionable results
5 OptiMAS Graphical User Interface (GUI)
5.1 Running OptiMAS GUI
5.2 Step 1: Computation of genotypic probabilities / Estimation of genetic values
5.3 Step 2: Selection of individuals
5.3.1 Methods for selection
5.3.2 Display and comparison of lists of selected individuals
5.4 Step 3: Identification of crosses to be made among selected individuals
5.5 Saving / reloading your previous analysis
5.6 FillMD Mk: a tool to replace genotyping errors w
5. 4
qtl2 5
.qtlw file: genetic distance defined on either side of the QTL position set in the .qtlpos file. The marker set consists of the markers from the map file included in the resulting window.
QTL    window
qtl1   30
qtl2   20.5
Fields:
QTL: names of the QTLs (without blanks in the character string); they have to appear in the same order as in the .qtlpos file.
mrk_list: list of marker names. They have to match those in the map file.
mrk_nb: integer value indicating, for each QTL, the number of flanking markers.
window: genetic distance (in cM).
Note:
- Never change the header field names in the map file and QTL information files.
- It is recommended to code parental alleles ("all" column) by a single character (e.g. a, A, b, 1).
- Check that decimals are written 0.00 and not 0,00 for the marker/QTL positions (column pos).
- Every file must be in plain-text, tab-delimited format. Use tabulations and not spaces between fields, even if they are empty (e.g. marker1[tab]1[tab]42.2 in the map file, or qtl1[tab]1[tab]70.0[tab][tab][tab]a in the .qtlpos file).
- The markers present in the map file must be ordered.
- Both QTL information files have to share the same base name (e.g. maize.qtlpos and maize.qtll).
- If several files describing markers to be used to track QTL are present in the directory, only one will be used, according to the priority rule .qtll > .qtlw > .qtln.
4 OptiMAS in command line
OptiMAS command line version manages computation
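The formatting rules in the note above (tab-delimited fields, empty fields kept, decimal point rather than decimal comma) can be checked before running OptiMAS. The following is a minimal sketch, not part of OptiMAS: the function name `check_tab_delimited` and the `n_fields` parameter are hypothetical, and the field counts used in the example (3 for a map row, 6 for a qtlpos row) are simply read off the two example rows quoted above.

```python
import re

def check_tab_delimited(lines, n_fields):
    """Return a list of (line_number, message) for rows that break the rules.

    Hypothetical helper: n_fields is the number of tab-separated fields
    expected on every row (empty fields count).
    """
    problems = []
    for i, line in enumerate(lines, start=1):
        row = line.rstrip("\n")
        if not row:
            continue  # ignore blank lines
        fields = row.split("\t")
        if len(fields) != n_fields:
            problems.append(
                (i, f"expected {n_fields} tab-separated fields, found {len(fields)}")
            )
        for field in fields:
            # digits,digits looks like a decimal comma (0,00 instead of 0.00)
            if re.fullmatch(r"-?\d+,\d+", field.strip()):
                problems.append((i, f"decimal comma in {field.strip()!r}"))
    return problems

map_rows = ["marker1\t1\t42.2", "marker2 1 55.0", "marker3\t1\t60,5"]
qtlpos_rows = ["qtl1\t1\t70.0\t\t\ta"]

map_problems = check_tab_delimited(map_rows, n_fields=3)
qtlpos_problems = check_tab_delimited(qtlpos_rows, n_fields=6)
```

Here the second map row is flagged because it uses spaces instead of tabulations, the third because of the decimal comma, and the qtlpos row passes because its three empty fields are still delimited by tabulations.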
6. 4.2.1. The genotype at this QTL position is a/b (3rd locus) and a is the favorable allele. The molecular score for this individual will be: MS_QTL1 = p(a/b) × dose/2 = 1 × 1/2 = 0.5. The genetic value of the individual (MS column) is obtained by averaging the molecular score over all QTL: MS = (0.5 + 0.5 + 0.0) / 3 = 0.33. The output file also contains additional columns corresponding to quantitative traits if they were provided by the user (optional).
4.2.5 tab_parents: estimated probabilities of parental alleles
Beyond the global scores presented above, it can be interesting to display the probability of having received a given parental allele at individual QTL positions and globally across QTL. The default structure of this file (output_folder/date/tab_parents.txt) can be represented as follows:
[Table: example rows of tab_parents.txt, one row per individual]
Columns of this file correspond to:
QTLx(parental allele): expected proportion of the parental allele at QTLx.
All(parental allele): average of the QTLx values over all QTL.
MS: Molecular Score, expected proportion of favorable alleles over all QTL.
4.2.6 tab_check_diplo: sum of phased genotypes probabilities (with the cut-off)
Within OptiMAS, some algorithms (such as selfing) may become memory and/or time consuming depending on the complexity of your MAS design. To overcome th
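The worked example at the start of this passage (score of 0.5 at a QTL carrying a/b with a favorable, then an average of 0.33 over three QTL) can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions, not OptiMAS code: the function names are hypothetical, and summing p × dose/2 over several possible genotypes is an assumed generalization of the single-genotype formula given in the text.

```python
def qtl_score(genotype_probs, favorable):
    """Expected proportion of favorable alleles at one QTL.

    genotype_probs maps a genotype (allele1, allele2) to its probability;
    favorable is the set of alleles declared favorable at this QTL.
    The dose is the number of favorable alleles carried (0, 1 or 2).
    """
    total = 0.0
    for genotype, prob in genotype_probs.items():
        dose = sum(allele in favorable for allele in genotype)
        total += prob * dose / 2
    return total

def molecular_score(qtl_scores):
    """Genetic value (MS): average of the per-QTL scores."""
    return sum(qtl_scores) / len(qtl_scores)

# Example from the text: genotype a/b with probability 1, a favorable.
ms_qtl1 = qtl_score({("a", "b"): 1.0}, favorable={"a"})
# Per-QTL scores of 0.5, 0.5 and 0.0 give the MS column value.
ms = molecular_score([0.5, 0.5, 0.0])
```

With these inputs `ms_qtl1` is 0.5 and `ms` rounds to 0.33, the two values quoted in the text.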
7. [Table: ranked individuals with their pedigree, cycle, group, MS, weighted MS, UC and per-QTL scores]
Figure 7: genetic value (molecular score) for each individual, over all QTL and for each QTL.
Individuals are represented in lines, followed by their pedigree, the cycle of selection and the group they belong to. Plants can be sorted according to their value in any column. For instance, to sort on the molecular score (MS), which is the expected proportion of favorable alleles at the QTL position(s), click on the MS column. The genetic values (MS, red column) vary between 0 and 1. A value of 1 indicates that the individual corresponds to the targeted ideotype: it is homozygous for the favorable allele(s) at this QTL or at all the QTL. In this maize example MS v
8. [Screenshot: weight dialog over the molecular score table, with Reset, Cancel and Apply buttons]
Figure 11: weighted molecular score, giving more or less importance to the different QTL.
We noticed in this example that the favorable allele x at QTL1 may be lost because (i) the ten best individuals of the panel have a molecular score of 0 at QTL1 (see Fig. 9, cells in red) and (ii) the graphs in Fig. 10a and 10b indicate the same decay. Thus a weight of 3.0 has been attributed to QTL1. Assign QTL weights, then press Apply. This updates the Weight column (in blue) and therefore produces a new ranking of individuals. In addition to the molecular score and the weight columns, OptiMAS estimates a Utility Criterion (UC, green column in Fig. 8), which evaluates the expected value of the superior gametes of each individual by combining the molecular score with the expected variance of the MS of its gametes
9. Files output_folder/date/each_qtl/qtlx/qtlx_diplotypes_set.txt contain probabilities for diplotypes according to the following structure:
QTL  Id  haplo1  haplo2  read1  read2  proba  nb_haplo
[Example table for QTL 1: one row per possible diplotype of IL1, IL2, IL3, IL4, IL1_IL2_F1, IL3_IL4_F1, IL1_IL2_F2, IL3_IL4_F2, Ind1 and Ind2. Parental lines and F1 hybrids have a single diplotype with proba 1.000000 and nb_haplo 1. The most likely diplotype of IL1_IL2_F2 has proba 0.689520 (nb_haplo 24); nb_haplo is 189 for IL3_IL4_F2, 118 for Ind1 and 218 for Ind2.]
10. [Figure 10, first panels: frequency of favorable alleles per QTL and histogram of the Molecular Score (MS). No. individuals: 397, No. QTL: 11, Mean: 0.109483, Var: 0.0453865]
The graph in Fig. 10a indicates the frequency of favorable alleles at the different generations of selection (on average and for each QTL). Note that no genetic gain is expected for the last generation (C2, in blue) because individuals are not selected yet.
[Figure 10, last panels: number of individuals (N) by Molecular Score (MS) and by group (G1, G2, G3). Mean: 0.582949, Var: 0.00635147]
Figure 10: distribution of QTL/MS (global genetic values) and their evolution over the different cycles of selection.
Fig. 10b and 10c show the distribution of the molecular score for each QTL separately and on average for all QTL, whereas Fig. 10d displays the average of the MS for individuals classified according to another classification criterion (e.g. subprograms, families, etc.). All the graphs can be exported in PNG, SVG or EPS format by using the Save button. In the estimation of the Molecular Score (MS), OptiMAS attributes the same weight to all QTL
11. button to display the filter dialog (see below).
[Screenshot: "Columns view" tab listing the columns (P1, P2, Cycle, Group, MS, MS_Weight, MS_UC, No columns, QTL1, QTL2, ...) and "Individuals (rows) view" tab listing the individuals with their P1, P2, Cycle and Group]
Figure 14: Columns and Individuals (rows) filter dialog.
Fig. 14 presents two tables displaying (i) the list of the variables (left tab) and (ii) the list of all the individuals (right tab) in the display table. Individuals can be filtered by cycles of selection, by groups or manually. Select and check the columns and/or individuals (rows) to follow, then press the OK button to apply the corresponding filter. This immediately refreshes the tables. This new view
12. factorial design applied between two lists of selection. Thus a list containing only two individuals (B124 and B125), previously found via the QCS method, has been crossed with the ten individuals having the highest MS (i.e. List _TS_MS), resulting in the factorial design displayed in Fig. 29.
5.5 Saving / reloading your previous analysis
Lists of selected individuals and crosses can be saved into the results folder. On the menu bar, press Data > Save all, then press the Choose button. Select the directory where the data will be saved.
Figure 30: saving your current analysis.
By default all the lists are saved to the appropriate directory specified in Fig. 5. Note that the file tab_scores.txt will be overwritten to include the possibly added PMS and index columns. Once the files are saved, you can close OptiMAS and reopen your analysis later by selecting the results folder previously saved and loading it (see below). On the menu bar, press File > Reload Data > select your previous folder > OK.
13. for Windows XP or C:\Program Files (x86)\OptiMAS for Windows 7. A new folder named OptiMAS containing the two data set examples will be created in your home directory. You can now launch the OptiMAS interface via the start menu or the desktop shortcut.
2.2.2 Linux (32/64 bits)
The software has been tested on Debian, Ubuntu and Fedora. We strongly recommend having g++, make, Qt, qwt and graphviz installed via the package manager of your GNU/Linux distribution (aptitude/apt on Debian and Ubuntu, or yum on Fedora).
Building prerequisites:
1. GNU compiler collection (version 4.0.1 or later): http://gcc.gnu.org
2. Qt development package (v4.4.3 or later; Qt5 not tested yet): http://qt-project.org
3. qwt development package (v5.x; v6 not supported yet): http://sourceforge.net/projects/qwt
4. graphviz development package (version 2.20.2 or later): http://www.graphviz.org
e.g.
Debian/Ubuntu platforms: apt-get install g++ libqt4-dev libqwt5-qt4-dev libgraphviz-dev
Redhat/Fedora/CentOS platforms: yum install gcc-c++ qt4-devel qwt-devel graphviz-devel
To build and install optimas_gui (GUI) on your system, extract the zipped file optimas_gui_linux.zip by typing, for example:
unzip optimas_gui_linux.zip
This will create a new directory called optimas_gui_linux. Then open a terminal, move to this directory before attempting to run the program, and run the installation shell (bash) script install_optimas_on_lin
14. identified (see Fig. 18, in red); (ii) among the remaining individuals belonging to Cycle, taken in order of decreasing MS (with MS > MS_min), OptiMAS searches for the individual having favorable alleles present at the largest number of those QTL identified in (i). This individual is added to the subset of selected individuals (i.e. N = N0 + 1). Steps (i) and (ii) are iterated until one of the following conditions is met:
- favorable alleles at all QTL are present in at least n individuals of the selected subset, or
- the number of individuals in the selected subset reaches the given Nmax value, or
- it is not possible to find an individual in step (ii).
Note that in step (ii), MS is a secondary criterion: individuals are taken based on their ability to complement the subset of already selected individuals. Thus, with the present set of parameters, two individuals (B124, B125) have been added to the selected list (see more details on the right side of Fig. 18).
5.3.2 Display and comparison of lists of selected individuals
The different lists of selected individuals can be compared in two parallel tables (see below), displayed by default, which can be reduced to one list by clicking on the button.
[Screenshot: "Selection of individuals" tab showing the lists List2_Truncation_MS_Selection and List4_Complementation_Selection side by side]
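The iteration described above (find the QTL that are still poorly covered, then add the remaining individual that best complements the subset, until one of the three stopping conditions holds) can be sketched as a greedy loop. This is an illustrative sketch only, not the OptiMAS implementation: the function name, the parameter names and especially the `present` threshold (an individual is assumed to "carry" a favorable allele at a QTL when its per-QTL score reaches that threshold) are assumptions, since the text shown here does not define how presence is decided in step (i).

```python
def complement_selection(scores, ms, selected, n_required=1, n_max=10,
                         ms_min=0.0, present=0.5):
    """Greedy complementation sketch.

    scores: {individual id: [per-QTL score, ...]}
    ms: {individual id: molecular score}
    selected: ids already selected (the starting subset)
    """
    selected = list(selected)
    n_qtl = len(next(iter(scores.values())))
    while len(selected) < n_max:            # stop when Nmax is reached
        # Step (i): QTL carried by fewer than n_required selected individuals.
        lacking = [q for q in range(n_qtl)
                   if sum(scores[i][q] >= present for i in selected) < n_required]
        if not lacking:                      # all QTL sufficiently covered
            break
        # Step (ii): scan remaining individuals by decreasing MS (MS > ms_min)
        # and keep the one covering the largest number of lacking QTL.
        candidates = sorted((i for i in scores
                             if i not in selected and ms[i] > ms_min),
                            key=lambda i: ms[i], reverse=True)
        best, best_cover = None, 0
        for i in candidates:
            cover = sum(scores[i][q] >= present for q in lacking)
            if cover > best_cover:           # ties keep the higher-MS individual
                best, best_cover = i, cover
        if best is None:                     # nobody complements the subset
            break
        selected.append(best)
    return selected

# Toy data (hypothetical): three QTL, four individuals.
scores = {"A": [1, 1, 0], "B": [1, 0, 0], "C": [0, 0, 1], "D": [0, 1, 1]}
ms = {"A": 0.67, "B": 0.33, "C": 0.33, "D": 0.67}
one_copy = complement_selection(scores, ms, ["A"], n_required=1)
two_copies = complement_selection(scores, ms, ["A"], n_required=2)
capped = complement_selection(scores, ms, ["A"], n_required=2, n_max=3)
```

Because candidates are scanned in order of decreasing MS and replaced only on a strictly better coverage, MS acts as a tie-breaker, which matches its role as a secondary criterion in the text.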
15. individuals and their genotypes. With this information, OptiMAS determines at each QTL the set of possible gametes produced by each individual and estimates their probabilities. Results for each QTL (designated as x) are stored in a specific folder named qtlx. Files output_folder/date/each_qtl/qtlx/qtlx_gametes_set.txt contain probabilities for gametes according to the following structure:
QTL  Id  gamete  read  proba  nb_gam
[Example table for QTL 1: one row per possible gamete of IL1, IL2, IL3, IL4, IL1_IL2_F1, IL3_IL4_F1, IL1_IL2_F2, IL3_IL4_F2, Ind1 and Ind2. Parental lines have a single gamete with proba 1.000000. For IL1_IL2_F1 (nb_gam 32) the listed gametes have proba 0.320842, 0.058067, 0.058067 and 0.320842; for IL1_IL2_F2 (nb_gam 8) they have proba 0.830127, 0.150240, 0.004207 and 0.000758; nb_gam is 32 for IL3_IL4_F2, 384 for Ind1 and 576 for Ind2.]
16. named C2. OptiMAS is used at this step to select the best individuals that will be used for the next cycle of MARS.
Figure 6: multiparental MAS study (Blanc et al. 2006, 2008) used as example within OptiMAS.
5.2 Step 1: Computation of genotypic probabilities / Estimation of genetic values
Algorithms to compute the probabilities of IBD alleles transmission throughout generations of selection have been deployed, and results are displayed via three tables (corresponding to sections 4.2.4, 4.2.3 and 4.2.5) and graphs (see below).
[Screenshot: molecular score table with tabs "Homo/Hetero", "Estimation of parental allele probabilities" and "Graphs", buttons View, Weight, Index and Export, and columns Id, P1, P2, Cycle, Group, MS, MS_Weight, MS_UC, No columns, QTL1, QTL2, ... Double-click on MS/QTL cells to show detailed genotypes. First row: B8, A1005, A1005, C2, MS 0.8366, MS_Weight 0.7079, MS_UC 9.2031]
17. proportion of favorable alleles over all QTL: MS = All(+/+) + 0.5 × All(+/-) (see section 4.2.4).
Note:
1. Individual IL3_IL4_F2 has a probability of 0.99996 to be homozygous unfavorable at QTL1. The sum of the probabilities QTL1(+/+) + QTL1(-/-) + QTL1(+/-) is not 1.0 because some rare phased genotypes were removed via the cut-off (by default, see section 4.2.6).
2. The individual IL1 has a molecular score of 0.333 because it is 100% homozygous favorable at QTL1 and 0% at the two other QTL.
3. Individual Ind1 has a MS of 0.588, considering information at all three QTL:
Id    MS     All(+/+)  All(-/-)  All(+/-)  QTL1(+/+)  QTL1(-/-)  QTL1(+/-)  QTL2(+/+)  QTL2(-/-)  QTL2(+/-)  QTL3(+/+)  QTL3(-/-)  QTL3(+/-)
Ind1  0.588  0.313     0.137     0.549     0.000      0.006      0.993      0.939      0.000      0.060      0.000      0.403      0.596
4.2.3.2 in terms of parental alleles
In a multi-allelic context, several parental alleles can be regrouped and considered as favorable at the QTL position (see the summarized table above). Nevertheless, it can be interesting in some cases to know the detailed probabilities of the possible genotypes in terms of the parental origin of alleles. Results for each QTL (designated as x) are stored in a specific folder named qtlx. Files output_folder/date/each_qtl/qtlx/qtlx_homo_hetero.txt contain details about parental allele origins at QTL positions according to the following structure (QTL 2 example):
IL1 Homo 0.00000
18. [Screenshot: advanced options/parameters (QTL analysed, cut-off, mode, Step 3 intermating) and counters for QTL, markers and individuals]
Figure 5: data set importation to run the program.
Loading input/output files using the browser:
Map file: path to the genetic map input file (.map). Remarks: the QTL information files and the map file have to be in the same folder and share the same base name (blanc.map, blanc.qtlpos and blanc.qtll). Note that the method used to determine the markers flanking the QTL is determined by the file(s) found in the same folder. If several files describing markers to be used to track QTL are present in the directory, only one will be used, according to the priority rule .qtll > .qtln > .qtlw.
Genotype file: path to the genotype/pedigree file (.dat).
Allelic effects file (optional; can be loaded or refreshed after loading the mandatory files): it is accessed by browsing and may be located in any folder. It should conform to the following structure:
QTL   allele  Silk  GrMoisture  GrYield
QTL1  s       0.21  0.04        0.017
QTL2  f       0.42  0.42        0.082
QTL2  s       0.06  0.11        0.083
QTL2  x       0.35  0.26        0.098
QTL3  d       0.48  0.18        2
QTL3  f       0.19  0.01
QTL3  s       0.31  0.02
QTL3  x       0.03  0.2
A tab-delimited text file where each number (e.g. 0.22) describes the effect of one parental allele (e.g. d) at a given QTL (e.g. QTL1) for a particular Trait x Environment combination (e.g. Silk). Note that missing informati
19. ...
2 Installation procedure
2.1 OptiMAS in command line
2.1.1 Windows (32 bits)
2.1.2 Linux (32/64 bits)
2.1.3 MacOSX (32 bits)
2.2 OptiMAS Graphical User Interface (GUI)
2.2.1 Windows (32 bits)
2.2.2 Linux (32/64 bits)
2.2.3 MacOSX (32 bits)
2.3 Files and directories description
3 Data preparation
3.1 Genotypes/Pedigree file (.dat)
3.2 Genetic Map (.map)
3.3 QTLs information files
4 OptiMAS in command line
4.1 Running OptiMAS in command line
4.2 Results and output files interpretation
4
20. References
Bernardo R., Moreau L. and Charcosset A. (2006) Number and fitness of selected individuals in marker-assisted and phenotypic recurrent selection. Crop Sci. 46: 1972-1980.
Blanc G., Charcosset A., Mangin B., Gallais A. and Moreau L. (2006) Connected populations for detecting quantitative trait loci and testing for epistasis: an application in maize. Theor. Appl. Genet. 113: 206-224.
Blanc G., Charcosset A., Veyrieras J.B., Gallais A. and Moreau L. (2008) Marker-assisted selection efficiency in multiple connected populations: a simulation study based on the results of a QTL detection experiment in maize. Euphytica 161: 71-84.
Hospital F., Goldringer I. and Openshaw S. (2000) Efficient marker-based recurrent selection for multiple quantitative trait loci. Genet. Res. 75: 357-368.
Huang Y.F., Madur D., Combes V., Ky C.L., Coubriche D., Jamin P., Jouanne S., Dumas F., Bouty E., Bertin P., Charcosset A. and Moreau L. (2010) The Genetic Architecture of Grain Yield and Related Traits in Zea maize L. Revealed by Comparing Intermated and Conventional Populations. Genetics 186: 395-404.
Ribaut J.M., de Vicente M.C. and Delannay X. (2010) Molecular breeding in developing countries: challenges and perspectives. Curr. Opin. Plant Biol. 13: 213-218.
21. 0; Hetero(+/-) 0.000000; Homo(-/-) 1.000000 (a/a 1.000000)
IL2: Homo(+/+) 1.000000 (b/b 1.000000); Hetero(+/-) 0.000000; Homo(-/-) 0.000000
IL3: Homo(+/+) 1.000000 (c/c 1.000000); Hetero(+/-) 0.000000; Homo(-/-) 0.000000
IL4: Homo(+/+) 0.000000; Hetero(+/-) 0.000000; Homo(-/-) 1.000000 (d/d 1.000000)
IL1_IL2_F1: Homo(+/+) 0.000000; Hetero(+/-) 1.000000 (a/b 1.000000); Homo(-/-) 0.000000
IL3_IL4_F1: Homo(+/+) 0.000000; Hetero(+/-) 1.000000 (c/d 1.000000); Homo(-/-) 0.000000
IL1_IL2_F2: Homo(+/+) 0.981847 (b/b 0.981847); Hetero(+/-) ... (a/b ...); Homo(-/-) 0.000082 (a/a 0.000082)
IL3_IL4_F2: Homo(+/+) 0.898542; Hetero(+/-) ...; Homo(-/-) ...
Ind1: Homo(+/+) 0.939286 (b/c 0.939286); Hetero(+/-) 0.060239 (two genotypes, 0.008636 and ...); Homo(-/-) 0.000474
Ind2: Homo(+/+) ...; Hetero(+/-) 0.060239; Homo(-/-) 0.000474
Note: at the QTL2 position, the favorable alleles are b and c (see map file). Individual Ind1 has a very high probability to be homozygous favorable (Homo(+/+) = 0.939286). We can see that this is due to a high probability of being heterozygous for the favorable alleles b and c (b/c = 0.939286).
4.2.4 tab_scores: prediction of genetic value
This table summarizes and presents the molecular scores for each/all QTL and additional indexe
22. [Screenshot: index dialog over the molecular score table, with Save, Reset, Cancel and Apply buttons]
Figure 12: the resulting index column is always located just before the MS column.
The Find Id dialog box can be used to search for and locate a specific individual in the panel. Press the Find button (see below).
Figure 13: find individual by name (Find Id dialog).
By default, the search identifies individuals whose name contains the declared string (B124 in Figure 13). Exact matching is also possible (check Whole words only). Enter the Id of the individual that you are looking for, with the appropriate parameters, into the search box and then press the Find button. Any match moves the main display to the exact position of the individual, which is also graphically highlighted.
The Filter (Columns / Individuals rows) dialog is used to enable or disable the display of any columns (except the Id column) and/or individuals (rows) in the MS table. Press the View
23. Figure 25: half-diallel vs better-half strategies.
In the figure above, individuals are ranked on the two axes based on their genetic value (MS), from highest to lowest. On the left side, the Complete procedure (half diallel) has been applied: all the individuals were crossed together with no limitation on their contribution. On the right side (better half method), crosses between individuals having the lowest MS have been avoided (i.e. B37 to B125).
Lists of crosses created via the different methods and/or constraints can also be analyzed and compared via histograms (i.e. select Graph: Histo, see below).
[Histograms of the Molecular Score (MS) of the couples. List1_Half_Diallel: Mean 0.756312, Var 0.000623468. List2_Better_Half: Mean 0.773946, Var 0.000337124]
Figure 26: comparison of the two lists (half diallel vs better half) of couples generated.
As expected from the higher relative contribution of individual B8, the average MS is higher for the Better Half option (0.773946 vs 0.756312 for the half diallel) and the variance of MS among generated couples is lower (3.37 × 10^-4 vs 6.23 × 10^-4). Constraints on the contribution of parents or on the maximum number of crosses to be done can be applied (see Fig. 27, 28). In this case c
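The comparison above (a half diallel crosses every pair, while avoiding crosses between the lowest-ranked individuals raises the mean MS of the couples and lowers its variance) can be illustrated on a small list. This is a sketch under explicit assumptions, not the OptiMAS procedure: the MS of a couple is assumed to be the average of the two parental MS, and the rank-sum rule used for `better_half` is a hypothetical stand-in, because the exact rule used by OptiMAS is not given in this passage. The four MS values are those of B8, B158, B28 and B13 in the example table.

```python
from itertools import combinations
from statistics import mean, pvariance

# Individuals sorted by decreasing MS (values from the example table).
ms = {"B8": 0.8366, "B158": 0.8024, "B28": 0.7740, "B13": 0.7609}
ranked = sorted(ms, key=ms.get, reverse=True)

def half_diallel(ids):
    """All n(n-1)/2 unordered crosses between the listed individuals."""
    return list(combinations(ids, 2))

def better_half(ids):
    """Hypothetical rule: keep a cross only if the ranks of its two parents
    (1 = best) sum to at most n + 1, which drops low x low crosses."""
    n = len(ids)
    rank = {ind: r for r, ind in enumerate(ids, start=1)}
    return [(a, b) for a, b in combinations(ids, 2) if rank[a] + rank[b] <= n + 1]

def couple_ms(pair):
    """Assumed expected MS of a cross: average of the parental MS."""
    return (ms[pair[0]] + ms[pair[1]]) / 2

all_crosses = half_diallel(ranked)
kept_crosses = better_half(ranked)
all_values = [couple_ms(p) for p in all_crosses]
kept_values = [couple_ms(p) for p in kept_crosses]
summary = {
    "half_diallel": (mean(all_values), pvariance(all_values)),
    "better_half": (mean(kept_values), pvariance(kept_values)),
}
```

On this toy list the filtered set keeps four of the six crosses, with a higher mean and a lower variance than the full half diallel, which is the same direction of change as reported for Fig. 26.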
24. 4, etc.
Group (optional): information regarding another classification criterion (e.g. subprograms, families, etc.).
Mk1 ... Mkn: genotyping results. The software can deal with SNPs, microsatellites and any bi/multi-allelic marker genotyping technique, with either dominant or codominant scoring. Note that:
- The markers present in the genotype/pedigree file must match those in the map file (same name) but need not be in the same order.
- Homozygous genotypes for an allele (e.g. A) can be scored either as A or A/A.
- Heterozygotes are expected to be separated by a "/" (e.g. A/G). Heterozygous genotypes are assumed to be unphased (i.e. A/B is equivalent to B/A).
- Missing data at marker loci are allowed and must be entered as "-" or can be left blank.
- For dominant markers (assuming A dominant vs a recessive), genotypes presenting allele A must be coded A/-.
- Parental inbred lines should not contain missing data.
Tr1 ... Trn (last columns, optional): quantitative traits. Names of the traits should begin with a star (*) to differentiate them from markers.
Note that residual heterozygosity can be declared at markers (not QTL); however, using this option is discouraged due to the loss of information.
Description of the genotypes/pedigree file example, in relationship with the multiparental MARS scheme (see table above):
1. Parental line, where a is the name given to its alleles (same name for all loci).
2. Example of a hybrid F1 obtained by intercro
25. [Screenshot: detailed genotype dialog for individual B8. Hetero(+/-) 0.223785: d/f 0.195505, d/s 0.028279. Homo(-/-): d/d 0.012944. Founders: d 0.124836, f 0.732908, s 0.142245, x 0.000000]
Figure 8: detailed genotype in terms of parental alleles at the QTL position.
Fig. 8 shows that individual B8 has a molecular score of 0.8752 at QTL4. It has a probability of 0.763260 to be homozygous for the favorable alleles s and f (i.e. Homo(+/+)). This probability is the sum of the probabilities of the genotypes f/f (0.522327), s/f (0.225656) and s/s (0.015277). Its MS of 0.8752 corresponds to the expected proportion of favorable alleles, i.e. Homo(+/+) + 1/2 Hetero(+/-). Founders (represented by the d, f, s and x alleles) indicate the expected proportion of parental alleles. We notice that this individual is derived from three parental lines (D, F, S) and not from X (see also the pedigree in Fig. 20). A colored view of the molecular score table can be displayed to identify more easily QTL for which a g
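The figures quoted for B8 at QTL4 can be cross-checked by grouping the genotype probabilities shown in the Fig. 8 dialog. The sketch below is illustrative only: the function name and data layout are assumptions, and the founder proportion of an allele is assumed to be half the summed probability of each genotype copy carrying it, which is consistent with the values in the dialog.

```python
from collections import defaultdict

# Genotype probabilities for B8 at QTL4, as listed in the Fig. 8 dialog.
genotypes = {
    ("f", "f"): 0.522327, ("s", "f"): 0.225656, ("s", "s"): 0.015277,
    ("d", "f"): 0.195505, ("d", "s"): 0.028279, ("d", "d"): 0.012944,
}
favorable = {"s", "f"}

def summarize(genotypes, favorable):
    """Group genotype probabilities into Homo(+/+), Hetero(+/-), Homo(-/-),
    then derive the molecular score and the founder allele proportions."""
    homo_fav = hetero = homo_unfav = 0.0
    founders = defaultdict(float)
    for (a1, a2), prob in genotypes.items():
        n_fav = (a1 in favorable) + (a2 in favorable)
        if n_fav == 2:
            homo_fav += prob      # two favorable alleles, identical or not
        elif n_fav == 1:
            hetero += prob
        else:
            homo_unfav += prob
        founders[a1] += prob / 2  # each allele copy counts for one half
        founders[a2] += prob / 2
    score = homo_fav + 0.5 * hetero
    return homo_fav, hetero, homo_unfav, score, dict(founders)

homo_fav, hetero, homo_unfav, score, founders = summarize(genotypes, favorable)
```

Rounded to four decimals, `score` gives the 0.8752 reported in the table, and the `founders` values agree with the d, f and s proportions of the dialog to within rounding.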
26. Documentation for OptiMAS: a decision support tool for marker-assisted assembly of diverse alleles
Version 1.5
F. Valente, F. Gauthier, N. Bardol, G. Blanc, J. Joets, A. Charcosset & L. Moreau
Code by Fabio Valente and Franck Gauthier, with contribution from Guylaine Blanc. Project started in December 2009.
Emails: moreau@moulon.inra.fr, charcos@moulon.inra.fr
Website: http://moulon.inra.fr/optimas
Address for correspondence: UMR de Génétique Végétale, INRA, Univ Paris-Sud, CNRS, AgroParisTech, Ferme du Moulon, F-91190 Gif-sur-Yvette, France.
This program was funded by the Generation Challenge Program.
OptiMAS is a free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the license, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.
Contents
1 Introduction
27. Figure 3: probabilities of phased genotypes (diplotypes) for each individual at QTL1 (example rows of the file for Ind2, nb_haplo = 218).
Columns of this file correspond to:
QTL: index of the QTL.
Id: name of each individual.
haplo1/haplo2: possible pair of haplotypes, i.e. phased genotype (also called diplotype), defined according to parental origin.
read1/read2: translation of haplo1 and haplo2 in terms of observed marker alleles. Note that a given read1/read2 combination may correspond to several haplo1/haplo2 combinations.
proba: probability of this specific possible phased genotype (diplotype).
nb_haplo: number of possible diplotypes corresponding to the individual.
Note: At QTL 1, the individual IL1_IL2_F2 has 24 possible phased genotypes. One of them (1), which is the most likely given observed marker data, has a probability of 0.69. This genotype is a/a at the QTL position (3rd locus, see map file) and its full genotype (all loci: QTL and associated markers) is aaaaa/aaaaa. The individual Ind2 has more possible phased genotypes than Ind1 (Ind1: 118, Ind2: 218) because it was not genotyped: all possible diplotypes are considered.
4.2.2 _gametes_set: probabilities of gametes
The previous section underlined all the possible phased genotypes along with their probabilities, taking into account the pedigree of
28. al list with supplementary individuals bringing the complementary alleles. Note that the same name will be kept after complementation.
(Screenshot, Selection of individuals tab: Truncation selection (MTS) panel with Nsel = 8 and Criterion = Molecular Score; QTL Complementation Selection (QCS) panel; both applied to List4_Complementation_Selection. The resulting list contains B8, B158, B28, B13, B38, B37, B40, B242, B124 and B125. No. ind: 10, No. group: 1, Mean: 0.756312, Var: 0.00140282.)
Figure 17: QTL Complementation Selection.
As an example, create a f
29. alf of couples generated. On the left table (see List1_Half_Diallel_from_QCS), all the individuals were crossed together. For each of the 45 resulting pairs, a virtual individual is created and OptiMAS computes the expected molecular score of the progeny of the cross, for all and each QTL. On the right table, the better half strategy leads to 25 couples (see also Fig. 25). QTL columns can still be sorted in order, for instance, to identify pairs having a score of 0. Note that crosses can be deleted from a list (right click on the selected pair and press the delete button). They can also be copied from one list to the other by drag and drop. Meanwhile, graphs are automatically generated to display a view of the selected crosses based on the different strategies (see below).
(Screenshot, Crossing of individuals, Graphs tab: two plots of Individuals x Individuals colored by MS. Left panel: List1_Half_Diallel, Mean 0.756312, Var 0.000623468. Right panel: List2_Better_Half.) Mean 0.77
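The counts quoted above (45 pairs for the half diallel, 25 for the better half) can be reproduced with a few lines of Python. This is a sketch under stated assumptions, not the OptiMAS implementation: the better-half rule used here (keep ranks i < j when i + j <= n + 1) is one common definition that happens to give 25 pairs for ten candidates, and the expected progeny score is taken as the mean of the two parental scores. The candidate scores are those of the QCS list.

```python
# Illustrative sketch (not OptiMAS code): enumerate crosses among ranked
# candidates. The better-half rule and the progeny-score formula below are
# assumptions made for this example.
from itertools import combinations

# The ten candidates of the QCS list with their MS, ranked from best to worst.
CANDIDATES = [
    ("B8", 0.8366), ("B158", 0.8024), ("B28", 0.7740), ("B13", 0.7609),
    ("B38", 0.7494), ("B37", 0.7433), ("B40", 0.7404), ("B242", 0.7305),
    ("B124", 0.7223), ("B125", 0.7032),
]


def half_diallel(candidates):
    """All pairs of distinct individuals, without reciprocals or selfing."""
    return list(combinations(candidates, 2))


def better_half(candidates):
    """Assumed rule: keep the pair of ranks (i, j), i < j, when i + j <= n + 1."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    n = len(ranked)
    return [
        (ranked[i], ranked[j])
        for i in range(n)
        for j in range(i + 1, n)
        if (i + 1) + (j + 1) <= n + 1  # ranks are 1-based in the rule
    ]


def expected_progeny_ms(cross):
    """Assumed: each parent transmits one allele, so the progeny MS is the mean."""
    (_, ms1), (_, ms2) = cross
    return 0.5 * (ms1 + ms2)


diallel = half_diallel(CANDIDATES)
better = better_half(CANDIDATES)
mean_diallel = sum(expected_progeny_ms(c) for c in diallel) / len(diallel)
mean_better = sum(expected_progeny_ms(c) for c in better) / len(better)
print(len(diallel), len(better))
print(f"{mean_diallel:.4f} {mean_better:.4f}")
```

Because the better-half rule drops the crosses between the lowest-ranked candidates, its mean expected score is higher than that of the full half diallel, which equals the mean score of the candidates themselves.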
30. ally intensive processes, corresponding only to the step 1 operation (see Fig. 1). The results and output files produced at the end of the run (exemplified below) can then be reloaded via the GUI (see section 5.5).
4.1 Running OptiMAS in command line
For the three different operating systems, you will need to specify a list of 6 mandatory arguments to run the optimas command line executable, for example:
optimas input.dat input.map 0.000001 0.0 output_folder 0 verb
1. Input file: path to the genotype/pedigree file (.dat).
2. Input file: path to the genetic map input file (.map).
3. Algorithm parameter (cut-off, float number): genotypic probability below which a rare phased genotype (diplotype, see definition below) is removed and no longer considered in subsequent computations.
4. Algorithm parameter (cut-off, float number, 0.0 by default) for gametic probability. It corresponds to the probability that the number of crossovers expected in the region between flanking markers exceeds a given value. Thus, unlikely gametes with a number of crossovers over this value are removed and no longer considered in subsequent computations. Use of this option with values up to 0.01 is recommended in case many flanking markers per QTL lead to high computation time with the default option.
5. Output folder: path to the output folder where the results will be stored (see section 4.2 for output description); the name of this directory mu
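When several runs have to be scripted, the example command can be assembled programmatically. The following is a minimal Python sketch, not part of OptiMAS: the function name and its parameters are hypothetical, and the two trailing tokens of the example ("0" and "verb") are copied verbatim because their meaning is not described in the passage above. The command is only built and printed, not executed.

```python
# Illustrative sketch: assemble the example OptiMAS command line as an argument
# list. build_optimas_command and its parameter names are hypothetical; the
# trailing tokens are copied from the example and passed through unchanged.
import shlex


def format_float(value):
    """Plain decimal notation (no exponent), e.g. 1e-06 -> '0.000001'."""
    text = f"{value:.10f}".rstrip("0")
    return text + "0" if text.endswith(".") else text


def build_optimas_command(dat_file, map_file, output_folder,
                          cutoff_diplotypes=0.000001, cutoff_gametes=0.0,
                          trailing=("0", "verb"), executable="optimas"):
    """Return the argument list in the order used by the example."""
    return [
        executable,
        dat_file,                         # 1. genotype/pedigree file (.dat)
        map_file,                         # 2. genetic map file (.map)
        format_float(cutoff_diplotypes),  # 3. cut-off on diplotype probability
        format_float(cutoff_gametes),     # 4. cut-off on gametic probability
        output_folder,                    # 5. output folder
        *trailing,                        # remaining tokens of the example
    ]


command = build_optimas_command("input.dat", "input.map", "output_folder")
print(shlex.join(command))
# To launch the run, the list could be handed to subprocess.run(command, check=True).
```

Passing the arguments as a list rather than a single string avoids quoting problems when a path contains spaces.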
31. aries between 0.27 (X parental line at the bottom of the table, not presented in Fig. 7) and 0.83 (B8, the individual at the top of the table, coming from the last cycle of selection). Note that the four inbred lines D (DE), F (F283), S (F810) and X (F9005) have a MS of 0.36, 0.36, 0.45 and 0.27, respectively. More detailed genotypes can be displayed by double clicking on MS or QTL cells (see below). This view summarizes and aggregates the information presented in the two other tables, Homo/Hetero and Estimation of parental allele probabilities (see sections 4.2.3 and 4.2.5).
(Screenshot, Molecular score prediction tab: table with columns Id, P1, P2, Cycle, Group, MS, MS Weight, MS_UC and one column per QTL; double click on MS/QTL cells to show detailed genotypes. Detail panel for Id B8: MS 0.836645, QTL4 0.875152, favorable alleles s and f, Homo 0.763260 with f/f 0.522327, s/f 0.225656 and s/s 0.015277.)
32. as been added to the list of selection. Individual B125 has been added to the list of selection. The given Nmax value has been reached. Individual(s) B124, B125 has/have been added to List4_Complementation_Selection.
(Screenshot, QCS option dialog and results window. Parameters shown: nt = 2, each QTL is requested to be present in at least nt selected individuals; Nmax = 10, maximum number of individuals selected at the end of the complementation process; MSmin, the minimum threshold value (Molecular Score) for the addition of an individual; Cycle or group of the selected individuals. The results window lists, for QTL1 to QTL11, the number of individuals carrying the favorable QTL alleles before and at the end of the QCS process; QTLs to complement: 1.)
Figure 18: QTL Complementation Selection algorithm with parameters.
In this example, QCS adds two candidates such that their QTL composition complements those eight individuals already selected (see Fig. 17 & 18). The QCS is described by five parameters: θMS corresponds to the MS QTL threshold (QTLx g
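The bookkeeping that drives QCS, counting for each QTL how many selected individuals carry the favorable allele and listing the QTL that fall short of nt, can be sketched as follows. This is illustrative Python under assumptions, not the OptiMAS algorithm: the function names are hypothetical, the scores are made-up values rather than data from the example, and "declared present" is taken here to mean a QTL score greater than or equal to the threshold.

```python
# Illustrative sketch (not OptiMAS code): count carriers of the favorable
# allele per QTL and list the QTL that still need complementing. All scores
# below are hypothetical; the >= comparison is an assumption.

# QTL molecular scores of the individuals already selected (hypothetical).
SELECTED = {
    "ind1": {"QTL1": 0.00, "QTL2": 0.95, "QTL3": 0.50},
    "ind2": {"QTL1": 0.00, "QTL2": 0.90, "QTL3": 0.97},
    "ind3": {"QTL1": 0.50, "QTL2": 0.40, "QTL3": 0.99},
}


def carriers_per_qtl(selected, theta_ms):
    """Number of selected individuals whose score at a QTL reaches theta_ms."""
    counts = {}
    for scores in selected.values():
        for qtl, score in scores.items():
            counts[qtl] = counts.get(qtl, 0) + (1 if score >= theta_ms else 0)
    return counts


def qtls_to_complement(selected, theta_ms, n_t):
    """QTL carried by fewer than n_t selected individuals."""
    counts = carriers_per_qtl(selected, theta_ms)
    return sorted(qtl for qtl, n in counts.items() if n < n_t)


counts = carriers_per_qtl(SELECTED, theta_ms=0.45)
missing = qtls_to_complement(SELECTED, theta_ms=0.45, n_t=2)
print(counts)
print(missing)
```

A complementation step would then look among the remaining candidates for individuals that carry the favorable allele at the listed QTL, subject to the MSmin and Nmax constraints shown in the dialog.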
33. at this generation have correct genotypes. See section 5.6 for more details on the procedure to adopt in case of a warning message at the end of a run.
5 OptiMAS Graphical User Interface (GUI)
5.1 Running OptiMAS GUI
To run the program, you must specify the paths to the genetic map file and the genotypes/pedigree file (see section 3) containing all information about your MAS design. Select File > Import Data from the menu bar (see below).
(Screenshot, window "OptiMAS: a decision support tool to conduct Marker Assisted Selection programs 1.5", File menu with Import Data, Example Data, Reload Data and Exit. Import Data dialog text: "Browse map data, genotype data and results folder in order to run OptiMAS. Optionally, import allelic effects file to compute predicted molecular scores from different traits/environments combinations. Or simply browse the allelic effects file and click the Proceed button to add or recompute these predicted molecular scores without having to launch a complete analysis again." Fields: Map file /home/valente/OptiMAS/input/blanc.map; Genotype file /home/valente/OptiMAS/input/blanc.dat; Allelic effects file /home/valente/OptiMAS/input/allelic_effects_blanc.txt; Output directory, Results /home/valente/OptiMAS/output
34. Figure 4: probabilities of gametes for each individual at QTL1 (example rows of the file for Ind2, nb_gam = 576).
Columns of this file correspond to:
QTL: index of the QTL.
Id: name of each individual.
gamete: possible gamete, defined according to parental origin.
read: translation of the gamete in terms of observed marker alleles. Note that a given read may correspond to several gametes.
proba: probability of this specific possible gamete.
nb_gam: number of possible gametes corresponding to the individual.
Note: At QTL number 1, the individual IL1_IL2_F1 has a probability of 100% to be heterozygous aaaaa/bbbbb (see section 4.2.1). In this case, the five loci are heterozygous and the number of possible gametes is 2^5 = 32. These 32 possible gametes have different probabilities depending on the recombination rates calculated from genetic distances (Haldane's map function used). The highest probability is that of the non-recombinant gametes (0.32 for both aaaaa and bbbbb).
4.2.3 tab_homo_hetero: probabilities to be homozygous or heterozygous at the QTL positions
4.2.3.1 based on favorable/unfavorable allele grouping
Based on the phased genotype information (qtlx_haplotypes_set files), the probabilities to be homozygous/heterozygous at the QTL positions are computed according to favorable/unfavorable grouping of founder alleles (i.e. IL1 a, IL2 b, IL3 c, IL4 d
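The note on IL1_IL2_F1 can be illustrated with a short enumeration of the 2^5 = 32 gametes of a fully heterozygous individual. The Python sketch below is not OptiMAS code: the locus positions are hypothetical (the real ones come from the map file), so the printed probability is not expected to equal the 0.32 quoted above. Recombination rates use Haldane's map function, r = (1 - exp(-2d))/2 with d in Morgans, and crossovers in adjacent intervals are treated as independent (no interference).

```python
# Illustrative sketch (not OptiMAS code): gamete probabilities for an
# individual heterozygous at five linked loci. Positions are hypothetical.
import math
from itertools import product


def haldane_recombination(distance_cm):
    """Haldane's map function: r = (1 - exp(-2d)) / 2, with d in Morgans."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def gamete_probabilities(positions_cm, hap1, hap2):
    """Probability of every gamete transmitted by the diplotype hap1/hap2."""
    rates = [haldane_recombination(b - a)
             for a, b in zip(positions_cm, positions_cm[1:])]
    gametes = {}
    # origin[k] says which parental haplotype (0 or 1) the allele at locus k comes from
    for origin in product((0, 1), repeat=len(positions_cm)):
        prob = 0.5  # the first locus is drawn from either haplotype with probability 1/2
        for k, rate in enumerate(rates):
            prob *= rate if origin[k] != origin[k + 1] else 1.0 - rate
        gamete = "".join((hap1, hap2)[o][k] for k, o in enumerate(origin))
        gametes[gamete] = gametes.get(gamete, 0.0) + prob
    return gametes


POSITIONS_CM = [0.0, 10.0, 20.0, 30.0, 40.0]  # hypothetical map positions
gametes = gamete_probabilities(POSITIONS_CM, "aaaaa", "bbbbb")
print(len(gametes))
print(f"aaaaa: {gametes['aaaaa']:.4f}, bbbbb: {gametes['bbbbb']:.4f}")
```

As in the note, the two non-recombinant gametes are equally likely and more probable than any recombinant one, because every recombination rate is below 1/2.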
35. declared in the map file. It is also possible to discard QTL and/or to attribute economical weights (defined by the breeder) to compute a Weight index. Press the Weight button to open a dialog window (see below).
(Screenshot, Molecular score prediction tab with the QTLs Weights dialog open. Dialog text: "Different weights for each QTL can be assigned. 1. Select the QTLs that have to be weighted. 2. Set a weight. 3. Click Update. 4. Click Apply." The dialog lists each QTL with its weight, 1.00 by default.)
36. directory $HOME/bin/optimas. The input/output file examples will be stored respectively in $HOME/OptiMAS/input and $HOME/OptiMAS/output. You can now run optimas from a terminal by typing optimas (or /usr/local/bin/optimas, or $HOME/bin/optimas). Note: it is not necessary to specify the complete path if the binary optimas is present in the PATH environment variable.
To run the program, you must supply 4 input files and possibly an optional one for allelic effects. Instructions for how to prepare the input files are given below (see section 3). Input file examples (input directory) are supplied with the software. You can run the program on the test data supplied by typing, as an example (see section 4.1 for more details about the parameters and options):
optimas $HOME/OptiMAS/input/blanc.dat $HOME/OptiMAS/input/blanc.map 0.000001 0.0 $HOME/OptiMAS/output/run_blanc 0 verb
The program will output a summary of the results in the folder run_blanc. Open the file tab_scores.txt to see the genotypic values calculated for the plants present in the dataset.
To uninstall OptiMAS, run the script uninstall_optimas.sh located in the same directory as optimas.
2.1.3 Mac OS X (32 bits)
The executable comes in a zipped file. Extract it with your favorite file archiver software (e.g. Archive Utility). This will create a new directory called optimas_cmd_mac. Move to this directory via the terminal application (Applications > Utilities > Termina
37. (Screenshot: two lists of selected individuals shown side by side with columns Id, P1, P2, Cycle, Group, MS and MS Weight. Left list: B8, B158, B28, B13, B38, B37, B40, B242, B246, B293; Mean 0.75947, Var 0.00115008. Right list: B8, B158, B28, B13, B38, B37, B40, B242, B124, B125; Mean 0.756312, Var 0.00140282.)
Figure 19: comparison between lists of selected individuals via tables.
If we compare the two lists of ten individuals selected via the MS (on the left) and QCS (on the right), we can see that the two last individuals, B246 and B293, are replaced by B124 and B125 in the QCS list. The selection of these two individuals will bring two more unfa
38. ean for all QTL.
MS: Molecular Score, expected proportion of favorable alleles over all QTL.
4.2.7 events_summary.log: questionable results
Questionable results are displayed in the file output_folder/date/events_summary.log (see below). It is recommended to always check it before any interpretation if a warning message appeared. An empty file means that no errors have been found during the execution. Otherwise, this file contains the names of individuals with questionable genotypes at the considered QTL positions. These are defined as individual x QTL combinations for which no diplotype passing the cut-off threshold could be found. Three main causes can be distinguished:
Individuals have genotype data inconsistent in light of their ancestry, which may reveal either a genotyping error or an error in the declared pedigree.
Individuals display very unlikely genotypes relative to their parents, i.e. genotypes that can only be obtained assuming unlikely recombination events.
The cut-off threshold value set by the user for keeping diplotypes is too high.
SUMMARY: Questionable results at markers/QTL (sorted by Id) [example log entries, one per questionable individual, giving its Id, parents, Cycle and QTL]
Note that a genotyping error at generation n affects the results for its progeny even if at generation n individuals
39. ed according to either the weighted molecular score or the utility criterion. Selfing of selected individuals can be included or not. Then lists of crosses created via the different methods can be analyzed and compared via graphs.
Example of instructions to create the two lists of crosses that will contain results of the half diallel and the better half processes among the previously selected candidates: rename the two empty lists of crosses (by double click) to List1_Half_Diallel_from_QCS and List2_Better_Half_from_QCS. Then select the appropriate method via the Option button (see Fig. 23 & 24).
(Screenshot, Crossing schemes option dialog: Number of crosses (Maximum), Contribution of individual (Unlimited), Criterion (Molecular Score), Method, Self option.)
Figure 23: crossing schemes options (method selection).
Choose the list of selected candidates coming from step 2 (e.g. List4_Complementation_Selection) and the adequate list of crosses (e.g. List1_Half_Diallel_from_QCS & List2_Better_Half_from_QCS). Finally, click on the Run button to see the results of these crosses stored in the appropriate list (see below).
(Screenshot, Crossing of individuals tab: lists of crosses List1_Half_Diallel_from_QCS and List2_Better_Half_from_QCS, each with columns Id, P1, P2, MS, MS Weight.)
40. effects can also be supplied optionally. Four input files are needed: a genotypes/pedigree file (possibly also containing quantitative traits), a classic map file with marker positions, a QTL positions file and a file used to assign a list of markers to each QTL. An additional optional one for allelic effects can also be provided. Examples of these files (blanc.dat, blanc.map, blanc.qtlpos, blanc.qtll and allelic_effects_blanc.txt) are supplied within the user's home directory (user_name/OptiMAS/input/) for multiparental designs. Note that all these files must be in plain text (tab-delimited) format and that the markers present in the map file must be ordered.
To analyze your own data, you must prepare the input files in the appropriate format, as described below. Note that OptiMAS does not check that your input files strictly conform to their expected format. This may lead to errors not necessarily detectable by the software.
Note: The old map file format (gathering information on markers and QTLs within a single file), used in previous versions of OptiMAS, is still supported by OptiMAS v1.5, but we now recommend using the new format. An additional example following the old map format is provided within the user's home directory (user_name/OptiMAS/input/) for biparental designs.
3.1 Genotypes/Pedigree file (.dat)
The genotypes/pedigree file, whose name has to end with .dat, should contain individuals' pedigree information and genotypic da
41. (Screenshot, Selection of individuals tab: list List2_Truncation_MS_Selection. No. ind: 10, No. group: 1, Mean: 0.75947, Var: 0.00115008.)
     Id    P1     P2     Cycle  Group  MS      MS Weight  MS_UC
 1   B8    A1005  A1005  C2            0.8366  0.7079     9.2031
 2   B158  A251   A1005  C2     G1     0.8024  0.6790     9.3268
 3   B28   A1006  A251   C2     G1     0.7740  0.6550     9.2215
 4   B13   A1006  A1005  C2            0.7609  0.6438     9.0767
 5   B38   A1040  A1005  C2            0.7494  0.6341     9.1095
 6   B37   A1040  A1005  C2            0.7433  0.6290     8.6768
 7   B40   A1040  A1040  C2            0.7404  0.6265     9.0101
 8   B242  A37    A1005  C2            0.7305  0.6181     8.7424
 9   B246  A37    A1040  C2            0.7303  0.6180     8.8995
10   B293  A9     A1040  C2            0.7268  0.6150     8.8608
Figure 16: example of truncation selection based on the Molecular Score (MS).
QTL complementation selection (QCS) takes into account complementarities between candidate individuals regarding the favorable alleles they carry (see figures below). It aims at preventing the loss of rare favorable alleles (Hospital et al., 2000). This option is important when a high number of QTL is considered. This option completes an initi
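Truncation selection as shown in Fig. 16 amounts to ranking the candidates of a cycle on one criterion and keeping the best Nsel. The Python sketch below illustrates this under assumptions and is not the OptiMAS implementation: the function name and the dictionary keys are hypothetical, and the candidate pool is limited to twelve individuals whose MS and weighted MS appear in the screenshots of this manual.

```python
# Illustrative sketch (not OptiMAS code): truncation selection on a criterion.
# Function name and dictionary keys are assumptions; scores are taken from
# the tables shown in the manual.

CANDIDATES = [
    {"Id": "B8", "Cycle": "C2", "MS": 0.8366, "Weight": 0.7079},
    {"Id": "B158", "Cycle": "C2", "MS": 0.8024, "Weight": 0.6790},
    {"Id": "B28", "Cycle": "C2", "MS": 0.7740, "Weight": 0.6550},
    {"Id": "B13", "Cycle": "C2", "MS": 0.7609, "Weight": 0.6438},
    {"Id": "B38", "Cycle": "C2", "MS": 0.7494, "Weight": 0.6341},
    {"Id": "B37", "Cycle": "C2", "MS": 0.7433, "Weight": 0.6290},
    {"Id": "B40", "Cycle": "C2", "MS": 0.7404, "Weight": 0.6265},
    {"Id": "B242", "Cycle": "C2", "MS": 0.7305, "Weight": 0.6181},
    {"Id": "B246", "Cycle": "C2", "MS": 0.7303, "Weight": 0.6180},
    {"Id": "B293", "Cycle": "C2", "MS": 0.7268, "Weight": 0.6150},
    {"Id": "B7", "Cycle": "C2", "MS": 0.7238, "Weight": 0.6125},
    {"Id": "B124", "Cycle": "C2", "MS": 0.7223, "Weight": 0.6852},
]


def truncation_selection(candidates, n_sel, criterion="MS", cycle=None):
    """Keep the n_sel best candidates of a cycle, ranked on the criterion."""
    pool = [c for c in candidates if cycle is None or c["Cycle"] == cycle]
    ranked = sorted(pool, key=lambda c: c[criterion], reverse=True)
    return ranked[:n_sel]


selected = truncation_selection(CANDIDATES, n_sel=10, criterion="MS", cycle="C2")
selected_ids = [c["Id"] for c in selected]
mean_ms = sum(c["MS"] for c in selected) / len(selected)
print(selected_ids)
print(f"mean MS: {mean_ms:.4f}")
```

Switching the criterion changes the ranking: on the weighted score, B124 moves up to second place, which is why lists built with different criteria can differ.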
42. f the individual must exist as individuals above in the file, except for founder parents. Individuals must be ranked according to generations, from oldest to most recent. Founder parents of the program (1 in table) are assumed to be homozygous lines with no residual heterozygosity. Their pedigree is assumed to be unknown. The Parent1/Parent2 columns indicate in this case the allele that will be transmitted through generations. This allele is identified as a single character. OptiMAS can also handle selection schemes starting from heterozygous individuals (e.g. fruit tree breeding). In this case, two virtual inbred lines must be defined for each heterozygous founder, described then as a Cross (see below).
Step: corresponds to the pedigree relationship between the individual and its parent(s):
CR (Cross): indicates that the individual results from a cross between its two parents.
Sn: indicates that the individual results from n (integer between 1 and 20) generations of selfing of its parent; in this case the two parents' Id must be identical.
RIL (Recombinant Inbred Lines): assumes that the individual results from an infinite number of selfing generations from an initial F1 hybrid.
DH (Double Haploids): assumes that the individual results from haplo-diploidisation from an initial F1 hybrid.
IL: indicates founder inbred status (see above).
Cycle: optional information regarding the generation in the program (e.g. first cycle, second cycle, F2, F
43. his example shows that use of truncation selection based on MS leads to the selection of individuals all carrying the unfavorable allele at QTL1 (see above, histogram on the left). On the right side, the two individuals (B124, B125) added via the QCS procedure are observed with a MS comprised between 0.45 and 0.5. In addition, if we select QTL7 in both lists, we notice that QTL7 will be fixed for the favorable allele at the next generation: the ten individuals have a MS > 0.95 in the graph (not represented in Fig. 20).
To visualize the origin of the selected individuals of each list, the user can display their pedigree (see below). Move to the Pedigree tab, select the list of your choice, check Alone to only have the individuals of the selected list, and click on the Generate button.
(Screenshot, Selection of individuals, Pedigree tab: List = List2_Truncation_MS_Selection, options All / Alone, buttons Generate and Save.)
Figure 21: pedigree of individuals selected via the Truncation Selection method.
It is possible to compare pedigrees coming from different lists of selection (see Fig. 21 & 22, Truncation Selection list vs. QCS, respectively). Note that the two individuals added via the QCS bring the parental allele x. So if we use the QCS list to produce the next generation, the four parental alleles will be present in the next cycle.
(Screenshot, Selection of individuals, Pedigree tab: List = List4_Complementation_Selection, options All / Alone, buttons Generate and Save.)
44. ility Criterion and List(s) selection.
(Screenshot, Crossing of individuals, Graphs tab: two plots of Individuals x Individuals colored by MS. Left panel: List4_MS_Contrib, Mean 0.756312, Var 0.00131069. Right panel: List5_UC_Contrib, Mean 0.756312, Var 0.000877108.)
Figure 28: constraints on the contribution of parents based on the MS or the Utility Criterion.
On the left side, individuals with the higher MS are crossed together (e.g. B8xB158, B28xB13, etc.), whereas on the right side the use of the utility criterion resulted in different pairs (e.g. B8xB38).
A factorial design between two lists of selected plants can be applied. Checking the box between the two lists enables the possibility to select a second list of selected candidates (see below).
(Screenshot, List(s) selection: List1_Manual_Selection x List2_Truncation_MS_Selection; Graphs tab with a plot of Individuals x Individuals colored by MS for List3_ManualxTS, Mean 0.736117, Var 0.000310277.)
Figure 29
45. irst version of List4_Complementation_Selection by applying the Truncation Selection based on the MS (Nsel = 8, Criterion = MS, List = List4_Complementation_Selection, Option: Cycle = C2), then press Run. Eight individuals (B8 ... B242, see Fig. 17) are selected at this step. The colored view shows that we are losing the favorable allele for QTL 1 (in red, QTL1 = 0 for all individuals). To complete the list with the QCS procedure, select List4_Complementation_Selection in the QCS section. Click on the QCS Option button to set up the parameters shown in Fig. 18 (on the left side) and press Apply. The result appears in a pop-up window (on the right).
(Screenshot, QTL Complementation Selection (QCS) dialog. Dialog text: "Individuals are selected such that their QTL composition complements those individuals already selected. A QCS strategy is described by four parameters (Hospital et al., 2000)", the first being the threshold for Molecular Score (MS) above which a favourable QTL allele is declared present. Results window: number of individuals carrying the favorable QTL alleles (nt = 2, N = 8) for QTL1 to QTL11; QTLs to complement: 1.) Individual B124 h
46. is problem, it is possible to use a cut-off on the probability of keeping a phased genotype, i.e. cut-off diplotypes (0.000001 by default). Thus, rare phased genotypes are discarded in probability computation, which considerably reduces the computation time. At the end of a run, a file output_folder/date/tab_check_diplo.txt is created to evaluate the impact of the cut-off on such eliminations, at individual QTL locations and globally. The default structure for this file can be represented as follows:
Id          MS        All       QTL1      QTL2      QTL3
IL1         0.333333  1.000000  1.000000  1.000000  1.000000
IL2         0.333333  1.000000  1.000000  1.000000  1.000000
IL3         0.333333  1.000000  1.000000  1.000000  1.000000
IL4         0.333333  1.000000  1.000000  1.000000  1.000000
IL1_IL2_F1  0.333333  1.000000  1.000000  1.000000  1.000000
IL3_IL4_F1  0.333333  1.000000  1.000000  1.000000  1.000000
IL1_IL2_F2  0.661379  0.999948  0.999975  0.999987  0.999883
IL3_IL4_F2  0.581056  0.999879  0.999962  0.999985  0.999688
Ind1        0.588043  0.999998  0.999996  0.999999  1.000000
Ind2        0.621266  0.999992  0.999983  0.999999  0.999994
Columns of this file correspond to:
QTLx: sum of the probabilities to be homozygous (un)favorable and heterozygous at each QTL position, as displayed in the file tab_homo_hetero.txt (see section 4.2.3). When the QTL column is close to one, the sum of probabilities of the diplotypes removed via the cut-off is negligible.
All: refers to the m
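A quick way to read such a table is to convert each value into the probability mass removed by the cut-off (one minus the value) and to list the individual x QTL combinations where that mass exceeds a tolerance. The Python sketch below is illustrative and not part of OptiMAS: the function name and the tolerance of 0.0001 are assumptions, and the rows are the non-trivial ones of the example table (the real file is tab-delimited, which the whitespace split used here also handles).

```python
# Illustrative sketch (not OptiMAS code): flag individual x QTL combinations
# where the cut-off removed a noticeable share of the diplotype probability.
# The tolerance value is an arbitrary assumption for this example.

TABLE = """\
Id MS All QTL1 QTL2 QTL3
IL1_IL2_F2 0.661379 0.999948 0.999975 0.999987 0.999883
IL3_IL4_F2 0.581056 0.999879 0.999962 0.999985 0.999688
Ind1 0.588043 0.999998 0.999996 0.999999 1.000000
Ind2 0.621266 0.999992 0.999983 0.999999 0.999994
"""


def flag_discarded_mass(text, tolerance=1e-4):
    """Return (Id, column, discarded probability) for values below 1 - tolerance."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    flagged = []
    for line in lines[1:]:
        fields = line.split()
        individual = fields[0]
        # skip the Id and MS columns; check All and every QTL column
        for column, value in zip(header[2:], fields[2:]):
            discarded = 1.0 - float(value)
            if discarded > tolerance:
                flagged.append((individual, column, round(discarded, 6)))
    return flagged


flagged = flag_discarded_mass(TABLE)
for individual, column, discarded in flagged:
    print(f"{individual} {column}: {discarded}")
```

In this example only the two F2 individuals are reported, and the discarded mass stays in the order of a few 1e-4, which is consistent with the statement that the impact of the default cut-off is negligible here.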
47. ith missing data
6 Future work
7 How to cite this program
8 Contact
9 Acknowledgments
1 Introduction
1.1 Aims
With the increasing use of markers in breeding programs, it is important to develop decision support tools to help breeders in implementing their Marker-Assisted Selection (MAS) project. OptiMAS has been developed with the possibility to consider a multi-allelic context, which opens new prospects to further accelerate genetic gain by assembling favorable alleles issued from diverse parents.
1.2 Principles
Algorithms have been deployed to trace parental QTL alleles identified as favorable throughout selection generations, using information given by markers located in the vicinity of the estimated QTL positions. Using these results, probabilities of allele transmission are computed in different MAS schemes and mating designs (intercrossing, selfing, backcrossing, double haploids, RIL), with the possibility of considering generations without genotypic information. Then, strategies are proposed to select the best plants and to efficiently intermate them based on the expected value of their progenies.
1.3 Functions
OptiMAS includes, in a Graphical User Interface (GUI), three different modules corresponding to the different steps of a selection progra
48. iven individual is considered as fixed or not (see Fig. 9). Press Visualization > Color scheme on the menu bar.
(Screenshot, Molecular score prediction tab with the Visualization of genotypes dialog open. Dialog text: "The probabilities to be homozygous/heterozygous at the QTL positions have been computed according to favourable/unfavourable grouping of founder alleles. Set a threshold and select a color to display a new view of the molecular score table based on genotypes." Section "Customize cut-off colors": three probability thresholds, each set to 0.750, with their colors, and "The rest: uncertain genotypes".)
Figure 9: colored view of the molecular score table.
A value of 0.75 (by default) is selected for the probability threshold to be considered as homozygou
49. l before attempting to run the program, for example:
cd Desktop/optimas_cmd_mac
To run the program, you must supply 4 input files and possibly an optional one for allelic effects. Instructions for how to prepare the input files are given below (see section 3). Input file examples (input directory) are supplied with the software. You can run the program on the test data supplied by typing (see section 4.1 for the details about the parameters and options):
optimas input/blanc.dat input/blanc.map 0.000001 0.0 output/run_blanc 0 verb
Make sure that both optimas and the input folder with the examples are in the current directory. The program will output a summary of the results in the folder output/run_blanc. Open the file tab_scores.txt to see the genotypic values calculated for the plants present in the dataset.
2.2 OptiMAS Graphical User Interface (GUI)
As ready-to-use binaries (installation packages) are provided for Windows and Mac OS X platforms, we will only describe the steps for compiling OptiMAS binaries on GNU/Linux systems. More details on the building and installation instructions from the sources, for all supported systems, are described in the INSTALL file.
2.2.1 Windows (32 bits)
To set up OptiMAS on your Windows computer, double click on the install file, i.e. optimas_win_x86_v1_12_06_01.exe (i.e. version 1, 1st June 2012), and follow the Wizard procedure. By default, OptiMAS will be installed in C:\Program Files\OptiMAS
50. loci (cM).
Marker5  2  37.1
Marker6  2  52.2
Marker7  2  54.0
Marker8  2  59.5
Marker9  2  74.8
Positions of the different loci must be obtained using Haldane's mapping function, i.e. by assuming no interference.
3.3 QTLs information files
The aim is to create a target genotype (ideotype) with all the favorable alleles at the QTL positions. So, before running OptiMAS, it is necessary to define the parental alleles to assemble. In addition, as the QTL position is rarely located at a marker, QTL alleles are unknown and must be inferred from flanking markers. Thus, it is very important to select a subset of markers as informative as possible, especially in a multi-parental context, to follow the favorable parental alleles based on haplotypes. The number of markers selected per QTL should be in the range of 2 to 6 in order to avoid intensive computation time (more progress in this area will be made in the next version).
Two QTL files are supplied by the user to i) specify the information regarding the QTL position and the identification of favorable alleles, and ii) define the QTL region, i.e. assign the set of markers that will be used to compute the allele transmission.
In the first file (.qtlpos file, see tab below), each QTL is characterized by its estimated position in cM (pos) on a chromosome (chr) and the identification of the parent carrying the favorable allele (All). The information on the confidence interval, i.e. the interval which i
51. m (see Fig. 1).

Step 1: Computation of genotypic probabilities / Estimation of genetic values. The tool provides, for each candidate individual, the probabilities of being homozygous or heterozygous for parental alleles at each QTL. Based on the classification of parental alleles into favorable and unfavorable categories, a molecular score (expected probability of favorable allele) is computed for each QTL. Individual molecular scores are then combined into a global genetic value by assigning identical or different weights to QTL. A colored view of the molecular score table is displayed to identify more easily the QTL for which a given individual is already fixed or not. If allelic effects for traits of interest are provided by the user, the tool computes predicted molecular scores (PMS) for each trait. Indexes can be defined by combining different sources of information (MS, PMS and quantitative trait information provided by the user). Graphs are generated to show the distribution of several indicators (QTL molecular scores at individual QTL, global genetic values) and their evolution over the different cycles of selection.

Step 2: Selection of individuals. Different options are available to select candidates. Truncation selection can be performed based on (i) the above-described genetic values or (ii) a utility criterion which considers the probabilities of obtaining superior progenies following gametic segregation. QTL complementation selection (Hos
52. mmended to disable this option.

Click on the Run button to analyze the data set. The program will create output results in the folder that you specified. At the end of the run (the progress bar displays 100%), close the Import data window by pressing the Close button. If a warning message appears, refer to sections 4.2.7 and 5.6 to analyze these questionable results before any interpretation.

Note: it is also possible to directly display the results of previous analyses by selecting File > Reload data. You can also display results from the two example data sets provided with the program, which are located in File > Example Data > Biparental (or Multiparental) from the menu bar.

To visualize and analyze the results, the OptiMAS GUI includes three modules (on the left menu) corresponding to the different steps of the selection program:

[Left menu icons: Step 1 (Prediction), Step 2 (Selection), Step 3 (Intermating)]

To show and use the full functionalities of OptiMAS, this analysis will focus on real data coming from a multiparental marker-assisted recurrent selection (MARS) study (Blanc et al. 2006, 2008). Six connected F2 populations, with 150 individuals each, were obtained from a half-diallel design between four unrelated maize inbred lines (DE, F283, F810 and F9005). Eleven QTL were detected for silking date. A set of 34 markers was selected, with at least three markers (microsatellites) t
53. o follow each QTL (see blanc.map & blanc.dat files). Two cycles of MARS were performed, each time with a step of selfing before intermating. In this example, OptiMAS is used at the last cycle to select the best individuals, among 297 genotyped plants, that will be used for the next cycle of MARS (see Fig. 6).

To run the multiparental example data set, you must supply four input files (i.e. blanc.dat, blanc.map, blanc.qtlpos and blanc.qtll, see section 3.4), or it is also possible to directly display the results of this previous analysis from the menu bar (i.e. File > Example Data > Multiparental).

Legend of the pedigree figure:
- The four inbred lines D (DE), F (F283), S (F810) and X (F9005) correspond to the parental alleles d, f, s and x respectively, to follow through generations of selection.
- Three out of six F1 plants (F2 progenies of sx, dx, sd F1) were not selected at the F2 cycle.
- F1 hybrids have been selfed to obtain three F2 populations of 150 genotyped individuals. 25 F2 candidate plants were selected (first cycle of selection).
- Two other selfing operations have been done: 25 families of F4 individuals were produced (without genotyping information) and crossed. Among the progenies, 21 candidates were selected based on genotyping information (second cycle, named C1).
- 21 selfed families were produced (without genotyping information) and crossed together. 297 individuals issued from these crosses were genotyped (third cycle of selection).
54. of the table can be useful if you are working with a large number of QTL and/or individuals and you want to focus on specific QTL/plants.

5.3 Step 2: Selection of individuals

Taking into account all the information coming from the previous tables, we can select individuals for producing the next generation.

5.3.1 Methods for selection

In OptiMAS, three different ways are possible to select individuals.

Manual selection: selection of individuals based on your own judgment (see Fig. 15). In the Step 1 window (molecular score table): select plants via click-and-drag selection or simple click + Ctrl > press right click > Add to list > selection of your list (it can be renamed by double click) > Ok.

[Screenshot: Step 1 molecular score table (columns Id, P1, P2, Cycle, Group, MS, MS_Weight, MS_UC, No counts) with the "Add to list" menu, and the Step 2 window showing the list of selected individuals List1_Manual_Selection.]
55. on will be considered as zero.

Output directory: path to the folder where the results will be stored. Results from each run will be stored within a new dated directory created automatically within this folder. Note that your output directory should not be in the Program Files folder or other specific directories with administrator privileges.

Advanced options/parameters:

QTL analyzed: by default, all the QTL present in the input files will be analyzed. You can also choose to select a specific QTL to run the analysis.

Cut-off Diplotypes: genotypic probability below which a rare phased genotype (diplotype, see definition in section 4.2.1) is removed and no longer considered in subsequent computations (default value = 0.000001).

Cut-off Gametes: gametic probability (default value = 0.000000). It corresponds to the probability that the number of crossovers expected in the region between flanking markers exceeds a given value. Thus, unlikely gametes with a number of crossovers over this value are removed and no longer considered in subsequent computations. Use of this option, with values up to 0.01, is recommended in case many flanking markers per QTL lead to high computation time with the default option.

Verbose: verbose mode creates two files per QTL position, reporting respectively gamete and diplotype probabilities for all individuals (default value = ON). Warning: with large/complex data, both files may take a lot of disk space; it is then reco
56. pital et al. 2000) can be performed in order to prevent the loss of favorable allele(s). Different lists of selected plants can be compared via graphs showing the distribution of the above-mentioned indicators. All lists can be adjusted manually. A visualization tool for the pedigree of the selected plants is also provided.

Step 3: Identification of crosses to be made among selected individuals. We implemented three simple cases: (i) half diallel between selected candidates, (ii) better-half strategy (Bernardo et al. 2006), which consists of avoiding crosses between selected individuals with the lowest scores, or (iii) factorial design between two lists of selected plants. Constraints on the contribution of parents or on the maximum number of crosses to be done can be applied. In each case, OptiMAS computes the expected average of the progeny for the above-described genetic values. A graph is automatically generated showing a view of the crosses to be done.

[Diagram: OptiMAS as a decision support application for MARS. Algorithms compute probabilities of IBD allele transmission throughout generations of selection; outputs are estimated genetic values, homozygous/heterozygous probabilities and probabilities of founder alleles; selection is manual, by truncation or by QTL complementation; selected individuals are intermated by half diallel, better half or factorial design, or advanced by selfing, HD or SSD for variety creation.]

Figure 1: OptiMAS GUI functionalities

2 Installation procedure

OptiMAS installable
57. rosses determination is optimized based on the expected molecular score and UC of the crosses. This will be exemplified with two new lists of crosses. Press the Add button twice to create two new lists of crosses and rename them List4_MS_Contrib_1 and List5_UC_Contrib_1. Then press the Option button to open the Crossing schemes option window (see below).

[Screenshot: Crossing schemes option window, with the fields Number of crosses, Contribution of individual, Criterion and Method, and the Cancel and Apply buttons.]

Figure 27: crossing schemes options (constraints)

To specify that each of the ten candidates must be crossed only once and that there is no constraint on the maximum number of crosses to be done, select Contribution of individuals = 1 and leave Number of crosses = Maximum. The criterion box will appear as in Fig. 27. Select Criterion = Molecular Score for the first list and press the Apply button. To apply this strategy to candidates previously selected via the QCS method, select List (selection) = List4_Complementation_Selection. Then select the list where crosses will be stored by selecting List (crosses) = List4_MS_Contrib_1. Press the Run button to create this first list.

To apply the same constraints based on the utility criterion of the pairs (see Fig. 28, on the right), do the same thing with Criterion List5_UC_Contrib_1 Ut
58. s likely to include the QTL position, CI min and CI max) will be considered in a future version and can therefore be left empty.

QTL	chr	pos	CI min	CI max	All
qtl1	1	70.0			a
qtl2	2	55.0			b c

QTL: name of the QTL (without blank in character chain). The QTL names and the marker names have to match those in the qtlpos file and in the map file, respectively.
chr: index (numerical value) of the chromosome where the QTL is located.
pos: estimated QTL position coming from the QTL detection results.
All: identification of the parental allele(s) considered as being favorable. For QTL 1, the favorable allele a refers to the parental line named IL1 (see columns P1/P2 of the genotype/pedigree file). For QTL 2, b c refers to parental lines IL2 and IL3, which can be considered as favorable relative to other parental lines.

In a second file (either .qtll, .qtln or .qtlw, described in the tables below), the QTL region is defined respectively as an explicit list of marker names, a number of flanking markers, or a window defined on either side of the QTL position.

.qtll file: a list of markers explicitly assigned to each QTL.

QTL	mrk list
qtl1	Marker1 Marker2 Marker3 Marker4
qtl2	Marker5 Marker6 Marker7 Marker8 Marker9

.qtln file: number of flanking markers. Markers closest to the QTL position are taken from the map file. Note: this implies that the resulting set of markers might not include both sides of the QTL position.

QTL	mrk_nb
qtl1
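The note on the .qtln option (the closest markers may all lie on one side of the QTL) is easier to see on a small example. The sketch below is only an illustration of "take the n markers closest to the QTL position on the same chromosome"; the function name, the tuple layout and the tie-breaking behaviour are assumptions, not the OptiMAS implementation. Marker positions are those of the example map file.

```python
# (marker name, chromosome, position in cM), taken from the example map file
EXAMPLE_MAP = [
    ("Marker2", 1, 64.0), ("Marker3", 1, 72.5), ("Marker4", 1, 90.8),
    ("Marker5", 2, 37.1), ("Marker6", 2, 52.2), ("Marker7", 2, 54.0),
    ("Marker8", 2, 59.5), ("Marker9", 2, 74.8),
]


def closest_markers(genetic_map, chrom, qtl_pos, n):
    """Return the names of the n markers closest to qtl_pos on chromosome
    chrom, ordered by map position (hypothetical helper)."""
    same_chr = [(name, pos) for name, c, pos in genetic_map if c == chrom]
    nearest = sorted(same_chr, key=lambda m: abs(m[1] - qtl_pos))[:n]
    return [name for name, _ in sorted(nearest, key=lambda m: m[1])]


# qtl1 (chr 1, 70.0 cM): the two closest markers bracket the QTL
flank_qtl1 = closest_markers(EXAMPLE_MAP, 1, 70.0, 2)
# qtl2 (chr 2, 55.0 cM): the two closest markers are both left of the QTL
flank_qtl2 = closest_markers(EXAMPLE_MAP, 2, 55.0, 2)
```

In the second case both retained markers (52.2 and 54.0 cM) sit below the QTL position, which is the situation the note warns about; an explicit .qtll list avoids it.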
59. s (un)favorable or heterozygous at the QTL positions. A color can be assigned to each of them. Genotypes which are not assigned to any of these categories are considered uncertain genotypes. When you apply a new set of parameters (cut-off, colors), the four corresponding No columns of the MS table are updated.

For example, in Fig. 9, individual B8 is considered as homozygous favorable for eight QTL (in blue, No = 8) and homozygous unfavorable for only one QTL (in red, No = 1); it presents no heterozygous QTL (in grey, No = 0), and two QTL (in yellow, No = 2) are uncertain. Some MS at QTL positions are close to 1 (e.g. QTL3 = 0.9761) and some others are lower (e.g. QTL2 = 0.8598). This uncertainty can be due to (i) the fact that one marker flanking the QTL is heterozygous whereas the other one is homozygous favorable, which indicates that a recombination took place near the QTL position, or (ii) missing data (see genotypes/pedigree file).

The results of the different tables can be visualized on graphs that are automatically generated by clicking on Graphs (see below).

[Screenshot: Graphs tab of the Step 1 window, with results displayed per generation (F2, C1, C2).]
60. s of interest. The default structure of this file (output_folder/date/tab_scores.txt) can be represented as follows (cells shown as [...] did not survive extraction):

Id	MS	MS_Weight	MS_UC	No (x4)	QTL1	QTL2
[...]	0.3333	0.3333	1.0000	[...]	1.0000	0.0000
[...]	0.3333	0.3333	1.0000	[...]	0.0000	1.0000
IL1_IL2_F1	0.3333	0.3333	1.7071	[...]	0.5000	0.5000
IL1_IL2_F2	0.6613	0.6613	1.9841	[...]	0.9932	0.9908
[...]	0.5880	0.5880	[...]	[...]	0.4966	0.9694

Columns of this file correspond to:

QTLx: expected proportion of favorable alleles (as defined after grouping) at QTLx, i.e. 1, 0.5 and 0 for individuals that are homozygous favorable, heterozygous and homozygous unfavorable, respectively (see table homo_hetero, section 4.2.3).

MS: Molecular Score, the expected proportion of favorable alleles over all QTL, i.e. the average of QTLx values. MS varies between 0, for an individual which does not carry any of the favorable alleles, and 1, for an individual which is homozygous for the favorable alleles (i.e. it corresponds to the target genotype).

MS_Weight: weighted MS, a weighted average of QTLx values used to give more or less importance to the different QTL (only used via the GUI).

MS_UC: Utility Criterion (UC), which combines the molecular score with the expected variance of the MS of the gametes that can be produced by the individual. UC is based on the estimation of the expected number of favorable alleles carried by the superior 5% gametes produced by the individual. For a same MS, this criterion favors individuals
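The QTLx, MS and MS_Weight definitions above amount to a few lines of arithmetic. The following is a minimal sketch, assuming the per-QTL probabilities of the homo_hetero table as input; the function names are hypothetical, and normalising the weights to sum to 1 is an assumption about what "weighted average" means here.

```python
def qtl_score(p_homo_favorable: float, p_heterozygous: float) -> float:
    """Expected proportion of favorable alleles at one QTL:
    1 x P(homozygous favorable) + 0.5 x P(heterozygous) + 0 x P(homozygous unfavorable)."""
    return 1.0 * p_homo_favorable + 0.5 * p_heterozygous


def molecular_score(qtl_scores, weights=None) -> float:
    """MS as the (optionally weighted) average of the QTLx values."""
    if weights is None:
        weights = [1.0] * len(qtl_scores)
    if len(weights) != len(qtl_scores):
        raise ValueError("one weight per QTL is required")
    return sum(w * s for w, s in zip(weights, qtl_scores)) / sum(weights)


# An F1 that is surely heterozygous at QTL1 and QTL2; the third QTL score (0.0)
# is an illustrative assumption, the table excerpt only shows QTL1 and QTL2.
f1_scores = [qtl_score(0.0, 1.0), qtl_score(0.0, 1.0), 0.0]
f1_ms = molecular_score(f1_scores)                     # unweighted MS
f1_ms_weighted = molecular_score(f1_scores, [2, 1, 1])  # QTL1 counted twice
```

With equal weights the weighted score reduces to MS, which matches the identical MS and MS_Weight columns in the table above.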
61. (see section 4.2.4 for more details).

Allelic effects of QTL for traits of interest can be provided by the user. Select File > Import Data from the menu bar, browse to the allelic effects file and click Proceed (see section 5.1, figure 5). OptiMAS then computes predicted molecular scores (PMS) for each trait documented with allelic effects and adds/replaces the corresponding columns (indicated by the PMS prefix) on the right side of the displayed table. Note that missing information will be considered as zero.

Indexes can be defined by combining different sources of information (MS, MS_UC, QTL, PMS and quantitative trait information provided by the user). To do so, click on the Index button, define a formula and click Apply (Figure 12 below). By default the index column will be identified as Index, but an alternative name can be chosen by the user by prepending the literal expression with this alternative name and the character [...]. Note that formulas can be saved by clicking Save (a file saved_formulas.txt is created in the working directory) and reused in the same or another OptiMAS session by clicking Load.

[Screenshot: Index window ("1. Load or enter a formula: click buttons and items; 2. Click Apply") above the molecular score table, with a user-defined column my_index.]
62. ssing two parental lines. In this case, the genotype is inferred from the parents and does not have to be declared.
3. Plant issued from a selfing process. The name of the two parents is the same in this case. S1 means that this individual was obtained after one generation of selfing. The number of generation(s) can vary.
4. Plant issued from the cross between two individuals, in this case two F2 coming from different parental lines, already declared and genotyped. In this case, the four parental alleles may have been transmitted to this individual.
5. Plant without genotyping data information, issued from the cross between two F2 individuals. All possible genotypes will be considered to evaluate the genetic value of this plant.

The possibility to include non-genotyped individuals (e.g. Ind2) makes it possible to analyze most common MAS schemes and mating designs. So, if several non-genotyped steps were required to obtain a specific individual, we must generate virtual individuals in these intermediate steps.

3.2 Genetic Map (.map)

The map file, whose name has to end with .map, is supplied by the user to specify the information regarding markers and is very similar to the Flapjack format.

mrk	chr	pos
[...]
Marker2	1	64.0
Marker3	1	72.5
Marker4	1	90.8

mrk: name of markers (without blank in character chain).
chr: index (numerical value) of the chromosome where the marker is located.
pos: relative chromosomal position of
63. st be changed from one run to another.
6. Algorithm option: 0 (by default): all the QTL present in the input files will be analyzed; 1 to n (integer number): if you want to run the computations for a specific QTL.
7. Algorithm option (optional): verb for verbose mode. Verbose mode creates two files per QTL position, reporting respectively gamete and diplotype probabilities for all individuals. With large/complex data, both files may take a lot of disk space; it is then recommended to disable this option.

4.2 Results and output files interpretation

As the program continues to run, it keeps you informed of progress. Markers used to compute probabilities are reported in file project mrkqtl map. At the end of a run, questionable results (likely genotyping errors regarding pedigree) may be displayed in the file output_folder/date/events_summary.log. It is recommended to always look at it before any interpretation and/or breeding decision (see section 4.2.7). OptiMAS produces sets of files, described below.

4.2.1 _diplotypes_set: probabilities of phased genotypes

Taking into account all information available (pedigree, distance between loci, molecular markers), OptiMAS computes for each QTL the probability of all possible phased genotypes (diplotypes). A diplotype is defined as the union of a pair of unambiguous haplotypes corresponding to parental gametes. Results for each QTL, designated as x, are stored in a specific folder named qtlx
64. [Screenshot: two lists of crosses displayed side by side for the individuals of List4_Complementation_Selection, one built with the complete half-diallel method (crosses numbered up to 45) and one with the better-half method; each row gives the cross, its two parents and the expected scores of the progeny.]

Figure 24: comparison of the two lists (half diallel vs better h
65. t = 0.47 by default, above which a favorable QTL allele is declared present. In this case, and depending on the threshold value, not only individuals considered as homozygous for the favorable allele at the QTL position will be taken into account (e.g. heterozygous individuals, see Fig. 17, in yellow).

nr: each favorable QTL allele is requested to be present in at least nr selected individuals (here nr = 2).

MSmin: the minimum threshold value (MSmin = 0 by default) for the addition of an individual. In this example, individuals with MS genetic value < 0.7 are not considered.

Nmax: the maximum number of individuals selected at the end of the QCS process. Here, up to two individuals will be added to the final subset of selected individuals (Nmax = 10).

Cycle/Group: optional information regarding the generation of selection or another classification criterion in the program (e.g. first cycle, second cycle, F2, F4, subprograms, families, etc.). C2 (cycle 2) was selected instead of None (no selection by default) in order to select individuals that belong to the last cycle of selection.

This approach can be applied to any predefined list, including the possibility to consider an empty list. In this example, the first eight individuals (No = 8) were selected via truncation selection based on the MS criterion. Then, (i) the QTL for which the favorable alleles are present in fewer than nr selected individuals are
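The first part of the QCS procedure, finding the QTL that are not yet sufficiently represented in the current list, can be sketched as follows. This only illustrates the roles of the threshold, nr and MSmin as described above; the function names and the data layout (one dictionary of QTL scores per individual) are assumptions, and the rest of the QCS algorithm is not reproduced here.

```python
def underrepresented_qtl(selected, qtl_names, threshold=0.47, n_r=2):
    """QTL whose favorable allele is declared present (score > threshold)
    in fewer than n_r of the already selected individuals."""
    lacking = []
    for qtl in qtl_names:
        carriers = sum(1 for ind in selected if ind["scores"][qtl] > threshold)
        if carriers < n_r:
            lacking.append(qtl)
    return lacking


def eligible_candidates(candidates, selected, ms_min=0.0):
    """Candidates not yet selected whose MS reaches the minimum value MSmin."""
    chosen = {ind["Id"] for ind in selected}
    return [c for c in candidates if c["Id"] not in chosen and c["MS"] >= ms_min]


# Illustrative data only (identifiers and scores are made up)
pool = [
    {"Id": "P1", "MS": 0.90, "scores": {"QTL1": 1.0, "QTL2": 0.8}},
    {"Id": "P2", "MS": 0.75, "scores": {"QTL1": 1.0, "QTL2": 0.5}},
    {"Id": "P3", "MS": 0.72, "scores": {"QTL1": 0.5, "QTL2": 0.0}},
    {"Id": "P4", "MS": 0.60, "scores": {"QTL1": 0.0, "QTL2": 1.0}},
]
current_list = pool[:2]
lacking = underrepresented_qtl(current_list, ["QTL1", "QTL2"], threshold=0.9, n_r=2)
extra = eligible_candidates(pool, current_list, ms_min=0.7)
```

With a threshold close to 0.5 or below, heterozygous carriers (score 0.5) also count as carrying the favorable allele, which is the effect of the default threshold described above.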
66. ta. It can include quantitative traits (optional). It also requires a header line specifying the information in each column. The file organization will be exemplified below for the following MAS pedigree:

[Diagram: inbred lines carrying alleles a, b, c and d are crossed (CR) to give F1 plants (e.g. IL1_IL2_F1), which give F2 plants (e.g. IL1_IL2_F2, IL3_IL4_F2); F2 plants are crossed (CR) to give the C1 generation.]

Figure 2: example of a multiparental design

The default structure for the input file corresponding to Fig. 2 above can be represented as follows:

[Table lost in extraction; surviving fragments: IL1, IL2, F1, CR, C1.]

The genotype/pedigree file must be in plain text, tab-delimited format (no space between fields). The header should not be changed, even if the 2 optional columns (Cycle, Group) are left blank or [...]. After the header line, each line contains the name (Id) of the current individual, followed by its parents (P1, P2), the pedigree relationship linking the two generations (Step), the assignation to cycle and group if relevant, the genotyping data (Mk1 ... Mkn) and the phenotypic traits (Tr1 ... Trn, optional). Note that this format is very close to the input format of the Flapjack software, except that five additional columns (in red) must be added.

Columns of this file correspond to:

Id: corresponds to the name of each individual, coded as a character string without any blanks and special characters. It must be unique.

Parent 1 / Parent 2: correspond to the name(s) of the parent(s) o
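Before launching a run, the basic layout rules above (tab-delimited, fixed header columns, unique Id without blanks) can be checked with a short script. The sketch below is an assumption-laden illustration, not an OptiMAS tool: the function name is hypothetical, the fixed column names are those listed in the text, and the sample rows (including the genotype coding and the founder Step value) are invented for the example.

```python
import csv
import io

FIXED_COLUMNS = ["Id", "P1", "P2", "Step", "Cycle", "Group"]


def read_genotype_pedigree(handle):
    """Read a tab-delimited genotype/pedigree file and check a few layout rules."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in FIXED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing header column(s): {missing}")
    seen, rows = set(), []
    for row in reader:
        ident = row["Id"]
        if not ident or " " in ident:
            raise ValueError(f"invalid Id: {ident!r}")
        if ident in seen:
            raise ValueError(f"duplicated Id: {ident}")
        seen.add(ident)
        rows.append(row)
    return rows


# Hypothetical sample: values are placeholders, not the official coding
SAMPLE = (
    "Id\tP1\tP2\tStep\tCycle\tGroup\tMk1\n"
    "IL1\tIL1\tIL1\tIL\t\t\t1/1\n"
    "IL2\tIL2\tIL2\tIL\t\t\t2/2\n"
    "IL1_IL2_F1\tIL1\tIL2\tCR\t\t\t\n"
)
rows = read_genotype_pedigree(io.StringIO(SAMPLE))
```

To check a real file, pass an open file object instead of the `io.StringIO` wrapper.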
67. ux.sh. As root or sudoer, it will perform an installation for all users in /usr/local/bin/optimas_gui. The input/output file examples will be stored respectively in /usr/local/share/OptiMAS/input and /usr/local/share/OptiMAS/output. As a common user, it will perform a local install in the user's personal directory, $HOME/bin/optimas_gui. The input/output file examples will be stored respectively in $HOME/OptiMAS/input and $HOME/OptiMAS/output.

Note: if the installation script fails to find the qwt and/or graphviz library paths on your system, you can specify them in the config.in file (see the INSTALL and README files for more details).

You can now launch the OptiMAS interface from a terminal (optimas_gui, or /path/to/optimas_gui) or double-click on optimas_gui in your file browser. Note: it is not necessary to specify the complete path if the folder including the optimas_gui executable is present in the PATH environment variable.

To uninstall OptiMAS, run the script uninstall_optimas.sh, located in the same directory as optimas_gui and optimas.

2.2.3 MacOSX (32 bits)

After downloading the application (optimas_gui.app), to install it you just need to drag it into your Applications folder (/Applications). Then launch OptiMAS by double-clicking on the optimas_gui icon present in your file browser. A new folder named OptiMAS, containing the two data set examples, will be created in your home directory.

2.3 Files and directories description
68. ve

Figure 22: pedigree of individuals selected via the QCS method

This representation is useful to follow the contribution of selected individuals over generations of selection and to prevent possible bottlenecks (individuals coming from a reduced number of parents at a given generation), in order to limit the risk of drift, which may lead for instance to the fixation of an undesired phenotypic type for traits not considered in the MARS process. It can also be used to maintain diversity for selection on traits complementary to those considered in the MARS process.

5.4 Step 3: Identification of crosses to be made among selected individuals

Now that your list(s) of selected individuals is (are) established, it is necessary to identify the crosses to be made to initiate the next MARS cycle. We addressed crosses between individuals of a single list (diallel design) or of two complementary lists (factorial design).

The diallel situation can be managed with three options: (i) the automatic definition of the whole list of possible crosses according to a half-diallel (complete) method (see Fig. 23, 24, 25 and 26); (ii) the better-half strategy (Bernardo et al. 2006), which consists of avoiding crosses between selected individuals with the lowest scores (see Fig. 23, 24, 25 and 26); and (iii) application of constraints on the contribution of parents or on the maximum number of crosses (see Fig. 27 and 28). In this last case, best crosses are determin
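The first two diallel options can be sketched in a few lines. In the sketch below, the expected MS of a cross is taken as the mean of the two parental MS values, and the better-half rule is read as "drop every cross whose two parents both fall in the lower-ranked half of the list"; both are assumptions made for illustration, and the function names are hypothetical. The MS values are illustrative.

```python
from itertools import combinations


def half_diallel(selected):
    """All pairwise crosses among selected individuals (dict Id -> MS),
    with the mid-parent MS as expected progeny score, best crosses first."""
    crosses = [(a, b, (selected[a] + selected[b]) / 2.0)
               for a, b in combinations(selected, 2)]
    return sorted(crosses, key=lambda c: c[2], reverse=True)


def better_half(selected):
    """Half diallel without the crosses between two lower-half individuals."""
    ranked = sorted(selected, key=selected.get, reverse=True)
    lower_half = set(ranked[(len(ranked) + 1) // 2:])
    return [c for c in half_diallel(selected)
            if not (c[0] in lower_half and c[1] in lower_half)]


# Illustrative MS values for four selected candidates
candidates = {"B8": 0.8366, "B158": 0.8024, "B13": 0.7609, "B37": 0.7433}
all_crosses = half_diallel(candidates)   # 4 x 3 / 2 = 6 crosses
kept_crosses = better_half(candidates)   # B13 x B37 is dropped
```

For n selected individuals the complete half diallel contains n(n-1)/2 crosses, so ten candidates give 45 crosses; under the reading used here, the better-half rule removes the 10 crosses among the five lowest-ranked candidates.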
69. versions are distributed to run under most modern GNU/Linux, Windows (XP, 7, 8) and MacOSX (10.5 or later, with Intel processor) systems. Please note that extensive testing has only been done under Linux. Ready-to-use binaries/installation packages and the source code (which you are welcome to attempt to compile on your favorite platform) are available via http://moulon.inra.fr/optimas.

Two versions of the tool have been developed. The first one, called optimas, manages the computationally intensive processes of step 1. It runs in command line and is written in ANSI C. The second version integrates the C program and additional functionalities to display results and facilitate breeding decisions within a Graphical User Interface named optimas_gui (coded in C++ using the Qt, Qwt & Graphviz libraries).

2.1 OptiMAS in command line

If you are dealing with a huge amount of data or complex MARS schemes, it could be better for you to use OptiMAS in command line (on a server for example) and then reload the results folder via the GUI (see section 5).

2.1.1 Windows (32 bits)

The executable comes in a zipped file. Extract it with your favorite file archiver software (e.g. 7-zip). This will create a new directory called optimas_cmd_win. Move to this directory via the terminal application (click on Start > Execute > type cmd > OK) before attempting to run the program, for example:

    > cd Desktop\optimas_cmd_win

To run the program
70. virtual next generation, with information on the number of individuals needed to reach the ideotype.
- Computation of a diversity score based on pedigree (effective population size).
- New algorithm to compute genotypic probabilities with no limitation on the number of flanking markers around the QTL position (option to use a sliding window to compute probabilities along the genome).
- Linkage between QTL in the estimation of the utility criterion.
- Management of the QTL position uncertainty in score computation.
- Computation of a diversity score based on markers outside QTL.

7 How to cite this program

In publications including results from the use of this program, please specify the version of the software you used.

Valente F, Gauthier F, Bardol N, Blanc G, Joets J, Charcosset A, Moreau L (2013) OptiMAS: A Decision Support Tool for Marker-Assisted Assembly of Diverse Alleles. Journal of Heredity. doi:10.1093/jhered/est020

8 Contact

Please send bug reports and/or requests for new features to Alain Charcosset (charcos@moulon.inra.fr) and Laurence Moreau (moreau@moulon.inra.fr).

9 Acknowledgments

We are grateful to colleagues within INRA and the Generation Challenge Program (GCP) for enlightening user-oriented input to this project. This development has benefited from the advice and beta testing of Delphine Fleury and Mark Sawkins. This program is part of the IBP within GCP and is funded by the Bill and Melinda Gates Foundation.
71. vorable alleles (QTL5 and QTL8 in red, not presented in Fig. 19) in the next generation.

Note that individuals can be deleted from a list (right-click on the selected individuals and press the Delete button). They can also be copied from one list to another by drag and drop.

The different lists of individuals can also be compared via the Graphs section window (click on the Graphs tab, see Fig. 20).

[Screenshot: Graphs tab of the Step 2 window, with two histograms of Molecular Score (MS) shown side by side for List2_Truncation_MS_Selection and List4_Complementation_Selection, each with its mean and variance.]

Figure 20: comparison between lists of selected individuals via graphs T
72. with no unfavorable alleles fixed. This score ranges from 0 to the number of QTL. Note that the present version of UC estimation assumes independence between QTL and should be considered as only indicative in case of linked QTL. It also assumes that the distribution of scores can be approximated by a normal distribution, which is not valid in case of a small number of heterozygous QTL.

No: number of QTL homozygous for favorable allele(s). A given QTL is considered as homozygous for favorable allele(s) when prob exceeds a default threshold value of 0.75. This threshold can be modified via the GUI (see Fig. 9), resulting in an update of this column.

No: number of QTL homozygous for unfavorable allele(s). A given QTL is considered as homozygous for unfavorable allele(s) when prob exceeds a default threshold value of 0.75. This threshold can be modified via the GUI (see Fig. 9), resulting in an update of this column.

No: number of QTL heterozygous, with both favorable and unfavorable allele(s). A given QTL is considered to belong to this category when prob exceeds a default threshold value of 0.75. This threshold can be modified via the GUI (see Fig. 9), resulting in an update of this column.

No: number of QTL defined as uncertain. Concerns QTL which are not attributed to any of the three previous categories.

Notes: At QTL1, individual IL1_IL2_F1 has a 100% probability to be heterozygous aaaaa bbbbb (see section
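The four count columns follow from a simple thresholding of the three probabilities reported per QTL in the homo_hetero table. A minimal sketch under that reading, with hypothetical function and category names:

```python
from collections import Counter


def classify_qtl(p_homo_fav, p_homo_unfav, p_het, threshold=0.75):
    """Assign one QTL of one individual to a category; a category is retained
    only when its probability exceeds the threshold, otherwise 'uncertain'."""
    if p_homo_fav > threshold:
        return "homozygous_favorable"
    if p_homo_unfav > threshold:
        return "homozygous_unfavorable"
    if p_het > threshold:
        return "heterozygous"
    return "uncertain"


def count_categories(qtl_probabilities, threshold=0.75):
    """Count the QTL of one individual falling in each category."""
    return Counter(classify_qtl(*probs, threshold=threshold)
                   for probs in qtl_probabilities)


# (P homozygous favorable, P homozygous unfavorable, P heterozygous) per QTL
individual = [
    (0.98658, 0.00003, 0.01335),  # almost surely fixed for the favorable allele
    (0.00000, 0.00000, 1.00000),  # surely heterozygous
    (0.50000, 0.10000, 0.40000),  # no category above 0.75
]
counts = count_categories(individual)
```

Any threshold above 0.5 guarantees that at most one category can be retained per QTL, since the three probabilities sum to at most 1.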
73. ycle, in order to check genotyping errors along the pedigree. A new genotypes/pedigree file is created, with genotyping errors filled with missing data.

[Screenshot: data import window with the fields Special map file (.cmap), Genotypes/Pedigree file and Output directory (results, new .dat file), each with a Browse button.]

Figure 32: data set importation to run the FillMd Mks tool

Loading input/output files using the browser:

Map file: path to a new genetic map/QTL position input file (.cmap). This file has been created at the end of a previous run of OptiMAS in order to re-run OptiMAS marker by marker and localize genotyping errors at individual marker positions.

Genotype file: path to the genotype/pedigree file (.dat). This is the same file used to run OptiMAS the first time.

Output directory: path to the folder where a new genotype/pedigree file (.dat), with inconsistent genotyping data replaced by missing data, will be stored. Note that your output directory should not be in the Program Files folder or other specific directories with administrator privileges.

Click on the Run button to create this new genotype/pedigree file (new_file.dat). Use this new .dat file to re-run OptiMAS.

6 Future work

Next steps (coming soon):
- Development of a simulation procedure (Step 4) to produce a
74. [Screenshot: Step 2 window showing List1_Manual_Selection with columns Id, P1, P2, Cycle, Group, MS, Weight and UC; five individuals of cycle C2 are listed, with MS from 0.8366 down to 0.7404.]

Figure 15: manual selection of individuals added to a list. Individuals are selected manually (a) and then added to a list of your choice (b). This new list can be accessed through the Step 2 interface (c, see above). Names of lists can be modified by the user at all steps by double-clicking on the name of the list one wants to modify.

The two next options are initiated by clicking on the Step 2 icon.

Truncation selection (TS): individuals can be ranked automatically based on four possible criteria: Molecular Score (MS), Weighted MS, Utility Criterion (MS_UC), or Index (if computed). The Nsel first sorted individuals are selected to generate the next population, e.g. Nsel = 10, Criterion = MS, List = List2_Truncation_MS_Selection, Option: Cycle = C2 (last cycle); then press Run. A second list can be created by doing the same with Criterion = Weight and List = List3_Truncation_Weight_Selection. The list(s) of selected individuals will appear on the Selection page (see Fig. 16).

[Screenshot: Step 2 option panel with the Truncation selection (TS) and QTL Complementation Selection (QCS) settings.]
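Truncation selection as described above is a sort followed by a cut. The sketch below illustrates it, assuming each individual is a dictionary carrying its Id, Cycle and criterion values; the function name and data layout are hypothetical, and the values are illustrative ones in the range shown in the Step 2 list.

```python
def truncation_selection(individuals, criterion="MS", n_sel=10, cycle=None):
    """Rank individuals by decreasing criterion value (optionally restricted
    to one cycle) and return the Ids of the n_sel best ones."""
    pool = [ind for ind in individuals if cycle is None or ind["Cycle"] == cycle]
    ranked = sorted(pool, key=lambda ind: ind[criterion], reverse=True)
    return [ind["Id"] for ind in ranked[:n_sel]]


# Illustrative values
plants = [
    {"Id": "B8",   "Cycle": "C2", "MS": 0.8366, "MS_UC": 9.203},
    {"Id": "B158", "Cycle": "C2", "MS": 0.8024, "MS_UC": 9.326},
    {"Id": "B13",  "Cycle": "C2", "MS": 0.7609, "MS_UC": 9.076},
    {"Id": "B37",  "Cycle": "C2", "MS": 0.7433, "MS_UC": 8.676},
    {"Id": "A1005", "Cycle": "C1", "MS": 0.9000, "MS_UC": 9.900},
]
best_by_ms = truncation_selection(plants, criterion="MS", n_sel=3, cycle="C2")
best_by_uc = truncation_selection(plants, criterion="MS_UC", n_sel=2, cycle="C2")
```

The two criteria need not agree: here the second-ranked plant for MS comes first for MS_UC, which is why it can be useful to build one list per criterion and compare them.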
75. ... you must supply 4 input files, and possibly an optional one for allelic effects. Instructions for how to prepare the input files are given below (see section 3). Input file examples (input directory) are supplied with the software. You can run the program on the test example data supplied by typing (see section 4.1 for more details about the parameters and options):

> optimas.exe input/blanc.dat input/blanc.map 0.000001 0.0 output/run_blanc 0 verb

Make sure that both optimas.exe and the input folder with the examples are in the current directory. The program will output a summary of the results in the folder output/run_blanc. Open the file tab_scores.txt to see the genotypic values calculated for the plants present in the dataset.

2.1.2 Linux (32/64 bits)

To build and install optimas in command line on your system, extract the zipped file optimas_cmd_linux.zip by typing, for example:

unzip optimas_cmd_linux.zip

This will create a new directory called optimas_cmd_linux. Then open a terminal, move to this directory before attempting to run the program, and run the installation shell (bash) script install_optimas_on_linux.sh with its no gui option. As root or sudoer, it will perform an installation for all users in /usr/local/bin/optimas. The input and output file examples will be stored in /usr/local/share/OptiMAS/input and /usr/local/share/OptiMAS/output, respectively. As a common user, it will perform a local install in user's personal
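The example run in item 75 can also be launched from a script. The sketch below only assembles the argument list and checks that the two input files exist before calling the executable. The punctuation of the arguments (paths with `/`, `0.000001`, `0.0`) is reconstructed from the flattened text of this excerpt and is an assumption to be checked against section 4.1; the helper names `build_optimas_command` and `run_example` are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def build_optimas_command(exe: str, dat: str, gmap: str, out_dir: str) -> List[str]:
    """Assemble the example command line. The fixed arguments are copied from
    the manual's example; their exact form is an assumption (see section 4.1
    of the manual for what each one means)."""
    return [exe, dat, gmap, "0.000001", "0.0", out_dir, "0", "verb"]


def run_example(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run the command after checking that both input files are present."""
    missing = [path for path in cmd[1:3] if not Path(path).is_file()]
    if missing:
        raise FileNotFoundError("missing input file(s): " + ", ".join(missing))
    return subprocess.run(cmd, check=True)


cmd = build_optimas_command(
    "optimas.exe", "input/blanc.dat", "input/blanc.map", "output/run_blanc"
)
print(" ".join(cmd))
```

Calling `run_example(cmd)` from the directory that contains the executable and the input folder would then start the analysis. Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.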
