
QTLMap 0.9.6 User's guide


Contents

1. - For simulations or granddaughter designs, the keyword CORRELATION MATRIX should be used. Following the keyword, the correlation matrix is given: the heritabilities as diagonal elements, the phenotypic correlations below the diagonal and the genetic correlations above it (not available for expression traits). If this information is missing, h² = 0.5 and correlations = 0.5 are assumed.

Options for trait information

Example 1: standard situation

    3                     Number of traits
    1 1                   Number of fixed effects and covariates
    sexe poids            Names of the fixed effects and covariates
    malade r 1 1 0        1st trait: nature (real value), model
    malcor r 0 0 1        2nd trait: nature (real value), model
    third r 0 0 0         3rd trait: nature (real value), model
    CORRELATION MATRIX
    0.55 0.26 0.29
    ORO 0.52 0.28
    0.20 0.20 0.33

Box 7: Example 1 of a model file

This model file describes the performance file, where one fixed effect, one covariate and three performances are referenced for each animal. The model for each performance is:

    malade = µ + sexe + β·poids
    malcor = µ + QTL × sexe
    third = µ + e

Example 2: use of the keyword ALL (in particular in expression data analyses)

    10000                 Number of traits
    1 1                   Number of fixed effects and covariates
    sexe cov1             Names of the fixed effects and covariates
    mung x db 3b          "all" is a keyword: the model will be applied to all the 10000 expression traits

Box 8: Example 2 of a model file
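The indicator layout of a trait line (fixed effects first, then covariates, then QTL × fixed-effect interactions) can be illustrated with a small decoder. This is only a sketch of how such a line might be read, not QTLMap's own parser: the function name `decode_model_line` and the returned dictionary are hypothetical, and the line is assumed to be whitespace-separated as in Example 1.

```python
def decode_model_line(line, fixed_effects, covariates):
    """Decode one trait line of a model file, e.g. 'malade r 1 1 0'.

    Hypothetical helper (not part of QTLMap). The 0/1 indicators are read
    in the order given in the guide: nf fixed effects, nc covariates,
    then nf QTL x fixed-effect interactions.
    """
    name, nature, *flags = line.split()
    nf, nc = len(fixed_effects), len(covariates)
    if len(flags) != 2 * nf + nc:
        raise ValueError(f"expected {2 * nf + nc} indicators, got {len(flags)}")
    if any(f not in ("0", "1") for f in flags):
        raise ValueError("indicators must be 0 or 1")
    on = [f == "1" for f in flags]
    return {
        "trait": name,
        "nature": nature,  # 'r' = real value, 'i' = discrete ordered data
        "fixed": [e for e, keep in zip(fixed_effects, on[:nf]) if keep],
        "covariates": [c for c, keep in zip(covariates, on[nf:nf + nc]) if keep],
        "qtl_interactions": [e for e, keep in zip(fixed_effects, on[nf + nc:]) if keep],
    }


if __name__ == "__main__":
    for trait_line in ("malade r 1 1 0", "malcor r 0 0 1", "third r 0 0 0"):
        print(decode_model_line(trait_line, ["sexe"], ["poids"]))
```

With the lines of Example 1, `malade` keeps the fixed effect `sexe` and the covariate `poids`, `malcor` keeps only the QTL × `sexe` interaction, and `third` keeps no nuisance effect, which matches the three models listed under Box 7.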
2. multitrait multivariate analysis multitrait multivariate analysis 1 n QTL Within sire gaussian mixture analysis possible 1 n QTL Within sire gaussian mixture analysis possible unitrait 1 n QTL general gaussian linear approximation possible unitrait 1 n QTL Within sire gaussian linear approximation possible unitrait 1 n QTL general gaussian linear approximation possible unitrait 1 n QTL Within sire gaussian linear approximation possible Options for calcul in development for advanced users only LD unitrait 1 QTL Within sire gaussian possible LDLA unitrait 1 QTL Within sire gaussian possible n is the number of QTL recommended value lt 3 23 and 24 approaches are similar but 24 includes a LU factorization of traits correlation matrix QTLMap 0 9 6 28 56 7 2 Option haplotype parental phase identification To chose the parental phases identification and the grand parental segment transmission methods The methods are based on various algorithms with different balance between computation speed and precision haplotype Sire and Dam phase probability Transmission probabilities Recomandation for from parents to offspring sparse dense map 0 Phases are read in the markers Rapid and optimised Sparse or dense genotype file for each locus 1st resp _ transmission probabilities 2 4Jallele read on the 1s chr
3. Genet Sel Evol 31:341-350.
Knott S, Elsen JM, Haley C (1996) Methods for multiple marker mapping of quantitative trait loci in half-sib populations. Theoretical and Applied Genetics 93(1-2):71-80.
Larrosa J, Schiex T (2004) Solving weighted CSP by maintaining arc consistency. Artificial Intelligence 159(1-2):1-26.
Legarra A, Fernando RL (2009) Linear models for joint association and linkage QTL mapping. Genet Sel Evol 41:43.
Mangin B, Goffinet B, Le Roy P, Boichard D, Elsen JM (1999) Alternative models for QTL detection in livestock. II. Likelihood approximations and sire marker genotype estimations. Genet Sel Evol 31:225-237.
Moreno CR, Elsen JM, Le Roy P, Ducrocq V (2005) Interval mapping methods for detecting QTL affecting survival and time-to-event phenotypes. Genet Res Camb 85:139-149.
4. 0 010 0 080 O13 0 03 omii 0 03 0 13 0 03 0 13 0 03 0 010 0 090 Os 112 0 02 05112 0 02 0 112 0 02 0 112 0 02 Box 21 Sire QTL effect file o Dam QTL effects out mateff For each tested position the file contains for each tested position on all tested chromosomes The chromosome tested position the dam 1 QTL effect estimation the dam 2 QTL effect estimation QTLMap 0 7 51 56 Note the QTL effect are given only for dams with offspring size larger than the threshold given byopt ndmin 10 6 Parental phase output For each sire and dam if the dam had more than NDMIN progeny the chromosomal phases are displayed on two lines first for the paternal gamete and secod for the maternal gamete CkCk ck ck ck ck ck ck k ck ck ck kk kk SIRE PARENTAL PHASES KKKKKKKKKKKKKKKKK CHROMOSOME 7 QILOOOL m 2 2 4 5 1 2 X 12 2 3 QUOI Cl XQ ts 4 4 3 2 ke 2 2 919045 S 59 S 6 5 9 6 6 ORO E el By AL teh Gy S M QLOOML BS 2 8 Bl 2 26 2 A STOOP Cf iG to 2 4 3 is S 2 MIOOIS amp 7 G lL Awe A il amp ZG Dil OOS CMC OMS 2L WEM dE ESL GE k kkkkk k k DAM PARENTAL PHASES x x k kx CHROMOSOME 7 None of the females had more than the minimum number of progeny needed to estimate its possible phases Box 22 Sire QTL effect file 10 7 Offspring phases The progeny phases are output when the key out phases offspring is given a value in
5. contingence matrix

    opt_eps_hwe (default 0.001): threshold to check the equilibrium of marker transmission within each family
    opt_eps_linear_heteroscedastic (default 0.5): threshold for convergence in the heteroscedastic linear model
    opt_max_iteration_linear_heteroscedastic (default 5): maximum number of iterations in the heteroscedastic linear model, to avoid an infinite loop
    opt_eps_recomb (default 0.05): minimum probability of recombination events, knowing the recombination rate between 2 markers, to detect mapping errors
    opt_nb_haplo_prior (default 200): maximum number of haplotypes at a given position above which the execution is stopped (computing resources may become problematic)
    opt_pro_haplo_min (default 0.00001): minimum frequency under which a haplotype is added to the rare haplotypes group
    opt_longhap (default 4): haplotype length in LD and LDLA (number of markers)
    opt_optim_maxeval (default 1000000): maximum number of objective function estimations
    opt_optim_maxtime (default 1000000): maximum time to optimize the objective function
    opt_optim_tolx (default 0.00005): finite difference variables values used in estimating the function gradient (non-linear methods)
    opt_optim_tolf (default 0.00005): stopping criterion, lower bound of the objective function
    opt_optim_tolg (default 0.00005): stopping criterion, lower bound of the gradient
    opt_optim_h_precision (default 0.00005): precision to obtain the gradient

Remark 1 (opt_ndmin): The maximum likelihood methods implemented in QTLMap consider the population as being a mixture of half-sib and full
6. GNU compiler collection (gfortran 4.6; gcc -v)
CMake 2.8 (cross-platform, open-source build system)

Compilation:

    > cd $QTLMAP_DIR
    > mkdir build
    > cd build
    > cmake -DCMAKE_BUILD_TYPE=Release ..
    > cmake -DCMAKE_Fortran_COMPILER=gfortran ..
    > make

The binary qtlmap is created in the $QTLMAP_DIR/build/src directory. To install the qtlmap binary in the bin directory ($QTLMAP_DIR/bin):

    > make install

OpenMP support: supports multi-platform shared-memory parallel programming. To define the number of threads:

    > export OMP_NUM_THREADS=8

NVIDIA GPU acceleration support: QTLMap can use NVIDIA GPU cards (Tesla C20XX series) to massively accelerate analyses and simulations for QTL detection.

    > cmake -DCUDA_TOOLKIT_ROOT_DIR=<path_cuda_toolkit_dir> -DGENCODE_CUDA=arch=compute_20,code=sm_20 ..

6 Input files

To carry out an analysis you need a minimum of 4 data files (marker map file, pedigree file, marker genotypes file, performance file), a file describing the performances (model file) and a file describing the input, output and options (parameter file). A file describing the breed origins of parents or grandparents has to be provided when within-breed haplotype effects are considered.

6.1 Pedigree file

The file contains pedigree information for the 2 last generations of a design which comprises 3 generations, i.e. parents and progeny. It must not contain th
7. and their progenies (the G2 generation). Extra data may be given about the grandparents (G0), their ancestors (G-1) and the descendants (G3) of the G2 generation.

4.1 The basic model

4.1.1 Information

Three groups of information are needed in the analysis.

The pedigree information P describes the familial structure along the generations, i.e. for each individual (say the $l$-th in the list) its ID $P_l$ and the IDs of its sire $Ps_l$ and dam $Pd_l$. The only mandatory information are the trios $(P_l, Ps_l, Pd_l)$ of G2 individuals. This information is assembled in a pedigree file. Animals without parental information are the founders and do not figure in this pedigree file. When available and useful, information about other generations (G-1, G0, G1, G3) may be given. The table lists these extra cases.

    Available extra        The file containing the trio (P_l, Ps_l, Pd_l) must be given for
    information            G-1    G0     G1     G2     G3
    Full pedigree          Yes    Yes    Yes    Yes    No
    G1 markers             No     No     Yes    Yes    No
    G0 and G1 markers      No     Yes    Yes    Yes    No
    G3 phenotype           No     No     No     Yes    Yes

The marker information M describes, for each individual $l$, a list of allele pairs observed at a set of $nm$ markers: $m_l = \{m_{lsa};\ s = 1,\dots,nm;\ a = 1,2\}$. The only mandatory information concerns the G2 individuals ($m_l = mp_{ijk}$ for $l = ijk$). However, when available, extra information about G1 (i.e. $ms_i$ and $md_{ij}$) and G0 ($mgs_i$, $mgd_i$, $mgs_{ij}$ and $mgd_{ij}$) will be used. All data concerning the $i$-th sire
8. --calcul=1 --snp

7.4 Option qtl: number of QTL (detection available)

For most of the analyses controlled by the runtime option calcul, only 1 QTL is considered in the model. However, this number may be increased to 2 or more, depending on the calcul option. The number of QTL is given by the qtl runtime option. In practice, as computing time increases rapidly with the number of QTL, we do not recommend testing for more than 3 QTL.

    Analysis (calcul)             QTL test / detection (qtl)
    1                             1, 2
    2, 9, 6, 7, 8, 9              1
    3, 4, 23, 25, 26, 27, 28      > 1

Example: $QTLMAP_PATH/qtlmap p_analysis --calcul=1 --qtl=1

7.5 Option optim: optimisation method

The optim runtime option controls the optimisation procedure. Several methods are proposed:

    optim     Description                                                         External packages needed
    1         E04JYF NAG routine (quasi-Newton)                                   NAG
    2         L-BFGS routine (Broyden-Fletcher-Goldfarb-Shanno, quasi-Newton)     no
    5-11      LUKSAN optimisation                                                 no
    12-47     NLOPT optimisation                                                  GCC

Methods may be parametrized with the following options:

    opt_optim_maxeval: maximum number of objective function estimations
    opt_optim_maxtime: maximum time to find the solution of the objective function
    opt_optim_tolx: tolerance, lower bound of a step
    opt_optim_tolf: stopping criterion, lower bound of the objective function
    opt_optim_tolg: stopping criterion, lower bound of the gradient
    opt_optim_h_precision: precisi
9. exponential in the penetrance L1 is algebraically developped and elementary statistics ES are computed only once allowing a fast computation of the likelihood during the optimisation process 2 x 1 VPijk Hijx The penetrance is o ui oi 2m exp tmc Its mean Hik can be writen u U 2 Hij bij Mi bia The element ybijk Lin is YPt ik 2ypige ui Hij biet beau hi hij birai boo QTLMap 0 9 6 8 56 Pij 1 The ES for the part of the likelihood IL L1jj corresponding to a sire i dam ij dam phase hdjij are Cy C2 Ca C4 Cs Ge c Cg Co Cu ES np 2 2 NL gt Pin NET bi gt Pin bf Pij gt bin gt bie gt binbir gt in gt bie for constant p and yj a ai Ul j and uid lilij aja a2 aj Hillij 1 Hi jG Hi ij During the optimization process the space of the unknown parameters Li Hijs i 0j is explored using the c 1 19 computed only once to estimate the exponent as 2 c u iij C726 2cga 2c ajj Cs ui Hij 2c3a 2c40 j Wcgajaj Coa Cy AZ 4 1 5 Optimisation Parameters of LO likelihood are directly computed using standard formulae i XMjXkYyDjk npi hij Xxypj npij Bi X x i X2 6 Y pj Hi lij np With npi j pij Parameters of L1 likelihood are estimated for each tested QTL position x using a derivative free numerical optimiser As numerical difficulties may
10. family kkkkkkkk MARKER DESCRIPTION F x x kx x 236 animals are present in the genotype fil animal890738 900848 of genotype file are not in the pedigr iE a IL where all animals are genotyped for at least one marker markers were selected among 10 markers There are 236 genotyped animals Check the equilibrium of marker transmission within each family Marker m10 for sire 20 not in HWE 92 heterozygous progeny amongst 150 Marker m49 for sire 17 not in HWE 56 heterozygous progeny amongst 150 Mievek lt exe iil 3 S j tere Gutes bib iaeie bay Ishin c 58 heterozygous progeny amongst 150 Box 16 3 Example of main output file information first section simple statistics about marker data Description of performance traits For each quantitative traits simple statistics are edited v Names of the quantitative traits for each trait Y Number of individuals measured Y Number of individuals having for both performance values and marker genotypes Y Mean variance minimum and maximum Y Names of fixed effect if any with the list of levels v Names of the covariates if any with their mean variance minimum and maximum Ck ck ckckockckckckck ck kck kockck k TRAITS DESCRIPTION ckckckckckckckckckck kc ck ck kk NUMBER OF PHENOTYPED ANIMALS 8 236 NUMBER
11. family, i.e. the marker genotypes of the sire, its mates and their progeny, is pooled in the $M_i$ table. In the simplest situation $M_i = \{mp_{ijk};\ j = 1,\dots,nd_i;\ k = 1,\dots,np_{ij}\}$.

It is important to realize that, before running QTLMap, the parental phases (that is, the way the marker alleles are positioned on their chromosomes) are supposed unknown (in fact, an option allows the input of phased marker phenotypes from external software). If, for instance, sire $i$ is said to carry the marker genotypes A/C, T/G, A/A at loci 1, 2 and 3, the reading order, which gives the trios ATA and CGA, may not be the way the alleles are carried by sire $i$ chromosomes 1 and 2.

The trait phenotype information Y describes, for each individual $l$, a quantitative (possibly discrete) performance, or a vector of $nt$ quantitative performances $y_l = \{y_{lt};\ t = 1,\dots,nt\}$. The only mandatory information are the G2 individuals' quantitative trait phenotypes ($yp_{ijkt}$ for $l = ijk$), assembled in a vector $yp_i = \{yp_{ijk};\ j = 1,\dots,nd_i;\ k = 1,\dots,np_{ij}\}$. These vectors form $yp = (yp_1, \dots, yp_i, \dots, yp_{ns})$. However, when available, extra information from G3 will be used.

4.1.2 Statistical formulation

In the basic model of QTLMap, the hypothesis is tested that one QTL affecting a single trait is located at a position x in a linkage group (e.g. a chromosome). Successive positions on this linkage group are scanned. The test is performed with the interval mapping technique applied to an approximation
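The phase ambiguity in the sire example (A/C, T/G, A/A) can be made concrete by listing every way the unordered allele pairs may sit on the two chromosomes. The sketch below is an illustration only and is unrelated to the phasing algorithms QTLMap actually uses; `enumerate_phases` is a hypothetical name, and a genotype is assumed to be a list of unordered allele pairs, one per locus.

```python
from itertools import product


def enumerate_phases(genotypes):
    """List the distinct phases compatible with unordered genotypes.

    genotypes: list of (allele_a, allele_b) pairs, one pair per locus.
    Returns a list of (chromosome_1, chromosome_2) strings.
    Illustrative helper only (not a QTLMap routine).
    """
    het = [i for i, (a, b) in enumerate(genotypes) if a != b]
    phases = []
    # The orientation of the first heterozygous locus is fixed so that
    # mirror images (chromosomes 1 and 2 swapped) are not counted twice.
    for flips in product([False, True], repeat=max(len(het) - 1, 0)):
        flip_at = dict(zip(het[1:], flips))
        chrom1, chrom2 = [], []
        for i, (a, b) in enumerate(genotypes):
            if flip_at.get(i, False):
                a, b = b, a
            chrom1.append(a)
            chrom2.append(b)
        phases.append(("".join(chrom1), "".join(chrom2)))
    return phases


if __name__ == "__main__":
    sire = [("A", "C"), ("T", "G"), ("A", "A")]
    print(enumerate_phases(sire))  # [('ATA', 'CGA'), ('AGA', 'CTA')]
```

With two heterozygous loci there are two candidate phases: the reading order ATA / CGA and the alternative AGA / CTA. In general, h heterozygous loci give 2^(h-1) candidates, which is why marker information from relatives is needed to decide between them.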
12. in std unit enl iL wep 8 0 669 Box 16 10 Example of main output file information fourth section Analysis under H1 e Parameters tests QTLMap 0 7 43 56 For each of the nuisance effect fixed effects and covariates a LRT is reported with the value and significance level of the likelihood ratio when comparing a model with or without this effect The significance level is the probability for the likelihood under the alternative hypothesis to be higher than the likelihood under the null no effect When this LRT exceeds the threshold of a Chi test with p 1 degrees of freedom p being the number of levels for a fixed effect 1 for a covariate corresponding a 5 a 1 or a 0 1 96 type I error the effect can be declared significant Ck Ck ck ck ck ck kk ck kk ck Ck Ck ck Ck Sk ck kk Ck kk kk ck kk ck kk ck kk Ck kk Ck kk kk ck kk ck kk Ck kk Ck kk kk Sk kk ko kk ck kk Ck kk Ck kv Sk Sk ck Sk ko kc kckock ok Testing model effects Tested effect ciae Likelihood p value ratio ied direct effect 23 100 823 1 000 12 direct effect 10 IA aS 1 000 sex direct effect 2 11 146 1 000 Box 16 11 Example of main output file information fourth section Analysis under H1 e Description of data structure Warning about possible confusions between traits effects estimations are given KKK KK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KK KKK KKK KK KK Tes
13. occur depending on the data structure and type of analysis model chosen by the user a panel of optimisers are proposed by QTLMap 4 1 6 Conclusion about the basic model The computation framework corresponding to the basic model described so far ie Within sire dam regression using elementary statistics develoment for the computation of the penetrance and assuming the sire phase correctly found from the marker information will be qualified the standard framework This is the first we used following Elsen et al 1999 Mangin et al 1999 and Goffinet et al 1999 recommandations Many alternative formulations and enrichments of the analysis were proposed in a second time and are now described They are available both for the half sib and general family structures 4 2 Alternative formulations of the likelihood 4 2 1 Fulllinearisation the regression model A fully linear approximation of the likelihood generalizing the Haley and Knott 1992 or Haley et al 1994 regression models is proposed to the user Under H1 the likelihood is given by ns ndi npij a T TT ee i 1 j l k 1 QTLMap 0 9 6 9 56 Wine u p hs M gt plhdy hsi M 9 pti ts ta hsi hdi My eu ijra hsi hdij ts ta X M this is equivalent to Wij u X pti ts t Mj es dis In this situation the parameters may be estimated using the standard linear models framework Let focus on the ith family yp ypijx is the vect
14. t With v p the transmission event 1 or 2 from the sire to the progeny at QTL located at x q 1 2 on the scanned chromosome The summation thus extends on 4 situations Y Qgit the within sire i effect of the t allele at the q QTL located at x Y aajp the epistatic effect for the it sire aaj if the sire transmitted alleles 1 and 1 a if it transmitted 2 and 1 aj if it transmitted 1 and 2 aj if it transmitted 2 and 2 4 3 1 3 Any number nq of unlinked QTL The amount of computation increasing very rapidly with this number further approximations were made to face this burden the fully linearized version of the likelihood is retained and the transmission events are supposed independant between all simultaneously tested QTLs With these approximations the likelihood turns to be ns ndi npij Xq l I k 1 ud 2 2 oj ts t4 M egus 1 p q inqtot 4 3 2 Modeling the polygenic background not yet fully available in QTLMap In the basic model all parents are supposed unrelated a situation not realistic in livestock populations When pedigree information about ancestors is available population structure due to familial relationships may be considered in performances description This extension is proposed in the fully linearized version of the likelihood The linear model is extended to yp Xu W a Za e where y Xp 1uis a column of p the general mean Y ais the vector of QTL effects followin
15. the linkage group under hypothesis H1 (1 QTL) and H2 (2 QTL), grand-parental segment transmission marginal and joint probabilities

- Compulsory parameters
  o explored chromosomes (id)
  o step length of the scan
  o minimum size of a full-sib family above which the dam effects (QTL and polygenic) are estimated
  o minimal probability for a paternal and maternal phase to be considered in the analysis
  o missing genotype value

The parameter file uses the format <key>=<value>. None of the characters after the character # are interpreted (useful to add comments).

(qtlmap --help-panalyse for more information)

    # USER FILES
    in_map=carte
    in_genealogy=genea
    in_genotype=typag
    in_traits=perf
    in_model=model

    # ANALYSIS PARAMETERS
    # analysis step in Morgan
    opt_step=0.1
    # minimal number of progeny by dams
    opt_ndmin=20
    # Minimal paternal phase probability
    opt_minsirephaseproba=0.80
    # overload
    opt_minsirephaseproba=0.90
    # Minimal maternal phase probability
    opt_mindamphaseproba=0.10
    # chromosome to analyse
    opt_chromosome=7
    # for several chromosomes
    opt_chromosome=7,8
    # missing phenotype marker value
    opt_unknown_char=0

    # OUTPUT
    out_output=./OUTPUT/result
    out_summary=./OUTPUT/summary
    out_lrtsires=./OUTPUT/sires
    out_lrtdams=./OUTPUT/dams
    out_pded=./OUTPUT/pded
    out_pdedjoin=./OUTPUT/pdedjoin
    out_pateff=./OUTPUT/pateff
    out_mateff=./OUTPUT/ma
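The <key>=<value> format with # comments can be read with a few lines of code. The following is a minimal sketch, not QTLMap's own reader: `parse_parameter_file` is a hypothetical helper, keys are assumed to be written with underscores (as in out_output), and letting a later assignment override an earlier one is an assumption suggested by the "overload" line of the example, not documented behaviour.

```python
def parse_parameter_file(text):
    """Parse '<key>=<value>' lines; everything after '#' is ignored.

    Hypothetical helper (not QTLMap code). Assumption: when a key is
    given twice, the last value wins.
    """
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected <key>=<value>, got: {raw!r}")
        params[key.strip()] = value.strip()
    return params


if __name__ == "__main__":
    example = """
    # ANALYSIS PARAMETERS
    opt_step=0.1            # analysis step in Morgan
    opt_ndmin=20
    opt_minsirephaseproba=0.80
    opt_minsirephaseproba=0.90
    opt_chromosome=7
    """
    print(parse_parameter_file(example))
```

All values are kept as strings here; converting opt_step to a float or splitting a chromosome list would be a separate step, left out because the guide does not spell out the value syntax for every key.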
16. the linkage group as explained in the previous paragraph 0vsiQ pwedbue elie Trait chromosome level genome level 55 1 O La 5 1 omis traitsim 231 39 27 40 PAS SS 28 44 2B iL 28 64 Box 26 Summary file from simulations 10 11 Detailed output of the LRT for simulations The file corresponding to the key out_max1rt of the parameter file contains the maximum LRT the corresponding position and linkage group for each simulation permutation For each analysed variable Y aheader Y foreach simulation o themaximum likelihood ratio test o the position and linkage group ofthe first QTL o the position and linkage group of the second QTL if 2 QTL hypothesis O Trait traitsimull LRTMAX HO H1 CHR Position Us TSS 1 0 4100 iis SLO iL 0 1100 JL7 OS 31 iL JL o BAO wur WEE iciceaLie Saini LRTMAX HO H1 CHR Position 8 9628 iL 0 LOO DID iL 1 0000 16 6090 1 omong Box 27 1 Detailed output of the LRT for simulations when comparing HO no QTL is segregating versus H1 1 QTL is segregating on the linkage group Trait traitsimull LRTMAX H0 H1 CHR Position LRTMAX H1 H2 CHR1 Positionl CHR2 Position2 RAS iL 0 4100 9 6459 iL 0 4100 1 1 2100 JL c SLO iL 0 1100 14 2922 1 0 1100 1 1 0100 JUT s 0 315 1L 1 JL 2 10 15 4039 1 0 3100 i 12100 i ege exeat LRTMAX HO H1 CHR Position LRTMAX H1 H2 Cail osiem Cain Pegiaiesom2 8 9628 al Oa VLOT 12 87 1 T510 i 1 6100 9 3228 iL 1 000
17. unordered When an animal has no phenotype for a marker both alleles must be given the missing value code as given in the parametrisation of the analysis see 6 2 SW552 SW64 CGA snpl snp2 enby4 2 5 9 3L 4 309 2X 3P 6 JA 9128092 826 5 4 139 JX rE X SZ 2 i i I9 5 X Tm Ee m 9229651 2 2 3 1 12 JL3 A Ww e 7A QA AS Zenon eS ele AAS ANG QUAE MACs 2 f i 5 I2 A ow wie 6 S GJLOYAL 2h 9 OlsAA AE A YSLGAS m sw qoi U ds AN WA e See 2 i3 9 0 0 0 X ARG T 9891197 2s O OW 12 4 ONG 9933968 2 2 3 Lig aA T AE 93399 9 7 JL b 39 4b e A 934 98 2 8 JL 5 12 4 AG Box 4 Example of a marker genotypes file In this example amongst the 5 grand parents 3 were genotyped 911714 912892 et 924758 For instance grand dam 911714 is heterozygous 2 5 at marker SW552 the individual 961925 has no genotype at marker mark1 etc 6 5 Performance file This file gives the phenotypes of the traits to be analysed The progeny performances only are considered in the analysis and must be given in the file For each animal its ID identical to the ID given in the pedigree file is followed by information about nuisance effects fixed effect levels covariate value and then by three items for each trait performance CD and IC In grand daughter designs CD is the square of the EBV accuracy In daughter designs CD indicates if CD 1 or not CD 0 the trait was measured for QTLMap 0 9 6 20 56 this a
18. were considered and the classes following the example given by Legarra and Fernando 2009 were simply defined QTLMap 0 9 6 16 56 by the haplotype IBS status to a class 6 corresponds a single haplotype To a given class 6 corresponds a specific effect ys on the quantitative trait The quantitative performance of a progeny depends on the haplotypes as found in the parental chromosomes from which the putative QTL alleles are originating and not to the possibly recombinated haplotypes the progeny itself is carrying Thus the trait expectation is given by E yp j hs hdi ui mij 2 p t ts t4 hs hdi Mj Y5 Fsi t Ya Feia ts tq Where ys is the effect of the class of tt haplotype 1 or 2 the sire i knowing its most probable phase hs hsi hsiz 4 5 2 Linkage Desequilibrium Linkage Analysis Association analyses suffer a lack of robustness to hidden structures Familial structures may be accounted for adding to the model description a random individual effect with a covariance matrix computed from pedigree or dense marker information see Teyssedre et al 2011 fora review More generally modelling both the association linkage desequilibrium and transmission linkage in a single Linkage Desequilibrium Linkage Analysis as been recommanded as a solution for a better control of first type errors Family Based Association Tests Abecassis et al 2000 and Mixed models including a random QTL effect M
19. 0 3 0 3 0 3 0 1 name nature discrete heritability numbr of Classes classes frequencies of the second trait correlation 0 0 between traits phenotypic correlation qtleffect 0 5 055 QTL effects Box 14 Example of a simulation parameter file to simulate data with no reference to existing trait data no missing phenotype structure 9 Simulate and design a new protocol p analysis opt step 0 01 apt chromosome 1 2 3 param sim in_paramsim QTLMap offers the possibility to simulate all the data markers genealogy traits in order to plan a new experiment The output file named by the out max1rt option in the following example provides the value of the LRT resulting from the simulations allowing an estimation of the power of the design To perform these simulations two specific sections must be created in the simulation parameters file in addition to the sections QTL and SIMULTRAITS previously described The first with the keyword MARKERS must give on a single line three items the marker density M the number of alleles marker the map size Morgan The second with the keyword GENEALOGY followed on the next line by the keyword F2 BC or OUTBRED depending on the type of population to simulate and a line giving three items the number of sires number of dam sire and number of progeny dam to simulate Example simulation of an F2 protocol with 10 sires 7 dams per sire an
20. 0 8 4281 JL 0 0100 iL Oo SLOW
16 6090 1 Oa VLOG 5 NIZE i 0 3100 1 0 4100

Box 27.2: Detailed output of the LRT for simulations when comparing H0 (no QTL is segregating) versus H2 (2 QTL are segregating on the linkage group), and H1 (1 QTL is segregating on the linkage group) versus H2 (2 QTL are segregating on the linkage group).

11 References

Elsen JM, Filangi O, Gilbert H, Le Roy P, Moreno C (2009) A fast algorithm for estimating transmission probabilities in QTL detection designs with dense maps. Genet Sel Evol 41:50.
Elsen JM, Mangin B, Goffinet B, Boichard D, Le Roy P (1999) Alternative models for QTL detection in livestock. I. General introduction. Genet Sel Evol 31:213-224.
Gilbert H, Le Roy P, Moreno C, Robelin D, Elsen JM (2008) QTLMAP, a software for QTL detection in outbred population. Annals of Human Genetics 72(5):694.
Gilbert H, Le Roy P (2007) Methods for the detection of multiple linked QTL applied to a mixture of full and half sib families. Genet Sel Evol 39(2):139-58.
Goffinet B, Le Roy P, Boichard D, Elsen JM, Mangin B (1999) Alternative models for QTL detection in livestock. III. Heteroskedastic model and models corresponding to several distributions of the QTL effect
21. 06666521254513 0 0642879011796014 0 255460347400393 0 189477060869665 0 25462868498086 gen2 0 127806826817031 0 163876647400758 0 0184043832497863 0 296146098377366 9 3 32 73152 0092 3 093 2 0 WSSASISSLOGY 2924 O 1 9 0 9 9012 2 117 590 0 51929 932 012 1E 0 3L VOL gen3 0 259405679027549 0 365184085691961 n a O LOAAOS 7 5 5 0 1L Ss 9L E4165 3 7 5 1 06 5 0 5 7 10 52 1 5 5 1L1 11 2 282092 7 9 3690169 36 32 059 90S O SAAS IS V 23219 S58 gen4 0 151093991655429 0 10964888434473 0 15832262904679 0 284848089326391 0 0808434990010986 0 306550168430082 0 00906573426897184 0 10731093171816 Box 6 Example of a expression quantitative trait values file In this example animal 6380 have a missing data for the gen3 6 6 Model file In this file the information on model for analysis of each trait is described v Number of traits v Number of fixed effects nf Number of covariates nc Y Names ofthe fixed effects and covariates v Name of the 1st trait nature of trait r for real value or i discrete ordered data model for this trait symbolized by 0 1 indicators for each fixed effect nf first indicators each covariate nc following and each interaction between the QTL and the fixed effects nf last QTLMap 0 9 6 21 56 indicators Fixed effect covariate or interaction will be included in the analysis if its indicator is 1 will not be if it is 0 Y Name of the 2nd trait
22. 6.1 Pedigree file
6.2 Population file (optional)
6.3 Marker map file
6.4 Marker genotypes file
6.5 Performance file
6.6 Model file
6.7 Parameter file
7 Run the software with the different running options for analyses
7.1 Option calcul: choice of the QTL analyses
7.2 Option haplotype: parental phase identification
7.3 Option snp: fast phasing in dense genotyping situations
7.4 Option qtl: number of QTL (detection available)
7.5 Option optim: optimisation method
7.6 Option disable sire qtl
7.7 Options ci & ci nsim
7.8 Options data transcriptomic & print allReport: output mode, eQTL analyses (to analyse transcriptomic data)
7.9 Options for the control of process information
8 Control of first and second type errors in existing designs
8.1 Simulations with respect of missing data structure
8.2 Permutations
8.3 Simulations without reference to data structure
9 Simulate and design a new protocol
10 Output files
10.1 Main output for phenotype analysis
10.2 Output for eQTL analyses
10.3 Analysis summary
10.4 Output of the LRT
10.5 QTL effect estimations output
10.6 Parental phase output
10.7 Offspring phases
10.8 Marginal probabilities of the parental chromosome transmission
10.9 Joint probabilities of the parental chromosome transmission
10.10
23. Example 3: use of the keyword TRAITS. Only traits third and malcor will be analysed.

    3                     Number of traits
    1 1                   Number of fixed effects and covariates
    sexe poids            Names of the fixed effects and covariates
    malade r 1 1 0        1st trait: nature (real value), model
    malcor r 0 0 1        2nd trait: nature (real value), model
    third r 0 0 0         3rd trait: nature (real value), model
    CORRELATION MATRIX
    9435 2G 0 29
    NAO O32 W628
    T20 Oo20 O55
    TRAITS third malcor

Box 9: Example 3 of a model file

Example 4: use of the keywords TRAITS and ALL to select a list of expression traits to be analysed. Here only the genes named gen3 and gen4 will be analysed.

    10000                 Number of traits
    i 3                   Number of fixed effects and covariates
    sexe cov1             Names of the fixed effects and covariates
    Jub yer db il
    TRAITS gen3 gen4

Box 10: Example 4 of a model file

6.7 Parameter file

All information needed to run an analysis is given in the parameter file (p_analyse):

- File paths and names
  o input files: pedigree (cf. 6.1), population (origin of parents or grandparents, cf. 6.2), marker map (cf. 6.3), marker genotypes (cf. 6.4), trait performances (cf. 6.5)
  o file giving the performances model (cf. 6.6)
  o output files: full information analysis result file, summary of the analysis, sire and dam family likelihood ratio test (LRT) along the linkage group, sire and dam QTL effect estimations along
24. 229 0 299 0 025 0 3951 0 050 0 000 A 87 P012666 5 Ll O89 5 TLES id 98S 5a 363 Or ope O 344 0 076 O 249 0 82 5 0 104 QO 304 0 027 0 000 A S7 POLOOS 5 1 109 5 1 5199 17251097 6 642 Og LT 01L 07026 0 013 3L9 9 05100 0 5205 OO SO 0 000 Profile Hypothesis 2 Given parameters are respectively Gene position on the array Chromosome 1 QTL Position 1 Chromosome 2 QTL Position 2 H0 H2 H1 H2 std dev GMB11940 GMB11945 General Mean Sire QTL effects 1 Sire EMI 113L LX NC L2 01 Saas EMELIOAS Jil ewe LACNG I vae Qn exiseveds 2 Sais GMB11940 11 AGCC GTTC Sire GMB11945 11 ATTC AGCC Sire polygenic effects Sire GMB11940 Sire GMB11945 SEGUE QUOD OMS OHO S eg OOE ill 5 SHO 6 904 0 281 Or SHEG 0 057 9oi37 y cibi 0 105 0 155 0 009 0 000 A 57 19023977 5 QEON Osta OT MESI GET 0 601 0 9 9U 054 9 1597 0 028 4 226 0 087 5 0350 0 000 Box 17 3 eQTL report under the alternative hypothesis of two QTL QTLMap 0 7 48 56 10 3 Analysis summary In the file SUMMARY parameter file key out summary several sections are given summarising the analysis For each hypothesis H0 0 qtl H1 1 QTL H2 QTL For each analysed variable by line v Number of genotyped progeny with phenotypes for the trait Y Maximum likelihood ratio Y QTL most likely position s chromosome of each QTL position of each QTL on the chromosome Y foreach sire o Estimation ofthe QTL
25. 9 202 OG 00 Ad Bs 3 2 BE 34 09 35 47 03 00 00 00 2 1 oO 32 22 34 10 04 00 00 00 00 2i O10 32 5923 205 00 00 00 00 00 AT s VO Box 20 2 LRT file grid 2 QTL 10 5 QTL effect estimations output The following key should be defined in the parameter file to output the QTL effect estimations along the linkage group under hypothesis of one QTL segregating out pateff out mateff o Sire QTL effects out pateff For each tested position the file contains for each tested position on all tested chromosomes The chromosome tested position the sire 1 QTL effect estimation the sire 2 QTL effect estimation KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKK KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKK Che POS SALONO 910045 910081 910088 0 010 d 0 14 zi 3 3 0 02 0 020 0 24 OAS SOR 4 EXON 0 030 zc 24 0 55 0M O Q1 0 040 0523 1 0c 1 5 0 QS 0 050 0 22 Mle 015 0 05 0 060 019219 10 5 16 105 0G 0 070 05 213 Woy 0 16 008 0 080 0 28 o 7 e 0 Chrl Chr2 Posl Pos2 910001 0Qtl 1 910001 0t1 2 910045 0t1 1 910045 0t1 2 910081 Qt1 1 910081 Qt1 2 910088 Qt1 1 910088 Qt1 2 0 010 0 020 057 0 04 0 151 0 04 0 57 0 04 O57 0 04 omoa 0 030 0 24 0 04 0 24 0 04 0 24 0 04 0 24 0 04 0 010 0 040 oma 0 04 omas 0 04 omii 0 04 omis 0 04 0 010 0 050 0 14 0 04 0 14 0 04 0 14 0 04 0 14 0 04 0 010 0 060 0 14 0 04 0 14 0 04 0 14 0 04 0 14 0 04 0 010 0 070 0 14 0 03 0 14 QS 0 14 QS 0 14 QS
26. Here the i line and column of M M and L L were suppressed when Lj lt QTLMap 0 9 6 15 56 viduals not reaching the end point during the recording period Truncated data are also pre sent for individuals without point of origine Different approaches were developped for the analysis of such information including parametric the Weibull regression model of Kalbfleisch and Prentice 1980 and semi parametric the Cox model Cox 1972 models Mo reno et al 2005 extended those models to QTL detection QTLMap offers the Cox model for QTL detection In the extension of Moreno et al 2005 the Cox model is approximated to make computations feasible This model is developped only within the full mixture framework still assuming that the sire phase is known npij dj II IL Yd p hd hs Md IT 2 Xe s p tij ts ta hs hdij Mj Q XikB Git Aijt Only uncensored individuals are considered in the ijk list possibly reducing the numbers ns ndi np to ns nd np The penetrance function o Xijy B ait Aijt 0i weights the risk observed for the ijk individual dying at a time yp by the mean risk of individuals still alived at this date This weighting gives exp iui jas Yi Lie Lhdig pch hs Mj x Eniro Yay ere Dj hsi hdi j Mie exp oett uus J With V pue XiB C7 a 1 99a Y R ypijy the set of individuals known to be alive just prior to time ypj j These indi viduals
27. ITS keyword starting the second section is mandatory to simulate phenotypes. On the next line, the number of traits to be simulated is given. Then:
- for continuously distributed traits: the name of each trait to simulate (as referenced in the model file), with one line per trait;
- for discrete traits: the name of each discrete trait to simulate (as referenced in the model file), with one line per trait, followed on the same line by the trait heritability, the number of modalities and the frequency of each modality.

Only if one or more QTL are to be simulated: after the keyword qtleffect, the QTL effects (real values) are listed in the order trait1 QTL1, trait1 QTL2, trait2 QTL1, trait2 QTL2.

Example 1: a parameter file for the estimation of the rejection threshold of the test "There is one QTL on the linkage group" versus "there is no QTL", with the corresponding model file.

TRAITS      keyword
2           number of traits to simulate
imf         name of the first trait as given in the model file(s) BOREIA
bardiere    name of the second trait as given in the model file(s) xox i122

Box 12.1: Example of simulation parameter file with no QTL effect

2 Ong nofix nocov aime xe O 9 bardiere r 0 0 0

Box 12.2: Corresponding example of model file

Example 2: a parameter file for the estimation of the rejection thresholds for the test "There are two QTL on the linkage group" versus "there is one QTL", at the position 0.6 Mor
28. Jo 7 O TET When this probability exceeds the standard threshold corresponding to the 5 1 or 0 1 Pent level you might consider removing this effect from the model Box 16 13 Example of main output file information fourth section Analysis under H1 Marker informativity at the maximum likelihood estimation Q lt imtormatcivity lt I OTL 1 position 0 6700 Chromosome number tested 1 Chromosome 7 Position number tested 67 169 Left marker SLA position 0 6200 Right marker 0102 position 0 7400 KKK Sire 910001 Informativity 0 990 Haplotype 2121 4492 KKK Sire 910045 Informativity 0 894 Haplotype 3651 5432 ckck Sire 910081 Informativity 0 940 Haplotype 21522 61243 QTLMap 0 7 45 56 ckck Sire 910088 Informativity 0 Oils Haplotype 2821 4444 ALlalie origa tor 91000 Chromosome 7 known Allelic origin for 910045 Chromosome 7 known Allelic origin for 910081 Chromosome 7 known AMILALS Lave ueibepuagv Cor LOIS Chromosome 7 known NOTE known allelic origin means QTL effect maternal paternal allele effects Box 16 14 Example of main output file information fourth section Analysis under H1 e Confidence intervals For each detected QTL confidence interval estimated within sire family or globally by Boostrap Drop Off or Hendge and Li methods are reported QTL ID Position method Length of confidence i
29. OF PHENOTYPED AND GENOTYPED ANIMALS 236 NUMBER OF TRAITS E x NUMBER OF FIXED EFFECTS 0 NUMBER OF COVARIABLES i 0 TRAIT Bardie NUMBER OF PHENOTYPED PROGENY 8 236 EANS 7oi16 0650 INIMUM 8 6 048 AXIMUM 9 668 NUMBER OF MISSING PHENOTYPES t 0 NUMBER OF CENSORED PHENOTYPES g 0 WITHOUT MODEL for fixed effects and covariables

Box 16.4: Example of main output file information, first section: simple statistics about performance traits data

o The second section informs about preliminary steps of the process.

- Description of parental phase reconstruction

This information about parental phases is fully given in the file specified after the heading "parental phases". This file lists the most probable phases of the sires and of the dams (if available from the analysis), built from the available marker and pedigree information. Remember that the parameters controlling the minimal sire and dam phase probabilities can be reset by the user with the keys opt_minsirephaseproba and opt_mindamphaseproba in the parameter file.

******************** PARENTAL PHASES ********************
FILE OUTPUT phases
*********************************************************

Box 16.5: Example of main output file information, second section: parental phase information

- Description of data structure

A header e
30. Outputs for simulations 54
10.11 Detailed output of the LRT for simulations 55
11 References 56

1 Introduction

QTLMap is a software package dedicated to the detection of QTL from experimental designs in outbred populations. It is developed at INRA (French National Institute for Agronomical Research). The statistical techniques used are linkage analysis (LA), linkage disequilibrium analysis (LD) and linkage disequilibrium linkage analysis (LDLA), all using interval mapping. Different versions of the LA are proposed, from a quasi Maximum Likelihood approach to a fully linear (regression) model. The LDLA and LD analyses are regression approaches (Legarra and Fernando, 2009). The population may be sets of half-sib families or mixtures of full- and half-sib families, in daughter or grand-daughter designs. The computations of phase and transmission probabilities are optimised for speed (Elsen et al, 2011; Favier et al, 2010). QTLMap is able to deal with large numbers of markers (SNP) and traits (eQTL). The QTLMap sources (Fortran language) are freely available.

Up to now, the following functionalities have been implemented:
- Daughter or grand-daughter design
- QTL detection in half-sib families or mixture of full- and half-sib families
- One or several linked QTL segregating in the population
- Single trait or multiple trait analyses
- Nuisance parameters (e.g. sex, batch, weight) and their interactions with QTL can be incl
31. QTLMap 0 9 6 User s guide 21 06 13 QTLMap 0 9 6 1 56 A A WND Introduction 4 Contributors 4 Support 4 Theoretical background 5 1 The basic model 5 4 1 1 Information 5 4 1 2 Statistical formulation 6 4 1 3 Simplified family structure 7 4 1 4 Computation of elements 8 4 1 4 1 Parental phases 8 4 1 4 2 Parents to progeny transmission 8 4 1 4 3 Penetrance 8 4 1 5 Optimisation 9 4 1 6 Conclusion about the basic model 9 4 2 Alternative formulations of the likelihood 9 4 2 1 Full linearisation the regression model 9 4 2 2 More complex models 11 4 3 Alternative genetic hypotheses 11 4 3 1 Assuming more than one QTL 11 4 3 1 1 Two linked non interacting QTL 11 4 3 1 2 Two linked epistatic QTL 12 4 3 1 3 Any number nq of unlinked QTL 12 4 3 2 Modeling the polygenic background not yet fully available in QTLMap 12 4 4 Alternative penetrance functions 13 4 4 1 Unitrait uniQTL situations 13 4 4 1 1 Nuisance effects 13 4 4 1 2 Non gaussian models 15 4 4 1 2 1 Discrete traits 15 4 4 1 2 2 Survival and Time to events phenotypes 15 4 5 Accounting for linkage desequilibrium in the parental generation 16 4 5 1 Association analysis 16 4 5 2 Linkage Desequilibrium Linkage Analysis 17 4 5 3 Thehalf sib case 17 Setting up QTLMap 18 Pre requisites 18 Compilation 18 OpenMP support 18 NVIDIA GPU acceleration support 18 Input files
32. TH/qtlmap p_analysis --calcul=1 -v

When debugging the software, add -d or --debug to the command:

$QTLMAP_PATH/qtlmap p_analysis --calcul=1 -d

To avoid output, add -q or --quiet to the command:

$QTLMAP_PATH/qtlmap p_analysis --calcul=1 -q

8 Control of first and second type errors in existing designs

Simulations or permutations can be run to empirically estimate the rejection thresholds of the test statistic and to measure the detection power of an existing design. To perform these estimations, keywords must be defined in the parameter file. The simulation parameters file name is given in the analysis parameter file with the key in_paramsimul (not needed for permutations). A second, optional key, out_maxlrt, specifies the name of a file reporting the maximum likelihood ratio test values for the simulations or permutations.

Sections 8.1 and 8.2 describe how to compute empirical distributions of the test statistics while accounting for the missing data structure on phenotypes existing in the data set. Section 8.3 proposes an alternative: simulating data for all progeny, independently of the recorded traits.

8.1 Simulations with respect of missing data structure

When no QTL is simulated (null hypothesis: no QTL on the linkage group), all needed information (heritability and correlations) is provided in the model file. Under H1 or H2 a specific file (simulation parameter file) defined b
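The threshold and power estimation described in section 8 can be sketched as follows. This is only a minimal illustration, assuming the maximum LRT values of the simulations or permutations have already been read into Python lists; the function names `empirical_threshold` and `empirical_power` and the quantile convention are assumptions for the example, not part of QTLMap.

```python
import math


def empirical_threshold(max_lrts, alpha=0.05):
    """Return the value exceeded by at most a share alpha of the simulated maxima.

    max_lrts: maximum LRT values obtained under the null hypothesis
              (one value per simulation or permutation).
    """
    if not max_lrts:
        raise ValueError("at least one simulated maximum LRT is required")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(max_lrts)
    n = len(ordered)
    # Number of simulated values allowed to lie strictly above the threshold.
    # The small epsilon guards against floating-point results such as 4.999999.
    n_above = int(math.floor(alpha * n + 1e-9))
    rank = max(1, n - n_above)
    return ordered[rank - 1]


def empirical_power(max_lrts_alt, threshold):
    """Share of maxima simulated under H1 (or H2) that exceed the threshold."""
    if not max_lrts_alt:
        raise ValueError("at least one simulated maximum LRT is required")
    return sum(value > threshold for value in max_lrts_alt) / len(max_lrts_alt)
```

With 100 null simulations whose maxima are 1, 2, ..., 100, the 5% threshold under this convention is 95, since exactly five values lie above it.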
33. are denoted $k$, with $k \in R(yp_{ijk})$.

4.5 Accounting for linkage disequilibrium in the parental generation

4.5.1 Association analysis

In the basic model it was assumed that all loci (marker loci as well as QTLs) were in linkage equilibrium in the parents: the allele carried by a chromosome at a locus is independent of the allele the same chromosome carries at any other locus. However, this hypothesis is not sustainable at very short distances: it is now well known that, for various reasons (mutation, migration, selection, drift), the alleles carried at two close loci are not independent. This was clearly demonstrated for marker loci (e.g. Farnir et al, 2000, in the bovine species) and is certainly true between QTLs and very close marker loci. This disequilibrium between allele frequencies justifies the so-called Association or Linkage Disequilibrium Analyses (LDA). In their simplest form, these methods consider the population as a set of unrelated individuals and test the direct effect of genetic information (allelic, genotypic or haplotypic effect) on the quantitative trait variability.

QTLMap being dedicated to experimental populations characterized by a family structure, the LDA Decay approach described by Legarra and Fernando (2009) was implemented. In this approach, parental haplotypes are pooled in classes, the classification being open to the user decision. Here only the most probable sire and dam phases hs hdj
34. c print all permut nsim disable sire qtl E qe ci nsim gt 7 1 Option calcul choice of the QTL analyses Option calcul allows to perform LA LD or LDLA analysis using a Gaussian distribution for one trait For all these analyses the variance within sire families can be considered identical or heterogeneous between families homoscedastic heteroscedastic For LA only additional models are available joint analysis of several traits either considering a multivariate gaussian distribution or using a discriminant analysis approach censored analysis using a cox model QTLMap 0 9 6 271 56 Available options for calcul specific key in parame ter file Type of Lees QTL residual Tat dicitibation within dam nuisance prohibited variance likelihood effect running options ideal Analysis unitrait 10r2QTL Within sire gaussian mixture analysis No unitrait 1 QTL Within sire gaussian discrete mixture analysis possible unitrait 1 n QTL general gaussian linear approximation possible unitrait 1 n QTL Within sire gaussian linear approximation possible multitrait multivariate analysis multitrait discriminant analysis 1 QTL Within sire gaussian mixture analysis no 1 QTL Within sire gaussian mixture analysis unitrait 1 QTL no variance Cox model approximation possible unitrait 1 n QTL Within sire gaussian discrete mixture analysis possible
35. cts 1 Sire GMB11940 11 GCCC ACCC Sire GMB11945 11 ACTC ACCC Sire polygenic effects Sire GMB11940 Sire GMB11945 note 0 0 means not estimable zx 97 8032595 5 0 1 99 6 622 ESO 0 97 6 0 063 1 430 OOZ i 19 0 000 AS POLULQOS 5 lo3469 234555 0o db 777 OLEO 0027 0 001 0 086 O OZY 0 000 Box 17 2 eQTL report under the alternative hypothesis of one QTL Under the alternative hypothesis 2 QTL segregating on the linkage group s columns are the gene position on the expression array as indicated in the eQTL performance file the chromosome where the first QTL is detected the position of the first QTL the chromosome where the second QTL is detected the position of the second QTL the LRT for the test HO H2 the LRT for the test H1 H2 the standard deviation of the distribution the mean the sire QTL1 effect for each sire the sire QTL2 effect for each sire the sire familial polygenic effects Profile i 1 Hypothesis 2 Given parameters are respectively Gene position on the array Chromosome 1 QTL Position 1 Chromosome 2 QTL Position 2 H0 H2 H1 H2 std dev GMB11940 GMB11945 General Mean Sire OTL effects 1 Sire GMB11940 11 TTGA CAAA Sire GMB11945 11 TTGA CAAG Sire QTL effects 2 Sire GMB11940 11 CGC4 TGC4 Sire GMB11945 11 CGG4 CGC4 Sire polygenic effects Sire GMB11940 Sire GMB11945 note 0 0 means not estimable KSI 20925 7 5 ORSS NE 9 212 Wil PZ E 0 3 7 ORAS om 0 056 080
36. d 12 progeny per dam (total 840 progeny), with one chromosome and 101 SNP evenly distributed on 100 cM, and a QTL located at position 70.5 cM. The QTL is not fixed in the grand parental population. Two real traits are simulated with correlation 0.4.

MARKERS
Dude A                      marker density (M), number of alleles/marker, map size (Morgan)
GENEALOGY F2                type of design
Jg y 12                     number of sires, number of dam/sire and number of progeny/dam
QTL 1
position 0.705
chromosome 1
frequency ON
SIMULTRAITS 2
Samereaicil xe 025
Gabe i 10 35
Comieeilerestoim dp O 4 jJ
qtleffect 0.1 0.5

Box 15: Example of a simulation parameter file to simulate a completely new protocol (F2)

10 Output files

A set of files is proposed as the result of an analysis or a simulation:
- The main output (analysis report, simulation report)
- A summary

Additional files (optional) for reporting analyses:
- Likelihood ratio test profile (per sire family, per dam family, general)
- QTL effect estimation at each tested position (sire family and dam family)
- Parental phases
- Alleles frequencies
- Haplotypes assigned from parents to progeny
- Parental segment transmission marginal probabilities
- Parental segment transmission joint probabilities

Specific files (advanced users):
- Coefficients of the discriminant analysis along the linkage group

Additional file (optional) in a simulation p
37. dered different in the LD QTL analysis.

912697 lume 902206 rom 924758 lac IL THEE O QZARS I jee JLS SZI evox

Box 2: Example of a population file

6.3 Marker map file

This file gives the locations of the markers on the chromosome(s). Each line corresponds to a single marker and gives, in the following order:
- marker name (alphanumeric)
- name of the chromosome carrying the marker (alphanumeric)
- position of the marker on the average map (in Morgan)
- position of the marker on the male map (in Morgan)
- position of the marker on the female map (in Morgan)
- inclusion key (=1 if the marker has to be included in the analysis, 0 if not)

S SW552 Lt 9506 0 05 05 09 1 Swo4 JL OSA Wea O52 10 CGA J ode 9 di 0o 95 L snpl LS 9550 O 37 9559 1 snp2 15 9559 OW 49 19 55 1

Box 3: Example of a marker map file

In this example, marker SW552 is on chromosome 1 at position 0.08 on the average map, 0.05 on the male map and 0.09 on the female map, and will be included in the analysis of chromosome 1, etc.

6.4 Marker genotypes file

This file contains the animals' phenotypes at the markers. The first line gives the marker names (the markers must belong to the marker map file). For each animal, a line gives its ID (as described in the pedigree file) followed by the marker phenotypes, ranked following the order of the first line. Each phenotype is made of 2 alleles
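The six-field layout of a marker map line described in section 6.3 can be illustrated with a small reader. This is a hedged sketch, not QTLMap code: the names `MapEntry`, `parse_map_line` and `included_markers` are hypothetical, fields are assumed to be separated by whitespace, and only the SW552 row is taken from the guide (the second marker in the usage note is made up).

```python
from typing import NamedTuple


class MapEntry(NamedTuple):
    name: str            # marker name (alphanumeric)
    chromosome: str      # chromosome name (alphanumeric)
    pos_average: float   # position on the average map, in Morgan
    pos_male: float      # position on the male map, in Morgan
    pos_female: float    # position on the female map, in Morgan
    included: bool       # inclusion key: 1 = analysed, 0 = ignored


def parse_map_line(line):
    """Parse one whitespace-separated marker map line into a MapEntry."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    name, chromosome, average, male, female, key = fields
    if key not in ("0", "1"):
        raise ValueError(f"inclusion key must be 0 or 1, got {key!r}")
    return MapEntry(name, chromosome, float(average), float(male),
                    float(female), key == "1")


def included_markers(lines, chromosome):
    """Return the included markers of one chromosome, ordered on the average map."""
    entries = [parse_map_line(line) for line in lines if line.strip()]
    selected = [e for e in entries if e.included and e.chromosome == chromosome]
    return sorted(selected, key=lambda e: e.pos_average)
```

For instance, `parse_map_line("SW552 1 0.08 0.05 0.09 1")` yields a marker on chromosome 1 at 0.08 Morgan on the average map that is flagged for inclusion, whereas a made-up line such as `"mk2 1 0.02 0.02 0.03 0"` would be dropped by `included_markers`.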
38. dited for each trait with the number of QTL effects to be estimated KKK KKK KKK KKK KKK KKK KKK kk Sk kk ko ko KKK KKK KK Amalysias Ot trat Bardiere kk ck ck ck ck kk ck kk kk ck kk ck kk ck kk ko kk kk ck kckck kk kk k LRT profile on the linkage group POSTERO ESSES EAEG EC 4 sire QTL effects 16 dam QTL effects Box 16 6 Example of main output file information second section quality of parameters estimations As the design may be unbalanced leading to strong colinearity between QTL effects and some other effects in the model a warning is provided if this situation occurs The confusion is measured by the correlation between the columns of the incidence matrix in an equivalent fully linear model at the starting position of the scan a warning is edited if this correlation exceeds opt_eps_confusion A second test of confusion between the QTL effects and the estimable effects finally kept in the model is edited Test of confusion between QTL and other effects in the initial full model test based on the correlation between columns of the incidence matrix Ck Ck ck ck ck ck kk ck kk ck kk ck kk ck kk Ck kk Ck Sk ck kk Ck Ck Sk ck kk Ck kk kk ck kk ck kk kk ok Ck kk kk ck kk ck kk Sk kk ck kk Ck ko Sk Sk kv Sk kk ko ko ko kokock Confusion between QTL and other effects final constained model No confusion detected the highest correlation is 0 257 QTLMap 0 7 40 56 kk c
39. e genotype in_model model in_paramsimul param_sim_real out maxlrt O0UTPUT all simul opt max iteration linear heteroscedastic 30 opt eps recomb 0 05 out phases OUTPUT phases opt_nb_haplo_prior 200 opt_prob_haplo_min 0 00001 opt_long_min_ibs 4 opt_longhap 4 opt_optim_maxeval 1000000 opt_optim_maxtime 1000000 opt_optim_tolx 0 00005 opt_optim_tolf 0 00005 opt_optim_tolg 0 00005 opt_optim_h_precision 0 00005 Box 16 1 Example of main output file information first section data characteristics e Description of the genealogy Number of parents grand parents and progeny are given KKKKKKKKKKKKKKKKK GENEALOGY DESCRIPTION K RK KKK KKK The pedigr file includes 20 parents born from 5 grand sires and 5 grand dams and 236 progeny born from 4 sires and 16 dams Box 16 2 Example of main output file information first section simple statistics about genealogy e Description of the markers information QTLMap 0 7 38 56 Charateristics of marker data read in input files are given Y Number of genotyped individuals Y Number and names of the genetic markers of alleles of each marker allele frequencies Y If unbalanced allelic segregations are observed for all markers the deviation to 0 5 of heterozygous frequency in the offspring of heterozygous sires is tested with a Fisher test a warning about potential transmission distorsion for the marker within the
40. e family. In the full likelihood, the element $L^1_{ijk}$ is

$$L^1_{ijk} = \sum_{t_s=1,2} \sum_{t_d=1,2} p\left(t^x_{ijk} = (t_s, t_d) \mid hs_i, hd_{ij}, M_i\right) \varphi\left(yp_{ijk};\ \mu + a_i^{t_s} + a_{ij}^{t_d},\ \sigma_i^2\right)$$

with
- $\varphi(y; u, \sigma^2)$ a normal density with mean $u$ and variance $\sigma^2$
- $\mu$ the fixed general mean
- $t^x_{ijk} = (t^x_{ijk,s}, t^x_{ijk,d})$ the vector of transmission events (1 or 2) from the sire and dam to the progeny, i.e. from which parental chromosome originated the x segment transmitted to ijk
- $a_i^{t_s}$ (resp. $a_{ij}^{t_d}$) the effect of the within sire (resp. dam) $t_s$-th (resp. $t_d$-th) QTL allele
- $\sigma_i^2$ the within sire residual variance

In the following, the double summation $\sum_{t_s=1,2} \sum_{t_d=1,2}$ will be summarized by $\sum_{t_s,t_d}$. It must be emphasized that the $a_i^{t_s}$ and $a_{ij}^{t_d}$ effects include both the sire and dam QTL effects and polygenic deviations to the general mean. In the alternative parametrization (Soller and Genizi, 1974), those effects would have been replaced by $\mu_i \pm \frac{1}{2}\alpha_i$ and $\mu_{ij} \pm \frac{1}{2}\alpha_{ij}$.

In QTLMap this part of the likelihood is linearly approximated by $L^1_{ijk} \approx \varphi(yp_{ijk}; \mu^x_{ijk}, \sigma_i^2)$ with

$$\mu^x_{ijk} = \mu + \sum_{t_s,t_d} p\left(t^x_{ijk} = (t_s, t_d) \mid hs_i, hd_{ij}, M_i\right) \left(a_i^{t_s} + a_{ij}^{t_d}\right)$$

As described in Mangin et al (1999), this approximation allows a much faster computation of the likelihood, with marginal losses of power and of parameter estimation precision (this last point not being true when the number of markers is very limited: 3 in the Mangin et al, 1999, paper). A likelihood ratio test (LRT) compares this L1 likelihood under the H1 hypothesis to the L0 likelih
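The difference between the mixture element and its linear approximation can be made concrete for a single progeny. The sketch below is an illustration only, under these assumptions: transmission probabilities are supplied as a dictionary keyed by `(ts, td)`, allele effects as dictionaries keyed by 1 and 2, and all names (`mixture_density`, `linear_density`, and so on) are hypothetical, not QTLMap identifiers.

```python
import math


def normal_pdf(y, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((y - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_density(y, probs, mu, a_sire, a_dam, var):
    """Full mixture element: weighted sum of densities, one per transmission event."""
    return sum(
        p * normal_pdf(y, mu + a_sire[ts] + a_dam[td], var)
        for (ts, td), p in probs.items()
    )


def linear_density(y, probs, mu, a_sire, a_dam, var):
    """Linear approximation: one density centred on the expected genotypic mean."""
    mean = mu + sum(p * (a_sire[ts] + a_dam[td]) for (ts, td), p in probs.items())
    return normal_pdf(y, mean, var)


if __name__ == "__main__":
    # Made-up values for one progeny with an uncertain sire transmission.
    probs = {(1, 1): 0.6, (1, 2): 0.0, (2, 1): 0.4, (2, 2): 0.0}
    a_sire = {1: 0.3, 2: -0.3}
    a_dam = {1: 0.1, 2: -0.1}
    print(mixture_density(7.5, probs, 7.2, a_sire, a_dam, 0.25))
    print(linear_density(7.5, probs, 7.2, a_sire, a_dam, 0.25))
```

When the transmission is known with certainty (one probability equal to 1), both functions return the same value; they diverge as the transmission probabilities become less informative, which is consistent with the remark about very sparse marker maps.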
41. e grand parental pedigree information.

Each line is made of an alphanumeric ID triplet (individual, sire, dam). A fourth field gives the generation number: 1 for the parental generation, 2 for the progeny generation. An animal missing one or both parent IDs must not be included in the file. The missing value code given in the parameterization of the analyses (see 6.2) cannot be used in the pedigree file. When an animal is a parent in one sire family and an offspring in another, it has to be duplicated in the pedigree file: one line with generation 1 and another with generation 2.

QPIOIGSNEO NIE OM DIG 944547 924758 911714 QUA GAOL IS 9449959229619592
961924 922961 944547
961925 922961 944547
961926 922961 944547
963187 922961 944985
963188 922961 944985
963189 922961 944985
MN NN NY BO Bg Ee

Box 1: Example of a pedigree file

In this example, the pedigree includes 7 progeny born from 1 sire and 3 dams. Sire 922961 is the son of sire 911287 and dam 902206, etc. The ID 944985 is both a dam and an offspring, so it is duplicated with generation numbers 1 and 2.

6.2 Population file (optional)

This file gives the population category of parents or grand parents (breed, strain...):
- First column: parent ID
- Second column: name of the population category

This information is used to determine different origins of parental haplotypes. Identical haplotypes coming from different origins will be consi
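The pedigree rules above (four fields per line, generation 1 or 2, parents listed on their own generation-1 line) can be checked with a short script. This is a sketch under stated assumptions: fields are whitespace-separated, every sire and dam of a generation-2 animal is expected to have a generation-1 line, and the function names `parse_pedigree` and `parents_without_line` are hypothetical. The sample rows in the usage note reuse IDs quoted in the guide, but the file itself is made up.

```python
def parse_pedigree(lines):
    """Parse pedigree lines into (animal, sire, dam, generation) tuples."""
    records = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
        animal, sire, dam, generation = fields
        if generation not in ("1", "2"):
            raise ValueError(f"generation must be 1 or 2, got {generation!r}")
        records.append((animal, sire, dam, int(generation)))
    return records


def parents_without_line(records):
    """Return the parents of generation-2 animals that lack a generation-1 line."""
    generation_one = {animal for animal, _, _, gen in records if gen == 1}
    needed = set()
    for _, sire, dam, gen in records:
        if gen == 2:
            needed.update((sire, dam))
    return sorted(needed - generation_one)
```

For example, with the lines `922961 911287 902206 1`, `944547 924758 911714 1`, `961924 922961 944547 2` and `963187 922961 944985 2`, the check reports `944985`, because that dam has no generation-1 line of its own in this made-up file.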
42. e matrices extended to the whole set of sires V j 1 V the total covariance matrix The linear model is yp Xpt W ace Parameters maximising the likelihood can be obtained in an iterative two steps procedure At iteration 1 Step 1 Solving the linear system D Yy Yyy 1 Yyy Pu XVX XV qgW X V gyp ER W V aX WVW AW V yp gt Step 2 Estimating the within sire family variances Gis yp Xia WI yp XBox WH cs1 7 The steps are repeated until convergence detected for instance when B e Bo lt e o ll lt l e 02 41 0 lt E QTLMap 0 9 6 14 56 As estimability of a and B elements varies with the tested location x this information is checked for each x before likelihood estimation This information initially developped for the fully linear model is also used in the Within sire dam regression to avoid numerical difficul ties To check estimability parameters the incidence matrix M corresponding to the linear model yp Xp W a e M 0 ce is built at each location x choising as an order for the elements of 0 i the QTL effects a first ii the parental means u and iii the nuisance factors B A Cholesky decomposition of the incidence matrix is performed eliminating M elements corresponding to parameters linearly dependant on previously considered ones 4 4 1 2 Non gaussian models 4 4 1 2 1 Discrete traits Ordered qualitative phenotype
43. ecision available only for calcul 2 3 4 estimated from the diagonal element of the incidence matrix inverse in an equivalent fully linear model the lower the better Parameters estimation with estimability information and precision indicators are listed for General mean Nuisance fixed and covariates factors not for calcul 1 5 and 6 Sire QTL effects Dam QTL effects if their family size is over OPT_NDMIN Sire polygenic effects Dam polygenic effects if their family size is over OPT_NDMIN The mean of the absolute values of QTL effect obtained at the maximum LRT is edited SS RN Estimation of parameters under H1 Within sire standard deviation sire 91000 gol 0 492 eire OO Gl 0 564 sire 910081 s d 0 568 sire OSEE II 07582 parameter estimable value precision General Mean yes 7 524 OO SIS Sire QTL effects Sire 910001 1 yes 0 164 0 021 Sire 910045 1 yes 0 140 0 028 Sire 910088 1 yes 0 226 O O22 Dam OTL effects Dam 910014 1 yes 0 30 0 049 Dam 910002 1 yes 0 482 0 050 Dam 910010 1 yes 0 214 OS Sire polygenic effects Sire 910001 yes Q 676 0 067 Sire 910045 yes 0 440 ROS Sire 910088 no Dam polygenic effects Dam 910014 Sire 910001 yes 0 049 0 069 Dam 910002 Sire 910081 yes 0 164 PONES Dam 910010 Sire 910081 ves G 21777 0 07 2 NOTE known allelic origin means QTL effect maternal paternal allele effects KKK The mean of absolute value of substitution effect WQ
44. effect s o Within sire family standard deviation o Significance of each QTL effect s based on a Student test sign significant ns not significant na not available too limited offspring size PKCkckckckckckckckckckckckckckckckckckckckckckckckckckckckckckckckckckckckckckckckckckokckckckckckckckokckckckokckck ck ckckckckckck ck ck ok ckckckckckckckckckckckckckckckckck Summary 0 QTL versus 1 QTL Variable N Max Lik Pos M Sine 910001 910045 OU SUO Che 1I Posi eff1 SD sigl efti SD gigi bardiere 236 45 2 Hs Oo 2 008 OO SEIS TEES EY 0 119 0 560 sign imf 236 43 7 1 De 0 156 0 338 sign 0 187 0 426 sign EREEREER DL ener scored d c c EE Summary 0 QTL versus 2 QTL 1 QTL versus 2 QTL Variable N Max Lik Pos M Sire 910001 910045 0 20 mm 1 20TL Chr 1 Posl Chr 2 Pos2 effl effz SD Sigi sig effi eff2 SD sigl sig2 hardier 2236 S70 TIS 3 Oa 3d 0 148 0 082 0 481 sign sign SO 226 a V60 D SA SigA SigA imf 29465 AMS 5 6 4959 Gk NT ANd ANS Why Sisk Sign Sign oaks 27 0 252137 Sign fab ES US US cb is TRUS Cbs Us ae ce is Tee L 1 il lll l Summary 0 QTL versus 3 QTL 1 QTL versus 3 QTL 2 QTL versus 3 QTL Variable N Max Lik Pos M Sine 910001 910045 0 3QTL 1 3QTL 2 3QTL Chr 1 Posl Chr 2 PoS2 eue 3 BOSS etri eff2 efr3 SD sigl sig2 s193 effi cff2 eff3 50 sigl sig2 s193 bardiere 236 635 13 8 6 9 il Ore i 0 8 1 ibat 0 340 W266 0 006 0 480 SS EE sign ns 05211 O 528 Oo27i 05593 Sig
45. ermutation case v Maximum Likelihood Ratio Test and its position for each simulation permutation 10 1 Main output for phenotype analysis The main output files comprises five sections o Thefirst section describes the data as read by the software e Description of the parameters file The name of the corresponding file is provided by the user with the key out_output in the parameter file The list of runtime option keys used by the application runtime environment is given all keys are described at the end of this document QTLMap 0 7 37 56 DATE 2013 06 20 17 12 8 02 Release build 0 9 s 6 C 26 04 201 3 17 11 01 ARGUMENTS p analyse calcul 2 CALCUL 2 MODLIN ANALYSIS OMP_NUM_THREADS 6 Seek check ee x Kk Re x x x PARAMETERS ANALYSE FILE SUMMARY KAKKKKKKKK KKK KKK KK out_output OUTRUN eS sie lite out lrtsires OUTPUT sires out lrtdams OUTPUT dams out_pded OUTPUT pded out_pdedjoin OUTPUT pdedjoin out_pateff OUTPUT pateff out mateff OLEI mats TE out grid2qtl OUTPUT grid2qtl out_summary OUTPUT summary opt_chromosome 7 opt_ndmin 20 opt_step 0 01 opt_unknown_char 0 opt_mindamphaseproba 0 10 opt_minsirephaseproba 0 910 opt_eps_cholesky 0 01 opt_eps_confusion 0 70 opt_eps_hwe 0 01 opt_eps_linear_heteroscedastic 0 5 in_map map in_genealogy genealogy aki Vesela S phenotypes in_genotyp
46. ers are defined within sire family the equations simplified for each sire to zs 1 1 1 1 pj Ge n p dd WiX WW W ypi QTLMap 0 9 6 10 56 And the variances are estimated by f yp Xifi Wi yp Xi W aj npi 4 2 2 More complex models If linear models are much easier to handle estimations may be more accurate when using mixture models e g Knott et al 1996 or Mangin et al 1999 Four levels of likelihood linearization will be available in QTLMap Full mixture exact likelihood IT S1 Las D Asi Mi IL ipo p hdij hs Mi I dts ta pti ts ta hs hdi Mj u F Git F Qijty oi Within sire dam regression the basic model IT Yns p hsi Mi Ir Enay p hdi hs Mi i Hn e u F Et tq pti ts ta hsi hdij Mi i eic aij oi Within sire regression IT Dns p hs M IEE Teei o u Xe ptf ts ta hsi M ais aije 01 With p tir ts ta hsi Mj c hd D hdij hsi Mi OP tiie a ts t4 hsi hdi Mi Fully linear model II Ip Hu e u Xt ta ptg ts ta Mi ait ait oi With p tije Mi Dns p hsi Mi na p hdi hsi M p tij cp hs hdij Mi When only the most probable sire phases are considered Within sire regression and Fully linear model are identical When only the most probable dam phases are considered Within sire dam regression and Within sire regression are identical When this restriction is applied to both sire and dam phases
47. euwissen and Goddard 2000 are generally used. QTLMap being dedicated to experimental populations characterized by a family structure, the LDLA approach described by Legarra and Fernando (2009) was implemented. This approach combines the LD Decay (5.1) and regression (2.1) models, the QTL effect being defined within the parental haplotype effect. The performance expectation $E(yp_{ijk} \mid hs_i, hd_{ij})$ becomes

Li Hij z p tij ts ta hsi hdi M Y5 RSie Va Fijt Ait Os ts tq

Thus some flexibility between families around the mean haplotype effect is given.

4.5.3 The half-sib case

When the dams have only one progeny or a very small offspring, (1) dam QTL effects cannot be correctly estimated (see 1.3) and (2) dam phases cannot be inferred from the available marker information. However, in these situations the number of dams is large and a lot of information about QTL segregation can be extracted thanks to the linkage disequilibrium. In dense map situations, local haplotypes (defined by segments comprising a limited number of marker loci) are very generally fully transmitted, without recombination, by the dams to their progeny. These dam-to-progeny transmitted haplotypes, easily deduced from the progeny marker genotypes and the sire transmitted haplotypes, are good approximations of genuine dam haplotypes and are considered as such in the previous model.

5 Setting up QTLMap

Pre-requisites

- The
48. g BZ 1 46 TOMOS Box 19 LRT file general test and sire family contributions o LRT files dam family contributions out_1rtdam For each tested position the file contains Chromosome Position Dam 1 LRT contribution Dam 2 LRT contribution as in out l1rtsires Note when the offspring size of a dam is below the threshold nd_min the LRT is printed as 0 000 see opt_ndmin option o LRT files grid 2 QTL out_grid2qt1 The first part concerns the test of thypothesis 1 QTL versus the hypothesis 2 QTL Y The fist line gives the tested position for the 1st QTL Y The following lines give the tested position for the 2 4 QTL followed by the LRT 1 vs 2 QTL for each couple of positions TEST 1LOTL 2QTL 4 4 4 4 4 4 4 4 4 o al 102 03 04 205 06 5 101 00 S567 8 42 1 0 30 Li 66 12 80 202 00 a 90 3 74 8 43 HORS 0 11 68 3 00 5 OW 00 Sio 1L 8 43 JL 5 Sal 04 00 00 00 500 3497 8 44 a 095 00 00 00 00 00 EXON S Box 20 1 LRT file grid 2 QTL The second part test of thypothesis 0 QTL versus the hypothesis 2 QTL Y The fist line gives the tested position for the 1st QTL Y The following lines give the tested position for the 2 4 QTL followed by the LRT 0 vs 2 QTL for each couple of positions QTLMap 0 7 50 56 TEST OQTL 2QTL 5 al 02 5 10 8 04 05 06 On 00 27 46 S2 o dl 34 09 35 45 3659
49. g the first parametrization i e considering for each parents the two effects aj and aj or aj and aij QTLMap 0 9 6 12 56 Y W is the correponding incidence matrix whose elements are the p t M and P tijka Mi Y aistherandom animal effect distributed in JV 0 402 Y Zz I Y ethe random residual distributed in N 0 102 In this mixed linear model as between families heterogeneity is considered through the A matrix the homoskedastic situation is kept only one variance for the residuals In principle the first moments u a and second moments o 02 should be estimated at each tested QTL location Following the FASTA approach of Aulchenko presented in the GENABEL software Aulchenko 2011 QTLMap does not re estimate the heritability coefficient more precisely the ratio A 02 02 along the genome scan These parameters must be given by the user and are easily estimable from standard approaches as ASREML At each location the following mixed model equations are solved XX XW x H X yp WX Wwe w a W yp X W I AA a yp Or ra wo B rs ttov with H A A AI a matrix invariant to the location x which has to be calculated only once The residual variance is estimated at each location by 2 gt ag A yp x W a Ac aD yp x w a vp X w 5 nv X w The A matrix which gives the H AA I AA 3 matrix is estimated following the usual Henderson
50. gan on the first chromosome. In this example, the simulated QTL has an effect of 0.4 on the first trait and 0.5 on the second trait. The QTL alleles are fixed in the grand-parental populations.

QTL                 keyword
1                   number of QTL to simulate
position 9 5        keyword, position of each QTL (in Morgan)
chromosome 1        keyword, chromosome location of each QTL
frequency 1 0       keyword, frequency f1 of one QTL allele in a grand-parental population
TRAITS              keyword
2                   number of traits to simulate
imf                 name of the first trait, as given in the model file
bardiere            name of the second trait, as given in the model file
qtleffect 0.4 0.5   keyword, QTL effect on trait 1 and 2

Box 13: Example of a simulation parameter file to simulate data with a QTL effect affecting traits referenced in the model file.

8.2 Permutations

model: 1 11 Sexe Poid Trait 100
p_analysis: in_map map; in_genealogy genealogy; in_genotype genotype; in_traits performance; in_model model; opt_step 0.01; opt_chromosome 1 2 3

In the QTLMap software, the permutation option permutes the nuisance effects and phenotypes between genotyped animals within full-sib and/or half-sib families, in order to estimate empirically the distribution of the test statistic under the null hypothesis (no QTL is segregating on the linkage group). The permutation procedure proposed by Churchil
51. gating), the columns are: the gene position on the expression array (as indicated in the eQTL performance file), the standard deviation of the distribution, the mean, and the sire familial polygenic effects. The standard deviations and the polygenic means are given for each sire successively on the same line.

Hypothesis 0
Given parameters are respectively: Gene position on the array, std dev [GMB11940 GMB11945], General Mean, Sire polygenic effects [Sire GMB11940, Sire GMB11945]
note: 0.0 means not estimable
...
A_87_P003455    0.450 0.386 0.042 0.042 0.000
...
A_87_P004630    0.347 0.343 0.008 0.008 0.000
...
seq_RIGG14618   0.367 0.440 0.019 0.019 0.000
...
Box 17.1: eQTL report under the null hypothesis.

Under the alternative hypothesis (1 QTL
52. k ck ck ck Ck ck ck Ck Sk ck Ck Sk ck KK KKK Ck Sk KKK KKK ck ck ck ck ck ck Ck ck ck Ck ck ck kk ck Ck ck ck kk Ck kk kk ck kk ck kk kk ck Sk kk ko ko k kk Sk Sk ck k KKK KKK Box 16 7 Example of main output file information second section quality of parameters estimations o Thethird section provides results of HO hypothesis analyses for each trait Parameters estimation with estimability information and precision indicators are listed for v Within sire standard deviation global standard deviation for models 3 25 and 27 Y Sire polygenic efects Y Dam polygenic effects if their family size is over OPT_NDMIN v Estimation of parameters under HO Within sire standard deviation sire QlOOGL Sel O55 sire 910045 s d 0 578 sire 910081 s d 0 658 sire 910068 soc 0 654 parameter estimable value precision General Mean yes NES CROSS Sire polygenic effects Sire 910001 yes 667 0 067 Sire 910045 yes 0 448 0 058 Sire 910081 yes 0 264 0 065 Sire 910088 no Dam polygenic effects Dam 910014 Sire 910001 yes 0 062 0 069 Dam 910002 Sire 910081 yes 0 052 0 00 3 Dam 910010 Sire 910081 yes a 1 2 8 0 068 Dam 910074 Sire 910088 yes 0 220 OPCW eS NOTE known allelic origin means QTL effect maternal paternal allele effects KKK Ine mean OF absolute value of etbstituticn erect WO in stel wmit Box 16 8 Example of main output file information third section Analysis unde
53. l and Doerge (1994) is an intuitive method for estimating thresholds which accurately reflects the specificities of an experimental situation. However, when the permutation groups are small, the number of possible permutations decreases and the simulation method is better suited to estimating the distribution of the test statistic under H0. In order to prevent unsuitable calculations, an arbitrary minimum family size of 10 was fixed to allow permutations. Different permutation situations are considered:
- When the full-sib family size is higher than the nd_min key (or 10 if nd_min < 10), genotyped animals are permuted within the sire full-sib family.
- When the full-sib family is smaller than nd_min (or 10 if nd_min < 10), the permutation is performed within the half-sib family.
In a multitrait analysis, only phenotyped animals are permuted. In a unitrait analysis, animals with at least one phenotyped trait among the traits of the performance file are permuted.

Permutation of performances is available with the runtime option permute:
$QTLMAP_PATH/qtlmap p_analysis --calcul=1 --nsim=100 --permute

8.3 Simulations without reference to data structure

p_analysis: opt_chromosome 1 2 3; in_paramsim param_sim

The simulations can be carried out with no reference to existing traits, which allows simulating phenotypes for all progeny without missing data. In this case, the parameter file does not need the keywo
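The choice of permutation group described in section 8.2 can be sketched as follows. This is a minimal illustration, not QTLMap's code: the record layout (dicts with `id`, `sire`, `dam`, `phenotype`), the function names, and the treatment of a family size exactly equal to the threshold (counted here as large enough) are assumptions. Small full-sib families of the same sire are pooled into one half-sib group.

```python
import random
from collections import defaultdict

MIN_PERMUTATION_SIZE = 10  # floor on family size quoted in the text


def permutation_group(sire, dam, full_sib_size, nd_min):
    """Return the key of the family within which an animal is permuted."""
    threshold = max(nd_min, MIN_PERMUTATION_SIZE)
    if full_sib_size >= threshold:  # assumption: equality counts as large enough
        return (sire, dam)          # permute within the full-sib family
    return (sire, None)             # fall back to the sire half-sib family


def permute_within_families(animals, nd_min, rng):
    """Shuffle phenotypes inside each permutation group.

    animals: list of dicts with keys 'id', 'sire', 'dam', 'phenotype'.
    Returns a dict {animal id: permuted phenotype}.
    """
    sizes = defaultdict(int)
    for a in animals:
        sizes[(a["sire"], a["dam"])] += 1

    groups = defaultdict(list)
    for a in animals:
        size = sizes[(a["sire"], a["dam"])]
        groups[permutation_group(a["sire"], a["dam"], size, nd_min)].append(a)

    permuted = {}
    for members in groups.values():
        values = [m["phenotype"] for m in members]
        rng.shuffle(values)
        for member, value in zip(members, values):
            permuted[member["id"]] = value
    return permuted


# Hypothetical data: one large full-sib family (12 progeny) and one small (3).
animals = [{"id": f"x{i}", "sire": "S1", "dam": "D1", "phenotype": float(i)}
           for i in range(12)]
animals += [{"id": f"y{i}", "sire": "S1", "dam": "D2", "phenotype": 100.0 + i}
            for i in range(3)]
result = permute_within_families(animals, nd_min=5, rng=random.Random(0))
```

Passing the random generator explicitly keeps a permutation run reproducible, which is convenient when the empirical distribution has to be regenerated.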
54. Box 18: Summary with the qtl=3 option.

10.4 Output of the LRT

The following keys should be defined in the parameter file to output the LRT values for each tested position along the linkage group under the hypothesis of one segregating QTL (out_lrtsires, out_lrtdam) and/or a grid output for the likelihood ratio test under the hypothesis of 2 QTL (out_grid2qtl).

- LRT files, general test and sire family contributions (out_lrtsires). For each tested position, the file contains:
  - For H1: chromosome, tested position, global LRT, sire 1 LRT contribution, sire 2 LRT contribution, ...
  - For H2: chromosome 1, chromosome 2, tested position 1, tested position 2, global LRT, sire 1 LRT contribution, sire 2 LRT contribution, ...

Chr Pos GlobalLRT 910001 910045 910081 910088
...

Chr1 Chr2 Pos1 Pos2 GlobalLRT 910001 910045 910081 910088
...
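A file with the H1 layout described above (chromosome, position, global LRT, then one contribution per sire) is easy to post-process. The sketch below assumes a whitespace-separated file whose first line is the header `Chr Pos GlobalLRT <sire IDs...>`, as in the box; the function names and the sample values are illustrative only and are not taken from a real QTLMap run.

```python
def parse_lrt_sires(text):
    """Parse the H1 part of an out_lrtsires-style file into a list of records."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    sire_ids = header[3:]  # columns after Chr, Pos, GlobalLRT
    records = []
    for fields in rows:
        records.append({
            "chr": fields[0],
            "pos": float(fields[1]),
            "global_lrt": float(fields[2]),
            "sires": dict(zip(sire_ids, (float(v) for v in fields[3:]))),
        })
    return records


def max_lrt(records):
    """Return the record with the highest global LRT."""
    return max(records, key=lambda r: r["global_lrt"])


# Made-up values, for illustration only.
SAMPLE = """
Chr Pos GlobalLRT 910001 910045
1 0.010 8.50 4.90 3.60
1 0.020 8.62 4.82 3.80
1 0.030 8.10 4.66 3.44
"""

best = max_lrt(parse_lrt_sires(SAMPLE))
```

Keeping the per-sire contributions in a dictionary keyed by sire ID makes it straightforward to see which families drive the global test at the maximum.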
55. n output and the parental phases output) at position x, given the dam phase;
- the probability that the progeny inherited the 2nd dam chromosome (as given in the main output and the parental phases output) at position x, given the dam phase (0.5 if the dam transmission probabilities are not considered in the analysis).

Position  Sire    Dam     Dam_Phase  Animal  p(2nd sire allele)  p(2nd dam allele)
1         910001  910014  1          944217  1.000               0.000
2         910001  910014  1          944217  0.999               0.001
3         910001  910014  1          944217  0.999               0.001
4         910001  910014  1          944217  0.999               0.001
5         910001  910014  1          944217  0.999               0.001

Box 23: Marginal probabilities of the parental chromosome transmission.

10.9 Joint probabilities of the parental chromosome transmission

Each line gives, for a tested QTL position x:
- the tested position (cM);
- the sire ID;
- the dam ID;
- the dam phase number (as given in the main output and the parental phases output, when multiple phases are available for the dam);
- the progeny ID;
- the probability that the progeny inherited the 1st sire and 1st dam chromosome (as given in the main output and the parental phases output) at position x, given the dam phase;
- the probability that the progeny inherited the 1st sire and 2nd dam chromosome (as given in the main output and the parental phases output) at position x, given the dam phase;
- the probability that the progeny inherited the 2nd sire and 1st dam chromosome (as given in the main output and the parental phase
56. nimal and must be included in the analysis (0/1); a variable IC, which indicates whether the record was censored (IC = 0) or not (IC = 1), this IC information being needed for survival analysis (by default IC = 1).

...
Box 5: Example of a quantitative trait values file. This file describes 2 traits. For progeny 961924, the recorded information is: sex 1 (fixed effect), body weight 10.43 (covariate), backfat thickness 7.8 mm (trait 1) and fattening period of 77.6 days (trait 2), etc.

Special case: performance file for expression quantitative traits

When performances are expression data, another format is required. This file gives the phenotypes (expression traits) to be analysed. The header line is the list of phenotyped animals. The following lines are the fixed effects, the covariates and finally the phenotypes. The format of the nuisance effect and phenotype lines is:

<IDANIMAL> VALUE_ANIMAL1 VALUE_ANIMAL2 ...

For missing data, insert a character string which cannot be interpreted as a number (e.g. n/a).

...
gen1 0.0184170490684831 0.143560443113406 0.118137020630747 0
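The expression performance format described above (a header line of animal IDs, then one line per nuisance effect or expression trait, with any non-numeric token meaning "missing") can be read with a few lines of code. This is a sketch under stated assumptions: fields are whitespace-separated, the header holds only animal IDs, and the names `parse_expression_file` and `to_number` as well as the sample content are hypothetical.

```python
def to_number(token):
    """Return the token as a float, or None when it is not numeric (missing data)."""
    try:
        return float(token)
    except ValueError:
        return None


def parse_expression_file(text):
    """Parse an expression performance file.

    Returns (animals, table) where table[line_id][animal_id] is a float or None.
    """
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    animals = lines[0]
    table = {}
    for fields in lines[1:]:
        line_id, raw = fields[0], fields[1:]
        if len(raw) != len(animals):
            raise ValueError(f"{line_id}: expected {len(animals)} values, got {len(raw)}")
        table[line_id] = dict(zip(animals, (to_number(v) for v in raw)))
    return animals, table


# Made-up content, for illustration only.
SAMPLE = """
a1 a2 a3
sexe 1 2 1
cov1 0.3 0.4 0.9
gen1 0.018 n/a 0.118
"""

animals, table = parse_expression_file(SAMPLE)
```

Note that Python's `float` also accepts strings such as `nan` or `inf`, so a missing-data marker like `n/a` is a safer choice than `NaN` with this particular sketch.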
57. nterval: left bound, right bound, left flanking marker, position of the left flanking marker, right flanking marker, position of the right flanking marker.

Confidence Intervals, QTL 1, Trait Bardiere
Name  Position  Method  Average  Pos_Left  Pos_Right  Left_flank_mark  Pos  Right_flank_mark  Pos
...
Box 16.15: Example of main output file information, fourth section (analysis under H1).

10.2 Output for eQTL analyses

A special output presents the analysis for each gene expression (it depends on the dynamic flag data-transcriptomic). Only single-trait analyses provide this output format. For each hypothesis, the report gives:
- First line: a header indicating the content of the columns.
- Next lines:
  - first column: gene name;
  - other columns: estimate of each parameter, as indicated in the header.
Note: a value of 0.0 for an estimate means that the parameter is not estimable.

Under the null hypothesis (no QTL segre
58. ofthe likelihood Knott et al 1996 Elsen et al 1999 Le Roy et al 1998 In this family of modelling all parents are supposed heterozygous at the QTL with specific alleles giving a total of 2 ns nd QTL effects a4 a 1 2 dia Qia i 1 ns for the sires and a Qija j 1 nd for the dams When the lt parent is homozygous at the QTL we get a4 a5 a situation which may be statistically tested An other parametrization of the model describes performances expectations as the sum of parental means values u 1 i or ij and deviations a to this mean due to the QTL with Q4 U a and aj uc aj which can be summarized by aj u 1 a It was proposed by Soller and Genizi 1974 but not kept here In the basic model it is assumed that the parents are unrelated the markers in linkage equilibrium and the trait normally distributed As proposed by Goffinet et al 1999 in the case of populations structured in half sib families and by Le Roy et al 1998 when the population is a mixture of half and full sib families the residual variance of the quantitative trait o is estimated within sire This heteroskedastic parametrization better fits different between sires patterns of segregation of other QTLs unlinked to the tested position The likelihood is given by ns ndj ud a _X pasismo X paau ns Mo ats E hsi J hdij With hs and hdj the sire and dam phases M the marker information for the i sir
59. omosome (resp. 2nd) [transmission probabilities: Elsen et al. 2009].

haplotype 1. Phases: exact probability of phases by enumeration (all possible phases are considered in turn and their probability computed). Transmission: exact transmission probabilities are computed. Recommended for: sparse maps.
haplotype 2. Phases: exact probability of phases by enumeration (all possible phases are considered in turn and their probability computed). Transmission: approximate transmission probabilities are computed using all available information. Recommended for: sparse maps (not recommended).
haplotype 3 (default). Phases: exact probability of phases by enumeration (all possible phases are considered in turn and their probability computed). Transmission: rapid and optimised transmission probabilities (Elsen et al. 2009). Recommended for: sparse maps.
haplotype 4. Phases: very fast but approximate identification of the most probable phases, based on closest marker information (Windig and Meuwissen). Transmission: rapid and optimised transmission probabilities (Elsen et al. 2009). Recommended for: dense maps.
haplotype 5 (eq. snp). Phases: fast and almost exact identification of the most probable phases, based on closest marker information (Favier et al. 2010). Transmission: rapid and optimised transmission probabilities (Elsen et al. 2009). Recommended for: dense maps.

7.3 Option snp: fast phasing in dense genotyping situations

This option determines phases rapidly and is a good choice for dense marker maps. In some cases, convergence may be difficult if not impossible; this situation may happen because of genotyping errors. The snp option is equivalent to haplotype 5.

Example: $QTLMAP_PATH/qtlmap parameter_file
60. on to obtain the gradient.

7.7 Options ci & ci-nsim

To obtain a confidence interval for the QTL position, four methods are available, selected with the ci runtime option:
- 1: drop-off method;
- 2: bootstrap resampling method;
- 3: bootstrap resampling method keeping exactly the number of progeny within each family;
- 4: Hengde Li method, using the Relative Frequency Ratio.
The number of simulations (or resamplings) for the bootstrap confidence interval methods is given by the runtime option ci-nsim, the default value being 1000.

Example: $QTLMAP_PATH/qtlmap parameter_file --calcul=1 --ci=3,4 --ci-nsim=500

7.8 Options data-transcriptomic & print-all (report output mode): eQTL analyses

When looking for eQTL, the number of traits to be analysed becomes very large. In this case, specific routines are needed and ad hoc outputs are produced. To enable this mode, the runtime option data-transcriptomic must be given. When performing eQTL analyses (or the corresponding eQTL simulations) with data-transcriptomic, the output is minimised. To force the classical reporting format, use the runtime option print-all.

Example: $QTLMAP_PATH/qtlmap p_analysis --calcul=1 --data-transcriptomic --print-all

7.9 Options for the control of process information

To get the maximum information during the process, add -v or --verbose to the command: $QTLMAP_PA
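As an illustration of the first confidence interval method listed in section 7.7, the sketch below builds a drop-off interval from an LRT profile: the interval extends on each side of the maximum for as long as the LRT stays within a given drop of LRTmax. This is a generic version of the drop-off idea, not QTLMap's implementation; in particular, the drop value is a free parameter here, because the guide does not state how a confidence level (90, 95, 98) is converted into a drop.

```python
def drop_off_interval(positions, lrts, drop):
    """Return (left bound, position of the maximum, right bound).

    positions and lrts are parallel lists ordered along the linkage group.
    The interval is the contiguous run of tested positions around the
    maximum whose LRT is at least LRTmax - drop.
    """
    i_max = max(range(len(lrts)), key=lrts.__getitem__)
    cutoff = lrts[i_max] - drop

    left = i_max
    while left > 0 and lrts[left - 1] >= cutoff:
        left -= 1

    right = i_max
    while right < len(lrts) - 1 and lrts[right + 1] >= cutoff:
        right += 1

    return positions[left], positions[i_max], positions[right]


# Made-up LRT profile, for illustration only.
positions = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
lrts = [2.0, 5.0, 8.0, 10.0, 9.0, 6.0, 3.0]
interval = drop_off_interval(positions, lrts, drop=3.0)
```

The bounds are always tested positions, so the resolution of the interval is limited by the scan step (opt_step).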
61. ood under H0 (there is no QTL segregating on the linkage group):

$$\mathrm{LRT} = -2 \ln\left(L0 / L1\right), \qquad L0 = \prod_{i=1}^{ns} \prod_{j=1}^{nd_i} \prod_{k=1}^{np_{ij}} \phi\left(yp_{ijk};\ \mu + a_i + a_{ij},\ \sigma_i^2\right)$$

It must be noted that the elements $a_i$ and $a_{ij}$ only represent the sire and dam polygenic deviations to the general mean.

4.1.3 Simplified family structure

In some designs, the experimental population is made of sire half-sib families, each dam producing only one progeny. A more frequent situation corresponds to a nested structure of large sire families with very small offspring sizes for the dams. In these situations, the dam parameters are very difficult to estimate and must be omitted from the likelihood formulation. The formulae are adjusted accordingly:

$$L1 = \prod_{i=1}^{ns} \sum_{hs_i} p(hs_i \mid M_i) \prod_{k=1}^{nd_i} L1_{ik}, \qquad L1_{ik} = \sum_{ts=1,2} p(t_{ik} = ts \mid hs_i, M_i)\ \phi\left(yp_{ik};\ \mu + a_i^{ts},\ \sigma_i^2\right)$$

with $t_{ik}$ the transmission event (1 or 2) from the sire to the progeny.

Again, this part of the likelihood is linearly approximated by $L1_{ik} \approx \phi\left(yp_{ik};\ \mu_{ik},\ \sigma_i^2\right)$, with $\mu_{ik} = \mu + \sum_{ts=1,2} p(t_{ik} = ts \mid hs_i, M_i)\ a_i^{ts}$.

Under H0, the likelihood L0 becomes:

$$L0 = \prod_{i=1}^{ns} \prod_{k=1}^{nd_i} \phi\left(yp_{ik};\ \mu + a_i,\ \sigma_i^2\right)$$

4.1.4 Computation of elements

4.1.4.1 Parental phases

In the current version of QTLMap, only the most probable sire phase given the $M_i$ is considered: sire families being large, it is assumed that enough information is available for a correct phase inference. The efficiency of this approximation was demonstrated in Mangin et al. (1999). Practicall
62. or of progeny phenotypes j71 njk 1 ndi a a5 Gin vij dijz the vectors of unknown first moment parameters X and WY i iv C12 ijo C12 jg p Aj i the corresponding nd x 1 nd incidence matrices and Vj Or Been the covariance matrix The non nul Wj elements are given by p t M the 1 and 2 4 elements of the W7 line corresponding to progeny ijk are ptis 1 M and ptis 2 Mi while the 1 2j and 2 2j elements to p tji q 1 M and p ti ya 2 M Note that in this linear context the non independance between sire and dam transmissions has not to be considered Finally we have the linear model Ypi Xin Wi a ei with e the random residual supposed to be distributed in WV 0 V Extention to the ns sire families is straightforward Let yp pi Pi YPns be the vector of performances a 84 05 a Ans the vector of QTL effects X QU X i X s W ji 14 W the incidence matrices extended to the whole set of sires V j 1 V the total covariance matrix The linear model is yp Xu W a ce The least square equations are all W matrices depend on the x position but the corresponding superscript was omitted D XVI X XVI W 0 0 0 0 Xi VI YP W V iX WjVilW 0 0 0 0 W Vi ypi ft 0 0 XV X XjV W 0 0 X V lypz a 0 0 W V X W V W 0 0 W V yp Hns 0 0 0 0 cos Xns Vis Xns Xns Vis Wns Xns Vis Pns Ans 0 0 0 0 Was Vii Xns Was Vis Wns Was Vid YPns In this case where all paramet
63. out_grid2qtl | Sire QTL effect estimations under hypothesis H2
out_coeffda | Trait weights of linear combinations at each tested chromosomal location (multivariate analyses)
out_informativity | Informativity at each tested chromosomal location

GENERAL OPTIONAL KEYS
opt_step | Step length of the genome scan (Morgan). When opt_step = 0, the analysis is done at each marker position
opt_ndmin | Minimum number of progeny per dam (offspring size) above which the polygenic and QTL effects of the dam are estimated | 10000
opt_minsirephaseproba | Minimal paternal phase probability: the analysis is interrupted if, for a sire, none of its phases reaches this threshold | 0.90
opt_mindamphaseproba | Minimal maternal phase probability: threshold above which the probable maternal phases are considered in the analysis | 0.10
opt_unknown_char | Unknown genotype value | 0
opt_chromosome | Linkage group (most often chromosome) name
opt_phases_offspring_marker_start | Name of the marker at the beginning of the offspring haplotypes (option of out_phases_offspring)
opt_phases_offspring_marker_end | Name of the marker at the end of the offspring haplotypes (option of out_phases_offspring)

OPTIONAL KEYS FOR advanced users
opt_eps_cholesky | Coefficient for the Cholesky decomposition | 0.5
opt_eps_confusion | Threshold to test confusion between factors from the | 0.70
64. r H0 o Thefourth section provides results of H1 hypothesis analyses for each trait Section calcul 1 28 88 8 Z H Possible confusions between QTL and fixed effects or polygenic effects EE SE Residual variances and estimation of the main effects polygenic QTL x x x x x x x LRT for the nuisance effects EE SE x QTLMap 0 7 41 56 e Maximum ofthe test statistics The value of the LRTmax and the maximum likelihood estimation of the QTL under the H1 hypothesis with identification of its flanking markers are given sets T SOT C ND NES I I I I I oo N co a TiS oo T Gy Wey ho ep OGU Sal COPS Sa J On 09 SS ceny Wey CONO Cer SC Sa 1S iss Too 19 Gy Me or GC TEE plc etry ee ee for py EE o Maximum likelihood ratio test Test de mii 5196564 The maximum is reached at position s flanking marker qtl 1 SLA Box 16 9 Example of main output file information Analysis under H1 0 05600 Car 27 p S0102 oo fourth section e Parameters estimation QTLMap 0 7 42 56 Within sire residual variance estimations are printed under all tested hypotheses global standard deviation for calcul 3 25 and 27 The maximum likelihood solutions for the parameters are given with an indication about their pr
65. rds in_model and in_traits. The simulation parameter file should contain a specific keyword to start the trait section: SIMULTRAITS. This section is identical to the TRAITS section in paragraph 7.1, but additional information about the nature of the trait is provided to compensate for the absence of the model file. This information is given next to the trait name:
- trait name, r (for real data), heritability of the trait;
- trait name, i (for integer, ordered discrete data), heritability of the trait, number of classes and frequency of each class.
If QTL are simulated, the simulation parameter file should start with the QTL section, as presented in paragraph 7.1.

Example: a simulation parameter file for the estimation of the rejection thresholds for the test "there are two QTL on the linkage group" versus "there is one QTL at position 0.6 Morgan on the 7th chromosome". The simulated QTL has an effect of 0.5 on the first trait (normally distributed, with h² = 0.5) and 0.5 on the second trait (discrete, distributed in 4 classes, with h² = 0.50). The QTL alleles are fixed in the grand-parental populations.

QTL
1                      number of QTL
position 0.6           position
chromosome 7           chromosome ID
frequency 1.0 0.0      frequencies of the QTL alleles in P1 and P2 (biallelic) for QTL1
SIMULTRAITS
2
traitsimul1 r 0.5      name, nature (real) and heritability of the 1st trait
traitsimul2 i 0.50 4
66. rules 4 4 Alternative penetrance functions A few alternatives are available for unitrait and multitrait analyses 4 4 1 Unitrait uniQTL situations 4 4 1 1 Nuisance effects QTLMap 0 9 6 13 56 The penetrance U x 0 may be enriched considering nuisance factors fixed effects or covariables In this case the mean fu becomes lijk XijkB M p t tc hs hdi M aic ei ttg With X the incidence vector corresponding to the ijk progeny and fj the vector assem bling the general mean y and nuisance factors fixed effects and covariables It is possible to create interactions between nuisance factors when defined as fixed effects and the QTL effects This extension of the basic model has been implemented for two likelihood options Within sire dam regression and Fully linear model The penetrance Ll in the Within sire dam re gression is estimated directly from the classical formulation vpi ix eral Its 9j 1 1 exp 1 vV21TtOj p 2 decomposition in Elementary Statistics should be available soon As the nuisance effects may affect performances of individuals belonging to different sire fami lies the within sire likelihoods are no more independant The linear model corresponding to the within sire dam regression is changed accordingly Let yp pi YPi YPns be the vector of performances a d1 2 Ai Ans the vector of QTL effects X QU X i X ns W ji 14 W the incidenc
67. s codded with integer figures are analysed using the liability threshold model of Falconer 1989 For the basic model the penetrance in this case is e g Moreno 2003 8 aa a Vv 210 2 A 0 YPijk Aypiigi 1 t ux L1jj S xp DEAS dt With V Ay and A the lower and upper thresholds corresponding to a y phenotype Vij the expectation of the underlying distribution which is a linear function of a and B Hijk Xij B r p t tc hs hdi M ei ei io V oj the residual variance for the sire i family The general picture is that this liability model needs a much longer computing time than the gaussian model but gives similar results in terms of power and parameters estimations We recommand the use of this discrete traits approach only when 1 there is very few 2 or 3 classes on the discrete scale 2 their frequencies are very unequal and 3 the data set is large enough to avoid that only a few individuals represent a given rare class 4 4 1 2 2 Survival and Time to events phenotypes These phenotypes also called failure times describe the length of interval between a point of origine and an end point They are characterized by the presence of censored data i e indi The Cholesky decomposition aims at transforming M M in the product L L with L a upper triangular matrix The transformation is processed using L7 Mi euge 14 and Li Mii Xk21j 1 Li Lj L5
68. s output) at position x, given the dam phase;
- the probability that the progeny inherited the 2nd sire and 2nd dam chromosome (as given in the main output and the parental phases output) at position x, given the dam phase.

Position  Sire    Dam     Dam_Phase  Animal  p(Hs1/Hd1)  p(Hs1/Hd2)  p(Hs2/Hd1)  p(Hs2/Hd2)
1         910001  910014  1          944217  0.000       0.000       1.000       0.000
2         910001  910014  1          944217  0.001       0.000       0.998       0.001
3         910001  910014  1          944217  0.001       0.000       0.998       0.001
4         910001  910014  1          944217  0.001       0.001       0.998       0.001
5         910001  910014  1          944217  0.000       0.001       0.999       0.000
6         910001  910014  1          944217  0.003       0.001       0.941       0.056
7         910001  910014  1          944217  0.003       0.001       0.884       0.112

Box 24: Joint probabilities of parental segment transmission.

10.10 Outputs for simulations

When data are simulated, a reduced output is given in the main result file. For each simulated trait, general parameters describing the distribution of the test statistic corresponding to the simulated data are printed (mean, standard deviation, minimum, maximum, skewness, kurtosis), together with a table containing the thresholds computed for 10%, 5%, 1%, 0.5%, 0.27%, 0.1%, 0.05% and 0.01% type I errors. It should be mentioned that accurate thresholds are obtained only if a sufficient number of simulations is carried out. Typically, at least 1000 simulations should be run to compute thresholds corresponding to 5% type I errors. The printed thresholds correspond to
69. segregating on the linkage group), the columns are: the gene position on the expression array (as indicated in the eQTL performance file), the chromosome where the QTL is detected, the QTL position, the LRT for the test H0/H1, the standard deviation of the distribution, the mean, the sire QTL effect for each sire, and the sire familial polygenic effects. As the missing data may vary from one expression trait to another, the information is pooled in "profile" sections, the missing data structure being homogeneous within a section. This pooling facilitates the comparison of the LRT to the rejection thresholds, which have to be computed independently for each profile.

Profile 1, Hypothesis 1
Given parameters are respectively: Gene position on the array, Chromosome 1, QTL Position 1, H0/H1, std dev [GMB11940 GMB11945], General Mean, Sire QTL effects [1] [Sire GMB11940 11 TT1T AC2C, Sire GMB11945 11 AC2C TT1T], Sire polygenic effects [Sire GMB11940, Sire GMB11945]
note: 0.0 means not estimable
...

Profile 2, Hypothesis 1
Given parameters are respectively: Gene position on the array, Chromosome 1, QTL Position 1, H0/H1, std dev [GMB11940 GMB11945], General Mean, Sire QTL effe
70. sib families. The sires and the dams are assumed unrelated. A sire (resp. a dam) may be mated to more than one dam (resp. sire). Thus, two animals of the second generation may be unrelated, half sibs or full sibs. A polygenic and a QTL effect are estimated for each parent having a large enough family. To avoid numerical difficulties, these effects are not estimated for dams with too few offspring; in this case, the dam progeny are considered as sire half sibs only. This structure is controlled through the option opt_ndmin (minimum number of progeny), which is given in the parameter file.

Remark 2: opt_mindamphaseproba and opt_minsirephaseproba. In the current release, QTLMap considers only one phase for the sire, except when the probabilities of all possible sire and dam phases are computed with the running option haplotype = 1, 2 or 3 (see below). If none of those probabilities for the sire exceeds a given threshold (opt_minsirephaseproba in the parameter file), the process is aborted.

Remark 3: optimisation options. Optimisation methods can be fine-tuned by expert users by changing from their default values the keys opt_optim_maxeval, opt_optim_maxtime, opt_optim_tolx, opt_optim_tolf, opt_optim_tolg and opt_optim_h_precision (see point 6.5, optim).

7 Run the software with the different running options for analyses

$QTLMAP_PATH/qtlmap parameter_file < --calcul --haplotype --optim --qtl --snp --data-transcriptomi
71. t of confusion between QTL and other effects in the final constained model test based on the correlation between columns of the incidence matrix KKK KK KKK KKK KKK KK KKK KKK KKK KKK KKK KK KKK KKK KKK KKK KKK KKK KKK KKK KKK KK KKK KKK KKK KK KKK KK KK Confusion between QTL and other effects final constained model No confusion detected the highest correlation is 0 257 Ck Ck ck ck ck ck KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KKK KK KKK KK KKK Sk KK KKK KK KK Box 16 12 Example of main output file information fourth section Analysis under H1 e Interactions between QTL and fixed effects When interactions between the QTL and m fixed effects are considered in the model the dam if needed and sire QTL effects are estimated for each level of the composite interacting fixed effect if n n2 nm are the number of levels for effects 1 2 m a maximum of ni X n2 X X Nm QTL effects is estimated for each parent as all levels of the interaction might not be represented in the progeny QTLMap 0 7 44 56 CK ck ck ck ok Ck Ck ck Ck ck Ck Sk ck kk KKK Ck ck KKK KKK KKK KK KKK KKK KKK ck kk ck kk Ck ck ck kk ck kk ck Ck ck ck kk ko kk Ck kk kk Sk Sk ck Sk KKK kkk testing model effects Direct effects Tested effect GLE Likelihood p value ratio sexe JL 1 200 535 0 000 modal il 10 92 0 001 lgt 13 126 57 0 000 Mime opglL GEN GXOUS Tested effect di Likelihood p value ratio sexe 4
72. teff
out_phases OUTPUT phases
out_haplotypes OUTPUT haplotypes
Box 11: Example of a parameter file.

Several keys may be defined (compulsory keys in grey):

Key | Description | Default value

INPUT FILES KEYS
in_map | Input map
in_genealogy | Input genealogy
in_genotype | Input genotype
in_traits | Input trait
in_model | Input model describing the performances and model factors
in_paramsimul | Input simulation parameters
in_pop | Optional input to give population names

OUTPUT FILES KEYS
out_output | Full information about the results
out_summary | Short information about the results
out_lrtsires | Sire family likelihood ratio test
out_lrtdams | Dam family likelihood ratio test
out_pded | Grand-parental segment transmission marginal probabilities
out_pdedjoin | Grand-parental segment transmission joint probabilities
out_phases | Parental phases information
out_freqall | Allele frequency of markers retained in the analysis
out_phases_offspring | Offspring haplotypes with the parental origin (the entire chromosome is considered when no option about the beginning and end of the region is given)
out_haplotypes | Haplotypes
out_pateff | Sire QTL effect estimations under H1
out_mateff | Dam QTL effect estimations under H1
out_maxlrt | Simulation report: position and max LRT
73. the parameter file. For each progeny, two lines are printed, one for each phased chromosome: the sire chromosome is printed first, the dam chromosome second. For each marker, the transmitted chromosome of the parent is printed (1 for the first phase, 2 for the second phase, as printed in the parental phase output). When known with certainty, only 1 or 2 is printed. When known with high probability (between 0.90 and 1), 1 or 2 is followed by a "p". When unknown (probability of the transmitted phase at the position lower than 0.90), a placeholder character replaces the marker origin.

By default, progeny chromosome phases are printed for all markers. The output can be reduced to a particular chromosomal region (around a QTL position, for example) using the following parameter keys:
- opt_phases_offspring_marker_start: first marker of the printed progeny phases;
- opt_phases_offspring_marker_end: last marker of the printed progeny phases.
The phases are then output for all the markers located between these bounds.

10.8 Marginal probabilities of the parental chromosome transmission

Each line gives, for a tested QTL position x:
- the tested position (cM);
- the sire ID;
- the dam ID;
- the dam phase number (as given in the main output and the parental phases output, when multiple phases are available for the dam);
- the progeny ID;
- the probability that the progeny inherited the 2nd sire chromosome (as given in the mai
74. the three last models are identical.

4.3 Alternative genetic hypotheses

4.3.1 Assuming more than one QTL

Extensions of the basic model to more than one QTL acting additively are available.

4.3.1.1 Two linked non-interacting QTL

Following Gilbert and Le Roy (2007), the L175 part of the likelihood is extended to

Lik 2 gt p t 0 02 5 t2 t3 hs hdij M 9 2 le agit sje q 1 1 2 tit 02 03 d b OF ijks ijks

with the vectors of transmission events (1 or 2) from the sire and the dam to the progeny at the QTL located at x_q (q = 1, 2) on the scanned chromosome. The first two summations thus extend over 16 situations. The QTL effect parameters are the effects of the QTL located at x_1 and x_2 in the i-th sire (resp. the ij-th dam).

It must be noted that the probability of the transmission events is their joint probability and not the product of the marginals, which accounts for the linkage between the tested positions. In the current version of QTLMap, this two linked QTL hypothesis is only available in the basic model framework.

4.3.1.2 Two linked epistatic QTL

As the number of parameters to be estimated under this genetic hypothesis may be very large, this option was only made available for the half-sib family structure, where only sire effects (polygenic and QTL) are estimated. In this situation the L17 part of the likelihood is

Li 2 pit qe t2 hs Mi u Aiit tct Oc 212 2 01 t
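The count of 16 situations in the two linked QTL likelihood follows from enumerating the transmission events: each parent transmits phase 1 or 2 at each of the two tested positions. A minimal sketch of that enumeration (variable names are illustrative, not taken from QTLMap):

```python
from itertools import product

PHASES = (1, 2)  # first or second parental phase

# Transmission vector of one parent: (phase at position x1, phase at position x2).
parent_vectors = list(product(PHASES, repeat=2))   # 4 possibilities per parent

# Joint situations: every sire vector combined with every dam vector.
situations = list(product(parent_vectors, parent_vectors))

print(len(parent_vectors))  # 4
print(len(situations))      # 16
for sire_vector, dam_vector in situations[:3]:
    print("sire", sire_vector, "dam", dam_vector)
```

Each of these 16 situations is weighted in the likelihood by its joint probability given the parental phases and marker data; the sketch only enumerates them and does not compute those probabilities.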
75. type I errors at the level of the linkage group for which data are simulated (typically chromosome-wide thresholds). When multiple linkage groups (chromosomes) are analysed, it is recommended to use an approximate Bonferroni correction, either to adjust the type I error for the number of independent linkage groups analysed, or to adjust the type I error as a proportion of the genome covered by each linkage group when linkage groups have very different sizes (as in chicken).

Variable traitsimul1, test 0vs1Q
Number of simulations   100
Mean                    14.24685
Standard deviation      4.07168
Skewness                0.70693
Kurtosis                1.05302
Minimum                 6.62047
Maximum                 28.64581

Level (chromosome / genome)   Threshold
0.1000                        19.39
0.0500                        (unreadable)
0.0100                        27.40
0.0050                        (unreadable)
0.0027                        28.44
0.0010                        28.59
0.0005                        (unreadable)
0.0001                        28.64

Box 25: Output file from simulations

In addition to the main output, a summary output is provided. For each analysed variable, a line is given with the empirical thresholds at 5%, 1% and 0.1% at the chromosome and the genome level. The calculation of the genome-wide level corresponds to a genome scan of 18 autosomes, as in pigs. For other species, the genome-wide level can easily be obtained by multiplying the chromosome-wide level by the number of chromosomes, or by expressing it in proportion of the genome represented by
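The approximate Bonferroni correction described above amounts to simple arithmetic. The sketch below is illustrative only (the function names are not part of QTLMap); it shows both variants mentioned in the text: scaling by the number of linkage groups, and scaling by the fraction of the genome that a linkage group covers.

```python
def genome_wide_level(chromosome_level, n_chromosomes):
    """Approximate genome-wide type I error from a chromosome-wide one."""
    return min(1.0, chromosome_level * n_chromosomes)


def chromosome_level_for(genome_level, n_chromosomes):
    """Chromosome-wide level to use for a target genome-wide level."""
    return genome_level / n_chromosomes


def genome_wide_level_by_coverage(chromosome_level, genome_fraction):
    """Variant for linkage groups of unequal size.

    genome_fraction: proportion of the genome covered by the linkage group.
    """
    if not 0.0 < genome_fraction <= 1.0:
        raise ValueError("genome_fraction must lie in (0, 1]")
    return min(1.0, chromosome_level / genome_fraction)


# 18 autosomes, as in pigs.
print(genome_wide_level(0.0027, 18))       # close to 0.05
print(chromosome_level_for(0.05, 18))      # about 0.0028
print(genome_wide_level_by_coverage(0.0027, 1 / 18))
```

With 18 autosomes, a chromosome-wide level of 0.0027 corresponds to a genome-wide level of roughly 0.05, which is consistent with the 0.0027 row appearing in the simulation output.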
76. uded in the analysis
- Gaussian, discrete or survival (Cox model) data
- Familial heterogeneity or homogeneity of variances (homo/heteroscedasticity)
- Can handle eQTL analyses
- Computation of transmission and phase probabilities adapted to high throughput genotyping (SNP)
- Empirical thresholds are estimated using simulations under the null hypothesis or permutations of trait values
- Computation of power and accuracy of your design or any simulated designs

2 Contributors

Pascale Le Roy, UMR1348 PEGASE, INRA, Rennes, France
Jean-Michel Elsen, UR0631 SAGA, INRA, Toulouse, France
Hélène Gilbert, UMR0444 LGC, INRA, Jouy-en-Josas, France
Carole Moreno, UR0631 SAGA, INRA, Toulouse, France
Andres Legarra, UR0631 SAGA, INRA, Toulouse, France
Olivier Filangi, UMR1348 PEGASE, INRA, Rennes, France

3 Support

Subscribe and post any message or question to the qtlmap users mailing list hosted at listes.inra.fr.

4 Theoretical background

QTLMap is a software dedicated to marker-assisted genetic dissection of quantitative traits recorded in experimental populations. Typically, the analysed populations must be presented as a collection of full or half-sib families, each comprising a sire i (i = 1, ..., ns) and its mates ij (j = 1, ..., nd_i), each giving birth to one or more progeny ijk (k = 1, ..., np_ij). There is a total of ns sires, nd = Σ_i nd_i dams and np = Σ_i np_i progeny, with np_i = Σ_j np_ij. The parents form the G1 generation
77. y finding the most probable phase can be described as the maximisation of a quadratic function of binary variables (Favier et al., 2010). This optimisation belongs to the Binary Weighted Constraint Satisfaction Problem (BWCSP) area, which makes it possible to use a very efficient algorithm (Larrosa and Schiex, 2004).

Two strategies are proposed for the computation of the dam phase probability p(hd_ij | hs_i, M_i). When the number of markers on the linkage group is small (less than 15), the possible phases can be exhaustively listed and all phase probabilities estimated. When this number is high, the BWCSP approach is used, giving only the most probable dam phase.

4.1.4.2 Parents to progeny transmission

Probabilities of transmission p(t | hs_i, hd_ij, M_i) are calculated following the algorithm of Elsen et al. (2009). This algorithm needs only very limited computational resources, both in terms of time and space. It limits the exploration of the linkage group to the markers informative for a given position to be traced, and thus performs very fast.

It must be noted that p(t | hs_i, hd_ij, M_i) is a joint probability of transmission events and not the product of the marginals. Indeed, when all genotypes for parents and progeny at a marker are heterozygous and identical (say 1/2), the origins of the alleles received by the progeny are not independent: prob(t_s = 1, t_d = 2) = 0.5 ≠ prob(t_s = 1) × prob(t_d = 2) = 0.25.

4.1.4.3 Penetrance

In the basic model the exponent of the
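The remark that transmission probabilities are joint rather than a product of marginals can be checked by brute-force enumeration at a single marker. The sketch below is not the Elsen et al. (2009) algorithm; it only reproduces the 1/2 x 1/2 -> 1/2 example, assuming (for illustration) that allele 1 sits on the first phase and allele 2 on the second phase of both parents.

```python
from fractions import Fraction
from itertools import product


def joint_transmission(sire_phase, dam_phase, progeny_genotype):
    """Joint probabilities of (sire phase, dam phase) transmitted at one marker.

    sire_phase / dam_phase: dict mapping phase number (1 or 2) to its allele.
    progeny_genotype: pair of alleles observed in the progeny (unordered).
    """
    observed = sorted(progeny_genotype)
    weights = {}
    for t_sire, t_dam in product((1, 2), repeat=2):
        # Each of the four events has prior probability 1/4; keep those
        # compatible with the observed progeny genotype.
        if sorted((sire_phase[t_sire], dam_phase[t_dam])) == observed:
            weights[(t_sire, t_dam)] = Fraction(1, 4)
    total = sum(weights.values())
    if total == 0:
        raise ValueError("progeny genotype incompatible with the parents")
    return {event: w / total for event, w in weights.items()}


phase = {1: "1", 2: "2"}                      # assumed phase of both parents
joint = joint_transmission(phase, phase, ("1", "2"))

p_sire_1 = sum(p for (ts, _), p in joint.items() if ts == 1)
p_dam_2 = sum(p for (_, td), p in joint.items() if td == 2)

print(joint[(1, 2)])        # 1/2
print(p_sire_1 * p_dam_2)   # 1/4
```

The joint probability of "sire phase 1 and dam phase 2" is 1/2, while the product of the two marginals is 1/4, which is the discrepancy pointed out in the text.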
78. y in_paramsimul in the parameter file (example: param sim) must be provided by the user. This file contains the information needed for the simulation:
- QTL information (keyword QTL)
- Trait information (keyword TRAITS)

performance 95001 1 0 2 23 2 2 2 95003 2 0 4 21 5 1 3 95004 2 0 6 52 2 4 5 01 opt chromosome 1 2 3 in paramsim param sim

When N QTL are simulated with N > 0 (rejection thresholds for the test of N QTL vs N + q QTLs segregating, or analyses for power of detection), the QTL is supposed to be biallelic with alleles Q1 and Q2. f1 is the frequency of the first allele in the grand sire population, simulated as being equal to the frequency of the second allele in the grand dam population. As a result, the expected genotype frequencies in the parental population are:

Q1Q1: f1 (1 - f1)
Q1Q2: f1 f1 + (1 - f1)(1 - f1)
Q2Q2: (1 - f1) f1

To get, for instance, all parents heterozygous, the frequency f1 must be given the value 1 or 0.

The specific QTL keyword on the first line is mandatory to simulate QTL effects. Next, the number of QTLs to be simulated is given. The user defines the QTL with a "keyword value" format:
- After keyword position: positions on the chromosome, in Morgan units
- After keyword chromosome: chromosome where they are located
- After keyword frequency: frequency of one of the QTL alleles in the grand sire population

The specific TRA
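The expected genotype frequencies in the parental population follow directly from crossing a grand sire population with Q1 frequency f1 and a grand dam population with Q1 frequency 1 - f1. A small sketch of that arithmetic (the function name is illustrative and not part of QTLMap):

```python
def parental_genotype_frequencies(f1):
    """Expected QTL genotype frequencies in the parental population.

    f1: frequency of allele Q1 in the grand sire population, which is also the
        frequency of allele Q2 in the grand dam population.
    """
    if not 0.0 <= f1 <= 1.0:
        raise ValueError("f1 must lie in [0, 1]")
    sire_q1, sire_q2 = f1, 1.0 - f1   # grand sire allele frequencies
    dam_q1, dam_q2 = 1.0 - f1, f1     # grand dam allele frequencies
    return {
        "Q1Q1": sire_q1 * dam_q1,
        "Q1Q2": sire_q1 * dam_q2 + sire_q2 * dam_q1,
        "Q2Q2": sire_q2 * dam_q2,
    }


print(parental_genotype_frequencies(1.0))   # all parents heterozygous
print(parental_genotype_frequencies(0.5))   # 0.25 / 0.5 / 0.25
```

Setting f1 to 1 (or 0) gives a heterozygote frequency of 1, which matches the statement that all parents are then heterozygous.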
