        ASReml User Guide - VSN International
         Contents
1. Oat yields by block, variety and nitrogen level:

                        nitrogen
   block  variety   0.0cwt  0.2cwt  0.4cwt  0.6cwt
   I      GR           111     130     157     174
          M            117     114     161     141
          V            105     140     118     156
   II     GR            61      91      97     100
          M             70     108     126     149
          V             96     124     121     144
   III    GR            68      64     112      86
          M             60     102      89      96
          V             89     129     132     124
   IV     GR            74      89      81     122
          M             64     103     132     133
          V             70      89     104     117
   V      GR            62      90     100     116
          M             80      82      94     126
          V             63      70     109      99
   VI     GR            53      74     118     113
          M             89      82      86     104
          V             97      99     119     121

15.2 Split plot design - Oats

A standard analysis of these data recognises the two basic elements inherent in the experiment. These are firstly the stratification of the experiment units, that is the blocks, whole-plots and sub-plots, and secondly, the treatment structure that is superimposed on the experimental material. The latter is of prime interest, in the presence of stratification. Thus the aim of the analysis is to examine the importance of the treatment effects while accounting for the stratification and restricted randomisation of the treatments to the experimental units. The ASReml input file is presented below.

   split plot example
    blocks 6        # Coded 1...6 in first data field of oats.asd
    nitrogen !A 4   # Coded alphabetically
    subplots *      # Coded 1...4
    variety !A 3    # Coded alphabetically
    wplots *        # Coded 1...3
    yield
   oats.asd !SKIP 2
   yield ~ mu variety nitrogen variety.nitrogen !r idv(blocks) idv(blocks.wplots)
   residual idv(units)
   predict nitrogen   # Print table of predicted nitrogen means
   predict variety
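As a cross-check on the data entry for the split plot example, the raw marginal means can be computed directly from the table. The sketch below is plain Python, not ASReml output; the names `NITROGEN`, `YIELDS`, `nitrogen_means` and `variety_means` are illustrative only. Because the design is balanced, these raw means should agree with the means requested by the predict statements, but that is an expectation to verify against the actual output, not a guarantee.

```python
# Yields transcribed from the oats table. YIELDS[block][variety] holds the four
# nitrogen levels in the order 0.0, 0.2, 0.4, 0.6 cwt.
NITROGEN = ["0.0cwt", "0.2cwt", "0.4cwt", "0.6cwt"]
YIELDS = {
    "I":   {"GR": [111, 130, 157, 174], "M": [117, 114, 161, 141], "V": [105, 140, 118, 156]},
    "II":  {"GR": [61, 91, 97, 100],    "M": [70, 108, 126, 149],  "V": [96, 124, 121, 144]},
    "III": {"GR": [68, 64, 112, 86],    "M": [60, 102, 89, 96],    "V": [89, 129, 132, 124]},
    "IV":  {"GR": [74, 89, 81, 122],    "M": [64, 103, 132, 133],  "V": [70, 89, 104, 117]},
    "V":   {"GR": [62, 90, 100, 116],   "M": [80, 82, 94, 126],    "V": [63, 70, 109, 99]},
    "VI":  {"GR": [53, 74, 118, 113],   "M": [89, 82, 86, 104],    "V": [97, 99, 119, 121]},
}


def nitrogen_means(yields):
    """Average yield at each nitrogen level over all blocks and varieties."""
    totals = [0.0] * len(NITROGEN)
    count = 0
    for varieties in yields.values():
        for row in varieties.values():
            for k, y in enumerate(row):
                totals[k] += y
            count += 1  # one whole-plot (block x variety) row per increment
    return {NITROGEN[k]: totals[k] / count for k in range(len(NITROGEN))}


def variety_means(yields):
    """Average yield of each variety over all blocks and nitrogen levels."""
    sums, counts = {}, {}
    for varieties in yields.values():
        for name, row in varieties.items():
            sums[name] = sums.get(name, 0.0) + sum(row)
            counts[name] = counts.get(name, 0) + len(row)
    return {name: sums[name] / counts[name] for name in sums}


if __name__ == "__main__":
    print(nitrogen_means(YIELDS))
    print(variety_means(YIELDS))
```

A disagreement between these raw means and the predicted means would usually point to a coding problem in the data file (for example, factor levels read in a different order) before it points to the model.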
2. Figure 15.14 Plot of fitted cubic smoothing spline for model 1

A quick look suggests this is fine until we look at the predicted curves in Figure 15.14. The fit is unacceptable because the spline has picked up too much curvature, which suggests that there may be systematic non-smooth variation at the overall level. This can be formally examined by including the fac(age) term as a random effect. This increased the log-likelihood by 3.71 (P < 0.05), with the spl(age,7) smoothing constants heading to the boundary. There is a possible explanation in the season factor. When this is added (Model 3) it has an F ratio of 107.5 (P < 0.01) while the fac(age) term goes to the boundary. Notice that the inclusion of the fixed term season in models 3 to 6 means that comparisons with models 1 and 2 on the basis of the log-likelihood are not valid. The spring measurements are lower than the autumn measurements, so growth is slower in winter. Models 4 and 5 successively examined each term, indicating that both smoothing constants are significant (P < 0.05). Lastly we add the covariance parameter between the intercept and slope for each tree in model 6. This ensures that the covariance model will be translation invariant. A portion of the output file for model 6 is

   6 LogL=-87.5371 S2= 5.9488 32 df
   7 LogL=-87.4342 S2= 5.6885 32 df
   8 LogL=-87.4291 S2= 5.6434 32 df
   9 LogL=-87.4291 S2= 5.6412 32 df
3. ASReml User Guide
Release 4.1
Functional Specification

A R Gilmour, VSN International, Hemel Hempstead, United Kingdom
B J Gogel, University of Adelaide, Australia
B R Cullis, University of Wollongong, Australia
S J Welham, VSN International, Hemel Hempstead, United Kingdom
R Thompson, Rothamsted Research, Harpenden, United Kingdom

April 8, 2015

ASReml is a statistical package that fits linear mixed models using Residual Maximum Likelihood (REML). It was a joint venture between the Biometrics Program of NSW Department of Primary Industries and the Biomathematics Unit of Rothamsted Research. Statisticians in Britain and Australia have collaborated in its development.

Main authors:
A. R. Gilmour, B. J. Gogel, B. R. Cullis, S. J. Welham and R. Thompson

Other contributors:
D. Butler, M. Cherry, D. Collins, G. Dutkowski, S. A. Harding, K. Haskard, A. Kelly, S. G. Nielsen, A. Smith, A. P. Verbyla and I. M. S. White.

Author email addresses:
arthur.gilmour@cargovale.com.au, beverley.gogel@adelaide.edu.au, bcullis@uow.edu.au, sue.welham@vsni.co.uk, robin.thompson@rothamsted.ac.uk

Copyright Notice
Copyright © 2014 VSN International. All rights reserved.
Except as permitted under the Copyright Act 1968 (Commonwealth of Australia), no part of the
4. • if the correlation structure ar1(column).ar1(row) was specified, ASReml would automatically add a common variance (see Section 7.4),

• ASReml would report an error if the consolidated model term ar1v(column).ar1v(row) was specified, as this would correspond to $\mathrm{var}(e) = \sigma_c^2 \Sigma_c(\rho_c) \otimes \sigma_r^2 \Sigma_r(\rho_r)$, and $\sigma_c^2$ and $\sigma_r^2$ are unidentifiable, that is, it is not possible to estimate them separately (see Section 7.4).

7.5 A sequence of variance structures for the NIN data

3c Two dimensional separable autoregressive spatial model with measurement error

   NIN Alliance Trial 1989
    variety !A
    id
    ...
    row 22
    column 11
   ...
   ... idv(repl) ...
   residual ar1v(column).ar1(row)

This model extends 3b by adding a random units term. Thus $\mathrm{var}(u_r) = \sigma_r^2 I$, $\mathrm{var}(u_\eta) = \sigma_\eta^2 I$ and $\mathrm{var}(e) = \sigma_e^2 \Sigma_c(\rho_c) \otimes \Sigma_r(\rho_r)$. The reserved word units tells ASReml to construct an additional random term with one level for each experimental unit so that a second (independent) error term can be fitted. A units term is fitted in the model in cases like this, where a variance structure is applied to the errors. An IDV variance structure is specified for units to model $\sigma_\eta^2 I$. The units term is sometimes fitted in spatial models for field trial data to allow for a nugget effect. The model now has two terms at the plot (experimental unit) level, that is, a correlated structure defined as an R structure and an uncorrelated structure defined in the G structure
5. The sample variogram reported by ASReml has two forms depending on whether the spatial coordinates represent a complete rectangular lattice (as typical of a field trial) or not. In the lattice case, the sample variogram is calculated from the triple $(l_{ij1}, l_{ij2}, v_{ij})$ where $l_{ij1} = s_{i1} - s_{j1}$ and $l_{ij2} = s_{i2} - s_{j2}$ are the displacements. As there will be many $v_{ij}$ with the same displacements, ASReml calculates the means for each displacement pair $(l_{ij1}, l_{ij2})$, either ignoring the signs (default) or separately for same sign and opposite sign (!TWOWAY), after grouping the larger displacements: 9-10, 11-14, 15-20, .... The result is displayed as a perspective plot (see page 235) of the one or two surfaces indexed by absolute displacement group. In this case, the two directions may be on different scales.

Otherwise ASReml forms a variogram based on polar coordinates. It calculates the distance between points, $d_{ij} = \sqrt{l_{ij1}^2 + l_{ij2}^2}$, and an angle $\theta_{ij}$ ($-180 < \theta_{ij} < 180$) subtended by the line from $(0, 0)$ to $(l_{ij1}, l_{ij2})$ with the x-axis. The angle can be calculated as $\theta_{ij} = \tan^{-1}(l_{ij1}/l_{ij2})$, choosing $0 < \theta_{ij} < 180$ if $l_{ij2} > 0$ and $-180 < \theta_{ij} < 0$ if $l_{ij2} < 0$. Note that the variogram has angular symmetry in that $v_{ij} = v_{ji}$, $d_{ij} = d_{ji}$ and $|\theta_{ij} - \theta_{ji}| = 180$. The variogram presented averages the $v_{ij}$ within 12 distance classes and 4, 6 or 8 sectors (selected using a !VGSECTORS qualifier) centred on an angle of $(i - 1)\,180/s$ ($i = 1,$
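The polar-coordinate step can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not ASReml's implementation: the function names are hypothetical, the angle is taken with `atan2` of the second displacement over the first (the angle with the x-axis, which matches the stated sign rule that the angle is positive when the second displacement is positive), and the folding of angles into a half circle before assigning sectors is an assumption based on the angular symmetry noted above.

```python
import math


def polar_displacement(si, sj):
    """Distance d_ij and angle theta_ij (degrees) between points si and sj.

    si and sj are (coordinate 1, coordinate 2) pairs. The angle is measured
    from the x-axis and lies in (-180, 180].
    """
    l1 = si[0] - sj[0]
    l2 = si[1] - sj[1]
    d = math.hypot(l1, l2)
    theta = math.degrees(math.atan2(l2, l1))
    return d, theta


def sector_index(theta, s):
    """Assign an angle to one of s sectors centred on (i - 1) * 180 / s, i = 1..s.

    Assumption: angles that differ by 180 degrees fall in the same sector,
    so theta is first folded into [0, 180).
    """
    width = 180.0 / s
    folded = theta % 180.0
    return int(((folded + width / 2.0) // width) % s) + 1


if __name__ == "__main__":
    d, theta = polar_displacement((3.0, 4.0), (0.0, 0.0))
    print(d, theta, sector_index(theta, 4))
```

Swapping the two points leaves the distance unchanged and shifts the angle by 180 degrees, so both orderings of a pair land in the same sector, which is the symmetry the text relies on.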
6. qualifier / action (qualifiers: !SLOW n, !TOLERANCE [s1 [s2]], !VRB)

!SLOW n
   reduces the update step sizes of the variance parameters more persistently than the !STEP r qualifier. If specified, ASReml looks at the potential size of the updates and if any are large, it reduces the size of r. If n is greater than 10, ASReml also modifies the Information matrix by multiplying the diagonal elements by n. This has the effect of further reducing the updates. In the iteration subroutine, if the calculated LogL is more than 1.0 less than the LogL for the previous iteration and !SLOW is set and NIT>1, ASReml immediately moves the variance parameters back towards the previous values and restarts the iteration.

!TOLERANCE [s1 [s2]]
   modifies the ability of ASReml to detect singularities in the mixed model equations. This is intended for use on the rare occasions when ASReml detects singularities after the first iteration (they are not expected).

   Normally (when no !TOLERANCE qualifier is specified), a singularity is declared if the adjusted sum of squares of a covariable is less than a small constant ($\eta$) or less than the uncorrected sum of squares $\times\,\eta$, where $\eta$ is $10^{-8}$ in the first iteration and $10^{-10}$ thereafter. The qualifier scales $\eta$ by $10^{s_1}$ and $10^{s_2}$ for the first or subsequent iterations respectively, so that it is more likely an equation will be declared singular. Once a singularity is detected, the corresponding equation is dropped (forced to be zero) in subsequent iterations. If ne
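The singularity rule described for !TOLERANCE can be written as a small Python sketch. This is a reading of the text above, not ASReml source code: the function name and arguments are hypothetical, the value of the constant after the first iteration is taken to be 1e-10 (an assumption, since that exponent is hard to read in the excerpt), and `s1` and `s2` default to 0 to represent a run without the qualifier.

```python
def is_singular(adjusted_ss, uncorrected_ss, iteration, s1=0, s2=0):
    """Return True if an equation would be declared singular under the stated rule.

    adjusted_ss    -- adjusted sum of squares of the covariable
    uncorrected_ss -- uncorrected sum of squares of the covariable
    iteration      -- 1 for the first iteration, larger for later ones
    s1, s2         -- powers of ten scaling the threshold (0 means unscaled)
    """
    if iteration == 1:
        eta = 1e-8 * 10.0 ** s1
    else:
        eta = 1e-10 * 10.0 ** s2  # 1e-10 is an assumed value, see the lead-in
    return adjusted_ss < eta or adjusted_ss < uncorrected_ss * eta


if __name__ == "__main__":
    # The same covariable is flagged in iteration 1 but not later,
    # unless the later threshold is raised through s2.
    print(is_singular(1e-9, 1.0, iteration=1))
    print(is_singular(1e-9, 1.0, iteration=2))
    print(is_singular(1e-9, 1.0, iteration=2, s2=2))
```

Raising `s1` or `s2` enlarges the threshold, which is why the qualifier makes it more likely that an equation is declared singular.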
7. \[
G = \bigoplus_{i=1}^{b'} G_i =
\begin{bmatrix}
G_1 & 0 & \cdots & 0 \\
0 & G_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & G_{b'}
\end{bmatrix}
\]

where $\oplus$ is the direct sum operator, each $G_i$ is of size $q_i$ and $q = \sum_{i=1}^{b'} q_i$.

The default assumption is that each random model term generates one component of this direct sum; then $b' = b$ and $\mathrm{var}(u_i) = G_i$ for $i = 1 \ldots b$. This means that the random effects from any two distinct model terms are uncorrelated. However, in some models, one component of G may apply across several model terms, for example in random coefficient regression where the random intercepts and slopes for subjects are correlated. To accommodate these cases, one component of G may apply across several model terms; then $b' < b$. In some other (less likely but possible) cases, we may wish to separate one model term over several independent parts; then $b' > b$ (see Section 7.2.1).

Example 2.2 Variance components mixed models

Building example 2.1 to a linear mixed model with more than one ($b > 1$) random effect (typically known as a variance components mixed model), the random effects $u_i$ in $u$ and the residual errors $e$ are assumed pairwise uncorrelated and to each be normally distributed with mean zero and variance given by

\[ \mathrm{var}(u_i) = \sigma_i^2 I_{q_i} \]

and

\[ \mathrm{var}(e) = \sigma_e^2 I_n \]

where $I_{q_i}$ and $I_n$ are identity matrices of dimension $q_i$ and $n$ respectively. In this case

\[ \mathrm{var}(y) = \sum_{i=1}^{b} \sigma_i^2 Z_i Z_i' + \sigma_e^2 I_n \qquad (2.5) \]

2.1.4 Partitioning the residual error term

As for the fixed and random
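The variance of y in a variance components model (equation 2.5 in Example 2.2) can be made concrete with a toy calculation. The sketch below uses plain Python lists; the function names and the small design matrix are illustrative assumptions, not part of the guide.

```python
def z_z_transpose(Z):
    """Return Z Z' for a matrix Z given as a list of rows."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in Z] for row_i in Z]


def variance_of_y(Z_list, sigma2_list, sigma2_e):
    """Sum of sigma_i^2 * Z_i Z_i' over the random terms, plus sigma_e^2 * I_n."""
    n = len(Z_list[0])
    V = [[sigma2_e if r == c else 0.0 for c in range(n)] for r in range(n)]
    for Z, s2 in zip(Z_list, sigma2_list):
        ZZt = z_z_transpose(Z)
        for r in range(n):
            for c in range(n):
                V[r][c] += s2 * ZZt[r][c]
    return V


if __name__ == "__main__":
    # Four observations, one random term with two levels (two records per level).
    Z1 = [[1, 0], [1, 0], [0, 1], [0, 1]]
    V = variance_of_y([Z1], [2.0], 1.0)
    for row in V:
        print(row)
```

In the toy case, records that share a level of the random term have covariance equal to that term's variance component, records from different levels are uncorrelated, and every diagonal element is the sum of the component and the residual variance.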
8. 3.4.2 The title line

The first text (non-blank, non-control) line in an ASReml command file is taken as the title for the job and is purely descriptive for future reference.

   NIN Alliance trial 1989
    variety !A
    id
    ...

3.4.3 Reading the data

The data fields are defined before the data file name is specified. Field definitions must be given for all fields in the data file and in the order in which they appear in the data file. Note that in previous releases data field definitions had to be indented, but in Release 4 this condition has been relaxed and is not required. In this case there are 11 data fields (variety ... column) in nin89.asd, see Section 3.3. The !A after variety tells ASReml that the first field is an alphanumeric factor, and the 4 after repl tells ASReml that the field called repl (the fifth field read) is a numeric factor with 4 levels coded 1:4. Similarly for row and column. The other fields include variates (yield) and various other variables.

   NIN Alliance trial 1989
    variety !A
    id
    pid
    raw
    repl 4
    nloc
    yield
    lat
    long
    row 22
    column 11
   nin89.asd !skip 1

3.4.4 The data file line

   NIN alliance trial 1989
    variety !A
    id
    pid
    ...

The data file name is specified immediately after the last data field definition. Data file qualifiers that relate to data input and output are also placed on this line if they are required. In this example, !skip 1 tells ASReml to ignore (skip) the first
9. qualifier / action (qualifiers: !EXTRA n, !FOWN)

   modifies the algorithm used for choosing the order for solving the mixed model equations. A new algorithm devised for release 2 is now the default and is formally selected by !EQORDER 3. The algorithm used for release 1 is essentially that selected by !EQORDER 1. The new order is generally superior. !EQORDER -1 instructs ASReml to process the equations in the order they are specified in the model. Generally this will make a job much slower, if it can run at all. It is useful if the model has a suitable order, as in the IBD model
      Y ~ mu !r giv(id) id
   where giv(id) invokes a dense inverse of an IBD matrix and id has a sparse structured inverse of an additive relationship matrix. While !EQORDER 3 generates a more sparse solution, !EQORDER -1 runs faster.

!EXTRA n
   forces another mod(n,10) rounds of iteration after apparent convergence. The default for n is 1. This qualifier has lower priority than !MAXIT and ABORTASR.NOW (see !MAXIT for details).
   Convergence is judged by changes in the REML log-likelihood value and variance parameters. However, sometimes the variance parameter convergence criterion has not been satisfied.

!FOWN
   allows the user to specify the test reported in the F-con column of the Wald F Statistics table. It has the form
      !FOWN terms-to-test ; background-terms
   placed on a separate line immediately after the model line. Multiple !FOWN statements should appear together. It generates a Wald F statistic for each model term in
10. Warning messages:

   Warning: Dropped records were not evenly distributed across
   Warning: Eigen analysis check of US matrix skipped
   WARNING: Extra lines on the end of the input file

Likely meanings / remedies:

   • This is to reduce the number of knot points used in fitting a spline.
   • data values should be positive.
   • usually means the variance model is overparameterized. Look up !AISING.
   • the structures are probably at the boundary of the parameter space.
   • either use !MVINCLUDE or delete the records.
   • it is better to avoid negative weights unless you can check ASReml is doing the correct thing with them.
   • check the data summary has the correct number of records, and all variables have valid data values. If ASReml does not find sufficient values on a data line, it continues reading from the next line.
   • You have probably mis-specified the number of levels in the factor or omitted the !I qualifier (see Section 5.4 on data field definition syntax). ASReml corrects the number of levels.
   • the term did not appear in the model.
   • the term did not appear in the model.
   • terms like units and mv cannot be included in prediction.
   • !RECODE may be needed when using a pedigree and reading data from a binary file that was not prepared with ASReml.
   • suggest drop the term and refit the model.
   • !MVREMOVE has been used to delete records which have a missing value in design variables. This has resulted in multivariate data no longer having an n x t (n subjects with t traits each) structure
11. Table 15.8 Estimated variance components from univariate analyses of bloodworm data. (a) Model with homogeneous variance for all terms and (b) Model with heterogeneous variance for interactions involving tmt

                           (a)          (b)
   source                            control   treated
   variety                2.378           2.334
   tmt.variety            0.492       1.505    -0.372
   run                    0.321           0.319
   tmt.run                1.748       1.388     2.223
   variety.run (pair)     0.976           0.987
   tmt.pair               1.315       1.156     1.359
   REML log-likelihood  -345.256        -343.22

The estimated variance components from this analysis are given in column (a) of Table 15.8. The variance component for the variety main effects is large. There is evidence of tmt.variety interactions, so we may expect some discrimination between varieties in terms of tolerance to bloodworms.

Given the large difference (p < 0.001) between tmt means we may wish to allow for heterogeneity of variance associated with tmt. Thus we fit a separate variety variance for each level of tmt so that instead of assuming $\mathrm{var}(u_2) = \sigma_2^2 I_{88}$ we assume

\[
\mathrm{var}(u_2) =
\begin{bmatrix}
\sigma_{2c}^2 & 0 \\
0 & \sigma_{2t}^2
\end{bmatrix}
\otimes I_{44}
\]

where $\sigma_{2c}^2$ and $\sigma_{2t}^2$ are the tmt.variety interaction variances for control and treated respectively. This model can be achieved using a diagonal variance structure for the treatment part of the interaction. We also fit a separate run variance for each level of tmt, and heterogeneity at the residual level by including the uni(tmt,2) term. We have chosen level 2 of tmt as we expect more variation for the exposed treatment and thus the extra varianc
12. Figure 15.11 Estimated difference between control and treated for each variety plotted against estimate for control (axes: control BLUP, exposed BLUP minus control BLUP)

The independence of $\varepsilon$ and $u_c$ and dependence between $\delta$ and $u_c$ is clearly illustrated in Figures 15.10 and 15.11. In this example the two measures have provided very different rankings of the varieties. The choice of tolerance measure depends on the aim of the experiment. In this experiment the aim was to identify tolerance which is independent of inherent vigour, so the deviations from regression measure is preferred.

15.9 Balanced longitudinal data - Random coefficients and cubic smoothing splines - Oranges

We now illustrate the use of random coefficients and cubic smoothing splines for the analysis of balanced longitudinal data. The implementation of cubic smoothing splines in ASReml was originally based on the mixed model formulation presented by Verbyla et al. (1999). More recently the technology has been enhanced so that the user can specify knot points; in the original approach the knot points were taken to be the ordered set of unique values of the explanatory variable. The specification of knot points is particularly useful if the number of unique valu
13. • labels for the data fields in the data file and the name of the data file,
• the linear mixed model and the variance model(s) if required,
• output options including directives for tabulation and prediction.

Below is the ASReml command file for an RCB analysis of the NIN field trial data highlighting the main sections. Note the order of the main sections.

   title line -->                              NIN Alliance trial 1989
   data field definition -->                    variety !A
                                                id
                                                pid
                                                raw
                                                repl 4
                                                nloc
                                                yield
                                                lat
                                                long
                                                row 22
   data field definition -->                    column 11
   data file name and qualifiers -->           nin89.asd !skip 1
   tabulate statement -->                      tabulate yield ~ variety
   linear mixed model definition -->           yield ~ mu variety !r idv(repl)
   residual variance model specification -->   residual idv(units)
   predict statement -->                       predict variety

3.4.1 Generating a template

ASReml can generate a basic command file (a template for you to modify) from the data file if the data file has suitable field (variable) names in the first line. The requirements are

• the data file has file name extension .asd, .csv, .dat or .txt,
• there is not a matching command file already existing,
• the first line of the file contains a "name" for each field,
• the "name" must begin with a letter; it may contain numbers and the underscore character but not certain special characters (among them & < >),
• the "name" may be terminated with !P
14. for a model factor, various qualifiers are required depending on the form of the factor coding, where n is the number of levels of the factor and s is a list of labels (or the name of a file containing the labels one per row) to be assigned to the levels:

* or n
   is used when the data field has values 1...n directly coding for the factor, unless the levels are to be labelled (see !L); Row 12 for example.

!L s
   is used when the data field is numeric with values 1...n and labels are to be assigned to the n levels, for example Sex !L Male Female

!A [n]
   is required if the data field is alphanumeric, for example Location !A names. Specify n if there are more than 1000 classes over all class factor variables, indicating the expected number for this factor.

!A !L s
   is used if the data field is alphanumeric and must be coded in a particular order, to set the order of the levels. For example
      SNP !A !L C:C C:T T:T
   defines the levels, over-riding the default (data dependent) order. If there are many labels, they may be written over several lines by using a trailing comma to indicate continuation of the list. New in R4: alternatively, the labels may be listed in a file. If the filename includes embedded blanks, or has no file extension, it must be enclosed in quotes.
      Genotype !A !L MyNames.txt
      Genotype !A !L "My Names.txt"
      Genotype !A !L "MyNames"
   Use a !SKIP qualifier after the filename to skip any heading li
15. structure is used; this may be used to obtain starting values for another run of ASReml,

• a table showing the variance components for each iteration,
• a figure and table showing the variance partitioning for any XFA structures fitted,
• some statistics derived from the residuals from two-dimensional data (multivariate, repeated measures or spatial):
   - the residuals from a spatial analysis will have the units part added to them (defined as the combined residual) unless the data records were sorted (within ASReml), in which case the units and the correlated residuals are in different orders (data file order and field order respectively),
   - the residuals are printed in the .yht file but the statistics in the .res file are calculated from the combined residual,
   - the Covariance/Variance/Correlation (C/V/C) matrix calculated directly from the residuals; it contains the covariance below the diagonals, the variances on the diagonal and the correlations above the diagonal.

The fitted matrix is the same as is reported in the .asr file and, if the LogL has converged, is the one you would report. The BLUPs matrix is calculated from the BLUPs and is provided so it can be used as starting values when a simple initial model has been used and you are wanting to attempt to fit a full unstructured matrix. For computational reasons, it pertains to the parameters and so may differ from the parameter values generated by the last iteration. The BLUPs matrix may look quit
16. to fit a slide-specific regression of signal on background. In this example, signal is a multivariate set of 93 variates and background is a set of 93 covariates. The signal values relate to either the Red or Green channels. So for each slide and channel, we need to fit a simple regression of signal ~ mu background. But the data for the 93 slides are presented in parallel. If they were presented in series with a factor slide indexing the slides, the equivalent model would be signal ~ slide slide.background.

6.7 Weights

Weighted analyses are achieved by using !WT weight as a qualifier to the response variable. An example of this is y !WT wt ~ mu A X where y is the name of the response variable and wt is the name of a variate in the data containing weights. If these are relative weights (to be scaled by the units variance) then this is all that is required. If they are absolute weights, that is, the reciprocal of known variances, use the !GF qualifier to fix the variances in the residual model (Section 7.3). When a structure is present in the residuals (Section 7.3) the weights are applied as a matrix product. If $\Sigma$ is the structure and $W$ is the diagonal matrix constructed from the square root of the values of the variate weight, then $R^{-1} = W \Sigma^{-1} W$. Negative weights are treated as zeros.

6.8 Generalized Linear (Mixed) Models

ASReml includes facilities for fitting the family of Generalized Linear Models (GLMs, McCullagh and Nelder, 1994). A GLM is defi
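The weighting rule from Section 6.7 (weights applied as a matrix product, with W built from the square roots of the weights) can be sketched in a few lines of Python. This is an illustration of the algebra only, not ASReml's code: the function name is hypothetical, and the inverse of the residual structure is passed in directly so that no matrix inversion is needed.

```python
import math


def weighted_precision(sigma_inv, weights):
    """Return W * sigma_inv * W, where W = diag(sqrt(weights)).

    sigma_inv -- inverse of the residual structure, as a list of rows
    weights   -- one weight per record; negative weights are treated as zero,
                 as stated in the text
    """
    root = [math.sqrt(w) if w > 0 else 0.0 for w in weights]
    n = len(root)
    return [[root[i] * sigma_inv[i][j] * root[j] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    # With an identity structure the result is simply diag(weights),
    # and the record with a negative weight drops out.
    print(weighted_precision(identity, [4.0, -1.0]))
```

Because W is diagonal, element (i, j) of the product is the (i, j) element of the inverse structure multiplied by the square roots of weights i and j, which is what the list comprehension computes.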
17. \[
\mathrm{var}\begin{bmatrix} u_I \\ u_S \end{bmatrix}
= \begin{bmatrix} \sigma_{II} & \sigma_{IS} \\ \sigma_{IS} & \sigma_{SS} \end{bmatrix} \otimes I_{10}
= \begin{bmatrix} \sigma_{II} I_{10} & \sigma_{IS} I_{10} \\ \sigma_{IS} I_{10} & \sigma_{SS} I_{10} \end{bmatrix}
\]

Here, the set of animal intercepts has a common variance ($\sigma_{II}$) and the set of animal slopes has a (different) common variance ($\sigma_{SS}$). Intercepts and/or slopes from two different animals are independent, but the intercept and slope from any given animal have covariance $\sigma_{IS}$ (or correlation $\sigma_{IS}/\sqrt{\sigma_{II}\sigma_{SS}}$). In this context, we use integers as arguments to emphasize that the arguments are specifying the size of the variance structure. For this example, id(10) can be replaced by id(Animal). In order to simplify processing of the str() arguments, ASReml expects at least one single term in the consolidated model term to be a variance model function with a dimension rather than a variable name as the argument, e.g. us(2) in the example. Mostly this is quite natural, as a suitable factor is not normally available to indicate the number of linear model terms being combined (2 in this example). The dummy identity function id(1) could be introduced to allow processing if the consolidated model term could only be expressed using variable arguments, for example:

   str(Sire and(Dam) id(1).nrm(Animal))

This random regression model has been developed to describe the form of the str() function. We note that this model is equivalent to

   us(pol(age)).id(Animal)

Example 7.2 Fitting a genetic covariance between direct and materna
18.    0 7210E 01  0 7940  0 4170E 01 0 8972    Wald F statistics    Source of Variation NumDF F ine  19 Trait age 5 100 141  20 Trait  brr 15 116 72  21 Trait sex 5 47 97  23 Trait age sex 4 4 17  29 diag TrSG123   sex grp 147 effects fitted   37 are zero   26 diag TrAG1245   age grp 196 effects fitted   69 are zero   36 Trait grp 180 effects fitted    65 singular   31 us Trait  sire 460 effects fitted   20 are zero   33 xfai TrDam123   dam 10683 effects fitted   8 are zero   35 us TrLiti234  lit 19484 effects fitted   20 are zero     The REML estimates of all the variance matrices except for the dam components are positive  definite  Heritabilities for each trait can be calculated using the VPREDICT facility of ASReml   The heritability is given by     P  where 0   is the phenotypic variance and is given by  ob  07   0 0   o   recalling that  2 1    2  o      0  s 4 A  1  2 2 2  Oq   474 t Om    In the half sib analysis we only use the estimate of additive genetic variance from the sire  variance component  ASReml then carries out the VPREDICT instructions in the  asr file   stores the instructions in a  pin file and produces the following output in a   pve file     324    15 10 Multivariate animal genetics data   Sheep       ASReml 4 1  01 Dec 2014  Multivariate Sire  amp   coopmf3 pvc created 27 Mar 2015 10     id units   us Trait    us Trait   us  Trait    us Trait   us  Trait   2e Trait  as  Trait   us  Trait   us  Trait   we  Trait   as Trait    us Trait   us  Trait   fast
19.    103 resid 67 1 2097 0 34725   104 resid 68 0 24528 0 321 1T9E 01   105 resid 69 4 5409 0 21411   106 resid 70 0 85028 0 10023   107 resid 71 2 4831 0 12849   108 resid 72 0 78609E 01 0 11170E 01   109 resid 73 0 11589 0 99338E 01   110 resid 74 1 6318 0 49595E 01  WWTh2   Direct 2 75 phen 1 60  0 1507 0 0396  YWTh2   Direct 2 77 phen 3 62  0  2991 0 0626  GFWh2   Direct 2 80 phen 6 65  0 3087 0 0717  FDMh2   Direct 3 84 phen 10 69  0 1344 CLOTS  FATh2   Direct 3 89 phen 15 74  0 0785 0 0388  GenCor 2 1   us Tr 24 SQR us Tr 23 us Tr 25   0 7045 0 1024  GenCor 3 1   us Tr 26 SQR us Tr 23 us Tr 28   0 2970 0 1720  GenCor 3 2   us Tr 27 SQR us Tr 25 us Tr 28   0 0188 0 1808  GenCor 4 1   us Tr 29 SQR us Tr 23 us Tr 32   0 1947 0 3521  GenCor 4 2   us Tr 30 SQR us Tr 25 us Tr 32    0 1326 0 3249  GenCor 4 3   us Tr 31 SQR us Tr 28 us Tr 32   0 0981 0 3874  GenCor 5 1   us Tr 33 SQR us Tr 23 us Tr 37   0 2924 0 2747  GenCor 5 2   us Tr 34 SQR us Tr 25 us Tr 37   0 5913 0 2026  GenCor 5 3   us Tr 35 SQR us Tr 28 us Tr 37   0 0396 0 2687  GenCor 5 4   us Tr 36 SQR us Tr 32 us Tr 37    0 6577 0 3854  MatCor 2 1   Mater 91 SQR Mater 90 Mater 92   1 4277 0 5305  MatCor 3 1   Mater 93 SQR Mater 90 Mater 95   1 7267 1 4388  MatCor 3 2   Mater 94 SQR Mater 92 Mater 95   3 0703 2 9688    Notice  The parameter estimates are followed by  their approximate standard errors     15 10 2 Animal model    In this section we will illustrate the use of a pedigree file to define the genetic relation
20. 2.3 What are BLUPs?
    2.4 Inference: Random effects
       2.4.1 Tests of hypotheses: variance parameters
       2.4.2 Diagnostics
    2.5 Inference: Fixed effects
       2.5.1 Introduction
       2.5.2 Incremental and conditional Wald F Statistics
       2.5.3 Kenward and Roger adjustments
       2.5.4 Approximate stratum variances
 3 A guided tour
    3.1 Introduction
    3.2 Nebraska Intrastate Nursery (NIN) field experiment
    3.3 The ASReml data file
    3.4 The ASReml command file
       3.4.1 Generating a template
       3.4.2 The title line
       3.4.3 Reading the data
       3.4.4 The data file line
       3.4.5 Tabulation
       3.4.6 Specifying the terms in the mixed model
       3.4.7 Variance structures
       3.4.8 Prediction
    3.5 Running the job
    3.6 Description of output files
       3.6.1 The .asr file
21.    2    0 141595   0 963017  199771   0 286984  3 64374   0 850282  2 48313   3 0 786089E 01   4 0 115894   5 1 63175    NePPWNF WwW    147 effects    1 1 01106  16 0229  0 280259    w N    196 effects    0 132755E 02  0  976533E 03  0 176684E 02  0 208076E 03  460 effects  0 593942  0 677334  1 55632   280482E 01   287861E 02   150192E 01   596227E 01   657014E 01   477561E 02    157854   407282E 01    133338   877122E 03   0 472300E 01  0 326718E 01  14244 effects  0 126746E 01  0 00000  0 661114E 02  1 46479  1 51911  0 110770  19484 effects   1 23 55275      1 53980  2 2 55497   1  0 310141E 01  2 0 450851E 01  3  1  2  3    PUNB    OOGO    OP WNHFPBPWNHRFPWNHYEPNFP BE  I  oD 2 Oo a  amp     1  2  3  1  2  3    0 191030E 01    0 721026E 01    0 794020    0 417001E 01  4 0 897161    0  0    0    0    0  0    0    oS Oo oo    or O    l  oO Oxo oo    oo 2 CO    0     0    0    O     Covariance Variance Correlation Matrix US Residual  9 461 0 5689 0 2355    0 1640    323    Ois      141595   963017  199771    286984  3 64374    850282  2 48313    786089E 01    115894  1 63175    1 01106  16 0229    280259      132755E 02   976533E 03    176684E 02    208076E 03      593942    677334  1 55632   280482E 01   287861E 02   150192E 01   596227E 01   657014E 01   477561E 02    157854   407282E 01    133338   877122E 03   472300E 01   326718E 01    126746E 01  0 00000   661114E 02  1 46479  1 51911   110770    3 55275  1 53980  2 55497   310141E 01   450851E 01  191030E 01   721026E
22.    404860    100269    128460   111660E 01   990547E 01   495973E 01      340424  4 56493   755415E 01     660473E 03   807052E 03    156358E 02    128442E 03     161397    212998    399056    183322E 01   287861E 01   374544E 02   110412   410000   191024E 01   857902E 01   411396E 01   673424E 01   584748E 02  1 53000    163359E 01      422487  0 00000   528891E 02    181736    208097  218051 F 01     416013    15 10 Multivariate animal genetics data   Sheep       45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  T2  73  74  75  76  tf  78  79  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99    us  TrLit1234   us  TrLit1234   us  TrLit1234   us  TrLit1234   us  TrLit1234   us  TrLit1234   us  TrLit1234   us  TrLit1234   us  TrLit1234   Damv  Damv  Damv  Damv  Damv  Damv  phen  phen  phen  phen  phen  phen  phen  phen  phen  phen  phen  phen  phen  phen  phen 15  Direct    OANDARPWNHE    PrRPrRPRE  PWN OO    23  24  25  26  27  28  29  30  31  32  33  34  35  36  Direct 37  Maternal    Direct  Direct  Direct  Direct  Direct  Direct  Direct  Direct  Direct  Direct  Direct  Direct  Direct    54  55  56  57  58  59    Maternal  Maternal  Maternal  Maternal  Maternal  resid 60  resid 61  resid 62  resid 63       id  lit  us TrLiti234    id lit   us  TrLit1234    id lit   us  TrLit1234    id lit   us  TrLit1234    id lit  sus  TrLiti234    id lit   us  TrLit1234   id  lit   us TrLiti234    id
23.    statement. In this case the 56 variety means for yield, as predicted from the fitted model, would be formed and returned in the .pvs output file. See Chapter 9 for a detailed discussion of prediction in ASReml.

           NIN alliance trial 1989
            variety !A
            column 11
           nin89.asd !skip 1
           tabulate yield ~ variety
           yield ~ mu variety !r idv(repl)
           residual idv(units)
           predict variety

       3.5 Running the job

       Assuming you have located the nin89.asd file (under Windows it will typically be located in ASRemlPath\Examples) and created the ASCII command file nin89.as, as described in the previous section, in the same folder, you can run the job. We suggest copying the data file to the user's workspace because the Examples folder is sometimes write protected. ASRemlPath is typically C:\Program Files\ASReml4 under Windows. Installation details vary with the implementation and are distributed with the program. You could use ASReml-W or ConText to create nin89.as. These programs can then run ASReml directly after they have been configured for ASReml. An ASReml job can also be run from a command line or by 'clicking' the .as file in Windows Explorer.

       The basic command to run an ASReml job is

           ASRemlPath\bin\ASReml basename[.as]

       where basename[.as] is the name of the command file. Typically, a system PATH is defined which includes ASRemlPath\bin, so that just the program name ASReml is required at the command prompt. For example, the command to run nin89.as fr
24.    !TSV or !MSV, ASReml will use the file f.rsv, f.tsv or f.msv. If f = filename.xsv (with x = r, t or m) is used with !CONTINUE, !TSV or !MSV, ASReml will use the file f.xsv. If the specified file is not present, ASReml reverts to reading the previous .rsv file. Some users may prefer, rather than specifying initial values in the model formulation, to generate a default .tsv file using !MAXIT 0 and then edit the .tsv file with more appropriate values. If the model has changed and !CONTINUE is used, ASReml will pick up the values it recognises as being for the same terms from the .rsv file. Furthermore, ASReml will use estimates in the .rsv file for certain models to provide starting values for certain more general models, inserting reasonable defaults where necessary. The transitions recognised are listed and discussed in Section 7.9.2.

       5.8 Job control qualifiers

       Table 5.3: List of commonly used job control qualifiers

       qualifier          action

       !CONTRAST s t p    provides a convenient way to define contrasts among treatment levels. !CONTRAST lines occur as separate lines between the datafile line and the model line. s is the name of the model term being defined, t is the name of an existing factor, and p is the list of contrast coefficients. For example,
                              !CONTRAST LinN Nitrogen 3 1 -1 -3
                          defines LinN as a contrast based on the 4 (implied by the length of the list) levels of factor Nitrogen. Missing values in the factor bec

       !DDF [i]

       !FCON
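To make the idea of a contrast such as LinN concrete, the following is a minimal Python sketch (not ASReml syntax) that maps the levels of a four-level factor to linear contrast coefficients, assumed here to be 3, 1, -1, -3. The function name `contrast_covariate` and the level labels are hypothetical, and passing missing levels through as `None` is an assumption of this sketch rather than documented ASReml behaviour.

```python
def contrast_covariate(levels, level_order, coefficients):
    """Map each observed factor level to its contrast coefficient."""
    if len(level_order) != len(coefficients):
        raise ValueError("one coefficient is needed per factor level")
    lookup = dict(zip(level_order, coefficients))
    # Assumption of this sketch: unknown or missing levels become None.
    return [lookup.get(level) for level in levels]


# Hypothetical labels for a four-level Nitrogen factor.
nitrogen_levels = ["0_cwt", "0.2_cwt", "0.4_cwt", "0.6_cwt"]
lin_coefficients = [3, 1, -1, -3]

observed = ["0.2_cwt", "0.6_cwt", "0_cwt", None]
lin_n = contrast_covariate(observed, nitrogen_levels, lin_coefficients)
print(lin_n)
print(sum(lin_coefficients))  # contrast coefficients sum to zero
```

The length of the coefficient list determines how many factor levels the contrast spans, which is why the sketch rejects lists of mismatched length.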
25.    V label i:j, where i:j spans an XFA variance structure, inserts the US matrix based on the XFA parameters.

       12.2 Syntax

       12.2.3 Correlation

       Correlations are requested by lines beginning with an R. The specific form of the directive is

           R label a ab b

       This calculates the correlation $r = \sigma_{ab} / \sqrt{\sigma_a^2 \sigma_b^2}$ and the associated standard error. a, b and ab are integers indicating the position of the components to be used. Alternatively,

           R label a:n

       calculates the correlation $r = \sigma_{ab} / \sqrt{\sigma_a^2 \sigma_b^2}$ for all correlations in the lower triangular row-wise matrix represented by components a to n, and the associated standard errors.

       The example used in this section is

           y1 y2 ~ Trait !r id(sire).us(Trait)
           residual id(units).us(Trait)
           VPREDICT !DEFINE
           F phenvar 4:6 + 1:3    # id(sire).us(Trait) + id(units).us(Trait)
           R phencorr 7:9         # phenvar
           R gencorr 4:6          # sire.us(Trait)

       Note that covariances between ratios and other components are not generated, so the correlations are not numbered and cannot be used to derive other functions. To avoid numbering confusion it is better to include R functions at the end of the VPREDICT block.

       In the example,

           R phencorr 7 8 9   or   R phencorr phenvar

       calculates the phenotypic correlation as component 8 / sqrt(component 7 x component 9), where components 7, 8 and 9 are created with the first line of the .pin file, and

           R gencorr 4:6   or   R gencorr sire.us(Trait)

       calculates the genotypic correlation by calcul
26.    indicates that for the 16th data record, the residuals are 2.35, 6.58 and 5.64 times the respective standard deviations. The standard deviation used in this test is calculated directly from the residuals rather than from the analysis. These statistics are intended to flag the records with large residuals rather than to quantify their relative size precisely. They are not studentised residuals and are generally not relevant when the user has fitted heterogeneous variances.

       Residual statistics for nin89a.asr

           Convergence sequence of variance parameters
           Iteration         1         2         3         4         5         6
           LogL       -449.818  -424.315  -405.419  -399.552  -399.336  -399.325
           Change          177       216       201        51        13         3
           Adjusted          0         0         0         0         0         0
           StepSz        0.316     0.562     1.000     1.000     1.000     1.000
           5 R        0.100000  0.293737  0.481321  0.615630  0.645607  0.653013   1.1
           6 R        0.100000  0.232335  0.358720  0.439779  0.441733  0.439143   0.7

           Plot of Residuals [-24.8729:15.9146] vs Fitted values [16.7728:35.9349]
           [character plot of residuals against fitted values not reproduced]

           SLOPES FOR LOG(ABS(RES)) on LOG(PV) for Section 11
             0.15
           SLOPES FOR LOG S
27.    mean Aw  If a   0  the BLUP in (2.19) becomes

       $$\tilde{u} = \frac{r\sigma_b^2}{r\sigma_b^2 + \sigma^2}\,(\bar{y} - 1\bar{y}_{\cdot\cdot}) \qquad (2.20)$$

       and the BLUP is a so-called shrinkage estimate. As $r\sigma_b^2$ becomes large relative to $\sigma^2$, the BLUP tends to the fixed effect solution, while for small $r\sigma_b^2$ relative to $\sigma^2$ the BLUP tends towards zero, the assumed initial mean. Thus (2.20) represents a weighted mean which involves the prior assumption that the $u_i$ have zero mean.

       Note also that the BLUPs in this simple case are constrained to sum to zero. This is essentially because the unit vector defining X can be found by summing the columns of the Z matrix. This linear dependence of the matrices translates to dependence of the BLUPs and hence constraints. This aspect occurs whenever the column space of X is contained in the column space of Z. The dependence is slightly more complex with correlated random effects.

       2.4 Inference: Random effects

       2.4.1 Tests of hypotheses: variance parameters

       Inference concerning variance parameters of a linear mixed effects model usually relies on approximate distributions for the (RE)ML estimates derived from asymptotic results.

       It can be shown that the approximate variance matrix for the REML estimates is given by the inverse of the expected information matrix (Cox and Hinkley, 1974, section 4.8). Since this matrix is not available in ASReml we replace the expected information matrix by the AI matrix. Further
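The shrinkage behaviour described for (2.20) can be sketched numerically. The Python sketch below assumes a balanced one-way layout with r observations per random effect, known variance components, and a shrinkage weight of the form r*sigma2_u / (r*sigma2_u + sigma2_e); the function names and this exact parameterisation are assumptions made for illustration, not ASReml's internal computation.

```python
def shrinkage_factor(r, sigma2_u, sigma2_e):
    """Weight applied to the deviation of a group mean from the overall mean."""
    return r * sigma2_u / (r * sigma2_u + sigma2_e)


def blup(group_means, r, sigma2_u, sigma2_e):
    """Shrink each group-mean deviation towards zero (balanced case only)."""
    grand_mean = sum(group_means) / len(group_means)
    k = shrinkage_factor(r, sigma2_u, sigma2_e)
    return [k * (m - grand_mean) for m in group_means]


# Hypothetical data: three groups, four observations each.
effects = blup([10.0, 12.0, 14.0], r=4, sigma2_u=1.0, sigma2_e=4.0)
print(effects)        # deviations -2, 0, 2 shrunk by a factor of 0.5
print(sum(effects))   # the BLUPs sum to zero in this simple case

# Large r*sigma2_u relative to sigma2_e: almost no shrinkage.
print(shrinkage_factor(4, 1000.0, 4.0))
# Small r*sigma2_u relative to sigma2_e: effects shrink towards zero.
print(shrinkage_factor(4, 0.001, 4.0))
```

In this sketch the weight moves between 0 and 1, which reproduces the two limiting cases described in the text: the fixed effect solution when the weight approaches 1 and the assumed zero mean when it approaches 0.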
28.    predict variety nitrogen !SED

       The data fields were blocks, wplots, subplots, variety, nitrogen and yield. The first five variables are factors that describe the stratification or experiment design and treatments. The standard split plot analysis is achieved by fitting the model terms blocks and blocks.wplots as random effects. The blocks.wplots.subplots term is not listed in the model because this interaction corresponds to the experimental units and is automatically included as the residual term. The fixed effects include the main effects of both variety and nitrogen and their interaction. The tables of predicted means and associated standard errors of differences (SEDs) have been requested. These are reported in the .pvs file. Abbreviated output is shown below.

           - - - Results from analysis of yield - - -
           Akaike Information Criterion      424.76 (assuming 3 parameters).
           Bayesian Information Criterion    431.04

           Approximate stratum variance decomposition
           Stratum               Degrees-Freedom   Variance    Component Coefficients
           idv(blocks)                  5.00        3175.06     12.0   4.0   1.0
           idv(blocks.wplots)          10.00        601.331      0.0   4.0   1.0
           Residual Variance           45.00        177.083      0.0   0.0   1.0

           Model_Term                       Gamma       Sigma     Sigma/SE   % C
           blocks          IDV_V    6      1.21116     214.477      1.27     0 P
           blocks.wplots   IDV_V   18      0.598937    106.062      1.56     0 P
           idv(units)              72 effects
           Residual        SCA_V   72      1.000000    177.083      4.74     0 P

       15.2 Split plot design - Oats

           Wald F statistics
           Source of Variation     NumDF   DenDF    F-inc    P-inc
           7 mu                        1     5.0   245.14  l
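The stratum variances in the decomposition above are linear combinations of the variance components, with weights given by the component coefficients. The short Python sketch below checks this arithmetic using the figures quoted in the output; the dictionary layout and function name are choices made for this illustration only.

```python
# Variance components (Sigma column) quoted in the output.
components = [214.477, 106.062, 177.083]  # blocks, blocks.wplots, residual

# Component coefficients for each stratum, in the same order.
coefficients = {
    "idv(blocks)": [12.0, 4.0, 1.0],
    "idv(blocks.wplots)": [0.0, 4.0, 1.0],
    "Residual Variance": [0.0, 0.0, 1.0],
}


def stratum_variance(coefs, comps):
    """Weighted sum of the variance components for one stratum."""
    return sum(c * v for c, v in zip(coefs, comps))


stratum = {name: stratum_variance(c, components) for name, c in coefficients.items()}
for name, value in stratum.items():
    print(f"{name:20s} {value:10.3f}")

# Gamma is the component expressed as a ratio to the residual variance.
gammas = [v / components[2] for v in components]
print([round(g, 5) for g in gammas])
```

The computed values agree with the Variance and Gamma columns of the output to the precision printed there.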
29.    = 1 + 4, 8 = 2 + 5 and 9 = 3 + 6.

           F addvar sire.us(Trait) * 4   or   F addvar 4:6 * 4

       creates new components 10 = 4 x 4, 11 = 5 x 4 and 12 = 6 x 4.

           H heritA addvar[1] phenvar[1]   or   H heritA 10 7

       forms 10 / 7 to give the heritability for ywt.

           H heritB addvar[3] phenvar[3]   or   H heritB 12 9

       forms 12 / 9 to give the heritability for fat.

           R phencorr phenvar   or   R phencorr 7 8 9

       forms 8 / sqrt(7 x 9), that is, the phenotypic correlation between ywt and fat.

           R gencorr addvar   or   R gencorr 4:6

       forms 5 / sqrt(4 x 6), that is, the genetic correlation between ywt and fat.

       The resulting .pvc file contains:

           id(units).us(Trait)    8140 effects
            1 id(units).us(Trait);us(Trait)   V 1 1   23.2055        0.522176
            2 id(units).us(Trait);us(Trait)   C 2 1   2.50402        0.134915
            3 id(units).us(Trait);us(Trait)   V 2 2   1.66292        0.506679E-01
           us(Trait).id(sire)      184 effects
            4 us(Trait).id(sire);us(Trait)    V 1 1   1.45821        0.398418
            5 us(Trait).id(sire);us(Trait)    C 2 1   0.130280       0.678542E-01
            6 us(Trait).id(sire);us(Trait)    V 2 2   0.344381E-01   0.169646E-01

       12.3 VPREDICT: PIN file processing

            7 phenvar 1        24.664      0.64250
            8 phenvar 2        2.6343      0.14763
            9 phenvar 3        1.6974      0.52365E-01
           10 addvar 4         5.8328      1.5926
           11 addvar 5         0.52112     0.27168
           12 addvar 6         0.13775     0.67791E-01
           heritA     = addvar 10 / phenvar 7                  =  0.2365  0.0612
           heritB     = addvar 12 / phenvar 9                  =  0.0812  0.0394
           phenco 2 1 = phenv 8 / SQR[phenv 7 * phenv 9]       =  0.4071  0.0183
           gencor 2 1 = addva 11 / SQR[addva 10 * addva 12]    =  0.5814  0.2039
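The point estimates reported for the derived functions can be reproduced from the numbered components with a few lines of arithmetic. The Python sketch below uses the component values quoted in the .pvc listing; it reproduces only the estimates, not the standard errors, and the variable names are chosen for this illustration.

```python
import math

# Component estimates quoted in the .pvc listing, keyed by component number.
units = {1: 23.2055, 2: 2.50402, 3: 1.66292}      # residual (units) stratum
sire = {4: 1.45821, 5: 0.130280, 6: 0.0344381}    # sire stratum

# F phenvar: 7 = 1 + 4, 8 = 2 + 5, 9 = 3 + 6
phenvar = {7: units[1] + sire[4], 8: units[2] + sire[5], 9: units[3] + sire[6]}
# F addvar: 10 = 4 * 4, 11 = 5 * 4, 12 = 6 * 4
addvar = {10: 4 * sire[4], 11: 4 * sire[5], 12: 4 * sire[6]}

herit_a = addvar[10] / phenvar[7]                          # H heritA 10 7
herit_b = addvar[12] / phenvar[9]                          # H heritB 12 9
phencorr = phenvar[8] / math.sqrt(phenvar[7] * phenvar[9])  # R phencorr 7 8 9
gencorr = sire[5] / math.sqrt(sire[4] * sire[6])            # R gencorr 4:6

print(f"heritA   {herit_a:.4f}")
print(f"heritB   {herit_b:.4f}")
print(f"phencorr {phencorr:.4f}")
print(f"gencorr  {gencorr:.4f}")
```

Because a correlation is scale free, computing the genetic correlation from the sire components (4, 5, 6) gives the same value as computing it from the additive components (10, 11, 12), which are the sire components multiplied by 4.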
30.    -171.497 S2= 1.00000 60 df
           12 LogL=-171.496 S2= 1.00000 60 df

           - - - Results from analysis of y1 y3 y5 y7 y10 - - -
           Akaike Information Criterion      354.99 (assuming 6 parameters).
           Bayesian Information Criterion    367.56

           Model_Term                      Sigma       Sigma     Sigma/SE   % C
           id(units).exph(Trait)    70 effects
           Trait   EXP_P   1              0.906843    0.906843    21.88     0 P
           Trait   EXP_V   1              60.8955     60.8955      2.12     0 P
           Trait   EXP_V   2              73.0128     73.0128      1.99     0 P
           Trait   EXP_V   3              309.013     309.013      2.22     0 P
           Trait   EXP_V   4              435.964     435.964      2.52     0 P
           Trait   EXP_V   5              382.312     382.312      2.74     0 P

           Covariance/Variance/Correlation Matrix Residual
             61.05    0.8227   0.6768   0.5568   0.4155
             54.90    72.95    0.8227   0.6768   0.5050
             93.05    123.6    309.6    0.8227   0.6139
             90.95    120.8    302.6    437.0    0.7462
             63.49    84.36    211.2    305.1    382.5

           Wald F statistics
           Source of Variation     NumDF   DenDF    F-inc    P-inc
           8 Trait                     5   18 f    108.25   <.001
           1 tmt                       1   13.1      0.00    0.96
           9 Tr.tmt                    4   21.0      4.37    0.01

       The last two models we fit are the antedependence model of order 1 and the unstructured model. Starting values need not actually be supplied in this example, since the defaults are adequate, but are supplied to demonstrate the syntax. We use the REML estimate of $\Sigma$ from the heterogeneous power model shown in the previous output. The antedependence model models $\Sigma$ by the inverse Cholesky decomposition

           $\Sigma^{-1} = U D U'$

       where D is a diagonal matrix and U is a unit upper triangular matrix. For an antedependence model of order q, $u_{ij} = 0$ for $j > i + q - 1$. The antedependence model of order
32.   AEXP  aexp anisotropic ex  C  1 2 3 2 w  ponential C     dlvi   2al olyi ys   a 1 2  0 lt     lt 10 lt     lt 1  AGAU  agau anisotropic C  1 2 3 2 w  gaussian C    i2   vii    aj 1 2  0 lt     lt 10 lt     lt 1  MATE Mat  rn with C   Mat  rn  see text k k 1 k w  matk first I  lt k  lt 5    gt  0 range  v shape 0 5   parameters a    specaned  by     gt  0 anisotropy ratio 1    the user a anisotropy angle 0    A 1 2  metric 2   heterogeneous variance models  DIAG  diag diagonal   IDH x     4   0 i j     w  idh  US unstructured Diz   Qy T   ewt  us general covari   ance matrix  OWNkK user explicitly E   k  ownk forms V and    OV    150    7 12 Variance models available in ASReml       Details of the variance models available in ASReml                   variance description algebraic number of parameters   structure form  name  variance corr hom het  model variance variance  function  name  ANTE1 1    k order   ES   UDU      eae    aes U   1  U   u  1 lt j i lt k  antek 1  lt  k  lt  w     1 ii H f ij   Ui    0    gt J  CHOL1 1    k order   S LDI      ww   a5  k cholesky D   d  D   0 i  j  CHOLk pear  cholk 1 lt k lt w 1 La 51  Ly 5l  CEA Sk  FA1 p k order X  DCD      w w  fal k factor C FF  E  kw w  FAK analytic F contains k correlation factors  fak E diagonal  DD   diag       FACV 1  1    k order Z IT  9      w w  facvl k factor T contains covariance factors kw w  FACVk analytic W contains specific variance  facvk covari    ance   form  XFA1 1 k order X IT  Y      w w  xfat k
33.    ASReml uses the identifiers obtained from the .grr file to define the order of the factor classes when the data is read; any extra identifiers in the data that are not in the .grr file are appended at the end of the factor level name list. If !NOID is set, identifiers in the .grr file are not needed and, if present, should be skipped using !CSKIP.

       Values are typically TAB, COMMA or SPACE separated but may be packed (no separator) when all values are the integers 0, 1 or 2. Missing values in the regression variables may be represented by * or NA. Invalid data is also treated as missing. Missing values are replaced by the mean of the respective regressor. Alternative missing data methods that involve imputation from neighbouring markers have not been implemented.

       Some general qualifiers are:

       !SAVEGIV instructs ASReml to write the G matrix in .dgiv format.

       !PSD s declares that the derived variance matrix may have up to s singularities.

       !PEV requests calculation of the Prediction Error Variance of marker effects, which are reported in the .mef file. Calculation of prediction error variances is computationally very expensive.

       !CENTRE [c] requests ASReml to centre the regressors at c if c is specified, or else at the individual regressor means; without this qualifier the G matrix is formed from uncentred regressors. Note that centring introduces a singularity in the G matrix, so !PSD s will need to be set.

       Other qualifiers relate specifically to whether the regressors are markers. Markers are t
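The treatment of missing regressor values and the optional centring described above can be sketched as follows. This Python fragment is only an illustration of the two steps (replace missing values by the regressor mean, then centre at c or at the regressor mean); the function name `impute_and_centre` and the use of `None` for a missing value are assumptions of the sketch, not part of ASReml.

```python
def impute_and_centre(column, centre=None):
    """Replace missing values by the column mean, then centre the column.

    column : list of numbers, with None marking a missing value
    centre : value to centre at; if None, the column mean is used
    """
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("regressor has no observed values")
    mean = sum(observed) / len(observed)
    filled = [mean if v is None else v for v in column]
    c = mean if centre is None else centre
    return [v - c for v in filled]


# Hypothetical marker scores coded 0/1/2 with one missing value.
marker = [0, 2, None, 1]
centred_at_mean = impute_and_centre(marker)
centred_at_half = impute_and_centre(marker, centre=0.5)
print(centred_at_mean)
print(centred_at_half)
```

When a regressor is centred at its own mean its values sum to zero, which is the source of the singularity in the G matrix mentioned in the text.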
34.    Fixed effects

       Term     Sums of Squares                                                                            M code
       1        R(1)                                                                                       .
       A        R(A | 1,B,C,B.C) = R(1,A,B,C,B.C) - R(1,B,C,B.C)                                           A
       B        R(B | 1,A,C,A.C) = R(1,A,B,C,A.C) - R(1,A,C,A.C)                                           A
       C        R(C | 1,A,B,A.B) = R(1,A,B,C,A.B) - R(1,A,B,A.B)                                           A
       A.B      R(A.B | 1,A,B,C,A.C,B.C) = R(1,A,B,C,A.B,A.C,B.C) - R(1,A,B,C,A.C,B.C)                     B
       A.C      R(A.C | 1,A,B,C,A.B,B.C) = R(1,A,B,C,A.B,A.C,B.C) - R(1,A,B,C,A.B,B.C)                     B
       B.C      R(B.C | 1,A,B,C,A.B,A.C) = R(1,A,B,C,A.B,A.C,B.C) - R(1,A,B,C,A.B,A.C)                     B
       A.B.C    R(A.B.C | 1,A,B,C,A.B,A.C,B.C) = R(1,A,B,C,A.B,A.C,B.C,A.B.C) - R(1,A,B,C,A.B,A.C,B.C)     C

       Of these, the conditional Wald statistics for the 1, B.C and A.B.C terms would be the same as the incremental Wald statistics produced using the linear model

           y ~ 1 + A + B + C + A.B + A.C + B.C + A.B.C

       The preceding table includes a so-called M (marginality) code reported by ASReml when conditional Wald statistics are presented. All terms with the highest M code letter are tested conditionally on all other terms in the model, i.e. by dropping the term from the maximum model. All terms with the preceding M code letter are marginal to at least one term in a higher group, and so forth. For example, in the table, model term A.B has M code B because it is marginal to model term A.B.C, and model term A has M code A because it is marginal to A.B, A.C and A.B.C. Model term mu (M code .) is a special case in that its test is conditional on all covariates but no factors. Following is some ASReml output from the .aov file which re
35.    This will be a problem if the R structure model assumes n x t data structure.

       the matrix may be OK but ASReml has not checked it.

       this indicates that there are some lines on the end of the .as file that were not used. The first 'extra' line is displayed. This is only a problem if you intended ASReml to read these lines.

       14.5 Information, Warning and Error messages

       Table 14.2: List of warning messages and likely meaning(s)

       warning message (the likely meanings are listed after the messages)

           Warning: Failed to find header blocks to skip.
           Warning: Fewer levels found in term
           Warning: FIELD DEFINITION lines should be INDENTED
           Warning: Fixed levels for factor
           Warning: Initial gamma value is zero
           Warning: Invalid argument
           Warning: It is usual to include Trait in the model
           Warning: LogL Converged; Parameters Not Converged
           Warning: LogL not converged
           Notice: LogL values are reported relative to a base of
           Warning: Missing cells in table
           Warning: More levels found in term
           Warning: PREDICT LINE IGNORED - TOO MANY
           Warning: PREDICT statement is being ignored
           Warning: Second occurrence of term dropped
           Warning: Spatial mapping information for side
           Warning: Standard errors
           Warning: SYNTAX CHANGE: text may be invalid
           Warning: The !A qualifier ignored when reading BINARY data
           Warning: The !SPLINE qualifier has been redefined.

       likely meaning

           The !RSKIP qualifier requested skipping header blocks which were not present.

           ASReml in
36.    When !AILOADINGS i is specified, it also prevents AI updates of some loadings during the first i iterations. For f > 1 factors, only the last factor is estimated, conditional on the earlier ones, in the first f - 1 iterations. Then pairs including the last are estimated until iteration i. If !AILOADINGS is not specified and !CONTINUE is used and initializes the XFA model from a lower order, the i parameter is set internally.

       !AISINGULARITIES    can be specified to force a job to continue even though a singularity was detected in the Average Information (AI) matrix. The AI matrix is used to give updates to the variance parameter estimates. In release 1, if singularities were present in the AI matrix, a generalized inverse was used which effectively conditioned on whichever parameters were identified as singular. ASReml now aborts processing if such singularities appear unless the !AISINGULARITIES qualifier is set. Which particular parameter is singular is reported in the variance component table printed in the .asr file.

       Table 5.5: List of rarely used job control qualifiers

       qualifier           action

                           The most common reason for singularities is that the user has overspecified the model and is likely to misinterpret the results if not fully aware of the situation. Overspecification will occur in a direct product of two unconstrained variance matrices (see Section 7.4), when a random te

       !BMP

       !BRIEF [n]

       !BLUP n
37.    Y1 Y2 Y3 Y4 Y5 ~ Trait Treatment Trait.Treatment

       Table 5.4: List of occasionally used job control qualifiers

       qualifier    action

       !ASMV n      indicates a multivariate analysis is required although the data is presented in a univariate form. 'Multivariate Analysis' is used in the narrow sense where an unstructured error variance matrix is fitted across traits, records are independent, and observations may be missing for particular traits; see Chapter 8 for a complete discussion.

                    The data is presumed arranged in lots of n records, where n is the number of traits. It may be necessary to expand the data file to achieve this structure, inserting a missing value NA on the additional records. This option is sometimes relevant for some forms of repeated measures analysis. There will need to be a factor in the data to code for trait, as the intrinsic Trait factor is undefined when the data is presented in a univariate manner.

       !ASUV        allows you to have an error variance other than $I \otimes \Sigma$, where $\Sigma$ is the unstructured (US, see Table 7.6) variance structure, if the data is presented in a multivariate form. If there are missing values in the data, include !f mv on the end of the linear model. The intrinsic factor Trait is defined and may be used in the model. See Chapter 8 for more information.

                    This option is used for repeated measures analysis when the variance structure required is not the standard multivariate unstructured matri

       !DESIGN
38.    (assuming 4 parameters).
           Bayesian Information Criterion    8493.30

           Model_Term                      Gamma       Sigma     Sigma/SE   % C
           idv(variety)   IDV_V   532     1.06038     88117.5      9.92     0 P
           ar1(row).ar1(column)   670 effects
           Residual       SCA_V   670     1.000000    83100.1      8.90     0 P
           row            AR_R      1     0.685387    0.685387    16.65     0 P
           column         AR_R      1     0.285909    0.285909     3.87     0 P

           Wald F statistics
           Source of Variation     NumDF   DenDF     F-inc    P-inc
           7 mu                        1    41.7   6248.66   <.001
           3 weed                      1   491.2     85.84   <.001

       15.7 Unreplicated early generation variety trial - Wheat

       The change in REML log-likelihood is significant ($\chi^2_1 = 12.46$, p < .001) with the inclusion of the autoregressive parameter for columns. Figure 15.6 presents the sample variogram of the residuals for the AR1xAR1 model. There is an indication that a linear drift from column 1 to column 10 is present. We include a linear regression coefficient pol(column,-1) in the model to account for this. Note we use the '-1' option in the pol term to exclude the overall constant in the regression, as it is already fitted. The linear regression of column number on yield is significant (t = -2.96). The sample variogram (Figure 15.7) is more satisfactory, though interpretation of variograms is often difficult, particularly for unreplicated trials. This is an issue for further research.

       Figure 15.6: Sample variogram of the residuals from the AR1 x AR
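The significance statement for the change in REML log-likelihood can be checked with a short calculation. The Python sketch below assumes the statistic 12.46 is referred to a chi-square distribution with 1 degree of freedom (one added autoregressive parameter, not on a boundary); for 1 df the upper tail probability has the closed form erfc(sqrt(x/2)), so only the standard library is needed. The function name is chosen for this illustration.

```python
import math


def chi2_upper_tail_1df(x):
    """P(X > x) for a chi-square variable with 1 degree of freedom."""
    if x < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    # For 1 df, X = Z**2 with Z standard normal, so P(X > x) = P(|Z| > sqrt(x)).
    return math.erfc(math.sqrt(x / 2.0))


lrt_statistic = 12.46  # twice the change in REML log-likelihood, as quoted
p_value = chi2_upper_tail_1df(lrt_statistic)
print(f"p = {p_value:.5f}")
print(p_value < 0.001)
```

The statistic itself is twice the difference in REML log-likelihood between the two nested variance models, which is only valid when both models have the same fixed effects.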
39.    (column 2), and the individual components that identify the dimension of the individual matrices used in forming the direct product variance structure are then written down (column 3). Note that in the simplest cases there is only one component. The variance structure associated with each component has a

       7.2 Process to define a consolidated model term

       Table 7.1: List of common variance model functions, their type (correlation or variance), the form of the variance matrix generated (C for correlation, V for variance matrix, S for scaled variance matrix), and a brief description. Parameters $\sigma^2 > 0$ are variances; $-1 < \rho < 1$ are correlations. Subscript c denotes a parameter held in common across all rows/columns.

       name      type             variance matrix (for set of n effects)                  description
       id()      correlation      $C = I_n$                                               IID with variance 1
       idv()     variance         $V = \sigma_c^2 I_n$                                    IID with common variance (default model)
       idh()     variance         $V = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$     independent with separate variances
       ar1()     correlation      $C_{ij} = \rho_c^{|i-j|}$                               autoregressive structure of order 1
       ar1v()    variance         $V_{ij} = \sigma_c^2 \rho_c^{|i-j|}$                    autoregressive structure of order 1
       ar1h()    variance         $V_{ij} = \sigma_i \sigma_j \rho_c^{|i-j|}$             autoregressive structure of order 1
       corg()    correlation      $C_{ij} = \rho_{ij}$                                    unstructured correlation matrix
       diag()    variance         $V = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$     independent with separate variances (same as idh())
       grm()     scaled variance  S specified                                             applies a known scaled variance matrix; the number of rows in the matrix must match the number of levels of
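To illustrate the difference between the correlation, homogeneous-variance and heterogeneous-variance forms of the first-order autoregressive model listed in Table 7.1, the following Python sketch builds small matrices directly from the formulas. The Python function names mirror the ASReml model names for readability only; they are not ASReml calls, and the parameter values are hypothetical.

```python
import math


def ar1(n, rho):
    """Correlation matrix with C[i][j] = rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def ar1v(n, rho, sigma2):
    """Common variance sigma2 times the AR1 correlation matrix."""
    return [[sigma2 * c for c in row] for row in ar1(n, rho)]


def ar1h(rho, variances):
    """Separate variance per level: V[i][j] = sd_i * sd_j * rho ** |i - j|."""
    n = len(variances)
    corr = ar1(n, rho)
    return [
        [math.sqrt(variances[i] * variances[j]) * corr[i][j] for j in range(n)]
        for i in range(n)
    ]


def idh(variances):
    """Diagonal matrix of separate variances (independent effects)."""
    n = len(variances)
    return [[variances[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


print(ar1(3, 0.5))
print(ar1v(2, 0.5, 4.0))
print(ar1h(0.5, [4.0, 9.0]))
print(idh([1.0, 2.0]))
```

A correlation model such as ar1 has a unit diagonal, so a variance must be supplied elsewhere in the term, whereas ar1v and ar1h carry their own variance parameters.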
40.    defined by the model which includes those terms appearing above the current term, given the variance parameters. For example, the test of nitrogen is calculated from the change in sums of squares for the two models mu variety nitrogen and mu variety. No refitting occurs; that is, the variance parameters are held constant at the REML estimates obtained from the currently specified fixed model.

       The incremental Wald statistics have an asymptotic $\chi^2$ distribution, with degrees of freedom (df) given by the number of estimable effects (the number in the DF column). In this example, the incremental Wald F statistics are numerically the same as the ANOVA F statistics, and ASReml has calculated the appropriate denominator df for testing fixed effects. This is a simple problem for balanced designs, such as the split plot design, but it is not straightforward to determine the relevant denominator df in unbalanced designs, such as the rat data set described in the next section.

       Tables of predicted means are presented for the nitrogen, variety, and variety by nitrogen tables in the .pvs file. The qualifier !SED has been used on the third predict statement, and so the matrix of SEDs for the variety by nitrogen table is printed. For the first two predictions, the average SED is calculated from the average variance of differences. Note also that the order of the predictions (e.g. 0.6_cwt, 0.4_cwt, 0.2_cwt, 0_cwt for nitrogen
41.    .grr file, file in the first CYCLE and hold it in memory for use in subsequent cycles. This is advantageous when the data/.grr file is large and there are many cycles to execute where the model changes but the data/.grr file doesn't.

       The !CYCLE mechanism acts as an inner loop when used with !RENAME !ARG. As an example, the !RENAME !ARG arguments might list a set of traits, and the !CYCLE arguments sequentially test a set of markers.

       A cycle string may consist of up to 4 substrings, separated by a semicolon and referenced as $I, $J, $K and $L respectively. For example,

           !CYCLE Y1;X1 Y2;X2
           $I ~ mu $J

       When cycling is active, an extra line is written to the .asr file containing some details of the cycle in a form which can be extracted to form an analysis summary by searching for LogL:. A heading for this extra line is written in the first cycle. For example,

           LogL: LogL Residual NEDF NIT Cycle Text
           LogL: 208.97 0.703148 587 6 1466 "LogL Converged"

       The LogL: line with the highest LogL value is repeated at the end of the .asr file.

       !DOPATH with !PATH (or !PART) statements allows several analyses to be coded in one job file and run selectively without having to edit the .as file between runs. Both spellings can be used interchangeably. Which particular lines in the .as file are honoured is controlled by the argument n of the !DOPATH qualifier in conjunction with !PATH (or !PART) statements.

       10.4 Advanced processing arguments

       High level
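The substitution performed by a cycle string can be pictured with a small text-expansion sketch. The Python below assumes the placeholders are written $I, $J, $K and $L and that substrings are separated by semicolons, as described above; `expand_cycle` is a hypothetical helper written for this illustration and does not reproduce ASReml's own parser.

```python
def expand_cycle(template, cycle_strings):
    """Substitute the substrings of each cycle string into a model template."""
    placeholders = ["$I", "$J", "$K", "$L"]
    expanded = []
    for cycle_string in cycle_strings:
        parts = cycle_string.split(";")
        if len(parts) > len(placeholders):
            raise ValueError("a cycle string may hold at most 4 substrings")
        line = template
        for placeholder, part in zip(placeholders, parts):
            line = line.replace(placeholder, part)
        expanded.append(line)
    return expanded


# Two cycles, each pairing a response with a covariate.
models = expand_cycle("$I ~ mu $J", ["Y1;X1", "Y2;X2"])
for model in models:
    print(model)
```

Each cycle therefore yields one model line, here a response Y1 or Y2 regressed on the matching covariate X1 or X2.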
42.    -1 < r < 1
       covariance              C
       positive correlation    P    0 < r < 1
       loading                 L

       7.7 Variance model function qualifiers

       7.7.8 Equating variance structures !USE t

       In some plant breeding applications it can be convenient to define a variance structure as the sum of two simpler terms. For example, given 1000 entries representing 50 related families, where relationships were derived from markers, the full relationship matrix (inverse) is dense. But it can be well approximated as the sum of a family component and a diagonal entry component. The reformulation gives a sparser (faster) formulation. But now we have two terms to interact with xfa1(dTrial), and both must have the same parameters. That is, instead of fitting

           xfa1(dTrial).grm3(entry)

       we fit

           xfa1(dTrial).grm1(family) xfa1(dTrial).grm2(entry)

       requiring that both xfa1 terms have the same parameters.

       If there are only a few parameters, this can be achieved directly as follows:

           !ASSIGN QP !GPFPFP
           !ASSIGN QE !=ABCDEFGH
           !ASSIGN QI !INIT 0.72631 0.000 .242713 0.000 .882465 .846305 .04419 .743393
           xfa1(dTrial $QP $QE $QI).grm1(family)
           xfa1(dTrial $QP $QE $QI).grm2(entry)

       However, for a larger term, the number of parameters required may exceed the available letters in the alphabet. In this case, !VCC can be used:

           <DATAFILE NAME> !VCC 1
           xfa1(dTrial $QP $QI).grm1(family)
           xfa1(dTrial $QP $QI).grm2(entry)
           21 29 !BLOCKSIZE 8   # parameters 21:28 are equal to parameters 29
43.    sat(Expt,1).idv(A).B.C will be interpreted as sat(Expt,1).idv(A).id(B.C).

       However, it is good practice to specify variance model functions for the components in model terms and we encourage the user to do this. ASReml will automatically add a common variance to consolidated model terms that are specified as correlation models for both R and G structures. For example:

           id(A) will be converted to idv(A)
           sat(Expt,1).id(units) will be converted to sat(Expt,1).idv(units)
           id(A).ar1(B) will be converted to idv(A).ar1(B)
           ar1(A).ar1(B) will be converted to ar1v(A).ar1(B)
           sat(Expt,1).id(A).ar1(B) will be converted to sat(Expt,1).idv(A).ar1(B)
           sat(Expt,1).ar1(A).ar1(B) will be converted to sat(Expt,1).ar1v(A).ar1(B)

       Using NIN example 2 for demonstration (Section 7.5), a more succinct coding of the model definition would be

           yield ~ mu variety !r repl
           residual units

       which would result in identical output to the original example. The model could be relaxed further to

           yield ~ mu variety !r repl

       7.11 Variance model functions available in ASReml

       The full range of variance models (that is, correlation, homogeneous variance and heterogeneous variance models) available in ASReml is presented in Table 7.6, which is located at the end of this chapter for easy access; see Section 7.12. This presents the variance structure name (in UPPERCASE), the corresponding variance model function name (in lowercase) used to as
44.    to the power v.
           Example: SQRyld !=yield !^0.5

       !^0         takes natural logarithms of the data (which must be positive).
                   Example: yield
                            LNyield !=yield !^0

       !^-1        takes the reciprocal of the data (data must be positive).
                   Example: yield
                            INVyield !=yield !^-1

       !>, !>=, !<, !<=, !==, !<> v
                   logical operators forming 1 if true, 0 if false.
                   Example: yield
                            high !=yield !>10

       !ABS        takes absolute values; no argument required.
                   Example: yield
                            ABSyield !=yield !ABS

       !ARCSIN v   forms an ArcSin transformation using the sample size specified in the argument, a number or another field. In the example, for two existing fields Germ and Total containing counts, we form the ArcSin for their ratio (ASG) by copying the Germ field and applying the ArcSin transformation using the Total field as sample size.
                   Example: Germ Total
                            ASG !=Germ !ARCSIN Total

       5.5 Transforming the data

       Table 5.1: List of transformation qualifiers and their actions with examples

       qualifier   argument   action   examples

       !COS, !SIN s
                   takes the cosine and sine of the data variable with period s, having default 2π; omit s if the data is in radians, set s to 360 if the data is in degrees.
                   Example: Day
                            CosDay !=Day !COS 365

       !D v, !D o v (!D<>, !D<, !D<=, !D>, !D>=)
                   discards records which have v or 'missing value' in the field, subject to the logical operator o.
                   Examples: yield !D<=0
                             yield !D<1 !D>100

       !DV v, !DV o v
                   discards records, subject to the logical operator o, which have v
                   Examples: yield !DV<=0
                             yield !DV<1  ID
45.  0   specifies that vertical annotation be used on the x axis (default is horizontal).

specifies that the labels used for the data be abbreviated to n characters.

specifies that the labels used for the x axis annotation be abbreviated to n characters.

Table 9.2: List of predict plot options

option: action

abbrslab n: specifies that the labels used for superimposed factors be abbreviated to n characters.

9.3.4 Associated factors

!ASSOCIATE factors facilitates prediction when the levels of one factor group or classify the levels of another, especially when there are many levels. factors is the list of factors in the model which have this hierarchical relationship. Typical examples are individually named lines grouped into families (usually with unequal numbers of lines per family), or trials conducted at locations within regions.

Declaring factors as associated allows ASReml to combine the levels of the factors appropriately. For example, when predicting a trial mean, ASReml adds the effects of the location and region where the trial was conducted. When identifying which levels are associated, ASReml checks that the association is strictly hierarchical (tree-like). That is, each trial is associated with one location and each location is associated with only one region. If a level code is missing for one component, it must be missing for all.

Averaging of associated factors will generally gi
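The hierarchy rule for associated factors can be made concrete with a small check. The following is only a sketch of the stated rule (each level of a finer factor maps to exactly one level of the next coarser factor, and missing codes must be missing for all components); it is not how ASReml implements the check, and the function name and record layout are hypothetical.

```python
def is_strictly_hierarchical(records, factors):
    """Check the tree-like rule for associated factors.

    records: list of dicts mapping factor name -> level code (None = missing).
    factors: factor names ordered from finest to coarsest,
             e.g. ["trial", "location", "region"].
    """
    parent_of = [dict() for _ in factors[:-1]]
    for rec in records:
        levels = [rec.get(f) for f in factors]
        missing = [lv is None for lv in levels]
        if any(missing):
            # A missing code for one component must be missing for all.
            if not all(missing):
                return False
            continue
        for k in range(len(factors) - 1):
            child, parent = levels[k], levels[k + 1]
            # Each child level may belong to only one parent level.
            if parent_of[k].setdefault(child, parent) != parent:
                return False
    return True


good = [
    {"trial": "T1", "location": "L1", "region": "R1"},
    {"trial": "T2", "location": "L1", "region": "R1"},
    {"trial": "T3", "location": "L2", "region": "R2"},
    {"trial": None, "location": None, "region": None},
]
bad = good + [{"trial": "T1", "location": "L2", "region": "R2"}]
```

Here `bad` fails because trial T1 appears under two different locations.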
46.  0 0335  OOL  0 1514  0 1269   0 278  0 2622   0 226  0 2857  0 2506  0 0763

TotalVar explained by all loadings. The last row contains column averages.

13.4.8 The .rsv file

The .rsv file contains the variance parameters from the most recent iteration of a model. The primary use of the .rsv file is to supply the values for the !CONTINUE qualifier (see Table 5.4) and the C command line option (see Table 10.1). It contains sufficient information to match terms so that it can be used when the variance model has been changed. This is nin89a.rsv:

  TG 6 1711 121
  # This .rsv file holds parameter values between runs of ASReml and
  # is not normally modified by the User. The current values of
  # the variance parameters are listed as a block on the following lines.
  # They are then listed again with identifying information
  # in a form that the user may edit.
  0.000000 0.000000 0.000000 1.0000000 0.6554798 0.4375045
  RSTRUCTURE 1 2 3
  VARIANCE 1 1 0  a  V  P  1.00000000 0O 0
  STRUCTURE 22 1 J  ay Ry  Py 0.65547976 0 Q
  STRUCTURE 11 1 j   6  Ry P  0.43750453 Oo g

13.4.9 The .tab file

The .tab file contains the simple variety means and cell frequencies. Below is a cut down version of nin89.tab:

  nin alliance trial    10 Sep 2002 04:20:15

  Simple tabulation of yield

  variety
  LANCER     28.56  4
  BRULE      26.07  4
  REDLAND    30.50  4
  CODY       21.21  4
  ARAPAHOE   29.44  4
  NE83404    27.39  4
  NE83406    24.28  4
  NE83407    22.69  4
  CENTURA
47.  0 4_cwt Golden_rain 114 6667 921070 E  0 2_cwt Marvellous 108 5000 8 1070 E  0 2_cwt Victory 89 6667 9 1070 E  0 2_cwt Golden_rain 98 5000 9 1070  E  O_cwt Marvellous 86 6667 9 1070 E  O_cwt Victory 71 5000 9 1070 E  O_cwt Golden_rain 80 0000 9 1070 E    271    15 2 Split plot design   Oats       Predicted values with SED PV     126    110  114     833  118   124   Iir     500  833  167     833   667    9 71503    108     500    9 71503  89 6667  7 68295  98 5000  9 71503  86 6667  9 71503  71 5000  7 68295  9 71503  80 0000  9 71503  9  71503  SED  Standard Error    OO ONN OO O ON    9   Ta  os      71503    71503      68295    71503    71503    71503    71503    68295    68295    71503    71503    71503    9 71503  9 71503  ts  9  9    68295    O N OO    Ko     ONN OO O ON    71903  68295  71503  of Difference     O oO     71503    71503    68295    71503      71503      68295    71503   71503    71503    71503   68295   68295    71503      71503    71503    Min    7 6830    272    oO    NO O O ON    N     71503    71503   68295    11503      71503     68295    71503    71503    71503    71503   68295     68295   71503    Mean    9 1608    O O ON       NO      71503    71503     68295    71503    71503    68295    71503    71503    71503      71503    68295    Max    9 7150    15 3 Unbalanced nested design   Rats       15 3 Unbalanced nested design   Rats    The second example we consider is a data set which illustrates some further aspects of testing  fixed effects in lin
48.  01    794020   417001E 01  897161    2164    23   89   64   08   00   48  19    04  sar  32        N o onw N    ww N       OOP OF WW W    NOrFOFOrF OO    an Or OO    w w o    70    33    90    s97   lt 51  ebd    2401  1   1  1    2L    21d  J62     68   18   90   53   10  On   54  41   25   84   99   99   15   53   00     03   00    29   06   30   08     54   30  15  eho  74   43  22   505  TO   29    OOo OO Oo Oo Oo 2 Oo 2    jo   g        ie         gi    OOGG  c e e ee    oo ooo ooo 0 6 OO CO Oo  SS SO OO eae es Se Oe eo e     oO 2 Oo 6 Oo 6    TU DTW ey oS    Oo 2 Oo Oo 2 Oo 2 0 oOo 6    Se a ae ee e e a e    Se es Ao Oe ee ee I    15 10 Multivariate animal genetics data   Sheep       7 342 17 60 0 4231 0 2494 0 4633  0 2725 0 6680 0 1416 0 3995 0 1635  0 9630 1 998 0 2870 3 644 0 4753E 01  0 8503 2 483 0  7861E 01 0 1159 1 632  Covariance Variance Correlation Matrix US us Trait  id sire   0 5939 0 7045 0 2970 0 1947 0 2924  0 6773 1 556 0 1883E 01  0 1326 0 5913  0 2805E 0O1 0 2879E 02 0 1502E 01 0 9808E 01 0 3960E 01  0 5962E 01  0 6570E 01 0 4776E 02 0 1579  0 6577  0 4073E  01 0 13383 0 8771E 03   0 4723E 01 0 3267E 01  Covariance Variance Correlation Matrix XFA xfai TrDam123   id dam   2 158 0 9961 0 8035 0 9961  2 225 2 512 0 8066 1 0000  0 1623 0 1687 0 1891E 01 0 8066  1 463 1 521 0 1109 1 0000  Covariance Variance Correlation Matrix US us TrLit1234  id lit   3 500 0 5111  0 1190  0 4039E 01  1 540 2155S 0 2041  0 5244     3101E 01 0 4509E 01 0 1910F  01  0 3185
49.  1

Table 13.1: Summary of ASReml output files

file: description

Key output files

.asr: contains a summary of the data and analysis results.
.msv: contains final variance parameter values in a form that is easy to edit for resetting the initial values if !MSV or !CONTINUE 3 is used (see Table 5.4).
.pvc: contains the report produced with the P option.
.pvs: contains predictions formed by the predict directive.
.res: contains information from using the pol(), spl() and fac() functions, the iteration sequence for the variance components and some statistics derived from the residuals.
.rsv: contains the final parameter values for reading back if the !CONTINUE qualifier is invoked (see Table 5.4).
.sln: contains the estimates of the fixed and random effects and their corresponding standard errors.
.tab: contains tables formed by the tabulate directive.
.tsv: contains variance parameter values in a form that is easy to edit for resetting the initial values if !TSV or !CONTINUE 2 is used (see Table 5.4).
.yht: contains the predicted values, residuals and diagonal elements of the hat matrix for each data point.

Other output files

.asl: contains a progress log and error messages if the L command line option is specified.
.aov: contains details of the ANOVA calculations.
.apj: is an ASReml project file created by ASReml-W.
.ask: holds
50.  1 has 9 parameters for these data  5 in D and 4 in U  The input is given by     ASSIGN ANTEI   lt   INIT  60 1  54 65 73 65    283    15 5 Balanced repeated measures   Height       91 50 123 3 306 4   89 17 120 2 298 6 431 8   62 21 83 85 206 3 301 2 379 8   1 gt   yi y3 y5 y7 y10   Trait tmt Tr tmt  redidual units ante Trait  ANTEI     The abbreviated output file is       1 LogL  171 501 e2  1 0000 60 df  2 LogL  170  097 52  1 0000 60 df  3 LogL  166 085 S2  1 0000 60 df  4 LogL  161 335 S2  1 0000 60 df  5 LogL  160 407 S2  1 0000 60 df  6 LogL  160 370 S2  1 0000 60 df  T LogL  160 369 S2  1 0000 60 df  8 LogL  160 369 S2  1 0000 60 df  9 LogL  160 369 S2  1 0000 60 df        Results from analysis of yl y3 y5 y7 y10       Akaike Information Criterion 338 74  assuming 9 parameters   Bayesian Information Criterion 357 59  Model_Term Sigma Sigma Sigma SE  C  id units   ante Trait  70 effects  Trait ANTE_U 1 1 0 268643E 01 0 268643E 01 2 44 OP  Trait ANTE_LU 2 1  0 628417  0 628417    2 55 QP  Trait ANTE_U 2 2 0 372830E 01 0 372830E 01 2 41 OP  Trait ANTE_U 3 2  1 49102  1 49102  2 54 OP  Trait ANTE  U 3 3 0 599612E 02 0 599612E 02 2 43 OP  Trait ANTE_U 4 3  1 28037  1 28037  6 19 OP  Trait ANTE_U 4 4 0 789716E 02 0 789716E 02 2 44 OP  Trait ANTE_LU 5 4  0 967820  O  967920  15 40 OP  Trait ANTE_ U 5 5 0 390635E 01 0 390635E 01 2 45 OP  Covariance Variance Correlation Matrix ANTE Residual   37 20 0 5946 0 3550 0 3115 0 3041   23 38 41 55 0 5970 0 5239 0 5114   34 84 61 93 25
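The information criteria reported in this output can be checked by hand. The sketch below assumes the usual definitions, AIC = -2 logL + 2p and BIC = -2 logL + p ln(df), with the converged log-likelihood taken as -160.369 (the minus sign appears to have been lost in this extract) and df = 60 as shown on the iteration lines. The function names are illustrative only.

```python
import math


def aic(logl, n_params):
    """Akaike Information Criterion: -2 logL + 2 p."""
    return -2.0 * logl + 2.0 * n_params


def bic(logl, n_params, df):
    """Bayesian Information Criterion: -2 logL + p ln(df)."""
    return -2.0 * logl + n_params * math.log(df)


# Antedependence model of order 1: 9 variance parameters (5 in D, 4 in U).
ante1_aic = aic(-160.369, 9)
ante1_bic = bic(-160.369, 9, 60)
```

Under these assumptions the values round to 338.74 and 357.59, which agrees with the figures printed in the output.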
51.  1 variety 55 165 0 0 88 0 712 effects    Notice  The DenDF values are calculated ignoring fixed boundary singular  variance parameters using algebraic derivatives   5 repl 4 effects fitted  Finished  04 Nov 2011 21 14 29 242 LogL Converged    3 6 2 The  sln file    The following is an extract from nin89  sln containing the estimated variety effects  intercept  and random replicate effects in this order  column 3  with standard errors  column 4   Note  that the variety effects are returned in the order of their first appearance in the data file   see replicate 1 in Table 3 1     35    3 6 Description of output files          Model_Term Level Effect seEffect  variety LANCER 0 000 0 000  variety BRULE  2 487 4 979  variety REDLAND 1 938 4 979  variety CODY  7 350 4 979  variety ARAPAHOE 0 8750 4 979  variety NE83404  1 175 4 979  variety NE83406  4 287 4 979  variety NE83407  5 875 4 979  variety CENTURA  6 912 4 979  variety SCOUT66  1 037 4 979  variety COLT  1 562 4 979  variety NE83498 1 563 4 979  variety NE84557  8 03  4 979  variety NE83432  8  837 4 979  variety NE87615  2 975 4 979  variety NE87619 2 700 4 979  variety NE87627  5 337 4 979  mu 1 28 56 3 856  repl 1 1 880 1  T55  repl   2 843 1 755  repl 3    8219 1 755  repl 4  3 852 1 755    36    3 7 Tabulation  predicted values and functions of the variance components       3 6 3 The  yht file    The following is an extract from nin89 yht containing the predicted values of the observa   tions  column 2   the residua
52.  2   4 31 65 8 6 9 6 8 2  4 25 65 8 6 10 8 9 2  14    8 6  28 3 8 6 18 15 2  T 8 6 1    19 4  4 20 1    N    4 3 26 4 22 1  6 6 1 2 12  3 6 3 6 32  4 842    an    1 6 8 6 12 10 2  8 6 13 2 11 2  28 6 14 4 12 2   G 15 6 13 2   16 8 14 2    9 2 16 2

4.2 The data file

The standard format of an ASReml data file is to have the data arranged in columns (fields) with a single line for each sampling unit. The columns contain variates and covariates (numeric), factors (alphanumeric), traits (response variables) and weight variables in any order that is convenient to the user. The data file may be free format, fixed format or a binary file.

4.2.1 Free format data files

The data are read free format (SPACE, COMMA or TAB separated) unless the file name has extension .bin for real binary, or .dbl for double precision binary (see below). Important points to note are as follows:

• files prepared in EXCEL must be saved to comma or tab delimited form,

• blank lines are ignored,

• column headings, field labels or comments may be present at the top of the file (see Generating a template on page 29) provided that the !skip qualifier (Table 5.2) is used to skip over them,

• NA, * and . are treated as coding for missing values in free format data files,

  - if missing values are coded with a unique data value (for example, 0 or -9), use the transformation !M value to flag them as missing or !DV value to drop the data record contai
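To make the free-format rules concrete, here is a minimal Python sketch of a reader that follows the points listed above: fields separated by spaces, commas or tabs, blank lines ignored, leading heading lines skipped, and missing-value codes mapped to `None`. It assumes the missing codes are `NA`, `*` and `.`, which is the usual ASReml convention but is garbled in this extract. The function is hypothetical and simplifies real behaviour (for example, it treats runs of delimiters as one separator).

```python
import re

# Assumed missing-value codes for free-format files.
MISSING_CODES = {"NA", "*", "."}


def parse_free_format(text, skip=0):
    """Split free-format text into rows of fields, with None for missing values.

    skip: number of leading lines (headings or comments) to ignore.
    Fields are kept as strings, since factors may be alphanumeric.
    """
    rows = []
    for line in text.splitlines()[skip:]:
        if not line.strip():
            continue  # blank lines are ignored
        fields = [f for f in re.split(r"[,\t ]+", line.strip()) if f]
        rows.append([None if f in MISSING_CODES else f for f in fields])
    return rows


sample = "variety yield weight\n\nLANCER,NA\t3.1\nBRULE * .\n"
rows = parse_free_format(sample, skip=1)
```

A unique numeric missing code such as -9 would need a further step after parsing, mirroring the transformation mentioned in the last point.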
53.  2 and then averages across repl to produce variety predictions.

  GFW Fdiam ~ Trait Trait.Year !r idv(Trait).id(Team)
  predict Trait Team

forms the hyper-table for each trait based on Year and Team with each linear combination in each cell of the hyper-table for each trait using Team and Year effects. Team predictions are produced by averaging over years.

  yield ~ variety !r idv(site).id(variety)
  predict variety

will ignore the site.variety term in forming the predictions while

  predict variety !AVERAGE site

forms the hyper-table based on site and variety with each linear combination in each cell using variety and site.variety effects and then forms averages across sites to produce variety predictions.

  yield ~ site variety !r idv(site).id(variety) at(site).idv(block)
  predict variety

puts variety in the classify set, site in the averaging set and block in the ignore set. Consequently, it forms the site×variety hyper-table from model terms site, variety and site.variety but ignoring all terms in at(site).block, and then forms averages across sites to produce variety predictions.

9.3.7 New R4 Prediction using two-way interaction effects

In some cases we wish to calculate from two-way interaction effects ($bc$, say) effects for one of the factors ($B$, say) that are a weighted sum averaged over the $c$ levels of $C$, i.e.

$b_i = \sum_{j=1}^{c} bc_{ij} w_j$

TPREDICT C  AVE B weights  ONLYUSE fun B  fun C     allows this to be prod
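The classify, averaging and ignore sets described above can be illustrated with a toy calculation. The sketch below builds each cell of a site by variety hyper-table as a linear combination of effects and then averages over sites with equal weights. All effect values are made up for illustration, the function name is hypothetical, and equal weighting is an assumption of this example.

```python
def predict_classify(classify_levels, average_levels, cell_value):
    """Average the hyper-table cells over the averaging set for each classify level."""
    predictions = {}
    for c in classify_levels:
        cells = [cell_value(c, a) for a in average_levels]
        predictions[c] = sum(cells) / len(cells)
    return predictions


# Hypothetical effects, for illustration only.
mu = 10.0
variety = {"A": 0.0, "B": 2.0}
site = {"s1": 0.0, "s2": 4.0}
site_variety = {("s1", "A"): 0.0, ("s1", "B"): 1.0,
                ("s2", "A"): -1.0, ("s2", "B"): 0.0}


def cell(v, s):
    # Terms in the ignore set (for example block effects) contribute nothing.
    return mu + variety[v] + site[s] + site_variety[(s, v)]


variety_means = predict_classify(["A", "B"], ["s1", "s2"], cell)
```

Dropping the `site_variety` term from `cell` would correspond to the case where the interaction is ignored when forming the predictions.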
54.  21.65  4
  SCOUT66    27.92  4
  COLT       27.00  4
  NE87615    25.69  4
  NE87619    31.26  4
  NE87627    23.23  4

13.4.10 The .tsv file

The .tsv file contains the variance parameters as initialized for the most recent run in a form that is relatively easy to edit if the initial values need to be reset. The file is read when !TSV or !CONTINUE 2 is specified or if !CONTINUE is specified but no .rsv file exists. This is nin89a.tsv:

  # This .tsv file is a mechanism for resetting initial parameter values
  # by changing the values here and rerunning the job with !CONTINUE 2.
  # You may not change values in the first 3 fields,
  # or RP fields where RP_GN is negative.
  #
  # Fields are:
  # GN, Term, Type, PSpace, Initial_value, RP_GN, RP_scale.
  4     Variance i     V  P  1 00000000 R 4  1
  5   ari row  ari column   ariv row _1   R  P  0  10000000 i By 1
  6   ari row  ari column  ari column _ i   R  P  0 10000000   6  1
  #
  # Valid values for Pspace are F, P, U and maybe Z.
  # RP_GN and RP_scale define simple parameter relationships.
  # RP_GN links related parameters by the first GN number.
  # RP_scale must be 1.0 for the first parameter in the set and
  # otherwise specifies the size relative to the first parameter.
  #
  # Multivalue RP_scale parameters may not be altered here.
  #
  # Notice that this file is overwritten if not being read.

13.4.11 The .vrb file

The .vrb file contains the estimates of the effects together with their approx
55.  26342 159 816 2 11 OF
  idv(units.Trait)                  70 effects
  Residual      SCA_V   70   1.000000   126.494    4.90   0 P

  id(units).coru(Trait)

  LogL=-196.975   S2= 264.10   60 df   1.000   0.5000
  LogL=-196.924   S2= 270.14   60 df   1.000   0.5178
  LogL=-196.886   S2= 278.58   60 df   1.000   0.5400
  LogL=-196.877   S2= 286.23   60 df   1.000   0.5580
  LogL=-196.877   S2= 286.31   60 df   1.000   0.5582
  Final parameter values   1.0000   0.55819

  - - - Results from analysis of y1 y3 y5 y7 y10 - - -
  Akaike Information Criterion     397.75 (assuming 2 parameters)
  Bayesian Information Criterion   401.9
  Model_Term                    Gamma      Sigma     Sigma/SE  % C
  id(units).coru(Trait)             70 effects
  Residual      SCA_V   70   1.000000   286.310    3.65   0 P
  Trait         COR_R    1   0.558191   0.558191   4.28   0 P

A more realistic model for repeated measures data would allow the correlations to decrease as the lag increases, as occurs with the first order autoregressive model. However, since the heights are not measured at equally spaced time points, we use the EXP model. The correlation function is given by

$\rho(u) = \phi^{u}$

where $u$ is the time lag in weeks. The coding for this is

  y1 y3 y5 y7 y10 ~ Trait tmt Tr.tmt
  residual id(units).exp(Trait !INIT 0.5 !COORD 1 3 5 7 10)

A portion of the output is

  1 LogL=-202.139   S2= 234.04   60 df   1.0000   0.5000
  2 LogL=-183.773   S2= 440.42   60 df   1.0000   0.9507
  3 LogL=-183.070   S2= 337.51   60 df   1.0000   0.9308
  4 LogL=-182.981   S2= 297.16   60 df   1.0000   0.9172
  5 LogL=-182.979   S2= 302.31   60 df   1.00
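To see what the EXP model implies for unequally spaced measurements, the sketch below builds the correlation matrix with entries phi raised to the absolute time lag, using the measurement weeks 1, 3, 5, 7 and 10 from the coding above and the starting value 0.5. This is an illustration of the correlation function only, not ASReml code, and the function name is an assumption.

```python
def exp_correlation(coords, phi):
    """Correlation matrix with entries phi ** |t_i - t_j| for time points coords."""
    return [[phi ** abs(a - b) for b in coords] for a in coords]


weeks = [1, 3, 5, 7, 10]
R = exp_correlation(weeks, 0.5)

# Adjacent occasions two weeks apart share correlation 0.5 ** 2,
# while the last pair (weeks 7 and 10) has correlation 0.5 ** 3.
lag2 = R[0][1]
lag3 = R[3][4]
```

With equally spaced occasions this reduces to a first order autoregressive pattern; the unequal final gap is what makes the EXP form necessary here.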
56.  3 0 00000 0 00000 0 00 0  Trait XFA_V O 4 0 423585 0 423585 421 0  Trait XFA V O 5 0 00000 0 00000 0 00 0  Trait XFA_L 1 1  0 109659E 02  0 109659E 02 0 00 0  Trait RPA_L 1 2  Q  180117    0  180117  2 88 0  Trait ZPAL  1 3 0 219215 0 219215 3 53 0  Trait XFA_L 1 4 0 214461E 01 0 214461E 01 0 07 0  Trait XFA L  1  amp  O 17 7982 0 177932 1 18 0  Trait XFAL 2 1 1 17261 117261 0 00 0  Trait XFA_L 2 2 0 530954E 01 0 530954E 01 0 00 0  Trait XFA_L 2 3 0 604977E 01 0 604977E 01 1 31 0  Trait XFA_L 2 4 0 286377 0 286377 0 99 0  Trait XFA_L 2 5  0 460967E 01  0 460967E 01  0 33 0  Trait XFA L 3 1  0 123499  0 123499  0 528 0  Trait AFA L 3 2  0 938092E 01  0 938092E 01   lt 1 09 0  Trait XFA L 3 3 0 115989 0 115989 1 12 0  Trait XFA_L 3 4 0 439945 0 439945 1 40 0  Trait XFA_L 3 5  0 288612  0 288612  262 0  tag NRM 10696  Warning  Code B   fixed at a boundary   GP  F   fixed by user       liable to change from P to B P   positive definite   C   Constrained by user   VCC  U   unbounded    S   Singular Information matrix  S means there is no information in the data for this parameter   Very small components with Comp SE ratios of zero sometimes indicate poor  scaling  Consider rescaling the design matrix in such cases   Covariance Variance Correlation Matrix US Residual    8 138 0 5848 0 2532 0 1518 0 2373   7 284 Ifo 0 5057 0 2658 0 4837   0 2477 0 7052 0 1095 0 4193 0 1997   0 8169 2 038 0 2526 3 314 0 9232E 01   0 8713 2 531 0 8210E 01 0 2087 1 543  Covariance Variance Correlation
57.  336    Index    ABORTASR NOW  68  FINALASR NOW  68    Access  42  accuracy   genetic BLUP  214  advanced processing arguments  190  AI algorithm  14  AIC  17  ainverse bin  151  Akaike Information  Criteria  17  aliassing  106  Analysis of Deviance  103  Analysis of Variance  19  Wald F statistics  108  animal breeding data  1  arguments  4  asrdata bin  81  ASReml symbols     Bf     41     41     41     42      90  1   90     90     90  s O     90     90  ia 90  autoregressive  111  Average Information  1    balanced repeated measures  270  Bayesian Information  Criteria  BIC   17  binary files  43  Binomial divisor  104    BLUE  15  BLUP  15  case  88    combining variance models  12  command file  29   genetic analysis  147   multivariate  145  Command line option   A ASK  187   B BRIEF  187   C CONTINUE  189   D DEBUG  188   F FINAL  189   Gg graphics   188   Hg HARDCOPY  188   I INTERACT  188   N NoGraphs  188   0 ONERUN  189   Q QUIET  188   R RENAME  189   W WorkSpace  190   X XML  186  command line options  185  commonly used functions  90  conditional distribution  12  Conditional F Statistics  19  conditional factors  95  contrasts  67  Convergence criterion  68  correlated effects  16  correlation  205   between traits  144   model  11  covariance model  11  covariates  40  61  106  cubic splines  100    data field syntax  47  data file  27  40    337    INDEX       binary format  43  fixed format  42  free format  41  using Excel  42  data file line  31  datafile
58.  36 pairwise

An even better option in this case is to use just one structure twice. The following code associates xfa1(dTrial) in xfa1(dTrial).giv2(entry) with xfa1(dTrial) in xfa1(dTrial).giv1(family); that is, both terms point to the one structure definition.

  xfa1(dTrial QP QI).grm1(family)
  xfa1(dTrial !USE xfa1(dTrial)).grm2(entry)

Table 7.5 gives examples of constraining variance parameters in ASReml.

7.8 Setting relationships among variance structure parameters

7.8.1 Simple relationships among variance structure parameters

It is possible to define simple equality relationships between variance structure parameters using the !=s qualifier (see Section 7.8.2 and Table 7.4). More general relationships between variance structure parameters can be defined by placing the !VCC c qualifier on the data file definition line. Unlike the case of parameter equality, all parameters can be accessed and the linear relationship is not limited to equality. However, identification of the parameters is not as easy. Each variance structure parameter ($\gamma_i$) is allocated a number $i$ internally. These numbers are reported in the .tsv file and some are reported in the structure input section of the .asr file. These numbers are used to specify which parameters are to be constrained using this method. Warning: unfortunately, the parameter numbers usually change if the model is changed.
59.  5.1: List of transformation qualifiers and their actions with examples

qualifier (argument): action. Examples.

Qualifiers listed in this part of the table: !SET, !SETN, !SETU, !SUB, !SEQ, !TARGET, !UNIFORM.

!SET (vlist): for vlist, a list of n values, the data values 1...n are replaced by the corresponding element from vlist; data values that are <1 or >n are replaced by zero. vlist may run over several lines provided each incomplete line ends with a comma, i.e., a comma is used as a continuation symbol; see Other examples below.

!SETN: !SETN v n replaces data values 1...n with normal random variables having variance v. Data values outside the range 1...n are set to 0.

!SETU: replaces data values 1...n with uniform random variables having range 0:v. Data values outside the range 1...n are set to 0.

!SUB (vlist): replaces data values = $v_i$ with their index i, where vlist is a vector of n values. Data values not found in vlist are set to 0. vlist may run over several lines if necessary provided each incomplete line ends with a comma. ASReml allows for a small rounding error when matching. It may not distinguish properly if values in vlist only differ in the sixth decimal place; see Other examples below.

!SEQ: replaces the data values with a sequential number starting at 1 which increments whenever the data value changes between successive records; the current field is presumed to define a factor and the number of levels in the new factor is set t
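The recoding rules for the set, substitute and sequence transformations described above can be sketched in a few lines of Python. These helpers only mirror the described behaviour; the function names are hypothetical, the data are assumed to be integer codes for the set transformation, and the matching tolerance in the substitute helper is an assumed value chosen to echo the remark about the sixth decimal place.

```python
def set_transform(data, vlist):
    """Replace codes 1..n by the corresponding element of vlist; others become 0."""
    n = len(vlist)
    return [vlist[x - 1] if 1 <= x <= n else 0 for x in data]


def sub_transform(data, vlist, tol=1e-5):
    """Replace each value by the 1-based index of its match in vlist, or 0 if absent.

    tol is an assumed rounding tolerance, not a documented constant.
    """
    out = []
    for x in data:
        index = 0
        for i, v in enumerate(vlist, start=1):
            if abs(x - v) <= tol:
                index = i
                break
        out.append(index)
    return out


def seq_transform(data):
    """Number runs of equal successive values 1, 2, 3, ... in record order."""
    out, count, previous = [], 0, object()
    for x in data:
        if x != previous:
            count += 1
            previous = x
        out.append(count)
    return out


set_result = set_transform([1, 3, 0, 4], [10, 20, 30])
sub_result = sub_transform([2.5, 7, 1], [1, 2.5])
seq_result = seq_transform([5, 5, 2, 2, 5])
```

Note that the sequence helper starts a new level whenever the value changes, so a value that reappears later in the file receives a new number rather than its earlier one.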
60.  57 8 4 72 6 12 B 0 016  34 x66 1 58 5 1 13 0 03 B 0 872  35 x70 2 59 3 71 1 40 B 0 242  36 x71 a 64 4 0 08 0 01  B 0 929  37 x73   T 59 0 1 72 3 01 B 0 088  38 x75 1 59 9 0 04 0 26 B 0 613  39 x91 1 63 8 1 44 1 44 B 0 234    Notice  The DenDF values are calculated ignoring fixed boundary singular  variance parameters using empirical derivatives     129 mv_estimates 9 effects fitted   9 idsize 92 effects fitted   7 are zero   115 expt idsize 828 effects fitted   672 are zero   127 at expt 6  type idsize meth 9 effects fitted    2199 singular   128 at expt 7  type idsize meth 10 effects fitted    2198 singular   LINE REGRESSION RESIDUAL ADJUSTED FACTORS INCLUDED    NO DF SUMSQUARES DF MEANSQU R SQUARED R SQUARED 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23    1 3 0 1113D 02 452 0 2460 0 09098 0 08495 1 1100000000000000  kkk kk  2 3 0 1180D 02 452 0 2445 0 09648 0 09049 1    1 10000000000000  kkk kk  3 3 0 1843D 01 452 0 2666 0 01507 0 00883    1 110000000000000  4 3 0 1095D 02 452 0 2464 0 08957 0 08353 1 1 010000000000000  5 3 0 1271D 02 452 0 2425 0 10390 0 09799 1001 1000000000000  kkk 2K  6 3 0 9291D 01 452 0 2501 0 07594 0 06981 0 1 011000000000000  7 3 0 9362D 01 452 0 2499 0 07652 0 07039 0 0 iiepo 6 0 0 6 0   oO ao  8 3 0 1357D 02 452 0 2406 0 11091 0 10501 1010100 0000000000  perro rd kk  9 3 0 9404D 01 452 0 2498 0 07687 0 07074 0 1 101000000000000  10 3 0 1266D 02 452 0 2426 0 10350 0 09755 1i 1001000000000000  11 3 0 1261D 02 452 0 2427 0 10313 0 09717 100011
61.  6    15 7  15 8    15 9  15 10    15 11  15 12  15 13  15 14  15 15    15 16    of Figures    Variogram in 4 sectors for Cashmore data    o  oaoa a a a 0 00000 83  Residual versus Fitted values      o oo a a a 224  Vorograim Of residuals e a e s SED RHE A REWER EE YR  d 235  Plot of residuals in field pla   order 2     ce eR RS 236  Plot of the marginal means of the residuals                    236  Histogram of residuals s  ss ss Faw Reh awe cd dorita doe tkd ERS 237  Residual plot for the rat data      oa oa a a a a 275  Residual plot for the voltage data      ooo a 278  Trellis plot of the height for each of 14 plants                   279  Residual plots for the EXP variance model for the plant data        a  282  Sample variogram of the residuals from the ARIxXAR1 model          289  Sample variogram of the residuals from the AR1xAR1 model for the Tulli    MESA Gaba  s ea na e a Baers SS eee Eee A ee wees 295  Sample variogram of the residuals from the AR1xAR1   pol column  1    model Tot the Tullibigeal datas        esa 625 e246  2 tad ereere 296  Rice bloodworm data  Plot of square root of root weight for treated versus   CONGO ce ek hoe Bad RE MK a a a a i ES a a e E D e 299  BLUPs for treated for each variety plotted against BLUPs for control         306  Estimated deviations from regression of treated on control for each variety   plotted against estimate for control   ss s es sss esa cai taa cd osad 307  Estimated difference between control and treated for each va
62.  8091 2 125  52  8061 8 125  52  8061 8 125    df  df  df  df  df  df  df          Results from analysis of yield        Akaike Information Criterion    Bayesian Information Criterion    1423 57  assuming 4 parameters      1434 88    Approximate statum variance decomposition  Component Coefficients  25 0    Stratum Degrees Freedom  idv  Rep  5 00  idv RowB1k  24 00  idv  Co1B1k  23 66  Residual Variance 72 34  Model_Term   idv  Rep  IDV_V  idv RowB1k  IDV_V  idv ColB1k  IDV_V  idy  units    Residual SCA_V    Source of Variation    8 mu  6 variety    Variance  266657   74887  8  713569 5  8061 81    Gamma   6 0 528714   30 1 93444   30 1 83725  150 effects   150 1 000000    0 0    0 0  0 0    B0   4 3   0 0   0 0  Sigma  4262 39  15595 1  14811 6    8061 81    Wald F statistics  DenDF   530  79 3    NumDF  1  24    F_inc  1216 29  8 84    5 0    oF O  OUO    Sigma SE  0 62  3 06  3 04    6 01    ererre  O OOGO      c  0 P  0 P  0P    OP    Prob     lt  001   lt  001    Finally  we present portions of the  pvs files to illustrate the prediction facility of ASReml   The first five and last three variety means are presented for illustration  The overall SED  printed is the square root of the average variance of difference between the variety means   The two spatial analyses have a range of SEDs which are available if the  SED qualifier is  used  All variety comparisons have the same SED from the third analysis as the design is  a balanced lattice square  The Wald F statistic stat
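The description of the overall SED as the square root of the average variance of difference can be written out directly. The sketch below assumes a variance-covariance matrix of the predicted means is available and averages var(i) + var(j) - 2 cov(i, j) over all pairs; the matrix values are invented for illustration and the function name is an assumption.

```python
import math


def overall_sed(vcov):
    """Square root of the average variance of pairwise differences between means."""
    n = len(vcov)
    diff_vars = [
        vcov[i][i] + vcov[j][j] - 2.0 * vcov[i][j]
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return math.sqrt(sum(diff_vars) / len(diff_vars))


# Hypothetical prediction variance matrix for three varieties.
vcov = [
    [4.0, 1.0, 1.0],
    [1.0, 4.0, 1.0],
    [1.0, 1.0, 4.0],
]
sed = overall_sed(vcov)
```

When every pair has the same variance of difference, as in a balanced design, the overall value equals each individual SED.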
63.  952
  COLT       27.00
  NE87522    25.00
  NE87612    21.80
  NE87613    29.40
  NE87615    25.69
  NE87619    31.26
  NE87627    23.23

The

  predict variety

statement after the model statement in nin89.as results in the nin89.pvs file displayed below (some output omitted) containing the 56 predicted variety means, also in the order in which they first appear in the data file (column 2), together with standard errors (column 3). An average standard error of difference among the predicted variety means is displayed immediately after the list of predicted values. As in the .asr file, date, time and trial information are given in the title line. The Ecode for each prediction (column 4) is usually E, indicating the prediction is of an estimable function. Predictions of non-estimable functions are usually not printed (see Chapter 9).

  NIN alliance trial 1989        04 Apr 2008 17:00:47
  nin89
  Ecode is E for Estimable, * for Not Estimable

  Predicted values of yield
  The predictions are obtained by averaging across the hypertable
  calculated from model terms constructed solely from factors
  in the averaging and classify sets.
  The ignored set: repl
  Use !AVERAGE to move table factors into the averaging set.

  variety     Predicted_Value  Standard_Error  Ecode
  LANCER          28.5625         3.8557        E
  BRULE           26.0750         3.8557        E
  REDLAND         30.5000         3.8557        E
  CODY            21.2125         3.8557        E
  ARAPAHOE        29.4375         3.8557        E
  NE83404         27.3875         3.8557        E
  NE83406         24.2750         3.8557        E
  NE83407         22.6875         3.8557        E
64.  AINV GIV    structure   ALNORM calculates the Normal Integral

ASReml failed to SORT the pedigree.

The job file should be in ASCII format.

Try running the job with increased workspace, or using a simpler model. Otherwise send the job to VSN (mailto:support@asreml.co.uk) for investigation.

ASReml failed to expand the at() model term string. Break it into several parts on separate lines.

ASReml failed to parse the term. Revise and simplify.

An argument in the CALC statement is not valid.

ASReml is using IDV variance structure but wonders whether that is what you intended.

ASReml found alphabetic characters when it was expecting numeric data. Either the variable should be declared alphanumeric, or we have miscounted items on the line. Use !CSV if there are TAB or COMMA delimited blank lines.

Try running without the !CONTINUE qualifier.

The program did not proceed to convergence because the REML log-likelihood was fluctuating wildly. One possible reason is that some singular terms in the model are not being detected consistently. Otherwise, the updated G structures are not positive definite. There are some things to try:

  - define US structures as positive definite by using !GP,
  - supply better starting values,
  - fix parameters that you are confident of while getting better estimates for others (that is, fix variances when estimating covariances),
  - fit a simpler model,
  - reorganise the model to reduce covariance terms 
65.  E sum to zero leaving only 3 fixed degrees of freedom fitted. Therefore if the A-inverse for this pedigree was saved, it will contain !GROUPSDF 3 in the GIV file.

8.9.2 The example continued

Below is an extension of harvey.as to use harvey.giv, which is partly shown to the right. This G-inverse matrix is an identity matrix of order 74 scaled by 0.5, that is, 0.5 I. This model is simply an example which is easy to verify. Note that harvey.giv is specified on the line immediately preceding harvey.dat.

command file:

  giv file example
   Animal !P
   Sire !P
   Dam
   Line 2
   AgeOfDam
   adailygain
   ⋮
  harvey.ped !ALPHA
  harvey.giv   # giv structure file
  harvey.dat
  adailygain ~ mu Line    # fixed model
   !r grm1v(Sire !INIT 0.25)    # random model
  residual idv(units)

.giv file:

  01 01 .5
  02 02 .5
  03 03 .5
  04 04 .5
  05 05 .5
  ⋮
  72 72 .5
  73 73 .5
  74 74 .5

Model term specification associating the harvey.giv structure to the coding of sire takes precedence over the relationship matrix structure implied by the !P qualifier for sire. In this case, the !P is being used to amalgamate animals and sires into a single list, and the .giv matrix must agree with the list order.

8.10 The reduced animal model (RAM)

The reduced animal model was devised to reduce the computation involved in fitting a large animal model. When there is at most one record per individual, a large proportion of the indiv
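A file like the harvey.giv example (an identity matrix of order 74 scaled by 0.5) is simple enough to generate programmatically. The sketch below assumes the file holds one "row column value" triplet per line for the non-zero lower-triangle elements, which for a scaled identity is just the diagonal. The function name and the zero-padded index formatting are choices made for this illustration.

```python
def scaled_identity_giv_lines(order, scale):
    """Return 'row column value' lines for scale * I of the given order.

    Only diagonal elements are written, since all others are zero.
    """
    width = len(str(order))  # pad indices so columns line up, e.g. 01 ... 74
    return [
        f"{i:0{width}d} {i:0{width}d} {scale:g}"
        for i in range(1, order + 1)
    ]


lines = scaled_identity_giv_lines(74, 0.5)
giv_text = "\n".join(lines) + "\n"
```

Writing `giv_text` to a file would give 74 lines, one per sire in the amalgamated list; the order of the rows has to match that list, as noted above.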
66.  ERE RE Re eH RS  Gil Wald F Stafeti    gt  ee SEG RARE Oe EERE e EH EGE EWS i  Command file  Specifying the variance structures  Tal Applying variance models to random terms     ooa aa 0004   7 2 Process to define a consolidated model term    aoaaa aa  7 2 1 Modelling a single variance structure over several model terms         7 3 Applying variance structures to the residual error term    aooaa 2     7 3 1 Special properties and rules in defining the residual error term        7 3 2 Using sat   to specify the residual model term for data with sections  7 4 ldentinabiity   c coea hee ae he eG EER EER REAR OR DES oS  7 5 A sequence of variance structures for the NIN data                  7 6 Sigma versus gamma parameterization              2  200000048  7 6 1 Which parameterization does ASReml use for estimation             7 6 2 Switching from the gamma to the sigma parameterization            vii    116    7 7 Variance model function qualifiers               2  2 02 000     7 7 1 Parameter equality constraints   5             00 0000     7 7 2 New R4 Ways to supply distances in one dimensional metric based mod   ek Ce es eo a A ee ee we ee    T3 Yourown programi IPis es CR daret ER Eee EES  7 7 4 Parameter space constraints  Gs 2    2    ee ee  7 7 5 New R4 Initial values  INITv     2    2    2  eee ee  7 7 6 About subsections  SUBSECTION f              2  00    Tar Fatame  ter ypes Fe es a bison ok ee ee ee eo eee Y  7 7 8 Equating variance structures 1USE     4    24  20 
67.  FREE FORMAT skipping 1 lines    Univariate analysis of HT6  Summary of 6399 records retained of 6795 read    Model term Size  miss  zero MinNonOo Mean MaxNonO StndDevn  1 Nfam 71 0 0 1 36 3379 Ti   2 Nfemale 26 0 0 1 12 8823 26   3 Nmale 37 0 0 1 15 2285 37  Warning  More levels found in Clone than specified   4 Clone 926 0 0 1 464 6765 926  Warning  Fewer levels found in MatOrder than specified   5 MatOrder 914 0 0 1 432 5760 860   6 rep 8 0 0 1 4 4837 8   7 iblk 80 0 0 1 40 1164 80   8 tree 0 0 1 0000 7 473 14 00 4 018  9 row 0 0 1 0000 28 52 56 00 16 09  10 col 0 0 1 0000 10 50 20 00 5 760  Warning  Fewer levels found in prop than specified   11 prop 2 0 0 1 1 0000 1   12 culture 2 0 0 1 1 4945 2   13 treat 2 0 0 1 1 4945 2  Warning  Fewer levels found in measure than specified   14 measure 2 0 0 al 1 0000 1   15 SURV 0 6 1 0000 0 9991 1 0000 0 3061E 01  16 DBH6 4 0 0 3000E 01 11 29 16 80 2 400  17 HTG Variate 0 0 76 20 838 6 1286  163 6  18 HT8 83 O 91 44 1148  1576  170 6  19 CWAC6 3167 0 97 54 301 3 542 5 52 26  20 mu 1   21 culture rep 16 12 culture   2 6 rep   8    Warning  GRM matrix is too SMALL    171    8 11 Factor effects with large Random Regression models       22 grmi Clone  923  23 rep iblk 640 6 rep   8 7 iblk   80  Forming 2508 equations  19 dense     Initial updates will be shrunk by factor 0 316  Notice  LogL values are reported relative to a base of  30000 000    Notice  11 singularities detected in design matrix   1 LogL  2845 97 S2  8956 5 6390 df  2 
68.  Matrix US us TrLit1234  id lit   3 847 0 6368 0 2472  0 7180E 01   2 523 4 079 0 6454  0 4860   0 7674E 01 0 2063 0 2504E 01  0 3706    331    bas e u e e e e e 2 a     OU j       TTUTU TTT T yyy Oe ee    15 10 Multivariate animal genetics data   Sheep           1182  0 8241  0 4923E 01 0 7049  Covariance Variance Correlation Matrix XFA xfa1 TrDam12  id dam   1 614 1 0000 1 0000   1 465 1 330 1 0000   Leek 1 153 1 0000   Covariance Variance Correlation Matrix XFA xfa3 Trait   nrm tag    1 389 0 2978 0 1871 0 2861  0 4630E 01  0 9303E 03 0 9948  O  2017  0 7379E 01 0 4419E 01  0 8809  0 1709  0  1009  0 8568 0 2526  0 4495  0 5629E 01  0 4726E 01 0 6514E 01 0 3410 0 3155E 01 0 8583 0 2355 0 4560  0 2820  0 3004E 01 O 7277E 01 0 6992  0 4761 0 2416E 01 0 3414 0 5260    0 1869E 01     7261E 02 0 2757E 02  0 1363 0 1173 0 5210  0  1323  0 8432   0 1097E 02  0  1801 0 2190 0 2020E 01 0 1784 1 0000 0 000 0 000   1 173 0 5310E 01 0 6011E 01 0 2855  0 4530E 01 0 000 1 0000 0 000  Oe 1199  0 9449E 01 0 1164 0 4398  0 2888 0 000 0 000 1 0000    Note that the XFA matrix associated with tag has 8 rows  and columns  the first five relate  to the five traits and the last three relate to the three factors     332    Bibliography    Breslow  N  E   2003   Whither PQL   Technical Report 192  UW Biostatistics Working  Paper Series  University of Washington   URL  http   www  bepress com uwbiostat paper192     Breslow  N  E  and Clayton  D  G   1993   Approximate inference in generalized linea
69.  Prepare the data (typically using a spreadsheet or database program).

• Export that data as an ASCII file, for example export it as a .csv (comma separated values) file from Excel.
• Prepare a job file with filename extension .as.
• Run the job file with ASReml.
• Review the various output files.
• Revise the job and re-run it, or
• extract pertinent results for your report.

You will need a file editor to create the command file and to view the various output files. On Unix systems, vi and emacs are commonly used. Under Windows, there are several suitable program editors available such as ASReml-W and ConText mentioned in Section 1.3.

3.2 Nebraska Intrastate Nursery (NIN) field experiment

The yield data from an advanced Nebraska Intrastate Nursery (NIN) breeding trial conducted at Alliance in 1988/89 will be used for demonstration; see Stroup et al. (1994) for details. Four replicates of 19 released cultivars, 35 experimental wheat lines and 2 additional triticale lines were laid out in a 22 row by 11 column rectangular array of plots; the varieties were allocated to the plots using a randomised complete block (RCB) design. In field trials, complete replicates are typically allocated to consecutive groups of whole columns or rows. In this trial the replicates were not allocated to groups of whole columns, but rather overlapped columns. Table 3.1 gives the allocation of varietie
70.  !RENAME !ARG 1 2

    Slate Hall example
     Rep 6        # Six replicates of 5x5 plots in 2x3 arrangement
     RowBlk 30    # Rows within replicates numbered across replicates
     ColBlk 30    # Columns within replicates numbered across replicates
     row 10       # Field row
     column 15    # Field column
     variety 25
     yield
    barley.asd !skip 1
    !DOPATH $1
    !PATH 1    # AR1 x AR1
    y ~ mu var
    residual ar1v(column).ar1(row)
    !PATH 2    # AR1 x AR1 + units
    y ~ mu var !r idv(units)
    residual ar1v(column).ar1(row)
    !PATH 3    # incomplete blocks
    y ~ mu var !r idv(Rep) idv(Rowblk) idv(Colblk)
    residual idv(units)
    !PATH 0
    predict variety !TWOSTAGEWEIGHTS

An abbreviated ASReml output file is presented below. The iterative sequence has converged to column and row correlation parameters of (.68377, .45859) respectively. The plot size and orientation are not known, so it is not possible to ascertain whether these values are spatially sensible. It is generally found that the closer the plot centroids, the higher the spatial correlation. This is not always the case, and if the highest between-plot correlation relates to the larger spatial distance then this may suggest the presence of extraneous variation (see Gilmour et al., 1997, for example). Figure 15.5 presents a plot of the sample variogram of the residuals from this model. The plot appears in reasonable agreement with the model.

The next model includes a measurement error or nugget effect compo
71.  Tag e 29 32   50 53  phen  uusT 11 15   susT 11 15    defines  11 15  elements of phen    G  iea BE   SE   ies H HOD    320    15 10 Multivariate animal genetics data   Sheep       defines 70 74  11215    eared d   Direct  susT  4   defines 75  89  23 37   4   Maternal Damv   susT 1 6   defines 90  95  54 59   23 28  resid phen    susT  defines 96 110  60 74   23 37  WWTh2 Direct 1  phen 1   defines 111  75  60   YWTh2 Direct 3  phen 3   defines 112  77  62   GFWh2 Direct 6  phen 6   defines 113  80  65   FDMh2 Direct 10  phen 10   defines 114  84  69   FATh2 Direct 15  phen 15   defines 115  89  74   GenCor  susT  defines 116 125 from 23 37   MatCor Maternal  defines 126 129 from 90 95    POM mteeetyAa aa      Table 15 15  Variance models fitted for each part of the ASReml job in the analysis of the  genetic example       term matrix  PATH 1  PATH 2  PATH 3          sire 5  diag fal us   dam Xa diag fal fal  litter 5X  diag fal us  error de us us us  LogL  1566 45  1488 11  1480 89  Parameters 36 48 55          The specification in Release 3 required specification of initial values for variance parameters  and also through the use of  CONTINUE the generation of initial values from previous anal   yses  In Release 4  with the functional specification and no initial values specified  ASReml  will estimate initial values  In this example we start by fitting diagonal matrices for sire dam  and litter using initial values from univariate analyses and estimate an unstructured res
72.  The default is to read all values in the file, regardless of layout. Otherwise, the weights must appear in a single column (field), one weight per line, where the field is specified by appending :c to the filename.

Consider a rather complicated example from a rotation experiment conducted over several years. One analysis was of the daily live weight gain per hectare of the sheep grazing the plots. There were periods when no sheep grazed. Different flocks grazed in the different years. Daily liveweight gain was assessed between 5 and 8 times in the various years. To obtain a measure of total productivity in terms of sheep liveweight, we need to weight the daily gain by the number of sheep grazing days per month. The production for each year is given by

    predict year 1 crop 1 pasture lime !AVE month 56 55 56 53 57 63 6*0
    predict year 2 crop 1 pasture lime !AVE month 36 0 0 53 23 24 54 54 43 35 0 0
    predict year 3 crop 1 pasture lime !AVE month 70 0 21 17 0 0 0 70 0 0 53 0
    predict year 4 crop 1 pasture lime !AVE month 53 56 22 92 19 44 0 0 36 0 0 49
    predict year 5 crop 1 pasture lime !AVE month 0 22 0 53 70 22 0 51 16 51 0 0

but to average over years as well, we need one of the following predict statements:

    predict crop 1 pasture lime !PRES year month !PRWTS 56 55 56 53 57 63 0 0 0 0 0 0
      36 0 0 53 23 24 54 54 43 35 0 0
      70 0 21 17 0 0 0 70 0 0 53 0
      53 56 22 92 19 44 0 0 36 0 0 49
      0 22 0 53 70 22 0 51 16 51 0 0/5
    predict crop 1 pasture lime !PRES m
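
The weighting idea in the sheep example above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not of what ASReml does internally: the grazing-day weights for year 1 are taken from the predict statement in the text, while the daily gain values (`example_gain`) and the function name `weighted_total` are hypothetical, and whether ASReml normalises the weights before averaging is not addressed here.

```python
# Hypothetical illustration: weight monthly daily-gain predictions by the
# number of sheep grazing days in each month and add them up.

def weighted_total(daily_gain, grazing_days):
    """Return sum(gain_m * days_m) over the months."""
    if len(daily_gain) != len(grazing_days):
        raise ValueError("need exactly one weight per month")
    return sum(g * d for g, d in zip(daily_gain, grazing_days))

# Grazing days per month for year 1, as listed in the text.
year1_days = [56, 55, 56, 53, 57, 63, 0, 0, 0, 0, 0, 0]

# Made-up daily gains (constant for simplicity), NOT ASReml output.
example_gain = [0.5] * 12

total = weighted_total(example_gain, year1_days)
print(total)
```

Months with a weight of zero (no sheep grazing) drop out of the total, which is why the zero entries matter in the weight lists.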
73.  There are now three consolidated model terms: idv(repl), idv(units) and ar1v(column).ar1(row). This order is reversed in 4.

4 Two dimensional separable autoregressive spatial model defined as a G structure

This model is equivalent to 3c but with the spatial model defined as a G structure rather than an R structure. The algebraic form is written alternatively, but equivalently, to the form in 3c, that is, $\mathrm{var}(u_r) = \sigma_r^2 I$, $\mathrm{var}(u_{cr}) = \sigma_{cr}^2\, \Sigma_c(\rho_c) \otimes \Sigma_r(\rho_r)$ and $\mathrm{var}(e) = \sigma^2 I$.

    NIN Alliance Trial 1989
     variety !A
     id
     pid
     raw
     repl 4
     nloc
     yield
     lat
     long
     row 22
     column 11
    nin89.asd !skip 1
    yield ~ mu variety !r idv(repl) ar1v(column).ar1(row)
    residual idv(units)

Important points

• the same G structure could be achieved by specifying ar1(column).ar1v(row); see similar comment in example 3b,
• if the variance structure ar1v(column).ar1v(row) was specified ASReml would report an error; see identical comment in example 3b,
• estimation is based on the gamma parameterization, in which case both the estimated sigmas and the estimated gammas are reported. The user can force ASReml to use the sigma parameterization by placing the !SIGMAP qualifier immediately after the dependent (response) variable and before ~ on the model definition line. In this case only the sigmas would be reported, but they would be reported twice in the output; see Important points under example 3a.

7.6 Sigma versus gamma parameterization
74.  volved 35 teams of wethers representing 27 bloodlines. The file wether.dat shown below contains greasy fleece weight (kg), yield (percentage of clean fleece weight to greasy fleece weight), and fibre diameter (microns). The code (wether.as) to the right performs a basic bivariate analysis of this data.

    SheepID Site Bloodline Team Year GFW Yield FD
    0101 3 21 1 1 5.6 74.3 18.5
    0101 3 21 1 2 6.0 71.2 19.6
    0101 3 21 1 3 8.0 75.7 21.5
    0102 3 21 1 1 5.3 70.9 20.8
    0102 3 21 1 2 5.7 66.1 20.9
    0102 3 21 1 3 6.8 70.3 22.1
    0103 3 21 1 1 5.0 80.7 18.9
    0103 3 21 1 2 5.5 75.5 19.9
    0103 3 21 1 3 7.0 76.6 21.9
    ...
    4013 3 43 35 1 7.9 75.9 22.6
    4013 3 43 35 2 7.8 70.3 23.9
    4013 3 43 35 3 9.0 76.2 25.4
    4014 3 43 35 1 8.3 66.5 22.2
    4014 3 43 35 2 7.8 63.9 23.3
    4014 3 43 35 3 9.9 69.8 25.5
    4015 3 43 35 1 6.9 75.1 20.0
    4015 3 43 35 2 7.6 71.2 20.3
    4015 3 43 35 3 8.5 78.1 21.7

8.2 Model specification

The syntax for specifying a multivariate linear model in ASReml is

    Y-variates ~ fixed [!r conrandom] [!f sparse_fixed]
    residual conresidual

• Y-variates is a list of up to 20 traits (there may be more than 20 actual variates if the list includes sets of variates defined with !G on page 49).

• fixed, conrandom and sparse_fixed are as in the univariate case (see Chapter 6) but involve the special term Trait and interactions with Trait.

The design matrix for Trait has a level (column) fo
75.  Trait  pust Trai   sus Trait  sus Trait   Mel irait  yust Trai    us Trait   us  Trait   mms  Trait  sustiT raid    us Trait sus Trait    us Trait   us  Trait    us Trait   us  Trait    us Trait   us  Trait     1    OANODOa PWN    ererrrhe  PWN Fe OO    15          Results from analysis of wwt    id units   id units   id units   id units   id units   id units   id units   id units   id units   id units   id units   id units   id units   id units   id units     diag  TrSG123   16 diag TrSG123   sex grp diag TrSG123   17 diag TrSG123   sex grp  diag  TrSG123   18 diag TrSG123   sex grp diag TrSG123      Sex grp    35200 effects    147 effects    diag  TrAG1245   age   19 diag TrAG1245     20 diag TrAG1245    21 diag TrAG1245    22 diag TrAG1245    us Trait   id sire     Brp  age  age  age  age    196 effects   grp diag TrAG1245    grp diag TrAG1245    grp diag TrAG1245    grp diag TrAG1245    460 effects    23  24  25  26  2r  28  29  30  31  32  33  34  35  36  37      s Trait   us Trait   ne  Trait   us Trait   us  Trait   us  Trait   us  Trait   us  Trait   us  Trait   ae   Trait   us  Trait   ne  Trait   us  Trait   ie Trait   us  Trait      id sire   us  Trait    id sire   us  Trait    id sire   us  Trait    id sire   us  Trait    id sire   us  Trait    id sire   us  Trait    id sire   us  Trait    id sire   us  Trait    id sire   us  Trait   id sire   us Trait   Ad sire  sus  Trait    id sire   us  Trait    id sire   us  Trait    id sire   us  Trait    id sire   us  Trai
76.  a   none  blocks fixed   2 RCB analysis  Ir idv repl  oy  a rT  Yr  blocks random residual idv units  aT a  a   3a Two dimensional Ir idv repl  ozi  g YT  Yr  spatial model residual idv column   ari  row  a la ZDAN Tapi Tap  Pr  correlation in  one direction   Ss   36  Two dimensional Ir idv rep1  ef  g rl  Yr  separable residual ariv column   ar1  row  oe slic    B   pr  Cx Pe Dapa Dp  Pes De  autoregressive  spatial model   3c Two dimensional Ir idv repl   got  o Yr  Vr  separable idv  units  et on ant ee In  autoregressive residual ariv column   ar1  row  of  Dell   Volpe  of  rs Dis Bel     Val pp  Prs Pe  spatial  model with  measurement  error   4 Two dimensional  r idv repl   oot  o2 yT  Vr  separable autoregressive ariv column   ari  row  Ge Sap E  Gale Taa O Lelie   ena Pe Pe  spatial model residual idv units    lcs o2 A    defined as a  G structure             uoljezZuajawesed ewwes SNSADA ewSIS 9 7    7 7 Variance model function qualifiers       7 7 Variance model function qualifiers    A consolidated model term is comprised of one or more covariance components  where a  covariance component is a component of the model term to which a variance model function  has been applied  see Section 2 1 8 and Table 7 2  All of the covariance components so far  have been of the form    umfname component     where umfname is the variance model function name  in this font in first column of Table  7 6  and component is a component in the model term  Two single covariance compon
77.  and we present a discussion of this code to the left. We present the model specification explicitly to help the user understand the logic. In some cases, experienced users will wish to save typing by relying on default rules. These are discussed in Section 7.10.

1 Randomised complete blocks analysis (blocks fixed)

The only random term in a traditional randomised complete block (RCB) analysis of the NIN data is the residual error term $e \sim N(0, \sigma^2 I)$. The model therefore involves just one R structure (IDV) and no G structure. The variance model function name is idv and there is just one consolidated model term, idv(units).

2 RCB analysis (blocks random)

The random effects RCB model has 2 random terms to indicate that the total variation in the data is comprised of 2 components: a random replicate term $u_r \sim N(0, \sigma_r^2 I)$ and the residual error term as in example 1. The !r before repl tells ASReml that repl is a random term. All random terms must be written after !r in the model specification line(s). This model involves both the original IDV R structure and an IDV G structure for the random replicate term. There are now 2 consolidated model terms, idv(repl) and idv(units).

    NIN Alliance Trial 1989
     variety !A
     id
     pid
     raw
     repl 4
     nloc
     yield
     lat
     long
     row 22
     column 11
    nin89.asd !skip 1
    yield ~ mu variety repl
    residual id
78.  be arranged with key fields followed by other fields from the primary file and then fields from the secondary file.

Table 11.1: List of MERGE qualifiers

qualifier         action

!CHECK            requests ASReml confirm that fields having a common name have the same contents. Discrepancies are reported to the .asr file. If there are fields with common names which are not key fields, and !CHECK is omitted, the fields will be assumed different and both versions will be copied.

!KEY keyfields    names the fields which are to be used for matching records in the files. If the fields have the same name in both file headers, they need only be named in association with the primary input file. If the key fields are the only fields with common names, the !KEY qualifier may be omitted altogether. If key fields are not nominated and there are no common field names, the files are interleaved.

!KEEP             instructs ASReml to include in the merged file records from the input file which are not matched in the other input file. Missing values are inserted as the values from the other file. Otherwise, unmatched records are discarded. !KEEP may be specified with either or both input files.

!NODUP fields     Typically when a match occurs, the field contents from the second file are combined with the field contents of the first file to produce the merged file. The !NODUP qualifier, which may only be associated with the second file, causes the field contents for the nominated fields from th
79.  conditional Wald F statistic column to the Wald F Statistics table. It enables inference for fixed effects in the dense part of the linear mixed model to be conducted so as to respect both structural and intrinsic marginality (see Section 2.5). The detail of exactly which terms are conditioned on is reported in the .aov file. The marginality principle used in determining this conditional test is that a term cannot be adjusted for another term which encompasses it explicitly (e.g. term A.C cannot be adjusted for A.B.C) or implicitly (e.g. term REGION cannot be adjusted for LOCATION when locations are actually nested in regions although they are coded independently). !FOWN on page 78 provides a way of replacing the conditional Wald F statistic by specifying what terms are to be adjusted for, provided its degrees of freedom are unchanged from the incremental test.

Table 5.3: List of commonly used job control qualifiers

qualifier    action

!MAXIT n   !SUM   !X v   !Y v   !G v   !JOIN

!MAXIT n     sets the maximum number of iterations; the default is 10 for traditional models, more for general models. ASReml iterates for n iterations unless convergence is achieved first. Convergence is presumed when the REML log-likelihood changes less than 0.002 × current iteration number and the individual variance parameter estimates change less than 1%. If the job has not converged in n iterations, use the !CONTINU
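
The convergence rule quoted in the !MAXIT entry can be written out as a small check. The Python sketch below is a reading of that one sentence, not ASReml's actual code: the function names `relative_change` and `has_converged`, the treatment of a parameter that is exactly zero, and the example numbers are all assumptions made for illustration.

```python
def relative_change(old, new):
    """Relative change of one variance parameter between iterations."""
    if old == 0.0:
        # Assumption: a parameter sitting at zero counts as unchanged
        # only if it stays at zero.
        return 0.0 if new == 0.0 else float("inf")
    return abs(new - old) / abs(old)


def has_converged(logl_prev, logl_curr, params_prev, params_curr, iteration):
    """Stopping rule as described in the text: the REML log-likelihood
    changes by less than 0.002 * iteration AND every variance parameter
    changes by less than 1%."""
    logl_ok = abs(logl_curr - logl_prev) < 0.002 * iteration
    params_ok = all(
        relative_change(old, new) < 0.01
        for old, new in zip(params_prev, params_curr)
    )
    return logl_ok and params_ok


# Illustrative numbers only (not taken from an ASReml run).
print(has_converged(-100.500, -100.499, [2.00], [2.01], iteration=5))
print(has_converged(-101.000, -100.500, [2.00], [2.01], iteration=5))
```

Note that the log-likelihood tolerance grows with the iteration number, so the test becomes easier to pass the longer the job runs.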
80.  • !VCC c specifies that there are c lines defining parameter relationships.

• If !VCC is used a residual line is required and the parameter relationship lines must occur after this residual line.

• Each relationship is specified in a separate line of the form

    i k                                    simple case
    i k*v_k ... p*v_p !BLOCKSIZE n         general case

In this specification:

- i and k ... p are the numbers of the specific variance model parameters and $v_m$, m = k ... p, are the associated scale coefficients such that $\gamma_m \times v_m$ is equal in value to $\gamma_i$; for example, 5 7*1 indicates that $\gamma_7 \times 1 = \gamma_5$, i.e. parameter 7 is equal to parameter 5; 5 7*.1 indicates that parameter 7 is a tenth of parameter 5,

- * indicates the presence of the scale coefficient $v_m$ for the parameter m,

- if the coefficient is 1, indicating parameter equality, the *1 can be omitted; for example, 5 7 is a simplified coding of the first example,

- if the coefficient is -1, i k*-1 can be simplified to i -k; for example, 5 -7 indicates that parameter 7 has the same magnitude but opposite sign to parameter 5,

- the !BLOCKSIZE n qualifier is used when constraints of the same form are required on blocks of n contiguous parameters; for example, 21 29 !BLOCKSIZE 8 equates parameters 29 with 21, 30 with 22, ..., 36 with 28,

- a variance structure parameter may only be included in one relationship line; to equate several compo
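
The !BLOCKSIZE and sign shorthand described above can be made concrete with a short Python sketch. It only expands the simple case `i k` (optionally with a negative k and a block size) into explicit parameter pairs; the function name `expand_relationship` and the tuple layout are hypothetical, and scale coefficients other than 1 and -1 are deliberately left out.

```python
def expand_relationship(i, k, blocksize=1):
    """Expand a simple relationship line 'i k !BLOCKSIZE n' into a list of
    (constrained_parameter, reference_parameter, sign) triples.

    A negative k is read as 'same magnitude, opposite sign', following the
    'i -k' shorthand described in the text.
    """
    sign = -1 if k < 0 else 1
    k = abs(k)
    return [(k + offset, i + offset, sign) for offset in range(blocksize)]


# '21 29 !BLOCKSIZE 8' equates 29 with 21, 30 with 22, ..., 36 with 28.
for constrained, reference, sign in expand_relationship(21, 29, blocksize=8):
    print(constrained, "tied to", reference, "with sign", sign)

# '5 -7': parameter 7 has the magnitude of parameter 5 but the opposite sign.
print(expand_relationship(5, -7))
```

Writing the pairs out this way makes it easy to check that no parameter appears in more than one relationship, which the text requires.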
81.  ee eee ee na  10 2 3 Forming a job template from a data file                   10 3   Command line options     2 ek c soea ta ke he eR RS eR e E SE  10 3 1 Prompt for arguments  A     oo aa  10 3 2 Output control  B   OUTFOLDER  IXML  2 224242 he eee dee  10 3 3 Debug command line options  D  E                      10 3 4 Graphics command line options  G  H  1  N Q                 10 3 5 Job control command line options  C  F  O  R                10 3 6 Workspace command line options  S  WW               0      MSS Fe ee he ees ES BS ASP N S A SE EER    10 4 Advanced processing arguments              2  0000002 eee  10 4 1 Standard use of arguments           0   000 eee eee  10 4 2  Frompting for Wt  nw ssc eh eee ee he Ewe AS GAS SS Y  10 4 3 Paths and Loops     24 o sacras cacadan behead eee de  10 4 4 Order of Substitution       aaao OY wD Oe ee See  105 Perlormmengce BONES  o c ee ee bope ee b OE Re pie Eee ee Sew ed  10 5 1 Multiple processors     ck ee hb eh we ee ee ees  10 5 2 Slow processes nb kk ba eae HES REE ER tarti  10 5 3 Timing processes ns ke eee Re ERE HE RS  11 Command file  Merging data files  ILI  MRIMCTO es ce a ee eae eee ee eee ee eee ee ES  22   WHET ew satos adada ek BG 6 He Gok HHS Gee eS eR  Tt EOS  si th eh EMA EERE SEEMS SHES SE HES A G SH  12 Functions of variance components  121  lniodugtom o se ed oe ee PRR ee EER ee OR Ree SS  ee es eo a e a e ee ee ee ee ee ee A  12 2 1 Functions of components      o sa o ead we So ee ee ee es  12 2 2 Conve
82.  XFAk (xfak), extended factor analytic: Γ contains covariance factors, Ψ contains specific variance; kω + ω parameters.

Details of the variance models available in ASReml

variance     variance model   description                                          number of parameters†
structure    function name                                                         corr   hom variance   het variance
name

relationship matrices

AINV                          inverse relationship matrix derived from pedigree    0      1‡
NRM          nrm              relationship matrix derived from pedigree            0      1‡
GIV1         giv1             generalized inverse number 1                         0      1‡
GIV8         giv8             generalized inverse matrix 8                         0      1‡
GRM1         grm1             generalized relationship number 1                    0      1‡
GRM8         grm8             generalized relationship matrix 8                    0      1‡

† This is the number of variance structure parameters; ω is the dimension of the matrix. The homogeneous variance form is specified by appending V to the correlation basename; the heterogeneous variance form is specified by appending H to the correlation basename.
‡ These will be associated with 1 variance parameter unless used in direct product with another structure that provides the variance. Appending a v to a name makes it explicit that a variance parameter is fitted.

8 Command file: Multivariate analysis

8.1 Introduction

Multivariate analysis is used here in the narrow sense of a multivariate mixed model. There are many other multivariate analysis techniques which are not covered by ASReml. Multivariate analysis is used when we are interes
83.  file is by programs written to parse ASReml output. For further details, including the status of intended future developments, please contact support@vsni.co.uk.

10.3.3 Debug command line options (D, E)

D and E (!DEBUG, !DEBUG 2) invoke debug mode and increase the information written to the screen or .asl file. This information is not useful to most users. On Unix systems, if ASReml is crashing, use the system script command to capture the screen output rather than using the L option, as the .asl file is not properly closed after a crash.

10.3.4 Graphics command line options (G, H, I, N, Q)

Graphics are produced by ASReml on some platforms (e.g. PC and Linux) using the Winteracter graphics library.

The I (!INTERACTIVE) option permits the variogram and residual graphics to be displayed. This is the default unless the L option is specified.

The N (!NOGRAPHICS) option prevents any graphics from being displayed. This is the default when the L option is specified.

The Gg (!GRAPHICS g) option sets the file type for hard copy versions of the graphics. Hard copy is formed for all the graphics that are displayed.

Hg (!HARDCOPY g) replaces the G option when graphics are to be written to file but not displayed on the screen. The H may be followed by a format code, e.g. H22 for .eps.

Q (!QUIET) is used when running under the control of ASReml-W to suppress any POPUPs/PAUSES from ASReml.

ASReml writes
84.  files produced by this job include the .aov, .pvs, .res, .tab, .sln and .yht files; see Section 13.4.

3.6.1 The .asr file

Below is nin89.asr with pointers to the main sections. The first line gives the version of ASReml used (in square brackets) and the title of the job. The second line gives the build date for the program and indicates whether it is a 32bit or 64bit version. The third line gives the date and time that the job was run and reports the size of the workspace. The general announcements box (outlined in asterisks) at the top of the file notifies the user of current release features. The remaining lines report a data summary, the iteration sequence, the estimated variance parameters and a table of Wald F statistics. The final line gives the date and time that the job was completed and a statement about convergence.

    ASReml 3.1 [01 Jan 2011] NIN alliance trial 1989                 job heading
    Build cm [25 Oct 2011] 64 bit
    04 Nov 2011 21:14:28.404 32 Mbyte Linux (x64) nin89
    Licensed to: Cargo Vale Olives Univ of Wollongong at  Jul 2012
    *****************************************************************
    * Contact support@asreml.co.uk for licensing and support        *
    *****************************************************************
    Folder: /home/gilmoua/W7drive/Users/Public/ASReml/asr3/ug3/Manex4
     variety !A
    QUALIFIERS: !SKIP 1
    Reading nin89.asd FREE FORMAT skipping 1 lines

    Univariate analysis of yield
    Summary of 224 records retained of 224 read                      data summary

    Model term Size #mis
85.  for Estimable; * for Not Estimable

    Warning: mv_estimates is ignored for prediction
    The predictions are obtained by averaging across the hypertable
    calculated from model terms constructed solely from factors
    in the averaging and classify sets.
    Use !AVERAGE to move ignored factors into the averaging set.

    Predicted values of yield
    variety      Predicted_Value  Standard_Error  Ecode          predicted variety means
    LANCER       24.0891          2.4648          E
    BRULE        2 .0731          2.4946          E
    REDLAND      28.7953          2.5066          E
    CODY         23.7733          2.4973          E
    ARAPAHOE     27.0429          2.4420          E
    NE83404      25.7199          2.4426          E
    NE83406      25.3793          2.5030          E
    NE83407      24.3981          2.6892          E
    CENTURA      26.3531          2.4765          E
    SCOUT66      29.1741          2.4363          E
    NE87615      25.1218          2.4436          E
    NE87619      30.0261          2.4669          E
    NE87627      19.7108          2.4836          E
    SED: Overall Standard Error of Difference 2.925              SED summary

13.4.7 The .res file

The .res file contains miscellaneous supplementary information including

• a list of unique values of x formed by using the fac() model term,
• a list of unique (x, y) combinations formed by using the fac(x,y) model term,
• Legendre polynomials produced by the leg() model term,
• orthogonal polynomials produced by the pol() model term,
• the design matrix formed for the spl() model term,
• predicted values of the curvature component of cubic smoothing splines,
• the empirical variance covariance matrix based on the BLUPs when a       J or J
86.  for example, use CORUH instead of US.

Table 14.3: Alphabetical list of error messages and probable cause(s)/remedies

error message: probable cause/remedy

Correlation structure is not positive definite: It is best to start with a positive definite correlation structure. Maybe use a structured correlation matrix.

Data does not have sections: The data does not match the RESIDUAL specification.

Define structure for: A variance structure should be specified for this term.

Error: The indicated number of input fields exceeds the limit: The reported limit is hardcoded. The number of variables to be read must be reduced.

Error in !CONTRAST label factor values: The error could be in the variable (factor) name or in the number of values or the list of values.

Error in !GROUP label factor values: The list of values does not agree with the factor definition.

Error in !SUBSET label factor values: The error could be in the variable (factor) name or in the number of values or the list of values.

Error in extended !ASSIGN: The !< !> qualifiers allow an assign string to be defined over several

Error in R structure: model checks

Error opening file

Error in list

Error in PREDICT

Error in variance header line

Error in Variance Parameter Constraint

Error opening file

Error order

Error parsing

Error reading something
87.  for these instructions are discussed. Direct use of the .pin file, as was required in ASReml 2, is discussed in Section 12.3.

12.2 Syntax

Instructions to calculate functions are headed by a line

    VPREDICT !DEFINE

This line and the following instructions can occur anywhere in the .as file, but the logical place is at the end of the file. The instructions are processed after the job (part/cycle) has been completed. ASReml recognises a blank line (or end of file) as termination of the functional instructions.

Functions of the variance components are specified by lines of the form

    letter label coefficients

• letter (either F, H, R, S, V or X) must occur in column 1
  - F forms linear combinations of variance components,
  - H is for forming heritabilities, the ratio of two components,
  - R is for forming the correlation from a covariance component,
  - S is a square root function,
  - V is for converting components related to a CORUH or an XFA structure into components related to a US structure,
  - X is a multiply function,
• label names the result,
• coefficients is the list of arguments/coefficients for the linear function.

When ASReml reads back the variance parameters from the .asr file, each covariance component (or variance function) is assigned a name. The full name is usually the covariance function (or its specified contracted form) prepended by the consolidated model term (or its specified contracted form) and
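
To make the F, H and R function types described above concrete, here is a minimal Python sketch of the arithmetic each one names. It is an illustration only: the function names, the argument order and the numeric values are hypothetical, and nothing here reproduces ASReml's own syntax or its standard error calculations.

```python
import math


def linear_combination(components, coefficients):
    """F-style function: a linear combination of variance components."""
    return sum(c * v for c, v in zip(coefficients, components))


def heritability(numerator, denominator):
    """H-style function: the ratio of two components."""
    return numerator / denominator


def correlation(covariance, variance_a, variance_b):
    """R-style function: turn a covariance component into a correlation."""
    return covariance / math.sqrt(variance_a * variance_b)


# Hypothetical variance components, chosen only to make the arithmetic easy.
genetic, residual = 2.0, 6.0
phenotypic = linear_combination([genetic, residual], [1.0, 1.0])

print(phenotypic)                          # total variance
print(heritability(genetic, phenotypic))   # ratio of two components
print(correlation(1.5, 4.0, 9.0))          # covariance scaled by two variances
```

The phenotypic variance is formed first because a ratio such as a heritability usually needs a denominator that is itself a sum of components.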
88.  form is said to be nonnegative definite if $x'Ax \ge 0$ for all $x \in R^n$. If $x'Ax$ is nonnegative definite and in addition the null vector 0 is the only value of $x$ for which $x'Ax = 0$, then the quadratic form is said to be positive definite. Hence the matrix A is said to be positive definite if $x'Ax$ is positive definite; see Harville (1997), pp 211.

7.11.3 Notes on the variance models

These notes provide additional information on the variance models defined in Table 7.6.

• the IDH and DIAG models fit the same diagonal variance structure,

• the CORGH and US are equivalent variance structures parameterised differently. Both may fail to converge if the starting values are not good and/or if the maximum REML likelihood occurs at parameter values outside the parameter space. The us model is likely to be better when the matrix is of order 3 or higher,

• in CHOLk models $\Sigma = LDL'$ where L is lower triangular with ones on the diagonal, D is diagonal and k is the number of non-zero off diagonals in L,

• in CHOLkC models $\Sigma = LDL'$ where L is lower triangular with ones on the diagonal, D is diagonal and k is the number of non-zero sub diagonal columns in L. This is somewhat similar to the factor analytic model,

• in ANTEk models $\Sigma^{-1} = UDU'$ where U is upper triangular with ones on the diagonal, D is diagonal and k is the number of non-zero off diagonals in U,

• the CHOLk and ANTEk models are equivalent to the US structure, that is, the full variance st
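
The definition of positive definiteness and the $\Sigma = LDL'$ form used by the CHOLk models can be tried out numerically. The pure-Python sketch below is a minimal illustration under stated assumptions: the helper names `ldl_product` and `is_positive_definite` are hypothetical, the input matrix is assumed symmetric, and the test relies on the standard fact that a symmetric matrix is positive definite exactly when its Cholesky factorisation succeeds with strictly positive pivots.

```python
import math


def ldl_product(lower, diag):
    """Form Sigma = L D L' from a unit lower-triangular L (list of rows)
    and the diagonal of D (list of values)."""
    n = len(diag)
    return [
        [sum(lower[i][k] * diag[k] * lower[j][k] for k in range(n))
         for j in range(n)]
        for i in range(n)
    ]


def is_positive_definite(a, tol=1e-12):
    """Attempt a Cholesky factorisation of the symmetric matrix a;
    a non-positive pivot means a is not positive definite."""
    n = len(a)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False
                chol[i][j] = math.sqrt(s)
            else:
                chol[i][j] = s / chol[j][j]
    return True


# Illustrative 2 x 2 example: positive entries in D give a positive definite Sigma.
lower = [[1.0, 0.0], [0.5, 1.0]]
diag = [4.0, 3.0]
sigma = ldl_product(lower, diag)

print(sigma)
print(is_positive_definite(sigma))
print(is_positive_definite([[1.0, 2.0], [2.0, 1.0]]))
```

Because L has ones on its diagonal, the sign of each entry of D decides whether the resulting matrix stays inside the parameter space, which is one reason this parameterisation is convenient.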
89.  functions of the variance components       CENTURA 21 6500 3 8557 E  SCOUT66 27 5250 3 3557 E  COLT 27 0000 3 8557 E  NE87613 29 4000 3 9657 E  NE87615 25 6875 3 8557 E  NE87619 31 2625 3 9957 E  NE87627 23 2250 2 8557 E  SED  Overall Standard Error of Difference 4 979    39    4 Data file preparation    4 1    Introduction    The first step in an ASReml analysis is to prepare the data file  Data file preparation is  discussed in this chapter using the NIN example of Chapter 3 for demonstration  The first  25 lines of the data file are as follows        CODY 4    NE83404  NE83406  NE83407  CENTURA  SCOUT66  COLT 11  NE83498  NE84557  NE83432  NE85556  NE85623    NE86482       LANCOTA          CENTURK78 17 1117 632  NORKAN 18 1118 446 1 4 22  KS831374 19 1119 684 14 3  TAM200 20 1120 422 1 4 2    HOMESTEAD 22 1122 566 1 4    variety id pid raw repl nloc yield lat long row column  BRULE 2 1102 631 1 4 31 55 4 3 20 4 17 1  REDLAND 3 1103 701 1 4 35 05 4 3 21 6 18 1   4 3 22 8  ARAPAHOE 5 1105 661 1 4 33 05 4 3    1104 602 1 4 30 1    6 1106 605 1 4 30 2  7 1107 704 1 4 35 2  8 1108 388 1 4 19 4  9 1109 487 1 4 24 3  10 1110 511 1 4 25  1111 502 1 4 25 1 8   12 1112 492  13 1113 509  14 1114 268  15 1115 633  16 1116 513    ererrer    3     ae  4   1 1  21 1121 560 1 4 28    23 1123 514 1 4 25     NE86501 24 1124 635 1 4 31 75 8 6 20 4 17 2  NE86503 25 1125 840 1 4 42 8 6 21 6 18 2     25 4 3 25 2 21 1    5 8 0 2 4 2 2  8    4 24 6 8 665 2   4 25 45 8 6 7 2 6 2  4 13 4 8 6 8 4 7
90.  have  the same name label  For example   IMBF mbf  entry  mlib m35 csv  RENAME Marker35    If the key values are the ordered sequence 1  N  the key field may be  omitted if  NOKEY is specified  If the key is not in the first field  its  location can be specified with  KEY k  If extracting a single covariate  from a large set of covariates in the file  the specific field to extract can  be given by  FIELD s in absolute terms  or relative to the key field by   RFIELD r  For example    IMBF mbf variety 1  markers csv  key 1  RFIELD 35  RENAME Marker35     SKIP k requests the first k lines of the file be ignored     SPARSE can be used when the covariates are predominately zero  Each  key value is followed by as many column value pairs as required to  specifiy the non zero elements of the design for that value of key  The  pairs should be arranged in increasing order of column within rows   The rows may be continued on subsequent lines of the file provided  incomplete lines end with a COMMA     This file may now be a binary format file  with file extension  bin indi   cating 32bit real binary numbers and  dbl indicating 64bit real binary  values  Files with these formats can be easily created in a preliminary  run using the  SAVE qualifier  The advantage of using a binary file is  that reading the file is much quicker  This is important if the file has  many fields and is being accessed repeatedly  for example    ICYCLE 1 1000  IMBF mbf  Geno  markers dbl  key 1  RFIELD  I  renam
91. inc tests the additional variation explained when the term is added to a model consisting of the I terms. F con tests the additional variation explained when the term is added to a model consisting of the I and C/c terms. The   terms are ignored for both F inc and F con tests.

Incremental F statistics - calculation of Denominator degrees of freedom

 Source            Size  NumDF  F-value    Lambda*F   Lambda   DenDF
 mu                   1      1  245.1409   245.1409   1.0000    5.0000
 variety              3      2    1.4853     1.4853   1.0000   10.0000
 LinNitr              1      1  110.3232   110.3232   1.0000   45.0000
 nitrogen             4      2    1.3669     1.3669   1.0000   45.0000
 variety.LinNitr      3      2    0.4753     0.4753   1.0000   45.0000
 variety.nitrogen    12      4    0.2166     0.2166   1.0000   45.0000

Conditional F statistics - calculation of Denominator degrees of freedom

 Source            Size  NumDF  F-value    Lambda*F   Lambda   DenDF
 mu                   1      1  138.1360   138.1360   1.0000    6.0475
 variety              3      2    1.4853     1.4853   1.0000   10.0000
 LinNitr              1      1  110.3232   110.3232   1.0000   45.0000
 nitrogen             4      2    1.3669     1.3669   1.0000   45.0000
 variety.LinNitr      3      2    0.4753     0.4753   1.0000   45.0000
 variety.nitrogen    12      4    0.2166     0.2166   1.0000   45.0000

13.4.2 The .asl file

The .asl file is primarily used for low-level debugging. It is produced when the !LOGFILE qualifier is specified and contains low-level debugging information when the !DEBUG qualifier is also given.

However, when a job running on a Unix system crashes with a Segmentation fault, the output buffers are not flushed
92.  inverse being reformed  unless  MAKE is spec   ified   this saves time when performing repeated analyses based on a particular pedi   gree       delete ainverse bin or specify  MAKE if the pedigree is changed between runs     e identities are printed in the  sln and the  aif file       identities should be whole numbers less than 200 000 000 unless  ALPHA is specified     pedigree lines for parents must precede their progeny     unknown parents should be given the identity number 0       if an individual appearing as a parent does not appear in the first column  it is assumed  to have unknown parents  that is  parents with unknown parentage do not need their  own line in the file       identities may appear as both male and female parents  for example  in forestry     We refer the reader to the sheep genetics example on page 317     Table 8 1  List of pedigree file qualifiers       qualifier description         ALPHA indicates that the identities are alphanumeric with up to 225 characters  otherwise  by default they are numeric whole numbers  lt  200 000 000  If using long alphabetic  identities  use  SLNFORM to see the full identity in the  s1n file     IDIAG causes the pedigree identifiers  the diagonal elements of the Inverse of the Relationship    AIF and the inbreeding coefficients for the individuals  calculated as the diagonal of A     J    and a factor with levels Parent and Nonparent indicating if the individual is a parent   with progeny in the pedigree  or a non p
93. is fitted as fixed to allow for the likely scenario that, rather than a single population of treatment by variety effects, there are in fact two populations (control and treated) with a different mean for each. There is evidence of this prior to analysis with the large difference in mean sqrt(rootwt) for the two groups (14.93 and 8.23 for control and treated respectively). The inclusion of tmt as a fixed effect ensures that BLUPs of tmt.variety effects are shrunk to the correct mean (treatment means rather than an overall mean).

The model for the data is given by

$$ y = X\tau + Z_1 u_1 + Z_2 u_2 + Z_3 u_3 + Z_4 u_4 + Z_5 u_5 + e \qquad (15.7) $$

where $y$ is a vector of length $n = 264$ containing the sqrt(rootwt) values, $\tau$ corresponds to a constant term and the fixed treatment contrast, and $u_1 \ldots u_5$ correspond to random variety, treatment by variety, run, treatment by run and variety by run effects. The random effects and error are assumed to be independent Gaussian variables with zero means and variance structures $\mathrm{var}(u_i) = \sigma_i^2 I_{b_i}$ (where $b_i$ is the length of $u_i$, $i = 1 \ldots 5$) and $\mathrm{var}(e) = \sigma^2 I_n$.

The ASReml code for this analysis is

 Bloodworm data Dr M Stevens
  pair 132
  rootwt
  run 66
  tmt 2 !A
  id
  variety 44 !A
 rice.asd !skip 1 !DOPATH 1
 !PATH 1
 sqrt(rootwt) ~ mu tmt !r idv(variety) idv(variety.tmt) idv(run),
                idv(pair) idv(run.tmt)
 residual idv(units)
 !PATH 2
 sqrt(rootwt) ~ mu tmt !r idv(variety) diag(tmt).id(variety) idv(run), id
94. is $\sigma = M\theta_n$, where $\sigma$ and $\theta_n$ are 15 x 1 and 6 x 1 vectors respectively, and $M$ is a 15 x 6 matrix:

$$
M = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0.5 & 0.5 & 0 & 0 & 0 & -1 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0.5 & 0 & 0.5 & 0 & 0 & -1 \\
0 & 0.5 & 0.5 & 0 & 0 & -1 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0.5 & 0 & 0 & 0.5 & 0 & -1 \\
0 & 0.5 & 0 & 0.5 & 0 & -1 \\
0 & 0 & 0.5 & 0.5 & 0 & -1 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0.5 & 0 & 0 & 0 & 0.5 & -1 \\
0 & 0.5 & 0 & 0 & 0.5 & -1 \\
0 & 0 & 0.5 & 0 & 0.5 & -1 \\
0 & 0 & 0 & 0.5 & 0.5 & -1 \\
0 & 0 & 0 & 0 & 1 & 0
\end{bmatrix}
$$

A way of fitting this model would be to put the matrix values in a file HuynhFeldt.vcm and replace the model specification lines by

 # Supply start values because raw SSP generates bad initial values
 # for HuynhFeldt structure because it does not fit well
 !ASSIGN HFvcm !GU !INIT 45 20 45 20 20 45 20 20 20 45 20 20 20 20 45
 wt0 wt1 wt2 wt3 wt4 ~ Trait treat Trait.treat
 residual units.us(Trait $HFvcm)
 VCM 5 19 6 HuynhFeldt.vcm   # parameters 5 to 19 explained in terms of 6 parameters

Note that if the user fits another model with differing numbers of variance structure parameters, so that the variance structure parameters are renumbered, then all the user needs to do to continue with the same relationships is to change the parameter_number_list parameters on the VCM line.

Important: The VCM statement must be placed after any residual definition line(s).

The new qualifier !DESIGN on the datafile line causes ASReml to write the design matrix (not including the response variable) to a .des file. It allows ASReml to create the design matrix requi
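The 15 x 6 matrix M discussed in the entry above has a regular pattern, so it can be generated rather than typed. The following is a minimal sketch in Python, not ASReml syntax. It assumes the Huynh-Feldt parameterisation in which each diagonal element of the 5 x 5 matrix is its own parameter and each off-diagonal element is the average of the two corresponding diagonal parameters minus a sixth parameter; the sign of the last column is an assumption, since that column is damaged in the extracted text. The function name `huynh_feldt_vcm` is hypothetical.

```python
def huynh_feldt_vcm(t):
    """Build the (t(t+1)/2) x (t+1) matrix M with vech(Sigma) = M * theta.

    Assumptions (not confirmed by the source text):
      - Sigma is stored row-wise as a lower triangle: (1,1), (2,1), (2,2), ...
      - theta = (lambda_1, ..., lambda_t, nu)
      - sigma_ii = lambda_i and sigma_ij = (lambda_i + lambda_j)/2 - nu
    """
    rows = []
    for i in range(t):
        for j in range(i + 1):
            row = [0.0] * (t + 1)
            if i == j:
                row[i] = 1.0      # diagonal element: its own parameter
            else:
                row[i] = 0.5      # off-diagonal: average of two diagonals
                row[j] = 0.5
                row[t] = -1.0     # minus the common parameter (assumed sign)
            rows.append(row)
    return rows


M = huynh_feldt_vcm(5)
for row in M:
    # one line per row, in a layout that could be pasted into a text file
    print(" ".join(f"{value:g}" for value in row))
```

Printing the rows gives one line per variance parameter, which is a convenient way to check the pattern by eye before writing the values to a file.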
95. is strong, supporting earlier indications of the dependence between the treated and control root area (Figure 15.8).

Table 15.9: Equivalence of random effects in bivariate and univariate analyses

 effects         bivariate (model 15.10)   univariate (model 15.7)
 trait.variety   $u_v$                     $1_2 \otimes u_1 + u_2$
 trait.run       $u_r$                     $1_2 \otimes u_3 + u_4$
 trait.pair      $e^*$                     $1_2 \otimes u_5 + e$

15.8.2 A multivariate approach

In this simple case, in which the variance heterogeneity is associated with the two-level factor tmt, the analysis is equivalent to a bivariate analysis in which the two traits correspond to the two levels of tmt, namely sqrt(rootwt) for control and treated. The model for each trait is given by

$$ y_j = X\tau_j + Z_v u_{v_j} + Z_r u_{r_j} + e_j \quad (j = c, t) \qquad (15.9) $$

where $y_j$ is a vector of length $n = 132$ containing the sqrt(rootwt) values for variate $j$ ($j = c$ for control and $j = t$ for treated), $\tau_j$ corresponds to a constant term and $u_{v_j}$ and $u_{r_j}$ correspond to random variety and run effects. The design matrices are the same for both traits. The random effects and error are assumed to be independent Gaussian variables with zero means and variance structures $\mathrm{var}(u_{v_j}) = \sigma^2_{v_j} I_{44}$, $\mathrm{var}(u_{r_j}) = \sigma^2_{r_j} I_{66}$ and $\mathrm{var}(e_j) = \sigma^2_j I_{132}$. The bivariate model can be written as a direct extension of (15.9), namely

$$ y = (I_2 \otimes X)\tau + (I_2 \otimes Z_v) u_v + (I_2 \otimes Z_r) u_r + e^* \qquad (15.10) $$

where $y = (y_c^\top, y_t^\top)^\top$, $u_v = (u_{v_c}^\top, u_{v_t}^\top)^\top$, $u_r = (u_{r_c}^\top, u_{r_t}^\top)^\top$ and $e^* = (e_c^\top, e_t^\top)^\top$. There is an equivalence between the effects in this b
96. labels of 16 characters long. If there are large !A factors (so that the total across all factors will exceed 2000), you must specify the anticipated size (within say 5%) of the larger factors.

If some labels are longer than 16 characters and the extra characters are significant, you must lengthen the space for each label by specifying !LL c, e.g.

 cross !A 2300 !LL 48

indicates the factor cross has about 2300 levels and needs 48 characters to hold the level names; only the first 20 characters of the names are ever printed.

!PRUNE on a field definition line means that if fewer levels are actually present in the factor than were declared, ASReml will reduce the factor size to the actual number of levels. Use !PRUNEALL for this action to be taken on the current and subsequent factors up to, but not including, a factor with the !PRUNEOFF qualifier. The user may overestimate the size for large ALPHA and INTEGER coded factors so that ASReml reserves enough space for the list. Using !PRUNE will mean the extra (undefined) levels will not appear in the .sln file. Since it is sometimes necessary that factors not be pruned in this way, for example in pedigree/GIV factors, pruning is only done if requested.

Normally a # character in the data file will have the effect of eliminating whatever text follows on the line. This means that ordinarily the # character may not be included in the name of the level of an al
97.  line  61  qualifiers  62  syntax  61  datasets  barley asd  278  coop fmt  309  grass asd  270  harvey dat  148  nin89 asd  27  oats asd  260  orange asd  301  rat dat  144  rats asd  264  ricem asd  296  voltage asd  267  wheat asd  284  debug options  188  Denominator Degrees  of Freedom  19  dense  106  design factors  106  diagnostics  17  diallal analysis  97  direct product  10  discussion list  3  Dispersion parameter  103  distribution  conditional  12  marginal  12    Ecode  38  Eigen analysis  232  EM update  120  environment variable  job control  65  equations  mixed model  14  errors  237  Excel  42  execution time  232    F statistics  19  Factor qualifier    DATE  49   DMY  49   LL Label Length  49   MDY  49   PRUNE  50   SORT  50   SORTALL  50   TIME  49  factors  41  file   GIV  153   pedigree  148  Fisher scoring algorithm  13  fixed effects  5  86  Fixed format files  63  fixed terms  87  93   multivariate  146    primary  93  sparse  94  forum  3    free format  41  functions of variance components  37   201  Convert CORUH and XFA to US  204  correlation  205  linear combinations  203  syntax  201    Gamma distribution  103  GBLUP  159  Generalized  Mixed  Linear Models  101  genetic  data  1  groups  152  links  147  models  147  qualifiers  147  relationships  148  genetic markers  71  GIV  143  153  GLM distribution  Binomial  102  Gamma  103  Negative Binomial  103  Normal  102  Ordinal data  102  Poisson  103    338    INDEX       GLMM  104  graphics
98. line of the data file nin89.asd, the line containing the field labels.

The data file line

  row 22
  column 11
 nin89.asd !skip 1
 tabulate yield ~ variety
 yield ~ mu variety !r idv(repl)
 residual idv(units)
 predict variety

The data file line can contain qualifiers that control other aspects of the analysis. These qualifiers are presented in Section 5.8.

3.4.5 Tabulation

The tabulate statements are optional. They provide a simple way of exploring the structure of the data. They should appear immediately before the model line. In this case the 56 simple variety means for yield are formed and written to a .tab output file. See Chapter 9 for a discussion of tabulation.

3.4.6

The linear mixed model is specified as a list of model terms and qualifiers. All elements must be space separated. ASReml accommodates a wide range of analyses. See Section 2.1 for a brief discussion and general algebraic formulation of the linear mixed model. The model specified here for the NIN data is a simple random effects RCB model having fixed variety effects and random replicate effects. The reserved word mu fits a constant term (intercept), variety fits a fixed variety effect and repl fits a random replicate effect because the

  column 11
 nin89.asd !skip 1
 tabulate yield ~ variety
 yield ~ mu variety !r idv(repl)
 residual idv(units)
 predict variety

Specifying the t
99.  listed first followed by permitted alternatives        qualifiers action        NORMAL    IDENTITY    LOGARITHM     INVERSE    allows the model to be fitted on the log inverse scale but with the residuals on the  natural scale   NORMAL   IDENTITY is the default     IBINOMIAL    LOGIT    IDENTITY    PROBIT    COMPLOGLOG      TOTAL n     p 1     p  n Proportions or counts  r   ny  are indicated if  TOTAL specifies the variate con   re   Ea   taining the binomial totals  Proportions are assumed if no response value exceeds    1  y In j     1  A binary variate  0  1  is indicated if   TOTAL is unspecified  The expression for d    on the left applies when y is proportions  or binary   The logit is the default link  function  The variance on the underlying scale is 77 3   3 3  underlying logistic  distribution  for the logit link      MULTINOMIAL k  CUMULATIVE    LOGIT    PROBIT    COMPLOGLOG      TOTAL n    fits a multiple threshold model with t   k     1 thresholds to polytomous ordinal    Vij   pi l    uj   n data with k categories assuming a multinomial distribution   fri lt j lt t Typically  the response variable is a single variable containing the ordinal score   1   k  or a set of k variables containing counts  r   in the k categories  The response  d   2NF may also be a series of t binary variables or a series of t variables containing counts    yiln yi pi  If    counts are supplied  the total  including the kth category  must be given in  where another variable indicated 
100.  lit   us  TrLit1234   ste C1  pus Triit1234     2 1583  2 2202  2 3077  0 16225  0 16827  0 18881E 01  15 766  11 784  24 024  0 43182  0 88424  0 19460  0 95054  1 1380  0 25006  4 6988  0 89101  2 6165  0 79486E 01  0 68664E 01  1 6644  2 3758  2 7093  6 2253     11219   11514E 01   60077E 01    23849  26281    19102E 01   63142     16291    53335    35085E 02   0 18892  0 13069  1 5643  1 5478  0 75138  0 13421  0 16539  0 38619E 02  15 172  11 107  22 468  0 40378       OS OS SS   D     OOOO    aoa aacstan s a  PPP WWWYNY YD  BPWNFPWNHRPENDE    WS      33589    37368    63232  32  85E 01   47001E 01   59274E 02    31286    37589   63510   33038E 01   44563E 01   55003E 02    29825    37755   37255E 01  adaz   10759   14261   12431E 01    10198   51205E 01    64586   85213  1 5966   T3359E 01   11109    14996E 01    44400   64674    76604E 01    34354   16518    27002   233839E 01    12314   65488E 01    37542    43280    75145   37770E 01   54770E 01    7T0075E 02  satel   31755   50789   28124E 01    ooo oo coco Cocooooocooo oo Oo oO Oo Oo OC 0 8    oo Ooo Coo OOo OO CoO oO oO oOo oO oO Oo Oo Oo Oo So    326    1 53980  2 55497   310141E 01   450851E 01   191030E 01   T21026E 01    794020   417001E 01   897161      466606   811102    730000   609258E 01    786132E 02    220000     1 55000         Da    760000  391773    15 10 Multivariate animal genetics data   Sheep       100 resid 64 0 88137 0 35903E 01   101 resid 65 0 17958 0 41634E 02   102 resid 66 0 89091 0 28008
101. model fitting

 !r Tr.Anim Tr.Lit !f Tr.HYS

without !LAST, the location of singularities will almost surely change if the G structures for Tr.Anim or Tr.Lit are changed, invalidating Likelihood Ratio tests between the models.

performs the outlier check described on page 17. This can have a large time penalty in large models.

supplies the name of a program supplied by the user in association with the OWN variance model (page 127).

causes ASReml to print the transformed data file to basename.asp. If
 n < 0, data fields 1 ... mod(n) are written to the file,
 n = 0, nothing is written,
 n = 1, all data fields are written to the file if it does not exist,
 n = 2, all data fields are written to the file overwriting any previous contents,
 n > 2, data fields n ... t are written to the file, where t is the last defined column.

sets hardcopy graphics file type to .png.

sets hardcopy graphics file type to .ps.

modifies the format of the tables in the .pvs file and changes the file extension of the file to reflect the format.
 !PVSFORM 1 is TAB separated: .pvs -> _pvs.txt
 !PVSFORM 2 is COMMA separated: .pvs -> _pvs.csv
 !PVSFORM 3 is Ampersand separated: .pvs -> _pvs.tex
See !TXTFORM for more detail.

instructs ASReml to write the transformed data and the residuals to a binary file. The residual is the last field. The file basename.srs is written in single precision unless the argument is 2, in which case basename.drs is written in doubl
102. model terms, it is often useful or appropriate to consider a partitioning of the vector of residual errors $e$ according to some conditioning factor. We use the term section to describe this partitioning, and the most common example of the use of sections in $e$ is when we wish to allow sections in the data to have different variance structures. For example, in the analysis of multi-environment trials (METs) it is natural to expect that each trial will require a separate, possibly spatial, error structure. In this case, for $s$ sections we have $e = [e_1^\top, e_2^\top, \ldots, e_s^\top]^\top$, assuming that the data vector is ordered by section, and where $e_j$ represents the vector of errors for the $j$-th section.

2.1.5 R structure for the residual error term

For $e$ partitioned as $e = [e_1^\top, e_2^\top, \ldots, e_s^\top]^\top$ we allow the matrix $R$ to have a similar direct sum structure with

$$
R = \bigoplus_{j=1}^{s} R_j =
\begin{bmatrix}
R_1 & 0 & \cdots & 0 & 0 \\
0 & R_2 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & R_{s-1} & 0 \\
0 & 0 & \cdots & 0 & R_s
\end{bmatrix}
$$

for $s > 1$ sections and the data ordered by section. Note that it may be necessary to re-order (re-number) the data units in order to achieve this structure. In ASReml it is now straightforward to apply possibly different variance structures to each component of $R$.

In many cases, the residual errors $e$ can be expected to share a common variance structure. In this case there is only one section ($s = 1$).

Typically a variance structure is specified for each random model term and often more complex models than the simple IID model
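The direct sum structure for R described in the entry above can be illustrated with a small sketch. This is plain Python rather than ASReml syntax, the function name `direct_sum` is hypothetical, and the two section matrices are made-up numbers used only to show the layout.

```python
def direct_sum(blocks):
    """Place square matrices along the diagonal of a larger zero matrix."""
    total = sum(len(block) for block in blocks)
    result = [[0.0] * total for _ in range(total)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, value in enumerate(row):
                result[offset + i][offset + j] = value
        offset += len(block)
    return result


# Two hypothetical sections: one with two correlated units, one with a single unit.
R1 = [[2.0, 1.0],
      [1.0, 2.0]]
R2 = [[3.0]]
R = direct_sum([R1, R2])
```

Errors in different sections are uncorrelated in this structure, which is why every entry outside the diagonal blocks is zero.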
103. not generated, so the ratios are not numbered and cannot be used to derive other functions. To avoid numbering confusion it is better to include H functions at the end of the VPREDICT block.

In the example,

 H herit 4 3   or   H herit genvar phenvar

calculates the heritability by calculating component 4 (from second line) / component 3 (from first line), that is, genetic variance / phenotypic variance.

S label i:j, when i:j are assumed positive variance parameters, inserts components which are the SQRT of components i:j.

X label i*k inserts a component being the product of components i and k.

X label i:j*k inserts j - i + 1 components being the products of components i:j and k.

X label i:j*k:l inserts a set of j - i + 1 components being the pairwise products of components i:j and k:l.

The S and X functions are new in ASReml Release 4. The multiply option (X) allows a correlation in a CORUV structure to be converted to a covariance. The SQRT option allows conversion of CORGH to US, provided the dimension is moderate (say < 10).

The variances and covariances are calculated using a Taylor series expansion. For parameters $v_a$ and $v_b$ derived from the set of parameters $v$ with variance matrix $V$, if $v_a = f_a(v)$ and $v_b = f_b(v)$, then with $dv_a = \partial f_a / \partial v$ and $dv_b = \partial f_b / \partial v$,

$$ \mathrm{cov}(v_a, v_b) = dv_a^\top \, V \, dv_b . $$

12.2.2 Convert CORUH and XFA to US

V label i:j, where i:j spans a CORUH variance structure, inserts the US matrix based on the CORUH parameters.
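The Taylor series (delta method) calculation described in the entry above can be sketched for the heritability ratio. This is an illustrative Python sketch under stated assumptions, not ASReml's implementation: the function names are hypothetical, and the variance components and their sampling (co)variances are made-up numbers.

```python
def delta_cov(grad_a, grad_b, V):
    """First-order approximation cov(f_a, f_b) = grad_a' V grad_b."""
    return sum(
        grad_a[i] * V[i][j] * grad_b[j]
        for i in range(len(grad_a))
        for j in range(len(grad_b))
    )


def ratio_and_se(numerator, denominator, V):
    """Ratio of two components and its approximate standard error.

    V is the 2 x 2 sampling variance matrix of (numerator, denominator).
    """
    ratio = numerator / denominator
    # Gradient of f(n, d) = n / d with respect to (n, d)
    grad = [1.0 / denominator, -numerator / denominator ** 2]
    return ratio, delta_cov(grad, grad, V) ** 0.5


# Hypothetical values: genetic variance 2, phenotypic variance 8
V = [[0.25, 0.20],
     [0.20, 0.64]]
herit, herit_se = ratio_and_se(2.0, 8.0, V)
```

The same `delta_cov` helper applies to any pair of derived functions once their gradients are known; only the gradient changes between a ratio, a square root or a product.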
104. now directly supports Arthur Gilmour and Sue Welham for further computational developments and research on the analysis of mixed models. Release 4 of ASReml was first distributed in 2014. A major enhancement in this release is the introduction of an alternative (functional) specification of linear mixed models. For the convenience of users, three documents have been prepared: (i) a guide to Release 4 using the original (still supported) model specification, (ii) this document, which is a guide using the new functional model specification, and (iii) a document ASReml Update: What's new in Release 4, which highlights the changes from Release 3.

Linear mixed effects models provide a rich and flexible tool for the analysis of many data sets commonly arising in the agricultural, biological, medical and environmental sciences. Typical applications include the analysis of (un)balanced longitudinal data, repeated measures analysis, the analysis of (un)balanced designed experiments, the analysis of multi-environment trials, the analysis of both univariate and multivariate animal breeding and genetics data and the analysis of regular or irregular spatial data.

ASReml provides a stable platform for delivering well established procedures while also delivering current research in the application of linear mixed models. The strength of ASReml is the use of the Average Information (AI) algorithm and sparse matrix methods for fitting the linear mixed model. This en
105. number in the data file, use the !D transformation in association with the !V0 transformation.

forms a set of orthogonal polynomials of order |n| based on the unique values in variate (or factor) v and any additional interpolated points, see !PPOINTS and !PVAL in Table 5.4. It includes the intercept if n is positive and omits it if n is negative. For example, pol(time,2) forms a design matrix with three columns of the orthogonal polynomial of degree 2 from the variable time. Alternatively, pol(time,-2) is a term with two columns having centred and scaled linear coefficients in the first column and centred and scaled quadratic coefficients in the second column.

The actual values (Robson, 1959; Steel and Torrie, 1960) of the coefficients are written to the .res file. This factor could be interacted with a design factor to fit random regression models. The leg() function differs from the pol() function in the way the quadratic and higher polynomials are calculated.

defines the covariable $(x + o)^p$ for use in the model, where x is a variable in the data, p is a power and o is an offset. pow(x,0.5[,o]) is equivalent to sqr(x[,o]); pow(x,0[,o]) is equivalent to log(x[,o]); pow(x,-1[,o]) is equivalent to inv(x[,o]).

6.6 Alphabetic list of model functions

Table 6.2: Alphabetic list of model functions and descriptions

 model function   action

 qtl(f,r)   sin(v,r)   spl(v[,k])   s(v[,k])   sqrt(v[,r])   Trait   units   uni(f[,0
106. of $\tau$ and prediction of $u$ (although the latter may not always be of interest) for given $\sigma_g$ and $\sigma_r$. The other process involves estimation of these variance parameters.

2.2.1 Estimation of the variance parameters

Estimation of the variance parameters is carried out using residual or restricted maximum likelihood (REML), developed by Patterson and Thompson (1971). An historical development of the theory can be found in Searle et al. (1992). Note firstly that

$$ y \sim N(X\tau, H) \qquad (2.10) $$

where $H = ZG(\sigma_g)Z^\top + R(\sigma_r)$. REML does not use (2.10) for estimation of variance parameters, but rather uses a distribution free of $\tau$, essentially based on error contrasts or residuals. The derivation given below is presented in Verbyla (1990).

We transform $y$ using a non-singular matrix $L = [L_1 \; L_2]$ such that

$$ L_1^\top X = I, \qquad L_2^\top X = 0 . $$

Then

$$
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = L^\top y \sim
N\left(
\begin{bmatrix} \tau \\ 0 \end{bmatrix},
\begin{bmatrix}
L_1^\top H L_1 & L_1^\top H L_2 \\
L_2^\top H L_1 & L_2^\top H L_2
\end{bmatrix}
\right).
$$

The full distribution of $L^\top y$ can be partitioned into a conditional distribution, namely $y_1 \mid y_2$, for estimation of $\tau$, and a marginal distribution based on $y_2$ for estimation of $\sigma_g$ and $\sigma_r$; the latter is the basis of the residual likelihood.

The estimate of $\tau$ is found by equating $y_1$ to its conditional expectation, and after some algebra we find

$$ \hat{\tau} = (X^\top H^{-1} X)^{-1} X^\top H^{-1} y . $$

Estimation of $\kappa = [\sigma_g^\top \; \sigma_r^\top]^\top$ is based on the log residual likelihood,

$$
\ell_R = -\tfrac{1}{2}\left( \log\det L_2^\top H L_2 + y_2^\top (L_2^\top H L_2)^{-1} y_2 \right)
       = -\tfrac{1}{2}\left( \log\det X^\top H^{-1} X + \log\det H + y^\top P y \right) \qquad (2.11)
$$

where

$$ P = H^{-1} - H^{-1} X (X^\top H^{-1} X)^{-1} X^\top H^{-1} . $$

Note tha
107. of objects produced with each ASReml run and where to find them in the output files.

Table 13.2: ASReml output objects and where to find them

 output object: Wald F statistics table
 found in:      .asr file
 comment:       This table contains Wald F statistics for each term in the fixed part of the model. These provide for an incremental or, optionally, a conditional test of significance (see Section 6.11).

 output object: data summary
 found in:      .asr file, .ass file
 comment:       Includes the number of records read and retained for analysis, the minimum, mean, maximum, number of zeros, number of missing values per data field, and the factor/variate field distinction. An extended report of the data is written to the .ass file if the !SUM qualifier is specified. It includes cell counts for factors, histograms of variates and simple correlations among variates.

 output object: eigen analysis
 found in:      .res file
 comment:       When ASReml reports a variance matrix to the .asr file, it also reports an eigen analysis of the matrix (eigen values and eigen vectors) to the .res file.

 output object: elapsed time
 found in:      .asr file, .asl file
 comment:       this can be determined by comparing the start t

 output object: fixed and random effects
 found in:      .sln file

 output object: heritability
 found in:      .pvc file

 output object: histogram of residuals
 found in:      .res file

 output object: intermediate results
 found in:      .asl file

 output object: mean variance relationship
 found in:      .res file
108. on-going testing of the software and numerous helpful discussions and insight. Dave Butler has developed the ASReml-R package. Alison contributed to the development of many of the approaches for the analysis of multi-section trials. We also thank Ian White for his contribution to the spline methodology, and Simon Harding for the licensing and installation software and for his development of the user interface program ASReml-W. The Matérn function material was developed with Kathy Haskard and Brian Cullis, and the denominator degrees of freedom material was developed with Sharon Nielsen, a Masters student with Brian Cullis. Damian Collins contributed the PREDICT !PLOT material. Greg Dutkowski has contributed to the extended pedigree options. The asremload.dll functionality is provided under license to VSN. Alison Kelly has helped with the review of the XFA models. Finally, we especially thank our close associates who continually test the enhancements. Arthur Gilmour acknowledges the grace of God through Jesus Christ our Saviour. In Him are hidden all the treasures of wisdom and knowledge, Colossians 2:3.

Contents

 Preface
 List of Tables
 List of Figures
 1 Introduction
  1.1 What ASReml can do
  1.3 User Interface
109. on the basis of smallest SE or SED is not recommended because the model is not necessarily fitting the variability present in the data.

The predict statement included the qualifier !TWOSTAGEWEIGHTS. This generates an extra table in the .pvs file which we now display for each model.

Table 15.7: Summary of models for the Slate Hall data

 model            REML log-likelihood   number of parameters   Wald F statistic   SED
 AR1xAR1          -700.32               3                      13.04              59.0
 AR1xAR1 + units  -696.82               4                      10.22              60.5
 IB               -707.79               4                       8.84              62.0

Predicted values with Effective Replication assuming Variance = 38754.26

 Heron   1   1257.98   22.1504
 Heron   2   1501.45   20.6831
 Heron   3   1404.99   22.5286
 Heron   4   1412.57   22.1623
 Heron   5   1514.48   21.1830
 Heron  26   1592.02   26.0990

Predicted values with Effective Replication assuming Variance = 45796.58

 Heron   1   1245.58   23.8842
 Heron   2   1516.24   22.4423
 Heron   3   1403.99   24.1931
 Heron   4   1404.92   24.0811
 Heron   5   1471.61   23.2995
 Heron  25   1573.89   26.0505

Predicted values with Effective Replication assuming Variance = 8061.808

 Heron   1   1283.59   4.03145
 Heron   2   1549.01   4.03145
 Heron   3   1420.93   4.03145
 Heron   4   1451.86   4.03145
 Heron   5   1533.27   4.03145
 Heron  25   1630.63   4.03145

The value of 4 for the IB analysis is clearly reasonable given there are 6 actual replicates, but this analysis has used up 48 degrees of freedom for the rowblk and colblk effects. The precision from t
110.  options  188  GRM  143    help via email  3  heritability  232    IID  10  inbreeding coefficients  150  214  Incremental F Statistics  19  Information   Criteria  17  information matrix  13   expected  13   observed  13  input file extension     BIN  43   DBL  43   bin  41  43   csv  41   dbl  41  43   pin  207    interactions  95  Introduction   18    job control  options  189  qualifiers  65    key output files  210    likelihood  comparison  211  convergence  68  log residual  13  offset  211  residual  12  longitudinal data  1  balanced example  300    marginal distribution  12   Mat  rn variance structure  135   measurement error  112   MERGE  198   MET  7   meta analysis  1   missing values  41  99  105  215  NA  41  in explanatory variables  105    in response  105  mixed  effects  5  model  5  mixed model  5  equations  14  multivariate  145  specifying  32  model  animal  147  318  correlation  il  covariance  11  formulae  87  sire  147  model building  127  moving average  98  multi environment trial  1  7  multivariate analysis  144  295  example  308  half sib analysis  308    Nebraska Intrastate Nursery  25  Negative binomial  103   non singular matrices  132   NRM  143    objective function  14  observed information matrix  13  operators  90  options   command line  185  ordering of terms  106  Ordinal data  102  orthogonal polynomials  99  outliers  233  output   files  34   objects  231  output file extension    aov  208  216     apj  208    ask  209    asl
111.  outlined in Gilmour et al. (1995). ASReml orders the equations in the sparse part to maintain as much sparsity as it can during the solution. After absorbing them, it absorbs the model terms associated with the dense equations in the order specified.

6.10.3 Aliassing and singularities

A singularity is reported in ASReml when the diagonal element of the mixed model equations is effectively zero (see the !TOLERANCE qualifier) during absorption. It indicates that either

• there is no data for that fixed effect, or
• a linear dependence in the design matrix means there is no information left to estimate the effect.

ASReml handles singularities by using a generalized inverse in which the singular row/column is zero and the associated fixed effect is zero. Which equations are singular depends on the order in which the equations are processed. This is controlled by ASReml for the sparse terms but by the user for the dense terms. They should be specified with main effects before interactions so that the table of Wald F statistics has correct marginalization. Since ASReml processes the dense terms from the bottom up, the first level (the last level processed) is typically singular.

The number of singularities is reported in the .asr file immediately prior to the REML log-likelihood (LogL) line for that iteration (see Section 13.3). The effects (and associated standard or prediction error) which cor
112.  reading error (if n is omitted) and then process the records it has. This allows data to be extracted from a file which contains trailing non-data records (for example, extracting the predicted values from a .pvs file). The argument n specifies the number of data records to be read. If not supplied, ASReml reads until a data reading error occurs, and then processes the data it has. Without this qualifier, ASReml aborts the job when it encounters a data error. See !RSKIP.

Table 5.2: Qualifiers relating to data input and output

qualifier: !RSKIP n s
action: allows ASReml to skip lines at the heading of a file down to (and including) the nth instance of string s. For example, to read back the third set of predicted values in a .pvs file, you would specify

    !RREC !RSKIP 4 'Ecode'

since the line containing the 4th instance of 'Ecode' immediately precedes the predicted values. The !RREC qualifier means that ASReml will read until the end of the predict table. The keyword Ecode, which occurs once at the beginning and then immediately before each block of data in the .pvs file, is used to count the sections.

5.7.1 Combining rows from separate files

ASReml can read data from multiple files provided the files have the same layout. The file specified as the 'primary data file' in the command file can contain lines of the form

    !INCLUDE <filename> !SKIP n

where <filename> is the 
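The combined behaviour of !RSKIP and !RREC described in Table 5.2 can be pictured with a short Python sketch. This is only an illustration of the rule as stated in the text (skip down to and including the nth line containing a string, then read records until one fails to parse); it is not ASReml's implementation, the function name read_block is hypothetical, and for simplicity every field is assumed to be numeric.

```python
def read_block(lines, marker, n):
    """Skip down to and including the n-th line containing `marker`,
    then collect numeric records until a line fails to parse."""
    seen = 0
    start = None
    for i, line in enumerate(lines):
        if marker in line:
            seen += 1
            if seen == n:
                start = i + 1  # first line after the n-th marker line
                break
    if start is None:
        raise ValueError("fewer than n lines contain the marker")

    records = []
    for line in lines[start:]:
        fields = line.split()
        if not fields:
            continue  # blank lines are ignored
        try:
            records.append([float(f) for f in fields])
        except ValueError:
            break  # first unreadable record ends the block
    return records


sample = [
    "title line",
    "x Ecode",
    "1 10.0",
    "2 11.0",
    "",
    "x Ecode",
    "1 20.0",
    "2 21.0",
    "end of table",
    "3 99",
]
second_block = read_block(sample, "Ecode", 2)
```

In this toy input the second block stops at "end of table", so the trailing line "3 99" is never read, which mirrors the way trailing non-data records are left behind.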
113.  s. A figure is produced which reports the trends in 0 with increasing distance for each sector.

ASReml also computes the variogram from predictors of random effects which appear to have a variance structure defined in terms of distance. The variogram details are reported in the .res file.

2.5 Inference: Fixed effects

2.5.1 Introduction

Inference for fixed effects in linear mixed models introduces some difficulties. In general, the methods used to construct F tests in analysis of variance and regression cannot be used for the diversity of applications of the general linear mixed model available in ASReml. One approach would be to use likelihood ratio methods (see Welham and Thompson, 1997), although their approach is not easily implemented.

Wald-type test procedures are generally favoured for conducting tests concerning \tau. The traditional Wald statistic to test the hypothesis H_0: L\tau = l for given L (r x p) and l (r x 1) is given by

\[ W = (L\hat{\tau} - l)' \{ L (X' H^{-1} X)^{-1} L' \}^{-1} (L\hat{\tau} - l) \qquad (2.24) \]

and asymptotically, this statistic has a chi-square distribution on r degrees of freedom. These are marginal tests, so that there is an adjustment for all other terms in the fixed part of the model. The test is also anti-conservative if p-values are constructed because it assumes the variance parameters are known.

The small sample behaviour of such statistics has been considered by Kenward and Roger (1997) in some detail. They present
114.  sample of  the data  the  asr file  and the  as1 file produced by the debug options   d1  running   asreml  dl basename as    In this chapter we show some of the common    NIN Alliance Trial 1989  coding problems  The code box on the right   variety     shows our familiar job modified to generate 8   id pid raw   faults  Following is the output from running   T  P       ss loc yield  hi   7 y  i s job lat long    row   column     nin9 asd  slip 1   yield   mu variety   IR Repl   residual ar1 Row   ar1 Col   predict varierty             ASReml 4 1  01 Apr 2014  NIN alliance trial 1989  Build kt  21 Apr 2014  64 bit Windows x64  23 Apr 2014 09 16 54 727 32 Mbyte ninerri    Folder  C  Users Public ASRem1 Docs Manex4 ERR  There is no file called nin9 asd  Variable names may not include      Warning  Unrecognised qualifier at character 10 nin9 asd        SLIP 1 17  Error  Failed to recognise a data file   Check spelling of filename and enclose the name in quotes   Fault  Error parsing yield   mu variety  Last line read was  yield   mu variety  Currently defined structures  COLS and LEVELS    246    14 4 An example       1 variety a 2 0 0 0 0  2 aid 1 1 0 0 0 0  3 pid i A  0 0 0 0  4 raw 1 al 0 0 0 0   amp  repl 1 2 0 0 0 0  6 nloc 1 1 0 0 0 0  7 yield 1 1 0 0 0 0  8 lat 1 1 0 0 0 0  9 long 1 1 0 0 0 0  10 row 1 2 0 0 0 0  11 column 1 2 0 0 0 0    ninerr1 C  Users Public ASRem1 Docs Manex4 ERR  11 factors defined  max5000    O variance parameters  max2500   2 special structures  L
115.  statement is supplied in the  as file     the REML log likelihood is given for each iteration  The  REML log likelihood should have converged    and in binary form in  dpr file  these are printed in col   umn 3  Furthermore  for multivariate analyses the resid   uals will be in data order  traits within records   How   ever  in a univariate analysis with missing values that are  not fitted  there will be fewer residuals than data records    there will be no residual where the data was missing so  this can make it difficult to line up the values unless you  can manipulate them in another program  spreadsheet      given if the  DL command line option is used     simple averages of cross classified data are produced by  the tabulate directive to the  tab file  Adjusted means  predicted from the fitted model are written to the  pvs  file by the predict directive     based on the inverse of the average information matrix    the values at each iteration are printed in the  res file   The final values are arranged in a table  printed with  labels and converted if necessary to variances     242    13 5 ASReml output objects and where to find them       Table 13 2  Table of output objects and where to find them ASReml       output object found in comment       243    14 Error messages    14 1 Introduction    Identifying the reason that ASReml does not produce the anticipated results can be a frus   trating business  This chapter aims to assist you by discussing four kinds of errors  
116.  terms to test which tests its contribution after  all other terms in terms to test and background terms  conditional on  all terms that appear in the SPARSE equations  It should only specify  terms which will appear in the table of Wald F statistics     For example    FOWN ABC   mu  IFOWN A B B C A C   mu ABC   FOWN A B C   mu ABC A BB C A C  would request the Wald F statistics based on  see page 19   A   mu B C sparse    B   mu A C sparse      mu A B sparse    mu A BC B C A C sparse    mu A B C A B A C sparse    mu A B C A B B C sparse  and    mu ABC A B A C B C sparse      DHAAAAADW       78    5 8 Job control qualifiers       Table 5 5  List of rarely used job control qualifiers       qualifier    action        GDENSE    1GLMM  n       HPGL  2      HOLD  list     Warnings    e For computational convenience  ASReml calculates  FOWN tests using a  full rank parameterization of the fitted model with rank  numerator de   grees of freedom  NumDF  of terms generated by the incremental Wald  F tests    e Unfortunately  if some terms in the implicit model defined by the re   quested  FOWN test would have more or less NumDF than are present in  the full rank parameterization because aliased effects are reordered  it  can not be calculated correctly from the full rank parameterization  In  this case ASReml reverts to the    conditional    test but identifies the terms  that need to be reordered in the fitted model to obtain the  FOWN test s   specified  It is necessary to rerun ASR
117.  that direct product

R structure does not match the multivariate data structure.

Maybe a trait name is repeated.

Table 14.3: Alphabetical list of error messages and probable cause(s)/remedies

error message: Negative Sum of Squares
probable cause/remedy: This is typically caused by negative variance parameters; try changing the starting values or using the !STEP option. If the problem occurs after several iterations it is likely that the variance components are very small. Try simplifying the model. In multivariate analyses it arises if the error variance is (becomes) negative definite. Try specifying !GP on the structure line for the error variance.

error message: NFACT out of range
probable cause/remedy: too many terms are being defined.

error message: No .giv file for
probable cause/remedy: Fix the argument to giv.

error message: No residual variation
probable cause/remedy: after fitting the model, the residual variation is essentially zero, that is, the model fully explains the data. If this is intended, use the !BLUP 1 qualifier so that you can see the estimates. Otherwise check that the dependent values are what you intend and then identify which v

Further error messages listed in this part of the table:
  Out of
  Out of memory
  Out of memory: forming design
  Overflow forming !PRESENT table
  Overflow structure table
  Pedigree coding errors
  Pedigree factor has wrong size
  Pedigree too big or in error
  POWER model setup error
  POWER Model: Unique points disagree with size
  PROGRAM failed in
118.  that only appear in random model terms are not included in the averaging set unless specified with the !AVERAGE, !ASSOCIATE or !PRESENT qualifiers.

Explicit weights may be supplied directly or from a file. The default is equal weights. Weights can be expressed like {3*1 0 2*1}/5 to represent the sequence 0.2 0.2 0.2 0 0.2 0.2. The string inside the curly braces is expanded first, and the expression n*c means n occurrences of c. When there are a large number of weights, it may be convenient to prepare them in a file and retrieve them. All values in the file are taken unless n is specified, in which case they are taken from field (column) n.

is used to control averaging over associated factors. The default is to simply average at the base level. Hierarchical averaging is achieved by listing the associated factors to average in f. Explicit weights may be supplied directly or from a file as for !AVERAGE.

without arguments means all classify variables are expanded in parallel. Otherwise list the variables from the classify set whose levels are to be taken in parallel.

is used when averaging is to be based only on cells with data. v is a list of variables and may include variables in the classify set. v may not include variables with an explicit !AVERAGE qualifier. The variable names in v may optionally be followed by a list of levels for inclusion if such a list has not been supplied in the specification of the classify set. ASReml works ou
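The weight shorthand can be illustrated with a small Python sketch. It assumes the expression lost in extraction reads {3*1 0 2*1}/5, with * as the repeat operator and / as the divisor, which is the reading that reproduces the sequence 0.2 0.2 0.2 0 0.2 0.2 quoted in the text. The function name expand_weights is hypothetical and this is not ASReml's own parser.

```python
import re


def expand_weights(expr):
    """Expand a weight expression such as '{3*1 0 2*1}/5'.

    The string inside the braces is expanded first (n*c means n
    occurrences of c), then every value is divided by the optional
    divisor that follows the closing brace.
    """
    match = re.fullmatch(r"\s*\{(.*)\}\s*(?:/\s*([0-9.]+))?\s*", expr)
    if match is None:
        raise ValueError("expected an expression of the form {...} or {...}/d")
    body, divisor = match.group(1), match.group(2)

    values = []
    for token in body.split():
        if "*" in token:
            count, value = token.split("*", 1)
            values.extend([float(value)] * int(count))
        else:
            values.append(float(token))

    scale = float(divisor) if divisor else 1.0
    return [v / scale for v in values]


weights = expand_weights("{3*1 0 2*1}/5")
```

Expanding inside the braces before dividing keeps the two steps separate, which matches the order of operations the text describes.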
119.  the model.

a term in the model specification is not among the terms that have been defined. Check the spelling.

there is a problem with the named variable.

The second field in the R structure line does not refer to a variate in the data.

the weight and filter columns must be data fields. Check the data summary.

See the discussion of !AISINGULARITIES.

Maybe increase workspace or restructure/simplify the model.

Numerical problems calculating the Matérn function. If rescaling the X, Y coordinates so that the step size is closer to 1.0 does not resolve the issue, try AEXP instead.

special structures are weights, the Ainverse and GIV structures. The limit is 98 and so no more than 96 GIV structures can be defined.

The limit is 1500. It may be possible to restructure the job so the limit is not exceeded (assuming that the actual number of parameters to be estimated is less).

ASReml failed to read the first data record. Maybe it is a heading line which should be skipped by using the !SKIP qualifier, or maybe the field is an alphanumeric field but has not been declared so with the !A qualifier.

You need to identify which design terms contain missing values and decide whether to delete the records containing the missing values in these variables or, if it is reasonable, to treat the missing values as zero by using !MVINCLUDE.

More missing values in the response were found than expected.

missing observations have been dropped so
120.  the right is the ASReml command file nin89a.as for a spatial analysis of the Nebraska Intrastate Nursery (NIN) field experiment introduced in Chapter 3. The lines that are highlighted in bold blue type relate to reading in the data. In this chapter we use this example to discuss reading in the data in detail.

    NIN Alliance Trial 1989
     variety !A      # Alphanumeric
     id
     pid
     raw
     repl 4
     nloc
     yield
     lat
     long
     row 22
     column 11
    nin89aug.asd !skip 1
    yield ~ mu variety
    residual idv(units)

Notice the in-line comment indicated by the #.

5.2 Important rules

In the ASReml command file

• all blank lines are ignored,
• # is used to annotate the input; all characters following a # symbol on a line are ignored,
• lines beginning with ! followed by a blank are copied to the .asr file as comments for the output,
• a blank is the usual separator; TAB is also a separator,
• a comma as the last character on the line is sometimes used to indicate that the current list is continued on the next line; a comma is not needed when ASReml knows how many values to read,
• reserved words used in specifying the linear model (Table 6.1) are case sensitive; they need to be typed exactly as defined and may not be abbreviated,
• a qualifier is a letter sequence preceded by ! which sets an option:
  - some qualifiers require arguments,
  - qualifiers must appear in the correct context,
  - qualifier identifiers are not case s
121.  these to disk  Before each iteration  ASReml writes the own parameters to a file  runs  MYOWNGDG  it assumes MYOWNGDG forms the G and derivative matrix  and then reads the  matrices back in  An example of MYOWNGDG f90 is distributed with ASReml  It duplicates  the AR1 and AR2 variance structures  The following job fits an AR2 structure using this  program     Example of using the OWN structure   rep   blcol   blrow   variety 25   yield   barley asd  skip 1  OWN MYOWN EXE   y   variety   residual ari 10  own2 15  INIT  2  1  TRR  F1     The file written by ASReml has extension  own and appears as follows     15 2 1  0 6025860D 000  1164403D 00  This file was written by asreml for reading by your MYOWNGDG program  asreml writes this file  runs your program and then reads  shfown gdg  which it presumes has the following format   The first lines should agree with the top of this file    specifying the order of the matrices   15   the number of variance structure parameters   2   and a control parameter you can specify   1      These are written in  315  format  They are followed by  the list of variance parameters written in  6D13 7  format   Follow this with 3 matrices written in  6D13 7  format   These are to be each of 120 elements being lower triangle  row wise of the G matrix and its derivatives with respect  to the parameters in turn     This file contains details about what is expected in the file written by your program  The  filename used has the same basename as the jo
122.  to run ASReml is

    [path] ASReml basename[.as[c]]

• path provides the path to the ASReml program (usually called asreml.exe in a PC environment). In a UNIX environment, ASReml is usually run through a shell script called ASReml.
  - if the ASReml program is in the search path then path is not required and the word ASReml will suffice; for example
        ASReml nin89.as
    will run the NIN analysis (assuming it is in the current working folder).
  - if asreml.exe (ASReml) is not in the search path then path is required; for example, if asreml.exe is in the usual place then
        C:\Program Files\ASReml3\bin\Asreml nin89.as
    will run nin89.as,
• ASReml invokes the ASReml program,
• basename is the name of the .as[c] command file.

10.2 The command line

The basic command line can be extended with options and arguments to

    [path] ASReml [options] basename[.as[c]] [arguments]

• options is a string preceded by a - (minus) sign. Its components control several operations (batch, graphic, workspace, ...) at run time; for example, the command line
        ASReml -w128 rat.as
  tells ASReml to run the job rat.as with a workspace allocation of 128mb.
• arguments provide a mechanism (mostly for advanced users) to modify a job at run time; for example, the command line
        ASReml rat.as alpha beta
  tells ASReml to process the job in rat.as as if it read alpha wherever $1 appears in the file rat.as, beta wherever $2 appears and 0 wherever $3 appears (see below).

1
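The argument mechanism amounts to a positional text substitution, which the following Python sketch mimics. It assumes the placeholder symbol lost in extraction is $ (so the placeholders are $1, $2, $3) and that an unsupplied argument is read as 0, as the rat.as example suggests. The function name substitute_arguments is hypothetical and the sketch makes no claim about how ASReml itself performs the replacement.

```python
import re


def substitute_arguments(text, args):
    """Replace $1..$9 in `text` with the matching command line argument.

    Placeholders with no corresponding argument are replaced by "0".
    """
    def replacement(match):
        index = int(match.group(1))
        return args[index - 1] if index <= len(args) else "0"

    return re.sub(r"\$([1-9])", replacement, text)


job_line = "yield ~ mu $1 $2 $3"
processed = substitute_arguments(job_line, ["alpha", "beta"])
```

With the two arguments alpha and beta, the third placeholder has nothing to bind to and falls back to 0, matching the example in the text.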
123.  to run but does have the field names copied across.

10.3 Command line options

Command line options and arguments may be specified on the command line or on the top job control line. This is an optional first line of the .as file which sets command line options and arguments from within the job. If the first line of the .as file contains a qualifier other than !DOPATH, it is interpreted as setting command line options and the Title is taken as the next line.

The option string actually used by ASReml is the combination of what is on the command line and what is on the job control line, with options set in both places taking values from the command line. Arguments on the top job control line are ignored if there are arguments on the command line. This section defines the options. Arguments are discussed in detail in a following section.

Command line options are not case sensitive and are combined in a single string preceded by a - (minus) sign, for example -LNW128.

The options can be set on the command line or on the first line of the job, either as a concatenated string in the same format as for the command line, or as a list of qualifiers. For example, the command line

    ASReml -h22r jobname 1 2 3

could be replaced with

    ASReml jobname

if the first line of jobname.as was either

    !-h22r 1 2 3

or

    !HARDCOPY !EPS !RENAME !ARGS 1 2 3

Table 10.1 presents the command line options with brief descriptions. It also gives the name of the equivalent qualif
124.  to understand that all general qualifiers are specified here. Many of these qualifiers are referenced in other chapters where their purpose will be more evident.

Table 5.3: List of commonly used job control qualifiers

qualifier: !CONTINUE [f], !MSV [f], !TSV [f]  (New R4)
action: These qualifiers are used to restart/resume iterations from the point reached in a previous run. The qualifier !CONTINUE [f] can alternately be set from the command line using the option letter C[f] (see Section 11.3 on command line options). In each run ASReml writes the initial values of the variance parameters to a file with extension .tsv (template start values) with information to identify individual variance parameters. After each iteration, ASReml writes the current values of the variance parameters to files with extension .rsv (re-start values) and .msv; the .msv version has information to clearly identify each variance parameter. If f is not set, then ASReml looks for a .rsv file with the same name used for the output files, i.e. the .as name possibly appended by arguments. ASReml then scans this file for parameter values related to the current model, replacing the values obtained from the .as file before iteration resumes. If !CONTINUE 2 or !TSV is used then the .tsv file is used instead of the .rsv file. Similarly, if !CONTINUE 3 or !MSV is used then the .msv file is used instead of the .rsv file. If f (filename with no extension) is used with !CONTINUE
125.  value. This is shown in the output in that the parameter will have the code B rather than P reported in the variance component table.

U  unrestricted   U does not limit the updates to the parameter. This allows variance parameters to go negative and correlation parameters to exceed ±1. Negative variance components may lead to problems; the mixed model coefficient matrix may become non-positive definite. In this case the sequence of REML log-likelihoods may be erratic and you may need to experiment with starting values.

F  fixed          F fixes the parameter at its starting value.

Z  zero           Z mainly applies to factor analytic models where specific variances and/or loadings may be fixed at zero.

For structures with multiple parameters, the form !GXXXX can be used to specify F, P, U or Z for the parameters individually. A shorthand notation allows a repeat count before a code letter. Thus !GPPPPPPPPPPPPPPZPPPZP could be written as !G14PZ3PZP.

For a US model, !GP makes ASReml attempt to keep the matrix positive definite. After each AI update, it extracts the eigenvalues of the updated matrix. If any are negative or zero, the AI update is discarded and an EM update is performed. If the highest LogL value relates to a non-positive definite form for the matrix, ASReml may perform hundreds of iterations and never converge. Several forms of EM update are possible (see !EMFLAG) and the PXEM option will conv
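The repeat-count shorthand is easy to check mechanically. The Python sketch below expands the part of the qualifier that follows !G into one code letter per parameter; it is an illustration of the notation as described in the text, the function name expand_constraint_codes is hypothetical, and only the four letters F, P, U and Z named in the text are accepted.

```python
import re


def expand_constraint_codes(compact):
    """Expand a shorthand such as '14PZ3PZP' to one letter per parameter.

    A number before a code letter is a repeat count; a letter with no
    number counts once.
    """
    if not re.fullmatch(r"(?:\d*[FPUZ])+", compact):
        raise ValueError("expected optional counts followed by F, P, U or Z")
    pieces = re.findall(r"(\d*)([FPUZ])", compact)
    return "".join(letter * (int(count) if count else 1)
                   for count, letter in pieces)


expanded = expand_constraint_codes("14PZ3PZP")
```

The expansion gives 20 codes (14 P, Z, 3 P, Z, P), one for each parameter of the structure in the example.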
126.  variance  covari   ance matrix formed from  BLUPs and residuals    phenotypic variance  plot of residuals against    field position    possible outliers    predicted  fitted  values at  the data points    predicted values    REML log likelihood    residuals    score    tables of means    variance of variance pa   rameters    variance parameters    variogram     res file     pvc file    graphics file     res file     yht file     pvs file     asr file        yht file     asl file     tab file   pvs file     vvp file     asr file   res file    graphics file    for an interaction fitted as random effects  when the first   outer  dimension is smaller than the inner dimension  less 10  ASReml prints an observed variance matrix cal   culated from the BLUPs  The observed correlations are  printed in the upper triangle  Since this matrix is not  well scaled as an estimate of the underlying variance com   ponent matrix  a rescaled version is also printed  scaled  according to the fitted variance parameters  The primary  purpose for this output is to provide reasonable starting  values for fitting more complex variance structure  The  correlations may also be of interest  After a multivari   ate analysis  a similar matrix is also provided  calculated  from the residuals     placed in the  pvc file when postprocessing with a  pin  file    these are residuals that are more than 3 5 standard de   viations in magnitude    these in the are printed in the second column    given if a predict
127.  variance  function  name  correlation models  One dimensional  equally spaced  ID  id identity C   1  C    0  147 0 1 w  AR1  art 1    order C  1  C  1 2 l w  autoregressive C      C     i gt j 1  l   lt 1  AR2  ar2 2     order C    1  2 3 2  w  autoregressive Cai  p  4   C    OROREN a ONONE i gt   gt l  10     lt   1   b    lol  lt 1  AR3  ar3 3    order C    1 8 1             4    3 4 3 w  autoregressive Capii B  Q        5  Q   ERT T  Q   Q    bs  T  1 ica     2   Ci    OOE FOC Fo  Czy i gt  j 2  1d     lt   1 m     LA  lt  1  PA  lt 1  SAR  sari symmetric Cy    1 2 l w  autoregressive Cis  Q   1     4   C    RONE       4 Cid  i gt j 1  l   lt 1  SAR2  sar2 constrained as for AR3 using 2 3 2 w  autoregressive b     2   3    ee     YN  2    competition bs   NYa      147    7 12 Variance models available in ASReml       Details of the variance models    available in ASReml                variance description algebraic number of parameters   structure form  name  variance corr hom het  model variance variance  function  name  MA1  mal 1    order C   1  1 2 1l w  moving aver  C      0   1   02   age EE E T    0   lt 1  MA2  ma2 2     order C  1  2 3 2 w  moving ANGE Chg aS    0  1   0    1  6    62   ape Casy      0   1   6    6    Ci   0 j gt i 2  0   0  lt 1         lt 1         lt 1  ARMA  arma autoregressive C  4  2 3 2 w  moving aver  Ciiis  e  0 a p  1   A     1    age 02     20     C5    CFs is j gt i 1  la   lt 1      lt 1  CORU  coru uniform C   1  C     6  145 1 2 1
128.  variance parameters   In ASRen1 4 linear relationships among variance structure parameters can be defined through  a simple linear model and by supplying a design matrix for a set of parameters  The design  matrix is supplied as an ascii file containing a row for each parameter in a set of contiguous  parameters and a column for each new parameter  This design matrix is associated with the  job through a statement after the residual model definition line s   of the form     VCM parameter_number_list new filename    where parameter_number_list is a list of parameters in the set  and can be abbreviated to  first and last if all the intermediate parameters are in the set  new is the number of new    133    7 8 Setting relationships among variance structure parameters       parameters and filename is the name of the file containing the design matrix     For example  the Wolfinger rats example involves modelling a 5x5 symmetric residual ma   trix    Wolfinger Rat data   treat  A   wtO wtl wt2 wt3 wt4   subject     VO   wolfrat dat  skip 1   wtO wtl wt2 wt3 wt4   Trait treat Trait treat   residual units us Trait     uses 15 parameters numbered 5 19 generating symmetric matrix   5    6 7    8 9 10    11 12 13 14    15 16 17 18 19    Wolfinger  1996  reports the fitting of the HuynhFeldt variance structure to this data  This  structure is of the form    Oii   Oni    Tij   1 2 Omi t Oni   Ono j lt i lt p    In the rats example  the relationship between the original and new parameters
129. 0  2445 2 1563 02 244  1  36191 0 00000    172    8 11 Factor effects with large Random Regression models       2497 2 2167 01 413  1 21339 0 00000  3180 2 8668 03 42  1 21629 0 00000  3521 CL1577Contigi 03  1 15833 0 00000  3802 CL2573Contigi 03 1 17005 0 00000  4195 CL595Contigi O01   1 19330 0 00000  4351 UMN 1397 01 416  1 34916 0 00000    173    9 Tabulation of the data and prediction  from the model    9 1 Introduction    This chapter describes the tabulate directive and the predict directive introduced in Sec   tion 3 4 under Prediction     Tabulation is the process of forming simple tables of averages and counts from the data   Such tables are useful for looking at the structure of the data and numbers of observations  associated with factor combinations  Multiple tabulate directives may be specified in a job     Prediction is the process of forming a linear function of the vector of fixed and random effects  in the linear model to obtain an estimated or predicted value for a quantity of interest  It is  primarily used for predicting tables of adjusted means  If a table is based on a subset of the  explanatory variables then the other variables need to be accounted for  It is usual to form a  predicted value either at specified values of the remaining variables  or averaging over them  in some way     9 2 Tabulation    A tabulate directive is provided to enable simple summaries of the data to be formed for  the purpose of checking the structure of the data  The summar
130. 0 0 0 0 0 0 0 0 1 14 1 gt      ASSIGN VARF   lt   diag TrAG1245  INIT 0 0024 0 0019 0 0020 0 00026   age grp   diag  TrSG123 IINIT 0 93 16 0 0 28  sex grp I gt      PART 1  DIAGONAL FOR SIRE DAM AND LITTER UNSTRUCTURED FOR RESIDUAL   wwt ywt gfw fdm fat   Trait Trait age Trait brr Trait sex Trait age sex  r  VARF   diag Trait  SDIAGI  id sire  diag TrDam123  DDIAGI  id dam  diag TrLit1234  LDIAGI   id lit    lf Trait  g rp   residual id units  us Trait  RUSI     PART 2  CHANGE DIAGONAL TO XFA1 FOR SIRE DAM AND LITTER   wwt ywt gfw fdm fat   Trait Trait age Trait brr Trait sex Trait age sex  r  VARF   xfai Trait  id sire  xfai TrDam123  id dam  xfai TrLit1234   id lit     If Trait grp mv   residual id units   us  Trait      PART 3  CHANGE XFA1 TO UNSTRUCTURED FOR SIRE AND LITTER   wwt ywt gfw fdm fat   Trait Trait age Trait brr Trait sex Trait age sex  r  VARF   us Trait  id sire  xfai TrDam123  id dam  us TrLit1234  id  lit     If Trait grp mv   residual id units   us Trait     IPART 3   VPREDICT  DEFINE    USING  ASSIGN TO GIVE CONCISE VPREDICT    ASSIGN lusT lit us TrLit1234    us TrLit1234   id lit   us TrLit1234    ASSIGN susT sire us Trait   us Trait   id sire   us Trait     ASSIGN uusT id units   us Trait   us  Trait    X Damv xfai TrDam123    defines 54 59   phen  uusT 1 6   susT 1 6   lusT 1 6  Damv   defines  1 6  elements of phen   defines 60 65  136   23 28   44 49   54 59  phen  uusT 7 10   susT 7 10    lusT 7 10    defines  7 10  elements of phen   defines 66 69 
131. 0.2.2 Processing a .pin file

If the filename argument is a .pin file (see Chapter 12), then ASReml processes it. If the pinfile basename differs from the basename of the output files it is processing, then the basename of the output files must be specified with the P option letter. Thus

    ASReml border.pin

will perform the pinfile calculations defined in border.pin on the results in files border.asr and border.vvp, while

    ASReml -Pborderwwt border.pin

will perform the pinfile calculations defined in border.pin on the results in files borderwwt.asr and borderwwt.vvp.

10.2.3 Forming a job template from a data file

The facility to generate a template .as file was introduced in Section 3.4.1. Normally, the name of a .as command file is specified on the command line. If a .as file does not exist and a file with file extension .asd, .csv, .dat, .gsh, .txt or .xls is specified, ASReml assumes the data file has field labels in the first row and generates a .as file template. First, it seeks to convert the .gsh (Genstat) or .xls (Excel, see page 42) file to .csv format. In generating the .as template, ASReml takes the first line of the .csv (or other) file as providing column headings, and generates field definition lines from them. If some labels have ! appended, these are defined as factors; otherwise ASReml attempts to identify factors from the field contents. The template needs further editing before it is ready
132. 00 0.9193
     6 LogL=-182.979    S2= 301.45    60 df    1.0000  0.9190
     Final parameter values                      1.0000  0.9190

     - - - Results from analysis of y1 y3 y5 y7 y10 - - -

     Akaike Information Criterion    369.96 (assuming 2 parameters)
     Bayesian Information Criterion  374.15

     Model_Term                            Gamma      Sigma     Sigma/SE   % C
     id(units).exp(Trait)       70 effects
     Residual          SCA_V   70        1.000000   301.449       3.12     0 P
     Trait             EXP_P    1        0.919007   0.919007     29.49     0 P

When fitting power models, be careful to ensure the scale of the defining variate (here time) does not result in an estimate of \phi too close to 1. For example, use of days in this example would result in an estimate for \phi of about .993.

[Figure 15.4: Residual plots for the EXP variance model for the plant data. Residuals plotted against Row and Column position; Range: -45.11 to 34.86.]

The residual plot from this analysis is presented in Figure 15.4. This suggests increasing variance over time. This can be modelled by using the EXPH model, which models \Sigma by

\[ \Sigma = D^{0.5} C D^{0.5} \]

where D is a diagonal matrix of variances and C is a correlation matrix with elements given by c_{ij} = \phi^{|t_i - t_j|}. The coding for this is

    y1 y3 y5 y7 y10 ~ Trait tmt Tr.tmt
    residual id(units).exph(Trait !INIT 0.5 100 200 300 300 300 !COORD 1 3 5 7 10)

Abbreviated output from this analysis is

     9 LogL=-171.512    S2= 1.00000    60 df
    10 LogL=-171.500    S2= 1.00000    60 df
    11 LogL
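To make the EXPH structure concrete, the following Python sketch builds the correlation matrix C and the covariance matrix from the starting values and coordinates given in the coding above. It assumes the standard power-model form, with correlation phi raised to the absolute time difference and covariance D^0.5 C D^0.5. The function names are hypothetical, and the numbers are the !INIT starting values from the command file, not fitted estimates.

```python
def exp_correlation(coords, phi):
    """Correlation matrix with entries phi ** |t_i - t_j|."""
    return [[phi ** abs(ti - tj) for tj in coords] for ti in coords]


def exph_covariance(coords, phi, variances):
    """Covariance matrix D^0.5 C D^0.5 for heterogeneous variances."""
    corr = exp_correlation(coords, phi)
    sd = [v ** 0.5 for v in variances]
    size = len(coords)
    return [[sd[i] * corr[i][j] * sd[j] for j in range(size)]
            for i in range(size)]


coords = [1, 3, 5, 7, 10]                 # measurement times, from !COORD
phi = 0.5                                 # starting correlation parameter
variances = [100, 200, 300, 300, 300]     # starting variances, from !INIT

C = exp_correlation(coords, phi)
Sigma = exph_covariance(coords, phi, variances)
```

Because the exponent is the difference between coordinates, the unequal gap between the last two times (7 and 10) gives a lower correlation than the two-unit gaps elsewhere, and rescaling the time variable changes the value of phi needed to describe the same correlations.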
133. 0000  0 1613 0 2096 0 2162E 01 0 8451  1 298 1 687 0 1243 1 0000    And the eigen analysis in the  res file is    Eigen Analysis of XFA matrix for xfal TrDam123   id dam     Eigen values 4 704 0 246 0 006  Percentage 94 919 4 957 0 124   1 0 6431  0 7647 0 0009   2 0 7637 0 6404  0 0743   3 0 0563 0 0484 0 9972    showing that the smallest eigenvalue is 0 006  On the basis of this ASReml with   ARG 3  fits  unstructured matrices for sire and litter and xfa1 for dam using initial values derived from  the previous analysis in coopmf2 rsv  Portions of the  asr file from the Path 3 run are    Notice  ReStartValues taken from coopmf2 rsv  Notice  LogL values are reported relative to a base of  20000  000  Notice  US matrix updates modified 1 time s  to keep them positive definite   Notice  1084 singularities detected in design matrix   1 LogL  1488 11 S2  1 00000 18085 df 11 components restrained  2 LogL  1486 27 S2  1 00000 18085 df 2 components restrained  3 LogL  1483 34 52  1 00000 18085 df 1 components restrained  4 LogL  1481 89 S2  1 00000 18085 df  5 LogL  1481 10 S2  1 00000 18085 df  6 LogL  1480 91 S2  1 00000 18085 df  7 LogL  1480 89 S2  1 00000 18085 df  8 LogL  1480 89 S2  1 00000 18085 df  9 LogL  1480 89 S2  1 00000 18085 df        Results from analysis of wwt ywt gfw fdm fat        Notice  US structures were modified 1 times to make them positive definite     If ASReml has fixed the structure  flagged by B   it may not have   converged to a maximum likelihood sol
134. 00000000000  12 3 0 9672D 01 452 0 2492 0 07906 0 07295 0O 1001100000000000  13 3 0 9579D 01 452 0 2494 0 07830 0 072188 0 0O 101100000000000  14 3 0 9540D 01 452 0 2495 0 07797 0 0718 0 0 011100000000000  15 3 0 1089D 02 452 0 2465 0 08907 0 083022 10010100000000000  16 3 0 2917D 01 452 0 2642 0 02384 0 01736 010 10100000000000  17 3 0 2248D 01 452 0 2657 0 01838 0 01187 00 1 10100000000000  18 3 0 1111D 02 452 0 2460 0 09088 0 08484 1 0100100000000000  19 3 0 1746D 01 452 0 2668 0 01427 0 00773 0 1100100000000000  20 3 0 1030D 02 452 0 2478 0 08423 0 07815 11000 100000000000  21 3 0 1279D 02 452 0 2423 0 10454 0 09890 10000110000000000  22 3 0 8086D 01 452 0 2527 0 06609 0 05989 0 1 000110000000000  23 3 0 7437D 01 452 0 2542 0 06079 0 05456 00100110000000000  24 3 0 1071D 02 452 0 2469 0 08755 0 08149 0 0 010110000000000  25 3 0 1370D 02 452 0 2403 0 11200 0 10611 00001110000000000  SK SK  26 3 0 1511D 02 452 0 2372 0 12351 0 11770 10001010000000000  SOK SOK  27 3 0 1353D 02 452 0 2407 0 11064 0 10473 0 1 001010000000000  680 3 0 1057D 02 452 0 2472 0 08641 0 08035 1 1 000000000000001    The primary tables reported in the  asr file are now also written in XML format to a  xml  file  The intended use of this file is by programs written to parse Asreml output  The  information contained in the  xm1 file includes start and finish times  the data summary  the  iteration sequence summary  the summary of estimated variance structure parameters and  the Wald F statistics  Develop
135. 03 for examples of factor analytic models in multi-environment trials. The general limitations are  • that $\Psi$ may not include zeros except in the XFAk formulation,  • constraints are required in $\Gamma$ for $k > 1$ for identifiability. These are automatically set unless the user formally constrains one parameter in the second column, two in the third column, etc.,  • the total number of estimated parameters, $k\omega + \omega - k(k-1)/2$, may not exceed $\omega(\omega+1)/2$.  In FAk models the variance-covariance matrix $\Sigma^{(\omega \times \omega)}$ is modelled on the correlation scale as $\Sigma = DCD$, where  • $D^{(\omega \times \omega)}$ is diagonal such that $DD = \mathrm{diag}(\Sigma)$,  • $C^{(\omega \times \omega)}$ is a correlation matrix of the form $FF' + E$, where $F^{(\omega \times k)}$ is a matrix of loadings on the correlation scale and $E$ is diagonal and is defined by difference,  • the parameters are specified in the order loadings for each factor ($F$) followed by the variances ($\mathrm{diag}(\Sigma)$); when $k$ is greater than 1, constraints on the elements of $F$ are required, see Table 7.5.  FACVk models (CV for covariance) are an alternative formulation of FA models in which $\Sigma$ is modelled as $\Sigma = \Gamma\Gamma' + \Psi$, where $\Gamma^{(\omega \times k)}$ is a matrix of loadings on the covariance scale and $\Psi$ is diagonal. The parameters in FACV  • are specified in the order loadings ($\Gamma$) followed by variances ($\Psi$); when $k$ is greater than 1, constraints on the elements of $\Gamma$ are required, see Table 7.5,  • are related to those in FA by $\Gamma = DF$ and $\Psi = D$
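The link between the FA and FACV parameterisations can be checked by direct substitution. The short derivation below is a sketch that uses only the definitions quoted in this entry and the fact that a diagonal matrix equals its own transpose. The closing identity $\Psi = DED$ is what the substitution yields and is stated here as an assumption, since the corresponding relation is cut off in this extract.

```latex
\begin{align*}
\Sigma &= D C D = D\,(F F' + E)\,D \\
       &= (D F)(D F)' + D E D \\
       &= \Gamma \Gamma' + \Psi,
\qquad \text{with } \Gamma = D F,\quad \Psi = D E D .
\end{align*}
```

Because $D$ and $E$ are both diagonal, $DED$ is diagonal as well, so $\Psi$ has the form the FACV model requires. As an illustration of the parameter-count limit, taking $\omega = 5$ traits and $k = 2$ factors gives $k\omega + \omega - k(k-1)/2 = 14$ parameters, which does not exceed $\omega(\omega+1)/2 = 15$.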
136. 1  Unstructured 15  -158.04 377.50  The split plot in time model can be fitted in two ways, either by fitting a units term plus an independent residual as above, or by specifying a CORU variance model for the R structure as follows:  y1 y3 y5 y7 y10 ~ Trait tmt Tr.tmt  residual id(units).coru(Trait)  The two forms for $\Sigma$ are given by $\Sigma = \sigma_1^2 J + \sigma_2^2 I$ (units) and $\Sigma = \sigma_e^2 I + \sigma_e^2 \rho\,(J - I)$ (CORU) (15.3). It follows that $\sigma_e^2 = \sigma_1^2 + \sigma_2^2$ and $\rho = \sigma_1^2 / (\sigma_1^2 + \sigma_2^2)$ (15.4). Portions of the two outputs are given below. The REML log-likelihoods for the two models are the same and it is easy to verify that the REML estimates of the variance parameters satisfy (15.4), viz. $\sigma_e^2 = 286.310 \approx 159.858 + 126.528 = 286.386$; $159.858 / 286.386 = 0.558191$.  r idv units Trait  LogL=-204.593 S2= 224.61 60 df 0.1000 1.000  LogL=-201.233 S2= 186.52 60 df 0.2339 1.000  LogL=-198.453 S2= 155.09 60 df 0.4870 1.000  LogL=-197.041 S2= 133.85 60 df 0.9339 1.000  LogL=-196.881 S2= 127.86 60 df 1.204 1.000  LogL=-196.877 S2= 126.53 60 df 1.261 1.000  Final parameter values 1.2634 1.0000  Results from analysis of y1 y3 y5 y7 y10  Akaike Information Criterion 397.75 (assuming 2 parameters).  Bayesian Information Criterion 401.9  Approximate stratum variance decomposition  Stratum Degrees-Freedom Variance Component Coefficients  idv(units) 12.00 925.584 5.0 1.0  Residual Variance 48.00 126.494 0.0 1.0  Model_Term Gamma Sigma Sigma/SE % C  idv(units) IDV_V 14 1
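The check of relation (15.4) quoted in this entry is simple arithmetic and can be reproduced directly. The sketch below uses only the two variance component estimates given in the text; the variable names are illustrative.

```python
# Variance components from the units + residual fit (values quoted in the text).
sigma2_units = 159.858   # sigma_1^2, the units component
sigma2_resid = 126.528   # sigma_2^2, the residual component

# Relation (15.4): the CORU parameters implied by the two components.
sigma2_e = sigma2_units + sigma2_resid   # total variance
rho = sigma2_units / sigma2_e            # uniform correlation

print(f"sigma_e^2 = {sigma2_e:.3f}")
print(f"rho       = {rho:.6f}")
```

The total comes to 286.386, close to the 286.310 estimated under the CORU fit, and the ratio matches the quoted correlation of 0.558191 to six decimal places.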
137. 1 model for the Tullibigeal  data    The abbreviated output for this model and the final model in which a nugget effect has been  included is     AR1xAR1   pol column   1     1 LogL  4271 06 82 gt  0 12731E 06 665 df  2 LogL  4259  03 52  0 11963E 06 665 df  3 LogL  4245 41 S2  0 10556E 06 665 df  4 LogL  4229 98 S2  78754  665 df  5 LogL  4226 66 52  75970  665 df  6 LogL  4226 29 52  T7975  665 df  7 LogL  4226 25 52  78313  665 df  8 LogL  4226 25 52  78396  665 df  9 LogL  4226 25 S2  78419  665 df    295    15 7 Unreplicated early generation variety trial   Wheat       Tullibigeal trial    Batts  26 aud odo2 19 03 22          Outer displacement Wer displacement    Figure 15 7  Sample variogram of the residuals from the ARIxAR1   pol column  1   model for the Tullibigeal data    10 LogL  4226 25 S2  78425  665 df        Results from analysis of yield          Akaike Information Criterion 8460 50  assuming 4 parameters    Bayesian Information Criterion 8478 50    Model_Term Gamma Sigma Sigma SE  C  idv  variety  IDV_V 532 112313 88081 9 9 81 0 P  ar1 row  ar1 column  670 effects   Residual SCA_V 670 1 000000 78425 4 8 83 OP  row AR_R 1 0 665872 0 665872 15 37 OP  column AR_R 1 0 266047 0 266047 363 0 P    Wald F statistics    Source of Variation NumDF DenDF F ingc P ine  7 mu i 42 5 7149 90  lt  001  3 weed 1 459 0 92 14  lt  001  8 pol column  1  1 62 1 t 261 0 008      AR1xAR1   units   pol column  1   1 LogL  4272 85 S2  0 11684E 06 665 df  2 LogL  4265 70 S2  83872  66
138. 1017E 01  0 508505E 01 0 Q000000E 00 0 393519E 01 0 430418E 01 0 423685E 01  0 428749E 01 0 417784E 01 0 363262E 01 0 444716E 01 0 527187E 01    239    13 5 ASReml output objects and where to find them       0 855044E 01 0 243553E 01 0 Q000000E 00 0 351279E 01 0 369901E 01  0 383964E 01 0 330102E 01 0 361942E 01 0 352305E 01 0 359462E 01  0 392014E 01 0 406704E 01 0 801337E 01 0 475798E 01 0   000000E 00  0 370878E 01 0 418534E 01 0 452789E 01 0 408589E 01 0 446476E 01  0 375742E 01 0 403945E 01 0 420473E 01 0 406937E 01 0 403049E 01  0 857644E 01 0 606943E 00 0 000000E 00 0 428611E 01 0 506706E 01  0 432088E 01 0 387484E 01 0 436861E 01 0 391305E 01 0 421110E 01    The first 5 rows of the lower triangular matrix are    48 7026  0 0000  2 9841  4 7063    0 3155    0 0000  0 0000  0 0000  0 0000    8 0735  4 5654 8 8650  4 0995 4 7648 8 7656    13 4 12 The  vvp file    The  vvp file contains the inverse of the average information matrix on the components  scale  The file is formatted for reading back under the control of the  pin file described  in Chapter 12  The matrix is lower triangular row wise in the order the parameters are  printed in the  asr file  This is nin89a vvp with the parameter estimates in the order error  variance  spatial row correlation  spatial column correlation     Variance of Variance components 3  51 1980  0 217689 0 317838E 02    0 673382E 01  0 201115E 02 0 649673E 02    13 5 ASReml output objects and where to find them    Table 13 2 presents a list
139. 103 81 61 81 130 94 10 55 53 55 106 15 109 153 23 0 50 66 111   29 75 43  24  90  37  23 64 130 84 122 129 126 90  38 91 133 126  16 57 30 70   99  114  218  332  174  77  19  38  29 58 63 88 4 124 49 101 129 113 45 92 70 198    257  333  352  319  253  166  152  52  28 0 97 135 67 16  9  36 96 24 62 48  27  29   227  167  356  335  183  179  189  118  124 14  52 19  7  56  81  33 63  40 57  15 24 73   183  277  352  323  288  151  56  130  188  29  78 7 12  30 39 57 89  3 116 27 2 64              j         yeaa E aao   saros       a il aii na in sa iba dl   eens feed pee ees         3      sal Sgt gg 222 aaa         gt 2 S    f                                  re         rE a   xsl jas is 5 wt al   a E a a e a gg a a a oa a a                 5 E                                     al       5 2 EE 3 5     Mi Pe   srs                   ih ian i cl cis  pa PENER      Ia  a       aml hi se as oak oa p7      l  E Aa       3 E   22  gt   t         4         E   eK                    p          shies a ta ted es a  stag ttn  te    scsi sink ca tia ds aca se  m n                2 2 e         sag 5   ta           aK    Residual  section 1  column 8   11   row A    929  is  3 32 SD   Residual  section 1  column 9   11   row 2     22   is  3 33 SD   Residual  section 1  column 9   11   row 3   22   is  3 62 SD   Residual  section 1  column 10   11   row 3   22   is  3 66 SD   Residual  section 1  column 10   11   row 4   22   is  3 35 SD   Residual  section 1  column 11   11   row 3   
140. 142 2 53142 19 25 OP  Trait US_C 5 3 0 821032E 01 0 821032E 01 4 52 0 P  Trait US_C 5 4 0 208739 0 208739 1 60 0 P  Trait US_V 5 5 1 54280 1 54280 24 00 0 P  diag  TrSG123   sex grp 147 effects   TrsG123 DIAG_V 1 1 01250 1 01250 2 96 OP  TrSG123 DIAG_V 2 15 2159 15 2159 3 49 0 P  TrsG123 DIAG_V 3 0 279183 0 279183 Batt OP  diag  TrAG1245   age grp 196 effects   TrAG1245 DIAG_V 1 0 142096E 02 0 142096E 02 2 04 0 P  TrAG1245 DIAG_V 2 0 143897E 02 0 143897E 02 1 54 OP  TrAG1245 DIAG_V 3 0 163778E 02 0 163778E 02 1 409 OP  TrAG1245 DIAG_V 4 0 207274E 03 0 207274E 03 1 61 0 P    330    15 10 Multivariate animal genetics data   Sheep       us  TrLit1234   id lit  19484 effects  TrLit1234 USV 1 1 3 84738 3 84738 9 19 0  TrLit1234 US_C 2 1 2 52256 2 52256 5 47 0  TrLit1234 US_V 2 2 4 07860 4 07860 5 46 0  TrLit1i234 US_C 3 1 0 767402E 01 0 767402E 01 2 05 0  TrLiti234 US_C 3 2 0 206265 0 206265 4 36 0  TrLit1i234 US_V 3 3 0 250400E 01 0 250400E 01 3 30 0  TrLit1234 US C 4 1  0 118244  0 118244  0 35 0  TrLit1i234 US_C 4 2  0 824135  0 824135  1  58 0  TrLit1234 US_C 4 3  0 492320E 01  0 492320E 01  0 85 0  TrLit1234 US_V 4 4 0 704947 0 704947 1 74 0  xfa1 TrDam12   id dam  32088 effects  TrDam12 XFA_LV O 1 0 00000 0 00000 0 00 0  TrDam12 XFAV O 2 0 00000 0 00000 0 00 0  TrDam12 XFA L 1 1 1 27045 1 27045 10 00 0  TrDam12 SPALL 1 2 1 15350 1 15350 5 66 0  xfa3  Trait   nrm tag  85568 effects  Trait XFAV O 1 0 00000 0 00000 0 00 0  Trait XFA_LV O 2 0 00000 0 00000 0 00 0  Trait XFA_V O0
141. 16 SIRE_2 0 116 SIRE_2 0 1 5 168 417 2752  117 SIRE_3 0 117 SIRE_3 0 1 3 154 389 2383  118 SIRE_3 0 118 SIRE_3 0 1 4 184 414 2463  119 SIRE_3 0 119 SIRE_3 0 1 5 174 483 2293  120 SIRE_3 0 120 SIRE_3 0 1 5 170 430 2303  8.7 Reading in the pedigree file  The syntax for specifying a pedigree file in the ASReml command file is  pedigree_file [qualifiers]  • the qualifiers are listed in Table 8.1,  • the identities (individual, parent_1, parent_2) are merged into a single list and the inverse relationship is formed before the data file is read,  • parent_1 is typically male for animal pedigrees (sire), but often female for plant pedigrees; it must be the XY parent if the !XLINK qualifier is specified,  • when the data file is read, data fields with the !P qualifier are recoded according to the combined identity list,  • the inverse relationship matrix is automatically associated with factors coded from the pedigree file unless some other covariance structure is specified. The inverse relationship matrix is specified with the variance model name NRM (the variance model function name nrm()),  • the inverse relationship matrix is written to ainverse.bin,  - if ainverse.bin already exists, ASReml assumes it was formed in a previous run and has the correct inverse,  - ainverse.bin is read, rather than the  Footnote 2: http://www.vsni.co.uk/products/asreml/user/PedigeeNotes.pdf contains details of these options.
142. 169 61 9169 i 7e UP  Trait US_V 3 3 259 121 269  121 2 45 OP  Trait US _C 4 1 70 8113 70 8113 1 54 OP  Trait US_C 4 2 57 6146 57 6146 1 23 OP  Trait US_C 4 3 331 807 331 807 2 20 OP  Trait US_V 4 4 551 507 551 507 2 45 OP  Trait US_C 5 1 73 7857 73 7857 1 60 OF  Trait US C 5 2 62 5691 62 5691 1 33 OP  Trait US_C 5 3 330 851 330 851 2 29 0 P  Trait US_C 5 4 533 756 533 756 2 42 OP  Trait US_V 5 5 542 175 542 175 2 45 OP    However  the usual syntax for fitting an unstructured error model for multivariate data is to  omit the  ASUV qualifier and write    285    15 6 Spatial analysis of a field experiment   Barley       yl y3 yS y7 y10   Trait tmt Tr tmt  residual id units  us Trait     The antedependence model of order 1 is clearly more parsimonious than the unstructured  model  Table 15 5 presents the incremental Wald F statistics for each of the variance models   There is a surprising level of discrepancy between models for the Wald F statistics  The main  effect of treatment is significant for the uniform  power and antedependence models     Table 15 5  Summary of Wald F statistics for fixed effects for variance models fitted to the  plan    treatment treatment time       model  df 1   df 4   Uniform 9 41 5 10  Power 6 86 6 13  Heterogeneous power 0 00 4 81  Antedependence  order 1  4 14 3 91  Unstructured 1 71 4 46          15 6 Spatial analysis of a field experiment   Barley    In this section we illustrate the ASReml syntax for performing spatial and incomplete block  ana
143. 177.2955 E  4 2982.5846 178.9939 E  522 2 9127 179.3317 E  523 2907.1301 179.7729 E  524 2776.0280 180.3853 E  525 2716.1221 181.8923 E  526 2381.9697 44.1852 E  527 2696.4092 133.8687 E  528 2723.5890 112.6784 E  529 2701.6306 104.2832 E  530 3006.8237 112.7234 E  531 3019.5559 112.6742 E  532 3064.3052 113.0868 E  SED: Overall Standard Error of Difference 246.2  Note that the (replicated) check lines have lower SE than the (unreplicated) test lines. There will also be large differences in SEDs. Rather than obtaining the large table of all SEDs, you could do the prediction in parts,  predict var 1:525 column 5.5  predict var 526:532 column 5.5 !SED  to examine the matrix of pairwise prediction errors of variety differences.  15.8 Paired Case-Control study - Rice  These data come from an experiment conducted to investigate the tolerance of rice varieties to attack by the larvae of bloodworms. The data have been kindly provided by Dr. Mark Stevens, Yanco Agricultural Institute. A full description of the experiment is given by Stevens et al. (1999). Bloodworms are a significant pest of rice in the Murray and Murrumbidgee irrigation areas, where they can cause poor establishment and substantial yield loss.  The experiment commenced with the transplanting of rice seedlings into trays. Each tray contained 32 seedlings and the trays were paired so that a control tray (no bloodworms) and
144. 18  124   130  188    ooo  w       22    ao    eK         omitting  11  5 0 50  O 0 40  5 0 30  0 0  15 99  23 32  25 32   lt 9  m2  55 53  122 129  63 88  97 135   52 19     78 7    233    ooo    18 zeros  38 0 28  32 0 26  24 0 19   0 0   9 37  44 46  120  33  53  41  55 106  127 90   4 124  67 16   7  56  12  30    84  109  10    15   38  49     61  39    64  110  83  ied  123  153  133  129  96  63  89    228  67  113  47  23  126  113    91  49  68  109  1419     16  45  62  57  116    86  131  141   63  181   50   57   92   48   15   27    65   20  69  57  101  66  30  70  a5  24    141    40  25  104  114  70  198   29  T3  64    13 4 Other ASReml output files          Residual  section 11  column 9  of 11   row 2  of 22   is  3 33 SD  Residual  section 11  column 9  of 11   row 3  of 22   is  3 52 SD  Residual  section 11  column 10  of 11   row 3  of 22   is  3 56 SD  Residual  section 11  column 10  of 11   row 4  of 22   is  3 35 SD  Residual  section 11  column 11  of 11   row 3  of 22   is  3 52 SD  6 possible outliers in section 11  test value 23 0297999308   Residuals  Percentage of sigma   6 979           0       o    o                          A GS SO 84 BS G8 144   72  29  52  20  61 11  132 26 63 15 99 37 84 48 110 228 49 131  20 9   87 1  32  14  26  30  3 37  6 4 23 32 44 46 109 97 83 67 68 141 69 40  44 11 0 3 6 O 21 41  15 51 25 32 120  33 10 58 117 113 109 63 57 25  18 18  2  84  19  51  45 18 30 56  9  12 53  41 7 99 123 47 119 181 101 104   40 29 87 
145. 2  0 10969E 06 666 df  4 LogL  4243 76 S2  88040  666 df  5 LogL  4240 59 S2  84420  666 df    293    15 7 Unreplicated early generation variety trial   Wheat       6 LogL  4240 01 52  85617  666 df  7 LogL  4239 91 S5S2  86032  666 df  8 LogL  4239 88 52  86189  666 df  9 LogL  4239 88 S2  86253  666 df  10 LogL  4239 88 52  86280  666 df          Results from analysis of yield        Akaike Information Criterion 8485 76  assuming 3 parameters    Bayesian Information Criterion 8499 26    Model_Term Gamma Sigma Sigma SE  C  idv  variety  IDV_V 532 0 959184 82758  6 8 98 0 P  ar1 row  id column  670 effects   Residual SCA_V 670 1 000000 86280 2 9 12 0 P  row AR_R 1 0 672052 0 672052 16 04 1P    Wald F statistics    Source of Variation NumDF DenDF F ine P ine  7 mu 1 83 6 9799 20  lt  001  3 weed 1 477 0 109 33  lt  001    The iterative sequence converged  the REML estimate of the autoregressive parameter indi   cating substantial within column heterogeneity     The abbreviated output from the two dimensional AR1xAR1 spatial model is    1 LogL  4277  99 S2  0 12850E 06 666 df  2 LogL  4266 14 S2  0 12097E 06 666 df  3 LogL  4253 06 S2  0 10778E 06 666 df  4 LogL  4238 72 52  83163  666 df  5 LogL  4234 53 52  79867  666 df  6 LogL  4233 78 S2  82024  666 df  7 LogL  4233 67 S2  82724  666 df  8 LogL  4233 65 52  82975  666 df  9 LogL  4233 65 S2  83065  666 df  10 LogL  4233 65 S2  83100  666 df          Results from analysis of yield        Akaike Information Criterion 8475 29
146. 22) is 3.52 SD  6 possible outliers in section 1 test value 23.0311757288330  [Figure: variogram of residuals; axis: outer displacement]  Figure 13.2: Variogram of residuals  Figures 13.2 to 13.5 show the graphics derived from the residuals when the !DISPLAY 15 qualifier is specified; they are written to .eps files by running  ASReml -g22 nin89a.as  The graphs are a variogram of the residuals from the spatial analysis for site 1 (Figure 13.2), a plot of the residuals in field plan order (Figure 13.3), plots of the marginal means of the residuals (Figure 13.4) and a histogram of the residuals (Figure 13.5). The selection of which plots are displayed is controlled by the !DISPLAY qualifier (Table 5.4). By default, the variogram and field plan are displayed.  The sample variogram is a plot of the semi-variances of differences of residuals at particular distances. The (0,0) position is zero because the difference is identically zero. ASReml displays the plot for distances 0, 1, 2, ..., 8, 9-10, 11-14, 15-20, ...  The plot of residuals in field plan order (Figure 13.3) contains in its top and right margins a diamond showing the minimum, mean and maximum residual for that row or column. Note that a gap identifies where the missing values occur.  The plot of marginal means of residuals shows residuals for each row/column as well as the trend in their means.  Finally, we present a small e
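The description of the sample variogram as semi-variances of residual differences at given displacements can be sketched in a few lines. This is a minimal illustration, not ASReml's implementation: it assumes a rectangular grid of residuals, uses `None` for missing plots, considers only displacements in the positive row and column directions, and does not pool the larger distances into the bins 9-10, 11-14 and so on. The function name `sample_variogram` and the toy grid are hypothetical.

```python
def sample_variogram(resid, max_row_lag, max_col_lag):
    """Semi-variance of residual differences for each (row lag, column lag).

    resid is a list of rows; None marks a missing plot, which is skipped.
    """
    n_rows, n_cols = len(resid), len(resid[0])
    vario = {}
    for dr in range(max_row_lag + 1):
        for dc in range(max_col_lag + 1):
            total, count = 0.0, 0
            for i in range(n_rows - dr):
                for j in range(n_cols - dc):
                    a, b = resid[i][j], resid[i + dr][j + dc]
                    if a is None or b is None:
                        continue  # skip pairs involving a missing plot
                    total += 0.5 * (a - b) ** 2
                    count += 1
            vario[(dr, dc)] = total / count if count else None
    return vario


# Toy 3 x 3 grid of residuals with one missing plot (illustrative values only).
grid = [[1.0, -2.0, 0.5],
        [0.0, 1.5, None],
        [-1.0, 0.5, 2.0]]
v = sample_variogram(grid, 1, 1)
print(v[(0, 0)], v[(1, 0)])
```

At displacement (0,0) every residual is compared with itself, so the semi-variance is zero by construction, which is the point made in the text about the (0,0) position.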
147. 29 4 25  6 spl(age,7) 5 effects fitted  Finished: 19 Aug 2005 10:08:11.980 LogL Converged  The REML estimate of the smoothing constant indicates that there is some nonlinearity. The fitted cubic smoothing spline is presented in Figure 15.13. The fitted values were obtained from the .pvs file. The four points below the line were the spring measurements.  [Figure: trunk circumference (mm) plotted against time since December 31, 1968 (days)]  Figure 15.13: Fitted cubic smoothing spline for tree 1  We now consider the analysis of the full dataset. Following Verbyla et al. (1999), we consider the analysis of variance decomposition (see Table 15.11), which models the overall and individual curves.  An overall spline is fitted as well as tree deviation splines. We note, however, that the intercept and slope for the tree deviation splines are assumed to be random effects. This is consistent with Verbyla et al. (1999). In this sense the tree deviation splines play a role in modelling the conditional curves for each tree and variance modelling. The intercept and  Table 15.11: Orange data: AOV decomposition  stratum decomposition type df or ne  co
148. 3  MBF statements required to ex   tract markers 35  75 and 125 from the marker file markers csv  The  names of model terms must begin with a letter  hence the marker  names are the letter M followed by the position number  Alternatively  IRFIELDlettersinteger is interpreted as  RFIELD integer so the  FOR  statement can be written even more concisely as   mbf Geno 1  markers csv  key 1  RFIELD S  RENAME  S   without the need to assign Markern  Now  to add another marker to the  model  one can just add the marker integer to the ASSIGN statement     Restriction  forlist and command are both limited to 200 charac   ters     203    10 4 Advanced processing arguments       High level qualifiers          qualifier action  LIF stringi    New R4 One form of the IF statement is  string2 text  IF string1    string2  ASSIGN M1 brt DamAge which makes the      ASSIGN statement active if string1 is the same as string2  Note that  there need to be spaces before and after    to avoid confusion with the  strings  This has been used when performing a large number of bivariate  analyses with trait specific fixed effects being fitted  So    IIF  1    wwt  ASSIGN M1 brt DamAge  IIF  1    ywt  ASSIGN M1 brt    IF  1    fwt  ASSIGN M1 DamAge    IF  2    wwt  ASSIGN M2 brt DamAge   IF  2    ywt  ASSIGN M2 brt    IF  2    fwt  ASSIGN M2 DamAge     1  2   Trait at Trait 1    M1  at Trait 2    M2      PATH pathlist The  PATH  or  PART  control statement may list multiple path numbers  so that the follo
149. 3 NE83498  16 LANCER LANCOTA NES87451 NE87409 NE86607 NE87612 CHEYENNE NE83404 NE86503 NE83T12 NE87613  17 BRULE NE86501 NES87457 NE87513 NE83498 NE87613 SIOUXLAND NE86503 NE87408 CENTURAK78 NE86501  18 REDLAND NE86503 NES87463 NE87627 NE83404 NE86T666 NES87451 NE86582 COLT NE87627 TAM200  19 CODY NE86507 NES87499 ARAPAHOE NE87446   GAGE NE87619 LANCER NE86606 NE87522  20 ARAPAHOE NE86509 NE87512 LANCER SIOUXLAND NES86607 LANCER NE87463 NE83406 NE87457 NE84557  21 NES83404 TAM107 NE87513 TAM107 HOMESTEAD LANCOTA NES87446 NES86606 NE86607 NE86509 TAM107  22 NE83406 CHEYENNE NE87522 REDLAND NE86501 NE87518 NES86482 BRULE SIOUXLAND LANCOTA HOMESTEAD             yu  wu  dx   pjay  NIN  Asasanyy a2eyseaquy eyseaq  N ZE    3 3 The ASReml data file       3 3 The ASReml data file    The standard format of an ASReml data file is to have the data arranged in space  TAB or  comma separated columns fields with a line for each sampling unit  The columns contain  covariates  factors  response variates  traits  and weight variables in any convenient order   This is the first 30 lines of the file nin89 asd containing the data for the NIN variety trial   The data are in field order  rows within columns  and an optional heading  first line of the  file  has been included to document the file  In this case there are 11 space separated data  fields  variety   column  and the complete file has 224 data lines  one for each variety in  each replicate        variety id pid raw repl nloc yield lat lo
150. 5 df   1 components restrained  3 LogL  4240 99 S2  80942  665 df  4 LogL  4227  44 52  53712  665 df  5 LogL  4221 09 52  52201  665 df  6 LogL  4220 94 S2  54803  665 df    296    15 7 Unreplicated early generation variety trial   Wheat       7 LogL  4220 94 S2  54935  665 df  8 LogL  4220 94 S2  54934  665 df        Results from analysis of yield        Akaike Information Criterion 8451 88  assuming 5 parameters    Bayesian Information Criterion 8474 37    Model_Term Gamma Sigma Sigma SE  C  idv  variety  IDV V    532 1 32827 72967  0 6 99 OP  idv  units  IDV_   670 0 562308 30889  9 3 18 0 P  ar1 row  ar1 column  670 effects   Residual SCA_V 670 1 000000 54934 0 65 19 OP  row AR_R 1 0 835396 0 835396 18 38 0P  column AR_R 1 0 375499 0 375499 a0 0 P    Wald F statistics    Source of Variation NumDF DenDF F ine Ping  7 mu 1 13 6 4272 13  lt  001  3 weed 1 470 3 86 31  lt  001  8 pol column  1  1 27 4 3 69 0 065    The increase in REML log likelihood is significant  The predicted means for the varieties can  be produced and printed in the  pvs file as    Ecode is E for Estimable    for Not Estimable    Warning  mv_estimates is ignored for prediction  Warning  units is ignored for prediction  Sa le ee el ee     mame il    AR  ete ee ee    Mmmm  Predicted values of yield   column is evaluated at 5 5000   Model terms involving weed are predicted at the average  0 4597  variety Predicted_Value Standard_Error Ecode   i 2916 6768 179 5421 E   2 2955 1002 179 0278 E   3 2869  7482 
151. 7   0 10040 02 394      140099  2 2 1 2 2 2 2 2 2 1 2 1 2 1 1 2 1 2 2 2 2 2 1 2     141099  2 2 0 0 2 2 1 2 2 1 2 1 2 2 0 2 2 2 2 1 2 2 1 1       54785 2 2  gt   gt  2 2 2  gt  2 2 2 2 2  gt  3 2 2 2  gt  2 2    3 2 2 1 2 2 2 1 2 2 0 2 1 2 2 2 2 2 2 2 1 2      547966  2 2 1 1 1 2 0 2 2 1 2 2 2 2 2 2 2 2 2 1 2      548082  2 2 1 2 2 2 1 2 1 2 2 1 2 2 1 2 2 2 2 1 2        2 2  gt   gt  2 2   2 2 2  gt  2 2 2 2 2 2 2 2 2 2  gt     The primary output follows     Nfam 71  A  Nfemale 26  A  Nmale 37  A  Clone  A 860  MatOrder 914  A    170    8 11 Factor effects with large Random Regression models       rep 8  A  iblk 80  A  prop i  A  culture 2  A  treat 2  A  measure 1  A  CWAC6  M 9  Parsing  snpData grr Clone  Class names for factor  Clone  are initialized from the  grr file   GRR Header line begins  Genotype  0 10024 01 114 0 10037 01 257  0  4854 Marker labels found  Marker labels 0 10024 01 114     UMN CL98Contig1   Notice  The header line indicates there are 4854 regressors in the file   Notice  SNP data line begins  140099 2 2 1 2 2 2 2 2 2  1 2 1 2 1 1   Notice  Markers coded  9 treated as missing   Marker data  0 1 2  for 923 genotypes and 4854 markers read from snpData grr  160414 missing Regressor values   3 6   replaced by column average   Regressor values ranged 0 00 to 2 00  Regressor Means ranged 1 00 to 2 00  Sigma2p 1 p  is 1057 12558  GIV1 snpData grr 923 9  946  27  QUALIFIERS   MAXIT 30  SKIP 1  DFF  1  QUALIFIER   DOPART 2 is active  Reading nassau_cut_v3 csv
152. 7 15 91 Fitted values  X  16 77  35 94  o  o  o  o o    o o    N  Q o    o o  6 o o    8 0920 ae  o 8 e oo 4    9   Oo O     oo     o e 8 o 98  wo   o o o  oo o o 998 o o 8  o    Co       o o  o Bo  amp  5 99    o  o    a 85 99    q Oa       o o o 8  8  n 5 oe    2o Blo   9 g o  o 8 g8  o  o 5 a 8  o j    7   e      o a o  o o j oo  o  o  o o  o  o  o o   6  o o  o o o    Figure 13 1  Residual versus Fitted values    This is part of nin89a yht  Note that the values corresponding to the missing data  first  15 records  are all  0 1000E 36 which is the internal value used for missing values     224    13 4 Other ASReml output files       Record Yhat Residual Hat   1  0 10000E 36  0 1000E 36  0 1000E 36   2  0 10000E 36  0 1000E 36  0 1000E 36   3  0 10000E 36  0 1000E 36  0 1000E 36   4  0 10000E 36  0 1000E 36  0 1000E 36   15  0  10000E  36  0 1000E 36  0 1000E 36  16 24 089 5 161 6 075  17 2f OT 4 477 6 223  18 28 795 6 255 6 233  19 PERTE 6 327 6 236  20 27 043 6 007 5 963  239 21 522 8 128 6 314  240 24 696 1 854 6 114  241 25 452 0 1480 6 159  242 22 464 4 436 6 605    13 4 Other ASReml output files    13 4 1 The  aov file    This file reports details of the calculation of Wald F statistics  particularly as relating to the  conditional Wald F statistics  not computed in this demonstration   In the following table  relating to the incremental Wald F statistic  the columns are    e model term   e columns in design matrix   e numerator degrees of freedom  e simple Wald F sta
153. 8  var trt rep  mu var trt  If var trt var trt fitted before mu  var and trt  var trt fully fitted  mu  var and trt  are completely singular and set to zero  The order within  var trt rep  is determined internally  6.11 Wald F Statistics  The so-called ANOVA table of Wald F statistics has 4 forms:  Source NumDF F-inc  |  Source NumDF F-inc F-con M  |  Source NumDF DDF_inc F-inc P-inc  |  Source NumDF DDF_con F-inc F-con M P-con  depending on whether conditional Wald F statistics are reported (requested by the !FCON qualifier) and whether the denominator degrees of freedom are reported. ASReml always reports incremental Wald F statistics (F-inc) for the fixed model terms (in the DENSE partition), conditional on the order the terms were nominated in the model. Note that probability values are only available when the denominator degrees of freedom is calculated, and this must be explicitly requested with the !DDF qualifier in larger jobs. Users should study Section 2.5 to understand the contents of this table. The 'conditional maximum' model used as the basis for the conditional F statistic is spelt out in the .aov file described in Section 13.4.  The numerator degrees of freedom (NumDF) for each term is easily determined as the number of non-singular equations involved in the term. However, in general, calculation of the denominator degrees of freedom (DDF) is not trivial. ASReml will by default attempt the calculation for small analyses, by one of tw
154. 8 9 0 8776 0 8566   44 60 79 27 331 5 550 9 0 9761   43 16 f6  2 320 8 533 2 541 6   Wald F statistics  Source of Variation NumDF F ine   8 Trait 5 188 83   1 tmt 1 4 14   9 Trait tmt 4 3 91    The iterative sequence converged and the antedependence parameter estimates are printed  columnwise by time  the column of U and the element of D  L e     284    15 5 Balanced repeated measures   Height       0 0269 1    0 6284 0 0 0  0 0373 0 1    1 4911 0 0   D   diag   0 0060    U     0 0 1    1 2804 0  0 0079 0 0 0 1    0 9678  0 0391 0 0 0 0 1    Finally the input and output files for the unstructured model are presented below  The  REML estimate of X from the ANTE model is used to provide starting values      ASSIGN USI   lt   INIT    37 20   23 38 41 55   34 83 61 89 258 9   44 58 79 22 331 4 550 8   43 14 T667 320 7 533 0 541 4    1 gt   yl y3 yo y7 yi0   Trait tmt Trait tmt  residual id units  us Trait  USI     1 LogL  160 368 S2  1 0000 60 df  2 LogL  159  027 S2  1 0000 60 df  3 LogL  158 247 S2  1 0000 60 df  4 LogL  158 040 S2  1 0000 60 df  5 LogL  158 036 S2  1 0000 60 df          Results from analysis of yi y3 yS y7 y10         Akaike Information Criterion 346 07  assuming 15 parameters    Bayesian Information Criterion 377 49   Model_Term Sigma Sigma Sigma SE  C  id units  us Trait  70 effects   Trait USV 1 1 37 2262 of 2262 2 45 OP  Trait US_C 2 1 23 3935 23 3935 it OP  Trait US_V 2 2 41 5195 41 5195 2 45 OP  Trait US_C 3 1 51 6524 51 6524 1 61 OP  Trait US_C 3 2 61 9
155. CA_V 242 1 000000 48 7026 6 81 0O P parameter  row AR_R 1 0 655480 0 655480 11 63 O P estimates  column AR_R 1 0 437505 0 437505 5 43 OP  Wald F statistics  Source of Variation NumDF DenDF Fine P inc testing  12 mu 1 25 0 331 93  lt  001 fixed  1 variety 55 110 8 2 22  lt  001 effects    Notice  The DenDF values are calculated ignoring fixed boundary singular  variance parameters using algebraic derivatives   13 mv_estimates 18 effects fitted  6 possible outliers  in section 11  see  res file   Finished  29 Jan 2014 09 34 34 861  LogL Converged    Following is a table of Wald F statistics augmented with a portion of Regression Screen  output  The qualifier was  SCREEN 3  SMX 3     Model_Term Gamma Sigma Sigma SE   C  idsize IDV V 92 0 581102 0 136683 3 31 OP  expt idsize IDV_V 828 0 121231 0 285153E 01 1 12 OP  idv units  504 effects   Residual SCA_V 504 1 000000 0 235214 12 70 0P    Wald F statistics    Source of Variation NumDF DenDF_con F_inc F_con M P_con  113 mu 1 72 4 65452 25 56223 68    lt  001  2 expt 6 ey dc  Be 0 64 A 0 695    221    13 3 Key output files       4 type 4 63 8 22 95 3 01 A 0 024  114 expt type 10 79 3 1 31 0 93 B 0 508  23 x20 1 55 4 4 33 2 37 B 0 130  24 x21 1 63 3 1 91 0 87 B 0 355  25 x23 1 68 3 23 93 0 11 B 0 745  26 x39 1 yt 1 85 0 35 B 0 556  27 x48 1 69 9 1 58 2 08 B 0 154  28 x59 1 49 7 1 41 0 08 B 0 779  29 x60 1 69 6 1 46 0 42 B 0 518  30 x61 1 64 0 1 11 0 04 B 0 838  31 x62 1 61 8 2 18 0 09 Batre  32 x64 1 55 6 31 48 4 50 B 0 038  33 x65 2
156. Di  on LOG PVBari  for Section    1 37             xk kkk    xk kkk  kxk kkk kkk  FKK KK kk KK K K    at       RK OK  kkk KK    DG kk k k k   k k k k k k kk k ok    232    13 4    Other ASReml output files         OK  x ORK      FRR 2 k a k k kk kk kk k k k k    xk kk kk Ra kk kk kkk   k k k k gk 3k 2k 2K 2k   k ok k K K K K k    Min Mean Max     24 873    0 27959    15 915    Spatial diagnostic statistics of Residuals  Residual Plot and Autocorrelations   lt L0o   xXH gt   se 0 077     Exx K              X  x x gt 4X    o             xxx  X            x   xxx        o         E PEER 2X       axeixx     x         o   EKAXKK xXK       OoL lt 0o    x xXx x il         lt  lt  lt  lt  lt O0   xX   z         lt O lt  lt LLLoo    o         L lt  lt  lt  lt O OL o  lt    x x        1 0 28 0 38 0 50 0 65 0 77  2 O17    s27 0 39 0 51  3 0 08 0 11    Residuals  Percentage of sigma      0 0 0 0 0 0 0      2 29  lt 2  20  61 11  132  a87 i  32  14  26  30  3   44 11 0 3 6 0 21   18 18  2  84   19   1  45   40 29 Bf 103 81 61 81     29 Yh 43   24  90  37   lt 23     99  114  218  332  174  77  19     257  333  352  319  253  166  152   227  167  356  S35  163  lt 179  169   183  277  352  lt 323  288  151    56      I  1   i  I  I   i  I  I  I   i  l  I  I  I  I  I  I  I  I    Residual  section 11  column 8  of 11   row 4  of 22   is  3 32 SD    ds  0 56 0 64 0 56  0 19 0 28 0 35 0 42 0 40    00 0 77    GITA  0  0   26 0  37 6  41  15  18 30  130 94  64 130     26     29     52     25   1
157. E qualifier  to resume iterating from the current point.

To abort the job at the end of the current iteration, create a file named ABORTASR.NOW in the directory in which the job is running. At the end of each iteration, ASReml checks for this file and, if it is present, stops the job, producing the usual output but not predicted values, since these are calculated in the last iteration. Creating FINALASR.NOW will stop ASReml after one more iteration, during which predictions will be formed.

On case-sensitive operating systems (e.g. Unix), the filename (ABORTASR.NOW or FINALASR.NOW) must be upper case. Note that the ABORTASR.NOW file is deleted, so nothing of importance should be in it. If you perform a system-level abort (CTRL-C, or closing the program window), output files other than the .rsv file will be incomplete. The .rsv file should still be functional for resuming iteration at the most recent parameter estimates (see !CONTINUE).

Use !MAXIT 1 where you want estimates of fixed effects and predictions of random effects for the particular set of variance parameters supplied as initial values. Otherwise the estimates and predictions will be for the updated variance parameters (see the !BLUP qualifier below).

If !MAXIT 1 is used and an Unstructured Variance model is fitted, ASReml will perform a Score test of the US matrix. Thus, assume the variance structure is modelled with reduced parameters; if that modelled structure is then processed as t
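The stop-file mechanism described in this excerpt can be driven from a script that runs alongside a long job. The following is a minimal sketch, not part of ASReml: the helper name request_stop is hypothetical, and the file names ABORTASR.NOW and FINALASR.NOW are written with a dot on the assumption that the extraction dropped it.

```python
import tempfile
from pathlib import Path


def request_stop(job_dir, after_one_more_iteration=False):
    """Create the upper-case flag file in the directory where the job runs."""
    # Assumed file names: ABORTASR.NOW stops at the end of the current
    # iteration; FINALASR.NOW allows one more iteration (with predictions).
    name = "FINALASR.NOW" if after_one_more_iteration else "ABORTASR.NOW"
    flag = Path(job_dir) / name
    flag.touch()  # content is irrelevant; the text says the abort file is deleted
    return flag


# Demonstration in a throwaway directory standing in for the job directory.
with tempfile.TemporaryDirectory() as job_dir:
    created = request_stop(job_dir)
    print(created.name, created.exists())
```

Keeping the name in upper case in the helper avoids the case-sensitivity problem the text mentions for Unix-like systems.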
158. ED

XFAk (X for extended) is the third form of the factor analytic model and has the same parameterisation as for FACV, that is, $\Sigma = \Gamma\Gamma' + \Psi$. However, XFA models

• have parameters specified in the order diag($\Psi$) and vec($\Gamma$); when k is greater than 1, constraints on the elements of $\Gamma$ are required (see Table 7.5),
• may not be used in R structures,
• return the factors as well as the effects,
• permit some elements of $\Psi$ to be fixed to zero,
• are computationally faster than the FACV formulation for large problems when k is much smaller than $\omega$.

With multiple factors, some constraints are required to maintain identifiability. Traditionally, this has simply been to set the leading loadings of new factors to zero. Loadings then need to be rotated to orthogonality. If no loadings are constrained, ASReml will rotate the loadings to orthogonality, after holding the loadings of lower factors fixed for a few iterations. The orthogonalization process occurs at the beginning of the iteration, so the final returned values have not been formally rotated.

Finding the REML solutions for multifactor Factor Analytic models can be difficult. The first problem is specifying initial values. When using !CONTINUE and progressing from XFAk to XFA(k+1), ASReml 3 initialises factor k+1 at $\sqrt{\Psi} \times 0.2$, changing the sign of the (relatively) largest loading to negative. One strategy which sometimes works 
159. ENSE  76   DESIGN  69  IDEVIANCE residuals  104  IDF  76   IDTAG  150  IDISPLAY  70  IDISP dispersion  103  IDOM dominance  59  IDOPART  193  IDOPATH  193  IDO  55   IDV  55   ID  55    EMFLAG   77   ENDDO  55  IEPS  70   EXCLUDE  62   EXP  55   EXTRA  78   FACPOINTS  83   FACTOR  71  IFCON  22  67   FGEN  150  IFIELD  71  IFILTER  62  IFINAL  186   FOLDER  63   FORMAT  63  IFOR  194  IFOWN  22  78   GAMMA GLM  103   GDENSE  79  IGIV  151  IGKRIGE  70  IGLMM  79  IGOFFSET  151   GRAPHICS  186   GROUPFACTOR  70   GROUPSDF  156   GROUPS  151  IG  49  68  70    HARDCOPY  186    HOLD  79    341      HPGL  79     IDENTITY link  103   IDLIMIT  70     INBRED  151     INCLUDE  65   INIT  120     INTERACTIVE  186  IT  48    JOIN  68  70   Jddm  56    Jmmd  56    Jyyd  56    KEEP  199   IKEY  71  199  IKNOTS  84  ILAST  80  151   LOGARITHM   103   LOGFILE  186   LOGIT   102   LOGIT link  102  ILOG link  102    LONGINTEGER  151  IL  47    MAKE  151  IMATCH  64  IMAXIT  68   IMAX  56    MBF  71   IMERGE  64  IMEUWISSEN  151  IMGS  151   IMIN  56   IMM transformation  56  58  IMOD  56  IMVREMOVE  72  IM  56   INAME  122   INA  56    NEGBIN GLM  103    NOCHECK  84   NODUP  199   NOGRAPHS  186   NOKEY  71   NOREORDER  84   NORMAL  56   NORMAL GLM  102   NOSCRATCH  84    INDEX        OFFSET variable  103   ONERUN  186   QUTFOLDER  186   OQUTLIER  17    OWN  80    PEARSON residuals  104  IPLOT  173   IPNG  80    POISSON GLM  103   POLPOINTS  84    PPOINTS  84   PRINTALL  173  IPRINT  80    PR
160. If ASReml does not run at all, it is a setup or licensing issue, which is not discussed in this chapter. It is hoped that the new syntax for variance structure specification will reduce the incidence of coding errors.

Even when the job appears to run successfully, you should check that

• the records read/lines read/records used are correct,
• mean/min/max information is correct for each variable,
• the Loglikelihood has converged and the variance parameters are stable,
• the fixed effects have the expected degrees of freedom.

Coding errors can be classified as

• typing errors: these are difficult to resolve because we tend to read what we intended to type, rather than what we actually typed. Section 14.4 demonstrates the consequences of the common typographical errors that users make.
• wrong coding: this often arises from misunderstanding the guide or making assumptions arising from past experience which are not valid for ASReml. The best strategy here is to closely follow a worked example, or to build up to the required model. Sections 14.3 and 14.2 may help, as well as reviewing all the relevant sections of this Guide. It may be as simple as adding or deleting a space, inserting a comma, changing case or adding one more qualifier.
• inappropriate model: the variance model you propose may not be suited to the data, in which case ASReml may fail to produce a solution. You can verify the model is appropriate by closer examination of the structure o
161. It will generally be preferable to prespecify the levels than to use !SORT because most other references to particular levels of factors will refer to the unsorted levels. Therefore users should verify that ASReml has made the correct interpretation when nominating specific levels of !SORTed factors. In particular, any transformations are performed as the data is read in and before the sorting occurs.

!SORTALL means that the levels of this and subsequent factors are to be sorted.

5.4.4 Skipping input fields

This is particularly useful in large files with alphabetic fields that are not needed, as it saves ASReml the time required to classify the alphabetic labels. New in R4: !CSKIP f can be used to skip f fields. Thus

    !CSKIP 1
    A
    B

skips the first data field and reads the second and third fields into variables A and B, and

    !CSKIP
    Sire !CSKIP 2
    Y

will define two variables, Sire taken from the second data field and Y taken from the fifth data field. Also, !SKIP f will skip f data fields BEFORE reading this field. Thus

    Sire !SKIP 1
    Y !SKIP 2

achieves the same result but in a less obvious way. These qualifiers are ignored when reading binary data.

Important: Using the !SKIP qualifier in association with the specification of a file to be read in allows initial lines of the file to be skipped. !SKIP can also be used to skip columns when reading in a data file. Use of !CSKIP for skipping data fields is recommen
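The field arithmetic in the skipping examples above can be checked with a small simulation. This is only a sketch of one reading of the text, not ASReml code: it assumes that a skip given before the first variable consumes leading fields, that a CSKIP attached to a variable skips fields after that variable is read, and that SKIP skips fields before it. The function assign_fields and its tuple layout are hypothetical.

```python
def assign_fields(definitions, initial_cskip=0):
    """Map variable names to 1-based data-field positions.

    definitions: list of (name, skip_before, cskip_after) tuples.
    initial_cskip: fields skipped before the first variable is defined.
    """
    position = initial_cskip  # number of data fields consumed so far
    mapping = {}
    for name, skip_before, cskip_after in definitions:
        position += skip_before + 1  # skip, then read this field
        mapping[name] = position
        position += cskip_after      # fields skipped after reading
    return mapping


# Skip one field, then read A and B.
first = assign_fields([("A", 0, 0), ("B", 0, 0)], initial_cskip=1)

# Skip one field, read Sire, skip two more, read Y.
second = assign_fields([("Sire", 0, 2), ("Y", 0, 0)], initial_cskip=1)

# The same layout expressed with skips placed before each field.
third = assign_fields([("Sire", 1, 0), ("Y", 2, 0)])

print(first, second, third)
```

Under these assumptions the second and third layouts give the same mapping, which is the equivalence the text describes.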
162. LogL  2799 30 S2  8568 1 6390 df  3 LogL  2759 03 52  8131 3 6390 df  4 LogL  2741 99 52  7766 2 6390 df  5 LogL  2741 40 S2  7702 9 6390 df  6 LogL  2741 40 Sa  7700 1 6390 df         Results from analysis of HIG        Akaike Information Criterion 65490 79  assuming 4 parameters    Bayesian Information Criterion 65517 84    Model_Term Gamma Sigma Sigma SE  C  rep iblk IDV_V 640 0 307856 2370 52 13 00 0 P  grm1 Clone  GRM_V 923 0 275656 2122 58 5 82 OP  Clone IDV_V 926 0 152554 1174 68 6 08 OP  Residual SCA_V 6399 1 000000 7700 10 49 64 OP    Wald F statistics    Source of Variation NumDF F ine  20 mu 1 0 11E 06  12 culture 1 2615 96  21 culture rep 6 30 44  23 rep iblk 640 effects fitted  22 grm1 Clone  923 effects fitted  4 Clone 926 effects fitted   66 are zero     78 possible outliers  see  res file    Notes   e of 926 clones identified  860 have data and 923 have genomic data     e The  res file contains additional details about the analysis including a listing of the  larger marker effects  All marker effects are reported in the  mef file     e Particular columns of the   grr data can be included in the model using the grr  Factor  i   model term where and i specifies which  number  regressor variable to include     Listing of the larger marker effects    36  12 61 01 121 1 40736 0 00000  617 0 14383 01 111 1 26081 0 00000  777     0 15417 01 138  1 25597 0 00000  1246 0 18644 02 210 122522 0 00000  1903 0 6863 01 202  1 24800 0 00000  2102 0 8683 02 432 1 15496 0 0000
163. Notice: The parameter estimates are followed by their approximate standard errors.

The first 8 lines are based on the .asr file.

12.3 VPREDICT: PIN file processing

There are four forms of the VPREDICT directive.

• If the .pin file exists and has the same name as the jobname (including any suffix appended by using !RENAME), just specify the VPREDICT directive.
• If the .pin file exists but has a different name to the jobname, specify the VPREDICT directive with the .pin file name as its argument.
• If the .pin file does not exist or must be reformed, a name argument for the file is optional but the !DEFINE qualifier should be set. Then the lines of the .pin file should follow on the next lines, terminated by a blank line.

An alternative to using VPREDICT is to process the contents of the .pin file by running ASReml with the -P command line option, specifying the .pin file as the input file.

Note that in this case the code must be self-contained and any substitution variable used needs defining in the .pin file. For example, if we wish to use sub to indicate fullname, then the assignment of fullname to sub using

    !ASSIGN sub fullname

needs to be in the .pin file.

13 Description of output files

13.1 Introduction

With each ASReml run a number of output files are produced. ASReml generates the output files by appending various filename extensions to basename. A brief description of the filename extensions is presented in Table 13
164. O 21 00 510 5 840 0 149 0  5 repl 4 O  0  1 2 4132 4  6 nloc 0 O 4 000 4 000 4 000 0 000  7 yield Variate 18 0 1 050 25 53 42 00 7 450  8 lat 0 O 4 300 25 80 47 30 13 63  9 long 0 O 1 200 13 89 26 40 7 629  10 row 22 0  0  1 11 5000 22    220    13 3 Key output files       11 column 11 0 0 i 6 0000 11  12 mu 1  13 mv_estimates 18  ariv row  in ari row  ari column  has size 22  parameters  5 5  ari column  in ari row  ari column  has size 11  parameters  6 6  ari  row   ar1 column    4  6  initialized   Sorting Section 1  22 rows by 11 columns  Forming 75 equations  57 dense     Initial updates will be shrunk by factor 0 316    Notice  Specify  SIGMAP to allow the Sigma parameterisation    Notice  1 singularities detected in design matrix  iterations   1 LogL  449 818 s2  49 775 168 df 1 0000 0 1000E00 0 1000E 00    2 LogL  424 315 S2  40 233 168 df 1 0000 0  2937 0 2323   3 LogL  405 419 52  38 922 168 df 1 0000 0 4813 0 3587   4 LogL  399 552 S2  45 601 168 df 1 0000 0 6156 0 4398   5 LogL  399 336 S2  47 986 168 df 1 0000 0 6456 0 4417   6 LogL  399 325 S2  48 546 168 df 1 0000 0 6530 0 4391   7 LogL  399  324 S2  48 672 168 df 1 0000 0 6549 0 4380   8 LogL  399 324 S2  48 703 168 df 1 0000 0 6554 0 4376  Final parameter values 1 0000 0 6555 0 4375         Results from analysis of yield        Akaike Information Criterion 804 65  assuming 3 parameters    Bayesian Information Criterion 814 02  Model_Term Gamma Sigma Sigma SE  C  ari  row   ar1 column  242 effects  Residual S
165. OBIT  102    PROBIT   102   IPS  80   IPVAL  72   IPVR GLM fitted values  104   PVSFORM  80   IPVW GLM fitted values  104  IP  48    QUASS  151    QUIET  186    READ  64    RECODE  64   IRENAME  71  186  IREPEAT  151   REPLACE  56   IREPORT  84    RESCALE  56   RESIDUALS  80  81   RESPONSE residuals  104  IRFIELD  71    ROWFACTOR  64    RREC  64   IRSKIP  65   ISAMEDATA  193   SARGOLZAEI  152  ISAVEGIV  155   ISAVE  81   ISCALE  84   ISCORE  84   ISCREEN  81   ISECTION  73   ISED  173   ISEED  56   ISELECT  62    342    ISELF  152   ISEQ  57   ISETN  57    SETU  57   ISET  57   ISIGMAP  111   ISIN  55   ISKIP  62  71  152  199   SLNFORM  81    SLOW  85   ISMX  81   ISORT  152  199  ISPARSE  71  ISPATIAL  81   SPLINE  73    SQRT link  103  ISTEP  73   SUBGROUP  73   I  SUBSET  74   15UB  57   ISUM  68    TABFORM  82  ITARGET  51  57    THRESHOLD GLM  102    TOLERANCE  85  ITOTAL  102  104    TWOSTAGEWEIGHTS  174   TWOWAY  82   TXTFORM  82   UNIFORM  57   IUSE  122     UpArrow  54   IVCC  82   VGSECTORS  82  IVPV  174   IVRB  85   IV  58   IWMF  74   WORKSPACE  186   WORK residuals  104    IXLINK  152  IX  68    YHTFORM  82    YSS  76  82   YVAR  186  IY  68  ICENTRE  160    INDEX        COORD  116   IGSCALE  161   IND  153    NOID  160   INSD  153    ONLYUSE  170    PEV  160  161    PRECISION  153   IPSD  153     RANGE  161    SPECIALCHAR  41  50     SUBSECTION  116   ITDIFF  173    USE  116  qualifiers   datafile line  62   genetic  147   job control  65    RAM  157  random  
166. ORE model terms. The qualifier !ONLYUSE explicitly specifies the model terms to use, ignoring all others. The qualifier !EXCEPT explicitly specifies the model terms not to use, including all others. These qualifiers will not override the definition of the averaging set.

The fourth step is to choose the weights to use when averaging over dimensions in the hyper-table. The default is to simply average over the specified levels, but the qualifier !AVERAGE factor weights allows other weights to be specified. !PRESENT and !ASSOCIATE !ASAVERAGE generate more complicated averaging processes.

The basic prediction process is described in the following example:

    yield ~ site variety !r idv(site).id(variety) at(site).idv(block)
    predict variety

puts variety in the classify set, site in the averaging set and block in the ignore set. Consequently, ASReml implicitly forms the site × variety hyper-table from model terms site, variety and site.variety but ignoring all terms in at(site).block, and then averages across the sites to produce variety predictions. This prediction will work even if some varieties were not grown at some sites because the site.variety term was fitted as random. If site.variety was fitted as fixed, variety predictions would be non-estimable for those varieties that were not grown at every site.

9.3.3 Predict failure

It is not uncommon for users to get the message
Warning  non e
167. Reml to discard records which have missing values in the  design matrix  see Section 6 9      suppresses the graphic display of the variogram and residuals which is  otherwise produced for spatial analyses in the PC version  This option  is usually set on the command line using the option letter N  see Section  10 3 on graphics   The text version of the graphics is still written to the   res file     is a mechanism for specifying the particular points to be predicted for  covariates modelled using fac v   leg v k   spl v k  and pol  v  k    The points are specified here so that they can be included in the ap   propriate design matrices  v is the name of a data field  p is the list of  values at which prediction is required  See  GKRIGE for special conditions  pertaining to fac z y  prediction     is used to read predict_points for several variables from a file f  vlist is  the names of the variables having values defined  If the file contains  unwanted fields  put the pseudo variate label skip in the appropriate  position in vlzst to ignore them  The file should only have numeric values   predict_points cannot be specified for design factors     72    5 8 Job control qualifiers       Table 5 4  List of occasionally used job control qualifiers       qualifier    action       SECTION v    ISPLINE spl v n  p    ISTEP r      SUBGROUP t v p    specifies the variable in the data that defines the data sections  This  qualifier enables ASReml to check that sections have been correctl
168. SCA_V 256 1.000000 0.511400E-01 9.12 0 P

Warning: Code B - fixed at a boundary (!GP)    F - fixed by user
         ? - liable to change from P to B      P - positive definite
         C - Constrained by user (!VCC)        U - unbounded
         S - Singular Information matrix

The convergence criterion has been satisfied after six iterations. A warning message is printed below the summary of the variance components because the variance component for the setstat.teststat term has been fixed near the boundary. The default constraint for variance components (!GP) is to ensure that the REML estimate remains positive. Under this constraint, if an update for any variance component results in a negative value, then ASReml sets that variance component to a small positive value. If this occurs in subsequent iterations, the parameter is fixed to a small positive value and the code B replaces P in the C column of the summary table. The default constraint can be overridden using the !GU qualifier, but this is not generally recommended for standard analyses.

Figure 15.2 presents the residual plot, which indicates two unusual data values. These values are successive observations, namely observations 210 and 211, being testing stations 2 and 3 for setting station 9 (J), regulator 2. These observations will not be dropped from the following analyses, for consistency with other analyses conducted by Cox and Snell (1981) and in the GENSTAT manual.

The REML log-likelihood from the model without the setstat test
169. V lt   v in the field but keeps records with  DV gt 100   IDV lt       missing value    in the field  if  DV   IDV gt   is used after  A or  I  v should re    IDV gt   fer to the encoded factor level rather  than the value in the data file  see  also Section 4 2   Use  DV   to dis  InitialWt  DV    card just those records with a miss   ing value in the field     D v is equivalent to  DV    DV v     DO  n 2z 2v    causes ASRem  to perform the fol  See below  lowing transformations n times  de   fault is variables in current term    incrementing the target by i   de   fault 1  and the argument  if  present  by i   default 0   Loops  may not be nested  A loop is ter   minated by  ENDDO  another  DO or  a new field definition     DOM f copies and converts additive marker ChrAadd  G 10  MM     covariables   1  0  1  to dominance ChrAdom  DOM ChrAadd  marker covariables  see below       ENDDO terminates a  DO transformation See below  block    EXP takes antilog base e   no argument Rate  EXP    required     55    5 5 Transforming the data       Table 5 1  List of transformation qualifiers and their actions with examples       qualifier    argument    action    examples         Jddm      Jmmd   Jyyd    IM   M lt  gt    IM lt   M lt    IM gt  IM gt      IMAX   IMIN    MOD     MM    INA      NORMAL      REPLACE      RESCALE      SEED      Jddm converts a number represent   ing a date in the form ddmmccyy   ddmmyy or ddmm into days    Jmmd  converts a date in the form ccyym   mdd  yymm
170. a direct sum structure with common parameters. Note that !SUBSECTION is only available when the residual variance function is expressed in terms of one variance function. !SUBSECTION f performs two tasks similar to those described in Section 7.3.2, that is, defining a direct sum structure for the residual vector in a section, with the number of subsections in section i ($s_i$) given by the number of levels of the factor f, and pruning the levels of the factor defining the variance structure within each section but allowing common variance parameters across sections. The data needs to be sorted in order of the variable f. The following code would specify a common AR1 structure across sections (assumed sorted into the appropriate order within the section variable) with an initial spatial autocorrelation parameter of 0.5:

    residual ar1(units !INIT 0.5 !SUBSECTION section)

If there was data sorted on date within plot then we might use

    residual exp(date !INIT 0.2 !SUBSECTION plot)

to specify a common EXP structure across plots.

7.7.7 Parameter types !Ts

Each variance parameter also has a type which may be set explicitly with the qualifier !Ts, where s is the type code. The following is a list of the possible parameter types and their codes. They are usually set internally, are reported in the .tsv file and are used to define the parameter space.

    type            code  action if !GP is set
    variance        V     forced positive
    variance ratio  G     forced positive
    correlation     R
171. a ro kerar der aara 245  14 3 Things to check in the asr file    aoaaa aa 247  l44    Aneampe  ao scce canore a aa Ee eh a h eee e a 247  14 5 Information  Warning and Error messages       oaoa aaa 254  15 Examples 268  15   TION   lt  oee a e ee we a Swe a E he Oe Se ee rece Gg 268  15 2 Split plotdesign    Qats oo ke eh osc neait ner RRS EERE RES OS 268  15 3 Unbalanced nested design   Rats      oaoa aaa pe ee 273  15 4 Source of variability in unbalanced data   Volts        ooa aaa aaa 276  15 5 Balanced repeated measures   Height     oaaae aaa 279  15 6 Spatial analysis of a field experiment   Barley       o  aoa aa aaa 286  15 7 Unreplicated early generation variety trial   Wheat      oaoa aaa aaa 292  15 8 Paired Case Control study   Rice    oaoa aaa 0000004 298  158 1 Standard analysis oa so senad Ke SRS COD    OAS RO 300  15 8 2 A multivariate approach     a  ek aa eR Eee ee KES 304  15 8 3 Interpretation of results       oa aaa 00002022 ee 307   15 9 Balanced longitudinal data   Random coefficients and cubic smoothing splines    EMM a 6 6 a a 6 E Ge e a eaa Ge a e n 309  15 10 Multivariate animal genetics data   Sheep    a aaa a a 317  1510 1 Half sib analysis o o s scopa oe eee eani e a e e RES OO 317  15 10 2Animal del    o ke we hee eR ee we Crins 327  Bibliography 333    xi    Index 337    xii    List of Tables    3 1    5 1  5 2  5 3  5 4  5 9  5 6    6 1  6 2  6 3  6 4    6 5  toh    2  7 3    7 4  Ta  7 6    8 1    91  ga  9 3  9 4    Trial layout and allocati
172. a treated tray (bloodworms added) were grown in a controlled environment room for the duration of the experiment. At the end of this time rice plants were carefully extracted, the root system washed and root area determined for the tray using an image analysis system described by Stevens et al. (1999). Two pairs of trays, each pair corresponding to a different variety, were included in each run. A new batch of bloodworm larvae was used for each run. A total of 44 varieties was investigated with three replicates of each. Unfortunately, the variety concurrence within runs was less than optimal. Eight varieties occurred with only one other variety, 22 with two other varieties and the remaining 14 with three different varieties.

In the next three sections we present an exhaustive analysis of these data using equivalent univariate and multivariate techniques. It is convenient to use two data files, one for each approach. The univariate data file consists of factors pair, run, variety, tmt, unit and variate rootwt. The factor unit labels the individual trays, pair labels pairs of trays (to which varieties are allocated) and tmt is the two-level bloodworm treatment factor (control/treated). The multivariate data file consists of factors variety and run and variates for root weight of both the control and exposed treatments (labelled yc and ye respectively).

Preliminary analyses indicated variance heterogeneity so that subsequent analyses were conducted on the sq
173. able of predicted values where n is 0...9. The default is 4. G15.9 format is used if n exceeds 9. When !VVP or !SED are used, the values are displayed with 6 significant digits unless n is specified and even; then the values are displayed with 9 significant digits.

instructs ASReml to attempt a plot of the predicted values. This qualifier is only applicable in versions of ASReml linked with the Winteracter Graphics library. If there is no argument, ASReml produces a figure of the predicted values as best it can. The user can modify the appearance by typing <Esc> to expose a menu or with the plot arguments listed in Table 9.2.

instructs ASReml to print the predicted value, even if it is not of an estimable function. By default, ASReml only prints predictions that are of estimable functions.

requests all standard errors of difference be printed. Normally only an average value is printed. Note that the default average SED is actually an SED calculated from the average variance of the predicted values and the average covariance among the predicted values, rather than being the average of the individual SED values. However, when !SED is specified, the average of the individual SED values is reported.

requests t statistics be printed for all combinations of predicted values.

requests ASReml to scan the predicted values from a fitted line for possible turning points and, if found, report them and save them internally in a vector which can be a
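The distinction drawn above between the two kinds of average SED can be illustrated numerically. The sketch below is not ASReml's code: it assumes the SED from averages is sqrt(2 × (mean variance − mean covariance)) and that an individual SED is sqrt(v_ii + v_jj − 2 v_ij). The function name sed_summaries and the matrix V are made up for illustration.

```python
import math
from itertools import combinations


def sed_summaries(V):
    """Return (SED from average variance/covariance, mean of individual SEDs)."""
    n = len(V)
    pairs = list(combinations(range(n), 2))
    mean_var = sum(V[i][i] for i in range(n)) / n
    mean_cov = sum(V[i][j] for i, j in pairs) / len(pairs)
    # Assumed form of the summary built from averaged variance and covariance.
    sed_from_averages = math.sqrt(2.0 * (mean_var - mean_cov))
    # One SED per pair of predicted values, then a plain average.
    individual = [math.sqrt(V[i][i] + V[j][j] - 2.0 * V[i][j]) for i, j in pairs]
    return sed_from_averages, sum(individual) / len(individual)


# Hypothetical variance matrix of three predicted values.
V = [[1.0, 0.2, 0.1],
     [0.2, 2.0, 0.3],
     [0.1, 0.3, 1.5]]
default_sed, mean_sed = sed_summaries(V)
print(round(default_sed, 4), round(mean_sed, 4))
```

Under these assumptions the first summary is the square root of the mean squared SED, so it can never be smaller than the mean of the individual SEDs; the two agree only when all pairwise SEDs are equal.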
174. ables it to analyse large and complex data sets quite efficiently.

One of the strengths of ASReml is the wide range of variance models for the random effects in the linear mixed model that are available. There is a potential cost for this wide choice. Users should be aware of the dangers of either overfitting or attempting to fit inappropriate variance models to small or highly unbalanced data sets. We stress the importance of using data-driven diagnostics and encourage the user to read the examples chapter, in which we have attempted to not only present the syntax of ASReml in the context of real analyses but also to indicate some of the modelling approaches we have found useful.

There are several interfaces to the core functionality of ASReml. The program name ASReml relates to the primary program. ASReml-W refers to the user interface program developed by VSN and distributed with ASReml. ASReml-R refers to the S language interface to a DLL of the core ASReml routines. GenStat uses the same core routines for its REML directive. Both of these have good data manipulation and graphical facilities.

The focus in developing ASReml has been on the core engine and it is freely acknowledged that its user interface is not to the level of these other packages. Nevertheless, as the developer's interface, it is functional; it gives access to everything that the core can do and is especially suited to batch processing and running of large models without the over
175. age of the 8 location means in Table 9 5     Further discussion of associated factors    The user may specify their own weights  using file input if necessary  Thus predict region      ASAVERAGE location  1 2 3  6  1 1 1 2 1  6 would give region predictions of 11 67 and  10 84 respectively derived from the location predictions in Table 9 5  Note that because  location is nested in region  the location weights should sum to 1 0 within levels of region  when forming region means  The   AVERAGE   ASAVERAGE  qualifier allows the weights to  be read from a file which the user can create elsewhere  Thus the code  ASAVERAGE trial   Tweight csv    2 will read the weights from the second field of file Tweight csv  The user  must ensure the weights are in the coding order ASReml uses  trial order in this instance   given in the  sln file or by using the TABULATE command      It was noted that it is the base  ASSOCIATE factor that is formally included in the hyper   table  If the lowest stratum is random  it may be appropriate to ignore it  Omitting it from    187    9 3 Prediction       the   ASSOCIATE list will allow it to reenter the Ignore set  Specifying it with the   IGNORE  qualifier will exclude its effects from the prediction but not ignore the structural information  implied by the association     Normally it is not necessary for any model term to involve more than 1 of the associated  factors  One exception is if an interaction is required so that the variance can differ betw
176. ail in Chapter 7 with examples.

2.1.9 Variance models for terms with several factors

A random model term may comprise either a single factor or several component factors to give a compound model term. Consider a compound model term represented by A.B, where the component factors A and B have m and n levels respectively and the '.' operator forms a term with levels corresponding to the combinations of all levels of A with all levels of B. The effects $ab$ for A.B are generated with the levels of B nested in the levels of A, i.e. the levels of B cycling fastest:

$$ab = (ab_{11}, ab_{12}, \ldots, ab_{1n}, ab_{21}, ab_{22}, \ldots, ab_{2n}, \ldots, ab_{m1}, \ldots, ab_{mn})'$$

Now consider the variance model for the term A.B. If we specify our variance model generically as

    vmodel1(A).vmodel2(B)

where vmodel1 is a variance model function with variance matrix $A = [A_{ij}]$ and vmodel2 is a variance model function with variance matrix $B = [B_{kl}]$, then the G structure for this term is defined by

$$\mathrm{cov}(ab_{ik}, ab_{jl}) = A_{ij} \times B_{kl} \qquad (2.9)$$

This means that the covariance between two effects $ab_{ik}$ and $ab_{jl}$ in $ab$ is constructed as the product of the covariance between $a_i$ and $a_j$ in model A, i.e. its $(i,j)$th element $A_{ij}$, and the covariance between $b_k$ and $b_l$ in model B, i.e. its $(k,l)$th element $B_{kl}$.

Example 2.3 A simple direct product structure
If A has 3 levels and B has 2 levels, then the term A.B would have the 6 levels

$$(ab_{11}, ab_{12}, ab_{21}, ab_{22}, ab_{31}, ab_{32})'$$

Using magenta and b
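The product rule for the covariance of a compound term, with the levels of B cycling fastest, is the Kronecker (direct) product of the two component matrices. The sketch below builds it in plain Python for the 3-level by 2-level case of the example. The matrices A and B hold made-up values and the function name direct_product is hypothetical.

```python
def direct_product(A, B):
    """Kronecker product of square matrices A (m x m) and B (n x n).

    Row (i, k) and column (j, l) hold A[i][j] * B[k][l], with the B index
    cycling fastest, matching the ordering ab_11, ab_12, ab_21, ...
    """
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in A
        for row_b in B
    ]


# Hypothetical variance matrices: A for a 3-level factor, B for a 2-level factor.
A = [[1.0, 0.5, 0.25],
     [0.5, 1.0, 0.5],
     [0.25, 0.5, 1.0]]
B = [[2.0, 0.4],
     [0.4, 3.0]]

G = direct_product(A, B)  # 6 x 6 covariance matrix for the six ab effects
n = len(B)

# Covariance between ab_12 and ab_31 (0-based: i=0, k=1 and j=2, l=0).
print(G[0 * n + 1][2 * n + 0], A[0][2] * B[1][0])
```

The position of effect ab_ik in the vector is i × n + k (0-based), which is what "B cycling fastest" means in index terms.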
177. al order, but generational order is still required.

List of pedigree file qualifiers

qualifier    description

!SARGOLZAEI    an alternative procedure for computing $A^{-1}$ was developed by Sargolzaei et al. (2005).

!SELF s    allows partial selfing when the second parent is unknown. It indicates that progeny from a cross where the second parent (male_parent) is unknown is assumed to be from selfing with probability s and from outcrossing with probability (1 - s). This is appropriate in some forestry tree breeding studies where seed collected from a tree may have been pollinated by the mother tree or pollinated by some other tree (Dutkowski and Gilmour, 2001). Do not use the !SELF qualifier with the !INBRED or !MGS qualifiers.

!SKIP n    allows you to skip n header lines at the top of the file.

!SORT    causes ASReml to sort the pedigree into an acceptable order, that is parents before offspring, before forming the A-inverse. The sorted pedigree is written to a file whose name has .srt appended to its name.

!XLINK    requests the formation of the (inverse) relationship matrix for the X chromosome as described by Fernando and Grossman (1990), where the first parent is XY and the second is XX. This NRM inverse matrix is formed in addition to the usual $A^{-1}$ and can be accessed as GRM1 or as specified in the output. The pedigree must include a fourth field which codes the SEX of the individual. The actual code used
178. alid characters in the variable names; variable names must not include certain symbols, among them '.'.

the data file name is misspelt.

there are too many variables declared or there is no valid value supplied with an arithmetic transformation option.

there is a problem reading the G structure header line. An earlier error (for example insufficient initial values) may mean the line read is not actually a G header line at all. A G header line must contain the name of a term in the linear model spelt exactly as it appears in the model.

a G structure line cannot be interpreted.

The size of the structure defined does not agree with the model term that it is associated with.

an error occurred processing the pedigree. The pedigree file must be ASCII, free format with ANIMAL, SIRE and DAM as the first three fields.

ASReml failed to calculate the GLM working variables or weights. Check the data.

Either the field has alphanumeric values but has not been declared using the !A qualifier, or there is not enough space to hold the levels of the factor. To increase the levels, insert the expected number of levels after the !A or !I qualifier in the field definition.

Use !WORKSPACE s to increase the workspace available to ASReml. If the data set is not extremely big, check the data summary.

Maybe the response variable is all missing.

there must be at least 3 distinct data values for a spline term.

If ASReml has not
179. ametric constraints and relationships (equality and scale) between parameters. A file .msv is produced, similar to .tsv but containing final values, that can be edited and used with !MSV. If !TSV (or !MSV) is specified, ASReml will read the current (created with the same PART number) .tsv (or .msv) file. If there is no current .tsv (or .msv) file, a non-current (produced from a different PART of the same job) .tsv (or .msv) file will be read.

Alternative ways of specifying !TSV and !MSV are !CONTINUE 2 and !CONTINUE 3, and these qualifiers can be used as options on the command line as C2 and C3. Note that the constraints in the .tsv/.msv files take precedence over those in the .as file.

7.9.2 Using estimates from simpler models

Sometimes we have estimates from simpler models and we wish to reduce the need for the user to type in updated starting values. The !CONTINUE command line qualifier instructs ASReml to update initial parameter values from a .rsv file. When it is specified, ASReml first looks for a current .rsv file and, if found, will read it and report the constructed initial values in the .tsv file. If there is no current .rsv file, it looks for the most recent non-current .rsv file and uses that to construct initial values. As discussed below, 'current' means having the same 'basename' and 'run number'. A non-current file will have the same 'basename' but a different 'run number'. When reading the .rsv file, if the variance st
180. ance covariables from a set of additive marker covariables previously declared with the !MM marker map qualifier. It assumes the argument A is an existing group of marker variables relating to a linkage group defined using !MM, which represents additive marker variation coded [-1, 0, 1], representing marker states aa, aA and AA respectively. It is a group transformation which takes the [-1, 1] interval values and calculates $(|X| - 0.5) \times 2$, i.e. -1 and 1 become 1, and 0 becomes -1. The marker map is also copied and applied to this model term so it can be the argument in a qtl() term (page 100).

!DO ... !ENDDO provides a mechanism to repeat transformations on a set of variables. All transformations except !DOM and !RESCALE operate once on a single field unless preceded by a !DO qualifier. The !DO qualifier has three arguments: n, the number of times the following transformations are to be performed; an increment (default 1) applied to the target field; and an increment (default 0) applied to the transformation argument. The default for n is the number of variables in the current field definition. !ENDDO is formally equivalent to !DO 1 and is implicit when another !DO appears or the next field definition begins. Note that when several transformations are repeated, the processing order is that each is performed n times before the next is processed (contrary to the implication of the syntax). However, the target is reset for each transfo
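The dominance recoding described above, which maps additive marker codes through (|X| - 0.5) x 2, can be illustrated with a minimal Python sketch. The function name is hypothetical and the code only mirrors the stated formula; it is not ASReml's implementation.

```python
def dominance(additive):
    """Map additive marker codes in [-1, 1] to dominance codes via (|x| - 0.5) * 2."""
    return [(abs(x) - 0.5) * 2 for x in additive]


# Marker states aa, aA and AA are coded -1, 0 and 1.
print(dominance([-1, 0, 1]))  # [1.0, -1.0, 1.0]
```

Both homozygotes map to 1 and the heterozygote maps to -1; intermediate interval values are rescaled linearly on their absolute value.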
181. ance when a correlation structure is applied to the residual.

creates a factor with a new level whenever there is a level present for the factor f. Levels (effects) are not created if the level of factor f is 0, missing or negative. The size may be set in the third argument by setting the second argument to zero.

creates a factor with a level for every record subject to the factor level of f equalling k, i.e. a new level is created for the factor whenever a new record is encountered whose integer-truncated data value from data field f is k. Thus uni(site,2) would be used to create an independent error term for site 2 in a multi-environment trial and is equivalent to at(site,2).units. The default size of this model term is the number of data records. The user may specify a lower number as the third argument. There is little computational penalty from the default but the .sln file may be substantially larger than needed. However, if the units vector is full size, the effects are mapped by record number and added back to the fitted residual for creating 'residual' plots.

Table 6.2: Alphabetic list of model functions and descriptions

model function    action

vect(v)    is used in a multivariate analysis on a multivariate set of covariates (v) to pair them with the variates. The test example included

    signal !G 93   # 93 slides
    background !G 93
    dart.asd !ASUV
    signal ~ Trait Trait.vect(background)
182. and c

Note that $\hat{\tau}$ is the best linear unbiased estimator (BLUE) of $\tau$, while $\tilde{u}$ is the best linear unbiased predictor (BLUP) of $u$ for known $\sigma_g$ and $\sigma_r$. We also note that

$\tilde{\beta} - \beta = \begin{bmatrix} \hat{\tau} - \tau \\ \tilde{u} - u \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, C^{-1} \right)$

2.2.3 Use of the gamma parameterization

ASReml uses either the gamma or sigma parameterization for estimation depending on the residual specification. The current default for univariate, single-section data sets is the gamma parameterization. In this case, all scale parameters are estimated as a ratio with respect to the residual variance, $\sigma^2$, and any parameters that measure only correlation are unchanged. See Chapter 7 for more detail.

2.3 What are BLUPs?

Consider a balanced one-way classification. For data records ordered r repeats within b treatments regarded as random effects, the linear mixed model is $y = X\tau + Zu + e$, where $X = 1_b \otimes 1_r$ is the design matrix for $\tau$ (the overall mean), $Z = I_b \otimes 1_r$ is the design matrix for the b (random) treatment effects $u_i$, and $e$ is the error vector. Assuming that the treatment effects are random implies that $u \sim N(A\psi, \sigma_b^2 I_b)$ for some design matrix $A$ and parameter vector $\psi$. It can be shown that

$\tilde{u} = \frac{r\sigma_b^2}{r\sigma_b^2 + \sigma^2}\,(\bar{y} - 1\bar{y}_{..}) + \frac{\sigma^2}{r\sigma_b^2 + \sigma^2}\,A\psi$   (2.19)

where $\bar{y}$ is the vector of treatment means and $\bar{y}_{..}$ is the grand mean. The differences of the treatment means and the grand mean are the estimates of treatment effects if treatment effects are fixed. The BLUP is therefore a weighted mean of the data based estimate and the 'prior
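The description of the BLUP as a weighted mean of the data based estimate and the prior mean can be made concrete for the balanced one-way layout. The sketch below is an illustration under stated assumptions: the function name, the data and the variance values are hypothetical, the variance components are treated as known, the prior mean is taken as a single scalar, and the weight on the data is assumed to be r * sigma2_b / (r * sigma2_b + sigma2_e), the standard shrinkage factor for r replicates per treatment.

```python
def blup_one_way(groups, sigma2_b, sigma2_e, prior=0.0):
    """BLUPs of treatment effects for a balanced one-way layout.

    groups   : list of equal-length lists of observations, one per treatment
    sigma2_b : treatment (between-group) variance, assumed known
    sigma2_e : residual variance, assumed known
    prior    : prior mean of each treatment effect (scalar, assumed)
    """
    r = len(groups[0])
    if any(len(g) != r for g in groups):
        raise ValueError("the layout must be balanced")
    means = [sum(g) / r for g in groups]
    grand = sum(means) / len(means)
    shrink = r * sigma2_b / (r * sigma2_b + sigma2_e)  # weight on the data
    return [shrink * (m - grand) + (1 - shrink) * prior for m in means]


# Hypothetical data: 3 treatments, 2 replicates each.
data = [[9.0, 11.0], [14.0, 16.0], [19.0, 21.0]]
print(blup_one_way(data, sigma2_b=4.0, sigma2_e=8.0))  # [-2.5, 0.0, 2.5]
```

Here the fixed-effect style estimates would be -5, 0 and 5; with a shrinkage factor of 0.5 and a prior mean of zero, the BLUPs are pulled halfway towards zero.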
184. are erroneous

Warning: This US structure is not positive definite
Warning: Unrecognised qualifier at character
Warning: US matrix was not positive definite: MODIFIED
Warning: User specified spline points
Warning: Variance parameters were modified by BENDing
Warning: Likelihood decreased. Check gammas and singularities.

probable cause/remedy

revise the qualifier arguments.

The issue is to match the declared R structure to the physical data. Dropping observations which are missing will usually destroy the pattern. Estimating missing values allows the pattern to be retained.

Do not accept the estimates printed.

The FOWN test requested is not calculated because it results in different numbers of degrees of freedom to that obtained for the incremental tests for the terms in the model as fitted; the FOWN calculations are based on the reduced design matrix formed for the incremental model. ASReml performs the standard conditional test instead. The user must reorder (swap) the terms in the model specification and rerun the job to perform the requested FOWN test.

the labels for predicted terms are probably out of kilter. Try a simpler predict statement. If the problem persists, send for help.

check the initial values.

the qualifier either is misspelt or is in the wrong place.

the initial values were modified by a 'bending' process.

the points have been rescaled to suit the data values.

ASReml may not have converged to th
185. are specified. ASReml offers a wide range of variance models to choose from. A full listing is in Table 7.6 and details are provided in Chapter 7.

2.1.6 Gamma parameterization for the linear mixed model

The sigma parameterization of model (2.3) is one possible parameterization of var(y). In this parameterization both $G(\sigma_g)$ and $R(\sigma_r)$ are variance matrices and the variance structure parameters in $\sigma_g$ and $\sigma_r$ are referred to as sigmas (see above). Other parameterizations are possible and are sometimes useful. For example, in some of the early development of REML for the traditional mixed model of (2.5), the variance matrix was parameterized as the equivalent model

$\mathrm{var}(y) = \sigma^2 \left( \sum_{i=1}^{b} \gamma_{g_i} Z_i Z_i' + I_n \right)$   (2.6)

for $\gamma_{g_i}$ being the ratio of the variance component for the random term $u_i$ relative to error variance, that is, $\gamma_{g_i} = \sigma_{g_i}^2 / \sigma^2$. In this case ASReml calculated a simple estimate of $\sigma^2$, and initial values for the iterative process were specified in terms of the ratios $\gamma_{g_i}$ rather than in terms of the variance components $\sigma_{g_i}^2$. It was often easier to specify initial values in terms of these ratios rather than the variance components, which is why this approach was adopted. Where $R(\sigma_r)$ can be written as a scaled correlation matrix, that is, $R(\sigma_r) = \sigma^2 R_c(\gamma_r)$, this suggests the alternative specification of (2.2):

$\begin{bmatrix} u \\ e \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix},\ \sigma^2 \begin{bmatrix} G(\gamma_g) & 0 \\ 0 & R_c(\gamma_r) \end{bmatrix} \right)$

where $\gamma_g$ and $\gamma_r$ represent the variance structure parameters associated with scaled (by $\sigma^2$)
186. ared in AI matrix

Singularity in Average Information Matrix
SINGULARITY IN ...
Sorting data by Section Row
Sorting the data into field order
STOP SCRATCH FILE DATA STORAGE ERROR
Structure/Factor mismatch
Too many alphanumeric factor level labels
Too many factors with !A or !I: max 100
Too many (max 20) dependent variables
Unable to invert R or G [US?] matrix
Unable to invert R or G [CORR?] matrix

probable cause/remedy

this is a Unix memory error. It typically occurs when a memory address is outside the job memory. The first thing to try is to increase the memory workspace using the !WORKSPACE command line option (see Section 10.3 on memory). Otherwise you may need to send your data and the .as files to Customer Support for debugging.

See the discussion on !AISINGULARITIES.

Problem performing the Regression Screen.

the field order coding in the spatial error model does not generate a complete grid with one observation in each cell; missing values may be deleted, they should be fitted. Also may be due to incorrect specification of number of rows or columns.

ASReml attempts to hold the data on a scratch file. Check that the disk partition where the scratch files might be written is not too full; use the !NOSCRATCH qualifier to avoid these scratch files.

the declared size of a variance structure does not match the size of the model term that it is associated with.

if the factor level labels are actually all integer
187. arent (with no progeny in the pedigree) to be written to basename.aif.

!FGEN [f]    indicates the pedigree file has a fourth field containing the level of selfing or the level of inbreeding in a base individual. In the fourth field, 0 indicates a simple cross, 1 indicates selfed once, 2 indicates selfed twice, etc. A value between 0 and 1 for a base individual is taken as its inbreeding value. If the pedigree has implicit individuals (they appear as parents but not in the first field of the pedigree file), they will be assumed to be base non-inbred individuals unless their inbreeding level is set with !FGEN f, where 0 < f < 1 is the inbreeding level of such individuals. Individuals with one or both parents unknown, and without a specific non-zero inbreeding coefficient provided in the fourth field of the pedigree, are assigned an inbreeding coefficient f.

List of pedigree file qualifiers

qualifier    description

Qualifiers listed: !GIV, !GIV 2, !GOFFSET o, !GROUPS g, !INBRED, !LONGINTEGER, !MAKE, !MEUWISSEN, !MGS, !QUAAS, !REPEAT

!GIV instructs ASReml to write out the A-inverse in the format of .giv files. !GIV 2 writes the pedigree of the parents to basename_Parent.ped and the diagonal elements of the A-inverse to basename_Q.giv with offspring identifiers (see Section 8.10). If !GROUPS is also specified, this .giv file will include the !GROUPSDF qualifier on its first line.

An alternative to gr
188. ariables explain it. Again, the !BLUP 1 qualifier might help.

A program limit has been breached. Try simplifying the model.

use the !WORKSPACE qualifier to increase the workspace allocation. It may be possible to revise the models to increase sparsity.

factors are probably not declared properly. Check the number of levels. Possibly use the !WORKSPACE qualifier.

The predict table appears to be too big. Try increasing !WORKSPACE, or predicting in parts.

occurs when space allocated for the structure table is exceeded. There is room for three structures for each model term for which G structures are explicitly declared. The error might occur when ASReml needs to construct rows of the table for structured terms when the user has not formally declared the structures. Increasing g on the variance header line for the number of G structures (see ASReml User Guide, Structural Specification) will increase the space allocated for the table. You will need to add extra explicit declarations also.

check the pedigree file and see any messages in the output. Check that identifiers and pedigrees are in chronological order.

the A-inverse factors are not the same size as the A-inverse. Delete the ainverse.bin file and rerun the job.

Typically this arises when there is a problem processing the pedigree file.

Check the details for the distance based variance structure.

Check the distances specified for the distance based variance structure.

Try increas
189. ariance (Section 2.1.6):

$\begin{bmatrix} u \\ e \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix},\ \sigma^2 \begin{bmatrix} G(\gamma_g) & 0 \\ 0 & R_c(\gamma_r) \end{bmatrix} \right)$

where $\gamma_g$ and $\gamma_r$ represent the variance structure parameters associated with scaled (by $\sigma^2$) variance matrices. Under this parameterization

$\mathrm{var}(y) = \sigma^2 \left( Z G(\gamma_g) Z' + R_c(\gamma_r) \right)$

In this chapter we give a detailed account of variance modelling in ASReml.

7.1 Applying variance models to random terms

In the previous chapter we showed how to specify the random model terms $u_i$ in $u$ and associated design matrices, and we assumed the effects were IID by using an idv() function. We can naturally extend this using other functions. Some common variance functions are defined in Table 7.1; the full range of variance model functions and their detailed definition is presented in Table 7.6.

The models are classified as variance models if they include a scale parameter, or as correlation models if their scale is fixed. Except for the giv models, correlation models take value 1 on the diagonal. Names of correlation models can be appended with v (e.g. idv()) to add a common variance (i.e. the same variance across all rows) or with h (e.g. idh()) to allow a separate variance for each row. If all of the variables in a term do not have a variance model specified then the default variance model, idv(), will be applied to these variables. We further generalise this in Section 7.2 and Table 7.2 by introducing the idea of a consolidated model term that simultaneously defines
190. ast line read was: yield ~ mu variety
    Finished: 23 Apr 2014 09:16:54.931 Error parsing yield ~ mu variety

ASReml happily reads down to the nin9.asd line. This name contains a '.' which is not permitted in a variable name, so nin9.asd is expected to be a file name, but there is no such file in the working folder. The data file is actually nin89.asd.

14.3 Things to check in the .asr file

The information that ASReml dumps in the .asr file when an error is encountered is intended to give you some idea of the particular error.

- if there is no data summary, ASReml has failed before or while reading the model line.

- if ASReml has completed one iteration, the problem is probably associated with starting values of the variance parameters or the logic of the model rather than the syntax per se.

14.4 An example

Briefly, the 8 coding errors in the example above, in the order they will be detected, are:

1. filename misspelt (there is no file nin9.asd in the working folder),
2. unrecognised qualifier (should be !SKIP),
3. 'Variety' has alphabetic level labels but is not declared as such (!A required),
4. comma missing from first line of model (!r Repl is part of the model but not recognised as such),
5. misspelt variable label in linear model (Repl should be repl),
6. misspelt variable labels in residual model,
7. the data has missing cells with respect to the declared residual structure,
8. misspelt variable la
191. ates the appropriate inverse R structure arising from the distribution and link function, and so in general a residual line is not needed. The only exception is in this bivariate case, when id(units).us(Trait) is needed and us(Trait) has the three residual components, and often the first one associated with the GLM is constrained to an initial value of 1.

6.8.1 Generalized Linear Mixed Models

This section was written by Damian Collins.

A Generalized Linear Mixed Model (GLMM) is an extension of a GLM to include random terms in the linear predictor. Inference concerning GLMMs is impeded by the lack of a closed form expression for the likelihood. ASReml currently uses an approximate likelihood technique called penalized quasi-likelihood, or PQL (Breslow and Clayton, 1993), which is based on a first order Taylor series approximation. This technique is also known as Schall's technique (Schall, 1991), pseudo-likelihood (Wolfinger and O'Connell, 1993) and joint maximisation (Harville and Mee, 1984; Gilmour et al., 1985). Implementations of PQL are found in many statistical packages, for instance in the GLMM (Welham, 2005) and the IRREML procedures of Genstat (Keen, 1994), the MLwiN package (Goldstein et al., 1998), the GLMMIX macro in SAS (Wolfinger, 1994), and in the GLMMPQL function in R.

The PQL technique is well known to suffer from estimation biases for some types of GLMMs. For grouped binary data with small group sizes,
192. ating $\text{component 5} / \sqrt{\text{component 4} \times \text{component 6}}$, where components 4, 5 and 6 are variance components from the analysis.

12.2.4 A more detailed example

The following example for a bivariate sire model is a little more complicated. The job file bsiremod.as contains

    coop.fmt
    ywt fat ~ Trait Trait age c brr sex sex age !r us(Trait).id(sire) us(Trait) !f Tr.grp
    residual id(units).us(Trait)
    VPREDICT !DEFINE
    F phenvar id(units).us(Trait) + us(Trait).id(sire)
    F addvar sire.us(Trait) * 4
    H heritA addvar[1] phenvar[1]
    H heritB addvar[3] phenvar[3]
    R phencorr phenvar
    R gencorr addvar

The relevant lines of the .asr file are

    Model_Term                     Sigma         Sigma         Sigma/SE  % C
    id(units).us(Trait)   8140 effects
    Trait   US_V  1 1     23.2055       23.2055        44.44   0 P
    Trait   US_C  2 1     2.50402       2.50402        18.56   0 P
    Trait   US_V  2 2     1.66292       1.66292        32.02   0 P
    us(Trait).id(sire)    184 effects
    Trait   US_V  1 1     1.45821       1.45821         3.66   0 P
    Trait   US_C  2 1     0.130280      0.130280        1.92   0 P
    Trait   US_V  2 2     0.344381E-01  0.344381E-01    2.03   0 P

Numbering the parameters reported in bsiremod.asr (and bsiremod.vvp):

    1 error variance for ywt
    2 error covariance for ywt and fat
    3 error variance for fat
    4 sire variance component for ywt
    5 sire covariance for ywt and fat
    6 sire variance for fat

then

    F phenvar id(units).us(Trait) + us(Trait).id(sire)
    or F phenvar units.us(Trait) + sire.us(Trait)
    or F phenvar 1:3 + 4:6

creates new components 7
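The arithmetic behind the phenvar, addvar, heritA, heritB, phencorr and gencorr functions can be reproduced by hand from the six components numbered above. The Python sketch below is an illustration only: it uses the estimates as read from the .asr extract (some digits in that extract are damaged, so the values are an assumption), the variable names are hypothetical, and it computes point estimates only, without the standard errors that VPREDICT also reports.

```python
import math

# Components 1-3 (residual) and 4-6 (sire), as numbered in the text.
error = {"ywt": 23.2055, "cov": 2.50402, "fat": 1.66292}
sire = {"ywt": 1.45821, "cov": 0.130280, "fat": 0.0344381}

# phenvar = components 1:3 + 4:6; addvar = sire components * 4.
phenvar = {k: error[k] + sire[k] for k in error}
addvar = {k: 4 * sire[k] for k in sire}

# Heritabilities: additive variance over phenotypic variance, per trait.
herit_a = addvar["ywt"] / phenvar["ywt"]
herit_b = addvar["fat"] / phenvar["fat"]

# Correlations: covariance over the square root of the product of variances.
phencorr = phenvar["cov"] / math.sqrt(phenvar["ywt"] * phenvar["fat"])
gencorr = addvar["cov"] / math.sqrt(addvar["ywt"] * addvar["fat"])

print(round(herit_a, 3), round(herit_b, 3))
print(round(phencorr, 3), round(gencorr, 3))
```

Multiplying the sire components by 4 reflects the usual sire-model assumption that the sire variance is one quarter of the additive genetic variance; the factor cancels in the genetic correlation.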
193. average of the region effects, into all of the location means, which is not appropriate. With !ASSOCIATE, it knows which trials to average, and which region effects to include, to form each location mean. That is, ASReml knows how to construct the trial means including the appropriate region and location effects, and which trial means to then average to form the location table.

However, for region means, we have a choice. We can average the trial means in Table 9.4 according to region, obtaining region means of 11.83 and 11.33, or we can average the location means in Table 9.5 to get region means of 12 and 11.

The former is the default in ASReml, produced by

    predict region !ASSOCIATE region location trial !ASAVERAGE trial

or equivalently by

    predict region !ASSOCIATE region location trial

Again, this is base averaging.

By contrast

    predict region !ASSOC region location trial !ASAVE location trial

or

    predict region !ASSOC region location trial !ASAVE location

produces sequential averaging, giving region means of 12 and 11 respectively.

Similarly, an overall sequential mean of 11.5 is given by

    predict mu !ASSOC region location trial !ASAVE region location

while

    predict mu !ASSOC region location trial !ASAVE region

gives a value of 11.58, being the average of region means 11.83 and 11.33 obtained by averaging trials within regions from Table 9.4, and

    predict mu !ASSOCIATE region location trial !ASAVE location

predicts mu as 11.38, the aver
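The three overall means quoted above differ only in which level of the hierarchy is averaged last, which a few lines of Python make explicit. This is a sketch of the arithmetic, not of ASReml's prediction code: the region means are those quoted in the text, while the split of the 8 locations into 3 in the first region and 5 in the second is an assumption, and the variable names are hypothetical.

```python
def mean(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


# Region means quoted in the text.
region_from_trials = [11.83, 11.33]    # trial means averaged within region
region_from_locations = [12.0, 11.0]   # location means averaged within region
locations_per_region = [3, 5]          # assumed split of the 8 locations

# Averaging location-based region means across regions.
overall_region_location = mean(region_from_locations)

# Averaging trial-based region means across regions.
overall_region = mean(region_from_trials)

# Averaging all 8 location means directly: recover the location totals
# from the region means and the assumed number of locations per region.
total = sum(n * m for n, m in zip(locations_per_region, region_from_locations))
overall_location = total / sum(locations_per_region)

print(overall_region_location, round(overall_region, 2), overall_location)
```

Under these assumptions the three values are 11.5, 11.58 and 11.375, the last of which is reported as 11.38 in the text.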
194. ay be supplied as the second argument. For example at(Type,TEST).Entry, where Type is a factor variable with level names TEST and CONTROL.

at(a).b creates a series of model terms representing b nested within a for any model term b. A model term is created for each level of a; each has the size of b. For example, if site and geno are factors with 3 and 10 levels respectively, then at(site).geno is shorthand for 3 model terms

    at(site,1).geno at(site,2).geno at(site,3).geno

each with 10 levels.

this is similar to forming an interaction except that a separate model term is created for each level of the first factor; this is useful for random terms when each component can have a different variance. The same effect is achieved by using an interaction (e.g. site.geno) and associating a DIAG variance structure with the first component (see Section 7.11).

any at() term to be expanded MUST be the FIRST component of the interaction; geno.at(site) will not work; at(site,1).at(year).geno will not work but at(year).at(site,1).geno is OK.

the at() factor must be declared with the correct number of levels because the model line is expanded BEFORE the data is read. Thus if site is declared as site * or site !A in the data definitions, at(site).geno will expand to

    at(site,01).geno at(site,02).geno

regardless of the actual number of sites.

6.5.4 Associated Factors

Sometimes there is a hierarchical structure to factors which should be recognised as
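The expansion of an at() term into one model term per level, and the reason the declared number of levels matters, can be mimicked with simple string handling. The Python sketch below is illustrative only: the function name is hypothetical and it models just the naming pattern described above, not how ASReml parses a model line.

```python
def expand_at(factor, n_levels, term):
    """Expand at(factor).term into one model term per declared level of factor."""
    return [f"at({factor},{level}).{term}" for level in range(1, n_levels + 1)]


# site declared with 3 levels: three separate terms are generated.
print(" ".join(expand_at("site", 3, "geno")))
```

Because the expansion uses the declared number of levels and not the number found in the data, declaring too few levels produces too few terms, which is the pitfall the text warns about.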
195. b you are running, with extension .own for the file written by ASReml and .gdg for the file your program writes. The type of the parameters is set with the !T qualifier (see Section 7.7.7) and the control parameter is set using the !F qualifier.

!F1 applies to the own variance model function. With own, the argument of !F is passed to the MYOWNGDG program as an argument the program can access. This is the mechanism that allows several OWN models to be fitted in a single run.

!Ts is used to set the type of the parameters. It is primarily used in conjunction with the own variance model function, as ASReml knows the type of the parameters in other cases. The valid type codes are given in Section 7.7.7.

7.7.4 Parameter space constraints: !Gs

Each parameter has an associated constraint code which may be expressed explicitly with the qualifier !Gs, where s is the code. The following is a list of the possible constraint codes.

code    constraint type    description

P    in the space    P is the default in most cases and attempts to keep the parameter in the theoretical parameter space. It is activated when the update of a parameter would take it outside its space. For example, if an update would make a variance negative, the negative value is replaced by a small positive value. Under the !GP condition, repeated attempts to make a variance negative are detected and the value is then fixed at a small positive
196. basics, Statistics and Computing 4, 221-234.

Patterson, H. D. and Thompson, R. (1971). Recovery of interblock information when block sizes are unequal, Biometrika 31, 100-109.

Pinheiro, J. C. and Bates, D. M. (2000). Mixed-Effects Models in S and S-PLUS, Springer-Verlag.

Quaas, R. L. (1976). Computing the diagonal elements and inverse of a large numerator relationship matrix, Biometrics 32, 949-953.

Robinson, G. K. (1991). That BLUP is a good thing: The estimation of random effects, Statistical Science 6, 15-51.

Rodriguez, G. and Goldman, N. (2001). Improved estimation procedures for multilevel models with binary response: A case study, Journal of the Royal Statistical Society A - General 164(2), 339-355.

Sargolzaei, Iwaisaki and Colleau (2005). A fast algorithm for computing inbreeding coefficients in large populations, Genetics, Selection and Evolution 122, 325-331.

Schall, R. (1991). Estimation in generalized linear models with random effects, Biometrika 78(4), 719-27.

Searle, S. R. (1971). Linear Models, New York, John Wiley and Sons, Inc.

Searle, S. R. (1982). Matrix algebra useful for statistics, New York, John Wiley and Sons, Inc.

Searle, S. R., Casella, G. and McCulloch, C. E. (1992). Variance Components, New York, John Wiley and Sons, Inc.

Self, S. C. and Liang, K. Y. (1987). Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under non-standard conditions, Journal of the American Statis
197. bel in predict statement (varierty should be variety).

1. Data file not found

Running this job produces the .asr file in Section 14.1. The first problem is that ASReml cannot find a data file nin9.asd in the current working folder, as indicated in the error message above the Fault line. Since nin9.asd contains a '.', which is not permitted in variable names, ASReml checks for a file of this name (in the working directory, since no path is supplied). But ASReml did not find a file with this name. ASReml cannot tell whether the filename is misspelt or an invalid variable name has been specified. In this case the data file was given as nin9.asd rather than nin89.asd. However, ASReml kept going and read the model line, which it recognised because of the ~ character. The message

    Fault: Error parsing yield ~ mu variety

does not mean that the error is in the model yield ~ mu variety, but that ASReml recognised this as the model line and gave up because it had not encountered a valid data file line.

The message

    Warning: Unrecognised qualifier at character 10 nin9.asd !SLIP 1

simply indicates that the qualifier !SLIP 1 has not been processed.

2. An unrecognised qualifier

After correcting the filename, we get the following (abbreviated) output. The problem is that !SKIP 1, which would cause ASReml to skip the first line of the data file, was mistyped as !SLIP 1, which ASReml failed to recognise.

    NIN Alliance Trial 1989
    variety
    ...
    yi
198. 3.7 Tabulation, predicted values and functions of the variance components

Data file preparation
4.2 The data file
4.2.1 Free format data files
4.2.2 Fixed format data files
4.2.3 Preparing data files in Excel
4.2.4 Binary format data files

Command file: Reading the data
5.2 Important rules
5.4 Specifying and reading the data
5.4.1 Data field definition syntax
5.4.2 Storage of alphabetic factor labels
5.4.3 Ordering factor levels
5.4.4 Skipping input fields
5.5 Transforming the data
5.5.1 Transformation syntax
5.5.2 QTL marker transformations
5.5.3 Remarks concerning transformations
5.5.4 Special note on covariates
5.7 Data file qualifiers
5.7.1 Combining rows from separate
199. bic smoothing spline between knot points in the prediction process. Since the spline knot points are specifically nominated in the !SPLINE line, these extra points have no effect on the analysis run time. The !SPLINE line does not modify the analysis in this example since it simply nominates the 7 ages in the data file. The same analysis would result if the !SPLINE line was omitted and spl(age,7) in the model was replaced with spl(age). An extract of the output file is

  1 LogL=-20.9043 S2= 48.470 5 df 0.1000E+00
  2 LogL=-20.9013 S2= 49.152 5 df 0.9102E-01
  3 LogL=-20.8998 S2= 49.892 5 df 0.8221E-01
  4 LogL=-20.8996 S2= 50.273 5 df 0.7802E-01
  Final parameter values 0.7892E-01

  - - - Results from analysis of circ - - -
  Akaike Information Criterion 45.80 (assuming 2 parameters).
  Bayesian Information Criterion 45.02

  Approximate stratum variance decomposition
  Stratum            Degrees-Freedom  Variance  Component Coefficients
  idv(spl(age,7))    1.49             98.4896   1232  1.0
  Residual Variance  3.51             50.2726   0.0   1.0

  Model_Term               Gamma         Sigma    Sigma/SE  % C
  idv(spl(age,7)) IDV_V 5  0.789210E-01  3.96756  0.40      0 P
  Residual        SCA_V 7  1.000000      50.2726  1.32      0 P

  Notice: The DenDF values are calculated ignoring fixed/boundary/singular variance parameters using algebraic derivatives.

               Estimate      Standard Error  T-value  T-prev
  3 age   1    0.814772E-01  0.552336E-02    14.75
  7 mu    1    24.4378       5.754
200. blem on the !SPLINE line. It could be a wrong variable name or the wrong number of knot points. Knot points should be in increasing order.

Try increasing workspace.

The problem may be due to the use of the !SORT qualifier in the data definition section.

The PREDICT statement seems in error: the named factor is not present in the model.

An !INCLUDE file could not be opened.

May be an unrecognised factor/model term name or variance structure name, or wrong count of initial values, possibly on an earlier line. May be insufficient lines in the job.

Check your MYOWNGDG program and the .gdg file.

Maybe increase !WORKSPACE. Messages may identify a problem with the pedigree.

14.5 Information, Warning and Error messages

Table 14.3 Alphabetical list of error messages and probable cause(s)/remedies

error message | probable cause/remedy

  Failed while ordering equations.
  FORMAT error reading ...
  G structure header: Factor order
  G structure: ORDER 0 MODEL GAMMAS
  G structure size does not match
  Getting Pedigree:
  GLM Bounds failure
  Increase declared levels for factor ...
  Increase workspace ...
  Insufficient data read from file
  Insufficient points for ...
  Insufficient workspace
  invalid analysis trait number

This indicates the job needs more memory than was allocated or is available. Try increasing the workspace or simplifying the model.

Likely causes are: bad syntax or inv
201. both the design matrix (Z) and variance model (G), in particular allowing G to be the direct product of variance structures. In Section 7.4 we further generalise the consolidated model specification to allow the residual variance structure to be the direct sum of variance structures.

7.2 Process to define a consolidated model term

Consider a linear model term column.row comprising the interaction between the single factors column and row. We refer to column.row as a compound model term. If the variance structure for column.row is the direct product of two matrices, the first of which is an IID variance structure (that is, a scaled identity matrix) with dimension equal to the number of levels of the factor column, and the second of which is a matrix with dimension equal to the number of levels of the factor row and with elements representing a first order autoregressive correlation structure AR1, then we represent this by the consolidated model term idv(column).ar1(row). This specifies a two-dimensional separable spatial variance structure for column.row but with spatial correlation in the column direction only. A consolidated model term is therefore comprised of component terms, each with a variance model function applied to give the required direct product form of the variance structure. Table 7.2 demonstrates how to build consolidated terms in ASReml for a small selection of examples. The linear model term (single or compound) is first identified
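The direct product described above can be sketched numerically. The following is a minimal illustration in plain Python, not ASReml code: the function names, the example dimensions and the parameter values (sigma2 = 2.0, rho = 0.5) are all hypothetical, and the ordering of effects (row varying fastest within column) is an assumption made for the sketch.

```python
def ar1_matrix(n_levels, rho):
    """First order autoregressive correlation matrix: element (i, j) is rho**|i-j|."""
    return [[rho ** abs(i - j) for j in range(n_levels)] for i in range(n_levels)]


def scaled_identity(n_levels, sigma2):
    """IID variance structure: sigma2 times the identity matrix."""
    return [[sigma2 if i == j else 0.0 for j in range(n_levels)] for i in range(n_levels)]


def kronecker(a, b):
    """Direct (Kronecker) product of two square matrices given as lists of lists."""
    nb = len(b)
    size = len(a) * nb
    return [[a[i // nb][j // nb] * b[i % nb][j % nb] for j in range(size)]
            for i in range(size)]


# Hypothetical example: 2 columns, 3 rows, common variance 2.0, correlation 0.5.
# Effects are assumed ordered with row varying fastest within column.
variance = kronecker(scaled_identity(2, 2.0), ar1_matrix(3, 0.5))
```

In this sketch, effects in the same column share covariance sigma2 * rho**|lag| and effects in different columns are uncorrelated, which is the separable form the consolidated term is meant to express.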
202. by the !TOTAL qualifier, the multinomial model requires  y  Xi Yj  a particular variance structure across the multinomial classes. This is formally  Li  E Y  and specified as residual id(units).mthr(Trait)  Hi  Hi 1

6.8 Generalized Linear (Mixed) Models

Table 6.4 GLM distribution qualifiers

qualifier | action

The multinomial threshold model is fitted as a cumulative probability model. The proportions (y = r/n) in the ordered categories are summed to form the cumulative proportions (Y), which are modelled with logit (!LOGIT), probit (!PROBIT) or Complementary LogLog (!CLOG) link functions. The implicit residual variance on the underlying scale is $\pi^2/3 \approx 3.3$ (underlying logistic distribution) for the logit link and 1 for the probit link. The distribution underlying the Complementary LogLog link is the Gumbel distribution, with implicit residual variance on the underlying scale of $\pi^2/6 \approx 1.65$. For example,

  Lodging !MULTINOMIAL 4 !CUMULATIVE ~ Trait Variety !r block
  predict Variety

where Lodging is a factor with 4 ordered categories. Predicted values are reported for the cumulative proportions.

!POISSON [!LOGARITHM | !IDENTITY | !SQRT]
  $v = \mu$
  $d = 2\,(y \ln(y/\mu) - (y - \mu))$
  Natural logarithms are the default link function. ASReml assumes the Poisson variable is not negative.

!GAMMA [!INVERSE | !IDENTITY | !LOGARITHM] [!PHI] [!TOTAL n]
  v   u    lon
  The inverse is the default link function. n is defined with the !TOTAL q
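To make the cumulative probability description above concrete, here is a small Python sketch (not ASReml code). The category counts are invented for illustration, and the dictionary of implicit residual variances simply restates the three values quoted in the table for the logit, probit and complementary log-log links.

```python
import math


def cumulative_proportions(counts):
    """Sum the category proportions r/n over ordered categories."""
    total = sum(counts)
    running = 0
    cumulative = []
    for r in counts:
        running += r
        cumulative.append(running / total)
    return cumulative


def logit(p):
    """Logit link: log odds of a cumulative proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


# Implicit residual variance on the underlying scale, as quoted in the text.
IMPLICIT_RESIDUAL_VARIANCE = {
    "logit": math.pi ** 2 / 3,    # about 3.3, logistic distribution
    "probit": 1.0,                # standard normal distribution
    "cloglog": math.pi ** 2 / 6,  # about 1.65, Gumbel distribution
}

# Hypothetical counts for a factor with 4 ordered categories.
counts = [10, 20, 30, 40]
cum = cumulative_proportions(counts)

# The last cumulative proportion is always 1, so only the first three are
# passed through the link.
thresholds = [logit(p) for p in cum[:-1]]
```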
203. c moves the 'last character read' pointer to line position c so that the next field starts at position c+1. For example T0 goes back to the beginning of the line.

• the string D invokes debug mode.

A format showing these components is

  !FORMAT(D,3I4,8X,A6,3(2x,F5.2)/4x,BZ,20I1)

and is suitable for reading 27 fields from 2 data records such as

  111122223333xxxxxxxxALPHAFxx 4.12xx 5.32xx 6.32
  xxxx123 567 901 345 7890

5.7 Data file qualifiers

Table 5.2 Qualifiers relating to data input and output

qualifier | action

!MERGE c f [!SKIP n] [!MATCH a b]
  may be specified on a line following the datafile line. The purpose is to combine data fields from the (primary) data file with data fields from a secondary file (f). This !MERGE qualifier has been superseded by the much more powerful MERGE statement, see Chapter 11.
  The effect is to open the named file (skip n lines) and then insert the columns from the new file into field positions starting at position c. If !MATCH a b is specified, ASReml checks that the field a (0 < a < c) has the same value as field b. If not, it is assumed that the merged file has some missing records, and missing values are inserted into the data record and the line from the MERGE file is kept for comparison with the next record.
  It is assumed that the lines in the MERGE file are in the same order as the corresponding lines

Further qualifiers listed in this part of the table: !READ n, !RECODE, !ROWFACTOR v (!ROWFAC v), !RREC [n]
204. ccessed by subsequent parts of the same job using TPn. This was added to facilitate location of putative QTL (Gilmour, 2007).

9.3 Prediction

Table 9.1 List of prediction qualifiers

qualifier | action

!TWOSTAGEWEIGHTS
  is intended for use with variety trials which will subsequently be combined in a meta analysis. It forms the variance matrix for the predictions, inverts it, and writes the predicted variety means with the corresponding diagonal elements of this matrix to the .pvs file. These values are used in some variety testing programs in Australia for a subsequent second stage analysis across many trials (Smith et al., 2001). A data base is used to collect the results from the individual trials and write out the combined data set. The diagonal elements, scaled by the variance which is also reported and held in the data base, are used as weights in the combined analysis.

!VPV
  requests that the variance matrix of predicted values be printed to the .pvs file.

PLOT graphic control qualifiers

This functionality was developed and this section was written by Damian Collins.

The !PLOT qualifier produces a graphic of the predictions. Where there is more than one prediction factor, a multi-panel 'trellis' arrangement may be used. Alternatively, one or more factors can be superimposed on the one panel. The data can be added to the plot to assist informal examination of the model fit.

With no plot options, ASReml c
205. ce model function name given in Table 7.6, for example, for a factor row,

  exp(row)

is an exponential correlation model with a single correlation parameter.

• to specify an homogeneous variance model, append a v to the variance model function name, for example,

  expv(row)

is an exponential variance model with 2 parameters (correlation and variance).

• to specify a heterogeneous variance model, append an h to the variance model function name, for example,

  coruh(site)

is a variance matrix with different variances for each site but the same correlation for all pairs of sites.

Important: See Section 7.4 for rules on combining variance models and Section 7.7.5 for important notes regarding initial values.

7.11.2 Non singular variance matrices

For REML estimation, ASReml needs to invert each variance matrix. For this it requires that the matrices be negative definite or positive definite. They must not be singular. Negative definite matrices will have negative elements on the diagonal of the matrix and/or its inverse. There are two exceptions: the XFA model, which has been specifically designed to fit singular matrices (Thompson et al. 2003, page 144), and singular relationship matrices (described in Chapter 8.3.1). If an estimated matrix comes too close to being singular, ASReml will stop iterating.

Let $x'Ax$ represent an arbitrary quadratic form for $x = (x_1, \ldots, x_n)'$. The quadratic
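The definiteness requirement above can be checked numerically. The sketch below is plain Python and is not how ASReml itself tests matrices; it uses the standard fact that a symmetric matrix is positive definite exactly when a Cholesky factorisation succeeds with strictly positive pivots. The function names and the tolerance are assumptions made for illustration.

```python
def is_positive_definite(a, tol=1e-12):
    """Attempt a Cholesky factorisation of the symmetric matrix a.

    Returns False as soon as a pivot is not strictly positive, which
    happens for singular or indefinite matrices.
    """
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False
                lower[i][i] = s ** 0.5
            else:
                lower[i][j] = s / lower[j][j]
    return True


def is_negative_definite(a, tol=1e-12):
    """A matrix is negative definite when its negation is positive definite."""
    negated = [[-value for value in row] for row in a]
    return is_positive_definite(negated, tol)


def is_usable(a):
    """Definite in either direction, hence non-singular."""
    return is_positive_definite(a) or is_negative_definite(a)
```

For example, a 2 x 2 correlation matrix with correlation 0.5 passes, while the same matrix with correlation 1 is singular and fails.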
206. cord (after transformation). Thus

  A
  B
  LagA !=V4 !V4=A

reads two fields (A and B), and constructs LagA as the value of A from the previous record by extracting a value for LagA from working variable V4 before loading V4 with the current value of A.

5.5.1 Transformation syntax

Transformation qualifiers have one of seven forms, namely

  !operator              to perform an operation on the current field, for example, absY !ABS to take absolute values,
  !operator value        to perform an operation involving an argument on the current field, for example, logY !=Y !^0 copies Y and then takes logs,
  !operator V field      to perform an operation on the current field using the data in another field, for example, !-V2 to subtract field 2 from the current field,
  !V target              to reset the focus for subsequent transformations to field number target,
  !TARGET target         to reset the focus for subsequent transformations to the previously named field target,
  !V target = value      to set the target field to a particular value,
  !V target = V field    to overwrite the data in a target field by the data values of another field; a special case is when field is 0, instructing ASReml to put the record number into the target field.

• operator is one of the symbols defined in Table 5.1,
• value is the argument, a real number, required by the transformation,
• V is the literal character and is followed by the number (target or field) of a data field; the data field is used or modifie
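The lagging idiom described at the start of this excerpt (read the working variable first, then overwrite it with the current value) can be mimicked outside ASReml. The Python sketch below is only an illustration of the order of operations: the function name is hypothetical, and the starting content of the working variable (here None, standing for a missing value) is an assumption, since the excerpt does not state it.

```python
def lag_transform(records, initial=None):
    """Return (A, B, LagA) for each (A, B) record, processed in file order.

    `initial` is the assumed content of the working variable before the
    first record is read.
    """
    v4 = initial                 # working variable V4
    result = []
    for a, b in records:
        lag_a = v4               # LagA takes the value currently held in V4
        v4 = a                   # V4 is then loaded with the current value of A
        result.append((a, b, lag_a))
    return result


# Hypothetical data: three records with fields A and B.
rows = lag_transform([(1, 10), (2, 20), (3, 30)])
```

Reversing the two assignments inside the loop would copy the current value of A into LagA instead of the previous one, which is why the order of the two transformations matters.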
207. creases to the correct value.

  indent them to avert this message.
  user nominated more levels than are permitted.
  constraint parameter is probably wrongly assigned.
  fix the argument.
  The model term Trait was not present in the multivariate analysis model.
  you may need more iterations.
  restart to do more iterations (see !CONTINUE).
  The computed LogL value is occasionally very large in magnitude, but our interest is in relative changes. Reporting relative to an offset ensures that differences at the units level are apparent.
  missing cells are normally not reported.
  consider setting levels correctly.
  the limit is 100 PREDICT statements.
  because it contains errors.
  if you really want to fit this term twice, create a copy with another name.
  gives details so you can check ASReml is doing what you intend.
  that is, these standard errors are approximate.
  use the correct syntax.
  the !A fields will be treated as factors but are coded as they appear in the binary file.
  use correct syntax.

Table 14.2 List of warning messages and likely meaning(s)

warning message | likely meaning

  Warning: The !X !Y !G qualifiers are ignored. There is no data to plot
  Warning: Warning: The default action with missing values in multivariate data
  Warning: The estimation was ABORTED
  Warning: The FOWN test of ... is not calculated
  Warning: The labels for predictions
208. critical value for a $\chi^2$ variate with 1 degree of freedom. The distribution of the REMLRT for the test that k variance components are zero, or tests involved in random regressions (which involve both variance and covariance components), involves a mixture of $\chi^2$ variates from 0 to k degrees of freedom. See Self and Liang (1987) for details.

Tests concerning variance components in generally balanced designs, such as the balanced one-way classification, can be derived from the usual analysis of variance. It can be shown that the REMLRT for a variance component being zero is a monotone function of the F-statistic for the associated term.

To compare two (or more) non-nested models we can evaluate the Akaike Information Criteria (AIC) or the Bayesian Information Criteria (BIC) for each model. These are given by

$$\mathrm{AIC} = -2\,\ell_{Ri} + 2\,t_i$$
$$\mathrm{BIC} = -2\,\ell_{Ri} + t_i \log \nu \qquad (2.22)$$

where $t_i$ is the number of variance parameters in model $i$ and $\nu = n - p$ is the residual degrees of freedom. AIC and BIC are calculated for each model and the model with the smallest value is chosen as the preferred model.

2.4.2 Diagnostics

In this section we will briefly review some of the diagnostics that have been implemented in ASReml for examining the adequacy of the assumed variance matrix for either R or G structures, or for examining the distributional assumptions regarding e or u. Firstly we note that the BLUP of the residual vector is
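As a worked check of the two criteria above, the following Python sketch evaluates them for the spline example whose output is quoted elsewhere in this guide (final LogL of 20.8996 in magnitude, 5 residual degrees of freedom, 2 variance parameters). The negative sign on the log-likelihood is an assumption, since the sign is not legible in the extracted output; the function name is hypothetical.

```python
import math


def information_criteria(logl, n_variance_params, residual_df):
    """Return (AIC, BIC) from a residual log-likelihood.

    AIC = -2 logL + 2 t
    BIC = -2 logL + t log(nu)
    where t is the number of variance parameters and nu the residual df.
    """
    aic = -2.0 * logl + 2.0 * n_variance_params
    bic = -2.0 * logl + n_variance_params * math.log(residual_df)
    return aic, bic


# Values taken from the quoted spline output; the sign of LogL is assumed negative.
aic, bic = information_criteria(-20.8996, 2, 5)
print(round(aic, 2), round(bic, 2))  # 45.8 45.02
```

Under that sign assumption the results agree with the reported Akaike Information Criterion of 45.80 and Bayesian Information Criterion of 45.02, which supports the reading of the formulas given above.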
209. cture, or to implicit residual variance parameters. The !VCC syntax is required for these cases.

7.7 Variance model function qualifiers

Table 7.5 Examples of constraining variance parameters in ASReml

ASReml code | action

!=ABACBADCBA
  constrains all parameters corresponding to A to be equal, similarly for B and C. The fourth parameter symbol D is only associated with one parameter and can be replaced by 0 to indicate that it is unconstrained. This sequence applied to an unstructured (US) 4 x 4 matrix would make it banded, that is

    A
    B A
    C B A
    D C B A

us(site !GP !=0A0AA0 !INIT eS od wt od oh 233
  this example defines a structure for the genotype by site interaction effects in a multi-environment trial with 3 sites, in which the genotypes are independent random effects within sites but are correlated across sites with equal covariance. The initial value for the common covariance is 0.1.

fa2(site !G4PZ3P4P !=00000000VVVV !INIT 4 9 O 3 1 4 2 gen
  a factor analytic model of order 2 for 4 sites, with equal variance across sites, is specified using this code. For the fak variance model functions, ASReml orders the parameters as the loadings followed by the specific variances. In this example, the first loading in the second factor is constrained to be equal to zero for identifiability. P restricts the magnitude of the loadings and the variances to be positive.

  code for a factor analytic model of order 2 for 4 s
210. cture for rows and idv(column) models the IDV variance structure for columns. The consolidated model term

  idv(column).ar1(row)

directly mirrors the algebraic form $\mathrm{var}(e) = \sigma^2_c I_c \otimes \Sigma_r(\rho_r)$.

Important points

• the same residual variance structure could be achieved by specifying id(column).ar1v(row), which mirrors the alternate but equivalent algebraic form $\mathrm{var}(e) = I_c \otimes \sigma^2_r \Sigma_r(\rho_r)$. It is arbitrary which variable the common variance is attached to (column in the code box, row in the latter); see Section 7.4 on identifiability.

• if the correlation structure id(column).ar1(row) was specified, ASReml would automatically add a common variance to model $\mathrm{var}(e) = \sigma^2 I_c \otimes \Sigma_r(\rho_r)$, see Section 7.4.

!f mv is now included in the model specification. This tells ASReml to estimate the missing values. The !f before mv indicates that the missing values are fixed effects in the sparse set of terms. An equivalent way of specifying this model is

  yield ~ mu variety mv !r idv(repl)

where mv is the last fixed effect term and ASReml will include mv and succeeding terms in the sparse set.

• ASReml would report an error if the consolidated model term idv(column).ar1v(row) was specified; this would correspond to $\mathrm{var}(e) = \sigma^2_c I_c \otimes \sigma^2_r \Sigma_r(\rho_r)$, and $\sigma^2_c$ and $\sigma^2_r$ are unidentifiable in this case, that is, it is not possible to estimate them separately.

7.5 A sequence of variance structures for the NIN data

• this is a univariate analysis i
211. d (as after basename in the example),

• italic font is used to name information to be supplied by the user, for example, basename stands for the name of a file with an .as filename extension,

• square brackets indicate that the enclosed text and/or arguments are not always required. Do not enter these square brackets.

• ASReml output is in this size and font, see page 34,

• this font is used for all other code.

2 Some theory

2.1 The general linear mixed model

If $y$ ($n \times 1$) denotes the vector of observations, the general linear mixed model can be written as

$$y = X\tau + Zu + e \qquad (2.1)$$

where $\tau$ ($p \times 1$) is a vector of fixed effects, $X$ ($n \times p$) is the design matrix of full column rank that associates observations with the appropriate combination of fixed effects, $u$ ($q \times 1$) is a vector of random effects, $Z$ ($n \times q$) is the design matrix that associates observations with the appropriate combination of random effects, and $e$ ($n \times 1$) is the vector of residual errors.

2.1.1 Sigma parameterization of the linear mixed model

Model (2.1) is called a linear mixed model or linear mixed effects model. It is assumed

$$\begin{bmatrix} u \\ e \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} G(\sigma_g) & 0 \\ 0 & R(\sigma_r) \end{bmatrix} \right) \qquad (2.2)$$

where the matrices $G$ and $R$ are variance matrices for $u$ and $e$ and are functions of parameters $\sigma_g$ and $\sigma_r$. This requires that the random effects $u$ and residual errors $e$ are uncorrelated. The variance matrix for $y$ is then of the form

$$\mathrm{var}(y) = Z G(\sigma_g) Z' + R(\sigma_r) \qquad (2.3)$$

which we will refer to as the sigma parameterization of the G a
212. d allowed specifi   cation of parametric constraints and relationships  equality and scale  between parameters  to be defined  This parametric information was interspersed within the structure definition   Release 4 allows an alternative way of specifying this parametric information  essentially con   structing a table in a  tsv file  with the rows labelled by the specific parameters  columns  for initial values and parametric constraints  and two columns that allow specification of  relationships  This  tsv file is written by ASReml after the input file has been parsed  using    to represent initial values and setting  MAXITER 0 gives an easy construction  Once the   tsv file has been edited it can be read by inserting   TSV on the data file line  As an example    Wolfinger Rat data   treat  A   wtO wti wt2 wt3 wt4   subject     V0   wolfrat dat  skip 1  ASUV  MAXITER 0   wtO wti wt2 wt3 wt4 Trait treat Trait treat  1 2 9   27 O ID  error variance    Trait 0 US      indicates generates initial values  generates a  tsv file       This  tsv file is a mechanism for resetting initial parameter values    by changing the values here and rerunning the job with  TSV     You may only change values in the last 4 fields      Fields are      GN  Term  Type  PSpace  Initial_value  RP_GN  RP _scale     136    7 9 Ways to present initial values to ASReml       5     units ue Trait  us Trait  1     G  P  4 7911110   5  1  6     units us Trait   us Trait  2   G  P  5 0231481   6  1  7     un
213. d are the numbers or names of existing components $v_a$, $v_b$, $v_c$ and $v_d$, and $c_i$ is a multiplier for $v_i$. m is a number greater than the current length of v to flag the special case of adding the offset k. When using the component numbers, the form a:b can be used to reference blocks of components as in

  F label a:b * k + c:d

The instructions in the ASReml code box correspond to a simple sire model, so that variance component 1 is the Sire variance and variance component 2 is the residual variance. Then

  F phenvar 1 + 2
or
  F phenvar idv(Sire) + idv(units)

creates a third component called phenvar which is the sum of the variance components, that is, the phenotypic variance.

  F genvar 1 * 4
or
  F genvar idv(Sire) * 4

creates a fourth component called genvar which is the sire variance component multiplied by 4, that is, the genotypic variance.

12.2 Syntax

Ratios (or in particular cases heritabilities) are requested by function lines beginning with an H. The specific form of the directive is

  H label n d

This calculates $\sigma^2_n/\sigma^2_d$ and $\mathrm{se}(\sigma^2_n/\sigma^2_d)$, where n and d are the names of the components or integers pointing to components $v_n$ and $v_d$ that are to be used as the numerator and denominator respectively in the heritability calculation.

The ASReml code box reads

  Y ~ mu !r idv(Sire)
  residual idv(units)
  VPREDICT !DEFINE
  F phenvar idv(Sire) + idv(units)
  F genvar idv(Sire) * 4
  H herit genvar phenvar

Note that covariances between ratios and other components are
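The F and H lines above can be imitated with a short Python sketch. This is an illustration only: the variance component estimates and their covariance matrix below are invented, the function names are hypothetical, and the standard error of the ratio uses the usual first order (delta method) approximation, which is assumed here rather than taken from this excerpt.

```python
def dot(coef, values):
    """Linear combination of variance components."""
    return sum(c * v for c, v in zip(coef, values))


def bilinear(a, vcov, b):
    """a' V b: covariance between two linear functions of the components."""
    return sum(a[i] * vcov[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))


def ratio_with_se(num_coef, den_coef, components, vcov):
    """Ratio of two linear functions of components with a delta method SE."""
    n = dot(num_coef, components)
    d = dot(den_coef, components)
    var_n = bilinear(num_coef, vcov, num_coef)
    var_d = bilinear(den_coef, vcov, den_coef)
    cov_nd = bilinear(num_coef, vcov, den_coef)
    ratio = n / d
    var_ratio = ratio ** 2 * (var_n / n ** 2 + var_d / d ** 2
                              - 2.0 * cov_nd / (n * d))
    return ratio, var_ratio ** 0.5


# Hypothetical estimates: component 1 = Sire variance, component 2 = residual.
components = [0.2, 1.8]
vcov = [[0.04, -0.01],
        [-0.01, 0.09]]

phenvar = [1.0, 1.0]   # F phenvar: Sire + residual
genvar = [4.0, 0.0]    # F genvar: Sire * 4
herit, herit_se = ratio_with_se(genvar, phenvar, components, vcov)
```

With these invented inputs the ratio is 0.8 / 2.0 = 0.4. The covariance term matters because genvar and phenvar share the Sire component.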
214. d as it is running, it will attempt to restart the job with increased workspace. If the system has already allocated all available memory the job will stop.

10.3.7 Examples

ASReml code | action

  asreml -LW64 rat.as
    increase workspace to 64 Mbyte, send screen output to rat.asl and suppress interactive graphics
  asreml -IL rat.as
    send screen output to rat.asl but display interactive graphics
  asreml -N rat.as
    allow screen output but suppress interactive graphics
  asreml -ILW512 rat.as
    increase workspace to 512 Mbyte, send screen output to rat.asl but display interactive graphics
  asreml -rwi coop wwt ywt
    runs coop.as twice using 1Gbyte workspace and writing results to coopwwt.as and coopywt.as and substituting wwt and ywt for $1 in the two runs

10.4 Advanced processing arguments

10.4.1 Standard use of arguments

Command line arguments are intended to facilitate the running of a sequence of jobs that require small changes to the command file between runs. The output file name is modified by the use of this feature if the -R option is specified. This use is demonstrated in the Coopworth example of Section 15.10.

Command line arguments are strings listed on the command line after basename, the command file name, or specified on the top job control line after the !ARGS qualifier. These strings are inserted into the command file at run time. When the input routine finds a $n in the com
215. d by the data summary  You should al   ways check the data summary to ensure that    row   column    nin89 asd  skip 1  yield   mu variety          the correct number of records have been de    jp Repl  tected and the data values match the names   residual ari  Row   ar1 Col   appropriately  predict varierty       The problem is that  R Repl is meant to be part of the linear model  but it is on a separate  line  and the first part of the model on the preceding line does not end with a COMMA to  indicate that the model is incomplete  Appending a comma to the first model line resolves  this problem     Folder  C  Users Public ASRem1 Docs Manex4 ERR  variety  A   QUALIFIERS   SKIP 1   Reading nin89 asd FREE FORMAT skipping 1 lines    Univariate analysis of yield  Summary of 224 records retained of 224 read    Model term Size  miss  zero MinNonO Mean MaxNonO StndDevn  1 variety 56 0 0 1 28 5000 56   2 ad O 0 1 000 28 50 56 00 16 20  3 pid 0 0 1101  2628  4156  I2   4 raw 0 0 21 00 610 5 840 0 149 0  5 repl 4 0  0  1 2 5000 4   6 nloc 0 O 4 000 4 000 4 000 0 000  7 yield Variate 0 O 1 050 25 53 42 00 7 450  8 lat 0 O 4 300 Af raz 47 30 12 90  9 long 0 O 1 200 14 08 26 40 7 698  10 row Ze 0   0  l   17321 22   11 column 11 Q  0  1 6 3304 11   12 mu 1    QUALIFIERS   R Repl   Fault  Error in variance header line   R Repl  Last line read was  IR Repl 0 0 0 0   ninerr4 variety id pid raw rep nloc yield lat  Model specification  TERM LEVELS GAMMAS    variety 56  mu 1   12 factors defin
216. d depending on the context.

• Vfield may be replaced by the label of the field if it already has a label,

• in the first three forms the operation is performed on the current field; this will be the field associated with the label unless the focus has been reset by specifying a new target in a preceding transformation,

• the last four forms change the focus of subsequent transformations to target,

• in the last two forms a value is assigned to the target field. For example,

    ... !V22=V11

  copies (existing) field 11 into field 22. Such a statement would typically be followed by more transformations. If there are fewer than 22 variables labelled then V22 is used in the transformation stage but not kept for analysis.

• only the !DOM and !RESCALE transformations automatically process a set of variables defined with the !G field definition. All other transformations always operate on only a single field. Use the !DO ... !ENDDO transformations to perform them on a set of variables.

Table 5.1 List of transformation qualifiers and their actions with examples

qualifier | argument | action | examples

  != v
    used to overwrite/create a variable with v. It usually implies the variable is not read (see examples on page 52).
    examples: half !=0.5, zero !=0
  !+, !-, !*, !/ v
    usual arithmetic meaning; note that 0/0 gives 0 but v/0 gives a missing value where v is not 0.
    example: yield !/10
  !^ v
    raises the data (which must be positive
    example: yield
217. dd or mmdd into days

  Jyyd converts a date in the form ccyyddd or yyddd into days. These calculate the number of days since December 31 1900 and are valid for dates from January 1 1900 to December 31 2099; note that
    if cc is omitted it is taken as 19 if yy > 32 and 20 if yy < 33,
    the date must be entirely numeric: characters such as   may not be present (but see !DATE).

  !M v converts data values of v to missing; if !M is used after !A or !I, v should refer to the encoded factor level rather than the value in the data file (see also Section 4.2).

  the maximum, minimum and modulus of the field values and the value v.

  assigns Haldane map positions (s) to marker variables and imputes missing values to the markers (see below).

  replaces any missing values in the variate with the value v. If v is another field, its value is copied.

  replaces the variate with normal random variables having variance v.

  replaces data values o with n in the current variable, i.e. IF (DataValue.EQ.o) DataValue=n

  rescales the column(s) in the current variable (!G group of variables) using Y = (Y + o) * s

  sets the seed for the random number generator.

examples:

  yield !M 9
  yield !M<=0 !M>100
  yield !MAX 9
  ChrAadd !G 10 !MM 1 ...
  Rate !NA 0
  WT !=Wt2 !NA Wt1
  Ndat !=0 !Normal 4.5 is equivalent to Ndat !Normal 4.5
  Rate !REPLACE 9 0
  Rate !RESCALE 10 0.1
  !SEED 848586

5.5 Transforming the data

Table
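The day-number rule stated for the ccyyddd / yyddd form can be sketched in Python as follows. This is not ASReml code: the function names are hypothetical, and the century rule is read as "19 when yy is greater than 32, otherwise 20", which is how the two conditions quoted above are interpreted here.

```python
from datetime import date

# Day 0 of the scale described in the text.
EPOCH = date(1900, 12, 31)


def expand_year(digits):
    """Turn 'ccyy' or 'yy' into a four digit year using the stated rule."""
    if len(digits) == 4:
        return int(digits)
    yy = int(digits)
    return 1900 + yy if yy > 32 else 2000 + yy


def days_from_yyddd(text):
    """Convert an entirely numeric 'ccyyddd' or 'yyddd' string into days
    since December 31 1900 (ddd is the day of the year, 001 = January 1)."""
    if not text.isdigit() or len(text) not in (5, 7):
        raise ValueError("date must be entirely numeric: ccyyddd or yyddd")
    year = expand_year(text[:-3])
    day_of_year = int(text[-3:])
    return (date(year, 1, 1) - EPOCH).days + day_of_year - 1


print(days_from_yyddd("1901001"))  # 1
print(days_from_yyddd("01001"))    # 36526
```

The second call shows the century rule at work: with cc omitted and yy = 01, the year is taken as 2001.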
218. ded to avoid confusion.

5.5 Transforming the data

Transformation is the process of modifying the data (for example, dividing all of the data values in a field by 10), forming new variables (for example, summing the data in two fields), or creating temporary data (for example, a test variable used to discard some records from analysis and subsequently discarded). Occasional users may find it easier to use a spreadsheet to calculate derived variables than to modify variables using ASReml transformations.

Transformation qualifiers are listed after data field labels (and the field_type if present). They define an operation (e.g. +), often involving an argument (a constant or another variable), which is performed on a target variable. By default the target is the current field, but it can be changed with the !TARGET qualifier. For a !G group of variables, the target is the first variable in the set.

Using transformations will be easier if you understand the process. As ASReml parses the variable definitions, it sequentially assigns them column positions in the internal data vector. It notes which is the last variable which is not created by, say, the != transformation, and that determines how many fields are read from the data file (unless overridden by the !READ qualifier in Table 5.2). ASReml actually reads the data file after parsing the model line. It reads a line into a temporary vector, performs the transformations in that vector, and saves the values tha
219. default value of n is 1000 so that points closer than 0.1% of the range are regarded as the same point.

5.8 Job control qualifiers

Table 5.6 List of very rarely used job control qualifiers

qualifier | action

!KNOTS n
  changes the default knot points used when fitting a spline to data with more than n different values of the spline variable. When there are more than n (default 50) points, ASReml will default to using n equally spaced knot points.

!NOCHECK
  forces ASReml to use any explicitly set spline knot points (see !SPLINE) even if they do not appear to adequately cover the data values.

!NOREORDER
  prevents the automatic reversal of the order of the fixed terms (in the dense equations) and possible reordering of terms in the sparse equations.

!NOSCRATCH
  forces ASReml to hold the data in memory. ASReml will usually hold the data on a scratch file rather than in memory. In large jobs, the system area where scratch files are held may not be large enough. A Unix system may put this file in the /tmp directory, which may not have enough space to hold it.

!POLPOINTS n
  affects the number of distinct points recognised by the pol() model function (Table 6.1). The default value of n is 1000 so that points closer than 0.1% of the range are regarded as the same point.

!PPOINTS n
  influences the number of points used when predicting splines and polynomials. The design matrix ge

Further qualifiers listed in this part of the table: !REPORT, !SCALE 1, !SCORE
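The default knot rule described for !KNOTS can be sketched as follows. This Python function is an illustration under stated assumptions, not ASReml's implementation: it assumes that when there are no more than n distinct values they are all used as knots, and that the n equally spaced knots otherwise span the minimum to the maximum of the data inclusive. The function name is hypothetical.

```python
def default_knots(values, n=50):
    """Return knot points for a spline variable.

    If there are at most n distinct values, use them all (assumption);
    otherwise use n equally spaced points from the minimum to the maximum.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    distinct = sorted(set(values))
    if len(distinct) <= n:
        return distinct
    low, high = distinct[0], distinct[-1]
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]


# Hypothetical data: 101 distinct values reduced to 5 knots.
knots = default_knots(range(101), n=5)
```

For a variable with only a handful of distinct values, such as the 7 ages in the spline example of this guide, the function simply returns those values.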
220. defining factor names and improved facilities for reading relationship matrices and better explanation of a simpler way of constructing variances of functions of parameters. Among the developments associated with analysis are making it easier to specify functions of variance parameters using names rather than numbers, fitting factor effects with large random regression models (such as commonly used with marker data), fitting linear relationships among variance structure parameters, and calculating information criteria. The developments associated with output include writing out design matrices. A major development in Release 4 is an alternative model specification using a functional approach. Prior to Release 4 a structural specification was used, in which variance models were applied by imposing variance structures on random model terms and/or the residual error term after the mixed model had been specified. In this case, the variance models were presented in a separate part of the input file. The functional specification offers an alternative to the structural specification in which the variance structures for random model terms and the residual error term are specified in the linear mixed model definition by wrapping terms with the required variance model function. This approach is more concise, less error prone and more automatic for specifying multi-section residual variances.

The data sets and ASReml input used in this guide are available fro
221. dominant spatial processes are aligned with rows/columns as occurs in field experiments. Geometric anisotropy is discussed in most geostatistical books (Webster and Oliver, 2001; Diggle et al., 2003) but rarely are the anisotropy angle or ratio estimated from the data. Similarly the smoothness parameter $\nu$ is often set a priori (Kammann and Wand, 2003; Diggle et al., 2003). However Stein (1999) and Haskard (2006) demonstrate that $\nu$ can be reliably estimated even for modest sized data sets, subject to caveats regarding the sampling design.

The syntax for the Matérn class in ASReml is given by MATk where k is the number of parameters to be specified; the remaining parameters take their default values. Use the !G qualifier to control whether a specified parameter is estimated or fixed. The order of the parameters in ASReml, with their defaults, is ($\phi$, $\nu = 0.5$, $\delta = 1$, $\alpha = 0$, $\lambda = 2$). For example, if we wish to fit a Matérn model with only $\phi$ estimated and the other parameters set at their defaults then we use MAT1. MAT2 allows $\nu$ to be estimated or fixed at some other value, for example

  mat2(fac(xcoord,ycoord) !INIT 0.2 1.0 !GPF)

The parameters $\phi$ and $\nu$ are highly correlated so it may be better to manually cover a grid of $\nu$ values.

We note that there is non-uniqueness in the anisotropy parameters of this metric $d(\cdot)$ since inverting $\delta$ and adding $\pi/2$ to $\alpha$ gives the same distance. This non uniqu
222. dvantages arising from a balanced spatial layout can be exploited. The equations for mv and any terms that follow are always included in the sparse set of equations.

Missing values are handled in three possible ways during analysis (see Section 6.9). In the simplest case, records containing missing values in the response variable are deleted. For multivariate (including some repeated measures) analysis, records with missing values are not deleted but ASReml drops the missing observation and uses the appropriate unstructured R inverse matrix. For regular spatial analysis, we prefer to retain separability and therefore estimate the missing value(s) by including the special term mv in the model.

out(n), out(n,t) establishes a binary variable which is
  out(i) = 1 if data relates to observation i, trait 1, else is 0,
  out(i,t) = 1 if data relates to observation i, trait t, else is 0.
The intention is that this be used to test/remove single observations, for example to remove the influence of an outlier or influential point. Possible outliers will be evident in the plot of residuals versus fitted values (see the .res file) and the appropriate record numbers for the out() term are reported in the .res file. Note that i relates to the data analysed and will not be the same as the record number as obtained by counting data lines in the data file if there were missing observations in the data and they have not been estimated. To drop records based on the record 
223. e that the data file is misnamed.

Check the argument.

There is probably a problem with the output from MYOWNGDG. Check the files, including the time stamps, to check the .gdg file is being formed properly.

If you read less data than you expect, there are two likely explanations. First, the data file has fewer fields than implied by the data structure definitions (you will probably read half the expected number). Second, there is an alphanumeric field where a numeric field is expected.

Check the !STEP qualifier argument.

Either all data is deleted or the model fully fits the data.

Error with the variance header line. Often, some other error has meant that the wrong line is being interpreted as the variance header line. Commonly, the model is written over several lines but the incomplete lines do not all end with a comma.

An error reading the error model.

Maybe you need to include mv in the model to stop ASReml discarding records with missing values in the response variable.

Without the !ASUV qualifier, the multivariate error variance MUST be specified as US.

Apparently ASReml could not open a scratch file to hold the transformed data. On Unix, check the temp directory /tmp for old large scratch files.

14.5 Information, Warning and Error messages

Table 14.3: Alphabetical list of error messages and probable cause(s)/remedies

error message: probable cause/remedy

Segmentation fault

Singularity appe
224. Table 5.4: List of occasionally used job control qualifiers

qualifier: !MVINCLUDE, !MVREMOVE, !NODISPLAY, !PVAL v p, !PVAL f ulist

Restrictions
  The key field MUST be numeric. In particular, if the data field it relates to is either an !A or !I encoded factor, the original (uncoded) level labels may not be specified in the MBF file. Rather the coded levels must be specified. The MBF file is processed before the data file is read in and so the mapping to coded levels has not been defined in ASReml when the MBF file is processed, although the user can (must) anticipate what it will be.

Comment
  If this MBF process is to be used repeatedly, for example to process a large set of marker variables in conjunction with !CYCLE, processing will be much faster if the marker variables are in separate files. ASReml will read 10 files containing a single field much faster than reading a single file containing 400 fields ten times to extract 10 different markers. Also note that the file may be a binary file and will be read much quicker than a formatted file. A binary file may be formed in a previous run using !SAVE.

When missing values occur in the design ASReml will report this fact and abort the job unless !MVINCLUDE is specified (see Section 6.9); then missing values are treated as zeros. Use the !DV transformation to drop the records with the missing values.

instructs AS
225. e best estimate.

a common reason is that some constraints have restricted the gammas. Add the !GU qualifier to any factor definition whose gamma value is approaching zero, or the correlation is approaching ±1. Alternatively, more singularities may have been detected. You should identify where the singularities are expected and modify the data so that they are omitted or consistently detected. One possibility is to centre and scale covariates involved in interactions so that their standard deviation is close to 1.

Table 14.3: Alphabetical list of error messages and probable cause(s)/remedies

error message:
  !COLFAC confusion, !ROWFAC confusion
  !PRINT: Cannot open output file
  !SUBSECTION not permitted
  AINV/GIV matrix undefined or wrong size
  ALNORM Error
  Apparent error in pedigree relationships
  ASReml command file is EMPTY
  ASReml failed in ... at ... string too long
  Badly formed model term
  CALC: reference to large
  Check IDV structure
  Context of read error
  Data Error: At record
  Continue from .rsv file
  Convergence failed

probable cause/remedy:

!COLFAC confusion, !ROWFAC confusion
  !COLFAC/!ROWFAC arguments contradict RESIDUAL statement order. If the variables have the correct names, reverse the order.

!PRINT: Cannot open output file
  Check filename.

!SUBSECTION not permitted
  This variance structure qualifier is only permitted in single section RESIDUAL structures.

AINV/GIV matrix undefined or wrong size
  Check the size of the factor associated with the
226. e component for this term should be positive. Had we mistakenly specified level 1 then ASReml would have estimated a negative component by setting the !GU option for this term. The portion of the ASReml output for this analysis is

   5 LogL=-343.759   S2= 1.2242   262 df   : 1 components restrained
   6 LogL=-343.577   S2= 1.1738   262 df   : 1 components restrained
   7 LogL=-343.543   S2= 1.1559   262 df
   8 LogL=-343.535   S2= 1.1469   262 df
   9 LogL=-343.535   S2= 1.1451   262 df

   - - - Results from analysis of sqrt(rootwt) - - -
   Akaike Information Criterion     705.07 (assuming 9 parameters).
   Bayesian Information Criterion   737.18

   Model_Term                          Gamma          Sigma       Sigma/SE  % C
   idv(variety)         IDV_V    44    1.89172        2.16613        2.99   0 F
   idv(run)             IDV_V    66    0.296929       0.340000       0.62   0 P
   idv(pair)            IDV_V   132    0.871770       0.998227       2.64   0 P
   idv(uni(tmt,2))      IDV_V   264    0.144454       0.165408       0.27   0 P
   idv(units)                   264 effects
   Residual             SCA_V   264    1.000000       1.14506        2.19   0 P
   diag(tmt).id(variety)         88 effects
   tmt                  DIAG_V    1    1.09032        1.24848        2.21   0 P
   tmt                  DIAG_V    2    0.148952E-05   0.170558E-05   2.79   0 B
   diag(tmt).id(run)            132 effects
   tmt                  DIAG_V    1    1.25736        1.43975        2.25   0 F
   tmt                  DIAG_V    2    1.86671        2.13750        2.97   0 F
   Warning: Code B - fixed at a boundary (!GP)   F - fixed by user
                 ? - liable to change from P to B   P - positive definite
                 C - Constrained by user (!VCC)     U - unbounded
                 S - Singular Information matrix
   S means there is no information in the data for this parameter.
   Very small components with Com
227. e components
suppress screen output
repeat run for each argument renaming output filenames
set workspace size
over-ride y-variate specified in the command file with variate number v
reports current license details
requests that the main output from the .asr, .pvs and .sln files be also written in the .xml file.

10.3 Command line options

10.3.1 Prompt for arguments (A)

A (ASK) makes it easier to specify command line options in Windows Explorer. One of the options available when right clicking a .as file invokes ASReml with this option. ASReml then prompts for the options and arguments, allowing these to be set interactively at run time. With !ASK on the top job control line, it is assumed that no other qualifiers are set on the line. For example, a response of
  -h22r 1 2 3
would be equivalent to
  ASReml -h22r basename 1 2 3

10.3.2 Output control (B, OUTFOLDER, XML)

B[b] (BRIEF [b]) suppresses some of the information written to the .asr file. The data summary and regression coefficient estimates are suppressed by the options B, B1 or B2. This option should not be used for initial runs of a job before you have confirmed (by checking the data summary) that ASReml has read the data as you intended. Use B2 to also have the predicted values written to the .asr file instead of the .pvs file. Use B-1 to get BLUE estimates reported in the .asr file.

OUTFOLDER [path] allows most of the output files to be written to a folder o
228. e defined in terms of X and Y axes:

$$h_x = x_i - x_j, \qquad h_y = y_i - y_j,$$
$$s_x = \cos(\alpha)\, h_x + \sin(\alpha)\, h_y, \qquad s_y = \sin(\alpha)\, h_x - \cos(\alpha)\, h_y,$$
$$d = \sqrt{\delta s_x^2 + s_y^2/\delta}.$$

For a given $\nu$, the range parameter $\phi$ affects the rate of decay of $\rho_M(\cdot)$ with increasing $d$. The parameter $\nu > 0$ controls the analytic smoothness of the underlying process $u_s$, the process being $\lceil \nu \rceil - 1$ times mean-square differentiable, where $\lceil \nu \rceil$ is the smallest integer greater than or equal to $\nu$ (Stein, 1999, page 31). Larger $\nu$ correspond to smoother processes. ASReml uses numerical derivatives for $\nu$ when its current value is outside the interval [0.2, 5].

When $\nu = m + \tfrac{1}{2}$ with $m$ a non-negative integer, $\rho_M(\cdot)$ is the product of $\exp(-d/\phi)$ and a polynomial of degree $m$ in $d$. Thus $\nu = \tfrac{1}{2}$ yields the exponential correlation function, $\rho_M(d; \phi, \tfrac{1}{2}) = \exp(-d/\phi)$, and $\nu = 1$ yields Whittle's elementary correlation function, $\rho_M(d; \phi, 1) = (d/\phi) K_1(d/\phi)$ (Webster and Oliver, 2001).

When $\nu = 1.5$ then

$$\rho_M(d; \phi, 1.5) = \exp(-d/\phi)\,(1 + d/\phi),$$

which is the correlation function of a random field which is continuous and once differentiable. This has been used recently by Kammann and Wand (2003). As $\nu \to \infty$ then $\rho_M(\cdot)$ tends to the gaussian correlation function.

The final metric parameter $\lambda$ is not estimated by ASReml; it has default value of 2 for Euclidean distance. Setting $\lambda = 1$ provides the cityblock metric, which together with $\nu = 0.5$ models a separable AR1xAR1 process. Cityblock metric may be appropriate when the 
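The closed forms quoted above can be checked numerically. The following is a minimal Python sketch, not ASReml code: the function names `aniso_distance` and `matern_half_integer` are hypothetical, the distance assumes the Euclidean case ($\lambda = 2$) with the form $d = \sqrt{\delta s_x^2 + s_y^2/\delta}$, and only the two half-integer cases $\nu = 0.5$ and $\nu = 1.5$ are covered.

```python
import math


def aniso_distance(hx, hy, delta=1.0, alpha=0.0):
    """Anisotropic distance, assuming d = sqrt(delta*sx^2 + sy^2/delta).

    hx, hy are coordinate differences; delta is the anisotropy ratio and
    alpha the anisotropy angle in radians (defaults give Euclidean distance).
    """
    sx = math.cos(alpha) * hx + math.sin(alpha) * hy
    sy = math.sin(alpha) * hx - math.cos(alpha) * hy
    return math.sqrt(delta * sx * sx + sy * sy / delta)


def matern_half_integer(d, phi, nu):
    """Matern correlation for the two closed forms quoted in the text."""
    r = d / phi
    if nu == 0.5:
        # exponential correlation function
        return math.exp(-r)
    if nu == 1.5:
        # exponential times a polynomial of degree 1 in d
        return math.exp(-r) * (1.0 + r)
    raise ValueError("this sketch only covers nu = 0.5 and nu = 1.5")
```

With the defaults, `aniso_distance(3.0, 4.0)` reduces to the ordinary Euclidean distance. Under the assumed form of the metric, calling it with `1/delta` and `alpha + math.pi/2` returns the same value as calling it with `delta` and `alpha`, which is one way to see why the anisotropy parameters are not unique.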
229. e different from the fitted matrix because BLUPs are shrunken phenotypes. The BLUPs matrix retains much of the character of the phenotypes; the rescaled has the variance from the fitted and the covariance from the BLUPs and might be more suitable as an initial matrix if the variances have been estimated. The BLUPs and rescaled matrices should not be reported.

• relevant portions of the estimated variance matrix for each term for which an R structure or a G structure has been associated,

• a variogram and spatial correlations for spatial analysis; the spatial correlations are based on distance between data points (see Gilmour et al., 1997),

• the slope of the log(absolute residual) on log(predicted value) for assessing possible mean-variance relationships and the location of large residuals. For example,

  SLOPES FOR LOG(ABS(RES)) ON LOG(PV) for section 1
  0.99 2.01 4.34

produced from a trivariate analysis reports the slopes. A slope of b suggests that $y^{1-b}$ might have less mean-variance relationship. If there is no mean-variance relation, a slope of zero is expected. A slope of 0.5 suggests a SQRT transformation might resolve the dependence; a slope of 1 means a LOG transformation might be appropriate. So, for the 3 traits, $\log(y_1)$, $y_2^{-1}$ and $y_3^{-3}$ are indicated. This diagnostic strategy works better when based on grouped data regressing log(standard deviation) on log(mean). Also,

  STND RES 16 2.35 6.58 5.64 
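The rule of thumb above (a slope of b points to the power 1 - b, with a power near zero read as a log transformation) can be written as a small helper. This is an illustrative Python sketch only; `suggested_power`, `describe` and the tolerance `tol` are hypothetical names and choices, not part of ASReml.

```python
def suggested_power(slope):
    """Power p such that y**p may have less mean-variance relationship."""
    return 1.0 - slope


def describe(slope, tol=0.25):
    """Turn a reported slope into a rough transformation label.

    tol is an arbitrary rounding tolerance chosen for this sketch.
    """
    p = suggested_power(slope)
    if abs(p) <= tol:
        # power close to zero: read as a log transformation
        return "log(y)"
    if abs(p - 0.5) <= tol:
        return "sqrt(y)"
    # otherwise report the nearest integer power
    return "y^%d" % round(p)
```

Applied to the three slopes in the example, `[describe(b) for b in (0.99, 2.01, 4.34)]` gives the labels for log(y1), the reciprocal of y2 and the inverse cube of y3, matching the reading in the text.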
230. e field width is no longer restricted. See !TXTFORM for more detail.

increases the amount of information reported on the residuals obtained from the analysis of a two dimensional regular grid field trial. The information is written to the .res file.

Table 5.5: List of rarely used job control qualifiers

qualifier: !TABFORM [n], !TXTFORM [n], !TWOWAY, !VCC n, !VGSECTORS [s], !YHTFORM [f], !YSS [r]

!TABFORM [n]
  controls form of the .tab file.
  !TABFORM 1 is TAB separated: .tab becomes _tab.txt
  !TABFORM 2 is COMMA separated: .tab becomes _tab.csv
  !TABFORM 3 is Ampersand separated: .tab becomes _tab.tex
  See !TXTFORM for more detail.

!TXTFORM [n]
  sets the default argument for !PVSFORM, !SLNFORM, !TABFORM and !YHTFORM if these are not explicitly set.
  !TXTFORM (or !TXTFORM 1) replaces multiple spaces with TAB and changes the file extension to, say, _sln.txt. This makes it easier to load the solutions into Excel.
  !TXTFORM 2 replaces multiple spaces with COMMA and changes the file extension to, say, _sln.csv. However, since factor labels sometimes contain COMMAS, this form is not so convenient.
  !TXTFORM 3 replaces multiple spaces with Ampersand, appends a double backslash to each line and changes the file extension to, say, _sln.tex (Latex style).
  Additional significant digits are reported with these formats. Omitting the qualifier means the standard fixed field format is used. For .yht and .sln fi
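The three text forms described above amount to swapping runs of spaces for a separator. The Python sketch below only illustrates that idea on a single line of fixed-field output; it is not how ASReml writes its files, and `reformat_line` and `SEPARATORS` are hypothetical names.

```python
import re

# Separator used for each form number, following the description in the text.
SEPARATORS = {1: "\t", 2: ",", 3: "&"}


def reformat_line(line, form=1):
    """Replace each run of two or more spaces with the chosen separator."""
    out = re.sub(r" {2,}", SEPARATORS[form], line.strip())
    if form == 3:
        # LaTeX style rows end with a double backslash
        out += r" \\"
    return out
```

For example, `reformat_line("mu   1   12.5", 2)` yields a comma separated line, which also shows why form 2 is awkward when a label itself contains a comma.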
231. e has the matrix presented lower triangle rowwise, with each row beginning on a new line.

• a sparse format file must be free format with three numbers per line, namely
    row column value
  defining the lower triangle row-wise of the matrix,
• the file must be sorted column within row,
• every diagonal element must be present; missing off-diagonal elements are assumed to be zero cells,
• the file is used by associating it with a factor in the model. The number and order of the rows must agree with the size and order of the associated factor,
• the !SKIP n qualifier tells ASReml to skip n header lines in the file.

   1  1  1
   2  2  1
   3  3  1
   4  4  1
   5  5  1.0666667
   6  5 -0.2666667
   6  6  1.0666667
   7  7  1.0666667
   8  7 -0.2666667
   8  8  1.0666667
   9  9  1.0666667
  10  9 -0.2666667
  10 10  1.0666667
  11 11  1.0666667
  12 11 -0.2666667
  12 12  1.0666667

The .giv file presented in the code box gives the G inverse matrix

$$G^{-1} = \begin{bmatrix} I_4 & 0 \\ 0 & I_4 \otimes \begin{bmatrix} 1.067 & -0.267 \\ -0.267 & 1.067 \end{bmatrix} \end{bmatrix}$$

The easiest way to ensure the variable is coded to match the order of the GRM file is to supply a list of level names in the variable definition. For example,
  genotype !A !L Gorder.txt
would code the variable genotype to agree with the order of level names present in the file Gorder.txt, which would be the order used in creating the GRM/GIV matrix.

If the file has a .grm file extension, ASReml will invert the GRM matrix. If it is not Positive Definite, the job will abort unless an appro
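The layout rules for a sparse format file listed above (lower triangle only, sorted column within row, every diagonal element present) are easy to check before a run. The following Python sketch is a hypothetical pre-check written for illustration; `check_sparse_giv` is not an ASReml facility, and it assumes the file has already been parsed into (row, column, value) triples.

```python
def check_sparse_giv(rows):
    """Return a list of problems found in (row, column, value) triples.

    rows must be given in file order; an empty list means the three
    layout rules checked here are satisfied.
    """
    problems = []
    diagonal = set()
    previous = None
    size = 0
    for i, j, _value in rows:
        if j > i:
            problems.append("entry (%d, %d) is above the diagonal" % (i, j))
        if previous is not None and (i, j) <= previous:
            # rows must ascend, and columns must ascend within a row
            problems.append("entry (%d, %d) is out of order" % (i, j))
        if i == j:
            diagonal.add(i)
        previous = (i, j)
        size = max(size, i, j)
    for k in range(1, size + 1):
        if k not in diagonal:
            problems.append("diagonal element %d is missing" % k)
    return problems
```

The check infers the matrix size from the largest index seen, so it cannot tell whether the size agrees with the associated factor; that comparison still has to be made against the data.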
232. e job may just consist of a title line and MERGE directives. The !MERGE qualifier, on the other hand, combines information from two files into the internal data set which ASReml uses for analysis and does not save it to file. It has very limited functionality.

The files to be merged must conform to the following basic structure:

• the data fields must be TAB, COMMA or SPACE separated,
• there will be one heading line that names the columns in the file,
• the names may not have embedded spaces,
• the number of fields is determined from the number of names,
• missing values are implied by adjacent commas in comma delimited files. Otherwise, they are indicated by NA, * or . as in normal ASReml files,
• the merged file will be TAB separated if a .txt file, COMMA separated if a .csv file and SPACE separated otherwise.

11.2 Merge Syntax

The basic merge command is

  !MERGE file1 !WITH file2 !TO newfile

Typically files to be merged will have common key fields. In the basic merge (!KEY not specified), any fields having the same names are taken as the key fields and if the files have no fields in common, they are assumed to match on row number. Fields are referenced by name (case sensitive).

The full command is

  !MERGE file1 !KEY keyfields !KEEP !SKIP fields !WITH file2 !KEY keyfields !KEEP !NODUP !SKIP fields !TO newfile !CHECK !SORT

Warning: Fields in the merged file will
233. e precision. Factor names are held in a .vll file; see !SAVE below.

Table 5.5: List of rarely used job control qualifiers

qualifier: !SAVE n, !SCREEN [n] [!SMX m], !SLNFORM [n], !SPATIAL

The file will not be written from a spatial analysis (two-dimensional error) when the data records have been sorted into field order because the residuals are not in the same order that the data is stored. The residual from a spatial analysis will have the units part added to it when units is also fitted. The .drs file could be renamed (with extension .dbl) and used for input in a subsequent run.

!SAVE n
  instructs ASReml to write the data to a binary file. The file asrdata.bin is written in single precision if the argument n is 1 or 3; asrdata.dbl is written in double precision if the argument n is 2 or 4; the data values are written before transformation if the argument is 1 or 2 and after transformation if the argument is 3 or 4. The default is single precision after transformation (see Section 4.2).

  When either !SAVE or !RESIDUALS is specified, ASReml saves the factor level labels to a basename.vll and attempts to read them back when data input is from a binary file. Note that if the job basename changes between runs, the .vll file will need to be copied to the new basename. If the .vll file does not match the factor structure (i.e. the same factors in the same order), reading the .vll file is abort
234. e remedy

Variance structure is not positive definite
  Use better initial values or a structured variance matrix that is positive definite.

XFA model not permitted in R structures
  You may use FA or FACV. The R structure must be positive definite.

XFA may not be used as an R structure

15 Examples

15.1 Introduction

In this chapter we present the analysis of a variety of examples. The primary aim is to illustrate the capabilities of ASReml in the context of analysing real data sets. We also discuss the output produced by ASReml and indicate when problems may occur. Statistical concepts and issues are discussed as necessary but we stress that the analyses are illustrative, not prescriptive.

15.2 Split plot design - Oats

The first example involves the analysis of a split plot design originally presented by Yates (1935). The experiment was conducted to assess the effects on yield of three oat varieties (Golden Rain, Marvellous and Victory) with four levels of nitrogen application (0, 0.2, 0.4 and 0.6 cwt/acre). The field layout consisted of six blocks (labelled I, II, III, IV, V and VI) with three whole plots each split into four sub-plots. The three varieties were randomly allocated to the three whole plots while the four levels of nitrogen application were randomly assigned to the four sub-plots within each whole plot. The data is presented in Table 15.1.

Table 15.1: A split plot field trial of oat varieties and nitrogen application
235. e same line because of the way ASReml processes the command file.

Example 7.1 Random coefficient regression

In the first order random coefficient regression model it is required to specify a covariance between the intercept and slope for each subject to ensure translation invariance, that is, equivalent variance parameter estimates for addition of any constant to the independent variable. For example, in a random coefficient regression where a set of random intercepts is specified by the model term Animal (with 10 levels) and a set of random slopes is specified by the model term age.Animal, translation invariance is achieved using str() as

  str(Animal age.Animal us(2).id(10))

The algorithm places the model terms specified using the argument form together in the processed random model, here Animal followed by age.Animal. The variance structure(s) begins at the start of the first term specified in str() and is expected to exactly span the whole set of terms given within the brackets. The overall size of the variance model is checked against the total number of levels of these terms, but the user must verify that the ordering is appropriate for (matches) the variance model specified.

In our example, this random model generates a combined set of random effects from the individual animal intercepts $(u_{I1}, u_{I2}, \ldots, u_{I10})$ and animal slopes $(u_{S1}, u_{S2}, \ldots, u_{S10})$ as $u_{IS} = (u_I', u_S')'$. The consolidated term then has variance structure of the form

var(u_IS
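To see why the ordering of effects matters for us(2).id(10), it can help to build the implied 20 x 20 matrix explicitly as the direct (Kronecker) product of a 2 x 2 unstructured matrix with a 10 x 10 identity. The Python sketch below does this with plain lists; the numbers in `us2` are made up for illustration and `kron` and `identity` are hypothetical helper names.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rows = []
    for a_row in a:
        for b_row in b:
            rows.append([x * y for x in a_row for y in b_row])
    return rows


def identity(n):
    """n x n identity matrix as a list of lists."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


# Illustrative (made-up) intercept/slope variance matrix:
# [[var(intercept), cov], [cov, var(slope)]]
us2 = [[4.0, 1.5],
       [1.5, 2.0]]

# Effects are ordered as 10 intercepts followed by 10 slopes.
g = kron(us2, identity(10))
```

In `g`, position 0 is the intercept of the first animal and position 10 its slope, so `g[0][10]` holds the intercept-slope covariance while `g[0][11]` (first animal's intercept, second animal's slope) is zero. If the effects were instead ordered animal by animal, this matrix would no longer match them, which is the ordering check left to the user.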
236. e second file only be inserted once into the merged file. For example, assume we want to merge two files containing data from sheep. The first file has several records per animal containing fleece data from various years. The second file has one record per animal containing birth and weaning weights. Merging with !NODUP bwt wwt will copy these traits only once into the merged file.

!SKIP fields
  is used to exclude fields from the merged file. It may be specified with either or both input files.

!SORT
  instructs ASReml to produce the merged file sorted on the key fields. Otherwise the records are returned in the order they appear in the primary file.

The merging algorithm is briefly as follows. The secondary file is read in, skip fields being omitted, and the records are sorted on the key fields. If sorted output is required, the primary file is also read in and sorted. The primary file (or its sorted form) is then processed line by line and the merged file is produced. Matching of key fields is on a string basis, not a value basis. If there are no key fields, the files are merged by interleaving.

If there are multiple records with the same key, these are severally matched. That is, if 3 lines of file 1 match 4 lines of file 2, the merged file will contain all 12 combinations.

11.3 Examples

Key fields have different names

  !MERGE file1 !KEY key1a key1b !WITH file2 !KEY key2a key2b !TO newfile

Key fields have commo
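The matching behaviour in the merging algorithm described above (string comparison of keys, and every matching pair of records combined) can be sketched in a few lines of Python. This is an illustration of the idea rather than ASReml's implementation: `merge_records` is a hypothetical name, records are assumed to be dictionaries keyed by field name with the same key names in both files, and primary records with no match are simply dropped in this sketch.

```python
def merge_records(primary, secondary, keys):
    """Merge two lists of dict records on the named key fields.

    Keys are compared as strings, and every primary record is paired with
    every secondary record that has the same key.
    """
    index = {}
    for rec in secondary:
        k = tuple(str(rec[name]) for name in keys)
        index.setdefault(k, []).append(rec)
    merged = []
    for rec in primary:
        k = tuple(str(rec[name]) for name in keys)
        for match in index.get(k, []):
            row = dict(rec)
            # copy the non-key fields of the secondary record
            for name, value in match.items():
                if name not in keys:
                    row[name] = value
            merged.append(row)
    return merged
```

With three primary records and four secondary records sharing one key, the result has twelve rows. Because keys are compared as strings, a key written as 1.0 in one file does not match 1 in the other, even though the values are numerically equal.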
237. e section into independent subsec   tions  with subsections having common variance parameters  see Section  tae   existing  Ts is used with the own   variance model function to set the parameter    types  see Sections 7 7 7 and 7 7 3    existing USE t t is a compound model term component used elsewhere in the model   allows this variance structure and its parameters to be the same as that  used for t  see Section 7 7 8 for an example           e all parameters with the same letter in the structure are constrained to be equal    e 1 9  a z and A Z are all unique so that 61 equalities can be specified  O and   indicate  that the corresponding parameter is not related to any other parameter  A colon generates  a sequence  that is  a e is the same as abcde    e putting   as the first character in s makes the interpretation of codes absolute  so that  they apply across structures     e putting   as the first character in s indicates that numbers are for repeat counts  A Z are  equality codes and are not different from a z giving only 26 equalities  In this case only    represents unrelated to any other parameter  Thus     3A2  is equivalent to    AAA    or   Qaaa00 or   BAAACD  Some users might find the contractions appealing  other users  find an explicit definition less error prone     Examples are presented in Table 7 5  Important This syntax is limited in that it cannot apply  relationships to simple variance components  random terms that do not have an explicit  variance stru
238. e trimmed but empty rows in the middle of a block are kept. Empty columns are ignored. A single row of labels as the first non-empty row in the block will be taken as column names. Empty cells in this row will have default names C1, C2 etc. assigned. Missing values are commonly represented in ASReml data files by NA, * or . ; ASReml will also recognise empty fields as missing values in .csv (.xls) files.

4.2.4 Binary format data files

Conventions for binary files are as follows:

• binary files are read as unformatted Fortran binary in single precision if the filename has a .bin or .BIN extension,
• Fortran binary data files are read in double precision if the filename has a .dbl or .DBL extension,
• ASReml recognises the value -1e37 as a missing value in binary files,
• Fortran binary in the above means all real (.bin) or all double precision (.dbl) variables; mixed types, that is, integer and alphabetic binary representation of variables is not allowed in binary files,
• binary files can only be used in conjunction with a pedigree file if the pedigree fields are coded in the binary file so that they correspond with the pedigree file (this can be done using the !SAVE option in ASReml to form the binary file, see Table 5.5), or the identifiers are whole numbers less than 9,999,999 and the !RECODE qualifier is specified (see Table 5.5).

5 Command file: Reading the data

5.1 Introduction

In the code box to
239. ear mixed models. This example differs from the split plot example, as it is unbalanced and so more care is required in assessing the significance of fixed effects.

The experiment was reported by Dempster et al. (1984) and was designed to compare the effect of three doses of an experimental compound (control, low and high) on the maternal performance of rats. Thirty female rats (dams) were randomly split into three groups of 10 and each group randomly assigned to the three different doses. All pups in each litter were weighed. The litters differed in total size and in the numbers of males and females. Thus the additional covariate, littersize, was included in the analysis. The differential effect of the compound on male and female pups was also of interest. Three litters had to be dropped from the experiment, which meant that one dose had only 7 dams. The analysis must account for the presence of between dam variation, but must also recognise the stratification of the experimental units (pups within litters) and that doses and littersize belong to the dam stratum. Table 15.2 presents an indicative AOV decomposition for this experiment.

Table 15.2: Rat data: AOV decomposition

  stratum      decomposition   type   df or ne
  constant     1               F      1
  dams         dose            F
               littersize      F      1
               dam             R      27
  dams.pups    sex             F      1
               dose.sex        F      2
               error           R

The dose and littersize effects are tested against the residual dam variation, while the remaining effects are tested against the r
240. ection 2.1.5 described partitioning the data observations into data sections to which separate variance structures are applied. There are three data sections in the fourth example on page 115. When variance structures are specified using dimensions rather than factor names (idv(23) for section 1, idv(27) for section 2, ... in the example), the data must be ordered into sections and the variance structures must be ordered to match the order of the sections in the data file. It is usually more convenient to use a variable in the data file to identify sections within the data. The data will be sorted internally by ASReml (i.e. the data file does not need to be ordered in any particular way) and the variance structures for sections can then be specified using the sat function, for example

  residual sat(section).idv(units)

for the simple example with 3 data sections, where section is a new column in the data file to separate the data into the three sections: units 1-23, 24-50 and 51-70. The sat function (shorthand for section at) is new with Release 4 and performs several different tasks:

• it tells ASReml that the variance structure for the residual error term is a direct sum structure (see Section 2.1.5) where the different components of the direct sum apply to the different levels of the sectioning variable in the data file,

• it prunes the levels for a section so that only the levels of factors defining the residual variance structure for that sect
241. ed.

performs a 'Regression Screen', a form of all subsets regression. For d model terms in the DENSE equations, there are $2^d - 1$ possible submodels. Since for d > 8, $2^d - 1$ is large, the submodels explored are reduced by the parameters n and m so that only models with at least n (default 1) terms but no more than m (default 6) terms are considered. The output (see page 221) is a report to the .asr file with a line for every submodel showing the sums of squares, degrees of freedom and terms in the model. There is a limit of d = 20 model terms in the screen. ASReml will not allow interactions to be included in the screened terms. For example, to identify which three of my set of 12 covariates best explain my dependent variable given the other terms in the model, specify !SCREEN 3 !SMX 3. The number of models evaluated quickly increases with d but ASReml has an arbitrary limit of 900 submodels evaluated. Use the !DENSE qualifier to control which terms are screened. The screen is conditional on all other terms (those in the SPARSE equations) being present.

modifies the format of the .sln file.
  !SLNFORM -1 prevents the .sln file from being written.
  !SLNFORM 1 is TAB separated: .sln becomes _sln.txt
  !SLNFORM 2 is COMMA separated: .sln becomes _sln.csv
  !SLNFORM 3 is Ampersand separated: .sln becomes _sln.tex
Note that extra significant digits are reported when !SLNFORM is set and expanded labelling of the levels in interactions is used becaus
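The effect of the n and m limits in the regression screen on the number of candidate submodels can be worked out directly from binomial coefficients. The short Python sketch below only does that arithmetic; `count_submodels` is a hypothetical helper, not an ASReml function, and its defaults mirror the defaults quoted in the text (n = 1, m = 6).

```python
from math import comb


def count_submodels(d, n=1, m=6):
    """Number of submodels of d terms having between n and m terms."""
    upper = min(m, d)
    return sum(comb(d, k) for k in range(n, upper + 1))
```

For the 12 covariate example, `count_submodels(12, 3, 3)` counts the three-term submodels only, whereas `count_submodels(12, 1, 12)` recovers the full set of $2^{12} - 1$ non-empty submodels. Comparing such counts with the limit of 900 submodels mentioned in the text is a quick way to decide how tightly n and m need to be set.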
242. ed  max5000     O variance parameters  max2500   2 special structures  Final parameter values   2  0     Last line read was   R Repl 0 0 0 0  Finished  23 Apr 2014 09 17 08 861 Error in variance header line   R Repl    250    14 4 An example       5  A misspelt factor name in linear model        After correcting the definition of variety  we    NIN Alliance Trial 1989  get the following  abbreviated  output  Now   variety  A   it has failed to parse the model line because   id pid raw   the model term Repl was declared as repl   T  P1     and so is unrecognised  Changing Repl to        row   column    repl  or vice versa  resolves this problem     nin89 asd  skip 1   yield   mu variety     IR Repl   residual ari Row  ar1 Col   predict varierty             Folder  C  Users Public ASRem1 Docs Manex4 ERR  variety  A   QUALIFIERS   SKIP 1   Reading nin89 asd FREE FORMAT skipping 1 lines  Model term  Repl  is not valid recognised    Fault  Error reading model terms    Last line read was  Repl  Currently defined structures  COLS and LEVELS  1 variety 1 2 2 0 0 0  2 ad 1 1 1 0 1 0  3 pid 1 1 1 0 2 0  4 raw 1 1 1 0 3 0  5 repl 1 2 2 0 4 0  12 mu 0 1  8 0  1 0  ninerrdS variety id pid raw rep nloc yield lat  Model specification  TERM LEVELS GAMMAS  mu 0  variety 0  12 factors defined  max5000    O variance parameters  max2500   2 special structures  Last line read was  Repl    Finished  23 Apr 2014 09 17 15 785 Error reading model terms    251    14 4 An example       6  Misspelt fact
243. ed a scaled Wald statistic, together with an F approximation to its sampling distribution, which they showed performed well in a range (though limited in terms of the range of variance models available in ASReml) of settings.

In the following we describe the facilities now available in ASReml for conducting inference concerning terms which are in the dense fixed effects model component of the general linear mixed model. These facilities are not available for any terms in the sparse model. They include facilities for computing two types of Wald F statistics and partial implementation of the Kenward and Roger adjustments.

2.5.2 Incremental and conditional Wald F Statistics

The basic tool for inference is the Wald statistic defined in equation 2.17. ASReml produces a test of fixed effects, that reduces to an F statistic in special cases, by dividing the Wald statistic, constructed with l = 0, by r, the numerator degrees of freedom. In this form it is possible to perform an approximate F test if we can deduce the denominator degrees of freedom. However, there are several ways L can be defined to construct a test for a particular model term, two of which are available in ASReml. These Wald F statistics are labelled F-inc (for incremental) and F-con (for conditional) respectively. For balanced designs, these Wald F statistics are numerically identical to the F statistics obtained from the standard analysis of variance.

The first method for computing Wald s
244. ed in detail in Section 7 11 1     7 2 1 Modelling a single variance structure over several model terms    This facility was motivated by two considerations  Typically the random effects from any  two distinct model terms are uncorrelated  However  in some models one G structure may  apply across several model terms  Sometimes one also wishes to partition the random effects  into sets with independent variance structures  In ASReml  we can accomplish these two  models using the special variance model function str    where the name str is for structure    112    7 2 Process to define a consolidated model term       and str   has the following general form   str  model term s  variance structure s      The m individual model terms generate the design matrices Z  and effect vectors w  of    size b   i 1     m  and the v variance structure terms generate variance structures G  of  size b   j   1     v   The function str   generates a combined model design matrix  Z     Z      Zm  and a combined effects vector ul    u    u    of size be   Lft b  and    the variance structure for ue is Ge   Cja1G j for u  and Ge to be conformable Xib    Dy  If v   1 then there is one variance structure associated with the combined set of effects and  if v  gt  1 we can partition u  and G  with ul    us    u    and G    G     G   and the  effect vectors are independent of each other and the effects u  have variance structure G    A restriction with str   is that the closing parenthesis must be on th
245. eee ae ea we 5  7 8 Setting relationships among variance structure parameters              7 8 1 Simple relationships among variance structure parameters          7 8 2 Fitting linear relationships among variance structure parameters       7 9 Ways to present initial values to ASReml               2  20004    7 9 1 Using templates to set parametric information associated with variance  structures using  tsv and  msvfiles               2004     7 9 2 Using estimates from simpler models               20    7 10 Default variance structures in ASReml               2  00004  7 11 Variance model functions available in ASReml                    7 11 1 Forming variance models from correlation models              7 11 2 Non singular variance matrices            2  02  0200004  7 11 3 Notes on the variance models   2 2 4 2 4 eee ed ewae de ced  7 14 Hives oe Rise o ee eo Re Re eS OR  SO eee  7 11 5 Notes on power models   2 26 eh eb we eR Re eR ee ee  7 11 6 Notes on Factor Analytic models                 0 0    7 12 Variance models available in ASReml      2    2 2 20  eee  Command file  Multivariate analysis  8 1 Inrodugctiok cak eee oe BERD EBLE EADS EES G  8 1 1 Repeated measures on rats      2     812 Wether tial data 2 21 chee deadedeved Gah aeeahans  8 2 Model specification   6 o ce oe RODE RRERAEH RHE HE OR RODGERS  8 3 Residual variance structures   6 kee Ye cee K eee EP ewe ee ee 8  8 3 1 Specifying multivariate variance structures in ASReml              8 4 introduction s  v
246. een  sections  For example  fitting the terms at region  trial as random effects would allow  the trials in region 1 to have a different variance component to those in region 2  Prediction  in these cases is more complicated and has only been implemented for this specific case and  the analagous region trial case  The associated factors must occur together in this order  for the prediction to give correct answers     The   ASSOCIATE effect  with base averaging  can usually be achieved with the  PRESENT  qualifier except when the factors have many levels so that the product of levels exceeds 2147  000 000  it fails in this case because the KEY for identifying the cells present is a simple  combination of the levels and is stored as a normal  32bit  integer  However    ASSOCIATE is  preferred because it formally checks the association structure as well as allowing sequential  averaging     Two   ASSOCIATE clauses may be specified for example  PRED entry  ASSOC family entry  ASSOC reg loc trial  ASAVE reg loc     Only one member of an   ASSOCIATE list may also appear in a   PRESENT list  If one member  appears in the classify set  only that member may appear in the   PRESENT list  For example  yield   region  r idv region  id family  idv entry    PREDICT entry  ASSOCIATE family entry  PRESENT entry region    Association averaging is used to form the cells in the PRESENT table and PRESENT  averaging is then applied     9 3 5 Complicated weighting with  PRESENT    Generally  when 
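The size restriction on !PRESENT described above can be illustrated by comparing the product of factor levels with the largest signed 32-bit integer. This is a hedged sketch: the helper `key_fits_int32` is hypothetical, and the way ASReml actually builds its cell KEY is not documented in this passage; only the stated limit on the product of levels is checked.

```python
from math import prod

# Largest value of a signed 32-bit integer (about 2 147 000 000)
INT32_MAX = 2**31 - 1


def key_fits_int32(levels):
    """Return True if the product of the factor level counts fits in 32 bits."""
    return prod(levels) <= INT32_MAX


# Three factors with 1000 levels each: product is 10^9, which fits
small_case = key_fits_int32([1000, 1000, 1000])

# Two factors with 50 000 levels each: product is 2.5 * 10^9, which does not
large_case = key_fits_int32([50000, 50000])

print(small_case, large_case)
```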
247. effects  5  86  correlated  16  terms  multivariate  146  random terms  94  RCB  29  design  25  reading the data  31  46  Reduced animal model  157  relationships    variance structure parameters  124    REML  1  12  16   REMLRT  16   repeated measures  1  270   reserved terms  90  Trait  90  100  a t r   97  abs v   90  97  and t r   90  97  at    97  at f n   90  97  cos v r   91  97  fac v y   90  97  fac v   90  97  g f n   98  giv f n   91  98    grm f n   91   h    98   i f   98   ide f   91  98   inv v r   91  98   1 f   98   leg v n   91  98   lin f   90  98   log v r   91  98   ma1 f   91  98   mal  91  98   mbf v 7r   91   mu  90  99   mv  90  99   out    99   p v n   99   pol v n   91  99   pow z p 0   99   qtl    100   s v  k    100   sin v r   91  100   spl v  k    90  100   sqrt v r   91  100   uni f k   100   uni f n   92   uni f   91   units  90  100   vect u   92  reserved words   GRM  143   AINV  143   ANTE 1   142   CHOL 1   142   CORGH  139   FACV 1   142   FA 1   142   GIV  143   GRM  143   IDH  141   MAT  141   NRM  143   OWN  141   US  141   XFA 1   142   AEXP  141   AGAU  141   AR2  138    343    INDEX       AR3  138  ARMA  139  AR 1   138  CIR  141  CORB  139  CORGB  139  CORU  139  DIAG  141  EXP  140  GAU  140  ID  138  IEUC  140  IEXP  140  IGAU  140  LVR  140  MA2  139  MA 1   139  SAR2  138  SAR  138  SPH  141  residual  29  error  5  86  likelihood  12  response  87  running the job  33    score  13  Score test  68  Segmentation fault  219  sep
248. efined with  P  In this case there is no difference between fitting nrm dam  and  id ide dam   since there is no pedigree information on dams  It is preferable to be explicit   specify nrm dam  when the relationship matrix is required  and id ide dam   in the G  structure definition      In this case PATH 1  2 and 3 were run in turn but in PATH 3 ASReml had trouble converging  because in each iteration the unstructured us tag  matrix is not positive definite and so  ASReml uses a slower EM algorithm that keeps the estimates in the parameter space but the  convergence is very slow  Here is the convergence log for PATH 3    Notice  15358 singularities detected in design matrix     1 LogL  1543 55 S2  1 00000 18085 df   15 components restrained  Notice  US matrix updates modified 1 time s  to keep them positive definite   2 LogL  1540 93 S2  1 00000 18085 df   15 components restrained    Notice  US matrix updates modified 1 time s  to keep them positive definite     38 LogL  1538  34 S2  1 00000 18085 df   15 components restrained          reported in the  asr file    329    15 10 Multivariate animal genetics data   Sheep       39 LogL  1538 33  40 LogL  1538  32    To avoid this problem in PATH 4 and 5 we use xfa2 and xfa3 structures  These converge    S2   S2     1 00000  1 00000    18085 df  18085 df    14 components restrained  15 components restrained    much faster  Here is the convergence log and resulting estimates for PATH 5    Notice  ReStartValues taken from pcoopf4 r
249. eh e Nen e ne Eee 2   1 4 How to use this guide     oaa aa CEES ee ee eS    Brook 3  1 5 Getting assistance and the ASReml forum   aoaaa aaa 3  1 6 Typographic CONVENTIONS   lt     lt c e ke REE EHR SEE RR ERK G 4   2 Some theory 5  2A The general linear mixed model                222 020000 5  2 1 1 Sigma parameterization of the linear mixed model             5   2 1 2 Partitioning the fixed and random model terms               6   2 1 3 G structure for the random model terms                   6   2 1 4 Partitioning the residual error term              2 020   T   2 1 5 R structure for the residual error term    aaa aaa aaa 7   2 1 6 Gamma parameterization for the linear mixed model            8   2L7 Farameter Woes ec cede bei dundee dt Gavdseeaeees 8   2 1 8 Variance structures for the random model terms              8   2 1 9 Variance models for terms with several factors               9   2 110 Direct product structures e so oes he ee Ee ee ew a 10   2 1 11 Direct products in R structures oc    eee eee a 10   2 1 12 Direct products in G structures 2  lt s oc csa eee tapeet pad 11    2 1 13 Range of variance models for R and G structures              11    2 1 14 Combining variance models in R and G structures             12  a2 Be ne be eee Bee eee bee ee PAS ee oe 6 eee 12  2 2 1 Estimation of the variance parameters                    12  2 2 2 Estimation prediction of the fixed and random effects           14  2 2 3 Use of the gamma parameterization                   
250. el brief description common usage  term  fixed random       cos  v  r  forms cosine from v with period r J  ge f  condition on factor variable f  gt   r J  giv f n  associates the nth  giv G inverse with the factor J  f  grm f n  associates the nth  grm G with the factor f J  gt f  condition on factor variable f  gt r J  h f  factor fis fitted Helmert constraints J  ide f  fits pedigree factor f without relationship matrix J    inv vL  7r   forms reciprocal of v   r     lt     le f  condition on factor variable f  lt   r     lt        leg v     n  forms n 1 Legendre polynomials of order 0  in  vy  tercept   1  linear     n from the values in v  the  intercept polynomial is omitted if v is preceded by  the negative sign     lt  f  condition on factor variable f  lt  r     lt      log v  r   forms natural logarithm of v   r         mai  f  constructs MA1 design matrix for factor f J  mai forms an MA1 design matrix from plot numbers J  mbf  v  r  is a factor derived from data factor v by using the y y   MBF qualifier   out  n  condition on observation n y  out  n  t  condition on record n  trait t J  v    pol  v     n  forms n 1 orthogonal polynomials of order 0  in   tercept   1  linear     n from the values in v  the  intercept polynomial is omitted if n is preceded by  the negative sign     pow  x  p  o   defines the covariable  x  0     for use in the model    where zx is a variable in the data  p is a power and  o is an offset     qtl f p  impute a covariable from marker ma
251. eld   mu variety  and ignored  But then it was unable to read   jp Repl  the first line of the data file  residual ari Row  ar1 Col     predict varierty    row   column    nin89 asd  slip 1          Folder  C  Users Public ASRem1 Docs Manex4 ERR  QUALIFIERS   SLIP 1   Warning  Unrecognised qualifier at character 11  SLIP 1  Reading nin89 asd FREE FORMAT skipping 0 lines    Univariate analysis of yield  Notice  Maybe you want  A  L qualifiers for this factor  variety  Error at field 1  variety  of record 1  line 1    Since this is the first data record  you may need to skip some header lines   see  SKIP  or append the  A qualifier to the definition of factor variety  Fault  Missing faulty  SKIP or  A needed for variety   Last line read was  variety id pid raw rep nloc yield lat long row column  Currently defined structures  COLS and LEVELS   1 variety 1 2 2 0 0 0    10 row 1 2 2 0 9 0     248    14 4 An example       11 column 1 2 2 0 10 0  12 mu 0 4  o 0 z 0  ninerr2 nin89 asd   Model specification  TERM LEVELS GAMMAS    mu 0  variety 0  12 factors defined  max5000    O variance parameters  max2500   2 special structures    Last line read was  variety id pid raw rep nloc yield lat long row column  Finished  23 Apr 2014 09 17 01 765 Missing faulty  SKIP or  A needed for variety    3  An incorrectly defined factor       After correcting  slip 1 to  skip 1  we  qin Alliance Trial 1989  get the following  abbreviated  output  The   variety     problem is that variety is coded in 
252. ements. Normally, predict points will be defined for all combinations of X and Y values. This qualifier is required (with optional argument 1) to specify the lists are to be taken in parallel. The lists must be the same length if to be taken in parallel.
Be aware that adding two-dimensional prediction points is likely to substantially slow iterations because the variance structure is dense and becomes larger. For this reason, ASReml will ignore the extra PVAL points unless either !FINAL or !GKRIGE are set, to save processing time.

The !GROUPFACTOR qualifier, like !SUBSET, must appear on a line by itself after the data line and before the model line. Its purpose is to define a factor t by merging levels of an existing factor v. The syntax is
!GROUPFACTOR <Group_factor> <Exist_factor> <new codes>
for example
!GROUPFACTOR Year YearLoc 1 1 1 2 2 3 3 3 4 4
forms a new factor Year with 4 levels from the existing factor YearLoc with 10 levels.
Alternatively, Year could be formed by data transformation:
Year !=YearLoc !set 1 1 1 2 2 3 3 3 4 4 !L 2001 2002 2003 2004

!IDLIMIT v is used when ASReml expands a residual statement like residual sat(Site).ar1(row).ar1(col) and the dimension of row or col is small. The ar1() structure is changed to id(). When the number of rows/columns is less than or equal to v, the structure is set to ID instead of AR1. v has a default value of 4 and cannot be reset to less than 3. If the qualifier i
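The level merging performed by the GROUPFACTOR example above can be mimicked in Python. This sketch assumes the ten new codes in the example read 1 1 1 2 2 3 3 3 4 4 (one code per YearLoc level); the function name `group_factor` and the sample data are hypothetical and are not part of ASReml.

```python
def group_factor(levels, new_codes):
    """Map each coded level (1-based) of an existing factor to its merged level."""
    return [new_codes[level - 1] for level in levels]


# One new code for each of the 10 YearLoc levels, giving 4 Year levels
new_codes = [1, 1, 1, 2, 2, 3, 3, 3, 4, 4]

# Hypothetical YearLoc codes for five data records
yearloc = [1, 4, 10, 7, 3]

year = group_factor(yearloc, new_codes)
print(year)
```

The number of distinct values in `new_codes` gives the number of levels in the merged factor, here 4.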
253. eml after reordering these terms to obtain the !FOWN test(s) specified. Several reruns may be needed to perform all !FOWN tests specified.
• Any model terms in the !FOWN lists which do not appear in the actual model are ignored without flagging an error.
• Any model terms which are omitted from !FOWN statements are tested with the usual conditional test.
• If any model terms are listed twice, only the first test is performed.
F-con tests specified in !FOWN statements are given model codes 0 P

The !FOWN statements are parsed by the routine that parses the model line and so accept the same model syntax options. Care should be taken to ensure term names are spelt exactly as they appear in the model.

is used to have the first random term included in the dense equations if it is a GRM/GIV variance structure. This will result in faster processing when the GRM (inverse) matrix is not sparse.

sets the number of inner iterations performed when an iteratively weighted least squares analysis is performed. Inner iterations are iterations to estimate the effects in the linear model for the current set of variance parameters. Outer iterations are the AI updates to the variance parameters. The default is to perform 4 inner iterations in the first round and 2 in subsequent rounds of the outer iteration. Set n to 2 or more to increase the number of inner iterations.

sets hardcopy graphics file type to HP-GL. An argument of 2 sets the hardcopy g
254. eml user interface is terse. Most effort has been directed towards efficiency of the engine. It normally operates in a batch mode.

Problem size depends on the sparsity of the mixed model equations and the size of your computer. However, models with 500 000 effects have been fitted successfully. The computational efficiency of ASReml arises from using the Average Information REML procedure (giving quadratic convergence) and sparse matrix operations. ASReml has been operational since March 1996 and is updated periodically.

1.2 Installation

Installation instructions are distributed with the program. If you require help with installation or licensing, please email support@asreml.co.uk.

1.3 User Interface

ASReml is essentially a batch program with some optional interactive features. The typical sequence of operations when using ASReml is
• Prepare the data (typically using a spreadsheet or data base program).
• Export that data as an ASCII file (for example, export it as a .csv (comma separated values) file from Excel).
• Prepare a job file with filename extension .as
• Run the job file with ASReml.
• Review the various output files.
• Revise the job and re-run it, or
• extract pertinent results for your report.

You need an ASCII editor to prepare input files and review and print output files. Two commonly used editors are:

1.3.1 ASReml-W

The ASReml-W interface is a graphical tool allowing the user to edit pr
255. en Marker35 a new name because it is still also generated by the  CYCLE unless  it is modified to read  ICYCLE 1 34 36 1000    After several cycles  we might have    Marker screen   Genotype     yield   PhenData txt    ASSIGN MSET R21 R35 R376 R645 R879   ICYCLE 1 1000   IMBF mbf Genotype  MLIB Marker I csv  RENAME Marker I    FOR  MSET  DO  MBF mbf Genotype  MLIB Marke S csv  RENAME  S  yld   mu  r  MSET Marker I    10 4 4 Order of Substitution    The substitution order is ASSIGN  FOR  CYCLE  TP  command line arguments and finally the  interactive prompt     10 5 Performance issues    10 5 1 Multiple processors   ASReml has not been configured for parallel processing  Performance is downgraded if it  tries to use two processors simultaneously as it wastes time swapping between processors   10 5 2 Slow processes    The processing time is related to the size of the model  the complexity of the variance model   in particular the number of parameters   the sparsity of the mixed model equations  the    205    10 5 Performance issues       amount of data being processed     Typically  the first iteration take longer than other iterations  The extra work in the first  iteration is to determine an optimum equation order for processing the model  see  EQORDER      The extra processes in the last iteration are optional  They include    e calculation of predicted values  see PREDICT statement    e calculation of denominator degrees of freedom  see  DDF    e calculation of outlier stati
256. eness can be removed  by considering 0  lt  a  lt  5 and     gt  0  or by considering 0  lt  a  lt  m and either 0  lt 6  lt  1 or      gt  1  With A   2  isotropy occurs when      1  and then the rotation angle qa is irrelevant   correlation contours are circles  compared with ellipses in general  With A   1  correlation  contours are diamonds     7 11 5 Notes on power models    Power models rely on the definition of distance for the associated term  for example     the distance between time points in a one dimensional longitudinal analysis       the spatial distance between plot coordinates in a two dimensional field trial analysis     Information for determining distances is supplied either implicitly by applying the model to  the fac   of the coordinate variables  or explicitly with the   COORD qualifier     For one dimensional cases  either      expv fac X   where X contains the positions       expv Trait  COORD x  where x is a vector of positions       In two directions  IEXP  IGAU  IEUC  AEXP  AGAU  MATn     For a G structure relating to the model term fac x y   use fac x y   For example    yield   mu     r ieucv fac xcoord ycoord  INIT 0 7 1 3     7 11 6 Notes on Factor Analytic models    FAk  FACVk and XFAk are different parameterizations of the factor analytic model in which     is modelled as      IT    Y where T        is a matrix of loadings on the covariance scale  and W is a diagonal vector of specific variances  See Smith et al   2001  and Thompson et  al   20
257. ensitive,

most qualifier identifiers may be truncated to 3 characters.

5.3 Title line

The first 40 characters of the first nonblank text line in an ASReml command file are taken as a title for the job. Use this to document the analysis for future reference. An optional qualifier line (see section 10.3) may precede the title line. It is recognised by the presence of the qualifier prefix letter !. Therefore the title MUST NOT include an exclamation mark.

NIN Alliance Trial 1989
 variety !A
 id
 pid

5.4 Specifying and reading the data

Typically, a data record consists of all the information pertaining to an experimental unit (plot, animal, assessment). Data field definitions manage the process of converting the fields as they appear in the data file to the internal form needed by ASReml. This involves mapping (coding) factors, general transformations, skipping fields and discarding unnecessary records. If the necessary information is not in a single file, the MERGE facility (see Chapter 11) may help.

Variables are defined immediately after the job title. These definitions indicate how each field in the data file is handled as it is read into ASReml. Transformations can be used to create additional variables. Users can explicitly nominate how many are read with the !READ qualifier described in Table 5.5. No more than 10 000 variables may be read or formed.

Data field definit
258. ents  are idv repl  and ar1  row   see Table 7 2     A general form for a covariance component is    umfname component qualifiers     where qualifiers is an optional list of one or more qualifiers to be applied to the variance  structure being defined  A simple example of this is the extension of idv  repl  to idv repl   INIT 0 65  which specifies an IDV structure of dimension 4 for replicates  NIN example 2   with an initial variance of 0 65 for the variance component associated with replicates under  the sigma parameterization  or an initial variance component ratio of 0 65 for the variance  component ratio associated with replicates under the gamma parameterization     Note that a variance structure of a particular dimension  w say  can been specified directly  as    umfname w qualifiers     For example  idv 3  defines the IDV variance structure of dimension 3  that is  o7I   and  idv 3  INIT 1 1  specifies an initial value of 1 1 for the associated variance component  under the sigma parameterization or variance ratio under the gamma parameterization   Likewise  ar1 10  specifies an autoregressive correlation structure  AR1  of order 10 and  ari 10  INIT 0 4  specifies this same structure with an initial autocorrelation parameter  of 0 4  A simple variance component o  would be defined as idv 1   Note that an integer  value for the first argument is only valid in variance model functions associated with residual  terms and str       The full list of variance model functio
259. er if labels are not abbreviated  If abbreviations are used  then they need to be chosen to avoid confusion     if the model is written over several lines  all but the final line must end with a COMMA   or    to indicate that the list is continued     In Tables 6 1 and 6 2  the arguments in model term functions are represented by the following  symbols   f     the label of a data variable defined as a model factor    k  n     an integer number    r    areal number    t     a model term label  includes data variables      v  y     the label of a data variable     Where a model term takes another model term as an argument  the argument may occa   sionally need to be predefined  This is done by including the argument model term in the    88    6 2 Specifying model formulae in ASReml       model term list with a leading         which will cause the term to be defined but not fitted  For  example    Trait male  Trait female and Trait female     89    6 2 Specifying model formulae in ASReml       Table 6 1  Summary of reserved words  operators and functions       model brief description common usage  term  fixed random       reserved mu the constant term or intercept J   terms mv a term to estimate missing values J  Trait multivariate counterpart to mu J  units forms a factor with a level for each experimental J   unit   operators   Or  placed between labels to specify an interaction J J    forms nested expansion  Section 6 5  y y    forms factorial expansion  Section 6 5  J J    p
260. eration (the final iteration) in which prediction is performed.

By default, factors are predicted at each level, simple covariates are predicted at their overall mean and covariates used as a basis for splines or orthogonal polynomials are predicted at their design points. Covariates grouped into a single term (using the !G qualifier, page 48) are treated as covariates.

Prediction at particular values of a covariate or particular levels of a factor is achieved by listing the levels/values after the variate/factor name. Where there is a sequence of values, use the notation a b ... n to represent the sequence of values from a to n with step size b - a. The default step size is 1 (in which case b may be omitted). A colon (:) may replace the ellipsis (...). An increasing sequence is assumed. When giving particular values for factors, the default is to use the coded level (1:n) rather than the label (alphabetical or integer). To use the label, precede it with a quote. Where a large number of values must be given, they can be supplied in a separate file, and the filename specified in quotes. The file form does not allow label coding or sequences. See the discussion of !PRWTS for an example.

Model terms mv and units are always ignored.

Model terms which are functions, such as at(), and(), pol(), sin(), spl() ..., including those defined using !CONTRAST, !GROUP, !SUBGROUP, !SUBSET and !MBF, are implicitly defined through their base variables and can not be direct
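The sequence notation for prediction values described above (from a to n, with step b - a, or step 1 when b is omitted) can be sketched in Python. The function name `expand_sequence` is hypothetical, and the choice to stop at the last value not exceeding n is an assumption; the text does not say how ASReml treats an end point that is off the grid.

```python
def expand_sequence(a, n, b=None):
    """Expand 'a b ... n' into a list of values.

    The step is b - a, or 1 when b is omitted. Only increasing
    sequences are supported, as the text assumes.
    """
    step = 1 if b is None else b - a
    if step <= 0:
        raise ValueError("an increasing sequence is assumed")
    values = []
    k = 0
    # Small tolerance guards against floating-point round-off at the end point
    while a + k * step <= n + 1e-9:
        values.append(a + k * step)
        k += 1
    return values


print(expand_sequence(1, 5))          # step defaults to 1
print(expand_sequence(10, 30, b=15))  # step is 15 - 10 = 5
```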
261. erge faster. Note that this option is not available with the nrm or grm functions. Note also that the EM update is applied to all of the variance parameters in the particular US model and cannot be applied to only a subset of them. EM updates can be slow to converge, and an alternative parameterization using a factor analysis may converge faster and give a more parsimonious parameterization. It may be that there is no variance associated with some levels of the matrix, in which case the dimension of the matrix should be reduced.

7.7.5 New R4 Initial values INIT v

Prior to Release 4 it was necessary to supply initial values for variance structure parameters, except for the default IDV variance structure for a random model term, where the default initial variance (ratio) parameter value was 0.1. In Release 4, it is not generally necessary to supply initial values. In this release, ASReml provides starting or initial values for variance structure parameters based on knowledge of the phenotypic variance of the response. Occasionally these initial values are not adequate and more appropriate values will need to be supplied by the user. In this case the user may have good prior information that can be utilized in forming initial values.

There are several ways to provide initial values. The particular choice will depend on how many values and other variance model function qualifiers are to be specified. The initial values can be provided in a number of wa
262. erms in the mixed model

NIN Alliance trial 1989
 variety !A
 ...
 column 11
nin89.asd !skip 1
tabulate yield ~ variety
yield ~ mu variety !r idv(repl)
residual idv(units)
predict variety

The !r qualifier tells ASReml to fit the terms that follow as random effects.

3.4.7 Variance structures

There are two variance structures to be specified and two variance components to be estimated. The first structure is for the replicate (repl) effects. These effects are IID distributed and idv(repl) denotes this and estimates one variance component associated with these effects. The other is associated with the residual effects, which are again assumed to be IID distributed. This is formally specified here by the line residual idv(units), where residual is the name of the directive that specifies the variance structure for the residuals, and units is the reserved word specifying a factor with a level for every experimental unit. The default variance structure is always uncorrelated effects with a common variance and so idv(repl) and idv(units) can be reduced to simply repl and units. See Chapter 7 for a lengthy discussion on variance modelling in ASReml.

3.4.8 Prediction

Predict statements appear after the model
263. ers are advised to parse the .xml file in redeveloping code to handle the changes with the new release.

13.3.2 The .sln file

The .sln file contains estimates of the fixed and random effects with their standard errors in an array with four columns ordered as

factor_name level estimate standard_error

Note that the error presented for the estimate of a random effect is the square root of the prediction error variance. In a genetic context, for example, where a relationship matrix A is involved, the accuracy is $\sqrt{1 - s_i^2 / \left((1 + f_i)\,\sigma_A^2\right)}$, where $s_i$ is the standard error reported with the BLUP ($u_i$) for the ith individual, $f_i$ is the inbreeding coefficient reported when the !DIAG qualifier is given on a pedigree file line, $1 + f_i$ is the diagonal element of A and $\sigma_A^2$ is the genetic variance. The .sln file can easily be read into a GENSTAT spreadsheet or an S-PLUS data frame. Below is a truncated copy of nin89a.sln. Note that

• the order of some terms may differ from the order in which those terms were specified in the model statement,
• the missing value estimates appear at the end of the file in this example,
• the format of the file can be changed by specifying the !SLNFORM qualifier. In particular, more significant digits will be reported,
• use of the !OUTLIER qualifier will generate extra columns containing the outlier statistics described on page 17.

Model_Term Level Effect seEffect
variety LANCER 0.000 0.000
variety est
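The accuracy calculation described for the .sln standard errors can be sketched in Python. This assumes the formula reads as the square root of 1 - s_i^2 / ((1 + f_i) * sigma_A^2), which is a reconstruction from the symbol definitions in the text; the function name `blup_accuracy` and the numeric inputs are purely illustrative and are not taken from nin89a.sln.

```python
from math import sqrt


def blup_accuracy(se, f, sigma2_a):
    """Accuracy of a BLUP from its reported standard error.

    se       : standard error reported with the BLUP (s_i)
    f        : inbreeding coefficient (f_i), so 1 + f is the diagonal of A
    sigma2_a : genetic variance (sigma_A^2)
    """
    pev_ratio = se**2 / ((1.0 + f) * sigma2_a)
    if pev_ratio > 1.0:
        raise ValueError("prediction error variance exceeds (1 + f) * sigma2_a")
    return sqrt(1.0 - pev_ratio)


# Illustrative values only: non-inbred individual, unit genetic variance
accuracy = blup_accuracy(se=0.6, f=0.0, sigma2_a=1.0)
print(round(accuracy, 3))
```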
264. es in the explanatory variable is large, or if units are measured at different times.

The data we use was originally reported by Draper and Smith (1998, ex24N, p559) and has recently been reanalysed by Pinheiro and Bates (2000, p338). The data are displayed in Figure 15.12 and are the trunk circumferences (in millimetres) of each of 5 trees taken at 7 times. All trees were measured at the same time so that the data are balanced. The aim of the study is unclear, though both previous analyses involved modelling the overall "growth" curve, accounting for the obvious variation in both level and shape between trees. Pinheiro and Bates (2000) used a nonlinear mixed effects modelling approach, in which they modelled the growth curves by a three-parameter logistic function of age, given by

$$ y = \frac{\phi_1}{1 + \exp\left[-(x - \phi_2)/\phi_3\right]} \qquad (15.11) $$

where $y$ is the trunk circumference, $x$ is the tree age in days since December 31 1968, $\phi_1$ is the asymptotic height, $\phi_2$ is the inflection point or the time at which the tree reaches $0.5\phi_1$, and $\phi_3$ is the time elapsed between trees reaching half and about 3/4 of $\phi_1$.

Figure 15.12: Trellis plot of trunk circumference for each tree

The datafile consists of 5 columns viz Tree, a factor with 5 levels; age, tree age in days since 31st December 1968; circ, the trunk circumference; and season. The last column season
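The roles of the three parameters in the logistic growth function can be checked numerically. The sketch below assumes the usual three-parameter logistic form, phi1 / (1 + exp(-(x - phi2) / phi3)); the function name `logistic_growth` and the parameter values are illustrative only and are not estimates from the orange tree data.

```python
from math import exp


def logistic_growth(x, phi1, phi2, phi3):
    """Three-parameter logistic curve evaluated at age x."""
    return phi1 / (1.0 + exp(-(x - phi2) / phi3))


# Illustrative parameter values (not fitted estimates)
phi1, phi2, phi3 = 200.0, 700.0, 350.0

# At the inflection point x = phi2 the curve is at half the asymptote
at_inflection = logistic_growth(phi2, phi1, phi2, phi3)

# One further phi3 along, the curve is at about 3/4 of the asymptote
fraction_later = logistic_growth(phi2 + phi3, phi1, phi2, phi3) / phi1

print(at_inflection, round(fraction_later, 3))
```

The second value is 1 / (1 + e^-1), roughly 0.73, which is the "about 3/4" referred to in the description of the third parameter.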
265. esidual within litter variation. The ASReml input to achieve this analysis is presented below.

    Rats example
     dose 3 !A
     sex 2 !A
     littersize
     dam 27
     pup 18
     weight
    rats.asd !DOPATH 1   # Change DOPATH argument to select each PATH
    !PATH 1
    weight ~ mu littersize dose sex dose.sex !r idv(dam)
    residual idv(units)
    !PATH 2
    weight ~ mu out(66) littersize dose sex dose.sex !r idv(dam)
    residual idv(units)
    !PATH 3
    weight ~ mu littersize dose sex !r idv(dam)
    residual idv(units)
    !PATH 4
    weight ~ mu littersize dose sex
    residual idv(units)

The input file contains an example of the use of the !DOPATH qualifier. Its argument specifies which part to execute. We will discuss the models in the two parts. It also includes the !FCON qualifier to request conditional Wald F statistics. Abbreviated output from part 1 is presented below.

    1 LogL= 74.2174  S2= 0.19670  315 df  0.1000  1.000
    2 LogL= 79.1579  S2= 0.18751  315 df  0.1488  1.000
    3 LogL= 83.9408  S2= 0.17755  315 df  0.2446  1.000
    4 LogL= 86.8093  S2= 0.16903  315 df  0.4254  1.000
    5 LogL= 87.2249  S2= 0.16594  315 df  0.5521  1.000
    6 LogL= 87.2398  S2= 0.16532  315 df  0.5854  1.000
    7 LogL= 87.2398  S2= 0.16530  315 df  0.5867  1.000
    8 LogL= 87.2398  S2= 0.16530  315 df  0.5867  1.000
    Final parameter values                0.5867

    - - - Results from analysis of weight - - -
    Akaike Information Criterion   -170.48 (assuming 2 parameters)
    Bayesian Information Criterion -162.97
    App
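The two information criteria in the rats output can be reproduced from the final log-likelihood. The sketch below assumes the usual definitions AIC = -2 LogL + 2t and BIC = -2 LogL + t log(v), with t the number of variance parameters and v the residual degrees of freedom; the variable names are illustrative. Under these definitions both criteria are negative here because the log-likelihood is positive.

```python
import math

log_likelihood = 87.2398   # final LogL from the iteration log
n_parameters = 2           # "assuming 2 parameters"
residual_df = 315          # degrees of freedom shown on each LogL line

# Assumed definitions: AIC = -2 LogL + 2 t, BIC = -2 LogL + t log(v).
aic = -2.0 * log_likelihood + 2.0 * n_parameters
bic = -2.0 * log_likelihood + n_parameters * math.log(residual_df)

print(round(aic, 2), round(bic, 2))
```

Both rounded values agree in magnitude with the 170.48 and 162.97 printed in the output, which supports the assumed definitions for this example.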
266. estimation biases can be over 50% (e.g. Breslow and Lin, 1995; Goldstein and Rasbash, 1996; Rodriguez and Goldman, 2001; Waddington et al., 1994). For other GLMMs, PQL has been reported to perform adequately (e.g. Breslow, 2003). McCulloch and Searle (2001) also discuss the use of PQL for GLMMs.

The performance of PQL in other respects, such as for hypothesis testing, has received much less attention, and most studies into PQL have examined only relatively simple GLMMs. Anecdotal evidence suggests that this technique may give misleading results in certain situations. Therefore we cannot recommend the use of this technique for general use, and it is included in the current version of ASReml for advanced users. If this technique is used, we recommend the use of cross validatory assessment, such as applying PQL to simulated data from the same design (Millar and Willis, 1999).

The standard GLM Analysis of Deviance (!AOD) should not be used when there are random terms in the model as the variance components are reestimated for each submodel.

6.9 Missing values

6.9.1 Missing values in the response

It is sometimes computationally convenient to estimate missing values, for example, in spatial analysis of regular arrays, see example 3a in Section 7.5. Missing values are estimated if the model term mv is included in the model. mv is formally shown here in the sparse fixed effects to

    NIN Alliance Trial 1989
    variety
    row 22
    column 11
    nin89.asd
267. example

• A will be interpreted as idv(A)
• A.B will be interpreted as idv(A.B)
• A.B.C will be interpreted as idv(A.B.C)
• sat(Expt,1).A will be interpreted as sat(Expt,1).idv(A)
• sat(Expt,1).A.B will be interpreted as sat(Expt,1).idv(A.B)
• sat(Expt,1).A.B.C will be interpreted as sat(Expt,1).idv(A.B.C)

In these cases the model term can be followed by an initial value and/or a parametric qualifier, for example

• A 1 !GP is interpreted as idv(A !INIT 1 !GP)

There is always a residual error term in the model but if it is not explicitly specified it is assumed to be idv(units) for univariate data and id(units).us(Trait) for multivariate data. If the consolidated model term definition is incomplete, that is, if some but not all of the components have a variance model function specified, the variance model functions idv() or id() will be applied to these components depending on the variance model functions specified. For example

• idv(A).B will be interpreted as idv(A).id(B)
• id(A).B will be interpreted as id(A).idv(B)
• id(A).B.C will be interpreted as id(A).idv(B.C)
• idv(A).B.C will be interpreted as idv(A).id(B.C)

Similarly, at the residual level, as sat() cannot be converted into a variance function

• sat(Expt,1).id(A).B will be interpreted as sat(Expt,1).id(A).idv(B)
• sat(Expt,1).id(A).B.C will be interpreted as sat(Expt,1).id(A).idv(B.C)
268. f the data and by fitting simpler models.

• software problems. There are many options in ASReml and some combinations have not been tested. Some jobs are too big. When all else fails, describe your problem to the forum http://www.vsni.co.uk/forum or email support@vsni.co.uk.

There are over 6000 one line diagnostic messages that ASReml may print in the .asr file. Hopefully, most are self explanatory, but it will always be helpful to recognise whether they relate to parsing the input file, or raise some other issue. See Section 14.5 for more information on these messages.

14.2 Common problems

Common problems in coding ASReml are as follows:

• a variable name has been misspelt; variable names are case sensitive,

• a model term has been misspelt; model term functions and reserved words (mu, Trait, mv, units) are case sensitive,

• the data file name is misspelt or the wrong path has been given; enclose the pathname in quotes (") if it includes embedded blanks,

• a qualifier has been misspelt or is in the wrong place,

• failure to use commas appropriately in model definition lines,

• there is an error in the predict statement,

• model term mv not included in the model when there are missing values in the data and the model fitted assumes all data is present,

• there is an inconsistency between the variance header line and the structure definition lines presented (original syntax),

• there is an error in
269. factor where the n variables are the marker states for n markers in a linkage group in map order and coded [-1,1] (backcross) or [-1,0,1] (F2 design). s (length n+1) should be the n marker positions relative to a left telomere position of zero, and an extra value being the length of the linkage group (the position of the right telomere). The length (right telomere) may be omitted in which case the last marker is taken as the end of the linkage group. The positions may be given in Morgans or centiMorgans (if the length is greater than 10, it will be divided by 100 to convert to Morgans).

The recombination rate between markers at $s_L$ and $s_R$ (L is left and R is right of some putative QTL at Q) is

$\theta_{LR} = (1 - e^{-2(s_R - s_L)})/2$.

Consequently, for 3 markers (L, Q, R),

$\theta_{LR} = \theta_{LQ} + \theta_{QR} - 2\theta_{LQ}\theta_{QR}$.

The expected value of a missing marker at Q (between L and R) depends on the marker states at L and R:

$E(q|1,1) = (1 - \theta_{LQ} - \theta_{QR})/(1 - \theta_{LR})$,
$E(q|1,-1) = (\theta_{QR} - \theta_{LQ})/\theta_{LR}$,
$E(q|-1,1) = (\theta_{LQ} - \theta_{QR})/\theta_{LR}$ and
$E(q|-1,-1) = -(1 - \theta_{LQ} - \theta_{QR})/(1 - \theta_{LR})$.

Let

$\lambda_L = (E(q|1,1) + E(q|1,-1))/2 = \theta_{QR}(1 - \theta_{QR})(1 - 2\theta_{LQ}) / (\theta_{LR}(1 - \theta_{LR}))$ and
$\lambda_R = (E(q|-1,1) - E(q|-1,-1))/2 = \theta_{LQ}(1 - \theta_{LQ})(1 - 2\theta_{QR}) / (\theta_{LR}(1 - \theta_{LR}))$.

Then $E(q|x_L, x_R) = \lambda_L x_L + \lambda_R x_R$. Where there is no marker on one side, $E(q|x_R) = (1 - \theta_{QR})x_R - \theta_{QR}x_R = x_R(1 - 2\theta_{QR})$. This qualifier facilitates the QTL method discussed in Gilmour (2007).

!DOM A is used to form domin
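The recombination formulas above can be explored with a short Python sketch. It is a stand-alone illustration of the arithmetic for backcross coding (-1, 1), not ASReml code; the function names and the marker positions are hypothetical.

```python
import math

def to_morgans(positions):
    """Apply the rule in the text: a map length above 10 is taken as cM."""
    if positions[-1] > 10:
        return [p / 100.0 for p in positions]
    return list(positions)

def theta(s_left, s_right):
    """Recombination rate between two positions given in Morgans."""
    return (1.0 - math.exp(-2.0 * (s_right - s_left))) / 2.0

def expected_qtl(x_left, x_right, s_left, s_q, s_right):
    """Expected state at Q given flanking marker states coded -1 or 1."""
    t_lq = theta(s_left, s_q)
    t_qr = theta(s_q, s_right)
    t_lr = theta(s_left, s_right)
    denom = t_lr * (1.0 - t_lr)
    lam_left = t_qr * (1.0 - t_qr) * (1.0 - 2.0 * t_lq) / denom
    lam_right = t_lq * (1.0 - t_lq) * (1.0 - 2.0 * t_qr) / denom
    return lam_left * x_left + lam_right * x_right

# Hypothetical map in centiMorgans: left marker, putative QTL, right marker.
s_l, s_q, s_r = to_morgans([0.0, 10.0, 30.0])
t_lq, t_qr, t_lr = theta(s_l, s_q), theta(s_q, s_r), theta(s_l, s_r)

print(expected_qtl(1, 1, s_l, s_q, s_r), expected_qtl(1, -1, s_l, s_q, s_r))
```

The weighted form reproduces the four conditional expectations, so only the two weights need to be computed for each interval.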
270. fficients and cubic smoothing splines         Oranges        Results from analysis of circ        Akaike Information Criterion 186 86  assuming 6 parameters    Bayesian Information Criterion 195 65  Model_Term Gamma Sigma Sigma SE  C  idv spl  age 7   IDV_V 6 2 17100 12 2471 1 09 OP  Residual SCA_V 35 1 000000 5 64123 1 12 OP  us  2   id Tree  10 effects  2 UV 1 1 5 61715 31 6877 1 26 0P  2 US_C 2 1  0 124098E 01  0 700063E 01  0 85 0P  2 US_V 2 2 0 108290E 03 0 610886E 03 1 41 OP  idv spl age 7   Tree  ID_V 1 1 38313 7 80258 1 48 OP  Covariance Variance Correlation Matrix US Tree  31 69  0  5032   0 7001E 01 0 6109E 03  Wald F statistics   Source of Variation NumDF DenDF F inc P ine  9 mu 1 2 4 169 87 0 006  3 age 1 2 4 92 78 Oii  5 Season 1 8 8 108 49  lt  001    200 600 1000 1400      I L      5 Marginal             Trunk circumference  mm              100   Zo   Lo    ZA Wa  50 a ZA             200 600 1000 1400    Time since December 31  1968  Days     Figure 15 15  Trellis plot of trunk circumference for each tree at sample dates  adjusted for  season effects   with fitted profiles across time and confidence intervals    Figure 15 15 presents the predicted growth over time for individual trees and a marginal  prediction for trees with approximate confidence intervals  2  standard The conclusions    from this analysis are quite different from those obtained by the nonlinear mixed effects    315    15 9 Balanced longitudinal data   Random coefficients and cubic smoothing 
271. was added after noting that tree age spans several years and if converted to day of year, measurements were taken in either Spring (April/May) or Autumn (September/October).

First we demonstrate the fitting of a cubic spline in ASReml by restricting the dataset to tree 1 only. The model includes the intercept and linear regression of trunk circumference on age and an additional random term spl(age,7) which instructs ASReml to include a random term with a special design matrix with 7 - 2 = 5 columns which relate to the vector $\delta$ whose elements $\delta_i$, $i = 2, \ldots, 6$ are the second differentials of the cubic spline at the knot points. The second differentials of a natural cubic spline are zero at the first and last knot points (Green and Silverman, 1994). The ASReml job is

    this is the orange data, for tree 1
     seq      # record number is not used
     Tree 5
     age      # 118 484 664 1004 1231 1372 1582
     circ
     season !L Spring Autumn
    orange.asd !skip 1 !filter 2 !select 1
    !SPLINE spl(age,7) 118 484 664 1004 1231 1372 1582
    !PVAL age 150 200 1500
    circ ~ mu age !r idv(spl(age,7))
    residual idv(units)
    predict age

Note that the data for tree 1 has been selected by use of the !filter and !select qualifiers. Also note the use of !PVAL so that the spline curve is properly predicted at the additional nominated points. These additional data points are required for ASReml to form the design matrix to properly interpolate the cu
272. files       ooa aaa  5 8 Job control qualifiers      aaa ee eee Ee  Command file  Specifying the terms in the mixed model  6 1 IMrod  chOn o ec Bee eo Se Se EERE ER a A Re EE ea  6 2 Specifying model formulae in ASReml                2 000004  GAl General rules ce ee AER SRR ASE ERE ER  2 eG  o oe ee Pe ee Se CES SS eRe eee we ars  6 3 Fixed terms in the model 2  osa ccc 884622886028 tu 48 es    vi    GaL PN Ted TER o sios So od BRON SHEEN OS BEES  Goo Site Te Ts o  lt  oaie wa RE EE KEG REG ES  6 4 Random and residual terms in the variance component model            6 5 Interactions and conditional factors       ooa a 004 0s  0O51 Interactions o co o Rk eee ee eh a eh g p a  eP Fees E E E E  053 Conditional factos  e e s oes b oe eos piedi g p Re Sew Ee  6 5 4 Associated Factors    aoaaa a  6 6 Alphabetic list of model functions    oa a a eee eee  6 7 WEEDS s saca e SE Ew Oe a ae See  6 8 Generalized Linear  Mixed  Models    oaaae  6 8 1 Generalized Linear Mixed Models                 2 4    6 9 Missing valties ce we ke we eR ee BSE oe ee ee Ee eS ee  6 9 1 Missing values in the response               2  2 20004   6 9 2 Missing values in the explanatory variables                  6 10 Some technical details about model fitting in ASReml                6 10 1 Sparse versus dense o  o s cec aon ke Pe ee ew  6 10 2 Ordering of terms in ASReml              0  0  000004  6 10 3 Aliassing and singularities               0    0 2  0000    6 10 4 Examples of alassing      o o co eee
273. forming a prediction table, it is necessary to average over (or ignore) some dimensions of the hyper-table. By default, ASReml uses equal weights (1/f for a factor with f levels). More complicated weighting is achieved by using the !AVERAGE qualifier to set specific (unequal) weights for each level of a factor. However, sometimes the weights need to be defined with respect to two or more factors. The simplest case is when there are missing cells and weighting is equal for those cells in a multiway table that are present, achieved by using the !PRESENT qualifier. This is further generalized by allowing the user to supply the weights to be used by the !PRESENT machinery via the !PRWTS qualifier.

The user specifies the factors in the table of weights with the !PRESENT statement and then gives the table of weights using the !PRWTS qualifier. There may only be one !PRESENT qualifier on the predict line when !PRWTS is specified. The order of factors in the tables of weights must correspond to the order in the !PRESENT list with later factors nested within preceding factors. The weights may be given in a separate file if a filename (in quotes) is given as the argument to !PRWTS. Check the output to ensure that the values in the tables of weights are applied in the correct order. ASReml may transpose the table of weights to match the order it needs for processing.

When weights are supplied in a separate file, two layouts are allowed
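The three weighting schemes described above (equal weights, user supplied weights, and equal weights over the cells that are present) can be illustrated conceptually in Python. This is a sketch of the averaging idea only, with hypothetical function names and numbers; it does not reproduce how ASReml builds or processes its prediction tables.

```python
def average_equal(cell_values):
    """Average with equal weights 1/f over the f levels of a factor."""
    return sum(cell_values) / len(cell_values)

def average_weighted(cell_values, weights):
    """Average with user supplied weights, normalised to sum to one."""
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, cell_values)) / total

def average_present(cell_values):
    """Equal weights over the cells that are present (None marks a missing cell)."""
    present = [v for v in cell_values if v is not None]
    return sum(present) / len(present)

# Hypothetical predicted cell values for three levels of a factor.
cells = [10.0, 14.0, 18.0]
equal = average_equal(cells)
unequal = average_weighted(cells, [0.2, 0.3, 0.5])
with_gap = average_present([10.0, 14.0, None])

print(equal, unequal, with_gap)
```

The point of the comparison is that the same cell values give different table entries depending on the weights, which is why the text advises checking the output to confirm that supplied weights were applied in the intended order.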
274. from 1 to 30 across replicates (see Table 15.6). The terms in the linear model are therefore simply RowBlk ColBlk. Additional fields row and column indicate the spatial layout of the plots.

The ASReml input file is presented below. Three models have been fitted to these data. The lattice analysis is included for comparison in PATH 3. In PATH 1 we use the separable first order autoregressive model to model the variance structure of the plot errors. Gilmour et al. (1997) suggest this is often a useful model to commence the spatial modelling process. The form of the variance matrix for the plot errors (R structure) is given by

$\sigma^2\Sigma = \sigma^2(\Sigma_c \otimes \Sigma_r)$   (15.5)

where $\Sigma_c$ and $\Sigma_r$ are 15 x 15 and 10 x 10 matrix functions of the column ($\phi_c$) and row ($\phi_r$) autoregressive parameters respectively. Gilmour et al. (1997) recommend revision of the current spatial model based on the use of diagnostics such as the sample variogram of the residuals (from the current model). This diagnostic and a summary of row and column residual trends are produced by default with graphical versions of ASReml when a spatial model has been fitted to the errors. It can be suppressed by the use of the -n option on the command line. We have produced the following plots by use of the !EPS qualifier. The !RENAME !ARG 1 2 3 qualifiers in conjunction with !DOPART $1 cause ASReml to run all three parts, appending the part number to the outpu
275. g sigma s  gamma s  under each parameterization for the series of    NIN data examples    oaa eee ee eR EEE EES 123  Variance model function qualifiers available in ASReml               125  Examples of constraining variance parameters in ASReml            126  Details of the variance models available in ASReml                147  List of pedigree file qualifiers                 02002022 ee 159  List or prediction qualifiers       2 cb doh ea ee eee eA 181  List OF predict plot options  gt     s Ne eee Gee eee ON we ede 183  Trials classified by region and location                 2 000  186  Trok AWN o e Be a ee ee ee a Be ee ee 186    xiii    9 5    10 1  10 2  10 3    13 1  13 2    14 1  14 2  14 3    15 1  15 2  15 3  15 4  15 5    15 6  15 7  15 8    15 9   15 10  15 11  15 12  15 13    15 14  15 15    Location means s ee wR ER Re eS we eS we ee eS 186    Command line options  44 66   4 be dee nba bee Ee ESSE DES 195  The use of arguments in ASReml            00 00  eee eee 200  High level qualifiers      so eraco 64 bb ee de gadna ERG eG 201  Inet of MERGE Gugino   lt a 444 sereu sokea eee ee ee eo 208  Simmary of ASKeml output flee    65  cee aa Gee Gee ee wes 217  ASReml output objects and where to find them                  240  Some information messages and comments              2 2 000  255  List of warning messages and likely meaning s                   256  Alphabetical list of error messages and probable cause s  remedies        259  A split plot field trial of 
276. g terms   With sum to zero constraints  a missing treatment level will generate a singularity  but in the first coefficient rather than in the coefficient corresponding to the missing  treatment  In this case  the coefficients will not be readily interpretable  When  interacting constrained factors  all cells in the cross tabulation should have data     fac v  fac v  forms a factor with a level for each value of x and any additional points   fac v y  inserted as discussed with the qualifiers  PPOINTS and  PVAL  fac v y  forms a  factor with a level for each combination of values from v and y  The values are  reported in the  res file     97    6 6 Alphabetic list of model functions       Table 6 2  Alphabetic list of model functions and descriptions       model function    action       giv f n   g f n   grm f n     h f     ide f   i f     inv v  7r      leg v     n     lin f   1 f     log v  r      mal    mai f     mbf  f c   mbf  f     associates the nth   giv G inverse with the factor  This is used when there is a known   except for scale  G structure other than the additive inverse genetic relationship  matrix  The G inverse is supplied in a file whose name has the file extension   giv  described in Section 8 9  grm   and giv   are formally equivalent with grm standing  for generalized relationship Matrix     h f  requests ASReml to fit the model term for factor f using Helmert constraints   Neither Sum to zero nor Helmert constraints generate interpretable effects if sing
277. given by

$\tilde{e} = y - W\tilde{\beta} = RPy$   (2.23)

It follows that

$E(\tilde{e}) = 0$,  $\mathrm{var}(\tilde{e}) = R - WC^{-1}W'$

The matrix $WC^{-1}W'$ (under the sigma parameterization) is the so-called "extended hat" matrix. ASReml includes the $\sigma^2$ in the hat matrix under the gamma parameterization. It is the linear mixed effects model analogue of $X(X'X)^{-1}X'$ for ordinary linear models. The diagonal elements are returned in the fourth field of the .yht file.

The !OUTLIER qualifier invokes a partial implementation of research by Alison Smith, Ari Verbyla and Brian Cullis. With this qualifier, ASReml writes

• $G^{-1}\tilde{u}$ and $G^{-1}\tilde{u}/\sqrt{\mathrm{diag}(G^{-1} - G^{-1}C^{ZZ}G^{-1})}$ to the .sln file,

• $R^{-1}\tilde{e}$ and $R^{-1}\tilde{e}/\sqrt{\mathrm{diag}(R^{-1} - R^{-1}WC^{-1}W'R^{-1})}$ to the .yht file,

• and copies lines where the last ratio exceeds 3 in magnitude to the .res file and reports the number of such lines to the .asr file.

• It has not been validated for multivariate models or XFA models with zero $\Psi$s.

The variogram has been suggested as a useful diagnostic for assisting with the identification of appropriate variance models for spatial data (Cressie, 1991). Gilmour et al. (1997) demonstrate its usefulness for the identification of the sources of variation in the analysis of field experiments. If the elements of the data vector (and hence the residual vector) are indexed by a vector of spatial coordinates, $s_i$, $i = 1, \ldots, n$, then the ordinates of the sample variogram are given by

vi   5 l   ls         s
278. gression models associating u, a vector of f factor effects, with v, a vector of m regression effects, through the model u = Mv where the matrix M contains m regressor variables for each of the f levels of the factor. Direct fitting of the regression effects is facilitated by using the my basis function (mbf function) associating the regressor variables to the levels of the factor, essentially fitting ZMv where Z is the design matrix linking observations to the levels of the factor. But if m is much bigger than f, it is computationally more efficient to fit an equivalent model Zu with a variance structure for u based on $MM'$. ASReml can read the matrix M associated with a factor and group of regressor variables from a .grr file, construct a GRM matrix, $G = MM'/s$, fit the equivalent model and report both factor and regressor predictions. One common case of this model is when u represents genotype effects, the regressors represent SNP marker counts (typically 0/1/2) and v are marker effects.

The .grr file is specified after any pedigree file and before the data file (with any other GRM files). There may only be one .grr file. It is assumed to contain a row for each level of the factor, each row containing m regressor values. Optionally the factor level name associated with the i-th row can be included before the relevant regressor values. Also a heading row might include a name for each field/regressor variable. Superfluous fields before the factor or regress
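The relationship between the regressor matrix M, the factor effects u = Mv and the matrix G = MM'/s can be sketched in plain Python for a toy case. All numbers are hypothetical, and the scaling s is set to the number of markers purely as an assumption for the illustration; the excerpt does not state how s is chosen.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

# Hypothetical marker counts (0/1/2): f = 3 genotypes by m = 4 markers.
M = [[0, 1, 2, 1],
     [1, 1, 0, 2],
     [2, 0, 1, 1]]

# Hypothetical marker effects v (length m) and the implied genotype effects u = Mv.
v = [0.5, -0.2, 0.1, 0.3]
u = [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

# G = MM'/s is only f x f, however many markers there are.
s = len(v)  # assumption for this sketch: scale by the number of markers
MMt = matmul(M, transpose(M))
G = [[value / s for value in row] for row in MMt]

print(u)
print(G)
```

With thousands of markers and a few hundred genotypes, working with the f x f matrix G instead of the m regressor columns is what makes the equivalent model cheaper to fit.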
279. he initial values of a US structure. ASReml tests the adequacy of the reduced parameterization.

causes ASReml to report a general description of the distribution of the data variables and factors and simple correlations among the variables for those records included in the analysis. This summary will ignore data records for which the variable being analysed is missing unless a multivariate analysis is requested or missing values are being estimated. The information is written to the .ass file.

is used to plot the (transformed) data. Use !X to specify the x variable, !Y to specify the y variable and !G to specify a grouping variable. !JOIN joins the points when the x value increases between consecutive records. The grouping variable may be omitted for a simple scatter plot. Omit !Y y to produce a histogram of the x variable.

For example,

    !X age !Y height !G sex

Note that the graphs are only produced in the graphics versions of ASReml (Section 10.3).

Table 5.3 List of commonly used job control qualifiers

qualifier    action

For multivariate repeated measures data, ASReml can plot the response profiles if the first response is nominated with the !Y qualifier and the following analysis is of the multivariate data. ASReml assumes the response variables are in contiguous fields and are equally spaced. For example

    Response profiles
     Treatment !A
     Y1 Y2 Y3 Y4 Y5
    rat.asd !Y Y1 !G Treatment !JOIN
280. he range of the data or ASReml will modify them before they are applied. If you choose to spread them over several lines use a comma at the end of incomplete lines so that ASReml will continue reading values from the next line of input. If the explicit points do not adequately cover the range, a message is printed and the values are rescaled unless !NOCHECK is also specified. Inadequate coverage is when the explicit range does not cover the midpoint of the actual range. See !KNOTS, !PVAL and !SCALE.

reduces the update step sizes of the variance parameters. The default value is the reciprocal of the square root of !MAXIT. It may be set between 0.01 and 1.0. The step size is increased towards 1 each iteration. Starting at 0.1, the sequence would be 0.1, 0.32, 0.56, 1. This option is useful when you do not have good starting values, especially in multivariate analyses.

forms a new group factor, t, derived from an existing group factor, v, by selecting a subset, p, of its variables. A subgroup factor may not be used in a PREDICT or TABULATE directive.

Table 5.4 List of occasionally used job control qualifiers

qualifier    action

!SUBSET t v p    !WMF

forms a new factor, t, derived from an existing factor, v, by selecting a subset, p, of its levels. Missing values are transmitted as missing and records whose level is zero are transmitted as zero. The qualifier occupies its own line after the dataf
281. he required range.

The voltage of 64 regulators was set at 10 setting stations (setstat); between 4 and 8 regulators were set at each station. The regulators were each tested at four testing stations (teststat). The ASReml input file is presented below.

    Voltage data
     teststat 4     # 4 testing stations tested each regulator
     setstat !A     # 10 setting stations each set 4-8 regulators
     regulator 8    # regulators numbered within setting stations
     voltage
    voltage.asd !skip 1
    voltage ~ mu !r idv(setstat) idv(setstat.regulator) idv(teststat) idv(setstat.teststat)
    residual idv(units)

The factor regulator numbers the regulators within each setting station. Thus the term setstat.regulator fits an effect for each regulator, while the other terms examine the effects of the setting and testing stations and possible interaction. The abbreviated output is given below.

    LogL= 188.604  S2= 0.67074E-01  255 df
    LogL= 199.530  S2= 0.59303E-01  255 df
    LogL= 203.007  S2= 0.52814E-01  255 df
    LogL= 203.240  S2= 0.51278E-01  255 df
    LogL= 203.242  S2= 0.51141E-01  255 df
    LogL= 203.242  S2= 0.51140E-01  255 df

    Model_Term                            Gamma          Sigma          Sigma/SE  % C
    idv(TestStat)            IDV_V   4    0.642752E-01   0.328704E-02   0.98      0 P
    idv(Setstat)             IDV_V  10    0.233416       0.119369E-01   1.35      0 P
    idv(TestStat.Setstat)    IDV_V  40    0.101193E-06   0.517501E-08   0.00      0 B
    idv(Regulator.Setstat)   IDV_V  80    0.601817       0.307770E-01   3.64      0 P
    idv(units)               256 effects
    Residual
282. he spatial analysis (45796.58/23.8842 = 1917.442 c.f. 8061.808/4.03145 = 1999.729) are similar but slightly lower, reflecting the gain in accuracy from the spatial analysis. For further reading, see Smith et al. (2001, 2005).

15.7 Unreplicated early generation variety trial - Wheat

To further illustrate the approaches presented in the previous section, we consider an unreplicated field experiment conducted at Tullibigeal situated in south-western NSW. The trial was an S1 (early stage) wheat variety evaluation trial and consisted of 525 test lines which were randomly assigned to plots in a 67 by 10 array. There was a check plot variety every 6 plots within each column. That is, the check variety was sown on rows 1, 7, 13, ..., 67 of each column. This variety was numbered 526. A further 6 replicated commercially available varieties (numbered 527 to 532) were also randomly assigned to plots with between 3 and 5 plots of each. The aim of these trials is to identify and retain the top, say 20%, of lines for further testing. Cullis et al. (1989) considered the analysis of early generation variety trials, and presented a one-dimensional spatial analysis which was an extension of the approach developed by Gleeson and Cullis (1987). The test line effects are assumed random, while the check variety effects are considered fixed. This may not be sensible or justifiable for most trials and can lead to inc
283. heads of other systems.

This guide has 15 chapters. Chapter 1 introduces ASReml and describes the conventions used in this guide. Chapter 2 outlines some basic theory while Chapter 3 presents an overview of the syntax of ASReml through a simple example. Data file preparation is described in Chapter 4 and Chapter 5 describes how to input data into ASReml. Chapters 6 and 7 are key chapters which present the syntax for specifying the linear model and the variance models for the random effects in the linear mixed model. Chapters 8 and 8.3.1 describe special commands for multivariate and genetic analyses respectively. Chapter 9 deals with prediction of linear functions of fixed and random effects in the linear mixed model. Chapter 10 demonstrates running an ASReml job. Chapter 11 describes the merging of data files and Chapter 12 presents the syntax for forming functions of variance components. Chapter 13 gives a detailed explanation of the output files. Chapter 14 gives an overview of the error messages generated in ASReml and some guidance as to their probable cause. The guide concludes with the most extensive chapter which presents the analysis of a range of data examples.

In brief, the improvements in Release 4 associated with input include generating initial values, generating a template to allow an alternative way of presenting parametric information associated with variance structures, new facilities for reading in data files and
284. hen we invoke "marginality" considerations. The issue of marginality between terms in a linear (mixed) model has been discussed in much detail by Nelder (1977). In this paper Nelder defines marginality for terms in a factorial linear model with qualitative factors, but later Nelder (1994) extended this concept to functional marginality for terms involving quantitative covariates and for mixed terms which involve an interaction between quantitative covariates and qualitative factors. Referring to our simple illustrative example above, with a full factorial linear model given symbolically by

    y ~ 1 + A + B + A.B

then A and B are said to be marginal to A.B, and 1 is marginal to A and B. In a three way factorial model given by

    y ~ 1 + A + B + C + A.B + A.C + B.C + A.B.C

the terms A, B, C, A.B, A.C and B.C are marginal to A.B.C. Nelder (1977, 1994) argues that meaningful and interesting tests for terms in such models can only be conducted for those tests which respect marginality relations. This philosophy underpins the following description of the second Wald statistic available in ASReml, the so-called "conditional" Wald statistic. This method is invoked by placing !FCON on the datafile line. ASReml attempts to construct conditional Wald statistics for each term in the fixed dense linear model so that marginality relations are respected. As a simple example, for the three way factorial model the conditional Wald statistics would be computed as
285. hooses an arrangement for plotting the predictions by recognising any covariates and noting the size of factors. However, the user is able to customize how the predictions are plotted by either using options to the !PLOT qualifier or by using the graphical interface. The graphical interface is accessed by typing Esc when the figure is displayed.

The !PLOT qualifier has the following options:

Table 9.2 List of predict plot options

option               action

Lines and data

addData              superimposes the raw data.

addlabels factors    superimposes the raw data with the data points labelled using the given factors (which must not be prediction factors). This option may be useful to identify individual data points on the graph (for instance, potential outliers) or alternatively, to identify groups of data points (e.g. all data points in the same stratum).

addlines factors     superimposes the raw data with the data points joined using the given factors (which must not be prediction factors). This option may be useful for repeated measures data.

noSEs                specifies that no error bars should be plotted (by default, they are plotted).

semult r             specifies the multiplier of the SE used for creating error bars (default 1.0).

joinmeans            specifies that the predicted values should be joined by lines (by default, they are only joined if the
286. iance components models  that is those  linear with respect to variances in H   the terms in Z4 are exact averages of those in  2 14   and  2 15   The basic idea is to use Z4 4       in place of the expected information matrix  in  2 16  to update x     The elements of Z4 are    1    The Z4 matrix is the  scaled  residual sums of squares and products matrix of    y    Y1 Yel  where y  is the    working    variate for k  and is given by  y     HiPy  H R        R  R      Ki    Oy    ZG G t    Ki E Og    where       y     X7     Zu  7 and    are solutions to  2 18   In this form the Al matrix is  relatively straightforward to calculate     The combination of the Al algorithm with sparse matrix methods  in which only non zero  values are stored  gives an efficient algorithm in terms of both computing time and workspace   2 2 2 Estimation  prediction of the fixed and random effects  To estimate T and predict u the objective function   log fy y   u   T  Re   log fulu   G   is used  This is the log joint distribution of  Y   u      Differentiating with respect to 7 and u leads to the mixed model equations  Henderson et  al   1959  Robinson  1991  which are given by    X R  X X R  Z 7       XR  2 18   ZRIX Z R    Z G9  a    ZR y         These can be written as    CB  WR  y    14    2 3 What are BLUPs        where C   W R  W   G   B    r  u l  and     _ l0 0  da   0G     The solution of  2 18  requires values for o  and op  In practice we replace o  and a  by  their REML estimates o  
287. idual matrix. Unfortunately ASReml does not yet have an automatic way of taking the estimates from the univariate analyses and using them in the diagonal analysis. The log-likelihood from this run is -20000-1566.45. Once the model from PATH 1 has run we can rerun the analysis, changing !ARG 1 to !ARG 2, to obtain the next analysis. With the statement !CONTINUE coopmf1.rsv, ASReml generates initial values from the coopmf1.rsv file; if no filename is given ASReml will look for the previous .rsv file to generate initial values. In analysis 2 we get estimates of the sire, dam and litter matrices based on a factor analysis parameterization. This can give better initial values for unstructured matrices and indicate if the estimated matrices are near singularity. The log-likelihood from this run is -20000-1488.11. In this case the dam variance parameters are

  Source     Model  terms   Gamma         Sigma         Sigma/SE  % C
  xfa1(TrDam123).id(dam)    14244 effects
  TrDam123   XFA_V  0 1     0.405222      0.405222      1.30      0 P
  TrDam123   XFA_V  0 2     0.00000       0.00000       0.00      0 F
  TrDam123   XFA_V  0 3     0.616712E-02  0.616712E-02  1.14      0 P
  TrDam123   XFA_L  1 1     1.29793       1.29793       9.05      0 P
  TrDam123   XFA_L  1 2     1.68814       1.68814       9.96      0 P
  TrDam123   XFA_L  1 3     0.124492      0.124492      6.02

and one of the dam specific variances is zero. The resulting dam matrix is

  Covariance/Variance/Correlation Matrix XFA xfa1(TrDam123).id(dam)
  2.090 0.8981 0.7590 0.8981
  2.190 2.845 0.8451 1
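The link between the XFA parameters and the resulting dam matrix can be checked by hand. The sketch below assumes the usual one-factor form $\Lambda\Lambda' + \Psi$, with the XFA_L rows as loadings and the XFA_V rows as specific variances; the decimal points and the E-02 exponent are read from the garbled listing and are therefore themselves an assumption. The function names are hypothetical.

```python
import math

# Values read from the dam variance parameter listing (reconstructed decimals).
loadings = [1.29793, 1.68814, 0.124492]   # XFA_L 1 1 .. 1 3
specific = [0.405222, 0.0, 0.616712e-02]  # XFA_V 0 1 .. 0 3


def xfa1_covariance(loadings, specific):
    """Return Lambda Lambda' + diag(Psi) for a single-factor XFA model."""
    n = len(loadings)
    return [[loadings[i] * loadings[j] + (specific[i] if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def correlation(cov, i, j):
    """Correlation implied by a covariance matrix."""
    return cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])


dam_cov = xfa1_covariance(loadings, specific)
for row in dam_cov:
    print(" ".join(f"{value:8.4f}" for value in row))
```

Under these assumptions the first variance, the first covariance and the correlations between traits 1 and 2 and between traits 1 and 3 agree with the printed 2.090, 2.190, 0.8981 and 0.7590 to about three decimal places.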
288. iduals are non-parents and have no progeny and there is interest in predictions for parents alone. This can happen in large forestry trials. The reduced animal model expresses the non-parent genetic effect in terms of parent effects and a Mendelian sampling term that is combined with the residual effect. We consider the case when there is data on parents and non-parents and some individuals are inbred.

An example tree model for a single trait and a single site might be

  DBH ~ mu !r nrmv(tree) plot ar1v(column).ar1(row)
  residual idv(units)

since trees are often planted in plots of, say, 5 trees. This is a spatial analysis; the idv(units) term is required so that error variance is not transferred to the nrmv(tree) term, since trees are unreplicated.

This analysis requires a pedigree file, say TreePed.csv, and if the !DIAG qualifier is specified on the pedigree line, the resulting .aif file will contain the inbreeding level for every tree in the pedigree, the diagonal of the $A^{-1}$ matrix and an N/P code distinguishing parents (with progeny) from non-parents (without progeny).

To analyse the data using the RAM, we need to incorporate these last two columns into the data file, which can be done with the !MERGE statement. If there is data on parents, further processing of the data file is required: create a copy of the "tree" field, call it say "parent", and change it to "0" fo
289. ie. scaled identity).

7.6.2 Switching from the gamma to the sigma parameterization

ASReml uses the gamma parameterization by default for univariate single-section analyses (see above). However, !SIGMAP is a new qualifier with Release 4 that enables the user to force ASReml to use the sigma parameterization in this case. This is achieved by placing !SIGMAP immediately after the dependent variable and before ~ on the model definition line. For example,

  yield !SIGMAP ~ mu variety !r idv(repl) !f mv
  residual idv(units)

would force ASReml to use the sigma parameterization in NIN example 3a (see Section 7.5).

Table 7.3 gives the variance model specification for each of the six NIN examples (column 3), the individual terms in $G(\sigma_g)$ and $R(\sigma_r)$ under the sigma parameterization (column 4), the sigmas that are estimated under this parameterization (column 5), the individual terms in $G(\gamma_g)$ and $R(\gamma_r)$ under the gamma parameterization (column 6) and the gammas that are estimated under this parameterization (column 7).

Table 7.3: G structure for the random terms (magenta) and R structure for the residual error term (cyan) under both the sigma and gamma parameterizations, and the corresponding sigma(s) (gamma(s)) under each parameterization for the series of NIN data examples

no. | definition | variance model specification | sigma parameterization: $G(\sigma_g)$, $R(\sigma_r)$ | sigma(s) | gamma parameterization: $G(\gamma_g)$, $R(\gamma_r)$ | gamma(s)
1 | RCB analysis | residual idv(units)
290. ier used on the top job control line  Detailed descriptions follow     194    10 3 Command line options       Table 10 1  Command line options       option    qualifier    type    action       Frequently used command line options    C    N    Ww    Other command line options    Bb    Gg  Hg    Rr    Ss    Yv      CONTINUE      FINAL      LOGFILE    NOGRAPHS      WORKSPACE w     ARGS a      ASK  IBRIEF b     DEBUG  DEBUG 2  IGRAPHICS g    HARDCOPY g      INTERACTIVE    ONERUN     OUTFOLDER  NA     QUIET      RENAME    NA      YVAR v    NA      XML    job control  job control    screen output  graphics    workspace    job control    job control  output control  debug   debug  graphics    graphics    graphics   job control  output control  post processing  graphics    job control    workspace    job control    license    output control    continue iterations using previous estimates as initial  values    continue for one more iteration using previous esti   mates as initial values    copy screen output to basename asl  suppress interactive graphics    set workspace size to w Mbyte    to set arguments  a  in job rather than on command  line   prompt for options and arguments   reduce output to  asr file   invoke debug mode   invoke extended debug mode    set interactive graphics device    set interactive graphics device   graphics screens not displayed    display graphics screen   override rerunning requested by  RENAME  changes output folder   calculation of functions of varianc
291. ies are based on the same records as are used in the analysis of the model fitted in the same run. In particular, it will ignore records that exist in the data file but were dropped as the data was read into ASReml, either explicitly using !DV or implicitly because the dependent variable had missing values. Multiple tabulate statements are permitted either immediately before or after the linear model. If a linear (mixed) model is not supplied, tabulation is based on all records.

The tabulate statement has the form

  tabulate response_variables [!WT weight !COUNT !DECIMALS [d] !SD !RANGE !STATS !FILTER filter !SELECT value] ~ factors

• tabulate is the directive name, appearing on a new line,
• response_variables is a list of variates for which means are required,
• !WT weight nominates a variable containing weights,
• !COUNT requests counts as well as means to be reported,
• !DECIMALS [d] (1 < d < 7) requests means be reported with d decimal places. If omitted, ASReml reports 5 significant digits; if specified without an argument, 2 decimal places are reported,
• !RANGE requests the minimum and maximum of each cell be reported,
• !SD requests the standard deviation within each cell be reported,
• !STATS is shorthand for !COUNT !SD !RANGE,
• !FILTER filter nominates a factor for selecting a portion of the data,
• !SELECT value indicates that only records with value in the filter column are to be included,
• facto
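What the tabulate directive reports can be pictured with a short sketch. This is a hypothetical Python illustration of cell means with the optional count, standard deviation, range and filter/select behaviour described above; it is not ASReml code, it ignores weights, and the use of the sample standard deviation is an assumption. All names and data are invented for the example.

```python
from collections import defaultdict
from statistics import mean, stdev


def tabulate(records, response, factors, filter_col=None, select=None):
    """Summarise `response` within each cell defined by `factors`."""
    cells = defaultdict(list)
    for rec in records:
        # Mimic !FILTER/!SELECT: keep only records with the selected value.
        if filter_col is not None and rec[filter_col] != select:
            continue
        value = rec[response]
        if value is None:  # records with a missing response are dropped
            continue
        cells[tuple(rec[f] for f in factors)].append(value)
    table = {}
    for key, values in sorted(cells.items()):
        table[key] = {
            "mean": mean(values),
            "count": len(values),
            "sd": stdev(values) if len(values) > 1 else None,
            "range": (min(values), max(values)),
        }
    return table


records = [
    {"variety": "A", "site": 1, "yield": 10.0},
    {"variety": "A", "site": 1, "yield": 14.0},
    {"variety": "B", "site": 1, "yield": 20.0},
    {"variety": "B", "site": 2, "yield": 30.0},
    {"variety": "B", "site": 1, "yield": None},
]
table = tabulate(records, "yield", ["variety"], filter_col="site", select=1)
```

Here the record from site 2 is excluded by the filter and the record with a missing yield is ignored, so variety B is summarised from a single value.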
292. ies arranged as a grid of 4 rows by 24 columns (rows are replicates), a first order separable autoregressive spatial variance structure for the residuals can be specified by the consolidated model term ar1(column).ar1(row), where column and row are the appropriate columns in the data file. However, the number of data units must be the product of the number of levels for row and the number of levels for column, 96 in this case. If this is not the case, or if more than one unit is associated with some row-column combination, ASReml will return an error message and it will not be possible to use ar1(column).ar1(row) for residual error. If there are fewer than 96 units and each row-column combination present is associated with one unit, then the !COLUMNFACTOR/!ROWFACTOR data file qualifiers (see Table 5.2) can be used to augment the data by completing the grid to allow an appropriate analysis.

These rules will always be satisfied for a single section of data with IID errors, that is, $R = R_1 = \sigma^2 I$ (see Example 2.2), defined either by default (ie. with no residual specified) or in terms of the units factor. However, a mismatch in both size and ordering is possible when either multiple sections are present, as in multi-environment trial (MET) analysis, or when non-identity variance model functions are used.

7.3.2 Using sat() to specify the residual model term for data with sections

S
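The two requirements above (a complete grid with one unit per cell, and a separable AR1 x AR1 structure) can be sketched in a few lines. This is a hypothetical illustration, not ASReml's internals: the correlation values are invented, the residual correlation matrix is formed as a Kronecker product of two AR1 correlation matrices, and units are assumed to be ordered as rows within columns.

```python
def ar1_corr(n, rho):
    """AR1 correlation matrix: element (i, j) is rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a_ij * b_kl for a_ij in a_row for b_kl in b_row]
            for a_row in a for b_row in b]


def grid_is_complete(units, n_row, n_col):
    """True if every (row, column) combination occurs exactly once."""
    expected = [(r, c) for r in range(1, n_row + 1) for c in range(1, n_col + 1)]
    return sorted(units) == expected


def unit_index(col, row, n_row):
    """0-based position of a unit when rows are nested within columns."""
    return col * n_row + row


n_row, n_col = 4, 24
units = [(r, c) for c in range(1, n_col + 1) for r in range(1, n_row + 1)]

rho_col, rho_row = 0.5, 0.2  # illustrative values only
residual_corr = kron(ar1_corr(n_col, rho_col), ar1_corr(n_row, rho_row))
```

Under the separable model the correlation between two units is the product of their column and row correlations, so units two columns and one row apart have correlation 0.5^2 x 0.2 = 0.05 in this example.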
293. if provided, the initial values are for the lower triangle of the (symmetric) matrix specified row-wise,
• finding reasonable initial values can be a problem. When no initial values are provided (as in code box), ASReml takes half of the phenotypic variance matrix of the data as an initial value.

  GFW FDIAM ~ Trait Trait.YEAR !r us(Trait).id(TEAM) us(Trait).id(SheepID)
  residual id(units).us(Trait !GP)

Since the variance component matrices for the TEAM and SheepID strata are not specified, ASReml will plug in values derived from the observed phenotypic variance matrix. !GP requests that the resulting estimated matrix be kept within the parameter space, ie. it is to be positive definite.

The special qualifiers relating to multivariate analysis are !ASUV and !ASMV t; see Table 5.4 for details.

• to use an error structure other than US for the residual stratum you must also specify !ASUV (see Table 5.4) and include mv in the model if there are missing values,
• to perform a multivariate analysis when the data have already been expanded use !ASMV t (see Table 5.4),
• t is the number of traits that ASReml should expect,
• the data file must have t records for each multivariate record, although some may be coded missing.

Note that, if no residual line is inserted, the id(units).us(Trait) variance structure is assumed for multivariate data.

Command file: Genetic analysis

8.4 Introductio
294. ile line but before the linear model, e.g.

  !SUBSET EnvC Env 3 5 8 9:15 21 33

defines a reduced form of the factor Env just selecting the environments listed. It might then be used in the model in an interaction. A subset factor can be used in a TABULATE directive but not in a PREDICT directive.

The intention is to simplify the model specification in MET (Multi-Environment Trials) analyses where, say, Column effects are to be fitted to a subset of environments. It may also be used on the intrinsic factor Trait in a multivariate analysis provided it correctly identifies the number of levels of Trait, either by including the last trait number or by appending sufficient zeros. Thus, if the analysis involves 5 traits,

  !SUBSET Trewe Trait 1 3 4 0 0

sets hardcopy graphics file type to .wmf.

Table 5.5: List of rarely used job control qualifiers

qualifier            action

!AILOADINGS n        controls modification to AI updates of loadings in extended Factor Analytic models. After ASReml calculates updates for variance parameters, it checks whether the updates are reasonable and sometimes reduces them over and above any !STEPSIZE shrinkage. The extra shrinkage has two levels. Loadings that change sign are restricted to doubling in magnitude, and if the average change in magnitude of loadings is greater than 10-fold, they are all shrunk back.
                     Unless the user gives constraints, ASReml sets them and rotates the loadings each iteration
!AISINGULARITIES
295. imate prediction  variance matrix corresponding to the dense portion  It is only written if the   VRB qualifier is  specified  The file is formatted for reading back for post processing  The number of equations  in the dense portion can be increased  to a maximum of 800  using the   DENSE option     Table  5 5  but not to include random effects  The matrix is lower triangular row wise in the order  that the parameters are printed in the  sln file  It can be thought of as a partitioned lower  triangular matrix     g     B  pa  where B  is the dense portion of B and C     is the dense portion of C    This is part of  nin89a vrb  Note that the first element is the estimated error variance  that is  48 6802   see the variance component estimates in the  asr output     0 487026E 02  0 000000E 00 0 O000000E 00 0 298409E 01 0  000000E 00  0 807354E 01 0 470629E 01 0 Q000000E 00 0 456542E 01 0 886497E 01   0 315807E 00 0 000000E 00 0 409951E 01 0 476481E 01 0 876563E 01  0 295379E 01 0 Q000000E 00 0 343250E 01 0 389543E 01 0 416076E 01  0 743440E 01 0 163089E 01 0 000000E 00 0 377085E 01 0 428016E 01  0 472451E 01 0 402633E 01 0 837086E 01 0 129027E 01 O   000000E 00  0 329974E 01 0 347377E 01 0 357535E 01 0 316846E 01 0 412043E 01  0 768099E 01 0 309076E 00 0 000000E 00 0 376552E 01 0 419706E 01  0 395640E 01 0 383367E 01 0 458364E 01 0 378483E 01 0 984962E 01  0 226400E 01 0 Q000000E 00 0 379190E 01 0 442373E 01 0 439411E 01  0 402430E 01 0 440457E 01 0 362313E 01 0 502025E 01 0 90
296. imates  variety BRULE 2 984 2 841   variety REDLAND 4 706 2 977   variety CODY  0 3158 2 961   variety ARAPAHOE 2 954 Bahay   variety NE87615 1 033 2 934   variety NE87619 5 937 2 849   variety NE87627  4 378 2997   mu 1 24 09 2 465 intercept  mv_estimates 1 21 91 6 731 missing value estimates  mv_estimates 2 23 22 6 723   mv_estimates 3 22 52 Gril   mv_estimates 4 23 49 6 678   mv_estimates 5 2221 6 700   mv_estimates 6 24 47 6 709   mv_estimates T 20 14 6 699   mv_estimates 8 25 01 6 693    223    13 3 Key output files       mv_estimates  mv_estimates  mv_estimates  mv_estimates  mv_estimates  mv_estimates  mv_estimates  mv_estimates  mv_estimates    mv_estimates    9 24 29 6 678  10 26 30 6 660  11 24 99 6 592  IZ 2 ld 6 493  13 25 39 6 305  14 26 81 5 900  15 29 07 4 906  16 23 97 4 577  17 24 27 4 618  18 29 82 4 532    13 3 3 The  yht file    The  yht file contains the predicted values of the data in the original order  this is not  changed by supplying row column order in spatial analyses   the residuals and the diagonal  elements of the hat matrix  Figure 13 1 shows the residuals plotted against the fitted values   Yhat  and a line printer version of this figure is written to the  res file  Where an observa   tion is missing  the residual  missing values predicted value and Hat value are also declared  missing  The missing value estimates with standard errors are reported in the  s1n file     NIN alliance trial 1989 Residuals vs Fitted values       Residuals      24 8
297. ime with the finishing time.

The execution times for parts of the iteration process are written to the .asl file if the !DEBUG !LOGFILE command line qualifiers are invoked.

If !BRIEF -1 is invoked, the effects that were included in the dense portion of the solution are also printed in the .asr file with their standard error, a t-statistic for testing that effect and a t-statistic for testing it against the preceding effect in that factor.

placed in the .pvc file when postprocessing with a .pin file

and graphics file: given if the !DL command line option is used.

For non-spatial analyses ASReml prints the slope of the regression of log(abs(residual)) against log(predicted value). This regression is expected to be near zero if the variance is independent of the mean. A power of the mean data transformation might be indicated otherwise. The suggested power is approximately 1 - b, where b is the slope. A slope of 1 suggests a log transformation. This is indicative only and should not be blindly applied. Weighted analysis or identifying the cause of the heterogeneity should also be considered. This statistic is not reliable in genetic animal models or when units is included in the linear model, because then the predicted value includes some of the residual.

Table 13.2: Table of output objects and where to find them in ASReml

output object | found in | comment
observed
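The variance-heterogeneity diagnostic described above is an ordinary least-squares slope, so it is easy to reproduce outside ASReml. The sketch below is a minimal illustration with invented data; the function name is hypothetical, and zero residuals or non-positive fitted values are simply skipped, which is an assumption rather than documented ASReml behaviour.

```python
import math


def variance_mean_slope(residuals, fitted):
    """Slope b of the regression of log|residual| on log(fitted value)."""
    pairs = [(math.log(f), math.log(abs(r)))
             for r, f in zip(residuals, fitted)
             if r != 0 and f > 0]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / sxx


# Invented data in which |residual| is proportional to the fitted value.
fitted = [1.0, 2.0, 4.0, 8.0]
residuals = [0.1, -0.2, 0.4, -0.8]

slope = variance_mean_slope(residuals, fitted)
suggested_power = 1.0 - slope
```

Because the residual magnitude here grows in proportion to the fitted value, the slope is 1 and the suggested power is 0, which corresponds to the log transformation mentioned in the text.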
298. in terms of the subset parameters (5, 10, 14, 18 and 19) can be introduced by editing the RP_GN and RP_scale columns. Some users would prefer to insert initial values into this .tsv file under the Initial_value column. As an example, the file below contains values based on using 4.8, 26, 70, 35 and 70 for parameters 5, 10, 14, 18 and 19. The data values in the .tsv file become

  # GN, Term, Type, PSpace, Initial_value, RP_GN, RP_scale
   5, units.us(Trait);us(Trait)_1,  G, P, 4.8,  5, 1.0000
   6, units.us(Trait);us(Trait)_2,  G, P, 4.8,  5, 1.0000
   7, units.us(Trait);us(Trait)_3,  G, P, 9.6,  5, 2.0000
   8, units.us(Trait);us(Trait)_4,  G, P, 4.8,  5, 1.0000
   9, units.us(Trait);us(Trait)_5,  G, P, 9.6,  5, 2.0000
  10, units.us(Trait);us(Trait)_6,  G, P, 26,  10, 1.0000
  11, units.us(Trait);us(Trait)_7,  G, P, 4.8,  5, 1.0000
  12, units.us(Trait);us(Trait)_8,  G, P, 9.6,  5, 2.0000
  13, units.us(Trait);us(Trait)_9,  G, P, 26,  10, 1.0000
  14, units.us(Trait);us(Trait)_10, G, P, 70,  14, 1.0000
  15, units.us(Trait);us(Trait)_11, G, P, 4.8,  5, 1.0000
  16, units.us(Trait);us(Trait)_12, G, P, 9.6,  5, 2.0000
  17, units.us(Trait);us(Trait)_13, G, P, 26,  10, 1.0000
  18, units.us(Trait);us(Trait)_14, G, P, 35,  18, 1.0000
  19, units.us(Trait);us(Trait)_15, G, P, 70,  19, 1.0000

Sometimes users wish to rerun a job making changes to the final values, par
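The way the RP_GN and RP_scale columns tie parameters together can be sketched as follows. This is a hypothetical reading of the listing, assuming that each parameter takes the value RP_scale times the value of the parameter named in RP_GN; the helper name and data layout are invented for the example.

```python
# (GN, RP_GN, RP_scale) triples taken from the listing above.
rows = [
    (5, 5, 1.0), (6, 5, 1.0), (7, 5, 2.0), (8, 5, 1.0), (9, 5, 2.0),
    (10, 10, 1.0), (11, 5, 1.0), (12, 5, 2.0), (13, 10, 1.0), (14, 14, 1.0),
    (15, 5, 1.0), (16, 5, 2.0), (17, 10, 1.0), (18, 18, 1.0), (19, 19, 1.0),
]

# Values supplied for the subset of "parent" parameters.
parent_values = {5: 4.8, 10: 26.0, 14: 70.0, 18: 35.0, 19: 70.0}


def resolve_initial_values(rows, parent_values):
    """Initial value of each parameter = RP_scale * value of its RP_GN parent."""
    return {gn: scale * parent_values[rp_gn] for gn, rp_gn, scale in rows}


initial_values = resolve_initial_values(rows, parent_values)
```

For example, parameter 7 points at parameter 5 with scale 2.0, giving 2.0 x 4.8 = 9.6, which matches the Initial_value column.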
299. in this context is to hold the previously estimated factor loadings fixed for a few iterations so that the factor k + 1 initially aims to explain variation previously incorporated in $\psi$. Then allow all loadings to be updated in the remaining rounds. A second problem, at present unresolved but somewhat improved, is that sometimes the LogL rises to a relatively high value and then drifts away.

In an attempt to make the process easier, these two processes have been linked as an additional meaning for the !AILOADING n qualifier. When fitting k factors with n > k, the first k - 1 loadings are held fixed (no rotation) for the first k iterations. Then for iterations k + 1 to n, loadings vectors are updated in pairs, and rotated. If !AILOADING is not set by the user and the model is an upgrade from a lower order XFA, !AILOADING is set to 4.

The problem of XFA loadings going off scale has been reduced by adding a variable penalty to the loading part of the AI matrix.

It is not unusual for users to have trouble comprehending and fitting extended factor analytic models, especially with more than two factors. Two examples are developed in a separate document available on request.

7.12 Variance models available in ASReml

Table 7.6: Details of the variance models available in ASReml

variance structure name | description | algebraic form | number of parameters: corr / hom variance / het variance
model variance
300. ing workspace. Otherwise send problem to VSN.

14.5 Information, Warning and Error messages

Table 14.3: Alphabetical list of error messages and probable cause(s)/remedies

error message | probable cause/remedy

PROGRAMMING error | indicates ASReml has failed deep in its core. It is likely to be an interaction between the data and the variance model being fitted. Try increasing the memory, simplifying the model and changing starting values for the gammas. If this fails, send the problem to VSN (support@asreml.co.uk) for investigation.
reading !SELF option | Check the argument.
Reading distances for POWER structure | POWER structures are the spatial variance models which require a list of distances. Distances should be in increasing order. If the distances are not obtained from variables, the "SORT" field is zero and the distances are presented after all the R and G structures are defined.
Reading factor names | something is wrong in the terms definitions. It could also b
reading Overdispersion factor |
READING OWN structures |
Reading the data |
Reading Update step size |
Residual Variance is Zero |
R header SECTIONS DIMNS GSTRUCT |
R structure header SITE DIM GSTRUCT |
Variance header SEC DIM GSTRUCT |
R structure error ORDER SORTCOL MODEL GAMMAS |
R structures are larger than number of records |
REQUIRE !ASUV qualifier for this R structure |
REQUIRE I x E R structure |
Scratch |
301. into V99 and changes any missing values in V99 to zero. It then adds V98 and discards the whole record if the result is zero, i.e. both YA and YB have missing values for that record. Variables 98 and 99 are not labelled and so are not retained for subsequent use in analysis.

5.5.4 Special note on covariates

Covariates are variates that appear as independent variables in the model. It is recommended that covariates be centred and scaled to have a mean near zero and a variance of about one to avoid failure to detect singularities. This can be achieved either

• externally to ASReml in data file preparation,
• using !RESCALE -mean scale, where mean and scale are user supplied values, for example
    age !rescale -140 .142857 # in weeks

5.6 Datafile line

The purpose of the datafile line is to
• nominate the data file,
• specify qualifiers to modify
  - the reading of the data,
  - the output produced,
  - the operation of ASReml.

  NIN Alliance Trial 1989
   variety !A
   ...
   row 22
   column 11
  nin89aug.asd !skip 1
  yield ~ mu variety

5.6.1 Data line syntax

The datafile line appears in the ASReml command file in the form

  datafile [qualifiers]

• datafile is the path name of the file that contains the variates, factors, covariates, traits (response variates) and weight variables represented as data fields (see Chapter 4); enclose the path name in quotes if it contains embedded blanks,
• the qualifiers tell ASRe
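The effect of rescaling a covariate can be sketched numerically. The example below assumes that a rescale with offset o and scale s computes (v + o) * s, which matches the "-140 .142857" example (subtract 140 days, then divide by 7 to give weeks); that formula is an assumption here, and the function names and data are hypothetical.

```python
from statistics import mean, pstdev


def rescale(values, offset, scale):
    """Apply (v + offset) * scale to every value (assumed rescale rule)."""
    return [(v + offset) * scale for v in values]


def centring_constants(values):
    """Offset and scale giving a mean near zero and a variance near one."""
    return -mean(values), 1.0 / pstdev(values)


ages_days = [133.0, 140.0, 147.0]

# Centre at 140 days and express in weeks, as in the example line.
ages_weeks = rescale(ages_days, -140.0, 0.142857)

# Alternatively derive the constants from the data themselves.
offset, scale = centring_constants(ages_days)
standardised = rescale(ages_days, offset, scale)
```

The first call maps 133, 140 and 147 days to roughly -1, 0 and 1 weeks; the second shows how constants giving mean zero and unit variance could be computed during data preparation.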
302. ion algorithm for factor analytic and reduced rank variance models, Australian and New Zealand Journal of Statistics 45, 445-459.

Verbyla, A. P. (1990). A conditional derivation of residual maximum likelihood, Australian Journal of Statistics 32, 227-230.

Verbyla, A. P., Cullis, B. R., Kenward, M. G. and Welham, S. J. (1999). The analysis of designed experiments and longitudinal data by using smoothing splines (with discussion), Applied Statistics 48, 269-311.

Waddington, D., Welham, S. J., Gilmour, A. R. and Thompson, R. (1994). Comparisons of some glmm estimators for a simple binomial model, Genstat Newsletter 30, 13-24.

Welham, S. J. (2005). Glmm fits a generalized linear mixed model, in R. Payne and P. Lane (eds), GenStat Reference Manual 3: Procedure Library PL17, VSN International, Hemel Hempstead, UK, pp. 260-265.

Welham, S. J., Cullis, B. R., Gogel, B. J., Gilmour, A. R. and Thompson, R. (2004). Prediction in linear mixed models, Australian and New Zealand Journal of Statistics 46, 325-347.

Wolfinger, R. D. (1996). Heterogeneous variance-covariance structures for repeated measures, Journal of Agricultural, Biological, and Environmental Statistics 1, 362-389.

Wolfinger, R. and O'Connell, M. (1993). Generalized linear mixed models: A pseudo-likelihood approach, Journal of Statistical Computation and Simulation 48, 233-243.

Yates, F. (1935). Complex experiments, Journal of the Royal Statistical Society, Series B 2, 181-247.
303. ion are used in forming that variance structure.

Often sections relate to sites (or trials or experiments) in the case where several related trials are analysed together. For example, consider a MET dataset comprising data for three sites. To model the residuals at each site by a separate AR1xAR1 variance structure, we could write

  residual sat(site).ar1v(column).ar1(row)

Alternatively, an AR1xAR1 variance structure for sites 1 and 3, but an IDVxAR1 structure for site 2, could be coded using sat either as

  residual sat(site,1).ar1v(column).ar1(row) sat(site,2).idv(column).ar1(row) sat(site,3).ar1v(column).ar1(row)

or, more succinctly, as

  residual sat(site,1,3).ar1v(column).ar1(row) sat(site,2).idv(column).ar1(row)

For each of these definitions, ASReml will determine the particular levels in row and column for each site and hence the appropriate sizes of the AR1 matrices.

Important point: A variance structure needs to be specified for every level of the sectioning factor, so that

  residual sat(site,1,3).ar1(row).ar1(column)

would fail as there is no variance structure specified for site 2.

7.4 Identifiability

Once all variables have a variance model function applied, ASReml attempts to determine whether the term is identifiable, that is, whether its terms can be separately estimated from (are not confounded with) other terms in the model. If the cons
304. ions. ASSIGN strings and commandline arguments may substitute into a !CYCLE line.

• I, J, K and L are reserved as names referring to items in the !CYCLE list and should therefore not be used as names of an ASSIGN string.

10.4 Advanced processing arguments

High level qualifiers

qualifier                  action

!CYCLE [!SAMEDATA] list    is a mechanism whereby ASReml can loop through a series of jobs. The !CYCLE qualifier has a qualifier !SAMEDATA that tells ASReml to use the same data for all cycles, ie. the data file is only read on the first cycle and is kept in memory for later cycles. The !CYCLE qualifier must appear on its own line. list is a series of values which are substituted into the job wherever the $I string appears. The list may spread over several lines if each incomplete line ends with a COMMA. A series of sequential integer values can be given in the form i:j (no embedded spaces). The output from the set of runs is concatenated into a single set of files, but the output written to the .asr file is slightly abbreviated after the first cycle by suppressing the data summary and fixed effect solutions that might otherwise appear (see !BRIEF; the !BRIEF qualifier is set after the first cycle).
                           For example,
                             !CYCLE 0.4 0.5 0.6
                             20 0 mat2 1.9 $I !GPF
                           would result in three runs and the results would be appended to a single file. Putting !SAMEDATA on the (leading) !CYCLE line makes ASReml read the data, and
!DOPATH n
!DOPART n
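The substitution performed by a cycle list can be mimicked in a few lines. This is a hypothetical sketch of the idea, not ASReml's parser: it assumes the placeholder is the string $I, that list items are separated by blanks (with a trailing comma marking continuation) and that a token of the form i:j expands to the integers i to j inclusive. The function names are invented.

```python
def expand_cycle_list(text):
    """Expand a cycle list, turning tokens such as 1:3 into 1 2 3."""
    items = []
    for token in text.replace(",", " ").split():
        if ":" in token:
            start, stop = (int(part) for part in token.split(":"))
            items.extend(str(value) for value in range(start, stop + 1))
        else:
            items.append(token)
    return items


def run_cycles(template, cycle_text):
    """Return one copy of the job line per list item, with $I substituted."""
    return [template.replace("$I", item) for item in expand_cycle_list(cycle_text)]


jobs = run_cycles("20 0 mat2 1.9 $I !GPF", "0.4 0.5 0.6")
for job in jobs:
    print(job)
```

Each element of `jobs` corresponds to one of the three runs in the example, with 0.4, 0.5 and 0.6 substituted in turn.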
305. ions

• should be given for all fields in the data file; fields can be skipped, and fields on the end of a data line without a field definition are ignored; if there are not enough data fields on a data line, the remainder are taken from the next line(s),
• must be presented in the order they appear in the data file,
• can appear with other definitions on the same line,
• data fields can be transformed (see below),
• additional variables can be created by transformation qualifiers.

  NIN Alliance Trial 1989
   variety !A
   id
   pid
   raw
   repl 4
   nloc
   yield
   lat
   long
   row 22
   column 11
  nin89aug.asd !skip 1
  yield ~ mu variety

5.4.1 Data field definition syntax

Data field definitions appear in the ASReml command file in the form

  SPACE label [field_type] [transformations]

• SPACE
  - is now optional,
• label
  - is an alphanumeric string to identify the field,
  - has a maximum of 31 characters although only 20 are ever printed/displayed,
  - must begin with a letter,
  - must not contain special characters,
  - reserved words (Table 6.1 and Table 7.6) must not be used,
  - !CSKIP [c] can be used to skip c (default 1) data fields,
• field_type defines how a variable is interpreted as it is read and whether it is regarded as a factor or variate if specified in the linear model,
  - for a variate, leave field_type blank or specify 1,
306. ip matrix

Sometimes a relationship matrix is required other than the one ASReml can produce from the pedigree file. We call this a GRM (General Relationship Matrix). The inverse of a GRM is a GIV matrix. The user can provide the relationship matrix in a .grm file and ASReml will invert it to form the GIV matrix (since it is the inverse that is used in the mixed model equations). Alternatively, the user can provide a .giv file containing the inverted GRM matrix.

The syntax for specifying a GRM file (say name.grm) or the GIV file (say name.giv) is

  name.[s|d]grm [!SKIP n] [!DENSEGRM [o]] [!GROUPDF n] [!ND | !PSD | !NSD] [!PRECISION n]
or
  name.[s|d]giv [!SKIP n] [!DENSEGIV [o]] [!GROUPDF n] [!SAVEGIV f]

• the named file must have a .giv, .grm, .sgiv, .sgrm, .dgiv or .dgrm extension,
• .sgiv and .sgrm files are binary format files and will be read lower triangle row-wise assuming single precision,
• .dgiv and .dgrm files are binary format files and will be read lower triangle row-wise assuming double precision,
• the named file will be read assuming single/double precision lower triangle row-wise,
• the G (inverse) files must be specified on the line(s) immediately prior to the data file line, after any pedigree file,
• up to 98 G (inverse) matrices may be defined,
• the file must be in SPARSE format unless the !DENSE qualifier is specified,
• a dense format fil
307. is simply the order those treatment labels were discovered in the data file.

  Split plot analysis - oat Variety.Nitrogen      14 Apr 2008 16:15:49
  oats

  Ecode is E for Estimable, * for Not Estimable

  The predictions are obtained by averaging across the hypertable
  calculated from model terms constructed solely from factors
  in the averaging and classify sets.
  Use !AVERAGE to move ignored factors into the averaging set.

  - - - - - - - - - - - - - 1 - - - - - - - - - - - - -
  Predicted values of yield
  The SIMPLE averaging set: variety
  The ignored set: blocks wplots
  nitrogen      Predicted_Value  Standard_Error  Ecode
  0.6_cwt              123.3889          7.1747  E
  0.4_cwt              114.2222          7.1747  E
  0.2_cwt               98.8889          7.1747  E
  0_cwt                 79.3889          7.1747  E
  SED: Overall Standard Error of Difference 4.436

  - - - - - - - - - - - - - 2 - - - - - - - - - - - - -
  Predicted values of yield
  The SIMPLE averaging set: nitrogen
  The ignored set: blocks wplots
  variety       Predicted_Value  Standard_Error  Ecode
  Marvellous           109.7917          7.7975  E
  Victory               97.6250          7.7975  E
  Golden_rain          104.5000          7.7975  E
  SED: Overall Standard Error of Difference 7.079

  - - - - - - - - - - - - - 3 - - - - - - - - - - - - -
  Predicted values of yield
  The ignored set: blocks wplots
  nitrogen  variety       Predicted_Value  Standard_Error  Ecode
  0.6_cwt   Marvellous           126.8333          9.1070  E
  0.6_cwt   Victory              118.5000          9.1070  E
  0.6_cwt   Golden_rain          124.8333          9.1070  E
  0.4_cwt   Marvellous           117.1667          9.1070  E
  0.4_cwt   Victory              110.8333          9.1070  E
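The statement that predictions are obtained by averaging across the hypertable can be verified from the printed values. The sketch below uses only the three 0.6_cwt cells of the nitrogen by variety table that are visible in the listing and averages them with equal weights (the SIMPLE averaging set); the function name is hypothetical.

```python
# Cell predictions copied from the nitrogen.variety table (0.6_cwt rows only).
cell_predictions = {
    ("0.6_cwt", "Marvellous"): 126.8333,
    ("0.6_cwt", "Victory"): 118.5000,
    ("0.6_cwt", "Golden_rain"): 124.8333,
}


def simple_average(cells, nitrogen_level):
    """Equal-weight average over varieties for one nitrogen level."""
    values = [value for (nitrogen, _), value in cells.items()
              if nitrogen == nitrogen_level]
    return sum(values) / len(values)


nitrogen_mean = simple_average(cell_predictions, "0.6_cwt")
print(f"{nitrogen_mean:.4f}")
```

The result agrees with the 123.3889 reported for 0.6_cwt in the first table, up to the rounding of the printed cell values.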
308. is TAB separated: .yht becomes _yht.txt
!YHTFORM 2 is COMMA separated: .yht becomes _yht.csv
!YHTFORM 3 is Ampersand separated: .yht becomes _yht.tex

adds r to the total Sum of Squares. This might be used with !DF to add some variance to the analysis when analysing summarised data.

Figure 5.1: Variogram in 4 sectors for Cashmore data

Table 5.6: List of very rarely used job control qualifiers

qualifier        action

!CINV n          prints the portion of the inverse of the coefficient matrix pertaining to the n-th term in the linear model. Because the model has not been defined when ASReml reads this line, it is up to the user to count the terms in the model to identify the portion of the inverse of the coefficient matrix to be printed. The option is ignored if the portion is not wholly in the SPARSE stored equations. The portion of the inverse is printed to a file with extension .cii. The sparse form of the matrix only is printed, in the form i j $C^{ij}$; that is, elements of $C^{-1}$ that were not needed in the estimation process are not included in the file.
!FACPOINTS n     affects the number of distinct points recognised by the fac() model function (Table 6.1). The
309. is keeping the estimated variance matrix positive definite. These are not simple issues and in the following we present a pragmatic approach to them.

The data are taken from a large genetic study on Coopworth lambs. A total of 5 traits, namely weaning weight (wwt), yearling weight (ywt), greasy fleece weight (gfw), fibre diameter (fdm) and ultrasound fat depth at the C site (fat), were measured on 7043 lambs. The lambs were the progeny of 92 sires and 3561 dams, produced from 4871 litters over 49 flock-year combinations. Not all traits were measured on each group. No pedigree data was available for the dams.

The aim of the analysis is to estimate heritability ($h^2$) of each trait and to estimate the genetic correlations between the five traits. We will present two approaches: a half-sib analysis and an analysis based on the use of an animal model, which directly defines the genetic covariance between the progeny and sires and dams.

The data fields included factors defining sire, dam and lamb (tag), covariates such as age (the age of the lamb at a set time), brr the birth rearing rank (1 = born single raised single, 2 = born twin raised single, 3 = born twin raised twin and 4 = other), sex (M, F) and grp, a factor indicating the flock-year combination.

15.10.1 Half-sib analysis

In the half-sib analysis we include terms for the random effects of sires, dams and litters. In univariate analyses the variance component for sires is denoted by $\sigma_s^2 = \tfrac{1}{4}\sigma_A^2$ where
310. is not valid for generalised linear mixed models as the reported LogL does not include components relating to the reweighting. Furthermore, it is not appropriate if the fixed effects in the model have changed. In particular, if fixed effects are fitted in the sparse equations, the order of fitting may change with a change in the fitted variance structure, resulting in non-comparable likelihoods even though the fixed terms in the model have not changed. The iteration sequence terminates when the maximum number of iterations (see !MAXIT on page 68) has been reached or successive LogL values are less than 0.002i apart, where i is the iteration number.

The following is a copy of nin89a.asr.

ASReml 4.0 [01 Jan 2013] NIN Alliance Trial 1989              version & title
Build ki [07 Jan 2014] 64 bit                                  date
29 Jan 2014 09:34:34.315 32 Mbyte Windows x64 nin89a          workspace
Licensed to: VSNi  Robin Thompson
***************************************************************
* Contact support@asreml.co.uk for licensing and support      *
***************************************************************
Folder: D:\latest\Data\examples4\arg\Manex4f
variety !A
QUALIFIERS: !SKIP 1 !DISPLAY 15
QUALIFIER: !DOPART 1 is active
Reading nin89aug.asd FREE FORMAT skipping 1 lines

Univariate analysis of yield
Summary of 242 records retained of 242 read                    data summary

Model term  Size  #miss  #zero  MinNon0  Mean  MaxNon0  StndDevn
1 variety 56 0 0 1 26.4545 56
2 id 0 0 1.0000 26.45 56.00 1 18
3 pid 18 0 1101 26026 4156 1121
4 raw 18 Q
311. is up to the user and deduced from the first line, which is assumed to be an XY individual. Thus, whatever string is found in the fourth field on the first line of the pedigree is taken to mean XY and any other code found on other records is taken to mean XX.

8.8 Genetic groups

If all individuals belong to one genetic group, then use 0 as the identity of the parents of base individuals. However, if base individuals belong to various genetic groups, this is indicated by the !GROUPS qualifier and the pedigree file must begin by identifying these groups. All base individuals should have group identifiers as parents. In this case the identity 0 will only appear on the group identity lines, as in the following example where three sire lines are fitted as genetic groups.

Command file:

Genetic group example
 Animal !P
 Sire !A
 Dam
 Line 2
 AgeOfDam
 adailygain
 Y2
 Y3
harveyg.ped !ALPHA !GROUPS 3
harvey.dat
adailygain ~ mu Line             # fixed model
 !r grmiv(Animal) !INIT 0.25     # random model
residual idv(units)

Pedigree file:

G1 0 0
G2 0 0
G3 0 0
SIRE_1 G1 G1
SIRE_2 G1 G1
SIRE_3 G1 G1
SIRE_4 G2 G2
SIRE_5 G2 G2
SIRE_6 G3 G3
SIRE_7 G3 G3
SIRE_8 G3 G3
SIRE_9 G3 G3
101 SIRE_1 G1
102 SIRE_1 G1
103 SIRE_1 G1
...
163 SIRE_9 G3
164 SIRE_9 G3
165 SIRE_9 G3

Important: It is usually appropriate to allocate a genetic group identifier where the parent is unknown.

8.9 Reading a user defined (inverse) relationsh
312. istics for the spatial models are greater than for the lattice analysis. We note the Wald F statistic for the AR1xAR1 + units model is smaller than the Wald F statistic for the AR1xAR1 model.

15.6 Spatial analysis of a field experiment - Barley

Predicted values of yield

AR1 x AR1
variety    Predicted_Value  Standard_Error  Ecode
 1.0000    1257.9763        64.6146         E
 2.0000    1501.4483        64.9783         E
 3.0000    1404.9874        64.6260         E
 4.0000    1412.5674        64.9027         E
 5.0000    1514.4764        65.5889         E
 ...
23.0000    1311.4888        64.0767         E
24.0000    1586.7840        64.7043         E
25.0000    1592.0204        63.5939         E
SED: Overall Standard Error of Difference 59.05

AR1 x AR1 + units
variety    Predicted_Value  Standard_Error  Ecode
 1.0000    1245.5843        97.8591         E
 2.0000    1516.2331        97.8473         E
 3.0000    1403.9863        98.2398         E
 4.0000    1404.9202        97.9875         E
 5.0000    1471.6197        98.3607         E
 ...
23.0000    1316.8726        98.0402         E
24.0000    1557.5278        98.1272         E
25.0000    1573.8920        97.9803         E
SED: Overall Standard Error of Difference 60.51

IB
Rep is ignored in the prediction
RowBlk is ignored in the prediction
ColBlk is ignored in the prediction
variety    Predicted_Value  Standard_Error  Ecode
 1.0000    1283.5870        60.1994         E
 2.0000    1549.0133        60.1994         E
 3.0000    1420.9307        60.1994         E
 4.0000    1451.8554        60.1994         E
 5.0000    1533.2749        60.1994         E
 ...
23.0000    1329.1088        60.1994         E
24.0000    1546.4699        60.1994         E
25.0000    1630.6285        60.1994         E
SED: Overall Standard Error of Difference 62.02

Notice the differences in SE and SED associated with the various models. Choosing a model
313. it aids formulation of prediction tables (see !ASSOCIATE qualifier on page 186). Common examples are Genotypes grouped into Families and Locations grouped by Region. We call these associated factors. The key characteristic of associated factors is that they are coded such that the levels of one are uniquely nested in the levels of another. If one is unknown (coded as missing), all associated factors must be unknown for that data record. It is typically unnecessary to interact associated factors except when required to adequately define the variance structure.

6.6 Alphabetic list of model functions

Table 6.2 presents detailed descriptions of the model functions discussed above. Note that some three-letter function names may be abbreviated to the first letter.

Table 6.2: Alphabetic list of model functions and descriptions

model function   action

abs(v)           takes the absolute value of the variable v. This function can be used on the response variable.

and(t,r)         overlays (adds) r times the design matrix for model term t to the existing design
a(t,r)           matrix. Specifically, if the model up to this point has p effects and t has a effects, the a columns of the design matrix for t are multiplied by the scalar r (default value 1.0) and added to the last a of the p columns already defined. The overlaid term must agree in size with the term it overlays. This can be used to force a correlation of 1 betwee
314. ites  in  which the specific variances are all equal  For the xfak  variance model functions  ASReml orders the parameters  as the specific variances followed by the loadings  note  that this is different to the ordering for the fak variance  model functions  see previous example   In this example   the first loading in the second factor is constrained to be  equal to zero for identifiability     xfa2 site   VVVV00000000    contracted form       4V8     4P4PZ3P     INIT 4 0 2 4 1 2 0 3 0 3  gen    7 7 2 New R4 Ways to supply distances in one dimensional metric based  models  COORD v    Power models rely on the definition of distance for the associated term  Information for  determining distances is supplied either implicitly by applying the variance model function  to the fac   of the coordinate variables  for example    expv  fac  X       where X contains the positions  or explicitly with the  COORD qualifier  for example    126    7 7 Variance model function qualifiers       expv Time  COORD x     where x is a vector of distances which has to be of length the number of levels of Time  For  computational reasons it is useful to have the range of x between 5 and 50     7 7 3 Your own program  Fi    The OWN variance structure is a facility whereby  advanced  users may specify their own  variance structure  This facility requires the user to supply a program MYOWNGDG that reads  the current set of parameters  forms the G matrix and a full set of derivative matrices  and  writes
315. ither argument is supplied, 2 is assumed. If the second argument is omitted, it is given the value of the first.

If the problem of later singularities arises because of the low coefficient of variation of a covariable, it would be better to centre and rescale the covariable. If the degrees of freedom are correct in the first iteration, the problem will be with the variance parameters and a different variance model (or variance constraints) is required.

requests writing of .vrb file. Previously, the default was to write the file.

6 Command file: Specifying the terms in the mixed model

6.1 Introduction

The linear mixed model is specified in ASReml as a series of model terms and qualifiers. In this and the following chapter we discuss a functional specification of mixed models in ASReml. This chapter describes the model formula syntax for traditional variance component models.

From Chapter 2, the linear mixed model can be written as

$y = X\tau + Zu + e$     (6.1)

where $y$ (n x 1) is a vector of observations, $\tau$ (p x 1) is a vector of fixed effects, $X$ (n x p) is the design matrix of full column rank that associates observations with the appropriate combination of fixed effects, $u$ (q x 1) is a vector of random effects, $Z$ (n x q) is the design matrix that associates observations with the appropriate combination of random effects, and $e$ (n x 1) is the vector of residual errors.

Typically, $\tau$ and $u$ are composed of several model terms, that is, $\tau$ can be part
316. itioned as $\tau = [\tau_1' \ldots \tau_t']'$ and $u$ can be partitioned as $u = [u_1' \ldots u_b']'$, with $X$ and $Z$ partitioned conformably as $X = [X_1 \ldots X_t]$ and $Z = [Z_1 \ldots Z_b]$.

In this chapter we concentrate on specification of the fixed and random effects and their associated design matrices. For ease of exposition, we assume variance component mixed models (Example 2.2). In these models, the random effects (within model terms) and the residual errors are assumed to be identically and independently distributed (IID). This means they have a common variance and zero covariance. In these variance component models, a functional specification is relatively simple and we discuss this here. In Chapter 7 we present a more general functional specification of random effects and variance structures.

6.2 Specifying model formulae in ASReml

NIN Alliance Trial 1989
 variety
 ...
 column 11
nin89.asd !skip 1
yield ~ mu variety !r idv(repl),
 !f mv
residual idv(units)

The linear mixed model is specified in ASReml as a series of model terms and qualifiers. Model terms include factor and variate labels (Section 5.4), functions of labels, special terms and interactions of these. The model is specified immediately after the datafile and any job control qualifier and/or tabulate lines. The syntax for specifying the model is

response [qualifiers] ~ fixed [!r conrandom] [!f sparse_fixed]
residual conresidual

- response is the label f
317. its us Trait   us Trait _3   G  P  15 298889   7  1  8 6  1  9 9  i         tnits us Trait us Treit  A   G  P  4 8438271         unites  ne Trait   us Trait  5   G  P  11 264815      10      units us Trait   us Trait _6   G  P  26 095692   10  1  ii     units us Trait  us Trait _7   G  P  4 6882715   11  1  12      units us Trait   us Trait _8   G  P  10 824074   12  1  13     units us Trait  us Trait  9   G  P  27 332887   13  1  14   units us Trait   us Trait _10   G  P  71 875403   14  1  15     unhitse ws Trait  us Trait _ii   G  P  3 9083333   15  1  16      units us Trait   us Trait _12   G  P  10 292592   16  1  17     units us Trait   us Trait _13   G  P  34 137962   17  1  18   units us Trait   us Trait _14   G  P  69 287036   18  1  19   units us Trait   us Trait _15   G  P  141 97296   19  1    Parameter constraints and initial values can be changed by editing the values in the PSpace  and Initial_value columns  Scale relationships can be introduced by noting that the full  set of parameters can be related to a subset of parameters and scale factors such as    parameter   subset parameter   scale   or   GN column parameter  RP_GN column parameter   RP_scale value   where GN  RP_GN and RP_scale are columns in the  tsv file  The relationships generated by    VCC 2   5681115 7  29   212   2 16   2  parameters 6 8 11 15 are equal to 5    7 9 12 16 are twice 5   10 13 17  parameters 13 and 17 are equal to 10    the full set of parameters 5 19 can therefore be expressed 
318. ivariate model and the univariate model of (15.7). The variety effects for each trait ($u_v$ in the bivariate model) are partitioned in (15.7) into variety main effects and tmt.variety interactions so that $u_v = \mathbf{1} \otimes u_1 + u_2$. There is a similar partitioning for the run effects and the errors (see table 15.9).

In addition to the assumptions in the models for individual traits (15.9), the bivariate analysis involves the assumptions $\mathrm{cov}(u_{v_c}, u_{v_t}') = \sigma_{v_{ct}} I_{44}$, $\mathrm{cov}(u_{r_c}, u_{r_t}') = \sigma_{r_{ct}} I_{66}$ and $\mathrm{cov}(e_c, e_t') = \sigma_{e_{ct}} I_{132}$. Thus random effects and errors are correlated between traits. So, for example, the variance matrix for the variety effects for each trait is given by

$$\mathrm{var}(u_v) = \begin{bmatrix} \sigma_{v_c}^2 & \sigma_{v_{ct}} \\ \sigma_{v_{ct}} & \sigma_{v_t}^2 \end{bmatrix} \otimes I_{44}$$

This unstructured form for trait.variety in the bivariate analysis is equivalent to the variety main effect plus heterogeneous tmt.variety interaction variance structure (15.8) in the univariate analysis. Similarly, the unstructured form for trait.run is equivalent to the run main effect plus heterogeneous tmt.run interaction variance structure. The unstructured form for the errors (trait.pair) in the bivariate analysis is equivalent to the pair plus heterogeneous error (tmt.pair) variance in the univariate analysis.

15.8 Paired Case-Control study - Rice

This bivariate analysis is achieved in ASReml as follows, noting that the tmt factor here is equivalent to traits.

this is for the paired data
 id
 pair 132
 run 66
 variety 44 !A
 yc ye
319. king directories, maybe just keeping the .as, .asr, .rsv and .pvs files.

13.2 An example

In this chapter the ASReml output files are discussed with reference to a two-dimensional separable autoregressive spatial analysis of the NIN field trial data (see model 3b on page 120 of Chapter 7 for details). The ASReml command file for this analysis is presented below. Recall that this model specifies a separable autoregressive correlation structure for residual or plot errors that is the direct product of an autoregressive correlation matrix of order 22 for rows and an autoregressive correlation matrix of order 11 for columns.

NIN Alliance Trial 1989
 variety !A
 id
 pid
 raw
 repl 4
 nloc
 yield
 lat
 long
 row 22
 column 11
nin89a.asd !skip 1 !DISPLAY 15
tabulate yield ~ variety
yield ~ mu variety !f mv
residual ar1(row).ar1(column)
predict variety

13.3 Key output files

The key ASReml output files are the .asr, .sln and .yht files.

13.3.1 The .asr file

This file contains

- an announcements box (outlined in asterisks) containing current messages,
- a summary of the data for the user to confirm the data file has been interpreted correctly and to review the basic structure of the data and validate the specification of the model,
- the iteration sequence of REML log-likelihood values to check convergence,
- a summary of the variance parameters:
  - The Gamma column reports
320. l appears to be failing, then please send details of the problem to support@vsni.co.uk.

1.6 Typographic conventions

A hands-on approach is the best way to develop a working understanding of a new computing package. We therefore begin by presenting a guided tour of ASReml using a sample data set for demonstration (see Chapter 3). Throughout the guide new concepts are demonstrated by example wherever possible.

In this guide you will find framed sample boxes to the right of the page as shown here. These contain ASReml command file (sample) code. Note that

- the code under discussion is highlighted in bold type for easy identification,
- the continuation symbol is used to indicate that some of the original code is omitted from the display.

An example ASReml code box
 bold type highlights sections of code currently under discussion
 remaining code is not highlighted
 the continuation symbol is used to indicate that some of the original code is omitted

Data examples are displayed in larger boxes in the body of the text; see, for example, page 40. Other conventions are as follows:

- keyboard key names appear in SMALLCAPS, for example, TAB and ESC,
- example code within the body of the text is in this size and font and is highlighted in bold type, see pages 33 and 49,
- in the presentation of general ASReml syntax, for example
  [path] asreml basename[.as] [arguments]
  - typewriter font is used for text that must be typed verbatim, for example, asreml an
321. l effects

This example fits direct effects for two traits, but maternal effects for the first trait only:

str(Trait.animal at(Trait,1).dam us(3).nrm(animal))

A rather artificial example of using v greater than 1 is when we have 20 levels in a factor A and wish to use one variance for the first 8 levels and another for the last 12 levels. Then

str(A idv(8) idv(12))

will do this.

7.3 Applying variance structures to the residual error term

In Release 4 the residual error term is also defined using a consolidated model term, and it now appears after a residual statement that has been introduced to specify the associated variance structure. We give five examples. Firstly, for the default situation of IID residual errors the error model definition line would be

residual idv(units)

This second example would specify a separable autoregressive spatial model of order 1 (AR1xAR1) for the observations from a trial arranged in a rectangular array indexed by the data variables column and row. To apply this variance structure the observations would need to cover the whole grid, but it would not be necessary to pre-order the data file as rows within columns, as ASReml uses the information in column and row to put the observations into the appropriate row within column order.

residual ar1v(column).ar1(row)

If there were 3 columns and 23 rows in the previous example, then this third example

residual ar1v(3).ar1(23)

would be an equivalent coding fo
322. l lines. Maybe the string is too long.

the error model is not correctly specified.

the file did not exist or was of the wrong file type (binary = unformatted, sequential).

The PREDICT statement cannot be parsed.

ASReml failed to form the PREDICT design matrix.

This usually indicates the model has not been properly parsed and part is misinterpreted as a variance header line (old syntax) where the residual statement was expected. When the model statement is written over several lines, incomplete lines must end with a PLUS or COMMA character.

Check old syntax variance structure specification.

Check the filename is correct and that the file is not open in another process.

ASReml has failed to determine an order for solving the mixed model equations. See !EQORDER for some discussion. Try increasing !WORKSPACE.

This error comes from the main read routine, or from the variable definition parsing routine.

There are several messages of this form where something is what ASReml is attempting to read. Either there is an error telling ASReml to read something when it does not need to, or there is an error in the way something is specified.

14.5 Information, Warning and Error messages

Table 14.3: Alphabetical list of error messages and probable cause(s)/remedies

error message                          probable cause/remedy

Error reading the data

Error reading the DATA FILENAME line

Error reading the model factor list

Error  Ra
323. l w  correlation  CORB  corb banded C   1 w l w 2w     1  correlation Cig   b  1Sj lt w 1  l   l lt 1  CORG  corg general C   1 wld  wt  H1 acan  correlation C       i j  w  CORGH   US l    ere  ij    148    7 12 Variance models available in ASReml       Details of the variance models available in ASReml          variance description algebraic number of parameters   structure form  name  variance corr het  model variance  function  name  One dimensional unequally spaced  EXP  exp exponential C   1 1 l w  Cy  l  iA j  xi are coordinates  0 lt o   lt 1  GAU  gau gaussian C   1 1 LEWU  Cy   dO iF j  xi are coordinates  0 lt      lt 1  Two dimensional irregularly spaced  x and y vectors of coordinates  Oij   min d     1  1   dij is euclidean distance  IEXP  iexp isotropic C   1 1 l w  exponential Cy piti   esltly   usl ij  0 lt      lt 1  IGAU  igau isotropic C   1 1 1 w  gaussian C    glares  Huy   pany  0 lt    lt 1  IEUC  ieuc isotropic Er  1 1 1 w  euclidean C    pV imti   Hui   i  ij    0 lt    lt 1  LVR  lvr linear variance C      1    64  1 l w  0 lt        149    7 12 Variance models available in ASReml       Details of the variance models available in ASReml          variance description algebraic number of parameters   structure form  name  variance corr hom het  model variance variance  function  name  SPH  sph spherical C   1     36    583  1 2 l w  0 lt      CIR  cir circular  Web  C   1 1 2 l w   amp  Oliver  7  ted  i INE     2  6454  1    0   sin    p 113   0 lt  
324. laced before model terms to exclude them from y J    the model      placed at the end of a line to indicate that the  model specification continues on the next line         treated as a space J J  I      placed around some model terms when it is impor  J  1  tant the terms not be reordered  Section 6 4   commonly at  f n  condition on level n of factor f  J J  used n may be a list of level numbers  functions at  f  forms conditioning covariables for all levels of fac      J  tor f  fac v  forms a factor from v with a level for each unique J  value in v  fac v y  forms a factor with a level for each combination of J  values in v and y  lin f  forms a variable from the factor f with values equal     to 1    n corresponding to level 1    level n  of the  factor  spl v  k   forms the design matrix for the random component J  of a cubic spline for variable v  other t n  fits variable n from the  G set of variables t  This y y  functions tin  is a special case of the   SUBGROUP qualifier func     tion applied to  G variables  Note that the square  parentheses are permitted alternative syntax     abs  v  forms the absolute value of the variable v    and t  r   adds r times the design matrix for model term t to J  the previous design matrix  r has a default value  of 1 predefine it by saying  t and t r     c f  factor fis fitted with sum to zero constraints J    90    6 2 Specifying model formulae in ASReml       Table 6 1  Summary of reserved words  operators and functions       mod
325. lant number, treatment identification and the 5 heights. The ASReml input file for our first model is

This is plant data multivariate
 tmt !A                 # Diseased Healthy
 plant 14
 y1 y3 y5 y7 y10
grass.asd !skip 1 !ASUV
!Y y1 !G tmt !JOIN      # Plot the data
y1 y3 y5 y7 y10 ~ Trait tmt Tr.tmt !r idv(units)
residual idv(units.Trait)

15.5 Balanced repeated measures - Height

The focus is modelling of the error variance for the data. Specifically we fit the multivariate regression model given by

$Y = DT + E$     (15.1)

where $Y$ is the matrix of heights, $D$ is the design matrix, $T$ is the matrix of fixed effects and $E$ is the matrix of errors. The heights taken on the same plants will be correlated and so we assume that

$\mathrm{var}(\mathrm{vec}(E)) = I_{14} \otimes \Sigma$     (15.2)

where $\Sigma$ is a symmetric positive definite matrix.

The variance models used for $\Sigma$ are given in Table 15.4. These represent some commonly used models for the analysis of repeated measures data (see Wolfinger, 1986). Note that we have specified the !ASUV qualifier. This is required to allow the fitting of all these models. Without !ASUV, ASReml would only allow us to fit the final (UnStructured) variance model, which is the default R structure fo

Table 15.4: Summary of variance models fitted to the plant data

model                     number of    REML             BIC
                          parameters   log-likelihood
Uniform                   2            -196.88          401.95
Power                     2            -182.98          374.15
Heterogeneous Power       6            -171.50          367.57
Antedependence (order 1)  9            -160.37          357.5
326. le

Messages beginning with the word Notice are not generally listed here: information the user should be aware of as it may affect the interpretation of results. They are not in themselves errors in that the syntax is valid, but they may reflect errors in the

Table 14.1: Some information messages and comments

information message / comment

LogL converged
    the REML log-likelihood last changed less than 0.002 x iteration number and variance parameter values appear stable.

BLUP run done
    A full iteration has not been completed. See discussion of !BLUP.

JOB ABORTED by USER
    See discussion of ABORTASR.NOW.

LogL converged; parameters not converged
    the change in REML log-likelihood was small and convergence was assumed but the parameters are, in fact, still changing.

LogL not converged
    the maximum number of iterations was reached before the REML log-likelihood converged. The user must decide whether to accept the results anyway, to restart with the !CONTINUE command line option (see Section 10.3 on job control), or to change the model and/or initial values before proceeding. The sequence of estimates is reported in the .res file. It may be necessary to simplify the model and estimate the dominant components before estimating other terms if the LogL is oscillating.

Parameters unchanged after one iteration.

Warning: Only one iteration performed
    Paramete
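The comment on the "LogL converged" message describes a tolerance that grows with the iteration number. A minimal sketch of that check follows, assuming only the log-likelihood part of the rule (the accompanying requirement that variance parameter values appear stable is not modelled). The function name and the LogL sequence are hypothetical and are not taken from an ASReml run.

```python
def logl_converged(prev_logl, curr_logl, iteration, tol=0.002):
    """Return True when the LogL change is below tol * iteration number."""
    return abs(curr_logl - prev_logl) < tol * iteration


# Hypothetical LogL sequence keyed by iteration number (illustrative values only).
logl_by_iter = {1: -110.40, 2: -101.25, 3: -98.10, 4: -97.80, 5: -97.795}

first_converged = None
for it in range(2, 6):
    if logl_converged(logl_by_iter[it - 1], logl_by_iter[it], it):
        first_converged = it
        break

print("LogL criterion first met at iteration", first_converged)
```

In this made-up sequence the change at iteration 4 is 0.30, well above the tolerance of 0.008, while the change at iteration 5 is 0.005, below the tolerance of 0.010, so the criterion is first met at iteration 5.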
327. les, setting n to -1 means the file is not formed.

modifies the appearance of the variogram calculated from the residuals obtained when the sampling coordinates of the spatial process are defined on a lattice. The default form is based on absolute "distance" in each direction. This form distinguishes same-sign and different-sign distances and plots the variances separately as two layers in the same figure.

specifies that n constraints are to be applied to the variance parameters. The constraint lines occur after the G structures are defined. The constraints are described in Section 7.8.2. The variance header line (structural specification) or residual line (Section 6.2) must be present, even if only 0 0 0 or residual units, indicating there are no explicit R or G structures (see Section 7.8.2).

requests that the variogram formed with radial coordinates (see page 18) be based on s (4, 6 or 8) sectors of size 180/s degrees. The default is 4 sectors if !VGSECTORS is omitted and 6 sectors if it is specified without an argument. The first sector is centred on the X direction.

Figure 5.1 is the variogram using radial coordinates obtained using predictors of random effects fitted as fac(xsca,ysca). It shows low semivariance in the xsca direction, high semivariance in the ysca direction, with intermediate values in the 45 and 135 degrees directions.

controls the form of the .yht file.
!YHTFORM -1 suppresses formation of the .yht file
!YHTFORM 1
328. log likelihoods for models 1 and 2 are comparable and likewise for models 3 to 6. The REML log-likelihoods are not comparable between these groups due to the inclusion of the fixed season term in the second set of models.

We begin by modelling the variance matrix for the intercept and slope for each tree as a diagonal matrix, as there is no point including a covariance component between the intercept and slope if the variance component(s) for one (or both) is zero. Model 1 also does not include a non-smooth component at the overall level (that is, fac(age)). Abbreviated output is shown below.

  6 LogL=-97.8517     S2=  7.2838     33 df
  7 LogL=-97.7837     S2=  6.6673     33 df
  8 LogL=-97.7792     S2=  6.4634     33 df
  9 LogL=-97.7788     S2=  6.3911     33 df
 10 LogL=-97.7788     S2=  6.3615     33 df

 - - - Results from analysis of circ - - -
 Akaike Information Criterion     205.56 (assuming 5 parameters)
 Bayesian Information Criterion   213.04

 Model_Term                        Gamma          Sigma          Sigma/SE  % C
 idv(spl(age,7))       IDV_V  5    100.466        639.116        1.55      0 P
 Residual              SCA_V 35    1.000000       6.36154        1.74      0 P
 idv(Tree)             ID_V   1    4.78778        30.4577        1.24      0 P
 idv(Tree.age)         ID_V   1    0.939009E-04   0.597354E-03   1.41      0 P
 idv(spl(age,7).Tree)  ID_V   1    1.11619        7.10070        1.44      0 P

 Wald F statistics
 Source of Variation   NumDF   DenDF   F-inc    P-inc
 9 mu                  1       4.0     47.05    0.002
 3 age                 1       4.0     95.00    <.001

Predicted values of circ
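The two information criteria in this output can be reproduced from the final log-likelihood. The sketch below assumes the usual definitions, AIC = -2 LogL + 2p and BIC = -2 LogL + p log(v), with p = 5 variance parameters and v = 33 residual degrees of freedom as reported, and takes the final LogL as -97.7788. These formulas are an assumption checked against the printed values, not a statement of how ASReml computes them internally.

```python
import math

logl = -97.7788   # final REML log-likelihood (iteration 10)
n_params = 5      # "assuming 5 parameters" in the output
resid_df = 33     # residual degrees of freedom ("33 df")

# Assumed definitions of the two criteria.
aic = -2.0 * logl + 2.0 * n_params
bic = -2.0 * logl + n_params * math.log(resid_df)

print(f"AIC = {aic:.2f}, BIC = {bic:.2f}")
```

Both values round to the figures in the output, 205.56 and 213.04, which supports reading the penalty in the Bayesian criterion as based on the residual degrees of freedom rather than the number of records.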
329. lower than the PCG method.

ASReml prints its standard reports as if it had completed the iteration normally, but since it has not completed it, some of the information printed will be incorrect. In particular, variance information on the variance parameters will always be unavailable. Standard errors on the estimates will be wrong unless n=3. Residuals are not available if n=1. Use of n=3 or n=2 will halve the processing time when compared to the alternative of using !MAXIT 1 rather than a !BLUP n qualifier. However, !MAXIT 1 does result in complete and correct output.

sets the number of equations solved densely up to a maximum of 5000. By default, sparse matrix methods are applied to the random effects and any fixed effects listed after random factors or whose equation numbers exceed 800. Use !DENSE n to apply sparse methods to effects listed before the !r (reducing the size of the DENSE block) or if you have large fixed model terms and want Wald F statistics calculated for them. Individual model terms will not be split so that only part is in the dense section. n should be kept small (<100) for faster processing.

alters the error degrees of freedom from v to v + n. This qualifier might be used when analysing pre-adjusted data to reduce the degrees of freedom (n negative) or when weights are used in lieu of actual data records to supply error information (n positive). The degrees of freedom is only used in the calculation of the residual
330. ls (column 3) and the diagonal elements of the hat matrix. This final column can be used in tests involving the residuals, see Section 2.4 under Diagnostics.

Record    Yhat      Residual   Hat
  1       30.442    -1.192     13.01
  2       27.955     3.595     13.01
  3       32.380     2.670     13.01
  4       23.092     7.008     13.01
  5       31.317     1.733     13.01
  6       29.267     0.9829    13.01
  7       26.155     9.045     13.01
  8       24.567    -6.167     13.01
  9       23.530     0.8204    13.01
  ...
222       16.673     9.877     13.01
223       24.548     1.052     13.01
224       23.786     3.114     13.01

3.7 Tabulation, predicted values and functions of the variance components

It may take several runs of ASReml to determine an appropriate model for the data, that is, the fixed and random effects that are important. During this process you may wish to explore the data by simple tabulation. Having identified an appropriate model, you may then wish to form predicted values or functions of the variance components. The facilities in ASReml to form predicted values and functions of the variance components are described in Chapters 9 and 12 respectively. Our example only includes tabulation and prediction.

The statement

tabulate yield ~ variety

in nin89.as results in nin89.tab as follows:

NIN alliance trial 1989     11 Jul 2005 13:55:21

Simple tabulation of yield

variety
LANCER     28.56
BRULE      26.07
REDLAND    30.50
CODY       21.21
ARAPAHOE   29.44
NE83404    27.39
NE83406    24.28
NE83407    22.69
CENTURA    21.65
SCOUT66    2
331. lue to highlight terms associated with A and B respectively in cov(ab_{ij}, ab_{kl}): if

\[
\mathrm{var}(a) = A = \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix}
\quad\text{and}\quad
\mathrm{var}(b) = B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}
\]

then

\[
\mathrm{cov}(ab_{ij}, ab_{kl}) = A_{ik} \times B_{jl}. \qquad (2.9)
\]

2.1.10 Direct product structures

Mathematically, the result (2.9) is known as a direct product structure and is written in full as

\[
\mathrm{var}(ab) = A \otimes B = \begin{pmatrix} A_{11}B & A_{12}B & A_{13}B \\ A_{21}B & A_{22}B & A_{23}B \\ A_{31}B & A_{32}B & A_{33}B \end{pmatrix}.
\]

Structures associated with direct product construction are known as separable variance structures, and we call the assumption that a separable variance structure is plausible the assumption of separability.

2.1.11 Direct products in R structures

Separable structures occur naturally in many practical situations. Consider a vector of common errors associated with an experiment. The usual least squares assumption (and the default in ASReml) is that these are independently and identically distributed (IID). However, if e was from a field experiment laid out in a rectangular array of r rows by c columns, we could arrange the residuals as a matrix and might consider that they were autocorrelated within rows and columns. Writing the residuals as a vector in field order, that is, by sorting the residuals rows within columns (plots within blocks), the variance of the residuals might then be

\[
\sigma^2_e \, \Sigma_c(\rho_c) \otimes \Sigma_r(\rho_r)
\]

where \Sigma_r(\rho_r) and \Sigma_c(\rho_c) are correlation matrices for the row model (order r, autocorrelation parameter \rho_r) and column model (o
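As a rough illustration of the separable residual structure described in the item above, the following Python sketch builds first-order autoregressive correlation matrices for rows and columns and combines them with a direct (Kronecker) product, with residuals ordered rows within columns. The function names and the numeric values used in the usage note are illustrative assumptions, not part of ASReml.

```python
def ar1_corr(n, rho):
    """n x n first-order autoregressive correlation matrix: rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def kronecker(a, b):
    """Direct (Kronecker) product of two matrices stored as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def separable_residual_variance(n_rows, n_cols, rho_r, rho_c, sigma2):
    """sigma2 * (column AR1 correlation) kron (row AR1 correlation).

    The column matrix comes first because the residuals are assumed to be
    sorted rows within columns, so the row index varies fastest.
    """
    sigma_c = ar1_corr(n_cols, rho_c)
    sigma_r = ar1_corr(n_rows, rho_r)
    return [[sigma2 * x for x in row] for row in kronecker(sigma_c, sigma_r)]
```

For a hypothetical 3 row by 2 column layout, separable_residual_variance(3, 2, 0.5, 0.2, 2.0) returns a 6 x 6 matrix whose element for two plots one row and one column apart is 2.0 x 0.5 x 0.2.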
332. ly referenced in the classify and average sets. For example,

    !GROUP Year YearLoc 1 1 1 2 2 3 3 3 4 4

forms a new factor Year with 4 levels from the existing factor YearLoc with 10 levels. The prediction must be in terms of YearLoc, not Year, even if YearLoc does not formally appear in the model. For default averaging in prediction, the weights for the levels of the grouped factor (Year) will be, in this example, 0.3 0.2 0.3 0.2, derived from the weights for the base factor (YearLoc). Use !AVE YearLoc {2 2 2 3 3 2 2 2 3 3}/24 to produce equal weighting of Year effects.

If !G sets of variables are included in the classify set, only the first variable is reported in labelling the predict values, except that for !G !MM sets the marker position is reported.

Having identified the explanatory variables in the classify set, the second step is to check the averaging set. The default averaging set is those explanatory variables involved in fixed effect model terms that are not in the classify set. By default, variables that are not in any !ASSOCIATE list and that only define random model terms are ignored. Use the !AVERAGE, !ASSOCIATE or !PRESENT qualifiers to force variables into the averaging set.

The third step is to check the linear model terms to use in prediction. The default is that all model terms based entirely on variables in the classifying and averaging sets are used. Two qualifiers allow this default to be modified by adding (!USE) or removing (!IGN
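The weights quoted in the item above can be checked with a short calculation. This Python sketch (an illustration only; the function name is hypothetical) sums normalised base-factor weights within each group, using the YearLoc to Year mapping from the example.

```python
from fractions import Fraction


def grouped_weights(groups, base_weights):
    """Sum normalised integer base-level weights within each group label."""
    total = sum(base_weights)
    out = {}
    for group, weight in zip(groups, base_weights):
        out[group] = out.get(group, Fraction(0)) + Fraction(weight, total)
    return out


# Mapping of the 10 YearLoc levels to the 4 Year levels, as in the example.
year_of_yearloc = [1, 1, 1, 2, 2, 3, 3, 3, 4, 4]

# Equal weights on YearLoc give Year weights 0.3 0.2 0.3 0.2.
default = grouped_weights(year_of_yearloc, [1] * 10)

# Weights 2 2 2 3 3 2 2 2 3 3 (out of 24) give each Year a weight of 1/4.
equalised = grouped_weights(year_of_yearloc, [2, 2, 2, 3, 3, 2, 2, 2, 3, 3])
```

Exact fractions are used so that the equal-weighting result can be compared without rounding error.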
333. lysis of a field experiment. There has been a large amount of interest in developing techniques for the analysis of spatial data, both in the context of field experiments and geostatistical data (see, for example, Cullis and Gleeson, 1991; Cressie, 1991; Gilmour et al., 1997). This example illustrates the analysis of 'so-called' regular spatial data, in which the data are observed on a lattice or regular grid. This is typical of most small-plot designed field experiments. Spatial data are often irregularly spaced, either by design or because of the observational nature of the study. The techniques we present in the following can be extended for the analysis of irregularly spaced spatial data, though larger spatial data sets may be computationally challenging, depending on the degree of irregularity or the models fitted.

The data we consider are taken from Gilmour et al. (1995) and involve a field experiment designed to compare the performance of 25 varieties of barley. The experiment was conducted at Slate Hall Farm, UK, in 1976 and was designed as a balanced lattice square with replicates laid out as shown in Table 15.6. The data fields were Rep, RowBlk, ColBlk, row, column and yield. Lattice row and column numbering is typically within replicates, and so the terms specified in the linear model to account for the lattice row and lattice column effects would be Rep.latticerow Rep.latticecolumn. However, in this example lattice rows and columns are both numbered
334. m  P   grp 49  sex   brr 4  litter 4871  age   wwt IMO    MO identifies missing values  ywt  MO  gfw 1MO  fdm   MO  fat  MO    pcoop fmt   read pedigree from first three fields     PATH 1    pcoop fmt    PATH 2    pcoop fmt  CONTINUE pcoopfi rsv  MAXI 40   PATH 3    pcoop fmt  CONTINUE pcoopf2 rsv  MAXI 40   PATH 4    pcoop fmt  CONTINUE pcoopf2 rsv  MAXI 40    PATH 5    pcoop fmt  CONTINUE pcoopf4 rsv  MAXI 40  IPART O    ISUBSET TrDam12 Trait 12000  ISUBSET TrLit1234 Trait 123 40    SUBSET TrAGi1245 Trait 1245  ISUBSET Tr8Gl2a Trait 123 0 0  ISUBSET IrDal2s Trait 1 23 0 0     USING  ASSIGN TO MAKE SPECIFICATION CLEARER    ASSIGN TDIAGI  INIT 2 3759 6 2256 0 60075E 01 0 63086 0 13069  GP   ASSIGN DDIAGI  INIT 2 1584 2 3048  GP    ASSIGN LDIAGI  INIT 3 55265 2 55777 0 191238E 01 0 897272 IGP    ASSIGN RUSI I lt   INIT 13 390 9 0747 17 798 0 31961 0 87272 0 13452  0 71374 1 4028 0 23141 4 0677 0 72812 2 0831 0 75977E 01 0 25782 1 5337  GP   gt         ASSIGN VARF   lt   diag TrAG1245  INIT 0 0024 0 0019 0 0020 0 00026   age grp   diag TrSG123    INIT 0 93 16 0 0 28  sex grp 1 gt      PATH 1   wwt ywt gfw fdm fat   Trait Trait age Trait brr Trait sex Trait age sex  r  VARF    diag Trait  TDIAGI  nrm tag  diag TrDam12  DDIAGI  nrm dam  diag TrLit1234  LDIAGI  id lit    lf Trait grp   residual id units  us Trait  RUSI     328    15 10 Multivariate animal genetics data   Sheep        PATH 2   wwt ywt gfw fdm fat   Trait Trait age Trait brr Trait sex Trait age sex  r  VARF   xfal T
335. m http://www.vsni.co.uk/products/asreml as well as in the examples directory created under the standard installation. They remain the property of the authors or of the original source but may be freely distributed provided the source is acknowledged. The authors would appreciate feedback and suggestions for improvements to the program and this guide. Proceeds from the licensing of ASReml are used to support continued development to implement new developments in the application of linear mixed models. The developmental version is available to supported licensees via a website upon request to VSN. Most users will not need to access the developmental version unless they are actively involved in testing a new development.

Acknowledgements

We gratefully acknowledge the Grains Research and Development Corporation of Australia for their financial support for our research since 1988. Brian Cullis and Arthur Gilmour wish to thank the NSW Department of Primary Industries, and more recently the University of Wollongong, for providing a stimulating and exciting environment for applied biometrical research and consulting. Rothamsted Research receives grant-aided support from the Biotechnology and Biological Sciences Research Council of the United Kingdom. We sincerely thank Ari Verbyla, Dave Butler and Alison Smith, the other members of the ASReml 'team'. Ari contributed the cubic smoothing splines technology, information for the Marker map imputation
336. mand file it substitutes the nth argument (string). n may take the values 1 to 9 to indicate up to 9 strings after the command file name. If the argument has 1 character, a trailing blank is attached to the character and inserted into the command file. If no argument exists, a zero is inserted. For example,

    asreml rat.as alpha beta

tells ASReml to process the job in rat.as as if it read alpha wherever $1 appears in the command file, beta wherever $2 appears and 0 wherever $3 appears.

Table 10.2: The use of arguments in ASReml

    in command file                on command line        becomes in ASReml run
    abc$1def                       no argument            abc0 def
    abc$1def                       with argument X        abcX def
    abc$1def                       with argument XY       abcXYdef
    abc$1def                       with argument XYZ      abcXYZdef
    abc$1 def                      with argument XX       abcXX def
    abc$1 def                      with argument XXX      abcXXX def
    abc$1 def (multiple spaces)    with argument XXX      abcXXX def

10.4.2 Prompting for input

Another way to gain some interactive control of a job in the PC environment is to insert !?text in the .as file where you want to specify the rest of the line at run time. ASReml prompts with text and waits for a response, which is used to complete the line. The !? qualifier may be used anywhere in the job, and the line is modified from that point.

Warning: Unfortunately the prompt may not appear on the top screen under some Windows operating systems, in which case it may not be obvious that ASReml is waiting for a keyboard response.
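The substitution rules in Table 10.2 above can be mimicked in a few lines of Python. This is only a sketch of one reading of the table, not ASReml's own code: it assumes the argument marker is a `$` followed by a digit from 1 to 9, and that both a one-character argument and the default zero are padded with a single trailing blank.

```python
import re


def substitute_arguments(line, args, marker="$"):
    """Replace marker+digit with the matching command line argument.

    A missing argument becomes "0"; values shorter than two characters are
    padded with a trailing blank (our reading of Table 10.2).
    """
    def replace(match):
        n = int(match.group(1))
        value = args[n - 1] if n <= len(args) else "0"
        return value.ljust(2)

    return re.sub(re.escape(marker) + r"([1-9])", replace, line)
```

For instance, substitute_arguments("abc$1def", ["X"]) gives "abcX def", while the same line with no arguments gives "abc0 def".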
337. matrices for sires  dams and litters respectively  The variance matrix  for dams does not involve fibre diameter and fat depth  while the variance matrix for litters  does not involve fat depth  The effects in each of the above vectors are ordered levels within  traits  Lastly we assume that the residual variance matrix is given by    Ue Q I 7043  Table 15 15 presents the sequence variance models fitted to each of the four random terms  sire  dam  litter and error in the ASReml job    IRENAME 1  ARG 1  CHANGE 1 TO 2 OR 3 FOR OTHER PATHS  Multivariate Sire  amp  Dam     DOPATH  1  tag   sire 92 II  dam 3561  I  grp 49  sex   brr 4  litter 4871  age   wwt IMO    MO identifies missing values  ywt  MO  gfw 1MO  fdm  M     fat  MO   PATH 1  coop fmt   PATH 2    coop fmt  CONTINUE coopmfi rsv   uses initial values from previous  rsv file    319    15 10 Multivariate animal genetics data   Sheep        PATH 3  coop fmt  CONTINUE coopmf2 rsv     PATH O  SETTING UP TRAIT COMBINATIONS FOR DIFFERENT MODEL TERMS   SUBSET TrDam123 Trait 12300    SUBSET TrLit1234 Trait 123 40    SUBSET TrAG1245 Trait 1245    ISUBSET TrSG123 Trait 12300     USING   ASSIGN TO MAKE SPECIFICATION CLEARER    ASSIGN SIRE DAM LITTER AND RESIDUAL INITIAL VALUES FROM UNIVARIATE ANALYSES   ASSIGN SDIAGI  INIT 0 608 1 298 0 015 0 197 0 035  Initial sire variances   ASSIGN DDIAGI  INIT 2 2 4 14 0 018    ASSIGN LDIAGI  INIT 3 74 0 97 0 019 0 941    ASSIGN RUSI   lt   INIT 9 27 0 0 16 48 0 0 0 0 0 14   0 0 0 0 0 0 3 37 
338. ml to modify either

    • the reading of the data and/or the output produced (see Table 5.2 below for a list of data file related qualifiers), or
    • the operation of ASReml (see Tables 5.3 to 5.6 for a list of job control qualifiers).

• the data file related qualifiers must appear on the data file line,
• the job control qualifiers may appear on the data file line or on following lines,
• the arguments to qualifiers are represented by the following symbols:

    f        a filename,
    n        an integer number, typically a count,
    p        a vector of real numbers, typically in increasing order,
    r        a real number,
    s        a character string,
    t        a model term label,
    v        the number or label of a data variable,
    vlist    a list of variable labels.

5.7 Data file qualifiers

Table 5.2 lists the qualifiers relating to data input. Use the Index to check for examples or further discussion of these qualifiers.

Table 5.2: Qualifiers relating to data input and output

    qualifier            action

    Frequently used data file qualifiers

    !SKIP n              causes the first n records of the (non-binary) data file to be ignored. Typically these lines contain column headings for the data fields.

    Other data file qualifiers

    !COLUMNFACTOR v      is used in combination with !ROWFACTOR and !SECTION to get ASReml
    !COLFAC v            to insert extra data records to complete the grid of plots defined by the RowFactor and the ColumnFactor for each Section so that a
339. model
    !r nrmv(Animal) !INIT 0.25     random model
    residual idv(units)

is specified after all field definitions and before the datafile definition. See below for the first 20 lines of harvey.ped together with the corresponding lines of the data file harvey.dat. All individuals appearing in the data file must appear in the pedigree file. When all the pedigree information (individual, male_parent, female_parent) appears as the first three fields of the data file, the data file can double as the pedigree file. In this example the line harvey.ped !ALPHA could be replaced with harvey.dat !ALPHA. Often the pedigree file will include individuals for which there is no data: individuals that define genetic links between individuals with data. The nrm in nrmv(Animal) indicates that an additive (or numerator) relationship matrix (nrm) variance structure is constructed from the pedigree associated with Animal. The v in nrmv indicates that the nrm matrix is scaled by a variance parameter.

8.6 The pedigree file

The pedigree file is used to construct the genetic relationships for fitting a genetic animal model and is required if the !P qualifier is associated with a data field. The pedigree file

• has three fields: the identities of an individual and its parents (or sire and maternal grand sire if the !MGS qualifier is specified, Table 8.1). Typically for animals the male parent is listed first, but for trees, the
340. more the REML estimates are consistent and asymptotically normal, though in small samples this approximation appears to be unreliable (see later).

A general method for comparing the fit of nested models fitted by REML is the REML likelihood ratio test, or REMLRT. The REMLRT is only valid if the fixed effects are the same for both models. In ASReml this requires not only the same fixed effects model, but also the same parameterisation.

If \ell_{R2} is the REML log-likelihood of the more general model and \ell_{R1} is the REML log-likelihood of the restricted model (that is, the REML log-likelihood under the null hypothesis), then the REMLRT is given by

\[
D = 2 \log(\ell_{R2} / \ell_{R1}) = 2 \left[ \log(\ell_{R2}) - \log(\ell_{R1}) \right] \qquad (2.21)
\]

which is strictly positive. If r_i is the number of parameters estimated in model i, then the asymptotic distribution of the REMLRT under the restricted model is \chi^2_{r_2 - r_1}.

The REMLRT is implicitly two-sided and must be adjusted when the test involves an hypothesis with the parameter on the boundary of the parameter space. It can be shown that, for a single variance component, the theoretical asymptotic distribution of the REMLRT is a mixture of \chi^2 variates, where the mixing probabilities are 0.5, one with 0 degrees of freedom (spike at 0) and the other with 1 degree of freedom. The approximate P-value for the REMLRT statistic (D) is 0.5(1 - \Pr(\chi^2_1 \le d)), where d is the observed value of D. This has a 5% critical value of 2.71 in contrast to the 3.84
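The boundary-adjusted P-value described in the item above is easy to compute by hand. The sketch below is an illustration in Python, not ASReml output: it takes two REML log-likelihood values (the numbers in the usage note are hypothetical), forms D as twice their difference, and applies the 0.5(1 - Pr(chi-square with 1 df <= d)) rule, using the identity that the upper tail of a chi-square variate with 1 degree of freedom equals erfc(sqrt(d/2)).

```python
import math


def chi2_1_upper_tail(d):
    """Pr(chi-square with 1 df > d) for d >= 0."""
    return math.erfc(math.sqrt(d / 2.0))


def remlrt_statistic(logl_general, logl_restricted):
    """D = 2 * (logL of general model - logL of restricted model)."""
    return 2.0 * (logl_general - logl_restricted)


def remlrt_boundary_pvalue(logl_general, logl_restricted):
    """Approximate P-value for testing a single variance component at zero."""
    d = max(remlrt_statistic(logl_general, logl_restricted), 0.0)
    return 0.5 * chi2_1_upper_tail(d)
```

With hypothetical log-likelihoods of -100.0 and -101.355, D is 2.71 and the adjusted P-value is close to 0.05, whereas the unadjusted one-degree-of-freedom test needs D near 3.84 to reach the same level.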
341. mother tree may be first,

• an optional fourth field may supply inbreeding/selfing information used if the !FGEN qualifier is specified (Table 8.1),
• an additional field specifying the sex of the individual is required if the !XLINK qualifier is specified (Table 8.1),
• is ordered by generation so that the line giving the pedigree of an individual appears above any line where that individual appears as a parent,
• is read free format; it may be the same file as the data file if the data file is free format and has the necessary identities in the first three fields (see below),
• is specified on the line immediately after all field definitions and before the data file line in the command file,
• use 0 or * to represent unknown parents.

8.7 Reading in the pedigree file

    harvey.ped        harvey.dat
    101 SIRE_1 0      101 SIRE_1 0 1 3 192 390 2241
    102 SIRE_1 0      102 SIRE_1 0 1 3 154 403 2651
    103 SIRE_1 0      103 SIRE_1 0 1 4 185 432 2411
    104 SIRE_1 0      104 SIRE_1 0 1 4 183 457 2251
    105 SIRE_1 0      105 SIRE_1 0 1 5 186 483 2581
    106 SIRE_1 0      106 SIRE_1 0 1 5 177 469 2671
    107 SIRE_1 0      107 SIRE_1 0 1 5 177 428 2711
    108 SIRE_1 0      108 SIRE_1 0 1 5 163 439 2471
    109 SIRE_2 0      109 SIRE_2 0 1 4 188 439 2292
    110 SIRE_2 0      110 SIRE_2 0 1 4 178 407 2262
    111 SIRE_2 0      111 SIRE_2 0 1 5 198 498 1972
    112 SIRE_2 0      112 SIRE_2 0 1 5 193 459 2142
    113 SIRE_2 0      113 SIRE_2 0 1 5 186 459 2442
    114 SIRE_2 0      114 SIRE_2 0 1 5 175 375 2522
    115 SIRE_2 0      115 SIRE_2 0 1 5 171 382 1722
    1
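The ordering rule in the list above (an individual's own line must come before any line that names it as a parent) can be checked before a pedigree file is handed to ASReml. The following Python sketch is a hypothetical helper, not an ASReml facility. It assumes three-field records and treats 0 and * as the unknown-parent codes; the second code is an assumption of this sketch.

```python
def misplaced_individuals(records, unknown=("0", "*")):
    """Return identities whose own line appears after a line using them as a parent.

    records is an iterable of (individual, parent1, parent2) tuples in file order.
    Parents that never have a line of their own are not reported.
    """
    used_as_parent = set()
    misplaced = []
    for individual, parent1, parent2 in records:
        if individual in used_as_parent:
            misplaced.append(individual)
        for parent in (parent1, parent2):
            if parent not in unknown:
                used_as_parent.add(parent)
    return misplaced


# First lines of harvey.ped as listed above: no ordering problems.
harvey_head = [("101", "SIRE_1", "0"), ("102", "SIRE_1", "0"), ("109", "SIRE_2", "0")]

# A made-up file in which the sire's own line follows its progeny.
out_of_order = [("101", "SIRE_1", "0"), ("SIRE_1", "0", "0")]
```

Calling misplaced_individuals(out_of_order) flags SIRE_1, while the harvey.ped lines pass because the sires never have lines of their own.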
342. n  uni(f,k,n)

calculates an expected marker state from flanking marker information at position r of the linkage group f (see !MM to define marker locations). r may be specified as TPn where TPn has been previously internally defined with a predict statement (see page 182). r should be given in Morgans.

forms sine from v with period r. Omit r if v is radians. If v is degrees, r is 360.

spl(v,k)   In order to fit spline models associated with a variate v and k knot points in ASReml, v is included as a covariate in the model and spl(v,k) as a random term. The knot points can be explicitly specified using the !SPLINE qualifier (Table 5.4). If k is specified but !SPLINE is not specified, equally spaced points are used. If k is not specified and there are fewer than 50 unique data values, they are used as knot points. If there are more than 50 unique points then 50 equally spaced points will be used. The spline design matrix formed is written to the .res file. An example of the use of spl() is

    price ~ mu week !r spl(week)

forms the square root of v + r. This may also be used to transform the response variable.

Trait   is used with multivariate data to fit the individual trait means. It is formally equivalent to mu but Trait is a more natural label for use with multivariate data. It is interacted with other factors to estimate their effects for all traits.

creates a factor with a level for every record in the data file. This is used to fit the 'nugget' vari
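The description of the sine term in the item above (period r, with r omitted for radians and 360 for degrees) corresponds to the transformation sin(2 pi v / r). The Python sketch below is only an illustration of that reading; the function name is hypothetical and the default period of 2 pi is how this sketch interprets "omit r if v is radians".

```python
import math


def sine_term(v, r=None):
    """Sine of v with period r; r defaults to 2*pi, i.e. v is in radians."""
    period = 2.0 * math.pi if r is None else r
    return math.sin(2.0 * math.pi * v / period)
```

For example, sine_term(90, 360) and sine_term(math.pi / 2) both return 1 up to rounding, and a weekly covariate with an annual cycle could use sine_term(week, 52).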
343. n

In genetic analysis using an 'animal model' or 'sire model', we have data on subjects that are genetically related. The relationships are defined via a pedigree. The subject effects are therefore correlated and, assuming normal modes of inheritance, the correlation expected from additive effects can be computed from the pedigree, provided all the direct links are in the pedigree. The matrix of such relationships is called the numerator relationship matrix. It is actually the inverse relationship matrix that is required for analysis and that is formed by ASReml. Users new to this subject might find the notes Mixed Models for Genetic Analysis [1] by Julius van der Werf helpful.

For the more general situation where the pedigree-based relationship matrix is not the appropriate required matrix, the user can provide a general relationship matrix (GRM) explicitly in a .grm file, or its inverse in a .giv file.

As an example for this chapter, we consider data presented in Harvey (1977) using the command file harvey.as.

[1] http://www.vsni.co.uk/products/asreml/user/geneticanalysis.pdf

8.5 The command file

In ASReml the !P data field qualifier indicates that the corresponding data field has an associated pedigree. The file containing the pedigree (harvey.ped in the example) for animal

    Pedigree file example
     Animal !P
     Sire !A
     Dam
     Line 2
     AgeOfDam
     Y2
     Y3
    harvey.ped !ALPHA
    harvey.dat
    adailygain ~ mu Line     fixed
344. n adjusted variance matrix of 7  They argued that it is useful to consider an  improved estimator of the variance matrix of 7 which has less bias and accounts for the  variability in estimation of the variance parameters  There are two reasons for this  Firstly   the small sample distribution of Wald F statistics is simplified when the adjusted variance  matrix is used  Secondly  if measures of precision are required for 7 or effects therein  those  obtained from the adjusted variance matrix will generally be preferred  Unfortunately the  Wald statistics are currently computed using an unadjusted variance matrix     2 5 4 Approximate stratum variances    ASReml reports approximate stratum variances and degrees of freedom for simple variance  components models  For the linear mixed effects model with variance components  setting  o    1  where G   SH b     t is often possible to consider a natural ordering of the  variance component parameters including o    Based on an idea due to Thompson  1980    ASReml computes approximate stratum degrees of freedom and stratum variances by a mod   ified Cholesky diagonalisation of the average information matrix  That is  if F is the average  information matrix for     let U be an upper triangular matrix such that F   U U  We  define    U  DU    where D  is a diagonal matrix whose elements are given by the inverse elements of the last  column of U ie deai   1 uir i   1     r  The matrix U  is therefore upper triangular with  the elements i
345. n name and other fields are also duplicated

    !MERGE file1 !KEY keya keyb !WITH file2 !TO newfile !CHECK

    !MERGE file1 !KEY key !KEEP !WITH file2 !TO newfile

will discard records from file2 that do not match records in file1, but all records in file1 are retained.

Omitting fields from the merged file

    !MERGE file1 !KEY key !SKIP s1a s1b !WITH file2 !SKIP s2a s2b !TO newfile

Single insertion merging

    !MERGE adult.txt !KEY ewe !KEEP !WITH birth.txt !KEEP !TO newfile !NODUP bwt

12 Functions of variance components

12.1 Introduction

ASReml includes a procedure to calculate certain functions of variance components either as a final stage of an analysis or as a post-analysis procedure. These functions enable the calculation of heritabilities and correlations from simple variance components and when US, CORUH and XFA structures are used in the model fitting. A simple example is shown in the code box:

    ...
    y ~ mu !r idv(Sire)
    residual idv(units)
    VPREDICT !DEFINE
    F phenvar idv(Sire) + idv(units)
    F genvar idv(Sire) * 4
    H herit genvar phenvar

The instructions to perform the required operations are listed after the VPREDICT !DEFINE line and terminated by a blank line. ASReml holds the instructions in a .pin file until the end of the job, when it retrieves the relevant information from the .asr and .vvp files and performs the specified operations. The results are reported in the .pvc file.

In Section 12.2 the syntax
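The three functions in the code box of the item above amount to simple arithmetic on the two estimated variance components. As a hedged illustration, the Python sketch below reproduces only the point estimates (phenotypic variance as the sum, genetic variance as four times the sire component, and heritability as their ratio). It makes no attempt at the standard errors, and the component values in the usage note are invented.

```python
def sire_model_functions(sire_variance, residual_variance):
    """Point estimates of the functions defined after VPREDICT in a sire model."""
    phenvar = sire_variance + residual_variance   # phenotypic variance
    genvar = 4.0 * sire_variance                  # additive genetic variance
    herit = genvar / phenvar                      # heritability
    return {"phenvar": phenvar, "genvar": genvar, "herit": herit}
```

With made-up components of 0.5 for sires and 3.5 for the residual, the sketch gives a phenotypic variance of 4.0, a genetic variance of 2.0 and a heritability of 0.5.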
346. n out of space to    code records to sort them         Error setting constraints    VCC  on variance components    Error setting dependent  variable    Error setting MBF design  matrix    MBF mbf x k   filename    Error sorting X Y values    Error structures are wrong  size     Error when reading knot point  values    Failed forming R G scores        Failed ordering Level labels    Failed to find        Failed to open   INCLUDE    Failed to parse R G structure  line  Failed to read R G structure  line    Failed to process MYOWNGDG  files    Failed when sorting pedigree    Failed when processing  pedigree file        the data file could not be interpreted  alphanumeric fields need  the  A qualifier     data file name may be wrong    the model specification line is in error  a variable is probably  misnamed    Declare the levels in the  ROWFACTOR   COLUMNFACTOR  and  SECTION variables more accurately     The   VCC constraints are specified last of all and require know   ing the position of each parameter in the parameter vector     the specified dependent variable name is not recognised     It is likely that the covariate values do not match the values  supplied in the file  The values in the file should be in sorted  order      ROWFAC and  COLFAC and  SECTION as well as factors defining  a residual structure must uniquely define grid points in the  spatial array    the declared size of the error structures does not match the  actual number of data records     There is some pro
347. n qualifiers is in Table 7 4     7 7 1 Parameter equality constraints   s    Parameters in a variance model can be set to be equal using the   s qualifier  Table 7 4   where s is a string of letters and or zeros  Positions in the string correspond to the position  of the parameters in the list of parameters for the particular variance model     124    7 7 Variance model function qualifiers       Table 7 4  Variance model function qualifiers available in ASReml       status qualifier description   existing   s s is a list of codes that link parameters sharing a common value  details  in Section 7 8 2    New R4   COORD v provides coordinates for mapping the effects so that a spatial model can    be applied to the effects  It is needed when the coordinates are not in  the data file  for example  exp Trait  COORD 1 2 5 3 5 5 8   see Section 7 7 2     existing  Fi is used with the own   variance model function  see Section 7 7 3  The  argument i is passed to your own program     existing  Gs s is a list of codes F  Z  P or U  one for each parameter  specifying whether  the parameter is to be Fixed at it   s initial value  held at Zero  if legal    kept in the Parameter space or is Unrestricted  see Section 7 7 4   New R4  INIT v v is the list of initial values for the variance structure parameters  If     initial values can be obtained from the  msv   rsv or  tsv file  they  override these values  see Section 7 7 5    existing  SUBSECTION f f is a factor in the data that breaks th
348. n the last column equal to one. If the vector \sigma is ordered in the 'natural' way, with \sigma^2_e being the last element, then we can define the vector of so-called 'pseudo' stratum variance components by

\[
\xi = U_c \sigma .
\]

Thence

\[
\mathrm{var}(\xi) = D_c^2 .
\]

The diagonal elements can be manipulated to produce effective stratum degrees of freedom (Thompson, 1980), viz.

\[
\nu_i = 2 \xi_i^2 / d_{cii}^2 .
\]

In this way the closeness to an orthogonal block structure can be assessed.

3 A guided tour

3.1 Introduction

This chapter presents a guided tour of ASReml, from data file preparation and basic aspects of the ASReml command file, to running an ASReml job and interpreting the output files. You are encouraged to read this chapter before moving to the later chapters.

• a real data example is used in this chapter for demonstration (see below),
• the same data are also used in later chapters,
• links to the formal discussion of topics are clearly signposted by margin notes.

This example is of a randomised block analysis of a field trial, and is only one of many forms of analysis that ASReml can perform. It is chosen because it allows an introduction to the main ideas involved in running ASReml. However, some aspects of ASReml, in particular pedigree files (see Chapter 8 3 1) and multivariate analysis (see Chapter 8), are only covered in later chapters.

ASReml is essentially a batch program with some optional interactive features. The typical sequence of operations when using ASReml is

•
349. n two terms as in a diallel analysis   male and female    assuming the ith male is the same individual as the ith female     at  f n  defines a binary variable which is 1 if the factor f has level n for the record  For    f n  example  to fit a row factor only for site 3  use the expression at  site 3  row   The string    is equivalent to at  for this function     at  f  at f  is expanded to a series of terms like at f i  where i takes the values 01  Qf  to the number of levels of factor f  Since this command is interpreted before the  data is read  it is necessary to declare the number of levels of f correctly in its field  definition  This extended form may only be used as the first term in an interaction     at f i 7 k  is expanded to a series of terms at f i  at f j  at f k   Sim   at  f m n  ilarly  at f i  X at f j  X at f k  X can be written as at f i j7 k  X pro   Q f m n vided at f i  j k  is written as the first component of the interaction  Any number  of levels may be listed  Contiguous sets of values can be specified as 7 7     cos v 7r  forms cosine from v with period r  Omit r if v is radians  If v is degrees  r is 360   con f  apply sum to zero constraints to factor f  It is not appropriate for random factors  c f  and fixed factors with missing cells  ASReml assumes you specify the correct number    of levels for each factor  The formal effect of the con   function is to form a model  term with the highest level formally equal to minus the sum of the precedin
350. n which case ASReml automatically uses the gamma parameterization for estimation; see Section 7.6. Consequently, both the sigmas and the gammas are reported. The user can force ASReml to use the sigma parameterization by placing !SIGMAP immediately after the dependent variable and before ~ on the model definition line:

    yield !SIGMAP ~ mu variety mv

!SIGMAP is a new qualifier with Release 4; see also Section 7.6. In this case only the sigmas are reported, but they appear twice in the output, that is, in both of the columns headed sigma in the .asr file. See Chapter 11 of the User Guide for detailed information on output formats in ASReml.

3b Two dimensional separable autoregressive spatial model

This model extends 3a by specifying a first-order autoregressive correlation structure for columns. The R structure in this case is the Kronecker product of two autoregressive correlation matrices, that is, var(e) = \sigma^2_e \Sigma_c(\rho_c) \otimes \Sigma_r(\rho_r), giving an AR1xAR1 model for error. The consolidated model term in this case is ar1v(column).ar1(row), and includes ar1v(column) to model the \sigma^2_e \Sigma_c(\rho_c) variance structure for columns.

    NIN Alliance Trial 1989
     variety !A
     id
     ...
     row 22
     column 11
    ...
    residual ar1v(column).ar1(row)

Important points

• the same residual variance structure could be achieved by specifying ar1(column).ar1v(row), which mirrors the alternate but equivalent algebraic form var(e) = \Sigma_c(\rho_c) \otimes \sigma^2_e \Sigma_r(\rho_r
351. natory variable which is a factor and appears in the model only in terms that are fitted as random. Covariates generally appear in fixed terms but may appear in random terms as well (random regression). In special cases they may appear only in random terms.

Random factors may contribute to predictions in several ways. They may be evaluated at levels specified by the user, they may be averaged over, or they may be ignored (omitting all model terms that involve the factor from the prediction). Averaging over the set of random effects gives a prediction specific to the random effects observed. We call this a 'conditional' prediction. Omitting the term from the prediction model produces a prediction at the population average (often zero), that is, substituting the assumed population mean for a predicted random effect. We call this a 'marginal' prediction. Note that in any prediction, some random factors (for example Genotype) may be evaluated as conditional and others (for example Blocks) at marginal values, depending on the aim of prediction.

For fixed factors there is no pre-defined population average, so there is no natural interpretation for a prediction derived by omitting a fixed term from the fitted values. Therefore any prediction will be either for specific levels of the fixed factor, or averaging (in some way) over the levels of the fixed factor. The prediction will therefore involve all fixed model terms.

Covariates must be predicted a
352. nd Hall.

Harvey, W. R. (1977). Users' guide to LSML76, The Ohio State University, Columbus.

Harville, D. A. (1997). Matrix Algebra from a Statistician's Perspective, Springer-Verlag.

Harville, D. and Mee, R. (1984). A mixed model procedure for analysing ordered categorical data, Biometrics 40: 393-408.

Haskard, K. A. (2006). Anisotropic Matérn correlation and other issues in model-based geostatistics, PhD thesis, BiometricsSA, University of Adelaide.

Kammann, E. E. and Wand, M. P. (2003). Geoadditive models, Applied Statistics 52(1): 1-18.

Keen, A. (1994). Procedure IRREML, GLW-DLO Procedure Library Manual, Agricultural Mathematics Group, Wageningen, The Netherlands, pp. Report LWA-94-16.

Kenward, M. G. and Roger, J. H. (1997). The precision of fixed effects estimates from restricted maximum likelihood, Biometrics 53: 983-997.

Lane, P. W. and Nelder, J. A. (1982). Analysis of covariance and standardisation as instances of prediction, Biometrics 38: 613-621.

McCulloch, C. and Searle, S. R. (2001). Generalized, Linear, and Mixed Models, Wiley.

Meuwissen and Luo (1992). Forming inverse nrm, Genetics, Selection and Evolution 24: 305-313.

Millar, R. and Willis, T. (1999). Estimating the relative density of snapper in and around a marine reserve using a log-linear mixed-effects model, Australian and New Zealand Journal of Statistics 41: 383-394.

Nelder, J. A. (1994). The statistics of linear models: back to
353. nd R variance structures, and the individual variance structure parameters in $\sigma_g$ and $\sigma_r$ will be referred to as sigmas. The variance models given by $G_\sigma$ and $R_\sigma$ are referred to as G structures and R structures respectively.

We illustrate these concepts using the simplest linear mixed model, that is, the one-way classification.

Example 2.1 A simple example. Consider a one-way classification comprising a single random effect $u$ and a residual error term $e$. The two random components of this model, namely $u$ and $e$, are each assumed to be independent and identically distributed (IID) and to follow a normal distribution such that $u \sim N(0, \sigma_u^2 I_q)$ and $e \sim N(0, \sigma_e^2 I_n)$. Hence the variance of $y$ has the form

$$\mathrm{var}(y) = \sigma_u^2 ZZ' + \sigma_e^2 I_n \qquad (2.4)$$

This model has two variance structure parameters or sigmas: the variance component $\sigma_u^2$ associated with $u$, and the variance component $\sigma_e^2$ associated with $e$. Mapping this equation back to (2.3), we have $\sigma_g = \sigma_u^2$, $G_\sigma(\sigma_g) = \sigma_u^2 I_q$, $\sigma_r = \sigma_e^2$ and $R_\sigma(\sigma_r) = \sigma_e^2 I_n$.

2.1.2 Partitioning the fixed and random model terms

Typically, $\tau$ and $u$ are composed of several model terms, that is, $\tau$ can be partitioned as $\tau = [\tau_1' \ldots \tau_t']'$ and $u$ can be partitioned as $u = [u_1' \ldots u_b']'$, with $X$ and $Z$ partitioned conformably as $X = [X_1 \ldots X_t]$ and $Z = [Z_1 \ldots Z_b]$.

2.1.3 G structure for the random model terms

For $u$ partitioned as $u = [u_1' \ldots u_b']'$, we impose a direct sum structure on the matrix $G_\sigma$, written G 0
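As a numerical illustration of Example 2.1 in fragment 353, the following sketch builds var(y) for a one-way classification as the sum of a between-group part (the variance component of u times ZZ') and a residual part (the variance component of e times the identity). The group labels and the two variance values are made up for illustration, and the function name is hypothetical.

```python
# Sketch: var(y) = sigma_u^2 * Z Z' + sigma_e^2 * I_n for a one-way classification.
# Labels and variance components are illustrative assumptions.

def one_way_variance(groups, sigma_u2, sigma_e2):
    """Return var(y) as a nested list for observations labelled by `groups`."""
    n = len(groups)
    var_y = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # (Z Z')[i][j] is 1 when observations i and j share a level of u, else 0.
            if groups[i] == groups[j]:
                var_y[i][j] += sigma_u2
            # The residual term contributes only on the diagonal.
            if i == j:
                var_y[i][j] += sigma_e2
    return var_y

groups = ["a", "a", "b", "b", "b"]
V = one_way_variance(groups, sigma_u2=2.0, sigma_e2=1.0)
for row in V:
    print(row)
```

Observations in the same group share the covariance given by the variance component of u, observations in different groups are uncorrelated, and every diagonal element is the sum of the two variance components.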
354. nd varieties within runs defines a nested block structure of the form

run/variety/tmt = run + run.variety + run.variety.tmt
                ( = run + pair + pair.tmt )
                ( = run + run.variety + units )

There is an additional blocking term, however, due to the fact that the bloodworms within a run are derived from the same batch of larvae whereas between runs the bloodworms come from different sources. This defines a block structure of the form

run/tmt/variety = run + run.tmt + run.tmt.variety
                ( = run + run.tmt + pair.tmt )

Combining the two provides the full block structure for the design, namely

run + run.variety + run.tmt + run.tmt.variety
  = run + run.variety + run.tmt + units
  = run + pair + run.tmt + pair.tmt

In line with the aims of the experiment the treatment structure comprises variety and treatment main effects and treatment by variety interactions. In the traditional approach the terms in the block structure are regarded as random and the treatment terms as fixed. The choice of treatment terms as fixed or random depends largely on the aims of the experiment. The aim of this example is to select the "best" varieties. The definition of best is somewhat more complex since it does not involve the single trait sqrt(rootwt) but rather two traits, namely sqrt(rootwt) in the presence/absence of bloodworms. Thus to minimise selection bias the variety main effects and thence the tmt.variety interactions are taken as random. The main effect of treatment
355. ned by a mean variance function and a link function. In this context,

y is the observation,
n is the count for grouped data specified by the !TOTAL qualifier,
$\phi$ is a parameter set with the !PHI qualifier,
$\mu$ is the mean on the data scale calculated using the inverse link function from the predicted value $\eta$ on the underlying scale where $\eta = X\tau$,
v is the variance under some distributional assumption calculated as a function of $\mu$ and n, and
d is the deviance (twice the log likelihood) for that distribution.

6.8 Generalized Linear (Mixed) Models

Table 6.3: Link qualifiers and functions

Qualifier     Link                            Inverse link                    Available with
!IDENTITY     $\eta = \mu$                    $\mu = \eta$                    All
!SQRT         $\eta = \sqrt{\mu}$             $\mu = \eta^2$                  Poisson
!LOGARITHM    $\eta = \ln(\mu)$               $\mu = \exp(\eta)$              Normal, Poisson, Negative Binomial, Gamma
!INVERSE      $\eta = 1/\mu$                  $\mu = 1/\eta$                  Normal, Gamma, Negative Binomial
!LOGIT        $\eta = \ln(\mu/(1-\mu))$       $\mu = 1/(1+e^{-\eta})$         Binomial, Multinomial Threshold
!PROBIT       $\eta = \Phi^{-1}(\mu)$         $\mu = \Phi(\eta)$              Binomial, Multinomial Threshold
!COMPLOGLOG   $\eta = \ln(-\ln(1-\mu))$       $\mu = 1-e^{-e^{\eta}}$         Binomial, Multinomial Threshold

where $\mu$ is the mean on the data scale and $\eta = X\tau$ is the linear predictor on the underlying scale.

GLMs are specified by qualifiers after the name of the dependent variable but before the ~ character. Table 6.3 lists the link function qualifiers which relate the linear predictor ($\eta$) scale to the observation ($\mu = E[y]$) scale. Table 6.4 lists the distribution and other qualifiers.

Table 6.4: GLM distribution qualifiers. The default link is
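The link and inverse link pairs of Table 6.3 in fragment 355 can be checked with a short round trip: applying a link to a mean and then its inverse should return the same mean. The sketch below uses the standard textbook forms of each link, which is an assumption where the extracted table is hard to read. The dictionary keys reuse the qualifier names without the leading `!` purely as labels, and nothing here calls ASReml.

```python
# Sketch: standard link / inverse-link pairs and a round-trip check.
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, for the probit link

# name -> (link: mu -> eta, inverse link: eta -> mu)
LINKS = {
    "IDENTITY": (lambda mu: mu, lambda eta: eta),
    "SQRT": (lambda mu: math.sqrt(mu), lambda eta: eta ** 2),
    "LOGARITHM": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "INVERSE": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
    "LOGIT": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
    "PROBIT": (_STD_NORMAL.inv_cdf, _STD_NORMAL.cdf),
    "COMPLOGLOG": (lambda mu: math.log(-math.log(1.0 - mu)),
                   lambda eta: 1.0 - math.exp(-math.exp(eta))),
}

def round_trip(name, mu):
    """Map mu to the linear predictor scale and back again."""
    link, inverse = LINKS[name]
    return inverse(link(mu))

for name in LINKS:
    print(name, round(round_trip(name, 0.3), 10))
```

A mean of 0.3 is used because it lies in the valid domain of every link in the table (positive, and strictly between 0 and 1 for the binomial-type links).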
356. nent. That is, the variance model for the plot errors is now given by

$$\sigma^2 \Sigma_c \otimes \Sigma_r + \psi I \qquad (15.6)$$

where $\psi$ is the ratio of nugget variance to error variance ($\sigma^2$). The abbreviated output for this model is given below. There is a significant improvement in the REML log-likelihood with the inclusion of the nugget effect (see Table 15.7).

AR1 x AR1

   1 LogL=-739.681  S2= 36034.  125 df  1.000  0.1000  0.1000
   2 LogL=-714.340  S2= 28109.  125 df  1.000  0.4049  0.1870
   3 LogL=-703.338  S2= 29914.  125 df  1.000  0.5737  0.3122
   4 LogL=-700.371  S2= 37464.  125 df  1.000  0.6789  0.4320
   5 LogL=-700.324  S2= 38602.  125 df  1.000  0.6838  0.4542
   6 LogL=-700.322  S2= 38735.  125 df  1.000  0.6838  0.4579
   7 LogL=-700.322  S2= 38754.  125 df  1.000  0.6838  0.4585
   8 LogL=-700.322  S2= 38757.  125 df  1.000  0.6838  0.4586
 Final parameter values                 1.0000 0.68377 0.45861

 - - - Results from analysis of yield - - -
 Akaike Information Criterion     1406.64 (assuming 3 parameters).
 Bayesian Information Criterion   1415.13

15.6 Spatial analysis of a field experiment - Barley

[Figure: sample variogram for the Slate Hall example; axes: outer displacement, inner displacement]

Figure 15.5: Sample variogram of the residuals from the AR1xAR1 model

 Model_Term                       Gamma      Sigma     Sigma/SE   % C
 ar1(column).ar1(row)    150 effects
 Residual       SCA_V  150     1.000000   38754.3       5.00     0 P
 column         AR_R     1     0.683769   0.683769     10.80     0 P
 row            AR_R     1     0.458594   0.458594      5.59     0 P

Wald F statistics
Source of Variati
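The two information criteria reported in the output of fragment 356 can be reproduced by hand. The sketch assumes the usual definitions, AIC = -2 logL + 2t and BIC = -2 logL + t ln(v), with t the number of variance parameters and v the residual degrees of freedom. It also takes the final REML log-likelihood to be -700.322, a sign that is assumed where the extracted listing does not show it. With those assumptions both reported values are recovered.

```python
# Sketch: reproduce the reported AIC and BIC from the final iteration.
# Formulas are the usual definitions and are an assumption about the output.
import math

def information_criteria(logl, n_params, resid_df):
    """Return (AIC, BIC) for a REML log-likelihood."""
    aic = -2.0 * logl + 2.0 * n_params
    bic = -2.0 * logl + n_params * math.log(resid_df)
    return aic, bic

aic, bic = information_criteria(logl=-700.322, n_params=3, resid_df=125)
print(f"AIC = {aic:.2f}, BIC = {bic:.2f}")  # AIC = 1406.64, BIC = 1415.13
```

That both numbers match the listing supports reading the log-likelihood values in the iteration sequence as negative.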
357. nents, put them all in one list on one line,

• where the relationship applies among simple model terms (those without an explicit variance structure, for example units), the model term name may be given rather than the parameter number.

These examples are summarized in the following table.

ASReml code            action
5 7 * 1                parameter 7 equals parameter 5
5 7                    simple coding for 5 7 * 1
5 7 * .1               parameter 7 is a tenth of parameter 5
5 -7                   parameter 7 is the negative of parameter 5
32 34 35 37 38 39      for a (4 x 4) US matrix given by parameters 31 ... 40, the covariances (parameters 32 ... 39) are forced to be equal
21 29 !BLOCKSIZE 8     equates parameters 29 with 21, 30 with 22, ..., 36 with 28
units -uni(check)      parameter associated with model term uni(check) has the same magnitude but opposite sign to the parameter associated with model term units

7.8.2 Fitting linear relationships among variance structure parameters

The user may wish to define relationships between particular variance parameters. For example, consider an experiment in which two or more separate trials are sown adjacent to one another at the same trial site, with trials sharing a common plot boundary. In this case it might be sensible to fit the same spatial parameters and error variances for each trial. In other situations it can be sensible to define the same variance structure over several model terms. ASReml 3 catered for equality and multiplicative relationships among
358. nerated by the leg(), pol() and spl() functions are modified to include extra rows that are accessed by the PREDICT directive. The default value of n is 21 if there is no !PPOINTS qualifier. The range of the data is divided by n-1 to give a step size i. For each point p in the list, a predict point is inserted at p+i if there is no data value in the interval [p, p+1.1×i]. !PPOINTS is ignored if !PVAL is specified for the variable. This process also affects the number of levels identified by the fac() model term.

forces ASReml to attempt to produce the standard output report when there is a failure of the iteration algorithm. Usually no report is produced unless the algorithm has at least produced estimates for the fixed and random effects in the model. Note that residuals are not included in the output forced by this qualifier. This option is primarily intended to help debug a job that is not converging properly.

When forming a design matrix for the spl() model term, ASReml uses a standardized scale (independent of the actual scale of the variable). The qualifier !SCALE 1 forces ASReml to use the scale of the variable. The default standardized scale is appropriate in most circumstances.

requests ASReml write the SCORE vector and the Average Information matrix to files basename.SCO and basename.AIM. The values written are from the last iteration.

5.8 Job control qualifiers

Table 5.6: List of very rarely used job control qualifiers
359. nes. Names found in the data that are not included are simply appended to the list of levels as they are discovered by ASReml. An example of this would be for a genotype factor with 6 levels appearing in the data file in the order genb6 gena1 gena5 genb2 genb4 gena3. In this case,
 Genotype !A !L gena1 genb2 gena3 genb4
would result in the levels of Genotype being ordered gena1 genb2 gena3 genb4 genb6 gena5.

!I [n] is required if the data is numeric defining a factor but not 1...n; !I must be followed by n if more than 1000 codes are present.
 Year !I   # 1995 1996

!AS p is required if the data field has level names in common with a previous !A or !I factor p and is to be coded identically, for example in a plant diallel experiment,
 Male !A 22 Female !AS Male   # integrated coding

!P indicates the special case of a pedigree factor. ASReml will determine whether the identifiers are integer or alphanumeric from the pedigree file qualifiers, and set the levels after reading the pedigree file, see Section 8.6.
 Animal !P   # coded according to pedigree file

A warning is printed if the nominated value for n does not agree with the actual number of levels found in the data; if the nominated value is too small the correct value is used.

• for a group of m variates or factor variables

5.4 Specifying and reading the data

!G m [l] is used when m contiguous data fields comprise a set to be used together. The variables will be treated as factor va
360. ng, like ln(n).

Table 6.4: GLM distribution qualifiers

qualifier      action

!TOTAL [n]     is used especially with binomial and ordinal data where n is the field containing the total counts for each sample. If omitted, count is taken as 1.

Residual qualifiers control the form of the residuals returned in the .yht file. The predicted values returned in the .yht file will be on the linear predictor scale if the !WORK or !PVW qualifiers are used. They will be on the observation scale if the !DEVIANCE, !PEARSON, !RESPONSE or !PVR qualifiers are used.

!DEVIANCE      produces deviance residuals, the signed square root of d/h from Table 6.4 where h is the dispersion parameter controlled by the !DISP qualifier. This is the default.
!PEARSON       writes Pearson residuals, $(y-\mu)/\sqrt{v}$, in the .yht file.
!PVR           writes fitted values on the response scale in the .yht file. This is the default.
!PVW           writes fitted values on the linear predictor scale in the .yht file.
!RESPONSE      produces simple residuals, $y - \mu$.
!WORK          produces residuals on the linear predictor scale, $(y-\mu)/(d\mu/d\eta)$.

A second dependent variable may be specified (except with a multinomial response, !MULTINOMIAL) if a bivariate analysis is required, but it will always be treated as a normal variate (no syntax is provided for specifying GLM attributes for it). The !ASUV qualifier is required in this situation for the GLM weights to be utilized.

ASReml internally calcul
361. ng row column                                     (optional field labels)
LANCER      1  1101 585 1 4 29.25 4.3 19.2 16 1      (data for sampling unit 1)
BRULE       2  1102 631 1 4 31.55 4.3 20.4 17 1      (data for sampling unit 2)
REDLAND     3  1103 701 1 4 35.05 4.3 21.6 18 1
CODY        4  1104 602 1 4 30.1  4.3 22.8 19 1
ARAPAHOE    5  1105 661 1 4 33.05 4.3 24   20 1
NE83404     6  1106 605 1 4 30.25 4.3 25.2 21 1
NE83406     7  1107 704 1 4 35.2  4.3 26.4 22 1
NE83407     8  1108 388 1 4 19.4  8.6 1.2  1  2
CENTURA     9  1109 487 1 4 24.35 8.6 2.4  2  2
SCOUT66     10 1110 511 1 4 25.55 8.6 3.6  3  2
COLT        11 1111 502 1 4 25.1  8.6 4.8  4  2
NE83498     12 1112 492 1 4 24.6  8.6 6    5  2
NE84557     13 1113 509 1 4 25.45 8.6 7.2  6  2
NE83432     14 1114 268 1 4 13.4  8.6 8.4  7  2
NE85556     15 1115 633 1 4 31.65 8.6 9.6  8  2
NE85623     16 1116 513 1 4 25.65 8.6 10.8 9  2
CENTURAK78  17 1117 632 1 4 31.6  8.6 12   10 2
NORKAN      18 1118 446 1 4 22.3  8.6 13.2 11 2
KS831374    19 1119 684 1 4 34.2  8.6 14.4 12 2

3.3 The ASReml data file

These data are analysed again in Chapter 7 using spatial methods of analysis, see model 3a in Section 7.5. For spatial analysis using a separable error structure (see Chapter 2), the data file must first be augmented to specify the complete 22 row x 11 column array of plots. These are the first 20 lines of the augmented data file nin89aug.asd with 242 data rows. Note that ASReml 4 can automatically augment spatial data (see !ROWFACTOR, !COLUMNFACTOR).

variety id pid raw repl nloc yield lat long row col
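The augmentation step described in fragment 361, completing the 22 row by 11 column array so that every plot position has a record, can be sketched as follows. This is an assumption-laden illustration rather than the procedure used to build nin89aug.asd: the function name, the "NA" placeholder for absent plots and the rows-within-columns ordering are my own choices, and the three observed records use the listing's values with decimal points as I read them.

```python
# Sketch: complete a rows x columns grid, adding placeholder records for absent plots.
# Placeholder value and ordering are illustrative assumptions.

def augment_grid(records, n_rows, n_cols, missing="NA"):
    """Return one record per (row, col) cell, in rows-within-columns order."""
    by_plot = {(rec["row"], rec["col"]): rec for rec in records}
    grid = []
    for col in range(1, n_cols + 1):
        for row in range(1, n_rows + 1):
            filler = {"variety": missing, "yield": missing, "row": row, "col": col}
            grid.append(by_plot.get((row, col), filler))
    return grid

observed = [
    {"variety": "LANCER", "yield": 29.25, "row": 16, "col": 1},
    {"variety": "BRULE", "yield": 31.55, "row": 17, "col": 1},
    {"variety": "NE83407", "yield": 19.4, "row": 1, "col": 2},
]
grid = augment_grid(observed, n_rows=22, n_cols=11)
print(len(grid))  # 22 * 11 = 242 records
```

The record count matches the 242 data rows quoted for the augmented file, which is the property a separable row-by-column error structure relies on.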
362. ning them (see Table 5.1),

• comma delimited files whose file name ends in .csv or for which the !CSV qualifier is set recognise empty fields as missing values,
  - a line beginning with a comma implies a preceding missing value,
  - consecutive commas imply a missing value,
  - a line ending with a comma implies a trailing missing value,
  - if the filename does not end in .csv and the !CSV qualifier is not set, commas are treated as white space,

• TAB delimited files recognise empty fields as missing values,

• characters following # on a line are ignored so this character may not be used except to flag trailing comments on the ends of lines, or to comment out data records (unless !SPECIALCHAR is specified, see Section 5.4.2),

• adjacent lines can be concatenated and written on one line using ;. For example

 line_1
 line_2
 ...
 line_n

4.2 The data file

can be written on one line as

 line_1; line_2; ...; line_n

This can aid legibility of the input file. Note that everything (including ;) after the first # on a line is interpreted as a comment.

• blank spaces, tabs and commas must not be used (embedded) in alphanumeric fields unless the label is enclosed in quotes, for example, the name Willow Creek would need to appear in the data file as 'Willow Creek' to avoid an error,

• the $ symbol must not be used in the data file,

• alphanumeric factor level labels have a default size of 16 characters. Use the !LL size qualifier to extend
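The comma rules in fragment 362 (a leading comma, consecutive commas and a trailing comma each imply a missing value) amount to treating every empty field as missing. The sketch below illustrates that reading only. It is not ASReml's reader, the function name is hypothetical, and it deliberately ignores quoted labels and comment characters.

```python
# Sketch: empty comma-delimited fields become missing values.
# Quoted labels and comments are not handled; this is an illustration only.

def parse_csv_record(line, missing=None):
    """Split one comma-delimited record, mapping empty fields to `missing`."""
    fields = [field.strip() for field in line.rstrip("\n").split(",")]
    return [field if field != "" else missing for field in fields]

print(parse_csv_record(",12,3.5"))   # leading comma: first field missing
print(parse_csv_record("a,,3.5"))    # consecutive commas: middle field missing
print(parse_csv_record("a,12,"))     # trailing comma: last field missing
```

Each record keeps three fields, so a missing value never shifts later fields into the wrong column, which is the point of the rules.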
363. nstant     1                  F     1
age          age                F     1
             spl(age,7)         R     5
             fac(age)           R     7
tree         tree               RC    5
age.tree     x.tree             RC    5
             spl(age,7).tree    R     25
             error              R

slope for each tree are included as random coefficients (denoted by RC in Table 15.11). Thus, if $U$ is the matrix of intercepts (column 1) and slopes (column 2) for each tree, then we assume that

$$\mathrm{var}(\mathrm{vec}(U)) = \Sigma \otimes I$$

where $\Sigma$ is a 2 x 2 symmetric positive definite matrix. Non-smooth variation can be modelled at the overall mean (across trees) level and this is achieved in ASReml by inclusion of fac(age) as a random term.

15.9 Balanced longitudinal data - Random coefficients and cubic smoothing splines - Oranges

Table 15.12: Sequence of models fitted to the Orange data

                            model
term                  1       2       3       4       5       6
tree                  y       y       y       y       y       y
age.tree              y       y       y       y       y       y
(covariance)          n       n       n       n       n       y
spl(age,7)            y       y       y       y       n       y
tree.spl(age,7)       y       y       y       n       y       y
fac(age)              n       y       y       n       n       n
season                n       n       y       y       y       y
REML log-likelihood  -97.78  -94.07  -87.95  -91.22  -90.18  -87.43

An extract of the ASReml input file is

 circ ~ mu age !r str(Tree age.Tree, us(2 !INIT 4.6 .00001 .000094).id(Tree)) idv(spl(age,7)) idv(spl(age,7).Tree) idv(fac(age))
 predict age Tree !IGNORE fac(age)

We stress the importance of model building in these settings, where we generally commence with relatively simple variance models and update to more complex variance models if appropriate. Table 15.12 presents the sequence of fitted models we have used. Note that the REML
364. ntal crosses, Computational Statistics and Data Analysis 51, 3749-3764.

Gilmour, A. R., Cullis, B. R. and Verbyla, A. P. (1997). Accounting for natural and extraneous variation in the analysis of field experiments, Journal of Agricultural, Biological, and Environmental Statistics 2, 269-273.

Gilmour, A. R., Cullis, B. R., Welham, S. J., Gogel, B. J. and Thompson, R. (2004). An efficient computing strategy for prediction in mixed linear models, Computational Statistics and Data Analysis 44, 571-586.

Gilmour, A. R., Thompson, R. and Cullis, B. R. (1995). AI, an efficient algorithm for REML estimation in linear mixed models, Biometrics 51, 1440-1450.

Gleeson, A. C. and Cullis, B. R. (1987). Residual maximum likelihood (REML) estimation of a neighbour model for field experiments, Biometrics 43, 277-288.

Gogel, B. J. (1997). Spatial analysis of multi-environment variety trials, PhD thesis, Department of Statistics, University of Adelaide.

Goldstein, H. and Rasbash, J. (1996). Improved approximations for multilevel models with binary response, Journal of the Royal Statistical Society A - General 159, 505-513.

Goldstein, H., Rasbash, J., Plewis, I., Draper, D., Browne, W., Yang, M., Woodhouse, G. and Healy, M. (1998). A user's guide to MLwiN, Institute of Education, London. URL: http://multilevel.ioe.ac.uk/

Green, P. J. and Silverman, B. W. (1994). Nonparametric regression and generalized linear models, Chapman a
365. $\sigma_A^2$ is the additive genetic variance, the variance component for dams is denoted by $\sigma_d^2 = \tfrac{1}{4}\sigma_A^2 + \sigma_m^2$, where $\sigma_m^2$ is the maternal variance component, and the variance component for litters is denoted by $\sigma_l^2$ and represents variation attributable to the particular mating.

For a multivariate analysis these variance components for sires, dams and litters are, in theory, replaced by unstructured matrices, one for each term. Additionally we assume the residuals for each trait may be correlated. Thus for this example we would like to fit a total of 4 unstructured variance models. For such a situation, it is sensible to commence the modelling process with a series of univariate analyses. These give starting values for the diagonals of the variance matrices, but also indicate what variance components are estimable.

15.10 Multivariate animal genetics data - Sheep

Table 15.13: REML estimates of a subset of the variance parameters for each trait for the genetic example, expressed as a ratio to their asymptotic s.e.

term       wwt    ywt    gfw    fdm    fat
sire       3.68   3.57   3.95   1.92   1.92
dam        6.25   4.93   2.78   0.37   0.05
litter     8.79   0.99   2.23   1.91   0.00
age.grp    2.29   1.39   0.31   1.15   1.74
sex.grp    2.90   3.43   3.70          1.83

The ASReml job for the univariate analyses is

 !RENAME 1 !ARG 1 2 3 4 5   # Does 5 runs one for each trait
 Multivariate Sire & Dam model
 !DOPART $1
 !IF $1 == 1 !ASSIGN YV wwt   # sets up dependent variable to each trait in turn
 !IF $1 == 2 !ASSIGN YV
366. o methods. In larger analyses, users can request the calculation be attempted using the !DDF qualifier (page 67). Use !DDF -1 to prevent the calculation to save processing time when significance testing is not required.

7 Command file: Specifying the variance structures

In Chapter 2 we presented the general linear mixed model

$$y = X\tau + Zu + e$$

where $y$ (n x 1) is a vector of observations, $\tau$ (p x 1) is a vector of fixed effects, $X$ (n x p) is the design matrix of full column rank that associates observations with the appropriate combination of fixed effects, $u$ (q x 1) is a vector of random effects, $Z$ (n x q) is the design matrix that associates observations with the appropriate combination of random effects, and $e$ (n x 1) is the vector of residual errors, see model (2.1). Among the key concepts regarding this model are:

• the sigma parameterization (Section 2.1.1)

$$\begin{bmatrix} u \\ e \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} G_\sigma(\sigma_g) & 0 \\ 0 & R_\sigma(\sigma_r) \end{bmatrix} \right)$$

where the matrices $G_\sigma$ and $R_\sigma$ are variance matrices for $u$ and $e$ and are functions of parameters $\sigma_g$ and $\sigma_r$. Under this parameterization

$$\mathrm{var}(y) = Z G_\sigma(\sigma_g) Z' + R_\sigma(\sigma_r)$$

• G structures for the random model terms (Section 2.1.3) and R structures for the residual error term (Section 2.1.5),

• direct sum structures for $G_\sigma$ and/or $R_\sigma$ (see below) (Sections 2.1.3 and 2.1.5),

• direct product structures for terms composed of several component factors (Section 2.1.10),

• the gamma parameterization for estimation of variance structure parameters as ratios relative to the residual error v
367. o the  number of levels identified in this se   quential process  see Other exam   ples below   Missing values remain  missing     changes the focus of subsequent  transformations to variable  field  v     replaces the variate with uniform  random variables having range 0  v     57    treat    ILCA B  CYR   treat ISET 1  1  1    group   treat ISET 1     2293 4    Anorm   A  SETN 2 5 10    Aeff      A  SETU 5 10    year 3  SUB 66 67 68    plot   V3  5EQ    sqrtaA    meanAB   A   2     TARGET sqrtA  70 5    Udat    1 0        UNIFORM 4 5    5 5 Transforming the data       Table 5 1  List of transformation qualifiers and their actions with examples       qualifier argument action examples        Vtarget  value assigns value to data field tar       V3 2 5  get overwriting previous contents   subsequent transformation qualifiers  will operate on data field target     Vfield assigns the contents of data field      V10 V3  field to data field target overwriting      V1i1 block  previous contents  subsequent trans       V12 VO    formation qualifiers will operate on  data field target  If field is 0 the  number of the data record is in   serted     5 5 2 QTL marker transformations    IMM s associates marker positions in the vector s  based on the Haldane mapping function   with marker variables and replaces missing values in a vector of marker states with expected  values calculated using distances to non missing flanking markers  This transformation will  normally be used on a  G n 
368. oat varieties and nitrogen application
Rat data: AOV decomposition
REML log-likelihood ratio for the variance components in the voltage data
Summary of variance models fitted to the plant data
Summary of Wald F statistics for fixed effects for variance models fitted to
Field layout of Slate Hall Farm experiment
Summary of models for the Slate Hall data
Estimated variance components from univariate analyses of bloodworm data: (a) Model with homogeneous variance for all terms and (b) Model with heterogeneous variance for interactions involving tmt
Equivalence of random effects in bivariate and univariate analyses
Estimated variance parameters from bivariate analysis of bloodworm data
Orange data: AOV decomposition
Sequence of models fitted to the Orange data
REML estimates of a subset of the variance parameters for each trait for the genetic example, expressed as a ratio to their asymptotic s.e.
Wald F statistics of the fixed effects for each trait for the genetic example
Variance models fitted for each part of the ASReml job in the analysis of the genetic example

List
369. obtained the maximum available workspace, then use !WORKSPACE to increase it. The problem could be with the way the model is specified. Try fitting a simpler model or using a reduced data set to discover where the workspace is being used.

The response variable nominated by the !YVAR command line qualifier is not in the data.

14.5 Information, Warning and Error messages

Table 14.3: Alphabetical list of error messages and probable cause(s)/remedies

error message (column as extracted):

Invalid binary data
Invalid Binomial Variable
Invalid definition of factor
Invalid error structure for Multivariate Analysis
Invalid factor in model.
Invalid model factor ...
Invalid SOURCE in R structure definition
Invalid weight/filter column number.
Iteration aborted because of singularities
Iteration failed
Matern
Maximum number of special structures exceeded
Maximum number of variance parameters exceeded
Missing/faulty !SKIP or !A needed for ...
Missing values in design variables/factors
Missing Value Miscount forming design
Missing values not allowed here.
Multiple trait mapping problem

probable cause/remedy (column as extracted):

The data values are out of the expected range for binary/binomial data.

there is a problem with forming one of the generated factors. The most probable cause is that an interaction cannot be formed.

You must either use the US error structure or use the !ASUV qualifier (and maybe include mv in
370. occur in the primary data file, and that there are no extraneous lines in the MERGE file. A much more powerful merging facility is provided by the MERGE directive described in Chapter 11.

For example, assuming the field definitions define 10 fields,
 PRIMARY.DAT !skip 1
 !MERGE 6 SECOND.DAT !SKIP 1 !MATCH 1 6
would obtain the first five fields from PRIMARY.DAT and the next five from SECOND.DAT, checking that the first field in each file has the same value.

Thus each input record is obtained by combining information from each file, before any transformations are performed.

formally instructs ASReml to read n data fields from the data file. It is needed when there are extra columns in the data file that must be read but are only required for combination into earlier fields in transformations, or when ASReml attempts to read more fields than it needs to.

is required when reading a binary data file with pedigree identifiers that have not been recoded according to the pedigree file. It is not needed when the file was formed using the !SAVE option but will be needed if formed in some other way (see Section 4.2).

is used in combination with !COLUMNFACTOR and !SECTION to get ASReml to insert extra data records to complete the grid of plots defined by the RowFactor and the ColumnFactor for each Section so that a two-dimensional error structure can be defined (see !SECTION on page 73).

causes ASReml to read n records or to read up to a data
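The record-by-record merge in the example of fragment 370 (fields 1 to 5 from the primary file, fields 6 to 10 from the second file, with fields 1 and 6 required to agree) can be mimicked in a few lines. This is a sketch of the described behaviour under the stated assumption that both files hold the same records in the same order. The function name, file contents and error handling are hypothetical and are not ASReml's implementation.

```python
# Sketch: combine two whitespace-delimited files record by record and
# check that a pair of fields agrees. Contents are illustrative.
import io

def merge_records(primary, second, n_primary_fields, match):
    """Yield combined records; `match` is a pair of 1-based field numbers that must agree."""
    left_no, right_no = match
    for left_line, right_line in zip(primary, second):
        record = left_line.split()[:n_primary_fields] + right_line.split()
        if record[left_no - 1] != record[right_no - 1]:
            raise ValueError(f"records do not match: {record}")
        yield record

primary = io.StringIO("id a b c d\n1 10 11 12 13\n2 20 21 22 23\n")
second = io.StringIO("id e f g h\n1 14 15 16 17\n2 24 25 26 27\n")
next(primary)  # mimic skipping one header line in the primary file
next(second)   # and one header line in the second file
merged = list(merge_records(primary, second, n_primary_fields=5, match=(1, 6)))
print(merged)
```

Because the combination happens before any other processing, a mismatch is raised at the first offending record rather than silently pairing unrelated rows.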
371. ode                          action

yield !M0                     changes the zero entries in yield to missing values
yield !^0                     takes natural logarithms of the yield data
score !-5                     subtracts 5 from all values in score
score !SET -0.5 1.5 2.5       replaces data values of 1, 2 and 3 with -0.5, 1.5 and 2.5 respectively
score !SUB -0.5 1.5 2.5       replaces data values of -0.5, 1.5 and 2.5 with 1, 2 and 3 respectively; a data value of 1.51 would be replaced by 0 since it is not in the list or very close to a number in the list

block 8                       in the case where
variety 20                    - there are multiple units per plot,
yield                         - contiguous plots have different treatments, and
plot !=variety !SEQ           - the records are sorted units within plots within blocks,
                              this code generates a plot factor assuming a new plot whenever the code in V2 (variety) changes; whether this creates a variable or overwrites an input variable depends on whether any subsequent variables are input variables

Var 3                         assuming Var is coded 1:3 and Nit is coded 1:4, this syntax could be used to create a new factor VxN with the 12 levels of the composite Var by Nit factor
Nit 4
VxN 12 !=Var !-1 !*4 !+Nit

YA !V98=YA !NA 0              will discard records where both YA and YB have missing values (assuming neither have zero as valid data). The first line sets the focus to variable 98, copies YA into V98 and changes any missing values in V98 to zero. The second line sets the focus to variable 99, copies YB
YB !V99=YB !NA 0 !+V98 !D0
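The recoding behaviour described for the SET, SUB and SEQ transformations in fragment 371 can be sketched in plain Python. These functions only mirror the textual descriptions. They are not ASReml code, the function names and the tolerance used for "very close" are my assumptions, the example values are of my own choosing, and missing values are not handled.

```python
# Sketch: SET-like, SUB-like and SEQ-like recodings, following the textual descriptions.
# Names, tolerance and data are illustrative assumptions.

def set_values(codes, values):
    """SET-like: replace integer codes 1..k with values[0..k-1]."""
    return [values[int(code) - 1] for code in codes]

def sub_values(data, values, tol=1e-6):
    """SUB-like: replace a value by its 1-based position in `values`, or 0 if absent."""
    out = []
    for x in data:
        position = 0
        for k, v in enumerate(values, start=1):
            if abs(x - v) <= tol:
                position = k
                break
        out.append(position)
    return out

def seq_values(data):
    """SEQ-like: start at 1 and add 1 whenever the value differs from the previous one."""
    out, current, previous = [], 0, object()
    for x in data:
        if x != previous:
            current += 1
            previous = x
        out.append(current)
    return out

print(set_values([1, 2, 3, 2], [10.0, 20.0, 30.0]))
print(sub_values([10.0, 20.0, 30.0, 20.4], [10.0, 20.0, 30.0]))
print(seq_values([3, 3, 7, 7, 3]))
```

In the last call the value 3 reappears after 7 and receives a new sequence number, which is why the plot-factor example requires records sorted so that units of the same plot are contiguous.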
372. ograms, run and then view the output, before saving results. It is available on the following platforms:

• Windows 32-bit and 64-bit,
• Linux 32-bit and 64-bit (various incantations).

ASReml-W has a built-in help system explaining its use.

1.3.2 ConTEXT

ConTEXT is a third-party freeware text editor, with programming extensions which make it a suitable environment for running ASReml under Windows. The ConTEXT directory on the CD-ROM includes installation files and instructions for configuring it for use in ASReml. Full details of ConTEXT are available from http://www.contexteditor.org/.

1.4 How to use this guide

The guide consists of 15 chapters. Chapter 1 introduces ASReml and describes the conventions used in the guide. Chapter 2 outlines some basic theory which you may need to come back to.

New ASReml users are advised to read Chapter 3 before attempting to code their first job. It presents an overview of basic ASReml coding demonstrated on a real data example. Chapter 15 presents a range of examples to assist users further. When coding your first job, look for an example to use as a model.

Data file preparation is described in Chapter 4, and Chapter 5 describes how to input data into ASReml. Chapters 6 and 7 are key chapters which present the syntax for specifying the linear model and the variance models for the random effects in the linear mixed model. Variance modelling is a c
373. olidated model term generates a correlation matrix (for example, the consolidated model term for A.B is specified as id(A).ar1(B)), then it is usually the case that one wishes to fit a model with this correlation structure but to also allow the effects to have a common variance. When a correlation structure is specified for a consolidated term, either for an R or a G structure, ASReml will detect this and add a common scaled variance parameter. Some users might find it simpler and less confusing to specify terms as variance terms directly. For example, id(A).ar1(B) should become either idv(A).ar1(B) or id(A).ar1v(B); it is arbitrary which variable the common variance is attached to. If more than one variance model function in the consolidated model term generates a variance structure (either homogeneous or heterogeneous), for example idv(A).ar1v(B), then the parameters will not all be identifiable and so the user must either change idv(A) to id(A) and leave ar1v(B) as it is, or change ar1v(B) to ar1(B) and leave idv(A) as it is.

7.5 A sequence of variance structures for the NIN data

Having outlined the theory and introduced the functional specification, we pause now to consider an example. The following is a series of six variance structures of increasing complexity for the NIN column trial data (see Chapter 3 for an introduction to these data). For each example we present a code box to the right that contains the functional specification
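The identifiability remark in fragment 373 can be checked numerically: when both components of a direct product carry their own variance, only the product of the two variances enters the resulting matrix. The sketch below builds a scaled identity and a scaled AR1 correlation matrix, forms their Kronecker product, and shows that swapping the two variances gives the same matrix. The helper names, dimensions and parameter values are illustrative assumptions.

```python
# Sketch: a direct product with two variance parameters depends only on their product.
# Helper names and values are illustrative assumptions.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def ar1_correlation(n, rho):
    """AR1 correlation matrix with entries rho^|i-j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def scaled(matrix, s):
    return [[s * x for x in row] for row in matrix]

def kron(a, b):
    """Kronecker (direct) product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def idv_ar1v(n_a, n_b, var_a, var_b, rho):
    """Variance matrix with a variance on both the identity and the AR1 component."""
    return kron(scaled(identity(n_a), var_a), scaled(ar1_correlation(n_b, rho), var_b))

first = idv_ar1v(2, 3, var_a=2.0, var_b=3.0, rho=0.5)
second = idv_ar1v(2, 3, var_a=3.0, var_b=2.0, rho=0.5)
largest_gap = max(abs(x - y) for r1, r2 in zip(first, second) for x, y in zip(r1, r2))
print(largest_gap)
```

Two different parameter pairs giving the same variance matrix is exactly why one of the two variances has to be dropped, leaving either the identity or the AR1 component as a pure correlation model.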
374. om the command prompt when attached to the appropriate folder is

 ASReml nin89.as

However, if the path to ASReml is not specified in your system's PATH environment variable, the path must also be given, and the path is required when configuring ASReml-W or ConTEXT.

In this guide we assume the command file has a filename extension .as. ASReml also recognises the filename extension .asc as an ASReml command file. When these are used, the extension (.as or .asc) may be omitted from basename.as in the command line if there is no file in the working directory with the name basename. The options and arguments that can be supplied on the command line to modify a job at run time are described in Chapter 10.

3.6 Description of output files

A series of output files are produced with each ASReml run. Nearly all files (all that contain user information) are ASCII files and can be viewed in any ASCII editor including ConTEXT, ASReml-W and NotePad. The primary output from the nin89.as job is written to nin89.asr. This file contains a summary of the data, the iteration sequence, estimates of the variance parameters and a table of Wald F statistics for testing fixed effects. The estimates of all the fixed and random effects are written to nin89.sln. The residuals, predicted values of the observations and the diagonal elements of the hat matrix (see Chapter 2) are returned in nin89.yht, see Section 13.3. Other key
375. ome missing values in the contrast. Zero values in the factor (no level assigned) become zeros in the contrast. The user should check that the levels of the factor are in the order assumed by contrast (check the .ass or .sln or .tab files). It may also be used on the implicit factor Trait in a multivariate analysis provided it implicitly identifies the number of levels of Trait; the number of traits is implied by the length of the list. Thus, if the analysis involves 5 traits,

!CONTRAST Time Trait 1 3 5 10 20

!DDF i requests computation of the approximate denominator degrees of freedom according to Kenward and Roger (1997) for the testing of fixed effects terms in the dense part of the linear mixed model. There are three options for i: i = -1 suppresses computation; i = 1 and i = 2 compute the denominator d.f. using numerical and algebraic methods respectively. If i is omitted then i = 2 is assumed. If !DDF i is omitted, i = -1 is assumed except for small jobs (< 10 parameters, < 500 fixed effects, < 10,000 equations and < 100 Mbyte workspace) when i = 2.

Calculation of the denominator degrees of freedom is computationally expensive. Numerical derivatives require an extra evaluation of the mixed model equations for every variance parameter. Algebraic derivatives require a large dense matrix, potentially of order number of equations plus number of records, and are not available when !MAXIT is 1 or for multivariate analysis.

adds a
376. omplex aspect of analysis. We introduce variance modelling in ASReml by example in Chapter 15.

Chapters 8 and 8.3.1 describe special commands for multivariate and genetic analyses respectively. Chapter 9 deals with prediction of fixed and random effects from the linear mixed model and Chapter 12 presents the syntax for forming functions of variance components such as heritability.

Chapter 10 discusses the operating system level command for running an ASReml job. Chapter 11 describes a new data merging facility. Chapter 13 gives a detailed explanation of the output files. Chapter 14 gives an overview of the error messages generated in ASReml and some guidance as to their probable cause.

1.5 Getting assistance and the ASReml forum

The ASReml help accessible through ASReml-W can also be linked to ConText or accessed directly (ASReml.chm).

There is a User Area on the website (http://www.VSNi.co.uk, select ASReml and then User Area) which contains contributed material that may be of assistance.

Users with a support contract with VSN should email support@asreml.co.uk for assistance with installation and running ASReml. When requesting help, please send the input command file, the data file and the corresponding primary output file along with a description of the problem. All ASReml users, including unsupported users, are encouraged to join the ASReml forum; register now at http://www.vsni.co.uk/forum.

1.6 Typographic conventions

If ASRem
377. on

From Section 2.1.1, the variance matrix of y is $\mathrm{var}(y) = Z G(\sigma_g) Z' + R(\sigma_r)$ (see model (2.3)). This is referred to as the sigma parameterization and the individual variance structure parameters in $\sigma_g$ and $\sigma_r$ are referred to as sigmas. For the case when the variance structure for the residual error term is a scaled correlation matrix, that is, $R(\sigma_r) = \sigma^2 R(\gamma_r)$, the variance matrix of y can be written alternatively as

$$\mathrm{var}(y) = \sigma^2 \left( Z G(\gamma_g) Z' + R(\gamma_r) \right)$$

(see (2.8)). This is referred to as the gamma parameterization and the variance structure parameters in $\gamma_g$ and $\gamma_r$ are referred to as gammas, see Section 2.1.6.

7.6.1 Which parameterization does ASReml use for estimation?

By default, ASReml uses either the gamma or sigma parameterization for estimation depending on the residual specification. The current default for univariate, single section data sets is the gamma parameterization. It is possible to override this default as discussed in the following section. ASReml reports both the gammas and the sigmas when the gamma parameterization is used for estimation. For historical reasons, the sigmas are presented twice (two identical columns) when the sigma parameterization is used for estimation.

ASReml uses the sigma parameterization for analyses other than univariate single site analyses, examples including multi-section analyses, multivariate analyses and repeated measures analysis using R structures that are not the default variance model
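The link between the two parameterizations can be illustrated numerically. The following is a minimal Python sketch, not ASReml code: the function name, the parameter names and the dictionary layout are hypothetical, and the numbers are illustrative values only. It assumes a single residual variance $\sigma^2$, so that variance-type gammas are multiplied by $\sigma^2$ to give sigmas while correlation-type parameters are unchanged by the scaling.

```python
def gammas_to_sigmas(gammas, sigma2, kinds):
    """Convert gamma-scale parameters to the sigma scale.

    gammas : dict of parameter name -> value on the gamma scale
    sigma2 : the residual (scale) variance
    kinds  : dict of parameter name -> "variance" or "correlation"
    """
    sigmas = {}
    for name, gamma in gammas.items():
        kind = kinds[name]
        if kind == "variance":
            # A variance ratio becomes a variance once multiplied by sigma^2.
            sigmas[name] = gamma * sigma2
        elif kind == "correlation":
            # Scaling a matrix by sigma^2 leaves its correlations unchanged.
            sigmas[name] = gamma
        else:
            raise ValueError(f"unknown parameter kind: {kind!r}")
    return sigmas


# Illustrative values only (hypothetical parameter names).
gammas = {"nugget": 0.106152, "rho": 0.843791}
kinds = {"nugget": "variance", "rho": "correlation"}
sigmas = gammas_to_sigmas(gammas, sigma2=45793.4, kinds=kinds)
print(sigmas)
```

Keeping the kind of each parameter explicit avoids rescaling a correlation by mistake, which is the main hazard when moving between the two scales by hand.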
378. on NumDF DenDF F_inc Prob  8 mu i 12 3 850 88  lt  001  6 variety 24 80 0 13 04  lt  001      AR1 x AR1   units    1 LogL  740 735 S2  33225  125 df   2 components constrained  2 LogL  723 595 S2  11661  125 df   1 components constrained  3 LogL  698  498 S2  46239  125 df  4 LogL  696  847 S2  44725  125 df  5 LogL  696  823 S2  45563  125 df  6 LogL  696  823 S2  45753  125 df  7 LogL  696 823 S2  45796  125 df          Results from analysis of yield        Akaike Information Criterion 1401 65  assuming 4 parameters    Bayesian Information Criterion 1412 96    Model_Term Gamma Sigma Sigma SE  C  idv units  IDV_V 150 0 106152 4861 06 212 OP    289    15 6 Spatial analysis of a field experiment   Barley       ari  column   ari  row     Residual  column  row    SCA_V  AR_R  AR_R    Source of Variation    8 mu  6 variety    150 effects   150 1 000000  1 0 843791  1 0 682682    45793 4  0 843791  0 682682    Wald F statistics    NumDF DenDF  1 3 6  24 TST    FP ine  259 83  10 21    O O     Y SG    P ine   lt  001   lt  001    The lattice analysis  with recovery of between block information  is presented below  This  variance model is not competitive with the preceding spatial models  The models can be  formally compared using the BIC values for example       IB analysis    LogL  707  LogL  707    NO OP WNEH    LogL  734   LogL  720   LogL  711   LogL  707   LogL  707     786    786    184  060  119  937  786    52  26778  125  52  16591  125  52  11173  125  S2  8562 4 125  52 
379. on of varieties to plots in the NIN field trial
List of transformation qualifiers and their actions with examples
Qualifiers relating to data input and output
List of commonly used job control qualifiers
List of occasionally used job control qualifiers
List of rarely used job control qualifiers
List of very rarely used job control qualifiers
Summary of reserved words, operators and functions
Alphabetic list of model functions and descriptions
Link qualifiers and functions
GLM distribution qualifiers. The default link is listed first followed by per
Examples of aliassing in ASReml
List of common variance model functions; their type (correlation or variance); the form of the variance matrix generated (C for correlation, V for variance matrix, S for scaled variance matrix); and a brief description. Parameters $\sigma^2 > 0$ are variances; $-1 < \rho < 1$ are correlations. Subscript c denotes parameter held in common across all rows/columns
Building consolidated model terms in ASReml
G structure for the random terms (magenta) and R structure for the residual error term (cyan) under both the sigma and gamma parameterizations, and the correspondin
380. ons  including special functions to be included in the table row 22  of Wald F statistics  column 11    e generally begins with the reserved word mu which fits nin89 asd  skip 1  mvinclude    a constant term  mean or intercept  see Table 6 1     93    yield   mu variety  r idv repl    If mv  residual idv units           6 4 Random and residual terms in the variance component model       6 3 2 Sparse fixed terms       The  f sparse_fixed terms in model formula NIN Alliance Trial 1989  variety  e are the fixed covariates  for example  the fixed  lin row  covariate now included in the model for  eae  mula   factors and interactions including special func  column 11  tions and reserved words  for example mv  see Table nin89 asd  skip 1  6 1  for which Wald F statistics are not required  yield   mu variety  r idv repl      If mv lin row     include large   gt 100 levels  terms  lasa iav unii           6 4 Random and residual terms in the variance component  model       The  r conrandom functions have arguments that NIN Alliance Trial 1989  variety  e comprise random covariates  factors and interactions  including special functions and reserved words  see   row 29  Table 6 1  Note that idvQ  may not enclose a column 11  contracted at   function  an at   function that is   nin89 asd  skip 1  expanded by ASReml to form multiple model terms    yield   mu variety  r idv repl        f    because the result is ambiguous  T s  residual idv units           In Chapter 7 we discuss possible 
381. onsistent comparisons between check varieties and test lines  Given  the large amount of replication afforded to check varieties there will be very little shrinkage  irrespective of the realised heritability     We consider an initial analysis with spatial correlation in one direction and fitting the variety  effects  check  replicated and unreplicated lines  as random  We present three further spatial  models for comparison  The ASReml input file is     EPS  RENAME  ARG 1 2 3  Tullibigeal trial  DOPART  1  linenum  yield  weed  column 10  row 67  variety 532   testlines 1 525  check lines 526 532  wheat asd  SKIP 1   PATH 1   ARI x I  y   mu weed mv  r idv variety   residual ariv row   id col    PATH 2   ARI x AR1  y   mu weed mv  r variety  residual ariv row   ar1 col    PATH 3   AR1 x AR1   column trend  y   mu weed pol column  1  mv  r idv variety   residual ariv row   ar1 col    PATH 4   AR1 x AR1   Nugget   column trend  y   mu weed pol column  1  mv  r idv variety  idv units   residual ar1 row   ar1i col   predict var    The data fields represent the factors variety  row and column  a covariate weed and the plot  yield  yield   There are four paths in the ASReml file  We begin with the one dimensional  spatial model  which assumes the variance model for the plot effects within columns is  described by a first order autoregressive process  The abbreviated output file is    1 LogL  4280 75 S2  0 12850E 06 666 df  2 LogL  4268  58 82  0 12139E 06 666 df  3 LogL  4255 89 S
382. onth year    IPRWTS   56 36 70 53 0   556 0 O 56 22   BG 0 21 22 0   53 53 17 92 53   57 23 O 19 70   63 24 0 44 22   054 0 0 0   0 54 70 O 51   0 43 0 36 16   035 0 O 51   0 053 0 U   O 0 0 49 0  5  predict crop 1 pasture lime  PRES year month  PRWIS    YMprwts txt       where YMprwts txt contains    19 2 11 0 11 2 10 6 11 4 1226 0 0 0 0 0 0 0 0 0 0 O20  faz 0 0 0 0 10 6 4 6 4 8 10 8 10 8 8 6 7 0 0 0 0 0  14  0 0 4 2 3 4 0 0 0 0 0 0 14  0 0 0 0 10 6 0 0  10 6 11 2 4 4 16 4 3 8 6 8 0 G  fie W 0 9 8    189    9 3 Prediction       0 4 4 0 10 6 14 4 4 0 10 2 3 2 10 29 0    We have presented both sets of predict statements to show how the weights were derived  and presented  Notice that the order in  PRESENT year month implies that the weight  coefficients are presented in standard order with the levels for months cycling within levels  for years  There is a check which reports if non zero weights are associated with cells that  have no data  The weights are reported in the  pvs file   PRESENT counts are reported in  the  res file     9 3 6 Examples    Examples are as follows     yield   mu variety  r idv repl   predict variety    is used to predict variety means in the NIN field trial analysis  Random rep1 is ignored in  the prediction    yield   mu x variety  r idv repl    predict variety   predicts variety means at the average of x ignoring random repl    yield   mu x variety repl   predict variety x 2   forms the hyper table based on variety and repl at the covariate value of
383. or fields can be skipped and superfluous rows before the regressor information can be skipped.

8.11 Factor effects with large Random Regression models

The syntax for specifying and reading the .grr file is

M.grr !CSKIP c1 Factor f !NOID !CSKIP c2 Regressors m !NONAMES !SKIP s

where

M.grr is the name of the file to be read,
!CSKIP c1 indicates c1 fields are to be skipped before the factor identifiers are read,
Factor is the name of the variable in the data that is associated with the regressors,
f sets the maximum number of levels (default 1000) of Factor with regressor data (ASReml will count the actual number),
!NOID indicates that the factor identifiers are not present in the .grr file,
!CSKIP c2 indicates c2 fields are to be skipped before the regressor variables are read,
Regressors is the name for the set of regressor variables,
m sets the number of regressor variables (default is the number of names found); it must be set if there are extraneous fields to be ignored,
!SKIP s specifies how many lines are to be skipped before reading the regressor data,
!NONAMES indicates there is no line containing the individual names of the regressor variables; otherwise names are taken from the first (non-skipped) line in the file.

If the factor identifiers are not present (!NOID), ASReml assumes that the order of the factor classes in the data file matches the order in the .grr file. If the factor identifiers are present
384. or messages in the .asr file. The major information messages are in Table 14.1. A list of warning messages together with the likely meaning(s) is presented in Table 14.2. Other error messages with their probable cause(s) are presented in Table 14.3.

Not all messages are listed here. If a message is not listed, identify whether the problem is syntactical (as in the previous section), whether it is a processing problem (the job starts to process but does not complete) or a reporting problem.

• for a syntax problem, note that the actual problem may be in an earlier line, and the current message is indicating an inconsistency with what ASReml has already read. Scan the output for other messages which might indicate the problem. If the problem is not evident, simplify the job until the simpler version runs and then build back to the required model. Remember that the model statement is parsed before the data file is read, but any following statements (e.g. residual, predict) are parsed after the data is read.

• processing errors are indicated if the .asr file contains lines like "Forming 18211 equations: 42 dense. Initial updates will be shrunk by factor 0.316". Simple things to try are increasing !WORKSPACE and simplifying the model.

• reporting problems are indicated if the LogL has converged or ASReml has completed the specified number of iterations.

Do not hesitate to seek help on the forum and to report problems to support@vsni.co.uk. Often a simple solution is availab
385. or name in RESIDUAL declaration       After correcting the spelling of Repl  we  qin Alliance Trial 1989  get the following  abbreviated  output  The   variety  A   problem here is essentially the same as error   id pid raw   5  The spatial residual model was declared   TeP      using Row and Col but the relevant variables T   are in fact row and column  Note that  in   Jing9 asa lakip 4   this case column could be truncated to colin   yield   mu variety     the model formulae as this does not cause any    R repl   ambiguity but often it is clearer to use the full   residual ar1 Row   ar1 Col   variable name  predict varierty             Summary of 224 records retained of 224 read    Model term Size  miss  zero MinNonO Mean MaxNonO StndDevn  1 variety 56 0 0 1 28 5000 56  10 row 22 0 0 1 11 7321 22  11 column 11 0 0 1 6 3304 if  12 mu 1  ari  Row  in ar1 Row  ar1 Col  has size 0  parameters  5 5  ari  Col  in ar1 Row  ar1 Col  has size 0  parameters  6 6  ar1  Row   ar1 Co1    4  6  initialized     Error  There are 224 data records but RESIDUAL model implies 0 data records     Error  Unrecognised argument in ari  Row   Error  Unrecognised argument in ar1 Col    Fault  RESIDUAL structure does not match records in data  Last line read was  Residual ari Row  ari Col    ninerr6 variety id pid raw rep nloc yield lat   Model specification  TERM LEVELS GAMMAS    variety 56  mu 1  repl 4 0 100   3   SECTIONS 0 4 1  STRUCT 0 1 1 5 il 1 1   17 factors defined  max5000     6 variance pa
386. or the response variable(s) to be analysed; multivariate analysis is discussed in Chapter 8.

• qualifiers allow for weighted analysis (Section 6.7) and Generalized Linear Models (Section 6.8).

• ~ is read as "is modelled as" and separates response from the list of fixed and random terms in the linear mixed model.

• fixed represents the list of primary fixed explanatory terms, that is, variates, factors, interactions and special terms for which Wald F statistics are required. See Table 6.1 for a brief definition of reserved model terms, operators and commonly used functions. The full definition is in Section 6.6.

• conrandom represents the list of consolidated model terms (see Chapter 7) specifying both random effects and variance structures. In this chapter the consolidated model terms are of the form idv() with arguments being the explanatory terms to be fitted as random effects, see Table 6.1 and Section 6.6. Specifying idv(term) indicates that the term effects are IID with a common variance.

• sparse_fixed are additional fixed terms not included in the table of Wald F statistics.

• the residual statement allows specification of the residual error variance structure.

• conresidual is the list of residual consolidated terms (see Chapter 7) specifying both random effects and variance structures. In this chapter we are assuming that the residual errors are IID. Hence the specification idv(units) in the code box, where units is the reserved w
387. ord specifying a factor with a level for every experimental unit     6 2 1 General rules    The following general rules apply in specifying the linear mixed model    87    6 2 Specifying model formulae in ASReml       all elements in the model must be space separated     the character       modelled as     separates the response variables s  from the explanatory  variables in the model     elements in the model may be separated by   which is ignored except when it is at the  end of a line which implies the model continues onto the next line  the   sign must appear  on the first line of the model statement when the model statement is written over several  lines     data fields are identified in the model by their labels    labels are case sensitive       labels may be abbreviated  truncated  when used in the model line but care must be  taken that the truncated form is not ambiguous  If the truncated form matches more  than one label  the term associated with the first match is assumed    For example  dens is an abbeviation for density but spl dens 7  is a different model  term to spl density 7  because it does not represent a simple truncation       model terms may only appear once in the model line  repeated occurrences are ignored     model terms other than the original data fields are defined the first time they appear  on the model line  They may be abbreviated  truncated  if they are referred to again    provided no ambiguity is introduced     Important It is often clear
388. oup constraints  see  GROUPS below  is to shrink the group effects  by adding the constant o   gt  0  to the diagonal elements of A   pertaining to groups   When a constant is added  no adjustment of the degrees of freedom is made for genetic  groups    Use  GOFFSET  1 to add no offset but to suppress insertion of constraints where empty  groups appear  The empty groups are then not counted in the DF adjustment     includes genetic groups in the pedigree  The first g lines of the pedigree identify genetic  groups  with zero in both parent fields   All other lines must specify one of the genetic  groups as parent if the actual parent is unknown    You may insert groups identifiers with no members to define constraints on groups  that  is to associate groups into supergroups where the supergroup fixed effect is formally  fitted separately in the model  A constraint is added to the inverse which causes the  preceding set of groups which have members to have effects which sum to zero  The  issue is to get the degrees of freedom correct and to get the correct calculation of the  Likelihood  especially in bivariate cases where DF associated with groups may differ  between traits  The  LAST qualifier  see page 80  is designed to help as without it   reordering may associate singularities in the A matrix with random effects which at  the very least is confusing  When the A matrix incorporates fixed effects  the number of  DF involved may not be obvious  especially if there is also a 
389. ows Print Manager  11  WMF Windows Meta File wmf  12  HPGL 2 HP GL2 hgl  21  PNG PNG png  22    EPS EncapsulatedPostScript eps          10 3 5 Job control command line options  C  F  O  R     C   CONTINUE  indicates that the job is to continue iterating from the values in the  rsv  file  This is equivalent to setting  CONTINUE on the datafile line  see Table 5 4  page 66 for  details     F    FINAL  indicates that the job is to continue for one more iteration from the values in the   rsv file  This is useful when using predict  see Chapter 9     O   ONERUN  is used with the R option to make ASReml perform a single analysis when  the R option would otherwise attempt multiple analyses  The R option then builds some  arguments into the output file name while other arguments are not  For example   ASReml  nor2 mabphen 2 TWT out 621  out 929    results in one run with output files mabphen2_TWT        R r    RENAME  r   is used in conjunction with at least r argument s  and does two things   it modifies the output filename to include the first r arguments so the output is identified  by these arguments  and  if there are more than r arguments  the job is rerun moving the  extra arguments up to position r  unless  ONERUN  0  is also set   If r is not specified  it is  taken as 1     For example  ASReml  r2 job wwt gfw fd fat   is equivalent to running three jobs   ASReml  r2 job wwt gfw     jobwwt_gfw asr  ASReml  r2 job wwt fd     jobwwt_fd asr  ASReml  r2 job wwt fat     jobwwt_fa
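The renaming and rerun behaviour described for the R option can be sketched as follows. This is an illustrative Python sketch, not part of ASReml: the function name is hypothetical, and the naming rule (first argument appended directly to the basename, later arguments joined with underscores) is an assumption inferred from the two examples given, mabphen2_TWT and jobwwt_gfw.

```python
def expand_rename_runs(basename, args, r=1):
    """List the runs implied by RENAME r for a job and its arguments.

    Returns a list of (arguments_for_run, primary_output_name) pairs.
    The first r-1 arguments are kept in every run; each remaining
    argument is moved, in turn, into position r.
    """
    if r < 1 or len(args) < r:
        raise ValueError("RENAME r needs at least r arguments")
    fixed = list(args[: r - 1])      # arguments 1 .. r-1, shared by all runs
    varying = list(args[r - 1:])     # argument r and any extra arguments
    runs = []
    for last in varying:
        run_args = fixed + [last]
        # Assumed naming rule: basename + arg1, then "_" + each later argument.
        stem = basename + run_args[0] + "".join("_" + a for a in run_args[1:])
        runs.append((run_args, stem + ".asr"))
    return runs


for run_args, output in expand_rename_runs("job", ["wwt", "gfw", "fd", "fat"], r=2):
    print("ASReml -r2 job", " ".join(run_args), "->", output)
```

With the ONERUN option the loop over the extra arguments would not happen; the sketch covers only the default multiple-run case.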
390. p/SE ratios of zero sometimes indicate poor scaling. Consider rescaling the design matrix in such cases.

Wald F statistics

Source of Variation NumDF F_inc
7 mu 1 1405.14
4 tmt a 441.72

The estimated variance components from this analysis are given in column (b) of Table 15.8. There is no significant variance heterogeneity at the residual or tmt.run level. This indicates that the square root transformation of the data has successfully stabilised the error variance. There is, however, significant variance heterogeneity for tmt.variety interactions, with the variance being much greater for the control group. This reflects the fact that in the absence of bloodworms the potential maximum root area is greater. Note that the tmt.variety interaction variance for the treated group is negative. The negative component is meaningful (and in fact necessary and obtained by use of the !GU option) in this context since it should be considered as part of the variance structure for the combined variety main effects and treatment by variety interactions. That is,

$$\mathrm{var}(1_2 \otimes u_1 + u_2) = \begin{bmatrix} \sigma_1^2 + \sigma_{2c}^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 + \sigma_{2t}^2 \end{bmatrix} \otimes I \qquad (15.8)$$

Using the estimates from Table 15.8 this structure is estimated as

$$\begin{bmatrix} 3.84 & 2.33 \\ 2.33 & 1.96 \end{bmatrix} \otimes I$$

Thus the variance of the variety effects in the control group (also known as the genetic variance for this group) is 3.84. The genetic variance for the treated group is much lower (1.96). The genetic correlation is $2.33/\sqrt{3.84 \times 1.96} = 0.85$ which
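The genetic correlation quoted above is the covariance divided by the square root of the product of the two variances. A minimal Python check of that arithmetic is given here; the function name is hypothetical and the three inputs are the estimates quoted in the text.

```python
import math


def correlation_from_covariance(var_a, var_b, cov_ab):
    """Correlation implied by two variances and their covariance."""
    if var_a <= 0 or var_b <= 0:
        raise ValueError("both variances must be positive")
    return cov_ab / math.sqrt(var_a * var_b)


# Genetic variances for control (3.84) and treated (1.96), covariance 2.33.
r = correlation_from_covariance(3.84, 1.96, 2.33)
print(round(r, 2))  # prints 0.85
```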
391. p information vy  at position p    sin v r  forms sine from v with period r    OS    sqrt v  7r   forms square root of v   r    uni  f  forms a factor with a level for each record where J  factor f is non zero    91    6 2 Specifying model formulae in ASReml       Table 6 1  Summary of reserved words  operators and functions       model  term    brief description    common usage    fixed    random       uni f n     vect  v     forms a factor with a level for each record where  factor f has level n    is used in a multivariate analysis on a multivariate  set of covariates  v  to pair them with the variates    92    v    6 3 Fixed terms in the model       6 2 2 Examples       ASReml code    action       yield   mu variety  residual idv units     yield   mu variety  r idv block   residual idv units     yield   mu time variety time variety  residual idv units     livewt   mu breed sex breed sex  r idv sire     residual idv units     fits a model with a constant and fixed  variety effects    fits a model with a constant term  fixed  variety effects and random block effects    fits a saturated model with fixed time  and variety main effects and time by va   riety interaction effects    fits a model with fixed breed  sex and  breed by sex interaction effects and ran   dom sire effects          6 3 Fixed terms in the model    6 3 1 Primary fixed terms  The fixed list in the model formula       NIN Alliance Trial 1989    variety  e describes the fixed covariates  factors and interacti
392. path name of the data subfile and !SKIP n is an optional qualifier indicating that the first n lines of the subfile are to be skipped. After reading each subfile, input reverts to the primary data file.

Typically, the primary data file will just contain !INCLUDE statements identifying the subfiles to include. For example, you may have data from a series of related experiments in separate data files for individual analysis. The primary data file for the subsequent combined analysis would then just contain a set of !INCLUDE statements to specify which experiments were being combined.

If the subfiles have CSV format, they should all have it and the !CSV qualifier should be declared on the primary datafile line. This option is not available in combination with !MERGE.

5.8 Job control qualifiers

The following tables list the job control qualifiers. These change or control various aspects of the analysis. Job control qualifiers may be placed on the datafile line and following lines. They may also be defined using an environment variable called ASREML_QUAL. The environment variable is processed immediately after the datafile line is processed. All qualifier settings are reported in the .asr file. Use the Index to check for examples or further discussion of these qualifiers.

Important: Many of these are only required in very special circumstances and new users should not attempt to understand all of them. You do need
393. phanumeric variable. The qualifier !SPECIALCHAR cancels the normal meaning of the   character in an input file so that it can be included in the name of a level of an alphanumeric or pedigree variable. If class names are being predefined, the qualifier !SPECIALCHAR must appear before the class names are read in.

5.4.3 Ordering factor levels

The default order for factor levels when factors are declared with !I and !A is the order the levels are encountered in the data file. !SORT declared after !A or !I on a field definition line will cause ASReml to fit the levels in (numeric/alphabetical) order although they are defined in some other order. To control the order in which levels are defined, the level names must be prespecified using the !L s qualifier (applies only to factors declared !A). Thus for a variable SEX coded as Male and Female, declared SEX !A, the user cannot know whether it will be coded 1 Male, 2 Female or 1 Female, 2 Male without looking to see which occurs first in the data file. However, declaring it as SEX !A !L Male Female will mean Male is coded 1 and Female is coded 2. If it is declared as SEX !A !SORT, the coding order is unspecified but ASReml creates a lookup table after reading the data to arrange levels in sorted order and uses this sorted order when forming the design matrices. Consequently, with the !SORT qualifier, the order of fitted effects will be 1 Female, 2 Male in the analysis regardless of which appears first in the file.
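The three ordering rules just described can be mimicked in a few lines. The following Python sketch is an illustration only, not ASReml's implementation: the function name and arguments are hypothetical, and it assumes that a level found in the data but missing from a predefined list is simply appended after the predefined names.

```python
def fitted_level_order(values, predefined=None, sort=False):
    """Return a dict mapping each level name to its position (from 1)
    in the fitted effects.

    values     : level names in the order they occur in the data file
    predefined : optional list of names, as with a prespecified level list
    sort       : if True, fit the levels in sorted order
    """
    levels = list(predefined) if predefined else []
    for value in values:
        if value not in levels:
            # Default rule: levels are defined in order of first appearance.
            levels.append(value)
    order = sorted(levels) if sort else levels
    return {name: position for position, name in enumerate(order, start=1)}


data = ["Male", "Female", "Male"]
print(fitted_level_order(data))                                   # order of appearance
print(fitted_level_order(["Female", "Male"], ["Male", "Female"]))  # predefined names
print(fitted_level_order(data, sort=True))                         # sorted order
```

The first call depends on which level happens to occur first in the data, which is exactly why predefining the names or sorting is useful when comparing runs on different files.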
394. plicit in the variance  structure for the trait by variety effects  The variance structure can arise from a regression  of treated variety effects on control effects  namely    Uy    Buy   E    where the slope 6   oy   02    Tolerance can be defined in terms of the deviations from  regression       Varieties with large positive deviations have greatest tolerance to bloodworms   Note that this is similar to the researcher   s original intentions except that the regression has  been conducted at the genotypic rather than the phenotypic level  In Figure 15 9 the BLUPs  for treated have been plotted against the BLUPs for control for each variety and the fitted  regression line  slope   0 61  has been drawn  Varieties with large positive deviations from  the regression line include YRK3  Calrose  HR19 and WC1403        BLUP regression residual  o   e   T          wy 4  w     2  1 o 1  control BLUP    Figure 15 10  Estimated deviations from regression of treated on control for each variety  plotted against estimate for control    An alternative definition of tolerance is the simple difference between treated and control  BLUPs for each variety  namely 6   Uy      Uy   Unless 8   1 the two measures     and 6 have  very different interpretations  The key difference is that     is a measure which is independent  of inherent vigour whereas    is not  To see this consider   cov  e     ul    COV  ty      Buy   w     v  e       307    15 8 Paired Case Control study   Rice          whereas
395. ports the terms in the conditional statistics     Marginality pattern for F con calculation     Model terms       Model Term DF 1 2 3 4 5 6 7 8  1 mu 1 a NE re     2 water 1 I   C C c   3 variety WO E k Cu a C   4 sow 2 PA Db G x   5 water variety 7 I I I I   C C   6 water sow 2  1 oe Al E 2   7 variety sow 14 I I I I I I      8 water variety sow 14 I I I I I I I      F inc tests the additional variation explained when the term     is added to a model con   sisting of the I terms  F con tests the additional variation explained when the term     is  added to a model consisting of the I and C c terms  Any c terms are ignored in calculating  DenDF for F con using numerical derivatives for computational reasons  The   terms are  ignored for both F inc and F con tests    Consider now a nested model which might be represented symbolically by    y   1   REGION   REGION SITE    For this model  the incremental and conditional Wald F statistics will be the same  However     21    2 5 Inference  Fixed effects       it is not uncommon for this model to be presented to ASReml as  y   1  REGION   SITE    with SITE identified across REGION rather than within REGION  Then the nested structure  is hidden but ASReml will still detect the structure and produce a valid conditional Wald  F statistic  This situation will be flagged in the M code field by changing the letter to lower  case  Thus  in the nested model  the three M codes would be    A and B because REGION SITE  is obviously an interac
396. priate qualifier (!ND, !PSD or !NSD) is supplied. These qualifiers do not modify the matrix; they just instruct ASReml to proceed regardless. If the matrix has positive and negative eigenvalues, !ND instructs ASReml to ignore the condition and proceed anyway. If the matrix is positive semi-definite (positive and zero eigenvalues), !PSD allows ASReml to introduce Lagrangian multipliers to accommodate linear dependencies and rows with zero elements, and allows ASReml to proceed. Linear dependencies occur, for example, when the list of individuals includes clones. Rows with zero elements occur when the GRM represents a dominance matrix and the list of individuals includes fully inbred individuals which, by definition, have zero dominance variance. If the matrix has positive, zero and negative eigenvalues, !NSD may be used to allow ASReml to continue. The zero eigenvalues are handled as for !PSD. Sometimes, with negative eigenvalues, the iteration sequence may fail as some parameter values will result in a negative residual sum of squares.

If the specified .giv file does not exist but there is a .grm file of the same name, ASReml will read and invert the .grm file, and write the inverse to the .giv file if !SAVEGIV f is specified. It is written in DENSE format unless f = 1. !SAVEGIV 3 writes the GIV matrix as an .sgiv file; !SAVEGIV 4 writes the GIV matrix as a .dgiv file, where .sgi
397. publication may be reproduced by any process, electronic or otherwise, without specific written permission of the copyright owner. Neither may information be stored electronically in any form whatever without such permission.

Published by: VSN International Ltd, 5 The Waterhouse, Waterhouse Street, Hemel Hempstead, HP1 1ES, UK. Email: info@asreml.co.uk. Website: http://www.vsni.co.uk/

The correct bibliographical reference for this document is: Gilmour, A. R., Gogel, B. J., Cullis, B. R., Welham, S. J. and Thompson, R. (2014). ASReml User Guide Release 4.1 Functional Specification, VSN International Ltd, Hemel Hempstead, HP1 1ES, UK. www.vsni.co.uk

Preface

ASReml is a statistical package that fits linear mixed models using Residual Maximum Likelihood (REML). It has been under development since 1993 and arose out of collaboration between Arthur Gilmour and Brian Cullis (NSW Department of Primary Industries) and Robin Thompson and Sue Welham (Rothamsted Research) to research into the analysis of mixed models and to develop appropriate software, building on their wide expertise in relevant areas including the development of methods that are both statistically and computationally efficient, the analysis of animal and plant breeding data, the analysis of spatial and longitudinal data and the production of widely used statistical software. More recently, VSN International acquired the right to ASReml from these sponsoring organizations and
398. qualifiers       qualifier    action         FOR forlist  DO  command    IFOR  Markern  DO  MBF    IFOR  Markers  DO  MBF    The argument  n  is often given as  1 indicating that the actual path to  use is specified as the first argument on the command line  see Section  10 4   See Sections 15 7 and 15 10 for examples  The default value of n is  1      DOPATH n can be located anywhere in the job but if placed on the top  job control line  it cannot have the form  DOPATH  1 unless the arguments  are on the command line as the  DOPATH qualifier will be parsed before  any job arguments on the same line are parsed     New R4 The  FOR      DO     command is intended to simplify coding  when a series of similar lines are required in the command file which differ  in a single argument  The list of arguments is placed after  FOR and the  command is written after  DO with  S indicating where the argument is  to be inserted  list may be an assign string since they are processed before  the  FOR statement is expanded  Furthermore  if list is entirely integer  numbers  7 7 notation can be used     For example    ASSIGN Markern 35 75 125    ASSIGN Markers M35 M75 M125   mbf Geno 1  markers csv  key 1  RFIELD S  RENAME M S   me Ir  Markers   is expanded to   IMBF mbf Geno 1  markers csv  key 1  RFIELD 35  RENAME M35   IMBF mbf Geno 1  markers csv  key 1  RFIELD 75  RENAME M75   IMBF mbf Geno 1  markers csv  key 1  RFIELD 125  RENAME M125  Ir M35 M75 M125    The aim here is to generate the 
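The expansion performed by FOR ... DO can be pictured as plain text substitution, one output line per list item. The Python sketch below only illustrates that idea and is not ASReml's implementation: the function name `expand_for` is hypothetical, the placeholder spelling `$S` is an assumption (the character in front of S is not legible in this copy), and the command template is a shortened stand-in for the MBF line in the example.

```python
def expand_for(arguments, command, placeholder="$S"):
    """Return one copy of `command` per list item, with the placeholder replaced.

    `arguments` is a blank-separated list, as it would appear after FOR.
    """
    return [command.replace(placeholder, item) for item in arguments.split()]


# Marker numbers taken from the example; the template itself is abbreviated
# and hypothetical.
lines = expand_for("35 75 125", "RFIELD $S RENAME M$S")
for line in lines:
    print(line)
```

Each item is substituted wherever the placeholder occurs, which is why a single template can yield both the field number and the matching new name.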
399. qualifiers that allow specification of initial values and constraints. We have given an explicit specification for these variance component models to emphasise the form of the syntax. However, an alternative, more concise implicit specification for these models is to note that idv is a default function and the random terms can be placed after !r without explicitly specifying idv. Furthermore, residual idv(units) is the default residual specification and may be omitted from the model specification. This is precisely the form used in Release 3 for these models.

6.5 Interactions and conditional factors

6.5.1 Interactions

• interactions are formed by joining two or more terms with a '.' or a ':' (which is replaced with '.'); for example, a.b is the interaction of factors a and b,
• interaction levels are arranged with the levels of the second factor nested within the levels of the first,
• labels of factors including interactions are restricted to 47 characters of which only the first 20 are ever displayed. Thus for interaction terms it is often necessary to shorten the names of the component factors in a systematic way; for example, if Time and Treatment are defined in this order, the interaction between Time and Treatment could be specified in the model as Time.Treat; remember that the first match is taken so that if the label of each field begins with a different letter, the first letter i
400. r each trait       Trait by itself fits the mean for each variate       in an interaction Trait Fac fits the factor Fac for each variate and Trait  Cov fits the  covariate Cov for each variate  An explanatory factor or covariable associated with Trait  i can be fitted using at  Trait i  Fac or at Trait i  Cov     ASReml internally arranges the data so that n data records containing t traits each becomes n  sets of t analysis records indexed by the internal factor Trait ie  nt analysis records ordered  Trait within data record  If the data is already in this long form  use the  ASMV t qualifier  to indicate that a multivariate analysis is required     8 3 Residual variance structures    Using the notation of Section 2 1 11  consider a multivariate analysis with t traits and n units  in which the data are ordered traits within units  An algebraic expression for the residual  variance matrix in this case is   I  8     where         is an unstructured variance matrix  This is the general form of residual variance  structures required for multivariate analysis     8 3 1 Specifying multivariate variance structures in ASReml       A standard multivariate analysis is achieved   Orange Wether Trial 1984 8  using the the us   variance model func    SheepID  I   tion for the two random Trait com  TRIAL   ponents  and specifying the R structure   BloodLine  I          TEAM    for the residual error term as residual hs  id units   us Trait   GFW YLD FDIAM    f f ake wether dat  skip 1  e 
401. r effects   mu takes 1  variety then takes 2  linNitr  takes 1  nitrogen takes 2  variety linNitr takes 2 and there are four degrees of freedom  left  This information is used to make sure that the conditional Wald F statistic does not  contradict marginality principles     The next table indicates the details of the conditional Wald F statistic  The conditional  Wald F statistic is based in the reduction in Sums of Squares from dropping the particular  term  indicated by    from the model also including the terms indicated by I  C and c     The next two tables  based on incremental and conditional sums of squares report the model  term  the number of effects in the term  the  numerator  degrees of freedom  the Wald F  statistic  an adjusted Wald F statistic scaled by a constant reported in the next column and  finally the computed denominator degrees of freedom  The scaling constant is discussed by  Kenward and Roger  1997      Table showing the reduction in the numerator degrees of freedom  for each term as higher terms are absorbed    Model Term 6    a 2 J   mu 12   3 1   variety 1 1 2   LinNitr 1    4   4   1 3   9 3   nitrogen 8 2  6  4    ONWW Ww    variety LinNitr  variety  nitrogen    OoanRFRWNE    Marginality pattern for F con calculation     Model terms       Model Term DF 12 3 4 5 6  1 mu Lm g G a      2 variety 2 T      G   3 LinNitr Lt I T 3   4 nitrogen 2 ai I L     5 variety LinNitr 2 i I I I  amp     6 variety nitrogen 4 I I I I I    Model codes  b A a A bB  F
402. r mixed  models  Journal of the American Statistical Association 88  9 25     Breslow  N  E  and Lin  X   1995   Bias correction in generalised linear mixed models with  a single component of dispersion  Biometrika 82  81 91     Cox  D  R  and Hinkley  D  V   1974   Theoretical Statistics  Chapman and Hall     Cox  D  R  and Snell  E  J   1981   Applied Statistics  Principals and Examples  Chapman  and Hall     Cressie  N  A  C   1991   Statistics for spatial data  John Wiley and Sons     Cullis  B  R  and Gleeson  A  C   1991   Spatial analysis of field experiments   an extension  to two dimensions  Biometrics 47  1449 1460     Cullis  B  R   Gleeson  A  C   Lill  W  J   Fisher  J  A  and Read  B  J   1989   A new  procedure for the analysis of early generation variety trials  Applied Statistics 38  361   375     Cullis  B  R   Gogel  B  J   Verbyla  A  P  and Thompson  R   1998   Spatial analysis of  multi environment early generation trials  Biometrics 54  1 18     Dempster  A  P   Selwyn  M  R   Patel  C  M  and Roth  A  J   1984   Statistical and  computational aspects of mixed model analysis   Applied Statistics 33  203 214     Draper  N  R  and Smith  H   1998   Applied Regression Analysis  John Wiley and Sons   New York  3rd Edition     Fernando  R  and Grossman  M   1990   Genetic evaluation with autosomal and x   chromosomal inheritance  Theoretical and Applied Genetics 80  75 80     Gilmour  A  R   2007   Mixed model regression mapping for qtl detection in experime
403. r the AR x AR1 model using the dimensions of the fac   tors rather than the factor names  In this case the data records would need to be sorted in  the order rows within columns because ASReml does not reorder the data internally when  dimensions are used but instead assumes that the specified variance structure matches the    114    7 3 Applying variance structures to the residual error term       order of the data as presented in the data file     The fourth example assumes variance heterogeneity among the data observations  that is   that the three groups comprising observations 1   23  24   50  51   70 have unequal vari   ances     residual idv 23  idv 27  idv 20     The fifth and final example is the default residual variance in a multivariate analysis  Spec   ifying units as the first component is crucial as ASReml extracts the trait values by trait  within unit     residual id units   us Trait     7 3 1 Special properties and rules in defining the residual error term    There are certain properties and associated rules for this term that require special consider   ation     Rule 1 The number of effects in the residual term must be equal to the number of data units  included in the analysis     Rule 2 Where a compound model term is specified for the residuals  each combination of  levels of the single model terms comprising this term must uniquely identify one unit of  the data  For example  in the spatial analysis of a column trial comprising 4 replicates of  24 variet
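Rule 1 can be checked by hand before running a job with sized residual sections. The Python sketch below is only an illustration of that arithmetic, using the group sizes 23, 27 and 20 from the fourth example; the function names are hypothetical and nothing here is part of ASReml.

```python
def group_ranges(sizes):
    """Turn consecutive group sizes into 1-based (first, last) observation ranges."""
    ranges = []
    start = 1
    for size in sizes:
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def check_residual_sizes(sizes, n_units):
    """Raise if the residual sections do not account for every data unit."""
    total = sum(sizes)
    if total != n_units:
        raise ValueError(f"sections cover {total} units but the data has {n_units}")
    return group_ranges(sizes)


# Sizes from the heterogeneous-variance example: 23 + 27 + 20 = 70 units.
print(check_residual_sizes([23, 27, 20], 70))
# prints [(1, 23), (24, 50), (51, 70)]
```

Because the sections are matched to the data by position, the same check also makes it easy to see which records fall in which variance group.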
404. r the progeny records     Assume our data file ramdbh txt has fields tree mum dad row column plot DBH Aldiag  OP parent and we have deleted the non parent rows from the full pedigree file to form  ParentPed txt  If you have a pedigree file for all trees  processing that pedigree with the   GIV 2 qualifier will create a pedigree file just containing the parents and also the Q giv  file for the non parent referred to below  If we assume a heritability of 0 1111 so that the  ratio of genetic variance to residual variance is 0 125  the following model will estimate the  breeding values for the parents directly     RAM BLUP model   tree     mum  P   V21   dad  P   V21   row     column     plot     DBH   AIdiag   V21   NP  A  L Nonparent Parent   parent  P   filter   NP    1   create Nonparent filter  mum   filter   dad   filter   AIdiag   filter   WT   0 125   AIdiag     1   AIdiag   1   filter  ParentPed txt   ramdbh txt   DBH  WT WT   mu    Ir str parent and mum 0 5  and dad 0 5  id 1  nrmv parent  0 125     plot ariv column   ar1 row     residual idv  units     In this model     e NP  A  L Nonparent Parent ensures the NP data field is coded 1 for non parents and 2  for parents     e filter   NP    1 creates a variable that is 1 for non parents and zero for parents   e The   filter transformations put mum  dad and AIdiag information to zero for parents     e WT   0 125   AIdiag    1   AIdiag   1   V21 creates a weight variable which is 1  for parent records  g  q 7  for a non pa
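The step from a heritability of 0.1111 to a variance ratio of 0.125 is simple arithmetic. The Python sketch below shows it under the assumption that heritability is defined as genetic variance divided by the sum of genetic and residual variance, with no other components; the function name is hypothetical.

```python
def genetic_to_residual_ratio(heritability):
    """Ratio of genetic to residual variance implied by h2 = g / (g + e)."""
    if not 0.0 <= heritability < 1.0:
        raise ValueError("heritability must lie in [0, 1)")
    return heritability / (1.0 - heritability)


# 0.1111 is 1/9 to four decimals, and (1/9) / (8/9) = 1/8.
print(round(genetic_to_residual_ratio(0.1111), 3))
# prints 0.125
```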
405. r values are not at the REML solution     Parameters appear to be at the REML solution in that the  parameter values are stable     sense that the user may have intended something different     Messages beginning with the word Warning  highlight information that the user should    check  Again  it may reflect an error if the user has intended something different     Messages beginning with the word Error  indicate that something is inconsistent as far as  ASReml is concerned  It may be a coding error that the user can fix easily  or a processing  error which will generally be harder to diagnose  Often  the error reported is a symptom of    something else being wrong     255    They provide    14 5 Information  Warning and Error messages       Table 14 2  List of warning messages and likely meaning s        warning message    likely meaning       Notice  ASReml has merged  design points closer than    Warning  e missing   values generated by     transformation   Warning  ti singularities in  AI matrix    Warning  m variance  structures were modified    Warning  n missing values  were detected in the design    Warning  n negative weights    Warning  r records were read  from multiple lines    WARNING term has more levels         than expected            Warning  term in the predict   TGNORE list    Warning  term in the predict   USE list  Warning  term is ignored for    prediction    Warning  Check if you need  the  RECODE qualifier    Warning  Code B   fixed ata  boundary   GP
406. rait   nrm tag  xfai TrDam12  nrm dam  xfai TrLit1234  id lit     lf Trait  erp   residual id units   us Trait      PATH 3   wwt ywt gfw fdm fat   Trait Trait age Trait brr Trait sex Trait age sex  r  VARF   us Trait   nrm tag  xfai TrDam12  nrm dam  us TrLit1234   id lit     I   Trait grp   residual id units   us Trait      PATH 4   wwt ywt gfw fdm fat   Trait Trait age Trait brr Trait sex Trait age sex  r  VARF   xfa2 Trait   nrm tag  xfai TrDam12  nrm dam  us TrLit1234   id lit     lf Traat grp   residual id units   us Trait      PART 5   wwt ywt gfw fdm fat   Trait Trait age Trait brr Trait sex Trait age sex  r  VARF   xfa3 Trait   nrm tag  xfai TrDam12  nrm dam  us TrLit1234   id lit     It Trait  grp   residual id units   us Trait     The term function Tr   nrm tag  now replaces the function Tr   id sire  and picks up  part of function TrDam12   id dam  variation present in the half sib analysis  This analysis  uses information from both sires and dams to estimate additive genetic variance  The dam  variance component is this analysis estimates the maternal variance component  It is only  significant for the weaning and yearling weights  The litter variation remains unchanged   Notice again how the maternal effect is only fitted for the first 2 trait and the litter effect  for the first 4 traits  The critical detail is that SUBSET is used to setup TrDam12  a variable  using the first two traits  ASReml uses the relationship matrix for the dam dimension   since  dam is d
407. rameters  max2500   2 special structures  Final parameter values   3  6  0 10000E 00 1 00000 0 10000E 00    0 10000E 00  Last line read was  Residual ari Row  ari Col   Finished  23 Apr 2014 09 17 20 179 RESIDUAL structure does not match records in data    252    14 4 An example       7  Missing plots in field layout       The variables row and column define a 22x11    NIN Alliance Trial 1989    grid  that is 242 plots  but there are only 224  plots in the data  We could manually work  out which are missing  and construct extra  data lines to complete the grid  but ASReml  will do this for us if we add the qualifiers    repl      and add the model term mv to estimate miss     r repl    ing values for the missing plots  So this prob    residual ar1 row   ar1 col   predict varierty       lem is resolved by changing the model lines to    variety  A  id pid raw    row   column    nin89 asd  skip 1   ROWFAC row  COLFAC column yield   mu variety            read   nin89 asd  skip 1    ROWFAC row  COLFAC column  yield   mu variety mv  R repl    residual ari row  ari col     This output also flags the 8th error which is the misspelling of variety in the predict line   That error does not stop the job running  but does mean the predicted means for variety    will not be formed     QUALIFIERS   SKIP 1    Reading nin89 asd FREE FORMAT skipping 1 lines  12 mu 1  ari  row  in ar1 row  ar1i col  has size 22  parameters  5 5  ari  col  in ar1 row  ari col  has size 11  parameters  6 6  ari  
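When a row by column layout is incomplete, it can help to list the absent plots before deciding how to pad the grid. The Python sketch below illustrates the count for a 22 x 11 grid with 224 observed plots; the positions of the missing plots are made up for the illustration (the actual positions in the trial are not given in this excerpt), and the function name is hypothetical.

```python
def missing_cells(observed, n_rows, n_cols):
    """Return the (row, column) cells of a full grid that are not in `observed`."""
    full_grid = {(row, col)
                 for row in range(1, n_rows + 1)
                 for col in range(1, n_cols + 1)}
    return sorted(full_grid - set(observed))


# Hypothetical layout: pretend the first 18 cells of the grid were not recorded.
all_cells = [(row, col) for row in range(1, 23) for col in range(1, 12)]
observed = all_cells[18:]
missing = missing_cells(observed, 22, 11)
print(len(all_cells), len(observed), len(missing))
# prints 242 224 18
```

The 18 cells returned are the records that would have to be added, with missing responses, to complete the rectangular grid.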
408. raphics file type to HP GL 2    allows the user to temporarily fix the parameters listed  Parameter num   bers have been added to the reporting of input values to facilitate use of  this and other parameter number dependent qualifiers  The list should  be in increasing order using colon to indicate a sequence  step size is 1   For example  HOLD 1 20 30 40     79    5 8 Job control qualifiers       Table 5 5  List of rarely used job control qualifiers       qualifier    action       ILAST  lt factor   gt   lt lev   gt   Kfacz  gt   lt levg  gt   lt fac3  gt   lt lev3  gt        OUTLIER    OWN f     PRINT n     PNG  IPS     PVSFORM n    RESIDUALS  2     limits the order in which equations are solved in ASReml by forcing  equations in the sparse partition involving the first  lt lev   gt  equations of   lt factor   gt  to be solved after all other equations in the sparse partition   It is intended for use when there are multiple fixed terms in the sparse  equations so that ASReml will be consistent in which effects are identified  as singular  The test example had   Ir Anim Litter  f HYS   where genetic groups were included in the definition of Anim     Consequently  there were 5 singularities in Anim  The default reordering  allows those singularities to appear anywhere in the Anim and HYS terms   Since 29 genetic groups were defined in Anim   LAST Anim 29 forces the  genetic group equations to be absorbed last  and therefore incorporate  any singularities   In the more general
409. rder c, autocorrelation parameter $\rho_c$, respectively. More specifically, a two-dimensional separable autoregressive spatial structure (AR1 $\otimes$ AR1) is sometimes assumed for the common errors in a field trial analysis (see Gogel (1997) and Cullis et al. (1998) for examples). In this case

$$\Sigma_r = \begin{bmatrix} 1 & & & & \\ \rho_r & 1 & & & \\ \rho_r^{2} & \rho_r & 1 & & \\ \vdots & \vdots & \vdots & \ddots & \\ \rho_r^{r-1} & \rho_r^{r-2} & \rho_r^{r-3} & \cdots & 1 \end{bmatrix} \quad \text{and} \quad \Sigma_c = \begin{bmatrix} 1 & & & & \\ \rho_c & 1 & & & \\ \rho_c^{2} & \rho_c & 1 & & \\ \vdots & \vdots & \vdots & \ddots & \\ \rho_c^{c-1} & \rho_c^{c-2} & \rho_c^{c-3} & \cdots & 1 \end{bmatrix}$$

Alternatively, the residuals might relate to a multivariate analysis with $n_t$ traits and $n$ units and be ordered traits within units. In this case an appropriate variance structure might be

$$I_n \otimes \Sigma$$

where $\Sigma$ ($n_t \times n_t$) is a general or unstructured variance matrix. See Chapter 7 for details on specifying separable R structures in ASReml.

2.1.12 Direct products in G structures

Likewise, the random model terms in u may have a direct product variance structure. For example, for a field trial with s sites, g varieties and the effects ordered varieties within sites, the random model term site.variety may have the variance structure

$$\Sigma \otimes I_g$$

where $\Sigma$ is the variance matrix for sites. This would imply that the varieties are independent random effects within each site, have different variances at each site, and are correlated across sites. Important: Whenever a random term is formed as the interaction of two factors you should consider whether the IID assumption is sufficient or if a direct product structure might be more appropriate. See Chapter 7 for details on
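A separable AR1 by AR1 correlation structure is the direct (Kronecker) product of two first-order autoregressive correlation matrices. The pure-Python sketch below builds a small instance; the orders (3 and 2) and the autocorrelations (0.5 and 0.2) are made-up values, the function names are hypothetical, and the order of the two factors in the product is an assumption that must be matched to the ordering of the data (here the second factor varies fastest).

```python
def ar1_matrix(order, rho):
    """AR1 correlation matrix: element (i, j) is rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(order)] for i in range(order)]


def kronecker(a, b):
    """Direct product of two matrices given as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    return [[a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
             for j in range(len(a[0]) * cols_b)]
            for i in range(len(a) * rows_b)]


sigma_r = ar1_matrix(3, 0.5)   # hypothetical: 3 levels, rho = 0.5
sigma_c = ar1_matrix(2, 0.2)   # hypothetical: 2 levels, rho = 0.2
r_corr = kronecker(sigma_r, sigma_c)

for row in r_corr:
    print(["%.3f" % value for value in row])
```

The same `kronecker` helper covers the other two structures in this section: an identity matrix for units times a trait variance matrix, and a site variance matrix times an identity matrix for varieties.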
410. red by the VCM process  see Section 7 8 2 above  For example  using a control  file vemdes as containing    Create VCM Design for H F model   Row     Col     Off   Y   vo   vemdes asd  DESIGN   Y   Row and Row  0 5  and Col 0 5  Off    and a data file vemdes asd containing    anananrPRPRPRBPWWWNYN EB  ORPWNHRFPBPWNHRWNHENF  1  ps    then the file vcmdes des will be generated which contains the values used in fitting the  variance model for the HuynhFeldt model given in Section 7 8 2     135    7 9 Ways to present initial values to ASReml       7 9 Ways to present initial values to ASReml    In complex models  the Average Information algorithm can have difficulty maximising the  REML log likelihood when starting values are not reasonably close to the REML solution   ASReml has several internal strategies to cope with this problem  When the user needs to  provide better starting values than those generated by ASReml three of the methods are       inserting explicit initial values in the  as file  for example using   INIT        doing a preliminary run to obtain  tsv or  msv files and then modifying the parametric  information in one of those files  Section 7 9 1       fitting a simpler model and using parameter values derived from the simpler model   through the  rsv file  Section 7 9 2     7 9 1 Using templates to set parametric information associated with  variance structures using  tsv and  msv files    ASReml 3 needed initial values for most variance structure parameters an
411. rent record with q the respective diagonal element    167    8 11 Factor effects with large Random Regression models       of AIdiag  with q   2 for non inbred non parents  and y is the variance ratio o   o   0 125  in this case  This weighting corresponds to a residual variance for a non parent record of     05 4   o     e If there is no direct information on parents  the parent term is replaced by zero  where  zero is a variable with zero elements     e If dad is unknown  the and dad  term is dropped     e The BLUPs of a non parent will need to be calculated outside ASReml by adding  y  q 7    times its residual to the average of the parental BLUPs     Prediction of parental values with assumed heritability was the main motivation for the  development of the reduced animal model  Estimation of genetic variance parameters is a  little more complicated and the computational gains of removing non parent genetic values  from the estimation procedure only apply if it is reasonable to form a small number of groups  with roughly similar AIdiag values  If AIG is this group factor then one can estimate residual  variances in each group using sat  AIG   idv units  and use the variance parameter linear  model facilities to constrain the residual variances and the parent variance to be a function  of the genetic and residual variances     8 11 Factor effects with large Random Regression models    One use of the GRM matrix is to allow more computationally efficient fitting of random  re
412. respond to these singularities are zero in the  sln  file     Singularities in the sparse_fixed terms of the model may change with changes in the random  terms included in the model  If this happens it will mean that changes in the REML log   likelihood are not valid for testing the changes made to the random model  This situation is  not easily detected as the only evidence will be in the  sln file where different fixed effects  are singular  A likelihood ratio test is not valid if the fixed model has changed     6 10 4 Examples of aliassing    The sequence of models in Table 6 5 are presented to facilitate an understanding of over   parameterised models  It is assumed that var is a factor with 4 levels  trt with 3 levels and  rep with 3 levels and that all var trt combinations are present in the data     Table 6 5  Examples of aliassing in ASReml          model number of order of fitting  singularities  yield   var  r idv rep  0 rep var  yield   mu var  r idv rep  1 rep mu var  first level of var is aliassed and set to  Zero  yield   var trt  r idv rep  1 rep var trt    var fully fitted  first level of trt is  aliassed and set to zero    yield   mu var trt var trt  8 rep mu var trt var trt   Ir idv rep  first levels of both var and trt are  aliassed and set to zero  together with  subsequent interactions    107    6 11 Wald F Statistics       Table 6 5  Examples of aliassing in ASReml          model number of order of fitting  singularities  yield   mu var trt  r idv rep   
413. riables if the second argument   1  setting the number of levels is present  it may be     For example      is equivalent to  X1 X2 X3 X4 X5 y X IGDy    data dat data dat  y   mu X1 X2 X3 X4 X5 y   mu X

• !DATE specifies the field has one of the date formats dd/mm/yy, dd/mm/ccyy, dd-Mon-yy, dd-Mon-ccyy and is to be converted into a Julian day, where dd is a 1 or 2 digit day of the month, mm is a 1 or 2 digit month of the year, Mon is a three letter month name (Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec), yy is the year within the century (00 to 99) and cc is the century (19 or 20). The separators '/' and '-' must be present as indicated. The dates are converted to days starting 1 Jan 1900. When the century is not specified, yy of 0-32 is taken as 2000-2032 and 33-99 is taken as 1933-1999.
• !DMY specifies the field has one of the date formats dd/mm/yy or dd/mm/ccyy and is to be converted into a Julian day.
• !MDY specifies the field has one of the date formats mm/dd/yy or mm/dd/ccyy and is to be converted into a Julian day.
• !TIME specifies the field has the time format hh:mm:ss and is to be converted to seconds past midnight, where hh is hours (0 to 23), mm is minutes (0 to 59) and ss is seconds (0 to 59). The separator ':' must be present.
• transformations are described below.

5.4.2 Storage of alphabetic factor labels

Space is allocated dynamically for the storage of alphabetic factor labels with a default allocation being 2000
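The date and time rules above can be mimicked outside ASReml, for example to check a data file before analysis. The Python sketch below is an illustration only, under stated assumptions: '/' is taken as the separator in numeric dates, '-' in dates with a month name, ':' in times, and 1 Jan 1900 is counted as day 1 (the text says only that days start at 1 Jan 1900, so the origin is a guess). All function names are hypothetical.

```python
from datetime import date

MONTH_NAMES = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def expand_year(year_text):
    """Apply the century rule: 0-32 -> 2000-2032, 33-99 -> 1933-1999."""
    year = int(year_text)
    if len(year_text) == 4:          # century given explicitly (ccyy)
        return year
    return 2000 + year if year <= 32 else 1900 + year


def date_to_day_number(text):
    """Convert dd/mm/yy, dd/mm/ccyy, dd-Mon-yy or dd-Mon-ccyy to a day count."""
    if "/" in text:
        day_text, month_text, year_text = text.split("/")
        month = int(month_text)
    else:
        day_text, month_name, year_text = text.split("-")
        month = MONTH_NAMES.index(month_name.capitalize()) + 1
    value = date(expand_year(year_text), month, int(day_text))
    # Assumption: 1 Jan 1900 is day 1.
    return (value - date(1900, 1, 1)).days + 1


def time_to_seconds(text):
    """Convert hh:mm:ss to seconds past midnight."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


print(date_to_day_number("02-Jan-1900"), time_to_seconds("01:02:03"))
```

Differences between two converted dates do not depend on the choice of origin, so the day-1 assumption matters only when absolute values are compared with ASReml output.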
414. ricem asd  skip 1  X syc  Y sye   sqrt yc  sqrt ye    Trait  r us Trait  id variety  us Trait   id run   residual id units   us  Trait    predict variety    A portion of the output from this analysis is    8 LogL  343 220 S2  1 00000 262 df  9 LogL  343 220 S2  1 00000 262 df          Results from analysis of sqrt yc  sqrt ye              Akaike Information Criterion 704 44  assuming 9 parameters    Bayesian Information Criterion 736 56  Model_Term Sigma Sigma Sigma SE  C  id units   us Trait  264 effects  Trait USV 1 1 2 14370 2 14370 4 44 OP  Trait US_C 2 1 0 987342 0 987342 2 59 OP  Trait USV 2 2 2 34744 2 34744 4 62 OP  us  Trait   id variety  88 effects  Trait usy a 1 3 83911 3 83911 3 47 0P  Trait USC 2 1 2 33352 2 33352 2 01 OP  Trait USV 2 2 1 96136 1 96136 269 OP  us  Trait   id run  132 effects  Trait USV 1 1 1 70810 1 70810 2 61 OP  Trait US_C 2 1 0 319444 0 319444 0 59 OP  Trait USV 2 2 2 54360 2 54360 3 20 OP  Covariance Variance Correlation Matrix US Residual   2 144 0 4401   0 9873 2 347  Covariance Variance Correlation Matrix US us Trait  id variety   3 839 0 8504   2 334 1 961  Covariance Variance Correlation Matrix US us Trait  id run    1 708 0 1533   0 3194 2 544    The resultant REML log likelihood is identical to that of the heterogeneous univariate analysis   column  b  of table 15 8   The estimated variance parameters are given in Table 15 10     The predicted variety means in the   pvs file are used in the following section on interpretation    of res
415. riety plotted against estimate for Control
Trellis plot of trunk circumference for each tree
Fitted cubic smoothing spline for tree 1
Plot of fitted cubic smoothing spline for model 1
Trellis plot of trunk circumference for each tree at sample dates (adjusted for season effects), with fitted profiles across time and confidence intervals
Plot of the residuals from the nonlinear model of Pinheiro and Bates

1 Introduction

1.1 What ASReml can do

ASReml (pronounced A S Rem el) is used to fit linear mixed models to quite large data sets with complex variance models. It extends the range of variance models available for the analysis of experimental data. ASReml has application in the analysis of

• (un)balanced longitudinal data,
• repeated measures data (multivariate analysis of variance and spline type models),
• (un)balanced designed experiments,
• multi-environment trials and meta analysis,
• univariate and multivariate animal breeding and genetics data (involving a relationship matrix for correlated effects),
• regular or irregular spatial data.

The engine of ASReml underpins the REML procedure in GENSTAT. An interface for R called ASReml-R is available and runs under the same license as the ASReml program. While these interfaces will be adequate for many analyses, some large problems will need to use ASReml. The ASR
416. rm is confounded with a fixed term and when there is no information in the data on a particular component. Another common cause is when fitting an animal model and there is excessive sire dam variance (so that heritability from a sire model would exceed 1) so that the residual variance under the animal model has approached zero. In this case the data contradicts the assumptions of the animal model. The best solution is to reform the variance model so that the ambiguity is removed, or to fix one of the parameters in the variance model so that the model can be fitted. Only rarely will it be reasonable to specify the !AISINGULARITIES qualifier.

sets hardcopy graphics file type to .bmp.

suppresses some of the information written to the .asr file. The data summary and regression coefficient estimates are suppressed. This qualifier should not be used for initial runs of a job until the user has confirmed from the data summary that the data is correctly interpreted by ASReml. Use !BRIEF 2 to cause the predicted values to be written to the .asr file instead of the .pvs file. Use !BRIEF -1 to get BLUE (fixed effect) estimates reported in the .asr file. The !BRIEF qualifier may be set with the B command line option.

is used to calculate the effects reported in the .sln file without calculating any derived quantities such as predicted values or updated variance parameters. For argument values 1-3, ASReml solves for the effects directly while for
417. rmation so that the transformations  apply to the same set of variables   Y1 Y2 Y3 Y4 Y5   Repeat 5 times  incrementing just   Ymean   0   DO 5 O 1   Y1  ENDDO   5   the argument  is equivalent to  Y1 Y2 Y3 Y4 Y5   Ymean   0    Y1   Y2   Y3   Y4   Y5   5    YO Y1 Y2 Y3 Y4 Y5  TARGET Y1  do 5 1 O   YO  ENDDO Take YO from rest  Markers  G 12  do  D    ENDDO   Delete records with missing marker values    The default arguments   12  1  0   are used  The initial target is the first marker     5 5 3 Remarks concerning transformations    Note the following    variables that are created should be listed after all variables that are read in unless the  intention is to overwrite an input field     missing values are unaffected by arithmetic operations  that is  missing values in the  current or target column remain missing after the transformation has been performed  except in assignment       3 will leave missing values  NA    and    as missing         3 will change missing values to 3     multiple arithmetic operations cannot be expressed in a complex expression but must be  given as separate operations that are performed in sequence as they appear  for example   yield   120   0 0333 would calculate 0 0333    yield   120      59    5 5 Transforming the data       e Most transformations only operate on a single field and will not therefore be performed  on all variables in a  G factor set  The only transformations that apply to the whole set  are  DOM   MM and  RESCALE        ASReml c
418.
8.5 The command file
8.6 The pedigree file
8.7 Reading in the pedigree file
8.8 Genetic groups
8.9 Reading a user defined (inverse) relationship matrix
8.9.1 Genetic groups in GIV matrices
8.9.2 The example continued
8.10 The reduced animal model (RAM)
8.11 Factor effects with large Random Regression models

9 Tabulation of the data and prediction from the model
9.1 Introduction
9.2 Tabulation
9.3 Prediction
9.3.1 Underlying principles
9.3.2 Predict syntax
9.3.3 Predict failure
9.3.4 Associated factors
9.3.5 Complicated weighting with !PRESENT
9.3.6 Examples
9.3.7 New R4 Prediction using two-way interaction effects

10 Command file: Running the job
10.1 Introduction
10.2 The command line
10.2.1 Normal run
10.2.2 Processing a .pin file
419. rocessing arguments

10.4.3 Paths and Loops

ASReml was designed to analyse just one model per run. However, the analysis of a data set typically requires many runs, fitting different models to different traits. It is often convenient to have all these runs coded into a single .as file and control the details from the command line (or top job control line) using arguments. The high-level qualifiers !CYCLE and !DOPATH enable multiple analyses to be defined and run in one execution of ASReml.

Table 10.3: High level qualifiers

qualifier: !ASSIGN list

action: New R4 An !ASSIGN string qualifier has been added to extend coding options. It is a high level qualifier command which may appear anywhere in the job. Each occurrence of !ASSIGN must start on its own input line. The syntax is

!ASSIGN name string
or
!ASSIGN name !< string !>

and the defined string is substituted into the job where $name appears. string is the rest of the line and may include blanks. If !< !> encloses string, string may extend over several lines, which are concatenated. For example,
!ASSIGN TVS xfa1(Treat)
...
$TVS.geno
is interpreted as
xfa1(Treat).geno

Restrictions:

• a maximum of 50 assign strings may be defined,
• the combined length of all strings is 5000 characters,
• name may have up to 8 characters but should not begin with a number (see command line arguments),
• dollar substitution occurs before most other high level act
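The substitution rule and the restrictions listed for !ASSIGN can be mimicked outside ASReml. The sketch below is a plain Python illustration under stated assumptions: the function names are hypothetical, names are matched longest first (the guide does not say how ASReml delimits a name), and a name starting with a digit is treated as an error although the guide only says it "should not" begin with a number.

```python
MAX_STRINGS = 50         # maximum number of assign strings
MAX_TOTAL_LENGTH = 5000  # combined length of all strings
MAX_NAME_LENGTH = 8      # maximum length of a name


def check_assignments(assignments):
    """Check the restrictions listed for assign strings (hypothetical helper)."""
    if len(assignments) > MAX_STRINGS:
        raise ValueError("more than 50 assign strings defined")
    if sum(len(s) for s in assignments.values()) > MAX_TOTAL_LENGTH:
        raise ValueError("combined length of strings exceeds 5000 characters")
    for name in assignments:
        if not name or len(name) > MAX_NAME_LENGTH or name[0].isdigit():
            raise ValueError(f"invalid name: {name!r}")


def substitute(line, assignments):
    """Replace each $name in a job line by its assigned string."""
    check_assignments(assignments)
    # Assumption: try longer names first so $AB is not read as $A followed by B.
    for name in sorted(assignments, key=len, reverse=True):
        line = line.replace("$" + name, assignments[name])
    return line


print(substitute("$TVS.geno", {"TVS": "xfa1(Treat)"}))
```

With the assignment of `xfa1(Treat)` to `TVS`, the line `$TVS.geno` expands to `xfa1(Treat).geno`, which is the interpretation given in the table.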
420. row ar1 col 4 6 initialized
Forming 61 equations: 57 dense.
Initial updates will be shrunk by factor 0.400
Notice: Invalid argument, unrecognised qualifier or
        vector space exhausted at varierty
Error: R structures do not match records in data
Error: Spatial Layout is not rectangular grid
Fault: Variance structure does not match data
Last line read was: STOP
ninerr7 variety id pid raw rep nloc yield lat
Model specification: TERM LEVELS GAMMAS
variety 56
mu 1
repl 4 0.100
3
SECTIONS 242 4 1
STRUCT 22 1 1 5 1 1 10
       10 1 1 6 1 1 11
15 factors defined [max 5000].
6 variance parameters [max 2500]. 2 special structures
Final parameter values [3 6] 0.10000 1.0000 0.10000 0.10000
Last line read was: STOP
Finished: 23 Apr 2014 09:17:23.354 Variance structure does not match data

8. A misspelt factor name in the predict statement

The final error in the job is that a factor name is misspelt in the predict statement. This is a non-fatal error. The .asr file contains the messages

Notice: Invalid argument, unrecognised qualifier or
        vector space exhausted at varierty
Warning: Extra lines on the end of the input file are ignored from
        predict varierty

The faulty statement is otherwise ignored by ASReml and no .pvs file is produced. To rectify this statement, correct varierty to variety.

14.5 Information, Warning and Error messages

ASReml prints information, warning and err
421. roximate stratum variance decomposition
Stratum             Degrees-Freedom   Variance    Component Coefficients
idv(dam)                      22.50   1.271762    11.5   1.0
Residual Variance            292.44   0.165300     0.0   1.0

Model_Term                   Gamma       Sigma          Sigma/SE  % C
idv(dam)    IDV_V    27      0.586674    0.969770E-01   2.92      0 P
idv(units)         322 effects
Residual    SCA_V   322      1.000000    0.165300       12.09     0 P

Wald F statistics
   Source of Variation   NumDF   DenDF_con   F_inc     F_con     M   P_con
 7 mu                    1       32.0        9049.48   1099.20   b   <.001
 3 littersize            1       316         27.99     46.25     B   <.001
 1 dose                  2       29.9        12.15     11.51     A   <.001
 2 sex                   1       299.8       57.96     57.96     A   <.001
 8 dose.sex              2       302.1       0.40      0.40      B   0.673

Notice: The DenDF values are calculated ignoring fixed/boundary/singular
        variance parameters using algebraic derivatives.

 4 dam   27 effects fitted
SLOPES FOR LOG(ABS(RES)) on LOG(PV) for Section 1
3 possible outliers: see .res file

The iterative sequence has converged and the variance component parameter for dam hasn't changed for the last three iterations. The incremental Wald F statistics indicate that the interaction between dose and sex is not significant. The F_con column helps us to assess the significance of the other terms in the model. It confirms littersize is significant after the other terms, that dose is significant when adjusted for littersize and sex but ignoring dose.sex, and that sex is significant when adjusted for littersize and dose but ignoring dose.sex. These tests respect marginali
422. rs identifies the factors to be used for classifying the data. Only factors (not covariates) may be nominated and no more than six may be nominated.

ASReml prints the multiway table of means, omitting empty cells, to a file with extension .tab.

9.3 Prediction

9.3.1 Underlying principles

Our approach to prediction is a generalization of that of Lane and Nelder (1982), who only consider fixed effects models. They form fitted values for all combinations of the explanatory variables in the model, then take marginal means across the explanatory variables not relevant to the current prediction. Our case is more general in that we also consider the case of associated factors (see below) and options for random effects that appear in our (mixed) models. A formal description can be found in Gilmour et al. (2004) and Welham et al. (2004).

Associated factors have a particular one-to-many association such that the levels of one factor (say Region) define groups of the levels of another factor (say Location). In prediction, it is necessary to correctly associate the levels of associated factors.

Terms in the model may be fitted as fixed or random, and are formed from explanatory variables which are either factors or covariates. For this exposition, we define a fixed factor as an explanatory variable which is a factor and appears in the model in terms that are fixed (it may also appear in random terms), a random factor as an expla
423. rt CORUH and XFA to US
12.2.3 Correlation
12.2.4 A more detailed example
12.3 VPREDICT: PIN file processing

13 Description of output files
13.1 Introduction
13.2 An example
13.3 Key output files
13.3.1 The .asr file
13.3.2 The .sln file
13.3.3 The .yht file
13.4 Other ASReml output files
13.4.1 The .aov file
13.4.2 The .asl file
13.4.3 The .dpr file
13.4.4 The .msv file
13.4.5 The .pvc file
13.4.6 The .pvs file
13.4.7 The .res file
13.4.8 The .rsv file
13.4.9 The .tab file
13.4.10 The .tsv file
13.4.11 The .vrb file
13.4.12 The .vvp file
13.5 ASReml output objects and where to find them

14 Error messages
14.1 Introduction
14.2 Common problems
424. ructure when k is $\omega - 1$,

• initial values for US, CHOL and ANTE structures are given in the form of a US matrix which is specified lower triangle row-wise, viz.

$$\begin{matrix} \sigma_{11} & & \\ \sigma_{21} & \sigma_{22} & \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{matrix}$$

that is, initial values are given in the order $1 = \sigma_{11}$, $2 = \sigma_{21}$, $3 = \sigma_{22}$, ...,

• the US model is associated with several special features of ASReml. There is a process to update its values by EM (see !EMFLAG) rather than AI when its AI updates make the matrix non positive definite. Also, when used in the R structure for multivariate data, ASReml automatically recognises patterns of missing values in the responses (see Chapter 8).

7.11.4 Notes on Matérn

The Matérn class of isotropic covariance models is now described. ASReml uses an extended Matérn class which accommodates geometric anisotropy and a choice of metrics for random fields observed in two dimensions. This extension, described in detail in Haskard (2006), is given by

$$\rho(\boldsymbol{h}; \boldsymbol{\phi}) = \rho_M\big(d(\boldsymbol{h}; \delta, \alpha, \lambda); \phi, \nu\big)$$

where $\boldsymbol{h} = (h_x, h_y)^{\mathsf T}$ is the spatial separation vector, $(\delta, \alpha)$ governs geometric anisotropy, $(\lambda)$ specifies the choice of metric and $(\phi, \nu)$ are the parameters of the Matérn correlation function. The function is

$$\rho_M(d; \phi, \nu) = \left\{2^{\nu-1}\Gamma(\nu)\right\}^{-1} \left(\frac{d}{\phi}\right)^{\nu} K_\nu\!\left(\frac{d}{\phi}\right) \qquad (7.1)$$

where $\phi > 0$ is a range parameter, $\nu > 0$ is a smoothness parameter, $\Gamma(\cdot)$ is the gamma function, $K_\nu(\cdot)$ is the modified Bessel function of the third kind of order $\nu$ (Abramowitz and Stegun, 1965, section 9.6) and d is the distanc
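The row-wise lower-triangle ordering of US initial values described above can be checked with a small sketch. This is an illustration in plain Python, not ASReml code; the function name `us_matrix` and the sample values are hypothetical.

```python
def us_matrix(values):
    """Build a symmetric matrix from lower-triangle values listed row by row."""
    # Find the dimension n such that n(n+1)/2 equals the number of values.
    n = 0
    while n * (n + 1) // 2 < len(values):
        n += 1
    if n * (n + 1) // 2 != len(values):
        raise ValueError("number of values is not n(n+1)/2 for any n")
    matrix = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):          # row of the lower triangle
        for j in range(i + 1):  # columns up to and including the diagonal
            matrix[i][j] = matrix[j][i] = values[k]
            k += 1
    return matrix


# Values in the order sigma11, sigma21, sigma22, sigma31, sigma32, sigma33.
for row in us_matrix([1, 2, 3, 4, 5, 6]):
    print(row)
```

The third value lands on the second diagonal element, so a 3 x 3 structure needs six initial values with the variances in positions 1, 3 and 6.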
425. ructure for a term has changed, ASReml will take results from some structures as supplying starting values for other structures. The transitions recognised are

CORUH to FA1 and XFA1
CORGH to US
DIAG to CORUH
DIAG to FA1
DIAG to XFA1
FAi to CORGH
FAi to FAi+1
FAi to US
XFAi to XFAi+1
XFAi to US
US to XFA1, XFA2, XFA3

Users may wish to keep output from a series of runs. This can be done by using !RENAME 1 !ARG runnumber on the first line of the command file or alternatively -R1 basename runnumber on the command line. This ensures that the output from the various parts has runnumber appended to the base filename. If an .rsv file does not exist for the particular runnumber you are running, ASReml will retrieve starting values from the most recent .rsv file formed by that job. You can, of course, copy an .rsv file, building the new runnumber into its name, so that ASReml uses that particular set of values. The .asr file keeps track of which .rsv files have been formed. If the user wishes to use different models with different runs, then using !DOPART $1 and specifying the different models in different parts will achieve this aim.

7.10 Default variance structures in ASReml

There are default variance structures in ASReml that allow the linear mixed model to be specified more succinctly. IDV is the default variance structure for random model terms and for the residual error terms. For
426. running the job with !CONTINUE 3
# You may not change values in the first 3 fields
# or RP fields where RP_GN is negative.
#
# Fields are:
# GN, Term, Type, PSpace, Initial_value, RP_GN, RP_scale.
4, Variance 1, V, P, 1.00000000, 4, 1
5, ar1(row).ar1(column) ar1v(row)_1, R, P, 0.65547976, 5, 1
6, ar1(row).ar1(column) ar1(column)_1, R, P, 0.43750453, 6, 1
# Valid values for Pspace are F, P, U and maybe Z.
# RP_GN and RP_scale define simple parameter relationships
# RP_GN links related parameters by the first GN number
# RP_scale must be 1.0 for the first parameter in the set and
# otherwise specifies the size relative to the first parameter
# Multivalue RP_scale parameters may not be altered here.
#
# Notice that this file is overwritten if not being read.

13.4.5 The .pvc file

The .pvc file contains functions of the variance components produced by running a .pin file on the results of an ASReml run as described in Chapter 12. The .pin and .pvc files for a half-sib analysis of the Coopworth data are presented in Section 15.10.

13.4.6 The .pvs file

The .pvs file contains the predicted values formed when a predict statement is included in the job. Below is an edited version of nin89a.pvs. See Section 3.6 for the .pvs file for the simple RCB analysis of the NIN data considered in that chapter.

NIN Alliance Trial 1989 03 Feb 2014 06:23:03     title line
nin89a
Ecode is E
427. s. ASReml will also include the denominator degrees of freedom (DenDF, denoted by $\nu_2$; Kenward and Roger, 1997) and a probability value if these can be computed. They will be for the conditional Wald F statistic if it is reported. The !DDF i (see page 67) qualifier can be used to suppress the DenDF calculation (!DDF -1) or request a particular algorithmic method: !DDF 1 for numerical derivatives, !DDF 2 for algebraic derivatives. The value in the probability column (either P_inc or P_con) is computed from an $F_{\nu_1,\nu_2}$ reference distribution. An approximation is used for computational convenience when calculating the DenDF for conditional F statistics using numerical derivatives. The DenDF reported then relates to a maximal conditional incremental model (MCIM) which, depending on the model order, may not always coincide with the maximal conditional model (MCM) under which the conditional F statistic is calculated. The MCIM model omits terms fitted after any terms ignored for the conditional test (I after   in marginality pattern). In the example above, MCIM ignores variety.sow when calculating DenDF for the test of water and ignores water.sow when calculating DenDF for the test of variety. When DenDF is not available, it is often possible, though anti-conservative, to use the residual degrees of freedom for the denominator.

Kenward and Roger (1997) pursued the concept of construction of Wald-type test statistics through a
428. s, use the !I option instead. Otherwise, you will have to convert a factor with alphanumeric labels to numeric sequential codes external to ASReml so that an !A option can be avoided.

The data file may need to be rewritten with some factors recoded as sequential integers.

This is an internal limit. Reduce the number of response variables. Response variables may be grouped using the !G factor definition qualifier so that more than 20 actual variables can be analysed.

This message occurs when there is an error forming the inverse of a variance structure. The probable cause is a non positive definite (initial) variance structure (US, CHOL and ANTE models). It may also occur if an identity by unstructured (ID ⊗ US) error variance model is not specified in a multivariate analysis (including !ASMV), see Chapter 8. If the failure is on the first iteration, the problem is with the starting values. If on a subsequent iteration, the updates have caused the problem. You can specify !GP to force the matrix positive definite and try reducing the updates by using the !STEP qualifier. Otherwise, you could try fitting an alternative parameterisation.

Generally refers to a problem setting up the mixed model equations. Most commonly, it is caused by a non positive definite matrix.

Table 14.3: Alphabetical list of error messages and probable cause(s)/remedies

error message    probable caus
429. s #zero MinNon0 Mean MaxNon0 StndDevn
  1 variety    56   0  0   1       28.5000   56
  2 id              0  0   1.000   28.50     56.00   16.20
  3 pid             0  0   1101.   2628.     4156.   1121.
  4 raw             0  0   21.00   510.5     840.0   149.0
  5 repl        4   0  0   1       2.5000    4
  6 nloc            0  0   4.000   4.000     4.000   0.000
  7 yield  Variate  0  0   1.050   25.53     42.00   7.450
  8 lat             0  0   4.300   27.22     47.30   12.90
  9 long            0  0   1.200   14.08     26.40   7.698
 10 row        22   0  0   1       11.7321   22
 11 column     11   0  0   1       6.3304    11
 12 mu          1
Forming 61 equations: 57 dense.
Initial updates will be shrunk by factor 0.400
Notice: 1 singularities detected in design matrix.
  1 LogL=-454.807   S2= 50.329   168 df   0.1000
  2 LogL=-454.635   S2= 50.073   168 df   0.1219
  3 LogL=-454.513   S2= 49.818   168 df   0.1537
  4 LogL=-454.471   S2= 49.622   168 df   0.1899
  5 LogL=-454.469   S2= 49.584   168 df   0.1989
  6 LogL=-454.469   S2= 49.582   168 df   0.1993
Final parameter values                    0.1993

 - - - Results from analysis of yield - - -
Akaike Information Criterion     912.94 (assuming 2 parameters).
Bayesian Information Criterion   919.19

Approximate stratum variance decomposition
Stratum             Degrees-Freedom   Variance   Component Coefficients
idv(repl)                      3.00   603.100    56.0   1.0
Residual Variance            165.00   49.5824     0.0   1.0

Model_Term                   Gamma      Sigma     Sigma/SE  % C
idv(repl)   IDV_V     4      0.199323   9.88291   1.12      0 P     parameter
idv(units)          224 effects                                     estimates
Residual    SCA_V   224      1.000000   49.5824   9.08      0 P

Wald F statistics
   Source of Variation   NumDF   DenDF   F-inc    P-inc     testing
12 mu                    1       3.0     242.05   <.001     fixed
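The information criteria and the Sigma column in the output above can be reproduced by hand, which is a useful check when reading an .asr file. The sketch below is plain Python and makes two assumptions that are inferred from the printed numbers rather than stated in this section: the final log-likelihood is -454.469 (negative), and the BIC penalty uses the 168 residual degrees of freedom. The function names are hypothetical.

```python
import math


def information_criteria(logl, n_params, resid_df):
    """Return (AIC, BIC) computed from a residual log-likelihood."""
    aic = -2.0 * logl + 2.0 * n_params
    # Assumption: the BIC penalty is n_params * ln(residual degrees of freedom).
    bic = -2.0 * logl + n_params * math.log(resid_df)
    return aic, bic


def sigma_from_gamma(gamma, residual_variance):
    """Convert a variance ratio (Gamma) to a variance component (Sigma)."""
    return gamma * residual_variance


aic, bic = information_criteria(-454.469, 2, 168)
print(f"AIC = {aic:.2f}, BIC = {bic:.2f}")
print(f"sigma(repl) = {sigma_from_gamma(0.199323, 49.5824):.4f}")
```

Under these assumptions the values agree, to rounding, with the 912.94, 919.19 and 9.88291 reported in the output.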
430. s formally appears in this hyper-table, regardless of whether it is fitted as fixed or random. Note that variables evaluated at only one value (for example, a covariate at its mean value) can be formally introduced as part of the classify or averaging set.

(c) Determine which terms from the linear mixed model are to be used when predicting the cells in the multiway hyper-table in order to obtain either conditional or marginal predictions. That is, you may choose to ignore some random terms in addition to those ignored because they involve variables in the ignored set. All terms involving associated factors are by default included.

(d) Choose the weights to be used when averaging cells in the hyper-table to produce the multiway table to be reported. The multiway table may require partial and/or sequential averaging over associated factors. Operationally, ASReml does the averaging in the prediction design matrix rather than actually predicting the cells of the hyper-table and then averaging them.

The main difference in this prediction process compared to that described by Lane and Nelder (1982) is the choice of whether to include or exclude model terms when forming predictions. In linear models, since all terms are fixed, factors not in the classify set must be in the averaging set, and all terms must contribute to the predictions.

9.3.2 Predict syntax

The first step is to specify the classify set of
NIN alliance
431. s not specified the value of v is 1.

is used to join lines in plots, see !X.

Table 5.4: List of occasionally used job control qualifiers

qualifier: !MBF mbf(v,n) f !FACTOR !FIELD s !KEY k !NOKEY !RENAME t !RFIELD r !SKIP k !SPARSE

action: specified on a separate line after the datafile line, predefines the model term mbf(v,n) as a set of n covariates indexed by the data values in variable v. MBF stands for My Basis Function and uses the same mechanism as the leg(), pol() and spl() model functions but with covariates supplied by the user. It is used for reading in specialized design matrices indexed by a factor in the data, including genetic marker covariables. By default, the file f should contain 1+n fields where the first field (the key field) contains the values which are in the data variable or at which prediction is required, and the remaining n fields define the corresponding covariate values. If n is omitted, all fields after the key field are taken unless !FACTOR is specified, for which n is 1 and the covariate values are treated as coding for a multilevel factor. Set n to 1 to read just one field from the data file. Also note that the file may be a binary file (e.g. formed in a previous run using !SAVE). !RENAME t changes the name of the term from mbf(v,n) to the new name t. This is necessary when several mbf(v,n) terms are being defined which would otherwise
432. s sufficient to identify the term,

• interactions can involve model functions.

6.5.2 Expansions

• , is ignored except at the end of the line where it indicates the model is continued on the next line,

• - makes sure the following term is defined but does not include it in the model,

• * indicates factorial expansion (up to 5 way):
  a*b is expanded to a b a.b
  a*b*c*d is expanded to
  a b c d a.b a.c a.d b.c b.d c.d a.b.c a.b.d a.c.d b.c.d a.b.c.d

• / indicates nested expansion: a/b is expanded to a a.b

• a.(b c d) e is expanded to a.b a.c a.d e. This syntax is detected by the string '.(' and the closing parenthesis must occur on the same line and before any comma indicating continuation. Any number of terms may be enclosed. Each may have '-' prepended to suppress it from the model.

6.5.3 Conditional factors

A conditional factor is a factor that is present only when another factor has a particular level.

• individual components are specified using the at(f,n) function (see Table 6.2), for example, at(site,1).row will fit row as a factor only for site 1,

• a complete set of conditional terms is specified by omitting the level specification in the at(f) function, provided the correct number of levels of f is specified in the field definitions,

• otherwise, a list of levels may be specified (see Table 6.2),

• where variable f is coded with alphanumeric level names, the level name m
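The factorial and nested expansions listed above follow a simple combinatorial pattern: main effects first, then all two-way interactions, and so on. The following Python sketch generates the same term lists. It is an illustration only; the function names are hypothetical, and the five-factor limit is taken from the "up to 5 way" remark.

```python
from itertools import combinations


def factorial_expansion(factors):
    """Expand a*b*... into main effects and all interactions, lowest order first."""
    if len(factors) > 5:
        raise ValueError("factorial expansion is described for up to 5 factors")
    terms = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append(".".join(combo))
    return terms


def nested_expansion(outer, inner):
    """Expand outer/inner into outer and outer.inner."""
    return [outer, f"{outer}.{inner}"]


print(" ".join(factorial_expansion(["a", "b"])))
print(" ".join(factorial_expansion(["a", "b", "c", "d"])))
print(" ".join(nested_expansion("a", "b")))
```

A k-way expansion always produces 2^k - 1 terms, so a*b*c*d yields 15 terms, which is one reason to write out only the interactions that are actually wanted in larger models.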
433. s to plots in field plan order with replicates 1 and 3 in ITALICS and replicates 2 and 4 in BOLD.

Table 3.1: Trial layout and allocation of varieties to plots in the NIN field trial

column: 1 2 3 4 5 6 7 8 9 10 11
row
1 | NE83407 BUCKSKIN NE87612 VONA NE87512 NE87408 CODY BUCKSKIN NE87612 KS831374
2 | CENTURA NE86527 NE87613 NE87463 NE83407 NE83407 NE87612 NE83406 BUCKSKIN NE86482
3 | SCOUT66 NE86582 NE87615 NE86507 NE87403 NORKAN NE87457 NE87409 NE85556 NE85623
4 | COLT NE86606 NE87619 BUCKSKIN NE87457 REDLAND NE84557 NE87499 BRULE NE86527
5 | NE83498 NE86607 NE87627 ROUGHRIDER NE83406 KS831374 NE83T12 CENTURA NE86507 NE87451
6 | NE84557 ROUGHRIDER NE86527 COLT COLT NE86507 NE83432 ROUGHRIDER NE87409
7 | NES88482 VONA CENTURA SCOUT66 NE87522 NE86527 TAM200 NE87512 VONA GAGE
8 | NE85556 SIOUXLAND NE85623 NE86509 NORKAN VONA NE87613 ROUGHRIDER NE83404 NE83407
9 | NE85623 GAGE CODY NE86606 NE87615 TAM107 ARAPAHOE NE83498 CODY NE87615
10 | CENTURAK78 NE88T12 NE86582 NE84557 NE85556 CENTURAK78 SCOUT66 NE87463 ARAPAHOE
11 | NORKAN NE86T666 NE87408 KS831374 TAM200 NE87627 NE87403 NE86T666 NE86582 CHEYENNE
12 | KS831374 NE87403 NE87451 GAGE LANCOTA NE86T666 NE85623 NE87403 NE87499 REDLAND
13 | TAM200 NE87408 NE83432 NE87619 NE86503 NE87615 NE86509 NE87512 NORKAN NE83432
14 | NE86482 NE87409 CENTURAK78 NE87499 NE86482 NE86501 NE85556 NE87446 SCOUT66 NE87619
15 | HOMESTEAD NE87446 NE83T12 CHEYENNE BRULE NE87522 HOMESTEAD CENTURA NE8751
434. se created or read but not labelled (intermediate calculations not required for subsequent analysis).

When listing variables in the field definitions, list those read from the data file first. After them, list (and define) the labelled variables that are to be created. The number of variables read can be explicitly set using the !READ qualifier described in Table 5.5. Otherwise, if the first transformation on a field overwrites its contents (for instance using !=), ASReml recognises that the field does not need to be read in (unless a subsequent field does need to be read). For example,

A
B
C !=A !-B

reads two fields (A and B) and constructs C as A-B. All three are available for analysis. However,

A
B
C !=A !-B
D
E !=D !-B

reads four fields (A, B, C and D) because the fourth field is not obviously created and must therefore be read even though the third field (C) is overwritten. The fifth field (E) is not read but just created.

Variables that have an explicit label may be referenced by their explicit label or their internal label. Therefore, to avoid confusion, do not use explicit labels of the form Vi (where i is a number) for variables to be referred to in a transformation. Vi always refers to field/variable i in a transformation statement.

Variables that are not initialized from the data file are initialized to missing value for the first record and, otherwise, to the values from the preceding re
435. ships between animals. This is an alternate method of estimating additive genetic variance for these data. The data file has been modified by adding 10000 to the dam ID (now 10001 to 13561) so that the lamb, sire and dam IDs are distinct. They appear as the first three fields. No further genetic relationships are available for this data so the data file doubles as the pedigree file. The multi-trait additive genetic variance matrix $\Sigma_A$ of the animals (sires, dams and lambs) is given by

$$\operatorname{var}(u_A) = \Sigma_A \otimes A$$

where $A$ is the genetic relationship matrix and $u_A$ are the trait BLUPs ordered animals within traits. There are a total of 10696 (= 92 + 3561 + 7043) animals in the pedigree.

Multivariate analysis involving several strata (here animal (direct additive genetic), dam (maternal) and litter) typically involves several runs. The ASReml input file presented below has five parts which show the use of FA structures to get initial values for estimation of unstructured matrices, and their use when estimated unstructured matrices are not positive definite, as is the case with the tag matrix here, but omits earlier runs involved with linear model selection and obtaining initial values. This model is not equivalent to the sire/dam/litter model with respect to the animal/litter components for gfw, fd and fat.

!RENAME 1 !ARG 1     CHANGE 1 TO 2 3 4 OR 5 FOR OTHER PATHS
Multivariate Animal model
!DOPART $1
tag !P
sire 92 !I
da
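The ordering "animals within traits" is what makes the variance matrix a Kronecker product with the trait matrix on the left: each trait-by-trait block is a scaled copy of the relationship matrix. The sketch below illustrates this in plain Python with two traits and two animals. All numerical values are hypothetical and the function name `kronecker` is introduced only for this illustration.

```python
def kronecker(sigma, a):
    """Kronecker product of two matrices given as lists of rows."""
    rows_a, cols_a = len(a), len(a[0])
    return [
        [sigma[i][j] * a[k][m] for j in range(len(sigma[0])) for m in range(cols_a)]
        for i in range(len(sigma))
        for k in range(rows_a)
    ]


# Hypothetical 2-trait genetic variance matrix and 2-animal relationship matrix.
sigma_a = [[4.0, 1.0],
           [1.0, 2.0]]
a_matrix = [[1.0, 0.5],
            [0.5, 1.0]]

for row in kronecker(sigma_a, a_matrix):
    print(row)

# Pedigree size quoted in the text: sires + dams + lambs.
print(92 + 3561 + 7043)
```

The top-left 2 x 2 block is 4.0 times the relationship matrix (trait 1 variances) and the off-diagonal blocks are 1.0 times it (covariances between the two traits).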
436. sing values as zero in covariates is usually only acceptable if the covariate is centred (has mean of zero).

Design factors: Where the factor level is zero (or missing and the !MVINCLUDE qualifier is specified), no level is assigned to the factor for that record. This effectively defines an extra level (class) in the factor which becomes a reference level.

6.10 Some technical details about model fitting in ASReml

6.10.1 Sparse versus dense

ASReml partitions the terms in the linear model into two parts: a dense set and a sparse set. The partition is at the !r point unless explicitly set with the !DENSE data line qualifier or mv is included before !r, see Table 5.5. The special term mv is always included in sparse. Thus random and sparse terms are estimated using sparse matrix methods which result in faster processing. The inverse coefficient matrix is fully formed for the terms in the dense set. The inverse coefficient matrix is only partially formed for terms in the sparse set. Typically, the sparse set is large and sparse storage results in savings in memory and computing. A consequence is that the variance matrix for estimates is only available for equations in the dense portion.

6.10.2 Ordering of terms in ASReml

The order in which estimates for the fixed and random effects in the linear mixed model are reported will usually differ from the order in which the model terms are specified. Solutions to the mixed model equations are obtained using the methods
437. !skip 1
yield ~ mu variety !r idv(repl) !f mv
residual idv(units)

emphasise that it is always included in the sparse equations. If mv is listed in the fixed effects section, it and any following fixed effect terms are processed as sparse (see Section 6.10.1).

Formally, mv creates a factor with a covariate for each missing value. The covariates are coded 0 except in the record where the particular missing value occurs, where it is coded -1. The action when mv is omitted from the model depends on whether a univariate or multivariate analysis is being performed. For a univariate analysis, ASReml discards records which have a missing response. In multivariate analyses, all records are retained and the R matrix is modified to reflect the missing value pattern.

6.9.2 Missing values in the explanatory variables

ASReml will abort the analysis if it finds missing values in the design matrix which are not directly associated with missing values for the response or logically excluded from the model by being in combination with an at() term which evaluates to ZERO, unless !MVINCLUDE or !MVREMOVE is specified, see Section 5.8. !MVINCLUDE causes the missing value to be treated as a zero. !MVREMOVE causes ASReml to discard the whole record. Records with missing values in particular fields can be explicitly dropped using the !DV * transformation (Table 5.1).

Covariates: Treating mis
438. so the output files do not reflect the latest program output. In this case, use the Unix script screen.log command before running ASReml with the !DEBUG qualifier but without the !LOGFILE qualifier, to capture all the debugging information in the file screen.log.

The debug information pertains particularly to the first iteration and includes timing information reported in lines beginning >>>> >>>> >>>>. These lines also mark progress through the iteration.

13.4.3 The .dpr file

The .dpr file contains the data and residuals from the analysis in double precision binary form. The file is produced when the !RES qualifier (Table 4.3) is invoked. The file could be renamed with filename extension .dbl and used for input to another run of ASReml. Alternatively, it could be used by another Fortran program or package. Factors will have level codes if they were coded using !A or !I. All the data from the run plus an extra column of residuals is in the file. Records omitted from the analysis are omitted from the file.

13.4.4 The .msv file

The .msv file contains the variance parameters from the most recent iteration of a model in a form that is relatively easy to edit if the values need to be reset. The file is read when !MSV or !CONTINUE 3 is specified. This is nin89a.msv.

# This .msv file is a mechanism for resetting initial parameter values
# by changing the values here and re
439. sociate the variance structure with the appropriate component of a model term, a brief description, the algebraic form of the model and the number of associated variance structure parameters.

The models span correlation (base) models (diagonal elements equal to 1 and correlations on the off-diagonals), the extension of these to variance models (variances on the diagonals and covariances on the off-diagonals), additional models that are parameterized as variance matrices rather than as correlation matrices and some special cases where the covariance structure is known except for the scale.

See Sections 7.2 and 7.10 for important points to note in defining variance structures in ASReml.

7.11.1 Forming variance models from correlation models

The variance function models presented under correlation models in Table 7.6 (id() to matk()) are used to specify the correlation models for the corresponding variance structures. The corresponding homogeneous and heterogeneous variance models are specified by appending v and h to the variance model function names respectively, and appending the corresponding variance parameters to the list of parameters. This convention holds for most models. It does not make sense to append v or h to the variance model function names for the heterogeneous variance models from diag() to xfak().

In summary:

• to specify a correlation model, provide the varian
440. sparsely fitted fixed HYS factor. The number of Fixed effects (degrees of freedom) associated with GROUPS is taken as the declared number less twice the number of constraints applied. This assumes all groups are represented in the data, and that degrees of freedom associated with group constraints will be fitted elsewhere in the model.

Each cross is assumed to be selfed several times to stabilize as an inbred line, as is usual for cereals such as wheat, before being evaluated or crossed with another line. Since inbreeding is usually associated with strong selection, it is not obvious that a pedigree assumption of covariance of 0.5 between parent and offspring actually holds. Do not use the !INBRED qualifier with the !MGS or !SELF qualifiers.

indicates the identifiers are numeric integer with less than 16 digits. The default is integer values with less than 9 digits. The alternative is alphanumeric identifiers with up to 255 characters, indicated by !ALPHA.

forces ASReml to make the A-inverse (rather than trying to retrieve it from the ainverse.bin file).

The default method for forming $A^{-1}$ is based on the algorithm of Meuwissen and Luo (1992).

indicates that the third identity is the sire of the dam rather than the dam.

The original routine for calculating $A^{-1}$ in ASReml was based on Quaas (1976).

tells ASReml to ignore repeat occurrences of lines in the pedigree file.
Warning: Use of this option will avoid the check that animals occur in generation
441. specifying separable G structures, in ASReml.

2.1.13 Range of variance models for R and G structures

A range of models is available for the components of both R and G structures. They include correlation (C) models (that is, where the diagonals are 1) or covariance (V) models and are discussed in detail in Chapter 7. Among the range of correlation models are

• identity (that is, independent and identically distributed with variance 1)
• autoregressive (order 1 or 2)
• moving average (order 1 or 2)
• ARMA(1,1)
• uniform
• banded
• general correlation.

Among the range of covariance models are

• scaled identity (that is, independent and identically distributed with homogeneous variances)
• diagonal (that is, independent with heterogeneous variances)
• antedependence
• unstructured
• factor analytic.

There is also the facility to define models based on relationship matrices, including additive relationship matrices generated by pedigrees and using user specified variance matrices.

2.1.14 Combining variance models in R and G structures

The combination of variance models in separable G and R structures is a difficult and important concept. This is discussed in detail in Chapter 7.

2.2 Estimation

Consider the sigma parameterization of Section 2.1.1. Estimation involves two processes that are closely linked. They are performed within the 'engine' of ASReml. One process involves estimation
442. splines - Oranges

analysis. The individual curves for each tree are not convincingly modelled by a logistic function. Figure 15.16 presents a plot of the residuals from the nonlinear model fitted on p340 of Pinheiro and Bates (2000). The distinct pattern in the residuals, which is the same for all trees, is taken up in our analysis by the season term.

Figure 15.16: Plot of the residuals from the nonlinear model of Pinheiro and Bates (Residual against age)

15.10 Multivariate animal genetics data - Sheep

The analysis of incomplete or unbalanced multivariate data often presents computational difficulties. These difficulties are exacerbated by the number of random effects in the linear mixed model, the number of traits, the complexity of the variance models being fitted to the random effects or the size of the problem. In this section we illustrate two approaches to the analysis of a complex set of incomplete multivariate data.

Much of the difficulty in conducting such analyses in ASReml centres on obtaining good starting values. Derivative based algorithms such as the AI algorithm can be unreliable when fitting complex variance structures unless good starting values are available. Poor starting values may result in divergence of the algorithm or slow convergence. A particular problem with fitting unstructured variance models
443. stStat              IDV_V    4   0.642752E-01   0.328704E-02   0.98   0 P
 idv(Setstat)        IDV_V   10   0.233416       0.119369E-01   1.35   0 P
 idv Regulator Set   IDV_V   80   0.601817       0.307770E-01   3.64   0 P
 idv(units)          256 effects
 Residual            SCA_V  256   1.000000       0.511400E-01   9.72   0 P

Table 15.3: REML log-likelihood ratio for the variance components in the voltage data

 terms                 REML log-likelihood   -2 x difference   P-value
 - setstat                  200.31                5.864          .0077
 - setstat.regulator        184.15               38.19           .0000
 - teststat                 199.71                7.064          .0039

15.5 Balanced repeated measures - Height

The data for this example are taken from the GENSTAT manual. They consist of a total of 5 measurements of height (cm) taken on 14 plants. The 14 plants were either diseased or healthy and were arranged in a glasshouse in a completely random design. The heights were measured 1, 3, 5, 7 and 10 weeks after the plants were placed in the glasshouse. There were 7 plants in each treatment. The data are depicted in Figure 15.3, obtained by qualifier line !Y y1 !G tmt !JOIN in the following multivariate ASReml job.

Figure 15.3: Trellis plot of the height for each of 14 plants (plot title "This is plant data multivariate"; Y axis 21.0 to 130.5, X axis 0.5 to 5.5; panels 1 and 2)

In the following we illustrate how various repeated measures analyses can be conducted in ASReml. For these analyses it is convenient to arrange the data in a multivariate form, with 7 fields representing the p
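The P-values in Table 15.3 appear consistent with halving the upper tail of a chi-square distribution on one degree of freedom, which is the usual adjustment when a single variance component is tested on the boundary of the parameter space. The Python sketch below reproduces the tabulated values under that assumption; the function name `boundary_lrt_pvalue` is illustrative and is not part of ASReml.

```python
import math


def boundary_lrt_pvalue(d):
    """P-value for a REML likelihood ratio statistic d (-2 x log-likelihood difference).

    Assumes one variance component is tested against zero, so the reference
    distribution is taken to be a 50:50 mixture of chi-square(0) and chi-square(1).
    """
    if d <= 0:
        return 1.0
    # The upper tail of chi-square(1) is erfc(sqrt(d / 2)); halve it for the mixture.
    return 0.5 * math.erfc(math.sqrt(d / 2.0))


# Statistics taken from the "-2 x difference" column of Table 15.3.
for term, d in [("setstat", 5.864), ("setstat.regulator", 38.19), ("teststat", 7.064)]:
    print(f"{term:20s} {d:7.3f} {boundary_lrt_pvalue(d):.4f}")
```

Only the standard library is used here; `math.erfc` gives the chi-square(1) tail directly, so no statistics package is needed for this one-parameter case.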
444. stat term was 203.242, the same as the REML log-likelihood for the previous model. Table 15.3 presents a summary of the REML log-likelihood ratio for the remaining terms in the model. The summary of the ASReml output for the current model is given below. The column labelled Sigma/SE is printed by ASReml to give a guide as to the significance of the variance component for each term in the model. The statistic is simply the REML estimate of the variance component divided by the square root of the diagonal element (for each component) of the inverse of the average information matrix. The diagonal elements of the inverse of the expected (not the average) information matrix are the asymptotic variances of the REML estimates of the variance parameters. These Sigma/SE statistics cannot be used to test the null hypothesis that the variance component is zero. If we had used this crude measure then the conclusions would have been inconsistent with the conclusions obtained from the REML log-likelihood ratio test (see Table 15.3).

15.4 Source of variability in unbalanced data - Volts

Figure 15.2: Residual plot for the voltage data (Residuals against Fitted values)

 Model_Term   Gamma   Sigma   Sigma/SE   % C
 idv(Te
445. stics (see !OUTLIER).

If a job is being run a large number of times, significant gains in processing time can sometimes be made by reorganising the data (so reading of irrelevant data is avoided), using binary data files, using !CONTINUE to reduce the number of iterations, and avoiding unnecessary output (see !SLNFORM, !YHTFORM and !NOGRAPHICS).

10.5.3 Timing processes

The elapsed time for the whole job can be calculated approximately by comparing the start time with the finish time. Timings of particular processes can be obtained by using the !DEBUG !LOGFILE qualifiers on the first line of the job. This requests that the .asl file be created and hold some intermediate results, especially from data setup and the first iteration. Included in that information is timing information on each phase of the job.

11 Command file: Merging data files

11.1 Introduction

The MERGE directive, described in this chapter, is designed to combine information from two files into a third file, with a range of qualifiers to accommodate various scenarios. It was developed with assistance from Chandrapal Kailasanathan to replace the !MERGE qualifier (see page 64), which had very limited functionality.

The MERGE directive is placed BEFORE the data filename lines. It is an independent part of the ASReml job in the sense that none of the files are necessarily involved in the subsequent analyses performed by the job, and there may be multiple MERGE directives. Indeed, th
446. stimable (aliased) cell(s) may be omitted because ASReml checks that predictions are of estimable functions in the sense defined by Searle (1971, p160) and are invariant to any constraint method used.

Immediate things to check include whether every level of every fixed factor in the averaging set is present, and whether all cells in every fixed interaction are filled. For example, in the previous example, no variety predictions would be obtained if site was declared as having 4 levels but only three were present in the data. The message is also likely if any fixed model terms are !IGNOREd. The TABULATE command may be used to see which treatment combinations occur and in what order.

More formally, there are often situations in which the fixed effects design matrix X is not of full column rank. This aliasing has three main causes:

• linear dependencies among the model terms due to over-parameterisation of the model,
• no data present for some factor combinations so that the corresponding effects cannot be estimated,
• linear dependencies due to other, usually unexpected, structure in the data.

The first type of aliasing is imposed by the parameterisation chosen and can be determined from the model. The second type of aliasing can be detected when setting up the design matrix for parameter estimation (which may require revision of imposed constraints). All types are detected in ASReml during the absorption process used to obtain the predicted val
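The second cause of aliasing, empty factor combinations, can also be checked outside ASReml before a job is run. The following is a minimal Python sketch of such a check, assuming the data are held as a list of dictionaries; the function `empty_cells`, the factor names and the records are all hypothetical and only mimic the "site declared with 4 levels, three present" situation described above.

```python
from collections import Counter
from itertools import product


def empty_cells(records, factors, levels):
    """Return the factor-level combinations that never occur in the records."""
    counts = Counter(tuple(r[f] for f in factors) for r in records)
    all_cells = product(*(levels[f] for f in factors))
    return [cell for cell in all_cells if counts[cell] == 0]


# Hypothetical data: site is declared with 4 levels but level 4 never occurs.
records = [
    {"site": 1, "variety": "A"}, {"site": 1, "variety": "B"},
    {"site": 2, "variety": "A"}, {"site": 3, "variety": "B"},
]
levels = {"site": [1, 2, 3, 4], "variety": ["A", "B"]}

print(empty_cells(records, ["site", "variety"], levels))
```

Any cell listed by such a check corresponds to an effect that cannot be estimated from the data, which is one reason a prediction may be reported as not estimable.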
447. sv  Notice  LogL values are reported relative to a base of  20000  000    Note  XFA model  lower loadings initially held fixed   Notice  29764 singularities detected in design matrix   1 LogL  1558 44 S2  1 00000 18085 df i 1 components restrained  2 LogL  1541 77 S2  1 00000 18085 df 8 components restrained  3 LoghL  1538 27 S2  1 00000 18085 df 1 components restrained  4 LogL  1534 53 S2  1 00000 18085 df 1 components restrained  5 LogL  1532 53 S2  1 00000 18085 df 1 components restrained  6 LogL  1531 90 S2  1 00000 18085 df 1 components restrained  Note  XFA model fitted with rotation   7 LogL  1531 73 S2  1 00000 18085 df 1 components restrained  8 LogL  1531 66 S2  1 00000 18085 df  9 LogL  1531 64 S2  1 00000 18085 df  10 LogL  1531 64 S2  1 00000 18085 df          Results from analysis of wwt ywt gfw fdm fat        Akaike Information Criterion 43151 28  assuming 44 parameters    Bayesian Information Criterion 43494 60    Model_Term Sigma Sigma Sigma SE  C  id units   us Trait  35200 effects   Trait USV 1 1 8 73848 8 73848 30 29 OP  Trait usc 2 1 7 28418 7 28418 20 19 0 P  Trait USV 2 2 17 7519 17 7519 26 87 0 P  Trait US_C 3 1 0 247701 0 247701 5 87 OP  Trait US_C 3 2 0 705206 0 705206 14 31 OP  Trait US_V 3 3 0 109534 0 109534 14 21 0 P  Trait US_C 4 1 0 816946 0 816946 aeee 0 P  Trait US_C 4 2 2 03823 2 03823 3 68 OP  Trait US_C 4 3 0 252623 0 252625 3 82 0 P  Trait US_V 4 4 3 31364 3 31364 7 50 0 P  Trait USC  amp  1 0 871291 0 871291 6 95 0 P  Trait US_C 5 2 2 53
448. t     xfai  TrDam123   id dam   id dam   xfa1 TrDam123   id dam   xfa1 TrDam123   id dam   xfai TrDam123   id dam   xfai TrDam123   id dam   xfai TrDam123   id dam   xfai TrDam123   us  TrLit1234   id lit    44 us TrLit1234   id lit   us TrLit1234     38  39  40  41  42  43    xfai TrDami23    xfai TrDami23    xfai TrDami23    xfai TrDami23    xfai TrDami23    xfai TrDam123      14244 effects    19484 effects         OQ oa na sianasdsoon a    lt      sac    so qn as QOannstanasas sasaac    Pre  lt  lt   lt 4     lt     325    Dam  14 12 751    annnn  nwFrrRPRPRBPWWWNYNND KE    annnn  nFrRRRPWWWNHN KB    PrRPROOO S    w N    PWN    OP WNP BONE WN N EE    OPWNHRFPFRPRPWNHRPWNHRENFR KE    QNeEF WON    ywt gfw fdm fat          9 46109  7 34181  17 6050  0 272536  0 668009  0 141595  0 963017  1 9977 1  0 286984  3 64374  0 850282  2 48313    0 786089E 01    0 115894  1 63175    1 01106  16 0229  0 280259    0  0  0  0    0  0    0  0  0  0    l  O    0    0      132755E 02   976533E 03    176684E 02    208076E 03      593942    677334  1 55632  280482E 01  287861E 02  150192F 01  596227E 01   657014E 01   477561E 02    157854   407282E 01    133338  B77 122E 03   472300E 01   326718E 01     126746E 01  0 00000   661114E 02  1 46479  1 51911  11077    3 55275    oo oOo 0 0 Oo CO O00 OO Oo 0 Oo Oo    0    0    oO OGO    oOo oO Oo Oo 2 CO Oo    oo Oo Oo       0  0  0  0  0  0    0      284202    357266    649871   325222E 01   477490E 01   597447E 02    333224   548821   564929F 01 
449. <.001
 4 variety             2   10.0    1.49   0.272
 2 nitrogen            3   45.0   37.69   <.001
 8 variety.nitrogen    6   45.0    0.30   0.932

For simple variance component models such as the above, the default parameterisation for the variance component parameters is as the ratio to the residual variance. Thus ASReml prints the variance component ratio and variance component for each term in the random model in the columns labelled Gamma and Component respectively.

A table of Wald F statistics is printed below this summary. The usual decomposition has three strata, with treatment effects separating into different strata as a consequence of the balanced design and the allocation of variety to whole plots. In this balanced case, it is straightforward to derive the ANOVA estimates of the stratum variances from the REML estimates of the variance components. That is

$$
\begin{aligned}
\text{blocks} &: \; 12\hat\sigma_b^2 + 4\hat\sigma_w^2 + \hat\sigma^2 = 3175.1 \\
\text{blocks.wplots} &: \; 4\hat\sigma_w^2 + \hat\sigma^2 = 601.3 \\
\text{residual} &: \; \hat\sigma^2 = 177.1
\end{aligned}
$$

The default output for testing fixed effects used by ASReml is a table of so-called incremental Wald F statistics. These Wald F statistics are described in Section 6.11. They are simply the Wald test statistics divided by the number of estimable effects for that term. In this example there are four terms included in the summary. The overall mean (denoted by mu) is of no interest for these data. The tests are sequential, that is, the effect of each term is assessed by the change in sums of squares achieved by adding the term to the current model
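The stratum variances quoted above can be turned back into variance components by simple arithmetic. The sketch below assumes the usual split-plot decomposition for this design (4 sub-plots per whole plot and 3 whole plots per block), so that the blocks stratum is 12 times the block component plus 4 times the whole-plot component plus the residual variance, and the blocks.wplots stratum is 4 times the whole-plot component plus the residual variance. The variable names are illustrative only.

```python
# Stratum variances reported for the oats split-plot analysis.
blocks_stratum = 3175.1
wplots_stratum = 601.3
residual_stratum = 177.1

n_nitrogen = 4  # sub-plots per whole plot
n_variety = 3   # whole plots per block

# Back-solve the components from the bottom stratum upwards.
sigma2 = residual_stratum
sigma2_w = (wplots_stratum - residual_stratum) / n_nitrogen
sigma2_b = (blocks_stratum - wplots_stratum) / (n_nitrogen * n_variety)

# Ratios to the residual variance, as in the Gamma column of simple
# variance component models.
gamma_w = sigma2_w / sigma2
gamma_b = sigma2_b / sigma2

print(round(sigma2_b, 2), round(sigma2_w, 2), round(sigma2, 2))
print(round(gamma_b, 4), round(gamma_w, 4))
```

Working in this direction is only a consistency check on a balanced design; ASReml itself estimates the components by REML and the strata are derived from them.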
450. t.asr

Yy (YVAR y) overrides the value of response, the variate to be analysed (see Section 6.2), with the value y, where y is the number of the data field containing the trait to be analysed. This facilitates analysis of several traits under the same model. The value of y is appended to the basename so that output files are not overwritten when the next trait is analysed.

10.3.6 Workspace command line options (S, W)

The workspace requirements depend on problem size and may be quite large. On 32bit computers the maximum is 2000 Mbyte under Linux and 1600 Mbyte under Windows. On 64bit systems, the maximum is 32 Gbyte but may be less depending on the machine configuration. The default allocation is 32 Mbyte (4 million double precision words). An increased workspace allocation may be requested on the command line with the Wm option.

Wm (WORKSPACE m) sets the initial size of the workspace in Mbytes. For example, W1600 requests 1600 Mbytes of workspace, the maximum typically available under Windows. W2000 is the maximum available on 32bit Unix (Linux) systems. On 64bit systems, the argument, if less than 33, is taken as Gbyte.

If your system cannot provide the requested workspace, the request will be diminished until it can be satisfied. On multi-user systems, do not unnecessarily request the maximum or other users may complain.

Having started with an initial allocation, if ASReml realises more space is require
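The relationship between the workspace size and the number of double precision words, and the rule for reading the W argument on 64bit systems, can be illustrated with a few lines of Python. This is only a sketch of the arithmetic described above: the function names are hypothetical, and 1 Mbyte is taken as 10**6 bytes so that 32 Mbyte matches the quoted 4 million words.

```python
BYTES_PER_WORD = 8  # one double precision word


def workspace_words(mbytes):
    """Double precision words in a workspace, taking 1 Mbyte as 10**6 bytes."""
    return mbytes * 10**6 // BYTES_PER_WORD


def interpret_w(m, is_64bit):
    """Return (size, unit) for a W<m> option under the rule quoted in the text."""
    if is_64bit and m < 33:
        return (m, "Gbyte")
    return (m, "Mbyte")


print(workspace_words(32))        # default allocation
print(interpret_w(1600, False))   # W1600 on a 32bit Windows system
print(interpret_w(32, True))      # W32 on a 64bit system
```

The sketch does not model the fallback behaviour in which a request is diminished until the system can satisfy it.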
451. t file names.

Table 15.6: Field layout of Slate Hall Farm experiment

Column - Replicate levels
Row   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
  1   1  1  1  1  1  2  2  2  2  2  3  3  3  3  3
  2   1  1  1  1  1  2  2  2  2  2  3  3  3  3  3
  3   1  1  1  1  1  2  2  2  2  2  3  3  3  3  3
  4   1  1  1  1  1  2  2  2  2  2  3  3  3  3  3
  5   1  1  1  1  1  2  2  2  2  2  3  3  3  3  3
  6   4  4  4  4  4  5  5  5  5  5  6  6  6  6  6
  7   4  4  4  4  4  5  5  5  5  5  6  6  6  6  6
  8   4  4  4  4  4  5  5  5  5  5  6  6  6  6  6
  9   4  4  4  4  4  5  5  5  5  5  6  6  6  6  6
 10   4  4  4  4  4  5  5  5  5  5  6  6  6  6  6

Column - Rowblk levels
Row   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
  1   1  1  1  1  1 11 11 11 11 11 21 21 21 21 21
  2   2  2  2  2  2 12 12 12 12 12 22 22 22 22 22
  3   3  3  3  3  3 13 13 13 13 13 23 23 23 23 23
  4   4  4  4  4  4 14 14 14 14 14 24 24 24 24 24
  5   5  5  5  5  5 15 15 15 15 15 25 25 25 25 25
  6   6  6  6  6  6 16 16 16 16 16 26 26 26 26 26
  7   7  7  7  7  7 17 17 17 17 17 27 27 27 27 27
  8   8  8  8  8  8 18 18 18 18 18 28 28 28 28 28
  9   9  9  9  9  9 19 19 19 19 19 29 29 29 29 29
 10  10 10 10 10 10 20 20 20 20 20 30 30 30 30 30

Column - Colblk levels
Row   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
  1   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
  2   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
  3   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
  4   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
  5   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
  6  16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
  7  16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
  8  16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
  9  16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
 10  16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

EPS
452. t relate to labelled variables to the internal data array. Note that

• there may be up to 10000 variables and these are internally labelled V1, V2, ..., V10000 for transformation purposes. Values from the data file (ignoring any !SKIPed fields) are read into the leading variables,
• alpha (!A), integer (!I), pedigree (!P) and date (!DATE) fields are converted to real numbers (level codes) as they are read and before any transformations are applied,
• transformations may be applied to any variable (since every variable is numeric) but it may not be sensible to change factor level codes,
• transformations operate on a single variable (not a !G group of variables) unless it is explicitly stated otherwise,
• transformations are performed in order for each record in turn,
• variables that are created by transformation should be defined after (below) variables that are read from the data file unless it is the explicit intention to overwrite an input variable (see below),
• after completing the transformations for each record, the values in the record for variables associated with a label are held for analysis, or the record (all values) is discarded (see !D transformation and Section 6.9).

Thus variables form three classes: those read from the data file (possibly modified) and labelled are available for subsequent use in analysis, those created and labelled are also available for subsequent use in the analysis and tho
453. t specified values. If interest lies in the relationship of the response variable to the covariate, predict a suitable grid of covariate values to reveal the relationship. Otherwise, predict at an average or typical value of the covariate. The default is to predict at the mean covariate value. Omission of a covariate from the prediction model is equivalent to predicting at a zero covariate value, which is often not appropriate (unless the covariate is centred).

Before considering the syntax, it is useful to consider the conceptual steps involved in the prediction process. Given the explanatory variables (fixed factors, random factors and covariates) used to define the linear (mixed) model, the four main steps are

(a) Choose the explanatory variable(s) and their respective level(s)/value(s) for which predictions are required; the variables involved will be referred to as the classify set and together define the multiway table to be predicted. Include only one from any set of associated factors in the classify set.

(b) Note which of the remaining variables will be averaged over (the averaging set) and which will be ignored (the ignored set). The averaging set will include all variables involved in the fixed model but not in the classify set. Ignored variables may be explicitly added to the averaging set. The combination of the classify set with these averaging variables defines a multiway hyper-table. Only the base factor in a set of associated factor
454. t what combinations are present from the design matrix. It may have trouble with complicated models such as those involving and() terms.

A second !PRESENT qualifier is allowed on a predict statement (but not with !PRWTS). The two lists must not overlap.

is used in conjunction with the first !PRESENT v list to specify the weights that ASReml will use for averaging that !PRESENT table. More details are given below.

9.3 Prediction

Table 9.1: List of prediction qualifiers

qualifier     action

Controlling inclusion of model terms

!EXCEPT t     causes the prediction to include all fitted model terms not in t.

!IGNORE t     causes ASReml to set up a prediction model based on the default rules and then removes the terms in t. This might be used to omit the spline lack of fit term (!IGNORE fac(x)) from predictions as in

                yield ~ mu x variety !r spl(x) fac(x)
                predict x !IGNORE fac(x)

              which would predict points on the spline curve averaging over variety.

!ONLYUSE t    causes the prediction to include only model terms in t. It can be used for example to form a table of slopes as in

                HI ~ mu X variety X.variety
                predict variety X 1 !onlyuse X X.variety

!USE t        causes ASReml to set up a prediction model based on the default rules and then adds the terms listed in t.

Printing (!DEC [n], !PLOT [x], !PRINTALL, !SED, !TDIFF, !TURNINGPOINTS n)

!DEC [n]      gives the user control of the number of decimal places reported in the t
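455. t $y'Py = (y - X\hat{\tau})'H^{-1}(y - X\hat{\tau})$. The log-likelihood (2.11) depends on $X$ and not on the particular non-unique transformation defined by $L$.

The log residual likelihood (ignoring constants) can be written as

$$\ell_R = -\tfrac{1}{2}\left\{\log\det C + \log\det R + \log\det G + y'Py\right\} \qquad (2.12)$$

We can also write

$$P = R^{-1} - R^{-1}WC^{-1}W'R^{-1}$$

with $W = [X \;\; Z]$. Letting $\kappa = [\sigma_g' \;\; \sigma_r']'$, the REML estimates of $\kappa_i$ are found by calculating the score

$$U(\kappa_i) = \frac{\partial \ell_R}{\partial \kappa_i} = -\tfrac{1}{2}\left\{\operatorname{tr}(PH_i) - y'PH_iPy\right\} \qquad (2.13)$$

and equating to zero. Note that $H_i = \partial H/\partial \kappa_i$.

The elements of the observed information matrix are

$$-\frac{\partial^2 \ell_R}{\partial \kappa_i \partial \kappa_j} = \tfrac{1}{2}\operatorname{tr}(PH_{ij}) - \tfrac{1}{2}\operatorname{tr}(PH_iPH_j) + y'PH_iPH_jPy - \tfrac{1}{2}y'PH_{ij}Py \qquad (2.14)$$

where $H_{ij} = \partial^2 H/\partial \kappa_i \partial \kappa_j$.

The elements of the expected information matrix are

$$E\left(-\frac{\partial^2 \ell_R}{\partial \kappa_i \partial \kappa_j}\right) = \tfrac{1}{2}\operatorname{tr}(PH_iPH_j) \qquad (2.15)$$

Given an initial estimate $\kappa^{(0)}$, an update of $\kappa$, $\kappa^{(1)}$, using the Fisher scoring (FS) algorithm is

$$\kappa^{(1)} = \kappa^{(0)} + I(\kappa^{(0)}, \kappa^{(0)})^{-1}\,U(\kappa^{(0)}) \qquad (2.16)$$

where $U(\kappa^{(0)})$ is the score vector (2.13) and $I(\kappa^{(0)}, \kappa^{(0)})$ is the expected information matrix (2.15) of $\kappa$ evaluated at $\kappa^{(0)}$.

For large models or large data sets, the evaluation of the trace terms in either (2.14) or (2.15) is either not feasible or is very computer intensive. To overcome this problem ASReml uses the AI algorithm (Gilmour, Thompson and Cullis, 1995). The matrix denoted by $\mathcal{I}_A$ is obtained by averaging (2.14) and (2.15) and approximating $y'PH_{ij}Py$ by its expectation, $\operatorname{tr}(PH_{ij})$, in those cases when $H_{ij} \neq 0$. For var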
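The Fisher scoring update (2.16) can be followed by hand in the simplest possible case. The Python sketch below assumes a toy model that is not taken from the guide: observations with zero mean, no fixed effects and a single variance parameter, so that H is the variance times the identity, P is the inverse of H, and the score (2.13) and expected information (2.15) reduce to scalars. It illustrates Fisher scoring only, not the AI algorithm that ASReml uses, and all function names are hypothetical.

```python
def score(sigma2, y):
    """Score (2.13) for the toy model: -0.5 * (tr(P H_i) - y' P H_i P y)."""
    n = len(y)
    ss = sum(v * v for v in y)
    return -0.5 * (n / sigma2 - ss / sigma2**2)


def expected_information(sigma2, n):
    """Expected information (2.15) for the toy model: 0.5 * tr(P H_i P H_i)."""
    return 0.5 * n / sigma2**2


def fisher_scoring_step(sigma2, y):
    """One update of the form (2.16): current value plus information^-1 times score."""
    return sigma2 + score(sigma2, y) / expected_information(sigma2, len(y))


y = [1.0, -2.0, 3.0, 0.5]
sigma2 = 1.0  # starting value
for iteration in range(3):
    sigma2 = fisher_scoring_step(sigma2, y)
    print(iteration + 1, sigma2)
```

In this toy case the update lands on the mean of the squared observations after a single step and then stays there, because the score is zero at that value. Real mixed models need several iterations and, as the text notes, the trace terms become the expensive part.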
456. tatistics (for each term) is the so-called 'incremental' form. For this method, Wald statistics are computed from an incremental sum of squares in the spirit of the approach used in classical regression analysis (see Searle, 1971). For example, if we consider a very simple model with terms relating to the main effects of two qualitative factors A and B, given symbolically by

$$y \sim 1 + A + B$$

where the 1 represents the constant term ($\mu$), then the incremental sums of squares for this model can be written as the sequence

$$
\begin{aligned}
&R(1) \\
&R(A \mid 1) = R(1, A) - R(1) \\
&R(B \mid 1, A) = R(1, A, B) - R(1, A)
\end{aligned}
$$

where the $R(\cdot)$ operator denotes the residual sums of squares due to a model containing its argument and $R(\cdot \mid \cdot)$ denotes the difference between the residual sums of squares for any pair of (nested) models. Thus $R(B \mid 1, A)$ represents the difference between the reduction in sums of squares between the so-called maximal model

$$y \sim 1 + A + B$$

and

$$y \sim 1 + A$$

Implicit in these calculations is that

• we only compute Wald statistics for estimable functions (Searle, 1971, page 408),
• all variance parameters are held fixed at the current REML estimates from the maximal model.

In this example, it is clear that the incremental Wald statistics may not produce the desired test for the main effect of A, as in many cases we would like to produce a Wald statistic for A based on

$$R(A \mid 1, B) = R(1, A, B) - R(1, B)$$

The issue is further complicated w
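The incremental sequence amounts to differencing the R values of successively larger nested models. A minimal Python sketch of that bookkeeping follows; the function `incremental_ss` and the numerical R values are hypothetical and serve only to show how each term is charged with the change obtained by adding it to the terms already fitted.

```python
def incremental_ss(order, r_values):
    """Difference the R values of a nested sequence of models.

    order    -- terms in the order they are added, e.g. ["1", "A", "B"]
    r_values -- maps each tuple of fitted terms to its R value
    """
    increments = []
    fitted = ()
    previous = 0.0
    for term in order:
        fitted = fitted + (term,)
        current = r_values[fitted]
        increments.append((term, current - previous))
        previous = current
    return increments


# Hypothetical R values for the nested sequence 1; 1 + A; 1 + A + B.
r_values = {("1",): 100.0, ("1", "A"): 130.0, ("1", "A", "B"): 145.0}

for term, increment in incremental_ss(["1", "A", "B"], r_values):
    print(term, increment)
```

Changing the order of the terms changes the increments, which is the reason the incremental test for A need not be the test of A adjusted for B.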
457. ted in estimating the correlations between distinct traits (for example, fleece weight and fibre diameter in sheep) and for repeated measures of a single trait.

8.1.1 Repeated measures on rats

Wolfinger (1996) summarises a range of variance structures that can be fitted to repeated measures data and demonstrates the models using five weights taken weekly on 27 rats subjected to 3 treatments. This command file demonstrates a multivariate analysis of the five repeated measures.

 Wolfinger rat data
  treat !A
  wt0 wt1 wt2 wt3 wt4
 T t 4at l
 e ni Ea
 a us(Trait !GP)

Note that the two dimensional structure for residual errors meets the requirement of independent units and corresponds to the data being ordered traits within units.

8.1.2 Wether trial data

Three key traits for the Australian wool industry are the weight of wool grown per year, the cleanness and the diameter of that wool. Much of the wool is produced from wethers and most major producers have traditionally used a particular strain or bloodline. To assess the importance of bloodline differences, many wether trials were conducted. One trial, conducted from 1984 to 1988 at Borenore near Orange, in

 Orange Wether Trial 1984-8
  SheepID !I
  TRIAL
  BloodLine !I
 post oe
 o ner dat batty 4
 GFW FDIAM ~ Trait Trait.YEAR,
  !r us(Trait).id(TEAM) us(Trait).id(SheepID)
 residual id(units).us(Trait !GP)
 predict YEAR
458. ter space but do not always work well when there are several matrices on the boundary. The options are

 !EMFLAG [1]   Standard EM plus 10 local EM steps
 !EMFLAG 2     Standard EM plus 10 local PXEM steps
 !PXEM [2]     Standard EM plus 10 local PXEM steps
 !EMFLAG 3     Standard EM plus 10 local EM steps
 !EMFLAG 4     Standard EM plus 10 local EM steps
 !EMFLAG 5     Standard EM only
 !EMFLAG 6     Single local PXEM
 !EMFLAG 7     Standard EM plus 1 local EM step
 !EMFLAG 8     Standard EM plus 10 local EM steps

Options 3 and 4 cause all US structures to be updated by (PX)EM if any particular one requires EM updates.

The test of whether the AI updated matrix is positive definite is based on absorbing the matrix to check all pivots are positive. Repeated EM updates may bring the matrix closer to being singular. This is assessed by dividing the pivot of the first element by the first diagonal element of the matrix. If it is less than $10^{-7}$ (this value is consistent with the multiple partial correlation of the first variable with the rest being greater than 0.9999999), ASReml fixes the matrix at that point and estimates any other parameters conditional on these values. To proceed with further iterations without fixing the matrix values would ultimately make the matrix such that it would be judged singular, resulting in the analysis being aborted.

5.8 Job control qualifiers

Table 5.5: List of rarely used job control qualifiers

qualifier     action

!EQORDER o
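The near-singularity check described above can be mimicked on a small matrix. The Python sketch below is an illustration under stated assumptions and is not ASReml's implementation: it absorbs (eliminates) every row and column except the first, checks that each pivot is positive, and returns the remaining pivot of the first element divided by the first diagonal element. The threshold is taken to be 1e-7, which is what a correlation above 0.9999999 would imply; the function name is hypothetical.

```python
def first_pivot_ratio(a):
    """Absorb rows/columns n-1, ..., 1 of a symmetric matrix.

    Returns (all_pivots_positive, pivot_of_first_element / a[0][0]).
    """
    n = len(a)
    m = [row[:] for row in a]  # work on a copy
    first_diag = m[0][0]
    for k in range(n - 1, 0, -1):
        pivot = m[k][k]
        if pivot <= 0.0:
            return False, 0.0
        # Sweep row/column k out of the leading k x k block.
        for i in range(k):
            factor = m[i][k] / pivot
            for j in range(k):
                m[i][j] -= factor * m[k][j]
    return m[0][0] > 0.0, m[0][0] / first_diag


THRESHOLD = 1e-7  # assumed reading of the threshold in the text

ok, ratio = first_pivot_ratio([[1.0, 0.5], [0.5, 1.0]])
print(ok, ratio, ratio < THRESHOLD)

ok, ratio = first_pivot_ratio([[1.0, 1.0], [1.0, 1.0 + 1e-9]])
print(ok, ratio < THRESHOLD)
```

For a correlation matrix the ratio equals one minus the squared multiple correlation of the first variable with the rest, so a tiny ratio signals that the first variable is almost a linear combination of the others.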
459. tes Hybridi aif which contains the identifier names   IPART 2  reads in inverse additive relationship matrix generated in  PART 1  Mline  A  L Hybridi aif  SKIP 1 associates identifier names with levels of Mline   used in giv file   Pline IP    164    8 9 Reading a user defined  inverse  relationship matrix       Fline ped  GIV  DIAG  Hybridi_A giv  formed in part 1 from Mline ped  Hybrid asd  SKIP 1    grm1i Mline  nrm Fline   using new synonyms and functions    8 9 1 Genetic groups in GIV matrices    If a user creates a GIV file outside ASReml which has fixed degrees of freedom associated with  it  a   GROUPSDF n qualifier is provided to specify the number of fixed degrees of freedom  n   incorporated into the GIV matrix  The  GROUPSDF qualifier is written into the first line of  the  giv matrix produced by the  GIV qualifier of the pedigree line if the pedigree includes  genetic groups  and will be honoured from there  when reusing a GIV matrix formed from a  pedigree with genetic groups in ASReml     When groups are constrained  then it will be the number of groups less number of constraints   For example  if the pedigree file qualified by  GROUPS 7 begins  AOO    Q w     gt     B O   ABC is not present in the subsequent pedigree lines    mo  oO Om  lt 2 2     Cr  lt  gt   gt   ti    DE 0 O   DE is not present in the subsequent pedigree lines   there are actually only 5 genetic groups and two constraints so that the fixed effects for A   B and C sum to zero  and for D and
460. the !RENAME !ARG argument from the most recent run so that ASReml can retrieve restart values from the most recent run when !CONTINUE is specified but there is no particular .rsv file for the current !ARG argument.

.asp   contains transformed data, see !PRINT in Table 5.2.

.ass   contains the data summary created by the !SUM qualifier (see page 68).

.dbr .dpr .spr   contains the data and residuals in a binary form for further analysis (see !RESIDUALS, Table 5.5).

.veo   holds the equation order to speed up re-running big jobs when the model is unchanged. This binary file is of no use to the user.

.vll   holds factor level names when data/residuals are saved in binary form. See !SAVE on page 81.

.vrb   contains the estimates of the fixed effects and their variance if the !VRB qualifier is specified.

.vvp   contains the approximate variances of the variance parameters. It is designed to be read back for calculating functions of the variance parameters (see VPREDICT in Chapter 12).

.was   basename.was is open while ASReml is running and deleted when it finishes. It will normally be invisible to the user unless the job crashes. It is used by ASReml-W to tell when the job finishes.

.xml   contains key information from the .asr, .pvs and .res files in a form easier for computers to parse.

An ASReml run generates many files and the .sln and .yht files, in particular, are often quite large and could fill up your disk space. You should therefore regularly tidy your wor
461. the R structure definition lines,

• there is an error in the G structure definition lines,
  - there is a factor name error,
  - there is a missing parameter,
  - there are too many/few initial values.

The most common problem in running ASReml is that a variable label is misspelt.

14.2 Common problems

The primary file to examine for diagnostic messages is the .asr file. When ASReml finds something atypical or inconsistent, it prints a diagnostic message. If it fails to successfully parse the input, it dumps the current information to the .asr file. Below is the output for a job that has been terminated due to a coding error. If a job has an error you should

• read the whole .asr file looking at all messages to see whether they identify the problem,
• focus particularly on any error message in the Fault: line and the text of the Last line read: (this line appears twice in the file to make it easier to find),
• check that all variables have been defined and are referenced with the correct case,
• some errors arise from conflicting information; the error may point to something that appears valid but is inconsistent with something earlier in the file,
• reduce to a simpler model and gradually build up to the desired analysis; this should help to identify the exact location of the problem.

If the problem is not resolved after these checks, you may need to email Customer Support at support@asreml.co.uk. Please send the .as file, a
462. the actual parameter fitted,

  - the Sigma column reports the gamma converted to a variance scale if appropriate,
  - Sigma/SE is the ratio of the component relative to the square root of the diagonal element of the inverse of the average information matrix. Warning: Sigma/SE should not be used for formal testing,
  - the % shows the percentage change in the parameter at the last iteration,
  - use VPREDICT (see Chapter 12) to calculate meaningful functions of the variance components.

• a table of Wald F statistics for testing fixed effects (Section 6.11). The table contains the numerator degrees of freedom for the terms and 'incremental' F statistics for approximate testing of effects. It may also contain denominator degrees of freedom, a 'conditional' Wald F statistic and a significance probability.

• estimated effects, their standard errors and t values for equations in the DENSE portion of the SSP matrix are reported if !BRIEF -1 is invoked; the T-prev column tests difference between successive coefficients in the same factor.

The reported log-likelihood value may be positive or negative and typically excludes some constants from its calculation. It is sometimes reported relative to an offset (when its magnitude exceeds 10000); any offset is reported in the .asr file. Twice the difference in the likelihoods for two models is commonly used as the basis for a likelihood ratio test (see page 16). This
463. the data file with alphabetic level names but ASReml is expecting integer level codes. Changing the variety line to read variety !A resolves this problem.

  row
  column
  residual ar1(Row).ar1(Col)
  predict varierty

  Folder: C:\Users\Public\ASReml\Docs\Manex4\ERR
  QUALIFIERS: !SKIP 1
  Reading nin89.asd FREE FORMAT skipping 1 lines
  Univariate analysis of yield
  Notice: Maybe you want !A !L qualifiers for this factor. LANCER
  Error at field 1 [LANCER] of record 1 [line 1]
  Since this is the first data record, you may need to skip some header lines (see !SKIP) or append the !A qualifier to the definition of factor variety
  Fault: Missing/faulty !SKIP or !A needed for variety
  Last line read was: LANCER 1 1101 585 1 4 29.25 4.3 19.2 16 1
  Currently defined structures, COLS and LEVELS
  1 variety 1 2 2 0 0 0
  11 column 1 2 2 0 10 0
  12 mu 0 1 8 0 1 0
  ninerr3 variety id pid raw rep nloc yield lat
  Model specification: TERM LEVELS GAMMAS
  mu 0
  variety 0
  12 factors defined [max 5000].
  0 variance parameters [max 2500]. 2 special structures
  Last line read was: LANCER 1 1101 585 1 4 29.25 4.3 19.2 16 1
  Finished: 23 Apr 2014 09:17:05.540 Missing/faulty !SKIP or !A needed for variety

14.4 An example

4. A missing comma

  NIN Alliance Trial 1989
  variety !A

After correcting the definition of variety, we get the following (abbreviated) output. We have at least now read the data file as indicate
464. the factor it is applied to and the order of the rows must match the order of the levels

nrm(); scaled variance; specified; applies a generated relationship matrix derived from the pedigree file associated with the function's argument
us(); variance; $V_{ij} = \phi_{ij}$; general unstructured, symmetric positive definite covariance matrix
xfak(); variance; $V = \Lambda\Lambda' + \Psi$; factor analytic model of order k with $\Lambda$ of size n x k

structure name (column 4) and a corresponding variance model function name (column 5), giving the associated component variance structures (column 6). The consolidated model term is the term presented in the final column of the table. In contrast, in ASReml 3 the linear model terms are defined on the model line and subsequently a G structure line is given for each linear model term, which specifies the component terms and their associated structures. The simplest form of a consolidated model term is a single model term with a variance model function applied, e.g. idv(repl) in Table 7.2, and the next simplest is a compound model term with a variance model function applied, e.g. idv(A.B) in Table 7.2.

In summary, the following are rules in forming consolidated model terms and applying variance model functions to random model terms:

• variance model functions can be applied to single model terms (see example 1 in Table 7.2), the components in compound model terms (examples 4 to 6) and single model terms with a constructed linear model function (example 2),
465. the field in this case. If the covariate values are irregular, you would leave the field as a covariate and use the fac() function to derive a factor version.

forms the natural log of v + r. This may also be used to transform the response variable.

creates a first-differenced (by rows) design matrix which, when defining a random effect, is equivalent to fitting a moving average variance structure in one dimension. In the ma1 form, the first-difference operator is coded across all data points (assuming they are in time/space order). Otherwise the coding is based on the codes in the field indicated.

is a term that is predefined by using the !MBF qualifier (see page 71).

6.6 Alphabetic list of model functions

Table 6.2: Alphabetic list of model functions and descriptions

model function: action

Functions on this page: mu; mv; out(n); out(n,t); pol(v,n), p(v,n); pow(x,p[,o])

mu: is used to fit the intercept/constant term. It is normally present and listed first in the model. It should be present in the model if there are no other fixed factors or if all fixed terms are covariates or contrasts, except in the special case of regression through the origin.

mv: is used to estimate missing values in the response variable. Formally this creates a model term with a column for each missing value. Each column contains zeros except for a solitary -1 in the record containing the corresponding missing value. This is used in spatial analyses so that computing a
466. the graphics to files whose names are built up as

  <basename>[<args>]<type>[<pass>][<section>].<ext>

where square parentheses indicate elements that might be omitted. <basename> is the name portion of the .as file. <args> is any argument strings built into the output names by use of the !RENAME qualifier. <type> indicates the contents of the figure (as given in the following table). <pass> is inserted when the job is repeated (!RENAME or !CYCLE) to ensure filenames are unique across repeats. <section> is inserted to distinguish files produced from different sections of data (for example, from multisite spatial analysis), and <ext> indicates the file graphics format.

<type>: file contents
R: marginal means of residuals from spatial analysis of a section
V: variogram of residuals from spatial analysis for a section
S: residuals in field plan for a section
H: histogram of residuals for a section
_RvE: residuals plotted against expected values
XYGi: figure produced by !X, !Y and !G qualifiers
PV_i: predicted values plotted for PREDICT directive i

10.3 Command line options

The graphics file format is specified by following the G or H option by a number g, or by specifying the appropriate qualifier on the top job control line, as follows:

g; qualifier; description; <ext>
1; !HPGL; HP-GL; pgl
2; !VPS; Postscript (default); ps
6; !BMP; BMP; bmp
10; !WPM; Wind
467. the size of factor labels stored.

• extra data fields on a line are ignored.
• if there are fewer data items on a line than ASReml expects, the remainder are taken from the following line(s), except in .csv files where they are taken as missing. If you end up with half the number of records you expected, this is probably the reason.
• all lines beginning with ! followed by a blank are copied to the .asr file as comments for the output; their contents are ignored.

4.2.2 Fixed format data files

The format must be supplied with the !FORMAT qualifier, which is described in Table 5.5. However, if all fields are present and are separated, the file can be read free format.

4.2.3 Preparing data files in Excel

Many users find it convenient to prepare their data in EXCEL, ACCESS or some other database. Such data must be exported from these programs into either .csv (comma-separated values) or .txt (TAB-separated values) form for ASReml to read it. ASReml can convert an .xls file to a .csv file. When ASReml is invoked with an .xls file as the filename argument and there is no .csv file or .as file with the same basename, it exports the first sheet as a .csv file and then generates a template .as command file from any column headings it finds (see page 194). It will also convert a Genstat .gsh spreadsheet file to .csv format. The data extracted from the .xls file are labels, numerical values and the results from formulae. Empty rows at the start and end of a block ar
468. the symbol ;. Exceptions to this rule are single components F(), id(v).F() and nrm(v).F() terms, which are reduced to the corresponding single term F(), id(v).F() and nrm(v).F(). So, for example, with the random model and residual specification (model terms)

  !r idv(A) ar1v(B) nrm(C).us(Trait) D
  residual id(units).us(Trait)

the covariance functions with parameters (idv(A), ar1v(B), us(Trait) in nrm(C).us(Trait), and us(Trait) in id(units).us(Trait)) are named

  idv(A), ar1v(B);ar1v(B), nrm(C).us(Trait);us(Trait), id(units).us(Trait);us(Trait)

If the resulting name is not ambiguous, the name can be contracted by reducing the consolidated model term to a unique substring or by leaving out the consolidated model term completely. For example, in the example the covariance functions can be represented by idv(A), ar1v(B), C;us(Trait) and units;us(Trait), respectively. Individual parameters within a covariance component can be specified by number (or sequence of numbers n:m) by appending these in square braces, for example C;us(Trait)[3] or units;us(Trait)[4:6]. If the residual directive is not used, the default R structure parameters are effectively named Residual. The orphan term D with no explicit variance function is treated as an idv(D) structure with name D. If the user is in doubt of the name or number of a parameter, then running the program with VPREDICT !DEFINE and a blank line will construct a .pvc file with the names and numbers of parameters iden
469. ther than the working folder. This qualifier must be placed on the top command line as it needs to be processed before any output files are opened. Most files produced by ASReml have a filename structure

  <basename><subname>.<extension>

where <subname> is a command line argument value. If !OUTFOLDER is specified without path, the output filename pattern becomes

  <basename><subname>/<basename>.<extension>

If path is specified, the output filename pattern becomes

  <path>/<basename><subname>.<extension>

There are a few files written by ASReml that do not follow this naming pattern, for example ainverse.bin and asrdata.bin. These remain unchanged (that is, they are not written to the output folder).

!XML requests that the primary tables reported in the .asr file and key output from the .pvs and .sln files are written to a .xml file in xml format. The output is presented in the order of computation. The first block written is a .asr block and includes start and finish times, the data summary, the iteration sequence summary and information criteria; then, from the .pvs file, the tables and associated information; then the summary of estimated variance structure parameters from the .asr file; then information from the .sln file; and finally the Wald F statistics and completion information from the .asr file. The process is repeated for each cycle of analysis. The intended use of this
470. tical Society 82: 605-610.

Smith, A. B., Cullis, B. R., Gilmour, A. and Thompson, R. (1998). Multiplicative models for interaction in spatial mixed model analyses of multi-environment trial data, Proceedings of the International Biometrics Conference.

Smith, A., Cullis, B. R. and Thompson, R. (2001). Analysing variety by environment data using multiplicative mixed models and adjustments for spatial field trend, Biometrics 57: 1138-1147.

BIBLIOGRAPHY

Smith, A., Cullis, B. R. and Thompson, R. (2005). The analysis of crop cultivar breeding and evaluation trials: an overview of current mixed model approaches (review), Journal of Agricultural Science 143: 449-462.

Stein, M. L. (1999). Interpolation of Spatial Data: Some Theory for Kriging, Springer-Verlag, New York.

Stevens, M. M., Fox, K. M., Warren, G. N., Cullis, B. R., Coombes, N. E. and Lewin, L. G. (1999). An image analysis technique for assessing resistance in rice cultivars to root-feeding chironomid midge larvae (Diptera: Chironomidae), Field Crops Research 66: 25-26.

Stroup, W. W., Baenziger, P. S. and Mulitze, D. K. (1994). Removing spatial variation from wheat yield trials: a comparison of methods, Crop Sci. 86: 62-66.

Thompson, R. (1980). Maximum likelihood estimation of variance components, Math. Operationsforsch. Statistics, Series Statistics 11: 545-561.

Thompson, R., Cullis, B., Smith, A. and Gilmour, A. (2003). A sparse implementation of the average informat
471. tified.

The original implementation was based entirely on the numbers, but it will generally be better to use the names, since the order in which model terms are reported cannot always be predicted.

12.2 Syntax

Critical change: For generalised linear models in ASReml Release 4, the .pvc file reports and numbers, for completeness, a residual or dispersion parameter both when the parameter is estimated and when it is fixed. By contrast, ASReml 3 does not report or number the parameter if it is fixed by default at 1. Hence the parameters might be numbered differently in ASReml 4 and ASReml 3.

12.2.1 Functions of components

  y ~ mu !r idv(Sire)
  residual idv(units)
  VPREDICT !DEFINE
  F phenvar idv(Sire) + idv(units)
  F genvar idv(Sire) * 4
  H herit genvar phenvar

First ASReml extracts the variance components from the .asr file and their variance matrix from the .vvp file. The F, S, V and X functions create new components which are appended to the list. For example, the F function appends component $k + c'v$ and forms $\mathrm{cov}(c'v, v)$ and $\mathrm{var}(c'v)$, where v is the vector of existing variance components, c is the vector of coefficients for the linear function and k is an optional offset, which is usually omitted but would be 1 to represent the residual variance in a probit analysis and 3.289 to represent the residual variance in a logit analysis. The general form of the directive is

  F label a + b*c_b + c + d + m*k

where a, b, c and
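To make the arithmetic behind the F and H functions above concrete, here is a minimal Python sketch (illustration only, not ASReml syntax). It forms the phenotypic variance as the sum of the sire and residual components, the genetic variance as four times the sire component, and their ratio as the heritability, with a first-order (delta method) standard error. The component values, their variance matrix and the function name are hypothetical, and the delta method is a generic approximation that is only assumed to be comparable to what the program reports.

```python
import math


def heritability(sire_var, resid_var, cov):
    """Return (phenvar, genvar, h2, se_h2).

    cov is the 2x2 variance matrix of (sire_var, resid_var); hypothetical helper.
    """
    phenvar = sire_var + resid_var   # linear function with c = (1, 1)
    genvar = 4.0 * sire_var          # linear function with c = (4, 0)
    h2 = genvar / phenvar
    # Gradient of h2 = 4s / (s + e) with respect to s and e
    d_sire = 4.0 * resid_var / phenvar ** 2
    d_resid = -4.0 * sire_var / phenvar ** 2
    var_h2 = (d_sire ** 2 * cov[0][0]
              + 2.0 * d_sire * d_resid * cov[0][1]
              + d_resid ** 2 * cov[1][1])
    return phenvar, genvar, h2, math.sqrt(var_h2)


# Hypothetical variance components and their sampling variance matrix
phenvar, genvar, h2, se_h2 = heritability(
    sire_var=0.5, resid_var=3.5, cov=[[0.04, -0.01], [-0.01, 0.09]]
)
print(f"phenvar={phenvar:.3f} genvar={genvar:.3f} h2={h2:.3f} se={se_h2:.4f}")
```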
472. tion dependent on REGION. In the second model, REGION and SITE appear to be independent factors, so the initial M codes are A and A. However, they are not independent because REGION removes additional degrees of freedom from SITE, so the M codes are changed from A and A to a and A.

When using the conditional Wald F statistic, it is important to know what the 'maximal conditional' model (MCM) is for that particular statistic. It is given explicitly in the .aov file. The purpose of the conditional Wald F statistic is to facilitate inference for fixed effects. It is not meant to be prescriptive of the appropriate test, nor is the algorithm for determining the MCM foolproof.

The Wald statistics are collectively presented in a summary table in the .asr file. The basic table includes the numerator degrees of freedom ($\nu_1$) and the incremental Wald F statistic for each term. To this is added the conditional Wald F statistic and the M code if !FCON is specified. A conditional Wald F statistic is not reported for mu in the .asr file but is in the .aov file (adjusted for covariates).

The !FOWN qualifier (page 78) allows the user to replace any/all of the conditional Wald F statistics with tests of the same terms but adjusted for other model terms as specified by the user; the !FOWN test is not performed if it implies a change in degrees of freedom from that obtained by the incremental model.

2.5.3 Kenward and Roger adjustments

In moderately sized analyse
473. tistic
• Wald F statistic scaled by λ
• λ as defined in Kenward & Roger
• denominator degrees of freedom

  Source    Size  NumDF  F-value   Lambda*F  Lambda  DenDF
  mu           1      1  331.9252  331.9252  1.0000   25.0143
  variety     56     55    2.2257    2.2245  0.9995  110.8370

13.4 Other ASReml output files

  split plot analysis - oat
  blocks *
  nitrogen !A
  subplots *
  variety !A
  wplots *
  yield
  oats.asd !skip 2
  !CONTRAST linNitr nitrogen .6 .4 .2 0
  !FCON
  yield ~ mu variety linNitr nitrogen,
    variety.linNitr variety.nitrogen,
    !r idv(blocks) idv(blocks.wplots)
  residual idv(units)

A more useful example is obtained by adding a linear nitrogen contrast to the oats example (Section 15.2). The basic design is six replicates of three whole plots, to which variety was randomised, and four subplots, which received 4 rates of nitrogen. A !CONTRAST qualifier defines the model term linNitr as the linear covariate representing nitrogen applied. Fitting this before the model term nitrogen means that this latter term represents lack of fit from a linear response.

The !FCON qualifier requests conditional Wald F statistics. As this is a small example, denominator degrees of freedom are reported by default. An extract from the .asr file is followed by the contents of the .aov file.

  Results from analysis of yield
  Akaike Information Criterion 415.10 (assuming 3 parameters).
  Bayesian Information Criterion 421.38
  Approximate s
474. to indicate a Pedigree factor, !A to indicate an alphanumerically coded factor, !I to indicate a factor where the numbers are to be treated as labels for the levels, and * where the numbers are the actual levels.

3.4 The ASReml command file

• If none of the 'names' are indicated as factors using this mechanism, ASReml will scan the first few lines of data and try to identify alphanumeric, integer and simple factors.

Always check the template, as it is likely some variates have been misclassified as factors. The template file created by running ASReml on the nin89.asd file looks like

  !WORKSPACE 100 !RENAME !ARGS // !DOPART $1
  Title: nin89.
  #variety id pid raw rep nloc yield lat long row column
  #LANCER 1 1101 585 1 4 29.25 4.3 19.2 16 1
  #BRULE 2 1102 631 1 4 31.55 4.3 20.4 17 1
  #REDLAND 3 1103 701 1 4 35.05 4.3 21.6 18 1
  #CODY 4 1104 602 1 4 30.1 4.3 22.8 19 1
   variety !A   # CODY
   id *         # 4
   pid !I       # 1104
   raw !I       # 602
   rep *        # 1
   nloc *       # 4
   yield        # 30.1
   lat          # 4.3
   long         # 22.8
   row !I       # 19
   column *     # 1
  # Check/Correct these field definitions.
  nin89.asd !SKIP 1
  yield ~ mu    # Specify fixed model
   !r           # Specify random model
  residual units

We need to change the !I associated with row to * because the row numbers are actually positions, not just labels which could be taken in any order. Note that ASReml displays a data value beside each name to make it easier to confirm the labelling.
475. tratum variance decomposition

  Stratum              Degrees-Freedom  Variance   Component Coefficients
  idv(blocks)                5.00       3175.06    12.0  4.0  1.0
  idv(blocks.wplots)        10.00        601.331    0.0  4.0  1.0
  Residual Variance         45.00        177.083    0.0  0.0  1.0

  Model_Term                         Gamma     Sigma    Sigma/SE  % C
  idv(blocks)          IDV_V    6   1.21116   214.477     1.27    0 P
  idv(blocks.wplots)   IDV_V   18   0.598937  106.062     1.56    0 P
  idv(units)           72 effects
  Residual             SCA_V   72   1.000000  177.083     4.74    0 P

  Wald F statistics
     Source of Variation  NumDF  DenDF_con  F-inc   F-con   M  P-con
   8 mu                     1       6.0     245.14  138.14  .  <.001
   4 variety                2      10.0       1.49    1.49  A  0.272
   7 linNitr                1      45.0     110.32  110.32  a  <.001
   2 nitrogen               2      45.0       1.37    1.37  A  0.265
   9 variety.linNitr        2      45.0       0.48    0.48  b  0.625
  10 variety.nitrogen       4      45.0       0.22    0.22  B  0.928

The analysis shows that there is a significant linear response to nitrogen level, but the lack of fit term and the interactions with variety are not significant. In this example, the conditional Wald F statistic is the same as the incremental one because the contrast must appear before the lack of fit and the main effect before the interaction, and otherwise it is a balanced analysis.

13.4 Other ASReml output files

The first part of the .aov file, the FMAP table, only appears if the job is run in DEBUG mode. There is a line for each model term showing the number of non-singular effects in the terms before the current term is absorbed. For example, variety.nitrogen initially has 12 degrees of freedom (non-singula
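The stratum variance decomposition in the listing above can be checked by hand: each stratum variance is the sum of the variance components weighted by its coefficients, and each Gamma is the corresponding Sigma divided by the residual variance. The Python sketch below (illustration only, not ASReml syntax) reproduces that arithmetic. It assumes the coefficient rows read 12, 4, 1 for blocks, 0, 4, 1 for whole plots and 0, 0, 1 for the residual, which is how the extracted listing has been interpreted here.

```python
# Variance components (Sigma column): blocks, blocks.wplots, residual
sigma = [214.477, 106.062, 177.083]

# Coefficients of each component in each stratum (assumed reading of the listing)
coefficients = {
    "idv(blocks)": [12.0, 4.0, 1.0],
    "idv(blocks.wplots)": [0.0, 4.0, 1.0],
    "Residual Variance": [0.0, 0.0, 1.0],
}

# Stratum variance = sum over components of coefficient * component
stratum_variance = {
    name: sum(c * s for c, s in zip(coefs, sigma))
    for name, coefs in coefficients.items()
}

# Gamma = component expressed as a ratio to the residual variance
gamma = [s / sigma[-1] for s in sigma]

for name, value in stratum_variance.items():
    print(f"{name:20s} {value:10.3f}")
print("gammas:", [round(g, 6) for g in gamma])
```

Under these assumptions the computed stratum variances agree with the Variance column, and the ratios agree with the Gamma column, to the precision printed.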
476. trial 1989
  variety !A
  column 11
  nin89.asd !skip 1
  yield ~ mu variety !r idv(repl)
  predict variety

explanatory variables after the predict directive. The predict statement(s) may appear immediately after the model line (before or after any tabulate statements) or after the R and G structure lines. The syntax is

  predict factors [qualifiers]

• predict must be the first element of the predict statement, in upper or lower case,
• factors is a list of the variables defining a multiway table to be predicted; each variable may be followed by a list of specific levels/values to be predicted, or the name of the file that contains those values,
• the qualifiers, listed in Table 9.1, modify the predictions in some way,
• a predict statement may be continued on subsequent lines by terminating the current line with a comma,
• several predict statements may be specified.

ASReml parses each predict statement before fitting the model. If any syntax problems are encountered, these are reported in the .pvs file, after which the statement is ignored; the job is completed as if the erroneous prediction statement did not exist.

9.3 Prediction

The predictions are formed as an extra process in the final iteration and are reported to the .pvs file. Consequently, aborting a run by creating the ABORTASR.NOW file (see page 68) will cause any predict statements to be ignored. Create FINALASR.NOW instead of ABORTASR.NOW to make the next it
477. two-dimensional error structure can be defined (see !SECTION on page 73).

!CSV: used to make consecutive commas imply a missing value; this is automatically set if the file name ends with .csv or .CSV (see Section 4.2). Warning: this qualifier is ignored when reading binary data.

!DATAFILE f: specifies the datafile name, replacing the one obtained from the datafile line. It is required when different PATHS (see !DOPATH in Table 10.3) of a job must read different files. The !SKIP qualifier, if specified, will be applied when reading the file.

!FILTER v [!SELECT n] [!EXCLUDE n]: (New R4) enables a subset of the data to be analysed; v is the number or name of a data field. When reading data, the value in field v is checked after any transformations are performed. If !SELECT and !EXCLUDE are omitted, records with zero in field v are omitted from the analysis. If !SELECT n is specified, records with n in field v are retained and all other records are omitted. Conversely, if !EXCLUDE n is specified, records with n in field v are ignored.

5.7 Data file qualifiers

Table 5.2: Qualifiers relating to data input and output

qualifier: action

!FOLDER s: specifies an alternative folder for ASReml to find input files. This qualifier is usually placed on a separate line BEFORE the data filename line (and any pedigree/giv/grm filename lines). For example
  !FOLDER ..\Data
  data.asd !SKIP 1
is equivalent to
  ..\Data\data.asd !SKIP 1

!FORMAT s: s
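The record selection rules described above for the filter qualifier (default, select and exclude) can be summarised in a short Python sketch. This is an illustration of the stated rules only, not ASReml code; the function name, the record layout and the decision to reject select and exclude being given together are assumptions made for the example.

```python
def filter_records(records, field, select=None, exclude=None):
    """Apply the three filtering rules to a list of dict records (hypothetical helper)."""
    if select is not None and exclude is not None:
        # Assumption for this sketch: treat the two options as mutually exclusive
        raise ValueError("give at most one of select and exclude")
    if select is not None:
        # Retain only records whose filter field equals the selected value
        return [r for r in records if r[field] == select]
    if exclude is not None:
        # Drop records whose filter field equals the excluded value
        return [r for r in records if r[field] != exclude]
    # Default rule: drop records with zero in the filter field
    return [r for r in records if r[field] != 0]


# Hypothetical data: four plots with a site code used as the filter field
records = [
    {"plot": 1, "site": 0},
    {"plot": 2, "site": 1},
    {"plot": 3, "site": 2},
    {"plot": 4, "site": 1},
]
default_kept = [r["plot"] for r in filter_records(records, "site")]
selected = [r["plot"] for r in filter_records(records, "site", select=1)]
excluded = [r["plot"] for r in filter_records(records, "site", exclude=1)]
print(default_kept, selected, excluded)
```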
478. ty to the dose.sex interaction.

We also note the comment 3 possible outliers: see .res file. Checking the .res file, we discover unit 66 has a standardised residual of -8.80 (see Figure 15.1). The weight of this female rat, within litter 9, is only 3.68, compared to weights of 7.26 and 6.58 for two other female sibling pups. This weight appears erroneous, but without knowledge of the actual experiment we retain the observation in the following. However, part 2 shows one way of 'dropping' unit 66 by fitting an effect for it with out(66).

[Figure 15.1: Residual plot for the rat data. Plot title: Rats example, Residuals vs Fitted values; Residuals (Y) -3.02 to 1.22; Fitted values (X) 5.04 to 7.63.]

We refit the model without the dose.sex term. Note that the variance parameters are re-estimated, though there is little change from the previous analysis.

  Model_Term                    Gamma     Sigma          Sigma/SE  % C
  idv(dam)      IDV_V    27   0.595157   0.979179E-01     2.93    0 P
  idv(units)    322 effects
  Residual      SCA_V   322   1.000000   0.164524        12.13    0 P

  Wald F statistics
  Source of Variation NumDF DenDF_con F_inc F_con M P_con

15.4 Source of variability in unbalanced data - Volts

  7 m
479. ularities occur. ASReml runs more efficiently if no constraints are applied. Following is an example of Helmert and sum-to-zero covariables for a factor with 5 levels.

        H1  H2  H3  H4    C1  C2  C3  C4
  F1    -1  -1  -1  -1     1   0   0   0
  F2     1  -1  -1  -1     0   1   0   0
  F3     0   2  -1  -1     0   0   1   0
  F4     0   0   3  -1     0   0   0   1
  F5     0   0   0   4    -1  -1  -1  -1

is used to take a copy of a pedigree factor f and fit it without the genetic relationship covariance. This facilitates fitting a second animal effect. Thus, to form a direct, maternal genetic and maternal environment model, the maternal environment is defined as a second animal effect coded the same as dams, viz. !r animal dam ide(dam).

forms the reciprocal of v + r. This may also be used to transform the response variable.

forms n+1 Legendre polynomials of order 0 (intercept), 1 (linear), ..., n from the values in v; the intercept polynomial is omitted if n is preceded by the negative sign. The actual values of the coefficients are written to the .res file. This is similar to the pol() function described below.

takes the coding of factor f as a covariate. The function is defined for f being a simple factor, Trait and units. The lin(f) function does not centre or scale the variable. Motivation: sometimes you may wish to fit a covariate as a random factor as well. If the coding is, say, 1...n, then you should define the field as a factor in the field definition and use the lin() function to include it as a covariate in the model. Do not centre
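The Helmert and sum-to-zero covariables tabulated above for a five-level factor follow a simple pattern that can be generated for any number of levels. The Python sketch below (illustration only, not ASReml syntax) builds both sets under the usual sign convention, which is assumed here because the signs in the extracted table are hard to read: Helmert column j compares level j+1 with the levels before it, and the sum-to-zero columns are indicator columns with the last level coded -1 throughout. The function names are hypothetical.

```python
def helmert(levels):
    """Rows are factor levels, columns are the levels-1 Helmert covariables."""
    rows = []
    for i in range(1, levels + 1):
        row = []
        for j in range(1, levels):
            if i <= j:
                row.append(-1)   # earlier levels are each weighted -1
            elif i == j + 1:
                row.append(j)    # the level being compared carries weight j
            else:
                row.append(0)    # later levels do not enter this contrast
        rows.append(row)
    return rows


def sum_to_zero(levels):
    """Rows are factor levels, columns are the levels-1 sum-to-zero covariables."""
    rows = [[1 if i == j else 0 for j in range(levels - 1)]
            for i in range(levels - 1)]
    rows.append([-1] * (levels - 1))   # last level is minus the sum of the others
    return rows


for name, h_row, c_row in zip(["F1", "F2", "F3", "F4", "F5"],
                              helmert(5), sum_to_zero(5)):
    print(name, h_row, c_row)
```

Each column of either set sums to zero over the levels, which is what makes them contrasts.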
480. u              1   32.0  8981.48  1093.05  .  <.001
  3 littersize    1   31.4    27.85    46.43  A  <.001
  1 dose          2   24.0    12.05    11.42  A  <.001
  2 sex           1  301.7    58.27    58.27  A  <.001

Part 4 shows what happens if we (wrongly) drop dam from this model. Even if a random term is not 'significant', it should not be dropped from the model when we are testing fixed effects, or desire standard errors of adjusted means, if it represents a stratum of the design as in this case.

  Model_Term                   Gamma     Sigma     Sigma/SE  % C
  idv(units)   322 effects
  Residual     SCA_V   322   1.000000  0.253182    12.59    0 P

  Wald F statistics
     Source of Variation  NumDF  DenDF_con  F_inc     F_con    M  P_con
   7 mu                     1     317.0    47077.31  3309.42   .  <.001
   3 littersize             1     317.0       68.48   146.50   A  <.001
   1 dose                   2     317.0       60.99    58.43   A  <.001
   2 sex                    1     317.0       24.52    24.52   A  <.001

15.4 Source of variability in unbalanced data - Volts

In this example we illustrate an analysis of unbalanced data in which the main aim is to determine the sources of variation rather than assess the significance of imposed treatments. The data are taken from Cox and Snell (1981) and involve an experiment to examine the variability in the production of car voltage regulators. Standard production of regulators involves two steps. Regulators are taken from the production line to a setting station and adjusted to operate within a specified voltage range. From the setting station the regulator is then passed to a testing station, where it is tested and returned if outside t
481. ualifier and φ would be degrees of freedom in the typical application to mean squares. The default value of φ is 1.

!NEGBIN [!LOGARITHM | !IDENTITY | !INVERSE] [!PHI φ]: fits the Negative Binomial distribution, with variance function $v = \mu + \mu^2/\phi$. Natural logarithms are the default link function. The default value of φ is 1.

General qualifiers

!AOD: requests an Analysis of Deviance table be generated. This is formed by fitting a series of sub-models for terms in the DENSE part, building up to the full model, and comparing the deviances. An example of its use is
  LS !BIN !TOT COUNT !AOD ~ mu SEX GROUP
!AOD may not be used in association with PREDICT.

!DISP [h]: includes an overdispersion scaling parameter (h) in the weights. If !DISP is specified with no argument, ASReml estimates it as the residual variance of the working variable. Traditionally it is estimated from the deviance residuals, reported by ASReml as Variance heterogeneity. An example of its use is
  count !POIS !DISP ~ mu group

!OFFSET [o]: is used especially with binomial data to include an offset in the model, where o is the number or name of a variable in the data. The offset is only included in binomial and Poisson models (for Normal models just subtract the offset variable from the response variable), for example
  count !POIS !OFFSET base !DISP ~ mu group
The offset is included in the model as $\eta = X\tau + o$. The offset will often be somethi
482. uare root scale. Figure 15.8 presents a plot of the treated and the control root area (on the square root scale) for each variety. There is a strong dependence between the treated and control root area, which is not surprising. The aim of the experiment was to determine the tolerance of varieties to bloodworms and thence identify the most tolerant varieties. The definition of tolerance should allow for the fact that varieties differ in their inherent seedling vigour (Figure 15.8). The original approach of the scientist was to regress the treated root area against the control root area and define the index of vigour as the residual from this regression. This approach is clearly inefficient since there is error in both variables. We seek to determine an index of tolerance from the joint analysis of treated and control root area.

[Figure 15.8: Rice bloodworm data: plot of square root of root weight for treated versus control. Plot title: "this is for the paired data"; Y axis 1.8957 to 14.8835; X axis 8.2675 to 23.5051.]

15.8 Paired Case-Control study - Rice

15.8.1 Standard analysis

The allocation of bloodworm treatments within varieties a
483. uced more computationally efficiently than it would be using PREDICT. For example

  TPREDICT Animal !AVE Trait 2 1 1 2  7 4 !ONLYUSE us(Trait).nrm(Animal)

Part of the motivation for this is the calculation of selection indices. The index coefficients are typically derived as $w = a' G_{om} G_{mm}^{-1}$, where $G_{mm}$ is the variance matrix for the measured traits (corresponding to C in the example), $G_{om}$ is the genetic covariance matrix between the objective traits and the measured traits, and $a$ is the vector of economic values for the objective traits. The results are given in a .sli (selection index) file. This directive should be placed after the model specification.

10 Command file: Running the job

10.1 Introduction

The command line, its options and arguments are discussed in this chapter. Command line options enable more workspace to be accessed to run the job, control some graphics output and control advanced processing options. Command line arguments are substituted into the job at run time.

As Windows likes to hide the command line, most command line options can be set on an optional initial line of the .as file, which we call the top job control line to distinguish it from the other job control lines discussed in Chapter 6. If the first line of the .as file contains a qualifier other than !DOPATH, it is interpreted as setting command line options and the Title is taken as the next line.

10.2 The command line

10.2.1 Normal run

The basic command
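For the selection index coefficients described in the TPREDICT passage above, a small worked example may help. The Python sketch below (illustration only, not ASReml syntax) evaluates the economic values times the objective-by-measured covariance matrix times the inverse of the measured-trait variance matrix, for two objective traits and two measured traits. All numbers and the function name are hypothetical, and the explicit 2x2 inverse is used only to keep the example self-contained.

```python
def index_weights(a, g_om, g_mm):
    """Return w = a' G_om G_mm^{-1} for two measured traits (hypothetical helper)."""
    # Row vector a' G_om: one entry per measured trait
    ag = [sum(a[i] * g_om[i][j] for i in range(len(a))) for j in range(2)]
    det = g_mm[0][0] * g_mm[1][1] - g_mm[0][1] * g_mm[1][0]
    if det == 0:
        raise ValueError("G_mm is singular")
    # Explicit inverse of the 2x2 matrix G_mm
    inv = [[g_mm[1][1] / det, -g_mm[0][1] / det],
           [-g_mm[1][0] / det, g_mm[0][0] / det]]
    # Post-multiply the row vector by the inverse
    return [ag[0] * inv[0][j] + ag[1] * inv[1][j] for j in range(2)]


# Hypothetical economic values and (co)variance matrices
a = [1.0, 2.0]
g_om = [[2.0, 0.5], [0.3, 1.5]]
g_mm = [[4.0, 1.0], [1.0, 3.0]]
w = index_weights(a, g_om, g_mm)
print([round(x, 4) for x in w])
```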
484. ues.

ASReml doesn't print predictions of non-estimable functions unless the !PRINTALL qualifier is specified. However, using !PRINTALL is rarely a satisfactory solution. Failure to report predicted values normally means that the predict statement is averaging over some cells of the hyper-table that have no information and therefore cannot be averaged in a meaningful way. Appropriate use of the !AVERAGE and/or !PRESENT qualifiers will usually resolve the problem. The !PRESENT qualifier enables the construction of means by averaging only the estimable cells of the hyper-table, where this is appropriate.

Table 9.1 is a list of the prediction qualifiers with the following syntax:

• f is an explanatory variable which is a factor,
• t is a list of terms in the fitted model,
• n is an integer number,
• v is a list of explanatory variables.

9.3 Prediction

Table 9.1: List of prediction qualifiers

qualifier: action

Controlling formation of tables

Qualifiers in this part of the table: !ASSOCIATE [v]; !AVERAGE f [weights]; !AVERAGE f > file [n]; !ASAVERAGE f [weights]; !ASAVERAGE f > file [n]; !PARALLEL [v]; !PRESENT v; !PRWTS v

!ASSOCIATE: facilitates prediction when the levels of one factor are grouped by the levels of another in a hierarchical manner. More details are given below. Two independent associate lists may be specified.

!AVERAGE: is used to formally include a variable in the averaging set and to explicitly set the weights for averaging. Variables
485. ults. A portion of the file is presented below. There is a wide range in SED reflecting the imbalance of the variety concurrence within runs.

15.8 Paired Case-Control study - Rice

Assuming Power transformation was (Y+ 0.000)^ 0.500
The ignored set: run

variety    Trait     Power_value  Stand_Error  Ecode  Retransformed_value  approx_SE
AliCombo   sqrt(yc)  14.9531      0.9181       E      223.5962             27.4568
AliCombo   sqrt(ye)   7.9941      0.7992       E       63.9050             12.7784
Bluebelle  sqrt(yc)  13.1036      0.9310       E      171.7046             24.3987
Bluebelle  sqrt(ye)   6.6302      0.8062       E       43.9598             10.6903
C22        sqrt(yc)  16.6676      0.9181       E      277.8096             30.6050
C22        sqrt(ye)   8.9541      0.7992       E       80.1756             14.3130
YRK1       sqrt(yc)  15.1857      0.9549       E      230.6068             29.0018
YRK1       sqrt(ye)   8.3355      0.8190       E       69.4806             13.6531
YRK3       sqrt(yc)  13.3058      0.9549       E      177.0431             25.4114
YRK3       sqrt(ye)   8.1134      0.8190       E       65.8265             13.2892

Figure 15.9: BLUPs for treated for each variety plotted against BLUPs for control

Table 15.10: Estimated variance parameters from bivariate analysis of bloodworm data

source             control variance  treated variance  covariance
us(trait).variety  3.84              1.96              2.33
us(trait).run      1.71              2.54              0.32
us(trait).pair     2.14              2.35              0.99

15.8.3 Interpretation of results

Recall that the researcher is interested in varietal tolerance to bloodworms. This could be defined in various ways. One option is to consider the regression im
486. umn optional field labels   LANCER 1 NA NA 1 4 NA 4 31 21 1 file augmented by missing values  LANCER 1 NA NA 1 4 NA 4 3 2 4 2 1 for first 15 plots and 3 buffer  LANCER 1 NA NA 1 4 NA 4 3 3 63 1 plots and variety coded LANCER  LANCER 1 NA NA 1 4 NA 4 34 8 41 to complete 22x11 array  LANCER 1 NA NA 1 4 NA 4 365 1   LANCER 1 NA NA 1 4 NA 4 3 7 2 6 1   LANCER 1 NA NA 1 4 NA 4 3 8 4 7 1   LANCER 1 NA NA 1 4 NA 4 3 9 6 8 1   LANCER 1 NA NA 1 4 NA 4 3 10 8 9 1   LANCER 1 NA NA 1 4 NA 4 3 12 10 1   LANCER 1 NA NA 1 4 NA 4 3 13 2 11 1   LANCER 1 NA NA 1 4 NA 4 3 14 4 12 1   LANCER 1 NA NA 1 4 NA 4 3 15 6 13 1   LANCER 1 NA NA 1 4 NA 4 3 16 8 14 1 buffer plots   LANCER 1 NA NA 1 4 NA 4 3 18 15 1 between reps   LANCER 1 NA NA 2 4 NA 17 2 7 264   LANCER 1 NA NA 3 4 NA 25 8 22 8 19 6   LANCER 1 NA NA 4 4 NA 38 7 12 0 10 9   LANCER 1 1101 585 1 4 29 25 4 3 19 2 16 1 original data    BRULE 2 1102 631 1 4 31 55 4 3 20 4 17 1  REDLAND 3 1103 701 1 4 35 05 4 3 21 6 18 1  CODY 4 1104 602 1 4 30 1 4 3 22 8 19 1          Note that    e the pid  raw  repl and yield data for the missing plots have all been made NA  one of  the three missing value indicators in ASReml  see Section 4 2      e variety is coded LANCER for all missing plots  one of the variety names must be used but  the particular choice is arbitrary     28    3 4 The ASReml command file       3 4 The ASReml command file    By convention an ASReml  command file has a  as extension  The file defines    e a title line to describe the job
487. upplies a Fortran-like FORMAT statement for reading fixed-format files. A simple example is !FORMAT(3I4,5F6.2), which reads 3 integer fields and 5 floating point fields from the first 42 characters of each data line. A format statement is enclosed in parentheses and may include 1 level of nested parentheses, e.g. !FORMAT(4x,3(I4,f8.2)). Field descriptors are

• rX to skip r character positions,
• rAw to define r consecutive fields of w characters width,
• rIw to define r consecutive fields of w characters width, and
• rFw.d to define r consecutive fields of w characters width; d indicates where to insert the decimal point if it is not explicitly present in the field,

where r is an optional repeat count.

In ASReml, the A and I field descriptors are treated identically and simply set the field width. Whether the field is interpreted alphabetically or as a number is controlled by the !A qualifier.

Other legal components of a format statement are

• the , character, required to separate fields; blanks are not permitted in the format,
• the / character, which indicates the next field is to be read from the next line; however, a / on the end of a format to skip a line is not honoured,
• BZ; the default action is to read blank fields as missing values, and * and NA are also honoured as missing values. If you wish to read blank fields as zeros, include the string BZ,
• the string BM, which switches back to "blank missing" mode,
• the string T
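To make the field arithmetic in excerpt 487 concrete, here is a small Python sketch (not ASReml code) of how a format such as (3I4,5F6.2) carves up a data line: 3 x 4 + 5 x 6 = 42 characters. The helper name read_fixed and the sample line are hypothetical. The implied-decimal handling follows the usual Fortran Fw.d convention described above, and blank fields are returned as None to mimic the default "blank missing" behaviour.

```python
def read_fixed(line, fields):
    """Split a fixed-format line.

    fields is a list of (kind, width, decimals) tuples, where kind is
    'I' for an integer field or 'F' for a floating point field.
    """
    values, pos = [], 0
    for kind, width, decimals in fields:
        text = line[pos:pos + width].strip()
        pos += width
        if text == "":
            values.append(None)                 # blank field read as missing
        elif kind == "I":
            values.append(int(text))
        elif "." in text:
            values.append(float(text))          # explicit decimal point wins
        else:
            values.append(int(text) / 10 ** decimals)  # implied decimal point
    return values


# Equivalent of (3I4,5F6.2): three 4-character integers, five 6-character reals.
FIELDS = [("I", 4, 0)] * 3 + [("F", 6, 2)] * 5

# Hypothetical 42-character data line, assembled field by field.
line = "".join(["   1", "  12", " 103",
                "  1.25", " 12345", "      ", " -0.50", "  7.00"])

values = read_fixed(line, FIELDS)
print(len(line), values)
```

The fifth field, 12345, has no explicit decimal point, so d = 2 places it two digits from the right; the sixth field is blank and comes back as missing.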
488. ution   Used  EMFLAG O Single standard EM update when AI update unacceptable  You could try  GU  negative definite US  or use XFA instead     Akaike Information Criterion    Bayesian Information Criterion 43471 52    Model_Term    id units   us Trait     Trait  Trait  Trait  Trait  Trait    2  2  US_C 3  3    Sigma    35200 effects  US V 1    1    1  2  1  2    9 46109  7 34181  17 6050  0 272536  0 668009    322    43065 77  assuming 52 parameters      Sigma Sigma SE  C  9 46109 33 29 OP  7 34181 20 55 OP  17 6050 2709 OP   0 272536 3 38 0P  0 668009 13 99 0P    15 10 Multivariate animal genetics data   Sheep       Trait US_V  Trait US_C  Trait US_C  Trait US_C  Trait US_V  Trait US_C  Trait US_C  Trait US_C  Trait US_C  Trait US_V  diag  TrSG123   sex grp   TrS G123 DIAG_V  TrSG123 DIAG_V  TrSG123 DIAG_V  diag  TrAG1245   age grp  TrAG1245 DIAG_V  TrAG1245 DIAG_V  TrAG1245 DIAG_V  TrAG1245 DIAG_V  us Trait   id sire    Trait US_V  Trait US_C  Trait US_V  Trait US_C  Trait US_C  Trait US_V  Trait US_C  Trait US_C  Trait US_C  Trait US_V  Trait US_C  Trait US_C  Trait US_C  Trait US_C  Trait US_V  xfai  TrDam123   id dam   TrDam123 XFA_V  TrDam123 XFA_V  TrDam123 XFA_V  TrDam123 XFA_L  TrDam123 XFA_L  TrDam123 XFA_L  us  TrLit1234   id lit   TrLit1234 US_V  TrLit1234 US_C  TrLit1234 US_V  TrLit1234 US_C  TrLit1234 US_C  TrLit1234 US_V  TrLit1234 US_C  TrLit1234 US_C  TrLit1234 US_C  TrLit1234 uS    ana  n Sk A A WA w    anannnF AFP BP WWWNN KE    Bee FE OO CO    BRR wWwWWNNE 
489. v is a single precision lower triangle row-wise binary file and .dgiv is a double precision lower triangle row-wise binary file. !PRECISION n changes the value used to declare a singularity when inverting a GRM file from 1D-7 to 1D-n.

A GRM can be associated with a factor by using the variance model function grmi(f), which associates the ith GRM with factor f, for example

    grm1(animal !INIT 0.12)

or

    coruh(site).grm2(variety)

It is imperative that the GIV/GRM matrix be defined with the correct row/column order: the order that matches the order of the levels in the factor it is associated with. The easiest way to check this is to compare the order used in the GIV/GRM file with the order reported in the .sln file when the model is fitted.

Another example of !L (Section 5.4.1) is in the analysis of data with 2 relationship matrices based on two separate pedigrees. ASReml only allows one pedigree file to be specified but can create an inverse relationship matrix and store the result in a GIV file. So, 2 relationship matrices based on two separate pedigrees may be used by generating a GIV file from one pedigree and then using that GIV file and the other pedigree in a subsequent run. To process the GIV file properly, we must also generate a file with identities as required for the GIV matrix. An example of this is if the file Hybrid.as includes

    !PART 1
     Mline !P
     Fline !A
    Mline.ped !GIV !DIAG

!GIV generates the file HybridiA.giv and !DIAG genera
490. v pair  diag tmt  id run  idv uni tmt 2    residual idv units     The two paths in the input file define the two univariate analyses we will conduct  We  consider the results from the analysis defined in PATH 1 first  A portion of the output file is    5 LogL  345 306 S2  1 3216 262 df  6 LogL  345 267 52  1 3155 262 df  7 LogL  345 264 S2  1 3149 262 df  8 LogL  345 263 S2  1 3149 262 df          Results from analysis of sqrt rootwt         Akaike Information Criterion 702 53  assuming 6 parameters    Bayesian Information Criterion 723 94    Approximate stratum variance decomposition    Stratum Degrees Freedom Variance Component Coefficients  idv  variety  44 40 26 0156 4238 3 0 3 6 2 0 1 5 1  idv  run  45 17 7 41702 0 0 3 5  0 0 2 0 iT 1  idv variety tmt  39 53 2 99833 0 0 0 0 Zot  0  0 0 2 1  idv  pair  41 43 3 26838 0 0 0 0 0 0 20  O 0 1  idv run tmt  52 38 5 12369 0 0 0 0 O20 CO  2 2 1  Residual Variance 39 09 1 31486 0 0 0 0 0 0 0 0 020 i  Model_Term Gamma Sigma Sigma SE  C  idv  variety  IDV_V 44 1 80947 2 37920 3 01 GP  idv  run  IDV_V 66 0 244243 0 321145 0 59 OP  idv variety tmt  IDV_V 88 0 374220 0 492047 1 78 OP  idv  pair  IDV_V 132 0 742328 0 976056 2 51 OP  idv run tmt  IDV_V 132 1 32973 1 74841 3 65 OP  idv units  264 effects  Residual SCA_V 264 1 000000 1 31486 4 42 OP  Wald F statistics  Source of Variation NumDF DenDF F ine P in    7 mu 1 53 6 1484 96  lt  001  4 tmt 1 60 4 469 35  lt  001    301    oO OD Oo Oo Oo    15 8 Paired Case Control study   Rice   
491. v units

    NIN Alliance Trial 1989
     variety !A
     id
     pid
     raw
     repl 4
     row 22
     column 11
    nin89.asd !skip 1
    yield ~ mu variety !r idv(repl)
    residual idv(units)

7.5 A sequence of variance structures for the NIN data

3a Two-dimensional spatial model with spatial correlation in one direction

The NIN trial was actually laid out as a rectangular array indexed in the data file by row and column. We can therefore consider fitting a spatial model for the residual term where we allow for autocorrelated errors in the row and/or column direction (see Section 7.3). However, there are missing plots in the original data. Before fitting a spatial analysis, we therefore need to fill out the data file to contain records for the missing plots. ASReml can now fill out the data file using !ROWFACTOR and !COLUMNFACTOR (see Table 5.2). This allows us to define a separable variance structure for the residual error term that is the kronecker product of a structure for rows and a structure for columns.

    NIN Alliance Trial 1989
     variety !A
     id
     ...
    yield ~ mu variety !r idv(repl) !f mv
    residual idv(column).ar1(row)

The example in the code box specifies $e \sim N(0,\ \sigma^2\, I \otimes \Sigma(\rho))$, that is, a two-dimensional first-order separable autoregressive spatial structure for error but with spatial correlation in the row direction only (IDVxAR1). ar1(row) models the $\Sigma(\rho)$ correlation stru
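A small numerical sketch may help picture the IDVxAR1 structure described in excerpt 491. It builds a first-order autoregressive correlation matrix with entries rho^|i-j| and forms its Kronecker product with an identity matrix, so that plots are correlated only with other plots in the same column. The dimensions (2 columns, 3 rows), rho = 0.5 and sigma^2 = 2 are made-up values, the helper names are hypothetical, and the code illustrates the algebra only, not ASReml's internal computation.

```python
def ar1_corr(n, rho):
    """n x n first-order autoregressive correlation matrix."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def kronecker(a, b):
    """Kronecker product of two nested-list matrices."""
    rows_b, cols_b = len(b), len(b[0])
    return [[a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
             for j in range(len(a[0]) * cols_b)]
            for i in range(len(a) * rows_b)]


def scale(m, s):
    return [[s * v for v in row] for row in m]


# Assumed toy layout: 2 columns, 3 rows within each column,
# data ordered rows within columns.
sigma2, rho = 2.0, 0.5
R = scale(kronecker(identity(2), ar1_corr(3, rho)), sigma2)

for row in R:
    print(row)
```

The result is block diagonal: each 3 x 3 block is sigma^2 times the AR1 correlation matrix for the rows of one column, and residuals in different columns have zero covariance.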
492. values 4-19 it solves the mixed model equations by iteration, allowing larger models to be fitted. With direct solution, the estimation (REML) iteration routine is aborted after

n = 1: forming the estimates of the vector of fixed and random effects by matrix inversion,
n = 2: forming the estimates of the vector of fixed and random effects, REML log-likelihood and residuals (this is the default),
n = 3: forming the estimates of the vector of fixed and random effects, REML log-likelihood, residuals and inverse coefficient matrix.

For arguments 4, 10-19, ASReml forms the mixed model equations and solves them iteratively to obtain solutions for the fixed and random effects. The options are

n = 4: forming the estimates of the vector of fixed and random effects using the Preconditioned Conjugate Gradient (PCG) Method (Mrode, 2005),

5.8 Job control qualifiers

Table 5.5: List of rarely used job control qualifiers (qualifier, action); qualifiers listed: !DENSE n, !DF n

n = 10-19: forming the estimates of the vector of fixed and random effects by Gauss-Seidel iteration of the mixed model equations, with relaxation factor n/10.

The default maximum number of iterations is 12000. This can be reset by supplying a value greater than 100 with the !MAXIT qualifier in conjunction with the !BLUP qualifier. Iteration stops when the average squared update divided by the average squared effect is less than le

Gauss-Seidel iteration is generally much s
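The Gauss-Seidel option in excerpt 492 can be pictured with the following Python sketch of Gauss-Seidel iteration with a relaxation factor, applied to a small symmetric positive definite system Cx = b. The relaxation factor is n/10 as described above (n = 12 gives 1.2), and iteration stops when the average squared update divided by the average squared effect falls below a tolerance. The tolerance of 1e-10, the toy system and the function name are assumptions made for illustration, since the threshold is cut off in the excerpt and ASReml's own implementation is not shown.

```python
def gauss_seidel(C, b, n=10, tol=1e-10, max_iter=12000):
    """Solve C x = b by Gauss-Seidel sweeps with relaxation factor n/10.

    Returns the solution and the number of sweeps used.
    """
    omega = n / 10.0
    size = len(b)
    x = [0.0] * size
    for iteration in range(1, max_iter + 1):
        sum_sq_update = 0.0
        for i in range(size):
            off_diag = sum(C[i][j] * x[j] for j in range(size) if j != i)
            gs_value = (b[i] - off_diag) / C[i][i]   # plain Gauss-Seidel value
            update = omega * (gs_value - x[i])       # relaxed step
            x[i] += update
            sum_sq_update += update * update
        sum_sq_effect = sum(v * v for v in x)
        # The ratio of averages equals the ratio of sums (same divisor).
        if sum_sq_effect > 0 and sum_sq_update / sum_sq_effect < tol:
            return x, iteration
    return x, max_iter


# Toy coefficient matrix and right-hand side; the exact solution is (1, 2, 3).
C = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [6.0, 10.0, 8.0]

solution, sweeps = gauss_seidel(C, b, n=12)
print(solution, sweeps)
```

Setting n = 10 gives a relaxation factor of 1.0, which is ordinary Gauss-Seidel; values above 10 over-relax each update.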
493. variance in a univariate single site analysis. The option will have no effect in analyses with multiple error variances (for sites or traits) other than in the reported degrees of freedom. Use !ADJUST r rather than !DF n if r is not a whole number. Use with !YSS r to supply variance when data fully fitted.

5.8 Job control qualifiers

Table 5.5: List of rarely used job control qualifiers (qualifier, action)

!EMFLAG n, !PXEM n: requests ASReml use Expectation-Maximization (EM) rather than Average Information (AI) updates when the AI updates would make a US structure non-positive definite. This only applies to US structures and is still under development. When !GP is associated with a US structure, ASReml checks whether the updated matrix is positive definite (PD). If not, it replaces the AI update with an EM update. If the non-PD characteristic is transitory, then the EM update is only used as necessary. If the converged solution would be non-PD, there will be an EM update each iteration even though !EM is omitted.

EM is notoriously slow at finding the solution and ASReml includes several modified schemes, discussed by Cullis et al. (2004), particularly relevant when the AI update is consistently outside the parameter space. These include optionally performing extra local EM or PXEM (Parameter Expanded EM) iterates. These can dramatically reduce the number of iterates required to find a solution near the boundary of the parame
494. variance matrices. In this case

    $\mathrm{var}(y) = \sigma^2 \left( Z G(\gamma_g) Z' + R(\gamma_r) \right)$     (2.8)

which we will refer to as the gamma parameterization, and the individual variance structure parameters in $\gamma_g$ and $\gamma_r$ will be referred to as gammas. ASReml switches between the sigma and gamma parameterizations for estimation. This is discussed in Section 7.6.

2.1.7 Parameter types

Each sigma in $\sigma_g$ and $\sigma_r$ and each gamma in $\gamma_g$ and $\gamma_r$ has a parameter type, for example variance components, variance component ratios, autocorrelation parameters, factor loadings. Furthermore, the parameters in $\sigma_g$, $\sigma_r$, $\gamma_g$ and $\gamma_r$ can span multiple types. For example, the spatial analysis of a simple column trial would involve variance components (sigma parameterization) or variance component ratios (gamma parameterization) and spatial autocorrelation parameters.

2.1.8 Variance structures for the random model terms

The random model terms $u_i$ in $u$ define the random effects and associated design matrices ($Z_1, \ldots, Z_b$), but additional information is required before the model can be fitted. This extra step involves defining the G structure for each term. In Release 4, this is achieved by using functions to directly apply variance models to the individual component factors in a random model term to define $G_i$. This produces a consolidated model term that simultaneously defines both the design matrix, $Z_i$, and variance model, $G_i$. This process is described in det
495. variance model functions can also be applied to compound model terms (example 3).

7.2 Process to define a consolidated model term

Table 7.2: Building consolidated model terms in ASReml

    linear model term         component(s)  variance        variance model  variance/covariance  consolidated model term
    (type of term)                          structure name  function name   component
1   repl (single)             repl          IDV             idv()           idv(repl)            idv(repl)
2   fac(x) (single)           fac(x)        EXPV            expv()          expv(fac(x))         expv(fac(x))
3   A.B (compound)            A.B           IDV             idv()           idv(A.B)             idv(A.B)
4   column.row (compound)     column        IDV             idv()           idv(column)          idv(column).ar1(row)
                              row           AR1             ar1()           ar1(row)
5   site.variety (compound)   site          DIAG            diag()          diag(site)           diag(site).id(variety)
                              variety       ID              id()            id(variety)
6   Trait.animal (compound)   Trait         US              us()            us(Trait)            us(Trait).nrm(animal)
                              animal        NRM             nrm()           nrm(animal)

• variance model functions cannot be applied to expandable model terms, for example to
    A*B which expands to A B A.B,
    A/B which expands to A A.B,
    at(A,i,j).B which expands to at(A,i).B at(A,j).B,
• a variance function must be specified for one, but only one, component in a compound model term. Correlation functions must be defined for the remaining terms. This is due to the identifiability issues that occur when multiple variance structures are specified. This is explained in NIN example 3a, see Section 7.5. The defined variance function may be homogeneous (name ending in v) or heterogeneous variance (name ending in h). This is discuss
496. ve differing results depending on the order in which the averaging is performed. We explore this with the following extended example. Consider the mean yields from 15 trials classified by region and location in Table 9.4.

Table 9.3: Trials classified by region and location

            location
    Region  L1 L2 L3 L4 L5 L6 L7 L8
    R1      T1  T2 T3  T4  T5 T6
    R2      T7  T8 T9 T10  T11 T12  T13 T14  T15

Table 9.4: Trial means

    T1 T2 T3 T4 T5 T6 T7 T8 T9 T10 T11 T12 T13 T14 T15
    10 12 11 12 13 13 11 13 11 12  13  10  12  10  10

Assuming a simplified linear model yield ~ mu region location trial, the predict statement

    predict trial !ASSOCIATE region location trial

will reconstruct the 15 trial means from the fitted mu, region, location and trial effects.

Given these trial means, it is fairly natural to form location means by averaging the trials in each location to get the location means in Table 9.5.

Table 9.5: Location means

    L1 L2 L3 L4 L5 L6 L7 L8
    11 12 13 12 12 11 10 10

These are given by

    predict location !ASSOCIATE region location trial !ASAVERAGE trial

or equivalently

    predict location !ASSOCIATE region location trial

since the default is to average the base associate factor (trial) within the associated classify factor (location).

9.3 Prediction

By contrast, by specifying

    predict location

or equivalently

    predict location !AVERAGE region !AVERAGE trial

ASReml would add the average of all the trial effects and the
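The arithmetic behind Table 9.5 in excerpt 496 can be checked with a few lines of Python. The trial means are those of Table 9.4. The assignment of trials to locations is an assumption: the cell alignment of Table 9.3 is lost in this excerpt, and the grouping below is one that reproduces the location means of Table 9.5. These are raw averages of the tabulated means, not ASReml predictions.

```python
# Trial means from Table 9.4.
trial_means = dict(zip(
    [f"T{i}" for i in range(1, 16)],
    [10, 12, 11, 12, 13, 13, 11, 13, 11, 12, 13, 10, 12, 10, 10],
))

# ASSUMED grouping of trials within locations (chosen to be consistent
# with Table 9.5; the original table layout is not recoverable here).
location_trials = {
    "L1": ["T1", "T2"],
    "L2": ["T3", "T4", "T5"],
    "L3": ["T6"],
    "L4": ["T7", "T8"],
    "L5": ["T9", "T10", "T11"],
    "L6": ["T12", "T13"],
    "L7": ["T14"],
    "L8": ["T15"],
}

# Average the trials within each location.
location_means = {
    loc: sum(trial_means[t] for t in trials) / len(trials)
    for loc, trials in location_trials.items()
}

# Two different "overall" means, depending on the order of averaging.
mean_of_trials = sum(trial_means.values()) / len(trial_means)
mean_of_locations = sum(location_means.values()) / len(location_means)

print(location_means)
print(mean_of_trials, mean_of_locations)
```

Averaging the 15 trials directly and averaging the 8 location means give different overall values, which is the order-of-averaging issue the excerpt describes.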
497. wing lines are honoured if any one of the listed path numbers is active. The !PATH qualifier must appear at the beginning of its own line after the !DOPATH qualifier. A sequence of path numbers can be written using a:b notation. For example

    mydata.asd !DOPATH 4
    !PATH 2 4 6:10

One situation where this might be useful is where it is necessary to run simpler models to get reasonable starting values for more complex variance models. The more complex models are specified in later parts and the !CONTINUE command is used to pick up the previous estimates.

Example

The following code will run through 1000 models fitting 1000 different marker variables to some data. For processing efficiency, the 1000 marker variables are held in 1000 separate files in subfolder MLIB and indexed by Genotype.

    Marker screen
     Genotype
     yield
    PhenData.txt
    !CYCLE 1:1000
    !MBF mbf(Genotype) MLIB/Marker$I.csv !RENAME Marker$I
    yld ~ mu !r Marker$I

10.5 Performance issues

Having completed the run, the Unix command sequence

    grep LogL screen.asr | sort > screen.srt

sorts a summary of the results to identify the best fit. The best fit can then be added to the model and the process repeated. Assuming Marker35 was best, the revised job could be

    Marker screen
     Genotype
     yield
    PhenData.txt
    !CYCLE 1:1000
    !MBF mbf(Genotype) MLIB/Marker$I.csv !RENAME Marker$I
    !MBF mbf(Genotype) MLIB/Marker35.csv !RENAME MKR035
    yld ~ mu !r MKR035 Marker$I

We have giv
498. x

causes ASReml to write the design matrix, not including the response variable, to a .des file. It allows ASReml to create the design matrix required by the VCM process (see Section 7.8.2).

5.8 Job control qualifiers

Table 5.4: List of occasionally used job control qualifiers (qualifier, action)

Qualifiers: !DISPLAY n; !EPS; !G v; !GKRIGE p; !GROUPFACTOR t v p

!DISPLAY n is used to select particular graphic displays. In spatial analysis of field trials, four graphic displays are possible (see Section 13.4). Coding these

    1 variogram,
    2 histogram,
    4 row and column trends,
    8 perspective plot of residuals,

set n to the sum of the codes for the desired graphics. The default is 9 (1+8).

These graphics are only displayed in versions of ASReml linked with Winteracter (that is, LINUX, MAC and PC versions). Line printer versions of these graphics are written to the .res file. See the G command line option (Section 10.3 on graphics) for how to save the graphs in a file for printing.

Use !NODISPLAY to suppress graphic displays.

!EPS sets hardcopy graphics file type to .eps.

!G v is used to set a grouping variable for plotting, see !X.

!GKRIGE p controls the expansion of !PVAL lists for fac(X,Y) model terms. For kriging prediction in 2 dimensions (X,Y), the user will typically want to predict at a grid of values, not necessarily just at data combinations. The values at which the prediction is required can be specified separately for X and Y using two !PVAL stat
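The additive coding of the !DISPLAY argument in excerpt 498 works like a set of bit flags. The Python sketch below is purely illustrative (the function names are hypothetical and nothing here is part of ASReml): it converts between a sum of codes and the list of graphics it selects, using the four codes listed above.

```python
# Codes for the four graphic displays, as listed in the excerpt.
DISPLAY_CODES = {
    1: "variogram",
    2: "histogram",
    4: "row and column trends",
    8: "perspective plot of residuals",
}


def decode_display(n):
    """Return the graphics selected by a summed DISPLAY code."""
    if not 0 <= n <= sum(DISPLAY_CODES):
        raise ValueError("code must be between 0 and 15")
    return [name for code, name in DISPLAY_CODES.items() if n & code]


def encode_display(names):
    """Return the summed code that selects the named graphics."""
    lookup = {name: code for code, name in DISPLAY_CODES.items()}
    return sum(lookup[name] for name in set(names))


default_graphics = decode_display(9)
print(default_graphics)
print(encode_display(["histogram", "row and column trends"]))
```

Because each code is a distinct power of two, every sum from 0 to 15 corresponds to exactly one combination of displays.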
499. x-axis variable is numeric.

Predictions involving two or more factors

Sub-options: xaxis factor; superimpose factors; condition factors

If these arguments are used, all prediction factors (except for those specified with only one prediction level) must be listed once and only once, otherwise these arguments are ignored.

xaxis factor specifies the prediction factor to be plotted on the x-axis.
superimpose factors specifies the prediction factors to be superimposed on the one panel.
condition factors specifies the conditioning factors which define the panels. These should be listed in the order that they will be used.

Layout

Sub-options: goto n; saveplot filename; layout rows cols; bycols; blankpanels n; extrablanks n and extraspan p

goto n specifies the page to start at, for multi-page predictions.
saveplot filename specifies the name of the file to save the plot to.
layout rows cols specifies the panel layout on each page.
bycols specifies that the panels be arranged by columns (default is by rows).
blankpanels n specifies that each page contains n blank panels. This sub-option can only be used in combination with the layout sub-option.
extrablanks n and extraspan p specify that an additional n blank panels be used every p pages. These can only be used with the layout sub-option.

Improving the graphical appearance (and readability)

Sub-options: labcharsize n; panelcharsize n; vertxlab; abbrdlab n; abbrxlab n

labcharsize n specifies the relative size of the data points/labels (default 0.4).
panelcharsize n specifies the relative size of the labels used for the panels (default 1
500. xample of the display produced when an XFA structure is fitted. The output from a small example with 9 environments and 2 factors is

13.4 Other ASReml output files

Figure 13.3: Plot of residuals in field plan order

Figure 13.4: Plot of the marginal means of the residuals

Figure 13.5: Histogram of residuals

DISPLAY of variance partitioning for XFA structure in xfa(Env,2)
(display with columns including Lvl, TotalVar, %expl and Loadings, and an Average row; the values are not recoverable from this excerpt)

In the figure, 1 indicates the proportion of TotalVar explained by the first loading, 2 indicates the proportion explained by first and second, provided it plots right of 1. Consequently, the distance from 2 to the right margin represents PsiVar. %expl reports the percentage of
501. y dimensioned. Further, when the model term mv is included in the model and !ROWFACTOR and !COLUMNFACTOR are defined, ASReml will check that the observations in each section form a complete grid; if not, the grid will be completed by adding the appropriate extra data records. If only one grid is required from all the data then the !SECTION variable does not need specifying. The following is a basic example assuming 5 sites (sections).

    Basic multi-envt trial analysis filling out row and column grid
     site 5      # sites coded 1 ... 5
     column      # columns coded 1 ...
     row         # rows coded 1 ...
     variety !A  # variety names
     yield
    met.dat !SECTION site !ROWFACTOR row !COLUMNFACTOR col
    yield ~ site !r variety site.variety !f mv
    residual sat(site).ar1(row).ar1(column)

!SPLINE defines a spline model term with an explicit set of knot points. The basic form of the spline model term, spl(v), is defined in Table 6.1 where v is the underlying variate. The basic form uses the unique data values as the knot points. The extended form is spl(v,n), which uses n knot points. Use this !SPLINE qualifier to supply an explicit set of n knot points (p) for the model term. Using the extended form without using this qualifier results in n equally spaced knot points being used. The !SPLINE qualifier may only be used on a line by itself after the datafile line and before the model line.

When knot points are explicitly supplied they should be in increasing order and adequately cover t
502. ypically coded 0/1/2, being counts of the minor allele. However, if they are imputed, they will take real values between 0 and 2. Since marker files may be huge,

8.11 Factor effects with large Random Regression models

!SMODE b sets the storage mode for the regressor data, indicating whether it is marker data: b = 2 sets 2-bit storage for strictly 0/1/2 marker data; b = 8 (the default) sets 8-bit storage, useful for marker data with imputed values having 2 digits after the decimal; b = 16 sets 16-bit storage, useful for marker data with imputation with more than 2 digits; and b = 32 sets 32-bit real storage and should be used for non-marker data.

!RANGE l h indicates that the marker scores range from l to h and are to be transformed to have a range of 0 to 2.

!GSCALE s controls the scaling of the GRM matrix. If unspecified, $s = \sum 2p(1-p)$ is used for marker data (s = 1 for non-marker data, !SMODE 32). Scaling is often used with centred marker data to scale the MM' matrix so that it is a genomic matrix.

Example

    !WORK 1
    Nassau Clone Data
     Nfam 71 !A
     Nfemale 26 !A
     Nmale 37 !A
     Clone !A 860
     rep 8
     iblk 80
     culture !A
     DBH6
    snpData.grr  Clone Marker
    nassau.csv !MAXIT 30 !SKIP 1 !DFF  1
    DBH6 ~ mu culture rep !r grm1(Clone) 0.27 Clone 0.15 rep.iblk 0.31

where snpData.grr is first used to declare Clone identifiers (taken from the first field) in the correct order, and then contains the marker scores; it looks like

    Genotype  0 10024 01 114  0 10037 01 25
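The role of the scaling factor s in excerpt 502 can be illustrated with a short Python sketch that builds a genomic matrix MM'/s from 0/1/2 marker scores. This is a sketch under stated assumptions, not ASReml's implementation: the scaling s = sum of 2p(1-p) is the common genomic-relationship scaling and is how the garbled formula in the excerpt is read here, allele frequencies p are estimated from the scores themselves, markers are centred by subtracting 2p, and the tiny marker matrix is made up.

```python
def genomic_matrix(markers):
    """Return (G, s, freqs) for a list of individuals' 0/1/2 marker scores."""
    n_ind, n_mk = len(markers), len(markers[0])
    # Estimated allele frequency per marker: mean score divided by 2.
    freqs = [sum(row[j] for row in markers) / (2.0 * n_ind) for j in range(n_mk)]
    # Centre each marker column at its expected score 2p.
    centred = [[row[j] - 2.0 * freqs[j] for j in range(n_mk)] for row in markers]
    # Assumed scaling factor s = sum over markers of 2p(1-p).
    s = sum(2.0 * p * (1.0 - p) for p in freqs)
    G = [[sum(centred[a][j] * centred[b][j] for j in range(n_mk)) / s
          for b in range(n_ind)]
         for a in range(n_ind)]
    return G, s, freqs


# Made-up scores for 4 individuals at 3 markers.
markers = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [1, 2, 1],
]

G, s, freqs = genomic_matrix(markers)
print(s, freqs)
for row in G:
    print(row)
```

Dividing the centred cross-product MM' by s puts the matrix on the scale of a relationship matrix, which is the purpose the excerpt gives for the scaling.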
503. ys:

• in the variance structure specification, for example

    ar1(row !INIT 0.35)

  sets the initial value of the autocorrelation parameter for ar1(row) at 0.35; when this form is used, all of the values required by the structure must be specified,
• by modifying the .tsv or .msv file created in a preliminary run (Section 7.9.1),
• by supplying an .rsv file using !CONTINUE (Section 7.9.2).

Important points

• when initial values are supplied using !INIT, there must be the correct number of values and they must be in the appropriate order; for example, for us() the initial values need to be supplied in the order lower triangle row-wise,
• for the gamma parameterization (Section 7.6), the variance structure parameters will be gammas; in this case the initial values for the gammas that are variance component ratios will be interpreted by ASReml as ratios.

7.7 Variance model function qualifiers

7.7.6 About subsections !SUBSECTION f

The !SUBSECTION qualifier provides an extension to the sat function of Section 7.3.2 for modelling the residual variance. It allows the case of modelling multiple independent sections of correlated observations with a common variance structure and common parameters within sections. The sections can be of different sizes and any homogeneous variance correlation model in Table 7.6 may be used for the variance structure. This gives an R structure of the form

    R    i1 Pto  where R    Dj Z  Q      so R   may have
504. ywt   IF  1    3  ASSIGN YV gfw   IF  1    4  ASSIGN YV fdm   IF  1    5  ASSIGN YV fat  tag   sire 92 II   dam 3561  I   grp 49   sex   brr 4   litter 4871   age   wwt  MO    MO identifies missing values  ywt  MO   gfw  MO   fdm 1MO   fat 1MO   coop  fmt   IPART 1235     YV   mu age brr sex age sex  r idv sire  idv dam  idv lit  idv age grp    idv sex grp   f grp  traits are substituted for  YV    PART 4  leaves out sex grp for fdm    YV   mu age brr sex age sex  r idv sire  idv dam  idv lit  idv age grp    If grp   fdm is substituted for  YV    Tables 15 13 and 15 14 present the summary of these analyses  Fibre diameter was measured  on only 2 female lambs and so interactions with sex were not fitted  The dam variance  component was quite small for both fibre diameter and fat  The REML estimate of the  variance component associated with litters was effectively zero for fat     318    15 10 Multivariate animal genetics data   Sheep       Table 15 14  Wald F statistics of the fixed effects for each trait for the genetic example       term wwt ywt gfw fdm fat       age 331 3 67 1 52 4 26 7 5   brr 5546 73 4 149 03 13 9   sex 196 1 123 3 0 2 29 0 6  age sex 10 3 1 7 1 9   5 0          Thus in the multivariate analysis we consider fitting the following models to the sire  dam  and litter effects     var  us    Ys    Io   var  ua    Ya   3561  var  u    5  Q   I 4891    where   2    5 and 5f   are positive definite symmetric matrices corresponding to the  between traits variance 
    