E-SURGE 1.8 user's manual
Contents
1. $E^{(0)}$ = {site 1, site 2, site 3, †}, $E^{(1)}$ = {site 1, site 2, site 3, †}, $E^{(2)}$ = {staying in 1, leaving 1, staying in 2, leaving 2, staying in 3, leaving 3, †} (33)
Matrices and pattern matrices. The initial state matrix and its pattern matrix are given in Equation (34). The first elementary transition matrix, for survival, maps from $E^{(0)}$ to $E^{(1)}$ and hence is of dimension 4x4:
$$\begin{pmatrix} s_1 & 0 & 0 & 1-s_1 \\ 0 & s_2 & 0 & 1-s_2 \\ 0 & 0 & s_3 & 1-s_3 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
Figure 6: DAG of the Grosbois model (nodes: Site 1, Site 2, Site 3, Dead). We use a directed acyclic graph (commonly abbreviated to DAG in the statistical literature) to describe the life processes modelled inside one occasion for the AS site fidelity model, showing the transitions for survival, fidelity given survival, and destination given movement. Transition probabilities are shown on the pathways originating in Site 1 and for the dead; the transition probabilities on the other arrows follow the same pattern. The row-stochastic matrix of the first step projects from row 1 to row 2 of the graph. The matrix of the second step projects from row 2 to row 3, and the matrix of the third step projects from row 3 to row 4, i.e. from row 3 back to row 1.
The second elementary matrix, for site fidelity given survival, maps from $E^{(1)}$ to $E^{(2)}$ and hence is 4x7. Letting f be the probability of remaining in a site given survival, we have f, 1-f, 0, 0, 0, 0, 0
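The decomposition described in this fragment can be sketched numerically: multiplying the three elementary matrices (survival 4x4, fidelity 4x7, destination 7x4) should give a 4x4 row-stochastic transition matrix over site 1, site 2, site 3 and dead. This is a minimal illustration only; all probability values and the variable names are hypothetical, not taken from the manual.

```python
# Sketch of a three-step decomposition: survival -> fidelity -> destination.
# All numbers are made up for illustration.

def matmul(a, b):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

s = [0.8, 0.7, 0.9]  # hypothetical survival probabilities for sites 1..3
f = [0.6, 0.5, 0.7]  # hypothetical fidelity probabilities given survival

# Step 1 (4x4): rows and columns are site 1, site 2, site 3, dead.
survival = [
    [s[0], 0, 0, 1 - s[0]],
    [0, s[1], 0, 1 - s[1]],
    [0, 0, s[2], 1 - s[2]],
    [0, 0, 0, 1],
]

# Step 2 (4x7): columns are staying in 1, leaving 1, staying in 2,
# leaving 2, staying in 3, leaving 3, dead.
fidelity = [
    [f[0], 1 - f[0], 0, 0, 0, 0, 0],
    [0, 0, f[1], 1 - f[1], 0, 0, 0],
    [0, 0, 0, 0, f[2], 1 - f[2], 0],
    [0, 0, 0, 0, 0, 0, 1],
]

# Step 3 (7x4): destination given movement, back to the 4 original states.
destination = [
    [1, 0, 0, 0],      # staying in 1
    [0, 0.3, 0.7, 0],  # leaving 1: goes to site 2 or site 3
    [0, 1, 0, 0],      # staying in 2
    [0.4, 0, 0.6, 0],  # leaving 2
    [0, 0, 1, 0],      # staying in 3
    [0.5, 0.5, 0, 0],  # leaving 3
    [0, 0, 0, 1],      # dead stays dead
]

full = matmul(matmul(survival, fidelity), destination)
for row in full:
    print([round(p, 3) for p in row])
```

Each row of `full` sums to one because each elementary matrix is row-stochastic, which is the property the DAG description relies on.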
2. The rank can also be estimated, less precisely, by the numerical rank of the computed Hessian plus the estimated number of boundary parameters, based on a threshold used to decide which eigenvalues (listed in L428 to L499) can be considered equal to zero, as mentioned in the previous paragraph. The threshold criterion is $\lambda > 10^{-7} \times n \times \lambda_1$, where n is the size of the Hessian matrix (L78) and $\lambda_1$ its largest eigenvalue. Thus, instead of the current value estimated by the numerical CMF method (section 7.2), L79 may give the number of eigenvalues which satisfy this criterion plus the estimated number of boundary parameters (L80). Following [49], another, less severe threshold is applied in E-SURGE: $\lambda > n \times 10^{-4} \times \lambda_1$. The difference between the results of the two thresholds is given in L82. These results can be used together with theoretical calculations for advanced investigations of redundancy issues. In this example, we see from L428-L499 that the rank is at least 70, as all eigenvalues but two (0.00000801 and 0.00172081) are clearly larger than 0. However, two estimates are considered as being on a boundary; [19] recommend considering such parameters as non-redundant. Furthermore, convergence was not achieved in this case: two eigenvalues are lower than zero. Hence the number of non-redundant parameters is taken to be 72, which is a bad estimate for the rank.
9.6 Beta estimates (L355-L426)
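The eigenvalue-counting idea in this fragment can be sketched as follows. The function counts the eigenvalues that exceed a relative threshold of the form n × tolerance × largest eigenvalue. The exact constants used by E-SURGE are garbled in the source, so the tolerances below (1e-7 and 1e-4) and the eigenvalues are assumptions chosen only to show how two thresholds can disagree.

```python
# Sketch: numerical rank of a Hessian from its eigenvalues.
# The threshold form n * rel_tol * largest is an assumption; the
# eigenvalues below are hypothetical.

def numerical_rank(eigenvalues, rel_tol):
    """Count eigenvalues strictly above n * rel_tol * largest eigenvalue."""
    n = len(eigenvalues)
    largest = max(eigenvalues)
    threshold = n * rel_tol * largest
    return sum(1 for lam in eigenvalues if lam > threshold)

eigenvalues = [10.0, 2.0, 5e-4, 1e-9]  # made-up spectrum

rank_first = numerical_rank(eigenvalues, 1e-7)   # first threshold
rank_second = numerical_rank(eigenvalues, 1e-4)  # second threshold
print(rank_first, rank_second, rank_first - rank_second)
```

In a full rank estimate, the number of boundary parameters would then be added to this count, as the text describes.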
3. B. K. Williams, J. D. Nichols, and M. J. Conroy. Analysis and management of animal populations: modeling, estimation, and decision making. Academic Press, 2002.

Index
Age model
AIC
Arnason-Schwarz model: combined; separate; with site fidelity
Biological parameters: label; output
Bootstrap
Censoring, left
Closed models
Closed population
Cluster
Conditionality
Confidence intervals: by delta method
Constrained models
Constraint matrix
Continuous function
Counting algorithm
Covariates: covariate selection; file format; individual; principle; tests
Data: BIOMECO files; HEADED files; MARK files
Decomposition in Elementary Steps: see DES
DES: diagram
Design matrix: see Constraint matrix
E-SURGE: organisation; session
Effects: see Keywords; combining effects; levels; lists; main effects
Encounter: decomposition; First; keywords firste & nexte; Next
Expectation-Maximization (EM)
External covariates: input file for
GEMACO: in practice
General model
Generalized linear mixed model
GEPAT: see Pattern matrix; biological parameters; practice
Global minimum
Hessian: eigenvalues
Initial state: skipping
Initial values: chang
4. ... (42)
where upper indices stand for time and age, where the lower index stands for site, and where $v'$ is the transpose of $v$.

4 Constrained models made easy: GEMACO

The definition of models in terms of constraints on each of the elementary matrices is carried out with GEMACO [5], which is also part of M-SURGE [9]. The most important new aspect in E-SURGE is that the GEMACO keywords from and to (current) now refer to the rows and columns of the elementary matrices rather than the full matrix.

4.1 Overview

With GEMACO, one of the salient features of E-SURGE (or M-SURGE), you will be able to generate easily the constraint matrix X associated with the model. For the sake of simplicity, a different sub-matrix of constraints is defined for each type of initial, transition and encounter parameter. Overall, there are LI + LT + LB sub-matrices of constraints. E-SURGE assembles them to build the overall matrix of constraints. For instance, with a DES (1, 2, 1) general model (see section 2.3), with matrices X1, X2, X3, X4 associated with initial states, survivals, movements and event probabilities respectively, the overall matrix is, in block matrix notation,
$$X = \begin{pmatrix} X_1 & 0 & 0 & 0 \\ 0 & X_2 & 0 & 0 \\ 0 & 0 & X_3 & 0 \\ 0 & 0 & 0 & X_4 \end{pmatrix}$$
associated with the block vector of mathematical parameters. The main step in defining a matrix of constraints for one type of parameter consists in typing a phrase using the Model Definition Language, or MDL, for
5. Defining the model as a (for age) creates a matrix X with A columns when the umbrella model contains A classes of age, with A ≤ K-1. X is given in Table 3, with A = 2, for $\theta_2$ given by Equation 42.

Table 3: Constraint matrix associated to age and generated by GEMACO. [Table entries lost in extraction.] The constraint matrix X (left part) is generated by GEMACO according to the components of the vector $\theta_2$ described in Equation 42. The coordinates (F, To, T, A, G) of the components correspond respectively to From, To, Time, Age, Group and are displayed in the right part.

Note that if A = 1 then a = a(1); else if A = 2 then a = a(1,2), as a(2) = a(2:K-1) (see section 4.6); else if A = 3 then a = a(1,2,3).

How is the MDL phrase group interpreted by GEMACO? With 2 groups, the vector of survival parameters $\theta_2$ becomes the vector given in Equation (43), and defining the model as t (for time) or g (for group) leads to matrices X and Y with twice
6. The Maximum Likelihood Estimates (MLEs) of the mathematical parameters β are given along with their 95% confidence intervals and their standard errors. Fixed betas do not appear. The standard errors (SEs) are derived from the variance-covariance matrix, computed as the inverse of the second-order derivative matrix of the likelihood or, equivalently, of the first-order derivative matrix of the analytical gradient of the likelihood. The 95% confidence interval $[\hat\beta - 1.96\,SE(\hat\beta),\ \hat\beta + 1.96\,SE(\hat\beta)]$ relies on the asymptotic Gaussian distribution of MLEs.

9.7 MLEs of parameters and standard errors (L134-L348)

In lines 229 to 343, the MLEs of the biological parameters $\theta = f^{-1}(X\beta)$, their confidence intervals and standard errors are listed. To easily identify the parameters, their row number in the X matrix is given. The concerned states, occasions, ages, groups and steps are also given, in this order, according to the letters F To T A G S (line 227), which refer to From, To, Time, Age, Group, Step. Irrelevant values are set to 0. Example: S 2 2 4 1 1 1 means probability of survival (step 1) from state 2 at occasion 4 (i.e. between occasions 4 and 5) for age 1 and group 1 (there is only one age class and one group). Considering the number of biological parameters, users are helped by:
• an Excel file copy;
• a summary of lines 229 to 343, given in lines 151 to 221.
The covariance matrix of Xβ is obtained from
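The interval described above can be reproduced by hand from an estimate and its standard error. The sketch below computes the 95% interval on the scale of the mathematical parameter and then, as an assumption made only for illustration, back-transforms its endpoints with the inverse logit to get an interval for a probability. E-SURGE's own intervals for biological parameters may be computed differently (the manual's index mentions the delta method); the numbers are hypothetical.

```python
import math

def wald_interval(beta_hat, se, z=1.96):
    """Symmetric interval beta_hat +/- z * se (asymptotic normality)."""
    return beta_hat - z * se, beta_hat + z * se

def inv_logit(x):
    """Inverse of the logit link: maps the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

beta_hat, se = 1.2, 0.25  # hypothetical estimate and standard error
lower, upper = wald_interval(beta_hat, se)

# Assumption: endpoints are back-transformed through the link.
prob_interval = (inv_logit(lower), inv_logit(upper))
print(lower, upper, prob_interval)
```

Back-transforming the endpoints keeps the interval inside (0, 1), which a symmetric interval built directly on the probability scale does not guarantee.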
7. but it is now also used for encounter probabilities and initial probabilities. For some recent examples in a CR context, see [21, 13, 48, 34, 24, 45]. By default in E-SURGE, LI = LT = LB = 1. Examples of such one-step processes, DES (1, 1, 1), are the combined CAS model or the combined memory model. However, by setting LI, LT and/or LB to values greater than one and using GEPAT, it is possible to define the pattern of each of the elementary matrices at each step; it becomes possible to fit many models that are difficult or impossible to estimate elsewhere. Here we present several examples to clarify the steps in the analysis. Another strong characteristic of the implementation of DES in E-SURGE is that, with 1 ≤ l ≤ LT, one can define a transition matrix from $E^{(l-1)}$ to $E^{(l)}$, where $E^{(l)}$ can be another set of states than E. The same feature is also available for initial state and encounter matrices, which is entirely new for models in CR. An example is drawn in Figure 5. We will see that it is now possible to generate new general models in the context of linear models in CR.
Figure 5: Decomposition of the conditional event probabilities (diagram labels: sex ascertained, sex not ascertained, seen, not seen). For sex determination in [33], two steps are needed for encounter to generate the GM. In the diagram, S represents the set of states (males or females), O the
8. Figure 10: Menu for the choice of the survival (dialog "Which law for the duration"; the listed laws include non-parametric, Weibull-type, Gompertz-type, Siler and geometrical). After exiting GEMACO, the user must select a continuous function thanks to a new menu. Several hazard functions are available, as well as a non-parametric function (i.e. fully age-dependent survival) and a geometrical distribution (i.e. constant survival).
Figure 11: Menu for setting the initial values of the continuous function.
first one corresponding to firste to 1, then exit. A new menu appears (see Figure 11) for setting the initial values of the continuous function.

5 Data input

CR data typically consist of recapture history data $(e_1, e_2, \ldots, e_K)$ with associated numbers of animals $(\mathit{eff}_1, \mathit{eff}_2, \ldots, \mathit{eff}_{NG})$. A negative value for eff means that the animals were removed at the occasion of last capture. E-SURGE recognizes three file formats for CR data input: the BIOMECO format, the MARK format (where numbers instead of letters are used as labels for states) and the HEADED format. This implies that 9 states at most can be handled using the MARK format. Any number of states can be considered with the BIOMECO or the HEADED format.

5.1 The BIOMECO format

This format stems from the statistical ecology s
9. However, for the combined survival-transition formulation with one state and the separate survival-transition formulation with two states, the logit and generalized logit coincide.

11 Acknowledgements

We warmly thank Lauriane Rouan, Christine Hunter, Stéphanie Jenouvrier, Cédric Juillet, Jean-Dominique Lebreton, Olivier Gimenez and members of the Biostatistics and Population Biology team for their useful comments and, for some, for their patience as first users of E-SURGE, and Hal Caswell for his participation in the first part of the manual. This research was supported by a grant from the "Jeunes Chercheuses et Jeunes Chercheurs" program of the French ANR (ANR-08-JCJC-0028-01).

12 Conditions of external use

Program E-SURGE is the property of those who wrote the software. Conditions for its use are the usual ones:
• The software will be used for academic or research purposes only. In particular, it will not be used for commercial applications.
• Due acknowledgement will be made for the use of the E-SURGE program in research reports or publications, mentioning the program as well as the publication related to the program (our preferred option, for obvious reasons) or the manual.
• The user acknowledges that the software is a research product and is provided without any expressed or implied warranty. There is no warranty of any kind concerning the fitness of the software for any particular purpose.

13 Output
10. about good starting values, one possibility is to change the initial values once or repeatedly at random, another is to start from the results of a previous model. The two approaches are available in E-SURGE:
• Use random initial values (option Random). Repeated random initial values (option Multiple Random) are particularly useful: over several successive runs, you will most of the time get convergence at least once to the global minimum of the deviance, even if there are local minima.
• Use the MLEs of the previous model as initial values (option From last model). As an example, the MLEs of the JMV model are more easily attained by starting from the MLEs of the corresponding CAS model. E-SURGE automatically adapts parameter values to fit the structure of the current model.
• Use initial values defined in IVFV files (option Multiple Random from IVFV files; see section 5.7 for the definition of an IVFV file). To use this option, you must first define a file as in Table 14, containing the number of initial values and the name of each IVFV file.

Table 14: Multiple random values from IVFV file. The first line of the file contains n, the number of starting values, and the next lines the n names of IVFV files.
n
name_file_1.fix
...
name_file_n.fix

Important note: as in multistate models, in multievent models no method totally guarantees convergence to the global minimum of the deviance. Based on our experience, we recommend using the option EM 2
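The reasoning behind repeated random starts can be illustrated on a toy problem. The sketch below minimises a one-dimensional function with two minima by plain gradient descent from several random starting points and keeps the best result. The function, the step size and the optimiser are all assumptions chosen for illustration; they are not E-SURGE's deviance or its quasi-Newton solver.

```python
import random

def deviance(x):
    """Toy objective with a global minimum near x = -1.04 and a local one near x = 0.96."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def gradient(x):
    return 4.0 * x * (x * x - 1.0) + 0.3

def local_minimise(x, step=0.01, iterations=2000):
    """Fixed-step gradient descent: converges to the minimum of the basin it starts in."""
    for _ in range(iterations):
        x -= step * gradient(x)
    return x

def multiple_random(n_starts, rng):
    """Run the local optimiser from random starts and keep the lowest deviance."""
    best_x = None
    for _ in range(n_starts):
        x = local_minimise(rng.uniform(-2.0, 2.0))
        if best_x is None or deviance(x) < deviance(best_x):
            best_x = x
    return best_x

rng = random.Random(1)  # fixed seed so the run is reproducible
best_x = multiple_random(20, rng)
stuck_x = local_minimise(1.5)  # a single unlucky start ends in the local minimum
print(best_x, deviance(best_x), stuck_x, deviance(stuck_x))
```

A single start at 1.5 stays in the local minimum, while the best of several random starts reaches the lower one, which is the behaviour the Multiple Random option relies on.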
11. and the last 3 capture parameters (70, 71 and 72). The last list of indices, built at the MLEs, contains two more parameters: two transition parameters (45 and 48). This could be an instance of a local drop of rank. However, checking the estimated values shows that this parameter is estimated at the boundary: its estimated value is zero because there is no such transition in the data set. This is not a case of redundancy.

8 Advanced tools for output

Several advanced tools have been made available in E-SURGE to give additional output. Only the non-parametric bootstrap is general. The other tools are currently only available for the option Markovian states only > Conditional on 1st Capture.

8.1 State dependent probabilities of CR histories

One may wish to consider the probability of each history, or the probability for an individual, conditional on its history, to be in state D at first encounter e and in state A at the last occasion K. This approach is related to Bayes' theorem and was used in [35] to obtain a post-allocation of individual animals to classes of heterogeneity. To obtain these quantities in E-SURGE, select View history state dependent probability in the menu Run & See (see Figure 33).
Figure 33 (menu path: Run & See > Run Output > Save history state dependent probability): To save the probability of each history, select View history state dependent probability in the menu Run & S
12. d stage E d Name file for covariates defaultfile of step for encounter 1 Phrase for step 1 firste nexte current t Number of shortcuts 0 Pattern matrix b b p n 3 Name file for covariates defaultfile Link function logitgen Explicit gradient off TOLF 0 0000001 TOLX 0 0000001 OUTPUT TEXT FILE 137 138 139 79 Order for the Gauss Hermite interpolation 15 000000 dev PI dev PHI B 116414 612 QAIC 116552 612 Number of mathematical parameters 72 000000 Estimated model rank applicable to the data 69 000000 Estimated number of boundary parameters 2 000000 WARNING there might be O further non identifiable parameters c hat used for the QAIC 1 000000 Model type O Markovian gt O Unavailable 0 000000 Conditionality O 1st capture 1 1st occasion 2 Closed 0 000000 Method for the Gauss Hermite interpolation unknown Time for optimisation 1 215743e 001 seconds Time for Hessian 2 745516e 001 seconds Number of iterations for optimisation 84 69 singular values bigger than 9 2387e 006 May beO more parameters are non estimables 9 quantities solutions of3 partial derivatives equations made of redundant parameters indices below are estimables 25 26 27 40 41 42 55 56 57 70 71 72 25 26 27 40 41 42 55 56 57 70 71 72 25 26 27 40 41 42 55 56 57 70 71 72 11 quantities solutions of3 partial derivatives equations made of redu
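The deviance, rank and QAIC reported in this output appear to be linked by the usual definition QAIC = deviance / c-hat + 2 × (number of estimated parameters), with the estimated rank (69) rather than the number of mathematical parameters (72) used as the parameter count. The check below assumes that definition and that the garbled figures read 116414.612 and 116552.612; the function name is hypothetical.

```python
def qaic(deviance, n_parameters, c_hat=1.0):
    """Usual quasi-likelihood AIC: deviance scaled by c-hat plus twice the parameter count."""
    return deviance / c_hat + 2 * n_parameters

# Values read from the output above (decimal points restored by assumption).
value = qaic(116414.612, 69, 1.0)
print(round(value, 3))
```

With c-hat equal to 1, the QAIC reduces to the ordinary AIC.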
13. (see Figure 12), a shortcut (see Figure 7) or a cluster (see Figure 13) is automatically built according to the number of modalities.
• Shortcuts: if the number of modalities (levels) is lower than or equal to 50, a shortcut is created, which can be used as a fixed or a random effect.
• Clusters: if the number of modalities (levels) is strictly larger than 50, a cluster is created by grouping individuals by modality. It can be used only as a cluster random effect (currently not available, but will be very soon).
Figure 13: Cluster interface (Covariate_Selection window: covariates used to define GROUPS serve for fixed and/or random effects and can be combined; covariates used to define CLUSTERS serve for random effects only and cannot be combined).

5.4 File of external covariates

The format of the file is described in Table 11. n is the number of external covariates x, stored as successive rows of k values.

Table 11: Description of the external covariates file
n
k_1 k_2 ... k_n
x_{1,1} x_{1,2} ... x_{1,k_1}
...
x_{n,1} x_{n,2} ... x_{n,k_n}

5.5 File of time intervals

The file of time intervals is a row of K-1 r
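The rule for qualitative covariates stated above (a shortcut up to 50 levels, a cluster beyond) can be written as a small helper. This is only a sketch of the stated rule; the function name and the sample data are hypothetical and do not correspond to anything in E-SURGE itself.

```python
def covariate_structure(values, limit=50):
    """Return 'shortcut' if the covariate has at most `limit` distinct levels, else 'cluster'."""
    n_levels = len(set(values))
    return "shortcut" if n_levels <= limit else "cluster"

sex = ["M", "F", "F", "M", "F"]            # 2 levels
nest_id = [f"nest{i}" for i in range(51)]  # 51 levels

print(covariate_structure(sex), covariate_structure(nest_id))
```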
14. Figure 34: First four capture histories extracted from the file histories.tmp. Each block gives a HISTORY, its GROUP and the SUM OF PROBABILITIES, followed by rows of probabilities.

8.2 Non-parametric Bootstrap

E-SURGE allows you to do a non-parametric bootstrap. Select the option Run & See > Run Options > Bootstrap (see Figure 35) and set the number of iterates (by default 1). A file named bootstrap.txt is saved, containing first the list of deviances, followed by a list of vectors of mathematical parameters. To guard against local minima, it is highly recommended to use the Multiple Random option (section 10.1) to fit the model at each iterate of the bootstrap.
Figure 35: Selecting the non-parametric bootstrap option.

8.3 The Viterbi and the Counting algorithms

The Viterbi and the counting algorithms were developed by [44] in the context of capture-recapture for estimating the lifetime reproductive success (LRS). The Viterbi algorithm reconstitutes the life of the individual: the most probable underlying state sequence or, more generally, a set of state sequences such that the cumulative probability reache
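The resampling step of a non-parametric bootstrap can be sketched as follows: individuals are drawn with replacement, which for grouped data amounts to redrawing the counts attached to each capture history. This is a generic sketch under that assumption, not E-SURGE's implementation; the histories and counts are hypothetical, and negative counts (animals removed at last capture) are not handled.

```python
import random
from collections import Counter

def bootstrap_sample(histories, counts, rng):
    """Resample individuals with replacement; return new counts per history."""
    total = sum(counts)
    draws = rng.choices(range(len(histories)), weights=counts, k=total)
    tally = Counter(draws)
    return [tally.get(i, 0) for i in range(len(histories))]

histories = ["1101", "1010", "0110"]  # hypothetical capture histories
counts = [12, 7, 5]                   # hypothetical numbers of animals

rng = random.Random(42)  # fixed seed for reproducibility
resampled = bootstrap_sample(histories, counts, rng)
print(dict(zip(histories, resampled)))
```

Each bootstrap iterate would then refit the model to the resampled counts, ideally with several random starts, as recommended in the text.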
15. Residual variance
Saddle point: how to manage
Senescence: age of
Shortcuts: addressing levels; definition; limitation
Standard error
State dependent probabilities
Umbrella model
Unequal time intervals: application; for real age; in practice; input file for; limitation
Viterbi algorithm
16. EXIT in the lower part of the window. However, if you want to linger some more within the GEMACO interface, here are some indications of what you can do. For each kind of parameter, two steps are compulsory and four steps are optional.
• (Optional) Select an external file if any external effects are involved (see section 4.4). To standardize these external covariates, select "file with external variables to standardize". A file with standardized variables is created and saved under the name of the original file with a "std" suffix.
• (Optional) Define shortcuts in "shortcuts for sentences" before GEMACO (see section 4.9 for its use). Click on the button "Add shortcut" to define a new shortcut. You can also select a shortcut and erase it by clicking on "delete shortcut", or load and save shortcuts from the menu "Pre-defined Shortcuts". At the end of the session, shortcuts are saved with the session. With a new session, if no shortcut is already defined, a default file is loaded, to which you can of course add your own shortcuts.
• (Optional) Change the pattern for transitions in the "Transitions pattern" area (see section 4.10 for its use). The position of the zero in each row represents the parameter that is not constrained directly. To change it, first select the row concerned and then move the bottom cursor to the desired position.
• Enter a string in the "Model definition" area and validate it by clicking out of the "Model definition" area (see section 4.1
17. Figure 17: Main window of E-SURGE, with a toolbar and 4 distinct areas. The aspect of the window is that after the data set has been loaded (a short description of the data is given in the Data status area) but before any model has been defined (see area Fit a model). In this area, four buttons give access to the GEPAT interface, the GEMACO interface, the IVFV (Initial Values and Fixed Values) interface and the RUN (Deviance Minimization routine, or solver). At the stage considered here, no button is activated.

6.2 Opening a session

In the Start menu, you can select either a new or an old session, or exit from the program. If you want to begin a new session, first select the option "open a new session". A window will appear asking where you want to store the results during the working session. For example, use the name firststep.mod as in Figure 18.
Figure 18: Open a new session: firststep. This analysis was run under a French version of Windows.
All essential pieces of information about data and computations will be stored in this ses
18. Figure 31: Output of E-SURGE I. Information in the DOS window. While a model is being fitted, information about the iterations scrolls down the left DOS window. Once convergence has been reached, the numerical ranks of the model at 4 points in the neighborhood of the MLEs and at the MLEs themselves (5 values) are displayed. The right bottom window (Output area of the main window) permanently displays the previously fitted models, ordered by AIC, with the following pieces of information: model name, rank, deviance and AIC. Once validated the
19. In both cases, build the model with environmental covariates following the usual three steps: GEMACO, GEPAT and IVFV. To do the permutation test, select the option RUN > Permutation test. To do the standard test, select the option RUN > ANODEV test. In addition to the deviances of the constant and time-dependent models, E-SURGE asks for the Excel file containing the time-dependent model with estimated variances. With this last option, E-SURGE computes an estimate of the residual variance. Important remark: this option is available only for the first step of transition, which is in general the step for survival.

9 Interpreting the output

In this chapter, we will go through the output file obtained at the end of chapter 6. Some of this information, and more, is also available in the corresponding Excel file.

9.1 CAS output file

The content of the CAS output file is given in detail in chapter 13. The lines are numbered from 1 to 499, abbreviated L1 to L499 hereafter.

9.2 File heading (L1-L9)

Lines 1 to 4 give information about the version of E-SURGE used (L1), the name of the current output file (L3) and the data file (L4). Lines 5-9 give some basic and essential information about the data, such as the number of occasions (L5), the number of states (L6), the number of events (L7), the number of groups (L8) and the number of age classes (L9).

9.3 Information about the model
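The standard (ANODEV) test compares three deviances: those of the constant model, the covariate model and the time-dependent model. The sketch below uses the analysis-of-deviance statistic as it is commonly written in the capture-recapture literature; this formula is an assumption here, since the manual fragment does not give it, and E-SURGE's computation may differ in detail. All deviance values are hypothetical.

```python
def anodev(dev_constant, dev_covariate, dev_time, n_covariates, n_time_params):
    """Return (F statistic, df1, df2, proportion of deviance explained).

    n_time_params is the number of parameters estimated by the
    time-dependent model for the quantity being tested (assumption).
    """
    explained = dev_constant - dev_covariate  # deviance explained by the covariate(s)
    residual = dev_covariate - dev_time       # deviance left unexplained
    df1 = n_covariates
    df2 = n_time_params - n_covariates - 1
    f_stat = (explained / df1) / (residual / df2)
    r2_dev = explained / (dev_constant - dev_time)
    return f_stat, df1, df2, r2_dev

# Hypothetical deviances: constant, covariate and time-dependent models.
f_stat, df1, df2, r2_dev = anodev(1250.0, 1230.0, 1200.0, 1, 12)
print(round(f_stat, 3), df1, df2, r2_dev)
```

The F statistic would then be compared with an F distribution with (df1, df2) degrees of freedom; that step is left out to keep the sketch free of external libraries.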
20. Samuel. A general model for the analysis of mark-resight, mark-recapture, and band-recovery data under tag loss. Biometrics, 60(4):900-909, 2004.

BIBLIOGRAPHY

[14] J. E. Dennis and R. B. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Classics in Applied Mathematics. SIAM, 1983.
[15] O. Duriez, S. A. Saether, B. J. Ens, R. Choquet, R. Pradel, R. H. D. Lambeck, and M. Klaassen. Estimating survival and movements using both live and dead recoveries: a case study of oystercatchers confronted with habitat change. Journal of Applied Ecology, 46(1):144-153, 2009.
[16] M. Fujiwara and H. Caswell. Estimating population projection matrices from multi-stage mark-recapture data. Ecology, 83(12):3257-3265, 2002.
[17] O. Gimenez and R. Choquet. Individual heterogeneity in studies on marked animals using numerical integration: capture-recapture mixed models. Ecology, 91(4):951-957, 2010.
[18] O. Gimenez, R. Choquet, and J.-D. Lebreton. Parameter redundancy in multistate capture-recapture models. Biometrical Journal, 45:704-722, 2003.
[19] O. Gimenez, R. Covas, C. R. Brown, M. D. Anderson, M. B. Brown, and T. Lenormand. Nonparametric estimation of natural selection on a quantitative trait using mark-recapture data. Evolution, 60(3):460-466, 2006.
[20] O. Gimenez, A. Viallefont, E. A. Catchpole, R. Choquet, and B. J. T. Morgan. Methods for investigating parameter re
21. Selecting steps for unequal time interval
5.7 File of initial values / fixed values
6 A short session
6.1 Main window of E-SURGE
6.2 Opening a session
6.3 Building matrix patterns using GEPAT
6.4 Building constraint using GEMACO
6.5 Changing Initial Values or Fixing Values
6.6 Setting optimization parameters and running the model
6.7 Output of results
7 Advanced tools for numerical issues
7.1 [title illegible in the source]
7.2 Catchpole-Morgan-Freeman approach for redundancy
8 Advanced tools for output
8.1 State dependent probabilities of CR histories
8.2 Non-parametric Bootstrap
8.3 The Viterbi and the Counting algorithms
8.4 Tests for environmental covariates in presence of a random effect
9 Interpreting the output
9.1 CAS output file
9.2 File heading (L1-L9)
9.3 Information about the model (L11-L21)
9.4 Minimization (L23-L26 and L355-L426)
9.5 Deviance, AIC and related topics (L35-L44 and L431-L502)
9.6 Beta estimates (L358-L426)
22. • The outputs (chapter 9)
• A few warnings (chapter 10)

2 Models

2.1 Notation

Our presentation of Multievent models will use the following general notations and follow, as much as possible, those of [37]:
N: the number of states
U: the number of events
K: the number of occasions
A: the maximum age class
NG: the number of groups
LI, LT, LB: the number of steps in the decomposition for initial state, transition and event
i = 1, ..., N: the index of the previous or departure state
j = 1, ..., N: the index of the current or arrival state
u = 1, ..., U: the index of the current event
k = 1, ..., K: the occasion index
a = 1, ..., A: the index of the current age class
ng = 1, ..., NG: the index of the current group
l = 1, ..., LI, LT or LB: the elementary step index
E = {e_1, ..., e_N}: the set of states, where e_N = † stands for death
Ω = {v_1, ..., v_U}: the set of events, where v_1 = "not seen"

All transition matrices are written with i as row index and j as column index, following the Markov chain convention in which transitions are from rows to columns, rather than the column-to-row convention used in matrix population models. Encounter matrices use j, denoting the state, as row index and u, denoting an event, as column index.

2.2 Multievent models

The time-dependent Multievent model assumes that individuals move independently among a finite set E of states over a finite number K of sampling occasions and that successive states obey a Mark
23. $\ldots + \sum_{p=1}^{P} Z_p b_p + \sum_{p=P+1}^{P+Q} Z_{pn} b_{pn}, \quad n = 1, \ldots, NI$ (12)
where $b_p$ and $b_{pn}$ are random effects given by
$b_p \sim N(0, \sigma_p^2 \times I_{s_p}), \quad p = 1, \ldots, P$ (13)
$b_{pn} \sim N(0, \sigma_p^2), \quad p = P+1, \ldots, P+Q$
• $b_p$: random effects associated to the set of effects p = 1, ..., P
• $s_p$: number of levels of the random effect ($s_p$ = NG for a group random effect)
• $b_{pn}$: individual random effects, assuming that individuals are independent, p = P+1, ..., P+Q

Matrices $Z_{pn}$ contain either 0, 1 or values of individual covariates and are never stored. Because we assume that individuals are independent, the covariance matrices for each random effect are diagonal. We use this property to implement efficient algorithms for individual and group random effects, see [7, 17]. For individuals, E-SURGE can handle mixed models with individual random effects only (P = 0) [7, 17]:
$f(\theta_n) = X_0\beta_0 + X_n\beta_1 + \sum_{p=1}^{Q} Z_{pn} b_{pn}$ (14)
with $b_{pn}$ in the form of equation 13. There is no limit on the number of random effects that can be built. However, for Q > 2, the fitting step may be time consuming.

Example 1: Survival varying with individual covariates and random effect. The following model has been used in [19], with a survival constant across time but dependent on an individual covariate (the body weight, denoted by m) and on an individual random effect $b_n$:
$\mathrm{logit}(\varphi_n) = \beta_0 + \beta_1 m_n + b_n, \quad n = 1, \ldots, NI$ (15)
where $b_n \sim N(0, \sigma^2)$ i.i.d. For groups, E-SURGE can also handle mix
24. 9.7 MLEs of parameters and standard errors (L134-L348)
10 A few warnings
10.1 [title illegible in the source]
10.2 Saddle point
10.3 [title illegible in the source]
10.4 Generalized logit
11 Acknowledgements
12 Conditions of external use
13 Output text file
Bibliography
Index

List of Tables
1 Umbrella models of E-SURGE
2 Constraint matrix associated to time and generated by GEMACO
3 Phrase age interpreted by GEMACO
4 Constraint matrices associated to time and group generated by GEMACO
5 Keywords
6 Constraint matrices associated to time, group and time.group generated by GEMACO
7 Models in notation of [25] vs in GEMACO
8 Description of the BIOMECO file format
9 Description of the MARK file format
10 Description of the HEADED file format
11 Description of the external covariates file
12 Description of the file of time intervals
13 Description of an IVFV file
14 Multiple random values from IVFV file

List of Figures
1 Definition of the UM from the GM
25. as a linear transformation of a vector β of mathematical parameters. To keep the biological parameters, which are probabilities, in their permissible range [0, 1], a link function f is generally applied (see section 2.5):
$f(\theta) = X\beta$ (2)
or equivalently
$\theta = f^{-1}(X\beta)$ (3)
The matrix X is a matrix of constraints. It can be a genuine design matrix in the case of a designed experiment. In general, it expresses hypotheses about the dependence of the parameters on stage of departure or arrival, age (since first capture), time, group and/or covariates. The design matrix is built by the program GEMACO (GEnerator of MAtrices of COnstraints, see Section 4) using the model definition language described below. Often, X will contain both discrete indicator (0/1) variables for equality constraints and continuous covariates (e.g. effort or weather covariates). An overview of linear constraints in CR models with a single state is given by [25]; linear constraints in multistate models are considered in [5]. An important difference in the application of GEMACO in E-SURGE, as compared to M-SURGE, is that the GEMACO keywords (from, to, etc.) in E-SURGE refer to the elementary matrices. The rows (from) and columns (to) in these matrices do not necessarily correspond to the states in the model (e.g. in the encounter matrix, the columns refer to events, not states). Care is thus required in writing down the GEMACO specification.

2.5 The link f
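Equations (2) and (3) can be illustrated with two small constraint matrices: a single column of ones makes the parameter constant over three occasions, while an identity matrix gives one mathematical parameter per occasion. The sketch assumes a logit link; the beta values and function names are hypothetical and serve only to show the mapping from β to θ.

```python
import math

def inv_logit(x):
    """Inverse logit link: maps a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def biological_parameters(X, beta):
    """theta = f^{-1}(X beta), computed row by row."""
    return [inv_logit(sum(x_ij * b_j for x_ij, b_j in zip(row, beta)))
            for row in X]

# Constant model: one mathematical parameter shared by 3 occasions.
X_constant = [[1], [1], [1]]
theta_constant = biological_parameters(X_constant, [0.0])

# Time-dependent model: one mathematical parameter per occasion.
X_time = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
theta_time = biological_parameters(X_time, [0.0, 1.0, -1.0])

print(theta_constant, theta_time)
```

Whatever the values of β, the resulting θ stay strictly between 0 and 1, which is the purpose of the link function.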
26. as many rows (see Table 4). [Table 4: constraint matrices X and Y (left part) and the coordinates (F, To, T, A, G) of each component (right part).] Table 4: Phrases time and group interpreted by GEMACO. Constraint matrices, respectively X and Y (left part), are generated by GEMACO according to the components of the vector $\theta$ described in Equation 43. The coordinates (F, To, T, A, G) of the components correspond respectively to From, To, Time, Age, Group and are displayed in the right part. How are the MDL phrases from and to interpreted by GEMACO? Effects from and to take their meaning only when there are several states (more than 2 without the state). When from (or, for short, f) is applied to survival-transition probabilities, the matrices of Equation 21 will be equal to $\begin{pmatrix} \phi_1 & \phi_1 & 1 - 2\phi_1 \\ \phi_2 & \phi_2 & 1 - 2\phi_2 \\ 0 & 0 & 1 \end{pmatrix}$ (remember: rows = previous state, columns = next state). If the structure to is used, the survival-transition probabilities will be equal to the matrix whose first row is $(\phi_1,\ \phi_2,\ 1 - \sum_{i=1,2} \phi_i)$ and whose second row begins $\phi_1$, $\phi_2$, $1 - \sum$
27. but before IVFV (see Figure 8). E-SURGE asks you for lists of integers at which the constraint will apply (see Figure 9). 4.14 Non-linear model. When the option Models > Markovian & semi-Markovian states > Conditional on 1st Capture is selected, the steps in the GEMACO and IVFV interfaces are two-fold. In GEMACO, first define the linear model as usual, but using the intercept for the survival, then exit. A new menu appears (see Figure 10). Several continuous functions associated to different hazard functions are available [11]; select one of them. In IVFV, fix the only parameter linked to the survival to 1; fix also the relevant capture rate like the [Figure 8: the Setting menu, with the entries GEnerator of PATtern matrices (Ctrl P), GEnerator of MAtrices of COnstraints (Ctrl G), Initial Values or Fixed Values of parameters (Ctrl I), Link function, Current model name, Unequal time interval, Set Unequal time interval of steps, Set equality between param of various types.] Figure 8: Set equality between mathematical parameters. Figure 9: Set equality between param of various types. Users should enter lists of integers at which the constraint will apply, like 1,4:6 2,3. Here mathematical parameters 1, 4, 5 and 6 are set equal, as well as mathematical parameters 2 and 3. Lists must be separated by a space, i.e. space
28. clicking out of the Model definition area. The button Call Gemaco is now activated. Click on it or select call Gemaco in the Gemaco menu. The design matrix appears in the Constraint matrix area. The initial state part of the model is now defined.
• Select transitions in the menu Parameters, select current step 1 corresponding to survival, enter the string f t in the Model definition area (f: from and t: time) and validate it by clicking out of the Model definition area. The button Call Gemaco is now activated. Click on it or select call Gemaco in the Gemaco menu. The design matrix appears in the Constraint matrix area. The survival part of the model is now defined.
• Select current step 2 corresponding to fidelity, enter the string f t in the Model definition area, validate it and call GEMACO.
• Select current step 3 corresponding to settlement, enter the string f to t in the Model definition area, validate it and call GEMACO.
• Select Event in the menu Parameters, enter the string firste nexte current t in the Model definition area (a: age), validate it and call GEMACO. The first encounter (denoted firste, vs. nexte for re-encounter), corresponding to the probability of capture included in B, will be fixed to one later on in IVFV.
The CAS model with site fidelity parametrization is now fully specified and you can leave the interface by clicking on the button
29. first capture is considered. [Figure 26: the Setting menu entry If any factorisation, with the choices Initial, Transition & encounter; Markovian states only: Transition & encounter; Random Effect for Independent Group Only.] Figure 26: Menu to estimate or to skip initial state probabilities. 6.4 Building the CAS model with site fidelity parametrization using GEMACO. The next step is to specify more precisely the particular model to fit. This model is always nested within the umbrella model, and appropriate restrictions are implemented through constraints on parameters. For people used to MARK or SURGE, building constraints means creating design matrices. One great feature of E-SURGE is that constraints are specified by means of a Model Definition Language interpreted by GEMACO. When the button GEMACO in the compute a model area of the main window (see Figure 17) is pressed, or alternatively GEMACO in the Setting menu is selected, the GEMACO window opens (see Figure 27). This window has a toolbar with four menus (Input/Output for constraint matrix, Parameters, Parameters and Gemaco) and four areas (Model definition, Shortcuts for sentences, Transitions pattern and Constraint matrix in Figure 27).
30. > age > group > step. [Figure 2: a time axis with occasions 1 to 7 and two individual histories, History 1 = 1 0 1 0 0 0 1 and History 2 = 0 0 1 0 0 0 1, each followed by its probability written as a product of survival terms $s_k$ and capture terms.] Figure 2: Two individual histories and their associated probabilities. Capture is modeled by a constant first capture rate $Pr^{(1)}$ vs. a constant recapture rate $Pr^{(2)}$. Here $Pr^{(1)}$ represents the first class of age of event and $Pr^{(2)}$ represents the second class of age of event. The index for one age class in event is the number of occasions spent since first capture plus one. [Table 1: one row per parameter type, with upper indices listing the sources of variation time, age, group and step.] Table 1: Variations considered in the parameters of the umbrella models of E-SURGE. The type of variation is represented by upper indices for time, age, group and step. A is the number of age classes for transition. 2.4 Constrained models. Model building in E-SURGE, as in M-SURGE and MARK, proceeds by imposing linear constraints on the parameters of the umbrella model, in the spirit of generalized linear models [25]. The vector of biological parameters (parameters of direct interest to the biologist, e.g. $\theta = (\pi, \phi, b)$, organized as a vector) is expressed
31. $i = 1, 2$, $\phi_i$; $(0,\ 0,\ 1)$. It will be seen that the two main effects from and to can in turn be combined to form models with other effects. The keywords d, od, ld, ud described in Table 5 correspond to specific combinations of categories of from and to. 4.3 Combining effects with operators. Two operators can be used to combine effects to generate more complex models. Let a and b be two factors with $m_a$ and $m_b$ categories respectively.
• Dot product: a.b is the product, column by column, of a by b, i.e. the set of all combinations of categories of the factors a and b, i.e. a model with interaction. The result a.b is a factor with $m_a \times m_b$ categories. This dot product is the crossing operator of [32, pp. 48-70].
• Sum: a+b joins the columns of a and b. If the intercept (constant column equal to one) is obtained as a linear combination of the variables in a and also of those in b, the first column of b is suppressed to avoid linear redundancy. The result a+b then has $m_a + m_b - 1$ columns. Otherwise all the columns of a and b are kept.
For $\theta$ given by Equation 43, one obtains for the dot and sum combinations of t and g the matrices in Table 6. The dot and sum operators have a well-known role in single-state models; for instance, the CJS model run independently by group will be denoted as $(\phi_{t.g}, p_{t.g})$. The dot operator is very useful in combination with the from and to effects when there is more than one state (s > 1): from.to applied to th
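The dot and sum operators described in this excerpt can be sketched on indicator matrices as follows. This is an illustrative reading only, not GEMACO's implementation: the helper names `indicator_columns`, `dot` and `plus` are hypothetical, the column ordering of the dot product is an assumption, and `plus` always drops the first column of b, which is only appropriate in the case considered here where both factors contain the intercept in their column space.

```python
def indicator_columns(levels, n_levels):
    """One 0/1 column per category, one row per biological parameter."""
    return [
        [1 if level == k else 0 for k in range(1, n_levels + 1)]
        for level in levels
    ]


def dot(a, b):
    """a.b: products of every column of a with every column of b."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a, row_b in zip(a, b)
    ]


def plus(a, b):
    """a+b: join the columns, dropping the first column of b.

    Assumes the intercept is a linear combination of the columns of a
    and also of those of b (true for full indicator matrices).
    """
    return [row_a + row_b[1:] for row_a, row_b in zip(a, b)]


# Six biological parameters: 3 time steps crossed with 2 groups.
time = [1, 2, 3, 1, 2, 3]
group = [1, 1, 1, 2, 2, 2]

t = indicator_columns(time, 3)   # m_a = 3 categories
g = indicator_columns(group, 2)  # m_b = 2 categories

t_dot_g = dot(t, g)    # 3 x 2 = 6 columns (interaction)
t_plus_g = plus(t, g)  # 3 + 2 - 1 = 4 columns (additive)

for row in t_plus_g:
    print(row)
```

With these inputs the interaction matrix has one column per time-by-group combination, while the additive matrix has four columns, matching the $m_a \times m_b$ and $m_a + m_b - 1$ counts given in the text.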
32. makes it possible to aggregate parameters corresponding to categories of different effects, i.e. which cannot be handled within a same list.
• Aggregation (&): The syntax a&b sums each column of the matrix corresponding to a (which can be effect1(list1)) with each column of the matrix corresponding to b (which can be effect2(list2)). If the numbers of columns are not equal, then the last columns of the effect with the largest number of columns are kept unchanged. It is particularly useful, and most commonly used, with a single column in each of the terms, aggregated to form a single new column as in the following example.
Example: f(1).to(1)&to(2), applied to the combined survival-transition for two states, builds the following constraint: $\phi_{11} = \phi_{12} = \phi_{22}$. The default priority order of operations is: < & < <. 4.8 Keyword others. Assume we are modeling data with two groups and three occasions of recapture, with two mathematical parameters defined by t(1,2).g(1), and that we want to constrain all the other biological parameters to be equal to a third mathematical parameter. This third parameter may be defined by t(3).g(1)&g(2). The overall model definition will thus be t(1,2).g(1)+t(3).g(1)&g(2). Using the keyword others makes this simpler. The model can be simply defined as t(1,2).g(1)+others. Important note: This keyword must always be used at the end of the sentence as
33. model+others. This keyword is particularly useful for multievent models when many parameters have to be fixed to a same value. In this case, one first defines the mathematical parameters of interest and then simply adds the keyword others to account for all remaining parameters. 4.9 Shortcuts. Definition of shortcuts: In order to keep model definitions as simple and readable as possible, E-SURGE makes it possible to use shortcuts. The user associates a shortcut name to an expression written with the MDL via a graphical interface (see Figure 7). A shortcut name begins with a letter, followed by any letters or digits. Ex: sex for g(1,2). A shortcut can be combined with another shortcut. Ex: sex.t for g(1,2).t. GEMACO then substitutes every occurrence of the shortcut name by the equivalent expression. The syntax for addressing shortcut levels (or of any part of a sentence) is (see section 4.7): shortcut list1 lista. Shortcuts in practice: Let us consider, for instance, data consisting of individuals marked as juveniles and as adults. Juveniles are stored in group one and adults in group two. Individuals are considered as juveniles only during their first year and thereafter become adults. We can create two shortcuts: Juv for a(1).g(1) and Ad for a(2:5).g(1)&g(2). For a model in which survival is different for juveniles and adults and is constant over time, one simply writes Ad+Juv.
34. models conditional on the first occasion. But a trick can be used to get round this problem for occupancy models. To do that, add a one to each observation of the data set so that each first non-zero event occurs at the first occasion. Then select the option Markovian states only > Conditional on 1st Capture. In this way we can fit all models with individual covariates and random effects described respectively in sections 2.9 and 2.10. Open and closed population: The conditional probability of history h, denoted $P_c(h)$, is defined by $P_c(h) = \dfrac{P(h)}{1 - P(0, \theta)}$ (18), with $P(h)$ the unconditional probability of h, a product running over occasions $k = 2, \dots, K$, and where $P(0, \theta)$ is the probability of an individual to remain unseen at all occasions, i.e. the probability of a history to be empty. Since the time spent since the first capture is not defined, we set A = 1 and the age indices vanish in the previous formulae. The likelihood is $L(\theta) = \prod_h P_c(h)$ (19). As a by-product, the estimated number of individuals present or passing through the site is the number of individuals observed divided by $1 - P(0, \hat\theta)$. In the particular case of a closed population model, it is assumed that no new individuals enter the area, that no individuals leave the area, and that there is no mortality. So the survival parameters have to be fixed to one in the model. For violation of closed population model assumptions and ways to manage them, see [52]. In the case of an open population, we can for example model the SOD when the
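The conditioning in equation (18) and the by-product abundance estimate can be sketched for the simplest closed-population case. This is a toy illustration under stated assumptions, not E-SURGE's computation: survival is fixed to one, a single hypothetical capture probability `p` is constant over `K` occasions, and the abundance estimate is taken to be the number of observed individuals divided by one minus the probability of an empty history. All names and numbers are made up.

```python
from itertools import product


def history_prob(history, p):
    """Unconditional probability P(h) of a 0/1 capture history.

    Closed population: survival is one, so P(h) is just a product of
    capture (p) and non-capture (1 - p) terms.
    """
    prob = 1.0
    for seen in history:
        prob *= p if seen else (1.0 - p)
    return prob


K = 3    # number of occasions (hypothetical)
p = 0.5  # constant capture probability (hypothetical)

# Probability of the empty history, i.e. never being seen.
p_empty = history_prob((0,) * K, p)


def conditional_prob(history):
    """P_c(h) = P(h) / (1 - P(empty history))."""
    return history_prob(history, p) / (1.0 - p_empty)


# The conditional probabilities of all non-empty histories sum to one.
non_empty = [h for h in product((0, 1), repeat=K) if any(h)]
total = sum(conditional_prob(h) for h in non_empty)

# By-product: estimated number of individuals present.
n_observed = 70
n_hat = n_observed / (1.0 - p_empty)

print(p_empty, total, n_hat)
```

Dividing by one minus the empty-history probability is what makes the observable histories a proper probability distribution; the same factor scales the observed count up to an estimate that includes the individuals never seen.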
35. new model will take its place in this window. A warning occurs saying that estimates are on a saddle point. To avoid this problem, run the model again after fixing the last three capture rates to 1 and decreasing tolerances to $10^{-8}$. The user is then prompted to validate the estimated rank and to name the model. This done, the new model takes its place in the list of previous models maintained in the Output area of the main window (see Figure 31). After each model fit, the results are also saved automatically in two files named namemodel.out and namemodel.xls in the working directory. The mouse cursor flickers while the Excel file is being saved. By default, the model name is model followed by a number. The text file can be displayed by selecting the model in the Output area of the main window (Figure 17) and then clicking the button View file of results. The corresponding file, with generic name *.out, is opened by the editor. The Excel file can be opened by Excel, and it can be automatically laid out by clicking on Update output Excel.
36. of Figure 31), the number of singular values of the derivative matrix (see [4]) below the indicated threshold (here 9.2387e-006), the number of additional singular values below a less selective threshold, and the indices of the potentially redundant mathematical parameters (here there are none). file shape in the Run and see menu. We recommend you associate files with the suffix .out to the editor of your choice. Similarly, the Hessian and the estimated variance-covariance matrices are saved in the Excel file and in a temporary file named Hessian.tmp and may be retrieved using an editor. Any model can be retrieved by selecting a model in the Output window part and by clicking the RETRIEVE MODEL button. The model selection can also be exported to Excel using the EXPORT TO EXCEL button. 7 Advanced tools for numerical issues. Critical issues, particularly for multistate models, are the risk of numerical convergence to a local rather than the global minimum, and problems of parameter redundancy. Several advanced tools have been made available in E-SURGE to address these issues. The diagnostic tool for parameter redundancy is available for any Markovian model, but only for fixed-effect models without an individual covariate. 7.1 Initial values. The default constant initial values may lead to a local minimum of the deviance. To reach the absolute minimum, the initial values should ideally be chosen near the unknown MLEs. In the absence of clues
37. one state corresponds to a single type of event at recapture, then we can easily demonstrate that the m-array (for a time-dependent model) or extended m-array (for an age-dependent model) [9] is a set of sufficient statistics for the recapture part of the model. 2.9 Individual covariates. E-SURGE can handle individual covariates. Considering the general form of GLM, $f(\theta) = X\beta$ with $\beta$ the vector of fixed effects, is computationally demanding because of the dimension of the problem with so many potential effects. Thus we have implemented the following restricted form of GLM by constraining separately the two following sets of effects:
• Set of effects 1: time, age, cohort and group effects.
• Set of effects 2: individual effect.
The result is the following form of GLM implemented in E-SURGE: $f(\theta_n) = X_0\beta_0 + X_n\beta_1$ (11), with $X_n$ individual-specific matrices of individual covariates (they are never stored in the computer because of the memory size needed; rather, they are computed each time). 2.10 Independent and identically distributed (i.i.d.) random effect. The class of mixed-effects models that E-SURGE may consider can be expressed in the form of generalized linear mixed models (GLMM). Considering the general form of GLMM, $f(\theta) = X\beta + Zb$ with $\beta$ the vector of fixed effects and $b$ the vector of random effects, we have implemented the following restricted form of GLMM by constraining separately the sets of effects 1 and 2: $f(\theta_n) = X_0\beta_0 + X_n\beta_1$
38. probability to leave the area is time-dependent only [46]. Occupancy models: In occupancy models [30], we follow patches over time. As we know where patches are, the capture rate is equal to one. Thus the observation 0 no longer has to be called "not seen". Rather, the observation 0 gives information about the state of patches. In that context, an empty history is the absurd situation where a known patch is never visited. The mathematical consequence is that $P(0, \theta) = 0$. As E-SURGE allows you to deal with imperfect detection, at least all models described in [31] can be fitted. Note: E-SURGE does not currently manage empty events (patches which are not visited at some occasions). However, if empty events are distributed randomly, then an additional parameter can be used to take into account this lack of information. 2.12 Non-linear model. E-SURGE is now able to fit a continuous function, for example one associated to a hazard function. To that purpose, a semi-Markov formulation of the CJS model has been developed in [11]:
• to consider continuous functions which are parametric in relation to the age of the individual;
• to deal with left censoring (defined in the headed format, 5.3), to allow individuals to start at different ages.
As an original contribution of regular functions with well-defined second derivative, we can estimate the onset of senescence using a geometrical property [11]. To handle such a model, choose the opti
39. save memory and reduce computation time. For transitions or survival, common choices for the maximum relevant age are: 1. A = 1, which implies no age effect, and 2. A = 2, which creates a model in which the first age class is contrasted to older animals (this is particularly useful when animals are marked as young). Setting A = 2 can also be used to treat transience [11]. Specifying age dependence in encounters is slightly more complicated. In multistate (as opposed to multievent) applications, all calculations are conditional on the first encounter, and hence the probability of that first encounter is not estimated. In multievent formulations, the first encounter may be an event rather than a state, and thus E-SURGE has the option of modelling the probability of the initial event (see Figure 2). Therefore E-SURGE always considers at least two age classes for encounters, allowing the first event probability (first class of age) to be modelled or not. Thus, if one chooses a maximum age A = 1, which implies no age effect, E-SURGE creates 2 classes, for first and next encounter. If one sets A = 2, E-SURGE creates 3 classes with age for events. In E-SURGE, parameters are ordered in memory according to their type as follows, with the leftmost indices varying first and the rightmost last: $\pi$: current state > time > group > step; $\phi$: previous state > next state > time > age > group > step; $b$: previous state > current event > time
40. that of $\hat\beta$ as $X\,\widehat{\mathrm{var}}(\hat\beta)\,X^{\top}$. Standard errors of $x = \mathrm{logit}^{-1}(X\hat\beta)$ are computed by the delta method. 95% CIs are obtained by back-transforming the endpoints of the 95% CI of $X\hat\beta$. As a consequence, the confidence interval of the parameter $x$ is $[\mathrm{logit}^{-1}(\hat\theta - 1.96\sqrt{\hat v}),\ \mathrm{logit}^{-1}(\hat\theta + 1.96\sqrt{\hat v})]$, with $\hat\theta = X\hat\beta$ and $\hat v$ its estimated variance. For parameters obtained using the generalized logit link function, the delta method is first applied to $\theta = \mathrm{logit}(\mathrm{logitgen}^{-1}(X\beta))$. Then we proceed as above to obtain standard errors and confidence intervals for $x = \mathrm{logit}^{-1}(\theta)$. 10 A few warnings. The estimates and other results provided by E-SURGE are obtained via a complex numerical analysis procedure. Users must be aware of various complications that may arise; finding improvements and solutions is an active area of research. 10.1 Local minima. The optimization procedure used by E-SURGE gives a minimum of the deviance function, and not necessarily the global minimum. Repeating the minimization with different initial values is currently one of the only practical solutions to this problem (see section 7.1). Another method is to use initial values computed from the MLEs of a simpler model, e.g. to use the MLEs of a time-constant model as initial values for the optimization of a time-dependent model. This approach is available in E-SURGE by first running the simpler model and then selecting the start from last model option in the Advanced Numerical Options before running a finer model. This allow
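The standard error and back-transformed confidence interval described in this excerpt for the logit link can be sketched as follows. This is a minimal illustration, assuming a single parameter with estimate `eta_hat` on the logit scale and estimated variance `var_eta` on that scale; both names and the helper functions are hypothetical, not E-SURGE code.

```python
import math


def inv_logit(x):
    """Inverse logit, mapping the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def delta_method_se(eta_hat, var_eta):
    """Standard error of inv_logit(eta_hat) by the delta method.

    The derivative of inv_logit at eta is p * (1 - p), so the standard
    error on the probability scale is p * (1 - p) * sqrt(var_eta).
    """
    p = inv_logit(eta_hat)
    return p * (1.0 - p) * math.sqrt(var_eta)


def back_transformed_ci(eta_hat, var_eta, z=1.96):
    """95% CI obtained by back-transforming the logit-scale endpoints."""
    half_width = z * math.sqrt(var_eta)
    return inv_logit(eta_hat - half_width), inv_logit(eta_hat + half_width)


# Hypothetical logit-scale estimate and variance.
eta_hat, var_eta = 0.0, 1.0
se = delta_method_se(eta_hat, var_eta)
lower, upper = back_transformed_ci(eta_hat, var_eta)
print(inv_logit(eta_hat), se, lower, upper)
```

Back-transforming the endpoints, rather than adding and subtracting 1.96 standard errors on the probability scale, keeps the interval inside (0, 1) and makes it asymmetric around the estimate whenever the estimate is not 0.5.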
41. the same model. Sometimes this is not enough and the model has difficulty achieving convergence, i.e. estimates are close to the MLEs but numerical difficulties slow down the convergence. Fixing some parameters is another possibility and is very efficient, for example fixing the last capture probabilities to one for the CAS model, or fixing to zero the capture probabilities at occasions when there is no capture. 9.5 Deviance, AIC and related topics (L35-L44 and L431-L502). First, the time needed to obtain the parameter estimates and the time needed to calculate the Hessian are respectively given in L90 and L91, together with the number of iterations (L93). Lines 75 and 76 give the deviance and the Akaike information criterion amended for overdispersion (QAIC): $\mathrm{QAIC} = \mathrm{dev}/\hat c + 2 \times \mathrm{rank}$. L84 gives c-hat, provided by the user according to the results of GOF tests (default is 1), and L79 gives an estimate of the rank of the model, conditional on the data. By default in E-SURGE, the rank is the maximum of the numerical rank of the derivatives matrix [43] of the summary statistics, calculated at several neighbors $\mu$ of the MLEs (section 7.2). The algorithm to compute the rank is summarized below:
1. Choose a point $\mu$ near the MLEs.
2. Compute D, the derivatives matrix at $\mu$.
3. Normalize D by G.
4. Compute U, V orthogonal matrices and $\Sigma$ diagonal matrix such that $U\,D\,G\,V = \Sigma$.
5. Estimate rank($\mu$) as the number of diagonal entries of $\Sigma$ above a threshold that depends on m, where m is the number of columns of D.
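The QAIC formula and the last step of the rank algorithm above can be sketched as follows. This is a hedged illustration only: `qaic` uses the usual overdispersion-corrected form (deviance divided by c-hat, plus twice the rank), and `numerical_rank` simply counts singular values above a caller-supplied threshold, because the exact threshold rule used by E-SURGE is not recoverable from this excerpt. The function names and the example values are hypothetical.

```python
def qaic(deviance, rank, c_hat=1.0):
    """QAIC = deviance / c_hat + 2 * rank (c_hat defaults to 1)."""
    return deviance / c_hat + 2 * rank


def numerical_rank(singular_values, threshold):
    """Count the singular values strictly above the threshold.

    The threshold is an assumption supplied by the caller; it stands in
    for the rule applied to the diagonal matrix of step 4.
    """
    return sum(1 for s in singular_values if s > threshold)


def rank_over_neighbours(singular_values_per_point, threshold):
    """Maximum numerical rank over several points near the MLEs."""
    return max(
        numerical_rank(values, threshold)
        for values in singular_values_per_point
    )


# Hypothetical singular values at three points near the MLEs.
points = [
    [3.0, 1.0, 1e-9],
    [2.5, 0.8, 2e-3],
    [3.1, 1e-8, 1e-9],
]
rank = rank_over_neighbours(points, threshold=1e-6)
print(rank, qaic(deviance=100.0, rank=rank, c_hat=2.0))
```

Taking the maximum over several neighbouring points guards against evaluating the derivatives matrix at a point where it happens to lose rank, at the cost of one singular value decomposition per point.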
42. update now. Replace the empty pattern matrix by Equation 37; the result should be the same as the one displayed in Figure 24. Last, select Event in the menu Parameters. Change the pattern matrix to Equation 39; the result is given in Figure 25. Figure 22: Matrix pattern in GEPAT for survival corresponding to Equation 35. Figure 23: Matrix pattern in GEPAT for fidelity corresponding to Equation 36. Figure 24: Matrix pattern in GEPAT corresponding to Equation 37. Figure 25: Matrix pattern in GEPAT for encounter corresponding to Equation 39. The ge
43. 0 quasi Newton available for the option Markovian states only > Conditional on 1st Capture. 7.2 A numerical approach for redundancy. Another crucial point is parameter redundancy. Advanced users may use the formal methods of [4], in their version adapted to multistate models [18], for studying parameter redundancy. The key advantage of this method is that estimable functions of the redundant parameters are explicitly identified. This enables the user to fix the values of some redundant parameters to render the model full rank and, above all, to interpret the values of the estimable parameters confidently. When this method cannot be applied, e.g. for complex models, redundancy can be examined by looking at estimated standard errors (see [20]). In E-SURGE, a parameter not at a boundary but with a very large or null standard error is in general redundant. However, this approach is often unreliable. The numerical version of the CMF approach is more reliable and has been implemented in E-SURGE. It was used as a tool in [43] to demonstrate that some memory models are full rank. This approach [6] considers the properties of the numerical derivative matrix rather than those of the formal derivative matrix. The local rank, conditional on data, is estimated as the number of non-zero singular values, and the redundant mathematical parameters are identified. The only limitation is that estimable functions of t
44. Figure 27: Window structure of the GEMACO interface. As shown here, the initial state part of the CAS model has just been built. The notation current t means that the initial states vary by current site (current) and time step (t). The corresponding design matrix, automatically created by GEMACO, has popped up in the Constraint matrix area. Constraints on each type of parameters (initial states, transitions (survival, fidelity, settlement) and encounter) are defined in turn, independently of each other. Select each parameter type from the Parameters menu or by clicking repeatedly on the top button with the name of the currently active type (e.g. initial state in Figure 27). For the CAS model with site fidelity parametrization that we intend to fit to the goose data set, proceed step by step as follows:
• First select initial state, enter the string current t in the Model definition area and validate it by
45. 1993. Birkhauser.
R. Pradel. Multievent: An extension of multistate capture-recapture models to uncertain states. Biometrics, 61:442-447, 2005.
R. Pradel. The stakes of capture-recapture models with state uncertainty. 2009.
R. Pradel and J. D. Lebreton. User's manual for program SURGE, version 4.1. Technical report, CEFE/CNRS, Montpellier, France, 1991.
R. Pradel, L. Maurin-Bernier, O. Gimenez, M. Genovart, R. Choquet, and D. Oro. Estimation of sex-specific survival with uncertainty in sex assessment. The Canadian Journal of Statistics, 36(1):29-42, 2008.
Roger Pradel, Alan R. Johnson, Anne Viallefont, Ruedi G. Nager, and Frank Cezilly. Local recruitment in the Greater Flamingo: new approach using capture-mark-recapture data. Ecology, 78:1431-1445, 1997.
L. Rouan. Apports des chaînes de Markov cachées à l'analyse de données de capture-recapture. PhD thesis, 2007.
L. Rouan, R. Choquet, and R. Pradel. A general framework for modeling memory in capture-recapture data. Journal of Agricultural, Biological and Environmental Statistics, 14(3):338-355, 2009.
L. Rouan, J. M. Gaillard, Y. Guédon, and R. Pradel. Estimation of lifetime reproductive success when reproductive status cannot always be assessed. In David L. Thomson, Evan G. Cooch, and Michael J. Conroy, editors, Modeling Demographic Processes in Marked Populations, volume 3 of Springer Series: Environme
46. 2 for examples). Call GEMACO by clicking on the button call Gemaco. If the model is correctly specified, the constraint matrix appears in the top left corner of the window, in the Constraint matrix area.
• Optional: Create user-defined models. Save the matrix displayed in the Constraint matrix area (toolbar Input/Output for constraint matrix); then you can change it using an external editor such as WordPad, TextPad or vim, so that the new design matrix represents the model you would like to fit. Finally, you can save this new matrix. You can then load it in M-SURGE using toolbar Input/Output for constraint matrix.
• Optional: Build all the model together from the menu Gemaco > Call Gemaco (all phrases).
6.5 Changing Initial Values or Fixing Values of parameters (IVFV). The IVFV interface serves to help in reaching convergence of a model by setting initial values to replace default values, or to set one or more parameters at a pre-determined value (fixed values). To do that, click on the button IVFV (initial values/fixed values) in the Compute a model area of the main window (see Figure 17). A new window will appear, similar to that of Figure 28. By default, each parameter has been assigned an initial value equal to 0.5. Select the encounter part to fix the first encounter parameter to 1. In our example, parameter 1 appears in Figure 28. Next to Beta 1 on the left is
47. [Figure 32 window content, repeated for each point: "69 singular values bigger than 9.2387e-006. May be 0 more parameters are non estimables. 9 quantities, solutions of 3 partial derivatives equations made of redundant parameters (indices below), are estimables", followed by the indices 25 26 27 40 41 42 55 56 57 70 71 72; the last block reports "May be 2 more parameters are non estimables".] Figure 32: Output of E-SURGE II: Result of the potentially redundant parameters. The CAS model without the last captures fixed to 1 is known to be redundant, with a fall of rank equal to the number of states. However, with sparse data this may be worse. Here the formal result is verified. The above temporary window displays, for each of the 5 points near the MLE (see text and caption
48. 837606
Par#   Type  F  To  T  A  G  S   Estimate      Lower         Upper         S.E.
53     M     4  4   5  1  1  1   1.000000000   1.000000000   1.000000000   0.000000000
54     F     1  1   1  1  1  2   0.778638230   0.720419147   0.827634547   0.027367239
55     F     1  2   1  1  1  2   0.221361770   0.172365453   0.279580853   0.027367239
87     F     3  6   5  1  1  2   0.264761908   0.225381508   0.308284698   0.021175829
88     F     4  7   5  1  1  2   1.000000000   1.000000000   1.000000000   0.000000000
89     T     1  1   1  1  1  3   1.000000000   1.000000000   1.000000000   0.000000000
90     T     4  1   1  1  1  3   0.887027523   0.772978810   0.947661013   0.042718142
137    T     5  3   5  1  1  3   1.000000000   1.000000000   1.000000000   0.000000000
138    T     7  4   5  1  1  3   1.000000000   1.000000000   1.000000000   0.000000000
139    E     1  1   1  1  1  1   0.000000000   0.000000000   0.000000000   0.000000000
140    E     2  1   1  1  1  1   0.000000000   0.000000000   0.000000000   0.000000000
214    E     2  3   6  2  1  1   0.707113893   0.670609870   0.741134050   0.018012035
215    E     3  4   6  2  1  1   0.385297408   0.339917748   0.432763412   0.023749312
Beta#                            Estimate      Lower         Upper         S.E.
1                                0.231137158   0.125789960   0.336484355   0.053748570
2                                1.208456999   1.118874734   1.298039263   0.045705237
11                              -0.539523790  -0.735544318  -0.343503262   0.100010473
12                               0.261512495   0.103114152   0.419910838   0.080815481
############
49. 91715302 0.774618492 0.021175829
Par#   Type  F  To  T  A  G  S   Estimate      Lower         Upper         S.E.
90     T     4  1   1  1  1  3   0.887027523   0.772978810   0.947661013   0.042718142
91     T     6  1   1  1  1  3   0.263997269   0.178922130   0.371234867   0.049405648
131    T     6  1   5  1  1  3   0.122292640   0.083089655   0.176433064   0.023559032
132    T     2  2   5  1  1  3   0.974046985   0.909740540   0.992895327   0.016956592
143    E     1  2   1  1  1  1   1.000000000   1.000000000   1.000000000   0.000000000
185    E     1  2   2  2  1  1   0.618845204   0.548023405   0.684949372   0.035136109
214    E     2  3   6  2  1  1   0.707113893   0.670609870   0.741134050   0.018012035
215    E     3  4   6  2  1  1   0.385297408   0.339917748   0.432763412   0.023749312
Parameters | Index (F, To, T, A, G, S) | Estimates | 95 percent CI (Lower, Upper) | S.E.
1      IS    1  1   1  1  1  1   0.224670873   0.211123372   0.238824520   0.007066957
2      IS    1  2   1  1  1  1   0.597023446   0.580654624   0.613178976   0.008299832
17     IS    1  2   6  1  1  1   0.450704260   0.416098003   0.485794108   0.017808202
18     IS    1  3   6  1  1  1   0.346991015   0.314411399   0.381070122   0.017026884
19     M     1  1   1  1  1  1   0.631613156   0.579582626   0.680751334   0.025889372
20     M     2  2   1  1  1  1   0.742891360   0.703481678   0.778710814   0.019209356
52     M     3  4   5  1  1  1   0.376601065   0.325621913   0.430466905   0.026
50. E SR E DE es First capture n Multievent ss css gaga nus E da aaa Ea GS DER a a ed Conditionality OBUDAS osos e a cb ee Rae RR A Semi Markoy OPHOMS a x sis wa ds he seda hee a EE E O O a Decomposition of the conditional event probabilities o o o Diagram describing the AS site fidelity model oo a Shortcut creation Interface i o ce occa ca nad sad a aa a E E E a Set equality between mathematical parameters LL Set equality between mathematical parameters LL Menu for the choice of the survival After exiting GEMACO the user must select a continuous function thank to a new menu Several hazard functions are available as well as a non parametric function i e full age dependant survival and a geometrical distribution i e Constant AD soc oa AAA aia E AA AE TT 84 89 24 25 26 28 29 36 39 40 41 44 44 46 63 LIST OF FIGURES 5 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 Menu for setting the initial values of the continuous function 38 HEADED format Covariate selection e ee 43 HEADED format Cluster automatic creation LL 43 Menu tor getting steps for UTI oe sak he gera separe DE DE wee E ado 45 Set thesteps for ML ssa aw BE Bw Re oe RS O ESA Be A PO EE 45 General organization of E SURGE cccccccccc e a 47 E SURGE Mali SCLGEM es sc ro e Se we aa Re eee we ew 48 Opening a new Se
51. E-SURGE 1.8 user's manual. MultiEvent Survival Generalized Estimation: a program for fitting Multievent models. CHOQUET Rémi, NOGUE Erika. Centre d'Ecologie Fonctionnelle & Evolutive, Biostatistics and Population Biology group, CEFE UMR 5175, 1919 Route de Mende, 34293 Montpellier, France. September 2011.

This manual is to be cited as: CHOQUET R., NOGUE E. 2011. E-SURGE 1.8 user's manual. CEFE, UMR 5175, Montpellier, France. http://ftp.cefe.cnrs.fr/biom/soft-cr/

A regular paper about E-SURGE is: R. Choquet, L. Rouan, and R. Pradel (2009). Program E-SURGE: a software application for fitting Multievent models. In David L. Thomson, Evan G. Cooch, and Michael J. Conroy, editors, Modeling Demographic Processes in Marked Populations, volume 3 of Environmental and Ecological Statistics, pages 845-865. Springer.

Made with LaTeX.

Contents: List of Tables. List of Figures. 1 Introduction. 2 Models: 2.1 [title illegible]; 2.2 Multievent models; 2.3 Umbrella models (Decomposition in Elementary Steps; Age dependence); 2.4 Constrained models; 2.5 [title illegible]; 2.6 Unequal time intervals; 2.7 Maximum likelihood estimation; 2.8 Factorizing the likelihood; 2.9 Indi
52. HHHEHHHHEHHHEHHHRHEHA HHH ARA RO RARA HEH EEE EEE EEE Beta 13 0 539143521 0 321060268 0 757226773 0 111266966 Beta 14 1 061051094 0 863932820 1 258169368 0 100570548 Beta 56 1 970896378 2 401089488 1 540703268 0 219486281 Beta 57 3 625171738 2 310471023 4 939872454 0 670765671 HHEHHHHHHHHHHHHHHHHHHHHHEHEHRHEHAHHHH RHEE HHEH RHEE RARA RHEE RRR HHH Beta 58 0 484649587 0 192687597 0 776611578 0 148960199 Beta 59 0 229668409 0 367699756 0 091637062 0 070424157 Beta 71 0 881407926 0 710944699 1 051871152 0 086971034 Beta 72 0 467123038 0 663660781 0 270585295 0 100274359 HHEHHHHHHHHHHHHEHHHHEHHHEHHHRHEHA HEHEHE HHEHHRH HEE RARA EH EEE 1 94745573 1 70465435 498 2259 51584662 499 2296 98055316 83 84 BIBLIOGRAPHY Bibliography 1 C Brownie J E Hines J D Nichols K H Pollock and J B Hestbeck Capture recapture studies for multiple strata including non markovian transitions Biometrics 49 1173 1187 1993 2 Olivier Cappe Eric Moulines and Tobias Ryden Inference in Hidden Markov models Springer series in Statistics Springer 2005 3 H Caswell Matrix population models Sinauer Associates 2nd edition 2001 4 E A Catchpole and B J T Morgan Detecting parameter redundancy Biometrika 84 1 187 196 1997 5 R Choquet Automatic generation of multistate capture recapture models The Canadian Journal of Statistics 36 1 43 57 2008 6 R Choquet an
53. (L11-L21) Lines 10 to 68 provide information about the fitted model. The name of the model is given on L11. Lines 25 to 68 give, for each step, the matrix pattern defined in GEPAT and the Model Definition Language phrase used in GEMACO to build the constrained matrix. The initial value of each beta is given in line 18 on the real axis, i.e. after logit or generalized logit transformation. The indices of the betas that were fixed are given in line 20, and the corresponding fixed values on the scale [0, 1] appear in line 21. Results are later given only for the free beta parameters, from L350 to L423.

9.4 Minimization (L23-L26 and L355-L426)

Lines 69 to 73 give the advanced numerical options used by the unconstrained nonlinear minimization algorithm. When the generalized logit is used as the link function, E-SURGE uses a Quasi-Newton algorithm [14], an Expectation-Maximization (EM) algorithm [12], or a hybrid algorithm (20 iterations of EM followed by a Quasi-Newton algorithm). The user can choose one of these nonlinear solvers in E-SURGE. If it is available for the model under consideration, the hybrid algorithm is recommended. A constrained nonlinear algorithm is used with the identity link. Line 69 states which link function has been used (logitgen in our case). Line 70 mentions that the gradient of the deviance has been calculated numerically: a centered finite difference scheme applied to the deviance is used to compute the gradient. O
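To make the centered finite difference scheme concrete, here is a minimal sketch of a numerical gradient of a deviance. It is an illustration under stated assumptions, not E-SURGE's code: the step size h, the function names and the toy binomial deviance (30 detections out of 100, logit link) are all hypothetical.

```python
import math


def centered_gradient(deviance, beta, h=1e-5):
    """Approximate d(deviance)/d(beta_i) by (f(beta + h e_i) - f(beta - h e_i)) / (2h)."""
    gradient = []
    for i in range(len(beta)):
        forward = list(beta)
        backward = list(beta)
        forward[i] += h
        backward[i] -= h
        gradient.append((deviance(forward) - deviance(backward)) / (2.0 * h))
    return gradient


def toy_deviance(beta):
    """Hypothetical deviance: -2 log-likelihood of 30 successes in 100 trials."""
    p = 1.0 / (1.0 + math.exp(-beta[0]))  # inverse logit
    return -2.0 * (30.0 * math.log(p) + 70.0 * math.log(1.0 - p))


gradient_at_zero = centered_gradient(toy_deviance, [0.0])
gradient_at_mle = centered_gradient(toy_deviance, [math.log(0.3 / 0.7)])
```

For this toy deviance the exact derivative is 200p - 60, so the gradient is 40 at beta = 0 and 0 at the maximum likelihood estimate. Each gradient costs two deviance evaluations per free parameter.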
54. MLEs. Getting the correct value for the rank of the model is critical for model selection, and we implemented a very precise algorithm using a numerical version of the Catchpole-Morgan-Freeman approach [43, 6] (see also section 7.2). E-SURGE suggests a model rank by default, which you can modify. Here the rank is estimated correctly at 69 (see Figure 30).

After convergence, i.e. minimization of the deviance, the Hessian matrix can optionally be calculated by ticking the "compute Hessian" option in area 2 and running again. The Hessian is the matrix of second-order derivatives of the log-likelihood and is used to approximate the variance-covariance matrix of the estimated parameters. It is computed by a finite difference scheme using either:
• the deviance (by default), when the finite difference gradient is used for optimization;
• the analytical gradient, when the analytical gradient is used for optimization. This option, very efficient in M-SURGE, is not presently recommended in E-SURGE because of the additional cost needed to compute the analytical gradient for Multievent models.

The variance-covariance matrix is approximated by the generalized inverse of the Hessian matrix. The rank of the model, conditional on the data, can also be estimated from the numerical rank of the Hessian matrix [49], but this estimate is generally less precise than the default one.
55. [Figure 20: Change the number of age classes from 5 to 1.]

Choosing an appropriate number of age classes leads to a faster algorithm and is thus recommended. For a time-dependent model, select 1 age class in Figure 20. All result files will be saved in the directory of the current session file.

6.3 Building the pattern matrices for the CAS model with site fidelity parametrization using GEPAT

The next step is to specify in E-SURGE the patterns of the matrices PII, POD, P, PC, PB defined respectively in Equations 34, 35, 36, 37 and 39. When the button GEPAT in the "Compute a model" area of the main window (Figure 17) is pressed, or alternatively when GEPAT is selected in the Setting menu, the GEPAT window opens (Figure 21). This window has a toolbar with three menus (Parameters, Input/Output for patterns, Pre-defined Patterns), four areas (Sentence for pattern, Matrix pattern, Options, Automatic Patterns) and three edit boxes related to the number of steps (1), the current step (1) and its label (IS). The matrix pattern PII is by default already defined as in Equation 34. Please note that this pattern often has to be modified. For the four other matrix patterns, proceed s
56. For example, in [21] a new parameterization of CAS models is presented in which movement among sites is described in terms of the probability of leaving the site of origin and the probability of settling in the destination site, conditional on leaving. This parameterization is then used to address the influence of local perturbations on site fidelity and settlement decisions of emigrants in a subdivided population of Black-headed Gulls (Larus ridibundus). This two-step process can be expressed as a product of three elementary probability matrices: the first describing survival, the second describing the probability of emigration conditional on survival (i.e. fidelity to the site of origin), and the third describing the probability of the destination site conditional on emigration. Such decompositions can be used to model any multi-step process. For example, the memory model can be defined this way in either separate or combined formulation [10]. In E-SURGE, the structures of the elementary matrices are defined by a tool called GEPAT (for GEnerator of PATtern, see chapter 3).

Age dependence

In the UM, survival, transitions and encounters may depend on age (i.e. time since first capture, not necessarily true chronological age). The user specifies an oldest relevant age class; all animals of this age or older are combined into a single age class. While restricting the range of ages restricts the range of models that can be fitted, it may greatly
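The decomposition into three elementary matrices (survival, fidelity given survival, destination given emigration) can be sketched numerically for three sites plus a dead state. This is a hedged illustration only: the numerical values, the function names and the state ordering of the intermediate step (staying in 1, leaving 1, staying in 2, leaving 2, staying in 3, leaving 3, dead) are assumptions chosen for the example.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def survival_matrix(s):
    """(n+1) x (n+1): site i survives in place with s[i], else moves to 'dead'."""
    n = len(s)
    rows = [[s[i] if j == i else (1.0 - s[i] if j == n else 0.0)
             for j in range(n + 1)] for i in range(n)]
    return rows + [[0.0] * n + [1.0]]


def fidelity_matrix(f):
    """(n+1) x (2n+1): site i goes to 'staying in i' (f[i]) or 'leaving i' (1 - f[i])."""
    n = len(f)
    rows = []
    for i in range(n):
        row = [0.0] * (2 * n + 1)
        row[2 * i], row[2 * i + 1] = f[i], 1.0 - f[i]
        rows.append(row)
    return rows + [[0.0] * (2 * n) + [1.0]]


def destination_matrix(d):
    """(2n+1) x (n+1): stayers remain in i; leavers from i settle in j with d[i][j]."""
    n = len(d)
    rows = []
    for i in range(n):
        rows.append([1.0 if j == i else 0.0 for j in range(n + 1)])
        rows.append(list(d[i]) + [0.0])
    return rows + [[0.0] * n + [1.0]]


# Hypothetical values for three sites.
S = survival_matrix([0.8, 0.7, 0.6])
F = fidelity_matrix([0.9, 0.8, 0.7])
D = destination_matrix([[0.0, 0.5, 0.5], [0.4, 0.0, 0.6], [0.3, 0.7, 0.0]])
full_transition = matmul(matmul(S, F), D)  # 4 x 4, row-stochastic
```

With these values, the probability of being in site 1 at the next occasion given site 1 now is 0.8 x 0.9 = 0.72, and the probability of moving from site 1 to site 2 is 0.8 x 0.1 x 0.5 = 0.04. Every row of the product sums to one because each elementary matrix is row-stochastic.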
57. ans 0.1284 0.1664 0.1746 0.1425 0.2717 0.1931 0.1562 0.2480 0.5732 (full details saved in file counting.txt)

From this summary, an estimate of the LRS is given by the formula 0.9386 × 1 + 1.2949 × 2 + 1.5392 × 3 = 8.1460. All the details by history are saved in the file counting.txt.

Important notes:
• These two algorithms are available only for fixed effects, not for individual effects.
• More elaborate sentences can be used to select histories of interest. For example, the sentence find his 1 > 2 selects all histories of the first cohort.

[Figure 36: Menu for the Viterbi algorithm, with the fields "Which histories", "Which group" and "Number of most probable sequence".]

[Figure 37: Menu for the Counting algorithm, with the fields "Which histories", "Which group" and "State numbers".]

8.4 Tests for environmental covariates in presence of a random effect

To test for environmental covariates in the presence of a time random effect without fitting the random effect, there are two options [26]:
• using a permutation test;
• using an ANODEV F-test, or its t-test version when a one-sided test is suitable.
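As a worked check of the LRS estimate given in section 8.3, the sketch below weights the mean number of occasions spent in each breeding state by the number of fawns attached to that state (state 2: 1 fawn, state 3: 2 fawns, state 4: 3 fawns). The function name is hypothetical; the means are those printed by the counting algorithm.

```python
def expected_lrs(mean_occasions, offspring_per_state):
    """Weighted sum: mean occasions spent in a state times offspring per occasion."""
    return sum(m * k for m, k in zip(mean_occasions, offspring_per_state))


mean_occasions = [0.9386, 1.2949, 1.5392]  # states 2, 3 and 4
fawns = [1, 2, 3]
lrs = expected_lrs(mean_occasions, fawns)
```

The three terms are 0.9386, 2.5898 and 4.6176, which add up to the value 8.1460 quoted in the text.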
58. actorisation of the likelihood can be done for this model see section 2 8 permitting probablities 7 to be estimated separately from probabilities q and p 3 3 The separate formulation for the Arnason Schwarz model In this formulation survival and movement conditional on survival are separated this is a typical case of DES 1 2 1 The two set of states remain constant across the two steps life processes EO site 1 site 2 4 E site 1 site X t and the set of events 2 is the same as in the combined model The initial state matrix and its corre sponding pattern matrix are m 7 1 7 PI 1 x 28 There are now two elementary transition matrices one corresponding to survival and one to transitions noted 4 conditional on survival each of which has its own pattern matrix sp 0 1 s s er o sk 1 sk P s x 29 0 0 1 vi 1 4 0 box pE 1 yk pk 0 Pe Y 30 0 0 1 a There is one elementary detection matrices constant at the first capture and time varying to recapture with a pattern matrix corresponding to both 010 Bl 901 31 100 l p pi 0 p Bee I pf O pi PB x p 32 1 0 0 Note that as in the combined CAS model probabilities 7 can be estimated separately from proba bilities Y and p 3 4 A VERSION OF THE ARNASON SCHWARZ MODEL WITH SITE FIDELITY PARAMETRIZATION 19 3 4 A version of the Arnason Schwa
59. al variation between groups Forces rows in elementary matrices to differ Forces columns in elementary matrices to differ Constant diagonal terms in a s x s matrix of parameters Constant off diagonal terms in a s x s matrix of param eters Constant terms in each upper diagonal of a s x s matrix Constant terms in each lower diagonal of a s x s matrix Constant first encounter i e a 1 Age independent next encounter i e a 2 A 1 See section 4 4 See section 4 5 See section 4 11 4 3 COMBINING EFFECTS WITH OPERATORS 29 5 and Y NNNNNNNNNNR a A RRA NNNNNDNRA RIA HP AAN OD a Pe eee os So 0 Sit ooo Sc 2 i i i 5 D ll o 2 eo ze ze zl zl nq oo od oe oo oH 0 00 00 00000010 0 010 O So Oo Co 0 NRANA AODNRADDRAONRA NDNAO0ONP NDNPRONDNENAONDNAD A NNNNNNNNNNNNNN RRA AA edd O ei So l 0 gt 3 SS SS SOS OCS 02 0 0 e COCO H j Oo Si O OS 0 HSS CS SS SS oS coo oOo coc Co co 0090000920 Hi SC OOO o o OO COCO n O He o oo oF Of HS oC O o oo oS I O oo oo SoOooOonHoocoon rn n oo o rn o0o00000000000000 I 00 00 N WI LI I MIE DI DI WI MIE DI DI WI MY DI DI DI MII DI WI WNT Table 6 Phrases time group and time group interpreted by GEMACO Constraint matrices respec tively X and Y are generated by GEMACO according to the component of the vector 05 described in Equation 43 The
60. ams exist for CR analysis (e.g. [23, 50, 9]), but E-SURGE is the first general program for Multievent models. E-SURGE also incorporates a new and extremely flexible way of defining the transition probabilities; because of this, it is useful even when multievent considerations do not apply, for example in multistate models (see [15, 47]). Because the observations in Multievent models do not necessarily correspond to individual states, these models can handle state uncertainty. As a consequence, they provide a general framework for problems such as:
• Heterogeneity of capture, survival or any parameter of interest [35, 37, 34]
• Determination of sex when sex is not available [33, 40]
• Memory models [22, 1, 37, 43]
• Animal epidemiology models [12]

In addition, E-SURGE can handle models conditional on the first occasion. It therefore provides a natural framework for:
• Stop-over duration [38]
• Closed populations [52]
• Occupancy models [31]

E-SURGE benefits from the experience gained in developing M-SURGE [9], a program for multistate CR analysis. M-SURGE introduced a powerful language for describing the set of multistate CR models, reduced statistics for any classes of age, and advanced numerical algorithms. E-SURGE has similar capabilities for maximum likelihood estimation of complex age- and time-dependent models with linear constraints among parameters, in a generalized linear model fashion. Its features include:
• A tool for defining general mode
61. approach or 16 for an ad hoc method to deal with uncertain states Let h 01 0x be a capture history with first encounter at time e event 04 x 1 k has any value between 0 and U and 8 a vector of parameters Then K poa me 86 00 Il gio BHH on 1 65 k e 1 where B ot is the o xth column of the encounter matrix B at time t and age a and D x is a matrix with x on the diagonals and zeros elsewhere and 1y is a N column vector of ones Assuming that individuals are independent the likelihood for the entire set of capture histories is obtained as the product of the likelihoods for each history L 8 CT Pris 6 h where C is a constant and np is the number of copies of capture history h in the data set The maximum likelihood estimation MLE algorithm is as follows 1 Select an initial value for the vector B of mathematical parameters 2 Calculate the vector of biological parameters 0 f71 X 8 10 CHAPTER 2 MODELS 3 Calculate the elementary matrices and as the product of the elementary matrices each of the full matrices II and B 4 Use the full matrices to calculate the probability P h 8 of each capture history according to Equation 5 5 Calculate the relative deviance Dev 8 2 log L 8 log C Ro h 8 6 Iterate steps 2 5 in a Quasi Newton minimization algorithm or a Expectation Maximization EM algorithm or by product algorithms updating the vector of mathematical param
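Steps 3 to 5 of the algorithm (build the full matrices, compute the probability of each history, then the relative deviance) can be sketched with a forward recursion of the kind used for hidden Markov models. This is a simplified illustration, not the program's implementation: it assumes time- and age-constant matrices, two states (alive, dead), event 0 meaning "not encountered", and hypothetical names and values throughout.

```python
import math


def history_probability(history, init, first_enc, trans, enc):
    """Probability of one capture history, conditional on its first encounter.

    init[s]         initial state probabilities
    first_enc[s][o] probability of event o in state s at the first encounter
    trans[r][s]     transition probability from state r to state s
    enc[s][o]       probability of event o in state s at later occasions
    """
    n = len(init)
    first = next(k for k, event in enumerate(history) if event != 0)
    alpha = [init[s] * first_enc[s][history[first]] for s in range(n)]
    for event in history[first + 1:]:
        moved = [sum(alpha[r] * trans[r][s] for r in range(n)) for s in range(n)]
        alpha = [moved[s] * enc[s][event] for s in range(n)]
    return sum(alpha)  # final multiplication by a column vector of ones


def relative_deviance(counts, init, first_enc, trans, enc):
    """-2 * sum over distinct histories of n_h * log P(h)."""
    return -2.0 * sum(
        n_h * math.log(history_probability(h, init, first_enc, trans, enc))
        for h, n_h in counts.items())


# Toy model: survival 0.8, detection 0.5; states (alive, dead), events (0, 1).
INIT = [1.0, 0.0]
FIRST_ENC = [[0.0, 1.0], [1.0, 0.0]]
TRANS = [[0.8, 0.2], [0.0, 1.0]]
ENC = [[0.5, 0.5], [1.0, 0.0]]

p_101 = history_probability((1, 0, 1), INIT, FIRST_ENC, TRANS, ENC)
dev = relative_deviance({(1, 0, 1): 10, (1, 0, 0): 5}, INIT, FIRST_ENC, TRANS, ENC)
```

For the history (1, 0, 1) the result is 0.8 x 0.5 x 0.8 x 0.5 = 0.16. A minimizer would call `relative_deviance` repeatedly while updating the parameters that generate the matrices.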
62. are accepted and should be written between and 5 3 THE HEADED FORMAT 41 Header line with formatted column names b od El ante Cy cc tino io lo Hi tis NH 1 NH2 eNH K efnH1 effNH NG CNH 1 CNH S NH1 NH2 integers or reals en must be separated by spaces or tabulars if number of states is higher than 9 real numbers effnn ng must be separated by spaces or tabular numbers ou letter cnn nc must be separated by spaces or tabular real numbers inh must be separated by spaces or tabular K number of capture occasions NG number of groups NH number of capture histories NC number of covariables 2 number of informative variables censoring variables The Header line contains the K NG S 2 formatted column labels Table 10 Description of the HEADED format The Header Line The header line contains a label for each column to permit E SURGE to read several kind of data The label syntax is format indicator key word label e The key words Several kind of data are allowed and classified according to key words Variable Type Key Word Definition History ei H Recapture History data Sample Size eff x S Associated number of animals Left Censoring i 1 When the animal was censored before the first capture real age at the first capture if left censoring LC 4 1 if left censoring but unknown age O if no left censoring o 1 if the animal is r
63. ather than a minimum.
• The detection of redundant parameters: E-SURGE analyzes the likelihood in the neighborhood of the point of convergence and lists the parameters that are apparently redundant. Redundancy can then be double-checked by drawing profile likelihood curves.

E-SURGE is freely downloadable from http://ftp.cefe.cnrs.fr/biom/Soft-CR/. The program is constantly improved and new capabilities are added. Although E-SURGE has been extensively tested by many people using a variety of pilot data sets, we cannot totally exclude the presence of bugs. We are grateful to you for reporting any problems by e-mail to remi.choquet@cefe.cnrs.fr.

The purpose of this manual is to provide practical instructions for using E-SURGE 1.8, along with some underlying theory. We assume familiarity with the basic notions of CR methodology. We recommend reading [52] for a general review of CR models, [25] for an overview of constrained models and the generalized linear model philosophy in CR analyses, [27] for a review of multistate models, [37] for a description of Multievent models, and [17, 7] for individual and group random effects.

The rest of the manual presents:
• The notation and models covered by E-SURGE (chapter 2)
• The generator of general models, GEPAT (chapter 3)
• The language and tools for building constrained models, GEMACO (chapter 4)
• Data input (chapter 5)
• An example of a session with E-SURGE (chapter 6)
• Some advanced tools (chapters 7 and 8)
64. coordinates F To T A G of the components correspond respectively to From To Time Age Group and are displayed in the right part with all elements different Bi 3 d Pi Pa Bo Ba 1 P2 Ba 0 0 1 Several effects can be combined using these operators since in a b and a b a and b can themselves be model formulae The dot operator has priority over the operator This order can be changed using brackets as for instance in a t g 30 CHAPTER 4 GEMACO 4 4 External covariates i i y LI Let us assume that a time dependent covariate x is available as a column vector x The X2 T2 matrix associated to 02 given by Equation 42 corresponding to a linear effect of this time dependent 1 21 LI 0 0 0 T2 T2 covariate within model t is X generated by the phrase i t x 0 0 T2 L2 0 0 0 0 Thus the matrix product of a factor by an external covariate is a way of replacing this factor by O So E e O O O e e O O O e j and sum operators the the linear effect of the covariate Contrary to the dot operator which is indeed to the traditional matrix product is neither commutative nor associative The default priority order of operations is lt lt and as above can be changed using brackets e g a t x Several covariates related to different effects can be used simultaneously provided they are prepared in a same file with a specific format see section 5 4 Th
65. d D J Cole Formal numerical methods for identifiability Technical report UKC SMSAS 2010 7 R Choquet and O Gimenez Towards built in capture recapture mixed models in program E SURGE Journal of Ornithology In press 8 R Choquet A M Reboulet J D Lebreton O Gimenez and R Pradel U care 2 2 user s manual Technical report CEFE UMR 5175 September 2005 9 R Choquet A M Reboulet R Pradel O Gimenez and J D Lebreton M SURGE New software specifically designed for multistate recapture models In Animal Biodiversity and Conservation volume 27 pages 207 215 2004 10 R Choquet L Rouan and R Pradel Program E SURGE A software application for fitting multievent models In David L Thomson Evan G Cooch and Michael J Conroy editors Mod eling Demographic Processes in Marked Populations volume 3 of Springer Series Environmental and Ecological StatisticsEnvironmental and Ecological Statistics pages 845 865 Dunedin 2009 Springer 11 RA mi Choquet Anne Viallefont Lauriane Rouan Kamel Gaanoun and Jean Michel Gaillard A semi markov model to assess reliably survival patterns from birth to death in free ranging pop ulations Methods in Ecology and Evolution 2 4 383 389 2011 12 P B Conn and E G Cooch Multistate capture recapture analysis under imperfect state obser vation an application to disease models Journal of Applied Ecology 46 2 486 492 2009 13 P B Conn W L Kendall and M D
66. d a fixed threshold. From these sequences, it is easy to calculate the LRS or any quantitative value of interest. To do this, select the option Run > Compute reconstituted histories (viterbi). A menu appears (see Figure 36): the sentence 1:173 selects all the histories 1 to 173; the number 4 asks for the 4 most probable sequences for each history. All the details by history are saved in the file viterbi.txt.

Originally, the counting algorithm estimates the occurrences of the hidden states in the life of the individual. This algorithm is faster than the Viterbi algorithm but, as implemented by [44], gives less information. We therefore generalized this algorithm to evaluate the number of transitions between states. To do this, select the option Run > Count transition numbers. A menu appears (see Figure 37): the sentence 1:173 selects all the histories 1 to 173; the sentence 2:4 selects the states of interest for the LRS, here state 2 (1 fawn), state 3 (2 fawns) and state 4 (3 fawns). A summary is given by E-SURGE:

Number of individuals: 212
Number of occasions in a state (mean, se and ci):
mean    se      ci-     ci+
0.9386  0.0790  [illegible]  [illegible]
1.2949  0.1184  1.0615  1.5282
1.5392  0.1279  1.2871  1.7914

ans 0 1943 2071 0 2155 0335 0 0328 0301 2215 4064 3165 0279 0683 0347 0 2311 0 2559 0 7392 0 0287 0 0319 0 0842
67. dundancy Animal Biodiversity and Conservation 27 1 561 572 2004 V Grosbois and G Tavecchia Modeling dispersal with capture recapture data Disentangling decisions of leaving and settlement Ecology 84 5 1225 1236 2003 J B Hestbeck J D Nichols and R A Malecki Estimates of movement and site fidelity using mark resight data of wintering Canada Geese Ecology 72 523 533 1991 J E Hines Mssurviv user s manual 1994 C Juillet R Choquet G Gauthier and R Pradel A capture recapture model with double marking live and dead encounters and heterogeneity of reporting due to auxiliary mark loss Journal of Agricultural Biological and Environmental Statistics 16 1 88 104 2011 J D Lebreton K P Burnham J Clobert and D R Anderson Modeling survival and testing biological hypotheses using marked animals A unified approach with case studies Ecological Monographs 62 67 118 1992 J D Lebreton R Choquet and O Gimenez Simple estimation and test procedures in capture mark recapture mixed models Biometrics in press J D Lebreton and R Pradel Multistate recapture models modelling incomplete individual histories Journal of Applied Statistics 29 1 4 353 369 2002 86 28 29 30 31 33 34 BIBLIOGRAPHY J D Lebreton and M Roux Biomeco biometrie ecologie 1989 lain L MacDonald and W Zucchini Hidden Markov and other models for discrete valued time series Monog
68. e survival transition matrix Equation 21 induces a variation by rows and columns i e a matrix 28 CHAPTER 4 GEMACO Table 5 Effects and keywords used in the Model Definition Language MDL of GEMACO Phrases in MDL are interpreted in GEMACO to build the matrices of constraints X Effects Keywords and synonyms Comments Constant or Intercept Time Age Cohort Obtained only with A K 1 classes of age Group Departure state from or previous state capture Arrival state to or cur rent state encounter Diagonal Off Diagonal Upper diagonals Lower diagonals First encounter Next encounter Covariates Individual covariates Individual intercept 1 time t age a cohort c group 8 from f previous p to next n current od ud ld firste laste xind ind To obtain constant parameters Categorical variation over time factor with K 1 or K levels for transition gt for initial state and for encounter Categorical variation over age time elapsed since first capture factor with K 1 or K levels More refined age variations are introduced later for transition gt for encounter Categorical variation between cohorts batches of indi viduals released for the first time with a mark on a same occasion factor with K 1 or K levels for transition gt for initial state and for encounter Categoric
69. eal numbers as in Table 12.

t1 ... tK-1 (real numbers). The ti must be separated by spaces, not tabs.
K-1: number of time intervals
ti: time interval i, 1 ≤ i ≤ K-1
Table 12: Description of the file of time intervals

5.6 Selecting steps for unequal time intervals

E-SURGE allows the user to select the steps to which unequal time intervals apply. One or more steps can be considered. By default, unequal time intervals apply to step one. This can be changed by selecting "Unequal time intervals of steps" in the menu Setting (see Figure 14). A dialog box appears (see Figure 15), asking for the steps.

5.7 File of initial values / fixed values

The structure of the input file for Initial Values and Fixed Values in E-SURGE is described in Table 13.

[Figure 14: Select steps from the main menu.]

[Figure 15: UTI applied to transition.]

Users should ent
70. ed models with group random effects only Q 0 7 P p 1 with bp in the form of Equation 13. There is no limit on the number of random effects that can be built. However, for P > 2 the fitting step may be time consuming.

Example 2. We consider a basic model where recapture rates vary with a group random effect:

logit(p_g) = β_0 + b_g, g = 1, ..., NG, (17)

where b_g ~ N(0, σ²) i.i.d.

2.11 Conditionality on the first occasion

Since version 1.7, E-SURGE can handle models whose probabilities are written conditional on the first occasion rather than conditional on the first encounter. It thus provides a natural framework for stop-over duration (SOD), closed population and occupancy models. By default, the conditionality is on the first encounter (Conditional on 1st Capture). To handle an open or a closed population model with conditionality on the first occasion, choose the option Conditional on 1st Occasion. To handle an occupancy model, choose the option Occupancy (see Figure 3).

[Figure 3: Conditionality options.]

Important note: Neither individual covariates nor random effects are currently implemented in E-SURGE for
71. ee After the parameters have been estimated E SURGE creates a file named histories tmp which contains for each history h P h and P D A h una P d a h P D A h Each conditional probabilities P D A h and the associated marginals are stored in an array like d gt a 1 Di N P d h 1 P d 1 a 1 h P d 1 a WN h P d 1 h N 1 Pld N 1La 1 P 4 N 1 a N h P 4 N 1 h Pla 1 h Pla N h 1 Figure 34 illustrate the output associated to the model used in chapter 6 66 CHAPTER 8 Fichier Edition Affichage Insertion Format ADVANCED TOOLS FOR OUTPUT HISTORY Ojal Sle al al di id id GROUP 1 SUM OF PROBABILITIES 0 11477 0 0013763 0 0022713 0 0006298 0 9957226 1 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0013763 0 0022713 0 0006298 0 9957226 1 0000000 HISTORY a cd d il il 2 GROUP 1 SUM OF PROBABILITIES 0 00034768 1 0000000 0 0000000 0 0000000 0 0000000 1 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 1 0000000 0 0000000 0 0000000 0 0000000 1 0000000 HISTORY cau il 3d sl 1 3 GROUP 1 SUM OF PROBABILITIES 0 00059206 0 0000000 1 0000000 0 0000000 0 0000000 1 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 0 0000000 1 0000000 0 0000000 0 0000000
72. egation of parameters lists

E-SURGE offers several possibilities for grouping parameters, in the broad sense. First, one often needs to build effects that are less complex than full dependence on time, age or any other factor. Such effects are obtained by lumping categories. For instance, in an analysis of European Dipper data over 6 years, floods decreased survival in years 2 and 3 [25]. The resulting X matrix is obtained by lumping years 2 and 3 on the one hand, and years 1, 4, 5 and 6 on the other hand. This is done in E-SURGE using lists of categories, each list corresponding to a set of categories lumped together. In the Dipper example, the model formula that reduces the variation over time to two levels is t(1 4 5 6, 2 3). Similarly, over 7 occasions, to distinguish the first year after capture from the other ones as two age classes, one will use a(1, 2 3 4 5 6 7) or a(1, 2:7).

The overall syntax keyword(list1, list2, ..., listm), where the lists defined below are separated by commas, generates a factor with m levels. Each list can be either:
• a list of integers. Example: a(1) is the first age class; a(1 2) aggregates the first two age classes;
• a discrete interval i:j, as a shortcut for the set of integers i, i+1, ..., j. Example: a(3:7) is equivalent to a(3 4 5 6 7);
• a series of indices i:k:j, as a shortcut for the set of integers i, i+k, i+2k, ..., in which two successive in
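The list shortcuts can be illustrated with a small sketch that expands each list into explicit category numbers and then assigns a factor level to every category. It is not GEMACO's parser: the function names are hypothetical, and the exact separators (spaces inside a list, colons for intervals and series) are an assumption about the stripped punctuation of the examples.

```python
def expand_list(spec):
    """Expand one list such as '1 4 5 6', '3:7' or '1:2:7' into integers."""
    values = []
    for token in spec.split():
        parts = [int(p) for p in token.split(":")]
        if len(parts) == 1:            # single integer
            values.append(parts[0])
        elif len(parts) == 2:          # discrete interval i:j
            i, j = parts
            values.extend(range(i, j + 1))
        elif len(parts) == 3:          # series i:k:j, read as start:step:stop
            i, k, j = parts
            values.extend(range(i, j + 1, k))
        else:
            raise ValueError("unrecognised list element: " + token)
    return values


def factor_levels(lists, n_categories):
    """Return, for categories 1..n, the level (1..m) of the list containing each."""
    level_of = {}
    for level, spec in enumerate(lists, start=1):
        for category in expand_list(spec):
            if category in level_of:
                raise ValueError("category listed twice: %d" % category)
            level_of[category] = level
    return [level_of.get(c) for c in range(1, n_categories + 1)]


dipper_levels = factor_levels(["1 4 5 6", "2 3"], 6)  # flood years 2 and 3 lumped
age_levels = factor_levels(["1", "2:7"], 7)           # first year versus later
```

Categories that appear in no list are returned as `None`, which makes omissions visible instead of silently assigning them a level.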
73. emoved at the last capture Right Censoring i 2 RC O if no right censoring Covariable c 4 COV Explanatory Variables predictors ID e The format indicator The format indicator allows E SURGE to identify the variable format By default when the format indicator is not mentionned variables are considered as numeric To read character variables it is 42 CHAPTER 5 DATA INPUT necessary to specify the format indicator as in SAS S SampleSize COV Sex COV weight 55 Male 1 2 12 Female 2 3 41 Female 0 9 e The label A label is mandatory for all covariables and optional for the others It is a word beginning by a letter followed by any letters or figures The covariable labels will be used to build the model in E SURGE Important note The left censoring is only implemented for the Models gt Markovian amp semi Markovian states gt Conditional on 1st Capture option In that case for programming convenience it is necessary to add as much columns of 0 for occasions as the value of the maximal age of left censoring A headed format example H C1 H H C3 H C4 S COV weight COV height LC dead RC COV sex COV Couple 1 0 12 0 1 24 2 81 0 0 male A 0 0 4 1 1 18 2 28 5 1 female A 0 0 0 1 1 33 7 52 0 0 female B 0 0 2 12 1 10 1 33 0 0 male B 0 0 1 4 1 10 7 18 4 0 male C Automatic shortcuts clusters creation and use When a qualitative covariate is find in the file and selected by the user
74. er a list of integers at which unequal time intervals will apply, like 1 3. By default, unequal time intervals apply to step 1.

nI nT nB
INIT
typ_1 β_1
...
typ_nI β_nI
TRANSITION
typ_(nI+1) β_(nI+1)
...
typ_(nI+nT) β_(nI+nT)
ENCOUNTER
typ_(nI+nT+1) β_(nI+nT+1)
...
typ_(nI+nT+nB) β_(nI+nT+nB)

Table 13: Description of the file of Initial and Fixed values. β is a real and typ is an integer giving the type of β:
typ = 0 if β is an initial value;
typ = 1 if β is a fixed value in the generalized logit link scale (if applied);
typ = 2 if β is a fixed value in the identity scale.
typ = 1 is useful to fix mathematical parameters, whereas typ = 2 is useful to fix biological parameters to 1 or 0.

6 A short session
In this chapter, we show how to use E-SURGE, based on a version of the Conditional Arnason-Schwarz model with site fidelity parametrization (see section 3.4). The data are those used in [22] and come from a 6 year study of a seabird between 1984 and 1999 for the study of movements of the Canada Geese between three sites. The steps necessary to obtain the parameter estimates are listed below:
• Open a session, i.e. a frame that will contain the specifics of the data and the analysis re
75. eters 8 to decrease Dev 8 until convergence 7 Obtain in turn the MLE s and the deviance and various by products of Maximum Likelihood estimation 2 8 Factorizing the likelihood Multievent CR calculations condition on the first capture of the individual Because of this non observable states including the dead state cannot appear as initial states and hence they are elim inated from the vector II by setting the corresponding probabilities to zero The remaining non zero initial state probabilities can be hard to estimate if they are time varying However when one initial observable state corresponds to a single type of event and when the first event probabilities are inde pendent from later event probabilities then II and B gt can be estimated independently of the other parameters In fact we can show that there exists a unique decomposition 681 B2 of the parameter vector 3 such that Dev 8 Dev 81 Dev B 7 where Dev 8B 2 Y ma x log II D B oe 1n 8 h Dev B E o 1 II e amp lt p Bi za 1y 9 k e 1 with 8 the vector of salta parameters linked to the initial state probabilities II and first encounter probabilities B Bo is the vector of mathematical parameters linked to the transition probabilities and subsequent encounter probabilities B2 4 1 and A 1 if HIPD B 02 1w gt 0 xp i 1 N 10 0 else 2 9 INDIVIDUAL COVARIATES 11 Internal note When
76. ey are then used as x 1 x 2 e g txx l tx x 2 4 5 Individual covariates A dedicated keyword xind Syntax xind list consider a list of individual covariates given in the input capture recapture file see chapter 5 Example We would like to have the survival depending on two individual covariates x and 2 Index n is for individual n S i rind 1 2 builds a model where logit n Bo Bi x a Bo x a 4 6 AGGREGATION OF PARAMETERS LISTS 31 How act operators on xind Matrix product allows one slope associated to a set of covariates Syntax effect x vind list levels of effect Example We want to build a model where only one slope is associated to a set of individual covariates varying with occasion k 1 lt k lt K 1 S i tx xind 1_K 1 builds a model where logit 9 Bo By x xt Dot product allows one effect to act on a set of covariates Syntax effect xind list Example We want to build a model where a different slope is associated to a covariate at each occasion S i t xind builds a model where logit Bo Bk X Ln Operator gt allows to restrict the effect of a covariate to a set of occasions groups etc Syntax effect gt xind list Example We want to build a model where the covariate applies only to occasions 1 3 and 5 S 1 t 1 3 5 gt rind builds a model where logit 6 085 0 Bo and logit gt UB n By By x an 4 6 Aggr
77. fects with operators e p posa Geek ERR A e a Boe RAE SR 27 AA External COVANISMOS ss ca sa pasa da ADS A TRE ab e REAR PO a A O i 30 4 5 Individual covariates ee 30 A dedicated keyword Zind es gt sew ao ADS a AE eo a EE GDA AA 30 How act Operator on Md casaria eH eee AE RM A ES E Ra 31 4 6 Aggregation of parameters lists 31 4 7 Aggregation of parameters the aggregation operator 32 AS KeVWord OnE saia a R wee ba eee pa SA ie E ds da 33 AO SEGR o RG E Burg A ADI eee Ap ae DICE E a 33 Definition Of SRORCHIR seo urna o ea a O RR Ae Sed A RREO Eb 33 Shortcuts IM practice o so so spasa ES eg ea EEE ERE RE eRe eS 33 4 10 Redundancy in Matrices e ae a ee 34 411 Random effects o socs o sociae fa e goa RS aa Ew Mee h as a a 35 4 12 A list of models au us amas RGR cdas a p ui a A 36 4 13 Aggregating mathematical parameters LL 36 4 14 Non linear model iu ma a ed a a o a e AS a 36 Data input 39 51 The BLOMECO Ornat am aata ea p sa Soe a a E Be OS 39 5 2 The MARK format cea ea ma Ea a a BRAS a Gae eae 40 Bg The HEADED formal 224 4444 40 ge a P cd OES Ya ed eS ERO ES 40 di sos ance Fa Soe AOS ee ee ek AGR ee ek ee 40 The Header LIS ocio sis aia a A Be A O pd ee 41 A HEADED format example a ee 42 Automatic shortcuts clusters creation and use 42 HA File of external covariates i e LARA A a A A E ee 44 fb Pile of time titeryale smp salas Sd BR e E ALR a a a 44 5 6
78. he redundant parameters are not explicitly identified. To improve precision, E-SURGE performs the calculations at 5 points: the first 4 are neighbors of the MLEs, shifted away from the boundaries, and the last one is the MLEs themselves. The estimated numerical rank provided by E-SURGE is the maximum of the 5 ranks obtained. The list of potentially redundant parameters is established as the union of the indices of potentially redundant parameters over the points where the rank is maximal. This set of indices indicates which parameters should be interpreted with care. Recall that direct interpretation of parameters involved in redundancy is not relevant.
The rank of a model can drop locally, although only on a set of interior values of measure 0: the probability of drawing such an interior point is zero, and drawing one four times in a row will almost never happen [43]. Local redundancy also arises when a parameter is estimated at a boundary (see the legend of Figure 32). It often occurs at the MLEs, in which case the set of indices grows at point 5 (the MLEs). In case of doubt about the identifiability of a mathematical parameter, we recommend drawing the profile deviance.
The CAS model with site fidelity parametrization, with 3 states and no fixed parameters, has its parameters over the last interval redundant: the last three survival parameters (25, 26 and 27), the last three fidelity parameters (40, 41 and 42) and the last three transition parameters (55, 56 and 57).
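The five-point procedure described above can be sketched with NumPy. This is an illustration under stated assumptions, not E-SURGE's implementation: the derivative matrices are taken as given, the relative threshold `tol` and the weight cut-off `weight_tol` are arbitrary choices, and flagging parameters through the numerical null space is one common way to list potentially redundant parameters, assumed here for the example.

```python
import numpy as np


def numerical_rank(jacobian, tol=1e-8):
    """Count singular values larger than tol times the largest one."""
    s = np.linalg.svd(jacobian, compute_uv=False)
    if s.size == 0 or s[0] == 0:
        return 0
    return int(np.sum(s > tol * s[0]))


def flagged_parameters(jacobian, tol=1e-8, weight_tol=1e-6):
    """1-based indices of parameters carrying weight in the numerical null space."""
    _, _, vt = np.linalg.svd(jacobian, full_matrices=True)
    null_vectors = vt[numerical_rank(jacobian, tol):]   # rows spanning the null space
    if null_vectors.size == 0:
        return set()
    weights = np.max(np.abs(null_vectors), axis=0)
    return {j + 1 for j, w in enumerate(weights) if w > weight_tol}


def summarize(jacobians, tol=1e-8):
    """Maximum rank over all points, and union of flags where that rank is reached."""
    ranks = [numerical_rank(j, tol) for j in jacobians]
    best = max(ranks)
    flagged = set()
    for jac, rank in zip(jacobians, ranks):
        if rank == best:
            flagged |= flagged_parameters(jac, tol)
    return best, sorted(flagged)


# Toy derivative matrix in which parameters 2 and 3 only enter through their sum.
deficient = np.array([[1, 0, 0], [0, 1, 1], [2, 1, 1], [0, 3, 3]], dtype=float)
print(summarize([deficient, deficient]))
```

Taking the maximum over several points guards against a point where the rank happens to drop locally, which is the motivation given in the text.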
79. here is a probability pk of being seen in 1 at occasion k and so on The matrices II and B are row stochastic i e the sum of each row equals one Thus one of the entries in each row is redundant and need not be estimated In many cases some of the entries are fixed equal to 0 and also need not be estimated The specification of which entries are redundant which are to be estimated and which are fixed at 0 is done with a pattern matrix associated with each of the initial state transition and encounter matrices The entries of the pattern matrix corresponding to redundant entries are set equal to character there will be one such entry in each row The entries of the pattern matrix corresponding to parameters to be estimated are set equal to any letter from the alphabet The entries corresponding to fixed zero values are set equal to character This is confusing but can be made clear by example Denote the pattern matrices by PII P and PB Then the pattern matrices for the combined CAS model are n a 1 a PII 1 24 d pio 1 pl lo d d x o dh dh 1 9h dh P 6 6 25 0 0 1 e rca 010 DA Bel 001 PB p 26 100 iz 1 p pi 0 pos B 1 p 0 pi PB x p 27 1 0 0 Note 1 Each elementary matrix has only one pattern constant across age So the pattern of each encounter elementary matrix is the same for first and next encounters 18 CHAPTER 3 GEPAT Note 2 F
80. here phrasel and phrase2 are any general phrases for fixed effects Note The phrase random ind is equivalent to the phrase ind 36 CHAPTER 4 GEMACO 4 12 A list of models The MDL in GEMACO has been considerably expanded over the notation originally proposed by 25 It is flexible and powerful even for CJS models A comparison between the two notations for a few models frequently used is provided in Table 7 Table 7 Correspondence between models in the notation of 25 and in the MDL in GEMACO 5 Model in the notation of 25 For survival in GEMACO For event in GEMACO txg t g firste nexte t g t g t g firste nexte t g as t a 1 2 A t ag t a 1 2 A t a2 9 a 1 2 A g as g a 1 2 A g t m firste nexte a 2 3 A 1 t t m firste a 2 3 A 1 t gxm firste a 2 3 A 1 g gm firste nexte a 2 3 A 1 g For models with trap ef fect denoted m the data must be decomposed according to 36 8 Same as a 1 a 2 A 1 a 2 3 A 1 g 4 13 Aggregating mathematical parameters GEMACO allows you aggregate levels inside or between a effect inside a step defined in GEPAT for example the survival But it is impossible with GEMACO to set equality between parameters of different type In E SURGE we can do it by aggregating mathematical parameters To do this select the option Setting gt Set equality between parameters of various types after defining the model in GEMACO
81. ing fixing 57 constant 63 from IVFV files 63 from previous model 63 random 63 scale 57 Intermediate states 19 IVFV input file for 44 Keywords 23 24 age 25 28 cohort 28 d od ud ld 28 firste nexte 28 from 23 27 28 group 25 28 ind 28 intercept 28 lists 28 others 33 random 11 time 24 28 INDEX to current 23 27 28 x 28 xind 28 30 Lifetime reproductive success 67 Likelihood 9 factorizing 10 MLE 9 Link function 7 8 54 generalized logit 76 identity 8 logit 8 multinomial logit 8 Lists 31 Local minima 75 initial values 75 Mathematical parameters 7 aggregating 36 output 74 Model additive 75 information 72 retrieve 61 Non linear model 14 non linear model 36 Occupancy models 14 Operators aggregation 4 32 covariates 30 dot product 27 priority of 32 INDEX sum 27 with xind 30 Output area 60 bootstrap 67 Excel file 61 model selection 61 redundancy 62 saved files 61 tos window 61 variance covariance matrix 61 Pattern matrix 16 in practice 50 input 16 21 redundancy 34 zero value 34 Permutation test 71 Quasi Newton 72 Hazard function 14 Gompertz 14 Mixture 14 Siler 14 Weibull 14 hazard function 36 Gompertz 36 Mixture 36 Siler 36 Weibull 36 Random Effects 11 Redundancy in transition matrix 34 in X 27 91 numerical CMF approach 64 of parameters
82. intermediate set of events and Ω the effective set of events.

3.2 A combined formulation of the Arnason-Schwarz model
In this formulation, transition probabilities combine both survival and movement among states conditional on survival. This is a typical case of DES(1,1,1). In the CAS model, the state of an observed individual is always known without error. Thus, with 2 sites, the set of states is E = {1, 2, †} and the set of events is Ω = {not seen, seen in 1, seen in 2}. The initial state matrix at occasion k is
\[ \Pi^k = \begin{pmatrix} \pi^k & 1-\pi^k \end{pmatrix} \tag{20} \]
Note that in E-SURGE the state † is always removed from the full initial state vector, because individuals are all still alive at the first release occasion. The transition matrix at occasion k maps individuals from E to E:
\[ \Phi^k = \begin{pmatrix} \phi_{11}^k & \phi_{12}^k & 1-\phi_{11}^k-\phi_{12}^k \\ \phi_{21}^k & \phi_{22}^k & 1-\phi_{21}^k-\phi_{22}^k \\ 0 & 0 & 1 \end{pmatrix} \tag{21} \]
Rows and columns of Φ both correspond to states.
The event matrices for the first capture and for subsequent captures at occasion k are
\[ B_0^k = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \tag{22} \]
\[ B^k = \begin{pmatrix} 1-p_1^k & p_1^k & 0 \\ 1-p_2^k & 0 & p_2^k \\ 1 & 0 & 0 \end{pmatrix} \tag{23} \]
B maps individuals from E to Ω; the rows of B thus correspond to states and the columns correspond to events. Because first captures are not modelled in the CAS model, $B_0$ says that at its first capture an individual in state 1 will be encountered in "seen in 1" with probability 1, etc. For later captures t
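To make the roles of the initial state, transition and event matrices concrete, the following sketch builds the two-site CAS matrices and evaluates the probability of one capture history with the usual forward recursion for hidden Markov models. It is a hedged illustration rather than E-SURGE code: the function names are hypothetical, parameters are assumed constant over occasions, events are coded 0 = not seen, 1 = seen in 1, 2 = seen in 2, and the initial vector is padded with a zero for the dead state so that all arrays share three states.

```python
import numpy as np


def cas_matrices(pi, phi11, phi12, phi21, phi22, p1, p2):
    """Two-site CAS matrices; states are (site 1, site 2, dead)."""
    init = np.array([pi, 1 - pi, 0.0])                    # dead state padded with 0
    trans = np.array([[phi11, phi12, 1 - phi11 - phi12],
                      [phi21, phi22, 1 - phi21 - phi22],
                      [0.0, 0.0, 1.0]])
    first = np.array([[0.0, 1.0, 0.0],                    # first capture: state known
                      [0.0, 0.0, 1.0],
                      [1.0, 0.0, 0.0]])
    later = np.array([[1 - p1, p1, 0.0],                  # subsequent captures
                      [1 - p2, 0.0, p2],
                      [1.0, 0.0, 0.0]])
    return init, trans, first, later


def history_probability(history, init, trans, first, later):
    """Probability of a history of event codes, starting at the first capture."""
    alpha = init * first[:, history[0]]
    for event in history[1:]:
        alpha = (alpha @ trans) * later[:, event]
    return float(alpha.sum())


init, trans, first, later = cas_matrices(0.6, 0.5, 0.2, 0.1, 0.6, 0.4, 0.7)
# Seen in site 1, then not seen at the next occasion.
print(history_probability([1, 0], init, trans, first, later))
```

Because every matrix is row stochastic, the probabilities of all histories sharing the same first event sum to the corresponding initial state probability.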
83. Figure 30: Give model rank window. The estimated rank is 69, and a model name has to be chosen for the CAS model with site fidelity parametrization.

6.7 Output of results
The previously fitted models of the session, their deviances and AICs are permanently displayed in the Output area (see Figure 31). If more than one model has been run, these models are sorted from top to bottom by increasing QAIC values. When the optimization of a new model stops, the program examines whether it has stopped at a saddle point, in which case a warning is issued (see section 10.2). Then it looks for redundant parameters. This is done by analyzing the singular values of a derivative matrix (see [43, 6] for details, and section 7.2) at 4 points in the neighborhood of the MLE and at the MLE itself. The estimated model rank at each of the five points is shown in the DOS window (see Figure 31), while a list of potentially redundant parameters is given for each point in the output file (see Figure 32). The
84. les of 200 additional iterations or stop here (n = 3). To run the model, click on RUN, the red button in the Compute a Model area of the main window (see Figure 17). This button becomes active once you exit IVFV.
Figure 29: Advanced Numerical Options. The tolerance on change in parameters has been set to $10^{-7}$; the maximum number of iterations is 200. If convergence is not achieved after 3 cycles of 200 iterations, E-SURGE will ask whether or not to continue for 3 more cycles of 200 iterations (the continue after n cycles option has been set with n = 3). The analytical gradient, rather than the Hessian, will be used to compute the rank (the default), and a detailed output of the iterations will be displayed in a DOS window. The Hessian is needed to get standard error estimates.
Now the model is being fit. The rank of the model, conditional on the data, is estimated by computing the rank of a matrix composed of the gradient of each history probability collapsed together and estimated near the
85. ls. A general model is mainly defined by the structure of the transition matrix and the encounter matrix. In M-SURGE and MARK, transition probabilities are defined either directly or in terms of survival and transition conditional on survival. E-SURGE is unique in permitting more than these two steps in defining the transition and encounter matrices and the initial state vector. We call this feature DES, for Decomposition in Elementary Steps. The transition and encounter matrices and the initial state vector are constructed using a pattern generator called GEPAT.
• A powerful model description language. Constrained models are built using a language interpreted by GEMACO, a generator of constrained matrices (also called design matrices). This powerful language is similar to those used in general statistical software packages such as SAS, R, Genstat or GLIM; for instance, the formula t+g generates a model with additive effects of time and group. GEMACO avoids tedious and error-prone matrix manipulations.
• Advanced convergence diagnostics. Convergence to the maximum likelihood estimator is a very sensitive issue in Multievent models. In E-SURGE, the user gains greater control over convergence through a choice of non-linear solvers and of starting options, including the results of previous models, random initial values and multiple random initial values. In addition, warnings are issued if the program stops at a saddle point r
86. ndant parameters indices below are estimables 80 140 141 142 143 144 145 146 147 148 149 150 151 152 160 161 162 163 175 176 177 178 190 191 192 193 204 205 206 207 CHAPTER 13 OUTPUT TEXT FILE 25 26 27 40 41 42 45 48 55 56 57 70 71 72 25 26 27 40 41 42 45 48 55 56 57 70 71 72 25 26 27 40 41 42 45 48 55 56 57 70 71 72 Maximum Likelihood Estimates Index Estimates Lower amp Upper 95 percent CI S E F To TAGS Par Par Par Par Par Par Par Par Par Par Par Par Par Par Par Par Par Par 1 ISC 1 1 1 2 ISC 1 2 1 1 1 1 1 1 1 0 224670873 0 211123372 0 238824520 0 007066957 0 597023446 0 580654624 0 613178976 0 008299832 16 IS 1 1 6 1 1 1 0 202304724 0 175603208 0 231924148 0 014365353 17 ISC 1 2 6 1 1 1 0 450704260 0 416098003 0 485794108 0 017808202 19 MC 1 1 1 1 1 1 0 631613156 0 579582626 0 680751334 0 025889372 20 M 2 2 1 1 1 1 0 742891360 0 703481678 0 778710814 0 019209356 48 MC 2 2 5 1 1 1 0 401722974 0 375928707 0 428073091 0 013313691 49 MC 3 3 5 1 1 1 0 623398935 0 569533095 0 674378087 0 026837606 54H F 1 1 1 1 1 2 0 778638230 0 720419147 0 827634547 0 027367239 56 F 2 3 1 1 1 2 0 901196378 0 876701846 0 921261837 0 011313145 84 F 2 3 5 1 1 2 0 819115054 0 792772322 0 842773232 0 012748695 86 FC 3 5 5 1 1 2 0 735238092 0 6
87. neral model under which the CAS model with site fidelity parametrization is now fully specified, and you can leave the GEPAT interface by clicking on the EXIT button in the lower part of the window. However, before exiting, you can save the matrix patterns for later use. To that end, select Save file with Patterns in the Input/Output for patterns menu.
Presently, two link functions are available in E-SURGE: the generalized logit and identity links. You can choose between these two links in the Setting menu. For now, select the generalized logit link (the default link). This action completes the specification of the general model for the current session.
Before running any model, unequal time intervals (see section 2.6) can be used by selecting the Unequal time intervals sub-menu in the Setting menu. E-SURGE then asks for a file of unequal time intervals (see section 5.5). By default, unequal time intervals are applied to the first step of transition.
In our example, initial state probabilities will be estimated. The first step to compute these estimates is to select Initial Transition & Encounter in the menu Models > If any factorisation (see Figure 26). See section 2.8 for details about the corresponding full likelihood. To leave initial state probabilities out and consider only transition probabilities and encounter probabilities, select Transition & Encounter in the menu Models > If any factorisation. In this case, only the partial likelihood conditional on the
88. ntal and Ecological Statistics, pages 867-879, Dunedin, 2009. Springer.
A. Sanz-Aguilar, G. Tavecchia, M. Genovart, J. M. Igual, D. Oro, L. Rouan, and R. Pradel. Studying the reproductive skipping behavior in long-lived birds by adding nest inspection to individual-based data. Ecological Applications, 21(2):555-564, 2011.
M. Schaub, R. Pradel, L. Jenni, and J. D. Lebreton. Migrating birds stop over longer than usually thought: an improved capture-recapture analysis. Ecology, 82(3):852-859, 2001.
M. Schaub, R. Zink, H. Beissmann, F. Sarrazin, and R. Arlettaz. When to end releases in reintroduction programmes: demographic rates and population viability analysis of bearded vultures in the Alps. Journal of Applied Ecology, 46(1):92-100, 2009.
B. R. Schmidt, R. Feldmann, and M. Schaub. Demographic processes underlying population growth and decline in Salamandra salamandra. Conservation Biology, 19(4):1149-1156, 2005.
A. Viallefont, J. D. Lebreton, A. M. Reboulet, and G. Gory. Parameter identifiability and model selection in capture-recapture models: a numerical approach. Biometrical Journal, 40:1-13, 1998.
G. C. White and K. P. Burnham. Program MARK: survival estimation from populations of marked animals. Bird Study, 46(suppl.):120-139, 1999.
G. N. Wilkinson and C. E. Rogers. Symbolic description of factorial models for analysis of variance. Journal of the Royal Statistical Society, Series C, 22(3):392-399, 1973.
89. oftware BIOMECO [28]. It makes it possible to label rows and columns via external files. This may be advantageous for proper retrieval of CR data, using e.g. individual band numbers as labels for rows when there is one row per individual. The filename dummy sign can be used if you do not want to create specific label files. The BIOMECO format, described in Table 8 (see also [39]), can also be used for input and output in U-CARE.

NH K+NG
Filename1 Filename2
e_(1,1) ... e_(1,K) eff_(1,1) ... eff_(1,NG)
...
e_(NH,1) ... e_(NH,K) eff_(NH,1) ... eff_(NH,NG)

The integers e_(nh,k) must be separated by spaces, not tabs.
The real numbers eff_(nh,ng) must be separated by spaces, not tabs.
Filename1 and Filename2: names of files with row and column labels, respectively.
K: number of capture occasions.
NG: number of groups.
NH: number of capture histories.
K+NG: total number of columns in the file.

Table 8: Description of the BIOMECO format as applied to CR data. e_(nh,k) is either 0 if individual nh is not seen at occasion k, or u if individual nh is seen in the event u (u = 1, ..., U) at occasion k. The set of values e_(nh,1), e_(nh,2), ..., e_(nh,K) is the capture history e_nh, and the associated value eff_(nh,ng) is the number of animals with history e_nh (nh = 1, ..., NH) in group ng (ng = 1, ..., NG). A negative value for eff means that the animals are removed immediately after their last capture.

Remarks
• E-SURGE does not allow ex
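The layout in Table 8 can be read with a few lines of Python. This is a minimal sketch under explicit assumptions, not the reader used by E-SURGE or U-CARE: it assumes the first line holds NH and the total number of columns K+NG, the second line holds the two label file names, and each following line holds K event codes followed by NG counts. The function name `read_biomeco`, the sample file names and the need to pass the number of groups by hand are all choices made for the example.

```python
def read_biomeco(text, n_groups):
    """Parse BIOMECO-style text into capture histories and per-group counts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_hist, n_cols = (int(v) for v in lines[0].split())
    row_label_file, col_label_file = lines[1].split()
    n_occasions = n_cols - n_groups            # K = (K + NG) - NG
    data_lines = lines[2:2 + n_hist]
    if len(data_lines) != n_hist:
        raise ValueError("fewer history lines than announced in the first line")

    histories, counts = [], []
    for line in data_lines:
        if "\t" in line:
            raise ValueError("values must be separated by spaces, not tabs")
        fields = line.split()
        if len(fields) != n_cols:
            raise ValueError(f"expected {n_cols} columns, got {len(fields)}")
        histories.append([int(v) for v in fields[:n_occasions]])
        counts.append([float(v) for v in fields[n_occasions:]])
    return {"row_labels": row_label_file, "col_labels": col_label_file,
            "histories": histories, "counts": counts}


# Hypothetical three-occasion, one-group file; the label file names are made up.
sample = """3 4
rows.txt cols.txt
1 0 2 10
0 1 1 -3
2 2 0 5
"""
data = read_biomeco(sample, n_groups=1)
print(data["histories"], data["counts"])
```

In the sample, the negative count on the second history marks three animals removed immediately after their last capture, following the convention stated in the table legend.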
90. on Models > Markovian & semi-Markovian states > Conditional on 1st Capture (see Figure 4).
Figure 4: Semi-Markov option.

3 Flexible generation of a general model: GEPAT
3.1 Overview
GEPAT (for GEnerator of PATtern of elementary matrices) makes it possible to generate the GM, using the elementary matrices in Equation 1, under which the UM is defined (see Figure 1). Denoting this model as DES (for Decomposition in Elementary Steps) and denoting the number of elementary matrices of each type by LI, LT and LB, we have DES(LI, LT, LB). This feature was chosen in the context of uncertainty (see [33, 37]) to allow
• models described by Equation 1 to be expressed as linear models;
• constraints to be applied to each parameter separately through a language (here GEMACO).
This approach allows the specification of complex models while avoiding algorithms with non-linear constraints for keeping parameters in range. Non-linear constraints can then be handled in a very efficient way with unconstrained algorithms.
It is also helpful in multistate problems without uncertain states, because it allows biologists to specify models in more detail. Developing life cycle models in terms of such lower-level parameters has a long tradition in various branches of population biology, e.g. 3
91. ov chain. The successive states occupied by an individual are not observed directly. Rather, at each occasion k, one member of a finite set Ω of events is observed. The event observed at occasion k is assumed to depend only on the unobserved underlying state of the individual at that occasion. Unlike traditional practice in CR, but similar to [16, 37] and consistent with the Markov chain property [29], the dead state is explicitly included in E. By convention, in E-SURGE, it appears last in the list of states. Similarly, the event "not seen" is explicitly included in Ω, in which it appears first.
Multievent models are defined in terms of three kinds of parameters: initial state probabilities π, transition probabilities φ, and encounter probabilities b. For group g we have:
$\pi_{k,e}^{g}$, the probability of being in state e when first encountered at index of time k;
$\phi_{k,a,e_i e_j}^{g}$, the probability of being in state $e_j$ at index of time k+1 if in state $e_i$ at index of time k, for the interval a since first capture;
$b_{k,a,e v}^{g}$, the probability of event v for an animal in state e at index of time k, at occasion a since first capture (including first capture).
• Π = (π) denotes the 1 x N vector of initial state probabilities;
• Φ = (φ_ij) denotes the N x N matrix of unconditional transition probabilities, i.e. the matrix of probabilities that an individual moves from one state to another state over a time interval;
• B = (b
92. poy pb 0 0 fo 1 f 0 0 0 Sage ia a ya 0 0 0 0 fs 1 fs 0 SS so Ss f 0 0 0 0 0 0 1 The third elementary matrix for movement conditional on emigration maps from E 2 back to EO and so is of dimension 7 x 4 1 0 0 0 x O vf 1 Y 0 px 0 1 0 0 x O yh 0 1 45 0 Po y 4 37 0 0 1 0 di 1 4 0 0 I E 0 0 1 i Sea 3 5 GEPAT IN PRACTICE 21 The event matrices B map from the set EU of states to the set Q of events and thus are of dimension 4x4 0100 0010 0001 1000 1 p p 0 0 p 1 pk 0 pi 0 ys Br Pa p2 PB 39 1 p 0 0 pi p 1 0 0 0 Lula Important note the choice of states for the intermediate transitions is not always unique There may be more than one equivalent way to group individuals and at the present the only advice we can give is to determine from the structure of the model what information needs to be kept at any one step in order to define the probability of subsequent transitions For example in the models described here the future transitions of a dead individual these transitions are boring the individual just remains dead do not depend on which state the individual died from Thus EU includes only one dead state But the future transitions of individuals that leave a site do depend on what site the individual left from Thus EQ must include separate states for indi
93. raphs on Statistics and applied probability. Chapman and Hall, 2000.
D. I. MacKenzie, J. D. Nichols, J. A. Royle, K. H. Pollock, L. L. Bailey, and J. E. Hines. Occupancy estimation and modeling: inferring patterns and dynamics of species occurrence. Academic Press, 2006.
D. I. MacKenzie, J. D. Nichols, M. E. Seamans, and R. J. Gutierrez. Modeling species occurrence dynamics with multiple states and imperfect detection. Ecology, 90(3):823-835, 2009.
P. McCullagh and J. A. Nelder. Generalized linear models. Chapman and Hall, New York, USA, 1989.
J. D. Nichols, W. L. Kendall, J. E. Hines, and J. A. Spendelow. Estimation of sex-specific survival from capture-recapture data when sex is not always known. Ecology, 85(12):3192-3201, 2004.
G. Peron, P. A. Crochet, R. Choquet, R. Pradel, J. D. Lebreton, and O. Gimenez. Capture-recapture models with heterogeneity to study survival senescence in the wild. Oikos, 119(3):524-532, 2010.
S. Pledger, K. H. Pollock, and J. L. Norris. Open capture-recapture models with heterogeneity: I. Cormack-Jolly-Seber model. Biometrics, 59:786-794, 2003.
R. Pradel. Flexibility in survival analysis from recapture data: handling trap-dependence. In J. D. Lebreton and P. M. North, editors, Marked individuals in the study of bird population, pages 29-37. Basel
94. rz model with site fidelity parametrization
Now we consider a version of the Arnason-Schwarz model in which the probability of transition conditional on survival is further subdivided into a probability of leaving the site (the complement of site fidelity) and a probability of moving to each other site conditional on leaving [21]. This is a DES(1,3,1) general model. With 3 sites, assuming that if an animal is seen its state is known without error, the set of events (i.e. the results of observations) is Ω = {not seen, seen at 1, seen at 2, seen at 3}.
Defining sets of intermediate states
In the classical separate formulation of the Arnason-Schwarz model, the set of possible states for an individual is the same for both elementary matrices (survival, and transition conditional on survival). In general, however, there may be a different set of states at each of the elementary steps. In constructing the elementary matrices and their states, it may be helpful to use a directed acyclic graph (DAG), a device used in many areas to represent relations between items. Figure 6 shows the formulation of the Grosbois model. In this formulation, states are denoted as numbered nodes on a row. Each step in the life process is represented by a subsequent row, and the possible transitions are denoted by arrows. The initial set of states is repeated at the bottom of the graph. In the Grosbois site fidelity model, the sets of states are
95. s should be done with care, because the meaning of a model may be lost if the underlying UM is not remembered. There are six potential sources of variation in the parameters:
1. groups, i.e. permanent categories of individuals such as sexes or species, or discrete unconnected study sites;
2. age, i.e. number of occasions or intervals elapsed since first capture;
3. time;
4. state of departure;
5. state of arrival;
6. current event.
In the UM, parameters are always allowed to vary freely over time and among groups. Only the number of states and the number of age classes can be set to different values.
Decomposition in Elementary Steps
It is sometimes useful to define the full initial state, full transition and/or full encounter matrices as arising from a sequence of life processes. The familiar decomposition of the transition matrix into survival and transition conditional on survival, implemented in M-SURGE, is an example of this, but in some cases more steps may be involved. In an approach similar to that used for periodic matrix population models ([3], Chapter 14), but between two dates or at one occasion, the full matrices are written as products of elementary matrices:
\[ \Pi = \prod_{l=1}^{LI} \Pi^{(l)}, \qquad \Phi = \prod_{l=1}^{LT} \Phi^{(l)}, \qquad B = \prod_{l=1}^{LB} B^{(l)} \tag{1} \]
The intermediate states involved in the sequence of life processes may not be the same as the basic set of states in the model. Thus, the elementary matrices need not be square.
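As an illustration of a product of non-square elementary matrices, the sketch below assembles the three transition steps of the three-site site fidelity model discussed elsewhere in this manual: survival (4 x 4), fidelity given survival (4 x 7), and movement given departure (7 x 4). It is a hedged example rather than E-SURGE output: the function names are hypothetical, the intermediate states are assumed to be ordered (staying in 1, leaving 1, staying in 2, leaving 2, staying in 3, leaving 3, dead), and `psi[i]` is assumed to be the probability that an animal leaving site i moves to the lower-numbered of the two other sites.

```python
import numpy as np


def survival_matrix(s):
    """4 x 4: sites 1-3 plus dead; s[i] is survival in site i."""
    m = np.zeros((4, 4))
    for i, si in enumerate(s):
        m[i, i], m[i, 3] = si, 1 - si
    m[3, 3] = 1.0
    return m


def fidelity_matrix(f):
    """4 x 7: each live site splits into 'staying' and 'leaving'; f[i] is fidelity."""
    m = np.zeros((4, 7))
    for i, fi in enumerate(f):
        m[i, 2 * i], m[i, 2 * i + 1] = fi, 1 - fi
    m[3, 6] = 1.0
    return m


def movement_matrix(psi):
    """7 x 4: stayers return to their site, leavers choose one of the other two."""
    m = np.zeros((7, 4))
    for i in range(3):
        m[2 * i, i] = 1.0
        first, second = [j for j in range(3) if j != i]
        m[2 * i + 1, first], m[2 * i + 1, second] = psi[i], 1 - psi[i]
    m[6, 3] = 1.0
    return m


# Full transition matrix as the product of the three elementary matrices.
phi = (survival_matrix([0.8, 0.7, 0.9])
       @ fidelity_matrix([0.6, 0.5, 0.75])
       @ movement_matrix([0.3, 0.4, 0.5]))
print(phi.round(3))
```

The product is again a 4 x 4 row-stochastic matrix on the basic set of states, even though the intermediate step uses seven states.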
96. s the new model to start from the solution of the previous one. Our preliminary investigations with this approach yielded promising results when the simpler model was well chosen.

10.2 Saddle point
Estimates at a saddle point always result from poor convergence and/or a difficult problem. To avoid this, several solutions may be advocated:
1. If convergence is not attained, try to help it by fixing appropriate parameters. Some capture rates that are known to be zero may be fixed to zero. Some parameters involved in redundancy, like the last capture rates for the CJS model, may be fixed to one.
2. If convergence is attained, reduce the tolerances.

10.3 Additive models
Because of the redundancy inherent in categorical variables, the sentence t+g is reduced to t+g(2:NG). The first column of g is automatically deleted, as the sum of the columns of t and the sum of the columns of g are both equal to the intercept. For the sentence g+t, the first column of t is deleted. However, the two formulations are equivalent to the model t+g, because the resulting X matrices generate the same linear subspace and, in turn, lead to the same final parameter estimates and minimal deviance value.

10.4 Generalized logit
Additive effects with the generalized logit do not generate the usual parallel responses. Pending further investigations, we recommend not using additive effects with the generalized logit
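The equivalence claimed in section 10.3 can be checked numerically. The sketch below is an illustration, not GEMACO's algorithm: it builds dummy-coded design matrices for a toy layout with 3 occasions and 2 groups (one row per time and group combination) and verifies that dropping the first column of g or the first column of t leaves the column space unchanged. The helper names are hypothetical.

```python
import numpy as np


def indicator_columns(levels, n_levels):
    """One 0/1 column per level; levels are coded 1..n_levels."""
    x = np.zeros((len(levels), n_levels))
    x[np.arange(len(levels)), np.asarray(levels) - 1] = 1.0
    return x


def additive_design(first, n_first, second, n_second):
    """All columns of the first factor, plus the second factor without its first column."""
    return np.hstack([indicator_columns(first, n_first),
                      indicator_columns(second, n_second)[:, 1:]])


def same_column_space(a, b):
    """Two matrices span the same subspace if stacking them does not raise the rank."""
    rank = np.linalg.matrix_rank
    return rank(a) == rank(b) == rank(np.hstack([a, b]))


time = [1, 2, 3, 1, 2, 3]
group = [1, 1, 1, 2, 2, 2]
x_tg = additive_design(time, 3, group, 2)    # t + g with the first column of g deleted
x_gt = additive_design(group, 2, time, 3)    # g + t with the first column of t deleted
x_all = np.hstack([indicator_columns(time, 3), indicator_columns(group, 2)])

print(np.linalg.matrix_rank(x_all), same_column_space(x_tg, x_gt))
```

The full set of five indicator columns has rank 4 only, which is why one column can be deleted without changing the fitted model.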
97. short. This phrase will be interpreted by GEMACO to build X automatically. The MDL language is based on reserved keywords for various effects, such as time (t) or group (g), and on operators. This language expands the tensor notation for analysis of variance models [51] (see [32], p. 41), adapted to and advocated for CJS models by [25]. Several other steps for building constrained models, some of which are optional, will be examined later. We recommend that you carefully read the presentation of the MDL and work through the examples to progressively learn how to "speak" MDL. You will soon realize that GEMACO, along with its MDL, offers very wide possibilities that make building nearly any biologically meaningful model a fairly easy task.

4.2 Keywords for main effects
In capture-recapture modeling, several classical effects such as time, age and group have been widely used to explain variability in the data [25]. In the MDL of GEMACO, these effects are represented by reserved keywords, with synonyms to facilitate writing models. The effects and their associated keywords are described in Table 5. These effects are considered here by themselves, i.e. as main effects in an analysis of variance sense. They can also be combined, as seen in the next paragraph. As a first example of the capabilities of GEMACO, let us assume we want to run a CJS-type model with survival constant over time but varying among groups, and recapture probabili
98. sion file and it will be saved automatically for future use when you exit E-SURGE. Next, you must load the capture histories data, prepared as either a BIOMECO, a MARK or a HEADED format file. [Figure 19: Read the data from the Biomeco file geese.rh. This analysis was run under a French version of Windows.] Select the option Open a Biomeco file in the Data menu and select your file. In our example, the file is geese.rh, as shown in Figure 19. Once you have loaded your data file, the area Data status is automatically updated with a description of the data. When an old session is opened, the data set is loaded automatically and E-SURGE gives a short description of the data in the Data status area of the main window (see Figure 17). You can, and must if necessary, change the number of groups, the number of states and events, and the number of age classes: press the Modify button and change the current values (see Figure 20). [Screenshot: Data status area showing # of groups 1, # of individual cov 0, # of states 4, # of events 4, # of age classes 5, # of occasions 6, with the Modify button and the prompts Give the number of groups and Give the number of individual covariates.]
99. ssion; Leucine data; Chancing model Stats; Window structure of the GEPAT interface; Pattern in GEPAT for the survival; Pattern in GEPAT for the fidelity; Pattern in GEPAT for the movement; Pattern in GEPAT for capture; Keeping Initial State probabilities in E-SURGE; Window structure of the GEMACO interface; Initial Values / Fixed Values window; Advanced Numerical Options; Give model rank window; Output of E-SURGE I; Output of E-SURGE II; Save probability of each history; First four capture histories extracted from the file histories.tmp; BOGtSiTS; Viterbi. 1 Introduction. E-SURGE, which stands for MultiEvent Generalized Survival Estimation, is a program for fitting Multievent models [37] to capture-recapture (CR) data. Multievent models are an extension of multistate models in which observations do not necessarily correspond to states. Several progr
100. sults for future retrieval. • Load the capture histories data if the session is new. • Build a general model containing an umbrella model (based in particular on appropriate goodness-of-fit tests in U-CARE) using the GEPAT interface. • Specify and build further constraints using the GEMACO interface. • Fix parameters and/or change initial values if needed using the IVFV interface. • Run the model. • Examine and interpret the results. We will go through these steps in the following paragraphs. The general organization of E-SURGE is summarized in Figure 16. [Figure 16: General organization of E-SURGE.] 6.1 Main window of E-SURGE. After invoking E-SURGE (e.g. from Windows Explorer, by double-clicking on the E-SURGE executable), we are presented with the main window of E-SURGE (see Figure 17). The window is divided into four areas, namely Data Status, Advanced Numerical Options, Compute a Model and Output, and the toolbar has six menus, namely Start, Data, Models, Setting, Run & See. [Screenshot: main window of E-SURGE V 1.7.1, with no data loaded.]
101. t [Figure 7: Shortcut creation interface.] GEMACO automatically replaces the sentence Ad+Juv by a(1).g(1)+a(2:5).g(1)&g(2). We can test as usual for effects affecting juveniles and adults separately. For example, the sentence Ad.t+Juv considers a time-dependent survival for adults only. The shortcut realage can be defined as a(1).g(1)+a(2:5).g(1)&g(2), with 2 levels. We can build the models Ad+Juv and Ad.t+Juv respectively with the two phrases realage and realage(1).t+realage(2). 4.10 Redundancy in matrices. Each matrix Π, Φ, B is row-stochastic, i.e. the sum of each row is equal to one. Thus, for a matrix of size R × S, at most R × (S - 1) parameters out of the R × S parameters have to be estimated. One redundant parameter has to be chosen for each row. This is open to the user's choice, based on a pattern matrix T made of characters. The T matrix, of size R × S, is made of the symbols '*' and '-' and of alphabetical letters, with rows corresponding to previous states and columns to the next state or current event, as usual. For each non-zero element of the row-stochastic matrix (either Π, Φ or B), t_rs is equal to any alphabetical letter, except for one element per row that is set to '*' to define the position of
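The role of the redundant element in each row (section 4.10) can be sketched as follows. This is an illustration under stated assumptions, not GEPAT's implementation: the function name fill_row is hypothetical, and the convention that '*' marks the complement and '-' a structural zero is assumed here.

```python
def fill_row(pattern_row, free_values):
    """Build one row of a row-stochastic matrix from a pattern row.

    pattern_row: list of one-character symbols; a letter marks a free
    parameter, '-' a structural zero and '*' the redundant element
    (assumed convention for this sketch).
    free_values: values of the free parameters, in row order.
    """
    if pattern_row.count("*") != 1:
        raise ValueError("each row needs exactly one '*'")
    values = iter(free_values)
    row = [0.0 if s in ("*", "-") else next(values) for s in pattern_row]
    # The redundant element is the complement of the rest of the row.
    row[pattern_row.index("*")] = 1.0 - sum(row)
    return row


# Hypothetical 3-state row: one free parameter, one complement, one zero.
example_row = fill_row(["a", "*", "-"], [0.3])
print(example_row)
```

Only the letters consume a value, so a row with S columns never needs more than S - 1 estimated parameters.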
102. tegers are separated by the step k Example a 3 2 7 and a 3 2 8 are both equivalent to a 3 5 7 66 9 e A composite list using and square brackets according to the syntax list listl list2 Example a 2 3 6 7 10 is equivalent to a 2 3 6 7 10 The syntax keyword list constrains all parameters in the list to be equal and leaves the param eters corresponding to the other categories of the effect in keyword unconstrained In terms of the corresponding X matrix it sums the columns in the list to produce a single column and leaves the other unchanged If only two age classes A 2 are used for the umbrella model then the model can be re written a 1 2 instead of a 1 2 7 The operator can also be used in this context the formula t 1 45 6 2 3 in the dipper example above is equivalent to the formula t 1 4 5 6 t 2 3 In this type of combination when one wants to keep a series of consecutive categories or equivalently of factor levels distinct one can also use the sign _ to replace the list with commas as separators The overall syntax is keyword i_j which constrains all levels between i and j to be different Example t 1 3 4 6 t 1 2 3 4 5 6 forces levels 1 2 3 to be equal and keeps levels 4 5 6 different 4 7 Aggregation of parameters the aggregation operator Lists make it possible only to aggregate parameters within a same main effect The aggregation oper ator amp
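The effect of a list on the X matrix (the listed columns are summed into a single column, the others are left unchanged) can be illustrated with a small sketch. The helper names identity_design and aggregate_levels are hypothetical and the code is not part of GEMACO; it only mimics the column operation described above.

```python
def identity_design(n_levels):
    """Design matrix of a main effect: one indicator column per level."""
    return [[1 if i == j else 0 for j in range(n_levels)]
            for i in range(n_levels)]


def aggregate_levels(X, level_list):
    """Sum the columns of the listed (1-based) levels into one column."""
    idx = sorted(level - 1 for level in level_list)
    first = idx[0]
    out = []
    for row in X:
        merged = sum(row[j] for j in idx)
        new_row = []
        for j, value in enumerate(row):
            if j == first:
                new_row.append(merged)   # merged column replaces the first
            elif j not in idx:
                new_row.append(value)    # other columns are unchanged
        out.append(new_row)
    return out


# Six time levels; levels 1, 2, 3 constrained equal, 4, 5, 6 kept distinct.
X_time = identity_design(6)
X_constrained = aggregate_levels(X_time, [1, 2, 3])
for row in X_constrained:
    print(row)
```

The six original columns collapse to four: one shared by levels 1 to 3 and one for each of levels 4, 5 and 6.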
103. tep by step as follows. [Figure 21: Window structure of the GEPAT interface.] The default patterns for the transition and encounter matrices are those of the combined formulation of the CAS general model. The default matrix pattern for the initial state will be kept unchanged. First, select Transition in the menu Parameters, change the number of steps to 3 and, optionally, change the label of the current step 1 to S or any convenient label for survival. Change the matrix pattern to Equation 35; the result is given in Figure 22. To obtain this diagonal matrix, you can click on the button Diagonal Matrix. Select the current step 2 by clicking on the adjacent right arrow. Change the label to F or any convenient label for fidelity. Define the size of the matrix in the options area: enter 4 for the number of rows and 7 for the number of columns, and click on the button Update Now. Replace the empty matrix pattern by Equation 36; the result should be the one shown in Figure 23. Select the current step 3 by clicking once more on the right arrow. Change the label to M or any convenient label for settlement. Define the size of the matrix in the options area: enter 7 for the number of rows and 4 for the number of columns, and click on the button
104. ternal filenames. Only the dummy sign is accepted in place of filename1 and filename2. We plan to use such names for labels of rows and columns respectively in further versions. 5.2 The MARK format. Alternatively, the input data file can be in MARK format [50], described in Table 9. [Table 9: Description of the MARK format. Row h (h = 1, ..., NH) contains e_h,1 e_h,2 ... e_h,K followed by eff_h,1 ... eff_h,NG. The integers e_h,k are written without data separator; the real numbers eff_h,g must be separated by spaces, not tabs.] E-SURGE then asks for the number of columns containing covariates. By default this value is zero (no external variable), as such covariates are not presently handled by E-SURGE. Note that the e_h,k have to be contiguous, i.e. not separated by a blank, in contrast to the BIOMECO format. E-SURGE uses only digits, not letters, for states, and the maximum number of states with the MARK format in E-SURGE is presently 9. 5.3 The HEADED format. The format. This format is more general, in the sense that it includes the two previous formats with an explicit label for each column. Using meaningful names as labels may be advantageous for proper retrieval of CR data. Numbers are used as labels for events, so the number of events is not limited. E-SURGE uses only digits, not letters, for states. Note that the e_h,k have to be separated by a blank or a tab if the number of states is higher than 9. Comments
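The layout of one MARK-format row (contiguous digits for the history, then space-separated numbers of individuals per group) can be read with a few lines of code. This is only a sketch: parse_mark_line is a hypothetical helper, not an E-SURGE function, and the optional trailing semicolon is an assumption based on the usual MARK input convention rather than on the text above.

```python
def parse_mark_line(line):
    """Split one MARK-style row into (history, counts).

    history: list of single-digit codes, one per occasion.
    counts: list of numbers of individuals, one per group.
    """
    text = line.strip().rstrip(";").strip()   # assumed optional ';'
    fields = text.split()
    if not fields or not fields[0].isdigit():
        raise ValueError("history must be a run of contiguous digits")
    history = [int(ch) for ch in fields[0]]
    counts = [float(x) for x in fields[1:]]
    return history, counts


history, counts = parse_mark_line("1021 12 3;")
print(history, counts)
```

Because each occasion is a single character, this layout cannot encode more than 9 distinct non-zero codes, which matches the stated limit on the number of states.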
105. text file o ON DO oO sa U N 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 E SURGE V 1 7 1 23 Sep 2010 OUTPUT FILE C Erika workshop TUTORIALS Ex3 geese CAS out Data File C Erika workshop TUTORIALS Ex3 geese Geese rh Number Number Number Number Number of of of of of occasions 6 states 4 events 4 groups 1 age classes 1 Model Name Modell Model formula or file For Initial State IS Step 1 12 current t For Transition M Step 1 15 f t For Transition F Step 2 15 f t For Transition T Step 3 15 f to t For Event E Step 1 16 firste nexte current t Init values 0 000000 0 000000 0 000000 0 000000 0 000000 Init indices 1 2 3 4 5 6 72 73 Fix values 1 000000 Fix indices 58 of step for initial state 1 Phrase for step 1 current t Number of shortcuts O Pattern matrix pp Name file for covariates defaultfile of step for transition 3 Phrase for step 1 f t Number of shortcuts 0 Pattern matrix 78 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 CHAPTER 13 y y Phrase for step 2 f t Number of shortcuts 0 Pattern matrix Phrase for step 3 f to t Number of shortcuts 0 Pattern matrix Cana ve od a Ip i
106. the chain 1 2 1 1 1 1. The 1 2 sequence means that the beta parameter corresponds to the capture probability of event 2 (seen as a 1) for animals in state 1; the fourth index, 1, indicates the first class of age (the first capture). On the same line, you should replace the value 0.5 with 1. This is the new starting value for parameter 1. Check the box nearby to fix this parameter. Now parameter 1 is excluded from the optimization process. Its value is frozen at the initial value you have just entered. Click on the button Exit to exit the IVFV interface. It may also be useful to fix some parameters to pre-determined values. In the CAS model, the recapture probability at the last occasion is not identifiable separately from the survival probability over the last interval. Fixing the three last recapture probabilities to 1 does not change the results and may indeed facilitate convergence. There is an option Initial Value in the main window (see Figure 17) to change the way initial values or fixed values are set. You may want to play with it and see the change in the IVFV interface. Fixing and selecting initial values can be done either on the [0, 1] axis, i.e. on the biological parameter scale without link, like 0.8, or on the real axis, i.e. on the mathematical parameter scale, f(0.8). In the latter case, it is the transformed value, i.e. the logit or the generalized logit of the parameter, that is set. This can be specified via
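The two scales on which a value can be entered are related by the link function. The sketch below assumes the simple logit as f (the generalized logit case is not covered), and the function names are hypothetical.

```python
import math


def logit(p):
    """Map a probability in (0, 1) to the real axis."""
    return math.log(p / (1.0 - p))


def inv_logit(beta):
    """Map a real number back to the (0, 1) interval."""
    return 1.0 / (1.0 + math.exp(-beta))


p_start = 0.8                   # value on the biological scale
beta_start = logit(p_start)     # same value on the mathematical scale
print(beta_start, inv_logit(beta_start))
```

A value of exactly 0 or 1 has no finite image on the real axis under this link, which is one reason why boundary values are more naturally entered on the [0, 1] scale.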
107. the generalized logit, because parallelism is not as clearly defined as with the logit link. In such cases, the identity link may be used in E-SURGE. A specific algorithm is then used to keep estimates in range, but this algorithm is slower and is not used as the default option. 2.6 Unequal time intervals. E-SURGE permits calculations based on unequal time intervals for any steps of transition Φ (see Equation 1) of your choice (see sections 5.5 and 5.6 for practical use). Each row of the considered transition matrix should not contain more than one parameter and its complement. The implementation of powers of matrices should be considered to allow any kind of transition; this feature is not presently available. For survival, the true parameter estimated by E-SURGE is then survival per unit of time, denoted s, instead of survival over the whole interval, denoted S. The length of the time interval, $\Delta t$, is used to back-calculate the estimate of survival over the interval: $S = s^{\Delta t}$. 2.7 Maximum likelihood estimation. The likelihood of a model is proportional to the probability of the data given that model. The basic unit of data in E-SURGE is the capture history; reduced-form data descriptions like the m-array are not available in general for multievent models. Thus the likelihood calculation depends on the application of the transition probabilities to individual capture histories. See [37] for a presentation of this
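The back-calculation for unequal intervals is a one-line power relation, sketched below. The function names and the numerical values are illustrative assumptions only.

```python
def survival_over_interval(s_unit, dt):
    """Survival over an interval of length dt from survival per unit time."""
    return s_unit ** dt


def survival_per_unit(S_interval, dt):
    """Inverse relation: survival per unit time from interval survival."""
    return S_interval ** (1.0 / dt)


# Hypothetical values: per-unit survival 0.9 over intervals of length 1, 2, 0.5.
s = 0.9
intervals = [1.0, 2.0, 0.5]
S_values = [survival_over_interval(s, dt) for dt in intervals]
print(S_values)
```

Working per unit of time is what makes estimates comparable across intervals of different lengths.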
108. the redundant parameter for this row. Again, there must be a single star '*' in each row. For each element structurally equal to 0 (in either Π, Φ or B), t_rs is equal to the character '-'. For instance, if we want to use the parameters ψ11, ψ12, ψ21, ψ22, ψ32, ψ33 of Equation 30 adapted to 3 states, instead of using the parameters ψ12, ψ13, ψ21, ψ23, ψ31, ψ32, we have to change the transition pattern matrix from (* ψ ψ / ψ * ψ / ψ ψ *), which is the default, to (ψ ψ * / ψ ψ * / * ψ ψ). 4.11 Random effects. For random effects, we extend the MDL for fixed effects [5]. We therefore introduce a new built-in keyword factor, denoted ind, for individual random effects, and implement random effects for groups with the keyword random, which translates fixed effects into random effects. These additions fit naturally into E-SURGE's model specification syntax. However, contrary to traditional effects like time, age and group, direct addressing of levels of ind (one level corresponding to one individual) is not currently allowed. We also extend the operator to concatenate fixed effects and random effects, to generate mixed models of the form 14 16. Examples include: The phrase weight ind models Equation 15. The phrase random group models Equation 17. More generally, two general forms of phrase are currently allowed: phrase1 phrase2 ind for Equation 14 and phrase1 random phrase2 for Equation 16 w
109. the toolbar menu item Value space. Finally, the set of initial values can be saved in a file and reloaded later (FILE option of the toolbar). [Figure 28: IVFV interface. The encounter parameter 1 is fixed to one in the [0, 1] scale. This is because the box fixing or setting initial values of Beta is activated. In the current figure, the capture parameter 2 is not fixed. Iterations will start for this parameter with an initial value equal to 0.5.] 6.6 Setting optimization parameters and running the model. Before running the model, it is possible to change the numerical options that govern the optimization algorithm in the Advanced Numerical Options area of the main window (see Figure 17). Because our example is a relatively simple model with good data, the maximum number of iterations may be reduced to 200 (see Figure 29). We can also set the tolerance to parameters change to 0.0000001. This is one of the two stopping criteria of the algorithm: the lower the tolerances, the more precise the result. If the maximum number of iterations is small, as here, we recommend that you set the Convergence option to Continue after n cycles. If no stopping criterion is satisfied after n cycles of 200 iterations, E-SURGE will ask you whether to go on with n cyc
110. therwise the gradient is computed analytically. In lines 71 and 72, the tolerances TOLF and TOLX, used as criteria of convergence, are given. Stopping criteria recommended by [14, p. 347] are used. The minimization process gives a local minimum which is not necessarily global [27]. Running the same model with different initial values is currently the main method to check whether a lower minimum of the deviance can be achieved. Research is currently underway to help with this difficult problem. See also section 7.1. E-SURGE computes the eigenvalues of the Hessian matrix (L428 to L499) based on the singular value decomposition (SVD). These eigenvalues indicate the redundancy and reliability of the parameter estimates. If the eigenvalues are: 1. All strictly positive: E-SURGE has found a local minimum of the deviance function and provides estimates and confidence intervals for all the estimators. 2. Some negative, or positive but near zero: some parameters of the model cannot be identified. In this case, E-SURGE decides how many eigenvalues can be considered as equal to zero according to a threshold. 3. Strictly negative and far from zero: E-SURGE did not reach a minimum and the parameter estimates are unreliable. A warning appears if any eigenvalue is lower than 107 Ay, where Ay is the largest eigenvalue. In this case, the advice is to set smaller tolerances and re-run
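The three situations listed above can be mimicked by a small classifier of eigenvalues. This is a sketch under explicit assumptions: the function name classify_eigenvalues and the relative tolerance rel_tol are hypothetical, and the value 1e-6 used in the example is arbitrary; it is not the threshold applied by E-SURGE.

```python
def classify_eigenvalues(eigenvalues, rel_tol):
    """Sort Hessian eigenvalues into positive, near-zero and negative.

    rel_tol is a hypothetical relative tolerance: an eigenvalue is
    treated as zero when its absolute value is at most rel_tol times
    the largest eigenvalue.
    """
    cutoff = rel_tol * max(eigenvalues)
    positive = [e for e in eigenvalues if e > cutoff]
    near_zero = [e for e in eigenvalues if abs(e) <= cutoff]
    negative = [e for e in eigenvalues if e < -cutoff]
    return positive, near_zero, negative


# Made-up eigenvalues for illustration only.
eigs = [250.0, 12.5, 0.8, 0.00000001, -3.0]
positive, near_zero, negative = classify_eigenvalues(eigs, rel_tol=1e-6)
numerical_rank = len(positive)
print(numerical_rank, near_zero, negative)
```

A non-empty negative list corresponds to case 3 (no minimum reached), while the count of clearly positive eigenvalues gives a numerical rank for the Hessian.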
111. ty varying with time only. All one needs to do then is to define the structure for survival and recapture probabilities to be g and t respectively, exactly as in the tensor notation of this model, (φ_g, p_t). The time-dependent CJS model is written as (φ_t, p_t). How is the MDL phrase time interpreted by GEMACO? Let us consider time variation in survival probability over 2 geographical sites (3 states) and K = 3 occasions, i.e. 2 intervals. The vector of survival parameters θ_S is defined by Equation 42. Defining the model as time, or for short t, creates a matrix X with as many rows as components in θ_S, in the same order. Columns in Table 2 correspond to the time index, and the values 0/1 correspond to indicator variables for time. The constraint matrix X (left part) is generated by GEMACO according to the components of the vector θ_S described in Equation 42. The coordinates (F, To, T, A, G) of the components correspond respectively to From, To, Time, Age and Group, and are displayed in the right part. [Table 2: Phrase time interpreted by GEMACO. Left part: constraint matrix X of 0/1 time indicators; right part: coordinates (F, To, T, A, G) of each component.] How is the MDL phrase age interpreted by GEMACO?
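How the phrase time turns coordinates into indicator columns can be sketched in a few lines. This is a simplified illustration, not GEMACO's algorithm: the function time_design is hypothetical, and only freely estimated survival parameters are listed (rows for complements and fixed elements are left out).

```python
def time_design(coordinates, n_times):
    """One row per parameter, one 0/1 column per time interval.

    coordinates: list of (frm, to, time, age, group) tuples.
    """
    return [[1 if coord[2] == t else 0 for t in range(1, n_times + 1)]
            for coord in coordinates]


# Hypothetical free survival parameters: 2 sites, 2 intervals, 1 age, 1 group.
coords = [
    (1, 1, 1, 1, 1),  # site 1, interval 1
    (2, 2, 1, 1, 1),  # site 2, interval 1
    (1, 1, 2, 1, 1),  # site 1, interval 2
    (2, 2, 2, 1, 1),  # site 2, interval 2
]
X_t = time_design(coords, n_times=2)
for coord, row in zip(coords, X_t):
    print(coord, row)
```

Parameters sharing the same time coordinate share a column, so they are constrained to be equal whatever their site, which is exactly what a pure time effect means.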
112. u denotes the N × U matrix of event probabilities. Together, Π, Φ and B define the general model (GM), under which an umbrella model (UM) retained by Goodness of Fit can be fitted (see Figure 1 for the relation between GM and UM). These matrices are row-stochastic and are called, respectively, the full initial state vector, the full transition matrix and the full event matrix in E-SURGE. The relations of each of these matrices to the classical CAS model and to the memory model are given in [37]. These models belong to the class of Hidden Markov Models (HMM; see for example [29]). [Figure 1: Definition of the umbrella model (UM) under the general model (GM) by setting specific variation in parameters.] 2.3 Umbrella models. An important but not always obvious notion in model selection is the umbrella model (UM). The UM is a general model with specified variation in parameters, which is later subjected to constraints to define biological hypotheses of interest. In other words, the UM is the most general model that can be fitted and the one within which all other models examined are nested. The UM depends on the settings of several main options in E-SURGE. Because there are several choices for these options, there are several potential UMs in E-SURGE, and it is possible to shift from one to another during a session. However thi
113. unction. The biological parameters are probabilities and hence must lie within the interval [0, 1]. To satisfy this constraint but still allow the optimization routines to work with mathematical parameters β that range over (-∞, +∞), a link function is applied to the parameters. The link function is a one-to-one continuous transformation. In practice, very small or very large β values are transformed into 0.0 or 1.0, so some estimates may fall on the boundary of the parameter space. E-SURGE provides two link functions: the generalized logit and the identity link. Transition probabilities must not only lie within the unit interval but must also sum to 1 on each row. Consider two transition probabilities on the same row, φ11 and φ12. The transformations β11 = logit(φ11) and β12 = logit(φ12) will ensure that φ11 and φ12 are both in the unit interval, but will not guarantee that φ11 + φ12 ≤ 1. The generalized (or multinomial) logit, denoted logit_gen, ensures that all parameters and their sum are within [0, 1]. For the first N - 1 transition parameters among the φ_ij (j = 1, ..., N), this transformation is defined as $\mathrm{logit}_{gen}(\phi_{ij}) = \log\left(\phi_{ij} / \left(1 - \sum_{k=1}^{N-1} \phi_{ik}\right)\right)$, j = 1, ..., N - 1 (4). For N non-null transition probabilities on a row, the logit_gen transformation is applied to the N - 1 parameters. When N = 2, the generalized logit reduces to the logit. With more than two states, some additive effects cannot be modelled meaningfully with
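The generalized logit and its inverse can be written down directly from the definition. The following is a minimal sketch for a single row with N - 1 free probabilities; the function names are hypothetical and this is not E-SURGE's internal code.

```python
import math


def logit_gen(phis):
    """Generalized logit of the N - 1 free probabilities of one row."""
    rest = 1.0 - sum(phis)          # probability of the remaining cell
    return [math.log(p / rest) for p in phis]


def inv_logit_gen(betas):
    """Inverse transformation: real numbers back to probabilities."""
    exps = [math.exp(b) for b in betas]
    denom = 1.0 + sum(exps)
    return [e / denom for e in exps]


betas = [0.5, -1.0]                 # arbitrary mathematical parameters
phis = inv_logit_gen(betas)
print(phis, sum(phis), logit_gen(phis))
```

Whatever real values the betas take, each probability is positive and their sum stays below 1, so the row remains a valid row of a stochastic matrix once the last cell is filled by complement. With a single beta, the inverse reduces to the usual inverse logit.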
114. vidual covariates; 2.10 Independent and identically distributed (i.i.d.) random effect: For individual; For groups; 2.11 Conditionality on the first occasion: Open and closed population; ICENPADOY models; 2.12 Non-linear model; 3 GEPAT; 3.1 Overview; 3.2 A combined formulation of the Arnason-Schwarz model; 3.3 The separate formulation for the Arnason-Schwarz model; 3.4 A version of the Arnason-Schwarz model with site fidelity parametrization: Defining sets of intermediate states; Matrices and pattern matrices; 3.5 GEPAT in practice; 3.6 The vector of biological parameter; 4 GEMACO; 4.1 Overview; 4.2 Keywords for main effects: How is the MDL phrase time interpreted by GEMACO?; How is the MDL phrase age interpreted by GEMACO?; How is the MDL phrase group interpreted by GEMACO?; How are the MDL phrases from and to interpreted by GEMACO?; 4.3 Combining ef
115. viduals who left from site 1, left from site 2 and left from site 3. In any case, the numbering of the states is completely arbitrary: changing the numbers simply exchanges rows and columns of the elementary matrices. 3.5 GEPAT in practice. GEPAT is a tool for defining the pattern matrices PΠ, PΦ and PB. In the current version of GEPAT, the user enters the number of steps (LΠ, LΦ, LB) for each kind of parameter. For each step of each parameter, the user enters a pattern matrix by rows. The matrix with elements t_11, ..., t_mn would be entered in GEPAT using a graphical interface. See section 6.3 for details. 3.6 The vector of biological parameter. The components of the vector of biological parameters θ depend on the definition of the pattern matrices. The biological parameters are the set of elements which can potentially be constrained, i.e. the set of elements which are not labelled by a '-' in the pattern matrices. Let us consider as an example the pattern matrix of Equation 29: the vector restricted to the survival matrix, for a given occasion and a given age, is θ = (s1, s2, 1 - s1, 1 - s2, 1) (41). Each element of θ in (41) is labelled in the pattern matrix (29) either by a letter (s1, s2) or by a '*' (1 - s1, 1 - s2, 1). Now the vector θ restricted to the survival, but with full variation in time and age, with 2 occasions of recapture (K = 3), becomes