Contents
1. % A is the result of the multiplication
%Copyright 1996 Peter Dunn
%11 Nov 1996
%error checks
if length(D) ~= size(B,1),
  error('Matrix sizes are not compatible');
end
%end error checks
A = zeros(length(D), size(B,2));
for i = 1:length(D),
  A(i,:) = D(i)*B(i,:);
end

%%%SUBFUNCTION ALIAS
function label = alias(X)
%ALIAS Finds aliased columns of matrix X
%USE labels = alias(X)
% where labels is a vector of the linearly independent columns;
%       X is the original data matrix;
%       X(:,label) is the matrix with aliased variables removed
%For use within glmlab
%Copyright 1996 Peter Dunn
%Last revision: 11 Nov 1996
toler = paramtrs;
XX = X(:,1); i = 1; label = 1;
for j = 1:size(X,2),
  i = i + 1;
  XX = [XX X(:,j)];
  if det(XX'*XX) < toler,
    XX = XX(:, 1:size(XX,2)-1);
  else
    label = [label j];
  end
end

B The Function glmfit.m

The function glmfit organises the fitting of the model: it checks the inputs, deals with the results and looks for errors.

function [beta,serrors,mu,res,covarbeta,covdiff,devlist,linpred,xnames] = glmfit(y,x)
%GLMFIT Fits a generalised linear model (glm)
%USE [beta,serrors,fits,res,covarbeta,covdiff,devlist,linpred] = glmfit(y,x)
% where y is the response variable;
%       For a binomial response, y has two columns:
%       the first contains y; the second has the sample sizes
%       x is the covariate matrix:
%       the number of columns of x is the number of variables
%       beta contains the parameters estimates
%       serr
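The alias subfunction on this page keeps a column only if adding it leaves X'X non-singular. The following is a minimal sketch of that idea, not the GLMLAB code itself: the data matrix, the tolerance value and the variable names are assumptions made for illustration (GLMLAB reads its tolerance from paramtrs).

```matlab
% Sketch: find linearly independent columns by testing det(X'X).
% X and toler are made-up illustration values, not GLMLAB defaults.
X = [1 0 1; 1 1 2; 1 2 3; 1 3 4];   % third column = first + second (aliased)
toler = 1e-8;                        % assumed tolerance
XX = X(:,1);                         % columns accepted so far
label = 1;                           % indices of accepted columns
for j = 2:size(X,2)
    trial = [XX, X(:,j)];            % tentatively add column j
    if abs(det(trial' * trial)) > toler
        XX = trial;                  % column j adds new information
        label = [label, j];
    end                              % otherwise column j is aliased and is skipped
end
disp(label)
```

A determinant test is simple but sensitive to the scaling of the columns; a rank-revealing factorisation would be a more robust alternative in general use.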
2. rownamexv = GLMLAB_INFO_{13}; namelist = rownamexv;
end
%X variables names
if inc_const, %if include constant tag such on
  x = [ones(yrows,1), xvar];
  namexv = ['Const,', cel2lstr(namexv)];
  namelist = str2mat('Constant', rownamexv);
else
  x = xvar; namexv = cel2lstr(namexv);
end
%end of that bit of fiddling
if isempty(pwvar), GLMLAB_INFO_{16} = ones(yrows,1); pwvar = GLMLAB_INFO_{16}; end
if isempty(osvar), GLMLAB_INFO_{17} = zeros(yrows,1); osvar = GLMLAB_INFO_{17}; end
zerowts = sum(pwvar == 0); %Number of points with zero weight
effpts = ylen - zerowts;   %Effective number of points
line = ' ';
%DISPLAY FITTING INFORMATION
if DFORMAT == 1,
  disp(line);
  if isstr(link), l = upper(link); else l = ['TO POWER OF ', num2str(link)]; end
  disp([' INFORMATION: Distribution/Link: ', upper(errdis), '/', l]);
  if zerowts > 0,
    add = 's'; if effpts == 1, add = ''; end
    disp([blanks(14), 'Fitting based on ', num2str(effpts), ' observation', add]);
  end
  if sum(pwvar)/max(pwvar) ~= ylen, %Only enter if weights not all one
    disp([blanks(14), 'Prior Weights: ', namepw]);
  end
  if sum(osvar == 0) ~= ylen, %Enter if offsets is not all zeros
    disp([blanks(14), 'Offset Variable: ', nameos]);
  end
  if isstr(scalepar),
    disp([blanks(14), 'Scale parameter estimated from mean deviance']);
  else
    disp([blanks(14), 'Scale parameter set to ', num2str(scalepar)
3. • To declare the distribution, choose Poisson from the Distribution menu.

[Figure 2: Variables Entered for the Ship Damage Example. Response (y): Damage; Covariates (X): fac(Shiptype), fac(Yearmade), fac(Operation); Prior Weights: Weights; Offset Variable: log(Service).]

• The logarithm link function is the default for the Poisson distribution (it is the canonical link), so no changes need to be made using the Link menu.
• To alter the scale parameter, choose Mean Deviance from the Scale Parameter menu. (For the Poisson distribution, the Scale Parameter is set by default to a Fixed Value of 1.)

On pressing the Fit Specified Model button, results are presented on the screen as shown below:

>> INFO
   Response Variable:       Damage
   Covariates:              fac(Shiptype), fac(Yearmade), fac(Operation)
                            (fitting a constant term/intercept)
   Offset Variable:         log(Service)
   Prior Weights Variable:  Weights

    Estimate       S.E.        Variable
   -6.405902     0.270524     Constant
   -0.543344     0.220941     Shiptype(2)
   -0.687402     0.409369     Shiptype(3)
   -0.075961     0.361511     Shiptype(4)
    0.325579     0.293459     Shiptype(5)
    0.697140     0.186170     Yearmade(2)
    0.818427     0.211217     Yearmade(3)
    0.453427     0.290089     Yearmade(4)
    0.384467     0.147143     Operation(2)

   Deviance: 38.695052 (change: 39.883544)
   Residual df: 25 (change: 5)
   Scale p
4. matrices of the correct type and size. For example, the user could enter magic(4) as the covariates and [1.0 3.2 1.2 0.5]' as the response. Some commands, especially complex ones, will not work. Using vector variables defined in the MATLAB workspace is recommended.

4.2 Menu Items

There are eight menu items in the main GLMLAB window (see Figure 1):

• File: The File menu opens data files (.mat files), loads and saves models (.glg files), exits from GLMLAB and quits MATLAB.

   Error Distribution    Default Link Function                      Default Scale Parameter
   Normal                Identity, $\eta = \mu$                     Mean Deviance
   Inverse Gaussian      Inverse Quadratic, $\eta = 1/\mu^2$        Mean Deviance
   Poisson               Logarithm, $\eta = \log\mu$                Fixed at 1
   Binomial              Logit, $\eta = \log\{\mu/(1-\mu)\}$        Fixed at 1
   Gamma                 Reciprocal, $\eta = 1/\mu$                 Mean Deviance

   Table 1: Default Settings for Chosen Distributions. In the table, $\eta$ is the linear predictor and $\mu$ is the mean (or the proportion in the binomial case).

• Distributions: In the Distributions menu, the user selects the distribution. There are five built-in distributions: normal, inverse Gaussian, gamma, Poisson and binomial. Users can also add their own distributions, which will appear in the menu. Distributions have default link functions and scale parameters, as shown in Table 1.
• Link: In the Link menu, the user selects the link function. There are eight built-in link functions: identity, log, square root, power, reciprocal, c
5. is A, A, A, B, B, B, C, ..., K, L, L, L, the variable can be generated using Newvar = makefac(36,12,3) in the MATLAB workspace.

4.5 Returned Variables

After the fitting of a model, ten variables are made available in the MATLAB workspace:

• BETA: The parameter estimates.
• SERRORS: The standard errors of the parameter estimates.
• FITS: The fitted values.
• RESIDS: The residuals.
• COVB: The covariance matrix of the parameters.
• COVD: The covariance matrix of the differences between parameters.
• DEVLIST: The deviance at each iteration of the fit.
• LINPRED: The linear predictor, $\eta$.
• XMATRIX: The X matrix used in fitting the model.
• XVARNAMES: The names of the X variables.

4.6 Binomial Responses

Binomial response variables require some special handling. Three link functions are unique to the binomial distribution and are otherwise unavailable from the Link menu: the logit (or logistic), probit and complementary log-log link functions. In addition, the response variable must reflect the binomial situation of counts and sample sizes. In this situation, the response variable consists of two columns: the first for the counts and the second for the sample sizes. Note that this differs from the convention adopted in S-Plus, where the two columns of the response are the number of successes and the number of failures. When the data to be analysed are in the form of probabilities, only one column is needed. See Section 5.2 for an exam
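The two-column convention for binomial responses described in Section 4.6 can be illustrated as follows. This is a minimal sketch with made-up numbers; the variable names are hypothetical and are not part of GLMLAB.

```matlab
% Sketch: converting successes/failures (the S-Plus convention)
% to the counts/sample-sizes layout described in Section 4.6.
% All values and names below are illustrative assumptions.
successes = [6 13 18 28]';
failures  = [53 47 44 28]';

% Two-column response: counts first, sample sizes second
y = [successes, successes + failures];

% Single-column alternative: the response as probabilities
p = y(:,1) ./ y(:,2);
n = y(:,2);                % sample sizes, kept separately
disp([y, p])
```

In the single-column form the sample sizes are no longer part of the response, so they must be supplied to the fit in some other way.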
6. through a monotonic, differentiable link function $g(\cdot)$, so that $g(\mu) = \eta$. Usually the link function is the same for all points $i = 1, \ldots, n$. The link function can be chosen independently of the distribution, but a popular link function, called the canonical link function, is found by setting $\theta = \eta$. The standard multiple linear regression technique sets the link function at $\eta = \mu$. Distributions of the form (1) include the normal (Gaussian), inverse Gaussian, gamma, Poisson and binomial distributions.

The deviance of a model, a measure of the distance between $y$ and $\mu$, is given by

$$D(y; \mu) = \sum_{i=1}^{n} w_i \, d(y_i; \mu_i),$$

where $d(y_i; \mu_i)$ is the unit deviance, defined as

$$d(y_i; \mu_i) = 2 \int_{\mu_i}^{y_i} \frac{y_i - u}{V(u)} \, du,$$

and $w_i$ are prior weights.

The X matrix can contain quantitative variables and also qualitative variables (also called categorical variables or factors). Each level of a factor is identified with a variable. However, there are usually dependencies between these variables. This overparameterization produces a singular design matrix, and the singularity must be removed before any model is fitted. There are many different methods available for doing so, including Helmert contrasts (as favoured by S-Plus), orthogonal polynomials, sum-to-zero constraints and corner-point parameterizations (as in GLIM). All are different methods of introducing qualitative variables by having $k - 1$ independent variables for a variable with $k$ levels. The theory of glm's has been documented by
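As a worked illustration of the unit deviance definition on this page, consider two standard variance functions. This is a sketch for orientation only; the integration variable $u$ is notation chosen here.

```latex
% Poisson: V(u) = u
d(y;\mu) = 2\int_{\mu}^{y} \frac{y-u}{u}\,du
         = 2\Bigl[\,y\log u - u\,\Bigr]_{\mu}^{y}
         = 2\Bigl\{\,y\log\frac{y}{\mu} - (y-\mu)\Bigr\}

% Normal: V(u) = 1
d(y;\mu) = 2\int_{\mu}^{y} (y-u)\,du
         = 2\Bigl[\,yu - \tfrac{1}{2}u^{2}\,\Bigr]_{\mu}^{y}
         = (y-\mu)^{2}
```

With unit prior weights, summing the normal case over observations gives the residual sum of squares, which is why the deviance generalises that quantity.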
7. to a wide cross-section of students; different students may be familiar with different packages, and some students may not be familiar with any statistical package at all. All have a firm grounding in MATLAB, however. There are many advantages to this approach: the rich variety of mathematical and graphical functions of MATLAB is readily available; there is no need to purchase or learn a specialist statistics package such as GLIM; the same code can be used on all the different MATLAB platforms; and the graphical environment makes glm's quickly available to students in a short course. The software is not intended to take the place of commercial packages, but rather to bring the world of glm's to users of MATLAB.

Section 2 provides a theoretical background to glm's. Section 3 discusses some of the methods used in the software. The program itself is discussed in Section 4, and two examples are discussed in Section 5. Section 6 concludes with a short discussion.

2 Background Theory

Generalized linear models are based around distributions in the exponential family, such that

$$f_Y(y_i; \theta_i, \phi) = a(y_i, \phi) \exp\left\{ \frac{y_i \theta_i - \kappa(\theta_i)}{\phi} \right\} \qquad (1)$$

for known functions $a$ and $\kappa$. In such models, $\mu_i = \kappa'(\theta_i)$ and $\mathrm{Var}(y_i) = \phi V(\mu_i)$, where $V(\mu_i) = \kappa''(\theta_i)$ is known as the variance function and $\phi$ is the dispersion parameter. The linear predictor, usually referred to as $\eta$, is given by $\eta = X^{T}\beta$. The mean parameter $\mu$ is related to the linear predictor $\eta$
8. using a logit link function. The data is entered into GLMLAB as shown in Figure 5. In particular, take note of the entry for the response variable, which is entered as two columns. After choosing

[Figure 4: Fitting Interaction Terms.]

   Dose                       Number of Beetles    Number of Beetles Killed
   ($\log_{10}$ CS$_2$ mg l$^{-1}$)    $n_i$                $r_i$
   1.6907                     59                   6
   1.7242                     60                   13
   1.7552                     62                   18
   1.7842                     56                   28
   1.8113                     63                   52
   1.8369                     59                   53
   1.8610                     62                   61
   1.8839                     60                   60

   Table 3: Beetles Mortality Data

[Figure 5: Variables Entered for the Beetle Mortality Example. Response (y): Killed and Number, entered as two columns; Covariates (X): Dose.]

the binomial distribution and the logit link function from the menus, the results are given below:

   INFO
   Response Variable:  Killed Number
   Covariates:         Dose
                       (fitting a constant term/intercept)

    Estimate        S.E.        Variable
   -60.717455     5.180701     Constant
    34.270326     2.912134     Dose

   Scaled deviance: 11.232231      Link: LOGIT
   Residual df: 6                  Distribution: BINOML
   Scale parameter (dispersion parameter): 1.000000
   Output variables: BETA SERRORS FITS RESIDS COVB COVD DEVLIST LINPRED XMATRIX XVARNAMES

The results agree with those given in Dobson. An alternative method is to fit the model using
9. xtwx\(x'*diagm(fwts,z));                 %next beta
  eta = x*b + offset;                      %linear predictor, eta
  mu = feval(linkinfo,eta,m,'mu');         %mu
  rdev2 = rdev;                            %last residual deviance
  rdev = feval(distinfo,2,y,mu,m,weights); %residual deviance
  its = its + 1;                           %update iterations
  devlist = [devlist; rdev];               %list of deviances for fits
end
%comments for the preceding statements of the loop, in order:
%d(eta)/d(mu); variance function; fitting weights; adjusted dependent var; X'WX
devlist(1) = []; %remove initial (empty) entry
%Issue warnings where appropriate
if its >= maxits,
  disp(' MAXIMUM ITERATIONS REACHED WITHOUT CONVERGENCE');
  disp([' This is currently set at ', num2str(maxits)]);
end
if rcond(xtwx) < illctol,
  disp(' ILL-CONDITIONED COVARIATE MATRIX');
end
if (rcond(xtwx) < illctol) | (its >= maxits),
  disp(' ');
  disp(' PLEASE NOTE: INACCURACIES MAY EXIST IN THE SOLUTION');
end
%Message about the fit
if format == 1,
  if its < maxits,
    tag = '';
    if its > 1, tag = 's'; end
    disp([blanks(14), 'Convergence in ', num2str(its), ' iteration', tag]);
  end
end
x = xx(:,l); eta = x*b + oo; mu = feval(linkinfo,eta,mm,'mu');

%%%SUBFUNCTION DIAGM
function A = diagm(D,B)
%DIAGM Multiplies D, a diagonal matrix (as a row of diag elements), with matrix B
%USE A = diagm(D,B)
% where D is the diagonal elements of a diagonal matrix
%       B is a matrix of suitable size
10. ]); end
  disp([blanks(14), 'Residual Type: ', upper(restype)]);
end
resetgl;
%Obtain link and distribution file names
if isstr(link), linkinfo = ['l', link]; else linkinfo = 'lpower'; end
distinfo = ['d', errdis];
%Check they exist (user-defined case)
if ~exist(linkinfo), opterr(8, linkinfo(2:length(linkinfo))); return; end
if ~exist(distinfo), opterr(9, distinfo(2:length(distinfo))); return; end
%DO THE NUMBER CRUNCHING
[beta,mu,xtwx,devlist,l,linpred] = irls(y,x,m);
%DONE THE NUMBER CRUNCHING
%DISPLAY RESULTS
disp(line);
GLMLAB_INFO_{23} = 0;
if ~isempty(devlist), %if empty, an error
  curdev = devlist(end);         %deviance for current model
  curdf = effpts - length(beta); %df for current model
  if curdf < 0, %if more estimates than points
    curdf = 0;
  end
  if (strcmp(errdis,'normal') | strcmp(errdis,'gamma')) & (curdf == 0),
    dispers = Inf; %Otherwise, gives warning: Divide by zero
  else
    if ~isstr(scalepar), dispers = scalepar; else dispers = curdev/curdf; end
  end
  covarbeta = real(pinv(xtwx)*dispers);
  %Determine variable names
  varno = 1; estno = 1; xnames = [];
  while estno <= size(x,2),
    vn = namelist(varno,:); varno = varno + 1;
    if strcmp(deblank(vn), 'Constant'),
      numcols = 1;
    else
      if isempty(findstr(deblank(vn), '@')), %no interactions
        if strcmp(vn(1:4), 'Var '), %then a glmlab Var
          numcols = 1;
        else
          evstr size C deblank vn 2 7 evalin base numcols e
11. A Graphical User Interface to Generalized Linear Models in MATLAB

Peter K. Dunn

July 12, 1999

Abstract

Generalized linear models unite a wide variety of statistical models in a common theoretical framework. This paper discusses GLMLAB, software that enables such models to be fitted in the popular mathematical package MATLAB. It provides a graphical user interface to the powerful MATLAB computational engine to produce a program that is easy to use but has many features, including offsets, prior weights, and user-defined distributions and link functions. MATLAB's graphical capabilities are also used to provide a number of simple residual diagnostic plots.

1 Introduction

Generalized linear models, or glm's, were introduced by Nelder and Wedderburn [8] in 1972 as a means of unifying a diverse range of statistical models, including multiple regression, log-linear analysis and probit analysis. The subsequent release of the GLIM software (Francis et al. [4]) enabled the theory to be applied easily to practical problems. Since that time, other popular statistical software packages, including S-Plus (Statistical Sciences [10]) and SPSS (Norusis [9]), have incorporated glm's. This paper explores the use of MATLAB (The MathWorks Inc. [5]) to implement glm's using a graphical interface, in the program GLMLAB.

The motivation for the development of the program comes from the need to teach a short course in statistical models, with a section on glm's,
12. Nelder and Wedderburn [8], McCullagh and Nelder [6], and more briefly by Dobson [2], among others.

3 Methods

The algorithms for fitting glm's to data are well established and robust; see McCullagh and Nelder [6] and Nelder and Wedderburn [8]. The maximum likelihood estimates of $\beta$ can be found using iterative least squares: set $\eta_0$ to be the initial value of the linear predictor and $\mu_0$ the corresponding fitted value (recall that $\eta = g(\mu)$). The link function is linearized so that

$$z_0 = \eta_0 + (y - \mu_0) \left( \frac{d\eta}{d\mu} \right)_0,$$

and $z_0$ is then regressed onto the covariates X with quadratic weights defined as

$$W_0^{-1} = \left( \frac{d\eta}{d\mu} \right)_0^{2} V(\mu_0)$$

to produce a new estimate of $\beta$. The algorithm is repeated until convergence. Starting values for the algorithm are easily obtained from the data: setting $\mu_0 = y$, $\eta_0$ is then obtained from the link function. In some cases the method needs some refining, for example to avoid calculating log(0) when fitting a log-linear model with zero counts.

The program itself, called GLMLAB, consists of numerous m-files (MATLAB functions and scripts) for fitting generalized linear models. Large portions of the code involve setting up the graphical interface, parsing the input strings, checking the inputs and executing menu commands. The example pieces of code included in the Appendices concern the actual fitting of the model, the code probably of most interest to readers of this paper. The actual fitting algorithm is implemented i
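The iterative scheme described in Section 3 can be sketched in a few lines for one specific case, a Poisson model with log link. This is an illustrative sketch only, not the irls.m code from the appendix: the data, tolerance and iteration cap are made-up assumptions, and prior weights and offsets are omitted.

```matlab
% Sketch: iteratively reweighted least squares, Poisson model, log link.
% Data and settings are illustrative assumptions, not GLMLAB defaults.
x = [ones(6,1), (1:6)'];        % constant term plus one covariate
y = [2 3 6 7 8 9]';             % made-up counts (no zeros, so log(y) is safe)

mu  = y;                        % starting values: mu0 = y
eta = log(mu);                  % eta0 obtained from the link function
dev = Inf;
for its = 1:25
    detadmu = 1 ./ mu;                       % d(eta)/d(mu) for the log link
    z = eta + (y - mu) .* detadmu;           % linearized (adjusted) response
    w = 1 ./ (detadmu.^2 .* mu);             % weights, with V(mu) = mu
    xw = x .* repmat(w, 1, size(x,2));       % rows of x scaled by the weights
    b = (xw' * x) \ (xw' * z);               % weighted least squares estimate
    eta = x * b;                             % updated linear predictor
    mu  = exp(eta);                          % updated fitted values
    devold = dev;
    dev = 2 * sum(y .* log(y ./ mu) - (y - mu));   % Poisson deviance
    if abs(dev - devold) < 1e-8, break, end  % stop when deviance settles
end
disp(b')
```

Stopping on the change in deviance mirrors the description above; with zero counts the starting values would need adjusting before the logarithm is taken.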
13. arameter (dispersion parameter): 1.547802
   Output variables: BETA SERRORS FITS RESIDS COVB COVD DEVLIST LINPRED XMATRIX XVARNAMES

The results naturally agree with those given in McCullagh and Nelder [6]. Variables followed by numbers are qualitative variables; the numbers refer to the levels of the variable. They are understood to be relative to the first level of the variable, the same way in which GLIM treats qualitative variables.

After a model has been fitted, the Plots menu becomes available and residual plots can be generated to examine the residuals informally. For example, if deviance residuals were chosen from the Residual Type menu, the Residuals vs Transformed Fitted Values plot is displayed on the screen (see Figure 3).

To include an interaction term between, say, Period of Operation and Year of Construction as a covariate, enter the variables into the main window as shown in Figure 4. Pushing the Fit Specified Model button produces the following output:

[Figure 3: The Deviance Residuals vs the Transformed Fitted Values for the Ship Damage Data. Plot title: Deviance Residuals (Poisson) vs 2*sqrt(Fitted Values); horizontal axis: 2*sqrt(Fitted Values); vertical axis: Deviance Residuals.]

   INFO
   Response Variable:  Damage
   Covariates:         fac(Shiptype), fac(Yearmade), fac(Operation), fac(Y
14. df: %17.0f (change: %13.0f)\n', curdf, deldf);
  fprintf(FID, '%12.6f %13.6f %3.0f %5.0f %s %s\n', curdev, deldev, curdf, deldf, nameyv, namexv);
else %deviance doesn't exist, so this is the first fit
  if ~isempty(scalepar),
    if isstr(scalepar), sdev = curdev; else sdev = curdev/scalepar; end
    if isstr(link),
      fprintf(' Scaled deviance: %13.6f Link: %s\n', curdev, upper(link));
    else
      fprintf(' Scaled deviance: %13.6f Link: Power of %8.5f\n', curdev, link);
    end
  else
    if isstr(link),
      fprintf(' Deviance: %19.6f Link: %s\n', curdev, upper(link));
    else
      fprintf(' Deviance: %19.6f Link: Power of %8.5f\n', curdev, link);
    end
  end
  fprintf(' Residual df: %17.0f Distribution: %s\n', curdf, upper(errdis));
  FID = fopen(DETAILSFILE, 'a');
  fprintf(FID, 'Created at %s on %s\n', mytime, date);
  fprintf(FID, ' Deviance Change df Change Variables\n');
  fprintf(FID, '%12.6f %12.0f %s %s\n', curdev, curdf, nameyv, namexv);
end
devlist = [devlist; curdev];
if ~isempty(namepw) | ~isempty(nameos),
  fprintf(FID, 'The above fit includes the following:\n');
end
if ~isempty(namepw) & ~isempty(nameos),
  fprintf(FID, ' Prior weights: %s; Offset: %s\n', namepw, nameos);
else
  if ~isempty(namepw), fprintf(FID, ' Prior weights: %s\n', namepw); end
  if ~isempty(nameos), fprintf(FID, ' Offset: %s\n', nameos); end
end
fclose(FID);
if ~finite(dispers),
  fprintf(' Dispersion para
15. earmade)@fac(Operation)
                            (fitting a constant term/intercept)
   Offset Variable:         log(Service)
   Prior Weights Variable:  Weights

    Estimate       S.E.        Variable
   -6.515032     0.299392     Constant
   -0.546126     0.225026     Shiptype(2)
   -0.689259     0.416861     Shiptype(3)
   -0.076216     0.368217     Shiptype(4)
    0.321635     0.298830     Shiptype(5)
    0.860880     0.253208     Yearmade(2)
    0.972409     0.312793     Yearmade(3)
    0.280620     0.330341     Yearmade(4)
    0.668061     0.305957     Operation(2)
   -0.387270     0.376715     Yearmade(2)@Operation(2)
   -0.340986     0.404893     Yearmade(3)@Operation(2)
    0.000000     aliased      Yearmade(4)@Operation(2)

   Deviance: 36.907591 (change: 1.787460)
   Residual df: 23 (change: 2)
   Scale parameter (dispersion parameter): 1.604678
   Output variables: BETA SERRORS FITS RESIDS COVB COVD DEVLIST LINPRED XMATRIX XVARNAMES

The final parameter is aliased, in that it contains no information that is not already contained in the other variables.

5.2 Example: Binomial Data

Because of the particular nature of the binomial distribution, a simple example is considered here. The data comes from Bliss [1] (cited in Dobson [2]) and is shown in Table 3. The data involves counting the number of beetles killed after five hours of exposure to various concentrations of gaseous carbon disulphide (CS$_2$). The analysis concerns estimating the proportion $r_i/n_i$ of beetles that are killed by the gas. The variables in MATLAB were named Dose, Number and Killed for the obvious variables. Dobson analyses the data
16. he data contains many interesting features that serve to demonstrate features of the GLMLAB software. For example, there are many structural zeros since, for instance, ships made after 1975 cannot have periods of operation before 1975. There is also an observational zero in the data. The authors consider an initial model of the form

   log(expected number of damage incidents) = $\beta_0$ + log(aggregate months of service)
       + (effect due to ship type) + (effect due to year of construction)
       + (effect due to service period).     (2)

The first term after the intercept is a quantitative variable with a known coefficient of 1. Such a variable is known as an offset. As is usual with count data, the authors decide to use the Poisson distribution with the logarithm link function. They expect overdispersion due to inter-ship variability, and so we override the default fixed scale parameter option in GLMLAB and estimate the scale parameter with the mean deviance.

To fit the model discussed in McCullagh and Nelder [6], variables are entered as shown in Figure 2. Note that the offset variable is entered as log(Service): this shows that GLMLAB accepts transformations entered in place of variable names. log(Service) has been used as the offset because of the model in Equation (2) proposed by McCullagh and Nelder. The variable Weights is a vector of prior weights that omits the structural zeros: it is zero for the structural zeros and one elsewhere. The following options are also set
17. he manual can be found at http www sci usq edu au staff dunn glmlab glhtml html gli html. The GLMLAB Home Page can be found at http://www.sci.usq.edu.au/staff/dunn/glmlab/glmlab.html.

4.3 Buttons

There are seven buttons in the main GLMLAB window (see Figure 1); three of these, along the bottom, are of the most interest:

• New Model: Pressing this button prepares GLMLAB for fitting a new model by restoring default settings and clearing variables.
• Fit Specified Model: Pressing this button fits the model as currently specified.
• Quit glmlab: Quits GLMLAB.

The remaining four buttons, on the left of the main window (for example, one is labelled Response (y)), open windows for selecting numeric variables from the workspace.

4.4 Commands

There are only a few commands that need to be learnt to use GLMLAB:

• glmlab: Once in MATLAB, the user starts the program by typing glmlab at the MATLAB prompt.
• fac: The fac command is used to flag a variable as qualitative (see Section 5.1). fac uses a corner-point parameterization (as in GLIM) that codes each level of the factor as a dummy variable and excludes the first column to preserve full rank.
• @: The @ symbol is used to flag interactions between variables (see Section 5.1).
• makefac: The makefac command is similar to the GLIM command %gl. It allows for easier creation of factors. To create a factor of length 36 that has twelve levels (A, B, C, ..., L, say) occurring in groups of three, that
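The fac and makefac commands described in Section 4.4 can be mimicked with a few lines of plain MATLAB. The sketch below does not call GLMLAB; it assumes, from the example in the text, that the three quantities involved are the factor length, the number of levels and the group size, and every name in it is hypothetical.

```matlab
% Sketch: build a grouped factor and its corner-point dummy coding.
% len, nlevels and blocksize are assumed meanings, inferred from the text.
len = 36; nlevels = 12; blocksize = 3;

% Levels 1..12, each repeated three times: 1 1 1 2 2 2 ... 12 12 12
f = mod(floor((0:len-1)' / blocksize), nlevels) + 1;

% One dummy column per level, then drop the first column (corner-point coding)
D = zeros(len, nlevels);
D(sub2ind(size(D), (1:len)', f)) = 1;
D = D(:, 2:end);

% With a constant term, the design matrix has full column rank
X = [ones(len,1), D];
disp(rank(X))
```

Dropping the first level's column is what keeps the design matrix of full rank once a constant term is included, so each remaining coefficient is measured relative to the first level.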
18. if strcmp(what,'mu'), eta = input1; else mu = input1; end
if strcmp(what,'eta'),
  answ = sqrt(2)*erfinv(2*mu./m - 1);
elseif strcmp(what,'mu'),
  answ = m.*(1 + erf(eta./sqrt(2)))./2;
else
  answ = sqrt(2*pi)./m.*exp(erfinv(2*mu./m - 1).^2);
end
%input2: m
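The three branches of the probit link file above (eta given mu, mu given eta, and the derivative of eta with respect to mu) can be checked against each other numerically. The sketch below is a standalone illustration; the test values and variable names are assumptions and it does not call the GLMLAB link file.

```matlab
% Sketch: probit link written with erf/erfinv, plus consistency checks.
% m and mu are made-up illustration values.
m  = 1;                          % sample size (1 means mu is a probability)
mu = [0.1 0.5 0.9]';

eta = sqrt(2) * erfinv(2*mu./m - 1);                   % eta given mu
mu2 = m .* (1 + erf(eta ./ sqrt(2))) ./ 2;             % mu given eta
detadmu = sqrt(2*pi) ./ m .* exp(erfinv(2*mu./m - 1).^2);   % d(eta)/d(mu)

% Finite-difference check of the derivative
h = 1e-6;
fd = (sqrt(2)*erfinv(2*(mu+h)./m - 1) - sqrt(2)*erfinv(2*(mu-h)./m - 1)) ./ (2*h);
disp([detadmu, fd])
```

The round trip from mu to eta and back should reproduce mu, and the analytic derivative should agree with the finite-difference estimate to several decimal places.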
19. ls. The Journal of Computational and Graphical Statistics, 5:1-10, September 1996.
[4] Brian Francis, Mick Green, and Clive Payne. The GLIM System: generalised linear interactive modelling. Release 4 Manual. Clarendon Press, 1993.
[5] The MathWorks Inc. MATLAB Reference Guide, August 1992.
[6] P. McCullagh and J. A. Nelder. Generalized Linear Models. Number 37 in Monographs on Statistics and Applied Probability. Chapman and Hall, second edition, 1994.
[7] J. A. Nelder. Nearly parallel lines in residual plots. The American Statistician, 44(3):221-222, August 1990.
[8] J. A. Nelder and R. W. M. Wedderburn. Generalized linear models. Journal of the Royal Statistical Society A, 135:370-384, 1972.
[9] M. J. Norusis. SPSS for Windows: base system user's guide, release 6. SPSS Inc., 1993.
[10] Statistical Sciences, Inc. S-PLUS User's Manual, Version 3.3 for Windows. 1995.

A The Function irls.m

The function irls implements the iterative reweighted least squares algorithm that finds estimates of $\beta$.

function [b,mu,xtwx,devlist,l,eta] = irls(y,x,m)
%IRLS Iteratively reweighted least squares, for use with glmlab
%USE [b,mu,xtwx,devlist,l,eta] = irls(y,x,m)
% where y is the vector of y-variables
%       x is the matrix of x-variables
%       m is the vector of sample sizes for a binomial distribution.
%       For other distributions, it is not needed (or use a dummy).
%       b is the vector of parameter estimates
%       mu is the fitted values
%       xtwx is the ma
20. mat(:,1); m = allmat(:,2); weights = allmat(:,3); offset = allmat(:,4);
  %check again for aliasing
  l = alias(x); x = x(:,l);
end
%Starting values
if GLMLAB_INFO_{5} & ~isempty(fitvals),
  %If 'Recycle fitted values' option is selected, use fitted values as starting point
  mu = fitvals;
else
  %else usually use the actual observations as the starting point for the
  %fitted values; in some cases, we must fiddle to avoid problems with zeros
  if strcmp(link,'logit') | strcmp(link,'complg') | strcmp(link,'probit'),
    mu = m.*(y + 0.5)./(m + 1); %over (m+1) in the general case: McC and Nelder, p 117
  elseif strcmp(link,'log') | strcmp(link,'recip'),
    mu = y + (y == 0);          %remove 0's that may be there
  else
    mu = y;
  end
end
%initialise
its = 1;                       %number of iterations
devlist = [];                  %list of deviances after fits
clear allmat
rdev = sqrt(sum(y.^2));        %residual deviance
rdev2 = 0;                     %dummy to enter loop
b = zeros(size(x,2),1);        %initial beta
b2 = 100*ones(size(x,2),1);    %dummy to enter loop
b(1) = 10;                     %extra precaution to enter loop
eta = feval(linkinfo,mu,m,'eta'); %eta = Xb + offset
%iterate
while (abs(rdev - rdev2) > toler & its < maxits),
  %while rdev is changing or max its not reached and b still changing
  detadmu = feval(linkinfo,mu,m,'detadmu');
  vfun = feval(distinfo,1,y,mu,m,weights);
  fwts = 1./(detadmu.^2.*vfun).*weights;
  z = eta - offset + (y - mu).*detadmu;
  xtwx = x'*diagm(fwts,x);
  b2 = b; b =
21. meter cannot be found: 0 degrees of freedom\n');
else
  fprintf(' Scale parameter (dispersion parameter): %16.6f\n', dispers);
end
disp(' Output variables: BETA SERRORS FITS RESIDS COVB COVD');
disp(' DEVLIST LINPRED XMATRIX XVARNAMES');
disp(' ');
%Update parameters
GLMLAB_INFO_{20} = curdev; DEVIANCE = curdev; GLMLAB_INFO_{21} = curdf;
k = 1; %scale parameter used
if ~isempty(scalepar), if ~isstr(scalepar), k = scalepar; end, end
%calculate the RESIDUALS
res = findres(y,m,mu,k);
if ~isfinite(dispers), disp('WARNING: Non-finite dispersion'); end
%calculate other OUTPUT VARIABLES
if nargout > 4,
  covdiff = zeros(size(covarbeta));
  for ii = 1:length(covarbeta),
    for jj = ii+1:length(covarbeta),
      cvd = real(sqrt(covarbeta(ii,ii) + covarbeta(jj,jj) - 2*covarbeta(ii,jj)));
      covdiff(ii,jj) = cvd; covdiff(jj,ii) = cvd;
    end
  end
end
%Allow the proper residual plots to be available
if strcmp(GLMLAB_INFO_{1},'binoml') | strcmp(GLMLAB_INFO_{1},'poisson') | ...
   strcmp(GLMLAB_INFO_{1},'gamma') | strcmp(GLMLAB_INFO_{1},'inv_gsn'),
  set(findobj('tag','resvxf'), 'Enable', 'on');
end
if strcmp(lower(GLMLAB_INFO_{4}),'quantile'),
  set(findobj('tag','qequiv'), 'Enable', 'off');
end
if isempty(GLMLAB_INFO_{10}),
  set(findobj('tag','resvc'), 'Enable', 'off');
else
  set(findobj('tag','resvc'), 'Enable', 'on');
end
else
  err
22. n the file irls.m, as shown in Appendix A. Each distribution used in the program has an associated file that contains the information relevant to GLMLAB, namely the variance function and the deviance (see Appendix C for an example of the gamma distribution). Similarly, each link function has an associated file containing information about the link, namely finding $\eta$ given $\mu$, finding $\mu$ given $\eta$, and finding $d\eta/d\mu$ given $\mu$. An example using the probit link function is given in Appendix D. Placing the relevant information about distributions and links in files enables the user to create new distributions and link functions with a minimum of knowledge of MATLAB programming.

Parameters such as the maximum number of iterations, the parameter accuracy and the ill-conditioning tolerance can be altered from one of the drop-down menus on the main screen.

The program produces an output file called DETAILS.m that contains information about the variables fitted and the deviance from each model (see Section 5). Screen output gives information such as parameter estimates and residual deviance. This information is easily captured using the MATLAB diary command.

[Figure 1: The Main GLMLAB Window. Menus: File, Distribution, Link, Scale Parameter, Residual Type, Options, Plots, Help. Entry fields: Response (y), Covariates (X), Prior Weights, Offset Variable.]

4 The Program

There are many functions written for the analysis of partic
23. omplementary log-log, probit and logit (or logistic). Users can also add their own links, which will appear in the menu. For each distribution, the default link function is the canonical link (see Table 1).
• Scale Parameter: The scale parameter can be set to a fixed positive value, or estimated by the mean deviance.
• Residual Type: Three types of standardized residuals can be chosen: deviance, Pearson and quantile. See Dunn and Smyth [3] for a description of quantile residuals.
• Options: Many options can be set, including recycling fitted values, restoring default options, declaring new models, including the constant term (intercept) in the fit, and changing the fitting parameters.
• Plots: After fitting a model, six different plots are available directly from GLMLAB:
   - Residuals vs Response
   - Residuals vs Covariates (X)
   - Normal Probability Plot
   - Residuals vs Fitted Values
   - Residuals vs Transformed Fitted Values (the transformation is to the constant-information scale of the distribution; see Nelder [7] and McCullagh and Nelder [6])
   - Fitted Values vs Quantile Equivalents
  Of course, using MATLAB's facilities and the variables returned by GLMLAB into the MATLAB workspace (see Section 4.5), numerous other plots can be constructed.
• Help: Contains help screens, plus links to the GLMLAB Home Page and on-line manual, and a simple demonstration. By having a quick link to the on-line manual, users have timely help at their disposal. T
24. ordlg('The model cannot be fitted sensibly: check the inputs and settings.', 'Model not fitted');
  res = [];
  %Disallow residual plots
  set(findobj('tag','rplots'), 'Enable', 'off');
end
%fix up some things for use elsewhere
GLMLAB_INFO_{19} = mu; GLMLAB_INFO_{18} = res;
%enable residual plots
if isempty(GLMLAB_INFO_{18}),
  set(findobj('tag','rplots'), 'Enable', 'off');
else
  set(findobj('tag','rplots'), 'Enable', 'on');
end
%Reset variables
resetgl;
return;

%%%SUBFUNCTION mytime
function time = mytime
%MYTIME Return the current time in the format hh:mm:ss am or pm
%USE mytime
%Copyright 1996 Peter Dunn
%11 November 1996
ss = fix(clock);
if ss(4) > 12, %HOUR
  time = num2str(ss(4) - 12); tag = 'pm';
else
  time = num2str(ss(4)); tag = 'am';
end
if ss(5) < 10, %MINUTES
  time = [time, ':0', num2str(ss(5))];
else
  time = [time, ':', num2str(ss(5))];
end
if ss(6) < 10, %SECONDS
  time = [time, ':0', num2str(ss(6)), tag];
else
  time = [time, ':', num2str(ss(6)), tag];
end

C The File dgamma.m

The function dgamma.m contains the information required by GLMLAB about the gamma distribution.

function answ = dgamma(what,y,mu,m,weights)
%DGAMMA Calculates all kinds of things for gamma distributions
%USE answ = dgamma(what,y,mu,m,weights)
% where y, mu, m and weights are the ob
25. ors are the standard errors
%       fits are the fitted values
%       covarbeta covariance matrix of the parameter estimates
%       res are the standardised residuals:
%       (y - fits) scaled by sqrt of prior wt, scale parameter and variance function
%       covdiff is the var/covariance matrix of standard error of parameter
%       differences
%       devlist is a vector of residual deviance for iterations of the fit
%       linpred is the linear predictor
%       xnames is a string array of the names of the variables as in the output
%Both vectors (y, x) should be the same length: the number of observations.
%Only y is needed. If x is not supplied, only the constant term is fitted.
%ALSO SEE glmlab (fitting glm's using glmfit), where glmfit is used
%Copyright 1996-1998 Peter Dunn
%02 March 1998
%Setup
beta=[]; serrors=[]; mu=[]; res=[]; covarbeta=[]; covdiff=[]; devlist=[]; l=[]; linpred=[]; xnames=[];
%Extract info
extrctgl; %extract GLMLAB_INFO_
DFORMAT = GLMLAB_INFO_{7}; DETAILSFILE = GLMLAB_INFO_{8}; DEVIANCE = GLMLAB_INFO_{20};
y = yvar;
%A check on links/distns
if editerrs, return; end
%Some necessary fiddling
[yrows,ycols] = size(y); %each row an observation
%can't use y=y(:), since the binomial case has two columns
if yrows < ycols, y = y'; end
ylen = length(y);
if ycols == 2, %Binomial case: extract responses and sample sizes
  m = y(:,2); y = y(:,1);
else
  m = ones(size(y));
end
m = m(:);
if exist('rownamexv') == 1
26. ple using the binomial distribution.

   glmlab             Contains general information and files used in starting glmlab.
   glmlab/fit         Contains numerous files for fitting the model and parsing the input.
   glmlab/fit/dist    Contains information about the distributions that can be used.
   glmlab/fit/link    Contains information about the link functions that can be used.
   glmlab/plotting    Contains plotting routines.
   glmlab/misc        Contains other miscellaneous files used in GLMLAB, including formatting and tricks.
   glmlab/glmhelp     Contains the GLMLAB help menu information.
   glmlab/glmlog      Contains log files, fitting parameters and data files that come with GLMLAB.

   Table 2: The Structure of GLMLAB

4.7 Directory Structure

GLMLAB consists of over 70 MATLAB files in a number of directories (or folders). They are structured as shown in Table 2.

5 Examples

5.1 Ship Damage Data

McCullagh and Nelder [6, §6.3] give some data concerning wave damage done to cargo-carrying ships (available in the file shipdam.txt). The data consists of five variables:

• Ship Type: Five types of ship are considered (MATLAB variable: Shiptype).
• Year of Construction: 1960-1964, 1965-1969, 1970-1974, 1975-1979 (MATLAB variable: Yearmade).
• Period of Operation: 1960-1974, 1975-1979 (MATLAB variable: Operation).
• Aggregate Months of Service (MATLAB variable: Service).
• Number of Damage Incidents (MATLAB variable: Damage).

The first three variables are qualitative (factors). T
27. the probabilities r/n as the response (that is, one column of probabilities) and use n as the prior weights. The parameter estimates are identical. The residuals plotted against the fitted values (produced using the Plot Residual vs Fitted Value option) in Figure 6 show a possible curvature.

[Figure 6: Residuals Versus Fitted Values for the Beetle Mortality Example. Pearson residuals (binomial) plotted against fitted values.]

6 Discussion

The program GLMLAB has been discussed for using MATLAB to fit the wide class of statistical models known as generalized linear models. The software is not meant to replace commercial packages, but to provide MATLAB with facilities for dealing with such a class of models. While GLMLAB may not be as powerful as programs like GLIM and S-Plus, it offers an easy-to-use introduction to generalized linear models in a user-friendly and powerful environment. It allows many different types of models to be fitted, including common models such as logit models, ANOVA and multiple regression.

References

[1] C. I. Bliss. The calculation of the dosage-mortality curve. The Annals of Applied Biology, 22:134-167, 1935.
[2] Annette J. Dobson. An Introduction to Statistical Modelling. Chapman and Hall, 1983.
[3] Peter K. Dunn and Gordon K. Smyth. Randomized quantile residua
28. trix X W X
%    devlist is the deviance for each iteration of the fit
%    l is the labels for linearly independent columns of x
%    eta is the linear predictor
%Copyright 1996, 1997 Peter Dunn
%Last revision: 20 October 1997

%Extract info
extrctgl; %extract GLMLAB_INFO_

%load default parameter settings
clear paramtrs; %to ensure reloading of parameters
[toler,maxits,illctol]=paramtrs; %reload parameters

%obtain information about the model to fit
distn=GLMLAB_INFO_{1}; %distribution
link=GLMLAB_INFO_{2}; %link function
format=GLMLAB_INFO_{7}; %output display
fitvals=GLMLAB_INFO_{19}; %fitted values
weights=GLMLAB_INFO_{16}; %prior weights
offset=GLMLAB_INFO_{17}; %offsets

%determine files that contain distribution and link information
if isstr(link),
   linkinfo=['l',link]; %the file containing link info
else
   linkinfo='lpower'; %link info if a power link
end;
distinfo=['d',distn]; %the file containing distribution info

%remove aliased variables from the fit
xx=x; %make other copies
oo=offset; mm=m;
l=alias(x); %determine aliased variables
x=x(:,l); %use unaliased variables

%remove points with zero weights from the fit
if any(weights==0), %removes weights=0 from fitting
   zeroind=find(weights==0);
   allmat=[y,m,weights,offset,x,fitvals];
   allmat(zeroind,:)=[];
   if size allmat 2 6 %if new fit, fitvals is empty
      fitvals allmat 6
   end
   x allmat 5 size allmat 2 1
   y all
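The zero-weight step above amounts to dropping rows before the fit. The following is a minimal sketch with hypothetical data, not the irls routine itself; the weighted least-squares solve at the end is an added illustration of a fit that uses only the retained rows, and the names ending in _fit are invented for this example.

```matlab
% Hypothetical data; names mirror the excerpt where they are legible.
y       = [2; 4; 6; 8];
x       = [ones(4, 1), [1; 2; 3; 4]];   % constant plus one covariate
weights = [1; 0; 2; 0];                 % prior weights
offset  = zeros(4, 1);

zeroind = find(weights == 0);           % observations with zero weight
keep = true(size(y));
keep(zeroind) = false;

y_fit = y(keep);                        % retained observations only
x_fit = x(keep, :);
w_fit = weights(keep);
o_fit = offset(keep);
effpts = length(y_fit);                 % effective number of points

% One weighted least-squares solve on the retained rows (illustration only).
W = diag(w_fit);
beta = (x_fit' * W * x_fit) \ (x_fit' * W * (y_fit - o_fit));
```

Removing the rows, rather than leaving them in with weight zero, keeps the effective number of observations and any later degrees-of-freedom count consistent with the data actually used.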
29. ular statistical problems in MATLAB, including a subset of problems in the generalized linear models framework. However, even with the Statistics Toolbox, MATLAB lacks a procedure for analysing the full range of generalized linear models. This section discusses the program GLMLAB, which has been written to allow MATLAB to analyse such situations. GLMLAB uses a graphical interface and so is easy to learn and use. Among other features, it allows the user to add distributions and link functions that are not included, and to save and load work between sessions. It does not require the user to have access to the Statistics Toolbox. Some aspects of the program are listed below.

4.1 Data Entry Windows

There are four areas in the main screen (see Figure 1) for the entry of data or variable names:

• Response (y): The response, or y-variable, is entered here (but see Sections 4.6 and 5.2 also).
• Covariates (X): The covariates, or X matrix, is entered here. The constant (or intercept) is automatically fitted, but this can be altered by the user in the Options menu.
• Prior Weights: Any prior weights can be entered here; for example, in a weighted regression, to omit structural zeros, or (optionally) with the binomial distribution as in Section 5.2.
• Offset: Any offset variables are entered here. An offset is a variable with a known coefficient (see Section 5.1).

Valid MATLAB workspace variables can be entered, as well as most valid MATLAB commands that produce
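To make the role of an offset concrete, here is a small sketch of how an offset enters the linear predictor with its coefficient fixed at one. All values are hypothetical, the log link is an assumption chosen for illustration, and none of this is GLMLAB code.

```matlab
% Hypothetical values for illustration only.
X      = [ones(3, 1), [0; 1; 2]];    % constant plus one covariate
beta   = [0.5; -0.2];                % coefficients (these would be estimated)
offset = log([100; 250; 400]);       % known term; its coefficient is fixed at one

eta = X * beta + offset;             % linear predictor including the offset
mu  = exp(eta);                      % mean under a log link (an assumption here)
```

With a log link, an offset of log(exposure) makes the fitted mean proportional to the exposure, which is the usual reason for supplying one.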
30. vious
%    what returns what is asked:
%       what=1 returns the variance function
%       what=2 returns the deviance/scaled deviance
%    answ is the answer asked for
%Called by irls, glmfit
%Copyright 1996, 1997 Peter Dunn
%Last revision: 15 May 1997

if what==1,
   answ=mu.^2+0.00000001*(mu==0); %in case mu=0
   answ answ answ>0 answ<0 %removes negative mu's
elseif what==2,
   mu=mu+0.00000001*(mu==0); %in case mu=0
   yy=y+(y==0).*mu; %in case y=0
   answ=2*sum(weights.*(-log(yy./mu)+(y-mu)./mu));
end;

D The Function lprobit.m

The function lprobit.m contains the information required by GLMLAB about the probit link function.

function answ=lprobit(input1,input2,what)
%LPROBIT Calculates all kinds of things for probit link functions
%USE: answ=lprobit(input1,input2,what)
%    where y is the observed y-vector
%    input1 is the input1 needed; determined by what you want out
%    what returns what is asked:
%       what='eta' returns the linear predictor, eta (input1 is mu)
%       what='mu' returns the mean, mu (input1 is eta)
%       what='detadmu' returns the deriv. d(eta)/d(mu) (input1 is mu)
%    answ is the answer asked for
%Called by glmfit and irls
%Copyright 1996 Peter Dunn
%Last revision: 11 November 1996

% input2 is only used for binomial (logit, complg, probit) cases when
%    for finding eta: input1=mu
%    for finding mu: input1=eta
%    for finding d(eta)/d(mu): input1=mu
m=input2;
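The three quantities that the lprobit help text lists (eta, mu and d(eta)/d(mu)) can be sketched as stand-alone anonymous functions. This is not the lprobit source: the function names are hypothetical, and the sketch assumes the binomial mean is on the count scale, so that mu = m*Phi(eta) with m the sample sizes. It uses only erfc and erfcinv from base MATLAB, so no Statistics Toolbox is needed.

```matlab
% Hypothetical stand-alone sketch of a probit link; not the GLMLAB source.
Phi    = @(z) 0.5 * erfc(-z / sqrt(2));        % standard normal CDF
PhiInv = @(p) -sqrt(2) * erfcinv(2 * p);       % standard normal quantile
phi    = @(z) exp(-z.^2 / 2) / sqrt(2 * pi);   % standard normal density

% Assumption: mu is a mean count, m the binomial sample size.
probit_eta     = @(mu, m) PhiInv(mu ./ m);                    % eta from mu
probit_mu      = @(eta, m) m .* Phi(eta);                     % mu from eta
probit_detadmu = @(mu, m) 1 ./ (m .* phi(PhiInv(mu ./ m)));   % d(eta)/d(mu)

m  = [10; 20; 50];                 % hypothetical sample sizes
mu = [2; 10; 45];                  % hypothetical means
eta     = probit_eta(mu, m);       % linear predictor
mu_back = probit_mu(eta, m);       % round trip back to the mean
deriv   = probit_detadmu(mu, m);   % derivative used in the IRLS working response
```

The round trip mu_back should reproduce mu up to rounding, which is a quick sanity check for any user-added link function.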
31. vstr numcols 1 end end end
if numcols==1,
   xnames=str2mat(xnames,deblank(vn)); estno=estno+1;
else
   for j=1:numcols,
      xnames=str2mat(xnames,[deblank(vn),num2str(j)]); estno=estno+1;
   end;
end;
end;
xnames(1,:)=[];

%Display results
if DFORMAT 2 nargout>0 %display the parameter estimates
   if DFORMAT 2
      disp(' Estimate S.E. Variable'); disp(line);
   end;
   jj=0; bb=zeros(size(x,2),1); serrors=bb; estno=1; varno=1;
   for varno=1:size(x,2),
      vn=xnames(varno,:);
      varno varno ti
      if 1 %unaliased variables
         jj=jj+1; se=sqrt(covarbeta(jj,jj));
         if isinf(se) & DFORMAT 2
            fprintf 12 6f hs hs n beta 13 7777777 vn 19
         elseif DFORMAT 2
            fprintf 12 6f 412 6f s n beta jj se vn
         end;
         serrors(estno)=se; bb(estno)=beta(jj);
      else %variables
         if DFORMAT 2
            fprintf 12 6f 4s 4sNn 0 0 aliased vn
         end;
      end;
      estno=estno+1;
   end;
end;
beta=bb; disp(line);

if isempty DEVIANCE %deviance already exists, so we print changes
   deldev curdev DEVIANCE %defined as in GLIM
   deldf curdf GLMLAB_INFO_ 21
   if isstr scalepar
      sdev curdev scalepar
      sdeldev deldev scalepar
      fprintf Scaled Deviance 213 6f change 13 6f n sdev sdeldev
   else
      fprintf Deviance 20 6f change 13 6f n curdev deldev
   end

%Write to DETAILS file
FID=fopen(DETAILSFILE,'a');
fprintf Residual