Home
Stat-JR Advanced User's Guide
Contents
1. This template also stores chains for each of the 65 random effects u As usual we also get the MCMC plots which for u is kind of strange as it includes all 65 chains together so should probably be ignored c J a A o ioo 2000 3000 4000 So00 BI OP o3 04 05 06 07 1 0 stored update 0 15 parameter value 08 0 10 I 0 05 Be 0 00 Py m J se 0 10 1 00 0 15 0 2 0 20 40 930 304 CSCC 0 018 ry Jaa j Lag 0 016 0 014 0 012 a o10 0 008 a 0 006 X 0 004 i O 200 400 600 800 100012001400 updates kernel density enuwauo 1 5 ACF es N a CF MCSE 0 002 u lad Inputs y normexam L2ID school D Normal 5 cons standirt Exercise 7 The 2LevelCat template extends 2LevelMod to allow for categorical predictors Try adapting this template so that it allows the user to incorporate interactions Remember to save your new template under a different name You will find it helpful to look at 1Levelwithinteractions first 10 2 NLevelMod template The NLevelMod template as the name suggests extends the 2Leve Mod template to an unlimited number input by the user of levels of clustering Note that these clusters can be either nested or cross classified We will once again start by looking at the invars attribute to see how it differs from 2levelmod invars NumLevs Integer Number of Classifications
2. For a 64bit machine Select the following web link http sourceforge net projects mingw w64 files Toolchains 20targetting 20Win64 Personal 20Builds Choose 4 5 x86_64 After unzipping the files rename the folder to MinGW 2 5 Download Stat JR source files We finally need to download the source files that are specific to Stat JR You will find these at weblink to be determined but for the alpha release we will have E mailed you a fluff link Unzip StatJR zip to C StatJR or a location of your choosing which we will refer to as the Stat JR base directory When we refer to subdirectories they will be housed under this directory When all the files are installed you will find a file called webtest cmd in the base directory This is the file we will use for running the software If you have installed MinGW to a location other than C MinGW edit the webtest cmd to reflect this The webtest cmd file runs a Python script file called webtest py which can be found in the subdirectory src apps webtest In this directory you will also find a settings cfg which contains paths for various components of the Stat JR system A screen shot of this file is given below alo File Edit Format Yiew Help l n A datasets executable c Program Files Cx86 MLWiN w2 22 mlwin exe Demo executable c statIrR Demo binsDemo exe working_directory c statIR Demo bin winBuGs executable c winBUGS14 winBUGS14 exe working_directory c w
3. R executable C Program Files R R 2 12 0 bin R exe We can again first run the Regression2 template to see what it gives for R so choose Regression2 as the template and tutorial as the dataset and then input the following B http localhost 8080 run Stat JR Demonstrator Template Regression Change Dataset tutorial Change View Summary Configuration Start again response normexam explanatory variables cons standlrt Name of output results outr Is estimation method by MCMC No Choose estimation engine MLwiN R STATA Next Sat Inputs y normexam x cons standlrt Clicking on Next and Run will give the following output 58 7 Mozilla Firefox File Edit wiew History Bookmarks Tools Help Results modelout normexam cons standlrt Min 3 6661000 Min Min 2 935000 ist Qu Median 6995100 ist Qu 0043218 Median ist Qu 0 620710 Median 0 040499 BEBEK H Mean 0 0001179 Mean Mean 0 001810 3rd Qu 0 6787600 3rd Qu 3rd Qu 0 619060 Max 3 6661000 Max Max 3 016000 Call glm formula normexam cons standlrt 1 family gaussian identity data mydata Deviance Residuals Min 1 Median 3Q Max 2 65617 0 51847 0 01265 0 54397 2 97399 Coefficients Estimate Std Error t value Pr gt t cons 0 001195 0 012642 0 095 0 925 standlrt 0 595055 0 012730 46 744 lt 2e 16 tt
4. Text t EstObjects Engine name output elif hasattr t tabledata t EstObjects Engine Text t EstObjects Engine name taboutput else t EstObjects Engine Text t EstObjects Engine name service This code shows the general order of things we start by getting the template s inputs through the call to t input then we get hold of the output file name through t resu tout and then we check out which attributes the template has a way of deciding on the type of template in order to know which methods to run next For our Regression1 example we have an attribute outbug and so we call MethodInput which gets the remaining inputs e g random number seed etc Non model templates will not have an attribute outbug You will note in two places the lines if GUIObject interface unanswered_questions print not finished yet raise GUIExceptionInputNeeded These lines interrogate the window to check whether all inputs have been entered and so stop the additional buttons being shown until all inputs are given Having got all inputs we are ready to call other template functions using the following chunks of code state ready context variables t preparedata context data if hasattr t WinBUGSMod statstemplate t WinBUGSMod elif hasattr t outbug statstemplate t output if hasattr t outlatex latextemplate t outputlatex 28 The preparedata method i
5. Update beta0 double sum0 0 for int i 0 i lt xlength normexam i sum0 cons int i normexam int i betal standlirt int i double suml 0 for int i 0 i lt xlength normexam i suml pow cons int i 2 std trl normal_distribution lt double gt normal tau sum0 tau suml 1 sqrt tau suml beta0 normal eng 5 4 JavaScript generation XMLtoJS In the last section we looked at how to output C code that could be taken away and reused The engine in Stat JR uses C for its number crunching as we have already noted This will allow users to install Stat JR on their own machine and host the software locally If instead we wished to place Stat JR ona server which users have remote access to then all their computations would be performed on the server which is perhaps not very attractive An alternative would be for the Stat JR system to export code on to the local machine so that computations could be performed locally This is the motivation behind the JavaScript generation facilities We will return to the Regression1 example and after setting up the simple regression model the screen should look as follows with the JS button available 44 File Edit view n S History Bookmarks __ http localhost 8080 run Random Seed 1 length of burnin 1000 number of iterations 5000 thinning 1 Equation rendering normexam N 44
6. d_e ncols n 1 n 2 priors Text Priors Uniform Wishart if priors Wishart R Text R matrix v Integer Degrees of Freedom beta ParamVector beta ncols lenbeta You will notice that we construct parameter vectors that are a combination of the string miss and y variable names input and these will be used in the model Let us run the MVNormaliLevel template with the gcsemv dataset that contains two responses for secondary school pupils a written and a coursework test score We will set up the inputs as follows 114 File Edit View History Bookmarks Tools Help AOX A hetpn37 222 14064 8081 run 27 7 BB Cogie A Most Visited Getting Started Latest Headlines O http 137 222 140 64 8081 runy Stat JR Demonstrator Template MVNormal1Level Change Dataset gcsemv Change View Summary Configuration Start again responses written csework explanatory variables for response written cons female explanatory variables for response csework cons female Pnors Uniform Name of output results geseout Choose estimation engine eSTAT WinBUGS eSTAT Random Seed 1 length of bumin 1000 number of iterations 5000 thinning 1 Set Inputs x2 cons female pnors Uniform x1 cons female y wntten csework We allow the two responses to both depend on one predictor a dummy for gender female Note
7. double u0_0 i sb1 gt e 1 1 double ul_O i double ul_O i int vwl length u0_0 3 if itnum 0 sb1 gt e 0 0 0 0001 sbl gt e 1 1 0 0001 112 We can finally run the template by clicking the back arrow on the browser and then clicking on the matrix_sym_invinplace sbl struct _gmatrix val matrix _fre auri d_ul ALL 0 1 2 matrix_sym_invinplac omega_ul omega_ul omega_ul matrix_fre 0 L 2 e sb1 val gt e 0 val gt e 0 val gt e 1 val gt e val gt e val gt e e val Run button The results appear as follows rwishart vwl sbl Results tau chain 1 806 0 04131 ESS 5138 u0_0O chain 0 3747 0 09285 ESS 1532 0 4694 0 1057 ESS 1723 0 1498 0 09254 ESS 1364 ul_0O chain 0 1274 0 06977 ESS 3378 0 1697 0 07278 ESS 2869 0 0006234 0 06879 ESS 3210 omega_ul chain 0 1036 0 02194 ESS 2737 0 02069 0 008522 ESS 1741 0 0183 0 005671 ESS 1214 Cube A ehain 1 13 81 3 635 ESS 1524 16 56 8 738 ESS 965 82 24 29 96 ESS 838 betal chain 0 5565 0 02155 ESS 603 betaO chain 0 01124 0 04278 ESS 209 sigma chain 0 7443 0 00852 ESS 5141 and as usual we get MCMC output There is again a similar prejcode function that will allow you to run this model in JavaScript It is very
8. for i in range 0 int NumLevs Gontext ual rk vstr t kL ParamVector context tau_u str i 1 ParamScalar context sigma_u str i 1 ParamScalar selstr Classification str i 1 context C str i 1 DataVector selstr y DataVector response D Text specify distribution Normal Binomial Poisson if D Binomial n DataVector denominator link Text specify link function logit probit cloglog if D Poisson link Text value ln offset Boolean Is there an offset if offset n DataVector offset if D Normal tau ParamScalar sigma ParamScalar x DataMatrix explanatory variables beta ParamVector beta ncols len x We have needed to replace the code for inputting the level 2 identifier with code to input the number of classifications levels of clustering We have then looped over the number of classifications constructing both the names of the columns that contain the classification vectors which are labelled C1 C2 and the new parameters associated with each classification u1 tau_u1 and sigma_u1 etc To achieve these inputs we have used the context command to construct attribute names by concatenating strings and also a simple string concatenation to create selstr which contains the text string for the input question associated with inputting each classification name
9. model for i in l length normexam normexam i dnorm mu i tau mu i lt mmult x beta i Priors beta0 dflat betal dflat tau dgamma 0 001000 0 001000 sigma lt 1 sqrt tau Finally the function mmult is a function written separately and is used to create the products of the x variables i e matrix multiplication and their associated betas with appropriate indexing outbug model for i in l length normexam normexam i dnorm mu i tau mu i lt cons i beta0O standlrt i betal Priors beta0 dflat betal dflat tau dgamma 0 001000 0 001000 sigma lt 1 sqrt tau This is identical to the code we see under Model in the webtest and is one way of displaying the model we wish to fit Another way is to write the model in mathematical form using the LaTeX language and this is also shown in the web output under Equation Rendering Basically we are using a program called MathJax which will display LaTeX code in a nice format embedded within a webpage The attribute that is used for creating this code is outlatex 20 3 2 3 Outlatex If you click on the right button on the part of the screen showing the Equations and click on the Show Source option you will get a Window popping up that shows the source I MathJax Equation Source Mozilla Firefox O x F http localhost 8080 run begin aligned mbox normexam i
10. starting points We will look at these in more details in the sections for the specific packages 48 6 2 WinBUGS and Winbugsscript py We will begin by looking at the WinBUGS package as the model code we have been creating for the Stat JR engine has many similarities with WinBUGS code We will begin by running the template and viewing the output It should be noted that in order to run the WinBUGS engine Stat JR needs to be able to find it In the webtest directory StatJR src apps webtest you will find a file called settings cfg which contains directory names for each package For example on my machine have WinBUGS executable C Documents and Settings frwjb WinBUGS14 WinBUGS14 exe working_directory C Documents and Settings frwjb WinBUGS14 If you wish to use this option you need to change these paths to point to WinBUGS on your machine and you will need write access to the WinBUGS directory Note that you must set the working_directory to be the WinBUGS directory When you have done this restart the webtest program so that it uses the new settings Select Regression2 from the template choices and tutorial for the dataset Next select the following inputs L hitp ifecahost 6080irney Stat JR Demonstrator Template Regression Change Dataset tutorial Change View Summary Configuration of output re s estimation method by MCMC Yes Random Seed length of burnin 100C number of iterations th
11. 1 JADOUtStat IRs C STAW uenerat aT oE ARETE EAO ECE OA ORN ENOTNE 6 1 1 Stat JR software for scaling statistical HEIGHts ccccccssssssscececeseesesssaeeeeeceseessseseeeeseesees 6 1 2 About the Advanced User s Guide 0 eececcceeececeeeeeeeeeeeaaeeeeeeeceseeeesaeeeeaaeseeeeesaeeeeaaeeneaeeeeeeees 7 2 InstallationiinStructiOn ceciren i a i a a eiia 8 3 A simple regression template example sssssssssssssesssrerrsssseserersrssssrrerntsnsssserernesnssssrreenesnssssreeen 12 3 1 RuAningsa first tem plate 2oc a Acca it ee AA Gi En E nS 12 3 2 Opening the bonnet and looking at the COdGC ccscccccccecessesssnececeessessesseaeeeeeessessesseaeeneess 17 SeQ eV WAVAMNS EAE EE EEE E EEA E neers taneve E 18 3 2 Z OQUDU eterna ee E EE A A oe eee ee 19 32 3 Q tlate kinno a a E A a aa seek a a E REEE e a AU a be AEE 21 B o To ASM loai Eo A a151 Sc EA E A T cits bihiavidieede send 22 3 3 Writing your own first template nascera a ti aa i a ae iR aE 23 4 2 Structure Of template Sinn ei a E a a a a aaa T E aE an 24 4 Templating py ureei re e e a Seat On at eee ti et a 24 5 The parts of the Stat JR system ccccccccccccsssssssecececeseesesneaeeeeecsseeseaeseeeeecesseseaaeseeeesessseseeaeaeeeesens 27 5 1 webtest py and the web interface soen aeaeaei ineeie aein nn Aan EnA EAEE EEEE Paa ANAE 27 5 2 Model Templates and model py cssccccccecsssessssececccecesseseaeseecesceeseseeaeeececsessseseaeeeesenseeees 29
12. Signif codes O 0 001 0 01 0 05S O 4 1 Dispersion parameter for gaussian family taken to be 0 6487338 Null deviance 4049 4 on 4059 degrees of freedom Residual deviance 2631 9 on 4057 degrees of freedom AIC 9766 5 Number of Fisher Scoring iterations 2 E modelout TE Mathittalic Inputs normexam x cons standiirt Loading Web Font TeX Mathiltalic puts y Here we see the full output of the R command glm If we wish to use MCMC estimation we would ak give the following inputs I Mozilla Firefox Edit Yiew History Bookmarks Tools Help _ http localhost 8080 run Stat JR Demonstrator Template Regression2 Change Dataset tutorial Change View Summary Configuration Start again response normexam explanatory variables cons standlirt Name of output results outr Is estimation method by MCMC Yes Choose estimation engine eSTAT WinBUGS MLwiN R R Random Seed 1 length of burnin Poo number ofiterations 5o00 o o thinning E Next Bal Inputs y normexam x cons standlrt Ee This will give the following outputs upon running 59 Here acaron s00 E htp lecalbest2000 run al Results anaes 4375 300 y THOT FOV WOH 4000 Soo PsToeoasraSTaeHAeEATO 10 Stored undate 9 93 parameter value as oo a6 a 5 21 can
13. can be thought of as defining the components that make up a model object so here we are building a model object that contains a data vector called y The text in the brackets is used by the web 18 interface as a piece of text to place on the screen alongside the appropriate input device in the case of a data vector a single drop down box Stat JR distinguishes between Data objects these start with the prefix Data which require user inputs and Parameters these start with the prefix Param which just need to be declared This template therefore has 5 declarations for the 5 components in the template 2 pieces of data y and x and 3 parameters tau sigma and beta that make up the model The DataMatrix declaration for x will correspond to the multiple select list that we saw when running the regression1 py template in Section 3 1 The final command beta ncols len x is used to define the size ncols of beta so that there is a separate beta associated with each explanatory variable 3 2 2 Outbug The outbug attribute gives a definition as a text string of an instance of a model set up using this template The definition is in a language that very much resembles the language used by the WinBUGS package with some minor differences and will be used in Stat JR to create code to run the model The definition is also shown on the screen under the Model banner so you can for example see the definition for the model we fitted to the tutor
14. mondiri batat mgm Dimension scalar Lhs GNEs f Rhs GNEs 6 Imm Parents none Imm Children DS Imm Children S none Parents none Children normexam Coparents cA dflat Statement A w With subs dflat Prior p f 2 s se Likelihood pj normexam 5 T A normexam cons 8 standit f 1 1 length nomexam kad omean Posterior pA jnormexam 8 r pif T_ _p normexam 6 8 sai krh normen 2 A kngh nomenen 5 cons ja x Lexp S cons normexam 3 standit 2 1 engh norem argh nosem S d 2 cons normexam stanatt 6 gt cons Ae 2 i l isi ength nomean T J 2 2 brgh norem cons Ja exp 4 E cons normexam fjstanaut 6 ih i l 2 bagh nomenan AWE hanson ti IE coms j8 Ton nse teas s pare Innemen Rietandtet IR int zi normenxam mul tow cons betal slondlel betal sigma x F Likelihood p normexam 8 8 1 F exp normexam cons standlrtf 1 length nonnexam Jeng omean Posterior p f uormexamf c pA TL p normexam i 8 ist K bagh monaca ATIA gt ajg A dength normama d ry ja x exp I sons nommexam f standit TA brgh morae brgh noram 2 en yy cons normexam A stant d a cons i 2 Lo omi Lo omi 5 Jergih rorrercun ajg brgh morman y A ey Ja exp gt cons normexam 8 standlrt Ja kaginan n y kad momen d Zo co Je
15. 1 gt if temp lt 9 999e29 endif and if miss in theta 117 This code recognises the prefix miss in a variable name and places the conditional statements around the the update step for that parameter This reliance on the parameter name is undesirable and we will hopefully have a better method for making such algorithmic changes in later releases It is helpful at this point to look at the code generated for this example To do this click on the code button and scroll down to the iterate method Here we see the step for the variance matrix omega_e at the top static int itnum itnum 1 if itnum for int i 0 if written i mi if cse mi Note curren else i lt length misswritten i gt 9 999e29 sswritten i written i work i gt 9 999e29 sscsework i csework i r r tly using a uniform prior for variance matrix struct _gmatrix sb matrix_blank 2 2 for int i 0 i lt length misswritten i double e0 double misswritten i double cons i beta0 double female i betal double el double misscsework i double cons i beta2 double female i beta3 sb gt e 0 0 e0 eQ sb gt e 0 1 e0 el sb gt e 1 0 el e0 sb gt e 1 1 el el if itnum 0 sb gt e 0 0 0 0001 sb gt e 1 1 0 0001 matrix_sym_invinpla
16. 1 r Tut F 0 001 0 001 o Wty Tu2 T 0 001 0 001 2 Og Utu Model model for i in 1 length attain attain i dnorm su i tau muti lt cons i bera vrali betal ut pidft cons 1 u2 sid i cons 3 for il in i length ut urii dnorm 0 tau_ur for i2 in 1 lengt u2 u2 i2 dnorm O tau u2 Priors beta0 dtiat betai dfiac tau dyamma 0 001000 0 001000 siga lt 2 aqrticau tau ul dganma 0 001 p haberry Corrected Owan E Moza rieron L tomeis Seen L Seawch Rests Ei Advances 97 We can see that the model code is similar to that for the 2 level model in 2LevelMod Again we have a preparedata function which transfers data to the template but also calculates the lengths of the various sets of random effects def preparedata self self data data numlevs for i in range 0 self objects u 1 len numpy unique self data self objects C return self data data int self objects NumLevs numlevs str i 1 ncols ostr ao LJA The model code is created by the outbug attribute with the following code outbug lt numlevs int NumLevs model for Normal Sty fil if D dnorm mu i tau g G H A Ae 7 o a ji al a ol j Hh UO dpois p i link p i if offset endi endif S mmult x beta i fo
17. 521 Thescompilemodel method e e O Er a sed Pans AEAEE eke 31 5 2 2 The Demo algebraic software SYStCM cccccesssssscecececessessaeeeceescessesaeaeseeeessesseseaeeeesessenees 31 5 2 3 Continuing the go function applydata method ccccccesscecsseceesseceesceceseeecsaeeeeseceeseeees 35 5 2 4 Parameter p a aoe esi AS ves ee eee e ar he AT At 36 5 2 5 Running a model the run method and XMLtOC ccccscccccscessssssseceeeesseesesseaeeeeeesseeseaees 37 5 2 6 Summarising the reSults ccccccssssssecececessesscsececeeecesseseeaeseeeessesseaaeeeeeesseesesaeaeeeeeesseuseageas 40 5 3 C code generation XMLtOPULEC cccccccessssccceessececsesececsesaeeecsessececsesaeeeceesaeeecsesaeeeeeesaeeeesaas 41 bo ie aE Jere ea ee ee 42 53 2 XMETOPULEC wis ev uciect ATTE AI TE E EE A Qin daecessvusvonusvsedsauvedhuceve tieeseveee 43 5 3 3 Example code for our template ccccccccccsssssssscecececessesssaeeeeeescesseeaaeseeeessessesneaeeeeseeseeseaaeas 43 5 4 JavaScript generation XMLtOIS c cccccccccceessececsessececseseceeseaeeecseaaececsesaeeecsesaeeecsesaeeeeseaaeeeeseaas 44 5 4 1 Running the Regression1 CxXaMple ccecccsssssecccecsssessaeeeeecscesseeaaeceeeeseeeseseaeeeeeeesseseaaeas 46 542 JaVAaSCript SOUNCE CODE ie enerne oaae aaeei a a les deddvte nah shaded exdevednavssveodensebeantees 47 6 Including Interoperability ceccccccccecsssesssececcceeseseseeseeeeeceseesesaeaeeee
18. Poisson dpois p i link p i lt if offset S n i endif endif S mmult x beta i Priors for i in range 0 x ncols betaS i dflat endfor if D Normal tau dgamma 0 001000 0 001000 sigma lt 1 sqrt tau o endif Basically in the outbug method conditional statements are started by a if and the code to be conditionally executed is ended by a endif The conditional statements can be hierarchical for example the line if offset is within another if statement and now the endif will correspond to the latest if In our example we have D Binomial and so the code simplifies to outbug model for i in 1l length S y name S y i dbin p i n i link p i l lt S mmult x beta i Priors for i in range 0 x ncols beta i dflat 83 and as we demonstrated for Regression1 we can fill in the calls and unwind the for loop and the Smmult function to get the code we saw in the example output window 8 4 Outlatex Finally the outlatex method now also contains conditional statements outlatex r begin aligned Sif D Normal mbox y _i amp sim mbox N mu_i sigma 2 mu_i amp Sendif Sif D Binomial mbox y _i amp sim mbox Binomial mbox n _i pi_i mbox link pi_i amp Sendif Sif D Poisson mbox y _i
19. as 3 2 El aoo oo az Say o 10 i209 2 6 8 10 2 tag Inputs Y normexam cons standin Dieden Dts Sewer iirnts E Aceanend tner MB Citwinerrisyste T metent woad T wreuciscree T mumtvontran w T hogere wa g The estimates and graphical output are very similar to those from eSTAT WinBUGS and MLwiN using MCMC As with other external packages the webtest code interrogates the estimation engine as follows elif t EstObjects Engine name R func getattr t REstimationInput None if callable func m goR t In fact the interface to R has a similar format to MLwiN as R also requires a packaging up of the data into an appropriate format and a script file to be run We require a method called REstimationinput to be present in the template to allow interface with R and having checked for its presence we run the goR function which has the following code def goR t m RInterface m WriteData t m Script t t REstimationInput m m Execute if m OutStatus gt 1 print Problem at execution print Try running R script str m myRSript in R else m DisplayOut t m Summarise return m We have a similar order of processing as with MLwiN This time an object of type Rinterface is formed and the definition of this class of object is given in Rinterface py which can be found in the 60 src lib EStat external R subdirectory After initialising t
20. cons standirt Clicking the Next button and the Run button will then run Stata and the output will be as follows Results modelout ames dog ci users edcmjc appdatra loccal temp i2 rmpqalvoqiscript_ templates Regression output log iog type tex Opened on 14 Dec 2010 12 57 08 summarize Variable 4059 0002279 9989402 3 6661 3 6661 cons 4059 atandire 4059 0018098 993223 2 935 3 016 g m normexem cons stendirt family gaussien link identity noconstant Iteration 0 log likelihood 4880 24 Generalized linear models Mo of obs 4059 Optimization 7 Residual af 4087 Seale parameter 648733 Devience 2691 913033 1 df Deviance 6487538 Pearson 2631 913033 1 df Pearson 6487338 Variance function V u 2 Gausasan Link function 2 gts eo identity Ar 2 405637 Log likelimood 4880 240015 31076 45 As with the other estimation engines when we set the attribute Engine via the inputs then this is picked up in webtest py by the following chunk of code elif t EstObjects Engine name STATA func getattr t STATAInput None if callable func m goOSTATA t To run Stata the template must have an additional attribute entitled STATA nput which the above code checks for If it is found then the goSTATA function is called which performs the equivalent of the go function when using Stata The code for the goSTATA function is as follows
21. endfor endfor endif if D Normal tau dgamma 0 001000 sigma lt 1 sqrt tau endif float 0 0 001000 Here we see that a different mmult function is used for the orthogonal parameterisation and priors are given for betaort rather than beta in this case Finally there is code to allow us to recover beta from betaort deterministically We construct the product of the orthogmat terms and the betaorts placing signs between the terms unless the orthogmat term is negative We can run the model by clicking on the Run button and we will see the following results for beta1 file fdt yew Hixtory Bookmarks Jools Help mcxs A Most Visited gt Getting Started p Latest Headlines Mttpe 137 222 140 64 8081 runy Results betpe 137 227 140 64 8083 jrun 000 2000 3000 4000 S000 o3 00 05 10 15 20 2 Stored update parameter value 9 0005 20000 40000 60000 80000100000 updates 06 06 Yos Eos o2 o2 0 0 EO at ao P 128 We again see good mixing of the chains and very similar estimates to the blocking approach The other advantage of this orthogonal approach is its generalisability to non normal response models In these cases Metropolis Hastings algorithms are used and so a blocking approach is not so straightforward Exercise 10 Convert this template so that it is analogous to the Regression1 template but uses the orthogonal parametrisation Reme
22. normal examples and has also been implemented in MLwiN For details of how we construct orthogonal vectors we refer the reader to Browne et al 2009 but note that a function named orthog can be viewed in iLevelOrthogParam py Here you will see that we fit a model with the parameters betaort placed in the linear predictor along with data vectors orthcons and orthy8 These data vectors are constructed in the following preparedata function def preparedata self data self data data if self objects orthog self objects orthogmat orth numpy zeros len self data self objects x 0 len self objects x for i in range 0 len self objects x orth i numpy array self data self objects x i tmp om numpy linalg qr numpy mat orth om orthog orth for i in om flat self objects orthogmat append str i for n in range 0 len self objects x tmp numpy zeros len self data self objects x n for nl in range 0 len self objects x tmp numpy array self data self objects x nl om nl n self data orth self objects x n list tmp self objects x map lambda n orth n self objects x 126 return self data We begin by constructing a blank list orthogmat and an empty matrix orth We then implement the orthogonalising algorithm by filling orth with the original x variable vectors and then calling the orthog function om is the m
23. normexam explanatory variables cons standirt Name of output results out Random Seed 4 length of burnin 1000 number of iterations 5000 thinning f1 Next I Inputs y normexam x cons standirt You will see the four inputs correspond to the four inputs in the MethodInput method The template class has a large number of methods and here we will simply just list them and give a very brief indication of what they do We will then revisit some of them later 25 e __init__ this is the initialisation method that we described earlier e get_tags aclass method linked to the tags e render_text renders the text in the browser e applydata this function is more important as it actually connects the real data and initial parameter values to the model object that is being formed e preparedata this function can do preprocessing of the data although this is template specific and so the default function simply attaches the data into the instance of the class under the attribute data e resultdata this function is used to construct the results of a model fit and place within the instance of the class as an attribute named results e monitor_list this function sets which of the variables are to have their changes monitored e Run This may be used in the future to give templates more control over what happens when they are run e resultout A function that gets the only other input the name o
24. str i 1l ParamScalar else context omega_u str itl ParamVector context omega_u str it 1 ncols num 1 num 2 context d_u t str itl ParamVector context d_u str it l ncols num t1 num 2 context priors str i Text Priors Uniform Wishart if context priors str i Wishart context R str i Text R matrix context v str i Integer Degrees of Freedom The template begins like the NLevelMod template but then has an additional section that is used to input the variables that have random effects associated with them at each level and then any priors at those levels are input You will see that we make extensive use of the context function to construct the variable names and that there are different parameters for classifications according to the number of random effects In brief parameters beginning tau_uO and sigma_u0 are the precision and variance of the random effects if there is a single set of random effects those beginning omega_u and d_uare the variance matrix and precision matrix if we have multiple sets of random effects for a classification Finally in this case there are two possible priors and for the 108 informative Wishart priors a prior variance matrix estimate beginning with R and degrees of freedom beginning with v parameter are required Having completed our inputs we now need to click on Run to see what the model code l
25. Bal Model model for i in i length use yli dorm muli 1 0 mu i lt cons i betaO age i betal Priors betad dflat betal dflat Run Code JS Set Inputs x cons age v use The latent continuous response variable written as y in the model code is denoted by use in the model equation We also see that the model code is really just fitting a normal model as if we already know the values of y and if we look at outbug we can see that clearly outbug model for i in 1l length S v yli dnorm mu i 1 0 mu i lt mmult x beta i Priors for i in range 0 x ncols beta i dflat o endfor The code is fairly straightforward so the interesting part is the step generating y to make this the correct model We have a preparedata function def preparedata self data self data data self objects y ncols 1 len data self objects v name return data which ensures that the continuous response vector y is the same length as the observed binary response vector v We will now look at the three methods in this template that we haven t looked at before 11 2 monitor_list method In all the templates we have looked at so far all parameters in the model have had their whole chains stored This can mean that the program uses a large amount of memory if the number of 102 parameters is large and in this template we have a parame
26. Demonstrator Template AverageAndCorrelation Change Dataset tutorial Change View Summary Configuration Start again Operation averages Variables girl normexam z Name of output results Next Set Inputs You can now try applying this template to some of the variables specifying averages and correlation in turn It is worth noting that you do not need to fill in the output results box as this is not used by this template 73 Here is an example of averages Mozilla Firefox Ss Stat JR Demonstrator Template AverageAndCorrelation Change Dataset tutonal Change View Summary Configuration Start again Operation averages Variables avsin standirt girinormexam Name of output results name count mean sd avsint 4059 0 00181069110618 standit 4059 gin 4059 06001478 normexam 4059 0 0001178 Inputs vars avsirt_standirt girl normexam op averages and the correlations for the same four variables https localhost 8080 run Stat JR Demonstrator Template AverageAndCorrelation Change Dataset tutorial Change View Summary Configuration Start again Operation correlation Variables avsirt standirt gir normexam Name of output results a avsit standirt girl nomexam avsit 10 0317018286878 00407057856259 0 287936736229 standit 031701828 1 0 0 05321104220 girl 0 04070578562 0053211
27. Instead the computations are performed within a method called resultdata which performs the required calculation and adds the column to the output The resultdata method is as follows def resultdata self m datalen len self data self data keys 0 if self objects type Uniform Random outvar numpy random uniform size datalen if self objects type Binomial Random outvar numpy random binomial float self objects numtrials float self objects prob size datalen if self objects type Chi Squared Random outvar numpy random chisquare float self objects degreefree size datalen if self objects type Exponential Random outvar numpy random exponential size datalen if self objects type Gamma Random outvar numpy random gamma float self objects shape size datalen if self objects type Normal Random outvar numpy random normal size datalen if self objects type Poisson Random outvar numpy random poisson float self objects exp size datalen if self objects type Constant outvar numpy ones datalen float self objects value if self objects type Sequence outvar numpy arange self objects start datalen self objects step self objects start self objects step if self objects type Repeated sequence outvar numpy array
28. Ja a 2 erg sorte oe poe Z standirt Je Log posterior py standlrt normexam f cons fel Distribution dnorm Jength otras Match A r x standirt nonnexam B cons at Jeg sormesi t E stadit Match B dength norman rt E _ standlrt normexam f cons y reat Sampling parameter p haga econ z r 7 standlst it esgth memean x Sampling parameter r r I standit mt brgh rora t I standit nomexam 8 cons pag aomenm A Sampling distribution B dnorm m a Z standist T dE standirt i l 7 Finally tau the precision has a Gamma posterior distribution normencare mul tau coms betad standiet batat sigma VU LOUD en exp U UULUUUT al Prior p t 110 001000 0 99311604842093 4exp 0 001000r T 0 001000 expl 0 001000 P D F Likelihood normexamfiy8 t 5 exp F nomexam cons 8 standit i 4 length nonmexam ecg zermezan Posterior p tmormexam 8 p x pit n p normexan i 5 t O tangy remem y r bagh emam i normexam 8 cons f standit a exp 0 001000 She ha Neen a i exp bangi ramasan 0 001000 imi DO Saga aan normexam f cons 8 standit 2 Brgh mezua 3 fo 001000 z normexam cons standit 2 99 0 Sheth rere kigh nomena J normexam Aysous P standit Log posterior 0 001000 and gt t 0 999 0 Slength norm
29. Log posterior d Z cons nonnezan A stnadit a i Distnbution doom length somes Match A r E coms nonmexam 6 standirt s j ngh sormesen t F cons Match B wie Iength pore xan coer sons nonmexan A standht Sampling parameter u aonn t rs cons 1 1 bagh meman Sampling parameter r r E cons mt brgh rermesase t Z cons normexam 8 standirt paghi somezmi 3 y A y Sampling distribution dnorm x Zo on T cons mt x The program decides which lines in the model specification involve beta0 It then finds the prior and likelihood parts before merging together to find the posterior and log posterior It then uses its 33 features for matching conjugate distributions to spot that the posterior for betaO is a normal distribution Finally it gives the conditional posterior distribution in terms of other objects in the model We can view beta1 and see similarly a Normal posterior normexam mul tow coms betel stondlet Betal mgma oS 4 Likeliwood p normexami lt t exp normexam cons stadt i 1 length nommexam kaga aomen Posterior p P hnommexamf t spl I p normexam 8 0 2 bogh normama a kng nore I standit 8 a bld zr standlrt nonmexam fsons 8 it Lit 7 brgh narama brgh moma 3 ew d I standit normexam cons B d E standit 82 Lite lags 5 Ergi morenn A brgh torres q standlrt Jae x exp E standit normexam 8 cons
30. Next button and then specify the Name of output results we have used out here as shown below I Mozilla Firefox ATT ES File Edit view History Bookmarks Tools Help _ lt 4 rat OE c x amp a Dy tte ttocathost soaojruny EC ia SS Stat JR Demonstrator Template Regression1 Change Dataset tutorial Change View Summary Configuration Start again response normexam explanatory variables cons standlrt Name of output results fout Next Set Inputs y normexam x cons standirt X Find Matplotlib Next 4 Previous 4 Highlight all I Match case Done etg Click Next and we will next have to fill in some estimation method specific boxes The built in engine within STAT JR is an MCMC engine and so we hence have to type in some boxes specific to this type of estimation as follows 7 Mozilla Firefox Edit View Histor lt gt a http flocalhost 8080 run BB Most visited Getting started http localhost 8080 run Stat JR Demonstrator Template Regression1 Change Dataset tutorial Change View Summary Configuration Start again response normexam explanatory variables cons standirt Name of output results out Random Seed 1 length of burnin foo ooa number of iterations boo oS thinning i Bal Inputs y normexam x cons standlrt 14 Click Next The configur
31. The rest of the code is similar to before It should be noted that a 2 level model in this template has one classification as we are not considering level 1 here We now demonstrate use of this template for a cross classified example with two higher classifications Using a dataset from Fife in Scotland we look at the impact of both primary school and secondary school on the attainment of children at age 16 Select the template nLeve Mod from the template list and the dataset xc from the dataset list Select the inputs as shown 96 Marsilla Fired oor 2 L P 1 https localhost 8080 run Stat JR Demonstrator Template NUavelMod Change Dataset xc Change View Summary Configuration Start again Number of Classifications 2 Classification 1 pid Classification 2 sid response attain fy distribution Normal explanatory vanables conswrq Name of output results cout Choose estimation engine eSTAT WinBUGS MLvaN eSTAT Random Saad length of burnin number of iterations thinning sai Inputs D Normal C2 sid Y attain x cons vrq C1 pid NumL evs 2 Clicking on the Next button will display the mathematical formulation of the model and the model code Mozilla Firefox Equation rendering attain N u 0 yas z M Bycons i fiva i pidi 3 sidig w re 2 N pidi N 0 o3 P N Sidi N 0 a2 Bo 1 fixt rt T 0 001 0 001 o
32. a chapter of its own 30 5 2 1 The compilemodel method The compilemodel method basically takes the modelspec and sends it to an algebra system that constructs the conditional posterior distributions for the parameters in the model The code is as follows def compilemodel self if self modelspec print Compiling Model self directory tempfile mkdtemp handle name tempfile mkstemp dir self directory f os fdopen handle wtb f write self modelspec f flush close executable EStat configuration get Demo executable print Running executable cmdline s s s Identify s executable name self directory self identify cwd if EStat configuration get Demo working_directory print cd ing from cwd cwd os getcwd os chdir EStat configuration get Demo working_directory os system cmdline if cwd os chdir cwd This code basically writes the model specification to a file and then runs another piece of software using this file as input 5 2 2 The Demo algebraic software system The algebra system that we have developed for the Stat JR system with main developer Bruce Cameron will take a WinBUGS like input file and produce output files for each parameter giving their full conditional posterior distribution either as a known distribution with formula for each parameter or as an unknown distribution function We can run thi
33. amp sim mbox Poisson pi_i mbox link pi_i amp Sif offset mbox S n _i Sendif Sendif S mmulttex x r beta i Sfor i in range 0 len x beta_ i amp propto 1 Sendfor Sif D Normal tau amp sim Gamma 0 001 0 001 sigma 2 amp 1 tau Sendif end aligned tit and as with the outbug function we achieve conditional operations via the if and endif pairs Again for our example we can strip out the conditionals to get outlatex r begin aligned mbox y _i amp sim mbox Binomial mbox S n _i pi_i mbox link pi_i amp S mmulttex x r beta i W for i in range 0 len x beta_ i amp propto 1 sendfor end aligned If you look at the code you will see other functions for the various other software packages but we will not discuss these here 84 Exercise 6 Convert the more general 1Leve Mod template into a specific logistic regression template To do this copy 1LevelMod py to another filename e g 1 evellogit py and simply remove the conditional statements and additional options so that the template only allows the user to fit logistic regression models You can check the template works by attempting the example given in the chapter with your new template 85 9 Including categorical predictors The template 7Level Mod can handle many response types but treats all predictor variables as if they are continuous This
34. has become quite long mainly due to the conditional statements for the different distribution types and the fact that we also allow the use of the t distribution for random effects although we will not demonstrate or discuss this here We see that the term u L2ID name i cons i has been appended to the linear predictor Note that due to a bug in the algebra system we need to include the multiplication by cons i otherwise the posterior is incorrectly calculated The chunk of code for j in l length u if D name t j dt 0 tau_u df else ol er 94 u j dnorm 0 tau_u endif specifies the random effect distribution and finally the code tau_u dgamma 0 001000 0 001000 sigma_u lt 1 sqrt tau_u specifies the prior distribution for the precision of the random effects The outlatex function is adapted in very similar ways and so for brevity we omit this code here Running this template produces a large amount of output because there are 65 level 2 residuals An extract of the output is as follows tau chain 1 767 0 0392 ESS 5004 sigma_u chain 0 31 0 03178 ESS 2961 u chain 0 374 0 09149 ESS 1682 0 5033 0 1037 ESS 2545 0 1679 0 08954 ESS 1464 betal chain 0 5633 0 01254 ESS 3984 betaO chain 0 003276 0 0395 ESS 252 tai e chain 710 73 2 18 ESS 2977 sigma chain 0 7524 0 008347 ESS 4986
35. input as the Number of missing input Note if you want to predict for other individuals you need to form new columns of the same length as the data although the values below the number of missing row will be ignored We therefore select the following inputs Mtp AZALDA 0000 rue Stat JR Demonstrator Template 1LevePred Change Dataset tutorial Change View Summary Configuration number of iterations 5000 emal xm cons standit Clicking on the Next button gives the following output 130 Mozilla Fir File Edit View History Bookmarks Tools Help ro 9 C 4 http 137 222 140 64 8081 run SB Googie 2 2 Most Visited gt Getting Started Latest Headlines _ http 137 222 140 64 8081 run Equation rendering Semen n 2 normexam N pi 0 Hi Bacons 8 standlrt By x1 By x1 r T 0 001 0 001 a 1 r Model model for i in 1 length normexam normexam i dnorm mu i tau mu i lt cons i betaO standirt i betal for j in 1 10 mnormexam j dnorm mumiss j tau mumiss j lt cons j betazxfd0 standlrt j betazxfdi dummy j ddummy mnormexam j Priors beta0 dflat betal dflat tau dgamma 0 001000 0 001000 sigma lt 1 sqrt tau Run Code us Set Inputs y normexam x cons standirt nmiss 10 D Normal xm cons standirt Done d Here you will notice that w
36. itnum 0 for int i 0 i lt length missS y 0 i sfor var in range 0 n if S y var i gt 9 999e29 miss y var i tyl var i Sendfor Next we have code for generating the step for the level 1 variance matrix This is almost identical to the random slopes code except we need the crossproduct of the level 1 residuals e instead of the higher level random effects u this crossproduct needs to be constructed which is done in the initial code using the mmult2 function Note currently using a uniform prior for variance matrix struct _gmatrix sb matrix_blank S n S n for int i 0 i lt length missS y 0 i lt lenbeta 0 gt for i in range 0 n double e i double miss y i il S mmult2 context x str it l beta i lenbeta lt lenbeta len context x str i 1 gt endfor for i in range 0 n for j in range 0 n sb gt e i j e i e j ae endfor endfor Once constructed the remainding code follows the same pattern as for random slopes and so for brevity we omit this code here it can be viewed in MVNormal1Level py We have not yet mentioned how the missing data are updated and this is currently done in a slightly undesirable way Buried deep in the toC functions of XMLtoC XMLtoPureC and XMLtoJS are two little bits of code thus if miss in theta lt temp theta replace miss
37. mss iinu e e a tas eee aaah cova sto eases das Dea 91 10 2 Nl velmod template eiiiai otuseeseceehigen undated tnbgnuunsibeseeedbadeuntit acevsendivende 96 11 Using the Preccode PreJcode Methods ccsccccssecesseecssscecseceeseecsseeccsaececasecssseecsseeecsaeeeaees 100 11 1 The ProbitRegression template ccccccccccccecsesssssaeeeeecessessaeseceeecesseseaeseeeessessesseaeeeesesseeeegs 100 112 monitor list metho da aeaee nkire tenini ranee ekaa aeaaea Wisc dice ana iaae aa a aaa iaeaea Ra aa 102 11 3 PreCEGASMMETHOG sisecesce ses ces ware esse vee share a e ia eias 103 ADA prejcode METHO ses scccs eee ctaveczeccsscitcs ea bec tain dacica a aa a a aaa taeda a e a aas 105 12 Multilevel models with Random slopes and the inclusion of Wishart priors ccccceeeeeees 107 12 1 An example with random slopes sssscccceceseesecnsceceeeceseeseaeseeeescesseseeaeeeeeessessesseaeeeesesseeees 107 12 2 Preccode for NIEVWIthRS x riccceteeeteivetieneaeels aheiev a a a cde 110 12 3 Multivariate Normal response models cccccsessseeececeesessaeseceeecesseseeaeseceesseeseseaeeeesesseeegs 114 12 4 The preccode function for this template ccccesscccecessessnseseceeeceesesseaeseeeesseesesseaeeeeeenseeegs 116 13 Improving mixing 1levblock and TlevorthOgparamM ccccccccccsssceceessececeesseceseesteseeeeseeeeeees 121 13 1 Rats example nret a ae a a tothe Hebets Leia ee es Aes ee E Es 121 13 2 The Llevelblock te
38. name c str len MLwiNOut data keys 1 LevRes n LwiNOut macro addt LevRes n LwiNOut macro setv 1 LevRes n LwiNOut macro fpar 0 LevRes n for i in range len self objects x LwiNOut macro addt self objects x i n for p in range len self objects x J 1 0 LwiNOut params beta str 1 1 1 1 MLwiNOut params sigma2e 1 0 number of parameters MLwiNOut numparams self objects beta ncols 1l MLwiNOut nummodparams len MLwiNOut params keys Here the macro code sets up the response and level 1 id columns A level 1 residuals constant column is then created and named and added at level 1 before all the predictors are added as fixed effects Several attributes of the MLwiNOut object are also set here Having run the template specific code the next call is to the script method This method completes the macro file by including code for running the model in MLwiN which is estimation method specific The macro code for MCMC then saves the output in the line self macro dwrite c1101 c str self nummodparams 1101 1 n self macro self directory self objects myModel name MCMCChain txt n This is followed by lots of lines that create some fancy plots which currently reside in a temporary directory but should eventually link in when we implement the E books functionality Finally the worksheet is saved and MLwiN is
39. of starting values and data in places but perhaps the most interesting chunk of code is the iteration loop produced by the itcode substitution To find this code you could search for iterate in the browser window The code contains steps to update the four parameters Of these parameters tau betal and betaO have stochastic updates whilst sigma is a derived quantity and therefore has a deterministic update In this simple example it is instructive to look at the code below and compare with the mathematics displayed in the Graph Node window of the algebra system The two sets of expressions should be identical to one written in mathematical form and the other in C code The C code is as follows void iterate Update tau double sum0 0 for int i 0 i lt length normexam i sum0 pow normexam int i beta0 cons int i betal standlrt int i 2 std trl gamma_distribution lt double gt gamma 0 001 0 5 length normexam tau 1 0 0 001000 sum0 2 gamma eng Update sigma sigma 1 sqrt tau Update betal double sum0 0 for int i 0 i lt xlength normexam i sum0 standirt int i normexam int i beta0 cons int i double suml1 0 for int i 0 i lt length normexam i suml pow standlrt int i 2 43 std trl normal_distribution lt double gt normal tau sum0 tau suml 1 sqrt tau suml betal normal eng
40. out an example of these inputs by selecting the template 2Leve Mod and the dataset tutorial and specifying the following inputs 91 Mozila Firefox 25 _ http 127 0 0 1 8080 run a Stat JR Demonstrator Template 2LevelMod Change Dataset tutorial Change V Summary Configuration A Stan again response normexam Level 2 ID i specify distributor Normal explanatory variables cons standirt Name of output results tutout Is estimation method by MCMC yes MABON engin TAT WinBUGS MLWiN e STAT RandomSeed M aaa length of burnin 1000 numberofiteratons S000 ooo S thing M a Nea Mo Inputs y normexam L210 school D Normar x cons standirt j Clicking on Next will bring up the maths and model code 92 B http localhost 8080 run Equation rendering normexam N u 07 H Bacons By standlrt schoolt school NO Gi Ayal ixl t T 0 001 0 001 o 1 t Tu T 0 001 0 001 o2 1ht Model model for i in 1 length normexam normexam i dnorm mu i tau mu i lt cons i beta standirt i betal u school i cons i for j in i length u u j dnorm O tau_u Priors betad dflat betal dflat tau dgamma O 001000 0 001000 sigma lt 1 sqrt tau tau_u dgamma 0 001000 0 001000 sigma_u lt 1 sqrt tau_u Inputs y normexam L2ID school D Normal
41. output results ratsout Random Seed 1 length of burnin 1000 number of iterations 5000 thinning fal Next Bal Inputs y y36 X cons y8 We will then Run the model by clicking the Next and Run buttons If we look at the output and change the MCMC plot so that we have beta1 visible we will see the following 121 file Edit yew Hytory Bookmart Jools Help a AH A herpe 137 222 140 64 8081 eury A Mont Vinted Getting Started h Latest Headlines http 137 222 140 648081 run Results 02 a er b 1000 2000 3000 4000 5000 2 0 4 0 6 0 8 LO 1 2 1 4 L6 1 8 2 10 stored uodate __ parameter value __ J 0 20000 40000 60000 80000100000 updates belal inputs fy y36 Y cons y8 We see that both the regression coefficients have very small effective sample sizes and the chain we observe is not mixing well Aside from being a small dataset a difference between the rats and the tutorial dataset is that the variables have not been centred unlike stand rt in the tutorial dataset This means that the joint posterior distribution of betaO and betal has a large correlation and so if we update these parameters separately we will have problems We will look at two templates that will rectify this problem 13 2 The 1LevelBlock template Most software packages will update the parameters betaO and beta1 together in one multivariate normal block As we have seen in chapter 12 the curren
42. print n 4 4po VE if len self params p chain shape gt 1 out for i in range 0 len self params p chain out lt br gt g g ESS d prec self params p chain i mean prec self params p chain i std ESS self params p chain i else results p chain S g g ESS d prec self params p chain mean prec self params p chain std ESS self params p chain print results p chain print Z Score str self params p Zscore 40 r else PEINE INAN Ppop ier 1 if isinstance self params p rs list out for i in range 0 len self params p rs out S g g prec self params p rs i Mean prec self params p rs i StandardDeviation results p summary out print results p summary else results p summary 3 g g prec self params p rs Mean prec self params p rs StandardDeviation results p summary We have now completed the go function and control returns to the webtest code The webtest code continues as follows for p in m params keys if m params p monitor True t objects p sixway p m params p chain if param_input in i t objects current_plot if param_input context model m datasets datasets t EstObjects outdata name t resultdata m if hasattr t outbug statstemplate t output if hasattr t outlatex latextemplate t ou
43. self for m in old_mon if not miss in m mon append m return mon There is also a preparedata method that is used to set the length of the missing data vector to equal the original response vector def preparedata self data self data data for i in range 0 len self objects y self objects miss self objects y i ncols 1 len data self objects y i return self data We next turn our attention to the preccode function 12 3 1 The preccode function We will deal with the code of the precode function in sections We begin with a definition of the mmult2 function that we will use to work out the linear predictors for each response The mmult2 function is specifically useful for multivariate response models as it contains a count parameter which informs us which element of beta to start with in the linear predictor for each response extraccode lt def mmult2 names var index count out wee first True for name in names if first False QUE t et else first False out double name index var str count count 1 return out lt n len context y gt 116 We next have some code that is run in the first iteration which sets any non missing responses in the parameter vector that will be updated to their value in the actual data These will then not be updated in the main code static int itnum 1 itnum 1 if
44. sim mbox N mu_i sigma 2 mu_i amp beta_ O mbox cons i beta_ 1 mbox standlrt i beta_O s propto 1 beta_1 amp propto 1 tau sim Gamma 0 001 0 001 sigma 2 amp 1 tau end aligned C This code is created via the outlatex function and we will now look at how we get from outlatex to this source for our example The generic code is as follows outlatex r begin aligned mbox y _i amp sim mbox N mu_i sigma 2 mu_i amp S mmulttex x r beta i Sfor i in range 0 len x beta_S i amp propto 1 sSendfor tau amp sim Gamma 0 001 0 001 sigma 2 amp 1 tau end aligned he i We have three steps as with the outbug function firstly we will substitute normexam for S y outlatex r begin aligned mbox normexam _i amp sim mbox N mu_i sigma 2 mu_i amp S mmulttex x r beta i Sfor i in range 0 len x beta_ i amp propto 1 sendfor tau amp sim Gamma 0 001 0 001 sigma 2 amp 1 tau end aligned ty 21 Next we can evaluate the for loop with in our example x having 2 columns outlatex r begin aligned mbox normexam _i amp sim mbox N mu_i sigma 2 mu_i amp S mmulttex x r beta i beta_0 amp propto 1 beta_l amp propto 1 tau amp sim Gamma 0 001 0 001 sigma 2 amp 1 tau end aligned trig and f
45. software Then opening up the bonnet and explaining aspects of the code in particular what is different in this template from earlier templates We mentioned in chapter 3 that in the code for a template we are effectively giving a definition for a specific subclass of the template class and that there is a generic template class that these templates inherit attributes from In this chapter we will give a little more detail of some of these additional attributes before explaining further how the model description generated by the outbug attribute links in to the rest of the software in Chapter 5 You will find many of the inner workings of the Stat JR system in python files within the subdirectory src lib EStat The file Templating py contains amongst other things the class definition for the generic template 4 1 Templating py This python file contains the class definitions of many of the building blocks used in Stat JR including the different data and parameter types we have seen used in the invars attribute of the Regression1 class We are here primarily interested in the template class which is the last class defined within the file although the code is near the top of the file The template class contains definitions of lots of methods that can be used for a template Again we are using object orientated programming terminology here and so a method is a function described within a class definition which performs operations on the instance of
46. standard logistic distribution though this distribution is not as easy to deal with as the normal We will start as usual by looking at the invars attribute which is quite short invars y ParamVector v DataVector response x DataMatrix explanatory variables beta ParamVector beta ncols len x tr You will see that the column containing the 0 1 response is actually stored as v in this template as we will derive y as the underlying continuous response As always it helps to demonstrate the template with an example so we will fit a probit regression model chapter to the Bangladeshi dataset i e replace the logit link of chapter 7 with the probit link Select ProbitRegression from the template list and bang from the dataset list and then fill in the template as shown below Stat JR Demonstrator Template ProbitRegression Change Dataset bang Change View Summary Configuration Start again response use explanatory variables cons age Name of output results bangout Random Seed 1 length of burnin 1000 number of iterations 5000 thinning f4 Sat Inputs x cons age v use e OOOO I IO Clicking on the Next button we see the model described mathematically 101 http localhost 8080 run Equation rendering uset N w 1 where use gt 0 ifuse 1 and use lt 0 if use 0 U Bycons f age By xl
47. that both responses contain missing values as there are some pupils with only a written score and some with only a coursework score The missing values are given the value 9 999e29 and this value will be looked for in the preccode function The model output is as follows ile idt yew Hion Bookmarts Jocis Heip De A tettye 197 222 140 408r ie oer F A Mont Visited Getting Started Latest Hesdines nttpe 237 222 140 04 8081 rav Equation rendering written ce Cee csework Mu Ho ocons 8 female Hy Bacons 83fermale Qe xl 35 x1 B x1 8 x1 33 1 Model mode t for 2 An Srlengehiwesetend 2a minsweitten s misscnework t madi muii Ge Ol Gets delih beta0 female i be cons s betaz femele i beta3 misgveieren s dfler aeee dflat Priors berta dflarti Run Code JS Set Inputs x2 cons female priors Uniform X1 cons female Y written csework y ee a We see again the use of the dnormal2a function and also that we have included dflat statements for both the misswritten and misscsework responses to let the algebra system know that these are 115 parameters We will not look in detail at the outbug method as we can see the output it produces on the screen There is a monitor_list method to avoid storing chains for the missing data def monitor_list self mon old_mon Template monitor_list
48. the Stat JR system works more details about how the system fits together We will do this by showing the code for several of the templates we have written and giving a detailed explanation of what each function does and even in places each line of code An initial question posed by potential template writers is what computer language are templates written in and when told Python then ask whether we provide an introductory chapter on this language We have not written an introductory chapter on Python good books include Hetland 2005 and Lutz and Ascher 2004 as it has many features and we will be interested in specific aspects of the language some of which are non standard and specific to Stat JR In fact many of the functions that make up a template in Stat JR are designed to create text blocks in other languages for example C WinBUGS or any of the other macro languages associated with the software packages supported via inter operability This is not to say that reading up on Python is without merit and certainly Python experts will find writing templates initially easier than others though more because of their programming skills than their Python skills per se Our advice is therefore to work through this guide first and try the exercises and have a Python book as a backstop or look at the Python website at http www python org doc for further documentation for when you are stuck writing your own templates We will now give instruct
49. thinning seed seed applydata m variables t3 3833838888 allmissing True for p in m params values if p tree allmissing False if allmissing True print Error in algebra system nothing generated return None if hasattr t customsteps for p in t customsteps print Custom step for parameter p m params p tree for p in mon m params p monitor True m run m Summarise return m Most of what this code is doing is setting up the model object m The first line initialises the model object and this code can be found by looking at the __init__ method for the model class in model py Basically this is just giving names to all the attributes within the model class and setting default values for these attributes def _ init__ self self directory self modelspec self params self data self burnin 5000 self iterations 100000 self thinning 1 self seed 1 self callback lambda i p i self identify True self code self preccode self postccode self prejcode self postjcode self __initialised True Having initialised the model we next pass on some of the information from the template to the model the model specification and any pre or post C code that is needed see chapter 11 for an example of such code There is then a call to a method compilemodel which does a large number of tasks and which needs
50. values avslrt standlrt Xvalues school Y label London Reading Test Xlabelt School no Name of output results Run Set Inputs xaxislabel School no yaxislabel London Reading Test xaxis school yaxis avsirt standlrt Clicking on the Run button gives the following graph http 127 0 0 1 8080 run London Reading Test 0 10 20 30 40 50 60 70 School no Inputs xaxislabel School no xaxis school yaxislabel London Reading Test yaxis avsirt standirt ee Here we see plotted the actual intake scores for each pupil against school number in green and the school average in blue 77 Exercise 5 Simplify this template to allow only a single y variable Try adding a main title to the graph and varying the symbol and colours To do this you will need to look at the documentation for the plot command in matplotlib http matplotlib sourceforge net api lot_api html matplotlib pyplot plot You can then maybe make this an option for the user to choose Remember however to rename the template before you start 78 8 Single level models of all flavours A logistic regression example We have so far met two model templates Regression1 which could be used to fit normal response multiple regression models using the Stat JR built in MCMC engine eSTAT and Regression2 which allowed the same models to be fitted in other statistics package
51. we mentioned briefly when describing the generic template definition in templating py The generic definition of the method preparedata basically attaches the data to the template but we can write template specific versions of the method that replace this generic function and allow preprocessing of the data In this case the code is as below def preparedata self data self data data for var in self objects x self objects selection del self objects for var in self objects if self objects var list set sel sort remove uniqvals uniqvals data var list numpy array self data var self objects x uniqvals uniqvals uniqvals for i in self origx name append var Jis f data var str int i astype float name append var Save user s original str int i del self datal var else self objects x name append var self objects beta ncols len self objects x return self data This code firstly retains the named predictor variables in origx by copying the contents of x to origx and then deleting them from x Then the code loops over the variables via the second for statement and conditional the if statement on a particular variable being categorical does some processing The lines uniqvals list set self data var uniqvals sort uniqvals remove uniqvals 0 find all unique values in the categorical predictor whic
52. 0 289078129886 avsirt 4059 0 75596 0 63766 0 00181069110618 0 314831650064 standirt 4059 2 935 3 016 0 00180981670362 0 993100685573 schgend 4059 1 0 2 0 146563192905 0 498817437244 student 4059 0 0 197 0 376999260902 30 2606908983 denom 4059 10 1 0 1 0 0 0 girl 4059 0 0 1 0 0 60014781966 0 489867751763 normexam 4059 3 6661 3 6661 0 000117862232077 0 998817146739 70 ee We see that schgend now goes from 1 to 2 as expected Let us now look at the code for this template As with Generate this template has an invars input variables attribute and a resultdata method These are both quite short invars incol DataVector Input column name rangestart Text Start of range rangeend Text End of range newval Text New value EF Here the invars attribute has the four inputs that we saw when running the template We now turn to the resultdata method def resultdata self m Copy data into numpy array for processing var numpy array self data self objects incol var var gt float self objects rangestart amp var lt float self objects rangeend float self objects newval self data self objects incol list var return self data The resultdata method firstly copies the original column to the object var and then performs the recoding by finding the values in the original column within the correct range and then replacing them with the newval Not
53. 0 4000 Sooo Lao 145 L30 135 160 165 170 parameter value 10 OB 08 06 0 6 w u Fos Eg Bt 2 0 0007 0 0006 0 0005 w 0 0004 0 0003 0 0002 0 0001 O 20000 40000 60000 80000100000 updates Here you can see that the out of sample predictions mnormexam have been estimated with standard errors We hope that this and other templates give you a flavour of the possibilities that are available in the Stat JR package The package is still evolving and so we very much welcome feedback and suggestions for improvement We also encourage you to send us your own templates for inclusion with the software Exercise 11 Try modifying the 1LevelCat template to allow for out of sample predictions Remember to save the new template under a different name 132 15 Solutions to the exercises Rather than fill many pages with Python code we have placed potential solutions to each of the exercises in a solutions directory on the e STAT website Below we list the filenames for each of the exercises Exercise 1 LinReg py Exercise 2 Random py Exercise 3 RecodeNew py Exercise 4 AvandCorr py Exercise 5 XYPlotNew py Exercise 6 1LevelLogit py Exercise 7 2levelcatinteractions py Exercise 8 nlevelcatint py Exercise 9 2levelwithRS py Exercise 10 orthogregression py Exercise 11 1levelcatpred py 133
54. 0422088 10 0 114602648345 normexam 0 11460264 8345 10 Inputs Yars avsirt standirt girl normexarn op correlabon Exercise 4 Why not try to add an option to this template to give the standard error of the mean i e the standard deviation of the sample mean as an estimate of the population mean to do this you will 74 need to calculate a square root and this can be done using the numpy sqrt function Also try to allow the template to output both averages and correlations together for the same variables you will need to include them both in the same table output Remember to rename the template first 7 4 XYPlot template The final template in our whistle stop tour of non model templates is a graphing template Python has excellent graphing facilities and so we have created a few very basic graphing templates that demonstrate some of these facilities The xyp ot template allows the user to plot one or more Y variables against an X variable on the same plot The template has an invars input variables attribute as shown below invars yaxis DataMatrix Y values xaxis DataVector X values yaxislabel Text Y label xaxislabel Text X label Here we have four inputs the various Y variables and the corresponding X variable to plot against and then axes labels for each axes For a graph template we use another special method graphdata that can be used t
55. 6 04 q o4 4 03 02 0 2 01 0 0 00 Ee 25 30 e e 100 0 to 2 4 6 8 10 12 0 00020 _ tag bag 0 00018 h 0 00016 H 0 00014 W 0 00012 0 00010 0 00008 0 00006 aS 0 00004 0 00002 35000 40000 60000 80000100000 updates Inputs y use x cons age link Togit D Binomial n cons We see that the age coefficient is positive and statistically significant indicating that older women are more likely to use contraceptives We now look at the template to see what the code looks like We will only concern ourselves with the Stat JR built in engine here and so will not look at how the template implements interoperability this will be an extension of the code for Regression2 in chapter 6 8 1 Invars The code for invars input variables is as follows tre invars y DataVector response D Text specify distribution Normal Binomial Poisson if D Binomial n DataVector denominator link Text specify link function logit probit cloglog if D Poisson link Text value ln offset Boolean Is there an offset if offset n DataVector offset if D Normal tau ParamScalar sigma ParamScalar x DataMatrix explanatory variables beta ParamVector beta ncols len x Compared to Regression1 you will see that we have introduced an input D for distribution along with conditional sta
56. 65 v 1 60 e g 155 1 50 1 45 10 0 8 0 6 0 4 0 2 0 0 0 2 0 0008 0 0007 0 0006 w 0 0005 0 0004 0 0003 0 0002 0 0001 ESS 4276 kernel density i o 1000 2000 3000 4000 5000 45150 155 160 165 170 175 stored update 0 05 parameter value 0 04 0 03 l 5 0 02 0 01 1 0 00 bs gp 0 01 o 20 40 60 80 100 120 9 2 4 6 8 10 12 lag Lag O 20000 40000 60000 80000100000 updates tau Inputs D Normal origx vrband standlrt schgend schgend_cat Yes vrband_cat No y normexam X vrband standlrt schgend_2 schgend_3 This completes this chapter and is the last single level model we will meet for a while Another extension would be to allow the inclusion of interactions in the model This has been done in the template 1Levelwithinteractions Here the modifications are done in the invars and outbug outlatex methods as no new predictor variables are created Instead the model code includes multiplications between the variables We will leave you to try out this template as an exercise standlrt_cat No 90 10 Multilevel models We now move to templates for models for more complex data structures In this chapter we look at multilevel modelling templates templates that allow random effects to account for clustering in the data We will look at two templates of increasing complexity firstly a t
57. 7 0 862 0 6337 0 6827 p 54 betal chain 0 007491 0 002662 ESS 2224 beta chain 0 2592 0 02375 ESS 2178 y Inputs x cons age v use For y we only store summary statistics for each element of the y vector and hence just output a posterior mean and standard deviation for each element This is also why no graphs appear for y below the results 11 3 preccode method We have seen that the code works but we now need to look at how the step for updating the latent y variable is incorporated into the code This is done via the preccode function which for this template is as follows 103 def preccode self extraccode lt ol 1 def mmult names var index out wee count 0 for name in names LE count gt 0 G t p ah out double name index var str count count 1 return out oe Vv double mean for int 1 0 i lt length y it mean mmult x name beta i if S v name i lt 0 yli rtnormal mean 1 2 0 0 else yli rtnormal mean 1 1 0 0 return MakoTemplate extraccode render self objects This function could have been even shorter except we need to include in here a definition for the mmult function that is used to construct the linear predictor this time as C code The actual C code is double mean for int 1 0 i lt length y it mean mmult x
58. An Advanced User s Guide to Stat JR alpha release Programming and Documentation by William J Browne Bruce Cameron Chris M J Charlton Danius T Michaelides and Camille Szmaragd Centre for Multilevel Modelling University of Bristol Electronics and Computer Science University of Southampton December 2010 Acknowledgements The Stat JR software is very much a team effort and is the result of work funded under two ESRC grants The LEMMA 2 programme node Grant RES 576 25 0003 as part of the National Centre for Research Methods programme and the e STAT node Grant RES 149 25 1084 as part of the Digital Social Research programme We are therefore grateful to the ESRC for financial support to allow us to produce this software Both nodes have many staff that for brevity we have not included in the list on the cover We acknowledge therefore the contributions of Fiona Steele Harvey Goldstein George Leckie Rebecca Pillinger Kelvyn Jones Paul Clarke Mark Lyons Amos Liz Washbrook Sophie Pollard and Hilary Browne from the LEMMA 2 node at the Centre for Multilevel Modelling for their input into the software David De Roure Luc Moreau Tao Guan Toni Price Mac McDonald lan Plewis Mark Tranmer Paul Lambert Emma Housley and Antonina Timofejeva from the E STAT node for their input into the software A final acknowledgement to Jon Rasbash who was instrumental in the concept and initial work of this project Contents
59. Gibbs MH Gibbs MH if self EstObjects Engine WinBUGS self EstObjects nchains Integer number of chains else self EstObjects nchains 1 self EstObjects seed Text Random Seed self EstObjects burnin Integer length of burnin self EstObjects iterations Integer number of iterations self EstObjects thinning Integer thinning if self EstObjects Engine R if self objects D Binomial if self objects link logit if self objects link probit print other link functions for binomial distribution not implemented choose another engine self MethodInput else self EstObjects Engine Text Choose estimation engine MLwiN R STATA MLwiN R STATA We see the two inputs EstM and Engine that correspond to the first two method inputs in the example There are then chunks of code for MLwiN and WinBUGS before the standard inputs for seed burnin iterations and thinning There is then some code for R Camille Not sure why this is not before standard inputs and finally the Engine choices when MCMC is not chosen 82 8 3 Outbug The outbug attribute now also contains conditional statements as shown below outbug model for i in 1l length S y S y i l if D Normal dnorm mu i tau mu i lt endif if D Binomial dbin p i n i link p il l lt endif if D
60. LevelBlock zi Select dataset alevchem E Serl bang bes83 birds classsize classsize_inpute diag gcsecomp gcsemv growthdata height hungary The Configuration box shows that the template Regression1 is the default template and the default dataset is called tutorial We will click on the Run button to run the template and obtain the following display 12 2 Mozilla Firefox bttp localhost 8080 run Template Regression1 Change Dataset tutorial Change View Summary Configuration Start again response school hd explanatory variables normexam d Next Set Inputs Here you will see a Configuration box which is looking for inputs for a response a single select list and explanatory variables a multiple select list We will start by fitting a very simple regression model regressing normexam exam scores at age 16 on standirt A London reading test taken by the same children at age 11 We will select normexam as the response and cons and standirt as the explanatory variables as shown below 7 Mozilla Firefox view Histor Stat JR Demonstrator Template Regression1 Change Dataset tutorial Change View Summary Configuration Start again response normexam gt jd explanatory variables school cons cons standirt vrband schav act sehgend student denom girl normexam zf m Inputs 13 Click on the
61. MLwiN The goMLwiN function consists of constructing an MLwiNoutput object that will be used to run the defined model using MLwiN The definitions of the MLwiNoutput class are in the file MLwiNoutput py in the subdirectory src lib EStat external MLwiN The code for the goMLwiN function is as follows def goMLwiN t MLwiNOutput input t write t WriteMLwiNOut m Script t execute output t Summarise return m Bs888ct538383 As with goWinBUGS we have a series of steps which involve constructing files to send to MLwiN an execution of MLwiN function and postprocessing of the results The input method is first called This method constructs a datafile ready to send as input to MLwiN The write method is called next This method constructs the start of the macro that is needed to set up and run the model in MLwiN The generic code for inputting the data and naming the columns that is common to all templates is 56 written by this method The template specific WriteWLwinOut method is then called to fill out the rest of the MLwiN macro For the Regression2 template the WriteWLwiNOut method is as follows def WriteMLwiNOut self MLwiNOut LwiNOut macro t resp self objects y n LwiNOut macro iden 1 self objects y n LwiNOut params LwiNOut macro put str len self data self objects y 1 c t str len MLwiNOut data keys 1 n LwiNOut macro
62. Normal z MCE a a LE Dittpr 127 0 0 1 8000 run Stat JR Demonstrator Template 1LevelCat Change Dataset tutonal Change View Summary Configuration s estimabon method by Choose esimation engine eSTAT WinBUGS Inputs O Normal onge number of iterations Random Seed fi length of burnin TO thinr ng 1 schgend_cat yes Y normexam X If we press Next again we will get the outbug and outlatex output as follows _ http localhost 8080 run Results tau chain 1 569 0 03448 ESS 4270 beta3 chaii 0 2364 0 0278 ESS 1887 beta2 chail A betal chai ran 0 01259 ESS 5604 betal chain 09 H i sigma chain ae 7984 0 008776 ESS 4278 1000 2000 3000 4000 501 B45 kernel density N FD 100 40 1 45 150155160165 L70175 0 0008 oe 0 0007 0 0006 w 0 0005 0 0004 0 0003 0 0002 0 0001 O 20000 40000 60000 80000100000 updates Inputs D Normal origx cons standirt schgend schgend_ oe wit MA normexam x cons standirt schgend_2 schgend_3 cons_cat No Here we see that in both the maths and the model code the expression for the linear predictor has two terms to represent two of the possible categories for school gender schgend_2 and schgend_3 The important method here is the preparedata method which
63. The actual code that is produced can be viewed by right clicking on the LaTeX and selecting show source It looks as follows left begin array 1 u 2 _ 0 school i u 2 _ 1 school i end array right amp sim mbox N left left begin array 1 0 0 end array right Omega 2 _f u right Looking at the model code we have not included a prior for d_u1 and so here we again resort to writing our own preccode chunk 12 2 Preccode for NlevwithRS 110 We will look at the preccode in sections The preccode is used to add a step for updating the precision matrix d_u1 and the corresponding variance matrix omega_u1 Looking at the start of the code extraccode lt numlevs int NumLevs gt static int itnum 1 itnum 1 for i in range 0 numlevs lt n len context x str i 1 gt Sat yk Note currently using a uniform prior for variance matrix struct _gmatrix sbS i 1 matrix_blank S n S n for int i 0 i lt length u0_ i i or j in range 0 n or k in range 0 n sb i 1 gt e j k double u j _ i i double u k _S i il endfor endfor oe o9 This first section stores the number of levels numlevs for looping purposes and within the loop the number of random effects is constructed as n because for classifications with only one set of random effects no fur
64. al normal eng Update beta0 double sum0 0 for int i 0 i lt length use i sum0 cons int i y int 1 betal age int i double suml1 0 for int i 0 i lt length use i suml pow cons int i 2 std trl normal_distribution lt double gt normal sum0 suml1 1 sqrt sum1 betaO normal eng Here we see that the preccode part as the name suggests appears at the start of the iteration loop This is important as the y variable needs initialising before the other parameters are updated and updating it first ensures y is positive when use is 1 and negative when use is 0 11 4 prejcode method In section 5 4 we discussed the alternative code generation of JavaScript that can be run in the local browser If we wish our template to work with JavaScript as well we need to create an equivalent prejcode method which for this template is as follows def prejcode self extraccode lt le 1 def mmult names var index out count 0 for name in names 105 if count gt 0 out 7 out name index var str count count 1 return out ol Vv for var 1 0 i lt length y it var mean mmult x name beta i if v name i lt 0 yli rtnormal mean 1 2 0 0 else yli rtnormal mean 1 1 0 0 return MakoTemplate extraccode render self objects In fact Javascript and C code are pr
65. and Regression2 template Here select the template and dataset and next choose inputs as follows Mozilla Firefox ft Was Template Regression2 Change Dataset tutonal Change View Summary Configuration Start again response normexam explanatory vanables cons standit Name of output results outmivan Is estimation method by MCMC Yes Choose estimation engine eSTAT WinBUGS MlwN R MLN Select MCMC sampling method for fixed effect Gibbs MH Gibbs Select MCMC sampling method for residuals Gibbs MH Gibbs Random Seed 1 length of burnin 1000 number of iterations 5000 thinning 1 Nox Sel Click on the Next button and then the Run button to call MLwiN in fact some MLwiN windows should appear and then disappear The following output is shown 54 _ http localhost 8080 run Results pas AN JA 1000 2000 3000 4000 5000 ese 0 60 0 62 0 64 0 66 0 68 0 70 10 Stored update 0 02 parameter value os 0 01 I 06 0 00 ofl og Q 04 P a uy 0 01 0 2 0 0 ae ez 0 253930 8080100 a FSB 102 0 00030 Lag lag 0 00025 0 00020 w 4 0 00015 0 00010 a 0 00005 0 00000 gt 5000 40000 60000 80000100000 updates sigma2e Z inputs Y normexam cons standirt Loading Wob F ont TeX Main Reguiar Apart from the speed of estimation which is much quicker than Stat JR and WinBUGS the results are very similar We could
66. and is using the distributions generated from the algebra system 47 6 Including Interoperability The Stat JR package has its own new algebra system and estimation engine as illustrated in the last chapter Another aspect of the package is its ability to interface with other software packages and in particular but not exclusively their estimation engines This feature doesn t however come for free and translator methods that are often template specific need writing to achieve interoperability In this chapter we return to the regression modelling template and take a look at how we can include interoperability via an adapted template Regression2 py We will here describe work on the four software packages that have thus far been considered for interoperability namely WinBUGS MLwiN R and Stata 6 1 Regression2 py In this section we will consider a second template Regression2 that extends the first template by including the option to fit the same model in a variety of packages If you look at the code in the Python file you will see that this template has identical code for the attributes defined in Regression1 but in addition has methods to allow the user to call other programs We will begin by looking at the Method nput attribute that we described briefly in chapter 4 def MethodInput self self EstObjects EstM Boolean Is estimation method by MCMC if self EstObjects EstM self EstObjects Engine Text Choose estimat
67. andard substitutions that we will see later A use of this function can be seen for example in the template MVNormaliLevel which we will look at using the built in estimation engine in section 12 3 The goWinBUGS function is in webtest py higher up the code def goWinBUGS t m WinBUGSScript m PrepareWBugsInputs t m WriteScript t m ExecuteScript m m ImportWBUGSOutput t Summarise return m Here you will notice an initialisation of m to a class WinBUGSScript which can be found in src lib EStat external WinBUGS subdirectory The following lines then call various methods for the class to perform all the operations required to fit the model using WinBUGS and convert the results into the format needed for Stat JR We will not describe each line of code in detail but just give brief descriptions of the role of each function The first initialisation line will run the __init__ method and thus construct a blank WiNBUGSScript object The next line calls the Prepare WBugsinputs method This method is long and contains the bulk of the work in translating from the inputs from the template to files required for WinBUGS For fitting a model in WinBUGS three files are required a model file a data file and an initial values file An additional script file is required to actually get WinBUGS to use these three files to fit the model The method starts by setting up filenames for the various files to be created The next se
68. arameters in a slightly different way The final function that is called is the Summarise function which then constructs the text that will appear in the browser under Results 52 We will limit our descriptions here but will endeavour to write further documentation on interoperability later We next look at MLwiN 6 3 MLwiN MLwiN is another package with MCMC functionality but which can also fit multilevel models using classical statistical methods In Stat JR we offer the option of fitting models in MLwiN using either approach where available Having seen how WinBUGS links into Stat JR we will now show the similarities and differences in how MLwiN links in The first observation is that MLwiN doesn t use a model description language like Stat JR or WinBUGS It is also more restrictive in terms of which models it can fit which means that it will not be available for all templates although MLwiN can be used in many of the templates we have written thus far Although MLwiN has a GUI user interface it also has a macro language and it is this language that we make use of when writing interoperability code for Stat JR So as with WinBUGS we need to tell Stat JR where to find MLwiN and this is found in the settings cfg file for example MLwiN executable C Program Files MLwiN v2 22 mlwin exe Note if you change the directory then you will need to restart Stat JR 53 Let us demonstrate using MLwiN and MCMC for the tutorial dataset
69. as an alternative run the model not using MCMC by choosing the following inputs I Mozilla Firefox B http localhost 8080 run 2 Stat JR Demonstrator Template Regression2 Change Dataset tutorial Change View Summary Configuration Start again response normexam explanatory variables cons standlirt Name of output results outmiwin Is estimation method by MCMC No Choose estimation engine MLwiN R STATA MLwin gt Next Bal Inputs y normexam x cons standlrt After clicking Next and Run you get almost instantaneous answers 55 Results sigma2e se 0 014393 mean 0 64841 betal se 0 012727 mean 0 59505 betaO se 0 012639 mean 0 0011931 sigma2e v Inputs y normexam x cons standirt You will notice that the results produced are simply point estimates and standard errors as the method doesn t construct chains We also do not see the plots that we get with MCMC methods If we explore webtest py we will find that there is a chunk of code for when the estimation engine is set to MLwiN elif t EstObjects Engine name MLwiN func getattr t WriteMLwiNOut None if callable func m goMLwiN t Each template must have a method WriteMLwiNOut to be able to work with MLwiN and this is checked for prior to the call to goMLwiN which plays the role of the go function when using
70. at JR into 46 the rest of the system for this alpha release and so the output does not get sent to an output file and we do not get more extensive diagnostic plots etc 5 4 2 JavaScript source code We can also view the source code used in the JavaScript option by clicking on View Page Source in the browser This will give the JavaScript code in a separate window and once again we can look at the chunk of code for performing one iteration function iterate Update tau var sum0 0 for var i 0 i lt length normexam i sum0 Math pow normexam i beta0 cons i betal standlrt i 2 tau rgamma 0 001 0 5 length normexam 0 001000 sum0 2 Update sigma sigma 1 Math sqrt tau Update betal var sum0 0 for var i 0 i lt xlength normexam i sum0 standlrt i normexam i beta0 cons i var suml 0 for var i1 0 i lt xlength normexam i suml Math pow standlrt i 2 betal rnormal tau sum0 tau suml 1 Math sqrt tau suml1 Update beta0 var sum0 0 for var i 0 i lt length normexam i sum0 cons i normexam i betal standlrt i var suml 0 for var i 0 i lt length normexam i suml Math pow cons i 2 beta0 rnormal tau sum0 tau suml 1 Math sqrt tau suml1 Once again although in a different computer language it is easy to see that this code is doing the same algorithm as the C code
71. ation box now summarises the specified model and the MCMC estimation options 2 Mozilla Firefox 3 1 x 7 z ys File Edit View History Bookmarks Tools Help e O5 ce x amp ra Dre ipocahost sosnjrunf No http localhost 8080 run O AG Stat JR Demonstrator Template Regression1 Change Dataset tutorial Change View Summary Configuration Start again response normexam explanatory variables cons standlrt Name of output results out Random Seed 1 length of burnin 1000 number of iterations 5000 thinning 1 Equation rendering Equation rendering normexan N 4 o U fycons f standlrt Poal By xl t T 0 001 0 001 o 1 t Model model for i in 1 length normexam normexam i dnorm mu i tau mu i lt cons i beta0 standirt i betal Priors beta dflat betal dflat tau dgamma 0 001000 0 001000 sigma lt 1 sqrt tau Rin co Sat Inputs y normexam x cons standirt bee OOOO Od The top half of the box contains a nicely formatted mathematical description of the model in LaTeX code The bottom half has some code that is written in a variant of the model specification language associated with the WinBUGS package There are several buttons at the bottom of the window and if 15 we click on the Run button we will run the model and after a short while
72. atory variable in addition to a constant To do this in the template directory copy the file Regression1 py to LinReg py It is also sensible to change the classname in the template For a linear regression we want a template with two inputs y and x only this time x is a vector rather than a matrix i e there is only one predictor plus a constant Try changing the text to ask specifically for a Y variable and an X variable for the inputs You will need to change invars a little Try also then simplifying the outbug and outlatex functions you should be able to get away without needing the mmult mmulttex functions In fact mu i should be something like alpha beta x i though if you use alpha and beta they will both need declaring as ParamScalar s in the invars function When you think you have the template correctly written save it and rerun Stat JR and test it out Note that you will have to close the command window before rerunning Stat JR and then refreshing the browser If it is saved in the templates directory it will be automatically picked up It should give similar results to Regression for the example shown earlier 23 4 Structure of templates From Chapter 3 you should now have an idea of the tasks involved in writing a simple template In later chapters we will give further information on how one writes more complicated templates We will use a similar structure of firstly introducing the template and using it in the
73. atrix that performs the orthogonalisation and we store this as a vector in the object orthogmat The last few lines then multiply the original x variables by orthogmat storing the results in a matrix tmp The columns of this tmp matrix are then placed in objects that have the string orth appended as a prefix to the original x variables names Finally the map function replaces the original x variable names with these new orthogonal variable names before the data are returned The function outbug then constructs the model code outbug model for i in l length S y S y i if D Normal dnorm mu i tau i lt endif Vt Ds S n i oe 3 er dbin p i AP WP V n a Od ar Mh DO H as OFA t U O p n n e ni u dpois pl link p il lt if offset endif endif Sif orthog oe n i ole ole S mmult x betaort i else S mmult x beta i endif Priors for i in range 0 beta ncols Sif orthog betaortS i dflat else beta i dflat endif endfor if orthog lt count 0 gt for i in range 0 beta ncols beta i lt for j in range 0 beta ncols S orthogmat count betaort j o if j beta ncols 1 o else 127 if float orthogmat count 1 gt Sendif endif lt count 1 gt
74. ble command s vanilla f s self executable self myRScript self OutStatus subprocess Popen command wait which as one might expect executes R with the script we have just constructed Having run R we have two final functions DisplayOut and Summarise which read back in the output and construct summary statistics respectively We next consider the Stata package 6 5 Stata Stata is another popular statistical package but unlike MLwiN and R it will only allow classical estimation This means that the code for interacting with Stata is somewhat shorter For our Regression2 template we will utilise Stata s glm function for estimation As with the other three programs we need to include details of the location of Stata on our machine in settings cfg prior to running webtest By default in the file this is as follows STATA executable C Program Files x86 Statall StataMP 64 exe 61 We can again first run the template to see what it gives for Stata so choose Regression2 as the template and tutorial as the dataset and then input the following faery O Stat JR Demonstrator Template Regression2 Change Dataset tutorial Change View Summary Configuration Start again response school explanatory variables cons standlrt Name of output results outstata Is estimation method by MCMC No Choose estimation engine MLwiN R STATA STATA Set Inputs y school x
75. ce sb int vw length misswritten 3 struct _gmatrix va rwishart vw sb matrix_free sb d_e 0 va gt e 0 0 d_e 1 va gt e 0 1 d_e 2 va gt e 1 1 matrix_sym_invinplace va omega_e 0 va gt e 0 0 omega_e 1 va gt e 0 1 omega_e 2 va gt e 1 1 matrix _free va and towards the end the steps for the missing data Update misswritten for int i 0 i lt xlength misswritten itt If lt 9 999e29 written i 118 std trl normal_distribution lt double gt normal 0 5 d_e int 0 2 beta0 cons int i 2 betal female int i 2 d_e int 1 misscsework int i beta2 cons int 1 beta3 female int i d_e in t 0 1 sqrt d_e int 0 misswritten i normal eng Update misscsework for int i 0 i lt xlength misscsework itt if csework i lt 9 999e29 std trl normal_distribution lt double gt normal 0 5 2 d_e int 1 misswritten int i beta0 cons int i betal female int i d_e int 2 2 beta2 cons int 1 2 beta3 female int i d_e int 2 1 sqrt d_e int 2 misscsework i normal eng We can run the template by clicking back in the browser and then clicking on the Run button k i aa File Edit View History Bookmarks Tools Help gt C 4 http 137 222 140 64 8081 run Ly gt SB Googie 2 2 Most Visit
76. code n varnames varnames itcode self code varnames params self params varnames data self data varnames burnin self burnin varnames iterations self iterations varnames seed self seed code stdtmpl get_template standalone cpp render varnames return code 42 Basically the method constructs the standalone C code for the model and here we see the call to XMLtoPureC that creates the C code to update a particular parameter from its tree Note that all of this stuff is passed into the standalone cpp file which contains code substitutions the most important being itcode which is all the updating steps for all parameters in the model as we will see later 5 3 2 XMLtoPureC This file is used to construct pure standalone C code and can be found in the src lib EStat subdirectory The code in XMLtoPureC is basically very similar to the code in XMLtoC except it doesn t assume the parameters are objects in Python and so the code it produces is much easier to read and looks much more like the sort of C code that would be familiar to C programmers We will now consider the Regression1 example and look at the code produced 5 3 3 Example code for our template When the Code button is pressed C code is generated and then displayed in the browser so that it can be saved and used elsewhere Lots of the code at the start of the program is generic for all models There are specific inputs
77. cting and as a python variable So numlevels in our example is 2 and expanding the outside for Python gives for il in l length ul ulf il dnorm 0 tau_ul for 12 in l length u2 u2 i2 dnorm 0 tau_u2 as we see in the browser Once again the outlatex function which creates the LaTeX output will have similar substitutions via Python but we will not describe this in detail here Clicking on Run will run the model and the output contains information for all the residuals Results sigma_u2 chain 0 1194 0 06839 ESS 227 sigma_ul chain 0 5297 0 05856 ESS 847 tau chain 0 235 0 005764 ESS 4434 ul chain 0 08162 0 2647 ESS 3523 0 0314 0 4495 ESS 4729 0 2171 0 485 ESS 4880 0 1411 0 435 ESS 4760 0 07879 0 5157 ESS 5203 tau_ul chain 3 698 0 841 ESS 829 tau_u2 chain 220 6 366 8 ESS 281 u2 chain 0 0405 0 1133 ESS 1699 0 007032 0 1079 ESS 2274 0 03756 0 1093 ESS 1549 0 1284 0 1489 ESS 452 99 betal chain 0 1596 0 00279 ESS 52 betaO chain 9 979 0 2814 ESS 52 sigma chain 2 064 0 02532 ESS 4429 and the usual MCMC plots are available There are several other N level modelling templates included with the software that you can also look at We will describe one further such template NLevelwithRS which allows random slopes in chapter 12 This te
78. ction of code then interrogates the template to construct a list of data objects and a separate list of parameters There is then a long section that constructs the data file Some templates will have their own function WinBUGSData for this purpose but by default there is code here for generic templates It should be noted that this code at present relies on specific naming of objects in the template e g NumLevs and L2ID which we will try to change later 51 The data file is stored as a text string called infile which is extended as data items are added At the top of the file will be scalar quantities like N the length of the dataset Then the data itself is added The columns have to be added as comma separated lists housed within round brackets You will notice special code for classification variables if p in self clas for q in self datal p infile t str SE qtl1 k k 1 if k lt N infilet else infilet As WinBUGS requires classification units to begin with the number 1 rather than O which is how STAT JR stores them all classification data items have 1 added to them Having written the text string the following three lines write the data out to file file open self myInfile w file write str infile file close The code then proceeds to write the model file Here the file is stored as a text string labelled modspec and the main purpose of the code is to do a few text substitutions
79. def goSTATA t m STATAInterface m WriteData t m Script t t STATAInput m m m m Execute DisplayOut Summarise return m This code creates a STATA nterface object and the class definition of such objects is found in the file STATAInterface py which can be found in the src lib EStat external STATA subdirectory Having initialized an instance m of the STATAInterface class we next call the WriteData method This method converts the data to be used in the model into a text file to be input into Stata We next call the Script method which constructs the beginning of the Stata Do script which is the method that Stata uses to work in batch mode and fire off a series of commands The Script is continued via the template specific code which is done by interrogating the STATA nput method of the template object whose code is given below def STATAInput self STATAob j family gaussian link identity STATAobj DoScript glm self objects y name for p in self objects x name STATAobj DoScript p STATAobj DoScriptt family family link link STATAobj DoScriptt noconstant n always remove the intercept STATAobj DoScriptt log close n The model fitting is performed by the lines above and the lines below simply carry out some additional plotting functions that are linked with the template These plots will for now reside in the tem
80. e have an additional j loop in the model loop for the out of sample predictions which will be stored in mnormexam There are two interesting parts to this code Firstly the line mumiss j lt cons j betazxfd0 standlrt j betazxfdl has the strange string zxfd placed in the middle of the two parameter names This is our way of performing the cut function used in WinBUGS As the predictors in this line are not beta0 and beta1 then this line will not influence the posterior of the fixed effect The posterior for mnormexam will be calculated but this is only because we include the line dummy j ddummy mnormexam j so that mnormexam appears on both the left and right hand side within the model code to differentiate it from data The algebra system will formulate the posterior which will depend on betazxfd0 and betazxfd1 Of course in practice we want these replaced by the correct beta0 and beta1 and this is done in the bowels of the code generator with the lines LE elif type variable result l1 name replace _ replace zxfd which can be found in XMLtoC py and other such files If we run the template we get the following output 131 Ee Edit yew History Hookmarks Josts Help 6 A A henpi 137 222 140 643081 run A Coos 2 A Most Visited Getting Started ih Latest Headlines Ittpe 137 222 140 64 8081 runy Results 5 kemel density neoa 1405 1000 2000 s00
81. e the gt and lt operators mean that the range includes its end points Finally the modified version of var replaces the incol original column in the data and the dataset is returned Exercise 3 This template applies the recoding by copying the recoded column over itself As an exercise try modifying the template so that it will place the recoded column into a new location i e choose another name for the recoded variable Note that the code for the Generate template should help here 7 3 AverageandCorrelation template Another template that one might consider using prior to fitting a model is the AverageandCorrelation template This template will give either some summary statistics including the averages for a series of columns or the correlation matrix for a set of columns The template has a very short invars input variables attribute invars op Text Operation averages correlation vars DataMatrix Variables Here op stores the operation to be performed i e allows the user to choose between averages and correlations whilst vars stores which columns to perform the operation on This template has another different method tabledata The name tabledata is recognised as a special method by webtest and hence the results of running this template are displayed in the browser 71 The code for tabledata that will be called by webtest py is as follows def tabledata self if self object
82. e will see the new tutorial dataset complete with column labelled random as shown below 68 2 Mozilla Firefox E http localhost 8080 summary 4 Stat JR Demonstrator Summary Name Count Min Max Mean Std school 4059 00 64 0 30 0066518847 18 9368110726 cons 4059 10 1 0 1 0 0 0 yrband 4059 10 3 0 184306479428 0 630784592987 schay 4059 10 3 0 2 12712490761 0 652926315528 random 4059 0 000420227992873 0 999825397828 0493044444463 0 289078129886 avslrt 4059 0 75596 0 63766 0 00181069110618 0 314831650064 standirt 4059 2 935 3 016 0 00180981670362 0 993100685573 schgend 4059 10 3 0 1 80487804878 0 914079654538 student 4059 0 0 197 0 376999260902 30 2606908983 denom 4059 10 1 0 1 0 0 0 girl 4059 0 0 1 0 0 60014781966 0 489867751763 normexam 4059 0 000117862232077 0 998817146739 Exercise 2 Try modifying this template so that it only offers the random number generators Try expanding the inputs so for example the Normal random generator will allow a mean and a standard deviation the Gamma has a scale parameter and the exponential has a rate parameter 7 2 Recode template recode py The Generate template allows the user to add new columns to their existing dataset There are many templates that expand or manipulate a dataset and we will here look at a second template the Recode template The Recode template as the name suggests recodes values in this case recoding values within a cont
83. ed 4 Getting Started Latest Headlines _ http 137 222 140 64 8081 run re Results betal chain 3 399 0 6466 ESS 859 betaO chain 48 77 0 4931 ESS 897 beta3 chain 5 963 0 7757 ESS 833 beta2 chain 69 78 0 5982 ESS 852 misswritten summary 23 75 0 43 38 11 46 39 38 0 36 88 0 16 88 0 36 25 0 49 38 0 25 0 33 34 11 46 48 7 misscsework summary 54 24 13 77 71 3 0 76 85 0 87 96 0 44 44 0 70 38 13 9 89 81 0 17 59 0 32 41 0j 84 omega_e chain 177 1 6 065 ESS 4069 108 1 5 834 ESS 4281 258 4 8 723 ESS 4422 die chain 0 0076 0 0002656 ESS 3575 0 00318 0 0001782 ESS 3296 0 005207 0 0001807 ESS 3517 betal 0 6 1 r o z os g 2 5 2 0 4 v 3 0 v amp 3 5 E03 g 4 0 0 2 4 5 S oa 5 0 5 5 1000 2000 3000 4000 5000 26 5 lt 4 3 2 1 0 1 0 parameter value o 1 0 08 0 8 4 0 6 0 6 Z o4 Eos 0 2 0 2 d 0 00 30 40 S o T20 0 2 4 6 8 l0 12 You will see that for the variables with missing data the data items that are not missing have standard deviation zero or very small values due to rounding as their values should not change 119 from iteration to iteration for example we can see that the first written score and second coursework scores were observed We have extended these multivariate normal m
84. eed to investigate how the webtest py code uses the changes to Estimation method input Within the run class in webtest py search for class run to find this we will now no longer run the code following the statement if t EstObjects Engine name eSTAT and instead we have the following code note elif is Python for else if elif t EstObjects Engine name WinBUGS m goWinBUGS t if t EstObjects nchains 1 for p in m displayres keys t objects p sixway p m displayres p else for p in m displayres keys t objects p sixwaymulti p m displayres p t datasets datasets t EstObjects outdata name m combresults 50 if hasattr t WinBUGSMod statstemplate t WinBUGSMod elif hasattr t outbug statstemplate t output if hasattr t outlatex latextemplate t outputlatex Here we see calls to a new function goWinBUGS which we will look at next You will also see that for multiple chains a different graphics function sixwaymulti is used We will not go into details of how this works as it is generic to all templates Finally you will see that there is an if statement if hasattr t WinBUGSMod statstemplate t WinBUGSMod which allows us to write our own function for interpreting the user inputs in terms of WinBUGS code This function isn t present for this template as the code generated is close enough to standard WinBUGS code aside from some st
85. eeeseeseaseaeeeeeceseeeaaeaeeeeseesees 48 oes Ba ita ECO ge AEE E eo A 48 6 2 WINBUGS and Winbugsscript py cccccccecesssssssecececessesseseeececesseseaesesecssesseaaaeceeeceseesnaeaeeeeeens 49 6 3 MWIN aseeni dne iante etbevtconcte aeaa iaei eahdvanad eds sashauechavedeascue tossed eaka aaia Kaaa 53 E E E de did lool eee dees eee ae 58 6S STATA sits coxsevsteetacticcudectchsa test cit atacand tote peters OEA A fines acetate tetas 61 7 Input Data manipulation and output teMplates cccccccccccsssssssscecececessesesseseeeesesssesesteaeeeesens 65 7 1 Generate template ZeNErate py cccsscccccssscccsessececeessececsesseeeceeseeececsseeeceesaeeeceesseseceeseeeeeees 65 7 2 Recode template recode py bs cacieccacei decd aeons vided Liteon ew ated ee ee ie ee 69 7 3 AverageandCorrelation template nini irrien tiini iaaa iate aieiaiei anatas 71 TAX Plot templates anera aa aaas aa araa a aaa aaa eels 75 8 Single level models of all flavours A logistic regression example sssssssssssssssseserenrnssssesereene 79 SEMIS a E ee E A eh a A 81 8 2 Methodinput se seiesima een a Sorat e a eas a a aa es es ete Ee 82 32 9a GJ 0101 E EAE E A E E S 83 BA QUtaAt ARE E E A E A E E E E NAET 84 9 Including categorical PredictOrs c ccccccccccsssssssecececessesesseseceeecesseseaesececscesseseaaeseceesesssussaeaeeeesens 86 10 M ltilevel models air ae osinean ieenna ieai aa aaia ada a T aana a ga 91 10 1 2levelmod template
86. emplate for fitting models that have 2 levels i e one level of clustering and then secondly a more general template that will fit models with any number of levels of clustering whether nested or crossed Note here that these templates allow only random intercepts 10 1 2LevelMod template We will begin our investigation of 2Leve Mod by looking at its invar function invars y DataVector response L2ID DataVector Level 2 ID D Text specify distribution Normal Poisson Binomial t if D Binomial n DataVector denominator link Text specify link function logit probit cloglog if D Poisson link Text value ln offset Boolean Is there an offset if offset n DataVector offset if D Normal tau ParamScalar sigma ParamScalar if t tau ParamScalar sigma ParamScalar df ParamScalar u ParamVector tau_u ParamScalar sigma_u ParamScalar x DataMatrix explanatory variables beta ParamVector beta ncols len x If you compare this with the invars function for 1LevelMod you will see we have added one additional input L2 D to allow the user to input the column containing the level 2 identifiers We have also included three additional model parameters u tau_u and sigma_u the level 2 residuals their precision and standard deviation respectively We can try
87. emplate from the pull down list and press the set button Next we press the Run button to run the template The template will look as follows _ http localhost 8080 run a r Stat J Demonstrator Template Generate Change Dataset tutorial Change View Summary Configuration Start again Output column name Type of number to generate Uniform Random M Net Inputs Now we type random for the output column name and stick with Uniform Random for the type After clicking Next we are asked for a name for the output results and here if we enter tutorial the new column will be appended to the dataset and the tutorial dataset in memory will have an additional column If we choose a new name then a new dataset containing all the columns from tutorial along with this new dataset will be formed in memory and tutorial will persist without the new column Pressing Next will finish the inputs and Pressing Run will run the template and the Run button will then disappear 67 B http localhost 8080 run Stat JR Demonstrator Template Generate Change Dataset tutorial Change View Summary Configuration Start again Output column name random Type of number to generate Uniform Random Name of output results tutorial Set Inputs outcol random type Uniform Random To see what the template has done if we click on the Summary button w
88. est Meactines Inttpes 137 222 140 64 5081 reny Stat JR Demonstrator Template 1LevelOrthogParam Change Dataset rats Change View Summary Configuration Stan again response y36 specify distribution Normal explanatory variables cons y8 Do you want to use orthogonal parametensation Yes Name of output resuts ratout Choose estimation engine eSTAT WinBUGS eSTAT Random Seed 1 length of burnin 1000 number of iterabons 5000 thinning 1 Next Set Inputs orthogmat Y Y36 orthog Yes D Normal x cons y8 Clicking on the Next button will give the following output for the model 125 E morna rrr File dit yew Higtory Bookmarks Tools Help A Mntpu 137 22210 648081 run 3 2 A Mort Vrted Getting Started i Latest Headlines http 137 222 140 64006 L run Equation rendering y36 N n 07 Hi Baorthcons Sj orthy8 faal i 1 Bo 1 085 152 1666666673 3 0 035 1 08 7 0 001 0 001 a l r tau betaorto orthys 4 betaort Priors betaorto dfiati betaorti dflat betra lt 1 0 bereort0 152 1 betaorti betal lt 0 0 hetac i teu dgemme 0 signa lt 1 sqztitau Run Code JS Set Inputs y y36 x orthcons onhy8 orthog Yes D Normal The method of using an orthogonal parameterisation is mentioned in Browne et al 2009 for non
89. etty similar and so you will see only a few differences here for example the use of var instead of int or double when defining types of variable If you click on the JS button and then the Run button this code will be incorporated into the Javascript that is run and the output is as follows done in 18869ms mean std ESS iter 2499 5 1443 3756441065507 4 betal 0 007603313424488391 0 0026453520910962153 2195 betal 0 258863378907476 0 023498508867254648 2327 betal gt 0 221 0 229 0 237 0 245 0 253 0 261 0 269 0 277 0 285 0 293 0 301 prep eymen 0143 429 714 1000 1286 1571 1857 2143 2429 2714 3000 3286 3571 3857 4143 4429 4714 5000 If you are interested in looking at the code you can select view page source and the code will be displayed We will not show the code here but you will see the similarities with the C code used in the main engine As a more complicated example of using the preccode function we also have a template for ordered probit regression models OrderedProbitRegression py We do not have space in this guide to go into details for this template but such models also include multiple threshold parameters rather than a single intercept term as in the binary case 106 12 Multilevel models with Random slopes and the inclusion of Wishart priors One limitation of the algebra system in its current form is that it treats all parameters as scalars This means for e
90. exam In t Distnbution dganuna Jerefh normen 4 f normexam f cons f standlrt Match A 0 001000 Match B 0 999 0 Slength normexam brgh mmen gt nommexam joo Stanly t Sampling parameter j 0 001000 izl Sampling parameter r 0 001 0 Slength normexam bxh merenn efi normexam cons 2 standin Sampling distribution r dgamma 0 001 O Slength normexam 0 001000 7 When run from Stat JR the algebraic processing software then saves these three distributions in XML file format so that they can be read in later when we create code to fit the model 34 5 2 3 Continuing the go function applydata method Having run the compilemodel function the next few lines of the go function are as follows callback c burnin burnin iterations iterations thinning thinning seed seed applydata m variables tB BBS 8 Here we are copying more information into our instance of the model before calling a method of our template t to link up the model with the data which have been stored in variables The generic applydata function can be found in templating py def applydata self m variables for var in self objects print var var if type self objects var DataScalar print var name self objects var name m adddata self objects var name m data self objects var name variables self objects var name if type self
91. exited and we save the string self macro as a file using the following lines f open self MacroName w f write str self macro f close The next call in goMLwiN is to the execute method def execute self executable EStat configuration get MLwiN executable subprocess call executable run self MacroName 57 This will simply run MLwiN using the macro we have just constructed Having run MLwiN the goMLwin function finishes with calls to the output and Summarise functions which bring back the data into Stat JR in an appropriate format and construct summary statistics respectively We will leave MLwiN here and move onto another package with some functionality for the use of both MCMC and classical estimation methods R 6 4R R is a general purpose statistics programme that consists of a framework of interlinking statistical commands that are often called packages The R installation consists of a base package containing many of the standard statistical operations and to this can be added to user written packages For our Regression2 template we will utilise the glm function which requires the MASS package when performing classical inference For MCMC methods we use the MCMC glmm package a user defined function that can fit many models using MCMC As with the earlier programs we need to include details of the location of R in settings cfg on our machine prior to running webtest On my machine this is as follows
92. f DoScriptOut r read print self out def Summarise self results if hasattr self out results modelout n n self out return results This ends our description of the interoperability features in the Stat JR program The interoperability features are still a work in progress and although they are present in many of the templates that we will describe in later chapters we will not going into details on this aspect of these templates The interested reader can look at these templates and see how they perform interoperability and try writing their own interoperability code for their own templates We will extend the documentation on interoperability in later versions 64 7 Input Data manipulation and output templates The Stat JR system does not simply consist of templates for fitting models to datasets There are in addition templates that allow the user to input their own datasets manipulate datasets and plot features of datasets In many ways these templates are much simpler to write and understand We will look at a few examples of the templates along with their code and explain how they fit into the webtest interface 7 1 Generate template generate py The first template we will look at is called Generate and is used for generating columns to add to a dataset These columns can be constants sequences repeated sequences or random numbers The Generate template has an invars function as shown below invars
93. f the dataset where the results will be stored e Methodinput the function previously described to get hold of estimation method inputs e input a function that links with the invars attributes to get hold of model inputs e output a function that links with the outbug attribute to create the model output code for the algebra system e outputlatex a function that links with the outlatex attribute to create the LaTeX output e outputBUGS a function that will link with the outWBug attribute where present to create true WinBUGS model files e preccode postcode prejcode postjcode by default each returns a blank string but can be replaced by template specific code that allows the user to insert sections of C code or Java script into the code formed by Stat JR We will see how this works later At this stage it is enough to know that there is a lot of generic functionality that has been written and is common to all templates and that although the example template is quite small a lot is going on behind the scenes In the next chapter we will look at how creating an output BUGS like model statement fits into the whole Stat JR system 26 5 The parts of the Stat JR system In this chapter we will look at how the model templates that we write link into the system as a whole We will show the template classes link with a separate generic model class that is used within the bowels of the Stat JR system to store the model and create a
94. for the for loop lines as the length function is not standard WinBUGS code The initial values code follows next which is looped over for the number of chains to be used The code is stored in a text string labelled inits and there is some matching of variable names so that appropriate random starting values can be generated After creating the three input files the next method call in the goWinBUGS function is to write the script file and this is done via the WriteScript method This method is somewhat shorter as it only has to write one file that runs within WinBUGS Basically the function creates a script file that reads in the three input files performs the burnin and then stores chains for all parameters for the main run These chains are then output via the WinBUGS CODA function Having written the script the next line in goWinBUGS calls ExecuteScript which is very short def ExecuteScript self WinBugsApp EStat configuration get WinBUGS executable subprocess call WinBugsApp par script txt with script txt being in the same directory as WinBugl4 exe This function runs WinBUGS in the background Having run WinBUGS we need to import back in the chains from WinBUGS into Stat JR and this in performed by the function Import WBUGSOutput which takes the files output by WinBUGS and constructs chains in an appropriate format Here the code needs to deal with thinning and multiple chains and to deal with precision p
95. ght This terminology class object subclass attribute is used in what is called object orientated programming In the definition here four attributes tags invars outbug and outlatex are then defined as being parts of a Regression1 class although there will be other attributes that are generic to the template class and are defined elsewhere Briefly 1 The tags attribute identifies the template as belonging to the tag groups model and 1 Level and this is used in the web interface to decide which templates to show in specific template lists Thus the regression1 class is a template for specifying 1 level i e single level models ow 2 The invars attribute is a text string hence the starting and ending which consists of a list of the inputs in this template 3 The outbug attribute is a text string that will produce the model code we saw in the web interface for this template 4 The outlatex attribute is a text string that will produce a piece of LaTeX code which is converted into the nice maths we saw in the web interface We will now look at the last three attributes in more detail 3 2 1 Invars When this template has been selected in the web interface it will firstly have its inputs interrogated and an instance of a model object will start to be created Stat JR has a list of object types that can be thought of as the building blocks of a model object Statements like y DataVector response
96. h are then stored in uniqvals We then sort these into ascending order before removing the first uniqvals O as it will play the role of the base category in the model We then have a second loop over this list of uniqvals where we create the dummy variables The lines self data var _ str int i list numpy array self data var i astype float self objects x name append var _ str int i firstly construct an array which takes value 1 if the original variable has value i or O otherwise This newly constructed predictor variable is then appended to the new variable list If the variable is not categorical it is simply added to this new variable list itself We finally adjust the length of beta to account for the expansion of the categorical variables and return the new dataset This preparedata method is run before the outbug and outlatex methods and so these are identical to those we saw in 1LevelMod To fit the model with the dummy variables for school gender we press the Run button and get the following results 89 Mozilla Firefox http localhost 8080 run tau chain 1 beta3 i3 chain beta2 chain betal chain beta0 chain sigma chain 595 0 03506 ESS 2 0 3002 0 02694 ESS 1 4268 1939 0 1828 0 03828 ESS 2912 O 5552 0 01314 ESS 4016 0 08699 0 008957 ESS 1791 0 7919 0 008705 1 75 1 70 g 1
97. he instance m we then call the WriteData method that packages up the data into a text file format ready to be fed into R via a script The Script method is called next and this method begins the construction of the script file used within R to import data and fit the model The Script method only writes the start of the script file which is generic code for all templates The template s own method REstimation nput is then called and this then adds to the script file code specific to this template In the Regression2 template this begins with a reallocation of the outbug attribute with the lines self outbug S y name S Rmmult x name This is just so that the model can display the model part of the R statements The R model is set up so that it can be used in R commands by removing the intercept as it is usually present in the variable list and doing some text substitutions There are then two chunks of engine specific code for the classical and MCMC engines respectively This code is longer than required as it produces some graphs that may be useful when the E book extension to the Stat JR software is written Finally we see the lines write the script out file open Robj myRScript w file write Robj RScript file close which copy the Robj string we have been constructing to a file which will be called from R The execute method is next called def Execute self self executable EStat configuration get R executa
98. hich is found in XMLtoC py in the src lib Estat directory You will see that this function requires the tree that contains the parameter posterior the algorithm for the parameter posterior and the mode which is not important here Briefly this function converts the tree object into a string of C code that will update the parameter by randomly drawing either from its conditional posterior when algorithm is Gibbs or indirectly via a proposal distribution and comparison of the posterior probability of proposed and current values when algorithm is MH At this stage in the run function the code attribute contains a fully formed piece of C code for performing one iteration of an MCMC algorithm for the current model The next chunk of the run method of the model class is as follows for k in self params keys if type self params k current_value list vars k _accept map lambda x 0 self params k current_value localvars k _acceptinrow map lambda x 0 self params k current_value 38 localvars k _acceptdone map lambda x 0 self params k current_value vars k _sd map lambda x 0 1 self params k current_value else vars k _accept 0 localvars k _acceptinrow 0 localvars k _acceptdone 0 vars k _sd 0 1 adapt True adapt_count 0 it 0 desacc 44 1 Desire 44 1 acceptance tol 10 with tolerance of 10 print Fi
99. hinning len vars k else self params k chain numpy array 0 0 self iterations self thinning else if type vars k list self params k rs for ind in range 0 len vars k self params k rs append RunningStat else self params k rs RunningStat self code t Update k n self code str XMLtoC self params k tree self params k mode self params k algorithm toC self code n self code self postccode Here we begin by setting a timer and calculating in total how many iterations we need to run for We then set up attributes that contain the code that will be run to fit the model The attribute scode is a string containing C code that contains all the random number generators and matrix algebra functions written in C that are required The file statlib cpp contains these generators and the only thing that is variable here is the seed used Next we define the code attribute which will be the body of the code that is run for each iteration of an MCMC algorithm Here you will see that code is constructed by concatenating preccode a chunk of code for each parameter each preceded with a comment to identify what is updated and postccode We also set up the memory required for storing the chains of parameters which are for now initialised with zeros The chunk of code for each parameter is created using the XMLtoC call and here we need to look at the XMLtoC class w
100. ial dataset earlier by turning back a few pages As the text is specific to the inputs given the definition is a text string containing some quantities that depend on inputs These are integrated into the text string via the symbol for substitutions through conditional and looping computation achieved via commands and through the calling of external functions The outbug code for this template uses all three devices and so we will here go through stage by stage the instance of outbug shown in the earlier screen shots We start with the raw code outbug model for i in 1l length S y S y i dnorm mu i tau mu i lt mmult x beta i Priors for i in range 0 x ncols beta i dflat endfor tau dgamma 0 001000 0 001000 sigma lt 1 sqrt tau Now we can substitute normexam for S y as this is the variable we chose for y thus outbug model 19 for i in l length normexam normexam i dnorm mu i tau mu i lt mmult x beta i Priors for i in range 0 x ncols beta i dflat endfor tau dgamma 0 001000 0 001000 sigma lt 1 sqrt tau Next we can evaluate the for loop with in our example x having 2 columns as input by the user and the ncols function looks up the number of columns in x The Python range function does not include the upper bound in the range so in this case i will take values O and 1 outbug
101. iguous range to a specific new value This can be useful for creating categorical values although this might involve several repeated uses of the recode template We will demonstrate this with the tutorial dataset and look at recoding the school gender schgend column In the original dataset schgend takes values 1 for a mixed school 2 for a boys school and 3 for a girls school We might want to recode this to take values 1 for mixed and 2 for single sex i e convert the 3s for girls schools to 2s 69 _ http localhost 8080 run Stat JR BB Getting Started E Demonstr ator To do this we first select Recode from the template list and hit set followed by the Run button Next we select schgend from the list of columns and select the other options as below Template Recode Change Dataset tutorial Change View Summary Configuration Start again Input column name schgend Start of range 2 End of range 3 New value 2 Name of output results tutorial Run Set _ http localhost 8080 summary Stat JR Inputs lincol schgend newval 2 rangeend 3 rangestart 2 Zz Demonstrator Summary Name Count Min Max Mean Std school 4059 00 64 0 300066518847 18 9368 110726 cons 4059 1 0 1 0 1 0 0 0 vrband 4059 1 0 3 0 184306479428 0 630784592987 schav 4059 1 0 3 0 2 12712490761 0 652926315528 random 4059 0 000420227992873 0 999825397828 0493044444463
102. inBuGs14 STATA executable c Program Files x86 Statall statamMP 64 exe R executable C Program Files R R 2 12 0 bin R exe Note that if your base directory is something other than C StatJR then you will need to change the path name in this file for the Demo algebra system accordingly For interoperability you will need to change the paths for the various third party software packages WinBUGS Stata and R 2 6 Check that the installation has been successful You can now try to run Stat JR to check that the installation has been successful Double click on the webtest cmd file in the base directory This will fire up a command prompt window and your default browser At present you will normally get an error in your browser as it fires up before the command 10 prompt window is ready but this should be fixed by waiting and refreshing the window As an alternative the Python file webtest py which is being run by webtest cmd and is located at src apps webtest can be run directly Once this has started you can open your browser and type in the web link http localhost 8080 to connect with webtest If the installation has been successful you will see the following webpage J Mozilla Firefox Stat JR Demonstrator Configuration Current Template Regression1 Current Dataset tutorial Run Template Information Select template 1 Level 2 Level Multivariate N Levellinput NOCeljo
103. inally we have the step to expand a function this time called mmulttex outlatex r begin aligned mbox normexam _i amp sim mbox N mu_i sigma 2 mu_i amp beta_0 mbox cons _ i beta_ 1 mbox standlrt _ i beta_0 amp propto 1 beta_l amp propto 1 tau amp sim Gamma 0 001 0 001 sigma 2 amp 1 tau end aligned 1 oe 3 2 4 Some points to note You will notice that the string object created in outlatex has an r before the and that similarly there is an r inside the mmulttex function call before the Basically the triple quotes are used in place of quotes to allow the use of single quotes within the expression The r is used to let the computer know that the expression in the quotes is a raw string i e a string that doesn t contain any special control characters For example although the character is often used as a control character in a raw string it will be treated simply as a and passed through to the LaTeX reading software This avoids the use of lots of double for each One debugging tip is that lines often finish with a double slash to denote a new line in LaTex It is important to add a space after the double slash in the text file as otherwise it will be concatenated onto the next line Some of you will know LaTeX and so the code in the source window will be familiar It is however not essential to write an outlatex function for your own templates as the code i
104. ing for all parameters We have in other templates 2LevelBlock and NLevelBlock implemented similar block updating of fixed effects for multilevel models We will next look at an alternative method that has the advantages of not needing to use preccode and also of being useful for non normal response models 13 3 The 1LevelOrthogParam template The alternative approach to blocking variables that are correlated is to reparameterise the parameters to a configuration that are less correlated We will achieve this by using an orthogonal parameterisation for the fixed effects rather than the standard parameterisation The template we will use is called 1Leve OrthogParam and the inputs are very similar to the i1LevelMod template as this approach also works for non normal responses The template does have one additional input in invars which is used to find out whether or not to use an orthogonal parameterisation This can be seen in the following lines orthog Boolean Do you want to use orthogonal parameterisation if orthog betaort ParamVector betaort ncols len x orthogmat Text value Here we add an additional vector of responses betaort if the orthogonal parameterisation is to be used Let us try out the template on the rats example so choose 1LevelOrthogParam from the template list and input the following fie L t yew Hinom Bookmams Jools Heip lt gt e E ron 4 A Mort Visited Getting Started a Lat
105. inning Clicking on Next will simply show the Model code and mathematical representation as we saw for Regression1 with the in built engine Note this can be chosen by selecting eSTAT for the estimation engine If we next click on Run you will see a WinBUGS window appear on your toolbar and in the background whilst WinBUGS is fitting the model When it finishes it will disappear and we will get the following output in the browser 49 Mttps localhost 0000 run Results sigma 2 ee NN sGSe kernel density wu 7500 2000 3000 4000 5000 amp 50060 062064066060070 stored update 0 03 parameter value 0 02 oo fall 0 00 oe ee 0 01 0 02 20 00 2 4 6 8 10 12 0 50000 100000 150000 200000 updates sigma Inputs y normexam cons standit Loading Web Font TeX Math talic lt Werdowes meda Player E Mozo Firefox Ors loss 2 search Rests E advanced User Compati MM CAWINNTYsystemszjom Here you will notice that we get results for each parameter including the addition of the deviance and some reordering of the output As we chose 2 chains you will also observe a green and blue output for both the chains and kernel density plots We now need to see how the connection to WinBUGS was achieved Interestingly for the Regression2 template you will not find any additional code to run WinBUGS within the template itself We will therefore n
106. inning inputs y use i cons age link logit D Binomial n cons We are fitting a logistic regression to the response variable use which indicates whether a Bangladeshi woman is using contraception at the time of the survey We are regressing use on age and using the Stat JR built in MCMC engine with some default settings for estimation Clicking on Next will display the model Mozilla Firefox http localhost 8080 run Equation rendering use Binomial cons 7 logit z By cons age faal ixl Model model for i in 1 length use use i dbin p i cons i logit p i lt cons i betaO age i betal Priors betad dflat betal dflat Run Code JS Sat Inputs y use x cons age link logit D Binomial n cons Ba 4 We will next click on Run to run the model and then get the following results 80 Mozilla Firefox Fa betas flocalhost 0080iruny T Mtp locathost 8080 run Results betal ehain 0 01218 0 00428 betad chain 0 4145 0 03872 betal 0 030 ad x 0 025 a r o zr j 5 002 S 6 00s g f e 0010 z 40 j a 000s 5 j 20 0 000 10 F 0 005 gt gt OO e j 0 1000 2000 3000 4000 5000 0 5000 000 010 010 020 020 030 035 10 Stored uodate __ 07 D rameter value __ os as os 0
107. ion engine eSTAT WinBUGS MLwiN R eSTAT WinBUGS MLwiN R if self EstObjects Engine MLwiN self EstObjects FixedAlg Text Select MCMC sampling method for fixed effect Gibbs MH Gibbs MH self EstObjects VarAlg Text Select MCMC sampling method for residuals Gibbs MH Gibbs MH if self EstObjects Engine WinBUGS self EstObjects nchains Integer number of chains else self EstObjects nchains 1 self EstObjects seed Text Random Seed self EstObjects burnin Integer length of burnin self EstObjects iterations Integer number of iterations self EstObjects thinning Integer thinning else self EstObjects Engine Text Choose estimation engine MLwiN R STATA MLwiN R STATA As we Saw in section 4 1 this Method nput function is called after the model inputs and you will see some additional inputs that were set to defaults in the Methodinput function in templating py Here we allow the user to input whether their method EstM is MCMC or not and then conditional on the method we offer a choice of packages that offer that method Engine There are some additional package specific inputs for example MLwiN allows the user to select which method to be used for updating fixed effects and residuals whilst WinBUGS allows multiple chains from different
108. ions into how to install all the software needed to run Stat JR before moving on to our first example template 2 Installation instructions To use the Stat JR software requires several pieces of software to be present on your PC We will endeavour in later versions to construct a purpose built install program but for now we provide a list of instructions for the user 2 1 Install Python You will firstly need to download and install Python from http www python org download Python is the language we use to construct much of Stat JR and it is also the language that the templates are written in We recommend the most recent in the 2 7 series of installs currently 2 7 1 For now do not use the Python 3 series If you have a 32bit operating system choose the Windows installer If you have a 64bit operating system you can if you wish choose Windows X86 64 installer instead If you choose this version then setting up the C compiler will be slightly different 2 2 Install accompanying packages for Python Next we need to install some of Python s accompanying packages so open http www lfd uci edu gohlke pythonlibs The package names here take the form lt package name gt lt version gt lt windows API version gt lt Python version gt You must choose packages based on the version of Python you selected above for example if you choose the 32bit version of 2 7 you need those packages with a windows API version of win32 and a P
109. l parameters to appear on the lefthand side we complete our workaround for a multivariate Normal distribution by including the two dflat statements these do not change the posterior but mean that the u0_O i1 and u1_O i1 are regarded in the algebra system as parameters Note that the dummy_0O parameters are simply placeholders as each distribution needs a scalar left handside The definition of dnormal2a does not depend on the LHS and as they do not appear anywhere on the RHS the algebra system does not recognise them as parameters and so does not attempt to find their posteriors 109 The code for creating the model code is in outbug but does not contain anything very new that needs reporting here The outlatex code might interest those trying to learn LaTeX as it contains a section to produce the multivariate Normal line as follows left begin array 1 for i in range 0 len context x str lev 1 u S lev 2 _ S i S context C str lev 1 i if i len context x str lev l1 1 3 endif endfor end array right amp sim mbox N left left begin array 1 for i in range 0 len context x str lev 1 a ae if i len context x str levtl1 1 right Omega lev 2 _ u right Here we use Python ifs and fors to allow conditional code and the array environment and left and right for big brackets in LaTeX to deal with vectors and matrices
110. lf objects var ParamMatrix pass Still need to add this This method passes through each of the objects that are in the object list of the template and performs different tasks depending on their type Basically if the object is one of the various data types then we call the model object s method to attach the actual data to the object m adddata with the code looping over the columns if the object is a matrix For objects that are parameters the code both adds the parameter and also sets an initial value for it Adding a parameter does actually more than simply adding the parameter to the model it also adds the algorithm used for that parameter as an attribute of the parameter The method of adding a parameter to a model initialises the parameter and then loads up the algorithm def addparam self name mode 0 algorithm self params name Parameter self name mode algorithm self params name load self directory Here we need to look at the class definition for a parameter to see what exactly is going on and this appears in parameter py 5 2 4 Parameter py The file parameter py contains the definition of the parameter class and the RunningStat class which is one of the attributes of a parameter The parameter class has an initialising method but here we are more interested in the load method def load self directory fname directory self name xml if os path isfile fname f open fna
111. list numpy repeat numpy arange 1 self objects max 1 self objects repeats datalen self objects max 1 self objects repeats self data self objects outcol name list outvar return self data Although the function is long this is in the main due to the many if statements to cope with each type of vector to be generated So for example if we wanted a vector of Uniform random numbers to be stored in a variable called random then the only lines to be executed are datalen len self data self data keys 0 outvar numpy random uniform size datalen self data self objects outcol list outvar return self data The first line in the above code calculates the length of the columns of data by looking at the first column of data hence the 0 and assuming all columns are the same length The next line then 66 uses the numpy random generator to create the column of numbers in outvar In the third line we link the column into the dataset and finally return the dataset to the output name we gave as an input As this template doesn t have any exciting outputs we will not see much happen after execution The webtest py code knows when the template fits a model and displays the required output As the Generate template is not a model template nothing is displayed Let s look at an example of adding a vector of uniform random numbers to the tutorial dataset We firstly choose the Generate t
112. ll return which object on the index window has changed and what it s value is to allow the user to change the dataset and or the template that is currently set We will not go into detail on how the lists are populated with the available templates datasets and how tagging is implemented but for the interested reader this is done directly in the html file The second sort of input is a click on a hyper link so for example a click on the word Run at the top of the screen This works simply through HTML and the run htm page is called and replaces the index html file in the browser An object of type run is created and now the class code for the run object is important 27 The run object is driving much of the Stat JR package When called firstly the GET method is called which initialises the screen and ends with a call to the build_form method whose code is at the bottom of the class definition Here there is some initial setting up code including the line t context template which sets up t to be an instance of the current template and then the following chunk of code boys t input print finished t resultout if GUIObject interface unanswered_questions print not finished yet raise GUIExceptionInputNeeded if hasattr t outbug t MethodInput if GUIObject interface unanswered_questions print not finished yet raise GUIExceptionInputNeeded elif hasattr t graphdata t EstObjects Engine
113. m plate neniani iian ia a a a a 122 13 3 The 1levelorthogparam template ccccccccccssssssssseeececsssesseeeeeeecessesssaeseeeessessesssaeeeesesseesegs 125 14 Out of Sample pr dictioN Snine aneirin ienei aai a ae i aeea ieia aaa Kaa Ri 130 14 1 The 1levpred template using the zxfd trick sssssssseessssssesenressssssrrensrsssseserersrssssesereesessss 130 15 Solutions tothe exercises his vet desde ce aia i iaa ida aaae e eaea aade eee 133 1 About Stat JR e STAT 1 1 Stat JR software for scaling statistical heights The use of statistical modelling by researchers in all disciplines is growing in prominence There is an increase in the availability and complexity of data sources and this has been followed by a corresponding increase in the sophistication of statistical methods that can be used For the novice practitioner of statistical modelling it can seem like you are stuck at the bottom of a mountain and current statistical software allows you to progress slowly up certain specific paths depending on the software used Our aim in the Stat JR package is to assist practitioners on their initial journey up the mountain but also to cater for more advanced practitioners who are potentially having difficulty progressing further up the mountain but also want to assist their novice colleagues in making their own ascent as easy as possible so that they too can ascend the mountain of statistical knowledge One issue with complex s
114. mber to rename the template 129 14 Out of sample predictions Most of the statistical modelling templates we have thus far created are primarily being used to estimate a model We might be interested in using the fitted model to predict responses for units outside the sample Using a simulation based approach it is easy to get confidence intervals about these predictions at the same time as estimating the model In a classical setting out of sample prediction would occur after a model is fitted In MCMC the predictions are estimated simultaneously with the model fit so it is important that although the model fit is used in the prediction that the reverse is not true and that the fit of the model to the data should not depend on the out of sample predictions WinBUGS has a method to do this with its cut function and we have developed a similar method which we will demonstrate here 14 1 The 1LevelPred template using the zxfd trick We will illustrate our approach on a 1 level model which we can fit using the 1LevelPred template We will choose this template along with the tutorial dataset To explain what is going on we are planning to fit a regression model to normexam with predictor standirt as we have done previously using the 1 evelmod template We will then use the predictors given in missing explanatory variables to predict the 10 individuals who in this case have the same scores as the first 10 in the tutorial dataset the 10 has to be
115. me r self xml f read f close self tree XMLtoTree self xml tree Temporarily detect and set algorithm NOTE default algorithm should be based on existence of sampling distribution as ideally logpost should always be present if type self tree dict self algorithm gibbs else self algorithm mh print Algorithm for self name self algorithm else print Missing formula for self name self tree 36 The parameter has three interesting attributes that are set here Firstly the particular XML file for this parameter output by the algebra system is read and stored in the attribute xml This is converted into a tree attribute which is an object of type XMLtoTree the code for this is in XMLtoTree py in the src lib Estat directory but we will not go into details here The tree will have an attribute type that allows the software to know whether it will use Gibbs sampling or MH sampling here Note that the posterior distribution is stored as a tree at this stage as the model object is used both when fitting models using the engine and also when converting to C or Java code So we should now have data and parameters linked into our model object Returning to the go function this continues as follows allmissing True for p in m params values if p tree allmissing False if allmissing True print Error in algebra system nothing genera
116. means that if we have a categorical predictor for example school gender in the tutorial dataset we will need to perform some data manipulation to construct the dummy variables that are used for a categorical predictor before using this template We will here look at an alternative template that has built in functionality for constructing these categorical variables within the model template This template is called 1Leve Cat and we will first look at its invars function to see how it gets the user to input the model structure We then demonstrate its use with the tutorial dataset The invars function is as follows invars y DataVector response D Text specify distribution Normal Binomial Poisson if D Binomial n DataVector denominator link Text specify link function logit probit cloglog if D Poisson link Text value ln offset Boolean Is there an offset Ii Offset n DataVector offset if D Normal tau ParamScalar sigma ParamScalar x DataMatrix explanatory variables for var in x context var _cat Boolean Is var categorical origx Text value beta ParamVector This section of code is the same as that in 1Leve Mod up to the point that x is input We next see a for loop that includes the use of the context statement to construct attribute names that are a combination of text and variable na
117. mes If for example x contains the three variable list cons standlrt schgend then the context statements will concatenate the the three variables with the text string _cat to create three new Boolean variables cons_cat standlrt_cat and schgend_cat which will store whether the variables are categorical or not The line origx Text value will be used to store the original x variables prior to manipulating the categorical variables By setting its value in the assignment i e value we will not get an input widget appearing in the browser Let us now demonstrate fitting a model with a categorical predictor Choose 1Leve Cat from the template list and tutorial as the dataset Firstly we will choose the inputs as follows remembering that the Name of output results input is the name given to where the resulting chains will be stored 86 Stat JR Demonstrator Template i1LevelCat Change Dataset tutorial Change View Summary Configuration response specify distribution explanatory variables Is cons categorical Is standlrt categorical Is schgend categorical Name of output results normexam lormal cons standlrt schgend yes gt tutourt Start again Next C Click on the Next button and set the estimation options as follows Mozila Frelon Bal Inputs y normexam x cons standlrt schgend origx D
118. mplate will need to utilise the preccode feature and so we will first explain this with a simpler 1 level example Exercise 8 Try adapting the NLevelMod template to allow categorical predictors as in 1Leve Cat You could also try adding the option to include interactions Remember to save your template under a new name 11 Using the Preccode PreJcode methods One of the aims of the Stat JR system is to allow other estimation engines aside from our built in MCMC engine to be used with templates We saw in chapter 6 details of how the system can interact with third party software In this chapter and in fact the following three chapters we will see how through the inclusion of additional C and or Javascript code the user can increase the set of models and algorithms that can be fitted using the built in engine At present the methods we describe are partly to advance the modelling but also partly to cover current limitations in the algebra system which will eventually be rectified As the names suggest the preccode and preJcode functions will involve writing C or Javascript code and so some knowledge of these languages would be useful The examples given here will however allow the user with some modification to use similar code for their examples We begin in this chapter with a simple example of a 1 level probit regression model 11 1 The ProbitRegression template We have seen already that the 1Leve Mod template can be used to fit binary resp
119. n algorithm to fit the model This chapter is fairly technical and some readers might find it easier to skip through to chapter 6 We will begin by looking briefly at the current webtest interface into the Stat JR system 5 1 webtest py and the web interface To start up the software for the earlier Regression1 example we used a piece of Python code called webtest py found in the src apps webtest subdirectory The webtest file as the name suggested allows a web interface to be used to run templates from The webtest py file basically starts up a web based application on your machine and links together html files that can be found in the htm l_templates subdirectory of the webtest directory with the user written templates There are several class definitions within webtest py run info summary data image index which relate to which html file is currently being viewed When webtest is started up the index html file which contains the lists of templates and datasets is used There are two ways in which user input is then dealt with Firstly the user might click on for example the Set buttons which will involve the code in the class index being executed You will see in the index class definition the lines i web input if dataset in i context datasetname i dataset if template in i context templatename i template and this is how the user input is captured from this initial screen i e here the web input function wi
120. name output The Engine name being set to output is picked up in the Run method in webtest with the following check elif t EstObjects Engine name output t objects outimg t graphdata context variables This sets an attribute outimg which is picked up in the html template file run htm which then does the following if template objects has_key outimg lt img name current_plot src image outimg gt endif To see this template in action we will pick it from the webtest main screen along with the tutorial dataset and after running we will see the following in the browser Mozilla Firefox istory Bookmarks Tools Help Stat JR Demonstrator Template XYPlot Change Dataset tutorial Change View Summary Configuration Start again Y values school cons vrband schav avsirt standirt schgend student denom gir normexam z X values school Y label X label Name of output results Next Set Inputs ee Perhaps the simplest plot would be of normexam against standirt which you can try yourself Here we illustrate instead the use of more than one y variable by making the following selections 76 Mozilla Firefox Bookmarks YM http 127 0 0 1 8080jrun http 127 0 0 1 8080 run Stat JR Demonstrator Template XYPlot Change Dataset tutorial Change View Summary Configuration Start again Y
121. name beta i if S v name i lt 0 yli rtnormal mean 1 2 0 0 else yli rtnormal mean 1 1 0 0 Here we see that the code involves looping over all data points using a for loop and for each point we evaluate the mean value which is the linear predictor calculated via the substitution Then depending on the value of the binary response a call is made to the truncated normal random number generator via the rtnormal function Here rtnormal takes 5 arguments the mean the standard deviation the type of truncation 1 left 2 right 3 both and finally the left and right truncation values To see this in action we will use the Code button so click on Start Again and fill in the inputs for the model as shown earlier in the chapter Then press the Code button to see the generated code The generated C code is long and so again we will just display the iteration loop void iterate 104 double mean for int 1 0 i lt length y it mean double cons i beta0 double age i betal if use i lt 0 yli rtnormal mean 1 2 0 0 else yli rtnormal mean 1 1 0 0 Update y Update betal double sum0 0 for int i 0 i lt length use i sum0 age int i y int i beta0 cons int i double suml 0 for int i 0 i lt length use i sumlt pow age int i 2 std trl normal_distribution lt double gt normal sum0 sum1 1 sqrt suml bet
122. nfiguration response explanatory variables Use MVNormal update for beta Name of output results Random Seed length of burnin number of iterations thinning Start again y36 cons y8 Yes ratsout 3 hoo o gt poo gt m Next Bat Inputs y y36 x cons y8 my Yes ee Running the template by pressing the Next and Run buttons results in the following output Note that we have selected beta1 for comparison with the Regression1 output C ie S E E E m File Edit View History Bookmarks Tools Help lt oie ay http 137 222 140 64 8081 run Google rJ 2 Most Visited Getting Started z Latest Headlines http 137 222 140 64 8081 run Results tau chain 0 004097 0 001099 ESS 4125 sigma chain 16 06 2 232 ESS 4229 betal chain 1 069 0 268 ESS 4633 beta0 chain 162 2 40 86 ESS 4626 betal 2 5 14 12 2 0 5 Z 1 0 z 15 S 0 8 10 T 0 6 a 5 0 4 oS W 0 2 0 0 O 1000 2000 3000 4000 5000 99 05 10 15 20 2 5 1 0 1 0 parameter value _ 0 8 0 8 0 6 0 6 g g n 0 4 Eo4 0 2 0 2 nant 000i a e0 80 100 120 9 2 4 6 8 10 12 0 006 laa lag 0 005 0 004 H z Y 0 003 E 0 002 0 001 0 000 0 20000 40000 60000 80000100000 updates 124 We can see that this method has given much better mix
123. nished initialising str time clock run_time Here we are simply setting starting values for all parameters and MH settings for each parameter The next chunk of code algorithm if S accept gt desacc tol and S accept lt desacc tol S acceptinrow 1 if S acceptinrow 3 S acceptdone 1 else S acceptinrow 0 if S accept gt desacc sd 2 100 accept 100 desacc else S sd 2 S accept desacc S accept at oun gives the basic form for the adapting steps that are part of the MH algorithm Then we have while adapt scipy weave inline self code vars keys self data keys local_dict vars global_dict self data support_code self scode compiler gcc type_converters converters blitz headers lt vector gt extra_compile_args 03 it it 1 This basically compiles and runs one iteration of the code and adds 1 to an iteration counter There is then a long section of code not shown that forms the steps to be used to adapt the proposal distributions for MH sampling After a message to show adapting has finished the above code for compiling and running one iteration of the algorithm is repeated this time within a loop for the desired number of iterations 39 The code is completed by a chunk of code that stores the values of parameters for iterations after the burnin if we are storing chains and updates summary statistic
124. nstruct the data as a matrix invars from which we can construct the correlations corrs by a call to the numpy corrcoef function Then we again format the output nicely into tabout The command print tabout is simply a debug line as this printing occurs in the debug window only The webtest program knows to do something with the output as in the build_form method the lines elif hasattr t tabledata t EstObjects Engine Text t EstObjects Engine name taboutput pick up the method tabledata and flag the template s name Engine as taboutput this is then picked up in the Run method where the lines 72 elif t EstObjects Engine name taboutput t objects outtab t tabledata make a template object outtab which is checked for by the html template run htm with the lines if template objects has_key outtab lt print str template objects outtab gt lt table class config gt for i in range 0 len template objects outtab lt tr gt for j in range 0 len template objects outtab i lt td gt S template objects outtab i j lt td gt endfor lt tr gt endfor lt table gt o endif This places the output in the correct place If we use this template with the tutorial dataset we first need to set and run it to get the default screen 2 Mozilla Firefox http localhost 8080 run 2 Stat JR
125. o L Bacons f standlrt By xd By axl t T 0 001 0 001 e llt Model model for i in 1 length normexam normexam i dnorm mu i tau mu i lt cons i beta0 standlrt i betal Priors beta0 dflat betal dflat tau dgemma O 001000 0 001000 sigma lt 1 sqrt tau Run Code JS Set When we were describing the webtest Python file we noted that under the POST method we were able to find the lines of code that are executed when various buttons are pressed Similarly the code that is executed when the JS button is pressed can be found within the POST method of webtest and is as follows ini m Model m modelspec t output m prejcode t prejcode m postjcode t postjcode m compilemodel m m t if Js burnin t EstObjects burnin iterations t EstObjects iterations applydata m context variables if hasattr t customsteps for p in t customsteps print Custom step for parameter p m params p tree for p in t monitor_list m params p monitor return render jsrun model True m genJS Again this code snippet is very similar to that used for running the model or for generating C code There are references to prejcode and postjcode which are used for users to incorporate additional chunks of Java code as is done for C in preccode and postcode The main difference is that now we Call the method genJS for
126. o output an image file to the browser The graphdata code is as follows def graphdata self data import numpy from matplotlib figure import Figure import matplotlib lines as lines from matplotlib backends backend_agg import FigureCanvasAgg import subprocess import tempfile import os fig Figure figsize 8 8 ax fig add_subplot 100 10 1 xlabel str self objects xaxislabel ylabel str self objects yaxislabel for n in self objects yaxis ax plot self data self objects xaxis self data n x canvas FigureCanvasAgg fig directory tempfile mkdtemp os sep canvas print_figure directory xyplot png dpi 80 return directory xyplot png We have to firstly import several Python libraries in order to call the graphics function We are using the Figure function from the matplotlib package We then make a new plot using the Figure command before sticking on the axes labels and looping over the y variables and plotting their points The x is the symbol to be plotted for each plot The last four lines are used to store the plotted figure as a png file and the full pathname for this file is returned by the function Once again the webtest program looks for the method name to know to plot the figure This is achieved in the build_form function via the following code elif hasattr t graphdata t EstObjects Engine Text 75 t EstObjects Engine
127. objects var ParamScalar print initial values str self objects var _name m addparam var if self objects var answered Fals m params var current_value 1 0 else m params var current_value float self objects var name if type self objects var DataVector print var name self objects var name m adddata self objects var name if variables has_key self objects var name m data self objects var name variables self objects var name if type self objects var ParamVector print initial values str self objects var _name if self objects var ncols lt 0 m addparam var if self objects var answered Fals m params var current_value 0 1 l self objects var ncols else m params var current_value map float self objects var name m params var current_value 0 1 len variables t objects y name else for i in range 0 self objects var ncols m addparam var str i if self objects var answered Fals m params var str i current_value else m params var str i current_value map float self objects var name i 35 if type self objects var DataMatrix for i in range 0 self objects var ncols print var name self objects var name i m adddata self objects var name i m data self objects var name i variables self objects var name i if type se
128. odelling templates to more levels and to include random slopes They will also form the basis of the REALCOM templates we are currently writing to mimic the functionality that exists in the REALCOM software program You will see that these templates are pretty big and involve coding in several languages Python WinBUGS model code LaTeX C and JavaScript It is hoped that with advances in the algebra system the reliance on the preccode prejcode functions will be reduced but if you want to look at the other multivariate templates you will see many similarities in the code in these functions This is one of the plus points of the open source nature of the Stat JR system We will finish this documentation by considering two more examples of getting more from the MCMC estimation engine 120 13 Improving mixing 1LevelBlock and 1LevelOrthogParam In this chapter we will return once again to our first template Regression1 but use it on a different dataset rats This dataset contains the weights of 30 laboratory rats taken at weekly intervals from 8 days old We will regress their final weight at 36 days on their initial weight at 8 days old 13 1 Rats example We will set up a simple regression for this rather small dataset as follows Help ro MM http 127 0 Template Regression Change Dataset rats Change View Summary Configuration Start again response y36 explanatory variables cons y8 Name of
129. onse models and we have demonstrated a logistic regression model for the bang dataset A probit regression is similar to a logistic regression but uses a different link function One interesting feature of a probit regression is that the link function is the inverse normal distribution cdf This means that we can interpret the model using latent variables in an interesting way Imagine that you had a variable which was a continuous measurement but that we can only observe a binary indicator as to whether the variable was above or below a threshold for example in education we might have a mark in an exam but the student is only told whether they pass or fail If we model the pass fail indicators using a probit regression then this is equivalent to assuming the unobserved latent continuous measure follows a normal distribution with threshold 0 and variance 1 We can use this latent variable representation of the probit model in MCMC estimation by generating the latent continuous response variable as part of the algorithm Having generated the latent variable we then have a normal response model for these variables which is easy to fit The 100 ProbitRegression template fits a probit regression using this technique and we will add the step to update the continuous response variables via the preccode prejcode methods It should be noted that the logistic regression can also use a latent variable representation In this the latent variables follow a
130. ooks like Eile Edit View History Bookmarks Tools Help C 4 5 bttp 137 222 140 64 8081 run YY S Googie P 2 Most Visited Getting Started Latest Headlines _ httpv 137 222 140 64 8081 run normexam N u 0 2 f RE E FN 2 P 2 e Hi Bocons 8 standlrt Up schoolj CBSE Uy schooll standlrt ul j 0 school a N 0 Ql uy 0 1 school 2 Q oc By x1 By xl 7 T 0 001 0 001 2 o 1 r Model model for i in 1 length normexam normexam i dnorm mu i tau mu i lt cons i beta standirt i betal u0_O school i cons i ul_O school i standirt i for il in l length u0_0 dummy 0 i1 Gnormal2a u0_O i1 ul_0 i1 0 0 d u1 0 d_u1 1 d_u1 2 u0_0 1 dflat z ul_0 ii dflat Priors beta0 dflat betal dflat tau dgamma 0 001000 0 001000 sigma lt 1 sqrt tau Here we see the LaTeX code including the multivariate normal distribution for the random intercepts and slopes You will see how this is written in the model code for il in 1 length u0_0 dummy_O il dnormal2a u0_O il ul_O il 0 0 d_ul 0 d ulil d_ul 2 u0_O il dflat ul_O il dflat The dnormal2a distribution has as its first two arguments the two responses Next we get the two means and then the three parameters that make up the precision matrix As the algebra system expects al
131. or beta step if itnum 0 struct _gmatrix xty matrix_blank S nbeta 1 for int i 0 i lt length S context x 0 i for j in range 0 nbeta xty gt e j 0 double S context x J j il double context y i for k in range 0 nbeta xtxinv gt e S j k double context x j i double S context x k i endfor endfor matrix_sym_invinplace xtxinv for int j 0 j lt S nbeta j for int k 0 k lt S nbeta k mean gt e j 0 xtxinv gt e j k xty gt e k 0 matrix_free xty Multivariate step for beta struct _gmatrix variance matrix_blank S nbeta S nbeta for int j 0 j lt S nbeta j for int k 0 k lt S nbeta k variance gt e j k xtxinv gt e j k tau struct _gmatrix rand rmultnormal mean variance matrix _free variance for j in range 0 nbeta beta j rand gt e j 0 py_beta j PyFloat_FromDouble betaS j PyDict_SetItemString raw_locals betaS j py_betaS j Py_DECREF py_beta j matrix _free rand return MakoTemplate extraccode render self objects Choose the 1Leve Block template and the rats dataset and set up the inputs as follows 123 2 Mozilla Firefox bttp 127 0 0 1 8080 run Stat JR Demonstrator Template 1LevelBlock Change Dataset rats Change View Summary Co
132. outcol Text Output column name type Text Type of number to generate Uniform Random Binomial Random Chi Squared Random Exponential Random Gamma Random Normal Random Poisson Random Constant Sequence Repeated sequence if type Binomial Random prob Text Probability numtrials Integer Number of Trials if type Chi Squared Random degreefr Integer Degrees of Freedom if type Gamma Random shape Text Shape if type Poisson Random exp Text Expectation if type Constant value Text Value if type Sequence start Integer Starting Value step Integer Step if type Repeated sequence max Integer Maximum number repeats Integer Repeats per block We see that there are two main attributes a name for the additional column outcol and a type of column to generate Depending on the type there may be additional attributes and these are catered for through a set of if statements in Python So for example if we want a constant column 65 we will have an additional attribute value which gives the value of the constant Note that the length of the vector is controlled by the lengths of the columns already in the dataset as a dataset is currently restricted to be a set of columns of equal length As this template is not a model template there are no outbug or outlatex attributes
133. porary directory but may be linked in by the E book extension to Stat JR when it is written produce diagnostic plots 63 STATAobj DoScript predict yhat mu npredict ehat response npredict rhat response standardized n STATAob j DoScriptt scatter ehat yhat n dumfile re sub STATAobj DoScriptOut dumfile2 re sub dumfile STATAobj DoScript graph export re sub _outputlog ResivsFitted png dumfile2 replace n STATAobj DoScript egen rank rank rhat n STATAobj DoScript gen nscore invnormal rank _N 1 n STATAobj DoScriptt scatter rhat nscore n STATAobj DoScript graph export t re sub _outputlog QQ Plot png dumfile2 replace n STATAobj DoScriptt exit clear STATA n write the script out file open STATAobj myDoScript w file write STATAobj DoScript file close The final few lines finish creating the script as a text file and then the execute method of the STATAInput object is called as shown below to fire up Stata and run the code def Execute self self executable EStat configuration get STATA executable command s s self executable self myDoScript self OutStatus subprocess Popen command wait Having run the model there are two further functions to display the output and summarise the parameter estimates This code is shown below def DisplayOut self self out open sel
134. r i in range 0 uS i 1 S context c endfor Wr lt ole n i ol h Fh ae ol for i in range 0 for uS i 1 aiS i 1 endfor Priors o o aP i in 1 length y numlevs str i 1 T EE cons i numlevs iS i 1 in l length uS i 1 dnorm 0 tau_u i 1 for i in range 0 x ncols betaS i dflat endfor if D Normal tau dgamma 0 001000 0 001000 sigma lt 1 sqrt tau endif for i in range 0 numlevs 98 tau_u i 1 dgamma 0 001000 0 001000 sigma_u i 1 lt 1 sqrt tau_uS i 1 o endfor Here we introduce the use of local variable numlevs The outbug attribute is a text string with substitutions If we wish to include a Python statement inside the text string we place it within a lt and a gt In this case we set a value to numlevs and then use it as the upper bound in loops later in the code You will see that the rest of the code contains many of the features we have discussed in earlier examples You do however have to be careful as the code is a mixture of WinBUGS like model code and Python code For example consider the following code for our cross classified model for i in range 0 numlevs for i i 1 in 1 length u i 1 u i 1 i i 1 dnorm 0 tau_u i 1 E Here we are using i in both the model code we are constru
135. random at school classification cons standirt Priors Uniform Name of output results tutout Choose estimation engine eSTAT WinBUGS eSTAT Random Seed 1 length of burnin 1000 number of iterations thinning 107 The invars function for this template is quite long as we see below tr invars NumLevs Integer Number of Classifications for i in range 0 int NumLevs selstr Classification str i 1 context C str i 1 DataVector selstr y DataVector response D Text specify distribution Normal Binomial Poisson if D Binomial n DataVector denominator link Text specify link function logit probit cloglog if D Poisson link Text value ln offset Boolean Is there an offset if offset n DataVector offset if D Normal tau ParamScalar sigma ParamScalar x DataMatrix explanatory variables beta ParamVector beta ncols len x for i in range 0 int NumLevs context x str i l DataMatrix explanatory variables random at context C str i 1 classification for var in range 0 len context x str i l context u str var _ str i ParamVector num len context x str itl1 if num 1 context tau_u0_ str it 1 ParamScalar context sigma_u0_
136. ressed can be found and is as follows 41 if code in i web header Content Type text plain m Model m modelspec t output m preccode t preccode m postccode t postccode m compilemodel m burnin t EstObjects burnin m iterations t EstObjects iterations if seed in t EstObjects m seed t EstObjects seed t applydata m context variables if hasattr t customsteps for p in t customsteps print Custom step for parameter p m params p tree for p in t monitor_list m params p monitor True return m genCPP You will notice here similarities to the code in the go function that we used when running a model Basically we again form a model and set some of the attributes and call the compilemodel and applydata methods as described in sections 5 2 1 5 2 4 The difference here is we replace the calls to run and summarise with a returning of the genCPP method 5 3 1 Gencpp The gencpp method code is quite short def genCPP self self code stdtmpl get_template statlib cpp render seed self seed self code void iterate n self cod self preccod for k in self params keys self code Update k n self code XMLtoPureC self params k tree self params k mode self params k algorithm toC self code self params k purec self code n self code self postccode self
137. rs to the sb1 and vw1 parameters Next we have matrix_sym_invinplace sb it1l struct _gmatrix va i l rwishart vw i 1 sb i 1 matrix_free sb it l lt count 0 gt oe or j in range 0 n or k in range j n d_u itl count va S itl gt e S j k lt count 1 gt endfor endfor f f oe ol matrix_sym_invinplace va itl count 0 gt or j in range 0 n or k in range j n omega_u i 1 count va i 1 gt e j k lt count 1 gt endfor endfor A oe f f oe ae matrix _free va it l Sendif sendfor return MakoTemplate extraccode render self objects In this last chunk of code we invert the sb1 parameter before drawing the new precision matrix which we give the place holder vai The last chunk then involves copying these values to the vector d_u1 and the inverse matrix to the vector omega_u1 To see the code that the preccode method generates for our example we can click on the code button and search for the iterate function This can be seen here static int itnum 1 itnum 1 Note currently using a uniform prior for variance matrix struct _gmatrix sbl matrix_blank 2 2 for int i 0 i lt length u0_0 i sb1 gt e 0 0 double u0_O i double u0_0O i sb1 gt e 0 1 double u0_O i double ul_O i sb1 gt e 1 0 double ul_O i
138. s We will now look at a generalisation of these template 1LevelMod that allows other response types including Binomial and Poisson responses in a range of packages This template will illustrate the use of conditional statements within the invars and outbug functions We will begin by looking at the template in action in webtest The template should be able to fit all the models that Regression fits and so you could try repeating the earlier regression analysis but here we will look at a logistic regression From the main browser screen we need to set the template to be 1Leve Mod and the dataset to be bang our example binary response dataset Clicking on Run gives the following output in the browser http 127 0 0 1 8080 run Stat JR Demonstrator Template 1LevelMod Change Dataset bang Change View Summary Configuration Start again response qd it specify distribution Normal Next Bal Inputs ee We will now set up the various inputs as shown on the following screen 79 L retees127 0 0 1 2080 rury httg 127 0 0 1 8080 run Stat JR Demonstrator Template 1LevelMod Change Dataset bang Change View Summary Configuration denominator n specity link functon agit explanatory variables cons age Name of output results bangout s estimation method by MCMC y s eSTAT V MLWIN R eSTAT Random Seed dumin 1000 number of iterations th
139. s if we are not storing the chains o if iteration 1 gt self burnin and iteration self thinning for k in self params keys if self params k monitor True if type vars k list for p in range len vars k self params k chain p iteration self burnin self thinning vars k p else self params k chain iteration self burnin self thinning vars k else if type vars k list for p in range len vars k self params k rs p Push vars k p else self params k rs Push vars k print Finished iterating str time clock run_time for k in self params keys self params k current_value vars k This run function therefore runs the whole model You will note that this is done in Python but with calls to C code to update the parameters for each iteration This can be contrasted from the code generation which we will look at in section 5 3 where the code is all written in C 5 2 6 Summarising the results Having run the model the last call before returning the model is to the model summarise method This method is used to calculate summary statistics for all parameters as shown in the following code run html then loops through the dictionary returned by the method and prints out the value stored for each key parameter Ha def Summarise sel prec 4 results for p in self params keys results p if self params p monitor True
140. s op averages tabout name count mean sd for i in range 0 len self objects vars var numpy array self data self objects vars i row self objects vars i len var var mean var std tabout append row if self objects op correlation invars numpy empty len self objects vars len self data self objects vars 0 for i in range 0 len self objects vars for j in range 0 len self data self objects vars i invars i j self data self objects vars iJ j corrs numpy corrcoef invars tabout row for j in range 0 len self objects vars row append self objects vars j tabout append row for i in range 0 len self objects vars row self objects vars i for j in range 0 len self objects vars row append corrs i j tabout append row print tabout return tabout You will see separate chunks of code for averages and correlations The average code basically initializes a table output with a column heading and then loops through the columns in vars setting each in turn as a numpy array stored in var An array of text strings is then stored in row and we use the len function to get the number of data items and the built in numpy functions mean and std to get the mean and standard deviation respectively This row is then appended to the output tabout The correlation code is slightly longer We first need to co
141. s algebraic processing system in isolation as it also has some nice output screens that are useful for teaching purposes Start up the Demo executable on your machine you can find this in the Demo bin subdirectory underneath the base directory where you extracted StatJR This will bring up the program and a large number of windows 31 F Testbed F File Window Distributions Functions Preferences About F BUGS Input PIES Complementary log log cloglog x y e exp exp y You will need to copy the model for the Regression1 template onto the clipboard and then paste this into the window entitled BUGS Input as follows Vd rE Be Window Deetutens Eurebons Breferences About E ouas mo iss n deli for i in Lilength nornexan normexan i dnorm muli tau muli lt cons i beta 4 standlrt i betal i Priors betat dflat betel dflat tau dganma 0 001000 0 002000 riga lt 1 eqrt tau cloglog x y x exp exp y Be careful if you are copying from the web browser as it may lose the endlines and this will then not work properly but things are OK if you copy somewhere else first You will notice when the model 32 code has been copied that the program takes a little while computing things and the other windows will then change If we select the Graph Nodes Window and make this full screen and choose betaO we get the following normenxam mul tou coms betad
142. s called to link in data into an attribute called variables and then depending on the template type and whether we are using WinBUGS see chapter 6 the output and outputlatex methods are called and allocated to statstemplate and latextemplate respectively These calls are to update the screen and display the model code and LaTex At this stage the buttons to allow the model to be run or for C or Java code to be generated appear in the web browser Each of these buttons correspond to a different value of input within the POST method chunk of code Note in the html code we switch to the POST method once all inputs are generated i e at the same point as the buttons appear We will describe what happens when we fit a model or generate source code in later subsections but we next need to turn our attention to the concept of a model as opposed to a template 5 2 Model Templates and model py In Stat JR we are anticipating advanced users writing their own templates and our definition of a template is an object that takes user input and produces sufficient output for the system to do a particular task When we describe model templates their task is to get enough information to set up a model object that can then be used to fit a statistical model We have a specific type of model object that is used to fit models using the estimation engine within Stat JR In chapter 6 we will look at how things work when we don t use this engine but instead use o
143. s purely decorative We will not give a crash course on LaTeX here but essentially the aligned environment is used to write a set of mathematical equations with the amp sign denoting the place where to line up horizontally the lines and the double slashes denoting new lines LaTeX uses the preceding terms to denote special characters e g beta gives a Greek lowercase beta The aligned environment is for mathematics and so if we wish to write words in normal font we enclose them in a mbox With this basic knowledge 22 you should be able to compare the source code and the maths it produces and thus see what each of the special characters is 3 3 Writing your own first template We haven t at this stage explained how the outbug function is used to create code to fit the model This is all generic code that is common to all templates and which we will introduce in chapter 5 It is enough for now to realise that to write some basic templates simply requires writing code similar to that seen here the Stat JR system will do the rest of the hard work for you We will now test your understanding by getting you to construct your own first template Exercise 1 It is best when starting to write templates to begin with a template that works and modify it to confirm you understand what is going on You will therefore now take the Regression1 template and construct a template for an even simpler model a simple linear regression i e with one explan
144. scribed in the next section the Stat JR system is built around the construction and use of templates self contained objects to perform specific statistical tasks Templates will be created by both advanced practioners and algorithm writers and used and shared by these two groups as well as novice practitioners Many of the ideas within the Stat JR system were the brain child of Jon Rasbash JR Sadly Jon died suddenly just as we began developing the system and so we dedicate this software to his memory We hope that you enjoy using Stat JR and are inspired to become part of the Stat JR community through the creation of your own templates that can be shared with others or simply through feedback on the existing templates Happy Modelling The Stat JR team 1 2 About the Advanced User s Guide A major component of the Stat JR package is the use of often user written templates Templates are pieces of computer code written in Python that perform a specific task Many of the templates are used to fit a family of statistical models although there are other templates that perform data input data manipulation and data and graphical outputs Templates appear to the end user as a window with a series of widgets and drop down boxes which allow the user to specify the model to be fit the data to be manipulated etc In this document it is our aim to give advanced users who intend to write their own templates or more generally are interested in how
145. similar to the preccode function so we do not go into details here Exercise 9 We have not written a 2Leve withRS template so try adapting the NLeve withRS template so that it only allows one higher classification This exercise will be in essence a merging of features of two templates 2 evelmod and NlevelwithRS and will test your understanding of the various chunks of code 113 12 3 Multivariate Normal response models Having established a method of including multivariate distributions for use with random slopes in the preccode we can reuse the same method to allow us to fit multivariate Normal response models We will here consider the template for fitting 1 level multivariate response models MVNormal1Level This template can be used to fit models with missing data for some responses which is achieved by a method similar to that used for the probit regression and so the preccode will generate two steps one for the variance matrix of the responses and an intial step to set up the missing responses The invars function is as follows invars y DataMatrix responses lenbeta 0 for i in range 0 len y context x str i l DataMatrix explanatory variables for response Ve oe Sr RE ake tate SP lenbeta len context x str itl context miss t y i ParamVector n len y if n tau ParamScalar sigma ParamScalar else omega_e ParamVector omega_e ncols n 1 n 2 d_e ParamVector
146. t algebra system in Stat JR does not produce multivariate posterior distributions We can however work out the correct posterior distribution by hand and plug this into the code via the preccode This is performed by the 1Leve Block template If you look at the template code you will see it has an initial input in the invars attribute requiring the user to specify whether or not to block the fixed effects Otherwise the code is little changed from the template 1Level Mod template apart from the additional preccode method which we consider now The first chunk of code is as follows self customsteps if not self objects mv return nbeta len self objects x for i in range 0 nbeta self customsteps append beta str i 122 Here we see for the first time the preccode function doing something other than writing a text string This code is additionally informing the code generator that custom steps i e user written steps are to be used for the beta parameters when the block updating option mv is selected The remainder of the code updates the beta vector which has mean in matrix form X X X y and variance XX times the residual variance The code is as follows extraccode lt nbeta len context x gt static int itnum 1 itnum 1 static struct _gmatrix xtxinv matrix_blank S nbeta nbeta static struct _gmatrix mean matrix_blank S nbeta 1 Setting up constant terms f
147. t these figures mean in this Advanced User s Guide interested readers can look at the accompanying Novice Practitioner s Guide to be written in 2011 for such information The aim here is to teach you how to write a similar template yourself 16 3 2 Opening the bonnet and looking at the code The operations that we have performed in fitting our first model are shared between the user written template Regression1 and other code that is generic to all templates and which we will discuss in more detail later So our next stage is to look at the source python file for Regression1 All templates are stored in the templates subdirectory under the base directory and have the extension py If we open regression1 py in a text editor such as Wordpad or Notepad rather than Python we will see the following from EStat Templating import from mako template import Template as MakoTemplate import re class Regressionl Template A model template for fitting 1 level Normal multiple regression model in E STAT only To be used in documentation tags model 1l Level invars y DataVector response tau ParamScalar sigma ParamScalar x DataMatrix explanatory variables beta ParamVector beta ncols len x outbug model for i in 1l length S y S y i dnorm mu i tau mu i lt mmult x beta i Priors for i in range 0 x ncols be
148. ta i dflat endfor tau dgamma 0 001000 0 001000 sigma lt 1 sqrt tau outlatex r begin aligned mbox S y _i amp sim mbox N mu_i sigma 2 mu_i amp S mmulttex x r beta i for i in range 0 len x beta_S i amp propto 1 sendfor tau amp sim Gamma 0 001 0 001 sigma 2 amp 1 tau 17 end aligned We will now describe in some detail what this code does The first three lines here are simply importing information needed by the template and are generic to many templates We then have a class statement which defines a class Regression1 which is a subclass of a generic Template class There is then a sentence known as a descriptor that describes what the template does For those unfamiliar with the terminology we are using the word class to describe a definition of a type of object for example we might have a class of rectangles where each rectangle might be described by two attributes length and width Then an instance of the class which we might call Dave will have these values instantiated e g Dave s length is 3 and width is 1 We might think of the subclass of rectangles the squares which again have the two attributes length and width We could state that class Square Rectangle in which case we know that as squares are a subclass of rectangles they have a height and width but we would now redefine the attribute width within the squares definition to equal hei
149. tatistical modelling is that to use the latest techniques can involve having to learn new pieces of software To continue the mountain analogy having climbed up one path with one piece of software this is like having to descend again and choose another path and another piece of software In Stat JR we aim to circumvent this problem via our interoperability features so that the same user interface can sit on top of several software packages thus removing the need to learn multiple packages To aid understanding the interface will allow the curious user to look at the syntax files for each package to learn directly how each package fits their specific problem For example a user familiar with Stata could look at equivalent R code for fitting their model To complete the picture the final group of users to be targeted by Stat JR are the statistical algorithm writers These individuals are experts at creating new algorithms for fitting new models or better algorithms for existing models and can be viewed as sitting on peaks possibly isolated with limited links to the applied researchers who might benefit from their expertise Stat JR will again build links by including tools to allow this group to connect their algorithm code to the interface through template writing and hence allow it to be exposed to practitioners They can also share their code with other algorithm developers and compare their algorithms with other algorithms for the same problem As de
150. ted return None if hasattr t customsteps for p in t customsteps print Custom step for parameter p m params p tree for p in mon m params p monitor True return m Basically we check that something is generated by the algebra system else display an error The last few lines allow custom estimation steps steps written by the user that replace those output by the algebra system by removing the tree code for particular parameters such custom steps will be included in the preccode postccode functions as explained later Finally the list of parameters that we wish to store MCMC chains for is set via the monitor attributes for each parameter and the run method for the model object is called 5 2 5 Running a model the run method and XMLtoC The run method for the Model class contains a lot of Python code and does many things We now need to piece together the parameter trees to form a set of algorithmic steps to fit the model to the data The first chunk of code is as follows def run self run_time time clock total self burnin self iterations vars localvars self scode stdtmpl get_template statlib cpp render seed self seed selftscode self code self preccod for k in self params keys vars k self params k current_value if self params k monitor True 37 if type vars k list self params k chain numpy array 0 0 self iterations self t
151. tements if statements The distribution D is defined as a Text input and you will see 81 that there are a limited number of choices given as a second argument to the statement The webtest program will treat this as a pull down list input with the limited number of choices populating the list As we saw in our example when fitting a Binomial model we introduce additional inputs n the denominator column and link a text based input to indicate the link function note that currently there is a bug in the algebra system so that probit and cloglog do not work correctly We also see that for non normal models there is no level 1 variance and so the quantities tau and sigma are not included 8 2 Methodinput This template has its own Methodinput method designed to allow the user to choose between the various possible engines to estimate the model as shown below def MethodInput self self EstObjects EstM Boolean Is estimation method by MCMC if self EstObjects EstM self EstObjects Engine Text Choose estimation engine eSTAT WinBUGS MLwiN R eSTAT WinBUGS MLwiN R if self EstObjects Engine MLwiN self EstObjects FixedAlg Text Select MCMC sampling method for fixed effect Gibbs MH Gibbs MH if self objects D Normal self EstObjects VarAlg Text Select MCMC sampling method for residuals
152. ter for each data point the latent continuous response We would therefore like the option to not store every value This is done via the monitor_list method which has the following code def monitor_list self mon old_mon Template monitor_list self for m in old_mon if not m y mon append m return mon Here we have a method that goes through all the parameters in the model and as long as they are not equal to y they are appended to the monitor list To see how this feeds through to the system in webtest you will recall that the go function is the main driver and is called thus m go t context variables t EstObjects burnin t EstObjects iterations thinning seed t monitor_list callback You will see that the monitor_list method is called as one of the arguments to the go function We then see within the go function that this argument which has the name mon within the function is used as follows for p in mon m params p monitor True and so only those parameters in the list are monitored The run function will then treat parameters differently in terms of their monitor status To see this in practice we will continue our example and press the Run button The results will look as follows http localhost 8080 run Results y summary2 0 8522 0 6336 0 7419 0 58 0 9356 0 6742 0 9066 0 6589 0 8759 0 641 0 957 0 6857 0 9369 0 678
153. the class Let us return to our rectangle class which had two attributes length and width An example method for this class could be area which would be a function that multiples the length attribute by the width attribute and hence returns the area A second method might be perimeter which adds 2 length to 2 width We will illustrate just two methods which might help explain a little about what is going on Firstly all classes will have an initialisation method often called a constructor in object oriented programming terminology and in Python these are usually written as _init__ The code for initialisation is short def _ init__ self self objects MyDict self outputs self EstObjects MyDict You will notice that all methods refer to self which says that the method operates on this particular instance of the class The initialisation code simply sets up 3 attributes within the instance of the 24 class objects outputs and EstObjects Currently objects and EstObjects are empty MyDict objects and outputs is an empty dictionary So we have here populated an instance of the class with some placeholders for attributes that will be created properly elsewhere To illustrate this further we will look at the Method input method This method is used to ask for some of the other user inputs we saw being input when we ran the Regression1 template The code for this method is as follows Create another set of inp
154. the model and switch to the html template file jsrun htm which is in the html_templates subdirectory of the webtest directory 45 5 4 1 Running the Regression1 example The method genJS works in a similar way to method genCPP but instead of C code generates Javascript using the class XMLtoJS code in src lib EStat We will omit details here but show the output The JavaScript screen looks as follows I Mozilla Firefox http localhost 8080 run Run Reset To run the script we simply press the Run button and we will get the following output after potentially waiting for some time done in 8613ms mean std ESS iter 2499 5 1443 3756441065507 4 tau 1 540803671896099 0 03408111117747336 4503 sigma 0 8057606978756133 0 008917708510009217 4499 betal 0 5949691577279538 0 012812573352366123 5457 beta 0 0010476597232762523 0 012576384256059106 5195 betal z 0 5175 0 4644 0 4114 0 3584 0 3054 0 252 0 198 0 145 0 092 0 0394 0 014 E REEFS ee 0143 429 714 1000 1286 1571 1857 2143 2429 2714 3000 3286 3571 3857 4143 4429 4714 5000 Note that it is best here to use a Firefox browser as the Javascript engine that comes with Internet Explorer can be slow and give lots of warning messages The output we see is a summary of each parameter and the chain for the first parameter with a pull down list for other parameters We have not fully integrated the JavaScript options in St
155. ther action is required because the algebra system has already evaluated the correct posterior We next construct a matrix variable sb1 which initially stores the crossproduct matrix of the residuals before moving to the next chunk of code if context priors str i Uniform int vwS itl length u0_ i S n 1 if itnum 0 oe for j in range 0 n sb i 1 gt e j j 0 0001 endfor endif if context priors str i Wishart int vwS itl length u0_ i S context v str i endif if context priors str i Wishart lt import numpy Rmat numpy empty n n count 0 for j in range 0 n for k in range 0 j 1 Rmat j k float context R str i name count Rmat k j Rmat j k count 1 S gt lt count 0 gt for j in range 0 n for k in range 0 n sb itl gt e S j k stxr Rmat j k float context v str i ae o9 111 o9 endfor endfor endif oe ae In this chunk of code we have different blocks depending on the type of prior distribution For the uniform prior we simply construct the degrees of freedom parameter vw1 which equals the number of higher level units minus the number of sets of random effects 1 We also have some code for the first iteration to avoid numerical problems as the residual starting values are all the same For the Wishart prior we have to add the prior paramete
156. ther packages The definition of the model class can be found in the file model py found in the src lib EStat subdirectory and the methods within the model class are what are used to fit a model using the Stat JR system Continuing our exploration of the webtest code you will find that if the run button is pressed then the following code is executed rE Sewn Dr Sass MLWiNESt None m None if t EstObjects Engine name eSTAT if seed in t EstObjects seed int t EstObjects seed name else seed 1 thinning 1 if thinning in t EstObjects thinning int t EstObjects thinning m go t context variables int t EstObjects burnin int t EstObjects iterations thinning seed t monitor_list callback Here the first if statement establishes that we are intending to run a model and the next if statement checks we are running the in house estimation engine currently called eSTAT Then we have a few initialising lines before the important go function is called and the instance of model object m is allocated the output of go The go function can be found near the top of webtest py and has the following code def go t variables burnin iterations thinning seed mon c m Model m modelspec t output m preccode t preccode 29 postccode t postccode compilemodel callback c burnin burnin iterations iterations thinning
157. tputlatex Firstly the sixway function that creates the graphical summaries for each parameter in the model that is monitored is called and these graphs are stored within the template instance Next the current plot is assigned so that we know which parameter to display The results from fitting the model are stored as an output dataset so that they can be used by other templates and finally we recreate the model specification and LaTeX outputs so that they appear on screen Then we are done and the browser will show all the output We next look at alternatives that rather than running the model directly instead simply generate code to fit the model 5 3 C code generation XMLtoPureC In the last section we looked at how Stat JR uses the built in MCMC estimation engine to fit the model specified by a template to a particular dataset Stat JR can also output free standing C code that can be copied and compiled in isolation This code performs essentially the same estimation algorithm for the model dataset combination The advantage of this option is that the code can be scrutinised and modified and we hope that possibly this will be a route to allow users in the future to increase the range of algorithms on offer When we were describing the webtest Python file we noted that under the POST method we were able to find the lines of code that are executed when the Run button is pressed Similarly the code that is executed when the Code button is p
158. utput reset iLevelBlock p Set 1LevelCat 1LevelCatPred 1LevelComplex 1LevelMod 1LevelOrthogParam 1LevelPred 1Levelwithinit 1LevelwithInteractions 2LevelBlock hal Select dataset alevchem a Seil bang bes83 birds classsize classsize_inpute diag gcsecomp gcsemv growthdata height hungary 2 7 Add a shortcut to Stat JR on your Start menu or on your desktop Finally add a short cut to Stat JR on your Start menu or on your desktop to enable you to easily find and launch the software at the beginning of each session 11 3 Asimple regression template example 3 1 Running a first template We will firstly consider a very simple template that is included in the core model templates distributed with Stat JR which has the title Regression1 Perhaps before looking at the code it would be good to run the template in the web interface to see what it does To do this launch Stat JR full installation instructions are given in Chapter 2 If you refresh the screen you should be greeted by a display similar to the following Configuration Current Template Regression1 Current Dataset tutorial Run Template Information Select template 1 Level 2 Level multivariate N Levellinput NOCEljoutput reset 1LevelBlock E Serl 1LevelCat 1LevelCatPred 1LevelComplex 1LevelMod 1LevelOrthogParam 1LevelPred 1LevelwithInit 1Levelwithinteractions 2
159. uts related to estimation method and engine Will also define type of outputs specific to method engine def MethodInput self self EstObjects Engine Text value eSTAT self EstObjects seed Integer Random Seed self EstObjects burnin Integer length of burnin min 0 self EstObjects iterations Integer number of iterations min 0 self EstObjects thinning Integer thinning self EstObjects nchains 1 This function is essentially populating the dictionary known as EstObjects EstObjects is an instance of a MyDict class that has been defined elsewhere and is essentially a list of objects of varying types where each object has a label and a value The set of lines within the Method input method then define the labels and types of all objects in the dictionary and either assign a value to the object as for the Engine object or give a text string that appears on the screen when this function is called as for the seed object We repeat below the inputs for the Regression1 example 2 Mozilla Firefox File Edt vie lt PME http flocalhost e080 runf Bl Most visited Getting Started MM Latest Headlines E stone Bind E Ea C http localhost 8080 run 7 F i Stat JR Demonstrator Template Regression1 Change Dataset tutorial Change View Summary History Bookmarks Tools Help Configuration Start again response
160. we will discover the page lengthens to incorporate a final box entitled Results Bit localhest 8080 run Results 1 70 1 55 e g 150 kernel density 4 Fi f 145 2 fi W 1 404 Ot e 0 1000 2000 3000 4000 5000 140 145 150 155 160 165 1 70 10 stored update 0 015 parameter value 0 010 o8 0 005 1 I A AN Da 1 0 025 o 20 40 60 80 100 128 0 rr es E 0 6 0 000 Lag lag 40 005 a4 40 010 0 2 40 015 ACF 0 0 E 0 000 9d O 20000 40000 60000 80000100000 updates Inputs y normexant xt cons standit This screen contains summary statistics for four parameters beta0 beta1 tau and sigma The last of these sigma is simply a function of the precision parameter tau Thus the model intercept and slope coefficients are 0 001 and 0 595 respecitvely while the corresponding standard errors are 0 013 and 0 013 The residual variance is estimated as 0 65 0 806 2 The screen also displays a range of figures for the MCMC output for the tau parameter There is a drop down box at the bottom of the screen from which we can choose to view the MCMC graphs for the other parameters The five figures contain a trace plot of the 5 000 estimates a kernel density plot plots of the autocorrelation ACF and partial auto correlation PACF functions for assessing mixing of the chains and a Monte Carlo standard error MCSE plot We will not go into detail about wha
161. x cons standlrt Se Here we see a mathematical representation of the model created in outlatex along with the model code in outbug We saw in the last chapter the use of preparedata and this method is used here as well def preparedata self data self data data self objects u ncols 1 len numpy unique self data self objects L2ID name return self data All we use preparedata for apart from the default operation of copying the data to the template is to assess the number of unique level 2 identifiers and hence the length of the level 2 residual vector u Let s look at outbug outbug model for i in 1l length S y Sty Hi if D Normal 93 dnorm mu i tau I NS if D Binomial l i link p il lt ol pi Hh UO Poisson 1 link p i lt dnorm mu i tau mu i lt endif S mmult x beta i u L2ID i cons il for j in l length u if D t u j at 0r tau_u df else u j dnorm 0 tau_u endif Priors for i in range 0 x ncols betaS i dflat endfor if D Normal tau dgamma 0 001000 0 001000 sigma lt 1 sqrt tau endif if D t tau dgamma 0 001000 0 001000 sigma lt 1 sqrt tau df dunif 2 200 df lt 8 endif tau_u dgamma 0 001000 0 001000 sigma_u lt 1 sqrt tau_u The code
162. xample that for the Regression1 template where we started the set of beta parameters are all updated individually through univariate normal steps We will investigate the implications of this in chapter 13 In chapter 10 we introduced our first multilevel models all of which only had random intercepts To extend such models to include random slopes requires assuming slopes and intercepts are correlated the use of a multivariate normal distribution for the random effects Multivariate normal distributions by their nature have vector rather than scalar parameters and so our model code diverges from standard WinBUGS model code here Our improvised model code depends on the number of response variables i e we have bivariate trivariate etc normal distributions We will see how these work in practice via the template NLevelwithRS 12 1 An example with random slopes Select NLevelwithRS from the template list and tutorial from the data list Then choose the inputs as follows File Edit View History Bookmarks Tools Help ro gt C AA http 137 222 140 64 8081 run A Most Visited Getting Started a Latest Headlines Stat JR Demonstrator Template NLevelwithRS Change Dataset tutorial Change View Summary http 137 222 140 64 8081 run Configuration Start again Number of Classifications 1 Classification 1 school response normexam specify distribution Normal explanatory variables cons standirt explanatory variables
163. ython version of py2 7 Locate and install the following five packages choose the mkl version of the packages if it exists 1 NumPy 2 SciPy 3 Numexpr 4 Matplotlib 5 Setuptools Note you may want to use the search facility in your internet browser to locate the packages When installing the packages accept all the default file directories 2 3 Install two further packages Next open a command prompt window for example click the Start button then select run Type cmd in the run window and click OK and run the following commands cd Python27 Scripts If you installed Python elsewhere substitute this directory easy_install mako easy_install web py These will install two further packages that we use with Python Close the command window by running the exit command 2 4 Install MinGW Next as we are using the C language within Stat JR you will need to install a compiler Again this will depend on your machine For a 32bit machine Select the following weblink http sourceforge net projects mingw files Automated 20MinGW 2OInstaller mingw Note when you get to the following screen shot make sure to click on the C Compiler check box before proceeding iix Select Components h Choose which optional components of MinGW to install the C compiler is always Q installed MinGW Compiler Suite C Compiler C Compiler O Fortran Compiler CO obic Compiler C Ada Compiler zi lt Back Cancel
Download Pdf Manuals
Related Search
Related Contents
Samsung 320BX Bruksanvisning Operating instruction Mode d'emploi Instruccions de manejo Exhibitor Service Manual Manual de instruções - equipamentos.cdr - E Philips Saeco HD8859/01 coffee maker 420010446300_Kroms VCS 5_Layout 1 HP ML350 User's Manual 取扱説明書 - Panasonic Peripheral Electronics PGHGM1 User's Manual Copyright © All rights reserved.
Failed to retrieve file