
LADES User's manual

1. [Figure 4.19: Longitudinal trellis plot, Weights Experiment; Time vs. Response, one panel per diet] The mixed linear model for this analysis can be expressed as follows: $y_{ij} = (\beta_0 + \gamma_{02} D_{2i} + \gamma_{03} D_{3i} + b_{0i}) + (\beta_1 + \gamma_{12} D_{2i} + \gamma_{13} D_{3i} + b_{1i})\, t_{ij} + \varepsilon_{ij}$ (4.4), where $D_{2i}$ is a binary variable that takes the value 1 if the $i$th mouse receives Diet 2, and $D_{3i}$ is a binary indicator for Diet 3, i.e. 1 if the mouse received Diet 3. $\beta_0$ and $\beta_1$ are, respectively, the average intercept and slope under Diet 1. In this case Diet 1 is called the reference, because our results will be comparisons against it. $\gamma_{0k}$ is the average difference of the intercept under Diet $k$ minus the intercept under Diet 1, and $\gamma_{1k}$ is the average difference between the slope under Diet $k$ (for our example $k = 2, 3$) and the slope under Diet 1. $b_{0i}$ is the random intercept, i.e. the particular mouse's initial weight; $b_{1i}$ is the random slope, i.e. the particular mouse's growth over time. Finally, $\varepsilon_{ij}$ is the model error. This is an example of a model using Treatment contrasts in our model matrix. The model implemented in LADE
2. [Figure 5.10: Residuals Histogram, Linear Regression] [Figure 5.11: Normal QQ Plot, Linear Regression] [Figure 5.12: Independence of Residuals Plot] For our data we can see a cone shape in the graph (Figure 5.13), which indicates that assumption (ii), constant variance, is not satisfied, and we must fit another model with other features or predictors. [Figure 5.13: Residuals vs. Fitted Values Plot] 5.2 Logistic Regression. Logistic regression handles the special case in which the variable of interest takes only two values, e.g. present or not present. Since the response can have two val
3. [Figure 4.58: Main Screen, GLS; Response: distance, Time: age, Units: subject, Factors: sex, Interactions: age:sex] [Figure 4.59: Options Screen, GLS] Tables. • 95% Confidence Intervals for Fixed Parameter Estimates: Figure 4.60 shows 95% confidence intervals and estimates for the parameters in model 4.2.4. In addition, confidence intervals for the correlation and variance structure parameters are shown. A first check of which factors are significant is whether the confidence interval excludes zero. Values shown for each term (three per row): Intercept 23.067, 23.801, 24.536; age 0.524, 0.652, 0.779; sex1 -1.87, -1.136, -0.401; age:sex1 -0.303, -0.175, -0.048. [Figure 4.60: Confidence Intervals of Parameter Estimates, GLS] • 95% Confidence Intervals for Correlation Parameter Estimates: Figure 4.61 shows 95% confidence intervals and estimates for each parameter i
4. [Figure 5.22: Sample Size Estimation for continuous response, Results] 5.4 Sample Size Estimation for Longitudinal Analysis with Binary Response. Before conducting a longitudinal study, it is necessary to calculate the number of subjects needed to ensure that a certain predefined effect is significant. Sample size calculation is based on assumptions which, if altered, can lead to very different sizes. It is related to the significance level of a factor: how many individuals are needed for some effect to be significant? As an example, we have a pilot study with a binary response variable (1 or 0). The researchers want to test a new drug for patients with stomach ulcers. The response of interest is 1 if the patient has ulcers and 0 if no ulcers are present. The duration of treatment is one month and patients will be reviewed three times. The first review is when the treatment is first administered, and patients are then evaluated at months 6 and 12. In this study the performance of the new treatment is compared against a placebo. The difference between the proportions should be at least 16.5% to be considered significant. Previous data collected for the calculation of sample size in this study are shown in Table 5.2. [Table 5.2 lists: proportion of patients with no stomach ulcers, time, measurements correlation, number of measurements across time (3)] Table 5.2: Previous i
5. 4.2.4 Generalized Least Squares (GLS); 5 Statistics Module: 5.1 Linear Regression, 5.2 Logistic Regression, 5.3 Sample Size Estimation for Longitudinal Analysis with Continuous Response, 5.4 Sample Size Estimation for Longitudinal Analysis with Binary Response; 6 Graphics Module: 6.1 Longitudinal Data Graph, 6.1.1 Function outputs; A Term Definitions: A.1 Model Matrix, A.2 Contrasts, A.3 Variance Structures, A.4 Correlation Structures, A.5 Likelihood Criteria. LADES Help: How to install LADES; User Sessions; Import and Export. 1 Introduction. Welcome to LADES Software v2.0.0 (Longitudinal Analysis and Design of Experiments Software). LADES's main goal is to provide an accessible platform using the most advanced tools for the analysis of experimental data. Its design contains statistical methods for: • Design and Analysis of Experiments • Longitudinal Data Analysis • Sample Size Estimation • Visual Elements for Analysing Longitudinal Data. LADES supplies researchers and practitioners with a user-friendly interface with independent working sessions. Furthermore, it has the capability to
6. > Longitudinal Design. The statistical model used by this function is a special case of a linear mixed model. This model has been reparametrized to make the comparisons of the three slopes more direct. The model for this example is: $y_{ij} = \beta_0 + b_{0i} + (\beta_1 + b_{1i})\, t_{ij} + \varepsilon_{ij}$ (Control); $y_{ij} = \beta_2 + b_{0i} + (\beta_3 + b_{1i})\, t_{ij} + \varepsilon_{ij}$ (T B); $y_{ij} = \beta_4 + b_{0i} + (\beta_5 + b_{1i})\, t_{ij} + \varepsilon_{ij}$ (T C) (3.2), where $y_{ij}$ represents the $j$-th measurement of the $i$-th individual; $\beta_0$, $\beta_2$ and $\beta_4$ represent the average intercept for each group, and $\beta_1$, $\beta_3$ and $\beta_5$ represent the average slope for their corresponding group. $b_{0i}$ is the random factor that represents the difference of the initial weight from the mean, while $b_{1i}$ is the random factor representing the difference of each mouse from the average slope of its group. They are independent of each other. The levels for time were coded as -4, -2, 0, 2, 4, so the model is evaluating the decrease in weight around the third week. After fitting the linear mixed model, the estimates were $\beta_0 = 1.3875$, $\beta_1 = 0.0350$, $\beta_2 = 2.3875$, $\beta_3 = 0.17625$, $\beta_4 = 3.3750$, $\beta_5 = 0.3175$, $\sigma^2_{b_0} = 1.1495$, $\sigma^2_{b_1} = 0.0392$, $\sigma_{b_0 b_1} = 0.163375$ and $\sigma^2 = 0.1255$. The hypothesis we want to assess is $H_0: \beta_1 = \beta_3 = \beta_5$ vs. $H_1$: at least one is different (3.3), meaning that we want to test the average slopes of the groups. A level of 0.05 will be used. Under this framework we will use the F approximation by Helms to test the hypothesis and compute its power. More information a
7. each factor appears in the same number of runs. Two factors are said to be orthogonal if all combinations of their levels appear in the same number of runs. A design is called orthogonal if all pairs of factors are orthogonal. These two properties are very important because they allow us to obtain more accurate estimates of the effects of each of the main factors and interactions, as well as a clearer interpretation of the results. • The contrasts used for the design refer to encodings for each level of each factor. For example, if we have Temperature as a two-level factor with 10 °C and 30 °C, we can encode this factor as -1 for 10 °C and 1 for 30 °C. These encoded levels help us to make our design, as well as the model matrix, orthogonal. This leads us to obtain estimates of the effects for each factor. If we want to establish three levels for temperature, one way to encode them is -1 for 10 °C, 0 for 20 °C and 1 for 30 °C. This type of contrast is called Orthogonal or Helmert. On the other hand, if we have a factor whose levels are categories, such as high, medium and low, we would use another kind of contrast called Treatment (see A.2). Such encodings would be Level 1 for low, Level 2 for medium and Level 3 for high. The model matrix for this design is constructed so as to be orthogonal. For categorical factors, this construction of the orthogonal matrix is performed by decomposing each of the corresponding factors to
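The coding described above can be illustrated with a short Python sketch. It is not LADES's implementation: the helper names `encode_levels` and `columns_orthogonal` and the second factor (`pressure`) are hypothetical, and orthogonality is checked here through the zero dot product of the coded columns, which holds when all level combinations appear equally often.

```python
def encode_levels(levels):
    """Map ordered numeric levels to centered codes, e.g. [10, 30] -> {10: -1, 30: 1}."""
    codes = {2: (-1, 1), 3: (-1, 0, 1)}[len(levels)]
    return dict(zip(sorted(levels), codes))


def columns_orthogonal(col_a, col_b):
    """Two coded columns are orthogonal when their dot product is zero."""
    return sum(a * b for a, b in zip(col_a, col_b)) == 0


temperature = encode_levels([10, 20, 30])   # three-level factor from the text
pressure = encode_levels([1, 2])            # hypothetical two-level factor

# All six combinations of the two factors, written out as coded columns.
runs = [(t, p) for t in (10, 20, 30) for p in (1, 2)]
temp_col = [temperature[t] for t, _ in runs]
pres_col = [pressure[p] for _, p in runs]

print(temperature)
print(columns_orthogonal(temp_col, pres_col))
```

Dropping one run from `runs` breaks the balance, and the same check can then be used to see whether the coded columns are still orthogonal.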
8. [Table rows, truncated: ... 0.085, 0.14, 0.888; Time:TreatmentTr... 0.036, 0.097, 0.37, 0.711; Time:TreatmentTr... 0.054, 0.11, 0.491, 0.623] [Figure 5.15: Hypothesis Testing for Parameters, Logistic Regression] • Model Matrix: In this section we can see the model matrix (see A.1) that was used to fit the logistic regression model. See Figure 5.16. • Estimated Coefficients: Shows the estimates of the coefficients $\beta_0, \dots, \beta_5$. See Figure 5.17. These coefficients are the impacts the factors have on the logit of the response. • Fitted Values and Residuals: In Figure 5.18 the fitted values and their corresponding residuals are shown. The fitted values are the values obtained after applying the inverse transformation. In our case these values denote the probability that lymphocytic infiltrate is present. The model residuals are unrestricted, i.e. they are based on the logit function. [Figure 5.16: Model Matrix, Logistic Regression] [Figure 5.17: Estimated Coefficients, Logistic Regression] Figure 5.18: Fitted Values and Residuals, Lo
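The inverse transformation mentioned above is, for a logit link, the logistic function. The following is a minimal sketch of that mapping only (the function names are illustrative and not part of LADES): a linear predictor on the logit scale is turned back into a probability.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inverse_logit(eta):
    """Map a linear predictor on the logit scale to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


# A linear predictor of 0 corresponds to a probability of one half.
print(inverse_logit(0.0))
```

Fitted values reported as probabilities are therefore always between 0 and 1, while quantities on the logit scale are unbounded.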
9. [Figure 4.53: Hypothesis Testing for Parameters, GEE] • Model Matrix: In this section we can see the model matrix (see A.1) that was used to fit the GEE model. We can see the contrasts we used (see Figure 4.54). • Scale Parameter Estimation: Refers to the relationship between the variance of the observations and their expected value. This parameter estimate is shown in Figure 4.55. Because we are not making assumptions about the distribution of our observations, this parameter simply captures the variability of our observations with respect to their expected value. • Correlation Parameter Estimations: Because it was decided to use a first-order autoregressive, AR(1), correlation structure for our data, only one parameter was estimated. [Figure 4.54: Model Matrix, GEE; columns include Intercept, Time, FatLow, FatMedium, FiberLow and their interactions with Time] [Figure 4.55: Scale Parameter Estimation, GEE] Figure 4.56 shows the result of this estimate. The value of the correlation parameter is 0.841. This means that adjacent observations, i.e. those whose distance is equal to 1, have a correlation equal to 0.841. On the other hand, observations with two units of distance, for example the
10. [Figure 4.42: Model Matrix, GLMM] • Likelihood Criterion: In this section the results for the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) (see A.5) are presented. In addition, the log-likelihood is shown. See Figure 4.43. • Fitted Values and Residuals: Figure 4.44 shows the fitted values and residuals for model 4.2.2. The fitted values are the values obtained after applying the inverse transformation of $g$. In our case these values denote the probability of presenting a separation in the nail of the big toe. The residuals are not restricted, i.e. they are based on the logit function. • Random Effects Estimation by Subject: In Figure 4.45 the best linear unbiased predictors for the random coefficients of each patient are displayed. These predictions were used to calculate the fitted values and residuals. Basically, they denote deviations from the overall intercept. • Fixed Effects Estimates: For each of the predictor variables, the estimated coefficient, the standard error of the coefficient and the corresponding p-value are given. The p-value is based on the Wald statistic, which is defined as the square of the ratio between the regression coefficient and its standard error. This statistic follows a $\chi^2$ distribution with 1 degree of freedom. We can observe that the Visit (Time) factor has an impact of 0.56 per unit time on the logit of the response, and this change is statistically significant at $\alpha = 0.05$. Due to the type of con
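As a rough illustration of the Wald test described above, the sketch below squares the ratio of a coefficient to its standard error and converts it to a p-value under a chi-square distribution with 1 degree of freedom. The function name `wald_test` and the example numbers are hypothetical; this is not how LADES computes its output, only the textbook calculation.

```python
import math


def wald_test(coefficient, std_error):
    """Return the Wald statistic (coefficient / std_error) ** 2 and its p-value.

    For a chi-square variable X with 1 degree of freedom,
    P(X > w) = erfc(sqrt(w / 2)).
    """
    w = (coefficient / std_error) ** 2
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value


# Hypothetical coefficient and standard error, for illustration only.
statistic, p_value = wald_test(1.96, 1.0)
print(statistic, p_value)
```

A p-value below the chosen level (0.05 in the text) would lead to the conclusion that the coefficient differs from zero.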
11. ... $\sim N(0, \sigma^2 I)$, $i = 1, \dots, M$ (4.10). Then, for a fixed $\lambda$, the estimates of the parameters $\beta$ and $\sigma^2$ can be obtained by ordinary least squares (OLS). • The parameter estimation is performed as follows. First, we find least squares estimators for the parameters $\beta$ and $\sigma^2$ of model 4.2.4; these estimates are conditional on the value of $\lambda$. Then the log-likelihood function of model 4.2.4 is constructed by substituting the estimates $\hat\beta(\lambda)$ and $\hat\sigma^2(\lambda)$. The likelihood is now a function of $\lambda$ only. Finally, the likelihood function is maximized over $\lambda$, i.e. the maximum likelihood estimator $\hat\lambda$ is found, and the conditional estimators $\hat\beta(\hat\lambda)$ and $\hat\sigma^2(\hat\lambda)$ are obtained. • We can restrict the likelihood function for model 4.2.4. This restriction consists of integrating the $\beta$ vector out of the full likelihood function. After maximization, the resulting estimators $\hat\beta$ and $\hat\sigma^2$ can be obtained. Such estimators are known as restricted maximum likelihood estimators. • The $\Lambda_i$ matrices can be decomposed into a set of simpler matrices, $\Lambda_i = V_i C_i V_i$, where $V_i$ describes the variance structure and $C_i$ describes the correlation structure of the within-subject errors, that is, the variance and the correlation matrix of each individual $i$. Analysis using GLS. As an example, we have an experiment where the distance from the pituitary to the pterygomaxillary fissure (mm) was studied. The distances were obtained from an X-ray of each of the skulls of indi
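The decomposition of the within-subject matrix into a variance part and a correlation part can be made concrete with a small sketch. It assumes, as an illustration only, that $V$ is a diagonal matrix of standard deviations, so that entry $(j, k)$ of $V C V$ is $s_j\, c_{jk}\, s_k$; the function name and the numbers are hypothetical.

```python
def covariance_from_vcv(std_devs, corr):
    """Compute V C V with V = diag(std_devs); entry (j, k) is s_j * c_jk * s_k."""
    n = len(std_devs)
    return [
        [std_devs[j] * corr[j][k] * std_devs[k] for k in range(n)]
        for j in range(n)
    ]


# Hypothetical example: three measurements on one individual.
std_devs = [1.0, 2.0, 3.0]
corr = [
    [1.0, 0.5, 0.2],
    [0.5, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]
for row in covariance_from_vcv(std_devs, corr):
    print(row)
```

Keeping the two pieces separate means a variance structure and a correlation structure can be chosen independently, which is how the options in the GLS screens are organized.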
12. [Figure 3.2: Main Screen using Incomplete Longitudinal Designs, Evaluate Longitudinal Design] Now, in the customize section, we define the contrasts to use and the parameter estimates (Fig. 3.3). In the Fixed Effects Parameters section we input the corresponding fixed parameters (intercepts and slopes). The first column of this matrix contains a row called Beta Global; here we can input the parameter estimate if we are working with a linear mixed effects model with a global intercept for all groups. On the left side of this screen we input the parameters for the variances of the random effects and residuals. If, for example, we fill the Var b1 box with 0, then the function will not include a random effect for slopes in the model. Sigma 2 stands for the variance of the residuals and alpha represents the type I error. When we want to discriminate the designs using the power of the complete design, we fill the User Defined Power section with 0. If we are interested in some specific value, we just input it. In the Slopes Comparison section we define the contrast we will use. We specify that we want to compare group 1 (Control) with group 2 (T A), and group 1 (Control) with group 3 (T B); using this definition we can complete a full comp
13. Design Case 1: In this section of the Results output we see the full longitudinal design with the desired characteristics. A column showing the label for individuals (Units) was added. Since we chose to get an already randomized design, we can see that the design points are randomized within each level of Time. See Figure. • Longitudinal Design Case 2: Figure presents the results of using the Incomplete Design characteristic. We can see that there are more observations for occasions 1 and 7, since the three cohorts included these occasions. 2.5 Unequal Longitudinal Design. The Unequal Longitudinal Designs (ULD) are based on constructing designs with unequal numbers of individuals per group. These unequal assignments can give us more power for the fixed-effects hypothesis tests. This approach is proposed to reduce the total number of individuals used in a study due to animal ethics, costs, etc. Their construction is very similar to that of the Longitudinal Optimal Powered Cost Designs. Summary of Longitudinal Optimal Powered Cost Design: • A simple reparametrized linear mixed model is used to fit a model for each group under study. • Linear Mixed Models provide an excellent framework to fit and evaluate the results of a longitudinal experiment. Moreover, the developed $F_{Helms}$ test statistic lets us evaluate fixed-effects hypotheses and compute their power. Then we can compare the designs according to their power. • A sim
14. [Figure 4.26: Likelihood Criteria, Linear Mixed Models] [Figure 4.27: Residuals Statistics, Linear Mixed Models] [Figure 4.28: Fixed Effects Correlations, Linear Mixed Models] [Figure 4.29: Fitted Values and Residuals, Linear Mixed Models] in weight over time, and that change depends on the type of Diet (see Figure 4.30). • Fixed Effects and Hypothesis Testing: For each of the fixed factors included in the model, the regression coefficient, the standard error of the coefficient and the
15. Factorial Design; Fractional Factorial Designs with 2 levels; Optimal Design of Experiments; Full Longitudinal Design; Unequal Longitudinal Design; Longitudinal Optimal Powered Cost Design. 2 Create Module. Full Factorial Design. Most experiments are designed to analyze the effects of two or more factors, as well as possible interactions between them. In general, full factorial designs are the most efficient for this type of experiment. By factorial design we mean that in each complete replicate of the experiment all combinations of factor levels are investigated. For example, see Table 2.1, where a full factorial design to analyze the effects of diets, with Fibre and Fat as factors, is shown. For the Fibre factor, researchers wanted to compare a high fibre content with a low one. For Fat, three levels were proposed. [Table 2.1: Experimental Design for Diet data; columns: Run, Fibre, Fat] The runs are the combinations of the two factors, i.e. each possible diet. By using this design only six runs are required. Summary of Full Factorial Designs: • The most popular designs of experiments are the full factorial designs. The 2-level full factorial design is referred to as $2^k$, where $k$ is the number of main factors. This design consists of $2^k$ runs, the total number of combinations. Two important properties of the $2^k$ factorial designs are balance and orthogonality. Balanced means that each level of
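To make the run count concrete, the sketch below enumerates a full factorial design for the diet example with Python's standard library. The level labels are assumptions (the text names a high and a low Fibre level and three Fat levels), the helper name `full_factorial` is hypothetical, and the run order need not match Table 2.1.

```python
from collections import Counter
from itertools import product


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts, one per run."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*(factors[n] for n in names))]


# Assumed level labels for the two diet factors.
runs = full_factorial({"Fibre": ["Low", "High"], "Fat": ["Low", "Medium", "High"]})
for number, run in enumerate(runs, start=1):
    print(number, run["Fibre"], run["Fat"])

# Balance: every level of a factor appears in the same number of runs.
fat_counts = Counter(run["Fat"] for run in runs)
fibre_counts = Counter(run["Fibre"] for run in runs)
print(fat_counts, fibre_counts)
```

With 2 levels of Fibre and 3 of Fat the enumeration yields 2 x 3 = 6 runs, matching the count stated in the text.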
16. Here we use the t-test as an equivalent way to test hypotheses about the significance of the factors included in the model. Of all the factors under study, only Elevation and Adjacent were statistically significant at a level of $\alpha = 0.05$. Thus, these factors have a significant impact on the response variable (number of species), and the size of this impact is given by the estimates of the coefficients. See Figure 5.4. • Model Matrix: In this section we can see the model matrix (see A.1) that was used to fit the multiple linear regression model (Figure 5.5). • Estimated Coefficients: Presents the estimates of each of the coefficients $\beta_0, \dots, \beta_5$ (Figure 5.6). • Fitted Values and Residuals: Displays a table containing the fitted values (the predictions of the model) and the residuals (the difference between the actual and predicted values) (Figure 5.7). • Adjusted R-square, Multiple R-square and Residual Standard Error: Multiple R-square indicates the proportion of variation in the dependent variable explained by the predictor or independent variables. These two statistics help us to check the adequacy or fit of the proposed model. In our case, Multiple R-square = 0.765 and Adjusted R-square = 0.7171. The closer they are to 1, the better the fit. See Figure 5.8. • Analysis of Variance (ANOVA): Displays the sum of squares for each of the predictor variables, as well as the construction of each of the F-tests (Figure 5.9). The varia
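The relationship between the two R-square statistics can be sketched with the usual textbook formula, which penalizes the multiple R-square for the number of predictors. The function name and the sample size used in the example are hypothetical; the manual section above does not state the number of observations.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    residual_df = n_observations - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_observations - 1) / residual_df


# Hypothetical sample size of 40 with five predictors.
print(adjusted_r_squared(0.765, 40, 5))
```

Because the penalty grows with the number of predictors, the adjusted value never exceeds the multiple R-square for the same fit.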
17. [Figure 4.70: Residuals by Factor Normality Plot, GLS] [Figure 4.71: Residuals and Adjusted Values Plot, GLS] [Figure 4.72: Residuals over time by factor, GLS] Linear Regression; Logistic Regression; Sample Size Estimation for Longitudinal Analysis with Continuous Response; Sample Size Estimation for Longitudinal Analysis with Binary Response. 5 Statistics Module. 5.1 Linear Regression. We will focus on fitting linear regression models to data. To illustrate, suppose you want to develop an empirical model that relates the number of Galapagos tortoise species on several islands to five variables of interest: the maximum elevation of the island (elevation, m), the area of the island (area, km²), the distance to the nearest island (Proximity, km), the distance to Santa Cruz island (scruz, km) and the area of the adjacent island (adjacent, km²). A model that
19. [Figure 5.24: Sample Size Estimation for binary response, Results; Required Number of Individuals per Group: Inf] 6 Graphics Module. 6.1 Longitudinal Data Graph. To explain the use of the Longitudinal Data Graph function we will use the data in weights.csv. After importing the data, we access the function through Graph > Longitudinal Data. Then the screen shown in Figure 6.1 appears. [Figure 6.1: Main Screen, Longitudinal Data Graph] If we want to plot the response profile of each individual, we fill only the Response, Time and Id fields and then click on Create (see Figure 6.2). The resulting graph is shown in Figure 6.3. If we now want to categorize each individual according to some factor of interest (in our example, we want to analyze the three diet groups), we just have to input the desired factor. Figure 6.2: How to f
20. The file containing the data is toenail.csv. The database contains 104 responses. Not all patients were measured during the seven visits, and therefore some patients have fewer observations than others. The researchers want to treat the initial condition of the patient as a random variable. The codings for treatments are Treatment A = 0 and Treatment B = 1. This coding is used because we have a nominal factor, so we have to use the Treatment contrasts matrix. This encoding is commonly used in longitudinal analyses comparing a control group with a group under a new treatment. To access the functionality of generalized linear mixed models we follow Analyze > Longitudinal Analysis > Generalized Mixed Models. Figure 4.40 shows the main screen. The model to be fitted is $\operatorname{logit}(y_{ij}) = (\beta_0 + \gamma_{02}\,\mathrm{Treatment}_i + b_{0i}) + (\beta_1 + \gamma_{12}\,\mathrm{Treatment}_i)\,\mathrm{Visit}_{ij} + \varepsilon_{ij}$ (4.5), where Treatment is a binary variable taking the value 1 if the patient is taking Treatment B and 0 if not. $\beta_0$ and $\beta_1$ are the average intercept and the average slope, respectively, for patients under Treatment A. Treatment A is called the reference, since our results will be comparisons against it. $\gamma_{0k}$ is the average difference of the intercept for Treatment $k$ minus the intercept under Treatment A (reference); $\gamma_{1k}$ is the average difference of the slopes for patients under Treatment $k$ minus patients under Treatment A. $b_{0i}$ is the random intercept, i.e. patient $i$ initial co
21. can describe this relationship is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \varepsilon$ (5.1), where $y$ represents the number of species, $x_1$ the highest elevation of the island, $x_2$ the area of the island, $x_3$ the distance to the nearest island, $x_4$ the distance to Santa Cruz island and $x_5$ the area of the adjacent island; $\varepsilon$ is the vector of errors, usually known as the residuals. This is a multiple linear regression model with five independent variables. These independent variables are usually known as regressors or predictors. The term linear refers to the way the parameters $\beta_0$, $\beta_1$, etc. enter the model. A nonlinear model would have parameters appearing as $\sin(\beta)$, $\log(\beta)$, etc. There are models that are apparently more complex than 5.1 but can still be fitted using linear regression. For example, consider adding the interaction between the highest elevation of the island and its area: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_{12} x_1 x_2 + \varepsilon$ (5.2). If we set $x_6 = x_1 x_2$ and $\beta_6 = \beta_{12}$, equation 5.2 can be written as $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \varepsilon$ (5.3), and then we can use linear regression. Summary of Linear Regression Model: It is assumed that the residuals (i) follow a Normal distribution, (ii) have a constant variance and (iii) are independent. This enables us to derive distributions for the model parameters and thus to construct hypothesis tests to verify their statistical significance. In addition, it allows us
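The point that an interaction term keeps the model linear in its parameters can be checked numerically. The sketch below is a minimal, standard-library-only illustration with hypothetical toy data and helper names (`solve_linear_system`, `ols`); it treats the product of two predictors as one extra column and recovers the coefficients by ordinary least squares through the normal equations. It is not the routine LADES uses.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical toy data: the response is built with a known interaction effect.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1.0 + 2.0 * a - 1.0 * b + 0.5 * a * b for a, b in zip(x1, x2)]

# The interaction enters as one more column, x1 * x2, so the model stays linear.
design = [[1.0, a, b, a * b] for a, b in zip(x1, x2)]
beta = ols(design, y)
print([round(value, 6) for value in beta])
```

The same device works for any transformed regressor (squares, logs of predictors), as long as the unknown coefficients still enter the model additively.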
22. construction of the model of interest: $y = \beta_0 + \beta_1\,\mathrm{Strain} + \beta_2\,\mathrm{Sex} + \beta_3\,\mathrm{Treatment} + \beta_4\,\mathrm{Strain} \times \mathrm{Treatment}$ (2.1). In Figure 2.11 we input the factors that will be part of our model, $X_1$, $X_2$ and $X_3$, which are Strain, Sex and Treatment respectively. We include the interaction of interest $X_1 X_3$, i.e. Strain:Treatment. The size of the design is seven, and we will run the algorithm 100 times. Once the data is entered, we click Fit. Function Outputs: • Design Matrix Determinant: This window displays the relative efficiency of the design. This is the ratio of $M_{\text{new design}}$ to $M_{\text{full factorial design}}$. See Figure 2.12. • Optimal Design: This section presents the best resulting design for model 2.3. See Figure 2.13. • Model Matrix: In Figure 2.14 we show the model matrix used (2.3; also see A.1). • Variance Inflation Factor: In Figure 2.15 we observe the VIF for main effects and interactions. The VIF for all of them is 1.16; this implies that the variance of the estimates of the effects will be 1.16 times larger than the esti [Figure 2.11: Create Optimal Design Screen] [Figure 2.12: Relative Efficiency, Create Optimal Design; Relative Efficiency: 0.939286759724129] [Figure 2.13: Optimal Design, Create Optimal Design] [Figure 2.14: Model Matrix, Create Optimal Design]
23. factor is constant among individuals. Otherwise, we would need to include a random factor in order to model this difference among individuals. • Estimates for Individual Models: The point estimates of the coefficients for each individual are shown in Figure 6.8. [Figure 6.5: Unit inclusion and random effects plot option] [Figure 6.6: Graph of the 95% confidence intervals for the intercept coefficients of each individual, using a simple linear regression model] [Figure 6.7: Graph of the 95% confidence intervals for the time coefficients of each individual, using a simple linear regression model] [Figure 6.8: Point estimates for the intercept and time coefficients of each individual, using a simple linear regression model]
24. observation at time 1 and the observation at time 3, have a correlation of $0.841^2 \approx 0.707$. Observations separated by 3 time units have a correlation of $0.841^3 \approx 0.594$. This means that the correlation decreases as the observations become more distant. [Figure 4.56: Correlation Parameter Estimations, GEE; alpha estimate: 0.841] • Fitted Values and Residuals: Finally, we show the fitted values and residuals in Figure 4.57. [Figure 4.57: table of fitted values and residuals, truncated]
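The decay of the AR(1) correlation with distance can be reproduced in a couple of lines. This is only a sketch of the relationship described above (correlation equals the parameter raised to the lag); the function name is hypothetical, and since 0.841 is itself a rounded estimate, the last printed digit may differ slightly from the figures quoted in the text.

```python
def ar1_correlation(alpha, lag):
    """Correlation between two observations `lag` time units apart under AR(1)."""
    return alpha ** abs(lag)


for lag in (1, 2, 3):
    print(lag, round(ar1_correlation(0.841, lag), 3))
```

Because the parameter lies between 0 and 1 here, each extra unit of separation multiplies the correlation by the same factor, which gives the geometric decay noted in the text.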
25. of each individual are different. Then we treat the coefficients as random, i.e. different coefficients for each individual. The simplest form of a random coefficient model regards only the intercept as a random coefficient. This means that the first observations of the individuals (e.g. before starting the study) are different, but the growth (slope) in the response is the same for everyone. This relationship can be seen in Figure 4.15. It is also possible that the intercept is the same for all subjects but the growth differs between subjects. This model will include a random slope. This phenomenon is observed in Figure 4.16. Another case is when both coefficients, the intercept and the slope, are random. This means that each individual has their own response profile. See Figure 4.17. Summary of Linear Mixed Effects Model: • The linear mixed effects model can be represented as follows: $y_i = X_i \beta + Z_i b_i + \varepsilon_i$, $i = 1, \dots, M$ (4.2), with $b_i \sim N(0, \Psi)$ and $\varepsilon_i \sim N(0, \sigma^2 I)$, where $\beta$ is a $p$-dimensional vector of fixed effects, $b_i$ is a $q$-dimensional vector of random effects, $X_i$ (of size $n_i \times p$) and $Z_i$ (of size $n_i \times q$) are the regression matrices for the fixed and random effects respectively, and $\varepsilon_i$ is an $n_i$-dimensional vector of within-subject errors, i.e. the residuals for each observation in time of individual $i$. An important assumption when [Figure 4.15: Random Intercept Example; horizontal axis: Time]
26. over time. Eight mice were assigned to a control diet (Diet 1), four to an experimental diet (Diet 2) and the remaining four to another experimental diet (Diet 3). Measurements were made for 64 days after application of the treatment: one every seven days, plus an additional one on day 44. We want to analyze the performance of the three diets over time on the weights of the mice. The data loaded in the program are shown in Figure 4.18.
[Figure 4.18: Weight Experiment Dataset. File: weights.csv]
To access the Linear Mixed Models functionality we follow Analyze > Longitudinal Analysis > Linear Mixed Models. In Figure 4.19 we can see differences between the individuals belonging to each of the three diets. We can also see an unusual initial weight for mouse 2 in the Diet 2 group. The weights seem to grow linearly in time, but with different initial weights and different slopes for each diet. Therefore, it was decided to use a linear mixed model with random effects for the intercept and for time in order to model the mouse-to-mouse variation. This plot was created using the Longitudinal Graph function (Section 6.1).
27. screen is shown in Figure 4.2.
[Figure 4.1: RBC Data. File: RBC.csv. Columns: Strain, Treatment, RBC.]
In Figure 4.2 the proposed model is shown. This model has the main factors strain and treatment and their interaction.
Function Outputs
• Hypothesis Testing for Parameters: The hypotheses were assessed using the t statistic. The test evaluates whether each coefficient is different from zero. Only the main factors strain and treatment have a resulting p-value below 0.05. Therefore, we can say that these factors have a significant impact on the response (number of red blood cells), i.e. RBC differs between strains as well as between treatments (see Figure 4.3).
• Model Matrix: In this section we can see the resulting model matrix (see A.1). This matrix was used to fit the multiple linear regression model to our design (Figure 4.4). As we can see, this matrix is orthogonal, so the coefficient estimates are uncorrelated with each other and as precise as the design allows.
• Coefficients: Displays the estimates of the coefficients in our regression model (Figure 4.5). This allows us to define an equation that represents our data and then make predictions on new data. The model in this case would be (using the encodings in Table 4.1): RBC 9.113 0.258 Strain 0
28. the deviations of each individual from the average intercept and from the estimated average Time effect (average change rate).
• Estimated Parameters by Subject: Figure 4.33 shows all the estimated coefficients and best linear unbiased predictions for the fixed and random effects, respectively.
[Figure 4.30: Analysis of Variance (Linear Mixed Models). Table columns: numDF, denDF, F-value, p-value.]
[Figure 4.31: Fixed Effects and Hypothesis Testing (Linear Mixed Models). Table columns: Std. Error, DF, t-value, p-value.]
[Figure 4.33: Estimated Parameters by Subject (Linear Mixed Models). Table columns: Intercept, Time, Diet2, Diet3, Time:Diet2, Time:Diet3.]
• Residuals VS Fitted Values Plot: In this graph (Figure 4.34) we can validate the assumption of constant variance of the residuals. The fitted values are on the X axis and the standardized residuals on the Y axis. The graph should show the pattern of a horizontal band. If the graph is cone-shaped, then we say that the variance of the residuals is not constant. In our case the graph does show the pattern of constant variance in the residuals. This is the result of in
29. think about a correlation between the random effects. As mentioned before, another assumption of the model is that the random effects are uncorrelated. In our case there were no trends (see Figure 4.39).
[Figure 4.39: Random Effects by Factor Plot (Mixed Models). Panel title: Random factors by Diet.]
4.2.2 Generalized Linear Mixed Models
When the linear mixed model (LMM) is used, it is assumed that the response of interest is continuous. Sometimes, when the variable of interest is ordinal, LMM can still be used as long as the response has a large number of levels. But LMM does not work for binary responses, for ordinal responses with few levels, or when the response represents counts. For these cases we use generalized linear mixed models (GLMM), an extension of linear mixed models that works for more general cases.
Summary of Generalized Linear Mixed Models
The linear mixed model is used to model the average of the observations, since it works with continuous responses. This average is said to be conditional on the random effects, i.e. $y \mid u$, where $u$ are the random effects. This random variable follows a distribution $N(\mu, \sigma^2 I)$ when $y$ is a continuous variable. Then the model can be written as follows: U
30. to include three cohorts to reduce the number of subject-occasions and the cost. In the Customize section we define the contrasts to use. On the left side of the figure we specify that we want to compare group 1 (Control) against group 2 (T A) and group 2 (T A) against group 3 (T B); with this definition we can complete a full comparison among the three groups. Now we have to define the minimum difference to detect in these two comparisons. That is, the difference between the average slope of T A and the average slope of the control group will be considered significant if it is at least 0.14 (we can also use -0.14). The same holds for the second contrast. On the right side of Figure 2.26 we define whether we want to discriminate the constructed designs based on cost. For this option we can use a User Defined Cost or, by leaving this option at zero, discriminate against the cost of the complete design (30 mice, 5 occasions). We also need to specify the cost of each unit (mouse) and the cost of obtaining one measurement (Unit Cost and Observation Unit Cost, respectively). If we select the Discriminate on Power checkbox, we discriminate the generated designs based on a specific desired power. For this, we first have to define the type of model we are working with. Different Intercept means that each group model has an independent intercept (our case; see model 3.2). On the other hand, Common Intercept means that all group models have an overall intercept representing the average
31. [Figure 2.19: Customize Section (Unequal Longitudinal Designs). Fields shown: User Defined Cost = 0; Discriminate on Power; Model: Different / Common; Random effects parameters: Var(b0) = 1.1495, Var(b1) = 0.0392, Cov(b0, b1) = 0.1633; Sigma^2 (Residuals) = 0.1255; alpha = 0.05; User Defined Power = 0.]
If we select the Discriminate on Power checkbox, we discriminate the generated designs based on a specific desired power. For this, we first have to define the type of model we are working with. Different Intercept means that each group model has an independent intercept (our case; see model 3.2). On the other hand, Common Intercept means that all group models have an overall intercept representing the average of all mice's first observations. Now, to define the random effects used in model 3.2, we simply fill out the boxes with the corresponding variances and covariances. If, for example, we fill the Var(b1) box with 0, then the function will not include a random effect for the slopes in the model. Sigma^2 stands for the variance of the residuals, $\sigma^2$, and alpha represents the type I error rate. When we want to discriminate the designs using the power of the complete design, we fill the User Defined Power field with 0. If we are interested in some specific value, we just input it. Finally, we click on Fit in the main screen.
Function Outputs
• ULD Design Graph
• Cost Efficient Designs by Cost: Figure 2.20 depicts all designs with lower cost and more p
32. with R. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis, 2004.
P. Goos and B. Jones. Optimal Design of Experiments: A Case Study Approach. Wiley, 2011.
Ulrike Grömping. R package FrF2 for creating and analyzing fractional factorial 2-level designs. Journal of Statistical Software, 56(1):1-56, 2014.
R. W. Helms. Intentionally incomplete longitudinal designs: I. Methodology and comparison of some full span designs. Stat Med, 11(14-15):1889-1913, Oct-Nov 1992.
Søren Højsgaard, Ulrich Halekoh, and Jun Yan. The R package geepack for generalized estimating equations. Journal of Statistical Software, 15(2):1-11, 2006.
R. A. Johnson and D. W. Wichern. Applied Multivariate Statistical Analysis. Pearson Education International. Pearson Prentice Hall, 2007.
G. A. Milliken and D. E. Johnson. Analysis of Messy Data: Designed Experiments, Second Edition. Number v. 1. Taylor & Francis, 2009.
J. Pinheiro and D. Bates. Mixed-Effects Models in S and S-PLUS. Statistics and Computing. Springer New York, 2010.
Jose Pinheiro, Douglas Bates, Saikat DebRoy, Deepayan Sarkar, and R Core Team. nlme: Linear and Nonlinear Mixed Effects Models, 2013. R package version 3.1-113.
R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2013.
Deepayan Sarkar. Lattice: Multivariate Data Visualization with R. Springer, New York, 2008. ISBN 978-0-387-75968-5.
J. W. R. Twisk. Applied
33. with other main effects or two-factor interactions, but the two-factor interactions are confounded with each other. In Designs of Resolution V, main effects and two-factor interactions are not confounded with any other main effect or two-factor interaction, but two-factor interactions are confounded with some three-factor interactions. The resolution depends on the generators used to create the design. Thus, one criterion for choosing generators is to obtain the maximum resolution.
• For constructing and analyzing such designs it is usually assumed that interactions of order three or more are not significant.
Constructing Fractional Factorial Designs
To access the function for constructing fractional factorial designs we follow the route Create > Fractional Factorial. On choosing this function, the screen shown in Figure 2.6 appears. This function shows a catalog of maximum-resolution fractional factorial designs. In this table we can see the number of runs required for each design, the number of factors that can be analyzed with the design, the name of the design, its generators and its resolution. We simply choose the desired option and click on Create. In the generators, the numbers indicate factors: for instance, 1 is the first factor, 2 is the second, etc. Thus, for example, in option 3 factor 4 is confounded with the interaction of the first and second factors, and factor 5 is confounded with the interaction of factors 1 and 3. Op
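The generator notation above can be illustrated with a short sketch. It builds a two-level design in which three base factors are run as a full factorial and two extra factors are set equal to products of base-factor columns, assuming the generators in the example are 4 = 1×2 and 5 = 1×3. The function name `fractional_factorial` and the -1/+1 level coding are choices made for this illustration; they do not describe how LADES builds its catalog.

```python
from itertools import product


def fractional_factorial(n_base, generators):
    """Build a two-level fractional factorial design.

    n_base     -- number of base factors, run as a full 2^n_base factorial
    generators -- dict mapping a new factor number to the tuple of base
                  factor numbers (1-based) whose product defines its level
    Returns a list of runs; each run is a list of -1/+1 levels.
    """
    runs = []
    for levels in product((-1, 1), repeat=n_base):
        run = list(levels)
        for _, parents in sorted(generators.items()):
            value = 1
            for p in parents:
                value *= run[p - 1]  # multiply the parent columns
            run.append(value)
        runs.append(run)
    return runs


# Five factors in 8 runs: factor 4 = 1x2 and factor 5 = 1x3.
design = fractional_factorial(3, {4: (1, 2), 5: (1, 3)})
for run in design:
    print(run)
```

Because factor 4 always equals the product of factors 1 and 2, its main effect cannot be separated from the 1×2 interaction, which is exactly the confounding the text describes.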
34. $2^{k-2}$ design is described as follows. In this case we have a $2^{k-2}$ design with 8 runs, or a $2^3$ design. In such a design two factors are now confounded. In order to create the generators we need to assign the other two factors, C and D, either to two-factor interactions such as AB, BC, AC or to the interaction ABC. The way in which we assign these factors and create the generators considerably affects an important property called resolution. Thus, we should be careful when making such assignments. Finally, we have to mention that if factor C in a $2^{3-1}$ design is significant, then the effect of the interaction AB is also significant, due to the confounding property. So the next step, in order to estimate the main effect of factor C separately from the AB effect, is to generate extra runs. This method is known as foldover.
• The $2^{3-1}$ design worked out above is called a design of resolution III. In this design the main factors are confounded with two-factor interactions. Roman numerals are usually used to denote the types of resolution. The designs of resolution III, IV and V are the most important. The definitions of each of these designs are as follows. Designs of Resolution III are designs in which no main effects are confounded with other main effects, but main effects are confounded with some two-factor interactions. Designs of Resolution IV are designs in which no main effects are confounded
35. [Figure 4.61: Confidence Intervals for Correlation Parameter Estimates (GLS)]
[Figure 4.62: Confidence Intervals for Variance Parameter Estimates (GLS)]
[Figure 4.63: Confidence Intervals for Variance Error Estimate (GLS)]
[Figure 4.64: Likelihood Criteria (GLS)]
[Figure 4.65: Residuals Statistics (GLS)]
[Figure 4.66: Fixed Effects Correlations (GLS)]
[Results table (screenshot): observed values, fitted values and residuals by subject (M01, M02, M03, ...).]
36. • $\delta$: Minimum difference to detect. Researchers usually want to reject the null hypothesis with high probability when the parameter of interest really deviates from its value under the null hypothesis. This deviation is denoted as $\delta$. We are interested in detecting the minimum value of $\delta$.
• $1 - \beta$: Test Power (or Power). The power of a statistical test is the probability that the null hypothesis is rejected when it is in fact not true. A common value for power is 0.8, but it depends on the study.
• $m$: Total number of observations over time.
• $\sigma$: Standard deviation of the observations. For continuous responses, the quantity $\mathrm{Var}(Y_{ij}) = \sigma^2$ ($i$ individuals and $j$ measurements over time) represents the unexplained variation in the response. Sometimes it is possible to approximate the real value of $\sigma$ through pilot or previous studies.
• $R$: Correlation Structure in Observations. The correlation parameter or parameters can be estimated from previous or pilot studies. We also have to define the correlation structure that best fits our data.
• The formula to estimate the sample size in each group is [equation not recoverable from the source].
Example: To estimate the sample size using LADES we follow Statistics > Sample Size Estimation > Longitudinal Analysis with continuous response. The captured data and the main screen of the function are shown
37. 574 Treatment 0.054 Treatment × Strain (4.1)
[Figure 4.2: Analyze DOE Screen Function. Factors: Strain, Treatment.]
[Figure 4.3: Hypothesis Testing for Parameters (DOE Analysis). Table columns: Estimate, Std. Error, t value, Pr(>|t|).]
[Figure 4.4: Model Matrix (DOE Analysis). Columns: Intercept, Strain, Treatment, Strain:Treatment.]
[Figure 4.5: Coefficients (DOE Analysis). Rows: Intercept, Strain, Treatment, Strain:Treatment.]
• Adjusted Values and Residuals: Displays a table containing the regressors, the predicted values and the residuals (actual minus predicted) of model 4.1. See Figure 4.6.
• R-Square, Adjusted R-Square and Standard Error: The multiple R-square indicates the proportion of the variation in the dependent variable that is explained by the set of independent (predictor) variables. This statistic is used to check the goodness of fit of model 4.1. In our case the multiple R-square is 0.8955. The closer it is to 1, the better the goodness of fit. See Figure 4.7.
• Analysis of Variance: Shows the sum of squares for each of the factors as well as its F statistic to assess their significance (Figure 4.8). It clearly shows the significance of the Treatment and Strain factors, since the resulting p-value is lower t
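The goodness-of-fit quantities in this list can be computed directly from the observed and fitted values. The sketch below uses the standard definitions of R-square and adjusted R-square; the helper names are hypothetical. The adjusted value at the end assumes 16 observations and 3 predictors, which is an inference from the 12 residual degrees of freedom reported for this model and not a figure stated in the text.

```python
def r_squared(observed, fitted):
    """Proportion of the variation in `observed` explained by `fitted`:
    1 - SS_residual / SS_total, with residual = actual minus predicted."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """R-square penalised for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Multiple R-square quoted in the text; n_obs and n_predictors are assumptions.
print(adjusted_r_squared(0.8955, n_obs=16, n_predictors=3))
```

The adjusted version never exceeds the plain R-square, so it is the safer figure to quote when comparing models with different numbers of terms.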
38. [Figure 2.30: Cost Efficient Designs by Cost. Total design cost on the X axis and Total Power on the Y axis.]
[Figure 2.31: Cost Efficient Designs by Units. Total number of units on the X axis and Total Power on the Y axis.]
3 Evaluate Module
3.1 Longitudinal Design
Once we have detected a significant difference among treatments, we have to evaluate the power of the test. Power is the probability of detecting a significant difference when it really exists. Power calculations for the most common tests, such as the t-test or ANOVA, have been developed, but power for longitudinal parameters is still under study. This function uses a method to evaluate the power of a test for the average slopes, based on a linear mixed model framework.
3.1.1 Summary of Evaluating Longitudinal Designs
Linear Mixed Models provide appropriate analysis tools for this type of design. Fixed effect hypoth
39. LADES User's manual
Alan Vázquez Alcocer
Copyright 2014 Alan Vazquez Alcocer
HTTP CIMAT MX HECTORHDEZ LADES INDEX HTML
The LaTeX template used for this manual is the Legrand Orange Book, originally created by Mathias Legrand. This material is used in compliance with the Creative Commons Attribution-NonCommercial 3.0 Unported License, http://creativecommons.org/licenses/by-nc/3.0. First printing, February 2014.
Contents
1 Introduction
1.1 LADES Help
1.2 How to install LADES
1.3 User Sessions
1.4 Import and Export
2 Create Module
2.1 Full Factorial Design
2.2 Fractional Factorial Designs with 2 levels
2.3 Optimal Design of Experiments
2.4 Full Longitudinal Design
2.5 Unequal Longitudinal Design
2.6 Longitudinal Optimal Powered Cost Design
3 Evaluate Module
3.1 Longitudinal Design
3.1.1 Summary of Evaluating Longitudinal Designs
3.1.2 Evaluating a Longitudinal Design
3.1.3 Function Outputs
4 Analyze Module
4.1 Design of Experiments Statistical Analysis
4.2 Longitudinal Analysis
4.2.1 Linear Mixed Model
4.2.2 Generalized Linear Mixed Models
4.2.3 Generalized Estimating Equations
40. [Figure 4.57: Fitted Values and Residuals (GEE). Table of observed responses, fitted values and residuals by subject and occasion.]
4.2.4 Generalized Least Squares (GLS)
In linear mixed models the covariance matrix of the response vector $y_i$ is
$$\mathrm{Var}(y_i) = \sigma^2 \left( Z_i D Z_i^T + \Lambda_i \right).$$
This matrix has two components which can be used to model heteroskedasticity and correlation: a random-effects component given by $Z_i D Z_i^T$ and a within-subject component given by $\Lambda_i$. In some applications it is desirable to avoid incorporating random effects in the model to reflect the dependency between observations. If we use only the within-subject component $\Lambda_i$ to model the variance structure of the response, we obtain a simplified version of the linear mixed effects model. That is,
$$y_i = X_i \beta + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2 \Lambda_i), \quad i = 1, \ldots, M. \qquad (4.9)$$
Parameter estimation under this model has been studied in the linear regression literature. It is usually assumed that $\Lambda_i$ is known. This estimation problem is known as the generalized least squares problem.
Summary of GLS
• If we use the transformation
$$y_i^* = (\Lambda_i^{-1/2})^T y_i, \quad X_i^* = (\Lambda_i^{-1/2})^T X_i, \quad \varepsilon_i^* = (\Lambda_i^{-1/2})^T \varepsilon_i,$$
where $\Lambda_i$ is positive definite and $\Lambda_i = (\Lambda_i^{1/2})^T \Lambda_i^{1/2}$, we can re-express the model in Equation 4.2.4 as a classical linear regression model y X
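The effect of this transformation can be checked numerically. The sketch below assumes one particular square root of Λ, obtained from the Cholesky factorisation Λ = L Lᵀ by taking Λ^{1/2} = Lᵀ, so that (Λ^{-1/2})ᵀ = L⁻¹. It then verifies that the transformed errors have covariance proportional to the identity. The 3×3 matrix is a made-up AR(1)-type example and the helper names are hypothetical; this is not necessarily how LADES or its underlying R packages perform the computation.

```python
import math


def cholesky_lower(a):
    """Return lower-triangular L with a = L L^T (a must be positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def forward_solve(low, b):
    """Solve L x = b, i.e. compute x = L^{-1} b."""
    x = []
    for i in range(len(b)):
        s = sum(low[i][k] * x[k] for k in range(i))
        x.append((b[i] - s) / low[i][i])
    return x


def transform_matrix(low, m):
    """Apply L^{-1} to every column of the matrix m."""
    n_rows, n_cols = len(m), len(m[0])
    cols = [forward_solve(low, [m[r][c] for r in range(n_rows)])
            for c in range(n_cols)]
    return [[cols[c][r] for c in range(n_cols)] for r in range(n_rows)]


def transformed_covariance(lam):
    """Covariance (up to sigma^2) of the transformed errors: L^{-1} Lambda L^{-T}."""
    low = cholesky_lower(lam)
    half = transform_matrix(low, lam)            # L^{-1} Lambda
    half_t = [list(row) for row in zip(*half)]   # Lambda L^{-T}
    return transform_matrix(low, half_t)         # L^{-1} Lambda L^{-T}


# Hypothetical within-subject correlation matrix for three occasions.
lam = [[1.0, 0.6, 0.36],
       [0.6, 1.0, 0.6],
       [0.36, 0.6, 1.0]]
for row in transformed_covariance(lam):
    print([round(v, 10) for v in row])
```

Applying `forward_solve` to a response vector and `transform_matrix` to a design matrix gives the starred quantities, after which ordinary least squares can be used on the transformed data.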
41. [Figure 4.67: Adjusted Values and Residuals (GLS). Table of observed distances, fitted values and residuals by subject (M03, M04, M05, ...).]
Wald statistic, which is defined as the square of the ratio between the regression coefficient and its standard error. This statistic follows a $\chi^2$ distribution with 1 degree of freedom. We can see that age has an effect of 0.632 per time unit, and this change is statistically significant at the 0.05 level. That is, for both groups (male and female) the average growth is 0.632 mm per year around age 11. Since the contrast type for sex was Helmert, one effect was defined, sex1, which represents the difference between the means of the observations of the two groups. In our case there is a significant difference between males and females in the distance from the pituitary to the pterygomaxillary fissure: the male contribution to the distance is 1.136, while the female contribution is -1.136. Now, to see whether the growth of males and females differs, we focus on the interaction age:sex1. The effect of this interaction is statistically significant at the 0.05 level. This means that the difference between the growth rates of men
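The Wald test mentioned above can be sketched in a few lines. The statistic is the squared ratio of a coefficient to its standard error, and for one degree of freedom the chi-square tail probability can be written with the complementary error function. The numeric inputs below are hypothetical, since the standard errors are not legible in this excerpt, and `wald_test` is an illustrative name.

```python
import math


def wald_test(estimate, std_error):
    """Wald statistic (estimate / std_error)^2 and its p-value under a
    chi-square distribution with 1 degree of freedom.

    For 1 degree of freedom, P(chi2 > w) = erfc(sqrt(w / 2)).
    """
    w = (estimate / std_error) ** 2
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value


# Hypothetical coefficient and standard error, for illustration only.
w, p = wald_test(estimate=0.632, std_error=0.08)
print(w, p, p < 0.05)
```

A coefficient is declared significant at the 0.05 level when the p-value falls below 0.05, which happens when the estimate is roughly more than 1.96 standard errors away from zero.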
42. LADES can be seen in Figure 4.20, which shows the main screen of the Linear Mixed Model function. After clicking Adjust, the function displays different outputs such as the ANOVA, parameter estimates, etc., but for now we will concentrate on the Adjusted vs. Residuals plot. This graph of standardized residuals versus fitted values gives a clear indication of heteroskedasticity in the residuals. In the figure we can see that our residuals indicate heteroskedasticity. Thus, in order to obtain residuals with constant variance, we have to use a Power Variance Structure. This customization of our analysis can be done using the Customize button on the screen in Figure 4.20. The customization menu is shown in Figure 4.22.
[Figure 4.20: Main Screen (Linear Mixed Models). Variables: weight, Time, Rat, Diet.]
[Figure 4.21: Results of longitudinal graph. Plot: Residuals VS Fitted Values; x-axis: Fitted Values.]
[Screenshot of the customization menu: Method (ML or REML), Variance Structure.]
43. We want to test the parameter effects and to construct a model that best represents our data. The analysis we use is similar to multiple linear regression.
Summary: Design of Experiments Statistical Analysis
• The main effect of a factor is the difference obtained by subtracting the average of the observations at its low level from the average of the observations at its high level. The main effects graph shows the average of all observations at each level of each factor, and connects them with a line. The estimated effect of the interaction of two factors A and B is calculated from two differences: the difference between the averages of the observations at the high and low levels of B, both at the high level of A, minus the same difference between the high and low levels of B at the low level of A. The effects of interactions can be seen through an interaction plot. This graph shows the average of the observations at each combination of the two factors A and B, and these averages are joined by a line.
• The model matrix helps us to compute main effects and interaction effects. These effects can be estimated using linear regression (see Section 5.1); we only need the model matrix and the response vector (see A.1). By least squares, a coefficient is estimated for each of the independent variables (factors) of the regression model. These coefficients mul
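The effect calculations described in this summary can be sketched for a two-factor, two-level design. The responses below are hypothetical. One caveat on conventions: the sketch estimates the interaction as the contrast between runs where the product A×B is +1 and runs where it is -1, which is the common textbook definition and equals half of the difference of differences described in the text; regression coefficients under -1/+1 coding are in turn half of these effects.

```python
def mean(values):
    return sum(values) / len(values)


def effect(contrast, response):
    """Average response where the contrast column is +1 minus the
    average response where it is -1."""
    high = [y for c, y in zip(contrast, response) if c == 1]
    low = [y for c, y in zip(contrast, response) if c == -1]
    return mean(high) - mean(low)


# Columns of the model matrix for a 2x2 design, coded -1 (low) and +1 (high).
a = [-1, 1, -1, 1]
b = [-1, -1, 1, 1]
ab = [x * z for x, z in zip(a, b)]  # interaction column

y = [20.0, 30.0, 25.0, 45.0]  # hypothetical responses, one per run

print("A :", effect(a, y))
print("B :", effect(b, y))
print("AB:", effect(ab, y))
```

Here the effect of B is 15 at the high level of A and 5 at the low level of A, so the lines in an interaction plot would not be parallel, which is the visual signature of an interaction.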
44. Fractional factorial designs are ideal because of the orthogonality property; thus they produce the minimum determinant for M (a diagonal matrix). In order to measure the deviation of an optimal design from perfect orthogonality, we use the Variance Inflation Factor (VIF). The minimum value of the VIF is one. For orthogonal designs, the VIF of each of the estimated factor effects is 1. When a design is not orthogonal, at least one VIF is greater than one. If the VIF of an effect is four, then the variance of the estimate of that effect is four times greater than it would be using an orthogonal design.
Constructing an Optimal Design of Experiments
To access the function for creating optimal designs we follow the route Create > Optimal 2 design. As an example, we have a study in which a new treatment to aid the healing process in rats is to be assessed. The population under study consists of two strains of rats, Wistar and NuNu, using males and females of each. The new treatment is to be compared against a control group. The treatment was applied and the measurements were taken 3 days after application. It is considered that the interaction between strain and treatment may be significant. The response under study is the area of the wound, measured in mm². Only seven rats are available, so it is necessary to propose the best design to analyze the factors of interest. In Figure 2.11 we see the main screen of the function. In this figure we can also see the
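A minimal sketch of the VIF idea, restricted to the simplest case of a model with two main effects and no interaction, where the VIF of either effect is 1 / (1 - r²) and r is the correlation between the two coded factor columns. Both designs below are hypothetical; the seven-run layout is one arbitrary way of allocating seven units and is not the design LADES would propose.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long numeric columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_factors(x, y):
    """VIF of either effect in a model with exactly two predictors."""
    r = pearson(x, y)
    return 1.0 / (1.0 - r * r)


# Orthogonal 4-run design: the columns are uncorrelated, so VIF = 1.
strain_4 = [-1, -1, 1, 1]
treat_4 = [-1, 1, -1, 1]

# Hypothetical 7-run design: balance is impossible, so VIF > 1.
strain_7 = [-1, -1, -1, 1, 1, 1, 1]
treat_7 = [-1, 1, 1, -1, -1, 1, 1]

print(vif_two_factors(strain_4, treat_4))
print(vif_two_factors(strain_7, treat_7))
```

With more than two model terms the VIF of each term requires regressing its column on all the others, but the interpretation is the same: values near 1 indicate a design close to orthogonal.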
45. Factorial Designs with 2 levels
If the number of factors in a $2^k$ factorial design increases, the number of runs necessary to complete a replicate increases quickly. For example, a complete replicate of a $2^6$ design requires 64 runs. This seems excessive given that only a few effects, such as main effects or some interactions, will be significant. If we can reasonably assume that certain high-order interactions will not be significant, then by using only a fraction of the complete design we can obtain enough information about the main effects and some two-factor interactions. These fractional factorial designs are the most widely used in experiments where we want to reduce the resources used. The most common use of fractional factorial designs is for screening. In this type of experiment many factors are considered, and the aim is to identify the factors, if any, with the largest effects. Screening experiments are used in the early stages of a study, when many initial factors are considered but it is known that just a few will have a significant effect. The success of fractional factorial designs is based on the following concepts:
1. Sparsity: When there are many factors under study, it is expected that only a few main effects and interactions will be significant for the response.
2. Projection: The resulting significant effects of a fractional factorial design can be projected on larger designs.
3. Sequential Expe
46. and women around the age of 11 years is different.
• Residuals Normal Plot: This graph shows the empirical quantiles of the residuals against the quantiles of the standard normal distribution. It helps us validate whether the residuals are normally distributed. We want the points to form a diagonal line in the graph. See Figure 4.69.
• Residuals by Factor Normality Plot: This graph shows the empirical quantiles of the residuals against the standard normal quantiles for both groups, male and female. It helps us validate whether the model residuals are normally distributed in both groups. We want the points to form a diagonal line in both graphs. See Figure 4.70.
• Residuals and Adjusted Values Plot: This graph helps us validate the assumption of constant variance of the residuals. We note that the dispersion of the points is similar in both groups (see Figure 4.71). This is due to the adjustment we made by including a parameter that models the difference between the variances of the two groups.
• Autocorrelation Plot
[Figure 4.68: Fixed Effects and Hypothesis Testing (GLS). Table columns: Std. Error, t-value, p-value.]
[Figure 4.69: Residuals Normal Plot (GLS). Axes: Quantiles of standard normal, Normalized residuals.]
[Screenshot: Residuals by sex.]
47. Parameter estimates by Subject (GLMM)
• Estimation of Random Effects Parameter: In this section the estimate of the random effects parameter is shown. Only one parameter, the variance, was estimated, since the distribution of the random intercept is Normal with mean zero. Figure 4.48 shows this estimate.
• Correlation of Fixed Effects: Shows the correlation of the fixed effects used in our model. See Figure 4.49.
[Figure 4.48: Estimation of Random Effects Parameter, Variance (GLMM). Value shown: 5.17198902615059.]
[Figure 4.49: Correlation of Fixed Parameters (GLMM). Rows and columns: Intercept, Visit, Treatment1, Visit:Treatment1.]
4.2.3 Generalized Estimating Equations
The method of Generalized Estimating Equations (GEE) allows us to model the relations between variables at different points in time in a simultan
48. comparison among the three groups. With these characteristics the function will define the minimum difference to detect in these two comparisons. The same holds for the Intercept Comparison section: if we leave this section at zero, no test will be computed. Finally, we click on Fit in the main screen.
Function Outputs
• Design Cost: This section shows the total cost of the design according to eq. 3.1.
• Power Slopes: This section shows the power of the hypothesis test for the slopes.
• Power Intercept: This section shows the power of the hypothesis test for the intercepts.
• $F_{Helms}$ and Test: Shows the results of the hypothesis test 3.3. We encourage the reader to check the paper [Helms1992] for more information about the methodology.
[Figure 3.3: Customize Section (Evaluate Longitudinal Design). Fields include the random effects parameters Var(b0), Var(b1), Cov(b0, b1) and the residual variance, and values for Control, Treatment A and Treatment B.]
4 Analyze Module
4.1 Design of Experiments Statistical Analysis
After running the experimental design, we have to analyze the results in a statistically adequate manner.
49. [Figure 4.6: Fitted Values and Residuals (DOE Analysis)]
[Figure 4.7: R-Squared for model 4.1 (DOE Analysis). Residual Standard Error: 0.2493 with 12 degrees of freedom.]
[Figure 4.8: ANOVA (DOE Analysis). Table columns: Sum Sq, Mean Sq, F value.]
[Figure 4.9: Q-Q Plot for Residuals (DOE Analysis). Axes: Theoretical, Sample.]
[Figure 4.10: Residuals Independence Graph (DOE Analysis). Axes: Observation, Residuals.]
amplitude of the points for each adjusted level value is the same. If the points on the graph take the form of a cone, this indicates that the variance of our residuals is not constant. For our data there is only one point that does not fit the proposed form. If we take this value as an outlier, assumption (iii) is satisfied (Figure 4.11).
• Half Normal Plot: A graph of the absolute values of the estimated factor effects against their normal cumulative probabilities. Figure 4.12 shows the half Normal plot of the effects of each of the factors in the d
50. bles
5.1 Linear Regression
[Figure 5.1: Loaded Data, Linear Regression]
[Figure 5.2: Linear Regression Main Screen]
[Figure 5.3: Results Screen, Linear Regression]
[Figure 5.4: Hypothesis Testing for Parameters, Linear Regression]
[Figure 5.5: Model Matrix, Linear Regression]
[Figure 5.6: Estimated Coefficients]
51. bout the theory and methods can be found in Helms (Helms1992). We now show how to fill out this data in the corresponding boxes. First, Figure 3.1 shows the general parameters of the function, such as Design Characteristics (3 groups, 10 mice), where we can add up to 3 groups to evaluate the hypothesis using F_Helms. Below this section we find the Time section, where we define the occasions used (5 occasions), and the Cost Evaluation section, used if we want to calculate the cost of our design.
[Figure 3.1: Main Screen, Evaluate Longitudinal Design. Groups Control, Treatment A and Treatment B with 10 units each; Time coded as -4, -2, 0, 2, 4; Cost Evaluation fields Unit Cost and Observation Unit Cost]
We can also define an Incomplete Design. We just need to select the Group, add up to 3 Cohorts, and define the number of individuals for each. An example is shown in Figure 3.2. For instance, the second row in Figure 3.2 shows that the second Cohort is composed of the first, the middle and the last occasions, so its total number of occasions is 3. Researchers decided to include three cohorts to reduce the number of subject-occasions and the cost.
52. cluding a Power variance structure, mentioned at the beginning of the analysis.
[Figure 4.34: Residuals VS Fitted Values Plot, Linear Mixed Models (axes: Fitted values, Standardized residuals)]
• Residuals VS Fitted Values Plot by Factor: This graph shows the previous graph split by Diet. We note that the dispersion of the Diet 2 points is smaller than for Diets 1 and 3 (see Figure 4.35). This makes us doubt the validity of the assumption of constant variance in the residuals and suggests that the problem lies in the Diet 2 observations.
• Normal Random Effects Plot: This graph (Figure 4.36) helps us to validate the assumption that the random effects follow a normal distribution. The quantiles of the random effects (X axis) are plotted against the Normal quantiles (Y axis). The points for each of the random effects should form a diagonal. For our data this is true, but we can detect some outliers (rat number 12).
• Residuals by Subject Plot: This graph shows the residuals for each one of the 16 subjects. Another assumption of the model is that the variance is the same for all subjects. In Figure 4.37 we note that the dispersion of points for mouse 14 is wider than for the others, so this assumption may not be fulfilled.
• Real VS Fitted Values: In this plot th
53. concepts and theory regarding it, and an example implemented in LADES are provided.
1.2 How to install LADES
Before you can install LADES, you have to update or install the JAVA software. You can download the latest version of JAVA from http://java.com/es/download/index.jsp. Once you have downloaded it, just follow the installer instructions. To install LADES on your computer, follow these steps:
1. Download the LADES exe file from http cimat mx hectorhdez lades donwload html
2. Open the exe file and follow the instructions. The R 2.15.1 installer will appear automatically; just follow its instructions.
3. Open LADES by double-clicking its desktop icon.
4. The credentials to start your first LADES session are USERNAME: admin and PASSWORD: 1234567. You may be asked to install the R libraries; just click YES.
5. Start using LADES and have fun.
1.3 User Sessions
To start a new session in LADES, go to the main menu and click on File > Administration > Users Control. To add a new user, click on New, fill in the required information and click Save. You can now start a new session under your user.
1.4 Import and Export
To import a csv file into LADES, click on Import and choose the file from your computer. To save a file, click on Save. To add or delete columns from your data, click on the corresponding buttons at the bottom of the main menu.
2.1 Full
54. corresponding p-value are given. The p-value is based on the Wald statistic, defined as the square of the ratio between the regression coefficient and its standard error. This statistic follows a χ² distribution with 1 degree of freedom. We can see that Time has an impact of 2.196 g of weight per unit time, and this change is statistically significant at the 0.05 level. Since the contrasts were defined as Treatment, the Diet factor was split into two effects, Diet 2 and Diet 3, with Diet 1 as the reference. The estimated coefficient for Diet 2 is therefore the value added by applying Diet 2 instead of Diet 1 to an individual. This estimate is significant, and the added value is 198.584 g. Likewise, the estimated coefficient for Diet 3 is the contribution of Diet 3 relative to Diet 1, an increase of 250.809 g, which is also statistically significant. The interaction Time:Diet2 is the change in slope produced by applying Diet 2 instead of Diet 1. This interaction is significant and causes a change of 3.73 g per unit time. The interaction Time:Diet3 is interpreted likewise, but it is not significant.
• Random Effects Estimates by Subject: Figure 4.32 shows the best linear unbiased predictors of the random effects for each of the rats. These predictions were used to calculate the residuals and fitted values of the model. Basically, the best linear unbiased predictions denote
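The Wald test described above can be sketched in a few lines of Python. This is only an illustration, not the code LADES runs internally (LADES relies on R); the function name `wald_test` and the standard error used in the example are assumptions made for the sketch.

```python
import math

def wald_test(estimate, std_error):
    """Return the Wald statistic and its chi-square (1 df) p-value."""
    z = estimate / std_error
    w = z * z  # squared ratio of the coefficient to its standard error
    # For 1 degree of freedom, P(chi2 > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value

# The estimate 2.196 is the Time coefficient quoted in the text;
# the standard error 0.5 is a made-up value for illustration only.
w, p = wald_test(2.196, 0.5)
print(f"Wald statistic = {w:.3f}, p-value = {p:.6f}")
```

A coefficient is declared significant at the 0.05 level when the returned p-value is below 0.05.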
55. cost function is defined in order to compare the constructed designs:
\[ \text{cost} = N \times MCRS + K \times COS \tag{2.4} \]
where N is the total number of subjects, K is the total number of observations per subject, MCRS is the marginal cost of including (e.g. buying) an individual in the study, and COS is the cost of measuring and maintaining an individual during K times. Other cost functions can be defined.
• Using this framework of power analysis (the F_Helms statistic and the cost function), we can construct incomplete longitudinal designs that reach the desired power for fixed-effects hypotheses. We call these longitudinal designs Longitudinal Optimal Powered Cost Designs. These designs look for a tradeoff between power and cost.
• We need prior information in order to get more accurate results.
Constructing Longitudinal Optimal Powered Cost Design
As an example to show this methodology, we have a mice study. Three treatments were administered to three groups of mice: Treatment A, Treatment B and a Placebo. Five measures over time were recorded, and the response under study was the weight. The main goal of this example is to design the next experiment using the minimum number of mice while keeping a reasonable power for testing the difference among the slopes of the three treatments. We will explain the example and some important characteristics of this function. This function can be accessed by following Create > Longitudinal Optimal Powered Cost Design. The statistical model used by t
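As a rough illustration of how the cost function can be used to compare designs, the sketch below evaluates it for a complete and an incomplete design. It assumes the two terms of the cost function are added and that the observation count passed in is the total number of observations in the study; the function name `design_cost` and all numeric values are hypothetical.

```python
def design_cost(n_subjects, n_observations, mcrs, cos):
    """Cost = N * MCRS + K * COS (the sum is an assumed reading of the formula)."""
    return n_subjects * mcrs + n_observations * cos

# Hypothetical unit costs, for illustration only.
MCRS = 0.3  # marginal cost of including one subject
COS = 1.0   # cost of one observation

# Complete design: 30 mice, each measured on all 5 occasions.
complete = design_cost(30, 30 * 5, MCRS, COS)

# Incomplete design: 30 mice, each measured on only 3 occasions.
incomplete = design_cost(30, 30 * 3, MCRS, COS)

print(f"complete: {complete:.1f}, incomplete: {incomplete:.1f}")
```

The incomplete design is cheaper because fewer observations are taken; whether it still reaches the desired power has to be checked separately.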
56. d optimal design of experiments.
Summary of Optimal Design of Experiments
• As already mentioned, we must build the best design, i.e. the one that captures as much information as possible. There are different criteria to classify designs as better than others. The criterion used by this software is the D-optimality criterion, which is based on giving the best estimates of main and interaction effects.
• One way to estimate the effects of main factors and interactions is to estimate the coefficients β of the regression model of interest, which may include only main effects or main effects and some interactions. Coefficients are estimated using least squares, which leads to the following expression:
\[ \hat{\beta} = (X^\top X)^{-1} X^\top Y \tag{2.3} \]
where X is the model matrix and Y is the vector of responses. The covariance matrix of these coefficient estimates is
\[ \mathrm{Var}(\hat{\beta}) = \sigma^2 (X^\top X)^{-1} \]
where X is the model matrix and σ² is the variance of the model error. The D-optimality criterion chooses the design that minimizes the determinant of this covariance matrix, which leads to more accurate estimates of the coefficients. Since σ² is unknown but constant, this minimization can be performed by maximizing the determinant of M = X^⊤X.
• The construction of designs that minimize the variance of the coefficient estimates is performed using the point exchange algorithm.
• Full factorial and fr
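To make the D-optimality criterion concrete, the following sketch computes the determinant of X'X for two candidate four-run designs with two factors and a main-effects model. It is a minimal pure-Python illustration, not the exchange algorithm used by LADES; the helper names and the two candidate designs are assumptions chosen for the example.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def det(m):
    """Determinant by cofactor expansion (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, pivot in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * pivot * det(minor)
    return total

def d_criterion(design):
    """det(X'X) for a model with an intercept and main effects only."""
    x = [[1.0] + [float(v) for v in run] for run in design]
    return det(matmul(transpose(x), x))

full_factorial = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
unbalanced = [(-1, -1), (1, -1), (-1, 1), (-1, -1)]  # one run repeated

print("full factorial:", d_criterion(full_factorial))
print("unbalanced:    ", d_criterion(unbalanced))
```

Under this criterion the full factorial is preferred, since a larger determinant of X'X corresponds to a smaller determinant of the covariance matrix of the estimates.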
57. del (1) follow a Normal distribution, (2) have constant variance and (3) are independent. In other words, the residuals carry no further information.
Implementation of an Analysis of an Experimental Design
As an example, we will analyze the results of a study about two different strains of mice, BALB/c and C57BL. A treatment and a placebo were administered to each of the strains, and the response under study was the number of red blood cells (RBC). Researchers are interested in assessing the difference in the number of red cells between strains, and a possible difference between the strains under the new treatment (strain:treatment interaction). The design used for this experiment is shown in Table 4.1.
[Table 4.1: RBC Experimental Data]
When analyzing an experimental design, it is important to code the variables so as to construct an orthogonal design matrix. This can lead to a more accurate statistical analysis. The coded levels for each of our variables are shown in Table 4.2.
Table 4.2: Codifications for each Factor in the RBC Experiment
Strain: BALB/c = -1, C57BL = 1
Treatment: Placebo = -1, Treatment = 1
4.1 Design of Experiments Statistical Analysis
The data imported into the software is shown in Figure 4.1. Here we can see that it is a 2² full factorial design replicated 3 times. To access the Design of Experiments Analysis function, we follow Analyze > DOE. The resulting
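The effect of coding the factors as -1 and 1 can be checked with a short sketch. It builds the model matrix of a two-factor, two-level full factorial replicated 3 times (intercept, strain, treatment and their interaction) and verifies that its columns are mutually orthogonal. The column layout and variable names are assumptions made for the illustration.

```python
from itertools import product

levels = (-1, 1)
base_runs = list(product(levels, repeat=2))  # (strain, treatment) combinations
design = base_runs * 3                       # 3 replicates, 12 runs in total

# Model matrix columns: intercept, strain, treatment, strain:treatment.
rows = [(1, s, t, s * t) for s, t in design]

def column_dot(i, j):
    """Inner product of columns i and j of the model matrix."""
    return sum(r[i] * r[j] for r in rows)

gram = [[column_dot(i, j) for j in range(4)] for i in range(4)]
for line in gram:
    print(line)
```

All off-diagonal entries of the printed matrix are zero, which is what orthogonality of the coded design means; each effect is then estimated independently of the others.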
58. dividuals per group, total cost and power.
[Figure 2.23: Reference Power, Unequal Longitudinal Designs]
[Figure 2.24: Reference Cost, Unequal Longitudinal Designs]
2.6 Longitudinal Optimal Powered Cost Design
Full longitudinal designs, in which each subject is evaluated at each measurement occasion, are the best option for conducting longitudinal studies, but they are often very expensive, which motivates a search for more efficient designs. Intentionally incomplete designs cost less because they divide the population under study into groups, called Cohorts, that are measured only on specific occasions. This type of design has the potential to be more efficient than complete designs.
Summary of Longitudinal Optimal Powered Cost Design
• Linear Mixed Models provide appropriate analysis tools for this type of design.
• Fixed-effect hypotheses can be tested via the F_Helms test statistic. An accurate approximation of the statistic's small-sample non-central distribution makes power computation feasible. Thus we can compare longitudinal designs according to the power of the fixed-effects test of interest under the F_Helms statistic.
• A simple
59. e and having a reasonable power for testing the difference among the slopes of the three treatments. The statistical model used by this function is a special case of a linear mixed model. This model has been reparametrized to make the comparisons of the three slopes more direct. The model for this example is
\[
y_{ij} =
\begin{cases}
\beta_0 + b_{0i} + \beta_1 t_{ij} + b_{1i} t_{ij} + \varepsilon_{ij}, & \text{Control} \\
\beta_2 + b_{0i} + \beta_3 t_{ij} + b_{1i} t_{ij} + \varepsilon_{ij}, & T_B \\
\beta_4 + b_{0i} + \beta_5 t_{ij} + b_{1i} t_{ij} + \varepsilon_{ij}, & T_C
\end{cases}
\tag{2.3}
\]
where y_ij represents the j-th measurement of the i-th mouse, b_0i is the random factor that represents the difference between the initial weight and the mean, and b_1i is the random factor representing the difference between each mouse and the average slope of its group. The errors ε_ij are independent of each other. This model and the corresponding hypothesis testing for slopes are presented in Section 3.1.2. In our example we will use some of those results.
We now present the function and how to fill in the corresponding data. First, Figure 2.18 shows the general parameters of the function, such as Number of Groups, Units per Group and Time. We fill out the boxes with 3 groups, 10 mice and 5 occasions (coded as -4, -2, 0, 2, 4), respectively. Minimum Units per Group refers to the minimum number of individuals the newly generated designs must have. Tolerance Difference Number for Individuals restricts possible large differences among the groups in the new generated des
60. e likelihoods of the two proposed models. This test basically compares the information explained by two models and decides whether or not our proposed model is better than the other. It helps us to determine the significance of potential random effects in our data and is based on a χ² distribution. Finally, models can also be compared using two different criteria: the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). See A.5.
The linear mixed-effects model (4.2.1) assumes that the within-subject errors ε_i are independent and follow a N(0, σ²I) distribution. There is also an extension of the linear mixed-effects model in which these assumptions are relaxed, which allows us to model non-constant variance (heteroskedasticity) and different correlation structures for the errors (see A.3 and A.4 for a catalog of variance and correlation structures, respectively). This extended model is expressed as follows:
\[ y_i = X_i \beta + Z_i b_i + \varepsilon_i, \quad i = 1, \dots, M \tag{4.3} \]
\[ b_i \sim N(0, \Psi), \quad \varepsilon_i \sim N(0, \sigma^2 \Lambda_i) \]
As in the basic mixed-effects model, we assume that the errors ε_i and the random effects b_i are independent. The estimation methods and hypothesis tests for the parameters of this model are similar to those of the basic model. The matrix Λ_i is transformed so that we can reformulate the problem as a basic model and apply the known theory.
Linear Mixed Effects Model Analysis
As an example, the data file weights.csv contains the weights of 16 mice
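The model comparison tools mentioned above (likelihood ratio test, AIC and BIC) can be sketched as follows. This is a minimal illustration under stated assumptions: the log-likelihood values, parameter counts and sample size are made up, the function names are hypothetical, and the closed-form chi-square tail used here only covers 1 or 2 degrees of freedom.

```python
import math

def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """LRT statistic and chi-square p-value (closed form for df = 1 or 2)."""
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    if df == 1:
        p_value = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        p_value = math.exp(-stat / 2.0)
    else:
        raise ValueError("this sketch only handles df = 1 or df = 2")
    return stat, p_value

def aic(loglik, n_params):
    return 2.0 * n_params - 2.0 * loglik

def bic(loglik, n_params, n_obs):
    return n_params * math.log(n_obs) - 2.0 * loglik

# Hypothetical fits: a random-intercept model versus a model that also has a
# random slope (two extra parameters: slope variance and covariance).
ll_reduced, k_reduced = -620.0, 6
ll_full, k_full = -610.0, 8
n_obs = 176

stat, p = likelihood_ratio_test(ll_reduced, ll_full, df=k_full - k_reduced)
print(f"LRT = {stat:.2f}, p = {p:.2e}")
print(f"AIC reduced = {aic(ll_reduced, k_reduced):.1f}, full = {aic(ll_full, k_full):.1f}")
print(f"BIC reduced = {bic(ll_reduced, k_reduced, n_obs):.1f}, full = {bic(ll_full, k_full, n_obs):.1f}")
```

For both AIC and BIC, the model with the smaller value is preferred.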
61. e real versus the fitted values are depicted (see Figure 4.38). This graph helps us to detect discrepancies in the model predictions.
• Random Effects by Factor Plot: In this section of the results we can observe the random Intercept effect on the X axis and Time on the Y axis for each diet, as this is our factor of interest.
[Figure 4.35: Residuals VS Fitted Values Plot by Factor, Linear Mixed Models (axes: Fitted values, Standardized residuals)]
[Figure 4.36: Normal Plot of Random Effects, Linear Mixed Models (axes: Random effects, Quantiles of standard normal)]
[Figure 4.37: Residuals by Subject Plot, Linear Mixed Models]
[Figure 4.38: Real VS Fitted Values Plot, Linear Mixed Models (axes: Fitted values, weight)]
These graphs should not present trends, as this would let us
62. ecause the number of parameters in A.4 increases quadratically with the number of observations of individuals over time, this correlation structure usually leads to an overparameterized model. When there are few observations per group, the general correlation structure is a useful tool to model the data.
Autoregressive order 1 (AR1)
The AR(1) model is the simplest of the autoregressive models. The correlation structure under this model represents a correlation that decreases as observations become more and more separated in time. An example of this structure is shown below:
\[
\begin{pmatrix}
1 & \rho & \rho^2 & \rho^3 \\
\rho & 1 & \rho & \rho^2 \\
\rho^2 & \rho & 1 & \rho \\
\rho^3 & \rho^2 & \rho & 1
\end{pmatrix}
\tag{A.11}
\]
With this structure only one parameter, ρ, has to be estimated. This matrix is the most widely used when fitting a longitudinal analysis model, since in most cases more distant observations tend to be less correlated.
Independent
This structure is used when there is no relationship between observations. An example of this structure is
\[
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\tag{A.12}
\]
When we work with the linear regression model, our errors are assumed to follow this structure. The independent correlation matrix does not help much in longitudinal data analysis, where the interest is to model the correlation of measurements over time.
Likelihood Criteria
Akaike Information Criterion (AIC) and the Bayesian Information Criterio
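As an illustration of the AR(1) structure described above, the sketch below builds the correlation matrix for any number of occasions from the single parameter ρ. The function name `ar1_correlation` and the value ρ = 0.5 are assumptions chosen for the example.

```python
def ar1_correlation(n_obs, rho):
    """AR(1) correlation matrix: corr(i, j) = rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n_obs)] for i in range(n_obs)]

matrix = ar1_correlation(4, 0.5)
for row in matrix:
    print(["{:.3f}".format(value) for value in row])
```

Observations one occasion apart have correlation ρ, two occasions apart ρ², and so on, so the correlation decays with the distance in time. Setting ρ = 0 gives the independent structure.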
63. efines the complete structure. This correlation pattern for observations can be estimated from previous or pilot studies.
• The formula to estimate the sample size per group is
\[
m = \frac{\left[ z_{1-\alpha/2} \sqrt{2 \bar{p} \bar{q}} + z_{1-\beta} \sqrt{p_1 q_1 + p_2 q_2} \right]^2 \left[ 1 + (n - 1) \rho \right]}{n \, (p_1 - p_2)^2}
\tag{5.8}
\]
where \(\bar{p} = (p_1 + p_2)/2\), \(q = 1 - p\), n is the number of observations over time and ρ is the correlation between observations.
Example
To estimate the sample size using LADES, we follow Statistics > Sample Size Estimation for Longitudinal Analysis with Binary Response. The captured data and the main screen of the function are shown in Figure 5.23. For our problem we want to detect a minimum significant difference of δ = 0.165 in the proportion of Presents. Thus we fix the proportions to be p1 = 0.5 (control group) and p2 = 0.665 (treatment group). We use α = 0.05 and a power of 0.8. The alternative hypothesis to be tested is two-sided.
[Figure 5.23: Input data, Sample Size Estimation, Binary Response. Fields: P1, P2, Observations over time, Level of significance (alpha), Power Test (1 - beta), Alternative Hypothesis (One Tail / Two Tail)]
The most sensitive parameter for sample size calculation is the correlation structure. It is assumed that the correlation structure is Interchangeable with ρ = 0.15. This means that the correlation between any two observations is the same, 0.15. The results are shown in Figure 5.24. In conclusion, we will need 60 individuals per group to perform our study.
5.4 Sample Size Estimation for Longitudinal Analysis with Binary
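The example above can be reproduced approximately with a short sketch. It assumes the usual sample size formula for comparing two proportions with repeated measures under an exchangeable correlation, which may differ in detail from the expression implemented in LADES; the function name and argument names are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_binary(p1, p2, n_obs, rho, alpha=0.05, power=0.8, two_sided=True):
    """Subjects per group for a longitudinal study with a binary response."""
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2 if two_sided else 1 - alpha)
    z_beta = normal.inv_cdf(power)
    p_bar = (p1 + p2) / 2.0
    numerator = (
        z_alpha * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
        + z_beta * math.sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2))
    ) ** 2
    # Inflation factor for n_obs equally correlated repeated measures.
    inflation = 1.0 + (n_obs - 1) * rho
    return numerator * inflation / (n_obs * (p1 - p2) ** 2)

# Inputs taken from the example in the text.
m = sample_size_binary(p1=0.5, p2=0.665, n_obs=3, rho=0.15)
print(f"subjects per group: {m:.2f}")
```

With these inputs the sketch gives a value of roughly 60 subjects per group, in line with the result reported in the text. A larger correlation between observations increases the required sample size.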
64. ent. If the lines of the effects were horizontal or nearly horizontal, we would say that the estimated effect is not significant, since the average of the observations at the Low level (-1) would be almost equal to the average at the High level (+1), i.e. there would be no change from one level to the other.
[Figure 4.11: Residuals VS Fitted Plot, DOE Analysis (axes: Fitted Values, Residuals)]
[Figure 4.12: Half Normal Plot, DOE Analysis (axis: absolute effects; labeled effects: Treatment, Strain:Treatment)]
[Figure 4.13: Main Effect Plots for RBC experiment, DOE Analysis (panels: Strain, Treatment; response: RBC)]
• Interaction Plot: This graph shows the average of the observations in each of the combinations of two factors (A and B), and these averages are joined by a line. For insta
65. eous way. Then the estimate of β_1 in equation 4.2.3 reflects the relationship between the longitudinal development over time of the response variable Y and the longitudinal development of the corresponding predictor variable X.
\[ Y_{it} = \beta_0 + \beta_1 X_{it} + \varepsilon_{it} \tag{4.6} \]
GEE is an iterative procedure which uses the quasi-likelihood to estimate the regression coefficients.
Summary of Generalized Estimating Equations
• The literature assumes that GEE analysis is robust against a wrong selection of the correlation matrix structure. However, we must be careful when selecting the best option for our correlation matrix (see A.4). Unfortunately, there is no direct method to determine which correlation structure is most appropriate. One possibility is to analyze the correlation structure of the within-subject observations over time (observed data) and then choose the most appropriate structure.
• The simplicity of the correlation structure is a factor we must take into account. The number of parameters (in this case, correlation coefficients) that need to be estimated differs according to the structure we choose. For instance, in an exchangeable structure we estimate only one parameter for all the observations over time, while for a complete structure with, for example, five observations over time we estimate ten parameters (5 × 4 / 2). Finally, the power of the statistical analysis will be influenced by the choice of the structure.
• The parameter estimation process can be seen as
66. erence level.
Helmert
The codification for this type of contrast is
\[
\begin{pmatrix}
-1 & -1 & -1 \\
1 & -1 & -1 \\
0 & 2 & -1 \\
0 & 0 & 3
\end{pmatrix}
\]
If you have an equal number of observations at each level (i.e. a balanced design), then the dummy variables will be orthogonal to each other and to the intercept. This codification is not very suitable for the interpretation of the results, but it can be used in some special cases. For our qualitative variable, the regression coefficient of the second dummy variable (first column) refers to the average difference between level 1 and level 2. The interpretation of the parameter of the third dummy variable relates the average of the observations at level 1 and level 2 to twice the average of the observations at level 3. The same holds for the fourth dummy variable.
Variance Structures
Fixed
This structure represents a variance function without parameters but with one variable affecting the variance. It is used when the variance within groups is defined by a proportionality constant, for example when we have evidence that the within-group variance increases linearly over time:
\[ \mathrm{Var}(\varepsilon_{ij}) = \sigma^2 t_{ij} \tag{A.5} \]
where i indexes individuals and j indexes measures in time.
Ident
This structure represents a variance model in which the variance is different for each level of a stratification variable s. Thus different values are taken in the set {1, 2, ..., S}, that is,
\[ \mathrm{Var}(\varepsilon_{ij}) = \sigma^2 \delta_{s_{ij}}^2 \tag{A.6} \]
The variance model A.3 us
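The Helmert codification described above can be generated for any number of levels with a short sketch. The layout follows the convention in which column k contrasts level k + 1 with the preceding levels; the function name `helmert_contrasts` is an assumption made for this illustration.

```python
def helmert_contrasts(n_levels):
    """Helmert contrast matrix with n_levels rows and n_levels - 1 columns."""
    n_cols = n_levels - 1
    matrix = [[0] * n_cols for _ in range(n_levels)]
    for col in range(n_cols):
        for row in range(col + 1):
            matrix[row][col] = -1      # levels already seen get -1
        matrix[col + 1][col] = col + 1  # the new level gets the positive weight
    return matrix

contrasts = helmert_contrasts(4)
for row in contrasts:
    print(row)

# With a balanced design, every column sums to zero (orthogonal to the
# intercept) and any two columns have a zero inner product.
columns = list(zip(*contrasts))
print([sum(col) for col in columns])
```

For a factor with four levels this produces three dummy variables, whose columns are mutually orthogonal.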
67. es S + 1 parameters to represent the S variances.
Power
The variance model for this structure is
\[ \mathrm{Var}(\varepsilon_{ij}) = \sigma^2 |v_{ij}|^{2\delta} \tag{A.7} \]
which represents a power of the absolute value of the variable v. The parameter δ is not restricted, so A.3 can be used in cases where the variance increases or decreases with the absolute value of a variable.
Exponential
This variance model is represented by the structure
\[ \mathrm{Var}(\varepsilon_{ij}) = \sigma^2 e^{2\delta v_{ij}} \tag{A.8} \]
It can be explained as an exponential function of a variable acting on the variance. The parameter δ is not restricted, so A.3 can be used in cases where the variance increases or decreases with a variable.
A.4 Correlation Structures
Exchangeable
This is the simplest correlation structure. It assumes equal correlations between all errors belonging to the same group. An example of this correlation structure can be seen in the following matrix, where we considered 4 observations over time:
\[
\begin{pmatrix}
1 & \rho & \rho & \rho \\
\rho & 1 & \rho & \rho \\
\rho & \rho & 1 & \rho \\
\rho & \rho & \rho & 1
\end{pmatrix}
\tag{A.9}
\]
When using this structure to represent the relationship between data taken across time, it is only necessary to estimate one parameter, ρ.
General (Full)
This is the most complex structure. Each correlation in the data is represented by a different parameter. The corresponding correlation structure is as follows:
\[
\begin{pmatrix}
1 & \rho_1 & \rho_2 & \rho_3 \\
\rho_1 & 1 & \rho_4 & \rho_5 \\
\rho_2 & \rho_4 & 1 & \rho_6 \\
\rho_3 & \rho_5 & \rho_6 & 1
\end{pmatrix}
\tag{A.10}
\]
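The Power and Exponential variance structures can be compared numerically with a small sketch. It assumes the usual forms of these variance functions (a power of the absolute value of the variable, and an exponential of the variable); the function names and the parameter values are hypothetical.

```python
import math

def var_power(sigma, v, delta):
    """Power structure: sigma^2 * |v|^(2 * delta)."""
    return sigma ** 2 * abs(v) ** (2.0 * delta)

def var_exponential(sigma, v, delta):
    """Exponential structure: sigma^2 * exp(2 * delta * v)."""
    return sigma ** 2 * math.exp(2.0 * delta * v)

sigma = 2.0
for v in (1.0, 2.0, 4.0):
    print(
        f"v = {v}: power(delta=0.5) = {var_power(sigma, v, 0.5):.2f}, "
        f"power(delta=-0.5) = {var_power(sigma, v, -0.5):.2f}, "
        f"exponential(delta=0.1) = {var_exponential(sigma, v, 0.1):.2f}"
    )
```

A positive δ makes the variance grow with the variable and a negative δ makes it shrink, which is why an unrestricted δ can represent both situations.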
68. eses can be tested via the F_Helms test statistic. An accurate approximation of the statistic's small-sample non-central distribution makes power computation feasible. Thus we can compare longitudinal designs by the power of the fixed-effects test of interest under the F_Helms statistic.
• A simple cost function is defined in order to compare the constructed designs:
\[ \text{cost} = N \times MCRS + K \times COS \tag{3.1} \]
where N is the total number of subjects, K is the total number of observations per subject, MCRS is the marginal cost of including (e.g. buying) an individual in the study, and COS is the cost of measuring and maintaining an individual during K times. Other cost functions can be defined.
• We need prior information in order to get more accurate results, such as random-effects variances and residual variances.
• Under this framework we can also evaluate Intentionally Incomplete Longitudinal Designs.
3.1.2 Evaluating a Longitudinal Design
As an example to show this methodology, we have a mice study worked out in Helms1992. Three treatments were administered to three groups of mice: Treatment A, Treatment B and a Placebo. Five measures over time were recorded, and the response under study was the weight of the mice. The main goal of this example is to test the hypothesis for average slopes and evaluate its power.
We will explain the example and some important characteristics of this function. This function can be accessed by following Evaluate
69. esign. The solid line of the graph always starts at the origin and usually passes through the 50% quantile of the data. This plot is easy to interpret, particularly when there are few effects. For our data, the estimated effects for the treatment and strain factors are far away from the midline, so we say that these effects cannot be due to chance (they do not belong to a normal distribution) and we consider them significant. The farther an effect is from the midline, the more significant it is for the response. This is consistent with the ANOVA and the hypothesis tests for the coefficients shown above.
• Main Effects Graph: The plot of the main effects depicts the average of all observations at each of the factor levels and connects them with a line. For a two-level factor, the effect of the factor is related to the slope of the line: the steeper the line, the more significant the effect. In Figure 4.13 we find that the lines of each of the main effects are closer to vertical than to horizontal. For treatment, we see a line with a very steep downward slope, from which we conclude that treatment has a negative effect on the response. That is, a change from the Control group (-1) to the therapy group (+1) is negative and significant. For strain, the slope of the line is not very pronounced, but it is not close to a horizontal line. This means that a change from BALB/c (-1) to C57BL (+1) produces a negative effect, and this effect is significant, but not to a great ext
70. esigns. There are different ways to conduct the analysis of these designs. One is the subject-specific approach, where we are interested in modelling the response of each of the individuals in the study. Popular methods are Linear Mixed Models, Generalized Linear Mixed Models, etc. On the other hand, the population approach is used when we want to analyze the behaviour of whole groups. Popular methods are Generalized Least Squares and Generalized Estimating Equations.
• Incomplete Longitudinal Designs are defined by using Cohorts. This methodology is important when we want to reduce the cost of our study. We will explain this in the next section.
Constructing a Full Longitudinal Design
As an example, we have a longitudinal study of a weight-loss drug administered at three dose levels: 0 (placebo), 1 and 2. The main goal is to test the alternative hypothesis that doses 1 and 2 bring about a weight-loss trend different from the trend resulting from placebo. Two strains of mice were included in the study. Finally, the measures will be taken over 7 weeks of study. In order to construct a Full Longitudinal Design for this study using LADES, we follow the route Create > Full Longitudinal Design. Figure 2.16 shows the main screen of this function. We also see the two group characteristics under study with their corresponding levels. We select the option Randomize Design in order to get the resulting design randomized over the individuals. We wil
71. export and import csv files, which makes your results easier to share and evaluate. LADES is intended to include all the necessary tools for analyzing experimental results (laboratory, industrial, chemical, etc.). All functions of LADES are based on the R language for statistical computing and graphics. LADES makes use of R libraries such as MASS, lattice, nlme, lme4, FrF2, AlgDesign, geepack and ggplot2. LADES was built using the JAVA programming language and is available on its project home page: http cimat mx hectorhdez lades index html
LADES requirements are:
• OS: Windows XP, 7 and 8
• JAVA 1.6 or later
1.1 LADES Help
This document is divided into six chapters that will help you to get a better understanding of the capacity and power of LADES. First of all, we show how to install LADES on your computer and we talk about user sessions, input and output (Chapter 1). Chapter 2 presents tools for creating a variety of designs. Chapter 3 shows a function to evaluate longitudinal designs. Chapter 4 explains all the functionality regarding the analysis of the resulting data. Chapter 5 presents a variety of statistical methods with different purposes, from fitting a model to computing sample sizes. Chapter 6 shows the longitudinal data graph function. Finally, an Appendix is included to provide definitions of some important concepts used throughout the manual. For each function, an introduction, a summary of all important
72. follows. First, a simple linear regression (a model assuming that the observations are independent) is fitted. Then, based on the residuals of this fit, the parameters of the correlation matrix are computed. Finally, the regression coefficients are re-estimated using this correlation matrix to correct for the dependency among the residuals. The process is repeated until it converges.
• In GEE, the correlation within subjects is considered a nuisance parameter. GEE then corrects for the within-subject dependency through equation 4.2.3:
\[ Y_{it} = \beta_0 + \sum_{j=1}^{J} \beta_{1j} X_{itj} + \beta_2 t + CORR_{it} + \varepsilon_{it} \tag{4.7} \]
where Y_it is the observation of subject i at time t, β_0 is the intercept, X_itj is the independent variable j for individual i at time t, β_1j is the regression coefficient for independent variable j, J is the number of independent variables, t is time, β_2 is the regression coefficient for time, CORR_it represents the elements of the correlation structure, and ε_it is the error for subject i at time t.
Longitudinal Analysis using GEE
A study was conducted to evaluate the effect of six diets on cholesterol levels in older adults. The diets were constructed from the six combinations of two fiber levels (Low, High) and three levels of fat (Low, Medium, High). Thirty-six people were randomly assigned to these six diets. The level of cholesterol was determined every two months (Time 1 = 2 months, Time 2 = 4 months, Time 3 = 6 months, Time 4 = 8 months). To access the GEE function in LADES, we follow
73. gistic Regression
• Analysis of Variance (ANOVA): This analysis of variance works differently from the ANOVA for linear regression. This type of ANOVA uses the likelihood ratio to test hypotheses about the factors. For our example only Time was significant, that is, adding Time to our proposed model contributes significantly to the explanation of the observed data. This hypothesis testing works with small samples.
• Null Deviance and Residual Deviance: the null and residual deviance of the fitted model (see Figure 5.20).
[Figure 5.19: ANOVA, Logistic Regression]
[Figure 5.20: Null Deviance and Residual, Logistic Regression. Null deviance 27.5799 with 19 degrees of freedom; residual deviance 7.6308 with 10 degrees of freedom]
5.3 Sample Size Estimation for Longitudinal Analysis with Continuous Response
Before conducting a longitudinal study, it is necessary to calculate the number of subjects needed to ensure that a certain predefined effect is significant. Sample size calculations are based on assumptions which, if they are not satisfied, can lead to very different sizes. Sample size calculation is related to the level of significance: how many individuals are needed to make a given effect significant? As an example, we have a study in which we want to compare a therapy-based intervention against a placebo intervention. In this study the response of interest is the systolic pressure. Previo
74. han 0.05. These data are consistent with the results shown in the figure. However, ANOVA is considered the most reliable test to determine factor significance.
• Quantile-Quantile Normal Plot. Graph in which the empirical quantiles of the residuals are plotted against the standard normal quantiles (Figure 4.9). It helps us validate the assumption of normality (i). We aim to have all points very close to the midline. This chart also helps us detect possible outliers in our study. For our data, most points fall around the midline. We can, however, detect an atypical point in the lower left corner of the chart.
• Residuals Independence Graph. This graph shows the time at which each observation was collected (X axis) and the residual of each observation (Y axis). It helps us validate the assumption of independence (iii). Ideally this graph shows no upward or downward trend, i.e. the residuals stay around zero. An upward trend, for example, would provide evidence that the current observation depends on the previous one. For our data there appears to be no explicit trend. See Figure 4.10.
• Residuals vs Fitted Plot. The fitted values of the model are plotted on the X axis and the corresponding residuals on the Y axis. It helps us validate the constant variance assumption (ii). Ideally the points on the graph form a horizontal band, i.e. the
75. his function is a special case of a linear mixed model. This model has been reparametrized to make the comparisons of the three slopes more direct. The model for this example is
$y_{ij} = \beta_2 + b_{0i} + \beta_3 t_{ij} + b_{1i} t_{ij} + \varepsilon_{ij}$ (T B)
$y_{ij} = \beta_4 + b_{0i} + \beta_5 t_{ij} + b_{1i} t_{ij} + \varepsilon_{ij}$ (T C) (2.5)
where $y_{ij}$ represents the $j$-th measurement of the $i$-th mouse, $b_{0i}$ is the random factor that represents the difference between the initial weight and the mean, and $b_{1i}$ is the random factor representing the difference between each mouse's slope and the average slope of its group. The errors $\varepsilon_{ij}$ are independent of each other. This model and the corresponding hypothesis tests for slopes were presented in Section 3.1.2. In our example we will use some of those results.
We now present the LADES function and how to fill in the corresponding data. First, Figure 2.25 shows the general parameters of the function, such as Number of Groups, Units per Group and Time. We fill out the boxes with 3 groups, 10 mice and 5 occasions, respectively. We leave the Customize section for a moment and focus on the Cohorts section. In the Cohorts section we can define up to 3 cohorts. Here we can add cohorts, select the occasions that make up each cohort, and define separately the number of observations for each cohort. For example, the second row in the figure shows that the second cohort is composed of the first, the middle and the last occasions, so its total number of occasions is 3. Researchers decided
76. igns
[Figure 2.18: Main Screen, Unequal Longitudinal Designs (fields shown: Number of Groups = 3, Units per Group = 10, Time, Customize, Minimum Units per Group = 3, tolerance for the difference in number of units)]
In the Customize section we define the contrasts to use. On the left side of Figure 2.19 we define that we want to compare group 1 (Control) against group 2 (T A), and group 1 (T A) against group 3 (T B); using this definition we can complete a full comparison among the three groups. Now we have to define the minimum difference to detect in these two comparisons. That is, the difference between the average slope of T A and the average slope of the control group will be considered significant if it is at least 0.14 (we can also use -0.14). The same applies to the second contrast. On the right side of Figure 2.19 we define whether we want to discriminate the constructed designs based on cost. For this option we can use a user-defined cost or, by leaving this option at zero, discriminate against the cost of the complete design (30 mice, 5 occasions). We also need to specify the cost of each unit (mouse) and the cost of obtaining one measurement (Unit Cost and Observation Unit Cost, respectively).
[Screenshot: Longitudinal Design dialog, Parameters for Power Analysis (Fixed Effect Parameters, Difference to Detect for Groups 1 to 3, Discriminate on Cost, Unit Cost, Observation Unit Cost)]
77. ill the function
[Figure 6.3: Results, Simple Longitudinal Graph]
Diet in the By section (see Figure 6.4). In the resulting display we can get a notion of the effect of each of the diets under study. We can also add another level of classification by placing another factor in the In section.
[Figure 6.4: Results, Longitudinal Plot (Time vs Response; weight on the Y axis; one panel per group, lines by Rat)]
We can also include the units of measurement of our observations. In this case the response variable was the weight of the mice in Gr (Figure 6.5). In addition, we choose the Random Effects Plot option, which will be explained soon.
Function outputs
By choosing the Random Effects Plot option, a simple linear regression model is fitted to each of the individuals. The regressor used is time.
• Random Effects Plot. For the estimated intercept and time coefficients, 95% confidence intervals are constructed, which are plotted in Figures 6.6 and 6.7, respectively. This chart allows us to identify possible random effects for the intercept and the time factor. If the confidence intervals overlap for all individuals, then we say this
78. in Figure 5.21. For our problem we want to detect a minimum significant difference of $\delta = 4.23$. The standard deviation, $\sigma = 13.6$, is obtained from a pilot study conducted beforehand. The probability of rejecting the null hypothesis given that it is true (i.e. the type I error) is 0.05. We will use a power of 0.8. The alternative hypothesis to be tested is $H_1: \beta_{1C} \neq \beta_{1T}$ (two-sided).
[Figure 5.21: Input data, Sample Size Estimation, Continuous Response (fields: Delta, Sigma, Observations over time, Level of significance (alpha), Power of the test (1 - beta), Alternative Hypothesis (One Tail, Two Tail), Correlation structure (Independent, Interchangeable, AR1, Complete), rho)]
The most sensitive parameter for sample size calculations is the correlation structure. It is assumed that the correlation structure is Interchangeable with $\rho = 0.867$. This means that the correlation between any two observations is the same, 0.867. The results are shown in Figure 5.22. In conclusion, we will need 44 individuals per group to perform our study.
[Results screen: Required Number of Individuals per Group = 44; Delta = 4.23; Sigma = 13.6; Significance Level (alpha) = 0.05; Power = 0.8; H1 two-sided; Correlation Parameter = 0.867; Observations Across Time = 2]
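The calculation above can be illustrated with a minimal sketch. It assumes the classical formula for comparing rates of change between two groups under an exchangeable correlation structure (as given by Diggle, Heagerty, Liang and Zeger), $m = 2 (z_{\alpha} + z_{\beta})^2 \sigma^2 (1 - \rho) / (n \, s_x^2 \, \delta^2)$, where $s_x^2$ is the within-subject variance of the measurement times. Whether LADES uses exactly this formula is an assumption, and so are the two measurement times 0 and 1; the function name `n_per_group_slopes` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_slopes(delta, sigma, rho, times, alpha=0.05, power=0.8,
                       two_sided=True):
    """Subjects per group needed to detect a slope difference `delta`.

    Assumes an exchangeable (interchangeable) correlation `rho` and a common
    residual standard deviation `sigma`. `times` are the measurement times
    shared by all subjects (an assumption of this sketch).
    """
    n = len(times)
    mean_t = sum(times) / n
    # Within-subject variance of the time points (divisor n, not n - 1).
    s2_x = sum((t - mean_t) ** 2 for t in times) / n

    std_normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_beta = std_normal.inv_cdf(power)

    m = (2 * (z_alpha + z_beta) ** 2 * sigma ** 2 * (1 - rho)
         / (n * s2_x * delta ** 2))
    return math.ceil(m)  # round up to a whole subject


# Values from the example; the time points (0, 1) are assumed.
required = n_per_group_slopes(delta=4.23, sigma=13.6, rho=0.867, times=(0, 1))
print(required)
```

Under these assumptions the sketch gives 44 subjects per group, the same figure reported in the example. Changing `rho` shows how strongly the result depends on the assumed correlation.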
79. l quantiles for residuals and the standard normal distribution quantiles. It helps to validate the assumption of normality (i). We want the points on the graph to be located close to the midline. This chart also helps us detect possible outliers in our study. For our data, most points fall around the midline. We can also detect an atypical point in the top right of the graph. See Figure 5.11.
• Independence of Residuals Plot. This graph shows the time at which each observation was collected (X axis) and the corresponding residual (Y axis). It helps us validate the assumption of independence (iii). We want the plotted line to show no upward or downward trend, i.e. to stay around zero. An upward trend, for example, would indicate that the result of the current observation depends on the previous one. For our data there appears to be no explicit trend. See Figure 5.12.
• Residuals vs Fitted Values Plot. Fitted values are plotted on the X axis and their corresponding residuals on the Y axis. It helps us validate the constant variance assumption (ii). Ideally the points on the graph form a horizontal band. If the points form a cone shape, then the variance of the residuals is not constant.
80. l see this in the Results section. Moreover, the specifications for the measurements over time have to be entered through the Time section button. See Figure 2.17. This full design will be Case 1. We will present the Incomplete Design feature as Case 2. Incomplete designs are useful when we want to reduce the total cost of our study. The term incomplete refers to the lack of some observations at specific time points. We can define this concept more easily using the Cohorts schedule shown on the right side of the figure. The first row of this matrix corresponds to the first cohort, in which all individuals are evaluated at all seven occasions; we define the number of individuals in this group by entering the desired number in the Size column. The second row represents Cohort 2, in which all individuals are evaluated at occasions 1, 4 and 7. Cohort 3 is composed of individuals that are measured only at the first and last occasions. Varying the number of subjects in each cohort produces different incomplete designs. We can define different cohorts by selecting different groups of time points.
Function Outputs
[Figure 2.16: Main Screen, Full Longitudinal Design]
[Figure 2.17: Time Measurements, Full Longitudinal Design]
• Longitudinal
81. lt, this gives evidence that the residuals may follow a symmetric distribution around zero.
• Fixed Effects Correlations. In Figure 4.28 we observe the correlation matrix of the fixed-effects estimates.
• Adjusted Values and Residuals. In Figure 4.29 the fitted (adjusted) values and residuals of model 4.2.1 are shown. We note that the fitted values are similar to the original observations, so we can say that our model provides a good representation of the data.
• Analysis of Variance. This section shows the results of the ANOVA. The sum of squares is shown for each of the independent variables, as well as the construction of each F statistic. Note that all the fixed effects are significant. Therefore there is a significant change
[Figure 4.23: Confidence Intervals of Parameter Estimates, Linear Mixed Models]
[Figure 4.24: Confidence Intervals for Mixed Effects Parameter Estimates, Linear Mixed Model]
[Figure 4.25: Confidence Intervals for Variance Error Estimate, Linear Mixed Model]
82. mated effect using an orthogonal design. Because that inflation is not very large, we accept the design. A rule of thumb is that the VIF should not be greater than five.
[Figure 2.15: Variance Inflation Factor, Create Optimal Design (VIF = 1.16666666666667 for x1, x2, x3 and X13)]
2.4 Full Longitudinal Design
A longitudinal design is an experimental design in which the researcher assesses a particular group of people over some period of time. Designs of this kind are special because of the correlation of the observations over time. Usually these experiments are conducted to measure trends in a particular characteristic of interest of an individual or group of individuals. The same people are studied throughout a longitudinal study, which improves the accuracy of the results. Longitudinal studies are also commonly used in the medical field because they help to evaluate medicines and detect certain diseases.
Summary of Full Longitudinal Design
• Full Longitudinal Designs are the easiest to construct. They are composed of all combinations of all the factors under study, including Time. In addition, they are the easiest to analyze because they are balanced d
83. n BIC are evaluated as follows:
$\mathrm{AIC} = -2 \log \mathrm{Lik} + 2 n_{par}$ (A.13)
$\mathrm{BIC} = -2 \log \mathrm{Lik} + n_{par} \log(N)$ (A.14)
where $n_{par}$ denotes the number of parameters in the model and $N$ is the total number of observations used to fit the model. Under these definitions, smaller is better. So if we are using AIC to compare two or more models for the same data, we prefer the model with the lowest AIC. Similarly, when using BIC we prefer the model with the lowest BIC. Models that consider many factors tend to fit the data better but require more parameters. The best choice for our model is therefore a balance between its fit and its number of parameters. BIC penalizes additional parameters more severely than AIC, so it tends to prefer smaller models.
84. n the general correlation structure. For instance, cor(1,2) represents the estimate of parameter $\rho_{12}$ in the correlation structure.
• 95% Confidence Intervals for Variance Parameter Estimates. Figure 4.62 shows 95% confidence intervals and estimates for the parameters in the variance structure. For example, 1 represents the estimate of the variance for the first stratification level.
• 95% Confidence Intervals for Variance Residuals Estimate. Figure 4.63 shows the 95% confidence interval and estimate for the variance of the error.
• Likelihood Criteria. In this section the resulting values of the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) (see A.5) are presented. The log-likelihood is also shown. See Figure 4.64.
• Residuals Statistics. This section (see Figure 4.65) shows some statistics of the model residuals. We note that the median of the residuals is close to 0, and Quartile 1 and Quartile 4 are close to the minimum and maximum, respectively. This gives evidence that the residuals may follow a symmetric distribution around zero.
• Fixed Effects Correlations. In Figure 4.66 we observe the correlation matrix of the estimates of the fixed coefficients.
• Adjusted Values and Residuals. In Figure 4.67 the fitted (adjusted) values and residuals of model 4.2.4 are shown.
• Fixed Effects and Hypothesis Testing. In Figure 4.68 the estimates of the coefficients, their standard errors and the corresponding p-values are presented. The p-value is based on
85. nce, for the lower left graph the X axis represents the Strain factor and the Y axis represents the Treatment factor. This chart tells us that if we change from the BALB/c strain (-1) to C57BL (1), a decrease occurs in the average of the observations, for both the control group and the treatment group. On the other hand, for the upper right graph we now have Treatment on the X axis and Strain on the Y axis. This graph is explained as follows: if we apply the treatment to a rat that was receiving the placebo, there will be a decrease in the average response for either of the two strains. This is an indication that the two factors do not interact. We say that the graph reflects an interaction between the factors when, on changing factor A from, for example, level -1 to level 1, the average of the observations increases at one level of B and decreases at the other.
[Figure 4.14: Interaction Plot, DOE Analysis]
4.2 Longitudinal Analysis
4.2.1 Linear Mixed Model
In longitudinal studies it is well known that repeated observations of a particular subject are correlated. If, for example, we treat two patients for flu, the two of them will react differently because of their natural differences (genetics, etc.). The idea behind the basic analysis of linear mixed models is that regression coefficients
86. ndition at the beginning of the study. Finally, $\varepsilon_{ij}$ are the residuals of our proposed model. The implementation of this model in LADES is shown in Figure 4.40. Since our response is binary (1s or 0s), we have to specify this in the Options section. In Figure 4.41 we select the Binomial family to specify that our response is binary. We also select REML as the method for parameter estimation. Finally, we select the type of contrasts, which in this case is Treatment.
Function Outputs
• Model Matrix. In this section we can see the model matrix (see A.1) that was used to fit the GLMM. The type of contrast that was used is reflected in this matrix (see Figure 4.42).
[Figure 4.40: Main Screen, GLMM]
[Figure 4.41: Toe Nail data, toenail.csv file (options: Family: Gaussian, Binomial, Poisson; Method: ML, REML; Contrast: Helmert, Treatment)]
[Model Matrix results screen (columns: Intercept, Treatment1, Visit, Treatment1:Visit)]
87. nformation needed to compute the sample size for a binary response. Sample size is based on detecting a significant difference between the average rates of change of the two groups.
Summary of Sample Size Estimation for binary response
• The null hypothesis on which the sample size calculations are based is $H_0: \beta_{1C} = \beta_{1T}$. That is, the average change over time is the same for the two groups. The alternative hypothesis can be of two types, two-sided or one-sided: $H_1: \beta_{1C} \neq \beta_{1T}$, or $H_1: \beta_{1C} > \beta_{1T}$, or $H_1: \beta_{1C} < \beta_{1T}$.
• The required information to make the calculations is:
$\alpha$ (Type I Error): This parameter represents the probability that the null hypothesis is rejected when it is true. For example, it would correspond to the probability of concluding that there is a difference between the treatment and control groups when there is none. It is usually set to 0.05.
$p_1$, $p_2$: Expected proportion of Presents ($Y = 1$) in groups 1 and 2, respectively. Using these two quantities we can define the minimum significant difference to detect. Researchers usually want to reject the null hypothesis with high probability. We are interested in detecting this minimum difference.
$1 - \beta$ (Test Power or Power): The power of a statistical test is the probability that the null hypothesis is rejected when it is not true. A commonly required value for power is at least 0.8, but it depends on the study.
$m$: Total number of observations over time.
$\rho$: Correlation. The correlation structure of the observations is an Interchangeable structure. The parameter $\rho$ d
88. of all mice's first observation. Now, to define the random effects used in model 3.2, we simply fill out the boxes with the corresponding variances and covariances. If, for example, we fill the Var(b1) box with 0, then the function will not include a random effect for slopes in the model. Sigma^2 stands for the variance of the residuals and alpha represents the type I error. When we want to discriminate the designs using the power of the complete design, we fill the User Defined Power section with 0. If we are interested in some specific value, we just enter it. Finally, we click on Fit in the main screen.
[Equation fragment carried over from the model definition: $y_{ij} = \beta_0 + b_{0i} + \beta_1 t_{ij} + b_{1i} t_{ij} + \varepsilon_{ij}$ (Control)]
[Figure 2.25: Main Screen, Longitudinal Optimal Powered Cost Design]
[Figure 2.26: Customize Section, Longitudinal Optimal Powered Cost Design (fields: Discriminate on Cost, Fixed Effect Parameters, Unit Cost, Observation Unit Cost, User Defined Cost, Discriminate on Power, Random effects parameters, Model Intercepts (Different, Common), Var(b0), Var(b1), Cov(b0, b1), Sigma^2 Residuals, alpha, User Defined Power)]
Function Outputs
• Reference Cost. This section shows the maximum cost (according to eq. 3.1) used to discriminate the constructed design
89. og $\mu$. In the linear mixed model, the parameter estimates are obtained by maximizing the likelihood function, i.e. by finding the estimates under which our data are most likely. For generalized linear mixed models the same procedure is performed, but maximizing the likelihood function is not a simple task, since the distributions of our data depend on the mean. To accomplish this maximization, the PIRLS algorithm is implemented.
• Once we have estimated the parameters and constructed an adequate model for our data, we can compute the fitted values, i.e. the model predictions. To express these predictions on the scale of the real (restricted) observations, we must apply the inverse function of $g$. For the Bernoulli family the inverse function is
$\mu = g^{-1}(\eta) = \frac{e^{\eta}}{1 + e^{\eta}}$
and for the Poisson family
$\mu = g^{-1}(\eta) = e^{\eta}$
Analysis using Generalized Linear Mixed Models
The data analyzed are part of the results of a study conducted to compare two oral treatments for a certain infection of the big toenail. The degree of separation of the nail was evaluated in each patient. The patients were randomized into two treatment groups, and seven measurements were taken during seven visits. The first visit, at time 0, denotes the observation made when the treatment was applied for the first time. The response of each patient was coded as 0 (no separation) or 1 (moderate or severe separation), i.e. a binary response.
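The back-transformation from the linear predictor to the scale of the observations can be sketched as follows. This is only an illustration of the two inverse link functions named above (logit link for the Bernoulli family, log link for the Poisson family); the function names and the example values of the linear predictor are hypothetical and are not taken from the toenail data.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    # Two algebraically equivalent branches avoid overflow for large |eta|.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    exp_eta = math.exp(eta)
    return exp_eta / (1.0 + exp_eta)


def inv_log(eta):
    """Inverse of the log link: maps a linear predictor to a positive mean."""
    return math.exp(eta)


# Hypothetical linear predictors, e.g. intercept + slope * visit.
for eta in (-2.0, 0.0, 1.5):
    print(eta, round(inv_logit(eta), 4), round(inv_log(eta), 4))
```

A linear predictor of 0 corresponds to a fitted probability of 0.5 under the Bernoulli family and to a fitted mean of 1 under the Poisson family.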
90. ower than the original design (complete design).
• Cost Efficient Designs by Units. Figure 2.21 depicts all designs with the minimum total number of animals used and more power than the original design (complete design).
• Unequal Longitudinal Designs. Table where the full information of the unequal longitudinal designs is shown. See Figure 2.22.
• Reference Power. Power of the complete design used as reference. See Figure 2.23.
• Reference Cost. Cost of the complete design used as reference. See Figure 2.24.
[Figure 2.20: Cost Efficient Designs by Cost (design total cost on the X axis and total power on the Y axis)]
[Figure 2.21: Cost Efficient Designs by Units (total number of units on the X axis and total power on the Y axis)]
[Figure 2.22: Unequal Longitudinal Designs, number of in
91. ple cost function is defined in order to compare the constructed designs:
$\mathrm{cost} = N \times \mathrm{MCS} + K \times \mathrm{COS}$ (2.2)
where $N$ is the total number of subjects, $K$ is the total number of observations per subject, MCS is the marginal cost of including (e.g. buying) an individual in the study, and COS is the cost of measuring and maintaining an individual during $K$ times. Other cost functions can be defined. Using this framework of power analysis, the Helms statistic and the cost function, we can construct longitudinal designs that reach the desired power for fixed-effects hypotheses. These longitudinal designs are called Unequal Longitudinal Designs. They look for a trade-off between power and number of individuals through the cost function. It is important to have prior information (pilot studies, etc.) in order to get more accurate results.
Constructing Unequal Longitudinal Designs
The sequence Create > Unequal Longitudinal Designs gives access to the functionality for generating unequal designs. This function needs as inputs the parameter estimates, and the power and cost of the original design against which the new designs will be compared. As an example of this methodology, we have a mice study. Three treatments were administered to three groups of mice: Treatment A, Treatment B and a Placebo. Five measurements over time were recorded, and the response under study was weight. The main goal of this example is to design the next experiment using the minimum number of mic
92. r model to it. This model can be extended to more predictors.
• To return to the actual values of variable Y (i.e. the restricted values), we use the inverse transformation of 5.2.1: $p = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$
• The estimation of the model parameters (the values of the $\beta$'s) is done by maximizing the likelihood function. Because there is no closed-form solution for these estimates, numerical methods are used instead.
• The model parameters $\beta$, in their exponentiated form $\exp(\beta)$, indicate how the odds of the response change per unit of $x$.
• To determine whether the effects of our factors are significant in the model, we use the likelihood ratio test. The statistic used for this test is $D = -2 \log\left(\frac{L_{max,Reduced}}{L_{max}}\right)$. That is, we construct the hypothesis $H_0: \beta_j = 0$. The test statistic is based on the ratio of $L_{max,Reduced}$, the maximum likelihood without including the factor, to $L_{max}$, the maximum likelihood including all the factors. This statistic is also known as the deviance and is chi-squared distributed with one degree of freedom. We reject $H_0$ when the deviance is too large.
• Another hypothesis test is the Wald test. The Wald test for $H_0: \beta_j = 0$ uses the statistic $Z = \hat{\beta}_j / SE(\hat{\beta}_j)$, or its chi-squared version $Z^2$ with one degree of freedom, where SE represents the standard error of the estimate. One requirement for this test is a large sample size.
Analysis using Logistic Regression Model
As an example, we have a study in which three types of t
93. reatment were analyzed. The response under evaluation was lymphocytic infiltration in mice. In addition to the two treatments under study, a control group and a No Treatment group were included. The difference between these two is that the No Treatment group underwent surgery but without any type of treatment being applied. Measurements were taken at 15, 20, 45 and 60 days after inducing the disease and applying the treatment. The responses are stored as Present (presence of lymphocytic infiltrate) and Absent. The data from this experiment are found in the file infiltration.csv. To access the logistic regression function, follow the route Statistics > Logistic regression. Figure 5.14 shows the main screen of the function, in which we have included the model to be analyzed. The model can be represented mathematically as follows:
$\mathrm{logit}(p) = \beta_0 + \beta_1 \, \mathrm{Time} + \beta_2 \, \mathrm{Treatment} + \beta_3 \, \mathrm{Time} \times \mathrm{Treatment}$ (5.5)
[Figure 5.14: Main Screen, Logistic Regression]
where $p$ is the proportion of Presents. We want to analyze the effect of the treatments, the effect of time, and a possible interaction between these two factors. All samples are independent, so we can treat time as a variable whose levels are not correlated. The type of contra
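To make the likelihood ratio and Wald tests used for logistic regression more concrete, here is a minimal sketch of both p-value calculations for a single coefficient (one degree of freedom). The log-likelihoods, coefficient and standard error in the example are invented for illustration and do not come from the infiltration data; the function names are hypothetical. The chi-squared tail probability uses the identity $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability P(X > x) for a chi-squared variable with 1 df."""
    if x < 0:
        raise ValueError("a chi-squared statistic cannot be negative")
    return math.erfc(math.sqrt(x / 2.0))


def likelihood_ratio_test(loglik_reduced, loglik_full):
    """Deviance -2 log(Lmax_reduced / Lmax) and its 1-df p-value."""
    deviance = -2.0 * (loglik_reduced - loglik_full)
    return deviance, chi2_sf_1df(deviance)


def wald_test(beta_hat, std_error):
    """Wald Z statistic and the p-value of its chi-squared version Z^2."""
    z = beta_hat / std_error
    return z, chi2_sf_1df(z * z)


# Hypothetical maximized log-likelihoods without and with one factor.
deviance, p_lr = likelihood_ratio_test(loglik_reduced=-12.0, loglik_full=-10.0)
# Hypothetical coefficient estimate and standard error.
z, p_wald = wald_test(beta_hat=0.9, std_error=0.5)
print(round(deviance, 3), round(p_lr, 4), round(z, 3), round(p_wald, 4))
```

With small samples the two tests can disagree, which is consistent with the remark that the Wald test requires a large sample size.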
94. rimentation. It is possible to combine some runs from two or more fractional factorial designs to build a new factorial design. Such a design would allow us to get better estimates of the main effects and interaction effects of interest.
Summary of Fractional Factorial Designs
• Consider a situation where three factors, each at two levels, are of interest, but the experimenter cannot perform the $2^3 = 8$ combinations of all levels of the factors. The experimenter only has budget for performing four runs. This suggests using a half fraction of a $2^3$ design, since this design requires only $2^{3-1} = 4$ combinations of factor levels. A half fraction of a $2^3$ design is called a $2^{3-1}$ design. Similarly, we can define a quarter fraction of a $2^5$ factorial design. This design is known as a $2^{5-2}$ design and requires only $2^{5-2} = 8$ runs instead of the 32 runs of the complete design.
• Now, taking the fractional design $2^{3-1}$, how do we construct the design? For this, consider Table 2.2. This table shows the model matrix (see A.1) of a $2^{3-1}$ design. We note that this matrix is equal to the model matrix of a $2^2$ factorial with four runs. To create the $2^{3-1}$ design we simply assign the third factor to the third column of the model matrix, i.e. the interaction between factors A and B. We define the generator of the $2^{3-1}$ design as $C = AB$, which means that the effect of C is confounded with the effect of the interaction AB.
Table 2.2: Model matrix for a $2^{3-1}$ design
The process to generate a
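The construction of the half fraction with generator C = AB can be sketched in a few lines. This is an illustrative sketch only, not the LADES implementation: it builds the full $2^2$ design in factors A and B, coded as -1 and 1, and sets the level of C in each run to the product of the A and B levels. The function name is hypothetical.

```python
from itertools import product


def half_fraction_2_3():
    """Return the four runs (A, B, C) of a 2^(3-1) design with C = AB."""
    runs = []
    # product() varies its last element fastest, so unpack as (b, a)
    # to obtain the standard order in which A changes fastest.
    for b, a in product((-1, 1), repeat=2):
        runs.append((a, b, a * b))
    return runs


design = half_fraction_2_3()
for run in design:
    print(run)

# The three columns are mutually orthogonal: every pairwise dot product is 0.
col_a, col_b, col_c = zip(*design)
print(sum(x * y for x, y in zip(col_a, col_c)),
      sum(x * y for x, y in zip(col_b, col_c)))
```

Because the C column is identical to the AB interaction column, the two effects cannot be separated with these four runs, which is what confounding means here.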
95. s
[Figure 2.8: Interaction Confounding, Create Fractional Factorial]
[Figure 2.9: Main Effect Confounding, Create Fractional Factorial]
[Figure 2.10: Fractional Factorial Design, Create Fractional Factorial]
2.3 Optimal Design of Experiments
Usually, screening designs are two-level factorial designs used for fitting models with main effects and two-factor interactions. The benefit of considering only two levels per factor and a simple model is that the experiment can be performed using fewer runs. With full factorial and fractional factorial designs, we have to choose from a large catalog the design that best fits our goals. But what happens if the available resources for the experiment do not allow us to use a full factorial design or a fractional factorial design? An example of this problem is when we only have resources to carry out seven runs. The designs that we know are based on numbers of runs that are powers of two ($2^k$ combinations, where $k$ is the number of factors). For this type of problem we propose the best design according to our resources, so that we can get the most information from the observations. The area of statistics that works with this problem is calle
96. s. See Figure 2.27.
[Figure 2.27: Reference Cost, Optimal Cost Efficient Longitudinal Designs (value shown: 151.5)]
• Reference Power. This section shows the minimum power used to discriminate the constructed designs. See Figure 2.28.
• Longitudinal Optimal Design. Shows all the constructed designs with the desired power and cost. See Figure 2.29.
• Power vs Cost Graph. This graph gives us a perspective of all the designs having the desired properties. It shows cost on the X axis and power on the Y axis. See Figure 2.30.
• Power vs Total Units. Presents the total number of units on the X axis and power on the Y axis. See Figure 2.31.
We encourage the reader to check the paper Helms 1992aa for more information about the methodology.
[Figure 2.28: Reference Power, Optimal Cost Efficient Longitudinal Designs]
[Figure 2.29: Optimal Cost Efficient Longitudinal Designs: number of individuals per group, total cost and power (columns: Cohort 1, Cohort 2, Cohort 3, Total Units, Total Observations, Cost)]
97. s coefficient is split into several effects, since we chose Treatment as our type of contrast (see A.2). This will be explained later. We also chose (see Figure 4.52) an Exchangeable correlation structure, since it requires fewer parameters to estimate and was suggested by the researchers conducting the study. Finally, a Gaussian family was selected.
Function Outputs
The resulting tables are:
• Hypothesis Testing for Parameters. For each of the predictor variables, the regression coefficient, the standard error of the coefficient and the corresponding p-value are given. The p-value is based on the Wald statistic, which is defined as the square of the ratio between the regression coefficient and its standard error. This statistic follows a $\chi^2$ distribution with 1 degree of freedom. We can observe that Time has a negative impact on cholesterol of 1.216 units per unit of time, but this change is not statistically significant at the 0.05 level. Because of the type of contrast used for this model, the Fat factor was split into two effects, LowFat and MediumFat. In this case our reference level for Fat is the High level. As a res
[Figure 4.51: Main Screen, GEE (variables: Subject, Fat, Fiber, Time, Cholesterol; response: Cholesterol)]
[Figure 4.52: Customization Screen, GEE]
98. same manner we can delete factors using the < button. To edit the names of the factors, as well as their levels, we use the section labeled in green. In Figure 2.2 we see the captured data for our example. Note that we used Treatment as our contrasts (see A.2) because we have a categorical factor with more than two levels. Now we will create another full factorial design, for a study in which we want to analyze the effect a new treatment has on the number of red blood cells (RBC). The performance of the new treatment will be compared against the performance of a placebo control. We want to analyze these effects on two different mouse strains, BALB/c and C57BL. Finally, we want to perform the whole experiment 3 times in order to obtain three replicas. To create this design, we again access the Create Full Factorial function and introduce two factors, named Treatment and Strain, together with the levels for each of them. In this case there were only two levels for both factors (see Figure 2.3).
Function Outputs
• The resulting design for the first example is shown in Figure 2.4. Levels are represented using the numbers 1 to 3. We can export the table using the Export Table button. After exporting the table, we can replace the level codes with their true labels to conduct the
2.1 Full Factorial Design
Figure 2.1: Di
99. splay of the Function Create Full Factorial Designs.
Figure 2.2: Create a full factorial design for the diets experiment.
Figure 2.3: Create a full factorial design for the RBC's experiment (factors: Treatment, Strain).
experiment. We can also randomize the runs of the experiment to meet some assumptions that we will discuss later. If we want to perform more replicas of the experiment, we only have to copy the design several times and randomize. The resulting design for example 2 is shown in Figure 2.5. We used orthogonal contrasts (see Figure 2.3) because we have two factors with two levels each. This will have benefits later, when we analyze the resulting data. We can also export and manipulate the design using other tools such as Excel. We can copy it twice to obtain the three replicas and then randomize the runs to meet the assumptions required by the statistical analysis.
Figure 2.4: Full factorial design for the diets experiment.
Figure 2.5: Full factorial design for the RBC's experiment.
2.2 Fractional F
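The copy-and-randomize step described for the RBC example can also be done outside LADES. The following is a minimal sketch under stated assumptions: the function `full_factorial` is hypothetical, the treatment labels "New" and "Placebo" are illustrative stand-ins for the two treatment levels, and the seed is fixed only so that the run order is reproducible.

```python
import itertools
import random


def full_factorial(factors, replicas=1, seed=None):
    """Return a randomized run list for a replicated full factorial design.

    Hypothetical helper for illustration; not part of LADES.
    """
    names = list(factors)
    # Every combination of factor levels, one tuple per run.
    combos = list(itertools.product(*(factors[name] for name in names)))
    # Copy the base design once per replica.
    runs = [dict(zip(names, combo)) for _ in range(replicas) for combo in combos]
    # Randomize the run order.
    random.Random(seed).shuffle(runs)
    return runs


# Level labels for Treatment are assumed; the strains come from the example.
factors = {"Treatment": ["New", "Placebo"], "Strain": ["BALB/c", "C57BL"]}
design = full_factorial(factors, replicas=3, seed=1)
for run_number, run in enumerate(design, start=1):
    print(run_number, run["Treatment"], run["Strain"])
```

With two factors at two levels each and three replicas, the list contains 12 runs, and each of the four treatment and strain combinations appears three times.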
100. st used is Treatment, since we have discrete levels for this factor. This means that all results are comparisons against the Control group.
Function Outputs
• Hypothesis Testing for Parameters: Figure 5.15 shows the Wald test for each of the parameters. Time has a negative effect of 0.084 in the logit of the response. This effect appears to be significant at a level of 0.05. For the No infection, C, Uu and C Uu treatments, the corresponding coefficients are read as the change that occurs when applying any of these treatments to the control group. For instance, for the C treatment, a mouse in the control group taking this treatment produces a change in the logit of the response of 0.831. We note that no treatment was statistically significant at 0.05. For the interaction between Treatment and Time, the estimated parameter is the change in the growth over time of being in the control group and being administered one treatment. For example, for the interaction Time:Treatment Uu this change is 0.012. None of these interactions is significant.
                   Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)        0.944     1.923       0.491    0.624
Time               0.084     0.07        1.205    0.228
TreatmentNoTrea    0.314     2.565       0.122    0.903
TreatmentTreatm    0.831     2.546       0.327    0.744
TreatmentTreatm    2.277     2.881       0.79     0.429
TreatmentTreatm    2.046     2.975       0.688    0.492
Time:TreatmentN    0.012     0.088       0.136    0.892
Time:TreatmentTr   0.012
101. A Term Definitions
A.1 Model Matrix
If N observations are collected in an experiment and the linear model is
$y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \epsilon_i, \quad i = 1, \dots, N$ (A.1)
where $y_i$ represents the i-th response and $x_{i1}, \dots, x_{ik}$ are the corresponding values of the k predictor variables, these N equations can be written in matrix form as
$y = X\beta + \epsilon$ (A.2)
where $y = (y_1, \dots, y_N)^T$ is the response vector of dimensions $N \times 1$, $\beta = (\beta_0, \beta_1, \dots, \beta_k)^T$ is the regression coefficient vector of dimensions $(k+1) \times 1$, $\epsilon = (\epsilon_1, \dots, \epsilon_N)^T$ is the error vector of dimensions $N \times 1$, and X is the model matrix of dimensions $N \times (k+1)$, given as
$X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{N1} & \cdots & x_{Nk} \end{pmatrix}$ (A.3)
A.2 Contrasts
Treatment
Consider a qualitative factor having four levels. This factor can be encoded using three dummy variables. The following contrasts matrix describes the coding, where the columns represent dummy variables and the rows represent levels:
$\begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ (A.4)
In this encoding, level one (first row) is considered the reference level, against which the other levels will be compared. This is similar to the implementation of a control group. If a control group exists, it is recommended to assign it to level one. As a result, the parameter for the dummy variables (regression coefficient; columns of matrix A.4) represents the difference between certain level and the ref
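The treatment coding described in A.2 is easy to generate for any number of levels. The sketch below is an illustration only; the function name `treatment_contrasts` is hypothetical and is not a LADES function. It builds the dummy coding in which level one is the reference level, so its row contains only zeros.

```python
def treatment_contrasts(n_levels):
    """Dummy coding with level one (first row) as the reference level.

    Rows are factor levels, columns are dummy variables.
    Hypothetical helper for illustration; not part of LADES.
    """
    return [
        [1 if row == col + 1 else 0 for col in range(n_levels - 1)]
        for row in range(n_levels)
    ]


# Four-level factor: three dummy variables, level 1 is the reference.
for level, row in enumerate(treatment_contrasts(4), start=1):
    print("level", level, row)
```

Each dummy column switches on for exactly one non-reference level, which is why its regression coefficient is read as the difference between that level and the reference level.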
102. their corresponding sub factors level. Finally, the contrast selection and the model matrix depend on the questions we want to answer.
• For a design whose factors have more than two levels, it is difficult to calculate and interpret their effects when they are categorical. The simplest way to calculate the effects of the factors is through the least squares method using the model matrix. This model matrix is calculated from the proposed design. For the case of $2^k$ designs this matrix is always orthogonal, but in cases where we have more than two levels in each factor the matrix becomes more complicated, as does the interpretation of the results. The least squares method gives unbiased estimates for each of the factor effects when our design is orthogonal.
• The experiment can be replicated, i.e. all runs of a design are performed more than once. A design that is not replicated is said to have only one replica. Having more replicas of our experiment gives us more information and thus allows a better statistical analysis.
Constructing a Full Factorial Design
As an example, we will create the design mentioned at the beginning of this section using LADES. To access the function for creating full factorial designs we follow this path: Create > Full Factorial Design. The main screen of this function is shown in Figure 2.1. We can add more factors to the design using the > button in the upper right corner of the screen. In the
103. tion  Number of Runs  Number of Factors  2^k Design  Generator                              Resolution
1     4               3                  2^(3-1)     …                                      …
2     8               4                  2^(4-1)     4 = 123                                IV
3     8               5                  2^(5-2)     4 = 12, 5 = 13                         III
4     16              5                  2^(5-1)     5 = 1234                               V
5     16              6                  2^(6-2)     5 = 123, 6 = 234                       IV
6     16              7                  2^(7-3)     5 = 123, 6 = 234, 7 = 134              IV
7     32              6                  2^(6-1)     6 = 1234                               IV
8     32              7                  2^(7-2)     6 = 1234, 7 = 1245                     IV
9     32              8                  2^(8-3)     6 = 123, 7 = 124, 8 = 2345             IV
10    32              9                  2^(9-4)     6 = 2345, 7 = 1345, 8 = 1245, 9 = 1235 IV
Figure 2.6: Main screen, Create Fractional Factorial.
For example, if we want to create the $2^{5-2}$ design for analyzing 5 factors, we only click on option 3 and then Create.
Function Outputs
• Generators: Displays the generators that were used to create the design. See Figure 2.7.
Figure 2.7: Generators, Create Fractional Factorial (generators shown: D = AB, E = AC).
• Interactions: Figure 2.8 displays the effect with which each two-factor interaction is confounded. For our example only two interactions are confounded.
• Main: In this screen we can see the confounding for main effects. We note our main factors are confounded with two-factor interactions. Moreover, one of them is confounded with more than one interaction. See Figure 2.9.
• Fractional Factorial Design: Figure 2.10 shows the generated fractional factorial design. This design can be exported to a csv file by using the Export Table button.
2.2 Fractional Factorial Designs with 2 levels
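To see what the generators of option 3 mean in practice, the 8-run design for 5 factors can be built by hand: A, B and C form a full two-level factorial, and the remaining columns are products of existing ones. This is a minimal sketch, assuming the generators D = AB and E = AC shown in Figure 2.7 and a -1/+1 level coding; the function name is hypothetical and the code is not taken from LADES.

```python
import itertools


def fractional_factorial_2_5_2():
    """Build the 8-run 2^(5-2) design with generators D = AB and E = AC.

    Levels are coded as -1 and +1. Hypothetical helper, not part of LADES.
    """
    design = []
    # Full 2^3 factorial in the basic factors A, B and C.
    for a, b, c in itertools.product((-1, 1), repeat=3):
        # Generated factors are products of basic factor columns.
        design.append({"A": a, "B": b, "C": c, "D": a * b, "E": a * c})
    return design


design = fractional_factorial_2_5_2()
for run in design:
    print(run)
```

Because the D column is identical to the product of the A and B columns, the main effect of D cannot be separated from the AB interaction, which is the confounding reported in the Main and Interactions outputs.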
104. tiplied by 2 are the resulting effects for each factor (main and interactions).
• We can use the analysis of variance to assess which effects are statistically significant. Since the ANOVA is based on the F test and the degrees of freedom of the residuals, we would prefer a replicated design in order to have efficient results. When such a replicated design is not possible, we can use the Half Normal plot. The Half Normal plot is a graph where the absolute values of the effect estimates are plotted against their normal cumulative probabilities. We may also use the t test for regression coefficients to decide which coefficients are statistically significant in the model, but this test is equivalent to ANOVA.
• There are three principles in the design of experiments that we must take into account at the time of our analysis. Principle of Hierarchy: (i) it is more likely that the effects of low order (e.g. main) are more important than higher order effects (e.g. interactions, quadratic effects, etc.); (ii) the effects of the same order are equally important. Sparsity Principle: the number of important factors in a factorial experiment is small. Principle of Heredity: for an interaction to be significant, at least one of the factors involved should be significant.
To verify that our model is correct, i.e. that the factors involved in our model are really significant for our response, we have to check some assumptions. It is assumed that the residuals of our mo
105. to give more precise parameter estimates. The method of least squares is used to estimate the parameters $\beta_0, \dots, \beta_k$. The method of least squares finds the values (estimates) of $\beta_0, \dots, \beta_k$ such that the error, i.e. the sum of squared differences between the responses y and the predicted values of model (5.1), is minimized. If the conditions of paragraph 1 are met, then the estimators are unbiased and have minimum variance. The Analysis of Variance is used as one way to test hypotheses about the significance of the regressors. Basically, a statistical test is built using the ratio of the contribution made by the regressor $x_j$ through $\beta_j$ and an unbiased estimate of the variance of the residuals. This test statistic is compared with the F distribution.
Analysis using Linear Regression
To access this function we simply follow the route Statistics > Linear Regression. Note that we have previously loaded the data using Import (see Figure 5.1). This data is available in galapagos.csv. We add the variables that make up the model we want to fit, as our first approximation, model (5.1). Figure 5.2 shows the LADES screen where we have to input the corresponding variables. To fit the proposed model we click on Set. After fitting the model, the results appear on screen, as shown in Figure 5.3. The next section presents the interpretation and results of this function.
Function Outputs
The tables shown as results are:
Hypothesis Testing for Parameters
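The least squares idea can be illustrated with the simplest case of a single regressor, where the estimates have a closed form. The sketch below is not the LADES implementation and does not use the galapagos.csv data: the function name and the small data set are made up for illustration, and the multiple regression fitted in the manual generalizes the same criterion to several regressors.

```python
def least_squares_line(x, y):
    """Closed-form least squares estimates for y = b0 + b1 * x.

    Hypothetical helper for illustration; not part of LADES.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data, used only to show the mechanics.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = least_squares_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(b0, b1, sum(r * r for r in residuals))
```

The printed sum of squared residuals is the quantity that the estimates minimize; any other choice of intercept and slope gives a larger value for the same data.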
106. trast that was used for Treatment, one effect is defined: Treatment B. In our case our profile is Treatment A, as mentioned before. Then the coefficient of Treatment B refers to the added value of Treatment B over Treatment A, i.e. the impact Treatment B has on a patient taking Treatment A. This coefficient is not significant for the logit of the response. The Treatment:Visit interaction is the added value in the slope when administering Treatment B to a patient taking Treatment A. This interaction produces a change of 0.054 in the logit of the response per unit time. We have to mention that our results depend upon the sample size.
• Parameter estimates by Subject: Figure 4.47 shows all the estimated coefficients and best linear unbiased predictions for the fixed and random effects, respectively.
Figure 4.43: Likelihood Criteria GLMM.
Figure 4.44: Adjusted Values and Residuals GLMM (columns: Patient, Separation, Treatment, Visit, Fitted Values, Residuals).
Figure 4.45: Random Effects Estimates by Subject GLMM.
Figure 4.46: Fixed Effects Estimates GLMM (columns: Estimate, Std. Error, z value, Pr(>|z|)).
Figure 4.47 P
107. ues, we can always encode the two cases as 0 and 1. For example, not present = 0 and present = 1 for a certain type of disease in an individual. Then the probability p that the response is equal to 1, i.e. the probability that a patient has the disease, is of interest. The distribution for binary variables is the Bernoulli distribution. This distribution has the following properties:
• Mean = p
• Variance = p(1 − p)
We can clearly see that the variance depends upon p. Since the response of interest can only take two values, 0 or 1, if we want to model the probability that y is equal to 1 for a single predictor we can write the following model: $p = E(Y \mid z) = \beta_0 + \beta_1 z$, and add the error of the model. Two problems arise with this type of modeling:
1. The values predicted by our model for the response Y may be greater than 1 or less than 0, since the linear expression for these predictions is not restricted.
2. One of the assumptions of linear regression analysis is that the variance of Y is constant throughout the values of the predictor z. We have shown that this is not the case.
Summary of logistic regression model
• A new response is constructed to transform the current response as follows:
$\mathrm{logit}(p) = \ln\left(\frac{p}{1-p}\right) = \eta$ (5.4)
where $\eta$ is the value predicted by the model, e.g. $\eta = \beta_0 + \beta_1 z$. This function of the probability p is called the logit. This function no longer has restrictions and we can work freely on it and fit a linea
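The effect of the logit transformation can be seen numerically: any value of the linear predictor, however large or small, maps back to a probability strictly between 0 and 1. The following sketch is only an illustration; the function names `logit` and `inverse_logit` are our own and the values of the predictor are arbitrary.

```python
import math


def logit(p):
    """Map a probability in (0, 1) to the whole real line."""
    return math.log(p / (1.0 - p))


def inverse_logit(eta):
    """Map any value of the linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Arbitrary values of the linear predictor, for illustration only.
for eta in (-5.0, 0.0, 2.5):
    p = inverse_logit(eta)
    print(eta, round(p, 4), round(logit(p), 4))
```

This is why a linear model can be fitted on the logit scale without producing predicted probabilities below 0 or above 1, which addresses the first of the two problems listed above.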
108. Figure 4.22: Customization Screen, Linear Mixed Models (options shown include Fixed, Power, Exponential and Ident; Exchangeable, General, AR(1), Continuous and Independent correlation structures; outliers detection 0.01; Treatment and Helmert contrasts).
Function Outputs
• 95% Confidence Intervals for Fixed Parameter Estimates: The figure shows 95% confidence intervals and estimates for the parameters in model 4.2.4. In addition, confidence intervals for the correlation and variance structure parameters are shown. A first check of whether a factor is significant is that its confidence interval does not include zero.
• 95% Confidence Intervals for Mixed Effects Parameter Estimates: Figure 4.24 shows 95% confidence intervals and estimates for each mixed effect parameter included in the model.
• 95% Confidence Intervals for Variance Error Estimate: Figure 4.25 shows the 95% confidence interval and estimate for the variance of the error.
• Likelihood Criteria: In this section the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) (see A.5) are displayed. In addition, the log-likelihood is presented. See Figure 4.26.
• Residuals Statistics: In this section (see Figure 4.27) some statistics about the residuals of the model are shown. We note the median is close to 0 and quartile 1 is close to the minimum value; the same holds for quartile 3 and the maximum. As a resu
109. ult, the coefficient for LowFat refers to the change that occurs when the effect of a Low Fat diet is added to a diet high in fats. This coefficient turns out to be significant and its estimated value is 23.06 units of cholesterol; that is, the impact of a low fat diet on a high fat diet is negative. In contrast, for the MediumFat coefficient the change that occurs is about 5.837 and it is not statistically significant. This indicates that there is no difference if you decide to change from a high fat diet to a medium fat diet. For Fiber, the reference level is High. Then the coefficient for the LowFiber factor refers to the effect produced by being on a diet with high fiber content and adding a diet with low fiber content. This change is of 11.315 units in cholesterol, but this is not significant. Therefore there is no difference if we choose a High Fiber diet or a Low Fiber diet. The interaction factor Time:LowFat refers to the change in the growth over time produced by adding a low fat diet to a high fat diet. This interaction is not significant. The rest of the interactions are not significant.
                 Estimate  Std.err  Wald      Pr(>|W|)
(Intercept)      215.988   6.039    1279.147  0
Time             1.216     1.138    1.141     0.285
FatLow           23.066    7.143    10.429    0.001
FatMedium        5.837     6.361    0.842     0.359
FiberLow         11.315    5.577    4.117     0.042
Time:FatLow      13.043    1.277    104.397   0
Time:FatMedium   5.99      1.43     17.535    0
Time:FiberLow    0.551     1.057
110. Figure 4.16: Random Slope Example (axes: Response vs Time).
Figure 4.17: Random Intercept and Slope Example (axes: Response vs Time).
working with linear mixed models is that $\mathrm{Var}(\epsilon_i) = \sigma^2 I$, i.e. the variance for each of the residual vectors has to be the same. Another assumption of the model is that the random effects b and the residuals have to be independent. Furthermore, it is also assumed that the random effects follow a multivariate normal distribution with mean 0 and covariance matrix $\Sigma$. The maximum likelihood method is used to estimate the parameters $\beta$, $\sigma^2$ and $\Sigma$. Another method of parameter estimation is the restricted maximum likelihood (REML) method. This method is the most used, since maximum likelihood tends to give biased estimates of $\sigma^2$ and $\Sigma$. We have to keep in mind that the estimate for the random coefficients is the covariance matrix $\Sigma$, because we are assuming that the random effects are normally distributed. Therefore, in order to give point estimates for the random coefficients of each individual, we can sample from a Normal distribution with mean 0 and covariance matrix $\Sigma$. These are known as best linear unbiased predictions for random coefficients. In order to test hypotheses about the significance of the fixed effects we use a conditional t test. This test is conditioned on the estimates of the random effects using the REML method. There is another test for comparing models based on th
111. us data collected for the sample size calculation of this study are shown in Table 5.1.
Table 5.1: Previous data needed to compute sample size for a continuous response. Entries: average standard deviation in the first and second measurement (13.6); time measures correlation (0.867); number of measures over time, significance level, power, significant difference: 4 23.
Suppose we have the following model for the Control group:
$Y_{ij} = \beta_{0C} + \beta_{1C} t_{ij} + \epsilon_{ij}$ (5.6)
where $Y_{ij}$ are the observations of patient i at time j, $\beta_{0C}$ represents the intercept of the control group and $\beta_{1C}$ represents the average rate of change over time. The same model can be constructed for the Treatment group. The sample size is based on detecting a significant difference between the average change of the two groups over time, i.e. between $\beta_{1C}$ and $\beta_{1T}$.
Summary of Sample Size Estimation
• The null hypothesis on which sample size calculations are based is $H_0: \beta_{1C} = \beta_{1T}$. That is, the average change over time for the two groups is the same. The alternative hypothesis can be of two types: $H_a: \beta_{1C} \neq \beta_{1T}$ (two-sided), or $H_a: \beta_{1C} > \beta_{1T}$ or $H_a: \beta_{1C} < \beta_{1T}$ (one-sided). The required prior information to make the calculations is:
a) Type I Error: This parameter represents the probability that the null hypothesis is rejected when it is true; for example, it would correspond to the probability of concluding there is a difference between the treatment and control groups when there is not. Usually equal to 0.0
112. viduals. Men and women were studied at ages 8, 10, 12 and 14 years. To avoid problems with the estimates and the interpretation of the results, it was decided to code the age variable as centered around 11 years. The analysis is therefore aimed at evaluating the growth of this distance around the age of 11 years. The file containing the results of the study is orthodoncy.csv. They want to fit the following model:
$\mathrm{distance}_{ij} = \beta_0 + \beta_1 \mathrm{Sex}_i + \beta_2 (\mathrm{Age}_{ij} - 11) + \beta_3 (\mathrm{Age}_{ij} - 11)\,\mathrm{Sex}_i + \epsilon_{ij}$ (4.11)
where i = 1, …, 27 individuals and measured points j = 1, …, 4 (centered ages −3, −1, 1, 3). To access the generalized least squares function, just follow Analyze > Longitudinal Analysis > GLS. The main screen of this function is shown in Figure 4.58. In Figure 4.58 we can see the model 4.2.4 implemented in LADES. It was decided to use an Exchangeable correlation structure, i.e. the correlation between observations over time for each of the individuals is the same (see A.4). We select this in the Options section (see Figure 4.59). Moreover, we used Helmert contrasts (see A.2), since the age variable contains negative and positive values. Such contrasts encode sex as male −1 and female 1, so that our model matrix is orthogonal. We will model this phenomenon using the Ident Variance Structure using Sex (see A.3) and a General correlation structure. After setting this new model, we proceed to the Function Outputs.
Function Outputs
113. $\mu_{y|u} = X\beta + Zu$, where u is the vector of random effect coefficients, $\beta$ is the fixed effects vector, X is the model matrix for fixed effects and Z is the model matrix for random effects. Other typical distributions for the conditional random variable $y \mid u$ are:
1. The Bernoulli distribution for binary data (0 and 1), whose density function is $p(y \mid \mu) = \mu^{y} (1-\mu)^{1-y}$, $0 < \mu < 1$, $y = 0, 1$.
2. Many independent binary response variables can be represented as a binomial response only if the distributions are Bernoulli and have the same mean.
3. The Poisson distribution for count data (0, 1, …), which has the density function $p(y \mid \mu) = e^{-\mu} \frac{\mu^{y}}{y!}$, $0 < \mu$, $y = 0, 1, 2, \dots$
All these distributions are completely specified by the conditional mean. When the conditional distributions $y \mid u$ are constrained by $\mu$ such that $0 < \mu < 1$ (Bernoulli) or $0 < \mu$ (Poisson), we cannot define the conditional mean $\mu_{y|u}$ as equal to a linear predictor $X\beta + Zu$, since this linear predictor has no restrictions on its resulting values. An invertible univariate function g, called the link function, is selected such that $\eta = g(\mu)$. It is required that g is invertible, so that $\mu = g^{-1}(\eta)$ is defined for $-\infty < \eta < \infty$ and takes values in its proper range ($0 < \mu < 1$ for Bernoulli or $0 < \mu$ for Poisson). The link function for the Bernoulli distribution is the logit function, $\eta = g(\mu) = \log\left(\frac{\mu}{1-\mu}\right)$. The link function for the Poisson distribution is the log function, η = g(μ) = l
114. Figure 4.50: colesterol.csv dataset (columns: Subject, Fat, Fiber, Time, Cholesterol; four time points per subject).
Analyze > Longitudinal Analysis > GEE
In Figure 4.50 we observe the cholesterol.csv data already loaded into LADES. Figure 4.51 shows the screen where we have to input the model specifications. Fields such as Response, Time and Subject are mandatory. Fields such as Factors and Interactions define a more elaborate model. Figure 4.52 displays the customization section, used to specify other parameters such as Family (distribution of the response variable), Correlation Structure (correlation between observations over time) and Contrasts, which have an impact on the interpretation of the estimated coefficients. For our problem we opted to fit the model
$Y = \beta_0 + \beta_1 t + \beta_2^{*}\,\mathrm{Fat} + \beta_3^{*}\,\mathrm{Fiber} + \beta_4^{*}\,\mathrm{Fat} \cdot \mathrm{Time} + \beta_5^{*}\,\mathrm{Fiber} \cdot \mathrm{Time} + \mathrm{CORR}$ (4.8)
where $\beta^{*}$ means thi
115. Figure 5.7: Fitted Values and Residuals.
5.1 Linear Regression
Figure 5.8: Adjusted R-square, Multiple R-square and Residual Standard Error, Linear Regression (Residual Standard Error: 28.5704 with 21 degrees of freedom).
Area, Elevation and Adjacent are significant at a level of 0.05. The variable Area was not significant in our first hypothesis test using the t test. However, ANOVA is considered the most reliable test to determine the significance of factors. Therefore, we have three significant factors that affect the number of species.
             Df  Sum Sq       Mean Sq      F value   Pr(>F)
ES           1   357498.8301  357498.8301  437.9662  0
Area         1   133.2967     133.2967     0.1633    0.6902
Anear        1   1153.0189    1153.0189    1.4125    0.2479
Dist         1   135.4811     135.4811     0.166     0.6878
DistSC       1   18.67        18.67        0.0229    0.8812
Elevation    1   1063.6371    1063.6371    1.303     0.2665
ENG ES Area  1   57.9397      57.9397      0.071     0.7925
Residuals    21  17141.6782   816.2704
Figure 5.9: ANOVA, Linear Regression.
• Residuals Histogram: Shows the histogram of the residuals. It helps us validate the assumption of normality (i) (Figure 5.10). The graph should preferably reflect a standard normal distribution, a bell shape centered at zero. For our data this assumption seems to be fulfilled.
• Normal QQ Plot: This plot shows the empirica
