EAGLES User Manual
Contents
1. Figure 3.2: Screen capture of the COASTER tool's information entry page. COASTER is available at the following address: http://www.coasterdata.net. EAGLES User Manual, February 2011. 4.0 Data Integration. Before analysis in the RSPF tool can begin, the desired covariate and response layers must be projected and/or transformed into a common projection so that all raster cells are referenced within the same coordinate plane and are thus properly aligned for data extraction. We also recommend that all datasets be resampled to the same pixel size (spatial resolution) before sampling takes place. Furthermore, the spatial extent of covariates must overlap the region of interest, as modeled output from EAGLES can only be created for areas possessing data for all covariates. 4.1 Sampling approach and the distribution of response points. Sampling species responses to create a dataset capable of adequately addressing current and future research questions is a major undertaking that is exacerbated by the high cost and limited resources allocated for data collection. A tremendous amount of effort has gone into the study of sampling designs (for more information see Miller et al. 2007, pg. 228, col. 2, bottom). Specific challenges noted in the geographic literature include: 1) Selecting the variable(s) to be collected that capture the necessary information using
2. Figure 9.5a: Data displays following the first RSPF R script. The order-of-fit drop-down menu (upper left) allows the appropriate fit for each variable to be selected. The Swap tool drop-down menu allows any variable to be swapped for an alternative dataset, thereby allowing what-if scenarios to be tested (see Section 9.7). Clicking the white buttons at the bottom of the window will display the diagnostic graphs for each variable. The link function drop-down menu allows users to select the appropriate link function for their analysis. Data exploration was conducted in the statistical programming environment R through the ArcGIS shell. A subset of the materials generated in the data exploration is included below. We examined histograms for each covariate as it occurred in three different cases: cases of Pronghorn Use, cases in the designated Pronghorn Available space, and cases from the entire spatial domain (see Figure 9.5b below). The random universe cases are 10,000 points distributed uniformly over the entire region of interest. For many covariates these distributions are similar, but for covariates where the distributions are quite different (for example, herbaceous cover, shown below), there is some evidence that Pronghorn selection may depend on that covariate. [Figure panel: wolf.tif, Histogram of Values for U
3. 1. All coefficient estimates and covariate significances are based on all other covariates being in the model. Thus, while in this case we note that the highest individual coefficient significance is attributed to the elevation covariate in the reduced model, that covariate's presence may not actually contribute greatly to the model at large. In order to determine whether a covariate contributes substantially to a model's fit, we recommend fitting models with and without the covariate of interest and comparing those models' AIC scores, as outlined below. 2. We recommend that the user examine each coefficient's sign and determine whether the sign of the coefficient makes sense. For example, here we see a negative sign on the coefficient for elevation, and it makes sense that as elevation increases, use by pronghorn should probably decrease, so we are satisfied with that value. If coefficients' signs are not what is expected, consider fitting a model without that covariate and comparing model performance via AIC to see if inclusion of the covariate is appropriate. 3. We remind the user that all first-order quantitative covariates were standardized prior to fitting; thus coefficient magnitudes are in terms of standard deviations above or below that covariate's mean value. 4. We suggest considering models that exclude insignificant predictors (e.g., distance to road in our reduced model). Howe
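The standardization mentioned in point 3 can be illustrated with a minimal sketch. The elevation values below are hypothetical, and the use of the sample standard deviation is an assumption; the manual does not say which form the R scripts use.

```python
from statistics import mean, stdev

def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # assumption: sample (n - 1) standard deviation
    return [(v - m) / s for v in values]

# Hypothetical elevations (meters) extracted at five response points.
elevation_m = [1800.0, 2000.0, 2200.0, 2400.0, 2600.0]
elevation_z = standardize(elevation_m)
```

After this transformation a fitted coefficient describes the change in the linear predictor per one standard deviation of the covariate, which is why coefficient magnitudes can be compared across covariates measured in different units.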
4. Appendix 3: RSPF Flow of Control Overview. The RSPF tool merges the spatial analytical capabilities of ArcGIS with the statistical functionality available in the R software package. The key components of the RSPF tool are an ArcGIS Graphical User Interface (GUI) written in Microsoft Visual Basic 6.5 and two R scripts that are called by ArcGIS and execute in a DOS shell. The general flow of control within the RSPF tool is illustrated in the following steps. 1. Initialization. After adding the necessary data layers (i.e., raster covariate layers and point response data) to ArcMap, the user starts the RSPF tool by clicking the RSPF button in the EAGLE Tools toolbox. Once open, the RSPF user interface shows three tabs, each containing drop-down boxes that users populate with appropriate map layers. These drop-down boxes allow users to specify parameters including (a) Use and Availability layers, (b) the model region of interest (ROI), (c) the model spatial Resolution, and (d) model covariates. To run the tool, a selection must be made in each of these boxes, and an output folder must also be designated. If Use and/or Availability points fall outside the boundary of the selected ROI, the "Clip Use, Availability, and Universe layers to this ROI" checkbox should be checked. This limits model building to only those points that overlap the ROI. In addition to the option to provide an existing Availability layer, there are options for randomizing Availa
5. [Screenshot: ArcToolbox tool list and the ArcMap Tools menu] 2. This will open a Customize window. Select the Commands tab and then UIControls. [Screenshot: Customize window, Commands tab] 3. Click the New UIControl button. In the New UIControl window that appears, select UIButtonControl and then the Create button. [Screenshot: New UIControl window listing UIButtonControl, UIEditBoxControl, UIToolControl, and UIComboBoxControl] The following window should appear. Notice the new button listed in the Commands pane (Normal.UIButtonControl1). [Screenshot: Customize window, Commands tab, showing the new button]
6. [Screenshot: Visual Basic Editor Properties pane for ThisDocument] 7. Under the File menu item, select Import File and navigate to the directory where you unzipped the file. Repeat these steps three times and open each of the forms: frmCovariates, frmDataType, and frmRSPF. [Screenshot: Import File dialog listing frmCovariates, frmDataType, and frmRSPF] 8. Under the File menu item, select Save Normal.mxt. You have now created and saved the new button to the master ArcMap project. Close the Microsoft Visual Basic window. Back in ArcMap, your new tool should appear as a button and be ready to launch. To uninstall the RSPF tool: 1. Under the Tools menu item, select Customize. Customize is also available in the list that appears when right-clicking on a gray area in one of the menu bars. [Screenshot: ArcMap Tools menu and ArcToolbox tool list]
7. [Figure: univariate RSPF fits for SAGE.TIF under the probit, loglog, and exp link functions, each annotated with AIC and AUC values; tabs: Link Function, Histogram, Semivariogram] Figure 9.9a: The Swap tool applied to the variable percent sage cover (see green oval). The RSPF tool defaults to applying the RSPF model to the same variable upon which it was built, but the Swap tool allows users to direct ArcGIS to apply the model to an alternative version of that variable (see red oval). [Map legend: Pronghorn Observations; RSPF Fit; GYA Roads] Figure 9.9b: A portion of the original RSPF model output indicating the resource selection function for pronghorn in Yellowstone National Park (left). The Swap tool was used to apply the RSPF model to an alternative distance-to-road layer created using a hypothetical road addition (shown in orange). The new prognostic RSPF model output for pronghorn (right) indicates that pronghorn are excluded from portions of their original selected habitats. [Map legend: Pronghorn Observations; RSPF Fit; Hypothetical Road Addition; GYA Roads] 10.0 Acknowledgements, Literature and Programs
8. Cited, Further Readings, and Citation Information. 10.1 Acknowledgements. Yellowstone Ecological Research Center thanks the following scientists for their generous contributions to the EAGLES System. The team responsible for creating EAGLES includes: Robert Crabtree and Jennifer Sheldon (YERC): Project design; Subhash Lele (University of Alberta): Statistical consultation; Alan Swanson (YERC, University of Montana): Project design and R programming, Phase 1; John Shupe (NASA Ames): ArcGIS interface programming (VB), Phase 1; Brandt Winkelman (YERC): Project testing, Phase 1; Daniel Weiss (YERC): Project design, management, and testing, Phase 2, COASTER; Gordon Reese (Colorado State University): ArcGIS interface programming (VB), Phase 2; Kezia Manlove (YERC): R programming, Phase 2; Aaron Doern (YERC): GeoSpatial Data Wiki project design and implementation; Katie Gibson: COASTER web programming. Additionally, we greatly appreciate the support and ongoing feedback provided by our early adoption group, especially Greg Watson, Phillip Martin, Jennifer Jenkins, Sean Finn, Sharon Baruch-Mordo, Scott Bergen, Matt Holloran, James Broska, Pat Heglund, and Kurt Johnson. Thanks also to the attendees of the May 2010 RRSC workshop, including Mark Bertram, Donna Brewer, Stephen DeStefano, Rex Johnson, James Forester, Jonna Katajisto, Jonah Keim, Paul Moorcroft, Doug Ouren, Lori Pruitt, Tara Wertz, Ken Wilson, and Joe Witt, for their fe
9. [Screenshot: ArcMap Tools menu showing Customize and Visual Basic Editor (Alt+F11)] 4. Remove each of the three forms (frmCovariates, frmDataType, and frmRSPF) by selecting File > Remove frmCovariates, etc. [Screenshot: Microsoft Visual Basic File menu for Normal.mxt] 5. Finally, delete the code in ThisDocument, located under the ArcMap Objects folder. [Screenshot: Visual Basic Editor project tree showing ThisDocument under ArcMap Objects]
10. Warmest Quarter (BioClim, 1 km, Quarterly, 1980 to 1997); Average Temperature of Wettest Quarter (BioClim, 1 km, Quarterly, 1980 to 1997); Mean Diurnal Range; Annual Temperature Range (BioClim, 1 km, Yearly, 1980 to 1997); Maximum Temperature of Warmest Month (BioClim, 1 km, Monthly, 1980 to 1997); Minimum Temperature of Coldest Month (BioClim, 1 km, Monthly, 1980 to 1997); Precipitation of Coldest Quarter (BioClim, 1 km, Quarterly, 1980 to 1997); Precipitation of Driest Month (BioClim, 1 km, Monthly, 1980 to 1997); Precipitation of Driest Quarter (BioClim, 1 km, Quarterly, 1980 to 1997); Precipitation Seasonality (BioClim, 1 km, Seasonal, 1980 to 1997); Precipitation of Warmest Quarter (BioClim, 1 km, Quarterly, 1980 to 1997); Precipitation of Wettest Month (BioClim, 1 km, Monthly, 1980 to 1997); Precipitation of Wettest Quarter (BioClim, 1 km, Quarterly, 1980 to 1997); Temperature Seasonality (BioClim, 1 km, Seasonal, 1980 to 1997); National Land Cover Database 1992 (NLCD 1992; Landsat, 30 m, Single year, 1992); National Land Cover Database 2001 (NLCD 2001; Landsat, 30 m, Single year, 2001); Digital Elevation Model (DEMs: slope, aspect, elevation, etc.; 30 m, static, unknown); Digital Raster Graphics (DRGs: distance metrics, road density, stream density; n/a, static, unknown); Globally Downscaled Climate Projections (Future Climate Grids; 1961 to 1990, 2041 to 2060, 2081 to 2100). Appendix 2: Specific R Functions Used for Each Model. glm, package nlme; ROC curve, package PresenceAbsence
11. decision making. [Figure: EAGLES General Work Flow. Stages: Management Decision Question; Data Input and Integration; Analysis and Modeling; Interpretation & Decision Making. Inputs: expert opinion, natural history, scientific literature; species response data (surveys, field plots, telemetry); environmental covariates (existing habitat maps, DEM, DRGs, NLCD, soil type; remote sensing as needed). Analysis: link response and covariates in a Merged Data Array; summarize, visualize, exploratory analysis; apply diagnostic models; apply prognostic models. Outcomes: status report, actionable outcomes, site-level management, research needs, back to practitioners (biologists, managers, conservationists). Ultimate validation is with future monitoring surveys; the process repeats within an annual adaptive cycle.] Figure 1.3: The EAGLES workflow schematic diagram. EAGLES is a workflow architecture that includes both tools (software based) and workflow to allow modeling of species legacy data sets to address management and conservation decision making. It is flexible and
12. evaluate multiple development plan proposals can use this system to compare alternatives and scenarios, including changes in land use practices, and explore their implications using hypothetical what-if scenarios. For example, a manager could use this set of tools to investigate how coyotes currently use a portion of landscape and how that use pattern might change when the landscape is altered (e.g., through fire, flood, or development). These tools are particularly relevant for legacy data on species of concern. To reach the widest possible audience, an ArcGIS environment was selected as the platform for these tools. [Figure: species response data matched with fixed (static) predictors (slope, aspect, elevation; existing habitat cover) and dynamic (time-varying) predictors (climate parameters; mapped disturbance types; custom remote sensing of water, forage, biomass, etc.)] Figure 1.1: Schematic representation of the process of matching (1) fixed and (2) temporally dynamic geospatial covariates with spatio-temporal response data from legacy data sets to create a merged data array (MDA) for analysis and modeling in EAGLES. Wildlife Biologist / GIS Analyst: Collect species response data in the field (dependent variable). Identify environmental variables hypothesized or observed to affect species (covariates). Think critically about the influence of SCALE with resp
13. for each of the covariates selected on the Covariate Data tab in the first GUI. This resulting information is displayed in a third GUI, where each covariate is contained within a unique tab. On each tab, users can select the fit order (first, second, or third) that best fits the data, or can choose to exclude the variable from further analysis. Several informative graphs (stored in the CovariateGraphs folder) are also provided on each tab to help the user determine whether variables are appropriate for the analysis or not. Another option on each tab (i.e., the Swap Tool) allows users to apply the model to a different layer. Each of the Swap Tool drop-down boxes defaults to the layer on which the model was built. At this stage, users must also select a link function (i.e., logit, exponential, or loglog) for the model. Unlike the fit order, a single link function is applied to all variables in the model. Once all variables are selected, the user clicks submit again, causing ArcGIS to create a second parameter file and then call the second R script. 5. Produce the Final Model Results. The second R script creates several files in the Results folder, including text files for the model equation, betas, and fit summary, as well as the rspf ROC semivariogram jpg and rspf rspf resids jpg visual diagnostic plots. Finally, the MapAlgebra functionality native to the ArcGIS software package is used with the rspf equation txt file to create a response surface. T
14. or continuous. At this point, ArcGIS generates a set of random universe points sampled uniformly over the entire study domain. These points display in ArcGIS and are used to generate the stacked histograms in R (see Section 9.5). Once extraction of the random universe points is complete, the merged data array is constructed and passed to R (see Figure 9.4f). This script may take several minutes to run, depending on the desired number of points and covariate layers. Figure 9.4e: The Data Type selection window, in which the users select the appropriate data type for each covariate. Figure 9.4f: Visualization of merged data array generation through layer stacking. While the first R script is running, a box will appear on top of the ArcGIS environment displaying the R output (see Figure 9.4g). [Screenshot: console window for C:\Program Files (x86)\R\R-2.10.0\bin\Rscript.exe showing "Loading required package" messages for tools, Matching, rgenoud, and MASS] Figure 9.4g: Appearance of ArcGIS when the first R script is running. Note that large amounts of processing are occurring in R while this window displays, and a full record of those processes is stored in the file rspf log scriptl txt. 9.5 Data Exploration: Boxplots and Pairsplots for C
15. response data sample this area sufficiently to justify inference over the entire domain? 5. What is the spatial resolution of the analysis, how was this resolution arrived at (i.e., were covariates scaled up or down), and what ramifications will this resolution have on the results and their interpretation? 6. Are there any known and important covariates (i.e., raster datasets capturing a biophysical landscape characteristic of interest) that are missing from the analysis, and what ramification(s) is this likely to have on the results? 7. How is availability space defined, and what criteria were used to make this decision? If an iterative process was used to define availability space, what range of values was tested? How robust is model inference to changes in the availability space? 8. If RSPF results are extrapolated in time and/or space (i.e., used to make inference in different times or places from those in which the model was built), how is this decision justified? Do extrapolated spatial values represent covariate combinations actually observed in the extant data? 9. For each covariate, what order fit was used and why? 10. What link function was used to produce the RSPF fit (i.e., the final modeled surface) and why? 11. Was spatial autocorrelation present in model residuals, and if so, what course of action was taken?
16. robust, repeatable, and defensible methodology. 2) Selecting the type of data (e.g., counts, presence/absence, or measurements of characteristics). 3) Selecting an underlying sampling approach (e.g., random, opportunistic, etc.) that does not violate the assumptions of the desired statistical methods. 4) Collecting a sufficient sample size. 5) Adequately accounting for the spatial distribution of sample points (i.e., points with high spatial proximity may be spatially autocorrelated and therefore of less informational value than points sufficiently far apart, whereas points too far apart introduce potential extrapolation error). 6) Planning long-term strategies (i.e., can equivalent data collection occur at multiple time periods to create a longitudinal dataset?). 4.2 Spatial domain of analysis. Analysis may be conducted over a spatial extent that exceeds the area in which sample data were collected. While very powerful, this feature must be used with extreme caution, because inference in unsampled areas, particularly areas dissimilar from any sampled points, is not well supported by the statistical models. Spatial extents of potential interest (i.e., for management actions) could include politically defined units (e.g., hunting districts or management areas) or geographically bounded regions (e.g., the Lamar Valley in Yellowstone National Park). Note that to app
17. [Figure panels: Availability (n = 3810) and Universe (n = 9998); x-axis: Elevation; y-axis: Density] Figure 5.2: EAGLE conditional histograms. 5.3 Semivariograms for Assessment of Spatial Scale. Semivariograms are useful for showing the spatial scale at which spatial autocorrelation is present (or not) within an environmental covariate. The presence of spatial autocorrelation is normal in environmental datasets but must be considered when interpreting results. Figure 5.3 shows a semivariogram in which autocorrelation ceases to be an issue for this variable at 8000 meters (i.e., the sill, or fairly constant horizontal section of the semivariogram, begins at about this distance). Within univariate semivariograms, autocorrelation is particularly problematic when there is no obvious sill (e.g., a linear decrease in autocorrelation with increasing distance). [Figure: x-axis: distance; y-axis: semivariance] Figure 5.3: EAGLE univariate semivariogram. 5.4 Spatial Distribution. The spatial distribution graphic is useful for identifying the location of points that are outliers in the covariates. Knowledge about this spatial organization facilitates more informed, landscape-specific interpretations of results based on expert know
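The semivariogram described in Section 5.3 can be sketched as follows. This is a minimal illustration of the classical empirical estimator, half the mean squared difference between values for point pairs grouped by separation distance; it is not the code used by the EAGLES R scripts, and the function name, binning scheme, and data are hypothetical.

```python
from itertools import combinations
from math import hypot

def empirical_semivariogram(points, values, bin_width, n_bins):
    """Semivariance per distance bin: sum((z_i - z_j)^2) / (2 * pair count)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, j in combinations(range(len(points)), 2):
        dist = hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
        b = int(dist // bin_width)  # index of the distance bin for this pair
        if b < n_bins:
            sums[b] += (values[i] - values[j]) ** 2
            counts[b] += 1
    # Bins that contain no pairs are reported as None.
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]

# Hypothetical transect: the covariate rises steadily with position.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [0.0, 1.0, 2.0, 3.0]
gamma = empirical_semivariogram(points, values, bin_width=1.0, n_bins=4)
```

In this toy transect the semivariance keeps rising with distance and never levels off, which is the "no obvious sill" situation the text flags as problematic.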
18. [Figure: pairsplot matrix of covariates] Figure 5.1: Standard pairsplot. 5.2 Conditional Histograms. Stacked histograms can be used to compare the distribution of a covariate at used, available, and universal (that is, entire study domain) scales. In Fig. 5.2, which shows stacked histograms for elevation, we see that the full spatial domain extends considerably higher than either the points that were used or the points that were deemed available. In this case, the region with the highest elevation generally resides outside of the area that is modeled; thus inference to very high elevations is beyond this model's scope. [Figure panel: Used (n = 762); y-axis: Density]
19. [Figure: univariate RSPF fits for WOLF.TIF under the probit, loglog, and exp link functions, each annotated with AIC and AUC values; tabs: Link Function, Box plot, Diagnostic Graph, Distribution Graph, Histogram, Semivariogram] Figure 9.6: Univariate RSPF curve for wolf intensity of use. 9.7 RSPF Example: Second Phase in ArcGIS. The second R script is called after a link function, an order of fit, and an application layer have been selected in each covariate tab of the user dialogue. Send the desired model to R by clicking the Submit button in the lower right-hand corner of the dialogue boxes. The screen will appear to be inactive for several minutes while the second R script runs and the equation is mapped back to the spatial domain. When the fitted RSPF surface appears in ArcGIS, the second script is complete. 9.8 RSPF Model Selection and Output. Upon completion of the second R script, the RSPF output is stored in the RunX folder located inside the user des
20. 7 A quick assessment of this table illustrates one major problem with this model: the variance inflation factors (VIF) for distance to road and distance to road squared are both quite high, indicating collinearity between those two covariates. A better model would include only a first-order distance-to-road term. Additionally, after careful consideration of the biological ramifications of all covariates considered in the model, the user team determined that June NPP, sage, the two predator covariates (wolf and coyote intensity of use), and forage were unlikely to be particularly important. Coefficient estimates for a reduced model that is much more interpretable are tabled below. Parameter Estimates (columns: est, se, t, p, vif): Intercept: 11.2101, 0.9122, 12.29, 8.915039e-32, NA; dist to road tif: 0.1842, 0.2630, 0.70, 4.841429e-01, 1.2; elevation ca: 1.3175, 0.1362, 9.67, 6.228622e-21, 1.6; forage tif: 0.0211, 0.0422, 0.50, 6.172208e-01, 1.5; forest pc tot: 0.8502, 0.3207, 2.65, 8.217647e-03, 2.1; herb tif: 0.6099, 0.0945, 6.46, 1.875431e-10, 2.0; slope tif: 0.4672, 0.0843, 5.54, 4.179481e-08, 1.3. Initially, we note that the variance inflation problems present in the first model are no longer a problem (variance inflation factors should generally be less than 10, as is true for all covariates in the reduced model). When interpreting the model coefficients, we remind the user of several important points
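To see why a covariate and its square produce high variance inflation factors, consider the two-predictor case, where the VIF reduces to 1 / (1 - r^2) with r the Pearson correlation between the two predictors. The sketch below uses hypothetical distances; in a model with more covariates, r^2 would be replaced by the R^2 from regressing each covariate on all of the others.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x, y):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)

# Hypothetical distances to road (meters) and their squares.
dist = [100.0 * k for k in range(1, 11)]
dist_sq = [d * d for d in dist]
vif = vif_two_predictors(dist, dist_sq)
```

For these made-up values the VIF exceeds the threshold of 10 quoted in the text, so dropping the squared term, as the manual recommends, removes the collinearity.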
21. C curve should sit at a line of slope 1 (i.e., the grey line in the background of the plot). [Figure: semivariogram of deviance residuals (x-axis: distance; y-axis: semivariance) and ROC plot (x-axis: 1 - Specificity (false positives); y-axis: Sensitivity (true positives))] Figure 7.1: RSPF's ROC plot and semivariogram outputs. The semivariogram is used to help determine whether lack of independence due to spatial autocorrelation is relevant in the setting of interest. This depiction of spatial autocorrelation relies on assumptions of stationarity (spatial relationships are the same over the entire spatial domain), ergodicity, and isotropy (spatial relationships are the same in all directions) for the underlying spatial process (see Zuur et al. 2007, pg. 344). It is a plot of the variation between values observed at two points as a function of the distance between those points. Ideally, we want this plot to be a horizontal line, which is indicative of similar variance between points regardless of the distance between them. Lower values of semivariance for lower distances indicate relatedness between spatially proximal points, which suggests a violation of the independence assumption in the model fit. Such violations necessitate the use of
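The area under the ROC curve (AUC) reported with the ROC plot has a simple rank interpretation: it is the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way. The scores are hypothetical, and treating used points as positives and available points as negatives is an assumption made for illustration, not a description of the PresenceAbsence package.

```python
def auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical fitted probabilities at used and available points.
used = [0.9, 0.8, 0.4]
available = [0.7, 0.3, 0.2]
auc_value = auc(used, available)
```

A model with no discriminating ability gives an AUC of 0.5, which corresponds to the ROC curve lying on the slope-1 reference line.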
22. C values tabled above, we conclude that the saturated model performs best, with virtually no weight placed on the other two models in the suite; thus there is a strong indication that pronghorn are responding to both roads and predators when all other covariates are included in the model. To address the road impact question, we fit a model without roads and compared it to a model that included roads. The road model was superior based on AIC (-614 for the no-roads model as compared to -698 for the saturated model). To address the predation question, we compared models with and without wolf and coyote. In this case, the saturated model outperformed the model without predators (AIC of the saturated model was -698; for the model without predators it was -606), which suggests that wolf and coyote intensity of use do drive pronghorn resource selection. The final component of the RSPF output is the predicted RSPF surface for the best model, which is fitted and displayed in ArcGIS (see Figure 9.8). This prediction looks reasonable based on biological knowledge of this system. The large swath of good habitat that is apparently not used in the upper left-hand corner of the surface is a private in-holding. [Map legend: GYA Roads; RSPF Fit Analysis Extent; Pronghorn Observations; Value, High: 0.286773, Low: 0] Figure 9.8: RSPF surface as fitted by the final model. 9.9 Scenario Tes
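The "weight" placed on each model in a suite is commonly computed as an Akaike weight, in which each model's AIC difference from the best (lowest) AIC is converted to a relative likelihood and normalized. The sketch below assumes that convention; the manual does not show its formula, and the model names and AIC values here are hypothetical rather than taken from the pronghorn example.

```python
from math import exp

def akaike_weights(aic_values):
    """Normalize exp(-0.5 * delta_AIC) so that the weights sum to 1."""
    best = min(aic_values)  # lower AIC indicates the better-supported model
    rel = [exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical AIC scores for a three-model suite.
aic = {"saturated": 100.0, "no_roads": 104.0, "no_predators": 110.0}
weights = dict(zip(aic, akaike_weights(list(aic.values()))))
```

Because the weights fall off exponentially with the AIC difference, a gap of several tens of AIC units leaves essentially no weight on the weaker models.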
23. EAGLES User Manual. YELLOWSTONE ECOLOGICAL RESEARCH CENTER. Contributing Authors: Kezia Manlove, Daniel Weiss, and Jennifer Sheldon. EAGLES User Manual, February 2011. Table of contents: 1.0 Overview; 2.0 Installation of ArcGIS Tools; 3.0 Data Input; 4.0 Data Integration; 5.0 Data Exploration; 6.0 Resource Selection Probability Function (RSPF) Tool; 7.0 RSPF Model Assessment and Interpretation; 8.0 Ecological Forecasting Through RSPF; 9.0 RSPF Example 1: Pronghorn; 10.0 Acknowledgements, Literature Cited, Citations for R packages, Further References, and Citation Information; A1 Appendix 1: List of Covariate Layers Commonly Used by YERC; A2 Appendix 2: Specific R Functions Used for Each Model; A3 Appendix 3: RSPF Flow of Control Overview; A4 Appendix 4: Installing the RSPF Tool as a Button; A5 Appendix 5: RSPF Analysis Key Questions. 1.0 Overview. 1.1 Project Objective and Intended Audience. This manual outlines a workflow and a set of software tools collectively known as the Ecosystem Assessment, Geospatial Analysis & Landscape Evaluation System (EAGLES). EAGLES is designed to aid resource management decision making by providing support for species habitat planning efforts that integrate changing landscape conditions with demographic responses. Managers seeking to
24. [Screenshot: Windows Environment Variables and Edit System Variable dialogs, with Variable name: Path] Figure 2.2: The windows opened when setting the environment path to allow ArcGIS to call R directly in Windows XP. 2.2 ArcGIS 9.X components of the RSPF tool. The RSPF tool may be started using two approaches. The first is to open the ArcGIS project file called RSPF.mxd. This file contains all the necessary code to use the tool as described in this manual. This approach is effective but may necessitate copying the mxd multiple times for various projects. Note that if this approach is taken, users are advised to clear the spatial information associated with the project prior to adding new data. To do this, go to View > Data Frame Properties > Coordinate System and click the Clear button. Alternatively, users can establish the RSPF tool as a clickable button that will be present every time ArcMap is started. Instructions for doing this are available in Appendix 4. Requisite files for both the mxd and button installation are available for download from YERC. 3.0 Data Input. 3.1 The Wiki Tool. The wiki tool (Fig. 3.1) allows a user team to search for potential covariates using a variety of criteria (type of measurement, spatial scale, data source, etc.) and th
25. [Figure legend: ResponseSurf value, High: 0.945375, Low: 0.000000] Figure 7.2. An example RSPF Fit surface showing the probability that each raster cell will be selected by a species. Probability values range from 0 to 1 (i.e., from a zero percent chance to a one hundred percent chance of the cell being selected, according to the model). 8.0 Ecological Forecasting Through RSPF. The EAGLES tool provides functionality, through its Swap tool, that allows users to apply RSPF models fit using observed data to potential scenarios, in an effort to make projections about the ecological ramifications of landscape change. To use the Swap tool, the user must first identify a covariate to be changed and construct a GIS layer depicting this change. For example, a forecast about the impact of building a new road through a habitat would rely on the construction of a covariate layer that contains the projected road. The user can then apply the fitted RSPF model to this new layer instead of the original layer and view the response surface under the changed landscape. We emphasize that such projections are not absolute: they are simply an application of current responses to alternative scenarios and do not account for potential unobserved threshold values. Furthermore, projections may be faulty if they are made for covariate combinations that never occur in the observed dataset. The Swap tool resides within the RSPF functio
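The Swap tool idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the EAGLES code: it assumes a first-order logit model, and the coefficient values, layer names, and cell values are all hypothetical.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def response_surface(coefs, layers):
    """Apply a fitted first-order logit model to every raster cell.

    coefs  -- {'intercept': b0, layer_name: slope, ...} (hypothetical values)
    layers -- {layer_name: [cell values]}, all lists the same length
    """
    n_cells = len(next(iter(layers.values())))
    surface = []
    for i in range(n_cells):
        eta = coefs["intercept"] + sum(coefs[name] * layers[name][i] for name in layers)
        surface.append(inv_logit(eta))
    return surface

def swap_layer(layers, name, new_values):
    """Return a copy of the covariate set with one layer replaced."""
    swapped = dict(layers)
    swapped[name] = new_values
    return swapped

# Hypothetical fitted coefficients and three-cell covariate layers.
coefs = {"intercept": -1.0, "dist_to_road": 0.5, "herb": 2.0}
layers = {"dist_to_road": [2.0, 2.0, 2.0], "herb": [0.1, 0.4, 0.7]}

baseline = response_surface(coefs, layers)
# Scenario: a projected road passes close to the middle cell.
scenario = response_surface(coefs, swap_layer(layers, "dist_to_road", [2.0, 0.2, 2.0]))
print(baseline)
print(scenario)
```

Only the cell whose covariate value changed gets a new probability; the fitted coefficients themselves are not re-estimated, which is why such projections should be read as applications of current responses rather than forecasts with their own uncertainty.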
26. Figure 1.2. Example workflow for resource selection (RSPF) analysis in EAGLES. We expect that the user team will follow a work plan similar to the one outlined in Fig. 1.2. Our intent is to facilitate production of a model that is standardized, transparent, and defensible. We meet each of these criteria as follows. Standardization: By using hard-coded functions, we limit potential coding errors that might occur if each analysis were coded individually. We present a research framework and tool set that could be applied to many organisms and questions in many different systems. Transparency: This manual contains relevant citations and methodological discussion, and the tools place heavy emphasis on visual display for the user so that modeling assumptions can be clearly identified and verified. Defensibility: In keeping with the Daubert paradigm for legally defensible science, we rely on well-documented methodologies (RSPF, etc.) with known sampling distributions and thus quantifiable error rates and/or uncertainties. The workflow is designed to guide a user or team through steps via a series of dialogue boxes in the ArcGIS environment. While the tool itself embodies three primary functions (Data Input, Data Integration, and Analysis and Modeling; see Fig. 1.3), it is nested in a longer process of ecological investigation that begins with a set of management objectives and ends in ecological
27. a more complex model, and if left unaccounted for they may result in inflated Type I error rates; that is, they may increase the chance that users identify covariates as significant when in fact they are not. Goodness-of-fit statistics provide a formal measure of model fit. These statistics are located in the RSPF fit summary file, which is produced and stored in the Results subfolder of the RunX folder after the second RSPF script has run. We provide a Hosmer-Lemeshow test statistic, often used for assessing the fit of binary regression models. The hypotheses for this test are as follows. H0: The model fits adequately. HA: The model does not provide an adequate fit of the data. Small p-values for the Hosmer-Lemeshow test indicate some lack of fit; however, this test is somewhat conservative. We provide this test statistic simply because of its historical impact in binary regression settings, and we encourage users to rely on the ROC curves and especially AIC values as better indicators of the performance of one model relative to the suite of models of interest. A brief assessment of model coefficients is prudent at this point. We anticipate that most users will use RSPF models with logit links, where the relationship between changes in the covariate and the mean response probability is exponential. As such, very large coefficient estimates should be regarded with a grain of salt, since they indicate mass
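To see why large logit coefficients deserve scrutiny, it helps to convert them to odds multipliers. The short Python sketch below is illustrative only and uses made-up coefficient values; under a logit link, a change of `delta` in the covariate multiplies the odds of use by exp(beta * delta).

```python
import math

def odds_multiplier(beta, delta=1.0):
    """Factor by which the odds of use change when the covariate rises by delta."""
    return math.exp(beta * delta)

# Hypothetical coefficient estimates, from modest to very large.
for beta in (0.1, 1.0, 10.0):
    print(f"beta = {beta:5.1f} -> odds multiplied by {odds_multiplier(beta):.4g} per unit")
```

A coefficient of 10 implies that a single unit of the covariate multiplies the odds by more than twenty thousand, which is usually a sign of a scaling problem or near-perfect separation rather than a real ecological effect.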
28. aces in file names, as well as numerals in the first value of the filename (e.g., 1animal). Inside the RunX folder, the user finds the following four subfolders: 1. Parameters: contains the parameter files written by ArcGIS and read by R. 2. Covariate Graphs: contains JPEGs of all images displayed in ArcGIS, as well as several additional diagnostic plots. 3. Results: contains model summaries, including coefficient estimates and statistics associated with model fit. 4. Tables: contains the used, available, and universe CSVs written by ArcGIS, as well as the RSPF model matrix, which is the MDA. 6.3 RSPF Tool Description. RSPF tool data flow and processing overview: 1. The RSPF tool operates as a GIS-based Graphical User Interface (GUI). It collects user-defined information (e.g., input file names and output file destinations), creates a Merged Data Array (MDA) by extracting values from each raster dataset for all use and availability points, and generates a parameters file (visible to the user in the Parameters subfolder of the RunX folder, in the file RSPF params aks txt) that allows these arguments to be passed to the first R script. To accommodate the requirements for extracting the raster values for each point, the RSPF tool may resample raster files according to the user-specified scale of analysis. As a result, a potentially modified version of each input file may be placed in the RunX folder. Since the input f
29. ach use/availability point by spatial location and written as a CSV file. In EAGLES, the MDA is then passed to the statistical programming environment R for analysis, though the user team could read it into any statistical programming environment they choose. While the R processing is automated for ArcGIS users, the underlying R code is available for user inspection and modification. 5.0 Data Exploration. We provide a set of data exploration tools that become available after the MDA is first sent to R, i.e., on the graphs in the window that pops up after users click the first Submit button. The data exploration portion of the analysis is intended to help the user team familiarize themselves with the data. Specifically, we advocate the use of a portion of the data exploration protocol presented by Zuur et al. (2010), as follows: 1. Identify potential outliers in all covariates and the response through use of boxplots and histograms. 2. Look for collinearity in the covariates using the pairs plot. 3. Look for relationships between the covariates and the response. 4. Examine independence assumptions in the response using semivariograms. 5. Examine the spatial distribution of the covariate values to spatially identify unusual regions. Since we anticipate that our audience will often be using a count or binary response, we advocate a post-model-fitting assessment of normality using a normal quantile qu
30. ading the R statistical software. Unless fundamental changes are made to R, new versions of R should continue to work with the RSPF R scripts; this is not the case with the packages (see Installation Step 2). The code has been tested on R 2.8.X, 2.9.X, 2.10.X, 2.11.X, and up to 2.12.1; however, compatibility with future versions cannot be guaranteed. 2.1b Installation Step 2: load required R packages. The RSPF tool relies on functions housed in a variety of different R packages. Please note that several of these packages have undergone extensive reformatting since the inception of this project. While some users may have some or all of the necessary packages already installed on their machines, version updates make it necessary for all RSPF computations to be carried out through the set of packages provided. All necessary package files are contained within a designated folder in the zip file on the YERC website and are detailed in section 10.3. To use the RSPF tool, the contents of this library folder must be extracted and placed within the library folder of the R directory (e.g., c:\program files\R\R-2.10.1\library). 2.1c Installation Step 3: Modify Windows Environmental Variables so R can be called by ArcGIS. After R is installed, Windows must be set to allow ArcGIS to start R. To do this, a user with administrative access on the PC must go to the computer's Control Panel and then to System Prop
31. 4. Click once on Normal.UIButtonControl1, wait briefly, and then modify the name of the button beyond the "Normal." prefix. This will be the identifier for the button that you will be creating. [Screen capture: Customize dialog, Commands tab, showing the renamed control Normal.RSPF.] 5. Click away from the name to exit the renaming mode. Then click and drag the button to a menu bar. 6. Right-click the new button and select Text Only. [Screen capture: button context menu with the options Delete, Change Button Image, Image Only, Image and Text, Begin a Group, and View Source.]
32. antile plot (in development). 5.1 Pairs Plot. EAGLES produces a standard pairs plot (Fig. 5.1) that contains a great deal of information about univariate and bivariate distributions within the dataset. On the main diagonal of the plot matrix are histograms of each covariate, where the user can look for outlying points and multimodality (that is, multiple peaks in the distribution). To spot outliers, look for histograms that have long tails. The upper triangle of the plot matrix contains pairwise scatterplots of all covariates. Use these plots to identify potentially collinear variables. Collinear variables are variables that have strong relationships with one another, identifiable by the points in the scatterplot all falling along a line. When two collinear covariates are both included in a model, the model-fitting algorithms cannot identify which variable actually drives the response, which may result in the misallocation of influence to one covariate or the other. When collinear covariates are of interest, we encourage the biologist to decide, based on prior knowledge of the system, which covariate is most logical for inclusion in the model. Here we can see that several covariates (for example, June NPP and May NPP) are highly collinear, which might suggest that only one of them should be used in our final model.
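A numerical companion to the visual collinearity check is the pairwise Pearson correlation. The Python sketch below is not part of EAGLES; the covariate values are invented and the 0.9 cut-off is an arbitrary assumption chosen only to flag pairs whose scatterplots would fall nearly along a line.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant covariate")
    return sxy / math.sqrt(sxx * syy)

def collinear_pairs(covariates, threshold=0.9):
    """List covariate pairs whose absolute correlation meets the threshold."""
    flagged = []
    for a, b in combinations(sorted(covariates), 2):
        r = pearson(covariates[a], covariates[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged

# Hypothetical covariate values sampled at five points.
covariates = {
    "may_npp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "june_npp": [2.1, 3.9, 6.2, 8.0, 9.9],
    "elevation": [5.0, 1.0, 4.0, 2.0, 3.0],
}
for a, b, r in collinear_pairs(covariates):
    print(f"{a} and {b} look collinear (r = {r:.3f})")
```

A flagged pair is a prompt for the biologist's judgment about which covariate to keep, not an automatic exclusion rule.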
33. are 2.1a Installation Step 1: Download and install R. Download the latest version of R by navigating to http://cran.cnr.berkeley.edu/bin/windows/base/. Following this link leads to the site shown in Fig. 2.1. The version used in the worked example is R 2.10.1. Follow the "Download R 2.10.1 for Windows" link. The default installation is adequate for most users and was used for the examples in this tutorial. Figure 2.1. The webpage for downlo
34. are the resulting models in terms of their AIC values. The best model is the one with the lowest AIC. In order for a model to be deemed universally the best model, it should be two AIC points lower than the next best model. To examine criterion 2, we offer the user a measure (area under the curve) and two plots. The plots are a Receiver Operating Characteristic (ROC) plot, to examine the model's ability to correctly classify used and available points in the original dataset, and a semivariogram of deviance residuals, to assess whether the response exhibits spatial autocorrelation beyond that which can be explained by spatial clustering of the covariates. 7.1 Model Assessment: Receiver Operating Curve, Semivariogram for Spatial Autocorrelation, Goodness-of-Fit Statistics, and Model Coefficients. The Receiver Operating Characteristic (ROC; Fig. 7.1) is a depiction of the probability that the model ranks points that were actually used as more likely for use than points that were not actually used; i.e., it is a measure of the model's ability to identify used points as used points and available points as available points. Higher probabilities indicate better models. The ROC curve is often summarized in terms of the Area Under the Curve (AUC), reported in the ROC legend. Higher values of AUC correspond to higher probabilities that the model classifies appropriately. If the model's classification is no improvement on random classification, then the RO
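Both assessment ideas above reduce to small calculations. The Python sketch below is an illustration with invented numbers, not the EAGLES implementation: `auc` computes the fraction of (used, available) pairs that the model ranks correctly, counting ties as half, and `rank_by_aic` applies the two-point rule, assuming "two AIC points lower" means a gap of at least 2.

```python
def auc(used_scores, available_scores):
    """Probability that a used point receives a higher fitted value than an available point."""
    wins = 0.0
    for u in used_scores:
        for a in available_scores:
            if u > a:
                wins += 1.0
            elif u == a:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(used_scores) * len(available_scores))

def rank_by_aic(aic_by_model, margin=2.0):
    """Return the lowest-AIC model and whether it beats the runner-up by the margin."""
    ranked = sorted(aic_by_model.items(), key=lambda item: item[1])
    best_name, best_aic = ranked[0]
    clearly_best = len(ranked) == 1 or (ranked[1][1] - best_aic) >= margin
    return best_name, clearly_best

# Hypothetical fitted probabilities for used and available points.
print(auc([0.9, 0.7, 0.6], [0.2, 0.6, 0.4]))

# Hypothetical AIC values for three candidate models.
print(rank_by_aic({"linear": 100.0, "quadratic": 101.0, "cubic": 110.0}))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of used from available points.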
35. ayers were appropriately up- or down-scaled. Alignment of covariate layers was achieved through resampling. Corners of all grid cells were matched to allow for mapping of the fitted RSPF to the study domain. 9.3f Merged Data Array. A merged data array encompassing used and available points sampled over a common covariate scale was produced in ArcGIS through the RSPF tool, as described in Section 9.4. 9.4 Implementing the RSPF tool. To activate the RSPF tool, the user clicks on the RSPF button adjacent to any open toolboxes in ArcGIS (see Figure 9.4a). Figure 9.4a. RSPF button displayed in ArcMap. Upon clicking this button, the screen shown in Fig. 9.4b appears. The user must work through all three tabs before submitting data for analysis. In the first tab, the user must identify the Region of Interest (ROI), which can be any layer that is clipped to the appropriate dimensions. The user must al
36. bility points within a specified distance of a Use point and within a specified polygon layer. Detailed information is provided for each tab and drop-down box in a window on the right side of the tool. Once all fields are filled in, the user continues by clicking the Submit button. The RSPF tool then begins to create the input files necessary for R to build an RSPF model. Warnings are given as message boxes under certain circumstances (e.g., for a Resolution layer containing grid cells that are not square) and provide an opportunity to exit the program. Additionally, processing is automatically stopped and the user is returned to the GUI forms when data are insufficient for processing (e.g., a covariate layer is not spatially referenced). When this occurs, details are provided. 2. Define Data Types. The next screen asks users to specify whether the data layers are continuous or categorical in nature. The default data type is continuous. Once all layers are correctly attributed, the user clicks another Submit button. 3. Prepare Data and Call R Script 1. Model building begins by creating a Run folder, including the subfolders CovariateGraphs, Parameters, Results, and Tables, in the specified Output Folder. When a Run folder already exists, the smallest available integer is added, becoming, for example, folder Run1. Similarly, a second folder is created for temporary files and i
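The Run folder naming rule can be sketched as follows. This Python illustration reflects one reading of the description above (the first folder is named Run, later ones take the smallest unused integer suffix); the function names are hypothetical and the actual tool is implemented inside ArcGIS, not in Python.

```python
import os
import tempfile

SUBFOLDERS = ("CovariateGraphs", "Parameters", "Results", "Tables")

def next_run_name(existing_names, base="Run"):
    """Pick 'Run' if free, otherwise 'Run' plus the smallest unused positive integer."""
    if base not in existing_names:
        return base
    n = 1
    while f"{base}{n}" in existing_names:
        n += 1
    return f"{base}{n}"

def create_run_folder(output_folder):
    """Create the next Run folder and its four subfolders; return its path."""
    name = next_run_name(set(os.listdir(output_folder)))
    run_path = os.path.join(output_folder, name)
    for sub in SUBFOLDERS:
        os.makedirs(os.path.join(run_path, sub))
    return run_path

# Demonstrate in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as output_folder:
    first = create_run_folder(output_folder)
    second = create_run_folder(output_folder)
    print(os.path.basename(first), os.path.basename(second))
```

Choosing the smallest free integer means that deleting an old run can cause its number to be reused, so run numbers do not necessarily reflect chronological order.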
37. e RSPF plots. In the first pass, we recommend focusing on which link function appears to best fit the data. Since only one link function can be chosen for the final model, we are looking for the link function that does best in general. In the pronghorn example, this appeared to be the logit link. In the second pass through the plots, we suggest focusing on which order of fit (linear, quadratic, etc.) looked best for each covariate. Here we focus only on the curves generated by the best link function, in this case logit. For example, the wolf intensity-of-use curves shown below (Figure 9.6) are consistent with a linear fit: as wolf intensity of use goes up, pronghorn use declines. Several formal measures of fit are provided for comparison of fits. AIC, our go-to model selection criterion, indicates that the linear fits perform best for wolf intensity of use. [Screen capture: Covariate Selections list (coyote, dist_to_road, elevation, forage, forest pct, herb, may npp, june_npp, sage, sage hypothetical, slope, smammal, soil, topocoy, wolf) with Swap Tool and Order of Fit controls; panels show the scaled and empirical RSPF for 1st- and 2nd-order logit fits, each annotated with AIC, AUC, and Hosmer-Lemeshow p-value, plus a K-S statistic with bootstrapped p-value.]
38. e aware of the important decision points made when resampling. The RSPF tool, however, has the ability to resample raster layers to an identical, user-selected resolution. See Section 4.5 for a discussion of whether to scale up or down. 3. The raster datasets should have the same spatial extent, as the RSPF model will be applied to the entire user-defined ROI. This is an important consideration because, if covariates have mismatched extents, some areas in the resulting RSPF fit raster will be generated without all the necessary covariates. 6.5.0 RSPF Fit: Fitting the Univariate RSPFs in R. In summary, the first script file fits a univariate Resource Selection Probability Function (RSPF) for each of the submitted covariates. These functions are of the form $f(\mu) = \beta_0 + \beta_1 x$, $f(\mu) = \beta_0 + \beta_1 x + \beta_2 x^2$, and $f(\mu) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3$, where $f(\mu)$ is a particular link function relating the probability of use to given levels of the covariate, denoted here as $x$. This approach to model fitting allows users to select both an appropriate order for each covariate and an appropriate link function for their final model through examination of these first-round models during the second user dialog. Note that the same link function must be used for all covariates, so we suggest that the user make two passes through the univariate RSPF plots. In the first pass, assess which link appears to perform best across all covariates and select a link functi
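As a concrete illustration of a univariate RSPF with a polynomial predictor, the Python sketch below evaluates the probability of use under a logit link for fits of increasing order. The coefficient values are invented for illustration, and the logit link is only one of the choices a user might make; this is not the R code that EAGLES runs.

```python
import math

def inv_logit(eta):
    """Inverse logit link: converts the linear predictor to a probability of use."""
    return 1.0 / (1.0 + math.exp(-eta))

def linear_predictor(betas, x):
    """Evaluate beta_0 + beta_1 * x + beta_2 * x**2 + ... for one covariate value."""
    return sum(b * x ** k for k, b in enumerate(betas))

def prob_use(betas, x):
    """Probability of use at covariate value x under a logit link."""
    return inv_logit(linear_predictor(betas, x))

# Hypothetical coefficients for fits of order 1, 2, and 3 on the same covariate.
fits = {
    "1st order": (-1.0, 0.8),
    "2nd order": (-1.0, 0.8, -0.3),
    "3rd order": (-1.0, 0.8, -0.3, 0.05),
}
for label, betas in fits.items():
    curve = [round(prob_use(betas, x), 3) for x in (0.0, 1.0, 2.0, 3.0)]
    print(label, curve)
```

Plotting such curves against the empirical proportions of use is the kind of comparison the two-pass review relies on: first across links, then across orders within the chosen link.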
39. e complex models that allow for mixed effects and spatial autocorrelation are under development. Because of the training required to use R effectively, the EAGLES workflow provides a user interface in the more familiar and user-friendly ArcGIS environment. Expert users can also amend and interact with the underlying R code directly if so desired. 1.7 Model Assessment and Interpretation. Both the preliminary data exploration and the analysis require a degree of statistical understanding to effectively build a model and interpret the results. A statistical consultation may be useful for many users at this stage, but users with even limited statistical training can assess results themselves by studying the examples provided in this manual and using their knowledge of the species of interest. 2.0 Installation of ArcGIS Tools. The EAGLES tools are intended to assist users in acquiring data and fitting a Resource Selection Probability Function (RSPF; Lele and Keim 2006, Lele 2009) in the open-source statistical computing environment R on a Windows PC equipped with ArcGIS 9.X. The RSPF analysis tool uses statistical processing functionality contained in R scripts that are called directly from the ArcGIS interface. The intent of this tool is to provide users access to a powerful modeling framework without requiring extensive statistical programming knowledge. 2.1 Acquisition of Required Open Source Softw
40. e narrative model. To help the user understand an appropriate depth for the narrative model, we include descriptions of the narrative modeling in the tutorial. [Mind map residue. Top-level nodes: predation, resource, competition, pathogen/parasite, geophysical, all linked to response data. Other labels include anthropogenic, energetic intake, water balance, habitat, intraspecific, interspecific, invasive, emerging, endemic, co-evolved, climate change, slope, elevation, aspect, and cover.] Figure 1.4. A mind map (Beel et al. 2009) visualization of various factors affecting variation in the focal species response or legacy data sets. Considering all possible risks and rewards based on expert opinion, research, and natural history helps avoid deficient models. It should also represent the ideal world of postulated mechanisms leading to testable hypotheses and management decisions. Covariates are then specified to represent these factors so that end users can build a Merged Data Array (MDA) prior to data exploration, analysis, and modeling. 1.3 Data Inputs. Data inputs can be classified into two broad groups: 1) species population inputs (i.e., response data such as GPS radio collar data, and survey and transect data, including flight data); 2) geospatial covariate inputs, which may be derived from spaceborne sources such as MODIS and Landsat
41. e user can also make several choices about the graphical display of the covariates by selecting a number of bins and a binning method for the empirical RSPF fit. We selected twelve bins and bin generation via the quantile method, as we found this generated the most comprehensible picture of the empirical RSPF fit. Upon completion of all three user dialogue tabs, the user clicks the Submit button in the lower right-hand corner of the dialogue box, which calls R to fit the univariate RSPF curves. [Screen capture: the tabs Base Map, Use and Availability Data, and Covariate Data; a Covariate Layers list (distance to road, elevation, forage, forest percent, herb, May NPP); Graph Display Options (Binning Method: quantile; Number of Bins: 12); and R Script Folder and Output Folder fields with a Submit button.] Figure 9.4d. The Covariate Data tab, in which the user selects all raster covariates to be included in the analysis. Note that each covariate must be added to the ArcGIS project to be available in this dialog. A follow-up dialogue box (Figure 9.4e) opens so that the user can designate each covariate as categorical
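Quantile binning with twelve bins can be sketched in Python as below. The manual does not define how the empirical RSPF is computed, so this sketch makes an assumption: within each quantile bin of the pooled covariate values, the empirical value is the fraction of points that are used. The data are invented.

```python
from bisect import bisect_right
from statistics import quantiles

def empirical_rspf(used_values, available_values, n_bins=12):
    """Fraction of used points in each quantile bin of the pooled covariate values.

    Assumption: this per-bin proportion stands in for the 'empirical RSPF';
    bins with no points are reported as None.
    """
    pooled = list(used_values) + list(available_values)
    cuts = quantiles(pooled, n=n_bins)  # n_bins - 1 interior cut points
    used_counts = [0] * n_bins
    total_counts = [0] * n_bins
    for v in used_values:
        b = bisect_right(cuts, v)
        used_counts[b] += 1
        total_counts[b] += 1
    for v in available_values:
        total_counts[bisect_right(cuts, v)] += 1
    return [u / t if t else None for u, t in zip(used_counts, total_counts)]

# Hypothetical covariate values: used points sit at the high end of the range.
used = list(range(12, 24))
available = list(range(0, 12))
print(empirical_rspf(used, available))
```

Quantile bins hold roughly equal numbers of points, which keeps each per-bin proportion from resting on only a handful of observations; equal-width bins would not guarantee that.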
42. ect Kill (forested); PSI NASA Minnesota; from 2000
Logging (forested); PSI NASA Minnesota; from 2000
AM Freeze Thaw; University of Montana; from 1988
PM Freeze Thaw; University of Montana; from 1988
Combined Freeze Thaw; University of Montana; from 1988
Inverse Transitional Freeze Thaw; University of Montana; from 1988
Minimum Temperature; PRISM; 4km; Monthly; 1895 to 2010
Maximum Temperature; PRISM; 4km; Monthly; 1895 to 2010
Average Temperature; PRISM; 4km; Monthly; 1895 to 2010
Percent Normal Precipitation; PRISM; 4km; Monthly; 1895 to 2010
Percent Surface Water (PSW); YERC; 1km; 8 day; from 2000
Percent Soil, Herbaceous, Shrub; YERC; 30m; static; static
Forest Biomass; YERC; 100m; Annual; from 2005
Riparian (hydrologically influenced soil) vs. Upland; YERC; 30m; Annual; from 1980
Grey Attack, Red Attack, Healthy Green Forest; YERC; 2m; Annual; from 1980
Annual Average Precipitation; BioClim; 1 km; Yearly; 1980 to 1997
Annual Average Temperature; BioClim; 1 km; Yearly; 1980 to 1997
Annual Temperature Range; BioClim; 1 km; Yearly; 1980 to 1997
Average Diurnal Range in Temperature; BioClim; 1 km; Monthly; 1980 to 1997
Average Temperature of Coldest Quarter; BioClim; 1 km; Quarterly; 1980 to 1997
Average Temperature of Driest Quarter; BioClim; 1 km; Quarterly; 1980 to 1997
Average Temperature of
43. ect to all variables/parameters. Interpret results to make biological inference. The workflow is designed for a user team with the following skill sets: GIS, basic knowledge of remote sensing data, access to a statistical consultant for more complex decisions, and a lead biologist/manager with expert-level species knowledge. These skills may be found in one person but are more likely embodied in a group of people working in collaboration. This manual contains instructions useful to all members of the team and/or an individual user fulfilling all roles. Specific contents include an introduction to the ecological and statistical methodologies underlying the tools, an overview of the tools themselves and where they fall in the model-building workflow, and a worked example. Workflow diagram (Overview: Ecological Analysis Framework): form plan of action; prepare field data for analysis (e.g., enter data or digitize maps); COASTER tool for clipping, summarizing, and downloading point datasets; choose analytical approach (i.e., choose statistical method); identify limitations of data (e.g., covariate availability and scale); acquire and/or create covariates (raster datasets); decide upon final covariate set; choose output scale for applying the derived model back to the full study area; statistical analysis (see additional charts); eventual buttons: Habitat Characterization, Demographic Models, RSP
44. ed Covariates. We used a selection of modeled covariate layers in this analysis. CASA Forage (YERC) was used to generate the forage layer. Shengli Huang (YERC) generated the herbaceous, sage, and soil layers by modeling AVIRIS satellite imagery and radar. CASA Express (YERC) was used to generate the May and June cumulative NPP layers. Small mammal biomass is a modeled layer based on regression of empirically observed biomasses against a habitat map (Alan Swanson, YERC). Coyote and wolf intensity-of-use layers were created by accumulating kernel density surfaces for individual use probabilities to account for pack sizes, and might be considered modeled as well. 9.3d Available Points. Buffers of 1 km were generated for all use points to create an available space, and available points were randomly and uniformly chosen over that space. Since the spatial scale at which pronghorn select their habitat was unknown, this process was repeated at 3 km and 5 km, and analyses were conducted at each of these scales for comparative purposes. We arbitrarily selected available points at the 1 km scale for this tutorial. Techniques for assessing an optimal scale for availability are in development. 9.3e Spatial Scale of Covariates. All covariates and the response were geo-referenced in the WGS84 UTM zone 12N projected coordinate system. One common pixel size of 100 m grid cells was decided upon, and covariate l
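Drawing available points uniformly over the union of buffers can be sketched with rejection sampling. The Python example below is a simplified stand-in for the GIS operation, assuming projected coordinates in metres and invented use-point locations: candidate points are drawn uniformly in the bounding box and kept only if they fall within the buffer radius of at least one use point.

```python
import random

def sample_available(use_points, radius, n_points, seed=0):
    """Uniformly sample n_points from the union of circular buffers around use points."""
    rng = random.Random(seed)
    xs = [p[0] for p in use_points]
    ys = [p[1] for p in use_points]
    min_x, max_x = min(xs) - radius, max(xs) + radius
    min_y, max_y = min(ys) - radius, max(ys) + radius
    r2 = radius * radius
    points = []
    while len(points) < n_points:
        x = rng.uniform(min_x, max_x)
        y = rng.uniform(min_y, max_y)
        # Keep the candidate only if it lies inside at least one buffer.
        if any((x - ux) ** 2 + (y - uy) ** 2 <= r2 for ux, uy in use_points):
            points.append((x, y))
    return points

# Hypothetical use points (metres, projected coordinates) with overlapping 1 km buffers.
use_points = [(0.0, 0.0), (1500.0, 0.0), (4000.0, 2500.0)]
available = sample_available(use_points, radius=1000.0, n_points=50, seed=1)
print(len(available), available[0])
```

Because acceptance depends only on membership in the union, overlapping buffers are not sampled twice as densely, which matches the idea of a single available space. Repeating the call with radii of 3000 and 5000 would mirror the multi-scale comparison described above.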
45. edback and interest. This project was supported by funding from NASA Ecological Forecasting RRSC (NASA Grant no. NNX08AO58G). 10.2 Literature Cited.
Beel, J., B. Gipp, and C. Müller. 2009. SciPlore MindMapping: A Tool for Creating Mind Maps Combined with PDF and Reference Management. D-Lib Magazine 15(11).
Braunisch, V., and R. Suchant. 2010. Predicting species distributions based on incomplete survey data: the trade-off between precision and scale. Ecography 33: 826-840.
Burnham, K., and D. Anderson. 2002. Model selection and multi-model inference: a practical information-theoretic approach. Second Edition. Springer-Verlag, New York.
Elith, J., C. H. Graham, et al. 2006. Novel methods improve prediction of species' distributions from occurrence data. Ecography 29: 129-151.
Forester, J., H. Im, and P. Rathouz. 2009. Accounting for animal movement in estimation of resource selection functions: sampling and data analysis. Ecology 90(12): 3554-3565.
Friedman, J. H. 1991. Multivariate adaptive regression splines. The Annals of Statistics 19(1): 1-141.
Lele, S., and J. Keim. 2006. Weighted distributions and estimation of resource selection probability functions. Ecology 87(12): 3021-3028.
Guisan, A., and N. E. Zimmermann. 2000. Predictive habitat distribution models in ecology. Ecological Modelling 135: 147-186.
Lele, S. 2009. A n
46. en links users to information on collection, acquisition, and development of those data sources. This wiki can be updated by registered users and is intended to function as a reference site for geospatial data, particularly data derived from remote sensing sources, that are of common interest to ecologists. To use the wiki, users can search by keywords or select from indexed lists of datasets described within the archive. Indexed lists can be accessed by pulling down the Main Pages menu, found in the upper right-hand corner of the screen, and selecting Data Sets. Resulting lists appear as a set of links that can be clicked, leading users to sub-lists and/or individual dataset pages. Figure 3.1. Screen capture of the GeoSpatial Data Wiki page. The Geospatial Data Wiki can be accessed at http://geospatialdatawiki.wikidot.com. 3.2 Customized O
47. erties. Under the Advanced tab, click the button for Environmental Variables, as shown below. The Environmental Variables dialog window will open. In the variable list for System Variables, select Path and click the Edit button. Paste the path to the bin folder for R, located within R's program folder (e.g., c:\program files\R\R-2.10.1\bin), at the end of the Variable Value text line, adding a semicolon before the new path to separate it from the existing entries. This allows ArcGIS and R to communicate. Fig. 2.2 shows the windows that users will see on a Windows XP machine when setting the environmental variable.
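The semicolon rule for editing the Path value can be illustrated with a small string-handling sketch. This Python example only builds the new value; it does not change any system setting, the function name is hypothetical, and the existing Path shown is an invented example.

```python
def append_to_path(path_value, new_dir):
    """Return path_value with new_dir appended after a semicolon, unless already present."""
    entries = [entry for entry in path_value.split(";") if entry]
    if not entries:
        return new_dir
    # Windows paths are case-insensitive, so compare in lower case.
    if any(entry.lower() == new_dir.lower() for entry in entries):
        return path_value
    return path_value.rstrip(";") + ";" + new_dir

# Hypothetical existing value and the example R bin folder used in this manual.
current = r"C:\WINDOWS\system32;C:\WINDOWS"
r_bin = r"c:\program files\R\R-2.10.1\bin"
print(append_to_path(current, r_bin))
```

The check for an existing entry avoids adding the R folder twice if the installation steps are repeated.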
48. ew method for estimation of resource selection probability function. The Journal of Wildlife Management 73(1): 122-127.
Manly, B., L. McDonald, D. Thomas, T. McDonald, and W. Erickson. 2002. Resource selection by animals. Second Edition. Kluwer Academic Publishers, Dordrecht, Netherlands.
Nelder, J., and R. Mead. 1965. A Simplex Method for Function Minimization. Computer Journal 7: 308-313.
Phillips, S. J., R. P. Anderson, and R. E. Schapire. 2006. Maximum entropy modeling of species geographic distributions. Ecological Modelling 190: 231-259.
Sciplore. http://www.sciplore.org/software/sciplore_mindmapping
Smithson, M., and J. Verkuilen. 2006. A Better Lemon Squeezer? Maximum-likelihood regression with beta-distributed dependent variables. Psychological Methods 11(1): 54-71.
Zuur, A. F., E. N. Ieno, and G. Smith. 2007. Analyzing Ecological Data. Springer Science, New York.
Zuur, A. F., E. N. Ieno, and C. S. Elphick. 2010. A protocol for data exploration to avoid common statistical problems. Methods in Ecology and Evolution 1: 3-14.
10.3 R Packages Used and Citations. These are the packages required by the RSPF tool that are NOT included in the standard installation of R. Note that all of these packages must be added to the R library folder as specified in section 2.1b. All necessary files are provided in the zip file called RSPF R libraries zip ge
49. he resulting response surface, in addition to the diagnostic graphs and tabular outputs produced by R, are the final results of the RSPF analysis and thereby represent the information from which inferences can be drawn.
Appendix 4: Installing the RSPF Tool as a Button
The RSPF tool is typically distributed within an ArcGIS project (.mxd) containing the necessary code for the Graphical User Interfaces (GUIs) that constitute the tool. Alternatively, the RSPF tool can be installed as a clickable button that remains within a toolbar in the ArcMap environment. This document outlines steps that can be used to install the RSPF tool in ArcMap version 9.3. When you have completed these steps, the new button will appear in all new projects. Removal instructions are given at the end of the document. Note that the files necessary to install the button are provided with the other downloadable materials.
To install the RSPF tool:
1. Under the Tools menu item, select Customize. (Customize is also available in the list that appears when right-clicking on a gray area in one of the menu bars.)
[Screen capture: ArcMap Tools menu.]
50. ies of interest. See Forester et al. (2009) for a discussion of availability. The RSPF tool provides users with three different options for identifying availability points, each of which corresponds to a different level of user control and thus a different combination of bias and variance in the model error structure. The RSPF tool availability point options are:
1. The point buffer option, which generates random availability points located within a buffered region around each response point. In this case, the user must define both the buffer size and the number of points per buffer region.
2. Random selection of a user-defined number of availability points from a region of available space (i.e., a polygon shapefile), regardless of the observed distribution of response points within the region. For example, the tool can pick five times the number of use points uniformly from an entire available space. This method leads to a uniform sampling intensity over the entire available region.
3. (Preferable) The user defines availability points within a custom-made shapefile and enters these points directly into the model. The benefit of this approach is control, since the user can purposefully exclude points from areas that are not truly available (e.g., water bodies for terrestrial species).
We note that the appropriate number of availability points and their spatial distribution remain somewhat nebulo
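The point buffer option (option 1 above) can be illustrated with a short Python sketch. This is not the RSPF tool's ArcGIS implementation: it assumes circular buffers and uniform sampling within each circle, and the function name, coordinates, and buffer size are hypothetical values chosen for illustration.

```python
import math
import random


def buffer_availability_points(use_points, buffer_size, points_per_buffer, seed=None):
    """Draw random points uniformly inside a circular buffer around each use point.

    Assumption: buffers are circles of radius buffer_size (map units).
    """
    rng = random.Random(seed)
    available = []
    for x, y in use_points:
        for _ in range(points_per_buffer):
            # Taking the square root keeps the point density uniform over
            # the disc area instead of clustering near the centre.
            radius = buffer_size * math.sqrt(rng.random())
            angle = rng.uniform(0.0, 2.0 * math.pi)
            available.append((x + radius * math.cos(angle),
                              y + radius * math.sin(angle)))
    return available


# Hypothetical use points in projected coordinates (metres).
use = [(520000.0, 4980000.0), (530000.0, 4990000.0)]
pts = buffer_availability_points(use, buffer_size=500.0, points_per_buffer=5, seed=1)
print(len(pts))  # 10 points: 5 per use point
```

The two user inputs named in the text, buffer size and number of points per buffer region, map directly onto the `buffer_size` and `points_per_buffer` arguments.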
51. ignated output directory. The RSPF output comprises three parts. First, the ROC curve and semivariogram are created and stored as graphics in the Results subfolder of the RunX folder. Second, the RSPF model, as well as AIC and AUC scores, goodness-of-fit tests, and variance inflation factors, are stored in the RSPF summary file in the Results subfolder of the RunX folder. Those results for the saturated Pronghorn model are listed here:
n use = 762, n avail = 3810
Parameter Estimates
                         est       se       t      p              vif
Intercept                12.8088   1.3567   9.44   4.602682e-20   NA
coyote.tif               0.0821    0.0207   3.96   8.209788e-05   1.4
dist to road.tif         18.8106   2.7598   6.82   1.876677e-11   72.1
I(dist to road.tif^2)    10.5740   1.4697   7.19   1.571563e-12   71.8
elevation.tif            0.6405    0.1107   5.79   1.035356e-08   2.6
forage.tif               0.1094    0.0721   1.52   1.289330e-01   2.9
forest pct.tif           0.4398    0.1273   3.46   5.707549e-04   3.5
herb.tif                 0.1928    0.0762   2.53   1.161013e-02   2.5
june npp.tif             0.1471    0.1007   1.46   1.447094e-01   2.4
sage.tif                 0.1954    0.0504   3.88   1.136533e-04   1.5
slope.tif                0.4714    0.0783   6.02   2.730155e-09   1.4
soil.tif                 0.0070    0.0471   0.15   8.808050e-01   3.0
wolf.tif                 0.5286    0.0642   8.23   8.288591e-16   1.1
Log likelihood of GLM estimates: 343.977
Log likelihood of DC estimates: NA
Log likelihood of N-M estimates: 361.8818
AIC of N-M estimates: -697.7635
AUC for N-M: 0.7666794
Mean rspf value for N-M: 0.03986547
Hosmer-Lemeshow goodness of fit results: chi = 46.7, p = 1.73820593696306e-0
52. iles can be quite large, this procedure has the potential to take up large amounts of disk space.
2. R is called directly by ArcGIS, and the first R script is executed using the MDA created in step 1 and arguments specified in the parameter file. The first R script derives an empirical univariate RSPF for each covariate, diagnostic graphs for data exploration, and graphical and tabular output for each selected covariate, and enables the user to select covariates contributing to the final composite RSPF fit. At this point, we reiterate that only one in a pair of collinear covariates should be used in the final fit.
3. The output files generated by the first R script are sent back to the ArcGIS tool to allow for additional user specification of the RSPF model (e.g., the user will select which covariates to include in the final model, as well as the desired link function). The RSPF tool now generates a new parameter file (overwriting the existing parameter file in the process) that is passed to the second R script. The new parameter file is very similar to the first but also contains the new user-selected arguments.
4. R is called from ArcGIS again, and the second R script is executed using the arguments specified in the updated parameter file. The second R script produces an output RSPF fit based on the user-provided response and availability points (discussed in Section 4.4) and writes the equation used for fitting in the RSPF_equation.txt fi
53. ing. For the pronghorn analysis, we designated a set of points to use for availability, contained in the Pronghorn Availability shapefile, which is selected with the third radio button.
[Screen capture: the Use and Availability Data tab of the first user dialogue.]
Fig. 9.4c. The Use and Availability Data tab, in which the user enters the use layer (i.e., the response data) and either creates the availability points or specifies a pre-made availability point shapefile.
The third tab of the first user dialogue allows the user to enter all desired covariates for preliminary analysis (see Figure 9.4d). Users can elect to use layers by selecting them from the drop-down menu below Covariate Layers. For the pronghorn analysis, we initially selected all layers for model fitting, as shown below. Selected layers are listed in the large white box below the selection box. At this point th
54. ion Models (DEMs) and their derivatives, modeled outputs from BioGeoChemical (BGC) models such as Biome-BGC, interpolated climate data produced using meteorological station data (e.g., PRISM, TOPS), and products made using remotely sensed imagery (i.e., images collected by airborne and spaceborne sensors and used to estimate values of ecological meaning). For example, Net Primary Production (NPP), mean winter precipitation, and forage biomass estimates are modeled covariates.
4.5 Availability Space
Availability space (i.e., places on the landscape where the sampled species could have been observed) is a necessary input for the RSPF tool. The user team is responsible for determining an appropriate method for obtaining available points for their focal organism. Selection of available points remains an active research area. The tool provides three options for creating or importing availability points, which are detailed in Section 6.4.2. Generally, we advocate that users create their own availability points to have greater control over their spatial distribution.
4.6 Merged Data Array (MDA)
The final product of the data integration phase is a merged data array (MDA), a table that can be read by a variety of different statistical programming environments. The MDA is created in ArcGIS by intersecting all response and availability points with each covariate raster dataset and extracting the covariate value for e
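The idea behind the merged data array, one table row per point with one extracted value per covariate raster, can be sketched in plain Python. This is an illustration under stated assumptions, not the ArcGIS routine: the raster is a hypothetical dictionary holding a north-up grid with square cells, and the names `cell_value` and `build_mda` and the column names are invented for this sketch.

```python
def cell_value(raster, x, y):
    """Return the raster value at map coordinate (x, y), or None outside the grid.

    Assumption: north-up grid with square cells, described by its upper-left
    corner (x_min, y_max) and cell_size.
    """
    col = int((x - raster["x_min"]) // raster["cell_size"])
    row = int((raster["y_max"] - y) // raster["cell_size"])
    data = raster["data"]
    if 0 <= row < len(data) and 0 <= col < len(data[0]):
        return data[row][col]
    return None


def build_mda(points, rasters):
    """Build one row per point: x, y, used flag, then one column per covariate."""
    table = []
    for x, y, used in points:
        row = {"x": x, "y": y, "used": used}
        for name, raster in rasters.items():
            row[name] = cell_value(raster, x, y)
        table.append(row)
    return table


# Hypothetical 2 x 2 elevation raster with 10-unit cells.
elevation = {"x_min": 0.0, "y_max": 20.0, "cell_size": 10.0,
             "data": [[1600, 1700], [1800, 1900]]}
points = [(5.0, 15.0, 1), (15.0, 5.0, 0)]  # (x, y, used=1 / available=0)
mda = build_mda(points, {"elevation": elevation})
print(mda)
```

A point that falls outside a raster receives None for that covariate, which mirrors the requirement that covariate extents cover the region of interest.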
55. [Screen capture: ArcMap Tools menu and ArcToolbox.]
2. This will open a Customize window. Select the Commands tab and then UIControls. The RSPF button will appear in the Commands window pane. Select it and then click the Delete UIControl button. Then click OK.
[Screen capture: Customize window, Commands tab.]
3. To remove the code associated with the button, return to Microsoft Visual Basic by clicking the Tools menu item, then Macros, and finally Visual Basic Editor.
[Screen capture: ArcMap Tools menu.]
56. ive changes in response probability with changing values of the covariate. Additionally, we recommend that users check the variance inflation factors (VIFs) reported in the coefficients table. VIFs in excess of ten are indicative of problems with model fit, often related to multicollinearity among selected model covariates (see the Pronghorn worked example, Section 9). If high VIF values are present, we suggest that users revisit the pairs plot in the first round of user dialogues in an effort to identify potentially collinear variable pairs. If such pairs can be identified, we recommend the exclusion of one of the paired covariates from the final model fit. The final, and perhaps most important and intuitive, tool for model assessment is an examination of the fitted RSPF surface in ArcGIS (Figure 7.2). We recommend that users take a critical look at the fitted surface and apply their knowledge of the ecology of the focal organism to assess whether the surface returned by the model makes sense. It is our experience that examination of the fitted surface can be useful in identifying important and overlooked covariates. If the fitted surface makes ecological sense, the ROC values are acceptable, the AIC score is the best or among the best in the suite of plausible models, and the semivariogram does not display major departures from spatial independence, the model should be regarded as acceptable
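The VIF check described above is a simple threshold rule, sketched here in Python. The function name `flag_high_vif` is hypothetical, and the VIF values are a subset of those reported for the saturated pronghorn model in the worked example.

```python
def flag_high_vif(vifs, threshold=10.0):
    """Return covariate names whose VIF exceeds the threshold, largest first."""
    flagged = [(name, v) for name, v in vifs.items()
               if v is not None and v > threshold]
    flagged.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in flagged]


# VIFs taken from the saturated pronghorn model summary (subset).
vifs = {
    "coyote": 1.4,
    "dist to road": 72.1,
    "dist to road^2": 71.8,
    "elevation": 2.6,
    "wolf": 1.1,
}
print(flag_high_vif(vifs))
```

Flagged covariates are candidates for the pairs-plot review recommended in the text; the rule itself does not decide which member of a collinear pair to drop.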
57. l information on ROC curves and AUC.
3. Goodness of fit: Various goodness-of-fit measures have been proposed for binary and binomial response data. Here we use them in a Use-Availability setting, which is not exactly binomial, but the measures should work fairly well nonetheless. While none of these measures is without caveats, a commonly used statistic is the Hosmer-Lemeshow statistic, which essentially bins the data over values of the covariate and then uses a Chi-square test to compare observed counts in a bin to counts expected in that bin under the model. Since the null hypothesis for the Hosmer-Lemeshow test is that the model fits well, low p-values correspond to lack of fit in the model. An alternative method is the Kolmogorov-Smirnov (K-S) goodness-of-fit test, which is a very general test of the difference between the used and available distributions of that covariate.
In addition to these statistical measures, the user must rely on knowledge of the biological system at hand, as well as the information contained in the curves, to select the order at which each covariate should be fit. For example, a covariate whose optimal values for a given organism are at the middle of the covariate's value range might be a candidate for a quadratic (second-order) term, whereas a covariate whose optimal values for the organism are at the low end of the covariate's range, and whose increasing presence corresponds to steadily declining desirabi
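The binning logic behind the Hosmer-Lemeshow statistic can be sketched as follows. This is a generic textbook-style version, not the code in the RSPF R scripts: it assumes bins are formed from the sorted fitted probabilities (the manual describes binning over covariate values), and the function name is hypothetical. The p-value step is omitted because it requires a chi-square distribution function.

```python
def hosmer_lemeshow_statistic(observed, predicted, n_bins=10):
    """Chi-square-type statistic comparing observed and expected counts per bin.

    observed:  0/1 outcomes (1 = used, 0 = available)
    predicted: fitted probabilities from the model
    Assumption: bins are equal-sized groups of the sorted fitted probabilities.
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    statistic = 0.0
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        expected_used = sum(p for p, _ in chunk)
        observed_used = sum(y for _, y in chunk)
        expected_unused = len(chunk) - expected_used
        observed_unused = len(chunk) - observed_used
        if expected_used > 0:
            statistic += (observed_used - expected_used) ** 2 / expected_used
        if expected_unused > 0:
            statistic += (observed_unused - expected_unused) ** 2 / expected_unused
    return statistic


# A model predicting 0.5 everywhere fits a 50/50 sample perfectly.
print(hosmer_lemeshow_statistic([0, 0, 1, 1], [0.5, 0.5, 0.5, 0.5], n_bins=1))  # 0.0
```

Larger values of the statistic indicate larger gaps between observed and expected counts, which is what drives the low p-values that the text associates with lack of fit.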
58. le.
5. ArcGIS reads the RSPF equation from the RSPF_equation.txt file and applies the model to the entire raster dataset from which the MDA was derived. The final result is a raster layer depicting the RSPF model fit for the landscape. Details on how to interpret this dataset are provided in Section 7.
6.4.0 RSPF Inputs: Overview
All input files used by the RSPF tool must be spatial datasets formatted as (1) vector shapefiles for the use (i.e., response) and availability datasets or (2) raster datasets for the environmental covariate layers. The RSPF tool was tested primarily using .tif images, and other raster data formats may behave unexpectedly. All input data files should be in the same datum (e.g., WGS 1984) and projection (if the data are projected; the model will also work using data in geographic coordinates).
6.4.1 RSPF Inputs: Response Data (i.e., Use Points)
The Use point shapefile contains points that represent known locations of the species of interest (i.e., observations, telemetry locations, or GPS collar locations from the sampling period).
6.4.2 RSPF Inputs: Availability Points
Availability points are used to define potential habitat. In other words, these are points where the species of interest may occur within the study area. In practice, however, the expertise of the researcher is often used to define a logical available space based on their understanding of the spec
59. ledge. Because the graphic as made in R (Figure 5.4) is somewhat rudimentary, the bin values for each variable are included within the table rspf used points with bins.csv and can be used to add the associated bin value for each use point within ArcGIS (i.e., join the csv file to the use point shapefile attribute table).
[Figure: map of use points by X coordinate and Y coordinate, shaded by elevation.tif bin.]
elevation.tif bin values:
Bin 1:  1.61e+03 to 1.68e+03
Bin 2:  1.68e+03 to 1.75e+03
Bin 3:  1.75e+03 to 1.82e+03
Bin 4:  1.82e+03 to 1.9e+03
Bin 5:  1.9e+03 to 1.97e+03
Bin 6:  1.97e+03 to 2.04e+03
Bin 7:  2.04e+03 to 2.11e+03
Bin 8:  2.11e+03 to 2.18e+03
Bin 9:  2.18e+03 to 2.26e+03
Bin 10: 2.26e+03 to 2.33e+03
Bin 11: 2.33e+03 to 2.4e+03
Bin 12: 2.4e+03 to 2.47e+03
Figure 5.4. EAGLE spatial distribution for single covariates.
6.0 Resource Selection Probability Function (RSPF) Tool
The RSPF tool fits resource selection probability functions, a special class of species distribution models, for use-available data and a set of desired covariates directly from ArcGIS. Species distribution models (SDMs) are commonly used in ecological studie
60. lity is provided within the ArcGIS-based tools to create the MDA, thereby relieving the user of several time-consuming steps involved in preparing the data for direct export to a statistical program. Important considerations to keep in mind at this stage include:
1. Sampling approach and the distribution of response points
2. Spatial domain of analysis
3. Spatial scale of analysis
4. Modeled covariates
5. Availability space
Each of these topics is dealt with in detail in Section 4.
1.5 Data Exploration
Once the user team has developed a narrative model and acquired covariates, the data exploration tools (accessible as plot buttons in the first round of user dialogues in the ArcGIS RSPF tool) provide a venue for preliminary data exploration and model fitting. These tools walk users through appropriate portions of the protocol for data exploration for ecologists proposed by Zuur et al. (2010) so that users can better familiarize themselves with their datasets. This protocol consists primarily of graphical tools for identification of outlying data points, non-normal data distributions, and anomalies in data structure that should be considered in model selection and development.
1.6 Analysis and Modeling
EAGLES's statistical analyses occur in the statistical programming environment R. EAGLES currently has a Resource Selection Probability Function (RSPF) model and a statistical model for intensity of use. Mor
61. lity might be a good candidate for a linear (first-order) fit.
6.5.2 RSPF FIT: Fitting the Full RSPF in R
The second R script uses the N-M algorithm (or simulated annealing if appropriate; see above) to fit an RSPF function for the particular covariates and covariate orders specified in the second user dialog. Fitting here works the same as it did in the first R script, but only one model is fit. This model is of the form
$$f(\mu(x)) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p$$
where $f(\mu(x))$ represents the link function for the mean that the user selected in the second user dialog, $\beta_0$ is an intercept term, and $\beta_j x_j$ represents all terms related to the $j$th selected covariate (this could potentially include as many as three terms, of the form $\beta_{j1} x_j + \beta_{j2} x_j^2 + \beta_{j3} x_j^3$). Parameter estimates and fit statistics associated with this model are available to the user in the rspf fit summary.txt file located in the Results subfolder of the RunX folder.
6.5.2a Standardization
The user should be aware that all quantitative predictor variables are standardized prior to fitting. While standardization is a transformation procedure that does not affect the model fit or predictions, it does facilitate model interpretability (Gelman and Hill 2007, pg. 56). Examples of appropriate interpretation of standardized coefficient estimates from the logistic RSPF are included in the worked example.
6.5.2b Interaction
While the EAGLE tools d
62. ly a model to an entire landscape requires that all underlying covariate datasets have spatial extents that cover the entire area of interest.
4.3 Spatial scale of analysis
To produce the most interpretable results, all covariates entering the model should share a common spatial scale (i.e., spatial resolution). However, data layers are often collected or modeled at very different scales. Ideally, all the covariates will have an identical spatial resolution (e.g., 30 meters) that is consistent with the spatial error term associated with the response data. This is seldom the case, however, and user teams will typically need to decide on a scale appropriate for analysis. In cases where covariate data must be rescaled, the user has two possible options, each of which has drawbacks. The options are:
1. Scaling up: in this case, the resolutions of the covariate datasets are reduced (i.e., multiple pixels are averaged to create coarser pixels) until they match the resolution of the coarsest dataset and/or the maximum spatial error of the response dataset. The advantage of this approach is that when the statistical model (e.g., the RSPF fit) is applied to the entire spatial domain, inference will never be made at a finer resolution than the datasets can allow. The drawback of this approach is the loss of detail in the covariate datasets that may have been costly to collect or acquire.
2. Scaling down: in this case, the resolutions of covariate datase
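The "scaling up" option in this section (averaging multiple pixels into coarser pixels) can be sketched in a few lines of Python. This is a simplified illustration on a list-of-rows grid, not the ArcGIS resampling tool; the function name `scale_up` and the sample grid are hypothetical, and partial blocks at the right and bottom edges are simply dropped.

```python
def scale_up(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2-D grid (list of rows).

    Assumption: incomplete blocks along the right and bottom edges are discarded.
    """
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows - rows % factor, factor):
        coarse_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


fine = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
print(scale_up(fine, 2))  # [[3.5, 5.5], [11.5, 13.5]]
```

The example shows the trade-off named in the text: sixteen fine pixels collapse to four coarse ones, so within-block detail is lost.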
63. mivariogram
[Figure: ROC plot; x-axis: 1 - Specificity (false positives), y-axis: Sensitivity (true positives).]
Figure 9.8. ROC plot for the saturated pronghorn model.
This model's AUC is fairly high (AUC = 0.77), suggesting that the model does a reasonably good job of correctly classifying used and available points. The Hosmer-Lemeshow goodness-of-fit test indicates a significant lack of fit in this model, suggesting potential omission of important covariates. However, we are not particularly concerned with the lack of fit, since our objective is to predict with this model and its AUC is high. To examine the ecological impacts of distance to road and predation on pronghorn habitat use, we compared AIC scores from our original saturated model and two reduced models, one excluding distance to road and one excluding predators (see Section 9.2b). The reduced models were both fit in their own runs of the RSPF R scripts. AIC values for each of the models are reported here, along with the number of parameters in the model (k), the difference in AIC scores between the best model and this particular model (ΔAIC), and the AIC weight (w) attributed to that model.
Model          AIC Score    k    ΔAIC    w
Saturated      -697.7635    13   0       1
No Road        -613.8171    12   83.95   5.91e-19
No Predators   -605.9269    11   91.84   1.14e-20
Based on these AI
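The ΔAIC and AIC weight columns in the table above follow from the AIC scores by a standard calculation, sketched here in Python. The function name `akaike_weights` is hypothetical. The AIC scores are entered as negative numbers, which is an assumption: it is the sign under which the saturated model is the best (lowest-AIC) model and the reported ΔAIC values are reproduced.

```python
import math


def akaike_weights(aic_scores):
    """Return {model: (delta_aic, weight)} from a {model: AIC} mapping."""
    best = min(aic_scores.values())            # lowest AIC is the best model
    deltas = {m: a - best for m, a in aic_scores.items()}
    relative = {m: math.exp(-0.5 * d) for m, d in deltas.items()}
    total = sum(relative.values())
    return {m: (deltas[m], relative[m] / total) for m in aic_scores}


# AIC scores from the pronghorn example (signs assumed negative, see above).
scores = {"Saturated": -697.7635, "No Road": -613.8171, "No Predators": -605.9269}
for model, (delta, weight) in akaike_weights(scores).items():
    print(f"{model:13s} dAIC = {delta:6.2f}  w = {weight:.3g}")
```

Because the weights are normalized to sum to one, a model that is tens of AIC units worse than the best model receives a weight that is effectively zero.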
64. nality and can easily be applied to an RSPF model and surface once projected covariate layers are built. Additional types of alternate landscape conditions include products such as expected forest density after thinning, forage production after burning, or Net Primary Productivity (NPP) under a future climate scenario. An example of the Swap tool is shown in Section 9.9.
9.0 RSPF Example 1: Pronghorn
9.1 Overview and Narrative Model
Yellowstone National Park pronghorn (Antilocapra americana) face a risk of extirpation due to geographic/demographic isolation, low abundance, and low recruitment. Decision makers need a management plan based on demographic monitoring of abundance, especially vital rates and recruitment. This study, led by P.J. White (YNP), focused on:
1. Demographic monitoring (esp. recruitment and survival)
2. Ecological interactions (esp. predation rates and recruitment)
Staging areas, migratory corridors, and summer/winter use areas were also of interest here (see Figure 9.1).
[Figure: narrative model diagram linking RESPONSE DATA to HAZARD (predation), RESOURCE (energetic intake, water balance, habitat), COMPETITION (intraspecific, interspecific), pathogens/parasites (invasive, emerging invasive, endemic/co-evolved), and GEOPHYSICAL (climate change, slope/elevation/aspect, cover) drivers.]
Figure 9.1. Narra
65. nline Aggregation & Summarization Tool for Environmental Rasters (COASTER)
The COASTER system is a set of online tools designed to produce customized raster datasets for specific spatial domains. COASTER results can be used for data visualization and are amenable for use as input covariates in statistical models such as RSPF. The great strength of this approach lies in its ability to reduce massive and cumbersome datasets into manageable information that can be easily incorporated into an ArcGIS environment. The data currently available on the tool consist of gridded climate data for the Lower 48 United States from 1980 through 2009, with an 8 km spatial resolution and a daily temporal resolution.
[Screen capture: COASTER request page, "Request a Summary of Gridded Climate Data" (spatially or temporally aggregated), with fields for corner coordinates, time frame, variables, summary statistic, thresholds, and trends and anomalies.]
66. o not generate interaction terms internally in R, the user can readily generate interaction layers in ArcGIS and pass them to the R models. We suggest the following guidelines when working with interaction terms:
1. Consider the use of an interaction term for main effects that have large values.
2. When building interaction layers, note that the EAGLE tools rely on standardization prior to generation of higher-order terms. To be consistent, the user should first standardize the two layers he or she wishes to include in the interaction (by subtracting the layer mean and dividing by the layer standard deviation) and then multiply the two layers together to form the product layer.
3. Once the individual variables are standardized, an interaction layer can be created by multiplying the raster layers together using ArcGIS functionality such as the Raster Calculator.
7.0 RSPF Model Assessment and Interpretation
In order for a model to be scientifically defensible, it should meet two criteria:
1. It should be the best model of a suite of possible models.
2. It should provide an adequate fit of the data.
The EAGLES tools provide the user team with mechanisms for addressing both of these criteria. To assess criterion 1, we provide a model AIC value for the final RSPF model fit. We suggest that the user team generate a set of candidate models, fit each of the models in a series of runs of the EAGLES tool, and comp
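The standardize-then-multiply guideline for interaction layers (guidelines 2 and 3 above) can be sketched in Python on small list-of-rows grids. This is an illustration rather than the Raster Calculator workflow: the function names and sample layers are hypothetical, and the sample standard deviation is used as an assumption because the manual does not say which form of the standard deviation the tools apply.

```python
import statistics


def standardize(layer):
    """Subtract the layer mean and divide by the layer standard deviation.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    values = [v for row in layer for v in row]
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [[(v - mean) / sd for v in row] for row in layer]


def interaction_layer(layer_a, layer_b):
    """Cell-by-cell product of two standardized layers with identical shape."""
    za, zb = standardize(layer_a), standardize(layer_b)
    return [[a * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(za, zb)]


# Hypothetical 2 x 2 covariate layers.
slope = [[1.0, 2.0], [3.0, 4.0]]
cover = [[10.0, 30.0], [20.0, 40.0]]
print(interaction_layer(slope, cover))
```

Standardizing before multiplying keeps the product layer on the same footing as the higher-order terms that the tools generate internally.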
67. oR, HH, leaps, Matching, MCPAN, multcomp, mvtnorm, PresenceAbsence, rgenoud, sp
Freeman, Elizabeth. 2007. PresenceAbsence: An R Package for Presence-Absence Model Evaluation. USDA Forest Service, Rocky Mountain Research Station, 507 25th Street, Ogden, UT, USA.
Heiberger, Richard M. 2009. HH: Statistical Analysis and Data Display: Heiberger and Holland. R package version 2.1-29.
Pebesma, E. J., and R. S. Bivand. 2005. Classes and methods for spatial data in R. R News 5(2). http://cran.r-project.org/doc/Rnews/
R Development Core Team. 2008. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. URL http://www.R-project.org
Ribeiro, Paulo J., and Peter J. Diggle. 2001. geoR: a package for geostatistical analysis. R-NEWS 1(2): 15-18. June 2001.
Schaarschmidt, Frank, Daniel Gerhard, and Martin Sill. 2008. MCPAN: Multiple comparisons using normal approximation. R package version 1.1-7.
10.4 Further Readings
Miller, J., J. Franklin, and R. Aspinall. 2007. Incorporating spatial dependence in predictive vegetation models. Ecological Modelling 202: 225-242.
Legendre, P., and L. Legendre. 1998. Numerical Ecology. Elsevier, Amsterdam.
10.5 Citation Information
Manlove, K. R., Weiss, D. J., and Sheldon, J. W. 2011. EAGLE User Manual. Yellowstone Ecological Research Center, Bozeman, MT.
Appendix 1 Li
68. Appendix 5: RSPF Analysis Key Questions
The goal of EAGLES is to augment existing decision support systems by providing tools that produce results that aid resource managers. The Resource Selection Probability Function (RSPF) Tool available in EAGLES provides a habitat selection model capable of producing results within a standardized and transparent framework. This appendix provides a rough guide for the documentation that we suggest accompany any report and/or publication in which results from RSPF are used. Note that these questions should be answerable regardless of the habitat selection model (e.g., MaxEnt, RSF, etc.) used to support management decisions.
1. What is the response sample size? How was this response obtained (i.e., via probabilistic sampling of individuals, via appropriate temporal thinning of repeated measurements on the same individuals, etc.)? Are there inherent biases that might exist due simply to the collection of response data?
2. At what temporal scale (i.e., over what temporal period) were the responses collected?
3. What is the temporal extent of the analysis, and are the response and covariate datasets synchronous in time? If not, how was the temporal domain for each covariate chosen (e.g., were certain temporal lags in covariate values included, and with what justification)?
4. What is the spatial extent of analysis, and do the
69. on for use in the final model. Then, once the link function has been selected, make a second pass through the plots and identify the best order of fit for each desired covariate, considering only the fits based on the chosen link. In general, we give preferential treatment to the logistic link due to its interpretability and ubiquity.
In R, the models are fit using the Nelder-Mead (N-M) algorithm, a commonly used simplex method that searches for optimal parameter estimates by finding a minimum in the multidimensional parameter space. In this case, initial values for the N-M algorithm are parameter estimates generated by the expectation maximization (EM) algorithm used in fitting a generalized linear model; that is, the starting points for establishing optimal values for the weighted distributions are fits from an unweighted generalized linear model using the same link. The N-M fitting method relies on unimodality and may fail in situations with multiple local minima. Sometimes convergence of the N-M algorithm is very slow. If convergence has not been achieved in 5000 iterations of the N-M algorithm, the RSPF function may be fit via Simulated Annealing (SANN), an alternative optimization algorithm that works well on rough surfaces. Users are informed when simulated annealing is employed in R but should not be concerned by its use. For an introduction to link functions and covariate selection in resource selec
70. ovariate Distributions
After completion of the first RSPF R script, a display opens in ArcGIS (see Figure 9.5a). This display contains information necessary for data exploration of each covariate, as well as assessment of the link function and covariate term order (first order, quadratic, etc.) for the full RSPF model.
[Screen capture: Covariate Selections list, Swap Tool and Order of Fit menus, and empirical RSPF plots for elevation.tif showing 1st-, 2nd-, and 3rd-order fits under the logit, probit, loglog, and exp links, each with its AIC, AUC, and Hosmer-Lemeshow p-value; K-S statistic 0.285, bootstrapped p-value 0, maximum 1st-order AUC 0.661.]
71. ply on f x p, the distribution of covariates in the used population (Lele and Keim 2006). We developed all models on RSPF functions with logistic links; however, a cumulative log-log link for the RSPF analog of a proportional hazards model is available as well.
6.2 RSPF Tool Description: Code, data, and output file storage protocol
Code: After creation of the MDA, the ArcGIS script calls two R scripts, presently called RSPF script 1.r and RSPF script 2.r. The location of the R scripts is set by users when they run the RSPF tool. Since the RSPF tool calls the two R scripts by name, please do not rename the scripts.
Data Files: The datasets used as inputs by the RSPF tool are made available to the RSPF tool by adding them to the ArcGIS project. The user should be aware that, due to file reading structures within R, data file names must begin with a letter. Do not begin a file name with a numeric character.
Output Files: The output files produced by the RSPF tool will be placed in a user-defined output folder. Within this folder, sub-folders will be created named RunX, where X is a count that will increase by one for each user run. For example, if the user runs the RSPF tool three times, the output folder will contain sub-folders named Run, Run1, and Run2. Contents of the RunX folders are discussed in Sections 6.3 and 6.5. Due to the structure of the R environment, users must avoid sp
72. [Screen capture: Customize window, Commands tab, showing the renamed RSPF button.]
7. Back in the Customize window, double-click the renamed button to launch Microsoft Visual Basic.
[Screen capture: Microsoft Visual Basic editor (Normal.mxt, ThisDocument) showing the empty RSPF_Click procedure.]
8. Add the following lines of code:
    Load frmRSPF
    frmRSPF.Show
[Screen capture: Microsoft Visual Basic editor (Normal.mxt, ThisDocument) showing the completed procedure:]
    Private Sub RSPF_Click()
        Load frmRSPF
        frmRSPF.Show
    End Sub
73. provides multiple workflow pathways based on the specifics of the species response data and management question(s). The general idea is to provide a systematic yet flexible architecture for integration of species data with geospatial covariates, most of which are derived from NASA data, data products, and ecosystem models that assimilate sensor data. As the degree of complexity in statistical analyses and remote sensing data increases, the need for a set of standardized techniques and common data protocols becomes more essential if we are to support repeatable, transparent methods for ecological modeling. 1.2 Development of a Narrative Model: This workflow is most effective when the user team has a strong working knowledge of the organism of interest, including physiological drivers and potential thresholds, trophic roles (including predators, hazards, and prey resources), niche, competitive interactions, and habitat/geophysical preferences, as well as parasites and disease (see Fig. 1.4). Our conceptual modeling process begins with a verbal description of important relationships (competitive, trophic, behavioral, etc.) between the organism of interest and its environment. This prior knowledge of the system enters the model through selection of a set of hypothetical drivers (covariates) to be considered for inclusion in the model. Here we refer to the covariates and their relationship to the organism of interest as th
74. s named either TmpRSPF or TmpRSPF followed by an integer. If point layers are to be clipped, the ROI layer is converted to a polygon layer, and two new intermediary layers are created: ROIRecls and ROIReclsPoly. A universe point layer, RandUnivPts, is then created by randomly generating 10,000 points within the extent of the selected ROI. If a randomized Availability layer is to be created, it is done at this point. Randomization of Availability points within a specified distance of Use locations results in a BufferUse layer whose name includes the specified distance. Both options for randomizing Availability points result in the layer RandAvailPts. Coordinate pairs are added to each point in the Availability, Use, and Universe layers and are used to extract the covariate values at each location. If the "Clip Use, Availability, and Universe layers to this ROI" box is checked, these layers are clipped to the ROI. The tables for either the clipped or original layers are then exported as comma-delimited files and stored within the Tables folder. A parameter file named RSPF params aks txt (Parameters folder) is then generated. This parameter file contains the necessary folder information (i.e., paths to the datasets) and quantitative details about the raster covariates that are utilized by the first R script. 4. Define the Final Model and Call R Script 2: The first R script produces information
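The universe-point step described above (10,000 random points within the ROI extent) can be sketched as follows. This is a minimal illustration under stated assumptions, not the ArcGIS implementation: the function name `random_universe_points` and the extent values are hypothetical, and points are drawn uniformly over a rectangular extent.

```python
import random

def random_universe_points(xmin, ymin, xmax, ymax, n=10_000, seed=None):
    """Draw n points uniformly at random inside a rectangular extent.

    Returns a list of (x, y) coordinate pairs; a seed makes the draw repeatable.
    """
    rng = random.Random(seed)
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]

# Usage with a hypothetical extent in projected meters:
points = random_universe_points(500_000, 4_950_000, 540_000, 4_990_000, seed=1)
print(len(points))
print(points[0])
```

Fixing the seed is a design choice worth considering whenever a run needs to be repeated exactly.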
75. s to characterize the relationship between the regions utilized by a species and the habitat features that characterize those regions. One specific manifestation of species distribution modeling is the Resource Selection Probability Function (RSPF; Lele and Keim 2006, Lele 2009). RSPF is a model that estimates the relationship between habitat use and attributes of important covariates through a model akin to standard binomial regression models (logistic, cumulative log-log, etc.). While it is our intent to make a variety of other species distribution models available, we developed the initial modeling code for the logistic RSPF due to the ubiquity and transparency of its underlying logistic regression model. 6.1 Introduction to RSPF: RSPF modeling is an extension of Resource Selection Modeling that relies on resampling theory to resolve problems associated with obtaining truly Unused points. RSPF methodology does not imply a particular link function; rather, it adjusts model standard errors so that they accommodate Use-Available, as opposed to Use-Nonuse, sampling designs by considering used points to be draws from the following weighted distribution: $f^{U}(x) = \dfrac{\pi(x, \beta)\, f^{A}(x)}{\int \pi(x, \beta)\, f^{A}(x)\, dx}$, where $f^{A}(x)$ is the distribution of covariates for the available population, $\pi(x, \beta)$ is the resource selection probability function, and $\int \pi(x, \beta)\, f^{A}(x)\, dx$ is the expected probability of use. RSPF estimation allows us to estimate $\beta$ in $\pi(x, \beta)$ based sim
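To make the weighted distribution concrete, the sketch below evaluates a selection probability under two candidate links and then reweights a sample of available points. It is an illustration only: the function names are hypothetical, the "cumulative log-log" link is assumed here to be the complementary log-log form 1 - exp(-exp(eta)), and the available points are treated as an equally weighted sample, so the integral becomes a sum.

```python
import math

def logistic_rspf(eta):
    """Selection probability under a logistic link, for linear predictor eta."""
    return 1.0 / (1.0 + math.exp(-eta))

def cloglog_rspf(eta):
    """Selection probability under a complementary log-log link (assumed form)."""
    return 1.0 - math.exp(-math.exp(eta))

def used_weights(available_etas, link=logistic_rspf):
    """Discrete analog of the weighted distribution.

    Each available point gets weight pi(x) / sum of pi over all available
    points, so the weights sum to 1.
    """
    probs = [link(eta) for eta in available_etas]
    total = sum(probs)
    return [p / total for p in probs]

# Usage with hypothetical linear-predictor values at four available points:
etas = [-2.0, -0.5, 0.0, 1.5]
print(used_weights(etas))
print(used_weights(etas, link=cloglog_rspf))
```

Points with a higher selection probability receive a larger share of the weight, which is what it means for used points to be draws from the available distribution tilted by the RSPF.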
76. sed Locations; Histogram of Values for Available Locations. [Histogram panels for used and available locations.] Figure 9.5b. Stacked histograms to compare distributions of universe, used, and available sites. Boxplots can be used to compare the distributions of different covariates at sites that pronghorn actually used and sites that were deemed available to them. In Figure 9.5c, for wolf intensity of use, we see that the distribution is slightly right-skewed, since the boxplot is shifted toward the lower portion of the y-axis, but there do not appear to be substantial outliers. Figure 9.5c. Boxplot of univariate distribution of wolf intensity of use. A pairs plot (Figure 9.5c) is produced to compare all covariates. This plot matrix is particularly useful in helping researchers identify potentially collinear variables (for example, May and Jun NPP in the pairs plot below). Collinearity is problematic in fitting linear models; thus, in general, pairs of collinear variables should not both be included in an analysis. Figure 9.5c. Pairs plot for Pronghorn. 9.6 Assessing Univariate RSPF Curves in ArcGIS: We encourage users to make two passes through the univariat
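A numerical companion to the pairs plot is a table of pairwise correlations. The sketch below is an assumption-laden illustration rather than part of the RSPF tool: the function names, the 0.7 cutoff, and the covariate values are all hypothetical, and Pearson correlation is used as one simple screen for collinearity.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def collinear_pairs(covariates, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| at or above the threshold."""
    names = sorted(covariates)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(covariates[a], covariates[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged

# Usage with hypothetical covariate values sampled at four points:
sample = {
    "may_npp": [1.0, 2.0, 3.0, 4.0],
    "jun_npp": [2.0, 4.0, 6.0, 8.5],
    "elevation": [5.0, 1.0, 4.0, 2.0],
}
for a, b, r in collinear_pairs(sample):
    print(a, b, round(r, 3))
```

A flagged pair is a prompt to keep only one of the two covariates, or to justify keeping both.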
77. so select a resolution (here, the resolution of elevation.tif, which is 100 m) and identify an output folder where the Run file containing all RSPF output is built. [Screen capture of the Base Map tab: tabs Base Map, Use and Availability Data, and Covariate Data; Region of Interest (ROI) Base Map; ROI Layer list box ("Set the area over which you want the model to apply in the Region of Interest list box"), set to elevation.tif; check box "Clip Use, Availability, and Universe layers to this ROI"; Resolution list box ("Set the cell size to which the map output will conform in the Resolution list box"), set to 100 m (elevation.tif); R Script Folder; Output Folder; Quit and Submit buttons.] Figure 9.4b. The Base Map tab of the first RSPF user dialog. In this tab, the user enters the ROI layer, sets a file that defines the spatial resolution of analysis, and identifies the folder containing the R scripts (i.e., the location of RSPF script 1.r and RSPF script 2.r) and the location of the output folder. In the Response and Availability Files tab (see Figure 9.4b below), the user must identify the layer containing the response measurements (that is, the layer of used points) and select a mechanism for selecting available points (see Section 6.4.2 for descriptions of the mechanisms provided). These mechanisms are represented by the three radio buttons below the Availability File head
78. st of Covariate Layers Commonly Used by YERC: Snow Cover (from 2000); Land Surface Temperature & Emissivity (from 2000); Vegetation Indices, EVI/NDVI (from 2000); LAI and FPAR (from 2000); GPP (from 2000); Land Cover Type (from 2000); Thermal Anomalies and Fire (from 2000); Albedo (from 2000). GPP Gross and Net Primary Productivity MODIS 5km daily from 2000 FPAR Fraction of Photosynthetically Active Radiation Herbaceous Foliar Biomass Production upland CASA Forage 8km 2m to 250m Herbaceous Foliar Biomass Production wetland CASA Forage 2m to 250m Woody Shrub Biomass Production sagebrush CASA Forage 2m to 250m Woody Shrub Biomass Production wetland Urban Expansion CASA Forage PSI NASA Minnesota 2m to 250m from 1982 from 1980 from 1980 from 1980 from 1980 from 2000. Agriculture Expansion, New Irrigated Cropland (PSI NASA Minnesota, from 2000); Agriculture Expansion, CRP for two years or more (PSI NASA Minnesota, from 2000); Wetland Conversion to cropland (PSI NASA Minnesota, from 2000); Wetland Loss, drained or dried out (PSI NASA Minnesota, from 2000); Wetland Expansion (PSI NASA Minnesota, from 2000); Fires, non-forest (PSI NASA Minnesota, from 2000); Fires, forested (PSI NASA Minnesota, from 2000); Ins
79. t excluded it. To examine predator impacts, we fit a model that excluded coyote and wolf use as predictors and compared this model to a saturated model in which both coyote and wolf were included. These specific questions led to the following model suite, whose members were fit and compared using AIC: Model 1, saturated model with all covariates fit at an appropriate order; Model 2, saturated model omitting distance to road; Model 3, saturated model omitting predators. 9.3 Data Integration: Data integration occurred in the ArcGIS environment prior to running the RSPF tool. 9.3a Sampling: Locational data were derived from marked Yellowstone pronghorn; 762 fixes were made on 26 collared animals from May to July of 2005 within a 1500 km² study area (P.J. White, Yellowstone National Park ungulate biologist). Figure 9.3a shows a map of the study domain and used locations. Figure 9.3a. Spatial domain and observed use locations for YNP pronghorn, May to July 2005. 9.3b Full Spatial Domain: Data on pronghorn use were collected on 26 collared individuals from May to July of 2005. After compilation of the pronghorn use data, a full spatial domain was designated, encompassing all the use points as well as some surrounding edge area to be used for selecting potential available points. This region was selected arbitrarily by the research team but was driven in part by the known locations of pronghorn use. 9.3c Model
80. ting. We used two hypothetical scenarios to demonstrate how the EAGLES tool might be used to assess the potential impact of landscape change on pronghorn distribution. These hypothetical what-if scenarios were tested using the Swap tool, whereby a variable used to define the model (e.g., sage and/or distance to road) is replaced with a hypothetical variable when the model is applied (e.g., sage hypothetical and/or distance to road hypothetical; Figure 9.9a). RSPF fit results from a model with and without swapping the variable distance to road are shown in Figure 9.9b. These types of What-if Scenarios (WIS) will provide practitioners with important decision support to guide site-level action plans and restoration efforts and to understand the environmental impacts of climate disruptions, invasive species, changing land use, and disturbance regimes. [Screen capture: Covariate Selections list (coyote, dist_to_road, elevation, forage, forest, ...); Swap Tool set to apply the model to Sage_Hypothetical.tif; Order of Fit drop-down set to First Order; diagnostic panels for 1st-, 2nd-, and 3rd-order fits showing the empirical RSPF with AIC and AUC values for each link function.]
81. tion models, see Manly et al. 2002. For the original work on the Nelder-Mead algorithm, see Nelder and Mead 1965. 6.5.1 RSPF Fit: Determining Covariate Presence and Order for the Full Model. After obtaining some knowledge of how each covariate relates individually to resource selection, the user may wish to construct one or more multi-covariate models composed of a combination of covariates, each fit at some particular order. The RSPF tool incorporates several measures for assessing model fit and performing model comparison to facilitate multiple regression model fitting and selection. 1. AIC (Akaike's Information Criterion): a measure that represents a compromise between the likelihood of a particular parameter set and the number of parameters used to fit the model. Low values of AIC are preferable, though models are typically taken to be similar in functionality if their AICs are within two units of one another (see Burnham and Anderson). 2. AUC (Area Under the Curve): AUC is derived from the Receiver Operating Characteristic (ROC) curve associated with a model. The ROC curve shows the trade-off in the model between specificity and sensitivity; that is, it shows how often the model predicts false positives and false negatives. In general, higher values of AUC correspond to models that exhibit more desirable properties with respect to both specificity and sensitivity. See Section 7.1 for additiona
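The two measures described above can be computed by hand, which helps when reading the tool's output. The sketch below is illustrative only and is not how the R scripts necessarily compute them: the function names and input values are hypothetical. It uses the standard definition AIC = 2k - 2 log L and the rank-based form of AUC (the proportion of used/available pairs in which the used point receives the higher score, with ties counted as one half).

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_likelihood

def similar_by_aic(aic_a, aic_b, tolerance=2.0):
    """Rule of thumb from the text: models within two AIC units are treated as similar."""
    return abs(aic_a - aic_b) <= tolerance

def auc(scores_used, scores_available):
    """Area under the ROC curve via pairwise comparison of used and available scores."""
    wins = 0.0
    for u in scores_used:
        for a in scores_available:
            if u > a:
                wins += 1.0
            elif u == a:
                wins += 0.5
    return wins / (len(scores_used) * len(scores_available))

# Usage with hypothetical values:
print(aic(log_likelihood=-20.0, n_params=2))
print(similar_by_aic(44.0, 45.5))
print(auc([0.9, 0.6, 0.4], [0.5, 0.3, 0.2]))
```

An AUC of 0.5 corresponds to a model that ranks used and available points no better than chance, and 1.0 to perfect separation.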
82. tive model framework for Pronghorn analysis. In order to get at a more all-encompassing assessment of vital rates (esp. recruitment), we fit two RSPFs for two responses: one representing selection of birthing arenas, for recruitment-specific analysis, and one representing resource selection in general. Here we include only the results from the general RSPF analysis. 9.2 Data Inputs: Ideally, we are interested in addressing questions of road impacts, predator impacts, and range condition impacts on pronghorn use and recruitment. 9.2a Covariates: We translated these ecological interests into the following set of covariate layers (see Figure 9.2) to use for model building. Abiotic: elevation; slope; topographic complexity. Biotic, Productivity: forage; Net Primary Productivity (NPP). Biotic, Landcover: percent forest cover; percent sagebrush cover; percent herbaceous cover; percent soil cover. Biotic, Predation: coyote intensity of use; wolf intensity of use; small mammal prey prevalence. Human-Influenced: distance to roads. Figure 9.2. Covariate maps for (a) elevation, (b) forage, (c) percent sage. 9.2b Model Suite: In order to assess the impact of distance to road on our model, we fit two multi-covariate models, one that included distance to road and one tha
83. ts are increased to match the resolution of the finest dataset through the process of resampling. The advantage of this approach is that all data are preserved. The drawback is that when the RSPF model is applied to the full study area (e.g., when the RSPF fit image is generated), inference is being made at spatial scales unsupported by the input datasets (i.e., an ecological fallacy). Statistically, this approach is much harder to defend than scaling up, but there are occasions when it is more justified than others. For example, when a covariate that is at a coarser scale than desired is unlikely to vary significantly at a fine scale (e.g., temperature within a flat area such as a plain), scaling down may not compromise the analysis by adding excess noise through resampling the temperature covariate, and it would allow other covariates to enter the model at their finer, more informative scales. However, users should be prepared for the tell-tale checkerboard effect (i.e., visible squares representing the grid cell boundaries of the original dataset) when the model is applied to the entire spatial domain. 4.4 Modeled Covariates: Many different modeled covariate layers are available for model inputs. Some of these are freely available, while others are available for purchase (see Section 3.1 for use of the geospatial wiki for covariate identification and acquisition). Modeled covariates may include Digital Elevat
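The checkerboard effect follows directly from how scaling down works. The sketch below is a toy illustration, not the resampling routine used in ArcGIS: it assumes nearest-neighbor resampling by an integer factor, and the function name `resample_nearest` and the grid values are hypothetical.

```python
def resample_nearest(grid, factor):
    """Scale a coarse grid down to a finer cell size by nearest-neighbor replication.

    Every coarse cell becomes a factor-by-factor block of identical values, which
    is the source of the visible squares in the model output.
    """
    fine = []
    for row in grid:
        fine_row = [value for value in row for _ in range(factor)]
        fine.extend(list(fine_row) for _ in range(factor))
    return fine

# Usage: a hypothetical 2 x 2 temperature grid resampled to 4 x 4.
coarse = [[10, 12],
          [14, 16]]
for row in resample_nearest(coarse, 2):
    print(row)
```

No new information is created: the fine grid has more cells, but only as many distinct values as the coarse one.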
84. us issues. A rule of thumb for the number of availability points is to use five times the number of response points. The spatial distribution of availability points is typically random within a defined availability space. A more complex issue is the distribution of points within disconnected areas of available habitat (e.g., three patches of habitat with varying numbers of response points within each). The preferred approach in such a case is to distribute availability points within each area in proportion to the number of response points. This approach, however, requires slightly more GIS acumen than a simple random distribution. 6.4.3 RSPF Inputs: Environmental Covariates. There are three primary concerns when selecting and/or preparing covariate datasets for use with the RSPF tool. 1. The raster datasets should underlie all the use and availability points. If this is not the case, the Merged Data Array (MDA) will contain inappropriate zero values. R removes these automatically so that they do not affect the validity of the statistical output, but users receive no feedback indicating that they included invalid points. Alternatively, users can select an option to omit any points outside the region of interest prior to creating the MDA. 2. The covariate datasets should have the same spatial resolution prior to analysis. Ideally the user will generate these manually to b
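The proportional allocation described above can be worked out before any points are generated in the GIS. The following is a sketch under stated assumptions: the function name `allocate_availability` and the patch names are hypothetical, the default total follows the five-times rule of thumb, and fractional shares are rounded with a largest-remainder rule so that the counts add up to the requested total.

```python
def allocate_availability(response_counts, total=None, ratio=5):
    """Split availability points across patches in proportion to response points.

    response_counts maps a patch name to its number of response points. If total
    is omitted, ratio * (total response points) availability points are allocated.
    """
    n_response = sum(response_counts.values())
    if n_response == 0:
        raise ValueError("at least one response point is required")
    if total is None:
        total = ratio * n_response
    exact = {patch: total * n / n_response for patch, n in response_counts.items()}
    allocation = {patch: int(share) for patch, share in exact.items()}
    leftover = total - sum(allocation.values())
    # Hand the remaining points to the patches with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda p: exact[p] - allocation[p], reverse=True)
    for patch in by_remainder[:leftover]:
        allocation[patch] += 1
    return allocation

# Usage with three hypothetical habitat patches:
print(allocate_availability({"north": 10, "valley": 30, "south": 60}))
```

Each patch's count can then be used as the number of random points to generate inside that patch.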
85. ver, we recommend that the user consult Hosmer-Lemeshow p-values or AIC values for models with and without the covariate in order to help decide whether including the covariate is appropriate. It is advisable to keep insignificant covariates if the sign associated with them makes good biological sense. In general, proximity to roads seems to facilitate animal use in the Lamar Valley, based on RSPFs for several other species, so the positive sign on the coefficient here is unexpected, and removing distance to road from the model might be a prudent choice. To interpret the elevation coefficient from the reduced model above, one could say that for each standard deviation of increase in elevation, the odds of use by pronghorn are multiplied by exp(-1.3175) = 0.268 (that is, they fall to 26.8% of their previous value) at the mean level of all the other covariates included in the model. Similarly, to interpret the model coefficient for herbaceous cover (herb.tif), one could say that for each additional standard deviation increase in herbaceous cover, the odds of use by pronghorn are multiplied by exp(0.6099) = 1.84 (that is, they rise to 184% of their previous value) at the mean level of all other modeled covariates. A ROC plot is located in the Run folder in the RSPF ROC Semivariogram file. The ROC plot for the saturated pronghorn model is shown below in Figure 9.8. The ROC plot here suggests that the model is doing a fairly good job of classifying points as Used or Available.
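The arithmetic behind these interpretations is easy to reproduce. The sketch below assumes a logistic link, under which exponentiating a coefficient gives a multiplier on the odds of use; the function names and the 0.5 baseline probability are hypothetical, while the two coefficients are the ones quoted in the text.

```python
import math

def odds_multiplier(coefficient):
    """Factor by which the odds of use change per one-standard-deviation increase."""
    return math.exp(coefficient)

def probability_after(baseline_probability, coefficient):
    """Probability of use after a one-SD increase, starting from a baseline probability."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * math.exp(coefficient)
    return new_odds / (1.0 + new_odds)

# Coefficients quoted in the text for elevation and herbaceous cover:
print(round(odds_multiplier(-1.3175), 3))
print(round(odds_multiplier(0.6099), 2))
# Effect on a hypothetical baseline probability of use of 0.5:
print(round(probability_after(0.5, -1.3175), 3))
```

Because the change in probability depends on the baseline, the odds multiplier is the quantity that stays constant across the covariate range.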
86. airborne sources like LIDAR and NAIP, ground-based sources like meteorological base stations and distributed sensor networks, and modeled estimates like those from CASA, HYDRA, and SRM. We use the term response here to refer to the species data used to fit the model. The model can then be extended to predict other species responses in addition to those actually observed and used for model fitting, in an effort to generate ecological forecasts. In light of the plethora of new and emerging covariate inputs available, and the complexity associated with getting them into the ArcGIS environment, EAGLES provides the user team with two tools for data acquisition and formatting: 1. A wiki site that provides an index of existing geospatial covariates, as well as information on their contents and generation, located at http://geospatialdatawiki.wikidot.com. Additionally, a partial list of frequently sought covariate layers and where to find them is included in Appendix 1 of this manual. 2. A tool to create climatic variables customized for the user's particular region of interest, applicable for immediate use with the ArcGIS tool. This site can be accessed at http://www.coasterdata.net. 1.4 Data Integration: The data integration portion of the analysis consists of accessing covariate layers and integrating them with the response data. In most cases, a Merged Data Array (MDA) is built and used for subsequent analysis. Functiona