
Myair Toolkit for Model Evaluation User Guide


Contents

1. [Figure 4.15: Example of a frequency scatter plot from the Model Diagnostics tool (raw data, filtered by season).]
[Figure 4.16: Example of a conventional scatter plot from the Model Diagnostics tool (calculated statistics, daily maximum NO, filtered by month). Plot title: Scatter Plot, Diagnostics, Forecast, RB1, Daily Maximum NO (ug/m3), Filtered by Month. Axes: Observed, Modelled.]
[Figure 4.17: Example of a frequency scatter plot from the Model Diagnostics tool (forecast indices with no filtering). Plot title: Frequency Scatter Plot, Diagnostics, Forecast, RB1, O3 Forecast Indices, No Filtering.]
[Figure 4.18: Example of a time plot from the Model Diagnostics tool (raw data). Plot title: Time Plot, Raw Data, Forecast, Station RB1, Pollutant NO (ug/m3). Series: Modelled, Observed.]
All four figures cover the date range 01/01/12 00:00 to 31/12/12 23:00.
2. 1. Go to http://www.java.com. 2. Follow the instructions to download and install Java version 5 or higher.
2.4 Install RGGRunner
For Windows and Linux operating systems: 1. In the Installs subdirectory, open the Windows/Linux RGGRunner zip file. 2. Extract all the files to a suitable directory, e.g. C:\Program Files on Windows; the extraction will put all files in a subdirectory called rggrunner. 3. Make a desktop shortcut to rggrunner.exe in the bin subdirectory; this is the RGGRunner program.
For Mac operating systems: Find rggrunner_macosx.zip in the Installs subdirectory. Extract all files to a suitable directory. After extraction, double click to mount the image. After mounting, double click again on the RGG icon to run the application.
3 Using the toolkit
The Myair Toolkit for Model Evaluation consists of four tools:
1. Screening Questionnaire: This tool gives structured advice on your proposed evaluation.
2. Data Input: This tool processes your modelled and observed concentration data, saving the processed data in an R workspace and, optionally, a CSV file.
3. Model Evaluation: This tool takes in the workspace saved by the Data Input tool and evaluates all or some of the data, producing graphs and, optionally, CSV files.
4. Model Diagnostics: This tool takes in the workspace saved by the Data Input tool and produces diagnostic graphs for one station and one pollutant at a time.
3. These options only apply to the image output formats (JPG or PNG); PDF output is always produced on A4. Reducing the overall image size will increase the proportional size of text. These options provide flexibility to produce graphs for reports or presentations.
Select whether to output processed data and statistics as CSV. This is a very useful option that produces CSV files containing all of the numerical data used to create the graphs, and also some statistics not shown on the graphs. Refer to section 4.2 for details of the contents of these files.
Click the green Play button. Refer to section 4.2 for details of the output from the Model Evaluation tool.
3.4 Model Diagnostics Tool
The aim of the Model Diagnostics tool is to enable further investigation of the performance of the model at one particular monitoring station for one particular pollutant.
Tip: In the Myair Toolkit for Model Evaluation installation directory you will find a DataSamples sub-directory. Here you will find sample files in the formats recognised by the Toolkit and a ReadMe.txt file describing each file.
Step 1: In RGGRunner, select the Model Diagnostics tool.
[Screenshot: RGGRunner showing the Myair Toolkit for Model Evaluation, Part 4: Model Diagnostics, Version 3.0.]
4. This section of the User Guide gives step-by-step instructions for using each tool in the Toolkit.
Step 1: Start RGGRunner.
[Screenshot: the RGGRunner main window.]
Step 2: Click on File > Open RGG and browse to select each of the four Toolkit tools in turn.
[Screenshot: file dialog listing the Documentation, Installs and scripts folders and the files DSLocal_DataInput.rgg, DSLocal_ModelDiagnostics.rgg, DSLocal_ModelEvaluation.rgg and DSLocal_Questionnaire.rgg; files of type: RGG File.]
Step 3: You have now loaded the Myair Toolkit for Model Evaluation.
[Screenshot: RGGRunner showing the Myair Toolkit for Model Evaluation, Part 1: Screening Questionnaire, Version 3.0, with question groups "1 Comparability of the observed and modelled datasets" and "2 Quality of the modelled data"; the remaining screenshot text is not recoverable.]
5. [Garbled equations, not recoverable from the source: they give the conditions on CRMSE / (2 RMS_U) in terms of NMSD and \sqrt{2(1 - R)}.]
R is the Pearson's correlation coefficient and NMSD is the normalised mean standard deviation.
The radial distance to a data point on the target plot is equal to T for that station. The smaller the value of T, the better the concentration prediction for that station. The black circle represents T = 1, which represents the performance criteria and should be fulfilled by at least 90% of stations. The dashed black circle represents T = 0.5. The grey lines separate quadrants representative of the dominant error in the modelled data: from positive or negative BIAS, or from standard deviation (SD) or correlation (R) errors in the CRMSE.
Figure 4.2 and Figure 4.3 show examples of output from the concentration evaluation part of the Model Evaluation tool with different grouping and filtering options.
[Figure: Target plot (DELTA 3.3), airText 2012 run a, ALL STATIONS, ALL POLLUTANTS, from 01/01/12 00:00 to 31/12/12 23:00. Panels: daily mean PM2.5, hourly mean NO2, 8-hour rolling mean O3, daily mean PM10. Horizontal axis: CRMSE / (2 RMS_U).]
6. [Spreadsheet screenshot: example rows of the forecast alert stats CSV file; cell values not recoverable.]
Table 4.6 Example of forecast index evaluation output: forecast alert stats CSV file.
4.3 Model Diagnostics Tool Output
4.3.1 Time Variation Plot
This compares modelled and observed concentrations by (clockwise from top) hour and day of the week, day of the week, month of the year, and hour of the day. The shaded area indicates the 95% confidence interval in the mean. For an example, refer to Figure 4.13.
4.3.2 Scatter Plot
This compares the modelled and observed concentrations on a scatter plot, optionally filtered by weekday, month or season. The frequency scatter plot shows the frequency of occurrence of each point, whereas the conventional scatter plot shows one point per pairwise modelled/observed data point. When the calculated forecast indices are chosen in the data to plot, the output defaults to a frequency scatter plot. The black solid line is the 1:1 line. The dotted lines are the factor-of-2 lines. Figure 4.14 to Figure 4.17 show examples of frequency and conventional scatter plots for different data to plot and different filtering options.
4.3.3 Time Plot
This plots a time series of modelled and observed conc
7. CONTENTS
1 INTRODUCTION
2 GETTING STARTED
2.1 INSTALL R
2.2 INSTALL R PACKAGES
2.3 INSTALL JAVA
2.4 INSTALL RGGRUNNER
3 USING THE TOOLKIT
3.1 QUESTIONNAIRE TOOL
3.2 DATA INPUT TOOL
3.3 MODEL EVALUATION TOOL
3.4 MODEL DIAGNOSTICS TOOL
4 OUTPUT
4.1 DATA INPUT TOOL OUTPUT
4.1.1 R Workspace
4.1.2 CSV file
4.2 MODEL EVALUATION TOOL OUTPUT
4.2.1 Concentration Evaluation Output
4.2.1.1 Target plot DELTA 1.2
4.2.1.2 Target plot DELTA 3.3
4.2.1.3 Box and Whisker plot
4.2.1.4 Scatter Plot
4.2.1.5 Quantile Quantile Plot
4.2.1.6 CSV output file
4.2.2 Forecast Index Evaluation Output
4.2.2.1 Forecast index accuracy
4.2.2.2 Forecast alert accuracy
4.2.2.3 CSV output files
4.3 MODEL DIAGNOSTICS TOOL OUTPUT
4.3.1 Time Variation Plot
4.3.2 Scatter Plot
4.3.3 Time Plot
5 FILE FORMATS
5.1 CSV MODELLED AND OBSERVED DATA
6 BATCH MODE FACILITY
7 REFERENCES
1 Introduction
Regional and municipal governments are increasingly interested in providing services to assess and forecast local and city-level air quality. Air quality forecasts on these scales can be disseminated to health services and the public in terms of air quality alerts to inform and warn at-risk groups ab
8. and pollutant, displaying single station type (Industrial) and pollutant PM.
4.2.1.5 Quantile Quantile Plot
This plot compares the modelled and observed concentrations, ordered independently from lowest to highest concentration, as a quantile-quantile plot. The data can optionally be plotted on log10 scales. The black solid line is the 1:1 line. The dotted lines are the factor-of-2 lines. Figure 4.7 presents an example quantile-quantile plot.
[Figure: Quantile Quantile Plot, AIRTEXT 2012 VALIDATION, INDUSTRIAL, DAILY MEAN PM (ug/m3). Series: Forecast, Sensitivity C. Axes: Observed, Modelled (1 to 100, log scale). Date range 01/01/12 00:00 to 31/12/12 23:00.]
Figure 4.7 Example of a quantile-quantile plot from the concentration evaluation part of the Model Evaluation tool, grouped by station and filtered by model and pollutant, displaying pollutant PM on a log10 scale.
4.2.1.6 CSV output file
When the "output processed data and statistics as CSV" option is checked, statistics are calculated for each pollutant in output units and are output in a CSV file whose name ends in _conc_stats.csv. The statistics are calculated for the variable by which the data is grouped and, additionally, over all data filtered by the filters selected. For example, if data is grouped by station and filtered by model and pollutant, the statistics are output for each station and for all valid stations in each model and pollutant.
9. base. Under Other builds, click on Previous releases. Click on R 2.15.3 (March 2013). Click on Download R to download the install program. Run the install program, taking care to install R in a directory where you have write privileges.
NOTE: If you do not have direct access to the internet from your computer (for example, you access the internet through a university network), then when you install R, instead of accepting all the defaults at the Setup screen, choose not to accept all the defaults and, when offered, choose Internet2 as the internet option. This will force R to use the same proxy settings used by Internet Explorer. The defaults for all other options can be accepted.
2.2 Install R packages
Follow these instructions to install the required packages for R:
1. Open R.
2. From the Packages menu, select Install package(s).
3. Pick a CRAN mirror, preferably one that is geographically nearest to you, as this is likely to be the fastest. In the UK, choose the Bristol mirror.
4. Scroll down to find openair in the list of packages, click on it and click OK to install openair.
5. Repeat steps 2 and 4 for the following packages (you won't be asked to choose a mirror again): ncdf, GEOmap, fields, latticeExtra, maps, plyr, akima, hexbin, lattice and nlme.
2.3 Install Java
Follow these instructions to download and install Java (needed for RGGRunner).
10. [Figure residue: lower panel "Number of alert threshold exceedences", plotted against station number (ordered by the number of observed alert threshold exceedences), followed by a key to the station numbers that is not recoverable.]
Figure 4.11 Example of a forecast performance metrics graph produced by the forecast index evaluation, grouped by station and filtered by model and pollutant, displaying pollutant O3.
[Figure: Forecast Performance Metrics, airTEXT 2012, ALL STATIONS, MODERATE O3. Panels: Probability of a correct forecast (PCF), Probability of detection (POD), False alarm ratio (FAR), Probability of false detection (POFD), Number of alert threshold exceedences. Horizontal axis: model number (ordered by the number of observed alert threshold exceedences). Key to numbers: 1 FORECAST, 2 to 4 SENSITIVITY runs.]
Figure 4.12 Example of a foreca
11. modelled data set to an existing R workspace, the R workspace must have been created by version 3.0 of the Data Input tool. When adding multiple modelled data sets to a workspace, the modelled data averaging time and statistic must be the same for all modelled data sets. Refer to Table 3.1 for details.
Step 3: Select your modelled data.
[Screenshot: "2 Modelled data" panel with a Modelled data label field, a Select modelled data format option (netCDF with a Choose format list showing AIRSHEDS, or ADMS PST), and Select file / Select directory options, each with a Browse button.]
You need to label the modelled data set being read in to the tool. This label will be used in the subsequent plots to identify the modelled data set. The label should be alphanumeric and not include any special characters such as & or commas. You can select either a single file or a whole directory of data files. The Data Input tool supports the following formats for modelled data:
- netCDF files:
  - AIRSHEDS: gridded modelled data from the PASODOBLE IC AIRSHEDS work package. This option supports output from the PASODOBLE Web Coverage Service (WCS) and output from individual IC AIRSHEDS partners.
  - MACC Ensemble: gridded data from the MACC regional ensemble air quality product.
  - CMAQ: gridded modelled data from the CMAQ model.
- ADMS PST: point receptor output format from the ADMS suite of atmospher
12. occurrence of each point, whereas the conventional scatter plot shows one point per pairwise modelled/observed data point. The data can optionally be plotted on a log10 scale. The black solid line is the 1:1 line. The dotted lines are the factor-of-2 lines. Figure 4.5 presents an example frequency scatter plot and Figure 4.6 presents an example conventional scatter plot.
[Figure: Frequency Scatter Plot, AIRTEXT 2012 VALIDATION, Forecast, ROADSIDE, 8 HOUR ROLLING MEAN O3 (ug/m3). Axes: Observed, Modelled. Colour scale: Counts. Date range 01/01/12 00:00 to 31/12/12 23:00.]
Figure 4.5 Example of a frequency scatter plot from the concentration evaluation part of the Model Evaluation tool, grouped by station and filtered by model and pollutant, displaying single model (Forecast) and single station type (Roadside) for pollutant O3.
[Figure: Scatter Plot, AIRTEXT 2012 VALIDATION, SUBURBAN, DAILY MEAN PM (ug/m3). Series: Forecast, Sensitivity C. Axes: Observed, Modelled. Date range 01/01/12 00:00 to 31/12/12 23:00.]
Figure 4.6 Example of a conventional scatter plot from the concentration evaluation part of the Model Evaluation tool, grouped by station and filtered by model
13. Column header | Description | Allowed values
(header truncated) | of the pollutant as defined in the pollutant data file used in the Data Input tool | n/a
k | Coverage factor k | Refer to the DELTA v3.3 methodology [3]
ur | Measurement uncertainty u_r^{LV} | Refer to the DELTA v3.3 methodology [3]
LV | Limit value or reference value | Refer to the DELTA v3.3 methodology [3]
alpha | Proportion of the measurement uncertainty that is independent of the limit value | Refer to the DELTA v3.3 methodology [3]
target units | Units that apply to the measurement uncertainty | mol/mol, ppb, ppm, ng/m3, ug/m3, mg/m3, g/m3 or kg/m3
target avg time hours | Averaging time in hours that applies to the measurement uncertainty, which should be equal to that applied to the concentration evaluation output (minimum 1 hour) | An integer value, minimum 1
target statistic | Statistic that applies to the measurement uncertainty, which should be equal to the concentration evaluation output | max, mean or rolling mean
Table 3.4 Details of the uncertainties CSV file columns; n/a indicates that values are unrestricted.
Step 8: Forecast Index Evaluation
This part of the tool converts both the observed and modelled concentrations to forecast indices and performs an evaluation based on these forecast indices.
[Screenshot: "7 Forecast index evaluation" panel with a "Perform forecast index evaluation" check box and a "Select csv file containing index scales" option.]
14. the Toolkit installation directory. The basic procedure is the same for each of the above tools:
1. Run the tool as usual from the interface. As well as generating all the usual output, this saves an .ini file containing all the information and settings that you entered in the interface. This .ini file is saved in the output directory with the specified output label.
2. To re-run the tool with the same information and settings from the command line, type the following on the command line or in a batch file:
<Rscript exe path> <R script path> <ini file path>
The text output that usually appears in the output screen of the interface is saved instead in an .out file with the same name as the .ini file.
<Rscript exe path>: This is the pathname of the Rscript.exe file installed with R. This must be used when running R from the command line. A typical path on a Windows PC, where R was installed in C:\Program Files\R\R-2.15.3, would be C:\Program Files\R\R-2.15.3\bin\i386\Rscript.exe.
<R script path>: This is the pathname of the R script for the tool you want to run. For example, for the Data Input tool, if the Toolkit is installed at C:\Myair Toolkit v3.0, then the path would be C:\Myair Toolkit v3.0\RScripts\DSLocal_DataInput.r.
<ini file path>: This is the path of the .ini file that was saved when you ran the tool from the interface. For example, if you chose the output directory C:\Toolkit_output and the output label DELTA
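As an illustration of the batch procedure above, the following Python sketch assembles the three-part command and, optionally, runs it. It is only a sketch under stated assumptions: the function names are hypothetical, the Toolkit path is reconstructed from the garbled text of the guide, and the .ini file name is a made-up placeholder because the guide's own example is cut off. Passing the command as a list avoids having to quote paths that contain spaces.

```python
import subprocess

def build_batch_command(rscript_exe, tool_script, ini_file):
    """Return the argument list: Rscript executable, tool R script, saved .ini file."""
    return [str(rscript_exe), str(tool_script), str(ini_file)]

def run_tool(rscript_exe, tool_script, ini_file):
    """Run one Toolkit tool in batch mode; raises CalledProcessError on failure."""
    command = build_batch_command(rscript_exe, tool_script, ini_file)
    return subprocess.run(command, check=True)

# Example paths (assumptions, adjust to your own installation).
RSCRIPT_EXE = r"C:\Program Files\R\R-2.15.3\bin\i386\Rscript.exe"
TOOL_SCRIPT = r"C:\Myair Toolkit v3.0\RScripts\DSLocal_DataInput.r"
INI_FILE = r"C:\Toolkit_output\my_label.ini"  # hypothetical file name

command = build_batch_command(RSCRIPT_EXE, TOOL_SCRIPT, INI_FILE)
```

Calling run_tool with the same three paths would execute the tool; the text output is then expected in the .out file described above rather than on screen.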
15. to ten reference values may also be plotted on the graphs. To enter more than one reference value, separate the values with a comma, e.g. 50,100.
3. Plot calculated forecast indices: This converts both the modelled and observed data to indices as defined in the index scales file (described in section 3.3, step 8, and detailed in Table 3.5). If required, alert thresholds defined in the alert thresholds file (described in section 3.3, step 8) can be plotted on the graphs as reference lines.
Step 5: Select which openair graph options you require.
[Screenshot: "4 Diagnostic tools" panel, "Select one or more openair diagnostic graph options from the list below", with check boxes for Time variation (always plots raw data), Scatter plot and Time plot.]
There are three options here; for full details on the graphs, refer to section 4.3.
1. Time variation: This plots averages by weekday, month and hour. The settings in the "Data to plot" section do not apply to this graph; it uses the raw data as it was entered into the Data Input tool.
2. Scatter plot: Select which type of plot you require. A frequency scatter plot shows the frequency of occurrence of each point, whereas a conventional scatter plot shows one point per pairwise modelled/observed data point. The scatter plot can be plotted on a log scale and filtered by weekday, month or season. Tip: The frequency scatter plot is better suited to larg
16. [Spreadsheet screenshot: example rows of the forecast index stats CSV file; cell values not recoverable.]
Table 4.5 Example of forecast index evaluation output: forecast index stats CSV file.
forecast alert stats CSV file: This contains, for each station, for each pollutant and for each alert threshold, the number of observed alerts, the 4 event parameters (a, b, c and d), the performance metrics (PCF, POD, FAR, POFD), and also the odds ratio (OR) and the odds ratio skill score (ORSS). Refer to Table 4.6 for an example.
[Spreadsheet screenshot: example rows of the forecast alert stats CSV file, with columns including alert name, pollutant, station, number of observed alerts, a, b, c, d, PCF, POD, FAR, POFD, OR and ORSS; cell values not recoverable.]
17. [Spreadsheet screenshot: example rows of the concentration statistics CSV file; cell values not recoverable.]
Figure 4.8 Examples of concentration evaluation output CSV file, grouped by station, station type and pollutant, and filtered by model and pollutant.
4.2.2 Forecast Index Evaluation Output
The forecast index evaluation part of the Model Evaluation tool produces three types of output; these are described in this section. All three types of graph are based on a forecast index, which is calculated from the modelled and observed data according to the index threshold definitions in the index scales file described in section 3.3, step 8.
4.2.2.1 Forecast index accuracy
This part of the tool assesses the performance of the model's forecast index predictions against forecast indices calculated from observed concentrations. The graph is a stacked bar chart that shows, for each station, the percentage of calculated forecast indices valid for comparison where the modelled index was equal to the observed index (green) and where the modelled index was equal to the observed index plus or minus one band (grey). Only for
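The two percentages shown in the stacked bar chart can be sketched as follows. This is a minimal Python illustration, not the Toolkit's own R code: the function name is hypothetical, and it assumes that a pair of indices is "valid for comparison" when both the modelled and the observed index are present (represented here as not None).

```python
def index_accuracy(modelled, observed):
    """Percentages of valid index pairs that match exactly or differ by one band."""
    # Assumption: a pair is valid only when both indices are available.
    pairs = [(m, o) for m, o in zip(modelled, observed)
             if m is not None and o is not None]
    n_valid = len(pairs)
    if n_valid == 0:
        return None  # nothing to compare for this station
    n_equal = sum(1 for m, o in pairs if m == o)
    n_one_band = sum(1 for m, o in pairs if abs(m - o) == 1)
    return {
        "n_valid": n_valid,
        "pct_equal": 100.0 * n_equal / n_valid,        # green part of the bar
        "pct_one_band": 100.0 * n_one_band / n_valid,  # grey part of the bar
    }

# Five days of indices at one station; the fourth day has no modelled value.
result = index_accuracy([3, 4, 2, None, 5], [3, 3, 4, 2, 5])
```

In this example four pairs are valid: two match exactly, one differs by one band and one differs by two bands.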
18. [Spreadsheet screenshot: example concentration statistics CSV output (AIRTEXT 2012 Validation), with one row per station or station type for each pollutant and modelled data set. Visible column headings include pollutant, output averaging time and statistic, model, number of valid values, observed and modelled mean, SDO, SDM, MB, NMSE, R, Fac2, and observed and modelled maximum; cell values not recoverable.]
19. [Figure residue: time plot of forecast indices; series: Modelled, Observed; date range 01/01/12 00:00 to 31/12/12 23:00.]
Figure 4.19 Example of a time plot from the Model Diagnostics tool (forecast indices).
5 File formats
5.1 CSV modelled and observed data
[Screenshot: an observed data CSV file opened in a spreadsheet.]
Figure 5.1 Example of an observed data set in CSV file format.
Restrictions:
- Include no more than one column per pollutant to be analysed.
- The date must be given by the year, month, day and hour.
- The station column is optional. If it is present, the header should read "station"; if it is missing, as in the screenshot above, then all the data in that file is assigned to a station with the name of the filename, e.g. BG1.csv would be assigned to station BG1.
- Modelled data must contain data for each pollutant to be analysed. Note that if multiple files are used, that means each pollutant to be analysed must be present in at least one file. Observed data does not have to include data for all required pollutants.
6 Batch mode facility
The Toolkit is supplied with R script versions of the Data Input, Model Evaluation and Model Diagnostics tools, which can be run from the command line, making it possible to automate tasks, for example in a batch file. These R script files can be found in the Rscripts sub-directory of
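The station-naming rule from the restrictions in section 5.1 can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the Toolkit's implementation: the function name and the example header names other than "station" are hypothetical.

```python
from pathlib import PurePath

def default_station(filename, header):
    """Return the station implied by the file name, or None if a station column exists."""
    if "station" in header:
        return None  # station names are read from the column instead
    # No station column: the whole file belongs to the station named after the file.
    return PurePath(filename).stem

# Hypothetical header without a station column, as in Figure 5.1.
station = default_station("BG1.csv", ["year", "month", "day", "hour", "no2"])
```

With a header that includes "station", the function returns None to signal that the per-row values should be used.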
20. Myair Toolkit for Model Evaluation User Guide, Version 3.0, June 2013.
[Cover page: Myair and CERC logos, with a collage of example output graphs: Forecast Index Accuracy (airTEXT 2012, ALL STATION TYPES, PM10), Forecast Performance Metrics (airTEXT 2012, ALL STATIONS, MODERATE O3), Box and Whisker plot (AIRTEXT 2012 VALIDATION, ALL STATIONS, 8 HOUR ROLLING MEAN O3), Target plot DELTA 3.3 (airTEXT 2012, ALL STATIONS, DAILY MEAN PM10) and a scatter plot (ALL STATIONS, DAILY MEAN PM).]
21. Column header | Description | Allowed values
pollutant | Name of the pollutant to be used in all output | n/a
output units | Concentration units to be used in all output | mol/mol, ppb, ppm, ng/m3, ug/m3, mg/m3, g/m3 or kg/m3
conv ugm3 ppb | Conversion factor from ug/m3 to ppb, used for unit conversions | Any numeric value
obs alias | Pollutant name as it appears in the observed data | n/a
obs units | Units that apply to the observed data | mol/mol, ppb, ppm, ng/m3, ug/m3, mg/m3, g/m3 or kg/m3
obs avg time hours | Averaging time in hours of the observed data (minimum 1 hour) | An integer value, minimum 1
obs statistic | Statistic that applies to the observed data | max, mean or rolling mean
mod alias | Pollutant name as it appears in the modelled data | n/a
mod units | Units that apply to the modelled data | mol/mol, ppb, ppm, ng/m3, ug/m3, mg/m3, g/m3 or kg/m3
mod avg time hours | Averaging time in hours of the modelled data (minimum 1 hour) | An integer value, minimum 1
mod statistic | Statistic that applies to the modelled data | max, mean or rolling mean
Table 3.1 Details of the pollutant data CSV file columns; n/a indicates that values are unrestricted.
Step 7: Select output directory.
[Screenshot: "6 Output" panel with a "Select output directory" option and a Browse button.]
22. all stations together on the same graph. If data for an individual station is required, refer to section 3.4 for the Model Diagnostics tool.
Filtering describes how the data is to be split across plots, either as separate panelled plots on one page or as additional pages for large plots. The default option is to group by station and filter by pollutant and model. Data must always be either grouped by or filtered by model (or modelled data set), because the data sets must always be evaluated separately against the observed data.
Note that there are limitations on the grouping and filtering variables available for each plot option, appropriate to what each plot presents. For example, the forecast index evaluation plots present normalised variables, so all options are available, with the only constraint being that the data must be either grouped or filtered by model. For concentration evaluation plots, by contrast, the data must be grouped or filtered by both pollutant and model. Table 3.2 presents the available paired options for grouping and filtering for each plot. Refer to sections 4.2.1 and 4.2.2 for details about these graphs.
Plot | Allowed Group | Allowed Filter
Target plot | Station | Model and Pollutant
Target plot | Station | Model and Pollutant and Station type
Target plot | Station type | Model and Pollutant
Target plot | Pollutant | Model
Target plot | Pollutant | Model and Station type
Target plot | Mod
23. aluation part of the Model Evaluation tool, grouped by station and filtered by model and pollutant, for a single model (run a).
4.2.1.2 Target plot DELTA 3.3
The Target plots produced by the concentration evaluation part of the Model Evaluation tool are similar to the plot produced by the FAIRMODE DELTA tool [1]. This section describes output in line with version 3.3 of the DELTA tool [3].
For DELTA version 3.3, the metrics calculated by the tool for each monitoring station and pollutant, and shown on the target plot, are:
Centralised root mean square error (CRMSE), over the N paired modelled (M_i) and observed (O_i) values:
\mathrm{CRMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left[ (M_i - \bar{M}) - (O_i - \bar{O}) \right]^2}
where \bar{M} is the mean modelled concentration and \bar{O} is the mean observed concentration.
Mean bias (BIAS):
\mathrm{BIAS} = \bar{M} - \bar{O}
Root mean square of the observation (measurement) uncertainty (RMS_U):
\mathrm{RMS}_U = k \, u_r^{LV} \sqrt{(1 - \alpha)\left(\bar{O}^2 + \sigma_O^2\right) + \alpha \, LV^2}
where \sigma_O is the standard deviation of the observed data, LV is the limit value of interest, and the other coefficients k, u_r^{LV} and \alpha are derived for a specific pollutant from measurement data, as described in the FAIRMODE DELTA v3.3 methodology [3] and associated papers [4, 5]. An example uncertainties file is provided in the data examples, using the values reported by DELTA v3.3:
Pollutant | k | u_r^{LV} | \alpha | LV (ug/m3)
NO2 | 2 | 0.120 | 0.020 | 200
O3 | 1.4 | 0.090 | 0.620 | 120
PM10 | 2 | 0.138 | 0.027 | 50
Target (T):
T = \frac{\sqrt{\mathrm{BIAS}^2 + \mathrm{CRMSE}^2}}{2 \, \mathrm{RMS}_U}
The target plot shows \mathrm{BIAS} / (2 \, \mathrm{RMS}_U) against \mathrm{CRMSE} / (2 \, \mathrm{RMS}_U), where
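The DELTA 3.3 target metrics described in this section can be sketched in Python for a single station and pollutant. This is an illustration under stated assumptions, not the Toolkit's R implementation: the function name is hypothetical, the population standard deviation is assumed for the observed data, the form of the uncertainty term follows the formula as reconstructed from the garbled source, and the sample concentrations are invented for the example. The coefficient values are the NO2 values quoted in the section.

```python
import math
from statistics import fmean, pstdev

def target_metrics(modelled, observed, k, u_r, alpha, lv):
    """Compute BIAS, CRMSE, RMS_U and the target indicator T for paired data."""
    m_bar = fmean(modelled)
    o_bar = fmean(observed)
    bias = m_bar - o_bar
    # Centralised RMSE: compare the deviations from each data set's own mean.
    squared = [((m - m_bar) - (o - o_bar)) ** 2 for m, o in zip(modelled, observed)]
    crmse = math.sqrt(fmean(squared))
    # Assumption: population standard deviation of the observations.
    sigma_o = pstdev(observed)
    rms_u = k * u_r * math.sqrt((1 - alpha) * (o_bar ** 2 + sigma_o ** 2)
                                + alpha * lv ** 2)
    target = math.sqrt(bias ** 2 + crmse ** 2) / (2 * rms_u)
    return {"BIAS": bias, "CRMSE": crmse, "RMS_U": rms_u, "T": target}

# Invented hourly NO2 values (ug/m3) with the NO2 coefficients quoted above.
metrics = target_metrics(modelled=[45.0, 50.0, 70.0], observed=[40.0, 50.0, 60.0],
                         k=2, u_r=0.120, alpha=0.020, lv=200)
```

Because BIAS squared plus CRMSE squared equals the mean squared error, T is the root mean square error divided by twice the measurement uncertainty; a value below 1 lies inside the black circle of the target plot.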
24. amework for R scripts. BMC Bioinformatics, Vol. 10, p. 1471-2105. http://www.biomedcentral.com/1471-2105/10/74
RGG project website: http://rgg.r-forge.r-project.org/index.html
25. ast alerts, Performance metrics. Check the boxes for the graph output you require.
[Screenshot: index scales CSV file opened in a spreadsheet, with columns pollutant, index units, index avg time hours, index statistic, index daily max and the threshold columns, and one row each for no2, o3, pm10 and pm25; threshold values not recoverable.]
Figure 3.6 Example of an index scales file.
Column header | Description | Allowed values
pollutant | Name of the pollutant as defined in the pollutant data file used in the Data Input tool | n/a
index units | Units in which the index threshold concentrations are given | mol/mol, ppb, ppm, ng/m3, ug/m3, mg/m3, g/m3 or kg/m3
index avg time hours | Averaging time in hours that applies to the index threshold concentrations (minimum 1 hour) | An integer value, minimum 1
index statistic | Statistic that applies to the index threshold concentrations | max, mean or rolling mean
index daily max | Is the forecast given as the daily maximum of the calculated indices | yes or no
i1 | Threshold concentration for index level 1 | Any numeric value
i2 | Threshold concentration for index level 2 | Any numeric value
i n | Threshold concentration f | Any numeric value
26. ataSamples sub-directory. Here you will find sample files in the formats recognised by the Toolkit and a ReadMe.txt file describing each file.
Step 1: In RGGRunner, select the Data Input tool.
[Screenshot: RGGRunner showing the Myair Toolkit for Model Evaluation, Part 2: Data Input, Version 3.0; the remaining screenshot text is not recoverable.]
Step 2: Select whether you wish to create a new workspace or add modelled data.
[Screenshot: "1 Workspace" panel, "Select to either create a new workspace or to add more modelled data to a workspace previously created by this tool", with the options "Create new workspace from modelled data, observed data, station data and pollutants data" and "Add modelled data to existing workspace".]
You can select to create a new R workspace or to add a modelled data set to an existing R workspace. Note that when creating a new workspace, the date range of the modelled data set should cover the start and end time you wish to evaluate. When adding a
27. e value of T, the better the concentration prediction for that station. The black circle represents T = 1, the dark green circle represents T = 0.8, the light green circle represents T = 0.65 and the dotted line represents T = 0.3. In DELTA v1.2, 0.8 is the criteria value, 0.65 is the goal value and 0.3 is the uncertainty. Figure 4.1 shows an example of target plot (DELTA 1.2) output from the concentration evaluation part of the Model Evaluation tool (all pollutants, all station types).
    [Figure: Target plot, airText 2012 run a, ALL STATIONS, ALL POLLUTANTS, from 01/01/12 00:00 to 31/12/12 23:00, with panels for daily mean PM2.5, hourly mean NO2, 8 hour rolling mean O3 and daily mean PM10, and a key to the station numbers.]
    Figure 4.1: Example of a target plot (DELTA 1.2) from the concentration ev
28. e alerts and any tendency towards false alarms.
    Probability of a correct forecast: $\mathrm{PCF} = \frac{a + d}{a + b + c + d}$
    Probability of detection: $\mathrm{POD} = \frac{a}{a + c}$
    False alarm ratio: $\mathrm{FAR} = \frac{b}{a + b}$
    Probability of false detection: $\mathrm{POFD} = \frac{b}{b + d}$
    Each of these metrics lies in the range 0 to 1. A good score for PCF and POD is 1; a good score for FAR and POFD is 0.
    The graphical output from the performance metrics option is a series of five graphs: the four metrics described above and a graph showing the number of observed and modelled alert threshold exceedences for each station. Again, the stations are sorted by the number of observed alert threshold exceedences, and a key to the stations is shown below the graphs. Figure 4.11 and Figure 4.12 show examples of this output, grouped and filtered by different variables.
    [Figure: Forecast Performance Metrics, airTEXT 2012, ALL STATIONS, MODERATE, O3, with panels for PCF, POD, FAR and POFD plotted against station number.]
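The four performance metrics can be sketched in a few lines of Python. This is a minimal illustration rather than the Toolkit's own code: it assumes the standard contingency-table definitions, with a, b, c and d being the counts of alerts that were observed and modelled, modelled but not observed, observed but not modelled, and neither modelled nor observed. The function name and the choice to return `None` for a zero denominator are assumptions made for this example.

```python
def alert_performance_metrics(a, b, c, d):
    """Return PCF, POD, FAR and POFD from the four contingency counts.

    a: alerts observed and modelled
    b: alerts modelled but not observed
    c: alerts observed but not modelled
    d: alerts neither modelled nor observed
    A metric with a zero denominator is returned as None
    (an assumption of this sketch, not documented Toolkit behaviour).
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "PCF": ratio(a + d, a + b + c + d),   # probability of a correct forecast
        "POD": ratio(a, a + c),               # probability of detection
        "FAR": ratio(b, a + b),               # false alarm ratio
        "POFD": ratio(b, b + d),              # probability of false detection
    }


# Hypothetical counts for one station over one year of daily forecasts.
metrics = alert_performance_metrics(a=30, b=10, c=5, d=320)
```

With these hypothetical counts a quarter of the issued alerts are false alarms (FAR = 10/40), while most observed alerts are detected (POD = 30/35).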
29. e correct prediction of an alert and to the correct non-prediction of a non-alert. If no alerts are observed or no alerts are forecast then ORSS is invalid. The graph shows ORSS for each station, where the stations are ordered by the number of observed alert threshold exceedences. The number of observed and modelled alerts for each station is also plotted in blue (right-hand y-axis). A key to the stations is shown below the graph. Figure 4.10 shows an example of an ORSS graph.
    [Figure: Odds Ratio Skill Score (ORSS), airTEXT 2012, Forecast, ALL STATIONS, MODERATE, ALL POLLUTANTS. No bars are shown where ORSS is invalid (the number of observed or modelled alert threshold exceedences is zero). Left-hand y-axis: Odds Ratio Skill Score, from 1 (Good) through 0 (Random) to -1 (Bad); right-hand y-axis: number of alert threshold exceedences; x-axis: pollutant number, ordered by the number of observed alert threshold exceedences.]
    Figure 4.10: Example of an ORSS graph produced by the forecast index evaluation, grouped by pollutant and filtered by model, displaying model Forecast.
    2. Performance metrics
    In an operational pollution forecasting system it is important not to issue alerts when there should not be an alert, but it is arguably more important to accurately issue an alert when an alert should be issued. The following performance metrics give information about the skill of a model in terms of its ability to issue accurat
30. ecast periods for which both modelled and observed indices can be calculated are included in the assessment. The stations are sorted by the number of indices valid for comparison, which is also shown on the chart by the blue circles and the right-hand y-axis. A key to the stations is given below the graph. Refer to Figure 4.9 for an example.
    [Figure: Forecast Index Accuracy, airTEXT 2012, Forecast, ALL STATIONS, NO2. Percentage of forecast indices valid for comparison where the modelled index is either equal to or 1 band apart from the observed index.]
    Figure 4.9: Example of the forecast index accuracy graph, grouped by station and filtered by model and pollutant, displaying model Forecast and pollutant NO2.
    4.2.2.2 Forecast alert accuracy
    It is usual in operational pollution forecasting to use pollution bandings to help communicate pollution levels to the public. For example, a common set of bandings for a 1 to 10 forecast index scale is shown in Table 4.2.
    Band | Forecast index range | Alert threshold
    LOW | 1 to 3 | n/a
    MODERATE | 4 to 6 | 4
    HIGH | 7 to 9 | 7
    VERY HIGH | 10 | 10
    Table 4.2: Example set
31. ed or conventional scatter plot, and a quantile-quantile plot. Check the boxes of the graphs you require. Refer to section 4.2.1 for details about these graphs.
    If you have selected the target plot from DELTA v3.3, select your uncertainties file. Your uncertainties file should be a comma-separated CSV file containing one row per pollutant. For each pollutant, the coefficients of the measurement uncertainty given by the FAIRMODE DELTA v3.3 methodology [3] must be specified. Table 3.4 describes these parameters.
    If you have selected the box plot option, check the Show outliers option if you want outliers to be plotted on the graph.
    If you have selected the scatter plot option, select which type of plot you require: a frequency scatter plot or a conventional scatter plot. Tip: The frequency scatter plot is better suited to larger sets of data, the conventional scatter plot to smaller sets.
    If you have selected any of the box plot, scatter plot or quantile-quantile plot options, check the Log scale option if you wish the numerical axes of the plot to be scaled to log10.
    Figure 3.5: Example of an uncertainties file.
    Column header | Description | Allowed values
    pollutant | Name
32. Figure 3.1: The Myair Toolkit for Model Evaluation.
    3.1 Questionnaire tool
    The questionnaire is a screening tool which asks you questions about the data you want to evaluate and offers structured advice based on your answers. Answer each question in turn, then click the Play button.
    3.2 Data Input tool
    The Data Input tool processes your modelled and observed data, saving it in an R workspace file which can later be imported into the Model Evaluation and Model Diagnostics tools.
    The tool supports both gridded and point modelled data. Gridded data are interpolated to the monitoring station locations. In situ observed data can either be automatically downloaded from the internet and imported (UK only) or input using simple format CSV files.
    The Toolkit supports the evaluation of multiple modelled data sets, either from multiple models or from multiple runs of the same model over the same time period. Initially, a new R workspace is created with one modelled data set and an associated observed data set. Further modelled data sets can then be added to the saved R workspace. There is no limit to the number of models that the Data Input tool will load into a saved R workspace, but memory issues may be encountered if the number of models is large.
    Tip: In the Myair Toolkit for Model Evaluation installation directory you will find a D
33. el Pollutant Model Pollutant and Station type
    Box and whisker plot: Station Model and Pollutant Station Model and Pollutant and Station type Station type Model and Pollutant Model Pollutant and Station type
    Scatter plot: Station Model and Pollutant Station Model and Pollutant and Station type
    Quantile-quantile plot: Station Model and Pollutant Station Model and Pollutant and Station type
    Forecast index plots: Station Model Station Model and Pollutant Station Model and Station type Station Model and Pollutant and Station type Station type Model Station type Model and Pollutant Pollutant Model Pollutant Model and Station type Model Model Pollutant Model Station type Model Pollutant and Station type
    Table 3.2: Grouping and filtering options for the Model Evaluation tool.
    Concentration Evaluation
    This part of the tool compares the modelled concentrations to observed concentrations.
    [Screenshot: form section 6, Concentration evaluation, with fields for the output averaging times CSV file and the data capture threshold, and the graph output options Target plot (DELTA 1.2), Target plot (DELTA 3.3), Box and whisker plot, Scatter plot and Quantile-quantile plot.]
    Check the box to perform the concentration evaluation.
    Select your output averaging times file. Your output averaging times file should be a comma-separated CSV file containing one row pe
34. entrations. Figure 4.18 and Figure 4.19 show examples of a time plot for different data to plot.
    [Figure: Time variation analysis, Raw Data, Forecast, Station RB1, Pollutant NO2 (ug/m3), 01/01/12 00:00 to 31/12/12 23:00; modelled and observed concentrations by hour for each day of the week, and by hour, month and weekday.]
    Figure 4.13: Example of a time variation plot from the Model Diagnostics tool.
    [Figure: Frequency Scatter Plot, Raw Data, Forecast, RB1, NO2 (ug/m3), No Filtering; modelled against observed, date range 01/01/12 00:00 to 31/12/12 23:00.]
    Figure 4.14: Example of a frequency scatter plot from the Model Diagnostics tool (raw data with no filtering).
    [Figure: Frequency Scatter Plot, Raw Data, Forecast, RB1, NO2 (ug/m3), Filtered by Season.]
35. er sets of data, the conventional scatter plot to smaller sets.
    3. Time plot: This is a time series plot of both modelled and observed data.
    Select output directory
    [Screenshot: form section 5, Output, with fields for the output directory, the label to prefix output file names, the graph output format and the image file size.]
    Browse to select the directory where you want the output files to be saved.
    Enter label to prefix output file names
    The text you enter here will be used to identify your output.
    Select graph output format
    There are three options: JPG, PNG and PDF. The first two options produce image files that can be imported into other documents. The PDF option produces PDF files.
    Select image file size
    There are three options: Large (A4), Medium and Small. These options only apply to the image output formats (JPG or PNG); PDF output is always produced on A4. Reducing the overall image size will increase the proportional size of text. These options provide flexibility to produce graphs for reports or presentations.
    Click the green Play button. Refer to section 4.3 for details of the output from the Model Diagnostics tool.
    4 Output
    4.1 Data Input Tool Output
    4.1.1 R Workspace
    This is an R workspace containing all the data imported by the Data Input tool, processed and ready for import into the Model Evaluation and Model Diagnostics tools. This workspace can also be loaded int
36. The statistics that are output are given in Table 4.1. Where "obs" or "mod" occur in variable names, these indicate observed or modelled values respectively.
    Name | Description | Equation
    Num valid values | Number of values |
    obs mean, mod mean | Mean | $\overline{C_o}$, $\overline{C_m}$
    SDO, SDM | Standard deviation | $\sigma_o$, $\sigma_m$
    MB | Mean bias | $\overline{C_m} - \overline{C_o}$
    NMSE | Normalised mean square error | $\overline{(C_m - C_o)^2} / (\overline{C_m}\,\overline{C_o})$
    R | Pearson's correlation coefficient | $\mathrm{cov}(C_m, C_o) / (\sigma_m \sigma_o)$
    Fac2 | Factor of 2 | Fraction of data where $0.5 \le C_m / C_o \le 2$; when $C_o = 0$, $C_m / C_o \to \infty$ and the data pair is not counted
    Fb | Fractional bias | $(\overline{C_m} - \overline{C_o}) / (0.5\,(\overline{C_m} + \overline{C_o}))$
    Fs | Fractional standard deviation | $(\sigma_m - \sigma_o) / (0.5\,(\sigma_m + \sigma_o))$
    obs max, mod max | Maximum | $\max C_o$, $\max C_m$
    obs RHC, mod RHC | Robust highest concentration | $\chi(n) + (\overline{\chi} - \chi(n)) \ln\left(\frac{3n - 1}{2}\right)$, where n is the number of values used to characterise the upper end of the concentration distribution, $\overline{\chi}$ is the average of the n - 1 largest values and $\chi(n)$ is the nth largest value; n is taken to be 26
    Table 4.1: Details of the statistics output by the concentration evaluation part of the Model Evaluation tool.
    In addition, some statistics for each station that are presented on the target plots and the box plots are also output in the CSV file for each station:
    • Lower whisker (observed)
    • Lower quartile (observed)
    • Median (observed)
    • Upper quartile (observed)
    • Upper whisker (observed)
    • Lower
37. gnostic graphs of modelled and observed concentration data. Please complete each part of the form and then press the green Run button above.
    Step 2: Browse to select the workspace previously saved by the Data Input tool.
    [Screenshot: form section 1, Workspace saved by Data Input tool.]
    Step 3: Enter station, pollutant and modelled data label.
    [Screenshot: form section 2, Station, pollutant and model, with fields for the station name, the pollutant name and the modelled data label.]
    Step 4: Select which data to plot.
    [Screenshot: form section 3, Data to plot, with the options "Plot raw concentration data" (with optional reference values), "Plot calculated statistics" and "Plot calculated forecast indices".]
    There are three options here:
    1. Plot raw concentration data: This plots the data as it was entered, in the averaging time of the modelled data. If required, one to ten reference values may also be plotted on the graph. To enter more than one reference value, separate the values with a comma, e.g. 50,100.
    2. Plot calculated statistics: This plots the selected statistic from a choice of the following: 8 hour rolling mean, 8 hour mean, daily mean and daily maximum. If required one
38. he name of this station type in the box provided.
    Tip: If you don't know which station types are available, type any text into the box. When you run the tool, it will declare that no data are available for that station type and give you a list of the available station types.
    Step 5: Select which pollutants to evaluate.
    [Screenshot: form section 4, Pollutants, with the options "All pollutants" and "Single pollutant".]
    You can either evaluate all pollutants or just one pollutant. If you only want to evaluate one pollutant, enter the name of this pollutant in the box provided.
    Tip: If you don't know which pollutants are available, type any text into the box. When you run the tool, it will declare that no data are available for that pollutant and give you a list of the available pollutants.
    Step 6: Select how to group and filter the data.
    [Screenshot: form section 5, Grouping and filtering, with options for how to group the data and how to filter the data.]
    Grouping describes what each data point on a plot represents. The Model Evaluation tool supports grouping the data by:
    • Station
    • Station type
    • Pollutant
    • Model (or modelled data set)
    Note that for the frequency scatter plot, conventional scatter plot and QQ plot options, the option to group by station should be chosen. In these specific cases the data are plotted for
39. [Table: example rows from a _forecast_index_data.csv file for pollutant no2.]
    Table 4.4: Example of forecast index evaluation output (_forecast_index_data.csv file).
    _forecast_index_stats.csv
    This contains, for each pollutant and station, the number of valid calculated index values, the percentage of indices where the modelled index was correct and the percentage of indices where the modelled index was one band either above or below the correct index. For an example refer to Table 4.5.
    [Table: example of a _forecast_index_stats.csv file opened in Microsoft Excel, with columns for station, pollutant, number of valid indices, percentage of indices the same and percentage of indices one apart.]
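The contents of the _forecast_index_stats.csv file can be illustrated with a short Python sketch. It is not the Toolkit's implementation: the function name and dictionary keys are hypothetical, missing indices are represented by `None`, and "one band apart" is assumed here to mean that the modelled and observed index values differ by exactly 1.

```python
def forecast_index_accuracy(modelled_indices, observed_indices):
    """Summarise how often modelled forecast indices match observed ones.

    Only periods where both indices are available (not None) are counted.
    Returns None if there are no valid pairs.
    """
    pairs = [
        (m, o)
        for m, o in zip(modelled_indices, observed_indices)
        if m is not None and o is not None
    ]
    if not pairs:
        return None
    same = sum(1 for m, o in pairs if m == o)
    one_apart = sum(1 for m, o in pairs if abs(m - o) == 1)
    return {
        "num_valid_indices": len(pairs),
        "perc_indices_same": 100.0 * same / len(pairs),
        "perc_indices_one_apart": 100.0 * one_apart / len(pairs),
    }


# Hypothetical daily indices for one station and pollutant.
accuracy = forecast_index_accuracy([3, 4, None, 2, 7], [3, 5, 2, None, 4])
```

In this hypothetical series three days have both indices available: one matches exactly, one is out by a single index level and one is out by three.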
40. ic dispersion models
    • CSV: standard text file
      o Refer to section 5.1 for details of the required format
      o Select the separator used in your CSV files, either comma or semicolon
      o Enter the missing data indicator used in your CSV files, e.g. 999 or NA
    Step 4: If creating a new workspace, select your required observed data option.
    [Screenshot: form section 3, Observed data, with the format options CSV (single file or directory, separator and missing data indicator), London KCL and UK Automatic Urban and Rural Network (AURN).]
    The Data Input tool supports the following formats for observed data:
    • CSV: standard text file
      o Refer to section 5.1 for details of the required format
      o Select the separator used in your CSV files, either comma or semicolon
      o Enter the missing data indicator used in your CSV files, e.g. 999 or NA
      o Select either a single CSV file or a whole directory of CSV files
    • London KCL: This option requires internet access
      o King's College London (KCL) maintain a network of 123 monitoring sites in Greater London
      o Data for all required stations and pollutants will be downloaded and imported for a time period to match the modelled data
    • UK Automatic Urban and Rural Network (AURN): This option requires internet access
      o The UK AURN is the national network of automatic monitoring stations around the UK
      o Data for all requi
41. le containing pollutant information (click Browse to open the file dialog).
    Your pollutant data file should be a CSV file with a comma separator, containing a list of all the pollutants for which you wish to process data. For each pollutant, a set of parameters must be set. It is very important that these parameters are set correctly for your data. Refer to Table 3.1 for details of the information required.
    Figure 3.3: Example of a pollutant data file.
    Notes:
    • The obs alias and mod alias settings will be the same as pollutant unless the pollutant name within the observations or modelled data is different to pollutant, in which case use the aliases to make sure the correct fields are extracted from the data.
    • The observed data averaging time and statistic must either be the same as for the modelled data or be a 1 hour mean, in which case the observed data will be recalculated to the same averaging time and statistic as the modelled data.
    • When adding multiple modelled data sets to a workspace, the modelled data averaging time and statistic must be the same for all modelled data sets.
    Column he
42. [Screenshot: forecast index evaluation section of the form, with fields for the index scales CSV file, the alert threshold levels CSV file and the data capture threshold, and the chart output options Forecast index accuracy, Forecast alerts: Odds ratio skill score (ORSS) and Forecast alerts: Performance metrics.]
    Check the box to perform the forecast index evaluation.
    Select your index scales file. Your index scales file should be a comma-separated CSV file containing one row per pollutant. For each pollutant, index-related parameters must be set. Table 3.5 describes these parameters. A minimum of 1 index threshold concentration must be defined.
    Select your alert thresholds file. Your alert thresholds file should be a comma-separated CSV file containing one row per alert. Each alert should have a name and an alert threshold. Table 3.6 describes these parameters.
    Enter a data capture threshold to apply to the forecast index averaging process. For example, for an 8 hour rolling mean, a 75% data capture threshold means that at least 6 hours of data must be valid in each period for the averaged data to be valid and used in the evaluation. The threshold will also be applied over each day if the daily maximum is set to yes in your index scales file.
    Select the graphs you require. There are three graph options; for details about these graphs refer to section 4.2.2:
    1. Forecast index accuracy
    2. Forecast alerts: Odds ratio skill score (ORSS)
    3. Forec
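The data capture rule for a rolling mean can be illustrated with a small Python sketch. It is an illustration under stated assumptions, not the Toolkit's code: the function name is hypothetical, missing hours are represented by `None`, each rolling period is taken to end at the hour it is assigned to, and the minimum number of valid hours is the threshold percentage of the window length rounded up.

```python
def rolling_mean_with_data_capture(hourly, window_hours=8, capture_percent=75):
    """Rolling mean over `window_hours`, applying a data capture threshold.

    Entry i of the result is the mean of the valid values in the window
    ending at hour i, or None if the window is incomplete or too few
    values are valid.
    """
    # Smallest whole number of valid hours meeting the threshold (integer ceiling).
    needed = -(-capture_percent * window_hours // 100)
    result = []
    for end in range(len(hourly)):
        if end + 1 < window_hours:
            result.append(None)  # not enough hours yet to form a full window
            continue
        window = hourly[end + 1 - window_hours:end + 1]
        valid = [value for value in window if value is not None]
        result.append(sum(valid) / len(valid) if len(valid) >= needed else None)
    return result


# Hypothetical hourly concentrations with three missing hours.
series = [10, 12, None, 14, 16, None, 18, 20, None]
rolling = rolling_mean_with_data_capture(series)
```

For an 8 hour window and a 75% threshold, `needed` is 6. The window ending at the eighth hour has six valid values and gives a mean of 15.0; the window ending at the ninth hour has only five valid values and is rejected.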
43. [Table: example of the concentration statistics CSV output opened in Microsoft Excel (airTEXT 2012 target conc stats), with one row per pollutant statistic and modelled data set, for example hourly mean, 8 hour rolling mean and daily mean for the data sets Forecast and Sensitivity C.]
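Some of the statistics in the concentration statistics file can be sketched in Python as follows. This is a hedged illustration, not the Toolkit's code: the function name is hypothetical, the sign convention (modelled minus observed) is an assumption, and pairs with a zero observed value are assumed to be left out of both the numerator and the denominator of Fac2.

```python
def concentration_statistics(modelled, observed):
    """Return MB, NMSE, Fb and Fac2 for paired modelled/observed values."""
    n = len(observed)
    mean_m = sum(modelled) / n
    mean_o = sum(observed) / n
    # Mean bias: assumed here to be modelled minus observed.
    mb = mean_m - mean_o
    # Normalised mean square error.
    nmse = (sum((m - o) ** 2 for m, o in zip(modelled, observed)) / n) / (mean_m * mean_o)
    # Fractional bias.
    fb = mb / (0.5 * (mean_m + mean_o))
    # Factor of 2: pairs with a zero observed value are not counted.
    counted = [(m, o) for m, o in zip(modelled, observed) if o != 0]
    fac2 = sum(1 for m, o in counted if 0.5 <= m / o <= 2) / len(counted)
    return {"MB": mb, "NMSE": nmse, "Fb": fb, "Fac2": fac2}


# Hypothetical paired concentrations; the third pair has a zero observation.
stats = concentration_statistics([2, 4, 6, 30], [1, 4, 0, 10])
```

In this hypothetical data set the model overpredicts on average (MB = 6.75), and two of the three counted pairs lie within a factor of 2.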
44. [Figure: target plot (DELTA 3.3) with a key to the station numbers.]
    Figure 4.2: Example of a target plot (DELTA 3.3) from the concentration evaluation part of the Model Evaluation tool, grouped by station and filtered by model and pollutant, for a single model (run a).
    [Figure: Target plot (DELTA 3.3), airTEXT 2012, ALL STATIONS, ALL POLLUTANTS, from 01/01/12 00:00 to 31/12/12 23:00, with panels for Sensitivity C, Sensitivity G and Sensitivity J, and points for 8 hour rolling mean O3, daily mean PM10 and hourly mean NO2.]
    Figure 4.3: Example of a target plot (DELTA 3.3) from the concentration evaluation part of the Model Evaluation tool, grouped by pollutant and filtered by model.
    4.2.1.3 Box and Whisker plot
    The Box and Whisker plot shows 5 pieces of information for each station, for each of the modelled and observed datasets:
    • The lower whisker
    • The 25th percentile (the lower quartile): the lower end of the box
    • The 50th perce
45. [Screenshot: the Model Evaluation tool form in RGGRunner.]
    Step 2: Browse to select the workspace previously saved by the Data Input tool.
    [Screenshot: form section 1, Workspace saved by Data Input tool.]
    Step 3: Select which modelled data sets to evaluate.
    [Screenshot: form section 2, Models, with the options "All modelled data" and "Single modelled data set".]
    You can either evaluate all modelled data sets or just one modelled data set. If you only want to evaluate one modelled data set, enter its label (as specified in the Data Input tool) in the box provided.
    Tip: If you don't know the labels of the modelled data sets that are available, type any text into the box. When you run the tool, it will declare that no data are available for that modelled data set and give you a list of the available labels.
    Step 4: Select which station types to evaluate.
    [Screenshot: form section 3, Stations, with the options "All station types" and "Single station type".]
    You can either evaluate all station types or just stations of one type. If you only want to evaluate stations of one type, enter t
46. ntile (the median): the horizontal line inside the box
    • The 75th percentile (the upper quartile): the upper end of the box
    • The upper whisker
    The inter-quartile range (IQR) is defined as the 75th percentile minus the 25th percentile, i.e. the length of the box. The lower whisker is defined as the lowest concentration value still within 1.5 × IQR of the lower quartile. The upper whisker is defined as the highest concentration value still within 1.5 × IQR of the upper quartile. Optionally, the outliers lying outside of the upper and lower whiskers can also be plotted, and the plot can be displayed on a log10 scale. An example box and whisker plot is shown in Figure 4.4.
    [Figure: Box and Whisker plot, AIRTEXT 2012 VALIDATION, ROADSIDE, 8 HOUR ROLLING MEAN O3, with boxes for observed, Forecast, Sensitivity C, Sensitivity G and Sensitivity J at each station; date range 01/01/12 00:00 to 31/12/12 23:00.]
    Figure 4.4: Example of a box and whisker plot from the concentration evaluation part of the Model Evaluation tool, grouped by station and filtered by model and pollutant, displaying a single station type (Roadside) for pollutant O3.
    4.2.1.4 Scatter Plot
    This plot compares the modelled and observed concentrations on a scatter plot. The frequency scatter plot shows the frequency of
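The whisker definitions for the box and whisker plot can be sketched in Python with the standard library. This is an illustration only: the function name is hypothetical, and the quartiles are computed with the "inclusive" method of `statistics.quantiles`, which is an assumption and may differ slightly from the quartile method used by the Toolkit.

```python
import statistics


def box_and_whisker(values):
    """Return the five box-plot values and the outliers for a data set."""
    data = sorted(values)
    lower_quartile, median, upper_quartile = statistics.quantiles(
        data, n=4, method="inclusive"
    )
    iqr = upper_quartile - lower_quartile
    # Whiskers are the most extreme data values still within 1.5 x IQR of the box.
    lower_whisker = min(v for v in data if v >= lower_quartile - 1.5 * iqr)
    upper_whisker = max(v for v in data if v <= upper_quartile + 1.5 * iqr)
    outliers = [v for v in data if v < lower_whisker or v > upper_whisker]
    return {
        "lower_whisker": lower_whisker,
        "lower_quartile": lower_quartile,
        "median": median,
        "upper_quartile": upper_quartile,
        "upper_whisker": upper_whisker,
        "outliers": outliers,
    }


# Hypothetical concentrations with one high outlier.
box = box_and_whisker([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
```

Here the value 50 lies more than 1.5 × IQR above the upper quartile, so the upper whisker stops at 9 and 50 is reported as an outlier.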
47. o R to explore the data further.
    4.1.2 CSV file
    This contains the time series of concentrations, in output units as defined in the pollutants information file, for every station and pollutant for which both monitored and modelled data are available, with the same averaging time and statistic as the modelled data.
    4.2 Model Evaluation Tool Output
    4.2.1 Concentration Evaluation Output
    4.2.1.1 Target plot (DELTA 1.2)
    The target plots produced by the concentration evaluation part of the Model Evaluation tool are similar to the plot produced by the FAIRMODE DELTA tool [1]. These plots can be output as defined by either version 1.2 or version 3.3 of the DELTA tool. This section describes output in line with version 1.2 of the DELTA tool.
    For DELTA version 1.2, the metrics calculated by the tool for each monitoring station and pollutant, and shown on the target plot, are:
    1. Centralised root mean square error (CRMSE): $\mathrm{CRMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left[ (M_i - \overline{M}) - (O_i - \overline{O}) \right]^2}$, where $\overline{M}$ is the mean modelled concentration and $\overline{O}$ is the mean observed concentration
    2. Mean bias: $\mathrm{BIAS} = \overline{M} - \overline{O}$
    3. Standard deviation of the modelled data: $\sigma_{mod}$
    4. Target: $T = \sqrt{\mathrm{BIAS}^2 + \mathrm{CRMSE}^2} / \sigma_{obs}$, where $\sigma_{obs}$ is the standard deviation of the observed data
    The target plot shows $\mathrm{BIAS} / \sigma_{obs}$ against $\pm \mathrm{CRMSE} / \sigma_{obs}$, where the sign of the CRMSE term depends on whether $\sigma_{mod} > \sigma_{obs}$ or $\sigma_{obs} > \sigma_{mod}$. The radial distance to a data point on the target plot is equal to T for that station. The smaller th
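The DELTA 1.2 target metrics can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the Toolkit's or the DELTA tool's code: the function name is hypothetical, BIAS is taken as mean modelled minus mean observed, and the standard deviation of the observations is computed with a divisor of N.

```python
import math


def target_metrics(modelled, observed):
    """Return BIAS, CRMSE, the observed standard deviation and the target T."""
    n = len(observed)
    mean_m = sum(modelled) / n
    mean_o = sum(observed) / n
    bias = mean_m - mean_o
    # Centralised RMSE: RMSE of the series after removing each series' own mean.
    crmse = math.sqrt(
        sum(((m - mean_m) - (o - mean_o)) ** 2 for m, o in zip(modelled, observed)) / n
    )
    # Population standard deviation of the observations (divisor N, an assumption).
    sigma_obs = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    target = math.sqrt(bias ** 2 + crmse ** 2) / sigma_obs
    return {"BIAS": bias, "CRMSE": crmse, "sigma_obs": sigma_obs, "T": target}


# Hypothetical series: the model is always exactly 1 unit too high.
result = target_metrics([2, 4, 6, 8], [1, 3, 5, 7])
```

Because the hypothetical model differs from the observations only by a constant offset, CRMSE is zero and the target reduces to BIAS divided by the observed standard deviation. In general BIAS² + CRMSE² equals the squared root mean square error, which is why T is the radial distance on the plot.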
48. [Screenshot: output section of the Data Input tool form, with fields for the output directory and the label to prefix output file names, and the option "Output raw data as CSV".]
    Browse to select the directory where you want the workspace to be saved.
    Step 8: Enter a label for output files. The workspace will be saved with this name, with extension .RData, and all output files subsequently produced by the Model Evaluation and Model Diagnostics tools using this workspace will also be labelled with this name.
    Check the box if you wish the whole time series of modelled and observed data for every station and pollutant to be output to a CSV file. Warning: this file may be large.
    Step 9: Click the green Play button. Refer to section 4.1 for details about the output from the Data Input tool.
    3.3 Model Evaluation tool
    This section gives a step-by-step guide to using the Model Evaluation tool. For details of each graph type, please refer to section 4.2.
    Tip: In the Myair Toolkit for Model Evaluation installation directory you will find a DataSamples sub-directory. Here you will find sample files in the formats recognised by the Toolkit and a ReadMe.txt file describing each file.
    Step 1: In RGGRunner, select the Model Evaluation tool.
    [Screenshot: the Model Evaluation tool form in RGGRunner (Myair Toolkit for Model Evaluation, Part 3: Model Evaluation, version 3.0).]
49. of bandings with associated alert thresholds.
    Depending on the system, a forecast index in the MODERATE, HIGH or VERY HIGH range may trigger an alert to the public; it is therefore important for system operators to understand whether the system issues these alerts correctly.
    The assessment of forecast alerts is carried out by calculating metrics for each monitoring station, based on considering the exceedence of an alert threshold as an event. The numbers of events observed and modelled, modelled but not observed, observed but not modelled, and not modelled and not observed are summed to get the parameters a, b, c and d respectively. This is summarised in Table 4.3.
    | Event observed: Yes | Event observed: No
    Event modelled: Yes | a | b
    Event modelled: No | c | d
    Table 4.3: Definition of the forecast alert parameters.
    The forecast index evaluation includes two sets of graphical output for the assessment of the accuracy of forecast alerts:
    1. Odds ratio skill score (ORSS)
    The odds ratio skill score (ORSS) is calculated from the alert metrics as follows:
    Odds ratio: $\mathrm{OR} = \frac{a d}{b c}$
    $\mathrm{ORSS} = \frac{\mathrm{OR} - 1}{\mathrm{OR} + 1}$
    A perfect system will have b and c equal to zero, which means $\mathrm{OR} \to \infty$, which means $\mathrm{ORSS} \to 1$. A poor system will have a and d equal to zero, which means OR = 0, which means ORSS = -1. The odds ratio is a good metric for determining if a model is good at correctly issuing and not issuing alerts. It gives equal weighting to th
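The odds ratio skill score can be sketched in Python as follows. This is an illustration, not the Toolkit's code: the function name is hypothetical, and the score is evaluated in the algebraically equivalent form (ad - bc) / (ad + bc) so that the perfect-system case (b or c equal to zero) does not divide by zero. Returning `None` for an invalid or undefined score is an assumption of this sketch.

```python
def odds_ratio_skill_score(a, b, c, d):
    """Return ORSS = (OR - 1) / (OR + 1), with OR = (a * d) / (b * c).

    a: alerts observed and modelled
    b: alerts modelled but not observed
    c: alerts observed but not modelled
    d: alerts neither modelled nor observed
    """
    # ORSS is treated as invalid if no alerts were observed or none were modelled.
    if a + c == 0 or a + b == 0:
        return None
    # (OR - 1) / (OR + 1) rewritten to avoid forming OR explicitly.
    numerator = a * d - b * c
    denominator = a * d + b * c
    if denominator == 0:
        return None  # score is undefined for this combination of counts
    return numerator / denominator


# Hypothetical counts for one station.
orss = odds_ratio_skill_score(a=20, b=5, c=10, d=100)
```

With these hypothetical counts OR = 2000 / 50 = 40, giving ORSS = 39 / 41, close to the "good" end of the scale.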
50. or index level n
    Table 3.5: Details of the index scales CSV file columns; n/a indicates that values are unrestricted.
    Figure 3.7: Example of an alert thresholds file.
    Column header | Description | Allowed values
    index | Threshold index for this alert | Any integer value
    name | The name to give to this alert in all output | n/a
    Table 3.6: Details of the alert thresholds CSV file columns; n/a indicates that values are unrestricted.
    Step 9: Select output directory. Browse to select the directory where you want the output files to be saved.
    [Screenshot: form section 8, Output, with fields for the output directory, the label to prefix output file names, the graph output format and the image file size, and the option "Output processed data and statistics as CSV".]
    Step 10: Enter label to prefix output file names. The text you enter here will be used to identify your output.
    Step 11: Select graph output format. There are three options: JPG, PNG and PDF. The first two options produce image files that can be imported into other documents. One image file is produced for every graph. The PDF option produces PDF files, with one PDF file per graph type.
    Step 12: Select image file size. There are three options: Large (A4), Medium and Small.
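An alert thresholds file of the kind described above could be written with Python's standard `csv` module, as in the sketch below. The column headers `index` and `name` come from the column table; the three alert rows follow the example banding thresholds for a 1 to 10 index scale (4, 7 and 10), and the helper name is hypothetical.

```python
import csv
import io


def write_alert_thresholds(stream, alerts):
    """Write (index, name) alert rows as a comma-separated CSV file."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["index", "name"])  # column headers
    writer.writerows(alerts)


# Example alerts: threshold index and the name used for the alert in output.
alerts = [(4, "moderate"), (7, "high"), (10, "very high")]

buffer = io.StringIO()
write_alert_thresholds(buffer, alerts)
alert_csv = buffer.getvalue()
```

To create a file on disk, the same helper can be called with a file opened using `open(path, "w", newline="")` in place of the in-memory buffer.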
51. out impending pollution episodes and provide advice. Local air quality modelling is critical in the assessment of air quality against the EC air quality directive, as it can provide high resolution maps of concentration where the population is most dense and allows the investigation of proposed mitigation measures on short or long time scales. An understanding of the benefits, limitations and performance of individual models, the input data they require, and the extent of the options available to them is often lacking. Setting standard evaluation criteria and comparing model capabilities in a structured way is therefore a crucial task.

    This toolkit has been developed under the local forecast model evaluation support work package of the EU's 7th Framework PASODOBLE project. It draws on existing best practice such as the EU Joint Research Centre's (JRC) FAIRMODE initiative on model evaluation [1] and the openair project tools [2]. It is a simple-to-install, user-friendly environment that guides the user through the process of evaluating model predictions of local air quality and investigating the model performance. It runs on Windows, Mac or Linux operating systems.

    The toolkit can take modelled data from regional or local scale models as input. Observed data are in situ time series data. Missing data are handled if they are indicated by a standard value. As output, the toolkit creates plots of the model performance in predicting concentrations and predic
52. r pollutant. For each pollutant, the required averaging time and statistic must be set. Table 3.3 below describes these parameters.

    [Figure 3.4: Example of an output averaging times file, with one row for each of the pollutants no2, o3, pm10 and pm2.5, each with an averaging time of 1 hour and the statistic mean]

    Column header           Description                                                  Allowed values
    pollutant               Name of the pollutant, as defined in the pollutant data      n/a
                            file used in the Data Input tool
    output avg time hours   Averaging time in hours to be applied to the concentration   An integer value, minimum 1
                            evaluation output (minimum 1 hour)
    output statistic        Statistic that applies to the concentration evaluation       max, mean or rolling mean
                            output

    Table 3.3: Details of the output averaging times CSV file columns (n/a indicates that values are unrestricted)

    Enter a data capture threshold to apply to the output averaging process. For example, for an 8 hour rolling mean, a 75% data capture threshold means that at least 6 hours of data must be valid in each period for the averaged data for this period to be considered valid and used in the evaluation.

    Five graph options are available: a target plot from version 1.2 of the FAIRMODE DELTA Tool; a target plot from version 3.3 of the FAIRMODE DELTA Tool; a box and whisker plot; a scatter plot as either a frequency binn
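The data capture rule in item 52 can be made concrete with a short sketch. This is not the toolkit's code; it is a minimal Python illustration that assumes hourly values with None marking missing data, periods that end at each hour, and hours before the start of the series counting as missing. The function name `rolling_mean` and its parameters are hypothetical.

```python
def rolling_mean(values, window=8, capture_percent=75):
    """Rolling mean over `window` hours, ending at each hour.

    A period is valid only if enough of its hours hold data:
    for window=8 and capture_percent=75, at least 6 valid hours.
    Invalid periods are returned as None.
    """
    # Integer ceiling of window * capture_percent / 100 (avoids float error).
    needed = -(-window * capture_percent // 100)
    result = []
    for end in range(len(values)):
        start = max(0, end - window + 1)
        valid = [v for v in values[start:end + 1] if v is not None]
        if valid and len(valid) >= needed:
            result.append(sum(valid) / len(valid))
        else:
            result.append(None)
    return result


# Two missing hours out of eight: the last period has 6 valid hours.
hourly = [None, None, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
means = rolling_mean(hourly)
```

Here the final period contains exactly six valid hours, so its mean is kept, while the period ending one hour earlier has only five and is discarded.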
53. red stations and pollutants will be downloaded and imported for a time period to match the modelled data.

    Step 5: If creating a new workspace, select station data file

    Your station data file should be a CSV file with a comma separator, containing a list of all the stations for which you wish to process data. Each station should have a station type and, if you have selected one of the gridded netCDF modelled data formats, you also need to include the station longitude and latitude.

    [Figure 3.2: Example of a station data file, with the columns name, type, latitude and longitude and one row per station, for example BG1 (Suburban), BG3 (Roadside) and BO5 (Industrial)]

    If you are using either the London KCL or UK AURN observed data options, then the station names must match the station codes used by those networks.

    Step 6: If creating a new workspace, select pollutant data file
54. st performance metrics graph produced by the forecast index evaluation, grouped by model and filtered by pollutant, displaying pollutant O3

    4.2.2.3 CSV output files

    Forecast index data CSV file: This contains the modelled and observed concentrations, using the index averaging time and statistic, in both output units and index units; also the observed and modelled index value for each averaging period and the absolute difference between the observed and modelled indices. For an example, refer to Table 4.4.

    [Table 4.4 (spreadsheet screenshot): example of the forecast index data CSV file, with the columns date, obs output units, mod output units, station, obs index units, mod index units, index obs, index mod, index absdiff and pollutant]
55. then the ini file path would be C Toolkit_output DELTA ini

    7 References

    [1] Thunis, P., E. Georgieva and S. Galmarini, 2011: A procedure for air quality models benchmarking. http fairmode ew eea europa eu models benchmarking sg4 wg2 sg4 benchmarking v2 pdf
    [2] David Carslaw and Karl Ropkins, 2013: openair: Open-source tools for the analysis of air pollution data. R package version 0.8-5. http://www.openair-project.org
    [3] Thunis, P., A. Pederzoli, E. Georgieva, C. Cuvelier and D. Pernigotti, 2013: The DELTA tool and Benchmarking Report template: Concepts and User guide, version 3.2. http agm jrc ec europa eu DELTA data DELTA UserGuide V3 pdf
    [4] Thunis, P., D. Pernigotti and M. Gerboles, 2013: Model quality objectives based on measurement uncertainty. Part I: Ozone. Atmospheric Environment, In Press. http://dx.doi.org/10.1016/j.atmosenv.2013.05.018
    [5] Thunis, P., D. Pernigotti, M. Gerboles and C. Belis, 2013: Model quality objectives based on measurement uncertainty. Part II: PM10 and NO2. Atmospheric Environment, Submitted.
    [6] R Development Core Team, 2010: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. URL http://www.R-project.org
    [7] Ilhami Visne, Erkan Dilaveroglu, Klemens Vierlinger, Martin Lauss, Ahmet Yildiz, Andreas Weinhaeusel, Christa Noehammer, Friedrich Leisch and Albert Kriegner, 2009: RGG: A general GUI Fr
56. ting alerts with respect to defined thresholds, for single or multiple sites, single or multiple pollutants, and single or multiple modelled data sets. Results can be classified by the type of monitoring site and the pollutant for each modelled data set. The diagnosis of model performance for individual sites and individual pollutants produces time series plots, scatter plots and analyses with respect to month, day of the week and hour of the day. All the plotted data are also exported to data files to provide an audit trail and make the data available for further analysis and visualisation.

    2 Getting started

    The Toolkit can be used on Windows, Mac or Linux operating systems and does not require any software to be purchased. Before using the Toolkit you will need to install some free software: R and some R packages, Java and RGGRunner. This will take just a few minutes. Detailed installation instructions are given in sections 2.1 to 2.4.

    2.1 Install R

    The MyAir Toolkit for Model Evaluation version 3.0 is compatible with R version 2.15.3. Follow these step-by-step instructions to download and install R from the internet:

    1. Go to http://www.r-project.org
    2. Select CRAN from the links on the left hand side of the page.
    3. Choose a CRAN mirror for your locality (in the UK, choose the mirror for the University of Bristol) and click on the link.
    4. Under Download and install R, click on the link for your operating system.
    5. Click on
57. whisker modelled

    7. Lower quartile modelled
    8. Median modelled
    9. Upper quartile modelled
    10. Upper whisker modelled
    11. Centralised root mean square error (CRMSE)
    12. Root mean square measurement uncertainty (RMSu)
    13. Normalised mean standard deviation (NMSD)
    14. CRMSE sign (DELTA 1.2): CRMSE multiplied by -1 if SDO > SDM and multiplied by 1 if SDM > SDO
    15. Target parameter T (DELTA 1.2)
    16. CRMSE sign (DELTA 3.3): CRMSE multiplied by 1 if NMSD > $\sqrt{2(1-R)}$ and multiplied by -1 if NMSD < $\sqrt{2(1-R)}$
    17. Target parameter T (DELTA 3.3)

    [Spreadsheet screenshot: example rows of this CSV output for NO2 hourly mean data from the Forecast model]
