Microsoft Research Terrestrial Carbon Model Package: User's Guide
computer. This should generate a single folder that contains all of the files in the solution, i.e. YOUR_ROOT_DIRECTORY\Package.

[Figure 1 - You should see something like this when you open the Package folder: a Windows Explorer view of the Package folder showing the "bin", "data", "ext" and "src" folders, the "MSR-LA - Fully Data Constrained Model..." licence file, "MSRTCM.sln" and "UserGuide.pdf".]

Within the Package folder (Fig. 1) should be:

• A "bin" folder: contains the executable MSRTCM.exe and the associated libraries necessary to run the fully data constrained global terrestrial carbon model without having Visual Studio or Visual C# Express installed.
• A "data" folder: contains a hierarchical set of folders for holding the input and output data, some of which already contain data.
• An "ext" folder: containing compiled binaries for Filzbach (parameter inference), FetchClimate (remote data access) and Scientific DataSet (facilitating the handling of datasets), as well as some standard scripts for running the statistical pa
Potential vegetation data — Classification of potential vegetation across the global land surface. Source: Center for Sustainability and the Global Environment (SAGE), part of the Nelson Institute for Environmental Studies at the University of Wisconsin-Madison, http://www.sage.wisc.edu/atlas/data.php?incdataset=Potential Vegetation. Citation: Ramankutty, N. and J.A. Foley (1999). Estimating historical changes in land cover: North American croplands from 1850 to 1992. Global Ecology and Biogeography 8, 381-396.

ProcessedRawData

This was used to hold the results of processing the raw datafiles into a standard format. We do not have permission to redistribute all of these datafiles, so this folder is blank. These files are produced by the MakeDataTables scripts.

TrainingEvaluationData

The sample of the datafiles in the RawData folder used as training and evaluation data, with associated climatic data. We have permission to distribute this derived data from all data providers, so all of the training and evaluation data is contained within that folder.

Part of the study of Smith et al. (2013) was to divide the raw data (in the ProcessedRawData folder) into training and test data. This involves using a random geographical mask to assign approximately 25% of the terrestrial land surface to final test data (the cru20DataMask.nc file in the RawData folder), with the remaining being training data. The training data is then assigned "fold" numbers and
Fraction of leaf and fine-root carbon that is decomposed quickly by soil organisms — source/citation: Ise, T. & Moorcroft, P.R. The global-scale temperature and moisture dependencies of soil organic carbon decomposition: an analysis using a mechanistic decomposition model. Biogeochemistry 80, 217-231 (2006). Published by Springer.

Discrete classifications of land cover types represented as integer codes — source: European Commission, http://bioval.jrc.ec.europa.eu/products/glc2000/data_access.php.

Monthly values of a range of environmental variables obtained by averaging over the period 1961- ... — source: Climatic Research Unit at the University of East Anglia, http://www.cru.uea.ac.uk/cru/data/hrg/.

Citations (Table 1, continued):

Stephenson, N.L. & van Mantgem, P.J. Forest turnover rates follow global and regional patterns of productivity. Ecology Letters 8, 524-531 (2005). Published by John Wiley & Sons Ltd (http://onlinelibrary.wiley.com/).

Mouillot, F. & Field, C.B. Fire history and the global carbon budget. Global Change Biology 11(3), 398-420 (2005).

Ise, T. & Moorcroft, P.R. The global-scale temperature and moisture dependencies of soil organic carbon decomposition: an analysis using a mechanistic decomposition model. Biogeochemistry 80, 217-231 (2006). Published by Springer.

Bartholome, E.M. & Belward, A.S. GLC2000: a new approach to global land cover mapping from Earth Observation data. International Journal of Remote Sensing 26, 195
World Data Center for Climate (CERA-DB), UKMO HadCM3 SRES emissions scenarios, B1 run 1: http://cera-www.dkrz.de/WDCC/ui/Compact.jsp?acronym=UKMO_HadCM3_SRESB1_1.

The downloadable package also contains the results of our simulation experiments. These are contained in the folder YOUR_ROOT_DIRECTORY\MSRTCMSim\data\OutputData\SimulationOutputData.

The simulation experiments were the process that took up the most compute time in the study of Smith et al. (2013). A simulation was performed for each sample of the Markov Chain (1200 samples), for each model training fold (10 folds), under two different climate change scenarios (2 scenarios) and for 3 different parameterizations of the plant mortality model (3 mortality models). This equates to 1200 x 10 x 2 x 3 = 72,000 simulations. Each simulation took a couple of minutes on a reasonably fast computer; however, to complete all simulations we divided the jobs by fold, scenario and mortality model. In other words we simulated separately each lot of 1200 samples of parameter values.

The procedures involved in conducting the simulations are:

• Set up an instance of the full model: adds to a "SetOfModels" class all of the subcomponents including the name, the data distribution type (e.g. normal, lognormal, logistic), the function to initialize the parameters to be estimated in Filzbach, the function used to make predictions given parameters and data, the function used to estimate the erro
commands through specifying "FULL" or "FULL,test" for the "command" string:

    static void Main(string[] args)
    {
        string command = "FULL,test";
        ...
    }

The computational framework will then perform the following operations:

• Identify the names of all sub-component models for the full model.
• Add to a "SetOfModels" class all of the subcomponents including the name, the data distribution type (e.g. normal, lognormal, logistic), the function to initialize the parameters to be estimated in Filzbach, the function used to make predictions given parameters and data, the function used to estimate the error about the predictions, and the function used to make the datafiles from the raw data.
• If the training datafiles do not exist for all of the models then remake the training datafiles from the raw data (using the function identified previously). This involves transforming and/or sampling the original source data files and dividing them into training/evaluation data and final test data, classifying each location into a Holdridge Life Zone and assigning fold numbers to the data points.
• For each model component, initialize all of the parameters to be estimated.
• For each fold, perform Markov Chain Monte Carlo estimation of the parameter space to estimate the parameter probability distributions.
• Post-process the results of 10-fold model parameter estimation.

This produces the following files in the DataSets folder:

Table 2 - Output da
folder. Please consult the FetchClimate user manual to obtain full details of how to use FetchClimate, or see http://research.microsoft.com/fetchclimate/.
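The real FetchClimate calls live in ClimateLookup.cs and the solution's own settings; the snippet below is only a minimal, hypothetical sketch of how a boolean "UseFetchClimate"-style setting could switch between the local gridded climate files and the FetchClimate service. The setting name, method names and fallback logic are illustrative assumptions, not the solution's actual API.

    using System;
    using System.Configuration;

    static class ClimateSourceSketch
    {
        // Hypothetical helper: read a boolean setting analogous to the UseFetchClimate property.
        public static bool UseFetchClimate()
        {
            bool flag;
            string setting = ConfigurationManager.AppSettings["UseFetchClimate"]; // assumed setting name
            return bool.TryParse(setting, out flag) && flag;
        }

        public static double[] MonthlyTemperature(double latitude, double longitude)
        {
            if (UseFetchClimate())
            {
                // Here the real code would call the FetchClimate data service
                // (see ClimateLookup.cs in the MakeDataTables folder).
                throw new NotImplementedException("FetchClimate call not shown in this sketch.");
            }
            // Otherwise fall back to the local gridded climate files (e.g. CRU CL 2.0).
            return LoadFromLocalGrid(latitude, longitude);
        }

        private static double[] LoadFromLocalGrid(double latitude, double longitude)
        {
            return new double[12]; // placeholder: read the local climate grids here
        }
    }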
(string[] args).

The main function orchestrates calls to the highest-level operations to be conducted by the solution. These are (i) the Bayesian parameter estimation of models given datasets, and (ii) post-parameter-estimation steps such as the simulation and mapping of model predictions.

Note that a convenient way to navigate through functions is to click on the function you are interested in and press F12. This should automatically take you to the code for the function.

8. Study the structure and contents of the data folder

The data files necessary to repeat the study of Smith et al. (2013) are included with the package and reside in the YOUR_ROOT_DIRECTORY\Package\data folder.

This folder has several subfolders:

OutputData

The output directory for all outputs from the solution (other than training, evaluation or test data). This contains four additional folders to subdivide the data into that produced from performing parameter inference (ModelFittingOutputData), from post-processing all of the results from the different model fitting experiments (ProcessedFittingOutputData), from assessing the model using the final test data (ProcessedReservedTestOutputData) and from simulating the model to study its predictions (SimulationOutputData). This folder is initially blank except for one file (ProcessedFittingOutputData\ProcessErrorValues.csv) needed to recreate our results. However, all of the output data resulting from our study are availa
test (as described below). Other command line arguments are explained in the subsequent sections.

10. Repeating the methods used in the paper

Implementing the methods used in the study of Smith et al. (2013) can be done with certain run-time flags, or commands. The commands come from a "command" string in the Main program function, or they can be specified as command line parameters when you invoke the program (see the sketch below). The former method is more convenient when you start the program from Visual Studio. The latter method aids the implementation of parametric sweep jobs on a computation cluster or implementing the code independently from Visual Studio.

We recommend that you begin with all of the original data files in place because (i) the code should definitely run if it is provided with these data files, (ii) it takes less time for you to see the parameter estimation algorithms in process and (iii) these are the exact data files that were used to generate the results of Smith et al. (2013).

If you do not begin with the original data files then the code will look for the raw data files and build new training and test datasets. The code will throw an exception during this process (i.e. crash) if it cannot find the original data files, or find them in the correct format. To obtain the raw data files please refer to the sources above (Table 1) or contact the authors for assistance (we cannot guarantee that these original source data files will always be available).
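The snippet below is a minimal sketch, not the actual Program.cs (the class name and console output are illustrative assumptions), of how a hard-coded "command" string and any command line arguments end up in one list of commands, as described above.

    using System;
    using System.Collections.Generic;

    static class CommandListSketch
    {
        static void Main(string[] args)
        {
            // Hard-coded commands: convenient when launching from Visual Studio.
            string command = "FULL,test";

            // Commands passed on the command line (e.g. "MSRTCM FULL" or "MSRTCM BUILD UP 1")
            // arrive in args; the "command" string is appended to that list.
            var commands = new List<string>(args);
            commands.AddRange(command.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries));

            foreach (string token in commands)
            {
                // Each recognised token (FULL, MAPS, EQCARBDIST, BUILD UP, SIMULATE, ...)
                // triggers the corresponding experiment or analysis.
                Console.WriteLine("Would run procedure: " + token);
            }
        }
    }

Invoked from a console, the equivalent would be, for example, MSRTCM FULL test, exactly as recommended in section 9.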
the parameters for a particular substructure by specifying a number after BUILD UP. For example:

MSRTCM BUILD UP 1

will result in the NPP model being fit. The integers correspond to the following sub-model structures. This functionality is useful if you want to distribute the model fitting experiments on a computer cluster (as we did).

Table 5 - Integer codes for performing parameter inference on different subsets of the model structure

Experiment Number | Models in the experiment
1 | NPP
2 | FracEvergreen
3 | LeafMortEvergreen
4 | LeafMortDeciduous
5 | FRootMort
6 | StructuralMort
7 | FracAlloc
8 | NPP + Fire
9 | NPP + FracStruct
10 | FracAlloc + FRootMort + LeafMortEvergreen + LeafMortDeciduous + FracEvergreen + StructuralMort + NPP + Fire + FracStruct + PlantC
11 | 10 + LitterTot
12 | 10 + LitterTot + SoilC

Completion of a BUILD UP parameter estimation experiment results in an OutputData\ModelFittingOutputData\ModelSet<n>ResultsCompilation.csv file, where n is the integer code corresponding to the model fitting experiment. It contains the same parameter and model performance summary as the OutputData\ModelFittingOutputData\FullModelResultsCompilation.csv file detailed in Table 2 above, but for the specific BUILD UP parameter estimation experiment.

10.5 ALL DUMMY <n>

Perform the BUILD UP fitting experiments but where every model component is replaced with a DUMMY (a null model).

Performs
...9-1977 (2005). The Global Land Cover Map for the Year 2000 (2003).

New, M., Lister, D., Hulme, M. & Makin, I. A high-resolution data set of surface climate over global land areas. Climate Research 21, 1-25 (2002).

...1990, and has a spatial resolution of 10 arc-minutes.

Soil available water capacity data — Total available water capacity (mm water per 1 m soil depth) at 0.5 degree resolution. Source: Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC), http://daac.ornl.gov/SOILS/guides/IsricGrids.html. Citation: Batjes, N.H. (ed.) (2000). Global Data Set of Derived Soil Properties, 0.5-Degree Grid (ISRIC-WISE), International Soil Reference and Information Centre - World Inventory of Soil Emission Potentials. Data set. Available on-line (http://www.daac.ornl.gov) from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A. doi:10.3334/ORNLDAAC.

Global Vegetation Types — Classification of vegetation types at 1 degree resolution, 1971-1982. Source: Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC), http://daac.ornl.gov/VEGETATION/guides/matthews_global_veg.html. Citation: Matthews, E. (1999). Global Vegetation Types, 1971-1982 (Matthews). Data set. Available on-line (http://daac.ornl.gov) from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A. doi:10.3334/ORNLDAAC/419.
Figure 5 - Using DataSet Viewer to inspect the inferred probability distributions for the plant mortality model likelihood (using evaluation data) and parameters.

10.2 MAPS

Map the predictions of equilibrium carbon stocks and flows for the global land surface and simulate a global re-vegetation event.

This functionality is provided simply to produce predictions from the model for the global land surface at 0.5 degree resolution. It completes two main operations: the first is to solve the equilibrium equations for the global land surface. The second is to initialize all the carbon pools across the global land surface at the same out-of-equilibrium values and simulate 100 years of dynamics under constant climate conditions at each site.

The maps require the OutputData\ModelFittingOutputData\FullModelResultsCompilation.csv datafile to have been produced by FULL model parameter estimation. Then the maps can be produced by the command line command:

MSRTCM MAPS

Or, if you want to produce maps immediately after parameter estimation, you can write:

MSRTCM FULL MAPS

Alternatively you can alter the "command" string in the Main function of Program.cs to "MAPS" or "FULL,MAPS" using Visual Studio.

Table 3 describes the results of running the MAPS procedure.

Table 3 - Output data from mapping the equilibrium carbon stocks and re-vegetation
(columns: File name and location | Description | Variables)

DataFiles

OutputData\
IT DATA 1

Or alternatively you can specify OMIT DATA ALL or OMIT DATA <Experiment Number> in the "command" string in the Main function of Program.cs using Visual Studio.

The integers correspond to the following datasets being removed:

Table 7 - Integer codes to use to specify which dataset to remove when inferring the parameters for the full model

1 = FracAlloc, 2 = FRootMort, 3 = LeafMortEvergreen, 4 = LeafMortDeciduous, 5 = FracEvergreen, 6 = StructuralMort, 7 = NPP, 8 = Fire, 9 = FracStruct, 10 = LitterTot, 11 = PlantC, 12 = SoilC

Completion of an OMIT DATA parameter estimation experiment results in an OutputData\ModelFittingOutputData\NFoldOmitSpecific<ModelOmitted>ResultsCompilation.csv file, with <ModelOmitted> corresponding to the specific dataset that had been removed during model training (although it is still used in model evaluation).

10.8 ANALYSE PARAMS

Analyses the results of the BUILD UP, ONE DUMMY and OMIT DATA parameter estimation experiments.

To run this procedure you will need to have run FULL and at least one complete set of the BUILD UP, ONE DUMMY or OMIT DATA parameter estimation experiments (each of these produces 12 ResultsCompilation.csv files). These files must be in the OutputData\ModelFittingOutputData folder. Table 8 summarizes these requirements and what ANALYSE produces using these files.

Table 8 - Output data from analyzing the outputs of the model parameter inference experiments

Experi
Microsoft Research Terrestrial Carbon Model Package: User's Guide

M. J. Smith, D. W. Purves, M. C. Vanderwel, V. Lyutsarev and S. Emmott. Computational Science Laboratory, Microsoft Research Cambridge, 21 Station Road, Cambridge, CB1 2FB, UK

This user's guide accompanies the research publication "The climate dependence of the terrestrial carbon cycle, including parameter and structural uncertainties". Details of that publication are at http://research.microsoft.com/apps/pubs/default.aspx?id=180603 and www.biogeosciences.net/10/583/2013/ (referred to below as Smith et al. (2013)).

Please email queries to Matthew.Smith@Microsoft.com

Contents
1. Introduction
2. System requirements
3. Install Microsoft Visual C# 2010 Express or Microsoft Visual Studio 2010 if you want to work with the code
4. Download and unpack the solution to obtain the code and the executable
5. Study the solution structure
6. If you have a 64-bit operating system then change the default build to 64-bit
7. Skim-read Program.cs
8. Study the
SimulationOutputData\EquilibriumMapForFullModelSet.csv — Contains results of estimating equilibrium carbon stocks and flows for the global land surface at 0.5 degree resolution. Variables: all carbon stocks and flows for all land points, with accompanying latitude and longitude coordinates.

OutputData\SimulationOutputData\SimulationMapForFullModelSet.csv — Contains results of simulating the recovery of equilibrium carbon stocks from low levels over a 100 year time period under constant climate conditions at 0.5 degree resolution. Variables: plant and soil carbon for all land points through time (100 years).

We find it convenient to inspect the results in these datafiles using DataSet Viewer (e.g. Fig. 6).

Figure 6 - Using DataSet Viewer to inspect the predicted maps of equilibrium plant (top) and soil carbon (bottom) using the full model.

10.3 EQCARBDIST

Map the probabilistic predictions of equilibrium carbon stocks for the global land surface at 10 arc-minute spatial resolution (used to produce Fig. 2 of the manuscript).

A useful feature of the global terrestrial carbon model is that it enables probabilistic predictions of equilibrium carbon stocks and flows to be made for anywhere on earth. The EQCARBDIST routine makes such predictions for every land surface point on earth at 10 arc-minute resolution (approx. 18 km), outputting maps of plant and soil carbon in terms of the mean/median prediction and 5% and 95% percentiles
U.S.A.

GLOPNET Leaf Traits data — Estimated leaf lifespan (in months) of leaves and whether they are evergreen or deciduous. Source: authors of Wright et al., The worldwide leaf economics spectrum, Nature 428, 821 (2004) (Ian Wright and Peter Reich). Citation: Wright et al. The worldwide leaf economics spectrum. Nature 428, 821 (2004).

Global root turnover data — Mean root turnover (yr-1). Source/citation: Gill, R. & Jackson, R.B. Global patterns of root turnover for terrestrial ecosystems. New Phytologist 81, 275-280 (2000). Published by John Wiley & Sons Ltd. http://onlinelibrary.wiley.com/doi/10.1046/j.1469-8137.2000.00681.x/abstract

Forest turnover data — Forest turnover rates (yr-1) from different sites worldwide. Source/citation: Stephenson, N.L. & van Mantgem, P.J. Forest turnover rates follow global and regional patterns of productivity. Ecology Letters 8, 524-531 (2005). Published by John Wiley & Sons Ltd. http://onlinelibrary.wiley.com/doi/10.1111/j.1461-0248.2005.00746.x/

Global Fire Data — Percentage of a grid cell burned per year for 100 years (1900-2000). Source: data obtained from the author's (Florent Mouillot's) web page.

Further Table 1 dataset names (descriptions, sources and citations continue elsewhere in the table): Metabolic fraction of carbon in terrestrial vegetation; Global land cover for the year 2000; CRU CL 2.0 global gridded climate data.
a weighting inversely proportional to its relative frequency of data from that type of climate in the data. Initially the test data has no modifications and is simply copied to corresponding files in the Test folder. However, the final step in the study of Smith et al. (2013) was to assess model performance using this data, and it has been assigned associated climate data as a result of that process.

The TrainingEvaluationData folder contains two sorts of datafiles. The DATA_CODENAME_JSetData.csv files simply contain the subset of the data in ProcessedRawData that was selected as training/evaluation data, with added fold numbers and the Holdridge zone classification (Holdridge, L.R. Life Zone Ecology. Tropical Science Centre, San Jose, Costa Rica, 1967). The DATA_CODENAME_JSetClimateData.csv files contain the same data but with added climate variables obtained by referring to the environmental datasets (Table 1). We recommend you open one of these files in DataSet Viewer to inspect and explore the data (you can download DataSet Viewer from http://research.microsoft.com/projects/sds/).

[Figure 4 chart: "NPP data localities" - a global map of the sample localities (top) and a scatter plot of mean annual biotemperature (deg C) against mean annual precipitation (bottom), with point colour indicating NPP.]

Figure 4 - Using D
ataSet Viewer to inspect the Net Primary Productivity (NPP) training and evaluation data. The top panel shows a global map of the sample localities. The bottom panel plots the mean annual biotemperature of localities against mean annual precipitation. The colour of the points indicates NPP (kg m-2 yr-1).

ReservedTestData

The sample of the datafiles in RawData used as final test data, with associated climatic data. We have permission to distribute this derived data from all data providers, so all of the test data is contained within that folder.

A note on data file format

Smith et al. (2013) used two main file formats for input and output data: NetCDF (extension .nc) and CSV (extension .csv). The CSV file format is used for files containing arbitrary numbers of 1- or 2-dimensional data arrays because it can be conveniently read using several commonly used programs (Notepad, Excel and DataSet Viewer). The NetCDF file format is more convenient for handling N-dimensional data structures (e.g. 2-dimensional space through time), which are also usually accompanied by large file sizes (say >50 Mb).

9. Running the fully data constrained global terrestrial carbon model from command line arguments

A compiled 32-bit version of our code is included in the YOUR_ROOT_DIRECTORY\Package\bin folder. You can set this running by navigating to this folder using your console and typing:

MSRTCM

However, we recommend initially that you run:

MSRTCM FULL
ble for download from http://research.microsoft.com/en-us/downloads/a1281531-df37-4489-a556-56799fd252b4/default.aspx and http://download.microsoft.com/download/1/F/D/1FD1F550-69C4-4503-B2FE-B47F94607A7F/MSRTCMSIMData.zip.

RawData

This was used to hold the raw datafiles used to produce the training, evaluation and test datafiles. We do not have permission to redistribute all of these datafiles, and some of them are quite large, so this folder only contains the data mask that we used to partition the raw data into training/evaluation and test data (cru20DataMask.nc). These datafiles can be obtained from the sources listed in Table 1.

Table 1 - Data sources for the study of Smith et al. (2013)
(columns: Dataset name | Data set description | Data source | Citation)

Global biomass carbon map in the year 2000 — The amount of carbon held in terrestrial vegetation (tonnes carbon ha-1). Source: Carbon Dioxide Information and Analysis Center, http://cdiac.ornl.gov/epubs/ndp/global_carbon/carbon_documentation.html. Citation: Ruesch, Aaron, and Holly K. Gibbs (2008). New IPCC Tier-1 Global Biomass Carbon Map For the Year 2000. Available online from the Carbon Dioxide Information Analysis Center (http://cdiac.ornl.gov), Oak Ridge National Laboratory, Oak Ridge, Tennessee.

Global litter production data — Litter production rates (g dry matter m-2 yr-1). Source: Matthews, E. Global litter production, pools and turnover times: Estimat
braries within the code

We use the Dmitrov (also known as Scientific DataSet) libraries to manage the use of datasets throughout our C# code; we developed the software to facilitate the use of multidimensional datasets in diverse formats and sizes from within code. We included libraries from one specific version of Dmitrov, version 1.2.12907. The use of this version outside of the MSRTCM solution will not be supported by the Dmitrov team. The full version of the project Dmitrov libraries and tools can be obtained from http://research.microsoft.com/projects/sds/.

14. Obtaining DataSet Viewer

We find it especially convenient to view model inputs and outputs using DataSet Viewer. We developed DataSet Viewer as a simple standalone menu-driven tool for quickly exploring and comparing time series, geographic distributions and other patterns within scientific data. DataSet Viewer combines selection, filtering and slicing tools with various chart types (scatter plots, line graphs, heat maps, as well as tables) and geographic mapping (using Bing Maps). It is freely available as part of the Dmitrov tools and utilities package available from http://research.microsoft.com/en-us/um/cambridge/groups/science/tools/dmitrov/dmitrov.htm

15. Using Filzbach

Filzbach is a flexible, fast, robust parameter estimation engine that allows you to parameterize arbitrary, non-linear models (of the kind that are necessary in the biological sciences) against multiple, heterogeneou
cing the parameters with those in OutputData\ModelFittingOutputData\ModelSet6ResultsCompilation.csv, or replacing the parameters with those in the OutputData\ModelFittingOutputData\FullModelSetReplaceStructuralMortResultsCompilation.csv file, respectively.

For example:

MSRTCM SIMULATE 3 A1F1 NM

simulates the full model using parameters from fold number 3, using the A1F1 climate change scenario and using the inferred mortality model parameters for the full model. Alternatively you can specify the commands in the "command" string in the Main function of Program.cs using Visual Studio. (We performed all of the simulations using a computer cluster through a cluster job manager.)

Each job creates a datafile called YOUR_ROOT_DIRECTORY\Package\OutputData\SimulationOutputData\OutputExperiment<scenario><Mortality Model><FoldNumber>.csv containing the estimated global plant and soil carbon pools, as well as a detailed breakdown of carbon in different pools through time for 6 different spatial locations.

10.10 ANALYSE SIMULATIONS

Analyses the simulations arising from the SIMULATE command.

This procedure was used to combine the simulation results for the different training data folds to produce 6 summary results files (3 different mortality model parameterizations x 2 scenarios). For each combination of scenario and mortality model you need to have all 10 OutputExperiment<scenario><Mortality Model><FoldNumber>.csv
ckage (R) to produce the graphs used in Smith et al. (2013). More details on these packages are provided below.

• An "src" folder, containing all of the source code used in the study.
• "MSR-LA - Fully Data Constrained Model for Global Vegetation.htm", containing the legal terms of use for the package and citations to all of the data providers who kindly agreed to us releasing derivatives of their data along with the study of Smith et al. (2013), to enable users to recreate our results.
• An "MSRTCM.sln" file: the Microsoft Visual Studio solution description. This file can be opened using Microsoft Visual Studio or Microsoft Visual C# 2010 Express.
• "UserGuide.pdf": this user guide.

5. Study the solution structure

If either Microsoft Visual C# 2010 Express or Microsoft Visual Studio 2010 is installed on your computer then you can open the "MSRTCM.sln" file to load the solution. Double-click the "MSRTCM.sln" file. If Visual Studio doesn't open then right-click the file and choose Open With > Microsoft Visual C# 2010 Express.

The main entry point for standard use of the solution for conducting the study of Smith et al. (2013) was the Program.cs file. This is listed at the bottom of the solution structure in the Solution Explorer window (Fig. 2).

[Figure 2 screenshot - Solution Explorer showing:]
Solution 'MSRTCM' (1 project)
  Microsoft Research TCM2011
    Properties
    References
    Service References
    DataReferences
    MakeDataTables
    ModelFittingManage
ds

Correlation coefficient (CC)
Coefficient of determination (CD)
Mean root mean squared error (MRMSE)
Mean relative error (MRE)
Mean coefficient of variation (CV)
Deviance information criterion (DIC) (training data only)
Mean, 5th and 95th percentiles of the above metrics for each fold and averaged across folds
Mean, 5th and 95th percentiles of the model predictions for each data point given the sampled Markov Chain
A copy of the empirical data and a count of the number of training and validation datapoints

OutputData\ProcessedFittingOutputData\ProcessErrorValues.csv — A compilation of the process error values inferred for each model. This is used to estimate the likelihoods for the evaluation data in the data-omit parameter inference experiments. Variables: the mean/median estimate of the process error parameter for each model component in the full model.

We found it convenient to visualize the results in these files using DataSet Viewer. It allows us to rapidly inspect the parameter probability distributions and performance metrics for multiple models (e.g. Fig. 5).

[Figure 5 chart - DataSet Viewer window showing the plant mortality model (StructuralMort) mean likelihood distribution as a histogram of log likelihood values.]
e Configuration Manager to switch between Release and Debug configurations. In the Release configuration the C# compiler does more code optimizations and so the program runs faster. Visual C# will save these preferences when you close the window.

7. Skim-read Program.cs

Below you will first see the initialisation of some standard (and some non-standard) code libraries that we use in the code; all the other project files have similar calls to libraries. Then you will see the code for the "Program" class, which contains the highest-level code in this solution. It essentially orchestrates all of the model fitting experiments that we perform in our study.

    // Standard libraries
    using System;
    using System.Collections.Generic;   // Enables us to use the "List" class
    using System.Linq;
    using System.Text;
    using System.IO;                     // Enables us to output to the user console

    // Non-standard libraries - in this case Scientific DataSet libraries that
    // allow us to read in and output datasets in various formats (mostly
    // .csv files, or .nc files)
    using Microsoft.Research.Science.Data;
    using Microsoft.Research.Science.Data.Imperative;

    namespace Microsoft.Research.TCM2011
    {
        /// <summary>
        /// This class orchestrates the fitting of the fully data constrained global terrestrial carbon model
        /// by performing va
es from measurement data and regression models. J. Geophys. Res. 102, 18771-18800 (2003). Citation: Matthews, E. Global litter production, pools and turnover times: Estimates from measurement data and regression models. J. Geophys. Res. 102, 18771-18800 (2003).

Global gridded surfaces of selected soil characteristics (IGBP-DIS) — Soil carbon density (kg m-2) at a depth interval of 0-100 cm. Source: Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC), http://daac.ornl.gov/SOILS/guides/igbp-surfaces.html. Citation: Global Soil Data Task Group (2000). Global Gridded Surfaces of Selected Soil Characteristics (IGBP-DIS), International Geosphere-Biosphere Programme - Data and Information System. Data set. Available on-line (http://www.daac.ornl.gov) from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A. doi:10.3334/ORNLDAAC/569.

Class B site net primary productivity (NPP) data — Net primary productivity (kg carbon m-2 yr-1). Source: Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC), http://daac.ornl.gov/NPP/html_docs/EMDI_des.html. Citation: Olson, R.J., J.M.O. Scurlock, S.D. Prince, D.L. Zheng, and K.R. Johnson (eds.) (2001). NPP Multi-Biome: NPP and Driver Data for Ecosystem Model-Data Intercomparison. Data set. Available on-line (http://www.daac.ornl.gov) from the Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee,
f the email.

At present users will have to work with the raw code to conduct novel studies. We anticipate that most users will want to work with the automated parameter estimation capabilities of the code. We therefore highlight the key elements of the code that you may need to change in order to implement a new model.

• Specifying a new model component. Examples of how model components are specified are in the "OriginalCarbonStocksFlowsModels" folder. Different models were specified as different object-oriented classes with specific fields to store parameter values. We recommend users look at MiamiNPPModel.cs to see a detailed breakdown for a specific model.
• Formatting the model component for Filzbach. In order for parameter estimation to be performed on specific model components, we write a class that handles the interface between Filzbach and the model component. Examples are in the "ModelsFormattedForFilzbach" folder. This contains:
  o a SetupParameters() function that initialises parameter values in Filzbach;
  o a MakePrediction() function that makes predictions for a list of sites by obtaining the required climate data or predictions from another component, setting up an instantiation of the model object with the current parameter values in Filzbach, and then making predictions for each site using the model prediction functions;
  o an ErrorFunction() that predicts the process error associated with the predictions;
  o "Dummy" functions
fied analyses through command line arguments, without needing to use Visual Studio. We have not yet added a graphical user interface.

2. System requirements

The study of Smith et al. (2013) can be implemented directly as an executable binary file or through Microsoft Visual Studio. For the former, users will not need to have Microsoft Visual C# 2010 Express or Microsoft Visual Studio 2010 installed on their computer to run the framework, but will not be able to implement alterations to the code (although the code can still be viewed using a text reader). Although you do not have to install any version of Visual Studio, you still have to ensure you have the following components installed on your computer for the executable file to run:

• Microsoft .NET Framework 4.0 Client Profile (http://microsoft.com/net/download).
• Microsoft Visual C++ 10.0 Redistributable (x86 or x64 depending on the processor architecture and operating system of your computer), available from the Microsoft downloads site. We recommend you search for "Microsoft Visual C++ 2010 SP1 Redistributable Package" on the http://www.microsoft.com/download website.

If users use Microsoft Visual C# 2010 Express or Microsoft Visual Studio 2010 to work with the code then they will benefit from being able to read the code, navigate the solution structure and implement any modifications.

3. Install Microsoft Visual C# 2010 Express or Microsoft Visual Studio 2010 if you want to work w
files in the YOUR_ROOT_DIRECTORY\Package\data\OutputData\SimulationOutputData folder.

The procedure checks each scenario/mortality model combination and, if all 10 corresponding OutputExperiment files exist, it produces a YOUR_ROOT_DIRECTORY\Package\data\OutputData\SimulationOutputData\<scenario><Mortality Model>Processed.csv file, containing the mean, median and 95th percentiles of the model predictions across the 10 sets of parameter values.

The simulation analysis files produced during the Smith et al. (2013) study are in the MSRTCMSimData.zip package available from http://download.microsoft.com/download/1/F/D/1FD1F550-69C4-4503-B2FE-B47F94607A7F/MSRTCMSIMData.zip

10.11 FINAL TEST

Assesses the predictive performance of the full model using the final test data.

It is good practice to only perform this final step once the full model has been finalized, as we did in the study of Smith et al. (2013).

This procedure reads in the inferred parameter distributions for the full model from YOUR_ROOT_DIRECTORY\Package\data\OutputData\ModelFittingOutputData\FullModelResultsCompilation.csv and uses them to predict the data held in the ReservedTestData folder. This firstly results in the compiled model performance assessment metrics in the YOUR_ROOT_DIRECTORY\Package\data\OutputData\ProcessedReservedTestOutputData\TestDataResultsCompilation.csv file. It is then post-processed to result in the YOUR_ROOT_DIRECTORY\Package\data\OutputDa
from the initial locations.

We will now describe the most typical ways in which a user will use our system to recreate our results.

NOTE: Fitting all but the simplest models using 10-fold cross validation with multiple components can take minutes to hours, and running all the experiments and simulations performed by Smith et al. (2013) would take days or even weeks on a single processor with a standard personal computer. It will therefore be more practical to run the different experiments in stages. Moreover, you can restrict the Markov Chain length used for parameter approximation initially to verify that all of the different procedures run and produce results (although the results themselves will be useless). This can be done by specifying mcmc 10 1000 in the relevant command string.

10.1 FULL: Fitting the full Microsoft Data Constrained Model of Global Vegetation

The first experiment we run is to fit the full model. This partly serves to verify that all of the model components are set up correctly. The command line command is:

MSRTCM FULL

or

MSRTCM FULL mcmc 10 1000

or

MSRTCM FULL test

Note that the latter two commands are equivalent. Use the latter command if you simply want to check that it is working right; this restricts the Markov Chain length to 10 burn-in steps and 1000 sampling steps (we recommend you run this first just to make sure that the code is working fine).

Through Visual Studio you can implement the same
insert it into the source code of MSRTCM and then recompile the code. Specifically:

• Download the MSRTCMSim.zip package from http://research.microsoft.com/en-us/downloads/49ad471e-7411-4f65-910a-2a541f946575/default.aspx
• Unzip the package and find the ClimateChangeSimulatorImpl.cs file in the YOUR_ROOT_DIRECTORY\MSRTCMSim\src\ProcessResultsDatafiles folder.
• Replace the ClimateChangeSimulatorImpl.cs file in the YOUR_ROOT_DIRECTORY\MSRTCM\src\ProcessResultsDatafiles folder with that datafile.
• Recompile the MSRTCM.exe solution.

The climate change prediction data processed using the new code, which were used to force the model under changing climate scenarios, were obtained from the following source:

Table 9 - Climate data source used in the climate change simulations of Smith et al. (2013)
(columns: Data set name | Data set description | Data source | Citation)

Simulation outputs from the HadCM3 model for the AR4 SRES scenarios — Predicted monthly values of environmental variables for the surface of the earth, gridded at a 2.5 x 3.75 degree resolution, from the year 2000 through to 2199. Source: The IPCC Data Distribution Centre, AR4 GCM Data, UKMO HadCM3 SRESA1B run 1, World Data Center for Climate (CERA-DB), http://cera-www.dkrz.de/WDCC/ui/Compact.jsp?acronym=UKMO_HadCM3_SRESA1B. Citation: Lowe (2005), IPCC DDC AR4 UKMO-HadCM3 SRESA1B run1.

(B1 scenario row, continued:) Citation: Lowe (2005), IPCC DDC AR4 UKMO-HadCM3 SRESB1 run1. World
ith the code

If you do not already have Microsoft Visual C# 2010 Express or Microsoft Visual Studio 2010 on your computer then you will need to install one of these to be able to navigate the solution structure and implement modifications to the code.

Microsoft Visual C# 2010 Express is free to download from http://www.microsoft.com/visualstudio/en-us/products/2010-editions/visual-csharp-express. It provides the basic functionality needed to load, run and edit the solution. After a period of time you will probably have to register your use of Visual C# 2010 Express to continue to use it. Additional functionality (source control, multiple .NET languages) can be obtained using Microsoft Visual Studio 2010 (http://www.microsoft.com/visualstudio/en-us), although this is generally not freely available. This user guide refers to using the solution in Microsoft Visual C# 2010 Express.

4. Download and unpack the solution to obtain the code and the executable

If you do not want to read or modify the code used in the study of Smith et al. (2013) but only want to run it, then you will still need to download and unpack the solution to obtain the executable file that will implement the study.

The Microsoft Visual Studio solution is packaged as a .zip file and can be downloaded from http://research.microsoft.com/en-us/downloads/8c51f0b5-17a1-413e-90c4-43c61c7e4843/default.aspx. After downloading the file, unpack the .zip file to a folder on your
mentClasses
    ModelsForClimateData
    ModelsFormattedForFilzbach
    OriginalCarbonStocksFlowsModels
    ProcessResultsDatafiles
    app.config
    ClassDiagram2.cd
    Microsoft.Ccr.Core.dll
    Microsoft.Research.Science.Data.CSV.dll
    Microsoft.Research.Science.Data.Memory.dll
    Microsoft.Research.Science.Data.Memory2.dll
    Microsoft.Research.Science.Data.NetCDF4.dll
    netcdf4.dll
    Program.cs
    SDSArrays.dll
    Settings.cs

Figure 2 - This is what you should see when you open the Solution Explorer window. The Program.cs file contains the functions for the highest-level operations of the code. Other classes are grouped into folders.

In the Solution Explorer window you can click the little arrows on the left of folders or file names to expand or contract lists of files, as was done to give the image on the right. This shows the overall structure of the solution, which basically divides up the references (to code libraries), raw C# scripts and some other files into different folders corresponding to different categories of use in the solution.

The different folders in the solution structure correspond to different folders in the YOUR_ROOT_DIRECTORY\Package\src folder. In summary, these are:

• DataReferences: contains text files detailing the sources of, and giving citations for, all of the non-Microsoft datasets used in the Smith et al. (2013) study.
• MakeDataTables: a set of C# scripts for reading in the different ec
ments required (column 1 of Table 8, by row):
1-4. FULL; BUILD UP (all 12); ALL DUMMY (all 12)
5. FULL; ONE DUMMY (all 12); ALL DUMMY (all 12)
6-9. FULL; OMIT DATA (all 12); ALL DUMMY (all 12)

Files produced (column 2, by row):
1. OutputData\ProcessedFittingOutputData\ProcessedLikelihoodsBuildUpVL.csv and ProcessedLikelihoodsBuildUpTL.csv
2. OutputData\ProcessedFittingOutputData\ProcessedParametersBuildUp.csv
3. OutputData\ProcessedFittingOutputData\ProcessedPredObsBuildUp.csv
4. OutputData\ProcessedFittingOutputData\ExampleOutputsBuildUp.csv
5. OutputData\ProcessedFittingOutputData\ProcessedLikelihoodsReplaceDummyVL.csv and ProcessedLikelihoodsReplaceDummyTL.csv
6. OutputData\ProcessedFittingOutputData\ProcessedLikelihoodsOmittedVL.csv and ProcessedLikelihoodsOmittedTL.csv
7. OutputData\ProcessedFittingOutputData\ProcessedParametersOmitted.csv
8. OutputData\ProcessedFittingOutputData\ProcessedPredObsOmitted.csv
9. OutputData\ProcessedFittingOutputData\ExampleOutputsOmitted.csv

Description (column 3, by row):
1. Assembles and summarizes the model performance assessment metrics for the training (TL) and evaluation (VL) datasets arising from the 12 BUILD UP experiments.
2. Assembles a
nd summarizes the inferred parameter values arising from the 12 BUILD UP experiments.
3. Assembles and summarizes predictions-versus-observations plots arising from the 12 BUILD UP experiments.
4. Produces component functions using posterior parameter probability distributions arising from the 12 BUILD UP experiments.
5. Assembles and summarizes the model performance assessment metrics for the training (TL) and evaluation (VL) datasets arising from the 12 BUILD UP experiments.
6. Assembles and summarizes the model performance assessment metrics for the training (TL) and evaluation (VL) datasets arising from the 12 OMIT DATA experiments.
7. Assembles and summarizes the inferred parameter values arising from the 12 OMIT DATA experiments.
8. Assembles and summarizes predictions-versus-observations plots arising from the 12 OMIT DATA experiments.
9. Produces component functions using posterior parameter probability distributions arising from the 12 OMIT DATA experiments.

10.9 SIMULATE

Simulates the full model using climate data from different climate model simulation outputs and different parameter values for the plant mortality model.

The simulation experiments conducted in the study of Smith et al. (2013) were performed using separate code to that of the prototype framework for model engineering and refinement. To conduct the simulations conducted by Smith et al. (2013) you will need to download the necessary .cs file to run the simulations,
1. Introduction

The study of Smith et al. (2013) reports the development and analysis of the fully data constrained global terrestrial carbon model within a prototype framework for rapid model engineering and refinement.

At present the fully data constrained global terrestrial carbon model and the framework are both contained within the same Microsoft Visual Studio solution, written principally in the C# programming language (we composed some of the graphs using the statistical package R and provide that code with the solution package). This user guide provides instructions on how to use the framework to repeat the analyses of Smith et al. (2013). A separate download is needed to run the future carbon cycle projections (Fig. 3 in the analysis of Smith et al. (2013)) because simulating the inferred models under different climate change scenarios was not part of the prototype framework for model engineering and refinement. The relevant code is also available for downloading and we provide instructions here on how to modify the code to perform simulations.

The study of Smith et al. (2013) was performed through interacting with the raw C# source code within the Microsoft Visual Studio solution, principally by enabling or disabling calls to procedures corresponding to different experiments or analyses. We have thoroughly commented the code to help users understand what it does. We have also made it possible for users to run speci
nts is:

MSRTCM ONE DUMMY ALL

Or, for a single experiment (see code above), replace ALL with an integer corresponding to the experiment (see Table 6):

MSRTCM ONE DUMMY 1

Or alternatively you can specify ONE DUMMY ALL or ONE DUMMY <Experiment Number> in the "command" string in the Main function of Program.cs using Visual Studio.

The integers correspond to the following model being replaced by a dummy:

Table 6 - Integer codes to use to specify which model to replace with a dummy

1 = FracAlloc, 2 = FRootMort, 3 = LeafMortEvergreen, 4 = LeafMortDeciduous, 5 = FracEvergreen, 6 = StructuralMort, 7 = NPP, 8 = Fire, 9 = FracStruct, 10 = LitterTot, 11 = PlantC, 12 = SoilC

Completion of a ONE DUMMY parameter estimation experiment results in an OutputData\ModelFittingOutputData\FullModelSetReplace<ModelOmitted>ResultsCompilation.csv file, with <ModelOmitted> corresponding to the specific model component that had been replaced by a null model.

10.7 OMIT DATA <n>

Perform parameter estimation for the full model but omitting an entire empirical dataset each time.

Performs the parameter estimation operations described above for the full model but with a specific dataset omitted during the parameter estimation procedures. The command line to perform this sequentially for all datasets is:

MSRTCM OMIT DATA ALL

Or, for a single experiment (see code above), replace ALL with an integer corresponding to the experiment (see Table 7):

MSRTCM OM
ological and climatological datasets for use in the Smith et al. (2013) study.
• ModelFittingManagementClasses: a set of C# scripts to enable Bayesian parameter estimation for arbitrary combinations of models and datasets.
• ModelsForClimateData: some C# scripts to enable the calculation of environmental variables related to water balance (evapotranspiration, soil water content, fire frequency).
• ModelsFormattedForFilzbach: C# scripts to handle the conversion of the ecological models used in the study into a format suitable for Bayesian parameter estimation.
• OriginalCarbonStocksFlowsModels: C# scripts of the ecological models used in the study.
• ProcessResultsDatafiles: a set of C# scripts for post-processing data resulting from Bayesian parameter estimation. It also includes code for mapping predictions.

6. If you have a 64-bit operating system then change the default build to 64-bit

We have set the default configuration to be for 32-bit operating systems, but if your processor is 64-bit then you should get improved performance (a faster running program and access to more memory) if you switch to 64-bit. To do this in either Visual Studio or Visual C# Express, right-click the "Solution 'MSRTCM' (1 project)" node in the Solution Explorer window and select "Configuration Manager..." in the corresponding context menu. The Configuration Manager window then appears. In the "Active solution platform" box select "64bit" and close the window. You can also us
over 10 folds of parameter inference.

The maps require the OutputData\ModelFittingOutputData\FullModelResultsCompilation.csv datafile to have been produced by FULL model parameter estimation. Then the maps can be produced by the command line command:

MSRTCM EQCARBDIST

Or, if you want to produce maps immediately after parameter estimation, you can write:

MSRTCM FULL EQCARBDIST

Alternatively you can alter the "command" string in the Main function of Program.cs to "EQCARBDIST" or "FULL,EQCARBDIST" using Visual Studio.

This procedure first produces a dataset called OutputData\SimulationOutputData\EnvironmentsHighResBackup.csv using the New et al. (2002) and Batjes, N.H. (ed.) (2000) datasets (Table 1), if it does not already exist in that folder. A copy of that file is packaged with the software. The code then calculates equilibrium plant and soil carbon for each of the 1200 parameter samples from the 10 Markov Chains in FullModelResultsCompilation.csv. (This procedure takes several hours on a reasonably fast processor.)

Table 4 describes the results of running the MAPS procedure.

Table 4 - Output data from making probabilistic maps of terrestrial plant and soil carbon
(columns: File name and location | Description | Variables)

DataFiles

OutputData\SimulationOutputData\ — Contains 2-dimensional grid representations of the median, 5% and 95% percentile estimates of plant and soil carbon. Variables: either of the 5%, 95% or median estimates for plant or soil carbon, with accompanying latitude and longitude coordinates. Files: HighResGrid
r about the predictions, and the function used to make the datafiles from the raw data.
• Initialize the parameters in Filzbach.
• Read in the previously estimated parameter values from OutputData\ModelFittingOutputData\FullModelResultsCompilation.csv.
• If an instruction has been given to get different parameters for the mortality model (details below), then replace those parameters.
• Create a file containing average environmental variables for all terrestrial land points at 0.5 degree resolution using the CRU CL 2.0 global gridded climate dataset (if it doesn't already exist).
• Create a file containing the differences to apply to the above environmental variables under a specific climate change scenario (details below), if it doesn't already exist.
• Simulate the model for each parameter set in FullModelResultsCompilation.csv, saving the results in OutputData\SimulationOutputData\OutputExperiment<scenario><Mortality Model><FoldNumber>.csv.

Once the solution has been re-built to allow simulations to be performed, the command for specifying a particular simulation to run is:

MSRTCM SIMULATE <FoldNumber> <Scenario> <MortalityModel>

where:

• FoldNumber is an integer from 1-10;
• Scenario is "A1F1" or "B1", corresponding to the A1F1 climate change scenario or the B1 scenario respectively; and
• MortalityModel is "NM", "M" or "ZM", corresponding to not replacing the mortality model parameters, repla
rediction(), YourNewModelClass.ErrorFunction() and YourNewModelClass.ProcessData().

• Decide on the type of fitting. Fitting a single multi-component or single model is best implemented by modifying the FullModel() function in the Program.cs file. This simply outputs an array of strings indicating the model components to be used for parameter inference. If you want to fit a sequence of model structures then this is best implemented by modifying the IdentifyIndividualModelExperiments() function in the Program.cs file. This specifies a list of string arrays representing different combinations of model components.
• The likelihood functions used for different models are specified in the LikelihoodAnalysis.cs file. These return log likelihood values, as well as other variables, given data and the results of the prediction equation from the parameterised model, for an assumed data distribution type (normal, lognormal or logistic). Calls to the likelihood functions are orchestrated by the CalculateLikelihoodFilzbach() function in the SetOfModels.cs class.
• The different model fitting experiments are called in the Program.cs file but are specified in the NFoldFitting() function of the CCFFittingStudy.cs class.
• The different model performance assessment metrics are called from the CalculateStatisticsAndAddToFold() function of the SetOfModels.cs class, although the statistics themselves are specified in the MakeSummaryStatistics.cs file.

13. Using the Dmitrov li
…rious model fitting experiments on the different model components and the full model. The default "Main" class initialises all of the major model fitting experiments. Therefore a coarse level overview of the study can be obtained from reading this code.

    /// <summary>
    /// ...rious model fitting experiments on the different model components and the
    /// full model. The default "Main" class initialises all of the major model fitting
    /// experiments. Therefore a coarse level overview of the study can be obtained
    /// from reading this code.
    /// </summary>
    public static class Program
    {
        /// <summary>
        /// This calls the complete sequence of procedures to perform our study.
        /// </summary>
        /// <param name="args">This can be used when setting batch runs or running the
        /// executable code from a console. Please read the user manual for a list of
        /// all of the arguments that can be passed, or you can look through the code
        /// below to see how different arguments are processed.</param>
        static void Main(string[] args)
        {
            // This string can be used to specify extra conditions on what procedures
            // to run in the framework. The command gets added to the list of commands
            // in "args". This is useful if you are running the code ...

Figure 3: Program.cs contains the highest level operations of the code, with Main() being the initial entry point.

Program.cs has been thoroughly commented to make it as readable as possible (Fig. 3), a principle which applies to all of the solution code.

It starts with references to standard and non-standard (specific to the solution) namespaces, which you will normally just ignore.

The main function you need to look at is called Main(). It begins at the bottom of the image below with the text static void Main(string[] args).
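The "command" string mentioned in the comment above works by being merged into the console arguments, so the same dispatch logic serves both Visual Studio runs and console runs. The snippet below is only a minimal sketch of that pattern; the variable and method names are illustrative, not the exact ones used in Program.cs.

    using System;
    using System.Collections.Generic;

    public static class ProgramSketch
    {
        public static void Main(string[] args)
        {
            // Hard-coded command; leave empty to rely purely on the console arguments.
            string command = "FULL EQCARBDIST";

            // Merge the hard-coded command with whatever was passed on the console.
            List<string> allCommands = new List<string>(args);
            if (!string.IsNullOrEmpty(command))
                allCommands.AddRange(command.Split(' '));

            // The real Program.cs inspects the merged list for tokens such as
            // FULL, EQCARBDIST, BUILD, SIMULATE, ... and calls the matching procedures.
            foreach (string token in allCommands)
                Console.WriteLine("Requested procedure token: " + token);
        }
    }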
…data sets. Filzbach allows for Bayesian parameter estimation, maximum likelihood analysis, priors, latents, hierarchies, error propagation and model selection, from just a few lines of code. It includes a set of libraries to allow interoperability between C# and Filzbach, which we use in this study, and these are included with the package. Please consult the Filzbach user manual to obtain full details of how to use Filzbach, or see http://research.microsoft.com/filzbach.

16. Using FetchClimate

FetchClimate is a set of libraries and a web service that facilitates access to various climatic datasets. The climate data for our study was not obtained through FetchClimate; instead we used a local copy of the New et al. (2002) gridded monthly climate dataset. However, we have also enabled access to exactly the same dataset using our FetchClimate data service, which returns exactly the same data. This can be implemented by including fetchclimate true when running the program from the command line, or by altering the UseFetchClimate setting to True in the solution properties window in Visual Studio.

We include FetchClimate as a prototype of how users might obtain standard environmental or other datasets through a cloud-based data provider (in this case run in Azure), to avoid the burden of having to have local copies of all the necessary files. All of the calls to the FetchClimate data service are contained in the ClimateLookup.cs file in the MakeDataTables…
File name and location: OutputData\SimulationOutputData\HighResEnvironmentsColumnsSoil.csv, HighResEnvironmentsColumnsPlant.csv
Description: As above but rearranging the data into columns and combining medians with credibility intervals.
Variables: Combined: the 5%, 95% and median estimates for plant or soil carbon, with accompanying latitude and longitude coordinates.

File name and location: OutputData\SimulationOutputData\HighResEnvironmentsMapForFullModelSet.nc
Description: As above but all data combined into a single NetCDF file.
Variables: Combined: the 5%, 95% and median estimates for plant and soil carbon, with accompanying latitude and longitude coordinates.

These results can be inspected using DataSet Viewer or plotted using the statistical package R with the code in the ext\RScripts\MainManuscript\Fig2.R script; a short sketch showing how to open the NetCDF file programmatically is given after the BUILD UP description below.

10.4. BUILD UP <n>: Build up parameter estimation experiments

Performs the parameter estimation operations described in FULL above, but for subsets of the model structure. The command line command is:

MSRTCM BUILD UP ALL

Or alternatively you can specify BUILD UP ALL in the "command" string in the Main function of Program.cs using Visual Studio. This command will result in 10-fold parameter estimation for all substructures in the full model. Alternatively you can estimate…
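Returning to the gridded outputs listed in Table 4: the combined NetCDF file can also be opened programmatically with the Scientific DataSet libraries shipped in the "ext" folder. The following is a minimal sketch only; it assumes the Scientific DataSet assemblies are referenced by your project, and it does not hard-code any variable names because those are whatever the EQCARBDIST procedure wrote into the file.

    using System;
    using Microsoft.Research.Science.Data;

    public static class InspectMapOutput
    {
        public static void Main()
        {
            // Path assumed relative to the working directory; adjust to your layout.
            string path = @"OutputData\SimulationOutputData\HighResEnvironmentsMapForFullModelSet.nc";

            using (DataSet ds = DataSet.Open(path))
            {
                // List every variable and its dimensionality so you can see what the
                // file contains before plotting or post-processing it.
                foreach (Variable v in ds.Variables)
                    Console.WriteLine("{0}: {1} dimension(s)", v.Name, v.Dimensions.Count);
            }
        }
    }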
…structure and contents of the data folder ........................ 8
9. Running the fully data constrained global terrestrial carbon model from command line ........................ 15
10. Repeating the methods used in the paper ........................ 15
10.1. … ........................ 16
10.2. … ........................ 19
10.3. … ........................ 20
10.4. BUILD UP <n> ........................ 22
10.5. ALL DUMMY <n> ........................ 23
10.6. ONE DUMMY <n> ........................ 24
10.7. … ........................ 25
10.8. ANALYSE PARAMS ........................ 25
10.9. SIMULATE ........................ 27
10.10. ANALYSE SIMULATIONS ........................ 29
10.11. FINALTEST ........................ 30
11. R Scripts to produce final publication graphs ........................ 30
12. Conducting novel studies ........................ 31
13. Using the Dmitrov libraries within the code ........................ 33
14. Obtaining DataSet Viewer ........................ 33
15. Using Filzbach ........................ 33
16. Using FetchClimate ........................
…Data\ProcessedReservedTestOutputData\ProcessedLikelihoodsFullModelTest.csv file, which is used when graphs are plotted using R.

11. R Scripts to produce final publication graphs

We used the statistical package R to produce some of the final graphs for Smith et al. (2013) and include the scripts that did this in the YOUR_ROOT_DIRECTORY\Package\ext\RScripts folder. The files are divided into those used for the main manuscript (in the "MainManuscript" folder) and those used in the supplementary information (in the "SupplementaryInformation" folder). They do not use any additional libraries and so should work with most versions of R. You will need to alter the scripts to refer to the correct file path containing the source datafiles. The function of the scripts is obvious from the file names.

The script used to produce Fig. 9 of the main manuscript is included with the simulation output data and simulation code in the MSRTCMSim.zip package available from http://research.microsoft.com/en-us/downloads/49ad471e-7411-4f65-910a-2a541f946575/default.aspx (the file is Fig9.R). The file is YOUR_ROOT_DIRECTORY\MSRTCMSim\ext\RScripts\MainManuscript\Fig9.R.

12. Conducting novel studies

We strongly encourage scientists to work with our code to conduct novel studies. At present we cannot promise dedicated technical support for this, although please do email matthew.smith@microsoft.com with queries. Please include MSRTCM SUPPORT in the subject line of…
…data from the FULL model fitting procedure

File name and location: DataFiles\OutputData\ModelFittingOutputData\LastOutput.csv
Description: Updated after every training fold has completed, providing an opportunity to visually inspect plots of predictions versus observations (this was useful in debugging).
Variables: Shows the last set of predictions for every set of observations from the last iteration of training.

File name and location: OutputData\ModelFittingOutputData\FullModelResultsCompilation.csv
Description: A compilation of results from N-fold parameter estimation for the full model. This is a key file allowing for visual inspections of parameter values, summary statistics and performance metrics for training and evaluation data. It contains the raw Markov Chains for each parameter value, the median and 95th percentile for each fold of each parameter and the mean of that data, model component performance assessment metrics by fold and their means, and examples of predictions versus observations.
Variables: For each parameter: samples from each Markov Chain; median, 5% and 95% percentile credibility intervals for each fold and averages across folds; parameter probability distributions from each fold and on average; prior parameter settings. For each model, each fold and each of the sampled parameter values given the Training (TL) or Evaluation (VL) data: likelihoods given sampled parameter values; probability distributions for the likelihoo…
…the parameter estimation operations described above, but where every model component is replaced with one having a single parameter for estimation of the empirical data plus a process error parameter. This is useful for comparing the performance of the models fitted above to that of a null model; an illustrative sketch of such a single-parameter dummy component is given below, after the ONE DUMMY description. The commands and outcomes are exactly as in the BUILD UP experiments but for the DUMMY models. The command line to fit all of the dummy model experiments is:

MSRTCM ALL DUMMY ALL

Or, for a single experiment (see code above), replace ALL with an integer corresponding to the experiment (see Table 5):

MSRTCM ALL DUMMY 1

Or alternatively you can specify ALL DUMMY ALL or ALL DUMMY <Experiment Number> in the "command" string in the Main function of Program.cs using Visual Studio.

Completion of an ALL DUMMY parameter estimation experiment results in an OutputData\ModelFittingOutputData\DummyModelSet<n>ResultsCompilation.csv file, where n is the integer code corresponding to the model fitting experiment.

10.6. ONE DUMMY <n>: Perform parameter estimation for the full model but with one model component replaced with a DUMMY (a null model)

Performs the parameter estimation operations described above for the full model, but where a specific model component has been replaced with one having a single parameter for estimation of the empirical data plus a process error parameter. The command line to perform this sequentially for all model compone…
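To make the DUMMY (null model) idea described above concrete: a dummy component predicts a single fitted constant for every observation and carries one extra process-error parameter that is used by the likelihood. The sketch below is a hypothetical illustration of that idea only, not the package's actual implementation.

    using System;

    public static class DummyComponentSketch
    {
        // parameters[0] = the single fitted constant, parameters[1] = process error (sigma).
        public static double Predict(double[] parameters, double[] drivers)
        {
            // A null model ignores the environmental drivers entirely.
            return parameters[0];
        }

        // Log of a normal density, matching the "normal" data distribution option.
        public static double LogLikelihood(double[] parameters, double observed)
        {
            double mean = parameters[0];
            double sigma = parameters[1];
            double z = (observed - mean) / sigma;
            return -0.5 * z * z - Math.Log(sigma) - 0.5 * Math.Log(2.0 * Math.PI);
        }
    }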
…which are alternatives to a, b and c for implementing a null model.

• Formatting a new source dataset for use in parameter inference. Source data can vary considerably in format, and so we generally found that we had to write a separate function for reading in each source datafile. There are a range of examples of this in the "MakeDataTables" folder. Ultimately the code must result in the production of a datafile containing latitude, longitude and the data to be predicted (with the inclusion of elevation data optional). These are combined into one datafile using the ClimateLookup.CombineDatasets() function.

• Registering the model component as one for parameter estimation. At present this is handled by the MakeSetOfModels() or MakeSetOfModels2() functions in the CCFFittingStudy.cs class. These add model components to a SetOfModels class that stores a list of model components for parameter inference. If these functions are passed a string array containing the name of a model in their list, then they attempt to add the model component to the set of models. The difference between the two functions is that MakeSetOfModels() registers the normal model component whereas MakeSetOfModels2() registers the null model for the component. To add a new model name you can specify:

    if (ModelsToInclude.Contains("YOUR_NEW_MODEL"))
        NewSetOfModels.AddModelToModelSet("YOUR_NEW_MODEL", <StringOfDataDistributionType>,
            YourNewModelClass.SetupParameters, YourNewModelClass.MakePrediction,
            YourNewModelClass.ErrorFunction, YourNewModelClass.ProcessData);
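The four members passed to AddModelToModelSet() above belong to the class implementing your new component. Their exact signatures are dictated by the package, so copy them from an existing component in the solution; the outline below is only a hypothetical skeleton showing the division of responsibilities between the four members.

    public static class YourNewModelClass
    {
        // Declare the parameters to be estimated (names, ranges, initial values) with Filzbach.
        public static void SetupParameters()
        {
        }

        // Turn the current parameter values plus driver data into predictions.
        public static void MakePrediction()
        {
        }

        // Compare predictions with observations for the likelihood calculation.
        public static void ErrorFunction()
        {
        }

        // Read and format the component's source data table.
        public static void ProcessData()
        {
        }
    }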
    