Home
A User's Guide to the POT Package (Version 1.4)
Contents
1. any bn gt Gy n 00 2 1 for all y R where G is a non degenerate distribution function According to the Extremal Types Theorem Fisher and Tippett 1928 G must be either Fr chet Gumbel or negative Weibull 1955 noted that these three distributions can be merged into a single parametric family the Generalized Extreme Value GEV distribution The GEV has a distribution function defined by es 2 2 Gly exp ve where u ao are the location scale and shape parameters respectively c gt 0 and z max z 0 The Fr chet case is obtained when gt 0 the negative Weibull when lt 0 while the Gumbel case is defined by continuity when 0 From this result 1975 showed that the limiting distribution of normalized excesses of a thresh old y as the threshold approaches the endpoint Hena of the variable of interest is the Generalized Pareto Distribution GPD That is if X is a random variable which holds 2 1 then Pr X X gt Hy H gt Hena 2 3 with LN ME H y 1 e 2 4 vog where u 0 are the location scale and shape parameters respectively o gt 0 and z max z 0 Note that the Exponential distribution is obtained by continuity as 0 In practice these two asymptotical results motivated modelling block maxima with a GEV while peaks over threshold with a GPD 2 2 The multivariate case When dealing with multivariate extremes
2. time obs Min 1970 Min 0 022 ist Qu 1981 ist Qu 0 236 Median 1991 Median 0 542 Mean 1989 Mean 1 024 3rd Qu 1997 3rd Qu 1 230 Max 2004 Max 44 200 NA s 1 000 Mean Residual Life Plot Dispersion Index Plot e N e 97 g 7 ro EUM ui S S o C a e o 5 10 15 0 10 20 30 40 Threshold Threshold o lO ee m wood 4 7 S 0009000000 46 E 2 S D I N o o e o E v n ul s e 7 5 10 15 5 10 15 Threshold Threshold Figure 11 Threshold selection for river Ardi res at Beaujeu 25 Model Density Probability plot 0 0 02 04 06 08 1 0 Empirical Density Plot 0 20 0 10 0 00 10 20 30 40 50 Quantile Empirical Return Level 40 30 10 25 35 15 QQ plot 5 10 15 20 25 30 Model Return Level Plot 0 5 20 50 200 Return Period Years Figure 12 Graphic diagnostics for river Ardi res at Beaujeu 26 100 0 The result of function fitgpd gives the name of the estimator if a varying threshold was used the threshold value the number and the proportion of observations above the threshold parameter estimates standard error estimates and type the asymptotic variance covariance matrix and convergence diagnostic Figure 12 shows graphic diagnostics for the fitted model It can be seen that the fitted model mle seems to be appropriate Suppose we want to know the return level associ
3. 0 85 If there is some troubles try to put vert lines FALSE or change the range conf inf conf sup 0 1969697 0 4818182 Confidence interval for quantiles or return levels are also available This is achieved using a the Delta method or b profile likelihood 17 If there is some troubles try to put vert lines the range conf inf conf sup 1 566162 2 314646 FALSE or change If there is some troubles try to put vert lines the range conf inf conf sup 0 1969697 0 4818182 FALSE or change 400 398 396 398 396 Profile Log likelihood 402 Profile Log likelihood 400 404 402 406 404 408 1 0 15 20 25 0 0 0 2 0 4 0 6 Scale Parameter Shape Parameter Figure 6 The profile log likelihood confidence intervals 18 conf inf conf sup 8 001327 13 001023 If there is some troubles try to put vert lines FALSE or change the range conf inf conf sup 8 833333 14 277778 eo e E eo o oO o Ez x T Q D a E 9 o C c l e ki 6 8 10 12 14 16 Return Level Figure 7 The profile log likelihood confidence interval for return levels gt gpd firl pwmu prob 0 95 conf inf conf sup 8 001327 13 001023 gt gpd pfrl mle prob 0 95 range c 5 16 If there is some troubles try to put vert lines FALSE or change the range conf inf conf sup 8 833333 14 277778 The profile confidence interval fu
4. identify FALSE Figure displays the L Moment plot By passing option identiy TRUE user can click on the graphic to identify the threshold related to the point selected We found that this graphic has often poor performance on real data 3 2 4 Dispersion Index Plot diplot The Dispersion Index plot is particularly useful when dealing with time series The EVT states that excesses over a threshold can be approximated by a GPD However the EVT also states that the occurrences of these excesses must be represented by a Poisson process Let X be a r v distributed as a Poisson distribution with parameter A That is AE GA Pr X k e E Thus we have E X 2 Var X 1979 introduced a Dispersion Index statistic defined by s2 DI 3 10 3 10 where s is the intensity of the Poisson process and the mean number of events in a block most often this is a year Moreover a confidence interval can be computed by using a x test k N 3 9 xt a 2 M 1 x4 1 o 2 M 1 M i M 1 La 3 11 where Pr DI I4 a For the next example we use the data set ardieres included in the POT package Moreover as ardieres is a time series and thus strongly auto correlated we must extract extreme events while preserving independence between events This is achieved using function clus gt data ardieres gt events clust ardieres u 2 tim cond 8 365 clust max TRUE gt
5. NA s 1 000 gt eventsO lt clust ardieres u 1 5 tim cond 8 365 clust max TRUE gt par mfrow c 2 2 gt mrlplot eventsO obs gt abline v 6 col green gt diplot eventsO gt abline v 6 col green gt tcplot eventsO obs which 1 gt abline v 6 col green gt tcplot eventsO obs which 2 gt abline v 6 col green From Figure 11 a threshold value of 6m s should be reasonable The Mean residual life plot top left panel indicates that a threshold around 10m s should be adequate However the selected threshold must be low enough to have enough events above it to reduce variance while not too low as it increase the biad Thus we can now re extract events above the threshold 6m s obtaining object eventsi This is necessary as sometimes eventsi is not equal to observations of eventsO greater than 6m s We can now define the mean number of events per year npy Note that an estimation of the extremal index is available gt events1 clust ardieres u 6 tim cond 8 365 clust max TRUE gt npy lt length eventsi obs diff range ardieres time na rm TRUE diff ardieres c 20945 20947 time gt print npy 1 1 707897 gt attributes events1 exi 1 0 1247265 Let s fit the GPD gt mle lt fitgpd eventsi obs thresh 6 est mle gt par mfrow c 2 2 gt plot mle npy npy 24
6. distributed Central Limit Theorem However normality doesn t hold anymore for high threshold as there are less and less excesses Moreover by construction this plot always converge to the point 2 4 0 Here is another synthetic example gt x lt rnorm 10000 gt mrlplot x u range c 1 quantile x probs 0 995 col c green black green nt 200 Figure 2 displays the mean residual life plot A threshold around 2 5 should be reasonable 3 2 3 L Moments plot Ilmomplot L moments are summary statistics for probability distributions and data samples They are analogous to ordinary moments they provide measures of location dispersion skewness kurtosis and other aspects of the shape of probability distributions or data samples but are computed from linear combinations of the ordered data values hence the prefix L For the GPD the following relation holds 1 5T 3 5 T 3 where 74 is the L Kurtosis and 73 is the L Skewness T4 T3 The L Moment plot represents points defined by 73 u fau fu lt eux 3 8 where s u and 74 are estimations of the L Kurtosis and L Skewness based on excesses over threshold u and xax is the maximum of the observations x The theoretical curve defined by equation 3 7 is traced as a guideline Here is a trivial example gt x lt c 1 abs rnorm 200 0 0 2 rgpd 100 1 2 0 25 gt Imomplot x u range c 0 9 quantile x probs 0 9
7. 0 4 0 6 0 8 1 0 X Figure 5 The Pickands dependence function scalel shapel scale2 shape2 scalei 3 312e 03 1 545e 03 8 715e 13 1 023e 12 shapei 1 545e 03 2 276e 03 7 430e 13 9 210e 13 scale2 8 715e 13 7 430e 13 8 155e 04 8 837e 04 shape2 1 023e 12 9 210e 13 8 837e 04 1 406e 03 Optimization Information Convergence successful Function Evaluations 40 Gradient Evaluations 8 Note that as all bivariate extreme value distributions are asymptotically dependent the X statistic of Coles et al 1999 is always equal to 1 Another way to detect the strength of dependence is to plot the Pickands dependence function see Figure This is simply done with the pickdep function gt pickdep Mlog The horizontal line corresponds to independence while the other ones corresponds to perfect dependence Please note that by construction the mixed and asymetric mired models can not model perfect depen dence variables 15 3 3 3 Markov Chains for Exceedances The classical way to perform an analysis of peaks over a threshold is to fit the GPD to cluster maxima However there is a waste of data as only the cluster maxima is considered On the contrary if we fit the GPD to all exceedances standard errors are underestimated as we consider independence for dependent observations Here is where Markov Chains can help us The main idea is to model the dependence structure using a Markov Chains while the joint distribution is obviously
8. 1 shape 0 est mle Estimator MLE Deviance 358 5155 AIC 360 5155 Varying Threshold FALSE 11 Threshold Call 1 Number Above 100 Proportion Above 1 Estimates scale 2 209 Standard Error Type observed Standard Errors scale 0 2209 Asymptotic Variance Covariance scale scale 0 0488 Optimization Information Convergence successful Function Evaluations 15 Gradient Evaluations 1 gt fitgpd x thresh 1 scale 2 est mle Estimator MLE Deviance 359 2705 AIC 361 2705 Varying Threshold FALSE Threshold Call 1 Number Above 100 Proportion Above 1 Estimates shape 0 03536 Standard Error Type observed Standard Errors shape 0 07324 Asymptotic Variance Covariance shape shape 0 005364 Optimization Information Convergence successful Function Evaluations 17 Gradient Evaluations 6 If now we want to fit a GPD with a varying threshold just do gt x rgpd 500 1 2 0 3 0 01 gt fitgpd x 1 2 est mle 12 Estimator MLE Deviance 209 8694 AIC 205 8694 Varying Threshold TRUE Threshold Call 1 2 Number Above 500 Proportion Above 1 Estimates scale shape 0 300387 0 007188 Standard Error Type observed Standard Errors scale shape 0 01863 0 04296 Asymptotic Variance Covariance scale shape scale 0 0003470 0 0005586 shape 0 0005586 0 0018454 Optimization Information Convergence successful Function Evaluations 46 Gradien
9. 142 0 141 5 143 5 40 60 80 100 Return Level Figure 13 Profile likelihood function for the 100 year return period quantile 28 A Dependence Models for Bivariate Extreme Value Distribu tions A 1 The Logisitic model The logisitic model is defined by View e ty de cosi A 1 Independence is obtained when a 1 while total dependence for a 0 The Pickands dependence function for the logistic model is A 0 1 0 1 ES E w a w wa A 2 The Asymetric Logistic model The asymetric logistic model is defined by 1 1 a 1 98 1 8 Ta va V sy l 4 x y 0 05 with 0 lt a 1 0 061 05 lt 1 Independence is obtained when either a 1 01 0 or 05 0 Differents limits occur when 04 and 05 are fixed and a 1 0 The Pickands dependence function for the asymetric logistic model is A w 1 6 1 w 1 62 w w 07 we 9g A 3 The Negative Logistic model The negative logistic model is defined by 1 D quod Independence is obtained when a 0 while total dependence when a 00 The Pickands dependence function for the negative logistic model is 1l a A w 1 1 w cw A 4 The Asymetric Negative Logistic model The asymetric negative logistic model is defined by 1 V z y 4 s 8 a gt 0 0 60 04 X1 Tr y 1 2 Independence is obtained when either a 0 01 0 or 05 0 Different lim
10. alpha 7 492e 05 1 377e 04 3 497e 06 8 212e 05 7 403e 04 Optimization Information Convergence successful Function Evaluations 41 Gradient Evaluations 10 In the summary we can see lim_u Pr X_1 gt u X 2 gt u 0 02 This is the x statistics of Coles 1999 For the parametric model we have x22 V 1 1 2 2 1 A 0 5 For independent variables x 0 while for perfect dependence x 1 In our application the value 0 02 indicates that the variables are independent which is obvious In this perspective it is possible to fixed some parameters For our purpose of independence we can run which is equivalent to fit x and y separately of course gt fitbvgpd cbind x y c 0 2 model log alpha 1 Call fitbvgpd data cbind x y threshold c 0 2 model log Estimator MLE Dependence Model and Strenght Model Logistic lim_u Pr X 1 gt u X_2 gt u 0 Deviance 1164 651 AIC 1172 651 alpha 1 Marginal Threshold 0 2 Marginal Number Above 500 500 Marginal Proportion Above 1 1 Joint Number Above 500 Joint Proportion Above 1 Number of events such as Y1 gt ui U Y2 gt u2 500 Estimates scalel shapel Scale2 shape2 0 8903 0 2214 0 4981 0 2436 Standard Errors scalel shapel Scale2 shape2 0 05755 0 04770 0 02856 0 03749 Asymptotic Variance Covariance 14 Pickands Dependence Function e Z 9 o 3 ae o im o Lo o 0 0 0 2
11. diplot events u range c 2 20 The Dispersion Index plot is presented in Figure From this figure a threshold around 5 should be reasonable 2The clust function will be presented later in section 3 6 T4 1 0 0 8 0 6 0 4 0 2 0 0 L Moments Plot P 0 0 0 2 0 4 0 6 0 8 1 0 T3 Figure 3 fig The threshold selection using the Imomplot function Dispersion Index 1 0 1 2 1 4 1 6 1 8 0 8 Dispersion Index Plot Threshold Figure 4 The threshold selection using the diplot function 3 3 Fitting the GPD 3 3 1 The univariate case The main function to fit the GPD is called fitgpd This is a generic function which can fit the GPD according several estimators There are currently 17 estimators available method of moments moments maximum likelihood mle biased and unbiased probability weighted moments pwmb pwmu mean power density divergence mdpd median med pickands pickands maximum penalized likelihood mple and max imum goodness of fit mgf estimators For the mgf estimator the user has to select which goodness of fit statistics must be used These statistics are the Kolmogorov Smirnov Cramer von Mises Anderson Darling and modified Anderson Darling See the html help page of the fitgpd function to see all of them Details for these estimators can be found in Coles 2001 Hosking and Wallis 1987 and Schucany 2004 Peng and Welsh 2001 and Pickands 1975 The MLE is a
12. it is usual to transform data to a particular distribution For example Falk and Reiss 2005 used the inverted standard exponential distribution Pr Z z exp z z 0 use the uniform distribution on 0 1 However the most common distribution seems to be the standard Fr chet one Pr Z lt z exp 1 z 2000 Thus in the following we will only consider this case For this purpose margins are transformed according to i iog F Y where F is the distribution of the j th margin Obviously in practice the margins F are unknown When dealing with extremes the univariate EVT tells us what to do Thus if block maxima or peaks over a threshold are of interest we must replace F with GEV or GPD respectively Definition 2 2 1 A multivariate extreme value distribution in dimension d has representation G yi Ya exp V 21 2a 2 5 with a V a as m xd T P 2 dH a ior Qa where H is a measure with mass 2 called spectral density defined on the set a Tp 2 0 4 20 9 1 j i with the constraint q dH qj 1 Vj 1 d T p The V function is often called exponential measure Kl ppelberg and May 2006 and is an homogeneous function of order 1 Contrary to the univariate case there is an infinity of functions V for d gt 1 Thus it is usual to used specific parametric families for V Several examples for these families are given in Annexe A Another r
13. rgpd 5 loc 1 scale 2 shape 0 2 1 1 185983 1 232789 3 569862 1 251526 2 241297 gt rgpd 6 c 1 5 2 0 2 1 2 558770 4 655136 3 417036 3 639779 2 348138 4 389966 gt rgpd 6 0 c 2 3 0 1 2 221294 10 318083 1 475714 1 391914 3 077797 9 260577 gt pgpd c 9 15 20 1 2 0 25 1 0 9375000 0 9825149 0 9922927 gt qgpd c 0 25 0 5 0 75 1 2 0 1 1 575364 2 386294 3 772589 gt dgpd c 9 15 20 1 2 0 25 1 0 015625000 0 003179117 0 001141829 Several options can be passed to three of these four functions In particular e for pgpd user can specify if non exceedence or exceedence probability should be computed with option lower tail TRUE or lower tail FALSE respectively e for qgpd user can specify if quantile is related to non exceedence or exceedence probability with option lower tail TRUE or lower tail FALSE respectively e for dgpd user can specify if the density or the log density should be computed with option log FALSE or log TRUE respectively 3 2 Threshold Selection The location for the GPD or equivalently the threshold is a particular parameter as must often it is not estimated as the other ones All methods to define a suitable threshold use the asymptotic approximation defined by equation 2 3 In other words we select a threshold for which the asymptotic distribution H in equation is a good approximation The POT package has several tools to
14. study shows that two flood events can be considered independent if they do not lie within a 8 days window Note that unit to define tim cond must be the same than the data analyzed data ardieres gt events clust ardieres u 2 tim cond 8 365 Several options can be passed to the clust function By default it will return a list with the identified clusters Usually we want only cluster maxima this is achieved by passing option clust max TRUE Users can also ask for a graphic representation of clusters by passing option plot TRUE see Figure 9 gt clustMax lt clust ardieres u 2 tim cond 8 365 clust max TRUE plot TRUE xlim c 1971 1 1972 9 3 7 Miscellaneous functions 3 7 1 Return periods rp2prob and prob2rp The functions rp2prob and prob2rp are useful to convert return periods to non exceedence probabilities and vice versa It needs either a return period either a non exceedence probability Moreover the mean number of events per year npy must be specified gt rp2prob 50 1 8 npy retper prob 11 8 50 0 9888889 gt prob2rp 0 6 2 2 npy retper prob 12 2 1 136364 0 6 21 obs 40 30 20 10 1971 5 1972 0 1972 5 tim Figure 9 The identified clusters Data Ardi res u 2 tim cond 8 22 obs 30 40 20 10 1970 1975 1980 1985 1990 1995 2000 2005 time Figure 10 Instantaneous flood discharges and averaged discha
15. 1 0 Then theoretically we have Oo 1 IX u for lt 1 3 3 When gt 1 the theoretical mean is infinite In practice if X represents excess over a threshold uo and if the approximation by a GPD is good enough we have T C X o X gt po I 3 4 For all new threshold 444 such as 41 gt Ho excesses above the new threshold are also approximate by a GPD with updated parameters see section Thus 7 _ Fur _ Tmo Epi live not extreme events Modified Scale Shape 0 90 0 94 0 98 0 90 0 94 0 98 Threshold Threshold Figure 1 The threshold selection using the tcplot function Mean Residual Life Plot Mean Excess 0 30 035 040 045 0 50 0 55 0 25 1 0 1 5 2 0 2 5 Threshold Figure 2 The threshold selection using the mrlplot function The quantity E X u X gt p is linear in pu Or E X m X gt qa is simply the mean of excesses above the threshold 44 which can easily be estimated using the empirical mean A mean residual life plot consists in representing points i KT Y fus i H lt ss 3 6 ue where n is the number of observations x above the threshold y in is the i th observation above the threshold u and zmax is the maximum of the observations x Confidence intervals can be added to this plot as the empirical mean can be supposed to be normally
16. A User s Guide to the POT Package Version 1 4 Mathieu Ribatet Copyright 2011 Department of Mathematics University of Montpellier II France E mail mathieu ribatet math univ montp2 fr 1 Introduction 1 1 Why the POT package The POT package is an add on package for the R statistical software R Development Core Team 2006 The main goal of this package is to develop tools to perform stastical analyses of Peaks Over a Threshold POT Most of functions are related to the Extreme Value Theory EVT 2001 gives a comprehensive introduction to the EVT while Kluppelberg and Mikosch 1997 present advanced results 1 2 Obtaining the package guide The package can be downloaded from CRAN The Comprehensive R Archive Network at r project org This guide in pdf will be in the directory POT doc underneath wherever the package is installed 1 3 Contents To help users to use properly the POT package this guide contains practical examples on the use of this package Section 2 introduce quickly the Extreme Value Theory EV T Some basic examples are described in section CNN section 4 gives a concrete statistical analysis of extreme value for river Adie res at Beaujeu FRANCE 1 4 Citing the package guide To cite this guide or the package in publications please use the following bibliographic database entry title A User s Guide to the POT Package Version 1 0 author Ribatet M A year 2006 mon
17. R A Language and Environment for Statistical Computing R Foundation for Statistical Computing Vienna Austria 2006 URL http www R project org ISBN 3 900051 07 0 S I Resnick Extreme Values Regular Variation and Point Processes New York Springer Verlag 1987 R L Smith Multivariate Threshold Methods Kluwer Dordrecht 1994 R L Smith J A Tawn and S G Coles Markov chain models for threshold exceedances Biometrika 84 2 249 268 1997 ISSN 00063444 ISSN 31
18. a multivariate extreme value distribution This idea was first introduces by Smith et al 1997 In the remainder of this section we will only focus with first order Markov Chains Thus the likelihood for all exceedances is IMi f yi 1 vi 9 V gt f y 8 where f yi 1 yi39 Y is the joint density f y 0 is the marginal density 0 is the marginal GPD pa rameters and v is the dependence parameter The marginals are modelled using a GPD while the joint distribution is a bivariate extreme value distribution For our application we use the simmc function which simulate a first order Markov chain with extreme value dependence structure gt mc simmc 1000 alpha 0 5 model log gt mc lt qgpd mc 2 1 0 15 gt fitmcgpd mc 2 log Call fitmcgpd data mc threshold 2 model log Estimator MLE Dependence Model and Strenght Model Logistic lim_u Pr X_1 gt u X_2 gt u 0 557 Deviance 1528 288 AIC 1534 288 Threshold Call Number Above 998 Proportion Above 1 Estimates scale shape alpha 0 9376 0 1297 0 5295 Standard Errors scale shape alpha 0 08325 0 03877 0 02329 Asymptotic Variance Covariance scale shape alpha scale 0 0069313 0 0011233 0 0010032 shape 0 0011233 0 0015035 0 0003555 alpha 0 0010032 0 0003555 0 0005423 Optimization Information Convergence successful Function Evaluations 43 Gradient Evaluations 10 16 3 4 Confidence Intervals Once a
19. ated to the 100 year return period gt rp2prob retper 100 npy npy npy retper prob 1 1 707897 100 0 9941448 gt prob lt rp2prob retper 100 npy npy prob gt qgpd prob loc 6 scale mle param scale shape mle param shape scale 36 44331 To take into account uncertainties Figure depicts the profile confidence interval for the quantile associated to the 100 year return period gt gpd pfrl mle prob range c 25 100 nrang 200 If there is some troubles try to put vert lines the range conf inf conf sup 25 56533 90 76633 FALSE or change Sometimes it is necessary to know the estimated return period of a specified events Lets do it with the larger events in events1 gt maxEvent lt max eventsi obs gt maxEvent 1 44 2 gt prob lt pgpd maxEvent loc 6 scale mle param scale shape mle param shape gt prob shape 0 9974115 gt prob2rp prob npy npy npy retper prob 1 1 707897 226 1982 0 9974115 Thus the largest events that occurs in June 2000 has approximately a return period of 240 years Maybe it is a good idea to fit the GPD with the other estimators available in the POT package 3 As the asymptotic approximation by a GPD is not accurate anymore 27 If there is the range some troubles try to put vert lines FALSE or change conf inf conf sup 25 56533 90 Profile Log likelihood 76633 143 0 142 5
20. define a reasonable threshold For this purpose the user must use tcplot mrlplot lmomplot exiplot and diplot functions The main goal of threshold selection is to selects enough events to reduce the variance but not too much as we could select events coming from the central part of the distributioq and induce bias 3 2 1 Threshold Choice plot teplot Let X GP 0 00 60 Let jii be a another threshold as w1 gt po The random variable X X gt ju is also GPD with updated parameters o1 co o p1 Ho and o Let Ox O1 i H1 3 1 With this new parametrization c is independent of wi Thus estimates of c and 4 are constant for all 1 gt Ho if wo is a suitable threshold for the asymptotic approximation Threshold choice plots represent the points defined by Qu Os fi S Smax and u1 1 H1 lt Lmax 3 2 where Zmax is the maximum of the observations x Moreover confidence intervals can be computed using Fisher information Here is an application gt x lt runif 10000 gt par mfrow c 1 2 gt tcplot x u range c 0 9 0 995 Results of the tcplot function is displayed in Figure I We can see clearly that a threshold around 0 98 is a reasonable choice However in practice decision are not so clear cut as for this synthetic example 3 2 2 Mean Residual Life Plot mrlplot The mean residual life plot is based on the theoretical mean of the GPD Let X be a r v distributed as GPD p
21. epresentation for a multivariate extreme value distribution is the Pickands representation Pickands 1981 We give here only the bivariate case Definition 2 2 2 A bivariate extreme value distribution has the Pickands representation Gn ex EDIE 2 6 A 0 1 0 1 w Aw max w 1 q 1 w q dH q with In particular the functions V and A are linked by the relation A w e V 21 22 die 22 CE 0 z 29 The dependence function holds 2 max w 1 w A w 1 Vu 3 is convex 4 Two random variables with unit Fr chet margins are independent if A w 1 Vu 5 Two random variables with unit Fr chet margins are perfectly dependent if A w max w 1 w Vw We define the multivariate extreme value distributions which are identical to the block maxima approach in higher dimensions We now establish the multivariate theory for peaks over threshold According to Resnick 1987 Prop 5 15 multivariate peaks over thresholds u has the same representa tion than for block maxima Only the margins F must be replaced by GPD instead of GEV Thus 1 1 F y1 ex V Jess j gt uj 2 7 Ga va 5 log F1 y1 log Fa idi P 3 Basic Use 3 1 Random Numbers and Distribution Functions First of all lets start with basic stuffs The POT package uses the R convention for random numbers generation and distribution function features gt library PUT gt
22. ged over duration 3 days Data ardieres 3 7 2 Unbiased Sample L Moments samlmu The function samlmu computes the unbiased sample L Moments gt x lt runif 50 gt samimu x nmom 5 11 12 t 3 t4 t 5 0 45154808 0 15754100 0 07672827 0 03292651 0 01813285 3 7 3 Mobile average window on time series ts2tsd The function ts2tsd computes an average time series tsd from the initial time series ts This is achieved by using a mobile average window of length d on the initial time series gt data ardieres gt tsd lt ts2tsd ardieres 3 365 gt plot ardieres type 1 col blue gt lines tsd col green The latter execution is depicted in Figure 10 23 4 A Concrete Statistical Analysis of Peaks Over a Threshold In this section we provide a full and detailed analysis of peaks over a threshold for the river Ardi res at Beaujeu Figure 10 depicts instantaneous flood discharges blue line As this is a time series we must selects independent events above a threshold First we fix a relatively low threshold to extract more events Thus some of them are not extreme but regular events This is necessary to select a reasonable threshold for the asymptotic approximation by a GPD see section 2 gt summary ardieres time obs Min 1970 Min 0 022 ist Qu 1981 ist Qu 0 236 Median 1991 Median 0 542 Mean 1989 Mean 1 024 3rd Qu 1997 3rd Qu 1 230 Max 2004 Max 44 200
23. its occur when 01 and 02 are fixed and a 00 The Pickands dependence function for the asymetric negative logistic model is 1 w w A w 1 0 y E 29 c a A 5 The Mixed model The mixed model is defined by Le a a O0 lt a lt l Independence is obtained when a 0 while total dependence could never be reached The Pickands dependence function for the mixed model is A w 1 w 1 w a A 6 The Asymetric Mixed model The asymetric mixed model is defined by WEA TE eee a gt 0 a 20 lt 1 a 30 gt 0 Independence is obtained when a 0 0 while total dependence could never be reached The Pickands dependence function for the asymetric mixed model is Alw Ow aw a 4 0 w 1 References P Bortot and S Coles The multivariate gaussian tail model An application to oceanographic data Journal of the Royal Statistical Society Series C Applied Statistics 49 1 31 49 2000 S Coles An Introduction to Statistical Modelling of Extreme Values Springer Series in Statistics Springers Series in Statistics London 2001 S Coles J Heffernan and J Tawn Dependence measures for extreme value analyses Extremes 2 4 339 365 December 1999 C Cunnane Note on the poisson assumption in partial duration series model Water Resour Res 15 2 489 494 1979 M Falk and R D Reiss On pickands coordinates in arbitrary dimensions Journal of Multivariate A
24. nalysis 92 2 426 453 2005 R A Fisher and L H Tippett Limiting forms of the frequency distribution of the largest or smallest member of a sample In Proceedings of the Cambridge Philosophical Society volume 24 pages 180 190 1928 J R M Hosking and J R Wallis Parameter and quantile estimation for the generalized pareto distribu tion Technometrics 29 3 339 349 1987 A F Jenkinson The frequency distribution of the annual maximum or minimum values of meteoro logical events Quaterly Journal of the Royal Meteorological Society 81 158 172 1955 S F Ju rez and W R Schucany Robust and efficient estimation for the generalized pareto distribution Extremes 7 3 237 251 2004 ISSN 13861999 ISSN C Kl ppelberg and A May Bivariate extreme value distributions based on polynomial dependence functions Math Methods Appl Sci 29 12 1467 1480 2006 ISSN 01704214 ISSN C Kluppelberg and T Mikosch Large deviations of heavy tailed random sums with applications in insurance and finance Journal of Applied Probability 34 2 293 308 1997 30 Liang Peng and A H Welsh Robust estimation of the generalized pareto distribution Extremes 4 1 53 65 2001 J Pickands Multivariate extreme value distributions In Proceedings 43rd Session International Statistical Institute 1981 J III Pickands Statistical inference using extreme order statistics Annals of Statistics 3 119 131 1975 R Development Core Team
25. nctions both returns the confidence interval and plot the profile log likelihood function Figure 7 depicts the graphic window returned by function gpd pfrl for the return level associated to non exceedence probability 0 95 19 Probability plot oo s p 2 o 8 4 2 ld E o o 2 z 00 02 04 06 08 1 0 10 0 105 11 0 11 5 Empirical Model Density Plot Return Level Plot e eo ai ai re T gt 3 5 o 2 o 5 E m O o E c c o o 9 z 10 11 12 13 14 15 12 5 20 100 500 Quantile Return Period Years Figure 8 Graphical diagnostic for a fitted POT model univariate case 3 5 Model Checking To check the fitted model users must call function plot which has a method for the uvpot bvpot and mcpot classes For example this is a generic function which calls functions pp probability probability plot qq quantile quantile plot dens density plot and retlev return level plot for the uvpot class Here is a basic illustration of the function plot for the class uvpot gt x lt rgpd 200 10 0 5 0 2 gt fitted lt fitgpd x 10 est mle gt par mfrow c 2 2 gt plot fitted npy 1 Figure B displays the graphic windows obtained with the latter execution If one is interested in only a probability probability plot there is two options We can call function pp or equivalently plotgpd with the which option The which option select which graph you want to
26. particular case as it is the only one which allows varying threshold Moreover two types of standard errors are available expected or observed information of Fisher The option obs fish specifies if we want observed obs fish TRUE or expected obs fish FALSE As Pickands estimator is not always feasible user must check the message of feasibility return by function fitgpd We give here several didactic examples x rgpd 200 1 2 0 25 gt mom fitgpd x 1 moments param gt mle fitgpd x 1 mle param gt pwmu lt fitgpd x 1 pwmu param pwmb fitgpd x 1 pwmb param gt pickands fitgpd x 1 pickands param med fitgpd x 1 med start list scale 2 shape 0 25 param gt mdpd fitgpd x 1 mdpd param gt mple fitgpd x 1 mple param gt ad2r fitgpd x 1 mgf stat AD2R param gt print rbind mom mle pwmu pwmb pickands med mdpd mple ad2r scale shape mom 2 252115 0 1796505 mle 2 106750 0 2394004 pwmu 2 066882 0 2471230 pwmb 2 078317 0 2429576 pickands 1 538178 0 7153022 med 1 791480 0 3081937 mdpd 2 064501 0 2659958 mple 2 132479 0 2250274 ad2r 2 204852 0 2132807 The MLE MPLE and MGF estimators allow to fix either the scale or the shape parameter For example if we want to fit a Exponential distribution just do with eventually a fixed scale parameter gt x rgpd 100 1 2 0 gt fitgpd x thresh
27. plot That is e which 1 for a probability probability plot e which 2 for a quantile quantile plot e which 3 for a density plot 20 e which 4 for a return level plot Note that which can be a vector like c 1 3 or 1 3 Thus the following instruction gives the same graphic gt plot fitted which 1 gt pp fitted If a return level plot is asked 4 which a value for npy is needed npy corresponds to the mean number of events per year This is required to define the return period If missing the default value i e 1 will be chosen 3 6 Declustering Techniques In opposition to block maxima a peak over threshold can be problematic when dealing with time series Indeed as often time series are strongly auto correlated select naively events above a threshold may lead to dependent events The function clust tries to identify peaks over a threshold while meeting independence criteria For this purpose this function needs at least two arguments the threshold u and a time condition for independence tim cond Clusters are identify as follow The first exceedance initiates the first cluster The first observation under the threshold u ends the cluster unless tim cond does not hold The next exceedance which hold tim cond initiates a new cluster pe Ge due dp The process is iterated as needed Here is an application on flood discharges for river Ardi re at Beaujeu A preliminary
28. statistical model is fitted it is usual to gives confidence intervals Currently only mle pwmu pwmb moments estimators can computed confidence intervals Moreover for method mle standard and profile confidence intervals are available If we want confidence intervals for the scale parameters gt x lt rgpd 200 1 2 0 25 gt mle lt fitgpd x 1 est mle gt mom fitgpd x 1 est moments gt pwmb lt fitgpd x 1 est pwmb gt pwmu lt fitgpd x 1 est pwmu gt gpd fiscale mle conf 0 9 conf inf scale conf sup scale 1 536159 2 283307 gt gpd fiscale mom conf 0 9 conf inf scale conf sup scale 1 207211 3 130971 gt gpd fiscale pwmu conf 0 9 conf inf scale conf sup scale 1 546348 2 274784 gt gpd fiscale pwmb conf 0 9 conf inf scale conf sup scale 1 556727 2 287666 For shape parameter confidence intervals simply use function gpd fishape instead of gpd fiscale Note that the fi stands for Fisher Information Thus if we want profile confidence intervals we must use functions gpd pfscale and gpd pfshape The pf stands for profile These functions are only available with a model fitted with MLE gt par mfrow c 1 2 gt gpd pfscale mle range c 1 2 9 conf 0 9 If there is some troubles try to put vert lines FALSE or change the range conf inf conf sup 1 566162 2 314646 gt gpd pfshape mle range c 0 0 6 conf
29. t Evaluations 12 Note that the varying threshold is repeated cyclically until it matches the length of object x 3 3 2 The bivariate case The generic function to fit bivariate POT is fitbvgpd There is currently 6 models for the bivariate GPD see Annexe A All of these models are fitted using maximum likelihood estimator Moreover the approach uses censored likelihood see Smith et al 1997 gt x lt rgpd 500 0 1 0 25 gt y lt rgpd 500 2 0 5 0 25 gt Mlog fitbvgpd cbind x y c 0 2 model log gt Mlog Call fitbvgpd data cbind x y threshold c 0 2 model log Estimator MLE Dependence Model and Strenght Model Logistic lim_u Pr X_1 gt u X_2 gt u 0 025 Deviance 1164 195 AIC 1174 195 Marginal Threshold 0 2 Marginal Number Above 500 500 Marginal Proportion Above 1 1 Joint Number Above 500 Joint Proportion Above 1 Number of events such as Y1 gt ui U Y2 gt u2 500 13 Estimates scalel shapel scale2 shape2 alpha 0 8881 0 2240 0 4980 0 2418 0 9821 Standard Errors scalel shapel Scale2 shape2 alpha 0 05752 0 04798 0 02866 0 03797 0 02721 Asymptotic Variance Covariance scalel shapel scale2 shape2 alpha scalel 3 308e 03 1 547e 03 3 389e 05 3 270e 05 7 492e 05 shapel 1 547e 03 2 302e 03 1 321e 06 2 057e 05 1 377e 04 scale2 3 389e 05 1 321e 06 8 214e 04 8 954e 04 3 497e 06 shape2 3 270e 05 2 057e 05 8 954e 04 1 441e 03 8 212e 05
30. th August url http cran r project org 1 5 Caveat I have checked these functions as best I can but as ever they may contain bugs If you find a bug or suspected bug in the code or the documentation please report it to me at ribatet hotmail com Please include an appropriate subject line 1 6 Legalese This program is free software you can redistribute it and or modify it under the terms of the GNU General Public License as published by the Free Software Foundation either version 2 of the License or at your option any later version This program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose See the GNU General Public License for more details The GNU General Public License can be obtained from http www gnu org copyleft gpl html You can also obtain it by writing to the Free Software Foundation Inc 59 Temple Place Suite 330 Boston MA 02111 1307 USA 2 An Introduction to the EVT 2 1 The univariate case Even if this package is only related to peaks over a threshold a classical introduction to the EV T must deal with block maxima Let X4 Xn be a series of independent and identically distributed random variables with commom distribution function F Let M max Xi Xn Suppose there exists normalizing constants a gt 0 and b such that Mn bn Pr E an lt y F
Download Pdf Manuals
Related Search
Related Contents
Nortec Industries MES-U User's Manual Machine Interface NA Series user manual manual de usuario 使用說明 ユーザーマニュアル english TERMINAL MANUAL & CON EMPUÑADURA Polaris 2008 Offroad Vehicle User Manual SPORT IMPACT SECTION 09 65 66 Couvre HPC 150 M - Nářadí Territoire du Grand Pau リフォーム 断熱 Copyright © All rights reserved.
Failed to retrieve file