Table of Contents
ANNEX I Statistical Methods
ANNEX IA. Most commonly used statistical distributions for maxima
Gumbel Probability Density Function (PDF)
Pearson type III (P3) PDF
Generalized Extreme Value PDF
Other Probability Distributions for Extreme Values
ANNEX IB Parameter estimation methods
Introduction to L-Moments Method
Gumbel PDF: Parameter Estimation Methods
Log Pearson Type III Distribution
Generalized Extreme Value Distribution
ANNEX IC The Mann-Kendall Non-Parametric Test
Introduction
Steps in Performing the Mann-Kendall Non-Parametric Test
References for Annex I
ANNEX II Dendrochronology Reconstruction of Hydrologic Records
Introduction
Methodology
References for Annex II
ANNEX III Downscaling Approaches
Statistical Downscaling
Dynamic Downscaling
Water Balance Models
References for Annex III
ANNEX IV Stochastic Soil Water Balance Model
Discussion
References for Annex IV
ANNEX V Probabilistic Modeling of Water Table Dynamics
Discussion
References for Annex V
ANNEX I Statistical Methods
ANNEX IA. Most commonly used statistical distributions for maxima
There are two common problems in flood hydrology:
i) Estimate the return period for a given flood; and
ii) Estimate the flood for a given return period
A commonly used procedure to solve these problems is to fit a probability density function such as the
Gumbel, Pearson Type III or the Generalized Extreme Value distributions to the historical data.
Gumbel Probability Density Function (PDF)
The extreme value type I distribution is also called the Gumbel PDF. It has the following general form:

F(x_T) = exp{-exp[-α(x_T - u)]},   -∞ < x_T < ∞,  -∞ < u < ∞,  α > 0

f(x_T) = α exp{-α(x_T - u)} exp{-exp[-α(x_T - u)]}     (1)

with parameters α (scale) and u (location). The parameters are related to the moments of the PDF by the following relationships (the method of moments, MoM, procedure):

μ_X = u + γ/α     (γ = 0.577216 ≡ Euler's constant)

σ_X = π/(√6 α) = 1.282/α     (2)

γ_x = skewness = 1.1396 (fixed)
The Gumbel PDF may be used both for maxima and minima. This section focuses on maximum values.
For the solution of the two problems described above the following relationships should be used:
From the Gumbel cumulative distribution function (CDF)

F(x_T) = exp{-exp[-α(x_T - u)]}     (3)

define the standardized variate, i.e.

y_T = α(x_T - u)     (4)

yielding

F(x_T) = exp{-exp[-α(x_T - u)]} = exp{-exp(-y_T)} = F(y_T)

C(x_T) = 1 - F(x_T) = Pr(X ≥ x_T) = p = 1/T     (5)

where C(x_T) is the complement of the CDF.
The solution of the first problem is straightforward: given the parameters of the model and the flood x_T, evaluate p = 1 - F(x_T) and T = 1/p. Using these relationships, the solution of the second problem (i.e. find the flood x_T for a given return period T) may be found as:

p = 1/T = 1 - exp(-exp(-y_T))

(T - 1)/T = exp(-exp(-y_T))     (6)

y_T = -ln{-ln[(T - 1)/T]}

x_T = y_T/α + u
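As a sketch, the two problems can be solved in a few lines using relationships (3)-(6); the parameter values u = 500 and α = 0.01 below are hypothetical, chosen only for illustration:

```python
import math

def gumbel_quantile(T, alpha, u):
    """Problem (ii): flood magnitude x_T for a given return period T (years).

    Uses y_T = -ln(-ln((T - 1)/T)) and x_T = u + y_T / alpha, where u is
    the location parameter and alpha the (inverse) scale parameter.
    """
    y_T = -math.log(-math.log((T - 1.0) / T))
    return u + y_T / alpha

def gumbel_return_period(x, alpha, u):
    """Problem (i): return period of a given flood x.

    T = 1 / (1 - F(x)), with F the Gumbel CDF of equation (3).
    """
    F = math.exp(-math.exp(-alpha * (x - u)))
    return 1.0 / (1.0 - F)

# Hypothetical parameters: u = 500 m3/s, alpha = 0.01 (m3/s)^-1
x100 = gumbel_quantile(100, alpha=0.01, u=500.0)   # 100-yr flood
T_back = gumbel_return_period(x100, 0.01, 500.0)   # recovers T = 100
```

The two functions are inverses of each other, which provides a quick consistency check of any implementation.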
Pearson type III (P3) PDF
The Pearson type III distribution was named after the statistician Karl Pearson; it is also called the three-parameter gamma distribution. A lower bound to the gamma PDF is introduced through the third parameter (ξ). This parameter (ξ) is the location parameter, λ is the scale parameter and r is the shape parameter.
The CDF has the following form:
F_x(x) = ∫_ξ^x [(u - ξ)^(r-1) / (λ^r Γ(r))] e^(-(u-ξ)/λ) du     (7)

and its corresponding PDF has the following form:

f_x(x) = (x - ξ)^(r-1) e^(-(x-ξ)/λ) / (λ^r Γ(r))     (8)

where Γ(r) is the Gamma function and

G(r, x) = ∫_0^x t^(r-1) e^(-t) dt     (9)

is the incomplete Gamma function.
If log X follows a Pearson type III distribution, then X is said to have a log-Pearson type III distribution. The log-Pearson III (LP3) is a three-parameter PDF that became widely used after it was recommended by the Water Resources Council in several publications, the last being Bulletin 17B, and it has become the distribution of choice for most applications of extremes in the US and many parts of the world.
Generalized Extreme Value PDF
The Generalized Extreme Value (GEV) PDF is a family of distributions for extreme values that combines the extreme value distributions of types I, II, and III. In its general form it has three parameters: ξ (location), α (scale) and k (shape), and it has the following form:

F(x) = exp{-[1 - k(x - ξ)/α]^(1/k)}   if k ≠ 0     (10)
The location parameter ξ indicates where the distribution is located on the x axis; the scale parameter α is a measure of the variability of the distribution and shrinks or stretches it. The shape parameter k distinguishes between the three forms summarized in the GEV distribution. Under the sign convention of equation (10), if the shape parameter is positive we have a Weibull distribution (bounded tail), if it is zero we have a Gumbel distribution, and if it is negative we have a Frechet distribution with a heavy tail, as shown in the following figure:
Figure 1. Different forms of GEV distribution
SOURCE: Provided by author Juan B. Valdés
The parameters are estimated by:
(11)
For the case of k=0 the distribution is a Gumbel PDF, i.e.
and its parameters are estimated as shown before.
The shape parameter is mostly bounded by -0.3 ≤ k ≤ 0.0. The impact of alternative values of the shape parameter is shown in Figure 2 below:
Figure 2. Alternative forms of the CDF and PDF of the GEV distribution for different values of the shape parameter k (curves for k = -0.3, -0.2, 0.0 and 0.1)
SOURCE: Provided by author Juan B. Valdés
As mentioned before, the GEV comprises the Type I (Gumbel), Type II (Frechet) and Type III (Weibull) distributions as a function of the shape parameter k, as shown in Figure 3 below.
Figure 3. Alternative forms of the GEV distribution (PDF curves for the three types as a function of the shape parameter k)
SOURCE: Provided by author Juan B. Valdés
Other Probability Distributions for Extreme Values
Other distributions have also been used to represent extreme values, among them the Generalized Logistic distribution (GLO), the Generalized Pareto (GPO), the lognormal PDF (with 2 and 3 parameters) and the gamma PDF. They are not covered in this section since they are not as widely used as the three distributions above.
ANNEX IB Parameter estimation methods
Introduction to L-Moments Method
Let X be a real-valued random variable with cumulative distribution function F(x) and let X_{1:n} ≤ X_{2:n} ≤ … ≤ X_{n:n} be the order statistics of a random sample of size n drawn from the distribution of X. Then the rth L-moment of X can be defined as a linear function of expected order statistics as (Hosking, 1990)

λ_r ≡ r^(-1) Σ_{k=0}^{r-1} (-1)^k C(r-1, k) E{X_{r-k:r}},   r = 1, 2, 3, …     (13)
where E{·} is the expectation of an order statistic and is equal to

E{X_{j:r}} = [r! / ((j-1)!(r-j)!)] ∫ x {F(x)}^(j-1) {1 - F(x)}^(r-j) dF(x)     (14)
As noted by Hosking (1990), the natural estimator of λ_r, based on an observed sample of data, is a linear combination of the ordered data values, i.e., an L-statistic. Substituting equation (14) in equation (13), expanding the binomials of F(x) and summing the coefficients of each power of F(x), one can write λ_r as

λ_r = E[x P*_{r-1}(F(x))] = ∫_0^1 x(F) · P*_{r-1}(F) dF     (15)

where P*_r(F) is the rth shifted Legendre polynomial, expressed as

P*_r(F) = Σ_{k=0}^{r} p*_{r,k} · F^k     (16)

and
ïƒ¦ r ïƒ¶ïƒ¦ r ï€« k ïƒ¶ (17)
pr*,k ï€½ (ï€1) r ï€k ïƒ§
ïƒ§k ïƒ· ïƒ§ k ïƒ·
ïƒ·ïƒ§ ïƒ·
ïƒ¨ ïƒ¸ïƒ¨ ïƒ¸
The shifted Legendre polynomials are related to the usual Legendre polynomials P_r(u) as P*_r(u) = P_r(2u - 1), and are orthogonal on the interval (0, 1) with constant weight function.
The first four L-moments are

λ_1 = E(x) = ∫_0^1 x dF

λ_2 = (1/2) E(X_{2:2} - X_{1:2}) = ∫_0^1 x (2F - 1) dF

λ_3 = (1/3) E(X_{3:3} - 2X_{2:3} + X_{1:3}) = ∫_0^1 x (6F² - 6F + 1) dF     (18)

λ_4 = (1/4) E(X_{4:4} - 3X_{3:4} + 3X_{2:4} - X_{1:4}) = ∫_0^1 x (20F³ - 30F² + 12F - 1) dF
In practice, L-moments must usually be estimated from a random sample drawn from an unknown distribution. Because λ_r is a function of the expected order statistics of a sample of size r, it is natural to estimate it by a U-statistic, i.e. the corresponding function of the sample order statistics averaged over all subsamples of size r which can be constructed from the observed sample of size n. Let x_1, x_2, …, x_n be the sample and x_{1:n} ≤ x_{2:n} ≤ … ≤ x_{n:n} the ordered sample, and define the rth sample L-moment to be

l_r = C(n, r)^(-1) Σ_{1 ≤ i_1 < i_2 < … < i_r ≤ n} r^(-1) Σ_{k=0}^{r-1} (-1)^k C(r-1, k) x_{i_{r-k}:n},   r = 1, 2, …, n     (19)
In particular

l_1 = n^(-1) Σ_i x_i

l_2 = (1/2) C(n, 2)^(-1) ΣΣ_{i>j} (x_{i:n} - x_{j:n})

l_3 = (1/3) C(n, 3)^(-1) ΣΣΣ_{i>j>k} (x_{i:n} - 2x_{j:n} + x_{k:n})     (20)

l_4 = (1/4) C(n, 4)^(-1) ΣΣΣΣ_{i>j>k>l} (x_{i:n} - 3x_{j:n} + 3x_{k:n} - x_{l:n})

These U-statistics have the properties of unbiasedness, asymptotic normality, and some modest resistance to the influence of outliers, which make them particularly attractive for statistical inference.
L-moments can also be defined through the use of probability-weighted moments (PWMs); L-moments are linear functions of PWMs (Hosking, 1990). Greenwood et al. (1979) defined PWMs to be the quantities

M_{q,r,s} = E[{x(F)}^q {F(x)}^r {1 - F(x)}^s]     (21)

With q = 1, s = 0,

M_{1,r,0} = b_r = ∫_0^1 x(F) F^r dF     (22)

With q = 1, r = 0,

M_{1,0,s} = a_s = ∫_0^1 x(F) (1 - F)^s dF     (23)
Both a_s and b_r are linear in x and are related to each other as

a_s = Σ_{k=0}^{s} C(s, k) (-1)^k b_k     (24)

b_r = Σ_{k=0}^{r} C(r, k) (-1)^k a_k     (25)

When r = 0, b_0 is the mean. All higher-order PWMs are simply linear combinations of the order statistics x_(n) ≤ x_(n-1) ≤ … ≤ x_(1).
Unbiased sample estimates of PWMs for any distribution can be computed as

b_r = (1/n) Σ_{j=1}^{n-r} [C(n-j, r) / C(n-1, r)] x_(j)     (26)

where x_(j) represents the ordered data, with x_(1) being the largest observation and x_(n) the smallest.
L-moments can be expressed in terms of a_s and b_r as

L_{r+1} = (-1)^r Σ_{k=0}^{r} p*_{r,k} a_k = Σ_{k=0}^{r} p*_{r,k} b_k     (27)

In particular

L_1 = a_0 = b_0

L_2 = a_0 - 2a_1 = 2b_1 - b_0     (28)

L_3 = a_0 - 6a_1 + 6a_2 = 6b_2 - 6b_1 + b_0

L_4 = a_0 - 12a_1 + 30a_2 - 20a_3 = 20b_3 - 30b_2 + 12b_1 - b_0
The first L-moment is equal to the mean μ and is hence a measure of location. Other L-moments are measures of the scale and of the shape of a probability distribution. Analogous to the conventional moment ratios, Hosking (1990) defined the L-moment ratios as

R_2 = L_2 / L_1 = L-Cv     (29)

R_r = L_r / L_2,   r ≥ 3     (30)

where R_2 is a measure of scale or dispersion, R_3 is the L-skewness, and R_4 is the L-kurtosis. Thus R_2, R_3 and R_4 can be thought of as measures of a distribution's scale, skewness and kurtosis, respectively. The ratios R_r are independent of the units of measurement.
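The sample L-moments can be computed directly from the unbiased PWM estimates of equation (26) and the linear combinations of equation (28). A minimal pure-Python sketch, written in the equivalent ascending-order form of the PWM estimator:

```python
def sample_lmoments(data):
    """First four sample L-moments (l1..l4) via unbiased PWMs b_r.

    Uses the ascending-order form of equation (26),
    b_r = n^-1 * sum_j x_(j) * prod_{i=1..r} (j-i)/(n-i),
    then the linear combinations of equation (28).
    """
    x = sorted(data)                 # x_(1) <= ... <= x_(n), ascending
    n = len(x)
    b = [0.0] * 4
    for r in range(4):
        s = 0.0
        for j in range(r + 1, n + 1):        # 1-based rank j
            w = 1.0
            for i in range(1, r + 1):
                w *= (j - i) / (n - i)       # unbiased weight
            s += w * x[j - 1]
        b[r] = s / n
    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]
    return l1, l2, l3, l4
```

For example, l1 equals the sample mean, and l3 vanishes for a symmetric sample, which provides simple sanity checks.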
Gumbel PDF: Parameter Estimation Methods
The three methods to estimate the two parameters for the Extreme Value Type I (Gumbel) PDF are
shown here as an example for all the distributions.
Method of Moments (MoM) Estimates
Note that in equations (31)-(33) the Gumbel parameters are written as α (location, u above) and β (scale, 1/α above):

π²β²/6 = σ²  ⇒  β = (√6/π)σ ≈ 0.7797σ  ⇒  β̂ = 0.7797 S

α + 0.577216β = μ  ⇒  α = μ - 0.577216β  ⇒  α̂ = Ȳ - 0.577216 β̂ = Ȳ - 0.4501 S     (31)

where Ȳ = Σ_{i=1}^{n} Y_i / n and S = √[ Σ_{i=1}^{n} (Y_i - Ȳ)² / (n-1) ]
Maximum Likelihood Estimates
For the Gumbel PDF with the two parameters the ML estimates are derived as follows:

L(α, β) = f(y_1, …, y_n | α, β) = Π_{i=1}^{n} f(y_i | α, β) = Π_{i=1}^{n} (1/β) exp{-(y_i - α)/β} exp{-exp[-(y_i - α)/β]}

= (1/β)^n exp{ -Σ_{i=1}^{n} [ (y_i - α)/β + exp{-(y_i - α)/β} ] }

l = ln(L) = -n ln(β) - Σ_{i=1}^{n} [ (y_i - α)/β + exp{-(y_i - α)/β} ]     (32)

⇒ ∂l/∂α = -Σ_{i=1}^{n} [ -1/β + (1/β) exp{-(y_i - α)/β} ] = n/β - (1/β) Σ_{i=1}^{n} exp{-(y_i - α)/β}

⇒ ∂l/∂β = -n/β - Σ_{i=1}^{n} [ -(y_i - α)/β² + ((y_i - α)/β²) exp{-(y_i - α)/β} ] = -n/β + Σ_{i=1}^{n} (y_i - α)/β² - Σ_{i=1}^{n} ((y_i - α)/β²) exp{-(y_i - α)/β}
The goal is to estimate the two parameters using an optimization/search algorithm such as Newton-Raphson, i.e.

find α̂, β̂ such that [∂l/∂α, ∂l/∂β]ᵀ = [0, 0]ᵀ     (33)
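As a sketch of the ML solution: rather than a full two-dimensional Newton-Raphson, the system (33) can be reduced by a standard manipulation of the likelihood equations to a one-dimensional fixed-point iteration on the scale parameter β, after which the location α follows in closed form. Variable names follow equation (32):

```python
import math

def gumbel_ml(y, tol=1e-10, max_iter=200):
    """ML estimates of the Gumbel location (alpha) and scale (beta).

    Setting dl/dbeta = 0 in (32) and eliminating alpha gives the profile
    equation  beta = mean(y) - sum(y_i e^{-y_i/beta}) / sum(e^{-y_i/beta}),
    solved here by fixed-point iteration; dl/dalpha = 0 then yields
    alpha = -beta * ln( mean(e^{-y_i/beta}) ).
    """
    n = len(y)
    ybar = sum(y) / n
    # MoM value of beta (eq. 31) as the starting point
    beta = 0.7797 * math.sqrt(sum((v - ybar) ** 2 for v in y) / n)
    for _ in range(max_iter):
        w = [math.exp(-v / beta) for v in y]
        beta_new = ybar - sum(v * wi for v, wi in zip(y, w)) / sum(w)
        if abs(beta_new - beta) < tol:
            beta = beta_new
            break
        beta = beta_new
    alpha = -beta * math.log(sum(math.exp(-v / beta) for v in y) / n)
    return alpha, beta
```

A quick check is to feed the routine a synthetic Gumbel sample and verify that the known parameters are approximately recovered.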
L-Moments Method (LMoM) for Gumbel Distribution
The Gumbel PDF with two parameters (ξ, α) has the following L-moments:

λ_1 = ξ + αγ

λ_2 = α ln 2     (34)

τ_3 = 0.1699 = ln(9/8) / ln 2

τ_4 = 0.1504 = (16 ln 2 - 10 ln 3) / ln 2

where γ is Euler's constant, 0.5772…, and the Gumbel PDF parameters are estimated as:

α = λ_2 / ln 2     (35)

ξ = λ_1 - γα
Log Pearson Type III Distribution
The CDF of the LP3 distribution is that of the P3 distribution (equation (7)) applied to the logarithms of the data; as before, x > ξ, where ξ is the location parameter, λ is the scale parameter and r is the shape parameter.
The estimation by the method of moments is simpler if the location parameter ξ is known. The qth quantile may be evaluated from the corresponding quantile of the standardized gamma variate as:

x_q = exp[ξ + λ g_q(r)]     (37)

where g_q(r) is the qth quantile of the standardized gamma variate with shape parameter r. If the lower bound ξ is known, the method-of-moments estimates of the other two parameters (r and λ) may be calculated by computing the mean and standard deviation of ln(x) - ξ.
If the location parameter must be estimated from the data the following equations should be used,
after computing the sample mean m, standard deviation s and skewness Gs of the logarithms of the
flows:
k = 4 / G_s²

a = s·G_s / 2     (38)

ξ̂ = m - k·a
The sampling variability of the skewness coefficient has been a major concern in the application of the LP3 method. US Water Resources Council Bulletin 17B (1982) suggested the application of a weighted estimate of the skewness, as a function of the sample and regional skewness, as follows:

G_w = (G_s / MSE[G_s] + G_g / MSE[G_g]) / (1/MSE[G_s] + 1/MSE[G_g])     (39)
where G_s is the sample skewness (estimated by the MoM), MSE[G_s] is the mean square error of the sample skewness, G_g is the regional skewness (provided in map form in Bulletin 17B for the United States; see an example in Figure 4) and MSE[G_g] is the mean square error of the regional skewness (also provided by Bulletin 17B for the United States, with a fixed value of 0.302 corresponding to an effective record length of 17 years).
Figure 4. Regional Skewness Coefficients
Source: WRC Bulletin 17B, 1982
These regional estimates of the skewness have not been updated since the publication of Bulletin 17B in 1982, which is a serious limitation of the approach. This kind of information is not available in other parts of the world, so the most common approach is to use only the sample skewness coefficient.
The advantage of using the LP3 distribution for modeling extremes is the availability of the public-domain software HEC-SSP (USACE Hydrologic Engineering Center, www.hec.usace.army.mil/software/).
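The LP3 quantile computation can be sketched as follows. This is not the HEC-SSP implementation; it uses the Wilson-Hilferty approximation to the Pearson III frequency factor (adequate for moderate skews, |G| up to about 2) together with the weighted skew of equation (39):

```python
from statistics import NormalDist

def weighted_skew(Gs, mse_Gs, Gg, mse_Gg):
    """Weighted skew of equation (39): inverse-MSE weighting of the
    sample skew Gs and the regional skew Gg."""
    return (Gs / mse_Gs + Gg / mse_Gg) / (1.0 / mse_Gs + 1.0 / mse_Gg)

def lp3_quantile(m, s, G, T):
    """T-year quantile of an LP3 fit (sketch).

    m, s, G: mean, standard deviation and skew of log10(Q).
    K_T is the Wilson-Hilferty approximation to the P3 frequency factor.
    """
    p = 1.0 - 1.0 / T                    # non-exceedance probability
    z = NormalDist().inv_cdf(p)          # standard normal quantile
    if abs(G) < 1e-8:
        K = z                            # zero-skew limit: lognormal
    else:
        K = (2.0 / G) * ((1.0 + G * z / 6.0 - G * G / 36.0) ** 3 - 1.0)
    return 10.0 ** (m + K * s)
```

For G = 0 the factor K_T reduces to the standard normal quantile, so the result collapses to a two-parameter lognormal quantile, a useful limiting check.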
Generalized Extreme Value Distribution
As mentioned before, the Generalized Extreme Value (GEV) distribution has three parameters: ξ (location), α (scale) and k (shape), and has the following form:

F(x) = exp{-[1 - k(x - ξ)/α]^(1/k)},   k ≠ 0     (40)

with parameters estimated from the L-moments (Hosking, 1990) as:

k ≈ 7.8590c + 2.9554c²,   c = 2/(3 + τ_3) - ln 2 / ln 3     (41)

α = λ_2 k / [(1 - 2^(-k)) Γ(1 + k)]     (42)

ξ = λ_1 - α[1 - Γ(1 + k)]/k     (43)

If k = 0, the distribution is a Gumbel PDF and its parameters are estimated as shown in equations (34)-(35). Thus, using the equations shown above, the three parameters of the GEV may be calculated.
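A sketch of this estimation in code, using Hosking's (1990) rational approximation for the shape parameter from the L-skewness τ3 = λ3/λ2 (accurate for -0.5 < τ3 < 0.5); the sign convention matches equation (10):

```python
import math

def gev_from_lmoments(l1, l2, t3):
    """GEV parameters (xi, alpha, k) from L-moments.

    l1, l2: first two L-moments; t3 = l3/l2 is the L-skewness.
    Uses Hosking's approximation for k, then the closed forms for
    alpha and xi.
    """
    c = 2.0 / (3.0 + t3) - math.log(2) / math.log(3)
    k = 7.8590 * c + 2.9554 * c * c
    g = math.gamma(1.0 + k)
    alpha = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    xi = l1 - alpha * (1.0 - g) / k
    return xi, alpha, k
```

A round-trip check: compute the theoretical L-moments of a GEV with known parameters and verify that they are approximately recovered.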
ANNEX IC The Mann Kendall Non-Parametric Test
Introduction
The Mann-Kendall test is a non-parametric test for identifying trends in time series data. The test
compares the relative magnitudes of sample data rather than the data values themselves (Kendall,
1975; Gilbert, 1987). One benefit of this test is that the data need not conform to any particular
distribution. Moreover, data reported as non-detects can be included by assigning them a common
value that is smaller than the smallest measured value in the data set. The procedure described in
the subsequent paragraphs assumes that there exists only one data value per time period. When
multiple data points exist for a single time period, the median value is used.
The data values are evaluated as an ordered time series. Each data value is compared to all subsequent data values. The initial value of the Mann-Kendall statistic, S, is assumed to be 0 (i.e., no trend). If a data value from a later time period is higher than a data value from an earlier time period, S is incremented by 1. On the other hand, if the data value from a later time period is lower than a data value sampled earlier, S is decremented by 1. The net result of all such increments and decrements yields the final value of S.
Let x_1, x_2, …, x_n represent n data points, where x_j represents the data point at time j. Then the Mann-Kendall statistic S is given by

S = Σ_{k=1}^{n-1} Σ_{j=k+1}^{n} sign(x_j - x_k)     (44)

As already mentioned, the value of S is determined by the directional change between each pair of values: each term contributes +1 if the change is positive, -1 if it is negative, and 0 if there is a tie or a missing value. The contributions are summed as shown in equation (44).
A very high positive value of S is an indicator of an increasing trend, and a very low negative value indicates a decreasing trend. However, it is necessary to compute the probability associated with S and the sample size n to statistically quantify the significance of the trend. Since S has been shown to be asymptotically normally distributed, such a test may be carried out. The procedure to compute this probability is shown below.
Steps in Performing the Mann-Kendall Non-Parametric Test
Kendall (1975) describes a normal-approximation test that could be used for datasets with more than
10 values, provided there are not many tied values within the data set. The test procedure is as
follows:
• Calculate S as described in equation (44).
• Calculate the variance of S, VAR(S), by the following equation:

Var[S] = [ n(n-1)(2n+5) - Σ_t t(t-1)(2t+5) ] / 18     (45)

where n denotes the number of samples and t is the extent of a tie (the sum is taken over all ties).
• Compute a normalized test statistic Z as follows:

Z = (S - 1) / VAR(S)^(1/2)   if S > 0

Z = 0   if S = 0     (46)

Z = (S + 1) / VAR(S)^(1/2)   if S < 0
• Compute the probability associated with this normalized test statistic. The probability density function for a normal distribution with a mean of 0 and a standard deviation of 1 is given by the following equation:

f(z) = (1/√(2π)) e^(-z²/2)     (47)
• Select a probability level of significance (95% typically).
• The trend is said to be decreasing if Z is negative and the computed probability is greater than the level of significance. The trend is said to be increasing if Z is positive and the computed probability is greater than the level of significance. If the computed probability is less than the level of significance, there is no trend.
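The steps above can be implemented compactly; this sketch returns S, Z and a two-sided probability of |Z| under the no-trend hypothesis, to be compared against the chosen significance level (e.g. 0.05):

```python
from collections import Counter
from statistics import NormalDist

def mann_kendall(x):
    """Mann-Kendall trend test per equations (44)-(46).

    Returns (S, Z, p), where p is the two-sided probability of |Z|
    under H0 (no trend); reject H0 when p is small (e.g. < 0.05).
    """
    n = len(x)
    # S of equation (44): pairwise sign comparisons
    S = sum((x[j] > x[k]) - (x[j] < x[k])
            for k in range(n - 1) for j in range(k + 1, n))
    # Variance of equation (45), with the tie correction
    ties = Counter(x).values()
    var = (n * (n - 1) * (2 * n + 5)
           - sum(t * (t - 1) * (2 * t + 5) for t in ties)) / 18.0
    if S > 0:
        Z = (S - 1) / var ** 0.5
    elif S < 0:
        Z = (S + 1) / var ** 0.5
    else:
        Z = 0.0
    p = 2.0 * (1.0 - NormalDist().cdf(abs(Z)))
    return S, Z, p
```

A strictly increasing series gives the maximum S = n(n-1)/2 and a negligible p; a constant series gives S = 0 and Z = 0.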
References for Annex I
Gilbert, R.O., Statistical Methods for Environmental Pollution Monitoring, Van Nostrand Reinhold, New York, 1987.
Greenwood, J.A., J.M. Landwehr, N.C. Matalas and J.R. Wallis, Probability weighted moments: definition and relation to parameters of several distributions expressible in inverse form, Water Resources Research, 15, 1049-1054, 1979.
Hosking, J.R.M., L-moments: analysis and estimation of distributions using linear combinations of order statistics, Journal of the Royal Statistical Society, Series B, 52(1), 105-124, 1990.
Kendall, M.G., Rank Correlation Methods, 4th ed., Charles Griffin, London, 1975.
Water Resources Council Hydrology Sub-Committee, Bulletin 17B: Guidelines for Determining Flood Flow Frequency, 1982.
ANNEX II Dendrochronology Reconstruction of Hydrologic Records
Introduction
An approach to extending hydrologic records is to use paleoclimate information. Dendroclimatology is the reconstruction of past climatic events through the study of tree-ring growth. The precise dating and annual resolution of tree rings are superior to most sources of paleoclimatic information. Tree rings probably offer the best means of reconstructing large-scale and highly resolved patterns of climate (for references on dendrochronology see, e.g., Cook et al., 1994).
Tree rings have been used to reconstruct hydrologic variables like floods, mean annual flows and droughts. In these reconstructions a relationship, usually linear, is created between the dependent variable (rainfall, floods, or droughts) and the dendrochronologies. These reconstructions from tree rings explain only a fraction of the variance of the index being applied (PDSI, streamflow, rainfall). Thus, a reconstructed index has less variability than the original index. This influences hydrologic analyses, because deviations from normality are generally lower (e.g. the mean of the reconstructed drought deficits is underestimated).
One way to use these reconstructions is to study their tendencies (e.g. the bidecadal drought rhythm analyzed in Cook et al., 1997). But combining them in a statistical analysis of droughts with instrumental data is not simple. As an example, in the drought reconstruction for the continental U.S. (Cook et al., 1999; www.ncdc.noaa.gov), the PDSI grid reconstruction No. 63, in Texas, has a correlation coefficient ρ = 0.67. However, in only 47% of the dry years (PDSI ≤ -1) do the original PDSI and the reconstruction simultaneously indicate a dry year. A methodology to convert such valuable information into a form that can be compared with instrumental data was developed by González and Valdés (2003).
The regression models for the reconstructions are based on the Principal Component Analysis method, PCA (Davis, 2002). They are a particular case (one predictand) of Orthogonal Spatial Regression (OSR), which, together with canonical regression, is traditionally used in dendroclimatology (Cook et al., 1994). A summary of the procedure is presented below; it follows closely González and Valdés (2003).
Methodology
Let Y be the vector of the standardized (i.e., zero mean, unit standard deviation) predictand to be
reconstructed and TR the tree ring matrix where each column represents a standardized chronology.
Let C be the correlation matrix of TR. The eigenstructure analysis of C provides the matrix of principal component vectors of the predictor set, E (normalized eigenvectors). The components of TR in the orthonormal basis formed by E are the amplitudes in each principal mode. The application of the OSR method leads to the following expression:
Y = U·β + ε     (1)
where ε is the vector of errors, β is the vector of regression coefficients, and U is the matrix of selected amplitudes. A subset of amplitudes is chosen to reduce the level of artificial predictability (Davis, 1976) that the inclusion of too many predictors in the model may produce (Cook et al., 1994). The amplitudes are selected to explain most of the variance in the data set. The method applied to choose this subset of amplitudes, or potential predictors, starts with an initial selection by the cumulative eigenvalue product (PVP) criterion (i.e., in decreasing order of eigenvalues, select those whose cumulative product of eigenvalues is larger than or equal to 1; Guiot, 1985). Then, the amplitudes from smaller eigenvalues are sequentially disqualified from the regression until the F-test criterion, with 5% significance, rejects the hypothesis of identical residual variance of the models.
Cook et al. (1999) found an improvement of this regression by prewhitening the tree-ring chronologies and the PDSI data before doing the regression. This is due to the sometimes large differences in short-lag autocorrelation between climate and tree rings. The short-lag autocorrelation (red noise) of the PDSI is added a posteriori to the regression's result. If no improvement is found by prewhitening the tree-ring data, the filtering done in the model by the component selection was probably sufficient. On the other hand, significant correlation between the regression residuals and the PDSI at short lags can be found.
Hence, a low-order autoregressive model, AR(p), can be added to the principal components regression, leading to the following global model:

y_t = Σ_{i=1}^{q} u_{i,t} · β_i + Σ_{i=1}^{p} y_{t-i} · φ_i + a_t     (2)

where q is the number of selected components, the φ_i are the autoregressive parameters and a_t is the model residual, assumed to be independent and normally distributed, with zero mean and variance σ_a².
For consistency, principal components and autoregressive parameters are finally jointly recalibrated for
a calibration period using the sum of squares of the residuals as the objective function. For each set of
chronologies several model orders can be proposed: with and without including the tree ring
information of t+1 as predictors, combined with autoregressive models of orders 1, 2, or 3. Starting from
higher order models, the F-test (5% significance) criterion is again implemented to find the reduction of
order that is statistically significant, where lower order implies loss in predictability.
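As an illustrative sketch (not the authors' code), the OSR regression of equation (1) with PVP component selection can be written with numpy; the AR(p) term and the F-test recalibration of equation (2) are omitted here:

```python
import numpy as np

def osr_reconstruction(Y, TR):
    """Sketch of the principal-components (OSR) regression of equation (1).

    Y  : (n,) standardized predictand (e.g., PDSI)
    TR : (n, m) standardized tree-ring chronologies (columns)
    Components are selected with the cumulative eigenvalue product
    (PVP) criterion; returns (Y_hat, beta, q).
    """
    C = np.corrcoef(TR, rowvar=False)             # correlation matrix of TR
    evals, E = np.linalg.eigh(C)                  # eigenstructure of C
    order = np.argsort(evals)[::-1]               # decreasing eigenvalues
    evals, E = evals[order], E[:, order]
    # PVP: keep leading components while the cumulative product of
    # eigenvalues stays >= 1
    keep = np.cumprod(evals) >= 1.0
    q = max(1, int(keep.sum()))
    U = TR @ E[:, :q]                             # selected amplitudes
    beta, *_ = np.linalg.lstsq(U, Y, rcond=None)  # regression coefficients
    return U @ beta, beta, q
```

With synthetic chronologies that share a common climate signal, the reconstruction should correlate strongly with the predictand, which is a simple way to exercise the routine.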
References for Annex II
Cook, E. R., D. M. Meko and C. W. Stockton, 1997: A New Assessment of Possible Solar and Lunar Forcing
of the Bidecadal Drought Rhythm in the Western United States. J. Climate, 10, 1343-1356.
Cook, E. R., D. M. Meko, D. W. Stahle, and M. K. Cleaveland, 1999: Drought Reconstructions for the
Continental United States. J. Climate, Vol. 12, no. 7, 1145-1162.
Cook, E. R., K. R. Briffa, and P.D. Jones, 1994: Spatial regression methods in dendroclimatology: A review
and comparison of two techniques. Int. J. Climatology, 14, 379-402.
Davis, R. E., 1976: Predictability of sea surface temperature and sea level pressure anomalies over the
North Pacific Ocean. J. Phys. Oceanogr., 6, 249-266.
Davis, J., Statistics and Data Analysis in Geology, J Wiley, 3rd edition, 2002
González, J. and J.B. Valdés, "Bivariate Drought Recurrence Analysis Using Tree Rings Reconstructions," ASCE Journal of Hydrologic Engineering, 8(5), 246-258, 2003.
Guiot, J., 1985: The extrapolation of recent climatological series with spectral canonical regression. J.
Climatol., 5, 325-335.
ANNEX III Downscaling Approaches
There are basically two approaches to downscaling coupled climate model projections: statistical and dynamic downscaling. An excellent review of these methods is given by Fowler et al. (2007) in the International Journal of Climatology.
Statistical Downscaling
In statistical downscaling the GCM results are disaggregated in space using statistical procedures, ranging from a simple one using the ratio of the local mean to the mean of the GCM cell, to more sophisticated ones relating the local results to climatic precursors like ENSO using MSSA (Multichannel Singular Spectrum Analysis), as done in Cañon et al. (2011). An example of the latter method for the Southwest region of the US, which includes the Verde basin, is shown in Figure 1.
Figure 1. Original and Downscaled Results Using the MSSA Approach
Source: Cañon et al., 2007
The statistical downscaling techniques have the following advantages (Dominguez et al., 2009):
• Cheap and computationally efficient.
• Can use many different scenarios and model runs.
• Can be used to derive variables not available in RCMs.
• Easily transferable to other regions.
However, they have the following limitations:
• Require long and reliable observation data.
• Depend on the choice of predictors.
• Assume stationarity of the predictor-predictand relationship.
• Do not account for feedbacks.
Some studies suggest that statistically downscaled results may be comparable to those of the more complex dynamic downscaling techniques.
Dynamic Downscaling
Dynamic downscaling involves the use of regional climate models (RCMs) that use as boundary conditions those provided by the GCMs. The most commonly used dynamic model is WRF (Weather Research and Forecasting), developed by NCAR and other agencies. The model is in the public domain and available at www.ucar.edu/wrf.
Dynamic downscaling has the following advantages (Dominguez et al., 2009):
• Produces responses based on physically consistent processes.
• Captures feedbacks.
• Can model changes that have never been observed in the historical record.
• Useful where topographic controls are important.
It has, however, the following limitations:
• Requires significant computational power.
• Limited number of models, runs and timescales.
• Dependent on GCM boundary forcing.
• Problems with drifting of the large-scale climate.
Regional climate models will provide additional information about the changes in physical processes that will arise due to a changing climate:
• Changes in intensity and/or frequency of precipitation events.
• Water-holding capacity of the atmosphere.
• Diurnal cycle of precipitation.
• Changes in land-atmosphere feedbacks that might affect the strength of monsoons.
However, in most applications for operational and design purposes statistical downscaling can be an appropriate technique.
Water Balance Models
There is a plethora of basin-level water balance models to evaluate the temporal characteristics of streamflows, usually at the monthly level (Alley, 1984). Monthly water balance models are simple representations of the rainfall-runoff process of a basin, but they allow an understanding of the hydrologic process and its changes at the basin scale. They are widely used to evaluate changes in the basin due to modifications in land and soil use, climate change, etc. (e.g. Valdes and Seoane, 2000). These models use monthly series of precipitation and temperature or potential evapotranspiration as input to produce estimates of surface runoff, actual evapotranspiration and soil moisture storage.
One of these models is the abcd model (Thomas, 1981), which is described in the next section.
The abcd Model
The abcd model was originally proposed by Thomas, and its name reflects the four parameters (a, b, c and d) utilized in the model. It is based on an earlier conceptual model by Thornthwaite (1948) and its use has been widely reported in the literature. The general form of the model is shown in Figure 2.
Figure 2. General Form of the abcd Model
Source: Martinez et al., 2010
The continuity equation is represented as follows:

P_t - ET_t - GR_t - RO_t = ΔS = S_t - S_{t-1}     (1)

where:
P_t = monthly total precipitation [mm]
ET_t = monthly actual evapotranspiration [mm]
GR_t = monthly groundwater recharge [mm]
RO_t = upper-layer surface runoff [mm]
S_t = soil moisture storage [mm] at the end of time t

Equation (1) can be rearranged as:

(P_t + S_{t-1}) - (ET_t + S_t) = RO_t + GR_t     (2)
Thomas defined a new variable, W_t ("available water"), which is:

W_t = S_{t-1} + P_t     (3)

and another state variable Y_t, which is the sum of the actual evapotranspiration ET during period t and the soil moisture at the end of the period t, S, i.e.

Y_t = ET_t + S_t     (4)

The state variable Y_t is a nonlinear function of the available water W_t as follows:

Y_t(W_t) = (W_t + b)/(2a) - √[ ((W_t + b)/(2a))² - (W_t · b)/a ]     (5)

which is shown in Figure 2 and it shows that Yt