Intelligent Water Decisions Research Group

The Intelligent Water Decisions Research Group is working to help people make smart decisions about our most precious resource: Water.

A sightseeing tour in the field of extremes

Probability and statistics concepts evade many people. Join me on this brief tour exploring four landmarks in the field of extreme value statistics.

1. The distribution of maximum values

You are probably familiar with the idea of a distribution. If you’re not, a distribution is a list or function that shows all possible values (or intervals) of the data and how often they occur. Read more about distributions here.

There are different types of distribution, for example the uniform, Gaussian, lognormal and exponential distributions. A distribution encodes how likely events of various magnitudes are to occur. For example, annual rainfall might follow a Gaussian distribution: it is centred on an average value but deviates above or below it from year to year according to some spread. Daily rainfall amounts might follow a skewed distribution such as the lognormal. That is, often there will be low amounts of rainfall, but sometimes they will be high.

What about the distribution of extreme rainfall?

To see how this distribution arises, consider the occurrence of rainfall for 365 days in year 1, e.g. R1={0, 0, 0, 1.2, 2.4, 0, …}. Next, locate the maximum value from this block, i.e. max(R1). If we repeat this process for many years we get a distribution of annual maximums, X={max(R1), max(R2), …}.
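The block-maximum process above can be sketched in a few lines of Python. The rainfall model here is purely illustrative (an assumed mix of dry days and lognormal wet-day amounts), not a calibrated model:

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative daily rainfall model: roughly 30% of days are wet,
# wet-day amounts are lognormal (skewed), dry days are zero.
n_years, n_days = 50, 365
wet = rng.random((n_years, n_days)) < 0.3
amounts = rng.lognormal(mean=1.0, sigma=0.8, size=(n_years, n_days))
rain = np.where(wet, amounts, 0.0)

# Block maximum: the largest daily value within each year,
# giving X = {max(R1), max(R2), ...}
annual_max = rain.max(axis=1)
print(annual_max.shape)
```

Each entry of `annual_max` is one block maximum; a histogram of these values approximates the distribution of annual maximums.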

What is intriguing about this process is that the maximums will tend towards a known theoretical distribution. This distribution is the Generalised Extreme Value (GEV) distribution.

The app below shows the GEV in action

Even though the underlying parent distributions can be quite different, the distribution of maximum values has a similar shape.

Note that the scale will vary because it depends on the scale of the underlying process and the size of the sample the maximum is from. But this can be accounted for through rescaling. The technical term for this process is “max-stability”. The shape of the GEV distribution is stable after rescaling.
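As a rough numerical check of this convergence, we can draw block maxima from two quite different parents and fit the GEV family to each. This is a sketch using scipy; note that scipy's `genextreme` parameterises the shape `c` as the negative of the conventional shape parameter:

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(0)
n_blocks, block_size = 2000, 365

# Block maxima drawn from two quite different parent distributions
max_normal = rng.standard_normal((n_blocks, block_size)).max(axis=1)
max_expon = rng.exponential(size=(n_blocks, block_size)).max(axis=1)

# Fit the GEV to each set of maxima; despite the different parents,
# both sets are described by the same family after rescaling
for name, maxima in [("normal parent", max_normal),
                     ("exponential parent", max_expon)]:
    c, loc, scale = genextreme.fit(maxima)
    print(name, round(c, 2), round(loc, 2), round(scale, 2))
```

The fitted location and scale differ between the two cases, which is exactly the rescaling that max-stability refers to.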

2. The GEV distribution

So what exactly does the GEV distribution look like? Distributions are represented as a probability density function (PDF) or a cumulative distribution function (CDF).

The CDF varies from 0 to 1 and represents the cumulative probability of a random event X being less than some specific size x, i.e. P(X&lt;x). The probability P(10&lt;X&lt;20) is obtained by subtracting the probability of being less than 10 from the probability of being less than 20: P(10&lt;X&lt;20) = CDF(20) – CDF(10). This probability corresponds to the area under the PDF between the values x=10 and x=20.
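A minimal sketch of this calculation, using scipy's `genextreme` with illustrative parameter values (not fitted to any dataset):

```python
from scipy.stats import genextreme

# Illustrative GEV parameters; scipy's shape `c` is the
# negative of the conventional shape parameter.
dist = genextreme(c=-0.1, loc=30.0, scale=8.0)

# P(10 < X < 20) = CDF(20) - CDF(10),
# i.e. the area under the PDF between x=10 and x=20
p = dist.cdf(20.0) - dist.cdf(10.0)
print(round(p, 4))
```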

Explore the GEV distribution using the app below

You will see that there is a lot of flexibility in shapes that the GEV can take on. There are three parameters:

  1. The ‘Location’ parameter centres the distribution.
  2. The ‘Scale’ parameter controls the spread of the distribution about the centre.
  3. The ‘Shape’ parameter chiefly controls behaviour in the upper and lower tails. Note that some values of the shape parameter generate a fixed cut-off point on either the upper or lower tail.
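The cut-off behaviour can be seen from the support of the distribution. In scipy's `genextreme` parameterisation (shape `c` equal to the negative of the conventional shape), positive `c` gives a finite upper endpoint and negative `c` a finite lower endpoint:

```python
from scipy.stats import genextreme

# With location 0 and scale 1, vary the shape parameter:
# c > 0 gives a finite upper cut-off, c < 0 a finite lower
# cut-off, and c = 0 is the unbounded Gumbel case.
for c in (-0.3, 0.0, 0.3):
    lower, upper = genextreme(c, loc=0.0, scale=1.0).support()
    print(c, lower, upper)
```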

3. Sampling variability in the GEV

It is one thing to look at a theoretical function; it is another to look at data and ask how well it is represented by the function.

There may be reasons why the GEV does not fit well. For example, the true parent process may be a mixture of two or more processes, such as when El Niño and La Niña weather states are dominant (e.g. Leonard et al., 2008). As another example, there may be an underlying trend in the composition of the data, such as a trend in summer extremes (e.g. Zheng et al., 2015). More subtle reasons also abound, such as whether the values are sufficiently extreme, or independent.

One common issue when fitting extreme value distributions to data is an under-appreciation of the role of sampling variability. This is the variability that arises through inherent random chance and the fact that the number of observations is finite. For example, imagine determining the maximum height of the five people nearest to you. Every time you repeated this experiment (assuming you randomly mix with the population) there would be a different outcome.

When it comes to rainfall events, the historical record we have is one realisation of the underlying process. But, imagine that the same underlying process could have generated a different set of values. Sampling variability is especially a problem with “short” records. For example, a 20-year record of rainfall would only produce 20 annual maximums. The variability in a sample of 20 values is high, especially when interest is in rarer events such as the 1% exceedance event.

Explore variability using the app below

The app allows you to explore the amount of variability in the GEV distribution. A distribution with fixed parameter values will have variability due to the finite sample size. Larger samples will have less variability. The nature of the variability depends on the specified parameter values. It is usually larger in one or more of the tail regions when compared to the central region.
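A rough simulation of this effect, assuming an illustrative “true” GEV and repeatedly fitting 20-value samples (as a 20-year record would provide), shows how widely the estimated 1% exceedance event can vary:

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(1)

# Assume an illustrative "true" GEV for annual maximums
true = genextreme(c=-0.1, loc=30.0, scale=8.0)
q_true = true.ppf(0.99)  # the 1% annual exceedance event

# Fit a GEV to many independent 20-year records and record
# the estimated 1% event from each fit
estimates = []
for _ in range(100):
    sample = true.rvs(size=20, random_state=rng)
    c, loc, scale = genextreme.fit(sample)
    estimates.append(genextreme(c, loc=loc, scale=scale).ppf(0.99))

estimates = np.array(estimates)
print(round(q_true, 1), round(float(estimates.std()), 1))
```

The spread of the estimates is large relative to the true value, which is why short records make rare events hard to estimate.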

4. Bivariate extremes

The last stop on this tour is a glimpse at another whole vista: Multivariate distributions.

It is possible to consider the scatter plot of two variables together, and often there will be some dependence between them. A well-known example is the bivariate Gaussian distribution. It has a correlation parameter to represent the dependence (see the following app). The Gaussian distribution is popular because it occurs in many situations. It also has many theoretical niceties, including the ability to extend it to more than two dimensions. Through techniques known as copula functions, Gaussian dependence can also be extended to other distributions.

What is less well-known is that there is a sting in the tail of the Gaussian distribution.

To see this sting, we will use a different approach from the block-maximum approach in earlier examples. Here we will use a threshold to specify the extreme values. The correlation in the extreme values is less than the correlation in the parent Gaussian distribution. As the threshold becomes higher, the correlation drops towards zero. This is referred to as asymptotic independence, and there are many circumstances where it is appropriate. The following app demonstrates the drop in correlation for the upper tail of the Gaussian distribution.
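This drop in tail correlation can be checked by simulation. The sketch below uses numpy with an illustrative correlation of 0.7 and a modest set of thresholds:

```python
import numpy as np

rng = np.random.default_rng(7)

# Bivariate Gaussian with an illustrative correlation of 0.7
rho, n = 0.7, 200_000
cov = [[1.0, rho], [rho, 1.0]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

# Correlation among pairs where both values exceed a threshold;
# it drops towards zero as the threshold rises
corrs = []
for u in (0.0, 1.0, 2.0):
    both = (x > u) & (y > u)
    corrs.append(np.corrcoef(x[both], y[both])[0, 1])
    print(u, round(corrs[-1], 2))
```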

In contrast, there are many circumstances that are asymptotically dependent. This means that there can be strong dependence among the extreme events.

In these circumstances an assumption of asymptotic independence could lead to poor estimates of the probability of a rare event. An example made famous by the 2008 global financial crisis is the default rate on mortgages. In normal circumstances, people may default on a mortgage with little association to their neighbours. Extreme circumstances might be the closing of a large factory or the bursting of a housing bubble. In the extreme, the chance of two neighbours defaulting at the same time may be dependent.

Asymptotic dependence can occur in environmental variables too. For example, coastal flooding may depend on extreme storm tides and extreme rainfall events. When a large storm occurs, there may be dependence due to the common association of storm surge and streamflow with the storm event (see Zheng et al., 2014). But, in other locations there may be no dependence because the streamflow takes a long time to occur or because a tidal inlet is sheltered.

Summary of these probability and statistics concepts

Analysing rainfall and extreme events requires us to come to grips with these four concepts. Hopefully you enjoyed the tour, and have taken away some mental postcards. There are many bridging topics we did not cross into, which perhaps could be the subject of further tours. If you are interested in seeing these, leave a comment below.

Let us know what you think of the apps in this article too. Perhaps you have suggestions for other concepts of interest to illustrate with an online app. If you want to know of future learning apps I develop, keep watching this blog, go to my website, or follow me on twitter (@algorithmik).




Leonard, M., Metcalfe, A. and Lambert, M., 2008. Frequency analysis of rainfall and streamflow extremes accounting for seasonal and climatic partitions. Journal of hydrology, 348(1), pp.135-147, doi:10.1016/j.jhydrol.2007.09.045

Zheng, F., Westra, S., Leonard, M. and Sisson, S.A., 2014. Modeling dependence between extreme rainfall and storm surge to estimate coastal flooding risk. Water Resources Research, 50(3), pp.2050-2071, doi:10.1002/2013WR014616

Zheng, F., Westra, S. and Leonard, M., 2015. Opposing local precipitation extremes. Nature Climate Change, 5(5), pp.389-390, doi:10.1038/nclimate2579



  1. In the first app, the location of the maximums-distribution shifts with n. This is expected, see e.g. our paper in review at
    However, there should be less values for smaller n in the histogram, right? I understand you compute max(rnorm(365)) n times, which would relate to the following R code:
    par(mfrow=c(2,1), las=1, cex.main=1)
    hist(replicate(10, max(rnorm(365))), xlim=c(0,5), col="purple")
    hist(replicate(100, max(rnorm(365))), xlim=c(0,5), col="purple")
    I also do not see such a strong sample size dependency as with the app…
    Any thoughts on this?

    • Michael Leonard

      July 27, 2016 at 9:25 pm


      Thank you for your comment. The difference is due to what I am calling the “sample”. In the app, “n” refers to the size of the block, i.e. max(rnorm(n)), which I replicate 20000 times:

      par(mfrow=c(2,1), las=1, cex.main=1)
      hist(replicate(20000, max(rnorm(10))), xlim=c(0,5), col=’purple’)
      hist(replicate(20000, max(rnorm(100))), xlim=c(0,5), col=’purple’)

      Sorry for any confusion I have caused.

      • Ah, I get it. I find it’s not quite easy to clarify the app text… you could replace “many times” with “20’000 times” so people don’t relate “many” to “n”. And the slider could be renamed to “sample size (# observations per block)” or something… Anyway, thanks for the quick clarification!
