In this article, researchers and industry practitioners will learn how to reduce uncertainty in hydrological predictions. It presents the recommendations of a recent study published in the journal Water Resources Research [1].

For the first time, we have identified the best error models to use for representing uncertainty in predictions for hydrological modelling applications. So that you can use the recommendations most effectively, we begin by explaining why it is important to estimate uncertainty in hydrological predictions.

The study was an outcome of a long-term collaboration between the University of Adelaide, the University of Newcastle and the seasonal streamflow forecasting team at the Bureau of Meteorology.

The long-term goal of this research is to improve streamflow forecasts around Australia (see impact).

Why should we quantify the uncertainty in hydrological predictions?

Rainfall-runoff models predict the response of flow in streams and rivers to rainfall (referred to as hydrological predictions).

The hydrological predictions from these models are widely used to inform decisions by a range of authorities. They include:

  • flood warning services
  • water supply authorities
  • environmental managers
  • irrigators
  • hydroelectricity generators.

Given the reliance on these hydrological predictions, it is important to understand the uncertainty in these predictions.

Hydrological predictions are not perfect and can have large errors.

These errors are typically of the order of 40–50% [2]. Observed and predicted streamflow differ because:

  • Catchments are complex, and hydrological models are simplified representations of complicated catchment physics
  • Catchment processes are hard to measure: Rainfall varies in space and time. Streamflow is not measured directly. This produces observation errors in the rainfall and streamflow data used to develop and test these models.

Quantifying the errors in predictions allows us to estimate uncertainty in predictions.

Uncertainty estimation is essential for quantifying risk

If we do not account for uncertainty we can under-estimate the risk of failure.

Example showing that if we do not consider uncertainty in system performance, we would choose Action B over Action A. This is because Action B has higher system performance than Action A.

Figure 1: Hypothetical example showing predicted system performance (e.g. flood mitigation, drought security, stream health) resulting from Actions A and B. In this case we do not consider uncertainty in system performance for each action.

Consider the hypothetical example introduced in Figure 1.

You are given the task of choosing between Action A and Action B to improve system performance, and avoiding system failure. This could be flood mitigation, drought security or stream health. If you ignore the uncertainty in performance (as in Figure 1) you would choose Action B since it has the highest performance.

Example showing that if we consider uncertainty in system performance, we would choose Action A over Action B. This is because Action A has a lower probability of failure than Action B.

Figure 2: The same as Figure 1, but now considering uncertainty in system performance for each action.

But, you might make a different decision if you were to quantify the uncertainty (see Figure 2). Action A has a much lower probability of failure and would be the preferred action if you want to reduce risk.
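The comparison between Figures 1 and 2 can be put into numbers with a short sketch. The values below are hypothetical and chosen only to mimic the shape of the figures: performance for each action is assumed to follow a normal distribution, and `FAILURE_THRESHOLD` is an invented level below which the system is considered to have failed.

```python
from statistics import NormalDist

# Hypothetical numbers, chosen only to mimic the shape of Figures 1 and 2.
FAILURE_THRESHOLD = 50.0  # performance below this counts as system failure

actions = {
    "A": NormalDist(mu=70.0, sigma=8.0),   # lower expected performance, narrow spread
    "B": NormalDist(mu=80.0, sigma=25.0),  # higher expected performance, wide spread
}


def failure_probability(performance: NormalDist, threshold: float) -> float:
    """Probability that performance falls below the failure threshold."""
    return performance.cdf(threshold)


# Decision that ignores uncertainty: pick the highest expected performance.
best_by_mean = max(actions, key=lambda name: actions[name].mean)

# Decision that accounts for uncertainty: pick the lowest probability of failure.
best_by_risk = min(
    actions, key=lambda name: failure_probability(actions[name], FAILURE_THRESHOLD)
)

for name, dist in actions.items():
    p_fail = failure_probability(dist, FAILURE_THRESHOLD)
    print(f"Action {name}: expected performance {dist.mean:.0f}, P(failure) = {p_fail:.3f}")
print("Highest expected performance:", best_by_mean)
print("Lowest probability of failure:", best_by_risk)
```

With these assumed numbers, Action B wins on expected performance while Action A wins on probability of failure, which is the reversal the two figures illustrate.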

Water management is all about balancing risks, e.g. risk of floods, risk of water shortage. So it is clear that quantifying uncertainty is essential for quantifying risk. If we ignore it, we are under-estimating the risk of unwanted outcomes.

It’s not that difficult

There is a perception that uncertainty analysis is hard. Our research shows that you can get robust uncertainty estimates using simple approaches. See recommendations.

What are the challenges with estimating uncertainty?

Time series showing observed and predicted streamflow. There are differences between the two time series. These differences are largest when predicted streamflow is largest.

Figure 3: Observed streamflow data from the Cotter River (ACT, Australia) compared with predictions from the GR4J hydrological model. The errors between observations and predictions are larger for higher streamflow predictions. The ovals highlight periods where this is evident.

Uncertainty estimation is based on the statistical modelling of the errors between hydrological predictions and observations.

There are many challenges in modelling these errors. For example, higher streamflow predictions have larger errors. This is seen in Figure 3. This is known as “heteroscedasticity” in errors (non-constant variance). Errors are also persistent (e.g. large errors typically follow large errors) and skewed.

Appropriate modelling of errors needs to account for these properties.
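Two of these properties can be checked with very simple diagnostics. The sketch below uses a short, made-up flow series (not data from the Cotter River or from the study) and two illustrative helper functions: one compares the spread of errors at low and high predicted flows, the other computes the lag-1 autocorrelation of the errors. Skew is not covered here.

```python
from statistics import mean, pstdev

# Made-up daily flows (arbitrary units) around a single high-flow event.
predicted = [2, 2, 3, 10, 30, 40, 25, 12, 6, 4, 3, 2]
observed = [2.1, 2.2, 3.3, 12, 38, 50, 30, 14, 5.5, 3.6, 2.7, 1.8]

residuals = [obs - pred for obs, pred in zip(observed, predicted)]


def residual_spread_by_flow(pred, res):
    """Std dev of residuals for the lower and upper halves of predicted flow."""
    ranked = sorted(zip(pred, res))  # order the time steps by predicted flow
    half = len(ranked) // 2
    low = [r for _, r in ranked[:half]]
    high = [r for _, r in ranked[half:]]
    return pstdev(low), pstdev(high)


def lag1_autocorrelation(res):
    """Correlation between each residual and the one that follows it."""
    centre = mean(res)
    dev = [r - centre for r in res]
    numerator = sum(a * b for a, b in zip(dev[:-1], dev[1:]))
    denominator = sum(d * d for d in dev)
    return numerator / denominator


sd_low, sd_high = residual_spread_by_flow(predicted, residuals)
r1 = lag1_autocorrelation(residuals)

print(f"Residual std dev at low flows:  {sd_low:.2f}")
print(f"Residual std dev at high flows: {sd_high:.2f}")
print(f"Lag-1 autocorrelation:          {r1:.2f}")
```

A much larger spread at high flows points to heteroscedasticity, and a clearly positive lag-1 autocorrelation points to persistence.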

How can we estimate uncertainty in predictions?

Many approaches are used to estimate uncertainty in hydrological predictions. Methods such as Bayesian Total Error Analysis (BATEA) [3] disaggregate uncertainty into different components. These components include input data errors, model structure errors, and output data errors.

But in many practical applications, an error model that aggregates all errors is preferable because we are primarily interested in the uncertainty in predictions. These are referred to as “residual error models” in the literature, but here we simplify this to “error models”.

In both operational and research settings, a wide range of different error models are used. These include

  1. weighted least squares (WLS) approaches, and
  2. approaches based on transformations of the data (e.g. Log and Box Cox transformations).

But until now, no one has evaluated which error models work best over a diverse range of catchments.
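To make the two families concrete, here is a minimal sketch of how each turns a raw error into a residual. It assumes the standard Box Cox transformation without an offset (so flows must be positive) and a WLS error standard deviation that grows linearly with predicted flow. The parameter names `a` and `b` and all numbers are illustrative assumptions, not values from the study.

```python
import math


def box_cox(q: float, lam: float) -> float:
    """Box Cox transform of a positive flow q; lam = 0 gives the Log transform."""
    if q <= 0:
        raise ValueError("flow must be positive (zero flows need an offset, not shown)")
    if lam == 0:
        return math.log(q)
    return (q ** lam - 1.0) / lam


def transformed_residual(q_obs: float, q_pred: float, lam: float) -> float:
    """Residual computed in transformed space."""
    return box_cox(q_obs, lam) - box_cox(q_pred, lam)


def wls_residual(q_obs: float, q_pred: float, a: float, b: float) -> float:
    """Raw residual scaled by a std dev that grows linearly with predicted flow."""
    sigma = a + b * q_pred
    return (q_obs - q_pred) / sigma


# The same 20% over-prediction error at a low flow and at a high flow.
cases = [(1.2, 1.0), (120.0, 100.0)]
for q_obs, q_pred in cases:
    print(
        f"pred={q_pred:6.1f}  raw={q_obs - q_pred:6.2f}  "
        f"log={transformed_residual(q_obs, q_pred, 0):.3f}  "
        f"BC0.2={transformed_residual(q_obs, q_pred, 0.2):.3f}  "
        f"WLS={wls_residual(q_obs, q_pred, a=0.1, b=0.2):.3f}"
    )
```

The raw errors differ by a factor of 100, while the Log residuals are identical and the BC0.2 and WLS residuals are of similar size. This is how both families bring errors at low and high flows onto a comparable scale.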

So how big a difference can the choice of error model make?

Within the hydrological community, both WLS and transformation approaches are widely used to account for heteroscedasticity in errors. Thus, we may think that the specific error model would not make a big difference to predictions.

Surprisingly, it can make a very big difference.

Probability limits for streamflow predictions in the Cotter River, based on the WLS and BC0.2 error models. The 90% probability limits are much wider for WLS than for BC0.2.

Figure 4: Uncertainty estimates for hydrological predictions in the Cotter River (ACT, Australia) based on the Weighted Least Squares (WLS) error model and the Box Cox error model with fixed parameter (BC0.2).

Figure 4 shows uncertainty estimates for streamflow in the Cotter River, based on the widely used GR4J hydrological model. In the top panel, a weighted least squares (WLS) error model is used. In the bottom panel, the Box Cox transformation is used, with a transformation parameter lambda=0.2 (BC0.2).

We see that WLS over-estimates high flows, and the uncertainty in the WLS predictions is greater than in the BC0.2 predictions.

The BC0.2 error model produces predictions that are more precise and more consistent with the observed data. Thus they would be far more useful for management.
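One way to see how a transformation-based error model produces limits like those in Figure 4 is sketched below. It assumes errors are Gaussian with a constant standard deviation in Box Cox space and back-transforms the limits to flow units. The value of `sigma` is invented for illustration, and the offset parameter and calibration steps used in practice are left out.

```python
import math
from statistics import NormalDist


def box_cox(q: float, lam: float) -> float:
    """Box Cox transform of a positive flow q; lam = 0 gives the Log transform."""
    return math.log(q) if lam == 0 else (q ** lam - 1.0) / lam


def inverse_box_cox(z: float, lam: float) -> float:
    """Back-transform to flow units; values below the valid range map to zero flow."""
    if lam == 0:
        return math.exp(z)
    base = lam * z + 1.0
    return base ** (1.0 / lam) if base > 0 else 0.0


def prediction_limits(q_pred: float, sigma: float, lam: float = 0.2, coverage: float = 0.90):
    """Lower and upper probability limits around a predicted flow."""
    z_score = NormalDist().inv_cdf(0.5 + coverage / 2.0)  # about 1.645 for 90%
    centre = box_cox(q_pred, lam)
    lower = inverse_box_cox(centre - z_score * sigma, lam)
    upper = inverse_box_cox(centre + z_score * sigma, lam)
    return lower, upper


SIGMA = 0.5  # assumed error std dev in transformed space (illustrative only)
for q_pred in (1.0, 10.0, 100.0):
    lower, upper = prediction_limits(q_pred, SIGMA)
    print(f"pred={q_pred:6.1f}  90% limits = [{lower:7.2f}, {upper:7.2f}]")
```

Although the standard deviation is constant in transformed space, the back-transformed limits widen as the predicted flow increases, which is how the error model represents heteroscedasticity.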

How do we identify robust error models for multiple catchments?

The results in Figure 4 highlight the importance of the error model in predicting uncertainty. But are these results consistent over multiple catchments, with different physical characteristics? How do other error models compare with the two models considered in Figure 4? And why do some error models perform better than others?

To identify robust error models for practical purposes we performed a wide range of empirical case studies based on

  • 8 common error models
  • 23 catchments from Australia and the USA
  • 2 hydrological models.

We strengthened the robustness of our findings using theoretical analyses to understand when and why error models performed the way they did.

The findings of this study have recently been published in the Water Resources Research journal.

How do we work out which error model is best?

To estimate uncertainty and describe risk, we want predictions that are

  • Reliable: probabilistic predictions are statistically consistent with observed data. For example, 5% of the observed data should lie outside the 95% probability limits. If 20% of the observed data lie outside the 95% limits, then the probabilistic predictions are not reliable.
  • Precise: small uncertainty in predictions. For example, we want the 95% probability limits to be narrow, as with BC0.2 in Figure 4, and not unnecessarily wide, as with WLS.
  • Unbiased: total volume matches observations.
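The three properties above can be illustrated with simplified stand-in metrics. These are not the exact metrics used in the paper. The sketch assumes you already have observed flows, a single-valued prediction, and lower and upper probability limits for each time step. All numbers are made up.

```python
def coverage_fraction(obs, lower, upper):
    """Reliability check: fraction of observations inside the probability limits."""
    inside = sum(1 for o, lo, hi in zip(obs, lower, upper) if lo <= o <= hi)
    return inside / len(obs)


def relative_width(obs, lower, upper):
    """Precision check: mean width of the limits relative to mean observed flow."""
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(obs)
    mean_obs = sum(obs) / len(obs)
    return mean_width / mean_obs


def volume_bias(obs, pred):
    """Bias check: relative difference between predicted and observed total volume."""
    return (sum(pred) - sum(obs)) / sum(obs)


# Made-up example values (arbitrary flow units).
observed = [1.0, 2.0, 4.0, 8.0, 5.0]
predicted = [1.1, 1.8, 4.5, 7.0, 5.2]
lower_limit = [0.6, 1.0, 2.5, 4.0, 3.0]
upper_limit = [1.8, 3.0, 7.0, 7.5, 8.0]

print(f"Coverage:       {coverage_fraction(observed, lower_limit, upper_limit):.2f}")
print(f"Relative width: {relative_width(observed, lower_limit, upper_limit):.2f}")
print(f"Volume bias:    {volume_bias(observed, predicted):+.2f}")
```

For 95% limits, a coverage close to 0.95 would indicate reliable predictions. A smaller relative width indicates better precision, and a volume bias close to zero indicates unbiased predictions.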

To compare performance across 368 case studies (23 catchments x 2 hydrological models x 8 error models), we summarise these aspects of predictive performance using metrics.

Ideally we would use probabilistic predictions that are reliable, precise and unbiased. But in practice this is hard to achieve.

So which error model should I use?

Based on empirical case studies and theoretical analysis we came to the following four conclusions.

1. Error models that transform the data produce more reliable predictions

Error models based on the Log transformation and Box Cox transformation with fixed parameter are better than the weighted least squares (WLS) error model. This is because transformation approaches capture the real skew in residuals.

2. Choosing the best error model depends on the type of flow regime in the catchment.

For perennial catchments (that always flow), the log and log-sinh error models produce reliable and precise predictions.

In ephemeral catchments (with a large number of zero flow days) these error models produce very imprecise predictions. The BC transformation with lambda=0.2 or lambda=0.5 is better in these catchments.

3. More complex error models do not necessarily produce the best predictions.

Calibrating the parameter in the Box Cox error model produces predictions that are reliable. But these predictions are often extremely imprecise. This method produces improved estimates of low flows at the expense of high flows.

The two-parameter log-sinh transformation error model produced similar predictions to the simpler log transformation error model in perennial catchments. In ephemeral catchments it produced predictions with poor precision.

4. No single error model performs best in all aspects of predictive performance.

In other words, there is a trade-off between different aspects of performance.

In perennially flowing catchments, we found that the Log transformation error model produced the best reliability. But in these same catchments, the Box Cox transformation with lambda=0.2 produced predictions with the best precision.

This means that your choice of error model will depend on:

  • what you will use predictions for, and thus which metrics are most important to you, and
  • the resources available for trialling different error models.

Broad recommendations

If you’re after a simple choice of a single error model and don’t want to undertake an in-depth analysis of performance trade-offs, we make the following broad recommendations.

Perennial catchments

In perennial catchments, use:

  • Log error model if reliability is important
  • Box Cox transformation with lambda=0.2 if precision is important
  • Box Cox transformation with lambda=0.5 if low bias is important.

Ephemeral catchments

In ephemeral catchments, use:

  • Box Cox transformation with lambda=0.2 if reliability is important
  • Box Cox transformation with lambda=0.5 if precision or bias is important.

See the “Recommendations” section of our paper for further details.
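If it helps, the broad recommendations can be encoded as a small lookup table. The function name, keys and labels below are illustrative choices rather than part of any published software.

```python
# Broad recommendations, keyed by (flow regime, performance priority).
RECOMMENDED_ERROR_MODEL = {
    ("perennial", "reliability"): "Log",
    ("perennial", "precision"): "Box Cox, lambda=0.2",
    ("perennial", "bias"): "Box Cox, lambda=0.5",
    ("ephemeral", "reliability"): "Box Cox, lambda=0.2",
    ("ephemeral", "precision"): "Box Cox, lambda=0.5",
    ("ephemeral", "bias"): "Box Cox, lambda=0.5",
}


def recommend_error_model(regime: str, priority: str) -> str:
    """Return the broadly recommended error model for a catchment."""
    key = (regime.strip().lower(), priority.strip().lower())
    if key not in RECOMMENDED_ERROR_MODEL:
        raise ValueError(
            "regime must be 'perennial' or 'ephemeral'; "
            "priority must be 'reliability', 'precision' or 'bias'"
        )
    return RECOMMENDED_ERROR_MODEL[key]


print(recommend_error_model("perennial", "reliability"))
print(recommend_error_model("ephemeral", "precision"))
```

A lookup like this only covers the simple case of choosing a single error model. It does not replace an analysis of performance trade-offs for a specific application.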

What is the impact of these findings on improving predictive uncertainty?

If you follow these broad recommendations, you can expect to reduce your predictive uncertainty (i.e. improve precision) from approximately 105% to 40% of observed streamflow, and to decrease the bias in total volume from 25% to 4%, without major compromises in reliability. These figures are based on the median metrics across 46 case studies (23 catchments x 2 hydrological models).

The Bureau of Meteorology (BOM) is currently testing the recommended error model to improve their Seasonal Streamflow Forecasting Service. The recommendations are being trialled to improve the post-processing of monthly and seasonal forecasts.

Initial results are promising, with a significant increase in forecast performance over a large number of sites across Australia. We plan to publish this in an upcoming article in the journal Hydrology and Earth System Sciences [4].

For more information on heteroscedastic residual error models, please check out our recent article in Water Resources Research, or contact me via email.

We will also be presenting this work at the European Geosciences Union General Assembly 2017 on 23–28 April in Vienna, Austria (abstract). We look forward to seeing you there.

For a sneak peek, see the seminar we recently presented at the Bureau of Meteorology, “Advances in improving streamflow predictions, with application in forecasting”, available on figshare.


The outcomes of this research represent the combined work of a great team of researchers and operational personnel at the University of Adelaide (UoA), University of Newcastle (UoN) and the Bureau of Meteorology (BoM).

This includes intellectual contributions from Associate Professor Mark Thyer (UoA), Professor Dmitri Kavetski (UoA), Professor George Kuczera (UoN), Narendra Tuteja (BoM), Julien Lerat (BoM), Daehyok Shin (BoM) and Fitsum Woldemsekel (BoM), and the financial support of the Australian Research Council (ARC Linkage Grant LP140100978), the Bureau of Meteorology and South East Queensland Water. The opinions expressed in this article are the author’s own and do not reflect the view of the University of Adelaide, the University of Newcastle, the Bureau of Meteorology or South East Queensland Water.


1. McInerney, D., Thyer, M., Kavetski, D., Lerat, J. and Kuczera, G. (2017), Improving probabilistic prediction of daily streamflow by identifying Pareto optimal approaches for modeling heteroscedastic residual errors. Water Resources Research. doi:10.1002/2016WR019168

2. Evin, G., M. Thyer, D. Kavetski, D. McInerney, and G. Kuczera (2014), Comparison of joint versus postprocessor approaches for hydrological uncertainty estimation accounting for error autocorrelation and heteroscedasticity, Water Resources Research, 50(3).

3. Kavetski D, Kuczera G and Franks, SW (2006) Bayesian analysis of input uncertainty in hydrological modelling: 1. Theory, Water Resources Research, 42, W03407.

4. Woldemsekel F., Lerat, J., Tuteja, N., Shin, D.H., Thyer, M., McInerney, D., Kavetski, D., Kuczera, G. (2017) Evaluating residual error approaches to post-processing monthly and seasonal streamflow forecasts, Hydrology and Earth System Sciences, Special Issue on Sub-seasonal to seasonal hydrological forecasting (in preparation).