Statistical Downscaling of Gridded Air-Quality Data

The Automatic Urban and Rural Network (AURN), maintained by DEFRA, provides hourly air quality monitoring at about 170 sites across the United Kingdom. Some sites are located in rural and remote areas to monitor background air quality, while others are located in industrial, urban or roadside locations that are prone to poor air quality, in an attempt to capture the spatial heterogeneity of pollutant concentrations. However, monitoring data from only about 170 sites cannot give accurate estimates of air quality at any given location within the UK, be it a point on the map or an administrative geography such as a local authority.

Better spatial coverage of air pollution is provided by the output of an atmospheric dispersion model such as the Air Quality Unified Model (AQUM), developed by the UK Met Office specifically to deliver air quality forecasts. The model takes into account emission inventories and meteorological variables such as temperature, wind speed and wind direction. However, it is well known that raw AQUM output is largely negatively biased; see e.g. Savage et al. (2013): Savage, N. H., P. Agnew, L. S. Davis, C. Ordóñez, R. Thorpe, C. E. Johnson, F. M. O’Connor, and M. Dalvi (2013). Air quality modelling using the Met Office Unified Model (AQUM OS24-26): model description and initial evaluation. Geoscientific Model Development 6(2), 353–372. There is therefore an urgent need to correct these biases at whatever spatial resolution the data user specifies.

Based on complex Bayesian statistical models (Mukhopadhyay and Sahu, 2016), this project develops methods to integrate sparse AURN air quality monitoring data with high spatial resolution AQUM output, run in hindcast mode, to produce estimates of air quality at any given spatial location. The Bayesian model is essentially used as a space-time bias correction tool and has been run on daily data, separately for each of four of the most harmful pollutants (nitrogen dioxide, ozone, PM10 and PM2.5), for the five-year period 2007-2011. Rigorous out-of-sample statistical validation has been used to establish a very high level of empirical accuracy for the Bayesian model for each of the four pollutants.
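The idea of using monitoring data to correct a biased model field can be illustrated with a deliberately simple stand-in. The sketch below is not the Bayesian space-time model of Mukhopadhyay and Sahu (2016); it assumes a plain least-squares calibration of model output against observations at the monitoring sites, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_linear_bias_correction(model_at_sites, observed_at_sites):
    """Least-squares fit of: observed = intercept + slope * model.

    A simple stand-in for illustration only, not the Bayesian model
    described in the text.
    """
    x = np.asarray(model_at_sites, dtype=float)
    y = np.asarray(observed_at_sites, dtype=float)
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coef[0]), float(coef[1])

def apply_bias_correction(model_values, intercept, slope):
    """Apply the fitted calibration to model output at any location."""
    return intercept + slope * np.asarray(model_values, dtype=float)

# Synthetic illustration (made-up numbers, not AURN or AQUM data).
rng = np.random.default_rng(0)
true_conc = rng.uniform(10.0, 60.0, size=170)           # 170 hypothetical sites
model_at_sites = 0.7 * true_conc - 2.0                  # negatively biased "model"
observed = true_conc + rng.normal(0.0, 1.0, size=170)   # noisy "monitor" readings

intercept, slope = fit_linear_bias_correction(model_at_sites, observed)
corrected = apply_bias_correction(model_at_sites, intercept, slope)

raw_bias = float(np.mean(model_at_sites - observed))
corrected_bias = float(np.mean(corrected - observed))
print(f"mean bias before: {raw_bias:.2f}, after: {corrected_bias:.2e}")
```

A calibration of this kind ignores spatial and temporal correlation and gives no uncertainty estimates, which is part of what the Bayesian approach is meant to provide.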

Using the best Bayesian model, this MEDMI pilot project obtains daily air quality estimates at each of the 151,284 corner points of the 1-kilometre grid covering our study region in England and Wales, for each of the 1826 days in the five-year period 2007-2011.
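As a quick check of the figures above, the number of days and the resulting number of daily grid estimates per pollutant can be computed directly. This assumes the period runs from 1 January 2007 to 31 December 2011 inclusive; the variable names are illustrative.

```python
from datetime import date

# Assumed period boundaries: 1 January 2007 to 31 December 2011, inclusive.
start = date(2007, 1, 1)
end = date(2011, 12, 31)

n_days = (end - start).days + 1        # includes the leap day in 2008
n_grid_points = 151_284                # grid points quoted in the text
n_daily_estimates = n_days * n_grid_points

print(n_days)             # 1826
print(n_daily_estimates)  # 276244584 daily estimates per pollutant
```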

In addition to the above estimates at the corners of the 1-kilometre grid, we also obtain daily estimates of aggregated air quality for each of the 346 local authorities within England and Wales. The adopted statistical methods also allow us to report the uncertainty associated with these air quality estimates, which can likewise be downloaded for scientific investigations. Temporal aggregation to annual levels has been performed both at the 1-kilometre spatial resolution and for each of the 346 local authority areas. These data are also available for download, and illustrative annual maps are shown below.
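One common way to obtain aggregated estimates together with their uncertainty is to aggregate each posterior draw first and summarise across draws afterwards. The sketch below assumes that approach; the source does not state how the aggregation was implemented, and the array layout, the unweighted averaging of grid points within an area, and all names are hypothetical.

```python
import numpy as np

def aggregate_to_areas(draws, cell_to_area, n_areas):
    """Average grid-cell draws within each area.

    draws: array of shape (n_draws, n_days, n_cells)
    cell_to_area: integer area index for each cell, shape (n_cells,)
    Returns an array of shape (n_draws, n_days, n_areas).
    """
    draws = np.asarray(draws, dtype=float)
    cell_to_area = np.asarray(cell_to_area)
    out = np.empty(draws.shape[:2] + (n_areas,))
    for area in range(n_areas):
        out[..., area] = draws[..., cell_to_area == area].mean(axis=-1)
    return out

def summarise(draws):
    """Estimate and uncertainty: mean and standard deviation across draws."""
    return draws.mean(axis=0), draws.std(axis=0, ddof=1)

# Synthetic illustration: 200 draws, 365 days, 6 grid cells in 2 areas.
rng = np.random.default_rng(1)
cell_draws = rng.normal(loc=30.0, scale=5.0, size=(200, 365, 6))
cell_to_area = np.array([0, 0, 0, 1, 1, 1])

area_draws = aggregate_to_areas(cell_draws, cell_to_area, n_areas=2)
daily_mean, daily_sd = summarise(area_draws)      # each of shape (365, 2)

annual_draws = area_draws.mean(axis=1)            # temporal aggregation per draw
annual_mean, annual_sd = summarise(annual_draws)  # each of shape (2,)
```

Aggregating per draw before summarising means the reported standard deviations reflect the joint variability of the cells and days being averaged, rather than an average of cell-level uncertainties.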

In summary, this project has produced estimates of air quality for four of the most harmful pollutants in England and Wales for the five-year period 2007-2011. These estimates, along with their uncertainties, can be used in epidemiological and other scientific studies linking air pollution to human health. Further user-defined temporal aggregation (e.g. monthly or seasonal) and spatial aggregation (e.g. to electoral wards) is currently under development.

Credits: This research was also supported by a pair of EPSRC grants, EP/J017485/1 and EP/J017442/1, “A rigorous statistical framework for estimating the long-term health effects of air pollution”, awarded to the Universities of Southampton and Glasgow. The first of these grants also enabled the Met Office to produce the AQUM output used here.

Available products for each pollutant (NO2, O3, PM10, PM2.5):
Annual 1 km air quality
Daily 1 km air quality
Annual local authority air quality
Daily local authority air quality