Decision and Policy Analysis Research Area – DAPA

Looking for weather data?

Cross-posted from http://ccafs.cgiar.org/blog/looking-weather-data

If you are an agricultural researcher, particularly one who has investigated the abiotic controls on plant growth, it’s very likely that you have come across the problem of finding appropriate weather data. This is particularly true when (a) you don’t work in an institution that has its own weather station(s); (b) you’re not well connected enough to get data from close colleagues for free; (c) you don’t have enough budget to purchase the data you need from other institutions; or (d) some combination of the above applies. For instance, international organisations cannot realistically rely on their own stations or on purchased data (the ways around “a” and “c”) because of the potentially large number of different datasets that a single broad analysis might require, whereas a young researcher just starting out in agro-climatological research, with few contacts, cannot count on colleagues (the way around “b”). So, cases vary, but what if you are young and work in an international organisation?

The available data:

We have realised that agricultural researchers prefer to use weather station data for their studies, as shown in Figure 1 below, and the reason for this is simple: we look for detailed and accurate data.

Figure 1 Frequency of use of the different data sources in agricultural studies, based on a review of 247 recordings from published studies (taken from a comprehensive data use survey; Ramirez-Villegas and Challinor 2012, see Further reading below)


Nevertheless, measurements of weather for a given site are often unavailable because (1) there is no weather station; (2) weather stations are not well maintained, so data are either available only for a short period or contain gaps; (3) collected data are not properly stored; (4) data do not pass basic quality checks; and/or (5) access to data is restricted by the holding institutions. All of this further constrains agricultural impact assessment and highlights the importance of making data public.


In the last 10 years, however, different institutions have made various attempts to develop useful datasets, usually based either on a combination of weather station data, satellite data, and numerical weather prediction models together with interpolation methods, or on climate models alone. The use of these datasets for agricultural modelling is rather limited, for one or more of the following reasons:

  1. their time step is long (monthly in the best case);
  2. they are available only as averages over several years (see for example CRU, WorldClim);
  3. their spatial resolution is too coarse;
  4. their geographic coverage is not wide enough;
  5. only certain variables (usually temperatures and rainfall) are reported, whereas other agriculturally relevant measures (e.g. potential and/or reference evapotranspiration, relative humidity, solar radiation) are rarely reported, or are reported only at coarse spatial and/or temporal scales.

Apart from the constraints related to access and weather station locations, probably the most important issue regarding weather data is quality, which also greatly affects the performance of impact models.

So, if you’re looking for good quality, well-distributed, abundant, and freely accessible weather data, there’s no option for you. But if you’re a bit more flexible, here are your options (I will focus on rainfall, as this is the least predictable variable):

  • Use GSOD (Global Summary of the Day), a daily dataset of about 9,000 weather stations worldwide. Initially this seems to be a good option, but if you look at the data, you will find some issues:
    • Of the potential ~9,000 stations, very few in tropical areas still seem to be reporting to NCDC (the GSOD database holder).
    • It seems that in the early years (<1950) rainfall was reported by ASR33 teletype, which produced odd rainfall peaks (P. Jones, pers. comm.). These peaks appear to be artificial, so they need to be identified carefully and removed.
    • As GSOD data were reported from civil aviation installations, values of zero rain are often misreported (P. Jones, pers. comm.). As a result, one cannot really tell whether it rained or not, and useful information is blurred by errors.
  • If it turns out that your location is in neither GSOD nor GHCN (Global Historical Climatology Network), you can interpolate daily or monthly data using different techniques, amongst which thin plate splines seem to be the best method. Authors have done so in the recent literature to analyse maize yield variation in Africa and wheat senescence in India. However, the number of weather stations in the global and other systems has been decreasing over time (maintaining weather stations is expensive; see Figure 2), and this considerably affects the quality of the interpolated data. Our own analyses have shown that interpolation skill is considerably low when the number of weather stations is limited and, at least for Africa, the availability of weather stations at daily time scales is highly limited.

Figure 2 Number of weather stations with rainfall data in South Asia (black line) and in East and West Africa (red line). Grey line corresponds to the number of unique 1-degree cells with data over South Asia, whereas orange line corresponds to unique 1-degree cells with data in East and West Africa. CIAT unpublished data. Note the drop in South Asian data in 1971 that corresponds to the time when India decided to not share weather station data anymore.
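
The two GSOD caveats mentioned above (artificial rainfall peaks and unreliable zeros) can at least be screened for before the data go into a crop model. What follows is only a minimal sketch in Python, not a method taken from GSOD documentation or from P. Jones: the function name `flag_rainfall` and all three thresholds (`peak_factor`, `min_peak_mm`, `max_zero_run`) are assumptions made for illustration, and they would need tuning for each region and season. The series is assumed to be daily rainfall in mm, with `None` marking a missing day.

```python
from statistics import median


def flag_rainfall(series, peak_factor=10.0, min_peak_mm=100.0, max_zero_run=60):
    """Return (index, reason) flags for a daily rainfall series in mm.

    All thresholds are illustrative assumptions, not published values.
    Flags mark days for manual review; nothing is deleted here.
    """
    flags = []

    # Typical wet-day amount, used as the yardstick for "odd" peaks.
    wet = [v for v in series if v is not None and v > 0]
    typical = median(wet) if wet else 0.0

    # 1. Peaks that are both large in absolute terms and far above
    #    the typical wet-day amount.
    for i, v in enumerate(series):
        if v is not None and v >= min_peak_mm and v > peak_factor * typical:
            flags.append((i, "suspect_peak"))

    # 2. Long unbroken runs of reported zeros, which may be genuine dry
    #    spells or misreported values. The trailing None closes any run
    #    that reaches the end of the series.
    run_start = None
    for i, v in enumerate(list(series) + [None]):
        if v == 0:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start > max_zero_run:
                flags.append((run_start, "long_zero_run"))
            run_start = None

    return flags
```

A long run of zeros is of course normal in a dry season, so a flag here only means “look at this period”, ideally against a neighbouring station or a satellite product.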


  • If you have an incomplete weather series, you could attempt to reconstruct the whole series using a weather generator. Weather generators are tools designed to model the “stochasticity” of rainfall, typically using algorithms known as Markov chains. However, beware of the following:
    • In order to model tropical rainfall a third-order Markov chain is needed. All existing algorithms that can be fitted to an individual station are based on first-order chains, and the only algorithm based on third-order chains cannot be fitted to individual weather stations.
    • If you are generating weather data for more than one point you need to take into account the spatial correlation of rainfall events.
    • There is a relationship between rainfall and other variables (e.g. temperatures and solar radiation), hence if additional fields are being generated, they need to be generated in a way that is consistent with the simulated rainfall.
  • If you have no data whatsoever and cannot access or process the GSOD or GHCN datasets, you could use MarkSim, which needs only the location of your site(s). However, MarkSim was fitted using only 9,162 weather stations, and analysis has shown that even ~50,000 might not be enough at the global level.
  • If you don’t care much about spatial resolution (but do about temporal) and require data from years >1995, you could use NASA-POWER or GPCP, but these are restricted to 1 degree spatial resolution. You could also use TRMM for years > 1998, at a higher spatial resolution (~28 km), but TRMM tends to overestimate actual rainfall (although the spatial distribution of rainfall is fairly good).
  • General Circulation Models (GCMs) are currently the best way to model the complex processes that occur at the earth system’s level. However, as GCMs are highly complex and computationally expensive, they have only been used for predictions at coarse spatial scales, which leads to a lack of skill and to uncertainties. These errors and uncertainties are much stronger at high temporal resolutions, making GCM data of limited practical use for crop modelling.
  • Regional Climate Models (RCMs) are the best way of downscaling GCMs, but the skill of their predictions is highly dependent on the driving GCM and on the skill of the RCM itself. Some of our unpublished studies indicate that ETA data are useful for crop modelling purposes (at least for potato and bean, the crops we tested).
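
To make the weather generator option above more concrete, here is a minimal sketch in Python of the rainfall occurrence part of such a generator: a two-state (wet/dry) Markov chain of configurable order. It is not MarkSim's algorithm, and the function names `fit_occurrence_chain` and `simulate_occurrence`, as well as the `fallback` probability used for histories never seen in the observations, are assumptions made for illustration. It deliberately ignores rainfall amounts, the spatial correlation between sites, and the link with temperature and radiation, all of which a real generator has to handle.

```python
import random
from collections import defaultdict


def fit_occurrence_chain(wet_days, order=1):
    """Estimate P(wet today | wet/dry states of the previous `order` days).

    `wet_days` is a sequence of booleans (True = rain was recorded).
    Returns a dict mapping each observed history (a tuple) to a probability.
    """
    if order < 1:
        raise ValueError("order must be at least 1")
    counts = defaultdict(lambda: [0, 0])  # history -> [dry count, wet count]
    for i in range(order, len(wet_days)):
        history = tuple(wet_days[i - order:i])
        counts[history][1 if wet_days[i] else 0] += 1
    return {h: c[1] / (c[0] + c[1]) for h, c in counts.items()}


def simulate_occurrence(probs, start, n_days, fallback=0.5, seed=None):
    """Simulate `n_days` of wet/dry states, continuing from `start`.

    `start` is a tuple as long as the chain order. Histories that were
    never observed use `fallback` (an arbitrary assumption).
    """
    rng = random.Random(seed)
    order = len(start)
    days = list(start)
    for _ in range(n_days):
        history = tuple(days[-order:])
        p_wet = probs.get(history, fallback)
        days.append(rng.random() < p_wet)
    return days[order:]


if __name__ == "__main__":
    observed = [True, True, False, False, True, True, False, False]
    probs = fit_occurrence_chain(observed, order=1)
    print(simulate_occurrence(probs, start=(False,), n_days=10, seed=1))
```

Raising `order` to 3 gives the third-order chain mentioned above, but the number of possible histories grows to eight, so a short or gappy station record will leave many of them with few or no observations. That is one practical reason why fitting higher-order chains to individual stations is difficult.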

Therefore, crop modellers are challenged to gather the available weather data and verify its usefulness for their studies, to understand the broad concepts of climate modelling uncertainties, and to detect the sensitivities of crop models to errors in weather data. They also need a basic understanding of earth processes in order to identify major flaws in climate models and decide the best ways to couple them with crop models for climate change studies.

Further reading

Ramirez-Villegas, J. and Challinor, A., 2012. Assessing relevant climate data for agricultural applications. Agricultural and Forest Meteorology, 161: 26-45. http://dx.doi.org/10.1016/j.agrformet.2012.03.015
