Derivation of watershed boundaries for GRDC gauging stations based on the HydroSHEDS drainage network

by Bernhard Lehner, McGill University, Montreal, Canada. The full text is published as Report 41 in the GRDC Report Series.

At the beginning of the project (October 2010), the BfG provided the most recent database of GRDC stations, containing 7532 records for which watershed outlines were to be derived. Of these, 47 stations had to be excluded because no point coordinates were available. For all other stations, the provided geographic locations (x- and y-coordinates) were considered to be of mixed quality, with various uncertainties and likely errors. For this reason, a two-fold strategy was designed to link the gauging stations to the HydroSHEDS river network. First, an automated process was applied: all stations were linked to the HydroSHEDS river network within a defined radius around the stations while attempting to optimize the agreement between the reported watershed area in the GRDC database and the modeled watershed area derived from HydroSHEDS. Second, if no acceptable location could be detected within the applied search radius, the station was manually inspected. The following detailed steps were performed:

Automatic procedures for station allocation

  • For each station, an individual search radius of 5 km was defined.
  • Within this search radius, the watershed area was calculated for every pixel of the HydroSHEDS gridded river network.
  • The modeled watershed areas (HydroSHEDS) were then compared to the reported watershed areas of the corresponding stations as provided in the GRDC database.
  • All pixels with area differences of more than 50% (positive or negative) were excluded from further steps. All other pixels were coded with the absolute value of their area difference (in %); i.e., a pixel with plus or minus 10% error received the value "10", etc.
  • This procedure provided a ranking scheme according to area discrepancies (RA) with values between 0 and 50, where 0 indicates perfect agreement in watershed area.
  • Next, for every pixel the distance to the original location of the station was calculated (i.e., the distance from the center of the search radius). The distance values were normalized to reach 50 at the maximum distance of 5 km; i.e., a pixel at a distance of 1 km received a value of "10", etc.
  • This procedure provided a ranking according to distance (RD) with values between 0 and 50, where 0 indicates perfect agreement in station location.
  • Both the area and distance rankings were then combined additively to derive a total ranking (R), whereby distance was weighted double (see the note below): R = RA + 2 × RD
  • This procedure provided a combined ranking with values between 0 and 150, where 0 indicates perfect agreement in both area and distance, and a higher value indicates increasing discrepancies.
  • Finally, from all possible pixels that corresponded to a station, the one showing the lowest ranking value was chosen.
  • Note: The distance ranking (RD) was weighted double so that pixels further away would quickly increase in their ranking values and thus become less likely to be chosen. More precisely, a pixel that is 1 km further away (2 × 10 ranking points) will only be chosen if the area agreement improves by more than 20 percentage points. These settings were applied after several tests showed that many stations with high-precision coordinates still differed in watershed area by 5-10%; hence, this magnitude of area disagreement should not immediately trigger a large movement of the station.
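
The ranking scheme above can be illustrated with a minimal Python sketch. It is not the code used in the project: the function names, the candidate tuples, and the choice to express the area difference relative to the reported GRDC area are assumptions made for illustration only.

```python
SEARCH_RADIUS_KM = 5.0      # search radius from the text
MAX_AREA_DIFF_PCT = 50.0    # pixels beyond this area difference are excluded


def rank_pixel(reported_area_km2, modeled_area_km2, distance_km):
    """Return R = RA + 2 * RD for one pixel, or None if the pixel is excluded."""
    if distance_km > SEARCH_RADIUS_KM:
        return None
    # RA: absolute area difference in percent (assumed relative to the reported area)
    ra = abs(modeled_area_km2 - reported_area_km2) / reported_area_km2 * 100.0
    if ra > MAX_AREA_DIFF_PCT:
        return None
    # RD: distance normalized so that the full search radius scores 50
    rd = distance_km / SEARCH_RADIUS_KM * 50.0
    return ra + 2.0 * rd


def best_pixel(reported_area_km2, candidates):
    """candidates: iterable of (pixel_id, modeled_area_km2, distance_km).

    Returns the id of the pixel with the lowest ranking value, or None if
    every pixel was excluded (the station then goes to manual inspection).
    """
    ranked = []
    for pixel_id, area, dist in candidates:
        r = rank_pixel(reported_area_km2, area, dist)
        if r is not None:
            ranked.append((r, pixel_id))
    if not ranked:
        return None
    return min(ranked, key=lambda item: item[0])[1]


# Hypothetical candidates for a station with a reported area of 1000 km2
candidates = [
    ("a", 1100.0, 1.0),  # 10% area difference, 1 km away
    ("b", 1000.0, 2.0),  # perfect area match, but 1 km further away
    ("c", 1600.0, 0.1),  # 60% area difference: excluded
]
chosen = best_pixel(1000.0, candidates)
```

In this made-up example, pixel "a" is preferred over pixel "b": moving 1 km further away costs 20 ranking points, while the area agreement improves by only 10 percentage points.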

Manual procedures for station allocation

  • All stations for which no pixel with an area difference of less than 50% existed within the 5 km search radius were manually inspected. This also included 230 stations that had no reported area in the GRDC database.
  • First, the stations were visualized on Google Maps, and an attempt was made to verify the river and station names (typically the name of the nearest settlement) in close vicinity to the given location (~10 km).
  • If a station could not be verified within this vicinity, the search was extended along the longitude and latitude lines of the given coordinates (for ~50-100 km). This strategy was applied because in many cases the location was incorrect due to an error in either the longitude or the latitude coordinate, but not both. Typical errors included: simple typos in one digit (e.g. 11.58°N instead of 12.58°N); logical errors in the original coordinates (e.g. -20.4°W instead of -19.6°W for a location that is 0.4° to the right of -20°W); or a swapped order of the coordinate digits (e.g. 10.35°N instead of 10.53°N).
  • If still no location was found that matched the river and/or station name, the station name was queried in Google Maps to see whether a location with this name existed anywhere within an acceptable distance.
  • In all cases, the final decision on whether a station was moved to a new and “reliable” location depended on whether at least two out of the following four indicators could be matched reasonably well: a) river name; b) station name; c) watershed area (match between reported GRDC value and modeled HydroSHEDS value); and d) long-term annual discharge (match between reported GRDC value and modeled HydroSHEDS value). This decision was necessarily subjective, and difficult combinations could arise (e.g. multiple agreements alongside one or more disagreements among the indicators). If a station was moved, a quality indicator and a comment on the decision were added to the record.
  • Typically, the agreement in watershed area had the highest priority for the final decision on whether to move a station. In some cases, however, e.g. if river and station names could be clearly verified and the discharge values also matched, it was concluded that the reported GRDC area was possibly erroneous, and the station was moved to the new location despite the area discrepancy (see comments in Table 2).
  • In some cases, the GRDC stations were at the correct location but the HydroSHEDS river network could not represent the situation correctly. These cases included artificial canals, braided rivers, or stations within river deltas (see comments in Table 2).
  • For areas north of 60 degrees latitude, the reliability of the results is generally limited due to the low quality of the HydroSHEDS river network. These records should be interpreted with care, even if a high quality is assigned due to well-matching areas.
  • Similarly, results for very small catchments (<10-50 km²) are not very reliable, even if the areas match well within a short distance, as small watersheds are found within close proximity to any location (even incorrect locations).
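
The "at least two out of four indicators" rule from the manual procedure can be written down as a small Python sketch. The class and function names are hypothetical, and the sketch deliberately omits the subjective weighing described above (for example, the priority usually given to watershed area); it shows only the counting rule.

```python
from dataclasses import dataclass


@dataclass
class IndicatorMatches:
    """Whether each of the four indicators matched reasonably well (judged manually)."""
    river_name: bool
    station_name: bool
    watershed_area: bool
    annual_discharge: bool


def meets_relocation_rule(matches, minimum=2):
    """Return True if at least `minimum` of the four indicators match."""
    matched = sum([
        matches.river_name,
        matches.station_name,
        matches.watershed_area,
        matches.annual_discharge,
    ])
    return matched >= minimum


# Hypothetical station: river name and discharge agree, station name and area do not
example = IndicatorMatches(
    river_name=True, station_name=False, watershed_area=False, annual_discharge=True
)
decision = meets_relocation_rule(example)
```

A station that passes this check would still need the quality indicator and comment mentioned above, since the rule alone does not capture how convincing each individual match was.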

Calculation of watershed polygons and delivery of results

The watersheds for all re-allocated stations were derived based on the HydroSHEDS drainage network using standard GIS tools and procedures. Basin outlines were produced in two versions: with gridded edges (i.e. exactly following the HydroSHEDS raster cells), and with smoothed edges. The resulting polygons (one for each station) were attributed with the corresponding GRDC station records. Both the re-allocated GRDC stations (points) and corresponding watersheds (polygons) were delivered in ESRI shapefile format.
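
The text states only that standard GIS tools were used, so the following Python sketch is an illustration of the general idea rather than the procedure actually applied. It assumes a flow direction grid in the ESRI D8 encoding (1 = east, 2 = southeast, 4 = south, 8 = southwest, 16 = west, 32 = northwest, 64 = north, 128 = northeast) with row indices increasing southward, and collects every cell that drains to a chosen outlet cell. The function name and the toy grid are hypothetical.

```python
from collections import deque

# Assumed ESRI D8 codes mapped to (row offset, column offset) of the downstream cell
D8_OFFSETS = {
    1: (0, 1), 2: (1, 1), 4: (1, 0), 8: (1, -1),
    16: (0, -1), 32: (-1, -1), 64: (-1, 0), 128: (-1, 1),
}


def delineate_watershed(flow_dir, outlet):
    """Return the set of (row, col) cells that drain to `outlet`, including it.

    flow_dir is a list of rows of D8 codes; codes not in D8_OFFSETS
    (e.g. 0 for an outlet or sink) are treated as having no downstream cell.
    """
    rows, cols = len(flow_dir), len(flow_dir[0])
    basin = {outlet}
    queue = deque([outlet])
    while queue:
        r, c = queue.popleft()
        for dr, dc in D8_OFFSETS.values():
            # A neighbour at (r - dr, c - dc) flows into (r, c) if its direction is (dr, dc)
            nr, nc = r - dr, c - dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in basin:
                if D8_OFFSETS.get(flow_dir[nr][nc]) == (dr, dc):
                    basin.add((nr, nc))
                    queue.append((nr, nc))
    return basin


# Toy 3 x 3 grid: the top two rows drain to the centre cell, which drains south
flow_dir = [
    [2, 4, 8],
    [1, 4, 16],
    [1, 0, 16],
]
basin = delineate_watershed(flow_dir, (1, 1))
```

The resulting cell set corresponds to the "gridded edges" version of a basin outline; converting it to a polygon and smoothing the edges would be separate GIS steps not shown here.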


GRDC is happy to have engaged Bernhard Lehner of McGill University, Montreal, Canada, to create the watershed boundaries for more than 7500 GRDC stations represented in the Global Runoff Database. The generated ESRI shapefiles will be provided under the conditions of the GRDC Data Policy, which restricts GRDC data and data products to non-commercial use and requires citation of GRDC as the source. The shapefiles will be updated from time to time as developments of the Global Runoff Database require extensions.

© 2017 BfG