Journal of Hydroinformatics Sample Issue

© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

Assessment of GeoEye-1 stereo-pair-generated DEM in flood mapping of an ungauged basin

I. K. Tsanis, K. D. Seiradakis, I. N. Daliakopoulos, M. G. Grillakis and A. G. Koutroulis

I. K. Tsanis (corresponding author), K. D. Seiradakis, I. N. Daliakopoulos, M. G. Grillakis, A. G. Koutroulis: Department of Environmental Engineering, Technical University of Crete, Chania, Greece. E-mail: tsanis@hydromech.gr
I. K. Tsanis: Department of Civil Engineering, McMaster University, Hamilton, Canada

ABSTRACT

A very high resolution (VHR) digital elevation model (DEM) is produced from a GeoEye-1 0.5-m-resolution satellite stereo pair and is used for floodplain management and mapping applications such as watershed delineation and river cross-section extraction. For this purpose, a 2 m × 2 m resolution terrain surface is produced from the stereo pair by using the Leica Photogrammetry Suite (LPS) enhanced Automatic Terrain Extraction (eATE) algorithm. DEM accuracy is assessed by comparison with measured individual ground control points (GCPs), stream cross-sections and other landscape features. Results show that the produced DEM is in good agreement with ground truth and superior to products of lower resolution, such as 90 m NASA Shuttle Radar Topography Mission (SRTM) and 1:5,000 topographical maps. One- and two-dimensional hydraulic models are used to simulate rainfall–runoff characteristics and flood wave kinematics of the flash flood event of 17 October 2006 that occurred in the ungauged basin of Almirida, using the 2 m VHR-DEM as an input. Results show that the hydraulic simulation based on the generated VHR-DEM, calibrated and validated via field data, produces an accurate extent and water level of the flooded area. Remote sensing stereo reconstruction is a promising alternative to traditional survey methods in flood mapping applications.

Key words | digital elevation model, flash flood, flood mapping, satellite stereo pair, very high resolution

INTRODUCTION

Floods are among the world’s most costly disasters with the estimated cost of flood damage in Europe increasing significantly in the past decades (Re ; Barredo ). In 2002 alone, Europe suffered over 10 billion Euros in damages and dozens of people were killed (Toothill ). Flash floods constitute a great challenge in civil protection as they represent a great destructive force. Within minutes to a few hours from the causative storm event, flash flood water levels can reach their peak, leaving insufficient warning time to prevent human casualties (Borga et al. ; Collier ). They occur both in areas with no flooding history and areas with such frequent floods that flooding is considered a local climate component (Llasat et al. ). The flooding potential of a hydrological basin is mainly subject to increasing human activity such as urbanization that reduces infiltration, leading to the increase of surface runoff, the shortening of the flood’s travel time and an increase in the peak flow. Urbanization can affect the capacity of a stream directly, when infrastructure such as bridges is constructed within a stream and encroaches on the floodplain, or indirectly, by causing stream channel enlargement as a response to the change in stream flow regime accompanying urbanization (Hammer ; Gregory et al. ; Konrad ). Especially in the case of ungauged basins, flash floods also pose a great challenge to science as heterogeneities and the lack of observational data enhance uncertainties in providing quantitative assessments (Sivapalan ).

doi: 10.2166/hydro.2013.197


Flood hazard and flood risk maps are essential in the mitigation of the disastrous effects of floods because they provide a proactive tool for discouraging urbanization in flood-prone areas. These products constitute an essential part of the Flood Directive 2007/60/EC (EP & CEU ) in flood risk management plans. Flood hazard maps are based on the combination of three variables: flood extent, land use and flood return period. As well as land use, urbanization can also cause changes to the floodplain geometric characteristics, thus altering the flood extent for a given flood. Theory and practice have clearly demonstrated the importance of the accurate representation of morphological channel characteristics. Hydraulic relationships between channel morphology and runoff were first explored by Leopold & Maddock (). In flood inundation modeling, digital elevation models (DEMs) greatly affect the model outputs as they have a direct influence on the total drainage length and slope (Dutta & Herath ). Low-resolution effects such as under-sampling can cause poor terrain representation, altering water pathways and flow characteristics. This problem is very pronounced when simulating hydrological processes at a timescale shorter than that of the surface water process. At this timescale, the linkage of GIS and hydrological models becomes difficult because simulation of channel flow depends heavily on the structure of channel network.

Figure 1 | Relative cost and accuracy of DEM generation technologies (Richards 2007).

Remote sensing techniques have provided indispensable solutions for generating DEMs for environmental surveying and planning applications (Mongus & Žalik ), especially for large area coverages. Older low-resolution DEM products (30–100 m) are adequate for numerous environmental applications (Nikolakopoulos et al. ), but provide poor terrain detail, especially in lowlands with minor slopes that are nevertheless prone to floods. DEMs derived from light detection and ranging (LiDAR) generally provide more accurate height information and have sufficient resolution, e.g. sub-meter grids of ±0.1 m vertical accuracy (Lane et al. ; Sanders ). While airborne laser scanning ‘is here to stay’ (Baltsavias ), it is still costly for large applications (Figure 1) and there are a few inherent shortcomings of the LiDAR technology, e.g. lack of correspondence to objects, no redundancy in the measurements, strong dependency on material features, missing visual coverage (Toth & Grejner-Brzezinska ). Furthermore, some regions in LiDAR data have null values due to self-occlusion of buildings (Lee et al. ) or the presence of water bodies (Awrangjeb et al. ). Synthetic aperture radar interferometry (InSAR) is also a highly effective tool for extracting DEMs, with vertical accuracy that can range from c. 1 m to c. 10 m (Sanders ) depending on altitude of observation (Figure 1). InSAR can penetrate cloud cover with negligible attenuation, but suffers in highly vegetated areas and effects such as shadowing and layover limit its applicability to flat and moderately rough terrains (Eineder ). Figure 1 shows a rough cost-effect analysis of various DEM generation technologies. Recently available very high resolution (VHR) satellite stereo-pair products promise to deliver affordable sub-meter DEM accuracy that can potentially lead to the much-needed flood hazard and risk maps necessary for flood damage analysis (FDA) studies (Boyle et al. ; Koutroulis & Tsanis ).

GeoEye-1, the imaging system with highest resolution available commercially, can collect samples with a ground resolution of 0.41 m in the panchromatic or black and white mode and multispectral or color imagery at 1.65 m resolution. While the satellite collects imagery at 0.41 m, GeoEye’s operation license from the US government requires resampling the imagery to 0.5 m for all customers. The intended use of the GeoEye GeoStereo product is to


obtain an accurate DEM generation for 3D viewing and feature extraction applications. An automatic DEM generation from VHR satellite image stereo pairs offers new challenges for developing techniques for the automatic interpretation of image structures (Krauß et al. ). The exploration of the efficiency of the data sources in combination with a specific application and advances in available processing methods are still in demand (Croitoru et al. ).

Due to computational simplicity and the relative ease of parameterization and calibration, 1D hydraulic models are today a staple of flood modeling (Marks & Bates ), providing sufficient accuracy in short computational times even when coupled with detailed topographic data. However, when dealing with highly complex and varying hydraulic parameters such as flood inundation prediction and flash floods, DEMs can be used to parameterize a 2D hydraulic model (Sanders ) in order to offer a better representation of the flow field, at the cost of data, model and computation simplicity (Gogoase et al. ). In this study, the topography extracted from a GeoEye-1 stereo-pair-generated DEM is investigated for its accuracy compared to other widely used DEM products and conventional survey topographical techniques. The produced high-resolution DEM is then tested for its efficiency in 1D and 2D hydraulic simulation and flood mapping of a flash flood event that occurred in the Almirida area on the island of Crete in 2006, causing damage to property and the loss of a human life.

METHODOLOGY

DEM extraction

The first requirement for a DEM from satellite imagery is a satellite stereo-pair product accompanied by a sensor model that describes the geometric relationship between the 3D object space Ob(X, Y, Z) and 2D image space Im(r, c). The sensor model can be made available in two forms: the rigorous physical sensor model and the generalized or abstract sensor model. The physical sensor model includes all of the internal and external (i.e. position and orientation information) sensor model information associated with a specific satellite sensor as it exists when the imagery is being captured. The abstract model is commonly defined by 79 rational polynomial coefficients (RPCs) approximating the specific sensor model information to map geodetic ground points to the imaging system’s pixel coordinates. Ground control points (GCPs) play a very important role in satellite image adjustment. They are actually essential when no physical sensor model or RPC model is available, as shown in Figure 2. Depending on the type of solution, a terrain-dependent or independent approach can be followed, with the final step of either approach always being an RPC (Figure 2). In the case where one of them is available, as in the case of GeoEye-1 imagery, GCPs are used in refinement of the RPC solution.

Figure 2 | Schematic of stereo-pair processing depending on sensor model availability.

The advantage of rational functions is that they are sensor independent, which means that the user does not need to know all of the specific internal and external


camera information. For the ground-to-image transformation, the defined ratios of polynomials have the forward form presented in Equations (1) and (2):

$$ r = \frac{(1\;\; Z\;\; Y\;\; X\;\; \cdots\;\; Y^3\;\; X^3)\,(a_0\;\; a_1\;\; \cdots\;\; a_{19})^{T}}{(1\;\; Z\;\; Y\;\; X\;\; \cdots\;\; Y^3\;\; X^3)\,(1\;\; b_1\;\; \cdots\;\; b_{19})^{T}} \tag{1} $$

$$ c = \frac{(1\;\; Z\;\; Y\;\; X\;\; \cdots\;\; Y^3\;\; X^3)\,(c_0\;\; c_1\;\; \cdots\;\; c_{19})^{T}}{(1\;\; Z\;\; Y\;\; X\;\; \cdots\;\; Y^3\;\; X^3)\,(1\;\; d_1\;\; \cdots\;\; d_{19})^{T}} \tag{2} $$

where r, c are image space coordinates, X, Y, Z are ground coordinates and a, b, c, d are the respective RPCs (OGC ) provided by the satellite product vendor. The inverse process allows the user to perform photogrammetric tasks such as orthorectification and stereo reconstruction and requires mutual information matching between the stereo-pair members. Essentially, a pixel set {ImL(r1, c1), ImR(r2, c2)} depicting the same object has to be specified in order to solve Equations (1) and (2) iteratively towards the object’s real world coordinates (Tao & Hu ; Hu et al. ). Given that the distance between pixels ImL(r1, c1) and ImR(r2, c2), also called disparity, is independent of pixel intensity, one band from each member of the stereo pair is usually enough for DEM extraction. While this computation is straightforward and converges fast (Lin & Yuan ), an automatic pixel-wise matching of large stereo-pair products becomes more challenging as local and later global optimization need to be applied in order to ensure robustness. The Leica Photogrammetry Suite (LPS) eATE (enhanced Automatic Terrain Extraction) algorithm is an advanced tool for extracting high-density terrain surfaces from stereo imagery based on a pixel-wise matching of mutual information. It approximates a global 2D smoothness constraint by combining many 1D constraints (Hirschmuller ), otherwise known as semi-global matching (Previtali et al. ). The performance of the eATE algorithm is based on user-defined regional strategies and parameters that control and guide the terrain processing.

Products of this process can vary from 2D maps, stereo-based 3D reconstruction and single-image-based 3D point extraction to orthorectification. A DEM is a 3D representation of a terrain’s surface and can either represent bare ground surface (digital terrain model, DTM) or include above-ground structures and vegetation (digital surface model, DSM). Depending on the problem in hand, surface features are deemed either required (e.g. Priestnall et al. ) or redundant (e.g. Gomes Pereira & Janssen ). A DEM is commonly interpolated from a set of vertices in a 3D coordinate system on a Cartesian grid where it is possible to draw and estimate individual terrain profiles or cross-sections along a stream or river. These profiles and cross-sections can then be compared with field measurements to check for DEM reliability. This step is essential if accurate cross-sections are to be provided for the next step of hydraulic simulation. Following the model set-up with field parameters, a measured or estimated flow can be applied in order to create flood inundation lines to define flood boundaries for the specified flood. A schematic description of the DEM validation process is presented in Figure 3.

Figure 3 | Schematic of the cross-section extraction process for a successful hydraulic simulation.

DEM quality

The quality of each DEM product against measured points can be estimated using the Root Mean Square Error (RMSE) of elevations defined in Equation (3):

$$ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{z}_i - z_i)^2}{n}} \tag{3} $$

where $z_i$ is the elevation of the measured point i considered as ground truth, $\hat{z}_i$ is the elevation of point i on each DEM product and n is the number of the measurements. The statistics of the relative error of elevation estimation $(\hat{z}_i - z_i)/\hat{z}_i$ can also provide valuable information about the distribution of error. Total watershed area, the upslope area that contributes flow to a common outlet at the lowest point along the boundary of the watershed, can be used as a means of comparison among DEM products. Another indicator of DEM quality is the average Euclidean distance of the estimated flow path from the actual stream centerline. Flow path P(xi, yi) can be extracted using the method of deriving accumulated flow from a DEM presented in Jenson & Domingue (), while the stream centerline S(xj, yj) can be measured in the field or estimated otherwise. Given that individual points in P and S are not necessarily equidistant or the same in number (i ≠ j), the average distance d to the flow can be approximated using Equation (4):

$$ d(P, S) = \frac{\sum_{i=1}^{n} \min_{j=1}^{m} \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}}{n} \tag{4} $$

or otherwise, the average Euclidean distance of each point of P with the closest point of S.

Hydraulic modeling

The Hydrologic Engineering Center, River Analysis System (HEC-RAS), which was developed by the US Army Corps of Engineers, has been applied extensively in calculating the hydraulic characteristics of rivers (Pappenberger et al. ; Carson ). HEC-RAS is designed to perform 1D hydraulic calculations for a full network of natural and constructed channels (Brunner ) and has an unsteady flow component that is capable of simulating 1D unsteady flow through a full network of open channels. The physical laws which govern the unsteady flow of water in a stream are the principle of conservation of mass (continuity), which implies that the mass of a closed system (in the sense of a control volume) will remain constant over time, and the principle of conservation of momentum, which states that the total momentum of a closed system (in this case a control volume) is constant. The unsteady flow equation solver, developed primarily for subcritical flow regime calculations, was adapted from the UNET model (Barkau ; Brunner ).

A 2D hydraulic model was also set up with the help of the CCHE-2D finite difference code developed at the National Center for Computational Hydroscience and Engineering (NCCHE), University of Mississippi. CCHE-2D is an unsteady, turbulent flow model with non-uniform sediment and conservative pollutant transport capabilities. An efficient element scheme of Wang & Hu () is incorporated to numerically solve the 2D depth-averaged shallow water equations. The capability of CCHE-2D to simulate subcritical and supercritical free surface floods has been verified and validated using analytic methods and many sets of physical model data and field data (Jia & Wang ; Jia et al. ). The governing equations of CCHE-2D used for simulating the flow field are the continuity equation and the momentum equations in x and y directions as shown by Nassar (). Details about the model and its components are given by Zhang (, ).
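As a rough illustration of Equations (1)–(4), the following Python sketch evaluates the forward RPC ratios and the two DEM quality measures. It is not the LPS eATE implementation and does not reproduce any vendor file format: the function names, the ordering of the 20 cubic monomials and the sample numbers are assumptions made here for illustration, and the ground coordinates are assumed to be already normalised as the vendor model requires.

```python
import math
from itertools import product

# Exponents (i, j, k) of the 20 monomials X^i Y^j Z^k with i + j + k <= 3.
# ASSUMPTION: this ordering is illustrative; a real RPC file fixes its own order.
EXPONENTS = sorted(
    (e for e in product(range(4), repeat=3) if sum(e) <= 3),
    key=lambda e: (sum(e), e),
)


def monomials(X, Y, Z):
    """Return the 20 cubic monomial terms for one ground point."""
    return [X**i * Y**j * Z**k for i, j, k in EXPONENTS]


def rpc_ground_to_image(X, Y, Z, num_r, den_r, num_c, den_c):
    """Equations (1)-(2): image row r and column c as ratios of cubic polynomials.

    Each coefficient list has 20 entries; the first entry of den_r and den_c
    (the constant term) is expected to be 1.
    """
    terms = monomials(X, Y, Z)

    def poly(coeffs):
        return sum(k * t for k, t in zip(coeffs, terms))

    return poly(num_r) / poly(den_r), poly(num_c) / poly(den_c)


def rmse(dem_z, true_z):
    """Equation (3): root mean square error of DEM elevations against ground truth."""
    n = len(true_z)
    return math.sqrt(sum((zh - z) ** 2 for zh, z in zip(dem_z, true_z)) / n)


def relative_errors(dem_z, true_z):
    """Relative error (z_hat - z) / z_hat for each point, as defined in the text."""
    return [(zh - z) / zh for zh, z in zip(dem_z, true_z)]


def mean_nearest_distance(P, S):
    """Equation (4): mean distance from each flow-path point in P to its closest point in S."""
    return sum(
        min(math.hypot(px - sx, py - sy) for sx, sy in S) for px, py in P
    ) / len(P)


if __name__ == "__main__":
    # Hypothetical elevations (m) and planar coordinates (m), for demonstration only.
    print(rmse([10.0, 12.0], [9.0, 13.0]))
    print(mean_nearest_distance([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0)]))
```

The nearest-point search in `mean_nearest_distance` is a brute-force loop over all pairs, which is adequate for stream centerlines of a few thousand points; a spatial index would be the usual choice for larger sets.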


CASE STUDY

Almirida, Crete

Crete has a typical Mediterranean island environment with about 53% of the annual precipitation occurring in the winter, 23% during autumn and 20% during spring while there is negligible rainfall during summer (Naoum & Tsanis ; Koutroulis & Tsanis ). The average precipitation ranges from 440 mm a⁻¹ in the east to more than 2,000 mm a⁻¹ at the uplands of western Crete, where orographic effects tend to increase both frequency and intensity of winter precipitation (Naoum & Tsanis ; Roe ; Koutroulis & Tsanis ). These characteristics together with the rugged topography and small-size basins make Crete a rather flood-prone region (Koutroulis et al. ). The hydrological basin of the coastal village of Almirida (Figure 4) covers an area of 25 km² and receives an average annual precipitation of 648 mm (based on a 32-year record). The topography consists of mild slopes in the major part of the watershed and a few areas with slopes over 10%. The watershed exhibits a moderate orography, with a maximum elevation of 527 m and a mean watershed elevation of 206 m. The area is cultivated and covered mainly by olive trees and natural vegetation. Following the CORINE 2000 Land Cover maps (EEA-ETC/TE ), the watershed is mainly covered by scrub and natural grassland (42.3%), olive groves (36.6%) and agricultural land interrupted by wide areas of natural vegetation (20.6%), with as little as 0.6% urban fabric.

Figure 4 | Almirida basin in the island of Crete, Greece and GCP location.

Flash flood and post-event field survey

On 16 October 2006, a frontal depression that was already located over the central Mediterranean moved eastwards and crossed over the island of Crete by midday. The low pressure with center 1,010 hPa over Malta moved rapidly eastwards and deepened, so that at 00:00 UTC 17 October (Figure 5(a)), it was centered just southwest of Crete with center lower than 1,008 hPa. High-resolution MeteoSat imagery (Figure 5(b)) shows the frontal depression moving northeast towards the island of Crete. The meteorological formation developed an estimated speed of approximately 65 km h⁻¹ that classifies it as a potential Mediterranean tropical storm, a rather rare phenomenon (Lagouvardos et al. ).


Figure 5 | (a) Mean sea-level pressure at 0000 UTC for 17/10/2006 (isobars in hPa). (b) High-resolution METEOSAT satellite image showing the frontal depression moving northeast towards the island of Crete.

The total measured rainfall depth for the event was 200 mm with the major volume precipitating during the course of 7 hours (07:00 to 14:00 local time). The peak rainfall took place at noon and a joint estimation using terrestrial and radar measurements showed that it was about 23.0 mm h⁻¹ and lasted for about 30 min (Daliakopoulos & Tsanis ), enhancing surface flow and generating an estimated peak discharge of 225 m³ s⁻¹. The discharge was estimated with the help of floodmarks and 1D hydraulic modeling of the basin (Gaume et al. ) at a location shown in Figure 6. The flood resulted in more than 3 million euros in damage and the death of a local resident. During the days after the flood, topographic measurements and photographic material were collected from field surveys and local resident testimonies. At the outlet of the basin the flood marks of the maximum stage were still clearly visible, reaching as high as 2 m at the downstream control cross-section. Due to the fact that the basin is ungauged and the stream of Almirida is ephemeral, no other flow events have been recorded.

Terrain data availability

Selection of the most cost-effective high-resolution stereo product for the case study followed a simple ranking of costs for each available technological option (Table 1), where the GeoEye-1 GeoStereo stereo pair proved to be the most reasonably priced solution for the resolution provided. While SPOT 2.5 m panchromatic and IKONOS 0.8 m were less expensive, GeoEye provides a competitive value at the best-possible resolution for satellite products. Products such as LiDAR that can potentially yield superior results are rendered uneconomical by the size of the basin and the poor availability of airborne sensors in the proximity (Table 1). The GeoEye-1 GeoStereo stereo pair used in the present study was acquired on 13 August 2009 over the wider area of Almirida watershed. The product is characterized as panchromatic–multispectral, has 0.5 m pixel size and the two members were collected at angles 79.55334° and 62.05786°. Samples from the three visible bands composite members of a small area of Almirida watershed stereo-pair images are shown in Figure 7.

For the purposes of this study, GCPs were considered as the ground truth against which elevation models can be compared. Ninety high-quality GCPs were collected within the study area, at open areas with bare terrain, using a pair of differential GPS (DGPS) Leica GS20 Professional Data Mappers. DGPS measurements were corrected offline using the L1 pseudo-range in combination with station TUC2 from the Reference Frame Sub-Commission for Europe (EUREF) Permanent Network (EPN), located within the Technical University of Crete campus. Ten of


the GCPs were used as control points during the DEM extraction algorithm while the remaining 80 points in the set were used as check points for DEM accuracy validation.

Figure 6 | Almirida basin with ground truth profile measurement locations and profiles at the lower stream (detail on the right).

Table 1 | Minimum order area, unit cost and final cost for an area equal to that of Almirida basin (25 km²) for various high-resolution stereo products

Product | Minimum order area (km²) | Unit cost per km² | Cost for 25 km² | Cost in $ (c) for 25 km²
SPOT satellite 2.5 m b&w (a) | 60 | 1.5 € | 5,400 € | 3,780
IKONOS 0.8 m Geostereo (a) | 100 | $45 | $4,500 | 4,500
GeoEye-1 0.5 Geostereo (a) | 100 | $50 | $5,000 | 5,000
SPOT satellite 2.5 m color (a) | 60 | 2.25 € | 8,100 € | 5,670
Quickbird, Worldview1/2 2 m (a) | 210 | $40 | $8,400 | 8,400
Airborne stereo SAR (b) | 200 | 100 € | 20 000 € | 14 000
Airborne IFSAR (b) | 200 | 100 € | 20 000 € | 14 000
Aerial photography (b) | 200 | 100 € | 20 000 € | 14 000
LiDAR (b) | 200 | 100 € | 20 000 € | 14 000

(a) e-GEOS Price List for 2012.
(b) B. Charalampopoulou, personal communication (2013).
(c) Average exchange rate for August 2009 1$ c. 0.7€.

Additional GCPs were collected using a total station, an electronic theodolite (transit) integrated with an electronic distance meter to read coordinates from the instrument to a particular point. Total station measurements are taken only with respect to the instrument; they therefore have to be referenced later using GPS measurements. A total of 650 validation points were collected using this method. From those, 500 points comprise a set of narrow (<50 m) but dense stream cross-sections near the outlet of the Almirida watershed (Figure 6, inset). An additional 150 points comprise a set of wide (>80 m) profiles that include cross-sections (S1, S2) and various accessible slopes (S3, S4, S5) within the watershed (Figure 6). The aim of this comparison is mainly to reveal the DEM efficiency in capturing the geometry of a random point, stream cross-section or profile and to describe the ground geometry for hydraulic applications. Even though various DEMs with different pixel size were extracted for comparison, the resolution selected for further analysis has a 2 m pixel size. Three reference DEMs were also

used: (1) 10 m × 10 m DEM produced from aerial photos; (2) 30 m × 30 m DEM digitized from 1:5,000 Hellenic Military Geographic Service (HMGS) topographical maps; and (3) NASA SRTM90 90 m × 90 m product (Jarvis et al. ).

Figure 7 | Members of a sample GeoEye-1 stereo pair from Almirida basin.

RESULTS

The selection of GCPs on stereo-pair images is usually a subjective process that needs caution, as the extracted DEM quality is highly dependent on them. During the course of the study, it was determined that the quality of the initial RPC correction is such that the sensitivity of the overall quality to the number of check points is relatively small. In order to document this lack of sensitivity, DEMs of the same resolution (2 m × 2 m) were produced using a varying number of control points randomly selected from the set of 10 GCPs reserved for control. Figure 8 shows that the RMSE of check points against GPS measurements for correction remains relatively steady around 0.9 m. It is therefore inferred that the margin for further corrections is narrow and the number and location of GCPs have limited impacts on the precision of object point determination. This finding is in agreement with previous studies using RPC


block adjustment (e.g. Fraser et al. ). In this sense, the incorporation of the RPC successfully replaces the sensor model and reduces the need for extensive GCP collection.

Figure 8 | RMSE of check points against GPS measurements using random subsets of 4–8 from a total of 10 control points for use in RPC refinement.

The quality of each available DEM was initially evaluated by calculating the RMSE of the elevation of control, check and validation points against the respective elevation estimated at each DEM at the same location (Table 2). The results show that the stereo-pair-produced DEM scored the smallest RMSE (0.79 m for control points, 0.90 m for check points and 1.06 m for validation points). The RMSE was found to be lower than that obtained by aerial photos (1.88, 1.30 and 1.14 m, respectively) and significantly lower than the SRTM90 and HMGS30 products (Table 2). It is interesting to note that control and check points were collected using only GPS measurements, making them potentially less likely to be affected by the inherent systematic error of the 650 validation points that were collected by transit and referenced by GPS. Consequently, the GeoEye-1 stereo-pair-derived DEM yields a meter or sub-meter resolution which makes it a good candidate for a wide range of applications.

Table 2 | Statistics of different DEM products, including RMSE value for: (a) 10 control points used for the DEM generation; (b) 80 check points measured with GPS; and (c) 500 validation points measured using GPS and transit

Statistic | Stereo pair 2 m × 2 m | Aerial photos 10 m × 10 m | HMGS 30 m × 30 m | SRTM 90 m × 90 m
RMSE (m), control points | 0.79 | 1.88 | 4.28 | 31.63
RMSE (m), check points | 0.90 | 1.30 | 7.00 | 6.92
RMSE (m), validation points | 1.06 | 1.14 | 7.74 | 5.18
Average relative error in validation set (%) | 3.2 | 0.3 | 118.5 | 73.9
Standard deviation of relative error in validation set (%) | 17.5 | 19.7 | 47.2 | 56.8
Watershed area (km²) | 25.814 | 25.893 | 24.962 | 26.133
Average distance d from stream center (m) | 2.31 | 4.89 | 31.63 | 43.19

In terms of total delineated watershed area, the four compared DEMs delivered different results using the standard D-8 algorithm (Tribe ) found in ArcGIS. The stereo-pair-generated DEM calculated an area of 25.814 km², while the aerial-photo-generated DEM (10 m × 10 m) estimated a similar area of 25.893 km² (0.3% larger). The SRTM90 and HMGS30 DEMs estimated a watershed area of 26.133 and 24.962 km², respectively, which correspond to a 1.2% larger and a 3.3% smaller area compared to the stereo-pair-generated DEM (Table 2). While there is no objective method to estimate the true watershed area at a given resolution, results show that discrepancies are not significant. Furthermore, taking advantage of the superior absolute horizontal accuracy of GeoEye-1 (4 m CE90, horizontal, without GCP, for the stereo product) over the relatively lower absolute vertical accuracy (6 m LE90, vertical, without GCP), the main stream channel of the watershed was manually digitized and used to estimate its average distance d from extracted DEMs. The two lower-resolution DEMs (HMGS30 and SRTM90) fail to capture the river position (Figure 9). Analysis of the average distance between the manually digitized and DEM-produced stream centerline was conducted for


up to a distance of 1.5 km from the outlet of the stream. The low-resolution effect of older DEM products can lead to poor terrain representation, altering water pathways during hydraulic modeling by over 30 m on average (Table 2). On the other hand, the stereo-pair- and the aerial-photo-generated DEMs show sufficient agreement to the manually digitized stream path, having an average distance of 2.31 and 4.89 m, respectively (Table 2).

Figure 9 | Comparison between stream definitions acquired by manual digitization on the orthorectified GeoEye-1 image (white path) and flow accumulation estimation on each DEM (dashed line).

Measured profiles (Figure 6) were compared against profiles extracted from each DEM product. The results at each location are shown in Figure 10. For illustration purposes, the SRTM90 DEM was excluded from Figure 10 due to its distance from the other DEM products. The results show that the stereo-pair-generated DEM delivered profiles that were very close to those acquired by total station (Table 2, Figure 11). The comparison between the different DEMs reveals that the lower-resolution DEMs (HMGS30, SRTM90) fail to describe the profile geometry; the low resolution acts as a smoothing filter of the field. At the same time, even the finer-resolution aerial-photos-generated


Figure 10 | Comparison of DEM extraction results with reference DEMs and profile validation points considered as ground truth: (a)–(e) profiles S1–S5, respectively.

DEM (while improved) failed to capture the depth of the stream as the channel width can often measure less than 10 m. In contrast, the stereo-pair-generated DEM describes width and depth of both cross-sections well in the cases S1 and S2 (Figure 10). In profiles S3, S4 and S5, the stereo-pair-generated DEMs exhibit very good agreement compared to the total station measurements, capturing both the changes of the ground slope and the ground elevation in detail. The aerial-photo-generated DEM is also close to the ground truth (Table 2, Figure 11). In some parts of the profiles, it however over- or underestimates the ground elevation while exhibiting poor fit in small and


Figure 11 | Relative error of validation points against elevation for all DEM products.

abrupt changes of the slope, due to its medium spatial resolution.

The relative error of 500 validation points for all DEM products is plotted against elevation in order to distinguish possible trends in the data (Figure 11). While measurements only range between 1.7 and 11.0 m, Figure 11 presents a clear distinction between fine- and coarse-resolution DEMs. In this context, the 2 and 10 m DEMs show virtually no error trend against elevation change with average relative error being 3.2% (±17.5%) and 0.3% (±19.7%), respectively.
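The mean and spread of relative error quoted above can be computed from paired elevations with a few lines of code. The following is a minimal sketch, not the authors' procedure: it assumes that relative error is defined as 100 × (z_DEM − z_ref) / z_ref, which the text does not state explicitly, and the function name `relative_error_stats` and the sample elevations are hypothetical. A least-squares slope of relative error against reference elevation is used here as a simple indicator of an error trend.

```python
from statistics import mean, stdev


def relative_error_stats(z_ref, z_dem):
    """Return (mean %, sample std %, trend slope in % per m).

    z_ref: surveyed (reference) elevations in metres, all non-zero.
    z_dem: DEM elevations sampled at the same locations.
    Assumed definition: rel_err = 100 * (z_dem - z_ref) / z_ref.
    """
    rel_err = [100.0 * (d - r) / r for r, d in zip(z_ref, z_dem)]
    mean_ref = mean(z_ref)
    mean_err = mean(rel_err)
    # Ordinary least-squares slope of relative error against elevation.
    sxy = sum((r - mean_ref) * (e - mean_err) for r, e in zip(z_ref, rel_err))
    sxx = sum((r - mean_ref) ** 2 for r in z_ref)
    return mean_err, stdev(rel_err), sxy / sxx


if __name__ == "__main__":
    # Hypothetical validation points, for illustration only.
    z_ref = [2.0, 4.0, 5.0, 8.0, 10.0]
    z_dem = [2.2, 3.8, 5.0, 8.4, 9.5]
    avg, sd, slope = relative_error_stats(z_ref, z_dem)
    print(f"mean = {avg:.1f}%, std = {sd:.1f}%, trend = {slope:.2f} %/m")
```

A slope close to zero would correspond to the absence of an error trend against elevation, while a clearly negative slope would correspond to a decreasing trend.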



On the other hand, coarser DEMs yield average relative errors over 70% with a decreasing trend (Table 2). Comparing between HMGS (30 m × 30 m) and SRTM (90 m × 90 m), the former overestimates elevations for all measurement ranges whereas the latter seems to converge to accurate estimations after 10 m. Furthermore, relative error variance is higher in small elevations, an observation that becomes more evident in coarser-resolution DEMs.

In the current case study of the Almirida flash flood, the produced 2-m-resolution DEM delivered the cross-section data that were used in the hydraulic simulation in HEC-RAS. Sixteen cross-sections were used to simulate approximately 730 m of the lower river affected by the flash flood. The width of the cross-sections varied from 107 to 294 m, depending on the recorded flood extent on each cross-section. The simulation time step was set at 10 s. Based on the stream characteristics, the Manning friction coefficient was considered uniform and equal to 0.04 for the entire river bed. Respectively, for the floodplain the Manning coefficient was set at 0.08. Figure 12(a) shows the representation of the flood extent at peak flow. This qualitative result is based on the observation that part of the flood wave follows streets and flows around buildings, clearly visible in the satellite image, showing successful modeling and DEM quality. CCHE-2D was set up using a total of 85,100 nodes and calibrated using lower Manning coefficients (0.02 for channel and 0.05 for the floodplain). Figures 12(c) and 12(d) show the model results for 2 m × 2 m and 10 m × 10 m DEMs, respectively. For the purposes of this study, which is the extraction of accurate flood maps, the 1D and 2D models have produced visually similar results (Figure 12). In Figures 12(c) and 12(d), cross-sections and locations of hydrograph are shown for reference with Figure 7.

In a comparative study, the superiority of the high-resolution DEM over lower-resolution products such as the 10 m DEM is evident (Figure 12(b)). As well as the poor representation of stage, there are often areas where a low-resolution map is not accurate in terms of flood extent. The comparison of relative thalweg and water surface among DEMs of 2, 10 and 30 m resolution is shown in Figure 13(a). While differences in the hydraulic simulations between 2 and 10 m DEMs are not so pronounced, the lower-resolution DEMs estimate higher thalweg elevations, but underestimate the relative depth of water at the time of peak flow (Figure 13(b)).

CONCLUSIONS

This research covers a new approach to more accurate and cost-effective DEM production for use in 1D and 2D hydraulic modeling. As such, it is part of a broader European effort to document storm and flood processes, improve modeling results and provide valuable tools for flood managers in Europe as well as other countries. While it has to be acknowledged that full awareness of the complex processes governing individual ungauged basins is not possible (Sivapalan ), uncertainty can be compensated for by using innovative methodologies in some aspects of modeling. Such multi-discipline research and modeling applied to a large geographical area is critically needed in many countries, where documenting basic flood processes has been lacking for the past few decades. While analytical methods and models have been improving, there is no concerted national effort for basic data collection and documentation.

Accurate flood extent is a key feature for developing detailed flood hazard maps, especially in flood-prone areas with substantial development. This work shows that traditional topographic surveying effort can be significantly reduced when a VHR DEM is available. The extraction of a 2 m × 2 m VHR DEM from a satellite stereo pair and its use in hydraulic modeling is presented. The DEM is extracted from a 0.5 m GeoEye-1 Geostereo, a state-of-the-art VHR satellite image, using the LPS eATE algorithm and the image’s RPC model and GCPs. The resolution of the final product is one of the highest among commercially and freely available DEMs and scores the smallest RMSE among all reference DEMs used for comparison (Table 2). In the case of the 2006 flash flood in Almirida, apart from upstream damage the overbank water flow extended over 150 m west of the stream channel and flooded the urban area (Figure 12(a)). Increasing but largely unplanned tourism development during the last two decades has resulted in uncontrolled urbanization inside the ephemeral stream delta as well as upstream, a fact that played a significant (if not the dominant) role in the evolution of the flash


Figure 12 | Flood extent using HEC-RAS based on the (a) 2 m DEM and (b) 10 m DEM; flood extent using CCHE-2D based on the (c) 2 m DEM and (d) 10 m DEM.


Figure 13 | (a) Comparison of thalweg and water surface from three different DEMs and (b) stage estimates for each case.

flood and the incurred damages. It is evident that proactive flood protection measures have to be taken in order to avoid similar future situations that pose great risk to life and property. The European Flood Directive presented in 2007 asked EC member states to prepare flood risk maps by 2013. Provided that flood frequency information is available, the high-resolution elevation information presented in this study could support cost-effective flood risk mapping and could be a part of the work done to meet the directive.

The limitation of the presented methodology is that the final VHR DEM product includes elevation information about vegetation and structures over the actual terrain, which needs to be removed manually. While some of this information is desirable for flood modeling (e.g. building height), it is often considered redundant in a DEM extraction that aims to depict bare terrain. Multispectral satellite imagery, such as GeoEye-1 products, can capture structures and vegetation by taking advantage of their respective spectral reflectance curves, which are characteristic for various material and texture classes (Daliakopoulos et al. ). The incorporation of this land use classification information included in multispectral satellite imagery can greatly enhance the identification and extraction of such features and assist in their supervised or unsupervised removal. Further research could also include the comparison of DEMs extracted from competitive commercial products such as WorldView-2 stereo-pair imagery.

ACKNOWLEDGEMENTS

Through its establishment ESRIN, the European Space Agency (ESA) supported this work through the ‘High Resolution Satellite Imagery for Floodplain Mapping (SImFlood)’ Project, contract no. 22306/09/J-LG. Post-event surveys were supported by the European Community funded project, HYDRATE, Sixth Framework Programme, contract no. 037024. The authors would also like to thank the anonymous reviewers for their valuable comments and suggestions which improved the quality of the paper.

REFERENCES

Awrangjeb, M., Ravanbakhsh, M. & Fraser, C. S.  Automatic detection of residential buildings using LIDAR data and



multispectral imagery. ISPRS Journal of Photogrammetry and Remote Sensing 65 (5), 457–467. Baltsavias, E. P.  A comparison between photogrammetry and laser scanning. ISPRS Journal of Photogrammetry and Remote Sensing 54 (2), 83–94. Barkau, R. L.  UNET, One-dimensional Unsteady Flow Through a Full Network of Open Channels: User’s Manual. US Army COE, Hydrologic Engineering Center, Davis, CA. Barredo, J.  Normalised flood losses in Europe: 1970–2006. Natural Hazards and Earth System Sciences 9 (1), 97–104. Borga, M., Boscolo, P., Zanon, F. & Sangati, M.  Hydrometeorological analysis of the 29 August 2003 flash flood in the Eastern Italian Alps. Journal of Hydrometeorology 8 (5), 1049–1067. Boyle, S., Tsanis, I. & Kanaroglou, P.  Developing geographic information systems for land use impact assessment in flooding conditions. Journal of Water Resources Planning and Management 124 (2), 89–98. Brunner, G. W.  HEC-RAS River Analysis System. Hydraulic Reference Manual. Version 1.0., DTIC Document. Hydrologic Engineering Center, Davis, CA. Carson, E. C.  Hydrologic modeling of flood conveyance and impacts of historic overbank sedimentation on West Fork Black’s Fork, Uinta Mountains, northeastern Utah, USA. Geomorphology 75 (3), 368–383. Collier, C.  Flash flood forecasting: What are the limits of predictability? Quarterly Journal of the Royal Meteorological Society 133 (622), 3–23. Croitoru, A., Hu, Y., Tao, V., Xu, Z., Wang, F. & Lenson, P.  Single and stereo based 3d metrology from high-resolution imagery: methodologies and accuracies. International Archives of Photogrammetry and Remote Sensing 35, 1022–1027. Daliakopoulos, I. N. & Tsanis, I. K.  A weather radar data processing module for storm analysis. Journal of Hydroinformatics 14 (2), 332–344. Daliakopoulos, I. N., Grillakis, E. G., Koutroulis, A. G. & Tsanis, I. K.  Tree crown detection on multispectral VHR satellite imagery. 
Photogrammetric Engineering and Remote Sensing 75 (10), 1201. Dutta, D. & Herath, S.  Effect of DEM accuracy in flood inundation simulation using distributed hydrological models. Monthly Journal of Institute of Industrial Science, University of Tokyo 53 (11), 602–605. EEA-ETC/TE  CORINE land cover update. I& CLC2000 project. Available at http://terrestrial.eionet.eu.int. Eineder, M.  Alpine digital elevation models from radar interferometry: A generic approach to exploit multiple imaging geometries. Photogrammetrie Fernerkundung Geoinformation 2005 (6), 477. EP & CEU  Directive on the assessment and management of flood risks (2007/60/EC). Official Journal of the European Union L288/27–L288/34. Fraser, C., Dial, G. & Grodecki, J.  Sensor orientation via RPCs. ISPRS Journal of Photogrammetry and Remote Sensing 60 (3), 182–194.


Gaume, E., Bain, V., Bernardara, P., Newinger, O., Barbuc, M., Bateman, A., Blaškovičová, L., Blöschl, G., Borga, M., Dumitrescu, A., Daliakopoulos, I., Garcia, J., Irimescu, A., Kohnova, S., Koutroulis, A., Marchi, L., Matreata, S., Medina, V., Preciso, E., Sempere-Torres, D., Stancalie, G., Szolgay, J., Tsanis, I., Velasco, D. & Viglione, A.  A compilation of data on European flash floods. Journal of Hydrology 367 (1), 70–78. Gogoase, D. E. N., Armaş, I. & Ionescu, C. S.  Inundation maps for extreme flood events at the mouth of the Danube River. International Journal of Geosciences 2, 68–74. Gomes Pereira, L. M. & Janssen, L. L. F.  Suitability of laser data for DTM generation: a case study in the context of road planning and design. ISPRS Journal of Photogrammetry and Remote Sensing 54 (4), 244–253. Gregory, K., Davis, R. & Downs, P.  Identification of river channel change to due to urbanization. Applied Geography 12 (4), 299–318. Hammer, T. R.  Stream channel enlargement due to urbanization. Water Resources Research 8 (6), 1530–1540. Hirschmuller, H.  Stereo processing by semiglobal matching and mutual information. IEEE Transactions on Pattern Analysis and Machine Intelligence 30 (2), 328–341. Hu, Y., Tao, V. & Croitoru, A.  Understanding the rational function model: methods and applications. International Archives of Photogrammetry and Remote Sensing 20, 6. Jarvis, A., Reuter, H., Nelson, A. & Guevara, E.  Hole-filled SRTM for the globe Version 4. Available from the CGIARCSI SRTM 90 m database at http://srtm.csi.cgiar.org. Jenson, S. & Domingue, J.  Extracting topographic structure from digital elevation data for geographic information system analysis. Photogrammetric Engineering and Remote Sensing 54 (11), 1593–1600. Jia, Y. & Wang, S. S. Y.  Numerical model for channel flow and morphological change studies. Journal of Hydraulic Engineering 125 (9), 924–933. Jia, Y., Sam, S. Y. W. & Xu, Y. 
 Validation and application of a 2D model to channels with complex geometry. International Journal of Computational Engineering Science 3 (01), 57–71. Konrad, C. P.  Effects of Urban Development on Floods. US Geological Survey Fact Sheet 076-03, Tacoma, WA. Koutroulis, A. G. & Tsanis, I. K.  A method for estimating flash flood peak discharge in a poorly gauged basin: Case study for the 13–14 January 1994 flood, Giofiros basin, Crete, Greece. Journal of Hydrology 385 (1), 150–164. Koutroulis, A. G., Tsanis, I. K. & Daliakopoulos, I. N.  Seasonality of floods and their hydrometeorologic characteristics in the island of Crete. Journal of Hydrology 394 (1), 90–100. Krauß, T., Reinartz, P., Lehner, M., Schroeder, M. & Stilla, U.  DEM generation from very high resolution stereo data in urban areas. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 36, 1682–1777. Lagouvardos, K., Kotroni, V., Nickovic, S. & Kallos, G.  Evidence of a winter tropical storm over eastern



Mediterranean: Simulations with the regional atmospheric modelling system (RAMS) and the ETA/NMC model. In: Proceedings of the 7th International Conference on Mesoscale Processes, 9–13 September, Reading, UK. Lane, S. N., James, T. D., Pritchard, H. & Saunders, M.  Photogrammetric and laser altimetric reconstruction of water levels for extreme flood event analysis. The Photogrammetric Record 18 (104), 293–307. Lee, D. H., Lee, K. M. & Lee, S. U.  Fusion of lidar and imagery for reliable building extraction. Photogrammetric Engineering and Remote Sensing 74 (2), 215. Leopold, L. & Maddock Jr, T.  The hydraulic geometry of stream channels and some physiographic implications. US Geological Survey, Washington, DC, Professional Paper 252, p. 57. Lin, X. & Yuan, X.  Improvement of the stability solving rational polynomial coefficients. In: International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Beijing 37, 711–716. Llasat, M. C., Llasat-Botija, M., Prat, M. A., Porcú, F., Price, C., Mugnai, A., Lagouvardos, K., Kotroni, V., Katsanos, D., Michaelides, S., Yair, Y., Savvidou, K. & Nicolaides, K.  High-impact floods and flash floods in Mediterranean countries: the FLASH preliminary database. Advances in Geosciences 23, 47–55. Marks, K. & Bates, P.  Integration of high-resolution topographic data with floodplain flow models. Hydrological Processes 14 (11), 2109–2122. Mongus, D. & Žalik, B.  Parameter-free ground filtering of LiDAR data for automatic DTM generation. ISPRS Journal of Photogrammetry and Remote Sensing 67, 1–12. Naoum, S. & Tsanis, I.  Orographic precipitation modeling with multiple linear regression. Journal of Hydrologic Engineering 9 (2), 79–102. Nassar, M.  Multi-parametric sensitivity analysis of CCHE2D for channel flow simulations in Nile River. Journal of Hydroenvironment Research 5 (3), 187–195. Nikolakopoulos, K. G., Kamaratakis, E. K. & Chrysoulakis, N. 
 SRTM vs ASTER elevation products. Comparison for two regions in Crete, Greece. International Journal of Remote Sensing 27 (21), 4819–4838. OGC  OpenGIS Simple Feature Specification For SQL Version 1.1. Open GIS project document 99-049. Available at http://www.opengeospatial.org/specs. Pappenberger, F., Beven, K., Horritt, M. & Blazkova, S.  Uncertainty in the calibration of effective roughness parameters in HEC-RAS using inundation and downstream level observations. Journal of Hydrology 302 (1), 46–69.


Priestnall, G., Jaafar, J. & Duncan, A.  Extracting urban features from LiDAR digital surface models. Computers, Environment and Urban Systems 24 (2), 65–78. Previtali, M., Barazzetti, L. & Scaioni, M.  Multi-step and multi-photo matching for accurate 3D reconstruction. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 38, 103–108. Re, M.  Annual Review: Natural Catastrophes 2004, Knowledge series. Munich Re Group, Re, Germany. Richards, M. A.  A Beginner’s Guide to Interferometric SAR Concepts and Signal Processing (AESS Tutorial IV). IEEE Aerospace and Electronic Systems Magazine 22 (9), 5–29. Roe, G. H.  Orographic precipitation. Annual Review of Earth and Planetary Sciences 33, 645–671. Sanders, B. F.  Evaluation of on-line DEMs for flood inundation modeling. Advances in Water Resources 30 (8), 1831–1843. Sivapalan, M.  Prediction in ungauged basins: A grand challenge for theoretical hydrology. Hydrological Processes 17, 3163–3170. Tao, C. V. & Hu, Y.  A comprehensive study of the rational function model for photogrammetric processing. Photogrammetric Engineering & Remote Sensing 67 (12), 1347–1357. Toothill, J.  Central European Flooding August 2002. An EQECAT Technical Report, ABS Consulting. Available at http://www.absconsulting.com/resources/Catastrophe_ Reports/flood_rept.pdf. Toth, C. & Grejner-Brzezinska, D.  Complementarity of LIDAR and stereo-imagery for enhanced surface extraction, geoinformation for all. In: Proceedings of XIXth ISPRS Congress, 16–23 July, Amsterdam, pp. 897–904. Tribe, A.  Automated recognition of valley lines and drainage networks from grid digital elevation models: a review and a new method. Journal of Hydrology 139 (1), 263–293. Wang, S. S. Y. & Hu, K.  Improved methodology for formulating finite element hydrodynamic models. In: Finite Element in Fluids (T. J. Chung, ed.). Hemisphere Publishing, Washington, vol. 8, pp. 457–478. Zhang, Y. 
 CCHE2D-GUI–Graphical User Interface for the CCHE2D Model User’s Manual–Version 2.2. Available at: http://www.ncche.olemiss.edu/sites/default/files/files/docs/ cche2d/CCHE2D_2.2_User's_Manual.pdf. Zhang, Y.  CCHE-GUI–graphical users interface for NCCHE model user’s manual–version 3.0. National Center for Computational Hydroscience and Engineering. Technical Report No. NCCHE-TR-2006-2.

First received 2 November 2012; accepted in revised form 23 April 2013. Available online 25 June 2013


© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

Prediction of flow resistance in a compound open channel
Mrutyunjaya Sahu, S. S. Mahapatra, K. C. Biswal and K. K. Khatua

ABSTRACT Flooding in a river is a complex phenomenon which affects the livelihood and economic condition of the region. During flooding, flow overtops the river course and spreads around the flood plain, resulting in a two-course compound channel. It has been observed that the flow velocity in the flood plain is slower than that in the actual river course. This can produce a large shear layer between sections of the flow and produce turbulent structures which generate extra resistance and uncertainty in flow prediction. Researchers have adopted various numerical, analytical, and empirical models to analyze this situation. Generally, a one-dimensional empirical model is used for flow prediction assuming that the flow in the compound open channel is uniform. However, flow in a compound channel is quasi-uniform due to the transfer of momentum in sub-sections and sudden change of depths laterally. Hence, it is essential to analyze the turbulent structures prevalent in the situation. Therefore, in this study, an effort has been made to analyze the turbulent structure involved in flooding using large eddy simulation (LES) method to estimate the resistance. Further, a combination of an artificial neural network (ANN) and a fuzzy logic (FL) is considered to predict flow resistance in a compound open channel.

Key words | adaptive neuro-fuzzy inference system (ANFIS), compound open channel, computational fluid dynamics, correlation, momentum transfer

Mrutyunjaya Sahu, K. C. Biswal, K. K. Khatua
Department of Civil Engineering, National Institute of Technology, Rourkela, Odisha, India

S. S. Mahapatra (corresponding author)
Department of Mechanical Engineering, National Institute of Technology, Rourkela, Odisha, India
E-mail: mahapatrass2003@yahoo.com

INTRODUCTION

Resistance factors such as drag, boundary shear stress, and channel roughness play an important role in predicting conveyance capacity, bank protection, sediment transport, etc. Thus, Einstein & Banks () and Krishnamurthy & Christensen () developed models for estimating a composite friction factor to study resistance to the flow in a compound open channel. Wormleaton et al. () reported through extensive experimentation that the Manning’s equation and the Darcy–Weisbach equation are not suitable for predicting discharge of compound channels. Dracos & Hardegger () proposed a model to predict a composite friction factor in compound open channel flow by taking momentum transfer into account, and they also noted that a composite friction factor depends on the main channel and flood plain widths and the ratio of the hydraulic radius to the depth in the main channel. Pang () reported that the distribution of discharge between the main channel and flood plain is in accordance with the flow energy loss, which can be expressed in the form of a flow resistance coefficient. Christodoulou & Myers () quantified the apparent shear on the vertical interface between main channel and flood plain in symmetrical compound sections. Yang et al. () indicated that the Darcy–Weisbach resistance factor is not suitable for predicting a composite friction factor for measuring the resistance to flow. The environmental condition and the impact of thermodynamic, physical, and hydraulic parameters exhibit strong non-linear relationships, leading to an inaccurate prediction of a composite friction factor in a compound open channel using conventional methods.

Rapid growth in artificial intelligence techniques not only reduces the tedious effort of experimentation but also eliminates cumbersome computations. Walid & Shyam () considered a back propagation (BP) algorithm of an

doi: 10.2166/hydro.2013.077



artificial neural network (ANN) for prediction of discharge in compound open channel flow. Notable past studies in this direction are a neuro-fuzzy model to simulate the Colebrook–White equation for the prediction of a friction factor in smooth open channel flow (Bigil & Altun ; Yuhong & Wenxin ) and the prediction of a friction factor in pipe flow problems (Fadare & Ofidhe ). Esen et al. () demonstrated the use of adaptive neuro-fuzzy inference system (ANFIS) for modeling of a ground-coupled heat pump system. Riahi-Madvar et al. () proposed a model based on ANFIS to predict longitudinal dispersion coefficient in natural streams. ANFIS has been adopted in a variety of fields for accurate prediction of responses in situations where input parameters characterize impreciseness and uncertainty. When the relationship between input and output parameters is difficult to establish using mathematical, analytical, and numerical methods and computation becomes cumbersome and time-consuming, an easily implementable technique like ANFIS can be adopted. Thus, an ANFIS model has been proposed in this study to predict a composite friction factor in compound open channel flow.

Despite clear successes in the experimental approach, it still suffers from limitations, such as: (i) data are collected at a limited number of points, (ii) the model is usually not at full-scale, and (iii) detailed measurements of turbulence have not usually been considered. A computational approach can partly overcome some of these issues and provide a complementary tool. In particular, a computational approach is readily repeatable, can simulate at full-scale and provides a spatially dense field of data points. However, there are significant technical challenges in terms of the prediction of turbulence. In recent years, numerical modeling of open channel flows has successfully reproduced experimental results. Computational fluid dynamics (CFD) has been used to model open channel flows ranging from main channels to full-scale modeling of flood plains. Simulations have been performed by Krishnappan & Lau (), Kawahara & Tamai () and Cokljat (). CFD has also been used to model flow features in natural rivers by Sinha et al. (), Lane et al. (), and Morvan (). Thomas & Williams (a, b, ) and Shi et al. () have undertaken refined numerical modeling to examine the detailed time-dependent three-dimensional nature of flow to provide dense fields of data points. Further, they have adopted a large eddy simulation (LES) method to investigate over-bank channel flow. LES has been utilized to model both in-bank and over-bank flow conditions to investigate the detailed structure of secondary circulations. Salvetti et al. () conducted an LES simulation at a relatively large Reynolds number for producing results for bed shear with the magnitude of secondary motions and vorticity comparable to experiments. Pan & Banerjee (), Hodges & Street (), and Nakayama & Yokojima () studied free surface fluctuations in open channel flow by employing the LES method where the free surface has been filtered along with the flow field itself, which introduced extra sub-grid stress (SGS) terms. Beaman () studied the estimation of conveyance using the LES method.

In this study, the inadequacy in prediction of a composite friction factor assuming turbulent flow and the momentum transfer between the main channel and flood plain is addressed using an adaptive neuro-fuzzy system. Further, keeping in view the wide application of LES in open channel flow, an effort has been made to analyze turbulent flow in a compound open channel.

EXPERIMENTAL DATA USED FOR ANALYSIS

The methods considered to predict the composite friction factor in a compound open channel are compared with the experimental data of FCF Series A (the experimental data for a straight compound open channel at the University of Birmingham) (Tominaga & Nezu ; Soong & DePue ; Tang & Knight a, b; Atabay et al. ). The hydraulic conditions of the data are shown in Table 1.

PREDICTION OF COMPOSITE FRICTION FACTOR BY ANFIS

Prediction of composite friction factor by empirical models

A compound channel basically consists of a main channel with flood plains. The primary factors affecting the


Table 1 | Summary of geometrical factors of experimental data

Source of data                 Main side slope   Flood plain type   Roughness type   Main channel cross-sectional geometry
FCF Series A
  Series 1                     1.0               Symmetric          Smooth           Trapezoidal
  Series 2                     1.0               Symmetric          Smooth           Trapezoidal
  Series 3                     1.0               Symmetric          Smooth           Trapezoidal
  Series 6                     1.0               Asymmetric         Smooth           Trapezoidal
  Series 8                     0                 Symmetric          Smooth           Rectangular
  Series 10                    2.0               Symmetric          Smooth           Trapezoidal
Tominaga & Nezu () S (1–3)     0                 Asymmetric         Smooth           Rectangular
Tang & Knight (a)
  ROA                          0                 Symmetric          Smooth           Rectangular
  ROS                          0                 Symmetric          Smooth           Rectangular
Tang & Knight (b)
  LOSR                         0                 Symmetric          Rough            Rectangular
  ALL                          0                 Symmetric          Rough            Rectangular
Atabay et al. ()
  ROA                          0                 Asymmetric         Smooth           Rectangular
  ROS                          0                 Symmetric          Smooth           Rectangular
Soong & Depue () 1                               Asymmetric         Rough            Trapezoidal

resistance coefficient in a compound open channel are geometric parameters (depth of main channel), h and the wall roughness resistant coefficient, K = ks/R where ks and R are the roughness height and the hydraulic radius, respectively. It should be noted that the wall roughness changes along the wetted perimeter of the cross-section in a compound channel. The composite roughness on the wall as well as the shape of the channel affects the turbulent flow structures and the secondary current across the cross-section and hence, alters the resistance coefficient. Manning’s equation is generally used for the prediction of discharge in compound open channels. The friction factor is in the form of either Manning’s coefficient, Chezy’s coefficient, or the Darcy–Weisbach coefficient, usually considered as a ‘true composite friction factor’ (Yang et al. ). In open channel flow, the flow resistance coefficient of the boundary expressed by Manning’s coefficient (n), Chezy’s coefficient (C), and Darcy–Weisbach (f) are related as shown in Equation (1):

$$\frac{C}{\sqrt{g}} = \frac{R^{1/6}}{\sqrt{g}} \times \frac{1}{n} = \sqrt{\frac{8}{f}} \qquad (1)$$

These factors are evaluated to predict the bed shear stress and discharge for both simple and compound open channel flows. Traditionally, the composite roughness in a compound channel is expressed in Manning’s form ‘n’ as in Equation (2). The composite friction factor $n_c$ across the perimeter can be evaluated as:

$$n_c = \int w_i \, n_i \, dp \qquad (2)$$

where $n_i$ = sub-sectional Manning’s roughness and $w_i$ = weighted function of sub-sections. Using this formulation the calculation of open channel flow is reduced to a 1D formulation.

A number of empirical formulations have been proposed by investigators to predict a composite Manning’s friction factor in compound open channel flow with different assumptions based on the relationships between the discharges, velocities, forces, and shear stresses of the component sub-sections and the total cross-section. These formulations are listed in Table 2 for the estimation of a composite Manning’s friction factor. Further, different methods have been also adopted to divide the components sub-sections of the compound channels to apply these models to estimate a composite Manning’s friction factor and the discharge in a compound open channel.

In this study, methods proposed by Cox (), Einstein & Banks (), Lotter (), Krishnamurthy & Christensen (), and Dracos & Hardegger () have been adopted to predict a composite friction factor. Among these methods, only Dracos & Hardegger () take momentum transfer into account. However, the model proposed by Hin et al. () can account for momentum transfer but the method is based on field observation. Further, the data collected have to be calibrated to account for the shape factor parameter to calculate the apparent friction factor. Since this factor is not generally available, the model is excluded from this analysis. Figures 1–3 show the relationship between true composite friction factor obtained from Manning’s


Table 2 | Models for prediction of composite Manning’s friction factor

| Reference | Model notation | Composite friction factor (nc) | Concept |
| Cox () | COX | $n_c = \sqrt{\dfrac{\sum n_i^2 A_i}{A}}$ | Total resistance force is equal to sum of sub-area resistance forces, or ni weighted by $\sqrt{A_i}$ |
| | | $n_c = \dfrac{A}{\sum (A_i / n_i)}$ | Total discharge is sum of sub-area discharges |
| Einstein & Banks () | EBM | $n_c = \left[\dfrac{\sum n_i^{3/2} P_i}{P}\right]^{2/3}$ | Total cross-sectional mean velocity equal to sub-area mean velocity |
| Lotter () | LM | $n_c = \dfrac{P R^{5/3}}{\sum \left(P_i R_i^{5/3} / n_i\right)}$ | Total discharge is sum of sub-area discharges |
| Krishnamurthy & Christensen () | KCM | $n_c = \exp\left[\dfrac{\sum P_i h_i^{3/2} \ln n_i}{\sum P_i h_i^{3/2}}\right]$ | Logarithmic velocity distribution over depth h for wide channel |
| Dracos & Hardegger () | D&H | $\dfrac{n}{n_e} = f\left(\alpha, \dfrac{R}{H}\right)$ | The main channel and flood plain width ratio, and the ratio of the total hydraulic radius to the flow depth in the main channel |

Where Pi = sub-sectional perimeter of compound channel, ni = sub-sectional Manning’s ‘n’, Ri = sub-sectional hydraulic radius, Ai = sub-sectional area of compound section, and hi = sub-sectional depth of flow, R = hydraulic radius of whole compound channel, and α = a measure of increase in wetted perimeter.
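The closed-form models in Table 2 can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the two-sub-section sample values are hypothetical, the total area A and total perimeter P are assumed to be the sums of the sub-sectional values, and the D&H model is omitted because Table 2 gives it only in functional form.

```python
import math

def cox_resistance_force(n, A):
    """COX (resistance-force form): n_c = sqrt(sum(n_i^2 * A_i) / A)."""
    return math.sqrt(sum(ni ** 2 * Ai for ni, Ai in zip(n, A)) / sum(A))

def cox_discharge(n, A):
    """COX (discharge form): n_c = A / sum(A_i / n_i)."""
    return sum(A) / sum(Ai / ni for ni, Ai in zip(n, A))

def einstein_banks(n, P):
    """EBM: n_c = [sum(n_i^(3/2) * P_i) / P]^(2/3)."""
    return (sum(ni ** 1.5 * Pi for ni, Pi in zip(n, P)) / sum(P)) ** (2.0 / 3.0)

def lotter(n, P, R, R_total):
    """LM: n_c = P * R^(5/3) / sum(P_i * R_i^(5/3) / n_i)."""
    denom = sum(Pi * Ri ** (5.0 / 3.0) / ni for ni, Pi, Ri in zip(n, P, R))
    return sum(P) * R_total ** (5.0 / 3.0) / denom

def krishnamurthy_christensen(n, P, h):
    """KCM: n_c = exp[sum(P_i h_i^(3/2) ln n_i) / sum(P_i h_i^(3/2))]."""
    weights = [Pi * hi ** 1.5 for Pi, hi in zip(P, h)]
    log_sum = sum(wi * math.log(ni) for wi, ni in zip(weights, n))
    return math.exp(log_sum / sum(weights))

# Hypothetical channel with two sub-sections (main channel, flood plain).
n = [0.012, 0.030]                      # sub-sectional Manning's n
A = [0.40, 0.25]                        # sub-sectional areas (m^2)
P = [1.20, 1.60]                        # sub-sectional wetted perimeters (m)
h = [0.35, 0.15]                        # sub-sectional flow depths (m)
R = [Ai / Pi for Ai, Pi in zip(A, P)]   # sub-sectional hydraulic radii (m)
R_total = sum(A) / sum(P)               # hydraulic radius of the whole section

estimates = {
    "COX (force)": cox_resistance_force(n, A),
    "COX (discharge)": cox_discharge(n, A),
    "EBM": einstein_banks(n, P),
    "LM": lotter(n, P, R, R_total),
    "KCM": krishnamurthy_christensen(n, P, h),
}
for name, value in estimates.items():
    print(f"{name}: n_c = {value:.5f}")
```

A quick sanity check on any such implementation is that every model returns the common value of n when all sub-sections share the same roughness (and, for LM, the same hydraulic radius).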

Figure 1 | True composite friction factor versus composite friction factor predicted by five methods for Atabay et al.’s (2004) experimental conditions.

Figure 2 | True composite friction factor versus composite friction factor predicted by five methods for Soong & Depue’s (1996) experimental conditions.

Figure 3 | True composite friction factor versus composite friction factor predicted by five methods for Tang & Knight’s (2001b) experimental conditions.

equation and the friction factor predicted by empirical models given in Table 2 for the three experimental conditions shown in Table 1. The mean absolute relative error for each model is shown in Table 3. It is inferred from Table 3 that the predictive models considered in this study are not capable of accurately predicting composite friction factor for all data sets. For example, Einstein & Banks’ () model predicts Soong & DePue’s () data with reasonable accuracy but fails to predict other data sets. Similarly, Krishnamurthy & Christensen’s () model predicts Tang & Knight’s (a) data with adequate accuracy but no other data sets. Therefore, it is desirable to propose a robust predictive method


Table 3 | Mean absolute relative error for different data sets

| Data set | Cox (1973) | Einstein & Banks (1950) | Lotter (1993) | Krishnamurthy & Christensen (1972) | Dracos & Hardegger (1987) |
| FCF Series A | 28.33 | 32.6 | 24.16 | 49.47 | 15.738 |
| Tominaga & Nezu () | 32.21 | 33.68 | 8.28 | 34.72 | 25.22 |
| Tang & Knight (a) | 28.81 | 57.58 | 13.52 | 9.37 | 13.74 |
| Tang & Knight (b) | 28.24 | 33.721 | 27.61 | 46.14 | 26.14 |
| Atabay et al. () | 17.28 | 33.43 | 35.75 | 18.10 | 14.98 |
| Soong & DePue () | 13.12 | 7.42 | 33.21 | 15.28 | 34.38 |
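Table 3 does not spell out how the mean absolute relative error is computed. The sketch below assumes the usual definition, expressed in percent, MARE = (100/N) Σ |n_pred − n_true| / n_true; the function name and the sample values are hypothetical and are not data from the study.

```python
def mean_absolute_relative_error(true_values, predicted_values):
    """Mean absolute relative error in percent (assumed definition)."""
    if not true_values or len(true_values) != len(predicted_values):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - t) / abs(t)
                for t, p in zip(true_values, predicted_values))
    return 100.0 * total / len(true_values)

# Hypothetical true and predicted composite friction factors.
true_nc = [0.010, 0.020]
predicted_nc = [0.011, 0.018]
print(mean_absolute_relative_error(true_nc, predicted_nc))
```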

for an accurate prediction of a composite friction factor under different hydraulic conditions.

In order to develop a robust approach to predict a composite friction factor, five flow parameters used for the estimation of the overall discharge in compound channels suggested by Yang et al. () are considered. The parameters are: (i) relative width (Br), i.e., the ratio of the width of the flood plain (B − b) to the total width (B), where b = main channel width; (ii) ratio of the perimeter of the main channel (Pmc) to the flood plain perimeter (Pfp), denoted as Pr; (iii) the ratio of hydraulic radius of the main channel (Rmc) to the flood plain (Rfp), denoted as Rr, which usually varies with symmetry; (iv) the channel longitudinal slope (S0); and (v) the relative depth (Hr), i.e., the ratio of the flow depth of the flood plain (H − h) to the total depth (H), where h = main channel depth. In this study, these five flow parameters are chosen as input parameters and a composite friction factor as an output parameter.

Adaptive neuro-fuzzy inference system

The ANFIS is a combination of an ANN and a fuzzy inference system (FIS). The neural network learns the structure of the data, but the learned network structure or the associated pattern is difficult to interpret. However, the FIS can understand the structure and develop the rule base using IF-THEN rules to predict the output. A neural network with its learning capabilities can be used to learn the fuzzy decision rules to create a hybrid intelligent system. The fuzzy system provides expert knowledge to be used by the neural network. A FIS consists of three components: first, a rule base which contains a selection of fuzzy rules; second, a database which defines the membership functions used in the rules; and finally, a reasoning mechanism which carries out the inference procedure on the rules and the given facts. Jang (a) proposed a combination of a neural network and fuzzy logic (FL) known as an ANFIS. ANFIS is a FIS implemented in the framework of neural networks. The combination of both ANN and FIS thus improves the system performance without interaction with operators. For this reason, it is possible to deduce the logical pattern of the prediction. The advantage of the technique is that the ANFIS architecture can be used to model the nonlinear functions for the prediction of the desired result in a logical manner (Jang a, b, , ).

Fuzzy logic and fuzzy inference systems

Fuzzy systems are based on IF-THEN fuzzy rules. The building of FL systems begins with the derivation of a set of IF-THEN fuzzy rules comprising the expertise and knowledge of the modeling field (Dezfoli ). The modeling of suitable rules is tedious, and hence a predefined method or tool to achieve the fuzzy rules from numerical and statistical analysis is most appropriate for this context. Fuzzy conditional statements are expressed in forms such as ‘if hydraulic depth (Dr) is small then friction factor is high’, where these parameters are levels described by fuzzy sets that are characterized by membership functions. Hence, these concise forms of fuzzy rules are often employed to make decisions in situations of uncertainty. These play an important role in the human ability to make decisions. From Figure 4, it can be observed that the FIS and fuzzy decision making procedure comprise five functional building blocks including: (i) rule base, (ii) database, (iii) decision making unit, (iv) fuzzification interface, and (v) defuzzification interface. The rule base and database are referred to as the knowledge base. The inference system is based on logical rules which map the input variables space to the output variable space using IF-THEN statements


and a fuzzy decision making procedure (Dezfoli ; Jang & Gulley ). Because real and field values are uncertain, a fuzzification transition is used to transform deterministic values to fuzzy values and a defuzzification transition is used to transform fuzzy values into deterministic values (Dezfoli ).

Figure 4 | Schematic diagram of fuzzy based inference system.

Figure 5 | A typical architecture of ANFIS system.

Architecture and basic learning rules

A typical adaptive network shown in Figure 5 is a network structure consisting of a number of nodes connected through directional links. Each node is characterized by a node function with fixed or adjustable parameters. The learning or


training phase of a neural network is a process to determine parameter values to sufficiently fit the training data. The basic learning rule is the back-propagation (BP) method, which seeks to minimize some error, usually the sum of squared differences between the network’s outputs and desired outputs. Generally, the model performance is checked by means of distinct test data, and a relatively good fit is expected in the testing phase. Considering a first order fuzzy inference system according to Takagi, Sugeno and Kang (TSK), a fuzzy model consists of two rules (Sugeno & Kang ):

Rule 1: If x is A1 and y is B1 then $f_1 = p_1 x + q_1 y + r_1 \qquad (3)$

Rule 2: If x is A2 and y is B2 then $f_2 = p_2 x + q_2 y + r_2 \qquad (4)$

If f1 and f2 are constants instead of linear equations, we have a zero order TSK fuzzy-model. Node functions in the same layer are of the same function family as described below. It is to be noted that $O_i^j$ denotes the output of the i th node in layer j.

Layer 1: Each node in this layer generates a membership grade of a linguistic label. For instance, the node function of the i th node might be:

$$O_i^1 = \mu_{A_i}(x) = \frac{1}{1 + \left|\dfrac{x - c_i}{a_i}\right|^{2 b_i}} \qquad (5)$$

where x is the input to the node i, and Ai is the linguistic label (small, large) associated with this node; and {ai, bi, ci} is the parameter set that changes the shapes of the membership function. Parameters in this layer are referred to as the ‘premise parameters’.

Layer 2: Each node in this layer calculates the firing strength of each rule via multiplication:

$$O_i^2 = w_i = \mu_{A_i}(x) \times \mu_{B_i}(y), \quad i = 1, 2 \qquad (6)$$

Layer 3: The i th node of this layer calculates the ratio of the i th rule’s firing strength to the sum of all rules’ firing strengths:

$$O_i^3 = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \qquad (7)$$

For convenience, outputs of this layer will be called normalized firing strengths.

Layer 4: Every node i in this layer is a square node with a node function:

$$O_i^4 = \bar{w}_i f_i = \bar{w}_i \left(p_i x + q_i y + r_i\right) \qquad (8)$$

where $\bar{w}_i$ is the output of layer 3, and {pi, qi, ri} is the parameter set. Parameters in this layer will be referred to as ‘consequent parameters’.

Layer 5: The single circle node computes the overall output as the summation of all incoming signals:

$$O_i^5 = \text{overall output} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \qquad (9)$$

Thus, an adaptive network as presented in Figure 5 is functionally equivalent to a fuzzy inference system. The basic learning rule of ANFIS is the BP gradient descent which calculates error signals (defined as the derivative of the squared error with respect to each node’s output) recursively from the output layer backward to the input nodes (Werbos ). This learning rule is exactly the same as the back-propagation learning rule used in the common feed-forward neural networks (Rumelhart et al. ). From the ANFIS architecture (Figure 5), it is observed that given values of the premise parameters, the overall output can be expressed as a linear combination of the consequent parameters. Based on this observation, a hybrid learning rule is employed here, which combines a gradient descent and the least squares method to find feasible antecedent and consequent parameters (Jang a, ). The details of the hybrid rule are given by Jang et al. (), where it is also claimed to be significantly faster than the classical back-propagation method.

Hybrid learning algorithm

From the ANFIS architecture (Figure 5), we observe that when the values of the premise parameters are fixed the overall output can be expressed as a linear combination. The output ‘F’ can be rewritten as:

$$F = \frac{w_1}{w_1 + w_2} f_1 + \frac{w_2}{w_1 + w_2} f_2 = \bar{w}_1 f_1 + \bar{w}_2 f_2 = (\bar{w}_1 x) p_1 + (\bar{w}_1 y) q_1 + (\bar{w}_1) r_1 + (\bar{w}_2 x) p_2 + (\bar{w}_2 y) q_2 + (\bar{w}_2) r_2 \qquad (10)$$
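The five layers and the least-squares half of the hybrid rule can be illustrated with a small NumPy sketch of the two-rule, first-order TSK model of Equations (3)–(10). It is not the MATLAB implementation used in the study: the function names, the bell-shaped membership function of Equation (5) used for both inputs, and all parameter values are assumptions chosen for illustration. With the premise parameters held fixed, the consequent parameters follow from an ordinary linear least-squares fit; the gradient-descent update of the premise parameters is not shown.

```python
import numpy as np

def bell(x, a, b, c):
    """Generalized bell membership function, Equation (5)."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2.0 * b))

def normalized_firing_strengths(x, y, premise):
    """Layers 1-3: membership grades, firing strengths, normalization."""
    w = np.stack(
        [bell(x, *premise["A"][i]) * bell(y, *premise["B"][i]) for i in range(2)],
        axis=1,
    )                                            # Equation (6), shape (N, 2)
    return w / w.sum(axis=1, keepdims=True)      # Equation (7)

def design_matrix(x, y, w_bar):
    """Columns that multiply (p1, q1, r1, p2, q2, r2) in Equation (10)."""
    return np.column_stack([
        w_bar[:, 0] * x, w_bar[:, 0] * y, w_bar[:, 0],
        w_bar[:, 1] * x, w_bar[:, 1] * y, w_bar[:, 1],
    ])

def forward(x, y, premise, consequent):
    """Layers 4-5: overall output F, Equations (8)-(10)."""
    w_bar = normalized_firing_strengths(x, y, premise)
    return design_matrix(x, y, w_bar) @ consequent

def fit_consequent(x, y, target, premise):
    """Least-squares estimate of the consequent parameters, premise fixed."""
    w_bar = normalized_firing_strengths(x, y, premise)
    theta, *_ = np.linalg.lstsq(design_matrix(x, y, w_bar), target, rcond=None)
    return theta

# Hypothetical premise parameters (a, b, c) for the labels A1, A2, B1, B2.
premise = {
    "A": [(0.5, 2.0, 0.0), (0.5, 2.0, 1.0)],
    "B": [(0.5, 2.0, 0.0), (0.5, 2.0, 1.0)],
}
# Synthetic data generated from known (hypothetical) consequent parameters.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 50)
y = rng.uniform(0.0, 1.0, 50)
true_theta = np.array([1.0, -2.0, 0.5, 0.3, 0.7, -1.0])
target = forward(x, y, premise, true_theta)

theta = fit_consequent(x, y, target, premise)
print(np.max(np.abs(forward(x, y, premise, theta) - target)))
```

Because the output is linear in the six consequent parameters, a single least-squares solve reproduces the synthetic targets to numerical precision; only the premise parameters require an iterative search.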


Equation (10) is linear in the consequent parameters p1, q1, r1, p2, q2, r2. Therefore, the hybrid learning algorithm developed can be applied directly. More specifically, in the forward pass of the hybrid learning algorithm, node outputs go forward until layer 4 and the consequent parameters are identified by the least squares method. In the backward pass, the error signal propagates backward and the premise parameters are updated by gradient descent. As mentioned, the consequent parameters thus identified are optimal under the condition that the premise parameters are fixed. Accordingly, the hybrid approach converges much faster since it reduces the dimension of the search space of the original back-propagation method. If the network fixes the membership functions and adapts only the consequent parts, ANFIS can be viewed as a functional-link network (Klassen & Pao ; Pao ) in which the enhanced representation takes advantage of human knowledge and expresses more insight. By fine-tuning the membership functions, we actually generate this enhanced representation.

Training and testing of ANFIS network

The data required for the simulation are first generated using Manning’s equation for obtaining a composite friction factor under different hydraulic conditions, as shown in Table 1. The input parameters for the simulation are referred to in a previous section (Prediction of composite friction factor by empirical models). The entire experimental data set is divided into training and testing data sets. A total of 228 data are used. Among the 228 data, 206 are considered as training data and 22 as testing data. The number of nodes in the second layer is increased gradually during the training process starting with two. It was observed that the error converges (decreases) as the nodes increase to five. Hence, the number of nodes in the second layer is fixed at five and further analysis is carried out. The five layers are one input, three hidden, and one output layer. The network was run on a MATLAB platform using a Pentium IV desktop computer. A Gaussian-type membership function (gauss2mf) is chosen for each input and a linear-type membership function is used for the output while generating the FIS. The training error becomes steady after 10 iterations due to the faster hybrid learning rule, which ensured that the model parameters are matched. After that, 22 data are used for testing to verify the accuracy of the proposed model.

Figure 6 | Distribution of composite friction factor (training data).

Prediction of composite friction factor using ANFIS

The composite friction factor is predicted using the ANFIS model based on five input parameters: relative width, ratio of perimeter of main channel to flood plain perimeter, ratio of hydraulic radius of main channel to flood plain, channel longitudinal slope, and relative depth. The pattern of variation of the actual and predicted composite friction factor is shown for the training and testing data sets in Figures 6 and 7, respectively. The black line indicates actual output and the grey line represents the predicted data from ANFIS. The plots show the coherent nature of the data distribution. The surface plot is shown in Figure 8. It can be observed that the surface covers the total landscape of decision space. Residuals are calculated as the difference between the actual and the predicted composite friction factors for the training data set and are plotted in Figure 9. It can be observed that the residuals are distributed evenly along the centerline of the plot. To verify the accuracy of the results, a regression analysis is also carried out. Regression curves are plotted in Figures 10 and 11 between the actual composite friction


factor and the predicted composite friction factor for the training and the testing data, respectively. It can be observed that the data are well fitted because high values of the coefficient of determination (R² = 0.991 for training and R² = 0.962 for testing data) are obtained. The testing data set is used to find the coefficient of determination for the other five methods as shown in Figures 12–16. From these figures, it can be observed that the EBM method exhibits the least accuracy because the coefficient of determination (R²) of EBM is 0.687 whereas the coefficient for the ANFIS method is 0.962.

Figure 7 | Distribution of composite friction factor (testing data).

Figure 8 | Surface plot.

Figure 9 | Residual distribution of training data set.

Figure 10 | Correlation plot for training set of data points.

Figure 11 | Correlation plot for testing set of data points.

Figure 12 | Correlation plot for testing data (COX).

Figure 13 | Correlation plot for testing data (EBM).

Figure 14 | Correlation plot for testing data (LM).

Figure 15 | Correlation plot for testing data (KCM).

Figure 16 | Correlation plot for testing data (D & H).

NUMERICAL MODELING OF TURBULENT FLOW STRUCTURES

Although the ANFIS model is quite robust in predicting a composite friction factor considering the non-linearity in the relation between the input flow parameters and the output, it is vital to find out the reason for this non-linear relationship. In fact, momentum transfer in compound channels leads to an inaccurate estimation of discharge using empirical relations. Here, an attempt is made to present the effect of momentum transfer on the discharge in a compound channel via numerical analysis so that insight into the flow mechanism can be gained. The numerical analysis simulates a tilting flume with an 8 m length and a 0.4 × 0.4 m² cross-section for which Tominaga & Nezu () carried out experiments using a fiber-optic laser-Doppler anemometer to measure three-directional components of the turbulent velocity

shown in Figure 17 (S1 case). The geometry of the channel is discretized with ANSYS 12 design modeler. The width to depth (B/H) ratio of the channel is 4.981 whereas the slope is 0.00064. The flow is considered as uniform incompressible turbulent flow at the test section 7.5 m from the inlet. The hydraulic radius (R) of the channel is 0.043. The Reynolds number (Re) of the flow for this case is 6.72 × 10⁴.

The fluid flow equations are solved by discretizing the whole domain into an unstructured hybrid mesh (mixture of prism and triangular) that divides the continuum into a finite number of nodes considering the near-wall effect. The computations need a spatial discretization and time marching scheme. In this study, the transient simulation process is completed with the help of the commercial package ANSYS CFX (ANSYS CFX Tutorials ANSYS CFX Release 11.0 ). This package generally solves the Navier–Stokes (NS) equations using a finite element-finite volume method. The mesh and simulation details are shown in Table 4. The governing equations (Equations (11) and (12)) are employed for LES obtained by filtering the time-dependent NS equation and continuity in either Fourier (wave-number) space or configuration (physical) space:

$$\frac{\partial \rho}{\partial t} + \frac{\partial}{\partial x_i}\left(\rho u_i\right) = 0 \qquad (11)$$

$$\frac{\partial}{\partial t}\left(\rho u_i\right) + \frac{\partial}{\partial x_j}\left(\rho u_i u_j\right) = \frac{\partial}{\partial x_j}\left(\mu \frac{\partial \sigma_{ij}}{\partial x_j}\right) - \frac{\partial p}{\partial x_i} - \frac{\partial \tau_{ij}}{\partial x_j} \qquad (12)$$

where ρ = density of water, ui and uj are the resolved (filtered) velocity components in the xi and xj directions, σij = normal stress in plane i along j direction, p = pressure, τij = tangential shear stress in plane i along j direction. Equation (11) is the continuity equation which is linear and does not change due to filtering.

To capture the flow features in turbulence, the large-scale motion is resolved directly in LES, as in a direct numerical simulation (DNS), but the effect of small scales is modeled using a sub-grid scale (SGS) model. The LES method can incorporate a much coarser grid so that the temporal evolution of the large-scale turbulent motions can be directly simulated while the unresolved small-scale motions can be modeled

through the use of a Smagorinsky model. The filtering process filters out the eddies whose scales are smaller than the filter width or grid spacing used in the computations.

Figure 17 | Geometric alignment of flume channel along with boundary conditions.

Table 4 | Summary of mesh and simulation details using ANSYS-CFX

| Case | Mesh spacing (m) | y+ range | H/u* (sec) | Time step (sec) | LETOT (large eddy turnover time state), initial trial |
| S1 | 0.005 | 9.23–110.87 | 5 | 0.001 | 70, 10 |

$y^+ = y u_* / \nu$ = scaled depth of flow, where y = respective flow depth, u* = shear velocity, and ν = kinematic viscosity.

The results of the simulation are compared with case S1 of Tominaga & Nezu (). From Table 5, it is evident that the results obtained from LES simulation are in good agreement with case S1 of Tominaga & Nezu (). Here, mean bulk velocity is calculated using the formulation:

$$W_b = \frac{\int_A w \, dA}{A} \qquad (13)$$

where Wb = mean velocity of the flow and w = velocity at the point of consideration. The composite friction factor is calculated from Manning’s equation.

The isovel lines of the non-dimensional stream-wise velocity W(z) computed by the LES method are shown in Figure 18. The simulation shows that the maximum velocity is 0.4049 m/s, which is observed near the centerline of the channel at approximately 0.057 m from the centerline. The isovel lines bulge significantly upward in the vicinity of the junction edge along the flow. The patterns of the isovel lines from LES simulation results convincingly follow the experimental results of Tominaga & Nezu (). The reason for this bulge is the decelerated region on both sides of the junction region of the main channel. The region is created because of the low momentum transport due to the secondary current away from the wall. This causes the bulge in the main channel and flood plain interface due to high momentum transport by the secondary current. Consequently, the primary velocity is directly affected by the momentum transport due to the secondary current. The momentum transfer due to the secondary circulation component and the turbulent transport are three-dimensional in nature. These flow structures also depend on the corner of the channel and the shape of the compound cross-section. It is quite evident that turbulent structures as discussed are three-dimensional and highly non-linear.
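As a rough illustration of the post-processing described above, the sketch below evaluates a discrete form of Equation (13) over the cells of a cross-section and then solves Manning’s equation in SI units, V = (1/n) R^(2/3) S^(1/2), for n. The function names and the sample cell values are hypothetical and are not the simulation output; the paper does not state which mean velocity, radius and slope were inserted into Manning’s equation, so the sketch is not claimed to reproduce the values in Table 5.

```python
def bulk_velocity(w, dA):
    """Discrete form of Equation (13): W_b = sum(w_k * dA_k) / sum(dA_k)."""
    return sum(wk * ak for wk, ak in zip(w, dA)) / sum(dA)

def manning_n(mean_velocity, hydraulic_radius, slope):
    """Manning's equation (SI units) solved for the friction factor n."""
    return hydraulic_radius ** (2.0 / 3.0) * slope ** 0.5 / mean_velocity

# Hypothetical stream-wise velocities (m/s) and cell areas (m^2).
w = [0.30, 0.38, 0.40, 0.35]
dA = [0.002, 0.004, 0.004, 0.002]

W_b = bulk_velocity(w, dA)
print(W_b, manning_n(W_b, hydraulic_radius=0.05, slope=0.001))
```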

Table 5 | Flow parameters of the experiment and simulation

| Case | Maximum velocity, Wmax (m/s) | Mean bulk velocity, Wb (m/s) | Composite friction factor, ‘nc’ |
| S1 | 0.409 | 0.368 | 0.011383 |
| LES simulation results | 0.4049 | 0.3671 | 0.011380 |

Figure 18 | Mean velocity distribution of LES simulation.

CONCLUSIONS

Based on the analysis made in this study, the following conclusions can be drawn:

1. Five empirical models for the prediction of a composite friction factor have been studied. It is observed that the models can predict the composite friction factor accurately for a few data sets. Generally, the models break down when predicting the composite friction factor for a wide range of hydraulic conditions and geometries of compound channel.
2. To alleviate the above problem, a robust prediction strategy based on an ANFIS has been proposed. It is demonstrated that the ANFIS model is quite capable of predicting a composite friction factor with reasonable accuracy for a wide range of hydraulic conditions.
3. Further, the LES turbulence model has been adopted to analyze the compound open channel condition. The velocity distribution in an asymmetric compound channel is presented. The composite friction factor found from the LES is in good agreement with experimental results.
4. Moreover, both the LES and ANFIS models are fairly convincing and account for turbulence during the prediction of the discharge and the composite friction factor in a compound open channel. A reasonably accurate prediction of composite friction factor for different geometry, hydraulic conditions and bed/material can be obtained with less computational effort by ANFIS, which can be useful for field engineers.
5. In future, the study can be extended to consider different hydraulic conditions for the prediction of composite friction factor using LES and ANFIS models.

REFERENCES

Atabay, S., Knight, D. W. & Seckin, G.  Influence of a mobile bed on the boundary shear in a compound channel. In: Proceedings of the Second International Conference on Fluvial Hydraulics, Naples, Italy, 23–25 June, 1, pp. 337–345.
ANSYS CFX Tutorials ANSYS CFX Release 11.0.  ANSYS Inc. Southpointe, 275 Technology Drive, Canonsburg, PA 15317.
Beaman, F.  Large Eddy Simulation of Open Channel Flows for Conveyance Estimation. PhD Thesis, University of Nottingham.
Bigil, A. & Altun, H.  Investigation of flow resistance in smooth open channels using artificial neural network. Flow Meas. Instrum. 19, 404–408.
Christodoulou, G. C. & Myers, W. R. C.  Apparent friction factor on the flood plain-main channel interface of compound channel sections. In: Proc. 28th IAHR Congress, Graz, Austria.
Cokljat, D.  Turbulence Models for Non-circular Ducts and Channels. PhD Thesis, City University London.
Cox, R. G.  Effective hydraulic roughness for channels having bed roughness different from bank roughness. Miscellaneous Paper H-73-2, US Army Engineers Waterways Experiment Station, Vicksburg, MS.
Dezfoli, K. A.  Principles of Fuzzy Theory and its Application on Water Engineering Problems. Jihad Press, Tehran, Iran, p. 227.
Dracos, T. & Hardegger, P.  Steady uniform flow in prismatic channels with flood plains. J. Hydraul. Res. IAHR 25 (2), 169–185.
Einstein, H. A. & Banks, R. B.  Fluid resistance of composite roughness. Trans. Am. Geo. Union 31 (4), 603–610.
Esen, H., Inalli, M., Sengur, A. & Esen, M.  Modeling a ground-coupled heat pump system using adaptive neuro-fuzzy inference systems. J. Refrig. 31 (1), 64–74.
Fadare, D. A. & Ofidhe, I. U.  Artificial neural network model for prediction of friction factor in pipe flow. J. Appl. Sci. Res. 5 (6), 662–670.
Hin, L. S., Bessaih, N., Ling, L. P., Ghani, A., Zakaria, N. A. & Seng, M. Y.  Discharge estimation for equatorial natural rivers with over bank flow. Int. J. River Basin Manage. 6 (1), 13–21.
Hodges, B. R. & Street, R. L.  On simulation of turbulent nonlinear free-surface flows. J. Comp. Phys. 151, 425–457.
Jang, R. J. a Fuzzy modeling using generalized neural networks and Kalman filter algorithm. In: Int. Proc. 9th National Conf. on Artificial Intelligence, Anaheim, CA, 15–19 July, pp. 762–767.
Jang, R. J. b Rule extraction using generalized neural networks. In: Int. Proc. 4th IFSA World Congress, pp. 82–86.
Jang, R. J.  ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans. Sys. Man. Cyber. 23 (3), 665–685.
Jang, R. J.  Structure determination in fuzzy modeling: a fuzzy CART approach. In: Proc. IEEE conf. on Fuzzy Systems, Orlando, FL.
Jang, J. S. R. & Gulley, N.  Fuzzy Logic Toolbox: Reference Manual. The Math Works Inc., Natick, MA, USA.
Jang, J. S. R., Sun, C. T. & Mizutani, E.  Neuro-fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence. Prentice-Hall International, London.
Kawahara, Y. & Tamai, N.  Numerical calculation of turbulent flows in compound channels with an algebraic stress turbulence model. In: Proc. 3rd Symp. Refined Flow Modeling and Turbulence Measurements, Tokyo, Japan, pp. 9–17.
Klassen, M. S. & Pao, Y. H.  Characteristics of the functional link net: a higher order delta rule net. In: IEEE Proc. Conf. Neural Networks, San Diego, CA.
Krishnamurthy, M. & Christensen, B. A.  Equivalent roughness for shallow channels. J. Hydraul. Eng. ASCE 98 (12), 2257–2263.
Krishnappan, B. G. & Lau, Y. L.  Turbulence modeling of flood plain flows. J. Hydraul. Eng. ASCE 112 (4), 251–265.
Lane, S. N., Bradbrook, K. F., Richards, K. S., Biron, P. A. & Roy, A. G.  The application of computational fluid dynamics to natural river channels: three-dimensional versus two-dimensional approaches. Geomorphology 29, 1–20.
Lotter, G. K.  Considerations on hydraulic design of channel with different roughness of walls. Trans. AU Sci. Res. Inst. Hydraul. Eng. 9, 238–241.
Morvan, H. P.  Three-dimensional Simulation of River Flood Flows. PhD Thesis, University of Glasgow, Glasgow.
Nakayama, A. & Yokojima, S.  LES of open-channel flow with free-surface fluctuations. In: Proc. Hydraul. Eng. JSCE. 46, 373–378.
Pan, Y. & Banerjee, S.  Numerical investigation of free-surface turbulence in open-channel flows. Phys. Fluids 113 (7), 1649–1664.
Pang, B.  River flood flow and its energy loss. J. Hydraul. Eng. ASCE 124 (2), 228–231.
Pao, Y. H.  Adaptive Pattern Recognition and Neural Network. Addison-Wesley, Boston, MA, pp. 197–222.
Riahi-Madvar, H., Ayyoubzadeh, A. S., Khadangi, E. & Ebadzadeh, M. M.  An expert system for predicting longitudinal dispersion coefficient in natural streams by using ANFIS. Exp. Sys. App. 36 (2), 1142–1154.
Rumelhart, D. E., Hinton, G. E. & William, D. E.  Learning internal representations by error propagation. In: Parallel distributed processing: explorations in the microstructure of cognition (D. E. Rumelhart & J. L. McClelland, eds). MIT Press, Cambridge, MA, pp. 318–362.
Salvetti, M. V., Zang, Y., Street, R. L. & Banerjee, S.  Large-eddy simulation of free-surface decaying turbulence with dynamic subgrid-scale models. Phys. Fluids 9 (8), 2405–2419.
Shi, J., Thomas, T. G. & Williams, J. J. R.  Large eddy simulation of flow in a rectangular open channel. J. Hydraul. Res. 37 (3), 345–361.
Sinha, S. K., Sotiropoulos, F. & Odgaard, A. J.  Three-dimensional numerical model for flow through natural rivers. J. Hydraul. Eng. 124 (1), 13–24.
Soong, T. W. & DePue II, P. M.  Variation of Manning’s Coefficient with Channel Stage. Unpublished MS Thesis, University of Illinois at Urbana-Champaign, Urbana, IL.
Sugeno, M. & Kang, G. T.  Structure identification of fuzzy model. Fuzzy Sets Syst. 28, 15–33.
Tang, X. & Knight, D. W. a Analysis of bed form dimensions in a compound channel. In: Proceedings of 2nd IAHR Symposium on River, Coastal and Estuarine Morphodynamics, Obihiro, Japan, pp. 555–563.
Tang, X. & Knight, D. W. b Experimental study of stage-discharge relationships and sediment transport rates in a compound channel. In: Proc. 29th IAHR Congress, Beijing, China, 16–21 September, pp. 69–76.
Thomas, T. G. & Williams, J. a Large eddy simulation of a symmetric trapezoidal channel at Reynolds number of 430,000. J. Hydraul. Res. 33 (6), 825–842.
Thomas, T. G. & Williams, J. b Large eddy simulation of turbulent flow in an asymmetric compound open channel. J. Hydraul. Res. 33 (1), 27–41.
Thomas, T. G. & Williams, J.  Large eddy simulation of flow in a rectangular open channel. J. Hydraul. Res. 37 (3), 345–361.
Tominaga, A. & Nezu, I.  Turbulent structures in compound open-channel flow. J. Hydraul. Eng. ASCE 117, 21–41.
University of Birmingham Flow Database. Available at: www.flowdata.bham.ac.uk/atabay/index.shtml; www.flowdata.bham.ac.uk/fcfa.shtml; www.flowdata.bham.ac.uk/tang/data.shtml.
Walid, H. S. & Shyam, S. S.  An artificial neural network for non-iterative calculation of the friction factor in pipeline flow. Comput. Electron. Agric. 21, 219–228.
Werbos, P. J.  Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Dissertation, Harvard University, Cambridge, MA.
Wormleaton, P. R., Allen, J. & Hadjipanos, P.  Discharge assessment in compound channel flow. J. Hydraul. Eng. ASCE 108 (9), 975–994.
Yang, K., Cao, S. & Liu, X.  Study on resistance coefficient in compound channels. Acta Mech. Sinica 21, 353–361.
Yang, K., Cao, S. & Liu, X.  Flow resistance and its prediction methods in compound channels. Acta Mech. Sinica 23, 23–31.
Yuhong, Z. & Wenxin, H.  Application of artificial neural network to predict the friction factor of open channel. Commun. Nonlinear Sci. Numer. Simulat. 14, 2373–2378.

First received 10 April 2012; accepted in revised form 1 May 2013. Available online 30 May 2013


© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

Evolutionary network flow models for obtaining operation rules in multi-reservoir water systems
Néstor Lerma, Javier Paredes-Arquiola, Jose-Luis Molina and Joaquín Andreu

ABSTRACT Obtaining operation rules (OR) for multi-reservoir water systems through optimization and simulation processes has been an intensely studied topic. Here, an innovative approach that integrates two techniques, network flow simulation models and evolutionary multi-objective optimization (EMO), is proposed for obtaining the operation rules for integrated water resource management (IWRM). This paper demonstrates a methodology based on the coupling of an EMO algorithm (NSGA-II, or Non-dominated Sorting Genetic Algorithm II) with an existing water resources allocation simulation network flow model (SIMGES). The implementation is made for a real case study, the Mijares River basin (Spain), which is characterized by severe drought events, a very traditional water rights system and its historical implementation of the conjunctive use of surface and ground water. The established operation rules aim to minimize the maximum deficit in the short term without compromising the maximum deficits in the long term. This research demonstrates the utility of the proposed methodology by coupling NSGA-II and SIMGES to find the optimal reservoir operation rules in multi-reservoir water systems.

Key words | agricultural demands, AQUATOOL, decision support system shell, deficits, drought, genetic algorithms, NSGA-II, operating rules, optimization, SIMGES, simulation, water resources system

Néstor Lerma (corresponding author), Javier Paredes-Arquiola, Joaquín Andreu
Universitat Politècnica de València, Research Institute of Water and Environmental Engineering (IIAMA), Ciudad Politécnica de la Innovación, Camino de Vera, 46022 Valencia, Spain
E-mail: neslerel@upv.es

Jose-Luis Molina
Polytechnic School of Engineering, Department of Hydraulic Engineering, Salamanca University, Av. de los Hornos Caleros, 50, 05003 Ávila, Spain

doi: 10.2166/hydro.2013.151

INTRODUCTION

Several authors have noted the absence of the application of optimization models to the real management of multi-reservoir water systems (Yeh ; Wurbs ; Labadie ). The applicability of most reservoir operation models is limited because of the ‘high degree of abstraction’ necessary for the efficient application of optimization techniques (Akter & Simonovic ; Moeni et al. ). On the other hand, other authors such as Oliviera & Loucks () maintain that this is because of institutional limitations rather than technological or mathematical limitations.

Decision-making in environmental and hydrological projects can be complex and inflexible because of the inherent trade-offs among socio-political, economic, environmental and technical factors. The selection of the appropriate management strategies often involves multiple conflicting objectives that should be ‘optimized’ simultaneously (Makropoulos et al. ). Thus, there exists the concept of Pareto optimal solutions, i.e. solutions for which it is not possible to improve on the attainment of one objective without making at least one of the others worse. Evolutionary multi-objective optimization (EMO) algorithms offer a means of finding the optimal Pareto front (Farmani et al. a; Cisty ; Abd-Elhamid & Javadi ). The decision-maker can consequently be provided with a set of non-dominated solutions to select a final design solution from that set.

Although the efficiency of these algorithms in solving a number of complicated real-world problems in electrical,



hydraulic, structural or aeronautical engineering has been illustrated (Farmani et al. b, , ; Hanne & Nickel ; Molina-Cristobal et al. ; Osman et al. ; Murugan et al. ), there have been limited applications in the policy analysis of water resources management (Farmani et al. ; Molina et al. ). There are recent applications of EMO algorithms related to other water resources research studies, such as the optimal design of water distribution systems or reservoirs (Cisty ; Nazif et al. ; Haghighi et al. ; Hınçal et al. ; Louati et al. ), the conjunctive use of surface water and groundwater (Safavi et al. ), the control of seawater intrusion in coastal aquifers (Abd-Elhamid & Javadi ; Kourakos & Mantoglou ; Sedki & Ouazar ) or hydrological studies (Dumedah et al. ; Gorev et al. ; Hassanzadeh et al. ).

In this work, an evolutionary multi-objective optimization algorithm, NSGA-II (Non-dominated Sorting Genetic Algorithm; Deb et al. ), is coupled with the flow network model SIMGES (Andreu et al. ) and used to assist in the selection of the best operation rules in multi-reservoir water systems.

Despite the development and growing use of optimization models (Labadie ), most reservoir planning and operation studies are based on simulation modelling and thus require the intelligent specification of operation rules (OR). Lund & Guzman () review the derived single-purpose operating rules for reservoirs in series and in parallel for different purposes, with the derived rules supported by conceptual or mathematical deduction. Obtaining OR from the results of optimization models can be done using simple (Young ) or multiple (Bhaskar & Whitlach ) linear regressions and the use of simple statistics, tables and graphs (Lund & Ferreira ). Unfortunately, a regression analysis can produce poor results, limiting the use of the obtained OR (Labadie ). On the other hand, empirical OR have limited applicability, as is the case for the space rule (Bower et al. ) or the New York City rule (Clark ).

In many real systems, the typical OR is defined by a volume target for a reservoir that has to be maintained. Another typical OR is defined by a curve (variable monthly and constant year by year) for a reservoir or a group of reservoirs that defines a threshold to trigger an action, for example, ‘reduce demands’ or ‘start pumping groundwater’. These types of OR are commonly called Rule Curves (RC), and although they are not always the most efficient rules they are considered the most practical and accepted by users.

This paper aims to show how RC can be obtained for multi-reservoir water systems by means of the coupling of an EMO (NSGA-II) (Deb et al. ) with the simulation flow network model SIMGES (Andreu et al. ). The proposed method is applied to the Mijares River basin water system (Spain), which is characterized by strong drought events, a very traditional water rights system and its historical implementation of the conjunctive use of surface and ground water.

The paper is structured as follows. First, a theoretical background on reservoir operation rules and EMO is developed. A case study is then presented, followed by a description of the integrated methodology in which the implementation of the SIMGES and EMO methods is described. The results are then discussed and several conclusions are drawn.

RESERVOIR OPERATION RULES AND EMO

Traditionally, reservoir operation is based on heuristic procedures, RC and subjective judgments by the operator. This provides general operation strategies for reservoir releases according to the current reservoir level, hydrological conditions, water demands and the time of year (Hakimi-Asiabar et al. ; Moeni et al. ). In practice, reservoir operators usually follow RC which stipulate the actions that should be taken depending on the current state of the system (Alcigeimes & Billib ). Rule curves, or guide curves, are used to denote the operating rules that define the ideal or target storage levels and provide a mechanism for release rules to be specified as a function of water storage (Mohan & Sivakumar ; Hakimi-Asiabar et al. ). Moreover, RC can be defined as a trigger indicator to start different measures, or actions, for water management.

Obtaining RC from the results given by optimization models by linear regressions is a complex task (Young ). Revelle et al. () proposed a linear decision rule; Lund & Ferreira () used tables and statistics of the results from an optimization model to obtain the OR of



the Missouri River water system. A common technique for obtaining OR and RC is based on an iteration method for river basin simulation models. These iterations are controlled by an optimization algorithm that varies the operation rules depending on the results. For example, Cai et al. () described strategies for solving large non-linear water resource management models combining a genetic algorithm (GA) with linear programming (LP), in which a GA/LP approach was applied to a reservoir operation model with hydropower generation and to a long-term dynamic river basin planning model. Simulation models are the most widespread tool for the analysis and planning of water systems. These models are characterized by their flexibility and by the possibility of including very complex elements in the modelling. They allow a more detailed representation of the systems than the optimization models (Loucks & Sigvaldason ). River basin management decisions are therefore generally made with the support of simulation models.

Quantitative compromises for the objectives and constraints presented in the methodology section are developed in this study using a multi-objective evolutionary algorithm (MOEA), non-dominated sorting genetic algorithm II (ε-NSGA-II) (Deb et al. ). The concept of Pareto optimality is used to define the multi-objective compromises for a system. A solution is Pareto optimal (or non-dominated) if no other solution in the solution space gives a better value for one objective without also degrading the performance of at least one other objective. MOEAs are heuristic search algorithms that change the approximation to the Pareto optimal set using crossover, selection and mutation operators to mimic natural selection in the populations of organisms in nature. The evolutionary algorithm search process is an iterative process of selection, which preserves and reproduces high-quality solutions, and of variation, which introduces innovation in order to improve the population of solutions.

There are many examples demonstrating that MOEAs can solve complex non-linear and non-convex multi-objective problems (a detailed review is given by Coello-Coello et al. ). Examples of applications in water resources engineering include groundwater monitoring design (Cieniawski et al. ; Reed & Minsker ; Kollat & Reed ), groundwater remediation (Beckford et al. ; Chan Hilton & Culver ; Singh & Minsker ) and water resources systems management (Suen & Eheart ).

In recent years, there have been new advances and improvements for the NSGA-II MOEA. ε-NSGA-II represents an improvement over the original NSGA-II developed by Deb et al. () by incorporating epsilon-dominance archiving (Laumanns & Ocenasek ) and adaptive population sizing (Harik et al. ). Epsilon-dominance archiving helps to reduce the computational demand of solving high-dimensional optimization problems (Kollat & Reed ) by allowing the user to control the resolution at which the objectives are evaluated and ranked. However, coupling NSGA-II with flow network models such as SIMGES, which is the application of this research, is a new topic in the literature. The studies on coupling network flow models and EMO algorithms such as NSGA-II are scarce or even non-existent in the literature. The NSGA-II algorithm can be coupled to several other simulation models to provide optimized solutions by taking advantage of the power of those models (Farmani et al. ; Molina et al. ).

Most of the OR optimization problems have a multi-objective nature. Consequently, a multi-objective analysis is necessary for identifying the best solutions and simultaneously considering several objectives that are frequently in conflict (trade-offs). Many studies have used multi-objective techniques to address the multi-reservoir optimization problem.

Classical multi-objective approaches such as the weighting approach or the constraint method were used for this purpose (Croley & Rao ; Yeh & Becker ; Liang et al. ; Wang et al. ).

More recent applications use evolutionary multi-objective techniques for the same purpose. Reddy & Kumar () developed a multi-objective differential evolutionary algorithm and applied it to the Hirakud reservoir project (India). Kim et al. () applied the NSGA-II algorithm to the Han River basin multi-reservoir system. Chen et al. () developed a macro-evolutionary multi-objective genetic algorithm for optimizing the rule curves of a water resources system in Taiwan. Malekmohammadi et al. () presented an approach for incorporating flood control and water supply objectives for a cascade system of reservoirs by coupling the NSGA-II algorithm with an ELECTRE-TRI



(Elimination and Choice Translating Reality) postprocessor. Reddy & Kumar () presented a multi-objective evolutionary algorithm to derive operation rules for the multi-purpose Bhadra reservoir system (India). Furthermore, Chang & Chang () applied the NSGA-II algorithm in other reservoir systems in Taiwan to optimize state curves. Lin et al. () modified the algorithm SCE-UA (Shuffled Complex Evolution) to use it as a multi-objective tool to determine optimal water policy for the hydroelectric system of Huanren (NE China).

CASE STUDY: MIJARES RIVER BASIN

The Mijares River basin is located on the eastern slope of the Iberian Peninsula (Figure 1). The water system occupies a surface area of 5,466 km². The total population of the zone is 363,578 inhabitants, and the urban supply is generated by exploiting pumping wells and using springs. The total cropped surface is 124,310 ha, of which 43,530 ha (35%) corresponds to irrigated land and the rest (65%) is occupied by dry-land farming. Citrus is the predominant crop, covering approximately 87% of the irrigated area. The length of the main river branch is approximately 156 km, with an average runoff of 380 Mm³ a⁻¹.

Figure 1 | Location of the Mijares River basin.

Two climatologically different geographical areas can be distinguished: a coastal climate with a Mediterranean coastline and a continental climate area located upstream of the Arenós reservoir. The mean annual rainfall of the area is 505 mm, and the average temperature is 14.4 °C according to the Basin Water Plan (CHJ ). The maximum altitude is 2,024 m above sea level.

Regarding the storage infrastructure of the basin, there are three main reservoirs: the largest in terms of capacity is the Arenós reservoir (95 Mm³); located downstream is the Sichar reservoir (49 Mm³); and finally, located in the tributary Rambla de la Viuda is the María Cristina reservoir (19.7 Mm³).

The topology of the model for the Mijares water system is shown in Figure 2. The model includes a main course that represents the Mijares River where the Arenós and Sichar reservoirs are located. The other river considered is the tributary Rambla de la Viuda, in which the María Cristina reservoir is located. The different sources of runoff considered are the runoff of the basin upstream of the Arenós reservoir, the runoff from the mid-basin of the Mijares River between the Arenós and Sichar reservoirs and the runoff from the Rambla de la Viuda river flowing to the María Cristina reservoir. The irrigation demand can be grouped into four main zones: traditional, channel 220, channel 100 and María Cristina. The main features of these demands are shown in Table 1. The urban supply comes from the Plana de Castellón aquifer, which is located mainly beneath the coastal plain and is recharged by precipitation, infiltration from irrigation and Mijares riverbed infiltration.

Figure 2 | Topology of the simulation model developed for the Mijares River basin.

Table 1 | Values of the water demand for the irrigated areas of the basin (Mm³ a⁻¹)

Traditional irrigated area (surface water): 65
Mixed irrigated areas (surface water and groundwater): Channel 100, 37; Channel 220, 40; María Cristina, 25

One of the main issues of the basin is the allocation of the resource between the agricultural demands. The traditionally irrigated area in the low part of the basin is more than a millennium old, so its water rights are predominant over other agricultural uses. On the other hand, the irrigation of the middle part of the basin represents modern irrigation (Channel 220, Channel 100 and María Cristina), also called ‘mixed irrigation’ because of the possibility of using both surface and ground water. In this situation, it is necessary to establish an OR in order to protect the rights of traditional irrigation over surface water by imposing the use of ground water for modern irrigation. Current management is based on a RC defined in 1970, called Agreement 70 (Figure 3). The indicator of this RC is the storage of the Sichar and Arenós reservoirs. If the sum of the volume of both reservoirs is greater than the defined RC, then all the demands can use


cheaper surface water. On the other hand, when the stored volume falls below the RC, the mixed irrigation demands have to pump water and the remaining surface water is reserved for the traditionally irrigated area.

Figure 3 | Rule Curve of ‘Agreement 70’.

METHODOLOGY

The methodology estimates the RC for a complex multi-reservoir water system through the iterative use of a river basin simulation model. A popular EMO algorithm that is usually applied in water resources engineering, NSGA-II, has been used. NSGA-II is an EMO algorithm with a specific operator to handle constraints. Furthermore, the simulation of a water basin management model is required. The results obtained by this model represent the situation of the water system under the proposed water management policies. The water basin management model has been developed using the SIMGES module (Andreu et al. ) included in the decision support system shell (DSSS) AquaTool (Andreu et al. ). The combination of non-linear algorithms together with linear programming is common in water resources models.

Implementation of the simulation model SIMGES

The method requires multiple iterations of a simulation model that accurately represents the water system. For this purpose, the simulation module SIMGES included in the DSSS AquaTool has been used. SIMGES is based on the conceptualization of river basins by networks comprising arcs and nodes. Nodes usually represent the most important elements of the water system, such as divergence and confluence points, reservoirs and demands. On the other hand, arcs represent any water conveyance element (natural or artificial). Furthermore, an internal combination of arcs and nodes within the model allows other types of elements, such as hydroelectric plants and water returns in the internal flow network, to be modelled. Arcs are defined by the initial and final nodes, by the maximum and minimum flows and by the cost incurred by each unit of resource that flows through them. Mathematically, the simulation model is based on the resolution for each time step (monthly in this case) of an internal conservative flow network.

The equivalent objective function defined in the SIMGES model and simplified for our problem is the following:

$$\min F = \sum_{i=1}^{I} \left( \sum_{n=1}^{m} V_{n,i,t}\,(C_n + pn_i) + Sp_{i,t}\,C_{sp} \right) + \sum_{j=1}^{J} DR_{j,t}\,(C_{dr} + pn_j) + \sum_{k=1}^{K} DD_{k,t}\,(C_{DD} + pn_k) \tag{1}$$

where $t$ is the index for time; $i$ is the index for reservoir; $I$ is the total number of reservoirs in the model; $V_{n,i}$ is the volume of reservoir $i$ in pool $n$; $m$ is the number of pools in a reservoir; $C_n$ is the cost/benefit of the storage water in pool $n$; $pn_i$ is the priority number assigned to reservoir $i$; $Sp_i$ is the spill of reservoir $i$; $C_{sp}$ is the cost of spills in the reservoirs; $DR_j$ is the deficit of the minimum flow



established for river or channel $j$; $C_{dr}$ is the cost of deficit of a minimum flow; $J$ is the number of rivers and channels; $pn_j$ is the priority number of river $j$; $DD_k$ is the deficit of demand $k$; $K$ is the number of demands in the model; $C_{DD}$ is the cost associated with the deficits of the demands; and $pn_k$ is the priority number of demand $k$.

Restrictions are related to physical constraints or other types of constraints such as legal or environmental constraints. Other constraints such as the balance in each junction or diversion are also taken into account. Figure 4 shows a diagram of SIMGES, which takes into account the above aspects and the water system data (demands, inflows, etc.) to translate this problem into an internal network flow optimization problem, resolved using the Out-of-Kilter algorithm (Ford & Fulkerson ).

Figure 4 | Flowchart of SIMGES; interaction between data and management system.

The water management within the simulation model is defined in several ways. First, a priority system that sets water demands in order of priority (hierarchical order) is established. Similarly, a hierarchical system is established to define the releases among the reservoirs. Furthermore, the reservoirs are divided into zones such that the model tries to keep all reservoirs in the same zone and starts releasing depending on the priority. Finally, there are operation rules that allow the triggering of decisions based on indicators. These indicators can be the volume stored in one or several reservoirs or the cumulative runoff of several months. The decision can represent the application of a restriction on the demands, expressed as a percentage of one or several demands, on the flow through the turbines, on the ecological flow or on the activation of pumping from the aquifer.

The model developed for the Mijares River basin (Figure 2) represents the current situation of the system quite well. Three runoff inflow elements are considered (one for each reservoir), with historical monthly data obtained from re-naturalized monthly flows for the period 1940–2008. Additionally, the three existing reservoirs have been taken into account (Arenós, Sichar and María Cristina). The demands are considered at the correct aggregation level to represent the different irrigators. Six demands have been considered in the model: two urban demands (Castellón de la Plana and Borriol-Benicassim) and the four above-mentioned agricultural demands. SIMGES allows the surface–groundwater interaction to be modelled in a very complete way with several types of aquifers and river reaches connected to the aquifers. There are requirements for the ecological flows established in several parts of the basin. Within the model, the flows are considered in two specific river reaches with a constant flow of 0.5 m³ s⁻¹ (1.3 Mm³/month).

NSGA-II implementation

NSGA-II (Deb et al. ) (elitist non-dominated sorting genetic algorithm) is an EMO algorithm with a specific operator to handle constraints. In this method, a fast non-dominated sorting approach with a selection operator is used to create a mating pool by combining the parent and offspring populations and selecting the best solutions with



respect to the fitness and the spread (Deb et al. ; Dumedah et al. ). The next generation is populated starting with the best non-dominated front and progresses through the rest of the fronts until the population size is reached; if in the final stage there are more individuals in the non-dominated front than there is available space, a crowding-distance-based niching strategy is used to choose which individuals of that front are entered into the next population. The crowding distance value of a solution provides an estimate of the density of solutions surrounding that solution (Raquel & Naval ). In this research, NSGA-II is used for the evaluation of the objective functions that allow the aptitude of the operation rules to be known.

Through this algorithm, the descendant population Qt (size N) is created using the parent population Pt (size N). Both populations are combined to form Rt with a size of 2N. By means of non-dominated sorting, the population Rt is classified in different Pareto fronts. Although this process requires more effort, it is necessary because dominance is tested between the parent and descendant populations. Once the sorting process is complete, the new population is generated from the configurations of the non-dominated Pareto fronts. This new population is first built with the best non-dominated Pareto front (F1). The process continues with the solutions from the second front (F2), the third front (F3) and so on. Because the population Rt has a size of 2N and there are only N configurations that form the descendant population, not all of the front configurations belonging to the Rt population will be placed in the new population. Those fronts that cannot be placed are ignored.

When the last front is under consideration, the solutions that belong to this front can exceed the number of places remaining in the descendant population (Figure 5). In this case, it is useful to apply strategies that select configurations lying in sparsely populated areas, far away from the other solutions. This will fill up the rest of the positions of the descendant population instead of choosing configurations randomly.

Figure 5 | Schematic diagram of the mechanism for promoting individuals of NSGA-II.

These strategies are irrelevant for the first generational cycles of the algorithm because there are many fronts that persist to the next generation. However, as the process moves forward, several configurations become part of the first front and this front may have more than N genes or individuals. It is therefore important that the non-rejected configurations are chosen through a methodology that guarantees diversity. When the population as a whole converges to the Pareto front, the algorithm ensures that the solutions are separated from each other.

Initially, a parent population P0 is created in the NSGA-II algorithm (randomly or by an initialization technique). The population is sorted according to the non-dominance of the different levels (sorting of Pareto fronts F1, F2, …). For each solution, a fitness value is assigned according to its dominance level (1 for the best level), which decreases throughout the process. Selection by tournament (using a



crowding tournament operator), crossing and mutation are used to create the population of descendants Q0 with a size N. The main phases followed by NSGA-II are:

1. Combine parents and descendants to create $R_t = P_t \cup Q_t$. Perform the non-dominated sorting of $R_t$ and identify the fronts $F_i$, $i = 1, 2, \ldots$
2. Set $P_{t+1} = \emptyset$ and $i = 1$. While $|P_{t+1}| + |F_i| < N$, set $P_{t+1} = P_{t+1} \cup F_i$ and $i = i + 1$.
3. Sort by crowding ($F_i$, $<_c$, described below) and include in $P_{t+1}$ the $N - |P_{t+1}|$ most widespread solutions, using the crowding distance values associated with the front $F_i$.
4. Create the descendant population $Q_{t+1}$ from $P_{t+1}$ using selection by crowding tournament, crossing and mutation.

Coupling of methods: the multi-objective optimization model

NSGA-II is used to define and test RC for the water allocation model developed with SIMGES. Each individual is composed of 13 values representing the value of the RC in each month of the year (12) and the restriction coefficient (1). The indicator of this RC is the storage of the Sichar and Arenós reservoirs. This RC is imposed in the water allocation model, and a run is performed. The results of this run are used to estimate the objective functions, defined by Equations (2)–(4). The values of these objective functions are passed to the multi-objective algorithm to define the aptitude of the RC proposed.

NSGA-II (Deb et al. ) is used to examine the SIMGES model and inspect it for inconsistencies or errors and to generate optimal trade-offs between conflicting objectives considering alternative management scenarios simultaneously. Consistency checks can help provide some confidence in the representation of the decision-maker’s preferences. In checking for consistency, it is important to detect errors in the decision-making utility function. For utility functions implying a complex preference structure, there is a greater need and opportunity for meaningful consistency checks (Castelletti & Soncini-Sessa ). Attempting to achieve the multiple goals simultaneously requires identifying a compromise in the Pareto optimality. EMO algorithms employ a population-based search to find many Pareto efficient solutions in a single run. Once the probability of all the linked nodes has been updated by compiling the SIMGES model, the objective function values are returned to the optimization tool and the process is repeated. Consistency is critical to be able to identify a preferred alternative with confidence. In the proposed method, the first step in the consistency check occurs after the evolutionary algorithm has generated a set of non-dominated policy or management options. Usually, solutions generated by the evolutionary algorithm are a good indicator of shortcomings of the network flow model structure. For example, if changes in a node should have an effect on the utility function and this has been ignored intentionally or unintentionally in the SIMGES model, the results generated by EMO will exploit this weakness in the flow network and generate solutions that should have corresponded to higher utility function values.

The objective functions of the problem take into account the maximum deficit of the demands as well as the resilience of the water system. For that, three objective functions are proposed. The problem can be mathematically expressed as follows. Given three objective functions:

$$x = f(\beta) \tag{2}$$

$$y = g(\beta) \tag{3}$$

$$z = h(\beta) \tag{4}$$

where $x$ is the maximum annual deficit for agricultural demands (MaxDef1Year) (minimized); $y$ is the maximum deficit over 10 consecutive years for agricultural demands (MaxDef10Years) (minimized); and $z$ is the number of years of pumping (minimized). These objective functions are optimized by coupling the NSGA-II algorithm and SIMGES. The results of each optimization run are computed from the outcome of the SIMGES model. For this reason, these three functions are restricted by the solutions of Equation (1). On the other hand, $\beta$ represents a combination of $n$ non-ranked and non-weighted management options, which are the decision nodes of the SIMGES model and represent the RC. They also represent the genes of the chromosome of the algorithm:

$$\beta = (g_0, g_2, \ldots, g_n)$$

(5)
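Because every chromosome β is reduced to three minimized objectives (x, y, z), the survivor-selection phases 1–3 of NSGA-II listed earlier can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the implementation used in the study: the function names are hypothetical, and the non-dominated sorting uses a simple repeated-filtering approach rather than the faster bookkeeping of the original NSGA-II paper.

```python
from typing import Dict, List, Sequence

Objectives = Sequence[float]  # assumption: every objective is minimized


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in all objectives and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(objs: List[Objectives]) -> List[List[int]]:
    """Phase 1: split the combined population R_t into fronts F_1, F_2, ...

    Returns lists of indices into objs. Simple repeated filtering, chosen
    for clarity rather than speed.
    """
    remaining = set(range(len(objs)))
    fronts: List[List[int]] = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)]
        fronts.append(sorted(front))
        remaining -= set(front)
    return fronts


def crowding_distance(objs: List[Objectives], front: List[int]) -> Dict[int, float]:
    """Crowding distance of each member of one front (larger = more isolated)."""
    dist = {i: 0.0 for i in front}
    for m in range(len(objs[front[0]])):
        ordered = sorted(front, key=lambda i: objs[i][m])
        lo, hi = objs[ordered[0]][m], objs[ordered[-1]][m]
        # Boundary solutions are always kept.
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if hi == lo:
            continue  # objective is constant on this front
        for k in range(1, len(ordered) - 1):
            gap = objs[ordered[k + 1]][m] - objs[ordered[k - 1]][m]
            dist[ordered[k]] += gap / (hi - lo)
    return dist


def select_survivors(objs: List[Objectives], n: int) -> List[int]:
    """Phases 2 and 3: fill P_{t+1} front by front, then by crowding distance."""
    survivors: List[int] = []
    for front in non_dominated_sort(objs):
        if len(survivors) + len(front) <= n:
            survivors.extend(front)  # the whole front fits
        else:
            dist = crowding_distance(objs, front)
            ranked = sorted(front, key=lambda i: dist[i], reverse=True)
            survivors.extend(ranked[: n - len(survivors)])  # most widespread first
            break
    return survivors
```

Called with a list of (MaxDef1Year, MaxDef10Years, years pumped) tuples for the combined parent and descendant populations, `select_survivors(objs, n)` returns the indices of the n individuals that would form the next parent population. The quadratic sorting is adequate for a population of a few hundred individuals.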


These n input variables representing RC denote the set of feasible parameters over which the model produces a realistic output. There are therefore j optimized solutions placed at the Pareto front expressed as j combinations of the different operation rules belonging to each input variable:

$\beta_a = x_a, y_a, z_a$
$\beta_b = x_b, y_b, z_b$
$\ldots$
$\beta_j = x_j, y_j, z_j$ (6)

Each β (the RC) is represented by the volume threshold in each month of the OR and the restriction coefficient corresponding to each of them. These variables (volume threshold and restriction coefficient) are discretized at certain intervals. The volume level is between a minimum (5 Mm³) and a maximum (87 Mm³) value depending on the associated reservoirs (Sichar and Arenós) and the restriction coefficient varies between 0 and 1 or, in other words, between not applying and applying a total restriction (100%).

Two constraints related to the deficit objective functions were defined:

MaxDef1Year < 50% (7)

MaxDef10Years < 100% (8)

Figure 6 | Schematic model coupling.

Each evaluation of the objective functions requires the simulation model be run under this operation rule. To do this, the process is as follows (Figure 6). First, the parameters of the EMO and the minimum and maximum thresholds of


the decision variables are defined in a Master Application that is responsible for controlling the whole process. After this, the Master Application runs NSGA-II, which defines the first individual (set of decision variables), and with these variables the data files needed for SIMGES are created. SIMGES is run, and the Master Application imports the results and calculates the deficits and the aquifer pumping. This allows the OFs to be evaluated, and their values are returned to the optimization model to create the next individual.

Regarding EMO, the initial population for the optimization was 200 with a crossover probability of 0.9, a single-point binary crossover, a bitwise mutation probability of 0.005 and a seed for a random number generator of 0.123457. This setting was the most suitable for handling the problem after developing a detailed test with different configurations.

RESULTS AND DISCUSSION

The results drawn from this analysis are shown in the different figures representing, on the one hand, the Pareto front that links the different objective functions and, on the other hand, the operation rule parameters that are the decision variables of the algorithm. The results presented here correspond to different tests conducted with the NSGA-II algorithm for the different OR proposed.

Two hundred points are represented in Figures 7–10. Each of these points represents the result of applying SIMGES for each combination of parameters obtained using NSGA-II to define the RC mentioned at the beginning of the previous section on ‘Coupling of methods’. Each of these 200 points represents an optimized solution for the OFs defined in Equations (2)–(4), RCs parameters or interesting variables

(maximum pumping of the mixed irrigation), drawn from the last population found by the NSGA-II algorithm; this is how RCs are obtained. To relate the solutions of one figure to the rest of the figures, a colour scale gradient has been fixed and solutions are sorted according to the maximum annual deficit of the agricultural demands (abscissa of Figure 7). Table 2 lists the colour/symbol coding adopted in Figures 7–11. For example, the points (in any figure) with colours between orange and yellow are related to OR that provide a maximum annual deficit of the agricultural demands between 5 and 10%.

Figure 7 | Pareto front 1; maximum deficits for the agricultural demands (for colour/symbol coding, see Table 2).

Figure 8 | Pareto front 2; number of years pumped versus deficit of the agricultural demands (for colour/symbol coding, see Table 2).

Figure 9 | Maximum pumping of 1 and 10 years for the mixed irrigation (for colour/symbol coding, see Table 2).

Figure 10 | Restriction coefficient (for colour/symbol coding, see Table 2).

Table 2 | Colour/symbol coding adopted in Figures 7–11

Colour/Symbol            Maximum annual deficit of the agricultural demands (%)
Red/▴–orange/▪           0–5
Orange/▪–yellow/▪        5–10
Yellow/▪–green/×         10–20
Green/×–cyan/♦           20–25
Blue/Ж–purple/           25–30
Purple/ –pink/           30–35
Pink/ –dark red/+        35–37


Figure 11 | Curves at the volume level, parameter of the operation rule (for colour/symbol coding, see Table 2).

Figure 7 shows the Pareto front corresponding to the short term (1 year) and the long term (10 years) of the deficit of agricultural demands. Notice that an inferior front can be distinguished by the points dispersed over this line. This OR represents a great variety of possible solutions with deficits ranging from 0 to 36% for the short-term deficit and up to 100% of the long-term deficit (in percentage over the annual demand).

The growing trend of this figure is due to the conditions of the basin: (1) storages in the reservoirs and (2) the fact that agricultural uses with more demand can be supplied with groundwater. The set of demands that can receive groundwater can therefore achieve a state of no deficit. From this situation, and as shown in the figure, increasing the annual deficit implies that the accumulated 10-year deficit also grows. The optimal solution is not that with zero deficits, and the number of years pumped must also be taken into account.

Figure 8 shows the number of years pumped, sorted according to the annual deficit of the agricultural demands. Note that this figure indicates the number of years of pumping required to achieve specific deficits. This parameter (the number of years) was taken into account because pumping has an associated cost which decreases as the pumping time is reduced. It is possible to distinguish three zones in the figure. The first one covers approximately 0–20% of the annual deficit, in which pumping is required in more than 55 years. For lower deficits it is therefore necessary to pump in up to 68 years, which is the number of years simulated. The second zone is associated with annual deficits of 20–28% with 10 years of accumulated deficits ranging between 20 and 100%, hence the high values of these deficits. By not restricting the demand, constant pumping is not necessary and the number of years pumped decreases to 35. This lower value of years pumped means that no matter which operating rules apply, it is always necessary to pump in at least 35 of the 68 years of the simulation because the surface water is not enough to supply the whole water demand of the basin. Finally, the third zone corresponds to a large number of years pumped, but this time the zone is associated with high values of the maximum deficits; this set of solutions is not appropriate due to the high deficits of the demands and the number of pumped years.

In addition to the number of years pumped, it is important to represent the maximum annual pumping of the mixed irrigation against the maximum long-term pumping of the same demand (Figure 9). The figure shows a scatter cloud of points and much more restrictive intervals of variation of the pumping than for the deficits of the agricultural demands. The annual pumping is 87–100% of the maximum annual pumping, and the 10-year pumping is 67–90% of the accumulated 10 years of pumping. Very high values for both indicators imply that water is scarce and requires high pumping for agricultural areas of channel 100, channel 220 and María Cristina (mixed irrigation) in order to avoid deficits.

Looking at the colour distribution discussed above, it can be seen that the first stage (0–15% of the annual deficits)


of the Pareto front for the deficits is associated with a change of 10 years of pumping between 90 and 84%. The rest of the Pareto front (15–35% of the annual deficits) corresponds to a variation of annual pumping between 100 and 87%. There is therefore an area that varies as a function of 10 years of pumping and another area that depends on the annual pumping.

Figure 10 represents the coefficient of restriction, which is both an OR parameter and a decision variable of the algorithm, as a function of the maximum deficit of the agrarian demands. The obtained restriction is around 100%, more specifically 92–100%, although the largest set of solutions is 96–100%. The figure reveals that a very high restriction has to be applied regardless of the results obtained. However, the restriction also influences the volume level (the other parameter of the operating rules) needed for these results to be obtained.

In Figures 7 and 8, the 200 points shown in each figure (last population found by NSGA-II) represent two Pareto fronts, the first between the short term (1 year) and the long term (10 years) of the deficit of agricultural demands and the second between the short-term deficit and the number of pumping years.

As mentioned above, there are 200 results that provide different combinations of objective functions. These results translate into 200 RCs obtained by the NSGA-II algorithm. Each of these RCs is a curve defined with 13 values, 12 corresponding to the months of a year and another to the coefficient of restriction. Because representing and analyzing 200 curves is not feasible, and given that some curves are not applicable to real management scenarios because of the complexity and variability of their definitions, four curves have been selected to represent various parts of the Pareto front (Figure 11).

Curve A (filled square/orange in online version) corresponds to solutions close to the origin of Pareto front 1 (Figure 7), i.e. maximum annual and 10-year deficits of about 5% of the agricultural demands. This implies, as already explained, a large number of years pumped (Figure 8) and high pumping values (Figure 9). To achieve these results, the OR are defined with fairly high levels (compared to the other three curves). When the sum of the volumes of the Arenós and Sichar reservoirs is below those levels (which indicates that, under the RC, the mixed irrigation is not supplied with surface water), only traditional irrigation has to be taken into account and mixed irrigation has to pump water. Curve B (diamond/cyan in online version) is associated with annual maximum deficits of 20–25% of the agrarian demands and 30–100% in the case of the maximum deficits of 10 years. This curve is defined with a level lower than curve A, allowing more surface water to supply the mixed irrigation and, therefore, somewhat less pumping. Curve C (Cyrillic symbol Ж/dark blue in online version) is similar to B, differing mainly in the first months of the hydrological year, i.e. November to January. Those months can be seen as the curve B reserve supplying more surface water for traditional irrigation. However, curve C allows more surface water to supply mixed irrigation; for this reason, the deficit of the traditional irrigation (and therefore of the agrarian demands) increases. Finally, curve D (filled circle/purple in online version) corresponds to maximum annual deficits of 25–30% of the agricultural demands and 30–100% in the case of the maximum deficits of 10 years. This RC is defined with low levels and is associated with a very small reserve for traditional irrigation, causing high deficits of traditional demands.

Table 3 | Results of deficits and pumping without OR and with the RC Agreement 70

                                                           Without OR (%)   RC Agreement 70 (%)
Traditional irrigated area   Maximum deficit of 1 year     37.12            23.35
                             Maximum deficit of 10 years   217.28           55.77
Mixed irrigated areas        Maximum pumping of 1 year     87.4             97.85
                             Maximum pumping of 10 years   59.64            73.3

Table 3 shows the results (deficit and pumping) of the water system without OR and with RC ‘Agreement 70’. The results without OR are not in the solutions of the NSGA-II algorithm because they have a maximum deficit of 10 years of the traditionally irrigated area that is larger than the limit established in official studies developed by the Jucar Basin Authority. The results with ‘Agreement 70’ RC follow a similar behaviour to the points marked ‘×’ (green in online version) of the figures, and this curve


(Figure 3) corresponds to curve C (Figure 11) but with a slightly lower level.

CONCLUSIONS

This paper demonstrates the optimization of operating rules based on the coupling of an EMO with a flow network model. This approach allows a set of rule curves of a reservoir to be obtained for allocating water among demands during droughts. The EMO used was the NSGA-II algorithm. The simulation model was developed with the program SIMGES of the decision support system shell AQUATOOL based on network flow algorithms. The problem that arises is how to reduce the highest annual deficits and the maximum long-term deficits while taking into account the cost of additional pumping. The optimization decision variables are the trigger volume for applying the OR and the restriction coefficient. The coupling methodology is based on the evaluation of the objective function, which represents a run of the simulation model for watershed management to estimate the demand deficits and pumping. This methodology has been applied to the Mijares River basin, a system that is characterized by severe droughts, a well-established system of rights between users and the possibility of the joint use of surface and groundwater resources. By applying this approach, different types of operating rules have been tested, providing similar results in terms of deficits and pumping. A multi-objective point of view allowed the short and long terms of the deficit and the pumping resource to be taken into account. Moreover, this implementation helps users or managers of the water system to determine the best or most convenient management for the river basin.

ACKNOWLEDGEMENTS

The authors wish to thank the Confederación Hidrográfica del Júcar (Spanish Ministry of the Environment) for the data provided in developing this study, the Comisión Interministerial de Ciencia y Tecnología (CICYT or Spanish Ministry of Science and Innovation) for funding the projects INTEGRAME (contract CGL2009-11798) and SCARCE (program Consolider-Ingenio 2010, project CSD2009-00065) and the Generalitat Valenciana for the Gerónimo Forteza grant (FPA/2012/006). The authors also thank the European Commission (Directorate-General for Research & Innovation) for funding the project DROUGHT-R&SPI (program FP7-ENV-2011, project 282769) and the Seventh Framework Programme of the European Commission for funding the project SIRIUS (FP7-SPACE-2010-1, project 262902).

REFERENCES

Abd-Elhamid, H. F. & Javadi, A. A.  A cost-effective method to control seawater intrusion in coastal aquifers. Water Resources Management 25, 2755–2780.
Akter, T. & Simonovic, S. P.  Modelling uncertainties in short-term reservoir operation using fuzzy sets and a genetic algorithm. Hydrological Sciences Journal 49 (6), 1079–1081.
Alcigeimes, B. C. & Billib, M.  Evaluation of stochastic reservoir operation optimization models. Advances in Water Resources 32, 1429–1443.
Andreu, J., Capilla, J. & Sanchís, E.  AQUATOOL: A generalized decision support-system for water-resources planning and operational management. Journal of Hydrology 177, 269–291.
Beckford, O., Chan Hilton, A. B. & Liu, X.  Development of an enhanced multi-objective robust genetic algorithm for groundwater remediation design. In: Proceedings of World Water and Environmental Resources Congress 2003 (P. A. Debarry, ed.). American Society of Civil Engineering, Reston, VA.
Bhaskar, N. R. & Whitlach Jr, E. E.  Deriving of monthly reservoir release policies. Water Resources Research 16 (6), 987–993.
Bower, B. T., Hufschmidt, M. M. & Reedy, W. H.  Operation procedures: Their role in the design and implementation of water resource systems by simulation analysis. In: Design of Water Resource Systems, Chapter 11 (A. Maass, M. M. Hufschmidt, R. Dorfman, H. A. Thomas Jr, S. A. Marglin & G. M. Fair, eds). Harvard University Press, Cambridge, Massachusetts, pp. 443–458.
Cai, X., McKinney, D. C. & Lasdon, L. S.  Solving nonlinear water management models using a combined genetic algorithm and linear programming approach. Advances in Water Resources 24, 667–676.
Castelletti, A. & Soncini-Sessa, R.  Bayesian Networks and participatory modelling in water resource management. Environmental Modelling & Software 22, 1075–1088.
Chan Hilton, A. B. & Culver, T. B.  Groundwater remediation design under uncertainty using a robust genetic algorithm. Journal of Water Resources Planning and Management 131 (1), 25–34.
Chang, L.-C. & Chang, F.-J.  Multi-objective evolutionary algorithm for operating parallel reservoir system. Journal of Hydrology 377, 12–20.
Chen, L., McPhee, J. & Yeh, W.-G.  A diversified multiobjective GA for optimizing reservoir rule curves. Advances in Water Resources 30, 1082–1093.
CHJ  Plan Hidrológico del Júcar. Confederación Hidrográfica del Júcar. Ministerio de Medio Ambiente, España.
Cieniawski, S. E., Eheart, J. W. & Ranjithan, S.  Using genetic algorithms to solve a multiobjective groundwater monitoring problem. Water Resources Research 31 (2), 399–409.
Cisty, M.  Hybrid genetic algorithm and linear programming method for least-cost design of water distribution systems. Water Resources Management 24, 1–24.
Clark, E. J.  Impounding reservoirs. Journal of American Water Works Association 48 (4), 349–354.
Coello-Coello, C. A., Lamont, G. B. & Van Veldhuizen, D. A.  Evolutionary Algorithms for Solving Multi-objective Problems. Springer, New York.
Croley, T. E. & Rao, K. N. R.  Multi-objective risks in reservoir operation. Water Resources Research 15 (4), 1807–1814.
Deb, K., Pratap, A., Agarwal, S. & Meyarivan, T.  A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6 (2), 182–197.
Dumedah, G., Berg, A. A., Wineberg, M. & Collier, R.  Selecting model parameter sets from a trade-off surface generated from the non-dominated sorting genetic algorithm-II. Water Resources Management 24, 4469–4489.
Farmani, R., Abadia, R. & Savic, D.  Optimum design and management of pressurised branched irrigation networks. Journal for Irrigation and Drainage 133 (6), 538–547.
Farmani, R., Henriksen, H. J. & Savic, D.  An evolutionary Bayesian belief network methodology for optimum management of groundwater contamination. Environmental Modelling & Software 24, 303–310.
Farmani, R., Savic, D. & Walters, G. A. a Evolutionary multiobjective optimisation in water distribution network design. Journal of Engineering Optimization 37 (2), 167–183.
Farmani, R., Walters, G. A. & Savic, D. b Trade-off between total cost and reliability for Anytown water distribution network. Journal of Water Resources Planning and Management 131 (3), 161–171.
Farmani, R., Walters, G. A. & Savic, D.  Evolutionary multiobjective optimization of the design and operation of water distribution network: total cost vs. reliability vs. water quality. Journal of Hydroinformatics 8 (3), 165–179.
Ford, C. R. & Fulkerson, D. R.  Flow in Networks. Princeton University Press, Princeton, NJ.
Gorev, N. B., Kodzhespirova, I. F., Kovalenko, Y., Álvarez, R., Prokhorov, E. & Ramos, A.  Evolutionary testing of hydraulic simulator functionality. Water Resources Management 25, 1935–1947.
Haghighi, A., Samani, H. M. V. & Samani, Z. M. V.  GA-ILP method for optimization of water distribution networks. Water Resources Management 25, 1791–1808.
Hakimi-Asiabar, M., Ghodsypour, S. H. & Kerachian, R.  Deriving operating policies for multi-objective reservoir systems: application of self-learning genetic algorithm. Applied Soft Computing 10 (4), 1151–1163.
Hanne, T. & Nickel, S.  A multiobjective evolutionary algorithm for scheduling and inspection planning in software development projects. European Journal of Operational Research 167 (3), 663–678.
Harik, G. R., Lobo, F. G. & Goldberg, D. E.  The Compact Genetic Algorithm (IlliGAL Report No. 97006). University of Illinois at Urbana–Champaign, Illinois Genetic Algorithms Laboratory, Urbana, IL.
Hassanzadeh, Y., Abdi, A., Talatahari, S. & Singh, V. P.  Metaheuristic algorithms for hydrologic frequency analysis. Water Resources Management 25, 1855–1879.
Hınçal, O., Altan-Sakarya, A. B. & Ger, A. M.  Optimization of multireservoir systems by genetic algorithm. Water Resources Management 25, 1465–1487.
Kim, T., Heo, J.-H. & Jeong, C.-S.  Multireservoir system optimization in the Han River basin using multi-objective genetic algorithms. Hydrological Processes 20, 2057–2075.
Kollat, J. B. & Reed, P. M.  Comparison of multi-objective evolutionary algorithms for long-term monitoring design. Advances in Water Resources 29 (6), 792–807.
Kourakos, G. & Mantoglou, A.  Simulation and multi-objective management of coastal aquifers in semiarid regions. Water Resources Management 25, 1063–1074.
Labadie, J.  Reservoir system optimization models. Water Resources Update, University Council on Water Resources 108 (Summer), 83–110.
Labadie, J. W.  Optimal operation of multireservoir systems: state-of-the-art review. Journal of Water Resources Planning and Management 130 (2), 93–111.
Laumanns, M. & Ocenasek, J.  Bayesian optimization algorithms for multi-objective optimization. In: Parallel Problem Solving from Nature (J. Guervós, P. Adamidis, H. Beyer, J. Martín & H. Schwefel, eds) PPSN VII, 7th International Conference, Granada, Spain, September 7–11. Springer, Berlin, Lecture Notes in Computer Science 2439, pp. 298–307.
Liang, Q., Johnson, L. E. & Yu, Y. S.  A comparison of two methods for multiobjective optimization for reservoir operation. Water Resources Bulletin 32 (2), 333–340.
Lin, J.-Y., Cheng, C.-T. & Lin, T.  A Pareto strength SCE-UA algorithm for reservoir optimization operation. Fourth International Conference on Natural Computation. IEEE Computer Society.
Louati, M. H., Benabdallah, S., Lebdi, F. & Milutin, D.  Application of a genetic algorithm for the optimization of a complex reservoir system in Tunisia. Water Resources Management 25, 2387–2404.
Loucks, D. P. & Sigvaldason, O. T.  Multiple reservoir operation in North America. In: The Operation of Multiple Reservoir Systems (Z. Kaczmarck & J. Kindler, eds) IIASA Collaborative Proceedings Series CP-82-53, pp. 1–103, Luxemburg.
Lund, J. & Ferreira, I.  Operating rule optimization for Missouri River reservoir system. Journal of Water Resources Planning and Management 122 (4), 287–295.
Lund, J. R. & Guzman, J.  Developing seasonal and long-term reservoir system operation plans using HEC-PRM. Technical Report No. RD-40. Hydrologic Engineering Center, US Army Corps of Engineers, Davis, California.
Makropoulos, C. K., Natsis, K., Liu, S., Mittas, K. & Butler, D.  Decision support for sustainable option selection in integrated urban water management. Environmental Modelling & Software 23, 1448–1460.
Malekmohammadi, B., Zahraie, B. & Kerachian, R.  Ranking solutions of multi-objective reservoir operation optimization model using multi-criteria decision analysis. Expert Systems with Applications 38 (6), 7851–7863.
Moeni, R., Afshar, A. & Afshar, M. H.  Fuzzy rule-based model for hydropower reservoirs operation. International Journal of Electrical Power & Energy Systems 33 (2), 171–178.
Mohan, S. & Sivakumar, S.  Development of multi-objective reservoir systems operation using DP-based neuro-fuzzy model: a case study in PAP systems. In Fourth INWEPF Steering Meeting and Symposium (INWEPF), India.
Molina, J. L., Farmani, R. & Bromley, J.  Aquifers management through evolutionary Bayesian networks: the Altiplano case study (SE Spain). Water Resources Management 25, 3883–3909.
Molina-Cristobal, A., Griffin, I. A., Fleming, P. J. & Owens, D. H.  Multiobjective controller design: optimising controller structure with genetic algorithms. In Proceedings of the 2005 IFAC World Congress on Automatic Control, Prague, Czech Republic.
Murugan, P., Kannana, S. & Baskarb, S.  NSGA-II algorithm for multi-objective generation expansion planning problem. Electric Power Systems Research 79, 622–628.
Nazif, S., Karamouz, M., Tabesh, M. & Moridi, A.  Pressure management model for urban water distribution networks. Water Resources Management 24, 437–458.
Oliviera, R. & Loucks, D. P.  Operating rules for multireservoir systems. Water Resources Research 33 (4), 839–852.
Osman, M. S., Abo-Sinna, M. A. & Mousa, A. A.  An effective genetic algorithm approach Multiobjective Resource Allocation Problems (MORAPs). Applied Mathematics & Computing 163 (2), 755–768.
Raquel, C. R. & Naval, P. C.  An effective use of crowding distance in multiobjective particle swarm optimization. Proceedings of the Conference on Genetic and Evolutionary Computation, June 25–29, ACM, New York, USA, pp. 257–264.
Reddy, M. J. & Kumar, D.  Optimal Reservoir Operation Using Multi-Objective Evolutionary Algorithm. Department of Civil Engineering, Indian Institute of Science, Bangalore, India.
Reddy, M. J. & Kumar, D.  Multiobjective differential evolution with application to reservoir system optimization. Journal of Computing in Civil Engineering 21 (2), 136–146.
Reed, P. & Minsker, B. S.  Striking the balance: long-term groundwater monitoring design for conflicting objectives. Journal of Water Resources Planning and Management 130 (2), 140–149.
Revelle, C., Joeres, E. & Kirby, W.  The linear decision rule in reservoir management and design. I. Development of the stochastic model. Water Resources Research 5 (4), 767–777.
Safavi, H. R., Darzi, F. & Mariño, M. A.  Simulation-optimization modelling of conjunctive use of surface water and groundwater. Water Resources Management 24, 1965–1988.
Sedki, A. & Ouazar, D.  Simulation-optimization modeling for sustainable groundwater development: a Moroccan coastal aquifer case study. Water Resources Management 25 (11), 2855–2875.
Singh, A. & Minsker, B. S.  Uncertainty-based multiobjective optimization of groundwater remediation design. Water Resources Research 44, W02404.
Suen, J. P. & Eheart, J. W.  Reservoir management to balance ecosystem and human needs: incorporating the paradigm of the ecological flow regime. Water Resources Research 42 (3), pW03417, 2006.
Wang, Y. C., Yoshitani, J. & Fukami, K.  Stochastic multiobjective optimization of reservoirs in parallel. Hydrological Processes 19, 3551–3567.
Wurbs, R. A.  Reservoir-system simulation and optimization models. Journal of Water Resources Planning and Management 119 (4), 455–472.
Yeh, W. W.-G.  Reservoir management and operations models: a state-of-the-art review. Water Resources Research 21 (12), 1797–1818.
Yeh, W. W.-G. & Becker, L.  Multiobjective analysis of multireservoir operations. Water Resources Research 18 (5), 1326–1336.
Young, G.  Finding reservoir operating rules. Journal of the Hydraulics Division, Proceedings of the American Society of Civil Engineering 93 (6), 297–321.

First received 24 September 2012; accepted in revised form 6 May 2013. Available online 25 June 2013


© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

Generation of the bathymetry of a eutrophic shallow lake using WorldView-2 imagery

Onur Yuzugullu and Aysegul Aksoy

ABSTRACT

In this study, water depth distribution (bathymetric map) in a eutrophic shallow lake was determined using a WorldView-2 multispectral satellite image. Lake Eymir in Ankara (Turkey) was the study site. In order to generate the bathymetric map of the lake, image and data processing, and modelling were applied. First, the bands that would be used in depth prediction models were determined through statistical and multicollinearity analyses. Then, data screening was performed based on the standard deviation of standardized residuals (SD_SR) of depth values determined through preliminary linear regression models. This analysis indicated the sampling points utilized in depth modelling. Finally, linear and non-linear regression models were developed to predict the depths in Lake Eymir based on remotely sensed data. The non-linear regression model performed slightly better compared to the linear one in predicting the depths in Lake Eymir. Coefficients of determination (R²) up to 0.90 were achieved. In general, the bathymetric map was in agreement with observations except at resuspension areas. Yet, regression models were successful in defining the shallow depths at shore, as well as at the inlet and outlet of the lake. Moreover, deeper locations were successfully identified.

Key words | bathymetry, remote sensing, shallow lake, WorldView-2

Onur Yuzugullu
Aysegul Aksoy (corresponding author)
Department of Environmental Engineering, Middle East Technical University, 06800 Ankara, Turkey
E-mail: aaksoy@metu.edu.tr

doi: 10.2166/hydro.2013.133

INTRODUCTION

Water depth is important for several physical and biological processes in a lake (Leira & Cantonati ). Together with water volume, water depth impacts natural assimilation capacity, pollution dilution factor, water temperature and retention time. Light penetration and growth of algal species (especially attached algae) may depend on the depth of water. Furthermore, water depth influences mixing of water layers, sedimentation of solids and re-suspension of bottom sediments. Therefore, obtaining the spatial distribution of water depths or bathymetric information may be critical in assessing the impact of pollutants on lake water quality.

Sonar/radar systems have been frequently used in derivation of the bathymetric maps of lakes (Tureli & Norman ; Morgan et al. ). However, these systems may require extensive fieldwork and financial means, especially for large water bodies. In order to ease these difficulties, remotely sensed images (hyperspectral, multispectral, aerial and radar) can be used in obtaining bathymetric information. In the literature, such applications have mainly focussed on coastal waters and estuaries (Philpot ; Greidanus et al. ; Robbins ; Hennings ; Sandidge & Holyer ; Roberts ; Calkoen et al. ; Lafon et al. ; Dierssen et al. ; Stumpf et al. ; Jordan & Fonstad ; Mobley et al. ; Lyzenga et al. ; McIntyre et al. ; Bachmann et al. , ; Kao et al. ; Lee ; Marchisio et al. ). However, the use of multispectral images in determination of the bathymetry of lakes is not common. This is due to the fact that complex relationships dominate between radiance and lake characteristics in lakes compared to Case 1 waters, as well as coastal waters and estuaries. Complexities arise from high chlorophyll-a concentrations, suspended solids, organic matters and bottom reflection in most lakes. The existence of few sensors on most multispectral satellites may result in insufficiency in distinguishing between the impacts of these factors


on radiance values. However, the launch of new satellites, such as WorldView-2, may provide new means and additional sensors that may aid in depth determination in lakes as well.

The WorldView-2 satellite was launched in the fourth quarter of 2009. It has eight spectral bands covering the electromagnetic spectrum range of 400–1,040 nm (Table 1) (Digital Globe ). With its 0.46 m panchromatic and 1.84 m multispectral resolution, studies which require high spatial resolution can be conducted (Lee et al. ). The satellite has a radiometric resolution of 11-bits and a temporal resolution of 3.7 days at 20° or less. It allows images to be captured in an area of 65.6 km × 110 km at the nadir. Its coastal blue band that senses the 400–450 nm of the spectrum is characterized by its relatively shorter wavelength and higher energy. It can penetrate to deeper parts of water bodies. It has been reported that depths down to 30 m can be identified by coastal blue and blue bands (Digital Globe ).

WorldView-2 imagery has been used in recent studies for bathymetry determinations in coastal waters and estuaries (Glass et al. ; Marchisio et al. ; Lee et al. ; McCarthy et al. ; Parthish et al. ). Lee et al. () reported that green and yellow bands were more effective in depth determination in the range of 2.5–20 m in coastal waters. Marchisio et al. () showed the efficacy of coastal blue and blue bands in revealing depths up to 7 m. To our knowledge, there is no study on bathymetry generation for shallow eutrophic lakes using the WorldView-2 imagery.

In this study, a WorldView-2 image was used to determine the bathymetry and, therefore, the spatial distribution of water depths in eutrophic Lake Eymir in Ankara, Turkey. The relationships between measured depths and radiance values at different bands were investigated. Linear and non-linear regression models were developed to derive the spatial distribution of depths. The depths obtained with these models were compared to the actual bathymetric map of the lake.

MATERIALS AND METHODS

Study area

Lake Eymir is a shallow natural lake located at 39.28° N and 32.30° E. It is located 20 km south of Ankara (Figure 1) at an altitude of 969 m (Beklioglu et al. ). The surface area of the lake is around 1.25 km². It has a shoreline of 11 km and a catchment area of 971 km². In 1990, the area surrounding the lake and 245 km² of its catchment area was declared as a ‘Special Environmental Protection Area’ by Decree of the Cabinet of Ministers due to its ecological significance (Yuzugullu ).

The average water depth in the lake changes depending on the balance between inflows and outflow. Lake Eymir is mainly fed by Lake Mogan in the south (98% of the total inflow), the Kislakci Stream in the east and groundwater sources. The excess water of the lake drains into Imrahor Creek in the east (Yenilmez et al. ). The average water depth is 4 m. Annual water level fluctuations in the lake vary by 0.5–1.0 m, depending on the net inflow and evaporation (Yagbasan & Yazicigil ). In April 2011, the average depth in the lake reached 4.5 m (Yuzugullu ).

The lake has been suffering from the effects of eutrophication. It has been turbid and rich in algal species for a long time. In studies performed in different time periods, eutrophic conditions were reported (Diker ; Tan ; Ozen ). It was shown that water balance, and therefore the depth of water, had a significant impact on the water quality of the lake (Beklioglu et al. ). Therefore, bathymetry and water depths provide substantial information in the assessment of water quality changes in the lake.

Table 1 | Spectral bands of WorldView-2 sensors

Band                     Wavelength range (nm)
Coastal Blue (Band 1)    400–450
Blue (Band 2)            450–510
Green (Band 3)           510–580
Yellow (Band 4)          585–625
Red (Band 5)             630–690
Red Edge (Band 6)        705–745
NIR-1 (Band 7)           770–895
NIR-2 (Band 8)           860–1,040
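The band limits in Table 1 can be encoded as a small lookup, which makes it easy to check which band or bands sense a given wavelength. The following is a minimal Python sketch; the names `WV2_BANDS` and `bands_covering` are illustrative and not part of any WorldView-2 software, and treating the band limits as inclusive is an assumption.

```python
# Band limits (nm) transcribed from Table 1. The dictionary and function
# names are hypothetical, chosen only for this illustration.
WV2_BANDS = {
    1: ("Coastal Blue", 400, 450),
    2: ("Blue", 450, 510),
    3: ("Green", 510, 580),
    4: ("Yellow", 585, 625),
    5: ("Red", 630, 690),
    6: ("Red Edge", 705, 745),
    7: ("NIR-1", 770, 895),
    8: ("NIR-2", 860, 1040),
}


def bands_covering(wavelength_nm):
    """Return the numbers of all bands whose range includes the wavelength.

    Band limits are treated as inclusive (an assumption).
    """
    return [
        number
        for number, (_name, low, high) in WV2_BANDS.items()
        if low <= wavelength_nm <= high
    ]


print(bands_covering(420))  # coastal blue only
print(bands_covering(870))  # NIR-1 and NIR-2 overlap between 860 and 895 nm
print(bands_covering(700))  # falls in the gap between Red and Red Edge
```

Such a lookup shows, for example, that the two near-infrared bands overlap while a gap separates the red and red-edge bands.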

METHODOLOGY

The methodology followed in the development of depth (or bathymetry) models based on remotely sensed data is depicted in Figure 2. Following the acquisition of the image on 28 July 2010, field work was carried out on 2 August 2010. Depth measurements were conducted at 59 points (depicted in Figure 1). In the time gap between the image and field work dates, there was no precipitation or significant change in temperature or other conditions which would alter water depths. As a result, it was assumed that the depths and water quality parameters were representative of the conditions on the date the image was taken. Sampling locations for ground truth data were selected arbitrarily to cover the whole lake area. The geographical coordinates of the points were determined using a Garmin GPS receiver with ±1.5 m positional accuracy on average.

Figure 1 | Locations of Lake Eymir and sampling points.

Figure 2 | Flowchart of the methodology.

Image processing was conducted using ENVI 4.7. The image had geographic projection and ED 50 Datum. The lake area was cropped and isolated. Therefore, only the lake area was taken as the region of interest. The dark pixel subtraction method was used in order to eliminate atmospheric effects in the ortho-rectified satellite image (Chavez ). In order to obtain the radiance values, first, image histograms were generated for the corresponding spectral bands. Then, zero values in the histograms were removed. Finally, the minimum and the maximum values in the histograms were determined to aid in the conversion



of digital numbers to radiance values using the method provided by Beisl et al. (). Histogram values were matched to digital numbers in the range of 0–255. By generating band-specific linear equations, digital numbers were converted to radiance values.

In order to test the suitability of data in regression model development and to improve the model prediction performance, a screening procedure was applied to select the independent variables of depth models. The initial stage of the procedure was to remove the sampling points at locations with high turbidity. This was applied to minimize the negative impact of re-suspended sediments in depth determination. Since Lake Eymir is a shallow lake, local re-suspension can occur due to various factors such as groundwater inflow, wind effect, turbulence due to velocity variations as a result of cross-sectional and flow direction changes. As depicted in Figure 1, the shape of Lake Eymir makes it prone to these impacts. On the sampling date, the lake was mostly clear with an average total suspended solids concentration of 1.92 mg/L and an average chlorophyll-a concentration of 3.49 μg/L, respectively (Yuzugullu ). However, at some locations turbidity was observed due to re-suspension of bottom sediments. These locations were identified on the image by locating the zones that exhibit high radiance due to suspended solids. The sampling points were placed over the image and the ones that were over the re-suspension areas were removed from the data set. As a result, 11 sampling points were removed from further analysis. These points can be seen in Figure 1 (one is hidden due to overlap).

The water depth (the dependent variable of the models) and the radiances at eight spectral bands (the independent variables of the models) were analysed for validity of normality. For this purpose, Q–Q plots were prepared assuming normal, log-normal and exponential distributions. These plots were used to determine the form (as is, logarithmic transformation, or exponential transformation) of the independent variables that would be used in regression model development. The distribution type of a variable was selected based on the slope information in the corresponding Q–Q plot. If the slope was close to 1, the corresponding distribution type was selected for the given variable. Following the determination of independent variable distribution forms, multicollinearity analysis was performed in an iterative procedure to identify highly correlated independent variables. At this stage, the correlation matrix was used to remove an independent variable that had the highest correlation with another. Then, a new correlation matrix was generated for the remaining variables. This cycle was repeated until multicollinearity was eliminated between independent variables. The correlation coefficient (r) was used as the criterion for variable elimination. It was assumed that multicollinearity existed between variables if the absolute r value was greater than 0.6. The variables remaining after the multicollinearity analysis were considered in regression model development in prediction of the water depths or bathymetry of Lake Eymir.

Performances in bathymetry determinations using multispectral images are variable for Case 1 and Case 2 waters. In Case 1 waters (i.e., open ocean) chlorophyll is the main optically active constituent. These waters generally lack suspended particles. On the other hand, there is a complex relationship between reflectance and water quality parameters in Case 2 waters (i.e., coastal, estuary or inland waters such as lakes). This complexity is mainly due to the co-presence of chlorophyll, suspended particles and coloured dissolved organic matter in high concentrations (Kishino et al. ; Sudheer et al. ). Since depth determination in Case 2 waters or turbid lakes using multispectral images can be problematic, data elimination may be required to improve the prediction capability of the bathymetry models. Stevens () showed that regression model prediction performance can be improved by eliminating outlier data based on standardized residuals (SR) of a regression model. In this approach, first a regression model is developed using the data set. Then, outlier observation points are determined and a new model is developed using the remaining observation points. In this study, a similar approach was used to eliminate the outlier observations. First, an initial regression model was developed. Then, the standard deviation of SR (SD_SR) was calculated. At this stage, different multipliers (n) of SD_SR were evaluated (n = 1.5, 1.4,…, 0.5). Then, observation data with an SR greater than n × SD_SR were eliminated. For each case (n × SD_SR), a linear regression model was created. Then, these models were assessed based on basic statistics (minimum, maximum, mean and standard deviation) for the dependent variable (depth), and F-test



for predictions. The model with a small F-value and basic statistics similar to the original observation data (48 observation points) was chosen as the best model. For the observation data for Lake Eymir, n = 0.7 resulted in the best filter in establishment of the data set for model development. However, this filter (0.7 × SD_SR) resulted in elimination of 23 additional sampling points. As a result, bathymetry model developments were realized using the 25 sampling points depicted in Figure 1, which corresponded to a sampling density of 20 samples per square kilometre of the lake. Thirty-two per cent of these points (eight sampling points) were used in the model development stage. The remaining 68% (17) were employed for model validation. Allocation of the locations of the sampling points for model development and model validation was performed arbitrarily while care was taken to have as even a spatial distribution as possible.

Following data screening, linear and non-linear regression models were developed to predict the bathymetry of the lake. The general forms of the linear and non-linear regression models are given in Equations (1) and (2), respectively:

\[ d_i = a + \sum_{j=1}^{J} k_j x_{ij} \tag{1} \]

\[ d_i = \sum_{j=1}^{J} k_j x_{ij}^{m_j} \tag{2} \]

where, d_i is the water depth at location i, a is the intercept, k_j is the regression coefficient for band j, x_ij is the radiance at location i at band j, and m_j is the exponent for band j. The a represents the offset for the depth of 0 m (Loomis ) for the linear regression model. This parameter is used to handle the average error that would be produced by over- and under-predictions at different depths as a result of the impact of heterogeneous bottom cover (macrophytes, sand, gravel, etc.) and variable water quality (suspended solids, chlorophyll, etc.) on reflectance values (Loomis ) for the linear model. In the above equations, a, k_j and m_j values are set by XLStat software by minimizing the root mean square error (RMSE) and maximizing the Pearson coefficient of determination (R²) values between observed and predicted water depths at given locations.

RESULTS AND DISCUSSION

The Q–Q plots indicated that the radiance data in most of the spectral bands had normal distributions, except in Band 5 and Band 2. In these bands, log-normal distributions prevailed. Based on this information, the log transformations (base 10) of the data in Bands 5 and 2 were used in multicollinearity and correlation analysis. The correlation matrix indicated that Band 1 (coastal blue) was highly correlated with Bands 2, 3 and 4 (r > 0.75). Moreover, at 95% confidence level, r values for the relationships between Band 1 and Band 6, and Band 1 and Band 7 were higher than 0.6, which was the lower limit for multicollinearity elimination. Multicollinearity analysis indicated that only the data in Band 1, Band 8 and the logarithmic transform of the data in Band 5 were independent from each other and could be used as explanatory variables in regression model development. Puetz et al. () and Maheswari () showed the usefulness of inclusion of Bands 1 and 8 in depth determinations in coastal waters as well.

Band 1 senses the radiation in the 400–450 nm wavelength interval. This band supports bathymetric studies by sensing the deeper parts of a water body compared to other sensors (Puetz et al. ). Band 5, on the other hand, acquires radiance data in the range of 630–690 nm. The light in this region of the electromagnetic spectrum is mainly absorbed by chlorophyll-a (Thiemann & Kaufmann ). As mentioned before, analysis of the data in this band revealed a log-normal distribution. This was in line with the distribution of measured chlorophyll-a concentrations in Lake Eymir. The radiance in Band 8 (x_i8) was another explanatory variable that was used in bathymetry model development for Lake Eymir. In various studies, the relationship between suspended particles and radiance in near-infrared band has been shown (Doxaran et al. ). In this study, the impact of suspended particles in depth determination was considered through inclusion of Band 8. An initial analysis of the distribution of suspended particle concentrations in Lake Eymir indicated a



normal distribution similar to that for the distribution of radiance values in Band 8. It must also be noted that it is possible to observe frequent algal blooms over the lake surface in patches. Moreover, macrophytes may cover the bottom, especially at shallower depths closer to shore. Therefore, reflectance in Band 8 can be impacted by these as well.

Following the determination of independent explanatory variables (that have no multicollinearity), data screening was performed. As mentioned earlier, the sampling points over re-suspension areas were removed from the data set in order to avoid the interference these areas would produce in depth predictions. Then, the remaining 48 sampling points were taken into consideration. The minimum, average and maximum water depths at these points were 2.50, 4.57 and 5.75 m, respectively. These values were 2.50, 4.56 and 5.75 m, respectively, for the full observation data set (59 observation points). Further data elimination was conducted based on SD_SR. This approach was used to improve the prediction capability of depth models. Application of remote sensing technology to Case 2 waters to make water quality predictions may be problematic compared to Case 1 waters due to the presence of water constituents that may significantly impact radiance values (Swardika ). It is very probable that local algal blooms, bottom sediments, suspended particles, and even waves can impact radiance values. Another difficulty is the heterogeneous distribution of these interferences which may lead to extreme values. By regarding extreme values as outliers, the impact of such interferences on model prediction performance may be reduced, at least at other locations in the lake that are less prone to such effects. As seen in Figure 1, removed sampling points form clusters in certain locations. It is possible that these locations were subject to the interferences mentioned. The minimum, average and maximum depths for 25 observation points used in the model development were 2.80, 4.70 and 5.70 m, respectively. Therefore, it can be said that deeper locations were considered as ground truth data for modelling purposes. It may be the case that deeper locations were impacted less by bottom sediment re-suspension or bottom reflection.

The linear and non-linear regression models generated to determine the depths at different locations using remotely sensed data are given below in Equations (3) and (4), respectively:

\[ d_i = 2.433 + 193.000\,x_{i1} - 1.313 \log x_{i5} - 108.886\,x_{i8} \tag{3} \]

\[ d_i = 1140.027\,x_{i1}^{1.628} - 0.128\,(\log x_{i5})^{5.000} - 419.672\,x_{i8}^{1.378} \tag{4} \]

R², adjusted R², RMSE and F-value with respect to the calibration data set were 0.87, 0.78, 0.370 and 6.78 × 10⁻⁴, respectively, for the linear regression model at 95% confidence level. For the same data set, R², adjusted R², RMSE and F-value were 0.90, 0.83, 0.379 and 3.04 × 10⁻⁴, respectively, when the non-linear regression model was used. Performances of these models were also tested against the validation data. R², adjusted R², RMSE and F-value were 0.805, 0.760, 0.488 and 1.07 × 10⁻⁶, respectively, when the linear regression model was applied. The corresponding values for the non-linear model were 0.855, 0.822, 0.365 and 1.11 × 10⁻⁷, respectively, at 95% confidence level. In both models, the radiance values in Band 1 had the highest coefficient (k_j in Equations (1) and (2)) compared to other bands, keeping in mind that the radiance in Band 5 (x_i5) was in logarithmic scale. This situation emphasized the importance of Band 1 in bathymetry determination. A similar observation was valid in the correlation matrix as well. Compared to other bands, depth had the highest r (0.351) for Band 1 radiance at 95% confidence level when 48 sampling points were considered. The coefficients for Bands 5 and 8 were negative, which was indicative of the interference due to absorption based on the presence of suspended solids and algal species. It must also be noted that another model based on the ratio method proposed by Stumpf et al. () was tested. The ratio of ln(x_i5)/ln(x_i1) was used. This ratio had the highest correlation with depth (R² = 0.51) compared to other combinations. The model obtained for this ratio, d_i = 15.652 (ln x_i5/ln x_i1) − 11.886, resulted in no better performance. R² and RMSE were 0.46 and 0.529 at 95% confidence level.

Measured versus predicted depths are depicted in Figure 3. As can be seen, both models were successful in predicting low and high depths in Lake Eymir. However, the statistical analysis given before stated that the non-linear model was slightly better in depth predictions. For the



validation data set, the average error was calculated as 0.30 and 0.2 m for the linear and non-linear regression models, respectively. The average depth for the validation data set (observations) was 4.73 m. The predicted average depths were 4.65 and 4.71 m for the linear and non-linear regression models, respectively. These correspond to 2 and 0.5% error in the predicted average depths, respectively. Therefore, models developed using screened ground truth data were successful in predicting the average depth. When the models were applied to predict the depths at 48 sampling points, the average error in depth predictions was 0.61 m in the linear regression model and 0.55 m in the non-linear regression model. The errors in the calculated average depths were 13 and 12%, respectively, for the linear and non-linear models.

Figure 3 | Predicted versus measured depths for linear and non-linear depth models.

Figure 4 | The bathymetric map of Lake Eymir generated by the linear regression depth model.

The bathymetric maps of Eymir Lake that are generated using Equations (3) and (4) are depicted in Figures 4 and 5, respectively. Both models simulated the shallow depths at shores with success. The increasing depths from the shoreline can be clearly seen for both models. Tureli & Norman () studied the bathymetry of the lake using sonar technology. According to that study, the lake bottom had a bowl-type structure with steep slopes at shores. As a result, sharp increases were observed in depths progressing away from the shore to the inner regions of the lake. The mid-region of the lake was the deepest location with an average depth of 5.5 m in 1985. They also stated that the lake became relatively shallow at the southern and eastern parts, which correspond to the inlet and outlet of the lake, respectively. The findings of Tureli & Norman () are consistent with the results of this study. As Lake Eymir has a valley-type structure, a sharp increase is expected in depth in short distances away from the shore. This is captured by the depth models (Figures 4 and 5). Moreover, the southern and eastern parts of the lake are shallower than the other parts. The deeper regions of the lake are shown by darker shades in Figures 4 and 5. In general, the


distributions of relatively lower and higher depths were in line with the observations. However, the depths at re-suspension areas were in error. This could be seen especially at the southern part of the lake closer to the inlet. At these locations mixed values were observed. Overall, although both models predicted the depths well, the non-linear model was better in predicting the shallower depths at shores. However, the non-linear model was more sensitive to the impact of re-suspension areas.

Figure 5 | The bathymetric map of Lake Eymir generated by the non-linear regression depth model.

CONCLUSIONS

The results of this study showed that a WorldView-2 image can be used to predict the depths in a eutrophic lake. Bands 1, 5 and 8 of the WorldView-2 satellite were adequate to determine the depth distribution. Among these bands, Band 1 (coastal blue band) made the highest contribution in determination of the depths in the eutrophic lake.

The presence of turbidity due to re-suspension areas caused interference in predicting the depths. However, eliminating these areas in the depth model development helped to make good depth estimates at locations where the impact of turbidity was less. More study is required to deal with this issue and improve the prediction capability of the models at these locations as well. For the existing situation, regression models were successful in defining the shallow depths at shore and close to the inlet and outlet of the lake. Moreover, deeper locations were successfully identified.

Bathymetry determination using WorldView-2 can aid in water quality studies. Use of remotely sensed data may provide an alternative in determination of the distribution of depths and examination of the water quality in lakes with respect to these depths. Scale advantage supplied by remote sensing over traditional bathymetry generation methods may make it preferable for large lakes. However, more research is needed to investigate the effects of spatially and temporally heterogeneous bottom characteristics (i.e., variable coverage by macrophytes, different bottom materials) on reflectance values in determination of depths in a eutrophic lake.

ACKNOWLEDGEMENTS

The authors are grateful to the Scientific and Technical Research Council of Turkey (TUBITAK) for providing financial support for this study (Project Number: CAYDAG-106Y201). The authors acknowledge Res. Asst. Elif Kucuk for her support during field work.
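The SD_SR screening step described in the methodology (fit a regression, compute standardized residuals, discard points beyond n × SD_SR, then refit) can be illustrated with a minimal Python sketch. It is not the authors' implementation, which relied on XLStat and three explanatory variables. The sketch assumes a single explanatory variable, standardizes residuals by the residual standard error, and compares the absolute value of SR with the threshold; the function names and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def fit_line(x, y):
    """Ordinary least squares fit of y = a + k * x; returns (a, k)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    k = sxy / sxx
    return my - k * mx, k


def standardized_residuals(x, y):
    """Residuals of the fitted line divided by the residual standard error."""
    a, k = fit_line(x, y)
    res = [yi - (a + k * xi) for xi, yi in zip(x, y)]
    # Residual standard error with n - 2 degrees of freedom (two fitted parameters).
    s = sqrt(sum(r * r for r in res) / (len(res) - 2))
    return [r / s for r in res]


def screen(x, y, n):
    """Indices of points kept when |SR| <= n * SD_SR (using |SR| is an assumption)."""
    sr = standardized_residuals(x, y)
    limit = n * stdev(sr)
    return [i for i, value in enumerate(sr) if abs(value) <= limit]


# Hypothetical data: depth follows 1 + x except for one disturbed point (index 3).
x = [0, 1, 2, 3, 4, 5, 6, 7]
depth = [1, 2, 3, 10, 5, 6, 7, 8]

kept = screen(x, depth, n=1.5)
a, k = fit_line([x[i] for i in kept], [depth[i] for i in kept])
print(kept, round(a, 3), round(k, 3))
```

In the study, this elimination was repeated for multipliers from 1.5 down to 0.5, and the resulting models were compared through basic statistics and an F-test. The sketch covers a single multiplier only.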



REFERENCES

Bachmann, C. M., Ainsworth, T. L., Fusina, R. A., Montes, M. J., Bowles, J. H., Korwan, D. R. & Gillis, D. B.  Bathymetric retrieval from hyperspectral imagery using manifold coordinate representations. IEEE Trans. Geosci. Remote Sens. 47, 884–897.
Bachmann, C. M., Montes, M. J., Fusina, R. A., Parrish, C., Sellars, J., Weidemann, A., Goode, W., Nichols, C. R., Woodward, P., Mcilhany, K., Hill, V., Zimmerman, R., Korwan, D., Truitt, B. & Scwartzschild, A.  Very shallow water bathymetry retrieval from hyperspectral imagery at the Virginia Coast Reserve (VCR’07) multi-sensor campaign. In: Proceedings of 2008 IEEE International Geoscience and Remote Sensing Symposium, 6–11 July 2008, Boston, MA, pp. 125–128.
Beisl, U., Telaar, J. & Schonermark, M. V.  Atmospheric correction, reflectance calibration and BRDF correction for ADS40 image data. In: Proceedings of 2008 ISPRS Congress, Commission Papers XXXVIII. 3–11 July 2008, Beijing, China, pp. 7–12.
Beklioglu, M., Ince, O. & Tuzun, I.  Restoration of eutrophic Lake Eymir, Turkey, by biomanipulation undertaken after a major external nutrient control I. Hydrobiologia 489, 93–105.
Calkoen, C. J., Hesselmans, G. H. F. M., Wensink, G. J. & Vogelzang, J.  The bathymetry assessment system: efficient depth mapping in shallow seas using radar images. Int. J. Remote Sens. 22, 2973–2998.
Chavez, P. S.  Atmospheric, solar, and MTF corrections for ERTS digital imagery. In: Proceedings of the American Society of Photogrammetry. Fall technical meeting, Phoenix, AZ, p. 69.
Dierssen, H. M., Zimmerman, R. C., Leathers, R. A., Downes, T. V. & Davis, C. O.  Ocean color remote sensing of seagrass and bathymetry in the Bahamas Banks by high resolution airborne imagery. Limnol. Oceanogr. 48, 444–455.
Digital Globe  8-band multispectral imagery. Available at: www.digitalglobe.com/index.php/48/Products?product_id=27 (accessed 9 July 2011).
Diker, Z.  A Hydrobiological and Ecological Study in Lake Eymir. MS Thesis, Middle East Technical University, Ankara, Turkey.
Doxaran, D., Froidefond, J. M., Lavender, S. & Castaing, P.  Spectral signature of highly turbid waters: application with SPOT data to quantify suspended particulate matter concentrations. Remote Sens. Environ. 81, 149–161.
Glass, A. L., Walker, B., Peters, M. & Dykes, L.  Improving the usability of high resolution imagery for tropical areas: deglinting, de-hazing and calibration of very high resolution satellite imagery. In: Proceedings of Map Asia 2010 & ISG 2010. 26–28 July 2010, Kuala Lumpur, Malaysia. Available at: www.mapasia.org/2010/proceeding/pdf/lisa.pdf (accessed 10 June 2011).
Greidanus, H., Calkoen, C., Hennings, I., Romeiser, R., Vogelzang, J. & Wensink, G. J.  Intercomparison and validation of bathymetry radar imaging models. In: Proceedings of 1997 IEEE International Geoscience and Remote Sensing Symposium. 3–8 August 1997, Singapore, pp. 1320–1322.
Hennings, I.  A historical overview of radar imagery of sea bottom topography. Int. J. Remote Sens. 19, 1447–1454.
Jordan, D. & Fonstad, M.  Two dimensional mapping of river bathymetry and power using aerial photography and GIS on the Brazos River, Texas. Geocarto Int. 20, 13–20.
Kao, H. M., Ren, H., Lee, C. S., Chang, C. P., Yen, J. Y. & Lin, T. H.  Determination of shallow water depth using optical satellite images. Int. J. Remote Sens. 30, 6241–6260.
Kishino, M., Tanaka, A. & Ishizaka, J.  Retrieval of chlorophyll a, suspended solids, and colored dissolved organic matter in Tokyo Bay using ASTER data. Remote Sens. Environ. 99, 66–74.
Lafon, V., Froidefond, J. M., Lahet, F. & Castaing, P.  SPOT shallow water bathymetry of a moderately turbid tidal inlet based on field measurements. Remote Sens. Environ. 81, 136–148.
Lee, S. R.  A coarse-to-fine approach for remote-sensing image registration based on a local method. Int. J. Smart Sens. Intell. Systems 3, 690–702.
Lee, K. R., Kim, A. M., Olsen, R. C. & Kruse, F. A.  Using WorldView-2 to determine bottom-type and bathymetry. In: Proceedings of the SPIE 8030 (Ocean Sensing and Monitoring III). Available at: spiedigitallibrary.org/proceedings/resource/2/psisdg/8030/1/80300D_1 (accessed 10 June 2011).
Leira, M. & Cantonati, M.  Effects of water-level fluctuations on lakes: an annotated bibliography. Hydrobiologia 613, 171–184.
Loomis, M. J.  Depth Derivation from the WorldView-2 Satellite using Hyperspectral Imagery. MS Thesis, Naval Postgraduate School, Monterey, CA, USA.
Lyzenga, D. R., Malinas, N. P. & Tanis, F. J.  Multispectral bathymetry using a simple physically based algorithm. IEEE Trans. Geosci. Remote Sens. 44, 2251–2259.
Maheswari, R. M.  WorldView-2 (WV-2) coastal, yellow, rededge, NIR-2 in underwater habitat mapping. Available at: dgl.us.neolane.net/res/img/7a827acbb24ab9cdd85da7b64d0f9259.pdf (accessed 11 November 2011).
Marchisio, G., Pacifici, F. & Padwick, C.  On the relative predictive value of the new spectral bands in the Worldview-2 satellite. In: Proceedings of the 2010 International Geoscience and Remote Sensing Symposium. 25–30 July 2010, Honolulu, Hawaii, pp. 2723–2726.
McCarthy, B. L., Olsen, R. C. & Kim, A. M.  Creation of bathymetric maps using satellite imagery. In: Proceedings of the SPIE 8030 (Ocean Sensing and Monitoring III). 26–27 April 2011, Orlando, FL, 80300C.
McIntyre, M. L., Naar, D. F., Carder, K. L., Donahue, B. T. & Mallinson, D. J.  Coastal bathymetry from hyperspectral remote sensing data: comparisons with high resolution multibeam bathymetry. Mar. Geophys. Res. 27, 128–136.


59

O. Yuzugullu & A. Aksoy

|

Bathymetry generation using WorldView-2 imagery

Mobley, C. D., Sundman, L. K., Davis, C. O., Bowles, J. H., Downes, T. V., Leathers, R., Montes, M. J., Bissett, W. P., Kohler, D. D., Reid, R. P., Louchard, E. M. & Gleason, A.  Interpretation of hyperspectral remote-sensing imagery by spectrum matching and look-up tables. Appl. Opt. 44, 3576–3592. Morgan, L. A., Shanks, W. A., Lovalvo, D. A., Jhonson, S. Y., Stephenson, W. J., Pierce, K. L., Harlan, S. S., Finn, C. A., Lee, G., Webring, M., Shulze, B., Duhn, J., Sweeney, R. & Balistrieri, L.  Exploration and discovery in Yellowstone Lake: results from high-resolution sonar imaging, seismic reflection profiling, and submersible studies. J. Volcanol. Geoterm. Res. 122, 221–242. Ozen, A.  Role of Hydrology, Nutrients and Fish Predation in Determining the Ecology of a System of Shallow Lakes. MS Thesis, Middle East Technical University, Ankara, Turkey. Parthish, D., Gopinath, G. & Ramakrishnan, S. S.  Coastal bathymetry by coastal blue. Available at: www.dgl.us. neolane.net/res/img/db5653880b1d7abc6fd5c393de7c909d. pdf (accessed 11 November 2011). Philpot, W. D.  Bathymetric mapping with passive multispectral imagery. Appl. Opt. 28, 1569–1578. Puetz, A. M., Lee, K. & Olsen, R. C.  WorldView-2 data simulation and analysis results. In: Proceedings of the SPIE 7334 (algorithms and technologies for multispectral, hyperspectral, and ultraspectral imagery XV). 73340U. Available at: proceedings.spiedigitallibrary.org/proceeding. aspx?articleid=778667 (accessed 11 November 2011). Robbins, B.  Quantifying temporal change in seagrass areal coverage: the use of GIS and low resolution aerial photography. Aquat. Bot. 58, 259–267. Roberts, A. C. B.  Shallow water bathymetry using integrated airborne multi-spectral remote sensing. Int. J. Remote Sens. 20, 497–510. Sandidge, J. & Holyer, R.  Coastal bathymetry from hyperspectral observations of water radiance. Remote Sens. Environ. 65, 341–352.

Journal of Hydroinformatics

|

16.1

|

2014

Stevens, J. P.  Outliers and influential data points in regression analysis. Psychol. Bull. 95, 334–344. Stumpf, R. P., Holderied, K. & Sinclair, M.  Determination of water depth with high-resolution satellite imagery over variable bottom types. Limnol. Oceanogr. 48, 547–556. Sudheer, K. P., Chaubey, I. & Garg, V.  Lake water quality assessment from Landsat thematic mapper data using neural network: an approach to optimal band combination selection. J. Am. Water Resour. Assoc. 42, 1683–1695. Swardika, I. K.  Bio-optical characteristic of case-2 coastal water substances in Indonesia coast. Int. J. Remote Sens. Earth Sci. 4, 64–84. Tan, C. O.  The Roles of Hydrology and Nutrients in Alternative Equilibrium of Two Shallow Lakes of Anatolia, Lake Eymir and Lake Mogan: Using Monitoring and Modeling Approaches. MS Thesis, Middle East Technical University, Ankara, Turkey. Thiemann, S. & Kaufmann, H.  Determination of chlorophyll content and trophic state of lakes using field spectrometer and IRS-1C satellite data in the Mecklenburg Lake district, Germany. Remote Sens. Environ. 73, 227–235. Tureli, K. & Norman, T.  Ankara güneyindeki Eymir Gölü’nün batimetresi ve taban sedimanları (The bathymetry and bottom sediments of Lake Eymir located in south of Ankara). Geol. Bull. Turk. 35, 91–99. Yagbasan, O. & Yazicigil, H.  Sustainable management of Mogan and Eymir Lakes in central Turkey. Environ. Geol. 56, 1029–1040. Yenilmez, F., Keskin, F. & Aksoy, A.  Water quality trend analysis in Lake Eymir, Ankara. Phys. Chem. Earth. 36, 135–140. Yuzugullu, O.  Determination of Chlorophyll-a Distribution in Lake Eymir Using Regression and Artificial Neural Network Models with Hybrid Inputs. MS Thesis, Middle East Technical University, Ankara, Turkey.

First received 1 August 2012; accepted in revised form 9 May 2013. Available online 6 June 2013


© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

Multi-site evaluation to reduce parameter uncertainty in a conceptual hydrological modeling within the GLUE framework

Kairong Lin, Pan Liu, Yanhu He and Shenglian Guo

ABSTRACT

Reducing the uncertainty of hydrological modeling and forecasting has both theoretical and practical importance in hydrological sciences and water resources management. This study focuses on reducing parameter uncertainty through multi-site validation of the conceptual Xinanjiang model. The generalized likelihood uncertainty estimation (GLUE) method was used to conduct the uncertainty analysis with Shuffled Complex Evolution Metropolis (SCEM-UA) sampling. A discharge criterion at the interior gauge stations was added to select the behavioral parameters, and two comparable schemes were then established to illustrate how well the uncertainty can be reduced by considering the observed flow information at the interior sites. The Dongwan watershed, a sub-basin of the Yellow River basin in China, was selected as the case study. The results showed that, when a threshold value was set at the interior sites, the number and standard deviation of behavioral parameter sets decreased, and the runoff series simulated by the Xinanjiang model with the behavioral parameter sets fitted the observed runoff series better. In addition, considering the interior sites’ flow information allows one to derive more reasonable prediction bounds and to reduce the uncertainty in hydrological modeling and forecasting to some degree.

Key words | GLUE, multi-site evaluation, parameter uncertainty, SCEM-UA, Xinanjiang model

Kairong Lin (corresponding author), Yanhu He
Guangdong Key Laboratory for Urbanization and Geo-simulation, School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275, China
E-mail: linkr@mail.sysu.edu.cn

Kairong Lin, Yanhu He
Key Laboratory of Water Cycle and Water Security in Southern China of Guangdong High Education Institute, Sun Yat-sen University, Guangzhou 510275, China

Pan Liu, Shenglian Guo
State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan 430072, China

doi: 10.2166/hydro.2013.204

INTRODUCTION

Hydrological models have been accepted as effective tools for describing the dynamic relations between hydrological processes, meteorological behaviors, land use and land cover, and changes of vegetation coverage within a watershed, providing theoretical and practical support for river basin management (Wagener & Gupta; Hejazi et al.). The hydrological system is complicated by climate factors such as atmospheric circulation, precipitation, and air temperature, and by underlying surface properties such as geological, vegetation and soil conditions (Lin et al.). As a result, the complexity of the hydrological system poses great challenges for hydrological modeling practice, and uncertainty analysis is still an important issue for hydrological modeling and forecasting.

The treatment of uncertainty has prompted an explosion of methods devoted to deriving meaningful uncertainty bounds for hydrological model predictions (e.g., Beven & Freer; Thiemann et al.; Vrugt et al.; Moradkhani et al.; Ajami et al.; Benke et al.; Vrugt & Robinson; Li et al.; Mousavi et al.). Prediction in ungauged basins (PUB) is an initiative that emerged out of discussions among International Association of Hydrological Sciences (IAHS) members on the world-wide web and during a series of IAHS sponsored meetings in Maastricht (July 18–27, 2001), Kofu (March 28–29, 2002), and Brasilia (November 20–22, 2002) about the need to reduce the predictive uncertainty in hydrological science and practice (Sivapalan et al.). Indeed, the final aim of studying uncertainty is to find ways and measures to reduce the uncertainty in hydrological modeling and forecasting, so as to increase the accuracy and reliability of hydrological forecasting.

One of the efficient ways of reducing uncertainty is to use new and all available information (Beven & Binley). For example, Goodman pointed out that the statistical methods that lend themselves to correct quantification of the uncertainty were also effective for combining different sources of information, and concluded that one way to reduce uncertainty was to use all the available data. Freer et al. showed that further constraining of the model responses using the fuzzy water table elevations at both locations considerably reduced the number of behavioral parameter sets. Uhlenbrook & Sieber also pointed out that the potential restriction of the uncertainty clearly depended on the goodness of the simulation of the additional data set. Gallart et al. used conditioning on water table records and the distribution of parameters obtained from point observations to reduce the uncertainty of predictions for both streamflow and groundwater contribution. Maschio et al. dealt with uncertainty mitigation by using observed data, integrating the uncertainty analysis and the history-matching processes. The main characteristic of their study was the use of observed data as constraints to reduce the uncertainty of the reservoir parameters. Lumbroso & Gaume used the analysis of various types of data that can be collected during post-event surveys, together with consistency checks, to reduce the uncertainty in indirect discharge estimates.

In fact, interior hydrological information has been used to improve the performance of hydrological models in many studies. Gupta et al. proposed the use of multiple and non-commensurable measures of information to improve the calibration of hydrologic models. Thereafter, many studies have shown that interior hydrological information helps to improve hydrological modeling to some degree for both conceptual and distributed models (e.g., Krysanova et al.; Andersen et al.; Moussa et al.; Das et al.; Feyen et al.). There could be more uncertainty if only the error at the outlet is considered, and this uncertainty can be considerably reduced by using more available information, such as the interior sites’ flow information. Therefore, based on the idea of using more of the available information in the evaluation to reduce uncertainty, the objective of this study is to reduce parameter uncertainty by using multi-site evaluation of the performance of the Xinanjiang model, based on the generalized likelihood uncertainty estimation (GLUE) method with the Shuffled Complex Evolution Metropolis (SCEM-UA) sampling algorithm. The multi-site evaluation may be of theoretical and practical merit in obtaining some insight into the causes behind hydrological modeling uncertainty, one of the crucial but difficult problems in hydrological modeling practice. The rest of this paper is organized as follows: the section below briefly describes the uncertainty estimation schemes and the Xinanjiang model; then, in the next section, we introduce the study area and associated hydrological data; results are discussed and analyzed in the section after that; finally, the last section contains the major conclusions.

METHODOLOGY

Uncertainty estimation technique

The GLUE method proposed by Beven & Binley to estimate parameter uncertainty has been widely used in many complex and nonlinear models. The GLUE method is devoted to the investigation of hydrological modeling uncertainty by producing the prediction limits for the modeled streamflow series and a set of behavioral parameters (e.g., Freer et al.; Beven & Freer; Blazkova & Beven; Montanari; McMichael et al.; Jin et al.; Ng et al.). The popularity of the GLUE method is probably best explained by its conceptual simplicity, its relative ease of implementation, and its ability to handle different error structures and models without major modifications to the method itself.

The SCEM-UA algorithm (Vrugt et al.) can be used to improve the efficiency of GLUE, which has a heavy computational burden. The SCEM-UA algorithm is an adaptive Markov Chain Monte Carlo (MCMC) sampler, which has good ability to infer the posterior probability distribution of hydrologic model parameters (e.g., Gong; Blasone & Vrugt; McMillan & Clark; Dotto et al.; Xu et al.). Given the merits of SCEM-UA sampling and of the GLUE method, the two methods can be combined. For example, the initial range of parameter samples can be wide without necessarily increasing computational requirements (Dotto et al.). Blasone & Vrugt compared the performance of the informal likelihoods in the SCEM-UA algorithm with the GLUE method and demonstrated that the targeted sampling resulted in better predictions of the model output (and that the uncertainty limits were less sensitive to the number of retained solutions).

Therefore, the GLUE method with the SCEM-UA algorithm was adopted for uncertainty analysis in our study. Two schemes were established by using the GLUE method with the SCEM-UA sampling algorithm, to study how well parameter uncertainty can be reduced by considering the observations of the interior sites’ flow information in an alternative strategy. It is notable that the proposed idea of using the interior sites’ information is not limited to the GLUE or MCMC methods. The flowcharts of these two schemes are shown in Figure 1. Scheme I sets the threshold of the likelihood measure only at the outlet, and scheme II sets the threshold of the likelihood measure at both the outlet and the interior sites.

Figure 1 | Flowchart of GLUE method with SCEM-UA sampling algorithm.

First, in this study, the Nash–Sutcliffe efficiency index (NE) (Nash & Sutcliffe) is selected as the likelihood measure, which is defined as:

$$NE = 1.0 - \frac{\sum_{i=1}^{n} \left[ Q_{obs}(i) - Q_{sim}(i) \right]^2}{\sum_{i=1}^{n} \left[ Q_{obs}(i) - \bar{Q}_{obs} \right]^2} \tag{1}$$
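As a minimal sketch of how Equation (1) can be evaluated, the following Python function computes NE from an observed and a simulated runoff series. The function name `nash_sutcliffe` and the short discharge series are hypothetical and serve only as an illustration; they are not data from the Dongwan watershed.

```python
from typing import Sequence


def nash_sutcliffe(q_obs: Sequence[float], q_sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency index (NE) as written in Equation (1)."""
    if len(q_obs) != len(q_sim) or len(q_obs) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(q_obs) / len(q_obs)
    # Numerator: squared residuals between observed and simulated runoff.
    sse = sum((o - s) ** 2 for o, s in zip(q_obs, q_sim))
    # Denominator: squared deviations of the observations from their mean.
    var = sum((o - mean_obs) ** 2 for o in q_obs)
    if var == 0.0:
        raise ValueError("observed series has zero variance; NE is undefined")
    return 1.0 - sse / var


# Hypothetical discharge series, for illustration only.
q_obs = [10.0, 20.0, 30.0, 40.0]
q_sim = [12.0, 18.0, 33.0, 39.0]
ne = nash_sutcliffe(q_obs, q_sim)
print(f"NE = {ne:.3f}")  # prints: NE = 0.964
```

A perfect simulation gives NE = 1, and a simulation that is no better than the mean of the observations gives NE = 0, which is why thresholds such as 50, 60 or 70% can be read as minimum acceptable efficiencies.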


In Equation (1), $Q_{obs}(i)$, $Q_{sim}(i)$, and $\bar{Q}_{obs}$ denote the observed runoff, the simulated runoff, and the mean value of the observed runoff series, respectively, and $n$ is the length of the observed data series. Second, instead of the Monte Carlo method, the SCEM-UA algorithm was used to generate a sample of parameter sets. In this study, the SCEM-UA algorithm produces NE-dependent samples before setting a threshold, so the simulation associated with each of the parameter sets has equal weight. After that, a threshold value of the likelihood measure is decided, and the behavioral parameter sets whose likelihood values are greater than the threshold are chosen. Then the discharge predictions from the behavioral parameter sets are ranked in order of magnitude and weighted by the likelihood weight associated with each behavioral parameter set, which is defined as:

$$W(i) = \frac{L(\theta_i)}{\sum_{k=1}^{n} L(\theta_k)} \tag{2}$$

where $W(i)$ and $L(\theta_i)$ are the likelihood weight and the likelihood measure value associated with behavioral parameter set $\theta_i$, respectively, and $n$ is the number of behavioral parameter sets. Finally, a cumulative probability distribution for the ranked discharge predictions is obtained by Equation (3):

$$P(Q \le Q_i) = \frac{\sum_{j=1}^{i} W(j)}{\sum_{j=1}^{n} W(j)} \tag{3}$$

where $Q$ represents discharge, $Q_i$ is the discharge prediction ranked at the $i$th place, and $n$ has the same meaning as in Equation (2).

According to the cumulative probability distribution, an uncertainty bound can be obtained for a given certainty level.

In this study, three indices were adopted to evaluate the uncertainty interval. One is the containing ratio (CR), which is defined as the ratio of the number of observations falling within their respective uncertainty intervals to the total number of observations (Beven & Binley; Montanari; Xiong & O’Connor; Lin et al.). It is calculated as:

$$CR = \frac{\sum_{i=1}^{n} J\left[ Q_{obs}(i) \right]}{n} \tag{4}$$

where

$$J\left[ Q_{obs}(i) \right] = \begin{cases} 1, & Q_{low}(i) < Q_{obs}(i) < Q_{up}(i) \\ 0, & \text{otherwise} \end{cases} \tag{5}$$

The confidence interval of discharge at each time step is the major result of the GLUE method in terms of evaluating hydrological modeling uncertainty. Interval width (IW) is usually adopted as one of the major indices to evaluate the uncertainty interval, but it depends on the magnitude of discharge, which makes it impossible to compare across basins. In this study, a relative interval width (RIW) is used, which is defined by the following equation:

$$RIW = \frac{\sum_{i=1}^{n} \left[ Q_{up}(i) - Q_{low}(i) \right]}{n \, \bar{Q}_{obs}} \tag{6}$$

where $Q_{low}(i)$ and $Q_{up}(i)$ denote the lower and the upper uncertainty bounds at time $i$, respectively, and $\bar{Q}_{obs}$ has the same meaning as in Equation (1).

The Nash–Sutcliffe efficiency index of the median values MQ0.5, NE(MQ0.5), is also used as an evaluation index to judge whether or not the median values MQ0.5 and the uncertainty intervals are effective crisp simulations of the observed total flow.

Xinanjiang conceptual model

The Xinanjiang model, developed in 1973 and published in 1980 (Zhao et al.), is a conceptual hydrological model that has been widely used in China. Its main feature is the concept of runoff formation on repletion of storage, which denotes that runoff is not produced until the soil moisture content of the aeration zone reaches field capacity, and thereafter runoff equals the rainfall excess without further loss (Zhao & Liu). Based on the concept of runoff formation on repletion of storage, the total runoff, R, of the basin is calculated by using a soil moisture storage capacity distribution curve in the Xinanjiang model. After that, the total runoff, R, is separated into only two components, i.e., the surface runoff and the groundwater runoff, in the early version of the Xinanjiang model (e.g., Zhao et al.). In the subsequent application of the Xinanjiang model, the runoff, R, is separated into three components, i.e., surface runoff (RS), groundwater runoff (RG), and interflow (RI), with the aim of simulating the real runoff processes more faithfully (Zhao & Liu), and this version of the Xinanjiang model is used in this study. The model consists of four major parts (Figure 2): evapotranspiration, runoff production, runoff separation, and flow routing. There are 17 parameters when using the Muskingum method for flow routing, which may be grouped as follows: evapotranspiration parameters KE, X, Y, C; runoff production parameters WM, B, IMP; runoff separation parameters SM, EX, KI, KG; and runoff concentration parameters CI, CG, N, NK, XE, K. The meanings of the model parameters are listed in Table 1.

Figure 2 | Flowchart of the Xinanjiang model.

STUDY REGIONS AND DATA

River basins

The Dongwan watershed was selected as the case study. It is a sub-watershed of the Yellow River basin located in Henan Province in China, at longitude 111°23′ to 112°51′ and latitude 33°51′ to 34°37′ (Figure 3). It drains an area of 2,623 km², rising in the Funiu Mountain situated in the Qinling Mountain range. Vegetation cover of the watershed is good and soil erosion is not serious. The Dongwan watershed belongs to a monsoon climate area and its rainfall varies greatly between seasons. The inter-annual variation of precipitation is very large, and the highest floods tend to occur in the period July to August. The mean annual precipitation and runoff are 791 and 276 mm, respectively. Figure 3 shows eight rainfall gauge stations and three hydrological gauge stations (Luanchuan, Tantou, and Dongwan) located in the Dongwan watershed. The data selected for modeling are hourly rainfall and discharges over the same period of 1 June to 30 October in six consecutive years from 1993 to 1998. In this study, the data from 1993 to 1996 are selected as the calibration period, and the data from 1997 to 1998 are selected as the validation period.

Figure 3 | Location, digital river network, and land cover of the Dongwan basin.

As shown in Figure 3, eight land cover types were identified in the Dongwan watershed, among which there were three main kinds of land cover: woodland, cropland and wooded grassland, with slightly different subdivisions. The Dongwan watershed consists of three main soil types: Calcaric Cambisols (CMc), Eutric Cambisols (CMe), and Calcic Luvisols (LVk), all of which are almost evenly distributed in the watershed. Therefore, the parameters are considered as homogeneous over the whole basin in this study.

Extraction of the digital river network and sub-basin based on digital elevation model (DEM)

Based on DEM data with a map scale of 1:250,000, the digital river network, sub-watersheds, and topological relations of the study area are extracted automatically by using Arc Hydro Tools, including the related hydrological topography features, such as area, river length, and gradient. In this study, the Dongwan watershed was divided into four sub-watersheds by the three hydrological gauge stations (Luanchuan, Tantou, and Dongwan), with areas of 340, 729, 626, and 928 km², respectively (Figure 3).

Model parameter ranges

Based on previous studies of the Xinanjiang model (Zhao et al.; Zhao; Zhao & Liu) and the characteristics of the Dongwan watershed, such as climate, land use, land cover, and vegetation and soil conditions, the prior ranges of the Xinanjiang model parameters in this study were determined and are listed in Table 1. In detail, the value of the ratio of the impervious to the total area of the basin (IMP) is taken as 0.01 because the study area is a natural basin. The parameters of the Muskingum method, XE and K, are estimated by the trial and error method using the observed discharge, and are equal to 0.45 and 5 h, respectively. Thus, 14 parameters were selected for the uncertainty analysis.

Table 1 | Parameters of the Xinanjiang model and their prior ranges

Parameter   Range       Description
WM (mm)     100–200     Areal soil moisture storage capacity
X           0.05–0.2    Proportion of soil moisture storage capacity of the upper layer (WUM) to WM
Y           0.4–0.7     Proportion of soil moisture storage capacity of the lower layer (WLM) to (1 − X) × WM
KE          0.8–1.5     Ratio of potential evapotranspiration to pan evaporation
B           0.1–0.4     The exponent of the soil moisture storage capacity curve
SM (mm)     10–50       Areal mean free water capacity of the surface soil layer
EX          1–1.5       The exponent of the free water capacity curve
KI          0.1–0.3     The outflow coefficient of the free water storage to interflow
KG          0.1–0.4     The outflow coefficient of the free water storage to groundwater
IMP         0.01        The ratio of the impervious to the total area of the basin
C           0.08–0.18   The coefficient of deep evapotranspiration
CI          0.9–0.93    The recession constant of the lower interflow storage
CG          0.997       The recession constant of groundwater storage
N           1–5         Number of reservoirs in the instantaneous unit hydrograph
NK          4–10        Common storage coefficient in the instantaneous unit hydrograph
XE          0.45        The weighting factor of the Muskingum method
K (h)       5           The storage time constant of the Muskingum method

RESULTS

Comparison of the behavioral parameter sets

To assess the impact of using the interior sites’ flow information on the uncertainty of hydrological modeling, this study considered 12 scenarios (as shown in Table 2), obtained by taking the threshold values of the Nash–Sutcliffe efficiency index at the outlet (NE-Outlet, Dongwan station) as 50, 60, and 70%, either without setting a threshold value at the interior sites or with the threshold values at different interior sites (NE-Interior site) set to 70%. The Xinanjiang model was used to perform the hydrological modeling, and the GLUE method with the SCEM-UA sampling algorithm was adopted for the uncertainty analysis. The total numbers of behavioral parameter sets of the above 12 scenarios are listed in Table 2, which shows that the number of behavioral parameter sets decreased when setting the threshold value at an interior site under all the threshold values at the outlet, especially when setting the threshold value at all interior sites, i.e., Luanchuan and Tantou. Figure 4 shows the scatter map between the Nash–Sutcliffe efficiency indices at the outlet and at the interior sites when the threshold of the Nash–Sutcliffe efficiency index at the outlet is 50%. Although Figure 4 does not show a direct relationship, it can be seen that the Nash–Sutcliffe efficiency index at the outlet is sensitive to that at the interior sites, and that a greater value of the index at the interior sites makes it easier to obtain a greater value at the outlet.

Table 2 | Comparison of number of behavior parameters of different scenarios

                           No interior   Threshold of NE-Interior site = 70%
Threshold of NE-Outlet     threshold     Luanchuan   Tantou   Luanchuan and Tantou
                           (Scheme I)
50%                        4,927         3,513       4,273    3,225
60%                        4,872         3,507       4,270    3,218
70%                        4,645         3,448       4,200    3,184

Scheme I represents the scenario without setting the threshold value at the interior site; NE-Outlet and NE-Interior sites are the Nash–Sutcliffe efficiency indices at the outlet and interior sites, respectively.

Figure 4 | The scatter map between the Nash–Sutcliffe efficiency indices at the outlet and interior stations under threshold of the Nash–Sutcliffe efficiency index at the outlet as 50%.

For further analysis of the difference in behavioral parameter sets among different threshold values at the interior sites, two schemes were selected from the above 12 scenarios. Scheme I sets the threshold of the likelihood measure only at the outlet as 70% (NE = 70%), and scheme II sets the threshold of the likelihood measure at both the outlet and the interior sites (Dongwan, Luanchuan and Tantou stations) as 70% (NE1 = NE2 = NE3 = 70%). Table 3 lists part of the behavioral parameter sets and associated likelihood measure values in scheme I. As shown in Table 3, the parameter sets based only on the runoff at the outlet do not always produce high likelihood measure values at the interior sites. Some values were even smaller than 50% (marked with * in Table 3). That is, many unreasonable behavioral parameter sets were obtained by using scheme I. This indicates that some unreasonable parameter sets can be removed by setting the threshold of the likelihood measure at the interior sites.

Table 3 | Part of the behavioral parameter sets obtained by scheme I

WM      X     Y     KE    B     SM     EX    KI    KG    C     CI    N     NK    NE-LC/%  NE-TT/%  NE-Outlet/%
114.03  0.06  0.63  1.09  0.14  17.34  1.47  0.24  0.32  0.15  0.93  1.07  7.35  43.90*   45.80*   71.40
161.78  0.12  0.67  1.30  0.25  28.63  1.00  0.23  0.28  0.13  0.92  1.64  5.26  60.40    70.50    79.70
124.65  0.06  0.57  1.22  0.25  21.04  1.36  0.11  0.13  0.18  0.91  1.88  8.95  64.90    70.30    75.10
129.60  0.13  0.60  1.38  0.23  48.05  1.32  0.26  0.38  0.16  0.91  1.52  8.64  67.10    73.80    81.50
111.35  0.17  0.53  1.31  0.14  30.41  1.26  0.10  0.19  0.17  0.91  3.91  6.20  66.90    73.20    77.10
152.56  0.17  0.65  0.96  0.36  10.78  1.48  0.40  0.19  0.10  0.92  2.51  8.89  50.20    54.90    74.00
142.76  0.15  0.42  1.43  0.20  29.11  1.13  0.30  0.24  0.13  0.91  1.51  7.88  64.40    73.70    82.20
111.73  0.06  0.65  1.26  0.28  29.33  1.09  0.28  0.31  0.09  0.90  2.97  4.97  73.30    66.80    71.70
136.84  0.13  0.56  1.50  0.16  27.88  1.38  0.29  0.16  0.09  0.91  3.29  9.49  53.00    70.60    83.60
154.24  0.10  0.62  1.36  0.36  17.12  1.33  0.28  0.33  0.10  0.92  3.02  9.20  71.70    69.00    71.50
116.07  0.08  0.52  1.03  0.27  11.69  1.41  0.17  0.34  0.13  0.92  2.47  5.69  63.60    57.80    71.90
134.56  0.18  0.58  1.16  0.14  41.46  1.26  0.13  0.23  0.10  0.93  4.50  6.98  47.10*   59.50    77.80
120.91  0.15  0.51  1.45  0.24  42.02  1.41  0.40  0.26  0.12  0.92  3.80  4.32  57.80    69.80    82.70
138.11  0.09  0.49  1.04  0.23  16.37  1.22  0.18  0.30  0.14  0.92  2.94  4.04  35.50*   37.00*   72.30
159.70  0.19  0.47  1.17  0.14  18.99  1.37  0.17  0.20  0.16  0.93  2.66  4.10  52.30    66.40    82.80
138.70  0.16  0.66  1.43  0.14  49.51  1.08  0.24  0.37  0.15  0.92  1.28  4.25  71.20    77.20    78.90
155.49  0.07  0.57  0.96  0.37  33.05  1.47  0.13  0.28  0.17  0.91  3.30  7.33  47.90*   46.60*   70.70
104.84  0.11  0.61  1.30  0.22  29.13  1.31  0.18  0.32  0.12  0.90  1.52  7.28  74.00    69.70    70.30
115.28  0.13  0.69  1.19  0.13  18.13  1.18  0.15  0.26  0.16  0.90  4.50  4.36  71.30    71.50    71.60
146.48  0.13  0.54  1.49  0.12  35.55  1.15  0.15  0.29  0.16  0.90  2.98  7.56  72.80    77.10    75.20

NE-LC, NE-TT, and NE-Outlet are the Nash–Sutcliffe efficiency indices at Luanchuan, Tantou, and the outlet, respectively. Values below 50% are marked with *.
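The effect of the interior-site thresholds can be illustrated on the 20 parameter sets listed in Table 3. The sketch below is only an illustration: the helper `is_behavioral` and the variable names are hypothetical, the NE values are copied from the last three columns of Table 3, and a parameter set is treated as behavioral when every NE value considered is strictly greater than the 70% threshold. Because Table 3 lists only part of the behavioral sets, the resulting counts are not comparable with the totals in Table 2.

```python
# (NE-LC, NE-TT, NE-Outlet) in percent, copied from the 20 rows of Table 3.
TABLE3_NE = [
    (43.90, 45.80, 71.40), (60.40, 70.50, 79.70), (64.90, 70.30, 75.10),
    (67.10, 73.80, 81.50), (66.90, 73.20, 77.10), (50.20, 54.90, 74.00),
    (64.40, 73.70, 82.20), (73.30, 66.80, 71.70), (53.00, 70.60, 83.60),
    (71.70, 69.00, 71.50), (63.60, 57.80, 71.90), (47.10, 59.50, 77.80),
    (57.80, 69.80, 82.70), (35.50, 37.00, 72.30), (52.30, 66.40, 82.80),
    (71.20, 77.20, 78.90), (47.90, 46.60, 70.70), (74.00, 69.70, 70.30),
    (71.30, 71.50, 71.60), (72.80, 77.10, 75.20),
]

THRESHOLD = 70.0  # percent, the threshold used in schemes I and II


def is_behavioral(ne_values, threshold=THRESHOLD):
    """Return True when every supplied NE value exceeds the threshold."""
    return all(ne > threshold for ne in ne_values)


# Scheme I: threshold at the outlet only.
scheme_one = [row for row in TABLE3_NE if is_behavioral(row[2:])]
# Threshold at the outlet and at one interior site.
with_luanchuan = [row for row in TABLE3_NE if is_behavioral((row[0], row[2]))]
with_tantou = [row for row in TABLE3_NE if is_behavioral((row[1], row[2]))]
# Scheme II: threshold at the outlet and at both interior sites.
scheme_two = [row for row in TABLE3_NE if is_behavioral(row)]

print(len(scheme_one), len(with_luanchuan), len(with_tantou), len(scheme_two))
# prints: 20 6 9 3
```

Of the 20 listed sets that pass the outlet threshold, only three also reach 70% at both Luanchuan and Tantou, which mirrors the qualitative pattern of Table 2: each added interior criterion removes parameter sets that fit the outlet for the wrong reasons.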

the SCEM-UA-derived initial sample contains numerous sol-

The mean and standard deviation of behavioral parameter

utions in the high probability density (HPD) region of the

sets and efficiency coefficients of scheme I and scheme II

parameter space, so that the average distance of the various

were gained and are shown in Figure 5. Referring to Figure 5,

parameter combinations to the optimal model is small. Fur-

it can be seen that the standard deviation of most behavioral

thermore, most of the parameters’ posterior distributions

parameter sets decreased greatly when setting the threshold

obtained by scheme II showed more peak than those obtained

value at the interior sites, and the same with the Nash–

by scheme I. This finding implied that the posterior distri-

Sutcliffe efficiency indices at the outlet and interior sites.

butions obtained by scheme II can evolve into the HPD

All of the above results and analysis indicated that taking

region of the parameter space with higher frequency, so as to

the interior sites’ information into consideration can

obtain more reasonable posterior distributions of the hydrolo-

reduce parameter uncertainty to some degree.

gical parameters, since scheme II further filters the alternative simulation results using the interior flow information.

Comparison of parameters’ posterior distributions Comparison of uncertainty intervals Figure 6 illustrates comparisons between the parameters’ posterior distributions of the Xinanjiang model obtained

To investigate how the interior sites’ information affects the

by scheme I and scheme II, respectively. As shown in

efficiency of uncertainty interval in the Xinanjiang model-

Figure 6, the posterior distributions all show distinct non-

ing, three indices including the CR, RIW, and the Nash–

uniform distribution, and have peak value mostly in the

Sutcliffe efficiency index of the median MQ0.5 presented

two schemes. Blasone & Vrugt () have indicated that

above,

Figure 5

|

were selected

to

evaluate the

efficiency of

Comparisons of the mean and standard deviation of behavior parameters and Nash–Sutcliffe efficiency index of different schemes under threshold of the Nash–Sutcliffe efficiency index at the outlet as 70%.


69

Figure 6

K. Lin et al.

|

|

Multi-site evaluation to reduce parameter uncertainty with the GLUE framework

Journal of Hydroinformatics

|

16.1

|

2014

The posterior distribution of parameters obtained by scheme I and scheme II.

uncertainty. The uncertainty intervals for a given confidence

the Xinanjiang model obtained by scheme I and scheme

level of 90% are obtained by using the GLUE method with

II. NE(MQ0.5) in Table 4 represents the Nash–Sutcliffe effi-

setting a given threshold value of NE as 70%. Table 4 dis-

ciency index of the median MQ0.5 produced from the

plays the results of the uncertainty evaluation indices of

uncertainty analysis by fitting the observed runoff series.
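The way a 90% uncertainty interval, the median MQ0.5, and the containing ratio (CR) relate to a set of behavioral simulations can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the list-of-lists layout, and the use of likelihood-weighted quantiles at 5%, 50% and 95% are assumptions, and the RIW is omitted because its definition is not restated in this section.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalized weight reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight / total
        if cumulative >= q - 1e-12:
            return value
    return pairs[-1][0]


def glue_bounds(simulations, weights, lower=0.05, upper=0.95):
    """Per-time-step lower bound, median and upper bound of behavioral runs.

    simulations: one list of flows per behavioral parameter set.
    weights: one likelihood weight per behavioral parameter set (hypothetical input).
    """
    low, median, high = [], [], []
    for t in range(len(simulations[0])):
        flows_at_t = [run[t] for run in simulations]
        low.append(weighted_quantile(flows_at_t, weights, lower))
        median.append(weighted_quantile(flows_at_t, weights, 0.5))
        high.append(weighted_quantile(flows_at_t, weights, upper))
    return low, median, high


def containing_ratio(observed, low, high):
    """Fraction of observations that fall inside the uncertainty interval."""
    inside = sum(1 for o, lo, hi in zip(observed, low, high) if lo <= o <= hi)
    return inside / len(observed)
```

With equal weights the bounds reduce to ordinary empirical quantiles; the median series returned here plays the role of MQ0.5, whose efficiency against the observed runoff is what NE(MQ0.5) reports.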


Table 4 | Assessing indices of uncertainty for different schemes

Evaluation index  Period        Dongwan (Outlet)                 Tantou                           Luanchuan
                                Scheme I   Scheme II  RI%        Scheme I   Scheme II  RI%        Scheme I   Scheme II  RI%
RIW               Calibration   0.606      0.524      13.58      0.589      0.524      11.05      0.575      0.524      8.89
                  Verification  0.617      0.538      12.91      0.607      0.538      11.37      0.665      0.538      19.13
CR                Calibration   0.683      0.660      3.35       0.677      0.661      2.42       0.666      0.652      2.15
                  Verification  0.693      0.674      2.74       0.628      0.612      2.60       0.631      0.607      3.84
NE(MQ0.5)         Calibration   0.803      0.807      0.59       0.848      0.850      0.24       0.779      0.783      0.54
                  Verification  0.859      0.861      0.23       0.808      0.811      0.38       0.747      0.754      0.83

RI is the percentage of IW decrease from scheme II to scheme I; RIW is relative interval width; CR is containing ratio; NE(MQ0.5) is the Nash–Sutcliffe efficiency index of the median value MQ0.5.
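As a worked check of the RI% column for the RIW rows, the sketch below assumes RI is the decrease from scheme I to scheme II expressed as a percentage of the scheme I value. That formula is inferred from the footnote and the tabulated numbers, not stated explicitly in the text, and the function name is hypothetical. Because the inputs are rounded to three decimals, the results match the table only approximately.

```python
def relative_improvement(scheme_i: float, scheme_ii: float) -> float:
    """Percentage decrease of an index from scheme I to scheme II (assumed formula)."""
    return 100.0 * (scheme_i - scheme_ii) / scheme_i


# RIW at Tantou, calibration period (Table 4 reports RI% = 11.05).
ri_tantou_cal = relative_improvement(0.589, 0.524)

# RIW at Luanchuan, verification period (Table 4 reports RI% = 19.13).
ri_luanchuan_ver = relative_improvement(0.665, 0.538)
```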

Figures 7 and 8 illustrate the uncertainty intervals and observed flow during the time period of 24 July–12 October 1996 (calibration period) and 19 July–13 October 1998 (validation period) at Dongwan obtained by scheme I and scheme II, respectively.

It can be found from Table 4 and Figures 7 and 8 that the CR did not decrease by much, but a more significant decrease can be found in the RIW, implying that considering the interior sites’ flow information can reduce parameter uncertainty to some degree. It can also be observed from Table 4 and Figures 7 and 8 that the Nash–Sutcliffe efficiency index of the median MQ0.5, NE(MQ0.5), increased when the thresholds were set at the interior sites, which indicates that, when the interior sites’ flow information is considered, the runoff series simulated by the Xinanjiang model with the behavioral parameter sets fits the observed runoff series better. Referring to Table 4, the results also show that the total coverage ratios in both calibration and validation are not very high. The coverage ratios are high at high flow but low at low flow, and the period of low flow is longer than that of high flow. Uncertainty can arise from many sources, e.g., input uncertainty, model structure uncertainty, and parameter uncertainty; however, in this case, the reason for this result is that the model used in this study cannot perform very well at low flow in the study area. As pointed out by Beven et al. (), we should not expect such periods to be well predicted by the set of behavioral models identified in calibration. We should also not expect that such periods would be covered by any statistical representation of the calibration errors, since the epistemic uncertainties of inconsistent periods in prediction might be quite different to those in calibration. The only response to this would appear to be to moderate our expectations of what a model, or set of models, can do in prediction. Other related issues need to be investigated in future work.

Figure 7 | The runoff uncertainty intervals and observed flow during the time period 24 July–12 October at the outlet obtained by scheme I and scheme II.

Figure 8 | The runoff uncertainty intervals and observed flow during the time period 19 July–13 October at the outlet obtained by scheme I and scheme II.

CONCLUSION

The aim in researching uncertainty is to find the ways and measures to reduce parameter uncertainty in hydrological modeling and forecasting, so as to increase the accuracy and reliability of hydrological forecasting. Using all the available and new data for multi-site evaluation is one of the valid ways to reduce parameter uncertainty in hydrological modeling and forecasting. Based on the GLUE method with the SCEM-UA sampling algorithm, this study focuses on reducing hydrological modeling uncertainty by using the interior hydrological information in the performance of the Xinanjiang model. Comparison of the results between 12 scenarios showed that, under the same threshold of the Nash–Sutcliffe efficiency index at the outlet, the number and standard deviation of behavioral parameter sets decreased greatly when setting the threshold value at the interior sites. The uncertainty analysis confirmed that the GLUE method with the SCEM-UA sampling algorithm, which periodically updates the size and direction of the proposal distribution, was able to locate the HPD region of the parameter space efficiently. In addition, the CR did not decrease by much, but a more significant decrease can be found in the RIW, implying that considering the interior sites’ flow information, which makes the selection of behavioral parameters stricter, can reduce parameter uncertainty to some degree. Moreover, the Nash–Sutcliffe efficiency of the median value, MQ0.5, increased when the interior sites’ flow information was taken into consideration. This indicates that the runoff series simulated by the Xinanjiang model with the behavioral parameter sets can fit the observed runoff series better and, correspondingly, that the abstracted median value, MQ0.5, can be improved for better prediction of the runoff.
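The difference between the two selection schemes summarized above can be sketched as a filter over candidate parameter sets. This is a minimal sketch, not the authors' code: it assumes scheme I thresholds the Nash–Sutcliffe efficiency at the outlet only and scheme II adds thresholds at the interior sites. The function name, the dictionary layout, the single parameter `k`, the interior thresholds, and all numeric values are hypothetical illustrations.

```python
from statistics import pstdev


def select_behavioral(runs, outlet_threshold, interior_thresholds=None):
    """Keep runs whose NE meets the outlet threshold and every interior threshold."""
    interior_thresholds = interior_thresholds or {}
    selected = []
    for run in runs:
        if run["ne_outlet"] < outlet_threshold:
            continue
        if all(run["ne_interior"][site] >= limit
               for site, limit in interior_thresholds.items()):
            selected.append(run)
    return selected


# Hypothetical candidate runs: one parameter value plus NE at each site.
runs = [
    {"k": 0.30, "ne_outlet": 0.72, "ne_interior": {"Tantou": 0.40, "Luanchuan": 0.35}},
    {"k": 0.45, "ne_outlet": 0.83, "ne_interior": {"Tantou": 0.66, "Luanchuan": 0.52}},
    {"k": 0.50, "ne_outlet": 0.79, "ne_interior": {"Tantou": 0.77, "Luanchuan": 0.71}},
    {"k": 0.90, "ne_outlet": 0.65, "ne_interior": {"Tantou": 0.80, "Luanchuan": 0.80}},
]

# Scheme I: outlet threshold only. Scheme II: outlet plus interior thresholds.
scheme_i = select_behavioral(runs, 0.70)
scheme_ii = select_behavioral(runs, 0.70, {"Tantou": 0.60, "Luanchuan": 0.50})

# Spread of the retained parameter values under each scheme.
spread_i = pstdev(run["k"] for run in scheme_i)
spread_ii = pstdev(run["k"] for run in scheme_ii)
```

In this toy data the stricter scheme keeps fewer parameter sets and their spread is smaller, which mirrors the direction of the effect reported in the conclusion without reproducing any of its numbers.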

ACKNOWLEDGEMENTS

The authors would like to express their gratitude to Valeria from the University of Illinois at Urbana-Champaign. The authors are grateful to Dr Jasper A. Vrugt for developing the SCEM-UA code. The authors would like to express their sincere gratitude to Prof. Keith Beven and the other two anonymous referees for their constructive comments and useful suggestions that helped us improve our paper. This study was financially supported by the National Natural Science Foundation of China (Grant No. 50809078) and the Pearl-River-New-Star of Science and Technology project of Guangzhou City (Grant No. 2011J2200051).

REFERENCES

Ajami, N. K., Duan, Q. & Sorooshian, S. An integrated hydrologic Bayesian multimodel combination framework: confronting input, parameter and model structural uncertainty. Water Resour. Res. 43, W01403.
Andersen, J., Refsgaard, J. C. & Jensen, K. H. Distributed hydrological modelling of the Senegal river basin – model construction and validation. J. Hydrol. 247 (3–4), 200–214.
Benke, K. K., Lowell, K. E. & Hamilton, A. J. Parameter uncertainty, sensitivity analysis and prediction error in a water-balance hydrological model. Math. Comput. Model. 47 (11–12), 1134–1149.
Beven, K. & Binley, A. The future of distributed models: model calibration and uncertainty prediction. Hydrol. Process. 6, 279–298.
Beven, K. & Freer, J. Equifinality, data assimilation, and uncertainty estimation in mechanistic modeling of complex environmental systems using the GLUE methodology. J. Hydrol. 249, 11–29.
Beven, K., Smith, P. J. & Wood, A. On the colour and spin of epistemic error (and what we might do about it). Hydrol. Earth Syst. Sci. 15, 3123–3133.
Blasone, R. S. & Vrugt, J. A. Generalized likelihood uncertainty estimation (GLUE) using adaptive Markov Chain Monte Carlo sampling. Adv. Water Res. 31, 630–648.
Blazkova, S. & Beven, K. J. Flood frequency estimation by continuous simulation for a catchment treated as ungauged (with uncertainty). Water Resour. Res. 38 (8), 1139.
Das, T., Bardossy, A., Zehe, E. & He, Y. Comparison of conceptual model performance using different representations of spatial variability. J. Hydrol. 356 (1–2), 106–118.
Dotto, C. B. S., Mannina, G., Kleidorfer, M., Vezzaro, L., Henrichs, M., McCarthy, D. T., Freni, G., Rauch, W. & Deletic, A. Comparison of different uncertainty techniques in urban stormwater quantity and quality modelling. Water Res. 46 (8), 2545–2558.
Feyen, L., Kalas, M. & Vrugt, J. A. Semi-distributed parameter optimization and uncertainty assessment for large-scale streamflow simulation using global optimization. Hydrol. Sci. J. 53 (2), 293–308.
Freer, J., Beven, K. J. & Ambroise, B. Bayesian estimation of uncertainty in runoff prediction and the value of data: an application of the GLUE approach. Water Resour. Res. 32 (7), 2161–2173.
Freer, J., McMillan, H., McDonnell, J. J. & Beven, K. J. Constraining dynamic TOPMODEL responses for imprecise water table information using fuzzy rule based performance measures. J. Hydrol. 291, 254–277.
Gallart, F., Latron, J., Llorens, P. & Beven, K. Using internal catchment information to reduce the uncertainty of discharge and baseflow predictions. Adv. Water Res. 20, 808–823.
Gong, Z. J. Estimation of mixed Weibull distribution parameters using the SCEM-UA algorithm: application and comparison with MLE in automotive reliability analysis. Reliab. Eng. Syst. Safe. 15 (1), 915–922.
Goodman, D. Extrapolation in risk assessment: improving the quantification of uncertainty, and improving information to reduce the uncertainty. Hum. Ecol. Risk. Assess. 8 (1), 177–192.
Gupta, H. V., Sorooshian, S. & Yapo, P. O. Toward improved calibration of hydrologic models: multiple and noncommensurable measures of information. Water Resour. Res. 34, 751–763.
Hejazi, M. I., Cai, X. M. & Borah, D. K. Calibrating a watershed simulation model involving human interference: an application of multi-objective genetic algorithms. J. Hydroinform. 10 (1), 97–111.
Jin, X., Xu, C., Zhang, Q. & Singh, V. P. Parameter and modeling uncertainty simulated by GLUE and a formal Bayesian method for a conceptual hydrological model. J. Hydrol. 383, 147–155.
Krysanova, V., Bronstert, A. & Müller-Wohlfeil, D. Modelling river discharge for large drainage basins: from lumped to distributed approach. Hydrol. Sci. J. 44 (2), 313–331.
Li, Z., Shao, Q., Xu, Z. & Cai, X. Analysis of parameter uncertainty in semi-distributed hydrological models using bootstrap method: a case study of SWAT model applied to Yingluoxia watershed in northwest China. J. Hydrol. 385, 76–83.
Lin, K., Chen, X., Zhang, Q. & Chen, Z. A Modified Generalized Likelihood Uncertainty Estimation Method by Using Copula Function. IAHS Publication, Wallingford, UK, 335, pp. 51–56.
Lin, K., Zhang, Q. & Chen, X. An evaluation of impacts of DEM resolution and parameter correlation on TOPMODEL modeling uncertainty. J. Hydrol. 394, 370–383.
Lumbroso, D. & Gaume, E. Reducing the uncertainty in indirect estimates of extreme flash flood discharges. J. Hydrol. 414, 16–30.
Maschio, C., Schiozer, D. J., Moura, M. A. B. & Becerra, G. G. A methodology to reduce uncertainty constrained to observed data. SPE. Reserv. Eval. Eng. 12 (1), 167–180.
McMichael, C. E., Hope, A. S. & Loaiciga, H. A. Distributed hydrological modelling in California semi-arid shrublands: MIKE SHE model calibration and uncertainty estimation. J. Hydrol. 317 (3–4), 307–324.
McMillan, H. & Clark, M. Rainfall-runoff model calibration using informal likelihood measures within a Markov chain Monte Carlo sampling scheme. Water Resour. Res. 45, W04418.
Montanari, A. Large sample behaviors of the generalized likelihood uncertainty estimation (GLUE) in assessing the uncertainty of rainfall–runoff simulations. Water Resour. Res. 41, W08406.
Moradkhani, H., Hsu, K.-L., Gupta, H. & Sorooshian, S. Uncertainty assessment of hydrologic model states and parameters: sequential data assimilation using the particle filter. Water Resour. Res. 41 (5), 1–17.
Mousavi, S. J., Abbaspour, K. C., Kamali, B., Amini, M. & Yang, H. Uncertainty-based automatic calibration of HEC-HMS model using sequential uncertainty fitting approach. J. Hydroinform. 14 (2), 286–309.
Moussa, R., Chahinian, N. & Bocquillon, C. Distributed hydrological modelling of a Mediterranean mountainous catchment – model construction and multi-site validation. J. Hydrol. 337 (1–2), 35–51.
Nash, J. E. & Sutcliffe, J. V. River flow forecasting through the conceptual models, 1: a discussion of principles. J. Hydrol. 10 (3), 282–290.
Ng, T. L., Eheart, J. W. & Cai, X. M. Comparative calibration of a complex hydrologic model by stochastic methods GLUE and PEST. Trans. ASABE 53 (6), 1773–1786.
Sivapalan, M., Takeuchi, K., Franks, S. W., Gupta, K., Karambiri, H., Lakshmi, K., Liang, X., McDonnell, J. J., Mendiondo, E. M., O’Connell, P. E., Oki, T., Pomeroy, J. W., Schertzer, D., Uhlenbrook, S. & Zehe, E. IAHS decade on predictions in ungauged basins (PUB), 2003–2012: shaping an exciting future for the hydrological sciences. Hydrol. Sci. J. 48 (6), 857–880.
Thiemann, M., Trosset, M., Gupta, H. & Sorooshian, S. Bayesian recursive parameter estimation for hydrological models. Water Resour. Res. 7 (10), 21–35.
Uhlenbrook, S. & Sieber, A. On the value of experimental data to reduce the prediction uncertainty of a process-oriented catchment model. Environ. Modell. Softw. 20 (1), 19–32.
Vrugt, J. A. & Robinson, B. A. Treatment of uncertainty using ensemble methods: comparison of sequential data assimilation and Bayesian model averaging. Water Resour. Res. 43, W01411.
Vrugt, J. A., Gupta, H. V., Bouten, W. & Sorooshian, S. A Shuffled Complex Evolution Metropolis algorithm for optimization and uncertainty assessment of hydrologic model parameters. Water Resour. Res. 39 (8), 1201.
Wagener, T. & Gupta, H. V. Model identification for hydrological forecasting under uncertainty. Stoch. Environ. Res. Risk. A 19, 378–387.
Xiong, L. & O’Connor, K. M. An empirical method to improve the prediction limits of the GLUE methodology in rainfall-runoff modeling. J. Hydrol. 349, 115–124.
Xu, D., Wang, W., Chau, K. & Chen, S. Comparison of three global optimization algorithms for calibration of the Xinanjiang model parameters. J. Hydroinform. 15 (1), 174–193.
Zhao, R. J. The Xinanjiang model applied in China. J. Hydrol. 135, 371–381.
Zhao, R. J. & Liu, X. R. The Xinanjiang model. In: Computer Models of Watershed Hydrology (V. P. Singh, ed.). Water Resources Publication, Highlands Ranch, CO.
Zhao, R. J., Zhang, Y. L. & Fang, L. R. The Xinanjiang model. Hydrological Forecasting Proceedings Oxford Symposium. IASH, Oxford, vol. 129, 1980, pp. 351–356.

First received 5 November 2012; accepted in revised form 10 May 2013. Available online 4 June 2013


74

© IWA Publishing 2014 Journal of Hydroinformatics

|

16.1

|

2014

Integration of an evolutionary algorithm into the ensemble Kalman filter and the particle filter for hydrologic data assimilation

Gift Dumedah and Paulin Coulibaly

Gift Dumedah (corresponding author) Department of Civil Engineering, Monash University, Building 60, Melbourne, Victoria 3800, Australia E-mail: dgiftman@hotmail.com
Paulin Coulibaly School of Geography and Earth Sciences, and Department of Civil Engineering, McMaster University, 1280 Main Street West, Hamilton, Ontario, Canada L8S4L8

ABSTRACT

Data assimilation (DA) methods continue to evolve in the design of streamflow forecasting procedures. Critical components for efficient DA include accurate description of states, improved model parameterizations, and estimation of the measurement error. Information about these components is usually assumed or rarely incorporated into streamflow forecasting procedures. Knowledge of these components could be gained through the generation of a Pareto-optimal set – a set of competitive members that are not dominated by other members when compared using evaluation objectives. This study integrates Pareto-optimality into the ensemble Kalman filter (EnKF) and the particle filter (PF). Comparisons are made between three methods: evolutionary data assimilation (EDA) and methods based on the integration of Pareto-optimality into the EnKF (ParetoEnKF) and into the PF (ParetoPF). The methods are applied to assimilate daily streamflow into the Sacramento Soil Moisture Accounting model in the Spencer Creek watershed in Canada. The updated members are applied to forecast streamflows for up to 10 days ahead, where forecasts for 1 day, 5 day and 10 day lead times are compared to observations. The results show that updated estimates are similar for all three methods. An evaluation of updated members for multi-step forecasting revealed that EDA had the highest forecast accuracy compared to ParetoEnKF and ParetoPF, which have similar accuracies.

Key words | data assimilation, ensemble Kalman filter, multi-objective evolutionary algorithms, Pareto-optimality, particle filter, streamflow forecasting
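The Pareto-optimal set described in the abstract, members that no other member dominates across the evaluation objectives, can be sketched as follows. This is a minimal illustration of non-dominance, not the NSGA-II used in the paper; it assumes every objective is to be minimized (for example, an error measure), and the function names and objective tuples are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one.

    Objectives are assumed to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(members):
    """Return the members that are not dominated by any other member."""
    return [m for m in members
            if not any(dominates(other, m) for other in members)]


# Hypothetical members, each scored on two objectives to be minimized.
candidates = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
front = pareto_front(candidates)
```

Here `(3.0, 3.0)` is dropped because `(2.0, 2.0)` is better on both objectives, while the remaining three members each trade one objective against the other and so stay in the set.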

INTRODUCTION

Data assimilation (DA) has gained popularity in the design of streamflow forecasting methods. It is an analytical approach that allows an optimal merger between inaccurate model output and imperfect observations, and accounts for uncertainties in model and observation data (Liu & Gupta ). For hydrological forecasting systems, accurate estimation of the state, determination of the measurement (or observation) error, and parameterizations of the model are crucial components for the performance of the DA method (Chen ; Snyder et al. ; van Leeuwen ). These three components are important for the design of efficient hydrological forecasting systems. The model state and parameterizations have a direct influence on simulations (e.g. streamflow) and subsequent assimilation, whereas the measurement error controls the relative penalty between simulation and observation. State and model parameterizations also control model forecasts (i.e. background information) and ensemble simulations, which are combined with observations to determine updated ensemble members. However, specific integration of these components into streamflow forecasting procedures has not been fully examined in the hydrological literature.

Evolutionary algorithms have been shown to provide a stochastic framework to address key components for DA,

doi: 10.2166/hydro.2013.088


75

G. Dumedah & P. Coulibaly

|

Integration of Pareto-optimality into Bayesian-type filters

Journal of Hydroinformatics

|

16.1

|

2014

including efficient parameter estimation and the estimation of measurement error using the dynamics between simulations and observations (Chemin & Honda ; Nazemi et al. ; Ines & Mohanty ; Dumedah et al. ). Multi-objective evolutionary algorithms have been widely used for hydrological applications (Chemin & Honda ; Ines & Mohanty , ; Dumedah et al. , , a, b). Evolutionary algorithms allow several members in a population to compete among themselves based on evaluation objectives – the fitter members are selected and varied to reproduce new members to form a new population. The procedure is repeated to evolve the population through natural selection and variation of fitter members using crossover and mutation – nudging operators for modifying members to maintain diversity between members. Each cycle of evolution of the population to reproduce a new one is called a generation. The evaluation conditions change with each population as the fitness of its members usually increases with every cycle of the evolution. This continuous evolution of the population of members through different evaluation conditions allows the determination of the Pareto-optimal set – a set of equally accurate members that are not dominated when compared to other members using evaluation objectives.

Note that the evolution continuously evaluates the dynamics between several simulations and perturbed observations. This interaction estimates the measurement noise as the error of using several simulations (or measurements) to approximate an ensemble of perturbed observations. The continuous evolution and the final population from which the Pareto-optimal set is determined provide an appealing framework to adaptively approximate the measurement error, and to improve the estimation of state and model parameterizations. This can facilitate the integration of Pareto-optimality into Kalman-type assimilations where, instead of assimilating randomly generated members (or ensemble members), the framework first determines Pareto-optimal members before they are optimally merged with the perturbed observations. The integration of Pareto-optimality into Kalman-type assimilation can facilitate the design and performance of hydrological forecasting systems. Further information on evolutionary algorithms can be found in Deb () and Eiben & Smith ().

In assimilation, the evolutionary algorithm employs the variational DA approach (Reichle et al. ; Caparrini et al. ; Caparrini et al. ) to minimize a cost (or penalty) function by finding the least squares estimate between ensemble simulations and the observation data. As a result, evolutionary data assimilation (EDA) uses an evolutionary strategy to continuously evolve a population of competing members through evaluation conditions that are defined by the cost function and other accuracy measures such as the root mean square error (RMSE). Applied in a sequential mode, the EDA evolves the population of members at each assimilation time step and also between time steps. At each assimilation time step, several members are evaluated, but only the ensemble members that remain non-dominated are selected as the updated members to determine the ensemble mean and its variance. A detailed description of the computational procedure of the EDA is provided in the subsection on EDA.

Moreover, the design of streamflow forecasting systems has been dominated by popular DA methods such as the ensemble Kalman filter (EnKF) and the particle filter (PF) (Moradkhani & Hsu ; Weerts & El-Serafy ; Clark et al. ; van Leeuwen ; Weerts et al. ; Xie & Zhang ). Brief descriptions of the EnKF and PF are given below. The EnKF was developed by Evensen (a), and has been applied in several hydrological studies (Clark et al. ; Komma et al. ; Thirel et al. ; Xie & Zhang ). The EnKF uses Monte Carlo integration to estimate the posterior probability density function (pdf) through the ensemble mean and covariance (Evensen b, ; Burgers et al. ). The ensemble members, which may include perturbed states, model parameters, and forcing data uncertainties, are propagated by using the model to make predictions (or measurements) to future time. The ensemble predictions are combined with observations to determine the Kalman gain function (denoted K) and the innovation vector. The K that is computed using the covariance between states, parameters, and forcing data is combined with the innovation vector, the residual between simulation and observation, to update the ensemble members. The above procedure is repeated to evolve state and model parameter components for subsequent time steps. Detailed application of the EnKF is provided in the subsection on integration of Pareto-optimality into the EnKF (ParetoEnKF).

The PF was originally developed by Gordon et al. (), and has been applied widely in hydrology (Moradkhani & Hsu ; Weerts & El-Serafy ; van Leeuwen ). The PF uses recursive Bayesian estimation to also estimate the posterior pdf, but through the use of weighted ensemble members for approximating the full pdf (Bengtsson et al. ; Chen ; Vossepoel & van Leeuwen ; Snyder et al. ; van Leeuwen ). As in the EnKF, the ensemble members for states, model parameters, and forcing data uncertainties are propagated by using the model to make predictions forward in time. The ensemble predictions are combined with observations to compute the ensemble weight, which in turn is re-sampled to replace low weighted members with normalized weighted members. The re-sampled weights are applied to update the ensemble members, and the procedure is repeated to evolve state and model parameter components for subsequent time steps. Detailed application of the PF is provided in the subsection on integration of Pareto-optimality into the PF (ParetoPF).

Since both the EnKF and the PF have clear and standard computational procedures, their applications in this study are outlined in the subsections on ParetoEnKF and ParetoPF. Despite their popularity, analytical drawbacks, including the assumption of normality for model errors in the EnKF and the weight degeneracy problems in the PF, are well-known challenges in the DA literature (Chen ; Moradkhani & Hsu ; Weerts & El-Serafy ; Clark et al. ; Snyder et al. ; van Leeuwen ). While these limitations are not specifically addressed here, this study will demonstrate the integration of Pareto-optimality into the EnKF and PF. The integration would allow adaptive estimation of the measurement error, and the assimilation of a continuously evolved set of members to ensure that model forecasts are generated from improved model parameterizations and updated states. As will be demonstrated in this study, the continuous evaluation of the updated ensemble members and their associated model forecasts is an important measure for evaluating forecasting performance of DA methods.

This study makes a comparison between three assimilation approaches: EDA, ParetoEnKF, and ParetoPF. The ParetoEnKF generates a Pareto-optimal set using an evolutionary strategy before merging the resulting evolved members with perturbed observations through the EnKF. The ParetoPF determines the updated ensemble members by using the evolutionary algorithm to generate the Pareto-optimal members that, in turn, are assimilated using a particle filtering method. The three methods are applied in a state-parameter estimation procedure to assimilate daily streamflow into the Sacramento Soil Moisture Accounting (SAC-SMA) model in Spencer Creek watershed in southern Ontario, Canada. The updated ensemble members are applied to forecast streamflow for up to a 10 day lead time in order to evaluate the forecasting performance of the DA methods.

The rest of the paper is organized as follows. The Data and methods section describes the study area and the rainfall-runoff model, and implementation procedures for the three methods. The resulting assimilation outputs and model forecasts are presented in the Results and discussion section. The implications of the results on the design and performance of streamflow forecasting systems, and findings of this study are summarized in the Conclusions section.

DATA AND METHODS

Study area and rainfall-runoff model

The study area, Spencer Creek watershed shown in Figure 1, is located west of Lake Ontario in southern Ontario, Canada. The Spencer Creek watershed has a drainage area of about 280 km², and the land cover is mainly agricultural with mixed forest. The upstream area has a flat physiographic terrain, whereas the downstream area has variable topography. Forcing data, including daily temperature and daily precipitation, are obtained from Environment Canada weather stations and also from McMaster University weather stations. Two streamflow gauging stations, Highway 5, located at the upstream section, and Dundas, located at the downstream section, are used in this study.

Figure 1 | Study area – Spencer Creek watershed. Source: Natural Resources Canada.

The rainfall-runoff model used is a modified version of the SAC-SMA model, in which a snowmelt routine was included. The SAC-SMA model is a conceptual model and it spatially lumps the drainage area to account for water balance (inflow, storage, and outflow) in the catchment. The model states that are based on the soil moisture budget are determined from the mean area precipitation, the evapotranspiration, the net streamflow, and the net groundwater loss. The SAC-SMA model has been applied in several studies, and is extensively used for operational streamflow forecasting (Vrugt et al. a, b; Vrugt & Robinson ). A list of SAC-SMA model parameters and state variables is given in Table 1. Note that the intervals represent values that are physically meaningful for the SAC-SMA model in the context of the Spencer Creek watershed. Further information on the SAC-SMA model can be found in Vrugt et al. (a, b).

The EDA procedure

The EDA uses the Non-dominated Sorting Genetic Algorithm-II (NSGA-II) to continuously evolve a population of members through different assimilation time steps. The NSGA-II was designed by Deb et al., and is an advanced multi-objective evolutionary algorithm that has been applied in several hydrological studies (Tang et al. ; Confesor & Whittaker ; Wohling et al. ; Dumedah ; Dumedah & Coulibaly ). Computational procedures of the NSGA-II can be found in the following sources: Deb et al. (, ), Deb (), Deb & Goel () and Coello Coello et al. (). A flowchart for computational procedures in the EDA is shown in Figure 2 – detailed descriptions are given below.

The EDA is applied to sequentially assimilate daily streamflow into the SAC-SMA model through the simultaneous estimation of state and model parameter components. The state and model parameter components are considered time variant in a way that they are updated for each assimilation time step when there is a new observation. This is the same as in the standard state-parameter assimilation procedure in Moradkhani et al. (). The EDA begins by using the NSGA-II to generate n random members into a population Pn for initial time t0. This initial population is generated by using the model parameter bounds and the forcing data uncertainties shown in Table 1. The population Pn is varied using crossover and mutation operators to generate a child population of size n where both populations are combined to create 2n members


Table 1 | Description and intervals for model parameters and state variables for the SAC-SMA model

Parameter | Description | Interval

Model parameters
UZTWM | Upper-zone tension water maximum storage (mm) | 5–100
UZFWM | Upper-zone free water maximum storage (mm) | 5–100
LZTWM | Lower-zone tension water maximum storage (mm) | 100–500
LZFPM | Lower-zone free water primary maximum storage (mm) | 50–500
LZFSM | Lower-zone free water supplemental maximum storage (mm) | 250–1,000
ADIMP | Additional impervious area (–) | 0.01–0.4

Recession parameters
UZK | Upper-zone free water lateral depletion rate (day⁻¹) | 0.01–0.2
LZPK | Lower-zone primary free water depletion rate (day⁻¹) | 0.0001–0.02
LZSK | Lower-zone supplemental free water depletion rate (day⁻¹) | 0.1–0.5

Percolation and other
ZPERC | Maximum percolation rate (–) | 1–10
REXP | Exponent of the percolation equation (–) | 1–10
PCTIM | Impervious fraction of the watershed area (–) | 0.0–0.01
PFREE | Fraction percolating from upper to lower-zone free water storage (–) | 0–0.5
RIVA | Riparian vegetation area (–) | 0
SIDE | Ratio of deep recharge to channel base flow (–) | 0
SAVED | Fraction of lower-zone free water not transferable to tension water | 0

Soil moisture states
UZTWC | Upper-zone tension water storage content (mm) | Updated
UZFWC | Upper-zone free water storage content (mm) | Updated
LZTWC | Lower-zone tension water storage content (mm) | Updated
LZFPC | Lower-zone free primary water storage content (mm) | Updated
LZFSC | Lower-zone free secondary water storage content (mm) | Updated
ADIMC | Additional impervious area content linked to stream network (mm) | Updated

Snow routine components
DDF | Degree day factor | 1–5
SCF | Snowfall correction factor | 0.8–1.2
TR | Upper threshold temperature, to distinguish between rainfall, snowfall and a mix of rain and snow | 0–1
ATHORN | A constant for Thornthwaite’s equation | 0.1–0.3
RCR | Rainfall correction factor | 0.8–1.2
SWE (state) | Snow water equivalent (mm) |

Nash-cascade routing components
RQ | Residence time parameters of quick flow | 0.4–0.95
UHG1, UHG2, UHG3 | Three linear reservoirs to route the upper-zone (quick response) channel inflow (mm) | Updated

Forcing variables
PRECIP | Precipitation (mm) | ±5%
TEMPR | Temperature (°C) | ±5%
EVAPO | Evapotranspiration (mm) | ±5%

Variables with interval ‘updated’ means they are defined using model output
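The intervals in Table 1 define the space from which the EDA draws its initial random population. As a rough illustration only, the following Python sketch draws members within a few of those intervals. The names `BOUNDS`, `FORCING_ERROR`, `random_member` and `initial_population` are hypothetical, and two choices are assumptions rather than details given in the text: uniform sampling within each interval, and treating the ±5% forcing interval as a multiplicative factor (Equation (2) instead describes an additive Gaussian perturbation).

```python
import random

# A few intervals copied from Table 1; the other parameters follow the same pattern.
BOUNDS = {
    "UZTWM": (5.0, 100.0),
    "UZFWM": (5.0, 100.0),
    "LZTWM": (100.0, 500.0),
    "UZK": (0.01, 0.2),
    "REXP": (1.0, 10.0),
}
FORCING_ERROR = 0.05  # the +/-5% interval listed for PRECIP, TEMPR and EVAPO


def random_member(rng):
    """Draw one member: parameter values plus forcing multipliers (assumed form)."""
    member = {name: rng.uniform(lo, hi) for name, (lo, hi) in BOUNDS.items()}
    for forcing in ("PRECIP", "TEMPR", "EVAPO"):
        member[forcing] = rng.uniform(1.0 - FORCING_ERROR, 1.0 + FORCING_ERROR)
    return member


def initial_population(n, seed=0):
    """Generate n random members for the initial time step."""
    rng = random.Random(seed)
    return [random_member(rng) for _ in range(n)]


population = initial_population(20)
```

Each member is a plain dictionary here for readability; a vector layout, as described in the text, would serve equally well once the parameter order is fixed.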

Figure 2 | Computational procedures for a sequential assimilation in the EDA.

into a new population Pc. The variation operators employ a uniform typed crossover with a crossover probability, and a substitution typed mutation with a mutation rate. Note that a member (or a solution) is represented as a vector containing values for states, model parameters, and forcing data uncertainties, where they are applied in the SAC-SMA model to generate streamflow. The states are obtained using Equation (1), and the forcing data are perturbed according to Equation (2). The model parameters, states, and the forcing data are applied into the SAC-SMA model to estimate the predicted streamflow in Equation (3), and the observed streamflow is perturbed using Equation (4):

$$x_t = f[x_{t-1}, u_{t-1}, z_t] \tag{1}$$

$$u_t = u_t + \gamma_t, \quad \gamma_t \sim N(0, \beta^u_t) \tag{2}$$

$$\hat{y}_t = h[x_t, z_t, u_t] \tag{3}$$

$$y_t = y_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \beta^y_t) \tag{4}$$

where, for each population member, $x_t$ is a vector of forecasted states at time t with dimension L × 1; L is the



number of model states; f[.] is the system transition function (or the SAC-SMA model); $x_{t-1}$ is a vector of updated states for the previous time; $z_t$ is the model parameter with dimension F × 1; F is the number of model parameters; $u_t$ is the forcing data with dimension E × 1; E is the number of forcing variables; and $\gamma_t$ is the forcing data error with covariance $\beta^u_t$ at each time step. Additionally, $\hat{y}_t$ is the ensemble streamflow predictions with dimension 2n × 1; h is the measurement function (i.e. the SAC-SMA model); $y_t$ is the observed streamflow with dimension 2n × 1; and $\varepsilon_t$ is the observed streamflow error with covariance $\beta^y_t$ at each time step.

The population Pc is evaluated and continuously evolved over several generations to determine the Pareto-optimal set using the evaluation objectives: bias in Equation (5), RMSE in Equation (6), and the cost function J in Equation (7). All the objectives are minimized such that the bias and RMSE aim to determine a simulated streamflow (from the SAC-SMA model) that is closest to the perturbed streamflow observation. Note that the observed daily streamflow is randomly perturbed using the associated hourly variance such that 2n ensemble observations are generated to correspond to the number of members in Pc. The minimization of J allows the determination of a simulated streamflow from the SAC-SMA model that represents an optimal compromise between the background (i.e. forecast from previous ensemble members) and the observed streamflow. The background streamflow is the average streamflow value determined by applying the ensemble members from the previous time step into the SAC-SMA model to forecast ensemble streamflows for the current time step. For the initial time step, the background value is computed from a randomly generated population of members.

The final evolved population of size n from which the Pareto-optimal set is determined represents the updated ensemble members for t0. This final evolved population of members is used to determine the streamflow ensemble mean and its associated variance. This population is also used to forecast n ensemble members for future time t1 where the average and variance of the ensemble members are used as background information at t1. Note that the streamflow forecast is conducted by applying the final ensemble members that have specific values for states, model parameters, and forcing data uncertainties into the SAC-SMA model:

$$\mathrm{Bias} = \frac{\sum_{i=1}^{k} (\hat{y}_i - y_i)}{k} \tag{5}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{k} (\hat{y}_i - y_i)^2}{k}} \tag{6}$$

$$\sum_{i=1}^{k} J(y_i) = \sum_{i=1}^{k} \left\{ \frac{(\hat{y}_i - \hat{y}_{b,i})^2}{\sigma_b^2} + \frac{(\hat{y}_i - y_i)^2}{\sigma_o^2} \right\} \tag{7}$$

where $\hat{y}_{b,i}$ is the background value for the ith data point, $y_i$ is the observed value for the ith data point, $\sigma_b^2$ is the variance for the background streamflow, $\sigma_o^2$ is the variance for the observed streamflow, $\hat{y}_i$ is the analysis (or searched) value for the ith data point that minimizes $J(y_i)$, and k is the number of data points (in this study, k = 1 for sequential assimilation).

The EDA increments the assimilation time step to t1 where the n members in the final evolved population found at t0 are varied and integrated to create a seed population Pc of 2n ensemble members. The population Pc is continuously evaluated and evolved to determine a new Pareto-optimal set for t1, where it is used to determine the updated ensemble members and background information for future time step t2. The above steps are repeated to create a seed population, continuously evolve these members, and determine the updated ensemble members and background information for subsequent time periods. Note that procedures for determining background information are used in a similar approach to determine model forecasts for the 10 day lead times. That is, the streamflow forecasts are based on the updated ensemble members that are associated with specific values for forcing data uncertainties, states and model parameters.

The ParetoEnKF procedure

The ParetoEnKF method uses the NSGA-II to continuously evolve a population of members before assimilating the resulting Pareto-optimal set using the EnKF method. That



is, instead of assimilating randomly perturbed ensemble members, the ParetoEnKF first generates an equally competitive set of members before they are merged with observation data. In the ParetoEnKF method, the EnKF is only used to update the final evolved members in the Pareto-optimal set, whereas the NSGA-II controls the Pareto distribution through continuous evolution and natural selection of members. Note that the assimilation of the final evolved population is conducted following the state-parameter formulation outlined in Moradkhani et al. (). A flowchart for computational procedures in the ParetoEnKF is shown in Figure 3; detailed descriptions are given below.

The estimation of the Pareto-optimal set is described briefly, since the generation of the final evolved population that contains the Pareto-optimal set has been described in detail in the subsection on EDA. Beginning with the initial time t0, the ParetoEnKF method uses the NSGA-II to evolve a randomly generated population Pv (of the same size as Pc in the EDA subsection). The EDA procedures are applied to determine the Pareto-optimal set using the evaluation objectives: bias in Equation (5), RMSE in Equation (6), and the cost function in Equation (7). All members in the final population from which the Pareto-optimal set is determined represent ensemble members (with associated streamflows) to be assimilated using the EnKF method.

The predicted ensemble streamflows ($\hat{y}_t$) are combined with perturbed observations ($y_t$) in Equation (4) to determine the Kalman gain functions for model parameters in Equation (8), and for state components in Equation (9):

$$K^z_t = \beta^{zy}_t [\beta^{yy}_t + \beta^y_t]^{-1} \tag{8}$$

$$K^x_t = \beta^{xy}_t [\beta^{yy}_t + \beta^y_t]^{-1} \tag{9}$$

where $\beta^{zy}$ is the cross covariance of parameter ensemble $z_t$ and the ensemble streamflow prediction $\hat{y}_t$, $\beta^{yy}$ is the forecast error covariance for ensemble of streamflow prediction $\hat{y}_t$, $\beta^y_t$ is the covariance for the observed streamflow, and $\beta^{xy}$ is

Figure 3 | Computational procedures for a sequential assimilation in the ParetoEnKF: integration of Pareto-optimality into the EnKF.



the cross covariance of the model state ensemble and the streamflow prediction ensemble. The model parameters and state components are directly updated using their respective Kalman gain functions and the innovation vector, as in Equations (10) and (11), respectively:

$$z^+_t = z^-_t + K^z_t (y_t - \hat{y}_t) \tag{10}$$

$$x^+_t = x^-_t + K^x_t (y_t - \hat{y}_t) \tag{11}$$

where $z^+_t$ is the updated model parameter components, $z^-_t$ is the perturbed model parameters before update, $x^+_t$ is the updated state components, and $x^-_t$ is the perturbed states before update. These updated members for t0 are populated into Pv, where they are applied into the SAC-SMA model to make v ensemble forecasts of streamflow for future time step t1. The ensemble forecasts (for streamflow) are used to determine the ensemble mean and its associated variance where they represent the background information for t1.

At t1, the updated population from t0 is used as the seed population where it is varied and evolved using the NSGA-II. A new evolved population Pv for t1 is determined where it is again updated using the EnKF method. The above procedures are repeated for subsequent time steps to evolve previously updated populations of members, assimilate evolved members using the EnKF method, and update evolved members for future forecasts. As in the EDA, streamflow forecasts for the 10 day lead times are determined using the same procedure for estimating the background information. Further information on the EnKF method can be found in various sources (Evensen b; Houtekamer & Mitchell ; Moradkhani et al. ; Weerts & El-Serafy ; Clark et al. ; Komma et al. ; Thirel et al. ; Xie & Zhang ).

The ParetoPF procedure

The working procedure for the ParetoPF method is similar to that of the ParetoEnKF method, except that the ParetoPF uses a PF to assimilate an equally competitive set of members. The ParetoPF method uses NSGA-II to continuously evolve a population of members before assimilating the final evolved population that contains the Pareto-optimal set using the PF. That is, instead of assimilating randomly perturbed ensemble members, the ParetoPF first generates an equally competitive set of members before they are merged with perturbed observation data. The ParetoPF uses the PF only to update the final evolved members in the Pareto-optimal set, whereas the Pareto distribution is controlled by the NSGA-II in continuous evolution and natural selection of members. The computational procedure for the ParetoPF is shown in Figure 4; detailed descriptions are given below.

Beginning with the initial time t0, the ParetoPF method uses the NSGA-II to evolve a randomly generated population Pv. The EDA procedures described earlier in the EDA subsection are applied to determine the Pareto-optimal set using the evaluation objectives: bias, RMSE, and J. All members in the final population from which the Pareto-optimal set is determined represent ensemble members (with associated streamflows) to be assimilated using particle filtering. As in the ParetoEnKF, observed streamflows are perturbed according to Equation (4). The predicted streamflow ensemble $\hat{y}_t$ and the ensemble of perturbed observations $y_t$ are applied to determine the ensemble weight (w) in Equation (12). Note that n is the ensemble size, which is the same as v, the number of members in Pv. The weights are re-sampled using a residual re-sampling approach in Equation (13) (Lui & Chen ; Weerts & El-Serafy ; van Leeuwen ). The function fix(A) rounds the elements of A toward zero, so that for the non-negative weights used here it returns their integer part. The re-sampling reduces the variance between ensemble weights such that low weighted ensemble members are discarded and replaced with high normalized weighted members (Moradkhani & Hsu ; Weerts & El-Serafy ; van Leeuwen ):

$$w = \frac{\exp\left[-(0.5/\beta^y)(y_t - \hat{y}_t)^2\right]}{\sum_{i=1}^{n} \exp\left[-(0.5/\beta^y)(y_t - \hat{y}_t)^2\right]} \tag{12}$$

$$w_r = \frac{n w - \mathrm{fix}(n w)}{n - \sum_{i=1}^{n} k_i}; \quad k_i = \mathrm{fix}(n w_i) \tag{13}$$

The re-sampled weights $w_r$ are mapped to new indexed ensemble members according to their ensemble weights


Figure 4 | Computational procedures for a sequential assimilation in the ParetoPF: integration of Pareto-optimality into the PF.

following the mapping procedure in Moradkhani & Hsu (). The new indexes are applied to determine the re-sampled model parameters ($z_{t\text{-resample}}$), states ($x_{t\text{-resample}}$), and streamflow predictions ($\hat{y}_{t\text{-resample}}$). The model parameters are then perturbed using Equation (14). The posterior expectation (mean) for the ensemble streamflow is determined using Equation (15):

$$z^+_t = z_{t\text{-resample}} + \upsilon_{t-1}, \quad \upsilon_{t-1} \sim N(0, \beta^z_{t-1}) \tag{14}$$

$$E(\hat{y}_t) = \sum_{i=1}^{N} \hat{y}_{t\text{-resample}} \tag{15}$$

where $\upsilon_{t-1}$ is the model parameter error with covariance $\beta^z_{t-1}$. Detailed information on the applied particle filtering procedure can be found in Moradkhani & Hsu () and other general sources (Gordon et al. ; Lui & Chen ; Bengtsson et al. ; Chen ; Weerts & El-Serafy ; Vossepoel & van Leeuwen ; Snyder et al. ; van Leeuwen ).

These updated members for t0 are populated into Pv and are applied into the SAC-SMA model to make v ensemble forecasts of streamflow for future time step t1. The ensemble forecasts (of streamflow) are used to determine the ensemble mean and its associated variance where they represent the background information for t1.

At t1, the updated population from t0 is used as the seed population where it is varied and evolved using the NSGA-II. A new evolved population Pv for t1 is determined where it is again updated using particle filtering. The above procedures are repeated for subsequent time steps to evolve previously updated population of members, assimilate evolved members using particle filtering, and update evolved members for future model forecasts. Note that as in the ParetoEnKF, streamflow forecasts for the 10 day lead times are determined using the same procedure for estimating the background information.
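The two update steps described in the ParetoEnKF and ParetoPF subsections can be illustrated with a short Python sketch for a scalar streamflow. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names and list-based data layout are hypothetical, the gain uses sample (co)variances with an n − 1 denominator, the exponent of the weight carries a negative sign as in a Gaussian likelihood, and fix is read as truncation toward zero.

```python
import math
import random


def enkf_update(members, y_pred, y_obs, obs_var):
    """Kalman-gain update in the spirit of Equations (8)-(11), scalar streamflow.

    members: list of n lists (parameters or states of each member)
    y_pred, y_obs: predicted and perturbed observed streamflow per member
    obs_var: variance of the observed streamflow (assumed > 0)
    """
    n, d = len(members), len(members[0])
    y_mean = sum(y_pred) / n
    pred_var = sum((y - y_mean) ** 2 for y in y_pred) / (n - 1)  # beta^{yy}
    gains = []
    for j in range(d):
        m_mean = sum(m[j] for m in members) / n
        cross = sum((m[j] - m_mean) * (y - y_mean)
                    for m, y in zip(members, y_pred)) / (n - 1)  # beta^{zy} or beta^{xy}
        gains.append(cross / (pred_var + obs_var))               # Equation (8) or (9)
    # Equation (10) or (11): add gain times innovation to every member
    return [[m[j] + gains[j] * (yo - yp) for j in range(d)]
            for m, yp, yo in zip(members, y_pred, y_obs)]


def pf_weights(y_pred, y_obs, obs_var):
    """Normalized Gaussian likelihood weights, in the spirit of Equation (12)."""
    log_w = [-0.5 * (yo - yp) ** 2 / obs_var for yp, yo in zip(y_pred, y_obs)]
    shift = max(log_w)  # subtracting the maximum avoids underflow in exp
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def residual_resample(weights, rng):
    """Residual re-sampling, in the spirit of Equation (13); returns member indexes."""
    n = len(weights)
    copies = [math.floor(n * w) for w in weights]  # k_i = fix(n w_i)
    indexes = [i for i, k in enumerate(copies) for _ in range(k)]
    n_left = n - sum(copies)
    if n_left > 0:
        residual = [n * w - k for w, k in zip(weights, copies)]  # numerator of w_r
        indexes += rng.choices(range(n), weights=residual, k=n_left)
    return indexes


# Hypothetical ensemble of 20 members with 3 parameters each.
rng = random.Random(0)
n = 20
params = [[rng.random() for _ in range(3)] for _ in range(n)]
y_pred = [rng.gauss(10.0, 1.0) for _ in range(n)]
y_obs = [10.5 + rng.gauss(0.0, 0.3) for _ in range(n)]

updated = enkf_update(params, y_pred, y_obs, obs_var=0.09)
weights = pf_weights(y_pred, y_obs, obs_var=0.09)
resampled = residual_resample(weights, rng)
```

In this sketch the re-sampled indexes would then be used to copy parameters, states and predictions of the selected members, after which the parameters are perturbed as in Equation (14).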


RESULTS AND DISCUSSION

The three assimilation methods were run from 2007 to 2010, with 1,000 ensemble members for each time step. The 1,000 ensemble size was chosen to be large enough to accommodate the dynamics from all model states and parameters in Table 1. In the EDA, 40 members are evolved through 25 generations to make up for the 1,000 ensemble members, whereas the ParetoEnKF and ParetoPF methods merge the evolved population of members from which the resulting Pareto-optimal set is determined with perturbed streamflow observations. Consequently, the number of updated members is 20 (i.e. 2n = 40) for each of the three methods. The observation error for streamflow, which is time variant, is estimated as the hourly variance for each day of streamflow data. A time-variant model error is estimated adaptively from the ensemble members using the procedure for estimating the background error outlined in the subsection on EDA. Given that the assimilation was conducted sequentially at a daily time step, the RMSE in Equation (6) is used together with the cost function in Equation (7) to evaluate candidate members (where k = 1, in both equations). It is noteworthy that the EDA is based on the NSGA-II procedure, so a standard crossover probability of 0.8 and a mutation probability of 1/r (where r is the number of variables) are used. The various updated ensemble members were applied to make streamflow forecasts for up to 10 days ahead, where each time step has 10 ensemble forecasts starting from 1 day, 2 day, up to 10 day lead times. The resulting streamflow forecasts for 1 day, 5 day and 10 day lead times are compared to the observed streamflows. The outputs for the updated ensemble members and their extended model forecasts for the three DA methods are presented and examined in the following subsections.

Convergence of model parameters for EDA updated members

It is noteworthy to begin by demonstrating the improved parameter convergence obtained through the EDA procedure. The distribution of ensemble parameter values for the EDA, ParetoEnKF, and ParetoPF methods are compared in Figures 5–8. The parameter distributions illustrate that the EDA approach has an improved convergence pattern compared to both ParetoEnKF and ParetoPF. The clustering pattern in ParetoPF is more consistent with a similar convergence pattern to the EDA than the ParetoEnKF. The pattern of parameter convergence in the EDA procedure is due to its multiple evolution of ensemble members through continuous selection and variation (crossover and mutation) of competitive (non-dominated) members. That is, the selection of a subset of all evaluated members in the EDA plays a crucial role to enhance parameter convergence.

The convergence of model parameters in the EDA is further examined through the distribution of model parameter values for the updated ensemble members across all assimilation time steps. The level of convergence of model parameters is examined through clustering analysis where the persistence of cluster groups across all assimilation steps for each model parameter is evaluated. The clustering analysis is conducted on the ensemble parameter values where the appropriate number of clusters was determined using the knee procedure in Thorndike (). The number of cluster groups examined when determining the appropriate number of clusters is variable, typically between four and eight. The cluster with the largest membership is determined, along with its coverage of the parameter space, the centroid, and the lower and upper bounds to represent the converged parameter space with the largest weight.

The largest membership cluster for each model parameter across all assimilation time steps is shown in Table 2. It is noteworthy that the parameter values are all re-scaled between zero and one before application in the clustering analysis. As a result, the values shown in this table represent the re-scaled values. The coverage represents the proportion of members in the largest membership cluster in relation to the total number of members across all assimilation time steps. The coverage therefore quantifies the weight of the cluster with the largest membership and accounts for variability of cluster memberships due to variable cluster groupings. The coverage representing the level of convergence for each model parameter across all assimilation time steps is shown in Figure 9 for Dundas and Highway 5 stations.

The convergence of model parameter values illustrated for the EDA output is significant. The coverage illustrates


Figure 5 | Distribution of model parameter values in response to changes across assimilation time steps for the three methods: EDA, ParetoEnKF and ParetoPF: model parameters 1–8. (a) Model parameters 1–4. (b) Model parameters 5–8.

the recurrence of cluster groups such that across all assimilation time steps, the uncertainty applied to precipitation was found to be between 0.166 and 0.499 of its re-scaled value, with a coverage of about 79%. About 29 model parameters out of 32 have converged to about 80% across all assimilation time steps for both Dundas and Highway 5 watersheds. The high level of convergence for the 29 model parameters means that their clustered intervals are consistently reliable across the assimilation time steps. The significance of these findings is that the convergence of parameter values across different observation/assimilation time steps is valuable in the retrieval of variables that are not explicitly observed. These illustrations show the potential of the EDA approach to examine the convergence of model parameters and their associated clusters through time in order to determine their relationships, sensitivities,


Figure 6 | Distribution of model parameter values in response to changes across assimilation time steps for the three methods: EDA, ParetoEnKF and ParetoPF: model parameters 9–16. (a) Model parameters 9–12. (b) Model parameters 13–16.

and their responses to changes in observation and forcing data.

Evaluation of streamflow assimilations

The updated ensemble estimates of streamflow for the three methods are compared in the following. A temporal comparison between the updated ensemble estimates (the analysis) and the observations is shown in Figure 10. The evaluation of the ensemble means in comparison to the observations is shown in Table 3 using the Nash–Sutcliffe efficiency (NSE) in Equation (16) and percent bias in Equation (17). The percent bias represents the proportion of the estimation which is biased such that a minimum value of zero indicates unbiased estimation, whereas values


Figure 7 | Distribution of model parameter values in response to changes across assimilation time steps for the three methods: EDA, ParetoEnKF and ParetoPF: model parameters 17–24. (a) Model parameters 17–20. (b) Model parameters 21–24.

greater than zero indicate the level of bias in the estimation:

$$\mathrm{NSE} = 1 - \left[ \frac{\sum_{t=1}^{m} (y_t - \hat{y}_t)^2}{\sum_{t=1}^{m} (y_t - \bar{y})^2} \right] \tag{16}$$

$$\text{percent bias} = 100 \times \frac{\sum_{t=1}^{m} |\hat{y}_t - y_t|}{\sum_{t=1}^{m} y_t} \tag{17}$$

where $y_t$ is the observed streamflow at time t, $\hat{y}_t$ is the estimated streamflow at time t, $\bar{y}$ is the mean of observed streamflow, and m is the number of data points.

The three methods produce similar streamflow estimations when compared to the observations. The assessment of estimated streamflows using evaluation measures is similar at both upstream and downstream stations for all three methods. For example, streamflow evaluations using the NSE are: 0.894 for EDA, 0.897 for


Figure 8 | Distribution of model parameter values in response to changes across assimilation time steps for the three methods: EDA, ParetoEnKF and ParetoPF: model parameters 25–32. (a) Model parameters 25–28. (b) Model parameters 29–32.

ParetoEnKF, and 0.900 for ParetoPF at Dundas gauge station. The similarity between the assimilations for the three methods has important implications. For example, the similarity suggests that one method can be easily replaced with another method with very little effect on the accuracy of the assimilated streamflow. Although the three methods have different computational pathways for assimilation, their ensemble means are comparable, and each method appears adequate to merge simulated streamflows with perturbed observations.

Evaluation of streamflow forecasts

The above results illustrate that improvements to the Pareto-optimal members based on the EnKF and PF methods are minimal and could be ignored at the assimilation stage. The nearly identical ensemble means from the three methods suggest that comparable updated members could be determined from different state and model parameterizations. Given these nearly identical updated estimates, a persistent question is the response of future streamflows to state and model


parameter updates from the three comparable sets of updated members. That is, since the updated members are similar for the three different methods, it is desirable to determine whether streamflow forecasts that are generated using state and parameter updates from the corresponding three methods are also the same. These questions and their implications are examined in this subsection. A comparison between the observations and forecast streamflows for the three methods is shown in Figure 11 for Dundas at the downstream outlet. The evaluation measures for these forecasts at both upstream and downstream stations are presented in Table 3. The forecast streamflows from the updated members are different compared to the comparable outputs obtained at the assimilation stage in the subsection on evaluation of streamflow assimilations. The forecasts from the EDA method have higher accuracy than the forecasts from the ParetoEnKF and ParetoPF methods. The rate of decrease in forecast accuracy from 1 day through to 10 day is smaller for the EDA method when compared to the rate of decrease in accuracy for the ParetoEnKF and ParetoPF forecasts. That is, the accuracy of streamflow forecasts deteriorates quickly with increasing lead time for ParetoEnKF and ParetoPF methods, whereas the differences between forecasts from the EDA method are much smaller for most time steps. Among the three methods, the EDA method produces the highest accuracy for streamflow forecasts, whereas forecast accuracy is similar for both ParetoEnKF and ParetoPF methods.

Table 2 | Model parameters and their converged intervals represented by the largest membership clusters with a definition of their coverage, centroids and lower and upper bounds estimated for the Dundas station. The coverage is presented as a fraction where a maximum value of one represents a perfectly converged cluster and a value close to zero represents a sensitive cluster

Parameter | Centroid | Lower bound | Upper bound | Coverage
UZTWM | 0.620 | 0.422 | 0.856 | 0.977
UZFWM | 0.439 | 0.211 | 0.766 | 0.930
UZK | 0.337 | 0.022 | 0.487 | 0.580
PCTIM | 0.154 | 0.012 | 0.470 | 0.928
ADIMP | 0.833 | 0.411 | 0.924 | 0.982
ZPERC | 0.436 | 0.222 | 0.764 | 0.950
REXP | 0.447 | 0.266 | 0.652 | 0.955
LZTWM | 0.582 | 0.388 | 0.799 | 0.991
LZFSM | 0.419 | 0.200 | 0.766 | 0.956
LZFPM | 0.548 | 0.333 | 0.699 | 0.955
LZSK | 0.801 | 0.640 | 0.940 | 0.880
LZPK | 0.072 | 0.002 | 0.230 | 0.982
PFREE | 0.232 | 0.011 | 0.436 | 0.996
RQ | 0.903 | 0.770 | 1.000 | 0.485
DDF | 0.438 | 0.210 | 0.592 | 0.963
SCF | 0.444 | 0.222 | 0.548 | 0.881
TR | 0.488 | 0.344 | 0.646 | 0.994
ATHORN | 0.527 | 0.344 | 0.636 | 0.966
RCR | 0.273 | 0.130 | 0.448 | 0.978
UZTWC | 0.332 | 0.126 | 0.410 | 0.950
UZFWC | 0.283 | 0.180 | 0.477 | 0.962
LZTWC | 0.448 | 0.263 | 0.666 | 0.904
LZFSC | 0.120 | 0.015 | 0.299 | 0.984
LZFPC | 0.121 | 0.012 | 0.299 | 0.980
ADIMC | 0.340 | 0.213 | 0.522 | 0.781
UHG1 | 0.145 | 0.014 | 0.288 | 0.984
UHG2 | 0.196 | 0.023 | 0.277 | 0.987
UHG3 | 0.271 | 0.142 | 0.377 | 0.983
SWE | 0.346 | 0.216 | 0.577 | 0.974
EVAPO | 0.439 | 0.222 | 0.599 | 0.604
PRECIP | 0.394 | 0.166 | 0.499 | 0.786
TEMPR | 0.403 | 0.288 | 0.599 | 0.810

These results emphasize the importance of continuous evaluation of the updated ensemble members for future time periods. For example, subsequent evaluation of the updated ensemble members for future time steps exposed accuracy differences in streamflow forecasts for the three methods that generate similar streamflow values at the assimilation stage. The rapid decrease in forecast accuracy for the ParetoEnKF and ParetoPF methods may illustrate a skewed association between the assimilated streamflows and their corresponding updates for state and model parameter components. In other words, state and model parameter updates that are performed based on the evolved ensemble members do not seem to improve model forecasts.

Discussion on design and performance of DA methods

The above sections have compared assimilation outputs for the three DA methods. The three methods produce similar assimilated streamflows, but their corresponding model forecasts for future time periods are different. The rationale for


Figure 9 | Percent convergence of model parameter spaces indicative of the coverage of the largest membership clusters. The coverage in this case is the ratio of the number of cluster members to the total number of members across all assimilation time steps. (a) Dundas station. (b) Highway 5 station.
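The coverage, centroid and bounds reported in Table 2 and Figure 9 can be computed from cluster labels in a few lines. The following Python sketch is a minimal illustration under assumptions: the clustering itself is taken as given (any method that returns one label per value), min-max re-scaling against the Table 1 interval is assumed, the centroid is taken as the cluster mean, and the function names and sample values are hypothetical.

```python
from collections import Counter


def rescale(values, lower, upper):
    """Min-max re-scaling of raw parameter values to the interval [0, 1]."""
    return [(v - lower) / (upper - lower) for v in values]


def largest_cluster_summary(values, labels):
    """Summarize the largest-membership cluster of one parameter.

    values: re-scaled parameter values pooled over all assimilation time steps
    labels: cluster label of each value, from any clustering method
    """
    label, size = Counter(labels).most_common(1)[0]
    members = [v for v, lab in zip(values, labels) if lab == label]
    return {
        "centroid": sum(members) / size,
        "lower": min(members),
        "upper": max(members),
        "coverage": size / len(values),  # cluster members / all members
    }


# Hypothetical UZK values with hypothetical cluster labels;
# the interval 0.01-0.2 is the UZK interval listed in Table 1.
raw = [0.03, 0.05, 0.04, 0.06, 0.18, 0.19, 0.05, 0.04]
labels = [0, 0, 0, 0, 1, 1, 0, 0]
summary = largest_cluster_summary(rescale(raw, 0.01, 0.2), labels)
# Six of the eight values fall in cluster 0, so the coverage is 6/8 = 0.75.
```

A coverage close to one then corresponds to a well-converged parameter, and a low coverage to a parameter whose values remain spread over several clusters.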

continuous evaluation of model forecasts from the updated ensemble members was to validate the assimilation procedure, and to ensure that the merger between simulation and perturbed observation for any time step is evaluated for future time steps. The results show that all three DA methods produce comparable updated ensemble members, and that EDA has the highest forecast accuracy and is preferable to ParetoEnKF or ParetoPF methods for streamflow forecasting.

These findings have important implications for the design of DA procedures for streamflow forecasting. First, continuous evaluation of the updated ensemble members for future time steps is as important as the determination of the updated members. Evaluation of the updated members for future time steps would ensure that the assimilation does not skew the optimal merger between simulation and observation by distorting state and model parameterizations. Second, accuracy improvements gained through the assimilation of Pareto-optimal members for different time steps are minimal and can be ignored at the assimilation stage. This was exemplified by similarities between assimilated streamflows from the three methods. Third, state and model parameter updates performed on Pareto-optimal members do not increase the forecasting performance of these members. This was illustrated by the high accuracy for EDA streamflow forecasts compared to forecasts made from either the ParetoEnKF or ParetoPF methods. Finally, the findings illustrate the forecasting performance for continuously evolving members that encapsulate the memory of past model states and


measurement (i.e. the SAC-SMA model) errors in the form of simulation–observation dynamics. These properties have been shown to improve the performance of streamflow forecasting, and could facilitate real-time predictions and the initiation of models from unknown initial conditions.

Figure 10 | Comparison between observations and updated ensemble members for EDA, ParetoEnKF, and ParetoPF at Highway 5 and Dundas gauging stations.

Table 3 | Evaluation of the updated ensemble members, and their forecasting performance for DA methods

Station     Measure        ParetoEnKF   ParetoPF   EDA

Evaluation for updated members
Highway 5   NSE            0.949        0.941      0.947
            Percent bias   0.142        0.149      0.148
Dundas      NSE            0.897        0.900      0.894
            Percent bias   0.152        0.154      0.157

Evaluation for 1 day forecast (background information)
Highway 5   NSE            0.756        0.766      0.836
            Percent bias   0.387        0.373      0.277
Dundas      NSE            0.739        0.764      0.777
            Percent bias   0.332        0.296      0.281

Evaluation for 5 day forecast
Highway 5   NSE            0.581        0.589      0.754
            Percent bias   0.482        0.471      0.431
Dundas      NSE            0.490        0.507      0.725
            Percent bias   0.484        0.466      0.324

Evaluation for 10 day forecast
Highway 5   NSE            0.554        0.555      0.666
            Percent bias   0.495        0.485      0.463
Dundas      NSE            0.426        0.437      0.625
            Percent bias   0.517        0.501      0.420

The ensemble means are used to compute the evaluation measures.

CONCLUSIONS

This study has illustrated the integration of Pareto-optimality into Kalman-type and PF-type assimilations. The study applied Pareto-optimality to obtain information on model state, improve model parameterizations, and to better estimate measurement error. This information was, in turn, incorporated into the EnKF and PF methods to improve their forecasting performance. Comparative evaluation was conducted to examine forecasting performance for the three methods: the EDA, and the methods based on the integration of Pareto-optimality into the EnKF and PF methods. The three methods assimilate daily streamflow into the SAC-SMA model in the Spencer Creek watershed in southern Ontario, Canada. The resulting updated ensemble members were, in turn, applied to predict streamflow for up to 10 day


Figure 11 | Comparison between observations and 1 day, 5 day and 10 day streamflow forecasts for EDA, ParetoEnKF, and ParetoPF methods at Dundas gauging station.

lead times where forecasts for 1 day, 5 day and 10 day lead times were compared to the observation data.

The results show that the optimal merger between simulations and observations for the three DA methods generates similar ensemble estimates. However, a subsequent evaluation of the updated members for future time periods yields different forecasting performance for the three methods. The ParetoEnKF and ParetoPF methods have similar forecasting performance, whereas the EDA method has the highest forecasting accuracy and could be the desired method for streamflow forecasting in the SAC-SMA model. The high performance of the EDA method illustrates that the continuous evolution and subsequent merging of Pareto-optimal members with perturbed observations provides an appealing framework to enhance the accuracy of streamflow forecasting. Additionally, the results illustrated the capability of the EDA approach to estimate convergent model parameter values and to identify persistent, as well as sensitive, model parameter spaces. It was found that the additional update steps from the EnKF method (for ParetoEnKF) and the PF method (for ParetoPF) generally degrade the convergence of model parameters and do not improve the overall accuracy of streamflow estimation.

While most studies emphasize assimilation results for DA methods, this study has illustrated the importance of a continuous evaluation of the updated ensemble members. The continuous evaluation of the assimilation includes a comparison of the updated ensemble members and their associated model forecasts to the observations. The accuracy of model forecasts from the updated members, which



typifies forecasting performance (and, indirectly, robustness) of the assimilation procedure, is equally as important as the estimation of the updated members themselves. That is, the continuous and comparative evaluation of the updated members and their associated model forecasts is an important measure for assessing the forecasting performance of DA methods. Future studies should compare the updated members from the EDA method to the standard EnKF and PF methods.
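The evaluation measures reported in Table 3 can be illustrated with a minimal sketch. It assumes the standard Nash-Sutcliffe efficiency (NSE) and a bias expressed as the summed error relative to the summed observations; the exact bias definition used by the authors is not given in this section, so `relative_bias` is an assumed form, and all function names are hypothetical. Following the table footnote, the measures are applied to the ensemble mean.

```python
from typing import List, Sequence


def ensemble_mean(members: Sequence[Sequence[float]]) -> List[float]:
    """Average the ensemble members at each time step."""
    return [sum(step) / len(members) for step in zip(*members)]


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """NSE = 1 - (sum of squared errors) / (spread of observations about their mean)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - sse / spread


def relative_bias(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Assumed definition: total (simulated - observed) divided by total observed."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    return sum(s - o for o, s in zip(observed, simulated)) / sum(observed)


if __name__ == "__main__":
    # Made-up numbers for illustration only; not data from the study.
    observations = [1.0, 2.0, 3.0, 4.0]
    forecast_members = [[0.9, 2.1, 3.2, 3.8], [1.1, 2.3, 3.0, 4.4]]
    mean_forecast = ensemble_mean(forecast_members)
    print(nash_sutcliffe(observations, mean_forecast))
    print(relative_bias(observations, mean_forecast))
```

Computing the same two measures on the updated members and on the 1 day, 5 day and 10 day forecasts is what allows the assimilation stage and the forecast stage to be compared on a common footing.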


First received 5 May 2012; accepted in revised form 10 May 2013. Available online 5 June 2013


© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

Encapsulation of parametric uncertainty statistics by various predictive machine learning models: MLUE method

Durga L. Shrestha, Nagendra Kayastha, Dimitri Solomatine and Roland Price

Durga L. Shrestha (corresponding author), CSIRO Land and Water, Highett, Australia. E-mail: durgalal.shrestha@csiro.au
Nagendra Kayastha, Dimitri Solomatine and Roland Price, UNESCO-IHE Institute for Water Education, Delft, The Netherlands
Dimitri Solomatine and Roland Price, Water Resources Section, Delft University of Technology, Delft, The Netherlands

ABSTRACT

Monte Carlo simulation-based uncertainty analysis techniques have been applied successfully in hydrology for quantification of the model output uncertainty. They are flexible, conceptually simple and straightforward, but provide only average measures of uncertainty based on past data. However, if one needs to estimate uncertainty of a model in a particular hydro-meteorological situation in real time application of complex models, Monte Carlo simulation becomes impractical because of the large number of model runs required. This paper presents a novel approach to encapsulating and predicting parameter uncertainty of hydrological models using machine learning techniques. Generalised likelihood uncertainty estimation method (a version of the Monte Carlo method) is first used to assess the parameter uncertainty of a hydrological model, and then the generated data are used to train three machine learning models. Inputs to these models are specially identified representative variables. The trained models are then employed to predict the model output uncertainty which is specific for the new input data. This method has been applied to two contrasting catchments. The experimental results demonstrate that the machine learning models are quite accurate. An important advantage of the proposed method is its efficiency allowing for assessing uncertainty of complex models in real time.

Key words | hydrological modelling, machine learning, MLUE, Monte Carlo, uncertainty analysis
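The workflow summarised in the abstract (run Monte Carlo simulations once on past data, then train a fast model that predicts the uncertainty bounds for new inputs) can be sketched in a few lines. This is only an illustration under strong assumptions: `toy_model` is a hypothetical one-parameter runoff model, plain unweighted Monte Carlo sampling stands in for GLUE, and a straight-line least-squares fit stands in for the machine learning models used in the paper. None of the names or numbers come from the study.

```python
import random
from statistics import mean


def toy_model(rain: float, k: float) -> float:
    """Hypothetical model: runoff is a fraction k of rainfall."""
    return k * rain


def quantile(values, p):
    """Linear-interpolation quantile of a list of numbers, 0 <= p <= 1."""
    ordered = sorted(values)
    idx = p * (len(ordered) - 1)
    lo = int(idx)
    hi = min(lo + 1, len(ordered) - 1)
    frac = idx - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def train_bounds(rain_series, n_samples=500, seed=0):
    """Offline step: sample parameters, run the model, learn the 5% and 95% bounds."""
    rng = random.Random(seed)
    ks = [rng.uniform(0.2, 0.8) for _ in range(n_samples)]  # assumed prior range
    lower = [quantile([toy_model(r, k) for k in ks], 0.05) for r in rain_series]
    upper = [quantile([toy_model(r, k) for k in ks], 0.95) for r in rain_series]
    return fit_line(rain_series, lower), fit_line(rain_series, upper)


def predict_bounds(models, rain):
    """Online step: predict the bounds for a new input without new model runs."""
    (a_lo, b_lo), (a_hi, b_hi) = models
    return a_lo * rain + b_lo, a_hi * rain + b_hi


if __name__ == "__main__":
    models = train_bounds([1.0, 2.0, 5.0, 10.0, 20.0])
    print(predict_bounds(models, 8.0))
```

The point of the design is that the expensive sampling happens once in `train_bounds`, while `predict_bounds` costs a single evaluation of the fitted model for each new input.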

doi: 10.2166/hydro.2013.242

INTRODUCTION

Hydrological models, in particular rainfall-runoff models, are simplified representations of reality and aggregate the complex, spatially and temporally distributed physical processes through relatively simple mathematical equations with parameters. The parameters of the rainfall-runoff models can be estimated in two ways (Johnston & Pilgrim ). First, they can be estimated from the available knowledge or measurements of the physical process, provided the model parameters realistically represent the measurable physical process. In the second approach, parameter values are estimated by calibration on the basis of the input and output measurements in situations when the parameters do not represent directly measurable entities or when it is too costly to measure them in the field. Conceptual rainfall-runoff models usually contain several parameters, which cannot be directly measured. Manual adjustment of the parameter values is labour intensive and its success is strongly dependent on the experience of the modeller. In the last two decades, a number of automated routines have been suggested (see e.g. Duan et al. ; Yapo et al. ; Solomatine ; Madsen ; Vrugt et al. ).

While considerable attention has been given to the development of calibration methods which aim to find a single best or Pareto set of values for the parameter vector, a realistic estimation of parameter uncertainty received



special attention over the last few years. It is now being broadly recognised that proper consideration of uncertainty in hydrologic predictions is essential for purposes of both research and operational modelling (Wagener & Gupta ). The value of hydrologic prediction for water resources-related decision-making processes is limited if reasonable estimates of the corresponding predictive uncertainty are not provided (Georgakakos et al. ). The research community has done quite a great deal in moving towards the recognition of the necessity of complementing point forecasts of decision variables by the uncertainty estimates, and nowadays it is widely recognised that along the modelling per se, there is a need to (i) understand and identify sources of uncertainty, (ii) quantify uncertainty, (iii) evaluate the propagation of uncertainty through the models, and (iv) find means to reduce uncertainty. Incorporating uncertainty into deterministic forecasts helps to enhance the reliability and credibility of the model outputs.

One may observe a significant proliferation of uncertainty analysis methods published in the academic literature, trying to provide meaningful uncertainty bounds of the model predictions. Pappenberger et al. () provide a decision tree to find the appropriate method for a given situation. However, the methods to estimate and propagate this uncertainty have so far been limited in their ability to distinguish between different sources of uncertainty and in the use of the retrieved information to improve the model structure analysed. These methods range from analytical and approximation methods (see e.g. Tung ) to Monte Carlo (MC) sampling-based methods (e.g. Beven & Binley ; Kuczera & Parent ; Thiemann et al. ) with the use of Bayesian approaches to determine the posterior distributions; methods based on the analysis of model errors (e.g. Montanari & Brath ); machine learning methods (Shrestha & Solomatine ; Solomatine & Shrestha ; Shrestha et al. ), and methods based on fuzzy set theory (see e.g. Maskey et al. ).

Due to complexities, or even impossibility of using analytical methods to propagate uncertainty from parameters to outputs for complex models, MC-based (sampling) techniques have been widely applied in studying uncertainty of hydrological models. A version of the MC simulation method was introduced under the term ‘generalised likelihood uncertainty estimation’ (GLUE) by Beven & Binley (). GLUE is one of the popular methods for analysing parameter uncertainty in hydrological modelling and has been widely used over the past 15 years to analyse and estimate predictive uncertainty, particularly in hydrological applications (see e.g. Freer et al. ; Beven & Freer ; Montanari ). Users of the GLUE (and actually of any MC method in general) are attracted by its simple understandable ideas, relative ease of implementation and use, and its ability to handle different error structures and models without major modifications to the method itself. Despite its popularity, there are theoretical and practical issues related with the GLUE method reported in the literature. For instance, Mantovan & Todini () argue that GLUE is inconsistent with the Bayesian inference processes such that it leads to an overestimation of uncertainty, both for the parameter uncertainty estimation and the predictive uncertainty estimation. For the account of different views at the methodological correctness of GLUE, readers are referred to the citation above and the subsequent discussions in the Journal of Hydrology in 2007 and 2008, and to the papers by Stedinger et al. () and Vrugt et al. ().

Since MC-based methods require a large number of samples (or model runs), their applicability is sometimes limited to simple models. In the case of computationally intensive models, the time and resources required by these methods could be prohibitively expensive. Alternative approximation methods have been developed (e.g. moment propagation techniques), which under certain assumptions are able to calculate directly the first and second moments without the application of MC simulation (see e.g. Rosenblueth ; Harr ; Melching ). A number of methods allow for reducing the number of MC simulation runs, for instance, Latin hypercube sampling (see e.g. McKay et al. ) but they may fail to provide reliable estimates of uncertainty.

For models with a large number of parameters, the sample size from the respective parameter distributions must be very large in order to achieve a reliable estimate of uncertainties (Kuczera & Parent ) (it is worth mentioning that this is a problem for all methods based on sampling and multiple model runs). One of the ways to address the problem of computational complexity in optimisation, random search or MC simulation, is to use a limited number of samples of parameter vectors and run the



hydrologic or hydraulic model in order to generate a data set which is then used as the calibration set for building an approximating regression model. This latter (fast) model (also called a meta-model, or surrogate model) is then used instead of the (slower) original model; such an approach is widely employed in industrial design optimisation, and in water resources problems was used, for example, by Solomatine & Torres () in model-based optimisation, and Khu & Werner () in reducing the number of MC simulations.

Yet another approach is to use more efficient sampling strategies, as was done by Blasone et al. () who used adaptive Markov chain Monte Carlo sampling within the GLUE methodology to improve the sampling of the high probability density region of the parameter space. Other examples of this approach are the delayed rejection adaptive Metropolis method (Haario et al. ), and the differential evolution adaptive Metropolis method, DREAM (Vrugt et al. ).

One of the practical observations concerning the GLUE method is that in many cases the percentage of observations falling within the prediction limits provided by GLUE is smaller than the given confidence level used to produce these prediction limits (see e.g. Montanari ). Xiong & O’Connor () modified the GLUE method to somehow resolve this issue, so that the prediction limits would envelop the observations better.

There is, however, an issue which is not widely discussed in the literature, and this is the assessment of model uncertainty when it is used in operation, i.e. when the new input data are fed into the model, in other words, uncertainty prediction. The MC simulation provides only the averaged uncertainty estimates based on the past data, but in real time forecasting situations there may be simply little time to perform the MC simulations for the new input data in order to assess the model uncertainty for a new situation. Recently, we proposed to use an artificial neural network (ANN) to emulate the MC simulation results obtained for the past data, and named this method MLUE – machine learning in parameter uncertainty estimation (Shrestha et al. ). The idea of this method is to use the data from MC simulations to train a statistical or machine learning model (with specially selected inputs) to predict the quantiles of the model error distribution. The MLUE method needs only a single set of MC simulations in off-line mode and allows one to predict the uncertainty bounds of the model prediction when the new input data are observed and fed into hydrological models (whereas the standard MC approach requires new multiple model runs for each new input).

In comparison with the previous study of Shrestha et al. (), the main contributions of this study are to (i) provide an extensive review of the state of the art of the uncertainty analysis methods used in hydrology, (ii) generalise the methodology and extend it further to approximate the probability distribution function of the model outputs, (iii) apply the methodology to a different study area, (iv) employ and compare different machine learning models to emulate MC simulation results, and (v) compare the methodology with yet another uncertainty analysis method. The HBV (Hydrologiska Byråns Vattenbalansavdelning) hydrological models of the Brue catchment in UK and Bagmati catchment in Nepal are used as case studies.

MACHINE LEARNING METHODS

In this section, we introduce briefly the main notions of machine learning and the methods used. The major focus of machine learning is to automatically produce (induce) predictive models from data. A machine learning algorithm estimates an unknown mapping (or dependency) between the inputs (predictors) and outputs (predictands) of a physical system from the available data (Mitchell ). As such a dependency (model) is discovered, it can be used to predict the future outputs of the system from the known input values. Machine learning techniques, based on observed data D = (X, y) = {x_t, y_t}, t = 1, 2, …, N, try to identify (learn) the target function f(x_t, w) describing how the real system behaves, where X is the matrix (x_t, vector) of the input data, y is the vector of the system’s response, N is the number of data, and w is the parameter vector of the function. Learning (or ‘training’) here is the process of minimising the difference between observed response y and model response ŷ through an optimisation procedure. Such a model f is often called a ‘data-driven model’. For a recent overview of data-driven modelling in water-related issues, see for example Solomatine & Ostfeld (), Maier et al. (), and Elshorbagy et al.



(a, b). A review of the application of machine learning

and the results are interpretable. An example of using MT

techniques to estimate the uncertainty of rainfall-runoff

in the role of a data-driven rainfall-runoff model can be

models can be found in Shrestha & Solomatine () and

found in Solomatine & Dulal ().

Shrestha et al. (). Three machine learning methods namely ANN, model

Locally weighted regression

tree, and locally weighted regression (LWR), are used in this study. Among them, ANN is the most popular technique

LWR is a method that builds a regression model selecting

and has been extensively used in hydrological modelling

only a limited number of examples close to the vector of

over the past 15 years (see for example early papers

input xq (often called a query vector). The selected examples

by Minns & Hall (), Maier & Dandy (), Abrahart

are assigned weights according to their distance to the query

& See (), Govindaraju & Rao (), Dawson &

vector, and regression equations are generated using the

Wilby () and Dibike & Solomatine ()). The follow-

weighted data. The word ‘local’ in the ‘locally weighted

ing sections present a brief overview of the other two

regression’ means that the function is approximated based

methods which are less known in the water and environ-

on data in the locality of the query vector, and it is

mental modelling community.

‘weighted’ because the contribution of each training example is weighted by its distance from the query vector.

The regression function f built for the neighbourhood of the query vector xq can be a linear or non-linear function, an ANN, etc. Various distance-based weighting schemes can be employed (given in Appendix A, available online at http://www.iwaponline.com/jh/016/242.pdf). For a detailed description of the LWR method, readers are referred to Aha et al. (), and its application in rainfall-runoff modelling is reported in Solomatine et al. ().

Model trees

A model tree (MT) is a hierarchical (or tree-like) modular model which has splitting rules in non-terminal nodes and linear regression functions at the leaves of the tree. In fact, it is a piece-wise linear regression model. In the mid 1980s, the Australian researcher Dr J. Ross Quinlan suggested the so-called M5 algorithm to build MT (Witten & Frank ); this is an iterative scheme that progressively splits the examples in the space of inputs {x1, x2, …, xn} using the criterion xi < A, where i and A are the values chosen at each iteration according to the ‘splitting criterion’. This criterion is based on the standard deviation of the output values (in rainfall-runoff models this is runoff) in the resulting subsets, which is used as a measure of the possible regression model error if it is built for this subset. All values of i and A are examined, and the M5 algorithm performs the split that ensures the smallest standard deviation in the resulting subsets; the splitting iterations continue, trying to perform the best possible split of each of the resulting subsets. At a certain moment splits are stopped, and linear regression models are built for each of the resulting subsets. The splitting procedure can be presented as a hierarchy, or a tree, where the splitting rules are in the intermediate nodes and the linear models are associated with the tree leaves. MT can tackle tasks with very high dimensionality, up to hundreds of variables. Compared to other machine learning techniques, MT learning is fast

METHODOLOGY

The original version of the main ideas of the MLUE method can be found in Shrestha et al. () (in open access); here we present only a brief description of it, although in a more formalised and generalised fashion. The basic idea is to estimate the uncertainty of the hydrological model under the following assumptions. First, that the model uncertainty is different in various hydro-meteorological conditions, and depends on the corresponding forcing input and the model states (e.g. rainfall, antecedent rainfall, soil moisture, etc.). Second, that the uncertainties associated with the prediction of the hydrological variables such as runoff in similar hydro-meteorological conditions are also similar. By ‘hydrological conditions’, we mean a vector of variables representing such conditions – the combination of the particular values of the input and state variables (possibly lagged and transformed), which are seen as the driving forces


generating runoff. This assumption is quite natural: e.g. typically prediction error (and hence uncertainty) is higher in the case of peak flows during extreme events than for low flows – however, a proper statistical analysis to support the validity of this assumption is still to be done.

Figure 1 | Schematic diagram of the MLUE method.

The flow chart of the MLUE methodology is presented in Figure 1. Let us assume that S various vectors of parameters or inputs are sampled and for each of them the hydrological model M is run, generating a time series of the model output $\hat{y}$. The results are presented in the matrix form $Y = \{\hat{y}_{t,s}\}$, where $t = 1, \ldots, N$, $s = 1, \ldots, S$, N is the number of time steps and S is the number of simulations. Note that each row of the matrix Y corresponds to the particular forcing vector $x'_t$, i.e.

$$\{\hat{y}_{t,1}, \ldots, \hat{y}_{t,S}\} = \{M(x'_t, \theta_1), \ldots, M(x'_t, \theta_S)\} \tag{1}$$

where $\theta$ is the parameter vector of the model M. Similarly, each column of matrix Y, i.e. $\{\hat{y}_{1,s}, \ldots, \hat{y}_{N,s}\}^T$, is one realisation of MC simulations corresponding to the parameter set $\theta_s$. Note that Equation (1) does not represent predictive uncertainty $P_t(y \mid \hat{y})$, which is the uncertainty related to the actual value given the model predictions and all the information and knowledge available up to the present (Mantovan & Todini ; Todini ). Rather, by and large it represents uncertainty of the model predictions due to the parameter


uncertainty, i.e. $P_t(\hat{y} \mid x_t, \theta)$. Estimating quantiles of the distribution of $\hat{y}$ or the probability density $P_t(\hat{y} \mid x_t, \theta)$ is not always practical in real-time applications (e.g. for computationally expensive environmental models). However, we can approximate $P_t(\hat{y} \mid x_t, \theta)$ by estimating its quantiles using the MLUE method. Our intention is to build a regression (machine learning) model U which is relatively efficient (fast) and can encapsulate these uncertainty results in the following form:

$$\text{the statistical properties } z \text{ of } \hat{y}_t = U(x_t) \tag{2}$$

where $z = \{z_1, \ldots, z_K\}$ is a set of desired statistical properties; $x$ is the input vector of the model U, which is constructed from the forcing input variables $x'$, model state $s$ and possibly model output $\hat{y}$ (all possibly combined, transformed and/or lagged). A way to construct the input space $x$ is described in the next section. To characterise the uncertainty of the model M prediction, the following uncertainty descriptors can be considered.

1. The prediction variance $\sigma_t^2(\hat{y}_t)$

$$\sigma_t^2(\hat{y}_t) = \frac{1}{S-1} \sum_{s=1}^{S} (\hat{y}_{t,s} - \bar{y}_t)^2 \tag{3}$$

where $\bar{y}_t$ is the mean of the MC realisations at time step $t$.

2. The prediction quantile $Q'_t(p)$ of $\hat{y}_t$ corresponding to the $p$th [0, 1] quantile

$$P(\hat{y}_t < Q'_t(p)) = \sum_{s=1}^{S} w_s \,\big|\, \hat{y}_{t,s} < Q'_t(p) \tag{4}$$

where $w_s$ is the weight given to the model output at simulation $s$, and $\hat{y}_{t,s}$ is the value of the model output at time $t$ simulated by the model $M(x, \theta_s)$. The use of weights is assumed when the GLUE framework is used.

3. The conditional prediction quantile $Q_t(p)$ corresponding to the $p$th quantile

$$Q_t(p) = Q'_t(p) - \hat{y}_t^{opt} \tag{5}$$

where $\hat{y}_t^{opt}$ is the output of the calibrated (optimal) model. Note that the quantiles $Q_t(p)$ obtained in this way are conditional on the model structure, inputs and other parameters (e.g. when the GLUE framework is used, this is the likelihood weight vector $w_s$).

4. The prediction intervals $[PI_t^L(\alpha), PI_t^U(\alpha)]$ for the given confidence level of $1-\alpha$ ($0 < \alpha < 1$)

$$PI_t^L(\alpha) = Q_t(\alpha/2), \quad PI_t^U(\alpha) = Q_t(1-\alpha/2) \tag{6}$$

where $PI_t^L(\alpha)$ and $PI_t^U(\alpha)$ are the differences between the model output and the lower and upper bounds of the prediction intervals (PI), respectively, corresponding to the $1-\alpha$ confidence level.

If U in Equation (2) is treated as a quantile, the general equation for calculating the conditional prediction quantile (Equation (5)) can be presented as

$$Q_t(p) = U(x_t) + \xi \tag{7}$$

where $\xi$ is the error between the target quantile and the quantile predicted by the machine learning model. In particular, the two quantiles that represent the bounds of the PI (Equation (6)) can be calculated as follows:

$$PI_t^L(\alpha) = U_L(x_t) + \xi_L, \quad PI_t^U(\alpha) = U_U(x_t) + \xi_U \tag{8}$$

Since these prediction quantiles are derived from the current value of the model output (Equation (5)), the general model for the predictive quantile can be presented as

$$Q'_t(p) = U(x_t) + \hat{y}_t^{opt} \tag{9}$$

In particular, the upper and lower bounds of the PI of the model output are given by

$$PL_t^L = U_L(x_t) + \hat{y}_t^{opt}, \quad PL_t^U = U_U(x_t) + \hat{y}_t^{opt} \tag{10}$$

where $U_L$ and $U_U$ are the machine learning models for the lower and upper bounds of the PIs, respectively. It is worthwhile to mention that Equation (10) is valid for the


uncertainty descriptors in Equation (6) and it is assumed that there is an optimal (calibrated) model M.

Model U, after being trained on the historical calibration data (generated by MC simulations), encapsulates the underlying dynamics of the uncertainty descriptors of the MC simulations and maps the input (or more precisely, vectors in space x) to these descriptors. The model U can be of various types, from linear to non-linear regression models such as an ANN. The choice of model depends on the complexity of the problem to be handled and the availability of data. Once the model U is trained on the calibration data, it can be employed in operation to estimate the uncertainty descriptors such as quantiles for the new unseen input data vectors.

SELECTION OF INPUT VARIABLES FOR THE UNCERTAINTY MODEL

Selection of appropriate variables to serve as model inputs for the uncertainty model U is extremely important as they should be relevant for the particular modelling exercise and the type of the process model M and its inputs. For this, domain (expert) knowledge and analysis of the causal relationship between inputs and outputs should be used in combination. The following variables (or their combinations) of the process model M are considered as the candidates for being the input variables for model U: (i) input variables; (ii) state variables; (iii) outputs; (iv) time derivatives (rate of change) of the input data and state variables; (v) lagged variables of input, state and observed output; and (vi) other data from the physical system that may be relevant to the uncertainty descriptors. Since the nature of models M and U is very different, analysis techniques such as linear correlation or average mutual information between the uncertainty descriptors and the input data listed above may help in choosing the relevant input variables. Based on the domain knowledge and analysis of causal relationships, several structures of input data can be tested to select the optimal input data structure.

For example, if a model M is a conceptual hydrological model, it would typically use rainfall (Rt) and evapotranspiration (Et) as input variables to simulate the output variable runoff (Yt). However, the uncertainty model U, whose aim is to predict the error distribution of the simulated runoff, may be trained with the possible combination of rainfall and evapotranspiration (or effective rainfall), their several past (lagged) values, the lagged values of runoff, and, possibly, their derivatives and/or combinations.

VERIFICATION

The uncertainty model U can be validated in two ways: (i) measuring its predictive capability in approximating the uncertainty descriptors of the realisations of MC simulations; and (ii) measuring the ‘quality’ of representing uncertainty by using some indices.

Two performance measures, the coefficient of correlation (CoC) and the root mean square error (RMSE), are widely used to measure the predictive capability of models, and they can be employed for the uncertainty model as well. Besides these numerical measures, graphical plots such as scatter and time series plots of the uncertainty descriptors obtained from the MC simulations and their predicted values are used to judge the performance of the uncertainty model U.

For assessing the quality of model U, we use two measures (Shrestha & Solomatine ). Model U is considered to be good if the PI coverage probability and mean PI calculated for U are close to those calculated for the MC simulation data used to train U.

1. PI coverage probability (PICP). It measures the percentage of observations falling inside the PI and ideally should be equal to the confidence level used to generate these intervals. It is an indication of the quality of model U. PICP is given by:

$$PICP = \frac{1}{N}\sum_{t=1}^{N} C_t, \quad \text{with } C_t = \begin{cases} 1, & PL_t^L \le y_t \le PL_t^U \\ 0, & \text{otherwise} \end{cases} \tag{11}$$

where $y_t$ is the observed model output at time $t$.

2. Mean prediction interval (MPI). It measures the average width of the PIs (it gives an indication of how large the


uncertainty is) and is given by:

$$MPI = \frac{1}{N}\sum_{t=1}^{N} (PL_t^U - PL_t^L) \tag{12}$$

Besides these uncertainty statistics, a visual inspection of the plot of uncertainty bounds and of the observed model output can additionally provide significant information about how effective the uncertainty model is in enclosing the observed model outputs along the different input regimes (e.g. low, medium or high flows in hydrology). A more detailed description of the performance measures can be found in Shrestha & Solomatine () and Shrestha et al. ().

HYDROLOGICAL MODEL

A simplified version of the HBV model (Bergström ) was used. This is a lumped conceptual hydrological model which includes conceptual numerical descriptions of the hydrological processes at catchment scale. The model comprises subroutines for snow accumulation and melt, a soil moisture accounting procedure, routines for runoff generation, and a simple routing procedure. The model has 13 parameters; however, only nine parameters (see Table 1) are effective when there is no snowfall.

STUDY AREA

The MLUE approach has been tested on two contrasting catchments: Brue and Bagmati. The Brue catchment is located in the South West of England (Figure 2). It has a drainage area of 135 km² with an average annual rainfall of 867 mm and an average river flow of 1.92 m³ s⁻¹ (measured over the period from 1961 to 1990). The hourly rainfall, discharge, and weather data (temperature, wind, solar radiation, etc.) are computed from the 15-minute resolution data, which are available for the period 1993 to 2000. The catchment average rainfall data are used in the study. The hourly potential evapotranspiration is computed using the modified Penman method recommended by FAO (Allen et al. ). One year of hourly data from June 24 1994 to June 24 1995 is selected for calibration of the HBV model and data from June 25 1995 to May 31 1996 for the verification (testing) of the HBV model.

The Bagmati catchment is located in the central mountainous region of Nepal (Figure 3). Compared to the Brue, the Bagmati catchment is bigger, the data record is longer, the temporal resolution of the data is coarser (daily), and the quality of the data is comparatively poorer. It encompasses nearly 3,700 km² within Nepal and reaches the Ganges River in India. The catchment area draining to the gauging station at Pandheradobhan is about 2,900 km². Two thousand daily records from 1 January

Table 1 | Ranges and the optimal values of the HBV model parameters

| Parameter | Description and unit | Brue: range | Brue: value | Bagmati: range | Bagmati: value |
|---|---|---|---|---|---|
| FC | Maximum soil moisture content (L) | 100–300 | 160.335 | 50–500 | 450 |
| LP | Limit for potential evapotranspiration (–) | 0.5–0.99 | 0.527 | 0.3–1 | 0.90 |
| ALFA | Response box parameter (–) | 0–4 | 1.54 | 0–4 | 0.1339 |
| BETA | Exponential parameter in soil routine (–) | 0.9–2 | 1.963 | 1–6 | 1.0604 |
| K | Recession coefficient for upper tank (1/T) | 0.0005–0.1 | 0.001 | 0.05–0.5 | 0.3 |
| K4 | Recession coefficient for lower tank (1/T) | 0.0001–0.005 | 0.004 | 0.01–0.5 | 0.04664 |
| PERC | Maximum flow from upper to lower tank (L/T) | 0.01–0.09 | 0.089 | 0–8 | 7.5 |
| CFLUX | Maximum value of capillary flow (L/T) | 0.01–0.05 | 0.0038 | 0–1 | 0.0004 |
| MAXBAS | Transfer function parameter (T) | 8–15 | 12 | 1–3 | 2.02 |

Note: The uniform ranges of parameters are used both for calibrating the HBV model, and for analysis of the parameter uncertainty of the HBV model.
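The note to Table 1 says that the same uniform ranges are used for the parameter uncertainty analysis. A minimal Python sketch of such non-informative uniform sampling is given below. It is an illustration only: the dictionary `BRUE_RANGES` holds the Brue ranges as read from Table 1, while the function name `sample_parameter_sets`, the fixed seed and the independence of the draws are assumptions made here rather than details taken from the study. Running the HBV model for each sampled set is outside the scope of the sketch.

```python
import random

# Brue parameter ranges (lower, upper) as read from Table 1.
BRUE_RANGES = {
    "FC": (100.0, 300.0),
    "LP": (0.5, 0.99),
    "ALFA": (0.0, 4.0),
    "BETA": (0.9, 2.0),
    "K": (0.0005, 0.1),
    "K4": (0.0001, 0.005),
    "PERC": (0.01, 0.09),
    "CFLUX": (0.01, 0.05),
    "MAXBAS": (8.0, 15.0),
}


def sample_parameter_sets(ranges, n_samples, seed=0):
    """Draw n_samples parameter sets, each value uniform within its range."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        for _ in range(n_samples)
    ]
```

Each returned dictionary is one candidate parameter vector; in a Monte Carlo experiment the process model would be run once per dictionary.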


1988 to 22 June 1993 are selected for calibration of the process model (HBV hydrological model) and data from 23 June 1993 to 31 December 1995 are used for the verification of the process model. The first two months of calibration data are used as the warming-up period and hence excluded from the study. In separating the 8 years of data into calibration and verification sets, we follow the previous study of Solomatine & Shrestha ().

Figure 2 | The Brue catchment showing the dense rain gauge network (reproduced from Shrestha & Solomatine (2008) with permission from the International Association for Hydraulic Research). The horizontal and vertical axes refer to the easting and northing in British national grid reference co-ordinates. Circles denote the rainfall stations and triangles denote the discharge gauging stations. The location of the Brue catchment (solid circle) in the map of UK is shown in the inset.

Figure 3 | Location map of the Bagmati catchment considered in this study. Discharge measured at Pandheradobhan is used for the analysis (adopted from Solomatine et al. (2008)).

EXPERIMENTAL SETUP

Uncertainty analysis

Hydrological models are calibrated by using adaptive cluster covering (Solomatine ), an efficient randomised search method implemented in GLOBE software. The GLUE method is used in uncertainty analysis because it has now been widely used for uncertainty estimation in a variety of models of complex environmental systems (we do not discuss here how much it does (or does not) follow the Bayesian framework). No model is perfect (free from structural error), and observation and input data are not free from errors, so Monte Carlo simulation results considering only parameter uncertainty are not free from these sources of error. We tried to reduce such errors as much as possible by selecting the best (automatically calibrated) model (which is reasonably accurate), and by quality control of the input and observation data. Of course, uncertainty results from GLUE that consider only parameter uncertainty are contaminated by other sources of error; again, we tried to minimise them. In this, we follow many researchers using parametric uncertainty analysis. Though Beven claimed that GLUE can be applied to other sources of error as well, we explicitly consider only parameter uncertainty in this study. So we can assume that the uncertainty results produced by GLUE represent mostly the parametric uncertainty per se, and neglecting the contamination by other sources of error seems to be a reasonable thing to do. It is also worth noting that because of the informal likelihood function and cut-off threshold value used in GLUE to select the behavioural parameter sets, GLUE does not consider complete parameter uncertainty in the statistical sense (Vrugt et al. ).

The convergence of MC simulations was assessed by the authors in a previous publication (see Shrestha et al. ) to determine the number of samples required to obtain reliable results, and is not reported here. The parameters of the HBV model are sampled using non-informative uniform sampling without prior knowledge of individual parameter distributions other than a feasible range of values (see Table 1). We use the sum of the squared errors as the basis to calculate the generalised likelihood measure (see


Freer et al. ) in the form:

$$L(\theta_s \mid D) = \left(1 - \frac{\sigma_e^2}{\sigma_{obs}^2}\right)^{\lambda} \tag{13}$$

where $L(\theta_s \mid D)$ is the generalised likelihood measure for the $s$th model (with parameter vector $\theta_s$) conditioned on the observations $D$, $\sigma_e^2$ is the associated error variance for the $s$th model, $\sigma_{obs}^2$ is the observed variance for the period under consideration, and $\lambda$ is a user-defined parameter. We set $\lambda$ to 1, so Equation (13) is equivalent to the Nash–Sutcliffe coefficient of efficiency (CoE) (Nash & Sutcliffe ). The threshold value of CoE = 0 is selected to classify simulations as either behavioural or non-behavioural. The number of behavioural models is set to 25,000, which is based on the convergence analysis of MC simulations. Various uncertainty descriptors such as variance, quantiles, PIs and estimates of the probability distribution functions are computed from these 25,000 MC realisations. Note that these descriptors are computed using the likelihood measure (Equation (13)) as weights $w_s$ in Equation (4). The model parameter ranges used for MC sampling are given in Table 1. For the Bagmati catchment, 122,132 MC samples are first generated by setting a threshold value of 0.7 to obtain 25,000 behavioural samples. However, to be consistent with the Brue catchment experiment, model simulations with negative CoE are removed from further analysis, leaving 116,153 samples out of 122,132.

Input variables and data

The selection of input variables for the machine learning model U is based on the methods outlined in the previous section and in the publication of Shrestha et al. () and is not discussed here; the inputs are constructed from the forcing input variables (e.g. rainfall, evapotranspiration) used in the process models, and the observed discharge. The selected input variables are $RE_{t-9a}$, $Y_{t-1}$, $\Delta Y_{t-1}$ for the Brue catchment and $RE_{t-0}$, $RE_{t-1}$, $Y_{t-1}$, $Y_{t-2}$ for the Bagmati catchment, where

$RE_{t-\tau}$: effective rainfall at time $t-\tau$;
$Y_{t-\tau}$: discharge at time $t-\tau$, where $\tau$ is the lag time;
$RE_{t-9a}$: the average of $RE_{t-5}$, $RE_{t-6}$, $RE_{t-7}$, $RE_{t-8}$, $RE_{t-9}$;
$\Delta Y_{t-1} = Y_{t-1} - Y_{t-2}$.

For the Bagmati catchment, since the resolution of the data is daily (as opposed to hourly for the Brue), we do not consider the derivative (stepwise difference) of the flow as input to the model.

The same data sets used for calibration and verification of the HBV model are used for training and verification of model U, respectively. However, for proper training of the machine learning models, the calibration data set is segmented into two subsets: 15% of the data for cross-validation (CV) and 85% for training per se. The CV data set was used to identify the best structure of the machine learning models.

Machine learning models

A multilayer perceptron neural network with one hidden layer is used; the Levenberg–Marquardt algorithm is employed for its training. The hyperbolic tangent function is used for the hidden layer, and the linear transfer function for the output layer. The maximum number of epochs is fixed to 1000. A trial-and-error method is adopted to find the optimal number of neurons in the hidden layer; we tried numbers of neurons ranging from 1 to 10. It was found that 7 and 8 neurons for the lower and upper PI, respectively, gave the lowest CV error for the Brue. For the Bagmati catchment, the number of hidden neurons was reduced to 5 and 7.

Experiments with MT are carried out with various values of the pruning factor that controls the complexity of the generated model (i.e. number of the linear models) and hence the generalising ability of the model. We report the results of the MT which have a moderate level of complexity. Note that the CV data set has not been used in the MT; rather, it uses the whole calibration data set to build the model.

In the LWR model, we vary two important parameters – the number of neighbours and the weight function (see Appendix A, available online at http://www.iwaponline.com/jh/016/242.pdf). Several experiments are done with different combinations of these values and the best results are obtained with five neighbours and the linear weight function for the Brue and 11 neighbours and the Tricube weight function for the Bagmati catchment.
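The way the training targets for model U follow from the GLUE setup in this section (Equation (13) with λ = 1, the behavioural threshold, the likelihood-weighted quantiles of Equation (4) and the shift by the optimal model output in Equation (5)) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the data layout `sims[s][t]`, the strict inequality used for the threshold and the step-function definition of the weighted quantile are all choices made here.

```python
def coe(sim, obs):
    """Nash-Sutcliffe coefficient of efficiency, i.e. Equation (13) with lambda = 1."""
    mean_obs = sum(obs) / len(obs)
    err = sum((s - o) ** 2 for s, o in zip(sim, obs))  # proportional to sigma_e^2
    var = sum((o - mean_obs) ** 2 for o in obs)        # proportional to sigma_obs^2
    return 1.0 - err / var


def weighted_quantile(values, weights, p):
    """Smallest value whose cumulative normalised weight reaches p (Equation (4))."""
    total = sum(weights)
    cumulative = 0.0
    for v, w in sorted(zip(values, weights)):
        cumulative += w / total
        if cumulative >= p:
            return v
    return max(values)  # guard against rounding when p is close to 1


def conditional_quantiles(sims, obs, y_opt, p, threshold=0.0):
    """Targets Q_t(p) = Q'_t(p) - y_opt[t] (Equation (5)) from behavioural runs.

    sims[s][t] is the output of simulation s at time step t (assumed layout).
    """
    likelihood = [coe(sim, obs) for sim in sims]
    # Keep only behavioural simulations; '>' versus '>=' is an assumption.
    behavioural = [s for s, lik in enumerate(likelihood) if lik > threshold]
    weights = [likelihood[s] for s in behavioural]
    targets = []
    for t in range(len(obs)):
        values = [sims[s][t] for s in behavioural]
        targets.append(weighted_quantile(values, weights, p) - y_opt[t])
    return targets
```

Calling `conditional_quantiles` with p = 0.05 and p = 0.95 would give the two target series on which the lower and upper PI models are trained.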


Modelling the probability distribution function

In the previous study (Shrestha et al. ), we estimated the 90% PIs by building only two models predicting the 5 and 95% quantiles. In this paper, the methodology is extended to predict several quantiles of the model outputs to estimate the cumulative distribution functions (CDF) of the model outputs generated by the MC simulations. The methodology applied to estimate only two quantiles can be extended to approximate the full distribution of the model outputs. The procedure to estimate the CDF of the model outputs consists of (i) deriving the CDF of the realisations of the MC simulations in the calibration data, (ii) selecting several quantiles of the CDF in such a way that these quantiles can approximate the CDF, (iii) computing the corresponding prediction quantiles using Equation (5), (iv) constructing and training a separate machine learning model for each prediction quantile, (v) using these models to predict the quantiles for the new input data vector, and (vi) constructing a CDF from these discrete quantiles by interpolation. This CDF will be an approximation to the CDF of the MC simulations.

We select 19 quantiles from 5 to 95% with a uniform interval of 5%, and then an individual machine learning model is constructed for each quantile using the same structure of the input data and the model that was used for modelling the two quantiles. In principle, the optimal set of input data and the model structure could be different for each quantile, but we leave this investigation to future studies.

RESULTS

The HBV model is calibrated by maximising CoE. CoE values of 0.96 and 0.83 are obtained for the calibration period in the Brue and Bagmati catchments, respectively. We also experimented with more sophisticated performance measures taking into account different temporal scales and using step-wise line search (Kuzmin et al. ). The model is validated by simulating the flows for the independent verification data set, and CoE is 0.83 and 0.87 in the Brue and Bagmati catchments, respectively. The HBV model is quite accurate for the Brue catchment but its error (uncertainty) is quite high during the peak flows. Note that for the Bagmati catchment, the standard deviation of the observed discharge in the verification period is 54% higher than that in the calibration period, which apparently increases performance in the verification period.

Figure 4 shows a comparison of the 90% prediction bounds estimated by the GLUE and the three machine learning models in the verification period for the Brue catchment. One can see a noticeable difference among them in predicting the lower and upper bounds of the PI. For example, in the second peak of Figure 4(a), the upper bound of the PI is underestimated by the ANN compared to the MT and LWR. However, the lower bound is well approximated by the ANN compared to the other models. Furthermore, in Figure 4(b), the ANN overestimates two peaks, while the MT and LWR models underestimate them (Figure 4(d) and (f)). From Figure 4, it can be seen that the results of the three models are comparable. They reproduce the MC simulation uncertainty bounds reasonably well except for some peaks, in spite of the low correlation of the input variables with the PIs. The predicted uncertainty bounds follow the general trend of the MC uncertainty bounds although some errors can be noticed and the model fails to capture the observed flow during one of the peak events (Figure 4(a), (c), and (e)).

For the Bagmati catchment, it is found that only 49.79% of the observed discharge data is inside the 90% prediction bounds computed by the GLUE method in the calibration period and 61.48% in the verification period. Therefore, we follow the modified GLUE method (denoted by mGLUE) (Xiong & O’Connor ) to improve the capacity of the prediction bounds to capture the observed runoff data. The mGLUE method uses the bias-corrected MC simulations to estimate the uncertainty bounds. Compared to the original GLUE method (Beven & Binley ), the mGLUE method includes two more procedural steps. Firstly, for each behavioural parameter set, a simulation bias curve is constructed on the basis of the simulation series that are obtained using the calibration data. Thus, for a number S of the behavioural parameter sets, there will be S different simulation bias curves. Secondly, at each time step, with the new data input, all the different prediction values for the same observation are corrected by dividing by a common median bias value, before the derivation of the prediction limits.

Figure 5 presents the 90% prediction bounds estimated by the mGLUE and the three machine learning models in


Figure 4 | Hydrograph of 90% prediction bounds estimated by GLUE and machine learning methods for the Brue catchment in parts of the verification period. The black dots indicate the observed discharges and the dark grey shaded area – the prediction uncertainty that results from GLUE. The black lines denote the prediction uncertainty estimated by neural networks ((a) and (b)), model trees ((c) and (d)) and locally weighted regression ((e) and (f)).

Figure 5 | Same as Figure 4, but for the Bagmati catchment.


the verification period. With the mGLUE method, the percentage of the observations falling inside the bounds is increased to 65.26 and 67.52% in the calibration and verification periods, respectively. The machine learning models are able to approximate the mGLUE simulation results reasonably well. The results of the three machine learning models are comparable; however, one can see a noticeable difference between them when predicting the peaks. The highest peak in Figure 5(a) is overestimated by the ANN model, while the other two peaks in Figure 5(b) are underestimated.

Figure 6 and Table 2 present a summary of statistics of the uncertainty estimation in the verification period. The ANN model is very close to the MC simulation results. The MT and LWR are better than the ANN with respect to MPI (note that lower MPI is an indication of better performance); however, PICP shows that the prediction limits estimated by them enclose a relatively lower percentage of the observed values compared to those of the ANN.

So far we have compared the performance of the three machine learning models by analysing the accuracy of the prediction only; however, there are other factors to be considered as well. These include computational efficiency, simplicity or ease of use, number of training parameters required, flexibility, transparency, etc. Computational efficiency is shown in Table 3. One can see that the time required to generate uncertainty results by the MLUE methods in the verification period is significantly lower than that required by the GLUE method. Table 4 shows linguistic variables to describe the other factors mentioned above, together with the parameters of the machine learning models to be tuned. In ANN, we have only tuned one parameter – the number of hidden neurons. MT also contains one parameter – the pruning factor that has to be tuned. While in LWR, two parameters – number

Figure 6 | A comparison of statistics of uncertainty (PICP and MPI) estimated with GLUE, neural networks (ANN), model trees (MT), and locally weighted regression (LWR) in the verification period. (a) Brue catchment; (b) Bagmati catchment.

Table 2 | Performances of the models measured by the coefficient of correlation (CoC), root mean squared error (RMSE), the prediction interval coverage probability (PICP) and the mean prediction interval (MPI) in the verification data set

| Catchment | Model | Lower PI: CoC | Lower PI: RMSE (m3/s) | Upper PI: CoC | Upper PI: RMSE (m3/s) | PICP (%) | MPI (m3/s) |
|---|---|---|---|---|---|---|---|
| Brue | ANN | 0.86 | 0.56 | 0.80 | 1.59 | 77.00 | 2.09 |
| Brue | MT | 0.84 | 0.61 | 0.79 | 1.63 | 68.72 | 1.95 |
| Brue | LWR | 0.82 | 0.64 | 0.80 | 1.60 | 75.43 | 1.93 |
| Bagmati | ANN | 0.81 | 51.46 | 0.94 | 61.59 | 66.24 | 124.03 |
| Bagmati | MT | 0.81 | 50.25 | 0.95 | 52.14 | 59.05 | 120.59 |
| Bagmati | LWR | 0.86 | 44.56 | 0.96 | 50.37 | 59.16 | 121.73 |

Note: Bold type signifies the maximum value in each statistic.
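The two interval statistics reported in Table 2 follow directly from Equations (11) and (12). A minimal Python sketch is given below; the function names and the plain-list inputs are assumptions made for illustration, and PICP is returned as a percentage (as in Table 2) rather than as the fraction written in Equation (11).

```python
def picp(obs, lower, upper):
    """PI coverage probability (Equation (11)), returned as a percentage."""
    # C_t = 1 when the observation lies inside [lower_t, upper_t], else 0.
    inside = sum(1 for y, lo, up in zip(obs, lower, upper) if lo <= y <= up)
    return 100.0 * inside / len(obs)


def mpi(lower, upper):
    """Mean prediction interval (Equation (12)): average width of the PIs."""
    return sum(up - lo for lo, up in zip(lower, upper)) / len(lower)
```

For example, with observations `[1, 2, 3, 10]`, lower bounds `[0, 1, 2, 3]` and upper bounds `[2, 3, 4, 5]`, three of the four observations are enclosed, so PICP is 75% and MPI is 2. The two measures have to be read together, since a wider interval raises PICP at the cost of a larger MPI.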


of neighbours and weighting functions have been tuned to get optimal results. Such parameters are optimised by exhaustive search during model training. It can be observed that none of the models is superior with respect to all factors; however, one may favour ANN if the ranking is done by giving equal weight to all factors.

Modelling the probability distribution function

Figure 7 and Figure 8 show a comparison of the CDFs for the peak events estimated by the three machine learning methods for the Brue and Bagmati catchments, respectively. One can see that the CDFs estimated by the ANN, MT and LWR are comparable and are very close to the CDFs given by the GLUE simulations. The CDFs estimated by the ANN, MT and LWR models deviate a little more near the middle of the distribution for the peak event of 9 January 1996 in the Brue catchment (see Figure 7(b)), and a bit more at the higher percentile values for the peak event of 13 August 1995 in the Bagmati catchment (see Figure 8(b)).

From the visual inspection one can see that the CDFs are reasonably approximated by the machine learning methods. However, a rigorous statistical test may be required to conclude whether the estimated CDFs are significantly different from those given by the GLUE simulations. In this study, since we have limited data (only 19 points), the results of a significance test (e.g. Kolmogorov–Smirnov) may not be reliable.

Table 3 | Computational time for GLUE and MLUE

| | Brue, calibration | Brue, verification | Bagmati, calibration | Bagmati, verification |
|---|---|---|---|---|
| Number of data used | 8760 | 8217 | 2000 | 922 |
| GLUE | 16:34:00 | 11:45:00 | 7:45:00 | 6:41:00 |
| ANN | 2:07:00 | 0:04:00 | 1:03:00 | 0:01:30 |
| MT | 1:07:00 | 0:03:00 | 0:33:00 | 0:01:05 |
| LWR | 4:07:00 | 0:09:00 | 2:03:00 | 0:03:00 |

Note: The time (hh:mm:ss) is based on prediction of two quantiles (5% and 95%) and also includes data analysis and preparation time in the calibration period except for GLUE.

DISCUSSION

In this study, the uncertainty of the model output is assessed when the hydrological process model is used in simulation mode. However, this method can also be used in forecasting mode, provided that the process model is also run in forecasting mode. Note that we have not used the current observed discharge Qt as an input to the machine learning models because this variable is not available during model application (indeed, the value of this variable is calculated by the HBV model, and the machine learning model assesses the uncertainty of this output).

It is observed that the results of the machine learning models and the GLUE (or mGLUE) are visually close to each other. The model prediction uncertainty caused by parameter uncertainty is rather large. There could be several reasons for this, including the following (Shrestha et al. ): (i) the GLUE and mGLUE methods do not strictly follow the Bayesian inference process (Mantovan & Todini ) and overestimate the model prediction uncertainty; (ii) in the GLUE method, the uncertainty bound very much depends on the rejection threshold separating behavioural and non-behavioural models: in this study we use quite a low value of rejection threshold

Table 4 | Performance criteria of machine learning models indicated by linguistic variables

| Models | Model parameters (optimised) | Accuracy: CoC | Accuracy: PICP and MPI | Efficiency | Transparency | Rank |
|---|---|---|---|---|---|---|
| ANN | Number of hidden nodes | High | High | Medium | Low | 1 |
| MT | Pruning factor | Medium | Low | High | Medium | 2 |
| LWR | Number of neighbours and weight functions | Low | Medium | Low | High | 3 |
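The equal-weight ranking in Table 4 can be reproduced with a short calculation. The sketch below assumes a simple numeric coding of the linguistic variables (Low = 1, Medium = 2, High = 3); this coding is an illustrative assumption that the paper does not specify, and the function names are hypothetical. Under this coding the totals are 9, 8 and 7 for ANN, MT and LWR, which matches the Rank column.

```python
# Assumed numeric coding of the linguistic variables (not given in the paper).
SCORE = {"Low": 1, "Medium": 2, "High": 3}

CRITERIA = ["CoC", "PICP and MPI", "Efficiency", "Transparency"]

# Ratings transcribed from Table 4, in the order of CRITERIA.
RATINGS = {
    "ANN": ["High", "High", "Medium", "Low"],
    "MT": ["Medium", "Low", "High", "Medium"],
    "LWR": ["Low", "Medium", "Low", "High"],
}


def total_scores(ratings, weights=None):
    """Weighted sum of coded ratings; equal weights when none are given."""
    if weights is None:
        weights = [1.0] * len(CRITERIA)
    return {
        model: sum(w * SCORE[r] for w, r in zip(weights, rs))
        for model, rs in ratings.items()
    }


def rank_models(ratings, weights=None):
    """Model names ordered from highest to lowest total score."""
    totals = total_scores(ratings, weights)
    return sorted(totals, key=totals.get, reverse=True)


print(total_scores(RATINGS))
print(rank_models(RATINGS))
```

Passing a different `weights` list shows how the ranking shifts when, for instance, transparency is valued more than accuracy.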


Figure 7 | A comparison of cumulative distribution function (CDF) estimated with GLUE and neural networks (ANN), model trees (MT), and locally weighted regression (LWR) for the Brue catchment in a part of the verification period. (a) Peak event of 20 December 1995; (b) peak event of 9 January 1996.

(CoE value of 0) which produces relatively wider uncertainty bounds; and (iii) we consider only parameter uncertainty, thus implicitly assuming no model structure and input data uncertainty.

It can be noticed that the performance of the machine learning models in predicting the lower quantiles (5%, 10%, etc.) is relatively higher than for the upper quantiles (90%, 95%, etc.). This can be explained by the fact that the upper quantiles correspond to higher values of flow (where the HBV model is obviously less accurate) and higher variability, which makes prediction a difficult task. It is possible to develop a specific model only to simulate the peak observed data and their uncertainty, as well as one for the mean flows. In general, such a model performs better than the global model. In this study, we have used MT and LWR models for uncertainty estimation, which implicitly build local models internally. It would be interesting to build the local models explicitly, for example for high flow events. However, this is not always possible because of the training data requirements for such rare and extreme events.

When comparing the percentage of the observed discharge data falling within the uncertainty bounds (i.e. PICP) produced by the GLUE, it can be seen that this percentage is much lower than the specified confidence level used to generate these bounds. The low PICP value is consistent with the results reported in the literature (see e.g. Montanari ; Xiong & O’Connor ). The low ‘quality’ of the PIs obtained by the GLUE in enveloping the real-world discharge observations might be mainly due to the following three reasons (Shrestha et al. ): (i) by using the GLUE method we investigate only the parametric uncertainty, without consideration of uncertainty in the model structure, the input (such as rainfall, temperature data) and the output discharge data; (ii) we use a uniform distribution and ignore the parameter correlation; (iii) the results of the GLUE method depend on the (subjectively set) threshold value and likelihood measure for selecting the behavioural parameter sets.

To approximate the CDF, an individual machine learning model is constructed for each quantile with the same structure of the input data and the same model configuration. Thus, we


Figure 8 | Same as Figure 7, but for the Bagmati catchment in a part of the verification period. (a) Peak event of 14 September 1994; (b) peak event of 13 August 1995.
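Because each quantile in these CDF plots is predicted by a separate model, neighbouring quantiles can cross and the assembled CDF may fail to be monotonic. The paper leaves the correcting scheme open; the sketch below shows one simple possibility, sorting the predicted values at each time step (often called monotone rearrangement). It is an illustration under that assumption, not the authors' method, and the function name and numbers are hypothetical.

```python
def rearrange_quantiles(levels, values):
    """Return (levels, values) with values non-decreasing in the quantile level.

    levels: quantile levels, e.g. [0.05, 0.10, ...]
    values: predicted flow for each level, possibly crossing
    """
    if len(levels) != len(values):
        raise ValueError("levels and values must have the same length")
    sorted_levels = sorted(levels)
    sorted_values = sorted(values)  # smallest value goes to the lowest level
    return sorted_levels, sorted_values


# Hypothetical predictions for one time step: the 30% quantile (13.0)
# falls below the 20% quantile (14.0), so the raw CDF is not monotonic.
levels = [0.1, 0.2, 0.3, 0.4, 0.5]
values = [10.0, 14.0, 13.0, 17.0, 20.0]

fixed_levels, fixed_values = rearrange_quantiles(levels, values)
print(fixed_values)
```

Sorting leaves already ordered predictions unchanged, so it only alters the time steps where quantiles actually cross.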

have not undertaken the full-fledged optimisation of the model and the input data structure of the machine learning models, and there is hope of improving the results. Furthermore, one can notice that the estimated CDFs are not necessarily monotonically increasing (see e.g. the 30% quantile of the MT model for the second case study). This is not surprising given that individual models are built for each quantile independently. This deficiency can be addressed by a correcting scheme (to be developed) that would ensure monotonicity of the overall CDF.

In this paper, the MLUE method is applied to emulate the results of the GLUE and mGLUE methods; however, it can be used for other uncertainty analysis methods such as Markov chain Monte Carlo, Latin hypercube sampling, etc. Furthermore, the MLUE method can be applied in the context of other sources of uncertainty – input, structure or combined.

Since the machine learning technique is the core of the MLUE method, it may have a problem of extrapolation for extreme (rare) events. This means that the results are reliable only within the boundaries of the domain where the training data belong, and only a little beyond. In order to avoid the problem of extrapolation, an attempt should be made to ensure that the training data include various possible combinations of events, including the extremes (such as an extreme flood); however, this is not always possible since the extremes tend to be rather rare events. Like most uncertainty analysis methods, the MLUE method also presupposes the existence of a reasonably long, precise and relevant time series of measurements. As pointed out by Hall & Anderson (), uncertainty in extreme or unrepeatable events is more important than in situations where there are historical data sets, and this may require different approaches towards uncertainty estimation. The lack of sufficient historical data makes the uncertainty results from the model unreliable. This is actually true for all MC-based methods that use past data to make judgements about the future uncertainty.

The MLUE method is applicable only to systems whose physical characteristics do not change considerably with time. The results will not be reliable if the physics of the catchment (e.g. land use) and hydro-meteorological



conditions differ substantially from what was observed during the model calibration. If there is evidence of such changes, then the models should be re-calibrated.

The reliability and accuracy of the uncertainty analysis depend on the accuracy of the uncertainty models used, so attention should be given to these aspects as well. The proposed method does not consider the uncertainty associated with the model U itself. However, one could use CV data set to improve the accuracy of the model U by generalising its predictive capability.

CONCLUSIONS

This paper presents the further development, studying the relative performance and application of the MLUE method presented in its initial form by Shrestha et al. (), in predicting parameter uncertainty in rainfall-runoff modelling. The basic idea of the MLUE method is to encapsulate the computationally expensive MC simulations of a process model by an efficient machine learning model. (We used GLUE, a version of MC simulation method.) This model is first trained on the data generated by the MC simulations to encapsulate the relationship between the hydro-meteorological variables and the uncertainty statistics of the model output probability distribution, e.g. quantiles. Then the trained model can be used to estimate the latter for the new input data. The MLUE method is computationally efficient and can be used in real time applications when a large number of model runs are required. We use three machine learning techniques, namely ANN, MT and LWR, to predict several uncertainty descriptors of the rainfall-runoff model outputs. It is observed that the percentage of the observation discharge data falling within the prediction bounds generated by GLUE is much lower than the given certainty level used to produce these prediction bounds. Thus, we also apply the mGLUE (Xiong & O’Connor ) method to improve the percentage of the observations falling within the prediction bounds. On the two case studies we first demonstrate the application of the MLUE method to estimate the two quantiles (5 and 95%) forming the 90% PIs. Several performance indicators and visual inspection show that machine learning models are reasonably accurate to approximate the GLUE or mGLUE uncertainty bounds. It is also observed that the uncertainty bounds estimated by ANN, MT and LWR are comparable; however, ANN is a bit better than the other two models. Second, we extend the MLUE method to approximate the CDF of the model outputs, and the results demonstrate that the MLUE is performing quite well in estimating the CDF resulting from the GLUE (and mGLUE) methods. It can be recommended to direct further studies at testing applicability of the MLUE approach with other sampling methods, ensuring compatibility of the models for multiple quantiles to achieve monotonicity of the resulting approximation of CDF, considering multiple sources of uncertainty, and testing the method on more complex models.

ACKNOWLEDGEMENTS

Most of this work has been completed during the first author’s post doctorate research and second author’s PhD research at UNESCO-IHE Institute for Water Education, Delft, The Netherlands; these were partly funded by the European Community’s 7th Framework Research Program through the grants to the budget of the EnviroGRIDS, KULTURisk and WeSenseIt projects. WIRADA project (The Water Information Research and Development Alliances between CSIRO’s Water for a Healthy Country Flagship and the Australian Bureau of Meteorology) partly supported the first author for completing this manuscript. The authors sincerely thank the editor and the three anonymous reviewers for providing helpful and constructive comments to improve the manuscript.

REFERENCES

Abrahart, R. J. & See, L. Comparing neural network and autoregressive moving average techniques for the provision of continuous river flow forecasts in two contrasting catchments. Hydrological Processes 14, 2157–2172.
Aha, D., Kibler, D. & Albert, M. Instance-based learning algorithms. Machine Learning 6, 37–66.
Allen, R. G., Pereira, L. S., Raes, D. & Smith, M. Crop Evapotranspiration: Guidelines for Computing Crop Water Requirements. Irrigation and Drainage Paper No. 56, FAO, Rome. Available at: http://www.fao.org/docrep/X0490E/x0490e00.htm.
Bergström, S. Development and application of a conceptual runoff model for Scandinavian catchments. SMHI Reports RHO, No. 7, Norrköping, Sweden.
Beven, K. & Binley, A. The future of distributed models: Model calibration and uncertainty prediction. Hydrological Processes 6, 279–298.
Beven, K. & Freer, J. Equifinality, data assimilation, and uncertainty estimation in mechanistic modelling of complex environmental systems using the GLUE methodology. Journal of Hydrology 249, 11–29.
Blasone, R., Vrugt, J., Madsen, H., Rosbjerg, D., Robinson, B. & Zyvoloski, G. Generalized likelihood uncertainty estimation (GLUE) using adaptive Markov Chain Monte Carlo sampling. Advances in Water Resources 31, 630–648.
Dawson, C. W. & Wilby, R. L. Hydrological modelling using artificial neural networks. Progress in Physical Geography 25, 80–108.
Dibike, Y. B. & Solomatine, D. P. River flow forecasting using artificial neural networks. Journal of Physics and Chemistry of the Earth, Part B: Hydrology, Oceans and Atmosphere 26, 1–8.
Duan, Q., Sorooshian, S. & Gupta, V. Effective and efficient global optimization for conceptual rainfall-runoff models. Water Resources Research 28, 1015–1031.
Elshorbagy, A., Corzo, G., Srinivasulu, S. & Solomatine, D. P. a Experimental investigation of the predictive capabilities of data driven modeling techniques in hydrology – Part 1: concepts and methodology. Hydrology and Earth System Sciences 14, 1931–1941.
Elshorbagy, A., Corzo, G., Srinivasulu, S. & Solomatine, D. P. b Experimental investigation of the predictive capabilities of data driven modeling techniques in hydrology – Part 2: application. Hydrology and Earth System Sciences 14, 1943–1961.
Freer, J., Beven, K. & Ambroise, B. Bayesian estimation of uncertainty in runoff prediction and the value of data: an application of the GLUE approach. Water Resources Research 32, 2161–2173.
Georgakakos, K., Seo, D.-J., Gupta, H. V., Schaake, J. & Butts, M. M. Towards the characterization of streamflow simulation uncertainty through multimodel ensembles. Journal of Hydrology 298, 222–241.
Govindaraju, R. S. & Rao, A. R. Artificial Neural Networks in Hydrology. Kluwer Academic Publishers, Amsterdam, 348 pp.
Haario, H., Laine, M., Mira, A. & Saksman, E. DRAM: efficient adaptive MCMC. Statistical Computation 16, 339–354.
Hall, J. & Anderson, M. G. Handling uncertainty in extreme or unrepeatable hydrological processes – the need for an alternative paradigm. Hydrological Processes 16, 1867–1870.
Harr, M. Probabilistic estimates for multivariate analyses. Applied Mathematical Modeling 13, 313–318.
Johnston, P. & Pilgrim, D. Parameter optimization for watershed models. Water Resources Research 12, 477–486.
Khu, S.-T. & Werner, M. G. F. Reduction of Monte-Carlo simulation runs for uncertainty estimation in hydrological modelling. Hydrology and Earth System Sciences 7, 680–692.
Kuczera, G. & Parent, E. Monte Carlo assessment of parameter uncertainty in conceptual catchment models: the Metropolis algorithm. Journal of Hydrology 211, 69–85.
Kuzmin, V., Seo, D.-J. & Koren, V. Fast and efficient optimization of hydrologic model parameters using a priori estimates and stepwise line search. Journal of Hydrology 353, 109–128.
Madsen, H. Automatic calibration of a conceptual rainfall–runoff model using multiple objectives. Journal of Hydrology 235, 276–288.
Maier, H. R. & Dandy, G. C. Neural networks for the prediction and forecasting of water resources variables: a review of modelling issues and applications. Environmental Modelling & Software 15, 101–124.
Maier, H. R., Jain, A., Dandy, G. C. & Sudheer, K. P. Methods used for the development of neural networks for the prediction of water resource variables in river systems: current status and future directions. Environmental Modelling & Software 25, 891–909.
Mantovan, P. & Todini, E. Hydrological forecasting uncertainty assessment: Incoherence of the GLUE methodology. Journal of Hydrology 330, 368–381.
Maskey, S., Guinot, V. & Price, R. K. Treatment of precipitation uncertainty in rainfall-runoff modelling: a fuzzy set approach. Advance in Water Resources 27, 889–898.
McKay, M. D., Conover, W. J. & Beckman, R. J. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21, 239–245.
Melching, C. S. An improved-first-order reliability approach for assessing uncertainties in hydrologic modeling. Journal of Hydrology 132, 157–177.
Minns, A. W. & Hall, M. J. Artificial neural networks as rainfall-runoff models. Hydrological Science Journal 41, 399–417.
Mitchell, T. Machine Learning. McGraw-Hill, Singapore, 414 pp.
Montanari, A. Large sample behaviors of the generalized likelihood uncertainty estimation (GLUE) in assessing the uncertainty of rainfall-runoff simulations. Water Resources Research 41, W08406.
Montanari, A. & Brath, A. A stochastic approach for assessing the uncertainty of rainfall-runoff simulations. Water Resources Research 40, W01106.
Nash, J. & Sutcliffe, J. River flow forecasting through conceptual models – Part I – A discussion of principles. Journal of Hydrology 10, 282–290.
Pappenberger, F., Harvey, H., Beven, K., Hall, J. & Meadowcroft, I. Decision tree for choosing an uncertainty analysis methodology: a wiki experiment http://www.floodrisknet.org.uk/methods http://www.floodrisk.net. Hydrological Processes 20, 3793–3798.
Rosenblueth, E. Two-point estimates in probability. Applied Mathematical Modelling 5, 329–335.
Shrestha, D. L. & Solomatine, D. P. Machine learning approaches for estimation of prediction interval for the model output. Neural Networks 19, 225–235.
Shrestha, D. L. & Solomatine, D. P. Data-driven approaches for estimating uncertainty in rainfall-runoff modelling. International Journal of River Basin Management 6, 109–122.
Shrestha, D. L., Kayastha, N. & Solomatine, D. P. A novel approach to parameter uncertainty analysis of hydrological models using neural networks. Hydrology and Earth System Sciences 13, 1235–1248.
Solomatine, D. P. Two strategies of adaptive cluster covering with descent and their comparison to other algorithms. Journal of Global Optimization 14, 55–78.
Solomatine, D. P. & Torres, L. A. A. Neural network approximation of a hydrodynamic model in optimizing reservoir operation. In: Hydroinformatics ’96 (A. Muller, ed.). Balkema, Rotterdam.
Solomatine, D. P. & Dulal, K. N. Model trees as an alternative to neural networks in rainfall–runoff modelling. Hydrological Sciences Journal 48, 399–411.
Solomatine, D. P. & Ostfeld, A. Data-driven modelling: some past experiences and new approaches. Journal of Hydroinformatics 10, 3–22.
Solomatine, D. P. & Shrestha, D. L. A novel method to estimate model uncertainty using machine learning techniques. Water Resources Research 45, W00B11.
Solomatine, D. P., Maskey, M. & Shrestha, D. L. Instance-based learning compared to other data-driven methods in hydrological forecasting. Hydrological Processes 22, 275–287.
Stedinger, J. R., Vogel, R. M., Lee, S. U. & Batchelder, R. Appraisal of the generalized likelihood uncertainty estimation (GLUE) method. Water Resources Research 44, W00B06.
Thiemann, M., Trosset, M., Gupta, H. V. & Sorooshian, S. Bayesian recursive parameter estimation for hydrologic models. Water Resources Research 37, 2521–2535.
Tung, Y.-K. Uncertainty and reliability analysis. In: Water Resources Handbook (L. W. Mays, ed.). McGraw-Hill, New York, 7.1–7.65.
Todini, E. A model conditional processor to assess predictive uncertainty in flood forecasting. Journal of River Basin Management 6, 123–137.
Vrugt, J. A., ter Braak, C., Gupta, H. & Robinson, B. Equifinality of formal (DREAM) and informal (GLUE) Bayesian approaches in hydrologic modeling? Stochastic Environmental Research and Risk Assessment 23, 1011–1026.
Vrugt, J. A., Diks, C., Gupta, H. V., Bouten, W. & Verstraten, J. M. Improved treatment of uncertainty in hydrologic modeling: Combining the strengths of global optimization and data assimilation. Water Resources Research 41, W01017.
Wagener, T. & Gupta, H. V. Model identification for hydrological forecasting under uncertainty. Stochastic Environmental Research and Risk Assessment 19, 378–387.
Witten, I. H. & Frank, E. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann Publishers, San Francisco, USA, 371 pp.
Xiong, L. & O’Connor, K. An empirical method to improve the prediction limits of the GLUE methodology in rainfall-runoff modeling. Journal of Hydrology 349, 115–124.
Yapo, P., Gupta, H. V. & Sorooshian, S. Automatic calibration of conceptual rainfall-runoff models: sensitivity to calibration data. Journal of Hydrology 181, 23–48.

First received 20 December 2012; accepted in revised form 3 June 2013. Available online 25 July 2013


© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

Spatial evaluation of complex non-point source pollution in urban–rural watershed using fuzzy system

Haijun Wang, Wenting Zhang, Song Hong, Yanhua Zhuang, Hongyan Lin and Zhen Wang

ABSTRACT

Non-point source (NPS) pollution has become the major reason for water quality deterioration. Due to the differences in the generation and transportation mechanisms between urban areas and rural areas, different models are needed in rural and urban places. Since land use has been rapidly changing, it is difficult to define the study area as city or country absolutely, and the complex NPS pollution in these urban–rural mixed places is difficult to evaluate using an urban or rural model. To address this issue, a fuzzy system-based approach of modeling complex NPS pollutants is proposed, concerning the fuzziness of each land use and the ratio of belonging to an urban or rural place. The characteristic of land use, impact of city center and traffic condition were used to describe spatial membership of belonging to an urban or rural place. According to the spatial membership of belonging to an urban or rural place, the NPS distributions calculated by the urban model and rural model respectively were combined. To validate the method, Donghu Lake, which is undergoing rapid urbanization, was selected as the case study area. The results showed that the urban NPS pollutant load was significantly higher than that of the rural area. The land usage influenced the pollution more than other factors such as slope or precipitation. It also suggested that the impact of the urbanization process on water quality is noteworthy.

Key words | Donghu Lake, fuzzy system, non-point source pollution, urban–rural watershed

Haijun Wang, Song Hong (corresponding author), Yanhua Zhuang, Hongyan Lin, Zhen Wang
School of Resource and Environmental Science, Wuhan University, Wuhan, China
E-mail: environmentalanalytics@gmail.com

Wenting Zhang
Department of Geography and Resource Management, The Chinese University of Hong Kong, Hong Kong

doi: 10.2166/hydro.2013.266

INTRODUCTION

Excessive loads of pollution into rivers, lakes, reservoirs and estuaries are now becoming a major concern to water resource managers across the world (Shrestha et al. ; Jing & Chen ; Liu & Tong ). Non-point source (NPS) pollution significantly contributes to the deterioration of water quality (Leone et al. ; Ouyang et al. ) due to the difficulty in identifying, assessing and controlling the sources of this type of pollution. The major NPS pollutants are nitrogen (N) and phosphorus (P). Recently, numerous research efforts have been made to discover the process and spatial quality of NPS N and P pollutants to support prevention and mitigation measures (Zhang & Huang ). In particular, the NPS pollution can be classified into two types: agricultural/rural NPS and urban NPS (Edwin et al. ). As the generation and transportation process of NPS pollutants in urban and rural places are different, the models and the factors as well as the corresponding parameters must be different (William et al. ; Kim et al. ; Zhang et al. ; Phillips et al. ) to ensure accurate results. To begin with, close attention was paid to rural NPS pollution, as agricultural chemicals contributed to the NPS pollution a great deal. For example, the empirical quantitative approach, namely, the universal soil loss equation (USLE), was developed to predict large scale soil erosion and the designation of potential risk zones for agricultural plots (Pandey et al. ). Thanks to its low data and parameter requirements, in contrast to physically-based models, as well as its scale-independent geometric resolution (Renard et al. ; Bahadur ; Dumas et al. ; Volk et al. ), USLE is widely used in the evaluation of



the rural NPS pollutant loads by providing average annual soil erosion (Fistikoglu & Harmancioglu ; Haregeweyn & Yohannes ). Additionally, the other rural NPS evaluation model, the export coefficient model (ECM), is well-developed in determining NPS pollution (Do et al. ) with the simple model format for agricultural areas (Johnes & Heathwaite ) at the same time. In short, the model for rural NPS evaluation is fully developed and widely applied. Yet, as land use is changing from agricultural to urban, the natural soil surface is replaced with impermeable surfaces (Chris et al. ) which suffer from higher population density and more intensive human activities (Shon et al. ). This will influence the generation and transportation process of NPS pollutants. Recently, due to the wide process of urbanization all over the world, urban NPS pollution research has become more popular. For instance, Shon et al. () used a storm water management model to estimate the NPS pollutant loads; Bhaduri et al. () proposed a Geographical Information System (GIS)–NPS model to assess the NPS pollutant loads under urbanization by using the Long-Term Hydrologic Impact Assessment (L-THIA) model.

Even if the NPS evaluation models for urban or rural areas are well-developed, all these models concentrate on one aspect, urban or rural pollutant loads, which is insufficient to evaluate the NPS pollutant loads in the urban–rural mixed areas. In this study, we refer to the NPS pollution that occurs in urban–rural mixed areas and is caused by various pollutants in rural and urban surface runoff together as the complex NPS pollution. Since more and more urban–rural mixed areas are emerging, as the result of rapid economic development, some efforts have concentrated on the NPS pollution in mixed urban and rural watersheds and have mentioned that the process of urbanization impacted the water quality greatly (Wang et al. ; Zheng et al. ). For example, Chris et al. () measured the water quality in agricultural, urban and mixed land, and determined the water quality from these three places, but Chris et al. just compared the measured samples to confirm that the water quality varied in different areas and did not address the problem of how to evaluate the NPS pollutant loads in different areas. Shields et al. () pointed out that the urbanizing study area is different from the traditional urban or rural catchment. Thus, Shields et al. employed the Baltimore Ecosystem Study method to explore the impact of urbanization on the magnitude and export flow distribution of nitrogen, concluding that they are highly correlated. However, their studies still did not separate the rural region from the urban region for their study area, and the problem of modeling complex NPS pollution at urban–rural mixed places has not yet been solved.

To address the problem of evaluating the NPS pollutant loads in an urban–rural mixed area, Zhuang et al. () proposed a CA-AUNPS model to assess the spatial and temporal variation of complex NPS pollution for a lake watershed of central China. In this model, the Focal Neighborhood method was used as the coupling model to combine the export empirical model and the L-THIA model. In our study, a fuzzy membership-based approach is proposed. To our knowledge, it is difficult to classify an area into urban or rural absolutely due to the fact that the multiple or fuzzy characteristics of non-urban, partly-urban and urban states in the process of urban development are not solved (Liu & Phinn ). Conventionally, the land use can be classified into ‘0’ meaning non-urban or rural and ‘1’ meaning urban. According to this classification, urban land which is surrounded by rural land and the land use at the boundary of rural–urban areas may be misclassified and then result in mistakes in the evaluation of the NPS pollutant loads. The fuzzy membership can express the ratio of the land cell belonging to urban or rural, which ranges from 0 to 1. The fuzzy expression may be suitable for use in the NPS evaluation and has been employed to assist in the calculation of water quality (Yang et al. ). For example, Dixon () incorporated GIS, global positioning system, remote sensing (RS) and the fuzzy rule-based model to generate groundwater sensitivity maps. Besides the traditional calculation for groundwater quality, his methodology was further refined through a fuzzy rule-based model to incorporate land-use/pesticide application and soil structure information. Gemitzi et al. () combined GIS with fuzzy logic and multi-criteria evaluation techniques for data acquisition and the production of factor images. Then, they created the intermediate and final groundwater vulnerability maps based on factor images. In accordance with previous studies and the aims of this study, the fuzzy membership-based approach is employed to describe the fuzziness in land usage and then to express the complex NPS pollutants in rural and urban mixed areas.
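To make the idea of a membership between 0 and 1 concrete, the sketch below blends a rural-model load and an urban-model load for each land cell. The linear weighting, the function name and the sample values are assumptions made for illustration only; the passage above states that membership ranges from 0 to 1 but does not give the combination rule used in the study.

```python
def combine_loads(urban_load, rural_load, urban_membership):
    """Blend two model outputs by the cell's degree of belonging to urban.

    urban_membership = 0 means fully rural, 1 means fully urban.
    The linear blend is an illustrative assumption.
    """
    if not 0.0 <= urban_membership <= 1.0:
        raise ValueError("membership must lie in [0, 1]")
    return urban_membership * urban_load + (1.0 - urban_membership) * rural_load


# Hypothetical cells: (urban-model load, rural-model load, urban membership).
cells = [
    (8.0, 4.0, 0.0),   # crisp rural cell: rural load is used
    (8.0, 4.0, 0.25),  # mixed cell near the urban-rural boundary
    (8.0, 4.0, 1.0),   # crisp urban cell: urban load is used
]

combined = [combine_loads(u, r, m) for u, r, m in cells]
print(combined)
```

With memberships restricted to 0 or 1 the blend reduces to the conventional crisp classification, so the fuzzy form only changes the result for boundary and mixed cells.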



In this paper, the universal soil loss equation (USLE) model (Wischmeier & Smith ), the export coefficient model (ECM) and the long-term hydrologic impact assessment (L-THIA) model (Harbor ; Lim et al. ) are employed to evaluate the spatial distribution and quality of rural and urban NPS pollutant loads, respectively. In particular, the USLE model and the ECM are integrated to calculate the NPS pollutant loads under the hypothesis that the study area is rural, while the L-THIA model is used to obtain the urban NPS pollutant loads.

Generally, this study aims to develop an integrated method to assess complex NPS pollution under the process of urbanization, which accounts for the fuzziness of the real world. It involves the following objectives: (1) calculating rural and urban NPS pollutant loads by using well-developed NPS evaluation models, USLE, ECM and L-THIA models respectively; (2) classifying the study area into rural and urban by fuzzy membership function of the characteristic of land use, impact of city center and traffic condition; (3) combining the results of rural and urban pollutant loads calculating models, according to the fuzzy membership; and (4) carrying out the case study of a rapidly developing watershed, Donghu watershed in central China, to confirm the proposed methods.

STUDY AREA

The study area (Figure 1, 114°18′∼114°30′E, 30°30′∼30°38′N, 18,075 ha), the Donghu watershed, is located in the eastern portion of the city of Wuhan (Gao et al. ). The study site is one of the largest downtown lakes in China. In addition to the general functions of a lake, such as regulating climate, degrading pollution, providing living space for aquatic life and preventing flooding, the Donghu watershed has a significant impact on the ecological environmental safety of Wuhan. Due to the radial effects of urban centers, the impact of intense anthropogenic activities and urbanization on the watershed water quality is profound.

The land use classification is extracted from the LANDSAT TM images in 1991, 2002 and 2005 by the ERDAS software package, and the resolution of the RS images is 30 m × 30 m. Then the results are revised by the land usage pattern provided by ‘The Earth System Science Data Sharing Nets’.

Figure 2 shows the temporal land use patterns of the site in 1991, 2002 and 2005, respectively. From the view of land use pattern, the north-western portion of the area functions completely as a city under the urban expansion of Wuhan, while the obstruction of the lake still allows for rural properties. In 1991, the agricultural land accounted for 48.16% of the whole land area (the total area of built-up, forest, agricultural land). Therefore, the agricultural land area was larger than the built-up land area, which was still more scattered, with no centralized developing tendency at that time. By 2002, under the background of economic development of China, the western basin exhibited features of a city, as the western development rate was significantly higher than that of the eastern area. Accompanied with the significant process of urban expansion, the built-up land increased to 93.13 km² in 2005, while the eastern area was still rural because of the obstruction of the lake. The built-up area of Donghu watershed increased from 51.44 km² in 1991 to 93.13 km² in 2005, and the agriculture and forest were reduced to provide space for urban development (according to the statistic data of land use map). Generally, we can conclude that the Donghu watershed was a typical urban–rural mixed area in 2005.

METHODS

Firstly, assuming that the watershed is rural primarily, this study calculates the particulate pollutant loads by using the USLE model, considering the factors of slope, normalized difference vegetation index (NDVI), land use type, soil type and rainfall. Subsequently, the dissolved pollutant loads were determined by the ECM which uses the export coefficient and the corresponding land use pattern to establish the relationship of land use type and pollutant loads. In particular, NDVI is a simple graphical indicator to assess whether the target that has been observed contains live green vegetation or not (Rulinda et al. ). Secondly, assuming that the watershed is urban, the L-THIA model is used to generate the spatial distribution of NPS pollutant loads in terms of total phosphorus and nitrogen. Finally, fuzzy membership functions are established to define the rural and urban weights for each land use cell. As opposed to binary weight, the weights defined here are used to combine the results of the rural and urban NPS pollutant loads calculating models.
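The weighting step outlined above can be illustrated with a minimal sketch. It assumes the generalized bell function of Equation (15) as the membership shape, the product rule of Equation (18) and the weighted sum of Equation (19), all given later in this section. The shape parameters in `PARAMS` and all function names are hypothetical placeholders chosen for illustration; the paper derives its own parameters from expert opinion.

```python
def bell(x, a, b, c):
    """Generalized bell membership: 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


# Hypothetical (a, b, c) parameters, NOT the values used in the paper.
PARAMS = {
    "density": (0.5, 2.0, 1.0),          # peaks at built-up density 1
    "dist_center": (10000.0, 2.0, 0.0),  # metres; peaks at the city center
    "dist_road": (500.0, 2.0, 0.0),      # metres; peaks on the road itself
}


def urban_membership(density, dist_center, dist_road):
    """Product of the three single-factor memberships (Equation (18))."""
    return (
        bell(density, *PARAMS["density"])
        * bell(dist_center, *PARAMS["dist_center"])
        * bell(dist_road, *PARAMS["dist_road"])
    )


def combine_fuzzy(f_urban, nps_urban, nps_rural):
    """Membership-weighted load of one cell, with f_rural = 1 - f_urban."""
    return f_urban * nps_urban + (1.0 - f_urban) * nps_rural


def combine_binary(is_urban, nps_urban, nps_rural):
    """Binary weighting: a cell takes exactly one of the two model results."""
    return nps_urban if is_urban else nps_rural
```

With a membership of 0.5, a cell whose urban and rural model results are 30 and 20 kg/hm² a receives 25 kg/hm² a under the fuzzy weighting, whereas the binary weighting returns either 30 or 20. This is the difference between a gradual transition and a sharp break at the urban–rural boundary.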


Figure 1 | Location and land use pattern of the study watershed.

The spatial data used in our study, like NDVI, slope, distance to road, distance to city center and land use pattern, are retrieved from LANDSAT TM imageries with the resolution of 30 m × 30 m. The non-spatial data, including rainfall capacity and the soil type, are obtained from the Wuhan Statistical Year Book and other statistical sources.

Evaluating NPS pollutant loads in rural areas

In this section, the classic USLE model and the ECM are used to acquire the particulate and dissolved pollutant loads of N and P, respectively.

Particulate N and P loads based on USLE model

The USLE was proposed by Wischmeier & Smith (), and has since been widely used at a watershed scale. It is an empirical model that estimates the average annual soil loss based on the product of five erosion risk indicators (Meusburger et al. ). The empirical model to obtain particulate loads is represented in Equation (1):

$$W_{xp} = \beta \, C \, A \, \eta \, S_d \qquad (1)$$

where Wxp is the particulate pollutant load (kg/(hm² a)); β is the dimensionless unit conversion constant; CN and CP (the values of C for N and P) are the concentrations of particulate N and P, respectively, which are available from the almanac of soil in the Hubei and Henan provinces and the soil database of China provided by the Institute of Soil Science, Chinese Academy of Science, Nanjing; A is the amount of the soil loss (t/(hm² a)); η is the non-dimensional concentration coefficient; Sd is the ratio of the final pollutant loading into the lake to the original load generated in each cell. The specific value of each parameter is shown in Table 1 (Shi et al. ; Xu et al. ; Xue ).

Figure 2 | Temporal and historical landscape patterns of study area.

Table 1 | The description of parameters in Equation (1)

Parameter | Scale | Description
β | 1,000 | Converting t/hm² a to kg/hm² a
CN, CP | Adsorption P concentration is 0.048%; Adsorption N concentration is 0.084% | In red earth
A | 0–139.05 t/hm² a |
η | 2 | Referring to the relevant research, in this paper the η was identified as 2
Sd | 0.1–0.4 | Referring to the Sd of Changjiang basin provided by Changjiang Water Resources Committee and the feature of our study site, Sd was identified varying from 0.1 to 0.4 with the 0.25 as the average value for all the cells and then the Sd of grid (i, j) could be calculated depending on the distance from the grid to the lake

From the USLE model, the amount of the soil loss can be obtained as Equation (2) shows, where K is the soil erodibility factor ((t hm² h)/(hm² MJ mm)), P is the support practice factor (non-dimensional), C is the cover management factor (non-dimensional), R is the rainfall erosivity factor ((MJ mm)/(hm² h a)) and LS is the slope steepness factor (non-dimensional).

$$A = K \, P \, C \, R \, LS \qquad (2)$$

The soil erodibility factor (K) is related to the integrated effects of rainfall, runoff and infiltration on soil loss and can reflect the process of soil loss during storm events on upland areas (Renard et al. ). In our study, the soil type is general red earth and according to experimental data (Deng et al. ; Wang ; Zhang et al. ), the K value is denoted as 0.299.

The support practice factor (P) reflects the effects of soil conservation operations or other measures that will reduce the amount of water runoff and, thus, reduce the erosion rate (Volk et al. ), ranging from 0 to 1. The support practice factor can be obtained through considering the variation of the land use pattern. In this paper, by referencing soil conservation operations and relevant research (Bu et al. ; Cai et al. ; Xu & Shao ) on the study area, the P values are identified according to the land use type: the built-up land has a P value of 0.35; forest is 0.5; agriculture is 0.66; water body is 0.
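As a rough illustration, Equations (1) and (2) can be chained for a single cell as in the sketch below. It uses the K, P, β, η and average Sd values quoted in the text and Table 1, and assumes that the adsorption concentrations of 0.084% (N) and 0.048% (P) enter Equation (1) as mass fractions. The function names and the sample inputs for C, R and LS are hypothetical.

```python
BETA = 1000.0  # converts t/(hm^2 a) to kg/(hm^2 a), Table 1
ETA = 2.0      # concentration coefficient, Table 1
K = 0.299      # soil erodibility factor for red earth, from the text

# Support practice factor P by land use type, from the text.
P_FACTOR = {"built-up": 0.35, "forest": 0.5, "agriculture": 0.66, "water": 0.0}

# Assumption: Table 1 percentages expressed as mass fractions.
CONCENTRATION = {"N": 0.00084, "P": 0.00048}


def soil_loss(land_use, c, r, ls):
    """Equation (2): A = K * P * C * R * LS, in t/(hm^2 a)."""
    return K * P_FACTOR[land_use] * c * r * ls


def particulate_load(nutrient, a, sd=0.25):
    """Equation (1): Wxp = beta * C * A * eta * Sd, in kg/(hm^2 a).

    sd defaults to 0.25, the average delivery ratio quoted in Table 1.
    """
    return BETA * CONCENTRATION[nutrient] * a * ETA * sd


# Hypothetical cell: the C, R and LS inputs are placeholders for illustration.
a_cell = soil_loss("agriculture", c=0.33, r=5000.0, ls=0.15)
wxp_n = particulate_load("N", a_cell)
```

Because P is 0 for water bodies, both the soil loss and the particulate load of a water cell vanish regardless of the other factors.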


The cover management factor (C) is a weighted index, which takes the effect of land use on soil erosion into account (Dumas et al. ). It is measured as the ratio of soil loss to land cropped under continuously fallow conditions (Wischmeier & Smith ). By definition, C equals 1 under standard fallow conditions. As vegetative cover approaches 100%, the C factor value approaches the minimal value. The C value of each cell is obtained by Equation (3) (Cai et al. ; Zhao et al. ).

$$C = \begin{cases} 1, & l_c = 0 \\ 0.6805 - 0.3436 \lg l_c, & 0 < l_c < 78.3\% \\ 0, & l_c \ge 78.3\% \end{cases} \qquad (3)$$

$$l_c = \begin{cases} 0, & -1 \le \mathrm{NDVI} \le 0.0675 \\ \dfrac{\mathrm{NDVI} + 0.0675}{0.47}, & 0.0675 < \mathrm{NDVI} \le 0.4025 \\ 1, & 0.4025 < \mathrm{NDVI} \le 1 \end{cases} \qquad (4)$$

where lc is the vegetation coverage, non-dimensional. lc in Equation (3) can be obtained through the function of NDVI (see Equation (4)). An NDVI approaching a value of 1 means the associated area is fully covered by vegetation. Using NDVI retrieved from RS data, C values for our site can be calculated ranging from 0 to 1, with the average value of 0.3316 (see Figure 3(a)).

Figure 3 | Distributions of USLE factors: (a) the cover management factor C; (b) the slope length and steepness factor LS.

The rainfall factor (R) represents two characteristics of a storm that determine its erosivity: the amount of rainfall and the peak intensity sustained over an extended period. R is computed by using the function of monthly precipitation (Dumas et al. ) (see Equation (5)):

$$R = \sum_{i=1}^{12} \left( -2.6398 + 0.3046 P_i \right) \qquad (5)$$

where R is in MJ mm/(hm² h a), and Pi is the precipitation in the ith month which is obtained from the statistics yearbook.

The slope length and steepness factor (LS) represent the effect of topography on erosion, as an increase in slope length and steepness will produce higher overland flow velocities, thus, stronger erosion. LS is derived from Equation (6) (Wischmeier & Smith ; Dumas et al. ):

$$LS = \left( \frac{l}{22.13} \right)^{m} \left( 0.085 + 0.045\theta + 0.0025\theta^{2} \right) \qquad (6)$$

$$m = \begin{cases} 0.3, & 22.5^{\circ} \le \theta \\ 0.25, & 17.5^{\circ} \le \theta < 22.5^{\circ} \\ 0.2, & 12.5^{\circ} \le \theta < 17.5^{\circ} \\ 0.15, & 7.5^{\circ} \le \theta < 12.5^{\circ} \\ 0.10, & \theta < 7.5^{\circ} \end{cases} \qquad (7)$$

where l is the slope length in meters, θ is the slope angle in degrees, and m is the slope angle contingent variable ranging from 0.1 to 0.3 (McCool et al. ) computed by the function of slope (see Equation (7)). l and θ are calculated applying a 30 m resolution DEM and ArcGIS software package, followed by the values of m for each cell. LS values vary from 0.088 to 3.037, with an average of 0.152 (Figure 3(b)).

Dissolved N and P loads based on the export coefficient model

The ECM is a well-developed method that has been widely employed in NPS pollution studies (Kay et al. ) which avoids the difficulty of the physical models. This eliminates the difficulties associated with the complex formation of NPS pollution, thereby reducing the requirements of monitoring the processes of the migration and transformation of the pollutants. Thus, the ECM is available for estimating the NPS pollution for the medium or the large-scale watershed. This model is commonly represented in the form of Equation (8):

$$W_{xd} = \sum_{i}^{n} \sum_{j}^{m} E \times \alpha \qquad (8)$$

where Wxd is the output quantity of the dissolved pollutant (kg/hm² a), E is the export coefficient of the pollutant (t/km² a) on different land usages, and α is the conversion factor with the value of 10. The value of E is identified according to the literature review of NPS studies on the Yangtze River and city of Chongqing (Liu et al. ; Cao et al. ) and characteristic of our site (see Table 2).

Table 2 | Export coefficient of the pollutant (E) under hypothesis of rural area

Land use type | The concentration of dissolved N (t/km² a) | The concentration of dissolved P (t/km² a)
Built-up | 0.6 | 0.011
Forest | 0.119 | 0.007
Agricultural | 1.2 | 0.04
Water | 0 | 0

NPS pollutant loads in rural areas

Based on the USLE, integrated with the empirical model, and the ECM, the particulate pollutant load, Wxp, and the dissolved pollutant load, Wxd, are obtained respectively. Then, the NPS pollutant loads assuming the study area is rural, NPSrural_model, can be calculated by adding the particulate and dissolved NPS pollutant loads (see Equation (9)).

$$NPS_{rural\_model} = W_{xp} + W_{xd} \qquad (9)$$

Evaluating NPS pollutant loads in urban areas

A distributed hydrological-water quality model based on hydrological response units, the L-THIA model (Phillips et al. ), is selected to simulate the urban NPS pollutant loads. It takes long-term hydrological impacts on land use change into consideration, so it can be useful in researching the relationships between urbanization, surface runoff and urban NPS pollution (Yang et al. ). L-THIA was developed as an effective approach to estimate the NPS pollution resulting from past or proposed land use changes (Zhang et al. ).

Based on the L-THIA model, the NPS pollutant loads can be acquired through Equation (10):

$$NPS_{urban\_model} = AR \cdot AE \cdot UR \qquad (10)$$

$$AR = \frac{(RP - 0.2S)^{2}}{RP + 0.8S} \quad (RP \ge 0.2S) \qquad (11)$$

$$S = 25.4 \left( \frac{1000}{CN} - 10 \right) \qquad (12)$$

in which NPSurban_model is the NPS pollutant load (kg/hm² a); UR is the unit conversion constant, 10⁻²; and AE is the concentration of pollutant in the surface runoff for each land use type (mg/L). Due to the difficulty in collecting data, we identify the concentration of pollutant, AE, in different land use types by literature review and the detailed information is represented in Table 3. AR is the quantity of actual runoff, in mm, which can be retrieved from the function of total annual precipitation, RP, and potential maximum precipitation, S (see Equation (11) and Equation (12)). According to the precipitation data measured by the


monitoring station and the approach of interpolation, the annual precipitation of each cell is obtained (see Figure 4). The maximum precipitation can be identified by the CN value which is obtained from literature review. In Yang’s study (), the CN value of the Hanyang district, which is approximately 15 kilometers away from Donghu Lake, experiencing similar temperatures and rainfall as our site, was proposed. In addition, the CN value used in Yang’s paper had been modified by the antecedent moisture condition (AMC) already (Zhao ; Li et al. ; Wang et al. ) and the detailed values are presented in Table 4.

Table 3 | Concentrations of pollutants (AE) under the hypothesis of urban area

Land use type | Concentration of N (mg/L) | Concentration of P (mg/L)
Built-up | 3.92 | 0.4
Forest | 1.9 | 0.42
Agricultural | 5.7 | 1.6
Water | 0 | 0

Figure 4 | Spatial distribution of summed annual precipitation in 2005.

Table 4 | CN value in each land use type in the L-THIA model

Land use type | CN
Built-up | 98.81
Forest | 92.51
Agricultural | 96.02
Water | 0.00

Complex NPS loads in urban and rural mixed places

Due to the fact that generations and properties of NPS pollutants in rural and urban areas are totally different, it is essential to discriminate the rural and urban areas and use various models and parameters to evaluate the NPS in these two places. However, it is difficult to discriminate the rural and urban cells in a rapidly developing area where the rural and urban places are mixed and coexisting. Conventionally, an administrative boundary is employed to distinguish the characteristic of the study area. In China, a city administratively contains a built-up area, suburbs, and counties under city administration. Usually the built-up area is urban, and the suburbs are a mix of urban and rural. In fact, it is hard to distinguish, using an administrative boundary, between urban place and rural place which suffer totally different processes of NPS pollutant generation and transportation. Especially in some rapidly developing cities, rural places are continually changing into urban areas to satisfy the requirement of economic development and population growth.

Conventional classification methods divided the land into classic clusters, for example ‘0’ meaning rural or ‘1’ meaning urban. Fuzziness exists in the real world, especially in the boundaries where it is hard to judge or classify. Hence, the classic classification method would be unsuitable. A fuzzy membership function-based approach is proposed to define a cell as urban or rural which not only classifies the cell into an urban or rural cell, but also provides the membership of belonging to a rural or an urban place which can reflect the degree of belonging to a certain cluster and can be used to combine the urban and rural NPS evaluation results.

In our work, three factors are defined to evaluate whether a cell belongs to urban place: (1) characteristic of surrounding land use, (2) influence of city center, (3) traffic condition. The density of built-up land within the land use cell is used to express the characteristic of surrounding land use. The grids which are within 150 m are taken into consideration to calculate the density. The city center is defined as the CBD (Central Business District) of Wuhan, and the distance from the city center is employed to measure the influence of the city center. Finally, the distance to the


nearest road line is proposed to reflect the traffic condition, and a smaller distance indicates a more convenient traffic condition.

Particularly, the factors stated above can be denoted as $X = \{x_1, x_2, x_3, \ldots, x_m\}$ where m is the number of attribution, and the fuzzy mapping can be established: $\tilde{f}: X \to \vartheta(r)$. In other words, by function f the attribute of a certain land use cell, xi, can be mapped to the membership of belonging to a certain cluster j, which can be written as rij. The function f is determined by the suggestions of experts and the characteristic of the study area. According to the membership of single attribution, the comprehensive evaluation can be conducted. The membership of belonging to a certain cluster j can be calculated according to the fuzzy operator of rij (see Equation (13)). In our study, the multiple product of all memberships of single attribution is employed (see Equation (14)).

$$R_j = \otimes \, r_{ij} \qquad (13)$$

$$R_j = \prod_{i=1}^{m} r_{ij} \qquad (14)$$

Generally, a cell with a large density of built-up land, small distance to the city center and small distance to a road tends to be an urban cell. Inversely, the cell should be a rural cell. And then the bell-shaped function, the Gaussian curve function, and sigmf function, which is a function composed of the difference between two sigmoidal membership functions, can be used as fuzzy function (see Equations (15)–(17)) to express the relationship between attributions and the membership functions. The parameters of the fuzzy function are feasible to be generated according to the opinion of experts. The Analytic Hierarchy Process (AHP) (Saaty ) which is a pair-wise comparison approach has been used to extract the experts’ opinion.

$$f_{bell}(x; a, b, c) = \frac{1}{1 + \left| (x - c)/a \right|^{2b}} \qquad (15)$$

$$f_{sigmf}(x; a_1, c_1, a_2, c_2) = \frac{1}{1 + e^{-a_1 (x - c_1)}} - \frac{1}{1 + e^{-a_2 (x - c_2)}} \qquad (16)$$

$$f_{gaussian}(x; a, b, c) = a \, e^{-(x - b)^{2} / c^{2}} \qquad (17)$$

By suggestion of experts, the fuzzy functions are defined and Figure 5 represents the tendency line between membership and the value of factors. The fuzzy functions of belonging to a rural place are defined as frural = 1 − furban. Then, the membership of belonging to an urban place is defined as Equation (18) shows.

$$f_{urban}(X) = f_{urban1}(DeBuilt; a, b, c) \cdot f_{urban2}(DisCen; a, b, c) \cdot f_{urban3}(DisRoad; a, b, c) \qquad (18)$$

Finally, the NPS in land use cell can be calculated as Equation (19) shows. Otherwise, the integrated NPS would be calculated according to binary weight as Equation (20) shows.

$$NPS = f_{urban}(X) \, NPS_{urban\_model} + f_{rural}(X) \, NPS_{rural\_model} \qquad (19)$$

$$NPS = \begin{cases} NPS_{urban\_model} \times 1 + NPS_{rural\_model} \times 0, & \text{cell} \in \text{urban place} \\ NPS_{urban\_model} \times 0 + NPS_{rural\_model} \times 1, & \text{cell} \in \text{rural place} \end{cases} \qquad (20)$$

RESULTS AND DISCUSSION

Urban NPS

Urban NPS pollutant loads calculated by the L-THIA model are shown in Figures 6(a) and 7(a), assuming that the study area is totally urban. The L-THIA model determines the urban NPS pollutant loads through rainfall-runoff and concentration of pollutants within each land use type because in an urban system the land is covered by impervious areas and the influence of natural factors such as slope or soil type is less while the impact of human activities is larger. In the L-THIA model, the intensity of human activities is indirectly reflected by land use. As a result, the total nitrogen (TN) in the built-up land area is around 42 to 47 kg/hm² a, and that of the forest area is about 17 to 22 kg/hm² a. Meanwhile the agricultural land area undergoes the highest TN load, achieving about 60 kg/hm² a, and the total phosphorus (TP) load suffers similar spatial distribution to that of TN.
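The L-THIA calculation behind these urban loads, Equations (10)–(12), can be sketched as follows. The CN and AE values are those listed in Tables 4 and 3; the function names, the handling of water cells and the sample precipitation are assumptions made for illustration. S is called potential maximum precipitation in the text; in the standard SCS curve-number method the same quantity is known as the potential maximum retention.

```python
# Curve numbers from Table 4 (water is handled separately: its CN is 0).
CN = {"built-up": 98.81, "forest": 92.51, "agricultural": 96.02}

# Runoff concentrations AE in mg/L from Table 3.
AE = {
    "built-up": {"N": 3.92, "P": 0.4},
    "forest": {"N": 1.9, "P": 0.42},
    "agricultural": {"N": 5.7, "P": 1.6},
}

UR = 1e-2  # unit conversion: mm of runoff * mg/L -> kg/hm^2


def retention(cn):
    """Equation (12): S = 25.4 * (1000 / CN - 10), in mm."""
    return 25.4 * (1000.0 / cn - 10.0)


def runoff(rp, s):
    """Equation (11): AR in mm; taken as zero when RP does not exceed 0.2 S."""
    if rp <= 0.2 * s:
        return 0.0
    return (rp - 0.2 * s) ** 2 / (rp + 0.8 * s)


def urban_load(land_use, nutrient, rp):
    """Equation (10): NPS_urban_model = AR * AE * UR, in kg/(hm^2 a)."""
    if land_use == "water":
        # Assumption: AE is 0 for water, so the load is 0 and CN is not needed.
        return 0.0
    ar = runoff(rp, retention(CN[land_use]))
    return ar * AE[land_use][nutrient] * UR


# Hypothetical annual precipitation of 1,200 mm for one built-up cell.
tn_built_up = urban_load("built-up", "N", rp=1200.0)
```

For any fixed precipitation, the ordering of the loads across land use types follows the AE values closely, because all three curve numbers are high and most of the rainfall becomes runoff.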


Figure 5 | The membership functions of three factors.

Rural NPS

As for the rural NPS pollutant loads (Figures 6(b) and 7(b)) calculated by the USLE model, the TN and the TP range from 0 to 30 kg/hm² a and 1 to 10 kg/hm² a, respectively, which do not correspond to the land use pattern but relate much more to natural factors like slope, the ratio of vegetation (which is measured by NDVI), soil type and so on. In Figures 8(a) and 9(a), TP and TN highlights are both concentrated in the southern region, and the analysis reveals that these highlights usually corresponded with difficult terrain, such as a larger slope which tends to be associated with hard runoff and large soil erodibility, resulting in higher NPS loads. Since the USLE model combines rainfall, topography, management factors, soil types, cover management factors and other factors to simulate NPS pollutant loads, the distributions tend to be gentle and gradual.

Figure 6 | Results of TP load by different methods: (a) by urban method, (b) by rural method, and (c) integrated result.

Figure 7 | Results of TN load by different methods: (a) by urban method, (b) by rural method, and (c) integrated result.

Figure 8 | Histograms of TP load: (a) by USLE model, (b) by L-THIA model, and (c) integrated result.

Figure 9 | Histograms of TN load: (a) by USLE model, (b) by L-THIA model, and (c) integrated result.

Division of urban and rural areas

By the fuzzy membership functions and three factors, the land use cells can be classified into rural and urban at the same time. Figure 10 displays three factors in x, y and z axis, with the position of the point denoting the value of three factors and the color denoting the membership of belonging to an urban place. In plot A (see Figure 10) where the density of built-up is close to 1 (the maximum) and the distance to center is short, the cells undergo higher membership of belonging to an urban place. In plot B, the cells with medium value of three factors are hard to classify and the memberships of these cells are around 0.4 to 0.6. Additionally, we find that as the distance from the road increases, the cell tends to undergo small membership of belonging to urban places. Besides the cells in plot B, the cells in the center of this three-dimensional space (Plot C) also had membership near to 0.5.

Figure 10 | Three-dimensional distribution of the membership with three factors denoted by x, y and z axis.

Correspondingly, Figure 11(a) represents the membership based on spatial distribution of the land use cells. We find that the cells with the highest fuzziness (around 0.5) are concentrated around the boundary of the urban and rural places.

Additionally, to justify the advantage of using the fuzzy approach, the conventional results with binary values obtained by the k-means classification method are displayed in Figure 11(b). By comparison, we find that the membership ranging from 0 to 1 can reflect the distribution and characteristic of land use better than the classic results where only two integers are denoted, as ‘rural’ and ‘urban’. The fuzzy approach generates generally similar results to that of the classic method where the rural and urban distribution is generally the same. But in the boundary, the fuzzy approach


uses a gradual value to express the change from urban to rural, whereas the classic results provide a sharp break.

Figure 11 | The classification of land use cells to urban and rural in two-dimensional space: (a) the results by fuzzy approach; (b) the results by conventional k-means method.

Complex NPS

Ren et al. () investigated and analyzed the urbanization level and the water quality of Shanghai from 1947 to 1996, showing that the faster the rate of urbanization increased, the poorer the water quality became in their case study; Sartor et al. () focused on the pollutant loads on urban streets, and pointed out that pollutant loads of nutrients in urban runoff were much higher than that of rural areas in their case study; Shon et al. () argued that the amount of NPS pollutant loads discharged into rivers was larger in urban regions than in forests and farmlands, because of the high population and greater impermeable areas, and then used a storm water management model (SWMM) to simulate NPS pollutant loads in the target area. Similarly, this study revealed that NPS pollutant loads in the scope of urban land are larger than that of the rural model (Table 5) in Donghu watershed. The integrated results (see Figures 6(c) and 7(c)) better reflect the complex NPS pollution distribution because they assume that the study area is rural and urban mixed. According to the membership of belonging to urban and rural, the urban and rural NPSs are integrated, and there is no sharp break between rural and urban areas, meaning the NPSs in rural and urban areas are interactive even if they have different generations and characteristics. Especially in the area of boundary, the urban NPS and rural NPS are mixed and interactive. Hence, in the integrated NPS distribution maps, there is no sharp break, while the different NPS evaluation models are still applied for different places.

Table 5 | Pollutant loads of each land use type by different methods

Pollutant | Method | Built-up (kg/hm² a) | Forest (kg/hm² a) | Agriculture (kg/hm² a) | Water (kg/hm² a)
Average P | Urban | 3.451 | 5.6545 | 11.921 | 0
Average P | Rural | 2.7952 | 4.8293 | 8.8148 | 0
Average P | Over all | 3.1443 | 4.888 | 9.4385 | 0
Average N | Urban | 30.804 | 17.386 | 45.156 | 0
Average N | Rural | 22.381 | 12.809 | 35.667 | 0
Average N | Over all | 26.865 | 13.135 | 37.573 | 0

CONCLUSION

Computation and analysis of NPS pollutants according to land use changes, precipitations, topography, soil type, vegetation and others in urban–rural mixed places were


presented. In this study, we established a comprehensive model that successfully calculates rural and urban NPS pollutant loads respectively by USLE, ECM and L-THIA. Then, we introduced the fuzzy membership function based approach to integrate the rural and urban NPS pollutant loads through the evaluation of land use characteristic. Afterwards, the results were successfully obtained regarding complex NPS pollution in an urban–rural mixed watershed, Donghu watershed.

Even if numerous studies are concerned with the variation of NPS pollutant loads under the rapid urbanization process, there is no applicative model that alludes to the increasingly urban–rural mixed watershed and that considers the difference in generation and characteristic of pollution between the urban and rural areas. To address this issue, we firstly identify the NPS pollution in urban–rural mixed areas and caused by various pollutants in rural and urban surface runoff together as the complex NPS pollution, and then employ the fuzzy membership function to classify the urban areas and rural areas so as to integrate the well-developed urban NPS model and rural NPS model. The results are proven to be consistent with existing research conclusions and with the characteristics of our site.

To our knowledge, urbanization is widespread worldwide. Take China as an example: the national urbanization level stood at 11% in 1949 and sharply increased to 29% in 1996 (Wang et al. ). Although the rapid urbanization process has boosted the economy and led to a higher quality of life, some adverse effects have been brought along with it. For example, in addition to detrimental water quality, the decline of indoor air quality (Wang et al. ), damage of ecosystem, climate change (Grimm et al. ), threat to biodiversity (Pompeu et al. ), effect on tree growth (Gregg et al. ), promotion of asthma (Lin et al. ) and so on are associated with rapid urbanization. Hence, it is significant to evaluate, analyze and understand the relationship between urbanization and the corresponding detrimental effects. Within this context, the model proposed in this study which focused on the fuzziness in the rural–urban mixed places is innovative and applicable for the above mentioned problems. In particular, the model proposed is capable of being employed in any other rural and urban mixed regions, which undergo rapid urbanization, with corresponding classification factors. For example, in Donghu watershed the distance to the city center, the density of built-up land, and the distance to the nearest road are employed to describe the characteristics of each land use cell by fuzzy membership function, while in other case study areas the factors may be various. Then, according to fuzziness in terms of land usage, the degree of being urban or rural can be identified. Afterward, the relationship between urbanization and the above mentioned problems can be exactly assessed.

ACKNOWLEDGEMENTS

This study is supported by funding from the National Natural Science Foundation of China (grant nos. 40701184 and 40871179). All the authors greatly appreciate the contributions of the anonymous referees, who provided very useful suggestions.

REFERENCES Bahadur, K. C. K.  Mapping soil erosion susceptibility using remote sensing and GIS: A case of the upper Nam Wa Watershed, Nan Province, Thailand. Environ. Geol. 57 (3), 695–705. Bhaduri, B., Harbor, J., Engel, B. & Grove, M.  Assessing watershed-scale, long-term hydrologic impacts of land-use change using a GIS-NPS model. Environ. Manage. 26 (6), 643–658. Bu, Z. H., Sun, J. Z. & Zhou, F. J.  Study on quantitative remote sensing method for soil erosion and its application. Acta Pedologica Sinica 34 (3), 235–245. Cai, C. F., Ding, S. W., Shi, Z. H., Huang, L. & Zhang, G. Y.  Study of applying USLE and geographical information system IDRISI to predict soil erosion in small watershed. J. Soil Water Conserv. 14 (2), 19–24. Cao, Y. L., Li, C. M., Guo, J. S. & Fang, F.  Pollutant source analysis and pollution loads estimation from non-point source in Chongqing Three Gorges Reservoir Region. J. Chongqing Jianzhu Univ. 29 (4), 1–5. Chris, B. C., Randy, K. K. & James, A. T.  Water quality in agricultural, urban, and mixed land use watershed. J. Am. Water Res. Assoc. 40 (6), 1593–1601. Deng, L. J., Hou, D. B., Wang, C. Q., Zhang, S. R. & Xia, J. G.  Study on characteristics of erodibility of natural soil and non-irrigated soil of Sichuan. Soil Water Conserv. China 7, 23–25.



Dixon, B.  Groundwater vulnerability mapping: A GIS and fuzzy rule based integrated tool. Appl. Geogr. 25 (4), 327–347. Do, H. T., Lo, S. L., Chiueh, P. T., Lan, A. P. T. & Shang, W. T.  Optimal design of river nutrient monitoring points based on an export coefficient model. J. Hydrol. 406 (1–2), 129–135. Dumas, P., Printemps, J., Mangeas, M. & Luneau, G.  Developing erosion models for integrated coastal zone management: A case study of The New Caledonia west coast. Mar. Pollut. Bull. 61, 519–529. Edwin, D. O., Zhang, X. L. & Yu, T.  Current status of agricultural and rural non-point source pollution assessment in China. Environ. Pollut. 158, 1159–1168. Fistikoglu, O. & Harmancioglu, N. B.  Integration of GIS with USLE in assessment of soil erosion. Water Res. Manag. 16, 447–467. Gao, J. Q., Xiong, Z. T., Zhang, J. D., Zhang, W. H. & Obono Mba, F.  Phosphorus removal from water of eutrophic Lake Donghu by five submerged macrophytes. Desalination 242 (1), 193–204. Gemitzi, A., Petalas, C., Tsihrintzis, V. A. & Pisinaras, V.  Assessment of groundwater vulnerability to pollution: A combination of GIS, fuzzy logic and decision making techniques. Environ. Geol. 49 (5), 653–673. Gregg, J. W., Jones, C. G. & Dawson, T. E.  Urbanization effects on tree growth in the vicinity of New York City. Nature 424 (6945), 183–187. Grimm, N. B., Foster, D., Groffman, P., Grove, J. M., Hopkinson, C. S., Nadelhoffer, K. J., Pataki, D. E. & Peters, D. P.  The changing landscape: Ecosystem responses to urbanization and pollution across climatic and societal gradients. Front. Ecol. Environ. 6 (5), 264–272. Harbor, J.  A practical method for estimating the impact of land use change on surface runoff, groundwater recharge and wetland hydrology. J. Am. Plann. Assoc. 60, 91–104. Haregeweyn, N. & Yohannes, F.  Testing and evaluation of the agricultural non-point source pollution model (AGNPS) on Augucho catchment, western Hararghe, Ethiopia. Agric. 
Ecosyst. Environ. 99, 201–212. Jing, L. & Chen, B.  Field investigation and hydrological modelling of a subarctic wetland – the Deer River Watershed. J. Environ. Inf. 17 (1), 36–45. Johnes, P. J. & Heathwaite, A. L.  Modelling the impact of land use change on water quality in agricultural catchments. Hydrol. Processes 11 (3), 269–286. Kay, D., Crowther, J., Stapleton, C. M., Wyer, M. D., Fewtrell, L., Anthony, S., Bradford, M., Edwards, A., Francis, C. A., Hopkins, M., Kay, C., McDonald, A. T., Watkins, J. & Wilkinson, J.  Faecal indicator organism concentrations and catchment export coefficients in the UK. Water Res. 42 (10–11), 2649–2661. Kim, K. Y., Ventura, S. J., Harris, P. M., Thum, P. G. & Prey, J.  Urban non-point-source pollution assessment using a geographical information- system. J. Environ. Manage. 39 (3), 157–170.


Leone, A., Ripa, M. N., Uricchio, V., Deak, J. & Vargay, Z.  Vulnerability and risk evaluation of agricultural nitrogen pollution for Hungary’s main aquifer using DRASTIC and GLEAMS models. J. Environ. Manage. 90 (10), 2969–2978. Li, Y. J., Nian, Y. G., Song, Y. W., Hu, S. R., Nie, Z. D., Yan, H. H. & Yin, Q.  Spatio-temporal variation of non-point source pollutants in Wuli Lake, Taihu Lake. J. Sichuan Univ. (Engineering Science Edition) 41 (2), 125–130. Lim, K. J., Engel, B. A., Tang, Z., Muthukrishnan, S., Choi, J. & Kim, K.  Effects of calibration on L-THIA GIS runoff and pollutant estimation. J. Environ. Manage. 78, 35–43. Lin, R. S., Sung, F. C., Huang, S. L., Gou, Y. L., Ko, Y. C., Gou, H. W. & Shaw, C. K.  Role of urbanization and air pollution in adolescent asthma: A mass screening in Taiwan. J. Formosan Med. Assoc. 100 (10), 649–655. Liu, Y. & Phinn, S. R.  Modelling urban development with cellular automata incorporating fuzzy-set approaches. Comput. Environ. Urban Syst. 27 (6), 637–658. Liu, Z. & Tong, S. T. Y.  Using HSPF to model the hydrologic and water quality impacts of riparian land-use change in a small watershed. J. Environ. Inf. 17 (1), 1–14. Liu, R. M., Yang, Z. F., Ding, Z. F., Ding, X. W., Shen, Z. Y., Wu, X. & Liu, F.  Effect of land use/cover change on pollution load of non-point source in upper reach of Yangtze River basin. Environ. Sci. 27 (12), 2407–2414. McCool, D., Brown, L., Foster, G., Mutchler, C. & Meyer, L.  Revised slope steepness factor for the USLE. Trans. Am. Soc. Agric. Eng. 30, 1387–1396. Meusburger, K., Konz, N., Schaub, M. & Alewell, C.  Soil erosion modelled with USLE and PESERA using QuickBird derived vegetation parameters in an alpine catchment. Int. J. Appl. Earth Obs. Geoinf. 12, 208–215. Ouyang, W., Skidmore, A. K., Toxopeus, A. G. & Hao, F. H.  Long-term vegetation landscape pattern with non-point source nutrient pollution in upper stream of Yellow River basin. J. Hydrol. 
389, 373–380. Pandey, A., Chowdary, V. M. & Mal, B. C.  Identification of critical erosion prone areas in the small agricultural watershed using USLE, GIS and remote sensing. Water Resour. Manage. 21, 729–746. Phillips, P., Russell, F. A. & Turner, J.  Effect of non-point source runoff and urban sewage on Yaquedel Norte River in Dominican Republic. Int. J. Environ. Pollut. 31 (3–4), 244– 266. Pompeu, P. S., Alves, C. B. M. & Callisto, M. A. R. C. O. S.  The effects of urbanization on biodiversity and water quality in the Rio das Velhas basin, Brazil. Am. Fish. Soc. Symp. 47, 11–22. Ren, W. W., Zhong, Y., Melilgrana, J., Ariderson, B., Watt, W. E., Chan, J. K. & Leung, H. L.  Urbanization, land use, and water quality in Shanghai 1947–1996. Environ. Int. 29, 649–659. Renard, K. G., Foster, G. R., Weesies, G. A., McCool, D. K. & Yoder, D. C.  Predicting Soil Erosion by Water: A Guide



to Conservation Planning with Revised Universal Soil Loss Equation (RUSLE). Department of Agriculture, ARS, Washington, DC. Rulinda, C. M., Bijker, W. & Stein, A.  Characterising and quantifying vegetative drought in East Africa using fuzzy modelling and NDVI data. J. Arid Environ. 78, 169–178. Saaty, T. L.  The Analytic Hierarchy Process. McGraw-Hill, New York. Sartor, J. D., Boyd, G. B. & Agardy, F. J.  Water Pollution Aspects of Street Surface Contaminants. The United States Environmental Protection Agency, Washington, DC. Shi, Z. H., Cai, C. F., Ding, S. W., Li, Z. X., Wang, T. W., Zhang, B. & Sheng, X. L.  Research on nitrogen and phosphorus load of agricultural non-point sources in middle and lower reaches of Hanjiang River based on GIS. Acta Scientiae Circumstantiae 22 (4), 473–477. Shields, C. A., Band, L. E., Law, N., Groffman, P. M., Kaushal, S. S., Savvas, K., Fisher, G. T. & Belt, K. T.  Streamflow distribution of non-point source nitrogen export from urbanrural catchments in the Chesapeake Bay watershed. Water Resour. Res. 44 (9), 1–13. Shon, T. S., Kim, S. D., Cho, E. Y., Im, J. Y., Min, K. S. & Shin, H. S.  Estimation of NPS pollutant properties based on SWMM modeling according to land use change in urban area. Desalin. Water Treat. 37/38 (1–3), 333. Shrestha, S., Kazama, F., Newham, L. T. H., Babel, M. S., Clemente, R. S., Ishidaira, H., Nishida, K. & Sakamoto, Y.  Catchment scale modeling of point source and nonpoint source pollution loads using pollutant export coefficients determined from long-term in stream monitoring data. J. Hydro-Environ. Res. 2, 134–147. Volk, M., Moller, M. & Wurbs, D.  A pragmatic approach for soil erosion risk assessment within policy hierarchies. Land Use Policy 27, 997–1009. Wang, D. C.  Modeling the Process of Runoff and Sediment Yield on Slopeland Based on ARCGIS. Southwestern University, Chongqing. Wang, Z., Bai, Z., Yu, H., Zhang, J. & Zhu, T. 
 Regulatory standards related to building energy conservation and indoor-air-quality during rapid urbanization in China. Energy Build. 36 (12), 1299–1308. Wang, J. Y., Da, L. J., Song, K. & Li, B. L.  Temporal variations of surface water quality in urban, suburban and rural areas during rapid urbanization in Shanghai, China. Environ. Pollut. 152, 387–393. Wang, K., Wu, W. Y., Chen, Y. Q. & Ding, H.  Study on nonpoint pollution characteristics of urban runoff in Fuzhou city. J. Minjiang Univ. 30 (2), 107–111. William, S. J. R., Nicks, A. D. & Arnold, J. R.  Simulation for water resource in rural basins. Hydraul. Eng. 111 (6), 970– 986.


Wischmeier, W. H. & Smith, D. D.  Predicting Rainfall Erosion Losses.USDA Agricultural Research Services handbook 537. USDA, Washington, DC, p. 57. Xu, Y. L., Li, H. E. & Ni, Y. M.  Estimate on pollutant loads of nitrogen and phosphorus based on USLE in Heihe river watershed. J. Northwest Sci-Tech Univ. Agric. Forestry (Natural Science Edition) 34 (3), 138–142. Xu, Y. Q. & Shao, X. M.  Estimation of soil erosion supported by GIS and RUSLE: A case study of Maotiaohe Watershed, Guizhou Province. J. Beijing Forestry Univ. 28 (4), 67–71. Xue, S. L.  Simulating the Non-Point Pollution Load of Nitrogen and Phosphorus in the Heihe Watershed Based on GIS. Xi’an University of Technology, Xi’an. Yang, A. L., Huang, G. H., Qin, X. S. & Fan, Y. R.  Evaluation of remedial options for a benzene-contaminated site through a simulation-based fuzzy-MCDA approach. J. Hazard. Mater. 213, 421–433. Yang, L., Ma, K. M., Guo, Q. H. & Bai, X.  Evaluating longterm hydrological impacts of regional urbanisation in Hanyang, China, using a GIS model and remote sensing. Int. J. Sustainable Dev. World Ecol. 15, 350–356. Zhang, H. & Huang, G. H.  Assessment of non-point pollution using a spatial multicriteria approach. Ecol. Modell. 222, 313–321. Zhang, J. H., Shen, T., Liu, M. H., Wan, Y., Liu, J. B. & Li, J.  Research on non-point source pollution spatial distribution of Qingdao based on L-THIA model. Math. Comput. Modell. 54, 1151–1159. Zhang, K. L., Peng, W. Y. & Yang, H. L.  Soil erodibility and its estimation for agricultural soil in China. Acta Pedologica Sinica 44 (1), 7–13. Zhang, W. W., Shi, M. J. & Huang, Z. H.  Controlling nonpoint-source pollution by rural resource recycling. Nitrogen runoff in Tai Lake valley, China, as an example. Sustainability Sci. 1 (1), 83–89. Zhao, Y. X.  Prediction of Non-point Pollution in the Small Watershed of the Miyun Reservoir. Beijing Jiaotong University, Beijing. Zhao, Y. X., Zhang, W. S., Wang, Y. & Wang, T. 
T.  Soil erosion intensity prediction based on 3S technology and the USLE: A case from Qiankeng Reservoir basin in Shenzhen. J. Subtropical. Resour. Environ. 2 (3), 23–28. Zheng, C., Yang, W. & Yang, Z. F.  Strategies for managing environmental flows based on the spatial distribution of water quality: A case study of Baiyangdian Lake, China. J. Environ. Inf. 18 (2), 84–90. Zhuang, Y. H., Hong, S., Zhang, W. T., Lin, H. Y., Zeng, Q. H., Nguyen, T., Niu, B. B. & Li, W. Y.  Simulation of the spatial and temporal changes of complex non-point source loads in a lake watershed of central China. Water Sci. Technol. 67.9, 2050–2058.

First received 4 October 2012; accepted in revised form 15 May 2013. Available online 13 July 2013


© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

The characteristics of probability distribution of groundwater model output based on sensitivity analysis
Xiankui Zeng, Jichun Wu, Dong Wang and Xiaobin Zhu

ABSTRACT The probability distribution of groundwater model output is the direct product of modeling uncertainty. In this work, we aim to analyze the probability distribution of groundwater model outputs (groundwater level series and budget terms) based on sensitivity analysis. In addition, two sources of uncertainties are considered in this study: (1) the probability distribution of model’s input parameters; (2) the spatial position of observation point. Based on a synthetical groundwater model, the probability distributions of model outputs are identified by frequency analysis. The sensitivity of output’s distribution is analyzed by stepwise regression analysis, mutual entropy analysis, and classification tree analysis methods. Moreover, the key uncertainty variables influencing the mean, variance, and the category of probability distributions of groundwater outputs are identified and compared. Results show that mutual entropy analysis is more general for identifying multiple influencing factors which have a similar correlation structure with output variable than a stepwise regression method. Classification tree analysis is an effective method for analyzing the key driving factors in a classification output system.

Key words | classification tree analysis, frequency analysis, groundwater modeling, mutual entropy analysis, probability distribution, sensitivity analysis

Xiankui Zeng, Jichun Wu (corresponding author), Dong Wang, Xiaobin Zhu
Key Laboratory of Surficial Geochemistry, Ministry of Education, Department of Hydrosciences, School of Earth Sciences and Engineering, State Key Laboratory of Pollution Control and Resource Reuse, Nanjing University, Nanjing, 210093, China
E-mail: jcwu@nju.edu.cn

doi: 10.2166/hydro.2013.106

INTRODUCTION

Groundwater modeling and prediction are influenced by many factors from the surface to underground. The uncertainty of groundwater model outputs stems from a number of factors including incomplete model structure, incorrect boundary conditions, and aquifer parameters (Hassan et al. ; Wu et al. ; Zhang et al. ; Gungor & Goncu ; Zeng et al. ). Data scarcity and observation errors enhance the difficulty in handling simulation uncertainty (Hassan et al. ; Mpimpas et al. ; Wang et al. ). In recent years, a number of studies have been developed to assess the uncertainties on groundwater model outputs. Moreover, these studies focus on the uncertainty assessments by referring uncertainty sources such as hydrogeological parameters, conceptual model, and scenario (Blasone et al. ; Hassan et al. ; Ye et al. ; Hashemi et al. ; Morway et al. ).

Uncertainty analysis of groundwater models is often implemented in a probability statistical framework (Blasone et al. ; Hassan et al. ). The results are generally expressed as the probability distributions of outputs of interest (e.g., groundwater level, boundary flux, solute concentration). The uncertainty of a random variable can be described by its characteristics of probability distribution, which include probability density function (PDF) and numerical characteristics (e.g., mean and variance). The location, range, and shape of a random variable’s distribution are determined by the probability distribution. The sensitivity analysis of groundwater output’s probability distribution is primarily aimed at identifying two types of influencing factors. One is the factor affecting the numerical characteristics of a random variable, the other is the driving factor which leads the output variable to obey a specific PDF.



For the uncertainty analysis of groundwater simulation, in general, we are interested in the probability distribution of model output. However, little attention has been devoted to the influencing factors of model output’s distribution in previous studies. In this paper, we focus on the sensitivity of groundwater model output’s probability distribution for two sources: (1) the probability distribution of the input parameters; (2) the spatial position of observation point.

Frequency analysis is a technique which has been extensively used in hydrologic uncertainty issues (Lang et al. ; Neppel et al. ), such as the design of flood control and risk management. The observation series is used to fit an alternative PDF, and then the variable’s distribution uncertainty is analyzed statistically (Smakhtin ; Katz et al. ). Generally, there are three basic procedures for frequency analysis: (1) selecting a suitable PDF for data series; (2) parameter estimation for the selected PDF; (3) uncertainty assessment for the data series (Onoz & Bayazit ). Herein, how to select a suitable PDF is the key problem for a frequency analysis. According to Mcmahon & Srikanthan (), Haktanir (), Onoz & Bayazit (), and Vogel et al. (), there is not a universal applicable rule to select the best PDF, and the qualified PDF should be selected based on effective comparison and testing.

For the complicated groundwater model, it is hard to describe the influences of model inputs on outputs directly by mathematic model. Sensitivity analysis provides an effective framework for unraveling the relationship between the input variables and outcomes. In general, the studying object is the direct model output, such as hydraulic head (Rojas et al. ; Mazzilli et al. ) and solute concentration (Huysmans et al. ; Zhang et al. ). The influencing factors of output variable can be identified by sensitivity analysis. In this study, the research object is not the direct model output, but the probability distribution of output. The importance of this kind of influencing factor can be regarded as another form of sensitivity to model output. Furthermore, recognizing the distribution characteristics of model output will help in identifying groundwater modeling uncertainty, improving model structure, and providing feedback for data collecting activities relating to model uncertainty analysis.

A synthetic groundwater model was built for producing groundwater outputs. The outputs of the model include groundwater levels series (GLS) and groundwater budget terms. The suitable PDFs of outputs were selected by the Kolmogorov–Smirnov test. After that, the stepwise regression and mutual entropy analysis were used to identify the influencing factors of the first two moments of GLS (mean and variance). In addition, mutual entropy analysis is a reliable sensitivity analysis method based on information theory, which is compared with stepwise regression analysis. Finally, for the sensitivity analysis of classification output system, classification tree analysis was used to identify the driving factors that lead the GLS to obey a specified distribution.

The main results of this study were obtained from a synthetic groundwater model. This groundwater model is simple compared to a real groundwater system. Therefore, the research results can be regarded as a mathematical exploration into the characteristics of probability distribution of groundwater model outputs. Some conclusions need further confirmation in the real field. Nevertheless, the use of a real groundwater model is not easy for such analysis, because observations are often limited in the number and length of a data series.

In the following sections, the methods used for this research are described. Then, a synthesized groundwater flow model is presented. In the results and discussion section, we describe the characteristics of probability distribution of model output. Finally, the main conclusions drawn from the analysis are provided.

METHODS

Parameter estimation and goodness of fit test

Seven functions were chosen as the alternative probability distribution functions to fit the outputs of groundwater model. They were normal, log-normal, 2-parameter gamma, log-2-parameter gamma, Pearson type III, log-Pearson type III, and uniform distribution, respectively. The methods used for parameter estimation have been illustrated in many papers and will not be provided here. Readers can obtain detailed derivation processes by referring to Chen et al. (), Ross (), Singh & Singh (a, b), and Sun & Zheng ().
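
As a rough illustration of the screening step, the sketch below fits one of the candidate distributions (the normal) to a series by moments and computes the Kolmogorov–Smirnov statistic against the empirical distribution function. It is a minimal sketch under stated assumptions, not the authors' code: the helper names `ks_statistic` and `ks_accepts`, the two synthetic series, and the large-sample critical value 1.36/√n for the 5% significance level are choices made here for illustration. When the parameters are estimated from the same sample, as below, that critical value is known to be conservative.

```python
import math
from statistics import NormalDist


def ks_statistic(sample, cdf):
    """Largest gap between the empirical distribution function and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical distribution steps from i/n to (i + 1)/n at x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_accepts(sample, cdf, coeff=1.36):
    """Compare the statistic with the asymptotic critical value coeff / sqrt(n).

    coeff = 1.36 corresponds to the 5% significance level (assumed here).
    """
    return ks_statistic(sample, cdf) <= coeff / math.sqrt(len(sample))


# Hypothetical series of 500 simulated heads (m): exact quantiles of a normal
# distribution with mean 40 and standard deviation 2, so the run is repeatable.
n = 500
heads = [NormalDist(40.0, 2.0).inv_cdf((k + 0.5) / n) for k in range(n)]

# Moment estimates of the mean and standard deviation from the sample.
fitted = NormalDist.from_samples(heads)
normal_ok = ks_accepts(heads, fitted.cdf)

# A strongly right-skewed series should fail the same normal hypothesis.
skewed = [math.exp(k / 50) for k in range(n)]
skewed_ok = ks_accepts(skewed, NormalDist.from_samples(skewed).cdf)

print(normal_ok, skewed_ok)
```

The same `ks_statistic` helper accepts the cumulative distribution function of any other candidate, so the remaining six distributions could be screened by passing their fitted CDFs in the same way.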



The Kolmogorov–Smirnov test (Melo et al. ; Wang & Wang ) is a convenient method of a hypothesis test by comparing the statistic value with the critical value at a specified confidence level. The statistic value is evaluated by comparing the proposed PDF with the empirical distribution function constructed based on samples. The Kolmogorov–Smirnov test is a standard procedure of goodness of fit test, and it will not be described here.

Stepwise regression analysis

Stepwise regression analysis is a common approach for global sensitivity analysis. The basic idea for regression analysis is to fit the input and output variable with a linear regression model (Pappenberger et al. ; Mishra et al. ). The model generated at every step is tested to ensure that all the regression variables are important to the model. The t-test measuring the difference between samples and the regression model is applied to test the importance of a variable. In addition, if some variables are found to be insignificant, then the most insignificant variable is removed from the model. Moreover, the stepwise regression process will continue until each variable in the regression model becomes significant and the variables outside of the model are insignificant (Mishra et al. ; Bergante et al. ; Zeng et al. ). After that, the uncertainty importance of input variable can be defined as standardized regression coefficient (SRC):

$$\mathrm{SRC} = \frac{b_j \, \sigma(x_j)}{\sigma(y)} \tag{1}$$

where $y$ is the output variable, $x_j$ is the input variable numbered by $j$, $\sigma(x_j)$ and $\sigma(y)$ are the standard deviations of $x_j$ and $y$, respectively, and $b_j$ is the regression coefficient of $x_j$.

Mutual entropy analysis

The distribution character of data set (X, Y) can be described using contingency tables. For the contingency tables’ rows, the label denotes the input variable $x$, and the range is divided into $i$ equal-width intervals. For the contingency tables’ columns, the label denotes the output variable $y$, and the range is divided into $j$ equal-width intervals. The number in each contingency table is a nonnegative integer which represents the number of observed events satisfying the joint conditions of row and column.

The probability of the state with input variable $x_i$ and output variable $y_j$ is $p_{ij} = N_{ij}/N$, where $N_{ij}$ is the value of the contingency table at i-th row and j-th column, and $N$ is the number of samples. In addition, $N_{i\cdot}$ is the cumulative number of samples in the i-th interval of $x$ for the whole range of $y$, and $N_{\cdot j}$ denotes the cumulative number of samples in the j-th interval of $y$ for the whole range of $x$. Consequently, when considering the state $x_i$ only, the probability can be written as $p_{i\cdot} = N_{i\cdot}/N$, and the probability of outcomes only with the state $y_j$ is given by $p_{\cdot j} = N_{\cdot j}/N$ (Mishra et al. ).

The entropy of a variable represents the amount of average information. According to information theory, the entropies of variable $x$, $y$, and $(x, y)$ are defined as follows:

$$H(x) = -\sum_i p_{i\cdot} \ln p_{i\cdot}; \qquad H(y) = -\sum_j p_{\cdot j} \ln p_{\cdot j} \tag{2}$$

$$H(x, y) = -\sum_i \sum_j p_{ij} \ln p_{ij} \tag{3}$$

In information theory, the mutual information of two variables is a quantity that measures the mutual dependence of two variables. The mutual entropy between $x$ and $y$ is described as the reduction in the uncertainty of $y$ due to the information of $x$, which can be given by:

$$I(x, y) = H(x) + H(y) - H(x, y) = \sum_i \sum_j p_{ij} \ln \frac{p_{ij}}{p_{i\cdot} \, p_{\cdot j}} \tag{4}$$

In mutual entropy method, the uncertainty importance of input variables on output variable is indicated by two indicators: uncertainty coefficient (U) and R statistic (R) (Mishra et al. ; Zeng et al. ):

$$U(x, y) = 2 \, \frac{I(x, y)}{H(x) + H(y)} \tag{5}$$

$$R(x, y) = \left[ 1 - \exp\{-2 I(x, y)\} \right]^{1/2} \tag{6}$$

These two measures take values in the range [0, 1], U (or R) is 0 if x and y are independent, and it takes 1 if x is completely related to y.
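
The two kinds of sensitivity measure described above can be compared on a toy data set. The sketch below evaluates the SRC of Equation (1) for a single input (where it reduces to the correlation coefficient, so no stepwise selection is involved) and the measures of Equations (4) to (6) from a contingency table. The function names, the 10 × 10 binning and the test relation y = x² are assumptions introduced here for illustration only, not part of the study.

```python
import math


def src_single_input(xs, ys):
    """SRC of Equation (1) for a single input, with b from least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx  # regression coefficient of y on x
    return b * math.sqrt(sxx / syy)  # b * sigma(x) / sigma(y)


def contingency_table(xs, ys, n_bins=10):
    """Count samples per cell after cutting both ranges into equal-width intervals."""
    def index(v, lo, hi):
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)

    x_lo, x_hi, y_lo, y_hi = min(xs), max(xs), min(ys), max(ys)
    table = [[0] * n_bins for _ in range(n_bins)]
    for x, y in zip(xs, ys):
        table[index(x, x_lo, x_hi)][index(y, y_lo, y_hi)] += 1
    return table


def entropy(probs):
    """Shannon entropy with natural logarithm; empty cells contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mutual_entropy_measures(table):
    """Return I(x, y), U(x, y) and R(x, y) of Equations (4) to (6)."""
    n = sum(sum(row) for row in table)
    h_x = entropy([sum(row) / n for row in table])        # row marginals
    h_y = entropy([sum(col) / n for col in zip(*table)])  # column marginals
    h_xy = entropy([c / n for row in table for c in row])
    info = h_x + h_y - h_xy
    u = 2.0 * info / (h_x + h_y) if h_x + h_y > 0 else 0.0
    r = math.sqrt(max(0.0, 1.0 - math.exp(-2.0 * info)))
    return info, u, r


# Hypothetical, symmetric, purely nonlinear response: y = x^2 on [-1, 1].
xs = [-1.0 + 2.0 * k / 200 for k in range(201)]
ys = [x * x for x in xs]

src = src_single_input(xs, ys)
info, u, r = mutual_entropy_measures(contingency_table(xs, ys))
print(f"SRC = {src:.3f}, I = {info:.3f}, U = {u:.3f}, R = {r:.3f}")
```

For this symmetric relation the linear measure is essentially zero while U and R are clearly positive, which is the kind of dependence a regression-based index can miss. Note also that with a finite table R stays below 1 even under perfect dependence: for the table [[10, 0], [0, 10]] the sketch gives U = 1 but R = √0.75, because R approaches 1 only as I grows.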



Classification tree analysis


Sensitivity analysis techniques such as stepwise regression, regionalized sensitivity analysis (Pappenberger et al. ), and mutual entropy analysis are useful for identifying important influencing factors if the study object is a continuous variable. When the problem relates to binary outcomes such as ‘right’ vs. ‘wrong’, ‘yes’ vs. ‘no’, the classification tree method provides a more efficient framework for identifying the factors driving the result into particular categories (Mishra et al. ; Englehart & Douglas ; Esther et al. ; MacQuarrie et al. ).

The fundamental target for constructing a classification tree model is searching for a classifying rule. The output is classified by a series of splits based on splitting variables. Each split is determined by the appropriate classifier. Thus, the following two steps are essential for constructing a classification tree: (1) selecting an appropriate splitting variable and determining the split point; (2) deciding when to continue splitting or to declare splitting termination.

The split can be defined by several principles, such as maximum information gain (InfoGain), maximum impurity reduction, and maximum reduction in deviance (Mishra et al. ; Myles et al. ). The InfoGain index based on information entropy theory was applied to construct a classification tree in this study. The outputs are classified into subspaces by selecting a splitting point of the splitting variable. This implies the complicated and disordered outputs are arranged and sorted with higher order within subspaces. Therefore, the uncertainty of output variable is reduced by acquiring information.

A classification tree is built on two types of nodes: branch nodes and leaf nodes. Each branch node is the parent of two children branch nodes, and the leaf node is the endpoint of the tree.

The uncertainty or information entropy of output variable $y$ in node $t$ is defined as:

$$\mathrm{Info} = -\sum_j \frac{N_j(t)}{N(t)} \ln \frac{N_j(t)}{N(t)} \tag{7}$$

where $N_j(t)$ denotes the number of samples belonging to the class $j$ at node $t$, and $N(t)$ is the number of samples at node $t$.

Assuming the splitting variable X, with n samples ordered by magnitude, the amount of alternative split points of X is n-1 by choosing the midpoint of two adjacent samples. The point that maximizes the information gain or minimizes the uncertainty of outputs is selected. InfoGain is calculated by the equation (Myles et al. ):

$$\mathrm{InfoGain} = \mathrm{Info}(parent) - \sum_k p_k \, \mathrm{Info}(child_k) \tag{8}$$

where $parent$ denotes the space before splitting, $child_k$ denotes the subspace after splitting, and $p_k$ is the ratio of the samples which passed into the k-th subspace. The purity of a space describing the distribution of samples’ types is expressed as follows:

$$\mathrm{purity} = \sum_j p_j^2; \qquad p_j = \frac{N_j(t)}{N(t)} \tag{9}$$

where $p_j$ is the proportion of samples belonging to class $j$.

The classification tree is constructed by the successive selection of splitting points. It is beneficial to set up some constraint for preventing excessive splitting. If the number of samples in a subspace falls below the minimum value, or the purity of samples in a subspace is higher than the maximum value specified by the user, the splitting is terminated at that node. Furthermore, a classification tree can be optimized by pruning and reconstruction, which acquires a balance between the complexity and classification precision. After the classification tree is constructed, the sensitivities of splitting variables can be simply determined by comparing the order used to classify outputs (Mishra et al. ).

IMPLEMENTATION OF METHODS

Description of the synthesized model

For the purpose of frequency analysis, we constructed a synthetic three-dimensional steady-state groundwater model (Rojas et al. ) (Figure 1). The model domain is 5,000 m in the x direction, 3,000 m in the y direction, and 53 m in the z direction (thickness). The model area is a rectangle (5,000 m by 3,000 m) and discretized into 25 m by


25 m grid cells. Five pumping wells and 240 observation points are placed at the confined aquifer. The upper layer is 25 m in thickness and is modeled as an unconfined aquifer. The lower layer is 25 m in thickness and is confined. The upper and lower layers are separated by a 3-m confining bed, whose storage capacity is neglected. The three model layers were assumed to be horizontal in extension. The hydraulic conductivity distribution within each aquifer is heterogeneous, and the hydraulic conductivity field within each layer is assumed to be statistically stationary.

Figure 1 | A schematic diagram of synthetic groundwater model.

Model parameters

Model layers are assumed to be homogeneous statistically with a constant mean of hydraulic conductivity K. Smaller-scale variability is represented using the theory of random space functions. In addition, an isotropic exponential covariance function is used to describe the K fields of layers. The spatial distribution of hydraulic conductivity is generated using the direct Fourier transform method (Robin et al. ). The spatial structure parameters of lnK for different layers are presented in Table 1.

Table 1 | Spatial correlation parameters of hydraulic conductivity K for the layers of synthetic groundwater model

Layer    Mean of K (m/d)    Variance of lnK    Correlation length of lnK (m)
1        5.0                2.0                80
2        0.1                0.5                80
3        5.0                2.0                80

Boundary conditions set up

As shown in Figure 1, for the model aquifers, two impermeable boundary conditions are specified along the south and north boundaries. Along the west boundary, a constant head boundary condition is imposed. The east side of the domain is bounded by a 20 m-wide river, and the river level is 40 m. The riverbed’s thickness is 2 m, and the elevation at the bottom of the riverbed is 35 m. Furthermore, sources and sinks in the model include recharge from precipitation, discharge from pumping and evapotranspiration. The top surface of unconfined aquifer receives the precipitation recharge uniformly, and the model bottom is an



impermeable boundary. In addition, the pumping wells and observation wells are only screened at layer 3. An evapotranspiration zone, delineated by a rectangle in the left side of the study area, is defined with an evapotranspiration surface elevation at 51 m, and the extinction depth is set as 5 m.

Then, the unknown model parameters including the water level of constant head boundary, the conductance of river bed, precipitation rate, maximum evapotranspiration rate, and pumping rate are defined in specified ranges (Table 2). In addition, the conductance of riverbed represents the interconnection between river and unconfined aquifer, which is calculated as follows (Harbaugh ):

$$C_{Riv} = \frac{K_{Riv}\, l\, w}{m} \qquad (10)$$

where $C_{Riv}$ is the conductance of riverbed, $K_{Riv}$ is the vertical hydraulic conductivity of riverbed, $l$ is the length of reach, $w$ is the width of river, and $m$ the thickness of riverbed.

The probability distribution of groundwater model output is influenced by input parameters. Therefore, two conditions that the input parameters follow, uniform and normal distributions, are both considered in this study. In addition, the range of uniform distribution is consistent with interval of corresponding normal distribution. The parameters of these two distributions are shown in Table 2.

Table 2 | Probability distributions of model parameters

                                  Uniform distribution         Normal distribution
Model parameter                   Minimum       Maximum        Mean           Variance
Precipitation rate (m/d)          6.0 × 10⁻⁵    6.0 × 10⁻⁴     3.3 × 10⁻⁴     7.1 × 10⁻⁵
Evapotranspiration rate (m/d)     5.0 × 10⁻⁴    5.0 × 10⁻³     2.75 × 10⁻³    5.92 × 10⁻⁴
Constant head (m)                 47.0          52.0           49.5           0.6579
Conductance of riverbed (m²/d)    10.0          500.0          255.0          64.4737
Pumping rate (m³/d)               500.0         3000.0         1750.0         328.9474

The numerical model of synthesized groundwater flow system is built using MODFLOW-2005 (Harbaugh ).

Monte Carlo simulation

The Monte Carlo simulation procedure involves two parts (part I and part II).

Part I

Part I includes the following steps:

1. Generating model mesh, setting the initial head condition, the positions of pumping wells and observation points, etc.
2. Setting the hydraulic conductivity K of model layers. Based on the mean and the covariance function of lnK (Table 1), the random fields of K are generated by the direct Fourier transform method.
3. Setting the boundary conditions, including precipitation rate, maximum evapotranspiration rate, water head of constant head boundary, conductance of riverbed, and pumping rate. A boundary condition is assigned a value by sampling uniformly from the corresponding range (Table 2).
4. Running the established model and collecting the outputs of groundwater model. The outputs include the groundwater levels of observation points in layer 3, the inflow from constant head boundary and precipitation, the outflow from well pumping, evapotranspiration process, and river boundary.
5. Repeating step 2 to step 4 500 times.
6. Conducting frequency analysis for groundwater model outputs. The data series used for frequency analysis is constructed by the output of every realization, e.g., the groundwater levels of an observation point from 1st to 500th realization. Therefore, each data series has 500 samples. The data series include 240 GLS and five groundwater budget series. The procedure of frequency analysis can be summarized as two steps: (1) parameter estimation for each alternative PDF; (2) taking the Kolmogorov–Smirnov test for each PDF. If all the alternative PDFs have poor performance (cannot pass through the Kolmogorov–Smirnov test, and the significance level α was set to 0.05 in this study), we will mark the GLS as an unknown PDF.

Part II

The procedure of part II is the same as that of part I, except for step 3. In this part, step 3, a boundary condition is assigned a value by sampling from corresponding normal distribution.
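The sampling and testing steps above can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: `toy_head` is a hypothetical linear stand-in for a model run and has nothing to do with MODFLOW-2005; the column labeled Variance in Table 2 is used as a standard deviation, because each entry equals (maximum − minimum)/7.6; and only the normal PDF is tested, using the asymptotic 5% Kolmogorov–Smirnov critical value 1.358/√n, which is conservative when the mean and standard deviation are fitted from the same sample. The riverbed value passed to `riverbed_conductance` is illustrative only.

```python
import math
import random
import statistics

# Table 2 rows: (minimum, maximum, mean, spread). "spread" is the column the
# paper labels Variance; it is used as a standard deviation here (assumption).
TABLE2 = {
    "precipitation": (6.0e-5, 6.0e-4, 3.3e-4, 7.1e-5),
    "evapotranspiration": (5.0e-4, 5.0e-3, 2.75e-3, 5.92e-4),
    "constant_head": (47.0, 52.0, 49.5, 0.6579),
    "riverbed_conductance": (10.0, 500.0, 255.0, 64.4737),
    "pumping": (500.0, 3000.0, 1750.0, 328.9474),
}


def riverbed_conductance(k_riv, length, width, thickness):
    """Equation (10): C_Riv = K_Riv * l * w / m."""
    return k_riv * length * width / thickness


def sample_parameters(rng, distribution):
    """Step 3: one draw per boundary condition (part I uniform, part II normal)."""
    draw = {}
    for name, (lo, hi, mean, spread) in TABLE2.items():
        if distribution == "uniform":
            draw[name] = rng.uniform(lo, hi)
        else:
            draw[name] = rng.gauss(mean, spread)
    return draw


def toy_head(p):
    """Hypothetical stand-in for one model run; it is NOT MODFLOW-2005."""
    return p["constant_head"] + 2.0e3 * p["precipitation"] - 1.0e-3 * p["pumping"]


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic_normal(series):
    """Kolmogorov-Smirnov distance between the sample and a fitted normal."""
    xs = sorted(series)
    n = len(xs)
    mu, sigma = statistics.mean(xs), statistics.stdev(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def run(distribution, n_real=500, seed=1):
    """Steps 3 to 6 for one output series of n_real realizations."""
    rng = random.Random(seed)
    heads = [toy_head(sample_parameters(rng, distribution)) for _ in range(n_real)]
    d = ks_statistic_normal(heads)
    critical = 1.358 / math.sqrt(n_real)  # asymptotic value for alpha = 0.05
    return d, d < critical


for dist in ("uniform", "normal"):
    d, passed = run(dist)
    print(dist, round(d, 4), "passes" if passed else "fails")
print(riverbed_conductance(0.5, 25.0, 20.0, 2.0))  # illustrative riverbed values
```

In the paper's procedure each of the 240 GLS and five budget series would be tested against every alternative PDF, not only the normal one.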


RESULTS AND DISCUSSION

Figure 2 shows the frequency distribution of parameters' samples which are sampled from uniform and normal distributions, respectively.

Figure 2 | The frequency distributions of precipitation rate (PREC), evapotranspiration rate (EVAP), constant head (CH), conductance of riverbed (CRIV), and pumping rate (PUMP). The first and second rows show parameters sampled from uniform and normal distributions, respectively.

Frequency analysis

The outputs of groundwater model are tested for each alternative PDF by Kolmogorov–Smirnov test, and the significance level is 0.05. The numbers of GLS which obey normal, log-normal (Log-nor), 2-parameter gamma (G2), log-2-parameter gamma (Log-G2), Pearson type III (P3), log-Pearson type III (Log-P3), uniform, and unknown distribution are denoted as $n_i$ ($i = 1, 2, \ldots, 8$) in order. After that, the ratio for each PDF was calculated as:

$$\mathrm{ratio}_i = n_i / 240 \qquad (11)$$

As shown in Figure 3, the PDF of GLS is strongly influenced by the probability distribution of model input parameters. When the input parameters are sampled from uniform distribution, although a majority of GLS obey unknown distribution (Figure 3(a)), the rest of GLS nearly all obey uniform distribution. Moreover, when the input parameters are sampled from normal distribution, it is obvious that most of the GLS obey normal distribution (Figure 3(b)).

The groundwater budget terms include the inflows from constant head boundary (InCH) and precipitation (InPre), and outflows from river leakage (OutRiv), evapotranspiration (OutEva), and pumping (OutPum). Obviously, the probability distributions of InPre and OutPum are fully controlled by model input parameters (precipitation rate and pumping rate). Figure 4 shows the frequency distributions of InCH, OutRiv, and OutEva. When input parameters are sampled from uniform distribution, none of the budget terms can pass the Kolmogorov–Smirnov test (Figures 4(a)–4(c)). Moreover, the distributions of these budget terms are significantly different from uniform distribution. By contrast, all the budget terms have passed the Kolmogorov–Smirnov test as normal distribution when input parameters are sampled from normal distribution (Figures 4(d)–4(f)).

Stepwise regression analysis

Figure 3 shows that only a part of GLS obeys a specified PDF. The observed GLS show different characteristics of


probability distributions among observation points. The probability distribution of GLS is influenced by the spatial position of an observation point. Thus, the stepwise regression analysis is used to identify the key factors of the mean and variance of GLS. The input variables of regression model are the distances of an observation point from surrounding model boundaries. They are the distances of an observation point from northern boundary, river boundary, southern boundary, constant head boundary, the nearest pumping well, evapotranspiration area, and the average distance from five pumping wells. The variables are listed in Table 3 and numbered from 1 to 7; all of them are normalized before regression analysis.

Figure 3 | The ratios of GLS which obey normal, G2, P3, uniform, Log-nor, Log-G2, Log-P3, and unknown distribution. (a) and (b): input parameters sampled from uniform and normal distributions, respectively.

Figure 4 | Frequency distributions of inflow from constant head boundary (InCH), outflows from river leakage (OutRiv) and evapotranspiration (OutEva). Plots (a), (b), (c) and plots (d), (e), (f): input parameters sampled from uniform and normal distributions, respectively.

As shown in Figure 5, as the input parameters are sampled from uniform and normal distribution, respectively, the sensitivities of influencing factors are almost identical for the mean of GLS, e.g., Figure 5(a) vs. Figure 5(b). However, the influences of regression variables on the variance


of GLS are slightly different for these two distributions, such as Figure 5(c) vs. Figure 5(d).

For the regression analysis of the mean of GLS, four variables (D2, D5, D6, and D7) passed into the regression model. The variable with the largest sensitivity is D2 (the regression coefficient is about 0.97). For the regression analysis of the variance of GLS, four variables (D2, D5, D6, and D7) also passed into the regression model. The variable with the largest sensitivity is also D2 (the regression coefficient is about 0.80). Therefore, the mean and variance of GLS are affected similarly by the regression variables. They are both significantly influenced by the distance from river boundary (D2), and other regression variables have very low influences relative to D2. Furthermore, the regression coefficient of D2 in the mean model is larger than that in the variance model, e.g., Figure 5(a) vs. Figure 5(c), Figure 5(b) vs. Figure 5(d). Thus, the mean of GLS is more dependent on the distance from river boundary than the variance of GLS. In addition, the average distance from five pumping wells (D6) is inversely related with the mean and variance of GLS.

Table 3 | Input variables of stepwise regression model and their numbers

Variable                                                                No.
Distance from an observation point to northern boundary (D1)            1
Distance from an observation point to river boundary (D2)               2
Distance from an observation point to southern boundary (D3)            3
Distance from an observation point to constant head boundary (D4)       4
Distance from an observation point to the nearest pumping well (D5)     5
Average distance from an observation point to five pumping wells (D6)   6
Distance from an observation point to evapotranspiration area (D7)      7

Figure 5 | The regression coefficients of the entered variables in stepwise regression analysis. Plots (a), (b) and plots (c), (d): output variables are the means and variances of GLS, respectively. Plots (a), (c) and plots (b), (d): input parameters sampled from uniform and normal distributions, respectively.

Mutual entropy analysis

Stepwise regression analysis is restricted to monotonic linear problems, whereas mutual entropy analysis is capable of treating the complicated non-monotonic relationship between output and input variables. The same as for stepwise regression analysis, the input variables are also listed in Table 3, and the output variables are the mean and variance of GLS. Tables 4–7 display the contingency tables of mutual entropy analysis.

Figure 6 shows the results of mutual entropy analysis. Similar to the results of stepwise regression analysis, the sensitivities of input variables are similar for the mean and variance of GLS. The most important influencing factors for the mean and variance of GLS are the distances from an observation point to river and constant head boundaries (D2 and D4). In addition, for the mean of GLS, the variables with the weakest sensitivity are the distances from an observation point to northern and southern boundaries (D1 and D3). For the variance of GLS, the distance from an observation point to evapotranspiration area (D7) holds the smallest sensitivity. Nevertheless, the index values of D1, D3, and D7 are very close for the mean and variance of


Tables 4–7 | Contingency tables of the mutual entropy analysis for input variables D1–D7 (cell counts not recoverable from the page layout). The four tables cover:
Contingency tables when Y (labeled by column) is the mean of GLS, and model parameters are sampled from uniform distribution
Contingency tables when Y (labeled by column) is the mean of GLS, and model parameters are sampled from normal distribution
Contingency tables when Y (labeled by column) is the variance of GLS, and model parameters are sampled from uniform distribution
Contingency tables when Y (labeled by column) is the variance of GLS, and model parameters are sampled from normal distribution


GLS. Moreover, when the model input parameters are sampled from uniform and normal distributions, respectively, the influences of input variables are similar for these two distributions.

Figure 6 | Index values of input variables in mutual entropy analysis. The first and second rows: input parameters sampled from uniform and normal distributions, respectively. Plots (a), (c) and plots (b), (d): output variables (Y) are the mean and variance of GLS, respectively.

Compared with the results of stepwise regression analysis, the distance from constant head boundary (D4) has a significant influence on the first two moments of GLS. However, the variable D4 has been excluded from the stepwise regression analyses of both mean and variance of GLS. As shown in Figure 1, the sum of the distances from an observation point to the river boundary and constant head boundary is a constant (the width of the study area, 5,000 m). According to the constructing mechanism of stepwise regression model, the influence of D2 and D4 is presented by only one variable (D2) in stepwise regression analysis. Moreover, D4 is as important as D2. The influence of D2 on the output variables is the inverse of that of D4, and the same holds for D1 and D3. In addition, the relationship between the influence modes of D2 and D4 (or D1 and D3) on output variables is confirmed by the contingency tables in Tables 4–7. Furthermore, the input variables excluded from the stepwise regression model can be identified by mutual entropy analysis. By contrast, these variables are roughly treated as invalid influencing factors by stepwise regression analysis.

Classification tree analysis

Figure 7 displays a conventional diagram that labels the PDF of GLS, when input parameters are sampled from uniform distribution. Figure 8 shows the PDF of GLS when input parameters are sampled from normal distribution.

Figure 3 shows that the PDF of GLS is strongly related to the probability distribution of groundwater model input parameters. However, as shown in Figures 7 and 8, the PDF of GLS is not fully controlled by the probability distribution of input parameters. The category of the PDF of GLS is not uniformly distributed in the space of model layer. The classification tree method is used to identify the driving factors that lead GLS to follow the specific PDF (uniform or normal). The GLS are classified into two categories: 0 obeys uniform or normal distribution when the input parameters are sampled from uniform or normal distribution, respectively; 1 does not obey. The input variables in the classification tree model have identical numbers to the variables used in stepwise regression and mutual entropy analyses (see Table 3). As shown in Figure 9, GLS is passed into subspaces by selecting suitable input variables used for splitting. In addition, the classification tree (four ranks in this paper) is built by constantly splitting. The maximum purity was set as 0.82 in this classification tree. The results indicate that


when the groundwater model input parameters are sampled from uniform distribution, only two variables (D6 and D2) entered into the classification tree model. Therefore, the probability distribution of GLS is driven by D6 and D2. Furthermore, when the model input parameters are sampled from normal distribution, the tree model contains four variables, and the entry order is D6, D2, D5, and D1. Moreover, variable D6 and D2 are also the most significant driving factors.

Groundwater is a complex system affected by many factors. According to the central limit theorem, when a system is constructed by a large number of independent random variables, each with finite mean and variance, the output of the system will be approximately normally distributed. Thus, when the groundwater model parameters are sampled from normal and uniform distributions, respectively, the outputs of groundwater model following normal distribution are many more than that following uniform distribution (see Figures 3, 4, 7 and 8). Moreover, Figure 9 shows that whether the GLS obey normal distribution is controlled by more driving factors than that leading GLS to obey uniform distribution.

As has been stated, the key driving factors of GLS are D2 and D6. In addition, variable D2 obtains a significant importance in stepwise regression and mutual entropy analyses. However, the mean and variance of GLS are slightly influenced by variable D6. As a result, the mean and variance of GLS are both controlled by the distance from observation point to river boundary (or constant head boundary). The category of the PDF of GLS is dominated by the average distance from observation point to five pumping wells, and the distance from observation point to river boundary (or constant head boundary).

Figure 7 | Conventional diagram labeling the probability distribution of GLS when input parameters are sampled from uniform distribution.

Figure 8 | Conventional diagram labeling the probability distribution of GLS when input parameters are sampled from normal distribution.

Figure 9 | Splitting process of classification tree analysis. SZ denotes sample size, P denotes the purity of a space, SV denotes splitting variable. (a) and (b) indicate groundwater model parameters are sampled from uniform and normal distribution, respectively.
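The splitting step of a classification tree can be illustrated with a small sketch. The paper does not state its splitting criterion, so the example assumes that purity means the share of the majority class in a subspace and that each split maximizes the size-weighted purity of the two children. The functions `purity` and `best_split` and the eight toy observation points are hypothetical, with labels following the 0/1 convention described above.

```python
def purity(labels):
    """Share of the majority class in a node (assumed meaning of purity)."""
    if not labels:
        return 1.0
    ones = sum(labels)
    return max(ones, len(labels) - ones) / len(labels)


def best_split(rows, labels, names):
    """Return (weighted purity, variable name, threshold) of the best split."""
    best = None
    n = len(rows)
    for j, name in enumerate(names):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between neighboring values.
        for lo, hi in zip(values, values[1:]):
            t = 0.5 * (lo + hi)
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * purity(left) + len(right) * purity(right)) / n
            if best is None or score > best[0]:
                best = (score, name, t)
    return best


# Hypothetical normalized (D2, D6) pairs; label 1 means the GLS does not obey
# the target PDF. Here the label depends only on D6.
rows = [(0.1, 0.9), (0.8, 0.8), (0.3, 0.7), (0.9, 0.6),
        (0.2, 0.4), (0.7, 0.3), (0.4, 0.2), (0.6, 0.1)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(best_split(rows, labels, ("D2", "D6")))
```

Applying `best_split` recursively to each child until a purity threshold or a maximum depth is reached would give a tree of the kind summarized in Figure 9.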



CONCLUSIONS

The uncertainty of groundwater modeling can be represented by the characteristics of probability distribution of model outputs. Based on a synthetic groundwater model, and the sensitivity analysis of the probability distributions of model outputs, the following conclusions are drawn:

1. The characteristics of probability distribution of groundwater model output are analyzed and summarized. The most important influencing factors for the mean and variance of GLS are the distances from an observation point to river and constant head boundaries. The most important driving factor for the PDF of GLS is the average distance from an observation point to the five pumping wells. In addition, the distribution characteristics of groundwater model outputs (GLS and budget terms) are significantly influenced by the probability distribution of input parameters.
2. Stepwise regression analysis is a defective sensitivity analysis method for identifying multiple influencing factors which have similar correlation structure with output variable. By contrast, mutual entropy analysis is more general in identifying complicated multivariate relationships. Furthermore, mutual entropy analysis is able to identify the influence of variables which are excluded from stepwise regression analysis. Moreover, classification tree analysis is an effective method for analyzing the key driving factors in a classification output system.
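The second conclusion can be made concrete with a small numerical sketch. Because D2 + D4 equals the 5,000 m width of the study area, the two variables carry the same information about any output, so a contingency-table estimate of mutual information ranks them equally even though a stepwise procedure keeps only one of them. The example below is a minimal illustration, not the paper's implementation: the four equal-count intervals, the helper names, and the synthetic response `y` are all assumptions.

```python
import math
import random
from collections import Counter


def equal_count_bins(values, n_bins):
    """Assign each value to one of n_bins intervals holding equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def mutual_information(x_bins, y_bins):
    """Mutual information (nats) from the contingency table of two series."""
    n = len(x_bins)
    joint = Counter(zip(x_bins, y_bins))  # contingency table cells
    px = Counter(x_bins)                  # row totals
    py = Counter(y_bins)                  # column totals
    mi = 0.0
    for (xb, yb), c in joint.items():
        mi += (c / n) * math.log(c * n / (px[xb] * py[yb]))
    return mi


rng = random.Random(0)
width = 5000.0
d2 = [rng.uniform(0.0, width) for _ in range(240)]   # distance to river
d4 = [width - v for v in d2]                         # distance to constant head
d1 = [rng.uniform(0.0, width) for _ in range(240)]   # unrelated variable
# Hypothetical output that depends only on the position between the boundaries.
y = [40.0 + 0.002 * v + rng.gauss(0.0, 0.5) for v in d2]

y_bins = equal_count_bins(y, 4)
mi_d2 = mutual_information(equal_count_bins(d2, 4), y_bins)
mi_d4 = mutual_information(equal_count_bins(d4, 4), y_bins)
mi_d1 = mutual_information(equal_count_bins(d1, 4), y_bins)
print(mi_d2, mi_d4, mi_d1)
```

The index values reported in Figure 6 may be normalized differently; the sketch only shows why mirrored variables such as D2 and D4 receive the same score.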

ACKNOWLEDGEMENTS

This study was supported by the National Natural Science Fund of China (Nos. 41172207, 41030746, 51190091, and 41071018), Program for New Century Excellent Talents in University (NCET-12-0262), China Doctoral Program of Higher Education (20120091110026), Qing Lan Project, the Skeleton Young Teachers Program and Excellent Disciplines Leaders in Midlife-Youth Program of Nanjing University.

REFERENCES

Bergante, S., Facciotto, G. & Minotta, G. Identification of the main site factors and management intensity affecting the establishment of Short-Rotation-Coppices (SRC) in Northern Italy through stepwise regression analysis. Cent. Eur. J. Biol. 5, 522–530.
Blasone, R. S., Vrugt, J. A., Madsen, H., Rosbjerg, D., Robinson, B. A. & Zyvoloski, G. A. Generalized likelihood uncertainty estimation (GLUE) using adaptive Markov chain Monte Carlo sampling. Adv. Water Resour. 31, 630–648.
Chen, Y. F., Hou, Y., Van Gelder, P. & Zhigui, S. Study of parameter estimation methods for Pearson-III distribution in flood frequency analysis. Iahs-Aish P 271, 263–269.
Englehart, P. J. & Douglas, A. V. Diagnosing warm-season rainfall variability in Mexico: A classification tree approach. Int. J. Climatol. 30, 694–704.
Esther, A., Groeneveld, J., Enright, N. J., Miller, B. P., Lamont, B. B., Perry, G. L. W., Blank, F. B. & Jeltsch, F. Sensitivity of plant functional types to climate change: classification tree analysis of a simulation model. J. Veg. Sci. 21, 447–461.
Gungor, O. & Goncu, S. Application of the soil and water assessment tool model on the Lower Porsuk Stream Watershed. Hydrol. Process. 27, 453–466.
Haktanir, T. Comparison of various flood frequency distributions using annual flood peaks data of rivers in Anatolia. J. Hydrol. 136, 1–31.
Harbaugh, A. W. The U.S. Geological Survey modular groundwater model–the Ground-Water Flow Process. U.S. Geological Survey Techniques and Methods 6-A16, pp. 81–84.
Hashemi, H., Berndtsson, R., Kompani-Zare, M. & Persson, M. Natural vs. artificial groundwater recharge, quantification through inverse modeling. Hydrol. Earth Syst. Sci. 17, 637–650.
Hassan, A. E., Bekhit, H. M. & Chapman, J. B. Uncertainty assessment of a stochastic groundwater flow model using GLUE analysis. J. Hydrol. 362, 89–109.
Huysmans, M., Madarasz, T. & Dassargues, A. Risk assessment of groundwater pollution using sensitivity analysis and a worst-case scenario analysis. Environ. Geol. 50, 180–193.
Katz, R. W., Parlange, M. B. & Naveau, P. Statistics of extremes in hydrology. Adv. Water Resour. 25, 1287–1304.
Lang, M., Pobanz, K., Renard, B., Renouf, E. & Sauquet, E. Extrapolation of rating curves by hydraulic modelling, with application to flood frequency analysis. Hydrol. Sci. J. 55, 883–898.
MacQuarrie, C. J. K., Spence, J. R. & Langor, D. W. Using classification tree analysis to reveal causes of mortality in an insect population. Agr. Forest Entomol. 12, 143–149.
Mazzilli, N., Guinot, V. & Jourde, H. Sensitivity analysis of two-dimensional steady-state aquifer flow equations. Implications for groundwater flow model calibration and validation. Adv. Water Resour. 33, 905–922.
McMahon, T. A. & Srikanthan, R. Log Pearson III distribution – Is it applicable to flood frequency-analysis of Australian streams. J. Hydrol. 52, 139–147.
Melo, I., Tomasik, B., Torrieri, G., Vogel, S., Bleicher, M., Korony, S. & Gintner, M. Kolmogorov–Smirnov test and its use for the identification of fireball fragmentation. Phys. Rev. C. 80, 024904.
Mishra, S., Deeds, N. E. & RamaRao, B. S. Application of classification trees in the sensitivity analysis of probabilistic model results. Reliab. Eng. Syst. Safe. 79, 123–129.
Mishra, S., Deeds, N. & Ruskauff, G. Global sensitivity analysis techniques for probabilistic ground water modeling. Ground Water 47, 730–747.
Morway, E. D., Niswonger, R. G., Langevin, C. D., Bailey, R. T. & Healy, R. W. Modeling variably saturated subsurface solute transport with MODFLOW-UZF and MT3DMS. Ground Water 51, 237–251.
Mpimpas, H., Anagnostopoulos, P. & Ganoulis, J. Uncertainty of model parameters in stream pollution using fuzzy arithmetic. J. Hydroinf. 10, 189–200.
Myles, A. J., Feudale, R. N., Liu, Y., Woody, N. A. & Brown, S. D. An introduction to decision tree modeling. J. Chemometr. 18, 275–285.
Neppel, L., Renard, B., Lang, M., Ayral, P.-A., Coeur, D., Gaume, E., Jacob, N., Payrastre, O., Pobanz, K. & Vinet, F. Flood frequency analysis using historical data: accounting for random and systematic errors. Hydrol. Sci. J. 55, 192–208.
Onoz, B. & Bayazit, M. Best-fit distributions of largest available flood samples. J. Hydrol. 167, 195–208.
Pappenberger, F., Beven, K. J., Ratto, M. & Matgen, P. Multimethod global sensitivity analysis of flood inundation models. Adv. Water Resour. 31, 1–14.
Robin, M. J. L., Gutjahr, A. L., Sudicky, E. A. & Wilson, J. L. Cross-correlated random-field generation with the direct Fourier-transform method. Water Resour. Res. 29, 2385–2397.
Rojas, R., Feyen, L. & Dassargues, A. Sensitivity analysis of prior model probabilities and the value of prior knowledge in the assessment of conceptual model uncertainty in groundwater modelling. Hydrol. Process. 23, 1131–1146.
Ross, S. M. Introduction to Probability and Statistics for Engineers and Scientists. Elsevier Academic Press, San Diego, CA.
Singh, V. P. & Singh, K. a Derivation of the gamma distribution by using the principle of maximum-entropy (POME). Water Resour. Bull. 21, 941–952.
Singh, V. P. & Singh, K. b Derivation of the Pearson Type (PT) III distribution by using the principle of maximum-entropy (POME). J. Hydrol. 80, 197–214.
Smakhtin, V. U. Low flow hydrology: a review. J. Hydrol. 240, 147–186.
Sun, C. X. & Zheng, S. Q. Some results of parameter estimator based on uniform distribution. Coll. Math. J. 22, 130–134.
Vogel, R. M., Mcmahon, T. A. & Chiew, F. H. S. Floodflow frequency model selection in Australia. J. Hydrol. 146, 421–449.
Wang, D., Singh, V. P., Zhu, Y. S. & Wu, J. C. Stochastic observation error and uncertainty in water quality evaluation. Adv. Water Resour. 32, 1526–1534.
Wang, F. G. & Wang, X. D. Fast and robust modulation classification via Kolmogorov-Smirnov test. IEEE T. Commun. 58, 2324–2332.
Wu, J. C., Lu, L. & Tang, T. Bayesian analysis for uncertainty and risk in a groundwater numerical model's predictions. Hum. Ecol. Risk Assess. 7, 1310–1331.
Ye, M., Pohlmann, K. F., Chapman, J. B., Pohll, G. M. & Reeves, D. M. A model-averaging method for assessing groundwater conceptual model uncertainty. Ground Water 48, 716–728.
Zeng, X. K., Wang, D. & Wu, J. C. Sensitivity analysis of the probability distribution of groundwater level series based on information entropy. Stoch. Environ. Res. Risk Assess. 26, 345–356.
Zeng, X. K., Wang, D., Wu, J. C. & Chen, X. Reliability analysis of the groundwater conceptual model. Hum. Ecol. Risk Assess. 19, 515–525.
Zhang, P., Aagaard, P., Nadim, F., Gottschalk, L. & Haarstad, K. Sensitivity analysis of pesticides contaminating groundwater by applying probability and transport methods. Integr. Environ. Assess. Manag. 5, 414–425.
Zhang, X., Hoermann, G. & Fohrer, N. Parameter calibration and uncertainty estimation of a simple rainfall-runoff model in two case studies. J. Hydroinformat. 14, 1061–1074.

First received 5 January 2013; accepted in revised form 13 June 2013. Available online 12 July 2013


© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

An optimization model for water resources allocation risk analysis under uncertainty

Y. L. Xie and G. H. Huang

ABSTRACT

In order to deal with the risk of low system stability and unbalanced allocation during water resources management under uncertainties, a risk-averse inexact two-stage stochastic programming model is developed for supporting regional water resources management. Methods of interval-parameter programming and conditional value-at-risk model are introduced into a two-stage stochastic programming framework, thus the developed model can tackle uncertainties described in terms of interval values and probability distributions. In addition, the risk-aversion method was incorporated into the objective function of the water allocation model to reflect the preference of decision makers, such that the trade-off between system economy and extreme expected loss under different water inflows could be analyzed. The proposed model was applied to handle a water resources allocation problem. Several scenarios corresponding to different river inflows and risk levels were examined. The results demonstrated that the model could effectively communicate the interval-format and random uncertainties, and risk aversion into optimization process, and generate inexact solutions that contain a spectrum of water resources allocation options. They could be helpful for seeking cost-effective management strategies under uncertainties. Moreover, it could reflect the decision maker's attitude toward risk aversion, and generate potential options for decision analysis in different system-reliability levels.

Key words | conditional value-at-risk, inexact two-stage stochastic programming, risk analysis, uncertainty, water resources allocation

Y. L. Xie
G. H. Huang (corresponding author)
MOE Key Laboratory of Regional Energy and Environmental Systems Optimization, Resources and Environmental Research Academy, North China Electric Power University, Beijing 102206, China
E-mail: guohe.huang3@gmail.com

doi: 10.2166/hydro.2013.239

INTRODUCTION

Water resources are critical for human survival, and human society would be unable to prosper or even exist without them. The ever-growing conflicting demand for water resources supplies threatens the sustainability of this essential resources recycling. Coupled with rapidly increasing water demand, decreasing usable water supplies and poor management have led to inefficient water resources allocation, and the unsustainable use of water resources with significant economic, social, and environmental ramifications. Moreover, in water resources systems, many system parameters and their inter-relationship may appear uncertain. Such uncertainties, which would affect the related exercises for generating desired water resources management schemes, may be caused by the errors in acquired data, variations in spatial and temporal units, and incompleteness or impreciseness of observed information (McIntyre et al. ; Maqsood et al. ). Therefore, it is desired that the uncertainties should be considered in water allocation planning programming.

Over the past decades, inexact optimization models have been widely used to tackle uncertainties and complexities in water resources allocation problems, and a majority of them were based on fuzzy, stochastic, and interval-parameter programming (abbreviated as FMP, SMP, and IPP), as well as their combinations (Slowinski et al. ; Wagner et al. ; Huang ; Chang et al. ; Russell & Campbell ; Wang & Du ; Li et al. , ; Li & Huang ; Cetinkaya et al. ; Simonovic ; Guo & Huang ; Xu & Qin ; Lv et al. ). For example, Huang () developed


145

Y. L. Xie & G. H. Huang

|

Water resources allocation management and risk analysis model

Journal of Hydroinformatics

|

16.1

|

2014

an interval chance-constraint programming model for water quality management in a Chinese city, which allowed probability distributions and discrete intervals to be incorporated within the optimization process. Jairaj & Vedula () optimized a multi-reservoir system through using a fuzzy mathematical programming method, where uncertainties existing in reservoir inflows were treated as fuzzy sets. Faye et al. () proposed a long-term water resources allocation model for an irrigation management problem of a reservoir system, where the fuzzy logic presented as a particularly adequate means to refine on-line the formulation of the objective function of the recurrent optimization problem. Teegavarapu & Elshorbagy () proposed a fuzzy mean squared error measure to evaluate the performance of time series prediction models in water resources, where membership functions derived from a number of modeler preferences could be easily aggregated to obtain a single integrated membership function. Chaves & Kojiri () applied a stochastic fuzzy neural network model for the optimization of reservoir monthly operational strategies considering maximum water utilization and improvements on water quality simultaneously, where the stochastic fuzzy neural network was defined as a fuzzy neural network model stochastically trained by a genetic algorithm. Zhang et al. () introduced an inexact-stochastic dual water supply programming model for regional water resources management, which was based on analysis of the inexact characteristics in demand and supply subsystems of a dual water supply system and their dynamic interactions. Lu et al. () advanced an interval-valued fuzzy linear-programming method based on infinite α-cuts for an agricultural irrigation problem, where a two-step infinite α-cuts solution method is communicated to the solution process to discretize infinite α-cuts to interval-valued fuzzy membership functions. Tran et al. () developed a stochastic dynamic programming model for reservoir water management strategy planning in southern Vietnam, where multi-users, stochastic water level, the timing and quantity of water release, and climatic conditions were considered. Liu et al. () proposed an interval-parameter chance-constrained fuzzy multi-objective programming model for assisting water pollution control within a sustainable wetland management system, where the proposed approach can effectively handle the uncertainties and complexities in the water pollution control systems. However, in water resources planning practice, when it comes to the violation of some overriding policies, those methods/models would fail to analyze the economic consequences; also, none of the above methods could facilitate the analysis of various policy scenarios that were associated with different levels of economic penalties when the promised targets were violated in the water resources management process.

Inexact two-stage stochastic programming (ITSP), coupled with two-stage stochastic programming (TSP) and IPP, is an attractive technique to help overcome the above shortcomings. In the ITSP, a decision is first undertaken before values of random variables are known; then, after the random events have happened and their values are known, a second-stage decision can be made in order to minimize ‘penalties’ that may appear due to any infeasibility (Loucks et al. ; Birge ). ITSP methods have been widely explored in water resources management in the past decades (Ferrero et al. ; Huang & Loucks ; Seifi & Hipel ; Luo et al. ; Maqsood et al. ; Li et al. , ; Guo et al. ; Huang et al. ; Wang & Huang ). For example, Maqsood et al. () developed an interval-parameter fuzzy two-stage stochastic programming method for water resources systems planning and management under uncertainty. Li & Huang () proposed an inexact two-stage stochastic nonlinear programming model for supporting decisions of water resources allocation within a multi-reservoir system. Wang & Huang () developed an interactive two-stage stochastic fuzzy programming model for water allocation management, where the method can not only tackle dual uncertainties presented as fuzzy boundary intervals, but also permit in-depth analyses of various policy scenarios. Huang et al. () developed an integrated optimization method for supporting agriculture water management and planning in Tarim River Basin, Northwest China, where the method couples ITSP and quadratic programming. In general, ITSP is effective for problems where an analysis of policy scenarios is desired and the related data are random/interval format in nature. However, in the previous study, the minimum cost or maximum net benefit are usually considered as the objective in a general ITSP model, which could lead to the problems of low system stability and unbalanced allocation risk. Most of the models generated by the ITSP methods for water resources management take the system benefit as the objective without considering the risk



aversion, which should also be incorporated in the proposed inexact stochastic programming approach. Incorporating risk measures in the objective functions within other optimization methods is a fairly recent research topic. An alternative risk measure, namely conditional value-at-risk (CVaR), proposed by Rockafellar & Uryasev (), is a widely accepted risk measure (Ahmed ; Schultz & Tiedemann ; Fábián ). The CVaR model is a new risk measurement method based on probability distributions of random variables, and has been widely used for portfolio selection (Kall & Mayer ; Klein Haneveld & Van der Vlerk ; Schultz & Neise ; Liu et al. ). Previously, the application of CVaR in the water resources management field has been relatively limited. For example, Piantadosi et al. () developed a stochastic dynamic programming model with CVaR for supporting urban storm water management. Shao et al. () proposed a stochastic dynamic programming model with CVaR constraints for supporting water resources management under uncertainty. Nevertheless, most of the models take the system risk as the constraints, and

no previous studies were focused on development of a risk-aversion inexact two-stage stochastic programming (RITSP) method through integrating IPP, TSP, and CVaR into a general framework for water resources allocation management with considering the risk aversion in the system objective.

Figure 1 | Framework of the RITSP model.

Therefore, the aim of this study is to develop a RITSP method for water resources allocation management under uncertainty. It is the first attempt where IPP, TSP, and CVaR methods are integrated into a general framework of a maximum benefit objective in the water resources allocation problem under uncertainties presented as interval values and probabilities. A case study will demonstrate the performance of the RITSP method in water resources management systems planning under uncertainty. Furthermore, it will be shown how it can be used to generate water allocation policies under a given risk level, as well as to determine which designs can most efficiently lead to the optimized system objectives.

METHODOLOGY

An RITSP model was based on IPP, CVaR model, and TSP. Figure 1 presents the general framework of the RITSP method, which is based on IPP, TSP, and CVaR techniques. Each technique has a unique contribution in enhancing the RITSP’s capacities for tackling the uncertainties and system risk. For example, the probability distributions and policy implications were handled through TSP; the uncertainties presented as discrete intervals were reflected through IPP; the system risk was addressed by CVaR. The modeling framework would offer feasible and reliable solutions under different scenarios of allocation targets, which are helpful for decision makers (Maqsood et al. ).

Two-stage stochastic programming

Consider a typical water resources management system in a region, where a water resources manager is responsible for allocation of limited water to multiple competing users during a planning horizon. The water manager needs to promise each user an allocation target in the management process, which can help the water users make their generation plans. If the promised water is delivered, it will result in net benefit to the local economy and drive the


regional industry development; however, if the promised water is not delivered, the benefit will be reduced, due to the curtailed demand and the imposed penalty. Since the amount of available water is random, this water allocation problem can be formulated as a two-stage stochastic programming with the objective of maximizing the expected value of economic activity in the region. The general form of TSP problems reads: $z(x,\omega) = cx - Q(x,\omega)$, and a TSP model can be formulated as follows (Birge & Louveaux ):

$$f = \max\; cx - E_{\omega\in\Omega}[Q(x,\omega)] \tag{1a}$$

subject to

$$ax \le b \tag{1b}$$

$$x \ge 0 \tag{1c}$$

where $f$ is the system benefit, $x$ is the first-stage decision of water allocation made before the random variable $\omega$ is observed ($\omega \in \Omega$), and $c$ is the benefit coefficients of first-stage variable $x$ in the objective function; $a$ is the technical coefficients, $b$ is right-hand side coefficients, and $Q(x,\omega)$ is the optimal value of the following nonlinear programming:

$$\min\; q(y,\omega) \tag{2a}$$

subject to

$$D(\omega)y \ge h(\omega) + T(\omega)x \tag{2b}$$

$$y \ge 0 \tag{2c}$$

where $y$ is the second-stage adaptive decision, which depends on the realization of the random variable. $q(y,\omega)$ denotes the second-stage cost function, while $\{D(\omega), h(\omega), T(\omega) \mid \omega \in \Omega\}$ are random model parameters with reasonable dimensions, which are functions of the random variable $\omega$. By letting random variables $\omega$ take discrete values $\omega_h$ with probability levels $p_h$ ($h = 1, 2, \ldots, v$ and $\sum_{h=1}^{v} p_h = 1$), the expected value of the second-stage optimization problem can be expressed as:

$$E_{\omega\in\Omega}[Q(x,\omega)] = \sum_{h=1}^{v} p_h\, Q(x,\omega_h) \tag{3}$$

For each realization of random variable $\omega_h$, a second-stage decision is made, which is denoted by $y_h$. The second-stage optimization problem can be rewritten as:

$$\min\; q(y_h,\omega_h) \tag{4a}$$

subject to

$$D(\omega_h)y_h \ge h(\omega_h) + T(\omega_h)x \tag{4b}$$

$$y_h \ge 0 \tag{4c}$$

Thus, Model (1) can be equivalently formulated as a linear programming model (Ahmed et al. ):

$$f = \max\; cx - \sum_{h=1}^{v} p_h\, q(y_h,\omega_h) \tag{5a}$$

subject to

$$ax \le b \tag{5b}$$

$$D(\omega_h)y_h \ge h(\omega_h) + T(\omega_h)x \tag{5c}$$

$$x \ge 0 \tag{5d}$$

$$y_h \ge 0 \tag{5e}$$

Risk-averse two-stage stochastic programming

In the TSP, the first-stage decisions are deterministic and the second-stage decisions are allowed to depend on the elementary events, i.e., $y_h = y(\omega_h)$. Basically, the second-stage decisions represent the operational decisions, which change depending on the realized values of the random data. The objective function $Q(x,\omega)$ of the second-stage problem, also



known as the recourse (benefit) function, is a random variable and therefore, the total profit function $z(x,\omega)$ is a random variable. Determining the optimal decision vector $x$ leads to the problem of comparing random profit variables $z(x,\omega)$. Comparing random variables is one of the main interests of decision theory in the presence of uncertainty. While comparing random variables, it is crucial to consider the effect of variability, which leads to the concept of risk. The preference relations among random variables can be specified using a risk measure. One of the main approaches in the practice of decision making under risk uses mean-risk models (Ogryczak & Ruszczyński ). In these models, one minimizes the mean-risk function, which involves a specified risk measure $\rho: Z \to R$, where $\rho$ is a functional and $Z$ is a linear space of $F$-measurable functions on the probability space $(\Omega, F, P)$:

$$\max\{E(z(x,\omega)) - \lambda\rho(z(x,\omega))\} \tag{6}$$

In this approach, $\lambda$ is a nonnegative trade-off coefficient representing the exchange rate of mean benefit for risk, and also refers to it as a risk coefficient, which is specified by decision makers according to their risk preferences. Usually, when typical dispersion statistics, such as variance, are used as risk measures, the mean-risk approach may lead to inferior solutions. In order to remedy this drawback, models with alternative asymmetric risk measures, such as downside risk measures, have been proposed (Ogryczak & Ruszczyński ), and conditional value-at-risk (CVaR) measure which is based on the value-at-risk (VaR) was widely applied in many areas to downside risk measures among the popular risk-aversion methods.

VaR is a measure computed as the maximum profit value (e.g., $z$) such that the probability of the profit being lower than or equal to this value (e.g., $l$) is lower than or equal to $1 - \alpha$:

$$\mathrm{VaR} = \max\{l \mid p(z \le l) \le 1 - \alpha\} \tag{7}$$

CVaR at level $\alpha$, in a simple way, is defined as follows (Rockafellar & Uryasev , ):

$$\mathrm{CVaR}(z) = E(z \mid z \le \mathrm{VaR}(z)) \tag{8}$$

VaR has the additional difficulty, for stochastic problems, that it requires the use of binary variables for its modeling. Instead, computation of CVaR does not require the use of binary variables and it can be modeled by the simple use of linear constraints. The concept of CVaR is illustrated in Figure 2. CVaR($z$) is the conditional expected value not exceeding the value under the confidence level $\alpha$. The CVaR at the confidence level $\alpha$ is given by:

$$\mathrm{CVaR}_\alpha(z) = \inf_{\xi\in R}\left\{\xi - \frac{1}{1-\alpha}E[\xi - z]^{+}\right\} \tag{9}$$

where $\xi$ is an auxiliary variable, which is the maximum value at the cumulative probability $\alpha$.

Thus, Model (6) can be redefined as:

$$\max\{E(z(x,\omega)) - \lambda\,\mathrm{CVaR}_\alpha(z(x,\omega))\} \tag{10}$$

In addition, $\mathrm{CVaR}_\alpha(z + a) = \mathrm{CVaR}_\alpha(z) + a$, $a \in R$ (Birbil et al. ), therefore, $\mathrm{CVaR}_\alpha(z(x,\omega)) = \mathrm{CVaR}_\alpha(cx - Q(x,\omega)) = cx - \mathrm{CVaR}_\alpha(Q(x,\omega))$, Model (10) can be reformed as the following linear programming problem:

$$\max\; f = (1-\lambda)cx - \sum_{h=1}^{v} p_h\, q(y_h,\omega_h) + \lambda\left\{\xi - \frac{1}{1-\alpha}\sum_{h=1}^{v} p_h V_h\right\} \tag{11a}$$

subject to

$$ax \le b \tag{11b}$$

Figure 2 | VaR and CVaR illustration.
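As a worked illustration of the expression in Equation (9), the sketch below evaluates CVaR for a small discrete profit distribution. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `cvar_discrete` and the profit figures are hypothetical, and the auxiliary variable ξ is selected by maximization, which is how ξ enters the maximizing objective (11a).

```python
def cvar_discrete(profits, probs, alpha):
    """Return (xi, cvar) for a discrete profit distribution.

    Evaluates xi - E[(xi - z)^+] / (1 - alpha), the expression in
    Equation (9), and picks the xi that maximizes it (an assumption
    of this sketch).
    """
    def objective(xi):
        # Expected shortfall of the profit z below the threshold xi.
        shortfall = sum(p * max(xi - z, 0.0) for z, p in zip(profits, probs))
        return xi - shortfall / (1.0 - alpha)

    # The objective is concave and piecewise linear in xi, with breakpoints
    # at the profit values, so a maximizer can be found among those values.
    xi = max(profits, key=objective)
    return xi, objective(xi)


# Hypothetical scenario profits and probabilities (not taken from the paper).
profits = [100.0, 200.0, 300.0, 400.0]
probs = [0.1, 0.2, 0.3, 0.4]
xi, cvar = cvar_discrete(profits, probs, alpha=0.8)
print(f"xi = {xi:.1f}, CVaR = {cvar:.1f}")
```

With α = 0.8 the worst 20% of outcomes consist of the profit 100 (probability 0.1) and half of the probability mass at 200, so the result can be checked by hand as the average of those outcomes. In a model such as (11), the same quantity is obtained through the linear constraints on $V_h$ rather than by enumeration.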


$$D(\omega_h)y_h \ge h(\omega_h) + T(\omega_h)x, \quad h = 1, 2, \ldots, v \tag{11c}$$

$$V_h \ge \xi - cx + q(y_h,\omega_h), \quad h = 1, 2, \ldots, v \tag{11d}$$

$$V_h \ge 0, \quad h = 1, 2, \ldots, v \tag{11e}$$

$$x \ge 0 \tag{11f}$$

$$y_h \ge 0, \quad h = 1, 2, \ldots, v \tag{11g}$$

Risk-averse inexact two-stage stochastic programming

However, in water resources optimization problems, uncertainty presented as interval numbers is more straightforward than probability density functions (PDFs) due to the poor quality of information that can be obtained (Li et al. ). Thus, by introducing the interval parameter programming to quantify those uncertainties presented in terms of interval values, Model (11) can be transformed into the following RITSP model:

$$\max\; f^{\pm} = (1-\lambda)c^{\pm}x^{\pm} - \sum_{h=1}^{v} p_h\, q(y_h^{\pm},\omega_h^{\pm}) + \lambda\left\{\xi^{\pm} - \frac{1}{1-\alpha}\sum_{h=1}^{v} p_h V_h^{\pm}\right\} \tag{12a}$$

subject to

$$a^{\pm}x^{\pm} \le b^{\pm} \tag{12b}$$

$$D(\omega_h^{\pm})y_h^{\pm} \ge h(\omega_h^{\pm}) + T(\omega_h^{\pm})x^{\pm}, \quad h = 1, 2, \ldots, v \tag{12c}$$

$$V_h^{\pm} \ge \xi^{\pm} - c^{\pm}x^{\pm} + q(y_h^{\pm},\omega_h^{\pm}), \quad h = 1, 2, \ldots, v \tag{12d}$$

$$V_h^{\pm} \ge 0, \quad h = 1, 2, \ldots, v \tag{12e}$$

$$x^{\pm} \ge 0 \tag{12f}$$

$$y_h^{\pm} \ge 0, \quad h = 1, 2, \ldots, v \tag{12g}$$

where $a^{\pm}, c^{\pm}, x^{\pm}, b^{\pm}, y_h^{\pm}, \xi^{\pm}, \omega_h^{\pm}, V_h^{\pm} \in \{R^{\pm}\}$, and $\{R^{\pm}\}$ denotes a set of interval parameters and/or variables; superscript ‘±’ means interval-valued feature; the ‘–’ and ‘+’ superscripts represent lower and upper bounds of an interval parameter/variable, respectively.

Solution of the RITSP model

Model (12) can be transformed into two deterministic submodels that correspond to the lower and upper bounds of desired objective function value. This transformation process is based on an interactive algorithm, which is different from the best/worst case analysis (Huang et al. ). The objective function value corresponding to $f^{+}$ is desired first because the objective is to maximize net system benefit. The submodel to find $f^{+}$ can be first formulated as follows (assume that $c^{\pm} \ge 0$, $A^{\pm} \ge 0$, and $b^{\pm} \ge 0$):

$$\max\; f^{+} = (1-\lambda)c^{+}x - \sum_{h=1}^{v} p_h\, q(y_h^{-},\omega_h^{-}) + \lambda\left\{\xi^{+} - \frac{1}{1-\alpha}\sum_{h=1}^{v} p_h V_h^{-}\right\} \tag{13a}$$

subject to

$$x = x^{-} + \mu(x^{+} - x^{-}) \tag{13b}$$

$$0 \le \mu \le 1 \tag{13c}$$

$$a^{-}x \le b^{+} \tag{13d}$$

$$D(\omega_h^{-})y_h^{-} \ge h(\omega_h^{-}) + T(\omega_h^{-})x, \quad h = 1, 2, \ldots, v \tag{13e}$$

$$V_h^{-} \ge \xi^{+} - c^{+}x + q(y_h^{-},\omega_h^{-}), \quad h = 1, 2, \ldots, v \tag{13f}$$

$$V_h^{-} \ge 0, \quad h = 1, 2, \ldots, v \tag{13g}$$

$$y_h^{-} \ge 0, \quad h = 1, 2, \ldots, v \tag{13h}$$

where $\mu$ and $y_h^{-}$ are decision variables. The optimal $f_{\mathrm{opt}}^{+}$, $\mu_{\mathrm{opt}}$ and $y_{h\,\mathrm{opt}}^{-}$ would be obtained through solving the Submodel



(13), and $x_{\mathrm{opt}} = x^{-} + \mu_{\mathrm{opt}}(x^{+} - x^{-})$ is the optimized first-stage variable, which may correspond to the optimized upper-bound objective function value. Based on the above solutions, the second submodel for $f^{-}$ can be formulated as follows:

$$\max\; f^{-} = (1-\lambda)c^{-}x_{\mathrm{opt}} - \sum_{h=1}^{v} p_h\, q(y_h^{+},\omega_h^{+}) + \lambda\left\{\xi^{-} - \frac{1}{1-\alpha}\sum_{h=1}^{v} p_h V_h^{+}\right\} \tag{14a}$$

subject to

$$a^{+}x_{\mathrm{opt}} \le b^{-} \tag{14b}$$

$$D(\omega_h^{+})y_h^{+} \ge h(\omega_h^{+}) + T(\omega_h^{+})x_{\mathrm{opt}}, \quad h = 1, 2, \ldots, v \tag{14c}$$

$$V_h^{+} \ge \xi^{-} - c^{-}x_{\mathrm{opt}} + q(y_h^{+},\omega_h^{+}), \quad h = 1, 2, \ldots, v \tag{14d}$$

$$V_h^{+} \ge V_h^{-} \ge 0, \quad h = 1, 2, \ldots, v \tag{14e}$$

$$y_h^{+} \ge y_{h\,\mathrm{opt}}^{-} \ge 0, \quad h = 1, 2, \ldots, v \tag{14f}$$

Solutions of $y_{\mathrm{opt}}^{+}$ can be obtained through Submodel (14). Through integrating solutions of Submodels (13) and (14), interval solution for Model (12) can be obtained as follows:

$$f_{\mathrm{opt}}^{\pm} = \left[f_{\mathrm{opt}}^{-},\, f_{\mathrm{opt}}^{+}\right] \tag{15a}$$

$$x_{\mathrm{opt}} = x^{-} + \mu_{\mathrm{opt}}(x^{+} - x^{-}) \tag{15b}$$

$$y_{h\,\mathrm{opt}}^{\pm} = \left[y_{h\,\mathrm{opt}}^{-},\, y_{h\,\mathrm{opt}}^{+}\right] \tag{15c}$$

CASE STUDY

A case study of regional water management is then provided for demonstrating applicability of the developed method. In the water resources system, a water manager is responsible for allocating the limited water resources to support the regional municipality, industrial and agricultural development. In the study region, agriculture and industry are the dominant activities and the agricultural irrigation and industrial consumption accounts for more than 75% of the total water demands to promote the development of the regional economy; 20–25% of the total water consumption is used for drinking, cleaning, and other municipal purposes. The main elements of the problem involve the water resource availabilities, the water demands to satisfy current and potential future needs, and the water transportation systems. With less water available, the management of water resources is becoming more complex in order to satisfy all users, and existing public institutions lack the capacity and structure to properly deal with the situation. Since local economic development relies heavily on the availability of the water supply, the adaptive strategies to water shortage crises are of high importance to local government. Moreover, in water resources systems, the manager wants to obtain different options for water supply, and select the option or combination of options that provide the necessary amount of water in the most cost-effective manner while taking into account technical and social criteria. From an economic point of view, all users need to know how much water they can expect to obtain during the planning horizon in order to establish and make plans for rational production. However, a variety of complexities exist in the study problem. On the one hand, the hydrologic cycle is basically dependent upon the geology and climate, which determine the physical characteristics of the basin, the natural environment, and the variability of the mass and energy exchanges occurring within the river basin. On the other hand, human activities impact on the natural resources, mainly through land and water use, and produce changes in the dynamics of the natural environment and the hydrologic cycle, which may amplify the variability of those exchanges, affecting the hydrologic balance and the use of the natural resources (Victoria et al. ; Li et al. ). These complexities could become further compounded by not only interactions among many uncertain system components but also their economic implications caused by improper policies.

Thus, the manager needs to create a plan to effectively allocate the uncertain supply of water to the three users in order to maximize the overall system benefit while simultaneously considering the uncertainties in the system. In addition, based on the regional water management policies, an allowable flow



level to each user must be regulated. If the promised water amount is delivered, the net benefit will be generated. However, if the promised water amount is not delivered, either the water must be obtained from higher price alternatives or the supply must be decreased by reducing the scale of production to fill the so-called deviation, causing economic losses (Li et al. , ). Moreover, the existence of multiple uncertainties associated with the water resources system will aggravate the risk of system impairment and failure. Therefore, it is desirable that the risk control should be considered in the water allocation planning program. The problem under consideration of the risk of water resources system transforms into how to effectively allocate water to various sectors in order to achieve a maximum benefit assuming a given risk level under uncertainties. To solve such a problem, the proposed RITSP is considered to be a suitable approach for dealing with the study problem:

$$\max\; f^{\pm} = \sum_{i=1}^{3}(1-\lambda)NB_i^{\pm}W_i^{\pm} - \sum_{i=1}^{3}\sum_{h=1}^{7} p_h C_i^{\pm}D_{ih}^{\pm} + \lambda\left\{\xi_\alpha^{\pm} - \frac{1}{1-\alpha}\sum_{h=1}^{7} p_h V_h^{\pm}\right\} \tag{16a}$$

subject to

[constraints of water availability]

$$\sum_{i=1}^{3}\left(W_i^{\pm} - D_{ih}^{\pm}\right) \le q_h^{\pm}, \quad \forall h \tag{16b}$$

[constraints of extreme allocation amounts]

$$W_{i\,\max}^{\pm} \ge W_i^{\pm} \ge D_{ih}^{\pm}, \quad \forall i, t, h \tag{16c}$$

$$W_i^{\pm} - D_{ih}^{\pm} \ge W_{i\,\min}^{\pm}, \quad \forall i, t, h \tag{16d}$$

$$V_h^{\pm} \ge \xi_\alpha^{\pm} - \sum_{i=1}^{3} NB_i^{\pm}W_i^{\pm} + \sum_{i=1}^{3} C_i^{\pm}D_{ih}^{\pm}, \quad \forall h \tag{16e}$$

[nonnegative constraints]

$$V_h^{\pm} \ge 0, \quad \forall h \tag{16f}$$

$$D_{ih}^{\pm} \ge 0, \quad \forall i, h \tag{16g}$$

where $f^{\pm}$ is the net system benefit over the planning horizon ($); $i$ is the index of water users, where $i = 1$ for municipality, $i = 2$ for industrial production, and $i = 3$ for agricultural sector; $h$ is the index of scenarios where $h = 1, 2, \ldots, 7$; $W_i^{\pm}$ is the allocation target of water that is promised to user $i$; $D_{ih}^{\pm}$ is the amount of water deficit by which the water allocation target $W_i^{\pm}$ is not met in scenario $h$; $NB_i^{\pm}$ is the net benefit of user $i$ per unit of water allocated; $C_i^{\pm}$ is the reduction of net benefit to user $i$ per unit of water not delivered; $\lambda$ is a nonnegative trade-off coefficient representing the exchange rate of mean benefit for risk; $\xi_\alpha^{\pm}$ is an auxiliary variable, which is the maximum benefit at the cumulative probability $\alpha$; $\alpha$ is the confidence level; $V_h^{\pm}$ is a positive auxiliary variable under scenario $h$; $p_h$ is probability of occurrence for scenario $h$; $q_h^{\pm}$ is the available water resources in scenario $h$; $W_{i\,\max}^{\pm}$ is the minimum allowable allocation amount for user $i$.

For Model (16), if $W_i^{\pm}$ are considered as uncertain inputs, the existing methods for solving inexact linear programming problems cannot be used directly. In this study, an optimized set of target values will be identified by having $\mu_i$ in Model (17) be decision variables. This optimized set will correspond to the highest possible system benefit under the uncertain water allocation targets. Accordingly, let $W_i = W_i^{-} + \mu_i\Delta W_i$, where $\Delta W_i = W_i^{+} - W_i^{-}$, $\mu_i \in [0, 1]$. $\mu_i$ are decision variables that are used for identifying an optimized set of target values $W_i^{\pm}$ in order to support the related policy analyses (Huang & Loucks ). For example, when $W_i^{\pm}$ approach their upper bounds (i.e., when $\mu_i = 1$), a relatively high benefit would be obtained if the water demands are satisfied; however, a high penalty may have to be paid when the promised water is not delivered. Conversely, when $W_i^{\pm}$ reach their lower bounds (i.e., when $\mu_i = 0$), we may have a lower cost and a higher risk of violating the promised targets. Therefore, by introducing decision variables $\mu_i$, and according to Huang & Loucks (), the model can be transformed into two deterministic submodels based on an interactive algorithm. Since the objective is to maximize the net system benefit, the submodel corresponding to upper-bound objective function value ($f^{+}$) is first desired. Thus, we have:

$$\max\; f^{+} = \sum_{i=1}^{3}(1-\lambda)NB_i^{+}\left(W_i^{-} + \mu_i\Delta W_i\right) - \sum_{i=1}^{3}\sum_{h=1}^{7} p_h C_i^{-}D_{ih}^{-} + \lambda\left\{\xi_\alpha^{+} - \frac{1}{1-\alpha}\sum_{h=1}^{7} p_h V_h^{-}\right\} \tag{17a}$$


subject to

$$\Delta W_i = W_i^{+} - W_i^{-} \tag{17b}$$

$$0 \le \mu_i \le 1 \tag{17c}$$

$$\sum_{i=1}^{3}\left(W_i^{-} + \mu_i\Delta W_i - D_{ih}^{-}\right) \le q_h^{+}, \quad \forall h \tag{17d}$$

$$W_{i\,\max}^{+} \ge W_i^{-} + \mu_i\Delta W_i \ge D_{ih}^{-}, \quad \forall i, t, h \tag{17e}$$

$$W_i^{-} + \mu_i\Delta W_i - D_{ih}^{-} \ge W_{i\,\min}^{+}, \quad \forall i, t, h \tag{17f}$$

$$V_h^{-} \ge \xi_\alpha^{+} - \sum_{i=1}^{3} NB_i^{+}\left(W_i^{-} + \mu_i\Delta W_i\right) + \sum_{i=1}^{3} C_i^{-}D_{ih}^{-}, \quad \forall h \tag{17g}$$

$$V_h^{-} \ge 0, \quad \forall h \tag{17h}$$

$$D_{ih}^{-} \ge 0, \quad \forall i, h \tag{17i}$$

where $f_{\mathrm{opt}}^{+}$, $D_{ih\,\mathrm{opt}}^{-}$, and $\mu_{i\,\mathrm{opt}}$ are solutions of Submodel (17). Solution for $f^{+}$ provides the extreme upper bound of system benefit under uncertain inputs. Then, the optimized water allocation targets would be $W_{i\,\mathrm{opt}} = W_i^{-} + \Delta W_i\,\mu_{i\,\mathrm{opt}}$. Consequently, the submodel corresponding to the lower bound of the objective function value (i.e., $f^{-}$) is:

$$\max\; f^{-} = \sum_{i=1}^{3}(1-\lambda)NB_i^{-}W_{i\,\mathrm{opt}} - \sum_{i=1}^{3}\sum_{h=1}^{7} p_h C_i^{+}D_{ih}^{+} + \lambda\left\{\xi_\alpha^{-} - \frac{1}{1-\alpha}\sum_{h=1}^{7} p_h V_h^{+}\right\} \tag{18a}$$

subject to

$$W_{i\,\mathrm{opt}} = W_i^{-} + \Delta W_i\,\mu_{i\,\mathrm{opt}} \tag{18b}$$

$$\sum_{i=1}^{3}\left(W_{i\,\mathrm{opt}} - D_{ih}^{+}\right) \le q_h^{-}, \quad \forall h \tag{18c}$$

$$W_{i\,\max}^{-} \ge W_{i\,\mathrm{opt}} \ge D_{ih}^{+}, \quad \forall i, t, h \tag{18d}$$

$$W_{i\,\mathrm{opt}} - D_{ih}^{+} \ge W_{i\,\min}^{-}, \quad \forall i, t, h \tag{18e}$$

$$V_h^{+} \ge \xi_\alpha^{-} - \sum_{i=1}^{3} NB_i^{-}W_{i\,\mathrm{opt}} + \sum_{i=1}^{3} C_i^{+}D_{ih}^{+}, \quad \forall h \tag{18f}$$

$$V_h^{+} \ge 0, \quad \forall h \tag{18g}$$

$$D_{ih}^{+} \ge 0, \quad \forall i, h \tag{18h}$$

where $f_{\mathrm{opt}}^{-}$ and $D_{ih\,\mathrm{opt}}^{+}$ are solutions of Submodel (18). Thus, the solutions for Model (16) under the optimized targets can be obtained through incorporating the solutions of the two submodels.

Table 1 provides the water target demands and the related economic data. The data were obtained from a number of representative cases for water resources management (Loucks et al. ; Huang & Loucks ; Li et al. , ). Since uncertainties exist in the system components, water allocation targets and economic data are expressed in interval format. Let $W_i^{\pm}$ be the quantity of water that is promised to each user $i$. If this water is delivered, the resulting net benefit to the local economy per unit of water allocated is estimated to be $NB_i^{\pm}$. However, if the promised water is not delivered, either water must be obtained from alternative and more expensive sources, or demand must be curtailed by reduced production and/or increased recycling within the industrial concern, or by reduced irrigation in the agricultural sector. This results in a reduction of net benefit to user $i$ of $C_i^{\pm}$ per unit of water not delivered ($C_i^{\pm} > NB_i^{\pm}$). In addition, in the water resources system, the total amount of water available has

Table 1 | Water target demands and the related economic data

                                                              User
                                                              Municipal      Industrial     Agricultural
Water allocation target, W_i^± (10^6 m^3)                     [2.20, 4.00]   [3.00, 5.50]   [3.50, 6.50]
Minimum allowable allocation, W_i min^± (10^6 m^3)            [1.00, 1.50]   [0.50, 1.00]   [0.60, 1.00]
Net benefit when water demand is satisfied, NB_i^± ($/m^3)    [90, 100]      [45, 55]       [25, 35]
Penalty when water is not delivered, C_i^± ($/m^3)            [125, 135]     [70, 80]       [45, 55]



characteristics of random and increasing or decreasing trend changes. Theoretically, there are other ways to generate the random variables. One is a survey from different experts, based on an assumption that there were not enough data available. A large group of experts are required to estimate the value of a certain parameter. Then the value of the parameter can be obtained via analyzing the estimations through sample statistic inductive methods. The other way is exemplified by the probability cumulative distribution function, which is based on that there are enough data available. According to the local policy of hypothetical cases, seven discrete water inflow values (i.e., very-low, low, low-medium, medium, medium-high, high, and very-high) are selected as the range of intervals. In addition, division of the targets into a number of predefined values associated with probabilities (8, 12, 16, 25, 15, 14, and 10%) can meet the requirement of the RITSP. Table 2 shows the water inflow levels and the associated probabilities of occurrence. From previous studies (Conejo et al. ; Pousinho et al. ), the value of α is commonly set between 0.90 and 0.99. In addition, the value of λ can be chosen as any real number. After a number of test runs, it was found that, if λ value is over 0.6, the solutions of optimal water allocation targets and water shortage amounts are the same as the one obtained under λ = 0.6. Therefore, in order to reflect the variation trend of allocation policies by changing the value of λ, the λ value is set between 0 and 1.

Uncertainties exist in many of the system components (provided as intervals for water allocation targets and economic data, as well as distribution information for the total water availability). The problems under consideration include: (1) how to suitably allocate water flows to achieve a maximized system benefit; (2) how to identify desired water allocation policies under different risk levels; and (3) how to seek cost-effective water resources management strategies under complex uncertainties. The developed RITSP is considered to be a suitable approach for dealing with these problems.

RESULT ANALYSIS AND DISCUSSION

Results have been obtained through solving the RITSP model. The solutions for the objective function value and most of the nonzero decision variables were interval numbers. Generally, solutions presented as intervals demonstrate that the related decisions should be sensitive to the uncertain modeling inputs (Li et al. ). Table 3 shows the solutions of water allocation targets ($W_{i\,\mathrm{opt}}$) under different α and λ levels during the planning horizon. Various α and λ levels correspond to different system confidence levels and different levels of trade-off

Table 2 | Stream flow distribution

Flow level           Probability   Stream flows q_h^± (10^6 m^3)
Very-low (V-L)       0.08          [3.80, 5.20]
Low (L)              0.12          [5.50, 6.50]
Low-medium (L-M)     0.16          [6.90, 8.20]
Medium (M)           0.25          [8.50, 9.80]
Medium-high (M-H)    0.15          [10.0, 11.5]
High (H)             0.14          [11.5, 12.9]
Very-high (V-H)      0.10          [13.2, 14.5]

Table 3 | Optimal targets of the RITSP model

                     λ level
α level   W_i opt    0      0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.0
0.90      W_1 opt    4.00   4.00   4.00   4.00   4.00   4.00   4.00   4.00   4.00   4.00   4.00
          W_2 opt    5.40   4.80   4.00   4.00   4.00   4.00   3.20   3.20   3.20   3.20   3.20
          W_3 opt    3.50   3.50   3.50   3.50   3.50   3.50   3.50   3.50   3.50   3.50   3.50
0.95      W_1 opt    4.00   4.00   4.00   4.00   4.00   4.00   4.00   4.00   4.00   4.00   4.00
          W_2 opt    5.40   4.00   3.20   3.00   3.00   3.00   3.00   3.00   3.00   3.00   3.00
          W_3 opt    3.50   3.50   3.50   3.50   3.50   3.50   3.50   3.50   3.50   3.50   3.50
0.99      W_1 opt    4.00   4.00   4.00   3.20   3.20   3.20   3.20   3.20   3.20   3.20   3.20
          W_2 opt    5.40   4.00   3.00   3.00   3.00   3.00   3.00   3.00   3.00   3.00   3.00
          W_3 opt    3.50   3.50   3.50   3.50   3.50   3.50   3.50   3.50   3.50   3.50   3.50


154

Y. L. Xie & G. H. Huang

|

Water resources allocation management and risk analysis model

Journal of Hydroinformatics

|

16.1

|

2014

between profit and risk, and thus would lead to varied water allocation targets. For example, when α = 0.90 and 0.95, water allocation targets for the municipal sector would be 4.00 × 10⁶ m³ under different λ levels; however, when α = 0.99, the water allocation targets for this user would be 4.00 × 10⁶ m³ (λ = 0, 0.1, and 0.2), and 3.20 × 10⁶ m³ (the value of λ is from 0.3 to 1.0). Generally, water resources would first be allocated to the municipal sector, followed by the industrial and agricultural sectors. For example, the optimized allocation target for the municipality over the planning horizon would be close to its maximum value under different α and λ levels. This is because the municipality could bring about the highest benefit when its demand is satisfied; thus, the manager would have to promise larger amounts to it to achieve a maximized system benefit. The optimized allocation target for the agricultural sector would reach its minimum value under demanding conditions since this user is associated with the lowest benefit. In comparison, the optimized water allocation target for industry would fluctuate within its minimum and maximum values as α and λ levels are varied; the benefit from industry lies between the profits from the municipality and agriculture. Moreover, the water allocation targets would decrease as the λ level increases, especially at a high confidence level. For example, when α = 0.99, the water allocation targets for industry would be 5.40 × 10⁶ m³ (λ = 0), 4.00 × 10⁶ m³ (λ = 0.1), and 3.00 × 10⁶ m³ (the value of λ is from 0.2 to 1.0).

Variations in Wi± could reflect different policies of water resources management under uncertainty. When the water allocation targets reach their lower bounds, the corresponding policy may result in less water shortage and lower economic penalty. Moreover, the upper bounds of Wi± would lead to a strategy with higher allocated targets, resulting in a higher system benefit and a higher risk of penalty when the water inflow is at a lower level. Therefore, different policies in predefining the promised water allocation are associated with different levels of economic benefit and system failure risk.

Tables 4 to 6 present the water deficit (D±ih) under different scenarios in the planning horizon. The solutions of D±ih

Table 4 | Solutions of D±ih from RITSP model under α = 0.90

h  i  λ = 0        λ = 0.1      λ = 0.2      λ = 0.3      λ = 0.4      λ = 0.5      λ = 0.6
1  1  [0.80,1.30]  [0.80,1.30]  [0.80,1.30]  [0.80,1.30]  [0.80,1.30]  [0.80,1.30]  [0.80,1.30]
   2  [4.40,4.90]  [3.80,4.30]  [3.00,3.50]  [3.00,3.50]  [3.00,3.50]  [3.00,3.50]  [2.20,2.70]
   3  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]
2  1  0            0            0            0            0            0            0
   2  [3.90,4.50]  [3.30,3.90]  [2.50,3.10]  [2.50,3.10]  [2.50,3.10]  [2.50,3.10]  [1.70,2.30]
   3  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]
3  1  0            0            0            0            0            0            0
   2  [2.20,3.10]  [1.60,2.50]  [0.80,1.70]  [0.80,1.70]  [0.80,1.70]  [0.80,1.70]  [0,0.90]
   3  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]
4  1  0            0            0            0            0            0            0
   2  [0.60,1.50]  [0,0.90]     [0,0.10]     [0,0.10]     [0,0.10]     [0,0.10]     0
   3  [2.50,2.90]  [2.50,2.90]  [1.70,2.90]  [1.70,2.90]  [1.70,2.90]  [1.70,2.90]  [0.90,2.20]
5  1  0            0            0            0            0            0            0
   2  0            0            0            0            0            0            0
   3  [1.40,2.90]  [0.80,2.30]  [0,1.50]     [0,1.50]     [0,1.50]     [0,1.50]     [0,0.70]
6  1  0            0            0            0            0            0            0
   2  0            0            0            0            0            0            0
   3  [0,1.40]     [0,0.80]     0            0            0            0            0
7  1  0            0            0            0            0            0            0
   2  0            0            0            0            0            0            0
   3  0            0            0            0            0            0            0

Note: When λ > 0.6, the solutions of D±ih are the same as those obtained under λ = 0.6.
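As an illustration of how the scenario probabilities in Table 2 interact with the interval shortages in Table 4, the short Python sketch below computes a probability-weighted shortage interval for the industrial user (i = 2) at α = 0.90 and λ = 0.1. The function name `expected_interval` is hypothetical and the calculation is only a reading aid under these assumptions; it is not the RITSP solution procedure.

```python
# Illustrative only: probability-weighted shortage interval for one user.
# Probabilities are taken from Table 2; shortage bounds are the industrial
# (i = 2) entries of Table 4 for alpha = 0.90 and lambda = 0.1, in 10^6 m^3.
PROBABILITIES = [0.08, 0.12, 0.16, 0.25, 0.15, 0.14, 0.10]
SHORTAGE_BOUNDS = [
    (3.80, 4.30),  # very-low inflow
    (3.30, 3.90),  # low
    (1.60, 2.50),  # low-medium
    (0.00, 0.90),  # medium
    (0.00, 0.00),  # medium-high
    (0.00, 0.00),  # high
    (0.00, 0.00),  # very-high
]


def expected_interval(probabilities, bounds):
    """Return (lower, upper) of sum_h p_h * [d_h^-, d_h^+]."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    lower = sum(p * lo for p, (lo, _) in zip(probabilities, bounds))
    upper = sum(p * hi for p, (_, hi) in zip(probabilities, bounds))
    return lower, upper


expected_lower, expected_upper = expected_interval(PROBABILITIES, SHORTAGE_BOUNDS)
print(f"expected shortage: [{expected_lower:.3f}, {expected_upper:.3f}] x 10^6 m^3")
```

With these inputs the weighted interval is [0.956, 1.437] × 10⁶ m³, well below the very-low-inflow shortage of [3.80, 4.30] × 10⁶ m³, because four of the seven inflow levels carry little or no industrial shortage.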


Table 5 | Solutions of D±ih from RITSP model under α = 0.95

h  i  λ = 0        λ = 0.1      λ = 0.2      λ = 0.3      λ = 0.4      λ = 0.5      λ = 0.6
1  1  [0.80,1.30]  [0.80,1.30]  [0.80,1.30]  [0.80,1.30]  [0.80,1.30]  [0.80,1.30]  [0.80,1.30]
   2  [4.40,4.90]  [3.00,3.50]  [2.20,2.70]  [2.00,2.50]  [2.00,2.50]  [2.00,2.50]  [2.00,2.50]
   3  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]
2  1  0            0            0            0            0            0            0
   2  [3.90,4.50]  [2.50,3.10]  [1.70,2.30]  [1.50,2.10]  [1.50,2.10]  [1.50,2.10]  [1.50,2.10]
   3  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]
3  1  0            0            0            0            0            0            0
   2  [2.20,3.10]  [0.80,1.70]  [0,0.90]     [0,0.70]     [0,0.70]     [0,0.70]     [0,0.70]
   3  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.30,2.90]  [2.30,2.90]  [2.30,2.90]  [2.30,2.90]
4  1  0            0            0            0            0            0            0
   2  [0.60,1.50]  [0,0.10]     0            0            0            0            0
   3  [2.50,2.90]  [1.70,2.90]  [0.90,2.20]  [0.70,2.00]  [0.70,2.00]  [0.70,2.00]  [0.70,2.00]
5  1  0            0            0            0            0            0            0
   2  0            0            0            0            0            0            0
   3  [1.40,2.90]  [0,1.50]     [0,0.70]     [0,0.50]     [0,0.50]     [0,0.50]     [0,0.50]
6  1  0            0            0            0            0            0            0
   2  0            0            0            0            0            0            0
   3  [0,1.40]     0            0            0            0            0            0
7  1  0            0            0            0            0            0            0
   2  0            0            0            0            0            0            0
   3  0            0            0            0            0            0            0

Note: When λ > 0.6, the solutions of D±ih are the same as those obtained under λ = 0.6.

under the given targets reflect the variations of system conditions caused by inputs of the uncertain parameters. Generally, the water shortage solutions of the three users and scenarios can be similarly interpreted based on the results. As the water flow level increases, the water allocation target would be satisfied, and the water shortage would decrease. For example, when α = 0.90 and λ = 0.1, the industrial water shortages would be [3.80, 4.30] × 10⁶ m³, [3.30, 3.90] × 10⁶ m³, [1.60, 2.50] × 10⁶ m³, and [0, 0.90] × 10⁶ m³, when flow levels are very-low, low, low-medium, and medium, respectively; there would be no shortages under medium-high, high and very-high flow levels. In addition, a trade-off could be analyzed by assigning different λ values in the model constraints when α is fixed. From Tables 3–6, a number of decision variables such as the target values (Wi opt) and the upper and lower bounds of the shortage (D±ih) amount would vary with different λ values. As the value of λ increases, the water allocation target and shortage of the three users would decrease. For example, when the available quantity of water is at the medium level of stream flow under the scenario of α with the value of 0.95, the amount of industrial water shortage would be [0.60, 1.50] × 10⁶ m³ (λ = 0), [0, 0.90] × 10⁶ m³ (λ = 0.1), [0, 0.10] × 10⁶ m³ (the value of λ is from 0.2 to 0.9), and 0 (λ = 1.0), respectively; the water shortage of the municipal sector would be 0 under different λ values, and the agricultural shortage would decrease from [2.50, 2.90] × 10⁶ m³ to [0.90, 2.20] × 10⁶ m³ when λ changes from 0.1 to 1.0. Generally, as λ increases, the allocation target would decrease, leading to a decreased amount of water shortage. This indicates that when the risk level λ increases, water managers would choose a conservative water allocation scheme to avoid the risk. In contrast, a lower λ value would result in alternatives with lower risk aversion. Moreover, when the confidence level of α increases, the allocation target would decrease, leading to a reduced amount of water shortage and increased water allocation balance among users. For example, under the low inflow level, the municipal water shortage would be 0, and agricultural deficit would be [2.50, 2.90] × 10⁶ m³ with


Table 6 | Solutions of D±ih from RITSP model under α = 0.99

h  i  λ = 0        λ = 0.1      λ = 0.2      λ = 0.3      λ = 0.4      λ = 0.5      λ = 0.6
1  1  [0.80,1.30]  [0.80,1.30]  [0.80,1.30]  [0,0.50]     [0,0.50]     [0,0.50]     [0,0.50]
   2  [4.40,4.90]  [3.00,3.50]  [2.00,2.50]  [2.00,2.50]  [2.00,2.50]  [2.00,2.50]  [2.00,2.50]
   3  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]
2  1  0            0            0            0            0            0            0
   2  [3.90,4.50]  [2.50,3.10]  [1.50,2.10]  [0.70,1.30]  [0.70,1.30]  [0.70,1.30]  [0.70,1.30]
   3  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]  [2.50,2.90]
3  1  0            0            0            0            0            0            0
   2  [2.20,3.10]  [0.80,1.70]  [0,0.70]     0            0            0            0
   3  [2.50,2.90]  [2.50,2.90]  [2.30,2.90]  [1.50,2.80]  [1.50,2.80]  [1.50,2.80]  [1.50,2.80]
4  1  0            0            0            0            0            0            0
   2  [0.60,1.50]  [0,0.10]     0            0            0            0            0
   3  [2.50,2.90]  [1.70,2.90]  [0.70,2.00]  [0,1.20]     [0,1.20]     [0,1.20]     [0,1.20]
5  1  0            0            0            0            0            0            0
   2  0            0            0            0            0            0            0
   3  [1.40,2.90]  [0,1.50]     [0,0.50]     0            0            0            0
6  1  0            0            0            0            0            0            0
   2  0            0            0            0            0            0            0
   3  [0,1.40]     0            0            0            0            0            0
7  1  0            0            0            0            0            0            0
   2  0            0            0            0            0            0            0
   3  0            0            0            0            0            0            0

Note: When λ > 0.6, the solutions of D±ih are the same as those obtained under λ = 0.6.
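The allocation intervals quoted in the text can be cross-checked against the targets in Table 3 and the shortages in Tables 4 to 6. The sketch below assumes that delivered water equals the promised target minus the shortage, so that the allocation interval is [W − D⁺, W − D⁻]; this relation is an assumption made here for illustration, and `allocation_interval` is a hypothetical helper rather than part of the RITSP model.

```python
def allocation_interval(target, shortage):
    """Delivered water [W - D^+, W - D^-] for a fixed target W (10^6 m^3)."""
    shortage_lower, shortage_upper = shortage
    if not 0.0 <= shortage_lower <= shortage_upper <= target:
        raise ValueError("expected 0 <= D^- <= D^+ <= W")
    # Round to two decimals to match the precision used in the tables.
    return round(target - shortage_upper, 2), round(target - shortage_lower, 2)


# Agricultural sector (i = 3), low-medium inflow (h = 3), lambda = 0.6.
AGRICULTURAL_TARGET = 3.50  # W3 opt from Table 3
AGRICULTURAL_SHORTAGE = {
    0.90: (2.50, 2.90),  # Table 4
    0.95: (2.30, 2.90),  # Table 5
    0.99: (1.50, 2.80),  # Table 6
}

for alpha, shortage in AGRICULTURAL_SHORTAGE.items():
    lower, upper = allocation_interval(AGRICULTURAL_TARGET, shortage)
    print(f"alpha = {alpha:.2f}: allocation = [{lower:.2f}, {upper:.2f}] x 10^6 m^3")
```

Under this assumption the three intervals are [0.60, 1.00], [0.60, 1.20], and [0.70, 2.00] × 10⁶ m³, which shows the agricultural allocation growing as α rises from 0.90 to 0.99.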

the scenarios of α with the values of 0.90, 0.95, and 0.99; for the industrial sector with different λ values, the amount of water deficit would decrease from [3.90, 4.50] × 10⁶ to [1.70, 2.30] × 10⁶ m³ when α is a fixed value of 0.90, and reduce from [3.90, 4.50] × 10⁶ to [0.70, 1.30] × 10⁶ m³ when the value of α is 0.99. In such a case, the extreme risk would be lowered and the system feasibility would be enhanced. In contrast, a lower α value would result in a higher possibility of system loss in extreme conditions.

The RITSP model can generate a great deal of water allocation strategies with different α and λ values under different inflow levels, in order to analyze the effects of α and λ on water allocation policies. Figures 3–5 present the optional water allocation schemes obtained through the RITSP model. Due to the highest benefit, water would be first allocated to the municipal sector under different α and λ values. For example, the water allocated to municipal sectors would reach the upper bound of the water allocation target (e.g., 4.00 × 10⁶ m³) under the scenarios of α with the values of 0.90 and 0.95; the water allocation would decrease to 3.2 × 10⁶ m³ when the value of λ changes from 0.4 to 1.0 under the scenarios of α with the value of 0.99. In addition, under the lower level of water inflow, the amount of industrial allocation would be decreased, and the agricultural water allocation would increase, when the value of α increases from 0.90 to 0.99 under the same λ value. For example, under the low-medium water flow level, when λ is a fixed value of 0.6, the industrial water allocation would be [2.30, 3.20] × 10⁶, 3.00 × 10⁶, and [2.30, 3.00] × 10⁶ m³, and the amount of agricultural allocation would be [0.60, 1.00] × 10⁶, [0.60, 1.20] × 10⁶, and [0.70, 2.00] × 10⁶ m³, under the condition of α increasing from 0.90 to 0.99. From Figures 3–5, the lower and upper bounds of the water allocation amount would vary with the change of α. This shows that the effect of the risk measure on the modeling outputs could be adjusted by changing the α value. Generally, a high α value would lead to a lower risk and enhanced system feasibility. The water allocated to the users with higher benefit would decrease, and the water supplied to the users with lower benefit would increase when


Figure 3 | Optimized water allocation schemes under α = 0.90.

there is an α value increment with a fixed λ value, in order to reduce the risk of unbalanced water allocation caused by the objective of maximum net benefit in water resources system planning and management.

Figures 6 and 7 show the varying trend of the RITSP model’s objective, the system net benefit and recourse cost under different α and λ values. In general, the intervals of the model’s objective would decrease as the value of λ increases when α is a fixed value. For example, when α is a fixed value of 0.90, the objective of the RITSP model would be $ [0.40, 0.64] × 10⁹, $ [0.57, 0.96] × 10⁹, $ [0.79, 1.62] × 10⁹, $ [0.97, 1.62] × 10⁹, $ [1.15, 1.95] × 10⁹, $ [1.33, 2.28] × 10⁹, $ [1.63, 2.62] × 10⁹, $ [1.83, 2.95] × 10⁹, $ [2.02, 3.28] × 10⁹, $ [2.22, 3.62] × 10⁹, and $ [2.42, 3.95] × 10⁹ under the scenario of λ varying from 0 to 1.0, respectively (as shown in Figure 6(a)). In addition, the values of system net benefit would first decrease and then would not change as the value of λ increases. When α is a fixed value of 0.95, the net benefit would be $ [400.22, 640.89] × 10⁶ (λ = 0), $ [427.87, 628.12] × 10⁶ (λ = 0.1), $ [433.14, 613.28] × 10⁶ (λ = 0.2), and $ [434.30, 608.77] × 10⁶ (the value of λ is from 0.3 to 1.0), respectively; the recourse cost would be $ [178.62, 290.28] × 10⁶ (λ = 0), $ [114.39, 199.63] × 10⁶ (λ = 0.1), $ [85.23, 158.37] × 10⁶ (λ = 0.2), and $ [78.74, 148.21] × 10⁶ (the value of λ is from 0.3 to 1.0), respectively (as shown in Figure 7(b)). This indicates that increasing the value of λ would increase the relative importance of the risk term and also lead to a higher system risk, and the water managers would choose a conservative scheme with a lower system benefit. Moreover, as the value of α increases, the net benefit would decrease. For example, when λ is a fixed value of 0.6, the net benefit would be $ [433.14, 613.28] × 10⁶, $ [434.30, 608.77] × 10⁶, and $ [403.58, 557.12] × 10⁶ under the scenarios of α with


Figure 4 | Optimized water allocation schemes under α = 0.95.

the value of 0.90, 0.95, and 0.99, respectively (Figure 7). Thus, increasing the parameter λ and/or the parameter α implies a higher level of risk than the recourse cost and the total positioning profit, which together constitute the expected total benefit; change monotonically as a function of α.

Figure 8 illustrates how the optimal CVaR changes as the risk parameters α and λ increase through solving the RITSP model. Similar to the optimal objective of the RITSP model, CVaR also decreases as α increases by the definition of CVaR. When α increases the corresponding value-at-risk increases, and CVaR accounts for the risk of larger realizations. Thus, larger α values would lead to more conservative policies, which give more weight to worse scenarios. However, CVaR increases as λ increases. Due to the changing trade-off between the expectation and the CVaR criterion, larger λ values provide us with a lower expected benefit and a higher CVaR value. Increasing λ leads to a more risk-averse policy with a lower system benefit and lower expected recourse costs in general. Thus, increasing the parameter λ and/or the parameter α implies a higher level of risk aversion, and water managers would choose a more risk-averse policy that would be a lower water allocation target for each user in order to avoid the risk of water shortage, and a well-balanced water allocation scheme to reduce the risk of conflicts over competition for water resources.

When the λ value is 0, the RITSP model would be an ITSP model for water resources system management under uncertainty. The detailed optimal water targets and water shortage from ITSP are presented in Tables 3–6. Differently from the RITSP model, the ITSP model aims to obtain the maximum benefit in the optimal process of water allocation, and it does not take the risk of model feasibility and


Figure 5 | Optimized water allocation schemes under α = 0.99.

reliability into consideration. These limitations could lead to low system stability and unbalanced allocation patterns. For example, when λ value is equal to 0, the water allocation targets of the municipal and industrial sectors would first be satisfied and reach their upper bounds, due to a higher benefit, and the agricultural water allocation target would reach the lower bounds; especially, in the very-low inflow level, water shortage would first occur in the agricultural sector. Moreover, the net benefit of ITSP is higher than that of the RITSP model. This also implies that the system objective of the ITSP model is only to obtain a maximum benefit without regarding risk aversion. In addition, the width of interval net benefit in the RITSP model is narrower than that of the ITSP model. It is indicated that the system benefit relies on the water resources condition, and tends to fluctuate more intensively with the change of available water resources. Through integrating CVaR into the objective of a water resources system management model, managers could obtain a robust and riskless decision.

CONCLUSIONS

In this study, a RITSP model is developed for supporting regional water resources management problems under uncertainty. This method is based on an integration of IPP, CVaR model, and two-stage stochastic programming (TSP). It allows uncertainties presented as both probability distributions and interval values to be incorporated within a general optimization framework. Moreover, the risk-aversion method was incorporated into the objective function to reflect the preference of decision makers, such that the trade-off between system economy and extreme expected loss could be analyzed. Then, the developed


Figure 6 | The objectives of the RITSP model under different α and λ levels.

method has been confirmed through a case study of a water resources allocation problem involving three competing water users. A number of scenarios corresponding to different river inflow and risk levels were examined; the results of the case study suggest that the methodology is applicable to reflecting complexities of water resources management and can be used for providing bases for identifying desired water allocation plans with a maximized system benefit, and reflecting the decision maker’s attitude toward risk aversion.

The proposed method could help water resources managers identify desired management policies under various economic considerations. The study results suggested that the proposed approach was also applicable to many other environmental and energy management problems. The


Figure 7 | Net benefit and recourse cost under different α and λ levels.
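The statement that the net-benefit interval is narrower under RITSP (λ > 0) than under ITSP (λ = 0) can be checked directly from the intervals quoted in the text for α = 0.95. The following Python sketch only subtracts the quoted bounds; the dictionary and function names are hypothetical.

```python
# Net-benefit intervals in $10^6 for alpha = 0.95, as quoted in the text.
NET_BENEFIT = {
    0.0: (400.22, 640.89),  # lambda = 0 corresponds to the ITSP model
    0.1: (427.87, 628.12),
    0.2: (433.14, 613.28),
    0.3: (434.30, 608.77),  # reported as unchanged for lambda from 0.3 to 1.0
}


def interval_width(bounds):
    """Width (upper minus lower) of an interval, rounded to cents."""
    lower, upper = bounds
    return round(upper - lower, 2)


widths = {lam: interval_width(bounds) for lam, bounds in NET_BENEFIT.items()}
for lam, width in widths.items():
    print(f"lambda = {lam:.1f}: width = {width:.2f} x 10^6 $")
```

The widths are 240.67, 200.25, 180.14, and 174.47 (× 10⁶ $), so the interval tightens as λ grows, consistent with the narrower net-benefit range reported for the RITSP model.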

risk-based framework could be used to assess the performance risk of unbalanced water resources allocation strategies in compliance with the economic and/or environmental management goals, and help managers identify desired water resources management policies under various environmental, economic, and system reliability considerations. It could also be coupled with other optimization methodologies to handle various types of uncertainties. However, compared with other approaches, there is still much room for improvement of the proposed model. For example, RITSP would have difficulties in dealing with the uncertainties in the model’s right-hand side coefficients; the probability of the random variable is estimated through statistical analysis, which would unavoidably bring errors to the


Figure 8 | Optimal values of CVaR under different α and λ levels.
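To make the roles of value-at-risk and CVaR concrete, the sketch below evaluates both for a small discrete loss distribution, using the Rockafellar and Uryasev minimization form CVaR_α = min over η of η + E[(L − η)⁺]/(1 − α). The losses are the upper-bound industrial shortages of Table 4 (λ = 0.1) with the probabilities of Table 2, treated here as a fixed distribution purely for illustration. In the RITSP model the solution itself changes with α and λ, so this sketch does not reproduce Figure 8, and the function names are hypothetical.

```python
def value_at_risk(losses, probabilities, alpha):
    """Smallest loss z such that P(loss <= z) >= alpha."""
    cumulative = 0.0
    for loss, probability in sorted(zip(losses, probabilities)):
        cumulative += probability
        if cumulative >= alpha - 1e-12:  # small tolerance for float sums
            return loss
    return max(losses)


def conditional_value_at_risk(losses, probabilities, alpha):
    """CVaR as the minimum of eta + E[(loss - eta)^+] / (1 - alpha)."""
    def objective(eta):
        excess = sum(p * max(l - eta, 0.0) for l, p in zip(losses, probabilities))
        return eta + excess / (1.0 - alpha)
    # The objective is convex and piecewise linear in eta, so for a
    # discrete distribution its minimum is attained at a scenario loss.
    return min(objective(eta) for eta in losses)


# Upper-bound industrial shortages (10^6 m^3), very-low to very-high inflow.
LOSSES = [4.30, 3.90, 2.50, 0.90, 0.0, 0.0, 0.0]
PROBABILITIES = [0.08, 0.12, 0.16, 0.25, 0.15, 0.14, 0.10]

for alpha in (0.90, 0.95, 0.99):
    var = value_at_risk(LOSSES, PROBABILITIES, alpha)
    cvar = conditional_value_at_risk(LOSSES, PROBABILITIES, alpha)
    print(f"alpha = {alpha:.2f}: VaR = {var:.2f}, CVaR = {cvar:.2f}")
```

For this fixed distribution the value-at-risk rises from 3.90 at α = 0.90 to 4.30 at α = 0.95, and CVaR rises from 4.22 to 4.30, showing how a larger α concentrates the measure on the worst inflow scenarios.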

system; the selection of a suitable alternative among the obtained interval solutions under different α and λ values is of significant complexity and becomes an extra burden for water resources managers. It is also possible that fuzzy logic could be used instead of λ values to deal with uncertainties in many real-world optimization problems, due to the inherent ambiguity of the fuzzy subsets. Further studies are desired to mitigate these limitations.

ACKNOWLEDGEMENTS

This research was supported by the Fundamental Research Funds for the Central Universities (13XS20), the Major Project Program of the Natural Sciences Foundation (51190095), and the Program for Innovative Research Team in University (IRT1127). The authors are extremely grateful to the editor and the anonymous reviewers for their insightful comments and suggestions.

REFERENCES

Ahmed, S.  Mean-risk objectives in stochastic programming. Technical Report, Georgia Institute of Technology. E-print available at 2004, http://www.optimization-online.org. Ahmed, S., Tawarmalani, M. & Sahinidis, N. V.  A finite branch-and-bound algorithm for two-stage stochastic integer programs. Math. Program. A 100, 355–377. Birbil, S. I., Frenk, J., Kaynar, B. & Noyan, N.  The VaR implementation handbook. In: Risk Measures and Their Applications in Asset Management (G. N. Gregoriou, ed.). McGraw-Hill, New York, pp. 311–337. Birge, J. R.  Decomposition and partitioning methods for multistage stochastic linear programs. Oper. Res. 33, 989–1007.



Birge, J. R. & Louveaux, F. V.  Introduction to Stochastic Programming. Springer, New York. Cetinkaya, C. P., Fistikoglu, O., Fedra, K. & Harmancioglu, N. B.  Optimization methods applied for sustainable management of water-scarce basins. J. Hydroinformat. 10, 69–95. Chang, N. B., Wen, C. G., Chen, Y. L. & Yong, Y. C.  A grey fuzzy multiobjective programming approach for the optimal planning of a reservoir watershed, Part A: theoretical development. Water Res. 30, 2329–2340. Chaves, P. & Kojiri, T.  Deriving reservoir operational strategies considering water quantity and quality objectives by stochastic fuzzy neural networks. Adv. Water Resour. 30, 1329–1341. Conejo, A. J., García-Bertrand, R., Carrión, M., Caballero, A. & Andrés, A.  Optimal involvement in futures markets of a power producer. IEEE Trans. Power Syst. 23, 701–711. Fábián, C. I.  Handling CVAR objectives and constraints in twostage stochastic models. Eur. J. Oper. Res. 191 (3), 888–911. Faye, R. M., Sawadogo, S., Lishoua, C. & Mora-Camino, F.  Long-term fuzzy management of water resource systems. Appl. Math. Comput. 137, 459–475. Ferrero, R. W., Riviera, J. F. & Shahidehpour, S. M.  A dynamic programming two-stage algorithm for long-term hydrothermal scheduling of multireservoir systems. Trans. Power Syst. 13 (4), 1534–1540. Guo, P. & Huang, G. H.  Inexact fuzzy-stochastic programming for water resources management under multiple uncertainties. Environ. Model. Assess. 15 (2), 111–124. Guo, P., Huang, G. H. & Li, Y. P.  An inexact fuzzy-chanceconstrained two-stage mixed-integer linear programming approach for flood diversion planning under multiple uncertainties. Adv. Water Resour. 33 (1), 81–91. Huang, G. H.  IPWM: an interval-parameter water quality management model. Eng. Optim. 26, 79–103. Huang, G. H.  A hybrid inexact-stochastic water management model. Eur. J. Operat. Res. 107, 137–158. Huang, G. H., Baetz, B. W. & Patry, G. G. 
 An interval linear programming approach for municipal solid waste management planning under uncertainty. Civil Eng. Environ. Syst. 9, 319–335. Huang, Y., Chen, X., Li, Y. P., Willems, P. & Liu, T.  Integrated modeling system for water resources management of Tarim River Basin. Environ. Eng. Sci. 27 (3), 255–269. Huang, Y., Li, Y. P., Chen, X. & Ma, Y. G.  Optimization of the irrigation water resources for agricultural sustainability in Tarim River Basin, China. Agric. Water Manage. 107, 74–85. Huang, G. H. & Loucks, D. P.  An inexact two-stage stochastic programming model for water resources management under uncertainty. Civil Eng. Environ. Syst. 17, 95–118. Jairaj, P. G. & Vedula, S.  Multi-reservoir system optimization using fuzzy mathematical programming. Water Resour. Manage. 14, 457–472. Kall, P. & Mayer, J.  Stochastic Linear Programming: Models, Theory, and Computation. International Series in Operations Research and Management Science. Springer, New York.


Klein Haneveld, W. K. & Van der Vlerk, M. H.  Integrated chance constraints: Reduced forms and an algorithm. Comput. Manage. Sci. 3, 245–269. Li, Y. P. & Huang, G. H.  Interval-parameter two-stage stochastic nonlinear programming for water resources management under uncertainty. Water Resour. Manage. 22, 681–698. Li, Y. P., Huang, G. H. & Nie, S. L.  An interval-parameter multi-stage stochastic programming model for water resources management under uncertainty. Adv. Water Resour. 29, 776–789. Li, Y. P., Huang, G. H. & Nie, S. L.  Mixed interval-fuzzy twostage integer programming and its application to flooddiversion planning. Eng. Optim. 39 (2), 163–183. Li, Y. P., Huang, G. H., Nie, S. L. & Chen, X.  A robust modeling approach for regional water management under multiple uncertainties. Agr. Water Manage. 98, 1577–1588. Li, Y. P., Huang, G. H., Wang, G. Q. & Huang, Y. F.  FSWM: a hybrid fuzzy-stochastic water-management model for agricultural sustainability under uncertainty. Agric. Water Manage. 12 (96), 1807–1818. Liu, Y., Cai, Y. P., Huang, G. H. & Dong, C.  Intervalparameter chance-constrained fuzzy multi-objective programming for water pollution control with sustainable wetland management. Procedia Environ. Sci. 13, 2316–2335. Liu, C., Fan, Y. & Ordóň ez, F.  A two-stage stochastic programming model for transportation network protection. Comput. Oper. Res. 36, 1582–1590. Loucks, D. P., Stedinger, J. R. & Haith, D. A.  Water Resource Systems Planning and Analysis. Prentice-Hall, Englewood Cliffs, NJ. Lu, H. W., Huang, G. H. & He, L.  Development of an interval-valued fuzzy linear-programming method based on infinite α-cuts for water resources management. Environ. Model. Softw. 25, 354–361. Luo, B., Maqsood, I., Yin, Y. Y., Huang, G. H. & Cohen, S. J.  Adaptation to climate change through water trading under uncertainty – an inexact two-stage nonlinear programming approach. J. Environ. Inf. 2 (2), 58–68. 
Lv, Y., Huang, G. H., Li, Y. P. & Sun, W.  Managing water resources system in a mixed inexact environment using superiority and inferiority measures. Stoch. Env. Res. Risk A 26 (5), 681–693. Maqsood, I., Huang, G. H. & Yeomans, J. S.  An intervalparameter fuzzy two-stage stochastic program for water resources management under uncertainty. Eur. J. Oper. Res. 167 (1), 208–225. McIntyre, N., Wagener, T., Wheater, H. S. & Siyu, Z.  Uncertainty and risk in water quality modelling and management. J. Hydroinformat. 5 (4), 259–274. Ogryczak, W. & Ruszczyn´ski, A.  Dual stochastic dominance and related mean-risks models. SIAM J. Optim. 13 (2), 60–78. Piantadosi, J., Metcalfe, A. V. & Howlett, P. G.  Stochastic dynamic programming (SDP) with a conditional value-at-risk (CVaR) criterion for management of storm-water. J. Hydrol. 348 (3–4), 320–329.



Pousinho, H. M. I., Mendes, V. M. F. & Catalão, J. P. S.  A risk-averse optimization model for trading wind energy in a market environment under uncertainty. Energy 36, 4935–4942. Rockafellar, R. & Uryasev, S.  Optimization of conditional value at risk. J. Risk 2 (3), 21–41. Rockafellar, R. & Uryasev, S.  Conditional value-at-risk for general loss distributions. J. Bank. Financ. 26, 1443–1471. Russell, S. O. & Campbell, P. F.  Reservoir operating rules with fuzzy programming. ASCE J. Water Resour. Plan. Manage. 122, 165–170. Schultz, R. & Neise, F.  Algorithms for mean-risk stochastic integer programs in energy. Rev. Invest. Oper. 28, 4–16. Schultz, R. & Tiedemann, S.  Conditional value-at-risk in stochastic programs with mixed-integer recourse. Math. Program. 105 (2), 365–386. Seifi, A. & Hipel, K. W.  Interior-point method for reservoir operation with stochastic inflows. J. Water Resour. Plan. Manage. 127 (1), 48–57. Shao, L. G., Qin, X. S. & Xu, Y.  A conditional value-at-risk based inexact water allocation model. Water Resour. Manage. 25, 2125–2145. Simonovic, S.  A new method for spatial and temporal analysis of risk in water resources management. J. Hydroinformat. 11 (3–4), 320–329. Slowinski, R., Urbaniak, A. & Weglarz, J.  Probabilistic and fuzzy approaches to capacity expansion planning of a water supply system. In: Systems Analysis Applied to Water and


Related Land Resources (L. Valadares Tavares & J. Evaristo da Silva, eds). Pergamon, Oxford, pp. 93–98. Teegavarapu, R. & Elshorbagy, A.  Fuzzy set based error measure for hydrologic model evaluation. J. Hydroinformatic. 7 (3), 199–208. Tran, L. D., Schilizzi, S., Chalak, M. & Kingwell, R.  Optimizing competitive uses of water for irrigation and fisheries. Agric. Water Manage. 101, 42–51. Victoria, F. B., Viegas Filho, J. S., Pereira, L. S., Teixeira, J. L. & Lanna, A. E.  Multi-scale modelling for water resources planning and management in rural basins. Agricult. Water Manage. 77, 4–20. Wagner, J. M., Shamir, U. & Marks, D. H.  Containing groundwater contamination: planning models using stochastic programming with recourse. Eur. J. Oper. Res. 7, 1–26. Wang, X. H. & Du, C. M.  An internet based flood warning system. J. Environ. Inf. 2, 48–56. Wang, S. & Huang, G. H.  Interactive two-stage stochastic fuzzy programming for water resources management. J. Environ. Manage. 92 (8), 1986–1995. Xu, Y. & Qin, X. S.  Rural effluent control under uncertainty: An inexact double-sided fuzzy chance-constrained model. Adv. Water Resour. 33 (9), 997–1014. Zhang, X. H., Zhang, H. W., Chen, B., Guo, H. C., Chen, G. Q. & Zhao, B.A.  An inexact-stochastic dual water supply programming model. Commun. Nonlinear Sci. 14, 301–309.

First received 12 December 2012; accepted in revised form 13 June 2013. Available online 17 July 2013


165

© IWA Publishing 2014 Journal of Hydroinformatics

|

16.1

|

2014

Hybrid metaheuristics for multi-objective design of water distribution systems

Qi Wang, Dragan A. Savić and Zoran Kapelan

ABSTRACT

Multi-objective design of Water Distribution Systems (WDSs) has received considerable attention in the past. Multi-objective evolutionary algorithms (MOEAs) are popular in tackling this problem due to their ability to approach the true Pareto-optimal front (PF) in a single run. Recently, several hybrid metaheuristics based on MOEAs have been proposed and validated on test problems. Among these algorithms, AMALGAM and MOHO are two noteworthy representatives which mix their constituent algorithms in contrasting fashion. In this paper, they are employed to solve a wide range of benchmark design problems against another state-of-the-art algorithm, namely NSGA-II. The design task is formulated as a bi-objective optimisation problem taking cost and network resilience into account. The performance of three algorithms is assessed via normalised hypervolume indicator. The results demonstrate that AMALGAM is superior to MOHO and NSGA-II in terms of convergence and diversity on the networks of small-to-medium size; however, for larger networks, the performance of hybrid algorithms deteriorates as they lose their adaptive capabilities. Future improvement and/or redesign on hybrid algorithms should not only adopt the strategies of adaptive portfolios of subalgorithms and global information sharing, but also prevent the deterioration mainly caused by imbalance of constituent algorithms.

Key words | hybrid metaheuristics, hypervolume, multi-objective design, resilience, water distribution system

Qi Wang (corresponding author), Dragan A. Savić, Zoran Kapelan
Centre for Water Systems, College of Engineering, Mathematics and Physical Sciences, University of Exeter, North Park Road, Exeter EX4 4QF, United Kingdom
E-mail: qw212@exeter.ac.uk; zero3315263@gmail.com

doi: 10.2166/hydro.2013.009

INTRODUCTION

The design of Water Distribution Systems (WDSs) by multi-objective evolutionary algorithms (MOEAs) has attracted considerable attention in recent years (Keedwell & Khu ; Prasad & Park ; Khu & Keedwell ; Farmani et al. ; Prasad & Tanyimboh ; Fu et al. a). The primary goal of an MOEA is to generate a trade-off between the total cost and system benefits, while meeting consumer demands and other system constraints (e.g. pressure, velocity, etc.). Because WDS design is a combinatorial optimisation problem of the Non-deterministic Polynomial-time hard (NP-hard) class (Papadimitriou & Steiglitz ), it is challenging to tackle the design of a real-world WDS: it often incurs expensive computational effort, especially when extended period simulations are required for objective evaluations (Keedwell & Khu ). MOEAs are suitable and popular for this task due to their ability to approach the true Pareto-optimal front (PF) in a single run (Zitzler & Thiele ; Farmani et al. a).

Farmani et al. (a) compared the performance of three commonly used MOEAs, i.e. Non-dominated Sorting Genetic Algorithm II (NSGA-II), Strength Pareto Evolutionary Algorithm 2 (SPEA2) and Multi-Objective Genetic Algorithm (MOGA), on multi-objective design of a WDS by applying them to two benchmark networks as well as a large real-life network. They concluded that SPEA2 (Zitzler et al. ) outperformed the other techniques in satisfying both goals of multi-objective optimisation, i.e. closeness to the true PF and diversity among the non-dominated solutions, especially on a large network. Subsequently, Farmani et al. (b) used NSGA-II to solve an expanded rehabilitation problem of the Anytown network (Walski et al. ) as a realistic benchmark taking cost and resilience index (Todini ) into account.

In order to yield acceptable near-optimal solutions and reduce the overall number of hydraulic evaluations, Keedwell & Khu () investigated the possibility of combining NSGA-II with a neighbour search to solve the multi-objective design of the New York tunnels network. Results showed an encouraging improvement of the hybrid algorithm given a budget of model simulations. Later on, they combined a novel cellular automaton-based initialisation technique with a Genetic Algorithm (GA) to solve the least-cost design of a WDS (Keedwell & Khu ). The applications to two large networks from industry highlighted the benefits of using this approach to discover better results in a fixed time span.

Besides integrating a local search strategy with current MOEAs, Raad et al. () applied a hybrid metaheuristic algorithm, called a multi-algorithm, genetically adaptive multi-objective method (AMALGAM) and proposed by Vrugt & Robinson (), for the first time to address the optimal design of a WDS considering the total cost and network resilience (Prasad & Park ). Instead of using the original sub-algorithms, they employed a greedy design heuristic, two variants of NSGA-II and discrete particle swarm optimisation (PSO) because of their tendency to succeed in a discrete multi-objective optimisation setting. The results obtained from three benchmark models as well as a real WDS in South Africa proved the strength of the AMALGAM-type algorithm as a faster, more reliable tool for multi-objective design of a WDS.

Wolpert & Macready () presented a number of 'no free lunch' theorems and demonstrated the danger of analysing algorithms by their performance on a small set of cases. Most of the previous work tests several MOEAs (often built on different concepts) on only a few benchmark and/or real-world WDS design problems; therefore, the conclusions might be biased, since it is impossible for a specific optimisation algorithm to be effective on a wide range of problems. Hybrid algorithms arise from an attempt to overcome this difficulty by combining the power of different methods. However, many such schemes proposed for WDS design require their parameters to be fine-tuned, hence their lack of adaptability, robustness and popularity.

In this paper, we applied two recently proposed hybrid algorithms, i.e. AMALGAM and Multi-Objective Hybrid Optimisation (MOHO), to solve the multi-objective design of a WDS. More specifically, we tested the strength of two different hybrid schemes (Talbi ), namely high-level teamwork hybrid (HTH) and high-level relay hybrid (HRH), by conducting the bi-objective optimal design on a wide range of benchmark models collected from the literature, including the Anytown network, which is regarded as one of the challenging benchmarks receiving less attention in the past (Prasad & Tanyimboh ). The problem was formulated to minimise the total cost and to maximise the network resilience, as defined by Prasad & Park (). In order to compare the performance of hybrid algorithms with state-of-the-art MOEAs in the domain, we used NSGA-II to solve the aforementioned problem as well. In addition, to evaluate the performance of each algorithm clearly, we employed a well-established indicator, i.e. hypervolume (Deb ), to assess the quality of final solutions. Multiple independent optimisation runs were carried out on each problem, which served to generate an unbiased evaluation based on statistics. The main contributions of this paper are the investigation of the capability of hybrid metaheuristics to perform multi-objective design of a WDS and the comparison of their performance with that of modern MOEAs by extensive testing. This work therefore aims to uncover the reasons for success and/or failure of the two algorithms and, in turn, to establish how the hybrid algorithms could benefit from further improvements.

The remainder of this paper is organised as follows: first, multi-objective design of a WDS is briefly introduced, followed by the mechanisms of AMALGAM and MOHO in more detail. Then, the benchmark problems used in this paper are summarised and the performance metric is given. After comparing the results obtained from each algorithm, conclusions are drawn at the end.

MULTI-OBJECTIVE DESIGN OF WDSs

The design of a WDS always involves optimising multiple and usually conflicting objectives at the same time, such as total cost, system reliability and water quality. The goal of multi-objective design of a WDS is to get as close as possible to the true trade-off between cost and benefit, which offers a range of alternatives for the decision-making process. A typical WDS design problem consists of providing a cost-effective specification of various components, i.e. pipes, pumps, valves, tanks, etc., within the network given the system layout. In a narrower sense, various investigators considered the design task to be the specification of the best combination of pipe sizes from within a discrete range of commercial diameters that meets the water demand and other system requirements. Herein, we focus on this narrow definition of the problem using bi-objective optimisation to minimise the total capital cost and maximise the performance benefits of the network. The value of the latter objective is calculated based on hydraulic simulation through the EPANET2.0 package (Rossman ).

A series of indicators (Todini ; Prasad & Park ; Prasad & Tanyimboh ) have been proposed in the literature as a surrogate of performance benefit giving preference to a 'looped network'. Recently, the resilience index (Todini ) has gained more attention due to its ability to account for failure conditions in a risk-type measure. It is defined based on the concept that the total input power into a network consists of the power dissipated in the network and the power delivered at demand nodes. In response to Todini's measure, Prasad & Park () developed the network resilience metric by taking the uniformity of pipes connected to a certain node into account. The advantage of the latter is that it explicitly rewards redundancy of similarly sized pipes as improving the reliability of the network under pipe failure scenarios (Raad et al. ). A new approach was recently proposed to provide flexibility to the design of water supply (Zhang & Babovic ) by considering innovative Real Options technology. However, this approach deals with the design of water systems under uncertainty, which is not considered here.

Given the above and the fact that this paper focuses on the comparison of hybrid metaheuristics for the WDS design, the optimisation methodology presented here is based on the conventional WDS design driven by the trade-off between the WDS design cost and performance, the latter being evaluated by using the network resilience metric.

HYBRID METAHEURISTICS

Unlike self-contained algorithms, hybrid metaheuristics combine two or more different mechanisms (usually built on population-based evolutionary algorithms) to improve the efficiency of the search towards the global optima. In an attempt to classify hybrid algorithms using common terminology, Talbi () presented a qualitative taxonomy for current hybrid metaheuristics considering both design and implementation issues. The taxonomy combined a hierarchical classification scheme with a flat classification scheme to provide a clear and structured framework for comparative purposes. Here, we mainly focus on the design issues of hybrid algorithms.

At the first level of the hierarchical classification, low-level and high-level hybridisations can be distinguished. This is done by ascertaining whether the component metaheuristics are embedded or self-contained. In the low-level hybrid class, a certain functional part of an algorithm is substituted with another algorithm. In the high-level hybrid class, by contrast, each algorithm works on its own without depending on other metaheuristics. At the second level of the hierarchical classification, each class (low-level or high-level hybrid) is further divided into relay and teamwork classes according to the working fashion, i.e. optimising a problem in turn or cooperatively. Therefore, four general types of algorithms are derived from the hierarchical taxonomy, i.e. Low-level Relay Hybrid, Low-level Teamwork Hybrid (LTH), HRH and HTH. According to the flat classification, all the abovementioned hybridisation classes can be categorised into homogeneous/heterogeneous, global/partial and specialist/general schemes. In homogeneous hybrids, all the constituent algorithms use the same metaheuristic, whereas in heterogeneous hybrids, different metaheuristics are employed. The hybrid schemes can also be viewed as global or partial hybrids depending on whether the whole search space is the same for all the sub-algorithms or is decomposed into sub-areas (one for each sub-algorithm). From the perspective of the function of metaheuristics, specialist hybrids can be distinguished from general hybrids as they combine sub-algorithms which aim to solve different problems from the others.
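The two-level hierarchy and the three flat attributes described above can be captured in a small data structure. The following Python sketch is illustrative only: the class and field names are hypothetical, and the two example instances use the classifications that this paper assigns to AMALGAM and MOHO in their respective sections.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    LOW = "low-level"    # one algorithm replaces a functional part of another
    HIGH = "high-level"  # each algorithm is self-contained


class Mode(Enum):
    RELAY = "relay"        # algorithms optimise the problem in turn
    TEAMWORK = "teamwork"  # algorithms optimise the problem cooperatively


@dataclass(frozen=True)
class HybridScheme:
    """Hypothetical record of one hybrid in the taxonomy described in the text."""
    level: Level
    mode: Mode
    heterogeneous: bool  # flat: different metaheuristics vs. the same one
    global_search: bool  # flat: shared search space vs. decomposed sub-areas
    general: bool        # flat: all sub-algorithms solve the same problem

    def acronym(self) -> str:
        """Return the hierarchical class label, e.g. 'HTH' or 'LTH'."""
        return f"{self.level.name[0]}{self.mode.name[0]}H"


# Classifications as stated in the paper for the two hybrids under study.
AMALGAM = HybridScheme(Level.HIGH, Mode.TEAMWORK, True, True, True)
MOHO = HybridScheme(Level.HIGH, Mode.RELAY, True, True, True)

print(AMALGAM.acronym(), MOHO.acronym())
```

Under this encoding the two hybrids differ only in the relay/teamwork attribute, which is the contrast the comparison in this paper is built on.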



Van Zyl et al. () proposed an LTH algorithm, which

the population-based evolutionary algorithms in contrasting

incorporated a hill-climber strategy with a GA method, to

fashions. In particular, two instances of high-level hybrid

solve operational optimisation of a WDS. They concluded

scheme are analysed by solving the bi-objective design prob-

that the hybrid algorithm outperformed pure GA by finding

lems using 12 WDS benchmark networks collected from the

good solutions quickly. They also showed that a local search

literature.

method complemented GA by efficiently finding local optima.

Instance of HTH: AMALGAM

Cisty () combined a GA with Linear Programming (LP) as an LTH to solve three least-cost design problems

Vrugt & Robinson () proposed a multi-algorithm,

of a WDS. This method employed a GA to decompose

genetically adaptive multi-objective method, known as

looped network configurations into a group of branched net-

AMALGAM. This can be classified as an HTH, hetero-

works. LP was then applied to optimise the branched

geneous, global, general framework. It simultaneously

networks as it was more reliable than heuristic methods in

employs four sub-algorithms within the framework, includ-

finding the global optimum. The results demonstrated the

ing NSGA-II, PSO, adaptive metropolis search (AMS) and

hybrid’s superiority in consistently generating better sol-

differential evolution (DE). The main aim of the developed

utions when compared to GA and Harmony Search.

algorithm was to overcome the drawbacks, as well as poss-

Tolson et al. () extended dynamically dimensioned

ible failure of an individual algorithm on a specific

search (DDS), which is a continuous global optimization

problem. The new concepts of multi-method search and

algorithm (Tolson & Shoemaker ) and developed an

genetically adaptive offspring creation are developed to

LTH (called hybrid discrete DDS, HD-DDS) by introducing

ensure a fast, reliable and computationally efficient algor-

two local search strategies. These local search heuristics

ithm for multi-objective optimisation. Results on a set of

involved one-pipe change and two-pipe change local

well-known multi-objective test functions suggest that this

moves in the process of solving a discrete, single-objective,

hybrid method achieved a tenfold improvement in conver-

constrained WDS design problem. The main advantages of

gence metric (Deb et al. ) over NSGA-II for the more

the algorithm were that it does not require fine-tuning of a

complex, higher dimensional problems. Besides its extra-

number of parameters and that it is computationally effi-

ordinary performance, AMALGAM provides a general

cient when compared to GA or PSO. The results obtained

template which is flexible and extensible, and could easily

(especially on a large network) revealed that it outperformed

accommodate any other population-based algorithms. A

the state-of-the-art existing algorithms in terms of searching

sequential version of AMALGAM code was requested

ability and computational efficiency.

from Vrugt for this work. The pseudocode of AMALGAM

As most low-level hybrid schemes commonly combine

is illustrated in Figure 1.

various local search strategies or a mechanism different

The parameter settings of three population-based sub-

from population-based techniques into the structure of evol-

algorithms within AMALGAM are summarised in Table 1.

utionary algorithms, they turn out to be tailored to cope with

Besides using GA, PSO and DE, AMALGAM also includes

specific problems. This is most often done by experimenting

AMS as a Markov Chain Monte Carlo (MCMC) sampler

with a rule that determines when to switch from one algor-

that proactively avoids the search being trapped in local

ithm to another. However, this makes such a hybrid less

optima. The algorithm works by substituting the parents

flexible as it would generally fail to adapt to other appli-

with offspring of lower fitness (Haario et al. ). This sam-

cations. On the other hand, few of these low-level hybrid

pler also shows superior efficiency in exploring the search

algorithms are designed for multi-objective optimisation

space of high-dimensionality. Therefore, AMS is capable of

except Creaco & Franchini (). Since the main concern

rapidly travelling across the entire Pareto distribution

of this paper is about multi-objective design of a WDS,

when the optimisation process progresses towards the PF.

herein, we focus on the comparison of two different high-

Readers are referred to the supporting information of

level hybrid schemes, i.e. HTH and HRH, which employ

(Vrugt & Robinson ) for more details.


Figure 1 | Pseudocode of AMALGAM.

Table 1 | Setting of parameters in AMALGAM

GA                                       PSO                                    DE
Crossover rate                    0.9    Inertia factor      0.5 + 0.5u(0,1)    Scaling factor 1    u(0.6,1.0)
Mutation rate                     1/L    Cognitive weight    1.5                Scaling factor 2    u(0.2,0.6)
Distribution index for crossover  20     Social weight       1.5
Distribution index for mutation   20     Turbulence factor   u(-1,1)

Note: L is the number of decision variables; u(a,b) is a uniform random number between a and b.
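The settings in Table 1 mix fixed values with values drawn from uniform distributions. As a rough illustration, the Python sketch below draws one set of settings; the function and key names are hypothetical, the lower bound of the turbulence factor is assumed to be -1, and the paper does not state how often the random values are redrawn, so a single draw per call is an assumption.

```python
import random


def sample_subalgorithm_settings(n_vars, rng=None):
    """Draw one set of sub-algorithm settings following Table 1 (illustrative)."""
    rng = rng or random.Random()
    u = rng.uniform  # u(a, b): uniform random number between a and b
    return {
        "GA": {
            "crossover_rate": 0.9,
            "mutation_rate": 1.0 / n_vars,  # 1/L, L = number of decision variables
            "crossover_distribution_index": 20,
            "mutation_distribution_index": 20,
        },
        "PSO": {
            "inertia_factor": 0.5 + 0.5 * u(0.0, 1.0),
            "cognitive_weight": 1.5,
            "social_weight": 1.5,
            "turbulence_factor": u(-1.0, 1.0),  # assumed range
        },
        "DE": {
            "scaling_factor_1": u(0.6, 1.0),
            "scaling_factor_2": u(0.2, 0.6),
        },
    }


# Example: a problem with 21 decision variables (value chosen for illustration).
settings = sample_subalgorithm_settings(n_vars=21, rng=random.Random(0))
print(settings["GA"]["mutation_rate"])
```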

Instance of HRH: MOHO

Moral & Dulikravich () focused on another hybrid scheme following the concept of Pareto-dominance. They presented the MOHO algorithm as an HRH, heterogeneous, global, general metaheuristic which implements three sub-algorithms in a sequential manner. The MOHO hybrid coordinates SPEA2, Multi-Objective Particle Swarm Optimisation (MOPSO) and Non-dominated Sorting Differential Evolution (NSDE) and decides which one of them will generate offspring using an automatic switching procedure. More specifically, MOHO chooses one of them to produce the next generation based on the performance of the currently employed algorithm. Five different indicators for measuring improvements in finding non-dominated solutions, including the quality of approximation and distribution, are used to decide whether to continue with a particular algorithm or change to another one. In this paper, we were not able to use the original MOHO software; instead, we recreated it in a Matlab environment following its primary idea. Figure 2 shows the pseudocode of MOHO.

Figure 2 | Pseudocode of MOHO.

MOHO evaluates the performance of its sub-algorithms on five distinct improvements: (1) changes in the size of the non-dominated set; (2) whether there exists a solution from the new generation which dominates any members in the last generation; (3) changes in the hypervolume indicator; (4) changes in average Euclidean distance; (5) increase in the spread indicator. The innovative part of this evaluation strategy is that MOHO considers not only the quality of the non-dominated set in the next generation (i.e. in terms of convergence and diversity), but also the perturbation introduced by potential solutions which may bring substantial improvement in later iterations. The main differences between the original MOHO and the one reported here are twofold. First, the initial population is generated using uniformly distributed random sampling rather than Sobol's quasi-random sequence generator (Bratley & Fox ), as the advantage of this method vanishes for higher dimensional problems (Rahnamayan et al. ). Secondly, since the parameter settings of each sub-algorithm were not clearly stated in the original MOHO, we configured these values by trial and error on some difficult test functions and chose the best combination based on the experimental results. Additionally, the maximum number of consecutive iterations of a certain sub-algorithm is set to 1/50 of total generations.

Further details about the two hybrid algorithms and their performance can be found in the original authors' papers (Vrugt & Robinson ; Moral & Dulikravich ). Apart from using hybrid algorithms with distinct schemes, we also applied NSGA-II to solve the benchmark problems in order to compare the quality of final solutions. For more details about NSGA-II, readers are referred to Deb et al. (). The latest version of NSGA-II (revision 1.1.6) was downloaded from the website of the Kanpur Genetic Algorithms laboratory (http://www.iitk.ac.in/kangal/codes.shtml).

CASE STUDIES

Benchmark problems

To compare the performance of AMALGAM and MOHO against NSGA-II thoroughly, 12 WDS networks were collected from the literature and served as benchmarks for optimisation tests. The number of pipes in these models ranges from eight to 454, which, together with various design criteria, provides a wide range of problems and search spaces, with the number of candidate solutions ranging between 10^7 and 10^454. The names of the benchmark models, numbers of pipes, diameter options and relevant design criteria are summarised in Table 2.

Table 2 | Benchmark models used for comparison of each algorithm

                                                                Design criteria
No.  Model                             Pipe count  Option count  Min head  Max head  Max velocity  Multiple loading condition
1    Two-loop Network (TLN)            8           14            Yes       No        No            No
2    BakRyan Network (BAK)             9           11            Yes       No        No            No
3    New York Tunnel Network (NYT)     21          16            Yes       No        No            No
4    Blacksburg Network (BLA)          23          15            Yes       Yes       Yes           No
5    GoYang Network (GOY)              30          8             Yes       No        No            No
6    Hanoi Network (HAN)               34          6             Yes       No        No            No
7    Fossolo Network (FOS)             58          22            Yes       Yes       Yes           No
8    Pescara Network (PES)             99          13            Yes       Yes       Yes           No
9    Modena Network (MOD)              317         13            Yes       Yes       Yes           No
10   Balerma Irrigation Network (BIN)  454         10            Yes       No        No            No
11   Two Reservoir Network (TRN)       8           8             Yes       No        No            Yes
12   Anytown Network (ANT)             43          10            Yes       No        No            Yes

Note: For TRN network, three of eight pipes are existing pipes which have three options including 'do nothing', cleaning or duplication; for ANT network, although there are only 43 pipes to be considered, its formulation contains up to 112 decision variables, which makes it the most challenging problem in the list.

It is worth mentioning that four benchmark models, BLA, FOS, PES and MOD, adopted from Bragalli et al. (), are more realistic than the others (except ANT), as they all impose a reasonable range of pressure head (not only a minimum pressure requirement) as well as an upper bound on flow velocity in the network. Although ANT was introduced as a hypothetical network, it contains most of the common features found in many real systems (multiple loading conditions, pipe duplication or reconditioning (i.e. cleaning and re-lining), new pipe installation, tank location and operation as well as pump scheduling). For a detailed description of the design criteria of each model, interested readers are referred to Dong et al. (), Raad () and http://centres.exeter.ac.uk/cws.

Performance indicator

It should be emphasised here that there is no ideal indicator which can give a consistent and definite evaluation of both convergence and diversity of multi-objective optimisation. Among the various metrics designed to measure the achievement of MOEAs, it is established that hypervolume (HV) is a single metric which can assess the performance of both aspects in a combined sense (Deb ). In order to remove the bias caused by the magnitude of different objective functions, we take the normalised version of HV, called the ratio of the HV of the approximation set and of the true Pareto-optimal front (HVR) (Deb ), to evaluate the quality of the final solutions obtained from each algorithm. The expressions for HV and HVR are given in Equations (1) and (2), respectively:

$$\mathrm{HV} = \mathrm{volume}\left(\bigcup_{i=1}^{|Q|} v_i\right) \tag{1}$$

$$\mathrm{HVR} = \frac{\mathrm{HV}(Q)}{\mathrm{HV}(P^{*})} \tag{2}$$

where v_i is the hypercube constructed with a reference point (normally a vector of worst objective values) and the solution i as the diagonal corners; Q is the set of non-dominated solutions obtained by an algorithm and P* is the set of solutions in the true PF.

Since we do not have a theoretical true PF for each benchmark problem, to assist the evaluation of performance, a quasi-true Pareto-optimal front (quasi-PF) was generated for each problem. This was achieved by applying a non-dominated sorting procedure to the aggregated Pareto fronts obtained by all three algorithms through multiple runs.

RESULTS AND DISCUSSION

The benchmark networks adopted in this paper encompass a wide range of network sizes, with up to several hundred pipes. Hence, various computational budgets (Table 3) were tested to make sure each algorithm converged well before their performance could be compared. It is worth noting that these budgets (i.e. population size and number of generations for each benchmark problem) are kept the same for all three algorithms. As such, the number of function evaluations via EPANET2.0 (Rossman ) varied from 25,000 to 500,000. Because each algorithm produces a first generation in a different way, multiple runs are implemented to eliminate the influence of the initial population, and the statistical results of HVR are used to assess their performance. Thirty independent runs were carried out for all cases except ANT, which was run 10 times as it requires many more generations to ensure convergence and is thus extremely time-consuming.

Table 3 | Configuration of computational budget

Population size  Generation  Pipe No.  Problems
100              250         50        TLN, BAK, NYT, BLA, GOY, HAN, TRN
100              500         100       FOS, PES
100              1000        500       MOD, BIN
100              5000        N/A       ANT

Note: The numbers of population size and generation (except ANT) are decided based on trial runs in order to ensure the convergence of NSGA-II given the specified computational budgets. The number of generation on the ANT problem follows the same setting chosen by Farmani et al. (2005b).

Figure 3 shows the box plot of statistical performances of the three algorithms on the 12 benchmark problems. The top and bottom edges of the grey bar in each plot represent the maximum and minimum values of HVR for each algorithm, respectively. The intermediate short lines in dark colour denote the average values of HVR for each algorithm. Reference points are also provided in terms of cost (in million units) and network resilience values. For example, for TLN the reference point is (5.0, 0.1). The results clearly demonstrate that AMALGAM consistently outperforms MOHO and NSGA-II on the networks of small-to-medium size (Wang et al. ), i.e. TLN, BAK, NYT, GOY, FOS, PES, MOD and TRN. The performance of MOHO was comparable to that of AMALGAM and NSGA-II on smaller networks, i.e. TLN, BAK, NYT, BLA, GOY and TRN; however, it became less efficient on larger networks, i.e. HAN, FOS, PES, MOD, BIN and ANT, as the complexity of the problem increased. Even worse, it was able to find only one feasible solution on the ANT problem in 10 runs. On smaller networks (fewer than 400 pipes), NSGA-II performed worse than the hybrid algorithms except on the BLA and HAN problems; by contrast, it dominated the hybrid algorithms on larger networks, i.e. BIN and ANT. Admittedly, none of the algorithms converged on the ANT problem, which also implies that it was the most complex problem among the selected cases, as it considers many aspects simultaneously. Furthermore, it is important to emphasise that the convergence of MOHO and NSGA-II was highly dependent on the initial random seeds. For instance, twice out of 10 runs, improper seeds resulted in complete failure of NSGA-II, as no feasible solutions were found in the final population. By contrast, AMALGAM, which uses Latin hypercube sampling for initialisation, successfully discovered non-dominated solutions every time.

Figure 3 | Statistical performances of each algorithm on each problem using HVR indicator.

Another way to compare the performance of the three algorithms is to illustrate their contributions to the Pareto front obtained via multiple runs on each case (see Figure 4). Herein, only four cases, namely NYT, HAN, PES and BIN, are chosen as they exhibit different levels of complexity within the problems considered in the paper. Each figure is produced in the following manner. Firstly, the objective function values of the non-dominated solutions obtained by each algorithm (via 30 runs) are rounded to four-digit precision and the duplicate solutions are removed. Next, the quasi-PF for each case is generated using the non-dominated sorting procedure (Deb et al. ). Seven data sets are then obtained by counting the common contribution of all three algorithms (denoted as U_{AMALGAM+MOHO+NSGA-II}), the common contribution of every two algorithms (denoted as U_{AMALGAM+MOHO}, U_{AMALGAM+NSGA-II}, U_{MOHO+NSGA-II}), and the individual contributions of each algorithm (denoted as S_{AMALGAM}, S_{MOHO}, S_{NSGA-II}), which exclude the common ones. Finally, these data sets are plotted in Figure 4. It should be noted that these sets can be empty and therefore are not necessarily shown on the figure. Similar algorithm performance trends can be observed as discussed previously. AMALGAM was consistently superior to the others in terms of diversity by identifying the solutions in the region of high network resilience. NSGA-II outperformed the hybrid algorithms in terms of convergence towards the region of low cost, especially on larger networks (i.e. HAN, PES and BIN), while MOHO was able to find solutions in the quasi-PFs of NYT and PES, albeit completely failing on the HAN and BIN problems.

Figure 4 | Pareto fronts obtained via multiple runs by AMALGAM, MOHO, and NSGA-II. (a) Case NYT, (b) Case HAN, (c) Case PES, (d) Case BIN.

To compare quantitatively the contributions of each algorithm, Table 4 summarises the percentage of solutions obtained by each algorithm in the quasi-PF set for each test case. On seven of these benchmark problems (mainly on larger networks), NSGA-II found a significant number of solutions in the quasi-PF sets. For smaller test cases, like TLN and TRN, its contribution was similar to that of AMALGAM and MOHO. It is worth noting that on the ANT problem the quasi-PF consisted solely of the solutions obtained by NSGA-II. This highlighted the superiority of NSGA-II in terms of convergence given a fixed computational budget. Conversely, AMALGAM successfully produced more solutions on five small-to-medium size networks when compared to NSGA-II. Such performance was due to its better achievement in terms of diversity and convergence. It can also be observed that AMALGAM always found extreme points of the quasi-PF sets in the region of high network resilience, which were often neglected by NSGA-II and MOHO. Interestingly, MOHO failed to generate any members in the quasi-PF sets on HAN, MOD, BIN and ANT. Furthermore, it only found a feasible solution set once out of 10 runs on the ANT problem.

Table 4 | Percentage of contribution from each algorithm for each design problem

Contribution in percentage (%)
Problem    TLN   BAK   NYT   BLA   GOY   HAN
AMALGAM    88    100*  67*   52*   62*   25
MOHO       91    98    60    21    32    0
NSGA-II    98*   96    39    31    42    78*

Contribution in percentage (%)
Problem    FOS   PES   MOD   BIN   TRN   ANT
AMALGAM    31    38    56*   38    98    0
MOHO       31    12    0     0     95    0
NSGA-II    38*   50*   44    62*   99*   100*

Note: The maximum contribution to each problem is marked with an asterisk.

Figure 5 | Statistical performances of sub-algorithms within AMALGAM on four selected cases.

In order to investigate the reasons why the hybrid algorithms failed on some cases, the evolutionary processes of each sub-algorithm within AMALGAM and MOHO on all design problems were recorded and analysed. Four cases, i.e. HAN, PES, BIN and ANT, were selected and discussed here as they represented the most difficult ones under limited computational budget levels, i.e. 250, 500, 1000 and 5000 generations, respectively. Since the bottom line of creating offspring points in AMALGAM was maintained at 5, the number of individuals provided by a specific sub-algorithm was expected to vary between 5 and 85. As shown in Figure 5, for the HAN problem, AMS outperformed


the other three sub-algorithms by generating a median value of 50 points within the 250 generations. GA worked better than DE followed by PSO which always stayed around the bottom line. However, this behaviour changes steadily from less complex (i.e. PES) to more complex (ANT) problems as GA consistently dominated other sub-algorithms. Only DE was comparable to GA on the PES and BIN problems, while PSO and AMS seldom made a contribution to the population and stayed at the minimum level most of the time. For the ANT problem, GA steadily produced most offspring. In other words, AMALGAM behaved like NSGA-II. Therefore, the failure of AMALGAM on the ANT problem could be attributed to the fact that PSO, AMS and DE were not effective and consequently wasted search resources.

In three of the four selected cases and the MOD problem, MOHO completely failed to contribute any solutions in sets of the quasi-PF. In Figure 6, it can be observed that MOPSO was inefficient especially on large networks as on average it ran less than 1/20 of total iterations. Although SPEA2 was comparable with NSDE on the first three cases, it was not selected to produce a next generation on the ANT problem. This resulted in MOHO working similarly to NSGA-II while wasting nearly 20% of iterations to explore the search space. Another explanation for MOHO's inefficiency is that the adaptive feature may be significantly weakened as the inefficiency of a certain constituent algorithm produces poor solutions.

Figure 6 | Statistical performances of sub-algorithms within MOHO on four selected cases.

CONCLUSIONS

Two hybrid algorithms, namely AMALGAM and MOHO, as well as NSGA-II were applied to a wide range of multi-objective design of WDS benchmark networks. AMALGAM employs four sub-algorithms simultaneously and adapts offspring creation genetically based on the success rate of each algorithm in producing the next population. MOHO, on the other hand, selects in sequence when to switch from one of its sub-algorithms to another by monitoring performance on five separate aspects. NSGA-II was used as a representative of state-of-the-art MOEAs for the purpose of comparison. Multiple independent runs were carried out on each test case and the HVR metric was adopted to assess their performance in terms of convergence and diversity.

The results clearly reveal that AMALGAM (HTH scheme) is superior to NSGA-II on the networks of small-to-medium size, which indicates that this achievement benefits from the strategies of adaptive multi-method search and global information sharing. On the other hand, the HTH scheme has potential to achieve better performance compared to the HRH scheme through taking full advantage of each sub-algorithm more efficiently. However, on larger networks, the behaviour of hybrid algorithms gradually deteriorated or completely failed. The underlying reason why hybrid metaheuristics perform worse on larger networks was also investigated by monitoring the evolutionary process of their sub-algorithms in detail. The failure is attributed to the loss of effectiveness in terms of proactive adaptation. Actually, it is observed that, on the ANT problem, AMALGAM performed nearly the same as NSGA-II because GA dominated other sub-algorithms completely most of the time.

Admittedly, there is still a lack of theoretical analysis in the literature about the impact of problem characteristics on the performance of metaheuristics, which leaves them and associated hybrid methods (like AMALGAM and MOHO in this paper) as black-box approaches and thus exposes them to criticism. Future work on this aspect is needed to change this situation substantially. There is also a missing step between the design and application stages of hybrid schemes: verifying the effectiveness and efficiency of a specific combination of different sub-algorithms from a mathematical point of view. Without this step, it can be misleading when creating a new hybrid scheme. Moreover, the parameterisation issue of hybrid algorithms should be carefully investigated giving consideration to different problem characteristics.

In addition, with the development of both hardware and software in computer technology, the computational capacity of modern PCs has been significantly improved; hence, we suggest that any newly-developed hybrid frameworks or MOEAs should be tested on a wide range of benchmark networks as shown in this work. Furthermore, considerable attention should be focused on the networks of medium-to-large size which give sufficient consideration to the requirements of real-world cases. On the other hand, there are additional concerns other than cost and reliability (e.g. water quality issues) in real cases. The multi-objective design of a WDS may need to adapt to a many-objective (more than three objectives) design process (Fu et al. b). Thus, the future development of hybrid metaheuristics should cope with the expansion of dimensionality in both objective function space and decision variable space.
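The contribution analysis behind Figure 4 and Table 4 amounts to set operations on the solutions that each algorithm places in the quasi-PF. The sketch below is a minimal Python illustration under stated assumptions, not the authors' implementation: each solution is assumed to be a hashable tuple such as (cost, resilience), the pairwise sets are assumed to exclude solutions common to all three algorithms, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def partition_contributions(found):
    """Split quasi-PF solutions into disjoint contribution sets.

    `found` maps an algorithm name to the set of quasi-PF solutions it found.
    The result is keyed by a frozenset of algorithm names: a key with one
    name holds solutions found only by that algorithm (the S sets); a key
    with two or three names holds solutions common to exactly those
    algorithms (the U sets). Empty sets are kept so all seven keys exist.
    """
    names = sorted(found)
    parts = {
        frozenset(group): set()
        for size in range(1, len(names) + 1)
        for group in combinations(names, size)
    }
    for solution in set().union(*found.values()):
        owners = frozenset(n for n in names if solution in found[n])
        parts[owners].add(solution)
    return parts


def contribution_percent(found):
    """Share of the quasi-PF found by each algorithm, in percent.

    Shares can add up to more than 100 because a solution found by
    several algorithms counts for each of them.
    """
    quasi_pf = set().union(*found.values())
    return {n: 100.0 * len(s) / len(quasi_pf) for n, s in found.items()}


# Toy data (hypothetical): (cost, resilience) pairs on a quasi-PF.
found = {
    "AMALGAM": {(1, 9), (2, 8), (3, 7)},
    "MOHO": {(2, 8)},
    "NSGA-II": {(2, 8), (3, 7), (4, 6)},
}
parts = partition_contributions(found)
percent = contribution_percent(found)
```

With three algorithms the partition always has seven keys, which matches the seven data sets described for Figure 4; some of them may be empty.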

REFERENCES Bragalli, C., D’Ambrosio, C., Lee, J., Lodi, A. & Toth, P.  Water Network Design by MINLP. IBM Research Report. Bratley, P. & Fox, B. L.  Algorithm 659: implementing Sobol’s quasirandom sequence generator. ACM Trans. Math. Softw. 14 (1), 88–100. Cisty, M.  Hybrid genetic algorithm and linear programming method for least-cost design of water distribution systems. Water Resour. Manage. 24, 1–24. Creaco, E. & Franchini, M.  Fast network multi-objective design algorithm combined with an a posteriori procedure for reliability evaluation under various operational scenarios. Urban Water J. 9 (6), 385–399. Deb, K.  Multi-Objective Optimization using Evolutionary Algorithms. John Wiley & Sons, Chichester, UK.


Deb, K., Pratap, A., Agarwal, S. & Meyarivan, T.  A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6 (2), 182–197. Dong, X., Liu, S., Tao, T., Li, S. & Xin, K.  A comparative study of differential evolution and genetic algorithms for optimizing the design of water distribution systems. J. Zhejiang Univ-Sci A (Appl. Phys. Eng.) 13 (9), 674–686. Farmani, R., Savic, D. A. & Walters, G. A. a Evolutionary multi-objective optimization in water distribution network design. Eng. Optim. 37 (2), 167–183. Farmani, R., Walters, G. A. & Savic, D. A. b Trade-off between total cost and reliability for Anytown water distribution network. J. Water Res. Plan. Manage. 131 (3), 161–171. Farmani, R., Walters, G. & Savic, D.  Evolutionary multiobjective optimization of the design and operation of water distribution network: Total cost vs. reliability vs. water quality. J. Hydroinf. 8 (3), 165–179. Fu, G., Kapelan, Z. & Reed, P. a Reducing the complexity of multiobjective water distribution system optimization through global sensitivity analysis. J. Water Res. Plan. Manage. 138 (3), 196–207. Fu, G., Kapelan, Z., Kasprzyk, J. & Reed, P. b Optimal design of water distribution systems using many-objective visual analytics. J. Water Res. Plan. Manage. 10.1061/(ASCE)WR. 1943-5452.0000311. Haario, H., Saksman, E. & Tamminen, J.  An adaptive metropolis algorithm. Bernoulli 7 (2), 223–242. Keedwell, E. & Khu, S.  Novel cellular automata approach to optimal water distribution network design. J. Comput. Civil Eng. 20 (1), 49–56. Keedwell, E. C. & Khu, S. T.  More Choices in Water System Design Through Hybrid Optimisation. Computing and Control for the Water Industry 2003, London, UK, pp. 257–264. Khu, S.-T. & Keedwell, E.  Introducing more choices (flexibility) in the upgrading of water distribution networks: The New York city tunnel network example. Eng. Optim. 37 (3), 291–305. Moral, R. J. & Dulikravich, G. S. 
 Multi-objective hybrid evolutionary optimization with automatic switching among constituent algorithms. AIAA J. 46 (3), 673–681. Papadimitriou, C. H. & Steiglitz, K.  Combinatorial Optimization: Algorithms and Complexity. Dover Publications, New York. Prasad, T. D. & Park, N.-S.  Multiobjective genetic algorithms for design of water distribution networks. J. Water Res. Plan. Manage. 130 (1), 73–82. Prasad, T. D. & Tanyimboh, T. T.  Entropy based design of “Anytown” water distribution network. In: Water Distribution Systems Analysis 2008 (J. E. Van Zyl, A. A. Ilemobade & H. E. Jacobs, eds). ASCE, Kruger National Park, South Africa, pp. 450–461. Raad, D. N.  Multi-objective Optimisation of Water Distribution Systems Design Using Metaheuristics. University of Stellenbosch, Stellenbosch. Raad, D., Sinske, A. & Van Vuuren, J.  Robust multi-objective optimization for water distribution system design using a meta-metaheuristic. Int. Trans. Oper. Res. 16 (5), 595–626.



Rahnamayan, S., Tizhoosh, H. R. & Salama, M. M. A.  A novel population initialization method for accelerating evolutionary algorithms. Comput. Math. Appl. 53 (10), 1605–1614. Rossman, L. A.  EPANET 2 Users Manual. U.S. Environment Protection Agency, Cincinnati, Ohio, USA. Talbi, E. G.  A taxonomy of hybrid metaheuristics. J. Heuristics 8, 541–564. Todini, E.  Looped water distribution networks design using a resilience index based heuristic approach. Urban Water 2 (2), 115–122. Tolson, B. A. & Shoemaker, C. A.  Dynamically dimensioned search algorithm for computationally efficient watershed model calibration. Water Resour. Res. 43, W01413. Tolson, B. A., Asadzadeh, M., Maier, H. R. & Zecchin, A.  Hybrid discrete dynamically dimensioned search (HD-DDS) algorithm for water distribution system design optimization. Water Resour. Res. 45, W12416. Van Zyl, J. E., Savic, D. A. & Walters, G. A.  Operational optimization of water distribution systems using a hybrid genetic algorithm. J. Water Resour. Plan. Manage. 130 (2), 160–170. Vrugt, J. A. & Robinson, B. A.  Improved evolutionary optimization from genetically adaptive multimethod search. Proc. Natl. Acad. Sci. USA 104 (3), 708–711.


Walski, T. M., Brill, J. E. D., Gessler, J., Goulter, I. C., Jeppson, R. M., Lansey, K., Lee, H.-L., Liebman, J. C., Mays, L., Morgan, D. R. & Ormsbee, L.  Battle of the network models: Epilogue. J. Water Res. Plan. Manage. 113 (2), 191–203. Wang, Q., Savic´, D. & Kapelan, Z.  Hybrid optimisation algorithms for multi-objective design of water distribution systems. 10th International Conference on Hydroinformatics, Hamburg, Germany. Wolpert, D. H. & Macready, W. G.  No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1 (1), 67–82. Zhang, S. X. & Babovic, V.  A real options approach to the design and architecture of water supply systems using innovative water technologies under uncertainty. J. Hydroinf. 14 (1), 13–29. Zitzler, E. & Thiele, L.  Multiobjective evolutionary algorithms: A comparative case study and the strength Pareto approach. IEEE Trans. Evol. Comput. 3 (4), 257–271. Zitzler, E., Laumanns, M. & Thiele, L.  SPEA2: Improving the strength pareto evolutionary algorithm for multiobjective optimization. In: Evolutionary Methods for Design, Optimisation and Control (K. Giannakoglou, D. Tsahalis, J. Periaux, K. Papailiou & T. Fogarty, eds). International Center for Numerical Methods in Engineering (CIMNE), Barcelona, Spain, pp. 95–100.

First received 17 January 2013; accepted in revised form 24 June 2013. Available online 24 July 2013


© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

Consequence management of chemical intrusion in water distribution networks under inexact scenarios

Abbas Afshar and Ehsan Najafi

ABSTRACT The US Environmental Protection Agency (EPA)'s Response Protocol Toolbox provides a list of recommendations on actions that may be taken to minimize the potential threats to public health following a contamination threat. This protocol comprises three steps: (1) detection of contaminant presence, (2) source identification and (3) consequence management. This paper intends to explore consequence management under source uncertainty, applying Minimize Maximum Regret (MMR) and Minimize Total Regret (MTR) approaches. An ant colony optimization algorithm is coupled with the EPANET network solver for structuring the MMR and MTR models to present a robust method for consequence management by selecting the best combination of hydrants and valves for isolation and contamination flushing out of the system. The proposed models are applied to network number 3 of EPANET to present their effectiveness and capabilities in developing effective consequence management strategies.

Key words | ant colony algorithm, consequence management, minimize maximum regret, water network contamination

Abbas Afshar
Department of Civil Engineering and EnviroHydroinformatic Center of Excellence, Iran University of Science & Technology, Tehran, Iran

Ehsan Najafi (corresponding author)
Department of Civil Engineering, Iran University of Science & Technology, Tehran, Iran
E-mail: Ehs.najafi@gmail.com

NOTATION

G^{k_gb}     objective function value for the ant with the best performance within the past total iterations
L            set of options {l_ij}
α, β         parameters which control the relative importance of the pheromone trail against heuristic value
η_ij         heuristic value representing the desirability of state transition ij
ρ            coefficient of pheromone evaporation
τ_ij(t)      total pheromone deposited on path ij at iteration t
k_gb         ant with the best performance within the past total iterations
P_ij(k, t)   likelihood that ant k selects option l_ij for decision point i at iteration t
q            random variable uniformly distributed over [0, 1]
q_0          tunable parameter ∈ [0, 1]
Q            constant
R(x, s)      regret for solution x under scenario s
Z(x, s)      number of polluted consumer nodes with solution x under scenario s
Z(x*, s)     number of polluted consumer nodes with optimal solution x* under scenario s
F            number of polluted consumer nodes from the beginning of consequence management until the end of the simulation counted over all discrete time intervals
i            node index
n            total number of consumer nodes
S            set of scenarios
x            a solution composed of valves and hydrants
X            search space that is set of solutions composed of valves and hydrants
c_t          threshold value
t_CM         time of beginning the consequence management
x*           an optimal solution composed of valves and hydrants
EPS          overall simulation duration

doi: 10.2166/hydro.2013.125



A. Afshar & E. Najafi

|

Consequence management under inexact scenarios


INTRODUCTION

Water distribution networks are one of the most important infrastructures and highly vulnerable to deliberate contamination intrusions. Following the terrorism events of September 11, 2001 in the United States, the literature has been focused more on the possibility of intentional contamination intrusions within drinking water distribution systems. The US Environmental Protection Agency (EPA)'s Response Protocol Toolbox (US EPA ) provides a list of recommendations on actions that may be taken to minimize the potential threats to public health following a contamination threat. This protocol comprises three steps: (1) detection of contaminant presence, (2) source identification and (3) consequence management.

Addressing the contaminant detection, numerous researchers during the last decade have focused on the placement of online water quality monitoring sensors to effectively detect contamination incidents in the shortest possible time to reduce potential public health and economic consequences (Ostfeld & Salomons ; Berry et al. ; Propato ). The locations of online sensors can be optimized to help achieve one goal or a combination of goals such as minimizing public exposure to contaminants, the spatial extent of contamination, sensor detection time, or costs. To address the second step (i.e. source identification) of the protocol a few other researchers have investigated different methods for identifying locations of contaminant injection after detection of pollution (De Sanctis et al. ; Laird et al. ; Preis & Ostfeld ). As the number of measurements increases over time, the problem is better defined but contaminant spread and public exposure also increase (Poulin et al. ), so source identification in a short time is a tricky task. Due to the sparseness of the sensor grid, this problem inherently has non-unique solutions (Laird et al. ). Thus solving the inverse problem of source identification leads to several probable injection locations.

Following the successful detection of a contamination event via a contamination warning system, consequence management strategies must be implemented. These consequence management strategies would include the following factors: (1) public notification; (2) isolation of a contaminant through valve operations (US EPA a); (3) flushing the contaminated water out of the system through hydrants (US EPA b); and (4) any combinations of public notification, valve operations, and system flushing. Flushing is the purging of water from the distribution network via fire hydrants or blow-off ports to address water quality concerns (Baranowski et al. ). Consequence management strategies which could best minimize public health hazard and economic impacts to remediate contaminated systems must then be evaluated.

Few studies have systematically focused on the development and application of the most effective consequence management strategies in response to contamination, which leaves the field in its early stages of development. Baranowski & LeBoeuf () investigated consequence management in order to identify demands which were most appropriate to minimize the concentration of contaminants in a water distribution network. They employed three different gradient-based optimization techniques in order to find the near-optimal demand required for minimizing total network contaminant concentration after detection of pollution presence by warning sensors. In another attempt, Baranowski & LeBoeuf () employed a genetic algorithm to minimize contaminant concentrations in a water network along with minimizing the cost of hydrant flushing. EPANET as a hydraulic simulator was employed in their study, and the genetic algorithm was utilized to identify the following items: (1) the nodes at which to alter the demand; (2) the new demands for these nodes; and (3) pipe closure locations essential to decrease the contaminant concentration during an incident. Under their assumptions, flushing could be done at any node and every pipe could be closed as desired. Preis & Ostfeld () utilized Non-Dominated Sorted Genetic Algorithm II (NSGAII) as an optimizer in order to enhance the response against intentional contamination intrusions into water networks. They explored two conflicting objectives: (1) minimization of the contaminant mass consumed following detection, versus (2) minimization of the number of operational activities requisite for isolation and flushing the contaminant out of the network. They defined the first objective as the total mass of contamination in the consumed water following detection until the end of the simulation period. In their system simulation, occurrence


of negative pressures was disregarded. Poulin et al. () introduced a simple topological method to organize the isolation of polluted zones within the drinking water supply networks. Their approach is based on closing proper valves and leaving one pipe to let clean water go through the isolated area. Following the previous study which addressed isolation of the contaminated area, in another study, Poulin et al. () defined unidirectional flushing strategies through a heuristic set of rules in a well-organized and efficient way. Alfonso et al. () presented a methodology for finding sets of operational activities in a water distribution network in order to flush the pollution out of the system to minimize the impact on the population. They treated the situation both as a single-objective and as a multi-objective optimization problem, which were investigated by using optimization techniques in combination with EPANET.

Although different strategies have been defined and a few methodologies developed to effectively manage the consequences of the contamination after potential source identification, very few attempts have been made to explicitly or implicitly deal with the uncertain location of the intrusion as an important issue which should be addressed following the second step. In fact, inverse solutions for source identification may not result in a single solution, implying that more than one source may be nominated as a possible polluted node. Each polluted node will call for a different optimal consequence management and operational strategy. All previous studies have developed consequence management strategies assuming a predefined intrusion location. Disregarding the uncertainties involved in the assumed polluted node may result in a solution strategy very far away from the optimal one. In a most recent work, Haxton & Uber () utilized a source location algorithm according to an event backtracking analysis to determine feasible and likely injection nodes. In their study, the source locations were considered as inputs to the flushing approach, which made the average impact least across all of the injection locations. Based on their results, knowing the contaminant source location would influence the efficiency of the flushing significantly and if the number of potential and feasible source locations was smaller, the decrease in impacts would be greater. Realizing that there might be no priority in selecting any of the nominated nodes for consequence management, a well-established approach is lacking. In this study, consequence management is explored under source uncertainty applying Minimize Maximum Regret (MMR) and Minimize Total Regret (MTR) approaches. Although the min-max regret model has been applied to several optimization cases under uncertainties (Averbakh ; Chang & Davila ; Afshar & Amiri ), utilization of these methods in consequence management has not been reported.

METHODOLOGY

In this paper, an ant colony optimization (ACO) algorithm for solving MMR and MTR models, considering a constraint for technical operational capacity, is presented. A water distribution network is considered in which some of the pipes represent valves and some of the nodes represent hydrants. To deal with uncertainties, five nodes are assumed as probable locations of intrusion. The strategy that has been used consists of three steps. The first step finds the optimal solution employing the ACO algorithm. The objective function in the optimization model seeks to minimize the total number of polluted consumer nodes for each scenario. The second step finds the solution that minimizes the maximum regret over all potential scenarios, as will be defined in the next section (Minimizing Maximum Regret). The third step finds the solution which minimizes the total regret over all potential scenarios (Minimizing Total Regret).

MMR AND MTR APPROACHES IN CONSEQUENCE MANAGEMENT STRATEGIES

Many problems are associated with some degree of uncertainty. In these situations, the decision maker tries to find a solution that performs relatively well across uncertainties. The regret criterion is a useful tool for decision making under uncertainty. Regret is a sense of loss which is felt by the decision maker knowing an alternative action would be more profitable than the one that was taken (Mausser & Laguna ). For instance, in finance, an investor may observe not only his own portfolio performance but also returns on other stocks or portfolios in which he was able


to invest but decided not to. Therefore, it seems very natural to assume that the investor may feel joy/disappointment if his own portfolio outperformed/underperformed some benchmark portfolio or portfolios (Aissi et al. ). MMR and MTR approaches are among the most reliable criteria for decision making under uncertainties when the likelihood of the possible outcomes cannot be predicted with satisfactory accuracy (Loulou & Kanudia ). In other words, the MMR and MTR approaches are suitable in situations where the decision maker may feel regret if a wrong decision has been made, so the results will be more satisfying if this regret is taken into account when he/she decides. The objective of the MMR approach is to find a decision which minimizes the maximum deviation between that decision and the optimum decision for each scenario over all possible (or identified) scenarios. In fact this model intends to make a decision with the best possible performance in the worst case (Aissi et al. ). In this study, alternative actions or scenarios are defined as probable injection locations. The decision maker in this problem is an authority governing and running the city's water distribution network. He/she is the one who bears the responsibility of making a decision and may make that decision based on MMR or MTR when analyzing the consequences and the harmful effects of intrusion on public health.

Let Z(x, s) be the number of polluted consumer nodes under scenario s and solution x where x is a solution that consists of valves and hydrants used to isolate and remove contaminant out of the system and s is a probable node of intrusion. In this definition x ∈ X and s ∈ S where X is decision space and S contains all of the probable nodes of intrusion (scenarios). For a solution x ∈ X, the regret under scenario s ∈ S is defined as follows:

\[ R(x, s) = Z(x, s) - Z(x^{*}, s) \tag{1} \]

where Z(x*, s) is the total number of polluted consumer nodes with optimal solution x* under scenario s. In order to determine x*, the number of polluted consumer nodes should be calculated for each scenario individually by the optimizer. Therefore regret for a scenario is defined as the difference between the number of polluted consumer nodes counted for applying a solution x ∈ X and the number of polluted consumer nodes according to the optimal solution for that scenario. The MMR model may now be formulated as:

\[ r = \min_{x \in X} \Big\{ \max_{s \in S} R(x, s) \Big\} \tag{2} \]

The aim of Equation (2) is finding a solution by ACO to minimize the maximum regret over all the possible scenarios. The approach of MTR is very similar to MMR while minimization of total regret is considered as the objective function. The structure of the model is as follows:

\[ r = \min_{x \in X} \sum_{s \in S} R(x, s) \tag{3} \]

In order to illustrate the previous definitions, the problem of sensor placement, as contaminant warning systems for a water distribution system, is considered. Suppose that three different layouts (installations) of water quality monitoring stations have been proposed and intrusion could occur in one of the nodes with labels i, j and k (scenarios). Table 1 shows the time of contaminant detection in minutes for each of the three scenarios.

Table 1 | Time of contaminant detection for different layouts of monitoring stations (minute)

Layout                   Intrusion at node i   Intrusion at node j   Intrusion at node k   Worst time of detection
Layout 1                 385                   20                    40                    385
Layout 2                 305                   320                   295                   320
Layout 3                 240                   280                   330                   330
Best time of detection   240                   20                    40

The crude choice to minimize the longest detection time would be selection of layout 2, ensuring the time of detection does not exceed 320 min. However, based on Table 2, if intrusion at node j occurred, the regret associated with this choice would be 300 min (the difference between 320 and 20 min), which is too large and could have been avoided if the exact scenario had been known. In addition, the total regret of this choice is 620 min (the summation of 65, 300 and 255 min), which is also too large. Therefore, in this example, according to the maximum and the total regret the best choice would be to select layout 1,


A. Afshar & E. Najafi | Consequence management under inexact scenarios | Journal of Hydroinformatics | 16.1 | 2014

ensuring maximum and total regret of no worse than 145 min.

Table 2 | Maximum and total regret of each scenario for each layout of monitoring stations (minute)

| Layout | Intrusion at node i | Intrusion at node j | Intrusion at node k | Maximum regret | Total regret |
| --- | --- | --- | --- | --- | --- |
| Layout 1 | 145 | 0 | 0 | 145 | 145 |
| Layout 2 | 65 | 300 | 255 | 300 | 620 |
| Layout 3 | 0 | 260 | 290 | 290 | 550 |

In this study, the fitness value is calculated as the total number of polluted consumer nodes from the beginning of consequence management until the end of the simulation counted over all discrete time intervals:

$$F = \sum_{i=1}^{n} \sum_{t=t_{CM}}^{EPS} N(i, t) \qquad (4)$$

Please note that the number of polluted nodes may vary from one computational time step to another. In other words, due to dynamics of the system, a given node may be recognized as a polluted node in one time step and unpolluted in the next one. Summation of the total polluted nodes in the entire computational time steps will often result in a fitness value exceeding the total number of network nodes. In Equation (4), i is the node index, n is the total number of consumer nodes, tCM represents the time of beginning the consequence management and EPS (Extended Period Simulation) is the overall simulation duration. The value of N depends on the existence of certain pollutant concentrations in the nodes. A node is considered polluted when its concentration exceeds the threshold value ct. Depending on the nature of the contaminant and its impact on public health, different residual concentrations may be set for the consequence management strategy. Without loss of generality, a lower threshold of 0.01 mg/l has been used here to observe and assess the consequences if the management period extends for a longer time. If the pollution concentration in node i at time t is more than 0.01 mg/l, it is assumed polluted and hence N(i, t) denoted as 1, otherwise it is assigned 0. The polluted consumer nodes are added together every 15 min after beginning the consequence management. Although shorter and/or longer time steps could be used, it is a more rational time step for consequence management both from computational and plan implementational points of view. When minimizing F, indirectly two key issues are addressed: first, reducing the pollution extent (contaminated area) in the network and second, reducing the time of exposure of concentrations above the threshold (Alfonso et al. ). In this article regardless of the differences in total nodal demands, it is assumed that the density of population over the nodes is equally distributed; therefore, all nodes in terms of impact are similarly significant. Note that this is crude because EPANET example 3 consists of nodes with different demands. In addition, in reality some nodes of a water distribution network such as supply nodes for hospitals and schools are more important than others if they become polluted. So it would be more precise if nodes were weighted according to their demands and importance which needs to be considered in future studies.

ACO ALGORITHM; GENERAL ASPECTS

ACO algorithms, using principles of communicative behavior occurring in real ant colonies, have successfully been applied to solve various combinatorial optimization problems (Abbasi et al. ).

In general, the kth ant at iteration t moves from state i to state j with probability (Dorigo et al. ):

$$P_{ij}(k, t) = \begin{cases} \dfrac{[\tau_{ij}(t)]^{\alpha} [\eta_{ij}]^{\beta}}{\sum_{j=1}^{J} [\tau_{ij}(t)]^{\alpha} [\eta_{ij}]^{\beta}} & \text{if } j \in N_k(i) \\ 0 & \text{otherwise} \end{cases} \qquad (5)$$

where τij(t) is the total pheromone deposited on path ij at iteration t, ηij is the heuristic value representing the desirability of state transition ij, Nk(i) is the possible neighborhood of ant k when located at decision point i, and α and β are two parameters to control the relative importance of the pheromone trail against the heuristic value.

Let q be a random variable uniformly distributed over [0, 1] and q0 ∈ [0, 1] be a tunable parameter. The next


option, j, that ant k chooses is (Dorigo & Gambardella ):

$$j = \begin{cases} \arg\max_{l \in N_k(i)} \left\{ [\tau_{il}(t)]^{\alpha} [\eta_{il}(t)]^{\beta} \right\} & \text{if } q \le q_0 \\ J & \text{otherwise} \end{cases} \qquad (6)$$

where J is a random variable value selected based on the probability distribution of Pij(k, t) (Equation (5)). Once all ants have built a tour, the pheromone trail intensity will be updated. This is done according to the following equations:

$$\tau_{ij}(t + 1) = (1 - \rho)\,\tau_{ij}(t) + \Delta\tau_{ij}(t) \qquad (7)$$

where τij(t + 1) is the amount of pheromone deposited for a state transition ij at iteration t + 1, 0 ≤ ρ ≤ 1 is the pheromone evaporation coefficient and Δτij(t) is the amount of pheromone deposited on path ij at iteration t:

$$\Delta\tau_{ij}(t) = \begin{cases} Q / G^{k_{gb}} & \text{if } (i, j) \in \text{tour done by ant } k_{gb} \\ 0 & \text{otherwise} \end{cases} \qquad (8)$$

where Q is a constant and G^{k_gb} is the objective function value for ant k_gb, which is the ant with the best performance within the past total iterations.

MODEL SETUP

The water system utilized is the EPANET example 3 network (Rossman ). It comprises two constant head sources, a lake and a river, three elevated storage tanks, two pumping stations, 117 pipes, 59 consumer nodes and 35 internal nodes (Figure 1). In EPANET, a hydraulic and constituent time step of 15 min was used for a 24 hour simulation period.

Figure 1 | Schematic of EPANET’s example 3.

Table 3 | Valve and hydrant locations employed in consequence management

| Element | Locations |
| --- | --- |
| Valve locations (links number) | 111, 175, 105, 116, 177, 215, 204, 237, 269, 173, 123, 107, 229, 311, 155, 309, 221, 231, 317, 301 |
| Hydrant locations (nodes number) | 40, 50, 60, 601, 61, 120, 129, 164, 169, 173, 179, 181, 183, 184, 187, 195, 204, 206, 208, 241, 249, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275 |

Today accounting for uncertainty without accounting for the certain errors coming from wrong water distribution network modeling is unacceptable. Thus, here, realizing the deficiency of the EPANET software in handling the pressure driven condition, an extension of EPANET was prepared to directly include the pressure driven issue in the modeling approach. In pressure driven analysis, the relationship between pressure and demand is incorporated. In this type of analysis, functions assume fixed demand above a given critical pressure, zero demand below a given minimum pressure and some relationship between pressure and demand for intermediate pressures (Cheung et al. ). In this study, minimum and desired pressure limits are assumed to be 0 and 25 m, respectively. Flow of an open hydrant was modeled as an emitter by $Q = K\sqrt{P}$, where P is the pressure drop across the emitter and K is the emitter coefficient. K for all of the simulations was considered 1 l/s/m^0.5. In this research, the hydrants and valves that are selected for consequence management are similar to those that Preis & Ostfeld () utilized in their research (Table 3). The total number of decision variables is 51, covering the modes of operation for 20 valves and 31 hydrants. The decision variables are coded as binary numbers (0, 1) which determine whether the valves and hydrants are open or closed. Initially all valves are assumed ‘open’ and hydrants are ‘closed’. The mode of operation for open valves and closed hydrants is identified


by 0, whereas the decision variable for closed valves and open hydrants is represented by 1 in the proposed binary coding. In any trial solution, decision variables are free to take either 0 or 1 to redefine the operational mode of the valves and/or hydrants. Each trial solution will have its own consequences with its own regret, if implemented. It should be noted that in EPANET example 3 there are no valves, but each pipe can be closed or opened at any time and this option was used to overcome this issue. Contaminant injection takes place with a mass rate of 0.006 kg/s at 09:00 for a duration of 7 hours.

In this study we assume that: (1) the water system is equipped with some sensors for detection of contaminant presence; (2) nodes 103, 111, 125, 113 and 259 are probable nodes of intrusion (Table 4); and (3) the necessary time for detection of contaminant presence in the network by monitoring stations and delay in the response time (including: (1) contamination source identification; (2) isolation and containment by valve closures; (3) flushing by hydrant opening; and (4) public notification (Preis & Ostfeld )) is 4 hours; thus consequence management will begin at 13:00. In addition, the constraint ‘technical operational capacity to implement response’ is considered. Based on this constraint, the total number of operational response activities is restricted to 20. In some real cases, however, the number of operational responses may far exceed this number if a large number of valves and hydrants are selected by the optimizer to be re-operated.

Table 4 | Probable nodes of intrusion (scenarios)

| | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 |
| --- | --- | --- | --- | --- | --- |
| Node number | 103 | 111 | 125 | 113 | 259 |

In order to apply ACO algorithms to a specific problem, the problem should be represented on a graph or a similar structure easily covered by ants (Afshar et al. ). Solutions that are produced by ants are combinations of valves and hydrants which are closed and opened by operators at 13:00 simultaneously and remain unchanged until 24:00. In Figure 2, each column represents a valve or hydrant utilized in consequence management. As an example, if an ant selects number 1, it means that the status of that valve or hydrant will be changed in the proposed management alternative. Otherwise, the situation of the valve or hydrant will remain unchanged.

Figure 2 | Decision graph of ACO algorithm for consequence management strategies.

As mentioned above, in order to obtain optimal solutions for MMR and MTR models, the ACO algorithm should be solved for each scenario individually. The number of function evaluations for all of the ACO algorithms is 90,000 and the metaheuristic parameters are: β = 0, α = 1, ρ = 0.05 and q0 = 0.4. To control the amount of pheromone deposited for a state transition ij, the proper value of Q must be selected by sensitivity analysis. The values of Q for each sub-problem consisting of different inexact scenarios and MMR and MTR are selected through sensitivity analysis as displayed in Table 5.

Table 5 | Q values for ACO algorithms in order to find optimum solutions

| | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 | MMR Model | MTR Model |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Q | 40 | 50 | 115 | 60 | 45 | 70 | 180 |

RESULTS AND DISCUSSION

The number of polluted consumer nodes without performing consequence management for each scenario is shown in Table 6. As presented in Table 6, the total number of polluted consumer nodes for the identified scenarios ranges from 753 to 1,307 for scenario numbers 4 and 3, respectively. Numbers of operational activities along with identification number of valves and hydrants to be re-operated under optimal solutions for different scenarios are

presented in Table 7. As presented, for scenario number 1, a total of 18 operational activities are identified with nine valves and nine hydrants. The number of polluted consumer nodes based on the occurrence of each scenario while employing the optimal solutions is shown in Table 8. In this table, dark cells are optimal solutions under given consequence management scenarios. Compared to the number of polluted consumer nodes with no consequence management, a significant reduction in the number of polluted consumer nodes may be achieved for scenario numbers 1 to 5. Specifically speaking, assuming that the polluted node is fully identified, the total number of polluted consumer nodes may be reduced by 82, 80, 69, 79 and 84% for scenario numbers 1 to 5, respectively. As an example, for the first scenario, implementation of the consequence management may reduce the total number of polluted nodes from 914 to 164 (Table 8). Please note that even if the right scenario is not correctly identified, the management strategy will still reduce the number of polluted consumer nodes by 42% (from 914 to 533) or more. The second column of Table 8 illustrates that if the decision maker implements the optimum solution associated with scenario number 1, the total number of polluted consumer nodes will reach 164, whereas this number may increase to 699 nodes if the third scenario prevails. The results obtained from the optimization scheme demonstrate that the use of the number of polluted nodes in consequence management helps in reducing both exposure time and consumed pollutant concentrations. Given a scenario, results also illustrate the usefulness of ACO to provide optimal solutions in order to minimize the number of polluted nodes in water distribution network.

Table 6 | Number of polluted consumer nodes without performing response actions

| | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 |
| --- | --- | --- | --- | --- | --- |
| Polluted consumer nodes | 914 | 906 | 1,307 | 753 | 1,153 |

Table 7 | Optimal solutions for each scenario

| | Number of operational activities | Valve numbers | Hydrant numbers |
| --- | --- | --- | --- |
| Scenario 1 | 18 | 107, 111, 155, 204, 221, 231, 269, 309, 311 | 601, 120, 179, 183, 184, 187, 195, 204, 267 |
| Scenario 2 | 18 | 107, 111, 116, 204, 215, 221, 231, 269, 309 | 179, 183, 184, 187, 195, 204, 263, 267, 269 |
| Scenario 3 | 20 | 155, 175, 237 | 40, 601, 61, 120, 129, 164, 169, 173, 179, 181, 183, 195, 204, 265, 269, 271, 273 |
| Scenario 4 | 18 | 116, 204, 231, 269, 309, 311 | 60, 129, 179, 183, 184, 187, 195, 204, 257, 261, 267, 269 |
| Scenario 5 | 20 | 107, 111, 123, 221, 269, 301, 309 | 164, 169, 173, 179, 181, 183, 257, 259, 261, 263, 265, 269, 273 |

Amounts of regret under different scenarios are shown in Table 9. Suppose that the decision maker implements the optimal solution for scenario number 3, for which 402 nodes are expected to be polluted. Let’s assume that, in reality, scenario number 4 occurs. In this case, as a result of incorrect scenario identification, the decision maker must pay for a regret of 238 extra polluted nodes (399 − 161). This issue is reflected in the fifth column and fourth row of Table 9. According to Table 9, if the decision maker employs optimal solutions, the maximum regret would range from 298 to 996 and total regret from 676 to 1,835 for different scenarios.

MMR approach

Minimizing the total or maximum regret may eventually lead to a more robust solution for cases of inexact scenarios. The objective of the MMR approach is to address a decision which minimizes the maximum deviation between the alternative taken and the optimum one for each scenario over all possible (or identified) scenarios. In fact this model intends to make a decision with the best possible performance in the worst case. In order to minimize the maximum regrets under different scenarios, the MMR


approach is applied. The results are shown in Tables 10 and 11. By minimizing maximum regret, it is intended to re-operate the valves and hydrants in such a way that, for all possible scenarios, the maximum deviation of the number of polluted consumer nodes from the optimum value is minimized. In this case, the maximum amount of regret under different scenarios will be reduced. Table 10 provides the valves and hydrants which minimize the maximum regret over all identified scenarios. For valves and hydrants proposed in Table 10, the values of regret for different scenarios are presented in Table 11. As presented, for the proposed valves and hydrants re-operation, the maximum regret will result when scenario number 3 occurs. By comparing the maximum regret associated with single scenario-based optimum number of polluted consumer nodes, the solution to the MMR model has decreased the maximum regret. To appreciate the drop in maximum regret, one may compare the regret values in Table 11 with those in column 7 of Table 9. Table 11 implies that no matter which scenario is going to happen, the decision maker’s regret will not exceed 157 polluted consumer nodes. By contrast, the maximum regret for the single scenario-based solution ranges from 298 to 996 for different scenarios (Table 9).

Table 8 | Number of polluted consumer nodes based on occurrence of each scenario and employing optimal solutions

| | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 |
| --- | --- | --- | --- | --- | --- |
| x*(sc = 1) | 164 | 226 | 699 | 195 | 477 |
| x*(sc = 2) | 183 | 179 | 1,396 | 204 | 731 |
| x*(sc = 3) | 533 | 475 | 402 | 399 | 671 |
| x*(sc = 4) | 346 | 336 | 1,398 | 161 | 679 |
| x*(sc = 5) | 382 | 552 | 589 | 444 | 179 |

Table 9 | Regret under different scenarios

| Assumed scenario \ Reality | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 | Maximum regret | Total regret |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Scenario 1 | 0 | 47 | 297 | 34 | 298 | 298 | 676 |
| Scenario 2 | 19 | 0 | 994 | 43 | 552 | 994 | 1,608 |
| Scenario 3 | 369 | 296 | 0 | 238 | 492 | 492 | 1,395 |
| Scenario 4 | 182 | 157 | 996 | 0 | 500 | 996 | 1,835 |
| Scenario 5 | 218 | 373 | 187 | 283 | 0 | 373 | 1,061 |

Table 10 | Optimal solution obtained from MMR approach

| Number of operational activities | Valve numbers | Hydrant numbers |
| --- | --- | --- |
| 19 | 105, 111, 116, 155, 204, 215, 229, 269, 301, 309, 311, 317 | 40, 173, 195, 204, 259, 267, 271 |

MTR approach

To test the performance of the MTR model in handling injection location uncertainty in consequence management, the same case example and the same five scenarios are used. The final results of the model application are provided in Tables 12 and 13. Similarly to Table 10, Table 12 provides the optimum solution which minimizes the total regret over all possible scenarios. For valves and hydrants proposed in Table 12, the total values of regrets for different scenarios are presented in Table 13. As presented, for the proposed valves and hydrants re-operation, maximum regret will result when scenario number 3 occurs. Table 13 shows that the decision maker should implement valves and hydrants provided in Table 12, for which the total regret for the fifth scenario will be 18. In this case if other scenarios take place, the decision maker will feel regrets ranging from 82 to 183 polluted consumer nodes. For the proposed solution the total regret over all identified scenarios will be equal to 576 polluted consumer nodes (Table 13). Compared to the MMR model, the total regret resulting from the MTR model has been reduced (i.e. from 639 in Table 11 to 576 in Table 13). In this model, compared to the MMR model, total regret is reduced by almost 10 percent and maximum regret is increased by 14 percent. As a result, it seems that MMR and MTR models nearly eventuate


in the same results and both are suitable in consequence management.

Table 11 | Number of polluted consumer nodes and amounts of regret with optimal solutions under different scenarios obtained from MMR approach

| | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Number of polluted consumer nodes | 294 | 319 | 559 | 272 | 280 | |
| Regret | 130 | 140 | 157 | 111 | 101 | 639 |

Table 12 | Optimal solution obtained from MTR approach

| Number of operational activities | Valve numbers | Hydrant numbers |
| --- | --- | --- |
| 16 | 107, 111, 116, 123, 215, 221, 237, 269, 309, 317 | 129, 173, 195, 267, 269, 271 |

CONCLUSIONS

Efficient consequence management in an intentionally contaminated water distribution network is only possible if the source of the contamination is known. Complete knowledge on the contaminant source location will lead to great reduction in the polluted consumer nodes and greatly influence the effectiveness of the consequence management. However, inverse solutions for source identification may identify multiple inexact sources of contamination and polluted nodes, each one demanding different optimal consequence management and operational strategy. This study proposed and tested a systematic approach based on MMR and MTR in connection with the well-established ACO approach to develop a set of robust consequence management strategies with known impacts on alternating strategies. It was illustrated that the approach is mathematically sound, computationally feasible and the proposed method can be used to analyze the regrets associated with any employed consequence management strategy. In addition it was shown that knowing the exact location of contaminant intrusion will strongly influence the effectiveness of the response activities.

Contamination detection, source identification, and consequence management strategy implementation are all time consuming and may demand relatively considerable time. To minimize the time lag and overcome the computational shortcomings, it is recommended to set up and calibrate the models for detection, source identification, and consequence management in advance and have them in ‘ready to be used’ condition. It was shown that both MTR and MMR models are suitable for design and analysis of consequence management strategies. Realizing the discrete nature of the decision space in the consequence management problem, the ACO algorithm performed quite satisfactorily and is recommended for similar studies. Although not providing the water network governor and authorities with a single solution, results of the proposed models may provide the decision maker with a reasonable level of awareness and impacts of the alternating decisions considering the uncertainties involved in exact identification of the contaminated node. In addition, the proposed methodology disregards other uncertainties, such as type of injected contaminant and injection time, which need to be investigated in future works. The testing of more networks with different topological structures is recommended for improving the confidence in the proposed approach.

Table 13 | Number of polluted consumer nodes and amounts of regret with optimal solutions under different scenarios obtained from MTR approach

| | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Number of polluted consumer nodes | 246 | 350 | 585 | 283 | 197 | |
| Regret | 82 | 171 | 183 | 122 | 18 | 576 |
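The regret values reported in Tables 9, 11 and 13 follow from the polluted-node counts by subtracting, for each scenario, the count obtained when that scenario's own optimal solution is applied (the diagonal of Table 8). The sketch below is only an arithmetic check under that reading of the tables; the names `POLLUTED`, `OPTIMUM` and `regret` are illustrative assumptions and not part of the authors' model.

```python
# Polluted consumer nodes from Table 8.
# Row a: the optimal solution x*(sc = a) is implemented; column r: scenario r occurs.
POLLUTED = [
    [164, 226, 699, 195, 477],
    [183, 179, 1396, 204, 731],
    [533, 475, 402, 399, 671],
    [346, 336, 1398, 161, 679],
    [382, 552, 589, 444, 179],
]
# Polluted consumer nodes of the MMR and MTR solutions (Tables 11 and 13).
MMR_POLLUTED = [294, 319, 559, 272, 280]
MTR_POLLUTED = [246, 350, 585, 283, 197]

# Best achievable count per scenario: the diagonal of Table 8.
OPTIMUM = [POLLUTED[s][s] for s in range(len(POLLUTED))]


def regret(polluted_row):
    """Regret per scenario: polluted nodes minus the scenario optimum."""
    return [p - best for p, best in zip(polluted_row, OPTIMUM)]


# Regret of each single-scenario optimum (the body of Table 9).
table9 = [regret(row) for row in POLLUTED]
table9_max = [max(r) for r in table9]
table9_total = [sum(r) for r in table9]

# Regret of the robust solutions (Tables 11 and 13).
mmr_regret = regret(MMR_POLLUTED)
mtr_regret = regret(MTR_POLLUTED)

print("single-scenario max regret:  ", table9_max)
print("single-scenario total regret:", table9_total)
print("MMR regret:", mmr_regret, "max", max(mmr_regret), "total", sum(mmr_regret))
print("MTR regret:", mtr_regret, "max", max(mtr_regret), "total", sum(mtr_regret))
```

With these inputs the MMR solution has the smaller maximum regret (157 against 183) and the MTR solution the smaller total regret (576 against 639), which is the trade-off described in the MTR approach section.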


REFERENCES

Abbasi, A., Afshar, A. & Jalali, M. R. Ant-colony-based simulation–optimization modeling for the design of a forced water pipeline system considering the effects of dynamic pressures. J. Hydroinf. 12 (2), 212–224.
Afshar, A. & Amiri, H. A min-max regret approach to unbalanced bidding in construction. KSCE J. Civil Eng. 14 (5), 653–661.
Afshar, A., Sharifi, F. & Jalali, M. R. Non-dominated archiving multi-colony ant algorithm for multi-objective optimization: application to multi-purpose reservoir operation. Eng. Optim. 41 (4), 313–325.
Aissi, H., Bazgan, C. & Venderpooten, D. Min-max and minmax regret versions of some combinatorial optimization problems: a survey. Eur. J. Oper. Res. 197 (2), 427–438.
Alfonso, L., Jonoski, A. & Solomatine, D. Multiobjective optimization of operational responses for contaminant flushing in water distribution networks. J. Water Resour. Plan. Manage. 136 (1), 48–58.
Averbakh, I. Minmax regret linear resource allocation problems. Oper. Res. Lett. 32 (2), 174–180.
Baranowski, T. M. & LeBoeuf, E. J. Consequence management optimization for contaminant detection and isolation. J. Water Resour. Plan. Manage. 132 (4), 274–282.
Baranowski, T. M. & LeBoeuf, E. J. Consequence management utilizing optimization. J. Water Resour. Plan. Manage. 134 (4), 386–394.
Baranowski, T., Janke, R., Murray, R., Bahl, S., Sanford, L., Steglitz, B. & Skadsen, J. Case study analysis to identify and evaluate potential response initiatives in a drinking water distribution system following a contamination event. Borchardt Conf., Univ. of Mich., Ann Arbor, Mich.
Berry, J., Hart, W. E., Phillips, C. A., Uber, J. G. & Watson, J. P. Sensor placement in municipal water networks with temporal integer programming models. J. Water Resour. Plan. Manage. 132 (4), 218–224.
Chang, N. & Davila, E. Minimax regret optimization analysis for a regional solid waste management system. Waste Manage. 27 (6), 830–832.
Cheung, P., Van Zyl, J. & Reis, L. Extension of EPANET for pressure driven demand modeling in water distribution system. In: Proceeding of the 8th International Conference in Computing and Control in Water Industry. Water Management for the 21st Century, Center for Water Systems, University of Exeter, UK, 1 (2), 311–316.
De Sanctis, A., Shang, F. & Uber, J. Determining possible contaminant sources through flow path analysis. In: Proceedings of the 8th Water Distribution System Analysis Symposium, Cincinnati, OH.
Dorigo, M. & Gambardella, L. M. Ant colony system: a cooperative learning approach to the traveling salesman problem. IEEE Trans. Evol. Comput. 1 (1), 53–66.
Dorigo, M., Maniezzo, V. & Colorni, A. The ant system: optimization by a colony of cooperating ants. IEEE Trans. Syst. Man. Cybern. 26, 29–42.
Haxton, T. & Uber, J. G. Flushing under source uncertainties. In: Proceedings of 12th Annual Water Distribution Systems Analysis (WDSA) Conference, American Society of Civil Engineers (ASCE), AZ, Tucson, pp. 604–612.
Laird, C. D., Biegler, L. T. & Waanders, B. Mixed-integer approach for obtaining unique solutions in source inversion of water networks. J. Water Resour. Plan. Manage. 132 (4), 242–251.
Loulou, R. & Kanudia, A. Minimax regret strategies for greenhouse gas abatement: methodology and application. Oper. Res. Lett. 25 (5), 219–230.
Mausser, M. E. & Laguna, M. A heuristic to minimax absolute regret for linear programs with interval objective function coefficients. Eur. J. Oper. Res. 117 (1), 157–174.
Ostfeld, A. & Salomons, E. Optimal layout of early warning detection stations for water distribution systems security. J. Water Resour. Plan. Manage. 130 (5), 377–385.
Poulin, A., Mailhot, A., Grondin, P., Delorme, L., Periche, N. & Villeneuve, J. P. A heuristic approach for operational response to drinking water contamination. J. Water Resour. Plan. Manage. 134 (5), 457–465.
Poulin, A., Mailhot, A., Periche, N., Delorme, L. & Villeneuve, J.-P. Planning unidirectional flushing operations as a response to drinking water distribution system contamination. J. Water Resour. Plan. Manage. 136 (6), 647–657.
Preis, A. & Ostfeld, A. Contamination source identification in water systems: a hybrid model trees linear programming scheme. J. Water Resour. Plan. Manage. 132 (4), 263–273.
Preis, A. & Ostfeld, A. Multiobjective contaminant response modelling for water distributions systems security. J. Hydroinf. 10 (4), 267–274.
Propato, M. Contamination warning in water networks: general mixed-integer linear models for sensor location design. J. Water Resour. Plan. Manage. 132 (4), 225–233.
Rossman, L. A. EPANET 2.0: User’s Manual. National Risk Management Research Laboratory, US EPA, Cincinnati.
US EPA Response Protocol Toolbox: Planning for and Responding to Drinking Water Contamination Threats and Incidents – Overview and Application. US Environmental Protection Agency, Washington, DC.
US EPA a Response Protocol Toolbox: Planning for and Responding to Drinking Water Contamination Threats and Incidents – Module 5: Public Health Response Guide. US Environmental Protection Agency, Washington, DC.
US EPA b Response Protocol Toolbox: Planning for and Responding to Drinking Water Contamination Threats and Incidents – Module 6: Remediation and Recovery Guide. US Environmental Protection Agency, Washington, DC.

First received 21 July 2012; accepted in revised form 24 June 2013. Available online 25 July 2013


© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

Free-surface flow simulations for discharge-based operation of hydraulic structure gates C. D. Erdbrink, V. V. Krzhizhanovskaya and P. M. A. Sloot

ABSTRACT We combine non-hydrostatic flow simulations of the free surface with a discharge model based on elementary gate flow equations for decision support in the operation of hydraulic structure gates. A water level-based gate control used in most of today’s general practice does not take into account the fact that gate operation scenarios producing similar total discharged volumes and similar water levels may have different local flow characteristics. Accurate and timely prediction of local flow conditions around hydraulic gates is important for several aspects of structure management: ecology, scour, flow-induced gate vibrations and waterway navigation. The modelling approach is described and tested for a multi-gate sluice structure regulating discharge from a river to the sea. The number of opened gates is varied and the discharge is stabilized with automated control by varying gate openings. The free-surface model was validated for discharge showing a correlation coefficient of 0.994 compared to experimental data. Additionally, we show the analysis of computational fluid dynamics (CFD) results for evaluating bed stability and gate vibrations.

Key words | computational fluid dynamics, discharge sluice, free-surface flow, gate operation, hydraulic gates, hydraulic structures

C. D. Erdbrink (corresponding author) V. V. Krzhizhanovskaya P. M. A. Sloot University of Amsterdam, Amsterdam, The Netherlands and National Research University of Information Technologies, Mechanics and Optics, Saint Petersburg, Russia E-mail: chriserdbrink@gmail.com; christiaan.erdbrink@deltares.nl
C. D. Erdbrink Deltares, Delft, The Netherlands

doi: 10.2166/hydro.2013.215

NOTATION

A: amplitude of gate vibration (m)
Alake: surface area of lake (m2)
a: gate opening (m)
Cc: contraction coefficient for flow past sharp-edged underflow gate (–)
Cc,in: contraction coefficient for flow entering upstream section between piers (–)
CD: discharge coefficient for submerged flow (–)
CD*: average value of CD over one discharge event, computed by discharge model (–)
CE: discharge coefficient as used by Nago () for experimental data (–)
e: error value of PID discharge controller (m3/s)
fgate: frequency of gate vibration (Hz)
Fr: Froude number (–)
g: gravitational constant (m/s2)
h: water depth (m)
h0: upstream water depth before reaching pier (m)
h1: upstream water depth between piers, upstream of gate (m)
h2: water depth in control section (m)
h3: water depth downstream of gate, between piers, behind recirculation zone (m)
h4: downstream water depth beyond pier (m)
htarget: target lake level to be reached at the end of discharge period (m)
k: turbulent kinetic energy (m2/s2)
KP, KI, KD: gain parameters of PID discharge controller (–)
m: number of gates opened fully or partially (–)
n: total number of gates of the structure (–)
$\vec{n}$: normal vector (–)
p: pressure (Pa)
Q: discharge (m3/s)


C. D. Erdbrink et al. | Free-surface flow simulations for discharge-based operation of hydraulic gates | Journal of Hydroinformatics | 16.1 | 2014

QDM: discharge through one gate, calculated by discharge model (m3/s)
Qgate: discharge through one gate, calculated by the system model (m3/s)
QMF: modular flow discharge based on gate underflow contraction criterion (m3/s)
q: discharge per unit width (m2/s)
t: time variable (s)
U: magnitude of flow velocity vector; in 2DV model defined as $U = \sqrt{u_1^2 + u_3^2}$ (m/s)
Uvc: magnitude of flow velocity vector in vena contracta (m/s)
$\vec{u} = (u_1, u_3)$: flow velocity vector in 2DV model; u1 is horizontal velocity, u3 is vertical velocity (m/s)
Vr: reduced velocity parameter of flow-induced vibrations (–)
Vtot: total volume passing the structure in a given amount of time (m3)
Vtot,req: required total volume to pass the structure in a given amount of time in order to reach htarget (m3)
w: width between piers (m)
α: calibration parameter for turbulent flow in bed stability parameter (–)
β: relaxation factor in formula for CD in system model (–)
ε: turbulent dissipation (m2/s3)
ξin: entrance loss coefficient (–)
ξout: exit loss coefficient (–)
Ψ: stability parameter for beginning of motion of granular bed material (–)
$\nabla u_i$: gradient operator, defined as $\nabla u_i = \left( \partial u_i / \partial x,\ \partial u_i / \partial z \right)$
$\nabla \cdot \vec{u}$: divergence operator, defined as $\nabla \cdot \vec{u} = \partial u_1 / \partial x + \partial u_3 / \partial z$

INTRODUCTION

This paper gives an outline of how near-field free-surface flow simulations can be used in the operation of gates of large hydraulic structures. Barrier operation is commonly based on water level predictions from system-scale far-field flow models. The procedures are aimed at fulfilling the main function of the structure: for a weir in a river this is to maintain the upstream water level; for a discharge sluice this is to transfer river water out to the sea while keeping a safe inland level. Present-day hydraulic structures have various secondary functions, such as providing favourable ecological conditions, for which usually no numerical aids are available in daily operation. A better prediction of the flow near structures would be beneficial to durable performance of all barrier tasks. Proper design studies pay attention to all functions of a structure and assess the impact of all relevant flow features. However, operational constraints change in time for natural reasons (e.g. sea-level rise) or political reasons (e.g. ‘Kierbesluit Haringvlietsluizen’) (see Rijkswaterstaat ). In addition, sometimes the design criteria that were originally applied cannot be retrieved, yielding uncertainty about safety levels and allowable limits of gate settings in the present.

There are several aspects in contemporary barrier management for which an informed view on discharge and flow around gates is essential. First, the prediction of bed material stability and scour, including local erosion (Hoffmans & Pilarczyk ; Azamathulla ), as well as large-scale morphological changes of surrounding bathymetry (Nam et al. ) greatly depend on the flow. Second, ecological issues such as fish migration, salt water intrusion and mobile fauna are also linked with local flow characteristics

number of combinations of r objects out of q q! objects (0 r q), defined as r!ðq r Þ!

(Martin et al. ). Third, other relevant aspects are the

floor function, defined as ∀r ∈ R, ⌊r⌋ ¼

(Naudascher & Rockwell ) and the impact of flow

max(n ∈ Z:n r)

around structures on nearby shipping traffic. Fourth, for

ceiling function, defined as ∀r ∈ R,

the structure itself, local flow prediction is useful for dealing

⌈r⌉ ¼ min(n ∈ Z:n r)

with abnormal conditions: downtime of gates during sched-

time-average of quantity s

uled maintenance or unexpected gate failure.

dynamic forces associated with flow-induced gate vibrations



The barriers and sluices built in the south-west of The Netherlands in the period 1960–2000 are good examples of structures where different functions are combined. Present management of the barriers at Haringvliet and Oosterschelde calls for smart use to allow for regulation of fresh and salt water flows and fish migration. At the same time, the aging process of these structures demands an increasing awareness of structural safety issues. The new storm-surge barrier of Saint Petersburg, Russia, is another example. This large dam houses two sector-gates and three sections of radial gates that protect the low-lying city centre and regulate the discharge from the river Neva. Operation of this complex structure must rely on state-of-the-art flow models.

The above considerations motivate quantification of flow around a hydraulic structure. The aim of this paper is to lay a foundation for numerical models to estimate gate discharges and to evaluate the impact of flow near hydraulic structures in a way that is fit for operational applications. The influence of waves is not investigated; the focus is on flow (current).

Traditionally, flow around hydraulic structures is studied experimentally in the design stage or as a fundamental research topic (Kolkman ; Roth & Hager ). Numerous numerical studies have looked into sluice gate flow (Khan et al. ; Kim ; Akoz et al. ), but no single accepted, validated modelling tool exists for assessing turbulent gate flow with suitable practical value. Estimating discharge over weirs or under gates is not trivial. New discharge equations are still being introduced, both from informatics viewpoints (Khorchani & Blanpain ) and from the traditional viewpoint of measurements (Habibzadeh et al. ).

System-scale models of inland water systems simulate the flow in river branches by solving the one-dimensional or quasi-two-dimensional Shallow Water equations (Deltares a, b). The fact that these hydrostatic models do not simulate the flow around hydraulic structures explicitly is not a severe limitation for most applications. The system effect of the operation of various gates on the water levels in adjacent water bodies (river branches) can thus be studied (for instance Becker & Schwanenberg ). For stability of granular bed material and salt water transport, however, the flow acceleration in the vertical dimension needs to be simulated. Moreover, the downside of primarily water level-centred validation and calibration in combination with parameterized structure representations (such as constant discharge coefficients) is that the prediction quality of discharges in system-scale models is often unclear. Warmink et al. (, ) investigated the uncertainty in calibration of water levels in river models resulting from the limited availability of discharge data. It was concluded that the necessary extrapolation of the calibration parameter (bed roughness of main channel) leads to significant uncertainty in simulated design water levels. More intensive measurement of discharges, for which most gated structures are ideal, and a physically more realistic representation of hydraulic structures in models are self-evident improvements that nevertheless require a culture shift.

The application of computational fluid dynamics (CFD) in the assessment of flow impact issues that arise long after the start of operation of a structure is rare. Bollaert et al. () employ numerical modelling to assess the influence of gate usage on the formation of plunge pool scour of a hydropower dam. For some issues, like salt water intrusion and sediment transport past a discharge-regulating structure, the solution cannot be found in a modelling tool at one scale. The local flow simulation should in those cases be coupled to a mid- or far-field model that covers a larger area.

A central role nowadays is played by the multi-disciplinary field of hydroinformatics (Solomatine & Ostfeld ; Krzhizhanovskaya et al. ; Melnikova et al. ; Pyayt et al. a; Pengel et al. ), in which different forms of modelling (physics-based and data-driven) are considered and combined with contemporary computational techniques like machine learning (Pyayt et al. b). In the context of the present study, it is noted beforehand that for a complex hydraulic structure, data-driven modelling alone is not an apt option, because a single Q-H-relation does not describe all states (Kolkman ), or is highly impractical as it would require extensive permanent monitoring.

This study takes the underlying physics as a starting point: elementary flow equations are combined with time-dependent CFD simulations in a two-dimensional vertical (2DV) model. The method bridges modelling scales with a minimum of data coupling and at the same time introduces the use of numerical aids into practical barrier operation for issues that at present are decided upon by expert judgement by the operator.

The remainder of this paper is organized as follows: first, we describe the overall approach, then the method is



described in three sections about discharge modelling, free-surface flow simulation and analysis of the modelling results. Next, the results of a series of validation runs for the free-surface model are discussed, followed by the results of a test case that gives numerical examples of all modelling steps. We end the paper with recommendations, conclusions and an outlook on future work.

APPROACH

For obtaining a timely prediction of the flow around gates, we propose a multi-step physics-based modelling strategy which uses data input from a system-scale model. The work-flow of the suggested gate operation system is shown in Figure 1.

The first step consists of the extraction of predicted water levels on both sides of the structure from a far-field (system-scale) model that contains the structure. Different possible gate settings (when to open, how many gates to use) are identified in the second step. All options need to be assessed in terms of discharge capacity; this happens in step 3. In the fourth step of Figure 1, for all gate configurations capable of discharging the required volume, the resulting flow is simulated using CFD. Subsequent analysis of the simulation results determines the impact of the flow for specific issues such as bed stability. The fifth and final step comprises the actual decision of gate operation actions.

The conventional sequence of steps taken by most operational systems follows the dashed line in Figure 1, skipping steps 3 and 4. The present study focuses on steps 2–4, which can be seen as an addition to computational decision support systems (steps 1 and 5) by Boukhanovsky & Ivanov () and Ivanov et al. ().

A multi-gated discharge sluice with underflow gates will be used to describe the modelling method. The central question addressed is how to find the set of gate configurations capable of delivering the required discharge that also meet the relevant constraints on flow properties.

Figure 1 | Scheme of evaluation steps leading to a decision on optimal gate operation. Steps 1–4 are treated in this paper. The dashed line shows the shorter decision sequence taken by barrier systems that do not take into account flow effects.

METHOD

Discharge computations

Configurations of multi-gated structure

Let us consider the gate configurations of a discharge structure consisting of n similar openings, each accommodating a movable gate, see Figure 2. In its idle state, all n gates close off the openings between the piers and the total discharge is zero. During a discharge event, m gates will be opened partially or completely, allowing a certain discharge through the structure. A ‘gate configuration’ is defined as the allocation of a number of gates (m ≤ n) that are opened with a gate opening a(t) while the other gates remain closed. All gates selected for opening will be operated similarly, i.e. with the same a(t).

Figure 2 | A multi-gated discharge sluice in plan view. In this example, gates 3, 4 and 5 are opened, the others are closed; so n = 7 and m = 3. The dotted line depicts plane of symmetry.

Before deciding which gates to open, first the possible combinations of opening gates are identified and counted. In general, flow instabilities are not favourable for maintaining an efficient and controllable discharge. As in other parts of physics, symmetry is a global measure for stability of free-surface flows. If asymmetry is allowed, m gates can be chosen freely from the total of n available slots. Then the number of possible combinations is obviously $\binom{n}{m}$, using the common notation for combinatorial choice of m objects out of n. For the condition of symmetry to hold, gates may only be opened in such a way that the pattern is symmetric about the vertical plane of symmetry in flow direction (see Figure 2). This implies that the number of options reduces to

$$\binom{\lfloor n/2 \rfloor}{\lfloor m/2 \rfloor}$$

for all 0 ≤ m ≤ n, where m cannot be chosen odd if n is even – in which case there are no options at all. For a structure with seven gates (n = 7), for instance, the total number of possible ways to open 1, 2, …, 7 gates is

$$\sum_{i=1}^{7} \binom{7}{i} = 2^7 - 1 = 127$$

if asymmetry is allowed and

$$\sum_{i=1}^{7} \binom{\lfloor 7/2 \rfloor}{\lfloor i/2 \rfloor} = 2^{\lceil 7/2 \rceil} - 1 = 15$$

if only symmetric configurations are permitted.
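The counting rule for gate configurations can be checked numerically. The following is a minimal sketch, not part of the original study: the function names are hypothetical, gates are assumed to be numbered 1 to n, and a pattern is taken to be symmetric when it equals its own mirror image about the centre of the structure. The brute-force enumeration serves only as a cross-check of the closed-form count.

```python
from itertools import combinations
from math import comb


def count_configurations(n: int, m: int, symmetric: bool = False) -> int:
    """Number of ways to open m out of n gates (hypothetical helper)."""
    if not 0 <= m <= n:
        raise ValueError("need 0 <= m <= n")
    if not symmetric:
        return comb(n, m)
    if n % 2 == 0 and m % 2 == 1:
        # With an even number of gates there is no centre gate,
        # so an odd number of open gates cannot be mirrored.
        return 0
    return comb(n // 2, m // 2)


def enumerate_symmetric(n: int, m: int) -> list:
    """Brute-force list of mirror-symmetric patterns, gates numbered 1..n."""
    patterns = []
    for gates in combinations(range(1, n + 1), m):
        mirrored = tuple(sorted(n + 1 - g for g in gates))
        if mirrored == gates:
            patterns.append(gates)
    return patterns


if __name__ == "__main__":
    n = 7
    total_free = sum(count_configurations(n, i) for i in range(1, n + 1))
    total_sym = sum(count_configurations(n, i, symmetric=True) for i in range(1, n + 1))
    print(total_free, total_sym)        # 127 and 15 for n = 7
    print(enumerate_symmetric(n, 3))    # the three symmetric ways to open 3 of 7 gates
```

For n = 7 and m = 3 the enumeration includes the configuration of Figure 2 (gates 3, 4 and 5 opened).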

This shows that the symmetry constraint greatly reduces the number of ways to open a given number of gates. Furthermore, an even number of gates has roughly half the number of possibilities, because opening any odd number of gates then results in asymmetric inflow. This could also hold for an odd-numbered gate structure which misses one (or any odd m < n) of the gates due to maintenance or operational failure.

System model and gate control

The basis is formed by a classic box model, see for example Stelling & Booij (). The focus is on submerged flow through a multi-gated outlet barrier that blocks seawater from entering the lake at high tide and discharges river water to sea at low tide, see Figure 3. This basic model serves in the present study as a surrogate system-scale model. The water levels it generates will be used as boundary conditions for the near-field modelling.

Figure 3 | Classic box model of outflow of a river to sea. An outlet barrier structure regulates the lake level while keeping salt seawater out.

Assuming barrier gates are closed (except when discharging under natural head from lake to sea) and assuming zero evaporation and precipitation, the system is described by:

$$Q_{river} - Q_{barrier} = A_{lake} \frac{dh_{lake}}{dt},$$

where $Q_{river}$ is discharge from a river, $Q_{barrier}$ is the total discharge through the gates of the barrier, $h_{lake}$ is the water level in the lake, $A_{lake}$ is the area of the lake assumed independent of $h_{lake}$. Submerged flow past an underflow gate is by definition affected by the downstream water level. The associated discharge depends on both water levels (sea and lake), the gate opening a and a discharge coefficient for submerged flow $C_D$. The discharge Q through a barrier gate at time t is written as:

$$Q_{gate}(t) = C_D\, a_i(t)\, w \sqrt{2g\left(h_{lake}(t) - h_{sea}(t)\right)},$$

where w is the flow width (see Figure 2) and the subscript ‘barrier’ is dropped from now on. Sea level $h_{sea}$ is approximated by a sine function. The total discharged volume that passes the barrier in the period during which $h_{lake} > h_{sea}$ is found after summing over all m gates and integrating with respect to time.
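The box model and the submerged gate discharge formula can be illustrated with a short time-stepping sketch. This is an illustration under stated assumptions rather than the model used in the study: explicit Euler integration is assumed, the tidal mean, amplitude and period are hypothetical placeholder values, and all function and parameter names are invented for this example.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def gate_discharge(c_d, a, w, h_lake, h_sea):
    """Submerged discharge through one gate (m^3/s).

    Returns zero when there is no head from lake to sea, i.e. the gate
    is treated as closed in that case.
    """
    head = h_lake - h_sea
    if head <= 0.0 or a <= 0.0:
        return 0.0
    return c_d * a * w * math.sqrt(2.0 * G * head)


def sea_level(t, mean=0.0, amplitude=1.0, period=44700.0):
    """Sine approximation of the sea level (all values are placeholders)."""
    return mean + amplitude * math.sin(2.0 * math.pi * t / period)


def simulate_lake(h0, q_river, a_lake, m, c_d, a, w, t_end, dt=60.0, sea=sea_level):
    """Integrate A_lake * dh/dt = Q_river - Q_barrier with explicit Euler steps.

    Returns the final lake level (m) and the total discharged volume (m^3),
    the latter obtained by summing over m identical gates and over time.
    """
    h = h0
    volume = 0.0
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        q_barrier = m * gate_discharge(c_d, a, w, h, sea(t))
        volume += q_barrier * dt
        h += dt * (q_river - q_barrier) / a_lake
    return h, volume


if __name__ == "__main__":
    h_final, v_tot = simulate_lake(
        h0=1.0, q_river=100.0, a_lake=5.0e7, m=3, c_d=0.6, a=1.5, w=20.0,
        t_end=2 * 44700.0,
    )
    print(f"final lake level: {h_final:.3f} m, discharged volume: {v_tot:.3e} m^3")
```

The discharged volume returned here is the discrete counterpart of the time integral mentioned in the text; a smaller time step or a higher-order integrator could be substituted without changing the structure of the calculation.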



Two gate opening scenarios will be considered. In both scenarios equal gate openings a(t) are applied to all m gates selected for opening. The first scenario uses a constant gate opening $a_{const}$ for the whole discharge period (from $t_{start}$ to $t_{end}$). The opening required to lower the lake level to a desired lake level $h_{target}$ is found by estimating the average required discharge $Q_{tot,req}$ to achieve this and by making estimates of the average discharge coefficient and water levels during the discharge period:

$$a_{const} = \frac{Q'_{tot,req}}{m\, C'_D\, w \sqrt{2g\left(\bar{h}'_{lake} - \bar{h}'_{sea}\right)}} \quad \text{with} \quad Q'_{tot,req} = \frac{A_{lake}\left(h_{lake}(t_{start}) - h_{target}\right)}{t'_{end} - t_{start}}$$

where bars are time-averages and primes indicate predictions of future values.

In the second scenario, the discharge is regulated by a proportional integral derivative (PID) controller (Brown ). The goal of this scenario is to have a more constant gate discharge by varying the gate openings in time, whilst still achieving the same $h_{target}$ as in the first scenario. The discrete PID formula for discharge at $t_i$ is:

$$Q(t_i) = K_P\, e(t_i) + K_I \sum_{j=1}^{i} e(t_j) + K_D \frac{e(t_i) - e(t_{i-1})}{\Delta t},$$

where $K_P$, $K_I$ and $K_D$ are the gain parameters and the error value is defined as $e(t_i) = Q_{set} - Q(t_{i-1})$. In the simulations, $K_P$ = 0.10, $K_I$ = 0.45 and $K_D$ = 0.55 are used. The setpoint $Q_{set}$ is constant and equal to $Q_{tot,req}$, except for linear setpoint ramping applied at the start of discharge to prevent undue fluctuations of gate position. At each time step, the required gate opening is derived from this discharge divided by $m\, C'_D\, w \sqrt{2g\left(h_{lake}(t) - h_{sea}(t)\right)}$. Figure 4 shows the flow chart of the system model. It includes computations of the two gate operation scenarios.

Figure 4 | System model: flow chart of gate control and water level computations.

Figure 4 shows that the total discharge computed by the system model $Q_{tot}$ is being used to calculate the new lake level. Additionally, it shows that at the start of each discharge event, i.e. when the gates are opened, the prediction of the discharge coefficient $C'_D$ is updated using data from the discharge model. For both situations, with and without PID control, this coefficient is found by a relaxation function with the mean discharge coefficient $C_D^*$ of the previous discharge event computed by the discharge model. For the nth discharge event, the update formula reads:

$$C'_D(n) = C'_D(n-1) + \beta \left( C_D^*(n-1) - C'_D(n-1) \right)$$

In all computations, a relaxation factor β = 0.75 is applied.

Discharge coefficients actually depend on numerous factors. Also, flows through neighbouring gates influence each other. To distinguish between different gate configurations with the same total flow-through area $m\, w\, a_{const}$, these two things need to be taken into account. This is done in the discharge model described in the next subsection, see also the bold block in Figure 4.
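The two gate operation scenarios and the relaxation update of the predicted discharge coefficient can be written compactly in code. The sketch below is illustrative only. It follows the formulas of this subsection as they appear in the text, so the integral term is a plain sum of errors without a separate time step factor. The class and function names are hypothetical, and starting the controller with a previous error of zero is an assumption made here, not something stated in the paper. Setpoint ramping is left out.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def constant_opening(a_lake, h_start, h_target, t_start, t_end,
                     m, c_d, w, h_lake_mean, h_sea_mean):
    """Scenario 1: constant opening from predicted mean levels.

    Returns (a_const, q_tot_req).
    """
    q_tot_req = a_lake * (h_start - h_target) / (t_end - t_start)
    denom = m * c_d * w * math.sqrt(2.0 * G * (h_lake_mean - h_sea_mean))
    return q_tot_req / denom, q_tot_req


class DischargePID:
    """Scenario 2: discrete PID law for the discharge (hypothetical class)."""

    def __init__(self, k_p=0.10, k_i=0.45, k_d=0.55, dt=1.0):
        self.k_p, self.k_i, self.k_d, self.dt = k_p, k_i, k_d, dt
        self.error_sum = 0.0
        self.prev_error = 0.0  # assumption: no error before the first step

    def step(self, q_set, q_prev):
        """Return Q(t_i) given the setpoint and the discharge at t_(i-1)."""
        error = q_set - q_prev
        self.error_sum += error
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.k_p * error + self.k_i * self.error_sum + self.k_d * derivative


def opening_from_discharge(q, m, c_d, w, h_lake, h_sea):
    """Gate opening that delivers discharge q at the current water levels."""
    return q / (m * c_d * w * math.sqrt(2.0 * G * (h_lake - h_sea)))


def relax_cd(cd_pred_prev, cd_star_prev, beta=0.75):
    """Relaxation update of the predicted discharge coefficient."""
    return cd_pred_prev + beta * (cd_star_prev - cd_pred_prev)


if __name__ == "__main__":
    a_const, q_req = constant_opening(5.0e7, 1.0, 0.8, 0.0, 6 * 3600.0,
                                      3, 0.6, 20.0, 0.9, 0.2)
    print(f"a_const = {a_const:.3f} m for Q_tot,req = {q_req:.1f} m^3/s")
    print(f"updated C_D prediction: {relax_cd(0.60, 0.68):.3f}")
```

With β = 0.75, the update moves the predicted coefficient three quarters of the way towards the mean value computed for the previous discharge event.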



Discharge model

Vertical lift gates with underflow are raised vertically between piers of a structure. The two main flow types that occur are free flow and submerged flow. When the gates are lifted higher than the water surface, there exists free or submerged Venturi flow (Boiten ). All flow types have different discharge characteristics and associated formulae.

For estimating the submerged flow discharge, the local water depths are schematized according to Figure 5 (after Kolkman ). Conservation of the energy head (Bernoulli equation) is applied in the accelerating parts and the momentum equations in the decelerating parts, yielding a system of four equations (see Appendix, available online at http://www.iwaponline.com/jh/016/215.pdf). Transitions $h_0$–$h_1$ and $h_3$–$h_4$ with loss coefficients $\xi_{in}$ and $\xi_{out}$ represent the effects of flow entering and leaving the narrow area between two piers. Transitions $h_1$–$h_2$–$h_3$ are the characteristic underflow gate zones, see Battjes () for details. Computations were carried out according to the flow chart shown in Figure 6 with the aim of giving better discharge estimates. The lake and sea levels computed in the system model served as boundary conditions – for variables $h_0$ and $h_4$ of this model, respectively.

Figure 5 | Definitions of local water depths $h_i$ for underflow gate in hydraulic structure, after Kolkman (1994). Above: top view of pier; below: cross-section free water surface around gate. Sketch not to scale.

A good geometric design of a discharge-regulator is such that no transition occurs from one flow type to another during regular usage. The model therefore checks if indeed submerged discharge occurs. As criterion for reaching the modular flow discharge $Q_{MF}$, the minimum flow depth in the control section $h_2$ is compared to the flow height in the point of maximum vertical contraction $C_c \cdot a$, the so-called ‘vena contracta’. Free and intermediate flow regimes are thus detected, but not calculated. Submerged Venturi flow is not considered either, since the idea is to actively control the flow.

All four non-linear equations are reshaped into third-order polynomials $f(h_i, h_{i+1}, Q) = 0$. Discharge Q is substituted for the velocity terms and remains as the only unknown in the system of equations. As prescribed for sub-critical flow conditions (Chow ), computational direction behind the gate is from downstream to upstream ($h_4$ to $h_2$). On the lake side, computations go in flow direction up to the control section ($h_0$ to $h_2$). The discharge coefficient $C_D$ is derived from the contraction coefficient $C_c$ for sharp-edged gates, fitted on experimental data cited in Kolkman () so that the full range of gate openings $a/h_1$ is covered; see the Appendix for equations (available online at http://www.iwaponline.com/jh/016/215.pdf).
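The structure of the discharge model, a search for the discharge at which the upstream and downstream computations agree on the depth in the control section, can be sketched generically. The four governing equations are given in the Appendix and are not reproduced here, so the sketch below takes them as user-supplied functions. The names `h2_forward` and `h2_backward`, the choice of bisection as search method, and the linear stand-in functions in the usage example are all assumptions made for illustration.

```python
def is_submerged(h2, c_c, a, tol=1e-9):
    """Submergence check: control-section depth exceeds the vena contracta height.

    In fully free flow h2 equals c_c * a, so values at or below that height
    indicate that the modular flow limit has been reached.
    """
    return h2 > c_c * a + tol


def match_discharge(h2_forward, h2_backward, q_low, q_high, tol=1e-8, max_iter=200):
    """Bisection on Q until both computational directions give the same h2.

    h2_forward(q):  depth at the control section computed from upstream (h0 to h2).
    h2_backward(q): depth at the control section computed from downstream (h4 to h2).
    Both callables are placeholders for the equations of the Appendix.
    """
    def mismatch(q):
        return h2_forward(q) - h2_backward(q)

    f_low, f_high = mismatch(q_low), mismatch(q_high)
    if f_low * f_high > 0.0:
        raise ValueError("the bracket [q_low, q_high] does not contain a solution")
    for _ in range(max_iter):
        q_mid = 0.5 * (q_low + q_high)
        f_mid = mismatch(q_mid)
        if abs(f_mid) < tol or (q_high - q_low) < tol:
            return q_mid
        if f_low * f_mid <= 0.0:
            q_high = q_mid
        else:
            q_low, f_low = q_mid, f_mid
    return 0.5 * (q_low + q_high)


if __name__ == "__main__":
    # Toy stand-ins only: linear relations with a known intersection at Q = 100.
    q = match_discharge(lambda q: 5.0 - 0.01 * q, lambda q: 3.0 + 0.01 * q, 0.0, 500.0)
    h2 = 5.0 - 0.01 * q
    print(f"Q = {q:.3f} m^3/s, h2 = {h2:.3f} m, submerged: {is_submerged(h2, 0.61, 1.0)}")
```

Any bracketing root finder could replace the bisection; the point of the sketch is only that Q is the single unknown being iterated on.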


Figure 6 | Flow chart of discharge model. This computation is repeated each time step; it is fully contained in the block named ‘discharge model’ in Figure 4.

Iterations on Q ultimately yield a value at which $h_{2,forward}$, computed from upstream, is equal to $h_{2,backward}$ computed from downstream. This is the achieved value of Q for the given gate opening a. The entrance and exit losses are assumed to depend on the number of gates in use (m). The method does not distinguish between different gate configurations with equal m, however. Numerical results are shown in the results section.

CFD SIMULATIONS

Step 4 in Figure 1 consists of two parts: free-surface CFD simulations (discussed in this section) and flow analysis (discussed in the next section).

Model set-up

A non-hydrostatic flow model is applied to find out which of the selected gate settings is most favourable in terms of flow properties. The two-dimensional domain (2DV) is defined by a vertical cross-section through the gate section from one water body (lake) to the other (sea), see Figure 7. A rigid rectangular gate with a sharp-edged bottom is modelled implicitly by cutting its shape out of the flow domain. The Reynolds-Averaged Navier–Stokes (RANS) equations for incompressible flow, included in the Appendix (available online at http://www.iwaponline.com/jh/016/215.pdf), are the basis for the simulations. Figure 8 gives the flow chart of the CFD simulations. The model domain covers the flow from $h_1$ to $h_3$. These input values are taken from the discharge model.

Figure 8 | Flow chart of FEM free-surface flow simulations.

For each simulated flow situation, two consecutive runs are made: a steady-state run and a time-dependent transient run. In the former run, iterations on the outflow velocity profile are done until pressure at the surface becomes zero. The results of this pre-run are then implemented as initial conditions for the transient run, which uses a moving mesh to simulate the free surface. Boundary conditions are similar for both runs except for the surface downstream of the gate, see Figure 7.

Figure 7 | Boundary conditions of CFD model. The main flow direction is from left to right. Sketch not to scale.

The upstream flow boundary consists of a hydrostatic pressure profile $p(z) = \rho g (h_1 - z)$. The downstream boundary is a block profile u-velocity. No slip is applied at the walls ($\vec{u} = 0$) along with a wall function. The steady pre-run uses a ‘rigid lid’ (free slip boundary, $\vec{u} \cdot \vec{n} = 0$) for the downstream water surface. The upstream free surface is modelled as a rigid lid in both runs.

An unstructured computational mesh is used with refinements near the bottom wall and gate boundaries, made up of around 35,000 triangular elements and yielding about 230,000 degrees of freedom for a transient run. Figure 9 shows part of the mesh. The Arbitrary Lagrangian–Eulerian (ALE) method with Winslow smoothing (Donea et al. ) is applied to compute the deformation of the computational mesh downstream of the gate. At the top boundary in the transient run, the velocity condition is an open boundary with zero stress in normal direction. At the same boundary, the mesh velocity in normal direction is prescribed as $u_{mesh,n} = u_1 n_x + u_3 n_z$ (Ferziger & Perić ). Mesh convergence tests showed that the applied mesh is sufficiently dense so that results do not improve on further mesh refinement.

Figure 9 | Example snapshot showing part of the computational mesh. Deformed surface downstream of gate is visible. Flow is from left to right.

The more common choice of applying a velocity condition upstream and a pressure boundary downstream conflicts with the required ALE moving mesh condition at the outlet boundary. Vertical mesh freedom is necessary for the surface movement. A hydrostatic pressure profile cannot be prescribed at the outlet, since any change in water depth at this boundary would imply a change of local pressure, which contradicts the applied pressure profile. In the course of the transient run, the free surface adapts to the pressure field and vice versa. Because the physical flow situation is quasi-steady, with fluctuations depending on degree of submergence and gate opening, the surface may show oscillations in time in its equilibrium state. As a consequence, the flow discharge is also not strictly constant in the equilibrium state.

The package Comsol Multiphysics is used to simulate the gate flow. This finite element method (FEM) solver is applied to solve the discretised RANS equations. The generalized alpha time-implicit stepping method is applied to ensure Courant stability, with a strict maximum time step of Δt = 0.02 s. The time step in the CFD model is completely independent of the time step in the system model and discharge model. The variables are solved in two segregated groups using a combination of the PARDISO solver and the iterative BiCGStab solver in combination with a VANKA preconditioner. The standard k-epsilon model is used for turbulence closure. Simulation of 24 seconds of physical time took around 6 hours of wall-clock time on an Intel 8-core i7 processor, 2.93 GHz, 8 GB RAM, occupying on average 1 GB RAM and 50% of total CPU power.
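The boundary values of the model set-up are simple enough to tabulate outside the solver, which can help when preparing or checking input. The sketch below is a stand-alone illustration and does not use or represent the Comsol interface. The function names are hypothetical, the water density is an assumed value, and the block velocity is assumed here to equal the unit discharge divided by the downstream depth, which the text does not state explicitly.

```python
import math

RHO = 1000.0  # water density (kg/m^3), assumed value
G = 9.81      # gravitational acceleration (m/s^2)


def hydrostatic_pressure(z, h1, rho=RHO):
    """Upstream boundary: p(z) = rho * g * (h1 - z) for 0 <= z <= h1 (Pa)."""
    return rho * G * (h1 - z)


def block_velocity(q, h3):
    """Downstream boundary: uniform horizontal velocity (m/s).

    Assumption: the block profile carries unit discharge q (m^2/s)
    over the full downstream depth h3 (m).
    """
    return q / h3


def surface_normal(slope):
    """Unit normal (n_x, n_z) of a free surface with local slope dh/dx."""
    length = math.hypot(slope, 1.0)
    return -slope / length, 1.0 / length


def mesh_normal_velocity(u1, u3, n_x, n_z):
    """Mesh velocity in normal direction at the free surface: u1*n_x + u3*n_z."""
    return u1 * n_x + u3 * n_z


if __name__ == "__main__":
    h1 = 1.0
    for z in (0.0, 0.5, 1.0):
        print(f"z = {z:.1f} m, p = {hydrostatic_pressure(z, h1):.1f} Pa")
    n_x, n_z = surface_normal(0.1)
    print(f"normal mesh velocity: {mesh_normal_velocity(1.0, 0.05, n_x, n_z):.4f} m/s")
```

On a flat surface (zero slope) the normal is vertical and the mesh simply follows the vertical fluid velocity, which is the limiting case of the prescribed mesh condition.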



ANALYSIS

The second part of step 4 in Figure 1 is the analysis of the modelling results obtained in previous steps. In this section, three aspects of analysis are discussed: flow parameters, vibrations and bed stability.

Flow parameters

Three parameters that are required for assessing various types of flow impact are extracted from the CFD model: the contraction coefficient $C_c$, the velocity in the vena contracta $U_{vc}$ and the Froude number (Fr). The flow field is interpolated to a regular grid, so that the edge of the separated layer is found, see Figure 10. The contraction coefficient is thus found directly.

Figure 10 | Vector flow field of run II. Flow is from left to right. The computed free surface behind the gate shows local lowering. Dashed line indicates separation between positive and negative $u_1$-velocity. The figure shows only part of the actual computational domain. Total domain length is 3.6 m.

The cross-sectional averaged velocity in the vena contracta is defined by a spatial average in the separated shear zone:

$$U_{vc}(t) = \frac{1}{C_c(t)\, a} \int_{z=0}^{C_c(t)\, a} U(z, t)\, dz$$

where U is the velocity magnitude scalar at the point of maximum flow contraction. For gate flow with significant fluctuations, the temporal mean of this quantity, $\bar{U}_{vc}$, may be used.

The Froude number is a widely used dimensionless measure for flow-related surface curvatures. It is used for describing the transition from intermediate to free flow regimes and predicting modular flow discharge and associated gate openings. Here it is defined as:

$$Fr(t) = \frac{U_{vc}(t)}{\sqrt{g\, h_2(t)}}$$

in which obviously $h_2 = C_c\, a$ in fully free flow. An overview of critical flow theory from a historical perspective is given by Castro-Orgaz & Hager () and from a more practical viewpoint by Boiten (). In a more complete flow assessment, not only the vertical contraction caused by the underflow gate is used as a criterion for modular flow, as is done here, but also contraction caused by horizontal and possibly vertical flow domain transitions at the inlet of the structure should be included.

Vibrations

The interaction of current with the movable hydraulic gate is capable of causing significant flow-induced vibrations (FIV). Although dedicated design tests greatly reduce susceptibility for dangerous dynamic forces (Jongeling & Erdbrink ), active prediction and control will broaden the windows of operation. The literature on dynamic gate forces caused by this phenomenon uses a dimensionless parameter of reduced velocity to signify occurring gate vibrations (Hardwick ; Billeter & Staubli ; Erdbrink ). In time-dependent form it is written as:

$$V_r(t) = \frac{U_{vc}(t)}{f_{gate}(t)\, L}$$

where $f_{gate}$ is the response frequency of the structure in Hz; L is a characteristic length scale of the gate, usually the thickness of the gate bottom, and $U_{vc}$ as defined in the previous section. The response frequency is not easily determined analytically (see general formula in Appendix, available online at http://www.iwaponline.com/jh/016/215.pdf); among other reasons because the ‘added’ water mass $m_w$ that is caused by the inertia of water being pushed away by the gate deviates from analytical values



at non-zero gate flow (Blevins ). The gate frequency may best be monitored in situ by installing sensors, which need to be sensitive to small amplitudes in order to have predictive value. Erdbrink et al. () provide a recipe for a data-driven gate control system for gate vibrations. It is therein proposed to combine physics-based modelling and sensor data with machine learning computations to steer the gates clear of riskful situations.

From numerous experimental studies it is concluded (Naudascher & Rockwell ) that for a specific gate, the amplitude A due to FIV, in cross-flow or in-flow direction or both, is a function of Vr, a and submergence:

$$A = f(V_r, a, h_3)$$

Details of the gate geometry are decisive for occurrence or absence of vibrations. A database with response data from past laboratory studies could be used to predict amplitudes of future flow situations in an operational system.

Scour and bed protection

The classical prediction of local scour downstream of weirs and sluice structures caused by outlet currents is described by Breusers () and Hoffmans & Pilarczyk (). More recently, contemporary computational techniques were introduced for scour estimation, e.g. Azmathullah et al. (). In the classical physics-based design formulae, turbulence parameters are used to predict the depth of the scour hole in unprotected beds. For beds protected with granular material (loose rocks), the Shields parameter is a classic non-dimensional measure applied as a first indicator for instability (Shields ). An adapted version of this parameter used by Jongeling et al. () and elaborated upon by Hofland () and Hoan et al. () is defined as:

$$\Psi(x) = \frac{\left\langle \left( \bar{U}(x) + \alpha \sqrt{k(x)} \right)^{2} \right\rangle}{\Delta\, g\, d(x)}, \qquad \text{with } \Delta = \frac{\rho_s - \rho_w}{\rho_w},$$

where ⟨..⟩ denotes spatial averaging over the whole water depth, k is the turbulent kinetic energy (TKE), d is the local water depth, $\bar{U}$ is the mean flow velocity magnitude and α is an empirical parameter for bringing into account the turbulence (that depends on flow type and local geometry, e.g. slopes in bottom profile).

MODEL VALIDATION RESULTS

A series of validation runs was performed for the free-surface model. ‘Validation run’ is used here in the meaning discussed by Stelling & Booij (): the uncalibrated model is run without any tweaking of parameters to see if it can reproduce the most important physical features. Experimental laboratory data by Nago (1978, 1983) for a vertical sharp-edged gate under submerged efflux serve as comparison. Nago’s (1978, 1983) dimensions were used without any scaling. His discharge formula

$$Q = C_E\, a\, w \sqrt{2 g h_1}$$

does not contain the downstream level h3 explicitly. Its influence is instead found in the discharge coefficient CE. The simulated discharge is computed by spatial integration of horizontal velocity at the outflow boundary. In Figure 11, coefficient CE is plotted for different series of dimensionless gate openings and for a range of dimensionless downstream levels. The results of the validation runs make clear that the simulations capture the discharges of the experimental data quite accurately: the correlation coefficient is 0.994 and the root mean square error is 1.14%. The fact that the uncalibrated model shows good discharge estimates gives confidence in the predictive power of this modelling approach. Physical output not validated here (such as TKE) may be calibrated in future studies by adjusting suitable model parameters.

Convergence of various flow variables occurs at different rates. First, the mean velocities stabilize, and then the forces on the gate converge, then the discharge, and lastly the turbulent energy. The chosen boundary conditions proved to lead to stable results for all submergence ratios of Nago’s (1978, 1983) data. It was found that the moving mesh is the critical factor for numerical stability. ALE is a suitable method for computing the free surface for quasi-steady gate flow as long as the flow remains submerged. Steep surface gradients associated with lowering h3 cause inverted mesh elements and hence numerical instabilities.
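The discharge coefficient in Nago's formula and the two agreement measures quoted in the validation section (correlation coefficient and root mean square error) can be sketched in Python as follows. This is a minimal illustration, not the authors' post-processing: the function names are hypothetical, and the RMSE is assumed here to be the root mean square of the relative deviations expressed in percent, which the text does not specify.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def discharge_coefficient(q, a, w, h1):
    """Solve Q = C_E * a * w * sqrt(2 g h1) for C_E.

    q  : discharge (m^3/s)
    a  : gate opening (m)
    w  : gate width (m)
    h1 : upstream water level (m)
    """
    return q / (a * w * math.sqrt(2.0 * G * h1))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rmse_percent(simulated, measured):
    """Root mean square of relative deviations, in percent.

    Assumed definition; the paper does not state how its RMSE is normalised.
    """
    squares = [((s - m) / m) ** 2 for s, m in zip(simulated, measured)]
    return 100.0 * math.sqrt(sum(squares) / len(squares))
```

Applied to simulated and measured CE values for each combination of a/h1 and h3/a, the two measures summarise a comparison such as the one plotted in Figure 11 in two numbers.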


Figure 11 | Results of validation runs showing discharge coefficient CE simulated by the free-surface CFD model versus experimental data of submerged flow of a sharp-edged underflow gate by Nago (1978, 1983). Left: sorted by gate opening (a/h1) and downstream level (h3/a). Right: direct comparison of the same data. Dashed lines mark 10% deviation.

TEST CASE RESULTS

The described methods are illustrated by a test case example. The results of three modelling steps are discussed: the sluice model containing the system model (for water levels) plus the discharge model (Figures 4 and 6), the free-surface model (Figure 8) and analysis of vibrations and bed stability. Four tidal cycles and four discharge events were modelled for a discharge sluice with seven gates regulating a lake with constant river inflow. The goal of the computations is to determine the optimal number of gates to open and the best gate operation scenario.

Results of system and discharge model

Model parameters

• n = 7, m = 1, …, 7
• Alake = 1.9·10^7 m2
• Qriver = 100 m3/s
• hlake(t = 0) = 6.1 m
• htarget = 6.0 m
• w = 22.5 m
• sill height: 3 m
• mean sea level = z0 + 6.1 m
• tidal amplitude = 0.60 m
• tidal period = 12.5 hours

The sluice model was run for 1 ≤ m ≤ 7. When opening only one gate, the target lake level could not be reached even when lifting the gate completely. When using two gates, the target level is reached, but the modular flow limit is exceeded for the greatest part of the discharge period. This results in unwanted transitions to intermediate and free flow with fluctuating discharges that are hard to control. For 3 ≤ m ≤ 7 strictly submerged flow exists and the target is met. Therefore, only these configurations are modelled further.

The plotted water levels (Figure 12) show that the lake level fluctuates in a controlled way and is nearly identical for the scenarios with and without discharge control. In Figure 13, the gate openings and achieved gate discharges in time are plotted for one tidal period for the situations with three or seven gates opened during the discharge event. Intermediate numbers of operated gates (4 ≤ m ≤ 6) lie between the shown curves for m = 3 and m = 7, but are not plotted for clarity. It can be seen that constant gate openings give discharges that vary in time following the time-dependent hydraulic head difference. In the PID-controlled scenario, the gate opening is automatically operated in such a way that the discharge stabilizes quickly after the start.
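The contrast between a constant gate opening and a PID-controlled opening can be illustrated with a short Python sketch. It is not the sluice model of the paper: the submerged-orifice relation, the coefficient c = 0.6, the opening limits and the controller gains are all assumptions chosen for illustration, and only the gate width w = 22.5 m is taken from the parameter list above. The controller output is interpreted as a gate speed.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def gate_discharge(a, head, w=22.5, c=0.6):
    """Assumed submerged-orifice relation Q = c * a * w * sqrt(2 g head).

    a    : gate opening (m)
    head : hydraulic head difference over the gate (m)
    c    : hypothetical discharge coefficient, not taken from the paper
    """
    return c * a * w * math.sqrt(2.0 * G * max(head, 0.0))


class PID:
    """Textbook PID controller acting on the discharge error."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def run_scenario(q_target, heads, dt, a0=0.5, a_max=3.0, pid=None):
    """Return the discharge per time step for a series of head differences.

    With pid=None the opening stays at a0 (constant-opening scenario).
    Otherwise the PID output is used as gate speed (m/s) and the opening
    is limited to the interval [0, a_max].
    """
    a = a0
    discharges = []
    for head in heads:
        q = gate_discharge(a, head)
        discharges.append(q)
        if pid is not None:
            speed = pid.update(q_target - q, dt)
            a = min(max(a + speed * dt, 0.0), a_max)
    return discharges
```

Calling `run_scenario` once with `pid=None` and once with, for example, `PID(kp=2e-4, ki=0.0, kd=0.0)` on a slowly varying head series reproduces the qualitative behaviour described above: the uncontrolled discharge follows the head difference, whereas the controlled discharge settles near the target after a short transient.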



Figure 12 | Results of sluice model for 3 ≤ m ≤ 7: sea and lake level for gate operation scenario with and without PID-controlled discharge. Vertical line indicates moment of maximum head difference.

In this multi-scale modelling approach, averaged values from the discharge model are used to improve discharge predictions at system scale. However, instantaneous discharges and gate openings computed in both models inevitably differ. The largest discrepancies are around 10%. This could be improved by examining different update methods, at the cost of longer computation time.

Three configurations are selected for evaluation by free-surface simulations. These cases are marked in Figure 13 as runs I, II and III. Runs I and III represent extremes: a constant gate opening with only three gates in use (high Q) and a controlled opening with all seven gates in use (low Q). All three runs are at the time of maximum head difference. In real-life practice, more cases could be selected for simulation depending on specific interests and available computing power.

Figure 13 | Results of sluice model: gate openings (left) and achieved discharges per gate (right).

Results of CFD simulations

To simulate the two selected runs I and II within the validated range, the levels and opening are scaled down with length scale 1:10, see Table 1. The near-gate flow velocities, pressures, TKE and dissipation are simulated. Figure 10 shows a plot of the simulated flow field of run II (at length scale 1:10) by indicating $\vec{u}$. The simulated free surface as expected sinks in the region directly downstream of the gate (solid line in Figure 10). In this case, the vena contracta is located at short distance downstream of the flow separation point. The separation between positive and negative horizontal velocities in the recirculation area is derived (dashed line in Figure 10). At a distance of around five times the downstream water level past the gate, the flow reattaches at the surface and the velocity starts to return to a more uniform profile.

Figure 14 shows plots of the pressure and TKE of run II. In the case shown in the plots, the equilibrium state reached in the simulations is fully steady. Pressure gradients are mild; the pressure returns smoothly to a hydrostatic shape as the streamlines become parallel downstream. The TKE reaches a maximum in the middle of the water column at about two times the downstream water depth past the gate. Run I has a steeper surface behind the gate than run II (shown in Figures 10 and 14) and higher TKE levels, while run III has the lowest TKE levels and the most level surface downstream of the gate.

Table 1 | Values of selected CFD runs

Run | Gate configuration | Length scale | h0 (m) | h1 (m) | h3 (m) | h4 (m) | Gate opening a (m) | Total discharge Qtot (m3/s) | Discharge per gate Qi (m3/s) | Discharge per gate per unit width qi (m2/s)
I | m = 3, constant opening | 1:1 | 3.07 | 2.93 | 2.38 | 2.50 | 1.30 | 270 | 90.1 | 4.00
I | m = 3, constant opening | 1:10 | 0.307 | 0.293 | 0.238 | 0.250 | 0.130 | 0.855 | 0.285 | 0.127
II | m = 3, PID control | 1:1 | 3.07 | 3.00 | 2.44 | 2.50 | 1.14 | 237 | 79.02 | 3.51
II | m = 3, PID control | 1:10 | 0.307 | 0.300 | 0.244 | 0.250 | 0.114 | 0.750 | 0.250 | 0.111
III | m = 7, PID control | 1:1 | 3.07 | 3.06 | 2.49 | 2.50 | 0.610 | 237 | 33.86 | 1.51
III | m = 7, PID control | 1:10 | 0.307 | 0.306 | 0.249 | 0.250 | 0.0610 | 0.750 | 0.107 | 0.0476

▪ Input values for CFD runs. All water levels hi are relative to z = 0.

Figure 14 | Pressure p in Pa (above) and turbulent kinetic energy k (TKE) in m2/s2 (below) of run II.

Results of flow analysis

The output of the CFD free-surface model is used for computing the values of the three flow parameters that were discussed in an earlier section, see Table 2.

Table 2 | Computed flow parameters derived from CFD model results

Run | Cc (-) | Uvc (m/s) | Fr (-)
I | 0.88 | 3.56 | 0.83
II | 0.86 | 3.50 | 0.78
III | 0.84 | 2.74 | 0.57

Table 2 shows that the contraction coefficients do not differ much, which is expected for similar gate types. The velocity in the control section Uvc is highest for the situation with highest discharge per gate (run I) and lowest for the situation with smallest discharge per gate (run III). The same holds for the Froude number. This matches observations from the free surface curvatures of the final solution of the transient simulations.

The flow impact on the bed protection material is estimated by computing Ψ for two different α for the selected runs. The whole water depth d is used for averaging the square of the maximum local velocity term $(\bar{U} + \alpha\sqrt{k})^{2}$. The results are plotted in Figure 15. The plot shows that run I (three gates with constant opening) has the strongest flow impact on the bed material of the three runs irrespective of the choice for α. The Ψ-values of run II show that controlling the discharge without opening more gates already gives a lower flow impact on the bed. Run III (seven gates with controlled discharge) has the lowest flow impact. All runs reach their maximum flow impact on the bed around the same (limited) distance downstream of the gate. For all runs the general shape of the curves

is quite similar for both values of α, indicating that turbulence is dominant over mean velocity for the flow impact. Overall, the values of the bed stability parameter are somewhat low compared to previous numerical investigations by Erdbrink & Jongeling () and Erdbrink (), which could be attributed to the use of the standard k-epsilon model in this study instead of the RNG k-epsilon turbulence model used in the two mentioned studies. Choosing higher α values could compensate for the lower TKE. For practical application one should fix α after calibration in experimental investigations and define a threshold value for Ψ that should not be exceeded during operation and that can be used as a fitness measure for different flow scenarios.

Turning to the assessment of gate vibrations, it is calculated that for an assumed range of structural response frequencies of 2–5 Hz (typical values for large hydraulic gates), the reduced velocity number Vr lies in the range 3.5–8.5 for runs I and II and in the range 2.5–6 for run III. For illustration purposes, a response curve is devised, see Figure 16, since a full evaluation is laborious (e.g. Billeter & Staubli ). Projection of the Vr-values onto the response curve gives the resulting vibration amplitudes.

Figure 16 | Gate vibration response for runs I–III giving relative amplitude A / Amax as a function of reduced velocity Vr. Fictitious response curves are used to illustrate the method. Two regions of gate openings a are distinguished.

Two different response curves are used in Figure 16. The most significant excitation of cross-flow vibrations of a flat-bottom gate occurs at small gate openings, therefore higher amplitudes are expected for run III. In this fictitious instance, the computed Vr-ranges indeed give higher relative amplitudes for run II than for the other two runs. As with the bed protection assessment, the definition of a literature-based threshold level would be a logical addition for real applications.

Based on the discussed modelling results and flow analysis, it may be decided to implement the discharge scenario of run II, because it leads to acceptable vibration levels and gives a lower impact on the bed material than run I, while still ensuring sufficient discharge volume to reach the target lake level.
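The use of Ψ as a fitness measure with a threshold, as suggested in the flow analysis, can be sketched in Python. This is an illustration under stated assumptions rather than the authors' implementation: the function names, the equally spaced sampling of the water column, the rock density of 2650 kg/m3 and the threshold value passed by the caller are all hypothetical. The formula follows the definition of Ψ given in the section on scour and bed protection, with d the local water depth.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def bed_stability_parameter(u_profile, k_profile, depth, alpha,
                            rho_s=2650.0, rho_w=1000.0):
    """Psi = <(U + alpha * sqrt(k))^2> / (Delta * g * d) at one location x.

    u_profile, k_profile : velocity magnitude (m/s) and TKE (m^2/s^2)
                           sampled at equally spaced points over the water
                           column (assumed sampling)
    depth                : local water depth d (m)
    alpha                : empirical turbulence parameter
    rho_s, rho_w         : rock and water density (kg/m^3), assumed values
    """
    delta = (rho_s - rho_w) / rho_w
    terms = [(u + alpha * math.sqrt(k)) ** 2
             for u, k in zip(u_profile, k_profile)]
    depth_average = sum(terms) / len(terms)
    return depth_average / (delta * G * depth)


def max_psi(sections, alpha):
    """Largest Psi over a list of sections downstream of the gate."""
    return max(bed_stability_parameter(s["u"], s["k"], s["d"], alpha)
               for s in sections)


def acceptable(sections, alpha, psi_threshold):
    """True if Psi stays at or below the chosen threshold everywhere."""
    return max_psi(sections, alpha) <= psi_threshold
```

Evaluating `max_psi` for each candidate gate scenario and comparing the results gives a simple ranking of the kind shown in Figure 15, once α and the threshold have been fixed by calibration.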

Figure 15 | Computed values of bed stability parameter Ψ downstream of the gate for two different values of turbulence impact parameter α. Runs I, II and III are shown.

RECOMMENDATIONS

As a main recommendation, we propose to apply this modelling process in a case study of existing barrier structures such as Haringvliet, Oosterschelde, Maeslantkering in The Netherlands or the Saint Petersburg barrier in Russia. This research should find a natural place within on-going work on system-scale modelling for water level prediction used in decision support for hydraulic structures (Boukhanovsky & Ivanov ). Specific gate uses are to be simulated and evaluated. For the last two barriers just mentioned, the operational modelling system will be mostly aimed at widening the window of



operation. The introduced methods can also be adapted for weirs in rivers. Coupling the presented models with a mid-field or far-field model of regional scale would enable an operational impact assessment for water management issues such as salt water intrusion.

The inclusion of measurement data (from field sensors or laboratory tests) is necessary for the calibration of empirical parameters (such as entrance and exit losses), for the process of model validation and for providing actual model input (water levels). Experiences from the field of hydroinformatics should be added to the present research to make the extension towards data-driven modelling components. The link with data assimilation that is to be accommodated by the higher-level models is obvious.

A longstanding issue in the engineering practice of detailed hydrodynamics is turbulence modelling. The right balance between accuracy and computational costs needs to be found for specific applications. Again, smart use of measurement data for numerical validation and calibration could be the key. It is furthermore expected that intermediate and free flow conditions where hydraulic jumps occur away from the gate, including the Venturi flow type, can be modelled more universally using other numerical methods such as Phase Field or Volume of Fluid. If needed, the model can thus be extended to account for dynamic effects directly related to opening and closing actions of the gates. Active setpoint ramping of the PID-control using feed-forward model predictions is another recommendation related to this.

CONCLUSIONS AND FUTURE WORK

The purpose of the current study was to set up physics-based modelling methods for a flow-centred operation of gates of hydraulic structures. The described case of a multi-gated outlet barrier sluice has shown how discharge estimates and free-surface simulations can aid in deciding on optimal gate configuration and opening scenarios.

The application of a PID-controller to achieve a more constant discharge during changing head differences emerged as a feasible addition to traditional structure operation. Prediction of gate discharge coefficients is a central issue in determining appropriate gate openings in the control process. A combination of elementary equations and empirical relations was used for this. The increase in computational power over the years now enables solving these flow equations in quick assessment procedures during operation.

Free-surface CFD simulations of the turbulent flow past an underflow gate revealed the effects of local lowering of the surface on flow velocities and TKE levels. Time-dependent FEM simulations with a moving mesh technique were found to give stable solutions of the free-surface under submerged conditions. From a series of validation runs it is concluded that the free-surface model yields discharge values for a range of gate openings and submergence levels within an acceptable accuracy of experimental values.

Among the flow analysis possibilities based on output from the free-surface model is computation of the Froude number, the reduced velocity parameter for estimating gate vibrations and a stability parameter for granular bed protection. The numerical example of the discharge sluice has proved the feasibility of combining discharge estimates with free-surface simulations for deriving operational decisions. For the particular case treated in this paper, it was found that lower TKE levels of the PID-controlled discharge scenarios contribute significantly to reducing the flow attack on the bed protection. Additionally, the model showed the influence of the number of opened gates on the flow properties.

The practical benefits of including near-field flow modelling in gate control systems seem clear. It will enable more sophisticated water reservoir management in everyday operation with respect to issues such as salt water intrusion, fish migration and possibly saving energy. In extraordinary situations, model results can help maintain safe gate usage and prevent gate vibrations, washing away of bed protection and the development of scour holes around the structure.

Limitations of the followed modelling approach need to be addressed in follow-up studies. Additional calibrations are necessary: PID-control optimization to obtain the desired discharge more precisely, discharge and loss coefficients in the flow equations and turbulence model parameters. Next to this, improvements to the free-surface model should broaden the range of applicability so that steeper surface disruptions and hydraulic jumps as found in free flows can be captured as well.



The physics-based model of this study is logically complemented by data-driven techniques in future studies. It is believed that hydroinformatics provides the required tools for this. Use of sensor data from real-life structures and coupling to system-scale water level prediction models are seen as next steps. Moreover, it should be investigated how operational decisions should be derived when taking into account the various criteria and flow constraints.

ACKNOWLEDGEMENTS

This work was supported by the EU FP7 project UrbanFlood, grant N 248767; by the Leading Scientist Program of the Russian Federation, contract 11.G34.31.0019 and by the BiG Grid project BG-020-10, #2010/01550/NCF with financial support from The Netherlands Organisation for Scientific Research NWO. It is carried out in collaboration with Deltares.

REFERENCES Akoz, M. S., Kirkgoz, M. S. & Oner, A. A.  Experimental and numerical modeling of a sluice gate flow. J. Hydraul. Res. 47 (2), 167–176. Azamathulla, H. Md.  Gene expression programming for prediction of scour depth downstream of sills. J. Hydrol. 460–461, 156–159. Azmathullah, H. M. D., Deo, M. C. & Deolalikar, P. B.  Estimation of scour below spillways using neural networks. J. Hydraul. Res. 44 (1), 61–69. Battjes, J. A.  Vloeistofmechanica. Lecture notes CT2100, Delft University of Technology, Fac. of Civil Eng. & Geosciences, Fluid Mechanics section. Becker, B. P. J. & Schwanenberg, D.  Conjunctive real time control and hydrodynamic modelling in application to Rhine river. In: HIC 2012: Proceedings of the 10th International Conference on Hydroinformatics. Hamburg, Germany, 14–18 July, 2012. TuTech Verlag, Hamburg. Billeter, P. & Staubli, T.  Flow-induced multiple-mode vibrations of gates with submerged discharge. J. Fluids Struct. 14, 323–338. Blevins, R. D.  Flow-induced Vibration, 2nd edn. Van Nostrand Reinhold, New York. Boiten, W.  Vertical gates as flow measures structures. In: Proceedings of the 2nd International Conference on Hydraulic Modelling. Stratford-upon-Avon, UK, 14–16 June 1994, pp. 33–44. BHR Group, London.


Bollaert, E. F. R., Munodawafa, M. C. & Mazvidza, D. Z.  Kariba dam plunge pool scour: quasi-3D numerical predictions. In: Proceedings of the International Conference on Scour and Erosion ISCE6, Paris, August 27–31, 2012. Boukhanovsky, A. V. & Ivanov, S. V.  Urgent computing for operational storm surge forecasting in Saint Petersburg. Proc. Comput. Sci. 9, 1704–1712. Breusers, H. N. C.  Conformity and time-scale in twodimensional local scour. In: Proceedings of the Symposium on Model and Prototype Conformity. Hydr. Res. Lab., Poona, India, pp. 1–8. Brown, F. T.  Engineering System Dynamics: A Unified Graph-centred Approach, 2nd edn. Taylor & Francis Group, Boca Raton, FL, USA. Chow, V. T.  Open-Channel Hydraulics. McGraw-Hill, New York. Castro-Orgaz, O. & Hager, W. H.  Critical flow: a historical perspective. J. Hydraul. Eng. 136, 3–11. Deltares a Delft3D-Flow User Manual (HydroMorphodynamics) – version: 3.15.20508. Available from: http://oss.deltares.nl/web/opendelft3d. Deltares b SOBEK-RE User Manual. Available from: http:// sobek-re.deltares.nl and www.deltaressystems.com. Donea, J., Huerta, A., Ponthot, J.-Ph. & Rodríguez-Ferran, A.  Arbitrary Lagrangian–Eulerian methods. In: The Encyclopedia of Computational Mechanics (E. Stein, R. De Borst & T. J. R. Hughes, eds). Vol. 1. John Wiley & Sons, Bognor Regis, UK, pp. 413–437. Erdbrink, C. D.  Ontwerpmethodiek granulaire bodemverdediging met CFX ongestructureerd. Deltares research report 1200257-003, kennisonline.deltares.nl. Erdbrink, C. D.  Physical model tests on vertical flow-induced vibrations of an underflow gate. Deltares research report 1202229-004, kennisonline.deltares.nl. Erdbrink, C. D. & Jongeling, T. H. G.  Computations of the turbulent flow about square and round piers with a granular bed protection: 3D flow computations with CFX. Deltares research report Q4386/Q4593, kennisonline. deltares.nl. Erdbrink, C. D., Krzhizhanovskaya, V. V. & Sloot, P. M. A. 
 Controlling flow-induced vibrations of flood barrier gates with data-driven and finite-element modelling. In: Comprehensive Flood Risk Management (F. Klijn & T. Schweckendiek, eds). CRC Press/Balkema (Taylor & Francis Group), Leiden, Proceedings of the 2nd European Conference on Flood Risk Management FLOODrisk 2012. 20–22 November 2012, Rotterdam, The Netherlands, pp. 425–434. Available from www.crcpress.com/product/ isbn/9780415621441. Ferziger, J. H. & Peric´, M.  Computational Methods for Fluid Dynamics, 3rd edn. Springer-Verlag, Berlin, Heidelberg, New York. Habibzadeh, A., Vatankhah, A. R. & Rajaratnam, N.  Role of energy loss on discharge characteristics of sluice gates. J. Hydraul. Eng. 137 (9), 1079–1084.



Hardwick, J. D.  Flow-induced vibration of vertical-lift gate. J. Hydraul. Div. Proc. ASCE 100 (5), 631–644. Hoan, N. T., Stive, M., Booij, R., Hofland, B. & Verhagen, H.  Stone stability in nonuniform flow. J. Hydraul. Eng. 137 (9), 884–893. Hoffmans, G. J. C. M. & Pilarczyk, K. W.  Local scour downstream of hydraulic structures. J. Hydraul. Eng. 121 (4), 326–340. Hofland, B.  Rock & Roll – Turbulence-Induced Damage to Granular Bed Protections. PhD Thesis, Delft University of Technology, The Netherlands. Ivanov, S. V., Kosukhin, S. S., Kaluzhnaya, A. V. & Boukhanovsky, A. V.  Simulation-based collaborative decision support for surge floods prevention in St. Petersburg. J. Comput. Sci. 3 (6), 450–455. Jongeling, T. H. G. & Erdbrink, C. D.  Dynamica van beweegbare waterkeringen – Trillingen in onderstroomde schuiven en uitgangspunten voor een schaalmodelopstelling. Deltares research report 1200216-000, kennisonline.deltares.nl. Jongeling, T. H. G., Blom, A., Jagers, H. R. A., Stolker, C. & Verheij, H. J.  Design method granular protections. WL| Delft Hydraulics, Technical report Q2933/Q3018. Khan, L. A., Wicklein, E. A. & Rashid, M.  A 3D CFD model analysis of the hydraulics of an outfall structure at a power plant. J. Hydroinform. 7 (4), 283–290. Khorchani, M. & Blanpain, O.  Development of a discharge equation for side weirs using artificial neural networks. J. Hydroinform. 7 (1), 31–39. Kim, D.-G.  Numerical analysis of free flow past a sluice gate. KSCE J. Civil Eng. (Water Eng.) 11 (2), 127–132. Kolkman, P. A.  Discharge relations and component head losses for hydraulic structures. In: Hydraulic Structures Design Manual 8 (D. S. Miller, ed.). IAHR/AIRH, Balkema, pp. 55– 151. Also published in 1989 as Delft Hydraulics report Q953. Krzhizhanovskaya, V. V., Shirshov, G. S., Melnikova, N. B., Belleman, R. G., Rusadi, F. I., Broekhuijsen, B. J., Gouldby, B. P., Lhomme, J., Balis, B., Bubak, M., Pyayt, A. L., Mokhov, I. 
I., Ozhigin, A. V., Lang, B. & Meijer, R. J.  Flood early warning system: design, implementation and computational modules. Proc. Comput. Sci. 4, 106–115. Martin, D., Bertasi, F., Colangelo, M. A., De Vries, M., Frost, M., Hawkins, S. J., Macpherson, E., Moschella, P. S., Satta, M. P., Thompson, R. C. & Ceccherelli, V. U.  Ecological impact of coastal defence structures on sediment and mobile fauna: Evaluating and forecasting consequences of unavoidable modifications of native habitats. Coast. Eng. 52, 1027–1051. Melnikova, N. B., Shirshov, G. S. & Krzhizhanovskaya, V. V.  Virtual dike: multiscale simulation of dike stability. Proc. Comput. Sci. 4, 791–800. Nago, H.  Influence of gate-shapes on discharge coefficients. Trans. JSCE 10, 116–119. Original in Japanese: Proc. of JSCE 270, Feb. 1978, 59–71.


Nago, H.  Discharge coefficient of underflow gate in open channel. Research Report Department of Civil Engineering, Okayama University, Japan. Nam, P. T., Larson, M., Hanson, H. & Xuan Hoan, L.  A numerical model of beach morphological evolution due to waves and currents in the vicinity of coastal structures. Coast. Eng. 58, 863–876. Naudascher, E. & Rockwell, D.  Flow-induced Vibrations – An Engineering Guide. Dover Publications, New York. Pengel, B., Krzhizhanovskaya, V. V., Melnikova, N. B., Shirshov, G. S., Koelewijn, A. R., Pyayt, A. L. & Mokhov, I. I.  Flood early warning system: sensors and internet. In: IAHS Red Book, N 357, Floods: From Risk to Opportunity (A. Chavoshian & K. Takeuchi, eds). IAHS Press, Wallingford, UK, pp. 445–453. Available from www.iahs. info/uploads/dms/15684.357%20445-453.pdf. Pyayt, A. L., Mokhov, I. I., Kozionov, A., Kusherbaeva, V., Melnikova, N. B., Krzhizhanovskaya, V. V. & Meijer, R. J. a Artificial intelligence and finite element modelling for monitoring flood defence structures. IEEE Workshop on Environmental, Energy, and Structural Monitoring Systems. September 2011. pp. 1–7. Available from http://dx.doi.org/ 10.1109/EESMS.2011.6067047. Pyayt, A. L., Mokhov, I. I., Lang, B., Krzhizhanovskaya, V. V. & Meijer, R. J. b Machine learning methods for environmental monitoring and flood protection. World Acad. Sci. Eng. Technol. 54, 118–123. Available from http://waset. org/journals/waset/v54/v54-23.pdf. Rijkswaterstaat  Haringvlietsluizen op een kier – Effecten op natuur en gebruiksfuncties. Stuurgroep Realisatie de Kier, report AP/2004.07, Dutch Ministry of Public Works. Roth, A. & Hager, W. H.  Underflow of standard sluice gate. Exp. Fluids 27 (4), 339–350. Shields, A.  Anwendung der Aehnlichkeitsmechanik und der Turbulenzforschung auf die Geschiebebetrieb. Mitteilungen der Preussischen Versuchsanstalt fur Wasserbau und Schiffbau, Heft 26. Solomatine, D. P. & Ostfeld, A. 
 Data-driven modelling: some past experiences and new approaches. J. Hydroinform. 10 (1), 3–22. Stelling, G. S. & Booij, N.  Computational modelling of flow and transport. Lecture notes CTwa4340, Delft University of Technology, The Netherlands. Warmink, J. J., Van der Klis, H., Booij, M. J. & Hulscher, S. J. M. H.  Identification and quantification of uncertainties in river models using expert elicitation. In: Proc. Conf. NCR-days 2008 (A. G. van Os & C. D. Erdbrink, eds). NCR-Publications, 33–2008, Delft, pp. 40–41. Warmink, J. J., Janssen, J. A. E. B., Booij, M. J. & Krol, M. S.  Identification and classification of uncertainties in the application of environmental models. Environ. Model. Software 25, 1518–1527.

First received 15 November 2012; accepted in revised form 28 June 2013. Available online 5 August 2013


207

© IWA Publishing 2014 Journal of Hydroinformatics

|

16.1

|

2014

Implementation of pressure reduction valves in a dynamic water distribution numerical model to control the inequality in water supply Gabriele Freni, Mauro De Marchis and Enrico Napoli

ABSTRACT

The analysis of water distribution networks has to take into account the variability of users’ water demand and the variability of network boundary conditions. In complex systems, e.g. those characterized by the presence of local private tanks and intermittent distribution, this variability suggests the use of dynamic models that are able to evaluate the rapid variability of pressures and flows in the network. The dynamic behavior of the network also affects the performance of valves that are used for controlling the network. Pressure reduction valves (PRVs) are used for controlling pressure and reducing leakages. Highly variable demands can produce significant fluctuation of the PRV set point, causing related transient phenomena that propagate through the network and may result in water quality problems, unequal distribution of resources among users, and premature wear of the pipe infrastructure. A model was developed in previous studies, and an additional module for pressure control, able to analyze PRVs in a fully dynamic numerical framework, was implemented. The model was demonstrated to be robust and reliable in the implementation of pressure management areas in the network. The model was applied to a district of the Palermo network (Italy). The district was monitored and pressure as well as flow data were available for model calibration.

Key words | dynamic model, intermittent distribution, method of characteristic, pipe-filling process, PRVs, water distribution network modeling

Gabriele Freni (corresponding author), Mauro De Marchis
Università di Enna ‘Kore’, Facoltà di Ingegneria, Architettura e Scienze Motorie, Cittadella Universitaria, I-94100, Enna, Italy
E-mail: gabriele.freni@unikore.it

Enrico Napoli
Università di Palermo, Dipartimento di Ingegneria Civile, Ambientale ed Aerospaziale, Viale delle Scienze, I-90128, Palermo, Italy

INTRODUCTION

The distribution of water resources can be made through two different delivery methods: continuous or intermittent distribution. Continuous distribution ensures better management of the water network because the water demand depends only on user requests and the service quality can be better guaranteed. In a water scarcity condition, an intermittent system is used by the management authority for rationing the available water volume, for reducing real losses and/or for controlling consumption (Fontanazza et al. ).

Due to several detrimental aspects, this approach should be only applied if no other management choices are available. Despite this, it is broadly adopted not only in developing countries (Hardoy et al. ) but also in developed ones (Cubillo ). In a water scarcity condition, this practice reduces the background water losses with little financial effort (Criminisi et al. ). Despite this, when the practice of intermittent supply is protracted over time, the effect could be opposite. Due to the water hammer induced by the filling process (De Marchis et al. ), a deterioration of the pipes occurs, thus increasing the rate of burst and increasing leakages, preventing achievement of one of the main objectives of the intermittent supply. Furthermore, discontinuous distribution presents several critical aspects, such as users’ inequality in access to water resources and the presence of filling and emptying transient phenomena affecting the mechanical stability of the pipes, the durability of the network and water losses (Vairavamoorthy et al. ).

doi: 10.2166/hydro.2013.032


Impact on water quality can be equally relevant because empty water pipes can be exposed to ingress of soil particles and contaminated water from the surrounding soil through leak openings. This means that the water quality integrity of the system is compromised and that users cannot be guaranteed a safe supply (National Research Council).

Users try to adapt to intermittent distribution by installing local tanks, in order to collect water when the distribution service is available, and use them when the service is suspended (Arregui et al.). Tanks are often oversized with respect to the users’ real needs and their presence makes the network work in conditions that are quite far from the design ones: flows in the lower parts of the network are much higher than the design until the tanks are full and water resources can reach the tanks in the disadvantaged areas of the network; pressure on the network is generally lower than the design and it is controlled by the levels in the tanks (Giustolisi et al.).

This configuration of the system reduces the applicability of common steady state models, because the private tank filling process creates continuous change in the hydraulic network behavior. To follow this constant change in network state variables, dynamic and pressure driven models are needed. Considering this aim, Giustolisi presented an extension of the pressure-driven analysis using a global gradient algorithm (Todini; Giustolisi et al. a, b) permitting the effective introduction of the lumped nodal demand while preserving the energy balance by means of a pipe hydraulic resistance correction. The model allowed the simulation of private tanks but tools for the regulation and control of network pressures could not be modeled.

Pressure control is one of the main technical options that a water manager can put in place to reduce the inequalities among users in such complex cases. Nevertheless, the low pressures and the complex and dynamic hydraulic behavior of the system with private tanks prevent a simple analysis of the effect of pressure control devices such as pressure reduction valves (PRVs) and pumps. Hydraulically controlled PRVs maintain a specified outlet pressure, irrespective of a higher fluctuating inlet pressure, and they are often implemented dividing the network into districts (Pressure Management Areas – PMAs). In intermittent networks, they may control pressures (and indirectly flows) in the advantaged parts of the network reducing the inequalities in water resource access among users.

Pressure transients caused by the combined behavior of a network and PRV propagate through a PMA and result in water supply problems, a higher number of pipe bursts, and premature wear of the pipe infrastructure. Since it is impossible to eliminate demand changes from a network, it is important to control PRVs appropriately to minimize their impact on the system. The interaction between automatic control valves and transients has been investigated in several publications. Bergant et al. investigated the effect of valve closing time on the transient response in a pipeline and compared measured data with a simulation model. The effect of automatic control valves in a real pipe network was shown by Brunone & Morelli, and used to estimate the friction in a transient model. A model for analysis and control of PRVs was implemented by Prescott & Ulanicki, using dynamic formulations and experimental analysis but the model was not integrated with a dynamic network modeling approach.

The analysis of the network during the filling process was carried out with a dynamic model, assuming that the air pressure inside the network is always equal to the atmospheric one and that the water column cannot be fragmented (De Marchis et al.). A demand model based on the node pressure-consumption law defining flow draw from the network and filling the tank was previously integrated into the network model (De Marchis et al.). In the present paper, a PRV module was integrated in the network model, following the dynamic approach proposed in Prescott & Ulanicki, obtaining a fully dynamic model of the network filling process in the presence of PRVs. The model was calibrated and applied for the implementation of PMAs in one of the distribution networks of Palermo (Italy). The research proposed here starts from the preliminary finding presented by Freni et al.

METHODOLOGY

In this section, the numerical model and the case study are presented. The model description is divided in two parts: the discussion of the network hydrodynamic model that was previously presented in De Marchis et al. and the
detailed description of the PRV valve that was implemented and integrated in the present study.

The network model

In the proposed numerical model the transient in pipes is simulated using fast elasticity-demand pressure waves. In fact, the initial velocity of the water front, inside a previously empty pipe, can be quite large since the pressure gradient is relatively high due to the rapid change in pressure, which can be considered atmospheric at the water front. In water distribution networks, where the pipes are initially empty, different filling cases occur and must be simulated by the numerical models. The proposed numerical model is able to simulate the following cases, shown in Figure 1. The first empty pipeline is connected to the network reservoirs and the filling of the network starts after the opening of the gates (Figure 1(a)). As the water front reaches one of the users’ connections, tanks start to fill, with a discharge that depends on the geometric and hydraulic features of the diversion as well as on the pressure at the derivation point (Figure 1(b)). When the water front reaches the end of a pipeline (Figure 1(c)), water begins to flow inside the pipelines connected to it; the pressure inside the filled pipeline generally continues to increase until a steady-state condition is reached.

Since water distribution networks are generally looped to increase system reliability, both ends of a pipeline start to fill during the filling; as a consequence, two water fronts proceed inside the pipe from opposite directions (Figure 1(d)). Once they reach the same cross-section, the subsequent collision can cause an increase in pressure that is a function of the propagation velocity of the water front. The numerical model, using the method of characteristics, is able to take into account these relatively small water hammers.

Figure 1 | Hydraulic schematics of the network filling process: (a) initial phase of water front propagation; (b) water front reaches a user connection; (c) water front reaches the end of the pipeline; (d) two water fronts proceed inside the pipe in opposite directions.

Because of the complexity of the system, determined by the various possible filling conditions that may occur, it is necessary to make some simplifying assumptions. Based on the study conducted by Liou & Hunt, it is assumed that the air pressure at the water front is always atmospheric and the wave-fronts are always perpendicular to the pipe axis and coincident with the cross-sections. For detailed discussion of the above hypothesis, see De Marchis et al.

In this paper, the solution of hydraulic equations has been carried out by means of the Method of Characteristics (MOC), starting from the condition of an empty network. The one-dimensional unsteady flow of the compressible liquid in the elastic pipe is described by the following system of equations:

$$g\frac{\partial h}{\partial s} + V\frac{\partial V}{\partial s} + \frac{\partial V}{\partial t} + gJ + \frac{g}{c}V\sin\vartheta = 0 \qquad (1)$$

$$\frac{g}{c}V\frac{\partial h}{\partial s} + c\frac{\partial V}{\partial s} + \frac{g}{c}\frac{\partial h}{\partial t} = 0 \qquad (2)$$

where $t$ is the time, $V$ is the velocity averaged over the pipe cross-section, $h$ is the water head, $g$ is the
acceleration due to gravity, $c$ is the celerity of pressure waves, $\vartheta$ is the slope of the pipeline, while $J = J_s + J_u$ represents the head loss per unit length due to steady and unsteady friction, respectively. The steady friction contribution is calculated according to the classical Darcy–Weisbach equation:

$$J_s = \frac{f}{D}\,\frac{V|V|}{2g} \qquad (3)$$

where $f$ is the Darcy–Weisbach friction factor, calculated dynamically at each time step, and $D$ is the pipe diameter. On the other hand, $J_u$, according to the formulation of Brunone et al., later modified by Vítkovský et al., can be calculated according to:

$$J_u = \frac{k}{g}\left(\frac{\partial V}{\partial t} + c\,\varphi_A\frac{\partial V}{\partial s}\right) \qquad (4)$$

where $k$ is a coefficient obtained dynamically as a function of the flow regime, as will be shown in the following, while $\varphi_A$ is a coefficient depending on the sign of the convective acceleration. Specifically, $\varphi_A = +1$ if $V(\partial V/\partial s) \ge 0$, and $-1$ if $V(\partial V/\partial s) < 0$.

Introducing Equation (4) into Equation (1) and applying the MOC, the partial differential momentum and continuity equations can be transformed into ordinary differential equations, known as compatibility equations:

$$(1+k)\frac{dV}{dt} + \alpha\frac{g}{c}\frac{dh}{dt} + gJ_s + \alpha\frac{g}{c}V\sin\vartheta = 0 \qquad (5)$$

$$(1+k)\frac{dV}{dt} - \beta\frac{g}{c}\frac{dh}{dt} + gJ_s - \beta\frac{g}{c}V\sin\vartheta = 0 \qquad (6)$$

where $\alpha$ and $\beta$ are $(k + 2 - k\varphi_A)/2$ and $(k + 2 + k\varphi_A)/2$, respectively.

The compatibility equations are valid along the proper positive and negative characteristic lines that, introducing the unsteady friction model, read:

$$C^+:\quad \frac{ds}{dt} = +\frac{c}{\alpha} \qquad (7)$$

$$C^-:\quad \frac{ds}{dt} = -\frac{c}{\beta} \qquad (8)$$

In the proposed numerical model, the coefficient $k$ was calculated at each time step following the Vardy & Brown formulation, given by:

$$k = \frac{\sqrt{C^*}}{2} \qquad (9)$$

with

$$C^* = \begin{cases} 0.0476 & Re < 2500 \\ \dfrac{7.41}{Re^{\log\left(14.3/Re^{0.05}\right)}} & Re > 2500 \end{cases} \qquad (10)$$

where $Re$ is the Reynolds number.

In order to study the transient flow in the water distribution network, the MOC is combined with the proper boundary conditions. A constant water head is imposed to all the reservoirs feeding the network, thus water levels remain constant during the filling process. Coherently with the assumption of atmospheric air pressure in the pipelines network, the water head at the front face of partially filled pipes is equal to zero.

Equations (5) and (6) can be solved through the finite difference technique. Following the notation used in Figure 2, these equations read:

$$\frac{1+k}{\alpha}\,\frac{c}{g}\left(V_j^{i,n+1} - V_{jm}^{i,n}\right) + \left(h_j^{i,n+1} - h_{jm}^{i,n}\right) + \left[\frac{c}{\alpha}J_{s,jm}^{i,n} + V_{jm}^{i,n}\sin\vartheta_i\right]\Delta t_i = 0 \qquad (11)$$

$$\frac{1+k}{\beta}\,\frac{c}{g}\left(V_j^{i,n+1} - V_{jv}^{i,n}\right) - \left(h_j^{i,n+1} - h_{jv}^{i,n}\right) + \left[\frac{c}{\beta}J_{s,jv}^{i,n} - V_{jv}^{i,n}\sin\vartheta_i\right]\Delta t_i = 0 \qquad (12)$$

where $V_j^{i,n+1}$ and $h_j^{i,n+1}$ are the velocity and the water head in the $j$-th section (of abscissa $(j-1)L_i/N_i$) of the $i$-th pipe at the time step $t^n + \Delta t$; $\vartheta_i$ is the slope of the $i$-th pipe; and $jm$ and $jv$ are the sections upstream and downstream of the $j$-th section, respectively.

The time step advancement $\Delta t_i^n$, function of the length and of the celerity of the $i$-th pipe, is calculated for each
pipe and then the minimum value is chosen as the single integration time step:

$$\Delta t^n = \min_i \Delta t_i^n = \min_i\left(\frac{L_i^n}{N_i c_i}\right) \qquad (13)$$

Figure 2 | Space-time scheme for two pipes of different diameter with the same number of sections $N$ and different time step $\Delta t_i$.

When the velocity of the water front $V_j^{i,n+1}$ is calculated, the filling process is updated according to:

$$L_i^{n+1} = L_i^n + V_N^{i,n+1}\Delta t \qquad (14)$$

where $L_i^{n+1}$ is the length of the water column inside the partially empty $i$-th pipeline at the time $t^{n+1}$.

The compatibility equations for the pipelines connected to the node are resolved together with the continuity equation at each junction node, and the discharge provided to user tanks is calculated as a function of the water head. Specifically, the discharge $Q_{j,up}$ at the $j$-th node entering the tank connected to the node can be obtained as:

$$Q_{j,up} = C_v\,a\sqrt{2g\left(h_j^i - h_{j,tank}\right)} \qquad (15)$$

where $C_v$ is the non-dimensional float valve emitter coefficient, $a$ is the valve effective discharge area, $g$ is the gravity acceleration, $h_j^i$ is the water head at the $j$-th node and $h_{j,tank}$ is the height of the private tank. Although more complex methods were considered in the past to relate coefficients $C_v$ and $a$ to valve-opening rates, here constant values were used for both of the coefficients (Criminisi et al.) which have been calibrated experimentally, as discussed in the following paragraphs. Equation (15) can be used to calculate the discharge at nodes only when the floating valve is open, i.e., while the user tank is not entirely filled. Thus, this equation must be combined with the tank continuity equation, which can be written as:

$$\begin{cases} Q_{j,up} - D_j = \dfrac{dW_j}{dt} = A\dfrac{dH_j}{dt} & \text{for } H_j < H_{j,\max} \\ Q_{j,up} = 0 & \text{for } H_j \ge H_{j,\max} \end{cases} \qquad (16)$$

where $D_j$ is the user water demand at the $j$-th node, $W_j$ is the volume of the storage tank connected to the node having area $A$, $H_j$ is the tank water level, and $H_{j,\max}$ is the maximum allowed water level in the tank (before the floating valve closes). Further details on the numerical model can be found in De Marchis et al.

The valve model

The dynamic analysis of PRVs follows the formulation and experimental analysis provided by Prescott & Ulanicki, in which the derivative of the opening of the valve $x_m$ is proportional to the difference between a given set point $h_{set}$ and the current outlet pressure $h_{out}$ (Figure 3). The PRVs are located at some nodes of the network. Assuming an initial value of $x_m = x_{m0}$, the valve capacity $C_v$ is calculated using the following equation:

$$C_v = 0.021 - 0.0296\,e^{-51.1x_{m0}} + 0.0109\,e^{-261x_{m0}} - 0.0032\,e^{-683.2x_{m0}} + 0.0009\,e^{-399.5x_{m0}} \qquad (17)$$
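Equation (17) is simple to evaluate numerically. The following is a minimal sketch in Python, given only as an illustration and not as the authors' code; the function name `prv_capacity` is hypothetical, and the opening `x_m` is assumed to be expressed in the same units as in Equation (17).

```python
import math


def prv_capacity(x_m: float) -> float:
    """Valve capacity Cv as a function of the valve opening x_m, Equation (17).

    The coefficients are those printed in Equation (17); the units of x_m
    are assumed to match those used in the source equation.
    """
    return (0.021
            - 0.0296 * math.exp(-51.1 * x_m)
            + 0.0109 * math.exp(-261.0 * x_m)
            - 0.0032 * math.exp(-683.2 * x_m)
            + 0.0009 * math.exp(-399.5 * x_m))


if __name__ == "__main__":
    # Tabulate the capacity for a few illustrative openings.
    for x in (0.0, 0.005, 0.01, 0.02):
        print(f"x_m = {x:.3f}  ->  Cv = {prv_capacity(x):.6f}")
```

With these coefficients the constant term balances the four exponential terms at zero opening, so the capacity of a fully closed valve is zero, and it tends towards 0.021 as the opening grows.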


Figure 3 | The PRV scheme and governing flows.

So, knowing the incoming flow ($q_m$) and the inlet head ($h_{in}$) of the PRV, the PRV outlet head ($h_{out}$) can be determined with the equation:

$$h_{out} = h_{in} - \frac{q_m^2}{\left[C_v(x_{m0})\right]^2} \qquad (18)$$

Equation (18) relates the flow through the PRV to the head loss across it and the opening. Valve opening and closing are controlled by the pilot circuit allowing a flow $q_3$ to fill the valve control space.

The inflow to the control space of the PRV is dependent on the difference between the outlet head and the valve setting. The valve is characterized by two parameters $\alpha_{open}$ and $\alpha_{close}$, fixed to $1.1 \times 10^{-6}$ and $10 \times 10^{-6}$ m²/s respectively in the present study following the experimental results of Prescott & Ulanicki. The two parameters determine the opening and closing celerity of the valve and, at the same time, its sensitivity and reactivity to pressure fluctuation. The inflow $q_3$ to the control space is calculated as follows:

$$q_3 = \begin{cases} \alpha_{open}\left(h_{set} - h_{out}\right) & \text{if } \dot{x}_m \ge 0 \\ \alpha_{close}\left(h_{set} - h_{out}\right) & \text{if } \dot{x}_m < 0 \end{cases} \qquad (19)$$

where

$$\dot{x}_m = \frac{dx_m}{dt} \qquad (20)$$

Another function of valve opening ($x_m$) is the cross-sectional area $A_{cs}$ of the control space, which is determined using the equation:

$$A_{cs} = \frac{1}{3700\left(0.02732 - x_{m0}\right)} \qquad (21)$$

Calculating $q_3$ with Equation (19), a new value $x_m$ can be estimated representing the adaptation of valve opening to seek the required set point $h_{set}$:

$$x_m = \frac{q_3\,\Delta t}{A_{cs}(x_{m0})} + x_{m0} \qquad (22)$$

The system of Equations (19)–(22) requires an iterative resolution because the inflow $q_3$ depends on $h_{out}$ by means of Equation (19), $h_{out}$ depends on the valve opening $x_m$, and $x_m$ in turn depends on $q_3$ according to Equation (22). The system has to be solved making an initial hypothesis on valve opening $x_{m0}$ and then solving the equations in order until a new value of $x_m$ is obtained in Equation (22). The iterations are continued until the difference between $q_{3,i}$ (with $i$ being the $i$-th iteration) and $q_{3,i-1}$ is less than an established tolerance that was assumed equal to 0.1% in the present study.

The case study

The model has been applied on one of the 17 distribution networks of Palermo city (Sicily). The network is fed by two tanks at different levels, that can store up about 40,000 m3 per day, and supply around 35,000 inhabitants (8,700 users).
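As an aside on the valve model applied to this network, the iterative resolution of Equations (17)–(22) can be sketched as a fixed-point loop over one time step. The Python sketch below is only an illustration under stated assumptions, not the authors' implementation: it assumes that the valve is opening whenever the set point is at or above the outlet head, that the control-space area is evaluated at the initial opening as written in Equation (22), and that the opening is clamped to a range (`X_MIN`, `X_MAX`) chosen here only to keep Equations (17) and (21) well defined. All function names and sample inputs are hypothetical.

```python
import math

ALPHA_OPEN = 1.1e-6   # m^2/s, value reported in the text
ALPHA_CLOSE = 10e-6   # m^2/s, value reported in the text
X_LIMIT = 0.02732     # opening at which Equation (21) becomes singular
X_MIN = 1e-3          # hypothetical lower clamp (keeps Cv > 0)
X_MAX = X_LIMIT - 1e-4  # hypothetical upper clamp


def capacity(x):
    """Valve capacity Cv, Equation (17)."""
    return (0.021 - 0.0296 * math.exp(-51.1 * x)
            + 0.0109 * math.exp(-261.0 * x)
            - 0.0032 * math.exp(-683.2 * x)
            + 0.0009 * math.exp(-399.5 * x))


def outlet_head(h_in, q_m, x):
    """Outlet head h_out, Equation (18)."""
    return h_in - q_m ** 2 / capacity(x) ** 2


def control_flow(h_set, h_out):
    """Inflow q3 to the control space, Equation (19).

    Assumption: the valve is opening (x_dot >= 0) when h_set >= h_out.
    """
    dh = h_set - h_out
    return (ALPHA_OPEN if dh >= 0.0 else ALPHA_CLOSE) * dh


def control_area(x):
    """Cross-sectional area of the control space, Equation (21)."""
    return 1.0 / (3700.0 * (X_LIMIT - x))


def prv_step(x_m0, h_in, q_m, h_set, dt, tol=1e-3, max_iter=50):
    """Advance the valve opening over one time step dt.

    Returns (x_m, h_out, q3). Iterations stop when the relative change
    of q3 falls below tol (0.1% in the text) or after max_iter passes.
    """
    x = x_m0
    q3 = 0.0
    q3_prev = None
    for _ in range(max_iter):
        h_out = outlet_head(h_in, q_m, x)
        q3 = control_flow(h_set, h_out)
        # Equation (22), with the area evaluated at the initial opening.
        x = x_m0 + q3 * dt / control_area(x_m0)
        x = min(max(x, X_MIN), X_MAX)
        if q3_prev is not None and abs(q3 - q3_prev) <= tol * max(abs(q3), 1e-30):
            break
        q3_prev = q3
    return x, outlet_head(h_in, q_m, x), q3


if __name__ == "__main__":
    # Hypothetical inputs: outlet head above the set point, so the valve closes.
    print(prv_step(x_m0=0.01, h_in=40.0, q_m=0.004, h_set=38.5, dt=0.1))
```

In a network simulation, a step of this kind would be called once per valve and per time step, with `h_in` and `q_m` taken from the characteristic solution at the valve node; that coupling is not shown here.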

It has been designed to deliver about 400 l/capita/d, but the actual mean consumption is about 260 l/capita/d. Pipes are made of polyethylene and their diameters range from 110 to 225 mm (Figure 4). Additional details on the analyzed network can be found in De Marchis et al.

Figure 4 | Case study network scheme.

The system is supplied on a daily basis because the high level of leakages (around 25%) does not allow the manager to supply the network continuously with the available water resources. Considering that intermittent supply was historically a common practice, especially during summer, all the users are supplied via tanks having volume equivalent to two days’ consumption. In the present paper, the aim was the evaluation of the impact of intermittent supply on water resources distribution among users. For this reason, leakages were simply proportionally divided according to the node demand because exact positions of leakages were not known.

The system is monitored by six pressure cells and two electromagnetic flow meters (Figure 4). Data have been provided on an hourly basis almost continuously since 2001, and the network hydraulic model calibration is continuously updated when new data become available (Criminisi et al.). The pressure data used for model calibration have a time resolution of five minutes and were taken from the period between June and October 2002, during which the network was managed by intermittent supply on a daily basis. The pressure time series were available at each of the six pressure gauges and used to represent the filling and the emptying processes. For the same period, flow data entering the network were available with the same temporal resolution.

The current configuration of the network is characterized by significant inequality in the distribution of water resources during intermittent supply. As demonstrated in De Marchis et al., the users in the lower part of the network, characterized by the lowest geodetic elevation, can access water resources soon after the beginning of a service period and they are able to fill up their tanks that were emptied in the period of service unavailability. At the same time, the users in the upper part of the network have to wait for the advantaged users to collect water resources and pressure over the network to rise in order to begin filling their tanks. As discussed in the introduction, the definition of PMAs can help in the reduction of inequalities among advantaged and disadvantaged users. In the present study, two configurations were considered dividing the district into two and four PMAs. In Scenario A, the network was divided into two approximately equal parts (Figure 5(a)). In Scenario B, the network was divided into four PMAs increasing the number of valves introduced in the system and the number of closed pipes (Figure 5(b)). The two configurations were chosen based on the original design of the network in which the district is divided into four areas that can be isolated for maintenance purposes. In the present application, some of the existing static section valves are simply substituted by PRVs.

ANALYSIS OF RESULTS

The model was initially calibrated according to the pressure profiles available during the monitoring period from the six pressure gauges located in the network. The results, not shown here, can be found in De Marchis et al., where a section dedicated to the model calibration for the same case study is reported. The presented model was used to analyze and compare different configurations of PMAs in order to reduce inequalities between users accessing and collecting water resources considering the relevant role of private tanks. In the analysis of results, Scenario 0 (i.e. the current situation with one network district with no pressure control) was compared with the two proposed PMA scenarios. The effectiveness of district definition was evaluated by means of pressure levels in the network and by means of the water
volume supplied to the users at different moments of the service day. In all the scenarios, the simulation starts with the reactivation of service during intermittent distribution on a daily basis. Because of user water consumption during the day before, at the beginning of the simulation, all the private tanks are almost empty and their supply valves are fully open.

Figure 5 | Position of the PRVs and closed pipes on network mains: Scenario A (a) and Scenario B (b). The blue lines define the boundaries of the PMAs. Please refer to the online version of this paper to see this figure in colour: http://www.iwaponline.com/jh/toc.htm.

Figure 6 shows the comparison between the water head variation in time obtained in the three different scenarios analyzed here. The first 7 h of the dynamic filling process are analyzed in four different nodes of the network. Specifically, the pressures at nodes 42, 109, 165 and 249 were plotted. These nodes were chosen to be representative of the effects of the PRVs in the different PMAs, as can be observed in Figure 5 where the nodes were shown to improve the clarity. Figure 6 shows pressure levels in four nodes of the network: initially the pressure is null and the pipes are empty. The time taken for the filling process is different for each of the four nodes monitored. In the disadvantaged node (Figure 6(c)), the transient period of the filling process can be protracted for almost 1 h from the beginning of the simulation. For details about these inequities, see De Marchis et al. The static level of the supply tank is equal to 48 m above mean sea level.

Figure 6(a) shows that, when the PRVs are activated (Scenarios A and B) in the upper part of the network, an increase of the pressure is achieved, with respect to Scenario 0. Furthermore, the increase of the pressure is higher in Scenario B, where several PRVs were activated to reduce the inequality between the users. On the other hand, Figure 6(b) shows the reduction of the pressure at node 109 located in the lower part of the network. Also at this node, in Scenario B the water head reduction is higher than that registered when two PRVs were activated (Scenario A). Due to the fact that a pressure driven model is used to calculate the discharge entering the users’ tanks, the increase of the pressure in the disadvantaged nodes and the reduction in the advantaged ones reduces the inequalities in the water supply. Figure 6(c) shows that in the most disadvantaged nodes, located either in the highest part of the district or in the farther part of the network from the inlet node, in order to reduce the inequalities it is necessary to divide the district into four PMAs. The water heads obtained in Scenario A, in fact, are equal to those in Scenario 0. Finally, Figure 6(d) shows that in the nodes located near to the inlet node the three scenario profiles are very similar, with negligible differences in water head distribution.

Figure 6 | Pressure level variation in time in four nodes: (a) 42; (b) 109; (c) 165; (d) 249. In panels (a) and (b) the horizontal line represents the static level of water supply equal to 48 m.

Figure 7 shows pressure levels in the network after 3 h in the three selected scenarios. The separation between the different PMAs is clear and the average pressure in the network progressively decreases by implementing two and four different districts: in Scenario 0, the average pressure head is 22.1 m, decreasing to 21.4 m in Scenario A and 20.7 m in Scenario B. More interestingly, the standard deviation drops from 6.5 m in Scenario 0 to 5.4 m and 4.6 m, respectively in Scenario A and Scenario B. This fact confirms a more uniform distribution of pressures over the network and, considering that the majority of the uses are head driven because of private tanks, a more uniform distribution of resources.

Figure 7 | Pressure levels in the network after 3 hours: (a) Scenario 0; (b) Scenario A; (c) Scenario B.

This consideration is confirmed by looking at water head after 3 h (Figure 7) and at supplied water volumes after 5 h (Figure 8). The percentage of users able to collect the totality of their daily demand after 3 h drops from 14% to 10% and
4% in Scenarios A and B. After 5 h in Scenario 0, one quarter of the users’ tanks were filled while only 20% and 13% have completed their supply in the two PMA scenarios. The implementation of PMAs has a more relevant impact on users unable to be supplied: after three hours, 45% of users are unable to be supplied in Scenario 0 and this number is reduced to 38% and 29% in Scenario A and B, respectively; after five hours, the number of non-supplied users is still high (39%) while it is reduced to 29% and 18% in the two PMA scenarios.

Figure 8 | Volume supplied to the users after 5 hours: (a) Scenario 0; (b) Scenario A; (c) Scenario B.

CONCLUSIONS

In the study, a dynamic mathematical model for intermittent networks was integrated with a PRV model in order to simulate management actions for reducing inequalities between users in their access to water resources. The model was demonstrated to be robust and to correctly represent the application of several valves in the network showing the impact of such choices on network pressure and on water supply distribution. From a practical perspective, the creation of PMAs has a relevant impact on intermittent networks helping the reduction of inequalities between users accessing and collecting water resources. The presence of private tanks helps advantaged users to collect as much water as possible in a few hours after the restoration of service; at the same time, several users are unable to collect water because pressure in the network is too low. The introduction of PMAs mitigates this problem by reducing the differences of pressure between different points of the network. The introduction of the valves reduces the differences between water collected by users in the first part of the service day even if inequalities still remain. The analysis demonstrated that PMAs can help move towards having equal distribution of water resources during intermittent service but further analyses are needed to implement an optimal distribution of valves in order to reduce the different distribution of water supply between users. The impact of valves on the network is not easily predictable without the use of dynamic models as the presented analysis has demonstrated. Some parts of the network are unaffected by the presence of the valves because they are dominated by the proximity of network inlets. The introduction of valves has a pervasive impact on the network, cutting pressure downstream of the valve, but also increasing pressures in the upper part of the network due to the compensation of flow distribution in the network.
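The scenario statistics quoted in the analysis of results can be condensed into relative changes with respect to Scenario 0. The short Python sketch below is only a worked-arithmetic illustration: the input numbers are those reported in the text, while the coefficient of variation (standard deviation divided by mean pressure head) is an auxiliary measure introduced here for illustration and is not used by the authors; the names `SCENARIOS` and `summarize` are hypothetical.

```python
# Values reported in the analysis of results (pressure after 3 h, users
# unable to be supplied after 3 h and after 5 h).
SCENARIOS = {
    "Scenario 0": {"mean_m": 22.1, "std_m": 6.5, "unsupplied_3h": 45.0, "unsupplied_5h": 39.0},
    "Scenario A": {"mean_m": 21.4, "std_m": 5.4, "unsupplied_3h": 38.0, "unsupplied_5h": 29.0},
    "Scenario B": {"mean_m": 20.7, "std_m": 4.6, "unsupplied_3h": 29.0, "unsupplied_5h": 18.0},
}


def summarize(data, baseline="Scenario 0"):
    """Return per-scenario indicators relative to the baseline scenario."""
    base = data[baseline]
    summary = {}
    for name, s in data.items():
        summary[name] = {
            # Illustrative dispersion measure, not taken from the paper.
            "coeff_variation": s["std_m"] / s["mean_m"],
            # Relative change of the pressure standard deviation, in percent.
            "std_change_pct": 100.0 * (s["std_m"] - base["std_m"]) / base["std_m"],
            # Change in the share of non-supplied users, in percentage points.
            "unsupplied_3h_change_pts": s["unsupplied_3h"] - base["unsupplied_3h"],
            "unsupplied_5h_change_pts": s["unsupplied_5h"] - base["unsupplied_5h"],
        }
    return summary


if __name__ == "__main__":
    for name, row in summarize(SCENARIOS).items():
        print(name, {k: round(v, 3) for k, v in row.items()})
```

Read this way, the standard deviation of the pressure head falls by roughly 17% with two PMAs and by roughly 29% with four, while the mean pressure head changes by only a few percent.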

ACKNOWLEDGEMENTS The authors would like to acknowledge the Italian Research Project ‘POR FESR Sicily 2007-2013 – Measure 4.1.1.1 SESAMO – SistEma informativo integrato per l’acquisizione, geStione e condivisione di dAti aMbientali per il supportO alle decisioni’ for providing financial support to the presented research.

REFERENCES Arregui, F., Cabrera Jr, E. & Cobacho, R.  Integrated Water Meter Management. IWA Publishing, London. Bergant, A., Vitkovsky, J., Simpson, A. & Lambert, M.  Valve induced transients influenced by unsteady pipe flow friction. Proc. 10th Int. Meeting of the IAHR Workgroup on the Behaviour of Hydraulic Machinery under Steady Oscillatory Conditions, IAHR, Madrid, Spain. Brunone, B. & Morelli, L.  Automatic control valve-induced transients in an operative pipe system. J. Hydraul. Eng. 125 (5), 534–542. Brunone, B., Golia, U. M. & Greco, M.  Some remarks on the momentum equations for fast transients. Hydraulic transients with column separation (9th and last round table of the IAHR Group), IAHR, Valencia, Spain, pp. 201–209. Criminisi, A., Fontanazza, C. M., Freni, G. & La Loggia, G.  Evaluation of the apparent losses caused by water meter under-registration in intermittent water supply. Water Sci. Technol. 60 (9), 2373–2382. Cubillo, F.  Impact of end uses knowledge in demand strategic planning for Madrid. Water Sci. Technol.: Water Supply 5 (3–4), 233–240. De Marchis, M., Fontanazza, C. M., Freni, G., La Loggia, G., Napoli, E. & Notaro, V.  A model of the filling process of an intermittent distribution network. Urban Water J. 7 (6), 321–333. De Marchis, M., Fontanazza, C. M., Freni, G., La Loggia, G., Napoli, E. & Notaro, V.  Analysis of the impact of


intermittent distribution by modelling the network-filling process. J. Hydroinf. 13 (3), 358–373. Fontanazza, C. M., Freni, G., La Loggia, G., Notaro, V. & Puleo, V.  A composite indicator for water meter replacement in an urban distribution network. Urban Water J. 9 (6), 419–428. Freni, G., De Marchis, M., Dalle Nogare, G. & Napoli, E.  Implementation of pressure reduction valves in a dynamic water distribution system numerical model. Proceedings of the 10th International Conference on Hydroinformatics, Hamburg, July 14–18. Giustolisi, O.  Considering actual pipe connection in WDN analysis. J. Hydraul. Eng. 136 (11), 889–900. Giustolisi, O., Berardi, L. & Laucelli, D.  Generalizing WDN simulation models to variable tank levels. J. Hydroinf. 14 (3), 562–573. Giustolisi, O., Kapelan, Z. & Savic, D. A. a An algorithm for automatic detection of topological changes in water distribution networks. J. Hydraul. Eng. 134 (4), 435–446. Giustolisi, O., Savic, D. A. & Kapelan, Z. b Pressure-driven demand and leakage simulation for water distribution networks. J. Hydraul. Eng. 134 (5), 626–635. Hardoy, J. E., Mitlin, D. & Satterthwaite, D.  Environmental Problems in an Urbanizing World: Finding Solutions for Cities in Africa, Asia and Latin America. Earthscan, London. Liou, C. P. & Hunt, W. A.  Filling of pipelines with undulating elevation profiles. J. Hydraul. Eng. 122 (10), 534–539. National Research Council, Committee on Public Water Supply Distribution  Drinking Water Distribution Systems: Assessing and Reducing Risks. National Academies Press, Washington, DC. Prescott, S. L. & Ulanicki, B.  Dynamic modelling of pressure reducing valves. J. Hydraul. Eng. 129 (10), 804–812. Prescott, S. L. & Ulanicki, B.  Improved control of pressure reducing valves in water distribution networks. J. Hydraul. Eng. 134 (1), 56–65. Todini, E.  A more realistic approach to the ‘extended period simulation’ of water distribution networks. 
In: Proc. Of CCWI2003, Advances in Water Supply Management (C. Maksimovic, D. Butler & F. A. Memon, eds). A.A. Balkema Publishers, Lisse, pp. 173–184. Vairavamoorthy, K., Akinpelu, E., Lin, Z. & Ali, M.  Design of sustainable system in developing countries. Proceedings of the World Water and Environmental Resources Challenges, Environmental and Water Resources Institute of ASCE, Orlando, Florida, 20–24 May 2001. Vardy, A. E. & Brown, J. M. B.  Transient turbulent friction in smooth pipe flows. J. Sound Vib. 259 (5), 1011–1036. Vítkovský, J. P., Bergant, A., Simpson, A. R. & Lambert, M. F.  Systematic evaluation of one-dimensional unsteady friction models in simple pipelines. J. Hydraul. Eng. 132 (7), 696–708.

First received 21 March 2013; accepted in revised form 5 July 2013. Available online 23 August 2013


© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

Improving applicability of neuro-genetic algorithm to predict short-term water level: a case study
Gooyong Lee, Sangeun Lee and Heekyung Park

ABSTRACT
This paper proposes a practical approach of a neuro-genetic algorithm to enhance its capability of predicting water levels of rivers. Its practicality has three attributes: (1) to easily develop a model with a neuro-genetic algorithm; (2) to verify the model at various predicting points with different conditions; and (3) to provide information for making urgent decisions on the operation of river infrastructure. The authors build an artificial neural network model coupled with the genetic algorithm (often called a hybrid neuro-genetic algorithm), and then apply the model to predict water levels at 15 points of four major rivers in Korea. This case study demonstrates that the approach can be highly compatible with the real river situations, such as hydrological disturbances and water infrastructure under emergencies. Therefore, proper adoption of this approach into a river management system certainly improves the adaptive capacity of the system.

Key words | four-river remediation project, genetic algorithm, hybrid neuro-genetic, neural network, practical approach, water level prediction

Gooyong Lee
Heekyung Park (corresponding author)
Korea Advanced Institute of Science and Technology (KAIST), 335 Gwahangno, Yuseong-gu, Daejeon 305-701, Republic of Korea
E-mail: hkpark@kaist.ac.kr

Sangeun Lee
International Centre for Water Hazard and Risk Management under the Auspices of UNESCO (ICHARM), 1-6 Minamihara, Tsukuba-shi, Ibaraki-ken, 305-8516, Japan

doi: 10.2166/hydro.2013.011

INTRODUCTION

Background

For sustainable water resources management, many countries often initiate and develop huge river remediation projects, e.g., the ‘Tennessee Valley Authority Act (1993–2012)’ in the USA, and the ‘Isar River Remediation Project (2000–2011)’ in Germany. Korea has four major rivers. Their slopes are relatively steep, and stream flows differ vastly from month to month. Thus, people living near the rivers have repeatedly suffered from chronic problems such as flood, drought, stream depletion, and low water quality. In addition, many experts (e.g., NIMR ; MLTM , ; Lee & Park ) argue that Korean river management in the future will be much more vulnerable to climate change than in the present, projecting the increase of heavy rainfalls in the wet season and the duration of drought in the dry season. To solve these problems and to provide full amenities for the inhabitants, the Ministry of Land, Transport and Maritime Affairs (MLTM) launched a nationwide, large-scale project named the Four River Remediation Project (FRRP) in 2009. With total expenses approximately amounting to a tenth of the annual national budget, the MLTM constructed a variety of water infrastructures such as reservoirs, weirs, dikes, wetlands, and eco-parks up to 2011.

However, successful river management cannot be ensured by these structural measures. Considering ‘non-stationarity’ (Milly et al. ) and ‘no basis for probabilities’ (Foley ; Cha et al. ), it is necessary to supplement adaptive capacity with nonstructural measures from a perspective featuring both economics and reliability. As mentioned by Lee et al. (), when the capacity of water infrastructure exceeds a certain level, priorities should be given to better predicting and monitoring hydrological changes, arranging emergency options sufficient to cover a wide range of extreme events, and making timely and adequate decisions. In these regards, this study was initiated to more accurately predict the water levels at various measurement points of the rivers in FRRP. The authors also took into account that adaptive capacity can be further enhanced if the prediction models reveal information about how to operate water infrastructure constructed by the FRRP in order to maintain the water levels within desirable ranges.

When Korean river managers make plans for the operation of a reservoir or a weir, the preferred way to forecast the water level at a point is to select one hydrological simulation model or to combine several, e.g., SWAT (Soil and Water Assessment Tool developed by the US Department of Agriculture), WEAP (Water Evaluation and Planning System developed by the Stockholm Environment Institute), PRMS (Precipitation-Runoff Modeling System developed by the US Geological Survey), and HEC-RAS (Hydrologic Engineering Center's River Analysis System developed by the US Army Corps of Engineers). These simulation models are structured with governing equations and parameters and have been regarded as most adequate to describe physical processes related to the rain–runoff relation, especially within the hydrologic communities. The authors also agree that the simulation models are the best in achieving long-term prediction (for more detail, see Leavesley () and Solomatine ()). However, when river managers are interested in real-time or short-term prediction, there are two serious limitations. One is the demand for long records and numerous kinds of data to calibrate the model parameters, and the other is the excessive time and effort needed to build and run the models (Grayson et al. ; Chang et al. ). These limitations definitely discourage river managers from using simulation results in making decisions upon the operation of water infrastructures.

Objectives of the study

This study is based on the viewpoint that the conventional numerical models to predict the water level are not adequate to draw out the full potential of newly constructed water infrastructures because they entail a number of assumptions and demand numerous kinds of data (Chau ; Chang et al. ). Hence, as an alternative method, the authors intended to examine the systemic approach of using an artificial neural network (ANN). Indeed, the ANN has become the most popular in various system engineering communities (Joo et al. ; Choi & Park ; Robert ) when models need good performance and fast calculation in short-term or real-time prediction. In studies on water resource management, many experts (Karunanithi et al. ; Imrie et al. ; Toth et al. ; Cameron et al. ; Gavin et al. ; Kisi & Asce ; Zhengfu & Fernando ) found several merits of the ANN.

1. Fast prediction speed: the ANN conducts prediction through the direct relations between inputs and outputs, without necessitating the treatment of data in the geographical information system.
2. Low data requirements: many fewer input variables are needed than when simulation models are used. Those variables can be selected in a flexible manner according to previous literature, modelers’ experiences, new insights, and trial-and-error.
3. Better consideration of site characteristics: to explain highly complicated hydrological phenomena (or dynamic nature of the phenomenon at stake) of a certain site, a simulation model is usually calibrated to adjust its parameters. The ANN model’s parameters and structure can be adjusted to be more site-specific. This is a great advantage in modeling non-linear and unique site characteristics of the watershed of concern.

Despite such merits, it does not seem that ANN models are widely used in practice. The authors think that the models should be improved from the perspective of the real river management system to improve applicability. For example, according to the Korean River Management Guideline (K-water ), river managers set up the allowable range of water levels at each point and should maintain the water levels above the lower limits during dry seasons and below the upper limits during wet seasons. The managers are obliged to periodically make decisions on the amount flowing out from upstream weirs or reservoirs and then to request approval from the River Flow Control Office under MLTM. The authors suggest improving the ANN model as follows.

1. It should be easy and systematic to optimize the model: when the ANN model is used to predict the water level at a point, the modeler should consider many of the hydrological characteristics of the watershed basin. They are also required to know where the flow gauging stations and the weather stations are located in the upper stream, and how the locations of stations influence the ANN model. Therefore, it is usually difficult to determine the structure of the model. In many previous studies
using the ANN model, this determination was done by using informal trial-and-error methods, referring to the literature, or relying on personal experiences. Even if useful, these ‘subjective’ methods cannot be expressed explicitly, which is a significant barrier for engineers and managers in the field who require at least well-coded algorithms.
2. The information regarding water levels a couple of days later should be provided: the focus of previous studies on hourly predictions (see Filho & Santos ; Alvisi et al. ; Napolitano et al. ) is not helpful when considering the real practices articulated in the Korean River Management Guideline (K-water ). It usually takes 1 or 2 days to perform the management practices based on the prediction of water level to implement control action. Therefore, the ANN model must be able to predict the water level after this time lag of a couple of days.
3. The model must help determine discharges of the upstream weirs or reservoirs: the climate in Korea is characterized by frequent localized torrential rainfall and high coefficient of river regime. Under such climate, maintaining the water level between the upper and lower limits during both dry and wet seasons is critical to prevent natural disasters under extreme events. In this sense, it is important to examine whether the ANN model can clearly explain the relations of the upstream discharges and the downstream water level. If possible, the modeler will be able to conduct ‘what-if …’ tests, and then determine emergency actions more systematically.

To test the three hypotheses, this study examines the use of the ANN model at 15 points near the locations where weirs or reservoirs were constructed by the FRRP. This article is designed as follows. Following an introduction, the authors explain the methods including study areas, data collection, the ANN model and optimization algorithms, and the way to validate the model. Then, the authors present the results of case studies, which are used to verify the established hypotheses. This study summarizes that the investigated approach using the ANN model can successfully manage river systems, cope with emergencies, and raise the adaptive capacity of the river management system.

METHODS

Study areas

From previous studies, it is seen that a modeler optimizes the performances of the ANN model at a few points in a river, and strives to improve accuracy (Maier et al. ). This observation does not appear impressive to a river manager since the results were derived only from the several specific points. To gain a river manager’s confidence, it would be better to let them decide whether the existing models should really be replaced; it would be very important to ascertain that the ANN model provides good performances at multiple points in various rivers at a time. This study thus attempts to examine the applicability of the model at 15 points near the locations where weirs or reservoirs were newly constructed by the FRRP. The 15 points fall into four groups according to the rivers. The rivers have geographical and hydrological characteristics, as follows (see also Figure 1 and Table 1).

1. Han River: it flows through the northern part of Korea from the Gangwon province to Gyeonggi province via Seoul metropolitan city. The FRRP newly constructed three weirs in a section, from the Chungju dam to the Paldang Lake, of the main stream. The authors selected two water level measurement points within this section, as shown in Figure 1(a). It seems very likely that the water level at point 1 will be significantly affected by the discharges of the Chungju dam while the water level at point 2 is greatly determined by the operation of the weirs. Point 2 is of national concern because the point is at the starting point of Paldang Lake which supplies raw water for approximately 20 million inhabitants living in Seoul metropolitan city and Gyeonggi province.
2. Geum River: it is located at the center of Korea and originates from the Jeollabuk province, and then flows out into the West Sea through the Chungcheongnam province and Chungcheongbuk province. The river is characterized by rising from many tributaries, i.e., 20 streams. The FRRP constructed three weirs at the section, 99 km in length, between the Daecheong dam and Geum River estuary dam. Within the section, four points on the main stream are used to predict the water levels, as shown in Figure 1(b).


Although point 1 is largely affected by the discharges of the Daecheong dam, it is very challenging for river managers to predict and control the water level because the influx of the Miho stream and the Gapcheon tributary at the front fills about half of the total flow in the main stream. For other points, which are placed behind point 1, the FRRP is likely to improve the ability to manage water quantity. Water levels at points 2, 3, and 4 are dominated by operation of weirs in the upper streams.

Figure 1 | Locations of the study areas.

3. Yeongsan River: it passes through the Jeollanam province in the south-western part of Korea, and flows out to the West Sea. The distinctive characteristic of this river is that the regime coefficient of the watershed basin is extremely high (1:682). This implies that flow rate differs vastly from season to season so that damage due to floods and droughts frequently occurs. In the section, 98 km long, where two weirs were installed by the project, there are four points available to estimate the
water levels, as in Figure 1(c). Among them, water levels at points 1 and 3 are affected by the operation of Seungchon and Juksan weirs, respectively. Point 2 is placed behind the confluence of the Jiseok stream having abundant flow, and thus the water level is influenced by the flow variation of Jiseok stream. Besides, point 4 is used to monitor the flow escaping into the West Sea.
4. Nakdong River: it originates from the Gangwon province and flows vertically through the Gyeongsangbuk province, Gyeongsangnam province, and Busan metropolitan city into the Southern Sea. It is the longest river in Korea. Accordingly, spatial variations in rainfall and flow are relatively large, and many inhabitants near the lower stream have suffered from floods and droughts almost every year even though five reservoirs for flood control were installed a long time ago. As a result, while planning the FRRP, the MLTM took note of these problems and constructed eight large-scale weirs. The construction projects were mainly implemented within the section, 277 km long, between Andong city and Busan metropolitan city. Although many points are available to estimate the water levels in this section, five points where flooding regularly occurs were selected as points of interest.

Table 1 | Geographical and hydrological characteristics

Study area | Catchment area (km²) | River length (km) | Annual mean temperature (°C) | Annual mean rainfall (mm) | Coefficient of river regime (a)
Han River | 26,018 | 481.7 | 10–11 | 1,200–1,300 | 1:393
Geum River | 9,912 | 394.8 | 11–12 | 1,100–1,300 | 1:299
Yeongsan River | 3,371 | 115.5 | 13 | 1,100–1,500 | 1:682
Nakdong River | 23,384 | 506.2 | 12–14 | 900–1,400 | 1:372

(a) The coefficient usually implies the ratio between the maximum and minimum of daily flow over an average year.

Model construction and calibration

An artificial neural network is a computational model mimicking the human brain that is constructed by huge networks connecting neurons, neural cells, and synapses. The ANN is not required to establish model structure with the full understanding of natural phenomena. The model rather relies on empirical training as humans normally do (Haykin ; Maier et al. ). Figure 2(a) exemplifies the general structure of an ANN model. A neuron corresponds to a node where $Z_n$ is an output from the nth layer, $Z_{n+1}$ is an output from the (n + 1)th layer, $w_n$ is a weight between nodes, $u_{n+1}$ is the sum of multiplying $Z_n$ by $w_n$, and $f_i(u_{n+1})$ is an activation function transferring the sum of node inputs into a node output. The activation function can have a logistic, hyperbolic-tangent, or linear form (Figure 2(b)). Several nodes form a layer, and several layers in turn form the whole ANN structure. Among a variety of ANN structures, the most popular one is the MLP (multi-layer perceptron), a feed-forward network with several layers (Haykin ; Dibike & Solomatine ). In several previous studies (Sahoo et al. ; Wang et al. ; Pulido-Calvo & Portela ), ANN with double hidden layers showed competent results. Therefore, in this study, both ANNs having a single hidden layer and double hidden layers are tested, and one is selected for the optimal ANN design at each study site.

In many ANN studies, the model structure has been selected through a trial-and-error method (Hsu et al. ; Zealand et al. ; Chiang et al. ). However, the results of optimizing the model largely depend on how appropriately the hidden layers and nodes are designed and how well the functions are defined. Thus, much effort should be given to the selection of model structure. Since the mid-1990s, there have been numerous attempts in water resource management communities to employ the genetic algorithm (GA) (see Savic et al. ; Giustolisi & Laucelli ; Giustolisi & Simeone ; Shamseldin et al. ). GA is a technique inspired by the principles of natural evolution and selection (see details in Holland () and Goldberg ()). In previous studies, the GA has been mainly used in the neural network, as follows: (1) selection of input variables; (2) adjustment of weights; (3) number of hidden layers in the MLP; (4) number of nodes in each layer; and (5) type of activation function in each node.
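As a minimal sketch of the node computation described above (Figure 2(a)), the following Python code evaluates the weighted sum $u_{n+1} = \sum w_n Z_n$ and the node output $Z_{n+1} = f(u_{n+1})$ for one layer in which each node may carry a different activation function (logistic, hyperbolic tangent, or linear). The function names, the bias-free form, the Lo/T/Li keys, and the example weights are illustrative assumptions and are not taken from the study.

```python
import math

# Activation functions named in the text. The keys are assumed abbreviations:
# Lo = logistic, T = hyperbolic tangent, Li = linear.
ACTIVATIONS = {
    "Lo": lambda u: 1.0 / (1.0 + math.exp(-u)),
    "T": math.tanh,
    "Li": lambda u: u,
}


def layer_output(z_prev, weights, kinds):
    """Return the outputs Z_{n+1} of one layer.

    z_prev  : outputs Z_n of the previous layer
    weights : weights[j][i] is the weight from previous node i to node j
    kinds   : activation key ("Lo", "T" or "Li") for each node j
    """
    outputs = []
    for w_row, kind in zip(weights, kinds):
        u = sum(w * z for w, z in zip(w_row, z_prev))  # u_{n+1}
        outputs.append(ACTIVATIONS[kind](u))           # Z_{n+1} = f(u_{n+1})
    return outputs


def forward(inputs, layers):
    """Feed-forward pass through a list of (weights, kinds) layers."""
    z = list(inputs)
    for weights, kinds in layers:
        z = layer_output(z, weights, kinds)
    return z


# Hypothetical network: 2 inputs -> 2 hidden nodes (Lo, T) -> 1 linear output.
example_layers = [
    ([[0.5, -0.5], [1.0, 1.0]], ["Lo", "T"]),
    ([[2.0, 1.0]], ["Li"]),
]
prediction = forward([0.0, 0.0], example_layers)
```

Representing a layer as a list of per-node activation keys mirrors the idea that the activation type is chosen node by node; a structure search only has to change the `kinds` lists and the shapes of `weights`.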


Figure 2 | General structure of the ANN model (MLP): (a) a node at the (n + 1)th layer; (b) typical activation functions; (c) multi-layered feed-forward neural network using double hidden layers.

Based on literature reviews (See & Openshaw ; Alvisi et al. ; Maier et al. ) and Korean river condition (K-water ), rainfall and upstream flow are used as input variables, and rainfall/flow gaging stations are selected through preliminary statistical test (i.e., significance test) considering the travel time (maximum 4 days at study sites (K-water )). The number of nodes at input layer is different at each prediction point because the Korean rivers have a different number of branches and lag-time (see Figure 1). Due to Korean climate characteristics (frequent localized torrential rainfall and high coefficient of river regime), it is difficult to construct a model using only input data for a certain period (e.g., dry and wet season). Therefore, the model is constructed by selecting representative points of annual precipitation and flow variations. To take into account the number of branches and the time window (t0, t0 − 1, t0 − 2), a maximum of 16 inputs are necessary. For example, prediction point #1 in Geum River (see Figure 1) is affected by eight points (four weather gaging stations, three flow gaging stations, and one dam). Lag-time between the dam and the furthermost flow gaging station is 1–2 days depending on flow speed. To take the lag-time into account, two types of time window (t0, t0 − 1) at the two points should be considered as model inputs for 1-day ahead prediction. As a result, the prediction point #1 in the Geum River has 10 model inputs. To avoid ‘curse of dimensionality and over-fitting in ANN’ (Haykin ; Giustolisi & Laucelli ), GA is coded to select minimum nodes at hidden layers. The number of nodes at a hidden layer is limited to a maximum of 32 following ‘a general principle that the node numbers of the hidden layers should be greater than the input layer nodes’ (Zeng & Wang ). Finally, 14 nodes on average are used in this study (similar to the number of input nodes). Weights are also anticipated to be well trained by the back propagation algorithm (BPA) solely as done by other ANN studies (Rumelhart et al. ; Rumelhart & McClelland ; Abebe & Price ; Robert ; Kisi & Asce ; Chau ; Chang et al. ; Napolitano et al. ; Mohanty et al. ). To take into account the high coefficient of river regime, the weights are adjusted to minimize error value for high variations of water level. Three criteria (Table 2) are used for validation.

Finally, the authors decided to use the GA to select: (1) the number of hidden layers; (2) the number of nodes in
each hidden layer; and (3) the type of activation function in each node. Overall model construction procedure is summarized in Figure 3. In the first part, input data are decided through three steps, ‘selection of water level forecasting point’, ‘selection of input point’, and ‘data collection’. In the second part, ANN is constructed using GA. The initial ANN structure consists of double hidden layers with the same number of nodes as in the input layer, and each node contains a random activation function. Then, the ANN model is constructed through iteration of three steps, selection, crossover, and mutation, and the probability of each step is 20, 60 and 20%, respectively. Before a new iteration starts, weights are adjusted by BPA. Finally, the 1st-rank ANN model of each generation is created and saved for selection of optimal ANN structures. The ANN structure with the highest coefficient of determination is selected as the optimal model. The number of generations is 50, and each generation contains 500 candidate models (population size).

Figure 3 | Overall model construction procedure in this study.

At the model calibration step, all the weights, $w_n$, need to be trained to minimize the sum of square errors between observed data and model outputs. As a result, dozens of training algorithms have been suggested so far although none of them ensure that the solution reaches a global minimum (for details, see Coulibaly et al. () and Mohanty et al. ()). Among those algorithms, the authors used the back propagation algorithm, first suggested by Rumelhart & McClelland (). The algorithm is widely known as an adequate method to train the MLP and, in particular, it is less sensitive to the noises or errors inherent in input data (Maier & Dandy ).

Table 2 | Criteria for testing validity of the model

Criteria | Purpose | Estimation
Coefficient of determination ($R^2$) | To evaluate the goodness-of-fit of models | $R^2 = \frac{\sum_{i=1}^{n} (P_i - \bar{O})^2}{\sum_{i=1}^{n} (O_i - \bar{O})^2}$
Mean square error (MSE) | To quantify the error of model | $\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (P_i - O_i)^2$
Mean absolute percent error (MAPE) | To compare the error between populations | $\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|P_i - O_i|}{O_i}$

Selection of data, input variables, and validation criteria

There were some limitations in availability of data when the constructed model is trained and validated. First, the length of data is a bit short compared to that in other studies. In a large portion of water level measurement points, the data obtained prior to the year 2004 turned out to be inconsistent. Second, the FRRP temporarily perturbed the quality of data. From 2009 to 2011, large-scale weirs were constructed, and dredging works were conducted in the study sites, which led to a slight change in the location of several measurement points and an increase in measurement errors. Therefore,
daily data were restricted to a 5-year period from 1 January 2004 to 31 December 2008. The total period is subdivided into 3 and 2 years to distinguish model training/validation from testing periods, respectively. The 3-year training/validation period (2004–2006) is divided into training and validation parts: if the data period starts from t0, even numbers (t0, t0 + 2, t0 + 4 …) are used as training data and odd numbers (t0 + 1, t0 + 3, t0 + 5 …) are used as validation data. This study satisfied the minimum data quantity for ANN construction suggested by Lawrence & Peterson (). All the input data are adjusted on a scale ranging from 0 to 1.

Table 2 represents the criteria for validating the ability of the model to predict the water level, in which $P_i$ is the values predicted by the ANN model, $O_i$ is the observed values, $\bar{O}$ is the average of observed values, and n is the number of samples. The three criteria, R², MSE, and MAPE, are widely used statistics, which refer to high validity in the constructed model as the statistics are closer to 1, 0, and 0, respectively.

RESULTS AND DISCUSSION

Selection of the model structure

Tables 3 and 4 show the results of building the ANN models, each listing the numbers of layers and nodes, and types of activation functions optimized by the GA. The results can be interpreted as follows.

1. Among the hidden layers included in all models, double layers amount to more than a third (37%). This gives an insight that there is high nonlinearity between input variables and water level, and among input variables (Haykin ).
2. In the first hidden layers, the number of nodes ranges from 6 to 32 (15 on average), and in the second hidden layers, this ranges from 2 to 17 (6 on average).
3. Hyperbolic tangent functions and logistic curves were dominantly selected as activation functions in hidden layers rather than linear functions. This agrees with Daliakopoulos et al. (), who stated that sigmoid-type functions ensure better performances on the ANN model.
4. In contrast, linear functions were selected for almost half of activation functions in the output layers, which corresponds to other experts’ experiences (cf. Abebe & Price ; Chang et al. ; Pulido-Calvo & Portela ).

Table 3 | Results of optimizing the model structure (1-day ahead water level)

Site | Point | Number of inputs | Number of hidden layers | AFs at 1st hidden layer (a) | AFs at 2nd hidden layer (a) | AF at output layer
Han River | #1 | 9 | 2 | 8Lo, 10T, 5Li | 1Lo, 1T, 2Li | T
Han River | #2 | 14 | 2 | 2Lo, 1T, 1Li | 2Lo, 1T | T
Geum River | #1 | 10 | 1 | 3T, 12Li | – | Li
Geum River | #2 | 6 | 2 | 11Lo, 20T, 1Li | 3Lo, 2T, 1Li | Li
Geum River | #3 | 11 | 1 | 2Lo, 5T | – | Li
Geum River | #4 | 9 | 1 | 1Lo, 2T | – | Li
Yeongsan River | #1 | 9 | 1 | 2Lo, 1T, 1Li | – | Lo
Yeongsan River | #2 | 11 | 2 | 16Lo, 8T, 3Li | 10Lo, 6T | T
Yeongsan River | #3 | 6 | 2 | 14T, 14Li | 2Lo, 2T | T
Yeongsan River | #4 | 9 | 1 | 1Lo, 20T, 4Li | – | Li
Nakdong River | #1 | 14 | 2 | 5Lo, 9T, 1Li | 2Lo, 2T, 2Li | T
Nakdong River | #2 | 13 | 2 | 5Lo, 4T, 5Li | 1Lo, 1T | T
Nakdong River | #3 | 14 | 1 | 2T, 1Li | – | T
Nakdong River | #4 | 14 | 1 | 1Lo, 1T | – | T
Nakdong River | #5 | 14 | 1 | 4Lo, 6T, 7Li | – | Li

(a) The expression ‘aLo, bT, cLi’ means that ‘a, b, c’ is the number of each function and ‘Lo, T, Li’ stand for logistic function, hyperbolic tangent function, and linear function in order. The sum of ‘a, b, and c’ is the total number of nodes at each layer.

Table 4 | Results of optimizing the model structure (2-day ahead water level)

Site | Point | Number of inputs | Number of hidden layers | AFs at 1st hidden layer (a) | AFs at 2nd hidden layer (a) | AF at output layer
Han River | #1 | 9 | 1 | 3Lo, 6T | – | 1Lo
Han River | #2 | 14 | 2 | 15Lo, 6T, 5Li | 1T, 1Li | 1Li
Geum River | #1 | 10 | 1 | 1Lo, 1Li | – | Li
Geum River | #2 | 6 | 1 | 1Lo, 1Li | – | Lo
Geum River | #3 | 11 | 1 | 2Lo, 2T, 1Li | – | Li
Geum River | #4 | 9 | 1 | 10Lo, 9T, 5Li | – | Li
Yeongsan River | #1 | 7 | 1 | 1Lo, 2T | – | Li
Yeongsan River | #2 | 11 | 2 | 10Lo, 9T, 1Li | 12Lo, 4T | T
Yeongsan River | #3 | 8 | 2 | 1Lo, 12T | 2Lo, 2T | T
Yeongsan River | #4 | 9 | 1 | 6Lo, 4T, 7Li | – | Li
Nakdong River | #1 | 14 | 1 | 15Lo, 9T, 4Li | – | Lo
Nakdong River | #2 | 13 | 2 | 15Lo, 5T, 10Li | 1Lo, 1T | T
Nakdong River | #3 | 14 | 1 | 7T, 4Li | – | Li
Nakdong River | #4 | 14 | 1 | 11T, 12Li | – | T
Nakdong River | #5 | 14 | 1 | 3Lo, 2T, 1Li | – | Lo

(a) The expression ‘aLo, bT, cLi’ means that ‘a, b, c’ is the number of each function and ‘Lo, T, Li’ stand for logistic function, hyperbolic tangent function, and linear function in order. The sum of ‘a, b, and c’ is the total number of nodes at each layer.

Testing of the ANN models

To test the trained model for the period 1 January 2006 to 31 December 2008, the authors applied data for input
variables, and then compared the results with recorded water levels. Results of the 1-day ahead prediction are as below. For all measurement points, the models could explain changes of water levels very satisfactorily considering that R² spans from 0.84 to 0.94 (see Table 5 and Figure 4).

1. Han River: R² of the four points is lower (0.84) than those in any other study site while MSE is average and MAPE is relatively low. The first criterion reveals that there are difficulties in fitting the model as random variations of water levels are rather significant. However, the latter criteria show that errors of the models are insignificant. Overall, prediction ability is quite good even in the seasons when floods or droughts occur.

2. Geum River: R² is the largest (0.94) among all the study sites. Water levels in this river were once expected to be very difficult to predict as the main stream is influenced by the flows of many tributaries. However, validation testing showed that the ANN models anticipate and solve the complication these tributaries cause with excellent accuracy.

3. Yeongsan River: R² (0.83) is similar to that of the points in the Han River. It should also be noted that MSE is low (0.04) while MAPE is relatively high (13.12%). MAPE is more sensitive to overestimation when the absolute values of observed data are smaller. Therefore, the values of MSE and MAPE are interpreted as the constructed models having a slight tendency to overestimate low water levels, especially in the dry season.

4. Nakdong River: the models have excellent consistency (R² = 0.91), but the other criteria are relatively unsatisfactory. Based on these results, it is expected that random variation of the water level data is not significant, but the models do not respond in a very sensitive manner when water levels change suddenly.

Table 5 | Testing results of the ANN models (R²)

River             R²: 1-day ahead   R²: 2-day ahead
Han River         0.84              0.72
Geum River        0.94              0.87
Yeongsan River    0.83              0.82
Nakdong River     0.91              0.87
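The three validation criteria discussed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: R² is taken here as the squared Pearson correlation (the definition used in the paper may differ), and the two short series are invented solely to show why MAPE reacts more strongly than MSE when observed water levels are small.

```python
def mse(obs, pred):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error, in percent (observations must be non-zero)."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Squared Pearson correlation (assumed definition of R^2)."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


# Invented data: the same 0.1 m overestimate at low and at high water levels.
low_obs = [0.5, 0.6, 0.4]
high_obs = [5.0, 6.0, 4.0]
low_pred = [o + 0.1 for o in low_obs]
high_pred = [o + 0.1 for o in high_obs]

print("MSE :", mse(low_obs, low_pred), mse(high_obs, high_pred))
print("MAPE:", mape(low_obs, low_pred), mape(high_obs, high_pred))
print("R2  :", r_squared(low_obs, low_pred), r_squared(high_obs, high_pred))
```

Both series give practically the same MSE, yet MAPE is about ten times larger for the low-level series, which is the pattern reported for the Yeongsan River.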

For the 2-day ahead prediction models, it is natural that the validation criteria get worse. However, Figure 5 shows that the models are still acceptable: 0.72 < R² < 0.87, with sufficiently small MSE and MAPE. Also, the statistical properties, differing among study sites in the case of the 1-day prediction models, remain valid when the overall perspective of the criteria is considered. The models are most consistent with the points in the Geum River, but they are not perfectly fitted with the points in the Han River basin. In addition, the model fits with Nakdong River, but a small error results from insensitivities of the models.

In general, the ANN models are acceptable in predicting water levels at all points even when there is uncertainty in tomorrow or the day after tomorrow’s weather conditions.

Figure 4 | Results of testing the trained model (1-day ahead water level): (a) Han River; (b) Geumgang River; (c) Youngsan River; (d) Nakdong River.

Figure 5 | Results of testing the trained model (2-day ahead water level): (a) Han River; (b) Geumgang River; (c) Youngsan River; (d) Nakdong River.

Discussion

At the beginning of this study, the authors formed three hypotheses to examine the practicality of the ANN models. Each hypothesis is discussed below with the models constructed above. The discussion underpins the opinion that the ANN models, especially when optimized with the genetic algorithm, can achieve many requirements necessary to replace the existing models and eventually enhance the adaptability of the river management system.

Is it easy and systematic to find the optimum structure of the model?

The model is optimized by using the genetic algorithm (selecting model structure) and the back propagation algorithm (adjusting weights). Although some professional efforts for programming were required (in practice, this could be easily settled by developing the GUI, interacting with the GA and back propagation (BP) codes, and a manual), these algorithms greatly reduced the labor in repeating trials and errors. Simultaneously, with the assistance of linking the ANN model with the GA, the authors could have great confidence in determining hidden layers, nodes, and activation functions, which might otherwise have been arbitrary.

Is this model advantageous to forecast 1- or 2-day ahead water levels?

As the authors mentioned previously, the Korean river management system needs 1 or 2 days to decide the operation of upstream infrastructure to maintain the downstream water levels. Practically, it is important to have prediction methods and models that are highly advantageous to predict the 1- or 2-day ahead water levels. The ANN models coupled with the GA showed satisfactory validity (e.g., 0.84 < R² < 0.94 for 1-day ahead water level, and 0.72 < R² < 0.88 for 2-day ahead water level) and more consistent results than the hourly models (for instance, Filho & Santos (), Alvisi et al. (), and Napolitano et al. ()) that tried to build the prediction models for 1-, 12-, and 18-hours ahead water levels. The coefficient of determination, which was estimated from their models, ranged from 0.4 to 0.95. In addition, even at the points of the Geum River which are complicated due to the influence of many tributaries, the 2-day ahead water level can be predicted with the accuracy of R² = 0.87. For a further study, comparing the neuro-genetic algorithm and other conventional neural networks will be meaningful for additional verification of the developed model.

Will the models be helpful for deciding the operation of the upstream weirs or reservoirs?

The ANN models are implicit in explaining a quantitative relation between the upstream flow and the downstream water level. Hence, these models facilitate further decision-making in the face of the anticipation that the water level at a point of concern would be risky under un-intervened conditions. The modeler can carry out ‘what-if …’ tests while thinking about the different operation (or different discharge) of weirs and reservoirs constructed in the upper stream. Again, these tests let the modeler determine the acceptable range of the upstream discharge in order to satisfy the management level at the point. For illustration, see Figure 6(a); we are interested in the water level at point 1 of the Geum River, and the management level is hypothetically set at 3.80 m. Let us also assume that now is day 61 (at this moment, the water level is 3.50 m, and the discharge from the upper reservoir is 810 m³/s). This is the time when we get the model prediction that tomorrow’s water level (3.84 m) would go beyond the management level. It is thus natural to investigate what would result from intentionally reduced discharges. By using the ANN model with other assumptions on the upper discharges, it would be possible to get the prediction that the discharge should be immediately dropped to less than 789 m³/s to maintain tomorrow’s water level below 3.8 m, as in Figure 6(b). Indeed, the Korean River Management Guideline (K-water ) mentions that for cases where meeting the management level is threatened, the river manager is exceptionally allowed to ‘act first, report later’.

Figure 6 | Example illustrating the decision operation of the upstream water infrastructure at point 1 of the Geum River: (a) prediction of the 1-day ahead water level; (b) estimated relation between upstream flow (day 61) and downstream water level (day 62).

CONCLUSION

Recently, the Korean government implemented the FRRP with a great deal of ambition. However, it is hard to think that the constructed weirs, dams, and reservoirs will solve all the chronic problems that riparian areas have long faced and that climate change is likely to aggravate. For adaptive capacity of the river management system, this study had a special interest in raising the capability of predicting water levels at various points of the rivers. Such intelligent forecasting capabilities can be heightened by carefully monitoring weather conditions and upstream water flow data, adequately utilizing the data in predicting 1- or 2-day ahead water level, and building the models properly to satisfy practical requirements. In this context, the authors tested the use of a hybrid neuro-genetic algorithm in predicting water levels at 15 points of four rivers. The results are summarized as follows.

1. By using the genetic algorithm, it was possible to greatly reduce the trials and errors which were necessary to find the optimum structure of the ANN model. The developed ANN model demonstrates the great advantage that hidden layers, nodes, and activation functions can be selected in a more formulated manner.

2. The ANN models showed satisfactory validity over the 15 water level measurement points. In particular, the coefficient of determination ranged from 0.84 to 0.94 for 1-day ahead water levels, and from 0.72 to 0.88 for 2-day ahead water levels. Based on these statistics, it was found that the built models have greater prediction abilities than those presented in previous studies.

3. The ANN models could clearly explain the relation between the upstream flow and the downstream water level. This advantage can be of significant merit when the river manager anticipates the water levels within the acceptance level. The models can encourage the river manager to investigate the consequences of differently operating water infrastructures located upstream. Therefore, they are considerably helpful in making urgent decisions regarding how water infrastructure should be properly operated and maintained.
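The ‘what-if’ test on upstream discharge described for point 1 of the Geum River can be sketched as a simple search over candidate discharges. This is an illustrative sketch only: `toy_level_model` is a hypothetical straight-line stand-in for the trained ANN, drawn through the two pairs quoted in the text (810 m³/s giving 3.84 m, and 789 m³/s giving 3.80 m), and `max_safe_discharge` is a hypothetical helper that assumes the predicted level does not decrease as discharge increases.

```python
def toy_level_model(discharge_m3s: float) -> float:
    """Placeholder for the trained ANN: next-day water level (m) vs. discharge.

    Assumption: a straight line through (789, 3.80) and (810, 3.84).
    """
    slope = (3.84 - 3.80) / (810.0 - 789.0)
    return 3.80 + (discharge_m3s - 789.0) * slope


def max_safe_discharge(model, limit_m, q_low, q_high, tol=0.01):
    """Bisection for the largest discharge whose predicted level is <= limit_m.

    Assumes model(q) is non-decreasing in q on [q_low, q_high].
    """
    if model(q_low) > limit_m:
        raise ValueError("even the lowest discharge exceeds the limit")
    if model(q_high) <= limit_m:
        return q_high
    while q_high - q_low > tol:
        mid = 0.5 * (q_low + q_high)
        if model(mid) <= limit_m:
            q_low = mid   # still safe, try releasing more
        else:
            q_high = mid  # too high, release less
    return q_low


# Management level 3.80 m; current discharge 810 m3/s is the upper bound.
q_safe = max_safe_discharge(toy_level_model, 3.80, q_low=700.0, q_high=810.0)
print(f"discharge should not exceed about {q_safe:.1f} m3/s")
```

With a real model the same search applies as long as the level responds monotonically to discharge; otherwise a scan over a grid of candidate discharges would be the safer choice.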

ACKNOWLEDGEMENTS

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (Grant No. 2012-0001656).

REFERENCES

Abebe, A. J. & Price, R. K.  Managing uncertainty in hydrological models using complementary models. Hydrol. Sci. J. (J. Des Sci. Hydrologiques) 48 (5), 679–692. Alvisi, S., Mascellani, G., Franchini, M. & Bardossy, A.  Water level forecasting through fuzzy logic and artificial neural network approaches. Hydrol. Earth Syst. Sci. 10, 1–17. Cameron, D., Kneale, P. & See, L.  An evaluation of a traditional and a neural net modeling approach to flood forecasting for an upland catchment. Hydrol. Process. 16, 1033–1046. Cha, D., Lee, S. & Park, H.  Investigating the vulnerability of dry-season water supplies to climate change: using the Gwangdong Reservoir Drought Management Model. Water Resour. Manage. 26 (14), 4183–4201. Chang, F. J., Chiang, Y. M. & Chang, L. C.  Multi-step-ahead neural networks for flood forecasting. Hydrol. Sci. J. (J. Des. Sci. Hydrologiques) 52 (1), 114–130. Chau, K. W.  Particle swarm optimization training algorithm for ANNs in stage prediction of Shing Mun River. J. Hydrol. 329 (3–4), 363–367. Chiang, Y. M., Chang, L. C. & Chang, F. J.  Comparison of static-feedforward and dynamic-feedback neural network for rainfallrunoff modeling. J. Hydrol. 290, 297–311. Choi, D. & Park, H.  A hybrid artificial neural network as a software sensor in a wastewater treatment process. Water Res. 35 (16), 3959–3967. Coulibaly, P., Anctil, F. & Bobee, B.  Hydrological forecasting using artificial neural networks: the state of the art. Can. J. Civil Eng. 26 (3), 293–304. Daliakopoulos, I. N., Coulibaly, P. & Tsanis, I. K.  Groundwater level forecasting using artificial neural networks. J. Hydrol. 309, 229–240. Dibike, Y. B. & Solomatine, D. P.  River flow forecasting using artificial neural networks. Phys. Chem. Earth Part B 26 (1), 1–7. Filho, A. J. P. & Santos, C. C.  Modeling a densely urbanized watershed with an artificial neural network, weather radar and telemetric data. J. Hydrol. 317, 31–48. Foley, A. M. 
 Uncertainty in regional climate modeling: a review. Prog. Phys. Geog. 34 (5), 647–670. Gavin, J. B., Graeme, C. D. & Holger, R. M.  Data transformation for neural network models in water resources applications. J. Hydroinf. 5, 245–258. Giustolisi, O. & Laucelli, D.  Improving generalization of artificial neural networks in rainfall–runoff modeling. Hydrol. Sci. J. 50 (3), 439–457. Giustolisi, O. & Simeone, V.  Optimal design of artificial neural networks by a multi-objective strategy: groundwater level predictions. Hydrol. Sci. J. 51 (3), 502–523. Goldberg, D. E.  Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley-Longman, Reading, MA, USA.



Grayson, R. B., Moore, I. D. & McMahon, T. A.  Physically based hydrologic modelling, 2: is the concept realistic? Water Resour. Res. 28 (10), 2659–2666. Haykin, S.  Neural Networks: a Comprehensive Foundation (2nd edn). Prentice-Hall, Upper Saddle River, NJ, USA. Holland, J.  Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI, USA. Hsu, K. L., Gupta, H. V. & Sorooshian, S.  Artificial neural network modeling of the rainfall–runoff process. Water Resour. Res. 31 (10), 2517–2530. Imrie, C. E., Durucan, S. & Korre, A.  River flow prediction using artificial neural networks: generalisation beyond the calibration range. J. Hydrol. 233, 138–153. Joo, D., Choi, D. & Park, H.  The effects of data preprocessing in the determination of coagulant dosing rate. Water Res. 34 (13), 3295–3302. Karunanithi, N., Grenney, W. J., Whitley, D. & Bovee, K.  Neural networks for river flow prediction. J. Comput. Civil Eng. 8 (2), 201–220. Kisi, O. & Asce, M.  River flow modeling using artificial neural networks. J. Hydrol. Eng. 1 (60), 60–63. K-water  Dam Operation Manual. K-water, Korea (in Korean). Lawrence, J. & Peterson, A.  Brainmaker: User’s Guide and Reference Manual. California Scientific Software, Nevada City, CA, USA. Leavesley, G. H.  Modeling the effects of climate change on water resources? A review. Clim. Change. 28, 159–177. Lee, S. & Park, H.  Adaptation practices of urban water infrastructure management. Proceedings of the APEC Climate Symposium 2011, Hawaii, USA. Lee, S., Suhaimi, A. & Park, H.  Lessons from water scarcity of the 2008–2009 Gwangdong reservoir: needs to address drought management with the adaptiveness concept. Aquat. Sci. 74 (2), 213–227. Maier, H. R. & Dandy, G. C.  Neural networks for the prediction and forecasting of water resources variables: a review of modelling issues and applications. Environ. Modell. Softw. 15, 101–124. Maier, H., Jain, A., Dandy, G. 
& Sudheer, K.  Methods used for the development of neural networks for the prediction of water resource variables in river systems: Current status and future directions. Environ. Modell. Softw. 25 (8), 891–909. Milly, P. C. D., Betancourt, J., Falkenmark, M., Hirsch, R. M., Kundzewicz, Z. W., Lettenmaier, D. P. & Stouffer, R. J.  Stationarity is dead: whither water management? Science 319 (5863), 573–574. Ministry of Land, Transport and Maritime Affairs  Master Plan for Four River Project. Korea (in Korean). Ministry of Land, Transport and Maritime Affairs  Future Water Resources Management Strategies for Coping with Climate Change. Korea (in Korean). Mohanty, S., Jha, M. K., Kumar, A. & Sudheer, K. P.  Artificial neural network modeling for groundwater level forecasting in


a river island of Eastern India. Water Resour. Manage. 24, 1845–1865. Napolitano, G., See, L., Calvo, B., Savi, F. & Heppenstall, A.  A conceptual and neural network model for real-time flood forecasting of the Tiber river in Rome. Phys. Chem. Earth. 35 (3–5), 187–194. National Institute of Meteorological Research  Understanding Climate Change II – Climate Change in the Korean Peninsula: Present and Future. Korea Meteorological Administration (in Korean). Pulido-Calvo, I. & Portela, M. M.  Application of neural approaches to one-step daily flow forecasting in Portuguese watersheds. J. Hydrol. 332 (1–2), 1–15. Robert, J. A.  Neural network rainfall–runoff forecasting based on continuous resampling. J. Hydroinf. 5, 51–61. Rumelhart, D. E., Hinton, G. E. & Williams, R. J.  Learning internal representation by error propagation. Parallel Distributed Process 1, 318–362. Rumelhart, D. E. & McClelland, J. L.  Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press, Cambridge, MA, USA. Sahoo, G. B., Ray, C. & De Carlo, E. H.  Use of neural network to predict flash flood and attendant water qualities of a mountainous stream on Oahu, Hawaii. J. Hydrol. 327, 525–538. Savic, D. A., Walters, G. A. & Davidson, J. W.  A genetic programming approach to rainfall–runoff modelling. Water Resour. Manage. 13 (3), 219–231. See, L. & Openshaw, S.  A hybrid multi-model approach to river level forecasting. Hydrol. Sci. J. 45 (4), 523–536. Shamseldin, A. Y., O’Connor, K. M. & Nasr, A. E.  A comparative study of three neural network forecast combination methods for simulated river flows of different rainfall–runoff models. Hydrol. Sci. J. (J. Des. Sci. Hydrologiques) 52, 896–916. Solomatine, D. P.  Data-driven modelling paradigm, methods, experiences. In: Proceedings of the 5th International Conference on Hydroinformatics, Cardiff, UK, 1–5 July, pp. 1–7. Toth, E., Brath, A. & Montanari, A. 
 Comparison of shortterm rainfall prediction models for real-time flood forecasting. J. Hydrol. 239 (1–4), 132–147. Wang, W., Gelder, P. H. A. J. M., Vrijling, J. K. & Ma, J.  Forecasting daily streamflow using hybrid ANN models. J. Hydrol. 324, 383–399. Zealand, C. M., Burn, D. H. & Simonovic, S. P.  Short term streamflow forecasting using artificial neural networks. J. Hydrol. 214, 32–48. Zeng, Z. & Wang, W.  Advances in neural network research and application. Lecture Notes in Electrical Engineering 67, Springer, 449. Zhengfu, R. & Fernando, A.  Use of an artificial neural network to capture the domain knowledge of a conventional hydraulic simulation model. J. Hydroinf. 9 (1), 15–24.

First received 18 January 2013; accepted in revised form 20 May 2013. Available online 10 July 2013


© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

Impact of climate change on future stream flow in the Dakbla river basin

Srivatsan V. Raghavan, Vu Minh Tue and Liong Shie-Yui

ABSTRACT

A systematic ensemble high-resolution climate modelling study over Vietnam was performed and future hydrological changes over the small catchment of Dakbla, Central Highland region of Vietnam, were studied. Using the widely used regional climate model WRF (Weather Research and Forecasting), future climate change over the period 2091–2100 was ascertained. The results indicate that surface temperature over Dakbla could increase by nearly 3.5 °C, while a rainfall increase of more than 40% is likely. The ensemble hydrological changes suggest that the stream flow over the peak and post-peak rainfall seasons could experience a strong increase, suggesting risks of flooding, with an overall average annual increase of stream flow by 40%. These results have implications for water resources, agriculture, biodiversity and economy, and serve as useful findings for policy makers.

Key words | climate change, dynamical downscaling, hydrology, stream flow, Soil and Water Assessment Tool, WRF

Srivatsan V. Raghavan (corresponding author), Vu Minh Tue, Liong Shie-Yui
Tropical Marine Science Institute, 18 Kent Ridge Road, 119227, Singapore
E-mail: tmsvs@nus.edu.sg

INTRODUCTION

Climate change impacts are studied using the information derived by global climate models (GCMs), which still remain the primary tools in understanding climate and climate change at a global scale. However, it has been realized that to study sub-global scales, i.e. continental, regional or sub-regional scales, the GCMs do not provide detailed information of climate as it is observed in reality. This is largely attributable to the coarse resolution of the GCMs, making them unsuitable for regional impact studies (Giorgi ). The need for regional scale information is also emphasized by the fact that GCM climate projections do not allow regional examinations such as water balances or trends of extreme precipitation due to their coarse grid resolution. This clearly applies to hydrological impact studies over a river basin, as most of the river basins of the world are smaller than the typical resolution (c. 300 km) of the GCM. Such hydrological models therefore need to be driven by high-resolution data for better assessments of regional scale impacts. The GCMs do not simulate precipitation, one of the most important and sensitive climate parameters and highly variable in space and time, with adequate fine-scale details to be applied for regional-scale impact studies. Regional-scale impact studies, such as those in hydrology, warrant high-resolution climate information. To this end, regional climate models (RCMs) (which are limited area models) at a higher resolution than that of GCMs (c. 10–50 km) are widely used in climate research. For hydrological studies it has become common to use the output of the regional climate models as input to hydrological models. Similar studies have been done by Hay et al. (); Sushama et al. (); Andersson et al. () and Graham et al. ().

This paper describes such a method where the climate outputs (precipitation and surface temperature) from a high-resolution regional climate model (Weather Research and Forecasting or WRF) are applied to a hydrological model (Soil and Water Assessment Tool, SWAT) (Arnold et al. ) to study changes in future stream flow over the small river catchment Dakbla, over the Central Highland region of Vietnam. Ensemble scenarios of climate change derived from the WRF model driven by three different GCMs are described, all under the A2 emission scenario.

doi: 10.2166/hydro.2013.165


Similar studies have also been documented by Hamlet & Lettenmaier () and Wei & Watkins ().

STUDY AREA

The Dakbla River is a small tributary of the Mekong river over the Lower Mekong Basin (LMB) in southeast Asia. The catchment has a total area of 2,560 km² from the upstream to Kon Tum gauging station (Figure 1) and lies over the Central Highland region of Vietnam. The catchment is covered mostly by tropical forests which are classified as tropical evergreen forest, young forest, mixed forest, planned forest and shrub. The climate of this region follows the pattern of the Central Highland region in Vietnam with an annual average temperature of c. 20–25 °C and a total annual average rainfall of c. 1,500–3,000 mm with high evapotranspiration rates of c. 1,000–1,500 mm per annum. There are two main seasons for the Central Highland region: a rainy season from May through to October (referred to as MJJASO) and a dry season from November through to April (referred to as NDJFMA). Flood season is around 1 month after the rainy season, because some buffer time is required to fill up the groundwater for basalt soil in this region after the earlier 6-month dry period. Due to the steep slope topography and heavy rainfall concentrations, stream flow in this region acquires a high velocity, especially during floods, causing massive damage to people and property.

There is also a very high potential of constructing hydropower dams to store surface water for multipurpose needs: irrigation, electricity generation and flood control. Upper Kon Tum hydropower, with an installed capacity of 210 MW, has been under construction since 2009 (to be completed in 2014) in the upstream region of Dakbla river; at 110 km downstream, the Yaly hydropower plant has been constructed (installed capacity 720 MW; the second biggest hydropower project in Vietnam), which has been in operation since 2001. Forecasting stream flow mainly by using rainfall is therefore an important task in this region for both hydropower and irrigation.

Figure 1 | Map of Vietnam climate zones and location of Dakbla catchment. (a) Different climate zones and topography of Vietnam; and (b) Dakbla catchment and its meteorological and river gauging station.


METHODS

Soil and water assessment tool (SWAT)

The rainfall–runoff model is a typical hydrological modelling tool that determines the runoff from the watershed basin resulting from rainfall falling on the basin. Precipitation is therefore an important input in deriving runoff in hydrological modelling. The SWAT model (Arnold et al. ), used for rainfall–runoff modelling in this study, was developed to quantify the runoff and concentration load due to the distributed precipitation, watershed topography, soil and land use conditions.

SWAT is a river basin scale model developed by the United States Department of Agriculture (USDA) Agricultural Research Service (ARS) in the early 1990s. It has been designed to work for large river basins over a long period of time. Its purpose is to quantify the impact of land management practices on water, sediment and agriculture chemical yields with varying soil, land use and management condition. SWAT version 2005 with an ArcGIS user interface (ArcSWAT) was used in this study. There are two methods for estimating surface runoff in the SWAT model: the Green & Ampt () infiltration method, which requires precipitation input over a sub-daily scale, and the Soil Conservation Service (SCS) curve number procedure (USDA Soil Conservation Service ), which uses daily precipitation. The latter was selected in this study for simulations, since daily rainfall from the climate models was used as input to the SWAT model. The retention parameter is very important in the SCS method and is defined by the curve number (CN), a function of the soil permeability, land use and antecedent soil water conditions.

The SWAT model offers three options for estimating potential evapotranspiration (PET): Hargreaves (Hargreaves et al. ); Priestley–Taylor (Priestley & Taylor ) and Penman–Monteith (Monteith ). The Hargreaves method requires only maximum, minimum and average surface temperature. The Priestley–Taylor method needs solar radiation, surface temperature and relative humidity. The inputs for the Penman–Monteith method are the same as those for Priestley–Taylor; however, it also requires the wind speed. Due to limited available meteorological data for the site considered in this study, the Hargreaves method is applied.

In the SWAT model, the land area in a sub-basin is divided into what are known as hydrological response units (HRUs). HRUs are constructed through a unique combination of land use and soil information. One HRU is the total area of a sub-basin with a particular land use and soil characteristics. While individual fields with a specific land use and soil may be scattered throughout a sub-basin, these areas are lumped together to form a single HRU. These are used in most SWAT applications since they simplify a simulation by putting together all similar soil and land use areas into one single response unit (Neitsch et al. ). All parameters such as surface runoff, PET, lateral flow, percolation, soil erosion, nitrogen and phosphorus are measured in each HRU.

Model set-up

Ensemble regional climate model outputs were used as input to the SWAT hydrological model to determine future hydro-climatic changes. These regional climate model outputs (surface temperature and precipitation) were derived using the WRF model, which was used to downscale the GCMs CCSM3.0, ECHAM5 and MIROC-medres, all forced under the Intergovernmental Panel on Climate Change (IPCC) A2 future greenhouse gas emission scenario. This regional climate model was initially driven by the ERA40 reanalysis, which represents the ‘true’ climate period of 1981–1990. Later, the WRF model was also driven by the GCMs CCSM3.0, ECHAM5 and MIROC-medres for both the present day (1981–1990) and the future (2091–2100) climates. For simplicity, the simulations of WRF driven by ERA40 reanalysis and the GCMs CCSM3.0, ECHAM5 and MIROC-medres are referred to as WRF/ERA, WRF/CCSM, WRF/ECHAM and WRF/MIROC, respectively.

For comparison of WRF model simulated precipitation and surface temperature profiles, two sets of gridded observational datasets are used: CRU (Climatic Research Unit, University of East Anglia, UK, 0.5° data) and APHRODITE (Asian precipitation highly resolved observational data integration towards evaluation of water resources) (0.25° data) from the Japanese Meteorological Agency (JMA). In this paper, the latter is referred to as APH. These datasets have been documented by Mitchell & Jones () and Yatagai et al. (), respectively.

For hydrological simulations, daily precipitation data were obtained from three rainfall stations (Kon Tum, Dak Doa and Kon Plong; the former two lie inside and the latter outside the Dakbla catchment) and daily river stream flow data were taken from the gauging station at Kon Tum, all shown in Figure 1(b). Surface temperature, rainfall and discharge data have been acquired for the two periods 1980–1990 and 1995–2005, at a daily rate. For use in the SWAT model, the digital elevation model (DEM) of 250 m was obtained from the Department of Survey and Mapping (DSM), Vietnam. The land use map was obtained from the Forest Investigation and Planning Institute (FIPI) and the soil map was obtained from the Ministry of Agriculture and Rural Development (MARD), both in Vietnam (Figure 2).

Figure 2 | SWAT model spatial inputs: (a) DEM; (b) land use; and (c) soil map of Dakbla river basin.

Two benchmarking indices were used to assess the performance of the SWAT model: the Nash–Sutcliffe Efficiency (NSE) proposed by Nash & Sutcliffe () and the coefficient of determination (R²). The value of NSE ranges from minus infinity to 1 while R² ranges from 0 to 1, with 1 representing a perfect match for both indices. The NSE is considered to be the most appropriate relative error or goodness-of-fit measure available, due to its straightforward physical interpretation (Legates & McCabe ).

RESULTS AND DISCUSSION

Daily precipitation data were obtained from the three rainfall stations (Kon Plong, Kon Tum and Dak Doa) for the periods 1980–1990 (calibration) and 1995–2005 (validation). Daily maximum and minimum surface temperature data were also obtained from the local authority from the Kon Tum meteorological station for the same period. Daily river stream flow data were obtained from the Kon Tum gauging station at the downstream end of the Dakbla River. These data were used for both the calibration and validation processes in the stream flow simulations of the SWAT model. In the calibration part, the SWAT model was run in a daily time step for the period of 1980–1990 using observed rainfall and river stream flow at Kon Tum gauging station, with the first year 1980 used as the spin-up period. The validation was performed for the 10-year period of 1996–2005 to ensure that the model was well calibrated. These 10-year periods were chosen for calibration and validation because of data availability; longer-period data spanning 30 years were not available from station sources.
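The two benchmarking indices used to judge the SWAT simulations, NSE and R², can be sketched as below. This is a minimal illustration, not the study's code: R² is assumed to be the squared Pearson correlation between observed and simulated flow, and the short flow series are invented to show that a constant bias leaves R² at 1 while driving NSE far below zero.

```python
def nash_sutcliffe(obs, sim):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2); range (-inf, 1]."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


def r_squared(obs, sim):
    """Squared Pearson correlation (assumed definition of R^2); range [0, 1]."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


# Invented daily flows (m3/s): a perfect run and a run with a constant bias.
observed = [1.0, 2.0, 3.0, 4.0]
perfect = list(observed)
biased = [o + 10.0 for o in observed]

print("perfect:", nash_sutcliffe(observed, perfect), r_squared(observed, perfect))
print("biased :", nash_sutcliffe(observed, biased), r_squared(observed, biased))
```

Because R² ignores systematic offsets, reporting it together with NSE gives a more complete picture of how well simulated flow tracks the observations.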


A sensitivity analysis was conducted prior to calibrating the hydrological model. This is a method that analyzes the sensitivity of the different model parameters (Table 1) that influence the hydrological model performance. This method serves to filter out those model parameters that do not have a significant influence on the model results. It also aims to reduce the number of parameters required in the auto-calibration method.

Traditional methods of sensitivity analysis have been classified by Saltelli et al. (). They are: (1) local method (Melching & Yoon ); (2) integration of local to global method using random one-factor-at-a-time (OAT) proposed by Morris (); and (3) global methods such as Monte Carlo and Latin-Hypercube (LH) simulation (McKay et al. ; McKay ). By studying the advantages and disadvantages of each of the above methods, van Griensven & Meixner () developed the LH-OAT method, which performs LH sampling followed by OAT sampling. This method samples the full range of all parameters using the LH design along with the precision of OAT sampling to ensure that the changes in each model output could be attributed to the changed parameter. In this study, the LH-OAT design was coupled to the ArcSWAT 2005 model for the sensitivity analysis module. In the SWAT model there are 25 parameters that are sensitive to stream flow, six parameters sensitive to sediment transport and nine other parameters sensitive to water quality. In this study, sensitivity analysis was performed for the 25 parameters of stream flow as listed in Table 1, from which the 11 most sensitive parameters were then selected (Table 2) for performing the auto-calibration.

Since the ArcSWAT model has the options to choose either manual or auto-calibration, calibration is applied to the most sensitive parameters to yield the optimal set of values for the model parameters, which results in the minimum discrepancy between the observed and the simulated river discharge data. Parameter solution method (ParaSol) is a built-in auto-calibration model in the ArcSWAT 2005 version (van Griensven & Meixner ) which was used

Table 1 | SWAT parameters sensitive to stream flow

| Group | Parameter | Description | Unit |
|---|---|---|---|
| Soil | Sol_Alb | Moist soil albedo | – |
| Soil | Sol_Awc | Available water capacity | mm mm⁻¹ |
| Soil | Sol_K | Saturated hydraulic conductivity | mm h⁻¹ |
| Soil | Sol_Z | Depth to bottom of second soil layer | mm |
| Subbasin | Tlaps | Temperature lapse rate | °C km⁻¹ |
| HRU | Epco | Plant uptake compensation factor | – |
| HRU | Esco | Soil evaporation compensation factor | – |
| HRU | Canmx | Maximum canopy storage | mm H2O |
| HRU | Slsubbsn | Average slope length | m |
| Routing | Ch_N2 | Manning’s n value for the main channel | – |
| Routing | Ch_K2 | Effective hydraulic conductivity in main channel alluvium | mm h⁻¹ |
| Groundwater | Alpha_Bf | Baseflow alpha factor | days |
| Groundwater | Gw_Delay | Groundwater delay | days |
| Groundwater | Gw_Revap | Groundwater ‘revap’ coefficient | – |
| Groundwater | Gwqmn | Threshold depth of water in the shallow aquifer for return flow to occur | mm H2O |
| Groundwater | Revapmn | Threshold depth of water in the shallow aquifer for ‘revap’ to occur | mm H2O |
| Management | Biomix | Biological mixing efficiency | – |
| Management | Cn2 | Initial SCS runoff curve number for moisture condition II | – |
| General data basin | Sftmp | Snowfall temperature | °C |
| General data basin | Smfmn | Minimum melt rate for snow during year | mm H2O °C⁻¹ day⁻¹ |
| General data basin | Surlag | Surface runoff lag time | days |
| General data basin | Timp | Snow pack temperature lag factor | – |
| General data basin | Smfmx | Maximum melt rate for snow during year | – |
| General data basin | Blai | Maximum potential leaf area index for land cover/plant | – |
| General data basin | Slope | Slope | – |
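The LH-OAT idea described in the text (Latin-Hypercube sampling of the parameter space followed by one-factor-at-a-time perturbations around each sampled point) can be sketched as follows. This is a simplified illustration, not the ArcSWAT implementation: the toy model, the parameter names a, b and c, the perturbation fraction and the use of the mean absolute output change as the ranking score are all assumptions.

```python
import random


def latin_hypercube(n_points, n_params, rng):
    """Return n_points samples in the unit cube, one per stratum and parameter."""
    columns = []
    for _ in range(n_params):
        strata = [(k + rng.random()) / n_points for k in range(n_points)]
        rng.shuffle(strata)  # decouple the strata of different parameters
        columns.append(strata)
    return [[col[j] for col in columns] for j in range(n_points)]


def lh_oat_ranking(model, bounds, n_points=10, fraction=0.05, seed=0):
    """Rank parameters by mean absolute output change under OAT perturbations."""
    rng = random.Random(seed)
    names = list(bounds)
    score = {name: 0.0 for name in names}
    for unit_point in latin_hypercube(n_points, len(names), rng):
        # Scale the unit-cube sample to the physical parameter ranges.
        base = {}
        for name, u in zip(names, unit_point):
            low, high = bounds[name]
            base[name] = low + u * (high - low)
        base_output = model(base)
        for name in names:
            low, high = bounds[name]
            step = fraction * (high - low)
            perturbed = dict(base)
            # Step upwards unless that would leave the allowed range.
            if base[name] + step <= high:
                perturbed[name] = base[name] + step
            else:
                perturbed[name] = base[name] - step
            score[name] += abs(model(perturbed) - base_output) / n_points
    return sorted(names, key=lambda name: score[name], reverse=True)


def toy_model(p):
    """Hypothetical stand-in for a hydrological model output."""
    return 5.0 * p["a"] + 1.0 * p["b"] + 0.0 * p["c"]


bounds = {"a": (0.0, 1.0), "b": (0.0, 1.0), "c": (0.0, 1.0)}
ranking = lh_oat_ranking(toy_model, bounds)
print(ranking)
```

Because only one parameter changes in each perturbed run, every output change can be attributed to that parameter, while the LH sample spreads the base points over the full parameter range.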


Table 2 | Sensitivity analysis ranking of the 11 most sensitive parameters in SWAT model to stream flow

| Sensitivity analysis order | Parameter | Description | Parameter range | Initial value | Optimal value |
|---|---|---|---|---|---|
| 1 | Cn2 | Initial SCS runoff curve number for moisture condition II | 35–98 | 35 | 96.78 |
| 2 | Ch_K2 | Effective hydraulic conductivity in main channel alluvium | –0.01 to 500 | 0 | 150 |
| 3 | Sol_Awc | Available water capacity | 0–1 | 0.22 | 0.44 |
| 4 | Sol_K | Saturated hydraulic conductivity | 0–2,000 | 1.95 | 1,873 |
| 5 | Ch_N2 | Manning’s n value for the main channel | –0.01 to 0.3 | 0.014 | 0.073 |
| 6 | Alpha_Bf | Baseflow alpha factor | 0–1 | 0.048 | 0.027 |
| 7 | Surlag | Surface runoff lag time | 1–24 | 4 | 1 |
| 8 | Esco | Soil evaporation compensation factor | 0–1 | 0 | 0.66 |
| 9 | Gwqmn | Threshold depth of water in the shallow aquifer for return flow to occur | 0–5,000 | 0 | 1,107 |
| 10 | Gw_Revap | Groundwater ‘revap’ coefficient | 0.02–0.2 | 0.02 | 0.17 |
| 11 | Gw_Delay | Groundwater delay | 0–500 | 31 | 215 |

in this study for auto-calibration of the SWAT model. This ParaSol method has also been documented by van Griensven & Meixner (). Using the above methodology, the SWAT model was calibrated to ensure a robust performance before undertaking stream flow simulations using the regional climate model output. The R² and the NSE index were used as benchmarking indices to assess the goodness-of-fit of the SWAT hydrological model.

The calibration and validation graphical results for Dakbla River are shown in Figures 3 and 4, respectively, at daily (top) and monthly (bottom) scales. In the calibration, the simulated peak-to-peak discharge and the low flow agree with the observed data better on the monthly scale than on the daily scale, owing to the higher variability at daily scales. The validation plots indicate that the trend of observed data is being captured by the simulated flow, although some of the peak-to-peak discharges are underestimated compared to observed flow. The values of R² and NSE shown in Table 3 indicate that the comparison indices over a daily and monthly scale for both calibration and validation are around 0.5 and 0.7, respectively. These values indicate a good performance of the SWAT model (Santhi et al. ) and that the hydrological model was well calibrated using the ParaSol method. Since the model was able to reproduce the pattern of the observed stream flow well enough, the next stage of the application of the regional climate model derived data (precipitation and surface temperature) to be used for stream flow simulations is discussed, as the calibration and validation stages used only the station data precipitation and surface temperature.

Before discussing the stream flow results of the SWAT model, it is useful to examine the WRF model simulated climates to highlight the usefulness of applying RCM results to hydrological applications. The comparison of WRF model simulated profiles of present-day surface temperature over Dakbla region and the gridded observation datasets CRU and APH is displayed in Figure 5. It is notable that, even between the CRU and APH observations, CRU exhibits hotter profiles than the APH dataset. Nevertheless, the WRF model results show a reasonable simulation by exhibiting a good pattern of temperature gradients as well as their magnitudes. The simulations of WRF/ECHAM, WRF/CCSM and WRF/MIROC also show similar profiles to that of WRF/ERA. Figure 6 shows the WRF model precipitation distribution over Dakbla catchment for the present-day climate compared against the two gridded observational datasets. The WRF/ECHAM shows overestimation in rainfall over this region, while WRF/CCSM and WRF/MIROC share similar distributions to that of WRF/ERA and APH. It can be stressed here that while surface temperatures are more homogeneous and easier to simulate, precipitation is rather difficult to simulate well. Detailed evaluation of the model performance was carried out (not discussed


Figure 3 | Calibration of the SWAT model, top: daily scale and bottom: monthly scale.

Figure 4 | Validation of the SWAT model, top: daily scale and bottom: monthly scale.


here), but is outwith the scope of this paper. These results are merely a bird’s eye view of regional climate simulations over a small region such as that of Dakbla. The climate model results are shown to substantiate the use of model-derived climate variables for further use in the SWAT hydrological simulations.

Table 3 | Statistical indices of SWAT Dakbla river basin model calibration and validation: R² and NSE

| Period | Scale | R² | NSE |
|---|---|---|---|
| Calibration (1981–1990) | Daily | 0.58 | 0.53 |
| Calibration (1981–1990) | Monthly | 0.72 | 0.74 |
| Validation (1996–2005) | Daily | 0.45 | 0.43 |
| Validation (1996–2005) | Monthly | 0.73 | 0.66 |

The precipitation and surface temperature variables from the RCM outputs of WRF/ERA were initially used for stream flow simulation, followed by the outputs of WRF/CCSM, WRF/ECHAM and WRF/MIROC. The rationale for doing so is the same as that of the regional climate simulations: to test the performance of the true climate first and then that of the GCMs. The reasonably good results from the WRF model for the present-day climate over this region imply that they are suitable for use in the rainfall–runoff model.

The daily scale precipitation and temperature derived from the RCMs were bi-linearly interpolated to the respective rainfall stations (Kon Plong, Kon Tum, Dak Doa) and meteorological station (Kon Tum). The SWAT model usually takes measured rainfall data from gauged stations as input,

Figure 5 | Annual surface temperature over Dakbla during 1981–1990 (in °C): (a) CRU; (b) APH; (c) WRF/ERA; (d) WRF/CCSM; (e) WRF/ECHAM; and (f) WRF/MIROC.


Figure 6 | Annual daily precipitation over Dakbla during 1981–1990 (in mm day⁻¹): (a) CRU; (b) APH; (c) WRF/ERA; (d) WRF/CCSM; (e) WRF/ECHAM; and (f) WRF/MIROC.

then distributes its values to all of its sub-catchments. An interpolation is therefore required to compute the station data (at a particular grid point) when using gridded data. Bilinear interpolation, an extension of linear interpolation to functions of two variables on a regular grid, is therefore applied in this case to extract the precipitation value at each station location from the gridded data derived from the RCM output. The same approach is applied for the surface temperature.

Before the future stream flow results are discussed, it is also helpful to assess the future changes in the mean surface temperature and precipitation over the Dakbla region. Figure 7 displays the future response of the delta change in annual scale for Dakbla region over scenario A2 for three different models: (a) WRF/CCSM; (b) WRF/ECHAM; and (c) WRF/MIROC for surface temperature and precipitation. It can be seen that WRF/CCSM projects the least surface temperature increase compared to WRF/ECHAM and WRF/MIROC. The change in temperature from these three model scenarios ranges between 2.6 and 3.7 °C. Precipitation is also expected to increase annually by 20–50%, with the largest (smallest) changes simulated by WRF/MIROC (WRF/CCSM).
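The bilinear interpolation of gridded RCM output to a station location, as described in the text, can be sketched as below. This is a minimal illustration on a single grid cell; the function name, the argument layout and the numerical values are hypothetical and are not taken from the study.

```python
def bilinear(x, y, x0, x1, y0, y1, q00, q10, q01, q11):
    """Interpolate a value at (x, y) inside the cell [x0, x1] x [y0, y1].

    q00, q10, q01 and q11 are the gridded values at (x0, y0), (x1, y0),
    (x0, y1) and (x1, y1), respectively.
    """
    tx = (x - x0) / (x1 - x0)  # fractional position along x
    ty = (y - y0) / (y1 - y0)  # fractional position along y
    bottom = q00 * (1.0 - tx) + q10 * tx  # linear interpolation along y = y0
    top = q01 * (1.0 - tx) + q11 * tx     # linear interpolation along y = y1
    return bottom * (1.0 - ty) + top * ty  # linear interpolation between the two


# Hypothetical daily precipitation (mm) at the four corners of a unit grid cell.
station_value = bilinear(0.5, 0.5, 0.0, 1.0, 0.0, 1.0, 1.0, 2.0, 3.0, 4.0)
print(station_value)
```

At a grid corner the function returns the corner value itself, and at the cell centre it returns the mean of the four corner values.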


Figure 7 | Future response of (1) surface temperature and (2) daily precipitation over Dakbla: (a) WRF/CCSM; (b) WRF/ECHAM; and (c) WRF/MIROC.

Figure 8 shows the stream flow simulated by the SWAT model for the baseline (1981–1990) (black) and future (2091–2100) (red; see colour version online) period derived from the inputs (precipitation, temperature) from the three different RCM integrations (WRF/CCSM, WRF/ECHAM and WRF/MIROC), all using the same A2 scenario.

It can be seen that, over an annual scale, the stream flow simulated by WRF/CCSM A2 scenario shows an increase of 38% in the future, WRF/ECHAM A2 indicates an increase of 37% and WRF/MIROC shows the highest increase of 46%. The low flow period during the dry season (November to April, NDJFMA) also indicates a slight increase from all datasets. This finding is important because drought is one of the severe threats to this Central Highland region of Vietnam and has strong implications due to the high potential for hydropower.

In order to assess the characteristics of extreme rainfall and stream flow time series, a boxplot graph is shown in Figure 9 for both rainfall and discharge at the Kon Tum station. Overall, the WRF/ECHAM results indicate more rainfall compared to the other two RCM integrations, suggesting higher stream flow data. The maximum value of such a discharge is seen in the future stream flow for the WRF/ECHAM driven simulation, at 600 m³ s⁻¹.
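The relative changes in annual and dry-season stream flow quoted in the text are percentage differences between future and baseline means. A minimal sketch of that arithmetic is given below; the monthly flows are hypothetical placeholders rather than values from the study, and the helper names are assumptions.

```python
def mean(values):
    return sum(values) / len(values)


def percent_change(baseline, future):
    """Percentage change of the future mean relative to the baseline mean."""
    base_mean = mean(baseline)
    return 100.0 * (mean(future) - base_mean) / base_mean


# Dry season NDJFMA expressed as calendar month numbers.
DRY_SEASON = (11, 12, 1, 2, 3, 4)

# Hypothetical monthly mean flows in m3/s, keyed by calendar month.
baseline = {m: (20.0 if m in DRY_SEASON else 100.0) for m in range(1, 13)}
future = {m: (22.0 if m in DRY_SEASON else 140.0) for m in range(1, 13)}

annual_change = percent_change(list(baseline.values()), list(future.values()))
dry_change = percent_change(
    [baseline[m] for m in DRY_SEASON], [future[m] for m in DRY_SEASON]
)
print(f"annual: {annual_change:.1f}%, dry season: {dry_change:.1f}%")
```

Computing the dry-season change separately keeps a large wet-season increase from masking a much smaller change in low flows.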


Figure 8 | Baseline and future stream flow at Kon Tum station for three RCMs.

On a daily scale study of extremes, the probability distribution function compares the rainfall and stream flow for the three different RCM results for the baseline and future periods (Figure 10). All three RCM results agree that future stream flow has higher frequency distribution for high discharge (>100 m³ s⁻¹) compared to the baseline. For the extreme case, a discharge value of more than 480 m³ s⁻¹ indicates a higher frequency of future stream flow. This must be taken very seriously, as very high discharge is critical for river operation management.

CONCLUSIONS

In this study, regional climate model outputs of precipitation and surface temperature were applied to a hydrological model (SWAT), calibrated using the ParaSol method, and its simulated discharges were compared to their observed counterparts. The performance of the model using station data rainfall has been found satisfactory; the model-derived rainfall was therefore also used to assess stream flow simulation over the current and future climate. Using the RCM outputs, the present-day and future stream flows were also simulated. Results show that the future stream flow over the Dakbla river basin is expected to increase, especially during the rainy season, which has implications not only for flood mitigation measures but also for water resources management, hydropower and agriculture. Extreme values of rainfall and discharges indicate that necessary steps should be taken for appropriate river operation management.

However, much more work is required to improve confidence in these results. Further higher resolution simulation (5–10 km) of the RCMs may be required to obtain more credible estimates of present-day and future precipitation. Since this result has been obtained only from a few RCM simulations of future climates, it is recommended to obtain an ensemble estimate of future climate change by downscaling


Figure 9 | Box plot for baseline and future for three RCMs at Kon Tum station, top: precipitation and bottom: stream flow.

more GCMs or by using perturbed initial conditions to the RCM to derive multiple estimates of climate. The hydrological simulations using the results of the derived ensemble climate simulations will add to the confidence of such a hydrological impact study.

Further developments in the RCM model physics and dynamics might also yield improvements in the climate simulations, yielding a better quality of RCM outputs which in turn might improve the hydrological simulations. As to some uncertainties from the hydrological model, improved spatial data such as the DEM might help to improve the stream flow simulations since the current version was mapped a few years ago in 2005. Other than the ParaSol method which was used for calibration, a few other auto-calibration methods which are coupled to SWAT-CUP model (SWAT Calibration Uncertainty Procedures, Abbaspour et al. ) might yield more possible outcomes which could help to understand a wider range of uncertainties.


Figure 10 | Probability distribution function for baseline and future for three RCMs at Kon Tum station, top: precipitation and bottom: stream flow.

However, the applications of these methods are comprehensive exercises that entail more sensitivity studies and experimentations; they are as such beyond the scope of this paper, yet provide possible future research work. The research findings from this study are still useful as they yield some ‘new’ information that might yield clues to the wider and larger



changes to come. This study is one of the first detailed RCM studies undertaken over this region to provide preliminary information on possible future climate change to policy makers. Once improvements in the modelling constrain these uncertainties, the plausible wider and larger changes could be used for further assessments of future changes.

REFERENCES

Abbaspour, K. C., Yang, J., Maximov, I., Siber, R., Bogner, K., Mieleitner, J., Zobrist, J. & Srinivasan, R. Modelling hydrology and water quality in the pre-alpine/alpine Thur watershed using SWAT. Journal of Hydrology 333, 413–430.

Andersson, L., Wilk, L., Todd, M., Hughes, D., Earle, A., Kniveton, D., Layberry, R. & Savenije, H. Impact of climate change and development scenarios on flow patterns in the Okavango River. Journal of Hydrology 331 (1–2), 43–57.

Arnold, J. G., Srinivasan, R., Muttiah, R. S. & Williams, J. R. Large area hydrologic modeling and assessment, part I: Model development. Journal of American Water Resources Association 34 (11), 73–89.

Giorgi, F. Simulations of regional climate using a limited area model nested in a general circulation model. Journal of Climate 3 (9), 941–963.

Graham, L. P., Hagemann, S., Jaun, S. & Beniston, M. On interpreting hydrological change from regional climate models. Climatic Change 81, 97–122.

Green, W. H. & Ampt, G. A. Studies on soil physics, Part I: The flow of air and water through soils. Journal of Agricultural Science 4, 1–24.

Hamlet, A. F. & Lettenmaier, D. P. Long-range climate forecasting and its use for water management in the Pacific Northwest region of North America. Journal of Hydroinformatics 2, 163–182.

Hargreaves, G. L., Hargreaves, G. H. & Riley, J. P. Agriculture benefits for Senegal River basin. Journal of Irrigation and Drainage Engineering 111 (2), 113–124.

Hay, L. E., Clark, M. P., Wilby, R. L., Gutowski, W. J., Leavesley, G. H., Pan, Z., Arritt, R. W. & Takle, E. S. Use of regional climate model output for hydrological simulations. Journal of Hydrometeorology 3, 571–590.

Legates, D. R. & McCabe Jr, G. J. Evaluating the use of ‘goodness-of-fit’ measure in hydrologic and hydroclimatic model validation. Water Resources Research 35 (1), 233–241.

McKay, M. D. Sensitivity and uncertainty analysis using a statistical sample of input values. In: Uncertainty Analyses (Y. Ronen, ed.). CRC Press, Boca Raton, FL, pp. 145–186.

McKay, M. D., Beckman, R. J. & Conover, W. J. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21 (2), 239–245.

Melching, C. S. & Yoon, C. G. Key sources of uncertainty in QUAL2E model of Passaic river. Journal of Water Resources Planning and Management 122 (2), 105–113.

Mitchell, T. D. & Jones, P. D. An improved method of constructing a database of monthly climate observations and associated high-resolution grids. International Journal of Climatology 25, 693–712.

Monteith, J. L. Evaporation and the Environment. Symposia of the Society for Experimental Biology. Cambridge University Press, London, pp. 205–234.

Morris, M. D. Factorial sampling plans for preliminary computation experiments. Technometrics 33, 161–174.

Nash, J. E. & Sutcliffe, J. V. River flow forecasting through conceptual models. Part 1: A discussion of principles. Journal of Hydrology 10 (3), 282–290.

Neitsch, S. L., Arnold, J. G., Kiniry, J. R., Srinivasan, R. & Williams, J. R. Soil and Water Assessment Tool Input/Output File Documentation version 2005. Grassland, Soil and Water Research Laboratory, Agricultural Research Service, Temple, Texas.

Priestley, C. H. B. & Taylor, R. J. On the assessment of surface heat flux and evaporation using large scale parameters. Monthly Weather Review 100, 81–92.

Saltelli, A., Chan, K. & Scott, E. M. (eds) Sensitivity Analysis. Wiley, New York.

Santhi, C., Arnold, J. G., Williams, J. R., Dugas, W. A., Srinivasan, R. & Hauck, L. M. Validation of the SWAT model on a large river basin with point and nonpoint sources. Journal of the American Water Resources Association 37 (5), 1169–1188.

Sushama, L., Laprise, R., Caya, D., Frigon, A. & Slivitzky, M. Canadian RCM projected climate-change signal and its sensitivity to model errors. International Journal of Climatology 26 (15), 2141–2159.

USDA Soil Conservation Service SCS National Engineering Handbook, Section 4: Hydrology. Washington, DC.

van Griensven, A. & Meixner, T. ParaSol (Parameter Solutions), PUB-IAHS Workshop Uncertainty Analysis in Environmental Modelling, July 2004. Lugano, Italy.

van Griensven, A. & Meixner, T. Methods to quantify and identify the sources of uncertainty for river basin water quality models. Water Science and Technology 53 (1), 51–59.

Wei, W. & Watkins, D. W. Probabilistic streamflow forecasts based on hydrologic persistence and large-scale climate signals in central Texas. Journal of Hydroinformatics 13, 760–774.

Yatagai, A., Kamiguchi, K., Arakawa, O., Hamada, A., Yasutomi, N. & Kitoh, A. APHRODITE: Constructing a long-term daily gridded precipitation dataset for Asia based on a dense network of rain gauges. Bulletin of American Meteorological Society 93, 1401–1415.

First received 15 October 2012; accepted in revised form 23 April 2013. Available online 25 June 2013

