© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014
Assessment of GeoEye-1 stereo-pair-generated DEM in flood mapping of an ungauged basin

I. K. Tsanis, K. D. Seiradakis, I. N. Daliakopoulos, M. G. Grillakis and A. G. Koutroulis

I. K. Tsanis (corresponding author), K. D. Seiradakis, I. N. Daliakopoulos, M. G. Grillakis, A. G. Koutroulis
Department of Environmental Engineering, Technical University of Crete, Chania, Greece
E-mail: tsanis@hydromech.gr

I. K. Tsanis
Department of Civil Engineering, McMaster University, Hamilton, Canada

ABSTRACT

A very high resolution (VHR) digital elevation model (DEM) is produced from a GeoEye-1 0.5 m resolution satellite stereo pair and is used for floodplain management and mapping applications such as watershed delineation and river cross-section extraction. For this purpose, a 2 m × 2 m resolution terrain surface is produced from the stereo pair by using the Leica Photogrammetry Suite (LPS) enhanced Automatic Terrain Extraction (eATE) algorithm. DEM accuracy is assessed by comparison with measured individual ground control points (GCPs), stream cross-sections and other landscape features. Results show that the produced DEM is in good agreement with ground truth and superior to products of lower resolution, such as 90 m NASA Shuttle Radar Topography Mission (SRTM) and 1:5,000 topographical maps. One- and two-dimensional hydraulic models are used to simulate rainfall–runoff characteristics and flood wave kinematics of the flash flood event of 17 October 2006 that occurred in the ungauged basin of Almirida, using the 2 m VHR-DEM as an input. Results show that the hydraulic simulation based on the generated VHR-DEM, calibrated and validated via field data, produces an accurate extent and water level of the flooded area. Remote sensing stereo reconstruction is a promising alternative to traditional survey methods in flood mapping applications.

Key words | digital elevation model, flash flood, flood mapping, satellite stereo pair, very high resolution
INTRODUCTION

Floods are among the world’s most costly disasters with the estimated cost of flood damage in Europe increasing significantly in the past decades (Re ; Barredo ). In 2002 alone, Europe suffered over 10 billion Euros in damages and dozens of people were killed (Toothill ). Flash floods constitute a great challenge in civil protection as they represent a great destructive force. Within minutes to a few hours from the causative storm event, flash flood water levels can reach their peak, leaving insufficient warning time to prevent human casualties (Borga et al. ; Collier ). They occur both in areas with no flooding history and areas with such frequent floods that flooding is considered a local climate component (Llasat et al. ).

The flooding potential of a hydrological basin is mainly subject to increasing human activity such as urbanization that reduces infiltration, leading to the increase of surface runoff, the shortening of the flood’s travel time and an increase in the peak flow. Urbanization can directly affect the capacity of a stream when infrastructure such as bridges is constructed within a stream encroaching the floodplain, or indirectly by causing stream channel enlargement as a response to the change in stream flow regime accompanying urbanization (Hammer ; Gregory et al. ; Konrad ). Especially in the case of ungauged basins, flash floods also pose a great challenge to science as heterogeneities and the lack of observational data enhance uncertainties in providing quantitative assessments (Sivapalan ).
doi: 10.2166/hydro.2013.197
Flood hazard and flood risk maps are essential in the mitigation of the disastrous effects of floods because they provide a proactive tool for discouraging urbanization in flood-prone areas. These products constitute an essential part of the Flood Directive 2007/60/EC (EP & CEU ) in flood risk management plans. Flood hazard maps are based on the combination of three variables: flood extent, land use and flood return period. As well as land use, urbanization can also cause changes to the floodplain geometric characteristics, thus altering the flood extent for a given flood. Theory and practice have clearly demonstrated the importance of the accurate representation of morphological channel characteristics. Hydraulic relationships between channel morphology and runoff were first explored by Leopold & Maddock (). In flood inundation modeling, digital elevation models (DEMs) greatly affect the model outputs as they have a direct influence on the total drainage
length and slope (Dutta & Herath ). Low-resolution effects such as under-sampling can cause poor terrain representation, altering water pathways and flow characteristics. This problem is very pronounced when simulating hydrological processes at a timescale shorter than that of the surface water process. At this timescale, the linkage of GIS and hydrological models becomes difficult because simulation of channel flow depends heavily on the structure of channel network.

Remote sensing techniques have provided indispensable solutions for generating DEMs for environmental surveying and planning applications (Mongus & Žalik ), especially for large area coverages. Older low-resolution DEM products (30–100 m) are adequate for numerous environmental applications (Nikolakopoulos et al. ), but provide poor terrain detail, especially in lowlands with minor slopes that are nevertheless prone to floods. DEMs derived from light detection and ranging (LiDAR) generally provide more accurate height information and have sufficient resolution, e.g. sub-meter grids of ±0.1 m vertical accuracy (Lane et al. ; Sanders ). While airborne laser scanning ‘is here to stay’ (Baltsavias ), it is still costly for large applications (Figure 1) and there are a few inherent shortcomings of the LiDAR technology, e.g. lack of correspondence to objects, no redundancy in the measurements, strong dependency on material features, missing visual coverage (Toth & Grejner-Brzezinska ). Furthermore, some regions in LiDAR data have null values due to self-occlusion of buildings (Lee et al. ) or the presence of water bodies (Awrangjeb et al. ). Synthetic aperture radar interferometry (InSAR) is also a highly effective tool for extracting DEMs, with vertical accuracy that can range from c. 1 m to c. 10 m (Sanders ) depending on altitude of observation (Figure 1). InSAR can penetrate cloud cover with negligible attenuation, but suffers in highly vegetated areas and effects such as shadowing and layover limit its applicability to flat and moderately rough terrains (Eineder ). Figure 1 shows a rough cost-effect analysis of various DEM generation technologies.

Figure 1 | Relative cost and accuracy of DEM generation technologies (Richards 2007).

Recently available very high resolution (VHR) satellite stereo-pair products promise to deliver affordable sub-meter DEM accuracy that can potentially lead to the much-needed flood hazard and risk maps necessary for flood damage analysis (FDA) studies (Boyle et al. ; Koutroulis & Tsanis ).

GeoEye-1, the imaging system with highest resolution available commercially, can collect samples with a ground resolution of 0.41 m in the panchromatic or black and white mode and multispectral or color imagery at 1.65 m resolution. While the satellite collects imagery at 0.41 m, GeoEye’s operation license from the US government requires resampling the imagery to 0.5 m for all customers. The intended use of the GeoEye GeoStereo product is to
obtain an accurate DEM generation for 3D viewing and feature extraction applications. An automatic DEM generation from VHR satellite image stereo pairs offers new challenges for developing techniques for the automatic interpretation of image structures (Krauß et al. ). The exploration of the efficiency of the data sources in combination with a specific application and advances in available processing methods are still in demand (Croitoru et al. ).

Due to computational simplicity and the relative ease of parameterization and calibration, 1D hydraulic models are today a staple of flood modeling (Marks & Bates ), providing sufficient accuracy in small computational times even when coupled with detailed topographic data. However, when dealing with highly complex and varying hydraulic parameters such as flood inundation prediction and flash floods, DEMs can be used to parameterize a 2D hydraulic model (Sanders ) in order to offer a better representation of the flow field, at the cost of data, model and computation simplicity (Gogoase et al. ). In this study, the topography extracted from a GeoEye-1 stereo-pair-generated DEM is investigated for its accuracy compared to other widely used DEM products and conventional survey topographical techniques. The produced high-resolution DEM is then tested for its efficiency in 1D and 2D hydraulic simulation and flood mapping of a flash flood event that occurred in the Almirida area on the island of Crete in 2006, causing damage to property and the loss of a human life.

METHODOLOGY

DEM extraction

The first requirement for a DEM from satellite imagery is a satellite stereo-pair product accompanied by a sensor model that describes the geometric relationship between the 3D object space Ob(X, Y, Z) and 2D image space Im(r, c). The sensor model can be made available in two forms which include the rigorous physical sensor model and the generalized or abstract sensor model. The physical sensor model includes all of the internal and external (i.e. position and orientation information) sensor model information associated with a specific satellite sensor as exists when the imagery is being captured. The abstract model is commonly defined by 79 rational polynomial coefficients (RPCs) approximating the specific sensor model information to map geodetic ground points to the imaging system’s pixel coordinates. Ground control points (GCPs) play a very important role in satellite image adjustment. They are actually essential when no physical sensor model or RPC model is available, as shown in Figure 2. Depending on the type of solution, a terrain-dependent or independent approach can be followed, with the final step of either approach always being an RPC (Figure 2). In the case where one of them is available, as in the case of GeoEye-1 imagery, GCPs are used in refinement of the RPC solution.

Figure 2 | Schematic of stereo-pair processing depending on sensor model availability.

The advantage of rational functions is that they are sensor independent, which means that the user does not need to know all of the specific internal and external
camera information. For the ground-to-image transformation, the defined ratios of polynomials have the forward form presented in Equations (1) and (2):

$$ r = \frac{(1\; Z\; Y\; X\; \cdots\; Y^{3}\; X^{3})\,(a_0\; a_1\; \cdots\; a_{19})^{T}}{(1\; Z\; Y\; X\; \cdots\; Y^{3}\; X^{3})\,(1\; b_1\; \cdots\; b_{19})^{T}} \tag{1} $$

$$ c = \frac{(1\; Z\; Y\; X\; \cdots\; Y^{3}\; X^{3})\,(c_0\; c_1\; \cdots\; c_{19})^{T}}{(1\; Z\; Y\; X\; \cdots\; Y^{3}\; X^{3})\,(1\; d_1\; \cdots\; d_{19})^{T}} \tag{2} $$

where r, c are image space coordinates, X, Y, Z are ground coordinates and a, b, c, d are the respective RPCs (OGC ) provided by the satellite product vendor. The inverse process allows the user to perform photogrammetric tasks such as orthorectification and stereo reconstruction and requires mutual information matching between the stereo-pair members. Essentially, a pixel set {ImL(r1, c1), ImR(r2, c2)} depicting the same object has to be specified in order to solve Equations (1) and (2) iteratively towards the object’s real world coordinates (Tao & Hu ; Hu et al. ). Given that the distance between pixels ImL(r1, c1) and ImR(r2, c2), also called disparity, is independent of pixel intensity, one band from each member of the stereo pair is usually enough for DEM extraction. While this computation is straightforward and converges fast (Lin & Yuan ), an automatic pixel-wise matching of large stereo-pair products becomes more challenging as local and later global optimization need to be applied in order to ensure robustness. The Leica Photogrammetry Suite (LPS) eATE (enhanced Automatic Terrain Extraction) algorithm is an advanced tool for extracting high-density terrain surfaces from stereo imagery based on a pixel-wise matching of mutual information. It approximates a global 2D smoothness constraint by combining many 1D constraints (Hirschmuller ), otherwise known as semi-global matching (Previtali et al. ). The performance of the algorithm eATE is based on user-defined regional strategies and parameters that control and guide the terrain processing.

Products of this process can vary from 2D maps, stereo-based 3D reconstruction and single-image-based 3D point extraction to orthorectification. A DEM is a 3D representation of a terrain’s surface and can either represent bare ground surface (digital terrain model, DTM) or include above-ground structures and vegetation (digital surface model, DSM). Depending on the problem in hand, surface features are deemed either required (e.g. Priestnall et al. ) or redundant (e.g. Gomes Pereira & Janssen ). A DEM is commonly interpolated from a set of vertices in a 3D coordinate system on a Cartesian grid where it is possible to draw and estimate individual terrain profiles or cross-sections along a stream or river. These profiles and cross-sections can then be compared with field measurements to check for DEM reliability. This step is essential if accurate cross-sections are to be provided for the next step of hydraulic simulation. Following the model set-up with field parameters, a measured or estimated flow can be applied in order to create flood inundation lines to define flood boundaries for the specified flood. A schematic description of the DEM validation process is presented in Figure 3.

DEM quality

The quality of each DEM product against measured points can be estimated using the Root Mean Square Error (RMSE) of elevations defined in Equation (3):

$$ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{z}_i - z_i)^{2}}{n}} \tag{3} $$

where $z_i$ is the elevation of the measured point i considered as ground truth, $\hat{z}_i$ is the elevation of point i on each DEM product and n is the number of the measurements. The statistics of the relative error of elevation estimation $(\hat{z}_i - z_i)/\hat{z}_i$ can also provide valuable information about the distribution of error. Total watershed area, the upslope area that contributes flow to a common outlet at the lowest point along the boundary of the watershed, can be used as a means of comparison among DEM products. Another indicator of DEM quality is the average Euclidean distance of the estimated flow path from the actual stream centerline. Flow path P(xi, yi) can be extracted using the method of deriving accumulated flow from a DEM presented in Jenson & Domingue (), while the stream centerline S(xj, yj) can be measured in the field or estimated otherwise. Given that individual points in P and S are not necessarily equidistant or the same in number (i ≠ j), the
average distance d to the flow can be approximated using Equation (4):

$$ d(P, S) = \frac{1}{n} \sum_{i=1}^{n} \min_{j=1,\dots,m} \sqrt{(x_i - x_j)^{2} + (y_i - y_j)^{2}} \tag{4} $$

or otherwise, the average Euclidean distance of each point of P with the closest point of S.

Figure 3 | Schematic of the cross-section extraction process for a successful hydraulic simulation.

Hydraulic modeling

The Hydrologic Engineering Center, River Analysis System (HEC-RAS), which was developed by the US Army Corps of Engineers, has been applied extensively in calculating the hydraulic characteristics of rivers (Pappenberger et al. ; Carson ). HEC-RAS is designed to perform 1D hydraulic calculations for a full network of natural and constructed channels (Brunner ) and has an unsteady flow component that is capable of simulating 1D unsteady flow through a full network of open channels. The physical laws which govern the unsteady flow of water in a stream are the principle of conservation of mass (continuity) which implies that the mass of a closed system (in the sense of a control volume) will remain constant over time and the principle of conservation of momentum which states that the total momentum of a closed system (in this case a control volume) is constant. The unsteady flow equation solver, developed primarily for subcritical flow regime calculations, was adapted from the UNET model (Barkau ; Brunner ).

A 2D hydraulic model was also set up with the help of the CCHE-2D finite difference code developed at the National Center for Computational Hydroscience and Engineering (NCCHE), University of Mississippi. CCHE2D is an unsteady, turbulent flow model with non-uniform sediment and conservative pollutant transport capabilities. An efficient element scheme of Wang & Hu () is incorporated to numerically solve the 2D depth-averaged shallow water equations. The capability of CCHE-2D to simulate subcritical and supercritical free surface floods has been verified and validated using analytic methods and many sets of physical model data and field data (Jia & Wang ; Jia et al. ). The governing equations of CCHE-2D used for simulating the flow field are the continuity equation and the momentum equations in x and y directions as shown by Nassar (). Details about the model and its components are given by Zhang (, ).
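Returning to the ground-to-image transformation of Equations (1) and (2), the following is a minimal Python sketch of how such a ratio of cubic polynomials could be evaluated. It is an illustration under stated assumptions, not the vendor's or LPS implementation: the function names are hypothetical, the ordering of the 20 monomials beyond the leading 1, Z, Y, X is assumed (the equations elide it), and the coordinate normalization (offsets and scales) that real RPC products normally apply before evaluation is omitted.

```python
from itertools import product

# Exponent triples (i, j, k) for X^i * Y^j * Z^k with i + j + k <= 3 (20 terms).
# Sorting by total degree yields 1, Z, Y, X first, as in Equations (1) and (2);
# the order of the remaining terms is an ASSUMPTION made for illustration only.
EXPONENTS = sorted(
    (e for e in product(range(4), repeat=3) if sum(e) <= 3),
    key=lambda e: (sum(e), e),
)


def cubic_terms(X, Y, Z):
    """Return the 20 monomials of a cubic polynomial in the ground coordinates."""
    return [X**i * Y**j * Z**k for i, j, k in EXPONENTS]


def ground_to_image(X, Y, Z, a, b, c, d):
    """Evaluate the forward rational form: (row, column) = (Pa/Pb, Pc/Pd).

    a, b, c, d are sequences of 20 coefficients; b[0] and d[0] must equal 1,
    matching the denominators (1 b1 ... b19) and (1 d1 ... d19).
    """
    for name, coeffs in (("a", a), ("b", b), ("c", c), ("d", d)):
        if len(coeffs) != 20:
            raise ValueError(f"coefficient vector {name} must have 20 entries")
    if b[0] != 1 or d[0] != 1:
        raise ValueError("b[0] and d[0] must be 1")
    terms = cubic_terms(X, Y, Z)

    def poly(coeffs):
        # Dot product of a coefficient vector with the monomial vector.
        return sum(k * t for k, t in zip(coeffs, terms))

    return poly(a) / poly(b), poly(c) / poly(d)


# Hypothetical coefficients: r = 5 + 2X and c = (1 + 3Y) / (1 + 0.5Z).
a = [0.0] * 20
a[0], a[3] = 5.0, 2.0          # constant term and X term
b = [1.0] + [0.0] * 19
c = [0.0] * 20
c[0], c[2] = 1.0, 3.0          # constant term and Y term
d = [1.0] + [0.0] * 19
d[1] = 0.5                     # Z term

print(ground_to_image(2.0, 1.0, 2.0, a, b, c, d))
```

Stereo reconstruction works in the opposite direction: for a matched pixel pair, the ground coordinates are adjusted iteratively until both members' forward projections agree with the observed pixels, which is why the forward form has to be cheap to evaluate.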
CASE STUDY

Almirida, Crete

Crete has a typical Mediterranean island environment with about 53% of the annual precipitation occurring in the winter, 23% during autumn and 20% during spring while there is negligible rainfall during summer (Naoum & Tsanis ; Koutroulis & Tsanis ). The average precipitation ranges from 440 mm a–1 in the east to more than 2,000 mm a–1 at the uplands of western Crete, where orographic effects tend to increase both frequency and intensity of winter precipitation (Naoum & Tsanis ; Roe ; Koutroulis & Tsanis ). These characteristics together with the rugged topography and small-size basins make Crete a rather flood-prone region (Koutroulis et al. ). The hydrological basin of the coastal village of Almirida (Figure 4) covers an area of 25 km2 and receives an average annual precipitation of 648 mm (based on a 32-year record). The topography consists of mild slopes in the major part of the watershed and a few areas with slopes over 10%. The watershed exhibits a moderate orography, with a maximum elevation of 527 m and a mean watershed elevation of 206 m. The area is cultivated and covered mainly by olive trees and natural vegetation. Following the CORINE 2000 Land Cover maps (EEA-ETC/TE ), the watershed is mainly covered by scrub and natural grassland (42.3%), olive groves (36.6%) and agricultural land interrupted by wide areas of natural vegetation (20.6%), with as little as 0.6% urban fabric.

Flash flood and post-event field survey

On 16 October 2006, a frontal depression that was already located over the central Mediterranean moved eastwards and crossed over the island of Crete by midday. The low pressure with center 1,010 hPa over Malta moved rapidly eastwards and deepened, so that at 00:00 UTC 17 October (Figure 5(a)), it was centered just southwest of Crete with center lower than 1,008 hPa. High-resolution MeteoSat imagery (Figure 5(b)) shows the frontal depression moving northeast towards the island of Crete. The meteorological formation developed an estimated speed of approximately 65 km h–1 that classifies it as a potential Mediterranean tropical storm, a rather rare phenomenon (Lagouvardos et al. ).
Figure 4 | Almirida basin in the island of Crete, Greece and GCP location.
Figure 5 | (a) Mean sea-level pressure at 0000 UTC for 17/10/2006 (isobars in hPa). (b) High-resolution METEOSAT satellite image showing the frontal depression moving northeast towards the island of Crete.
The total measured rainfall depth for the event was 200 mm with the major volume precipitating during the course of 7 hours (07:00 to 14:00 local time). The peak rainfall took place at noon and a joint estimation using terrestrial and radar measurements showed that it was about 23.0 mm h–1 and lasted for about 30 min (Daliakopoulos & Tsanis ), enhancing surface flow and generating an estimated peak discharge of 225 m3 s–1. The discharge was estimated with the help of floodmarks and 1D hydraulic modeling of the basin (Gaume et al. ) at a location shown in Figure 6. The flood resulted in more than 3 million euros in damage and the death of a local resident. During the days after the flood, topographic measurements and photographic material were collected from field surveys and local resident testimonies. At the outlet of the basin the flood marks of the maximum stage were still clearly visible, reaching as high as 2 m at the downstream control cross-section. Due to the fact that the basin is ungauged and the stream of Almirida is ephemeral, no other flow events have been recorded.

Terrain data availability

Selection of the most cost-effective high-resolution stereo product for the case study followed a simple ranking of costs for each available technological option (Table 1), where the GeoEye-1 GeoStereo stereo pair proved to be the most reasonably priced solution for the resolution provided. While SPOT 2.5 m panchromatic and IKONOS 0.8 m were less expensive, GeoEye provides a competitive value at the best-possible resolution for satellite products. Products such as LiDAR that can potentially yield superior results are rendered uneconomical by the size of the basin and the poor availability of airborne sensors in the proximity (Table 1). The GeoEye-1 GeoStereo stereo pair used in the present study was acquired on 13 August 2009 over the wider area of Almirida watershed. The product is characterized as panchromatic–multispectral, has 0.5 m pixel size and the two members were collected at angles 79.55334° and 62.05786°. Samples from the three visible bands composite members of a small area of Almirida watershed stereo-pair images are shown in Figure 7.

For the purposes of this study, GCPs were considered as the ground truth against which elevation models can be compared. Ninety high-quality GCPs were collected within the study area, at open areas with bare terrain, using a pair of differential GPS (DGPS) Leica GS20 Professional Data Mappers. DGPS measurements were corrected offline using the L1 pseudo-range in combination with station TUC2 from the Reference Frame Sub-Commission for Europe (EUREF) Permanent Network (EPN), located within the Technical University of Crete campus. Ten of
the GCPs were used as control points during the DEM extraction algorithm while the remaining 80 points in the set were used as check points for DEM accuracy validation.

Table 1 | Minimum order area, unit cost and final cost for an area equal to that of Almirida basin (25 km2) for various high-resolution stereo products

Product                            Minimum order area (km2)   Unit cost per km2   Cost for 25 km2   Cost in $ (c) for 25 km2
SPOT satellite 2.5 m b&w (a)       60                         1.5 €               5,400 €           3,780
IKONOS 0.8 m Geostereo (a)         100                        $45                 $4,500            4,500
GeoEye-1 0.5 m Geostereo (a)       100                        $50                 $5,000            5,000
SPOT satellite 2.5 m color (a)     60                         2.25 €              8,100 €           5,670
Quickbird, Worldview1/2 2 m (a)    210                        $40                 $8,400            8,400
Airborne stereo SAR (b)            200                        100 €               20,000 €          14,000
Airborne IFSAR (b)                 200                        100 €               20,000 €          14,000
Aerial photography (b)             200                        100 €               20,000 €          14,000
LiDAR (b)                          200                        100 €               20,000 €          14,000

(a) e-GEOS Price List for 2012.
(b) B. Charalampopoulou, personal communication (2013).
(c) Average exchange rate for August 2009 1$ c. 0.7€.

Figure 6 | Almirida basin with ground truth profile measurement locations and profiles at the lower stream (detail on the right).

Additional GCPs were collected using a total station, an electronic theodolite (transit) integrated with an electronic distance meter to read coordinates from the instrument to a particular point. Total station measurements are taken only with respect to the instrument; they therefore have to be referenced later using GPS measurements. A total of 650 validation points were collected using this method. From those, 500 points comprise a set of narrow (<50 m) but dense stream cross-sections near the outlet of the Almirida watershed (Figure 6, inset). An additional 150 points comprise a set of wide (>80 m) profiles that include cross-sections (S1, S2) and various accessible slopes (S3, S4, S5) within the watershed (Figure 6). The aim of this comparison is mainly to reveal the DEM efficiency in capturing the geometry of a random point, stream cross-section or profile and to describe the ground geometry for hydraulic applications. Even though various DEMs with different pixel size were extracted for comparison, the resolution selected for further analysis has a 2 m pixel size. Three reference DEMs were also
used: (1) 10 m × 10 m DEM produced from aerial photos; (2) 30 m × 30 m DEM digitized from 1:5,000 Hellenic Military Geographic Service (HMGS) topographical maps; and (3) NASA SRTM90 90 m × 90 m product (Jarvis et al. ).

Figure 7 | Members of a sample GeoEye-1 stereo pair from Almirida basin.

RESULTS

The selection of GCPs on stereo-pair images is usually a subjective process that needs caution, as the extracted DEM quality is highly dependent on them. During the course of the study, it was determined that the quality of the initial RPC correction is such that the sensitivity of the overall quality to the number of check points is relatively small. In order to document this lack of sensitivity, DEMs of the same resolution (2 m × 2 m) were produced using a varying number of control points randomly selected from the set of 10 GCPs reserved for control. Figure 8 shows that the RMSE of check points against GPS measurements for correction remains relatively steady around 0.9 m. It is therefore inferred that the margin for further corrections is narrow and the number and location of GCPs have limited impacts on the precision of object point determination. This finding is in agreement with previous studies using RPC
block adjustment (e.g. Fraser et al. ). In this sense, the incorporation of the RPC successfully replaces the sensor model and reduces the need for extensive GCP collection.

Figure 8 | RMSE of check points against GPS measurements using random subsets of 4–8 from a total of 10 control points for use in RPC refinement.

The quality of each available DEM was initially evaluated by calculating the RMSE of the elevation of control, check and validation points against the respective elevation estimated at each DEM at the same location (Table 2). The results show that the stereo-pair-produced DEM scored the smallest RMSE (0.79 m for control points, 0.90 m for check points and 1.06 m for validation points). The RMSE was found to be lower than that obtained by aerial photos (1.88, 1.30 and 1.14 m, respectively) and significantly lower than the SRTM90 and HMGS30 products (Table 2). It is interesting to note that control and check points were collected using only GPS measurements, making them potentially less likely to be affected by the inherent systematic error of the 650 validation points that were collected by transit and referenced by GPS. Consequently, the GeoEye-1 stereo-pair-derived DEM yields a meter or sub-meter resolution which makes it a good candidate for a wide range of applications.

Table 2 | Statistics of different DEM products, including RMSE value for: (a) 10 control points used for the DEM generation; (b) 80 check points measured with GPS; and (c) 500 validation points measured using GPS and transit

DEM product                                                  Stereo pair   Aerial photos   HMGS          SRTM
                                                             2 m × 2 m     10 m × 10 m     30 m × 30 m   90 m × 90 m
RMSE (m), control points                                     0.79          1.88            4.28          31.63
RMSE (m), check points                                       0.90          1.30            7.00          6.92
RMSE (m), validation points                                  1.06          1.14            7.74          5.18
Average relative error in validation set (%)                 3.2           0.3             118.5         73.9
Standard deviation of relative error in validation set (%)   17.5          19.7            47.2          56.8
Watershed area (km2)                                         25.814        25.893          24.962        26.133
Average distance d from stream center (m)                    2.31          4.89            31.63         43.19

In terms of total delineated watershed area, the four compared DEMs delivered different results using the standard D-8 algorithm (Tribe ) found in ArcGIS. The stereo-pair-generated DEM calculated an area of 25.814 km2, while the aerial-photo-generated DEM (10 m × 10 m) estimated a similar area of 25.893 km2 (0.3% larger). The SRTM90 and HMGS30 DEMs estimated a watershed area of 26.133 and 24.962 km2, respectively, which correspond to a 1.2% larger and a 3.3% smaller area compared to the stereo-pair-generated DEM (Table 2). While there is no objective method to estimate the true watershed area at a given resolution, results show that discrepancies are not significant. Furthermore, taking advantage of the superior absolute horizontal accuracy of GeoEye-1 (4 m CE90, horizontal, without GCP, for the stereo product) over the relatively lower absolute vertical accuracy (6 m LE90, vertical, without GCP), the main stream channel of the watershed was manually digitized and used to estimate its average distance d from extracted DEMs. The two lower-resolution DEMs (HMGS30 and SRTM90) fail to capture the river position (Figure 9). Analysis of the average distance between the manually digitized and DEM-produced stream centerline was conducted for
up to a distance of 1.5 km from the outlet of the stream. The low-resolution effect of older DEM products can lead to poor terrain representation, altering water pathways during hydraulic modeling by over 30 m on average (Table 2). On the other hand, the stereo-pair- and the aerial-photo-generated DEMs show sufficient agreement to the manually digitized stream path, having an average distance of 2.31 and 4.89 m, respectively (Table 2).

Figure 9 | Comparison between stream definitions acquired by manual digitization on the orthorectified GeoEye-1 image (white path) and flow accumulation estimation on each DEM (dashed line).

Measured profiles (Figure 7) were compared against profiles extracted from each DEM product. The results at each location are shown in Figure 10. For illustration purposes, the SRTM90 DEM was excluded from Figure 10 due to its distance from the other DEM product. The results show that the stereo-pair-generated DEM delivered profiles that were very close to those acquired by total station (Table 2, Figure 11). The comparison between the different DEMs reveals that the lower-resolution DEMs (HMGS30, SRTM90) fail to describe the profile geometry; the low resolution acts as a smoothing filter of the field. At the same time, even the finer-resolution aerial-photos-generated
DEM (while improved) failed to capture the depth of the stream as the channel width can often measure less than 10 m. In contrast, the stereo-pair-generated DEM describes width and depth of both cross-sections well in the cases S1 and S2 (Figure 10).

Figure 10 | Comparison of DEM extraction results with reference DEMs and profile validation points considered as ground truth: (a)–(e) profiles S1–S5, respectively.

In profiles S3, S4 and S5, the stereo-pair-generated DEMs exhibit very good agreement compared to the total station measurements, capturing both the changes of the ground slope and the ground elevation in detail. The aerial-photo-generated DEM is also close to the ground truth (Table 2, Figure 11). In some parts of the profiles, it however over- or underestimates the ground elevation while exhibiting poor fit in small and
13
I. K. Tsanis et al.
Figure 11
|
|
GeoEye-1-generated DEM for flood mapping
Journal of Hydroinformatics
|
16.1
|
2014
Relative error of validation points against elevation for all DEM products.
abrupt changes of the slope, due to its medium spatial
only range between 1.7 and 11.0 m, Figure 11 presents a
resolution.
clear distinction between fine- and coarse-resolution
The relative error of 500 validation points for all DEM
DEMs. In this context, the 2 and 10 m DEMs show virtually
products is plotted against elevation in order to distinguish
no error trend against elevation change with average relative
possible trends in the data (Figure 11). While measurements
error being 3.2% (±17.5%) and 0.3% (±19.7%), respectively.
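The relative-error statistics quoted above can be illustrated with a minimal sketch. It assumes that the relative error is defined as 100 × (DEM − reference) / reference and that the ± values denote a sample standard deviation; neither convention is stated in the text. The function names and the four validation points are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def relative_errors(z_dem, z_ref):
    """Relative elevation error in percent for each validation point.

    Assumed convention (not stated in the text):
    100 * (DEM elevation - reference elevation) / reference elevation.
    """
    return [100.0 * (d - r) / r for d, r in zip(z_dem, z_ref)]


def summarize(z_dem, z_ref):
    """Return (mean, sample standard deviation, trend slope) of the relative error.

    The slope is the least-squares slope of relative error (%) against
    reference elevation (m); a value near zero indicates no trend.
    """
    err = relative_errors(z_dem, z_ref)
    z_bar, e_bar = mean(z_ref), mean(err)
    sxx = sum((z - z_bar) ** 2 for z in z_ref)
    sxy = sum((z - z_bar) * (e - e_bar) for z, e in zip(z_ref, err))
    return e_bar, stdev(err), sxy / sxx


# Hypothetical validation points (m); not data from the study.
z_ref = [2.0, 4.0, 5.0, 10.0]
z_dem = [2.2, 4.0, 4.5, 10.5]
avg, spread, slope = summarize(z_dem, z_ref)
print(f"mean = {avg:.2f}%, std = {spread:.2f}%, slope = {slope:.3f} %/m")
```

Applied per DEM product, a summary of this kind gives the average error, its spread, and a simple indicator of whether the error changes with elevation.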
On the other hand, coarser DEMs yield average relative errors over 70% with a decreasing trend (Table 2). Comparing between HMGS (30 m × 30 m) and SRTM (90 m × 90 m), the former overestimates elevations for all measurement ranges whereas the latter seems to converge to accurate estimations after 10 m. Furthermore, relative error variance is higher in small elevations, an observation that becomes more evident in coarser-resolution DEMs.

In the current case study of the Almirida flash flood, the produced 2-m-resolution DEM delivered the cross-section data that were used in the hydraulic simulation in HEC-RAS. Sixteen cross-sections were used to simulate approximately 730 m of the lower river affected by the flash flood. The width of the cross-sections varied from 107 to 294 m, depending on the recorded flood extent on each cross-section. The simulation time step was set at 10 s. Based on the stream characteristics, the Manning friction coefficient was considered uniform and equal to 0.04 for the entire river bed. For the floodplain, the Manning coefficient was set at 0.08. Figure 12(a) shows the representation of the flood extent at peak flow. This qualitative result is based on the observation that part of the flood wave follows streets and flows around buildings, clearly visible in the satellite image showing successful modeling and DEM quality. CCHE-2D was set up using a total of 85 100 nodes and calibrated using lower Manning coefficients (0.02 for channel and 0.05 for the floodplain). Figures 12(c) and 12(d) show the model results for 2 m × 2 m and 10 m × 10 m DEMs, respectively. For the purposes of this study, which is the extraction of accurate flood maps, the 1D and 2D models have produced visually similar results (Figure 12). In Figures 12(c) and 12(d), cross-sections and locations of hydrograph are shown for reference with Figure 7.

In a comparative study, the superiority of the high-resolution DEM over lower-resolution products such as the 10 m DEM is evident (Figure 12(b)). As well as the poor representation of stage, there are often areas where a low-resolution map is not accurate in terms of flood extent. The comparison of relative thalweg and water surface among DEMs of 2, 10 and 30 m resolution is shown in Figure 13(a). While differences in the hydraulic simulations between 2 and 10 m DEMs are not so pronounced, the lower-resolution DEM estimates higher thalweg elevations, but underestimates the relative depth of water at the time of peak flow (Figure 13(b)).

CONCLUSIONS

This research covers a new approach to more accurate and cost-effective DEM production for use in 1D and 2D hydraulic modeling. As such, it is part of a broader European effort to document storm and flood processes, improve modeling results and provide valuable tools for flood managers in Europe as well as other countries. While it has to be acknowledged that full awareness of the complex processes governing individual ungauged basins is not possible (Sivapalan ), uncertainty can be compensated for by using innovative methodologies in some aspects of modeling. Such multi-discipline research and modeling applied to a large geographical area is critically needed in many countries, where documenting basic flood processes has been lacking for the past few decades. While analytical methods and models have been improving, there is no concerted national effort for basic data collection and documentation.

Accurate flood extent is a key feature for developing detailed flood hazard maps, especially in flood-prone areas with substantial development. This work shows that traditional topographic surveying effort can be significantly reduced when a VHR DEM is available. The extraction of a 2 m × 2 m VHR DEM from a satellite stereo pair and its use in hydraulic modeling is presented. The DEM is extracted from a 0.5 m GeoEye-1 Geostereo, a state-of-the-art VHR satellite image, using the LPS eATE algorithm and the image’s RPC model and GCPs. The resolution of the final product is one of the highest among commercially and freely available DEMs, and the product scores the smallest RMSE among all reference DEMs used for comparison (Table 2). In the case of the 2006 flash flood in Almirida, apart from upstream damage the overbank water flow extended over 150 m west of the stream channel and flooded the urban area (Figure 12(a)). Increasing but largely unplanned tourism development during the last two decades has resulted in uncontrolled urbanization inside the ephemeral stream delta as well as upstream, a fact that played a significant (if not the dominant) role in the evolution of the flash
flood and the incurred damages. It is evident that proactive flood protection measures have to be taken in order to avoid similar future situations that pose great risk to life and property. The European Flood Directive presented in 2007 asked EC member states to prepare flood risk maps by 2013. Provided that flood frequency information is available, the high-resolution elevation information presented in this study could support cost-effective flood risk mapping and could be a part of the work done to meet the directive.

Figure 12 | Flood extent using HEC-RAS based on the (a) 2 m DEM and (b) 10 m DEM; flood extent using CCHE-2D based on the (c) 2 m DEM and (d) 10 m DEM.

Figure 13 | (a) Comparison of thalweg and water surface from three different DEMs and (b) stage estimates for each case.

The limitation of the presented methodology is that the final VHR DEM product includes elevation information about vegetation and structures over the actual terrain, which needs to be removed manually. While some of this information is desirable for flood modeling (e.g. building height), it is often considered redundant in a DEM extraction that aims to depict bare terrain. Multispectral satellite imagery, such as GeoEye-1 products, can capture structures and vegetation by taking advantage of their respective spectral reflectance curves which are characteristic for various material and texture classes (Daliakopoulos et al. ). The incorporation of this land use classification information included in multispectral satellite imagery can greatly enhance the identification and extraction of such features and assist in their supervised or unsupervised removal. Further research could also include the comparison of DEMs extracted from competitive commercial products such as WorldView-2 stereo-pair imagery.

ACKNOWLEDGEMENTS

Through its establishment ESRIN, the European Space Agency (ESA) supported this work through the ‘High Resolution Satellite Imagery for Floodplain Mapping (SImFlood)’ Project, contract no. 22306/09/J-LG. Post-event surveys were supported by the European Community funded project, HYDRATE, Sixth Framework Programme, contract no. 037024. The authors would also like to thank the anonymous reviewers for their valuable comments and suggestions which improved the quality of the paper.

REFERENCES

Awrangjeb, M., Ravanbakhsh, M. & Fraser, C. S. Automatic detection of residential buildings using LIDAR data and
multispectral imagery. ISPRS Journal of Photogrammetry and Remote Sensing 65 (5), 457–467. Baltsavias, E. P. A comparison between photogrammetry and laser scanning. ISPRS Journal of Photogrammetry and Remote Sensing 54 (2), 83–94. Barkau, R. L. UNET, One-dimensional Unsteady Flow Through a Full Network of Open Channels: User’s Manual. US Army COE, Hydrologic Engineering Center, Davis, CA. Barredo, J. Normalised flood losses in Europe: 1970–2006. Natural Hazards and Earth System Sciences 9 (1), 97–104. Borga, M., Boscolo, P., Zanon, F. & Sangati, M. Hydrometeorological analysis of the 29 August 2003 flash flood in the Eastern Italian Alps. Journal of Hydrometeorology 8 (5), 1049–1067. Boyle, S., Tsanis, I. & Kanaroglou, P. Developing geographic information systems for land use impact assessment in flooding conditions. Journal of Water Resources Planning and Management 124 (2), 89–98. Brunner, G. W. HEC-RAS River Analysis System. Hydraulic Reference Manual. Version 1.0., DTIC Document. Hydrologic Engineering Center, Davis, CA. Carson, E. C. Hydrologic modeling of flood conveyance and impacts of historic overbank sedimentation on West Fork Black’s Fork, Uinta Mountains, northeastern Utah, USA. Geomorphology 75 (3), 368–383. Collier, C. Flash flood forecasting: What are the limits of predictability? Quarterly Journal of the Royal Meteorological Society 133 (622), 3–23. Croitoru, A., Hu, Y., Tao, V., Xu, Z., Wang, F. & Lenson, P. Single and stereo based 3d metrology from high-resolution imagery: methodologies and accuracies. International Archives of Photogrammetry and Remote Sensing 35, 1022–1027. Daliakopoulos, I. N. & Tsanis, I. K. A weather radar data processing module for storm analysis. Journal of Hydroinformatics 14 (2), 332–344. Daliakopoulos, I. N., Grillakis, E. G., Koutroulis, A. G. & Tsanis, I. K. Tree crown detection on multispectral VHR satellite imagery. Photogrammetric Engineering and Remote Sensing 75 (10), 1201. Dutta, D. & Herath, S. 
Effect of DEM accuracy in flood inundation simulation using distributed hydrological models. Monthly Journal of Institute of Industrial Science, University of Tokyo 53 (11), 602–605. EEA-ETC/TE CORINE land cover update. I& CLC2000 project. Available at http://terrestrial.eionet.eu.int. Eineder, M. Alpine digital elevation models from radar interferometry: A generic approach to exploit multiple imaging geometries. Photogrammetrie Fernerkundung Geoinformation 2005 (6), 477. EP & CEU Directive on the assessment and management of flood risks (2007/60/EC). Official Journal of the European Union L288/27–L288/34. Fraser, C., Dial, G. & Grodecki, J. Sensor orientation via RPCs. ISPRS Journal of Photogrammetry and Remote Sensing 60 (3), 182–194.
Gaume, E., Bain, V., Bernardara, P., Newinger, O., Barbuc, M., Bateman, A., Blaškovičová, L., Blöschl, G., Borga, M., Dumitrescu, A., Daliakopoulos, I., Garcia, J., Irimescu, A., Kohnova, S., Koutroulis, A., Marchi, L., Matreata, S., Medina, V., Preciso, E., Sempere-Torres, D., Stancalie, G., Szolgay, J., Tsanis, I., Velasco, D. & Viglione, A. A compilation of data on European flash floods. Journal of Hydrology 367 (1), 70–78. Gogoase, D. E. N., Armaş, I. & Ionescu, C. S. Inundation maps for extreme flood events at the mouth of the Danube River. International Journal of Geosciences 2, 68–74. Gomes Pereira, L. M. & Janssen, L. L. F. Suitability of laser data for DTM generation: a case study in the context of road planning and design. ISPRS Journal of Photogrammetry and Remote Sensing 54 (4), 244–253. Gregory, K., Davis, R. & Downs, P. Identification of river channel change to due to urbanization. Applied Geography 12 (4), 299–318. Hammer, T. R. Stream channel enlargement due to urbanization. Water Resources Research 8 (6), 1530–1540. Hirschmuller, H. Stereo processing by semiglobal matching and mutual information. IEEE Transactions on Pattern Analysis and Machine Intelligence 30 (2), 328–341. Hu, Y., Tao, V. & Croitoru, A. Understanding the rational function model: methods and applications. International Archives of Photogrammetry and Remote Sensing 20, 6. Jarvis, A., Reuter, H., Nelson, A. & Guevara, E. Hole-filled SRTM for the globe Version 4. Available from the CGIARCSI SRTM 90 m database at http://srtm.csi.cgiar.org. Jenson, S. & Domingue, J. Extracting topographic structure from digital elevation data for geographic information system analysis. Photogrammetric Engineering and Remote Sensing 54 (11), 1593–1600. Jia, Y. & Wang, S. S. Y. Numerical model for channel flow and morphological change studies. Journal of Hydraulic Engineering 125 (9), 924–933. Jia, Y., Sam, S. Y. W. & Xu, Y. Validation and application of a 2D model to channels with complex geometry. 
International Journal of Computational Engineering Science 3 (01), 57–71. Konrad, C. P. Effects of Urban Development on Floods. US Geological Survey Fact Sheet 076-03, Tacoma, WA. Koutroulis, A. G. & Tsanis, I. K. A method for estimating flash flood peak discharge in a poorly gauged basin: Case study for the 13–14 January 1994 flood, Giofiros basin, Crete, Greece. Journal of Hydrology 385 (1), 150–164. Koutroulis, A. G., Tsanis, I. K. & Daliakopoulos, I. N. Seasonality of floods and their hydrometeorologic characteristics in the island of Crete. Journal of Hydrology 394 (1), 90–100. Krauß, T., Reinartz, P., Lehner, M., Schroeder, M. & Stilla, U. DEM generation from very high resolution stereo data in urban areas. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 36, 1682–1777. Lagouvardos, K., Kotroni, V., Nickovic, S. & Kallos, G. Evidence of a winter tropical storm over eastern
Mediterranean: Simulations with the regional atmospheric modelling system (RAMS) and the ETA/NMC model. In: Proceedings of the 7th International Conference on Mesoscale Processes, 9–13 September, Reading, UK. Lane, S. N., James, T. D., Pritchard, H. & Saunders, M. Photogrammetric and laser altimetric reconstruction of water levels for extreme flood event analysis. The Photogrammetric Record 18 (104), 293–307. Lee, D. H., Lee, K. M. & Lee, S. U. Fusion of lidar and imagery for reliable building extraction. Photogrammetric Engineering and Remote Sensing 74 (2), 215. Leopold, L. & Maddock Jr, T. The hydraulic geometry of stream channels and some physiographic implications. US Geological Survey, Washington, DC, Professional Paper 252, p. 57. Lin, X. & Yuan, X. Improvement of the stability solving rational polynomial coefficients. In: International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Beijing 37, 711–716. Llasat, M. C., Llasat-Botija, M., Prat, M. A., Porcú, F., Price, C., Mugnai, A., Lagouvardos, K., Kotroni, V., Katsanos, D., Michaelides, S., Yair, Y., Savvidou, K. & Nicolaides, K. High-impact floods and flash floods in Mediterranean countries: the FLASH preliminary database. Advances in Geosciences 23, 47–55. Marks, K. & Bates, P. Integration of high-resolution topographic data with floodplain flow models. Hydrological Processes 14 (11), 2109–2122. Mongus, D. & Žalik, B. Parameter-free ground filtering of LiDAR data for automatic DTM generation. ISPRS Journal of Photogrammetry and Remote Sensing 67, 1–12. Naoum, S. & Tsanis, I. Orographic precipitation modeling with multiple linear regression. Journal of Hydrologic Engineering 9 (2), 79–102. Nassar, M. Multi-parametric sensitivity analysis of CCHE2D for channel flow simulations in Nile River. Journal of Hydroenvironment Research 5 (3), 187–195. Nikolakopoulos, K. G., Kamaratakis, E. K. & Chrysoulakis, N. SRTM vs ASTER elevation products. 
Comparison for two regions in Crete, Greece. International Journal of Remote Sensing 27 (21), 4819–4838. OGC OpenGIS Simple Feature Specification For SQL Version 1.1. Open GIS project document 99-049. Available at http://www.opengeospatial.org/specs. Pappenberger, F., Beven, K., Horritt, M. & Blazkova, S. Uncertainty in the calibration of effective roughness parameters in HEC-RAS using inundation and downstream level observations. Journal of Hydrology 302 (1), 46–69.
Priestnall, G., Jaafar, J. & Duncan, A. Extracting urban features from LiDAR digital surface models. Computers, Environment and Urban Systems 24 (2), 65–78. Previtali, M., Barazzetti, L. & Scaioni, M. Multi-step and multi-photo matching for accurate 3D reconstruction. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 38, 103–108. Re, M. Annual Review: Natural Catastrophes 2004, Knowledge series. Munich Re Group, Re, Germany. Richards, M. A. A Beginner’s Guide to Interferometric SAR Concepts and Signal Processing (AESS Tutorial IV). IEEE Aerospace and Electronic Systems Magazine 22 (9), 5–29. Roe, G. H. Orographic precipitation. Annual Review of Earth and Planetary Sciences 33, 645–671. Sanders, B. F. Evaluation of on-line DEMs for flood inundation modeling. Advances in Water Resources 30 (8), 1831–1843. Sivapalan, M. Prediction in ungauged basins: A grand challenge for theoretical hydrology. Hydrological Processes 17, 3163–3170. Tao, C. V. & Hu, Y. A comprehensive study of the rational function model for photogrammetric processing. Photogrammetric Engineering & Remote Sensing 67 (12), 1347–1357. Toothill, J. Central European Flooding August 2002. An EQECAT Technical Report, ABS Consulting. Available at http://www.absconsulting.com/resources/Catastrophe_ Reports/flood_rept.pdf. Toth, C. & Grejner-Brzezinska, D. Complementarity of LIDAR and stereo-imagery for enhanced surface extraction, geoinformation for all. In: Proceedings of XIXth ISPRS Congress, 16–23 July, Amsterdam, pp. 897–904. Tribe, A. Automated recognition of valley lines and drainage networks from grid digital elevation models: a review and a new method. Journal of Hydrology 139 (1), 263–293. Wang, S. S. Y. & Hu, K. Improved methodology for formulating finite element hydrodynamic models. In: Finite Element in Fluids (T. J. Chung, ed.). Hemisphere Publishing, Washington, vol. 8, pp. 457–478. Zhang, Y. 
CCHE2D-GUI–Graphical User Interface for the CCHE2D Model User’s Manual–Version 2.2. Available at: http://www.ncche.olemiss.edu/sites/default/files/files/docs/ cche2d/CCHE2D_2.2_User's_Manual.pdf. Zhang, Y. CCHE-GUI–graphical users interface for NCCHE model user’s manual–version 3.0. National Center for Computational Hydroscience and Engineering. Technical Report No. NCCHE-TR-2006-2.
First received 2 November 2012; accepted in revised form 23 April 2013. Available online 25 June 2013
© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014
Prediction of flow resistance in a compound open channel
Mrutyunjaya Sahu, S. S. Mahapatra, K. C. Biswal and K. K. Khatua

Mrutyunjaya Sahu, K. C. Biswal, K. K. Khatua
Department of Civil Engineering, National Institute of Technology, Rourkela, Odisha, India
S. S. Mahapatra (corresponding author)
Department of Mechanical Engineering, National Institute of Technology, Rourkela, Odisha, India
E-mail: mahapatrass2003@yahoo.com

ABSTRACT
Flooding in a river is a complex phenomenon which affects the livelihood and economic condition of the region. During flooding flow overtops the river course and spreads around the flood plain resulting in a two-course compound channel. It has been observed that the flow velocity in the flood plain is slower than that in the actual river course. This can produce a large shear layer between sections of the flow and produces turbulent structures which generate extra resistance and uncertainty in flow prediction. Researchers have adopted various numerical, analytical, and empirical models to analyze this situation. Generally, a one-dimensional empirical model is used for flow prediction assuming that the flow in the compound open channel is uniform. However, flow in a compound channel is quasi-uniform due to the transfer of momentum in sub-sections and sudden change of depths laterally. Hence, it is essential to analyze the turbulent structures prevalent in the situation. Therefore, in this study, an effort has been made to analyze the turbulent structure involved in flooding using large eddy simulation (LES) method to estimate the resistance. Further, a combination of an artificial neural network (ANN) and a fuzzy logic (FL) is considered to predict flow resistance in a compound open channel.

Key words | adaptive neuro-fuzzy inference system (ANFIS), compound open channel, computational fluid dynamics, correlation, momentum transfer
doi: 10.2166/hydro.2013.077

INTRODUCTION

Resistance factors such as drag, boundary shear stress, and channel roughness play an important role in predicting conveyance capacity, bank protection, sediment transport, etc. Thus, Einstein & Banks () and Krishnamurthy & Christensen () developed models for estimating a composite friction factor to study resistance to the flow in a compound open channel. Wormleaton et al. () reported through extensive experimentation that Manning’s equation and the Darcy–Weisbach equation are not suitable for predicting discharge of compound channels. Dracos & Hardegger () proposed a model to predict a composite friction factor in compound open channel flow by taking momentum transfer into account, and they also noted that a composite friction factor depends on the main channel and flood plain widths and the ratio of the hydraulic radius to the depth in the main channel. Pang () reported that the distribution of discharge between the main channel and flood plain is in accordance with the flow energy loss, which can be expressed in the form of a flow resistance coefficient. Christodoulou & Myers () quantified the apparent shear on the vertical interface between main channel and flood plain in symmetrical compound sections. Yang et al. () indicated that the Darcy–Weisbach resistance factor is not suitable for predicting a composite friction factor for measuring the resistance to flow. The environmental condition and the impact of thermodynamic, physical, and hydraulic parameters exhibit strong non-linear relationships leading to an inaccurate prediction of a composite friction factor in a compound open channel using conventional methods.

Rapid growth in artificial intelligence techniques not only reduces the tedious effort of experimentation but also eliminates cumbersome computations. Walid & Shyam () considered a back propagation (BP) algorithm of an
artificial neural network (ANN) for prediction of discharge in compound open channel flow. Notable past studies in this direction are a neuro-fuzzy model to simulate the Colebrook–White equation for the prediction of a friction factor in smooth open channel flow (Bigil & Altun ; Yuhong & Wenxin ) and the prediction of a friction factor in pipe flow problems (Fadare & Ofidhe ). Esen et al. () demonstrated the use of adaptive neuro-fuzzy inference system (ANFIS) for modeling of a ground-coupled heat pump system. Riahi-Madvar et al. () proposed a model based on ANFIS to predict longitudinal dispersion coefficient in natural streams. ANFIS has been adopted in a variety of fields for accurate prediction of responses in situations where input parameters characterize impreciseness and uncertainty. When the relationship between input and output parameters is difficult to establish using mathematical, analytical, and numerical methods and computation becomes cumbersome and time-consuming, an easily implementable technique like ANFIS can be adopted. Thus, an ANFIS model has been proposed in this study to predict a composite friction factor in compound open channel flow.

Despite clear successes in the experimental approach, it still suffers from limitations, such as: (i) data are collected at a limited number of points, (ii) the model is usually not at full-scale, and (iii) detailed measurements of turbulence have not usually been considered. A computational approach can partly overcome some of these issues and provide a complementary tool. In particular, a computational approach is readily repeatable, can simulate at full-scale and provides a spatially dense field of data points. However, there are significant technical challenges in terms of the prediction of turbulence. In recent years, numerical modeling of open channel flows has successfully reproduced experimental results. Computational fluid dynamics (CFD) has been used to model open channel flows ranging from main channels to full-scale modeling of flood plains. Simulations have been performed by Krishnappan & Lau (), Kawahara & Tamai () and Cokljat (). CFD has also been used to model flow features in natural rivers by Sinha et al. (), Lane et al. (), and Morvan (). Thomas & Williams (a, b, ) and Shi et al. () have undertaken refined numerical modeling to examine the detailed time-dependent three-dimensional nature of flow to provide dense fields of data points. Further, they have adopted a large eddy simulation (LES) method to investigate over-bank channel flow. LES has been utilized to model both in-bank channel and over-bank flow condition to investigate the detailed structure of secondary circulations. Salvetti et al. () conducted a LES simulation at a relatively large Reynolds number for producing results for bed shear with the magnitude of secondary motions and vorticity comparable to experiments. Pan & Banerjee (), Hodges & Street (), and Nakayama & Yokojima () studied free surface fluctuations in open channel flow by employing the LES method where the free surface has been filtered along with the flow field itself which introduced extra sub-grid stress (SGS) terms. Beaman () studied the estimation of conveyance using the LES method.

In this study, the inadequacy in prediction of a composite friction factor assuming turbulent flow and the momentum transfer between the main channel and flood plain is addressed using an adaptive neuro-fuzzy system. Further, keeping in view the wide application of LES in open channel flow, an effort has been made to analyze turbulent flow in a compound open channel.

EXPERIMENTAL DATA USED FOR ANALYSIS

The methods considered to predict the composite friction factor in a compound open channel are compared with the experimental data of FCF Series A (the experimental data for a straight compound open channel at the University of Birmingham) (Tominaga & Nezu ; Soong & DePue ; Tang & Knight a, b; Atabay et al. ). The hydraulic conditions of the data are shown in Table 1.

PREDICTION OF COMPOSITE FRICTION FACTOR BY ANFIS

Prediction of composite friction factor by empirical models

A compound channel basically consists of a main channel with flood plains. The primary factors affecting the
resistance coefficient in a compound open channel are geometric parameters (depth of main channel), $h$ and the wall roughness resistant coefficient, $K = k_s/R$ where $k_s$ and $R$ are the roughness height and the hydraulic radius, respectively. It should be noted that the wall roughness changes along the wetted perimeter of the cross-section in a compound channel. The composite roughness on the wall as well as the shape of the channel affects the turbulent flow structures and the secondary current across the cross-section and hence, alters the resistance coefficient. Manning’s equation is generally used for the prediction of discharge in compound open channels. The friction factor is in the form of either Manning’s coefficient, Chezy’s coefficient, or the Darcy–Weisbach coefficient, usually considered as a ‘true composite friction factor’ (Yang et al. ). In open channel flow, the flow resistance coefficients of the boundary expressed by Manning’s coefficient ($n$), Chezy’s coefficient ($C$), and Darcy–Weisbach ($f$) are related as shown in Equation (1):

$$\frac{C}{\sqrt{g}} = \frac{R^{1/6}}{\sqrt{g}} \times \frac{1}{n} = \sqrt{\frac{8}{f}} \qquad (1)$$

These factors are evaluated to predict the bed shear stress and discharge for both simple and compound open channel flows. Traditionally, the composite roughness in a compound channel is expressed in Manning’s form ‘n’ as in Equation (2). The composite friction factor $n_c$ across the perimeter can be evaluated as:

$$n_c = \int w_i \, n_i \, dp \qquad (2)$$

where $n_i$ = sub-sectional Manning’s roughness and $w_i$ = weighted function of sub-sections. Using this formulation the calculation of open channel flow is reduced to a 1D formulation.

Table 1 | Summary of geometrical factors of experimental data

Source of data | Main side slope | Flood plain type | Roughness type | Main channel cross-sectional geometry
FCF Series A, Series 1 | 1.0 | Symmetric | Smooth | Trapezoidal
FCF Series A, Series 2 | 1.0 | Symmetric | Smooth | Trapezoidal
FCF Series A, Series 3 | 1.0 | Symmetric | Smooth | Trapezoidal
FCF Series A, Series 6 | 1.0 | Asymmetric | Smooth | Trapezoidal
FCF Series A, Series 8 | 0 | Symmetric | Smooth | Rectangular
FCF Series A, Series 10 | 2.0 | Symmetric | Smooth | Trapezoidal
Tominaga & Nezu (), S (1–3) | 0 | Asymmetric | Smooth | Rectangular
Tang & Knight (a), ROA | 0 | Symmetric | Smooth | Rectangular
Tang & Knight (a), ROS | 0 | Symmetric | Smooth | Rectangular
Tang & Knight (b), LOSR | 0 | Symmetric | Rough | Rectangular
Tang & Knight (b), ALL | 0 | Symmetric | Rough | Rectangular
Atabay et al. (), ROA | 0 | Asymmetric | Smooth | Rectangular
Atabay et al. (), ROS | 0 | Symmetric | Smooth | Rectangular
Soong & DePue () 1 | | Asymmetric | Rough | Trapezoidal

A number of empirical formulations have been proposed by investigators to predict a composite Manning’s friction factor in compound open channel flow with different assumptions based on the relationships between the discharges, velocities, forces, and shear stresses of the component sub-sections and the total cross-section. These formulations are listed in Table 2 for the estimation of a composite Manning’s friction factor. Further, different methods have been also adopted to divide the components sub-sections of the compound channels to apply these models to estimate a composite Manning’s friction factor and the discharge in a compound open channel.

In this study, methods proposed by Cox (), Einstein & Banks (), Lotter (), Krishnamurthy & Christensen (), and Dracos & Hardegger () have been adopted to predict a composite friction factor. Among these methods, only Dracos & Hardegger () take momentum transfer into account. However, the model proposed by Hin et al. () can account for momentum transfer but the method is based on field observation. Further, the data collected have to be calibrated to account for the shape factor parameter to calculate the apparent friction factor. Since this factor is not generally available, the model is excluded from this analysis. Figures 1–3 show the relationship between true composite friction factor obtained from Manning’s
Table 2 | Models for prediction of composite Manning’s friction factor

| Reference | Model notation | Composite friction factor ($n_c$) | Concept |
|---|---|---|---|
| Cox () | COX | $n_c = \sqrt{\dfrac{\sum n_i^2 A_i}{A}}$ | Total resistance force is equal to sum of sub-area resistance forces, or $n_i$ weighted by $\sqrt{A_i}$ |
| | | $n_c = \dfrac{A}{\sum (A_i / n_i)}$ | Total discharge is sum of sub-area discharges |
| Einstein & Banks () | EBM | $n_c = \left[\dfrac{\sum n_i^{3/2} P_i}{P}\right]^{2/3}$ | Total cross-sectional mean velocity equal to sub-area mean velocity |
| Lotter () | LM | $n_c = \dfrac{P R^{5/3}}{\sum \left(P_i R_i^{5/3} / n_i\right)}$ | Total discharge is sum of sub-area discharges |
| Krishnamurthy & Christensen () | KCM | $n_c = \exp\left[\dfrac{\sum P_i h_i^{3/2} \ln n_i}{\sum P_i h_i^{3/2}}\right]$ | Logarithmic velocity distribution over depth h for wide channel |
| Dracos & Hardegger () | D&H | $\dfrac{n}{n_e} = f\left(\alpha, \dfrac{R}{H}\right)$ | The main channel and flood plain width ratio, and the ratio of the total hydraulic radius to the flow depth in the main channel |

where $P_i$ = sub-sectional perimeter of compound channel, $n_i$ = sub-sectional Manning’s ‘n’, $R_i$ = sub-sectional hydraulic radius, $A_i$ = sub-sectional area of compound section, $h_i$ = sub-sectional depth of flow, $R$ = hydraulic radius of whole compound channel, and $\alpha$ = a measure of increase in wetted perimeter.
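As an illustration of how the closed-form models in Table 2 are evaluated, the following minimal Python sketch computes the composite friction factor for a section split into sub-areas. It is not the authors' implementation. The function name, the dictionary keys and the sample geometry are hypothetical, the sub-section hydraulic radius is assumed to be $R_i = A_i / P_i$, and the formulas are the standard forms of these methods as read from Table 2. The D&H model is left out because the function $f$ is not given in the table.

```python
import math

def composite_manning_n(areas, perimeters, depths, n_values):
    """Composite Manning's n from sub-section data (equal-length lists)."""
    A = sum(areas)        # total flow area
    P = sum(perimeters)   # total wetted perimeter
    R = A / P             # hydraulic radius of the whole section
    # Assumption: sub-section hydraulic radius R_i = A_i / P_i
    radii = [a / p for a, p in zip(areas, perimeters)]

    # COX, resistance-force form: sqrt(sum(n_i^2 A_i) / A)
    cox_force = math.sqrt(sum(n**2 * a for n, a in zip(n_values, areas)) / A)
    # COX, discharge form: A / sum(A_i / n_i)
    cox_discharge = A / sum(a / n for n, a in zip(n_values, areas))
    # EBM: [sum(n_i^(3/2) P_i) / P]^(2/3)
    ebm = (sum(n**1.5 * p for n, p in zip(n_values, perimeters)) / P) ** (2 / 3)
    # LM: P R^(5/3) / sum(P_i R_i^(5/3) / n_i)
    lm = P * R ** (5 / 3) / sum(
        p * r ** (5 / 3) / n for n, p, r in zip(n_values, perimeters, radii)
    )
    # KCM: exp[sum(P_i h_i^(3/2) ln n_i) / sum(P_i h_i^(3/2))]
    weights = [p * h**1.5 for p, h in zip(perimeters, depths)]
    kcm = math.exp(
        sum(w * math.log(n) for w, n in zip(weights, n_values)) / sum(weights)
    )
    return {
        "COX_force": cox_force,
        "COX_discharge": cox_discharge,
        "EBM": ebm,
        "LM": lm,
        "KCM": kcm,
    }

# Hypothetical section: main channel plus one flood plain (illustrative values only).
result = composite_manning_n(
    areas=[0.040, 0.020],
    perimeters=[0.50, 0.45],
    depths=[0.10, 0.05],
    n_values=[0.010, 0.015],
)
for name, value in result.items():
    print(f"{name}: {value:.5f}")
```

When every sub-section has the same roughness, the COX, EBM and KCM expressions all reduce to that single value, which gives a quick sanity check on any implementation.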
Figure 1 | True composite friction factor vs. composite friction factor predicted by five methods for Atabay et al.’s (2004) experimental conditions.

Figure 3 | True composite friction factor vs. composite friction factor predicted by five methods for Tang & Knight’s (2001b) experimental conditions.
Figure 2 | True composite friction factor vs. composite friction factor predicted by five methods for Soong & DePue’s (1996) experimental conditions.

equation and the friction factor predicted by empirical models given in Table 2 for the three experimental conditions shown in Table 1. The mean absolute relative error for each model is shown in Table 3. It is inferred from Table 3 that the predictive models considered in this study are not capable of accurately predicting composite friction factor for all data sets. For example, Einstein & Banks’ () model predicts Soong & DePue’s () data with reasonable accuracy but fails to predict other data sets. Similarly, Krishnamurthy & Christensen’s () model predicts Tang & Knight’s (a) data with adequate accuracy but no other data sets. Therefore, it is desirable to propose a robust predictive method
Table 3 | Mean absolute relative error for different data sets

| Data set | Cox (1973) | Einstein & Banks (1950) | Lotter (1993) | Krishnamurthy & Christensen (1972) | Dracos & Hardegger (1987) |
|---|---|---|---|---|---|
| FCF Series A | 28.33 | 32.6 | 24.16 | 49.47 | 15.738 |
| Tominaga & Nezu () | 32.21 | 33.68 | 8.28 | 34.72 | 25.22 |
| Tang & Knight (a) | 28.81 | 57.58 | 13.52 | 9.37 | 13.74 |
| Tang & Knight (b) | 28.24 | 33.721 | 27.61 | 46.14 | 26.14 |
| Atabay et al. () | 17.28 | 33.43 | 35.75 | 18.10 | 14.98 |
| Soong & DePue () | 13.12 | 7.42 | 33.21 | 15.28 | 34.38 |
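The error measure tabulated in Table 3 is not written out in the text. A minimal Python sketch is given here under the assumption that it is the usual mean absolute relative error, expressed as a percentage of the true value. The function name and the sample numbers are hypothetical.

```python
def mean_absolute_relative_error(true_values, predicted_values, as_percent=True):
    """Mean of |predicted - true| / |true| over all samples."""
    if len(true_values) != len(predicted_values) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(
        abs(p - t) / abs(t) for t, p in zip(true_values, predicted_values)
    )
    mare = total / len(true_values)
    # Assumption: Table 3 reports the error as a percentage.
    return 100.0 * mare if as_percent else mare

# Illustrative values only: true vs. predicted composite friction factors.
true_n = [0.010, 0.020]
predicted_n = [0.011, 0.018]
print(mean_absolute_relative_error(true_n, predicted_n))
```

Each prediction in this example is off by one tenth of the true value, so the result is close to 10.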
for an accurate prediction of a composite friction factor under different hydraulic conditions.

In order to develop a robust approach to predict a composite friction factor, five flow parameters used for the estimation of the overall discharge in compound channels suggested by Yang et al. () are considered. The parameters are: (i) relative width (Br) (ratio of the width of the flood plain (B − b) to the total width (B) where b = main channel width); (ii) ratio of the perimeter of the main channel (Pmc) to the flood plain perimeter (Pfp) denoted as Pr; (iii) the ratio of hydraulic radius of the main channel (Rmc) to the flood plain (Rfp) denoted as Rr which usually varies with symmetry; (iv) the channel longitudinal slope (S0); and (v) the relative depth (Hr) i.e., the flow depth of the flood plain (H − h) to the total depth (H) where h = main channel depth. In this study, these five flow parameters are chosen as input parameters and a composite friction factor as an output parameter.

Adaptive neuro-fuzzy inference system

The ANFIS is a combination of an ANN and a fuzzy inference system (FIS) where the neural network learns the structure of the data but understanding the network structure or the associated pattern is difficult. However, the FIS can understand the structure and develop the rule base using IF-THEN rules to predict the output. A neural network with its learning capabilities can be used to learn the fuzzy decision rules to create a hybrid intelligent system. The fuzzy system provides expert knowledge to be used by the neural network. A FIS consists of three components: first, a rule base which contains a selection of fuzzy rules; second, a database which defines the membership functions used in the rules; and finally, a reasoning mechanism which carries out the inference procedure on the rules and the given facts. Jang (a) proposed a combination of a neural network and fuzzy logic (FL) known as an ANFIS. ANFIS is a FIS implemented in the framework of neural networks. The combination of both ANN and FIS thus improves the system performance without interaction with operators. For this reason, it is possible to deduce the logical pattern of the prediction. The advantage of the technique is that the ANFIS architecture can be used to model the nonlinear functions for the prediction of the desired result in a logical manner (Jang a, b, , ).

Fuzzy logic and fuzzy inference systems

Fuzzy systems are based on IF-THEN fuzzy rules. The building of FL systems begins with the derivation of a set of IF-THEN fuzzy rules comprising the expertise and knowledge of the modeling field (Dezfoli ). The modeling of suitable rules is tedious, and hence a predefined method or tool to achieve the fuzzy rules from numerical and statistical analysis is most appropriate for this context. Fuzzy conditional statements are expressed in forms such as ‘if hydraulic depth (Dr) is small then friction factor is high’, where these parameters are levels described by fuzzy sets that are characterized by membership functions. Hence, these concise forms of fuzzy rules are often employed to make decisions in situations of uncertainty. These play an important role in the human ability to make decisions. From Figure 4, it can be observed that the FIS and fuzzy decision making procedure comprise five functional building blocks including: (i) rule base, (ii) database, (iii) decision making unit, (iv) fuzzification interface, and (v) defuzzification interface. The rule base and database are referred to as the knowledge base. The inference system is based on logical rules which map the input variables space to the output variable space using IF-THEN statements
and a fuzzy decision making procedure (Dezfoli ; Jang & Gulley ). Due to the uncertainty of real and field values to fuzzy data, a fuzzification transition is used to transform deterministic values to fuzzy values and a defuzzification transition is used to transform fuzzy values into deterministic values (Dezfoli ).

Figure 4 | Schematic diagram of fuzzy based inference system.

Figure 5 | A typical architecture of ANFIS system.

Architecture and basic learning rules

A typical adaptive network shown in Figure 5 is a network structure consisting of a number of nodes connected through directional links. Each node is characterized by a node function with fixed or adjustable parameters. The learning or
training phase of a neural network is a process to determine parameter values to sufficiently fit the training data. The basic learning rule method is the BP method which seeks to minimize some error, usually the sum of squared differences between the network’s outputs and desired outputs. Generally, the model performance is checked by means of distinct test data, and a relatively good fit is expected in the testing phase. Considering a first order fuzzy inference system according to Takagi, Sugeno and Kang (TSK), a fuzzy model consists of two rules (Sugeno & Kang ):

Rule 1: If $x$ is $A_1$ and $y$ is $B_1$ then $f_1 = p_1 x + q_1 y + r_1$ (3)

Rule 2: If $x$ is $A_2$ and $y$ is $B_2$ then $f_2 = p_2 x + q_2 y + r_2$ (4)

If f1 and f2 are constants instead of linear equations, we have a zero order TSK fuzzy-model. Node functions in the same layer are of the same function family as described below. It is to be noted that $O_i^j$ denotes the output of the i th node in layer j.

Layer 1: Each node in this layer generates a membership grade of a linguistic label. For instance, the node function of the i th node might be:

$$O_i^1 = \mu_{A_i}(x) = \frac{1}{1 + \left(\dfrac{x - c_i}{a_i}\right)^{b_i}} \qquad (5)$$

where x is the input to the node i, and $A_i$ is the linguistic label (small, large) associated with this node; and $\{a_i, b_i, c_i\}$ is the parameter set that changes the shapes of the membership function. Parameters in this layer are referred to as the ‘premise parameters’.

Layer 2: Each node in this layer calculates the firing strength of each rule via multiplication:

$$O_i^2 = w_i = \mu_{A_i}(x) \times \mu_{B_i}(y), \quad i = 1, 2 \qquad (6)$$

Layer 3: The i th node of this layer calculates the ratio of the i th rule’s firing strength to the sum of all rules’ firing strengths:

$$O_i^3 = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \qquad (7)$$

For convenience, outputs of this layer will be called normalized firing strengths.

Layer 4: Every node i in this layer is a square node with a node function:

$$O_i^4 = \bar{w}_i f_i = \bar{w}_i \left(p_i x + q_i y + r_i\right) \qquad (8)$$

where $\bar{w}_i$ is the output of layer 3, and $\{p_i, q_i, r_i\}$ is the parameter set. Parameters in this layer will be referred to as ‘consequent parameters’.

Layer 5: The single circle node computes the overall output as the summation of all incoming signals:

$$O_i^5 = \text{overall output} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \qquad (9)$$

Thus, an adaptive network as presented in Figure 5 is functionally equivalent to a fuzzy inference system. The basic learning rule of ANFIS is the BP gradient descent which calculates error signals (defined as the derivative of the squared error with respect to each node’s output) recursively from the output layer backward to the input nodes (Werbos ). This learning rule is exactly the same as the back-propagation learning rule used in the common feed-forward neural networks (Rumelhart et al. ). From the ANFIS architecture (Figure 5), it is observed that given values of the premise parameters, the overall output can be expressed as a linear combination of the consequent parameters. Based on this observation, a hybrid learning rule is employed here, which combines a gradient descent and the least squares method to find feasible antecedent and consequent parameters (Jang a, ). The details of the hybrid rule are given by Jang et al. (), where it is also claimed to be significantly faster than the classical back-propagation method.

Hybrid learning algorithm

From the ANFIS architecture (Figure 5), we observe that when the values of the premise parameters are fixed the overall output can be expressed as a linear combination. The output ‘F’ can be rewritten as:

$$F = \frac{w_1}{w_1 + w_2} f_1 + \frac{w_2}{w_1 + w_2} f_2 = \bar{w}_1 f_1 + \bar{w}_2 f_2 = (\bar{w}_1 x) p_1 + (\bar{w}_1 y) q_1 + (\bar{w}_1) r_1 + (\bar{w}_2 x) p_2 + (\bar{w}_2 y) q_2 + (\bar{w}_2) r_2 \qquad (10)$$
which is linear in the consequent parameters p1, q1, r1, p2, q2, r2. Therefore, the hybrid learning algorithm developed can be applied directly. More specifically, in the forward pass of the hybrid learning algorithm, node outputs go forward until layer 4 and the consequent parameters are identified by the least squares method. In the backward pass, the error signal propagates backward and the premise parameters are updated by gradient descent. As mentioned, the consequent parameters thus identified are optimal under the condition that the premise parameters are fixed. Accordingly, the hybrid approach converges much faster since it reduces the dimension of the search space of the original back-propagation method. This network fixes the membership functions and adapts only the consequent parts; then, ANFIS can be viewed as a functional-link network (Klassen & Pao ; Pao ) where the enhanced representation takes advantage of human knowledge and expresses more insight. By fine-tuning the membership functions, we actually generate this enhanced representation.

Training and testing of ANFIS network

The data required for the simulation are first generated using Manning’s equation for obtaining a composite friction factor under different hydraulic conditions, as shown in Table 1. The input parameters for the simulation are referred to in a previous section (Prediction of composite friction factor by empirical models). The entire experimental data set is divided into training and testing data sets. A total of 228 data are used. Among the 228 data, 206 are considered as training data and 22 as testing data. The number of nodes in the second layer is increased gradually during the training process starting with two. It was observed that the error converges (decreases) as the nodes increase to five. Hence, the number of nodes in the second layer is fixed at five and further analysis is carried out. The five layers are one input, three hidden, and one output layer. The network was run on a MATLAB platform using a Pentium IV desktop computer. A Gaussian-type membership function (gauss2mf) is chosen for the inputs, as for input 1, and a linear-type membership function is used for output while generating FIS. The function becomes steady after 10 iterations owing to the faster hybrid learning rule, which ensured that the model parameters are matched. After that, 22 data are used for testing to verify the accuracy of the proposed model.

Prediction of composite friction factor using ANFIS

The composite friction factor is predicted using the ANFIS model based on five input parameters: relative width, ratio of perimeter of main channel to flood plain perimeter, ratio of hydraulic radius of main channel to flood plain, channel longitudinal slope, and relative depth. The pattern of variation of the actual and predicted composite friction factor is shown for the training and testing data sets in Figures 6 and 7, respectively. The black line indicates actual output and the grey line represents the predicted data from ANFIS. The plots show the coherent nature of the data distribution. The surface plot is shown in Figure 8. It can be observed that the surface covers the total landscape of decision space. Residuals are calculated as the difference between the actual and the predicted composite friction factors for training data set and are plotted in Figure 9. It can be observed that the residuals are distributed evenly along the centerline of the plot. To verify the accuracy of the results, a regression analysis is also carried out. Regression curves are plotted in Figures 10 and 11 between the actual composite friction
Figure 6 | Distribution of composite friction factor (training data).

Figure 7 | Distribution of composite friction factor (testing data).

Figure 8 | Surface plot.

Figure 9 | Residual distribution of training data set.

Figure 10 | Correlation plot for training set of data points.

Figure 11 | Correlation plot for testing set of data points.
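The five-layer computation and the least squares half of the hybrid learning rule can be sketched for the two-input, two-rule first order TSK model of Equations (3)–(10). This is a minimal illustration and not the MATLAB model used in the study, which has five inputs and Gaussian-type membership functions. The generalized bell function is assumed in Jang's usual form with absolute value and exponent 2b. All function names, premise parameter values and the synthetic data are hypothetical, and the gradient descent update of the premise parameters is omitted.

```python
import numpy as np

def bell(x, a, b, c):
    """Layer 1: generalized bell membership (assumed form, Jang-style)."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2.0 * b))

def normalized_strengths(x, y, premise):
    """Layers 2-3: firing strengths w_i and their normalized values."""
    w = np.stack(
        [bell(x, *premise["A"][i]) * bell(y, *premise["B"][i]) for i in range(2)],
        axis=1,
    )
    return w / w.sum(axis=1, keepdims=True)

def design_matrix(x, y, wbar):
    """Columns multiply p1, q1, r1, p2, q2, r2 as in Equation (10)."""
    return np.column_stack(
        [wbar[:, 0] * x, wbar[:, 0] * y, wbar[:, 0],
         wbar[:, 1] * x, wbar[:, 1] * y, wbar[:, 1]]
    )

def predict(x, y, premise, theta):
    """Layers 4-5: overall output for consequent parameters theta."""
    wbar = normalized_strengths(x, y, premise)
    return design_matrix(x, y, wbar) @ theta

def fit_consequents(x, y, target, premise):
    """Forward pass of the hybrid rule: least squares with premise fixed."""
    wbar = normalized_strengths(x, y, premise)
    theta, *_ = np.linalg.lstsq(design_matrix(x, y, wbar), target, rcond=None)
    return theta

# Synthetic check: data generated by known consequents are reproduced by the fit.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 50)
y = rng.uniform(0.0, 1.0, 50)
premise = {  # hypothetical (a, b, c) triples for A1, A2 and B1, B2
    "A": [(0.5, 2.0, 0.0), (0.5, 2.0, 1.0)],
    "B": [(0.5, 2.0, 0.0), (0.5, 2.0, 1.0)],
}
true_theta = np.array([1.0, -2.0, 0.5, 0.3, 0.8, -1.0])
target = predict(x, y, premise, true_theta)
theta = fit_consequents(x, y, target, premise)
print(np.max(np.abs(predict(x, y, premise, theta) - target)))
```

Because the output is linear in the consequent parameters once the premise parameters are fixed, a single least squares solve identifies them, which is why the hybrid rule needs fewer iterations than pure back-propagation.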
factor and the predicted composite friction factor for the training and the testing data, respectively. It can be observed that the data are well fitted because high values of the coefficient of determination (R² = 0.991 for training and R² = 0.962 for testing data) are obtained. The testing data set is used to find the coefficient of determination for the other five methods as shown in Figures 12–16. From these figures, it can be observed that the EBM method exhibits the least accuracy because the coefficient of determination (R²) of EBM is 0.687 whereas the coefficient for the ANFIS method is 0.962.

NUMERICAL MODELING OF TURBULENT FLOW STRUCTURES

Although the ANFIS model is quite robust in predicting a composite friction factor considering the non-linearity in the
Figure 12 | Correlation plot for testing data (COX).

Figure 13 | Correlation plot for testing data (EBM).

Figure 14 | Correlation plot for testing data (LM).

Figure 15 | Correlation plot for testing data (KCM).

Figure 16 | Correlation plot for testing data (D & H).
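The coefficient of determination quoted for the correlation plots can be reproduced with a few lines of Python. The text does not state which definition was used. The sketch below assumes the squared Pearson correlation between actual and predicted values, which is what a linear regression curve between the two would report. The function name and sample values are hypothetical.

```python
import numpy as np

def coefficient_of_determination(actual, predicted):
    """Squared Pearson correlation between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    r = np.corrcoef(actual, predicted)[0, 1]
    return r ** 2

# Illustrative values only: actual vs. predicted composite friction factors.
actual_n = [0.010, 0.012, 0.015, 0.011]
predicted_n = [0.0102, 0.0118, 0.0147, 0.0113]
print(coefficient_of_determination(actual_n, predicted_n))
```

Note that this measure rewards any linear relation between the two series, so it should be read together with an error measure such as the mean absolute relative error.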
relation between the input flow parameters and the output, it is vital to find out the reason for this non-linear relationship. In fact, momentum transfer in compound channels leads to an inaccurate estimation of discharge using empirical relations. Here, an attempt is made to present the effect of momentum transfer on the discharge in a compound channel via numerical analysis so that insight into the flow mechanism can be gained. The numerical analysis simulates a tilting flume with an 8 m length and a 0.4 × 0.4 m² cross-section for which Tominaga & Nezu () carried out experiments using a fiber-optic laser-Doppler anemometer to measure three-directional components of the turbulent velocity shown in Figure 17 (S1 case). The geometry of the channel is discretized with ANSYS 12 design modeler. The width to depth (B/H) ratio of the channel is 4.981 whereas the slope is 0.00064. The flow is considered as uniform incompressible turbulent flow at the test section 7.5 m from the inlet. The hydraulic radius (R) of the channel is 0.043. The Reynolds number (Re) of the flow for this case is 6.72 × 10^4.

The fluid flow equations are solved by discretizing the whole domain into unstructured hybrid mesh (mixture of prism and triangular) that divides the continuum into a finite number of nodes considering near-wall effect. The computations need a spatial discretization and time marching scheme. In this study, the transient simulation process is completed with the help of the commercial package ANSYS CFX (ANSYS CFX Tutorials ANSYS CFX Release 11.0 ). This package generally solves the Navier–Stokes (NS) equations using a finite element-finite volume method. The mesh and simulation details are shown in Table 4. The governing equations (Equations (11) and (12)) are employed for LES obtained by filtering the time-dependent NS equation and continuity in either Fourier (wave-number) space or configuration (physical) space:

$$\frac{\partial \rho}{\partial t} + \frac{\partial (\rho u_i)}{\partial x_i} = 0 \qquad (11)$$

$$\frac{\partial}{\partial t}(\rho u_i) + \frac{\partial (\rho u_i u_j)}{\partial x_j} = \frac{\partial}{\partial x_j}\left(\mu \frac{\partial \sigma_{ij}}{\partial x_j}\right) - \frac{\partial p}{\partial x_i} - \frac{\partial \tau_{ij}}{\partial x_j} \qquad (12)$$

where $\rho$ = density of water, $u_i$ and $u_j$ are the unresolved velocity components in the $x_i$ and $x_j$ directions, $\sigma_{ij}$ = normal stress in plane i along j direction, $p$ = pressure, $\tau_{ij}$ = tangential shear stress in plane i along j direction. Equation (11) is the continuity equation which is linear and does not change due to filtering.

To capture the flow feature in turbulence, large-scale motion is captured as a direct numerical simulation (DNS) in LES but the effect of small scales is modeled using a sub-grid scale (SGS) model. The LES method can incorporate a much coarser grid so that the temporal evolution of the large-scale turbulent motions can be directly simulated while the unresolved small-scale motions can be modeled
Figure 17 | Geometric alignment of flume channel along with boundary conditions.

Table 4 | Summary of mesh and simulation details using ANSYS-CFX

| Case | Mesh spacing (m) | y+ range | H/u* (sec) | Time step (sec) | LETOT (large eddy turn over time state), initial trial | |
|---|---|---|---|---|---|---|
| S1 | 0.005 | 9.23–110.87 | 5 | 0.001 | 70 | 10 |

$y^+ = y u_* / u$ = scaled depth of flow, where y = respective flow depth, u = flow velocity, u* = shear velocity.
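The sub-grid scale closure named in this study is a Smagorinsky model, for which the standard eddy viscosity is $\nu_t = (C_s \Delta)^2 |\bar{S}|$ with $|\bar{S}| = \sqrt{2 \bar{S}_{ij} \bar{S}_{ij}}$. The sketch below is a generic illustration of that closure and not the ANSYS CFX implementation. The constant $C_s = 0.1$ and the velocity gradient are assumptions, and the filter width is taken equal to the 0.005 m mesh spacing of Table 4.

```python
import numpy as np

def smagorinsky_eddy_viscosity(grad_u, delta, cs=0.1):
    """Eddy viscosity (m^2/s) from a 3x3 resolved velocity-gradient tensor (1/s)."""
    grad_u = np.asarray(grad_u, dtype=float)
    strain = 0.5 * (grad_u + grad_u.T)                    # resolved strain rate S_ij
    strain_mag = np.sqrt(2.0 * np.sum(strain * strain))   # |S| = sqrt(2 S_ij S_ij)
    return (cs * delta) ** 2 * strain_mag

# Hypothetical simple shear of 10 1/s; delta is the mesh spacing from Table 4.
grad_u = np.zeros((3, 3))
grad_u[0, 1] = 10.0
nu_t = smagorinsky_eddy_viscosity(grad_u, delta=0.005)
print(nu_t)
```

For this simple shear the strain magnitude equals the shear rate, so the eddy viscosity scales with the square of the filter width, which is why the mesh spacing directly controls how much of the turbulence is modeled rather than resolved.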
through the use of a Smagorinsky model. The filtering process filters out the eddies whose scales are smaller than the filter width or grid spacing used in the computations. The results of the simulation are compared with case S1 of Tominaga & Nezu (). From Table 5, it is evident that the results obtained from LES simulation are in good agreement with case S1 of Tominaga & Nezu (). Here, mean bulk velocity is calculated using the formulation:

$$W_b = \frac{\int w \, dA}{A} \qquad (13)$$

where $W_b$ = mean velocity of the flow, and $w$ = velocity of the point of consideration. The composite friction factor is calculated from Manning’s equation.

The isovel lines of the non-dimensional stream-wise velocity W(z) computed by the LES method are shown in Figure 18. The simulation shows that maximum velocity is 0.4049 m/s which is observed near the centerline of the channel at approximately 0.057 m from the centerline. The isovel lines bulge significantly upward in the vicinity of the junction edge along the flow. The patterns of the isovel lines from LES simulation results convincingly follow the experimental results of Tominaga & Nezu (). The reason for this bulge is the decelerated region on both sides of the junction region of the main channel. The region is created because of the low momentum transport due to the secondary current away from the wall. This causes the bulge in the main channel and flood plain interface due to high momentum transport by the secondary current. Consequently, the primary velocity is directly affected by the momentum transport due to the secondary current. The momentum transfer due to the secondary circulation component and the turbulent transport are three-dimensional in nature. These flow structures also depend on the corner of the channel and the shape of the compound cross-section. It is quite evident that turbulent structures as discussed are three-dimensional and highly non-linear.

Table 5 | Flow parameters of the experiment and simulation

| Case | Maximum velocity, Wmax (m/s) | Mean bulk velocity, Wb (m/s) | Composite friction factor, ‘nc’ |
|---|---|---|---|
| S1 | 0.409 | 0.368 | 0.011383 |
| LES simulation results | 0.4049 | 0.3671 | 0.011380 |

Figure 18 | Mean velocity distribution of LES simulation.

CONCLUSIONS

Based on the analysis made in this study, the following conclusions can be drawn:

1. Five empirical models for the prediction of a composite friction factor have been studied. It is observed that the models can predict the composite friction factor accurately only for a few data sets. Generally, the models break down when predicting the composite friction factor for a wide range of hydraulic conditions and geometries of compound channel.
2. To alleviate the above problem, a robust prediction strategy based on an ANFIS has been proposed. It is demonstrated that the ANFIS model is quite capable of predicting a composite friction factor with reasonable accuracy for a wide range of hydraulic conditions.
3. Further, the LES turbulence model has been adopted to analyze the compound open channel condition. The velocity distribution in an asymmetric compound channel is presented. The composite friction factor found from the LES is in good agreement with experimental results.
4. Moreover, both the LES and ANFIS models are fairly convincing and account for turbulence during the prediction of the discharge and the composite friction factor in a compound open channel. A reasonably accurate prediction of composite friction factor for different geometry, hydraulic conditions and bed/material can be obtained with less computational effort by ANFIS which can be useful for field engineers.
5. In future, the study can be extended to consider different hydraulic conditions for the prediction of composite friction factor using LES and ANFIS models.
REFERENCES

Atabay, S., Knight, D. W. & Seckin, G. Influence of a mobile bed on the boundary shear in a compound channel. In: Proceedings of the Second International Conference on Fluvial Hydraulics, Naples, Italy, 23–25 June, 1, pp. 337–345.
ANSYS CFX Tutorials ANSYS CFX Release 11.0. ANSYS Inc. Southpointe, 275 Technology Drive, Canonsburg, PA 15317.
Beaman, F. Large Eddy Simulation of Open Channel Flows for Conveyance Estimation. PhD Thesis, University of Nottingham.
Bigil, A. & Altun, H. Investigation of flow resistance in smooth open channels using artificial neural network. Flow Meas. Instrum. 19, 404–408.
Christodoulou, G. C. & Myers, W. R. C. Apparent friction factor on the flood plain-main channel interface of compound channel sections. In: Proc. 28th IAHR Congress, Graz, Austria.
Cokljat, D. Turbulence Models for Non-circular Ducts and Channels. PhD Thesis, City University London.
Cox, R. G. Effective hydraulic roughness for channels having bed roughness different from bank roughness. Miscellaneous Paper H-73-2, US Army Engineers Waterways Experiment Station, Vicksburg, MS.
Dezfoli, K. A. Principles of Fuzzy Theory and its Application on Water Engineering Problems. Jihad Press, Tehran, Iran, p. 227.
Dracos, T. & Hardegger, P. Steady uniform flow in prismatic channels with flood plains. J. Hydraul. Res. IAHR 25 (2), 169–185.
Einstein, H. A. & Banks, R. B. Fluid resistance of composite roughness. Trans. Am. Geo. Union 31 (4), 603–610.
Esen, H., Inalli, M., Sengur, A. & Esen, M. Modeling a ground-coupled heat pump system using adaptive neuro-fuzzy inference systems. J. Refrig. 31 (1), 64–74.
Fadare, D. A. & Ofidhe, I. U. Artificial neural network model for prediction of friction factor in pipe flow. J. Appl. Sci. Res. 5 (6), 662–670.
Hin, L. S., Bessaih, N., Ling, L. P., Ghani, A., Zakaria, N. A. & Seng, M. Y. Discharge estimation for equatorial natural rivers with over bank flow. Int. J. River Basin Manage. 6 (1), 13–21.
Hodges, B. R. & Street, R. L. On simulation of turbulent nonlinear free-surface flows. J. Comp. Phys. 151, 425–457.
Jang, R. J. a Fuzzy modeling using generalized neural networks and Kalman filter algorithm. In: Int. Proc. 9th National Conf. on Artificial Intelligence, Anaheim, CA, 15–19 July, pp. 762–767.
Jang, R. J. b Rule extraction using generalized neural networks. In: Int. Proc. 4th IFSA World Congress, pp. 82–86.
Jang, R. J. ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans. Sys. Man. Cyber. 23 (3), 665–685.
Jang, R. J. Structure determination in fuzzy modeling: a fuzzy CART approach. In: Proc. IEEE conf. on Fuzzy Systems, Orlando, FL.
Jang, J. S. R. & Gulley, N. Fuzzy Logic Toolbox: Reference Manual. The Math Works Inc., Natick, MA, USA.
Jang, J. S. R., Sun, C. T. & Mizutani, E. Neuro-fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence. Prentice-Hall International, London.
Kawahara, Y. & Tamai, N. Numerical calculation of turbulent flows in compound channels with an algebraic stress turbulence model. In: Proc. 3rd Symp. Refined Flow Modeling and Turbulence Measurements, Tokyo, Japan, pp. 9–17.
Klassen, M. S. & Pao, Y. H. Characteristics of the functional link net: a higher order delta rule net. In: IEEE Proc. Conf. Neural Networks, San Diego, CA.
Krishnamurthy, M. & Christensen, B. A. Equivalent roughness for Shallow channels. J. Hydraul. Eng. ASCE 98 (12), 2257–2263.
Krishnappan, B. G. & Lau, Y. L. Turbulence modeling of flood plain flows. J. Hydraul. Eng. ASCE 112 (4), 251–265.
Lane, S. N., Bradbrook, K. F., Richards, K. S., Biron, P. A. & Roy, A. G. The application of computational fluid dynamics to natural river channels: three-dimensional versus two-dimensional approaches. Geomorphology 29, 1–20.
Lotter, G. K. Considerations on hydraulic design of channel with different roughness of walls. Trans. AU Sci. Res. Inst. Hydraul. Eng. 9, 238–241.
Morvan, H. P. Three-dimensional Simulation of River Flood Flows. PhD Thesis, University of Glasgow, Glasgow.
Nakayama, A. & Yokojima, S. LES of open-channel flow with free-surface fluctuations. In: Proc. Hydraul. Eng. JSCE. 46, 373–378.
Pan, Y. & Banerjee, S. Numerical investigation of free-surface turbulence in open-channel flows. Phys. Fluids 113 (7), 1649–1664.
Pang, B. River flood flow and its energy loss. J. Hydraul. Eng. ASCE 124 (2), 228–231.
Pao, Y. H. Adaptive Pattern Recognition and Neural Network. Addison-Wesley, Boston, MA, pp. 197–222.
Riahi-Madvar, H., Ayyoubzadeh, A. S., Khadangi, E. & Ebadzadeh, M. M. An expert system for predicting longitudinal dispersion coefficient in natural streams by using ANFIS. Exp. Sys. App. 36 (2), 1142–1154.
Rumelhart, D. E., Hinton, G. E. & William, D. E. Learning internal representations by error propagation. In: Parallel distributed processing: explorations in the microstructure of cognition (D. E. Rumelhart & J. L. McClelland, eds). MIT Press, Cambridge, MA, pp. 318–362.
Salvetti, M. V., Zang, Y., Street, R. L. & Banerjee, S. Large-eddy simulation of free-surface decaying turbulence with dynamic subgrid-scale models. Phys. Fluids 9 (8), 2405–2419.
Shi, J., Thomas, T. G. & Williams, J. J. R. Large eddy simulation of flow in a rectangular open channel. J. Hydraul. Res. 37 (3), 345–361.
Sinha, S. K., Sotiropoulos, F. & Odgaard, A. J. Three-dimensional numerical model for flow through natural rivers. J. Hydraul. Eng. 124 (1), 13–24.
Soong, T. W. & DePue II, P. M. Variation of Manning’s Coefficient with Channel Stage. Unpublished MS Thesis, University of Illinois at Urbana-Champaign, Urbana, IL.
Sugeno, M. & Kang, G. T. Structure identification of fuzzy model. Fuzzy Sets Syst. 28, 15–33.
Tang, X. & Knight, D. W. a Analysis of bed form dimensions in a compound channel. In: Proceedings of 2nd IAHR Symposium on River, Coastal and Estuarine Morphodynamics, Obihiro, Japan, pp. 555–563.
Tang, X. & Knight, D. W. b Experimental study of stage-discharge relationships and sediment transport rates in a compound channel. In: Proc. 29th IAHR Congress, Beijing, China, 16–21 September, pp. 69–76.
Thomas, T. G. & Williams, J. a Large eddy simulation of a symmetric trapezoidal channel at Reynolds number of 430,000. J. Hydraul. Res. 33 (6), 825–842.
Thomas, T. G. & Williams, J. b Large eddy simulation of turbulent flow in an asymmetric compound open channel. J. Hydraul. Res. 33 (1), 27–41.
Thomas, T. G. & Williams, J. Large eddy simulation of flow in a rectangular open channel. J. Hydraul. Res. 37 (3), 345–361.
Tominaga, A. & Nezu, I. Turbulent structures in compound open-channel flow. J. Hydraul. Eng. ASCE 117, 21–41.
University of Birmingham Flow Database. Available at: www.flowdata.bham.ac.uk/atabay/index.shtml; www.flowdata.bham.ac.uk/fcfa.shtml; www.flowdata.bham.ac.uk/tang/data.shtml.
Walid, H. S. & Shyam, S. S. An artificial neural network for non-iterative calculation of the friction factor in pipeline flow. Comput. Electron. Agric. 21, 219–228.
Werbos, P. J. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Dissertation, Harvard University, Cambridge, MA.
Wormleaton, P. R., Allen, J. & Hadjipanos, P. Discharge assessment in compound channel flow. J. Hydraul. Eng. ASCE 108 (9), 975–994.
Yang, K., Cao, S. & Liu, X. Study on resistance coefficient in compound channels. Acta Mech. Sinica 21, 353–361.
Yang, K., Cao, S. & Liu, X. Flow resistance and its prediction methods in compound channels. Acta Mech. Sinica 23, 23–31.
Yuhong, Z. & Wenxin, H. Application of artificial neural network to predict the friction factor of open channel. Commun. Nonlinear Sci. Numer. Simulat. 14, 2373–2378.
First received 10 April 2012; accepted in revised form 1 May 2013. Available online 30 May 2013
© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014
Evolutionary network flow models for obtaining operation rules in multi-reservoir water systems Néstor Lerma, Javier Paredes-Arquiola, Jose-Luis Molina and Joaquín Andreu
ABSTRACT Obtaining operation rules (OR) for multi-reservoir water systems through optimization and simulation processes has been an intensely studied topic. Here, an innovative approach that integrates network flow simulation models with evolutionary multi-objective optimization (EMO) is proposed for obtaining the operation rules for integrated water resource management (IWRM). This paper demonstrates a methodology based on the coupling of an EMO algorithm (NSGA-II or Non-dominated Sorting Genetic Algorithm) with an existing water resources allocation simulation network flow model (SIMGES). The implementation is made for a real case study, the Mijares River basin (Spain), which is characterized by severe drought events, a very traditional water rights system and its historical implementation of the conjunctive use of surface and ground water. The established operation rules aim to minimize the maximum deficit in the short term without compromising the maximum deficits in the long term. This research proves the utility of the proposed methodology by coupling NSGA-II and SIMGES to find the optimal reservoir operation rules in multi-reservoir water systems.

Key words | agricultural demands, AQUATOOL, decision support system shell, deficits, drought, genetic algorithms, NSGA-II, operating rules, optimization, SIMGES, simulation, water resources system

Néstor Lerma (corresponding author) Javier Paredes-Arquiola Joaquín Andreu Universitat Politècnica de València, Research Institute of Water and Environmental Engineering (IIAMA), Ciudad Politécnica de la Innovación, Camino de Vera, 46022 Valencia, Spain E-mail: neslerel@upv.es Jose-Luis Molina Polytechnic School of Engineering, Department of Hydraulic Engineering, Salamanca University, Av. de los Hornos Caleros, 50, 05003 Ávila, Spain
doi: 10.2166/hydro.2013.151
INTRODUCTION
Several authors have noted the absence of the application of optimization models to the real management of multi-reservoir water systems (Yeh ; Wurbs ; Labadie ). The applicability of most reservoir operation models is limited because of the ‘high degree of abstraction’ necessary for the efficient application of optimization techniques (Akter & Simonovic ; Moeni et al. ). On the other hand, other authors such as Oliviera & Loucks () maintain that this is because of institutional limitations rather than technological or mathematical limitations.

Decision-making in environmental and hydrological projects can be complex and inflexible because of the inherent trade-offs among socio-political, economic, environmental and technical factors. The selection of the appropriate management strategies often involves multiple conflicting objectives that should be ‘optimized’ simultaneously (Makropoulos et al. ). Thus, there exists the concept of Pareto optimal solutions, i.e. solutions for which it is not possible to improve on the attainment of one objective without making at least one of the others worse. Evolutionary multi-objective optimization (EMO) algorithms offer a means of finding the optimal Pareto front (Farmani et al. a; Cisty ; Abd-Elhamid & Javadi ). The decision-maker can consequently be provided with a set of non-dominated solutions to select a final design solution from that set.

Although the efficiency of these algorithms in solving a number of complicated real-world problems in electrical,
34
N. Lerma et al.
|
Evolutionary network flow models in multi-reservoir water systems
Journal of Hydroinformatics
|
16.1
|
2014
hydraulic, structural or aeronautical engineering has been illustrated (Farmani et al. b, , ; Hanne & Nickel ; Molina-Cristobal et al. ; Osman et al. ; Murugan et al. ), there have been limited applications in the policy analysis of water resources management (Farmani et al. ; Molina et al. ). There are recent applications of EMO algorithms related to other water resources research studies, such as the optimal design of water distribution systems or reservoirs (Cisty ; Nazif et al. ; Haghighi et al. ; Hınçal et al. ; Louati et al. ), the conjunctive use of surface water and groundwater (Safavi et al. ), the control of seawater intrusion in coastal aquifers (Abd-Elhamid & Javadi ; Kourakos & Mantoglou ; Sedki & Ouazar ) or hydrological studies (Dumedah et al. ; Gorev et al. ; Hassanzadeh et al. ).

In this work, an evolutionary multi-objective optimization algorithm, NSGA-II (Non-dominated Sorting Genetic Algorithm; Deb et al. ), is coupled with the flow network model SIMGES (Andreu et al. ) and used to assist in the selection of the best operation rules in multi-reservoir water systems.

Despite the development and growing use of optimization models (Labadie ), most reservoir planning and operation studies are based on simulation modelling and thus require the intelligent specification of operation rules (OR). Lund & Guzman () review the derived single-purpose operating rules for reservoirs in series and in parallel for different purposes, with the derived rules supported by conceptual or mathematical deduction. Obtaining OR from the results of optimization models can be done using simple (Young ) or multiple (Bhaskar & Whitlach ) linear regressions and the use of simple statistics, tables and graphs (Lund & Ferreira ). Unfortunately, a regression analysis can produce poor results, limiting the use of the obtained OR (Labadie ). On the other hand, empirical OR have limited applicability, as for the space rule (Bower et al. ) or the New York City rule (Clark ).

In many real systems, the typical OR is defined by a volume target for a reservoir that has to be maintained. Another typical OR is defined by a curve (variable monthly and constant year by year) for a reservoir or a group of reservoirs that defines a threshold to trigger an action, for example, ‘reduce demands’ or ‘start pumping groundwater’. These types of OR are commonly called Rule Curves (RC), and although they are not always the most efficient rules they are considered the most practical and accepted by users.

This paper aims to show the findings of RC for multi-reservoir water systems by means of the coupling of an EMO (NSGA-II) (Deb et al. ) with the simulation flow network model SIMGES (Andreu et al. ). The proposed method is applied to the Mijares River basin water system (Spain) which is characterized by strong drought events, a very traditional water rights system and its historical implementation of the conjunctive use of surface and ground water.

The paper is structured as follows. First, a theoretical background on reservoir operation rules and EMO is developed. A case study is then presented, followed by a description of the integrated methodology in which the implementation of the SIMGES and EMO methods is described. The results are then discussed and several conclusions are drawn.

RESERVOIR OPERATION RULES AND EMO

Traditionally, reservoir operation is based on heuristic procedures, RC and subjective judgments by the operator. This provides general operation strategies for reservoir releases according to the current reservoir level, hydrological conditions, water demands and the time of year (Hakimi-Asiabar et al. ; Moeni et al. ). In practice, reservoir operators usually follow RC which stipulate the actions that should be taken depending on the current state of the system (Alcigeimes & Billib ). Rule curves, or guide curves, are used to denote the operating rules that define the ideal or target storage levels and provide a mechanism for release rules to be specified as a function of water storage (Mohan & Sivakumar ; Hakimi-Asiabar et al. ). Moreover, RC can be defined as a trigger indicator to start different measures, or actions, for water management.

Obtaining RC from the results given by optimization models by linear regressions is a complex task (Young ). Revelle et al. () proposed a linear decision rule; Lund & Ferreira () used tables and statistics of the results from an optimization model to obtain the OR of
the Missouri River water system. A common technique for obtaining OR and RC is based on an iteration method for river basin simulation models. These iterations are controlled by an optimization algorithm that varies the operation rules depending on the results. For example, Cai et al. () described strategies for solving large non-linear water resource management models combining a genetic algorithm (GA) with linear programming (LP), in which a GA/LP approach was applied to a reservoir operation model with hydropower generation and to a long-term dynamic river basin planning model. Simulation models are the most widespread tool for the analysis and planning of water systems. These models are characterized by their flexibility and by the possibility of including very complex elements in the modelling. They allow a more detailed representation of the systems than the optimization models (Loucks & Sigvaldason ). River basin management decisions are therefore generally made with the support of simulation models.

Quantitative compromises for the objectives and constraints presented in the methodology section are developed in this study using a multi-objective evolutionary algorithm (MOEA), non-dominated sorting genetic algorithm II (ε-NSGA-II) (Deb et al. ). The concept of Pareto optimality is used to define the multi-objective compromises for a system. A solution is Pareto optimal (or non-dominated) if no other solution in the solution space gives a better value for one objective without also degrading the performance of at least one other objective. MOEAs are heuristic search algorithms that change the approximation to the Pareto optimal set using crossover, selection and mutation operators to mimic natural selection in the populations of organisms in nature. The evolutionary algorithm search process is an iterative process of selection that preserves and reproduces high-quality solutions and of variation that introduces innovation in order to improve the population of solutions.

There are many examples demonstrating that MOEAs can solve complex non-linear and non-convex multi-objective problems (a detailed review is given by Coello-Coello et al. ). Examples of applications in water resources engineering include groundwater monitoring design (Cieniawski et al. ; Reed & Minsker ; Kollat & Reed ), groundwater remediation (Beckford et al. ; Chan Hilton & Culver ; Singh & Minsker ) and water resources systems management (Suen & Eheart ).

In recent years, there have been new advances and improvements for the NSGA-II MOEA. ε-NSGA-II represents an improvement over the original NSGA-II developed by Deb et al. () by incorporating epsilon-dominance archiving (Laumanns & Ocenasek ) and adaptive population sizing (Harik et al. ). Epsilon-dominance archiving helps to reduce the computational demand of solving high-dimensional optimization problems (Kollat & Reed ) by allowing the user to control the resolution at which the objectives are evaluated and ranked. However, the use of NSGA-II coupled with flow network models, which is the application of this research (SIMGES), is a new topic in the literature. The studies on coupling network flow models and EMO algorithms such as NSGA-II are scarce or even non-existent in the literature. The NSGA-II algorithm can be coupled to several other simulation models to provide optimized solutions by taking advantage of the power of those models (Farmani et al. ; Molina et al. ).

Most of the OR optimization problems have a multi-objective nature. Consequently, a multi-objective analysis is necessary for identifying the best solutions and simultaneously considering several objectives that are frequently in conflict (trade-offs). Many studies have used multi-objective techniques to address the multi-reservoir optimization problem.

Classical multi-objective approaches such as the weighting approach or the constraint method were used for this purpose (Croley & Rao ; Yeh & Becker ; Liang et al. ; Wang et al. ).

More recent applications use evolutionary multi-objective techniques for the same purpose. Reddy & Kumar () developed a multi-objective differential evolutionary algorithm and applied it to the Hirakud reservoir project (India). Kim et al. () applied the NSGA-II algorithm to the Han River basin multi-reservoir system. Chen et al. () developed a macro-evolutionary multi-objective genetic algorithm for optimizing the rule curves of a water resources system in Taiwan. Malekmohammadi et al. () presented an approach for incorporating flood control and water supply objectives for a cascade system of reservoirs by coupling the NSGA-II algorithm with an ELECTRE-TR1
(Elimination and Choice Translating Reality) postprocessor. Reddy & Kumar () presented a multi-objective evolutionary algorithm to derive operation rules for the multi-purpose Bhadra reservoir system (India). Furthermore, Chang & Chang () applied the NSGA-II algorithm in other reservoir systems in Taiwan to optimize state curves. Lin et al. () modified the algorithm SCE-UA (Shuffled Complex Evolution) to use it as a multi-objective tool to determine optimal water policy for the hydroelectric system of Huanren (NE China).

CASE STUDY: MIJARES RIVER BASIN

The Mijares River basin is located in the eastern slope of the Iberian Peninsula (Figure 1). The water system occupies a surface area of 5,466 km². The total population of the zone is 363 578 inhabitants, and the urban supply is generated by exploiting pumping wells and using springs. The total cropped surface is 124 310 ha, of which 43 530 ha (35%) corresponds to irrigated land and the rest (65%) is occupied by dry-land farming. Citruses constitute the predominant crop, with approximately 87% of the irrigated area. The length of the main river branch is approximately 156 km, with an average runoff of 380 Mm³ a⁻¹.

Figure 1 | Location of the Mijares River basin.

Two climatologically different geographical areas can be distinguished: a coastal climate with a Mediterranean coastline and a continental climate area located upstream of the Arenós reservoir. The mean annual rainfall of the area is 505 mm, and the average temperature is 14.4 °C according to the Basin Water Plan (CHJ ). The maximum altitude is 2,024 m above sea level.

Regarding the storage infrastructure of the basin, there are three main reservoirs: the largest in terms of capacity is the Arenós reservoir (95 Mm³); located downstream is the Sichar reservoir (49 Mm³); and finally, located in the tributary Rambla de la Viuda is the María Cristina reservoir (19.7 Mm³).

The topology of the model for the Mijares water system is shown in Figure 2. The model includes a main course that represents the Mijares River where the Arenós and Sichar reservoirs are located. The other river considered is the tributary Rambla de la Viuda, in which the María Cristina reservoir is located. The different sources of runoff considered are the runoff of the basin upstream of the Arenós reservoir, the runoff from the mid-basin of the Mijares River between the Arenós and Sichar reservoirs and the runoff from the Rambla de la Viuda river flowing to the María Cristina reservoir. The irrigation demand can be grouped into four main zones: traditional, channel 220, channel 100 and María Cristina. The main features of these demands are shown in Table 1. The urban supply comes from the Plana de Castellón aquifer, which is located mainly beneath the coastal plain and is recharged by precipitation, infiltration from irrigation and Mijares riverbed infiltration.

Figure 2 | Topology of the simulation model developed for the Mijares River basin.

Table 1 | Values of the water demand for the irrigated areas of the basin (Mm³ a⁻¹)
Traditional irrigated area: surface water 65; groundwater –
Mixed irrigated areas: Channel 100, 37; Channel 220, 40; María Cristina, 25

One of the main issues of the basin is the allocation of the resource between the agricultural demands. The traditionally irrigated area in the low part of the basin is more than a millennium old, so its water rights are predominant over other agricultural uses. On the other hand, the irrigation of the middle part of the basin represents modern irrigation (Channel 220, Channel 100 and María Cristina), also called ‘mixed irrigation’ because of the possibility of using both surface and ground water. In this situation, it is necessary to establish an OR in order to protect the rights of traditional irrigation over surface water by imposing the use of ground water for modern irrigation. Current management is based on a RC defined in 1970, called Agreement 70 (Figure 3). The indicator of this RC is the storage of the Sichar and Arenós reservoirs. If the sum of the volume of both reservoirs is greater than the defined RC, then all the demands can use
cheaper surface water. On the other hand, when the stored volume falls below the RC, the mixed irrigation demands have to pump water and the remaining surface water is reserved for the traditionally irrigated area.

Figure 3 | Rule Curve of ‘Agreement 70’.

METHODOLOGY

The methodology estimates the RC for a complex multi-reservoir water system through the iterative use of a river basin simulation model. A popular EMO algorithm that is usually applied in water resources engineering, NSGA-II, has been used. NSGA-II is an EMO algorithm with a specific operator to handle constraints. Furthermore, the simulation of a water basin management model is required. The results obtained by this model represent the situation of the water system under the proposed water management policies. The water basin management model has been developed using the SIMGES module (Andreu et al. ) included in the decision support system shell (DSSS) AquaTool (Andreu et al. ). The combination of non-linear algorithms together with linear programming is common in water resources models.

Implementation of the simulation model SIMGES

The method requires multiple iterations of a simulation model that accurately represents the water system. For this purpose, the simulation module SIMGES included in the DSSS AquaTool has been used. SIMGES is based on the conceptualization of river basins by networks comprising arcs and nodes. Nodes usually represent the most important elements of the water system, such as divergence and confluence points, reservoirs and demands. On the other hand, arcs represent any water conveyance element (natural or artificial). Furthermore, an internal combination of arcs and nodes within the model allows other types of elements, such as hydroelectric plants and water returns in the internal flow network, to be modelled. Arcs are defined by the initial and final nodes, by the maximum and minimum flows and by the cost incurred by each unit of resource that flows through them. Mathematically, the simulation model is based on the resolution for each time step (monthly in this case) of an internal conservative flow network.

The equivalent objective function defined in the SIMGES model and simplified for our problem is the following:

$$\min F = \sum_{i=1}^{I}\left(\sum_{n=1}^{m} V_{n,i,t}\,(C_n + \mathit{pn}_i) + \mathit{Sp}_{i,t}\,C_{sp}\right) + \sum_{j=1}^{J} \mathit{DR}_{j,t}\,(C_{dr} + \mathit{pn}_j) + \sum_{k=1}^{K} \mathit{DD}_{k,t}\,(C_{DD} + \mathit{pn}_k) \tag{1}$$

where $t$ is the index for time; $i$ is the index for reservoir; $I$ is the total number of reservoirs in the model; $V_{n,i}$ is the volume of reservoir $i$ in pool $n$; $m$ is the number of pools in a reservoir; $C_n$ is the cost/benefit of the storage water in pool $n$; $\mathit{pn}_i$ is the priority number assigned to reservoir $i$; $\mathit{Sp}_i$ is the spill of reservoir $i$; $C_{sp}$ is the cost of spills in the reservoirs; $\mathit{DR}_j$ is the deficit of the minimum flow
established for river or channel $j$; $C_{dr}$ is the cost of deficit of a minimum flow; $J$ is the number of rivers and channels; $\mathit{pn}_j$ is the priority number of river $j$; $\mathit{DD}_k$ is the deficit of demand $k$; $K$ is the number of demands in the model; $C_{DD}$ is the cost associated with the deficits of the demands; and $\mathit{pn}_k$ is the priority number of demand $k$.

Restrictions are related to physical constraints or other types of constraints such as legal or environmental constraints. Other constraints such as the balance in each junction or diversion are also taken into account. Figure 4 shows a diagram of SIMGES, which takes the above aspects and the water system data (demands, inflows, etc.) and translates the problem into an internal network flow optimization problem, resolved using the Out-of-Kilter algorithm (Ford & Fulkerson ).

The water management within the simulation model is defined in several ways. First, a priority system that sets water demands in order of priority (hierarchical order) is established. Similarly, a hierarchical system is established to define the releases among the reservoirs. Furthermore, the reservoirs are divided into zones such that the model tries to keep all reservoirs in the same zone and starts releasing depending on the priority. Finally, there are operation rules that allow the triggering of decisions based on indicators. These indicators can be the volume stored in one or several reservoirs or the cumulative runoff of several months. The decision can represent the application of a restriction on the demands, expressed as a percentage of one or several demands, on the flow through the turbines, on the ecological flow or on the activation of pumping from the aquifer.

The model developed for the Mijares River basin (Figure 2) represents the current situation of the system quite well. Three runoff inflow elements are considered (one for each reservoir), with historical monthly data obtained from re-naturalized monthly flows for the period 1940–2008. Additionally, the three existing reservoirs have been taken into account (Arenós, Sichar and María Cristina). The demands are considered at the correct aggregation level to represent the different irrigators. Six demands have been considered in the model: two urban demands (Castellón de la Plana and Borriol-Benicassim) and the four above-mentioned agricultural demands. SIMGES allows the surface–groundwater interaction to be modelled in a very complete way with several types of aquifers and river reaches connected to the aquifers. There are requirements for the ecological flows established in several parts of the basin. Within the model, the flows are considered in two specific river reaches with a constant flow of 0.5 m³ s⁻¹ (1.3 Mm³/month).

Figure 4 | Flowchart of SIMGES; interaction between data and management system.

NSGA-II implementation

NSGA-II (Deb et al. ) (elitist non-dominated sorting genetic algorithm) is an EMO algorithm with a specific operator to handle constraints. In this method, a fast non-dominated sorting approach with a selection operator is used to create a mating pool by combining the parent and offspring populations and selecting the best solutions with
respect to the fitness and the spread (Deb et al. ; Dumedah et al. ). The next generation is populated starting with the best non-dominated front and progresses through the rest of the fronts until the population size is reached; if in the final stage there are more individuals in the non-dominated front than there is available space, a crowding distance-based niching strategy is used to choose which individuals of that front are entered into the next population. The crowding distance value of a solution provides an estimate of the density of solutions surrounding that solution (Raquel & Naval ). In this research, NSGA-II is used for the evaluation of the objective functions that allow the aptitude of the operation rules to be known.

Through this algorithm, the descendant population $Q_t$ (size $N$) is created using the parent population $P_t$ (size $N$). Both populations are combined to form $R_t$ with a size of $2N$. By means of non-dominated sorting, the population $R_t$ is classified in different Pareto fronts. Although this process requires more effort, it is necessary because dominance testing between the parent and descendant populations is performed. Once the sorting process is complete, the new population is generated from the configurations of the non-dominated Pareto fronts. This new population is first built with the best non-dominated Pareto front ($F_1$). The process continues with the solutions from the second front ($F_2$), the third front ($F_3$) and so on. Because the population $R_t$ has a size of $2N$ and there are only $N$ configurations that form the descendant population, not all of the front configurations belonging to the $R_t$ population will be placed in the new population. Those fronts that cannot be placed are ignored.

When the last front is under consideration, the solutions that belong to this front can exceed the number of places remaining in the descendant population (Figure 5). In this case, it is useful to use strategies that select those configurations lying in a scarcely populated area, far away from the other solutions. This will fill up the rest of the positions of the descendant population instead of choosing configurations randomly.

Figure 5 | Schematic diagram of the mechanism for promoting individuals of NSGA-II.

These strategies are irrelevant for the first generational cycles of the algorithm because there are many fronts that persist to the next generation. However, as the process moves forward, several configurations become part of the first front and this front may have more than $N$ individuals. It is therefore important that the non-rejected configurations are chosen through a methodology that guarantees diversity. When the population as a whole converges to the Pareto front, the algorithm ensures that the solutions are separated from each other.

Initially, a parent population $P_0$ is created in the NSGA-II algorithm (randomly or by an initialization technique). The population is sorted according to the non-dominance of the different levels (sorting of Pareto fronts $F_1, F_2, \ldots$). For each solution, a fitness value is assigned according to its dominance level (1 for the best level), which decreases throughout the process. Selection by tournament (using a
crowding tournament operator), crossover and mutation are used to create the population of descendants $Q_0$ with a size $N$. The main phases followed by NSGA-II are:
1. Combine parents and descendants to create $R_t = P_t \cup Q_t$. Perform non-dominated sorting on $R_t$ and identify the fronts $F_i$, $i = 1, 2, \ldots$
2. Set $P_{t+1} = \emptyset$ and $i = 1$. While $|P_{t+1}| + |F_i| < N$, set $P_{t+1} = P_{t+1} \cup F_i$ and $i = i + 1$.
3. Sort $F_i$ by crowding (using the crowded-comparison operator $<_c$) and include in $P_{t+1}$ the $N - |P_{t+1}|$ most widely spread solutions, using the crowding distance values associated with the front $F_i$.
4. Create the descendant population $Q_{t+1}$ from $P_{t+1}$ using selection by crowding tournament, crossover and mutation.

Coupling of methods: the multi-objective optimization model

NSGA-II is used to define and test RC for the water allocation model developed with SIMGES. Each individual is composed of 13 values representing the value of the RC in each month of the year (12) and the restriction coefficient (1). The indicator of this RC is the storage of the Sichar and Arenós reservoirs. This RC is imposed in the water allocation model, and a run is performed. The results of this run are used to estimate the objective functions, defined by Equations (2)–(4). The values of these objective functions are passed to the multi-objective algorithm to define the aptitude of the RC proposed.

NSGA-II (Deb et al. ) is used to examine the SIMGES model and inspect it for inconsistencies or errors and to generate optimal trade-offs between conflicting objectives considering alternative management scenarios simultaneously. Consistency checks can help provide some confidence in the representation of the decision-maker’s preferences. In checking for consistency, it is important to detect errors in the decision-making utility function. For utility functions implying a complex preference structure, there is a greater need and opportunity for meaningful consistency checks (Castelletti & Soncini-Sessa ). Attempting to achieve the multiple goals simultaneously requires identifying a compromise in the Pareto optimality. EMO algorithms employ a population-based search to find many Pareto efficient solutions in a single run. Once the probability of all the linked nodes has been updated by compiling the SIMGES model, the objective function values are returned to the optimization tool and the process is repeated. Consistency is critical to be able to identify a preferred alternative with confidence. In the proposed method, the first step in the consistency check occurs after the evolutionary algorithm has generated a set of non-dominated policy or management options. Usually, solutions generated by the evolutionary algorithm are a good indicator of shortcomings of the network flow model structure. For example, if changes in a node should have an effect on the utility function and this has been ignored intentionally or unintentionally in the SIMGES model, the results generated by EMO will exploit this weakness in the flow network and generate solutions that should have corresponded to higher utility function values.

The objective functions of the problem take into account the maximum deficit of the demands as well as the resilience of the water system. For that, three objective functions are proposed. The problem can be mathematically expressed as follows. Given three objective functions:

$$x = f(\beta) \tag{2}$$
$$y = g(\beta) \tag{3}$$
$$z = h(\beta) \tag{4}$$

where $x$ is the maximum annual deficit for agricultural demands (MaxDef1Year) (minimized); $y$ is the maximum deficit over 10 consecutive years for agricultural demands (MaxDef10Years) (minimized); and $z$ is the number of years of pumping (minimized). These objective functions are optimized by coupling the NSGA-II algorithm and SIMGES. The results of each optimization run represent the outcome of the SIMGES model. For this reason, these three functions are restricted by the solutions of Equation (1). On the other hand, $\beta$ represents a combination of $n$ non-ranked and non-weighted management options, which are the decision nodes of the SIMGES model and represent the RC. They also represent the genes of the chromosome of the algorithm:

$$\beta = (g_1, g_2, \ldots, g_n) \tag{5}$$
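To make the chromosome of Equation (5) concrete, the sketch below decodes a 13-gene individual into a monthly rule curve plus a restriction coefficient and applies an 'Agreement 70'-style test for one month. The bounds (5 to 87 Mm³ for the thresholds, 0 to 1 for the coefficient) are the ranges quoted in this section; the names `RuleCurve`, `decode` and `surface_supply_fraction`, the clamping of out-of-range genes, and the reading of the coefficient as the fraction of surface supply withheld from mixed irrigation are assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Sequence

V_MIN, V_MAX = 5.0, 87.0  # Mm3, threshold range quoted in the text


@dataclass(frozen=True)
class RuleCurve:
    monthly_threshold: List[float]  # 12 storage thresholds, Mm3
    restriction: float              # 0 = no restriction, 1 = total restriction


def decode(genes: Sequence[float]) -> RuleCurve:
    """Turn 13 genes into a rule curve (12 thresholds + 1 coefficient)."""
    if len(genes) != 13:
        raise ValueError("expected 12 monthly thresholds and 1 coefficient")
    # Clamp each gene to its feasible range (an assumed repair strategy).
    thresholds = [min(max(g, V_MIN), V_MAX) for g in genes[:12]]
    restriction = min(max(genes[12], 0.0), 1.0)
    return RuleCurve(thresholds, restriction)


def surface_supply_fraction(rc: RuleCurve, month: int,
                            sichar: float, arenos: float) -> float:
    """Fraction of surface water left to mixed irrigation in `month` (0-11).

    The indicator is the combined Sichar + Arenos storage. Above the curve
    no restriction applies; otherwise the coefficient is applied
    (assumed interpretation of the restriction coefficient).
    """
    if sichar + arenos > rc.monthly_threshold[month]:
        return 1.0
    return 1.0 - rc.restriction
```

For example, with a flat 50 Mm³ curve and a coefficient of 0.4, a combined storage of 55 Mm³ leaves mixed irrigation unrestricted, while 45 Mm³ leaves it 60% of its surface supply under this interpretation.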
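As a minimal sketch of how the three objective values of Equations (2)–(4) might be computed from one simulation run, the Python below assumes that the simulator output has already been reduced to two per-year series: the annual agricultural deficit as a percentage of annual demand, and the annual pumped volume. These inputs, the function name `objective_values`, and the reading of MaxDef10Years as the largest sum of annual percentage deficits over any 10 consecutive years are assumptions for illustration; the text does not spell out these formulas.

```python
from typing import Sequence, Tuple


def objective_values(
    annual_deficit_pct: Sequence[float],
    annual_pumping: Sequence[float],
    window: int = 10,
) -> Tuple[float, float, int]:
    """Return (x, y, z) for one simulated rule curve.

    x: maximum annual deficit (MaxDef1Year)
    y: maximum deficit accumulated over `window` consecutive years
       (MaxDef10Years, assumed here to be a sum of annual percentages)
    z: number of years in which any pumping occurred
    """
    if len(annual_deficit_pct) != len(annual_pumping):
        raise ValueError("series must cover the same years")
    if len(annual_deficit_pct) < window:
        raise ValueError("series shorter than the window")

    x = max(annual_deficit_pct)

    # Sliding-window sum over every run of `window` consecutive years.
    running = sum(annual_deficit_pct[:window])
    y = running
    for k in range(window, len(annual_deficit_pct)):
        running += annual_deficit_pct[k] - annual_deficit_pct[k - window]
        y = max(y, running)

    z = sum(1 for p in annual_pumping if p > 0)
    return x, y, z
```

All three values are to be minimized, so they can be handed to the optimizer unchanged.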
These n input variables representing the RC denote the set of feasible parameters over which the model produces a realistic output. There are therefore j optimized solutions placed at the Pareto front, expressed as j combinations of the different operation rules belonging to each input variable:

$$\begin{aligned} \beta_a &= x_a,\ y_a,\ z_a \\ \beta_b &= x_b,\ y_b,\ z_b \\ &\ \ \vdots \\ \beta_j &= x_j,\ y_j,\ z_j \end{aligned} \qquad (6)$$

Each β (the RC) is represented by the volume threshold in each month of the OR and the restriction coefficient corresponding to each of them. These variables (volume threshold and restriction coefficient) are discretized at certain intervals. The volume level lies between a minimum (5 Mm³) and a maximum (87 Mm³) value depending on the associated reservoirs (Sichar and Arenós), and the restriction coefficient varies between 0 and 1 or, in other words, between not applying a restriction and applying a total restriction (100%).

Two constraints related to the deficit objective functions were defined:

$$\text{MaxDef1Year} < 50\% \qquad (7)$$

$$\text{MaxDef10Years} < 100\% \qquad (8)$$

Each evaluation of the objective functions requires that the simulation model be run under this operation rule. To do this, the process is as follows (Figure 6). First, the parameters of the EMO and the minimum and maximum thresholds of the decision variables are defined in a Master Application that is responsible for controlling the whole process. After this, the Master Application runs NSGA-II, which defines the first individual (set of decision variables), and with these variables the data files needed for SIMGES are created. SIMGES is run, and the Master Application imports the results and calculates deficits. The deficits and the aquifer pumping allow the OFs to be evaluated, and these values are returned to the optimization model to create the next individual.

Regarding EMO, the initial population for the optimization was 200 with a crossover probability of 0.9, a single-point binary crossover, a bitwise mutation probability of 0.005 and a seed for a random number generator of 0.123457. This setting proved the most suitable for handling the problem in a detailed test with different configurations.

Figure 6 | Schematic model coupling.

RESULTS AND DISCUSSION

The results drawn from this analysis are shown in the different figures representing, on the one hand, the Pareto front that links the different objective functions and, on the other hand, the operation rule parameters that are the decision variables of the algorithm. The results presented here correspond to different tests conducted with the NSGA-II algorithm for the different OR proposed.

Two hundred points are represented in Figures 7–10. Each of these points represents the result of applying SIMGES for each combination of parameters obtained using NSGA-II to define the RC mentioned at the beginning of the previous section on 'Coupling of methods'. Each of these 200 points represents an optimized solution for the OFs defined in Equations (2)–(4), the RC parameters or other variables of interest (maximum pumping of the mixed irrigation), drawn from the last population found by the NSGA-II algorithm; this is how the RCs are obtained. To relate the solutions of one figure to the rest of the figures, a colour scale gradient has been fixed and solutions are sorted according to the maximum annual deficit of the agricultural demands (abscissa of Figure 7). Table 2 lists the colour/symbol coding adopted in Figures 7–11. For example, the points (in any figure) with colours between orange and yellow are related to OR that provide a maximum annual deficit of the agricultural demands between 5 and 10%.

Figure 7 | Pareto front 1; maximum deficits for the agricultural demands (for colour/symbol coding, see Table 2).

Figure 8 | Pareto front 2; number of years pumped versus deficit of the agricultural demands (for colour/symbol coding, see Table 2).

Figure 9 | Maximum pumping of 1 and 10 years for the mixed irrigation (for colour/symbol coding, see Table 2).

Figure 10 | Restriction coefficient (for colour/symbol coding, see Table 2).

Table 2 | Colour/symbol coding adopted in Figures 7–11

Colour/Symbol            Maximum annual deficit of the agricultural demands (%)
Red/▴–orange/▪           0–5
Orange/▪–yellow/▪        5–10
Yellow/▪–green/×         10–20
Green/×–cyan/♦           20–25
Blue/Ж–purple/•          25–30
Purple/•–pink/•          30–35
Pink/•–dark red/+        35–37
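The evaluation step of the coupling shown in Figure 6, together with the bounds on the decision variables and the constraints of Equations (7) and (8), can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `run_simulation_stub` is a hypothetical stand-in that returns invented numbers in place of a real SIMGES run, individuals are drawn at random rather than produced by NSGA-II, and every function name is introduced here for illustration.

```python
import random

N_MONTHS = 12
VOL_MIN, VOL_MAX = 5.0, 87.0            # Mm3, threshold bounds quoted in the text
MAX_DEF_1Y, MAX_DEF_10Y = 50.0, 100.0   # limits of Equations (7) and (8), in %


def random_rule_curve(rng):
    """One individual: 12 monthly volume thresholds plus a restriction coefficient."""
    thresholds = [rng.uniform(VOL_MIN, VOL_MAX) for _ in range(N_MONTHS)]
    coefficient = rng.uniform(0.0, 1.0)
    return thresholds + [coefficient]


def run_simulation_stub(individual):
    """HYPOTHETICAL replacement for a SIMGES run; the formulas are invented."""
    thresholds, coefficient = individual[:N_MONTHS], individual[N_MONTHS]
    level = sum(thresholds) / (N_MONTHS * VOL_MAX)   # between 0 and 1
    max_def_1y = 40.0 * (1.0 - level * coefficient)
    max_def_10y = 3.0 * max_def_1y
    years_pumped = 35 + round(33 * level * coefficient)
    return max_def_1y, max_def_10y, years_pumped


def is_feasible(objectives):
    """Check the two deficit constraints of Equations (7) and (8)."""
    max_def_1y, max_def_10y, _ = objectives
    return max_def_1y < MAX_DEF_1Y and max_def_10y < MAX_DEF_10Y


def evaluate_population(size, seed):
    """Create `size` individuals, simulate each one and flag its feasibility."""
    rng = random.Random(seed)
    results = []
    for _ in range(size):
        individual = random_rule_curve(rng)
        objectives = run_simulation_stub(individual)
        results.append((individual, objectives, is_feasible(objectives)))
    return results


population = evaluate_population(200, seed=1)
```

In the procedure described in the paper, the objective values would be handed back to NSGA-II so that it can create the next individual; the sketch only shows how one population is decoded, simulated and checked against the constraints.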
Figure 11 | Curves at the volume level, parameter of the operation rule (for colour/symbol coding, see Table 2).
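Each rule curve of the kind plotted in Figure 11 combines twelve monthly volume thresholds with one restriction coefficient. One way to read such a curve is sketched below in Python. The sketch assumes that the restriction applies to the surface supply of the mixed irrigation whenever the combined storage of Arenós and Sichar falls below the monthly threshold, and that the restricted share has to be covered by pumping; the function name, the argument names and the numbers are hypothetical.

```python
def surface_supply_to_mixed(month, storage_arenos, storage_sichar, demand, rule_curve):
    """Surface water allotted to mixed irrigation under a 13-value rule curve.

    rule_curve holds 12 monthly volume thresholds (Mm3) followed by one
    restriction coefficient in [0, 1]; month is a 0-based index into the year.
    """
    thresholds, coefficient = rule_curve[:12], rule_curve[12]
    combined_storage = storage_arenos + storage_sichar
    if combined_storage < thresholds[month]:
        # Below the curve: only the unrestricted share is served from the surface.
        return demand * (1.0 - coefficient)
    # At or above the curve: no restriction is applied.
    return demand


# Hypothetical curve: a flat 40 Mm3 threshold with a total restriction (100%).
curve = [40.0] * 12 + [1.0]
restricted = surface_supply_to_mixed(0, 15.0, 20.0, 10.0, curve)     # storage 35 < 40
unrestricted = surface_supply_to_mixed(0, 30.0, 20.0, 10.0, curve)   # storage 50 >= 40
```

With a coefficient of 1 the surface supply drops to zero below the curve, which corresponds to the total restriction mentioned in the text; a coefficient of 0 would leave the demand untouched.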
Figure 7 shows the Pareto front corresponding to the short term (1 year) and the long term (10 years) of the deficit of agricultural demands. Note that a lower front can be distinguished from the points dispersed above this line. This OR represents a great variety of possible solutions, with deficits ranging from 0 to 36% for the short-term deficit and up to 100% for the long-term deficit (in percentage over the annual demand).

The growing trend of this figure is due to the conditions of the basin: (1) storages in the reservoirs and (2) the fact that agricultural uses with more demand can be supplied with groundwater. The set of demands that can receive groundwater can therefore achieve a state of no deficit. From this situation, and as shown in the figure, increasing the annual deficit implies that the accumulated 10-year deficit also grows. The optimal solution is not that with zero deficits, and the number of years pumped must also be taken into account.

Figure 8 shows the number of years pumped, sorted according to the annual deficit of the agricultural demands. Note that this figure indicates the number of years of pumping required to achieve specific deficits. This parameter (the number of years) was taken into account because pumping has an associated cost, which decreases as the pumping time is reduced. Three zones can be distinguished in the figure. The first covers approximately 0–20% of the annual deficit, in which pumping takes place in more than 55 years. For lower deficits it is therefore necessary to pump in up to 68 years, which is the number of years simulated. The second zone is associated with annual deficits of 20–28%, with 10-year accumulated deficits ranging between 20 and 100%, hence the high values of these deficits. By not restricting the demand, constant pumping is not necessary and the number of years pumped decreases to 35. This lower value of years pumped means that, no matter which operating rules apply, it is always necessary to pump in at least 35 of the 68 years of the simulation because the surface water is not enough to supply the whole water demand of the basin. Finally, the third zone corresponds to a large number of years pumped, but this time the zone is associated with high values of the maximum deficits; this set of solutions is not appropriate because of the high deficits of the demands and the number of pumped years.

In addition to the number of years pumped, it is important to represent the maximum annual pumping of the mixed irrigation against the maximum long-term pumping of the same demand (Figure 9). The figure shows a scatter cloud of points and much narrower intervals of variation for the pumping than for the deficits of the agricultural demands. The annual pumping is 87–100% of the maximum annual pumping, and the 10-year pumping is 67–90% of the maximum accumulated over 10 years. Very high values for both indicators imply that water is scarce and that high pumping is required for the agricultural areas of channel 100, channel 220 and María Cristina (mixed irrigation) in order to avoid deficits.

Looking at the colour distribution discussed above, it can be seen that the first stage (0–15% of the annual deficits)
of the Pareto front for the deficits is associated with a change in the 10-year pumping between 90 and 84%. The rest of the Pareto front (15–35% of the annual deficits) corresponds to a variation of the annual pumping between 100 and 87%. There is therefore one area that varies as a function of the 10-year pumping and another area that depends on the annual pumping.

Figure 10 represents the coefficient of restriction (an OR parameter and a decision variable of the algorithm) as a function of the maximum deficit of the agrarian demands. The obtained restriction is around 100%, more specifically 92–100%, although the largest set of solutions lies at 96–100%. The figure reveals that a very high restriction has to be applied regardless of the results obtained. However, the restriction acts together with the volume level (the other parameter of the operating rules) in producing these results.

In Figures 7 and 8, the 200 points shown in each figure (last population found by NSGA-II) represent two Pareto fronts, the first between the short term (1 year) and the long term (10 years) of the deficit of agricultural demands and the second between the short-term deficit and the number of pumping years.

As mentioned above, there are 200 results that provide different combinations of objective functions. These results translate into 200 RCs obtained by the NSGA-II algorithm. Each of these RCs is a curve defined with 13 values, 12 corresponding to the months of a year and another to the coefficient of restriction. Because representing and analyzing 200 curves is not feasible, and given that some curves are not applicable to real management scenarios because of the complexity and variability of their definitions, four curves have been selected to represent various parts of the Pareto front (Figure 11).

Curve A (filled square/orange in online version) corresponds to solutions close to the origin of Pareto front 1 (Figure 7), i.e. maximum annual and 10-year deficits of 5% of the agricultural demands. This implies, as already explained, a large number of years pumped (Figure 8) and high pumping values (Figure 9). To achieve these results, the OR are defined with fairly high levels (compared to the other three curves). When the sum of the volumes of the Arenós and Sichar reservoirs is below those levels (which indicates that the mixed irrigation is not supplied with surface water), only traditional irrigation has to be taken into account and mixed irrigation has to pump water. Curve B (diamond/cyan in online version) is associated with annual maximum deficits of 20–25% of the agrarian demands and 30–100% in the case of the maximum deficits of 10 years. This curve is defined with a lower level than curve A, allowing more surface water to supply the mixed irrigation and, therefore, somewhat less pumping. Curve C (Cyrillic symbol Ж/dark blue in online version) is similar to B, differing mainly in the first months of the hydrological year, i.e. November to January. In those months, curve B reserves more surface water for traditional irrigation. However, curve C allows more surface water to supply mixed irrigation; for this reason, the deficit of the traditional irrigation (and therefore of the agrarian demands) increases. Finally, curve D (filled circle/purple in online version) corresponds to maximum annual deficits of 25–30% of the agricultural demands and 30–100% in the case of the maximum deficits of 10 years. This RC is defined with low levels and is associated with a very small reserve for traditional irrigation, causing high deficits of traditional demands.

Table 3 shows the results (deficit and pumping) of the water system without OR and with RC 'Agreement 70'. The results without OR are not among the solutions of the NSGA-II algorithm because they have a maximum 10-year deficit of the traditionally irrigated area that is larger than the limit established in official studies developed by the Jucar Basin Authority. The results with the 'Agreement 70' RC follow a similar behaviour to the points marked '×' (green in online version) of the figures, and this curve (Figure 3) corresponds to curve C (Figure 11) but with a slightly lower level.

Table 3 | Results of deficits and pumping without OR and with the RC Agreement 70

                                  Without OR (%)    RC Agreement 70 (%)
Traditional irrigated area
  Maximum deficit of 1 year       37.12             23.35
  Maximum deficit of 10 years     217.28            55.77
Mixed irrigated areas
  Maximum pumping of 1 year       87.4              97.85
  Maximum pumping of 10 years     59.64             73.3

CONCLUSIONS

This paper demonstrates the optimization of operating rules based on the coupling of an EMO with a flow network model. This approach allows a set of rule curves of a reservoir to be obtained for the allocation of water to demands during drought. The EMO used was the NSGA-II algorithm. The simulation model was developed with the program SIMGES of the decision support system shell AQUATOOL, based on network flow algorithms. The problem that arises is how to reduce the highest annual deficits and the maximum long-term deficits while taking into account the cost of additional pumping. The optimization decision variables are the trigger volume for applying the OR and the restriction coefficient. The coupling methodology is based on the evaluation of the objective function, which represents a run of the simulation model for watershed management to estimate the demand deficits and pumping. This methodology has been applied to the Mijares River basin, a system that is characterized by severe droughts, a well-established system of rights between users and the possibility of the joint use of surface and groundwater resources. By applying this approach, different types of operating rules have been tested to provide results in terms of deficits and pumping. A multi-objective point of view allowed the short and long terms of the deficit and the pumping resource to be taken into account. Moreover, this implementation helps users or managers of the water system to determine the best or most convenient management for the river basin.

ACKNOWLEDGEMENTS

The authors wish to thank the Confederación Hidrográfica del Júcar (Spanish Ministry of the Environment) for the data provided in developing this study, the Comisión Interministerial de Ciencia y Tecnología (CICYT or Spanish Ministry of Science and Innovation) for funding the projects INTEGRAME (contract CGL2009-11798) and SCARCE (program Consolider-Ingenio 2010, project CSD2009-00065) and the Generalitat Valenciana for the Gerónimo Forteza grant (FPA/2012/006). The authors also thank the European Commission (Directorate-General for Research & Innovation) for funding the project DROUGHT-R&SPI (program FP7-ENV-2011, project 282769) and the Seventh Framework Programme of the European Commission for funding the project SIRIUS (FP7-SPACE-2010-1, project 262902).
First received 24 September 2012; accepted in revised form 6 May 2013. Available online 25 June 2013
© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

Generation of the bathymetry of a eutrophic shallow lake using WorldView-2 imagery
Onur Yuzugullu and Aysegul Aksoy

Onur Yuzugullu
Aysegul Aksoy (corresponding author)
Department of Environmental Engineering, Middle East Technical University, 06800 Ankara, Turkey
E-mail: aaksoy@metu.edu.tr

ABSTRACT In this study, water depth distribution (bathymetric map) in a eutrophic shallow lake was determined using a WorldView-2 multispectral satellite image. Lake Eymir in Ankara (Turkey) was the study site. In order to generate the bathymetric map of the lake, image and data processing, and modelling were applied. First, the bands that would be used in depth prediction models were determined through statistical and multicollinearity analyses. Then, data screening was performed based on the standard deviation of standardized residuals (SD_SR) of depth values determined through preliminary linear regression models. This analysis indicated the sampling points utilized in depth modelling. Finally, linear and non-linear regression models were developed to predict the depths in Lake Eymir based on remotely sensed data. The non-linear regression model performed slightly better compared to the linear one in predicting the depths in Lake Eymir. Coefficients of determination (R²) up to 0.90 were achieved. In general, the bathymetric map was in agreement with observations except at resuspension areas. Yet, regression models were successful in defining the shallow depths at shore, as well as at the inlet and outlet of the lake. Moreover, deeper locations were successfully identified.

Key words | bathymetry, remote sensing, shallow lake, WorldView-2
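The screening and regression steps summarized in the abstract can be illustrated with a short Python sketch. It is a sketch under assumptions rather than the authors' procedure: it uses a single predictor, a cut-off of two standard deviations on the residuals (the abstract does not state the cut-off) and invented radiance and depth values; all function names are introduced here.

```python
from statistics import mean, pstdev


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line."""
    my = mean(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def screen_points(x, y, k=2.0):
    """Keep points whose residual lies within k standard deviations (assumed rule)."""
    slope, intercept = fit_line(x, y)
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    sd = pstdev(residuals)
    return [(xi, yi) for xi, yi, r in zip(x, y, residuals) if abs(r) <= k * sd]


# Invented data: band radiance (x) against measured depth in metres (y),
# with one deliberately inconsistent sample at x = 5.
radiance = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
depth = [0.5, 1.0, 1.5, 2.0, 7.5, 3.0, 3.5, 4.0, 4.5, 5.0]

kept = screen_points(radiance, depth)
kept_x = [p[0] for p in kept]
kept_y = [p[1] for p in kept]
slope, intercept = fit_line(kept_x, kept_y)
r2 = r_squared(kept_x, kept_y, slope, intercept)
```

A preliminary fit flags the inconsistent sample, and the model is then refitted on the remaining points, mirroring the order of steps given in the abstract.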
doi: 10.2166/hydro.2013.133

INTRODUCTION

Water depth is important for several physical and biological processes in a lake (Leira & Cantonati ). Together with water volume, water depth impacts natural assimilation capacity, pollution dilution factor, water temperature and retention time. Light penetration and growth of algal species (especially attached algae) may depend on the depth of water. Furthermore, water depth influences mixing of water layers, sedimentation of solids and re-suspension of bottom sediments. Therefore, obtaining the spatial distribution of water depths or bathymetric information may be critical in assessing the impact of pollutants on lake water quality.

Sonar/radar systems have been frequently used in derivation of the bathymetric maps of lakes (Tureli & Norman ; Morgan et al. ). However, these systems may require extensive fieldwork and financial means, especially for large water bodies. In order to ease these difficulties, remotely sensed images (hyperspectral, multispectral, aerial and radar) can be used in obtaining bathymetric information. In the literature, such applications have mainly focussed on coastal waters and estuaries (Philpot ; Greidanus et al. ; Robbins ; Hennings ; Sandidge & Holyer ; Roberts ; Calkoen et al. ; Lafon et al. ; Dierssen et al. ; Stumpf et al. ; Jordan & Fonstad ; Mobley et al. ; Lyzenga et al. ; McIntyre et al. ; Bachmann et al. , ; Kao et al. ; Lee ; Marchisio et al. ). However, the use of multispectral images in determination of the bathymetry of lakes is not common. This is due to the fact that complex relationships dominate between radiance and lake characteristics in lakes compared to Case 1 waters, as well as coastal waters and estuaries. Complexities arise from high chlorophyll-a concentrations, suspended solids, organic matter and bottom reflection in most lakes. The existence of few sensors on most multispectral satellites may result in insufficiency in distinguishing between the impacts of these factors
on radiance values. However, the launch of new satellites, such as WorldView-2, may provide new means and additional sensors that may aid in depth determination in lakes as well.

The WorldView-2 satellite was launched in the fourth quarter of 2009. It has eight spectral bands covering the electromagnetic spectrum range of 400–1,040 nm (Table 1) (Digital Globe ). With its 0.46 m panchromatic and 1.84 m multispectral resolution, studies which require high spatial resolution can be conducted (Lee et al. ). The satellite has a radiometric resolution of 11 bits and a temporal resolution of 3.7 days at 20° or less. It allows images to be captured in an area of 65.6 km × 110 km at the nadir. Its coastal blue band, which senses the 400–450 nm range of the spectrum, is characterized by its relatively shorter wavelength and higher energy. It can penetrate to deeper parts of water bodies. It has been reported that depths down to 30 m can be identified by coastal blue and blue bands (Digital Globe ).

WorldView-2 imagery has been used in recent studies for bathymetry determinations in coastal waters and estuaries (Glass et al. ; Marchisio et al. ; Lee et al. ; McCarthy et al. ; Parthish et al. ). Lee et al. () reported that green and yellow bands were more effective in depth determination in the range of 2.5–20 m in coastal waters. Marchisio et al. () showed the efficacy of coastal blue and blue bands in revealing depths up to 7 m. To our knowledge, there is no study on bathymetry generation for shallow eutrophic lakes using WorldView-2 imagery.

In this study, a WorldView-2 image was used to determine the bathymetry and, therefore, the spatial distribution of water depths in eutrophic Lake Eymir in Ankara, Turkey. The relationships between measured depths and radiance values at different bands were investigated. Linear and non-linear regression models were developed to derive the spatial distribution of depths. The depths obtained with these models were compared to the actual bathymetric map of the lake.

MATERIALS AND METHODS

Study area

Lake Eymir is a shallow natural lake located at 39.28 N and 32.30 E. It is located 20 km south of Ankara (Figure 1) at an altitude of 969 m (Beklioglu et al. ). The surface area of the lake is around 1.25 km². It has a shoreline of 11 km and a catchment area of 971 km². In 1990, the area surrounding the lake and 245 km² of its catchment area were declared a 'Special Environmental Protection Area' by Decree of the Cabinet of Ministers due to their ecological significance (Yuzugullu ).

The average water depth in the lake changes depending on the balance between inflows and outflow. Lake Eymir is mainly fed by Lake Mogan in the south (98% of the total inflow), the Kislakci Stream in the east and groundwater sources. The excess water of the lake drains into Imrahor Creek in the east (Yenilmez et al. ). The average water depth is 4 m. Annual water level fluctuations in the lake vary by 0.5–1.0 m, depending on the net inflow and evaporation (Yagbasan & Yazicigil ). In April 2011, the average depth in the lake reached 4.5 m (Yuzugullu ).

The lake has been suffering from the effects of eutrophication. It has been turbid and rich in algal species for a long time. In studies performed in different time periods, eutrophic conditions were reported (Diker ; Tan ; Ozen ). It was shown that water balance, and therefore the depth of water, had a significant impact on the water quality of the lake (Beklioglu et al. ). Therefore, bathymetry and water depths provide substantial information in the assessment of water quality changes in the lake.

Table 1 | Spectral bands of WorldView-2 sensors

Band                     Wavelength range (nm)    Band                  Wavelength range (nm)
Coastal Blue (Band 1)    400–450                  Red (Band 5)          630–690
Blue (Band 2)            450–510                  Red Edge (Band 6)     705–745
Green (Band 3)           510–580                  NIR-1 (Band 7)        770–895
Yellow (Band 4)          585–625                  NIR-2 (Band 8)        860–1,040
METHODOLOGY The methodology followed in the development of depth (or bathymetry) models based on remotely sensed data is
52
O. Yuzugullu & A. Aksoy
Figure 1
|
|
Bathymetry generation using WorldView-2 imagery
Journal of Hydroinformatics
Locations of Lake Eymir and sampling points.
depicted in Figure 2.

Figure 2 | Flowchart of the methodology.

Following the acquisition of the image on 28 July 2010, field work was carried out on 2 August 2010. Depth measurements were conducted at 59 points (depicted in Figure 1). In the time gap between the image and field work dates, there was no precipitation or significant change in temperature or other conditions which would alter water depths. As a result, it was assumed that the depths and water quality parameters were representative of the conditions on the date the image was taken. Sampling locations for ground truth data were selected arbitrarily to cover the whole lake area. The geographical coordinates of the points were determined using a Garmin GPS receiver with ±1.5 m positional accuracy on average.

Image processing was conducted using ENVI 4.7. The image had geographic projection and ED 50 Datum. The lake area was cropped and isolated, so that only the lake area was taken as the region of interest. The dark pixel subtraction method was used in order to eliminate atmospheric effects in the ortho-rectified satellite image (Chavez ). In order to obtain the radiance values, first, image histograms were generated for the corresponding spectral bands. Then, zero values in the histograms were removed. Finally, the minimum and the maximum values in the histograms were determined to aid in the conversion
of digital numbers to radiance values using the method provided by Beisl et al. (). Histogram values were matched to digital numbers in the range of 0–255. By generating band-specific linear equations, digital numbers were converted to radiance values.

In order to test the suitability of data in regression model development and to improve the model prediction performance, a screening procedure was applied to select the independent variables of depth models. The initial stage of the procedure was to remove the sampling points at locations with high turbidity. This was applied to minimize the negative impact of re-suspended sediments in depth determination. Since Lake Eymir is a shallow lake, local re-suspension can occur due to various factors such as groundwater inflow, wind effect, and turbulence due to velocity variations as a result of cross-sectional and flow direction changes. As depicted in Figure 1, the shape of Lake Eymir makes it prone to these impacts. On the sampling date, the lake was mostly clear with an average total suspended solids concentration of 1.92 mg/L and an average chlorophyll-a concentration of 3.49 μg/L (Yuzugullu ). However, at some locations turbidity was observed due to re-suspension of bottom sediments. These locations were identified on the image by locating the zones that exhibit high radiance due to suspended solids. The sampling points were placed over the image and the ones that were over the re-suspension areas were removed from the data set. As a result, 11 sampling points were removed from further analysis. These points can be seen in Figure 1 (one is hidden due to overlap).

The water depth (the dependent variable of the models) and the radiances at eight spectral bands (the independent variables of the models) were analysed for validity of normality. For this purpose, Q–Q plots were prepared assuming normal, log-normal and exponential distributions. These plots were used to determine the form (as is, logarithmic transformation, or exponential transformation) of the independent variables that would be used in regression model development. The distribution type of a variable was selected based on the slope information in the corresponding Q–Q plot. If the slope was close to 1, the corresponding distribution type was selected for the given variable. Following the determination of independent variable distribution forms, multicollinearity analysis was performed in an iterative procedure to identify highly correlated independent variables. At this stage, the correlation matrix was used to remove an independent variable that had the highest correlation with another. Then, a new correlation matrix was generated for the remaining variables. This cycle was repeated until multicollinearity was eliminated between independent variables. The correlation coefficient (r) was used as the criterion for variable elimination. It was assumed that multicollinearity existed between variables if the absolute r value was greater than 0.6. The variables remaining after the multicollinearity analysis were considered in regression model development for prediction of the water depths or bathymetry of Lake Eymir.

Performances in bathymetry determinations using multispectral images are variable for Case 1 and Case 2 waters. In Case 1 waters (i.e., open ocean) chlorophyll is the main optically active constituent. These waters generally lack suspended particles. On the other hand, there is a complex relationship between reflectance and water quality parameters in Case 2 waters (i.e., coastal, estuary or inland waters such as lakes). This complexity is mainly due to the co-presence of chlorophyll, suspended particles and coloured dissolved organic matter in high concentrations (Kishino et al. ; Sudheer et al. ). Since depth determination in Case 2 waters or turbid lakes using multispectral images can be problematic, data elimination may be required to improve the prediction capability of the bathymetry models. Stevens () showed that regression model prediction performance can be improved by eliminating outlier data based on standardized residuals (SR) of a regression model. In this approach, first a regression model is developed using the data set. Then, outlier observation points are determined and a new model is developed using the remaining observation points. In this study, a similar approach was used to eliminate the outlier observations. First, an initial regression model was developed. Then, the standard deviation of SR (SD_SR) was calculated. At this stage, different multipliers (n) of SD_SR were evaluated (n = 1.5, 1.4, …, 0.5). Then, observation data with an SR greater than n × SD_SR were eliminated. For each case (n × SD_SR), a linear regression model was created. Then, these models were assessed based on basic statistics (minimum, maximum, mean and standard deviation) for the dependent variable (depth), and F-test
for predictions. The model with a small F-value and basic statistics similar to the original observation data (48 observation points) was chosen as the best model. For the observation data for Lake Eymir, n = 0.7 resulted in the best filter in establishment of the data set for model development. However, this filter (0.7 × SD_SR) resulted in elimination of 23 additional sampling points. As a result, bathymetry model developments were realized using the 25 sampling points depicted in Figure 1, which corresponded to a sampling density of 20 samples per square kilometre of the lake. Thirty-two per cent of these points (eight sampling points) were used in the model development stage. The remaining 68% (17) were employed for model validation. Allocation of the locations of the sampling points for model development and model validation was performed arbitrarily while care was taken to have as even a spatial distribution as possible.

Following data screening, linear and non-linear regression models were developed to predict the bathymetry of the lake. The general forms of the linear and non-linear regression models are given in Equations (1) and (2), respectively:

$$d_i = a + \sum_{j=1}^{J} k_j x_{ij} \tag{1}$$

$$d_i = \sum_{j=1}^{J} k_j x_{ij}^{m_j} \tag{2}$$

where $d_i$ is the water depth at location $i$, $a$ is the intercept, $k_j$ is the regression coefficient for band $j$, $x_{ij}$ is the radiance at location $i$ at band $j$, and $m_j$ is the exponent for band $j$. The intercept $a$ represents the offset for the depth of 0 m (Loomis ) for the linear regression model. This parameter is used to handle the average error that would be produced by over- and under-predictions at different depths as a result of the impact of heterogeneous bottom cover (macrophytes, sand, gravel, etc.) and variable water quality (suspended solids, chlorophyll, etc.) on reflectance values (Loomis ) for the linear model. In the above equations, $a$, $k_j$ and $m_j$ values are set by XLStat software by minimizing the root mean square error (RMSE) and maximizing the Pearson coefficient of determination (R²) values between observed and predicted water depths at given locations.

RESULTS AND DISCUSSION

The Q–Q plots indicated that the radiance data in most of the spectral bands had normal distributions, except in Band 5 and Band 2. In these bands, log-normal distributions prevailed. Based on this information, the log transformations (base 10) of the data in Bands 5 and 2 were used in multicollinearity and correlation analysis. The correlation matrix indicated that Band 1 (coastal blue) was highly correlated with Bands 2, 3 and 4 (r > 0.75). Moreover, at 95% confidence level, r values for the relationships between Band 1 and Band 6, and Band 1 and Band 7 were higher than 0.6, which was the lower limit for multicollinearity elimination. Multicollinearity analysis indicated that only the data in Band 1, Band 8 and the logarithmic transform of the data in Band 5 were independent from each other and could be used as explanatory variables in regression model development. Puetz et al. () and Maheswari () showed the usefulness of inclusion of Bands 1 and 8 in depth determinations in coastal waters as well.

Band 1 senses the radiation in the 400–450 nm wavelength interval. This band supports bathymetric studies by sensing the deeper parts of a water body compared to other sensors (Puetz et al. ). Band 5, on the other hand, acquires radiance data in the range of 630–690 nm. The light in this region of the electromagnetic spectrum is mainly absorbed by chlorophyll-a (Thiemann & Kaufmann ). As mentioned before, analysis of the data in this band revealed a log-normal distribution. This was in line with the distribution of measured chlorophyll-a concentrations in Lake Eymir. The radiance in Band 8 ($x_{i8}$) was another explanatory variable that was used in bathymetry model development for Lake Eymir. In various studies, the relationship between suspended particles and radiance in near-infrared band has been shown (Doxaran et al. ). In this study, the impact of suspended particles in depth determination was considered through inclusion of Band 8. An initial analysis of the distribution of suspended particle concentrations in Lake Eymir indicated a
normal distribution similar to that for the distribution of radiance values in Band 8. It must also be noted that it is possible to observe frequent algal blooms over the lake surface in patches. Moreover, macrophytes may cover the bottom, especially at shallower depths closer to shore. Therefore, reflectance in Band 8 can be impacted by these as well.

Preceding the determination of independent explanatory variables (that have no multicollinearity), data screening was performed. As mentioned earlier, the sampling points over re-suspension areas were removed from the data set in order to avoid the interference these areas would produce in depth predictions. Then, the remaining 48 sampling points were taken into consideration. The minimum, average and maximum water depths at these points were 2.50, 4.57 and 5.75 m, respectively. These values were 2.50, 4.56 and 5.75 m, respectively, for the full observation data set (59 observation points). Further data elimination was conducted based on SD_SR. This approach was used to improve the prediction capability of depth models. Application of remote sensing technology to Case 2 waters to make water quality predictions may be problematic compared to Case 1 waters due to the presence of water constituents that may significantly impact radiance values (Swardika ). It is very probable that local algal blooms, bottom sediments, suspended particles, and even waves can impact radiance values. Another difficulty is the heterogeneous distribution of these interferences which may lead to extreme values. By regarding extreme values as outliers, the impact of such interferences in model prediction performance may be improved at least at other locations in the lake that are less prone to such effects. As seen in Figure 1, removed sampling points form clusters in certain locations. It is possible that these locations were subject to the interferences mentioned. The minimum, average and maximum depths for 25 observation points used in the model development were 2.80, 4.70 and 5.70 m, respectively. Therefore, it can be said that deeper locations were considered as ground truth data for modelling purposes. It may be the case that deeper locations impacted less from bottom sediment re-suspension or bottom reflection.

The linear and non-linear regression models generated to determine the depths at different locations using remotely sensed data are given below in Equations (3) and (4), respectively:

$$d_i = 2.433 + 193.000\,x_{i1} - 1.313 \log x_{i5} - 108.886\,x_{i8} \tag{3}$$

$$d_i = 1140.027\,x_{i1}^{1.628} - 0.128\,(\log x_{i5})^{5.000} - 419.672\,x_{i8}^{1.378} \tag{4}$$

R², adjusted R², RMSE and F-value with respect to the calibration data set were 0.87, 0.78, 0.370 and 6.78 × 10⁻⁴, respectively, for the linear regression model at 95% confidence level. For the same data set, R², adjusted R², RMSE and F-value were 0.90, 0.83, 0.379 and 3.04 × 10⁻⁴, respectively, when the non-linear regression model was used. Performances of these models were also tested against the validation data. R², adjusted R², RMSE and F-value were 0.805, 0.760, 0.488 and 1.07 × 10⁻⁶, respectively, when the linear regression model was applied. The corresponding values for the non-linear model were 0.855, 0.822, 0.365 and 1.11 × 10⁻⁷, respectively, at 95% confidence level. In both models, the radiance values in Band 1 had the highest coefficient ($k_j$ in Equations (1) and (2)) compared to other bands, keeping in mind that the radiance in Band 5 ($x_{i5}$) was in logarithmic scale. This situation emphasized the importance of Band 1 in bathymetry determination. A similar observation was valid in the correlation matrix as well. Compared to other bands, depth had the highest r (0.351) for Band 1 radiance at 95% confidence level when 48 sampling points were considered. The coefficients for Bands 5 and 8 were negative which were indicative of the interference due to absorption based on the presence of suspended solids and algal species. It must also be noted that another model based on the ratio method proposed by Stumpf et al. () was tested. The ratio of $\ln(x_{i5})/\ln(x_{i1})$ was used. This ratio had the highest correlation with depth (R² = 0.51) compared to other combinations. The model obtained for this ratio, $d_i = 15.652\,(\ln x_{i5}/\ln x_{i1}) - 11.886$, resulted in no better performance. R² and RMSE were 0.46 and 0.529 at 95% confidence level.

Measured versus predicted depths are depicted in Figure 3. As can be seen, both models were successful in predicting low and high depths in Lake Eymir. However, the statistical analysis given before stated that the non-linear model was slightly better in depth predictions. For the
validation data set, the average error was calculated as 0.30 and 0.2 m for the linear and non-linear regression models, respectively. The average depth for the validation data set (observations) was 4.73 m. The predicted average depths were 4.65 and 4.71 m for the linear and non-linear regression models, respectively. These correspond to 2 and 0.5% error in the predicted average depths, respectively. Therefore, models developed using screened ground truth data were successful in predicting the average depth. When the models were applied to predict the depths at 48 sampling points, the average error in depth predictions was 0.61 m in the linear regression model and 0.55 m in the non-linear regression model. The errors in the calculated average depths were 13 and 12%, respectively, for the linear and non-linear models.

Figure 3 | Predicted versus measured depths for linear and non-linear depth models.

Figure 4 | The bathymetric map of Lake Eymir generated by the linear regression depth model.

Figure 5 | The bathymetric map of Lake Eymir generated by the non-linear regression depth model.

The bathymetric maps of Eymir Lake that are generated using Equations (3) and (4) are depicted in Figures 4 and 5, respectively. Both models simulated the shallow depths at shores with success. The increasing depths from the shoreline can be clearly seen for both models. Tureli & Norman () studied the bathymetry of the lake using sonar technology. According to that study, the lake bottom had a bowl-type structure with steep slopes at shores. As a result, sharp increases were observed in depths progressing away from the shore to the inner regions of the lake. The mid-region of the lake was the deepest location with an average depth of 5.5 m in 1985. They also stated that the lake became relatively shallow at the southern and eastern parts, which correspond to the inlet and outlet of the lake, respectively. The findings of Tureli & Norman () are consistent with the results of this study. As Lake Eymir has a valley-type structure, a sharp increase is expected in depth in short distances away from the shore. This is captured by the depth models (Figures 4 and 5). Moreover, the southern and eastern parts of the lake are shallower than the other parts. The deeper regions of the lake are shown by darker shades in Figures 4 and 5. In general, the
distributions of relatively lower and higher depths were in line with the observations. However, the depths at re-suspension areas were in error. This could be seen especially at the southern part of the lake closer to the inlet. At these locations mixed values were observed. Overall, although both models predicted the depths well, the non-linear model was better in predicting the shallower depths at shores. However, the non-linear model was more sensitive to the impact of re-suspension areas.

CONCLUSIONS

The results of this study showed that a WorldView-2 image can be used to predict the depths in a eutrophic lake. Bands 1, 5 and 8 of the WorldView-2 satellite were adequate to determine the depth distribution. Among these bands, Band 1 (coastal blue band) made the highest contribution in determination of the depths in the eutrophic lake.

The presence of turbidity due to re-suspension areas caused interference in predicting the depths. However, eliminating these areas in the depth model development helped to make good depth estimates at locations where the impact of turbidity was less. More study is required to deal with this issue and improve the prediction capability of the models at these locations as well. For the existing situation, regression models were successful in defining the shallow depths at shore and close to the inlet and outlet of the lake. Moreover, deeper locations were successfully identified.

Bathymetry determination using WorldView-2 can aid in water quality studies. Use of remotely sensed data may provide an alternative in determination of the distribution of depths and examination of the water quality in lakes with respect to these depths. The scale advantage supplied by remote sensing over traditional bathymetry generation methods may make it preferable for large lakes. However, more research is needed to investigate the effects of spatially and temporally heterogeneous bottom characteristics (i.e., variable coverage by macrophytes, different bottom materials) on reflectance values in determination of depths in a eutrophic lake.

ACKNOWLEDGEMENTS

The authors are grateful to the Scientific and Technical Research Council of Turkey (TUBITAK) for providing financial support for this study (Project Number: CAYDAG-106Y201). The authors acknowledge Res. Asst. Elif Kucuk for her support during field work.
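The iterative multicollinearity screening described under METHODOLOGY can be illustrated with a short Python sketch. It is not the authors' implementation: the function names and the synthetic band values are hypothetical, and the rule used to decide which member of the most correlated pair is dropped (the one with the larger summed |r| against all remaining variables) is an assumption, since the text specifies only the |r| > 0.6 criterion and the rebuilding of the correlation matrix after each removal.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def drop_collinear(columns, threshold=0.6):
    """Remove variables one at a time until no pair has |r| > threshold.

    columns: dict mapping a variable name to its list of values.
    Returns the names of the variables that remain.
    """
    kept = dict(columns)
    while len(kept) > 1:
        # Rebuild the correlation "matrix" for the remaining variables.
        pairs = {
            (a, b): abs(pearson_r(kept[a], kept[b]))
            for a, b in combinations(kept, 2)
        }
        (a, b), r_max = max(pairs.items(), key=lambda item: item[1])
        if r_max <= threshold:
            break  # multicollinearity eliminated

        # Assumption: drop the member of the worst pair that is, overall,
        # more strongly correlated with the other variables.
        def load(name):
            return sum(r for pair, r in pairs.items() if name in pair)

        del kept[a if load(a) >= load(b) else b]
    return list(kept)


# Hypothetical radiance-like values for three bands (not data from the study).
bands = {
    "band1": [1, 2, 3, 4, 5, 6],
    "band2": [2, 4, 6, 8, 10, 12.5],   # nearly proportional to band1
    "band8": [1, -1, 1, -1, 1, -1],    # weakly correlated with the others
}
remaining = drop_collinear(bands, threshold=0.6)
```

In this toy data set one of the two nearly proportional bands is removed and the weakly correlated band is retained, which mirrors the way the study ended up with only Band 1, Band 8 and the log-transformed Band 5 as explanatory variables.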
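The outlier elimination based on standardized residuals (SR) can be sketched in the same hedged way. The example below is a simplified illustration under stated assumptions, not the procedure as run in XLStat: it uses a single predictor instead of the three bands of the study, it treats the criterion as two-sided (|SR| > n × SD_SR), and the data and function names are hypothetical.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares for y = a + k * x; returns (a, k)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    k = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - k * mx, k


def standardized_residuals(x, y):
    """Residuals divided by the residual standard error (n - 2 d.o.f.)."""
    a, k = fit_line(x, y)
    res = [yi - (a + k * xi) for xi, yi in zip(x, y)]
    s = sqrt(sum(r * r for r in res) / (len(x) - 2))
    return [r / s for r in res]


def sr_filter(x, y, n):
    """Indices of the observations kept when |SR| <= n * SD_SR."""
    sr = standardized_residuals(x, y)
    mean_sr = sum(sr) / len(sr)
    sd_sr = sqrt(sum((v - mean_sr) ** 2 for v in sr) / (len(sr) - 1))
    return [i for i, v in enumerate(sr) if abs(v) <= n * sd_sr]


def sweep(x, y):
    """Filter and refit for n = 1.5, 1.4, ..., 0.5.

    Returns {n: (number of points kept, intercept, slope)}.
    """
    results = {}
    for step in range(11):
        n = round(1.5 - 0.1 * step, 1)
        keep = sr_filter(x, y, n)
        a, k = fit_line([x[i] for i in keep], [y[i] for i in keep])
        results[n] = (len(keep), a, k)
    return results


# Hypothetical observations: y = 2x + 1 with one disturbed point at index 4.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [3, 5, 7, 9, 16, 13, 15, 17]
kept = sr_filter(x, y, 1.5)
models = sweep(x, y)
```

Each entry of `models` could then be compared on basic depth statistics and an F-test, as the text describes, to choose the multiplier n; that comparison is left out of the sketch.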
REFERENCES Bachmann, C. M., Ainsworth, T. L., Fusina, R. A., Montes, M. J., Bowles, J. H., Korwan, D. R. & Gillis, D. B. Bathymetric retrieval from hyperspectral imagery using manifold coordinate representations. IEEE Trans. Geosci. Remote Sens. 47, 884–897. Bachmann, C. M., Montes, M. J., Fusina, R. A., Parrish, C., Sellars, J., Weidemann, A., Goode, W., Nichols, C. R., Woodward, P., Mcilhany, K., Hill, V., Zimmerman, R., Korwan, D., Truitt, B. & Scwartzschild, A. Very shallow water bathymetry retrieval from hyperspectral imagery at the Virginia Coast Reserve (VCR’07) multi-sensor campaign. In: Proceedings of 2008 IEEE International Geoscience and Remote Sensing Symposium, 6–11 July 2008, Boston, MA, pp. 125–128. Beisl, U., Telaar, J. & Schonermark, M. V. Atmospheric correction, reflectance calibration and BRDF correction for ADS40 image data. In: Proceedings of 2008 ISPRS Congress, Commission Papers XXXVIII. 3–11 July 2008, Beijing, China, pp. 7–12. Beklioglu, M., Ince, O. & Tuzun, I. Restoration of eutrophic Lake Eymir, Turkey, by biomanipulation undertaken after a major external nutrient control I. Hydrobiologia 489, 93–105. Calkoen, C. J., Hesselmans, G. H. F. M., Wensink, G. J. & Vogelzang, J. The bathymetry assessment system: efficient depth mapping in shallow seas using radar images. Int. J. Remote Sens. 22, 2973–2998. Chavez, P. S. Atmospheric, solar, and MTF corrections for ERTS digital imagery. In: Proceedings of the American Society of Photogrammetry. Fall technical meeting, Phoenix, AZ, p. 69. Dierssen, H. M., Zimmerman, R. C., Leathers, R. A., Downes, T. V. & Davis, C. O. Ocean color remote sensing of seagrass and bathymetry in the Bahamas Banks by high resolution airborne imagery. Limnol. Oceanogr. 48, 444–455. Digital Globe 8-band multispectral imagery. Available at: www.digitalglobe.com/index.php/48/Products? product_id=27 (accessed 9 July 2011). Diker, Z. A Hydrobiological and Ecological Study in Lake Eymir. 
MS Thesis, Middle East Technical University, Ankara, Turkey. Doxaran, D., Froidefond, J. M., Lavender, S. & Castaing, P. Spectral signature of highly turbid waters: application with SPOT data to quantify suspended particulate matter concentrations. Remote Sens. Environ. 81, 149–161. Glass, A. L., Walker, B., Peters, M. & Dykes, L. Improving the usability of high resolution imagery for tropical areas: deglinting, de-hazing and calibration of very high resolution satellite imagery. In: Proceedings of Map Asia 2010 & ISG 2010. 26–28 July 2010, Kuala Lumpur, Malaysia. Available at: www.mapasia.org/2010/proceeding/pdf/lisa.pdf (accessed 10 June 2011). Greidanus, H., Calkoen, C., Hennings, I., Romeiser, R., Vogelzang, J. & Wensink, G. J. Intercomparison and validation of
bathymetry radar imaging models. In: Proceedings of 1997 IEEE International Geoscience and Remote Sensing Symposium. 3–8 August 1997, Singapore, pp. 1320–1322. Hennings, I. A historical overview of radar imagery of sea bottom topography. Int. J. Remote Sens. 19, 1447–1454. Jordan, D. & Fonstad, M. Two dimensional mapping of river bathymetry and power using aerial photography and GIS on the Brazos River, Texas. Geocarto Int. 20, 13–20. Kao, H. M., Ren, H., Lee, C. S., Chang, C. P., Yen, J. Y. & Lin, T. H. Determination of shallow water depth using optical satellite images. Int. J. Remote Sens. 30, 6241–6260. Kishino, M., Tanaka, A. & Ishizaka, J. Retrieval of chlorophyll a, suspended solids, and colored dissolved organic matter in Tokyo Bay using ASTER data. Remote Sens. Environ. 99, 66–74. Lafon, V., Froidefond, J. M., Lahet, F. & Castaing, P. SPOT shallow water bathymetry of a moderately turbid tidal inlet based on field measurements. Remote Sens. Environ. 81, 136–148. Lee, S. R. A coarse-to-fine approach for remote-sensing image registration based on a local method. Int. J. Smart Sens. Intell. Systems 3, 690–702. Lee, K. R., Kim, A. M., Olsen, R. C. & Kruse, F. A. Using WorldView-2 to determine bottom-type and bathymetry. In: Proceedings of the SPIE 8030 (Ocean Sensing and Monitoring III). Available at: spiedigitallibrary.org/ proceedings/resource/2/psisdg/8030/1/80300D_1 (accessed 10 June 2011). Leira, M. & Cantonati, M. Effects of water-level fluctuations on lakes: an annotated bibliography. Hydrobiologia 613, 171–184. Loomis, M. J. Depth Derivation from the WorldView-2 Satellite using Hyperspectral Imagery. MS Thesis, Naval Postgraduate School, Monterey, CA, USA. Lyzenga, D. R., Malinas, N. P. & Tanis, F. J. Multispectral bathymetry using a simple physically based algorithm. IEEE Trans. Geosci. Remote Sens. 44, 2251–2259. Maheswari, R. M. WorldView-2 (WV-2) coastal, yellow, rededge, NIR-2 in underwater habitat mapping. 
Available at: dgl.us.neolane.net/res/img/7a827acbb24ab9cdd85da7b64 d0f9259.pdf (accessed 11 November 2011). Marchisio, G., Pacifici, F. & Padwick, C. On the relative predictive value of the new spectral bands in the Worldview-2 satellite. In: Proceedings of the 2010 International Geoscience and Remote Sensing Symposium. 25–30 July 2010, Honolulu, Hawaii, pp. 2723–2726. Mccarthy, B. L., Olsen, R. C. & Kim, A. M. Creation of bathymetric maps using satellite imagery. In: Proceedings of the SPIE 8030 (Ocean Sensing and Monitoring III). 26–27 April 2011, Orlando, FL, 80300C. McIntyre, M. L., Naar, D. F., Carder, K. L., Donahue, B. T. & Mallinson, D. J. Coastal bathymetry from hyperspectral remote sensing data: comparisons with high resolution multibeam bathymetry. Mar. Geophys. Res. 27, 128–136.
Mobley, C. D., Sundman, L. K., Davis, C. O., Bowles, J. H., Downes, T. V., Leathers, R., Montes, M. J., Bissett, W. P., Kohler, D. D., Reid, R. P., Louchard, E. M. & Gleason, A. Interpretation of hyperspectral remote-sensing imagery by spectrum matching and look-up tables. Appl. Opt. 44, 3576–3592. Morgan, L. A., Shanks, W. A., Lovalvo, D. A., Jhonson, S. Y., Stephenson, W. J., Pierce, K. L., Harlan, S. S., Finn, C. A., Lee, G., Webring, M., Shulze, B., Duhn, J., Sweeney, R. & Balistrieri, L. Exploration and discovery in Yellowstone Lake: results from high-resolution sonar imaging, seismic reflection profiling, and submersible studies. J. Volcanol. Geoterm. Res. 122, 221–242. Ozen, A. Role of Hydrology, Nutrients and Fish Predation in Determining the Ecology of a System of Shallow Lakes. MS Thesis, Middle East Technical University, Ankara, Turkey. Parthish, D., Gopinath, G. & Ramakrishnan, S. S. Coastal bathymetry by coastal blue. Available at: www.dgl.us. neolane.net/res/img/db5653880b1d7abc6fd5c393de7c909d. pdf (accessed 11 November 2011). Philpot, W. D. Bathymetric mapping with passive multispectral imagery. Appl. Opt. 28, 1569–1578. Puetz, A. M., Lee, K. & Olsen, R. C. WorldView-2 data simulation and analysis results. In: Proceedings of the SPIE 7334 (algorithms and technologies for multispectral, hyperspectral, and ultraspectral imagery XV). 73340U. Available at: proceedings.spiedigitallibrary.org/proceeding. aspx?articleid=778667 (accessed 11 November 2011). Robbins, B. Quantifying temporal change in seagrass areal coverage: the use of GIS and low resolution aerial photography. Aquat. Bot. 58, 259–267. Roberts, A. C. B. Shallow water bathymetry using integrated airborne multi-spectral remote sensing. Int. J. Remote Sens. 20, 497–510. Sandidge, J. & Holyer, R. Coastal bathymetry from hyperspectral observations of water radiance. Remote Sens. Environ. 65, 341–352.
Stevens, J. P. Outliers and influential data points in regression analysis. Psychol. Bull. 95, 334–344. Stumpf, R. P., Holderied, K. & Sinclair, M. Determination of water depth with high-resolution satellite imagery over variable bottom types. Limnol. Oceanogr. 48, 547–556. Sudheer, K. P., Chaubey, I. & Garg, V. Lake water quality assessment from Landsat thematic mapper data using neural network: an approach to optimal band combination selection. J. Am. Water Resour. Assoc. 42, 1683–1695. Swardika, I. K. Bio-optical characteristic of case-2 coastal water substances in Indonesia coast. Int. J. Remote Sens. Earth Sci. 4, 64–84. Tan, C. O. The Roles of Hydrology and Nutrients in Alternative Equilibrium of Two Shallow Lakes of Anatolia, Lake Eymir and Lake Mogan: Using Monitoring and Modeling Approaches. MS Thesis, Middle East Technical University, Ankara, Turkey. Thiemann, S. & Kaufmann, H. Determination of chlorophyll content and trophic state of lakes using field spectrometer and IRS-1C satellite data in the Mecklenburg Lake district, Germany. Remote Sens. Environ. 73, 227–235. Tureli, K. & Norman, T. Ankara güneyindeki Eymir Gölü’nün batimetresi ve taban sedimanları (The bathymetry and bottom sediments of Lake Eymir located in south of Ankara). Geol. Bull. Turk. 35, 91–99. Yagbasan, O. & Yazicigil, H. Sustainable management of Mogan and Eymir Lakes in central Turkey. Environ. Geol. 56, 1029–1040. Yenilmez, F., Keskin, F. & Aksoy, A. Water quality trend analysis in Lake Eymir, Ankara. Phys. Chem. Earth. 36, 135–140. Yuzugullu, O. Determination of Chlorophyll-a Distribution in Lake Eymir Using Regression and Artificial Neural Network Models with Hybrid Inputs. MS Thesis, Middle East Technical University, Ankara, Turkey.
First received 1 August 2012; accepted in revised form 9 May 2013. Available online 6 June 2013
© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014
Multi-site evaluation to reduce parameter uncertainty in a conceptual hydrological modeling within the GLUE framework

Kairong Lin, Pan Liu, Yanhu He and Shenglian Guo
ABSTRACT

Reducing the uncertainty of hydrological modeling and forecasting has both theoretical and practical importance in hydrological sciences and water resources management. This study focuses on reducing parameter uncertainty through multi-site validation of the conceptual Xinanjiang model. The generalized likelihood uncertainty estimation (GLUE) method was used to conduct the uncertainty analysis with Shuffled Complex Evolution Metropolis (SCEM-UA) sampling. A discharge criterion at the interior gauge stations was added to select the behavioral parameter sets, and two comparable schemes were established to illustrate how well the uncertainty can be reduced by considering the flow observations at the interior sites. The Dongwan watershed, a sub-basin of the Yellow River basin in China, was selected as the case study. The results showed that, when a threshold value was set at the interior sites, the number and standard deviation of behavioral parameter sets decreased, and the runoff series simulated by the Xinanjiang model with the behavioral parameter sets fitted the observed runoff series better. In addition, considering the interior sites' flow information allows one to derive more reasonable prediction bounds and to reduce the uncertainty in hydrological modeling and forecasting to some degree.

Key words | GLUE, multi-site evaluation, parameter uncertainty, SCEM-UA, Xinanjiang model
Kairong Lin (corresponding author), Yanhu He
Guangdong Key Laboratory for Urbanization and Geo-simulation, School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275, China
E-mail: linkr@mail.sysu.edu.cn

Kairong Lin, Yanhu He
Key Laboratory of Water Cycle and Water Security in Southern China of Guangdong High Education Institute, Sun Yat-sen University, Guangzhou 510275, China

Pan Liu, Shenglian Guo
State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan 430072, China
doi: 10.2166/hydro.2013.204

INTRODUCTION

Hydrological models have been accepted as effective tools for describing the dynamic relations between hydrological processes, meteorological behaviors, land use and land cover, and changes of vegetation coverage within a watershed, providing theoretical and practical support for river basin management (Wagener & Gupta; Hejazi et al.). The hydrological system is complicated by climate changes such as atmospheric circulation, precipitation, and air temperature, and by the underlying surface properties such as the geological conditions, vegetation and soil conditions (Lin et al.). As a result, the complexity of the hydrological system poses great challenges for hydrological modeling practice, and uncertainty analysis is still an important issue for hydrological modeling and forecasting.

The treatment of uncertainty has been accompanied by an explosion of methods devoted to deriving meaningful uncertainty bounds for hydrological model predictions (e.g., Beven & Freer; Thiemann et al.; Vrugt et al.; Moradkhani et al.; Ajami et al.; Benke et al.; Vrugt & Robinson; Li et al.; Mousavi et al.). Prediction in ungauged basins (PUB) is an initiative that emerged out of discussions among International Association of Hydrological Sciences (IAHS) members on the world-wide web and during a series of IAHS sponsored meetings in Maastricht (July 18–27, 2001), Kofu (March 28–29, 2002), and Brasilia (November 20–22, 2002) about the need to reduce the predictive uncertainty in hydrological science and practice (Sivapalan et al.). Indeed, the final aim of studying uncertainty is to find the ways and
measures to reduce the uncertainty in hydrological modeling and forecasting, so as to increase the accuracy and reliability of hydrological forecasting.

One of the efficient ways of reducing uncertainty is to use new and all available information (Beven & Binley). For example, Goodman pointed out that the statistical methods that lend themselves to correct quantification of the uncertainty were also effective for combining different sources of information, and concluded that one way to reduce uncertainty was to use all the available data. Freer et al.'s research showed that further constraining of the model responses using the fuzzy water table elevations at both locations considerably reduced the number of behavioral parameter sets. Uhlenbrook & Sieber also pointed out that the potential restriction of the uncertainty clearly depended on the goodness of the simulation of the additional data set. Gallart et al. used conditioning on water table records and the distribution of parameters obtained from point observations to reduce the uncertainty of predictions for both streamflow and groundwater contribution. Maschio et al. dealt with uncertainty mitigation by using observed data, integrating the uncertainty analysis and the history-matching processes. The main characteristic of their study was the use of observed data as constraints to reduce the uncertainty of the reservoir parameters. Lumbroso & Gaume used the analysis of various types of data that can be collected during post-event surveys and consistency checks to reduce the uncertainty in indirect discharge estimates.

In fact, interior hydrological information has been used to improve the performance of hydrological models in many studies. Gupta et al. proposed the use of multiple and non-commensurable measures of information to improve the calibration of hydrologic models. Thereafter, many studies have shown that interior hydrological information helps to improve hydrological modeling to some degree for both conceptual and distributed models (e.g., Krysanova et al.; Andersen et al.; Moussa et al.; Das et al.; Feyen et al.). There could be more uncertainty if only the error at the outlet is considered, and this uncertainty can be considerably reduced by using more available information, such as the interior sites' flow information. Therefore, based on the idea of inputting more available useful information for evaluation to gain less uncertainty, the objective of this study is to reduce parameter uncertainty by using multi-site evaluation in the performance of the Xinanjiang model, based on the generalized likelihood uncertainty estimation (GLUE) method with the Shuffled Complex Evolution Metropolis (SCEM-UA) sampling algorithm. Utilization of the multi-site evaluation may be of theoretical and practical merit in obtaining some insight into the causes behind hydrological modeling uncertainty, one of the crucial but tough problems in hydrological modeling practice. The rest of this paper is organized as follows: the section below briefly describes the uncertainty estimation schemes and the Xinanjiang model; then, in the next section, we introduce the study area and associated hydrological data; results are discussed and analyzed in the section after that; finally, the last section contains the major conclusions.

METHODOLOGY

Uncertainty estimation technique

The GLUE method proposed by Beven & Binley to estimate parameter uncertainty has been widely used in many complex and nonlinear models. The GLUE method is devoted to the investigation of hydrological modeling uncertainty by producing the prediction limits for the modeled streamflow series and a set of behavioral parameters (e.g., Freer et al.; Beven & Freer; Blazkova & Beven; Montanari; McMichael et al.; Jin et al.; Ng et al.). The popularity of the GLUE method is probably best explained by its conceptual simplicity, relative ease of implementation, and ability to handle different error structures and models without major modifications to the method itself.

The SCEM-UA algorithm (Vrugt et al.) can be used to improve the efficiency of GLUE, which has a heavy computational burden. The SCEM-UA algorithm is an adaptive Markov Chain Monte Carlo (MCMC) sampler, which has good ability to infer the posterior probability distribution of hydrologic model parameters (e.g., Gong; Blasone & Vrugt; McMillan & Clark; Dotto et al.; Xu et al.). Due to the merits of the SCEM sampling and
the GLUE method, these two methods can be combined. For example, the initial range of parameter samples can be wide without necessarily increasing computational requirements (Dotto et al.). Blasone & Vrugt compared the performance of the informal likelihoods in the SCEM-UA algorithm with the GLUE method and demonstrated that the targeted sampling resulted in better predictions of the model output (and that the uncertainty limits were less sensitive to the number of retained solutions).

Figure 1 | Flowchart of GLUE method with SCEM-UA sampling algorithm.

Therefore, the GLUE method with the SCEM-UA algorithm was adopted for uncertainty analysis in our study. Two schemes were established by using the GLUE method with the SCEM-UA sampling algorithm, to study how well parameter uncertainty can be reduced by considering the observations of the interior sites' flow information in an alternative strategy. It is notable that the proposed idea of using the interior sites' information is not limited to the GLUE or MCMC methods. The flowcharts of these two schemes are shown in Figure 1. Scheme I sets the threshold of the likelihood measure only at the outlet, and scheme II sets the threshold of the likelihood measure at both the outlet and interior sites. First, in this study, the Nash–Sutcliffe efficiency index (NE) (Nash & Sutcliffe) is selected as the likelihood measure, which is defined as:

$$\mathrm{NE} = 1.0 - \frac{\sum_{i=1}^{n}\left[Q_{obs}(i) - Q_{sim}(i)\right]^{2}}{\sum_{i=1}^{n}\left[Q_{obs}(i) - \bar{Q}_{obs}\right]^{2}} \tag{1}$$
where $Q_{obs}(i)$, $Q_{sim}(i)$, and $\bar{Q}_{obs}$ denote the observed runoff, the simulated runoff and the mean value of the observed runoff series, respectively, and $n$ is the length of the observed data series. Second, instead of the Monte Carlo method, the SCEM-UA algorithm was used to generate a sample of parameter sets. In this study, the SCEM-UA algorithm produces NE-dependent samples before setting a threshold, so the simulation associated with each of the parameter sets has equal weight. After that, a threshold value of the likelihood measure is decided and the behavioral parameter sets whose likelihood values are greater than the threshold are chosen. Then the discharge predictions from the behavioral parameter sets are ranked in order of magnitude and weighted by the likelihood weight associated with each behavioral parameter set, which is defined as:

$$W(i) = \frac{L(\theta_i)}{\sum_{i=1}^{n} L(\theta_i)} \tag{2}$$

where $W(i)$ and $L(\theta_i)$ are the likelihood weight and the likelihood measure value associated with behavioral parameter set $\theta_i$, respectively, and $n$ is the number of behavioral parameter sets. Finally, a cumulative probability distribution for the ranked discharge predictions is obtained by Equation (3):

$$P(Q \le Q_i) = \frac{\sum_{j=1}^{i} W(j)}{\sum_{j=1}^{n} W(j)} \tag{3}$$

where $Q$ represents discharge, $Q_i$ is the discharge prediction ranked at the $i$th place, and $n$ has the same meaning as in Equation (2).

According to the cumulative probability distribution, an uncertainty bound can be obtained for a given certainty level.

In this study, three indices were adopted to evaluate the uncertainty interval. One is the containing ratio (CR), which is defined as the ratio of the number of observations falling within their respective uncertainty intervals to the total number of observations (Beven & Binley; Montanari; Xiong & O'Connor; Lin et al.). The formula is:

$$\mathrm{CR} = \frac{\sum_{i=1}^{n} J[Q_{obs}(i)]}{n} \tag{4}$$

where

$$J[Q_{obs}(i)] = \begin{cases} 1, & Q_{low}(i) < Q_{obs}(i) < Q_{up}(i) \\ 0, & \text{otherwise} \end{cases} \tag{5}$$

The confidence interval of discharge at each time step is the major result of the GLUE method in terms of evaluating hydrological modeling uncertainty. Interval width (IW) is usually adopted as one of the major indices to evaluate the uncertainty interval, but it depends on the magnitude of discharge, which makes it impossible to compare across basins. In this study, a relative interval width (RIW) is used, which is defined by the following equation:

$$\mathrm{RIW} = \frac{\sum_{i=1}^{n}\left[Q_{up}(i) - Q_{low}(i)\right]}{n\,\bar{Q}_{obs}} \tag{6}$$

where $Q_{low}(i)$ and $Q_{up}(i)$ denote the lower and the upper uncertainty bounds at time $i$, respectively, and $\bar{Q}_{obs}$ has the same meaning as in Equation (1).

The Nash–Sutcliffe efficiency index of the median values MQ0.5, NE(MQ0.5), is also used as an evaluation index to judge whether or not the median values MQ0.5 and the uncertainty intervals are effective crisp simulations of the observed total flow.

Xinanjiang conceptual model

The Xinanjiang model, developed in 1973 and published in 1980 (Zhao et al.), is a conceptual hydrological model that has been widely used in China. Its main feature is the concept of runoff formation on repletion of storage, which denotes that runoff is not produced until the soil moisture content of the aeration zone reaches field capacity, and thereafter runoff equals the rainfall excess without further loss (Zhao & Liu). Based on the
concept of runoff formation on repletion of storage, the total runoff, R, of the basin is calculated by using a soil moisture storage capacity distribution curve in the Xinanjiang model. In the early version of the Xinanjiang model, the total runoff, R, is then separated into only two components, i.e., the surface runoff and the groundwater runoff (e.g., Zhao et al.). In the subsequent application of the Xinanjiang model, the runoff, R, is separated into three components, i.e., surface runoff (RS), groundwater runoff (RG), and interflow (RI), with the aim of simulating the real runoff processes more correctly (Zhao & Liu), and this version of the Xinanjiang model is used in this study. The model consists of four major parts (Figure 2): evapotranspiration, runoff production, runoff separation, and flow routing. There are 15 parameters when using the Muskingum method for flow routing, which may be grouped as follows: evapotranspiration parameters KE, X, Y, C; runoff production parameters WM, B, IMP; runoff separation parameters SM, EX, KI, KG; and runoff concentration parameters CI, CG, N, NK, XE, K. The meanings of the model parameters are listed in Table 1.

Figure 2 | Flowchart of the Xinanjiang model.

STUDY REGIONS AND DATA

River basins

The Dongwan watershed was selected as the case study. It is a sub-watershed of the Yellow River basin located in Henan Province, China, at longitude 111°23′ to 112°51′ and latitude 33°51′ to 34°37′ (Figure 3). It drains an area of 2,623 km², rising in Mount Funiu in the Qinling Mountains. Vegetation cover of the watershed is good and soil erosion is not serious. The Dongwan watershed belongs to a monsoon climate area and its rainfall varies greatly between seasons. The inter-annual variation of precipitation is very large, and climatic tendencies produce the highest flooding in the period July to August. The mean annual precipitation and runoff are 791 and 276 mm, respectively. Figure 3 shows eight rainfall gauge stations and three hydrological gauge stations (Luanchuan, Tantou, and Dongwan) located in the Dongwan watershed. The data selected for modeling are hourly rainfall and discharges over the same period of 1 June to 30 October in six consecutive years from 1993 to 1998. In this study,
the data from 1993 to 1996 are selected as the calibration period, and the data from 1997 to 1998 are selected as the validation period.

As shown in Figure 3, eight land cover types were identified in the Dongwan watershed, among which there were three main kinds of land cover: woodland, cropland and wooded grassland, with slightly different subdivisions. The Dongwan watershed consists of three main soil types: Calcaric Cambisols (CMc), Eutric Cambisols (CMe), and Calcic Luvisols (LVk), all of which are almost evenly distributed in the watershed. Therefore, the parameters are considered as homogeneous over the whole basin in this study.

Extraction of the digital river network and sub-basin based on digital elevation model (DEM)

Based on DEM data with a map scale of 1:250,000, the digital river network, sub-watersheds, and topological relations of the study area are extracted automatically by using Arc Hydro Tools, including the related hydrological topography features, such as the area, river length, and gradient. In this study, the Dongwan watershed was divided into four sub-watersheds by three hydrological gauge stations (Luanchuan, Tantou, and Dongwan), with areas of 340, 729, 626, and 928 km², respectively (Figure 3).

Model parameter ranges

Based on previous studies of the Xinanjiang model (Zhao et al.; Zhao; Zhao & Liu) and the characteristics of the Dongwan watershed, such as climate, land use, land cover, and vegetation and soil conditions, the prior ranges of the Xinanjiang model parameters in this study were determined and are listed in Table 1. In detail, the value of the ratio of the impervious to the total area of the basin (IMP) is taken as 0.01 because the study area is a natural basin. The parameters of the Muskingum method, XE and K, are estimated by the trial and error method using the observed discharge, and are equal to 0.45 and 5 h, respectively. Thus, 14 parameters were selected for the uncertainty analysis.

Table 1 | Parameters of the Xinanjiang model and their prior ranges

| Parameter | Range | Description |
|---|---|---|
| WM (mm) | 100–200 | Areal soil moisture storage capacity |
| X | 0.05–0.2 | Proportion of soil moisture storage capacity of the upper layer (WUM) to WM |
| Y | 0.4–0.7 | Proportion of soil moisture storage capacity of the lower layer (WLM) to (1 − X)*WM |
| KE | 0.8–1.5 | Ratio of potential evapotranspiration to pan evaporation |
| B | 0.1–0.4 | The exponent of the soil moisture storage capacity curve |
| SM (mm) | 10–50 | Areal mean free water capacity of the surface soil layer |
| EX | 1–1.5 | The exponent of the free water capacity curve |
| KI | 0.1–0.3 | The outflow coefficient of the free water storage to interflow |
| KG | 0.1–0.4 | The outflow coefficient of the free water storage to groundwater |
| IMP | 0.01 | The ratio of the impervious to the total area of the basin |
| C | 0.08–0.18 | The coefficient of deep evapotranspiration |
| CI | 0.9–0.93 | The recession constant of the lower interflow storage |
| CG | 0.997 | The recession constant of groundwater storage |
| N | 1–5 | Number of reservoirs in the instantaneous unit hydrograph |
| NK | 4–10 | Common storage coefficient in the instantaneous unit hydrograph |
| XE | 0.45 | The weighting factor of the Muskingum method |
| K (h) | 5 | The storage time constant of the Muskingum method |

RESULTS

Comparison of the behavioral parameter sets

To assess the impact of using the interior sites' flow information on the uncertainty of hydrological modeling, this study considered 12 scenarios (as shown in Table 2) by taking the threshold values of the Nash–Sutcliffe efficiency index (NE-outlet) at the outlet (Dongwan station) as 50,
60, and 70%, either without setting a threshold value at the interior sites or with the threshold value at different interior sites (NE-interior site) set to 70%. The Xinanjiang model was used to perform the hydrological modeling, and the GLUE method with the SCEM-UA sampling algorithm was adopted for the uncertainty analysis. The total number of behavioral parameter sets of the above 12 scenarios is listed in Table 2, which shows that the number of behavioral parameter sets decreased when setting the threshold value at an interior site under all the threshold values at the outlet, especially when setting the threshold value at all interior sites, i.e., Luanchuan and Tantou. Figure 4 shows the scatter map between the Nash–Sutcliffe efficiency indices at the outlet and interior sites when the threshold of the Nash–Sutcliffe efficiency index at the outlet is 50%. Although Figure 4 does not show a direct relationship, it can be seen that the Nash–Sutcliffe efficiency index at the outlet is sensitive to that at the interior sites: the greater the index at the interior sites, the easier it is to obtain a greater value at the outlet.

Figure 3 | Location, digital river network, and land cover of the Dongwan basin.

Table 2 | Comparison of the number of behavioral parameter sets under different scenarios

| Threshold of NE-Outlet | Scheme I | NE-Interior site threshold 70%: Luanchuan | NE-Interior site threshold 70%: Tantou | NE-Interior site threshold 70%: Luanchuan and Tantou |
|---|---|---|---|---|
| 50% | 4,927 | 3,513 | 4,273 | 3,225 |
| 60% | 4,872 | 3,507 | 4,270 | 3,218 |
| 70% | 4,645 | 3,448 | 4,200 | 3,184 |

Scheme I represents the scenario without setting the threshold value at the interior site; NE-Outlet and NE-Interior site are the Nash–Sutcliffe efficiency indices at the outlet and interior sites, respectively.

For further analysis of the difference in behavioral parameter sets among different threshold values at the interior sites, two schemes were selected from the above 12 scenarios. Scheme I sets the threshold of likelihood measure only at the outlet as 70% (NE = 70%), and
Figure 4 | The scatter map between the Nash–Sutcliffe efficiency indices at the outlet and interior stations under threshold of the Nash–Sutcliffe efficiency index at the outlet as 50%.
scheme II sets the threshold of likelihood measure at both the outlet and interior sites (Dongwan station, Luanchuan and Tantou stations) as 70% (NE1 = NE2 = NE3 = 70%). Table 3 lists part of the behavioral parameter sets and associated likelihood measure values in scheme I. As shown in Table 3, the parameter sets based only on the runoff at the outlet do not always produce high likelihood measure values at the interior sites. Typically, some values were even smaller than 50% (the shaded numbers in Table 3). That is, many unreasonable behavioral parameter sets were obtained by using scheme I. This indicates that some unreasonable parameter sets can be removed by setting
Table 3 | Part of the behavioral parameter sets obtained by scheme I

| WM | X | Y | KE | B | SM | EX | KI | KG | C | CI | N | NK | NE-LC/% | NE-TT/% | NE-Outlet/% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 114.03 | 0.06 | 0.63 | 1.09 | 0.14 | 17.34 | 1.47 | 0.24 | 0.32 | 0.15 | 0.93 | 1.07 | 7.35 | 43.90* | 45.80* | 71.40 |
| 161.78 | 0.12 | 0.67 | 1.30 | 0.25 | 28.63 | 1.00 | 0.23 | 0.28 | 0.13 | 0.92 | 1.64 | 5.26 | 60.40 | 70.50 | 79.70 |
| 124.65 | 0.06 | 0.57 | 1.22 | 0.25 | 21.04 | 1.36 | 0.11 | 0.13 | 0.18 | 0.91 | 1.88 | 8.95 | 64.90 | 70.30 | 75.10 |
| 129.60 | 0.13 | 0.60 | 1.38 | 0.23 | 48.05 | 1.32 | 0.26 | 0.38 | 0.16 | 0.91 | 1.52 | 8.64 | 67.10 | 73.80 | 81.50 |
| 111.35 | 0.17 | 0.53 | 1.31 | 0.14 | 30.41 | 1.26 | 0.10 | 0.19 | 0.17 | 0.91 | 3.91 | 6.20 | 66.90 | 73.20 | 77.10 |
| 152.56 | 0.17 | 0.65 | 0.96 | 0.36 | 10.78 | 1.48 | 0.40 | 0.19 | 0.10 | 0.92 | 2.51 | 8.89 | 50.20 | 54.90 | 74.00 |
| 142.76 | 0.15 | 0.42 | 1.43 | 0.20 | 29.11 | 1.13 | 0.30 | 0.24 | 0.13 | 0.91 | 1.51 | 7.88 | 64.40 | 73.70 | 82.20 |
| 111.73 | 0.06 | 0.65 | 1.26 | 0.28 | 29.33 | 1.09 | 0.28 | 0.31 | 0.09 | 0.90 | 2.97 | 4.97 | 73.30 | 66.80 | 71.70 |
| 136.84 | 0.13 | 0.56 | 1.50 | 0.16 | 27.88 | 1.38 | 0.29 | 0.16 | 0.09 | 0.91 | 3.29 | 9.49 | 53.00 | 70.60 | 83.60 |
| 154.24 | 0.10 | 0.62 | 1.36 | 0.36 | 17.12 | 1.33 | 0.28 | 0.33 | 0.10 | 0.92 | 3.02 | 9.20 | 71.70 | 69.00 | 71.50 |
| 116.07 | 0.08 | 0.52 | 1.03 | 0.27 | 11.69 | 1.41 | 0.17 | 0.34 | 0.13 | 0.92 | 2.47 | 5.69 | 63.60 | 57.80 | 71.90 |
| 134.56 | 0.18 | 0.58 | 1.16 | 0.14 | 41.46 | 1.26 | 0.13 | 0.23 | 0.10 | 0.93 | 4.50 | 6.98 | 47.10* | 59.50 | 77.80 |
| 120.91 | 0.15 | 0.51 | 1.45 | 0.24 | 42.02 | 1.41 | 0.40 | 0.26 | 0.12 | 0.92 | 3.80 | 4.32 | 57.80 | 69.80 | 82.70 |
| 138.11 | 0.09 | 0.49 | 1.04 | 0.23 | 16.37 | 1.22 | 0.18 | 0.30 | 0.14 | 0.92 | 2.94 | 4.04 | 35.50* | 37.00* | 72.30 |
| 159.70 | 0.19 | 0.47 | 1.17 | 0.14 | 18.99 | 1.37 | 0.17 | 0.20 | 0.16 | 0.93 | 2.66 | 4.10 | 52.30 | 66.40 | 82.80 |
| 138.70 | 0.16 | 0.66 | 1.43 | 0.14 | 49.51 | 1.08 | 0.24 | 0.37 | 0.15 | 0.92 | 1.28 | 4.25 | 71.20 | 77.20 | 78.90 |
| 155.49 | 0.07 | 0.57 | 0.96 | 0.37 | 33.05 | 1.47 | 0.13 | 0.28 | 0.17 | 0.91 | 3.30 | 7.33 | 47.90* | 46.60* | 70.70 |
| 104.84 | 0.11 | 0.61 | 1.30 | 0.22 | 29.13 | 1.31 | 0.18 | 0.32 | 0.12 | 0.90 | 1.52 | 7.28 | 74.00 | 69.70 | 70.30 |
| 115.28 | 0.13 | 0.69 | 1.19 | 0.13 | 18.13 | 1.18 | 0.15 | 0.26 | 0.16 | 0.90 | 4.50 | 4.36 | 71.30 | 71.50 | 71.60 |
| 146.48 | 0.13 | 0.54 | 1.49 | 0.12 | 35.55 | 1.15 | 0.15 | 0.29 | 0.16 | 0.90 | 2.98 | 7.56 | 72.80 | 77.10 | 75.20 |

NE-LC, NE-TT, and NE-Outlet are the Nash–Sutcliffe efficiency indices at Luanchuan, Tantou, and outlet respectively. * marks values smaller than 50%, which the text refers to as the shaded numbers; the shading itself did not survive in this copy.
the threshold of the likelihood measure at the interior sites. The mean and standard deviation of the behavioral parameter sets and efficiency coefficients of scheme I and scheme II were calculated and are shown in Figure 5. Referring to Figure 5, it can be seen that the standard deviation of most behavioral parameters decreased greatly when setting the threshold value at the interior sites, as did that of the Nash–Sutcliffe efficiency indices at the outlet and interior sites. All of the above results and analysis indicated that taking the interior sites' information into consideration can reduce parameter uncertainty to some degree.

Figure 5 | Comparisons of the mean and standard deviation of behavior parameters and Nash–Sutcliffe efficiency index of different schemes under threshold of the Nash–Sutcliffe efficiency index at the outlet as 70%.

Comparison of parameters' posterior distributions

Figure 6 illustrates comparisons between the parameters' posterior distributions of the Xinanjiang model obtained by scheme I and scheme II, respectively. As shown in Figure 6, the posterior distributions are all distinctly non-uniform and mostly have a peak value in both schemes. Blasone & Vrugt have indicated that the SCEM-UA-derived initial sample contains numerous solutions in the high probability density (HPD) region of the parameter space, so that the average distance of the various parameter combinations to the optimal model is small. Furthermore, most of the parameters' posterior distributions obtained by scheme II were more sharply peaked than those obtained by scheme I. This finding implied that the posterior distributions obtained by scheme II can evolve into the HPD region of the parameter space with higher frequency, so as to obtain more reasonable posterior distributions of the hydrological parameters, since scheme II further filters the alternative simulation results using the interior flow information.

Comparison of uncertainty intervals

To investigate how the interior sites' information affects the efficiency of the uncertainty interval in the Xinanjiang modeling, the three indices presented above, namely the CR, the RIW, and the Nash–Sutcliffe efficiency index of the median MQ0.5, were selected to evaluate the efficiency of
uncertainty. The uncertainty intervals for a given confidence level of 90% are obtained by using the GLUE method with setting a given threshold value of NE as 70%. Table 4 displays the results of the uncertainty evaluation indices of the Xinanjiang model obtained by scheme I and scheme II. NE(MQ0.5) in Table 4 represents the Nash–Sutcliffe efficiency index of the median MQ0.5 produced from the uncertainty analysis by fitting the observed runoff series.

Figure 6 | The posterior distribution of parameters obtained by scheme I and scheme II.
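As a rough illustration of how the quantities used in this comparison fit together, the sketch below computes likelihood weights (Equation (2)), a weighted cumulative distribution (Equation (3)), and the CR and RIW indices (Equations (4) to (6)) in plain Python. It is not the authors' code: the function names, the nested-list layout `sims[k][t]` (behavioral set k, time step t), and the use of the 5% and 95% weighted quantiles for a 90% level are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def nash_sutcliffe(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Equation (1): 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def likelihood_weights(likelihoods: Sequence[float]) -> List[float]:
    """Equation (2): rescale likelihood values so the weights sum to 1."""
    total = sum(likelihoods)
    return [value / total for value in likelihoods]


def weighted_quantile(values: Sequence[float], weights: Sequence[float],
                      prob: float) -> float:
    """Equation (3): smallest ranked value whose cumulative weight reaches prob."""
    ranked = sorted(zip(values, weights))
    total = sum(weights)
    cumulative = 0.0
    for value, weight in ranked:
        cumulative += weight
        if cumulative / total >= prob - 1e-12:  # small tolerance for rounding
            return value
    return ranked[-1][0]


def glue_bounds(sims: Sequence[Sequence[float]], weights: Sequence[float],
                level: float = 0.90) -> Tuple[List[float], List[float], List[float]]:
    """Lower bound, median, and upper bound of discharge at every time step.

    Assumption: the interval is symmetric in probability, e.g. 5% and 95%
    for level = 0.90.
    """
    tail = (1.0 - level) / 2.0
    low, med, up = [], [], []
    for t in range(len(sims[0])):
        column = [run[t] for run in sims]
        low.append(weighted_quantile(column, weights, tail))
        med.append(weighted_quantile(column, weights, 0.5))
        up.append(weighted_quantile(column, weights, 1.0 - tail))
    return low, med, up


def containing_ratio(obs: Sequence[float], low: Sequence[float],
                     up: Sequence[float]) -> float:
    """Equations (4) and (5): share of observations strictly inside the bounds."""
    inside = sum(1 for o, lo, hi in zip(obs, low, up) if lo < o < hi)
    return inside / len(obs)


def relative_interval_width(obs: Sequence[float], low: Sequence[float],
                            up: Sequence[float]) -> float:
    """Equation (6): summed interval width divided by n times the mean observation."""
    mean_obs = sum(obs) / len(obs)
    return sum(hi - lo for lo, hi in zip(low, up)) / (len(obs) * mean_obs)
```

In this layout, NE(MQ0.5) would be obtained by passing the median series returned by `glue_bounds` to `nash_sutcliffe` together with the observations. Dividing the interval width by the mean observed discharge is what makes RIW comparable between stations with different flow magnitudes.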
Table 4 | Assessing indices of uncertainty for different schemes

| Evaluation index | Period | Dongwan (Outlet): Scheme I | Dongwan (Outlet): Scheme II | Dongwan (Outlet): RI% | Tantou: Scheme I | Tantou: Scheme II | Tantou: RI% | Luanchuan: Scheme I | Luanchuan: Scheme II | Luanchuan: RI% |
|---|---|---|---|---|---|---|---|---|---|---|
| RIW | Calibration | 0.606 | 0.524 | 13.58 | 0.589 | 0.524 | 11.05 | 0.575 | 0.524 | 8.89 |
| RIW | Verification | 0.617 | 0.538 | 12.91 | 0.607 | 0.538 | 11.37 | 0.665 | 0.538 | 19.13 |
| CR | Calibration | 0.683 | 0.660 | 3.35 | 0.677 | 0.661 | 2.42 | 0.666 | 0.652 | 2.15 |
| CR | Verification | 0.693 | 0.674 | 2.74 | 0.628 | 0.612 | 2.60 | 0.631 | 0.607 | 3.84 |
| NE(MQ0.5) | Calibration | 0.803 | 0.807 | 0.59 | 0.848 | 0.850 | 0.24 | 0.779 | 0.783 | 0.54 |
| NE(MQ0.5) | Verification | 0.859 | 0.861 | 0.23 | 0.808 | 0.811 | 0.38 | 0.747 | 0.754 | 0.83 |

RI is the percentage of IW decrease from scheme II to scheme I; RIW is relative interval width; CR is containing ratio; NE(MQ0.5) is the Nash–Sutcliffe efficiency index of the median value MQ0.5.

Figures 7 and 8 illustrate the uncertainty intervals and observed flow during the time period of 24 July–12 October 1996 (calibration period) and 19 July–13 October 1998 (validation period) at Dongwan obtained by scheme I and scheme II, respectively.

Figure 7 | The runoff uncertainty intervals and observed flow during the time period 24 July–12 October at the outlet obtained by scheme I and scheme II.

It can be found from Table 4 and Figures 7 and 8 that the CR did not decrease by much, but a more significant decrease can be found in the RIW, implying that considering the interior sites' flow information can reduce parameter uncertainty to some degree. It can also be observed from Table 4 and Figures 7 and 8 that the Nash–Sutcliffe efficiency index of the median MQ0.5, NE(MQ0.5), increased when setting the thresholds at the interior sites, which indicated that, when considering the interior sites' flow information, the simulated runoff series by
the Xinanjiang model with the behavioral parameter sets will fit better with the observed runoff series. Referring to Table 4, the results also showed that the total coverage ratios in both calibration and validation are not very high. It was found that the coverage ratios are high at high flow but low at low flow, and the period of low flow is longer than that of high flow. Uncertainty can arise for many reasons, e.g., input uncertainty, model structure uncertainty, and parameter uncertainty; in this case, however, the reason for this result is that the model used in this study cannot perform very well at low flow in the study area. As pointed out by Beven et al., we should not expect such periods to be well predicted by the set of behavioral models identified in calibration. We should also not expect that such periods would be covered by any statistical representation of the calibration errors, since the epistemic uncertainties of inconsistent periods in prediction might be quite different to those in calibration. The only response to this would appear to be to moderate our expectations of what a model, or set of models, can do in prediction. Other related issues need to be investigated in the future.

Figure 8 | The runoff uncertainty intervals and observed flow during the time period 19 July–13 October at the outlet obtained by scheme I and scheme II.

CONCLUSION

The aim in researching uncertainty is to find the ways and measures to reduce parameter uncertainty in hydrological modeling and forecasting, so as to increase the accuracy and reliability of hydrological forecasting. Using all the available and new data for multi-site evaluation is one of the valid ways to reduce parameter uncertainty in hydrological modeling and forecasting. Based on the GLUE method with the SCEM-UA sampling algorithm, this study focuses on reducing hydrological
modeling uncertainty by using the interior hydrological information in the performance of the Xinanjiang model. Comparison of the results between 12 scenarios showed that, under the same threshold of the Nash–Sutcliffe efficiency index at the outlet, the number and standard deviation of behavioral parameter sets decreased greatly when a threshold value was also set at the interior sites. The uncertainty analysis confirmed that the GLUE method with the SCEM-UA sampling algorithm, which periodically updates the size and direction of the proposal distribution, was able to locate the HPD region of the parameter space efficiently. In addition, the CR decreased only slightly whereas the RIW decreased more markedly, implying that considering the interior sites’ flow information, which makes the selection of behavioral parameter sets stricter, can reduce parameter uncertainty to some degree. Furthermore, the Nash–Sutcliffe efficiency of the median value, MQ0.5, increased when the interior sites’ flow information was taken into consideration. This indicates that the runoff series simulated by the Xinanjiang model with the behavioral parameter sets then fits the observed runoff series better and, correspondingly, that the median value, MQ0.5, gives a better prediction of the runoff.
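The multi-site screening summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary layout and the threshold values are hypothetical, and the coverage ratio is assumed to be the fraction of observations falling inside the prediction bounds, which is a common definition.

```python
def nse(sim, obs):
    """Nash-Sutcliffe efficiency of a simulated series against observations."""
    mean_obs = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def behavioral_sets(sims_by_site, obs_by_site, thresholds):
    """Indices of parameter sets whose NSE meets the threshold at every listed site.

    sims_by_site: site name -> list of simulated series, one per parameter set
    obs_by_site:  site name -> observed series
    thresholds:   site name -> minimum NSE (hypothetical values chosen by the user)
    """
    n_sets = len(next(iter(sims_by_site.values())))
    keep = []
    for j in range(n_sets):
        if all(nse(sims_by_site[site][j], obs_by_site[site]) >= thr
               for site, thr in thresholds.items()):
            keep.append(j)
    return keep


def coverage_ratio(lower, upper, obs):
    """Fraction of observations inside [lower, upper] (assumed definition of CR)."""
    inside = sum(lo <= o <= hi for lo, o, hi in zip(lower, obs, upper))
    return inside / len(obs)
```

Calling `behavioral_sets` once with only an outlet threshold and once with outlet plus interior thresholds shows the effect described in the text: adding interior sites can only keep or shrink the behavioral set, never enlarge it.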
ACKNOWLEDGEMENTS
The authors would like to express their gratitude to Valeria from the University of Illinois at Urbana-Champaign. The authors are grateful to Dr Jasper A. Vrugt for developing the SCEM-UA code. The authors would also like to express their sincere gratitude to Prof. Keith Beven and the other two anonymous referees for their constructive comments and useful suggestions, which helped us improve our paper. This study was financially supported by the National Natural Science Foundation of China (Grant No. 50809078) and the Pearl-River-New-Star of Science and Technology project of Guangzhou City (Grant No. 2011J2200051).
REFERENCES Ajami, N. K., Duan, Q. & Sorooshian, S. An integrated hydrologic Bayesian multimodel combination framework:
confronting input, parameter and model structural uncertainty. Water. Resour. Res. 43, W01403. Andersen, J., Refsgaard, J. C. & Jensen, K. H. Distributed hydrological modelling of the Senegal river basin – model construction and validation. J. Hydrol. 247 (3–4), 200–214. Benke, K. K., Lowell, K. E. & Hamilton, A. J. Parameter uncertainty, sensitivity analysis and prediction error in a water-balance hydrological model. Math. Comput. Model. 47 (11–12), 1134–1149. Beven, K. & Binley, A. The future of distributed models: model calibration and uncertainty prediction. Hydrol. Process. 6, 279–298. Beven, K. & Freer, J. Equifinality, data assimilation, and uncertainty estimation in mechanistic modeling of complex environmental systems using the GLUE methodology. J. Hydrol. 249, 11–29. Beven, K., Smith, P. J. & Wood, A. On the colour and spin of epistemic error (and what we might do about it). Hydrol. Earth Syst. Sci. 15, 3123–3133. Blasone, R. S. & Vrugt, J. A. Generalized likelihood uncertainty estimation (GLUE) using adaptive Markov Chain Monte Carlo sampling. Adv. Water Res. 31, 630–648. Blazkova, S. & Beven, K. J. Flood frequency estimation by continuous simulation for a catchment treated as ungauged (with uncertainty). Water Resour. Res. 38 (8), 1139. Das, T., Bardossy, A., Zehe, E. & He, Y. Comparison of conceptual model performance using different representations of spatial variability. J. Hydrol. 356 (1–2), 106–118. Dotto, C. B. S., Mannina, G., Kleidorfer, M., Vezzaro, L., Henrichs, M., McCarthy, D. T., Freni, G., Rauch, W. & Deletic, A. Comparison of different uncertainty techniques in urban stormwater quantity and quality modelling. Water Res. 46 (8), 2545–2558. Feyen, L., Kalas, M. & Vrugt, J. A. Semi-distributed parameter optimization and uncertainty assessment for largescale streamflow simulation using global optimization. Hydrol. Sci. J. 53 (2), 293–308. Freer, J., Beven, K. J. & Ambroise, B. 
Bayesian estimation of uncertainty in runoff prediction and the value of data: an application of the GLUE approach. Water Resour Res. 32 (7), 2161–2173. Freer, J., McMillan, H., McDonnell, J. J. & Beven, K. J. Constraining dynamic TOPMODEL responses for imprecise water table information using fuzzy rule based performance measures. J. Hydrol. 291, 254–277. Gallart, F., Latron, J., Llorens, P. & Beven, K. Using internal catchment information to reduce the uncertainty of discharge and baseflow predictions. Adv. Water Res. 20, 808–823. Gong, Z. J. Estimation of mixed Weibull distribution parameters using the SCEM-UA algorithm: application and comparison with MLE in automotive reliability analysis. Reliab. Eng. Syst. Safe. 15 (1), 915–922. Goodman, D. Extrapolation in risk assessment: improving the quantification of uncertainty, and improving information to reduce the uncertainty. Hum. Ecol. Risk. Assess. 8 (1), 177–192.
Gupta, H. V., Sorooshian, S. & Yapo, P. O. Toward improved calibration of hydrologic models: multiple and noncommensurable measures of information. Water Resour. Res. 34, 751–763. Hejazi, M. I., Cai, X. M. & Borah, D. K. Calibrating a watershed simulation model involving human interference: an application of multi-objective genetic algorithms. J. Hydroinform. 10 (1), 97–111. Jin, X., Xu, C., Zhang, Q. & Singh, V. P. Parameter and modeling uncertainty simulated by GLUE and a formal Bayesian method for a conceptual hydrological model. J. Hydrol. 383, 147–155. Krysanova, V., Bronstert, A. & Müller-Wohlfeil, D. Modelling river discharge for large drainage basins: from lumped to distributed approach. Hydrol. Sci. J. 44 (2), 313–331. Li, Z., Shao, Q., Xu, Z. & Cai, X. Analysis of parameter uncertainty in semi-distributed hydrological models using bootstrap method: a case study of SWAT model applied to Yingluoxia watershed in northwest China. J. Hydrol. 385, 76–83. Lin, K., Chen, X., Zhang, Q. & Chen, Z. A Modified Generalized Likelihood Uncertainty Estimation Method by Using Copula Function. IAHS Publication, Wallingford, UK, 335, pp. 51–56. Lin, K., Zhang, Q. & Chen, X. An evaluation of impacts of DEM resolution and parameter correlation on TOPMODEL modeling uncertainty. J. Hydrol. 394, 370–383. Lumbroso, D. & Gaume, E. Reducing the uncertainty in indirect estimates of extreme flash flood discharges. J. Hydrol. 414, 16–30. Maschio, C., Schiozer, D. J., Moura, M. A. B. & Becerra, G. G. A methodology to reduce uncertainty constrained to observed data. SPE. Reserv. Eval. Eng. 12 (1), 167–180. McMichael, C. E., Hope, A. S. & Loaiciga, H. A. Distributed hydrological modelling in California semi-arid shrublands: MIKE SHE model calibration and uncertainty estimation. J. Hydrol. 317 (3–4), 307–324. McMillan, H. & Clark, M. Rainfall-runoff model calibration using informal likelihood measures within a Markov chain Monte Carlo sampling scheme. Water Resour. Res. 45, W04418. 
Montanari, A. Large sample behaviors of the generalized likelihood uncertainty estimation (GLUE) in assessing the uncertainty of rainfall–runoff simulations. Water Resour. Res. 41, W08406. Moradkhani, H., Hsu, K.-L., Gupta, H. & Sorooshian, S. Uncertainty assessment of hydrologic model states and parameters: sequential data assimilation using the particle filter. Water Resour. Res. 41 (5), 1–17. Mousavi, S. J., Abbaspour, K. C., Kamali, B., Amini, M. & Yang, H. Uncertainty-based automatic calibration of HEC-HMS
model using sequential uncertainty fitting approach. J. Hydroinform. 14 (2), 286–309. Moussa, R., Chahinian, N. & Bocquillon, C. Distributed hydrological modelling of a Mediterranean mountainous catchment – model construction and multi-site validation. J. Hydrol. 337 (1–2), 35–51. Nash, J. E. & Sutcliffe, J. V. River flow forecasting through the conceptual models, 1: a discussion of principles. J. Hydrol. 10 (3), 282–290. Ng, T. L., Eheart, J. W. & Cai, X. M. Comparative calibration of a complex hydrologic model by stochastic methods GLUE and PEST. Trans. ASABE 53 (6), 1773–1786. Sivapalan, M., Takeuchi, K., Franks, S. W., Sivapalan, M., Takeuchi, K., Franks, S. W., Gupta, K., Karambiri, H., Lakshmi, K., Liang, X., McDonnell, J. J., Mendiondo, E. M., O’Connell, P. E., Oki, T., Pomeroy, J. W., Schertzer, D., Uhlenbrook, S. & Zehe, E. IAHS decade on predictions in ungauged basins (PUB), 2003–2012: shaping an exciting future for the hydrological sciences. Hydrol. Sci. J. 48 (6), 857–880. Thiemann, M., Trosset, M., Gupta, H. & Sorooshian, S. Bayesian recursive parameter estimation for hydrological models. Water Resour. Res. 7 (10), 21–35. Uhlenbrook, S. & Sieber, A. On the value of experimental data to reduce the prediction uncertainty of a process-oriented catchment model. Environ. Modell. Softw. 20 (1), 19–32. Vrugt, J. A. & Robinson, B. A. Treatment of uncertainty using ensemble methods: comparison of sequential data assimilation and Bayesian model averaging. Water Resour. Res. 43, W01411. Vrugt, J. A., Gupta, H. V., Bouten, W. & Sorooshian, S. A Shuffled Complex Evolution Metropolis algorithm for optimization and uncertainty assessment of hydrologic model parameters. Water Resour. Res. 39 (8), 1201. Wagener, T. & Gupta, H. V. Model identification for hydrological forecasting under uncertainty. Stoch. Environ. Res. Risk. A 19, 378–387. Xiong, L. & O’Connor, K. M. 
An empirical method to improve the prediction limits of the GLUE methodology in rainfall-runoff modeling. J. Hydrol. 349, 115–124. Xu, D., Wang, W., Chau, K. & Chen, S. Comparison of three global optimization algorithms for calibration of the Xinanjiang model parameters. J. Hydroinform. 15 (1), 174–193. Zhao, R. J. The Xinanjiang model applied in China. J. Hydrol. 135, 371–381. Zhao, R. J. & Liu, X. R. The Xinanjiang model. In: Computer Models of Watershed Hydrology (V. P. Singh, ed.). Water Resources Publication, Highlands Ranch, CO. Zhao, R. J., Zhang, Y. L. & Fang, L. R. The Xinanjiang model. Hydrological Forecasting Proceedings Oxford Symposium. IASH, Oxford, vol. 129, 1980, pp. 351–356.
First received 5 November 2012; accepted in revised form 10 May 2013. Available online 4 June 2013
© IWA Publishing 2014 Journal of Hydroinformatics
|
16.1
|
2014
Integration of an evolutionary algorithm into the ensemble Kalman filter and the particle filter for hydrologic data assimilation
Gift Dumedah and Paulin Coulibaly
ABSTRACT
Data assimilation (DA) methods continue to evolve in the design of streamflow forecasting procedures. Critical components for efficient DA include accurate description of states, improved model parameterizations, and estimation of the measurement error. Information about these components is usually assumed or rarely incorporated into streamflow forecasting procedures. Knowledge of these components could be gained through the generation of a Pareto-optimal set – a set of competitive members that are not dominated by other members when compared using evaluation objectives. This study integrates Pareto-optimality into the ensemble Kalman filter (EnKF) and the particle filter (PF). Comparisons are made between three methods: evolutionary data assimilation (EDA) and methods based on the integration of Pareto-optimality into the EnKF (ParetoEnKF) and into the PF (ParetoPF). The methods are applied to assimilate daily streamflow into the Sacramento Soil Moisture Accounting model in the Spencer Creek watershed in Canada. The updated members are applied to forecast streamflows for up to 10 days ahead, where forecasts for 1 day, 5 day and 10 day lead times are compared to observations. The results show that updated estimates are similar for all three methods. An evaluation of updated members for multi-step forecasting revealed that EDA had the highest forecast accuracy compared to ParetoEnKF and ParetoPF, which have similar accuracies.
Key words | data assimilation, ensemble Kalman filter, multi-objective evolutionary algorithms, Pareto-optimality, particle filter, streamflow forecasting
Gift Dumedah (corresponding author) Department of Civil Engineering, Monash University, Building 60, Melbourne, Victoria 3800, Australia E-mail: dgiftman@hotmail.com
Paulin Coulibaly School of Geography and Earth Sciences, and Department of Civil Engineering, McMaster University, 1280 Main Street West, Hamilton, Ontario, Canada L8S4L8
INTRODUCTION
Data assimilation (DA) has gained popularity in the design of streamflow forecasting methods. It is an analytical approach that allows an optimal merger between inaccurate model output and imperfect observations, and accounts for uncertainties in model and observation data (Liu & Gupta ). For hydrological forecasting systems, accurate estimation of the state, determination of the measurement (or observation) error, and parameterizations of the model are crucial components for the performance of the DA method (Chen ; Snyder et al. ; van Leeuwen ). These three components are important for the design of efficient hydrological forecasting systems. The model state and parameterizations have a direct influence on simulations (e.g. streamflow) and subsequent assimilation, whereas the measurement error controls the relative penalty between simulation and observation. State and model parameterizations also control model forecasts (i.e. background information) and ensemble simulations, which are combined with observations to determine updated ensemble members. However, specific integration of these components into streamflow forecasting procedures has not been fully examined in the hydrological literature.
Evolutionary algorithms have been shown to provide a stochastic framework to address key components for DA,
doi: 10.2166/hydro.2013.088
G. Dumedah & P. Coulibaly
|
Integration of Pareto-optimality into Bayesian-type filters
Journal of Hydroinformatics
|
16.1
|
2014
including efficient parameter estimation and the estimation of measurement error using the dynamics between simulations and observations (Chemin & Honda ; Nazemi et al. ; Ines & Mohanty ; Dumedah et al. ). Multi-objective evolutionary algorithms have been widely used for hydrological applications (Chemin & Honda ; Ines & Mohanty , ; Dumedah et al. , , a, b). Evolutionary algorithms allow several members in a population to compete among themselves based on evaluation objectives – the fitter members are selected and varied to reproduce new members to form a new population. The procedure is repeated to evolve the population through natural selection and variation of fitter members using crossover and mutation – nudging operators for modifying members to maintain diversity between members. Each cycle of evolution of the population to reproduce a new one is called a generation. The evaluation conditions change with each population as the fitness of its members usually increases with every cycle of the evolution. This continuous evolution of the population of members through different evaluation conditions allows the determination of the Pareto-optimal set – a set of equally accurate members that are not dominated when compared to other members using evaluation objectives.
Note that the evolution continuously evaluates the dynamics between several simulations and perturbed observations. This interaction estimates the measurement noise as the error of using several simulations (or measurements) to approximate an ensemble of perturbed observations. The continuous evolution and the final population from which the Pareto-optimal set is determined provide an appealing framework to adaptively approximate the measurement error, and to improve the estimation of state and model parameterizations. This can facilitate the integration of Pareto-optimality into Kalman-type assimilations where, instead of assimilating randomly generated members (or ensemble members), the framework first determines Pareto-optimal members before they are optimally merged with the perturbed observations. The integration of Pareto-optimality into Kalman-type assimilation can facilitate the design and performance of hydrological forecasting systems. Further information on evolutionary algorithms can be found in Deb () and Eiben & Smith ().
In assimilation, the evolutionary algorithm employs the variational DA approach (Reichle et al. ; Caparrini et al. ; Caparrini et al. ) to minimize a cost (or penalty) function by finding the least squares estimate between ensemble simulations and the observation data. As a result, evolutionary data assimilation (EDA) uses evolutionary strategy to continuously evolve a population of competing members through evaluation conditions that are defined by the cost function and other accuracy measures such as the root mean square error (RMSE). Applied in a sequential mode, the EDA evolves the population of members at each assimilation time step and also between time steps. At each assimilation time step, several members are evaluated, but only the ensemble members that remain non-dominated are selected as the updated members to determine the ensemble mean and its variance. A detailed description of the computational procedure of the EDA is provided in the subsection on EDA.
Moreover, the design of streamflow forecasting systems has been dominated by popular DA methods such as the ensemble Kalman filter (EnKF) and the particle filter (PF) (Moradkhani & Hsu ; Weerts & El-Serafy ; Clark et al. ; van Leeuwen ; Weerts et al. ; Xie & Zhang ). Brief descriptions for the EnKF and PF are given below. The EnKF was developed by Evensen (a), and has been applied in several hydrological studies (Clark et al. ; Komma et al. ; Thirel et al. ; Xie & Zhang ). The EnKF uses Monte Carlo integration to estimate the posterior probability density function (pdf) through the ensemble mean and covariance (Evensen b, ; Burgers et al. ). The ensemble members, which may include perturbed states, model parameters, and forcing data uncertainties, are propagated by using the model to make predictions (or measurements) to future time. The ensemble predictions are combined with observations to determine the Kalman gain function (denoted K) and the innovation vector. The K that is computed using the covariance between states, parameters, and forcing data is combined with the innovation vector, the residual between simulation and observation, to update the ensemble members. The above procedure is repeated to evolve state and model parameter components for subsequent time steps. Detailed application of the EnKF is
provided in the subsection on integration of Pareto-optimality into the EnKF (ParetoEnKF).
The PF was originally developed by Gordon et al. (), and has been applied widely in hydrology (Moradkhani & Hsu ; Weerts & El-Serafy ; van Leeuwen ). The PF also uses recursive Bayesian estimation to estimate the posterior pdf, but through the use of weighted ensemble members for approximating the full pdf (Bengtsson et al. ; Chen ; Vossepoel & van Leeuwen ; Snyder et al. ; van Leeuwen ). As in the EnKF, the ensemble members for states, model parameters, and forcing data uncertainties are propagated by using the model to make predictions forward in time. The ensemble predictions are combined with observations to compute the ensemble weight, which in turn is re-sampled to replace low weighted members with normalized weighted members. The re-sampled weights are applied to update the ensemble members, and the procedure is repeated to evolve state and model parameter components for subsequent time steps. Detailed application of the PF is provided in the subsection on integration of Pareto-optimality into the PF (ParetoPF).
Since both the EnKF and the PF have clear and standard computational procedures, their applications in this study are outlined in the subsections on ParetoEnKF and ParetoPF. Despite their popularity, analytical drawbacks, including the assumption of normality for model errors in the EnKF and the weight degeneracy problems in the PF, are well-known challenges in the DA literature (Chen ; Moradkhani & Hsu ; Weerts & El-Serafy ; Clark et al. ; Snyder et al. ; van Leeuwen ). While these limitations are not specifically addressed here, this study will demonstrate the integration of Pareto-optimality into the EnKF and PF. The integration would allow adaptive estimation of the measurement error, and the assimilation of a continuously evolved set of members to ensure that model forecasts are generated from improved model parameterizations and updated states. As will be demonstrated in this study, the continuous evaluation of the updated ensemble members and their associated model forecasts is an important measure for evaluating forecasting performance of DA methods.
This study makes a comparison between three assimilation approaches: EDA, ParetoEnKF, and ParetoPF. The ParetoEnKF generates a Pareto-optimal set using evolutionary strategy before merging the resulting evolved members with perturbed observations through the EnKF. The ParetoPF determines the updated ensemble members by using the evolutionary algorithm to generate the Pareto-optimal members that, in turn, are assimilated using a particle filtering method. The three methods are applied in a state-parameter estimation procedure to assimilate daily streamflow into the Sacramento Soil Moisture Accounting (SAC-SMA) model in Spencer Creek watershed in southern Ontario, Canada. The updated ensemble members are applied to forecast streamflow for up to a 10 day lead time in order to evaluate the forecasting performance for the DA methods.
The rest of the paper is organized as follows. The Data and methods section describes the study area and the rainfall-runoff model, and implementation procedures for the three methods. The resulting assimilation outputs and model forecasts are presented in the Results and discussion section. The implications of the results on the design and performance of streamflow forecasting systems, and findings of this study are summarized in the Conclusions section.
DATA AND METHODS
Study area and rainfall-runoff model
The study area, Spencer Creek watershed shown in Figure 1, is located westward of Lake Ontario in southern Ontario, Canada. The Spencer Creek watershed has a drainage area of about 280 km², and the land cover is mainly agricultural with mixed forest. The upstream area has a flat physiographic terrain, whereas the downstream area has variable topography. Forcing data, including daily temperature and daily precipitation, are obtained from Environment Canada weather stations and also from McMaster University weather stations. Two streamflow gauging stations, Highway 5, located at the upstream section, and Dundas, located at the downstream section, are used in this study.
The rainfall-runoff model used is a modified version of the SAC-SMA model, in which a snowmelt routine was included. The SAC-SMA model is a conceptual model and it spatially lumps the drainage area to account for water balance (inflow, storage, and outflow) in the catchment. The
Figure 1 | Study area – Spencer Creek watershed. Source: Natural Resources Canada.
model states that are based on the soil moisture budget are determined from the mean area precipitation, the evapotranspiration, the net streamflow, and the net groundwater loss. The SAC-SMA model has been applied in several studies, and is extensively used for operational streamflow forecasting (Vrugt et al. a, b; Vrugt & Robinson ). A list of SAC-SMA model parameters and state variables is given in Table 1. Note that the intervals represent values that are physically meaningful for the SAC-SMA model in the context of the Spencer Creek watershed. Further information on the SAC-SMA model can be found in Vrugt et al. (a, b).
The EDA procedure
The EDA uses the Non-dominated Sorting Genetic Algorithm-II (NSGA-II) to continuously evolve a population of members through different assimilation time steps. The NSGA-II was designed by Deb et al., and is an advanced multi-objective evolutionary algorithm that has been applied in several hydrological studies (Tang et al. ; Confesor & Whittaker ; Wohling et al. ; Dumedah ; Dumedah & Coulibaly ). Computational procedures of the NSGA-II can be found in the following sources: Deb et al. (, ), Deb (), Deb & Goel () and Coello Coello et al. (). A flowchart for computational procedures in the EDA is shown in Figure 2 – detailed descriptions are given below.
The EDA is applied to sequentially assimilate daily streamflow into the SAC-SMA model through the simultaneous estimation of state and model parameter components. The state and model parameter components are considered time variant in a way that they are updated for each assimilation time step when there is a new observation. This is the same as in the standard state-parameter assimilation procedure in Moradkhani et al. (). The EDA begins by using the NSGA-II to generate n random members into a population Pn for initial time t0. This initial population is generated by using the model parameter bounds and the forcing data uncertainties shown in Table 1. The population Pn is varied using crossover and mutation operators to generate a child population of size n where both populations are combined to create 2n members
Table 1 | Description and intervals for model parameters and state variables for the SAC-SMA model
Parameter | Description | Interval
Model parameters
UZTWM | Upper-zone tension water maximum storage (mm) | 5–100
UZFWM | Upper-zone free water maximum storage (mm) | 5–100
LZTWM | Lower-zone tension water maximum storage (mm) | 100–500
LZFPM | Lower-zone free water primary maximum storage (mm) | 50–500
LZFSM | Lower-zone free water supplemental maximum storage (mm) | 250–1,000
ADIMP | Additional impervious area (–) | 0.01–0.4
Recession parameters
UZK | Upper-zone free water lateral depletion rate (day⁻¹) | 0.01–0.2
LZPK | Lower-zone primary free water depletion rate (day⁻¹) | 0.0001–0.02
LZSK | Lower-zone supplemental free water depletion rate (day⁻¹) | 0.1–0.5
Percolation and other
ZPERC | Maximum percolation rate (–) | 1–10
REXP | Exponent of the percolation equation (–) | 1–10
PCTIM | Impervious fraction of the watershed area (–) | 0.0–0.01
PFREE | Fraction percolating from upper to lower-zone free water storage (–) | 0–0.5
RIVA | Riparian vegetation area (–) | 0
SIDE | Ratio of deep recharge to channel base flow (–) | 0
SAVED | Fraction of lower-zone free water not transferable to tension water | 0
Soil moisture states
UZTWC | Upper-zone tension water storage content (mm) | Updated
UZFWC | Upper-zone free water storage content (mm) | Updated
LZTWC | Lower-zone tension water storage content (mm) | Updated
LZFPC | Lower-zone free primary water storage content (mm) | Updated
LZFSC | Lower-zone free secondary water storage content (mm) | Updated
ADIMC | Additional impervious area content linked to stream network (mm) | Updated
Snow routine components
DDF | Degree day factor | 1–5
SCF | Snowfall correction factor | 0.8–1.2
TR | Upper threshold temperature, to distinguish between rainfall, snowfall and a mix of rain and snow | 0–1
ATHORN | A constant for Thornthwaite’s equation | 0.1–0.3
RCR | Rainfall correction factor | 0.8–1.2
SWE (state) | Snow water equivalent (mm) |
Nash-cascade routing components
RQ | Residence time parameters of quick flow | 0.4–0.95
UHG1, UHG2, UHG3 | Three linear reservoirs to route the upper-zone (quick response) channel inflow (mm) | Updated
Forcing variables
PRECIP | Precipitation (mm) | ±5%
TEMPR | Temperature (°C) | ±5%
EVAPO | Evapotranspiration (mm) | ±5%
Variables with the interval ‘Updated’ are defined using model output.
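As an illustration of how the intervals in Table 1 could seed a random initial population, the following minimal sketch draws members within the tabulated bounds. It assumes uniform sampling within each interval, which the text does not specify; the names `BOUNDS`, `random_member`, `initial_population` and `within_bounds` are hypothetical, and only a subset of the parameters is listed for brevity.

```python
import random

# Intervals copied from Table 1 (subset only, for brevity).
BOUNDS = {
    "UZTWM": (5.0, 100.0),
    "UZFWM": (5.0, 100.0),
    "LZTWM": (100.0, 500.0),
    "UZK": (0.01, 0.2),
    "LZPK": (0.0001, 0.02),
    "ZPERC": (1.0, 10.0),
    "REXP": (1.0, 10.0),
    "PFREE": (0.0, 0.5),
}


def random_member(rng):
    """Draw one member uniformly within the bounds (uniform sampling is an assumption)."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in BOUNDS.items()}


def initial_population(n, seed=0):
    """Generate n random members; the seed makes the draw reproducible."""
    rng = random.Random(seed)
    return [random_member(rng) for _ in range(n)]


def within_bounds(member):
    """Check that every parameter of a member lies inside its Table 1 interval."""
    return all(BOUNDS[k][0] <= v <= BOUNDS[k][1] for k, v in member.items())
```

A check such as `within_bounds` is also useful after variation operators are applied, to confirm that evolved members stay physically meaningful.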
Figure 2 | Computational procedures for a sequential assimilation in the EDA.
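One generation of the procedure in Figure 2 can be sketched as follows. This is a simplified illustration, not the NSGA-II itself: it keeps only the first non-dominated front and omits non-dominated ranking and crowding distance. The function names, the default crossover probability and mutation rate, and the use of the absolute bias as an objective are assumptions; the objectives follow the bias, RMSE and cost function J of Equations (5)–(7) for a single data point (k = 1).

```python
import math
import random


def objectives(y_hat, y_obs, y_bg, var_b, var_o):
    """(|bias|, RMSE, J) for one data point (k = 1); absolute bias is an assumption."""
    abs_bias = abs(y_hat - y_obs)
    rmse = math.sqrt((y_hat - y_obs) ** 2)
    cost = (y_hat - y_bg) ** 2 / var_b + (y_hat - y_obs) ** 2 / var_o
    return abs_bias, rmse, cost


def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(objs):
    """Indices of members that no other member dominates (minimization)."""
    return [i for i, a in enumerate(objs)
            if not any(dominates(b, a) for j, b in enumerate(objs) if j != i)]


def uniform_crossover(p1, p2, rng, p_cross=0.5):
    """Take each gene from p2 with probability p_cross, otherwise from p1."""
    return [g2 if rng.random() < p_cross else g1 for g1, g2 in zip(p1, p2)]


def substitution_mutation(member, bounds, rng, rate=0.1):
    """Replace each gene by a fresh uniform draw within its bounds with probability rate."""
    return [rng.uniform(lo, hi) if rng.random() < rate else g
            for g, (lo, hi) in zip(member, bounds)]


def combined_population(parents, bounds, rng):
    """Vary n parents into n children and return the combined 2n members."""
    children = []
    for p1 in parents:
        p2 = rng.choice(parents)
        child = uniform_crossover(p1, p2, rng)
        children.append(substitution_mutation(child, bounds, rng))
    return parents + children
```

In a full run, each of the 2n members would be passed through the rainfall-runoff model to obtain `y_hat`, scored with `objectives`, and filtered with `non_dominated` before the next generation.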
into a new population Pc. The variation operators employ a uniform-type crossover with a crossover probability, and a substitution-type mutation with a mutation rate. Note that a member (or a solution) is represented as a vector containing values for states, model parameters, and forcing data uncertainties, which are applied in the SAC-SMA model to generate streamflow. The states are obtained using Equation (1), and the forcing data are perturbed according to Equation (2). The model parameters, states, and the forcing data are applied into the SAC-SMA model to estimate the predicted streamflow in Equation (3), and the observed streamflow is perturbed using Equation (4):
$$x_t = f[x_{t-1}, u_{t-1}, z_t] \tag{1}$$
$$u_t = u_t + \gamma_t, \quad \gamma_t \sim N(0, \beta_t^{u}) \tag{2}$$
$$\hat{y}_t = h[x_t, z_t, u_t] \tag{3}$$
$$y_t = y_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \beta_t^{y}) \tag{4}$$
where, for each population member, $x_t$ is a vector of forecasted states at time $t$ with dimension $L \times 1$; $L$ is the number of model states; $f[\cdot]$ is the system transition function (or the SAC-SMA model); $x_{t-1}$ is a vector of updated states for the previous time; $z_t$ is the model parameter with dimension $F \times 1$; $F$ is the number of model parameters; $u_t$ is the forcing data with dimension $E \times 1$; $E$ is the number of forcing variables; and $\gamma_t$ is the forcing data error with covariance $\beta_t^{u}$ at each time step. Additionally, $\hat{y}_t$ is the ensemble streamflow predictions with dimension $2n \times 1$; $h$ is the measurement function (i.e. the SAC-SMA model); $y_t$ is the observed streamflow with dimension $2n \times 1$; and $\varepsilon_t$ is the observed streamflow error with covariance $\beta_t^{y}$ at each time step.
The population Pc is evaluated and continuously evolved over several generations to determine the Pareto-optimal set using the evaluation objectives: bias in Equation (5), RMSE in Equation (6), and the cost function J in Equation (7). All the objectives are minimized such that the bias and RMSE aim to determine a simulated streamflow (from the SAC-SMA model) that is closest to the perturbed streamflow observation. Note that the observed daily streamflow is randomly perturbed using the associated hourly variance such that 2n ensemble observations are generated to correspond to the number of members in Pc. The minimization of J allows the determination of a simulated streamflow from the SAC-SMA model that represents an optimal compromise between the background (i.e. forecast from previous ensemble members) and the observed streamflow. The background streamflow is the average streamflow value determined by applying the ensemble members from the previous time step into the SAC-SMA model to forecast ensemble streamflows for the current time step. For the initial time step, the background value is computed from a randomly generated population of members.
The final evolved population of size n from which the Pareto-optimal set is determined represents the updated ensemble members for t0. This final evolved population of members is used to determine the streamflow ensemble mean and its associated variance. This population is also used to forecast n ensemble members for future time t1 where the average and variance of the ensemble members are used as background information at t1. Note that the streamflow forecast is conducted by applying the final ensemble members that have specific values for states, model parameters, and forcing data uncertainties into the SAC-SMA model:
$$\mathrm{Bias} = \frac{\sum_{i=1}^{k} (\hat{y}_i - y_i)}{k} \tag{5}$$
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{k} (\hat{y}_i - y_i)^2}{k}} \tag{6}$$
$$J = \sum_{i=1}^{k} J(y_i) = \sum_{i=1}^{k} \left\{ \frac{(\hat{y}_i - \hat{y}_{b,i})^2}{\sigma_b^2} + \frac{(\hat{y}_i - y_i)^2}{\sigma_o^2} \right\} \tag{7}$$
where $\hat{y}_{b,i}$ is the background value for the ith data point, $y_i$ is the observed value for the ith data point, $\sigma_b^2$ is the variance for the background streamflow, $\sigma_o^2$ is the variance for the observed streamflow, $\hat{y}_i$ is the analysis (or searched) value for the ith data point that minimizes $J(y_i)$, and $k$ is the number of data points (in this study, $k = 1$ for sequential assimilation).
The EDA increments the assimilation time step to t1 where the n members in the final evolved population found at t0 are varied and integrated to create a seed population Pc of 2n ensemble members. The population Pc is continuously evaluated and evolved to determine a new Pareto-optimal set for t1, where it is used to determine the updated ensemble members and background information for future time step t2. The above steps are repeated to create a seed population, continuously evolve these members, and determine the updated ensemble members and background information for subsequent time periods. Note that procedures for determining background information are used in a similar approach to determine model forecasts for the 10 day lead times. That is, the streamflow forecasts are based on the updated ensemble members that are associated with specific values for forcing data uncertainties, states and model parameters.
The ParetoEnKF procedure
The ParetoEnKF method uses the NSGA-II to continuously evolve a population of members before assimilating the resulting Pareto-optimal set using the EnKF method. That
is, instead of assimilating randomly perturbed ensemble members, the ParetoEnKF first generates an equally competitive set of members before they are merged with observation data. In the ParetoEnKF method, the EnKF is only used to update the final evolved members in the Pareto-optimal set, whereas the NSGA-II controls the Pareto distribution through continuous evolution and natural selection of members. Note that the assimilation of the final evolved population is conducted following the state-parameter formulation outlined in Moradkhani et al. (). A flowchart for computational procedures in the ParetoEnKF is shown in Figure 3; detailed descriptions are given below.

The estimation of the Pareto-optimal set is described briefly, since the generation of the final evolved population that contains the Pareto-optimal set has been described in detail in the subsection on EDA. Beginning with the initial time t0, the ParetoEnKF method uses the NSGA-II to evolve a randomly generated population Pv (of the same size as Pc in the EDA subsection). The EDA procedures are applied to determine the Pareto-optimal set using the evaluation objectives: bias in Equation (5), RMSE in Equation (6), and the cost function in Equation (7). All members in the final population from which the Pareto-optimal set is determined represent ensemble members (with associated streamflows) to be assimilated using the EnKF method.

The predicted ensemble streamflows ($\hat{y}_t$) are combined with perturbed observations ($y_t$) in Equation (4) to determine the Kalman gain functions for model parameters in Equation (8), and for state components in Equation (9):

$$K_t^{z} = \beta_t^{zy} \left[ \beta_t^{yy} + \beta_t^{y} \right]^{-1} \tag{8}$$

$$K_t^{x} = \beta_t^{xy} \left[ \beta_t^{yy} + \beta_t^{y} \right]^{-1} \tag{9}$$

where $\beta_t^{zy}$ is the cross covariance of the parameter ensemble $z_t$ and the ensemble streamflow prediction $\hat{y}_t$, $\beta_t^{yy}$ is the forecast error covariance for the ensemble of streamflow predictions $\hat{y}_t$, $\beta_t^{y}$ is the covariance for the observed streamflow, and $\beta_t^{xy}$ is
Figure 3 | Computational procedures for a sequential assimilation in the ParetoEnKF: integration of Pareto-optimality into the EnKF.
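The EnKF update step that the flowchart applies to the evolved members (Equations (8) to (11)) can be sketched for a single scalar streamflow observation as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `covariance` and `enkf_update`, the use of sample covariances, and the toy numbers are all hypothetical.

```python
import statistics


def covariance(a, b):
    """Sample covariance between two equally sized ensembles."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def enkf_update(prior, y_pred, y_obs_perturbed, obs_var):
    """Update one parameter (or state) ensemble with a scalar observation.

    prior            : values before update (z_t^- or x_t^-), one per member
    y_pred           : predicted streamflow for each member
    y_obs_perturbed  : perturbed observation for each member
    obs_var          : observation error variance (assumed known)
    """
    cross_cov = covariance(prior, y_pred)    # beta^{zy} or beta^{xy}
    pred_var = covariance(y_pred, y_pred)    # beta^{yy}
    gain = cross_cov / (pred_var + obs_var)  # Equations (8)/(9), scalar case
    updated = [p + gain * (yo - yp)          # Equations (10)/(11)
               for p, yo, yp in zip(prior, y_obs_perturbed, y_pred)]
    return updated, gain


# Toy ensemble of three members (illustrative values only).
updated, gain = enkf_update(
    prior=[1.0, 2.0, 3.0],
    y_pred=[2.0, 4.0, 6.0],
    y_obs_perturbed=[4.0, 4.0, 4.0],
    obs_var=4.0,
)
print(gain, updated)
```

In this toy case the members whose predictions lie below the observation are moved up and those above are moved down, in proportion to the gain.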
the cross covariance of the model state ensemble and the streamflow prediction ensemble. The model parameters and state components are directly updated using their respective Kalman gain functions and the innovation vector, as in Equations (10) and (11), respectively:

$$z_t^{+} = z_t^{-} + K_t^{z} (y_t - \hat{y}_t) \tag{10}$$

$$x_t^{+} = x_t^{-} + K_t^{x} (y_t - \hat{y}_t) \tag{11}$$

where $z_t^{+}$ are the updated model parameter components, $z_t^{-}$ are the perturbed model parameters before update, $x_t^{+}$ are the updated state components, and $x_t^{-}$ are the perturbed states before update. These updated members for t0 are populated into Pv, where they are applied into the SAC-SMA model to make v ensemble forecasts of streamflow for future time step t1. The ensemble forecasts (for streamflow) are used to determine the ensemble mean and its associated variance, which represent the background information for t1. At t1, the updated population from t0 is used as the seed population, where it is varied and evolved using the NSGA-II. A new evolved population Pv for t1 is determined, where it is again updated using the EnKF method. The above procedures are repeated for subsequent time steps to evolve previously updated populations of members, assimilate evolved members using the EnKF method, and update evolved members for future forecasts. As in the EDA, streamflow forecasts for the 10 day lead times are determined using the same procedure for estimating the background information. Further information on the EnKF method can be found in various sources (Evensen b; Houtekamer & Mitchell ; Moradkhani et al. ; Weerts & El-Serafy ; Clark et al. ; Komma et al. ; Thirel et al. ; Xie & Zhang ).

The ParetoPF procedure

The working procedure for the ParetoPF method is similar to that of the ParetoEnKF method, except that the ParetoPF uses a PF to assimilate an equally competitive set of members. The ParetoPF method uses NSGA-II to continuously evolve a population of members before assimilating the final evolved population that contains the Pareto-optimal set using the PF. That is, instead of assimilating randomly perturbed ensemble members, the ParetoPF first generates an equally competitive set of members before they are merged with perturbed observation data. The ParetoPF uses the PF only to update the final evolved members in the Pareto-optimal set, whereas the Pareto distribution is controlled by the NSGA-II in continuous evolution and natural selection of members. The computational procedure for the ParetoPF is shown in Figure 4; detailed descriptions are given below.

Beginning with the initial time t0, the ParetoPF method uses the NSGA-II to evolve a randomly generated population Pv. The EDA procedures described earlier in the EDA subsection are applied to determine the Pareto-optimal set using the evaluation objectives: bias, RMSE, and J. All members in the final population from which the Pareto-optimal set is determined represent ensemble members (with associated streamflows) to be assimilated using particle filtering. As in the ParetoEnKF, observed streamflows are perturbed according to Equation (4). The predicted streamflow ensemble $\hat{y}_t$ and the ensemble of perturbed observations $y_t$ are applied to determine the ensemble weight (w) in Equation (12). Note that n is the ensemble size, which is the same as v, the number of members in Pv. The weights are re-sampled using a residual re-sampling approach in Equation (13) (Lui & Chen ; Weerts & El-Serafy ; van Leeuwen ). The function fix(A) rounds the elements of A to the nearest integer toward zero. The re-sampling reduces the variance between ensemble weights such that low weighted ensemble members are discarded and replaced with high normalized weighted members (Moradkhani & Hsu ; Weerts & El-Serafy ; van Leeuwen ):

$$w = \frac{\exp\left[ -(0.5/\beta^{y}) (y_t - \hat{y}_t)^2 \right]}{\sum_{i=1}^{n} \exp\left[ -(0.5/\beta^{y}) (y_t - \hat{y}_t)^2 \right]} \tag{12}$$

$$w_r = \frac{n w - \mathrm{fix}(n w)}{n - \sum_{i=1}^{n} k_i}; \qquad k_i = \mathrm{fix}(n w_i) \tag{13}$$

The re-sampled weights $w_r$ are mapped to new indexed ensemble members according to their ensemble weights
Figure 4 | Computational procedures for a sequential assimilation in the ParetoPF: integration of Pareto-optimality into the PF.
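The selection step shared by the flowcharts, in which members are scored on the three objectives of Equations (5) to (7) and the non-dominated (Pareto-optimal) members are retained, can be illustrated with the short sketch below. It is only a sketch under assumptions: the helper names are hypothetical, the absolute value of the bias is taken so that all three objectives can be minimized, and a brute-force dominance check stands in for the full NSGA-II machinery (sorting, crowding, crossover and mutation), which is not reproduced here.

```python
import math


def bias(sim, obs):
    """Equation (5): mean difference between simulated and observed values."""
    return sum(s - o for s, o in zip(sim, obs)) / len(obs)


def rmse(sim, obs):
    """Equation (6): root mean square error."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def cost_j(sim, obs, background, var_b, var_o):
    """Equation (7): penalises departure from both background and observation."""
    return sum((s - b) ** 2 / var_b + (s - o) ** 2 / var_o
               for s, o, b in zip(sim, obs, background))


def dominates(a, b):
    """True if a is no worse than b on every objective and better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(objectives):
    """Indices of members that no other member dominates (minimisation)."""
    return [i for i, a in enumerate(objectives)
            if not any(dominates(b, a)
                       for j, b in enumerate(objectives) if j != i)]


# Four candidate members simulating one daily streamflow value (k = 1).
obs, background = [10.0], [8.0]
candidates = [[8.5], [9.0], [10.0], [12.0]]
objectives = [
    (abs(bias(sim, obs)),                   # assumption: |bias| is minimised
     rmse(sim, obs),
     cost_j(sim, obs, background, var_b=1.0, var_o=4.0))
    for sim in candidates
]
pareto = non_dominated(objectives)
print(pareto)
```

With these illustrative numbers the first three candidates trade off closeness to the observation against closeness to the background, while the fourth is worse on every objective and is dropped.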
following the mapping procedure in Moradkhani & Hsu (). The new indexes are applied to determine the re-sampled model parameters ($z_{t\text{-resample}}$), states ($x_{t\text{-resample}}$), and streamflow predictions ($\hat{y}_{t\text{-resample}}$). The model parameters are then perturbed using Equation (14). The posterior expectation (mean) for the ensemble streamflow is determined using Equation (15):

$$z_t^{+} = z_{t\text{-resample}} + \upsilon_{t-1}, \qquad \upsilon_{t-1} \sim N(0, \beta_{t-1}^{z}) \tag{14}$$

$$E(\hat{y}_t) = \sum_{i=1}^{N} \hat{y}_{t\text{-resample}} \tag{15}$$

where $\upsilon_{t-1}$ is the model parameter error with covariance $\beta_{t-1}^{z}$. Detailed information on the applied particle filtering procedure can be found in Moradkhani & Hsu () and other general sources (Gordon et al. ; Lui & Chen ; Bengtsson et al. ; Chen ; Weerts & El-Serafy ; Vossepoel & van Leeuwen ; Snyder et al. ; van Leeuwen ).

These updated members for t0 are populated into Pv and are applied into the SAC-SMA model to make v ensemble forecasts of streamflow for future time step t1. The ensemble forecasts (of streamflow) are used to determine the ensemble mean and its associated variance, which represent the background information for t1. At t1, the updated population from t0 is used as the seed population, where it is varied and evolved using the NSGA-II. A new evolved population Pv for t1 is determined, where it is again updated using particle filtering. The above procedures are repeated for subsequent time steps to evolve previously updated populations of members, assimilate evolved members using particle filtering, and update evolved members for future model forecasts. Note that, as in the ParetoEnKF, streamflow forecasts for the 10 day lead times are determined using the same procedure for estimating the background information.
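The weighting and residual re-sampling described for the ParetoPF (Equations (12) and (13)) can be sketched as below. This is an illustrative reading rather than the authors' code: the function names are hypothetical, the weight is taken as a Gaussian likelihood of the innovation, and `math.floor` plays the role of fix(), which is equivalent for the non-negative quantities involved.

```python
import math
import random


def gaussian_weights(y_obs, y_pred, obs_var):
    """Normalised likelihood weights for each member (cf. Equation (12))."""
    likelihood = [math.exp(-0.5 * (y_obs - yp) ** 2 / obs_var) for yp in y_pred]
    total = sum(likelihood)
    return [value / total for value in likelihood]


def residual_parts(weights):
    """Deterministic copy counts k_i and residual weights w_r (Equation (13))."""
    n = len(weights)
    copies = [math.floor(n * w) for w in weights]   # k_i = fix(n * w_i)
    remaining = n - sum(copies)
    if remaining == 0:
        return copies, [0.0] * n
    residual = [(n * w - k) / remaining for w, k in zip(weights, copies)]
    return copies, residual


def residual_resample(weights, rng):
    """Indices of the members kept after residual re-sampling."""
    n = len(weights)
    copies, residual = residual_parts(weights)
    indices = [i for i, k in enumerate(copies) for _ in range(k)]
    remaining = n - len(indices)
    if remaining > 0:
        # The leftover slots are drawn at random from the residual weights.
        indices += rng.choices(range(n), weights=residual, k=remaining)
    return indices


weights = [0.5, 0.25, 0.125, 0.125]      # illustrative normalised weights
copies, residual = residual_parts(weights)
indices = residual_resample(weights, random.Random(0))
print(copies, residual, indices)
```

High-weight members are replicated deterministically, and only the leftover slots are filled at random, which is what keeps the variance of the re-sampled weights low.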
RESULTS AND DISCUSSION
The three assimilation methods were run from 2007 to 2010, with 1,000 ensemble members for each time step. The 1,000 ensemble size was chosen to be large enough to accommodate the dynamics from all model states and parameters in Table 1. In the EDA, 40 members are evolved through 25 generations to make up for the 1,000 ensemble members, whereas the ParetoEnKF and ParetoPF methods merge the evolved population of members from which the resulting Pareto-optimal set is determined with perturbed streamflow observations. Consequently, the number of updated members is 20 (i.e. 2n = 40) for each of the three methods. The observation error for streamflow, which is time variant, is estimated as the hourly variance for each day of streamflow data. A time-variant model error is estimated adaptively from the ensemble members using the procedure for estimating the background error outlined in the subsection on EDA. Given that the assimilation was conducted sequentially at a daily time step, the RMSE in Equation (6) is used together with the cost function in Equation (7) to evaluate candidate members (where k = 1, in both equations).

It is noteworthy that the EDA is based on the NSGA-II procedure, so a standard crossover probability of 0.8 and a mutation probability of 1/r (where r is the number of variables) are used. The various updated ensemble members were applied to make streamflow forecasts for up to 10 days ahead, where each time step has 10 ensemble forecasts starting from 1 day, 2 day, up to 10 day lead times. The resulting streamflow forecasts for 1 day, 5 day and 10 day lead times are compared to the observed streamflows. The outputs for the updated ensemble members and their extended model forecasts for the three DA methods are presented and examined in the following subsections.

Convergence of model parameters for EDA updated members

It is useful to begin by demonstrating the improved parameter convergence obtained through the EDA procedure. The distributions of ensemble parameter values for the EDA, ParetoEnKF, and ParetoPF methods are compared in Figures 5–8. The parameter distributions illustrate that the EDA approach has an improved convergence pattern compared to both ParetoEnKF and ParetoPF. The clustering pattern in ParetoPF is closer to the convergence pattern of the EDA than that of the ParetoEnKF. The pattern of parameter convergence in the EDA procedure is due to its multiple evolution of ensemble members through continuous selection and variation (crossover and mutation) of competitive (non-dominated) members. That is, the selection of a subset of all evaluated members in the EDA plays a crucial role in enhancing parameter convergence.

The convergence of model parameters in the EDA is further examined through the distribution of model parameter values for the updated ensemble members across all assimilation time steps. The level of convergence of model parameters is examined through clustering analysis, where the persistence of cluster groups across all assimilation steps for each model parameter is evaluated. The clustering analysis is conducted on the ensemble parameter values, where the appropriate number of clusters was determined using the knee procedure in Thorndike (). The number of cluster groups examined when determining the appropriate number of clusters is variable, typically between four and eight. The cluster with the largest membership is determined, along with its coverage of the parameter space, the centroid, and the lower and upper bounds to represent the converged parameter space with the largest weight.

The largest membership cluster for each model parameter across all assimilation time steps is shown in Table 2. It is noteworthy that the parameter values are all re-scaled between zero and one before application in the clustering analysis. As a result, the values shown in this table represent the re-scaled values. The coverage represents the proportion of members in the largest membership cluster in relation to the total number of members across all assimilation time steps. The coverage therefore quantifies the weight of the cluster with the largest membership and accounts for variability of cluster memberships due to variable cluster groupings. The coverage representing the level of convergence for each model parameter across all assimilation time steps is shown in Figure 9 for Dundas and Highway 5 stations.

The convergence of model parameter values illustrated for the EDA output is significant. The coverage illustrates
Figure 5 | Distribution of model parameter values in response to changes across assimilation time steps for the three methods (EDA, ParetoEnKF and ParetoPF): model parameters 1–8. (a) Model parameters 1–4. (b) Model parameters 5–8.
the recurrence of cluster groups such that across all assimilation time steps, the uncertainty applied to precipitation was found to be between 0.166 and 0.499 of its re-scaled value, with a coverage of about 79%. About 29 model parameters out of 32 have converged to a coverage of about 80% or higher across all assimilation time steps for both Dundas and Highway 5 watersheds. The high level of convergence for the 29 model parameters means that their clustered intervals are consistently reliable across the assimilation time steps. The significance of these findings is that the convergence of parameter values across different observation/assimilation time steps is valuable in the retrieval of variables that are not explicitly observed. These illustrations show the potential of the EDA approach to examine the convergence of model parameters and their associated clusters through time in order to determine their relationships, sensitivities,
Figure 6 | Distribution of model parameter values in response to changes across assimilation time steps for the three methods (EDA, ParetoEnKF and ParetoPF): model parameters 9–16. (a) Model parameters 9–12. (b) Model parameters 13–16.
and their responses to changes in observation and forcing data.

Evaluation of streamflow assimilations

The updated ensemble estimates of streamflow for the three methods are compared in the following. A temporal comparison between the updated ensemble estimates (the analysis) and the observations is shown in Figure 10. The evaluation of the ensemble means in comparison to the observations is shown in Table 3 using the Nash–Sutcliffe efficiency (NSE) in Equation (16) and percent bias in Equation (17). The percent bias represents the proportion of the estimation which is biased such that a minimum value of zero indicates unbiased estimation, whereas values
Figure 7 | Distribution of model parameter values in response to changes across assimilation time steps for the three methods (EDA, ParetoEnKF and ParetoPF): model parameters 17–24. (a) Model parameters 17–20. (b) Model parameters 21–24.
greater than zero indicate the level of bias in the estimation:

$$\mathrm{NSE} = 1 - \left[ \frac{\sum_{t=1}^{m} (y_t - \hat{y}_t)^2}{\sum_{t=1}^{m} (y_t - \bar{y})^2} \right] \tag{16}$$

$$\text{percent bias} = 100 \times \frac{\sum_{t=1}^{m} |\hat{y}_t - y_t|}{\sum_{t=1}^{m} y_t} \tag{17}$$

where $y_t$ is the observed streamflow at time t, $\hat{y}_t$ is the estimated streamflow at time t, $\bar{y}$ is the mean of observed streamflow, and m is the number of data points.

The three methods produce similar streamflow estimations when compared to the observations. The assessment of estimated streamflows using evaluation measures is similar at both upstream and downstream stations for all three methods. For example, streamflow evaluations using the NSE are: 0.894 for EDA, 0.897 for
Figure 8 | Distribution of model parameter values in response to changes across assimilation time steps for the three methods (EDA, ParetoEnKF and ParetoPF): model parameters 25–32. (a) Model parameters 25–28. (b) Model parameters 29–32.
ParetoEnKF, and 0.900 for ParetoPF at Dundas gauge station. The similarity between the assimilations for the three methods has important implications. For example, the similarity suggests that one method can be easily replaced with another method with very little effect on the accuracy of the assimilated streamflow. Although the three methods have different computational pathways for assimilation, their ensemble means are comparable, and each method appears adequate to merge simulated streamflows with perturbed observations.

Evaluation of streamflow forecasts

The above results illustrate that improvements to the Pareto-optimal members based on the EnKF and PF methods are minimal and could be ignored at the assimilation stage. The nearly identical ensemble means from the three methods suggest that comparable updated members could be determined from different state and model parameterizations. Given these nearly identical updated estimates, a persistent question is the response of future streamflows to state and model
parameter updates from the three comparable sets of updated members. That is, since the updated members are similar for the three different methods, it is desirable to determine whether streamflow forecasts that are generated using state and parameter updates from the corresponding three methods are also the same. These questions and their implications are examined in this subsection.

A comparison between the observations and forecast streamflows for the three methods is shown in Figure 11 for Dundas at the downstream outlet. The evaluation measures for these forecasts at both upstream and downstream stations are presented in Table 3. The forecast streamflows from the updated members differ from the comparable outputs obtained at the assimilation stage in the subsection on evaluation of streamflow assimilations. The forecasts from the EDA method have higher accuracy than the forecasts from the ParetoEnKF and ParetoPF methods. The rate of decrease in forecast accuracy from 1 day through to 10 day is smaller for the EDA method when compared to the rate of decrease in accuracy for the ParetoEnKF and ParetoPF forecasts. That is, the accuracy of streamflow forecasts deteriorates quickly with increasing lead time for ParetoEnKF and ParetoPF methods, whereas the differences between forecasts from the EDA method are much smaller for most time steps. Between the three methods, the EDA method produces the highest accuracy for streamflow forecasts, whereas forecast accuracy is similar for both ParetoEnKF and ParetoPF methods.

Table 2 | Model parameters and their converged intervals represented by the largest membership clusters with a definition of their coverage, centroids and lower and upper bounds estimated for the Dundas station. The coverage is presented as a fraction where a maximum value of one represents a perfectly converged cluster and a value close to zero represents a sensitive cluster

Parameter   Centroid   Lower bound   Upper bound   Coverage
UZTWM       0.620      0.422         0.856         0.977
UZFWM       0.439      0.211         0.766         0.930
UZK         0.337      0.022         0.487         0.580
PCTIM       0.154      0.012         0.470         0.928
ADIMP       0.833      0.411         0.924         0.982
ZPERC       0.436      0.222         0.764         0.950
REXP        0.447      0.266         0.652         0.955
LZTWM       0.582      0.388         0.799         0.991
LZFSM       0.419      0.200         0.766         0.956
LZFPM       0.548      0.333         0.699         0.955
LZSK        0.801      0.640         0.940         0.880
LZPK        0.072      0.002         0.230         0.982
PFREE       0.232      0.011         0.436         0.996
RQ          0.903      0.770         1.000         0.485
DDF         0.438      0.210         0.592         0.963
SCF         0.444      0.222         0.548         0.881
TR          0.488      0.344         0.646         0.994
ATHORN      0.527      0.344         0.636         0.966
RCR         0.273      0.130         0.448         0.978
UZTWC       0.332      0.126         0.410         0.950
UZFWC       0.283      0.180         0.477         0.962
LZTWC       0.448      0.263         0.666         0.904
LZFSC       0.120      0.015         0.299         0.984
LZFPC       0.121      0.012         0.299         0.980
ADIMC       0.340      0.213         0.522         0.781
UHG1        0.145      0.014         0.288         0.984
UHG2        0.196      0.023         0.277         0.987
UHG3        0.271      0.142         0.377         0.983
SWE         0.346      0.216         0.577         0.974
EVAPO       0.439      0.222         0.599         0.604
PRECIP      0.394      0.166         0.499         0.786
TEMPR       0.403      0.288         0.599         0.810

These results emphasize the importance of continuous evaluation of the updated ensemble members for future time periods. For example, subsequent evaluation of the updated ensemble members for future time steps exposed accuracy differences in streamflow forecasts for the three methods that generate similar streamflow values at the assimilation stage. The rapid decrease in forecast accuracy for the ParetoEnKF and ParetoPF methods may illustrate a skewed association between the assimilated streamflows and their corresponding updates for state and model parameter components. In other words, state and model parameter updates that are performed based on the evolved ensemble members do not seem to improve model forecasts.

Discussion on design and performance of DA methods

The above sections have compared assimilation outputs for the three DA methods. The three methods produce similar assimilated streamflows, but their corresponding model forecasts for future time periods are different. The rationale for
Figure 9 | Percent convergence of model parameter spaces indicative of the coverage of the largest membership clusters. The coverage in this case is the ratio of the number of cluster members to the total number of members across all assimilation time steps. (a) Dundas station. (b) Highway 5 station.
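The coverage plotted here, together with the centroid and bounds reported in Table 2, can be computed from clustered parameter values along the following lines. This is a hedged sketch: the helper names are hypothetical, the cluster labels are assumed to come from a separate clustering step (the knee-based choice of the number of clusters is not reproduced), and the sample values are invented for illustration.

```python
from collections import Counter


def rescale(values):
    """Min-max re-scale parameter values to the interval [0, 1]."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def largest_cluster_summary(values, labels):
    """Coverage, centroid and bounds of the cluster with most members.

    values : re-scaled parameter values pooled over all assimilation steps
    labels : cluster label of each value (assumed supplied by a clustering step)
    """
    label, size = Counter(labels).most_common(1)[0]
    members = [v for v, lab in zip(values, labels) if lab == label]
    return {
        "coverage": size / len(values),   # share of all members in the cluster
        "centroid": sum(members) / len(members),
        "lower": min(members),
        "upper": max(members),
    }


# Five pooled values for one parameter, four of which fall in cluster 0.
values = [0.20, 0.30, 0.25, 0.90, 0.28]
labels = [0, 0, 0, 1, 0]
summary = largest_cluster_summary(values, labels)
print(summary)
```

A coverage near one indicates that most updated members fall in the same interval for that parameter, while a low coverage flags a parameter whose values remain spread over several clusters.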
continuous evaluation of model forecasts from the updated ensemble members was to validate the assimilation procedure, and to ensure that the merger between simulation and perturbed observation for any time step is evaluated for future time steps. The results show that all three DA methods produce comparable updated ensemble members, and that EDA has the highest forecast accuracy and is preferable to ParetoEnKF or ParetoPF methods for streamflow forecasting.

These findings have important implications for the design of DA procedures for streamflow forecasting. First, continuous evaluation of the updated ensemble members for future time steps is as important as the determination of the updated members. Evaluation of the updated members for future time steps would ensure that the assimilation does not skew the optimal merger between simulation and observation by distorting state and model parameterizations. Second, accuracy improvements gained through the assimilation of Pareto-optimal members for different time steps are minimal and can be ignored at the assimilation stage. This was exemplified by similarities between assimilated streamflows from the three methods. Third, state and model parameter updates performed on Pareto-optimal members do not increase the forecasting performance of these members. This was illustrated by the high accuracy for EDA streamflow forecasts compared to forecasts made from either the ParetoEnKF or ParetoPF methods. Finally, the findings illustrate the forecasting performance for continuously evolving members that encapsulate the memory of past model states and
measurement (i.e. the SAC-SMA model) errors in the form of simulation–observation dynamics. These properties have been shown to improve the performance of streamflow forecasting, and could facilitate real-time predictions and the initiation of models from unknown initial conditions.

Figure 10 | Comparison between observations and updated ensemble members for EDA, ParetoEnKF, and ParetoPF at Highway 5 and Dundas gauging stations.

Table 3 | Evaluation of the updated ensemble members, and their forecasting performance for DA methods

Station      Measure        ParetoEnKF   ParetoPF   EDA

Evaluation for updated members
Highway 5    NSE            0.949        0.941      0.947
             Percent bias   0.142        0.149      0.148
Dundas       NSE            0.897        0.900      0.894
             Percent bias   0.152        0.154      0.157

Evaluation for 1 day forecast (background information)
Highway 5    NSE            0.756        0.766      0.836
             Percent bias   0.387        0.373      0.277
Dundas       NSE            0.739        0.764      0.777
             Percent bias   0.332        0.296      0.281

Evaluation for 5 day forecast
Highway 5    NSE            0.581        0.589      0.754
             Percent bias   0.482        0.471      0.431
Dundas       NSE            0.490        0.507      0.725
             Percent bias   0.484        0.466      0.324

Evaluation for 10 day forecast
Highway 5    NSE            0.554        0.555      0.666
             Percent bias   0.495        0.485      0.463
Dundas       NSE            0.426        0.437      0.625
             Percent bias   0.517        0.501      0.420

The ensemble means are used to compute the evaluation measures

CONCLUSIONS

This study has illustrated the integration of Pareto-optimality into Kalman-type and PF-type assimilations. The study applied Pareto-optimality to obtain information on model state, improve model parameterizations, and better estimate measurement error. This information was, in turn, incorporated into the EnKF and PF methods to improve their forecasting performance. Comparative evaluation was conducted to examine forecasting performance for the three methods: the EDA, and the methods based on the integration of Pareto-optimality into the EnKF and PF methods. The three methods assimilate daily streamflow into the SAC-SMA model in the Spencer Creek watershed in southern Ontario, Canada. The resulting updated ensemble members were, in turn, applied to predict streamflow for up to 10 day
Figure 11 | Comparison between observations and 1 day, 5 day and 10 day streamflow forecasts for EDA, ParetoEnKF, and ParetoPF methods at Dundas gauging station.
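The two measures used to score the assimilated and forecast streamflows against observations (Equations (16) and (17)) can be written compactly as below. The function names and the four-value series are illustrative assumptions, not data from the study.

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency, Equation (16); 1 indicates a perfect match."""
    mean_obs = sum(obs) / len(obs)
    error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - error / spread


def percent_bias(obs, sim):
    """Percent bias, Equation (17); 0 indicates an unbiased estimate."""
    return 100 * sum(abs(s - o) for o, s in zip(obs, sim)) / sum(obs)


# Illustrative observed and estimated streamflow series.
observed = [1.0, 2.0, 3.0, 4.0]
estimated = [1.0, 2.0, 3.0, 5.0]
score_nse = nse(observed, estimated)
score_pbias = percent_bias(observed, estimated)
print(score_nse, score_pbias)
```

Applying the same two functions to the ensemble-mean forecasts at each lead time gives one (NSE, percent bias) pair per method and station, which is the layout used in Table 3.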
lead times where forecasts for 1 day, 5 day and 10 day lead times were compared to the observation data.

The results show that the optimal merger between simulations and observations for the three DA methods generates similar ensemble estimates. However, a subsequent evaluation of the updated members for future time periods yields different forecasting performance for the three methods. The ParetoEnKF and ParetoPF methods have similar forecasting performance, whereas the EDA method has the highest forecasting accuracy and could be the desired method for streamflow forecasting in the SAC-SMA model. The high performance of the EDA method illustrates that the continuous evolution and subsequent merging of Pareto-optimal members with perturbed observations provides an appealing framework to enhance the accuracy of streamflow forecasting. Additionally, the results illustrated the capability of the EDA approach to estimate convergent model parameter values and to identify persistent, as well as sensitive, model parameter spaces. It was found that the additional update steps from the EnKF method (for ParetoEnKF) and the PF method (for ParetoPF) generally degrade the convergence of model parameters and do not improve the overall accuracy of streamflow estimation.

While most studies emphasize assimilation results for DA methods, this study has illustrated the importance of a continuous evaluation of the updated ensemble members. The continuous evaluation of the assimilation includes a comparison of the updated ensemble members and their associated model forecasts to the observations. The accuracy of model forecasts from the updated members, which
typifies forecasting performance (and, indirectly, robustness) of the assimilation procedure, is equally as important as the estimation of the updated members themselves. That is, the continuous and comparative evaluation of the updated members and their associated model forecasts is an important measure for assessing the forecasting performance of DA methods. Future studies should compare the updated members from the EDA method to the standard EnKF and PF methods.
First received 5 May 2012; accepted in revised form 10 May 2013. Available online 5 June 2013
© IWA Publishing 2014 Journal of Hydroinformatics
|
16.1
|
2014
Encapsulation of parametric uncertainty statistics by various predictive machine learning models: MLUE method Durga L. Shrestha, Nagendra Kayastha, Dimitri Solomatine and Roland Price
Durga L. Shrestha (corresponding author), CSIRO Land and Water, Highett, Australia. E-mail: durgalal.shrestha@csiro.au
Nagendra Kayastha, Dimitri Solomatine, Roland Price, UNESCO-IHE Institute for Water Education, Delft, The Netherlands
Dimitri Solomatine, Roland Price, Water Resources Section, Delft University of Technology, Delft, The Netherlands

ABSTRACT Monte Carlo simulation-based uncertainty analysis techniques have been applied successfully in hydrology for quantification of the model output uncertainty. They are flexible, conceptually simple and straightforward, but provide only average measures of uncertainty based on past data. However, if one needs to estimate uncertainty of a model in a particular hydro-meteorological situation in real time application of complex models, Monte Carlo simulation becomes impractical because of the large number of model runs required. This paper presents a novel approach to encapsulating and predicting parameter uncertainty of hydrological models using machine learning techniques. The generalised likelihood uncertainty estimation method (a version of the Monte Carlo method) is first used to assess the parameter uncertainty of a hydrological model, and then the generated data are used to train three machine learning models. Inputs to these models are specially identified representative variables. The trained models are then employed to predict the model output uncertainty which is specific for the new input data. This method has been applied to two contrasting catchments. The experimental results demonstrate that the machine learning models are quite accurate. An important advantage of the proposed method is its efficiency, allowing for assessing uncertainty of complex models in real time.
Key words | hydrological modelling, machine learning, MLUE, Monte Carlo, uncertainty analysis
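The workflow summarised in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the one-parameter `toy_model` stands in for the hydrological model, a uniform sample of its parameter `k` stands in for the Monte Carlo (GLUE) runs with equal weights, and an ordinary least-squares line stands in for the machine learning models. The sketch only shows the division of labour: the Monte Carlo runs are done once off-line, and the trained regression then supplies the bounds for new input without further model runs.

```python
import random


def toy_model(rain, k):
    """Hypothetical one-parameter runoff model (assumption): runoff = k * rain."""
    return k * rain


def quantile(values, p):
    """Empirical p-quantile, linear interpolation between order statistics."""
    v = sorted(values)
    pos = p * (len(v) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(v) - 1)
    return v[lo] + (pos - lo) * (v[hi] - v[lo])


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x (stand-in for the ML model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def train_mlue(rain_series, k_samples, k_opt, alpha=0.1):
    """Off-line step: run the MC simulations once, express the alpha/2 and
    1 - alpha/2 quantiles relative to the calibrated run, fit one line each."""
    lower, upper = [], []
    for rain in rain_series:
        sims = [toy_model(rain, k) for k in k_samples]
        y_opt = toy_model(rain, k_opt)
        lower.append(quantile(sims, alpha / 2) - y_opt)
        upper.append(quantile(sims, 1 - alpha / 2) - y_opt)
    return fit_line(rain_series, lower), fit_line(rain_series, upper)


def predict_bounds(models, rain, k_opt):
    """On-line step: one calibrated model run plus the trained regressions."""
    (a_lo, b_lo), (a_up, b_up) = models
    y_opt = toy_model(rain, k_opt)
    return y_opt + a_lo + b_lo * rain, y_opt + a_up + b_up * rain


random.seed(0)
k_samples = [random.uniform(0.2, 0.8) for _ in range(500)]  # sampled parameters
rain_series = [float(r) for r in range(1, 21)]              # past forcing data
models = train_mlue(rain_series, k_samples, k_opt=0.5)
low, high = predict_bounds(models, rain=25.0, k_opt=0.5)    # unseen input
print(low, high)
```

In this toy setting the quantiles are exactly linear in the input, so a straight line reproduces them; for a real hydrological model the relationship would generally be non-linear, which is presumably why more flexible learners are used.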
doi: 10.2166/hydro.2013.242

INTRODUCTION

Hydrological models, in particular rainfall-runoff models, are simplified representations of reality and aggregate the complex, spatially and temporally distributed physical processes through relatively simple mathematical equations with parameters. The parameters of the rainfall-runoff models can be estimated in two ways (Johnston & Pilgrim ). First, they can be estimated from the available knowledge or measurements of the physical process, provided the model parameters realistically represent the measurable physical process. In the second approach, parameter values are estimated by calibration on the basis of the input and output measurements in situations when the parameters do not represent directly measurable entities or when it is too costly to measure them in the field. Conceptual rainfall-runoff models usually contain several parameters, which cannot be directly measured. Manual adjustment of the parameter values is labour intensive and its success is strongly dependent on the experience of the modeller. In the last two decades, a number of automated routines have been suggested (see e.g. Duan et al. ; Yapo et al. ; Solomatine ; Madsen ; Vrugt et al. ).

While considerable attention has been given to the development of calibration methods which aim to find a single best or Pareto set of values for the parameter vector, a realistic estimation of parameter uncertainty received
special attention over the last few years. It is now being broadly recognised that proper consideration of uncertainty in hydrologic predictions is essential for purposes of both research and operational modelling (Wagener & Gupta ). The value of hydrologic prediction for water resources-related decision-making processes is limited if reasonable estimates of the corresponding predictive uncertainty are not provided (Georgakakos et al. ). The research community has done quite a great deal in moving towards the recognition of the necessity of complementing point forecasts of decision variables by the uncertainty estimates, and nowadays it is widely recognised that along the modelling per se, there is a need to (i) understand and identify sources of uncertainty, (ii) quantify uncertainty, (iii) evaluate the propagation of uncertainty through the models, and (iv) find means to reduce uncertainty. Incorporating uncertainty into deterministic forecasts helps to enhance the reliability and credibility of the model outputs.

One may observe a significant proliferation of uncertainty analysis methods published in the academic literature, trying to provide meaningful uncertainty bounds of the model predictions. Pappenberger et al. () provide a decision tree to find the appropriate method for a given situation. However, the methods to estimate and propagate this uncertainty have so far been limited in their ability to distinguish between different sources of uncertainty and in the use of the retrieved information to improve the model structure analysed. These methods range from analytical and approximation methods (see e.g. Tung ) to Monte Carlo (MC) sampling-based methods (e.g. Beven & Binley ; Kuczera & Parent ; Thiemann et al. ) with the use of Bayesian approaches to determine the posterior distributions; methods based on the analysis of model errors (e.g. Montanari & Brath ); machine learning methods (Shrestha & Solomatine ; Solomatine & Shrestha ; Shrestha et al. ), and methods based on fuzzy set theory (see e.g. Maskey et al. ).

Due to complexities, or even impossibility of using analytical methods to propagate uncertainty from parameters to outputs for complex models, MC-based (sampling) techniques have been widely applied in studying uncertainty of hydrological models. A version of the MC simulation method was introduced under the term ‘generalised likelihood uncertainty estimation’ (GLUE) by Beven & Binley (). GLUE is one of the popular methods for analysing parameter uncertainty in hydrological modelling and has been widely used over the past 15 years to analyse and estimate predictive uncertainty, particularly in hydrological applications (see e.g. Freer et al. ; Beven & Freer ; Montanari ). Users of the GLUE (and actually of any MC method in general) are attracted by its simple understandable ideas, relative ease of implementation and use, and its ability to handle different error structures and models without major modifications to the method itself.

Despite its popularity, there are theoretical and practical issues related with the GLUE method reported in the literature. For instance, Mantovan & Todini () argue that GLUE is inconsistent with the Bayesian inference processes such that it leads to an overestimation of uncertainty, both for the parameter uncertainty estimation and the predictive uncertainty estimation. For the account of different views at the methodological correctness of GLUE, readers are referred to the citation above and the subsequent discussions in the Journal of Hydrology in 2007 and 2008, and to the papers by Stedinger et al. () and Vrugt et al. ().

Since MC-based methods require a large number of samples (or model runs), their applicability is sometimes limited to simple models. In the case of computationally intensive models, the time and resources required by these methods could be prohibitively expensive. Alternative approximation methods have been developed (e.g. moment propagation techniques), which under certain assumptions are able to calculate directly the first and second moments without the application of MC simulation (see e.g. Rosenblueth ; Harr ; Melching ). A number of methods allow for reducing the number of MC simulation runs, for instance, Latin hypercube sampling (see e.g. McKay et al. ) but they may fail to provide reliable estimates of uncertainty.

For models with a large number of parameters, the sample size from the respective parameter distributions must be very large in order to achieve a reliable estimate of uncertainties (Kuczera & Parent ) (it is worth mentioning that this is a problem for all methods based on sampling and multiple model runs). One of the ways to address the problem of computational complexity in optimisation, random search or MC simulation, is to use a limited number of samples of parameter vectors and run the
hydrologic or hydraulic model in order to generate a data set which is then used as the calibration set for building an approximating regression model. This latter (fast) model (also called a meta-model, or surrogate model) is then used instead of the (slower) original model; such an approach is widely employed in industrial design optimisation, and in water resources problems was used, for example, by Solomatine & Torres () in model-based optimisation, and Khu & Werner () in reducing the number of MC simulations.

Yet another approach is to use more efficient sampling strategies, as was done by Blasone et al. () who used adaptive Markov chain Monte Carlo sampling within the GLUE methodology to improve the sampling of the high probability density region of the parameter space. Other examples of this approach are the delayed rejection adaptive Metropolis method (Haario et al. ), and the differential evolution adaptive Metropolis method, DREAM (Vrugt et al. ).

One of the practical observations concerning the GLUE method is that in many cases the percentage of observations falling within the prediction limits provided by GLUE is smaller than the given confidence level used to produce these prediction limits (see e.g. Montanari ). Xiong & O’Connor () modified the GLUE method to resolve this issue, so that the prediction limits would envelop the observations better.

There is, however, an issue which is not widely discussed in the literature, and this is the assessment of model uncertainty when it is used in operation, i.e. when the new input data are fed into the model, in other words, uncertainty prediction. The MC simulation provides only the averaged uncertainty estimates based on the past data, but in real time forecasting situations there may be simply little time to perform the MC simulations for the new input data in order to assess the model uncertainty for a new situation.

Recently, we proposed to use an artificial neural network (ANN) to emulate the MC simulation results obtained for the past data, and named this method MLUE – machine learning in parameter uncertainty estimation (Shrestha et al. ). The idea of this method is to use the data from MC simulations to train a statistical or machine learning model to (with specially selected inputs) predict the quantiles of the model error distribution. The MLUE method needs only a single set of MC simulations in off-line mode and allows one to predict the uncertainty bounds of the model prediction when the new input data are observed and fed into hydrological models (whereas the standard MC approach requires new multiple model runs for each new input).

In comparison with the previous study of Shrestha et al. (), the main contributions of this study are to (i) provide an extensive review of the state of the art of the uncertainty analysis methods used in hydrology, (ii) generalise the methodology and extend it further to approximate the probability distribution function of the model outputs, (iii) apply the methodology to a different study area, (iv) employ and compare different machine learning models to emulate MC simulation results, and (v) compare the methodology with yet another uncertainty analysis method. The HBV (Hydrologiska Byråns Vattenbalansavdelning) hydrological models of the Brue catchment in UK and Bagmati catchment in Nepal are used as case studies.

MACHINE LEARNING METHODS

In this section, we introduce briefly the main notions of machine learning and the methods used. The major focus of machine learning is to automatically produce (induce) predictive models from data. A machine learning algorithm estimates an unknown mapping (or dependency) between the inputs (predictors) and outputs (predictands) of a physical system from the available data (Mitchell ). As such a dependency (model) is discovered, it can be used to predict the future outputs of the system from the known input values. Machine learning techniques, based on observed data $D = (X, y) = \{x_t, y_t\}$, $t = 1, 2, \ldots, N$, try to identify (learn) the target function $f(x_t, w)$ describing how the real system behaves, where $X$ is the matrix ($x$, vector) of the input data, $y$ is the vector of the system’s response, $N$ is the number of data, and $w$ is the parameter vector of the function. Learning (or ‘training’) here is the process of minimising the difference between observed response $y$ and model response $\hat{y}$ through an optimisation procedure. Such a model $f$ is often called a ‘data-driven model’. For a recent overview of data-driven modelling in water-related issues, see for example Solomatine & Ostfeld (), Maier et al. (), and Elshorbagy et al.
(a, b). A review of the application of machine learning techniques to estimate the uncertainty of rainfall-runoff models can be found in Shrestha & Solomatine () and Shrestha et al. ().

Three machine learning methods, namely ANN, model tree, and locally weighted regression (LWR), are used in this study. Among them, ANN is the most popular technique and has been extensively used in hydrological modelling over the past 15 years (see for example early papers by Minns & Hall (), Maier & Dandy (), Abrahart & See (), Govindaraju & Rao (), Dawson & Wilby () and Dibike & Solomatine ()). The following sections present a brief overview of the other two methods which are less known in the water and environmental modelling community.

Model trees

A model tree (MT) is a hierarchical (or tree-like) modular model which has splitting rules in non-terminal nodes and linear regression functions at the leaves of the tree. In fact, it is a piece-wise linear regression model. In the mid 1980s, the Australian researcher Dr J. Ross Quinlan suggested the so-called M5 algorithm to build MT (Witten & Frank ); this is an iterative scheme that progressively splits the examples in the space of inputs $\{x_1, x_2, \ldots, x_n\}$ using the criterion $x_i < A$, where $i$ and $A$ are the values chosen at each iteration according to the ‘splitting criterion’. This criterion is based on the standard deviation of the output values (in rainfall-runoff models this is runoff) in the resulting subsets, which is used as a measure of the possible regression model error if it is built for this subset. All values of $i$ and $A$ are examined, and the M5 algorithm performs the splits that ensure a small standard deviation in the resulting subsets; the splitting iterations continue trying to perform the best possible split of each of the resulting subsets. At a certain moment splits are stopped, and linear regression models are built for each of the resulting subsets. The splitting procedure can be presented as a hierarchy, or a tree, where the splitting rules are in the intermediate nodes and the linear models are associated with the tree leaves. MT can tackle tasks with very high dimensionality, up to hundreds of variables. Compared to other machine learning techniques, MT learning is fast and the results are interpretable. An example of using MT in the role of a data-driven rainfall-runoff model can be found in Solomatine & Dulal ().

Locally weighted regression

LWR is a method that builds a regression model selecting only a limited number of examples close to the vector of input $x_q$ (often called a query vector). The selected examples are assigned weights according to their distance to the query vector, and regression equations are generated using the weighted data. The word ‘local’ in ‘locally weighted regression’ means that the function is approximated based on data in the locality of the query vector, and it is ‘weighted’ because the contribution of each training example is weighted by its distance from the query vector. The regression function $f$ built for the neighbourhood of the query vector $x_q$ can be a linear or non-linear function, an ANN, etc. Various distance-based weighting schemes can be employed (given in Appendix A, available online at http://www.iwaponline.com/jh/016/242.pdf). For a detailed description of the LWR method, the readers are referred to Aha et al. (), and its application in rainfall-runoff modelling is reported in Solomatine et al. ().

METHODOLOGY

The original version of the main ideas of the MLUE method can be found in Shrestha et al. () (in open access); here we present only a brief description of it, however in a more formalised and generalised fashion. The basic idea is to estimate the uncertainty of the hydrological model under the following assumptions. First, that the model uncertainty is different in various hydro-meteorological conditions, and depends on the corresponding forcing input and the model states (e.g. rainfall, antecedent rainfall, soil moisture, etc.). Second, that the uncertainties associated with the prediction of the hydrological variables such as runoff in similar hydro-meteorological conditions are also similar. By ‘hydrological conditions’, we mean a vector of variables representing such conditions – the combination of the particular values of the input and state variables (possibly lagged and transformed), which are seen as the driving forces
generating runoff. This assumption is quite natural: e.g. typically prediction error (and hence uncertainty) is higher in case of peak flows during extreme events compared to the low flows – however, the proper statistical analysis to support the validity of this assumption is still to be done.

Figure 1 | Schematic diagram of the MLUE method.

The flow chart of the MLUE methodology is presented in Figure 1. Let us assume that $S$ various vectors of parameters or inputs are sampled and for each of them the hydrological model $M$ is run generating a time series of the model output $\hat{y}$. The results are presented in the matrix form $Y = \{\hat{y}_{t,s}\}$, where $t = 1, \ldots, N$, $s = 1, \ldots, S$, $N$ is the number of time steps, and $S$ is the number of simulations. Note that each row of the matrix $Y$ corresponds to the particular forcing vector $x'_t$, i.e.

$$\{\hat{y}_{t,1}, \ldots, \hat{y}_{t,S}\} = \{M(x'_t, \theta_1), \ldots, M(x'_t, \theta_S)\} \qquad (1)$$

where $\theta$ is the parameter vector of the model $M$. Similarly, each column of matrix $Y$, i.e. $\{\hat{y}_{1,s}, \ldots, \hat{y}_{N,s}\}^T$, is one realisation of MC simulations corresponding to the parameter set $\theta_s$. Note that Equation (1) does not represent predictive uncertainty $P_t(y \mid \hat{y})$, which is the uncertainty related to the actual value given the model predictions and all the information and knowledge available up to the present (Mantovan & Todini ; Todini ). Rather, by and large it represents uncertainty of the model predictions due to the parameter
uncertainty, i.e. $P_t(\hat{y} \mid x_t, \theta)$. Estimating quantiles of the distribution of $\hat{y}$ or probability density $P_t(\hat{y} \mid x_t, \theta)$ is not always practical in real time application (e.g. for computationally expensive environmental models). However, we can approximate $P_t(\hat{y} \mid x_t, \theta)$ by estimating its quantiles using the MLUE method. Our intention is to build a regression (machine learning) model $U$ which is relatively efficient (fast) and can encapsulate these uncertainty results in the following form:

$$\text{statistical properties } z \text{ of } \hat{y}_t = U(x_t) \qquad (2)$$

where $z = \{z_1, \ldots, z_K\}$ is a set of desired statistical properties; $x$ is the input vector of the model $U$ which is constructed from the forcing input variables $x'$, model state $s$ and possibly model output $\hat{y}$ (all possibly combined, transformed and/or lagged). A way to construct the input space $x$ is described in the next section. To characterise the uncertainty of the model $M$ prediction, the following uncertainty descriptors can be considered.

1. The prediction variance $\sigma_t^2(\hat{y}_t)$

$$\sigma_t^2(\hat{y}_t) = \frac{1}{S-1} \sum_{s=1}^{S} \left(\hat{y}_{t,s} - \bar{y}_t\right)^2 \qquad (3)$$

where $\bar{y}_t$ is the mean of MC realisations at the time step $t$.

2. The prediction quantile $Q'_t(p)$ of $\hat{y}_t$ corresponding to the $p$th [0, 1] quantile

$$P\left(\hat{y}_t < Q'_t(p)\right) = \sum_{s=1}^{S} w_s \,\big|\, \hat{y}_{t,s} < Q'_t(p) \qquad (4)$$

where $w_s$ is the weight given to the model output at simulation $s$, and $\hat{y}_{t,s}$ is the value of model output at the time $t$ simulated by the model $M(x, \theta_s)$. The use of weights is assumed in case of using the GLUE framework.

3. The conditional prediction quantile $Q_t(p)$ corresponding to the $p$th quantile

$$Q_t(p) = Q'_t(p) - \hat{y}_t^{opt} \qquad (5)$$

where $\hat{y}_t^{opt}$ is the output of the calibrated (optimal) model. Note that the quantiles $Q_t(p)$ obtained in this way are conditional on the model structure, inputs and other parameters (e.g. in case of using the GLUE framework this is the likelihood weight vector $w_s$).

4. The prediction intervals $[PI_t^L(\alpha), PI_t^U(\alpha)]$ for the given confidence level of $1 - \alpha$ ($0 < \alpha < 1$)

$$PI_t^L(\alpha) = Q_t(\alpha/2), \qquad PI_t^U(\alpha) = Q_t(1 - \alpha/2) \qquad (6)$$

where $PI_t^L(\alpha)$ and $PI_t^U(\alpha)$ are the differences between the model output and the lower and upper bounds of the prediction intervals (PI) respectively, corresponding to the $1 - \alpha$ confidence level.

If $U$ in Equation (2) is treated as a quantile, the general equation for calculating the conditional prediction quantile (Equation (5)) can be presented as

$$Q_t(p) = U(x_t) + \xi \qquad (7)$$

where $\xi$ is the error between the target quantile and the predicted quantile by the machine learning model. In particular, the two quantiles that represent the bounds of the PI (Equation (6)) can be calculated as follows:

$$PI_t^L(\alpha) = U_L(x_t) + \xi_L, \qquad PI_t^U(\alpha) = U_U(x_t) + \xi_U \qquad (8)$$

Since these prediction quantiles are derived from the current value of the model output (Equation (5)), the general model for the predictive quantile can be presented as

$$Q'_t(p) = U(x_t) + \hat{y}_t^{opt} \qquad (9)$$

In particular, the upper and lower bounds of the PI of the model output are given by

$$PL_t^L = U_L(x_t) + \hat{y}_t^{opt}, \qquad PL_t^U = U_U(x_t) + \hat{y}_t^{opt} \qquad (10)$$

where $U_L$ and $U_U$ are the machine learning models for the lower and upper bounds of the PIs, respectively. It is worthwhile to mention that Equation (10) is valid for the
uncertainty descriptors in Equation (6) and it is assumed that there is an optimal (calibrated) model $M$.

Model $U$, after being trained on the historical calibration data (generated by MC simulations), encapsulates the underlying dynamics of the uncertainty descriptors of the MC simulations and maps the input (or more precisely, vectors in space $x$) to these descriptors. The model $U$ can be of various types, from linear to non-linear regression models such as an ANN. The choice of model depends on the complexity of the problem to be handled and the availability of data. Once the model $U$ is trained on the calibration data, it can be employed in operation to estimate the uncertainty descriptors such as quantiles for the new unseen input data vectors.

SELECTION OF INPUT VARIABLES FOR THE UNCERTAINTY MODEL

Selection of appropriate variables to serve as model inputs for the uncertainty model $U$ is extremely important as they should be relevant for the particular modelling exercise and the type of the process model $M$ and its inputs. For this, the domain (expert) knowledge and analysis of causal relationship between inputs and outputs should be used in combination. The following variables (or their combinations) of the process model $M$ are considered as the candidates for being the input variables for model $U$: (i) input variables; (ii) state variables; (iii) outputs; (iv) time derivatives (rate of change) of the input data and state variables; (v) lagged variables of input, state and observed output; and (vi) other data from the physical system that may be relevant to the uncertainty descriptors. Since the nature of models $M$ and $U$ is very different, analysis techniques such as linear correlation or average mutual information between the uncertainty descriptors and the input data listed above may help in choosing the relevant input variables. Based on the domain knowledge and analysis of causal relationships, several structures of input data can be tested to select the optimal input data structure.

For example, if a model $M$ is a conceptual hydrological model, it would typically use rainfall ($R_t$) and evapotranspiration ($E_t$) as input variables to simulate the output variable runoff ($Y_t$). However, the uncertainty model $U$, whose aim is to predict the error distribution of the simulated runoff, may be trained with the possible combination of rainfall and evapotranspiration (or effective rainfall), their several past (lagged) values, the lagged values of runoff, and, possibly, their derivatives and/or combinations.

VERIFICATION

The uncertainty model $U$ can be validated in two ways: (i) measuring its predictive capability in approximating the uncertainty descriptors of the realisations of MC simulations; and (ii) measuring the ‘quality’ of representing uncertainty by using some indices.

Two performance measures, the coefficient of correlation (CoC) and the root mean square error (RMSE), are widely used to measure the predictive capability of models, and they can be employed for the uncertainty model as well. Besides these numerical measures, graphical plots such as scatter and time series plots of the uncertainty descriptors obtained from the MC simulations and their predicted values are used to judge the performance of the uncertainty model $U$.

For assessing the quality of model $U$, we use two measures (Shrestha & Solomatine ). Model $U$ is considered to be good if the PI coverage probability and mean PI calculated for $U$ are close to those calculated for the MC simulation data which are used to train $U$.

1. PI coverage probability (PICP). It measures the percentage of observations falling inside the PI and ideally should be equal to the confidence level used to generate these intervals. It is an indication of the quality of model $U$. PICP is given by:

$$\mathrm{PICP} = \frac{1}{N} \sum_{t=1}^{N} C_t, \quad \text{with } C_t = \begin{cases} 1, & PL_t^L \le y_t \le PL_t^U \\ 0, & \text{otherwise} \end{cases} \qquad (11)$$

where $y_t$ is the observed model output at the time $t$.

2. Mean prediction interval (MPI). It measures the average width of the PIs (it gives an indication of how large the
102
D. L. Shrestha et al.
|
Encapsulation of parametric uncertainty statistics: MLUE method
N 1X L (PLU t PLt ) N t¼1
|
16.1
|
2014
STUDY AREA
uncertainty is) and given by: MPI ¼
Journal of Hydroinformatics
(12)
The MLUE approach has been tested to two contrasting catchments: Brue and Bagmati. The Brue catchment is located in South West of England (Figure 2). It has a drai-
Besides these uncertainty statistics, a visual inspection
nage area of 135 km2 with the average annual rainfall of
of the plot of uncertainty bounds and of the observed
867 mm and the average river flow of 1.92 m3 s 1 (measured
model output can additionally provide significant infor-
in a period from 1961 to 1990). The hourly rainfall, dis-
mation about how effective the uncertainty model is in
charge, and the weather data (temperature, wind, solar
enclosing the observed model outputs along the different
radiation, etc.) are computed from the 15 minutes resolution
input regimes (e.g. low, medium or high flows in hydrology).
data which are available from a period 1993 to 2000.
More detailed description of the performance measures can
The catchment average rainfall data are used in the
be found in Shrestha & Solomatine () and Shrestha
study. The hourly potential evapotranspiration is computed
et al. ().
using the modified Penman method recommended by FAO (Allen et al. ). One year hourly data from June 24 1994 to June 24 1995 is selected for calibration of the
HYDROLOGICAL MODEL
HBV model and data from June 25 1995 to May 31 1996 for the verification (testing) of the HBV model.
A simplified version of HBV model (Bergström ) was
The Bagmati catchment is located in the central moun-
used. This is a lumped conceptual hydrological model
tainous region of Nepal (Figure 3). Compared to the Brue,
which includes conceptual numerical descriptions of the
the size of the Bagmati catchment is bigger, the length of
hydrological processes at catchment scale. The model com-
the data is larger, the temporal resolution of the data is
prises subroutines for snow accumulation and melt, soil
coarse (daily), and the quality of the data is comparatively
moisture accounting procedure, routines for runoff gener-
poorer. It encompasses nearly 3,700 km2 within Nepal and
ation, and a simple routing procedure. The model has 13
reaches the Ganges River in India. The catchment area
parameters; however only nine parameters (see Table 1)
draining to the gauging station at Pandheradobhan is
are effective when there is no snowfall.
about 2,900 km2. Two thousand daily records from 1 January
Table 1 | Ranges and the optimal values of the HBV model parameters

| Parameter | Description and unit | Brue: Range | Brue: Value | Bagmati: Range | Bagmati: Value |
|---|---|---|---|---|---|
| FC | Maximum soil moisture content (L) | 100–300 | 160.335 | 50–500 | 450 |
| LP | Limit for potential evapotranspiration (-) | 0.5–0.99 | 0.527 | 0.3–1 | 0.90 |
| ALFA | Response box parameter (-) | 0–4 | 1.54 | 0–4 | 0.1339 |
| BETA | Exponential parameter in soil routine (-) | 0.9–2 | 1.963 | 1–6 | 1.0604 |
| K | Recession coefficient for upper tank (/T) | 0.0005–0.1 | 0.001 | 0.05–0.5 | 0.3 |
| K4 | Recession coefficient for lower tank (/T) | 0.0001–0.005 | 0.004 | 0.01–0.5 | 0.04664 |
| PERC | Maximum flow from upper to lower tank (L/T) | 0.01–0.09 | 0.089 | 0–8 | 7.5 |
| CFLUX | Maximum value of capillary flow (L/T) | 0.01–0.05 | 0.0038 | 0–1 | 0.0004 |
| MAXBAS | Transfer function parameter (T) | 8–15 | 12 | 1–3 | 2.02 |

Note: The uniform ranges of parameters are used both for calibrating the HBV model, and for analysis of the parameter uncertainty of the HBV model.
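As an illustration of how the uniform ranges in Table 1 can drive the MC sampling, the following minimal sketch draws independent uniform samples for the nine Brue parameters. The function name `sample_parameter_sets` and the fixed seed are assumptions made for this example; the sketch is not the authors' implementation.

```python
import random

# Uniform sampling ranges (lower, upper) for the Brue catchment, as listed in Table 1.
BRUE_RANGES = {
    "FC": (100.0, 300.0),
    "LP": (0.5, 0.99),
    "ALFA": (0.0, 4.0),
    "BETA": (0.9, 2.0),
    "K": (0.0005, 0.1),
    "K4": (0.0001, 0.005),
    "PERC": (0.01, 0.09),
    "CFLUX": (0.01, 0.05),
    "MAXBAS": (8.0, 15.0),
}


def sample_parameter_sets(ranges, n_samples, seed=0):
    """Draw n_samples parameter sets (hypothetical helper).

    Each parameter is sampled independently and uniformly within its range,
    i.e. no prior knowledge other than the feasible range is used.
    """
    rng = random.Random(seed)
    return [
        {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        for _ in range(n_samples)
    ]


if __name__ == "__main__":
    for parameter_set in sample_parameter_sets(BRUE_RANGES, n_samples=3):
        print(parameter_set)
```

Each sampled set would then be passed to the process model; the behavioural/non-behavioural screening happens afterwards, on the simulated series.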
1988 to 22 June 1993 are selected for calibration of the process model (HBV hydrological model) and data from 23 June 1993 to 31 December 1995 are used for the verification of the process model. The first two months of calibration data are used as the warming-up period and hence excluded from the study. In separating the 8 years of data into calibration and verification sets, we follow the previous study of Solomatine & Shrestha ().

Figure 2 | The Brue catchment showing dense rain gauges network (reproduced from Shrestha & Solomatine (2008) with permission from the International Association for Hydraulic Research). The horizontal and vertical axes refer to the easting and northing in British national grid reference co-ordinates. Circles denote the rainfall stations and triangles denote the discharge gauging stations. The location of the Brue catchment (solid circle) in the map of UK is shown in the inset.

Figure 3 | Location map of the Bagmati catchment considered in this study. Discharge measured at Pandheradobhan is used for the analysis (adopted from Solomatine et al. (2008)).

EXPERIMENTAL SETUP

Uncertainty analysis

Hydrological models are calibrated by using adaptive cluster covering (Solomatine ), an efficient randomised search method implemented in GLOBE software. The GLUE method is used in uncertainty analysis because it has now been widely used for uncertainty estimation in a variety of models of complex environmental systems (we do not discuss here how much it does (or does not) follow the Bayesian framework). No model is perfect (free from structural error), and observation and input data are not free from errors, so Monte Carlo simulation results considering only parameter uncertainty are not free from these sources of error. We tried to reduce such errors as much as possible by selecting the best (automatically calibrated) model (which is reasonably accurate), and by quality control of the input and observation data. Of course, uncertainty results from GLUE that consider only parameter uncertainty are contaminated by other sources of error; again, we tried to minimise them. In this, we follow many researchers using parametric uncertainty analysis. Though Beven claimed that GLUE can be applied to other sources of error as well, we explicitly consider only parameter uncertainty in this study. So we can assume that the uncertainty results produced by GLUE represent mostly the parametric uncertainty per se, and neglecting the contamination by other sources of error seems to be a reasonable thing to do. It is also worth noting that, because of the informal likelihood function and cut-off threshold value used in GLUE to select the behavioural parameter sets, GLUE does not consider complete parameter uncertainty in a statistical sense (Vrugt et al. ).

The convergence of MC simulations, assessed to determine the number of samples required to obtain reliable results, was analysed by the authors in a previous publication (see Shrestha et al. ) and is not reported here. The parameters of the HBV model are sampled using non-informative uniform sampling without prior knowledge of individual parameter distributions other than a feasible range of values (see Table 1). We use the sum of the squared errors as the basis to calculate the generalised likelihood measure (see
Freer et al. ) in the form:

$$L(\theta_s \mid D) = \left(1 - \frac{\sigma_e^2}{\sigma_{obs}^2}\right)^{\lambda} \tag{13}$$

where $L(\theta_s \mid D)$ is the generalised likelihood measure for the sth model (with parameter vector $\theta_s$) conditioned on the observations D, $\sigma_e^2$ is the associated error variance for the sth model, $\sigma_{obs}^2$ is the observed variance for the period under consideration, and λ is a user-defined parameter. We set λ to 1, so Equation (13) is equivalent to the Nash–Sutcliffe coefficient of efficiency (CoE) (Nash & Sutcliffe ). The threshold value of CoE = 0 is selected to classify a simulation as either behavioural or non-behavioural. The number of behavioural models is set to 25,000, which is based on the convergence analysis of MC simulations. Various uncertainty descriptors such as variance, quantiles, PIs and estimates of the probability distribution functions are computed from these 25,000 MC realisations. Note that these descriptors are computed using the likelihood measure (Equation (13)) as weights $w_s$ in Equation (4). The model parameter ranges used for MC sampling are given in Table 1. For the Bagmati catchment, 122,132 MC samples are first generated by setting a threshold value of 0.7 to obtain 25,000 behavioural samples. However, to be consistent with the Brue catchment experiment, model simulations with negative CoE are removed from further analysis, leaving 116,153 samples out of 122,132.

Input variables and data

Selection of input variables for the machine learning model U is based on the methods outlined in the previous section and in the publication of Shrestha et al. () and is not discussed here; the inputs are constructed from the forcing input variables (e.g. rainfall, evapotranspiration) used in the process models, and the observed discharge. The selected input variables are $RE_{t-9a}$, $Y_{t-1}$, $\Delta Y_{t-1}$ for the Brue catchment and $RE_{t-0}$, $RE_{t-1}$, $Y_{t-1}$, $Y_{t-2}$ for the Bagmati catchment, where
$RE_{t-\tau}$: effective rainfall at time $t-\tau$;
$Y_{t-\tau}$: discharge at time $t-\tau$, where τ is the lag time;
$RE_{t-9a}$: the average of $RE_{t-5}$, $RE_{t-6}$, $RE_{t-7}$, $RE_{t-8}$, $RE_{t-9}$;
$\Delta Y_{t-1} = Y_{t-1} - Y_{t-2}$.

For the Bagmati catchment, since the resolution of data is daily (as opposed to hourly for the Brue), we do not consider the derivative (stepwise difference) of the flow as input to the model.

The same data sets used for calibration and verification of the HBV model are used for training and verification of model U, respectively. However, for proper training of the machine learning models, the calibration data set is segmented into two subsets: 15% of the data for cross-validation (CV) and 85% for training per se. The CV data set was used to identify the best structure of the machine learning models.

Machine learning models

A multilayer perceptron neural network with one hidden layer is used; the Levenberg–Marquardt algorithm is employed for its training. The hyperbolic tangent function is used for the hidden layer, and the linear transfer function for the output layer. The maximum number of epochs is fixed to 1000. A trial and error method is adopted to find the optimal number of neurons in the hidden layer; we tried numbers of neurons ranging from 1 to 10. It was found that 7 and 8 neurons for the lower and upper PI, respectively, gave the lowest CV error for the Brue. For the Bagmati catchment, the number of hidden neurons reduced to 5 and 7.

Experiments with MT are carried out with various values of the pruning factor that controls the complexity of the generated model (i.e. the number of linear models) and hence the generalising ability of the model. We report the results of the MT which have a moderate level of complexity. Note that the CV data set has not been used in the MT; rather, it uses the whole calibration data set to build the model.

In the LWR model, we vary two important parameters: the number of neighbours and the weight function (see Appendix A, available online at http://www.iwaponline.com/jh/016/242.pdf). Several experiments are done with different combinations of these values, and the best results are obtained with five neighbours and the linear weight function for the Brue, and 11 neighbours and the Tricube weight function for the Bagmati catchment.
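To make the two LWR tuning parameters concrete, here is a minimal sketch of locally weighted regression for a single input variable, with the number of neighbours `k` and the weight function passed in explicitly. The exact weight-function definitions of Appendix A are not reproduced in this text, so `linear` and `tricube` below use their common textbook forms, which is an assumption; the function `lwr_predict` is likewise a hypothetical name, and the models in the study use several inputs rather than one.

```python
def linear(u):
    """Linear weight (assumed form): 1 - u inside the bandwidth, 0 outside."""
    return 1.0 - u if u < 1.0 else 0.0


def tricube(u):
    """Tricube weight (assumed form): (1 - u^3)^3 inside the bandwidth, 0 outside."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def lwr_predict(x_train, y_train, x_query, k, weight_fn):
    """Predict y at x_query from a weighted linear fit to the k nearest neighbours."""
    neighbours = sorted(zip(x_train, y_train), key=lambda p: abs(p[0] - x_query))[:k]
    # Bandwidth: distance to the farthest of the k neighbours.
    bandwidth = max(abs(x - x_query) for x, _ in neighbours)
    if bandwidth == 0.0:
        return sum(y for _, y in neighbours) / len(neighbours)
    weights = [weight_fn(abs(x - x_query) / bandwidth) for x, _ in neighbours]
    if sum(weights) == 0.0:
        # All neighbours sit exactly on the bandwidth: fall back to equal weights.
        weights = [1.0] * len(neighbours)
    total = sum(weights)
    mean_x = sum(w * x for w, (x, _) in zip(weights, neighbours)) / total
    mean_y = sum(w * y for w, (_, y) in zip(weights, neighbours)) / total
    s_xx = sum(w * (x - mean_x) ** 2 for w, (x, _) in zip(weights, neighbours))
    s_xy = sum(w * (x - mean_x) * (y - mean_y) for w, (x, y) in zip(weights, neighbours))
    slope = s_xy / s_xx if s_xx > 0.0 else 0.0
    return mean_y + slope * (x_query - mean_x)


if __name__ == "__main__":
    xs = [float(i) for i in range(20)]
    ys = [2.0 * x + 1.0 for x in xs]
    print(lwr_predict(xs, ys, 7.5, k=5, weight_fn=linear))
    print(lwr_predict(xs, ys, 7.5, k=11, weight_fn=tricube))
```

Because no global model is fitted, both parameters can be tuned simply by repeating the prediction for different `k` and weight functions and comparing the errors.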
Modelling the probability distribution function

In the previous study (Shrestha et al. ), we estimated the 90% PIs by building only two models predicting the 5 and 95% quantiles. In this paper, the methodology is extended to predict several quantiles of the model outputs to estimate the cumulative distribution functions (CDF) of the model outputs generated by the MC simulations. The methodology applied to estimate only two quantiles can be extended to approximate the full distribution of the model outputs. The procedure to estimate the CDF of the model outputs consists of (i) deriving the CDF of the realisations of the MC simulations in the calibration data, (ii) selecting several quantiles of the CDF in such a way that these quantiles can approximate the CDF, (iii) computing corresponding prediction quantiles using Equation (5), (iv) constructing and training a separate machine learning model for each prediction quantile, (v) using these models to predict the quantiles for the new input data vector, and (vi) constructing a CDF from these discrete quantiles by interpolation. This CDF will be an approximation to the CDF of the MC simulations.

We select 19 quantiles from 5 to 95% with a uniform interval of 5%, and then an individual machine learning model is constructed for each quantile using the same structure of the input data and the model that was used for modelling two quantiles. In principle, the optimal set of input data and the model structure could be different for each quantile, but we leave this investigation to future studies.

RESULTS

The HBV model is calibrated by maximising CoE. CoE values of 0.96 and 0.83 are obtained for the calibration period in the Brue and Bagmati catchments, respectively. We also experimented with more sophisticated performance measures taking into account different temporal scales and using step-wise line search (Kuzmin et al. ). The model is validated by simulating the flows for the independent verification data set, and CoE is 0.83 and 0.87 in the Brue and Bagmati catchments, respectively. The HBV model is quite accurate for the Brue catchment but its error (uncertainty) is quite high during the peak flows. Note that for the Bagmati catchment, the standard deviation of the observed discharge in the verification period is 54% higher than that in the calibration period, which apparently increases performance in the verification period.

Figure 4 shows a comparison of the 90% prediction bounds estimated by the GLUE and the three machine learning models in the verification period for the Brue catchment. One can see a noticeable difference among them in predicting the lower and upper bounds of the PI. For example, in the second peak of Figure 4(a), the upper bound of the PI is underestimated by ANN compared to the MT and LWR. However, the lower bound is well approximated by the ANN compared to the other models. Furthermore, in Figure 4(b), the ANN is overestimating two peaks, while the MT and LWR models underestimate them (Figure 4(d) and (f)). From Figure 4, it can be seen that the results of the three models are comparable. They reproduce the MC simulation uncertainty bounds reasonably well except for some peaks, in spite of the low correlation of the input variables with the PIs. The predicted uncertainty bounds follow the general trend of the MC uncertainty bounds, although some errors can be noticed and the model fails to capture the observed flow during one of the peak events (Figure 4(a), (c), and (e)).

For the Bagmati catchment, it is found that only 49.79% of the observed discharge data is inside the 90% prediction bounds computed by the GLUE method in the calibration period and 61.48% in the verification period. Therefore, we follow the modified GLUE method (denoted by mGLUE) (Xiong & O’Connor ) to improve the capacity of the prediction bounds to capture the observed runoff data. The mGLUE method uses the bias-corrected MC simulations to estimate the uncertainty bounds. Compared to the original GLUE method (Beven & Binley ), the mGLUE method includes two more procedural steps. Firstly, for each behavioural parameter set, a simulation bias curve is constructed on the basis of the simulation series that are obtained using the calibration data. Thus, for a number S of the behavioural parameter sets, there will be S different simulation bias curves. Secondly, at each time step, with the new data input, all the different prediction values for the same observation are corrected by dividing by a common median bias value, before the derivation of the prediction limits.

Figure 5 presents the 90% prediction bounds estimated by the mGLUE and the three machine learning models in
Figure 4 | Hydrograph of 90% prediction bounds estimated by GLUE and machine learning methods for the Brue catchment in parts of the verification period. The black dots indicate the observed discharges and the dark grey shaded area the prediction uncertainty that results from GLUE. The black lines denote the prediction uncertainty estimated by neural networks ((a) and (b)), model trees ((c) and (d)) and locally weighted regression ((e) and (f)).

Figure 5 | Same as Figure 4, but for the Bagmati catchment.
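The GLUE bounds shown in these hydrographs come from likelihood-weighted MC realisations. The sketch below illustrates one way this can be computed: Equation (13) with λ = 1 as the likelihood, the CoE = 0 threshold for the behavioural classification, and likelihood-weighted quantiles at each time step. The function names, the clipping of non-positive CoE values to zero and the particular weighted-quantile rule are assumptions made for the example, not the authors' code.

```python
def likelihood(observed, simulated, lam=1.0):
    """Equation (13): (1 - var_err / var_obs) ** lam.

    Non-positive values are clipped to 0.0 (assumption), which marks the run
    as non-behavioural under the CoE = 0 threshold.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    var_obs = sum((o - mean_obs) ** 2 for o in observed) / n
    var_err = sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n
    coe = 1.0 - var_err / var_obs
    return coe ** lam if coe > 0.0 else 0.0


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalised weight reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight / total
        if cumulative >= q:
            return value
    return pairs[-1][0]


def glue_bounds(observed, simulations, lower_q=0.05, upper_q=0.95):
    """Return (lower, upper) prediction limits from the behavioural runs only."""
    scored = [(likelihood(observed, sim), sim) for sim in simulations]
    behavioural = [(w, sim) for w, sim in scored if w > 0.0]
    if not behavioural:
        raise ValueError("no behavioural simulations")
    weights = [w for w, _ in behavioural]
    lower, upper = [], []
    for t in range(len(observed)):
        values = [sim[t] for _, sim in behavioural]
        lower.append(weighted_quantile(values, weights, lower_q))
        upper.append(weighted_quantile(values, weights, upper_q))
    return lower, upper


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0]
    sims = [[1.0, 2.0, 3.0, 4.0], [1.1, 2.1, 3.1, 4.1], [4.0, 3.0, 2.0, 1.0]]
    print(glue_bounds(obs, sims))
```

In the MLUE setting, the `lower` and `upper` series computed on the calibration data would serve as the training targets of the machine learning models.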
the verification period. With the mGLUE method, the percentage of the observations falling inside the bounds is increased to 65.26 and 67.52% in the calibration and verification periods, respectively. The machine learning models are able to approximate the mGLUE simulation results reasonably well. The results of the three machine learning models are comparable; however, one can see a noticeable difference between them when predicting the peaks. The highest peak in Figure 5(a) is overestimated by the ANN model, while the other two peaks in Figure 5(b) are underestimated.

Figure 6 and Table 2 present a summary of statistics of the uncertainty estimation in the verification period. The ANN model is very close to the MC simulation results. The MT and LWR are better than the ANN with respect to MPI (note that a lower MPI is an indication of better performance); however, PICP shows that the prediction limits estimated by them enclose a relatively lower percentage of the observed values compared to those of the ANN.

So far we have compared the performance of the three machine learning models by analysing the accuracy of the prediction only; however, there are other factors to be considered as well. These include computational efficiency, simplicity or ease of use, number of training parameters required, flexibility, transparency, etc. Computational efficiency is shown in Table 3. One can see that the time required to generate uncertainty results by MLUE methods in the verification period is significantly lower than that required by the GLUE method. Table 4 shows linguistic variables to describe the other factors mentioned above, together with the parameters of the machine learning models to be tuned. In ANN, we have only tuned one parameter – number of hidden neurons. MT also contains one parameter – pruning factor that has to be tuned. While in LWR, two parameters – number
Figure 6 | A comparison of statistics of uncertainty (PICP and MPI) estimated with GLUE, neural networks (ANN), model trees (MT), and locally weighted regression (LWR) in the verification period. (a) Brue catchment; (b) Bagmati catchment.

Table 2 | Performances of the models measured by the coefficient of correlation (CoC), root mean squared error (RMSE), the prediction interval coverage probability (PICP) and the mean prediction interval (MPI) in the verification data set

| Catchment | Model | Lower PI: CoC | Lower PI: RMSE (m³/s) | Upper PI: CoC | Upper PI: RMSE (m³/s) | PICP (%) | MPI (m³/s) |
|---|---|---|---|---|---|---|---|
| Brue | ANN | 0.86 | 0.56 | 0.80 | 1.59 | 77.00 | 2.09 |
| Brue | MT | 0.84 | 0.61 | 0.79 | 1.63 | 68.72 | 1.95 |
| Brue | LWR | 0.82 | 0.64 | 0.80 | 1.60 | 75.43 | 1.93 |
| Bagmati | ANN | 0.81 | 51.46 | 0.94 | 61.59 | 66.24 | 124.03 |
| Bagmati | MT | 0.81 | 50.25 | 0.95 | 52.14 | 59.05 | 120.59 |
| Bagmati | LWR | 0.86 | 44.56 | 0.96 | 50.37 | 59.16 | 121.73 |

Note: Bold type signifies the maximum value in each statistics.
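The PICP and MPI columns of Table 2 follow Equations (11) and (12). A minimal sketch of both statistics is given below; the function names are chosen for this example, and PICP is returned as a percentage to match the table.

```python
def picp(observed, lower, upper):
    """Equation (11): percentage of observations inside [lower_t, upper_t]."""
    inside = sum(1 for y, lo, up in zip(observed, lower, upper) if lo <= y <= up)
    return 100.0 * inside / len(observed)


def mpi(lower, upper):
    """Equation (12): average width of the prediction intervals."""
    return sum(up - lo for lo, up in zip(lower, upper)) / len(lower)


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]
    lower = [0.0, 2.5, 2.0, 3.0]
    upper = [2.0, 3.0, 4.0, 3.5]
    print(picp(observed, lower, upper))  # two of four observations are enclosed
    print(mpi(lower, upper))
```

A narrow interval (low MPI) is only informative when the coverage stays close to the nominal level, which is why the two statistics are read together.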
of neighbours and weighting functions have been tuned to get optimal results. Such parameters are optimised by exhaustive search during training of the model. It can be observed that none of the models is superior with respect to all factors; however, one may favour ANN if the ranking is done by giving equal weight to all factors.

Table 3 | Computational time for GLUE and MLUE

| | Brue: Calibration | Brue: Verification | Bagmati: Calibration | Bagmati: Verification |
|---|---|---|---|---|
| Number of data used | 8760 | 8217 | 2000 | 922 |
| GLUE | 16:34:00 | 11:45:00 | 7:45:00 | 6:41:00 |
| ANN | 2:07:00 | 0:04:00 | 1:03:00 | 0:01:30 |
| MT | 1:07:00 | 0:03:00 | 0:33:00 | 0:01:05 |
| LWR | 4:07:00 | 0:09:00 | 2:03:00 | 0:03:00 |

Note: The time (hh:mm:ss) is based on prediction of two quantiles (5% and 95%) and also includes data analysis and preparation time in the calibration period except for GLUE.

Modelling the probability distribution function

Figure 7 and Figure 8 show a comparison of the CDFs for the peak events estimated by the three machine learning methods for the Brue and Bagmati catchments, respectively. One can see that the CDFs estimated by the ANN, MT and LWR are comparable and are very close to the CDFs given by the GLUE simulations. It is observed that the CDFs estimated by the ANN, MT and LWR models deviate a little more near the middle of the distribution for the peak event of 9 January 1996 in the Brue catchment (see Figure 7(b)). The CDFs estimated by the ANN, MT and LWR deviate a bit more at the higher percentile values for the peak event of 13 August 1995 in the Bagmati catchment (see Figure 8(b)).

From visual inspection one can see that the CDFs are reasonably approximated by the machine learning methods. However, a rigorous statistical test may be required to conclude whether the estimated CDFs are not significantly different from those given by the GLUE simulations. In this study, since we have limited data (only 19 points), the results of a significance test (e.g. Kolmogorov–Smirnov) may not be reliable.

DISCUSSION

In this study, the uncertainty of the model output is assessed when the hydrological process model is used in simulation mode. However, this method can also be used in forecasting mode, provided that the process model is also run in forecasting mode. Note that we have not used the current observed discharge Qt as an input to the machine learning models because during the model application this variable is not available (indeed, the value of this variable is calculated by the HBV model, and the machine learning model assesses the uncertainty of this output).

It is observed that the results of the machine learning models and the GLUE (or mGLUE) are visually close to each other. The model prediction uncertainty caused by parameter uncertainty is rather large. There could be several reasons for this, including the following ones (Shrestha et al. ): (i) the GLUE and mGLUE methods do not strictly follow the Bayesian inference process (Mantovan & Todini ) and overestimate the model prediction uncertainty; (ii) in the GLUE method, the uncertainty bound very much depends on the rejection threshold separating behavioural and non-behavioural models: in this study we use quite a low value of rejection threshold
Table 4 | Performance criteria of machine learning models indicated by linguistic variables

| Models | Model parameters (optimised) | Accuracy: CoC | Accuracy: PICP and MPI | Efficiency | Transparency | Rank |
|---|---|---|---|---|---|---|
| ANN | Number of hidden nodes | High | High | Medium | Low | 1 |
| MT | Pruning factor | Medium | Low | High | Medium | 2 |
| LWR | Number of neighbours and weight functions | Low | Medium | Low | High | 3 |

Figure 7 | A comparison of cumulative distribution function (CDF) estimated with GLUE and neural networks (ANN), model trees (MT), and locally weighted regression (LWR) for the Brue catchment in a part of the verification period. (a) Peak event of 20 December 1995; (b) peak event of 9 January 1996.
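A CDF of the kind compared in this figure can be assembled from the 19 predicted quantiles (5 to 95% in steps of 5%) by interpolation. The sketch below uses linear interpolation and, because the quantile models are trained independently and their outputs may cross, first applies a running maximum to the predicted values. That correction is only one simple possibility chosen for illustration, not a scheme proposed in the paper; the function names and the clamping of the CDF outside the predicted range are likewise assumptions.

```python
# Probability levels of the 19 quantiles: 0.05, 0.10, ..., 0.95.
PROBS = [round(0.05 * k, 2) for k in range(1, 20)]


def enforce_monotonic(values):
    """Replace each quantile value by the running maximum (illustrative fix)."""
    fixed, running_max = [], float("-inf")
    for value in values:
        running_max = max(running_max, value)
        fixed.append(running_max)
    return fixed


def cdf_from_quantiles(probs, values, y):
    """Linearly interpolate the CDF at y from predicted quantile values.

    Outside the predicted range the CDF is clamped to the first or last
    probability level, since the tails are not estimated.
    """
    values = enforce_monotonic(values)
    if y <= values[0]:
        return probs[0]
    if y >= values[-1]:
        return probs[-1]
    for i in range(len(values) - 1):
        low, high = values[i], values[i + 1]
        if low <= y <= high:
            if high == low:
                return probs[i + 1]
            fraction = (y - low) / (high - low)
            return probs[i] + fraction * (probs[i + 1] - probs[i])
    return probs[-1]


if __name__ == "__main__":
    # Hypothetical predicted discharges for the 19 quantiles at one time step.
    predicted = [10.0 + 0.5 * k for k in range(19)]
    print(cdf_from_quantiles(PROBS, predicted, 14.0))
```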
(CoE value of 0) which produces relatively wider uncertainty bounds; and (iii) we consider only parameter uncertainty, thus implicitly assuming no model structure and input data uncertainty.

It can be noticed that the performance of the machine learning models in predicting lower quantiles (5%, 10%, etc.) is relatively higher compared to that of the models for the upper quantiles (90%, 95%, etc.). This can be explained by the fact that the upper quantiles correspond to higher values of flow (where the HBV model is obviously less accurate) and higher variability, which makes prediction a difficult task. It is possible to develop a specific model only to simulate the peak observed data and their uncertainty, as well as one for the mean flows. In general, such a model performs better than the global model. In this study, we have used MT and LWR models for uncertainty estimation, which implicitly build local models internally. It would be interesting to build the local models explicitly, for example for high flow events. However, this is not always possible because of training data requirements for such rare and extreme events.

When comparing the percentage of the observed discharge data falling within the uncertainty bounds (i.e. PICP) produced by the GLUE, it can be seen that this percentage is much lower than the confidence level specified to generate these bounds. The low PICP value is consistent with the results reported in the literature (see e.g. Montanari ; Xiong & O’Connor ). The low 'quality' of the PIs obtained by the GLUE in enveloping the real-world discharge observations might be mainly due to the following three reasons (Shrestha et al. ): (i) by using the GLUE method we investigate only the parametric uncertainty without consideration of uncertainty in the model structure, the input (such as rainfall, temperature data) and the output discharge data; (ii) we use a uniform distribution and ignore the parameter correlation; (iii) results of the GLUE method depend on the (subjectively set) threshold value and likelihood measure for selecting the behavioural parameter sets.

To approximate the CDF, an individual machine learning model is constructed for each quantile with the same structure of the input data and the model configuration. Thus, we
Figure 8 | Same as Figure 7, but for the Bagmati catchment in a part of the verification period. (a) Peak event of 14 September 1994; (b) peak event of 13 August 1995.
have not undertaken a full-fledged optimisation of the model and the input data structure of the machine learning models, and there is hope of improving the results. Furthermore, one can notice that the CDFs estimated are not necessarily monotonically increasing (see e.g. the 30% quantile of the MT model for the second case study). This is not surprising given that individual models are built for each quantile independently. This deficiency can be addressed by a correcting scheme (to be developed) that would ensure monotonicity of the overall CDF.

In this paper, the MLUE method is applied to emulate the results of the GLUE and mGLUE methods; however, it can be used for other uncertainty analysis methods such as Markov chain Monte Carlo, Latin hypercube sampling, etc. Furthermore, the MLUE method can be applied in the context of other sources of uncertainty: input, structure or combined.

Since the machine learning technique is the core of the MLUE method, it may have a problem of extrapolation for extreme (rare) events. This means that the results are reliable only within the boundaries of the domain to which the training data belong, and only a little beyond. In order to avoid the problem of extrapolation, an attempt should be made to ensure that the training data include various possible combinations of events including the extremes (such as extreme floods); however, this is not always possible since the extremes tend to be rather rare events. Like most uncertainty analysis methods, the MLUE method also presupposes the existence of a reasonably long, precise and relevant time series of measurements. As pointed out by Hall & Anderson (), uncertainty in extreme or unrepeatable events is more important than in situations where there are historical data sets, and this may require different approaches towards uncertainty estimation. The lack of sufficient historical data makes the uncertainty results from the model unreliable. This is actually true for all MC-based methods that use past data to make judgements about the future uncertainty.

The MLUE method is applicable only to systems whose physical characteristics do not change considerably with time. The results will not be reliable if the physics of the catchment (e.g. land use) and hydro-meteorological
conditions differ substantially from what was observed during the model calibration. If there is evidence of such changes, then the models should be re-calibrated.

The reliability and accuracy of the uncertainty analysis depend on the accuracy of the uncertainty models used, so attention should be given to these aspects as well. The proposed method does not consider the uncertainty associated with the model U itself. However, one could use the CV data set to improve the accuracy of the model U by generalising its predictive capability.

CONCLUSIONS

This paper presents the further development, a study of the relative performance, and an application of the MLUE method, presented in its initial form by Shrestha et al. (), in predicting parameter uncertainty in rainfall-runoff modelling. The basic idea of the MLUE method is to encapsulate the computationally expensive MC simulations of a process model by an efficient machine learning model. (We used GLUE, a version of the MC simulation method.) This model is first trained on the data generated by the MC simulations to encapsulate the relationship between the hydro-meteorological variables and the uncertainty statistics of the model output probability distribution, e.g. quantiles. Then the trained model can be used to estimate the latter for the new input data. The MLUE method is computationally efficient and can be used in real time applications when a large number of model runs are required.

We use three machine learning techniques, namely ANN, MT and LWR, to predict several uncertainty descriptors of the rainfall-runoff model outputs. It is observed that the percentage of the observed discharge data falling within the prediction bounds generated by GLUE is much lower than the given certainty level used to produce these prediction bounds. Thus, we also apply the mGLUE (Xiong & O’Connor ) method to improve the percentage of the observations falling within the prediction bounds. On the two case studies we first demonstrate the application of the MLUE method to estimate the two quantiles (5 and 95%) forming the 90% PIs. Several performance indicators and visual inspection show that the machine learning models are reasonably accurate in approximating the GLUE or mGLUE uncertainty bounds. It is also observed that the uncertainty bounds estimated by ANN, MT and LWR are comparable; however, ANN is a bit better than the other two models. Second, we extend the MLUE method to approximate the CDF of the model outputs, and the results demonstrate that the MLUE performs quite well in estimating the CDF resulting from the GLUE (and mGLUE) methods.

It can be recommended to direct further studies at testing the applicability of the MLUE approach with other sampling methods, ensuring compatibility of the models for multiple quantiles to achieve monotonicity of the resulting approximation of the CDF, considering multiple sources of uncertainty, and testing the method on more complex models.

ACKNOWLEDGEMENTS

Most of this work was completed during the first author’s postdoctoral research and the second author’s PhD research at UNESCO-IHE Institute for Water Education, Delft, The Netherlands; these were partly funded by the European Community’s 7th Framework Research Program through the grants to the budget of the EnviroGRIDS, KULTURisk and WeSenseIt projects. The WIRADA project (The Water Information Research and Development Alliance between CSIRO’s Water for a Healthy Country Flagship and the Australian Bureau of Meteorology) partly supported the first author in completing this manuscript. The authors sincerely thank the editor and the three anonymous reviewers for providing helpful and constructive comments to improve the manuscript.

REFERENCES
Abrahart, R. J. & See, L. Comparing neural network and autoregressive moving average techniques for the provision of continuous river flow forecasts in two contrasting catchments. Hydrological Processes 14, 2157–2172. Aha, D., Kibler, D. & Albert, M. Instance-based learning algorithms. Machine Learning 6, 37–66. Allen, R. G., Pereira, L. S., Raes, D. & Smith, M. Crop Evapotranspiration: Guidelines for Computing Crop Water Requirements. Irrigation and Drainage Paper No. 56, FAO,
Rome. Available at: http://www.fao.org/docrep/X0490E/ x0490e00.htm. Bergström, S. Development and application of a conceptual runoff model for Scandinavian catchments. SMHI Reports RHO, No. 7, Norrköping, Sweden. Beven, K. & Binley, A. The future of distributed models: Model calibration and uncertainty prediction. Hydrological Processes 6, 279–298. Beven, K. & Freer, J. Equifinality, data assimilation, and uncertainty estimation in mechanistic modelling of complex environmental systems using the GLUE methodology. Journal of Hydrology 249, 11–29. Blasone, R., Vrugt, J., Madsen, H., Rosbjerg, D., Robinson, B. & Zyvoloski, G. Generalized likelihood uncertainty estimation (GLUE) using adaptive Markov Chain Monte Carlo sampling. Advances in Water Resources 31, 630–648. Dawson, C. W. & Wilby, R. L. Hydrological modelling using artificial neural networks. Progress in Physical Geography 25, 80–108. Dibike, Y. B. & Solomatine, D. P. River flow forecasting using artificial neural networks. Journal of Physics and Chemistry of the Earth, Part B: Hydrology, Oceans and Atmosphere 26, 1–8. Duan, Q., Sorooshian, S. & Gupta, V. Effective and efficient global optimization for conceptual rainfall-runoff models. Water Resources Research 28, 1015–1031. Elshorbagy, A., Corzo, G., Srinivasulu, S. & Solomatine, D. P. a Experimental investigation of the predictive capabilities of data driven modeling techniques in hydrology – Part 1: concepts and methodology. Hydrology and Earth System Sciences 14, 1931–1941. Elshorbagy, A., Corzo, G., Srinivasulu, S. & Solomatine, D. P. b Experimental investigation of the predictive capabilities of data driven modeling techniques in hydrology – Part 2: application. Hydrology and Earth System Sciences 14, 1943–1961. Freer, J., Beven, K. & Ambroise, B. Bayesian estimation of uncertainty in runoff prediction and the value of data: an application of the GLUE approach. Water Resources Research 32, 2161–2173. Georgakakos, K., Seo, D.-J., Gupta, H. 
V., Schaake, J. & Butts, M. M. Towards the characterization of streamflow simulation uncertainty through multimodel ensembles. Journal of Hydrology 298, 222–241. Govindaraju, R. S. & Rao, A. R. Artificial Neural Networks in Hydrology. Kluwer Academic Publishers, Amsterdam, 348 pp. Haario, H., Laine, M., Mira, A. & Saksman, E. DRAM: efficient adaptive MCMC. Statistical Computation 16, 339–354. Hall, J. & Anderson, M. G. Handling uncertainty in extreme or unrepeatable hydrological processes – the need for an alternative paradigm. Hydrological Processes 16, 1867–1870. Harr, M. Probabilistic estimates for multivariate analyses. Applied Mathematical Modeling 13, 313–318. Johnston, P. & Pilgrim, D. Parameter optimization for watershed models. Water Resources Research 12, 477–486.
Khu, S.-T. & Werner, M. G. F. Reduction of Monte-Carlo simulation runs for uncertainty estimation in hydrological modelling. Hydrology and Earth System Sciences 7, 680–692. Kuczera, G. & Parent, E. Monte Carlo assessment of parameter uncertainty in conceptual catchment models: the Metropolis algorithm. Journal of Hydrology 211, 69–85. Kuzmin, V., Seo, D.-J. & Koren, V. Fast and efficient optimization of hydrologic model parameters using a priori estimates and stepwise line search. Journal of Hydrology 353, 109–128. Madsen, H. Automatic calibration of a conceptual rainfall– runoff model using multiple objectives. Journal of Hydrology 235, 276–288. Maier, H. R. & Dandy, G. C. Neural networks for the prediction and forecasting of water resources variables: a review of modelling issues and applications. Environmental Modelling & Software 15, 101–124. Maier, H. R., Jain, A., Dandy, G. C. & Sudheer, K. P. Methods used for the development of neural networks for the prediction of water resource variables in river systems: current status and future directions. Environmental Modelling & Software 25, 891–909. Mantovan, P. & Todini, E. Hydrological forecasting uncertainty assessment: Incoherence of the GLUE methodology. Journal of Hydrology 330, 368–381. Maskey, S., Guinot, V. & Price, R. K. Treatment of precipitation uncertainty in rainfall-runoff modelling: a fuzzy set approach. Advance in Water Resources 27, 889–898. McKay, M. D., Conover, W. J. & Beckman, R. J. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21, 239–245. Melching, C. S. An improved-first-order reliability approach for assessing uncertainties in hydrologic modeling. Journal of Hydrology 132, 157–177. Minns, A. W. & Hall, M. J. Artificial neural networks as rainfall-runoff models. Hydrological Science Journal 41, 399–417. Mitchell, T. Machine Learning. McGraw-Hill, Singapore, 414 pp. Montanari, A. 
Large sample behaviors of the generalized likelihood uncertainty estimation (GLUE) in assessing the uncertainty of rainfall-runoff simulations. Water Resources Research 41, W08406. Montanari, A. & Brath, A. A stochastic approach for assessing the uncertainty of rainfall-runoff simulations. Water Resources Research 40, W01106. Nash, J. & Sutcliffe, J. River flow forecasting through conceptual models – Part I – A discussion of principles. Journal of Hydrology 10, 282–290. Pappenberger, F., Harvey, H., Beven, K., Hall, J. & Meadowcroft, I. Decision tree for choosing an uncertainty analysis methodology: a wiki experiment http://www.floodrisknet. org.uk/methods http://www.floodrisk.net. Hydrological Processes 20, 3793–3798.
Rosenblueth, E. Two-point estimates in probability. Applied Mathematical Modelling 5, 329–335. Shrestha, D. L. & Solomatine, D. P. Machine learning approaches for estimation of prediction interval for the model output. Neural Networks 19, 225–235. Shrestha, D. L. & Solomatine, D. P. Data-driven approaches for estimating uncertainty in rainfall-runoff modelling. International Journal of River Basin Management 6, 109–122. Shrestha, D. L., Kayastha, N. & Solomatine, D. P. A novel approach to parameter uncertainty analysis of hydrological models using neural networks. Hydrology and Earth System Sciences 13, 1235–1248. Solomatine, D. P. Two strategies of adaptive cluster covering with descent and their comparison to other algorithms. Journal of Global Optimization 14, 55–78. Solomatine, D. P. & Torres, L. A. A. Neural network approximation of a hydrodynamic model in optimizing reservoir operation. In: Hydroinformatics ’96 (A. Muller, ed.). Balkema, Rotterdam. Solomatine, D. P. & Dulal, K. N. Model trees as an alternative to neural networks in rainfall–runoff modelling. Hydrological Sciences Journal 48, 399–411. Solomatine, D. P. & Ostfeld, A. Data-driven modelling: some past experiences and new approaches. Journal of Hydroinformatics 10, 3–22. Solomatine, D. P. & Shrestha, D. L. A novel method to estimate model uncertainty using machine learning techniques. Water Resources Research 45, W00B11. Solomatine, D. P., Maskey, M. & Shrestha, D. L. Instancebased learning compared to other data-driven methods in hydrological forecasting. Hydrological Processes 22, 275–287. Stedinger, J. R., Vogel, R. M., Lee, S. U. & Batchelder, R. Appraisal of the generalized likelihood uncertainty
estimation (GLUE) method. Water Resources Research 44, W00B06. Thiemann, M., Trosset, M., Gupta, H. V. & Sorooshian, S. Bayesian recursive parameter estimation for hydrologic models. Water Resources Research 37, 2521–2535. Tung, Y.-K. Uncertainty and reliability analysis. In: Water Resources Handbook (L. W. Mays, ed.). McGraw-Hill, New York, 7.1–7.65. Todini, E. A model conditional processor to assess predictive uncertainty in flood forecasting. Journal of River Basin Management 6, 123–137. Vrugt, J. A., ter Braak, C., Gupta, H. & Robinson, B. Equifinality of formal (DREAM) and informal (GLUE) Bayesian approaches in hydrologic modeling? Stochastic Environmental Research and Risk Assessment 23, 1011– 1026. Vrugt, J. A., Diks, C., Gupta, H. V., Bouten, W. & Verstraten, J. M. Improved treatment of uncertainty in hydrologic modeling: Combining the strengths of global optimization and data assimilation. Water Resources Research 41, W01017. Wagener, T. & Gupta, H. V. Model identification for hydrological forecasting under uncertainty. Stochastic Environmental Research and Risk Assessment 19, 378–387. Witten, I. H. & Frank, E. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann Publishers, San Francisco, USA, 371 pp. Xiong, L. & O’Connor, K. An empirical method to improve the prediction limits of the GLUE methodology in rainfallrunoff modeling. Journal of Hydrology 349, 115–124. Yapo, P., Gupta, H. V. & Sorooshian, S. Automatic calibration of conceptual rainfall-runoff models: sensitivity to calibration data. Journal of Hydrology 181, 23–48.
First received 20 December 2012; accepted in revised form 3 June 2013. Available online 25 July 2013
© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014

Spatial evaluation of complex non-point source pollution in urban–rural watershed using fuzzy system

Haijun Wang, Wenting Zhang, Song Hong, Yanhua Zhuang, Hongyan Lin and Zhen Wang
ABSTRACT

Non-point source (NPS) pollution has become the major reason for water quality deterioration. Due to the differences in the generation and transportation mechanisms between urban areas and rural areas, different models are needed in rural and urban places. Since land use has been changing rapidly, it is difficult to define the study area as city or country absolutely, and the complex NPS pollution in these urban–rural mixed places is difficult to evaluate using an urban or rural model. To address this issue, a fuzzy system-based approach to modeling complex NPS pollutants is proposed, concerning the fuzziness of each land use and the ratio of belonging to an urban or rural place. The characteristic of land use, impact of city center and traffic condition were used to describe the spatial membership of belonging to an urban or rural place. According to this spatial membership, the NPS distributions calculated by the urban model and the rural model respectively were combined. To validate the method, Donghu Lake, which is undergoing rapid urbanization, was selected as the case study area. The results showed that the urban NPS pollutant load was significantly higher than that of the rural area. Land usage influenced the pollution more than other factors such as slope or precipitation. The results also suggested that the impact of the urbanization process on water quality is noteworthy.

Key words | Donghu Lake, fuzzy system, non-point source pollution, urban–rural watershed

Haijun Wang, Song Hong (corresponding author), Yanhua Zhuang, Hongyan Lin, Zhen Wang
School of Resource and Environmental Science, Wuhan University, Wuhan, China
E-mail: environmentalanalytics@gmail.com

Wenting Zhang
Department of Geography and Resource Management, The Chinese University of Hong Kong, Hong Kong
doi: 10.2166/hydro.2013.266

INTRODUCTION

Excessive loads of pollution into rivers, lakes, reservoirs and estuaries are now becoming a major concern to water resource managers across the world (Shrestha et al. ; Jing & Chen ; Liu & Tong ). Non-point source (NPS) pollution significantly contributes to the deterioration of water quality (Leone et al. ; Ouyang et al. ) due to the difficulty in identifying, assessing and controlling the sources of this type of pollution. The major NPS pollutants are nitrogen (N) and phosphorus (P). Recently, numerous research efforts have been made to discover the process and spatial quality of NPS N and P pollutants to support prevention and mitigation measures (Zhang & Huang ). In particular, NPS pollution can be classified into two types: agricultural/rural NPS and urban NPS (Edwin et al. ). As the generation and transportation processes of NPS pollutants in urban and rural places are different, the models and the factors as well as the corresponding parameters must be different (William et al. ; Kim et al. ; Zhang et al. ; Phillips et al. ) to ensure accurate results. To begin with, close attention was paid to rural NPS pollution, as agricultural chemicals contributed to the NPS pollution a great deal. For example, the empirical quantitative approach, namely, the universal soil loss equation (USLE), was developed to predict large scale soil erosion and the designation of potential risk zones for agricultural plots (Pandey et al. ). Thanks to its low data and parameter requirements, in contrast to physically-based models, as well as its scale-independent geometric resolution (Renard et al. ; Bahadur ; Dumas et al. ; Volk et al. ), USLE is widely used in the evaluation of
the rural NPS pollutant loads by providing average annual soil erosion (Fistikoglu & Harmancioglu ; Haregeweyn & Yohannes ). Additionally, the other rural NPS evaluation model, the export coefficient model (ECM), is well-developed in determining NPS pollution (Do et al. ) with the simple model format for agricultural areas (Johnes & Heathwaite ) at the same time. In short, the model for rural NPS evaluation is fully developed and widely applied. Yet, as land use is changing from agricultural to urban, the natural soil surface is replaced with impermeable surfaces (Chris et al. ) which suffer from higher population density and more intensive human activities (Shon et al. ). This will influence the generation and transportation process of NPS pollutants. Recently, due to the wide process of urbanization all over the world, urban NPS pollution research has become more popular. For instance, Shon et al. () used a storm water management model to estimate the NPS pollutant loads; Bhaduri et al. () proposed a Geographical Information System (GIS)–NPS model to assess the NPS pollutant loads under urbanization by using the Long-Term Hydrologic Impact Assessment (L-THIA) model.

Even if the NPS evaluation models for urban or rural areas are well-developed, all these models concentrate on one aspect, urban or rural pollutant loads, which is insufficient to evaluate the NPS pollutant loads in urban–rural mixed areas. In this study, we define the NPS pollution that occurs in urban–rural mixed areas and is caused by various pollutants in rural and urban surface runoff together as complex NPS pollution. Since more and more urban–rural mixed areas are emerging as the result of rapid economic development, some efforts have concentrated on the NPS pollution in mixed urban and rural watersheds and have mentioned that the process of urbanization impacted the water quality greatly (Wang et al. ; Zheng et al. ). For example, Chris et al. () measured the water quality in agricultural, urban and mixed land, and determined the water quality from these three places, but Chris et al. just compared the measured samples to confirm that the water quality varied in different areas and did not address the problem of how to evaluate the NPS pollutant loads in different areas. Shields et al. () pointed out that the urbanizing study area is different from the traditional urban or rural catchment. Thus, Shields et al. employed the Baltimore Ecosystem Study method to explore the impact of urbanization on the magnitude and export flow distribution of nitrogen, concluding that they are highly correlated. However, their studies still did not separate the rural region from the urban region for their study area, and the problem of modeling complex NPS pollution in urban–rural mixed places has not yet been solved.

To address the problem of evaluating the NPS pollutant loads in an urban–rural mixed area, Zhuang et al. () proposed a CA-AUNPS model to assess the spatial and temporal variation of complex NPS pollution for a lake watershed of central China. In this model, the Focal Neighborhood method was used as the coupling model to combine the export empirical model and the L-THIA model. In our study, a fuzzy membership-based approach is proposed. To our knowledge, it is difficult to classify an area into urban or rural absolutely due to the fact that the multiple or fuzzy characteristics of non-urban, partly-urban and urban states in the process of urban development are not solved (Liu & Phinn ). Conventionally, the land use can be classified into ‘0’ meaning non-urban or rural and ‘1’ meaning urban. According to this classification, urban land which is surrounded by rural land and the land use at the boundary of rural–urban areas may be misclassified and then result in mistakes in the evaluation of the NPS pollutant loads. The fuzzy membership can express the ratio of the land cell belonging to urban or rural, which ranges from 0 to 1. The fuzzy expression may be suitable for use in the NPS evaluation and has been employed to assist in the calculation of water quality (Yang et al. ). For example, Dixon () incorporated GIS, global positioning system, remote sensing (RS) and the fuzzy rule-based model to generate groundwater sensitivity maps. Besides the traditional calculation for groundwater quality, his methodology was further refined through the fuzzy rule-based model to incorporate land-use/pesticide application and soil structure information. Gemitzi et al. () combined GIS with fuzzy logic and multi-criteria evaluation techniques for data acquisition and the production of factor images. They then created the intermediate and final ground water vulnerability maps based on the factor images. In accordance with previous studies and the aims of this study, the fuzzy membership-based approach is employed to describe the fuzziness in land usage and then to express the complex NPS pollutants in rural and urban mixed areas.
In this paper, the universal soil loss equation model (Wischmeier & Smith ), the export coefficient model and the long-term hydrologic impact assessment model (Harbor ; Lim et al. ) are employed to evaluate the spatial distribution and quality of rural and urban NPS pollutant loads respectively. In particular, the USLE model and the ECM are integrated to calculate the NPS pollutant loads under the hypothesis that the study area is rural, while the L-THIA model is used to obtain the urban NPS pollutant loads.

Generally, this study aims to develop an integrated method to assess complex NPS pollution under the process of urbanization, which accounts for the fuzziness of the real world. It involves the following objectives: (1) calculating rural and urban NPS pollutant loads by using well-developed NPS evaluation models, USLE, ECM and L-THIA models respectively; (2) classifying the study area into rural and urban by fuzzy membership function of the characteristic of land use, impact of city center and traffic condition; (3) combining the results of the rural and urban pollutant load calculating models, according to the fuzzy membership; and (4) carrying out the case study of a rapidly developing watershed, Donghu watershed in central China, to confirm the proposed methods.

STUDY AREA

The study area (Figure 1, 114°18′∼114°30′ E, 30°30′∼30°38′ N, 18,075 ha.), the Donghu watershed, is located in the eastern portion of the city of Wuhan (Gao et al. ). The study site is one of the largest downtown lakes in China. In addition to the general functions of a lake, such as regulating climate, degrading pollution, providing living space for aquatic life and preventing flooding, the Donghu watershed has a significant impact on the ecological environmental safety of Wuhan. Due to the radial effects of urban centers, the impact of intense anthropogenic activities and urbanization on the watershed water quality is profound.

The land use classification is extracted from the LANDSAT TM images in 1991, 2002 and 2005 by the ERDAS software package, and the resolution of the RS images is 30 m × 30 m. Then the results are revised by the land usage pattern provided by ‘The Earth System Science Data Sharing Nets’.

Figure 2 shows the temporal land use patterns of the site in 1991, 2002 and 2005, respectively. From the view of land use pattern, the north-western portion of the area functions completely as a city under the urban expansion of Wuhan, while the obstruction of the lake still allows for rural properties. In 1991, the agricultural land accounted for 48.16% of the whole land area (the total area of built-up, forest and agricultural land). Therefore, the agricultural land area was larger than the built-up land area, which was still more scattered, with no centralized developing tendency at that time. By 2002, under the background of the economic development of China, the western basin exhibited features of a city, as the western development rate was significantly higher than that of the eastern area. Accompanying the significant process of urban expansion, the built-up land increased to 93.13 km² in 2005, while the eastern area was still rural because of the obstruction of the lake. The built-up area of Donghu watershed increased from 51.44 km² in 1991 to 93.13 km² in 2005, and the agriculture and forest were reduced to provide space for urban development (according to the statistical data of the land use map). Generally, we can conclude that the Donghu watershed was a typical urban–rural mixed area in 2005.

METHODS

Firstly, assuming that the watershed is primarily rural, this study calculates the particulate pollutant loads by using the USLE model, considering the factors of slope, normalized difference vegetation index (NDVI), land use type, soil type and rainfall. Subsequently, the dissolved pollutant loads are determined by the ECM, which uses the export coefficient and the corresponding land use pattern to establish the relationship between land use type and pollutant loads. In particular, NDVI is a simple graphical indicator to assess whether the target that has been observed contains live green vegetation or not (Rulinda et al. ). Secondly, assuming that the watershed is urban, the L-THIA model is used to generate the spatial distribution of NPS pollutant loads in terms of total phosphorus and nitrogen. Finally, fuzzy membership functions are established to define the rural and urban weights for each land use cell. As opposed to binary weights, the weights defined here are used to combine the results of the rural and urban NPS pollutant load calculating models.
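The weighting step can be illustrated with a minimal Python sketch. It assumes that the urban and rural weights of a cell sum to 1, so that the combined load is a convex combination of the two model outputs. This combination rule, the function name `combine_loads` and the sample values are illustrative assumptions, not a formulation confirmed by the text.

```python
def combine_loads(rural_load, urban_load, urban_membership):
    """Blend the rural-model and urban-model loads of one land use cell.

    Assumption (hypothetical): the rural weight equals 1 - urban_membership,
    so a membership of 0 returns the rural estimate and 1 the urban estimate.
    """
    if not 0.0 <= urban_membership <= 1.0:
        raise ValueError("fuzzy membership must lie in [0, 1]")
    return urban_membership * urban_load + (1.0 - urban_membership) * rural_load


# Hypothetical 2 x 2 grids of loads (kg/(hm2 a)) and urban memberships.
rural_grid = [[2.0, 2.0], [4.0, 4.0]]
urban_grid = [[6.0, 6.0], [8.0, 8.0]]
membership = [[0.0, 0.25], [0.5, 1.0]]

combined = [
    [combine_loads(r, u, mu) for r, u, mu in zip(r_row, u_row, mu_row)]
    for r_row, u_row, mu_row in zip(rural_grid, urban_grid, membership)
]
```

A binary classification is the special case in which every membership is exactly 0 or 1; intermediate values let boundary cells take a share of both estimates.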
Figure 1 | Location and land use pattern of the study watershed.
Figure 2 | Temporal and historical landscape patterns of study area.

The spatial data used in our study, such as NDVI, slope, distance to road, distance to city center and land use pattern, are retrieved from LANDSAT TM imagery with a resolution of 30 m × 30 m. The non-spatial data, including rainfall capacity and the soil type, are obtained from the Wuhan Statistical Year Book and other statistical sources.

Evaluating NPS pollutant loads in rural areas

In this section, the classic USLE model and the ECM are used to acquire the particulate and dissolved pollutant loads of N and P, respectively.

Particulate N and P loads based on USLE model

The USLE was proposed by Wischmeier & Smith (), and has since been widely used at a watershed scale. It is an empirical model that estimates the average annual soil loss as the product of five erosion risk indicators (Meusburger et al. ). The empirical model to obtain particulate loads is represented in Equation (1):

$$W_{xp} = \beta \cdot C \cdot A \cdot \eta \cdot S_d \qquad (1)$$

where Wxp is the particulate pollutant load (kg/(hm² a)); β is the dimensionless unit conversion constant; CN and CP are the concentrations of particulate N and P, respectively, which are available from the almanac of soil in the Hubei and Henan provinces and the soil database of China provided by the Institute of Soil Science, Chinese Academy of Science, Nanjing; A is the amount of the soil loss (t/(hm² a)); η is the non-dimensional concentration coefficient; Sd is the ratio of the final pollutant loading into the lake to the original load generated in each cell. The specific value of each parameter is shown in Table 1 (Shi et al. ; Xu et al. ; Xue ).

Table 1 | The description of parameters in Equation (1)

| Parameter | Scale | Description |
|---|---|---|
| β | 1,000 | Converting t/(hm² a) to kg/(hm² a) |
| CN, CP | Adsorption P concentration is 0.048%; adsorption N concentration is 0.084% | In red earth |
| A | 0–139.05 t/(hm² a) | – |
| η | 2 | Referring to the relevant research, in this paper η was identified as 2 |
| Sd | 0.1–0.4 | Referring to the Sd of the Changjiang basin provided by the Changjiang Water Resources Committee and the features of our study site, Sd was identified as varying from 0.1 to 0.4, with 0.25 as the average value for all the cells; the Sd of grid (i, j) could then be calculated depending on the distance from the grid to the lake |

From the USLE model, the amount of the soil loss can be obtained as Equation (2) shows, where K is the soil erodibility factor ((t hm² h)/(hm² MJ mm)), P is the support practice factor (non-dimensional), C is the cover management factor (non-dimensional), R is the rainfall erosivity factor ((MJ mm)/(hm² h a)) and LS is the slope length and steepness factor (non-dimensional).

$$A = K \cdot P \cdot C \cdot R \cdot LS \qquad (2)$$
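As an illustration of Equation (1), the following Python sketch evaluates the particulate load for one cell using the parameter values listed in Table 1. The function name `particulate_load`, the default delivery ratio of 0.25 (the stated average Sd) and the sample soil loss of 10 t/(hm² a) are assumptions chosen for the example only.

```python
# Parameter values taken from Table 1; names are hypothetical.
BETA = 1000.0                      # converts t/(hm2 a) to kg/(hm2 a)
ETA = 2.0                          # non-dimensional concentration coefficient
ADSORBED_FRACTION = {
    "N": 0.084 / 100.0,            # adsorbed N concentration in red earth
    "P": 0.048 / 100.0,            # adsorbed P concentration in red earth
}


def particulate_load(soil_loss, pollutant, delivery_ratio=0.25):
    """Equation (1): W_xp in kg/(hm2 a) from soil loss A in t/(hm2 a).

    delivery_ratio is S_d; 0.25 is the average value reported in Table 1
    and is used here only as an illustrative default.
    """
    concentration = ADSORBED_FRACTION[pollutant]
    return BETA * concentration * soil_loss * ETA * delivery_ratio


# Hypothetical cell with a soil loss of 10 t/(hm2 a).
load_n = particulate_load(10.0, "N")
load_p = particulate_load(10.0, "P")
```

Because every term enters as a simple product, the load scales linearly with the soil loss A, so the spatial pattern of particulate loads follows the pattern of A, modulated by the distance-dependent Sd.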
The soil erodibility factor (K) is related to the integrated effects of rainfall, runoff and infiltration on soil loss and can reflect the process of soil loss during storm events on upland areas (Renard et al. ). In our study, the soil type is general red earth and according to experimental data (Deng et al. ; Wang ; Zhang et al. ), the K value is denoted as 0.299.

The support practice factor (P) reflects the effects of soil conservation operations or other measures that will reduce the amount of water runoff and, thus, reduce the erosion rate (Volk et al. ), ranging from 0 to 1. The support practice factor can be obtained through considering the variation of the land use pattern. In this paper, by referencing soil conservation operations and relevant research (Bu et al. ; Cai et al. ; Xu & Shao ) on the study area, the P values are identified according to the land use type: the built-up land has a P value of 0.35; forest is 0.5; agriculture is 0.66; water body is 0.
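The K and P values above can be plugged into Equation (2) as in the following minimal Python sketch. The dictionary keys, the function name `soil_loss` and the sample C, R and LS inputs are hypothetical; only K = 0.299 and the four P values come from the text.

```python
K_RED_EARTH = 0.299                # soil erodibility factor used in the study

# Support practice factor P by land use type, as listed in the text.
P_BY_LAND_USE = {
    "built-up": 0.35,
    "forest": 0.5,
    "agriculture": 0.66,
    "water": 0.0,
}


def soil_loss(land_use, c_factor, r_factor, ls_factor, k_factor=K_RED_EARTH):
    """Equation (2): A = K * P * C * R * LS, in t/(hm2 a)."""
    p_factor = P_BY_LAND_USE[land_use]
    return k_factor * p_factor * c_factor * r_factor * ls_factor


# Hypothetical cell: forest, C = 1.0, R = 100 MJ mm/(hm2 h a), LS = 1.0.
a_forest = soil_loss("forest", c_factor=1.0, r_factor=100.0, ls_factor=1.0)
# Water bodies have P = 0, so they contribute no soil loss.
a_water = soil_loss("water", c_factor=1.0, r_factor=100.0, ls_factor=1.0)
```

Keeping the land use lookup separate from the multiplication makes it straightforward to swap in a different P table for another study area.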
The cover management factor (C) is a weighted index, which takes the effect of land use on soil erosion into account (Dumas et al. ). It is measured as the ratio of soil loss to land cropped under continuously fallow conditions (Wischmeier & Smith ). By definition, C equals 1 under standard fallow conditions. As vegetative cover approaches 100%, the C factor value approaches the minimal value. The C value of each cell is obtained by Equation (3) (Cai et al. ; Zhao et al. ).

$$C = \begin{cases} 1 & l_c = 0 \\ 0.6805 - 0.3436 \lg l_c & 0 < l_c < 78.3\% \\ 0 & 78.3\% \le l_c \end{cases} \qquad (3)$$

$$l_c = \begin{cases} 0 & -1 \le \mathrm{NDVI} \le -0.0675 \\ \dfrac{\mathrm{NDVI} + 0.0675}{0.47} & -0.0675 < \mathrm{NDVI} \le 0.4025 \\ 1 & 0.4025 < \mathrm{NDVI} \le 1 \end{cases} \qquad (4)$$

where lc is the vegetation coverage (non-dimensional). lc in Equation (3) can be obtained as a function of NDVI (see Equation (4)). An NDVI approaching a value of 1 means the associated area is fully covered by vegetation. Using NDVI retrieved from RS data, C values for our site can be calculated, ranging from 0 to 1 with an average value of 0.3316 (see Figure 3(a)).

Figure 3 | Distributions of USLE factors: (a) the cover management factor C; (b) the slope length and steepness factor LS.

The rainfall factor (R) represents two characteristics of a storm that determine its erosivity: the amount of rainfall and the peak intensity sustained over an extended period. R is computed as a function of monthly precipitation (Dumas et al. ) (see Equation (5)):

$$R = \sum_{i=1}^{12} \left( -2.6398 + 0.3046 P_i \right) \qquad (5)$$

where R is in MJ mm/(hm² h a), and Pi is the precipitation in the ith month, which is obtained from the statistics yearbook.

The slope length and steepness factor (LS) represents the effect of topography on erosion, as an increase in slope length and steepness will produce higher overland flow velocities and thus stronger erosion. LS is derived from Equation (6) (Wischmeier & Smith ; Dumas et al. ):

$$LS = \left( \frac{l}{22.13} \right)^{m} \left( 0.085 + 0.045\theta + 0.0025\theta^{2} \right) \qquad (6)$$

$$m = \begin{cases} 0.3 & 22.5^{\circ} \le \theta \\ 0.25 & 17.5^{\circ} \le \theta < 22.5^{\circ} \\ 0.2 & 12.5^{\circ} \le \theta < 17.5^{\circ} \\ 0.15 & 7.5^{\circ} \le \theta < 12.5^{\circ} \\ 0.10 & \theta < 7.5^{\circ} \end{cases} \qquad (7)$$
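Equations (3) to (7) can be sketched in Python as follows. This is an illustrative reading of the formulas, not the authors' implementation: it assumes that lc is passed as a fraction (0 to 1) and converted to percent inside the logarithm of Equation (3), and it clips C at 1 for very sparse cover so that the result stays in the stated 0 to 1 range. All function names are hypothetical.

```python
import math


def vegetation_coverage(ndvi):
    """Equation (4): fractional vegetation coverage l_c from NDVI."""
    if not -1.0 <= ndvi <= 1.0:
        raise ValueError("NDVI must lie in [-1, 1]")
    if ndvi <= -0.0675:
        return 0.0
    if ndvi <= 0.4025:
        return (ndvi + 0.0675) / 0.47
    return 1.0


def cover_factor(lc):
    """Equation (3), with l_c given as a fraction (assumption)."""
    if lc <= 0.0:
        return 1.0
    if lc >= 0.783:
        return 0.0
    c = 0.6805 - 0.3436 * math.log10(100.0 * lc)  # l_c in percent (assumption)
    return min(1.0, c)  # clipping is an assumption to keep C within 0..1


def rainfall_erosivity(monthly_precipitation):
    """Equation (5): R from twelve monthly precipitation totals."""
    if len(monthly_precipitation) != 12:
        raise ValueError("expected twelve monthly values")
    return sum(-2.6398 + 0.3046 * p for p in monthly_precipitation)


def slope_exponent(theta_deg):
    """Equation (7): exponent m as a step function of slope angle."""
    if theta_deg >= 22.5:
        return 0.3
    if theta_deg >= 17.5:
        return 0.25
    if theta_deg >= 12.5:
        return 0.2
    if theta_deg >= 7.5:
        return 0.15
    return 0.10


def ls_factor(slope_length_m, theta_deg):
    """Equation (6): LS from slope length (m) and slope angle (degrees)."""
    m = slope_exponent(theta_deg)
    polynomial = 0.085 + 0.045 * theta_deg + 0.0025 * theta_deg ** 2
    return (slope_length_m / 22.13) ** m * polynomial
```

Each function works on a single cell; applying them over the NDVI and DEM rasters cell by cell would yield the factor maps.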
where l is the slope length in meters, θ is the slope angle in degrees, and m is the slope angle contingent variable ranging from 0.1 to 0.3 (McCool et al. ) computed by the function of slope (see Equation (7)). l and θ are calculated applying a 30 m resolution DEM and ArcGIS software package, followed by the values of m for each cell. LS values vary from 0.088 to 3.037, with an average of 0.152 (Figure 3(b)).

Dissolved N and P loads based on the export coefficient model

The ECM is a well-developed method that has been widely employed in NPS pollution studies (Kay et al. ) which avoids the difficulty of the physical models. This eliminates the difficulties associated with the complex formation of NPS pollution, thereby reducing the requirements of monitoring the processes of the migration and transformation of the pollutants. Thus, the ECM is available for estimating the NPS pollution for the medium or the large-scale watershed. This model is commonly represented in the form of Equation (8):

$$W_{xd} = \sum_{i}^{n} \sum_{j}^{m} E \times \alpha \qquad (8)$$

where Wxd is the output quantity of the dissolved pollutant (kg/hm² a), E is the export coefficient of the pollutant (t/km² a) on different land usages, and α is the conversion factor with the value of 10. The value of E is identified according to the literature review of NPS studies on the Yangtze River and city of Chongqing (Liu et al. ; Cao et al. ) and characteristic of our site (see Table 2).

Table 2 | Export coefficient of the pollutant (E) under hypothesis of rural area

| Land use type | The concentration of dissolved N (t/km² a) | The concentration of dissolved P (t/km² a) |
|---|---|---|
| Built-up | 0.6 | 0.011 |
| Forest | 0.119 | 0.007 |
| Agricultural | 1.2 | 0.04 |
| Water | 0 | 0 |

NPS pollutant loads in rural areas

Based on the USLE which integrated with the empirical model and the ECM, the particulate pollutant load, Wxp, and the dissolved pollutant load, Wxd, are achieved respectively.

Evaluating NPS pollutant loads in urban areas

A distributed hydrological-water quality model based on hydrological response units, the L-THIA model (Phillips et al. ), is selected to simulate the urban NPS pollutant loads. It takes long-term hydrological impacts on land use change into consideration, so it can be useful in researching the relationships between urbanization, surface runoff and urban NPS pollution (Yang et al. ). L-THIA was developed as an effective approach to estimate the NPS pollution resulting from past or proposed land use changes (Zhang et al. ).

Based on the L-THIA model, the NPS pollutant loads can be acquired through Equation (10):

$$NPS_{urban\_model} = AR \cdot AE \cdot UR \qquad (10)$$

$$AR = \frac{(RP - 0.2S)^{2}}{RP + 0.8S} \quad (P \ge 0.2S) \qquad (11)$$

$$S = 25.4 \left( \frac{1000}{CN} - 10 \right) \qquad (12)$$

in which NPSurban_model is the NPS pollutant load (kg/hm² a); UR is the unit conversion constant, 10⁻²; and AE is the concentration of pollutant in the surface runoff for each land use type (mg/L). Due to the difficulty in collecting data, we identify the concentration of pollutant, AE, in
different land use types by literature review and the detailed
Then, the NPS pollutant loads assuming the study area is
information is represented in Table 3. AR is the quantity of
rural, NPSrural_model, can be calculated by adding the particu-
actual runoff, in mm which can be retrieved from the function
late and dissolved NPS pollutant loads (see Equation (9)).
of total annual precipitation, RP, and potential maximum precipitation, S (see Equation (11) and Equation (12)).
NPSrural
model
¼ Wxp þ Wxd
ð9Þ
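Equations (10)–(12) can be illustrated with a minimal Python sketch. The function names, the zero-runoff guard for RP < 0.2S (the usual curve-number convention), the zero load assigned to cells with CN = 0 (water in Table 4) and the 1,000 mm precipitation in the usage lines are assumptions made for illustration, not part of the model description above.

```python
UR = 1e-2  # unit conversion constant from the text: mm * mg/L -> kg/hm2


def potential_retention(cn: float) -> float:
    """S in mm from the curve number CN, Equation (12)."""
    return 25.4 * (1000.0 / cn - 10.0)


def actual_runoff(rp: float, cn: float) -> float:
    """AR in mm from annual precipitation RP (mm), Equation (11)."""
    if cn <= 0.0:
        # Assumption: CN = 0 (water cells in Table 4) yields no runoff load.
        return 0.0
    s = potential_retention(cn)
    if rp < 0.2 * s:
        # Assumption: no runoff until precipitation exceeds 0.2 S.
        return 0.0
    return (rp - 0.2 * s) ** 2 / (rp + 0.8 * s)


def urban_nps_load(rp: float, cn: float, ae: float) -> float:
    """NPS_urban_model in kg/hm2 a, Equation (10); ae is in mg/L."""
    return actual_runoff(rp, cn) * ae * UR


if __name__ == "__main__":
    rp = 1000.0  # hypothetical annual precipitation of one cell, in mm
    # CN from Table 4 and N concentration from Table 3 for built-up land.
    print("built-up TN load:", urban_nps_load(rp, cn=98.81, ae=3.92))
    # CN from Table 4 and N concentration from Table 3 for forest.
    print("forest TN load:", urban_nps_load(rp, cn=92.51, ae=1.9))
```

Because the curve numbers in Table 4 are all above 90, S is small compared with annual precipitation, so in this sketch the load of a cell is driven mainly by its precipitation and by the concentration AE of its land use type.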
According to the precipitation data measured by the monitoring station and the approach of interpolation, the annual precipitation of each cell is obtained (see Figure 4). The potential maximum retention, S, can be identified from the CN value, which is obtained from the literature. In Yang's study (), the CN values of the Hanyang district, which is approximately 15 kilometers away from Donghu Lake and experiences similar temperatures and rainfall to our site, were proposed. In addition, the CN values used in Yang's paper had already been modified for the antecedent moisture condition (AMC) (Zhao ; Li et al. ; Wang et al. ); the detailed values are presented in Table 4.

Table 3 | Concentrations of pollutants (AE) under the hypothesis of urban area

| Land use type | Concentration of N (mg/L) | Concentration of P (mg/L) |
|---|---|---|
| Built-up | 3.92 | 0.4 |
| Forest | 1.9 | 0.42 |
| Agricultural | 5.7 | 1.6 |
| Water | 0 | 0 |

Figure 4 | Spatial distribution of summed annual precipitation in 2005.

Table 4 | CN value in each land use type in the L-THIA model

| Land use type | CN |
|---|---|
| Built-up | 98.81 |
| Forest | 92.51 |
| Agricultural | 96.02 |
| Water | 0.00 |

Complex NPS loads in urban and rural mixed places

Because the generation and properties of NPS pollutants in rural and urban areas are totally different, it is essential to discriminate between the rural and urban areas and to use different models and parameters to evaluate the NPS in these two places. However, it is difficult to discriminate between rural and urban cells in a rapidly developing area where rural and urban places are mixed and coexist. Conventionally, an administrative boundary is employed to distinguish the characteristic of the study area. In China, a city administratively contains a built-up area, suburbs, and counties under city administration. Usually the built-up area is urban, and the suburbs are a mix of urban and rural. In fact, it is hard to distinguish, using an administrative boundary, between urban and rural places, which undergo totally different processes of NPS pollutant generation and transportation. Especially in some rapidly developing cities, rural places are continually changing into urban areas to satisfy the requirements of economic development and population growth. Conventional classification methods divide the land into classic clusters, for example '0' meaning rural or '1' meaning urban. Fuzziness exists in the real world, especially at boundaries where it is hard to judge or classify. Hence, the classic classification method would be unsuitable. A fuzzy membership function-based approach is proposed to define a cell as urban or rural. It not only classifies the cell as urban or rural, but also provides the membership of belonging to a rural or an urban place, which reflects the degree of belonging to a certain cluster and can be used to combine the urban and rural NPS evaluation results.

In our work, three factors are defined to evaluate whether a cell belongs to an urban place: (1) characteristic of surrounding land use, (2) influence of the city center, and (3) traffic condition. The density of built-up land within the land use cell is used to express the characteristic of surrounding land use. The grids within 150 m are taken into consideration to calculate the density. The city center is defined as the CBD (Central Business District) of Wuhan, and the distance from the city center is employed to measure the influence of the city center. Finally, the distance to the
nearest road line is proposed to reflect the traffic condition: the smaller the distance, the more convenient the traffic condition.

In particular, the factors stated above can be denoted as $X = \{x_1, x_2, x_3, \ldots, x_m\}$, where $m$ is the number of attributes, and the fuzzy mapping can be established: $\tilde{f}: X \rightarrow \vartheta(r)$. In other words, by function $f$ the attribute of a certain land use cell, $x_i$, can be mapped to the membership of belonging to a certain cluster $j$, which can be written as $r_{ij}$. The function $f$ is determined by the suggestions of experts and the characteristics of the study area. From the memberships of the single attributes, the comprehensive evaluation can be conducted. The membership of belonging to a certain cluster $j$ can be calculated according to the fuzzy operator of $r_{ij}$ (see Equation (13)). In our study, the product of the memberships of all single attributes is employed (see Equation (14)).

$$R_j = \otimes\, r_{ij} \tag{13}$$

$$R_j = \prod_{i=1}^{m} r_{ij} \tag{14}$$

Generally, a cell with a large density of built-up land, a small distance to the city center and a small distance to a road tends to be an urban cell; conversely, the cell tends to be a rural cell. The bell-shaped function, the Gaussian curve function, and the sigmf function, which is composed of the difference between two sigmoidal membership functions, can be used as fuzzy functions (see Equations (15)–(17)) to express the relationship between the attributes and the memberships. The parameters of the fuzzy functions can be generated according to the opinion of experts. The Analytic Hierarchy Process (AHP) (Saaty ), which is a pairwise comparison approach, has been used to extract the experts' opinion.

$$f_{bell}(x; a, b, c) = \frac{1}{1 + \left|(x - c)/a\right|^{2b}} \tag{15}$$

$$f_{sigmf}(x; a_1, c_1, a_2, c_2) = \frac{1}{1 + e^{-a_1(x - c_1)}} - \frac{1}{1 + e^{-a_2(x - c_2)}} \tag{16}$$

$$f_{gaussian}(x; a, b, c) = a\, e^{-(x - b)^{2}/c^{2}} \tag{17}$$

By suggestion of experts, the fuzzy functions are defined, and Figure 5 presents the trend line between the membership and the value of each factor. The fuzzy function of belonging to a rural place is defined as $f_{rural} = 1 - f_{urban}$. Then, the membership of belonging to an urban place is defined as Equation (18) shows.

$$f_{urban}(X) = f_{urban1}(DeBuilt; a, b, c) \times f_{urban2}(DisCen; a, b, c) \times f_{urban3}(DisRoad; a, b, c) \tag{18}$$

Finally, the NPS in a land use cell can be calculated as Equation (19) shows. Otherwise, the integrated NPS would be calculated according to binary weights, as Equation (20) shows.

$$NPS = f_{urban}(X) \times NPS_{urban\_model} + f_{rural}(X) \times NPS_{rural\_model} \tag{19}$$

$$NPS = \begin{cases} NPS_{urban\_model} \times 1 + NPS_{rural\_model} \times 0, & \text{cell} \in \text{urban place} \\ NPS_{urban\_model} \times 0 + NPS_{rural\_model} \times 1, & \text{cell} \in \text{rural place} \end{cases} \tag{20}$$

RESULTS AND DISCUSSION

Urban NPS

Urban NPS pollutant loads calculated by the L-THIA model are shown in Figures 6(a) and 7(a), assuming that the study area is totally urban. The L-THIA model determines the urban NPS pollutant loads through rainfall-runoff and the concentration of pollutants within each land use type, because in an urban system the land is covered by impervious areas and the influence of natural factors such as slope or soil type is smaller, while the impact of human activities is larger. In the L-THIA model, the intensity of human activities is indirectly reflected by land use. As a result, the total nitrogen (TN) load in the built-up land area is around 42 to 47 kg/hm² a, and that of the forest area is about 17 to 22 kg/hm² a. Meanwhile, the agricultural land area has the highest TN load, reaching about 60 kg/hm² a, and the total phosphorus (TP) load shows a spatial distribution similar to that of TN.
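The membership functions of Equations (15)–(17) and the integration of Equations (18) and (19) can be sketched in Python as follows. This is only an illustrative sketch: which function is paired with which factor, and every parameter value in `urban_membership`, are hypothetical choices rather than the expert-derived settings used in this study.

```python
import math


def f_bell(x: float, a: float, b: float, c: float) -> float:
    """Bell-shaped membership, Equation (15)."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))


def f_sigmf(x: float, a1: float, c1: float, a2: float, c2: float) -> float:
    """Difference between two sigmoidal functions, Equation (16)."""
    first = 1.0 / (1.0 + math.exp(-a1 * (x - c1)))
    second = 1.0 / (1.0 + math.exp(-a2 * (x - c2)))
    return first - second


def f_gaussian(x: float, a: float, b: float, c: float) -> float:
    """Gaussian curve membership, Equation (17)."""
    return a * math.exp(-((x - b) ** 2) / c ** 2)


def urban_membership(de_built: float, dis_cen: float, dis_road: float) -> float:
    """Product of the three single-factor memberships, Equation (18).

    All parameters below are hypothetical placeholders.
    """
    m_built = f_bell(de_built, a=0.5, b=2.0, c=1.0)         # peaks at density 1
    m_center = f_gaussian(dis_cen, a=1.0, b=0.0, c=1.0e4)   # decays with metres
    m_road = f_gaussian(dis_road, a=1.0, b=0.0, c=500.0)    # decays with metres
    return m_built * m_center * m_road


def integrate_nps(f_urban: float, nps_urban: float, nps_rural: float) -> float:
    """Membership-weighted load, Equation (19), with f_rural = 1 - f_urban."""
    return f_urban * nps_urban + (1.0 - f_urban) * nps_rural


if __name__ == "__main__":
    # Hypothetical cell: fairly dense, 3 km from the center, 200 m from a road.
    f_u = urban_membership(de_built=0.8, dis_cen=3000.0, dis_road=200.0)
    print("urban membership:", f_u)
    print("integrated load:", integrate_nps(f_u, nps_urban=40.0, nps_rural=20.0))
```

Setting the membership to exactly 1 or 0 in `integrate_nps` reproduces the binary weighting of Equation (20), which makes the difference between the fuzzy and the crisp classification easy to inspect for a single cell.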
Figure 5 | The membership functions of three factors.

Rural NPS

As for the rural NPS pollutant loads (Figures 6(b) and 7(b)) calculated by the USLE model, the TN and the TP range from 0 to 30 kg/hm² a and 1 to 10 kg/hm² a, respectively. They do not correspond to the land use pattern but relate much more to natural factors such as slope, the ratio of vegetation (which is measured by NDVI), soil type and so on. In Figures 8(a) and 9(a), TP and TN highlights are both concentrated in the southern region, and the analysis reveals that these highlights usually correspond to difficult terrain, such as steeper slopes, which tend to be associated with strong runoff and high soil erodibility, resulting in higher NPS loads. Since the USLE model combines rainfall, topography, management factors, soil types, cover management factors and other factors to simulate NPS pollutant loads, the distributions tend to be gentle and gradual.

Figure 6 | Results of TP load by different methods: (a) by urban method, (b) by rural method, and (c) integrated result.

Figure 7 | Results of TN load by different methods: (a) by urban method, (b) by rural method, and (c) integrated result.

Figure 8 | Histograms of TP load: (a) by USLE model, (b) by L-THIA model, and (c) integrated result.

Figure 9 | Histograms of TN load: (a) by USLE model, (b) by L-THIA model, and (c) integrated result.

Division of urban and rural areas

Using the fuzzy membership functions and the three factors, the land use cells can be classified into rural and urban at the same time. Figure 10 displays the three factors on the x, y and z axes, with the position of each point denoting the values of the three factors and the color denoting the membership of belonging to an urban place. In plot A (see Figure 10), where the density of built-up land is close to 1 (the maximum) and the distance to the center is short, the cells have a higher membership of belonging to an urban place. In plot B, the cells with medium values of the three factors are hard to classify, and the memberships of these cells are around 0.4 to 0.6. Additionally, we find that as the distance from the road increases, the cell tends to have a smaller membership of belonging to urban places. Besides the cells in plot B, the cells in the center of this three-dimensional space (plot C) also have memberships near 0.5.

Figure 10 | Three-dimensional distribution of the membership with three factors denoted by x, y and z axis.

Correspondingly, Figure 11(a) presents the membership based on the spatial distribution of the land use cells. We find that the cells with the highest fuzziness (around 0.5) are concentrated around the boundary between the urban and rural places.

Additionally, to justify the advantage of using the fuzzy approach, the conventional results with binary values obtained by the k-means classification method are displayed in Figure 11(b). By comparison, we find that the membership ranging from 0 to 1 reflects the distribution and characteristics of land use better than the classic results, where only two integers are used, denoting 'rural' and 'urban'. The fuzzy approach generates results generally similar to those of the classic method, in that the rural and urban distribution is generally the same. But in the boundary, the fuzzy approach
uses a gradual value to express the change from urban to rural, whereas the classic results provide a sharp break.

Figure 11 | The classification of land use cells to urban and rural in two-dimensional space: (a) the results by fuzzy approach; (b) the results by conventional k-means method.

Complex NPS

Ren et al. () investigated and analyzed the urbanization level and the water quality of Shanghai from 1947 to 1996, showing that the faster the rate of urbanization increased, the poorer the water quality became in their case study; Sartor et al. () focused on the pollutant loads on urban streets, and pointed out that pollutant loads of nutrients in urban runoff were much higher than those of rural areas in their case study; Shon et al. () argued that the amount of NPS pollutant loads discharged into rivers was larger in urban regions than in forests and farmlands, because of the high population and greater impermeable areas, and then used a storm water management model (SWMM) to simulate NPS pollutant loads in the target area. Similarly, this study revealed that the NPS pollutant loads estimated under the urban model are larger than those of the rural model (Table 5) in Donghu watershed.

Table 5 | Pollutant loads of each land use type by different methods

| Pollutant | Method | Built-up (kg/hm² a) | Forest (kg/hm² a) | Agriculture (kg/hm² a) | Water (kg/hm² a) |
|---|---|---|---|---|---|
| Average P | Urban | 3.451 | 5.6545 | 11.921 | 0 |
| Average P | Rural | 2.7952 | 4.8293 | 8.8148 | 0 |
| Average P | Overall | 3.1443 | 4.888 | 9.4385 | 0 |
| Average N | Urban | 30.804 | 17.386 | 45.156 | 0 |
| Average N | Rural | 22.381 | 12.809 | 35.667 | 0 |
| Average N | Overall | 26.865 | 13.135 | 37.573 | 0 |

The integrated results (see Figures 6(c) and 7(c)) better reflect the complex NPS pollution distribution because they assume that the study area is a mix of rural and urban. According to the memberships of belonging to urban and rural, the urban and rural NPSs are integrated, and there is no sharp break between rural and urban areas, meaning that the NPSs in rural and urban areas interact even though they differ in generation and characteristics. Especially in the boundary area, the urban NPS and rural NPS are mixed and interactive. Hence, in the integrated NPS distribution maps there is no sharp break, while different NPS evaluation models are still applied to different places.

CONCLUSION

Computation and analysis of NPS pollutants according to land use changes, precipitation, topography, soil type, vegetation and other factors in urban–rural mixed places were
presented. In this study, we established a comprehensive model that calculates rural and urban NPS pollutant loads by USLE, ECM and L-THIA, respectively. Then, we introduced the fuzzy membership function based approach to integrate the rural and urban NPS pollutant loads through the evaluation of land use characteristics. Afterwards, the results were obtained regarding complex NPS pollution in an urban–rural mixed watershed, Donghu watershed.

Even though numerous studies are concerned with the variation of NPS pollutant loads under the rapid urbanization process, there is no applicable model that addresses the increasingly common urban–rural mixed watershed and that considers the difference in generation and characteristics of pollution between the urban and rural areas. To address this issue, we first identify the NPS pollution occurring in urban–rural mixed areas and caused by various pollutants in rural and urban surface runoff together as the complex NPS pollution, and then employ the fuzzy membership function to classify the urban areas and rural areas so as to integrate the well-developed urban NPS model and rural NPS model. The results are consistent with existing research conclusions and with the characteristics of our site.

Urbanization is widespread worldwide. Take China as an example: the national urbanization level stood at 11% in 1949 and sharply increased to 29% in 1996 (Wang et al. ). Although the rapid urbanization process has boosted the economy and led to a higher quality of life, it has also brought some adverse effects. For example, in addition to deteriorating water quality, the decline of indoor air quality (Wang et al. ), damage to ecosystems, climate change (Grimm et al. ), threats to biodiversity (Pompeu et al. ), effects on tree growth (Gregg et al. ), promotion of asthma (Lin et al. ) and so on are associated with rapid urbanization. Hence, it is important to evaluate, analyze and understand the relationship between urbanization and the corresponding detrimental effects. Within this context, the model proposed in this study, which focuses on the fuzziness in rural–urban mixed places, is innovative and applicable to the above-mentioned problems. In particular, the proposed model can be employed in any other rural and urban mixed region that undergoes rapid urbanization, with corresponding classification factors. For example, in Donghu watershed the distance to the city center, the density of built-up land, and the distance to the nearest road are employed to describe the characteristics of each land use cell by fuzzy membership functions, while in other case study areas the factors may differ. Then, according to the fuzziness in terms of land usage, the degree of being urban or rural can be identified. Afterward, the relationship between urbanization and the above-mentioned problems can be assessed precisely.

ACKNOWLEDGEMENTS

This study is supported by funding from the National Natural Science Foundation of China (grant nos. 40701184 and 40871179). All the authors greatly appreciate the contributions of the anonymous referees, who provided very useful suggestions.
REFERENCES Bahadur, K. C. K. Mapping soil erosion susceptibility using remote sensing and GIS: A case of the upper Nam Wa Watershed, Nan Province, Thailand. Environ. Geol. 57 (3), 695–705. Bhaduri, B., Harbor, J., Engel, B. & Grove, M. Assessing watershed-scale, long-term hydrologic impacts of land-use change using a GIS-NPS model. Environ. Manage. 26 (6), 643–658. Bu, Z. H., Sun, J. Z. & Zhou, F. J. Study on quantitative remote sensing method for soil erosion and its application. Acta Pedologica Sinica 34 (3), 235–245. Cai, C. F., Ding, S. W., Shi, Z. H., Huang, L. & Zhang, G. Y. Study of applying USLE and geographical information system IDRISI to predict soil erosion in small watershed. J. Soil Water Conserv. 14 (2), 19–24. Cao, Y. L., Li, C. M., Guo, J. S. & Fang, F. Pollutant source analysis and pollution loads estimation from non-point source in Chongqing Three Gorges Reservoir Region. J. Chongqing Jianzhu Univ. 29 (4), 1–5. Chris, B. C., Randy, K. K. & James, A. T. Water quality in agricultural, urban, and mixed land use watershed. J. Am. Water Res. Assoc. 40 (6), 1593–1601. Deng, L. J., Hou, D. B., Wang, C. Q., Zhang, S. R. & Xia, J. G. Study on characteristics of erodibility of natural soil and non-irrigated soil of Sichuan. Soil Water Conserv. China 7, 23–25.
Dixon, B. Groundwater vulnerability mapping: A GIS and fuzzy rule based integrated tool. Appl. Geogr. 25 (4), 327–347. Do, H. T., Lo, S. L., Chiueh, P. T., Lan, A. P. T. & Shang, W. T. Optimal design of river nutrient monitoring points based on an export coefficient model. J. Hydrol. 406 (1–2), 129–135. Dumas, P., Printemps, J., Mangeas, M. & Luneau, G. Developing erosion models for integrated coastal zone management: A case study of The New Caledonia west coast. Mar. Pollut. Bull. 61, 519–529. Edwin, D. O., Zhang, X. L. & Yu, T. Current status of agricultural and rural non-point source pollution assessment in China. Environ. Pollut. 158, 1159–1168. Fistikoglu, O. & Harmancioglu, N. B. Integration of GIS with USLE in assessment of soil erosion. Water Res. Manag. 16, 447–467. Gao, J. Q., Xiong, Z. T., Zhang, J. D., Zhang, W. H. & Obono Mba, F. Phosphorus removal from water of eutrophic Lake Donghu by five submerged macrophytes. Desalination 242 (1), 193–204. Gemitzi, A., Petalas, C., Tsihrintzis, V. A. & Pisinaras, V. Assessment of groundwater vulnerability to pollution: A combination of GIS, fuzzy logic and decision making techniques. Environ. Geol. 49 (5), 653–673. Gregg, J. W., Jones, C. G. & Dawson, T. E. Urbanization effects on tree growth in the vicinity of New York City. Nature 424 (6945), 183–187. Grimm, N. B., Foster, D., Groffman, P., Grove, J. M., Hopkinson, C. S., Nadelhoffer, K. J., Pataki, D. E. & Peters, D. P. The changing landscape: Ecosystem responses to urbanization and pollution across climatic and societal gradients. Front. Ecol. Environ. 6 (5), 264–272. Harbor, J. A practical method for estimating the impact of land use change on surface runoff, groundwater recharge and wetland hydrology. J. Am. Plann. Assoc. 60, 91–104. Haregeweyn, N. & Yohannes, F. Testing and evaluation of the agricultural non-point source pollution model (AGNPS) on Augucho catchment, western Hararghe, Ethiopia. Agric. Ecosyst. Environ. 99, 201–212. Jing, L. & Chen, B. 
Field investigation and hydrological modelling of a subarctic wetland – the Deer River Watershed. J. Environ. Inf. 17 (1), 36–45. Johnes, P. J. & Heathwaite, A. L. Modelling the impact of land use change on water quality in agricultural catchments. Hydrol. Processes 11 (3), 269–286. Kay, D., Crowther, J., Stapleton, C. M., Wyer, M. D., Fewtrell, L., Anthony, S., Bradford, M., Edwards, A., Francis, C. A., Hopkins, M., Kay, C., McDonald, A. T., Watkins, J. & Wilkinson, J. Faecal indicator organism concentrations and catchment export coefficients in the UK. Water Res. 42 (10–11), 2649–2661. Kim, K. Y., Ventura, S. J., Harris, P. M., Thum, P. G. & Prey, J. Urban non-point-source pollution assessment using a geographical information- system. J. Environ. Manage. 39 (3), 157–170.
Leone, A., Ripa, M. N., Uricchio, V., Deak, J. & Vargay, Z. Vulnerability and risk evaluation of agricultural nitrogen pollution for Hungary’s main aquifer using DRASTIC and GLEAMS models. J. Environ. Manage. 90 (10), 2969–2978. Li, Y. J., Nian, Y. G., Song, Y. W., Hu, S. R., Nie, Z. D., Yan, H. H. & Yin, Q. Spatio-temporal variation of non-point source pollutants in Wuli Lake, Taihu Lake. J. Sichuan Univ. (Engineering Science Edition) 41 (2), 125–130. Lim, K. J., Engel, B. A., Tang, Z., Muthukrishnan, S., Choi, J. & Kim, K. Effects of calibration on L-THIA GIS runoff and pollutant estimation. J. Environ. Manage. 78, 35–43. Lin, R. S., Sung, F. C., Huang, S. L., Gou, Y. L., Ko, Y. C., Gou, H. W. & Shaw, C. K. Role of urbanization and air pollution in adolescent asthma: A mass screening in Taiwan. J. Formosan Med. Assoc. 100 (10), 649–655. Liu, Y. & Phinn, S. R. Modelling urban development with cellular automata incorporating fuzzy-set approaches. Comput. Environ. Urban Syst. 27 (6), 637–658. Liu, Z. & Tong, S. T. Y. Using HSPF to model the hydrologic and water quality impacts of riparian land-use change in a small watershed. J. Environ. Inf. 17 (1), 1–14. Liu, R. M., Yang, Z. F., Ding, Z. F., Ding, X. W., Shen, Z. Y., Wu, X. & Liu, F. Effect of land use/cover change on pollution load of non-point source in upper reach of Yangtze River basin. Environ. Sci. 27 (12), 2407–2414. McCool, D., Brown, L., Foster, G., Mutchler, C. & Meyer, L. Revised slope steepness factor for the USLE. Trans. Am. Soc. Agric. Eng. 30, 1387–1396. Meusburger, K., Konz, N., Schaub, M. & Alewell, C. Soil erosion modelled with USLE and PESERA using QuickBird derived vegetation parameters in an alpine catchment. Int. J. Appl. Earth Obs. Geoinf. 12, 208–215. Ouyang, W., Skidmore, A. K., Toxopeus, A. G. & Hao, F. H. Long-term vegetation landscape pattern with non-point source nutrient pollution in upper stream of Yellow River basin. J. Hydrol. 389, 373–380. Pandey, A., Chowdary, V. M. & Mal, B. C. 
Identification of critical erosion prone areas in the small agricultural watershed using USLE, GIS and remote sensing. Water Resour. Manage. 21, 729–746. Phillips, P., Russell, F. A. & Turner, J. Effect of non-point source runoff and urban sewage on Yaquedel Norte River in Dominican Republic. Int. J. Environ. Pollut. 31 (3–4), 244– 266. Pompeu, P. S., Alves, C. B. M. & Callisto, M. A. R. C. O. S. The effects of urbanization on biodiversity and water quality in the Rio das Velhas basin, Brazil. Am. Fish. Soc. Symp. 47, 11–22. Ren, W. W., Zhong, Y., Melilgrana, J., Ariderson, B., Watt, W. E., Chan, J. K. & Leung, H. L. Urbanization, land use, and water quality in Shanghai 1947–1996. Environ. Int. 29, 649–659. Renard, K. G., Foster, G. R., Weesies, G. A., McCool, D. K. & Yoder, D. C. Predicting Soil Erosion by Water: A Guide
to Conservation Planning with Revised Universal Soil Loss Equation (RUSLE). Department of Agriculture, ARS, Washington, DC. Rulinda, C. M., Bijker, W. & Stein, A. Characterising and quantifying vegetative drought in East Africa using fuzzy modelling and NDVI data. J. Arid Environ. 78, 169–178. Saaty, T. L. The Analytic Hierarchy Process. McGraw-Hill, New York. Sartor, J. D., Boyd, G. B. & Agardy, F. J. Water Pollution Aspects of Street Surface Contaminants. The United States Environmental Protection Agency, Washington, DC. Shi, Z. H., Cai, C. F., Ding, S. W., Li, Z. X., Wang, T. W., Zhang, B. & Sheng, X. L. Research on nitrogen and phosphorus load of agricultural non-point sources in middle and lower reaches of Hanjiang River based on GIS. Acta Scientiae Circumstantiae 22 (4), 473–477. Shields, C. A., Band, L. E., Law, N., Groffman, P. M., Kaushal, S. S., Savvas, K., Fisher, G. T. & Belt, K. T. Streamflow distribution of non-point source nitrogen export from urbanrural catchments in the Chesapeake Bay watershed. Water Resour. Res. 44 (9), 1–13. Shon, T. S., Kim, S. D., Cho, E. Y., Im, J. Y., Min, K. S. & Shin, H. S. Estimation of NPS pollutant properties based on SWMM modeling according to land use change in urban area. Desalin. Water Treat. 37/38 (1–3), 333. Shrestha, S., Kazama, F., Newham, L. T. H., Babel, M. S., Clemente, R. S., Ishidaira, H., Nishida, K. & Sakamoto, Y. Catchment scale modeling of point source and nonpoint source pollution loads using pollutant export coefficients determined from long-term in stream monitoring data. J. Hydro-Environ. Res. 2, 134–147. Volk, M., Moller, M. & Wurbs, D. A pragmatic approach for soil erosion risk assessment within policy hierarchies. Land Use Policy 27, 997–1009. Wang, D. C. Modeling the Process of Runoff and Sediment Yield on Slopeland Based on ARCGIS. Southwestern University, Chongqing. Wang, Z., Bai, Z., Yu, H., Zhang, J. & Zhu, T. 
Regulatory standards related to building energy conservation and indoor-air-quality during rapid urbanization in China. Energy Build. 36 (12), 1299–1308. Wang, J. Y., Da, L. J., Song, K. & Li, B. L. Temporal variations of surface water quality in urban, suburban and rural areas during rapid urbanization in Shanghai, China. Environ. Pollut. 152, 387–393. Wang, K., Wu, W. Y., Chen, Y. Q. & Ding, H. Study on nonpoint pollution characteristics of urban runoff in Fuzhou city. J. Minjiang Univ. 30 (2), 107–111. William, S. J. R., Nicks, A. D. & Arnold, J. R. Simulation for water resource in rural basins. Hydraul. Eng. 111 (6), 970– 986.
Wischmeier, W. H. & Smith, D. D. Predicting Rainfall Erosion Losses.USDA Agricultural Research Services handbook 537. USDA, Washington, DC, p. 57. Xu, Y. L., Li, H. E. & Ni, Y. M. Estimate on pollutant loads of nitrogen and phosphorus based on USLE in Heihe river watershed. J. Northwest Sci-Tech Univ. Agric. Forestry (Natural Science Edition) 34 (3), 138–142. Xu, Y. Q. & Shao, X. M. Estimation of soil erosion supported by GIS and RUSLE: A case study of Maotiaohe Watershed, Guizhou Province. J. Beijing Forestry Univ. 28 (4), 67–71. Xue, S. L. Simulating the Non-Point Pollution Load of Nitrogen and Phosphorus in the Heihe Watershed Based on GIS. Xi’an University of Technology, Xi’an. Yang, A. L., Huang, G. H., Qin, X. S. & Fan, Y. R. Evaluation of remedial options for a benzene-contaminated site through a simulation-based fuzzy-MCDA approach. J. Hazard. Mater. 213, 421–433. Yang, L., Ma, K. M., Guo, Q. H. & Bai, X. Evaluating longterm hydrological impacts of regional urbanisation in Hanyang, China, using a GIS model and remote sensing. Int. J. Sustainable Dev. World Ecol. 15, 350–356. Zhang, H. & Huang, G. H. Assessment of non-point pollution using a spatial multicriteria approach. Ecol. Modell. 222, 313–321. Zhang, J. H., Shen, T., Liu, M. H., Wan, Y., Liu, J. B. & Li, J. Research on non-point source pollution spatial distribution of Qingdao based on L-THIA model. Math. Comput. Modell. 54, 1151–1159. Zhang, K. L., Peng, W. Y. & Yang, H. L. Soil erodibility and its estimation for agricultural soil in China. Acta Pedologica Sinica 44 (1), 7–13. Zhang, W. W., Shi, M. J. & Huang, Z. H. Controlling nonpoint-source pollution by rural resource recycling. Nitrogen runoff in Tai Lake valley, China, as an example. Sustainability Sci. 1 (1), 83–89. Zhao, Y. X. Prediction of Non-point Pollution in the Small Watershed of the Miyun Reservoir. Beijing Jiaotong University, Beijing. Zhao, Y. X., Zhang, W. S., Wang, Y. & Wang, T. T. 
Soil erosion intensity prediction based on 3S technology and the USLE: A case from Qiankeng Reservoir basin in Shenzhen. J. Subtropical. Resour. Environ. 2 (3), 23–28. Zheng, C., Yang, W. & Yang, Z. F. Strategies for managing environmental flows based on the spatial distribution of water quality: A case study of Baiyangdian Lake, China. J. Environ. Inf. 18 (2), 84–90. Zhuang, Y. H., Hong, S., Zhang, W. T., Lin, H. Y., Zeng, Q. H., Nguyen, T., Niu, B. B. & Li, W. Y. Simulation of the spatial and temporal changes of complex non-point source loads in a lake watershed of central China. Water Sci. Technol. 67.9, 2050–2058.
First received 4 October 2012; accepted in revised form 15 May 2013. Available online 13 July 2013
© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014
The characteristics of probability distribution of groundwater model output based on sensitivity analysis

Xiankui Zeng, Jichun Wu, Dong Wang and Xiaobin Zhu
ABSTRACT The probability distribution of groundwater model output is the direct product of modeling uncertainty. In this work, we analyze the probability distribution of groundwater model outputs (groundwater level series and budget terms) based on sensitivity analysis. Two sources of uncertainty are considered: (1) the probability distribution of the model's input parameters; (2) the spatial position of the observation point. Based on a synthetic groundwater model, the probability distributions of model outputs are identified by frequency analysis. The sensitivity of the output's distribution is analyzed by stepwise regression analysis, mutual entropy analysis, and classification tree analysis. Moreover, the key uncertainty variables influencing the mean, variance, and category of the probability distributions of groundwater outputs are identified and compared. Results show that mutual entropy analysis is more general than stepwise regression for identifying multiple influencing factors that have a similar correlation structure with the output variable. Classification tree analysis is an effective method for analyzing the key driving factors in a classification output system.

Xiankui Zeng, Jichun Wu (corresponding author), Dong Wang, Xiaobin Zhu
Key Laboratory of Surficial Geochemistry, Ministry of Education, Department of Hydrosciences, School of Earth Sciences and Engineering, State Key Laboratory of Pollution Control and Resource Reuse, Nanjing University, Nanjing, 210093, China
E-mail: jcwu@nju.edu.cn

Key words | classification tree analysis, frequency analysis, groundwater modeling, mutual entropy analysis, probability distribution, sensitivity analysis
doi: 10.2166/hydro.2013.106

INTRODUCTION

Groundwater modeling and prediction are influenced by many factors from the surface to underground. The uncertainty of groundwater model outputs stems from a number of factors including incomplete model structure, incorrect boundary conditions, and aquifer parameters (Hassan et al. ; Wu et al. ; Zhang et al. ; Gungor & Goncu ; Zeng et al. ). Data scarcity and observation errors increase the difficulty of handling simulation uncertainty (Hassan et al. ; Mpimpas et al. ; Wang et al. ). In recent years, a number of studies have assessed the uncertainties of groundwater model outputs. These studies focus on uncertainty assessment with respect to uncertainty sources such as hydrogeological parameters, conceptual model, and scenario (Blasone et al. ; Hassan et al. ; Ye et al. ; Hashemi et al. ; Morway et al. ).

Uncertainty analysis of groundwater models is often implemented in a probability statistical framework (Blasone et al. ; Hassan et al. ). The results are generally expressed as the probability distributions of outputs of interest (e.g., groundwater level, boundary flux, solute concentration). The uncertainty of a random variable can be described by the characteristics of its probability distribution, which include the probability density function (PDF) and numerical characteristics (e.g., mean and variance). The location, range, and shape of a random variable's distribution are determined by the probability distribution. The sensitivity analysis of a groundwater output's probability distribution is primarily aimed at identifying two types of influencing factors. One is the factor affecting the numerical characteristics of a random variable; the other is the driving factor which leads the output variable to obey a specific PDF.
For the uncertainty analysis of groundwater simulation, in general, we are interested in the probability distribution of model output. However, little attention has been devoted to the influencing factors of the model output's distribution in previous studies. In this paper, we focus on the sensitivity of the groundwater model output's probability distribution to two sources: (1) the probability distribution of the input parameters; (2) the spatial position of the observation point.

Frequency analysis is a technique which has been extensively used in hydrologic uncertainty issues (Lang et al. ; Neppel et al. ), such as the design of flood control and risk management. The observation series is used to fit an alternative PDF, and then the variable's distribution uncertainty is analyzed statistically (Smakhtin ; Katz et al. ). Generally, there are three basic procedures for frequency analysis: (1) selecting a suitable PDF for the data series; (2) parameter estimation for the selected PDF; (3) uncertainty assessment for the data series (Onoz & Bayazit ). Herein, how to select a suitable PDF is the key problem for a frequency analysis. According to Mcmahon & Srikanthan (), Haktanir (), Onoz & Bayazit (), and Vogel et al. (), there is no universally applicable rule for selecting the best PDF, and the qualified PDF should be selected based on effective comparison and testing.

For a complicated groundwater model, it is hard to describe the influences of model inputs on outputs directly by a mathematical model. Sensitivity analysis provides an effective framework for unraveling the relationship between the input variables and outcomes. In general, the object of study is the direct model output, such as hydraulic head (Rojas et al. ; Mazzilli et al. ) and solute concentration (Huysmans et al. ; Zhang et al. ). The influencing factors of the output variable can be identified by sensitivity analysis. In this study, the research object is not the direct model output, but the probability distribution of the output. The importance of this kind of influencing factor can be regarded as another form of sensitivity to model output. Furthermore, recognizing the distribution characteristics of model output will help in identifying groundwater modeling uncertainty, improving model structure, and providing feedback for data collecting activities relating to model uncertainty analysis.

A synthetic groundwater model was built for producing groundwater outputs. The outputs of the model include groundwater level series (GLS) and groundwater budget terms. The suitable PDFs of outputs were selected by the Kolmogorov–Smirnov test. After that, stepwise regression and mutual entropy analysis were used to identify the influencing factors of the first two moments of GLS (mean and variance). Mutual entropy analysis, a sensitivity analysis method based on information theory, is compared with stepwise regression analysis. Finally, for the sensitivity analysis of a classification output system, classification tree analysis was used to identify the driving factors that lead the GLS to obey a specified distribution.

The main results of this study were obtained from a synthetic groundwater model. This groundwater model is simple compared to a real groundwater system. Therefore, the research results can be regarded as a mathematical exploration into the characteristics of the probability distribution of groundwater model outputs. Some conclusions need further confirmation in the real field. Nevertheless, the use of a real groundwater model is not easy for such analysis, because observations are often limited in the number and length of data series.

In the following sections, the methods used for this research are described. Then, a synthesized groundwater flow model is presented. In the results and discussion section, we describe the characteristics of the probability distribution of model output. Finally, the main conclusions drawn from the analysis are provided.

METHODS

Parameter estimation and goodness of fit test

Seven functions were chosen as the alternative probability distribution functions to fit the outputs of the groundwater model: normal, log-normal, 2-parameter gamma, log-2-parameter gamma, Pearson type III, log-Pearson type III, and uniform distribution. The methods used for parameter estimation have been illustrated in many papers and will not be provided here. Readers can obtain detailed derivation processes by referring to Chen et al. (), Ross (), Singh & Singh (a, b), and Sun & Zheng ().
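As a rough illustration of the parameter-estimation step, the following sketch fits three of the seven candidates by the method of moments. This is an assumption made for brevity: the paper defers to the cited references and does not state which estimator was used, and the function names are hypothetical.

```python
import math
from statistics import fmean, pvariance


def fit_normal(xs):
    """Method-of-moments estimate (mu, sigma) for a normal distribution."""
    return fmean(xs), math.sqrt(pvariance(xs))


def fit_lognormal(xs):
    """Fit a normal distribution to ln(x); requires strictly positive data."""
    return fit_normal([math.log(x) for x in xs])


def fit_gamma2(xs):
    """Method-of-moments estimate (shape, scale) for a 2-parameter gamma.

    Uses mean = shape * scale and variance = shape * scale**2,
    so the data must have a positive mean and a nonzero variance.
    """
    m, v = fmean(xs), pvariance(xs)
    return m * m / v, v / m
```

The remaining candidates are left out of this sketch.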
The Kolmogorov–Smirnov test (Melo et al. ; Wang & Wang ) is a convenient hypothesis test that compares the statistic value with the critical value at a specified confidence level. The statistic value is evaluated by comparing the proposed PDF with the empirical distribution function constructed from the samples. The Kolmogorov–Smirnov test is a standard goodness of fit procedure, and it will not be described here.

Stepwise regression analysis

Stepwise regression analysis is a common approach for global sensitivity analysis. The basic idea of regression analysis is to fit the input and output variables with a linear regression model (Pappenberger et al. ; Mishra et al. ). The model generated at every step is tested to ensure that all the regression variables are important to the model. The t-test measuring the difference between samples and the regression model is applied to test the importance of a variable. If some variables are found to be insignificant, then the most insignificant variable is removed from the model. The stepwise regression process continues until each variable in the regression model is significant and the variables outside of the model are insignificant (Mishra et al. ; Bergante et al. ; Zeng et al. ). After that, the uncertainty importance of an input variable can be defined as the standardized regression coefficient (SRC):

$$\mathrm{SRC} = \frac{b_j\,\sigma(x_j)}{\sigma(y)} \tag{1}$$

where y is the output variable, x_j is the input variable numbered by j, σ(x_j) and σ(y) are the standard deviations of x_j and y, respectively, and b_j is the regression coefficient of x_j.

Mutual entropy analysis

The distribution character of a data set (X, Y) can be described using contingency tables. For the contingency tables' rows, the label denotes the input variable x, and the range is divided into i equal-width intervals. For the contingency tables' columns, the label denotes the output variable y, and the range is divided into j equal-width intervals. The number in each cell of a contingency table is a nonnegative integer which represents the number of observed events satisfying the joint conditions of row and column.

The probability of the state with input variable x_i and output variable y_j is p_ij = N_ij/N, where N_ij is the value of the contingency table at the i-th row and j-th column, and N is the number of samples. In addition, N_i. is the cumulative number of samples in the i-th interval of x for the whole range of y, and N_.j denotes the cumulative number of samples in the j-th interval of y for the whole range of x. Consequently, when considering the state x_i only, the probability can be written as p_i. = N_i./N, and the probability of outcomes only with the state y_j is given by p_.j = N_.j/N (Mishra et al. ).

The entropy of a variable represents the amount of average information. According to information theory, the entropies of variables x, y, and (x, y) are defined as follows:

$$H(x) = -\sum_i p_{i\cdot} \ln p_{i\cdot}; \qquad H(y) = -\sum_j p_{\cdot j} \ln p_{\cdot j} \tag{2}$$

$$H(x, y) = -\sum_i \sum_j p_{ij} \ln p_{ij} \tag{3}$$

In information theory, the mutual information of two variables is a quantity that measures the mutual dependence of the two variables. The mutual entropy between x and y is described as the reduction in the uncertainty of y due to the information of x, which is given by:

$$I(x, y) = H(x) + H(y) - H(x, y) = \sum_i \sum_j p_{ij} \ln \frac{p_{ij}}{p_{i\cdot}\, p_{\cdot j}} \tag{4}$$

In the mutual entropy method, the uncertainty importance of input variables on the output variable is indicated by two indicators, the uncertainty coefficient (U) and the R statistic (R) (Mishra et al. ; Zeng et al. ):

$$U(x, y) = 2\,\frac{I(x, y)}{H(x) + H(y)} \tag{5}$$

$$R(x, y) = \left[1 - \exp\{-2 I(x, y)\}\right]^{1/2} \tag{6}$$

These two measures take values in the range [0, 1]; U (or R) is 0 if x and y are independent, and it takes 1 if x is completely related to y.
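Equations (2)–(6) can be evaluated directly from a contingency table of counts. The sketch below is a minimal illustration; the function name `mutual_entropy_indices` and the list-of-lists table layout are assumptions, not part of the original study.

```python
import math


def mutual_entropy_indices(table):
    """Return (I, U, R) for a contingency table of counts N_ij.

    Rows index the intervals of the input variable x and columns index
    the intervals of the output variable y.
    """
    n = sum(sum(row) for row in table)
    p_row = [sum(row) / n for row in table]        # p_i. = N_i. / N
    p_col = [sum(col) / n for col in zip(*table)]  # p_.j = N_.j / N
    p_cell = [count / n for row in table for count in row]

    def entropy(probabilities):
        # Empty cells are skipped, using the convention 0 * ln(0) = 0.
        return -sum(p * math.log(p) for p in probabilities if p > 0)

    h_x, h_y, h_xy = entropy(p_row), entropy(p_col), entropy(p_cell)
    i_xy = h_x + h_y - h_xy                                 # Equation (4)
    total = h_x + h_y
    u = 2.0 * i_xy / total if total > 0 else 0.0            # Equation (5)
    r = math.sqrt(max(0.0, 1.0 - math.exp(-2.0 * i_xy)))    # Equation (6)
    return i_xy, u, r
```

A call such as `mutual_entropy_indices([[10, 10], [10, 10]])` corresponds to a table in which x and y are independent, so I, U, and R come out at or very near zero. The `max(0.0, ...)` guard only protects the square root against tiny negative round-off.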
Classification tree analysis

Sensitivity analysis techniques such as stepwise regression, regionalized sensitivity analysis (Pappenberger et al. ), and mutual entropy analysis are useful for identifying important influencing factors if the study object is a continuous variable. When the problem relates to binary outcomes such as 'right' vs. 'wrong' or 'yes' vs. 'no', the classification tree method provides a more efficient framework for identifying the factors driving the result into particular categories (Mishra et al. ; Englehart & Douglas ; Esther et al. ; MacQuarrie et al. ).

The fundamental target for constructing a classification tree model is searching for a classifying rule. The output is classified by a series of splits based on splitting variables. Each split is determined by the appropriate classifier. Thus, the following two steps are essential for constructing a classification tree: (1) selecting an appropriate splitting variable and determining the split point; (2) deciding when to continue splitting or to declare splitting termination. The split can be defined by several principles, such as maximum information gain (InfoGain), maximum impurity reduction, and maximum reduction in deviance (Mishra et al. ; Myles et al. ). The InfoGain index based on information entropy theory was applied to construct a classification tree in this study. The outputs are classified into subspaces by selecting a splitting point of the splitting variable. This implies that the complicated and disordered outputs are arranged and sorted with higher order within subspaces. Therefore, the uncertainty of the output variable is reduced by acquiring information.

A classification tree is built on two types of nodes: branch nodes and leaf nodes. Each branch node is the parent of two child nodes, and the leaf node is the endpoint of the tree. The uncertainty or information entropy of output variable y in node t is defined as:

$$\mathrm{Info} = -\sum_j \frac{N_j(t)}{N(t)} \ln \frac{N_j(t)}{N(t)} \tag{7}$$

where N_j(t) denotes the number of samples belonging to class j at node t, and N(t) is the number of samples at node t.

Assuming the splitting variable X has n samples ordered by magnitude, the number of alternative split points of X is n-1, obtained by choosing the midpoint of each pair of adjacent samples. The point that maximizes the information gain, or minimizes the uncertainty of outputs, is selected. InfoGain is calculated by the equation (Myles et al. ):

$$\mathrm{InfoGain} = \mathrm{Info}(\mathrm{parent}) - \sum_k p_k\, \mathrm{Info}(\mathrm{child}_k) \tag{8}$$

where parent denotes the space before splitting, child_k denotes the k-th subspace after splitting, and p_k is the ratio of the samples which passed into the k-th subspace. The purity of a space describing the distribution of samples' types is expressed as follows:

$$\mathrm{purity} = \sum_j p_j^2, \qquad p_j = \frac{N_j(t)}{N(t)} \tag{9}$$

where p_j is the proportion of samples belonging to class j.

The classification tree is constructed by the successive selection of splitting points. It is beneficial to set up some constraints for preventing excessive splitting. If the number of samples in a subspace falls below the minimum value, or the purity of samples in a subspace is higher than the maximum value specified by the user, the splitting is terminated at that node. Furthermore, a classification tree can be optimized by pruning and reconstruction, which strikes a balance between complexity and classification precision. After the classification tree is constructed, the sensitivities of splitting variables can be simply determined by comparing the order in which they are used to classify outputs (Mishra et al. ).

IMPLEMENTATION OF METHODS

Description of the synthesized model

For the purpose of frequency analysis, we constructed a synthetic three-dimensional steady-state groundwater model (Rojas et al. ) (Figure 1). The model domain is 5,000 m in the x direction, 3,000 m in the y direction, and 53 m in the z direction (thickness). The model area is a rectangle (5,000 m by 3,000 m) and discretized into 25 m by
25 m grid cells. Five pumping wells and 240 observation points are placed in the confined aquifer. The upper layer is 25 m in thickness and is modeled as an unconfined aquifer. The lower layer is 25 m in thickness and is confined. The upper and lower layers are separated by a 3-m confining bed whose storage capacity is neglected. The three model layers were assumed to be horizontal in extension. The hydraulic conductivity distribution within each aquifer is heterogeneous, and the hydraulic conductivity field within each layer is assumed to be statistically stationary.

Figure 1 | A schematic diagram of synthetic groundwater model.

Model parameters

Model layers are assumed to be statistically homogeneous with a constant mean of hydraulic conductivity K. Smaller-scale variability is represented using the theory of random space functions. In addition, an isotropic exponential covariance function is used to describe the K fields of the layers. The spatial distribution of hydraulic conductivity is generated using the direct Fourier transform method (Robin et al. ). The spatial structure parameters of lnK for different layers are presented in Table 1.

Table 1 | Spatial correlation parameters of hydraulic conductivity K for the layers of synthetic groundwater model

Layer | Mean of K (m/d) | Variance of lnK | Correlation length of lnK (m)
1 | 5.0 | 2.0 | 80
2 | 0.1 | 0.5 | 80
3 | 5.0 | 2.0 | 80

Boundary conditions set up

As shown in Figure 1, for the model aquifers, two impermeable boundary conditions are specified along the south and north boundaries. Along the west boundary, a constant head boundary condition is imposed. The east side of the domain is bounded by a 20 m-wide river, and the river level is 40 m. The riverbed's thickness is 2 m, and the elevation at the bottom of the riverbed is 35 m. Furthermore, sources and sinks in the model include recharge from precipitation, and discharge from pumping and evapotranspiration. The top surface of the unconfined aquifer receives the precipitation recharge uniformly, and the model bottom is an
impermeable boundary. In addition, the pumping wells and observation wells are only screened at layer 3. An evapotranspiration zone, delineated by a rectangle in the left side of the study area, is defined with an evapotranspiration surface elevation at 51 m, and the extinction depth is set as 5 m.

Then, the unknown model parameters, including the water level of the constant head boundary, the conductance of the riverbed, precipitation rate, maximum evapotranspiration rate, and pumping rate, are defined in specified ranges (Table 2). The conductance of the riverbed represents the interconnection between the river and the unconfined aquifer, and is calculated as follows (Harbaugh ):

$$C_{Riv} = \frac{K_{Riv}\, l\, w}{m} \tag{10}$$

where C_Riv is the conductance of the riverbed, K_Riv is the vertical hydraulic conductivity of the riverbed, l is the length of the reach, w is the width of the river, and m is the thickness of the riverbed.

The probability distribution of groundwater model output is influenced by input parameters. Therefore, two cases are considered in this study, in which the input parameters follow uniform and normal distributions, respectively. In addition, the range of each uniform distribution is consistent with the interval of the corresponding normal distribution. The parameters of these two distributions are shown in Table 2.

Table 2 | Probability distributions of model parameters

Model parameter | Uniform: minimum | Uniform: maximum | Normal: mean | Normal: variance
Precipitation rate (m/d) | 6.0 × 10⁻⁵ | 6.0 × 10⁻⁴ | 3.3 × 10⁻⁴ | 7.1 × 10⁻⁵
Evapotranspiration rate (m/d) | 5.0 × 10⁻⁴ | 5.0 × 10⁻³ | 2.75 × 10⁻³ | 5.92 × 10⁻⁴
Constant head (m) | 47.0 | 52.0 | 49.5 | 0.6579
Conductance of riverbed (m²/d) | 10.0 | 500.0 | 255.0 | 64.4737
Pumping rate (m³/d) | 500.0 | 3000.0 | 1750.0 | 328.9474

The numerical model of the synthesized groundwater flow system is built using MODFLOW-2005 (Harbaugh ).

Monte Carlo simulation

The Monte Carlo simulation procedure involves two parts (part I and part II).

Part I

Part I includes the following steps:
1. Generating the model mesh, setting the initial head condition, the positions of pumping wells and observation points, etc.
2. Setting the hydraulic conductivity K of model layers. Based on the mean and the covariance function of lnK (Table 1), the random fields of K are generated by the direct Fourier transform method.
3. Setting the boundary conditions, including precipitation rate, maximum evapotranspiration rate, water head of constant head boundary, conductance of riverbed, and pumping rate. A boundary condition is assigned a value by sampling uniformly from the corresponding range (Table 2).
4. Running the established model and collecting the outputs of the groundwater model. The outputs include the groundwater levels of observation points in layer 3, the inflow from constant head boundary and precipitation, and the outflow from well pumping, evapotranspiration process, and river boundary.
5. Repeating steps 2 to 4 for 500 realizations.
6. Conducting frequency analysis for groundwater model outputs. The data series used for frequency analysis is constructed from the output of every realization, e.g., the groundwater levels of an observation point from the 1st to the 500th realization. Therefore, each data series has 500 samples. The data series include 240 GLS and five groundwater budget series. The procedure of frequency analysis can be summarized as two steps: (1) parameter estimation for each alternative PDF; (2) applying the Kolmogorov–Smirnov test to each PDF. If all the alternative PDFs perform poorly (cannot pass the Kolmogorov–Smirnov test; the significance level α was set to 0.05 in this study), we mark the GLS as an unknown PDF.

Part II

The procedure of part II is the same as that of part I, except for step 3. In this part, in step 3 a boundary condition is assigned a value by sampling from the corresponding normal distribution.
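Step 6 can be sketched as follows for a single data series. This is a simplified illustration, not the authors' implementation: only two of the seven candidates are included, the parameters are estimated by sample moments (normal) and by the sample minimum and maximum (uniform), and the critical value uses the large-sample approximation 1.358/√n for α = 0.05. All function names are hypothetical.

```python
import math

KS_COEFF_ALPHA_005 = 1.358  # asymptotic K-S coefficient for alpha = 0.05


def ks_statistic(samples, cdf):
    """Largest gap between the empirical CDF and a candidate CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for k, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from k/n to (k+1)/n at x.
        d = max(d, (k + 1) / n - f, f - k / n)
    return d


def normal_cdf(mu, sigma):
    return lambda x: 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def uniform_cdf(lo, hi):
    return lambda x: min(1.0, max(0.0, (x - lo) / (hi - lo)))


def classify_series(samples):
    """Name the best candidate that passes the K-S test, else 'unknown'."""
    n = len(samples)
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return "unknown"  # a constant series cannot be fitted here
    mean = sum(samples) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / (n - 1))
    candidates = {
        "normal": normal_cdf(mean, std),
        "uniform": uniform_cdf(lo, hi),
    }
    stats = {name: ks_statistic(samples, cdf) for name, cdf in candidates.items()}
    best = min(stats, key=stats.get)
    critical = KS_COEFF_ALPHA_005 / math.sqrt(n)
    return best if stats[best] <= critical else "unknown"
```

In the study's setting, `classify_series` would be applied to each of the 240 GLS, each holding 500 samples. Because the parameters are estimated from the same samples that are tested, the tabulated critical value tends to be conservative, so a production analysis may prefer a corrected test.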
RESULTS AND DISCUSSION

Figure 2 shows the frequency distributions of the parameter samples drawn from uniform and normal distributions, respectively.

Figure 2 | The frequency distributions of precipitation rate (PREC), evapotranspiration rate (EVAP), constant head (CH), conductance of riverbed (CRIV), and pumping rate (PUMP). The first and second rows show parameters sampled from uniform and normal distributions, respectively.

Frequency analysis

The outputs of the groundwater model are tested against each alternative PDF by the Kolmogorov–Smirnov test at a significance level of 0.05. The numbers of GLS which obey normal, log-normal (Log-nor), 2-parameter gamma (G2), log-2-parameter gamma (Log-G2), Pearson type III (P3), log-Pearson type III (Log-P3), uniform, and unknown distribution are denoted as n_i (i = 1, 2, …, 8) in order. After that, the ratio for each PDF was calculated as:

$$\mathrm{ratio}_i = n_i / 240 \tag{11}$$

As shown in Figure 3, the PDF of GLS is strongly influenced by the probability distribution of model input parameters. When the input parameters are sampled from uniform distribution, a majority of GLS follow an unknown distribution (Figure 3(a)), and nearly all of the rest follow a uniform distribution. When the input parameters are sampled from normal distribution, most of the GLS obey normal distribution (Figure 3(b)).

The groundwater budget terms include the inflows from constant head boundary (InCH) and precipitation (InPre), and outflows from river leakage (OutRiv), evapotranspiration (OutEva), and pumping (OutPum). The probability distributions of InPre and OutPum are fully controlled by model input parameters (precipitation rate and pumping rate). Figure 4 shows the frequency distributions of InCH, OutRiv, and OutEva. When input parameters are sampled from uniform distribution, none of the budget terms can pass the Kolmogorov–Smirnov test (Figures 4(a)–4(c)). Moreover, the distributions of these budget terms are significantly different from uniform distribution. By contrast, all the budget terms pass the Kolmogorov–Smirnov test as normal distribution when input parameters are sampled from normal distribution (Figures 4(d)–4(f)).

Stepwise regression analysis

Figure 3 shows that only a part of GLS obeys a specified PDF. The observed GLS show different characteristics of
probability distributions among observation points. The probability distribution of GLS is influenced by the spatial position of an observation point. Thus, stepwise regression analysis is used to identify the key factors of the mean and variance of GLS. The input variables of the regression model are the distances of an observation point from the surrounding model boundaries: the northern boundary, river boundary, southern boundary, constant head boundary, the nearest pumping well, the evapotranspiration area, and the average distance from the five pumping wells. The variables are listed in Table 3 and numbered from 1 to 7; all of them are normalized before regression analysis.

Figure 3 | The ratios of GLS which obey normal, G2, P3, uniform, Log-nor, Log-G2, Log-P3, and unknown distribution. (a) and (b) denote input parameters sampled from uniform and normal distributions, respectively.

Figure 4 | Frequency distributions of inflow from constant head boundary (InCH), and outflows from river leakage (OutRiv) and evapotranspiration (OutEva). Plots (a), (b), (c) and plots (d), (e), (f) denote input parameters sampled from uniform and normal distributions, respectively.

As shown in Figure 5, whether the input parameters are sampled from uniform or normal distribution, the sensitivities of influencing factors are almost identical for the mean of GLS, e.g., Figure 5(a) vs. Figure 5(b). However, the influences of regression variables on the variance
138
Table 3
X. Zeng et al.
|
|
Probability distribution of groundwater model output
Journal of Hydroinformatics
|
16.1
|
2014
larger than that in the variance model, e.g., Figure 5(a) vs.
Input variables of stepwise regression model and their numbers
Variable
No.
Distance from an observation point to northern boundary (D1)
1
Distance from an observation point to river boundary (D2)
2
Distance from an observation point to southern boundary (D3)
3
Distance from an observation point to constant head boundary (D4)
4
Distance from an observation point to the nearest pumping well (D5)
5
Average distance from an observation point to five pumping wells (D6)
6
Distance from an observation point to evapotranspiration area (D7)
7
Figure 5(c), Figure 5(b) vs. Figure 5(d). Thus, the mean of GLS is more dependent on the distance from river boundary than the variance of GLS. In addition, the average distance from five pumping wells (D6) is inversely related with the mean and variance of GLS.
Mutual entropy analysis Stepwise regression analysis is restricted in monotonic linear issues, and mutual entropy analysis is capable of treating the complicated non-monotonic relationship between output and input variables. The same as for stepwise regression analysis, the input variables are also listed in Table 3, and the output variables are the mean and variance
of GLS are slightly different for these two distributions, such
of GLS. Tables 4–7 display the contingency tables of mutual
as Figure 5(c) vs. Figure 5(d).
entropy analysis.
For the regression analysis of the mean of GLS, four
Figure 6 shows the results of mutual entropy analysis.
variables (D2, D5, D6, and D7) passed into the regression
Similar to the results of stepwise regression analysis, the sen-
model. The variable with the largest sensitivity is D2 (the
sitivities of input variables are similar for the mean and
regression coefficient is about 0.97). For the regression
variance of GLS. The most important influencing factors
analysis of the variance of GLS, four variables (D2, D5,
for the mean and variance of GLS are the distances from
D6, and D7) also passed into the regression model. The vari-
an observation point to river and constant head boundaries
able with the largest sensitivity is also D2 (the regression
(D2 and D4). In addition, for the mean of GLS, the variables
coefficient is about 0.80). Therefore, the mean and variance
with the weakest sensitivity are the distances from an obser-
of GLS are affected similarly by the regression variables.
vation point to northern and southern boundaries (D1 and
They are both significantly influenced by the distance
D3). For the variance of GLS, the distance from an obser-
from river boundary (D2), and other regression variables
vation point to evapotranspiration area (D7) holds the
have very low influences relative to D2. Furthermore,
smallest sensitivity. Nevertheless, the index values of D1,
the regression coefficient of D2 in the mean model is
D3, and D7 are very close for the mean and variance of
Figure 5
|
The regression coefficients of the entered variables in stepwise regression analysis. The plots (a), (b) and the plots (c), (d) denote output variables are the means and variances of GLS, respectively. The plots (a), (c) and the plots (b), (d) indicate input parameters are sampled from uniform and normal distributions, respectively.
[Tables 4–7 | Contingency tables of the output variable Y (labeled by column) against the input variables D1–D7. The cell counts are not recoverable here; the four table captions are listed below.]
Contingency tables when Y (labeled by column) is the mean of GLS, and model parameters are sampled from uniform distribution
Contingency tables when Y (labeled by column) is the mean of GLS, and model parameters are sampled from normal distribution
Contingency tables when Y (labeled by column) is the variance of GLS, and model parameters are sampled from uniform distribution
Contingency tables when Y (labeled by column) is the variance of GLS, and model parameters are sampled from normal distribution
Figure 6
|
Index values of the input variables in the mutual entropy analysis. In the first row the input parameters are sampled from uniform distributions; in the second row they are sampled from normal distributions. In plots (a) and (c) the output variable (Y) is the mean of GLS; in plots (b) and (d) it is the variance of GLS.
GLS. Moreover, when the model input parameters are sampled from uniform and normal distributions, respectively, the influences of the input variables are similar for the two distributions.
Compared with the results of stepwise regression analysis, the distance from the constant head boundary (D4) has a significant influence on the first two moments of GLS. However, the variable D4 was excluded from the stepwise regression analyses of both the mean and the variance of GLS. As shown in Figure 1, the sum of the distances from an observation point to the river boundary and to the constant head boundary is a constant (the width of the study area, 5,000 m). Because of the way a stepwise regression model is constructed, the influence of D2 and D4 is represented by only one variable (D2) in the stepwise regression analysis, although D4 is as important as D2. The influence of D2 on the output variables is the inverse of that of D4, and the same holds for D1 and D3. In addition, the relationship between the influence modes of D2 and D4 (or D1 and D3) on the output variables is confirmed by the contingency tables in Tables 4–7. Furthermore, the input variables excluded from the stepwise regression model can be identified by mutual entropy analysis, whereas stepwise regression analysis crudely treats these variables as having no influence.
Classification tree analysis
Figure 7 displays a conventional diagram that labels the PDF of GLS when the input parameters are sampled from the uniform distribution. Figure 8 shows the PDF of GLS when the input parameters are sampled from the normal distribution.
Figure 3 shows that the PDF of GLS is strongly related to the probability distribution of the groundwater model input parameters. However, as shown in Figures 7 and 8, the PDF of GLS is not fully controlled by the probability distribution of the input parameters, and the category of the PDF of GLS is not uniformly distributed over the model layer. The classification tree method is therefore used to identify the driving factors that lead GLS to follow a specific PDF (uniform or normal). The GLS are classified into two categories: category 0, GLS obeys the uniform (or normal) distribution when the input parameters are sampled from the uniform (or normal) distribution; category 1, it does not. The input variables in the classification tree model are numbered identically to the variables used in the stepwise regression and mutual entropy analyses (see Table 3).
As shown in Figure 9, GLS is passed into subspaces by selecting suitable input variables for splitting, and the classification tree (four ranks in this paper) is built by repeated splitting. The maximum purity was set as 0.82 in this classification tree. The results indicate that when the groundwater model input parameters are sampled from the uniform distribution, only two variables (D6 and D2) entered the classification tree model. Therefore, the probability distribution of GLS is driven by D6 and D2. Furthermore, when the model input parameters are sampled from the normal distribution, the tree model contains four variables, and the entry order is D6, D2, D5, and D1. Variables D6 and D2 are again the most significant driving factors.
Groundwater is a complex system affected by many factors. According to the central limit theorem, when a system is constructed from a large number of independent random variables, each with finite mean and variance, the output of the system will be approximately normally distributed. Thus, when the groundwater model parameters are sampled from normal and uniform distributions, respectively, many more model outputs follow the normal distribution than follow the uniform distribution (see Figures 3, 4, 7 and 8). Moreover, Figure 9 shows that whether GLS obeys the normal distribution is controlled by more driving factors than whether GLS obeys the uniform distribution.
As has been stated, the key driving factors of GLS are D2 and D6. Variable D2 is also of significant importance in the stepwise regression and mutual entropy analyses, whereas the mean and variance of GLS are only slightly influenced by variable D6. As a result, the mean and variance of GLS are both controlled by the distance from the observation point to the river boundary (or constant head boundary). The category of the PDF of GLS is dominated by the average distance from the observation point to the five pumping wells, and by the distance from the observation point to the river boundary (or constant head boundary).
Figure 7
|
Conventional diagram labeling the probability distribution of GLS when input parameters are sampled from uniform distribution.
Figure 8
|
Conventional diagram labeling the probability distribution of GLS when input parameters are sampled from normal distribution.
Figure 9
|
Splitting process of classification tree analysis. SZ denotes sample size, P denotes the purity of a space, SV denotes splitting variable. (a) and (b) indicate groundwater model parameters are sampled from uniform and normal distribution, respectively.
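The splitting procedure of the classification tree analysis can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: purity is taken to be the share of the majority class in a subspace, each split is a binary threshold on one input variable, the four ranks are read as a depth limit, and all sample values are hypothetical. Only the stopping purity (0.82), the category labels 0 and 1, and the node attributes SZ, P and SV come from the text.

```python
from collections import Counter

MAX_PURITY = 0.82  # stopping purity quoted in the text
MAX_RANKS = 4      # "four ranks", read here as a depth limit (assumption)


def purity(labels):
    """Share of the majority class in a subspace (assumed definition of P)."""
    return Counter(labels).most_common(1)[0][1] / len(labels)


def best_split(rows, labels):
    """Return (score, variable, threshold) of the purest binary split, or None."""
    best = None
    for var in rows[0]:
        values = sorted({row[var] for row in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [lab for row, lab in zip(rows, labels) if row[var] <= thr]
            right = [lab for row, lab in zip(rows, labels) if row[var] > thr]
            # size-weighted purity of the two candidate subspaces
            score = (len(left) * purity(left) + len(right) * purity(right)) / len(labels)
            if best is None or score > best[0]:
                best = (score, var, thr)
    return best


def grow(rows, labels, rank=1):
    """Split recursively until a subspace is pure enough or the rank limit is hit."""
    node = {"SZ": len(labels), "P": purity(labels)}
    split = best_split(rows, labels)
    if node["P"] >= MAX_PURITY or rank >= MAX_RANKS or split is None:
        return node
    _, var, thr = split
    node["SV"], node["threshold"] = var, thr
    for side, keep in (("left", lambda v: v <= thr), ("right", lambda v: v > thr)):
        idx = [i for i, row in enumerate(rows) if keep(row[var])]
        node[side] = grow([rows[i] for i in idx], [labels[i] for i in idx], rank + 1)
    return node


# Hypothetical sample: D2 and D6 in metres; 0 = obeys the input distribution, 1 = does not.
rows = [
    {"D2": 500, "D6": 400}, {"D2": 1500, "D6": 500},
    {"D2": 2500, "D6": 600}, {"D2": 3500, "D6": 700},
    {"D2": 500, "D6": 1500}, {"D2": 1500, "D6": 1600},
    {"D2": 2500, "D6": 1700}, {"D2": 3500, "D6": 1800},
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

tree = grow(rows, labels)
print(tree["SV"], tree["threshold"], tree["left"]["P"], tree["right"]["P"])
```

With this hypothetical sample the root subspace (SZ = 8, P = 0.5) is split on D6, because no threshold on D2 separates the two categories; both resulting subspaces are pure and are not split further.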
CONCLUSIONS
The uncertainty of groundwater modeling can be represented by the characteristics of the probability distribution of model outputs. Based on a synthetic groundwater model and a sensitivity analysis of the probability distributions of its outputs, the following conclusions are drawn:
1. The characteristics of the probability distribution of groundwater model output are analyzed and summarized. The most important influencing factors for the mean and variance of GLS are the distances from an observation point to the river and constant head boundaries. The most important driving factor for the PDF of GLS is the distance from an observation point to all pumping wells. In addition, the distribution characteristics of groundwater model outputs (GLS and budget terms) are significantly influenced by the probability distribution of input parameters.
2. Stepwise regression analysis is a defective sensitivity analysis method for identifying multiple influencing factors that have a similar correlation structure with the output variable. By contrast, mutual entropy analysis is more general in identifying complicated multivariate relationships, and it is able to identify the influence of variables that are excluded from stepwise regression analysis. Moreover, classification tree analysis is an effective method for analyzing the key driving factors in a classification output system.
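The weakness of stepwise regression noted in the second conclusion can be illustrated with two perfectly collinear predictors. The sketch below is an illustration under stated assumptions rather than the analysis of the paper: D2 is taken to be the distance to the river boundary, D4 the distance to the constant head boundary, so that D2 + D4 equals the 5,000 m width of the study area; the five distances and the output values are hypothetical; and the stepwise procedure is reduced to a single regression step.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


WIDTH = 5000.0  # width of the study area in metres (from the text)
d2 = [500.0, 1500.0, 2500.0, 3500.0, 4500.0]  # hypothetical distances to the river boundary
d4 = [WIDTH - d for d in d2]                  # distances to the constant head boundary
y = [12.1, 11.0, 10.4, 9.1, 8.6]              # hypothetical mean GLS at the five points

r_d2 = pearson(d2, y)
r_d4 = pearson(d4, y)

# One least-squares step: regress y on d2 and keep what is left unexplained.
mean_d2, mean_y = sum(d2) / len(d2), sum(y) / len(y)
sxy = sum((d - mean_d2) * (v - mean_y) for d, v in zip(d2, y))
sxx = sum((d - mean_d2) ** 2 for d in d2)
slope = sxy / sxx
residuals = [v - (mean_y + slope * (d - mean_d2)) for d, v in zip(d2, y)]

# Correlation of the remaining candidate D4 with the residuals left by D2.
r_d4_residual = pearson(d4, residuals)

print(f"r(D2, Y) = {r_d2:.3f}, r(D4, Y) = {r_d4:.3f}")
print(f"r(D4, residual after D2) = {r_d4_residual:.1e}")
```

The two correlations have equal magnitude and opposite sign, and once D2 has entered the regression the residuals carry no linear information about D4, so a selection rule based on incremental fit has no reason to admit D4 even though it is as influential as D2.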
ACKNOWLEDGEMENTS This study was supported by the National Natural Science Fund of China (Nos. 41172207, 41030746, 51190091, and 41071018), Program for New Century Excellent Talents in University (NCET-12-0262), China Doctoral Program of Higher Education (20120091110026), Qing Lan Project, the Skeleton Young Teachers Program and Excellent Disciplines Leaders in Midlife-Youth Program of Nanjing University.
First received 5 January 2013; accepted in revised form 13 June 2013. Available online 12 July 2013
© IWA Publishing 2014 Journal of Hydroinformatics
|
16.1
|
2014
An optimization model for water resources allocation risk analysis under uncertainty Y. L. Xie and G. H. Huang
Y. L. Xie G. H. Huang (corresponding author) MOE Key Laboratory of Regional Energy and Environmental Systems Optimization, Resources and Environmental Research Academy, North China Electric Power University, Beijing 102206, China E-mail: guohe.huang3@gmail.com
ABSTRACT In order to deal with the risk of low system stability and unbalanced allocation during water resources management under uncertainties, a risk-averse inexact two-stage stochastic programming model is developed for supporting regional water resources management. Methods of interval-parameter programming and the conditional value-at-risk model are introduced into a two-stage stochastic programming framework, so that the developed model can tackle uncertainties described in terms of interval values and probability distributions. In addition, the risk-aversion method was incorporated into the objective function of the water allocation model to reflect the preference of decision makers, such that the trade-off between system economy and extreme expected loss under different water inflows could be analyzed. The proposed model was applied to handle a water resources allocation problem. Several scenarios corresponding to different river inflows and risk levels were examined. The results demonstrated that the model could effectively incorporate the interval-format and random uncertainties, as well as risk aversion, into the optimization process, and generate inexact solutions that contain a spectrum of water resources allocation options. They could be helpful for seeking cost-effective management strategies under uncertainties. Moreover, the model could reflect the decision maker's attitude toward risk aversion, and generate potential options for decision analysis at different system-reliability levels.
Key words
| conditional value-at-risk, inexact two-stage stochastic programming, risk analysis, uncertainty, water resources allocation
doi: 10.2166/hydro.2013.239
INTRODUCTION
Water resources are critical for human survival, and human society would be unable to prosper or even exist without them. The ever-growing and conflicting demands on water supplies threaten the sustainability of this essential resource. Coupled with rapidly increasing water demand, decreasing usable water supplies and poor management have led to inefficient water resources allocation, and to the unsustainable use of water resources with significant economic, social, and environmental ramifications. Moreover, in water resources systems, many system parameters and their inter-relationships may appear uncertain. Such uncertainties, which affect the related exercises for generating desired water resources management schemes, may be caused by errors in acquired data, variations in spatial and temporal units, and incompleteness or impreciseness of observed information (McIntyre et al. ; Maqsood et al. ). Therefore, the uncertainties should be considered in water allocation planning.
Over the past decades, inexact optimization models have been widely used to tackle uncertainties and complexities in water resources allocation problems, and a majority of them were based on fuzzy, stochastic, and interval-parameter programming (abbreviated as FMP, SMP, and IPP), as well as their combinations (Slowinski et al. ; Wagner et al. ; Huang ; Chang et al. ; Russell & Campbell ; Wang & Du ; Li et al. , ; Li & Huang ; Cetinkaya et al. ; Simonovic ; Guo & Huang ; Xu & Qin ; Lv et al. ). For example, Huang () developed an interval chance-constraint programming model for water quality management in a Chinese city, which allowed probability distributions and discrete intervals to be incorporated within the optimization process. Jairaj & Vedula () optimized a multi-reservoir system using a fuzzy mathematical programming method, where uncertainties existing in reservoir inflows were treated as fuzzy sets. Faye et al. () proposed a long-term water resources allocation model for an irrigation management problem of a reservoir system, where fuzzy logic was presented as a particularly adequate means to refine on-line the formulation of the objective function of the recurrent optimization problem. Teegavarapu & Elshorbagy () proposed a fuzzy mean squared error measure to evaluate the performance of time series prediction models in water resources, where membership functions derived from a number of modeler preferences could be easily aggregated to obtain a single integrated membership function. Chaves & Kojiri () applied a stochastic fuzzy neural network model for the optimization of reservoir monthly operational strategies considering maximum water utilization and improvements on water quality simultaneously, where the stochastic fuzzy neural network was defined as a fuzzy neural network model stochastically trained by a genetic algorithm. Zhang et al. () introduced an inexact-stochastic dual water supply programming model for regional water resources management, which was based on analysis of the inexact characteristics in demand and supply subsystems of a dual water supply system and their dynamic interactions. Lu et al. () advanced an interval-valued fuzzy linear-programming method based on infinite α-cuts for an agricultural irrigation problem, where a two-step infinite α-cuts solution method is incorporated into the solution process to discretize infinite α-cuts to interval-valued fuzzy membership functions. Tran et al. () developed a stochastic dynamic programming model for reservoir water management strategy planning in southern Vietnam, where multi-users, stochastic water level, the timing and quantity of water release, and climatic conditions were considered. Liu et al. () proposed an interval-parameter chance-constrained fuzzy multi-objective programming model for assisting water pollution control within a sustainable wetland management system, where the proposed approach can effectively handle the uncertainties and complexities in the water pollution control systems. However, in water resources planning practice, when it comes to the violation of some overriding policies, those methods/models would fail to analyze the economic consequences; also, none of the above methods could facilitate the analysis of various policy scenarios that were associated with different levels of economic penalties when the promised targets were violated in the water resources management process.
Inexact two-stage stochastic programming (ITSP), coupled with two-stage stochastic programming (TSP) and IPP, is an attractive technique to help overcome the above shortcomings. In the ITSP, a decision is first undertaken before values of random variables are known; then, after the random events have happened and their values are known, a second-stage decision can be made in order to minimize 'penalties' that may appear due to any infeasibility (Loucks et al. ; Birge ). ITSP methods have been widely explored in water resources management in the past decades (Ferrero et al. ; Huang & Loucks ; Seifi & Hipel ; Luo et al. ; Maqsood et al. ; Li et al. , ; Guo et al. ; Huang et al. ; Wang & Huang ). For example, Maqsood et al. () developed an interval-parameter fuzzy two-stage stochastic programming method for water resources systems planning and management under uncertainty. Li & Huang () proposed an inexact two-stage stochastic nonlinear programming model for supporting decisions of water resources allocation within a multi-reservoir system. Wang & Huang () developed an interactive two-stage stochastic fuzzy programming model for water allocation management, where the method can not only tackle dual uncertainties presented as fuzzy boundary intervals, but also permit in-depth analyses of various policy scenarios. Huang et al. () developed an integrated optimization method for supporting agriculture water management and planning in Tarim River Basin, Northwest China, where the method couples ITSP and quadratic programming. In general, ITSP is effective for problems where an analysis of policy scenarios is desired and the related data are random or interval in nature. However, in previous studies, the minimum cost or maximum net benefit is usually considered as the objective in a general ITSP model, which could lead to the problems of low system stability and unbalanced allocation risk. Most of the models generated by the ITSP methods for water resources management take the system benefit as the objective without considering the risk
aversion, which should also be incorporated in the proposed inexact stochastic programming approach. Incorporating risk measures in the objective functions within other optimization methods is a fairly recent research topic. An alternative risk measure, namely conditional value-at-risk (CVaR), proposed by Rockafellar & Uryasev (), is a widely accepted risk measure (Ahmed ; Schultz & Tiedemann ; Fábián ). The CVaR model is a new risk measurement method based on probability distributions of random variables, and has been widely used for portfolio selection (Kall & Mayer ; Klein Haneveld & Van der Vlerk ; Schultz & Neise ; Liu et al. ). Previously, the application of CVaR in the water resources management field has been relatively limited. For example, Piantadosi et al. () developed a stochastic dynamic programming model with CVaR for supporting urban storm water management. Shao et al. () proposed a stochastic dynamic programming model with CVaR constraints for supporting water resources management under uncertainty. Nevertheless, most of the models take the system risk as the constraints, and no previous studies were focused on development of a risk-aversion inexact two-stage stochastic programming (RITSP) method through integrating IPP, TSP, and CVaR into a general framework for water resources allocation management that considers risk aversion in the system objective.
Therefore, the aim of this study is to develop a RITSP method for water resources allocation management under uncertainty. It is the first attempt where IPP, TSP, and CVaR methods are integrated into a general framework of a maximum benefit objective in the water resources allocation problem under uncertainties presented as interval values and probabilities. A case study will demonstrate the performance of the RITSP method in water resources management systems planning under uncertainty. Furthermore, it will be shown how it can be used to generate water allocation policies under a given risk level, as well as to determine which designs can most efficiently lead to the optimized system objectives.
METHODOLOGY
An RITSP model was based on IPP, CVaR model, and TSP. Figure 1 presents the general framework of the RITSP method, which is based on IPP, TSP, and CVaR techniques. Each technique has a unique contribution in enhancing the RITSP's capacities for tackling the uncertainties and system risk. For example, the probability distributions and policy implications were handled through TSP; the uncertainties presented as discrete intervals were reflected through IPP; the system risk was addressed by CVaR. The modeling framework would offer feasible and reliable solutions under different scenarios of allocation targets, which are helpful for decision makers (Maqsood et al. ).
Figure 1
|
Framework of the RITSP model.
Two-stage stochastic programming
Consider a typical water resources management system in a region, where a water resources manager is responsible for allocation of limited water to multiple competing users during a planning horizon. The water manager needs to promise each user an allocation target in the management process, which can help the water users make their generation plans. If the promised water is delivered, it will result in net benefit to the local economy and drive the regional industry development; however, if the promised water is not delivered, the benefit will be reduced, due to the curtailed demand and the imposed penalty. Since the amount of available water is random, this water allocation problem can be formulated as a two-stage stochastic programming problem with the objective of maximizing the expected value of economic activity in the region. The general form of TSP problems reads $z(x, \omega) = cx - Q(x, \omega)$, and a TSP model can be formulated as follows (Birge & Louveaux ):
$$f = \max\; cx - E_{\omega \in \Omega}[Q(x, \omega)] \tag{1a}$$
subject to
$$ax \le b \tag{1b}$$
$$x \ge 0 \tag{1c}$$
where $f$ is the system benefit, $x$ is the first-stage decision of water allocation made before the random variable $\omega$ is observed ($\omega \in \Omega$), and $c$ is the benefit coefficients of first-stage variable $x$ in the objective function; $a$ is the technical coefficients, $b$ is right-hand side coefficients, and $Q(x, \omega)$ is the optimal value of the following nonlinear programming:
$$\min\; q(y, \omega) \tag{2a}$$
subject to
$$D(\omega)y \ge h(\omega) + T(\omega)x \tag{2b}$$
$$y \ge 0 \tag{2c}$$
where $y$ is the second-stage adaptive decision, which depends on the realization of the random variable. $q(y, \omega)$ denotes the second-stage cost function, while $\{D(\omega), h(\omega), T(\omega) \mid \omega \in \Omega\}$ are random model parameters with reasonable dimensions, which are functions of the random variable $\omega$. By letting random variables $\omega$ take discrete values with probability levels $p_h$ ($h = 1, 2, \ldots, v$ and $\sum p_h = 1$), the expected value of the second-stage optimization problem can be expressed as:
$$E_{\omega \in \Omega}[Q(x, \omega)] = \sum_{h=1}^{v} p_h Q(x, \omega_h) \tag{3}$$
For each realization of random variable $\omega_h$, a second-stage decision is made, which is denoted by $y_h$. The second-stage optimization problem can be rewritten as:
$$\min\; q(y_h, \omega_h) \tag{4a}$$
subject to
$$D(\omega_h)y_h \ge h(\omega_h) + T(\omega_h)x \tag{4b}$$
$$y_h \ge 0 \tag{4c}$$
Thus, Model (1) can be equivalently formulated as a linear programming model (Ahmed et al. ):
$$f = \max\; cx - \sum_{h=1}^{v} p_h q(y_h, \omega_h) \tag{5a}$$
subject to
$$ax \le b \tag{5b}$$
$$D(\omega_h)y_h \ge h(\omega_h) + T(\omega_h)x \tag{5c}$$
$$x \ge 0 \tag{5d}$$
$$y_h \ge 0 \tag{5e}$$
Risk-averse two-stage stochastic programming
In the TSP, the first-stage decisions are deterministic and the second-stage decisions are allowed to depend on the elementary events, i.e., $y_h = y(\omega_h)$. Basically, the second-stage decisions represent the operational decisions, which change depending on the realized values of the random data. The objective function $Q(x, \omega)$ of the second-stage problem, also
known as the recourse (benefit) function, is a random vari-
VaR has the additional difficulty, for stochastic prob-
able and therefore, the total profit function zðx, ωÞ is a
lems, that it requires the use of binary variables for its
random variable. Determining the optimal decision vector x
leads to the problem of comparing random profit variables $z(x, \omega)$. Comparing random variables is one of the main interests of decision theory in the presence of uncertainty. When comparing random variables, it is crucial to consider the effect of variability, which leads to the concept of risk. The preference relations among random variables can be specified using a risk measure. One of the main approaches in the practice of decision making under risk uses mean-risk models (Ogryczak & Ruszczyński ). In these models, one optimizes the mean-risk function, which involves a specified risk measure $\rho: \mathcal{Z} \to \mathbb{R}$, where $\rho$ is a functional and $\mathcal{Z}$ is a linear space of $\mathcal{F}$-measurable functions on the probability space $(\Omega, \mathcal{F}, P)$:

$$\max \{ E(z(x, \omega)) - \lambda \rho(z(x, \omega)) \} \tag{6}$$

In this approach, $\lambda$ is a nonnegative trade-off coefficient representing the exchange rate of mean benefit for risk. It is also referred to as a risk coefficient and is specified by decision makers according to their risk preferences. Usually, when typical dispersion statistics, such as variance, are used as risk measures, the mean-risk approach may lead to inferior solutions. In order to remedy this drawback, models with alternative asymmetric risk measures, such as downside risk measures, have been proposed (Ogryczak & Ruszczyński ). Among the downside risk measures used in risk-averse methods, the conditional value-at-risk (CVaR) measure, which is based on the value-at-risk (VaR), has been widely applied in many areas. VaR is a measure computed as the maximum profit value (e.g., $z$) such that the probability of the profit being lower than or equal to this value (e.g., $l$) is lower than or equal to $1 - \alpha$:

$$\mathrm{VaR} = \max \{ l \mid p(z \le l) \le 1 - \alpha \} \tag{7}$$

CVaR at level $\alpha$, in a simple way, is defined as follows (Rockafellar & Uryasev , ):

$$\mathrm{CVaR}(z) = E(z \mid z \le \mathrm{VaR}(z)) \tag{8}$$

Figure 2 | VaR and CVaR illustration.

modeling. Instead, computation of CVaR does not require the use of binary variables and it can be modeled by the simple use of linear constraints. The concept of CVaR is illustrated in Figure 2. CVaR($z$) is the conditional expected value not exceeding the value under the confidence level $\alpha$. The CVaR at the confidence level $\alpha$ is given by:

$$\mathrm{CVaR}_\alpha(z) = \sup_{\xi \in \mathbb{R}} \left\{ \xi - \frac{1}{1-\alpha} E[\xi - z]_+ \right\} \tag{9}$$

where $\xi$ is an auxiliary variable, which is the maximum value at the cumulative probability $\alpha$. Thus, Model (6) can be redefined as:

$$\max \{ E(z(x, \omega)) - \lambda \mathrm{CVaR}_\alpha(z(x, \omega)) \} \tag{10}$$

In addition, $\mathrm{CVaR}_\alpha(z + a) = \mathrm{CVaR}_\alpha(z) + a$, $a \in \mathbb{R}$ (Birbil et al. ); therefore, $\mathrm{CVaR}_\alpha(z(x, \omega)) = \mathrm{CVaR}_\alpha(cx - Q(x, \omega)) = cx - \mathrm{CVaR}_\alpha(Q(x, \omega))$, and Model (10) can be reformulated as the following linear programming problem:

$$\max f = (1-\lambda)cx - \sum_{h=1}^{v} p_h q(y_h, \omega_h) + \lambda \left\{ \xi - \frac{1}{1-\alpha} \sum_{h=1}^{v} p_h V_h \right\} \tag{11a}$$

subject to

$$ax \le b \tag{11b}$$
Y. L. Xie & G. H. Huang | Water resources allocation management and risk analysis model | Journal of Hydroinformatics | 16.1 | 2014

$$D(\omega_h) y_h \le h(\omega_h) + T(\omega_h) x, \quad h = 1, 2, \ldots, v \tag{11c}$$

$$V_h \ge \xi - cx + q(y_h, \omega_h), \quad h = 1, 2, \ldots, v \tag{11d}$$

$$V_h \ge 0, \quad h = 1, 2, \ldots, v \tag{11e}$$

$$x \ge 0 \tag{11f}$$

$$y_h \ge 0, \quad h = 1, 2, \ldots, v \tag{11g}$$

Risk-averse inexact two-stage stochastic programming

However, in water resources optimization problems, uncertainty presented as interval numbers is more straightforward than probability density functions (PDFs) due to the poor quality of information that can be obtained (Li et al. ). Thus, by introducing the interval parameter programming to quantify those uncertainties presented in terms of interval values, Model (11) can be transformed into the following RITSP model:

$$\max f^\pm = (1-\lambda) c^\pm x^\pm - \sum_{h=1}^{v} p_h q(y_h^\pm, \omega_h^\pm) + \lambda \left\{ \xi^\pm - \frac{1}{1-\alpha} \sum_{h=1}^{v} p_h V_h^\pm \right\} \tag{12a}$$

subject to

$$a^\pm x^\pm \le b^\pm \tag{12b}$$

$$D(\omega_h^\pm) y_h^\pm \le h(\omega_h^\pm) + T(\omega_h^\pm) x^\pm, \quad h = 1, 2, \ldots, v \tag{12c}$$

$$V_h^\pm \ge \xi^\pm - c^\pm x^\pm + q(y_h^\pm, \omega_h^\pm), \quad h = 1, 2, \ldots, v \tag{12d}$$

$$V_h^\pm \ge 0, \quad h = 1, 2, \ldots, v \tag{12e}$$

$$x^\pm \ge 0 \tag{12f}$$

$$y_h^\pm \ge 0, \quad h = 1, 2, \ldots, v \tag{12g}$$

where $a^\pm, c^\pm, x^\pm, b^\pm, y_h^\pm, \xi^\pm, \omega_h^\pm, V_h^\pm \in \{R^\pm\}$, and $\{R^\pm\}$ denotes a set of interval parameters and/or variables; superscript '±' means interval-valued feature; the '−' and '+' superscripts represent lower and upper bounds of an interval parameter/variable, respectively.

Solution of the RITSP model

Model (12) can be transformed into two deterministic submodels that correspond to the lower and upper bounds of the desired objective function value. This transformation process is based on an interactive algorithm, which is different from the best/worst case analysis (Huang et al. ). The objective function value corresponding to $f^+$ is desired first because the objective is to maximize net system benefit. The submodel to find $f^+$ can be first formulated as follows (assume that $c^\pm \ge 0$, $A^\pm \ge 0$, and $b^\pm \ge 0$):

$$\max f^+ = (1-\lambda) c^+ x - \sum_{h=1}^{v} p_h q(y_h^-, \omega_h^-) + \lambda \left\{ \xi^+ - \frac{1}{1-\alpha} \sum_{h=1}^{v} p_h V_h^- \right\} \tag{13a}$$

subject to

$$x = x^- + \mu (x^+ - x^-) \tag{13b}$$

$$0 \le \mu \le 1 \tag{13c}$$

$$a^- x \le b^+ \tag{13d}$$

$$D(\omega_h^-) y_h^- \le h(\omega_h^-) + T(\omega_h^-) x, \quad h = 1, 2, \ldots, v \tag{13e}$$

$$V_h^- \ge \xi^+ - c^+ x + q(y_h^-, \omega_h^-), \quad h = 1, 2, \ldots, v \tag{13f}$$

$$V_h^- \ge 0, \quad h = 1, 2, \ldots, v \tag{13g}$$

$$y_h^- \ge 0, \quad h = 1, 2, \ldots, v \tag{13h}$$
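As a rough illustration of the CVaR term that appears in the objectives of Models (11) to (13), the sketch below evaluates $\xi - \frac{1}{1-\alpha}\sum_h p_h V_h$ with $V_h = \max(\xi - z_h, 0)$ on a small discrete profit distribution and maximizes it over $\xi$. The function name `cvar_profit` and the four-scenario data are hypothetical and are not taken from the paper. The enumeration of candidate $\xi$ values relies on the expression being piecewise linear and concave in $\xi$, so a maximizer lies at one of the scenario profits; a full model would instead leave $\xi$ and $V_h$ to the LP solver.

```python
def cvar_profit(profits, probs, alpha):
    """Lower-tail CVaR of a discrete profit distribution at confidence level alpha."""

    def objective(xi):
        # V_h = max(xi - z_h, 0) is the smallest value that satisfies
        # V_h >= xi - z_h and V_h >= 0.
        shortfall = sum(p * max(xi - z, 0.0) for z, p in zip(profits, probs))
        return xi - shortfall / (1.0 - alpha)

    # The objective is piecewise linear and concave in xi, so it is enough
    # to test its breakpoints, i.e. the scenario profits themselves.
    return max(objective(z) for z in profits)


profits = [10.0, 20.0, 30.0, 40.0]  # hypothetical scenario profits
probs = [0.25, 0.25, 0.25, 0.25]    # hypothetical scenario probabilities

cvar_50 = cvar_profit(profits, probs, alpha=0.50)
cvar_75 = cvar_profit(profits, probs, alpha=0.75)
print(cvar_50, cvar_75)
```

With these equally likely profits, the value at $\alpha = 0.50$ is the mean of the two worst outcomes (15.0) and the value at $\alpha = 0.75$ is the single worst outcome (10.0), which matches the conditional-expectation reading of CVaR.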
In Submodel (13), $\mu$ and $y_h^-$ are decision variables. The optimal $f_{opt}^+$, $\mu_{opt}$ and $y_{h\,opt}^-$ are obtained by solving Submodel (13), and $x_{opt} = x^- + \mu_{opt}(x^+ - x^-)$ is the optimized first-stage variable, which corresponds to the optimized upper-bound objective function value. Based on the above solutions, the second submodel for $f^-$ can be formulated as follows:

$$\max f^- = (1-\lambda) c^- x_{opt} - \sum_{h=1}^{v} p_h q(y_h^+, \omega_h^+) + \lambda \left\{ \xi^- - \frac{1}{1-\alpha} \sum_{h=1}^{v} p_h V_h^+ \right\} \tag{14a}$$

subject to

$$a^+ x_{opt} \le b^- \tag{14b}$$

$$D(\omega_h^+) y_h^+ \le h(\omega_h^+) + T(\omega_h^+) x_{opt}, \quad h = 1, 2, \ldots, v \tag{14c}$$

$$V_h^+ \ge \xi^- - c^- x_{opt} + q(y_h^+, \omega_h^+), \quad h = 1, 2, \ldots, v \tag{14d}$$

$$V_h^+ \ge V_h^- \ge 0, \quad h = 1, 2, \ldots, v \tag{14e}$$

$$y_h^+ \ge y_{h\,opt}^- \ge 0, \quad h = 1, 2, \ldots, v \tag{14f}$$

Solutions of $y_{opt}^+$ can be obtained through Submodel (14). By integrating the solutions of Submodels (13) and (14), the interval solution for Model (12) can be obtained as follows:

$$f_{opt}^\pm = [f_{opt}^-, f_{opt}^+] \tag{15a}$$

$$x_{opt} = x^- + \mu_{opt}(x^+ - x^-) \tag{15b}$$

$$y_{h\,opt}^\pm = [y_{h\,opt}^-, y_{h\,opt}^+] \tag{15c}$$

CASE STUDY

A case study of regional water management is then provided to demonstrate the applicability of the developed method. In the water resources system, a water manager is responsible for allocating the limited water resources to support regional municipal, industrial and agricultural development. In the study region, agriculture and industry are the dominant activities, and agricultural irrigation and industrial consumption account for more than 75% of the total water demand to promote the development of the regional economy; 20–25% of the total water consumption is used for drinking, cleaning, and other municipal purposes. The main elements of the problem involve the water resource availabilities, the water demands to satisfy current and potential future needs, and the water transportation systems. With less water available, the management of water resources is becoming more complex in order to satisfy all users, and existing public institutions lack the capacity and structure to properly deal with the situation. Since local economic development relies heavily on the availability of the water supply, adaptive strategies for water shortage crises are of high importance to local government. Moreover, in water resources systems, the manager wants to obtain different options for water supply, and select the option or combination of options that provides the necessary amount of water in the most cost-effective manner while taking into account technical and social criteria. From an economic point of view, all users need to know how much water they can expect to obtain during the planning horizon in order to establish and make plans for rational production.

However, a variety of complexities exist in the study problem. On the one hand, the hydrologic cycle is basically dependent upon the geology and climate, which determine the physical characteristics of the basin, the natural environment, and the variability of the mass and energy exchanges occurring within the river basin. On the other hand, human activities impact the natural resources, mainly through land and water use, and produce changes in the dynamics of the natural environment and the hydrologic cycle, which may amplify the variability of those exchanges, affecting the hydrologic balance and the use of the natural resources (Victoria et al. ; Li et al. ). These complexities could become further compounded by not only interactions among many uncertain system components but also their economic implications caused by improper policies.

Thus, the manager needs to create a plan to effectively allocate the uncertain supply of water to the three users in order to maximize the overall system benefit while simultaneously considering the uncertainties in the system. In addition, based on the regional water management policies, an allowable flow
level to each user must be regulated. If the promised water amount is delivered, the net benefit will be generated. However, if the promised water amount is not delivered, either the water must be obtained from higher price alternatives or the supply must be decreased by reducing the scale of production to fill the so-called deviation, causing economic losses (Li et al. , ). Moreover, the existence of multiple uncertainties associated with the water resources system will aggravate the risk of system impairment and failure. Therefore, it is desirable that risk control be considered in the water allocation planning program. Under consideration of the risk of the water resources system, the problem becomes how to effectively allocate water to various sectors in order to achieve a maximum benefit assuming a given risk level under uncertainties. To solve such a problem, the proposed RITSP is considered to be a suitable approach for dealing with the study problem:

$$\max f^\pm = \sum_{i=1}^{3} (1-\lambda) NB_i^\pm W_i^\pm - \sum_{i=1}^{3} \sum_{h=1}^{7} p_h C_i^\pm D_{ih}^\pm + \lambda \left\{ \xi_\alpha^\pm - \frac{1}{1-\alpha} \sum_{h=1}^{7} p_h V_h^\pm \right\} \tag{16a}$$

subject to

[constraints of water availability]

$$\sum_{i=1}^{3} (W_i^\pm - D_{ih}^\pm) \le q_h^\pm, \quad \forall h \tag{16b}$$

[constraints of extreme allocation amounts]

$$W_{i\,max}^\pm \ge W_i^\pm - D_{ih}^\pm, \quad \forall i, t, h \tag{16c}$$

$$W_i^\pm - D_{ih}^\pm \ge W_{i\,min}^\pm, \quad \forall i, t, h \tag{16d}$$

$$V_h^\pm \ge \xi_\alpha^\pm - \sum_{i=1}^{3} NB_i^\pm W_i^\pm + \sum_{i=1}^{3} C_i^\pm D_{ih}^\pm, \quad \forall h \tag{16e}$$

[nonnegative constraints]

$$V_h^\pm \ge 0, \quad \forall h \tag{16f}$$

$$D_{ih}^\pm \ge 0, \quad \forall i, h \tag{16g}$$

where $f^\pm$ is the net system benefit over the planning horizon ($); $i$ is the index of water users, where $i = 1$ for municipality, $i = 2$ for industrial production, and $i = 3$ for agricultural sector; $h$ is the index of scenarios where $h = 1, 2, \ldots, 7$; $W_i^\pm$ is the allocation target of water that is promised to user $i$; $D_{ih}^\pm$ is the amount of water deficit by which the water allocation target $W_i^\pm$ is not met in scenario $h$; $NB_i^\pm$ is the net benefit of user $i$ per unit of water allocated; $C_i^\pm$ is the reduction of net benefit to user $i$ per unit of water not delivered; $\lambda$ is a nonnegative trade-off coefficient representing the exchange rate of mean benefit for risk; $\xi_\alpha^\pm$ is an auxiliary variable, which is the maximum benefit at the cumulative probability $\alpha$; $\alpha$ is the confidence level; $V_h^\pm$ is a positive auxiliary variable under scenario $h$; $p_h$ is the probability of occurrence for scenario $h$; $q_h^\pm$ is the available water resources in scenario $h$; $W_{i\,max}^\pm$ and $W_{i\,min}^\pm$ are the maximum and minimum allowable allocation amounts for user $i$.

For Model (16), if $W_i^\pm$ are considered as uncertain inputs, the existing methods for solving inexact linear programming problems cannot be used directly. In this study, an optimized set of target values will be identified by having $\mu_i$ in Model (17) be decision variables. This optimized set will correspond to the highest possible system benefit under the uncertain water allocation targets. Accordingly, let $W_i = W_i^- + \mu_i \Delta W_i$, where $\Delta W_i = W_i^+ - W_i^-$ and $\mu_i \in [0, 1]$. The $\mu_i$ are decision variables that are used for identifying an optimized set of target values $W_i^\pm$ in order to support the related policy analyses (Huang & Loucks ). For example, when $W_i^\pm$ approach their upper bounds (i.e., when $\mu_i = 1$), a relatively high benefit would be obtained if the water demands are satisfied; however, a high penalty may have to be paid when the promised water is not delivered. Conversely, when $W_i^\pm$ reach their lower bounds (i.e., when $\mu_i = 0$), we may have a lower cost and a higher risk of violating the promised targets. Therefore, by introducing decision variables $\mu_i$, and according to Huang & Loucks (), the model can be transformed into two deterministic submodels based on an interactive algorithm. Since the objective is to maximize the net system benefit, the submodel corresponding to the upper-bound objective function value ($f^+$) is desired first. Thus, we have:

$$\max f^+ = \sum_{i=1}^{3} (1-\lambda) NB_i^+ (W_i^- + \mu_i \Delta W_i) - \sum_{i=1}^{3} \sum_{h=1}^{7} p_h C_i^- D_{ih}^- + \lambda \left\{ \xi_\alpha^+ - \frac{1}{1-\alpha} \sum_{h=1}^{7} p_h V_h^- \right\} \tag{17a}$$
subject to

$$\Delta W_i = W_i^+ - W_i^- \tag{17b}$$

$$0 \le \mu_i \le 1 \tag{17c}$$

$$\sum_{i=1}^{3} (W_i^- + \mu_i \Delta W_i - D_{ih}^-) \le q_h^+, \quad \forall h \tag{17d}$$

$$W_{i\,max}^+ \ge W_i^- + \mu_i \Delta W_i - D_{ih}^-, \quad \forall i, t, h \tag{17e}$$

$$W_i^- + \mu_i \Delta W_i - D_{ih}^- \ge W_{i\,min}^+, \quad \forall i, t, h \tag{17f}$$

$$V_h^- \ge \xi_\alpha^+ - \sum_{i=1}^{3} NB_i^+ (W_i^- + \mu_i \Delta W_i) + \sum_{i=1}^{3} C_i^- D_{ih}^-, \quad \forall h \tag{17g}$$

$$V_h^- \ge 0, \quad \forall h \tag{17h}$$

$$D_{ih}^- \ge 0, \quad \forall i, h \tag{17i}$$

where $f_{opt}^+$, $D_{ih\,opt}^-$, and $\mu_{i\,opt}$ are solutions of Submodel (17). The solution for $f^+$ provides the extreme upper bound of system benefit under uncertain inputs. Then, the optimized water allocation targets would be $W_{i\,opt} = W_i^- + \Delta W_i \mu_{i\,opt}$. Consequently, the submodel corresponding to the lower bound of the objective function value (i.e., $f^-$) is:

$$\max f^- = \sum_{i=1}^{3} (1-\lambda) NB_i^- W_{i\,opt} - \sum_{i=1}^{3} \sum_{h=1}^{7} p_h C_i^+ D_{ih}^+ + \lambda \left\{ \xi_\alpha^- - \frac{1}{1-\alpha} \sum_{h=1}^{7} p_h V_h^+ \right\} \tag{18a}$$

subject to

$$W_{i\,opt} = W_i^- + \Delta W_i \mu_{i\,opt} \tag{18b}$$

$$\sum_{i=1}^{3} (W_{i\,opt} - D_{ih}^+) \le q_h^-, \quad \forall h \tag{18c}$$

$$W_{i\,max}^- \ge W_{i\,opt} - D_{ih}^+, \quad \forall i, t, h \tag{18d}$$

$$W_{i\,opt} - D_{ih}^+ \ge W_{i\,min}^-, \quad \forall i, t, h \tag{18e}$$

$$V_h^+ \ge \xi_\alpha^- - \sum_{i=1}^{3} NB_i^- W_{i\,opt} + \sum_{i=1}^{3} C_i^+ D_{ih}^+, \quad \forall h \tag{18f}$$

$$V_h^+ \ge 0, \quad \forall h \tag{18g}$$

$$D_{ih}^+ \ge 0, \quad \forall i, h \tag{18h}$$

where $f_{opt}^-$ and $D_{ih\,opt}^+$ are solutions of Submodel (18). Thus, the solutions for Model (16) under the optimized targets can be obtained by combining the solutions of the two submodels.

Table 1 provides the water target demands and the related economic data. The data were obtained from a number of representative cases for water resources management (Loucks et al. ; Huang & Loucks ; Li et al. , ).

Table 1 | Water target demands and the related economic data

| Item | Municipal | Industrial | Agricultural |
|---|---|---|---|
| Water allocation target, $W_i^\pm$ (10^6 m^3) | [2.20, 4.00] | [3.00, 5.50] | [3.50, 6.50] |
| Minimum allowable allocation, $W_{i\,min}^\pm$ (10^6 m^3) | [1.00, 1.50] | [0.50, 1.00] | [0.60, 1.00] |
| Net benefit when water demand is satisfied, $NB_i^\pm$ ($/m^3) | [90, 100] | [45, 55] | [25, 35] |
| Penalty when water is not delivered, $C_i^\pm$ ($/m^3) | [125, 135] | [70, 80] | [45, 55] |

Since uncertainties exist in the system components, water allocation targets and economic data are expressed in interval format. Let $W_i^\pm$ be the quantity of water that is promised to each user $i$. If this water is delivered, the resulting net benefit to the local economy per unit of water allocated is estimated to be $NB_i^\pm$. However, if the promised water is not delivered, either water must be obtained from alternative and more expensive sources, or demand must be curtailed by reduced production and/or increased recycling within the industrial concern, or by reduced irrigation in the agricultural sector. This results in a reduction of net benefit to user $i$ of $C_i^\pm$ per unit of water not delivered ($C_i^\pm > NB_i^\pm$). In addition, in the water resources system, the total amount of water available has
characteristics of random and increasing or decreasing trend changes. Theoretically, there are other ways to generate the random variables. One is a survey of different experts, based on the assumption that not enough data are available. A large group of experts is required to estimate the value of a certain parameter. Then the value of the parameter can be obtained by analyzing the estimations through sample statistic inductive methods. The other way is exemplified by the cumulative probability distribution function, which is based on the assumption that enough data are available. According to the local policy of hypothetical cases, seven discrete water inflow values (i.e., very-low, low, low-medium, medium, medium-high, high, and very-high) are selected as the range of intervals. In addition, division of the targets into a number of predefined values associated with probabilities (8, 12, 16, 25, 15, 14, and 10%) can meet the requirement of the RITSP. Table 2 shows the water inflow levels and the associated probabilities of occurrence. From previous studies (Conejo et al. ; Pousinho et al. ), the value of α is commonly set between 0.90 and 0.99. In addition, the value of λ can be chosen as any nonnegative real number. After a number of test runs, it was found that, if the λ value is over 0.6, the solutions of optimal water allocation targets and water shortage amounts are the same as those obtained under λ = 0.6. Therefore, in order to reflect the variation trend of allocation policies by changing the value of λ, the λ value is set between 0 and 1.

Table 2 | Stream flow distribution

| Flow level | Probability | Stream flows $q_h^\pm$ (10^6 m^3) |
|---|---|---|
| Very-low (V-L) | 0.08 | [3.80, 5.20] |
| Low (L) | 0.12 | [5.50, 6.50] |
| Low-medium (L-M) | 0.16 | [6.90, 8.20] |
| Medium (M) | 0.25 | [8.50, 9.80] |
| Medium-high (M-H) | 0.15 | [10.0, 11.5] |
| High (H) | 0.14 | [11.5, 12.9] |
| Very-high (V-H) | 0.10 | [13.2, 14.5] |

Uncertainties exist in many of the system components (provided as intervals for water allocation targets and economic data, as well as distribution information for the total water availability). The problems under consideration include: (1) how to suitably allocate water flows to achieve a maximized system benefit; (2) how to identify desired water allocation policies under different risk levels; and (3) how to seek cost-effective water resources management strategies under complex uncertainties. The developed RITSP is considered to be a suitable approach for dealing with these problems.

RESULT ANALYSIS AND DISCUSSION

Results have been obtained through solving the RITSP model. The solutions for the objective function value and most of the nonzero decision variables were interval numbers. Generally, solutions presented as intervals demonstrate that the related decisions should be sensitive to the uncertain modeling inputs (Li et al. ). Table 3 shows the solutions of water allocation targets ($W_{i\,opt}$) under different α and λ levels during the planning horizon.

Table 3 | Optimal targets of the RITSP model (10^6 m^3)

| α level | Target | λ = 0 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.90 | W1 opt | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 |
| 0.90 | W2 opt | 5.40 | 4.80 | 4.00 | 4.00 | 4.00 | 4.00 | 3.20 | 3.20 | 3.20 | 3.20 | 3.20 |
| 0.90 | W3 opt | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 |
| 0.95 | W1 opt | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 |
| 0.95 | W2 opt | 5.40 | 4.00 | 3.20 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 |
| 0.95 | W3 opt | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 |
| 0.99 | W1 opt | 4.00 | 4.00 | 4.00 | 3.20 | 3.20 | 3.20 | 3.20 | 3.20 | 3.20 | 3.20 | 3.20 |
| 0.99 | W2 opt | 5.40 | 4.00 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 |
| 0.99 | W3 opt | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 | 3.50 |

Various α and λ levels correspond to different system confidence levels and different levels of trade-off
between profit and risk, and thus would lead to varied water allocation targets. For example, when α = 0.90 and 0.95, water allocation targets for the municipal sector would be 4.00 × 10^6 m^3 under different λ levels; however, when α = 0.99, the water allocation targets for this user would be 4.00 × 10^6 m^3 (λ = 0, 0.1, and 0.2), and 3.20 × 10^6 m^3 (the value of λ is from 0.3 to 1.0). Generally, water resources would first be allocated to the municipal sector, followed by the industrial and agricultural sectors. For example, the optimized allocation target for the municipality over the planning horizon would be close to its maximum value under different α and λ levels. This is because the municipality could bring about the highest benefit when its demand is satisfied; thus, the manager would have to promise larger amounts to it to achieve a maximized system benefit. The optimized allocation target for the agricultural sector would reach its minimum value under demanding conditions since this user is associated with the lowest benefit. In comparison, the optimized water allocation target for industry would fluctuate within its minimum and maximum values as α and λ levels are varied; the benefit from industry lies between the profits from the municipality and agriculture. Moreover, the water allocation targets would decrease with increasing λ levels, especially at a high confidence level. For example, when α = 0.99, the water allocation targets for industry would be 5.40 × 10^6 m^3 (λ = 0), 4.00 × 10^6 m^3 (λ = 0.1), and 3.00 × 10^6 m^3 (the value of λ is from 0.2 to 1.0).

Variations in $W_i^\pm$ could reflect different policies of water resources management under uncertainty. When the water allocation targets reach their lower bounds, the corresponding policy may result in less water shortage and lower economic penalty. Moreover, the upper bounds of $W_i^\pm$ would lead to a strategy with higher allocated targets, resulting in a higher system benefit and a higher risk of penalty when the water inflow is at a lower level. Therefore, different policies in predefining the promised water allocation are associated with different levels of economic benefit and system failure risk.

Tables 4 to 6 present the water deficit ($D_{ih}^\pm$) under different scenarios in the planning horizon. The solutions of $D_{ih}^\pm$
Table 4 | Solutions of $D_{ih}^\pm$ (10^6 m^3) from RITSP model under α = 0.90

| h | i | λ = 0 | λ = 0.1 | λ = 0.2 | λ = 0.3 | λ = 0.4 | λ = 0.5 | λ = 0.6 |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | [0.80,1.30] | [0.80,1.30] | [0.80,1.30] | [0.80,1.30] | [0.80,1.30] | [0.80,1.30] | [0.80,1.30] |
| 1 | 2 | [4.40,4.90] | [3.80,4.30] | [3.00,3.50] | [3.00,3.50] | [3.00,3.50] | [3.00,3.50] | [2.20,2.70] |
| 1 | 3 | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] |
| 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 2 | [3.90,4.50] | [3.30,3.90] | [2.50,3.10] | [2.50,3.10] | [2.50,3.10] | [2.50,3.10] | [1.70,2.30] |
| 2 | 3 | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] |
| 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 2 | [2.20,3.10] | [1.60,2.50] | [0.80,1.70] | [0.80,1.70] | [0.80,1.70] | [0.80,1.70] | [0,0.90] |
| 3 | 3 | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] |
| 4 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 2 | [0.60,1.50] | [0,0.90] | [0,0.10] | [0,0.10] | [0,0.10] | [0,0.10] | 0 |
| 4 | 3 | [2.50,2.90] | [2.50,2.90] | [1.70,2.90] | [1.70,2.90] | [1.70,2.90] | [1.70,2.90] | [0.90,2.20] |
| 5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 3 | [1.40,2.90] | [0.80,2.30] | [0,1.50] | [0,1.50] | [0,1.50] | [0,1.50] | [0,0.70] |
| 6 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 3 | [0,1.40] | [0,0.80] | 0 | 0 | 0 | 0 | 0 |
| 7 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Note: When λ > 0.6, the solutions of $D_{ih}^\pm$ are the same as those obtained under λ = 0.6.
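As a plausibility check on the λ = 0 column of Table 4, the sketch below reproduces the very-low and medium flow deficits with a simple priority rule: every user first receives its minimum allowable allocation, and the remaining inflow is then handed out in the order municipal, industrial, agricultural until each target is met. This greedy rule is an assumption made here for illustration and is not the linear program the authors solve; the function name `greedy_deficits` is hypothetical. Following the bound pairing of Submodels (17) and (18), the lower deficit bounds are computed from the upper inflow and upper minimum allocations, and the upper deficit bounds from the lower inflow and lower minimum allocations.

```python
def greedy_deficits(targets, minimums, inflow):
    """Deficits left after a priority-ordered allocation of `inflow`.

    Users are served in list order; assumes inflow >= sum(minimums)
    and targets[i] >= minimums[i].
    """
    alloc = list(minimums)
    remaining = inflow - sum(minimums)
    for i, target in enumerate(targets):
        extra = min(target - alloc[i], remaining)
        alloc[i] += extra
        remaining -= extra
    return [round(t - a, 2) for t, a in zip(targets, alloc)]


# Data from Tables 1 to 3 (10^6 m^3); targets are the lambda = 0 row of Table 3.
targets = [4.00, 5.40, 3.50]      # municipal, industrial, agricultural
min_lower = [1.00, 0.50, 0.60]    # lower bounds of W_i min
min_upper = [1.50, 1.00, 1.00]    # upper bounds of W_i min

very_low_upper_deficit = greedy_deficits(targets, min_lower, 3.80)
very_low_lower_deficit = greedy_deficits(targets, min_upper, 5.20)
medium_upper_deficit = greedy_deficits(targets, min_lower, 8.50)
medium_lower_deficit = greedy_deficits(targets, min_upper, 9.80)
print(very_low_lower_deficit, very_low_upper_deficit)
print(medium_lower_deficit, medium_upper_deficit)
```

For these two flow levels the rule returns the same interval bounds as the λ = 0 column of Table 4 (h = 1 and h = 4). Agreement in the risk-neutral case does not imply that the rule matches the model for λ > 0, where the optimized targets themselves change.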
Table 5 | Solutions of $D_{ih}^\pm$ (10^6 m^3) from RITSP model under α = 0.95

| h | i | λ = 0 | λ = 0.1 | λ = 0.2 | λ = 0.3 | λ = 0.4 | λ = 0.5 | λ = 0.6 |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | [0.80,1.30] | [0.80,1.30] | [0.80,1.30] | [0.80,1.30] | [0.80,1.30] | [0.80,1.30] | [0.80,1.30] |
| 1 | 2 | [4.40,4.90] | [3.00,3.50] | [2.20,2.70] | [2.00,2.50] | [2.00,2.50] | [2.00,2.50] | [2.00,2.50] |
| 1 | 3 | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] |
| 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 2 | [3.90,4.50] | [2.50,3.10] | [1.70,2.30] | [1.50,2.10] | [1.50,2.10] | [1.50,2.10] | [1.50,2.10] |
| 2 | 3 | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] |
| 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 2 | [2.20,3.10] | [0.80,1.70] | [0,0.90] | [0,0.70] | [0,0.70] | [0,0.70] | [0,0.70] |
| 3 | 3 | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.30,2.90] | [2.30,2.90] | [2.30,2.90] | [2.30,2.90] |
| 4 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 2 | [0.60,1.50] | [0,0.10] | 0 | 0 | 0 | 0 | 0 |
| 4 | 3 | [2.50,2.90] | [1.70,2.90] | [0.90,2.20] | [0.70,2.00] | [0.70,2.00] | [0.70,2.00] | [0.70,2.00] |
| 5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 3 | [1.40,2.90] | [0,1.50] | [0,0.70] | [0,0.50] | [0,0.50] | [0,0.50] | [0,0.50] |
| 6 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 3 | [0,1.40] | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Note: When λ > 0.6, the solutions of $D_{ih}^\pm$ are the same as those obtained under λ = 0.6.
under the given targets reflect the variations of system conditions caused by inputs of the uncertain parameters. Generally, the water shortage solutions of the three users and scenarios can be similarly interpreted based on the results. As the water flow level increases, the water allocation target would be satisfied, and the water shortage would decrease. For example, when α = 0.90 and λ = 0.1, the industrial water shortages would be [3.80, 4.30] × 10^6 m^3, [3.30, 3.90] × 10^6 m^3, [1.60, 2.50] × 10^6 m^3, and [0, 0.90] × 10^6 m^3, when flow levels are very-low, low, low-medium, and medium, respectively; there would be no shortages under medium-high, high and very-high flow levels.

In addition, a trade-off could be analyzed by assigning different λ values in the model when α is fixed. From Tables 3–6, a number of decision variables such as the target values ($W_{i\,opt}$) and the upper and lower bounds of the shortage ($D_{ih}^\pm$) amount would vary with different λ values. As the value of λ increases, the water allocation target and shortage of the three users would decrease. For example, when the available quantity of water is at the medium level of stream flow and α = 0.90 (Table 4), the amount of industrial water shortage would be [0.60, 1.50] × 10^6 m^3 (λ = 0), [0, 0.90] × 10^6 m^3 (λ = 0.1), [0, 0.10] × 10^6 m^3 (the value of λ is from 0.2 to 0.5), and 0 (the value of λ is from 0.6 to 1.0), respectively; the water shortage of the municipal sector would be 0 under different λ values, and the agricultural shortage would decrease from [2.50, 2.90] × 10^6 m^3 to [0.90, 2.20] × 10^6 m^3 when λ changes from 0.1 to 1.0. Generally, as λ increases, the allocation target would decrease, leading to a decreased amount of water shortage. This indicates that when the risk coefficient λ increases, water managers would choose a conservative water allocation scheme to avoid the risk. In contrast, a lower λ value would result in alternatives with lower risk aversion. Moreover, when the confidence level α increases, the allocation target would decrease, leading to a reduced amount of water shortage and increased water allocation balance among users. For example, under the low inflow level, the municipal water shortage would be 0, and agricultural deficit would be [2.50, 2.90] × 10^6 m^3 with
Table 6 | Solutions of $D_{ih}^\pm$ (10^6 m^3) from RITSP model under α = 0.99

| h | i | λ = 0 | λ = 0.1 | λ = 0.2 | λ = 0.3 | λ = 0.4 | λ = 0.5 | λ = 0.6 |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | [0.80,1.30] | [0.80,1.30] | [0.80,1.30] | [0,0.50] | [0,0.50] | [0,0.50] | [0,0.50] |
| 1 | 2 | [4.40,4.90] | [3.00,3.50] | [2.00,2.50] | [2.00,2.50] | [2.00,2.50] | [2.00,2.50] | [2.00,2.50] |
| 1 | 3 | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] |
| 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 2 | [3.90,4.50] | [2.50,3.10] | [1.50,2.10] | [0.70,1.30] | [0.70,1.30] | [0.70,1.30] | [0.70,1.30] |
| 2 | 3 | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] | [2.50,2.90] |
| 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 2 | [2.20,3.10] | [0.80,1.70] | [0,0.70] | 0 | 0 | 0 | 0 |
| 3 | 3 | [2.50,2.90] | [2.50,2.90] | [2.30,2.90] | [1.50,2.80] | [1.50,2.80] | [1.50,2.80] | [1.50,2.80] |
| 4 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 2 | [0.60,1.50] | [0,0.10] | 0 | 0 | 0 | 0 | 0 |
| 4 | 3 | [2.50,2.90] | [1.70,2.90] | [0.70,2.00] | [0,1.20] | [0,1.20] | [0,1.20] | [0,1.20] |
| 5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 3 | [1.40,2.90] | [0,1.50] | [0,0.50] | 0 | 0 | 0 | 0 |
| 6 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 3 | [0,1.40] | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Note: When λ > 0.6, the solutions of $D_{ih}^\pm$ are the same as those obtained under λ = 0.6.
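The allocation actually delivered to a user is the optimized target minus the deficit, so an interval deficit $[D^-, D^+]$ gives the interval allocation $[W_{opt} - D^+, W_{opt} - D^-]$. The sketch below applies this to the low-medium flow level (h = 3) at λ = 0.6, using targets from Table 3 and deficits from Tables 4 to 6. The helper name `allocation` and the dictionary layout are hypothetical; the summation of bounds across users is a simple consistency check chosen here, not a step described by the authors.

```python
def allocation(target, deficit):
    """Interval allocation (lower, upper) from a target and an interval deficit."""
    d_lo, d_hi = deficit
    return (round(target - d_hi, 2), round(target - d_lo, 2))


# alpha -> (targets, deficits) for h = 3 (low-medium flow), lambda = 0.6.
# User order: municipal, industrial, agricultural; units are 10^6 m^3.
cases = {
    0.90: ([4.00, 3.20, 3.50], [(0, 0), (0, 0.90), (2.50, 2.90)]),
    0.95: ([4.00, 3.00, 3.50], [(0, 0), (0, 0.70), (2.30, 2.90)]),
    0.99: ([3.20, 3.00, 3.50], [(0, 0), (0, 0), (1.50, 2.80)]),
}

allocations = {}
totals = {}
for alpha, (targets, deficits) in cases.items():
    allocs = [allocation(w, d) for w, d in zip(targets, deficits)]
    allocations[alpha] = allocs
    totals[alpha] = (
        round(sum(a[0] for a in allocs), 2),
        round(sum(a[1] for a in allocs), 2),
    )
    print(alpha, allocs, totals[alpha])
```

The agricultural allocations come out as [0.60, 1.00], [0.60, 1.20] and [0.70, 2.00] for α = 0.90, 0.95 and 0.99, and the industrial allocations as [2.30, 3.20], [2.30, 3.00] and 3.00. In each case the summed bounds equal the low-medium inflow interval [6.90, 8.20] of Table 2, which suggests all available water is allocated in this scenario.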
the scenarios of α with the values of 0.90, 0.95, and 0.99; for the industrial sector with different λ values, the amount of water deficit would decrease from [3.90, 4.50] × 10^6 m^3 to [1.70, 2.30] × 10^6 m^3 when α is a fixed value of 0.90, and reduce from [3.90, 4.50] × 10^6 m^3 to [0.70, 1.30] × 10^6 m^3 when the value of α is 0.99. In such a case, the extreme risk would be lowered and the system feasibility would be enhanced. In contrast, a lower α value would result in a higher possibility of system loss in extreme conditions.

The RITSP model can generate a large number of water allocation strategies with different α and λ values under different inflow levels, in order to analyze the effects of α and λ on water allocation policies. Figures 3–5 present the optional water allocation schemes obtained through the RITSP model. Due to the highest benefit, water would be first allocated to the municipal sector under different α and λ values. For example, the water allocated to the municipal sector would reach the upper bound of the water allocation target (e.g., 4.00 × 10^6 m^3) under the scenarios of α with the values of 0.90 and 0.95; the water allocation would decrease to 3.2 × 10^6 m^3 when the value of λ changes from 0.4 to 1.0 under the scenario of α with the value of 0.99. In addition, under the lower level of water inflow, the amount of industrial allocation would decrease, and the agricultural water allocation would increase, when the value of α increases from 0.90 to 0.99 under the same λ value. For example, under the low-medium water flow level, when λ is a fixed value of 0.6, the industrial water allocation would be [2.30, 3.20] × 10^6, [2.30, 3.00] × 10^6, and 3.00 × 10^6 m^3, and the amount of agricultural allocation would be [0.60, 1.00] × 10^6, [0.60, 1.20] × 10^6, and [0.70, 2.00] × 10^6 m^3, under the condition of α increasing from 0.90 to 0.99 (values consistent with Tables 3–6). From Figures 3–5, the lower and upper bounds of the water allocation amount would vary with the change of α. This shows that the effect of the risk measure on the modeling outputs could be adjusted by changing the α value. Generally, a high α value would lead to a lower risk and enhanced system feasibility. The water allocated to the users with higher benefit would decrease, and the water supplied to the users with lower benefit would increase when
Figure 3 | Optimized water allocation schemes under α = 0.90.
there is an α value increment with a fixed λ value, in order to reduce the risk of unbalanced water allocation caused by the objective of maximum net benefit in water resources system planning and management.

Figures 6 and 7 show the varying trend of the RITSP model's objective, the system net benefit and recourse cost under different α and λ values. In general, the bounds of the model's objective would increase as the value of λ increases when α is a fixed value. For example, when α is a fixed value of 0.90, the objective of the RITSP model would be $[0.40, 0.64] × 10^9, $[0.57, 0.96] × 10^9, $[0.79, 1.62] × 10^9, $[0.97, 1.62] × 10^9, $[1.15, 1.95] × 10^9, $[1.33, 2.28] × 10^9, $[1.63, 2.62] × 10^9, $[1.83, 2.95] × 10^9, $[2.02, 3.28] × 10^9, $[2.22, 3.62] × 10^9, and $[2.42, 3.95] × 10^9 under the scenario of λ varying from 0 to 1.0, respectively (as shown in Figure 6(a)). In addition, the values of system net benefit would first decrease and then remain unchanged as the value of λ increases. When α is a fixed value of 0.95, the net benefit would be $[400.22, 640.89] × 10^6 (λ = 0), $[427.87, 628.12] × 10^6 (λ = 0.1), $[433.14, 613.28] × 10^6 (λ = 0.2), and $[434.30, 608.77] × 10^6 (the value of λ is from 0.3 to 1.0), respectively; the recourse cost would be $[178.62, 290.28] × 10^6 (λ = 0), $[114.39, 199.63] × 10^6 (λ = 0.1), $[85.23, 158.37] × 10^6 (λ = 0.2), and $[78.74, 148.21] × 10^6 (the value of λ is from 0.3 to 1.0), respectively (as shown in Figure 7(b)). This indicates that increasing the value of λ would increase the relative importance of the risk term and also lead to a higher system risk, and the water managers would choose a conservative scheme with a lower system benefit. Moreover, as the value of α increases, the net benefit would decrease. For example, when λ is a fixed value of 0.6, the net benefit would be $[433.14, 613.28] × 10^6, $[434.30, 608.77] × 10^6, and $[403.58, 557.12] × 10^6 under the scenarios of α with
Figure 4 | Optimized water allocation schemes under α = 0.95.
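The recourse cost and net benefit intervals reported for α = 0.95 and λ = 0 can be traced back to the tabulated data. The sketch below computes the expected penalty $\sum_h p_h \sum_i C_i D_{ih}$ from Tables 1, 2 and 5 and subtracts it from the first-stage benefit $\sum_i NB_i W_{i\,opt}$ using the targets in Table 3. The pairing of bounds (lower penalties with lower deficits, and the lower net benefit from the lower unit benefits minus the upper recourse cost) is an assumption made here because it reproduces the reported figures; the variable and function names are hypothetical.

```python
probs = [0.08, 0.12, 0.16, 0.25, 0.15, 0.14, 0.10]   # Table 2
penalty = [(125, 135), (70, 80), (45, 55)]            # C_i bounds, $/m^3 (Table 1)
benefit = [(90, 100), (45, 55), (25, 35)]             # NB_i bounds, $/m^3 (Table 1)
targets = [4.00, 5.40, 3.50]                          # Table 3, lambda = 0

# Deficit intervals (D-, D+) per scenario h and user i: lambda = 0 column of Table 5.
deficits = [
    [(0.80, 1.30), (4.40, 4.90), (2.50, 2.90)],
    [(0, 0), (3.90, 4.50), (2.50, 2.90)],
    [(0, 0), (2.20, 3.10), (2.50, 2.90)],
    [(0, 0), (0.60, 1.50), (2.50, 2.90)],
    [(0, 0), (0, 0), (1.40, 2.90)],
    [(0, 0), (0, 0), (0, 1.40)],
    [(0, 0), (0, 0), (0, 0)],
]


def expected_recourse(bound):
    """Expected penalty using lower (bound=0) or upper (bound=1) values."""
    return sum(
        p * sum(c[bound] * d[bound] for c, d in zip(penalty, row))
        for p, row in zip(probs, deficits)
    )


recourse_lo = expected_recourse(0)
recourse_hi = expected_recourse(1)
gross_lo = sum(nb[0] * w for nb, w in zip(benefit, targets))
gross_hi = sum(nb[1] * w for nb, w in zip(benefit, targets))
net_lo = gross_lo - recourse_hi
net_hi = gross_hi - recourse_lo
print(recourse_lo, recourse_hi, net_lo, net_hi)  # all in 10^6 $
```

Under these assumptions the recourse cost evaluates to about 178.615 and 290.28, and the net benefit to about 400.22 and 640.885 (in 10^6 $), which agree with the reported intervals [178.62, 290.28] and [400.22, 640.89] after rounding.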
the value of 0.90, 0.95, and 0.99, respectively (Figure 7). Thus, increasing the parameter λ and/or the parameter α implies a higher level of risk aversion, and the recourse cost and the net benefit, which together constitute the expected total benefit, change monotonically as a function of α.

Figure 8 illustrates how the optimal CVaR changes as the risk parameters α and λ increase through solving the RITSP model. Similar to the optimal objective of the RITSP model, CVaR also decreases as α increases by the definition of CVaR. When α increases, the corresponding value-at-risk increases, and CVaR accounts for the risk of larger realizations. Thus, larger α values would lead to more conservative policies, which give more weight to worse scenarios. However, CVaR increases as λ increases. Due to the changing trade-off between the expectation and the CVaR criterion, larger λ values provide us with a lower expected benefit and a higher CVaR value. Increasing λ leads to a more risk-averse policy with a lower system benefit and lower expected recourse costs in general. Thus, increasing the parameter λ and/or the parameter α implies a higher level of risk aversion, and water managers would choose a more risk-averse policy, that is, a lower water allocation target for each user in order to avoid the risk of water shortage, and a well-balanced water allocation scheme to reduce the risk of conflicts over competition for water resources.

When the λ value is 0, the RITSP model would be an ITSP model for water resources system management under uncertainty. The detailed optimal water targets and water shortage from ITSP are presented in Tables 3–6. Unlike the RITSP model, the ITSP model aims to obtain the maximum benefit in the optimal process of water allocation, and it does not take the risk of model feasibility and
Figure 5 | Optimized water allocation schemes under α = 0.99.
reliability into consideration. These limitations could lead to low system stability and unbalanced allocation patterns. For example, when λ value is equal to 0, the water allocation targets of the municipal and industrial sectors would first be satisfied and reach their upper bounds, due to a higher benefit, and the agricultural water allocation target would reach the lower bounds; especially, in the very-low inflow level, water shortage would first occur in the agricultural sector. Moreover, the net benefit of ITSP is higher than that of the RITSP model. This also implies that the system objective of the ITSP model is only to obtain a maximum benefit without regarding risk aversion. In addition, the width of interval net benefit in the RITSP model is narrower than that of the ITSP model. It is indicated that the system benefit relies on the water resources condition, and tends to fluctuate more intensively with the change of available water resources. Through integrating CVaR into the objective of a water resources system management model, managers could obtain a robust and riskless decision.
CONCLUSIONS
In this study, a RITSP model is developed for supporting regional water resources management problems under uncertainty. This method is based on an integration of IPP, CVaR model, and two-stage stochastic programming (TSP). It allows uncertainties presented as both probability distributions and interval values to be incorporated within a general optimization framework. Moreover, the risk-aversion method was incorporated into the objective function to reflect the preference of decision makers, such that the trade-off between system economy and extreme expected loss could be analyzed. Then, the developed
Figure 6 | The objectives of the RITSP model under different α and λ levels.
method has been confirmed through a case study of a water resources allocation problem involving three competing water users. A number of scenarios corresponding to different river inflow and risk levels was examined; the results of the case study suggest that the methodology is applicable to reflecting complexities of water resources management and can be used for providing bases for identifying desired water allocation plans with a maximized system, and reflecting the decision maker’s attitude toward risk aversion.
The proposed method could help water resources managers identify desired management policies under various economic considerations. The study results suggested that the proposed approach was also applicable to many other environmental and energy management problems. The
Figure 7 | Net benefit and recourse cost under different α and λ levels.
risk-based framework could be used to assess the performance risk of unbalanced water resources allocation strategies in compliance with the economic and/or environmental management goals, and help managers identify desired water resources management policies under various environmental, economic, and system reliability considerations. It could also be coupled with other optimization methodologies to handle various types of uncertainties. However, compared with other approaches, there is still much space for improvement of the proposed model. For example, RITSP would have difficulties in dealing with the uncertainties in the model’s right-hand side coefficients; the probability of random variable is estimated through statistical analysis, which would unavoidably bring errors to the
Figure 8 | Optimal values of CVaR under different α and λ levels.
system; the selection of a suitable alternative among the obtained interval solutions under different α and λ values is of significant complexity and becomes an extra burden for water resources managers. It is also possible that fuzzy logic could be used instead of λ values to deal with uncertainties in many real-world optimization problems, due to the inherent ambiguity of the fuzzy subsets. Further studies are desired to mitigate these limitations.
ACKNOWLEDGEMENTS
This research was supported by the Fundamental Research Funds for the Central Universities (13XS20), the Major Project Program of the Natural Sciences Foundation (51190095), and the Program for Innovative Research Team in University (IRT1127). The authors are extremely grateful to the editor and the anonymous reviewers for their insightful comments and suggestions.
REFERENCES
Ahmed, S. Mean-risk objectives in stochastic programming. Technical Report, Georgia Institute of Technology. E-print available at 2004, http://www.optimization-online.org. Ahmed, S., Tawarmalani, M. & Sahinidis, N. V. A finite branch-and-bound algorithm for two-stage stochastic integer programs. Math. Program. A 100, 355–377. Birbil, S. I., Frenk, J., Kaynar, B. & Noyan, N. The VaR implementation handbook. In: Risk Measures and Their Applications in Asset Management (G. N. Gregoriou, ed.). McGraw-Hill, New York, pp. 311–337. Birge, J. R. Decomposition and partitioning methods for multistage stochastic linear programs. Oper. Res. 33, 989–1007.
Birge, J. R. & Louveaux, F. V. Introduction to Stochastic Programming. Springer, New York. Cetinkaya, C. P., Fistikoglu, O., Fedra, K. & Harmancioglu, N. B. Optimization methods applied for sustainable management of water-scarce basins. J. Hydroinformat. 10, 69–95. Chang, N. B., Wen, C. G., Chen, Y. L. & Yong, Y. C. A grey fuzzy multiobjective programming approach for the optimal planning of a reservoir watershed, Part A: theoretical development. Water Res. 30, 2329–2340. Chaves, P. & Kojiri, T. Deriving reservoir operational strategies considering water quantity and quality objectives by stochastic fuzzy neural networks. Adv. Water Resour. 30, 1329–1341. Conejo, A. J., García-Bertrand, R., Carrión, M., Caballero, A. & Andrés, A. Optimal involvement in futures markets of a power producer. IEEE Trans. Power Syst. 23, 701–711. Fábián, C. I. Handling CVAR objectives and constraints in twostage stochastic models. Eur. J. Oper. Res. 191 (3), 888–911. Faye, R. M., Sawadogo, S., Lishoua, C. & Mora-Camino, F. Long-term fuzzy management of water resource systems. Appl. Math. Comput. 137, 459–475. Ferrero, R. W., Riviera, J. F. & Shahidehpour, S. M. A dynamic programming two-stage algorithm for long-term hydrothermal scheduling of multireservoir systems. Trans. Power Syst. 13 (4), 1534–1540. Guo, P. & Huang, G. H. Inexact fuzzy-stochastic programming for water resources management under multiple uncertainties. Environ. Model. Assess. 15 (2), 111–124. Guo, P., Huang, G. H. & Li, Y. P. An inexact fuzzy-chanceconstrained two-stage mixed-integer linear programming approach for flood diversion planning under multiple uncertainties. Adv. Water Resour. 33 (1), 81–91. Huang, G. H. IPWM: an interval-parameter water quality management model. Eng. Optim. 26, 79–103. Huang, G. H. A hybrid inexact-stochastic water management model. Eur. J. Operat. Res. 107, 137–158. Huang, G. H., Baetz, B. W. & Patry, G. G. 
An interval linear programming approach for municipal solid waste management planning under uncertainty. Civil Eng. Environ. Syst. 9, 319–335. Huang, Y., Chen, X., Li, Y. P., Willems, P. & Liu, T. Integrated modeling system for water resources management of Tarim River Basin. Environ. Eng. Sci. 27 (3), 255–269. Huang, Y., Li, Y. P., Chen, X. & Ma, Y. G. Optimization of the irrigation water resources for agricultural sustainability in Tarim River Basin, China. Agric. Water Manage. 107, 74–85. Huang, G. H. & Loucks, D. P. An inexact two-stage stochastic programming model for water resources management under uncertainty. Civil Eng. Environ. Syst. 17, 95–118. Jairaj, P. G. & Vedula, S. Multi-reservoir system optimization using fuzzy mathematical programming. Water Resour. Manage. 14, 457–472. Kall, P. & Mayer, J. Stochastic Linear Programming: Models, Theory, and Computation. International Series in Operations Research and Management Science. Springer, New York.
Klein Haneveld, W. K. & Van der Vlerk, M. H. Integrated chance constraints: Reduced forms and an algorithm. Comput. Manage. Sci. 3, 245–269. Li, Y. P. & Huang, G. H. Interval-parameter two-stage stochastic nonlinear programming for water resources management under uncertainty. Water Resour. Manage. 22, 681–698. Li, Y. P., Huang, G. H. & Nie, S. L. An interval-parameter multi-stage stochastic programming model for water resources management under uncertainty. Adv. Water Resour. 29, 776–789. Li, Y. P., Huang, G. H. & Nie, S. L. Mixed interval-fuzzy twostage integer programming and its application to flooddiversion planning. Eng. Optim. 39 (2), 163–183. Li, Y. P., Huang, G. H., Nie, S. L. & Chen, X. A robust modeling approach for regional water management under multiple uncertainties. Agr. Water Manage. 98, 1577–1588. Li, Y. P., Huang, G. H., Wang, G. Q. & Huang, Y. F. FSWM: a hybrid fuzzy-stochastic water-management model for agricultural sustainability under uncertainty. Agric. Water Manage. 12 (96), 1807–1818. Liu, Y., Cai, Y. P., Huang, G. H. & Dong, C. Intervalparameter chance-constrained fuzzy multi-objective programming for water pollution control with sustainable wetland management. Procedia Environ. Sci. 13, 2316–2335. Liu, C., Fan, Y. & Ordóň ez, F. A two-stage stochastic programming model for transportation network protection. Comput. Oper. Res. 36, 1582–1590. Loucks, D. P., Stedinger, J. R. & Haith, D. A. Water Resource Systems Planning and Analysis. Prentice-Hall, Englewood Cliffs, NJ. Lu, H. W., Huang, G. H. & He, L. Development of an interval-valued fuzzy linear-programming method based on infinite α-cuts for water resources management. Environ. Model. Softw. 25, 354–361. Luo, B., Maqsood, I., Yin, Y. Y., Huang, G. H. & Cohen, S. J. Adaptation to climate change through water trading under uncertainty – an inexact two-stage nonlinear programming approach. J. Environ. Inf. 2 (2), 58–68. Lv, Y., Huang, G. H., Li, Y. P. & Sun, W. 
Managing water resources system in a mixed inexact environment using superiority and inferiority measures. Stoch. Env. Res. Risk A 26 (5), 681–693. Maqsood, I., Huang, G. H. & Yeomans, J. S. An interval-parameter fuzzy two-stage stochastic program for water resources management under uncertainty. Eur. J. Oper. Res. 167 (1), 208–225. McIntyre, N., Wagener, T., Wheater, H. S. & Siyu, Z. Uncertainty and risk in water quality modelling and management. J. Hydroinformat. 5 (4), 259–274. Ogryczak, W. & Ruszczyński, A. Dual stochastic dominance and related mean-risks models. SIAM J. Optim. 13 (2), 60–78. Piantadosi, J., Metcalfe, A. V. & Howlett, P. G. Stochastic dynamic programming (SDP) with a conditional value-at-risk (CVaR) criterion for management of storm-water. J. Hydrol. 348 (3–4), 320–329.
Pousinho, H. M. I., Mendes, V. M. F. & Catalão, J. P. S. A risk-averse optimization model for trading wind energy in a market environment under uncertainty. Energy 36, 4935–4942. Rockafellar, R. & Uryasev, S. Optimization of conditional value at risk. J. Risk 2 (3), 21–41. Rockafellar, R. & Uryasev, S. Conditional value-at-risk for general loss distributions. J. Bank. Financ. 26, 1443–1471. Russell, S. O. & Campbell, P. F. Reservoir operating rules with fuzzy programming. ASCE J. Water Resour. Plan. Manage. 122, 165–170. Schultz, R. & Neise, F. Algorithms for mean-risk stochastic integer programs in energy. Rev. Invest. Oper. 28, 4–16. Schultz, R. & Tiedemann, S. Conditional value-at-risk in stochastic programs with mixed-integer recourse. Math. Program. 105 (2), 365–386. Seifi, A. & Hipel, K. W. Interior-point method for reservoir operation with stochastic inflows. J. Water Resour. Plan. Manage. 127 (1), 48–57. Shao, L. G., Qin, X. S. & Xu, Y. A conditional value-at-risk based inexact water allocation model. Water Resour. Manage. 25, 2125–2145. Simonovic, S. A new method for spatial and temporal analysis of risk in water resources management. J. Hydroinformat. 11 (3–4), 320–329. Slowinski, R., Urbaniak, A. & Weglarz, J. Probabilistic and fuzzy approaches to capacity expansion planning of a water supply system. In: Systems Analysis Applied to Water and
Related Land Resources (L. Valadares Tavares & J. Evaristo da Silva, eds). Pergamon, Oxford, pp. 93–98. Teegavarapu, R. & Elshorbagy, A. Fuzzy set based error measure for hydrologic model evaluation. J. Hydroinformatic. 7 (3), 199–208. Tran, L. D., Schilizzi, S., Chalak, M. & Kingwell, R. Optimizing competitive uses of water for irrigation and fisheries. Agric. Water Manage. 101, 42–51. Victoria, F. B., Viegas Filho, J. S., Pereira, L. S., Teixeira, J. L. & Lanna, A. E. Multi-scale modelling for water resources planning and management in rural basins. Agricult. Water Manage. 77, 4–20. Wagner, J. M., Shamir, U. & Marks, D. H. Containing groundwater contamination: planning models using stochastic programming with recourse. Eur. J. Oper. Res. 7, 1–26. Wang, X. H. & Du, C. M. An internet based flood warning system. J. Environ. Inf. 2, 48–56. Wang, S. & Huang, G. H. Interactive two-stage stochastic fuzzy programming for water resources management. J. Environ. Manage. 92 (8), 1986–1995. Xu, Y. & Qin, X. S. Rural effluent control under uncertainty: An inexact double-sided fuzzy chance-constrained model. Adv. Water Resour. 33 (9), 997–1014. Zhang, X. H., Zhang, H. W., Chen, B., Guo, H. C., Chen, G. Q. & Zhao, B.A. An inexact-stochastic dual water supply programming model. Commun. Nonlinear Sci. 14, 301–309.
First received 12 December 2012; accepted in revised form 13 June 2013. Available online 17 July 2013
© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014
Hybrid metaheuristics for multi-objective design of water distribution systems
Qi Wang, Dragan A. Savić and Zoran Kapelan
ABSTRACT
Multi-objective design of Water Distribution Systems (WDSs) has received considerable attention in the past. Multi-objective evolutionary algorithms (MOEAs) are popular in tackling this problem due to their ability to approach the true Pareto-optimal front (PF) in a single run. Recently, several hybrid metaheuristics based on MOEAs have been proposed and validated on test problems. Among these algorithms, AMALGAM and MOHO are two noteworthy representatives which mix their constituent algorithms in contrasting fashion. In this paper, they are employed to solve a wide range of benchmark design problems against another state-of-the-art algorithm, namely NSGA-II. The design task is formulated as a bi-objective optimisation problem taking cost and network resilience into account. The performance of three algorithms is assessed via normalised hypervolume indicator. The results demonstrate that AMALGAM is superior to MOHO and NSGA-II in terms of convergence and diversity on the networks of small-to-medium size; however, for larger networks, the performance of hybrid algorithms deteriorates as they lose their adaptive capabilities. Future improvement and/or redesign on hybrid algorithms should not only adopt the strategies of adaptive portfolios of subalgorithms and global information sharing, but also prevent the deterioration mainly caused by imbalance of constituent algorithms.
Key words | hybrid metaheuristics, hypervolume, multi-objective design, resilience, water distribution system
Qi Wang (corresponding author), Dragan A. Savić, Zoran Kapelan, Centre for Water Systems, College of Engineering, Mathematics and Physical Sciences, University of Exeter, North Park Road, Exeter EX4 4QF, United Kingdom. E-mail: qw212@exeter.ac.uk; zero3315263@gmail.com
doi: 10.2166/hydro.2013.009
INTRODUCTION
The design of Water Distribution Systems (WDSs) by multi-objective evolutionary algorithms (MOEAs) has attracted considerable attention during recent years (Keedwell & Khu ; Prasad & Park ; Khu & Keedwell ; Farmani et al. ; Prasad & Tanyimboh ; Fu et al. a). The primary goal of the MOEA is to generate a trade-off between the total cost and system benefits, while meeting consumer demands and other system constraints (e.g. pressure, velocity, etc.). As combinatorial optimisation problems with Non-deterministic Polynomial-time hard (NP-hard) feature (Papadimitriou & Steiglitz ), it is challenging to tackle the design of a real-world WDS as it often incurs expensive computational efforts, especially when extended period simulations are required for objective evaluations (Keedwell & Khu ). MOEAs are suitable and popular for this task due to their ability to approach the true Pareto-optimal front (PF) in a single run (Zitzler & Thiele ; Farmani et al. a).
Farmani et al. (a) compared the performance of three commonly used MOEAs, i.e. Non-dominated Sorting Genetic Algorithm II (NSGA-II), Strength Pareto Evolutionary Algorithm 2 (SPEA2) as well as Multi-Objective Genetic Algorithm (MOGA), on multi-objective design of a WDS applying them to two benchmark networks, as well as a large real-life network. They concluded that SPEA2 (Zitzler et al. ) outperformed other techniques in satisfying both goals of multi-objective optimisation, i.e. closeness to the true PF and diversity among the non-dominated solutions, especially on a large network. Subsequently, Farmani et al. (b) used NSGA-II to solve an expanded rehabilitation
problem of the Anytown network (Walski et al. ) as a realistic benchmark taking cost and resilience index (Todini ) into account.
In order to yield acceptable near optimal solutions and reduce the overall number of hydraulic evaluations, Keedwell & Khu () investigated the possibility of combining NSGA-II with a neighbour search to solve the multi-objective design of the New York tunnels network. Results showed an encouraging improvement of the hybrid algorithm given a budget of model simulations. Later on, they tried to combine a novel cellular automaton-based initialisation technique with a Genetic Algorithm (GA) to solve the least cost design of a WDS (Keedwell & Khu ). The applications to two large networks from industry highlighted the benefits of using this approach to discover better results in a fixed time span.
Besides integrating a local search strategy with current MOEAs, Raad et al. () applied a hybrid metaheuristic algorithm, called a multi-algorithm, genetically adaptive multi-objective method (AMALGAM) proposed by Vrugt & Robinson (), for the first time to address the optimal design of a WDS considering the total cost and network resilience (Prasad & Park ). Instead of using the original sub-algorithms, they employed a greedy design heuristic, two variants of NSGA-II and discrete particle swarm optimisation (PSO) because of their tendency to succeed in a discrete multi-objective optimisation setting. The results obtained from three benchmark models as well as a real WDS in South Africa proved the strength of the AMALGAM-type algorithm as a faster, more reliable tool for multi-objective design of a WDS.
Wolpert & Macready () presented a number of ‘no free lunch’ theorems and demonstrated the danger of analysing algorithms by their performance on a small set of cases. Most of the previous work tests several MOEAs (often built on different concepts) on quite a few benchmark and/or real-world WDS design problems, therefore, the conclusions might be biased since it is impossible for a specific optimisation algorithm to be effective on a wide range of problems. Hybrid algorithms arise with an attempt to overcome this difficulty by combining the power of different methods. However, many such schemes proposed for WDS design often require the parameters to be fine-tuned, hence the lack of adaptability, robustness and popularity.
In this paper, we applied two recently-proposed hybrid algorithms, i.e. AMALGAM and Multi-Objective Hybrid Optimisation (MOHO), to solve the multi-objective design of a WDS. More specifically, we tested the strength of two different hybrid schemes (Talbi ), namely high-level teamwork hybrid (HTH) and high-level relay hybrid (HRH), by conducting the bi-objective optimal design on a wide range of benchmark models collected from the literature, including the Anytown network which is regarded as one of the challenging benchmarks receiving less attention in the past (Prasad & Tanyimboh ). The problem was formulated to minimise the total cost and to maximise the network resilience, as defined by Prasad & Park (). In order to compare the performance of hybrid algorithms with state-of-the-art MOEAs in the domain, we used NSGA-II to solve the aforementioned problem as well. In addition, with an attempt to clearly evaluate the performance of each algorithm, we employed a well-established indicator, i.e. hypervolume (Deb ), to assess the quality of final solutions. Multiple independent optimisation runs were carried out on each problem, which served to generate unbiased evaluation based on statistics. The main contributions of this paper are the investigation of the capability of hybrid metaheuristics to perform multi-objective design of a WDS and comparison of their performance with that of modern MOEAs by extensive testing. Therefore, this work aims to uncover the reasons for success and/or failure of the two algorithms, and in turn, to establish how the hybrid algorithms could benefit from further improvements.
The remainder of this paper is organised as follows: first, multi-objective design of a WDS is briefly introduced followed by the mechanisms of AMALGAM and MOHO in more detail. Then, the benchmark problems used in this paper are summarised and the performance metric is given. After comparing the results obtained from each algorithm, conclusions are drawn at the end.
MULTI-OBJECTIVE DESIGN OF WDSs
The design of a WDS always involves optimising multiple and usually conflicting objectives at the same time, such as, total cost, system reliability and water quality. The goal of multi-objective design of a WDS is to get as close as
possible to the true trade-off between cost and benefit, which offers a range of alternatives for the decision making process. A typical WDS design problem consists of providing cost-effective specification of various components, i.e. pipes, pumps, valves, tanks, etc., within the network given the system layout. In a more narrow sense, various investigators considered the design task to be the specification of the best combination of pipe sizes from within a discrete range of commercial diameters that meets the water demand and other system requirements. Herein, we focus on this narrow definition of the problem using bi-objective optimisation to minimise the total capital cost and maximise the performance benefits of the network. The value of the latter objective is calculated based on hydraulic simulation through the EPANET2.0 package (Rossman ).
A series of indicators (Todini ; Prasad & Park ; Prasad & Tanyimboh ) have been proposed in the literature as a surrogate of performance benefit giving preference to a ‘looped network’. Recently, the resilience index (Todini ) has gained more attention due to its ability to account for failure conditions in a risk type measure. It is defined based on the concept that the total input power into a network consists of the power dissipated in the network and the power delivered at demand nodes. In response to Todini’s measure, Prasad & Park () developed the network resilience metric by taking the uniformity of pipes connected to a certain node into account. The advantage of the latter is that it explicitly rewards redundancy of similarly sized pipes as improving the reliability of network under pipe failure scenarios (Raad et al. ). A new approach was recently proposed to provide flexibility to the design of water supply (Zhang & Babovic ) by considering innovative Real Options technology. However, this approach deals with the design of water systems under uncertainty which is not considered here.
Given the above and the fact that this paper focuses on the comparison of hybrid metaheuristics for the WDS design, the optimisation methodology presented here is based on the conventional WDS design driven by the trade-off between the WDS design cost and performance, the latter being evaluated by using the network resilience metric.
HYBRID METAHEURISTICS
Unlike the self-contained algorithms, hybrid metaheuristics combine two or more different mechanisms (usually built on population-based evolutionary algorithms) to facilitate the efficiency of the search towards the global optima. In an attempt to classify hybrid algorithms using common terminology, Talbi () presented a taxonomy mechanism for current hybrid metaheuristics in a qualitative way considering both design and implementation issues. The taxonomy combined a hierarchical classification scheme with a flat classification scheme to provide a clear and structural framework for comparative purposes. Here, we mainly focus on the design issues of hybrid algorithms.
At the first level of the hierarchical classification, low-level and high-level hybridisations can be distinguished. This is done by ascertaining whether the component metaheuristics are embedded or self-contained. In the low-level hybrid class, a certain functional part of an algorithm is substituted with another algorithm. In the high-level hybrid class, by contrast, each algorithm works on its own without depending on other metaheuristics. At the second level of the hierarchical classification, each class (low-level or high-level hybrid) is further divided into relay and teamwork classes according to the working fashion, i.e. optimising a problem in turn or cooperatively. Therefore, four general types of algorithms are derived from the hierarchical taxonomy, i.e. Low-level Relay Hybrid, Low-level Teamwork Hybrid (LTH), HRH and HTH. According to the flat classification, all abovementioned hybridisation classes can be categorised into homogeneous/heterogeneous, global/partial and specialist/general schemes. In homogeneous hybrids, all the constituent algorithms use the same metaheuristic. In heterogeneous hybrids, different metaheuristics are employed. The hybrid schemes can also be viewed as global or partial hybrids depending on whether the whole search space will be the same for all the sub-algorithms or decomposed into sub-areas (one for each sub-algorithm). From the perspective of function of metaheuristics, specialist hybrids can be distinguished from general hybrids as they combine sub-algorithms which aim to solve different problems from the others.
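The two-level hierarchy and the three flat attributes described above can be held in a small data structure. The Python sketch below is illustrative only: the names `Level`, `Mode` and `HybridScheme` are hypothetical and do not come from Talbi's taxonomy or from any library. The two example classifications (AMALGAM as HTH, MOHO as HRH, both heterogeneous, global and general) are the ones stated in this paper.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    LOW = "L"   # one metaheuristic is embedded inside another
    HIGH = "H"  # each metaheuristic is self-contained


class Mode(Enum):
    RELAY = "R"     # sub-algorithms optimise the problem in turn
    TEAMWORK = "T"  # sub-algorithms optimise the problem cooperatively


@dataclass(frozen=True)
class HybridScheme:
    level: Level
    mode: Mode
    heterogeneous: bool  # False means homogeneous
    global_search: bool  # False means partial (decomposed search space)
    general: bool        # False means specialist

    @property
    def code(self) -> str:
        """Three-letter hierarchical label, e.g. 'HTH' or 'HRH'."""
        return f"{self.level.value}{self.mode.value}H"


# Classifications as stated in the text (assumed here to be exhaustive).
amalgam = HybridScheme(Level.HIGH, Mode.TEAMWORK, True, True, True)
moho = HybridScheme(Level.HIGH, Mode.RELAY, True, True, True)
print(amalgam.code, moho.code)
```

Keeping the hierarchical label as a derived property, rather than a stored string, means the four hierarchical classes cannot drift out of step with the level and mode fields.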
Van Zyl et al. () proposed an LTH algorithm, which incorporated a hill-climber strategy with a GA method, to solve operational optimisation of a WDS. They concluded that the hybrid algorithm outperformed pure GA by finding good solutions quickly. They also showed that a local search method complemented GA by efficiently finding local optima.
Cisty () combined a GA with Linear Programming (LP) as an LTH to solve three least-cost design problems of a WDS. This method employed a GA to decompose looped network configurations into a group of branched networks. LP was then applied to optimise the branched networks as it was more reliable than heuristic methods in finding the global optimum. The results demonstrated the hybrid’s superiority in consistently generating better solutions when compared to GA and Harmony Search.
Tolson et al. () extended dynamically dimensioned search (DDS), which is a continuous global optimization algorithm (Tolson & Shoemaker ) and developed an LTH (called hybrid discrete DDS, HD-DDS) by introducing two local search strategies. These local search heuristics involved one-pipe change and two-pipe change local moves in the process of solving a discrete, single-objective, constrained WDS design problem. The main advantages of the algorithm were that it does not require fine-tuning of a number of parameters and that it is computationally efficient when compared to GA or PSO. The results obtained (especially on a large network) revealed that it outperformed the state-of-the-art existing algorithms in terms of searching ability and computational efficiency.
As most low-level hybrid schemes commonly combine various local search strategies or a mechanism different from population-based techniques into the structure of evolutionary algorithms, they turn out to be tailored to cope with specific problems. This is most often done by experimenting with a rule that determines when to switch from one algorithm to another. However, this makes such a hybrid less flexible as it would generally fail to adapt to other applications. On the other hand, few of these low-level hybrid algorithms are designed for multi-objective optimisation except Creaco & Franchini (). Since the main concern of this paper is about multi-objective design of a WDS, herein, we focus on the comparison of two different high-level hybrid schemes, i.e. HTH and HRH, which employ the population-based evolutionary algorithms in contrasting fashions. In particular, two instances of high-level hybrid scheme are analysed by solving the bi-objective design problems using 12 WDS benchmark networks collected from the literature.
Instance of HTH: AMALGAM
Vrugt & Robinson () proposed a multi-algorithm, genetically adaptive multi-objective method, known as AMALGAM. This can be classified as an HTH, heterogeneous, global, general framework. It simultaneously employs four sub-algorithms within the framework, including NSGA-II, PSO, adaptive metropolis search (AMS) and differential evolution (DE). The main aim of the developed algorithm was to overcome the drawbacks, as well as possible failure of an individual algorithm on a specific problem. The new concepts of multi-method search and genetically adaptive offspring creation are developed to ensure a fast, reliable and computationally efficient algorithm for multi-objective optimisation. Results on a set of well-known multi-objective test functions suggest that this hybrid method achieved a tenfold improvement in convergence metric (Deb et al. ) over NSGA-II for the more complex, higher dimensional problems. Besides its extraordinary performance, AMALGAM provides a general template which is flexible and extensible, and could easily accommodate any other population-based algorithms. A sequential version of AMALGAM code was requested from Vrugt for this work. The pseudocode of AMALGAM is illustrated in Figure 1.
The parameter settings of three population-based sub-algorithms within AMALGAM are summarised in Table 1. Besides using GA, PSO and DE, AMALGAM also includes AMS as a Markov Chain Monte Carlo (MCMC) sampler that proactively avoids the search being trapped in local optima. The algorithm works by substituting the parents with offspring of lower fitness (Haario et al. ). This sampler also shows superior efficiency in exploring the search space of high-dimensionality. Therefore, AMS is capable of rapidly travelling across the entire Pareto distribution when the optimisation process progresses towards the PF. Readers are referred to the supporting information of (Vrugt & Robinson ) for more details.
Figure 1 | Pseudocode of AMALGAM.
Table 1 | Setting of parameters in AMALGAM
| GA | Value | PSO | Value | DE | Value |
|---|---|---|---|---|---|
| Crossover rate | 0.9 | Inertia factor | 0.5 + 0.5u(0,1) | Scaling factor 1 | u(0.6,1.0) |
| Mutation rate | 1/L | Cognitive weight | 1.5 | Scaling factor 2 | u(0.2,0.6) |
| Distribution index for crossover | 20 | Social weight | 1.5 | – | – |
| Distribution index for mutation | 20 | Turbulence factor | u(−1,1) | – | – |
Note: L is the number of decision variables; u(a,b) is a uniform random number between a and b.
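The genetically adaptive offspring creation described above can be illustrated with a minimal sketch. It assumes that each sub-algorithm receives offspring slots in proportion to the number of its offspring that survived into the new population, subject to a minimum share; the proportional rule, the function name `allocate_offspring` and the example counts are illustrative assumptions, not the AMALGAM code obtained from its author. The floor of 5 and the population of 100 follow the settings reported later in the paper.

```python
def allocate_offspring(survivors, pop_size=100, floor=5):
    """Split pop_size offspring among sub-algorithms (hypothetical rule).

    survivors: dict mapping sub-algorithm name to the number of its
    offspring that survived into the current population.
    Every sub-algorithm keeps at least `floor` offspring; the remaining
    slots are shared in proportion to the survivor counts.
    """
    names = list(survivors)
    spare = pop_size - floor * len(names)
    if spare < 0:
        raise ValueError("floor too large for this population size")
    counts = dict(survivors)
    if sum(counts.values()) == 0:
        # No information yet: share the spare slots equally.
        counts = {name: 1 for name in names}
    total = sum(counts.values())
    alloc, remainders = {}, {}
    for name in names:
        # Integer arithmetic avoids floating-point rounding surprises.
        quotient, remainder = divmod(spare * counts[name], total)
        alloc[name] = floor + quotient
        remainders[name] = remainder
    # Hand out slots lost to integer division, largest remainder first.
    leftover = pop_size - sum(alloc.values())
    for name in sorted(names, key=remainders.get, reverse=True)[:leftover]:
        alloc[name] += 1
    return alloc


# Example with made-up survivor counts for the four sub-algorithms.
shares = allocate_offspring({"GA": 60, "PSO": 0, "AMS": 10, "DE": 30})
```

With four sub-algorithms, a population of 100 and a floor of 5, any single sub-algorithm can receive between 5 and 85 offspring under this rule.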
Instance of HRH: MOHO
Moral & Dulikravich () focused on another hybrid scheme following the concept of Pareto-dominance. They presented an MOHO algorithm as a HRH, heterogeneous, global, general metaheuristic which implements three sub-algorithms in a sequential manner. The MOHO hybrid coordinates SPEA2, Multi-Objective Particle Swarm Optimisation (MOPSO) and Non-dominated Sorting Differential Evolution (NSDE) and decides which one of them will generate offspring using the automatic switching procedure. More specifically, MOHO proceeds by choosing one of them for producing the next generation based on the performance of the currently employed algorithm. Five different indicators for measuring improvements on finding non-dominated solutions, including the quality of approximation and distribution, are used to decide whether to continue with a particular algorithm or change to another one. In this paper, we are not able to implement the original MOHO software; instead, we tried to recreate it following the primary idea in a Matlab environment. Figure 2 shows the pseudocode of MOHO.
MOHO evaluates the performance of its sub-algorithms on five distinct improvements: (1) changes in the size of non-dominated set; (2) whether there exists a solution from the new generation which dominates any members in the last generation; (3) changes in the hypervolume indicator; (4) changes in average Euclidean distance; (5) increase in the spread indicator. The innovative part of this evaluation strategy is that MOHO considers not only the quality of the non-dominated set in the next generation (i.e. in terms of convergence and diversity), but also takes into account the perturbation introduced by the potential solutions which may bring substantial improvement in later iterations. The main differences between the original MOHO and the one reported here are twofold. First, the initial population is generated using uniformly distributed random sampling rather than Sobol’s quasi-random sequence generator (Bratley & Fox ) as the advantage of this method vanishes for higher dimensional problems (Rahnamayan et al. ).
Figure 2 | Pseudocode of MOHO.
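The automatic switching idea can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how many of the five improvements are needed to keep the current sub-algorithm, nor in which order the alternatives are tried, so the sketch assumes that a single improvement suffices and that switching is round-robin. The function name `next_algorithm` and the indicator keys are hypothetical.

```python
def next_algorithm(current, improvements, consecutive_runs, max_consecutive,
                   order=("SPEA2", "MOPSO", "NSDE")):
    """Pick the sub-algorithm that creates the next generation.

    improvements: dict of five booleans, one per improvement check
    (non-dominated set size, dominance over the last generation,
    hypervolume, average Euclidean distance, spread).
    consecutive_runs: generations the current sub-algorithm has run in a row.
    max_consecutive: cap on consecutive generations for one sub-algorithm.
    """
    improved = any(improvements.values())  # assumption: one improvement is enough
    if improved and consecutive_runs < max_consecutive:
        return current
    # Assumption: otherwise hand over to the next sub-algorithm in a fixed cycle.
    position = order.index(current)
    return order[(position + 1) % len(order)]


checks = {
    "nondominated_set_size": False,
    "dominates_last_generation": True,
    "hypervolume": False,
    "euclidean_distance": False,
    "spread": False,
}
chosen = next_algorithm("SPEA2", checks, consecutive_runs=2, max_consecutive=5)
```

Here `chosen` stays with SPEA2 because one check reports an improvement and the cap has not been reached; with all checks false, or once the cap is hit, the sketch moves on to MOPSO.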
Secondly, since the parameter settings of each sub-algorithm were not clearly stated in the original MOHO, we configure these values by trial-and-error method on some difficult test functions and choose the best combination based on the experimental results. Additionally, the maximum number of consecutive iterations of a certain sub-algorithm is set to 1/50 of total generations.
Further details about the two hybrid algorithms and their performance can be found in the original authors’ papers (Vrugt & Robinson ; Moral & Dulikravich ). Apart from using hybrid algorithms with distinct schemes, we also applied NSGA-II to solve the benchmark problems for the purpose of comparison of the quality of final solutions. For more details about NSGA-II, the readers are referred to Deb et al. (). The latest version of NSGA-II (revision 1.1.6) was downloaded from the website of Kanpur Genetic Algorithms laboratory (http://www.iitk.ac.in/kangal/codes.shtml).
CASE STUDIES
Benchmark problems
To compare the performance of AMALGAM and MOHO against NSGA-II, 12 WDS networks were collected from the literature to serve as benchmarks for optimisation tests. The number of pipes in these models ranges from eight to 454, which, together with various design criteria, provide a wide range of problems and search spaces which can give consistent and definite evaluation of both, with the number of candidate solutions ranging between 10^7 and 10^454. The name of benchmark models, number of pipes, diameter options and relevant design criteria are summarised in Table 2.
It is worth mentioning that four benchmark models, BLA, FOS, PES and MOD, adopted from Bragalli et al. () are more realistic compared with others (except ANT) as they all take a reasonable range of pressure head (not only minimum pressure requirement) as well as the upper bound on flow velocity in the network. Although ANT was introduced as a hypothetical network, it contains most common features (multiple loading conditions, pipe duplication or reconditioning (i.e. cleaning and re-lining), new pipe installation, tank location and operation as well as pump scheduling) found in many real systems. For a detailed description of design criteria on each model, interested readers are referred to Dong et al. (), Raad () as well as via http://centres.exeter.ac.uk/cws.
Performance indicator
It should be emphasised here that there is no ideal indicator of the convergence and diversity of multi-objective optimisation. Among the various metrics which are designed to measure the achievement of MOEAs, it is established that hypervolume (HV) is a single metric which can assess the performance of both aspects in a combined sense (Deb ). In order to remove the bias caused by the magnitude
Table 2 | Benchmark models used for comparison of each algorithm
| No. | Model | Pipe Count | Option Count | Criteria: Min Head | Criteria: Max Head | Criteria: Max Velocity | Criteria: Multiple Loading Condition |
|---|---|---|---|---|---|---|---|
| 1 | Two-loop Network (TLN) | 8 | 14 | Yes | No | No | No |
| 2 | BakRyan Network (BAK) | 9 | 11 | Yes | No | No | No |
| 3 | New York Tunnel Network (NYT) | 21 | 16 | Yes | No | No | No |
| 4 | Blacksburg Network (BLA) | 23 | 15 | Yes | Yes | Yes | No |
| 5 | GoYang Network (GOY) | 30 | 8 | Yes | No | No | No |
| 6 | Hanoi Network (HAN) | 34 | 6 | Yes | No | No | No |
| 7 | Fossolo Network (FOS) | 58 | 22 | Yes | Yes | Yes | No |
| 8 | Pescara Network (PES) | 99 | 13 | Yes | Yes | Yes | No |
| 9 | Modena Network (MOD) | 317 | 13 | Yes | Yes | Yes | No |
| 10 | Balerma Irrigation Network (BIN) | 454 | 10 | Yes | No | No | No |
| 11 | Two Reservoir Network (TRN) | 8 | 8 | Yes | No | No | Yes |
| 12 | Anytown Network (ANT) | 43 | 10 | Yes | No | No | Yes |
Note: For TRN network, three of eight pipes are existing pipes which have three options including ‘do nothing’, cleaning or duplication; for ANT network, although there are only 43 pipes to be considered, its formulation contains up to 112 decision variables, which makes it the most challenging problem in the list.
of different objective functions, we take the normalised version of HV, called the ratio of the HV of approximation set and of true Pareto-optimal front (HVR) (Deb ), to evaluate the quality of final solutions obtained from each algorithm. The expressions of HV and HVR are shown as Equations (1) and (2), respectively:
$$\mathrm{HV} = \mathrm{volume}\left(\bigcup_{i=1}^{|Q|} v_i\right) \qquad (1)$$
$$\mathrm{HVR} = \frac{\mathrm{HV}(Q)}{\mathrm{HV}(P^{*})} \qquad (2)$$
where v_i is the hypercube constructed with a reference point (normally a vector of worst objective values) and the solution i as the diagonal corners; Q is the set of non-dominated solutions obtained by an algorithm and P* is the set of solutions in the true PF.
Since we do not have a theoretical true PF for each benchmark problem, to assist the evaluation of performance, a quasi-true Pareto-Optimal front (quasi-PF) was generated for each problem. This was achieved by applying a non-dominated sorting procedure to the aggregated Pareto fronts obtained by all three algorithms through multiple runs.
RESULTS AND DISCUSSION
Table 3 | Configuration of computational budget
| Population Size | Generation | Pipe No. | Problems |
|---|---|---|---|
| 100 | 250 | 50 | TLN, BAK, NYT, BLA, GOY, HAN, TRN |
| 100 | 500 | 100 | FOS, PES |
| 100 | 1000 | 500 | MOD, BIN |
| 100 | 5000 | N/A | ANT |
Note: The numbers of population size and generation (except ANT) are decided based on trial runs in order to ensure the convergence of NSGA-II given the specified computational budgets. The number of generation on the ANT problem follows the same setting chosen by Farmani et al. (2005b).
The benchmark networks adopted in this paper encompass a wide range of network sizes, with up to several hundreds of pipes. Hence, various computational budgets (Table 3) were tested to make sure each algorithm converged well before their performance could be compared. It is worth noting that these budgets (i.e. population size and number of generations for each benchmark problem) are kept the same for all three algorithms. As such, the number of function evaluations via EPANET2.0 (Rossman ) varied from 25,000 to 500,000. Because each algorithm produces a first generation in a different way, multiple runs are implemented to eliminate the
influence of the initial population, and the statistical results of HVR are used to assess their performance. Thirty independent runs were carried out for all cases except ANT, which was run 10 times as it requires many more generations to ensure convergence and thus is extremely time-consuming.
Figure 3 | Statistical performances of each algorithm on each problem using HVR indicator.
Figure 3 shows the box plot of statistical performances of three algorithms on 12 benchmark problems. The top and bottom edges of the grey bar in each plot represent the maximum and minimum values of HVR for each algorithm, respectively. The intermediate short lines in dark colour denote the average values of HVR for each algorithm. Reference points are also provided in terms of cost (in million units) and network resilience values. For example, for TLN the reference point is (5.0, 0.1). The results clearly demonstrate that AMALGAM consistently outperforms MOHO and NSGA-II on the networks of small-to-medium size (Wang et al. ), i.e. TLN, BAK, NYT, GOY, FOS, PES, MOD and TRN. The performance of MOHO was comparable to that of AMALGAM and NSGA-II on smaller networks, i.e. TLN, BAK, NYT, BLA, GOY and TRN; however, it became less efficient on larger networks, i.e. HAN, FOS, PES, MOD, BIN and ANT, as the complexity of the
problem increased. Even worse, it was able to find only one feasible solution on the ANT problem in 10 runs. On smaller networks (less than 400 pipes), NSGA-II performed worse than hybrid algorithms except on the BLA and HAN problems; on the contrary, it dominated hybrid algorithms on larger networks, i.e. BIN and ANT. Admittedly, none of the algorithms converged on the ANT problem, which also implies that it was the most complex problem in the selected cases by considering many aspects simultaneously. Furthermore, it is important to emphasise that the convergence of MOHO and NSGA-II was highly dependent on initial random seeds. For instance, twice out of 10 runs, improper seeds resulted in complete failure of NSGA-II as there were no feasible solutions found in the final population. By contrast, AMALGAM, which uses Latin hypercube sampling for initialisation, successfully discovered non-dominated solutions all the time.
Another way to compare the performance of three algorithms is to illustrate their contributions to the Pareto front obtained via multiple runs on each case (see Figure 4). Herein, only four cases, namely NYT, HAN, PES, and BIN, are chosen as they exhibit different levels of complexity within the problems considered in the paper. Each figure is produced in the following manner. Firstly, the objective function values of the non-dominated solutions obtained by each algorithm (via 30 runs) are rounded to four-digit precision and the duplicate solutions are removed. Next, the quasi-PF for each case is generated using the non-dominated sorting procedure (Deb et al. ). Seven data sets are then obtained by counting the common contribution of all three algorithms (denoted as
U_{AMALGAM+MOHO+NSGA-II}), the common contribution of every two algorithms (denoted as U_{AMALGAM+MOHO}, U_{AMALGAM+NSGA-II}, U_{MOHO+NSGA-II}), and individual contributions of each algorithm (denoted as S_{AMALGAM}, S_{MOHO}, S_{NSGA-II}), which has already excluded the common ones. Finally, these data sets are plotted in Figure 4. It should be noted that these sets can be empty and therefore are not necessarily shown on the figure. Similar algorithm performance trends can be observed as discussed previously. AMALGAM was consistently superior to the others in terms of diversity by identifying the solutions in the region of high network resilience. NSGA-II outperformed hybrid algorithms in terms of convergence towards the region of low cost, especially on larger networks (i.e. HAN, PES, and BIN) while MOHO was able to find solutions in the quasi-PFs of NYT and PES, albeit completely failing on HAN and BIN problems.
Figure 4 | Pareto fronts obtained via multiple runs by AMALGAM, MOHO, and NSGA-II. (a) Case NYT, (b) Case HAN, (c) Case PES, (d) Case BIN.
To compare quantitatively the contributions of each algorithm, Table 4 summarises the percentage of solutions obtained by each algorithm in the quasi-PF set for each test case. On seven of these benchmark problems (mainly on larger networks), NSGA-II found a significant number of solutions in the quasi-PF sets. For smaller test cases, like TLN and TRN, its contribution was similar to that of AMALGAM and MOHO. It was worth noting that on the ANT problem the quasi-PF was comprised solely of the solutions obtained by NSGA-II. This highlighted the superiority of NSGA-II in terms of convergence given a fixed computational budget. Conversely, AMALGAM successfully produced more solutions on five small-to-medium size networks when compared to NSGA-II. Such performance was due to its better achievement in terms of diversity and convergence. It can also be observed that AMALGAM always found extreme points of the quasi-PF sets in the region of high network resilience, which were often neglected by NSGA-II and MOHO. Interestingly, MOHO failed to generate any members in the quasi-PF sets on HAN, MOD, BIN and ANT. Furthermore, it only found a feasible solution set once out of 10 runs on the ANT problem.
Table 4 | Percentage of contribution from each algorithm for each design problem
| Problem | AMALGAM (%) | MOHO (%) | NSGA-II (%) |
|---|---|---|---|
| TLN | 88 | 91 | 98* |
| BAK | 100* | 98 | 96 |
| NYT | 67* | 60 | 39 |
| BLA | 52* | 21 | 31 |
| GOY | 62* | 32 | 42 |
| HAN | 25 | 0 | 78* |
| FOS | 31 | 31 | 38* |
| PES | 38 | 12 | 50* |
| MOD | 56* | 0 | 44 |
| BIN | 38 | 0 | 62* |
| TRN | 98 | 95 | 99* |
| ANT | 0 | 0 | 100* |
Note: The maximum contribution to each problem is marked with an asterisk.
Figure 5 | Statistical performances of sub-algorithms within AMALGAM on four selected cases.
Figure 6 | Statistical performances of sub-algorithms within MOHO on four selected cases.
In order to investigate the reasons why hybrid algorithms failed on some cases, the evolutionary processes of each sub-algorithm within AMALGAM and MOHO on all design problems were recorded and analysed. Four cases, i.e. HAN, PES, BIN and ANT, were selected and discussed here as they represented the most difficult ones under limited computational budget levels, i.e. 250, 500, 1000 and 5000 generations, respectively. Since the bottom line of creating offspring points in AMALGAM was maintained at 5, the number of individuals provided by a specific sub-algorithm was expected to vary between 5 and 85. As shown in Figure 5, for the HAN problem, AMS outperformed
the other three sub-algorithms by generating a median value of 50 points within the 250 generations. GA worked better than DE followed by PSO which always stayed around the bottom line. However, this behaviour changes steadily from less complex (i.e. PES) to more complex (ANT) problems as GA consistently dominated other sub-algorithms. Only DE was comparable to GA on the PES and BIN problems, while PSO and AMS seldom made a contribution to the population and stayed at the minimum level most of the time. For the ANT problem, GA steadily produced most offspring. In other words, AMALGAM behaved like NSGA-II. Therefore, the failure of AMALGAM on the ANT problem could be attributed to the fact that PSO, AMS and DE were not effective and consequently wasted search resources. In three of the four selected cases and the MOD problem, MOHO completely failed to contribute any solutions in sets of the quasi-PF. In Figure 6, it can be observed that MOPSO was inefficient especially on large networks as on average it ran less than 1/20 of total iterations. Although SPEA2 was comparable with NSDE on the first three cases, it was not selected to produce a next generation on the ANT problem. This resulted in MOHO working similarly to NSGA-II while wasting nearly 20% of iterations to explore the search space. Another explanation for MOHO’s inefficiency is that the adaptive feature may be significantly weakened as the inefficiency of a certain constituent algorithm produces poor solutions.
CONCLUSIONS
Two hybrid algorithms, namely AMALGAM and MOHO, as well as NSGA-II were applied to a wide range of multi-objective design of WDS benchmark networks. AMALGAM employs four sub-algorithms simultaneously and adapts offspring creation genetically based on the success rate of each algorithm in producing the next population. MOHO, on the other hand, selects in sequence when to switch from one of its sub-algorithms to another by monitoring performance on five separate aspects. NSGA-II was used as a representative of state-of-the-art MOEAs for the purpose of comparison. Multiple independent runs were carried out on each test case and the HVR metric was adopted to assess their performance in terms of convergence and diversity.
The results clearly reveal that AMALGAM (HTH scheme) is superior to NSGA-II on the networks of small-to-medium size, which indicates that this achievement benefits from the strategies of adaptive multi-method search and global information sharing. On the other hand, the HTH scheme has potential to achieve better performance compared to the HRH scheme through taking full advantage of each sub-algorithm more efficiently. However, on larger networks, the behaviour of hybrid algorithms gradually deteriorated or completely failed. The underlying reason why hybrid metaheuristics perform worse on larger networks was also investigated by monitoring the evolutionary process of their sub-algorithms in detail. The failure is attributed to the loss of effectiveness in terms of proactive adaptation. Actually, it is observed that, on the ANT problem, AMALGAM performed nearly the same as NSGA-II because GA dominated other sub-algorithms completely most of the time. Admittedly, there is still a lack of theoretical analysis in the literature about the impact of problem characteristics on the performance of metaheuristics, which leaves them and the associated hybrid methods (like AMALGAM and MOHO in this paper) as black-box approaches and thus exposes them to criticism. Future work on this aspect is needed to change this situation substantially. There is also a gap between the design and application stages of hybrid schemes: a step that verifies the effectiveness and efficiency of a specific combination of different sub-algorithms from a mathematical point of view. Without this step, creating a new hybrid scheme can be misleading. Moreover, the parameterisation issue of hybrid algorithms should be carefully investigated giving consideration to different problem characteristics. In addition, with the development of both hardware and software in computer technology, the computational capacity of modern PCs has been significantly improved; hence, we suggest that any newly-developed hybrid frameworks or MOEAs should be tested on a wide range of benchmark networks as shown in this work. Furthermore, considerable attention should be focused on the networks of medium-to-large size which give sufficient consideration to the requirements of real-world cases. On the other hand, there are additional concerns other than cost and reliability (e.g. water quality issues) in real cases. The multi-objective design of a WDS may need to adapt to a many-objective (more than three objectives) design process (Fu et al. b). Thus, the future development of hybrid metaheuristics should cope with the expansion of dimensionality in both objective function space and decision variable space.
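As a supplementary illustration of the HVR indicator (Equations (1) and (2)) and the quasi-PF construction used throughout the comparison, the following minimal sketch covers the two-objective case of cost (minimised) and network resilience (maximised). The function names and the sample points are assumptions for illustration only and are not taken from the authors' software; the reference point (5.0, 0.1) is the one quoted for TLN.

```python
def non_dominated(points):
    """Return the non-dominated (cost, resilience) pairs, sorted by cost."""
    unique = set(points)
    front = []
    for cost, res in unique:
        dominated = any(
            c <= cost and r >= res and (c < cost or r > res)
            for c, r in unique
        )
        if not dominated:
            front.append((cost, res))
    return sorted(front)


def hypervolume(points, ref):
    """Area dominated by the front relative to ref = (worst cost, worst resilience)."""
    ref_cost, ref_res = ref
    front = [(c, r) for c, r in non_dominated(points)
             if c < ref_cost and r > ref_res]
    area = 0.0
    for i, (cost, res) in enumerate(front):
        # Each point covers the cost interval up to the next (costlier) point.
        next_cost = front[i + 1][0] if i + 1 < len(front) else ref_cost
        area += (next_cost - cost) * (res - ref_res)
    return area


def hvr(approximation, quasi_pf, ref):
    """Ratio of the hypervolume of an approximation set to that of the quasi-PF."""
    return hypervolume(approximation, ref) / hypervolume(quasi_pf, ref)


def contribution(front, quasi_pf):
    """Percentage of quasi-PF solutions that also appear in a given front."""
    return 100.0 * len(set(front) & set(quasi_pf)) / len(quasi_pf)


# Made-up fronts from two hypothetical algorithms, aggregated into a quasi-PF.
front_a = [(1.0, 0.2), (2.0, 0.5), (4.5, 0.7)]
front_b = [(2.0, 0.5), (4.0, 0.9)]
quasi_pf = non_dominated(front_a + front_b)
ratio_a = hvr(front_a, quasi_pf, ref=(5.0, 0.1))
ratio_b = hvr(front_b, quasi_pf, ref=(5.0, 0.1))
```

Because a solution shared by several algorithms counts towards each of them, the percentages returned by `contribution` for one problem can add up to more than 100, as they do in Table 4.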
REFERENCES
Bragalli, C., D’Ambrosio, C., Lee, J., Lodi, A. & Toth, P. Water Network Design by MINLP. IBM Research Report.
Bratley, P. & Fox, B. L. Algorithm 659: implementing Sobol’s quasirandom sequence generator. ACM Trans. Math. Softw. 14 (1), 88–100.
Cisty, M. Hybrid genetic algorithm and linear programming method for least-cost design of water distribution systems. Water Resour. Manage. 24, 1–24.
Creaco, E. & Franchini, M. Fast network multi-objective design algorithm combined with an a posteriori procedure for reliability evaluation under various operational scenarios. Urban Water J. 9 (6), 385–399.
Deb, K. Multi-Objective Optimization using Evolutionary Algorithms. John Wiley & Sons, Chichester, UK.
Deb, K., Pratap, A., Agarwal, S. & Meyarivan, T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6 (2), 182–197.
Dong, X., Liu, S., Tao, T., Li, S. & Xin, K. A comparative study of differential evolution and genetic algorithms for optimizing the design of water distribution systems. J. Zhejiang Univ-Sci A (Appl. Phys. Eng.) 13 (9), 674–686.
Farmani, R., Savic, D. A. & Walters, G. A. 2005a Evolutionary multi-objective optimization in water distribution network design. Eng. Optim. 37 (2), 167–183.
Farmani, R., Walters, G. A. & Savic, D. A. 2005b Trade-off between total cost and reliability for Anytown water distribution network. J. Water Res. Plan. Manage. 131 (3), 161–171.
Farmani, R., Walters, G. & Savic, D. Evolutionary multiobjective optimization of the design and operation of water distribution network: Total cost vs. reliability vs. water quality. J. Hydroinf. 8 (3), 165–179.
Fu, G., Kapelan, Z. & Reed, P. a Reducing the complexity of multiobjective water distribution system optimization through global sensitivity analysis. J. Water Res. Plan. Manage. 138 (3), 196–207.
Fu, G., Kapelan, Z., Kasprzyk, J. & Reed, P. b Optimal design of water distribution systems using many-objective visual analytics. J. Water Res. Plan. Manage. 10.1061/(ASCE)WR.1943-5452.0000311.
Haario, H., Saksman, E. & Tamminen, J. An adaptive metropolis algorithm. Bernoulli 7 (2), 223–242.
Keedwell, E. & Khu, S. Novel cellular automata approach to optimal water distribution network design. J. Comput. Civil Eng. 20 (1), 49–56.
Keedwell, E. C. & Khu, S. T. More Choices in Water System Design Through Hybrid Optimisation. Computing and Control for the Water Industry 2003, London, UK, pp. 257–264.
Khu, S.-T. & Keedwell, E. Introducing more choices (flexibility) in the upgrading of water distribution networks: The New York city tunnel network example. Eng. Optim. 37 (3), 291–305.
Moral, R. J. & Dulikravich, G. S. Multi-objective hybrid evolutionary optimization with automatic switching among constituent algorithms. AIAA J. 46 (3), 673–681.
Papadimitriou, C. H. & Steiglitz, K. Combinatorial Optimization: Algorithms and Complexity. Dover Publications, New York.
Prasad, T. D. & Park, N.-S. Multiobjective genetic algorithms for design of water distribution networks. J. Water Res. Plan. Manage. 130 (1), 73–82.
Prasad, T. D. & Tanyimboh, T. T. Entropy based design of “Anytown” water distribution network. In: Water Distribution Systems Analysis 2008 (J. E. Van Zyl, A. A. Ilemobade & H. E. Jacobs, eds). ASCE, Kruger National Park, South Africa, pp. 450–461.
Raad, D. N. Multi-objective Optimisation of Water Distribution Systems Design Using Metaheuristics. University of Stellenbosch, Stellenbosch.
Raad, D., Sinske, A. & Van Vuuren, J. Robust multi-objective optimization for water distribution system design using a meta-metaheuristic. Int. Trans. Oper. Res. 16 (5), 595–626.
Rahnamayan, S., Tizhoosh, H. R. & Salama, M. M. A. A novel population initialization method for accelerating evolutionary algorithms. Comput. Math. Appl. 53 (10), 1605–1614.
Rossman, L. A. EPANET 2 Users Manual. U.S. Environmental Protection Agency, Cincinnati, Ohio, USA.
Talbi, E. G. A taxonomy of hybrid metaheuristics. J. Heuristics 8, 541–564.
Todini, E. Looped water distribution networks design using a resilience index based heuristic approach. Urban Water 2 (2), 115–122.
Tolson, B. A. & Shoemaker, C. A. Dynamically dimensioned search algorithm for computationally efficient watershed model calibration. Water Resour. Res. 43, W01413.
Tolson, B. A., Asadzadeh, M., Maier, H. R. & Zecchin, A. Hybrid discrete dynamically dimensioned search (HD-DDS) algorithm for water distribution system design optimization. Water Resour. Res. 45, W12416.
Van Zyl, J. E., Savic, D. A. & Walters, G. A. Operational optimization of water distribution systems using a hybrid genetic algorithm. J. Water Resour. Plan. Manage. 130 (2), 160–170.
Vrugt, J. A. & Robinson, B. A. Improved evolutionary optimization from genetically adaptive multimethod search. Proc. Natl. Acad. Sci. USA 104 (3), 708–711.
Walski, T. M., Brill, J. E. D., Gessler, J., Goulter, I. C., Jeppson, R. M., Lansey, K., Lee, H.-L., Liebman, J. C., Mays, L., Morgan, D. R. & Ormsbee, L. Battle of the network models: Epilogue. J. Water Res. Plan. Manage. 113 (2), 191–203.
Wang, Q., Savić, D. & Kapelan, Z. Hybrid optimisation algorithms for multi-objective design of water distribution systems. 10th International Conference on Hydroinformatics, Hamburg, Germany.
Wolpert, D. H. & Macready, W. G. No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1 (1), 67–82.
Zhang, S. X. & Babovic, V. A real options approach to the design and architecture of water supply systems using innovative water technologies under uncertainty. J. Hydroinf. 14 (1), 13–29.
Zitzler, E. & Thiele, L. Multiobjective evolutionary algorithms: A comparative case study and the strength Pareto approach. IEEE Trans. Evol. Comput. 3 (4), 257–271.
Zitzler, E., Laumanns, M. & Thiele, L. SPEA2: Improving the strength pareto evolutionary algorithm for multiobjective optimization. In: Evolutionary Methods for Design, Optimisation and Control (K. Giannakoglou, D. Tsahalis, J. Periaux, K. Papailiou & T. Fogarty, eds). International Center for Numerical Methods in Engineering (CIMNE), Barcelona, Spain, pp. 95–100.
First received 17 January 2013; accepted in revised form 24 June 2013. Available online 24 July 2013
© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014
Consequence management of chemical intrusion in water distribution networks under inexact scenarios
Abbas Afshar and Ehsan Najafi
ABSTRACT
The US Environmental Protection Agency (EPA)’s Response Protocol Toolbox provides a list of recommendations on actions that may be taken to minimize the potential threats to public health following a contamination threat. This protocol comprises three steps: (1) detection of contaminant presence, (2) source identification and (3) consequence management. This paper intends to explore consequence management under source uncertainty, applying Minimize Maximum Regret (MMR) and Minimize Total Regret (MTR) approaches. An ant colony optimization algorithm is coupled with the EPANET network solver for structuring the MMR and MTR models to present a robust method for consequence management by selecting the best combination of hydrants and valves for isolation and contamination flushing out of the system. The proposed models are applied to network number 3 of EPANET to present its effectiveness and capabilities in developing effective consequence management strategies.
Key words | ant colony algorithm, consequence management, minimize maximum regret, water network contamination
Abbas Afshar, Department of Civil Engineering and EnviroHydroinformatic Center of Excellence, Iran University of Science & Technology, Tehran, Iran
Ehsan Najafi (corresponding author), Department of Civil Engineering, Iran University of Science & Technology, Tehran, Iran. E-mail: Ehs.najafi@gmail.com
NOTATION
G^{k_gb}    objective function value for the ant with the best performance within the past total iterations
L    set of options {l_ij}
α, β    parameters which control the relative importance of the pheromone trail against heuristic value
η_ij    heuristic value representing the desirability of state transition ij
ρ    coefficient of pheromone evaporation
τ_ij(t)    total pheromone deposited on path ij at iteration t
k_gb    ant with the best performance within the past total iterations
P_ij(k, t)    likelihood that ant k selects option l_ij for decision point i at iteration t
q    random variable uniformly distributed over [0, 1]
q_0    tunable parameter ∈ [0, 1]
Q    constant
R(x, s)    regret for solution x under scenario s
Z(x, s)    number of polluted consumer nodes with solution x under scenario s
Z(x*, s)    number of polluted consumer nodes with optimal solution x* under scenario s
F    number of polluted consumer nodes from the beginning of consequence management until the end of the simulation counted over all discrete time intervals
i    node index
n    total number of consumer nodes
S    set of scenarios
x    a solution composed of valves and hydrants
X    search space that is set of solutions composed of valves and hydrants
c_t    threshold value
t_CM    time of beginning the consequence management
x*    an optimal solution composed of valves and hydrants
EPS    overall simulation duration
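To make the regret notation concrete, the following minimal sketch shows how the MMR and MTR criteria named in the abstract would pick a solution from a small table of Z(x, s) values. It assumes the usual definition of regret, R(x, s) = Z(x, s) − Z(x*, s), which this section does not state explicitly, and it takes Z(x*, s) as the best value among the listed candidates. The function names and all numbers are hypothetical.

```python
def regret_table(Z):
    """Build R(x, s) = Z(x, s) - Z(x*, s) from Z[x][s] (assumed definition).

    Z[x][s] is the number of polluted consumer nodes for solution x under
    scenario s; Z(x*, s) is approximated by the best listed value for s.
    """
    scenarios = list(next(iter(Z.values())))
    best = {s: min(Z[x][s] for x in Z) for s in scenarios}
    return {x: {s: Z[x][s] - best[s] for s in scenarios} for x in Z}


def minimize_maximum_regret(R):
    """MMR: the solution whose worst-case regret over the scenarios is smallest."""
    return min(R, key=lambda x: max(R[x].values()))


def minimize_total_regret(R):
    """MTR: the solution whose regret summed over the scenarios is smallest."""
    return min(R, key=lambda x: sum(R[x].values()))


# Made-up polluted-node counts for two valve/hydrant plans and three scenarios.
Z = {
    "plan_A": {"s1": 10, "s2": 20, "s3": 42},
    "plan_B": {"s1": 15, "s2": 25, "s3": 35},
}
R = regret_table(Z)
mmr_choice = minimize_maximum_regret(R)
mtr_choice = minimize_total_regret(R)
```

In this made-up case the two criteria disagree: plan_B has the smaller worst-case regret, while plan_A has the smaller total regret.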
doi: 10.2166/hydro.2013.125
A. Afshar & E. Najafi
|
Consequence management under inexact scenarios
Journal of Hydroinformatics
|
16.1
|
2014
INTRODUCTION
Water distribution networks are one of the most important infrastructures and are highly vulnerable to deliberate contamination intrusions. Following the terrorism events of September 11, 2001 in the United States, the literature has focused more on the possibility of intentional contamination intrusions within drinking water distribution systems. The US Environmental Protection Agency (EPA)’s Response Protocol Toolbox (US EPA ) provides a list of recommendations on actions that may be taken to minimize the potential threats to public health following a contamination threat. This protocol comprises three steps: (1) detection of contaminant presence, (2) source identification and (3) consequence management.

Addressing the contaminant detection, numerous researchers during the last decade have focused on the placement of online water quality monitoring sensors to effectively detect contamination incidents in the shortest possible time to reduce potential public health and economic consequences (Ostfeld & Salomons ; Berry et al. ; Propato ). The locations of online sensors can be optimized to help achieve one goal or a combination of goals such as minimizing public exposure to contaminants, the spatial extent of contamination, sensor detection time, or costs. To address the second step (i.e. source identification) of the protocol a few other researchers have investigated different methods for identifying locations of contaminant injection after detection of pollution (De Sanctis et al. ; Laird et al. ; Preis & Ostfeld ). As the number of measurements increases over time, the problem is better defined but contaminant spread and public exposure also increase (Poulin et al. ), so source identification in a short time is a tricky task. Due to the sparseness of the sensor grid, this problem inherently has non-unique solutions (Laird et al. ). Thus solving the inverse problem of source identification leads to several probable injection locations.

Regarding the subsequent successful detection of a contamination event via a contamination warning system, consequence management strategies must be implemented. These consequence management strategies would include the following factors: (1) public notification; (2) isolation of a contaminant through valve operations (US EPA a); (3) flushing the contaminated water out of the system through hydrants (US EPA b); and (4) any combinations of public notification, valve operations, and system flushing. Flushing is the purging of water from the distribution network via fire hydrants or blow-off ports to address water quality concerns (Baranowski et al. ). Consequence management strategies which could best minimize public health hazard and economic impacts to remediate contaminated systems must then be evaluated.

Limited research has systematically focused on the development and application of the most effective consequence management strategies in response to contamination, which leaves the topic in its early stages of development. Baranowski & LeBoeuf () investigated consequence management in order to identify demands which were most appropriate to minimize the concentration of contaminants in a water distribution network. They employed three different gradient-based optimization techniques in order to find out the near-optimal demand necessary and requisite for minimizing total network contaminant concentration after detection of pollution presence by warning sensors. In another attempt, Baranowski & LeBoeuf () employed a genetic algorithm to minimize contaminant concentrations in a water network along with minimizing the cost of hydrant flushing. EPANET as a hydraulic simulator was employed in their study, and the genetic algorithm was utilized to identify the following items: (1) the nodes at which to alter the demand; (2) the new demands for these nodes; and (3) pipe closure locations essential to decrease the contaminant concentration during an incident. Regarding their assumption, flushing could be done at any nodes and every pipe could be closed as desired. Preis & Ostfeld () utilized Non-Dominated Sorted Genetic Algorithm II (NSGAII) as an optimizer in order to enhance the response against intentional contamination intrusions into water networks. They explored two conflicting objectives: (1) contaminant mass consumed minimization following detection, versus (2) minimization of the number of operational activities requisite for isolation and flushing the contaminant out of the network. They defined the first objective as the total mass of contamination in the consumed water following detection until the end of the simulation period. In their system simulation, occurrence
of negative pressures was disregarded. Poulin et al. () introduced a simple topological method to organize the isolation of polluted zones within the drinking water supply networks. Their approach is based on closing proper valves and leaving one pipe to let clean water go through the isolated area. Following the previous study which addressed isolation of the contaminated area, in another study, Poulin et al. () defined unidirectional flushing strategies through a heuristic set of rules in a well-organized and efficient way. Alfonso et al. () presented a methodology for finding sets of operational activities in a water distribution network in order to flush the pollution out of the system to minimize the impact on the population. They explored the situation as two aspects: single-objective and multi-objective optimization problem, which were investigated by using optimization techniques, in combination with EPANET.

Although different strategies have been defined and a few methodologies developed to effectively manage the consequences of the contamination after potential source identification, very few attempts have been made to explicitly or implicitly deal with the uncertain location of the intrusion as an important issue which should be addressed following the second step. In fact, inverse solutions for source identification may not result in a single solution, implying that more than one source may be nominated as a possible polluted node. Each polluted node will call for a different optimal consequence management and operational strategy. All previous studies have developed consequence management strategies assuming predefined intrusion location. Disregarding the uncertainties involved in the assumed polluted node may result in a solution strategy very far away from the optimal one. In a most recent work, Haxton & Uber () utilized a source location algorithm according to an event backtracking analysis to determine feasible and likely injection nodes. In their study, the source locations were considered as inputs to the flushing approach, which made the average impact least across all of the injection locations. Based on their results, knowing the contaminant source location would influence the efficiency of the flushing significantly and if the number of potential and feasible source locations was smaller, the decrease in impacts would be greater. Realizing that there might be no priority in selecting any of the nominated nodes for consequence management, a well-established approach is lacking. In this study, consequence management is explored under source uncertainty applying Minimize Maximum Regret (MMR) and Minimize Total Regret (MTR) approaches. Although the min-max regret model has been applied to several optimization cases under uncertainties (Averbakh ; Chang & Davila ; Afshar & Amiri ), utilization of these methods in consequence management has not been reported.

METHODOLOGY

In this paper, an ant colony optimization (ACO) algorithm for solving MMR and MTR models, considering a constraint for technical operational capacity, is presented. A water distribution network is considered in which some of the pipes represent valves and some of the nodes represent hydrants. To deal with uncertainties, five nodes are assumed as probable locations of intrusion. The strategy that has been used consists of three steps. The first step finds the optimal solution employing the ACO algorithm. The objective function in the optimization model seeks to minimize the total number of polluted consumer nodes for each scenario. The second step finds the solution that minimizes the maximum regret over all potential scenarios, as will be defined in the next section (Minimizing Maximum Regret). The third step finds the solution which minimizes the total regret over all potential scenarios (Minimizing Total Regret).

MMR AND MTR APPROACHES IN CONSEQUENCE MANAGEMENT STRATEGIES

Many problems are associated with the degree of uncertainty. In these situations, the decision maker tries to find a solution that performs relatively well across uncertainties. Regret criterion is a useful tool for decision making under uncertainty. Regret is a sense of loss which is felt by the decision maker knowing an alternative action would be more profitable than the one that was taken (Mausser & Laguna ). For instance, in finance, an investor may observe not only his own portfolio performance but also returns on other stocks or portfolios in which he was able
to invest but decided not to. Therefore, it seems very natural to assume that the investor may feel joy/disappointment if his own portfolio outperformed/underperformed some benchmark portfolio or portfolios (Aissi et al. ). MMR and MTR approaches are among the most reliable criteria for decision making under uncertainties when the likelihood of the possible outcomes cannot be predicted with a satisfying accuracy (Loulou & Kanudia ). In other words, the MMR and MTR approaches are suitable in situations where the decision maker may feel regret if a wrong decision has been made, so that the chosen decision is more satisfying when this regret is taken into account. The objective of the MMR approach is to address a decision which minimizes the maximum deviation between that decision and the optimum decision for each scenario over all possible (or identified) scenarios. In fact this model intends to make a decision with the best possible performance in the worst case (Aissi et al. ). In this study, alternative actions or scenarios are defined as probable injection locations. The decision maker in this problem is an authority governing and running the city’s water distribution network. He/she is the one who bears the responsibility of making a decision and may make that decision based on MMR or MTR when analyzing the consequences and the harmful effects of intrusion on public health.

Let Z(x, s) be the number of polluted consumer nodes under scenario s and solution x where x is a solution that consists of valves and hydrants used to isolate and remove contaminant out of the system and s is a probable node of intrusion. In this definition x ∈ X and s ∈ S where X is decision space and S contains all of the probable nodes of intrusion (scenarios). For a solution x ∈ X, the regret under scenario s ∈ S is defined as follows:

$$R(x, s) = Z(x, s) - Z(x^{*}, s) \tag{1}$$

where Z(x*, s) is the total number of polluted consumer nodes with optimal solution x* under scenario s. In order to determine x*, the number of polluted consumer nodes should be calculated for each scenario individually by the optimizer. Therefore regret for a scenario is defined as the difference of polluted consumer nodes counted for applying a solution x ∈ X and the number of polluted consumer nodes according to the optimal solution for that scenario. The MMR model may now be formulated as:

$$r = \min_{x \in X} \left\{ \max_{s \in S} R(x, s) \right\} \tag{2}$$

The aim of Equation (2) is finding a solution by ACO to minimize the maximum regret over all the possible scenarios. The approach of MTR is very similar to MMR while minimization of total regret is considered as the objective function. The structure of the model is as follows:

$$r = \min_{x \in X} \sum_{s \in S} R(x, s) \tag{3}$$

Table 1 | Time of contaminant detection for different layouts of monitoring stations (minute)

Layout                  Intrusion at node i   Intrusion at node j   Intrusion at node k   Worst time of detection
Layout 1                385                   20                    40                    385
Layout 2                305                   320                   295                   320
Layout 3                240                   280                   330                   330
Best time of detection  240                   20                    40

In order to illustrate the previous definitions, the problem of sensor placement, as contaminant warning systems for a water distribution system, is considered. Suppose that three different layouts (installations) of water quality monitoring stations have been proposed and intrusion could occur in one of the nodes with labels i, j and k (scenarios). Table 1 shows the time of contaminant detection in minutes for each of the three scenarios. The crude choice to minimize the longest detection time would be selection of layout 2, ensuring the time of detection does not exceed 320 min. However, based on Table 2, if intrusion at node j occurred, the regret associated with this choice would be 300 min (the difference between 320 and 20 min), which is too large and could have been avoided if the exact scenario had been known. In addition the total regret of this choice is 620 min (the summation of 65, 300 and 255 min), which is also too large. Therefore, in this example, according to the maximum and the total regret the best choice would be to select layout 1,
ensuring maximum and total regret of no worse than 145 min.

Table 2 | Maximum and total regret of each scenario for each layout of monitoring stations (minute)

Layout     Intrusion at node i   Intrusion at node j   Intrusion at node k   Maximum regret   Total regret
Layout 1   145                   0                     0                     145              145
Layout 2   65                    300                   255                   300              620
Layout 3   0                     260                   290                   290              550

In this study, the fitness value is calculated as the total number of polluted consumer nodes from the beginning of consequence management until the end of the simulation counted over all discrete time intervals:

$$F = \sum_{i=1}^{n} \sum_{t=t_{CM}}^{EPS} N(i, t) \tag{4}$$

Please note that the number of polluted nodes may vary from one computational time step to another. In other words, due to dynamics of the system, a given node may be recognized as a polluted node in one time step and unpolluted in the next one. Summation of the total polluted nodes in the entire computational time steps will often result in a fitness value exceeding the total number of network nodes. In Equation (4), i is the node index, n is the total number of consumer nodes, tCM represents the time of beginning the consequence management and EPS (Extended Period Simulation) is the overall simulation duration. The value of N depends on the existence of certain pollutant concentrations in the nodes. A node is considered polluted when its concentration exceeds the threshold value ct. Depending on the nature of the contaminant and its impact on public health, different residual concentrations may be set for the consequence management strategy. Without loss of generality, a lower threshold of 0.01 mg/l has been used here to observe and assess the consequences if the management period extends for a longer time. If the pollution concentration in node i at time t is more than 0.01 mg/l, the node is assumed polluted and hence N(i, t) is denoted as 1, otherwise it is assigned 0. The polluted consumer nodes are added together every 15 min after beginning the consequence management. Although shorter and/or longer time steps could be used, 15 min is a more rational time step for consequence management both from computational and plan implementational points of view. When minimizing F, indirectly two key issues are addressed: first, reducing the pollution extent (contaminated area) in the network and second, reducing the time of exposure of concentrations above the threshold (Alfonso et al. ). In this article regardless of the differences in total nodal demands, it is assumed that the density of population over the nodes is equally distributed; therefore, all nodes in terms of impact are similarly significant. Note that this is crude because EPANET example 3 consists of nodes with different demands. In addition, in reality some nodes of a water distribution network such as supply nodes for hospitals and schools are more important than others if they become polluted. So it would be more precise if nodes were weighted according to their demands and importance, which needs to be considered in future studies.

ACO ALGORITHM; GENERAL ASPECTS

ACO algorithms, using principles of communicative behavior occurring in real ant colonies, have successfully been applied to solve various combinatorial optimization problems (Abbasi et al. ).

In general, the kth ant at iteration t moves from state i to state j with probability (Dorigo et al. ):

$$P_{ij}(k, t) = \begin{cases} \dfrac{[\tau_{ij}(t)]^{\alpha} [\eta_{ij}]^{\beta}}{\sum_{j=1}^{J} [\tau_{ij}(t)]^{\alpha} [\eta_{ij}]^{\beta}} & \text{if } j \in N_k(i) \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

where τij(t) is the total pheromone deposited on path ij at iteration t, ηij is the heuristic value representing the desirability of state transition ij, Nk(i) is the possible neighborhood of ant k when located at decision point i, and α and β are two parameters to control the relative importance of the pheromone trail against the heuristic value.

Let q be a random variable uniformly distributed over [0, 1] and q0 ∈ [0, 1] be a tunable parameter. The next
option, j, that ant k chooses is (Dorigo & Gambardella ):

$$j = \begin{cases} \arg\max_{l \in N_k(i)} \left\{ [\tau_{il}(t)]^{\alpha} [\eta_{il}]^{\beta} \right\} & \text{if } q \le q_0 \\ J & \text{otherwise} \end{cases} \tag{6}$$

where J is a random variable value selected based on the probability distribution of Pij(k, t) (Equation (5)). Once all ants have built a tour, the pheromone trail intensity will be updated. This is done according to the following equations:

$$\tau_{ij}(t+1) = (1 - \rho)\,\tau_{ij}(t) + \Delta\tau_{ij}(t) \tag{7}$$

where τij(t + 1) is the amount of pheromone deposited for a state transition ij at iteration t + 1, 0 ≤ ρ ≤ 1 is the pheromone evaporation coefficient and Δτij(t) is the amount of pheromone deposited on path ij at iteration t:

$$\Delta\tau_{ij}(t) = \begin{cases} \dfrac{Q}{G_{k_{gb}}} & \text{if } (i, j) \in \text{tour done by ant } k_{gb} \\ 0 & \text{otherwise} \end{cases} \tag{8}$$

where Q is a constant and Gkgb is the objective function value for ant k gb, which is the ant with the best performance within the past total iterations.

MODEL SETUP

The water system utilized is the EPANET example 3 network (Rossman ). It comprises two constant head sources, a lake and a river, three elevated storage tanks, two pumping stations, 117 pipes, 59 consumer nodes and 35 internal nodes (Figure 1). In EPANET, a hydraulic and constituent time step of 15 min was used for a 24 hour simulation period.

Today accounting for uncertainty without accounting for the certain errors coming from wrong water distribution network modeling is unacceptable. Thus, here, realizing the deficiency of the EPANET software in handling the pressure driven condition, an extension of EPANET was prepared to directly include the pressure driven issue in the modeling approach. In pressure driven analysis, the relationship between pressure and demand is incorporated. In this type of analysis, functions assume fixed demand above a given critical pressure, zero demand below a given minimum pressure and some relationship between pressure and demand for intermediate pressures (Cheung et al. ). In this study, minimum and desired pressure limits are assumed to be 0 and 25 m, respectively. Flow of an open hydrant was modeled as an emitter by $Q = K\sqrt{P}$, where P is the pressure drop across the emitter and K is the emitter coefficient. K for all of the simulations was considered 1 l/s/m^0.5. In this research, the hydrants and valves that are selected for consequence management are similar to those that Preis & Ostfeld () utilized in their research (Table 3). The total number of decision variables is equal to 51, which covers the modes of operation for 20 valves and 31 hydrants. The decision variables are coded as binary numbers (0, 1) which determine whether the valves and hydrants are open or closed. Initially all valves are assumed ‘open’ and hydrants are ‘closed’. The mode of operation for open valves and closed hydrants are identified

Figure 1 | Schematic of EPANET’s example 3.

Table 3 | Valve and hydrant locations employed in consequence management

Valve locations (links number): 111, 175, 105, 116, 177, 215, 204, 237, 269, 173, 123, 107, 229, 311, 155, 309, 221, 231, 317, 301
Hydrant locations (nodes number): 40, 50, 60, 601, 61, 120, 129, 164, 169, 173, 179, 181, 183, 184, 187, 195, 204, 206, 208, 241, 249, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275
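The selection rule and pheromone update of Equations (5) to (8) can be illustrated with a small sketch. It is not the authors' implementation: the function names, the list-of-lists pheromone layout (one row per valve or hydrant, one entry per option 0/1) and the use of Python's `random` module are assumptions made only for illustration.

```python
import random

def choose_option(tau_i, eta_i, alpha, beta, q0, rng):
    """Pick an option at one decision point (sketch of Equations (5)-(6)).

    tau_i, eta_i: pheromone and heuristic values of the options at this point.
    """
    weights = [(t ** alpha) * (e ** beta) for t, e in zip(tau_i, eta_i)]
    if rng.random() <= q0:
        # Exploitation branch of Equation (6): take the best-weighted option.
        return max(range(len(weights)), key=weights.__getitem__)
    # Otherwise draw J from the distribution P_ij of Equation (5).
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

def update_pheromone(tau, best_tour, best_cost, rho, Q):
    """Evaporate all trails, then deposit Q / G on the best ant's tour
    (sketch of Equations (7)-(8)). best_tour[i] is the option chosen at point i."""
    for i, row in enumerate(tau):
        for j in range(len(row)):
            deposit = Q / best_cost if best_tour[i] == j else 0.0
            row[j] = (1.0 - rho) * row[j] + deposit
    return tau

# Hypothetical two-variable example with illustrative numbers.
rng = random.Random(0)
tau = [[1.0, 1.0], [1.0, 1.0]]
tour = [choose_option(row, [1.0, 1.0], 1.0, 0.0, 0.4, rng) for row in tau]
update_pheromone(tau, best_tour=tour, best_cost=200.0, rho=0.05, Q=40.0)
```

With β = 0 the heuristic term drops out (every weight reduces to the pheromone value), so in that setting the search is guided by the pheromone trail alone.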
by 0, whereas the decision variable for closed valves and open hydrants is represented by 1 in the proposed binary coding. In any trial solution, decision variables are free to take either 0 or 1 to redefine the operational mode of the valves and/or hydrants. Each trial solution will have its own consequences with its own regret, if implemented. It should be noted that in EPANET example 3 there are no valves, but each pipe can be closed or opened at any time and this option was used to overcome this issue. Contaminant injection takes place with a mass rate of 0.006 kg/s at 09:00 for a duration of 7 hours.

In this study we assume that: (1) the water system is equipped with some sensors for detection of contaminant presence; (2) nodes 103, 111, 125, 113 and 259 are probable nodes of intrusion (Table 4); and (3) the necessary time for detection of contaminant presence in the network by monitoring stations and delay in the response time (including: (1) contamination source identification; (2) isolation and containment by valve closures; (3) flushing by hydrant opening; and (4) public notification (Preis & Ostfeld )) is 4 hours; thus consequence management will begin at 13:00. In addition, the constraint ‘technical operational capacity to implement response’ is considered. Based on this constraint, the total number of operational response activities is restricted to 20. In some real cases, however, the number of operational responses may far exceed this number if a large number of valves and hydrants are selected by the optimizer to be re-operated.

In order to apply ACO algorithms to a specific problem, the problem should be represented on a graph or a similar structure easily covered by ants (Afshar et al. ). Solutions that are produced by ants are combinations of valves and hydrants which are closed and opened by operators at 13:00 simultaneously and remain unchanged until 24:00. In Figure 2, each column represents a valve or hydrant utilized in consequence management. As an example, if an ant selects number 1, it means that the status of that valve or hydrant will be changed in the proposed management alternative. Otherwise, the situation of the valve or hydrant will remain unchanged.

As mentioned above, in order to obtain optimal solutions for MMR and MTR models, the ACO algorithm should be solved for each scenario individually. The function evaluations for all of the ACO algorithms are 90,000 and the metaheuristic parameters are: β = 0, α = 1, ρ = 0.05 and q0 = 0.4. To control the amount of pheromone deposited for a state transition ij, the proper value of Q must be selected by sensitivity analysis. The values of Q for each sub-problem consisting of different inexact scenarios and MMR and MTR are selected through sensitivity analysis as displayed in Table 5.

RESULTS AND DISCUSSION

The number of polluted consumer nodes without performing consequence management for each scenario is shown in Table 6. As presented in Table 6, the total number of polluted consumer nodes for the identified scenarios ranges from 753 to 1,307 for scenario numbers 4 and 3, respectively. Numbers of operational activities along with identification number of valves and hydrants to be re-operated under optimal solutions for different scenarios are

Table 4 | Probable nodes of intrusion (scenarios)

              Scenario 1   Scenario 2   Scenario 3   Scenario 4   Scenario 5
Node number   103          111          125          113          259

Figure 2 | Decision graph of ACO algorithm for consequence management strategies.

Table 5 | Q values for ACO algorithms in order to find optimum solutions

     Scenario 1   Scenario 2   Scenario 3   Scenario 4   Scenario 5   MMR Model   MTR Model
Q    40           50           115          60           45           70          180
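The fitness count of Equation (4) and the cap of 20 operational activities can be sketched as follows. This is only an illustration under stated assumptions: the concentration series would come from the water quality simulation (represented here by a hypothetical nested list, one series per consumer node at 15 min steps), the function names are invented, and the source does not say how solutions exceeding the cap are treated inside the optimizer.

```python
def fitness(conc, threshold=0.01, t_cm_index=0):
    """Count node/time-step pairs with concentration above the threshold c_t,
    starting at the step where consequence management begins (Equation (4)).

    conc[i][t] is the concentration (mg/l) at consumer node i and step t.
    """
    return sum(
        1
        for node_series in conc
        for c in node_series[t_cm_index:]
        if c > threshold
    )

def n_activities(solution):
    """In the binary coding, a 1 means the status of that valve or hydrant
    is changed, so each 1 counts as one operational activity."""
    return sum(solution)

def is_feasible(solution, max_activities=20):
    """Check the 'technical operational capacity' constraint."""
    return n_activities(solution) <= max_activities

# Hypothetical data: two nodes, three time steps.
conc = [[0.0, 0.02, 0.05],
        [0.0, 0.0, 0.011]]
F = fitness(conc)                      # three node/step pairs exceed 0.01 mg/l
trial = [1] * 18 + [0] * 33            # 51 decision variables, 18 re-operated
ok = is_feasible(trial)
```

Because the count runs over node/step pairs rather than distinct nodes, F can exceed the number of consumer nodes, which is consistent with the values of several hundred reported for a 59-node network.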
presented in Table 7. As presented, for scenario number 1, a total of 18 operational activities are identified with nine valves and nine hydrants. The number of polluted consumer nodes based on the occurrence of each scenario while employing the optimal solutions are shown in Table 8. In this table, dark cells are optimal solutions under given consequence management scenarios. Compared to the number of polluted consumer nodes with no consequence management, a significant reduction in the number of polluted consumer nodes may be achieved for scenario numbers 1 to 5. Specifically speaking, assuming that the polluted node is fully identified, the total number of polluted consumer nodes may be reduced by 82, 80, 69, 79 and 84% for scenario numbers 1 to 5, respectively. As an example, for the first scenario, implementation of the consequence management may reduce the total number of polluted nodes from 914 to 164 (Table 8). Please note that even if the right scenario is not correctly identified, the management strategy will still reduce the number of polluted consumer nodes by 42% (from 914 to 533) or more. The second column of Table 8 illustrates that if the decision maker implements the optimum solution associated with scenario number 1, the total number of polluted consumer nodes will reach 164, whereas this number may increase to 699 nodes if the third scenario prevails. The results obtained from the optimization scheme demonstrate that the use of the number of polluted nodes in consequence management helps in reducing both exposure time and consumed pollutant concentrations. Given a scenario, results also illustrate the usefulness of ACO to provide optimal solutions in order to minimize the number of polluted nodes in water distribution network.

Amounts of regret under different scenarios are shown in Table 9. Suppose that the decision maker implements the optimal solution for scenario number 3, for which 402 nodes are expected to be polluted. Let’s assume that, in reality, scenario number 4 occurs. In this case, as a result of incorrect scenario identification, the decision maker must pay for a regret of 238 extra polluted nodes (399 − 161). This issue is reflected in the fifth column and fourth row of Table 9. According to Table 9, if the decision maker employs optimal solutions, the maximum regret would range from 298 to 996 and total regret from 676 to 1,835 for different scenarios.

MMR approach

Minimizing the total or maximum regret may eventually lead to a more robust solution for cases of inexact scenarios. The objective of the MMR approach is to address a decision which minimizes the maximum deviation between the alternative taken and the optimum one for each scenario over all possible (or identified) scenarios. In fact this model intends to make a decision with the best possible performance in the worst case. In order to minimize the maximum regrets under different scenarios, the MMR

Table 6 | Number of polluted consumer nodes without performing response actions

                          Scenario 1   Scenario 2   Scenario 3   Scenario 4   Scenario 5
Polluted consumer nodes   914          906          1,307        753          1,153

Table 7 | Optimal solutions for each scenario

Scenario 1: number of operational activities 18; valve numbers 107, 111, 155, 204, 221, 231, 269, 309, 311; hydrant numbers 601, 120, 179, 183, 184, 187, 195, 204, 267
Scenario 2: number of operational activities 18; valve numbers 107, 111, 116, 204, 215, 221, 231, 269, 309; hydrant numbers 179, 183, 184, 187, 195, 204, 263, 267, 269
Scenario 3: number of operational activities 20; valve numbers 155, 175, 237; hydrant numbers 40, 601, 61, 120, 129, 164, 169, 173, 179, 181, 183, 195, 204, 265, 269, 271, 273
Scenario 4: number of operational activities 18; valve numbers 116, 204, 231, 269, 309, 311; hydrant numbers 60, 129, 179, 183, 184, 187, 195, 204, 257, 261, 267, 269
Scenario 5: number of operational activities 20; valve numbers 107, 111, 123, 221, 269, 301, 309; hydrant numbers 164, 169, 173, 179, 181, 183, 257, 259, 261, 263, 265, 269, 273
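The percentage reductions quoted in the text can be checked directly from the reported counts. The short calculation below uses the no-action counts of Table 6 and the per-scenario optimal counts reported in Table 8; the variable names are illustrative only.

```python
# Polluted consumer nodes without response actions (Table 6).
no_action = {1: 914, 2: 906, 3: 1307, 4: 753, 5: 1153}
# Polluted consumer nodes when the matching optimal solution is applied
# (values reported for x*(sc = k) under scenario k in Table 8).
optimal = {1: 164, 2: 179, 3: 402, 4: 161, 5: 179}

# Percentage reduction per scenario, rounded to the nearest integer.
reduction = {
    s: round(100 * (no_action[s] - optimal[s]) / no_action[s])
    for s in no_action
}

# Scenario 1 occurs but a solution tuned for another scenario is applied:
# the largest reported count in that case is 533.
worst_case_reduction = round(100 * (914 - 533) / 914)
```

The computed values reproduce the 82, 80, 69, 79 and 84% stated for scenarios 1 to 5, and the 42% figure for the mismatched case.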
approach is applied. The results are shown in Tables 10 and 11. By minimizing maximum regret, it is intended to re-operate the valves and hydrants in such a way that, for all possible scenarios, the maximum deviation of the number of polluted consumer nodes from the optimum value is minimized. In this case, the maximum amount of regret under different scenarios will be reduced. Table 10 provides the valves and hydrants which minimize the maximum regret over all identified scenarios. For valves and hydrants proposed in Table 10, the values of regret for different scenarios are presented in Table 11. As presented, for the proposed valves and hydrants re-operation, the maximum regret will result when scenario number 3 occurs. By comparing the maximum regret associated with single scenario-based optimum number of polluted consumer nodes, the solution to the MMR model has decreased the maximum regret. To appreciate the drop in maximum regret, one may compare the regret values in Table 11 with those in column 7 of Table 9. Table 11 implies that no matter which scenario is going to happen, the decision maker’s regret will not exceed 157 polluted consumer nodes, whereas the maximum regret for the single scenario-based solution ranges from 298 to 996 for different scenarios (Table 9).

MTR approach

To test the performance of the MTR model in handling injection location uncertainty in consequence management, the same case example and the same five scenarios are used. The final results of the model application are provided in Tables 12 and 13. Similarly to Table 10, Table 12 provides the optimum solution which minimizes the total regret over all possible scenarios. For valves and hydrants proposed in Table 12, the total values of regrets for different scenarios are presented in Table 13. As presented, for the proposed valves and hydrants re-operation, maximum regret will result when scenario number 3 occurs. Table 13 shows that the decision maker should implement valves and hydrants provided in Table 12, for which the total regret for the fifth scenario will be 18. In this case if other scenarios take place, the decision maker will feel regrets ranging from 82 to 183 polluted consumer nodes. For the proposed solution the total regret over all identified scenarios will be equal to 576 polluted consumer nodes (Table 13). Compared to the MMR model, the total regret resulting from the MTR model has been reduced (i.e. from 639 in Table 11 to 576 in Table 13). In this model, compared to the MMR model, total regret is reduced by almost 10 percent and maximum regret is increased by 14 percent. As a result, it seems that MMR and MTR models nearly eventuate

Table 8 | Number of polluted consumer nodes based on occurrence of each scenario and employing optimal solutions

             Scenario 1   Scenario 2   Scenario 3   Scenario 4   Scenario 5
x*(sc = 1)   164          226          699          195          477
x*(sc = 2)   183          179          1,396        204          731
x*(sc = 3)   533          475          402          399          671
x*(sc = 4)   346          336          1,398        161          679
x*(sc = 5)   382          552          589          444          179

Table 10 | Optimal solution obtained from MMR approach

Number of operational activities: 19
Valve numbers: 105, 111, 116, 155, 204, 215, 229, 269, 301, 309, 311, 317
Hydrant numbers: 40, 173, 195, 204, 259, 267, 271

Table 9 | Regret under different scenarios

                   Reality
Assumed scenario   Scenario 1   Scenario 2   Scenario 3   Scenario 4   Scenario 5   Maximum regret   Total regret
Scenario 1         0            47           297          34           298          298              676
Scenario 2         19           0            994          43           552          994              1,608
Scenario 3         369          296          0            238          492          492              1,395
Scenario 4         182          157          996          0            500          996              1,835
Scenario 5         218          373          187          283          0            373              1,061
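The regret values of Table 9 follow from Table 8 by applying Equation (1) column by column. The sketch below reproduces that calculation; it compares only the five single-scenario solutions, whereas the MMR and MTR models in the paper search the full solution space with ACO, so the list layout and variable names here are illustrative assumptions.

```python
# Table 8: rows are the implemented solution x*(sc = k), columns are the
# scenario that actually occurs.
polluted = [
    [164, 226, 699, 195, 477],
    [183, 179, 1396, 204, 731],
    [533, 475, 402, 399, 671],
    [346, 336, 1398, 161, 679],
    [382, 552, 589, 444, 179],
]

# Z(x*, s): best achievable count under each scenario (column minimum).
best = [min(col) for col in zip(*polluted)]

# Equation (1): R(x, s) = Z(x, s) - Z(x*, s).
regret = [[z - b for z, b in zip(row, best)] for row in polluted]
max_regret = [max(row) for row in regret]
total_regret = [sum(row) for row in regret]

# Index (0-based) of the single-scenario solution with the smallest
# maximum regret and with the smallest total regret.
mmr_pick = min(range(len(regret)), key=max_regret.__getitem__)
mtr_pick = min(range(len(regret)), key=total_regret.__getitem__)
```

Among these five candidates the solution tuned for scenario 1 has both the smallest maximum regret (298) and the smallest total regret (676); the dedicated MMR solution reported in Table 11 lowers these to 157 and 639.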
in the same results and both are suitable in consequence management.

Table 11 | Number of polluted consumer nodes and amounts of regret with optimal solutions under different scenarios obtained from MMR approach

                                    Scenario 1   Scenario 2   Scenario 3   Scenario 4   Scenario 5
Number of polluted consumer nodes   294          319          559          272          280
Regret                              130          140          157          111          101
Total = 639

Table 12 | Optimal solution obtained from MTR approach

Number of operational activities: 16
Valve numbers: 107, 111, 116, 123, 215, 221, 237, 269, 309, 317
Hydrant numbers: 129, 173, 195, 267, 269, 271

any employed consequence management strategy. In addition it was shown that knowing the exact location of contaminant intrusion will strongly influence the effectiveness of the response activities. Contamination detection, source identification, and consequence management strategy implementation are all time consuming and may demand relatively considerable
time. To minimize the time lag and overcome the computational shortcomings, it is recommended to set up and calibrate the models for detection, source identification,
CONCLUSIONS
and consequence management in advance and have them
Efficient consequence management in an intentionally con-
MTR and MMR models are suitable for design and analysis
taminated water distribution network is only possible if the
of consequence management strategies. Realizing the dis-
source of the contamination is known. Complete knowledge
crete nature of the decision space in the consequence
on the contaminant source location will lead to great
management problem, the ACO algorithm performed
reduction in the polluted consumer nodes and greatly influ-
quite satisfactorily and is recommended for similar studies.
ence the effectiveness of the consequence management.
Although not providing the water network governor and
However, inverse solutions for source identification may
authorities with a single solution, results of the proposed
identify multiple inexact sources of contamination and pol-
models may provide the decision maker with a reasonable
luted nodes, each one demanding different optimal
level of awareness and impacts of the alternating decisions
consequence management and operational strategy. This
considering the uncertainties involved in exact identifi-
study proposed and tested a systematic approach based on
cation of the contaminated node. In addition, the
MMR and MTR in connection with the well-established
proposed methodology disregards other uncertainties,
ACO approach to develop a set of robust consequence man-
such as type of injected contaminant and injection time,
agement strategies with known impacts on alternating
which need to be investigated in future works. The testing
strategies. It was illustrated that the approach is mathemat-
of more networks with different topological structures is
ically sound, computationally feasible and the proposed
recommended for improving the confidence in the pro-
method can be used to analyze the regrets associated with
posed approach.
in ‘ready to be used’ condition. It was shown that both
Table 13
|
Number of polluted consumer nodes and amounts of regret with optimal solutions under different scenarios obtained from MTR approach
Number of polluted consumer nodes Regret
Scenario 1
Scenario 2
Scenario 3
Scenario 4
Scenario 5
246
350
585
283
197
82
171
183
122
18
Total ¼ 576
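As a cross-check on the regret figures reported in Tables 9, 11 and 13, the following minimal Python sketch recomputes the maximum and total regret of each strategy. The numbers are transcribed from the tables; the names `TABLE_9`, `MMR_REGRET`, `MTR_REGRET` and `summarize` are illustrative and not part of the original study.

```python
# Regret matrix transcribed from Table 9: each row is an assumed scenario,
# each column the scenario that occurs in reality.
TABLE_9 = [
    [0, 47, 297, 34, 298],
    [19, 0, 994, 43, 552],
    [369, 296, 0, 238, 492],
    [182, 157, 996, 0, 500],
    [218, 373, 187, 283, 0],
]

# Regret of the two robust solutions under scenarios 1..5 (Tables 11 and 13).
MMR_REGRET = [130, 140, 157, 111, 101]
MTR_REGRET = [82, 171, 183, 122, 18]


def summarize(regrets):
    """Return (maximum regret, total regret) for one candidate strategy."""
    return max(regrets), sum(regrets)


if __name__ == "__main__":
    for i, row in enumerate(TABLE_9, start=1):
        print(f"single-scenario solution {i}: (max, total) = {summarize(row)}")
    print("MMR solution: (max, total) =", summarize(MMR_REGRET))
    print("MTR solution: (max, total) =", summarize(MTR_REGRET))
```

The sketch only restates the two criteria: the MMR solution is judged on the first element of the pair and the MTR solution on the second.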
First received 21 July 2012; accepted in revised form 24 June 2013. Available online 25 July 2013
© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014
Free-surface flow simulations for discharge-based operation of hydraulic structure gates
C. D. Erdbrink, V. V. Krzhizhanovskaya and P. M. A. Sloot
ABSTRACT We combine non-hydrostatic flow simulations of the free surface with a discharge model based on elementary gate flow equations for decision support in the operation of hydraulic structure gates. A water level-based gate control used in most of today’s general practice does not take into account the fact that gate operation scenarios producing similar total discharged volumes and similar water levels may have different local flow characteristics. Accurate and timely prediction of local flow conditions around hydraulic gates is important for several aspects of structure management: ecology, scour, flow-induced gate vibrations and waterway navigation. The modelling approach is described and tested for a multi-gate sluice structure regulating discharge from a river to the sea. The number of opened gates is varied and the discharge is stabilized with automated control by varying gate openings. The free-surface model was validated for discharge showing a correlation coefficient of 0.994 compared to experimental data. Additionally, we show the analysis of computational fluid dynamics (CFD) results for evaluating bed stability and gate vibrations.
Key words | computational fluid dynamics, discharge sluice, free-surface flow, gate operation, hydraulic gates, hydraulic structures

C. D. Erdbrink (corresponding author), V. V. Krzhizhanovskaya, P. M. A. Sloot
University of Amsterdam, Amsterdam, The Netherlands and National Research University of Information Technologies, Mechanics and Optics, Saint Petersburg, Russia
E-mail: chriserdbrink@gmail.com; christiaan.erdbrink@deltares.nl
C. D. Erdbrink
Deltares, Delft, The Netherlands
doi: 10.2166/hydro.2013.215

NOTATION
A            amplitude of gate vibration (m)
Alake        surface area of lake (m²)
a            gate opening (m)
Cc           contraction coefficient for flow past sharp-edged underflow gate (–)
Cc,in        contraction coefficient for flow entering upstream section between piers (–)
CD           discharge coefficient for submerged flow (–)
CD*          average value of CD over one discharge event, computed by discharge model (–)
CE           discharge coefficient as used by Nago () for experimental data (–)
e            error value of PID discharge controller (m³/s)
fgate        frequency of gate vibration (Hz)
Fr           Froude number (–)
g            gravitational constant (m/s²)
h            water depth (m)
h0           upstream water depth before reaching pier (m)
h1           upstream water depth between piers, upstream of gate (m)
h2           water depth in control section (m)
h3           water depth downstream of gate, between piers, behind recirculation zone (m)
h4           downstream water depth beyond pier (m)
htarget      target lake level to be reached at the end of discharge period (m)
k            turbulent kinetic energy (m²/s²)
KP, KI, KD   gain parameters of PID discharge controller (–)
m            number of gates opened fully or partially (–)
n            total number of gates of the structure (–)
$\vec{n}$    normal vector (–)
p            pressure (Pa)
Q            discharge (m³/s)
QDM          discharge through one gate, calculated by discharge model (m³/s)
Qgate        discharge through one gate, calculated by the system model (m³/s)
QMF          modular flow discharge based on gate underflow contraction criterion (m³/s)
q            discharge per unit width (m²/s)
t            time variable (s)
U            magnitude of flow velocity vector; in 2DV model defined as $U = \sqrt{u_1^2 + u_3^2}$ (m/s)
Uvc          magnitude of flow velocity vector in vena contracta (m/s)
$\vec{u} = (u_1, u_3)$  flow velocity vector in 2DV model; $u_1$ is horizontal velocity, $u_3$ is vertical velocity (m/s)
Vr           reduced velocity parameter of flow-induced vibrations (–)
Vtot         total volume passing the structure in a given amount of time (m³)
Vtot,req     required total volume to pass the structure in a given amount of time in order to reach htarget (m³)
w            width between piers (m)
α            calibration parameter for turbulent flow in bed stability parameter (–)
β            relaxation factor in formula for CD in system model (–)
ε            turbulent dissipation (m²/s³)
ξin          entrance loss coefficient (–)
ξout         exit loss coefficient (–)
Ψ            stability parameter for beginning of motion of granular bed material (–)
$\nabla u_i$           gradient operator, defined as $\nabla u_i = (\partial u_i/\partial x,\ \partial u_i/\partial z)$
$\nabla \cdot \vec{u}$ divergence operator, defined as $\nabla \cdot \vec{u} = \partial u_1/\partial x + \partial u_3/\partial z$
$\binom{q}{r}$         number of combinations of r objects out of q objects ($0 \le r \le q$), defined as $\frac{q!}{r!\,(q-r)!}$
$\lfloor r \rfloor$    floor function, defined as $\forall r \in \mathbb{R},\ \lfloor r \rfloor = \max(n \in \mathbb{Z} : n \le r)$
$\lceil r \rceil$      ceiling function, defined as $\forall r \in \mathbb{R},\ \lceil r \rceil = \min(n \in \mathbb{Z} : n \ge r)$
$\bar{s}$              time-average of quantity s

INTRODUCTION

This paper gives an outline of how near-field free-surface flow simulations can be used in the operation of gates of large hydraulic structures.

Barrier operation is commonly based on water level predictions from system-scale far-field flow models. The procedures are aimed at fulfilling the main function of the structure: for a weir in a river this is to maintain the upstream water level; for a discharge sluice this is to transfer river water out to the sea while keeping a safe inland level. Present-day hydraulic structures have various secondary functions, such as providing favourable ecological conditions, for which usually no numerical aids are available in daily operation. A better prediction of the flow near structures would be beneficial to durable performance of all barrier tasks.

Proper design studies pay attention to all functions of a structure and assess the impact of all relevant flow features. However, operational constraints change in time for natural reasons (e.g. sea-level rise) or political reasons (e.g. ‘Kierbesluit Haringvlietsluizen’) (see Rijkswaterstaat ). In addition, sometimes the design criteria that were originally applied cannot be retrieved, yielding uncertainty about safety levels and allowable limits of gate settings in the present.

There are several aspects in contemporary barrier management for which an informed view on discharge and flow around gates is essential. First, the prediction of bed material stability and scour, including local erosion (Hoffmans & Pilarczyk ; Azamathulla ), as well as large-scale morphological changes of surrounding bathymetry (Nam et al. ) greatly depend on the flow. Second, ecological issues such as fish migration, salt water intrusion and mobile fauna are also linked with local flow characteristics (Martin et al. ). Third, other relevant aspects are the dynamic forces associated with flow-induced gate vibrations (Naudascher & Rockwell ) and the impact of flow around structures on nearby shipping traffic. Fourth, for the structure itself, local flow prediction is useful for dealing with abnormal conditions: downtime of gates during scheduled maintenance or unexpected gate failure.
The barriers and sluices built in the south-west of The Netherlands in the period 1960–2000 are good examples of structures where different functions are combined. Present management of the barriers at Haringvliet and Oosterschelde calls for smart use to allow for regulation of fresh and salt water flows and fish migration. At the same time, the aging process of these structures demands an increasing awareness of structural safety issues. The new storm-surge barrier of Saint Petersburg, Russia, is another example. This large dam houses two sector-gates and three sections of radial gates that protect the low-lying city centre and regulate the discharge from the river Neva. Operation of this complex structure must rely on state-of-the-art flow models.

The above considerations motivate quantification of flow around a hydraulic structure. The aim of this paper is to lay a foundation for numerical models to estimate gate discharges and to evaluate the impact of flow near hydraulic structures in a way that is fit for operational applications. The influence of waves is not investigated; the focus is on flow (current).

Traditionally, flow around hydraulic structures is studied experimentally in the design stage or as a fundamental research topic (Kolkman ; Roth & Hager ). Numerous numerical studies have looked into sluice gate flow (Khan et al. ; Kim ; Akoz et al. ), but no single accepted, validated modelling tool exists for assessing turbulent gate flow with suitable practical value. Estimating discharge over weirs or under gates is not trivial. New discharge equations are still being introduced, both from informatics viewpoints (Khorchani & Blanpain ) and from the traditional viewpoint of measurements (Habibzadeh et al. ).

System-scale models of inland water systems simulate the flow in river branches by solving the one-dimensional or quasi-two-dimensional Shallow Water equations (Deltares a, b). The fact that these hydrostatic models do not simulate the flow around hydraulic structures explicitly is not a severe limitation for most applications. The system effect of the operation of various gates on the water levels in adjacent water bodies (river branches) can thus be studied (for instance Becker & Schwanenberg ). For stability of granular bed material and salt water transport, however, the flow acceleration in the vertical dimension needs to be simulated. Moreover, the downside of primarily water level-centred validation and calibration in combination with parameterized structure representations (such as constant discharge coefficients) is that the prediction quality of discharges in system-scale models is often unclear. Warmink et al. (, ) investigated the uncertainty in calibration of water levels in river models resulting from the limited availability of discharge data. It was concluded that the necessary extrapolation of the calibration parameter (bed roughness of main channel) leads to significant uncertainty in simulated design water levels. More intensive measurement of discharges, for which most gated structures are ideal, and a physically more realistic representation of hydraulic structures in models are self-evident improvements that nevertheless require a culture shift.

The application of computational fluid dynamics (CFD) in the assessment of flow impact issues that arise long after the start of operation of a structure is rare. Bollaert et al. () employ numerical modelling to assess the influence of gate usage on the formation of plunge pool scour of a hydropower dam. For some issues, like salt water intrusion and sediment transport past a discharge-regulating structure, the solution cannot be found in a modelling tool at one scale. The local flow simulation should in those cases be coupled to a mid- or far-field model that covers a larger area.

A central role nowadays is played by the multi-disciplinary field of hydroinformatics (Solomatine & Ostfeld ; Krzhizhanovskaya et al. ; Melnikova et al. ; Pyayt et al. a; Pengel et al. ), in which different forms of modelling (physics-based and data-driven) are considered and combined with contemporary computational techniques like machine learning (Pyayt et al. b). In the context of the present study, it is noted beforehand that for a complex hydraulic structure, data-driven modelling alone is not an apt option, because a single Q-H-relation does not describe all states (Kolkman ), or is highly impractical as it would require extensive permanent monitoring.

This study takes the underlying physics as a starting point: elementary flow equations are combined with time-dependent CFD simulations using a two-dimensional model in the vertical (2DV). The method bridges modelling scales with a minimum of data coupling and at the same time introduces the use of numerical aids into practical barrier operation for issues that at present are decided upon by expert judgement by the operator.

The remainder of this paper is organized as follows: first, we describe the overall approach, then the method is
described in three sections about discharge modelling, free-surface flow simulation and analysis of the modelling results. Next, the results of a series of validation runs for the free-surface model are discussed, followed by the results of a test case that gives numerical examples of all modelling steps. We end the paper with recommendations, conclusions and an outlook on future work.

APPROACH

For obtaining a timely prediction of the flow around gates, we propose a multi-step physics-based modelling strategy which uses data input from a system-scale model. The work-flow of the suggested gate operation system is shown in Figure 1.

The first step consists of the extraction of predicted water levels on both sides of the structure from a far-field (system-scale) model that contains the structure. Different possible gate settings (when to open, how many gates to use) are identified in the second step. All options need to be assessed in terms of discharge capacity; this happens in step 3. In the fourth step of Figure 1, for all gate configurations capable of discharging the required volume, the resulting flow is simulated using CFD. Subsequent analysis of the simulation results determines the impact of the flow for specific issues such as bed stability. The fifth and final step comprises the actual decision of gate operation actions.

The conventional sequence of steps taken by most operational systems follows the dashed line in Figure 1, skipping steps 3 and 4. The present study focuses on steps 2–4, which can be seen as an addition to computational decision support systems (steps 1 and 5) by Boukhanovsky & Ivanov () and Ivanov et al. ().

A multi-gated discharge sluice with underflow gates will be used to describe the modelling method. The central question addressed is how to find the set of gate configurations capable of delivering the required discharge that also meet the relevant constraints on flow properties.

Figure 1 | Scheme of evaluation steps leading to a decision on optimal gate operation. Steps 1–4 are treated in this paper. The dashed line shows the shorter decision sequence taken by barrier systems that do not take into account flow effects.

METHOD

Discharge computations

Configurations of multi-gated structure

Let us consider the gate configurations of a discharge structure consisting of n similar openings, each accommodating a movable gate, see Figure 2. In its idle state, all n gates close off the openings between the piers and the total discharge is zero. During a discharge event, m gates will be opened partially or completely, allowing a certain discharge through the structure. A ‘gate configuration’ is defined as the allocation of a number of gates (m ≤ n) that are opened with a gate opening a(t) while the other gates remain closed. All gates selected for opening will be operated similarly, i.e. with the same a(t).

Before deciding which gates to open, first the possible combinations of opening gates are identified and counted. In general, flow instabilities are not favourable for maintaining an efficient and controllable discharge. As in other parts of physics, symmetry is a global measure for stability of free-surface flows. If asymmetry is allowed, m gates can be chosen freely from the total of n available slots. Then the number of possible combinations is obviously $\binom{n}{m}$, using the common notation for combinatorial choice of m objects out of n. For the condition of symmetry to hold, gates may only be opened in such a way that the pattern is symmetric about the vertical plane of symmetry in flow direction (see Figure 2). This implies that the number of options reduces
to $\binom{\lfloor n/2 \rfloor}{\lfloor m/2 \rfloor}$ for all 0 ≤ m ≤ n, where m cannot be chosen odd if n is even, in which case there are no options at all. For a structure with seven gates (n = 7), for instance, the total number of possible ways to open 1, 2, …, 7 gates is $\sum_{i=1}^{7} \binom{7}{i} = 2^7 - 1 = 127$ if asymmetry is allowed and $\sum_{i=1}^{7} \binom{\lfloor 7/2 \rfloor}{\lfloor i/2 \rfloor} = 2^{\lceil 7/2 \rceil} - 1 = 15$ if only symmetric configurations are permitted.

This shows that the symmetry constraint greatly reduces the number of ways to open a given number of gates. Furthermore, an even number of gates has roughly half the number of possibilities, because opening any odd number of gates then results in asymmetric inflow. This could also hold for an odd-numbered gate structure which misses one (or any odd m < n) of the gates due to maintenance or operational failure.

Figure 2 | A multi-gated discharge sluice in plan view. In this example, gates 3, 4 and 5 are opened, the others are closed; so n = 7 and m = 3. The dotted line depicts plane of symmetry.

Figure 3 | Classic box model of outflow of a river to sea. An outlet barrier structure regulates the lake level while keeping salt seawater out.
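The counting argument for gate configurations can be checked numerically. The sketch below, a minimal illustration rather than part of the described method, evaluates both counting formulas and compares the symmetric count against a brute-force enumeration; the function names and the numbering of gates from 1 to n are assumptions made here for illustration.

```python
from itertools import combinations
from math import comb


def count_all(n):
    """Number of ways to open 1..n gates out of n when asymmetry is allowed."""
    return sum(comb(n, m) for m in range(1, n + 1))


def count_symmetric(n, m):
    """Number of symmetric ways to open m of n gates.

    No symmetric pattern exists when n is even and m is odd.
    """
    if n % 2 == 0 and m % 2 == 1:
        return 0
    return comb(n // 2, m // 2)


def enumerate_symmetric(n, m):
    """Brute-force list of symmetric configurations (gates numbered 1..n).

    A configuration is symmetric when every opened gate g has its mirror
    image n + 1 - g opened as well.
    """
    return [c for c in combinations(range(1, n + 1), m)
            if all((n + 1 - g) in c for g in c)]


if __name__ == "__main__":
    n = 7
    print("asymmetric allowed:", count_all(n))
    print("symmetric only:", sum(count_symmetric(n, m) for m in range(1, n + 1)))
    print("symmetric patterns with m = 3:", enumerate_symmetric(n, 3))
```

For n = 7 and m = 3 the enumeration contains the pattern (3, 4, 5) of Figure 2, together with the two other patterns that pair the centre gate with one mirrored pair.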
System model and gate control

The basis is formed by a classic box model, see for example Stelling & Booij (). The focus is on submerged flow through a multi-gated outlet barrier that blocks seawater from entering the lake at high tide and discharges river water to sea at low tide, see Figure 3. This basic model serves in the present study as a surrogate system-scale model. The water levels it generates will be used as boundary conditions for the near-field modelling.

Assuming barrier gates are closed (except when discharging under natural head from lake to sea) and assuming zero evaporation and precipitation, the system is described by:

$$Q_{river} - Q_{barrier} = A_{lake}\,\frac{dh_{lake}}{dt},$$

where Qriver is discharge from a river, Qbarrier is the total discharge through the gates of the barrier, hlake is the water level in the lake, and Alake is the area of the lake, assumed independent of hlake. Submerged flow past an underflow gate is by definition affected by the downstream water level. The associated discharge depends on both water levels (sea and lake), the gate opening a and a discharge coefficient for submerged flow CD. The discharge Q through a barrier gate at time t is written as:

$$Q_{gate}(t) = C_D\,a_i(t)\,w\,\sqrt{2g\,\bigl(h_{lake}(t) - h_{sea}(t)\bigr)},$$

where w is the flow width (see Figure 2) and the subscript ‘barrier’ is dropped from now on. Sea level hsea is approximated by a sine function. The total discharged volume that passes the barrier in the period during which hlake > hsea is found after summing over all m gates and integrating with respect to time.
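To make the box model concrete, the following Python sketch integrates the lake balance with an explicit Euler step and accumulates the discharged volume. It is a minimal illustration under stated assumptions, not the authors' implementation: the time-stepping scheme, the function names, the zero-mean sine tide and all parameter values in the usage lines are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration (m/s2)


def gate_discharge(c_d, a, w, h_lake, h_sea):
    """Submerged discharge through one gate (m3/s).

    Returns zero when there is no positive head, reflecting the assumption
    that gates only discharge under natural head from lake to sea.
    """
    head = h_lake - h_sea
    if head <= 0.0 or a <= 0.0:
        return 0.0
    return c_d * a * w * math.sqrt(2.0 * G * head)


def simulate_lake(h_lake0, q_river, a_lake, m, c_d, a, w,
                  tide_amp, tide_period, dt, n_steps):
    """Explicit Euler integration of A_lake * dh_lake/dt = Q_river - Q_barrier.

    The sea level is approximated by a sine with zero mean (an assumption).
    Returns the final lake level (m) and the total discharged volume (m3).
    """
    h_lake, volume = h_lake0, 0.0
    for i in range(n_steps):
        t = i * dt
        h_sea = tide_amp * math.sin(2.0 * math.pi * t / tide_period)
        q_barrier = m * gate_discharge(c_d, a, w, h_lake, h_sea)
        volume += q_barrier * dt  # sum over m gates, integrate over time
        h_lake += (q_river - q_barrier) / a_lake * dt
    return h_lake, volume


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the functions.
    h_end, v_tot = simulate_lake(h_lake0=2.0, q_river=100.0, a_lake=1.0e6,
                                 m=3, c_d=0.6, a=1.0, w=10.0,
                                 tide_amp=1.0, tide_period=44700.0,
                                 dt=60.0, n_steps=50)
    print(h_end, v_tot)
```

Because the same discharge enters both the volume sum and the lake update, the final level always satisfies the mass balance, which is a convenient sanity check on any replacement integrator.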
Two gate opening scenarios will be considered. In both scenarios equal gate openings a(t) are applied to all m gates selected for opening. The first scenario uses a constant gate opening aconst for the whole discharge period (from tstart to tend). The opening required to lower the lake level to a desired lake level htarget is found by estimating the average required discharge Qtot,req to achieve this and by making estimates of the average discharge coefficient and water levels during the discharge period:

$$a_{const} = \frac{Q'_{tot,req}}{m\,\bar{C}'_D\,w\,\sqrt{2g\,\bigl(\bar{h}'_{lake} - \bar{h}'_{sea}\bigr)}} \quad \text{with} \quad Q'_{tot,req} = \frac{A_{lake}\,\bigl(h_{lake}(t_{start}) - h_{target}\bigr)}{t'_{end} - t_{start}},$$

where bars are time-averages and primes indicate predictions of future values.

In the second scenario, the discharge is regulated by a proportional integral derivative (PID) controller (Brown ). The goal of this scenario is to have a more constant gate discharge by varying the gate openings in time, whilst still achieving the same htarget as in the first scenario. The discrete PID formula for discharge at $t_i$ is:

$$Q(t_i) = K_P\,e(t_i) + K_I \sum_{j=1}^{i} e(t_j) + K_D\,\frac{e(t_i) - e(t_{i-1})}{\Delta t},$$

where $K_i$ are the gain parameters and the error value is defined as $e(t_i) = Q_{set} - Q(t_{i-1})$. In the simulations, KP = 0.10, KI = 0.45 and KD = 0.55 are used. The setpoint Qset is constant and equal to Qtot,req, except for linear setpoint ramping applied at the start of discharge to prevent undue fluctuations of gate position. At each time step, the required gate opening is derived from this discharge divided by $m\,C'_D\,w\,\sqrt{2g\,(h_{lake}(t) - h_{sea}(t))}$. Figure 4 shows the flow chart of the system model. It includes computations of the two gate operation scenarios.

Figure 4 shows that the total discharge computed by the system model Qtot is being used to calculate the new lake level. Additionally, it shows that at the start of each discharge event, i.e. when the gates are opened, the prediction of the discharge coefficient $C'_D$ is updated using data from the discharge model. For both situations, with and without PID control, this coefficient is found by a relaxation function with the mean discharge coefficient $C^*_D$ of the previous discharge event computed by the discharge model. For the nth discharge event, the update formula reads:

$$C'_D(n) = C'_D(n-1) + \beta\,\bigl(C^*_D(n-1) - C'_D(n-1)\bigr)$$

In all computations, a relaxation factor β = 0.75 is applied.

Discharge coefficients actually depend on numerous factors. Also, flows through neighbouring gates influence each other. To distinguish between different gate configurations with the same total flow-through area $m \cdot w \cdot a_{const}$, these two things need to be taken into account. This is done in the discharge model described in the next subsection, see also the bold block in Figure 4.
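The constant-opening estimate and the relaxation update of the predicted discharge coefficient can be sketched in a few lines of Python. This is an illustrative reading of the two formulas above, not the authors' code; the function names are hypothetical and the caller is assumed to supply the predicted averages.

```python
import math

G = 9.81  # gravitational acceleration (m/s2)


def required_discharge(a_lake, h_lake_start, h_target, t_start, t_end_pred):
    """Average total discharge needed to lower the lake to h_target (m3/s)."""
    return a_lake * (h_lake_start - h_target) / (t_end_pred - t_start)


def constant_opening(q_req, m, c_d_pred, w, h_lake_mean, h_sea_mean):
    """Constant gate opening a_const (m) from predicted time-averaged values."""
    head = h_lake_mean - h_sea_mean
    return q_req / (m * c_d_pred * w * math.sqrt(2.0 * G * head))


def relax_cd(c_d_pred_prev, c_d_star_prev, beta=0.75):
    """Relaxation update of the predicted discharge coefficient.

    c_d_pred_prev is the prediction used in the previous discharge event and
    c_d_star_prev the mean coefficient computed by the discharge model for it.
    """
    return c_d_pred_prev + beta * (c_d_star_prev - c_d_pred_prev)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    q_req = required_discharge(1.0e6, 1.0, 0.5, 0.0, 10000.0)
    print("required discharge:", q_req)
    print("a_const:", constant_opening(q_req, 3, 0.6, 10.0, 0.8, 0.2))
    print("updated C_D prediction:", relax_cd(0.6, 0.8))
```

With β = 0.75 the new prediction moves three quarters of the way from the old prediction towards the value computed by the discharge model, so a single outlying event cannot fully overwrite the previous estimate.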
Figure 4 | System model: flow chart of gate control and water level computations.
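The discrete PID formula used for the second gate opening scenario can be transcribed directly into Python. The sketch below follows the formula as printed, with the error sum not multiplied by the time step, and uses the gain values quoted in the text as defaults. The class and function names, and the choice of a zero error before the first step, are assumptions made here; setpoint ramping and limits on the gate opening are left out.

```python
import math

G = 9.81  # gravitational acceleration (m/s2)


class DischargePID:
    """Discrete PID controller for the discharge, as a minimal sketch."""

    def __init__(self, kp=0.10, ki=0.45, kd=0.55):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.error_sum = 0.0   # running sum of e(t_j), j = 1..i
        self.prev_error = 0.0  # e(t_{i-1}); assumed zero before the first step

    def step(self, q_set, q_prev, dt):
        """Return Q(t_i) given the setpoint and the previous discharge."""
        error = q_set - q_prev          # e(t_i) = Q_set - Q(t_{i-1})
        self.error_sum += error
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.error_sum
                + self.kd * derivative)


def opening_from_discharge(q, m, c_d_pred, w, h_lake, h_sea):
    """Gate opening (m) that delivers total discharge q through m gates."""
    return q / (m * c_d_pred * w * math.sqrt(2.0 * G * (h_lake - h_sea)))


if __name__ == "__main__":
    pid = DischargePID()
    q = pid.step(q_set=100.0, q_prev=0.0, dt=1.0)
    print("controller output:", q)
    print("gate opening:", opening_from_discharge(q, 3, 0.6, 10.0, 1.0, 0.2))
```

In practice the computed opening would have to be clipped to the physically possible range of the gate, which this sketch does not attempt.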
195
C. D. Erdbrink et al.
|
Free-surface flow simulations for discharge-based operation of hydraulic gates
Discharge model
Journal of Hydroinformatics
|
16.1
|
2014
Vertical lift gates with underflow are raised vertically between piers of a structure. The two main flow types that occur are free flow and submerged flow. When the gates are lifted higher than the water surface, there exists free or submerged Venturi flow (Boiten ). All flow types have different discharge characteristics and associated formulae. For estimating the submerged flow discharge, the local water depths are schematized according to Figure 5 (after Kolkman 1994). Conservation of the energy head (Bernoulli equation) is applied in the accelerating parts and the momentum equations in the decelerating parts, yielding a system of four equations (see Appendix, available online at http://www.iwaponline.com/jh/016/215.pdf).

Transitions h0–h1 and h3–h4 with loss coefficients ξin and ξout represent the effects of flow entering and leaving the narrow area between two piers. Transitions h1–h2–h3 are the characteristic underflow gate zones, see Battjes () for details. Computations were carried out according to the flow chart shown in Figure 6 with the aim of giving better discharge estimates. The lake and sea levels computed in the system model served as boundary conditions for variables h0 and h4 of this model, respectively.

A good geometric design of a discharge-regulator is such that no transition occurs from one flow type to another during regular usage. The model therefore checks if indeed submerged discharge occurs. As criterion for reaching the modular flow discharge QMF, the minimum flow depth in the control section h2 is compared to the flow height in the point of maximum vertical contraction Cc·a, the so-called ‘vena contracta’. Free and intermediate flow regimes are thus detected, but not calculated. Submerged Venturi flow is not considered either, since the idea is to actively control the flow.

All four non-linear equations are reshaped into third-order polynomials f(hi, hi+1, Q) = 0. Discharge Q is substituted for the velocity terms and remains as the only unknown in the system of equations. As prescribed for sub-critical flow conditions (Chow ), computational direction behind the gate is from downstream to upstream (h4 to h2). On the lake side, computations go in flow direction up to the control section (h0 to h2). The discharge coefficient CD is derived from the contraction coefficient Cc for sharp-edged gates, fitted on experimental data cited in Kolkman () so that the full range of gate openings a/h1 is covered; see the Appendix for equations (available online at http://www.iwaponline.com/jh/016/215.pdf).
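Because Q is the only unknown and the control section depth h2 is reached from both sides, the computation can be organised as a root search on Q for which the upstream and downstream sweeps give the same h2. The sketch below shows this with plain bisection. The function names are hypothetical, the two sweep functions must be supplied by the user (the actual polynomials are in the Appendix and are not reproduced here), and bisection is a stand-in for whatever iteration scheme the authors used.

```python
from typing import Callable


def is_submerged(h2: float, cc: float, a: float) -> bool:
    """Submerged-flow check: depth in the control section exceeds the vena contracta height Cc*a."""
    return h2 > cc * a


def solve_discharge(
    h2_forward: Callable[[float], float],   # h2 from the upstream sweep (h0 -> h2), given Q
    h2_backward: Callable[[float], float],  # h2 from the downstream sweep (h4 -> h2), given Q
    q_lo: float,
    q_hi: float,
    tol: float = 1e-8,
    max_iter: int = 200,
) -> float:
    """Bisection on Q until h2_forward(Q) equals h2_backward(Q)."""
    f_lo = h2_forward(q_lo) - h2_backward(q_lo)
    f_hi = h2_forward(q_hi) - h2_backward(q_hi)
    if f_lo * f_hi > 0:
        raise ValueError("bracket [q_lo, q_hi] does not enclose a solution")
    for _ in range(max_iter):
        q_mid = 0.5 * (q_lo + q_hi)
        f_mid = h2_forward(q_mid) - h2_backward(q_mid)
        if abs(f_mid) < tol or (q_hi - q_lo) < tol:
            return q_mid
        if f_lo * f_mid <= 0:
            q_hi, f_hi = q_mid, f_mid  # root lies in the lower half
        else:
            q_lo, f_lo = q_mid, f_mid  # root lies in the upper half
    return 0.5 * (q_lo + q_hi)
```

After convergence, `is_submerged` can be applied to the resulting h2 to flag cases in which the modular flow limit is reached and the submerged-flow equations no longer apply.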
Figure 5 | Definitions of local water depths hi for underflow gate in hydraulic structure, after Kolkman (1994). Above: top view of pier; below: cross-section free water surface around gate. Sketch not to scale.
Figure 6 | Flow chart of discharge model. This computation is repeated each time step; it is fully contained in the block named ‘discharge model’ in Figure 4.
Iterations on Q ultimately yield a value at which h2,forward, computed from upstream, is equal to h2,backward computed from downstream. This is the achieved value of Q for the given gate opening a. The entrance and exit losses are assumed to depend on the number of gates in use (m). The method does not distinguish between different gate configurations with equal m, however. Numerical results are shown in the results section.

CFD SIMULATIONS

Step 4 in Figure 1 consists of two parts: free-surface CFD simulations (discussed in this section) and flow analysis (discussed in the next section).

Model set-up

A non-hydrostatic flow model is applied to find out which of the selected gate settings is most favourable in terms of flow properties. The two-dimensional domain (2DV) is defined by a vertical cross-section through the gate section from one water body (lake) to the other (sea), see Figure 7. A rigid rectangular gate with a sharp-edged bottom is modelled implicitly by cutting its shape out of the flow domain. The Reynolds-Averaged Navier–Stokes (RANS) equations for incompressible flow, included in the Appendix (available online at http://www.iwaponline.com/jh/016/215.pdf), are the basis for the simulations. Figure 8 gives the flow chart of the CFD simulations. The model domain covers the flow from h1 to h3. These input values are taken from the discharge model.

For each simulated flow situation, two consecutive runs are made: a steady-state run and a time-dependent transient run. In the former run, iterations on the outflow velocity profile are done until pressure at the surface becomes zero. The results of this pre-run are then implemented as initial conditions for the transient run, which uses a moving mesh to simulate the free surface. Boundary conditions are similar for both runs except for the surface downstream of the gate, see Figure 7.

The upstream flow boundary consists of a hydrostatic pressure profile p(z) = ρg(h1 − z). The downstream boundary is a block profile u-velocity. No slip is applied at the
Figure 7 | Boundary conditions of CFD model. The main flow direction is from left to right. Sketch not to scale.
Figure 8 | Flow chart of FEM free-surface flow simulations.
walls ($\vec{u} = 0$) along with a wall function. The steady pre-run uses a ‘rigid lid’ (free slip boundary, $\vec{u} \cdot \vec{n} = 0$) for the downstream water surface. The upstream free surface is modelled as a rigid lid in both runs.

An unstructured computational mesh is used with refinements near the bottom wall and gate boundaries, made up of around 35,000 triangular elements and yielding about 230,000 degrees of freedom for a transient run. Figure 9 shows part of the mesh. The Arbitrary Lagrangian–Eulerian (ALE) method with Winslow smoothing (Donea et al. ) is applied to compute the deformation of the computational mesh downstream of the gate. At the top boundary in the transient run, the velocity condition is an open boundary with zero stress in normal direction. At the same boundary, the mesh velocity in normal direction is prescribed as $u_{\mathrm{mesh},n} = u_1 n_x + u_3 n_z$ (Ferziger & Perić ). Mesh convergence tests showed that the applied mesh is sufficiently dense so that results do not improve on further mesh refinement.

The more common choice of applying a velocity condition upstream and a pressure boundary downstream conflicts with the required ALE moving mesh condition at the outlet boundary. Vertical mesh freedom is necessary for the surface movement. A hydrostatic pressure profile cannot be prescribed at the outlet, since any change in water depth at this boundary would imply a change of local pressure, which contradicts the applied pressure profile. In the course of the transient run, the free surface adapts to the pressure field and vice versa. Because the physical flow situation is quasi-steady, with fluctuations depending on degree of submergence and gate opening, the surface may show oscillations in time in its equilibrium state. As a consequence, the flow discharge is also not strictly constant in the equilibrium state.

The package Comsol Multiphysics is used to simulate the gate flow. This finite element method (FEM) solver is applied to solve the discretised RANS equations. The generalized alpha time-implicit stepping method is applied to ensure Courant stability, with a strict maximum time step of Δt = 0.02 s. The time step in the CFD model is completely independent of the time step in the system model and discharge model. The variables are solved in two segregated groups using a combination of the PARDISO solver and the iterative BiCGStab solver in combination with a VANKA preconditioner. The standard k-epsilon model is used for turbulence closure. Simulation of 24 seconds of physical time took around 6 hours of wall-clock time on an Intel 8-core i7 processor, 2.93 GHz, 8 GB RAM, occupying on average 1 GB RAM and 50% of total CPU power.
Figure 9 | Example snapshot showing part of the computational mesh. Deformed surface downstream of gate is visible. Flow is from left to right.
ANALYSIS

The second part of step 4 in Figure 1 is the analysis of the modelling results obtained in previous steps. In this section, three aspects of analysis are discussed: flow parameters, vibrations and bed stability.

Flow parameters

Three parameters that are required for assessing various types of flow impact are extracted from the CFD model: the contraction coefficient Cc, the velocity in the vena contracta Uvc and the Froude number (Fr). The flow field is interpolated to a regular grid, so that the edge of the separated layer is found, see Figure 10. The contraction coefficient is thus found directly.

The cross-sectional averaged velocity in the vena contracta is defined by a spatial average in the separated shear zone:

$$U_{vc}(t) = \frac{1}{C_c(t)\,a} \int_{z=0}^{C_c(t)\,a} U(z,t)\,\mathrm{d}z$$

where U is the velocity magnitude scalar at the point of maximum flow contraction. For gate flow with significant fluctuations, the temporal mean of this quantity, $\bar{U}_{vc}$, may be used. The Froude number is a widely used dimensionless measure for flow-related surface curvatures. It is used for describing the transition from intermediate to free flow regimes and predicting modular flow discharge and associated gate openings. Here it is defined as:

$$Fr(t) = \frac{U_{vc}(t)}{\sqrt{g\,h_2(t)}}$$

in which obviously h2 = Cc a in fully free flow. An overview of critical flow theory from a historical perspective is given by Castro-Orgaz & Hager () and from a more practical viewpoint by Boiten (). In a more complete flow assessment, not only the vertical contraction caused by the underflow gate is used as a criterion for modular flow, as is done here, but also contraction caused by horizontal and possibly vertical flow domain transitions at the inlet of the structure should be included.

Figure 10 | Vector flow field of run II. Flow is from left to right. The computed free surface behind the gate shows local lowering. Dashed line indicates separation between positive and negative u1-velocity. The figure shows only part of the actual computational domain. Total domain length is 3.6 m.

Vibrations

The interaction of current with the movable hydraulic gate is capable of causing significant flow-induced vibrations (FIV). Although dedicated design tests greatly reduce susceptibility for dangerous dynamic forces (Jongeling & Erdbrink ), active prediction and control will broaden the windows of operation. The literature on dynamic gate forces caused by this phenomenon uses a dimensionless parameter of reduced velocity to signify occurring gate vibrations (Hardwick ; Billeter & Staubli ; Erdbrink ). In time-dependent form it is written as:

$$V_r(t) = \frac{U_{vc}(t)}{f_{gate}(t)\,L}$$

where fgate is the response frequency of the structure in Hz; L is a characteristic length scale of the gate, usually the thickness of the gate bottom, and Uvc as defined in the previous section. The response frequency is not easily determined analytically (see general formula in Appendix, available online at http://www.iwaponline.com/jh/016/215.pdf); among other reasons because the ‘added’ water mass mw that is caused by the inertia of water being pushed away by the gate deviates from analytical values
at non-zero gate flow (Blevins ). The gate frequency may best be monitored in situ by installing sensors, which need to be sensitive to small amplitudes in order to have predictive value. Erdbrink et al. () provide a recipe for a data-driven gate control system for gate vibrations. It is therein proposed to combine physics-based modelling and sensor data with machine learning computations to steer the gates clear of risky situations.

From numerous experimental studies it is concluded (Naudascher & Rockwell ) that for a specific gate, the amplitude A due to FIV, in cross-flow or in-flow direction or both, is a function of Vr, a and submergence:

$$A = f(V_r, a, h_3)$$

Details of the gate geometry are decisive for occurrence or absence of vibrations. A database with response data from past laboratory studies could be used to predict amplitudes of future flow situations in an operational system.

Scour and bed protection

The classical prediction of local scour downstream of weirs and sluice structures caused by outlet currents is described by Breusers () and Hoffmans & Pilarczyk (). More recently, contemporary computational techniques were introduced for scour estimation, e.g. Azmathullah et al. (). In the classical physics-based design formulae, turbulence parameters are used to predict the depth of the scour hole in unprotected beds. For beds protected with granular material (loose rocks), the Shields parameter is a classic non-dimensional measure applied as a first indicator for instability (Shields ). An adapted version of this parameter used by Jongeling et al. () and elaborated upon by Hofland () and Hoan et al. () is defined as:

$$\Psi(x) = \frac{\left\langle \left( U(x) + \alpha \sqrt{k(x)} \right)^2 \right\rangle}{\Delta\, g\, d(x)} \quad \text{with} \quad \Delta = \frac{\rho_s - \rho_w}{\rho_w},$$

where ⟨..⟩ denotes spatial averaging over the whole water depth, k is the turbulent kinetic energy (TKE), d is the local water depth, U is the mean flow velocity magnitude and α is an empirical parameter for bringing into account the turbulence (that depends on flow type and local geometry, e.g. slopes in bottom profile).

MODEL VALIDATION RESULTS

A series of validation runs was performed for the free-surface model. ‘Validation run’ is used here in the meaning discussed by Stelling & Booij (): the uncalibrated model is run without any tweaking of parameters to see if it can reproduce the most important physical features. Experimental laboratory data by Nago (1978, 1983) for a vertical sharp-edged gate under submerged efflux serve as comparison. Nago’s (1978, 1983) dimensions were used without any scaling. His discharge formula $Q = C_E\, a\, w \sqrt{2 g h_1}$ does not contain the downstream level h3 explicitly. Its influence is instead found in the discharge coefficient CE. The simulated discharge is computed by spatial integration of horizontal velocity at the outflow boundary. In Figure 11, coefficient CE is plotted for different series of dimensionless gate openings and for a range of dimensionless downstream levels.

The results of the validation runs make clear that the simulations capture the discharges of the experimental data quite accurately: the correlation coefficient is 0.994 and the root mean square error is 1.14%. The fact that the uncalibrated model shows good discharge estimates gives confidence in the predictive power of this modelling approach. Physical output not validated here (such as TKE) may be calibrated in future studies by adjusting suitable model parameters.

Convergence of various flow variables occurs at different rates. First, the mean velocities stabilize, and then the forces on the gate converge, then the discharge, and lastly the turbulent energy. The chosen boundary conditions proved to lead to stable results for all submergence ratios of Nago’s (1978, 1983) data. It was found that the moving mesh is the critical factor for numerical stability. ALE is a suitable method for computing the free surface for quasi-steady gate flow as long as the flow remains submerged. Steep surface gradients associated with lowering h3 cause inverted mesh elements and hence numerical instabilities.
Figure 11 | Results of validation runs showing discharge coefficient CE simulated by the free-surface CFD model versus experimental data of submerged flow of a sharp-edged underflow gate by Nago (1978, 1983). Left: sorted by gate opening (a/h1) and downstream level (h3/a). Right: direct comparison of the same data. Dashed lines mark 10% deviation.
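The comparison metrics used in the validation can be sketched as follows. The helper names are hypothetical, and the exact error definition behind the reported 1.14% is not stated in the text; the relative (percentage) RMSE below is one plausible reading, not necessarily the authors' definition. No data from Nago's experiments are reproduced here.

```python
import math
from typing import Sequence

G = 9.81  # gravitational acceleration (m/s^2)


def discharge_coefficient(q: float, a: float, w: float, h1: float, g: float = G) -> float:
    """Invert Q = C_E * a * w * sqrt(2 g h1) for C_E."""
    return q / (a * w * math.sqrt(2.0 * g * h1))


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def rmse_percent(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Root mean square of the relative errors (sim - obs) / obs, in percent (assumed definition)."""
    rel_sq = [((s - o) / o) ** 2 for s, o in zip(sim, obs)]
    return 100.0 * math.sqrt(sum(rel_sq) / len(rel_sq))
```

Applied to paired lists of simulated and measured CE values, `pearson_r` and `rmse_percent` give the two summary numbers of the kind quoted for Figure 11.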
TEST CASE RESULTS

The described methods are illustrated by a test case example. The results of three modelling steps are discussed: the sluice model containing the system model (for water levels) plus the discharge model (Figures 4 and 6), the free-surface model (Figure 8) and analysis of vibrations and bed stability. Four tidal cycles and four discharge events were modelled for a discharge sluice with seven gates regulating a lake with constant river inflow. The goal of the computations is to determine the optimal number of gates to open and the best gate operation scenario.

Results of system and discharge model

Model parameters

• n = 7, m = 1, …, 7
• Alake = 1.9·10⁷ m²
• Qriver = 100 m³/s
• hlake(t = 0) = 6.1 m
• htarget = 6.0 m
• w = 22.5 m
• sill height: 3 m
• mean sea level = z0 + 6.1 m
• tidal amplitude = 0.60 m
• tidal period = 12.5 hours

The sluice model was run for 1 ≤ m ≤ 7. When opening only one gate, the target lake level could not be reached even when lifting the gate completely. When using two gates, the target level is reached, but the modular flow limit is exceeded for the greatest part of the discharge period. This results in unwanted transitions to intermediate and free flow with fluctuating discharges that are hard to control. For 3 ≤ m ≤ 7 strictly submerged flow exists and the target is met. Therefore, only these configurations are modelled further.

The plotted water levels (Figure 12) show that the lake level fluctuates in a controlled way and is nearly identical for the scenarios with and without discharge control. In Figure 13, the gate openings and achieved gate discharges in time are plotted for one tidal period for the situations with three or seven gates opened during the discharge event. Intermediate numbers of operated gates (4 ≤ m ≤ 6) lie between the shown curves for m = 3 and m = 7, but are not plotted for clarity. It can be seen that constant gate openings give discharges that vary in time following the time-dependent hydraulic head difference. In the PID-controlled scenario, the gate opening is automatically operated in such a way that the discharge stabilizes quickly after the start.
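To get a feel for the listed model parameters, the following sketch combines them in a simple level-pool mass balance. The sinusoidal tide and the explicit Euler water balance are assumptions made for illustration only; the governing equations of the system model are not given in this part of the paper, and the function names are hypothetical.

```python
import math

A_LAKE = 1.9e7               # lake surface area (m^2)
Q_RIVER = 100.0              # constant river inflow (m^3/s)
MEAN_SEA_LEVEL = 6.1         # mean sea level relative to z0 (m)
TIDE_AMPLITUDE = 0.60        # tidal amplitude (m)
TIDE_PERIOD = 12.5 * 3600.0  # tidal period (s)


def sea_level(t: float) -> float:
    """Sea level at time t (s), assuming a purely sinusoidal tide."""
    return MEAN_SEA_LEVEL + TIDE_AMPLITUDE * math.sin(2.0 * math.pi * t / TIDE_PERIOD)


def lake_step(h_lake: float, q_out: float, dt: float) -> float:
    """One explicit Euler step of the lake balance dh/dt = (Q_river - Q_out) / A_lake."""
    return h_lake + (Q_RIVER - q_out) * dt / A_LAKE


def closed_gate_rise(duration: float) -> float:
    """Lake level rise (m) over `duration` seconds if all gates stay closed."""
    return Q_RIVER * duration / A_LAKE
```

Under these assumptions the river inflow alone raises the lake by roughly 0.24 m per tidal period (`closed_gate_rise(TIDE_PERIOD)`), which indicates the volume each discharge event has to release on top of lowering the lake from 6.1 m to the 6.0 m target.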
Figure 12 | Results of sluice model for 3 ≤ m ≤ 7: sea and lake level for gate operation scenario with and without PID-controlled discharge. Vertical line indicates moment of maximum head difference.

In this multi-scale modelling approach, averaged values from the discharge model are used to improve discharge predictions at system scale. However, instantaneous discharges and gate openings computed in both models inevitably differ. The largest discrepancies are around 10%. This could be improved by examining different update methods, at the cost of longer computation time.

Three configurations are selected for evaluation by free-surface simulations. These cases are marked in Figure 13 as runs I, II and III. Runs I and III represent extremes: a constant gate opening with only three gates in use (high Q) and a controlled opening with all seven gates in use (low Q). All three runs are at the time of maximum head difference. In real-life practice, more cases could be selected for simulation depending on specific interests and available computing power.

Results of CFD simulations

To simulate the two selected runs I and II within the validated range, the levels and opening are scaled down with length scale 1:10, see Table 1. The near-gate flow velocities, pressures, TKE and dissipation are simulated. Figure 10 shows a plot of the simulated flow field of run II (at length scale 1:10) by indicating $\vec{u}$. The simulated free surface as expected sinks in the region directly downstream of the gate (solid line in Figure 10). In this case, the vena contracta is located at short distance downstream of the flow separation point. The separation between positive and negative horizontal velocities in the recirculation area is derived (dashed line in Figure 10). At a distance of around five times the downstream water level past the gate, the flow reattaches at the surface and the velocity starts to return to a more uniform profile.

Figure 14 shows plots of the pressure and TKE of run II. In the case shown in the plots, the equilibrium state reached in the simulations is fully steady. Pressure gradients are mild; the pressure returns smoothly to a hydrostatic shape as the streamlines become parallel downstream. The TKE reaches a maximum in the middle of the water column at about two times the downstream water depth
Figure 13 | Results of sluice model: gate openings (left) and achieved discharges per gate (right).
Table 1 | Values of selected CFD runs

Run | Gate configuration      | Length scale | h0 (m) | h1 (m) | h3 (m) | h4 (m) | Gate opening a (m) | Total discharge Qtot (m³/s) | Discharge per gate Qi (m³/s) | Discharge per gate per unit width qi (m²/s)
I   | m = 3, constant opening | 1:1          | 3.07   | 2.93   | 2.38   | 2.50   | 1.30               | 270                         | 90.1                         | 4.00
    |                         | 1:10         | 0.307  | 0.293  | 0.238  | 0.250  | 0.130              | 0.855                       | 0.285                        | 0.127
II  | m = 3, PID control      | 1:1          | 3.07   | 3.00   | 2.44   | 2.50   | 1.14               | 237                         | 79.02                        | 3.51
    |                         | 1:10         | 0.307  | 0.300  | 0.244  | 0.250  | 0.114              | 0.750                       | 0.250                        | 0.111
III | m = 7, PID control      | 1:1          | 3.07   | 3.06   | 2.49   | 2.50   | 0.610              | 237                         | 33.86                        | 1.51
    |                         | 1:10         | 0.307  | 0.306  | 0.249  | 0.250  | 0.0610             | 0.750                       | 0.107                        | 0.0476

▪ Input values for CFD runs. All water levels hi are relative to z = 0.
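The 1:10 rows of Table 1 can be reproduced from the 1:1 rows with Froude similarity. The text only states the length scale, so the scaling law is an assumption here; it does, however, match the tabulated values to within rounding. The function name is hypothetical.

```python
def froude_scale_factors(length_ratio: float) -> dict:
    """Scale factors (model / prototype) under Froude similarity for a given length ratio."""
    return {
        "length": length_ratio,                 # depths, gate opening (m)
        "velocity": length_ratio ** 0.5,        # velocities (m/s)
        "unit_discharge": length_ratio ** 1.5,  # discharge per unit width (m^2/s)
        "discharge": length_ratio ** 2.5,       # total or per-gate discharge (m^3/s)
    }


factors = froude_scale_factors(0.1)

# Run I of Table 1 at prototype scale (1:1)
q_tot_model = 270.0 * factors["discharge"]      # compare with 0.855 m^3/s in Table 1
q_unit_model = 4.00 * factors["unit_discharge"] # compare with 0.127 m^2/s in Table 1
a_model = 1.30 * factors["length"]              # compare with 0.130 m in Table 1
```

The same factors apply to runs II and III; for example 237 m³/s scales to about 0.75 m³/s.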
past the gate. Run I has a steeper surface behind the gate than run II (shown in Figures 10 and 14) and higher TKE levels, while run III has the lowest TKE levels and the most level surface downstream of the gate.

Results of flow analysis

The output of the CFD free-surface model is used for computing the values of the three flow parameters that were discussed in an earlier section, see Table 2.

Table 2 shows that the contraction coefficients do not differ much, which is expected for similar gate types. The velocity in the control section Uvc is highest for the situation with highest discharge per gate (run I) and lowest for the situation with smallest discharge per gate (run III). The same holds for the Froude number. This matches observations from the free surface curvatures of the final solution of the transient simulations.

The flow impact on the bed protection material is estimated by computing Ψ for two different α for the selected runs. The whole water depth d is used for averaging the square of the maximum local velocity term $(U + \alpha\sqrt{k})^2$. The results are plotted in Figure 15.

The plot shows that run I (three gates with constant opening) has the strongest flow impact on the bed material of the three runs irrespective of the choice for α. The Ψ-values of run II show that controlling the discharge without opening more gates already gives a lower flow impact on the bed. Run III (seven gates with controlled discharge) has the lowest flow impact. All runs reach their maximum flow impact on the bed around the same (limited) distance downstream of the gate. For all runs the general shape of the curves
Figure 14 | Pressure p in Pa (above) and turbulent kinetic energy k (TKE) in m²/s² (below) of run II.
Table 2 | Computed flow parameters derived from CFD model results

Run | Cc (–) | Uvc (m/s) | Fr (–)
I   | 0.88   | 3.56      | 0.83
II  | 0.86   | 3.50      | 0.78
III | 0.84   | 2.74      | 0.57
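The three flow parameters of Table 2 and the reduced velocity follow directly from the definitions in the Analysis section. The sketch below assumes a velocity profile sampled at ascending heights spanning 0 to Cc·a and uses the trapezoidal rule for the integral; the function names are hypothetical, and any value of the length scale L passed to `reduced_velocity` is an illustrative assumption, since L is not quoted in this section.

```python
import math
from typing import Sequence


def vena_contracta_velocity(z: Sequence[float], u: Sequence[float], cc: float, a: float) -> float:
    """U_vc = 1/(Cc*a) * integral of U(z) dz over [0, Cc*a], trapezoidal rule.

    z must be ascending and span 0 .. Cc*a; u holds the velocity magnitude at each z.
    """
    integral = 0.0
    for i in range(1, len(z)):
        integral += 0.5 * (u[i] + u[i - 1]) * (z[i] - z[i - 1])
    return integral / (cc * a)


def froude_number(u_vc: float, h2: float, g: float = 9.81) -> float:
    """Fr = U_vc / sqrt(g * h2)."""
    return u_vc / math.sqrt(g * h2)


def reduced_velocity(u_vc: float, f_gate: float, length: float) -> float:
    """Vr = U_vc / (f_gate * L), with f_gate in Hz and L a gate length scale in m."""
    return u_vc / (f_gate * length)
```

Evaluating `reduced_velocity` at both ends of an assumed frequency band gives a Vr range for each run, which can then be projected onto a response curve.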
is quite similar for both values of α, indicating that turbulence is dominant over mean velocity for the flow impact. Overall, the values of the bed stability parameter are somewhat low compared to previous numerical investigations by Erdbrink & Jongeling () and Erdbrink (), which could be attributed to the use of the standard k-epsilon model in this study instead of the RNG k-epsilon turbulence model used in the two mentioned studies. Choosing higher α values could compensate for the lower TKE. For practical application one should fix α after calibration in experimental investigations and define a threshold value for Ψ that should not be exceeded during operation and that can be used as a fitness measure for different flow scenarios.

Figure 15 | Computed values of bed stability parameter Ψ downstream of the gate for two different values of turbulence impact parameter α. Runs I, II and III are shown.

Turning to the assessment of gate vibrations, it is calculated that for an assumed range of structural response frequencies of 2–5 Hz (typical values for large hydraulic gates), the reduced velocity number Vr lies in the range 3.5–8.5 for runs I and II and in the range 2.5–6 for run III. For illustration purposes, a response curve is devised, see Figure 16, since a full evaluation is laborious (e.g. Billeter & Staubli ). Projection of the Vr-values onto the response curve gives the resulting vibration amplitudes.

Two different response curves are used in Figure 16. The most significant excitation of cross-flow vibrations of a flat-bottom gate occurs at small gate openings, therefore higher amplitudes are expected for run III. In this fictitious instance, the computed Vr-ranges indeed give higher relative amplitudes for run II than for the other two runs. As with the bed protection assessment, the definition of a literature-based threshold level would be a logical addition for real applications.

Figure 16 | Gate vibration response for runs I–III giving relative amplitude A / Amax as a function of reduced velocity Vr. Fictitious response curves are used to illustrate the method. Two regions of gate openings a are distinguished.

Based on the discussed modelling results and flow analysis, it may be decided to implement the discharge scenario of run II, because it leads to acceptable vibration levels and gives a lower impact on the bed material than run I, while still ensuring sufficient discharge volume to reach the target lake level.

RECOMMENDATIONS

As a main recommendation, we propose to apply this modelling process in a case study of existing barrier structures such as Haringvliet, Oosterschelde, Maeslantkering in The Netherlands or the Saint Petersburg barrier in Russia. This research should find a natural place within on-going work on system-scale modelling for water level prediction used in decision support for hydraulic structures (Boukhanovsky & Ivanov ). Specific gate uses are to be simulated and evaluated. For the last two barriers just mentioned, the operational modelling system will be mostly aimed at widening the window of
operation. The introduced methods can also be adapted for weirs in rivers. Coupling the presented models with a mid-field or far-field model of regional scale would enable an operational impact assessment for water management issues such as salt water intrusion.

The inclusion of measurement data (from field sensors or laboratory tests) is necessary for the calibration of empirical parameters (such as entrance and exit losses), for the process of model validation and for providing actual model input (water levels). Experiences from the field of hydroinformatics should be added to the present research to make the extension towards data-driven modelling components. The link with data assimilation that is to be accommodated by the higher-level models is obvious.

A longstanding issue in the engineering practice of detailed hydrodynamics is turbulence modelling. The right balance between accuracy and computational costs needs to be found for specific applications. Again, smart use of measurement data for numerical validation and calibration could be the key. It is furthermore expected that intermediate and free flow conditions where hydraulic jumps occur away from the gate, including the Venturi flow type, can be modelled more universally using other numerical methods such as Phase Field or Volume of Fluid. If needed, the model can thus be extended to account for dynamic effects directly related to opening and closing actions of the gates. Active setpoint ramping of the PID-control using feed-forward model predictions is another recommendation related to this.

CONCLUSIONS AND FUTURE WORK

The purpose of the current study was to set up physics-based modelling methods for a flow-centred operation of gates of hydraulic structures. The described case of a multi-gated outlet barrier sluice has shown how discharge estimates and free-surface simulations can aid in deciding on optimal gate configuration and opening scenarios.

The application of a PID-controller to achieve a more constant discharge during changing head differences emerged as a feasible addition to traditional structure operation. Prediction of gate discharge coefficients is a central issue in determining appropriate gate openings in the control process. A combination of elementary equations and empirical relations was used for this. The increase in computational power over the years now enables solving these flow equations in quick assessment procedures during operation.

Free-surface CFD simulations of the turbulent flow past an underflow gate revealed the effects of local lowering of the surface on flow velocities and TKE levels. Time-dependent FEM simulations with a moving mesh technique were found to give stable solutions of the free surface under submerged conditions. From a series of validation runs it is concluded that the free-surface model yields discharge values for a range of gate openings and submergence levels within an acceptable accuracy of experimental values.

Among the flow analysis possibilities based on output from the free-surface model is computation of the Froude number, the reduced velocity parameter for estimating gate vibrations and a stability parameter for granular bed protection. The numerical example of the discharge sluice has proved the feasibility of combining discharge estimates with free-surface simulations for deriving operational decisions. For the particular case treated in this paper, it was found that lower TKE levels of the PID-controlled discharge scenarios contribute significantly to reducing the flow attack on the bed protection. Additionally, the model showed the influence of the number of opened gates on the flow properties.

The practical benefits of including near-field flow modelling in gate control systems seem clear. It will enable more sophisticated water reservoir management in everyday operation with respect to issues such as salt water intrusion, fish migration and possibly saving energy. In extraordinary situations, model results can help maintain safe gate usage and prevent gate vibrations, washing away of bed protection and the development of scour holes around the structure.

Limitations of the followed modelling approach need to be addressed in follow-up studies. Additional calibrations are necessary: PID-control optimization to obtain the desired discharge more precisely, discharge and loss coefficients in the flow equations and turbulence model parameters. Next to this, improvements to the free-surface model should broaden the range of applicability so that steeper surface disruptions and hydraulic jumps as found in free flows can be captured as well.
The physics-based model of this study is logically complemented by data-driven techniques in future studies. It is believed that hydroinformatics provides the required tools for this. Use of sensor data from real-life structures and coupling to system-scale water level prediction models are seen as next steps. Moreover, it should be investigated how operational decisions should be derived when taking into account the various criteria and flow constraints.
ACKNOWLEDGEMENTS

This work was supported by the EU FP7 project UrbanFlood, grant N 248767; by the Leading Scientist Program of the Russian Federation, contract 11.G34.31.0019 and by the BiG Grid project BG-020-10, #2010/01550/NCF with financial support from The Netherlands Organisation for Scientific Research NWO. It is carried out in collaboration with Deltares.
First received 15 November 2012; accepted in revised form 28 June 2013. Available online 5 August 2013
© IWA Publishing 2014 Journal of Hydroinformatics | 16.1 | 2014
Implementation of pressure reduction valves in a dynamic water distribution numerical model to control the inequality in water supply
Gabriele Freni, Mauro De Marchis and Enrico Napoli
ABSTRACT
The analysis of water distribution networks has to take into account the variability of users’ water demand and the variability of network boundary conditions. In complex systems, e.g. those characterized by the presence of local private tanks and intermittent distribution, this variability suggests the use of dynamic models that are able to evaluate the rapid variability of pressures and flows in the network. The dynamic behavior of the network also affects the performance of valves that are used for controlling the network. Pressure reduction valves (PRVs) are used for controlling pressure and reducing leakages. Highly variable demands can produce significant fluctuation of the PRV set point, causing related transient phenomena that propagate through the network and may result in water quality problems, unequal distribution of resources among users, and premature wear of the pipe infrastructure. A model was developed in previous studies, and an additional module for pressure control was implemented that is able to analyze PRVs in a fully dynamic numerical framework. The model was demonstrated to be robust and reliable in the implementation of pressure management areas in the network. The model was applied to a district of the Palermo network (Italy). The district was monitored, and pressure as well as flow data were available for model calibration.

Key words | dynamic model, intermittent distribution, method of characteristics, pipe-filling process, PRVs, water distribution network modeling

Gabriele Freni (corresponding author), Mauro De Marchis
Università di Enna ‘Kore’, Facoltà di Ingegneria, Architettura e Scienze Motorie, Cittadella Universitaria, I-94100, Enna, Italy
E-mail: gabriele.freni@unikore.it
Enrico Napoli
Università di Palermo, Dipartimento di Ingegneria Civile, Ambientale ed Aerospaziale, Viale delle Scienze, I-90128, Palermo, Italy
INTRODUCTION
The distribution of water resources can be made through two different delivery methods: continuous or intermittent distribution. Continuous distribution ensures better management of the water network because the water demand depends only on user requests and the service quality can be better guaranteed. In a water scarcity condition, an intermittent system is used by the management authority for rationing the available water volume, for reducing real losses and/or for controlling consumption (Fontanazza et al. ).

Due to several detrimental aspects, this approach should only be applied if no other management choices are available. Despite this, it is broadly adopted not only in developing countries (Hardoy et al. ) but also in developed ones (Cubillo ). In a water scarcity condition, this practice reduces the background water losses with little financial effort (Criminisi et al. ). Despite this, when the practice of intermittent supply is protracted over time, the effect can be the opposite. Due to the water hammer induced by the filling process (De Marchis et al. ), a deterioration of the pipes occurs, thus increasing the burst rate and the leakages, and preventing achievement of one of the main objectives of the intermittent supply. Furthermore, discontinuous distribution presents several critical aspects, such as users’ inequality in access to water resources and the presence of filling and emptying transient phenomena affecting the mechanical stability of the pipes, the durability of the network and water losses (Vairavamoorthy et al. ).
doi: 10.2166/hydro.2013.032
Impact on water quality can be equally relevant because empty water pipes can be exposed to ingress of soil particles and contaminated water from the surrounding soil through leak openings. This means that the water quality integrity of the system is compromised and that users cannot be guaranteed a safe supply (National Research Council ).

Users try to adapt to intermittent distribution by installing local tanks, in order to collect water when the distribution service is available and use it when the service is suspended (Arregui et al. ). Tanks are often oversized with respect to the users’ real needs, and their presence makes the network work in conditions that are quite far from the design ones: flows in the lower parts of the network are much higher than the design values until the tanks are full and water resources can reach the tanks in the disadvantaged areas of the network; pressure on the network is generally lower than the design value and is controlled by the levels in the tanks (Giustolisi et al. ).

This configuration of the system reduces the applicability of common steady state models, because the private tank filling process creates continuous change in the hydraulic network behavior. To follow this constant change in network state variables, dynamic and pressure driven models are needed. Considering this aim, Giustolisi () presented an extension of the pressure-driven analysis using a global gradient algorithm (Todini ; Giustolisi et al. a, b) permitting the effective introduction of the lumped nodal demand while preserving the energy balance by means of a pipe hydraulic resistance correction. The model allowed the simulation of private tanks, but tools for the regulation and control of network pressures could not be modeled.

Pressure control is one of the main technical options that a water manager can put in place to reduce the inequalities among users in such complex cases. Nevertheless, the low pressures and the complex and dynamic hydraulic behavior of the system with private tanks prevent a simple analysis of the effect of pressure control devices such as pressure reduction valves (PRVs) and pumps. Hydraulically controlled PRVs maintain a specified outlet pressure, irrespective of a higher fluctuating inlet pressure, and they are often implemented by dividing the network into districts (Pressure Management Areas – PMAs). In intermittent networks, they may control pressures (and indirectly flows) in the advantaged parts of the network, reducing the inequalities in water resource access among users.

Pressure transients caused by the combined behavior of a network and PRV propagate through a PMA and result in water supply problems, a higher number of pipe bursts, and premature wear of the pipe infrastructure. Since it is impossible to eliminate demand changes from a network, it is important to control PRVs appropriately to minimize their impact on the system. The interaction between automatic control valves and transients has been investigated in several publications. Bergant et al. () investigated the effect of valve closing time on the transient response in a pipeline and compared measured data with a simulation model. The effect of automatic control valves in a real pipe network was shown by Brunone & Morelli (), and used to estimate the friction in a transient model. A model for analysis and control of PRVs was implemented by Prescott & Ulanicki (), using dynamic formulations and experimental analysis, but the model was not integrated with a dynamic network modeling approach.

The analysis of the network during the filling process was carried out with a dynamic model, assuming that the air pressure inside the network is always equal to the atmospheric one and that the water column cannot be fragmented (De Marchis et al. ). A demand model based on the node pressure-consumption law, defining the flow drawn from the network and filling the tank, was previously integrated into the network model (De Marchis et al. ). In the present paper, a PRV module was integrated in the network model, following the dynamic approach proposed in Prescott & Ulanicki (), obtaining a fully dynamic model of the network filling process in the presence of PRVs.

The model was calibrated and applied for the implementation of PMAs in one of the distribution networks of Palermo (Italy). The research proposed here starts from the preliminary finding presented by Freni et al. ().

METHODOLOGY
In this section, the numerical model and the case study are presented. The model description is divided in two parts: the discussion of the network hydrodynamic model that was previously presented in De Marchis et al. () and the
detailed description of the PRV valve that was implemented and integrated in the present study.

The network model
In the proposed numerical model the transient in pipes is simulated using fast elasticity-demand pressure waves. In fact, the initial velocity of the water front, inside a previously empty pipe, can be quite large since the pressure gradient is relatively high due to the rapid change in pressure, which can be considered atmospheric at the water front. In water distribution networks, where the pipes are initially empty, different filling cases occur and must be simulated by the numerical models. The proposed numerical model is able to simulate the following cases, shown in Figure 1. The first empty pipeline is connected to the network reservoirs and the filling of the network starts after the opening of the gates (Figure 1(a)). As the water front reaches one of the users’ connections, tanks start to fill, with a discharge that depends on the geometric and hydraulic features of the diversion as well as on the pressure at the derivation point (Figure 1(b)). When the water front reaches the end of a pipeline (Figure 1(c)), water begins to flow inside the pipelines connected to it; the pressure inside the filled pipeline generally continues to increase until a steady-state condition is reached. Since water distribution networks are generally looped to increase system reliability, both ends of a pipeline start to fill during the filling; as a consequence, two water fronts proceed inside the pipe from opposite directions (Figure 1(d)). Once they reach the same cross-section, the subsequent collision can cause an increase in pressure that is a function of the propagation velocity of the water front. The numerical model, using the method of characteristics, is able to take into account these relatively small water hammers.

Figure 1 | Hydraulic schematics of the network filling process: (a) initial phase of water front propagation; (b) water front reaches a user connection; (c) water front reaches the end of the pipeline; (d) two water fronts proceed inside the pipe in opposite directions.

Because of the complexity of the system, determined by the various possible filling conditions that may occur, it is necessary to make some simplifying assumptions. Based on the study conducted by Liou & Hunt (), it is assumed that the air pressure at the water front is always atmospheric and the wave-fronts are always perpendicular to the pipe axis and coincident with the cross-sections. For detailed discussion of the above hypothesis, see De Marchis et al. (, ).

In this paper, the solution of hydraulic equations has been carried out by means of the Method of Characteristics (MOC), starting from the condition of an empty network. The one-dimensional unsteady flow of the compressible liquid in the elastic pipe is described by the following system of equations:

$$ g \frac{\partial h}{\partial s} + V \frac{\partial V}{\partial s} + \frac{\partial V}{\partial t} + gJ + \frac{g}{c} V \sin\vartheta = 0 \qquad (1) $$

$$ \frac{g}{c} V \frac{\partial h}{\partial s} + c \frac{\partial V}{\partial s} + \frac{g}{c} \frac{\partial h}{\partial t} = 0 \qquad (2) $$

where t is the time, V is the velocity averaged over the pipe cross-section, h is the water head, g is the
acceleration due to gravity, c is the celerity of pressure waves, $\vartheta$ is the slope of the pipeline, while $J = J_s + J_u$ represents the head loss per unit length due to steady and unsteady friction, respectively. The steady friction contribution is calculated according to the classical Darcy–Weisbach equation:

$$ J_s = \frac{f}{D} \frac{V|V|}{2g} \qquad (3) $$

where f is the Darcy–Weisbach friction factor, calculated dynamically at each time step. On the other hand, $J_u$, according to the formulation of Brunone et al. (), later modified by Vítkovský et al. (), can be calculated according to:

$$ J_u = \frac{k}{g} \left( \frac{\partial V}{\partial t} + c\,\phi_A \frac{\partial V}{\partial s} \right) \qquad (4) $$

where k is a coefficient obtained dynamically as a function of the flow regime, as will be shown in the following, while $\phi_A$ is a coefficient depending on the sign of the convective acceleration. Specifically, $\phi_A = +1$ if $V(\partial V/\partial s) \ge 0$, and $-1$ if $V(\partial V/\partial s) < 0$. Introducing Equation (4) into Equation (1) and applying the MOC, the continuity and momentum partial differential equations can be transformed into ordinary differential equations, known as compatibility equations:

$$ (1 + k) \frac{dV}{dt} + \alpha \frac{g}{c} \frac{dh}{dt} + g J_s + \alpha \frac{g}{c} V \sin\theta = 0 \qquad (5) $$

$$ (1 + k) \frac{dV}{dt} - \beta \frac{g}{c} \frac{dh}{dt} + g J_s - \beta \frac{g}{c} V \sin\theta = 0 \qquad (6) $$

where $\alpha$ and $\beta$ are $(k + 2 - k\phi_A)/2$ and $(k + 2 + k\phi_A)/2$, respectively.

The compatibility equations are valid along the proper positive and negative characteristic lines that, introducing the unsteady friction model, read:

$$ C^{+}: \quad \frac{ds}{dt} = +\frac{c}{\alpha} \qquad (7) $$

$$ C^{-}: \quad \frac{ds}{dt} = -\frac{c}{\beta} \qquad (8) $$

In the proposed numerical model, the coefficient k was calculated at each time step following the Vardy & Brown () formulation, given by:

$$ k = \frac{\sqrt{c^{*}}}{g} \qquad (9) $$

with

$$ c^{*} = \begin{cases} 0.0476 & Re < 2500 \\ \dfrac{7.41}{Re^{\log (14.3 / Re^{0.05})}} & Re > 2500 \end{cases} \qquad (10) $$

In order to study the transient flow in the water distribution network, the MOC is combined with the proper boundary conditions. A constant water head is imposed at all the reservoirs feeding the network, thus water levels remain constant during the filling process. Consistent with the assumption of atmospheric air pressure in the pipeline network, the water head at the front face of partially filled pipes is equal to zero.

Equations (5) and (6) can be solved through the finite difference technique. Following the notation used in Figure 2, these equations read:

$$ h_j^{i,n+1} - h_{jm}^{i,n} + \frac{1 + k}{\alpha} \frac{c}{g} \left( V_j^{i,n+1} - V_{jm}^{i,n} \right) + \left[ \frac{c}{\alpha} J_{s,jm}^{i,n} + V_{jm}^{i,n} \sin\theta_i \right] \Delta t_i = 0 \qquad (11) $$

$$ h_j^{i,n+1} - h_{jv}^{i,n} - \frac{1 + k}{\beta} \frac{c}{g} \left( V_j^{i,n+1} - V_{jv}^{i,n} \right) - \left[ \frac{c}{\beta} J_{s,jv}^{i,n} - V_{jv}^{i,n} \sin\theta_i \right] \Delta t_i = 0 \qquad (12) $$

where $V_j^{i,n+1}$ and $h_j^{i,n+1}$ are the velocity and the water head in the j-th section (of abscissa $(j-1)L_i/N_i$) of the i-th pipe at the time step $t^n + \Delta t$; $\theta_i$ is the slope of the i-th pipe; and jm and jv are the sections upstream and downstream of the j-th section, respectively.

The time step advancement $\Delta t_i^n$, a function of the length and of the celerity of the i-th pipe, is calculated for each
pipe and then the minimum value is chosen as the unique integration time step:

$$ \Delta t^n = \min_i \Delta t_i^n = \min_i \left( \frac{L_i^n}{N_i c_i} \right) \qquad (13) $$

Figure 2 | Space-time scheme for two pipes of different diameter with the same number of sections N and different time step $\Delta t_i$.

When the velocity of the water front $V_j^{i,n+1}$ is calculated, the filling process is updated according to:

$$ L_i^{n+1} = L_i^n + V_N^{i,n+1} \Delta t \qquad (14) $$

where $L_i^{n+1}$ is the length of the water column inside the partially empty i-th pipeline at the time $t^{n+1}$.

The compatibility equations for the pipelines connected to the node are solved together with the continuity equation at each junction node, and the discharge provided to user tanks is calculated as a function of the water head. Specifically, the discharge $Q_{j,up}$ at the j-th node entering the tank connected to the node can be obtained as:

$$ Q_{j,up} = C_v\, a \sqrt{2g \left( h_j^i - h_{j,tank}^i \right)} \qquad (15) $$

where $C_v$ is the non-dimensional float valve emitter coefficient, a is the valve effective discharge area, g is the gravity acceleration, $h_j^i$ is the water head at the j-th node and $h_{j,tank}^i$ is the height of the private tank. Although more complex methods were considered in the past to relate coefficients $C_v$ and a to valve-opening rates, here constant values were used for both of the coefficients (Criminisi et al. ), which have been calibrated experimentally, as discussed in the following paragraphs. Equation (15) can be used to calculate the discharge at nodes only when the floating valve is open, i.e., while the user tank is not entirely filled. Thus, this equation must be combined with the tank continuity equation, which can be written as:

$$ \begin{cases} Q_{j,up} - D_j = \dfrac{dW_j}{dt} = A \dfrac{dH_j}{dt} & \text{for } H_j < H_{j,max} \\ Q_{j,up} = 0 & \text{for } H_j \ge H_{j,max} \end{cases} \qquad (16) $$

where $D_j$ is the user water demand at the j-th node, $W_j$ is the volume of the storage tank connected to the node having area A, $H_j$ is the tank water level, and $H_{j,max}$ is the maximum allowed water level in the tank (before the floating valve closes). Further details on the numerical model can be found in De Marchis et al. (, ).

The valve model
The dynamic analysis of PRVs follows the formulation and experimental analysis provided by Prescott & Ulanicki (, ), in which the derivative of the opening of the valve $\chi_m$ is proportional to the difference between a given set point $h_{set}$ and the current outlet pressure $h_{out}$ (Figure 3). The PRVs are located at some nodes of the network. Assuming an initial value of $\chi_m = \chi_{m0}$, the valve capacity $C_v$ is calculated using the following equation:

$$ C_v = 0.021 - 0.0296\, e^{-51.1 \chi_{m0}} + 0.0109\, e^{-261 \chi_{m0}} - 0.0032\, e^{-683.2 \chi_{m0}} + 0.0009\, e^{-399.5 \chi_{m0}} \qquad (17) $$
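As a rough illustration, the capacity curve of Equation (17) can be evaluated directly. The following is a minimal Python sketch, not the authors' implementation: the function name `valve_capacity` is hypothetical, the coefficients are those printed in Equation (17), and the opening is assumed to be given in the units for which the curve was fitted, which this section does not state.

```python
import math

# Coefficients and exponents as printed in Equation (17).
# Each term has the form a * exp(-b * chi_m); the leading constant uses b = 0.
_CV_TERMS = (
    (0.021, 0.0),
    (-0.0296, 51.1),
    (0.0109, 261.0),
    (-0.0032, 683.2),
    (0.0009, 399.5),
)


def valve_capacity(chi_m: float) -> float:
    """Evaluate the PRV capacity C_v for a valve opening chi_m (Equation (17))."""
    if chi_m < 0.0:
        raise ValueError("valve opening must be non-negative")
    return sum(a * math.exp(-b * chi_m) for a, b in _CV_TERMS)


if __name__ == "__main__":
    # Illustrative openings only; they are not taken from the paper.
    for chi in (0.0, 0.005, 0.01, 0.02):
        print(f"chi_m = {chi:.3f} -> C_v = {valve_capacity(chi):.5f}")
```

With these coefficients the five terms cancel at a zero opening, so a fully closed valve has zero capacity, and the capacity tends to 0.021 as the exponentials decay for larger openings.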
So, knowing the incoming flow ($q_m$) and the inlet head ($h_{in}$) of the PRV, the PRV outlet head ($h_{out}$) can be determined with the equation:

$$ h_{out} = h_{in} - \frac{q_m^2}{\left[ C_v(\chi_{m0}) \right]^2} \qquad (18) $$

Equation (18) relates the flow through the PRV to the head loss across it and the opening. Valve opening and closing are controlled by the pilot circuit allowing a flow $q_3$ to fill the valve control space.

Figure 3 | The PRV scheme and governing flows.

The inflow to the control space of the PRV is dependent on the difference between the outlet head and the valve setting. The valve is characterized by two parameters $\alpha_{open}$ and $\alpha_{close}$, fixed to $1.1 \times 10^{-6}$ and $10 \times 10^{-6}$ m²/s respectively in the present study, following the experimental results of Prescott & Ulanicki (). The two parameters determine the opening and closing celerity of the valve and, at the same time, its sensitivity and reactivity to pressure fluctuation. The inflow $q_3$ to the control space is calculated as follows:

$$ q_3 = \begin{cases} \alpha_{open} (h_{set} - h_{out}) & \text{if } \dot{\chi}_m \ge 0 \\ \alpha_{close} (h_{set} - h_{out}) & \text{if } \dot{\chi}_m < 0 \end{cases} \qquad (19) $$

where

$$ \dot{\chi}_m = \frac{d\chi_m}{dt} \qquad (20) $$

Another function of valve opening ($\chi_m$) is the cross-sectional area $A_{cs}$ of the control space, which is determined using the equation:

$$ A_{cs} = \frac{1}{3700\,(0.02732 - \chi_{m0})} \qquad (21) $$

Calculating $q_3$ with Equation (19), a new value $\chi_m$ can be estimated, representing the adaptation of valve opening to seek the required set point $h_{set}$:

$$ \chi_m = \frac{q_3\, \Delta t}{A_{cs}(\chi_{m0})} + \chi_{m0} \qquad (22) $$

The system of Equations (19)–(22) requires an iterative solution because the inflow $q_3$ depends on $h_{out}$ by means of Equation (19), $h_{out}$ depends on the valve opening $\chi_m$, and $\chi_m$ in turn depends on $q_3$ according to Equation (22). The system has to be solved by making an initial hypothesis on valve opening $\chi_{m0}$ and then solving the equations in order until a new value of $\chi_m$ is obtained in Equation (22). The iterations are continued until the difference between $q_{3,i}$ (with i being the i-th iteration) and $q_{3,i-1}$ is less than an established tolerance, which was assumed equal to 0.1% in the present study.

The case study
The model has been applied on one of the 17 distribution networks of Palermo city (Sicily). The network is fed by two tanks at different levels, that can store up about 40,000 m³ per day, and supply around 35,000 inhabitants (8,700 users).
It has been designed to deliver about 400 l/capita/d, but the actual mean consumption is about 260 l/capita/d. Pipes are made of polyethylene and their diameters range from 110 to 225 mm (Figure 4). Additional details on the analyzed network can be found in De Marchis et al. ().

The system is supplied on a daily basis because the high level of leakages (around 25%) does not allow the manager to supply the network continuously with the available water resources. Considering that intermittent supply was historically a common practice, especially during summer, all the users are supplied via tanks having volume equivalent to two days’ consumption. In the present paper, the aim was the evaluation of the impact of intermittent supply on water resources distribution among users. For this reason, leakages were simply divided in proportion to the node demand because exact positions of leakages were not known.

The system is monitored by six pressure cells and two electromagnetic flow meters (Figure 4). Data have been provided on an hourly basis almost continuously since 2001, and the network hydraulic model calibration is continuously updated when new data become available (Criminisi et al. ). The pressure data used for model calibration have a time resolution of five minutes and were taken from the period between June and October 2002, during which the network was managed by intermittent supply on a daily basis. The pressure time series were available at each of the six pressure gauges and used to represent the filling and the emptying processes. For the same period, flow data entering the network were available with the same temporal resolution.

The current configuration of the network is characterized by significant inequality in the distribution of water resources during intermittent supply. As demonstrated in De Marchis et al. (), the users in the lower part of the network, characterized by the lowest geodetic elevation, can access water resources soon after the beginning of a service period and they are able to fill up their tanks that were emptied in the period of service unavailability. At the same time, the users in the upper part of the network have to wait for the advantaged users to collect water resources and for pressure over the network to rise in order to begin filling their tanks. As discussed in the introduction, the definition of PMAs can help in the reduction of inequalities among advantaged and disadvantaged users. In the present study, two configurations were considered, dividing the district into two and four PMAs. In Scenario A, the network was divided into two approximately equal parts (Figure 5(a)). In Scenario B, the network was divided into four PMAs, increasing the number of valves introduced in the system and the number of closed pipes (Figure 5(b)). The two configurations were chosen based on the original design of the network, in which the district is divided into four areas that can be isolated for maintenance purposes. In the present application, some of the existing static section valves are simply substituted by PRVs.
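The proportional division of leakage among nodes mentioned in the case study description can be sketched as follows. This is a minimal Python illustration rather than the authors' code: the function name `allocate_leakage`, the node identifiers and the demand values are hypothetical, and the total leakage is assumed to be a known flow that is split in proportion to each node's demand.

```python
from typing import Dict


def allocate_leakage(node_demand: Dict[str, float], total_leakage: float) -> Dict[str, float]:
    """Split a total leakage flow among nodes in proportion to their demand."""
    total_demand = sum(node_demand.values())
    if total_demand <= 0.0:
        raise ValueError("total node demand must be positive")
    return {
        node: total_leakage * demand / total_demand
        for node, demand in node_demand.items()
    }


if __name__ == "__main__":
    # Hypothetical demands (l/s) at three nodes; values are illustrative only.
    demands = {"n1": 2.0, "n2": 1.0, "n3": 1.0}
    # Assumption: leakage equals 25% of the supplied flow, where
    # supplied flow = delivered demand + leakage.
    supplied = sum(demands.values()) / (1.0 - 0.25)
    leakage = 0.25 * supplied
    print(allocate_leakage(demands, leakage))
```

Under this assumption a node carrying half of the total demand also receives half of the total leakage, regardless of where the leaks physically occur.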
Figure 4 | Case study network scheme.

ANALYSIS OF RESULTS
The model was initially calibrated according to the pressure profiles available during the monitoring period from the six pressure gauges located in the network. The results, not shown here, can be found in De Marchis et al. (), where a section is dedicated to the model calibration for the same case study. The presented model was used to analyze and compare different configurations of PMAs in order to reduce inequalities between users accessing and collecting water resources, considering the relevant role of private tanks. In the analysis of results, Scenario 0 (i.e. the current situation with one network district with no pressure control) was compared with the two proposed PMA scenarios. The effectiveness of district definition was evaluated by means of pressure levels in the network and by means of the water
214
Figure 5
G. Freni et al.
|
|
Implementation of pressure reduction valves in a dynamic water network model
Journal of Hydroinformatics
|
16.1
|
2014
Position of the PRVs and closed pipes on network mains: Scenario A (a) and Scenario B (b). The blue lines define the boundaries of the PMAs. Please refer to the online version of this paper to see this figure in colour: http://www.iwaponline.com/jh/toc.htm.
volume supplied to the users at different moments of the service day. In all the scenarios, the simulation starts with the reactivation of service during intermittent distribution on a daily basis. Because of user water consumption during the day before, at the beginning of the simulation, all the private tanks are almost empty and their supply valves are fully open.

Figure 6 shows the comparison between the water head variation in time obtained in the three different scenarios analyzed here. The first 7 h of the dynamic filling process are analyzed in four different nodes of the network. Specifically, the pressures at nodes 42, 109, 165 and 249 were plotted. These nodes were chosen to be representative of the effects of the PRVs in the different PMAs, as can be observed in Figure 5, where the nodes are marked for clarity. Figure 6 shows pressure levels in four nodes of the network: initially the pressure is null and the pipes are empty. The time taken for the filling process is different for each of the four nodes monitored. In the disadvantaged node (Figure 6(c)), the transient period of the filling process can be protracted for almost 1 h from the beginning of the simulation. For details about these inequities, see De Marchis et al. (). The static level of the supply tank is equal to 48 m above mean sea level.

Figure 6(a) shows that, when the PRVs are activated (Scenarios A and B) in the upper part of the network, an increase of the pressure is achieved, with respect to Scenario 0. Furthermore, the increase of the pressure is higher in Scenario B, where several PRVs were activated to reduce the inequality between the users. On the other hand, Figure 6(b) shows the reduction of the pressure at node 109 located in the lower part of the network. Also at this node, in Scenario B the water head reduction is higher than that registered when two PRVs were activated (Scenario A). Because a pressure driven model is used to calculate the discharge entering the users’ tanks, the increase of the pressure in the disadvantaged nodes and the reduction in the advantaged ones reduce the inequalities in the water supply. Figure 6(c) shows that in the most disadvantaged nodes, located either in the highest part of the district or in the part of the network farthest from the inlet node, it is necessary to divide the district into four PMAs in order to reduce the inequalities. The water heads obtained in Scenario A, in fact, are equal to those in Scenario 0. Finally, Figure 6(d) shows that in the nodes located near the inlet node the three scenario profiles are very similar, with negligible differences in water head distribution.

Figure 7 shows pressure levels in the network after 3 h in the three selected scenarios. The separation between the different PMAs is clear and the average pressure in the network progressively decreases by implementing two and four different districts: in Scenario 0, the average pressure head is 22.1 m, decreasing to 21.4 m in Scenario A and 20.7 m in Scenario B. More interestingly, the standard deviation drops from 6.5 m in Scenario 0 to 5.4 m and 4.6 m in Scenario A and Scenario B, respectively. This fact confirms a more uniform distribution of pressures over the network and, considering that the majority of the uses are head driven because of private tanks, a more uniform distribution of resources.

Figure 6 | Pressure level variation in time in four nodes: (a) 42; (b) 109; (c) 165; (d) 249. In panels (a) and (b) the horizontal line represents the static level of water supply equal to 48 m.

Figure 7 | Pressure levels in the network after 3 hours: (a) Scenario 0; (b) Scenario A; (c) Scenario B.

Figure 8 | Volume supplied to the users after 5 hours: (a) Scenario 0; (b) Scenario A; (c) Scenario B.

This consideration is confirmed by looking at water head after 3 h (Figure 7) and at supplied water volumes after 5 h (Figure 8). The percentage of users able to collect the totality of their daily demand after 3 h drops from 14% to 10% and
4% in Scenarios A and B. After 5 h in Scenario 0, one quarter of the users’ tanks were filled while only 20% and 13% have completed their supply in the two PMA scenarios. The implementation of PMAs has a more relevant impact on users unable to be supplied: after three hours, 45% of users are unable to be supplied in Scenario 0 and this number is reduced to 38% and 29% in Scenarios A and B, respectively; after five hours, the number of non-supplied users is still high (39%) while it is reduced to 29% and 18% in the two PMA scenarios.

CONCLUSIONS

In the study, a dynamic mathematical model for intermittent networks was integrated with a PRV model in order to simulate management actions for reducing inequalities between users in their access to water resources. The model was demonstrated to be robust and to correctly represent the application of several valves in the network showing the impact of such choices on network pressure and on water supply distribution. From a practical perspective, the creation of PMAs has a relevant impact on intermittent networks helping the reduction of inequalities between users accessing and collecting water resources. The presence of private tanks helps advantaged users to collect as much water as possible in a few hours after the restoration of service; at the same time, several users are unable to collect water because pressure in the network is too low. The introduction of PMAs mitigates this problem by reducing the differences of pressure between different points of the network. The introduction of the valves reduces the differences between water collected by users in the first part of the service day even if inequalities still remain. The analysis demonstrated that PMAs can help move towards having equal distribution of water resources during intermittent service but further analyses are needed to implement an optimal distribution of valves in order to reduce the different distribution of water supply between users. The impact of valves on the network is not easily predictable without the use of dynamic models as the presented analysis has demonstrated. Some parts of the network are unaffected by the presence of the valves because they are dominated by the proximity of network inlets. The introduction of valves has a pervasive impact on the network, cutting pressure downstream of the valve, but also increasing pressures in the upper part of the network due to the compensation of flow distribution in the network.
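The scenario comparison in the analysis of results rests on two kinds of summary indicator: the mean and standard deviation of nodal pressure head, and the share of users whose tanks are fully supplied or not supplied at all at a given time. A minimal sketch of how such indicators might be computed from simulation output is given below. The function names and the input values are hypothetical and purely illustrative; they are not data from the study, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def pressure_summary(heads_m):
    """Return (mean, population standard deviation) of nodal pressure heads in metres."""
    return mean(heads_m), pstdev(heads_m)


def supply_shares(fractions_filled):
    """Return (% of users fully supplied, % of users not supplied at all).

    Each entry is the fraction of a user's daily demand collected so far:
    1.0 means the tank is full, 0.0 means no water has been collected.
    """
    n = len(fractions_filled)
    full = sum(1 for f in fractions_filled if f >= 1.0)
    none = sum(1 for f in fractions_filled if f <= 0.0)
    return 100.0 * full / n, 100.0 * none / n


# Illustrative (made-up) nodal heads for an uncontrolled and a PMA scenario.
heads_uncontrolled = [10.0, 20.0, 30.0, 28.0]
heads_with_pmas = [16.0, 20.0, 24.0, 22.0]

for label, heads in (("no PMAs", heads_uncontrolled), ("with PMAs", heads_with_pmas)):
    avg, std = pressure_summary(heads)
    print(f"{label}: mean head {avg:.1f} m, standard deviation {std:.1f} m")

print(supply_shares([1.0, 0.5, 0.0, 0.0]))
```

A lower standard deviation at a similar mean head is the pattern the text interprets as a more uniform distribution of pressure among users.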
ACKNOWLEDGEMENTS

The authors would like to acknowledge the Italian Research Project ‘POR FESR Sicily 2007-2013 – Measure 4.1.1.1 SESAMO – SistEma informativo integrato per l’acquisizione, geStione e condivisione di dAti aMbientali per il supportO alle decisioni’ for providing financial support to the presented research.
REFERENCES

Arregui, F., Cabrera Jr, E. & Cobacho, R. Integrated Water Meter Management. IWA Publishing, London.
Bergant, A., Vitkovsky, J., Simpson, A. & Lambert, M. Valve induced transients influenced by unsteady pipe flow friction. Proc. 10th Int. Meeting of the IAHR Workgroup on the Behaviour of Hydraulic Machinery under Steady Oscillatory Conditions, IAHR, Madrid, Spain.
Brunone, B. & Morelli, L. Automatic control valve-induced transients in an operative pipe system. J. Hydraul. Eng. 125 (5), 534–542.
Brunone, B., Golia, U. M. & Greco, M. Some remarks on the momentum equations for fast transients. Hydraulic transients with column separation (9th and last round table of the IAHR Group), IAHR, Valencia, Spain, pp. 201–209.
Criminisi, A., Fontanazza, C. M., Freni, G. & La Loggia, G. Evaluation of the apparent losses caused by water meter under-registration in intermittent water supply. Water Sci. Technol. 60 (9), 2373–2382.
Cubillo, F. Impact of end uses knowledge in demand strategic planning for Madrid. Water Sci. Technol.: Water Supply 5 (3–4), 233–240.
De Marchis, M., Fontanazza, C. M., Freni, G., La Loggia, G., Napoli, E. & Notaro, V. A model of the filling process of an intermittent distribution network. Urban Water J. 7 (6), 321–333.
De Marchis, M., Fontanazza, C. M., Freni, G., La Loggia, G., Napoli, E. & Notaro, V. Analysis of the impact of intermittent distribution by modelling the network-filling process. J. Hydroinf. 13 (3), 358–373.
Fontanazza, C. M., Freni, G., La Loggia, G., Notaro, V. & Puleo, V. A composite indicator for water meter replacement in an urban distribution network. Urban Water J. 9 (6), 419–428.
Freni, G., De Marchis, M., Dalle Nogare, G. & Napoli, E. Implementation of pressure reduction valves in a dynamic water distribution system numerical model. Proceedings of the 10th International Conference on Hydroinformatics, Hamburg, July 14–18.
Giustolisi, O. Considering actual pipe connection in WDN analysis. J. Hydraul. Eng. 136 (11), 889–900.
Giustolisi, O., Berardi, L. & Laucelli, D. Generalizing WDN simulation models to variable tank levels. J. Hydroinf. 14 (3), 562–573.
Giustolisi, O., Kapelan, Z. & Savic, D. A. a An algorithm for automatic detection of topological changes in water distribution networks. J. Hydraul. Eng. 134 (4), 435–446.
Giustolisi, O., Savic, D. A. & Kapelan, Z. b Pressure-driven demand and leakage simulation for water distribution networks. J. Hydraul. Eng. 134 (5), 626–635.
Hardoy, J. E., Mitlin, D. & Satterthwaite, D. Environmental Problems in an Urbanizing World: Finding Solutions for Cities in Africa, Asia and Latin America. Earthscan, London.
Liou, C. P. & Hunt, W. A. Filling of pipelines with undulating elevation profiles. J. Hydraul. Eng. 122 (10), 534–539.
National Research Council, Committee on Public Water Supply Distribution Drinking Water Distribution Systems: Assessing and Reducing Risks. National Academies Press, Washington, DC.
Prescott, S. L. & Ulanicki, B. Dynamic modelling of pressure reducing valves. J. Hydraul. Eng. 129 (10), 804–812.
Prescott, S. L. & Ulanicki, B. Improved control of pressure reducing valves in water distribution networks. J. Hydraul. Eng. 134 (1), 56–65.
Todini, E. A more realistic approach to the ‘extended period simulation’ of water distribution networks. In: Proc. of CCWI2003, Advances in Water Supply Management (C. Maksimovic, D. Butler & F. A. Memon, eds). A.A. Balkema Publishers, Lisse, pp. 173–184.
Vairavamoorthy, K., Akinpelu, E., Lin, Z. & Ali, M. Design of sustainable system in developing countries. Proceedings of the World Water and Environmental Resources Challenges, Environmental and Water Resources Institute of ASCE, Orlando, Florida, 20–24 May 2001.
Vardy, A. E. & Brown, J. M. B. Transient turbulent friction in smooth pipe flows. J. Sound Vib. 259 (5), 1011–1036.
Vítkovský, J. P., Bergant, A., Simpson, A. R. & Lambert, M. F. Systematic evaluation of one-dimensional unsteady friction models in simple pipelines. J. Hydraul. Eng. 132 (7), 696–708.
First received 21 March 2013; accepted in revised form 5 July 2013. Available online 23 August 2013
© IWA Publishing 2014 | Journal of Hydroinformatics | 16.1 | 2014
Improving applicability of neuro-genetic algorithm to predict short-term water level: a case study

Gooyong Lee, Sangeun Lee and Heekyung Park

ABSTRACT

This paper proposes a practical approach of a neuro-genetic algorithm to enhance its capability of predicting water levels of rivers. Its practicality has three attributes: (1) to easily develop a model with a neuro-genetic algorithm; (2) to verify the model at various predicting points with different conditions; and (3) to provide information for making urgent decisions on the operation of river infrastructure. The authors build an artificial neural network model coupled with the genetic algorithm (often called a hybrid neuro-genetic algorithm), and then apply the model to predict water levels at 15 points of four major rivers in Korea. This case study demonstrates that the approach can be highly compatible with the real river situations, such as hydrological disturbances and water infrastructure under emergencies. Therefore, proper adoption of this approach into a river management system certainly improves the adaptive capacity of the system.

Key words | four-river remediation project, genetic algorithm, hybrid neuro-genetic, neural network, practical approach, water level prediction

Gooyong Lee, Heekyung Park (corresponding author): Korea Advanced Institute of Science and Technology (KAIST), 335 Gwahangno, Yuseong-gu, Daejeon 305-701, Republic of Korea. E-mail: hkpark@kaist.ac.kr
Sangeun Lee: International Centre for Water Hazard and Risk Management under the Auspices of UNESCO (ICHARM), 1-6 Minamihara, Tsukuba-shi, Ibaraki-ken, 305-8516, Japan
doi: 10.2166/hydro.2013.011

G. Lee et al. | Applicability of neuro-genetic approach for short-term water level prediction | Journal of Hydroinformatics | 16.1 | 2014

INTRODUCTION

Background

For sustainable water resources management, many countries often initiate and develop huge river remediation projects, e.g., the ‘Tennessee Valley Authority Act (1993–2012)’ in the USA, and the ‘Isar River Remediation Project (2000–2011)’ in Germany. Korea has four major rivers. Their slopes are relatively steep, and stream flows differ vastly from month to month. Thus, people living near the rivers have repeatedly suffered from chronic problems such as flood, drought, stream depletion, and low water quality. In addition, many experts (e.g., NIMR ; MLTM , ; Lee & Park ) argue that Korean river management in the future will be much more vulnerable to climate change than in the present, projecting the increase of heavy rainfalls in the wet season and the duration of drought in the dry season. To solve these problems and to provide full amenities for the inhabitants, the Ministry of Land, Transport and Maritime Affairs (MLTM) launched a nationwide, large-scale project named the Four River Remediation Project (FRRP) in 2009. With total expenses approximately amounting to a tenth of the annual national budget, the MLTM constructed a variety of water infrastructures such as reservoirs, weirs, dikes, wetlands, and eco-parks up to 2011.

However, successful river management cannot be ensured by these structural measures. Considering ‘non-stationarity’ (Milly et al. ) and ‘no basis for probabilities’ (Foley ; Cha et al. ), it is necessary to supplement adaptive capacity with nonstructural measures from a perspective featuring both economics and reliability. As mentioned by Lee et al. (), when the capacity of water infrastructure exceeds a certain level, priorities should be given to better predicting and monitoring hydrological changes, arranging emergency options sufficient to cover a wide range of extreme events, and making timely and adequate decisions. In these regards, this study was initiated to more accurately predict the water levels at various measurement points of the rivers in FRRP. The authors also took into account that adaptive capacity can be further enhanced if the prediction models reveal information about how to
operate water infrastructure constructed by the FRRP in order to maintain the water levels within desirable ranges.

When Korean river managers make plans for the operation of a reservoir or a weir, the preferred way to forecast the water level at a point is to select one or combine hydrological simulation models, e.g., SWAT (Soil and Water Assessment Tool developed by the US Department of Agriculture), WEAP (Water Evaluation and Planning System developed by the Stockholm Environment Institute), PRMS (Precipitation-Runoff Modeling System developed by the US Geological Survey), and HEC-RAS (Hydrologic Engineering Centers River Analysis System developed by the US Army). These simulation models are structured with governing equations and parameters and have been regarded as most adequate to describe physical processes related to the rain–runoff relation, especially within the hydrologic communities. The authors also agree that the simulation models are the best in achieving long-term prediction (for more detail, see Leavesley () and Solomatine ()). However, when river managers are interested in real-time or short-term prediction, there are two serious limitations. One is the demand for long periods and numerous kinds of data to calibrate the model parameters, and the other is the excessive time and effort needed to build and run the models (Grayson et al. ; Chang et al. ). These limitations definitely discourage river managers from using simulation results in making decisions upon the operation of water infrastructures.

Objectives of the study

This study is based on the viewpoint that the conventional numerical models to predict the water level are not adequate to draw out the full potential of newly constructed water infrastructures because they entail a number of assumptions and demand numerous kinds of data (Chau ; Chang et al. ). Hence, as an alternative method, the authors intended to examine the systemic approach of using an artificial neural network (ANN). Indeed, the ANN has become the most popular in various system engineering communities (Joo et al. ; Choi & Park ; Robert ) when models need good performance and fast calculation in short-term or real-time prediction. In studies on water resource management, many experts (Karunanithi et al. ; Imrie et al. ; Toth et al. ; Cameron et al. ; Gavin et al. ; Kisi & Asce ; Zhengfu & Fernando ) found several merits of the ANN.

1. Fast prediction speed: the ANN conducts prediction through the direct relations between inputs and outputs, without necessitating the treatment of data in the geographical information system.
2. Low data requirements: many fewer input variables are needed than when simulation models are used. Those variables can be selected in a flexible manner according to previous literature, modelers’ experiences, new insights, and trial-and-error.
3. Better consideration of site characteristics: to explain highly complicated hydrological phenomena (or dynamic nature of the phenomenon at stake) of a certain site, a simulation model is usually calibrated to adjust its parameters. The ANN model’s parameters and structure can be adjusted to be more site-specific. This is a great advantage in modeling non-linear and unique site characteristics of the watershed of concern.

Despite such merits, it does not seem that ANN models are widely used in practice. The authors think that the models should be improved from the perspective of the real river management system to improve applicability. For example, according to the Korean River Management Guideline (K-water ), river managers set up the allowable range of water levels at each point and should maintain the water levels above the lower limits during dry seasons and below the upper limits during wet seasons. The managers are obliged to periodically make decisions on the amount flowing out from upstream weirs or reservoirs and then to request approval from the River Flow Control Office under MLTM. The authors suggest improving the ANN model as follows.

1. It should be easy and systematic to optimize the model: when the ANN model is used to predict the water level at a point, the modeler should consider many of the hydrological characteristics of the watershed basin. They are also required to know where the flow gauging stations and the weather stations are located in the upper stream, and how the locations of stations influence the ANN model. Therefore, it is usually difficult to determine the structure of the model. In many previous studies
using the ANN model, this determination was done by using informal trial-and-error methods, referring to the literature, or relying on personal experiences. Even if useful, these ‘subjective’ methods cannot be expressed explicitly; a significant barrier for engineers and managers in the field who require at least well-coded algorithms.
2. The information regarding water levels a couple of days later should be provided: the focus of previous studies on hourly predictions (see Filho & Santos ; Alvisi et al. ; Napolitano et al. ) is not helpful when considering the real practices articulated in the Korean River Management Guideline (K-water ). It usually takes 1 or 2 days to perform the management practices based on the prediction of water level to implement control action. Therefore, the ANN model must be able to predict the water level after this time lag of a couple of days.
3. The model must help determine discharges of the upstream weirs or reservoirs: the climate in Korea is characterized by frequent localized torrential rainfall and high coefficient of river regime. Under such climate, maintaining the water level between the upper and lower limits during both dry and wet seasons is critical to prevent natural disasters under extreme events. In this sense, it is important to examine whether the ANN model can clearly explain the relations of the upstream discharges and the downstream water level. If possible, the modeler will be able to conduct ‘what-if …’ tests, and then determine emergency actions more systematically.

To test the three hypotheses, this study examines the use of the ANN model at 15 points near the locations where weirs or reservoirs were constructed by the FRRP. This article is designed as follows. Following an introduction, the authors explain the methods including study areas, data collection, the ANN model and optimization algorithms, and the way to validate the model. Then, the authors present the results of case studies, which are used to verify the established hypotheses. This study summarizes that the investigated approach using the ANN model can successfully manage river systems, cope with emergencies, and raise the adaptive capacity of the river management system.

METHODS

Study areas

From previous studies, it is seen that a modeler optimizes the performances of the ANN model at a few points in a river, and strives to improve accuracy (Maier et al. ). This observation does not appear impressive to a river manager since the results were derived only from the several specific points. To gain a river manager’s confidence, it would be better to let them decide whether the existing models should really be replaced; it would be very important to ascertain that the ANN model provides good performances at multiple points in various rivers at a time. This study thus attempts to examine the applicability of the model at 15 points near the locations where weirs or reservoirs were newly constructed by the FRRP. The 15 points fall into four groups according to the rivers. The rivers have geographical and hydrological characteristics, as follows (see also Figure 1 and Table 1).

1. Han River: it flows through the northern part of Korea from the Gangwon province to Gyeonggi province via Seoul metropolitan city. The FRRP newly constructed three weirs in a section, from the Chungju dam to the Paldang Lake, of the main stream. The authors selected two water level measurement points within this section, as shown in Figure 1(a). It seems very likely that the water level at point 1 will be significantly affected by the discharges of the Chungju dam while the water level at point 2 is greatly determined by the operation of the weirs. Point 2 is of national concern because the point is at the starting point of Paldang Lake which supplies raw water for approximately 20 million inhabitants living in Seoul metropolitan city and Gyeonggi province.
2. Geum River: it is located at the center of Korea and originates from the Jeollabuk province, and then flows out into the West Sea through the Chungcheongnam province and Chungcheongbuk province. The river is characterized by rising from many tributaries, i.e., 20 streams. The FRRP constructed three weirs at the section, 99 km in length, between the Daecheong dam and Geum River estuary dam. Within the section, four points on the main stream are used to predict the water levels, as shown in Figure 1(b).
Figure 1 | Locations of the study areas.

Although point 1 is largely affected by the discharges of the Daecheong dam, it is very challenging for river managers to predict and control the water level because the influx of the Miho stream and the Gapcheon tributary at the front fills about half of the total flow in the main stream. For other points, which are placed behind point 1, the FRRP is likely to improve the ability to manage water quantity. Water levels at points 2, 3, and 4 are dominated by operation of weirs in the upper streams.

Table 1 | Geographical and hydrological characteristics

Study area | Catchment area (km²) | River length (km) | Annual mean temperature (°C) | Annual mean rainfall (mm) | Coefficient of river regime (a)
Han River | 26,018 | 481.7 | 10–11 | 1,200–1,300 | 1:393
Geum River | 9,912 | 394.8 | 11–12 | 1,100–1,300 | 1:299
Yeongsan River | 3,371 | 115.5 | 13 | 1,100–1,500 | 1:682
Nakdong River | 23,384 | 506.2 | 12–14 | 900–1,400 | 1:372

(a) The coefficient usually implies the ratio between the maximum and minimum of daily flow over an average year.

3. Yeongsan River: it passes through the Jeollanam province in the south-western part of Korea, and flows out to the West Sea. The distinctive characteristic of this river is that the regime coefficient of the watershed basin is extremely high (1:682). This implies that flow rate differs vastly from season to season so that damage due to floods and droughts frequently occurs. In the section, 98 km long, where two weirs were installed by the project, there are four points available to estimate the
water levels, as in Figure 1(c). Among them, water levels at points 1 and 3 are affected by the operation of Seungchon and Juksan weirs, respectively. Point 2 is placed behind the confluence of the Jiseok stream having abundant flow, and thus the water level is influenced by the flow variation of Jiseok stream. Besides, point 4 is used to monitor the flow escaping into the West Sea.
4. Nakdong River: it originates from the Gangwon province and flows vertically through the Gyeongsangbuk province, Gyeongsangnam province, and Busan metropolitan city into the Southern Sea. It is the longest river in Korea. Accordingly, spatial variations in rainfall and flow are relatively large, and many inhabitants near the lower stream have suffered from floods and droughts almost every year even though five reservoirs for flood control were installed a long time ago. As a result, while planning the FRRP, the MLTM took note of these problems and constructed eight large-scale weirs. The construction projects were mainly implemented within the section, 277 km long, between Andong city and Busan metropolitan city. Although many points are available to estimate the water levels in this section, five points where flooding regularly occurs were selected as points of interest.

Model construction and calibration

An artificial neural network is a computational model mimicking the human brain that is constructed by huge networks connecting neurons, neural cells, and synapses. The ANN is not required to establish model structure with the full understanding of natural phenomena. The model rather relies on empirical training as humans normally do (Haykin ; Maier et al. ). Figure 2(a) exemplifies the general structure of an ANN model. A neuron corresponds to a node where $Z_n$ is an output from the nth layer, $Z_{n+1}$ is an output from the (n + 1)th layer, $w_n$ is a weight between nodes, $u_{n+1}$ is the sum of multiplying $Z_n$ by $w_n$, and $f_i(u_{n+1})$ is an activation function transferring the sum of node inputs into a node output. The activation function can have a logistic, hyperbolic-tangent, or linear form (Figure 2(b)). Several nodes form a layer, and several layers form the whole ANN structure again. Among a variety of ANN structures, the most popular one is the MLP (multi-layer perceptrons) and a feed-forward network with several layers (Haykin ; Dibike & Solomatine ). In several previous studies (Sahoo et al. ; Wang et al. ; Pulido-Calvo & Portela ), ANN with double hidden layers showed competent results. Therefore, in this study, both ANNs having a single hidden layer and double hidden layers are tested, and one is selected for the optimal ANN design at each study site.

In many ANN studies, the model structure has been selected through a trial-and-error method (Hsu et al. ; Zealand et al. ; Chiang et al. ). However, the results of optimizing the model largely depend on how appropriately the hidden layers and nodes are designed and how well the functions are defined. Thus, much effort should be given to the selection of model structure. Since the mid-1990s, there have been numerous attempts in water resource management communities to employ the genetic algorithm (GA) (see Savic et al. ; Giustolisi & Laucelli ; Giustolisi & Simeone ; Shamseldin et al. ). GA is a technique inspired by the principles of natural evolution and selection (see details in Holland () and Goldberg ()). In previous studies, the GA has been mainly used in the neural network, as follows: (1) selection of input variables; (2) adjustment of weights; (3) number of hidden layers in the MLP; (4) number of nodes in each layer; and (5) type of activation function in each node.
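To make the structural uses of the GA listed above more concrete, the following is a minimal sketch of a genetic search over the number of hidden layers, the number of nodes per layer, and the activation type. It is not the authors' implementation: the candidate encoding, the mutation and crossover rules, the default population size and the `toy_fitness` function are all assumptions made for illustration. In a real application the fitness would be the validation error of an ANN trained with the candidate structure.

```python
import random

ACTIVATIONS = ("logistic", "tanh", "linear")
MAX_NODES = 32  # assumed upper bound on nodes per hidden layer


def random_candidate(rng):
    """Draw a random structure with one or two hidden layers."""
    n_layers = rng.choice((1, 2))
    return {
        "hidden_nodes": [rng.randint(1, MAX_NODES) for _ in range(n_layers)],
        "activation": rng.choice(ACTIVATIONS),
    }


def crossover(a, b):
    """Take the layer layout from parent a and the activation from parent b."""
    return {"hidden_nodes": list(a["hidden_nodes"]), "activation": b["activation"]}


def mutate(cand, rng):
    """Nudge one layer size and occasionally switch the activation."""
    child = {"hidden_nodes": list(cand["hidden_nodes"]), "activation": cand["activation"]}
    i = rng.randrange(len(child["hidden_nodes"]))
    step = rng.choice((-2, -1, 1, 2))
    child["hidden_nodes"][i] = min(MAX_NODES, max(1, child["hidden_nodes"][i] + step))
    if rng.random() < 0.2:
        child["activation"] = rng.choice(ACTIVATIONS)
    return child


def evolve(fitness, generations=10, pop_size=20, seed=0):
    """Keep the better half each generation and refill with mutated offspring."""
    rng = random.Random(seed)
    population = [random_candidate(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)  # lower fitness value = better candidate
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents)), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return min(population, key=fitness)


def toy_fitness(cand):
    """Placeholder error; a real fitness would train and validate the ANN."""
    penalty = 0 if cand["activation"] == "tanh" else 1
    return abs(sum(cand["hidden_nodes"]) - 14) + penalty


best = evolve(toy_fitness)
print(best, toy_fitness(best))
```

Because the better half of each generation is carried over unchanged, the best fitness found never gets worse from one generation to the next.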
Figure 2 | General structure of the ANN model (MLP): (a) a node at (n + 1)th layer; (b) typical activation functions; (c) multi-layered feed forward neural network using double hidden layers.
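The node relation summarized in Figure 2(a), in which each node forms the weighted sum $u_{n+1} = \sum w_n Z_n$ of the previous layer's outputs and passes it through an activation function to give $Z_{n+1} = f(u_{n+1})$, can be sketched in a few lines. This is an illustrative sketch only: the function names, the nested-list weight layout and the example weights are assumptions, and bias terms are omitted because the text does not mention them.

```python
import math


def logistic(u):
    """Logistic activation, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))


ACTIVATIONS = {
    "logistic": logistic,
    "tanh": math.tanh,
    "linear": lambda u: u,
}


def layer_forward(z_n, weights, activation):
    """Compute Z_{n+1} from Z_n.

    weights[j][k] is the weight from node k of layer n to node j of layer n+1.
    """
    f = ACTIVATIONS[activation]
    return [f(sum(w * z for w, z in zip(row, z_n))) for row in weights]


def mlp_forward(inputs, layers):
    """Propagate inputs through a list of (weights, activation) layers."""
    z = list(inputs)
    for weights, activation in layers:
        z = layer_forward(z, weights, activation)
    return z


# Two inputs, one hidden layer of two tanh nodes, one linear output node.
network = [
    ([[0.5, -0.3], [0.1, 0.8]], "tanh"),
    ([[1.0, 1.0]], "linear"),
]
print(mlp_forward([0.2, 0.4], network))
```

A network with two hidden layers, as in Figure 2(c), is obtained simply by adding another `(weights, activation)` pair to the list.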
Based on literature reviews (See & Openshaw ; Alvisi et al. ; Maier et al. ) and Korean river condition (K-water ), rainfall and upstream flow are used as input variables, and rainfall/flow gaging stations are selected through preliminary statistical test (i.e., significance test) considering the travel time (maximum 4 days at study sites (K-water )). The number of nodes at input layer is different at each prediction point because the Korean rivers have a different number of branches and lag-time (see Figure 1). Due to Korean climate characteristics (frequent localized torrential rainfall and high coefficient of river regime), it is difficult to construct a model using only input data for a certain period (e.g., dry and wet season). Therefore, the model is constructed by selecting representative points of annual precipitation and flow variations. To take into account the number of branches and the time window ($t_0$, $t_0 - 1$, $t_0 - 2$), a maximum of 16 inputs are necessary. For example, prediction point #1 in Geum River (see Figure 1) is affected by eight points (four weather gaging stations, three flow gaging stations, and one dam). Lag-time between the dam and the furthermost flow gaging station is 1–2 days depending on flow speed. To take the lag-time into account, two types of time window ($t_0$, $t_0 - 1$) at the two points should be considered as model inputs for 1-day ahead prediction. As a result, the prediction point #1 in the Geum River has 10 model inputs. To avoid ‘curse of dimensionality and over-fitting in ANN’ (Haykin ; Giustolisi & Laucelli ), GA is coded to select minimum nodes at hidden layers. The number of nodes at a hidden layer is limited to a maximum of 32 following ‘a general principle that the node numbers of the hidden layers should be greater than the input layer nodes’ (Zeng & Wang ). Finally, on average 14 nodes are used in this study (similar to the number of input nodes). Weights are also anticipated to be well trained by the back propagation algorithm (BPA) solely as done by other ANN studies (Rumelhart et al. ; Rumelhart & McClelland ; Abebe & Price ; Robert ; Kisi & Asce ; Chau ; Chang et al. ; Napolitano et al. ; Mohanty et al. ). To take into account the high coefficient of river regime, the weights are adjusted to minimize error value for high variations of water level. Three criteria (Table 2) are used for validation.

Finally, the authors decided to use the GA to select: (1) the number of hidden layers; (2) the number of nodes in
each hidden layer; and (3) the type of activation function in each node. The overall model construction procedure is summarized in Figure 3. In the first part, input data are decided through three steps, ‘selection of water level forecasting point’, ‘selection of input point’, and ‘data collection’. In the second part, the ANN is constructed using the GA. The initial ANN structure consists of double hidden layers with the same number of nodes as in the input layer, and each node contains a random activation function. Then, the ANN model is constructed through iteration of three steps, selection, crossover, and mutation, and the probability of each step is 20, 60 and 20%, respectively. Before a new iteration starts, weights are adjusted by BPA. Finally, the 1st-rank ANN model of one generation is created and saved for selection of optimal ANN structures. The ANN structure with the highest coefficient of determination is selected as the optimal model. The number of generations is 50, and each generation contains 500 candidate models (population size). At the model calibration step, all the weights, wn, need to be trained to minimize the sum of square errors between observed data and model outputs. As a result, dozens of training algorithms have been suggested so far although none of them ensure that the solution reaches a global minimum (for details, see Coulibaly et al. () and Mohanty et al. ()). Among those algorithms, the authors used the back propagation algorithm, first suggested by Rumelhart & McClelland (). The algorithm is widely known as an adequate method to train the MLP and, in particular, it is less sensitive to the noise or errors inherent in input data (Maier & Dandy ).
Table 2 | Criteria for testing validity of the model
Criteria | Purpose | Estimation
Coefficient of determination (R²) | To evaluate the goodness-of-fit of models | $R^2 = \dfrac{\sum_{i=1}^{n} (P_i - \bar{O})^2}{\sum_{i=1}^{n} (O_i - \bar{O})^2}$
Mean square error (MSE) | To quantify the error of model | $\mathrm{MSE} = \dfrac{1}{n} \sum_{i=1}^{n} (P_i - O_i)^2$
Mean absolute percent error (MAPE) | To compare the error between populations | $\mathrm{MAPE} = \dfrac{1}{n} \sum_{i=1}^{n} \dfrac{|P_i - O_i|}{O_i}$
Figure 3 | Overall model construction procedure in this study.
Selection of data, input variables, and validation criteria
There were some limitations in availability of data when the constructed model is trained and validated. First, the length of data is rather short compared to that in other studies. In a large portion of water level measurement points, the data obtained prior to the year 2004 turned out to be inconsistent. Second, the FRRP temporarily perturbed the quality of data. From 2009 to 2011, large-scale weirs were constructed, and dredging works were conducted in the study sites, which led to a slight change in the location of several measurement points and an increase in measurement errors. Therefore,
daily data were restricted to a 5-year period from 1 January 2004 to 31 December 2008. The total period is subdivided into 3 and 2 years to distinguish model training/validation from testing periods, respectively. The 3-year training/validation period (2004–2006) is divided into training and validation parts: if the data period starts from t0, even numbers (t0, t0+2, t0+4 …) are used as training data and odd numbers (t0+1, t0+3, t0+5 …) are used as validation data. This study satisfied the minimum data quantity for ANN construction suggested by Lawrence & Peterson (). All the input data are adjusted on a scale ranging from 0 to 1.
Table 2 presents the criteria for validating the ability of the model to predict the water level, in which $P_i$ is the value predicted by the ANN model, $O_i$ is the observed value, $\bar{O}$ is the average of observed values, and n is the number of samples. The three criteria, R², MSE, and MAPE, are widely used statistics, which indicate high validity of the constructed model as they approach 1, 0, and 0, respectively.
RESULTS AND DISCUSSION
Selection of the model structure
Tables 3 and 4 show the results of building the ANN models, each listing the numbers of layers and nodes, and types of activation functions optimized by the GA. The results can be interpreted as follows.
1. Among the hidden layers included in all models, double layers amount to more than a third (37%). This gives an insight that there is high nonlinearity between input variables and water level, and among input variables (Haykin ).
2. In the first hidden layers, the number of nodes ranges from 6 to 32 (15 on average), and in the second hidden layers, this ranges from 2 to 17 (6 on average).
3. Hyperbolic tangent functions and logistic curves were dominantly selected as activation functions in hidden layers rather than linear functions. This agrees with Daliakopoulos et al. (), who stated that sigmoid-type functions ensure better performances on the ANN model.
4. In contrast, linear functions were selected for almost half of activation functions in the output layers, which corresponds to other experts’ experiences (cf. Abebe & Price ; Chang et al. ; Pulido-Calvo & Portela ).
Table 3 | Results of optimizing the model structure (1-day ahead water level)
Site | Point | Number of inputs | Number of hidden layers | AFs at 1st hidden layer (a) | AFs at 2nd hidden layer (a) | AF at output layer
Han River | #1 | 9 | 2 | 8Lo, 10T, 5Li | 1Lo, 1T, 2Li | T
Han River | #2 | 14 | 2 | 2Lo, 1T, 1Li | 2Lo, 1T | T
Geum River | #1 | 10 | 1 | 3T, 12Li | – | Li
Geum River | #2 | 6 | 2 | 11Lo, 20T, 1Li | 3Lo, 2T, 1Li | Li
Geum River | #3 | 11 | 1 | 2Lo, 5T | – | Li
Geum River | #4 | 9 | 1 | 1Lo, 2T | – | Li
Yeongsan River | #1 | 9 | 1 | 2Lo, 1T, 1Li | – | Lo
Yeongsan River | #2 | 11 | 2 | 16Lo, 8T, 3Li | 10Lo, 6T | T
Yeongsan River | #3 | 6 | 2 | 14T, 14Li | 2Lo, 2T | T
Yeongsan River | #4 | 9 | 1 | 1Lo, 20T, 4Li | – | Li
Nakdong River | #1 | 14 | 2 | 5Lo, 9T, 1Li | 2Lo, 2T, 2Li | T
Nakdong River | #2 | 13 | 2 | 5Lo, 4T, 5Li | 1Lo, 1T | T
Nakdong River | #3 | 14 | 1 | 2T, 1Li | – | T
Nakdong River | #4 | 14 | 1 | 1Lo, 1T | – | T
Nakdong River | #5 | 14 | 1 | 4Lo, 6T, 7Li | – | Li
(a) The expression ‘aLo, bT, cLi’ means that ‘a, b, c’ is the number of each function and ‘Lo, T, Li’ stand for logistic function, hyperbolic tangent function, and linear function in order. The sum of ‘a, b, and c’ is the total number of nodes at each layer.
Table 4 | Results of optimizing the model structure (2-day ahead water level)
Site | Point | Number of inputs | Number of hidden layers | AFs at 1st hidden layer (a) | AFs at 2nd hidden layer (a) | AF at output layer
Han River | #1 | 9 | 1 | 3Lo, 6T | – | 1Lo
Han River | #2 | 14 | 2 | 15Lo, 6T, 5Li | 1T, 1Li | 1Li
Geum River | #1 | 10 | 1 | 1Lo, 1Li | – | Li
Geum River | #2 | 6 | 1 | 1Lo, 1Li | – | Lo
Geum River | #3 | 11 | 1 | 2Lo, 2T, 1Li | – | Li
Geum River | #4 | 9 | 1 | 10Lo, 9T, 5Li | – | Li
Yeongsan River | #1 | 7 | 1 | 1Lo, 2T | – | Li
Yeongsan River | #2 | 11 | 2 | 10Lo, 9T, 1Li | 12Lo, 4T | T
Yeongsan River | #3 | 8 | 2 | 1Lo, 12T | 2Lo, 2T | T
Yeongsan River | #4 | 9 | 1 | 6Lo, 4T, 7Li | – | Li
Nakdong River | #1 | 14 | 1 | 15Lo, 9T, 4Li | – | Lo
Nakdong River | #2 | 13 | 2 | 15Lo, 5T, 10Li | 1Lo, 1T | T
Nakdong River | #3 | 14 | 1 | 7T, 4Li | – | Li
Nakdong River | #4 | 14 | 1 | 11T, 12Li | – | T
Nakdong River | #5 | 14 | 1 | 3Lo, 2T, 1Li | – | Lo
(a) The expression ‘aLo, bT, cLi’ means that ‘a, b, c’ is the number of each function and ‘Lo, T, Li’ stand for logistic function, hyperbolic tangent function, and linear function in order. The sum of ‘a, b, and c’ is the total number of nodes at each layer.
Testing of the ANN models
To test the trained model for the period 1 January 2007 to 31 December 2008, the authors applied data for input
variables, and then compared the results with recorded water levels. Results of the 1-day ahead prediction are as below. For all measurement points, the models could explain changes of water levels very satisfactorily considering that R² spans from 0.84 to 0.94 (see Table 5 and Figure 4).
1. Han River: R² of the four points is lower (0.84) than those in any other study site while MSE is the average and MAPE is relatively low. The first criterion reveals that there are difficulties in fitting the model as random variations of water levels are rather significant. However, the latter criteria show that errors of the models are insignificant. Overall, prediction ability is quite good even in the seasons when floods or droughts occur.
2. Geum River: R² is the largest (0.94) among all the study sites. Water levels in this river were once expected to be very difficult to predict as the main stream is influenced by the flows of many tributaries. However, validation testing showed that the ANN models anticipate and solve the complication these tributaries cause with excellent accuracy.
3. Yeongsan River: R² (0.83) is similar to that of the points in the Han River. It should be also noted that MSE is low (0.04) and simultaneously MAPE is relatively high (13.12%). MAPE is more sensitive to overestimation arising when the absolute values of observed data are smaller. Therefore, the values of MSE and MAPE are interpreted as the constructed models having a little tendency of overestimating low water levels, especially in the dry season.
4. Nakdong River: the models have excellent consistency (R² = 0.91), but the other criteria are relatively unsatisfactory. Based on these results, it is expected that random variation of the water level data are not significant, but the models do not respond in a very sensitive manner when water levels are suddenly changed.
Table 5 | Testing results of the ANN models (R²)
Site | R²: 1-day ahead | R²: 2-day ahead
Han River | 0.84 | 0.72
Geum River | 0.94 | 0.87
Yeongsan River | 0.83 | 0.82
Nakdong River | 0.91 | 0.87
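The three criteria of Table 2, and the remark above that MAPE reacts more strongly when observed water levels are small, can be illustrated with a short Python sketch. The function names and the sample water levels below are hypothetical and chosen only for illustration; R² follows the ratio form given in Table 2, and MAPE is returned as a fraction (multiply by 100 for a percentage such as the 13.12% quoted for the Yeongsan River).

```python
def r_squared(predicted, observed):
    """R^2 in the ratio form of Table 2: sum((P_i - O_mean)^2) / sum((O_i - O_mean)^2)."""
    o_mean = sum(observed) / len(observed)
    numerator = sum((p - o_mean) ** 2 for p in predicted)
    denominator = sum((o - o_mean) ** 2 for o in observed)
    return numerator / denominator


def mse(predicted, observed):
    """Mean square error: average of (P_i - O_i)^2."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def mape(predicted, observed):
    """Mean absolute percent error as a fraction: average of |P_i - O_i| / O_i."""
    return sum(abs(p - o) / o for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical water levels (m): the same 0.1 m overestimate at high and at low stage.
high_obs, high_pred = [4.0, 4.2], [4.1, 4.3]
low_obs, low_pred = [0.5, 0.6], [0.6, 0.7]

mse_high, mse_low = mse(high_pred, high_obs), mse(low_pred, low_obs)
mape_high, mape_low = mape(high_pred, high_obs), mape(low_pred, low_obs)

print(f"MSE  high stage: {mse_high:.4f}   low stage: {mse_low:.4f}")
print(f"MAPE high stage: {100 * mape_high:.1f}%   low stage: {100 * mape_low:.1f}%")
```

In this toy case the MSE is the same for both series, while the MAPE is several times larger at low stage. That matches the pattern of a low MSE combined with a relatively high MAPE, which the text reads as a tendency to overestimate low water levels.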
For the 2-day ahead prediction models, it is natural that validation criteria get worse. However, Figure 5 shows that the models are still acceptable: 0.72 < R² < 0.87, and sufficiently small MSE and MAPE. Also, the statistical properties, differing from study sites in cases of the 1-day prediction models, remain valid when the overall perspective of the criteria is considered. The models are most consistent with the points in the Geum River, but they are not perfectly fitted with the points in the Han River basin. In addition, the model fits with Nakdong River, but a small error results from insensitivities of the models.
In general, the ANN models are acceptable in predicting water levels at all points even when there is uncertainty in tomorrow or the day after tomorrow’s weather conditions.
Figure 4 | Results of testing the trained model (1-day ahead water level): (a) Han River; (b) Geumgang River; (c) Youngsan River; (d) Nakdong River.
Figure 5 | Results of testing the trained model (2-day ahead water level): (a) Han River; (b) Geumgang River; (c) Youngsan River; (d) Nakdong River.
Discussion
At the beginning of this study, the authors form three hypotheses to examine the practicality of the ANN models. Each hypothesis is then discussed with the models constructed above. The discussion underpins the opinion that the ANN models, especially optimized with the genetic algorithm, can achieve many requirements necessary to replace the existing models and eventually enhance the adaptability of the river management system.
Is it easy and systematic to find the optimum structure of the model?
The model is optimized by using the genetic algorithm (selecting model structure) and the back propagation algorithm (adjusting weights). Although some professional efforts for programming were required (in practice, this could be easily settled by developing the GUI, interacting with the GA and back propagation (BP) codes, and a manual), these algorithms greatly reduced the labor of repeated trial and error. Simultaneously, with the assistance of linking the ANN model with the GA, the authors could have great confidence in determining hidden layers, nodes, and activation functions, which might otherwise have been arbitrary.
Is this model advantageous to forecast 1- or 2-day ahead water levels?
As the authors mentioned previously, the Korean river management system needs 1 or 2 days to decide the operation of upstream infrastructure to maintain the downstream water levels. Practically, it is important to have the prediction
methods and models that are highly advantageous to predict the 1- or 2-day ahead water levels. The ANN models coupled with the GA showed satisfactory validity (e.g., 0.84 < R² < 0.94 for 1-day ahead water level, and 0.72 < R² < 0.88 for 2-day ahead water level) and more consistent results than the hourly models (for instance, Filho & Santos (), Alvisi et al. (), and Napolitano et al. ()) that tried to build the prediction models for 1-, 12-, and 18-hours ahead water levels. The coefficient of determination, which was estimated from their models, ranged from 0.4 to 0.95. In addition, even at the points of the Geum River which are complicated due to the influence of many tributaries, the 2-day ahead water level can be predicted with the accuracy of R² = 0.87. For a further study, comparing the neuro-genetic algorithm and other conventional neural networks will be meaningful for additional verification of the developed model.
Will the models be helpful for deciding the operation of the upstream weirs or reservoirs?
The ANN models are implicit in explaining a quantitative relation between the upstream flow and the downstream water level. Hence, these models facilitate further decision-making in face of the anticipation that the water level at a point of concern would be risky under un-intervened conditions. The modeler can carry out ‘what-if …’ tests while thinking about the different operation (or different discharge) of weirs and reservoirs constructed in the upper stream. Again, these tests let the modeler determine the acceptable range of the discharge amount of the upstream in order to satisfy the management level at the point. For illustration, see Figure 6(a); we are interested in the water level at point 1 of the Geum River, and the management level is hypothetically set at 3.80 m. Let us also assume that now is day 61 (at this moment, the water level is 3.50 m, and the discharge amount from the upper reservoir is 810 m³/s). This is the time when we get the model prediction that tomorrow’s water level (3.84 m) would go beyond the management level. It is thus natural to investigate what would result from the intentionally reduced discharges. By using the ANN model with other assumptions on the upper discharges, it would be possible to get the prediction that the discharge amount should be immediately dropped to less than 789 m³/s to maintain tomorrow’s water level below 3.8 m, as in Figure 6(b). Indeed, the Korean River Management Guideline (K-water ) mentions that for cases where meeting the management level is threatened, the river manager is exceptionally allowed to ‘act first, report later’.
Figure 6 | Example illustrating the decision operation of the upstream water infrastructure at point 1 of the Geum River: (a) prediction of the 1-day ahead water level; (b) estimated relation between upstream flow (day 61) and downstream water level (day 62).
CONCLUSION
Recently, the Korean government implemented the FRRP with a great deal of ambition. However, it is hard to think that the constructed weirs, dams, and reservoirs will solve all the chronic problems that riparian areas have long faced, and climate change is likely to aggravate. For adaptive capacity of the river management system, this study had a special interest in raising the capability of predicting water levels at various points of the rivers. Such intelligent forecasting capabilities can be heightened by carefully monitoring weather conditions and upstream water flow data, adequately utilizing the data in predicting 1- or 2-day ahead water level, and building the models properly to satisfy practical requirements. In this context, the authors tested the use of a hybrid neuro-genetic algorithm in predicting water levels at 15 points of four rivers. The results are summarized as follows.
1. By using the genetic algorithm, it was possible to greatly reduce the trials and errors which were necessary to find out the optimum structure of the ANN model. The developed ANN model demonstrates the great advantage that hidden layers, nodes, and activation functions can be selected in a more formulated manner.
2. The ANN models showed satisfactory validity over the 15 water level measurement points. Especially, the coefficient of determination ranged from 0.84 to 0.94 for 1-day ahead water levels, and from 0.72 to 0.88 for 2-day ahead water levels. Based on these statistics, it was found that the built models have greater prediction abilities than those presented in previous studies.
3. The ANN models could clearly explain the relation between the upstream flow and the downstream water level. This advantage can be of significant merit when the river manager anticipates the water levels within the acceptance level. The models can encourage the river manager to investigate the consequences of differently operating water infrastructures located in the upstream. Therefore, they are considerably helpful in making urgent decisions regarding how water infrastructure should be properly operated and maintained.
ACKNOWLEDGEMENTS
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (Grant No. 2012-0001656).
REFERENCES
Abebe, A. J. & Price, R. K. Managing uncertainty in hydrological models using complementary models. Hydrol. Sci. J. (J. Des Sci. Hydrologiques) 48 (5), 679–692. Alvisi, S., Mascellani, G., Franchini, M. & Bardossy, A. Water level forecasting through fuzzy logic and artificial neural network approaches. Hydrol. Earth Syst. Sci. 10, 1–17. Cameron, D., Kneale, P. & See, L. An evaluation of a traditional and a neural net modeling approach to flood forecasting for an upland catchment. Hydrol. Process. 16, 1033–1046. Cha, D., Lee, S. & Park, H. Investigating the vulnerability of dry-season water supplies to climate change: using the Gwangdong Reservoir Drought Management Model. Water Resour. Manage. 26 (14), 4183–4201. Chang, F. J., Chiang, Y. M. & Chang, L. C. Multi-step-ahead neural networks for flood forecasting. Hydrol. Sci. J. (J. Des. Sci. Hydrologiques) 52 (1), 114–130. Chau, K. W. Particle swarm optimization training algorithm for ANNs in stage prediction of Shing Mun River. J. Hydrol. 329 (3–4), 363–367. Chiang, Y. M., Chang, L. C. & Chang, F. J. Comparison of static-feedforward and dynamic-feedback neural network for rainfallrunoff modeling. J. Hydrol. 290, 297–311. Choi, D. & Park, H. A hybrid artificial neural network as a software sensor in a wastewater treatment process. Water Res. 35 (16), 3959–3967. Coulibaly, P., Anctil, F. & Bobee, B. Hydrological forecasting using artificial neural networks: the state of the art. Can. J. Civil Eng. 26 (3), 293–304. Daliakopoulos, I. N., Coulibaly, P. & Tsanis, I. K. Groundwater level forecasting using artificial neural networks. J. Hydrol. 309, 229–240. Dibike, Y. B. & Solomatine, D. P. River flow forecasting using artificial neural networks. Phys. Chem. Earth Part B 26 (1), 1–7. Filho, A. J. P. & Santos, C. C. Modeling a densely urbanized watershed with an artificial neural network, weather radar and telemetric data. J. Hydrol. 317, 31–48. Foley, A. M. Uncertainty in regional climate modeling: a review. Prog. Phys. 
Geog. 34 (5), 647–670. Gavin, J. B., Graeme, C. D. & Holger, R. M. Data transformation for neural network models in water resources applications. J. Hydroinf. 5, 245–258. Giustolisi, O. & Laucelli, D. Improving generalization of artificial neural networks in rainfall–runoff modeling. Hydrol. Sci. J. 50 (3), 439–457. Giustolisi, O. & Simeone, V. Optimal design of artificial neural networks by a multi-objective strategy: groundwater level predictions. Hydrol. Sci. J. 51 (3), 502–523. Goldberg, D. E. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley-Longman, Reading, MA, USA.
Grayson, R. B., Moore, I. D. & McMahon, T. A. Physically based hydrologic modelling, 2: is the concept realistic? Water Resour. Res. 28 (10), 2659–2666. Haykin, S. Neural Networks: a Comprehensive Foundation (2nd edn). Prentice-Hall, Upper Saddle River, NJ, USA. Holland, J. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI, USA. Hsu, K. L., Gupta, H. V. & Sorooshian, S. Artificial neural network modeling of the rainfall–runoff process. Water Resour. Res. 31 (10), 2517–2530. Imrie, C. E., Durucan, S. & Korre, A. River flow prediction using artificial neural networks: generalisation beyond the calibration range. J. Hydrol. 233, 138–153. Joo, D., Choi, D. & Park, H. The effects of data preprocessing in the determination of coagulant dosing rate. Water Res. 34 (13), 3295–3302. Karunanithi, N., Grenney, W. J., Whitley, D. & Bovee, K. Neural networks for river flow prediction. J. Comput. Civil Eng. 8 (2), 201–220. Kisi, O. & Asce, M. River flow modeling using artificial neural networks. J. Hydrol. Eng. 1 (60), 60–63. K-water Dam Operation Manual. K-water, Korea (in Korean). Lawrence, J. & Peterson, A. Brainmaker: User’s Guide and Reference Manual. California Scientific Software, Nevada City, CA, USA. Leavesley, G. H. Modeling the effects of climate change on water resources? A review. Clim. Change. 28, 159–177. Lee, S. & Park, H. Adaptation practices of urban water infrastructure management. Proceedings of the APEC Climate Symposium 2011, Hawaii, USA. Lee, S., Suhaimi, A. & Park, H. Lessons from water scarcity of the 2008–2009 Gwangdong reservoir: needs to address drought management with the adaptiveness concept. Aquat. Sci. 74 (2), 213–227. Maier, H. R. & Dandy, G. C. Neural networks for the prediction and forecasting of water resources variables: a review of modelling issues and applications. Environ. Modell. Softw. 15, 101–124. Maier, H., Jain, A., Dandy, G. & Sudheer, K. 
Methods used for the development of neural networks for the prediction of water resource variables in river systems: Current status and future directions. Environ. Modell. Softw. 25 (8), 891–909. Milly, P. C. D., Betancourt, J., Falkenmark, M., Hirsch, R. M., Kundzewicz, Z. W., Lettenmaier, D. P. & Stouffer, R. J. Stationarity is dead: whither water management? Science 319 (5863), 573–574. Ministry of Land, Transport and Maritime Affairs Master Plan for Four River Project. Korea (in Korean). Ministry of Land, Transport and Maritime Affairs Future Water Resources Management Strategies for Coping with Climate Change. Korea (in Korean). Mohanty, S., Jha, M. K., Kumar, A. & Sudheer, K. P. Artificial neural network modeling for groundwater level forecasting in
a river island of Eastern India. Water Resour. Manage. 24, 1845–1865. Napolitano, G., See, L., Calvo, B., Savi, F. & Heppenstall, A. A conceptual and neural network model for real-time flood forecasting of the Tiber river in Rome. Phys. Chem. Earth. 35 (3–5), 187–194. National Institute of Meteorological Research Understanding Climate Change II – Climate Change in the Korean Peninsula: Present and Future. Korea Meteorological Administration (in Korean). Pulido-Calvo, I. & Portela, M. M. Application of neural approaches to one-step daily flow forecasting in Portuguese watersheds. J. Hydrol. 332 (1–2), 1–15. Robert, J. A. Neural network rainfall–runoff forecasting based on continuous resampling. J. Hydroinf. 5, 51–61. Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning internal representation by error propagation. Parallel Distributed Process 1, 318–362. Rumelhart, D. E. & McClelland, J. L. Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press, Cambridge, MA, USA. Sahoo, G. B., Ray, C. & De Carlo, E. H. Use of neural network to predict flash flood and attendant water qualities of a mountainous stream on Oahu, Hawaii. J. Hydrol. 327, 525–538. Savic, D. A., Walters, G. A. & Davidson, J. W. A genetic programming approach to rainfall–runoff modelling. Water Resour. Manage. 13 (3), 219–231. See, L. & Openshaw, S. A hybrid multi-model approach to river level forecasting. Hydrol. Sci. J. 45 (4), 523–536. Shamseldin, A. Y., O’Connor, K. M. & Nasr, A. E. A comparative study of three neural network forecast combination methods for simulated river flows of different rainfall–runoff models. Hydrol. Sci. J. (J. Des. Sci. Hydrologiques) 52, 896–916. Solomatine, D. P. Data-driven modelling paradigm, methods, experiences. In: Proceedings of the 5th International Conference on Hydroinformatics, Cardiff, UK, 1–5 July, pp. 1–7. Toth, E., Brath, A. & Montanari, A. 
Comparison of shortterm rainfall prediction models for real-time flood forecasting. J. Hydrol. 239 (1–4), 132–147. Wang, W., Gelder, P. H. A. J. M., Vrijling, J. K. & Ma, J. Forecasting daily streamflow using hybrid ANN models. J. Hydrol. 324, 383–399. Zealand, C. M., Burn, D. H. & Simonovic, S. P. Short term streamflow forecasting using artificial neural networks. J. Hydrol. 214, 32–48. Zeng, Z. & Wang, W. Advances in neural network research and application. Lecture Notes in Electrical Engineering 67, Springer, 449. Zhengfu, R. & Fernando, A. Use of an artificial neural network to capture the domain knowledge of a conventional hydraulic simulation model. J. Hydroinf. 9 (1), 15–24.
First received 18 January 2013; accepted in revised form 20 May 2013. Available online 10 July 2013
231
© IWA Publishing 2014 Journal of Hydroinformatics
|
16.1
|
2014
Impact of climate change on future stream flow in the Dakbla river basin Srivatsan V. Raghavan, Vu Minh Tue and Liong Shie-Yui
Srivatsan V. Raghavan (corresponding author) Vu Minh Tue Liong Shie-Yui Tropical Marine Science Institute, 18 Kent Ridge Road, 119227, Singapore E-mail: tmsvs@nus.edu.sg
ABSTRACT A systematic ensemble high-resolution climate modelling study over Vietnam was performed and future hydrological changes over the small catchment of Dakbla, Central Highland region of Vietnam, were studied. Using the widely used regional climate model WRF (Weather Research and Forecasting), future climate change over the period 2091–2100 was ascertained. The results indicate that surface temperature over Dakbla could increase by nearly 3.5 °C, while rainfall increases of more than 40% are likely. The ensemble hydrological changes suggest that the stream flow over the peak and post-peak rainfall seasons could experience a strong increase, suggesting risks of flooding, with an overall average annual increase of stream flow by 40%. These results have implications for water resources, agriculture, biodiversity and economy, and serve as useful findings for policy makers. Key words
| climate change, dynamical downscaling, hydrology, stream flow, Soil and Water Assessment Tool, WRF
INTRODUCTION
Climate change impacts are studied using the information derived by global climate models (GCMs) which still remain the primary tools in understanding climate and climate change at a global scale. However, it has been realized that to study sub-global scales, i.e. continental, regional or sub-regional scales, the GCMs do not provide detailed information of climate as it is observed in reality. This is largely attributable to the coarse resolution of the GCMs, making them unsuitable for regional impact studies (Giorgi ). The need for regional scale information is also emphasized by the fact that GCM climate projections do not allow regional examinations such as water balances or trends of extreme precipitation due to their coarse grid resolution. This clearly applies to hydrological impact studies over a river basin, as most of the river basins of the world are smaller than the typical resolution (c. 300 km) of the GCM. Such hydrological models therefore need to be driven by high-resolution data for better assessments of regional scale impacts. The GCMs do not simulate precipitation, one of the most important and sensitive climate parameters, highly variable in space and time, with adequate fine-scale details to be applied for regional-scale impact studies. When impact studies are performed, such as hydrology, regional-scale impact studies warrant high-resolution climate information. To this end, regional climate models (RCMs) (which are limited area models) at a higher resolution than that of GCMs (c. 10–50 km) are widely used in climate research. For hydrological studies it has become common to use the output of the regional climate models as input to hydrological models. Similar studies have been done by Hay et al. (); Sushama et al. (); Andersson et al. () and Graham et al. ().
This paper describes such a method where the climate outputs (precipitation and surface temperature) from a high-resolution regional climate model (Weather Research and Forecasting or WRF) are applied to a hydrological model (Soil and Water Assessment Tool, SWAT) (Arnold et al. ) to study changes in future stream flow over the small river catchment Dakbla, over the Central Highland region of Vietnam. Ensemble scenarios of climate change derived from the WRF model driven by three different GCMs are described, all under the A2 emission scenario.
doi: 10.2166/hydro.2013.165
232
S. V. Raghavan et al.
|
Climate change impact on future Dakbla river basin stream flow
Journal of Hydroinformatics
|
16.1
|
2014
Similar studies have also been documented by Hamlet & Lettenmaier () and Wei & Watkins ().
STUDY AREA
The Dakbla River is a small tributary of the Mekong river over the Lower Mekong Basin (LMB) in southeast Asia. The catchment has a total area of 2,560 km² from the upstream to Kon Tum gauging station (Figure 1) and lies over the Central Highland region of Vietnam. The catchment is covered mostly by tropical forests which are classified as tropical evergreen forest, young forest, mixed forest, planned forest and shrub. The climate of this region follows the pattern of the Central Highland region in Vietnam with an annual average temperature of c. 20–25 °C and a total annual average rainfall of c. 1,500–3,000 mm with high evapotranspiration rates of c. 1,000–1,500 mm per annum. There are two main seasons for the Central Highland region: a rainy season from May through to October (referred to as MJJASO) and a dry season from November through to April (referred to as NDJFMA). Flood season is around 1 month after the rainy season, because some buffer time is required to fill up the groundwater for basalt soil in this region after the earlier 6-month dry period. Due to the steep slope topography and heavy rainfall concentrations, stream flow in this region acquires a high velocity, especially during floods, causing massive damage to people and property. There is also a very high potential of constructing hydropower dams to store surface water for multipurpose needs: irrigation, electricity generation and flood control. Upper Kon Tum hydropower, with an installed capacity of 210 MW, has been under construction since 2009 (to be completed in 2014) in the upstream region of Dakbla river; at 110 km downstream, the Yaly hydropower plant has been constructed (installed capacity 720 MW; the second biggest hydropower project in Vietnam) which has been in operation since 2001. Forecasting stream flow mainly by using rainfall is therefore an important task in this region for both hydropower and irrigation.
Figure 1 | Map of Vietnam climate zones and location of Dakbla catchment. (a) Different climate zones and topography of Vietnam; and (b) Dakbla catchment and its meteorological and river gauging stations.
METHODS
Soil and water assessment tool (SWAT)

The rainfall–runoff model is a typical hydrological modelling tool that determines the runoff from the watershed basin resulting from rainfall falling on the basin. Precipitation is therefore an important input in deriving runoff in hydrological modelling. The SWAT model (Arnold et al. ), used for rainfall–runoff modelling in this study, was developed to quantify the runoff and concentration load due to the distributed precipitation, watershed topography, soil and land use conditions.

SWAT is a river basin scale model developed by the United States Department of Agriculture (USDA) Agriculture Research Service (ARS) in the early 1990s. It has been designed to work for large river basins over a long period of time. Its purpose is to quantify the impact of land management practices on water, sediment and agriculture chemical yields with varying soil, land use and management conditions. SWAT version 2005 with an ArcGIS user interface (ArcSWAT) was used in this study. There are two methods for estimating surface runoff in the SWAT model: the Green & Ampt () infiltration method, which requires precipitation input at a sub-daily scale, and the Soil Conservation Service (SCS) curve number procedure (USDA Soil Conservation Service ), which uses daily precipitation. The latter was selected in this study for simulations, since daily rainfall from the climate models was used as input to the SWAT model. The retention parameter is very important in the SCS method and is defined by the curve number (CN), a function of the soil permeability, land use and antecedent soil water conditions.

The SWAT model offers three options for estimating potential evapotranspiration (PET): Hargreaves (Hargreaves et al. ); Priestley–Taylor (Priestley & Taylor ) and Penman–Monteith (Monteith ). The Hargreaves method requires only maximum, minimum and average surface temperature. The Priestley–Taylor method needs solar radiation, surface temperature and relative humidity. The inputs for the Penman–Monteith method are the same as those for Priestley–Taylor; however, it also requires the wind speed. Due to limited available meteorological data for the site considered in this study, the Hargreaves method is applied.

In the SWAT model, the land area in a sub-basin is divided into what are known as hydrological response units (HRUs). HRUs are constructed through a unique combination of land use and soil information. One HRU is the total area of a sub-basin with a particular land use and soil characteristics. While individual fields with a specific land use and soil may be scattered throughout a sub-basin, these areas are lumped together to form a single HRU. HRUs are used in most SWAT applications since they simplify a simulation by putting together all similar soil and land use areas into one single response unit (Neitsch et al. ). All quantities such as surface runoff, PET, lateral flow, percolation, soil erosion, nitrogen and phosphorus are computed in each HRU.

Model set-up

Ensemble regional climate model outputs were used as input to the SWAT hydrological model to determine future hydro-climatic changes. These regional climate model outputs (surface temperature and precipitation) were derived using the WRF model, which was used to downscale the GCMs CCSM3.0, ECHAM5 and MIROC-medres, all forced under the Intergovernmental Panel on Climate Change (IPCC) A2 future greenhouse gas emission scenario. This regional climate model was initially driven by the ERA40 reanalysis, which refers to the 'true' climate period of 1981–1990. Later, the WRF model was also driven by the GCMs CCSM3.0, ECHAM5 and MIROC-medres for both the present day (1981–1990) and the future (2091–2100) climates. For simplicity, the simulations of WRF driven by ERA40 reanalysis and the GCMs CCSM3.0, ECHAM5 and MIROC-medres are referred to as WRF/ERA, WRF/CCSM, WRF/ECHAM and WRF/MIROC, respectively.

For comparison of WRF model simulated precipitation and surface temperature profiles, two sets of gridded observational datasets are used: CRU (Climatic Research Unit, University of East Anglia, UK, 0.5° data) and APHRODITE (Asian precipitation highly resolved observational data integration towards evaluation of water resources) (0.25° data) from the Japanese Meteorological Agency (JMA). In this paper, the latter is referred to as APH. These datasets have been documented by Mitchell & Jones () and Yatagai et al. (), respectively.

For hydrological simulations, daily precipitation data were obtained from three rainfall stations (Kon Tum, Dak Doa and Kon Plong; the former two lie inside and the latter outside the Dakbla catchment) and daily river stream flow data were taken from the gauging station at Kon Tum, all shown in Figure 1(b). Surface temperature, rainfall and discharge data have been acquired for the two periods 1980–1990 and 1995–2005, at a daily rate. For use in the SWAT model, a digital elevation model (DEM) with a 250 m resolution was obtained from the Department of Survey and Mapping (DSM), Vietnam. The land use map was obtained from the Forest Investigation and Planning Institute (FIPI) and the soil map was obtained from the Ministry of Agriculture and Rural Development (MARD), both in Vietnam (Figure 2).

Two benchmarking indices were used to assess the performance of the SWAT model: the Nash–Sutcliffe Efficiency (NSE) proposed by Nash & Sutcliffe () and the coefficient of determination (R²). The value of NSE ranges from minus infinity to 1 while R² ranges from 0 to 1, with 1 representing a perfect match for both indices. The NSE is considered to be the most appropriate relative error or goodness-of-fit measure available, due to its straightforward physical interpretation (Legates & McCabe ).

RESULTS AND DISCUSSION

Daily precipitation data were obtained from the three rainfall stations (Kon Plong, Kon Tum and Dak Doa) for the periods 1980–1990 (calibration) and 1995–2005 (validation). Daily maximum and minimum surface temperature data were also obtained from the local authority from the Kon Tum meteorological station for the same period. Daily river stream flow data were obtained from the Kon Tum gauging station at the downstream end of the Dakbla River. These data were used for both the calibration and validation processes in the stream flow simulations of the SWAT model. In the calibration part, the SWAT model was run in a daily time step for the period of 1980–1990 using observed rainfall and river stream flow at Kon Tum gauging station, with the first year 1980 used as the spin-up period. The validation was performed for the 10-year period of 1996–2005 to ensure that the model was well calibrated. These 10-year periods were chosen for calibration and validation because of data availability; longer-period data spanning 30 years were not available from station sources.
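The two benchmarking indices named above can be written out in a few lines. The following is a minimal sketch, not the ArcSWAT implementation: the function names are illustrative, and R² is taken here as the squared Pearson correlation between observed and simulated flow, which is an assumption about the definition used.

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((O - S)^2) / sum((O - mean(O))^2)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0.0:
        raise ValueError("observed series has zero variance")
    return 1.0 - residual / spread


def r_squared(observed, simulated):
    """Coefficient of determination as the squared Pearson correlation."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    cov = sum((o - mean_obs) * (s - mean_sim)
              for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov * cov / (var_obs * var_sim)
```

A simulation that only reproduces the observed mean gives NSE = 0, and one that is worse than the mean gives a negative value, which is why NSE is unbounded below. R² on its own is insensitive to a constant bias or scaling of the simulated flow, so the two indices are usually reported together.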
Figure 2 | SWAT model spatial inputs: (a) DEM; (b) land use; and (c) soil map of Dakbla river basin.
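The SCS curve number procedure selected in the Methods for surface runoff can be illustrated with the standard daily form of the equation. This is a sketch under stated assumptions: the initial abstraction is taken as 0.2 times the retention parameter (the conventional value), the function name is illustrative, and the daily adjustment of the curve number for antecedent soil water that SWAT performs is left out.

```python
def scs_runoff_mm(rain_mm, curve_number, ia_ratio=0.2):
    """Daily surface runoff depth (mm) from the SCS curve number equation.

    S  = 25400 / CN - 254              retention parameter in mm
    Ia = ia_ratio * S                  initial abstraction (assumed 0.2 * S)
    Q  = (P - Ia)^2 / (P - Ia + S)     for P > Ia, otherwise 0
    """
    if not 0.0 < curve_number <= 100.0:
        raise ValueError("curve number must lie in (0, 100]")
    retention = 25400.0 / curve_number - 254.0
    initial_abstraction = ia_ratio * retention
    if rain_mm <= initial_abstraction:
        return 0.0  # all rainfall is abstracted, no surface runoff
    excess = rain_mm - initial_abstraction
    return excess ** 2 / (excess + retention)
```

For example, with CN = 80 the retention parameter is 63.5 mm, so a 100 mm rain day yields roughly 50.5 mm of runoff, whereas with CN = 50 a 50 mm rain day produces none. This strong dependence on CN is consistent with Cn2 being ranked as the most sensitive parameter in Table 2.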
A sensitivity analysis was conducted prior to calibrating the hydrological model. This is a method that analyzes the sensitivity of the different model parameters (Table 1) that influence the hydrological model performance. It serves to filter out those model parameters that do not have a significant influence on the model results, and thereby also reduces the number of parameters required in the auto-calibration method.

Traditional methods of sensitivity analysis have been classified by Saltelli et al. (). They are: (1) local method (Melching & Yoon ); (2) integration of local to global method using random one-factor-at-a-time (OAT) proposed by Morris (); and (3) global methods such as Monte Carlo and Latin-Hypercube (LH) simulation (McKay et al. ; McKay ). By studying the advantages and disadvantages of each of the above methods, van Griensven & Meixner () developed the LH-OAT method, which performs LH sampling followed by OAT sampling. This method samples the full range of all parameters using the LH design, along with the precision of OAT sampling to ensure that the changes in each model output can be attributed to the changed parameter. In this study, the LH-OAT design was coupled to the ArcSWAT 2005 model for the sensitivity analysis module. In the SWAT model there are 25 parameters that are sensitive to stream flow, six parameters sensitive to sediment transport and nine other parameters sensitive to water quality. In this study, sensitivity analysis was performed for the 25 stream flow parameters listed in Table 1, from which the 11 most sensitive parameters were then selected (Table 2) for performing the auto-calibration.

Since the ArcSWAT model offers both manual and auto-calibration, calibration is applied to the most sensitive parameters to yield the optimal set of values for the model parameters, which results in the minimum discrepancy between the observed and the simulated river discharge data. Parameter solution method (ParaSol) is a built-in auto-calibration model in the ArcSWAT 2005 version (van Griensven & Meixner ) which was used
Table 1 | SWAT parameters sensitive to stream flow

| Group | Parameter | Description | Unit |
| --- | --- | --- | --- |
| Soil | Sol_Alb | Moist soil albedo | – |
| Soil | Sol_Awc | Available water capacity | mm mm⁻¹ |
| Soil | Sol_K | Saturated hydraulic conductivity | mm h⁻¹ |
| Soil | Sol_Z | Depth to bottom of second soil layer | mm |
| Subbasin | Tlaps | Temperature lapse rate | °C km⁻¹ |
| HRU | Epco | Plant uptake compensation factor | – |
| HRU | Esco | Soil evaporation compensation factor | – |
| HRU | Canmx | Maximum canopy storage | mm H2O |
| HRU | Slsubbsn | Average slope length | m |
| Routing | Ch_N2 | Manning's n value for the main channel | – |
| Routing | Ch_K2 | Effective hydraulic conductivity in main channel alluvium | mm h⁻¹ |
| Groundwater | Alpha_Bf | Baseflow alpha factor | days |
| Groundwater | Gw_Delay | Groundwater delay | days |
| Groundwater | Gw_Revap | Groundwater 'revap' coefficient | – |
| Groundwater | Gwqmn | Threshold depth of water in the shallow aquifer for return flow to occur | mm H2O |
| Groundwater | Revapmn | Threshold depth of water in the shallow aquifer for 'revap' to occur | mm H2O |
| Management | Biomix | Biological mixing efficiency | – |
| Management | Cn2 | Initial SCS runoff curve number for moisture condition II | – |
| General data basin | Sftmp | Snowfall temperature | °C |
| General data basin | Smfmn | Minimum melt rate for snow during year | mm H2O °C⁻¹ day⁻¹ |
| General data basin | Surlag | Surface runoff lag time | days |
| General data basin | Timp | Snow pack temperature lag factor | – |
| General data basin | Smfmx | Maximum melt rate for snow during year | – |
| General data basin | Blai | Maximum potential leaf area index for land cover/plant | – |
| General data basin | Slope | Slope | – |
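The LH-OAT design described in the text can be sketched in a few lines. This is a simplified illustration rather than the ArcSWAT module: each one-at-a-time step is taken from the Latin-Hypercube base point (not cumulatively), the relative partial effect is normalised by the mean of the two outputs, and the function names, the default perturbation fraction and the toy model are all assumptions made for the example.

```python
import random


def latin_hypercube(n_points, n_params, rng):
    """Return n_points samples in the unit cube, one per stratum per axis."""
    columns = []
    for _ in range(n_params):
        # One random value inside each of the n_points equal-width strata.
        strata = [(k + rng.random()) / n_points for k in range(n_points)]
        rng.shuffle(strata)
        columns.append(strata)
    return [[columns[j][i] for j in range(n_params)] for i in range(n_points)]


def lh_oat_ranking(model, bounds, n_points=10, fraction=0.05, seed=0):
    """Rank parameters by mean relative partial effect, most sensitive first.

    model  : callable taking a dict {name: value} and returning a number
    bounds : dict {name: (low, high)} giving each parameter's range
    """
    rng = random.Random(seed)
    names = list(bounds)
    totals = {name: 0.0 for name in names}
    for unit_point in latin_hypercube(n_points, len(names), rng):
        # Scale the unit-cube sample to the parameter ranges.
        base = {}
        for name, u in zip(names, unit_point):
            low, high = bounds[name]
            base[name] = low + u * (high - low)
        base_out = model(base)
        # OAT loop: change one parameter at a time around the LH point.
        for name in names:
            low, high = bounds[name]
            step = fraction * (high - low)
            if base[name] + step > high:
                step = -step  # step inward to stay inside the range
            trial = dict(base)
            trial[name] = base[name] + step
            out = model(trial)
            scale = (abs(out) + abs(base_out)) / 2.0
            if scale > 0.0:
                totals[name] += abs(out - base_out) / scale / fraction
    means = {name: total / n_points for name, total in totals.items()}
    return sorted(names, key=lambda name: means[name], reverse=True)


def toy_model(params):
    """Hypothetical stand-in for a SWAT run: 'a' matters most, 'c' not at all."""
    return 10.0 * params["a"] + 1.0 * params["b"] + 0.0 * params["c"] + 5.0
```

Calling `lh_oat_ranking(toy_model, {"a": (0.0, 1.0), "b": (0.0, 1.0), "c": (0.0, 1.0)})` ranks `a` ahead of `b` ahead of `c`. In a real application the model call would be a full hydrological simulation scored against observed flow, so the cost is n_points × (number of parameters + 1) model runs.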
Table 2 | Sensitivity analysis ranking of 11 most sensitive parameters in SWAT model to stream flow

| Sensitivity analysis order | Parameter | Description | Parameter range | Initial value | Optimal value |
| --- | --- | --- | --- | --- | --- |
| 1 | Cn2 | Initial SCS runoff curve number for moisture condition II | 35–98 | 35 | 96.78 |
| 2 | Ch_K2 | Effective hydraulic conductivity in main channel alluvium | –0.01 to 500 | 0 | 150 |
| 3 | Sol_Awc | Available water capacity | 0–1 | 0.22 | 0.44 |
| 4 | Sol_K | Saturated hydraulic conductivity | 0–2,000 | 1.95 | 1,873 |
| 5 | Ch_N2 | Manning's n value for the main channel | –0.01 to 0.3 | 0.014 | 0.073 |
| 6 | Alpha_Bf | Baseflow alpha factor | 0–1 | 0.048 | 0.027 |
| 7 | Surlag | Surface runoff lag time | 1–24 | 4 | 1 |
| 8 | Esco | Soil evaporation compensation factor | 0–1 | 0 | 0.66 |
| 9 | Gwqmn | Threshold depth of water in the shallow aquifer for return flow to occur | 0–5,000 | 0 | 1,107 |
| 10 | Gw_Revap | Groundwater 'revap' coefficient | 0.02–0.2 | 0.02 | 0.17 |
| 11 | Gw_Delay | Groundwater delay | 0–500 | 31 | 215 |
in this study for auto-calibration of the SWAT model. This ParaSol method has also been documented by van Griensven & Meixner (). Using the above methodology, the SWAT model was calibrated to ensure a robust performance before undertaking stream flow simulations using the regional climate model output. The R² and the NSE index were used as benchmarking indices to assess the goodness-of-fit of the SWAT hydrological model.

The calibration and validation graphical results for Dakbla River are shown in Figures 3 and 4 at (a) daily and (b) monthly scales, respectively. It is clearly seen in the calibration that the simulated peak-to-peak discharge and the low flow agree with the observed data better on the monthly scale than on the daily scale, owing to the higher variability at daily scales. The validation plots indicate that the trend of observed data is being captured by the simulated flow, although some of the peak-to-peak discharges are underestimated compared to observed flow. The values of R² and NSE shown in Table 3 indicate that the comparison indices for both calibration and validation are around 0.5 on the daily scale and around 0.7 on the monthly scale. These values indicate a good performance of the SWAT model (Santhi et al. ) and that the hydrological model was well calibrated using the ParaSol method. Since the model was able to reproduce the pattern of the observed stream flow well enough, the next stage, the application of the regional climate model derived data (precipitation and surface temperature) to stream flow simulations, is discussed, as the calibration and validation stages used only the station data for precipitation and surface temperature.

Before discussing the stream flow results of the SWAT model, it is useful to examine the WRF model simulated climate, to highlight the value of applying RCM results for hydrological applications. The comparison of WRF model simulated profiles of present-day surface temperature over Dakbla region and the gridded observation datasets CRU and APH is displayed in Figure 5. It is notable that, even between the CRU and APH observations, CRU exhibits hotter profiles than the APH dataset. Nevertheless, the WRF model results show a reasonable simulation, exhibiting a good pattern of temperature gradients as well as their magnitudes. The simulations of WRF/ECHAM, WRF/CCSM and WRF/MIROC also show similar profiles to that of WRF/ERA. Figure 6 shows the WRF model precipitation distribution over Dakbla catchment for the present-day climate compared against the two gridded observational datasets. The WRF/ECHAM shows overestimation in rainfall over this region, while WRF/CCSM and WRF/MIROC share similar distributions to that of WRF/ERA and APH. It can be stressed here that while surface temperatures are more homogeneous and easier to simulate, precipitation is rather difficult to simulate well. Detailed evaluation of the model performance was carried out (not discussed
Figure 3 | Calibration of the SWAT model, top: daily scale and bottom: monthly scale.
Figure 4 | Validation of the SWAT model, top: daily scale and bottom: monthly scale.
Table 3 | Statistical indices of SWAT Dakbla river basin model calibration and validation: R² and NSE

| Period | Time scale | R² | NSE |
| --- | --- | --- | --- |
| Calibration (1981–1990) | Daily | 0.58 | 0.53 |
| Calibration (1981–1990) | Monthly | 0.72 | 0.74 |
| Validation (1996–2005) | Daily | 0.45 | 0.43 |
| Validation (1996–2005) | Monthly | 0.73 | 0.66 |

here), but is outwith the scope of this paper. These results are merely a bird's eye view of regional climate simulations over a small region such as that of Dakbla. The climate model results are shown to substantiate the use of model-derived climate variables for further use in the SWAT hydrological simulations.

The precipitation and surface temperature variables from the RCM outputs of WRF/ERA were initially used for stream flow simulation, followed by the outputs of WRF/CCSM, WRF/ECHAM and WRF/MIROC. The rationale for doing so is the same as that of the regional climate simulations: to test the performance of the true climate first and then that of the GCMs. The reasonably good results from the WRF model for the present-day climate over this region imply that they are suitable for use in the rainfall–runoff model. The daily scale precipitation and temperature derived from the RCMs were bi-linearly interpolated to the respective rainfall stations (Kon Plong, Kon Tum, Dak Doa) and meteorological station (Kon Tum). The SWAT model usually takes measured rainfall data from gauged stations as input,
Figure 5 | Annual surface temperature over Dakbla during 1981–1990 (in °C): (a) CRU; (b) APH; (c) WRF/ERA; (d) WRF/CCSM; (e) WRF/ECHAM; and (f) WRF/MIROC.
Figure 6 | Annual daily precipitation over Dakbla during 1981–1990 (in mm day⁻¹): (a) CRU; (b) APH; (c) WRF/ERA; (d) WRF/CCSM; (e) WRF/ECHAM; and (f) WRF/MIROC.
then distributes its values to all of its sub-catchments. An interpolation is therefore required to compute the value at a station location when using gridded data. The bilinear interpolation method is an extension of linear interpolation for interpolating functions of two variables on a regular grid; it is used here to extract the precipitation value at each station location from the gridded data derived from the RCM output. The same approach is applied for the surface temperature.

Before the future stream flow results are discussed, it is also helpful to assess the future changes in the mean surface temperature and precipitation over the Dakbla region. Figure 7 displays the future response (delta change) at the annual scale for the Dakbla region under scenario A2 for three different models: (a) WRF/CCSM; (b) WRF/ECHAM; and (c) WRF/MIROC, for surface temperature and precipitation. It can be seen that WRF/CCSM projects the least surface temperature increase compared to WRF/ECHAM and WRF/MIROC. The change in temperature from these three model scenarios ranges between 2.6 and 3.7 °C. Precipitation is also expected to increase annually by 20–50%, with the largest (smallest) changes simulated by WRF/MIROC (WRF/CCSM).
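The bilinear interpolation step used to move gridded RCM output to the station locations can be sketched as follows. This is an illustration only: it assumes a regular grid with ascending longitude and latitude axes and a field stored as field[lat index][lon index], and the function names are hypothetical.

```python
from bisect import bisect_right


def _cell_index(axis, value):
    """Index k of the grid cell [axis[k], axis[k + 1]] that contains value."""
    if not axis[0] <= value <= axis[-1]:
        raise ValueError("point lies outside the grid")
    return min(bisect_right(axis, value) - 1, len(axis) - 2)


def bilinear(lons, lats, field, lon, lat):
    """Interpolate a gridded field to the point (lon, lat)."""
    i = _cell_index(lons, lon)
    j = _cell_index(lats, lat)
    # Fractional position of the point inside the enclosing grid cell.
    tx = (lon - lons[i]) / (lons[i + 1] - lons[i])
    ty = (lat - lats[j]) / (lats[j + 1] - lats[j])
    # Interpolate along longitude on the two bounding latitude rows,
    # then along latitude between those two results.
    south = field[j][i] * (1.0 - tx) + field[j][i + 1] * tx
    north = field[j + 1][i] * (1.0 - tx) + field[j + 1][i + 1] * tx
    return south * (1.0 - ty) + north * ty
```

Applied once per day and per station, this turns each gridded daily field into a station-like time series that SWAT can read in place of gauge data. Because the result is a weighted average of the four surrounding grid values, it smooths local extremes, which is worth keeping in mind when interpreting daily precipitation peaks.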
Figure 7 | Future response of (1) surface temperature and (2) daily precipitation over Dakbla: (a) WRF/CCSM; (b) WRF/ECHAM; and (c) WRF/MIROC.
Figure 8 shows the stream flow simulated by the SWAT model for the baseline (1981–1990) (black) and future (2091–2100) (red; see colour version online) periods, derived from the inputs (precipitation, temperature) from the three different RCM integrations (WRF/CCSM, WRF/ECHAM and WRF/MIROC), all using the same A2 scenario.

It can be seen that, over an annual scale, the stream flow simulated by the WRF/CCSM A2 scenario shows an increase of 38% in the future, WRF/ECHAM A2 indicates an increase of 37% and WRF/MIROC shows the highest increase of 46%. The low flow period during the dry season NDJFMA also indicates a slight increase from all datasets. This finding is important because drought is one of the severe threats to this Central Highland region of Vietnam and has strong implications due to the high potential for hydropower.

In order to assess the characteristics of extreme rainfall and stream flow time series, a boxplot graph is shown in Figure 9 for both rainfall and discharge at the Kon Tum station. Overall, the WRF/ECHAM results indicate more rainfall compared to the other two RCM integrations, suggesting higher stream flow. The maximum discharge is seen in the future stream flow for the WRF/ECHAM driven simulation, at 600 m³ s⁻¹.
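The annual and seasonal percentage changes quoted above follow from a simple comparison of mean flows between the two periods. The sketch below shows one way to compute them. It assumes monthly mean flows keyed by calendar month, uses the MJJASO and NDJFMA season definitions from the Study Area, averages the monthly means without weighting by month length, and uses hypothetical function names.

```python
WET_MONTHS = (5, 6, 7, 8, 9, 10)    # MJJASO, rainy season
DRY_MONTHS = (11, 12, 1, 2, 3, 4)   # NDJFMA, dry season


def mean_flow(monthly_flow, months):
    """Unweighted mean of the monthly mean flows over the given months."""
    return sum(monthly_flow[m] for m in months) / len(months)


def percent_change(baseline, future):
    """Relative change of future with respect to baseline, in percent."""
    if baseline == 0.0:
        raise ValueError("baseline flow must be non-zero")
    return 100.0 * (future - baseline) / baseline


def seasonal_change(baseline_monthly, future_monthly):
    """Percent change in mean flow for the wet season, dry season and year."""
    periods = (
        ("MJJASO", WET_MONTHS),
        ("NDJFMA", DRY_MONTHS),
        ("annual", WET_MONTHS + DRY_MONTHS),
    )
    return {
        label: percent_change(mean_flow(baseline_monthly, months),
                              mean_flow(future_monthly, months))
        for label, months in periods
    }
```

As a purely illustrative check with made-up numbers (not values from this study), a baseline of 100 m³ s⁻¹ in every month and a future of 150 m³ s⁻¹ in the wet months and 110 m³ s⁻¹ in the dry months gives changes of 50%, 10% and 30% for MJJASO, NDJFMA and the year, respectively.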
Figure 8 | Baseline and future stream flow at Kon Tum station for three RCMs.
On a daily scale study of extremes, the probability distribution function compares the rainfall and stream flow for the three different RCM results for the baseline and future periods (Figure 10). All three RCM results agree that future stream flow has a higher frequency of high discharge (>100 m³ s⁻¹) compared to the baseline. For the extreme case, discharges of more than 480 m³ s⁻¹ also occur with a higher frequency in the future stream flow. This must be taken very seriously, as very high discharge is critical for river operation management.

Figure 9 | Box plot for baseline and future for three RCMs at Kon Tum station, top: precipitation and bottom: stream flow.
Figure 10 | Probability distribution function for baseline and future for three RCMs at Kon Tum station, top: precipitation and bottom: stream flow.

CONCLUSIONS

In this study, regional climate model outputs of precipitation and surface temperature were applied to a hydrological model (SWAT), calibrated using the ParaSol method, and its simulated discharges were compared to their observed counterparts. The performance of the model using station data rainfall has been found satisfactory; the model-derived rainfall was therefore also used to assess stream flow simulation over the current and future climate. Using the RCM outputs, the present-day and future stream flows were also simulated. Results show that the future stream flow over the Dakbla river basin is expected to increase, especially during the rainy season, which has implications not only for flood mitigation measures but also for water resources management, hydropower and agriculture. Extreme values of rainfall and discharges indicate that necessary steps should be taken for appropriate river operation management.

However, much more work is required to improve confidence in these results. Further higher resolution simulation (5–10 km) of the RCMs may be required to obtain more credible estimates of present-day and future precipitation. Since this result has been obtained only from a few RCM simulations of future climates, it is recommended to obtain an ensemble estimate of future climate change by downscaling more GCMs or by using perturbed initial conditions to the RCM to derive multiple estimates of climate. The hydrological simulations using the results of the derived ensemble climate simulations will add to the confidence of such a hydrological impact study.

Further developments in the RCM model physics and dynamics might also yield improvements in the climate simulations, yielding a better quality of RCM outputs which in turn might improve the hydrological simulations. As to some uncertainties from the hydrological model, improved spatial data such as the DEM might help to improve the stream flow simulations, since the current version was mapped in 2005. Other than the ParaSol method which was used for calibration, a few other auto-calibration methods which are coupled to the SWAT-CUP model (SWAT Calibration Uncertainty Procedures, Abbaspour et al. ) might yield more possible outcomes which could help to understand a wider range of uncertainties. However, the applications of these methods are comprehensive exercises that entail more sensitivity studies and experimentation; they are as such beyond the scope of this paper, yet provide possible future research work. The research findings from this study are still useful as they yield some 'new' information that might yield clues to the wider and larger changes to come. This study is one of the first detailed RCM studies undertaken over this region to provide preliminary possible future climate change information to policy makers. As these several uncertainties will be constrained once improvements in the modelling are achieved, those plausible wider and larger changes could be used for further assessments of future changes.
REFERENCES

Abbaspour, K. C., Yang, J., Maximov, I., Siber, R., Bogner, K., Mieleitner, J., Zobrist, J. & Srinivasan, R. Modelling hydrology and water quality in the pre-alpine/alpine Thur watershed using SWAT. Journal of Hydrology 333, 413–430.
Andersson, L., Wilk, L., Todd, M., Hughes, D., Earle, A., Kniveton, D., Layberry, R. & Savenije, H. Impact of climate change and development scenarios on flow patterns in the Okavango River. Journal of Hydrology 331 (1–2), 43–57.
Arnold, J. G., Srinivasan, R., Muttiah, R. S. & Williams, J. R. Large area hydrologic modeling and assessment, part I: Model development. Journal of American Water Resources Association 34 (11), 73–89.
Giorgi, F. Simulations of regional climate using a limited area model nested in a general circulation model. Journal of Climate 3 (9), 941–963.
Graham, L. P., Hagemann, S., Jaun, S. & Beniston, M. On interpreting hydrological change from regional climate models. Climatic Change 81, 97–122.
Green, W. H. & Ampt, G. A. Studies on soil physics, Part I: The flow of air and water through soils. Journal of Agricultural Science 4, 1–24.
Hamlet, A. F. & Lettenmaier, D. P. Long-range climate forecasting and its use for water management in the Pacific Northwest region of North America. Journal of Hydroinformatics 2, 163–182.
Hargreaves, G. L., Hargreaves, G. H. & Riley, J. P. Agriculture benefits for Senegal River basin. Journal of Irrigation and Drainage Engineering 111 (2), 113–124.
Hay, L. E., Clark, M. P., Wilby, R. L., Gutowski, W. J., Leavesley, G. H., Pan, Z., Arritt, R. W. & Takle, E. S. Use of regional climate model output for hydrological simulations. Journal of Hydrometeorology 3, 571–590.
Legates, D. R. & McCabe Jr, G. J. Evaluating the use of 'goodness-of-fit' measure in hydrologic and hydroclimatic model validation. Water Resources Research 35 (1), 233–241.
McKay, M. D. Sensitivity and uncertainty analysis using a statistical sample of input values. In: Uncertainty Analyses (Y. Ronen, ed.). CRC Press, Boca Raton, FL, pp. 145–186.
McKay, M. D., Beckman, R. J. & Conover, W. J. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21 (2), 239–245.
Melching, C. S. & Yoon, C. G. Key sources of uncertainty in QUAL2E model of Passaic river. Journal of Water Resources Planning and Management 122 (2), 105–113.
Mitchell, T. D. & Jones, P. D. An improved method of constructing a database of monthly climate observations and associated high-resolution grids. International Journal of Climatology 25, 693–712.
Monteith, J. L. Evaporation and the Environment. Symposia of the Society for Experimental Biology. Cambridge University Press, London, pp. 205–234.
Morris, M. D. Factorial sampling plans for preliminary computation experiments. Technometrics 33, 161–174.
Nash, J. E. & Sutcliffe, J. V. River flow forecasting through conceptual models. Part 1: A discussion of principles. Journal of Hydrology 10 (3), 282–290.
Neitsch, S. L., Arnold, J. G., Kiniry, J. R., Srinivasan, R. & Williams, J. R. Soil and Water Assessment Tool Input/Output File Documentation version 2005. Grassland, Soil and Water Research Laboratory, Agricultural Research Service, Temple, Texas.
Priestley, C. H. B. & Taylor, R. J. On the assessment of surface heat flux and evaporation using large scale parameters. Monthly Weather Review 100, 81–92.
Saltelli, A., Chan, K. & Scott, E. M. (eds) Sensitivity Analysis. Wiley, New York.
Santhi, C., Arnold, J. G., Williams, J. R., Dugas, W. A., Srinivasan, R. & Hauck, L. M. Validation of the SWAT model on a large river basin with point and nonpoint sources. Journal of the American Water Resources Association 37 (5), 1169–1188.
Sushama, L., Laprise, R., Caya, D., Frigon, A. & Slivitzky, M. Canadian RCM projected climate-change signal and its sensitivity to model errors. International Journal of Climatology 26 (15), 2141–2159.
USDA Soil Conservation Service SCS National Engineering Handbook, Section 4: Hydrology. Washington, DC.
van Griensven, A. & Meixner, T. ParaSol (Parameter Solutions), PUB-IAHS Workshop Uncertainty Analysis in Environmental Modelling, July 2004. Lugano, Italy.
van Griensven, A. & Meixner, T. Methods to quantify and identify the sources of uncertainty for river basin water quality models. Water Science and Technology 53 (1), 51–59.
Wei, W. & Watkins, D. W. Probabilistic streamflow forecasts based on hydrologic persistence and large-scale climate signals in central Texas. Journal of Hydroinformatics 13, 760–774.
Yatagai, A., Kamiguchi, K., Arakawa, O., Hamada, A., Yasutomi, N. & Kitoh, A. APHRODITE: Constructing a long-term daily gridded precipitation dataset for Asia based on a dense network of rain gauges. Bulletin of American Meteorological Society 93, 1401–1415.
First received 15 October 2012; accepted in revised form 23 April 2013. Available online 25 June 2013