Google Earth Engine: Analyzing Precipitation Data Training Release 2.0.0
International Research Institute for Climate and Society, Earth Institute, Columbia University
Pietro Ceccato, Valerie Pietsch, Yung-Jen Chen, Benjamin Marconi, Carolyn Balk, Alice Stevenson
May 25, 2017
CONTENTS
1.1 Definition
1.2 Interpretation
1.3 Access
1.4 Analyses
1.5 Case study – Brazil
1.6 Exercise
1.7 Validation of Satellite Estimation Precipitation Data
1.8 Summary
1.9 Reference
Google Earth Engine: Using CHIRPS Data to analyze precipitation Training
Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS)
Why was it developed? CHIRPS is a 30+ year quasi-global rainfall dataset. Through Google Earth Engine, users can access CHIRPS data and analyze precipitation at different scales.
What can the tool be used for?
● Assessing precipitation at national, regional and local scales
● Evaluating the seasonality of precipitation
● Identifying regions where precipitation has increased or decreased (precipitation anomalies)
What can CHIRPS not be used for?
● Predicting precipitation in the coming season or year
● Investigating disasters caused by factors other than the distribution of precipitation
1.1 Definition
Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) is a 30+ year quasi-global rainfall dataset. The “CHIRPS” tool is an interactive map built on Google Earth Engine that helps users display and analyze rainfall variations over various time frames and spatial areas (Funk et al., 2014).
1.2 Interpretation
CHIRPS was created in collaboration with scientists at the U.S. Geological Survey (USGS) Earth Resources Observation and Science (EROS) Center in order to deliver reliable, up-to-date, and more complete datasets for a number of early warning objectives. Spanning 50°S-50°N (and all longitudes) and covering 1981 to February 2016, CHIRPS combines 0.05° resolution satellite imagery with in-situ station data to create gridded rainfall time series for trend analysis and seasonal drought monitoring.
1.3 Access
The Google Earth Engine – CHIRPS precipitation tool can be modified to output a time series of precipitation for a user-defined region. This region can either be drawn with the geometry tool or selected as a specific country.
To use a drawn region, draw the region of interest with the geometry tool at the top left of the map, then import it with the following code:

    // Import region
    var region = geometry;

Alternatively, select a country of interest. The code below selects Brazil as an example:

    // Identify country
    var Brazil = ee.FeatureCollection('ft:1tdSwUL7MVpOauSgRzqVTOwdfy17KDbw-1d9omPw')
        .filter(ee.Filter.eq('Country', 'Brazil'));
    Map.addLayer(Brazil);
Next, load the CHIRPS data.

    var CHIRPS = ee.ImageCollection('UCSB-CHG/CHIRPS/PENTAD');

The CHIRPS data used here runs from 1981-01-01 to 2016-02-27.

    var precip = CHIRPS.filterDate('1981-01-01', '2016-02-27');

Chart the “Full Precipitation Time Series”:

    var TS5 = ui.Chart.image.series(precip, Brazil, ee.Reducer.mean(), 1000, 'system:time_start')
        .setOptions({
          title: 'Precipitation Full Time Series',
          vAxis: {title: 'mm/pentad'}
        });
    print(TS5);

Chart one year:

    var precip1year = CHIRPS.filterDate('2015-01-01', '2015-12-31');
    var TS1 = ui.Chart.image.series(precip1year, Brazil, ee.Reducer.mean(), 1000, 'system:time_start')
        .setOptions({
          title: 'Precipitation 1-Year Time Series',
          vAxis: {title: 'mm/pentad'}
        });
    print(TS1);

Map the results:

    // Mean pentad precipitation for 2015 (BrazilPrecip1) and for the full period (BrazilPrecip), clipped to Brazil
    var BrazilPrecip1 = precip1year.mean().clip(Brazil);
    var BrazilPrecip = precip.mean().clip(Brazil);
    Map.addLayer(BrazilPrecip, {'min': 0, 'max': 40, 'palette': "CCFFCC,00CC66,006600"});
    Map.addLayer(BrazilPrecip1, {'min': 0, 'max': 40, 'palette': "CCFFCC,00CC66,006600"});

The complete script is available at: https://code.earthengine.google.com/81ed65d800fcf4eb61df43f104674c3a
1.4 Analyses
Users can map and calculate average precipitation over a 1-year or multi-year period, or any time period they specify. They can also chart time series of mean precipitation over any specified region, and compute precipitation anomalies. In climate science, the term “anomaly” means the difference between the value of a quantity and its climatological mean value. A “monthly anomaly” is the difference between the original monthly value of a quantity in a given month and the monthly climatological value for that month of the year.
$$ r'_{ij} = r_{ij} - \frac{1}{N} \sum_{j=1}^{N} r_{ij} $$

Here, for months i and years j, r'_{ij} is the monthly anomaly, r_{ij} is the original monthly value, and the remainder of the equation is the calculation of the monthly climatology (which is subtracted from the original monthly values), with N the number of years in the record.

Monthly anomalies (of precipitation, for example) indicate the difference (positive or negative) between a monthly precipitation value and its “normal” value for that month of the year, in the original units of the quantity (e.g., mm/month, or monthly mean mm/day). Calculating anomalies is one way to remove the annual cycle from a time series. This is useful in some types of analysis, such as calculating the correlation between two time series. If the annual cycle is left in the time series, a high correlation between two variables may simply reflect that shared annual cycle, which can mask other variations of greater interest to the analysis.

In Google Earth Engine, the script to compute monthly anomalies for a single location, a point with longitude -60.22 (60° 13’ 12” West) and latitude -1.73 (1° 43’ 48” South), is as follows:
    // Imports (2 entries in the Code Editor):
    //   chirps: ImageCollection "CHIRPS: Climate Hazards Group InfraRed Precipitation with Station data (version 2.0 final)"
    //   point: Point (-60.22, -1.73)
    // Equivalent explicit declarations, assuming the pentad collection loaded in section 1.3:
    var chirps = ee.ImageCollection('UCSB-CHG/CHIRPS/PENTAD');
    var point = ee.Geometry.Point(-60.22, -1.73);

    // Monthly climatology: one mean image for each calendar month
    var means = ee.ImageCollection(ee.List.sequence(1, 12)
        .map(function(m) {
          return chirps.filter(ee.Filter.calendarRange(m, m, 'month'))
              .mean()
              .set('month', m);
        }));

    // Define the time period (in this example, January 2012 to January 2015)
    var start = ee.Date('2012-01-01');
    var months = ee.List.sequence(0, 36);
    var dates = months.map(function(index) {
      return start.advance(index, 'month');
    });
    print(dates);

    // For each month, take the mean of that month's images and subtract the
    // climatology for the same calendar month. The result is an ImageCollection
    // with one anomaly image for each month.
    var byMonth = ee.ImageCollection.fromImages(
        dates.map(function(date) {
          var beginning = date;
          var end = ee.Date(date).advance(1, 'month');
          var mean = chirps.filterDate(beginning, end)
              .mean()
              .set('date', date);
          var month = ee.Date(date).getRelative('month', 'year').add(1);
          return mean.subtract(
              means.filter(ee.Filter.eq('month', month)).first())
              .set('date', date);
        }));
    print(byMonth);

    // Map out results
    Map.addLayer(ee.Image(byMonth.first()));

    // Chart the anomalies
    var chart = ui.Chart.image.series({
      imageCollection: byMonth,
      region: point,
      reducer: ee.Reducer.mean(),
      scale: 10000,
      xProperty: 'date'
    });
    print(chart);

This script is also available at: https://code.earthengine.google.com/6b7a6b37a7f5c76ee079bd328060d2ed
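The anomaly arithmetic itself can be checked outside Earth Engine on a small list of numbers. The following is a minimal plain-JavaScript sketch, not part of the original tool: the function names (monthlyClimatology, monthlyAnomalies) and the record layout are illustrative assumptions, and the rainfall values are made up.

```javascript
// Plain-JavaScript sketch of the monthly anomaly calculation (no Earth Engine required).
// Each record is {year, month, value}; this layout is an assumption for illustration.

// Mean value for each calendar month (1-12) across all years in the series.
function monthlyClimatology(series) {
  const sums = new Array(12).fill(0);
  const counts = new Array(12).fill(0);
  for (const rec of series) {
    sums[rec.month - 1] += rec.value;
    counts[rec.month - 1] += 1;
  }
  // Months with no data get NaN rather than a misleading zero.
  return sums.map((s, i) => (counts[i] > 0 ? s / counts[i] : NaN));
}

// Anomaly = original monthly value minus the climatology for that calendar month.
function monthlyAnomalies(series) {
  const clim = monthlyClimatology(series);
  return series.map(rec => ({
    year: rec.year,
    month: rec.month,
    anomaly: rec.value - clim[rec.month - 1],
  }));
}

// Made-up monthly totals (mm) for two calendar months over three years.
const series = [
  {year: 2012, month: 1, value: 300},
  {year: 2013, month: 1, value: 340},
  {year: 2014, month: 1, value: 260},
  {year: 2012, month: 7, value: 90},
  {year: 2013, month: 7, value: 110},
  {year: 2014, month: 7, value: 100},
];
const anomalies = monthlyAnomalies(series);
console.log(anomalies);
```

Because each calendar month is compared only with its own long-term mean, the anomalies for any one calendar month sum to zero over the record, which is what removes the annual cycle.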
1.5 Case study – Brazil
Taking Brazil as an example, this tool shows the potential to analyze precipitation over various time frames. The first chart and map show the mean pentad precipitation averaged over a region, in this case the whole of Brazil.
The second chart and map show the mean pentad precipitation averaged over 2015 alone.
The third chart shows the precipitation anomalies for a single point with coordinates -60.22 (longitude 60.22° West) and -1.73 (latitude 1.73° South) for the period January 2012 to January 2015.
1.6 Exercise
Estimating rainfall variations in space and time is an important aspect of drought early warning and environmental monitoring. Disaster managers in Brazil would like to assess the seasonality of precipitation in order to take early action before a drought or flood occurs.

Question: In 2015, when was the dry season? When was the wet season?

Answer: The dry season ran from July to October, with around 5-15 mm/pentad of precipitation. The wet season ran from January to April, with around 25-50 mm/pentad.
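If the pentad values behind the 1-year chart are available as date/value pairs, the seasonality can also be read off programmatically rather than by eye. The sketch below is a hedged illustration in plain JavaScript: the record layout, the function names (monthlyMeans, extremeMonths) and the sample values are assumptions made up for this example and do not reproduce the 2015 Brazil chart.

```javascript
// Sketch: find the driest and wettest calendar months from pentad records.
// Each record is {date: 'YYYY-MM-DD', value: mm/pentad}; this layout is assumed.

// Average the pentad values that fall in each calendar month.
function monthlyMeans(pentads) {
  const sums = new Array(12).fill(0);
  const counts = new Array(12).fill(0);
  for (const p of pentads) {
    const m = Number(p.date.slice(5, 7)) - 1; // month index 0-11 parsed from the date string
    sums[m] += p.value;
    counts[m] += 1;
  }
  // Months without any pentads are marked null and skipped later.
  return sums.map((s, i) => (counts[i] > 0 ? s / counts[i] : null));
}

// Return the month numbers (1-12) with the lowest and highest mean.
function extremeMonths(means) {
  let driest = -1;
  let wettest = -1;
  means.forEach((v, i) => {
    if (v === null) return;
    if (driest === -1 || v < means[driest]) driest = i;
    if (wettest === -1 || v > means[wettest]) wettest = i;
  });
  return {driest: driest + 1, wettest: wettest + 1};
}

// Made-up sample values, two pentads for each of four months.
const sample = [
  {date: '2015-01-01', value: 40}, {date: '2015-01-06', value: 44},
  {date: '2015-04-01', value: 30}, {date: '2015-04-06', value: 26},
  {date: '2015-08-01', value: 6}, {date: '2015-08-06', value: 8},
  {date: '2015-10-01', value: 14}, {date: '2015-10-06', value: 12},
];
const season = extremeMonths(monthlyMeans(sample));
console.log(season);
```

A single driest or wettest month is only a starting point; a season is usually identified as a run of consecutive months below or above a chosen threshold.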
1.7 Validation of Satellite Estimation Precipitation Data
No satellite dataset is perfect, but some datasets are more accurate than others. To validate a satellite dataset, we compare it with station data and use statistical analysis to quantify how well the two products agree. The station data come from rain gauges, which provide (in most cases) the most accurate precipitation measurement. To determine which satellite dataset is the most reliable, it is necessary to compute the following statistics.
Correlation is measured on a scale between -1 and +1 and describes the extent to which two sets of paired values are related in a linear fashion. In other words, it measures how closely the two sets of values vary together. A value of -1 is a perfect negative linear correlation, +1 is a perfect positive linear correlation, and 0 is no correlation. As a rough guide, values between -0.35 and +0.35 are treated here as not statistically significant, while values closer to -1 or +1 are; whether a given correlation is significant also depends on the number of paired values.
Mean error is the mean of the sample-by-sample differences between two sets of values, where there are n paired values x and y. In other words, it measures the average difference between the reference measurement (rain gauge) and the estimate (satellite data). This statistic indicates whether one set of values is generally larger or smaller than the comparison set. Put differently, the mean error can tell you if satellite precipitation estimates tend to over- or under-estimate the values measured by a rain gauge in Sao Paulo. Values can be either positive or negative: a value far from zero signifies greater error, while a value close to zero signifies less error.
Root mean square error is the square root of the mean of the squared differences between two sets of values, where there are n paired values x and y. This statistic provides an absolute (never negative) measure of the difference between the two sets of values. A smaller value signifies less error.
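The three statistics described above can be sketched in a few lines of plain JavaScript. This is an illustration under stated assumptions, not the code used to produce the results in this section: the function names are hypothetical, the sample values are made up, and the sign convention (satellite minus gauge, so a positive mean error means over-estimation) is an assumption.

```javascript
// Validation statistics for n paired values: gauge (reference) and satellite (estimate).
// Assumed sign convention: error = satellite - gauge.

function mean(values) {
  return values.reduce((a, b) => a + b, 0) / values.length;
}

// Pearson linear correlation coefficient, between -1 and +1.
function correlation(x, y) {
  const mx = mean(x);
  const my = mean(y);
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < x.length; i++) {
    const dx = x[i] - mx;
    const dy = y[i] - my;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Mean of the sample-by-sample differences; the sign shows over- or under-estimation.
function meanError(gauge, satellite) {
  return mean(satellite.map((s, i) => s - gauge[i]));
}

// Square root of the mean squared difference; never negative.
function rmse(gauge, satellite) {
  return Math.sqrt(mean(satellite.map((s, i) => (s - gauge[i]) ** 2)));
}

// Made-up monthly totals (mm), for illustration only.
const gauge = [10, 20, 30, 40];
const satellite = [12, 18, 33, 45];

console.log('correlation:', correlation(gauge, satellite));
console.log('mean error:', meanError(gauge, satellite));
console.log('RMSE:', rmse(gauge, satellite));
```

Mean error and root mean square error answer different questions: positive and negative differences cancel in the mean error, so a dataset can have a mean error near zero and still have a large root mean square error.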
For example, comparing the CHIRPS dataset with the rain gauge station located in Sao Paulo (longitude 46.9° West, latitude 23.6° South) for the period January 1998 to December 2012, we obtain the following results:

Dataset    Correlation    Mean Error    Root Mean Square Error
CHIRPS     0.9703923      2.037247      26.72042
However, other satellite precipitation estimates are also available, such as:

1. CMAP: The CPC Merged Analysis of Precipitation is a pentad (5-day) and monthly analysis of global precipitation that merges rain gauge data with precipitation estimates from infrared and microwave satellite algorithms. CMAP has a temporal range from Jan 1979 to near current, with a spatial resolution of 2.5° latitude/longitude. Data can be found in the IRI Data Library under “NOAA NCEP CPC Merged Analysis monthly latest ver2”.

2. CMORPH: The Climate Prediction Center Morphing Technique is a high-resolution precipitation analysis derived from low-orbiter satellite microwave observations. Its features are transported via spatial propagation information derived from geostationary infrared satellite data. It covers 7 Dec 2002 to present. CMORPH has three-hourly and daily datasets, with a spatial domain of 0°E-360°E (global in longitude), 60°S-60°N. The data are updated daily, at a spatial resolution of 0.25° latitude/longitude. Data can be found in the IRI Data Library under “NOAA NCEP CPC CMORPH”.

3. GPCP: The Global Precipitation Climatology Project synthesizes multiple precipitation sources, including microwave satellite estimates, infrared satellite estimates and multiple rain gauge observation datasets. GPCP has a temporal range from Jan 1979 to Oct 2009, with a monthly temporal resolution and a spatial resolution of 2.5° latitude/longitude. Data can be found in the IRI Data Library under “NASA GPCP V2p2 satellite-gauge”.

4. TRMM: The Tropical Rainfall Measuring Mission produces precipitation estimates by combining microwave and infrared satellite precipitation estimates so that the product can be rescaled to monthly rain gauge scales. TRMM has a temporal range from Jan 1998 to near current, with a temporal resolution of 1 day. The estimate has a spatial domain of 180°W-180°E, 50°S-50°N and a resolution of 0.25° latitude/longitude. The data are updated monthly and can be found in the IRI Data Library under “NASA GES-DAAC TRMM_L3 TRMM_3B42 v7 daily”.
Comparing the available datasets, we can see that CHIRPS is the most accurate for the location of Sao Paulo: it has the highest correlation, the mean error closest to zero and the lowest root mean square error.

Dataset    Correlation    Mean Error    Root Mean Square Error
CHIRPS     0.9703923      2.037247      26.72042
CMAP       0.9408509      5.535028      38.02375
TRMM       0.9317468      -11.7647      40.77058
GPCP       0.9256447      -2.836775     40.83574
CMORPH     0.8336335      39.64173      72.90929
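The ranking in the Sao Paulo comparison can be cross-checked with a short plain-JavaScript sketch. The numbers are copied from the table above; the variable names are illustrative, and ranking the mean error by its absolute value is an assumption about how to compare biases of opposite sign.

```javascript
// Statistics copied from the Sao Paulo comparison table.
const results = [
  {dataset: 'CHIRPS', correlation: 0.9703923, meanError: 2.037247, rmse: 26.72042},
  {dataset: 'CMAP', correlation: 0.9408509, meanError: 5.535028, rmse: 38.02375},
  {dataset: 'TRMM', correlation: 0.9317468, meanError: -11.7647, rmse: 40.77058},
  {dataset: 'GPCP', correlation: 0.9256447, meanError: -2.836775, rmse: 40.83574},
  {dataset: 'CMORPH', correlation: 0.8336335, meanError: 39.64173, rmse: 72.90929},
];

// Sort a copy of the table by a score (smaller is better) and return the dataset names.
function rankBy(rows, score) {
  return [...rows].sort((a, b) => score(a) - score(b)).map(r => r.dataset);
}

const byCorrelation = rankBy(results, r => -r.correlation); // highest correlation first
const byBias = rankBy(results, r => Math.abs(r.meanError)); // mean error closest to zero first
const byRmse = rankBy(results, r => r.rmse); // lowest RMSE first

console.log('By correlation:', byCorrelation.join(', '));
console.log('By |mean error|:', byBias.join(', '));
console.log('By RMSE:', byRmse.join(', '));
```

CHIRPS comes first on all three criteria, while the order of the remaining datasets depends on which statistic is used, so the choice of statistic should match the application.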
Spatial and Temporal Resolution Problems
When comparing different datasets and deciding which one is the most appropriate for your application, be aware that the accuracy of satellite precipitation estimates varies with spatial and temporal resolution. There is less variability at the monthly time step than at the dekad (10-day) and daily time steps. If the time step is too fine, there can be too much “noise”. Satellite precipitation datasets also often under-estimate large precipitation amounts. The example below, from Ethiopia, shows that high rainfall values (over 150 mm) tend to be under-estimated (Dinku et al. 2007). The under- or over-estimation can also be seasonal, with some satellite products more accurate in certain seasons than in others.
Ethiopia Example: The monthly datasets at low spatial resolution follow a line of best fit, even if the satellite under-estimates the amount of rain for high precipitation values. There is also less variation at the 2.5° spatial resolution than at higher spatial resolutions (e.g. 1° or 0.25°). The example below shows how much noise you can expect from satellite precipitation estimates compared with rain gauge data.
Ethiopia Example: At the dekad (10-day) time step and higher spatial resolution (1° by 1°), there is more noise than at the monthly time step and lower spatial resolution. In general, the longer the time step and the coarser the spatial resolution, the more accurate the satellite precipitation estimates, as shown in the following statistical analysis.
As the time step increases and the spatial resolution becomes coarser, the dataset agrees more closely with the rain gauge data.
1.8 Summary
The Google Earth Engine CHIRPS code is intended for assessing historical precipitation and monitoring potential drought conditions. It allows the user to access precipitation data from 1 January 1981 to near present, at any location specified as a point, a drawn geometry or a country.
1.9 Reference
Dinku, T., Ceccato, P., Grover-Kopec, E.K., Lemma, M., Connor, S.J., and Ropelewski, C.F., 2007, Validation of satellite rainfall products over East Africa's complex topography: International Journal of Remote Sensing, 28(7), 1503-1526.

Funk, C.C., Peterson, P.J., Landsfeld, M.F., Pedreros, D.H., Verdin, J.P., Rowland, J.D., Romero, B.E., Husak, G.J., Michaelsen, J.C., and Verdin, A.P., 2014, A quasi-global precipitation time series for drought monitoring: U.S. Geological Survey Data Series 832, 4 p., http://dx.doi.org/10.3133/ds832