THE IMPACT OF SPATIAL AND TEMPORAL RESOLUTIONS IN TROPICAL SUMMER RAINFALL DISTRIBUTION: PRELIMINARY RESULTS

The abundance or lack of rainfall affects people's lives and activities. Because rainfall is a major component of the global hydrological cycle (Chokngamwong & Chiu, 2007), accurate representations of it at various spatial and temporal scales are crucial for many decision-making processes. Climate models show a warmer and wetter climate due to increases in greenhouse gases (GHG). However, the models' resolutions are often too coarse to be directly applicable at the local scales that are useful for mitigation purposes. Hence disaggregation (downscaling) procedures are needed to transfer the coarse-scale products to higher spatial and temporal resolutions. The aim of this paper is to examine the changes in the statistical parameters of rainfall at various spatial and temporal resolutions. The TRMM Multi-satellite Precipitation Analysis (TMPA) 0.25 degree, 3 hourly gridded rainfall data for one summer are aggregated to 0.5, 1.0, 2.0 and 2.5 degree and to 6, 12, 24 hourly, pentad (five-day) and monthly resolutions. The probability density functions (PDF) and cumulative distribution functions (CDF) of rain amount at these resolutions are computed and modeled as a mixed distribution. The distributions are compared using the Kolmogorov-Smirnov (KS) test, both for the mixed and the marginal distributions, and are shown to be distinct. The marginal distributions are fitted with Lognormal and Gamma distributions, and the Gamma distributions are found to fit much better than the Lognormal.


INTRODUCTION
Rainfall is an important climatic factor. It not only has a large influence on agriculture, transportation and many other human activities, but is also a key parameter in ecology, meteorology and hydrology. Extreme rainfall events result in large-scale floods and flash floods, which become major hydrological disasters. A lack of rainfall results in droughts that affect food production and other environmental conditions. Therefore, an investigation of rainfall distribution is needed both for scientific research, such as climatic change, hydrologic circulation and the ecological environment, and for operational decision making and other applications, such as agricultural irrigation and the prevention and reduction of natural disasters.
Remote sensing data are characterized by large spatial coverage and frequent satellite revisit times. Rain gauge measurements provide continuous temporal coverage but are limited to the sampling area of each gauge. Analyses of gauge networks are used as input to hydrologic models; however, they are limited by their spatial coverage (Chokngamwong & Chiu, 2007). With the increasing number of meteorological satellites and advances in sensor technology and in techniques for merging satellite and gauge products, satellite rainfall products have been widely applied in research and operations.
Since the launch of the first meteorological satellite, TIROS-1, satellite-based global rainfall maps have been developed. With the launch of the Tropical Rainfall Measuring Mission (TRMM) and the Global Precipitation Measurement (GPM) mission, the data collection provided the impetus and satellite rainfall estimation techniques flourished. Kidd (2001) reviewed techniques for rainfall measurement and introduced rainfall climatology. Adler, Negri, Keehn, and Hakkarinen (1993) proposed an approach to estimate mean monthly rainfall by combining GOES infrared data and SSM/I microwave measurements. The main operational techniques for rainfall estimation are CMORPH (CPC MORPHing technique) from the NOAA Climate Prediction Center, the NASA Goddard Profiling algorithm (GPROF) for TRMM, and the merged techniques for the GPM mission from NASA and JAXA.
The spatial and temporal rainfall distribution can be characterized by the sensor and the associated sampling strategy. Different spatial and temporal scales in rainfall monitoring can affect the resulting rainfall distributions. As indicated by Varikoden, Preethi, Samah, and Babu (2011), knowledge of the spatiotemporal distribution of high-intensity rain events would be immensely useful for planners, architects and disaster managers in undertaking appropriate risk reduction strategies. In this study, the rainfall distribution at different spatial and temporal scales is investigated with the TRMM 3 hourly rainfall products.

TRMM 3B42V7 Data
TRMM 3B42V7 is a post-real-time product computed with the TRMM Multi-satellite Precipitation Analysis (TMPA) algorithm, developed by the National Aeronautics and Space Administration (NASA) Goddard Space Flight Center (GSFC); it can be downloaded from https://giovanni.sci.gsfc.nasa.gov/giovanni/. The TRMM 3B42 data have a 0.25 degree spatial resolution and a 3 hourly temporal resolution, covering the latitudinal band from about 50°S to 50°N for the period from 1997 to the present (Kummerow, Barnes, Kozu, Shiue, & Simpson, 1998).
The TRMM/TMPA 3B42 rainfall estimates are produced in four stages: (1) the microwave precipitation estimates are calibrated and combined, (2) infrared precipitation estimates are created using the calibrated microwave precipitation, (3) the microwave and IR estimates are combined, and (4) rescaling to monthly data is applied. Each precipitation field is best interpreted as the precipitation rate effective at the nominal observation time (Huffman, et al., 2007). The sources of its passive microwave satellite precipitation estimates include the TRMM Microwave Imager (TMI), Special Sensor Microwave Imager (SSMI), Special Sensor Microwave Imager/Sounder (SSMIS) (3B42V7 only), Advanced Microwave Scanning Radiometer-EOS (AMSR-E), Advanced Microwave Sounding Unit-B (AMSU-B), and Microwave Humidity Sounder (MHS) (Chen, et al., 2013). The IR data are from the National Climatic Data Center (NCDC) and the Climate Prediction Center (CPC), and are calibrated with the microwave data according to the algorithm. The product also incorporates the latest version 4 of the Global Precipitation Climatology Centre (GPCC) full-gauge analysis from 1998 to 2010 and the GPCC monitoring gauge analysis since 2010 (Moazami, et al., 2013). TRMM 3B42 data cover the tropical area from 50°S to 50°N and from 180°W to 180°E with an original pixel size of 0.25° × 0.25°, so an original dataset contains 1440 × 400 pixels. The TRMM 3B42 data set used in this study is the latest version (3B42V7) for the period from June 1st to August 31st, 2000. In this study, the precipitation rates of the raw 3 hourly measurements are aggregated to 6, 12, 24 hourly, pentad and monthly scales on the original 0.25 degree grid and on 1.0 and 2.5 degree grids, and stored in separate HDF format files.
Figure 1 shows the rainfall amount of the study area at a spatial resolution of 0.25 degree. Higher rainfall amounts are concentrated close to the equator, in the Inter Tropical Convergence Zone (ITCZ). The ITCZ is a belt of low pressure that circles the Earth, generally near the equator, where the trade winds of the Northern and Southern Hemispheres come together. It is characterized by convective activity, which often generates vigorous thunderstorms over large areas; therefore, large amounts of rainfall appear there. Other areas of high rainfall are the maritime continent and, over land, the Amazon. In addition, regions at the western boundaries of the Pacific and Atlantic show the storm tracks.
The second step is to aggregate these merged data into different temporal and spatial resolutions and to store all the data with the same temporal and spatial resolution in one file in NetCDF format. A time dimension is created to describe the rainfall amount of one pixel at a given time point. The size of the time dimension is determined by the temporal resolution:
$D_T = \frac{24\,N_{day}}{T}$ (1)
where $D_T$ is the size of the temporal dimension, $N_{day}$ is the number of days in the study period, and $T$ is the temporal resolution of the satellite rainfall data in hours. For example, a dataset with a temporal resolution of 3 hours and a spatial resolution of 0.25 × 0.25 degree has a time dimension size of 92 × 8 = 736. After spatial and temporal aggregation, 3-dimensional datasets of rainfall rate, denoted by $R_{t,x,y}$, are obtained for the subsequent statistical analysis. For a given temporal and spatial resolution, the value is equal to the rainfall amount ($RA_{t,x,y}$) per hour.
The third step is to analyze the characteristics of, and the differences between, the datasets with different resolutions.
The new rainfall rate of one pixel after temporal aggregation is:
$R_{T,x,y} = \frac{1}{n} \sum_{i=1}^{n} R_{t_i,x,y}, \quad n = \frac{T}{3}$ (2)
where $T$ is the new temporal resolution in hours; the $t_i$ are the $n$ consecutive 3 hourly time steps beginning at $t_0$, the time at which the monitoring of the rainfall rate on the pixel started; and $x$ and $y$ are the latitude/longitude coordinates of the pixel, which are not treated as variables in this case. When the temporal resolution is fixed, $t$ is not a variable, and the rainfall rate of the pixel after aggregation on the spatial scale is:
$R_{S,x,y} = \frac{1}{n^2} \sum_{i=1}^{n^2} R_{x_i,y_i}, \quad n = \frac{S}{0.25}$ (3)
where $R_{x,y}$ is the rainfall rate of pixel $(x, y)$; $S$ is the new spatial resolution in degrees; the $(x_i, y_i)$ are the $n^2$ original 0.25 degree pixels inside the aggregated pixel; and $(x_0, y_0)$ are the original coordinates of the pixel before aggregation.
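The aggregation steps described above can be illustrated with a minimal sketch. It assumes that aggregation is a plain average of rain rates over non-overlapping time windows and pixel blocks; the function names and the sample values are hypothetical and are not taken from the study's processing code.

```python
def time_dimension_size(n_days, resolution_hours):
    """Number of time steps for a study period at a given temporal resolution."""
    return n_days * 24 // resolution_hours


def aggregate_time(rates, n):
    """Average consecutive groups of n time steps (n = T / 3 for 3 hourly input)."""
    if len(rates) % n != 0:
        raise ValueError("number of time steps must be a multiple of n")
    return [sum(rates[i:i + n]) / n for i in range(0, len(rates), n)]


def aggregate_space(grid, n):
    """Average non-overlapping n x n blocks of a 2-D grid (n = S / 0.25)."""
    rows, cols = len(grid), len(grid[0])
    if rows % n != 0 or cols % n != 0:
        raise ValueError("grid dimensions must be multiples of n")
    out = []
    for r0 in range(0, rows, n):
        out_row = []
        for c0 in range(0, cols, n):
            block = [grid[r][c]
                     for r in range(r0, r0 + n)
                     for c in range(c0, c0 + n)]
            out_row.append(sum(block) / (n * n))
        out.append(out_row)
    return out


def rain_fraction(values):
    """Fraction of elements with a rain rate greater than zero."""
    return sum(1 for v in values if v > 0) / len(values)


# Hypothetical 3 hourly rain rates (mm/hr) for one pixel over one day.
series_3h = [0.0, 0.0, 1.2, 0.0, 0.0, 0.0, 0.6, 0.0]
series_12h = aggregate_time(series_3h, 4)  # n = 12 / 3

# Hypothetical 4 x 4 patch of 0.25 degree pixels averaged to 0.5 degree (n = 2).
patch = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 4.0],
]
patch_half_degree = aggregate_space(patch, 2)
```

In this toy case the mean rain rate is unchanged by averaging, while the fraction of raining elements rises (from 2 of 8 to 2 of 2 in time, and from 2 of 16 to 2 of 4 in space), which is the qualitative behaviour the study examines.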

Mixed Distribution
There are many situations in which the cumulative distribution function contains jumps at some points but is otherwise continuous. Such a distribution is neither discrete nor continuous but rather a combination of a discrete component and a continuous component (Kedem, Chiu, & North, Estimation of Mean Rain Rate: Application to Satellite Observations, 1990). Rain rate is an example of a mixed distribution: a pixel has $RA_{t,x,y} = 0$ with probability $1-p$ and $RA_{t,x,y} > 0$ with probability $p$. The cumulative distribution function $G(r)$ of $RA$ can be presented as a convex combination of two increasing functions $H$ and $F$, where $H$ is a unit step function at zero and $F$ is a continuous distribution function (Kedem, Chiu, & North, Estimation of Mean Rain Rate: Application to Satellite Observations, 1990):
$G(r) = (1-p)\,H(r) + p\,F(r), \quad F(r) = \int_0^r f(u)\,du$ (4)
where $f(r)$ is the density of $R$ conditional on $R > 0$. The generalized density $g(r)$ corresponding to $G(r)$ can be expressed as follows (Aitchison & Brown, 1963):
$g(r) = \begin{cases} 0, & r < 0 \\ 1-p, & r = 0 \\ p\,f(r), & r > 0 \end{cases}$ (5)
Hence the probability of detecting no rain is $1-p$, and $p$ is the probability of detecting rain in a space/time element. $f(r)$ is also referred to as the marginal distribution.
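As a minimal sketch of the mixed distribution described above, the rain probability p and an empirical mixed CDF can be estimated from a sample of rain rates. The sketch assumes the convex-combination form G(r) = (1 - p) H(r) + p F(r), with H a unit step at zero and F the empirical CDF of the raining values; the function name and the sample are hypothetical.

```python
from bisect import bisect_right


def mixed_cdf(sample):
    """Return (p, G), with G(r) = (1 - p) * H(r) + p * F(r) estimated from the sample."""
    wet = sorted(v for v in sample if v > 0)  # raining elements only
    p = len(wet) / len(sample)                # probability of rain

    def G(r):
        if r < 0:
            return 0.0                        # H(r) = 0 and F(r) = 0 below zero
        # Empirical CDF of the raining values (the marginal distribution).
        F = bisect_right(wet, r) / len(wet) if wet else 0.0
        return (1.0 - p) + p * F              # H(r) = 1 for r >= 0

    return p, G


# Hypothetical rain rates (mm/hr): six dry and four raining elements.
sample = [0.0, 0.0, 0.0, 0.5, 1.0, 2.0, 0.0, 0.0, 4.0, 0.0]
p, G = mixed_cdf(sample)

wet_values = [v for v in sample if v > 0]
conditional_mean = sum(wet_values) / len(wet_values)  # mean rate given rain
unconditional_mean = sum(sample) / len(sample)        # equals p * conditional_mean
```

The jump of size 1 - p at r = 0 is the discrete component; the remainder of G is carried by the marginal distribution of the raining values.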

KS test
In statistics, the Kolmogorov-Smirnov test (K-S test or KS test) is a nonparametric test of the equality of continuous, one-dimensional probability distributions that can be used to compare a sample with a reference probability distribution (one-sample K-S test), or to compare two samples (two-sample K-S test). It is named after Andrey Kolmogorov and Nikolai Smirnov (Hazewinkel, 2001).
In this study, the two-sample KS test is used to test whether the distributions of samples from two different resolutions differ. In this case, the Kolmogorov-Smirnov statistic is
$D_{m,n} = \sup_r \left| F_{1,m}(r) - F_{2,n}(r) \right|$ (6)
where $F_{1,m}(r)$ and $F_{2,n}(r)$ are the empirical distribution functions of the first and the second sample respectively, and sup is the supremum function. The null hypothesis that the two samples come from the same distribution is rejected, i.e. the two sample distributions are considered different, if
$D_{m,n} > c \sqrt{\frac{n+m}{nm}}$ (7)
where $c = 1.36$ at the 95% level and $n$ and $m$ are the numbers of samples in each dataset. KS tests are performed separately on the total (mixed) distribution and on the marginal (raining, $R > 0$) distribution.
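The two-sample comparison can be sketched in a few lines of standard-library Python. This is an illustrative implementation of the statistic D and of the threshold c[(n+m)/(nm)]^(1/2) with c = 1.36, not the code used in the study; the function names and sample values are hypothetical.

```python
import math
from bisect import bisect_right


def ks_two_sample(a, b):
    """Largest absolute difference between the two empirical distribution functions."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    m, n = len(a_sorted), len(b_sorted)
    d = 0.0
    # Both empirical CDFs are step functions that change only at data points,
    # so it is enough to evaluate the difference at every observed value.
    for r in a_sorted + b_sorted:
        f1 = bisect_right(a_sorted, r) / m
        f2 = bisect_right(b_sorted, r) / n
        d = max(d, abs(f1 - f2))
    return d


def ks_threshold(m, n, c=1.36):
    """Critical value c * sqrt((n + m) / (n * m)); c = 1.36 for the 95% level."""
    return c * math.sqrt((n + m) / (n * m))


# Hypothetical raining-only rain rates (mm/hr) at two resolutions.
sample_a = [0.1, 0.2, 0.3, 0.4]
sample_b = [0.35, 0.5, 0.6, 0.7]

d_stat = ks_two_sample(sample_a, sample_b)
d_crit = ks_threshold(len(sample_a), len(sample_b))
distinct = d_stat > d_crit  # True means the two distributions are judged different
```

With only four values per sample the threshold is large, so even clearly shifted samples may not be judged distinct; with the many thousands of pixels in a gridded product the threshold becomes very small. Because the test assumes continuous distributions, the many exact zeros in the mixed distribution produce ties, which is one reason to examine the raining-only distribution separately.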

Chi-Square Goodness of Fit Test
Parametric fits are also tested using the Lognormal and Gamma distributions. This study uses the chi-square ($\chi^2$) goodness-of-fit test to compare the distributions, i.e. to check how "close" the observed values are to those that would be expected under the fitted model (Department of Statistics and Data Science, n.d.). The $\chi^2$ test estimates the difference between the observed data and the values expected from the theoretical distribution. If the data are grouped into $k$ categories ($i = 1, 2, 3, \ldots, k$), the observed frequency in each class is denoted as $O_i$ and the expected frequency from the hypothesized distribution as $E_i$ (Cho, Bowman, & North, 2004), then the chi-square test statistic is of the form:
$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$ (8)
A smaller $\chi^2$ value means that the data are closer to the hypothesized distribution.
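A small sketch of the goodness-of-fit computation is given here for the lognormal case only, because its CDF can be written with the standard-library error function; a gamma fit would need an incomplete gamma function from a numerical library. The moment fit on log(r), the bin edges and the sample values are assumptions made for illustration and are not the study's settings.

```python
import math
from bisect import bisect_right


def lognormal_cdf(r, mu, sigma):
    """CDF of a lognormal distribution with log-mean mu and log-standard deviation sigma."""
    if r <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(r) - mu) / (sigma * math.sqrt(2.0))))


def fit_lognormal(wet):
    """Estimate (mu, sigma) as the mean and standard deviation of log(r)."""
    logs = [math.log(v) for v in wet]
    mu = sum(logs) / len(logs)
    var = sum((x - mu) ** 2 for x in logs) / len(logs)
    return mu, math.sqrt(var)


def binned_counts(values, edges):
    """Observed frequency O_i in each bin [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bisect_right(edges, v) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    return counts


def expected_counts(edges, n, cdf):
    """Expected frequency E_i in each bin under the fitted distribution."""
    return [n * (cdf(edges[i + 1]) - cdf(edges[i])) for i in range(len(edges) - 1)]


def chi_square(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over bins with a positive expected frequency."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical raining-only rain rates (mm/hr) and bin edges.
wet = [0.2, 0.3, 0.4, 0.6, 0.7, 0.9, 1.1, 1.5, 2.4, 3.8]
edges = [0.0, 0.5, 1.0, 2.0, 4.0, math.inf]

mu, sigma = fit_lognormal(wet)
observed = binned_counts(wet, edges)
expected = expected_counts(edges, len(wet), lambda r: lognormal_cdf(r, mu, sigma))
chi2_lognormal = chi_square(observed, expected)
```

Repeating the same steps with the CDF of a second candidate distribution and comparing the two statistics gives the kind of comparison described in the text: the smaller value indicates the closer fit.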

Figure 2. Rain probability in different temporal and spatial resolutions
Figure 2 shows p (the rain probability) at different temporal resolutions for spatial resolutions of 0.25, 1.0 and 2.5 degrees respectively. As the temporal and spatial resolutions become coarser, the fraction of raining pixels increases and the probability of non-raining pixels decreases at each spatial resolution, i.e. the larger the observation interval, the higher the probability of detecting rain events.
Figure 3 to Figure 5 show the PDF of the log rainfall rate at different temporal resolutions for spatial resolutions of 0.25° × 0.25°, 1.0° × 1.0°, and 2.5° × 2.5°. The rainfall amount is binned at 0.42 mm/hr, 1.67 mm/hr and 6.67 mm/hr respectively. It should be noted that the low rain rate data (< 0.1 mm/hr) deviate substantially from the lognormal distribution. At these low rain rates, the radiometer data used in the algorithm may not be able to discriminate between low rain and cloud signatures. The peak of the Lognormal distribution coincides with the data at high resolutions, but shifts toward the low rain rates as the resolutions are degraded. In the mid-high rain rate range, the lognormal distribution underestimates the observed distribution, but it overestimates at the high rain end (log r ~ 3). The 12-hourly data fit the lognormal distribution according to the Chi square test. For the 3-hourly data, the lower half of the range does not fit the lognormal well, and neither does the upper half of the range for the daily data.
Another parametric fit to rainfall data in this study is the gamma distribution. Cho et al. (2004) compared the fit of TRMM data to the lognormal and gamma distributions and showed that in general the lognormal (gamma) distribution fits better in dry (wet) regions. Our data are fitted to these distributions and the fits are compared using the Chi square statistics.
(2) Since the total rainfall amount at any temporal resolution is the same, the conditional rainfall rate decreases according to the relation CRR = RA/RF, where CRR, RA and RF are the conditional rain rate, rain amount and rain frequency. As the temporal resolution decreases, the average conditional rainfall rates become smaller, as shown in Figure 12. (3) As the resolution decreases, the variance decreases and the data become less dispersed (lower coefficient of variation, CV). (4) The CDF increases more rapidly.
With the decrease of spatial resolution: (1) The fraction of raining pixels increases and more non-rainfall events are missed, which means an increase in rainfall frequency. This conclusion is also supported by Figure 12. (2) Because the pixel area is enlarged as the spatial resolution decreases, the conditional rainfall rate is not comparable between different spatial resolutions. (3) The difference between samples from two spatial resolutions increases as the two resolutions differ more.
The study also compares the goodness of fit of the lognormal and gamma distributions. The lognormal fits are in general better over dry areas than over wet areas, consistent with the results of Cho et al. (2004). The gamma distribution performs better than the lognormal distribution both over the entire data set and at the pixel scale. However, the sample data contain only one season (JJA) of the year 2000, so for the daily, pentad and monthly resolutions there are not enough samples to draw an accurate conclusion.
While the datasets are all distinct, there are similarities among some datasets, such as [3hr, 1] and [12hr, 0.25], whose rain fraction (p) and CV are quite similar. This can be understood in terms of Taylor's frozen field hypothesis (Gupta & Waymire, 1987), i.e. the field is advected by the mean field, and hence the properties averaged over 3 hours at 1° are the same as those at 0.25° averaged over 12 hours.
The results herein are based on only one season of data. Our next step is to examine all the available TMPA data and other datasets. Further studies are needed to examine the contributions of the rain fraction and the conditional rain rate to the rain amount at various resolutions, and to examine the scales for which the data are compatible with certain parametric models. The introduction of an autocorrelation scale will also be needed to refine the downscaling model, because rainfall fields have been argued to have multi-scale properties.

Figure 1. Map of rainfall amount per hour from June 1st to August 31st with a spatial resolution of 0.25° × 0.25°

Figure 3. PDF of the log value of rainfall rate at different temporal resolutions with a spatial resolution of 0.25° × 0.25°

Figure 6 to Figure 8 compare the Chi square statistics for our dataset for one summer season.

Figure 6. χ² test of lognormal distribution and gamma distribution in temporal resolution of hourly with a spatial resolution of 0.25° × 0.25°

Figure 9. PDF of Lognormal and Gamma distributions fitted to data with temporal resolutions of 3 and 12 hourly and a spatial resolution of 0.25° × 0.25°
Figure 9 shows the PDFs of the Lognormal and Gamma distributions. The 3 hourly, 0.25 degree data are grouped into 100 bins with an interval of 0.8 mm/hr. Most of the rainfall amount values are concentrated in the first few bins. More than 85% are in the first bin, which means that most of the rainfall amounts (per hour) are smaller than 0.8 mm/hr. The PDF of the Gamma distribution fits the observed data well in the first two bins and is smaller than the observed PDF in the following bins. The PDF of the Lognormal distribution fails to fit the first bin and is larger than the observed data in the

Figure 10. Probability plot of Gamma and Lognormal distributions fitted to data with a temporal resolution of 3 hourly and a spatial resolution of 0.25° × 0.25°

Figure 12. Mean and median of rainfall frequency (RF) and conditional rainfall rate (CRR) at different resolutions.

Table 2. Same as Table 1, except for a spatial resolution of 1.0° × 1.0°

Table 3. Same as Table 1, except for a spatial resolution of 2.5° × 2.5°
Table 4 shows the D values of the KS test between different temporal resolutions, as well as d, which refers to the threshold $c\sqrt{(n+m)/(nm)}$. Similar tables are constructed at other spatial resolutions (not shown). According to the results of the KS tests, all D values are larger than the threshold d. This means that all the rainfall distributions are distinct.