INTERCOMPARISON OF DIFFERENT RAINFALL PRODUCTS AND VALIDATION OF WRF MODELLED RAINFALL ESTIMATION IN NW HIMALAYA DURING MONSOON PERIOD

Extreme precipitation events are responsible for major floods across the world. In recent years, simulation and projection of weather conditions with Numerical Weather Prediction (NWP) models such as Weather Research and Forecasting (WRF) have become an essential component of research in atmospheric science and hydrology. Validation of modelled forecasts has therefore become a matter of paramount importance. This study delivers a comprehensive assessment of 5 high spatial resolution gridded precipitation products, including satellite data products and a climate reanalysis product, in comparison with the WRF precipitation product. The study was performed in river basins of the North Western Himalaya (NWH) in India. The performance of the WRF model is evaluated by comparison with observational gridded (0.25°×0.25°) precipitation data from the India Meteorological Department (IMD). Other products include the TRMM Multi Satellite Precipitation Analysis (TMPA) 3B42-v7 product (0.25°×0.25°) and the Global Precipitation Measurement (GPM) product (0.1°×0.1°). A climate reanalysis rainfall product from ERA Interim is also used. Bias, Mean Absolute Error, Root Mean Square Error, False Alarm Ratio (FAR), Probability of False Detection (POFD), and Probability of Detection (POD) were calculated for particular rainfall thresholds. TRMM and GPM products were found to be sufficiently close to the observations. All products performed better in the low altitude areas, i.e. in the plains of the Upper Ganga and Yamuna basins and the Indus basin, and their error increased as topographical variation increased. This study can be used to identify the suitability of WRF forecast data and to assess the performance of other rainfall datasets.


INTRODUCTION

Importance of Prediction of Monsoon Precipitation
The Indian summer monsoon is a major mechanism with a significant role in the global climate system and the global water cycle (Trenberth et al., 2000). The south-west summer monsoon governs the agricultural, energy and water resources sectors. It is therefore crucial for all applications to estimate and predict the summer monsoon precipitation precisely.
Among all meteorological variables, variations in the pattern of precipitation have the most adverse impacts on humanity. More precisely, the main concern is changes in extreme rainfall, as intense precipitation events often cause disasters such as flash floods, which in turn result in large-scale damage to infrastructure and natural ecosystems.
Because continuous observed rainfall information is not available at all locations, a number of studies have evaluated satellite-estimated rainfall products. The success of the TRMM and GPM missions in particular has been a debated topic in recent years. Prakash et al. (2018) found that GPM IMERG shows improvement in missed and false events over India. The IMERG product also gives a better estimate of mean monsoon rainfall than the TMPA product. On the other hand, Bharti et al. (2016) found a high correlation of 0.88 between TRMM and IMD gridded rain gauge data over the NWH region. TRMM 3B42 v7 showed high bias in the precipitation amount for extreme events but could capture their frequency.

Verification of Forecast from NWP models
During the past few years, numerical weather predictions have achieved better skill scores as a consequence of improved model parameterization schemes, higher computational efficiency, and data assimilation techniques (Mitra et al., 2013). Nevertheless, the improvement of skill in predicting the tropical monsoon is not yet fully understood (Prakash et al., 2014). As considerable bias exists in modelling the rainy season in this region, reliable projection of climate change in Indian summer monsoon precipitation is not yet certain (Turner and Annamalai, 2012). Model advancement depends on the assessment of modelled precipitation, which acts as important feedback for this purpose. Validating model accuracy against observed gauge data is therefore important (Ebert et al., 2007; Collins et al., 2013).
Real-time weather conditions are taken as input to run atmospheric models in NWP to forecast the progression of weather. The atmosphere is conceptualized as a dynamic fluid in NWP models, which represent its behavior by solving the equations of mechanics and thermodynamics (Yu et al., 2016).
In the performance assessment of flood forecasting for previous flood events, the uncertainty in meteorological data is generally the major cause of uncertainty in the flood forecast (Rossa et al., 2010). The output of the meteorological forecast tool Weather Research and Forecasting (WRF) is taken as the principal input for the forecast period. It is therefore a basic requirement to check the uncertainties in the WRF modelled data.

Objectives of the Study
This study aims to compare the performance and accuracy of different precipitation products, more specifically satellite and reanalysis datasets, for the NWH region. It was also intended to find a precipitation product suitable for long-term calibration and validation of hydrological models that can be used to produce flood forecasts with WRF meteorological predictions. (Details of the WRF model of IIRS are given at: http://www.dms.iirs.gov.in/)

Study Area
The study area is the North-Western Himalaya (NWH), consisting of the states of Jammu and Kashmir, Himachal Pradesh, Uttaranchal and parts of Punjab and Haryana in India, and extending beyond the country border into China (Tibet) in the east and into the portions of Pakistan lying in the catchment of the Indus. NWH includes the basins of the Indus and its tributaries, namely Jhelum, Ravi, Chenab, Beas and Sutlej, and of the upper Ganga and its tributaries, namely Yamuna, Ramganga, Ghaghra and Kali. Elevation in the region varies over a large range, from a few hundred meters in the Siwalik Himalaya in the south to about 8000 meters in the Karakoram Himalaya in the far north (Bhutiyani et al., 2007). Precipitation in the NWH occurs because of westerly disturbances during the post-monsoon (Oct-Nov) and pre-monsoon (May-Jun) seasons and because of the southwest monsoon from June to September each year. Yearly precipitation, in the form of rainfall and snowfall, also varies widely across parts of NWH. The whole NWH region is shown in Figure 1. However, the main focus of this study was on the Beas and Sutlej basins, which have well-known records of floods in past years (Figure 2).

Ground Observation Reference Datasets
Precipitation gauge data was available for only 4 stations in the Beas basin, namely Nadaun, Sujanpur Tira, Palampur and Baijnath. Observed point data therefore could not serve as the reference dataset. Instead, a 0.25º × 0.25º gridded precipitation product was procured for the monsoon seasons from 2000 to 2015. The main motivation of this study was to evaluate the reliability of the freely available gridded global precipitation products as input for hydrological modelling in the NWH region. In this regard, it must be ensured that the rain gauge station data used to derive the gridded reference dataset (IMD) had not been used in producing the products to be assessed against it. This is required for an independent analysis of all the products (Duan et al., 2016).

Satellite Datasets
TRMM 3B42 v7 is a product derived by the TRMM Multi Satellite Precipitation Analysis algorithm. It originally produces 3-hourly precipitation estimates at a spatial resolution of 0.25º for the tropical to temperate zones (50ºS-50ºN). Datasets are available since 1998. As our study focussed on daily comparison analyses, the derived daily product was used. This product is presently a combination of two products, namely the gauge-adjusted combined estimate of microwave and IR, and the combined estimate of microwave-IR-gauge rainfall at monthly scale.
The GPM IMERG product used in this study is a similar precipitation product, which combines microwave, IR and gauge estimates. The intent of the algorithm is to first intercalibrate the microwave estimates of all satellites in the constellation, and then merge and interpolate them together with microwave-calibrated IR estimates of satellite-derived precipitation, gauge analyses, and other potential precipitation estimators at a higher spatial and temporal resolution with global coverage. The result is a product of 0.1º spatial resolution at a 30-minute time interval, with full coverage for 60ºS-60ºN and partial coverage for the rest of the globe. For the current study, the derived daily product was used.
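As an illustration of how a daily value can be derived from sub-daily estimates, the following minimal sketch accumulates rain rates over one day. It assumes that the input is a mean rain rate in mm/h for each time step (3-hourly or half-hourly); the function name `daily_total_mm` and the sample values are hypothetical, and the sketch is not necessarily how the daily products used in this study were produced.

```python
import numpy as np

def daily_total_mm(rate_mm_per_h, step_hours):
    """Accumulate sub-daily rain rates (mm/h) into one daily total (mm).

    rate_mm_per_h: array with time as the first axis, covering exactly one day.
    step_hours: length of each time step in hours (3 for 3-hourly, 0.5 for 30 min).
    """
    rate = np.asarray(rate_mm_per_h, dtype=float)
    expected_steps = int(round(24 / step_hours))
    if rate.shape[0] != expected_steps:
        raise ValueError(f"expected {expected_steps} time steps, got {rate.shape[0]}")
    # Rain depth per step = rate * duration; nansum skips missing retrievals.
    return np.nansum(rate * step_hours, axis=0)

# Hypothetical values for a single grid cell.
three_hourly = [0.0, 0.5, 2.0, 4.0, 1.0, 0.0, 0.0, 0.5]  # 8 steps of 3 h
half_hourly = np.full(48, 0.25)                           # 48 steps of 30 min
print(daily_total_mm(three_hourly, 3))    # 24.0
print(daily_total_mm(half_hourly, 0.5))   # 6.0
```

Because time is the first axis, the same function also works on a stack of grids with shape (time, lat, lon).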
In our study we also used the CPC daily product (0.50º × 0.50º grid), namely the Global Daily Unified Gauge-Based Analysis of Precipitation. This dataset is disseminated by the NOAA Climate Prediction Center (CPC) and is a component of the CPC Unified Precipitation Project. The project primarily aims to create a suite of unified precipitation products with consistent quantity and improved quality by combining all the dataset inventories available at CPC and by using the Optimal Interpolation objective analysis approach.

Climate Reanalysis Datasets
ERA-Interim is a major global climate reanalysis product covering 1979 onwards and continuously updated in real time. The system comprises a 4-dimensional variational analysis (4D-Var) with a 12-hour analysis window. The spatial resolution of the available datasets is approximately 80 km (0.75º grid) on 60 vertical levels from the surface up to 0.1 hPa. In our study, 6-hourly precipitation at a downscaled resolution of 0.50º was obtained, and daily rainfall estimates were derived from it.

NWP Product
WRF can produce predictions on the basis of real-time atmospheric conditions or idealized conditions, which is useful for research purposes. Like other national/regional climatological centres (real-time forecasting organizations), IIRS has its own operational WRF forecast model set up on two domains in the NWH region, one of coarser resolution (9 km) and one of higher resolution (3 km), which gives sufficiently accurate rainfall prediction. In the current work, WRF running in 3-day forecast mode, initialized daily at 12 UTC, with boundary conditions from Global Forecast System (GFS) 0.25 degree, 6-hourly, 72-hour forecast products, has been used to provide the forecast meteorological input. Both 3-hourly (inner domain, 3 km resolution) and 6-hourly (outer domain, 9 km resolution) forecasts were produced by this setup. Only the 24-hour forecast product is taken from the WRF daily simulations. In order to use this forecast in hydrological simulations for forecasting flood scenarios in the near future, validation of this product against other datasets is a definite requirement. Moreover, the rainfall product most similar to this NWP estimate also has to be identified, as the hydrological model needs long-term calibration and validation. For a more detailed description of the model, refer to the website www.dms.iirs.gov.in.
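To make the extraction of a 24-hour amount from a multi-day forecast concrete, here is a minimal sketch. It assumes that forecast rainfall is stored as a running total since model start (in WRF output this is commonly the sum of the `RAINC` and `RAINNC` variables), so the rainfall between two lead times is the difference of the two totals. The function name `accumulated_rain_between`, the lead-time array and the sample values are hypothetical and do not describe the actual IIRS processing chain.

```python
import numpy as np

def accumulated_rain_between(accum_mm, lead_hours, start=0, end=24):
    """Rainfall (mm) between two forecast lead times.

    accum_mm: running rainfall total since model start, time as first axis.
    lead_hours: forecast lead time (h) of each output step.
    """
    accum = np.asarray(accum_mm, dtype=float)
    lead = np.asarray(lead_hours)
    i0 = np.flatnonzero(lead == start)
    i1 = np.flatnonzero(lead == end)
    if i0.size != 1 or i1.size != 1:
        raise ValueError("start and end must each match exactly one lead time")
    # Difference of running totals gives the amount that fell in the window.
    return accum[i1[0]] - accum[i0[0]]

# Hypothetical 72-hour forecast with 3-hourly output and steady 0.5 mm/h rain.
lead = np.arange(0, 73, 3)
accum = lead * 0.5
print(accumulated_rain_between(accum, lead))           # 12.0 (first 24 h)
print(accumulated_rain_between(accum, lead, 24, 48))   # 12.0 (second day)
```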

EVALUATION METHODS
There are numerous methods available to verify the accuracy of a forecast. In this study, the approach based on categorical skill scores was used. Ranges of continuous forecast values can define these categories, mainly by applying thresholds (Mariani and Casaioli, 2008). Here, "categorical dichotomous forecasts" were dealt with by applying a single threshold at a time. A 2×2 contingency table is shown in Figure 3.

Hit rate (HR):
$$\mathrm{HR} = \frac{\text{hits}}{\text{hits} + \text{misses}} \qquad (1)$$
The hit rate (HR), also known as the Probability of Detection (POD), lies between 0 and 1, where 1 denotes a 'perfect forecast'. The hit rate can usually be increased by systematically over-forecasting the frequency of the event.

False alarm ratio (FAR):
$$\mathrm{FAR} = \frac{\text{false alarms}}{\text{hits} + \text{false alarms}} \qquad (2)$$
It lies between 0 and 1. A perfect score is 0. It is not sensitive to events missed in forecasting. It is an incomplete score and should be stated along with the hit rate.

Bias:
$$\mathrm{Bias} = \frac{\text{hits} + \text{false alarms}}{\text{hits} + \text{misses}} \qquad (3)$$
A forecast is unbiased when the event is predicted with precisely the same frequency as it is observed. Thus, a frequency bias value of 1 signifies the best score. Bias > 1 points to 'over-forecasting' and Bias < 1 indicates 'under-forecasting'.

Probability of False Detection (POFD):
$$\mathrm{POFD} = \frac{\text{false alarms}}{\text{false alarms} + \text{correct negatives}} \qquad (4)$$
'Probability of False Detection' is sometimes called the 'False Alarm Rate'. Lower values of POFD denote fewer false alarm events and greater forecast accuracy.
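The four categorical scores described above can be sketched in a few lines. This is a minimal illustration that assumes the conventional contingency-table definitions (hits, false alarms, misses, correct negatives) and a "greater than or equal to threshold" rule for a rain event. The function names, the 2.5 mm threshold and the sample values are hypothetical and are not taken from this study.

```python
import numpy as np

def contingency_counts(forecast, observed, threshold):
    """Count the four cells of the 2x2 contingency table (no missing values assumed)."""
    f = np.asarray(forecast, dtype=float) >= threshold
    o = np.asarray(observed, dtype=float) >= threshold
    hits = int(np.sum(f & o))
    false_alarms = int(np.sum(f & ~o))
    misses = int(np.sum(~f & o))
    correct_negatives = int(np.sum(~f & ~o))
    return hits, false_alarms, misses, correct_negatives

def categorical_scores(hits, false_alarms, misses, correct_negatives):
    """Hit rate, false alarm ratio, frequency bias and POFD."""
    def ratio(num, den):
        # Undefined scores (empty denominator) are returned as NaN.
        return num / den if den > 0 else float("nan")
    return {
        "HR": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "Bias": ratio(hits + false_alarms, hits + misses),
        "POFD": ratio(false_alarms, false_alarms + correct_negatives),
    }

# Hypothetical daily rainfall (mm) for one grid cell over 8 days.
forecast = [0, 5, 12, 0, 30, 2, 0, 8]
observed = [0, 7, 0, 3, 25, 0, 0, 1]
counts = contingency_counts(forecast, observed, threshold=2.5)
print(counts)                       # (2, 2, 1, 3)
print(categorical_scores(*counts))  # HR = 2/3, FAR = 0.5, Bias = 4/3, POFD = 0.4
```

In this example the forecast predicts rain on 4 days against 3 observed rain days, so the frequency bias exceeds 1 (over-forecasting) even though the hit rate is below 1.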
These 4 parameters were calculated both sub-basin wise and for each major basin, along with Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). All the datasets used are in grid-based format, which is why the IMD dataset was procured in 0.25º gridded form. As all datasets were grid-based, no further interpolation technique was needed. Apart from the aforementioned 6 parameters, the coefficient of determination R² was estimated to determine the degree of linear association between each precipitation product and the IMD rainfall product.
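The continuous error measures can be illustrated in the same way. The sketch below assumes paired daily values from a product and the reference, drops pairs with missing data, and takes R² as the squared Pearson correlation, which is one common convention for scatter-plot comparisons and may differ from the exact definition used in this study. The function name `continuous_scores` and the sample values are hypothetical.

```python
import numpy as np

def continuous_scores(estimate, reference):
    """MAE, RMSE and R^2 (squared Pearson r) between two paired series."""
    est = np.asarray(estimate, dtype=float)
    ref = np.asarray(reference, dtype=float)
    # Keep only pairs where both values are present.
    valid = ~(np.isnan(est) | np.isnan(ref))
    est, ref = est[valid], ref[valid]
    err = est - ref
    r = np.corrcoef(est, ref)[0, 1]
    return {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "R2": float(r ** 2),
    }

# Hypothetical daily rainfall (mm): product estimate vs. gridded reference.
estimate = [0.0, 4.0, 10.0, 2.0, np.nan]
reference = [0.0, 6.0, 8.0, 2.0, 5.0]
scores = continuous_scores(estimate, reference)
print(scores["MAE"])   # 1.0
print(scores["RMSE"])  # about 1.414 (square root of 2)
```

RMSE is never smaller than MAE and penalizes large individual errors more strongly, which matters for extreme rainfall days.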

Evaluation of the Satellite Derived Products
The TRMM and GPM rainfall time series for monsoon seasons, along with the observed data, for Hamirpur District in the Beas basin are shown in Figure 4. For long-term continuous precipitation over the whole monsoon season, both TRMM and GPM precipitation show broadly similar patterns, as can be seen from the time series plots in Fig. 4. The TMPA and GPM IMERG products also give a good estimate of the rainfall that occurred during specific extreme events, as the graphs clearly show that the higher peaks were captured by the satellite data retrieval algorithms.
Moreover, the TMPA and GPM datasets give R² values of 0.27 and 0.26 respectively when compared with gridded IMD data, as shown in the scatter plots in Fig. 6. Both datasets perform almost equally, with TMPA providing a slightly better assessment of rainfall for the Beas and Sutlej basins. According to Maggioni et al. (2016), TRMM 3B42-V7 records the values nearest to ground observation data in the Indian subcontinent and hence could be considered useful for monsoon studies in this region because of its smaller underestimation, higher correlation, and lower error estimates in comparison to the other products. Likewise, according to Tang et al. (2016), the post-real-time corrections effectively reduce the bias of Day-1 IMERG and 3B42v7 to single-digit underestimation, from above 20% overestimation for 3B42RT. The Taylor diagram prepared indicated that the IMERG Day-1 product and the 3B42-v7 product are comparable at grid and basin scales. In some cases the GPM IMERG product outperforms the standard TMPA products, which points to promising hydrological utility and a necessary continuity from the TRMM products to the GPM IMERG products.
Moreover, ERA Interim data was also compared with observed gauge data (Fig. 5). ERA Interim climate reanalysis data was found not to be a good choice for small-scale local events. As can be seen in Fig. 5, the ERA Interim data area-averaged over Hamirpur district missed most of the peaks present in the average of the gauging station data from the Nadaun and Sujanpur Tira stations in that district. However, this is due to the fact that the downloaded ERA Interim dataset has a coarse resolution of 0.50° in both directions, so averaging over a large area is already present in the data. Further averaging was introduced by the zonal average over the district, whereas observed gauging data was available for only 2 stations in that region (Nadaun and Sujanpur Tira) during 2013 and for 3 stations (Bhota as the 3rd AWS station) in 2014-15. It is thus very likely that the estimate used as observed precipitation was not fully representative of the actual average rainfall conditions, and the discrepancy due to spatial variations in rainfall may have caused the deviation. In future, both denser rain gauge data and high spatial resolution reanalysis data products are needed for a more realistic representation of rainfall in NWH.
This was further emphasized by validating the ERA Interim estimate over the whole Beas and Sutlej basins against IMD gridded precipitation data. ERA Interim was found to have a better linear relationship with the reference data than the other datasets compared, with an R² value of 0.40 for the monsoon seasonal rainfall during recent years (2010 to 2015). For a basin-scale analysis, the average correlation coefficients for the Beas, Sutlej and Ganga basins were also calculated for the monsoon seasons of 2012 to 2015 and are shown in Table 1. Scatter plots of daily precipitation from all 4 gridded products for the Beas and Sutlej basins combined are shown in Fig. 6. Large scatter is observed for nearly all the products, which indicates weak agreement of the evaluated products with the observed product. This poor agreement can be attributed to one or more of the following: sampling error in satellite data; errors introduced when precipitation is estimated by a specific algorithm from a specific source, such as rain gauge analyses, climate reanalyses, satellites, or climate models; errors in the algorithms used to combine multi-source estimates; errors introduced by bias correction; and erroneous gauge data used in the bias-correction algorithm (Shen et al., 2010; Duan et al., 2016). Some basic points can be inferred from the comparison of these 4 datasets. The ERA Interim dataset is better for larger spatial and temporal scales, whereas satellite datasets were found to be better for localized events. Another dataset, used by many researchers as a source of precipitation for the Indian subcontinent, was also incorporated in the current study. It was found to be a good source of precipitation for the Sutlej basin, but not very precise for the Beas basin. However, the CPC dataset was found to have a high bias value for both basins. It should further be noted that basin-level zonal statistics may introduce some spatial averaging effect, which can affect the overall values of the statistical parameters used.
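The spatial averaging effect of zonal statistics can be seen in a small example. The sketch below assumes a regular latitude-longitude grid, a boolean mask marking the cells inside a basin or district, and cell weights proportional to the cosine of latitude. The function name `zonal_mean`, the grid and the rainfall values are hypothetical.

```python
import numpy as np

def zonal_mean(field, mask, lat_deg):
    """Area-weighted mean of a (lat, lon) field over the cells where mask is True."""
    field = np.asarray(field, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    lat = np.asarray(lat_deg, dtype=float)
    # On a regular lat-lon grid, cell area is proportional to cos(latitude).
    weights = np.cos(np.deg2rad(lat))[:, None] * np.ones_like(field)
    use = mask & ~np.isnan(field)
    if not use.any():
        return float("nan")
    return float(np.sum(field[use] * weights[use]) / np.sum(weights[use]))

# Hypothetical 2 x 3 grid: one cell with a 60 mm localized peak, the rest dry.
lat = [31.0, 32.0]
rain = np.array([[0.0, 0.0, 60.0],
                 [0.0, 0.0, 0.0]])
basin = np.ones_like(rain, dtype=bool)
print(zonal_mean(rain, basin, lat))  # roughly one sixth of the 60 mm peak
```

A single intense cell is diluted by the dry cells around it, which is why a coarse or zonally averaged product can miss peaks that a point gauge records.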

Validation of WRF Derived Product
The WRF forecast data was compared with the satellite products TRMM 3B42v7 (0.25º × 0.25º) and GPM IMERG (0.1º × 0.1º) using the same set of parameters. The comparison was performed for all 8 basins of the NWH region. In this case, the forecast was validated for a number of discrete events instead of continuous long timescales. The calculated parameter values are shown in Fig. 9 and Fig. 10 for the events of 2015 and 2017 respectively.
It was found that for some basins the TMPA 3B42 product is closer to the forecast product of the current WRF model, whereas in other basins the GPM product has less error in terms of rainfall. For the Sutlej, Ganga, Chenab and Ravi basins, the GPM product has a better match with WRF, whereas in the Indus and Beas basins the WRF product is closer to TRMM observed rainfall. For Beas basin, in case of specific rainfall events RMSE of WRF rainfall data w.r.
It was observed that the TMPA product was the overall best estimate, as it has the least bias value among all 4, while the GPM product was found to have considerable bias. ERA Interim also had relatively low bias for both the Beas and Sutlej basins. However, instantaneous and localized peaks of extreme rainfall events could not be identified with the coarse-resolution ERA Interim product; for these, GPM-estimated rainfall provided better measurements owing to its fine resolution (0.1°). The least average RMSE on a basin scale was found for the ERA Interim product, and the largest RMSE was observed for the TMPA product.
This study can guide the choice of an alternative source of precipitation information for localized hydrological phenomena such as floods. In our further studies, a flood forecasting system was developed that uses the WRF model forecast meteorological dataset as input to the hydrological model. It was therefore essential to verify the accuracy of the other precipitation products in order to find the most suitable meteorological dataset, one that provides rainfall patterns similar to the already developed WRF model and is also the most accurate in terms of verification metrics, so as to obtain a long-term meteorological dataset for calibration and validation of the model over the whole NWH region. As the basin boundaries extend beyond the political boundary of India, the IMD product could not be used to obtain the long-term meteorological information for this purpose. Other factors also need to be considered in deciding on the most suitable product for calibration and validation, which is beyond the scope of the current study.
However, the outcomes clearly illustrate that the precipitation estimates of all 4 rainfall products can be further improved in terms of spatial resolution and sub-daily timescale. A more detailed study with other precipitation products such as CHIRPS, INSAT 3D Hydro-estimator and APHRODITE can be performed to assess the suitability of these datasets in a similar manner. Spatial downscaling methods can be applied to provide precipitation estimates at finer resolution (e.g. 1 km). There are a number of spatial downscaling approaches, but most of these techniques were developed for longer timescales, such as annual or monthly scales (Chen et al., 2015). Future work should therefore focus on developing effective spatial downscaling techniques for meteorological products at daily to 3-hourly timescales.

Figure 1: Topographical map of NWH and the major basins

Figure 2: Drainage networks in NWH with highlighted Beas and Sutlej basins

Figure 4: Comparison of TRMM 3B42 and GPM IMERG Rainfall Products with Observed Data in (a) 2014, and (b) 2015

Figure 5: Plot of time series rainfall data from ERA and gauging station in (a) 2013 & (b) 2014, in Hamirpur

Figure 6: Scatter plots of daily precipitation products with observational rainfall product (with average R² values)

Figure 9:

Figure 10: WRF data verification w.r.t. satellite products for (a) Indus, (b) Chenab, (c) Ravi, (d) Beas, (e) Sutlej, (f) Yamuna, (g) Jhelum, (h) Upper Ganga basins in the year 2017

CONCLUSION
In the present study, we conducted a comprehensive evaluation of four types of precipitation products for the NWH region in terms of several verification metrics. This study indicates the suitability of all 4 types of precipitation products evaluated in specific locations at daily timescales in recent years (2010-2015). The products include TRMM 3B42, GPM IMERG, ERA Interim climate reanalysis and CPC rain gauge-based precipitation products.

Table 1: Correlation coefficients for different products compared with reference data

Table 2: Summary Statistics of Estimated Parameters