ESTIMATING PM2.5 CONCENTRATIONS IN BRITISH COLUMBIA, CANADA DURING WILDFIRE SEASON USING SATELLITE OPTICAL DATA

British Columba, Canada experienced its record-breaking wildfire season in 2017. The wildfire smoke is one of the main sources of fine particles with diameters smaller than 2.5 μm (PM2.5). The rising level of PM2.5 concentrations during the wildfire season would considerable increase the risk of premature death, especially for people with weak immune systems. In this study, the satellite optical data collected from 3 km MODIS aerosol optical depth (AOD) products were adopted to estimate PM2.5 concentration levels derived from wildfires in British Columbia, Canada from July to September 2017. The satellite optical data were combined with ground station measurements, meteorological and supplementary data to estimate PM2.5 concentrations using the geographically weighted regression (GWR) model. Our results demonstrated that PM2.5 concentrations were the highest in July and August based on the estimation results of seasonal and monthly GWR models. It indicated that the application feasibility of MODIS AOD products in predicting PM2.5 concentrations during the wildfire season in British Columbia.


INTRODUCTION
Wildfire smoke is increasingly recognized as a significant source of air pollution and mostly result in public health issues (Black et al., 2017). In the past few decades, although air pollution in Canada has been well controlled due to proper regulations, fine particulate emissions from wildfires show upward trends since climate change aggravates the frequency and likelihood of wildfires (Black et al., 2017). The daily fine particulate matter concentration affected by wildfires smoke exceeded the 24-hour standard of 28 μg/m3 during 2015 to 2017 (Environmental Reporting BC, 2017). It is estimated that almost 40% of total particulate emissions was the result of wildfires, and over two million hectares of forests were annually burnt in Canada (Gralewicz et al., 2012;Black et al., 2017). It is found that PM2.5 and ultrafine particles are the dominant wildfire-generated particles of wildfire smoke (Gralewicz et al., 2012). Since forest area accounts for 60% of British Columbia's total area, wildfires have always been a serious problem in British Columbia due to both natural and anthropogenic reasons. The British Columbia has witnessed the record-breaking wildfire season from July 6, 2017 and approximately 120,000 hectares of forests were burnt, which accounts for 1.3% of total British Columbia area.
Wildfire is considered to be the major source of PM2.5 in Canada. In the recent years, several studies have been conducted to examine the relationship between wildfire and PM2.5. Tian & Chen (2010) adopted a semi-empirical model to predict hourly PM2.5 concentrations in southern Ontario using MODIS aerosol optical depth (AOD) and meteorological data. The R 2 of the model was 0.65, meaning that the model was able to explain 65% variability in PM2.5. Hystad et al. (2011) used the land use regression (LUR) model to estimate PM2.5 concentrations in seven Canadian cities, and the R 2 was about 0.46. Hystad et al. (2012) estimated PM2.5 concentrations using the Chemical Transport Model (CTM), and generated a R 2 of 0.67. Sofowote & Dempsey (2015) analyzed three wildfire events in July of 2011, 2012, and 2013 to identify the major source of high PM2.5 concentrations during the study period. It is found that the regions with active fires tend to display higher PM2.5 concentrations by examining the near-real-time ground-level PM2.5 data. Crouse et al. (2016) also implemented the CTM model combining MODIS, MISR, and SeaWIFS AOD to estimate PM2.5 concentration levels in Canada. The result of R 2 was 0.58. Stieb et al. (2016) examined the impacts of PM2.5 using the LUR model at national scale in Canada, and generated a R 2 of 0.59.  analyzed ground-level PM2.5 in the city of Montreal using the one of the CTM models (GEOS-Chem), and generated a R 2 of 0.86. In order to examine the health impact of wildfire smoke, Mirzaei et al. (2018) integrated the LUR model and MODIS AOD products to estimate PM2.5 concentrations in southern Alberta, Canada affected by wildfires in the northwest of the United States in the summer of 2015. They distinguished PM2.5 from other sources by dividing the study period into three sub-periods, including pre-fire, during-fire, and post-fire. The normalized difference vegetation index (NDVI), AOD products from MODIS and the ozone monitoring instrument (OMI), meteorological predictors (e.g., temperature, wind speed, relative humidity), distance to the source of fire, and land use of industrial roads were utilized to build the model. The R 2 of the LUR model was 0.50 for this study. An important contribution of this study was that it can evaluate PM2.5related health impacts before and after wildfire by analyzing estimated PM2.5 distribution maps. McClure et al. 2018 calculated PM2.5 trends in the US by using Quantile Regression from 1988 to 2016. Their results showed the positive trends in 98th quantile PM2.5 at sites within the Northwest United States due to wildfire activity, and also a negative trend at sites throughout the rest of country, likely duo to the reductions in anthropogenic emissions. Liu et al. 2019 applied the GWR model to evaluate the annual average ground-level PM2.5 concentrations using the AOD and meteorological data during 2012 to 2017 in China. Their results showed a 31% reduction in PM2.5 concentrations from 69.37 to 43.85 μg/m 3 over five years. Diao et al. 2019 investigated the main approaches for producing PM2.5 measurements, and also illustrating and comparing the applications of different data fusion methods. Mirzaei et al. (2019) used the ordinary least-squares regression (OLS) and the geographically weighted regression (GWR) model to estimate PM2.5 concentrations in Alberta affected by wildfires derived from British Columbia in August 2017. Other than AOD and meteorological variables, numerous variables were taken into consideration, such as distance from fire in British Columbia, and road length around each ground station. The R 2 values of the OLS and GWR model were 0.74 and 0.84, respectively. This result indicated that the GWR model generated more accurate results than the OLS model. In addition, this study was valuable for further studies to assess the health impact of wildfire plumes. Liu et al. 2019 established a satellite-based optical-mass conversion algorithm which measures PM2.5 mass concentrations based on aerosol microphysical characteristics over China in 2017. The spatial pattern of average radius of PM2.5 was then validated by AERONET measurements. The results indicated that particle diameters in eastern part are smaller than those in other areas in China due to various factors including topography, meteorology, land use, and population density.
The aim of this research is to estimate PM2.5 concentrations level during British Columbia's wildfire season, by combining a set of ground monitoring PM2.5 data, satellite MODIS AOD measurements, meteorological, and supplementary data. The remainder of this paper is structured as follows. Section 2 describes the study area and the datasets used in this study, Section 3 details our method. Section 4 presents and discusses the results. Section 5 concludes the paper.

STUDY AREA AND DATASET
British Columbia is the western most province of Canada with the geographical location of 53° 43' 36.0084'' North and 127° 38' 51.4356'' West, respectively (see Fig. 1). The west part of British Columbia is bounded to Pacific Ocean. Northern, eastern, and southern region of British Columbia are bordered by the Yukon and Alaska, Alberta, and the U.S., respectively. British Columbia is composed of 27 regional districts with a total estimated population of 5.016 million in 2018, which is the third populous province in Canada after Ontario and Quebec. It is reported that British Columbia has experienced rapid population growth in the past three years due to its pleasant climate and diverse culture. British Columbia is famous for its rich natural resources, mountainous terrain, abundant forests, unique coastline, and numerous water resources (Ministry of Forest, n. d). Almost 70% of the total land area in British Columbia are covered by mountains, and forests account for 60% of mountainous area. Due to both natural (such as lightning) and human-caused reasons, wildfires in British Columbia have been increasing in recent years.

Fig. 1. Study area: British Columbia, Canada
The dataset used in this study is included satellite-based measurements, ground-level PM2.5 data, and meteorological data. The 3 km Aqua MODIS AOD products were adopted to estimate PM2.5 concentration levels in British Columbia in 2017. The AERONET AOD products were then generated to validate MODIS AOD retrievals, since it has five times higher accuracy than the satellite-derived AOD.
The hourly ground-level PM2.5 measurements in 2017 were acquired from 66 ground stations through British Columbia Data Catalogue (https://catalogue.data.gov.bc.ca/dataset). Among three methods implemented to measure groundlevel PM2.5 levels including Tapered Element Oscillating Microbalance (TEOM), Beta Attenuation Monitoring (BAM), and Gravimetric methods, BAM1020 is the main instrument to measure PM2.5 in these stations. Ground-level PM2.5 datasets used in this study were verified by British Columbia's Ministry of Environment, and has been validated through a Quality Assurance (QA) or Quality Control (QC) process. Both AOD and PM2.5 estimation models are influenced by meteorological data. The meteorological factors used in this study include boundary layer height (BLH)(m), relative humidity (%), surface pressure, U and V wind speed (m/s), temperature (K), and visibility. Monthly NDVI products with 0.05° resolution acquired from the U.S. Geological Survey (USGS), and the 1 km resolution elevation data derived from the digital elevation model (DEM) provided by Natural Resources Canada were the supplementary datasets applied in the study.

METHOD
The method of this study could be divided into four stages, including MODIS AOD validation, pre-processing, model construction, and output analysis. Fig. 2 shows the workflow of our method. Firstly, the AOD values acquired from the AErosol RObotic NETwork (AERONET), a network of ground-based sun photometers that measure atmospheric aerosol properties were used to validate with MODIS AOD after temporal and spatial matching. All datasets were then pre-processed through projection, clipping, and resampling processes. Since satellite MODIS data, meteorological data, and supplementary data have different resolutions, therefore, the next step is to resample meteorological and supplementary datasets to 3 km using bilinear interpolation in ArcGIS 10.6.1. The ground-level PM2.5 measurements were then used to match with MODIS AOD, meteorological variables, and supplementary data in a 5 × 5pixel window around ground stations. After the removal of invalid matchings (pixels with no AOD data matched), the third step is the construction of the model. The GWR model was applied to estimate PM2.5 concentration levels and produce monthly results. Lastly, the model outputs were analyzed, as well as the application feasibility of MODIS AOD during the wildfire season in British Columbia.
The GWR model is a local regression model where the spatial variation is considered to predict the continuous surface of dependent variable (Donkelaar et al. 2010). The GWR model can be expressed as follows: PM2.5 (i,j) = a0(i,j) + a1(i,j) AOD(i,j) + a2(i,j) V2(i,j) + a2(i,j) V2(i,j) …… an(i,j) Vn(i,j) where PM2.5 (i,j)is the station-detected PM2.5at location (i,j); a0(i,j) is the intercept at location (i,j); V2(i,j) to Vn(i,j) are the values of different predictors at location (i,j); a2(i,j) to an(i,j) are the slope of corresponding predictors (Li et al., 2017). In this study, four meteorological variables were employed as supplementary data for the model establishment, which is the temperature, U and V wind speed, humidity and BLH, respectively. Two different bandwidth methods including the Akaike Information Criterion (AICc) and the Cross Validation (CV) were processed to search for the optimal neighbor parameter of the GWR model (Li et al., 2016).

Fig. 2. Overview of the method.
After the implementation of the model, the spatial autocorrelation analysis of Moran I was conducted in ArcGIS using the residuals of the GWR model. The statistical results of GWR model, validation for the GWR, and Global Regression model, including AIC, R 2 , RMSE, MAPE, and Moran's I are presented in Table 1. AIC is used to evaluate the relative quality between the models. If the difference of AICs between two models is larger than 3, then the model with smaller AICs has better performance and quality. Therefore, since the AIC in the GWR models is much lower than that of the Global Regression model, the GWR model generates better performance result than the Global Regression model. R 2 reflects how close the data are to the fitted regression line. The R 2 of the GWR model is 0.76, which means the model is able to explain 76% variation of response data. In contrast, the global regression model only has an R 2 of 0.53, which is much lower than the GWR model. It can be seen that the validation for GWR has R 2 of 0.74, which means that the GWR model is not over-fitted. In terms of RMSE and mean absolute percentage error (MAPE), they are both used to indicate the accuracy of predicted values. The GWR model has the lowest values for both RMSE and MAPE by 13.45 μg/m 3 and 16.75%, respectively which all show that the GWR model is not over-fitted. The Moran's I values for the residuals of both GWR model and validation are -0.014 and -0.016, respectively, which are all close to 0. It indicates the spatial autocorrelation barely exists among the residuals, and the observations are independent with each other.

RESULTS AND DISCUSSION
The annual PM2.5 concentrations generated by the GWR model are shown in Fig.3. The annual ground station measured PM2.5 distribution map is also generated to compare with the annual distribution map generated by the GWR model. The highest predicted PM2.5 value of the GWR model is 38.28 μg/m 3 , while for the annual ground station distribution map, the highest value is 31.29 μg/m 3 . As shown in Fig.3, the GWR model presents similar spatial distribution of PM2.5 concentrations as ground stations. The regions with the highest PM2.5 values are concentrated in the Central Kootenay. Ground station in the Central Kootenay has the highest annual mean PM2.5 concentrations. The GWR tends to overestimate PM2.5 concentrations in some regions, such as Thompson-Nicola and Cariboo regional districts in a reasonable range. Both districts have higher PM2.5 concentrations than the surrounding area. For the rest area of British Columbia, the annual PM2.5 concentrations are under both CAAQS and British Columbia's annual standards. In addition, the GWR model is able to provide more variations than ground stations. This comparison also indicates that the GWR model shows a high accuracy on predicting PM2.5 concentrations, which could be used for further analysis. After generating the annual model, seasonal models were also conducted to examine PM2.5 concentrations in four seasons. The statistics of seasonal models are presented in Table 2. Regions with high PM2.5 concentrations are greater than that in spring and fall, especially centered on the areas with wildfires. Since summer is the season with highest occurrence possibility of wildfires, which indicates that wildfires could release much more PM2.5 than spring and fall, and could be considered as the main source of rapid rising PM2.5 concentrations in summer.
According to the spatial distribution map of PM2.5 in summer, it can be concluded that the GWR is able to predict PM2.5 concentrations, because the trends of PM2.5 distributions are almost the same in two maps. For the PM2.5 distribution maps in fall, the highest PM2.5 concentrations are 23.01 μg/m 3 and 31.13 μg/m 3 , respectively, which means the fall GWR model underestimated PM2.5 concentrations in some instances. As defined before, September belongs to fall, which is still under British Columbia's state of emergency period during the wildfire season in 2017. Therefore, PM2.5 concentrations in fall are higher than that in spring, but lower than that in summer. The spatial distribution maps generated by the monthly GWR models were built during the British Columbia's state of emergency wildfire season during July, August, and September to estimate PM2.5 concentrations (see Figs. 7,8,9). Table 3 summarizes the statistical results of monthly GWR models. The training samples for three monthly models are 421, 540, and 271, respectively. All three models perform high values of R 2 , and the July model has the highest R 2 with 0.85 which demonstrates that GWR is suitable for conducting monthly model in July, August, and September. The RMSE for three models are 3.26 μg/m 3 , 11.31 μg/m 3 , and 12.68 μg/m 3 , respectively. The July model has the lowest RMSE value, as well as the MAPE value (5.35%) which means that the July model has the best performance compared to other two monthly models. The average magnitude error for predicted values in July is 3.26 μg/m3, which is the lowest among annual, seasonal, and monthly GWR models. The trends of RMSE and MAPE regarding R 2 are the same for all three models, which is: RMSE and MAPE increases as R 2 increases. In addition, Moran's I was also conducted for the residuals of three models to examine the spatial autocorrelation. As shown in Table 3, Moran's I values for the residuals of three models are all near 0, which indicates there are no significant spatial autocorrelation exist.  As shown in Fig. 7, the estimated PM2.5 values in July range between 1.43 and 35.28 μg/m 3 . The regions with high PM2.5 concentrations are centered on the Cariboo district. The ground-level PM2.5 concentrations agree with the estimated PM2.5 concentrations. As shown in Fig. 8, August has the most severe PM2.5 concentrations, with the highest value over 100 μg/m 3 based on the ground station measurements. For estimated PM2.5 concentrations generated by the August GWR model, PM2.5 reaches its peak at 83.37 μg/m 3 . It can be found that most regions with active wildfire exhibit the highest values of PM2.5 concentrations. However, for the area that is lack of ground monitoring stations, the PM2.5 concentrations are underestimated due to the insufficient number of control points. Therefore, more ground stations are helpful for increasing the accuracy of estimation. It could also be concluded that there is a strong correlation between wildfires and PM2.5 concentrations. As shown in Fig. 9, some underestimations of PM2.5, concentrations in September are revealed. The main reason for this is due to the lowest value of R 2 for the September model. Therefore, by comparing three monthly models, it can be viewed that there is an obvious rising trend in PM2.5 concentrations during the wildfire season, especially in August.

CONCLUDING REMARKS
The main purpose of this study was to estimate PM2.5concentrations in British Columbia, Canada using 3 km MODIS AOD products integrated with meteorological and supplementary data. Some key findings of our study can be summarized as follows.
First, a strong correlation was observed between the MODIS AOD and the AERONET AOD. It was also found that the 3 km MODIS AOD products tended to overestimate groundlevel AOD values, but most values fell within a reasonable range regarding AERONET AOD. It indicated that the 3 km MODIS AOD products were qualified to use for PM2.5 estimations. For regions that are lack of AEORNET sites, MODIS AOD products could be considered as a substitute for further studies.
Second, by comparing with ground station PM2.5 concentrations, it can be concluded that PM2.5 concentrations predicted by the GWR model nearly followed the same trend as PM2.5 concentrations measured by ground stations. The summer model generated the best performance among the three seasonal models. The winter model was failed to conduct due to the lack of training samples. Three monthly models all performed high values of R 2 . Except for the spring and fall models, the rest GWR models were able to generate accurate PM2.5 distribution maps that followed the same trend as the PM2.5 concentrations measured by ground stations.
Third, by examining PM2.5 concentrations during the wildfire season, it was found that there was a rapid increasing trend in PM2.5 concentrations. According to the spatial distribution map generated by the August model, PM2.5 concentrations were the highest in August, especially in regions with active wildfires. It can be concluded that the integration of MODIS AOD products and the GWR model was capable of estimating PM2.5 concentrations accurately during the wildfire season in British Columbia, which indicated a high application feasibility for the future studies in other regions. This study is also valuable for the Provincial Government of British Columbia to conduct research related to air pollution and public health perspectives. Although this study achieved its main objectives and made some contributions, some limitations still exist, which are summarized as follows. First, in order to match with Aqua's overpassing time at 13:30, meteorological data were retrieved at the same time in this study. The estimatedPM2.5 concentrations in this study were based on 13:30, while PM2.5 concentrations at other times are note valuated. Second, due to the lack of collocations between MODIS AOD and the AOD from the Saturn_Island AERONET site, there might exist some bias regarding AOD validation and PM2.5 estimations. The unevenly distribution of PM2.5 ground stations would also affect PM2.5 estimations in this study. Third, the GWR model was failed to conduct in winter due to the lack of training samples. The theoretical model was built based on several assumptions. It assumes that the distribution of aerosols is even in the atmospheric vertical direction, and the shape of aerosol particles is all spherical. Lastly, the meteorological and supplementary data used in this study were just for model constructions. The impact of each parameter on PM2.5 estimations is not discussed. The dispersion and accumulation of PM2.5 concentrations were not presented.