SOIL CARBON MAPPING IN LOW RELIEF AREAS WITH COMBINED LAND USE TYPES AND PERCENTAGES

Accurate mapping of soil carbon in low relief areas is of great challenge because of the defect of conventional "soil-landscape" model. Efforts have been made to integrate the land use information in the modelling and mapping of soil organic carbon (SOC), in which the spatial context was ignored. With 256 topsoil samples collected from Jianghan Plain, we aim to (i) explore the land-use dependency of SOC via one-way ANOVA; (ii) investigate the “spillover effect” of land use on SOC content; (iii) examine the feasibility of land use types and percentages (obtained with a 200-meter buffer) for soil mapping via regression Kriging (RK) models. Results showed that the SOC of paddy fields was higher than that of woodlands and irrigated lands. The land use type could explain 20.5% variation of the SOC, and the value increased to 24.7% when the land use percentages were considered. SOC was positively correlated with the percentage of water area and irrigation canals. Further research indicated that SOC of irrigated lands was significantly correlated with the percentage of water area and irrigation canals, while paddy fields and woodlands did not show similar trends. RK model that combined land use types and percentages outperformed the other models with the lowest values of RMSEC (5.644g/kg) and RMSEP (6.229 g/kg), and the highest RC (0.193) and RP (0.197). In conclusions, land use types and percentages serve as efficient indicators for the SOC mapping in plain areas. Additionally, irrigation facilities contributed to the farmland SOC sequestration especially in irrigated lands.


INTRODUCTION
Soil organic carbon (SOC) content is an essential indicator of soil productivity, adequate SOC content contributes to plant growth and water and soil conservation (Rasool et al., 2008).Moreover, mutual transformation of soil organic carbon and atmospheric carbon dioxide plays an important role of the carbon cycle of the terrestrial ecosystem, SOC content and its dynamic changes have a profound impact on global climate and food security (Powlson et al., 2011).Accurate mapping of soil carbon is conducive to the development of precise agriculture and the assessment of greenhouse gas emissions.However, disturbed by the complex process of soil formation and the intensive human activities, the distribution of SOC content exists spatial heterogeneity, which brings serious challenges for SOC mapping.
Ordinary Kriging (OK) method has been widely used for spatial prediction of soil properties in those areas with similar landscape patterns (Robinson and Metternicht, 2006;Shahbeik et al., 2014).However, the prediction accuracy of OK method decreases as the soil properties are strongly interfered by complex terrain and human activities (Liu et al., 2006).Regression Kriging (RK), which combines auxiliary environment variables and Kriging methods, has been proved better than OK with better prediction accuracy and goodness of fit (Menezes et al., 2016;Zhu and Lin, 2010).The scorpan method is a generic framework to fit quantitative relationships between soil properties and environment variables (Mcbratney et al., 2003).Based on the method, soil information (i.e., soil maps), climate (i.e., temperature and precipitation), organisms (i.e., land cover and vegetation) and topography (i.e., elevation, slope and curvature) were widely used as predictors to help digital soil mapping (Ungaro et al., 2010).However, in low relief areas, conventional soil-landscape model, in which topography derived factors play an important role, performed poorly due to the low relief amplitude (Zhao et al., 2014;Zhu et al., 2010).Finding feasible explanatory variables becomes an important direction of current research.
Land use types (Liu et al., 2015), soil maps (Walker et al., 2017), fuzzy slope position information (Qin et al., 2012), temporal remote sensing images and their products (Liu et al., 2010;Mirzaee et al., 2016;Zhao et al., 2014) and hyper-spectral indices (Chen et al., 2015;Ji et al., 2014;Liu et al., 2014) were gradually used for digital soil mapping in plain areas.As an easily accessible variable, land use types could be obtained through interpretation of remote sensing images, and land use maps of study area are generated subsequently.More important, land use types have been proved important factors to explain the spatial viability of soil properties and are widely used for spatial estimation and digital soil mapping (Yigini and Panagos, 2016).There are three main approaches to utilize land use types: (i) stratified according to land use types and then executed the Kriging interpolation for each region (Qian et al., 2017); (ii) were employed for the mean centering Kriging method (Gu et al., 2014); (iii) were transformed into dummy variables and used for RK models (Wen et al., 2015).However, the land use types, which have regularly been used for spatial prediction of soil properties, were primary category (i.e.farmland, woodland, grassland).Such strategy was of little help for mapping SOC in plain areas.Therefore, land use types of secondary category should be employed to explain the variance of farmland SOC in plain areas.In addition, efforts have been made to integrate the land use types in RK model, in which the spatial context was ignored.
Jianghan Plain is an important commodity grain base of China, with flat terrain and fertile soil.A total of 256 topsoil samples (0-30 cm) were collected from Jianghan Plain, including paddy fields, irrigated lands and woodlands.After obtaining land use percentages of a specific sampling point with a 200-meter buffer, we aim to explore the potential of land use types and percentages for soil mapping in plain areas.Specifically, we (i) explore the land-use dependency of SOC via one-way ANOVA; (ii) investigate the "spillover effect" of land use on SOC content via stepwise regression; (iii) examine the feasibility of land use types and percentages for soil mapping via regression Kriging (RK) models.

Study Area
Chahe Town, where this research conducted, is situated in southeast of Jianghan Plain, China.The elevation of study area ranges from 14 to 35m, and covers approximately 141.32km 2 (29°56′-30°04′N and 113°22′-113°34′E).The region has a subtropical monsoon climate, with a mean temperature of 16.6℃ and a mean annual precipitation ranges from 1000 to 1300mm (Liu et al., 2017).The town is adjacent to Lake Honghu, with sufficient source of water, flat terrain and fertile soil, and now is an important commodity grain base in Hubei province.The region consists of large areas of continuous paddy fields and water bodies, numerous but broken irrigated lands, rural settlements with zonal distribution, a small amount of woodlands and scattered other lands.
A total of 256 topsoil samples (0-30cm) were collected by random sampling in June 2013.The process of sample collection and SOC content measurement is the same as that of Liu et al. (2015).Two samples that deviate from the mean with triple standard deviation were removed as outliers.The remaining 254 soil samples were divided into calibration set (n=204, 80%) and validation set (n=50, 20%).Specifically, all the samples were sorted in an ascending order of SOC content, and every five of them were picked up to form the validation set, while the others were set as the calibration set.This method guarantees the maximum and minimum value are in the calibration set.The spatial distribution of sampling points is shown in Fig 1 .2.2 Data Pre-processing 2.2.1 Classification of Land Use Types: In this study, we derived the land-use data from a raster dataset of 2013 with 10 meter resolution.The dataset comprises 22 land-use types in accord with the Second National Land Survey classification system (GB/T 21010-2007).Combining with the previous articles and the characteristics of land use in the study area (Liu et al., 2017;Tian et al., 2012), we classified the study area into seven land-use types, including paddy field, irrigated land, woodland, water area, construction land, irrigated canals and other lands.The irrigated canal is an essential land use type in farmland.Other lands contain garden plot and grassland, which accounts for less than 1% of the total area.

Calculation of Land Use Percentages:
The land use percentages of a specific sampling point were obtained with a 200-meter buffer.Moreover, land use percentages of each grid in the study area were calculated to form seven land use percentages images, which were prepared for SOC mapping.In fact, the land use percentages images were larger than study area to ensure the calculation accuracy of grids at the boundary.

Construction of Regressions:
The multiple linear regression method was employed to explore the quantitative relationship between SOC, land use types and percentages.To avoid multicollinearity, stepwise method was used to ensure all parameters in the regression at the significant level of p<0.05.Specifically, the types and percentages of land use were taken as independent variables in the first two regressions respectively, and applied to build the third regression in the meantime.The three stepwise regressions, named R1, R2, R3 separately, were then used to RK models.The land-use dependency of the relationship between SOC and land use percentages was also investigated via stepwise regressions.In addition, land use type was converted to dummy variables as it is a categorical variable (Wen et al., 2015).For example, in dummy variable 1, the value of paddy fields was set as 1 and the others were set as 0. The setting of dummy variables were shown in Table 1.
where  * (s 0 ) is predicted value at non-sampled location s0 (s  ) is observed value at location si wi is the weighting factor for (s  ).

Regression Kriging:
RK is an extension of OK.After detrending with environment predictors, the residuals are extended to the whole region via OK method.The predicted value at non-sampled location is the sum of the drift and the estimated residuals: (2) where  * (s 0 ) is the predicted value at non-sampled location s0 e(s 0 ) is the estimated residuals m(s 0 ) is the drift,   is the coefficients of regression x i is the independent variable   is the weights (s  ) is the residual at location sj

Model Evaluation
Leave one out cross-validation and external validation were used for prediction accuracy evaluation, including two indices: root mean of squared error (RMSE) and coefficient of determination (R 2 ).In order to make the results of this paper comparable to other studies, the relative optimization times of RK to OK was measured via RMSEI and R 2 I. (5) where n is the number of samples O i is the observed SOC content for the sample i O ̅ is the mean value of the observed samples P i is the predicted values j is the sequence number of RK, j=1, 2, 3

Descriptive Statistics and Analysis of Variance
As shown in Table 2, the statistical characteristics of calibration set were similar with validation set, and the maximum value (Max) and minimum value (Min) were set in calibration set, therefore the validity and representativeness of calibration set was guaranteed.The SOC content of calibration set ranged from 1.64 g/kg to 34.72 g/kg, and the coefficient of variation (CV) was at the moderate variation (Wilding, 1985).For the specific land use type, the CV was smaller than the whole calibration.
The results of one-way ANOVA indicated that the SOC content of paddy fields was significantly higher than that of woodlands, and the SOC content of woodlands was significantly higher than that of irrigated lands.

Spillover Effect of Land Use on SOC Content
Both of three regression models have passed through F test (p<0.01),and the coefficients, intercept value and R 2 of them (p<0.05)were shown in Table3.The results indicated that the percentage of irrigated lands was negatively correlated with SOC when only considering land use percentages.With the higher R 2 , the observed land use types were more effective than land use percentages obtained from the image.Fortunately, there was no contradiction between types and percentages: when combined with the land use types and percentages, the R 2 of regression was further improved to 24.7%.In R3 model, dummy variable 1 was significant indicated that the SOC content of paddy fields was significantly higher than that of woodlands, which was consistent with the result of one-way ANOVA.The difference of SOC content between irrigated lands and woodlands was no longer significant when taking land use types into consideration.Moreover, the percentage of water area and irrigated canals were positively correlated with SOC content.The land-use dependency of the relationship between SOC and land use percentages was also investigated and the results were exhibited in Table 4.The SOC content of paddy fields and woodlands did not show significant correlation with any land use percentages.By contrast, the SOC content of irrigated lands was positively correlated with the percentage of water area and irrigated canals, and the R 2 reached 15.9%, which was far larger than the R 2 of R1 model.This results indicated that the irrigated land was the main category affected by percentage of water area and irrigated canals, and this phenomenon was covered because of the mixture of different land use samples.

Calibration Sets and Spatial Models of Four Kriging Methods
After obtaining the three regression models, the RK models were correspondingly built, named RK1, RK2 and RK3 respectively.OK model was used as a reference model.4. Association between SOC content and land use percentages for samples of irrigated lands R_IL: stepwise regression based on samples of irrigated lands; Void coefficients denote not significant (p>0.05); the regressions based on samples of paddy fields and woodlands did not pass through F test (p>0.05).middle value areas.Comparing with original calibration set, the distribution of RK2 calibration had great improvement at low value areas when considering land use types, and the residuals of RK1 were more close to the diagonal line at the high value areas.In general, taking land use types and percentages as predictors contributed to the performance of residuals, and the residuals of RK3 outperformed the other calibration sets.
The Kolmogorov-Smirnov (K-S) test was carried out for the four calibration sets.The results indicated that the three residuals rejected the normal distribution hypothesis (H0) (p<0.05) and could directly use for modelling, but the original SOC dataset must be transformed with Napierian logarithm (ln).The linear, spherical, exponential, and Gaussian functions were fitted to theoretical semi-variogram models, and the relative parameters were shown in Table 5 via GS+ software.In Table 5, the value of C0/(C0+C) of four spatial models were less than 25%, which indicated strong spatial autocorrelation (Wilding, 1985) and proved the rationality of the application of spatial interpolation models.

SOC Content Maps
The digital maps of SOC content estimated by OK and RK models were displayed in Fig 3 .In order to make them comparable in visual, the ranges of SOC content of the four models were set to 0 ~ 35 g/kg.There were some common characteristics of SOC spatial variability shown in the four maps: in the overall trend, the SOC content on the periphery were larger than that of central part.Three high value aggregated areas were located at north, south and southeast of the image separately, and the area with the lowest average SOC contents situated on the west side of central town.For different land use types, the SOC content of paddy field was larger than that of woodlands and irrigated lands in the local region.However, there were some differences between four maps: the spatial variance of SOC content in first map was smoother due to the limitations of Kriging method.But in the map estimated by RK2 model, the contrast of SOC content in different land use was magnified, and the feature was more obvious in the areas where paddy fields were adjacent to irrigated lands.In the map of RK3, the SOC variability in adjacent patches became more violent, and more detailed information was exhibited inside the same patches.regression Kriging with land use types and percentages.

Model Validation
The results of cross-validation and external validation (Table 6) indicated that the three RK models outperformed the OK model with lower RMSE values and larger R 2 .RK1 model was better than OK model but the effect was not obvious.RK2 model had a great promotion than RK1 model from the aspects of prediction accuracy and goodness of fit.RK3 model outperformed the other models with the lowest values of RMSEC (5.644g/kg) and RMSEP (6.229 g/kg), and the highest R 2 C (0.193) and R 2 P (0.197).

The Effects of Land Use Types and Percentages on SOC
As different land use types, the input and output ways of SOC of paddy fields, irrigated lands and woodlands was quiet different, which led to the great difference of SOC content.Paddy fields and irrigated lands were strongly affected by human producing activities.With the input of fertilizer, SOC content quickly rose in a short time, and the soil carbon pool lost with the harvest of crops.However, the cultivation and management methods, land cover, soil types and soil moisture of paddy fields and irrigated lands were very different.Under the submerged conditions, the activity of soil microbes was limited by oxygen deficiency, which led to the decrease of the mineralization rate of SOC and the accumulation of SOC.But for irrigated lands, long-term cultivation destroyed the soil physical structure, and the loss of large soil aggregates weaken the ability of soil carbon sequestration (Six et al., 2000).The local farmers' habit of burning straw also aggravated the loss of soil organisms and the mineralization of organic matter in irrigated lands.Therefore, the SOC content of paddy fields was significant higher than that of irrigated lands.As a natural land use type, the SOC of woodland was relatively less disturbed by human social development.The soil structure of woodland contributed to the soil and water conservation, which kept SOC content at a high level.The relationship between the paddy fields and the woodlands was not fixed but based on the actual situation.In this research, the SOC content of woodlands was lower than that of paddy fields, but higher than that of irrigated lands, which was consistent with the discoveries made by Zhang et al. (2010), Tai-Kui al. (2013) and Sá nchez-Gonzá lez et al. ( 2017).
Except for land use types, the land use percentages were also important factors for the spatial variability of SOC content.For all of sampling points, SOC content was negatively affected by the percentage of irrigated land.On one hand, the average SOC content of irrigated lands was lowest among the three land use types, a higher percentage of irrigated land would take down the regional SOC content.On the other hand, the irrigated lands were very broken in the study area, a higher percentage of irrigated land mean the longer edges of farmland were exposed to oxygen, which speeded up the mineralization rate of SOC.For the sampling points of a specific type, irrigated lands significantly benefited from the high percentage of water area and irrigated canals.The high percentage of water area could potentially promote the wetness of the surrounding soil, and the irrigated canals would directly provide sufficient water for irrigation.For irrigated lands, sufficient water was conducive for crop growth and carbon sequestration via photosynthesis (Xiao-Bo et al., 2017).A further research was implemented to explore the relationship between annual average NDVI value (through 13 phase Landsat 8 remote sensing images) and soil moisture (through field measurement), the results indicated that there were significant positive correlation between them.As a result, irrigation facilities were important for farmland SOC sequestration especially in irrigated lands.As for paddy fields, which were already under the condition of flooding, the effect of water moisture on it was weaken.The woodlands was less affected by agricultural irrigation as the natural elements.

Comparison of modal performance
In this study, the land use types and percentages were employed to explain the spatial variability of SOC content.Under the control of different land use type and surrounding environment, the input and output rate of SOC was quiet different, which caused the spatial heterogeneity of SOC content.The results of one-way ANOVA demonstrated that there were significant differences among SOC content of paddy fields, irrigated lands and woodlands, and therefore the SOC content of the three land use types could not be regarded as the same regional variable, which violated the precondition of Kriging methods.The results of regression also proved that land use types and percentages were both responsible for the heterogeneity of SOC content.When spatial heterogeneity existed in the spatial variability of SOC, the expected value of random function of SOC varied with the change of position, which violated the stationary assumption required by ordinary Kriging method.Under this situation, the prediction accuracy of OK method was limited, and it was why OK method was not suitable for spatial estimation of soil properties under complex landscape.Actually, in the study, the SOC content was not normally distributed and had to be conducted Napierian logarithm (ln) transformation.
In comparison, RK models eliminated the spatial heterogeneity caused by land use types and percentages.The spatial variability of SOC residuals became more stable, and the residuals were relatively normally distributed and passed through the K-S test.
As a result, RK models had a promotion of the prediction accuracy and goodness of fit than OK model.As for the two explanatory predictors, the land use types had a higher R2 of regression than that of percentages, which mean the local land use type was more influential than surrounding environment on soil properties.Therefore, the RK2 model was more valid than RK1 model.Moreover, when considering land use types and percentages simultaneously, the R2 of regression further increased to 24.7% and thus the RK3 model outperformed the other prediction models with the lowest RMSE and the highest R2.
The digital SOC maps showed that most of low value areas belonged to irrigated lands and were located nearby central town and rural settlements, which had a longer history of cultivation.In contrast, the three high value aggregated areas had characteristics of belonging to paddy fields, mixing with water area patches, and approaching irrigated canals.Moreover, learned from the field survey, local farmers often use sediments of ponds to fertilize soil, which was conducive to the accumulation of farmland SOC.However, there were some value obviously higher or lower than the surrounding SOC values, which called spatial outliers (Costa, 2003).After combined with land use information, the contrast between the spatial outliers and the surrounding value reduced but still existed.Therefore, new environment variables should be explored, together with land use information to explain the causes of spatial heterogeneity and spatial outliers.

CONCLUSION
In this research, we explored the feasibility of land use types and percentages (obtained with a 200-meter buffer) for soil mapping via RK models.The three RK models, named RK1, RK2 and RK3, combined with land use percentages, land use types and both of them respectively.The OK model was set as a reference.The basic SOC statistics showed that the SOC of paddy fields was significantly higher than that of woodlands, and SOC of woodlands was higher than that of irrigated lands.According to results of regressions, we found that SOC was negatively correlated with percentage of irrigated land when only considering land use percentages.After adding land use types into the model, the SOC was positively correlated with the percentage of water area and irrigation canals.Further research indicated that SOC of irrigated lands was significantly correlated with the percentage of water area and irrigation canals, while paddy fields and woodlands did not show similar trends.The land use percentages and types could explain 3.7% and 20.5% variation of the SOC respectively, and the value increased to 24.7% when integrating types and percentages.The results of model validation indicated that the RK3 model outperformed the other models with the lowest values of RMSEC (5.644g/kg) and RMSEP (6.229 g/kg), and the highest R 2 C (0.193) and R 2 P (0.197).
In conclusions, land use types and percentages serve as efficient indicators for the SOC mapping in plain areas.Additionally, irrigation facilities contributed to the farmland SOC sequestration especially in irrigated lands.
Figure 1.Study area and spatial distribution of soil samples

Figure 2 .
Figure 2.QQplots of four calibration sets OK: ordinary Kriging; RK1: regression Kriging with land use percentages; RK2: regression Kriging with land use types; RK3: regression Kriging with land use types and percentages.

Figure 3 .
Figure 3.Maps of SOC content estimated by four spatial models OK: ordinary Kriging; RK1: regression Kriging with land use percentages; RK2: regression Kriging with land use types; RK3: regression Kriging with land use types and percentages.

Table 2 .
The QQplot was employed to examine whether the original calibration set and the residuals of the RK models were normally distributed.As shown in Fig 2, original calibration set performed poorly at the low and Descriptive statistics of SOC content for different land use types Se: standard error; Values suffixed with different superscript letters along the columns are significant different (p<0.05)

Table 6
Evaluation of prediction accuracy of four models OK: ordinary Kriging; RK1: regression Kriging with land use percentages; RK2: regression Kriging with land use types; RK3: regression Kriging with land use types and percentages; RMSE: root measure squares error; RMSEI: root measure squares error; RMSEI: the relative optimization times of RK to OK.