A COMPARISON OF MACHINE LEARNING MODELS FOR SOIL SALINITY ESTIMATION USING MULTI-SPECTRAL EARTH OBSERVATION DATA
Keywords: Soil salinity, Electrical conductivity (EC), Machine learning, Sentinel-2 satellite
Abstract. Soil salinity, a significant environmental indicator, is one of the leading causes of land degradation, especially in arid and semi-arid regions. In many cases, this threat leads to the loss of arable land and groundwater resources, reduces crop productivity, raises the economic cost of soil management, and ultimately increases the probability of soil erosion. Monitoring the distribution and degree of soil salinity and mapping electrical conductivity (EC) with remote sensing techniques are therefore crucial for land use management. Salt-affected soil is predominant in the Eshtehard Salt Lake area, located in Alborz, Iran. In this study, the potential of Sentinel-2 imagery for mapping and monitoring soil salinity was investigated. Salinity-related soil properties were measured for 197 soil samples in a field survey timed to the satellite's pass. Several spectral features, including band reflectance, salinity indices, and vegetation indices, were then extracted from the Sentinel-2 imagery. To build an optimum machine learning regression model for soil salinity estimation, three regression models were compared: Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGBoost), and Random Forest (RF). The XGBoost method outperformed GBM and RF, with a coefficient of determination (R²) above 76%, a Root Mean Square Error (RMSE) of about 0.84 dS m−1, and a Normalized Root Mean Square Error (NRMSE) of about 0.33 dS m−1. The results demonstrate that integrating remote sensing data and field data with an appropriate machine learning model can provide high-precision salinity maps for monitoring soil salinity as an environmental problem.
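The spectral features and accuracy measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say which salinity index was used or how NRMSE was normalized, so the salinity index form sqrt(blue × red) and the normalization by the observed EC range are assumptions. All function names and sample values are hypothetical.

```python
import math


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance
    (Sentinel-2 bands B8 and B4)."""
    return (nir - red) / (nir + red)


def salinity_index(blue, red):
    """One commonly used salinity index, sqrt(blue * red) (Sentinel-2 bands
    B2 and B4). Assumed form; the abstract does not name its indices."""
    return math.sqrt(blue * red)


def rmse(observed, predicted):
    """Root Mean Square Error, in the units of the observations (dS/m for EC)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def nrmse(observed, predicted):
    """RMSE divided by the observed range. Assumed normalization; dividing
    by the mean or standard deviation is also common."""
    return rmse(observed, predicted) / (max(observed) - min(observed))


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical EC values (dS/m), for illustration only; not data from the study.
ec_observed = [1.0, 2.0, 3.0, 4.0, 5.0]
ec_predicted = [1.5, 2.0, 2.5, 4.0, 5.5]

print("R2   :", r2(ec_observed, ec_predicted))
print("RMSE :", rmse(ec_observed, ec_predicted))
print("NRMSE:", nrmse(ec_observed, ec_predicted))
```

In a model comparison such as the one described, the same three scores would be computed on held-out samples for the predictions of each regressor (GBM, XGBoost, RF), so that the models are ranked on identical data.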