HOW LANDSAT 9 IS SUPERIOR TO LANDSAT 8: COMPARATIVE ASSESSMENT OF LAND USE LAND COVER CLASSIFICATION AND LAND SURFACE TEMPERATURE

This study aims (i) to analyze the performance of Landsat 8 and 9's multispectral bands in Land Use Land Cover (LULC) mapping by applying the Random Forest (RF) method, and (ii) to compare the Land Surface Temperature (LST) results of Landsat 8 and 9 using ground-based measurements obtained from the Surface Radiation Budget Network (SURFRAD). RF-based classification and pixel-based LST information extraction were conducted in the Google Earth Engine (GEE) environment. For the LULC classification, Iğdır province of Türkiye was chosen as the study area, whereas for the LST analysis, the locations of two SURFRAD stations (FPK and GWN) were selected. Collection 2 Level 2 Surface Reflectance (SR) products of Landsat 8 and Landsat 9, acquired on 14 May 2022 and 22 May 2022, respectively, were used for LULC mapping, while the Collection 2 Level 2 Surface Temperature (ST) products were utilized for the LST analysis. The LULC results showed that the Overall Accuracy (OA) and Kappa values were 87.4 % and 0.83 for Landsat 9, and 82 % and 0.76 for Landsat 8, indicating that Landsat 9 achieved better performance in this case study. Concerning the thermal analysis, Landsat 9-based LST provided an RMSE of 1.77 K, which was lower than that of Landsat 8-based LST (RMSE = 2.31 K). Consequently, Landsat 9 provided better accuracies in both LULC classification and LST analysis, and this study proved that Landsat 9 has more improved OLI and TIRS sensors than Landsat 8.


INTRODUCTION
Since the 1950s, the population of the world has risen from 2.5 billion to 7.9 billion, and it is projected to reach 9.7 billion by 2050 (https://www.macrotrends.net/countries/WLD/world/population). This population growth has been increasing the demand for housing, transportation, water, healthcare, food, and energy. To meet these requirements, people have utilized natural resources and caused changes in the Earth's surface (Amini et al., 2022). Thus, generating Land Use Land Cover (LULC) maps has become increasingly important for land management, land planning, and a sustainable environment (Thiam et al., 2022; Sekertekin et al., 2017). Moreover, knowledge of LULC change has been critical for many studies (Giuliani et al., 2022), including land use planning (Sakieh et al., 2015), impact assessment on biodiversity (Michelsen, 2008), watershed analyses (Hörmann et al., 2005), and the effect of LULC on stream ecology (dos Reis Oliveira et al., 2019; Zhou et al., 2012). Therefore, timely, reliable, and accurate LULC knowledge is vital for policy and decision makers to maintain sustainable land resource management. In addition to LULC mapping, the Surface Urban Heat Island (SUHI) effect, which can be extracted from Thermal Infrared (TIR) based images, is another variable relevant to the sustainability of cities.
Remote Sensing (RS) technology has provided useful information and solutions during the past few decades for monitoring variations in the Earth's surface. Satellite imagery has been extensively utilized for the extraction of LULC maps. Over the years, various optical RS systems have been launched, including Landsat, Terra/ASTER, SPOT, and Sentinel-2. Among these, Landsat is the only mission that has been providing images since the 1970s, and the latest member of the mission, Landsat 9, was launched on September 27, 2021. Consequently, the number of scientific publications that used Landsat data has grown considerably over time (Hemati et al., 2021).
To create an LULC map, a classification method is generally applied to the RS images, and various classification algorithms have been proposed and used so far (Alzubaidi et al., 2021; Khelifi and Mignotte, 2020; Thanh Noi and Kappas, 2017; Korytkowski et al., 2016; Otukei and Blaschke, 2010; Wulder et al., 2008). Among these classifiers, Random Forest (RF) has attracted the attention of researchers and has become one of the most widely used methods for LULC mapping thanks to its performance and its need for only a few user-defined parameters (Adugna et al., 2022; Balha et al., 2021). On the other hand, SUHI maps can be obtained from TIR-based Land Surface Temperature (LST) images of MODIS, ASTER, Landsat, etc., and various methods have been developed to retrieve TIR-based LST (Gillespie et al., 1998; Qin et al., 2001; Dash et al., 2002; Jiménez-Muñoz and Sobrino, 2003; Sobrino et al., 2004; Duan et al., 2018).
This study aims (i) to compare the performance of Landsat 8 and 9's multispectral bands in LULC mapping using the RF method, and (ii) to compare the LST results of Landsat 8 and 9 using ground-based measurements obtained from the Surface Radiation Budget Network (SURFRAD). Furthermore, this is the first study to provide LST validation of the Landsat 9 TIRS sensor using in-situ measurements. RF-based classification and pixel-based LST information extraction were conducted in the Google Earth Engine (GEE) cloud environment. Although Landsat 9 carries near-identical copies of the Operational Land Imager (OLI) and Thermal Infrared Sensor (TIRS) instruments onboard Landsat 8, one of the main differences between the OLI and OLI-2 instruments is that 14 bits of data are downlinked per pixel for OLI-2, giving its imagery a higher bit depth than Landsat 8's 12-bit OLI data (https://landsat.gsfc.nasa.gov/satellites/landsat-9/landsat-9instruments/). In addition, TIRS-2 is an upgraded version of the TIRS-1 onboard Landsat 8 in terms of instrument class and stray light reduction (https://landsat.gsfc.nasa.gov/satellites/landsat-9/landsat-9instruments). We therefore investigate whether the differing characteristics of OLI-2 and TIRS-2 have any effect on LULC mapping and LST analysis, respectively, compared to OLI-1 and TIRS-1.

LULC Mapping Study Area
For the LULC mapping, Iğdır Province, located in the eastern part of Türkiye along the borders with Iran, Armenia, and Azerbaijan (the Nakhchivan Autonomous Republic), was chosen (Figure 1). Iğdır Province is approximately 3588 km² in size, and the city center lies around 850 m above mean sea level. The Iğdır plain is among the most valuable agricultural regions in Türkiye. The province has a continental climate; however, owing to a micro-climate effect, its lowland parts are less affected by the continental climate than other parts of Eastern Anatolia.

Test Sites for TIRS Comparison
In order to facilitate climate studies across the United States (US), the National Oceanic and Atmospheric Administration (NOAA) built the SURFRAD network in 1993 to provide continuous, long-term, and reliable ground-based data of the surface radiation budget (Augustine et al., 2000). Longwave radiation is one of the parameters that SURFRAD stations measure, and these measurements are used to calculate the ground-based LST. In this study, two SURFRAD stations (FPK and GWN), located in different climate zones, were considered for the comparison of LST from TIRS-1 and TIRS-2 (Figure 2).
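The conversion from SURFRAD longwave measurements to ground-based LST is not spelled out in this paper. A minimal sketch is given below, assuming the commonly used Stefan-Boltzmann inversion, in which the reflected part of the downwelling longwave radiation is removed from the upwelling measurement. The function name `surfrad_lst` and the broadband emissivity value of 0.97 are illustrative assumptions, not values taken from this study.

```python
# Sketch only: the Stefan-Boltzmann inversion commonly used for SURFRAD sites.
# The default broadband emissivity (0.97) is an assumed, illustrative value.
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def surfrad_lst(lw_up, lw_down, emissivity=0.97):
    """Return ground-based LST in Kelvin.

    lw_up      : upwelling longwave radiation measured at the station (W m^-2)
    lw_down    : downwelling longwave radiation measured at the station (W m^-2)
    emissivity : broadband surface emissivity, must lie in (0, 1]
    """
    if not 0.0 < emissivity <= 1.0:
        raise ValueError("emissivity must lie in (0, 1]")
    # Remove the reflected downwelling component, leaving surface emission only.
    emitted = lw_up - (1.0 - emissivity) * lw_down
    if emitted <= 0.0:
        raise ValueError("non-physical radiation values")
    return (emitted / (emissivity * SIGMA)) ** 0.25
```

For example, `surfrad_lst(450.0, 350.0)` returns the LST implied by 450 W m^-2 of upwelling and 350 W m^-2 of downwelling longwave radiation under the assumed emissivity.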

DATA AND METHOD
For the LULC mapping, Collection 2 (C2) Level 2 (L2) Surface Reflectance (SR) images of Landsat 8 and Landsat 9 (Masek et al., 2006; Vermote et al., 2016), acquired on 14 May 2022 and 22 May 2022, respectively, were used. Apart from the panchromatic band, all reflective bands of both datasets were used in the LULC classification process. In this research, the RF classification method, which provides effective performance (Shao et al., 2021), was implemented to identify the LULC classes, namely, water body, vegetation area (pasture, vegetation-covered agricultural areas, etc.), artificial surface (urban and other impervious surfaces), bare land (rocks and bare soils including non-vegetated agricultural lands), and snow cover.
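Collection 2 Level 2 products are distributed as scaled integers, so the stored digital numbers are usually converted to physical units before analysis. The sketch below illustrates that conversion using the scale factors published by the USGS for these products; the factors and the function names are not stated in this paper and are given here as assumptions for illustration.

```python
# Sketch only: USGS-published Collection 2 Level-2 scale factors
# (not stated in this paper; function names are hypothetical).
SR_SCALE, SR_OFFSET = 2.75e-5, -0.2       # surface reflectance bands
ST_SCALE, ST_OFFSET = 0.00341802, 149.0   # surface temperature band (band 10)


def dn_to_reflectance(dn):
    """Convert a Level-2 SR digital number to unitless surface reflectance."""
    return dn * SR_SCALE + SR_OFFSET


def dn_to_kelvin(dn):
    """Convert a Level-2 ST digital number to surface temperature in Kelvin."""
    return dn * ST_SCALE + ST_OFFSET
```

The same linear transform applies to every reflective band, which is why a single pair of constants covers all SR bands used in the classification.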
The classification process was implemented on the GEE platform. More information about RF can be found in Breiman (2001). The number of decision trees (ntree) and the number of variables per split are two important user-defined parameters of RF; they were set to 130 and 3, respectively, in this research. Furthermore, for the classification of each image, 35214 pixels were used for training and 429 pixels for testing. To assess the classification performance, a confusion matrix was generated, from which the Kappa coefficient and Overall Accuracy (OA) were computed.
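As an illustration of how OA and the Kappa coefficient follow from a confusion matrix, a minimal sketch is given below. The function name and the 2 x 2 example matrix are hypothetical and do not reproduce the results of this study.

```python
# Sketch only: Overall Accuracy and Cohen's Kappa from a square confusion matrix.
# Rows are reference classes and columns are predicted classes (assumed layout).
def accuracy_metrics(matrix):
    """Return (overall_accuracy, kappa) for a square confusion matrix."""
    n_classes = len(matrix)
    if any(len(row) != n_classes for row in matrix):
        raise ValueError("confusion matrix must be square")
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of samples on the main diagonal.
    observed = sum(matrix[i][i] for i in range(n_classes)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[r][c] for r in range(n_classes)) for c in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class example (not data from this study).
oa, kappa = accuracy_metrics([[45, 5], [10, 40]])
```

Kappa is lower than OA because it discounts the agreement that would be expected by chance given the class proportions.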
Landsat-based LST and SURFRAD-derived LST were compared using two performance metrics, namely, average Bias and Root Mean Square Error (RMSE), given by:
$$\mathrm{Bias} = \frac{1}{n}\sum_{i=1}^{n}\left(T_{L,i} - T_{SURFRAD,i}\right)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(T_{L,i} - T_{SURFRAD,i}\right)^{2}}$$
where $T_{L}$ = Landsat-based LST, $T_{SURFRAD}$ = SURFRAD-derived LST, and $n$ = the number of data.
The general flowchart of the study is given in Figure 3, showing all steps, including preprocessing, data, methods, and analysis, in the corresponding environment.
Table 1 presents the accuracy assessment report for the classification results. The results showed that the LULC map derived from the Landsat 9 image had a higher Kappa value (0.83) and OA (87.4 %) than the Landsat 8-derived LULC map (OA = 82 % and Kappa = 0.76). On visual interpretation, artificial surfaces were better delineated by Landsat 9, whereas some of the bare lands were identified as artificial surfaces in the Landsat 8-derived LULC image.
Considering the comparison of the TIR sensors, Figure 5 presents the LST results from both Landsat images and SURFRAD stations. In-situ LST results from the FPK and GWN stations are compared with Landsat 8 and 9 based LST results in Figures 5a and 5b, respectively, while all results from both stations are presented in Figure 5c. In all scatter plots, Landsat 9-based LST provided slightly better performance than Landsat 8-based LST. It should also be noted that 27 Landsat 8 images and 19 Landsat 9 images from the two stations, acquired under clear-sky conditions, were used in this analysis. Over the FPK station, the RMSE values for Landsat 8 and Landsat 9 were 2.60 K and 2.32 K, respectively. Moreover, the average bias was lower for Landsat 9 than for Landsat 8.
Concerning the GWN station, the RMSE values for Landsat 8 and Landsat 9 were 1.99 K and 1.24 K, respectively. As at the FPK station, the average bias was again lower for Landsat 9 than for Landsat 8. For the overall evaluation, all data from both the FPK and GWN stations were analyzed together, as seen in Figure 5c. In this general evaluation, Landsat 9-based LST provided an RMSE of 1.77 K, which was lower than that of Landsat 8-based LST (RMSE = 2.31 K).
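The average bias and RMSE reported above can be computed from paired Landsat and SURFRAD temperatures as sketched below. The sign convention (Landsat minus SURFRAD) is an assumption, and the function name and sample values are hypothetical rather than data from this study.

```python
# Sketch only: average bias and RMSE between paired LST samples.
# Differences are taken as Landsat minus SURFRAD (assumed sign convention).
import math


def bias_and_rmse(landsat_lst, surfrad_lst_values):
    """Return (average bias, RMSE) in the units of the inputs (e.g. Kelvin)."""
    if len(landsat_lst) != len(surfrad_lst_values):
        raise ValueError("inputs must have the same length")
    if not landsat_lst:
        raise ValueError("at least one pair is required")
    diffs = [t_l - t_s for t_l, t_s in zip(landsat_lst, surfrad_lst_values)]
    n = len(diffs)
    bias = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    return bias, rmse


# Hypothetical values in Kelvin (not measurements from this study).
bias, rmse = bias_and_rmse([301.0, 299.0, 303.0], [300.0, 300.0, 300.0])
```

Reporting both metrics is useful because the bias shows systematic over- or underestimation, while the RMSE also captures the scatter around the reference values.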

CONCLUSION
In this study, comparative assessments of Landsat 9 and Landsat 8 were conducted based on LULC classification and LST validation. Thus, the effectiveness of the spectral bands and the thermal band (band 10) of both satellite sensors was evaluated.
As Landsat 9 is the most recent Landsat satellite, we evaluated its performance in both LULC and LST analysis to highlight the efficiency of the OLI-2 and TIRS-2 sensors through a comparison with Landsat 8. For LULC retrieval, the RF method was applied on the GEE cloud platform with the same training and testing data for both the Landsat 8 and Landsat 9 images. The LULC results revealed that the OA and Kappa values for Landsat 8 and Landsat 9 were 82 %, 0.76, and 87.4 %, 0.83, respectively, indicating that Landsat 9 achieved better performance in this case study. Additionally, regarding the thermal analysis, the readily available LST products of Landsat 8 and Landsat 9 from the USGS were evaluated against ground measurements. The FPK and GWN stations of the SURFRAD network were used as in-situ measurement stations. In this analysis, Landsat 9-based LST provided an RMSE of 1.77 K, which was lower than that of Landsat 8-based LST (RMSE = 2.31 K). As a general evaluation, Landsat 9 provided better accuracies in both LULC classification and LST analysis, and this study showed that Landsat 9 has improved OLI and TIRS sensors compared with Landsat 8.

Figure 1. Illustration of the study area map for LULC classification.

Figure 3. General workflow of this study.

Figure 4 presents the RF-based LULC images obtained from Landsat 8 and Landsat 9. For both datasets, the same training and testing data were used in the classification process.

Figure 5. Validation and comparative analysis between Landsat-based LST and SURFRAD-based LST: (a) LST results from FPK station, (b) LST results from GWN station, (c) total LST results from FPK and GWN stations.

Table 1. Validation results for LULC images.