AUTOMATED RECOGNITION OF VEGETATION AND WATER BODIES ON THE TERRITORY OF MEGACITIES IN SATELLITE IMAGES OF VISIBLE AND IR BANDS

Vegetation and water bodies are fundamental elements of urban ecosystems, and water mapping is critical for urban and landscape planning and management. A methodology for the automated recognition of vegetation and water bodies on the territory of megacities in satellite images of sub-meter spatial resolution in the visible and IR bands is proposed. By processing multispectral images from the SuperView-1A satellite, vector layers of recognized vegetation and water objects were obtained. Analysis of the image processing results showed sufficiently high accuracy in delineating the boundaries of recognized objects and good separation of classes. The developed methodology significantly increases the efficiency and reliability of updating maps of large cities while reducing financial costs. Owing to its high degree of automation, the proposed methodology can be implemented as a geo-information web service operating in the interests of a wide range of public services and commercial institutions.


INTRODUCTION
The identification and extraction of accurate spatial information relating to urban areas is essential for future sustainable city planning, owing to its importance for global environmental change and human-environment interactions. At present, monitoring anthropogenic changes in the vegetation and water bodies of large cities is one of the most important and urgent tasks of ecological monitoring of megacities all over the world. Ensuring the sustainable development of densely populated areas, where urban infrastructure develops at an extremely high rate, is impossible without regular and reliable information about the current state and dynamics of all types of objects, including vegetation and water bodies (Di, 2017, Hnatushenko, 2017a, 2017b, Kashtan, 2017, Mozgovoy, 2016).
Traditionally, spatial databases and maps of megacities were updated using ground measurements and aerial photographs. With this approach, city maps were updated infrequently (usually once every few years) because of the high labor input required for data collection and processing. The intensive development in recent years of multispectral satellite sensors of ultra-high spatial resolution (Table 1; n.a.: data not available), on the one hand, and significant progress in satellite image processing technologies, on the other hand, have significantly accelerated the collection and processing of spatial data (Derkzen, 2015, Makarov, 2011, Mozgoviy, 2007).
Numerous methods have been developed to delineate vegetation and water bodies in moderate-resolution satellite imagery (Aguilar, 2016, Feyisa, 2014, Xie, 2016, Hnatushenko, 2016a, Liu, 2013, Tulbure, 2013). However, they are ineffective when applied to images of high spatial resolution. Water bodies in urban areas are frequently small and surrounded by complex built-up areas, vegetation, and their shadows (Su, 2008).
The main goal of the research is to develop a methodology for the automated recognition of vegetation and water bodies in megacities in satellite images of sub-meter spatial resolution in the visible and infrared bands, in order to increase the efficiency and reliability of updating city maps.

INPUT DATA
The territory of the city of Mentougou (Figure 1), a suburb of Beijing, one of the largest and most dynamically developing megacities of China, was chosen as the test site for the methodology. This choice is due to the fact that Chinese megacities are characterized by high rates of urban infrastructure change, vast densely populated areas with dense high-rise and low-rise buildings, and a large number of various industrial and cultural sites.
Table 1. Satellites with multispectral scanners of sub-meter resolution
The SuperView-1A satellite was chosen as the source of multispectral data because it surveys in the visible and infrared ranges and has sufficiently high radiometric and spatial resolution. In this work, the multispectral image of Mentougou city acquired by SuperView-1A on May 5, 2017 was selected as the initial data (Figure 2).
The work carried out within the study included the following stages of image processing and analysis:
- preliminary processing of satellite images (Figure 3), including orthorectification and enhancement of spatial resolution by pansharpening (Gnatushenko, 2013, Hnatushenko, 2015);
- thematic processing of satellite images (Figure 4), including calculation of spectral indices, binarization, morphological filtering and vectorization of recognized vegetation and water bodies (Hnatushenko, 2015, Jinru, 2017).
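The thematic-processing stages listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the spectral indices are NDVI and NDWI (the indices named later in the comparison with medium-resolution imagery), uses the green/NIR form of NDWI, takes thresholds as free parameters, and uses a 3x3 morphological opening as the filter. All function names are hypothetical. Bands are small nested lists so that the example has no dependencies; a practical implementation would rely on array and raster libraries. Vectorization is omitted.

```python
# Sketch: spectral index -> binarization -> morphological opening.
# Function names, thresholds and the 3x3 window are illustrative assumptions.

def normalized_difference(a, b):
    """Per-pixel (a - b) / (a + b); returns 0.0 where a + b == 0."""
    return [
        [(x - y) / (x + y) if (x + y) != 0 else 0.0 for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)

def ndwi(green, nir):
    """Normalized Difference Water Index: (Green - NIR) / (Green + NIR)."""
    return normalized_difference(green, nir)

def binarize(index, threshold):
    """Binary mask: 1 where the index exceeds the threshold, else 0."""
    return [[1 if v > threshold else 0 for v in row] for row in index]

def _window(mask, r, c):
    """Values in the 3x3 neighbourhood of (r, c), clipped at the image border."""
    rows, cols = len(mask), len(mask[0])
    return [
        mask[i][j]
        for i in range(max(0, r - 1), min(rows, r + 2))
        for j in range(max(0, c - 1), min(cols, c + 2))
    ]

def erode(mask):
    """A pixel stays 1 only if every in-image neighbour is 1."""
    return [[min(_window(mask, r, c)) for c in range(len(mask[0]))]
            for r in range(len(mask))]

def dilate(mask):
    """A pixel becomes 1 if any in-image neighbour is 1."""
    return [[max(_window(mask, r, c)) for c in range(len(mask[0]))]
            for r in range(len(mask))]

def opening(mask):
    """Erosion followed by dilation: removes isolated pixels and thin spurs."""
    return dilate(erode(mask))
```

A vegetation mask would then be obtained as `opening(binarize(ndvi(nir, red), t_veg))` and a water mask as `opening(binarize(ndwi(green, nir), t_water))`, where `t_veg` and `t_water` are thresholds that have to be chosen for the sensor and scene.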
Figure 3. The main stages of preliminary processing of satellite images
The preliminary processing operations are performed completely automatically. They are necessary when original images without orthorectification and pansharpening are used.
The data files from satellites of sub-meter spatial resolution are quite large: a scene taken in the visible and infrared range can occupy several gigabytes. Therefore, for real-time processing of multiple images, it is desirable to use modern computers with multi-core processors of the Intel Core i7 class or higher and at least 64 GB of RAM. The software can be either commercial (ERDAS, ENVI, ArcGIS, etc.) or free (SNAP, SAGA, GRASS, QGIS, etc.), available in versions for MS Windows and Linux.
However, currently the most effective approach to processing and storing satellite imagery is to use specialized web services, for example, EOS engine (https://eos.com), which have significant advantages over traditional software and hardware:
- they work directly in the browser, so the client does not need to install additional software;
- software and hardware independence, which makes it possible to use the web service on mobile devices;
- the results of image processing are stored on the server, so all customers can use the web service regardless of their location;
- high economic efficiency (no purchase of powerful graphics workstations and expensive software is required);
- minimum requirements for the level of user training (there is no need to spend time studying large and complex software packages).
Such web services allow users to select from various imagery sources, analytic operations, workflows and geospatial operations and then instantly see the result of their query.

Applied metrics for accuracy evaluation
The most widely used methods for validating the results of Earth remote sensing classification are the following (Hnatushenko, 2016b):
- comparison with the results of synchronous ground observations and measurements carried out at the time of imaging;
- comparison with the results of automatic classification by certified software products designed for the same purpose;
- comparison with manually corrected classification results;
- comparison with the results of manual classification carried out by operators and evaluated by an expert group (used in this work).
The last method is used for comparatively small volumes of data or for a limited set of test areas, which should be distributed over the study territory as evenly as possible.
The following metrics are used to quantify the accuracy of automatic classification:
- confusion matrix (the number of unrecognized class pixels, the number of falsely recognized class pixels, the overall accuracy of the result);
- statistics (Kappa coefficient, regression or standard error);
- compliance matrix for several classes (the accuracy can be low because of transitions at class boundaries);
- compliance matrix with fuzzy class boundaries; in this case the algorithm for designating class boundaries may vary.
In this work, the following well-known indicators of classification accuracy were chosen: the confusion matrix and the Kappa coefficient.
To evaluate the accuracy for a single class, the matrix of classification errors was used (Figure 5).
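The formulas to which the following symbol list refers can be written in their standard forms. This is a reconstruction that assumes the conventional definitions of overall accuracy (OA) and Cohen's Kappa coefficient (κ) are meant; the symbols N, m_{i,i}, C_i and G_i are those defined immediately below.

```latex
\mathrm{OA} = \frac{1}{N}\sum_{i} m_{i,i},
\qquad
\kappa = \frac{N\sum_{i} m_{i,i} - \sum_{i} C_i\,G_i}{N^{2} - \sum_{i} C_i\,G_i}
```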


where:
- i is the class number;
- N is the total number of classified pixels that are compared to ground truth;
- m_{i,i} is the number of pixels belonging to ground truth class i that have also been classified as class i (i.e., the values on the diagonal of the confusion matrix);
- C_i is the total number of classified pixels belonging to class i;
- G_i is the total number of reference (i.e., standard, ground truth) pixels belonging to class i.
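As a minimal sketch of how the quantities defined above might be computed, the following code builds a confusion matrix and derives overall accuracy and the Kappa coefficient from it. It assumes that the reference and classified labels are given as flat sequences of integer class indices and that the standard definition of Cohen's Kappa applies; the function names are illustrative and are not taken from the software used in the paper.

```python
# Sketch: confusion matrix, overall accuracy and Cohen's Kappa.
# Inputs are assumed to be flat sequences of integer class indices (0..k-1).

def confusion_matrix(reference, classified, n_classes):
    """m[i][j] = number of pixels of reference class i classified as class j."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for g, c in zip(reference, classified):
        m[g][c] += 1
    return m

def overall_accuracy(m):
    """Sum of the diagonal m[i][i] divided by the total pixel count N."""
    n = sum(sum(row) for row in m)
    return sum(m[i][i] for i in range(len(m))) / n

def kappa(m):
    """(N * sum m_ii - sum C_i * G_i) / (N^2 - sum C_i * G_i)."""
    k = len(m)
    n = sum(sum(row) for row in m)
    diagonal = sum(m[i][i] for i in range(k))
    g_totals = [sum(m[i]) for i in range(k)]                        # G_i: reference totals
    c_totals = [sum(m[i][j] for i in range(k)) for j in range(k)]   # C_i: classified totals
    chance = sum(g * c for g, c in zip(g_totals, c_totals))
    return (n * diagonal - chance) / (n * n - chance)
```

Note that `kappa` is undefined (division by zero) when all pixels fall into a single class in both the reference and the classification; a production implementation would need to handle that case explicitly.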

Results of the research
By processing multispectral images of the visible and infrared bands from the SuperView-1A satellite, vector layers of recognized vegetation and water bodies were obtained (Figure 6). Analysis of the image processing results showed sufficiently high accuracy in delineating the boundaries of recognized objects and good separation of the vegetation and water classes across various test sites with the same binarization threshold settings (Figure 7). The overall classification accuracy ranges from 85% to 92%, with the kappa coefficient ranging from 0.81 to 0.86.

Comparison with images of other territories and dates
To confirm the reproducibility and stability of the proposed method, the classification accuracy was compared between the processing results of two multispectral images from the SuperView-1A satellite for different territories and dates. The following images were used:
- image of the Mentougou city for May 5, 2017 (Figure 1);
- image of the Hualian city for January 18, 2018 (Figure 8).
The image processing procedures were the same as in the first case, with the exception of the orthorectification operation, since the original image was already in the UTM projection. The overall classification accuracy in this case was slightly lower, ranging from 82% to 90%, with the kappa coefficient from 0.80 to 0.85. The main error in the classification results is caused by shaded areas, which are falsely recognized as water. This problem is typical both of sub-meter resolution satellite images and of aerial images.

Comparison with medium resolution satellite images
To assess the reliability of the proposed method, the processing results of the sub-meter spatial resolution images from the SuperView-1A satellite for May 5, 2017 were compared with the processing results of medium spatial resolution images for the nearest dates. Multispectral images from the Landsat-8 satellite for May 7, 2017 and the Sentinel-2A satellite for April 28, 2017 were used. The results of the comparison are shown in Figure 9 (NDVI, NDWI) and Figure 10 (binarization and vectorization results). There are insignificant differences in the NDVI and NDWI values between Landsat-8 and Sentinel-2A, which are caused by the different time and date of imaging. The differences in index values between SuperView-1A on the one hand and Landsat-8 and Sentinel-2A on the other are more significant; they are caused not only by the different time and date of imaging, but also by the lack of radiometric correction of the SuperView-1A images. Therefore, to increase the accuracy of vegetation and water separation when processing images from the SuperView-1A satellite, it is necessary either to adjust the binarization thresholds or to perform radiometric and atmospheric correction.

The main advantages of the proposed method
The main advantages of the proposed method in comparison with ground-based measurements and aerial photography are:
- wide coverage (imaging of vast territories in a short time);
- almost immediate availability of results (imaging and processing take less than 1 day);
- minimization of human errors (high degree of automation of processing procedures);
- maximum reliability (minimization of errors and exclusion of falsification);
- high frequency (possibility to collect images up to several times a day);
- multidisciplinarity (the possibility of using the same images to solve a wide range of applied problems in the interests of different consumers);
- no need to obtain legal permits for surveying objects, which allows any desired objects to be imaged;
- complete safety (monitoring can be carried out without contact with a hazardous object, excluding risks to human health and life);
- maximum availability (objects located in hard-to-reach places can be imaged);
- high detail, sufficient for most practical tasks (spatial resolution up to 30 cm for commercial satellites);
- synchronization of data acquisition (simultaneous observation of a large number of objects located at a considerable distance from each other);
- high economic efficiency (significant cost reduction, especially when using specialized web services).

Directions of further research
The method developed in this study extends the recognition technique to vegetation and water mapping in a complex urban environment. Currently, the proposed method is being tested and tuned using multispectral images of various parts of the Earth obtained from active sub-meter resolution satellites, in order to determine optimal processing parameters for the main types of scanners, taking into account the region and the imaging conditions. Moreover, although the method has proved quite efficient, there is room for improvement by considering additional classifiers from different pattern recognition families.

CONCLUSIONS
The results of testing confirmed the sufficiently high qualitative and quantitative indicators of the developed methodology:
- classification accuracy (on most of the test areas the percentage of unrecognized and falsely recognized classes was less than 10%);
- reproducibility (the repeatability of results in various test cases);
- stability (no significant deviations in the object recognition results when the input data or the settings of the processing procedure change);
- high speed of image processing due to the extreme simplicity of the algorithm in comparison with other methods (supervised classification, segmentation, clustering, neural networks, etc.).
A methodology for the automated recognition of vegetation and water objects in megacities in satellite images of sub-meter spatial resolution in the visible and IR bands is proposed, which makes it possible to significantly increase the efficiency and reliability of updating maps of large cities while reducing financial costs. Owing to its high degree of automation, the developed methodology can be implemented as a geo-information web service operating in the interests of a wide range of public services and commercial structures.

Figure 2. Initial image of the Mentougou city from the SuperView-1A satellite for May 5, 2017

Figure 4. The main stages of thematic processing of satellite imagery
3.2 Requirements for software and hardware

Figure 5. The matrix of classification errors for one class
The overall evaluation of the accuracy of automatic classification is determined by the formula:

Figure 6. Vector layers of recognized vegetation and water objects

Figure 8. Initial image of the Hualian city (Taiwan) from the SuperView-1A satellite for January 18, 2018
The results of the automated recognition of vegetation and water objects for two test sites on the images of the Hualian city with the same settings of binarization thresholds are shown in Figure 9.

Figure 9. An example of automated recognition of vegetation and water objects for two test sites on the images of the Hualian city from the SuperView-1A satellite for January 18, 2018 with the same settings of binarization thresholds