AN EVOLUTIONARY ALGORITHM FOR FAST INTENSITY BASED IMAGE MATCHING BETWEEN OPTICAL AND SAR SATELLITE IMAGERY

: This paper presents a hybrid evolutionary algorithm for fast intensity based matching between satellite imagery from SAR and very high-resolution (VHR) optical sensor systems. The precise and accurate co-registration of image time series and images of different sensors is a key task in multi-sensor image processing scenarios. The necessary preprocessing step of image matching and tie-point detection is divided into a search problem and a similarity measurement. Within this paper we evaluate the use of an evolutionary search strategy for establishing the spatial correspondence between satellite imagery of optical and radar sensors. The aim of the proposed algorithm is to decrease the computational costs during the search process by formulating the search as an optimization problem. Based upon the canonical evolutionary algorithm, the proposed algorithm is adapted for SAR/optical imagery intensity based matching. Extensions are drawn using techniques like hybridization (e.g. local search) and others to lower the number of objective function calls and reﬁne the result. The algorithm signiﬁcantely decreases the computational costs whilst ﬁnding the optimal solution in a reliable way.


INTRODUCTION
Over the last decades we observe an ever increasing number of satellites delivering optical high-(HR) and very high-resolution (VHR) imagery (Belward and Skien, 2015). This situation serves as catalyst for the development and application of a broad range of methodologies in the domain of Change Detection, Mosaiking, Classification and Multi-Temporal Image Analysis. The combination of aerial imagery acquired by one or several sensors offers possibilities by means of image interpretation and classification (Sörgel et al., 2017). The highly accurate co-registration of the imagery is a necessity, misregistration lead to unreliable and error-prone results (Townshend et al., 1992) (Dai and Khorram, 1998) (Sundaresan et al., 2007).
The absolute geo-location accuracy of optical ortho-images depends on several factors, e.g. the satellite orbit determination and the instrument position estimation. The first can be solved within the decimeter range, the resulting influence is almost in the same range as the position estimation. For the second this holds not true, due to the long baseline even small errors in the instrument attitude and thermally influenced mounting angle estimation can lead to geo-location errors in the magnitude of up to several meters. Therefor ground control information remains a necessity for reaching high absolute geometric accuracies (Reinartz et al., 2011).
Due to the nature of SAR systems, the instruments attitude determination has no influence on the absolute geometric accuracy of the orthorectified SAR image. The co-registration of optical and SAR imagery is therefor recognized as an adequate way to improve the absolute geometric accuracy of optical imagery * Corresponding author: +49 81 53 28 15 61 and minimize image misalignment errors (Reinartz et al., 2011) (Merkle et al., 2015 (Perko et al., 2011).
Tie points between the optical and SAR image need to be identified in order to enable later on a geometric transformation and therefor the co-registration of the optical image which leads to a raise by means of absolute geometric accuracy. Establishing the necessary correspondence between the two images can done by intensity-based approaches and feature-based approaches (Zitov and Flusser, 2003). Intensity-based approaches rely on the comparison of raw pixel values, which is problematic to a certain degree when it comes to inter-sensor matching. SAR and optical systems are distinguished by (a) being active and passive systems, (b) recording signals in different wavelength domains (c) with different imaging geometries. The first leads to almost equal response signals for SAR systems independent from weather conditions and image acquisition time in contrast to optical systems where the state of the atmosphere (water vapor content) and image acquisition time (illumination conditions due to season and day-time) strongly influence the image product. The second influences the characteristic of the recorded signal, SAR systems are known for being mostly sensitive to the physical conditions (e.g. surface roughness) whereas optical systems mainly describe the chemical properties (due to absorption/reflectance) of the illuminated entities by means of spectroscopy (Sörgel et al., 2017). Different imaging geometries mentioned third limits for instance the usage of in principal well defined 3D objects with significant borders (e.g.. buildings), as the different viewing perspectives lead to different depictions in 2D image plane the taller the object is. Nevertheless it's possible to establish a statistical measure of similarity given somehow comparable image content. Mutual Information giving a statistical measurement of dependence between two sets has proven to be suitable for resolving the inter-sensor similarity measurement under certain circumstances in the domain of remote sensing (Siddique et al., 2012) (Suri and Reinartz, 2010) mainly overcoming the differences in the optical and SAR image related to systematic differences between the sensor systems (e.g. wavelength/bandwidth and passive versus active system). Roundabouts have already been used successfully combining several advantageous features for inter-sensor matching. They are rotation-invariant, they are almost planar which leads to low projective effects (lowering influence of difference in imaging geometry) and most important as paved roads are often surrounded by vegetation they are easily to identify in optical images because of the color change and in SAR images because of the changing Signal-to-Clutter ratio.
The need for developing fast and efficient algorithms for image matching between optical and SAR images is therefor given, as resolving misalignment errors by co-registration is recognized as not just beneficial but as a prerequisite for almost every later on conducted evaluation. Special attraction is given to fully automated algorithms which minimize the time-consuming process of tie-point localization (Scheffler et al., 2017). Intensity-based algorithms, which can be considered to be template matching algorithms, are often computationally expensive (Merkle et al., 2017). The main components of such algorithms are (a) search and (b) compare, where (a) means exploring a given region in image space and (b) deals with computing a quantitative similarity measure between two image regions.
Whilst extensive research has been done in the domain of developing fast and efficient similarity measurement functions, the search for tie-points is in most cases implemented as a simple moving-window approach. As remote sensing data are in most cases distributed as georefereced datasets, the size of the search space has to consider the spatial offset between the SAR and the optical image and furthermore the absolute pointing accuracy of both sensor systems. This situation leads to a search space with an absolute size of several tens of square meters. Given the submeter resolution of the SAR and optical system, several hundreds of times the similarity function has to be called in order to find the best matching position.
Evolutionary Algorithms have proven their ability to overcome this drawback by means of minimizing computational costs in template matching (Cuevas et al., 2013) (Cuevas et al., 2017) (Oliva et al., 2014. A smart search strategy balancing exploration and exploitation of the search space lowers the number of similarity function calls thus lowering computational costs. In this paper we adapt the canonical evolutionary algorithm for finding tie points between SAR and optical remote sensing imagery. The goal is to significantly reduce computational costs of the search whilst ensuring reliable and stable search results. In section 2 a detailed description of the algorithm and the data preprocessing is given. Section 3 describes the experiments carried out and gives the results. Section 4 concludes with giving a short perspective on possible next steps in the refinement of the proposed algorithm.

ALGORITHM
The aim of the algorithm is to solve the search for the global maximum of a given function within the borders of a search space defined by the size of two georeferenced overlapping images, where the two (optical/SAR) sets of pixel values serve as independent parameters of the objective function. The raw image patches are currently manually pre-selected and passed as parameters. The search algorithm follows the design of the canonical Evolutionary Algorithm (EA) (Zhang et al., 2011). After initializing a population of solution candidates and assigning fitness values using a suitable objective and fitness function to these individuals, an evolution process is applied leading the population to converge towards the formulated optimum. The algorithm is combined with a local search algorithm, as such a hybridization of EAs which is also known as Memetic Algorithm has been proven to be beneficial (Ong et al., 2010). Historical search experiences are incorporated by using a look-up table for positions already investigated in search space. Figure 1 depicts the iterative sequential operations of the algorithm.

Figure 1. Algorithm description
The EA needs to be adapted for the specific problem domain of template matching using SAR and optical remote sensing imagery. In the following we describe the used objective and fitness function, later on we give a brief view on the evolutionary operators for selection, recombination and mutation. The operators of EAs rely on probabilities which are given as parameters. The performance of EAs is strongly influenced by the chosen parameters, we give the used ones for the experiments in Table 1. The determination of an optimal parameter set (e.g. hyperparameter optimization) is a non trivial task and not the focus of this work. Therefor the used parameters were chosen by the authors experience with the goal to enable moderate convergence in the later experiments of this paper. No further hyperparameter tuning was conducted.  Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume IV-3, 2018 ISPRS TC III Mid-term Symposium "Developments, Technologies and Applications in Remote Sensing", 7-10 May, Beijing, China

Objective Function
The objective function has to yield a statement about the quality of match between a subset of the optical image and a patch of the SAR image. As the signal recorded by an optical system differs fundamentally from the signal recorded by a SAR system, the objective function needs to be independent from units and the range of values. Mutual Information (MI) being a measure of the amount of information that one random set of measurements contains about another set of measurements suits well for our application (Cover and Thomas, 2006). The used objective function is therefor given in Equation 1, where MI I(X; Y ) is the relative entropy between the joint distribution and the product distribution p(x)p(y) of the optical and SAR image.
The usage of other objective functions for similarity measurement is possible. Mean absolute difference (MAD), normalized cross-correlation (NCC) and variations of the before mentioned are well established in template matching. However for the specific optic to SAR matching problem the applicability of MI for giving a response about the dependency between an optical and a SAR image has been given by several publications, e.g. (Suri and Reinartz, 2010) and (Siddique et al., 2012).

Search Strategy
The search is formulated as an optimization problem where the goal is to find the position (u, v) * of the global maximum of Equation 1 in search space given two independent sets of pixel values x ∈ X and y ∈ Y , therefor the goal is given with 2.2.1 Fitness Function The higher the MI value of an individual, the higher its fitness. Due to the nature of the objective function and the specific input of SAR and optical image pixel values, the raw MI values cannot be simply translated into a fitness value suitable for an evolutionary algorithm. Only limited assumptions can be made about the value and range of the computed MI values resulting from the comparision of an optical and a SAR image. To overcome this issue a rank-based fitness assignment is used where the range of the fitness values is given by the size of the population. Having a population with n individuals the individual with the highest MI value of the population gets rank n whereas the unfittest individual gets rank 1.

Selection
The determination of individuals used for breeding is realized by a roulette wheel selection. The probability of selection of an individual is given by the ratio between the individuals fitness value and the cumulative sum of all fitness values of the population. For instance within a population of five individuals the fittest individual has a selection probability of 1/3 whereas the unfittest individual has a selection probability of 1/15.

Recombination
The recombination of two individuals is realized in an uniform rectangular fashion. Given two individuals xi and xi+1, the rectangular space is spanned by the row/column-coordinates of both individuals. The crossover points within this space are determined independently by a random uniform distribution.

Mutation
The aim of the mutation is to enable small jumps within the search space preventing the algorithm to get stuck in a local optimum. The mutation operator adds to the rowand column-coordinates a random value. The random value is determined by using a normal distribution.

Local Search
We use steepest ascent hillclimbing. For the 8 neighbors of the fittest individual the objective function is evaluated in a clockwise fashion. If one of the 8 neighboring positions gives a higher MI value, the position of the individual is shifted to this new position. The local search is repeated until there is no more significant raise in the MI value. Figure  2 illustrates this process. The given raster size is 9 rows times 9 columns. The pixel colormap raises from bright green to dark red, where the colors correspond to low and high MI respectively.
The maximum is at the center of the raster. The white triangles with cyan borders show for each iteration of the hillclimbing algorithm the current position. The pixels with bright green corners are the ones for which the MI is computed.  Table) in combination with the corresponding objective function value.

Termination Criterion
The algorithm terminates given no change in the sum of the M I values of the n fittest population member over m iterations. In the following Section 3 the influence of this two parameters on success rate and computational costs of the algorithm are discussed.

Sample Data
The experiments were carried out with three different data sources which are overlapping.
• Two optical images acquired by the WorldView-2 (WV-2) and the Quickbird (QB) satellite having a spatial resolution of 0.5m and 0.6m respectively and • one radar image recorded by the German TerraSAR-X (TSX) satellite having a spatial resolution of 0.5m.
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume IV-3, 2018 ISPRS TC III Mid-term Symposium "Developments, Technologies and Applications in Remote Sensing", 7-10 May, Beijing, China As the three images raster grids are shifted towards each other by a subpixel magnitude, the images were resampled using nearest neighbor algorithm. The smallest common denominator for reaching a common pixel grid spacing between WV-2 and TSX was 0.25m and for QB and TSX it was 0.05m. Figure 3 to 8 give a major view an all issues related to the used data. In the WV-2 scene four roundabouts were determined which can visually also be identified in the TSX scene. Figure 3, 4, 5 and 6 depict subsets of the Worldview-2 scene and the corresponding TerraSAR-X data. Two roundabouts can be identified in the Quickbird scene, Figure 7 and 8 show subsets of the QB scene including these roundabouts and their TSX counterparts.

The four columns (a, b, c, d) of the aforementioned Figures show from left to right:
• A subset of the optical satellite image (e.g. QB) where in transparent cyan the region used for the search is highlighted. The roundabout is marked in transparent green, the center of the roundabout is marked with a green dot.
• The corresponding image slice of the TSX image. The search patch is also highlighted in cyan, the manually determined roundabout position is marked with a red circle and a red dot as centroid.
• The optical image with the search space highlighted in cyan and the roundabout colored in green, the bounding box of the TSX patch is shown in red, the position of the roundabout in the TSX image is shown by a red transparent circle.
• The MI image of the search space -the size is given by the size of the subset in the optical image minus the patch of the SAR image. The colormap goes from dark blue (low MI) to dark red (high MI).
At the data preparation stage we recognized that in two of the six experiments the expected matching position was not the global maximum of the M I function, see Figure 5 (d) and 6 (d). As the formulation of the objective function is not the scope of this paper and as such behavior is known to be a drawback of all similarity measurement based matching methods, the experiments were carried out ignoring this unsatisfying situation.

Experiment Description
The goal of the experiments is to proof that an EA can lower the computational costs for finding tie points between an optical and a SAR image using template matching. Six experiments are carried out as four rotaries are identified in the WV-2 and the TSX scene and two rotaries can be found in both the QB and the TSX scene. The computational costs are measured by means of objective function calls. A moving window algorithm calls the objective function for each position in the search space to find the maximum MI value, which is then expected to be the best matching position. The number of possible positions in the search space is given by the search space size in the optical image minus the SAR patch size. In Table 3 the last column gives for the six experiments the number of function calls if the full search space would be exploited using a moving window approach.
After initialization using the parameters given in Table 1 the aim of the algorithm is to find the maximum MI position in the search space. Each individual of the population represents a solution candidate, the number of objective function calls is counted. The total number of objective function calls is given by the sum of • the initial calls -one for each individual, thus 50 for the whole population • the calls for the offspring evaluation within each new generation, for instance a population size of 50 individuals and a selection gap of 0.7 leads to a maximum of 35 calls per iteration, the number can be lower having the information about already evaluated positions in search space via Lookup- Table at hand • and the additional extra costs for the local search carried out by the fittest population member, an iterative process needing 8 calls per iteration (here also the Lookup- Table is used for lowering computational costs if possible. Two different criterions for termination of the algorithm are used in the following experiments. (a) we compute the global maximum beforehand and terminate the algorithm immediately after reaching this value and (b) the algorithm is terminated after a certain time of no progress. Details concerning both scenarios are given in subsection 3.3.
Besides of this the number of times the search algorithm succeeds is determined. For this experimental setting the true maximum value was computed beforehand.
As Evolutionary Algorithms are Heuristics we follow the law of the big number to get a stable measure about the performance of the algorithm. Each experiment is repeated 1000 times.

Algorithm Performance
The algorithm performance was measured using two different scenarios. In Scenario A the algorithm is terminated immediately after reaching the global optimum. This scenario indicates whether the algorithm converges at what speed and whether the usage of EA for the given search problem is beneficial by means of lowering computational costs. Scenario B deals with the identification of a suitable termination criterion. Obviously the algorithm has to stop when no more progress can be measured. Progress is measured by measuring the cumulative sum of the MI value of the n best individuals and comparing this value to the value reached in previous iterations. Does the value remain stable over m iterations the algorithm stops. The determination of suitable parameters for n and m having the goal of high success rate and low computational costs in mind is the topic of Scenario B.

Scenario A -Optimum known
The results of the experiments where the algorithm was terminated immediately after reaching the global optimum are given in Table 3. The search algorithm always succeeds in three of the six experiments, in one experiment the search was nearly always conducted successfully (> 99.99%). The average number of function calls of these four experiments is in all cases significantly lower than if the whole search space would be explored. For roundabout one and three in the WV-2 scene the number of function calls is lowered to just 1.92% and 5.1% in comparison to the moving window approach. This significance holds also true for both rotaries in the QB scene, where just 3.06% respectively 17.51% of the maximum number of function calls are necessary. For roundabout two in the QB scene the somehow low performance is explained by the anyhow small search space.
For roundabout two and four in the WV-2 scene the success rate did not achieve 100%. The given results with approx. 96.8% and 93.8% are nevertheless promising. As the average number of function calls includes the experiments with no success it's not surprising that the percentage raises to 12.86% and 16.49% respectively. The lower performance in this two cases can be explained by the more complex nature of the search space. In Figure  3 to 8 the right column (d) shows the fully exploited search space by means of MI values. In the four cases the algorithm succeeded with almost 100% the given situation is almost univariate. In Figure 4 and 6 the situation is not that clear which leads obviously to a decrease in the overall success rate and an increase in the average sum of objective function calls.  Figure 9 and 10 show the convergence behavior of the algorithm for the six experiments where Figure 9 depicts the MI value of the fittest population member on the y-axis and the generation number on the x-axis and Figure 10 visualizes the mean MI value of the population on the y-axis and the generation number on the x-axis. Each line represents one experiment, thus 1000 lines are drawn per plot. The opacity of the lines is 5% -the general behavior is highlighted, unexpected behavior is still visible. The mean MI value in both plots is given as red line. Both plots give the proof that the algorithm converges towards the global optimum in a robust and reliable manner.  Table 4 gives the number of fittest individuals and the number of iterations for which no increase in MI was measured until stopping the search. A grid search was done to approximate the best values for this two parameters where the number of individuals was 1, 3, 5, 7, 9 and the number of iterations was 3,5,7,9,11,13,15,17,19. As Figure 11 shows is the average number of function calls during search strongly depending on this two parameters. For both parameters the less the better one might say. The counterpart position is proposed when it comes to the average success rate. Figure 12 clearly indicates that the overall success of the search is the better the more indiviuals are taken into account in measuring search progress and the longer no more progress is measured.

CONCLUSION
An algorithm is proposed which is capable of finding tie points between optical and SAR images. The aim of the algorithm is to reduce significantly the computational costs of the search process. Given the results of the experiments we have the proof that the application of the proposed search algorithm is in most cases beneficial. EAs and their operators rely on probabilities which are given as parameters. In the future our goal is to adapt the operators for the specific problem domain of finding spatially shifted image patches within a multi-sensor environment. Besides of this the algorithm should also be extended by using established techniques which help to reduce the computational costs further. Several generic versions of EAs have been published during the last decade which outperform the canonical one. We found out that basic MI is not that robust in delivering the optimal matching position than expected. The adaption or even development of a new more reliable objective function is not the scope of this workhowever in the future we will evaluate the usage of other more sophisticated functions. As the intensity-based similarity measure between optical and SAR imagery was identified to be the major drawback of our approach, we want to evaluate the usage of EAs searching for geometric features like circles directly in the future. This would probably overcome the issues identified during this research project. Merkle, N., Luo, W., Auer, S., Mller, R. and Urtasun, R., 2017. Exploiting deep matching and sar data for the geo-localization accuracy improvement of optical satellite images. Remote Sensing.