REGION-BASED FUZZY CLUSTERING IMAGE SEGMENTATION ALGORITHM WITH KULLBACK-LEIBLER DISTANCE

ABSTRACT: To effectively describe the uncertainty in remote sensing image segmentation, a novel region-based algorithm using fuzzy clustering and Kullback-Leibler (KL) distance is proposed. By regular tessellation, the image domain is completely divided into sub-blocks to overcome the complex noise present in high-resolution remote sensing images. Taking the blocks as the basic processing units, KL divergence is used to model the distance between blocks and clusters, which enables the model to describe the uncertainty of the non-similarity relationship. In addition, based on Markov Random Field (MRF) theory, a regionalized KL entropy regularization term is established and added to the objective function to account for spatial constraints. Finally, the optimal segmentation result is obtained by estimating the model parameters. Experiments on different kinds of remote sensing images, with comparisons against existing algorithms, demonstrate the performance of the proposed algorithm.


INTRODUCTION
Image segmentation is a key step of image processing, and its accuracy directly affects the quality of image interpretation (Dass et al., 2012; Wang et al., 2018). However, as spatial resolution increases, the rich and detailed surface information increases the heterogeneity within homogeneous regions and complicates the spatial correlation of spectra (Yuan et al., 2014). These characteristics bring more uncertainty to image segmentation and pose new challenges for high-accuracy segmentation (Heshmati et al., 2016).
Fuzzy set theory is one of the most effective tools for dealing with uncertainty, and Fuzzy C-Means (FCM) is the most classical fuzzy algorithm in image segmentation (Gong et al., 2013; Memon, Lee, 2018). The fuzzy membership provides an ingenious way of describing the segmentation uncertainty. However, the noise immunity of the traditional FCM algorithm is too weak to effectively segment high-resolution remote sensing images (Benaichouche et al., 2013; Liu et al., 2009). Many modified fuzzy clustering algorithms have therefore been studied to improve segmentation results (Singh, Garg, 2014; Kalti, Mahjoub, 2014). Miyamoto and Mukaidono (1997) proposed an Entropy-based FCM algorithm (EFCM), which defines an entropy regularization term based on the fuzzy membership according to maximum entropy theory. This reduces the mis-segmentation of regions with similar spectra. However, EFCM cannot control the clustering scale. To overcome this problem, Miyagishi et al. (2001) proposed Kullback-Leibler FCM (KLFCM), in which a scale factor is added to the entropy regularization term according to KL divergence. Furthermore, to consider the spatial correlation of spectra, Zhao et al. (2018) proposed a modified fuzzy clustering algorithm based on Markov Random Field (MRF-FCM). In MRF-FCM, the scale factor is replaced with the prior probability of pixels belonging to clusters, and this prior probability is defined in the label field based on MRF theory. MRF-FCM was a notable advance. However, all of the algorithms mentioned above are pixel-based, so they cannot effectively overcome the complex noise present in high-resolution remote sensing images. In addition, their ability to describe the segmentation uncertainty is limited by distance models defined by Euclidean distance or a Gaussian distribution.
In this paper, a region-based fuzzy clustering image segmentation algorithm with KL distance is proposed to improve both the robustness to noise and the description of segmentation uncertainty. First, the image domain is completely divided into sub-blocks by a regular tessellation strategy (Wang et al., 2015), and the resulting blocks are treated as the basic processing units during segmentation. Then, assuming that the spectra of pixels in the same cluster follow a Gaussian distribution, the distance between blocks and clusters is modeled by KL divergence. Furthermore, a regionalized KL entropy regularization term with spatial constraints is established and added to the objective function, based on the theories of KL divergence and MRF. To estimate the optimal parameters, the Lagrange function method and the Markov Chain Monte Carlo (MCMC) method (Zhao et al., 2014) are selected according to the characteristics of the segmentation model parameters.

Regular Tessellation
Let I = {Ii (xi1, xi2): i = 1, ..., n} denote the remote sensing image, where i and n are the index and total number of pixels respectively, (xi1, xi2) ∈ P are the lattice coordinates of pixel i, P = {(xi1, xi2): i = 1, ..., n} is the image domain, Ii = (Iie: e = 1, ..., r) is the spectral vector of pixel i, and e and r are the index and total number of bands respectively. To regionalize the image domain, regular tessellation is used to divide P into sub-blocks, P = {Pj: j = 1, ..., m}, where j and m are the index and total number of blocks respectively. Each block is Pj = {(xi1, xi2): Bi = j}, where Bi is the index of the block to which pixel i belongs. The minimum block size is 2, and all other sizes are integer multiples of 2. Taking the blocks as the basic processing units, the label field can be expressed as L = {Lj: j = 1, ..., m}, Lj ∈ {1, ..., k}, where k is the number of clusters, also called homogeneous regions.
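The block assignment Bi can be illustrated with a minimal sketch. It assumes square blocks of side block_size laid out in row-major order; the function name regular_tessellation and the 0-based block indices are illustrative choices, not taken from the paper, and blocks at the right and bottom borders are simply truncated when the image size is not a multiple of the block size.

```python
import numpy as np

def regular_tessellation(height, width, block_size):
    """Return B with B[y, x] = index of the block containing pixel (y, x).

    Assumption: square blocks, row-major numbering, indices start at 0
    (the text numbers blocks from 1 to m).
    """
    if block_size < 2 or block_size % 2 != 0:
        raise ValueError("block_size must be an even integer >= 2")
    block_rows = np.arange(height) // block_size      # block row of each pixel row
    block_cols = np.arange(width) // block_size       # block column of each pixel column
    blocks_per_row = -(-width // block_size)          # ceiling division
    return block_rows[:, None] * blocks_per_row + block_cols[None, :]

# A 4 x 6 image with 2 x 2 blocks gives m = 6 blocks of 4 pixels each.
B = regular_tessellation(4, 6, 2)
m = int(B.max()) + 1
```

The pixels of block j are then recovered with the mask B == j, which corresponds to the set Pj in the text.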

Kullback-Leibler Distance
Assume that the spectra of pixels in the same cluster follow a Gaussian distribution. A homogeneous region consists of several blocks, and the probability density function (pdf) of the spectrum conditioned on Lj = l, p(Zj; θl), is a Gaussian density. To describe the uncertainty of the non-similarity relationship between blocks and clusters, KL divergence is used to model the distance between p(Zj; θ'j) and p(Zj; θl), where θ'j = {μ'j, Σ'j} is the distribution parameter set of block j, μ'j and Σ'j are the mean and covariance of the Gaussian distribution of block j, p(Zj; θ'j) is the pdf of the spectrum in block j, and p(Zj; θl) is the pdf of the spectrum conditioned on Lj = l.
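For two Gaussian densities the KL divergence has a well-known closed form, which the following sketch evaluates. It assumes the divergence is taken from the block density to the cluster density; the direction used in the paper, the function names block_params and gaussian_kl, and the small ridge term added to keep the block covariance invertible are all assumptions made for illustration.

```python
import numpy as np

def block_params(pixels, ridge=1e-6):
    """Estimate (mean, covariance) of a block from its pixels, shape (N, r).

    The ridge term is an assumption: small blocks can yield a singular
    sample covariance, so a small multiple of the identity is added.
    """
    pixels = np.asarray(pixels, dtype=float)
    mu = pixels.mean(axis=0)
    cov = np.atleast_2d(np.cov(pixels, rowvar=False))
    return mu, cov + ridge * np.eye(pixels.shape[1])

def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL( N(mu0, cov0) || N(mu1, cov1) )."""
    mu0 = np.atleast_1d(np.asarray(mu0, dtype=float))
    mu1 = np.atleast_1d(np.asarray(mu1, dtype=float))
    cov0 = np.atleast_2d(np.asarray(cov0, dtype=float))
    cov1 = np.atleast_2d(np.asarray(cov1, dtype=float))
    d = mu0.shape[0]
    diff = mu1 - mu0
    cov1_inv = np.linalg.inv(cov1)
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (np.trace(cov1_inv @ cov0) + diff @ cov1_inv @ diff
                  - d + logdet1 - logdet0)
```

Because KL divergence is not symmetric, swapping the block and cluster arguments generally changes the value, so the direction has to be fixed once and used consistently.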

Objective Function
Let U = [ujl]m×k be the fuzzy membership matrix describing the clustering uncertainty, where ujl is the fuzzy membership of block j in cluster l and satisfies 0 ≤ ujl ≤ 1. The second term in Equation (4) is the regionalized KL entropy regularization term, where Djl = DKL(Zj, θl), γ is the fuzzy factor, λ is the coefficient of the regularization term, πjl is the prior probability of block j belonging to cluster l, and Nj is the number of pixels in block j.
To account for spatial interaction, πjl is defined based on the MRF model. Let ∂j be the neighborhood set of block j, ∂j = {Pj': Pj' ≠ Pj, Pj' ~ Pj}, where "~" denotes the neighbouring relation. Then πjl is defined from the labels of the neighbouring blocks, where β is the intensity of the neighborhood influence and t(Lj, Lj') = 1 if and only if Lj = Lj'.
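A prior of this kind can be sketched as follows. The sketch assumes the common Potts-type form, in which πjl is proportional to exp(β × number of neighbouring blocks labelled l), and a 4-connected neighbourhood on the block grid; neither the exact formula nor the neighbourhood system is recoverable from the text above, so both are assumptions, as is the function name mrf_prior.

```python
import numpy as np

def mrf_prior(labels, k, beta):
    """Potts-type prior on a grid of block labels.

    labels : (rows, cols) integer array with values in 0..k-1
    returns: (rows, cols, k) array, prior[a, b, l] = pi_jl for block (a, b)
    Assumptions: 4-connected neighbourhood, softmax over beta * neighbour counts.
    """
    labels = np.asarray(labels)
    onehot = np.eye(k)[labels]                            # (rows, cols, k)
    padded = np.pad(onehot, ((1, 1), (1, 1), (0, 0)))     # zero border: no neighbours outside
    counts = (padded[:-2, 1:-1] + padded[2:, 1:-1]        # neighbours above and below
              + padded[1:-1, :-2] + padded[1:-1, 2:])     # neighbours left and right
    energy = beta * counts
    energy -= energy.max(axis=-1, keepdims=True)          # numerical stability
    weights = np.exp(energy)
    return weights / weights.sum(axis=-1, keepdims=True)

labels = np.array([[0, 0],
                   [0, 1]])
prior = mrf_prior(labels, k=2, beta=1.0)
```

With β = 0 the prior is uniform over clusters, and larger β pulls each block more strongly towards the labels of its neighbours.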

Parameter Estimation
There are many types of parameters in the objective function. Different estimation methods are designed according to the characteristics of parameters.
For the fuzzy membership ujl, a Lagrange function is established according to the constraint condition, which yields the update of ujl in Equation (8).
For the Gaussian distribution parameters θl, direct estimation by differentiation is difficult, so the MCMC strategy, a classical parameter estimation method, is adopted. The proposed change of a parameter is assumed to follow a Gaussian distribution whose mean is the current value of the parameter and whose variance is a given number, i.e., σ0. First, one cluster, say l, is randomly selected, and its parameter at iteration t, θl(t), is replaced with a candidate parameter θl*.
Then, the objective function J* is recalculated with the new variables ujl* and Djl*, which are obtained from θl* according to Equations (8) and (3), respectively. If J* < J, the candidate parameter θl* is accepted; otherwise, θl(t) is kept.
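The propose-and-accept step described above can be sketched as follows. Because Equations (4) and (8) are not reproduced here, the sketch substitutes the standard KL-regularized fuzzy objective, sum(u·D) + λ·sum(u·log(u/π)), whose minimizer under the sum-to-one constraint is u ∝ π·exp(−D/λ); the fuzzy factor γ and block size Nj are omitted. It also perturbs only cluster means and uses a squared Euclidean distance as a stand-in for the KL distance. All function names and the toy data are hypothetical.

```python
import numpy as np

def update_memberships(D, prior, lam):
    """Minimiser of sum(U*D) + lam*sum(U*log(U/prior)) with rows summing to 1."""
    logits = np.log(prior) - D / lam
    logits -= logits.max(axis=1, keepdims=True)           # numerical stability
    U = np.exp(logits)
    return U / U.sum(axis=1, keepdims=True)

def objective(U, D, prior, lam, eps=1e-12):
    """Assumed KL-regularised objective (not the paper's Equation (4))."""
    return float(np.sum(U * D) + lam * np.sum(U * np.log((U + eps) / (prior + eps))))

def mcmc_step(theta, distance_fn, prior, lam, proposal_var, J, rng):
    """Perturb one randomly chosen cluster; keep the candidate only if J decreases."""
    l = rng.integers(theta.shape[0])
    candidate = theta.copy()
    candidate[l] = rng.normal(theta[l], np.sqrt(proposal_var))  # Gaussian proposal
    D_star = distance_fn(candidate)
    U_star = update_memberships(D_star, prior, lam)
    J_star = objective(U_star, D_star, prior, lam)
    if J_star < J:
        return candidate, J_star, True
    return theta, J, False

# Toy data: 40 one-band "block means" drawn around 0 and 1 (hypothetical).
rng = np.random.default_rng(0)
block_means = np.concatenate([rng.normal(0.0, 0.1, (20, 1)),
                              rng.normal(1.0, 0.1, (20, 1))])
prior = np.full((40, 2), 0.5)
lam = 0.1

def distance_fn(theta):
    # Stand-in distance: squared Euclidean between block means and cluster means.
    return ((block_means[:, None, :] - theta[None, :, :]) ** 2).sum(axis=2)

theta = np.array([[0.4], [0.6]])
D = distance_fn(theta)
J = objective(update_memberships(D, prior, lam), D, prior, lam)
history = [J]
for _ in range(200):
    theta, J, accepted = mcmc_step(theta, distance_fn, prior, lam, 0.01, J, rng)
    history.append(J)
U = update_memberships(distance_fn(theta), prior, lam)
```

Since a candidate is kept only when it lowers the objective, the recorded values of J never increase. Perturbing covariance matrices in the same way would additionally require keeping them symmetric positive definite, which this sketch does not address.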

EXPERIMENTS AND RESULTS
To evaluate the effectiveness of the proposed algorithm, FCM, KLFCM and MRF-FCM are tested as comparison algorithms on a simulated image and on remote sensing images. The simulated image, generated with Gaussian random noise, is shown in Figure 1(a), and its template is shown in Figure 1 [...] KLFCM improves the segmentation result with the help of KL divergence, but there is no obvious visual difference between FCM and KLFCM. MRF-FCM is better than the others because of the spatial constraints, but its noise immunity is still limited.
Figure 3. Segmentation result of the simulated image with comparing algorithms.
The proposed algorithm is also tested on different kinds of remote sensing images shown in Figure [...]
Figure 2. Segmentation result of the simulated image with the proposed algorithm.
[...] but also the remote sensing images. Moreover, homogeneous regions with similar spectra can also be segmented.
Figure 6(a1)-(c1), (a2)-(c2) and (a3)-(c3) show the segmentation results of the remote sensing images with the FCM, KLFCM and MRF-FCM algorithms respectively. It can be seen that FCM and KLFCM have difficulty segmenting regions with similar spectra, as shown in Figure 6(b1) and (b2), and perform poorly on the SAR image, as shown in Figure 6(c1) and (c2). For MRF-FCM, although the spatial constraint is considered, many pixels are still mis-segmented.
Figure 6. Segmentation results of remote sensing images with comparing algorithms.

CONCLUSION
This paper proposed a region-based fuzzy clustering image segmentation algorithm with KL distance. Regular tessellation is used to divide the image domain into sub-blocks and thereby regionalize it, which lays an effective foundation for reducing the sensitivity to noise. In addition, KL divergence is used to define both the distance between blocks and clusters and the entropy regularization term, which describes the segmentation uncertainty more accurately and yields better segmentation results. Currently, the segmentation results depend on the block size of the regular tessellation. To improve the performance, a region-based fuzzy clustering segmentation algorithm with an adaptive adjustment strategy for blocks will be studied in the future.