SENSOR SIMULATION BASED HYPERSPECTRAL IMAGE ENHANCEMENT WITH MINIMAL SPECTRAL DISTORTION

In the recent past, remotely sensed data with high spectral resolution has become available and has been explored for various agricultural and geological applications. While the spectral signatures of the objects of interest provide important clues, the relatively poor spatial resolution of hyperspectral images limits their utility and performance. In this context, hyperspectral image enhancement using multispectral data has been actively pursued to improve the spatial resolution of such imagery and thus enhance its use for classification and composition analysis in various applications. However, this also poses a challenge in managing the trade-off between improved spatial detail and the distortion of spectral signatures in the fused outcomes. This paper proposes a strategy of using vector decomposition as a model to transfer the spatial detail from relatively higher resolution data, in association with sensor simulation, to generate a fused hyperspectral image while preserving the inter-band spectral variability. The results of this approach demonstrate that the spectral separation between classes is better captured, which helps improve classification accuracy over mixed pixels of the original low resolution hyperspectral data. In addition, quantitative analysis using a rank-correlation metric shows the appropriateness of the proposed method over other known approaches with regard to preserving the spectral signatures.


INTRODUCTION
Hyperspectral imaging, or imaging spectroscopy, has gained considerable attention in the remote sensing community due to its utility in various scientific domains. It has been successfully used in issues related to the atmosphere, such as water vapour (Schläpfer et al., 1998), cloud properties and aerosols (Gao et al., 2002); issues related to ecology, such as chlorophyll content (Zarco-Tejada et al., 2001), leaf water content and pigment identification (Cheng et al., 2006); issues related to geology, such as mineral detection (Hunt, 1977); and issues related to commerce, such as agriculture (Haboudane et al., 2004) and forest production.
The detailed pixel spectrum available through hyperspectral images provides much more information about a surface than is available in a traditional multispectral pixel spectrum. By exploiting these fine spectral differences between various natural and manmade materials of interest, hyperspectral data can support improved detection and classification capabilities relative to panchromatic and multispectral remote sensors (Lee, 2004) (Schlerf et al., 2005) (Adam et al., 2010) (Govender et al., 2007) (Xu and Gong, 2007).
Though hyperspectral images contain high spectral information, they usually have low spatial resolution due to the fundamental tradeoff between spatial resolution, spectral resolution, and radiometric sensitivity in the design of electro-optical sensor systems. Thus, multispectral data sets generally have low spectral resolution but high spatial resolution. On the other hand, hyperspectral datasets have low spatial resolution but high spectral resolution. This coarse spatial resolution results in pixels consisting of signals from more than one material. Such pixels are called mixed pixels. This phenomenon reduces the accuracy of classification and other tasks (Villa et al., 2011b) (Villa et al., 2011a).
With the advent of numerous new sensors of varying specifications, multi-source data analysis has gained considerable attention. In the context of hyperspectral data, hyperspectral image enhancement using multispectral data has gained considerable attention in the very recent past. Multi-sensor image enhancement of hyperspectral data has been viewed from many perspectives and thus a variety of approaches have been proposed. Algorithms for pansharpening of multispectral data such as CN sharpening (Vrabel, 1996), PCA based sharpening (Chavez, 1991) and wavelet based fusion (Amolins et al., 2007) have been extended for hyperspectral image enhancement. Component substitution based extensions such as PCA based sharpening suffer from the fact that information in the lower components, which might be critical in classification and detection, may be discarded and replaced with the inherent bias that exists due to band redundancy. Frequency based methods such as wavelets have the limitation that they are computationally more expensive, require appropriate values of the parameters and in general do not preserve spectral characteristics of small but significant objects in the image. Various methods (Gross and Schott, 1998) using linear mixing models have been proposed to obtain sub pixel compositions which are then distributed spatially under spatial autocorrelation constraints. The issue with these algorithms is that they do not have robust ways of determining the spatial distribution of pixel compositions. Recently, methods incorporating a Bayesian framework (Eismann and Hardie, 2005) (Zhang et al., 2008) have been proposed that model the enhancement process as a generative model and obtain the enhanced image as the maximum likelihood estimate of the model. The challenge with these methods is that they make various assumptions on the distribution of data and require certain correlations to exist in the data for good performance.
A core issue with most of these algorithms is that they do not consider the physical characteristics of the detection system, i.e., that each sensor works in different regions of the electromagnetic spectrum. Ignoring this fact leads to injection of spectral information from a different part of the spectrum which may not belong to the sensor, and this leads to modification of spectral signatures in the fused hyperspectral data. This makes the enhanced image inappropriate for image processing tasks such as classification, object detection etc.
Here, we propose a new approach, Hyperspectral Image Enhancement Using Sensor Simulation and Vector Decomposition (HySSVD), for improving the spatial resolution of hyperspectral data using high spatial resolution multispectral data. This paper aims at exploring how well enhanced images from different algorithms mimic the true or desired spectral variability in the fused hyperspectral image. Through various experiments we show that there exists a trade-off between improvement in spatial detail and distortion of spectral signatures.

DATA
Hyperspectral data from the HYPERION sensor on board the EO-1 spacecraft has been used for this study. HYPERION provides a high resolution hyperspectral imager capable of resolving 220 spectral bands (from 0.4 to 2.5 micrometers) with a 30-meter spatial resolution and provides detailed spectral mapping across all 220 channels with high radiometric accuracy.
For multispectral data, the ALI (Advanced Land Imager) sensor on board the EO-1 spacecraft has been used. ALI provides Landsat type panchromatic and multispectral bands. These bands have been designed to mimic six Landsat bands with three additional bands covering 0.433-0.453, 0.845-0.890, and 1.20-1.30 micrometers.
Multispectral bands are available at 30-meter spatial resolution.
Since hyperspectral bands have a very narrow spectral range (10 nm), they are also referred to by their center wavelength. Table 1 shows the spectral range of the multispectral bands that mimic six Landsat bands, together with the list of hyperspectral bands whose center wavelengths lie in the range of each multispectral band.
Since both hyperspectral data and multispectral data are at 30m spatial resolution, the hyperspectral data has been downsampled to 120m in this work. Hence a resolution ratio of 4:1 between multispectral data and hyperspectral data has been established. Moreover, in this setting the hyperspectral data at 30m can be used as validation data against which the output of different algorithms can be compared. Here, the hyperspectral data at 30m will be referred to as THS (True Hyperspectral) and the hyperspectral data at 120m will be referred to as OHS (Original Hyperspectral), which will be enhanced by the algorithms. Multispectral data at 30m will be referred to as OMS. Figure 1 shows RGB composites of the datasets used in this study. The algorithms use OHS with OMS to generate the fused hyperspectral image (FHS) at 30m. This fused result is compared with THS for performance analysis. The ground truth is available through NASA's Cropscape website (Han et al., 2012).
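The 30m to 120m degradation described above can be illustrated with a short sketch. The text does not state which resampling method was used, so plain 4x4 block averaging is an assumption here, and the function name `block_average` and the toy cube are hypothetical.

```python
import numpy as np

def block_average(cube, factor=4):
    """Downsample a (rows, cols, bands) cube by averaging factor x factor blocks.

    Assumption: block averaging stands in for the (unspecified) resampling
    used to derive OHS (120m) from THS (30m).
    """
    rows, cols, bands = cube.shape
    rows_c, cols_c = rows // factor, cols // factor
    # Crop so that both spatial dimensions are multiples of the factor.
    cropped = cube[: rows_c * factor, : cols_c * factor, :].astype(float)
    blocks = cropped.reshape(rows_c, factor, cols_c, factor, bands)
    # Average over the two within-block axes.
    return blocks.mean(axis=(1, 3))

# Toy stand-in for THS: 8 x 8 pixels, 2 bands.
ths = np.arange(8 * 8 * 2, dtype=float).reshape(8, 8, 2)
ohs = block_average(ths, factor=4)  # shape (2, 2, 2)
```

Each OHS pixel in this sketch summarises 16 THS pixels, which matches the 4:1 ratio used in the study.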
In order to properly show the visual quality of various images, a magnified view of the part of the image enclosed in dotted lines in figure 1(a) will be used. For statistical analysis, the complete image will be used.

PROPOSED APPROACH
The algorithm presented here, HySSVD, has three main stages.
Figure 2 shows the flowchart of the algorithm. As a preprocessing step, the original low resolution hyperspectral data (OHS) is upscaled to the spatial resolution of the original multispectral data (OMS). The upscaled data will be referred to as UHS. In the first stage, simulated multispectral (SMS) bands are generated using UHS bands and the Spectral Response Functions (SRF) of the OMS bands. In the second stage, each SMS band is enhanced using its corresponding OMS band to generate fused multispectral (FMS) bands. In the third stage, Fused Hyperspectral (FHS) bands are computed by inverse transformation using vector decomposition. The following subsections explain each stage in detail.

Generating Simulated Multispectral (SMS) bands
Hyperspectral data has been considered in remote sensing to cross-calibrate a hyperspectral sensor with another hyperspectral or multispectral sensor (Teillet et al., 2001), and to simulate data of future sensors (Barry et al., 2002). The algorithm exploits the sensor simulation capabilities of hyperspectral data using the spectral response function of the sensor to be simulated. Figure 3 shows the spectral response function of a multispectral band from the ALI sensor (blue curve). The spectral response function of a sensor defines the probability that a photon of a given wavelength is detected by this sensor. As we can see from the response function of the band, it is non zero for only some wavelengths and its response varies for different wavelengths. The value recorded by the sensor is proportional to the total incident light that it was able to detect. In other words, all the wavelengths that a sensor is able to detect contribute to its value. The weight of the contribution of a particular wavelength is determined by the SRF value at that wavelength. Ideally, to simulate a multispectral band, we need light from all wavelengths within the SRF of that band. But this level of spectral detail is not available in real sensors. Hyperspectral sensors are the closest approximation for the data that can be used for simulating a sensor with a wide SRF. As mentioned before, hyperspectral bands are generally referred to by their center wavelengths as their SRFs are very narrow. Figure 3 shows these center wavelengths as vertical red lines. The hyperspectral bands that contribute to the simulation of a multispectral band are the ones which have their center wavelength within the spectral range of the multispectral band. Table 1 shows the set of hyperspectral bands selected for each multispectral band.
Many methods exist for simulating multispectral data with desired wide-band SRFs. Most methods synthesize a multispectral band by a weighted sum of hyperspectral bands, and they differ in how they determine the weighting factors. Some methods directly convolve the multispectral filter functions with the hyperspectral data (Green and Shimada, 1997), which is equivalent to using the values of the multispectral SRF as the weighting factors. Some have used the integral of the product of the hyperspectral and multispectral SRFs as the weight (Barry et al., 2002). A few have calculated the weights by finding the least squares approximation of a multispectral SRF by a linear combination of the hyperspectral SRFs (Blonski et al., 2001). Bowles et al. (1996) used a spectral binning technique to get synthetic image cubes with exponentially decreasing spectral resolution, where the equivalent weighting factors are binary numbers.
The algorithm presented here has adopted the method described in (Green and Shimada, 1997). Firstly, all OHS bands are upscaled to the spatial resolution of the given OMS data to get UHS. If m is the number of hyperspectral bands falling in the range of the multispectral band k ($OMS_k$), then let $W_k$ be the m dimensional weight vector calculated using the spectral response function of the multispectral band k. For a pixel at location (i,j), let $UHS_{i,j,k}$ be the m dimensional vector containing the intensity values of those m hyperspectral bands corresponding to $OMS_k$. The simulated value $SMS_{i,j,k}$ for the pixel (i,j) can be obtained using the following equation.
$$SMS_{i,j,k} = W_k^{T}\, UHS_{i,j,k} \qquad (1)$$
which is the inner product of the two vectors. The vector $W_k^{T}$ is computed as
$$W_k^{T} = [\, SRF_k(C_1),\ SRF_k(C_2),\ \dots,\ SRF_k(C_m) \,] \qquad (2)$$
where $SRF_k$ is the spectral response function of the multispectral band k and $C_i$ is the center wavelength of the hyperspectral band i.
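As a minimal sketch of this stage, the weights can be obtained by sampling a tabulated SRF at the hyperspectral center wavelengths and the simulated band by a per-pixel inner product. The function name `simulate_band`, the linear interpolation of the SRF table, and all wavelength and SRF values below are illustrative assumptions, not values from the ALI or HYPERION specifications. No normalization of the weights is applied here; dividing by the sum of the weights would be a possible variant.

```python
import numpy as np

def simulate_band(uhs, centers, srf_wavelengths, srf_values):
    """Simulate one multispectral band from m hyperspectral bands.

    uhs             : (rows, cols, m) upscaled hyperspectral bands for band k
    centers         : (m,) center wavelengths of those bands
    srf_wavelengths : increasing wavelengths at which the SRF is tabulated
    srf_values      : SRF of multispectral band k at those wavelengths
    """
    # Weight vector W_k: SRF of band k sampled at the center wavelengths.
    w = np.interp(centers, srf_wavelengths, srf_values)
    # Inner product W_k^T UHS at every pixel.
    sms = uhs @ w
    return sms, w

# Hypothetical SRF table and band centers (micrometers).
srf_wl = np.array([0.60, 0.63, 0.66, 0.69, 0.72])
srf_val = np.array([0.0, 0.8, 1.0, 0.8, 0.0])
centers = np.array([0.63, 0.66, 0.69])

uhs = np.ones((2, 2, 3))
sms, w = simulate_band(uhs, centers, srf_wl, srf_val)
```

With unit intensities in all three bands, each simulated pixel is simply the sum of the three weights.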
The reason for creating simulated multispectral bands is two-fold. First, since high spatial detail is available at multispectral resolution, transferring spatial detail at the multispectral level would be more effective than transferring spatial detail from a multispectral band to a hyperspectral band directly. Second, this ensures that the algorithm does not enhance a hyperspectral band whose spectral information is not part of the multispectral band in question. The drawback of this approach is that some hyperspectral bands will not be enhanced. However, other approaches will also cause spectral distortion in these bands, as the multispectral data does not contain information about them.

Generating Fused Multispectral Bands
After stage 1, SMS bands are obtained. These bands have the same spatial resolution as the OMS bands but have poor detail compared to the OMS bands, because the SMS bands are simulated using upscaled hyperspectral bands. In this step, spatial detail from an OMS band is transferred to its corresponding SMS band. This step can be seen as sharpening of a grayscale image using another grayscale image. One relevant concern here is why there is a need to transfer detail and create a fused multispectral band at all. Since, theoretically, an SMS band is just a low spatial resolution version of its corresponding OMS band, the OMS band itself could be treated as the fused high spatial resolution version of the SMS band. But in many real situations multispectral data and hyperspectral data can be from different dates. Because of different atmospheric conditions and other factors, an OMS band cannot be taken as a direct enhanced version of its SMS band. Hence, we need methods that can transfer only spatial detail while not distorting the spectral properties of the SMS band. Many methods exist to do this operation. In this paper the Smoothing Filter Based Intensity Modulation (SFIM) algorithm has been adopted (Liu, 2000). SFIM can be represented as
$$FMS_{i,j,k} = \frac{SMS_{i,j,k} \cdot OMS_{i,j,k}}{\overline{OMS}_{i,j,k}} \qquad (3)$$
where $OMS_{i,j,k}$, $SMS_{i,j,k}$ and $FMS_{i,j,k}$ are the original multispectral value, simulated multispectral value and fused multispectral value respectively at a pixel (i,j) for the band k. $\overline{OMS}_{i,j,k}$ is the mean value calculated by using an averaging filter over a neighborhood equivalent in size to the spatial resolution of the low-resolution data. This operation is applied to each SMS band to obtain the corresponding FMS band.
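The SFIM ratio can be sketched in a few lines of NumPy. The helper names `box_mean` and `sfim`, the edge padding, the 4x4 window (chosen to match the 4:1 resolution ratio of this study) and the small `eps` guard against division by zero are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_mean(img, size):
    """Mean of img over a sliding size x size window (edge-padded)."""
    before = size // 2
    after = size - 1 - before
    padded = np.pad(img.astype(float), ((before, after), (before, after)), mode="edge")
    windows = sliding_window_view(padded, (size, size))
    # One mean per output pixel; output has the same shape as img.
    return windows.mean(axis=(-2, -1))

def sfim(sms, oms, size=4, eps=1e-12):
    """Fused band: SMS modulated by the ratio of OMS to its local mean."""
    local_mean = box_mean(oms, size)
    return sms * oms / np.maximum(local_mean, eps)

rng = np.random.default_rng(0)
oms = rng.random((12, 12)) + 0.5   # toy high-resolution band
sms = box_mean(oms, 4)             # toy low-detail band on the same grid
fms = sfim(sms, oms, size=4)
```

In this toy setting the simulated band is exactly the smoothed OMS band, so the ratio restores the OMS band; with real data the two differ and only the local spatial modulation is transferred.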
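Figure 4 shows results of this step on only the part enclosed in dotted lines in figure 1(a) due to space constraints. Figure 4(a) shows band 5 from the original multispectral data ($OMS_5$), while Figure 4(b) shows simulated multispectral band 5 ($SMS_5$) corresponding to band 5 of the ALI data. Figure 4(c) shows the fused result for band 5 ($FMS_5$). It is clearly evident that features have become sharper in the fused result. Now this detail has to be transferred to each hyperspectral band that contributed to the simulation of this band.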

Generating Fused Hyperspectral (FHS) Bands Using Vector Decomposition Method
The previous stage generates FMS bands. The spatial detail from the FMS bands has to be transferred into the hyperspectral bands. Here, we explain the process of detail transfer at this stage. Expanding equation 1, we have
$$w_1 uhs_1 + w_2 uhs_2 + \dots + w_{m-1} uhs_{m-1} + w_m uhs_m = SMS_{i,j,k} \qquad (4)$$
where $w_i$ and $uhs_i$ are elements of the vectors $W_k$ and $UHS_{i,j,k}$ respectively. This is the equation of a hyperplane in m dimensional space on which we know a point, $UHS_{i,j,k}$. This plane will be referred to as $P_{SMS}$. Let $\vec{n}$ be the normal of this plane, which points along $W_k$. Let $FHS_{i,j,k}$ be the m dimensional vector representing the fused hyperspectral values for the m selected bands of multispectral band k. Fused multispectral (FMS) data can alternatively be estimated using the sensor simulation strategy from section (3.1):
$$w_1 f_1 + w_2 f_2 + \dots + w_{m-1} f_{m-1} + w_m f_m = FMS_{i,j,k} \qquad (5)$$
where $w_i$ and $f_i$ are elements of the vectors $W_k$ and $FHS_{i,j,k}$ respectively. Again this is the equation of a hyperplane in m dimensional space, on which we wish to estimate the point $FHS_{i,j,k}$. This plane will be referred to as $P_{FMS}$. Equations 4 and 5 represent two parallel hyperplanes. Their separation d is determined by the difference between the fused and simulated multispectral values at that pixel, $FMS_{i,j,k} - SMS_{i,j,k}$: for a unit-length weight vector d equals this difference, and in general $d = (FMS_{i,j,k} - SMS_{i,j,k}) / \lVert W_k \rVert$.
Since we aim to achieve enhanced spectra with the least spectral distortion, we estimate the point $FHS_{i,j,k}$ as the point on the plane $P_{FMS}$ which is closest to the point $UHS_{i,j,k}$ on the plane $P_{SMS}$. Hence, this point is the intersection of the plane $P_{FMS}$ and the line perpendicular to the plane $P_{SMS}$ passing through the point $UHS_{i,j,k}$. Figure 5 illustrates the method of estimation using two dimensions.
The two axes represent two bands of hyperspectral data corresponding to a multispectral band. Say x and y are the values in these bands for a pixel, and d is the difference between the simulated multispectral value and the fused multispectral value at that pixel. $P_1$ is the plane that represents equation 4 and $P_2$ is the plane that represents equation 5. Here (x, y) can be understood as the components of the vector $UHS_{i,j,k}$ and $(x + d\cos\alpha,\ y + d\sin\alpha)$ can be understood as the components of $FHS_{i,j,k}$, where $\alpha$ is the angle that the common normal of the two planes makes with the x axis.

Figure 5: Geometric interpretation of Vector decomposition based detail transfer
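The closest-point construction can be sketched as an orthogonal projection of UHS onto the plane of equation 5. The function name `vector_decomposition` and the toy values are hypothetical. The step along the weight vector is scaled by the squared norm of the weights so that re-simulating the fused spectrum returns FMS exactly; for a unit-length weight vector this reduces to adding d times the unit normal, as in the two dimensional illustration.

```python
import numpy as np

def vector_decomposition(uhs, w, fms):
    """Closest point on the FMS hyperplane to each UHS vector.

    uhs : (rows, cols, m) upscaled hyperspectral values for band k
    w   : (m,) weight vector W_k
    fms : (rows, cols) fused multispectral band k
    """
    w = np.asarray(w, dtype=float)
    sms = uhs @ w                     # left side of equation 4 at UHS
    step = (fms - sms) / (w @ w)      # signed step along w to reach equation 5
    # Move every pixel vector along the plane normal (direction of w).
    return uhs + step[..., None] * w

w = np.array([0.6, 0.8])                 # unit-length weights for a 2-band toy case
uhs = np.array([[[1.0, 2.0]]])           # one pixel, (x, y) = (1, 2)
fms = np.array([[2.7]])                  # simulated value is 2.2, so d = 0.5
fhs = vector_decomposition(uhs, w, fms)  # (1 + 0.5*0.6, 2 + 0.5*0.8)
```

Because the displacement is parallel to the weight vector, the change applied to each hyperspectral band is proportional to that band's weight in the simulation.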

RESULTS AND DISCUSSION
In order to compare the performance of HySSVD with existing work, Principal Component Analysis based technique (PCA) (Tsai et al., 2007) has been implemented for comparison.
PCA first takes the principal component transform of the input low spatial resolution hyperspectral bands corresponding to a given multispectral band. Then the first principal component is replaced by the high spatial resolution multispectral band. An inverse transform on this matrix then returns hyperspectral bands with high spatial and spectral resolution.
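A generic component-substitution sketch of this baseline is given below. It is not necessarily the exact procedure of (Tsai et al., 2007): the function name `pca_substitution`, the sign alignment and the mean and standard deviation matching of the multispectral band to the first principal component are assumptions commonly made in component substitution.

```python
import numpy as np

def pca_substitution(uhs, oms_band):
    """Replace the first principal component of uhs with a multispectral band.

    uhs      : (rows, cols, m) upscaled hyperspectral bands, m >= 2
    oms_band : (rows, cols) high-resolution multispectral band
    """
    rows, cols, m = uhs.shape
    x = uhs.reshape(-1, m).astype(float)
    mean = x.mean(axis=0)
    xc = x - mean
    # Principal axes from the band covariance matrix, largest variance first.
    eigvals, eigvecs = np.linalg.eigh(np.cov(xc, rowvar=False))
    eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]
    scores = xc @ eigvecs
    pc1 = scores[:, 0]
    hi = oms_band.reshape(-1).astype(float)
    # Assumption: align the sign of the band with PC1, then match mean and spread.
    if np.corrcoef(hi, pc1)[0, 1] < 0:
        hi = -hi
    hi = (hi - hi.mean()) / hi.std() * pc1.std() + pc1.mean()
    scores[:, 0] = hi
    # Inverse transform back to band space.
    fused = scores @ eigvecs.T + mean
    return fused.reshape(rows, cols, m)

rng = np.random.default_rng(1)
base = rng.random((6, 6))
toy_uhs = base[..., None] * np.array([1.0, 2.0, 0.5])  # bands proportional to base
toy_fused = pca_substitution(toy_uhs, base)
```

In the toy case all bands are scaled copies of one image, so the substituted component carries the same information as the original and the cube is returned unchanged. With real data, whatever the multispectral band contains is injected into every band in proportion to its loading on the first component.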
These methods have been compared qualitatively and quantitatively. Qualitative analysis has been done by visual inspection of the different fused images. For quantitative analysis, classification accuracy on two major classes has been observed. Another metric, Kendall Tau rank correlation, has been used to measure the spectral signature preservation of the different algorithms.

Qualitative Analysis
In order to see the impact of various algorithms on visual quality of the images, here we show a part of the image.
From visual inspection, we can see that HySSVD has slightly improved the spatial detail in the image and has slightly sharpened the patch boundaries. However, the fused image from PCA has much better spatial detail than the fused image from HySSVD. So, for visual analysis of RGB composites, the PCA based fused image is the more suitable of the two.

Quantitative Analysis
As we have seen in the previous section, HySSVD improves visual quality but not as well as PCA. Now we will analyze the performance of HySSVD quantitatively. Any method that aims to do image enhancement should not only improve the visual quality but also preserve the spectral characteristics, so that the fused image can be used for image processing tasks such as classification. Here, we aim to demonstrate that classification results after fusion are better than without fusion, and to compare the different algorithms based on their classification accuracy. In order to make the evaluation setup less complex, we consider only the binary classification case. We use a binary decision tree to classify the given image into two classes, namely Wheat and Soybean. 500 pixels for each class were randomly selected for learning the model. Table 3 shows the overall accuracy and class based accuracy for the different types of images on mixed pixels only, as we want to evaluate the algorithms for their ability to improve the spatial detail of mixed pixels. As explained before, mixed pixels contain a spectral combination from multiple land cover types. Since OHS is at 120 meters, each OHS pixel contains 16 THS pixels. So, if it is assumed that pixels at 30 meters are pure, then we can define mixed pixels for OHS. Specifically, if an OHS pixel contains THS pixels belonging to more than one land cover type, then that pixel is considered a mixed pixel. Since we are evaluating the results at THS resolution, we considered only those THS pixels which are part of mixed pixels at OHS resolution. There are 10223 pixels of the Soyabean class which are part of a mixed pixel. Similarly, there are 4559 Wheat pixels that are part of a mixed pixel. Accuracy values are averages over 1000 random runs.
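The mixed-pixel definition above can be sketched directly on a 30m label map. The function name `mixed_pixel_mask`, the integer class codes and the toy label map are hypothetical, and the sketch assumes the map dimensions are multiples of the resolution ratio.

```python
import numpy as np

def mixed_pixel_mask(labels, factor=4):
    """Flag THS-resolution pixels that fall inside a mixed OHS pixel.

    labels : (rows, cols) integer class map at THS resolution; rows and cols
             are assumed to be multiples of factor.
    """
    rows, cols = labels.shape
    blocks = labels.reshape(rows // factor, factor, cols // factor, factor)
    # An OHS pixel is mixed when its block holds more than one class label.
    mixed_block = blocks.max(axis=(1, 3)) != blocks.min(axis=(1, 3))
    # Expand the block-level flag back to THS resolution.
    return np.repeat(np.repeat(mixed_block, factor, axis=0), factor, axis=1)

labels = np.zeros((8, 8), dtype=int)  # toy map: class 0 everywhere
labels[0, 0] = 1                      # one pixel of class 1 in the first block
mask = mixed_pixel_mask(labels, factor=4)
```

Only the 16 THS pixels of the first 4x4 block are flagged here, since that is the only OHS pixel covering two classes. Per-class counts such as those quoted above would then follow from counting labels under the mask.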
Firstly, we can see from Table 3 that THS has the highest class based and overall accuracy. This matches our expectation because THS is itself the high spatial and spectral resolution data that the different methods are trying to estimate. HySSVD shows improvement in classification accuracy over OHS. This demonstrates that HySSVD has been able to improve the quality of the input image for image processing tasks such as classification.
When compared with the baseline algorithm, PCA appears to do better than HySSVD.
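We have observed that fusion methods have led to improvement in the discriminative ability of the data. It is important to note that classification performance depends on the classification task at hand. If the land cover classes are very easy to separate, then small distortions in spectral signatures will not impact classification performance. But for various applications that use hyperspectral data, subtle features of the spectral signature matter, and it is important to preserve these features in the fused product. It is therefore important that the different algorithms maintain the relative ordering of values in the spectral signatures. Rank correlation measures the consistency between two given orderings. In our case, we want to measure whether spectral signatures from HySSVD or from PCA are more consistent with the true spectral signatures. Here we have used the Kendall Tau rank correlation measure. Kendall Tau correlation is defined as
$$KT(B, I) = \frac{\sum_{i<j} \operatorname{sgn}(B_i - B_j)\, \operatorname{sgn}(I_i - I_j)}{n(n-1)/2}$$
where B is the reference series of length n and I is the input series of length n. In words, KT measures the difference between the number of concordant pairs and discordant pairs, normalized by the total number of pairs. A pair of indices in the series is considered concordant if both the reference and input series increase or decrease together from one index to the other. Otherwise, the pair is considered discordant.
Table 4 shows class wise and overall rank correlation values on mixed pixels for the different algorithms. From the table, we can see that HySSVD has done a good job of preserving the relative ordering of values in the spectrum. For class Soyabean, HySSVD has not shown any improvement, but it has notably improved consistency for pixels of class Wheat. When compared with the baseline algorithm, PCA performs more poorly than HySSVD. In order to understand these results in more detail, we plot the scatter plot of Kendall Tau values for different pairs of images, as shown in figures 8 and 9. In order to reduce the number of plots, we have compared HySSVD with only PCA. Figure 9(a) compares HySSVD and PCA. As we can see, the distribution is mostly around the red line. This shows that both algorithms are performing similarly. But it is also important to see how much improvement has been made over the original hyperspectral data (LR). Both algorithms have comparable performance with LR and hence there is not much gain for this class. Also, note that even though overall the points are distributed around the line, PCA has more variance than HySSVD. This shows that PCA deviates more from the original signature and has more propensity for spectral distortion than HySSVD. For class Wheat, as shown in figure 9, HySSVD performs better than PCA. At the same time, HySSVD also shows slight improvement over LR. On the other hand, PCA is worse than the original data itself. This tells us that PCA is more aggressive while doing enhancement and deteriorates the relative spectral properties. HySSVD performs very similarly to LR, which means that it is more conservative and does not alter the LR spectrum very drastically, and hence tends to avoid spectral distortion.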
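The rank correlation described above can be computed with a short pairwise sketch. The function name `kendall_tau` and the toy spectra are hypothetical. This version treats tied pairs as contributing zero, which is an assumption, since the text does not discuss ties.

```python
import numpy as np

def kendall_tau(reference, candidate):
    """(concordant - discordant) pairs divided by the total number of pairs."""
    b = np.asarray(reference, dtype=float)
    c = np.asarray(candidate, dtype=float)
    n = b.size
    # All index pairs (p, q) with p < q.
    p, q = np.triu_indices(n, k=1)
    # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie.
    agreement = np.sign(b[p] - b[q]) * np.sign(c[p] - c[q])
    return agreement.sum() / (n * (n - 1) / 2)

true_sig = [0.10, 0.20, 0.30, 0.40]    # toy reference signature
fused_sig = [0.12, 0.35, 0.28, 0.41]   # toy fused signature with one swapped pair
tau = kendall_tau(true_sig, fused_sig)
```

Of the six band pairs in this toy example, five keep their ordering and one is reversed, so the score is (5 - 1) / 6. Applied per pixel with THS as the reference, this gives one value per pixel, which is what the scatter plots compare.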
A similar analysis is required on pure pixels to ensure that the algorithms are not introducing unwanted characteristics into signatures that do not need enhancement. Table 5 shows the performance on pure pixels.
Again, we can see that HySSVD has maintained the spectral characteristics of pure pixels more effectively than PCA. Figure 10 shows an anecdotal example of spectral distortion in pure and mixed pixels. Figure 10(a) shows the spectral signature of the first 25 bands of a pure pixel. We can see that the amount of distortion by HySSVD is minimal, whereas PCA has introduced slight distortion in the signature. Figure 10(b) shows the spectral signature of the first 25 bands of a mixed pixel. We can see that relatively large distortion of the signature has happened in the PCA based image. Even though the signature from HySSVD also differs from the required THS signature, it has less distortion.

CONCLUSION
The paper presents a novel way of fusing hyperspectral data and multispectral data to obtain an image with good characteristics of both. Each stage of the algorithm has many choices of sub techniques. For each stage a very fundamental technique has been chosen as a proof of concept, in order to make the idea less complex and easy to understand. Like most other algorithms, the performance of the algorithm might decrease at larger resolution differences due to upscaling of the data in the initial stage. Since the algorithm enhances only those hyperspectral bands which lie in the wide-band SRF ranges of the multispectral bands, some of the bands are left unsharpened. This is the tradeoff that has been made to maintain the spectral integrity of the fused output. At the application level, this drawback is not expected to be consequential, as the algorithm gives subsets of enhanced data in each region of the electromagnetic spectrum according to the SRFs of the multispectral bands. This will allow correct classification and material identification capabilities. Comparison with a baseline algorithm shows that HySSVD has some potential for improving spatial quality while preserving spectral properties. Through experiments we showed that HySSVD has the capability to improve performance in a classification task. By careful analysis of the characteristics of the spectral signatures we demonstrated that the baseline algorithm does distort spectral properties. The performance of HySSVD is limited by the very conservative choice of vector decomposition, which does not alter the original low resolution signal drastically, but this also prevents HySSVD from introducing spectral distortion.

Figure 2: Flowchart of the Algorithm

Figure 3: Spectral response function of band 9 of ALI

Figure 4: Performance of the Stage 2 spatial detail transfer

Figure 6: Magnified view of the selected part of the Study Area

Figure 7: Spectral Signatures of class Soyabean and Wheat

Method        Soyabean   Wheat   Overall
FHS-PCA       0.90       0.68    0.83
FHS-HySSVD    0.90       0.78    0.86
OHS           0.90       0.76    0.86

Figure 9: Scatter plot of Kendall Tau values for class Wheat

Figure 10: Example Spectral signatures showing spectral distortion

Table 1: Set of hyperspectral bands corresponding to each multispectral band

Table 2: Pixel Count of major classes

Table 3: Classification Performance on mixed pixels for different image types

Table 4: Class wise Rank Correlation values for different methods on mixed pixels

Table 5: Class wise Rank Correlation values for different methods on pure pixels