DETECTION OF DISEASE SYMPTOMS ON HYPERSPECTRAL 3D PLANT MODELS

ABSTRACT: We analyze the benefit of combining hyperspectral image information with 3D geometry information for the detection of Cercospora leaf spot disease symptoms on sugar beet plants. Besides commonly used One-Class Support Vector Machines, we utilize an unsupervised sparse representation-based approach with a group sparsity prior. Geometry information is incorporated by representing each sample of interest with an inclination-sorted dictionary, which can be seen as a 1D topographic dictionary. We compare this approach with a sparse representation-based approach without geometry information and with One-Class Support Vector Machines. One-Class Support Vector Machines are applied to hyperspectral data without geometry information as well as to hyperspectral images with additional pixelwise inclination information. Our results show a gain in accuracy when using geometry information in addition to spectral information, regardless of the approach used. However, the two methods place different demands on the data when applied to new test data sets. One-Class Support Vector Machines require full inclination information for test and training data, whereas the topographic dictionary approach only needs spectral information for the reconstruction of test data once the dictionary has been built from spectra with inclination.


INTRODUCTION
Hyperspectral images are an important tool for assessing the vitality and stress response of plants (Fiorani et al., 2012; Mahlein et al., 2012; Behmann et al., 2014). Recently, sensor technology for hyperspectral plant phenotyping has significantly improved in resolution, accuracy, and measurement time and has been integrated into commercial phenotyping platforms.
The identification of disease symptoms using hyperspectral images is an established approach. Due to the unknown statistical distributions of hyperspectral data and disease symptoms, methods from the machine learning domain are used frequently. Applications cover direct classification of spectra (Moshou et al., 2004), combined analysis of multiple vegetation indices (Behmann et al., 2014) and derivation of new, disease-specific indices (Mahlein et al., 2013). Supervised approaches like neural networks (Wu et al., 2008), Support Vector Machines (Rumpf et al., 2010) and LDA (Suzuki et al., 2008) as well as unsupervised approaches like Self-Organizing Maps (SOM; Moshou et al., 2002) are used. Since label information for disease symptoms is hard to obtain and often erroneous, one-class classifiers (e.g., Schölkopf et al., 2001; Tax and Duin, 2004) and unsupervised approaches (e.g., Wahabzada et al., 2015) are promising.
Simultaneously with the improvement of hyperspectral sensors, sensor technology for the assessment of 3D geometry is improving considerably. A common application for analyzing 3D point clouds of plants is the segmentation of a plant into different organs such as leaves, stems and fruits such as berries (Paulus et al., 2013b; Roscher et al., 2014).
Combining both data types, hyperspectral images and 3D point clouds, into a hyperspectral 3D plant model is the most recent step. Hyperspectral 3D plant models can be generated in multiple ways. Liang et al. (2013) generated hyperspectral 3D models by observing a plant from multiple viewpoints with a full frame hyperspectral camera. These perspective images are combined into a 3D model using detectors for homologous points and the structure from motion principle. A similar approach was applied to crop surfaces by Bareth et al. (2015) using an unmanned aerial vehicle and a full frame hyperspectral camera that captures all bands simultaneously. The resulting crop surface models allow plotwise height information to be extracted and integrated into the spectral analysis. The combination of separately sensed spectral and spatial information was applied to solid objects in the context of compressed sensing by Kim et al. (2011). They combine a 3D triangulation sensor with a multi-spectral camera and a rotating table to generate spectral 3D models of solid objects.
Sparse representation-based classifiers have recently been introduced in the context of hyperspectral image analysis, showing, e.g., state-of-the-art classification performance. A sparse representation-based approach assumes that each pixel can be reconstructed by a sparsely weighted linear combination of a few basis vectors taken from a so-called dictionary. The weights of the representation can be used as a learned new data representation and be fed into a classifier, which can be seen as the first level of deep learning where a hierarchical representation is learned. The dictionary is constructed from a set of representative samples, for instance the training data, and is either directly embodied by these samples (e.g., Soltani-Farani et al., 2013; Chen et al., 2011) or learned from them (e.g., Yang et al., 2014; Charles et al., 2011).
More sophisticated approaches use structured sparsity in order to integrate prior knowledge such as homogeneity assumptions into the solution (Bach et al., 2012). In this way, actual structure in the data can be modeled rather than unimportant effects from specific samples leading to overfitted solutions.
Sparse representation has also been used for outlier/anomaly detection, e.g. by Adler et al. (2013). In that work an extra error term is introduced into the optimization function to account for all anomalies which cannot be explained well by a weighted linear combination of dictionary elements. Thus, this approach is based on the same assumption as one-class classifiers. We use a similar strategy based on this assumption in our paper by using a topographic dictionary in order to combine spectral information with prior information about the geometry of the plant. Topographic dictionaries are dictionaries in which neighboring dictionary elements show similar weights to input signals (Kavukcuoglu et al., 2009; Mairal et al., 2011). Their usage can promote rotation- and translation-invariant features, which is especially useful for robust object recognition. Generally, the dictionary is learned from data to be topographic and develops a typical structure. In our work, the learning step is omitted, since we exploit the inclination information to construct a sorted dictionary, which can be seen as a 1D topographic dictionary. Fig. 1 shows some of the used dictionary elements with their respective inclination derived from 3D information of the plant. As can be seen, the signals show a typical behaviour depending on the inclination.
The present paper is, to our knowledge, the first study that combines plant geometry and spectral information for detecting disease symptoms in the close range. We apply our framework to sugar beet plants which are partially infested by Cercospora leaf spot disease. We employ prior knowledge about geometry and spectral characteristics to build a topographic dictionary, which is used within a sparse representation framework with group sparsity. Furthermore, we apply feature stacking within One-Class Support Vector Machines (OCSVM; Tax and Duin, 2004) to integrate the geometry and show its positive effect. This allows a comparison of these different integration approaches and analysis methods.
This paper is structured as follows: Sec. 2 describes the plant material, the sensors and the combination of their geometry and spectral information. Sec. 3.1 introduces sparse representation and its usage in our framework. OCSVM are introduced and the application of these methods for disease detection is outlined in Sec. 3.1.2. In our experiments (Sec. 4) we analyse different aspects of the detection of Cercospora disease symptoms and compare sparse representation with topographic and standard dictionary to OCSVM.

Biological material
The applications for hyperspectral 3D plant models are demonstrated by a preliminary study with sugar beet plants partially infected by the plant pathogen Cercospora beticola. The dicotyledon sugar beet is the main sugar producing crop in the European Union and temperate climates. It has characteristic broad leaves with a heterogeneous topography, formed by leaf veins and the intercostal tissue. The single leaves emerge rosette-like with stalks from the center of the tap root, which is a thickened hypocotyl. During vegetation periods sugar beet plants are exposed to different kinds of biotic and abiotic stress. Thus the identification of resistant genotypes is a relevant task in plant phenotyping. For the experiments, plants of cv. Pauletta (KWS, Einbeck, Germany) were cultivated for 8 weeks in a controlled environment in a greenhouse. To demonstrate the ability of hyperspectral 3D plant models for a detailed and improved disease detection, plants were inoculated with Cercospora beticola, the causal agent of Cercospora leaf spot. Three plants, one healthy and two infected, were observed by the sensor systems and hyperspectral 3D plant models were generated based on these measurements.

Sensors
Hyperspectral cameras record the reflected radiation at narrow wavelength bands with a high spatial resolution in a defined field of view. The hyperspectral pushbroom sensor unit used in this study was the VISNIR camera ImSpector V10E (Specim, Oulu, Finland) with 1600 pixels, observing a spectral signature from 400 to 1000 nm in nadir position. Its viewing plane is moved linearly across the plant. The measured images are radiometrically normalized by subtracting the dark frame and by calculating the ratio to a white reference panel. The assessment of plant shapes requires 3D imaging techniques that handle the non-regular surface and the non-solid characteristics of the plant architecture. In this study, a Perceptron laser triangulation scanner (Perceptron Scan Works V5, Perceptron Inc., Plymouth MI, USA) is used. Coupled with a measuring arm (Romer Infinite 2.0 in the 2.8 m version), it provides an occlusion-free option for close-up imaging of plants with a point reproducibility better than 0.1 mm. It was chosen due to its high resolution and accuracy and has been successfully applied for 3D imaging of various plants (Wagner et al., 2011; Paulus et al., 2013a).
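The radiometric normalization mentioned above can be sketched in a few lines. This is only an illustration: it assumes that the dark frame is also subtracted from the white reference (a common convention that the text does not state explicitly), and the function name and the sample counts are hypothetical.

```python
def normalize_reflectance(raw, dark, white, eps=1e-12):
    """Return relative reflectance per band as (raw - dark) / (white - dark).

    raw, dark and white are sequences of sensor counts for one pixel.
    Subtracting the dark frame from the white reference is an assumption.
    """
    reflectance = []
    for r, d, w in zip(raw, dark, white):
        denom = max(w - d, eps)  # guard against division by zero
        reflectance.append((r - d) / denom)
    return reflectance


# Hypothetical counts for three bands of one pixel.
spectrum = normalize_reflectance(raw=[110.0, 510.0, 910.0],
                                 dark=[10.0, 10.0, 10.0],
                                 white=[1010.0, 1010.0, 1010.0])
```

With these made-up counts the three bands come out as relative reflectances between 0 and 1, which is the value range the later processing steps operate on.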

Combination of hyperspectral image and geometry
For the combination of 3D point clouds and image data into hyperspectral 3D plant models, the direction of the 3D ray for each pixel of the hyperspectral image has to be calculated. Based on this information, the corresponding surface point of the plant can be determined. The calculation of the 3D rays is performed by a camera calibration procedure specially designed for hyperspectral pushbroom sensors in close range scenarios like plant phenotyping. The camera calibration method is described in detail in Behmann et al. (2015b). It extends the linear pushbroom model by a non-linear fraction using polynomials. The model parameters are estimated from homologous points on a reference object specifically designed for this purpose. Using the estimated camera model, 3D information can be projected into the image space, resulting in a depth image with the same resolution as the hyperspectral image. Based on this depth image, the pixel-wise inclination can be derived by analyzing the local neighborhood (see Fig. 2). In this study, a hyperspectral image and the local inclination for each pixel of the image are combined into a hyperspectral 3D model.
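One way to derive a pixel-wise inclination from a depth image is sketched below. The paper only states that the local neighborhood is analyzed, so the finite-difference gradient, the definition of inclination as the angle between the surface and the image plane, and the names `inclination_from_depth` and `pixel_size` are assumptions made for this example.

```python
import numpy as np


def inclination_from_depth(depth, pixel_size=1.0):
    """Estimate the per-pixel inclination in degrees from a depth image.

    depth      : 2D array of depth values (same unit as pixel_size)
    pixel_size : ground distance between neighboring pixels (assumed uniform)
    """
    # finite differences along rows (axis 0) and columns (axis 1)
    dz_dy, dz_dx = np.gradient(depth, pixel_size)
    slope = np.hypot(dz_dx, dz_dy)          # magnitude of the local gradient
    return np.degrees(np.arctan(slope))     # 0 deg = parallel to image plane


# Toy depth image: a plane rising by one unit per pixel along the columns.
depth = np.tile(np.arange(5, dtype=float), (4, 1))
incl = inclination_from_depth(depth)
```

On real data the depth image is noisy, so a smoothing step or a local plane fit over a larger neighborhood would probably be needed before taking the gradient.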

Methods for Anomaly Detection
As label information is often erroneous and its generation requires great effort, the application of one-class classifiers and unsupervised approaches is favorable for stress detection on plants. We use four different approaches for the detection of disease symptoms, comprising sparse representation with topographic and standard dictionary and OCSVM with and without a stacked inclination feature. For all approaches we only provide negative examples by using an image of a healthy plant without disease symptoms. Treating these healthy samples as normal allows the disease symptoms in the remaining images to be characterized as anomalies. In the following, the two methods are explained in more detail and the approach for the detection of disease symptoms is introduced.

Sparse Representation with Topographic Dictionaries
In terms of basic sparse coding, a (V × 1)-dimensional test sample x can be represented by a weighted linear combination of a few elements taken from a (V × N)-dimensional dictionary D, so that x = Dα + ε with ε being the reconstruction error. The parameter vector comprising the weights is given by α.
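As an illustration of the basic model x = Dα + ε, the following sketch computes a sparse α with plain orthogonal matching pursuit. It is not the group variant used in this paper; the function name `omp`, the toy dictionary and the stopping tolerance are assumptions made for the example.

```python
import numpy as np


def omp(D, x, n_nonzero):
    """Greedy orthogonal matching pursuit for x ~ D @ alpha (n_nonzero >= 1).

    D : (V, N) dictionary with unit-norm columns
    x : (V,) sample to reconstruct
    Returns the sparse weight vector alpha and the residual x - D @ alpha.
    """
    alpha = np.zeros(D.shape[1])
    residual = x.astype(float).copy()
    support = []
    for _ in range(n_nonzero):
        # pick the atom most correlated with the current residual
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0     # do not select an atom twice
        support.append(int(np.argmax(correlations)))
        # least-squares refit of the weights on all selected atoms
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        alpha[:] = 0.0
        alpha[support] = coef
        residual = x - D @ alpha
        if np.linalg.norm(residual) < 1e-10:
            break
    return alpha, residual


# Toy dictionary: three unit vectors plus one mixed atom (unit norm).
D = np.array([[1.0, 0.0, 0.0, 2 ** -0.5],
              [0.0, 1.0, 0.0, 2 ** -0.5],
              [0.0, 0.0, 1.0, 0.0]])
x = np.array([2.0, 0.0, 0.5])
alpha, residual = omp(D, x, n_nonzero=2)
```

In this toy case two atoms reproduce x exactly, so the residual vanishes; for a spectrum that the dictionary cannot explain, the residual stays large, which is the property exploited for anomaly detection.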
Assuming the dictionary elements were constructed using geometry as well as spectral information, the whole dictionary is sorted with respect to inclination. The dictionary is divided into overlapping groups G_i, i = 1, ..., I, where one group comprises the indices of neighboring dictionary elements. The sparsity groups should not be confused with inclination groups, since one sparsity group can contain multiple inclination groups. The optimization function L with group sparsity is given by (1). The weights α are smoothed with Gaussian filter weights w, where the width of the kernel is chosen to be around 1/3 of the number of group elements. Since the groups are overlapping, the weights α will vary smoothly over neighboring groups. We use group orthogonal matching pursuit as an approximation to solve the minimization in (1) using the approach presented in Szlam et al. (2012). The maximum number of dictionary elements is restricted to W.

One-class Support Vector Machines
As a second approach we use OCSVM (Tax and Duin, 2004), an established anomaly detector. The OCSVM classifier used here derives a spherical decision boundary separating a given sample set from the remaining feature space. As this decision boundary represents the sample set and provides a specific type of distribution model, the method is called Support Vector Data Description (SVDD). Compared to density estimation methods, OCSVM deal well with the sparsity of high-dimensional data, which generally leads to the curse of dimensionality.

Detection of Diseases
Both presented approaches yield different outcomes which can be utilized for the detection of disease symptoms (Tab. 1). For OCSVM, the distance to the hyperplane is used as the single output to be analyzed. For sparse representation, the following outcomes qualify for analysis:
• Reconstruction error: We expect the reconstruction error for pixels with disease symptoms to be higher than for pixels in other regions, since these pixels cannot be reconstructed properly by the dictionary elements.
• Sum of the weights α: We expect this value to be different for pixels with disease symptoms than for other pixels, whose weights should sum to approximately 1.
• Spatial analysis of dominant dictionary elements (i.e., the dictionary element with the largest weight): We expect anomalies with similar spectral characteristics to be reconstructed by similar dictionary elements, since these pixels mostly have the same nearest neighbor in feature space. As a postprocessing step for this outcome, we utilize morphological operators to improve the spatial analysis of dominant dictionary elements, e.g. by removing regions with the same dominant dictionary element that are too small.
The outcomes can also be combined, e.g. by multiplication.
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume III-7, 2016. XXIII ISPRS Congress, 12-19 July 2016, Prague, Czech Republic.
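The three sparse representation outcomes listed above can be computed per pixel as in the following sketch. The Euclidean norm for the reconstruction error, the binary indicator used in the combined score and the names `sr_outcomes` and `disease_elements` are assumptions for illustration; the paper does not specify these details.

```python
import numpy as np


def sr_outcomes(D, x, alpha):
    """Return (reconstruction error, sum of weights, dominant element index)."""
    residual = x - D @ alpha
    re = float(np.linalg.norm(residual))   # reconstruction error (Euclidean norm assumed)
    sum_w = float(np.sum(alpha))           # sum of the weights alpha
    dominant = int(np.argmax(alpha))       # element with the largest weight
    return re, sum_w, dominant


# Toy example with a trivial dictionary and a given weight vector.
D = np.eye(3)
x = np.array([0.2, 0.9, 0.1])
alpha = np.array([0.0, 1.0, 0.0])
re, sum_w, dominant = sr_outcomes(D, x, alpha)

# Combination by multiplication: the indicator set is hypothetical and would
# have to be determined from the spatial analysis of dominant elements.
disease_elements = {1}
combined = (1.0 if dominant in disease_elements else 0.0) * re
```

Multiplying by an indicator keeps the reconstruction error only where the dominant element points to a suspected symptom, which is one plausible reading of a combined outcome.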
Abbreviation   Description
DVSVM          Decision value of OCSVM
RESR           Reconstruction error of sparse representation
SumWSR         Sum of weights α of sparse representation
DESR           Dominant dictionary element used for sparse representation
DE+RESR        Dominant dictionary element combined with RESR
Table 1: Abbreviations for the outcomes of the used disease detection methods

The different outcomes are used to detect the center points of disease symptoms with the following workflow: First, the background is automatically removed, as 3D information is not available there. Furthermore, we remove the leaf borders from the evaluation by eroding the binary image of available inclination by 10 px. Finally, we perform a peak detection with non-maxima suppression for each pixel in the image (cf. Section 4.1.3). Since disease symptoms of the plant can lie close to each other, the threshold for the non-maxima suppression has to be chosen with regard to the image resolution and prior information about the disease.
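The final step of this workflow, peak detection with non-maxima suppression, can be sketched as a greedy procedure. The square suppression window, the score threshold and the names `detect_peaks`, `radius` and `min_score` are assumptions; the validity mask is taken as given (background removed and leaf borders eroded beforehand).

```python
import numpy as np


def detect_peaks(score, valid, radius, min_score):
    """Return (row, col, score) of local maxima, strongest first.

    score     : 2D array of an outcome such as the reconstruction error
    valid     : boolean mask of pixels to evaluate
    radius    : half-width of the square suppression window in pixels
    min_score : peaks below this value are ignored
    """
    s = np.where(valid, score, -np.inf).astype(float)
    peaks = []
    while True:
        r, c = np.unravel_index(np.argmax(s), s.shape)
        if not np.isfinite(s[r, c]) or s[r, c] < min_score:
            break
        peaks.append((int(r), int(c), float(s[r, c])))
        # suppress everything within the window around the accepted peak
        r0, r1 = max(r - radius, 0), r + radius + 1
        c0, c1 = max(c - radius, 0), c + radius + 1
        s[r0:r1, c0:c1] = -np.inf
    return peaks


# Toy score image with two separated peaks and one close neighbor.
score = np.zeros((5, 5))
score[1, 1], score[1, 2], score[4, 4] = 0.9, 0.8, 0.7
valid = np.ones_like(score, dtype=bool)
peaks = detect_peaks(score, valid, radius=1, min_score=0.5)
```

The neighbor at (1, 2) is suppressed by the stronger peak at (1, 1), which shows why the radius has to be chosen with care when symptoms lie close to each other.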

General Setup
In our experiments we analyze two images (see Fig. 2) of plants with given data as described in Section 2. One healthy plant with inclination information is used to build the dictionary. For the construction of the dictionary we randomly select 10 samples per inclination group, where one group is defined by all samples with the same inclination after rounding. Samples that are too dissimilar to the average of their inclination group, measured in terms of the standard deviation, are discarded. Generally, these are samples from leaf veins, specular reflections and other outliers. The detection of disease symptoms is performed using the criteria mentioned in Sec. 3.2, where several outcomes are combined by multiplication. Since using the dictionary elements as training data for OCSVM turned out to result in low accuracies, we randomly choose 20 samples per inclination group. The training set needs to include samples from leaf veins and specular reflections to ensure high accuracies. Before applying OCSVM, the data set is Z-normalised, i.e. each feature is normalized to have zero mean and a standard deviation of one to equalize the feature weights. We compare OCSVM with a sparse representation approach with topographic and standard dictionary, i.e. no group sparsity.
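The construction of the inclination-sorted dictionary can be sketched as follows. The outlier rule (distance to the group average larger than the mean distance plus two standard deviations) is an assumption, since the text does not give the exact criterion; the function and parameter names are hypothetical as well.

```python
import random
from collections import defaultdict
from statistics import mean


def build_sorted_dictionary(spectra, inclinations, per_group=10,
                            max_dev=2.0, seed=0):
    """Return (atoms, atom_inclinations), sorted by rounded inclination."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for spectrum, incl in zip(spectra, inclinations):
        groups[round(incl)].append(spectrum)   # one group per rounded inclination

    atoms, atom_incl = [], []
    for incl in sorted(groups):
        members = groups[incl]
        n_bands = len(members[0])
        avg = [mean(s[b] for s in members) for b in range(n_bands)]
        # Euclidean distance of every sample to the group average
        dist = [sum((s[b] - avg[b]) ** 2 for b in range(n_bands)) ** 0.5
                for s in members]
        mu = mean(dist)
        sd = (sum((d - mu) ** 2 for d in dist) / len(dist)) ** 0.5
        # discard samples that are too dissimilar (assumed threshold)
        kept = [s for s, d in zip(members, dist) if d <= mu + max_dev * sd]
        chosen = rng.sample(kept, min(per_group, len(kept)))
        atoms.extend(chosen)
        atom_incl.extend([incl] * len(chosen))
    return atoms, atom_incl


# Toy data: three inclination groups with 12 two-band spectra each.
spectra, inclinations = [], []
for incl in (30.2, 10.4, 20.1):
    for i in range(12):
        spectra.append([0.1 + 0.001 * i, 0.5 - 0.01 * incl])
        inclinations.append(incl)
atoms, atom_incl = build_sorted_dictionary(spectra, inclinations)
```

Because the groups are visited in ascending order of inclination, neighboring atoms in the resulting list have similar inclinations, which is what makes the dictionary usable as a 1D topographic dictionary without a learning step.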

Parameter Settings of Used Methods
For optimization of (1), we use our own implementation of group orthogonal matching pursuit and restrict the number of active groups to W = 3. We choose 50 inclination groups and a large average sparsity group size of |G| = 16 with an overlap of 14. We applied OCSVM in two different setups, once without using inclination and once with pixel-wise inclination as an additional, weighted stacked feature. The idea behind this feature stacking approach is to define an anomaly in the geometric context when compared to other spectra with similar inclination. The crucial factor in applying OCSVM is the specification of optimal values for the hyperparameters ν (cost on number of support vectors) and γ (kernel width). In the absence of labeled training data of two classes, we specify an expected outlier rate of 1%, which leads to reasonable results. Using the SVDD implementation in LIBSVM 3.18 (Chang and Lin, 2011), we optimized the two parameters using cross validation with a grid optimization, leading to the parameter values C = 1 and γ = 1.2 · 10^−4 for the spectral data set and C = 0.46 and γ = 6.1 · 10^−5 for the data set that also utilizes inclination. The feature weight w = 0.3 is used for the inclination, which leads to visually optimal results.
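Two of these settings lend themselves to a short sketch: the overlapping sparsity groups (size 16, overlap 14) and the weighted stacking of the inclination feature (w = 0.3). The total of 500 atoms follows from 50 inclination groups with 10 samples each if no group is short of samples; the function names and the toy feature values are hypothetical.

```python
def overlapping_groups(n_atoms, size=16, overlap=14):
    """Index lists of neighboring atoms; consecutive groups share `overlap` atoms."""
    stride = size - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than size")
    starts = range(0, max(n_atoms - size, 0) + 1, stride)
    return [list(range(s, min(s + size, n_atoms))) for s in starts]


def stack_inclination(spectrum, inclination, weight=0.3):
    """Append the weighted inclination to a spectrum.

    Both inputs are assumed to be Z-normalised already.
    """
    return list(spectrum) + [weight * inclination]


groups = overlapping_groups(500)                  # 50 groups x 10 samples
feature = stack_inclination([0.1, -0.2], 1.5)     # toy normalised values
```

With a stride of two atoms, each atom belongs to several groups, so neighboring groups differ only slightly; this is what lets the weights vary smoothly along the inclination axis.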

Evaluation Criteria
Due to the error-prone labeling of the exact area of the symptoms, we decided to exclude this effect from the analysis by relying only on the symptom centers, which are labeled more robustly. Therefore, the detected symptom centers and the corresponding strengths of the prediction are the analysis output and the basis for the evaluation of the results.
In order to evaluate our proposed framework, we use precision-recall curves and receiver operating characteristics (ROC). For this, the numbers of true positives (tp), false positives (fp) and false negatives (fn) are computed to derive precision, which is defined as tp / (tp + fp), and recall, which is defined as tp / (tp + fn). As evaluation measure we compute the area under curve (AUC). The higher the value, the better the algorithm performs.
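The evaluation measures can be written down directly. In this sketch the trapezoidal rule for the area under a curve and the conventions for empty denominators are assumptions, and the counts and curve points are made up for illustration.

```python
def precision_recall(tp, fp, fn):
    """Precision tp / (tp + fp) and recall tp / (tp + fn) from counts."""
    precision = tp / (tp + fp) if tp + fp else 1.0   # convention: no detections
    recall = tp / (tp + fn) if tp + fn else 0.0      # convention: no symptoms
    return precision, recall


def area_under_curve(xs, ys):
    """Trapezoidal area under a curve given as points (x, y)."""
    pts = sorted(zip(xs, ys))
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Made-up counts: 3 correct detections, 1 false alarm, 1 missed symptom.
precision, recall = precision_recall(tp=3, fp=1, fn=1)

# Made-up precision-recall curve with recall on the x axis.
auc = area_under_curve([0.0, 0.5, 1.0], [1.0, 1.0, 0.5])
```

A full curve is obtained by sweeping a threshold over the prediction strengths of the detected symptom centers and recomputing the counts at each threshold.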

Results and Discussion
In our experiments we could observe that all outcomes presented in Sec. 3.2 could serve as indicators for disease symptoms. Fig. 4 shows the reconstruction error of the sparse representation-based approach with topographic dictionary and the decision value obtained by OCSVM. As expected, the reconstruction error of the sparse representation approach as well as the decision value obtained by OCSVM is higher for pixels with disease symptoms than for healthy pixels. While OCSVM show a high variability within each leaf and only small differences between leaves, the sparse representation approach shows a small variability within a leaf but large differences between leaves. Fig. 5 shows a larger part of each test image for sparse representation with topographic and standard dictionary. Both approaches share the weakness of detecting leaf veins as potential anomalies; however, they are visually more robust to specular reflections than OCSVM.
As illustrated in Fig. 6, we could observe that most of the pixels with disease symptoms are reconstructed by the same dictionary element or a common set of dominant dictionary elements. These dominant elements can be identified by the average roundness factor of specific areas with the same dictionary element index. Leaf veins are also mostly reconstructed by one dominant dictionary element, but in most cases a different one than for pixels with disease symptoms. Thus, using the dictionary element as indicator for diseases, leaf veins and potential areas with disease symptoms can be distinguished from each other. This is advantageous over using the reconstruction error or the sum of the weights, which show a similar behavior for disease symptoms and leaf veins. However, the identification of the dominant element can be challenging, e.g. as soon as single, round disease areas merge into larger areas. We could further observe that the usage of a topographic dictionary results in smoother outcomes, so that the grouping of dominant dictionary elements yields more reliable regions (see Fig. 6). A decrease of the group size results in more dominant dictionary elements being used.
Fig. 7 as well as Tab. 2 show quantitative results. The sparse representation approach reached better results than OCSVM in most cases. Detection of disease symptoms with the reconstruction error alone results in the lowest accuracies in most cases, because false negatives arising from leaf veins or other anomalies cause a loss in accuracy. Although the sum of the weights tends to be higher for pixels with disease symptoms, this criterion also yields worse results, especially for topographic dictionaries, and thus is not distinctive enough for the detection of the symptoms. Although the usage of the dominant dictionary elements sometimes achieves the highest accuracy of 100%, this result must be critically examined because this criterion tends to underestimate diseases, i.e., all detected disease symptoms are correct, but only about 3/4 of all disease symptoms were detected. In most cases the usage of a topographic dictionary leads to a gain in accuracy. The reason for this is, as indicated earlier, that the outcomes are smoother when using a topographic dictionary and thus the results are less affected by noise.
OCSVM achieve a competitive detection accuracy of Cercospora symptoms as anomalies in both configurations and on both data sets; however, OCSVM need more training than the sparse representation approach to achieve good results. Like the sparse representation approach, OCSVM without inclination information sometimes fail to separate leaf regions with specular reflections from the disease symptoms. Therefore the precision of the OCSVM without inclination is reduced. A further problem was the erroneous detection of leaf veins as symptoms. As this is not related to a specific inclination, it cannot be compensated by the additional inclination information. As a countermeasure, the training set from the healthy plant should be sampled in a way that sufficiently includes samples of leaf veins.
As can be seen in Tab. 2, the OCSVM experiment shows clearly that the integration of spatial knowledge by feature stacking improves the prediction quality. In all cases the AUC is improved. The reason for this is that an "anomaly" is now defined in a spatial context, meaning that a spectrum is compared to spectra with similar inclination information. For horizontal leaf parts, a strong reflectivity is normal due to the specular reflection, whereas such a high reflectivity for leaf parts with higher inclination would certainly be an "anomaly". In this way OCSVM take the effect of geometry into account, which is able to cover the important processes of interest. Further improvements in prediction quality may be achieved by the construction of more informative features or feature combinations. The inclusion of spatial features that use the spectral characteristics of the neighboring pixels may also improve the result quality.
Figure 5: Detailed illustration of the reconstruction error obtained by sparse representation with topographic and standard dictionary (i.e., no group sparsity). Panels: (a) standard dictionary, plant 1 (detail); (b) topographic dictionary, plant 1 (detail); (c) standard dictionary, plant 2 (detail); (d) topographic dictionary, plant 2 (detail). Blue colors indicate a low value and yellow colors a high value.
Figure 6: Color coded indices of dominant dictionary elements for sparse representation with topographic and standard dictionary (i.e., no group sparsity). Panels: (a) standard dictionary, plant 1 (detail); (b) topographic dictionary, plant 1 (detail); (c) standard dictionary, plant 2 (detail); (d) topographic dictionary, plant 2 (detail).

CONCLUSION
We could show the benefit of combining hyperspectral information and geometry in terms of inclination angles for the detection of disease symptoms on plants. Our experiments confirmed for One-Class Support Vector Machines as well as a sparse representation based approach with group sparsity prior a gain in accuracy when incorporating geometry information in terms of inclination. However, the sparse representation-based approach only needs inclination information for building the dictionary and not for spectral reconstruction of the plant image of interest, whereas One-Class Support Vector Machines also need inclination information for training and classification to achieve a good result.
As became visible in our experiments, the investigated anomaly detection methods have different strengths. OCSVM cope relatively well with leaf veins but show artefacts caused by the specular reflectance of horizontal leaf parts. These are reconstructed better by the sparse representation-based approach, but in contrast this approach has problems in differentiating leaf veins and disease symptoms. That both approaches show such different characteristics underlines that the analysis and interpretation of hyperspectral 3D plant models is still in its infancy. Future analysis methods specifically designed for the interpretation of this specific data type could combine the strengths. Ensemble-based methods or meta classifiers are promising approaches in this context. Future research will also consider the influence of the size of the groups in the sparsity term as well as a more detailed analysis of false positives, which may be correctly detected symptoms that are not yet visible. This effect is not regarded here, but future experiments with time series of hyperspectral 3D plant models will allow such effects to be included in the analysis.

Table 2: Statistics for precision-recall curves and receiver operator characteristics, given for sparse representation (SR) with topographic dictionary (+TD) and without (-TD). This approach is compared to OCSVM with inclination information (+in) and without (-in). Columns: DVSVM (-in, +in), RESR (-TD, +TD), SumWSR (-TD, +TD), DESR (-TD, +TD), DE+RESR (-TD, +TD).
Figure 7: Receiver operator characteristics (ROC) and precision-recall (PR) curves for plant 1 and plant 2. Panels: (a) ROC curve with standard dictionary for plant 1; (b) ROC curve with topographic dictionary for plant 1; (c) ROC curve with standard dictionary for plant 2; (d) ROC curve with topographic dictionary for plant 2; (e) PR curve with standard dictionary for plant 1; (f) PR curve with topographic dictionary for plant 1; (g) PR curve with standard dictionary for plant 2; (h) PR curve with topographic dictionary for plant 2. For comparison, the same curves of the OCSVM were added to the figures with and without topographic dictionary.