RADIOMETRIC CORRECTION OF TERRESTRIAL LIDAR DATA FOR MAPPING OF HARVEST RESIDUES DENSITY

In precision agriculture detailed geoinformation on plant and soil properties plays an important role. Laser scanning already has been used to describe in-field variations of plant growth in 3D and over time and can serve as valuable complementary topographic data set for remote sensing, such as deriving soil properties from hyperspectral sensors. In this study full-waveform laser scanning data acquired with a Riegl VZ-400 instrument is used to classify 3D point clouds into post-harvest straw residues and bare soil. A workflow for point cloud based classification is presented using radiometric and geometric point features. A radiometric correction is performed by using a range-correction function f(r), which is derived from lab experiments with a reference target of known reflectance. Thereafter, the corrected signal amplitude and local height features are explored with respect to the target classes. The following procedure includes feature calculation, decision tree analysis, point cloud classification and finally result validation using detailed classified reference RGB images. The classification tree separates the classes of harvest residues and bare soil with an accuracy of 96% by using geometric and radiometric features. The LiDAR-derived harvest residue coverage value of 75% lies in accordance with the image-based reference (coverage of 68%). The results indicate the high potential of radiometric features for natural surface classification, particularly in combination with geometric features.


INTRODUCTION
Mapping and characterization of the three-dimensional nature of vegetation is increasingly gaining in importance in many fields of applications.LiDAR, also referred as laser scanning (LS), has evolved into a state-of-the-art technology for highly accurate 3D data acquisition.By now several studies indicate a high value for 3D vegetation description, such as in agricultural monitoring of trees (Rosell and Sanz, 2012;Seidel et al., 2011), field crops (Höfle, 2013;Saeys et al., 2009;Lumme et al., 2008) or harvest residues (Lenaerts et al., 2012).Harvest residues play an important role in agricultural management, for instance concerning the calculation of biomass and subsequently the need of fertilization as 'humus compensation' (Fink, 1996) or for renewable energy production (Pimentel, 1981).Knowledge of the quantity and the spatial distribution of harvest residues are important for the determination of surface properties.Hyperspectral (HS) remote sensing is used in agriculture, mapping crop characteristics (Jarmer, 2013;Thenkabail, 2000), the assessment of crop residues (Daughtry, 2004) and the assessment of soil properties (Jarmer et al., 2008;Barnes and Baker, 2000).The captured mixed signal in case of existing harvest residues can be interpreted and considered in further analysis.This paper investigates the mapping and coverage derivation of harvest straw residues and bare soil using ground-based fullwaveform LS including geometric and radiometric data.Once the individual laser points are classified, the position of the harvest residues can be used to evaluate and model biomass distribution or can serve as input for HS remote sensing analyses in agricultural management.Challenges of using LS point clouds directly are the large amount of raw data, the requirement of fast and robust algorithms for radiometric correction and geoinformation extraction.Radiometric information of LS has proven to be a valuable feature for object detection in airborne (Höfle et al., 2012;Briese et al., 2008) and terrestrial LS (TLS) data (Höfle, 2013).However, to exploit the full potential of radiometric correction and information extraction, especially in the precision agriculture domain, further detailed investigations are required.
In this study the potential of radiometric information of LS for 3D point cloud classification and the identification of harvest residues is explored.This paper presents a radiometric correction procedure based on laboratory experiments as well as a novel workflow for point cloud classification, using geometric and radiometric point cloud features.

Study area and TLS data
The study area is part of an agricultural research test site at the Julius Kühn-Institut, Brunswick, Germany.Within the investigated field two plots (1 m x 1 m) were marked.Data was collected on 31 March 2012 from time-of-flight scanner Riegl VZ-400 with full-waveform online echo detection.The laser system has a near-infrared laser beam (1550 nm) with a beam ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume II-5/W2, 2013ISPRS Workshop Laser Scanning 2013, 11 -13 November 2013, Antalya, Turkey divergence of 0.3 mrad and a range accuracy of 5 mm at 100 m according to the manufacturer's data sheet.While the data was captured, the scanner was installed at a fixed position ca.3.5 m above ground.The horizontal distance to the test plots lies between 5.5 m to 9.5 m.To simulate harvest residues, straw was applied irregularly over the plots, from 0 g (bare soil) up to 150 g straw in steps of 10 g (0 g to 100 g) and 25 g (100 g to 150 g).Each time after adding straw, the study area was scanned and imaged with a standard compact digital RGB camera.The study area is covered with an average point density of 14 points per 0.01 m², totaling 1.9 x 10 6 points.
Post-processing includes the removal of points outside the defined study area and points with large echo widths (Riegl's deviation >10; Pfenningbauer and Ulrich, 2010) in order to remove points occurring at edges.For subsequent processing, the point cloud was exported as ASCII file with XYZcoordinates, range [m], recorded signal amplitude [DN].

Reference Data
The validation of the classification is performed with 1) a normalized Digital Surface Model (nDSM) based on the Digital Terrain Model (DTM) of bare soil, and 2) an independent classification done by image analysis of detailed photographs.A validation of the workflow and the features derived by the classification is achieved by transferring the workflow and derived classification rules to a second plot.Comparison between the different data sets of training plot is based on the calculated residue coverage value and by cell-by-cell comparison of the binary classification maps.
The nDSM layer is calculated by subtracting the DTM from the DSM.The DSM represents the surface at 150 g straw and the DTM 0 g of residues of the area of the test plots.The percentage of coverage is determined within cells of 0.005 m x 0.005 m size.In case of classification by image analysis, georeferenced binary images of all levels of straw coverage serve as data basis.For each straw layer, the imagebased straw detection is realized using a simple decision tree (DT) with one rule node.In this context, two threshold values (one for the red-and one for the blue band of the RGB image) are used to generate classification results for all images with the two classes straw and bare soil.Afterwards, the single classification results of different straw levels are accumulated in a final map to derive the image-based reference classification for the 150 g straw coverage situation.This accumulation is necessary since at higher straw levels, earlier added straw is partly covered by shade of newly added straw and hence misclassified as a result of the low spectral resolution of camera data.

METHODS
The workflow consists of a radiometric correction of the recorded signal amplitude, the exploration and calculation of features, followed by the classification and the validation.The main strategy of the classification procedures aims at assessing the straw's specific signal amplitude, differing from bare soil amplitudes (radiometric criteria).Additionally the vertical height range and variation of local neighbouring points are assumed to be higher for harvest residues (geometric criteria).
The developed workflow is summarized in Fig. 1.

Radiometric Correction
The laser scanner gathers the geometric (XYZ coordinates) and radiometric features (e.g.signal amplitude) from all scanned surfaces.The recorded amplitude value is affected by the range dependence of the echo signal detected by the receiver (Höfle, 2013;Pfennigbauer and Ulrich, 2010;Kaasalainen et al., 2009).
The radiometric correction aims at the removal of the range effect.For the used scanner, the 1/R 2 correspondence from the radar equation, which is mostly valid for airborne laser scanning (Wagner, 2010;Höfle and Pfeifer, 2007), is given for distances greater than 20 m.Close to the scanner, the recorded amplitude increases with distance (Höfle, 2013;Pfenningbauer and Ulrich, 2010).To remove this range effect, a range-correction function is determined in a laboratory experiment.
This experiment is accomplished in a long hallway (250 m) and comprises a measurement series of a reference target in stepwise increasing distances to the scanner.The quadratic target (20 x 20 cm) is made of Spectralon® with Lambertian scattering properties.At the scanner's operating wavelength of 1550 nm, the nominal reflectance of the target is 92.5%.The target is mounted on a tripod and its centre is oriented to the scanner at same height as the scanner's optical centre.The measurements are performed with increasing distance between the target and the scanner from 1.5 m up to 100 m in 1 m-steps, thereafter in 5 m-steps up to the maximum distance of 250 m.In the next step, least-square (LSQ) fitting of polynomial functions from degree 1 to 11 are performed to the range-amplitude values.The function with the lowest root-mean-square error (RMSE) is chosen to derive the correction function 1/f(r).Finally, the real world data sets of 0 g and 150 g straw level of the study area were corrected using this correction function.

Feature Calculation
Agricultural surfaces can be parameterized by several features in order to distinguish harvesting residues and soil.Fullwaveform data offers the possibility to describe surfaces by using their geometry and reflectance properties (e. g. amplitude).The spatial neighbourhood can be defined either in 2D (a circle projected on a horizontal plane) or in 3D (spherical) domain.
For feature calculation the two data sets of plots with bare soil (0 g straw level) and maximum amount of straw coverage (150 g straw level) are chosen for analysis in order to achieve the best distinguishable result between the two target classes.Several smaller areas distributed over the plot are selected manually as training areas regarding the absolute height value, low height differences within a specific neighbourhood and are

Exploration
A first exploratory analysis of the corrected signal amplitude and height is performed.This exploration of features comprises histogram analysis and the features' distribution within the plots.The height differences of the surface can be expressed using the standard deviation of the z-values.This geometric feature assists to distinguish between 'smoother' surfaces such as bare soil on the one hand and straw characterized by higher height variations in a defined neighbourhood on the other hand.
The second feature for detecting straw is the signal amplitude, where different backscatter characteristics of the target classes are expressed in certain separability in amplitude values.To distinguish between soil and straw, the histograms of each class in the training data are calculated and compared.The corresponding amplitude value of the intersection point of both distribution curves serves as threshold in the next step (cf.Fig. 3).

Neighbourhood Analysis
The second step includes the calculation of additional point features considering the neighbourhood of a single point.The recorded signal amplitude represents the backscattered signal strength from the illuminated target.If the target is smaller than the footprint, the signal strength is reduced due to the smaller footprint area contributing to the backscatter.To overcome this high variation inherent in single point measurements, aggregated local neighbourhood features are calculated and act as input for decision tree analysis.
Outliers in amplitude are tackled by using the amplitude density, which is the percentage of neighbours in 3D with amplitude values fulfilling a certain threshold (cf.Höfle, 2013).
The features are calculated within a local neighbourhood, using a maximum count of points k within a search radius R and an amplitude threshold A T (derived via exploration).The number of neighbours k was deduced from the average point density of the plot field and set to 50.The search radius was set to 0.02 m with respect to the point density and the size of straw parts.

Classification
The decision tree (DT) for surface classification has been applied widely to remotely sensed data (Pal and Mather, 2003).
The DT approach has a number of advantages over traditional classification methods.It requires no assumptions regarding the distribution of input data and also provides an intuitive classification structure (Hansen et al., 1996).
Seven different input features serve as input for the DT classification (Table 1).The DT method varies the different input features and returns the model with best accuracy.To check the model, a validation split of 0.7 (70% training data and 30% validation data) was chosen.The sampling of the training data was conducted by shuffled sampling.The minimal size for split was set to 4 and the minimal leaf size to 2. As criterion for the tree induction the gain ratio was chosen with minimal gain of 0.1 and confidence of 0.25.The gain ratio is a variant of information gain.It adjusts the information gain for each attribute to allow the breadth and uniformity of the attribute values (Quinlan, 1986) The final straw and soil classes are established by merging all branches of the decision tree belonging to class of soil or straw.

Validation
The accuracy of the classified point cloud is evaluated within four steps.First, the workflow is performed for a second plot (validation plot) and the resulting performance thresholds are compared.The validation plot is located in 2 m distance to the scanner and 2 m distance to the training plot.Second, the derived thresholds from the training plot are applied to classify the validation plot, enabling to assess the transferability of the derived rules.Third, a height difference layer (nDSM) was calculated by subtracting the DTM (bare soil) from the DSM (with max.straw coverage).Heights above the DTM indicate straw that has been added to the bare soil.A threshold of height above DTM for the nDSM is chosen to assign cells to the harvest residue class, whereas cells with no height above the DTM are assumed to be part of the bare soil surface.Finally, a classified map derived by independent classification of compact camera images serves as validation layer.
For cell-by-cell error assessment a binary raster map (soil/residue) is derived by rasterization of the classified point cloud, taking the most frequent class (of laser points) per pixel.The cell size in the rasterization step is set to 0.005 m based on the average point distance of the point cloud.This raster map is input for a detailed cell-by-cell comparison with the corresponding binary raster map generated from the classified RGB image data.
Furthermore, the straw coverage per plot is calculated from the binary raster map.This value indicates how much soil is covered by straw in the spatial unit of the plot (1m 2 ) as seen from bird's eye view, which is a relevant value for remote sensing applications.

Radiometric Calibration
The range correction of signal amplitude shows lowest RMSE for a polynomial of degree nine (Fig. 2).The maximum amplitude is reached at ca. 10 m distance and is decreasing constantly with range thereafter.The error bars and RMSE of 1.5% to all laser point amplitudes indicate the polynomial function's high approximation of the lab experiment's measurements.This derived correction function is used to remove the amplitude-range effect from the data sets, in order to be able to relate the corrected amplitudes with distinct backscatter characteristics of the different target surface classes.

Feature Calculation
The exploration of the features shows a good separability between harvest residues and bare soil.This can be visualised by colouring the points by signal amplitude (Fig. 3).Furthermore, the histogram of training data's signal amplitude and the calculated distribution function indicates certain separability of the two classes.The intersection point at 2963 DN serves as threshold (A T ) for the calculation of the amplitude density feature.The percentage of neighbours in 3D with amplitude values below this threshold is computed for each single laser point.This intersection point can also be seen in the vertical profile's shape of amplitudes of the entire training plot (Fig. 4).

Classification
The most reliable classification results are achieved by a DT which uses the features of mean signal amplitude and standard deviation of height (Fig. 5).A high separability of straw and soil points with an accuracy of 96% (Table 2) is achieved.Within the DT, 46% of the points are classified as straw and 47% as bare soil regarding the thresholds of mean amplitude of 3034 DN and 2994 DN, respectively.The remaining points are classified based on the standard deviation of height.Our result highlights that point features derived in the local neighbourhood (e.g.mean amplitude) increase the separability of straw and soil compared to using single point signal amplitude values (cf.Fig. 6a and 6g).The final classification step assigns the class label "harvest residues" or "bare soil" to the single laser points based on derived DT thresholds.Most assignments are done based on the feature of mean amplitude.This is reflected by the relatively high accuracy of 95% in DT analysis, using only the radiometric features.The usage of height differences as input does not increase the accuracy significantly (only 1.4%).The classification of the transition area between straw and soil is particularly affected by the height feature (Fig. 6

Validation with Reference Data sets
The validation proceeds on four levels to demonstrate the transferability of the workflow.Due to varying spatial resolution and data models of different data sets (i.e.RGB image classification versus classified point cloud), coverage raster maps were computed and compared (Table 3).4).
Comparison of the point cloud classification with the nDSM layer (Fig. 6 (f)) results in an overall accuracy of 70%.The 10% lower coverage value of harvest residues derived from the nDSM layer, which assumes that harvest residues have a certain height above soil, reflects the improvement of classification through additional usage of radiometric features.This is mainly due to residues without a distinct height lying relatively flat at the soil surface. LiDAR

CONCLUSION
The study at hand shows the potential of LiDAR data for harvest residues detection and coverage mapping if using both radiometric information and geometric description.Radiometric correction using a range-correction function derived from laboratory experiment eliminates distance effects.The separability between the two target classes harvest residues and bare soil was revealed.The study shows promising results for the detection of harvest residues of crop entire fields.Future research should concentrate on the effect of different soil conditions like dry, wet or mossy on the signal's backscatter strength (e.g.signal amplitude).Furthermore, the procedure will be applied to entire crop fields with spatially varying harvest residues densities.In this context, the optimum point density of spatial target unit (e.g.1m 2 ) for coverage maps should be investigated further.With respect to radiometric correction, angular dependencies are relevant and have to be analyzed in more detail.

Figure 1 .
Figure 1.Workflow to map harvest residues using fullwaveform point cloud and raster data.

Figure 2 .
Figure 2. Polynomial function (degree 9) of range-amplitude dependency assessed by least square fitting to moving median values of original amplitude of laboratory experiment (cf.Höfle, 2013).

Figure 3 .
Figure 3. Density function of corrected amplitude of harvest residues (red) and bare soil (blue) of training data and the intersection point.

Figure 4 .
Figure 4. Vertical profile of boxplots of the training plot's corrected amplitudes (the 1 m² plot contains a sample of 80,132 points).

Figure 5 .
Figure 5. Derived classification tree of training data.DT input features Accuracy Used features Corrected values of signal amplitude 85.80% corrected amplitude Geometric features 86.41% height difference Radiometric features 94.83% mean amplitude, coefficient of variation of amplitude Radiometric & geometric features 96.24% mean amplitude, standard deviation of height (a) compared to (g)).Most of the points can be separated by using the feature signal amplitude, which can be seen in the use of mean amplitude values as first separation step in DT.The resulting threshold of 3034 DN for mean amplitude is comparable to the intersection function maximum in the exploration part (2964 DN) as well as the threshold of 2946 DN determined by DT with corrected signal amplitude.

Figure 6 :
Figure 6: 1-m transect through classified point cloud: a) top view of classified points, b) point cloud profile, c) rasterized binary map, d) corresponding image reference classification, e) RGB image of maximum residues cover, f) nDSM layer residues cover and g) coloured point cloud based on single corrected amplitude values only.Red: harvest residues, blue: bare soil.

First, the workflow
and thresholds assessed by using training plot data were applied to validation plot data.The calculated density distribution and the density function of the validation plot's corrected amplitude values differ slightly from the training plot.The intersection of straw and soil amplitude density curves increases about 1.26% from 2963 DN to 3001 DN compared to the training plot.This variation could be explained by a higher number of points and the lower incidence angles due to the shorter distance between scanner and plot.Comparing the revealed class thresholds by DT analysis shows a similar result in selection of classification features of mean amplitude and height.Most of the points are classified by mean amplitude, in case of straw 44% and in case of bare soil 46%.The height threshold varies and is in accordance to the differing distribution of straw at the validation and the training plot field.The DT of the training plot uses the standard deviation of height (straw > 0.003) for classifying, while DT of the testing plot uses the standard deviation of height (straw > 0.014).The DT analysis of the validation plot achieves an accuracy of 95%.This is reaffirmed with the final calculation of the coverage within the validation plot.This is done by classifying by means of a) rule-base derived from training data of first plot and by b) newly computed rules derived by DT analysis.The 10% lower coverage calculated by the rule-base derived from training plot shows the limited transferability, regarding varying distribution patterns or angular dependency on scanning ( Due to varying point distribution and lack of coverage in certain areas of the TLS data (compare Fig. 6 (a), (c) and (g)), comparison of rasterized coverage considers only cells that have values in LiDAR-derived case.Around 76% of the training plot area is covered with harvest residues as compared to 68% calculated by reference RGB image analysis.All data sets show a coverage proportion of around 2/3 harvest residues to bare soil.A cell-by-cell error assessment of LiDAR-derived classes and classes derived by image analysis (Fig. 6 (d)) yields an overall accuracy of around 82% and a precision (positive predictive value) of 95% of classified straw (Table ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume II-5/W2, 2013 ISPRS Workshop Laser Scanning 2013, 11 -13 November 2013, Antalya, Turkey further refined by visual comparison with the RGB images of both data sets.Remaining outliers were manually removed in order to reduce classification errors.The training area selection was performed without considering the amplitude value.

Table 1 .
. The resulting thresholds of the tree with highest accuracy and precision are used for feature selection and the subsequent classification process.Precision stands for the deviation of data values within the model classes and accuracy stands for the closeness of model data to reference data.The derived rule-base is data set specific and is thus not directly transferable to other point clouds.Seven different features used for DT classification.

Table 2
. Accuracy of DT classification of training data with varying input features and the resulting thresholds.The input features are listed in detail in Table1

Table 3 .
Resulting overall coverage values of harvest residues and bare soil for the different data sets (1 m² plots).