INDOOR ULTRA-WIDE BAND NETWORK ADJUSTMENT USING MAXIMUM LIKELIHOOD ESTIMATION

This study is part of our ongoing research at The Ohio State University on using ultra-wide band (UWB) technology for navigation. Our tests have indicated that UWB two-way time-of-flight ranges measured indoors follow a Gaussian mixture distribution, which may be caused by the incompleteness of the functional model. In this case, when the UWB network is adjusted from the observed ranges, maximum likelihood estimation (MLE) may provide a better solution for the node coordinates than the widely used least squares approach. The prerequisite of the maximum likelihood method is knowledge of the probability density functions. The 30 Hz sampling rate of the UWB sensors makes it possible to estimate these functions for each pair of nodes from samples collected in static positioning mode. To test the MLE hypothesis, a UWB network was established in a multi-path dense environment for test data acquisition. The least squares and maximum likelihood coordinate solutions are determined and compared, and the results indicate that better accuracy can be achieved with maximum likelihood estimation.


INTRODUCTION
The most commonly used and widely accepted approach for determining the coordinates of a surveying network is least squares estimation (LSE) or weighted least squares estimation, which is not robust enough when bad measurements (outliers) are present in the data. For this reason, more robust estimation methods are recommended in situations where outliers are expected. Maximum likelihood estimation (MLE) is a robust estimation method that requires the probability density function (PDF) to be known. In most cases, the PDFs are derived from prior assumptions, expert knowledge, etc. Ultimately, the PDF can be determined from samples if they are available. Ultra-wide band (UWB) ranging systems are capable of producing a sufficient number of measurements to estimate these PDFs owing to their relatively high sampling frequency, which can exceed 30 Hz, provided the dynamics of the application allow for multiple observations. This study is part of our continuing UWB research at the SPIN Lab at The Ohio State University. Earlier, the calibration possibilities and the accuracy of dynamic trajectory points derived from UWB ranges were investigated (Koppanyi et al., 2014). It was found that cm-level accuracy can be achieved outdoors, while indoors, under line-of-sight conditions, the accuracy was above half a meter. These tests also showed that the ranges do not follow a Gaussian distribution but rather a mixture of Gaussian distributions. These findings motivated the use of the MLE approach instead of LSE.
The MLE approach for wireless geolocation has been investigated in non-line-of-sight (Qi, 2003; Qi et al., 2006a) and multi-path dense environments (Qi et al., 2006b). UWB ranging under similar conditions was studied in (Lee et al., 2002). These papers are rather theoretical and focus on positioning. The goal of our paper is to adjust a static network rather than to perform positioning, and to assess the solutions. An indoor UWB network with an accurate reference was established, and the individual ranges obtained by UWB were used to estimate the node coordinates with MLE. The LSE and MLE solutions were compared on experimental data, and the comparison confirmed the hypothesis.

The functional model
A UWB network consists of nodes that measure ranges between each other based on time-of-arrival (ToA). One unknown position can be determined with circular lateration.
First, assume that the coordinates of three stations are known, and that an unknown 2D position has to be calculated from the ranges between the unknown position and the stations. Note that each measurement defines a circle around the corresponding node. If at least three distances are known, the circles have to intersect at one point, which is the unknown position, see Fig. 1. The conditions can be described by the following system of equations:

$$\sqrt{(x - x_i)^2 + (y - y_i)^2} = r_i, \qquad i = 1, 2, 3 \tag{1}$$

where $x, y$ = the unknown coordinates, $x_i, y_i$ = the $i$th station coordinates, $r_i$ = the measured range between the unknown position and the $i$th station. If more than one range is available between the nodes and the unknown position, the equalities may not be satisfied when errors are present. In this case, the error model is the following:

$$e_{i,j} = \sqrt{(x - x_i)^2 + (y - y_i)^2} - r_{i,j} \tag{2}$$

where $e_{i,j}$ = the error in the $j$th range of the $i$th station, $r_{i,j}$ = the $j$th distance between the unknown position and the $i$th station.
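The three circle conditions above can be solved in closed form when the ranges are error-free. The following is a minimal sketch, not the procedure used in this study: subtracting the first circle equation (in squared form) from the other two leaves a 2x2 linear system. The function name `laterate` and the station coordinates are hypothetical and chosen only for illustration.

```python
import math

def laterate(stations, ranges):
    """Intersect three circles: return (x, y) from three stations and ranges."""
    (x1, y1), (x2, y2), (x3, y3) = stations
    r1, r2, r3 = ranges
    # Subtracting the first squared circle equation from the other two removes
    # the quadratic terms and leaves a linear system A [x, y]^T = b.
    a11, a12 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    a21, a22 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("stations are collinear; the position is ambiguous")
    # Cramer's rule for the 2x2 system.
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)

# Hypothetical stations and an unknown point used to simulate exact ranges.
stations = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
truth = (1.0, 1.0)
ranges = [math.hypot(truth[0] - sx, truth[1] - sy) for sx, sy in stations]
position = laterate(stations, ranges)
```

With noisy or repeated ranges the circles no longer meet in a single point, which is why an estimation method is needed.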
When the goal is to determine the coordinates of the network nodes, this formula can be extended to consider that the station coordinates are also unknowns (in which case, the unknown position is also a node):

$$e_{i,j,k} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} - r_{i,j,k} \tag{3}$$

where $x_i, y_i$ = the $i$th station coordinates, $r_{i,j,k}$ = the $k$th measured range between the $i$th and $j$th station, $e_{i,j,k}$ = the residual between them.
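As an illustration of the network error model, the sketch below evaluates the residual of every repeated range between every measured node pair. The data layout (dictionaries keyed by node identifiers), the sample values, and the sign convention (computed distance minus measured range) are assumptions made for this example.

```python
import math

def network_residuals(coords, ranges):
    """Return {(i, j): [residuals]} for every measured node pair.

    coords: {node_id: (x, y)} current node coordinates
    ranges: {(i, j): [r_1, r_2, ...]} repeated ranges between nodes i and j
    """
    residuals = {}
    for (i, j), measured in ranges.items():
        xi, yi = coords[i]
        xj, yj = coords[j]
        geometric = math.hypot(xi - xj, yi - yj)
        # Assumed sign convention: computed distance minus measured range.
        residuals[(i, j)] = [geometric - r for r in measured]
    return residuals

coords = {101: (0.0, 0.0), 102: (3.0, 4.0)}   # hypothetical coordinates (m)
ranges = {(101, 102): [5.02, 4.97, 5.40]}     # hypothetical ranges (m)
res = network_residuals(coords, ranges)
```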
Note that the problem becomes ill-determined when not enough ranges (connections) are available, or when the coordinates of at least one node and the orientation of the network are not known. Networks lacking these parameters are called anchor-free networks, while a network in which they are known is anchored.

Least squares and maximum likelihood estimation
When coordinates have to be estimated from multiple measurements, the most popular approach is least squares estimation, which finds those $x_i, y_i$ coordinates for which the sum of the squared residuals $e_{i,j,k}$ is minimal:

$$\min_{x_i, y_i} \sum_{i,j,k} e_{i,j,k}^2 \tag{4}$$

In most applications, LSE provides proper results, but its sensitivity to outliers is a well-known issue. Thus, if the number of outliers is significant or the functional model is incomplete, a more robust estimation method is preferred, such as maximum likelihood estimation (MLE):

$$\max_{x_i, y_i} \prod_{i,j} f_{i,j}\!\left(\sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}\right) \tag{5}$$

where $f_{i,j}$ = the probability density function of the distances between the $i$th and $j$th station.
Note that MLE requires the PDFs of the range measurements. The PDF can be derived from a priori assumptions, or it can be estimated from the samples. In the latter case, the determined function ($\hat{f}_{i,j}$) is the empirical probability density function (EPDF).
Note that the well-known M-estimators are a generalization of maximum likelihood estimation. The idea behind using the MLE concept is its robustness. An M-estimator uses a pre-defined function instead of the likelihood function (Huber, 2009).
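The relation between the three estimators can be written compactly. The block below is a standard textbook formulation rather than a formula taken from this study; the loss function $\rho$ and the use of $f$ as a PDF of the residuals (rather than of the distances) are notational assumptions made only for this comparison.

```latex
% M-estimator: minimize a pre-defined loss \rho of the residuals
\min_{x_i, y_i} \sum_{i,j,k} \rho\left(e_{i,j,k}\right)

% LSE is the special case \rho(e) = e^2.
% MLE is the special case \rho(e) = -\log f(e), because
\max \prod_{i,j,k} f\left(e_{i,j,k}\right)
  \iff \min \sum_{i,j,k} -\log f\left(e_{i,j,k}\right)

% If f is Gaussian with zero mean and common variance \sigma^2, then
-\log f(e) = \frac{e^2}{2\sigma^2} + \log\left(\sigma\sqrt{2\pi}\right)
% and the MLE and LSE solutions coincide.
```

The two solutions can therefore differ only when the range distribution departs from a single Gaussian, which is the situation considered here.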

Motivation for using MLE
In a UWB ranging system, the samples are acquired at about 30-100 Hz, which means that over 100 ranges are available within 2-3 seconds. In static applications, the EPDF can therefore be derived from these samples and used directly in maximum likelihood estimation instead of being defined from prior assumptions.
In a narrow indoor corridor environment, tests indicate that UWB ranges typically do not follow a Gaussian distribution. As an example, Fig. 2 shows the relative histogram of the ranges between the same pair of nodes. It clearly demonstrates the presence of more than one peak. The distribution consists of a set of Gaussian distributions rather than a single one; this type of distribution is called a Gaussian mixture distribution.
Multi-path signal propagation may cause this distribution. Conventional UWB systems use a correlation-based time-of-flight method for estimating ranges. Theoretically, the large bandwidth decreases the impact of multi-path mismatches, but multiple correlation peaks may still remain, and filtering these peaks is important for providing accurate distances. The multiple peaks observed in the histogram may be explained by multi-path mismatches (Gezici, 2005). We emphasize that these statements are only our hypotheses; it is not clear to us that these peaks are caused by the multi-path environment, but similar distributions did not occur under outdoor conditions.
Estimating the expected value is a widely used approach. For the LSE method, the estimate of the expected value is the sample mean, shown by the red dashed line; the median, which can be regarded as the L1 estimate of the expected value, is depicted with a green dashed line. Note that these values are located at the third, smallest peak, which is likely to be a bad estimate of the distance, even if it is an unbiased estimate of the expected value. If only this histogram is known, and considering the multi-path mismatches, the location of the maximum peak is likely to be the valid estimate of the real distance, and this maximum is chosen by maximum likelihood estimation (blue line in the figure).
This example demonstrates that if a large number of outliers is present in the measurements or the functional model is not complete (it is unknown which peak is the real distance), MLE can provide better results due to its robustness. However, MLE may not be unbiased or efficient in the statistical sense; yet it is a very robust estimation method that fits these measurements better.
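The effect described above can be reproduced on synthetic data. The sketch below is an illustration only: the three Gaussian components, their weights, and the true distance of 5.0 m are invented for this example and are not the measurements of Fig. 2. It also assumes, as hypothesized above, that the highest peak corresponds to the real distance.

```python
import math
import random
import statistics

rng = random.Random(0)
true_distance = 5.0  # hypothetical true range in meters

# Synthetic mixture: a dominant peak at the true range, a small middle peak,
# and a broader delayed peak (all parameters are assumptions).
samples = (
    [rng.gauss(5.0, 0.03) for _ in range(450)]
    + [rng.gauss(5.4, 0.04) for _ in range(150)]
    + [rng.gauss(5.8, 0.06) for _ in range(400)]
)

mean_est = statistics.mean(samples)      # least squares estimate
median_est = statistics.median(samples)  # L1 estimate

# Histogram mode: center of the most populated bin.
bin_width = 0.05
counts = {}
for r in samples:
    b = math.floor(r / bin_width)
    counts[b] = counts.get(b, 0) + 1
best_bin = max(counts, key=counts.get)
mode_est = (best_bin + 0.5) * bin_width
```

In this setup the mean and the median both fall near the small middle peak, while the histogram mode stays at the dominant peak.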

Estimating the probability density function
Estimating the PDF is necessary for using MLE. First, histograms can be computed from a set of distances obtained by the UWB sensors. Note that the histogram is not a probability measure; it is not continuous, its shape depends on the bin size, and it may fluctuate. To address these issues, kernel density estimation is used, where the kernel $K$ is the standard normal function:

$$\hat{f}_{i,j,h}(x) = \frac{1}{n h} \sum_{k=1}^{n} K\!\left(\frac{x - r_{i,j,k}}{h}\right) \tag{6}$$

Choosing an appropriate value for the bandwidth $h$ is important: a larger bandwidth smooths out the relevant peaks, but if it is too small, the fluctuation remains. In this paper, the performance was found to be best with a bandwidth of 0.005.
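A kernel density estimate with the standard normal kernel can be sketched in a few lines. The function names, the sample values, and the bandwidth used below are hypothetical and chosen only so that the example is easy to check by hand; they are not the values used in this study.

```python
import math

def gaussian_kernel(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def epdf(x, samples, h):
    """Kernel density estimate of the range samples, evaluated at x."""
    n = len(samples)
    return sum(gaussian_kernel((x - r) / h) for r in samples) / (n * h)

samples = [5.00, 5.01, 5.02, 5.40]  # hypothetical ranges in meters
h = 0.05                            # hypothetical bandwidth in meters

density_main = epdf(5.01, samples, h)   # at the cluster of three samples
density_gap = epdf(5.20, samples, h)    # between the two groups
density_far = epdf(5.40, samples, h)    # at the isolated sample
```

Unlike the histogram, the estimate is continuous and integrates to one, so it can be evaluated at any candidate distance during the adjustment.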

Estimating the node coordinates with MLE
After estimating the empirical probability density function, the maximum likelihood estimate of the node positions is the set of coordinates whose distances maximize the likelihood function:

$$\max_{x_i, y_i} \prod_{i,j} \hat{f}_{i,j,h}(d_{i,j}), \qquad d_{i,j} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \tag{7}$$

Note that here the distances are the independent variables of the EPDF, while the errors are the independent variables of the score function in the LSE case (see Eq. 4). For numerical reasons, instead of solving Eq. 7, the log-likelihood function is maximized:

$$\max_{x_i, y_i} \sum_{i,j} \log \hat{f}_{i,j,h}(d_{i,j}) \tag{8}$$

The equation can be solved by one of the numerical optimization methods. A line search algorithm was selected in this work, which requires a good initial value to avoid being trapped at a local maximum instead of finding the global one.
For that reason, the LSE solution is calculated first, and its result is then used as the initial value for the MLE. The reliability can be further improved with global optimization methods, such as various stochastic algorithms.
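The two-stage procedure (LSE first, then MLE started from the LSE result) can be sketched as follows. This is an illustration under several assumptions: the network has three hypothetical fixed anchors and one unknown node, the ranges are synthetic two-peak samples, and a simple derivative-free compass search stands in for the line search algorithm used in this work.

```python
import math
import random

def epdf(x, samples, h):
    """Gaussian-kernel density estimate of the range samples at x."""
    norm = len(samples) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - r) / h) ** 2) for r in samples) / norm

def compass_search(objective, start, step=1.0, tol=1e-4):
    """Maximize objective(x, y) by probing the four axis directions."""
    best, best_val = start, objective(*start)
    while step > tol:
        improved = False
        for dx, dy in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand = (best[0] + dx, best[1] + dy)
            val = objective(*cand)
            if val > best_val:  # accept strict improvements only
                best, best_val, improved = cand, val, True
        if not improved:
            step *= 0.5
    return best

# Hypothetical anchored network: three fixed nodes and one unknown node.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
truth = (4.0, 3.0)
rng = random.Random(1)
ranges = []
for ax, ay in anchors:
    d = math.hypot(truth[0] - ax, truth[1] - ay)
    # Synthetic two-peak ranges: a main peak and a delayed, weaker peak.
    ranges.append([rng.gauss(d, 0.02) for _ in range(60)]
                  + [rng.gauss(d + 0.6, 0.03) for _ in range(40)])

def neg_sum_squares(x, y):
    """LSE objective, negated so that it can be maximized."""
    return -sum((math.hypot(x - ax, y - ay) - r) ** 2
                for (ax, ay), rs in zip(anchors, ranges) for r in rs)

def log_likelihood(x, y, h=0.05):
    """Sum of log EPDF values at the coordinate-implied distances."""
    return sum(math.log(max(epdf(math.hypot(x - ax, y - ay), rs, h), 1e-300))
               for (ax, ay), rs in zip(anchors, ranges))

start = (3.0, 3.0)  # rough initial guess for the LSE stage
lse = compass_search(neg_sum_squares, start=start)
mle = compass_search(log_likelihood, start=lse)
```

Because the search accepts only improvements, the MLE stage can never end with a lower log-likelihood than the LSE starting point, but like any local method it may still stop at a local maximum. Comparing `math.dist(lse, truth)` with `math.dist(mle, truth)` shows how far the two solutions are from the simulated position.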

TEST
The UWB network was established in a multi-path challenged environment in Bolz Hall at The Ohio State University. Seven TimeDomain's PulsOn 400 units were placed on the floor (http://www.timedomain.com/). The distances between them vary between 2 and 20 meters, see the right side of Fig. 3; the network arrangement is shown on the left side.
Note that the error propagation inside the network is not favourable, especially in the X direction, because the Y extent of the network is much larger than the X extent. The network was measured with surveying tape to determine reference coordinates; the estimated accuracy of these coordinates is better than 5 cm. Laptop-based data acquisition software recorded the raw ranges sent by the sensors via USB cable (right upper side of Fig. 3).
Approximately 1000 distances were measured with two-way time-of-flight range estimation at 30 Hz in every case, but note that there were failed observations too.

DISCUSSION
In Fig. 4d-e, note that the LSE distances do not coincide with any of the peaks; they fall between them. It is not strictly necessary that the MLE chooses one of the peaks, though it tends to select one. In Fig. 4c, the MLE estimated distance is at the maximum peak, while a lower peak is selected in Fig. 4d. This shows that the estimated distances do not have to be at the maximum peak, as the solution depends on all of the measurements, i.e., all of the EPDFs in which the node is present, such that the probabilities of the distances maximize the likelihood function.
Table 1 shows the numerical results. In some cases, larger corrections can be noticed; for example, the coordinates of nodes 100 and 103 are improved by 149.2 cm and 50.9 cm using MLE. The average of the differences between the reference points and the LSE coordinates is 77.5 cm. Using MLE, a significantly lower average difference of 34.3 cm can be achieved. The maximum differences from the reference for the LSE and MLE solutions are 180.9 cm and 59.2 cm, respectively; note that the minimum discrepancies are also decreased using MLE. These numerical results confirm that MLE provides better results than LSE for this type of network.
The only node where the LSE coordinates are closer to the reference is node 106. A closer analysis of the measurements and the distances shows that node 106 did not work properly during the test. After removing its ranges from the dataset, both the LSE and MLE solutions improve, though the MLE solution still shows lower errors: the average difference of LSE from the reference is 51.1 cm, while the MLE average is 25.7 cm, see Table 2.

CONCLUSION
In this paper, a UWB network adjustment process using maximum likelihood estimation was presented. The empirical probability density function is derived from the ranges obtained by the UWB sensors using kernel density estimation with the standard normal kernel. The network coordinates are calculated using the LSE and MLE methods. The results indicate that MLE can provide more accurate coordinates than the LSE solution.
In further research, the use of expectation maximization to estimate the PDFs and to separate the outliers will be investigated. The cause of the Gaussian mixture distribution is also expected to be examined.

Figure 2. Sample indoor range
Symbols of the kernel density estimate: $\hat{f}_{i,j,h}$ = the estimated empirical PDF from the distances between the $i$th and $j$th station, $K$ = the standard normal kernel function, $n$ = the sample number, $h$ = the bandwidth of the kernel.

Figure 3. A UWB unit on the floor (left lower) and another one attached to the logging laptop (left upper); the network arrangement with the measured distances (right)

Nodes 101, 102, 103, 105, and 106 are the stations of the network, and ranges are measured from these nodes. Nodes 100 and 107 are also measured by the network points, but no observations are made from these nodes. Nodes 101 and 105 are assumed to be reference points (marked by double circles in the figure) that fix the network for comparison with the reference coordinates; thus, the network is anchored.
The LSE and MLE solutions are shown in Fig. 4. The black circles show the reference coordinates measured by surveying tape on the floor map in Fig. 4a. The double circles mark the fixed stations, nodes 101 and 105. The red asterisks represent the LSE solution, while the blue ones show the MLE solution. It can be clearly seen that the MLE coordinates are closer to the reference coordinates than the LSE coordinates.

Fig. 4b-4e show the histograms between nodes 102 and 101, 102 and 103, 101 and 103, and 106 and 103, respectively. The empirical probability density functions smoothed by the standard normal kernel function are plotted with a green line. The red dashed line shows the LSE estimated distance, and the blue dashed line shows the MLE estimated distance. In Fig. 4b, the LSE and MLE solutions provide nearly the same result, while in Fig. 4c the estimated distances differ; however, both distributions appear to be Gaussian.

Table 1. Numerical comparison of LSE and MLE solutions

ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume II-1, 2014. ISPRS Technical Commission I Symposium, 17-20 November 2014, Denver, Colorado, USA. This contribution has been peer-reviewed. The double-blind peer-review was conducted on the basis of the full paper. doi:10.5194/isprsannals-II-1-31-2014

Table 2. Numerical comparison of LSE and MLE solutions after removing node 106