IMAGE-BASED TOPOGRAPHY RECONSTRUCTION BY MINIMIZING THE REPROJECTION-ERROR OVER A SURFACE

Image based three dimensional modelling of small scale scenes sees an increased demand in recent years due to algorithmic advances, and growing awareness of users to the value of scene documentation in three dimensions. Nonetheless, most of the reported image-based reconstruction methods are either point driven, relying on matching large set of automatically sampled points (e.g., SIFT, SURF), or integral approaches. The latter are commonly based on generating an entire map of matching. For many such stereo-correspondence methods, however, images which are acquired from a narrow baseline and similar orientations are usually needed, in order to ensure satisfying results. We present in this paper a reconstruction approach, assuming both visibility and Lambertian reflectance, which is intended to be less affected by image difference. The proposed method performs a surface evolution which is implemented by an iterative scheme, applying a gradient flow which minimizes a reprojection-error cost function. The algorithm is applied on scenes of complex topography and varied terrain. Our results show applicability of the method, along with its satisfactory performance for some difficult imagery configurations.


INTRODUCTION
Detailed three dimensional mapping of small scale scenes sees an increased demand in recent years.This trend can be attributed to increasing computing power, graphic hardware, and technological and algorithmic advances.It has also been thriving by growing awareness of users to the value of scene documentation in three dimensions, such as visualization, scene analysis, and more.Among the various data acquisition techniques, the most effective ones for geographical scenes reconstruction are based on laser scanning technology and images.Although laser based point-clouds provide a dense set of data for scene description, limited availability of scanners and their prohibitive cost makes them not as widely applied as may be anticipated.In contrast, low cost cameras are far more available, and enable 3D modelling which is less costly, and in many cases generate data that are more detailed.Consequently, image-based 3D modelling is still relevant nowadays (Remondino and Menna, 2008).Image based methods for three-dimensional reconstruction, are often motivated by photo-consistency criteria.Most of the reported approaches for environmental and geographical scenes are point driven, and rely in many cases on the detection and matching or a large set of points (e.g., using the SIFT or SURF operators) (Remondino and El-Hakim, 2006;Barazzetti et al. 2010).Traditional methods for integral scene reconstruction are stereo-vision based and generate disparitymaps which match each image pixel to a counterpart in the other image (Figure 1a).Generation of such maps is often subjected to an optimization scheme of some energy function, e.g., graph-cuts, belief-propagation and so forth (Scharstein and Szeliski, 2002;Brown et al., 2003).Notwithstanding, implementation is usually carried out in image-space, and is subjected to its resolution and precision, whereas the final reconstruction model is given in three-dimensional space.Moreover, these methods seem ineffective for wide baselines or significant orientation differences.Therefore, they may turn less applicable in complex natural environments which contain varied topography along with discontinuity and unsmooth features therein (Mor and Filin, 2009).A somewhat different approach for 3D modelling is surface evolution (Faugeras and Keriven, 1998, Gargallo et al, 2007, Delaunoy and Prados, 2011) which often generates a model which minimizes the images reprojection-error objective function (also known as energy or cost function) (Seitz et al., 2006).This minimization is based on the assumption that the intensity image values of the same object-space point should share similar values; more realistically, have a minimal intensity difference between them.The process begins with an initial coarse approximated surface, and in each iteration the surface should change or "evolve" by some criteria until convergence to the optimal solution (by minimizing an objective function) is reached.Following such approaches, this paper proposes a methodology for three-dimensional surface reconstruction of a textured scene, which is motivated by photo-consistency, assuming visibility and Labertian reflectance.Our objective is to generate an explicit digital surface model (DSM) based on minimization of the integral reprojection-error generated by an image-set over the surface.An initial coarse surface is evolved iteratively using the gradient of a reprojection-error energy function (gradientdescent method).Contrary to traditional reconstruction methods, our approach is surface-based (Figure 1b) and advancement is computed in three-dimensional model-space.The advantages of the proposed approach over traditional ones lie in the fact that the surface reconstruction is generated in a three-dimensional model space which makes it less subjected to image resolution.Moreover, since no point-or pixel-direct matching is required, the use of image sets with wide baselines and significant difference in their orientations becomes possible and in a more effective manner.The paper presents a theoretical scheme, along with an implementation for a triangular mesh model.Modification to other explicit surface representations is, however, straightforward.
The organization of this paper is as follows: we first introduce the main concept of the surface reconstruction as casted by an energy minimization.Section 2.1 develops expressions for the energy function gradient which enables to evolve the surface.Section 2.2 elaborates on computational aspects which enable us to implement our method for digital images.Section 3 demonstrates the application of the method on an archaeological site that offers complex topography including varied terrain, discontinuities and unsmooth features therein.The results show applicability of the method, along with its satisfactory performance for difficult imagery configurations.The paper concludes with summary and discussion.

SURFACE RECONSTRUCTION
Our surface reconstruction approach is based on minimizing the reprojection-error in the three-dimensional object space.In order to do so we first describe the reprojection-error as a scalar field over the 3 Euclidean domain.For simplicity we focus in first on an image pair alone, but the expansion to multiview configuration is relatively straightforward by adding terms for all image pairs.Consider two images of a specific scene: a reference image I and a target image I'.Every object-space point x|=|[x,y,z] T can be projected into each one of the images, and retrieve the intensity value Ȋ(x) and Ȋ'(x).Notice that here Ȋ: 3 → is a function over a point in the three-dimensional object-space (as opposed to I(u,v) which is the intensity function for a twodimensional image-space point).The relation between the two is Ȋ(x) = I(P(x)), where P: 3 → 2 is the projection of the objectspace coordinate onto the image.We can now define the reprojection-error scalar field as the square difference of intensity values as given in Eq. ( 1) This scalar field may be calculated for any point in the field-ofview of both images.Figure 2 provides a graphic illustration of such a field.We would like to find a surface Γ which represents the scene optimally, i.e., a surface that would maintain photo-consistency, or minimize the reprojection-error.As our interest is in an integral solution for the entire surface, our objective would be minimizing the cost function given in Eq. ( 2): where dσ is the differential surface area element.This is an integration of the reprojection-error field D(x) over the surface Γ.Using continuous surface representation the optimization can be formulized as a variational problem.However, we can also use a discrete surface representation, such as Triangular Irregular Network (TIN) mesh or Digital Terrain Model (DTM) grid.In such a case, the surface would be represented by a set of piecewise functions (planes for TIN, bilinear or bi-cubic for grid etc.) where each of them depends on some finite number of parameters: The choice here is of a triangle-mesh representation, and so, the integration is over a set of triangles {T k , k = 1,2,…,n T }, and the solution is of a finite vector of n p z-ordinate [t 1 ,t 2 ,…,tnp].
Figure 2. Illustration of the reprojection-error field.Region A is of low reprojection-error as its projection color in both images is red.On the other hand, the reprojection-error of region B is high due to the collision between red and green colors of its image projection.
Our choice of solving this optimization problem is by using the gradient-descent method.Using it we should calculate the gradient of the objective function given in Eq. ( 3), i.e., its vector of derivatives with respect to the z-ordinate [t 1 ,…,tnp] of each of the vertices (Figure 3).The update for each iteration q should be t j for some scalar step of size α .

Gradient of the Energy function
The objective energy function introduced in Eq. ( 3) is in fact an integral over the entire surface, and therefore depends on the parameters t i .As these parameters change, the surface evolves.
Notice that the domain of integration changes as well as the energy function value.This concept of a surface flow according to some direction, which is the gradient in our case, is sometimes related as gradient flow (Delaunoy and 2011).
To illustrate the flow, consider a surface that depends on a single parameter, t, termed in the following as time, for traditional reasons.For simplification of the mathematical derivation, we restrict our surface to a single triangle.Our energy function is an integration of some scalar field over the surface.

Time Derivative for a Single Triangle
Consider a triangle T in 3D-space, defined by its three vertices i=1,2,3.Let D: 3 → be some scalar field, then its integration over T may be expressed as a surface integral E.
In order to introduce surface evolution or flow, we consider the triangle surface as moving in time in such a way that its three vertices move along some time dependent curves (Figure 4a).Consequently, the domain of integration also depends on time, and thus can be written as: Since the domain of integration in Eq. ( 4) is also time dependent, computing the derivative dE/dt is not a trivial task.Thus we should use the Leibniz integral rule for differentiation under the integral sign (Flanders, 1973).To overcome this difficulty we may transform the domain of integration into a 2D coordinate system for which it has no time dependency.This is done by changing the surface domain to the unit triangle in 2 which is defined by {|(r,s)[0,1] 2 |||r+s|=|1|}.The integral may now be written as: where any triangle points x(r,s)|=|x 1 +|(x 2 -x 1 )r|+|(x 3 -x 1 )s, determined by r and s, is a convex combination of the three vertices.The vector g|=[ |×| ]=|(x 2 -x 1 )×(x 3 -x 1 ) is also defined here in such a way that dσ|=|||g|||dr|ds enables us to compute a surface integral over the 2D domain space (r,s) as shown in Eq.
(5).Since now the domain of integration is not time dependent, we may derive it and obtain the expression in Eq. ( 6).
  , so that the transition dD/dt|=|D•ẋ becomes a result of applying the chain rule.We have also used the notation ġ|=|∂g/∂t.Moreover, since g is computed only using the triangle vertices it does not depend on the integration parameters r and s, and therefore can be taken outside of the integral, which leads to the following expression for the derivative We can further show that the vector derivative ẋ and the expression ġ•g can be expressed as a linear function of the vertices derivative ẋ 1 , ẋ 2 , and ẋ 3 , as given in Eqs. ( 8) and ( 9).
where the value g 0 is the triple product of the three vertices g 0 |=|(x 1 ×x 2 )•x 3 ; g|=|||g|| is the norm of vector g and equals to twice the area of the triangle; and the vector n|=|g/||g|| is the unit normal vector of the triangle.an explicit triangle surface model .

The Derivative for an explicit TIN model
As we have chosen to use an explicit triangular mesh for a surface representation, the shape of the surface is determined by the z-ordinates of the triangles vertices (Figure 4b).For a single triangle T k , the energy E k which is integrated over it, depends only on a limited (three) set of parameters, t j , which correspond to its three vertices.The gradient of our energy function, is therefore a vector containing the derivatives of the energy E(t 1 ,..,tnp) with respect to the values t j of each vertex j|=|1,2,…,n p , where n p is the number of vertices.Using Eq. ( 7) the j-th component of the gradient may be expressed by where e jk |=|1 if the j-th vertex belonging to the k-th triangle T k , and e jk |=|0 if not.Here, ẋ and ġ are vector derivatives with respect to t j .For a single triangle T k , and without any loss of generality, we name these three parameters t 1 ,t 2 ,t 3 , which correspond to the triangle vertices x i , i=1,2,3, respectively.The 3D derivative values of the vertices can either be [0,0,0] T or [0,0,1] T (each vertex is defined as x i |=|[x i ,y i ,z i ] T and derivatives are with respect to the z-ordinate of all vertices t i , i=1,2,3).By substituting the derivatives into Eqs.( 8) and ( 9) we get the following expressions: and Continuing with an expression for D, the reprojection-error field is D(x)|=|(Ȋ(x)-Ȋ'(x)) 2 as defined in Eq. ( 1), and therefore its 3D gradient is where Ȋ and Ȋ' are the aforementioned 3D color fields generated by the reference and target image respectively.Deriving this color field function in 3D space is not as obvious, and so we use the 2D image function which generates it by the relation Ȋ(x)|=|I(P(x)).The 3D gradient Ȋ may therefore be expressed as T T , , where here u and v are the 2D image coordinates and u|=|[| ,| ,| |] T .Therefore, to complete the expression for the gradient we also need to find the derivatives of the 2D imagespace coordinates u|=|[u,v] T with respect to the object-space 3D coordinates x|=|[x,y,z] T .We assume a pinhole model and therefore use the co-linearity principle, which can be written using the projection matrix P 3×4 : The partial derivative of (u,v) with respect to (x,y,z) may be expressed by the Jacobian matrix where w|=|p 31 x|+|p 32 y|+|p 33 z|+|p 34 .This provides us with the required expressions, since the first row of the matrix in Eq. ( 16) is u T and the second row is v T .
We can now write explicit expressions for the derivatives of E.
Prior to that we note that the vectors we obtained for ẋ in Eq. ( 11) have zero value for their first two components, namely, ẋ|=|[|0,|0,|∂z/∂t i |] T for i=1,2,3.This is an expected result, since the parameters t j affect the surface only on the z-direction.Consequently, in order to compute the expression D•ẋ, we may consider only the derivative D z |=|∂D/∂z and ignore the derivatives with respect to x and y.Therefore, combining Eqs. ( 13), ( 14), and ( 16), we have we use the tag sign ('|) to relate to the target image.As a result and using Eqs.( 7), ( 11), and ( 12), we obtain the necessary derivatives of E.

Computation
Computing the integral in Eq. ( 3) in 3D object space is a challenging task.Although the 3D function Ȋ(x) is theoretically well defined, it is difficult to implement it practically.Moreover, our basic units for color data are image pixels, and therefore it is reasonable to compute this integral in the 2D image-space (u,v).We may therefore write Eq. ( 3) in the following form where u|=|[u,v] T is the 2D reference image coordinates.The function P(x) in the domain of integration is the projection function from 3D object-space into 2D image-space: u=P(x).
Eq. ( 19) would be entirely computable in the reference imagespace (u,v) had it not included the term I'(u') which is the actual target image-space.Therefore, in order for the integrand to be a function of single image-space domain (u,v), we have to reproject the target image coordinates (u',v') onto the reference image.Since the two projection functions are defined by u=P(x) and u'=P'(x), we may determine that u'|=|P'(P -1 (u)).This composite function may seem indefinite, since we cannot trivially transform image coordinates from one to the other (by inverting P).However reprojection may be computed using the surface data.Taking the surface into consideration we can intersect a reference image ray generated by an image point, leading to a 3D object-space point, which can be then projected to the target image.A geometric illustration of such reprojection is given in Figure 5.We now define the reprojected image function Ĩ(u)|=|I'(u')|=|I'(|P'(P -1 (u))|), which may be constructed by the original target image I', the two perspective projection matrices of both images, and the surface.Notice that the reprojected image is a synthetic one created by first inversely-projecting the target image onto the surface, and then projecting it from the surface to the reference image-space.Using the above notations we may now write Eqs. ( 3) or ( 19) as: where is the coefficient of the surface area element in the image-space, which is developed in the Appendix.As Eq. ( 21) shows, J depends on the image-space point u|=|[u,v] T , its inversely projected 3D object-space coordinates x|=|[x,y,z] T , the image perspective projection matrix P and the surface normal vector n at this point.The value w|is p 31 x|+|p 32 y|+|p 33 z|+|p 34 .
Figure 5. Illustration of an image-pair reprojection.
Applying the reprojection approach we can now write the following expression for integration of D z .Using Eqs. ( 17) and ( 20 which transforms the domain of integration from the 3D surface in object-space into both the reference and target image-spaces. Here Ĩ' is the inverse reprojected image from the reference to the target one, and J' is the coefficient for the target image.This enables us to compute the gradient of our energy function, and therefore perform a surface evolution in that direction.We detail here the algorithm for evolving an initial course surface based on minimization of the reprojection-error accumulated of an image pair of the scene.The computation of the surface point (x,y,z) in line #17 of the algorithm can be done by intersection of the ray line, generated by the image pixel and the camera exposure point, and the triangle plane.Moreover, the image derivative I u and I v in line #20 has to be computed numerically by using, for example, a gradient kernel.This can be done as a preliminary stage since the images do not change between iterations.

RESULTS
The proposed surface reconstruction method has been applied for scenes within an archaeological excavation site.It is demonstrated on a documentation project in Oren cave in the Carmel mountain ridge, as part of on-going research studying the origins and use of Manmade-bedrock-holes (MBH) in those sites.An excavation campaign performed in June 2011 serves as the case study here.Images were acquired during the excavation with a Nikon D70 camera, with 48mm Nikkon lens.
We focus here on the reconstruction of a 3D surface model of MBH embedded within rough topographic settings.The input images were taken from a distance of approximately 1.5m from the MBH, such that the baseline is ~80cm, with a large orientation difference (Figure 6).The computation is generated by a few manually sampled points for producing the initial coarse surface (Figure 6).The initial surface can be seen in Figure 7a.As noted the surface evolution is implemented by an iterative scheme, applying gradient flow on the z-ordinates of the vertices.Selected stages of the evolution are presented in Figure 7, and the final resulted surface is presented Figure 8a.
Results show a complete and successful reconstruction of the surface enabling also to depict fine details that reflect its formation.Application of the proposed model to other MBHs has proven successful as well.
The reconstruction has been compared to existing methods as well.Use of the same images for reconstruction using the graph-cuts method has failed in the reconstruction completely (Mor and Filin, 2009).It seems that these methods which are based on disparity map in image-space tend to give inadequate results for poorly textured scenes, and seem to fail when the imaging configuration is of wide-baselines and is characterized by significant orientation difference.Furthermore, applying the SIFT algorithm on the images has yielded a sparse set of points, most of them are on the outer surface of the MBH and are insufficient to reconstruct the object successfully (Figure 8b).
Here again, texture, and the wide baseline and orientations have a significant effect on the outcome.

CONCLUSION AND DISCUSSION
This paper has presented a method for 3D reconstruction by surface evolution based on an image-pair, by employing a gradient flow for minimization of the reprojection-error.
Additionally, an implementation scheme on a TIN mesh has been presented, and applied for complex topographic scenes.
The results have shown applicability of the method, along with its satisfactory performance for some difficult imaging configurations.Future work may deal with expanding the method for several images, and evolving surfaces of nonexplicit representation.

APPENDIX
The appendix presents an expression for J, the coefficient transforming the area element form the surface to the imagespace.This is the expression given in Eq. ( 21).The relation between the image-space coordinates (u,v) and the object-space 3D coordinates (x,y,z) can be stated by the following three equations: These two sets of equations may be written as Using the identity (Ab)×(Ac) = det(A) A -T (b×c) and since the determinant of Q is g T |(det(P 3 )P 3 -1 )|[|u|v|1] T , we get the expression of Eq. ( 21).

Figure 1 .
Figure 1.Illustration of 3D point determination by a onedimensional search, a) an image-space search along the epipolar line using the radiometric information directly, b) the search is along the vertical locus, and the radiometric values are obtained by projection to image-space.The positioning in both cases is motivated by requiring minimal values for the intensity differences.

Figure 3 .
Figure 3.A surface (left) and the gradient of its vertices (right) visualized by the arrows.

Figure 4
Figure 4. a) a general time dependent triangle surface model, b) an explicit triangle surface model .

Figure 6 .Figure 7 .
Figure 6.Images used for the reconstruction, along with sampled points for generating the initial coarse surface.

Figure 8 .
Figure 8.The reconstructed surface with texture in (a), whereas for comparison, a set of match points generated by the SIFT algorithm is presented in (b).
two equation are variation of the co-linear law of Eq. (15), and the third one is the triangle plane equation in the 3D object-space.Deriving these three equations with respect to u and v gives T , w|=|p 31 x|+|p 32 y|+|p 33 z|+|p 34 , and