COMBINING PUBLIC DOMAIN AND PROFESSIONAL PANORAMIC IMAGERY FOR THE ACCURATE AND DENSE 3D RECONSTRUCTION OF THE DESTROYED BEL TEMPLE IN PALMYRA

This paper exploits the potential of dense multi-image 3d reconstruction of destroyed cultural heritage monuments, either by using public domain touristic imagery only or by combining the public domain imagery with professional panoramic imagery. The focus of our work is on the reconstruction of the temple of Bel, one of the Syrian heritage monuments, which was destroyed in September 2015 by the so-called "Islamic State". The great temple of Bel is considered one of the most important religious buildings of the 1st century AD in the East, with a unique design. The investigations and the reconstruction were carried out using two types of imagery. The first are freely available generic touristic photos collected from the web. The second are panoramic images captured in 2010 for documenting those monuments. In the paper we present a 3d reconstruction workflow for both types of imagery using state-of-the-art dense image matching software, addressing the non-trivial challenges of combining uncalibrated public domain imagery with panoramic images with very wide baselines. We subsequently investigate the accuracy and completeness obtainable from the public domain touristic images alone and from the combination with spherical panoramas. We furthermore discuss the challenges of co-registering the weakly connected 3d point cloud fragments resulting from the limited coverage of the touristic photos. We then describe an approach using spherical photogrammetry as a virtual topographic survey allowing the co-registration of a detailed and accurate single 3d model of the temple interior and exterior.


INTRODUCTION
The Syrian city of Palmyra, located ca. 250km north-east of Damascus, contains the monumental ruins of a great city that was one of the most important cultural centres of the ancient world. Standing at the crossroads of several civilizations, Palmyra was an established caravan oasis when it came under Roman control in the mid-first century AD as part of the Roman province of Syria. It grew steadily in importance as a city on the trade route linking Persia, India and China with the Roman Empire. It has unique examples of funerary sculpture uniting the forms of Greco-Roman art with indigenous elements and Persian influences in a strongly original style. Outside the city's walls are remains of a Roman aqueduct and immense necropolises.
Palmyra is the symbol of Syrian archaeology and one of the UNESCO world cultural heritage sites. It was added to the UNESCO list of World Heritage in Danger in 2013 due to the civil war taking place in Syria. Many monuments were blown up or damaged by the so-called Islamic State in the zones under its control in the north-east of Syria and the west of Iraq in 2015. In Palmyra the most celebrated monuments were destroyed: the Bel temple, the Arch of Triumph and the Baalshamin Temple. The great temple of Bel is considered one of the most important religious buildings of the 1st century AD in the East and to be of unique design.
The temple sits within a bounded architectural precinct measuring approximately 205 meters per side. This precinct, surrounded by a portico (a colonnaded entryway), encloses the temple of Bel. The temple itself has a very deep foundation that supports a stepped platform. At the level of the stylobate (the platform atop the steps) the area measures 55 x 30 meters, and the cella (the inner space of the temple that held the cult statue) stands over 14 meters in height and measures 39.45 x 13.86 meters.
Touristic photos of monuments are very useful for recording geometric and radiometric information. As the Islamic State is systematically destroying Syrian and Iraqi cultural heritage, there is a growing need to document those monuments and to reconstruct virtual models as a basis for 3d visualizations, augmented reality applications or even an eventual future physical reconstruction. Virtual 3d reconstructions are also an important statement that war crimes might destroy physical monuments and artefacts but will not eradicate their virtual representation for the memory of mankind.
Photogrammetry is the only available solution in this particular case, as there is no possibility to use another technique for a survey or reconstruction of monuments which no longer exist or which are no longer accessible. This problem was presented for the first time and under similar circumstances when the Great Buddha statues of Bamiyan, Afghanistan had been destroyed in 2001 by the Taliban. A photogrammetric reconstruction was subsequently carried out using only three metric images acquired in 1970 (Gruen et al., 2003). Now, after many years of advancement in photogrammetric and computer vision algorithms as well as hardware capabilities, it is possible to use much more imagery and even non-metric, generic touristic photos to generate three-dimensional information.
Recent research in the image-based reconstruction of urban environments or cultural heritage monuments can be divided into two main streams: high-quality 3d reconstructions using (semi-) professional imagery on the one hand and large-scale 3d reconstructions using image collections from the web on the other. Koutsoudis et al. (2014), for example, provide a good overview of Structure-From-Motion (SfM) and Dense Multi-View 3D Reconstruction (DMVR) algorithms and of results obtainable with imagery from (semi-) professional DSLR cameras. Kersten et al. (2012) and Santagati et al. (2013) specifically investigate the accuracy potential of image-based 3d reconstructions by comparing their results with terrestrial LiDAR reference data sets. Alsadik et al. (2014) furthermore describe methods for automatically designing minimal networks for the 3d modelling of cultural heritage. These professional photogrammetric reconstructions with specifically designed networks stand in contrast to numerous research efforts exploiting the potential of unordered web-based image collections for efficiently reconstructing large-scale urban environments (e.g. Agarwal et al., 2011). Remondino et al. (2012) provide a comparison and critical discussion of low-cost image orientation packages, also with respect to poor image networks, which are typical for public domain imagery. In addition to assessing the quality of image orientation, there are also recent efforts in establishing methodologies for assessing the quality of dense image matching results (e.g. Cavegn et al., 2015) and in extending dense image matching to historical image data sets (Nebiker et al., 2014).
Over the last few years, a number of projects have been investigating the use of public domain imagery in the reconstruction of collapsed or destroyed cultural heritage sites such as the City of Bam (Futragoon et al., 2010; Kitamoto et al., 2011). These efforts largely relied on combinations of aerial and terrestrial imagery and on a semi-automatic reconstruction process, thus limiting the metric accuracy and the level of detail of the resulting reconstructions.
In our paper we first discuss the available public and professional imagery and the typical limitations of touristic imagery with respect to imaging patterns and coverage. In Section 3 we present different 3d extraction strategies using either public domain imagery only or a combination of public domain imagery with professional panoramic imagery and subsequently compare their results. In Section 4 we present an approach for co-registering multiple point cloud subsets, which typically result from public domain imagery sets, before discussing the co-registration results obtained. Finally, we introduce spherical photogrammetry serving as a virtual topographic survey for georeferencing the entire model.

Source Images
For the reconstruction of Bel temple, we had two image sources available: public domain touristic photos from the web and panoramic imagery acquired with professional photographic equipment.
The tourist imagery was searched for and collected from the web. Most of the images were downloaded from flickr.com. The search was carried out looking for tags in many different languages which might indicate the function and place of the monument. Google was also useful, in particular the image search by similarity called "reverse image search", which yielded a number of untagged photos. After filtering the images, excluding those without EXIF data and those with small dimensions in order to increase the accuracy, a total of 180 suitable touristic images remained. These included 112 images of the exterior and 68 images of the interior of the temple, with a wide range of focal lengths. The minimum image size used was 1200 x 1600 pixels. The available imagery dates from 2005 to 2011.
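The filtering step described above can be illustrated with a minimal sketch. It assumes the image metadata has already been read into simple records; the `PhotoRecord` type, the file names and the interpretation of 1200 x 1600 pixels as a minimum short side and long side are assumptions made for illustration, not the procedure actually used.

```python
from dataclasses import dataclass

# Thresholds taken from the minimum image size stated in the text
# (interpreting 1200 x 1600 as short side x long side is an assumption).
MIN_SHORT_SIDE = 1200
MIN_LONG_SIDE = 1600


@dataclass
class PhotoRecord:
    """Hypothetical metadata record for one downloaded photo."""
    name: str
    width: int
    height: int
    has_exif: bool


def is_suitable(photo: PhotoRecord) -> bool:
    """Keep photos that carry EXIF data and are large enough."""
    short_side, long_side = sorted((photo.width, photo.height))
    return (
        photo.has_exif
        and short_side >= MIN_SHORT_SIDE
        and long_side >= MIN_LONG_SIDE
    )


def filter_photos(photos):
    """Return only the photos that pass both criteria."""
    return [p for p in photos if is_suitable(p)]


# Made-up candidates covering each rejection reason.
candidates = [
    PhotoRecord("exterior_01.jpg", 3000, 2000, True),
    PhotoRecord("thumb_02.jpg", 800, 600, True),        # too small
    PhotoRecord("scan_03.jpg", 2400, 1600, False),      # no EXIF data
    PhotoRecord("interior_07.jpg", 1200, 1600, True),   # portrait, at the limit
    PhotoRecord("crop_09.jpg", 1600, 1000, True),       # short side too small
]
suitable = filter_photos(candidates)
print([p.name for p in suitable])
```

Requiring EXIF data is useful mainly because the recorded focal length gives the bundle adjustment an initial value for each unknown camera.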
The professional panoramas had been captured by G. Fangi during a tour through Syria in 2010. The panoramic images cover the interior and the exterior space of the monument with 20 multi-image partial panoramas created from 551 individual photos. 13 panoramas cover the exterior of the monument and 7 panoramas the interior. The images were captured using a Canon EOS 450D DSLR camera with a 28mm fixed lens and a nodal point adapter, ensuring a single camera centre for all images of a station. External panoramas were captured in the courtyard around the temple, covering a 360° field of view. From these panoramas, the original individual frame images facing the temple were used in the subsequent image-based reconstruction. The reasons for using the original frame images include the preservation of the original image quality and the avoidance of image re-projection and possible stitching errors. Furthermore, typical SfM software such as Agisoft PhotoScan supports perspective rectilinear frame imagery, with the recent addition of fisheye models, but not imagery in typical panoramic projections such as equirectangular or cylindrical ones.
From the panoramic image acquisition, a total of 74 original images of the exterior and 162 of the interior were available for the reconstruction process.

Image Acquisition Patterns
Usually, optimal image acquisition patterns and networks are a key issue in high-quality 3d reconstructions (Alsadik et al., 2014). However, in cases where tourist photographs provide the main source of imagery, acquisition patterns become a limiting factor. Image acquisition patterns by tourists have been investigated at urban scales, e.g. by Kádár & Gede (2013) using geotagged public domain tourist photography. In our work we were interested in the image acquisition patterns on a per-site or per-monument level and their effect on the reconstruction process and results.
Typically, tourists are more interested in the prominent parts of archaeological heritage monuments, or in only some sides of the structure, than in the rest. Their photos are therefore concentrated on the following parts:
• Parts that indicate the historical value of the monument, e.g. parts bearing inscriptions or decorations, well-preserved parts, etc.
• Parts of the monument visible from the typical touristic routes, e.g. the side of a monument which faces the road where tourists pass.
• Monument parts that create the first impression for a tourist, such as a great dome, an entrance, etc. For example, it is difficult to find photos of an entrance gate from the inside (exit direction) even if it is as decorated as the exterior side.
• Photogenic parts of the monument, as many of the photo publishers on the web are amateur or professional photographers. They choose their "monument model" in terms of photo composition, light, colours, etc., regardless of the historical value and even if the monument is hardly recognizable in the photo.
• Parts that are not subject to accessibility restrictions.

3D EXTRACTION
As shown in earlier research, there is a rapidly growing number of both open source and commercial software packages for automated bundle adjustment and multi-view dense image matching (Santagati et al., 2013; Koutsoudis et al., 2014). Following some earlier trials with a number of different algorithms and software packages, Agisoft PhotoScan was used for the subsequent investigations. Some of the reasons include its capability to handle a large number of different cameras and the important possibility to introduce geometric constraints into the bundle adjustment. In our case the option of constraining the camera coordinates of images belonging to the same panoramic station proved to be particularly valuable. Furthermore, it was shown by other authors (e.g. Remondino et al., 2012) that PhotoScan compares well to other software packages with respect to yielding reliable and accurate results, even for challenging image blocks.

Image Registration and Matching
Neither of the two types of imagery available for our investigations was acquired with the intention of an accurate and detailed 3d reconstruction. However, they have different and partially complementary characteristics which shall subsequently be exploited: the touristic internet photo collections provide a dense and high-resolution coverage of some prominent parts of the monument. The typical trade-offs are a weak geometric network design, the use of multiple unknown non-metric cameras and only a partial coverage of the entire monument.
The panoramic images, on the other hand, were taken from intentionally chosen stations with constant focal length lenses offering a full coverage of the monument.The panoramic images have some very wide baselines with large image ray intersection angles from neighbouring panoramic stations.These wide baselines are ideal for interactive panoramic photogrammetry, but are not suitable for SfM approaches.The subsequent image registration and matching strategy was to first investigate the potential of tourist internet collections only and to subsequently investigate the benefits of combining this public source of imagery with panoramic imagery.
For this purpose, dense 3d point clouds for the following three scenarios were computed and evaluated:
• Scenario A) Using public tourist imagery only
• Scenario B) Using touristic imagery with free-floating panoramic images
• Scenario C) Using touristic imagery with geometrically constrained panoramic images

Generated Point Clouds
The exterior of the temple, including all wall surfaces, could be reconstructed using a single group of images within a single computation run (Figure 4, bottom). In the case of the temple interior, it was not possible to produce a single point cloud for the whole space with a single computation. This was due to the distribution and the orientation of the photos, resulting from the fact that the two interesting chambers with their carved monolithic ceilings were located at the opposite ends of a long, narrow rectangular hall.
As a result, most of the photos were either oriented towards the chamber at the northern or the southern end (see Figure 3) with an insufficient number of photos for bridging the northern and southern parts of the hall.The solution was to create two image groups and two subsequent point clouds for the interior space.
The first group contains the photos oriented towards the northern chamber (Figure 4, top) and the second contains the ones oriented towards the south (Figure 4, middle). In the case of computation scenarios B and C, both groups also included images from the panoramas. The fact that both interior image groups have several panoramic images of the western and eastern walls in common can later be exploited in the co-registration process.
With the most comprehensive computation scenario C the following three point clouds were derived: the northern interior point cloud with about 8.5 million points, the southern with about 9.5 million points and the exterior one with about 6.5 million points.

Evaluation of the results
To analyse the contribution of the two image sources, it was interesting to compare the three point clouds produced by the scenarios A-C described in Section 3.1. Scenario C, which includes all available imagery of both image types and in which the panorama images are constrained to their common projection centre, is expected to produce the best resulting point cloud "C". The introduction of these camera station constraints significantly improves the stability and redundancy of the bundle adjustment. Since the Bel Temple has been destroyed and due to the absence of independent high-quality data, point cloud C is used as reference in the subsequent investigations.
The following experiments were carried out using the point cloud data sets A to C for the southern part of the interior hall (Figure 4, middle). All three point clouds were referenced to the same local coordinate system using identical natural features as control points. The features used were chosen from the touristic images so that they are common to all three point clouds. The model is not exactly to scale, but it was possible to approximate the real scale using the known measurements of its internal rectangular plan.
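As an illustration of how such an approximate scale can be derived, the following sketch estimates a single scale factor from the two known plan dimensions of the cella (39.45 x 13.86 meters) by least squares. The model-unit lengths are made-up values and the function name is hypothetical; this is a sketch of the idea, not the exact procedure applied to the point clouds.

```python
# Known interior plan dimensions of the cella in meters (from the text).
KNOWN_PLAN_M = (39.45, 13.86)


def estimate_scale(model_lengths, true_lengths):
    """Least-squares scale s minimizing sum((s * m - t)^2) over all length pairs."""
    numerator = sum(m * t for m, t in zip(model_lengths, true_lengths))
    denominator = sum(m * m for m in model_lengths)
    if denominator == 0:
        raise ValueError("model lengths must not all be zero")
    return numerator / denominator


# Hypothetical length and width of the hall measured in the unscaled model.
model_plan = (7.89, 2.772)

scale = estimate_scale(model_plan, KNOWN_PLAN_M)
scaled_plan = [scale * m for m in model_plan]
residuals_m = [s - t for s, t in zip(scaled_plan, KNOWN_PLAN_M)]
print(f"scale factor: {scale:.4f}")
print("residuals after scaling [m]:", [round(r, 4) for r in residuals_m])
```

Using both plan dimensions rather than one gives a small consistency check: large residuals after scaling would indicate a deformed model or a poorly measured length.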
The point clouds for the scenarios A and B were compared to the reference cloud C using the open source program CloudCompare.The results of these comparisons are shown in Figures 5 and 6.
Figure 5 shows the evaluation of the distances between the points of clouds A and C for the interior southern wall. The mean offset between the two point clouds was 2.2cm and the standard deviation 3.3cm. It shows, for example, how the use of panorama images affected the quality of the point cloud in the high part of the wall (red coloured area of the wall). The panoramic images included in the image block and the resulting point cloud C appear to stabilize the border areas of the tourist-image-only block A. These border areas are susceptible to uncompensated block deformations, especially in the case of multiple unknown cameras and relatively weak network geometry. Figure 6 shows the distances between the points of clouds B and C for the same wall. The mean offset between the two point clouds was 1.5cm and the standard deviation 1.4cm. These results and the figure show that adding station constraints for the panoramic photos in the bundle adjustment reduces the noise of the point cloud by more than a factor of 2 over the unconstrained solution.
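The mean offsets and standard deviations quoted above are summary statistics of cloud-to-cloud distances. A minimal sketch of such a comparison is given below, assuming a plain nearest-neighbour distance from each compared point to the reference cloud. It uses a brute-force search on tiny made-up clouds and is not the CloudCompare implementation, which relies on spatial indexing and offers further distance models.

```python
import math
import statistics


def cloud_to_cloud_distances(compared, reference):
    """Distance from each compared point to its nearest reference point.

    Brute force, O(n * m): only meant for small illustrative clouds.
    """
    return [min(math.dist(p, q) for q in reference) for p in compared]


def summarize(distances):
    """Return mean and (population) standard deviation of the distances."""
    return statistics.fmean(distances), statistics.pstdev(distances)


# Hypothetical reference cloud: a 5 x 5 grid of points on a flat wall (z = 0).
reference_cloud = [(float(x), float(y), 0.0) for x in range(5) for y in range(5)]

# Hypothetical compared cloud: three points offset from the wall by 0-4 cm.
compared_cloud = [(1.0, 1.0, 0.02), (2.0, 2.0, 0.04), (3.0, 1.0, 0.0)]

distances = cloud_to_cloud_distances(compared_cloud, reference_cloud)
mean_offset, std_dev = summarize(distances)
print(f"mean offset: {mean_offset * 100:.1f} cm, std dev: {std_dev * 100:.1f} cm")
```

Note that nearest-neighbour distances depend on the density of the reference cloud: in sparse areas they overestimate the true surface-to-surface offset.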

CO-REGISTRATION STRATEGY USING PANORAMA CENTRES AS VIRTUAL CONTROL POINTS
The co-registration of the two internal point clouds was a challenging problem due to the absence of a topographic base and due to the low-density point clouds in the common areas (eastern and western walls). Attempts to use natural features in these sparse areas for the co-registration proved to be unsatisfactory. The solution for the accurate co-registration of the interior point clouds was found by co-registering the two underlying image networks using the estimated panorama centres of the two networks as common virtual control points. The solution exploits the fact that all panoramic photos of a station, regardless of which point cloud they contribute to, were captured from the identical projection centre. The five internal panoramas were split in two for the dense matching process.
The photos oriented towards the north were used in the generation of the northern point cloud and the rest for the southern one. The idea was then to co-register the two point clouds using the panorama camera positions estimated in the image bundle adjustments as common control points. The northern point cloud was considered as reference, and the southern data set was transformed onto it using a Helmert transformation including a scale adjustment. Figure 7 shows the residual errors of the five common camera stations used in the transformation. The average transformation error was approx. 12cm, with a minimum error of 5cm for the central panorama and a maximum of 17cm in the extreme positions. This corresponds to an average transformation error of about 0.3 % of the maximum dimension of the hall of roughly 40 m.
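A 3D Helmert (seven-parameter similarity) transformation between two sets of corresponding station centres can be estimated in closed form. The sketch below uses the SVD-based least-squares solution commonly attributed to Umeyama and assumes NumPy is available; the station coordinates and the simulated transformation are made up, and this is not necessarily the estimator used for the results reported above.

```python
import numpy as np


def fit_helmert(source, target):
    """Estimate scale, rotation R and translation t with target ~ scale * R @ source + t."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(source)
    mu_s, mu_t = source.mean(axis=0), target.mean(axis=0)
    src_c, tgt_c = source - mu_s, target - mu_t

    # Cross-covariance between the centred target and source coordinates.
    cov = tgt_c.T @ src_c / n
    U, S, Vt = np.linalg.svd(cov)

    # Guard against a reflection instead of a proper rotation.
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0

    R = U @ D @ Vt
    var_s = (src_c ** 2).sum() / n
    scale = float(np.trace(np.diag(S) @ D) / var_s)
    t = mu_t - scale * R @ mu_s
    return scale, R, t


def station_residuals(source, target, scale, R, t):
    """Euclidean residual per station after applying the transformation."""
    transformed = scale * (np.asarray(source, dtype=float) @ R.T) + t
    return np.linalg.norm(transformed - np.asarray(target, dtype=float), axis=1)


# Hypothetical panorama centres in the southern block (model units).
south = np.array([[0.0, 0.0, 1.5], [10.0, 2.0, 1.6], [20.0, -1.5, 1.4],
                  [30.0, 1.0, 1.7], [40.0, -2.0, 1.5]])

# Simulate the same stations in the northern (reference) block:
# rotation of 30 degrees about Z, scale 1.02 and a shift.
angle = np.radians(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle), np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
north = 1.02 * south @ R_true.T + np.array([5.0, -3.0, 0.5])

scale, R, t = fit_helmert(south, north)
errors = station_residuals(south, north, scale, R, t)
print(f"estimated scale: {scale:.4f}, max residual: {errors.max():.2e}")
```

With noise-free synthetic data the residuals vanish; with real station estimates they correspond to the per-station errors of the kind shown in Figure 7. Stations lying almost on a straight line, as along a narrow hall, determine the rotation about that line only weakly.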
Figure 8 shows the evaluation of the distances between points of the two point clouds on a section of the eastern wall following the co-registration described above. The visualisation shows the minimum errors (in blue colour) to be located in the central part, but it also indicates some remaining systematic effects. Then, the point cloud of the exterior was aligned to the interior one using common points visible in the entrance opening and the windows of the temple, completing the co-registration and yielding one complete point cloud. Using this method, we can obtain only the relative co-registration between two point clouds, as we do not know the exact absolute positions of the panorama centres.
Figure 8: Evaluation of the distances between the common portions of the two clouds after the co-registration using panorama centres as virtual control points.

Spherical photogrammetry
Panoramic Spherical Photogrammetry (PSP) makes use of spherical panoramas as information support. They are formed by overlapping images taken from the same point and stitched together on a sphere (Szelinsky and Shum, 1997). The advantages of the spherical panorama are its Field Of View (FOV), which can reach 360°x180°, and the high resolution of the omnidirectional image. The coordinates of an arbitrary object point can be obtained by intersecting the corresponding straight lines that connect the image point with the centre of two or more spheres. PSP can be suitable for fast and accurate recording of Architectural Cultural Heritage (Pisa et al., 2010). Attempts have been made at the automatic detection of corresponding points in the panoramas (Barazzetti et al., 2010). To overcome the lack of stereoscopy and to enable complex surfaces to be plotted, photo-modelling has been used, i.e. the back-projection of the oriented panoramic spheres onto the rough spherical photogrammetric model in a CAD environment (d'Annibale et al., 2009; Fangi, Piermattei and Wahbeh, 2013). For more about PSP see Fangi and Nardinocchi (2013). The integration with point clouds produced by SfM algorithms has already been used fruitfully (Fangi and Pierdicca, 2012).
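The intersection of corresponding rays from two spheres can be sketched as follows. The example assumes that both panoramas are already levelled and oriented in a common coordinate system, and that each observation is given as a horizontal direction (azimuth) and a zenith angle; since two measured rays rarely intersect exactly, the midpoint of their closest approach is returned. All function names and coordinates are hypothetical, and a full PSP adjustment additionally estimates the orientation and correction angles of each sphere.

```python
import math


def direction_from_angles(azimuth, zenith):
    """Unit ray direction; azimuth from X towards Y, zenith measured from Z."""
    s = math.sin(zenith)
    return (s * math.cos(azimuth), s * math.sin(azimuth), math.cos(zenith))


def angles_to_point(centre, point):
    """Simulate an observation: the angles under which a point is seen from a centre."""
    dx, dy, dz = (p - c for p, c in zip(point, centre))
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
    return math.atan2(dy, dx), math.acos(dz / dist)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def intersect_rays(c1, d1, c2, d2):
    """Midpoint of the shortest segment between two rays (centre + k * direction)."""
    w = tuple(a - b for a, b in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; no stable intersection")
    k1 = (b * e - c * d) / denom
    k2 = (a * e - b * d) / denom
    p1 = tuple(ci + k1 * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + k2 * di for ci, di in zip(c2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))


# Two hypothetical panorama centres and an object point on a wall.
centre_a = (0.0, 0.0, 1.5)
centre_b = (10.0, 0.0, 1.5)
true_point = (5.0, 8.0, 3.0)

obs_a = angles_to_point(centre_a, true_point)
obs_b = angles_to_point(centre_b, true_point)
point = intersect_rays(centre_a, direction_from_angles(*obs_a),
                       centre_b, direction_from_angles(*obs_b))
print("intersected point:", tuple(round(v, 3) for v in point))
```

The wide baselines between panorama stations, which hinder SfM matching, are an advantage here because they give large intersection angles and therefore well-conditioned intersections.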

The use of spherical photogrammetry to reference point clouds
The use of spherical photogrammetry is an additional step to reference the point cloud, which goes further than using markers directly, as we do not have ground control point coordinates. In our case spherical photogrammetry took the role of a virtual total station, with which 3D point coordinates could be produced to reference the model to a local coordinate system. The absolute orientation was based on geometrical constraints to orient the X, Y, Z axes and to obtain an approximately correct scale from the known interior temple plan measurements (39.45 x 13.86 meters). The good coverage of the panoramas makes it possible to choose points for the exterior orientation in zones which are not covered by touristic photos and thus to obtain a good orientation of the coordinate system axes (Figure 9).
Figure 9: The central panorama of the interior with collimations of the orientation network points in "Point Records" software for the panoramic spherical photogrammetry.
The exterior network of panoramas (Figure 10), composed of 13 panoramas distributed around the temple, was oriented using 1320 observations with the following results of the block adjustment:
* sigma-naught is expressed in radians
Having those oriented panoramas makes it possible to plot 3d points (Fangi and Nardinocchi, 2013) and even to create wireframe or polygonal 3d models (Fangi and Wahbeh, 2013). In the Bel temple case, the oriented network was used as a basis for referencing the produced point cloud to a defined coordinate system. The point clouds were referenced by adding the 3d points as markers with their specific coordinates, in order to translate, rotate and scale the entire block to the real scale and the correct orientation.

RESULTS
The results of the image-based reconstruction of the Bel Temple using a combination of public domain touristic imagery and professional panoramic imagery are shown in Figures 13 and 14. The presented approach led to a largely complete point cloud, with the exception of some horizontal parts such as the top of the walls and some zones of the floor. The mesh produced from this point cloud has large variations in level of detail. As a result of the touristic image patterns, the most interesting parts of the monument are represented with very high levels of detail and provide a high-quality representation of the monument, especially after applying the photo textures.
Figure 14: Detail view of the produced mesh for a part of the northern chamber.
Based on the investigations presented above, the final model also has a good metric quality with a relative accuracy in the order of 2-3cm and an absolute accuracy in the order of 10-15cm, which is equivalent to better than 0.3 % of the Temple dimensions.
Figure 13: The final 3d model of Bel Temple shown as a mesh of the merged point cloud (top) and textured using source photos (bottom)

CONCLUSIONS
In this article we presented different approaches and scenarios for image-based reconstructions of destroyed cultural heritage monuments using the example of the destroyed Bel Temple in Syria. Our investigations included reconstructions based on public domain touristic imagery from the web and reconstructions using a combination of touristic imagery with professional panoramic imagery. The reconstructions from touristic imagery only were shown to provide a good relative accuracy and very high levels of detail for prominent parts of the monument. However, due to the typical image acquisition patterns of tourists, the monument could only be reconstructed in parts. Touristic imagery and panoramic photos, on the other hand, have proven to be highly complementary. Their combination enabled an almost complete 3d reconstruction of the entire Temple with a good relative and absolute geometric accuracy and with very high levels of detail for prominent parts of the monument. Advanced features for geometrically constraining panoramic camera stations in the bundle adjustment proved to significantly enhance the quality of the 3d reconstruction. In order to obtain a complete 3d model, novel co-registration procedures had to be applied. Here, the panoramic images proved to be very valuable. First, they allowed us to co-register minimally overlapping image blocks and their respective point clouds using the panoramic camera stations as virtual tie points. Second, spherical photogrammetry was used to correctly scale and georeference the entire 3d model. In our future work we will further investigate methods for quantifying and visualizing the reconstruction quality for different parts and sections of the reconstructed monuments. Furthermore, we are planning to include historical images in the image orientation and matching process with the intention of further improving the completeness and quality of the resulting 3d reconstructions. Last but not least, we hope that our 3d reconstruction might provide a valuable basis for a physical reconstruction of the Bel Temple in a more peaceful future.

Figure 1: Some of the open domain touristic images of the temple of Bel used for the reconstruction.

Figure 2 shows the image acquisition patterns for the exterior of the Bel temple and Figure 3 the pattern in the interior. A visual inspection already indicates the complementary nature of the two image data sets and the limited coverage of just the tourist photos.

Figure 2: Exterior photo network with camera stations and viewing directions (tourist photos in red; panoramas in blue)

Figure 3: Interior photo network with camera stations and viewing directions in the final model plan (tourist photos in red; panoramas with their fields of view in blue)

Figure 4: The resulting three point clouds of the temple: interior northern part (top); interior southern part (middle) and the exterior one (bottom).

Figure 5: Evaluation of the distances between the two point clouds A (touristic imagery only) and C (touristic imagery combined with constrained panorama stations)

Figure 7: Residual errors of the Helmert transformation for the five panorama centres used as virtual control points.

Figure 10: The exterior network of panoramas in PSP
The interior network consists of seven panoramas, five of them covering almost the entire interior space and two of them covering the southern chamber detail. The interior network shown in Figures 11 and 12 was oriented using 692 observations with the following results of the block adjustment:

Figure 12: Perspective view of the orientation network of the five internal panoramas in the final 3d model.