COMPUTER VISION IN THE TELEOPERATION OF THE YUTU-2 ROVER

On January 3, 2019, the Chang'e-4 (CE-4) probe successfully landed in the Von Kármán crater inside the South Pole-Aitken (SPA) basin. With the support of the relay communication satellite "Queqiao," launched in 2018 and located at the Earth-Moon L2 libration point, the lander and the Yutu-2 rover carried out in-situ exploration and patrol surveys, respectively, and made a series of important scientific discoveries. Owing to the complexity and unpredictability of the lunar surface, teleoperation has become the most important control method for the operation of the rover, and computer vision is an important technology supporting it. During the powered descent stage and lunar surface exploration, teleoperation based on computer vision can effectively overcome many technical challenges, such as fast positioning of the landing point, high-resolution seamless mapping of the landing site, localization of the rover in the complex environment on the lunar surface, terrain reconstruction, and path planning. All these processes helped achieve the first soft landing, roving, and in-situ exploration on the lunar farside. This paper presents a high-precision positioning technology and positioning results of the landing point based on multi-source data, including orbital images and CE-4 descent images. The method and its results have been successfully applied in an actual engineering mission for the first time in China, providing important support for the topographical analysis of the landing site and mission planning for subsequent teleoperations. After landing, a 0.03 m resolution digital orthophoto map (DOM) was generated using the descent images and was used as one of the base maps for the overall rover path planning. Before each movement, the Yutu-2 rover controlled its hazard avoidance cameras (Hazcam), navigation cameras (Navcam), and panoramic cameras (Pancam) to capture stereo images of the lunar surface at different angles.
Local digital elevation models (DEMs) with a 0.02 m resolution were routinely produced at each waypoint using the Navcam and Hazcam images. These DEMs were then used to design an obstacle recognition method and establish a model for calculating the slope, aspect, roughness, and visibility. Finally, in combination with the Yutu-2 rover mobility characteristics, a comprehensive cost map for path search was generated.


INTRODUCTION
The Chang'e-4 (CE-4) probe was launched from the Xichang Satellite Launch Center at 02:23 (local time) on December 8, 2018. After 26 days of space flight, it successfully landed in the Von Kármán crater inside the South Pole-Aitken (SPA) basin at 10:26 on January 3, 2019 (Wang et al., 2019). Then, the Yutu-2 rover drove away from the lander using the "straight-forward, no-obstacle" mode and successfully reached the first waypoint (X) on the Moon at 22:22 on January 3. At approximately 16:47 on January 11, the Yutu-2 rover and the lander successfully photographed each other (CLEP, 2019), marking the complete success of the CE-4 mission and the opening of a new chapter in China's exploration of the Universe.
By the end of the first twelve lunar days, the Yutu-2 rover had been working on the lunar farside for over 300 days, far exceeding its projected service life. The rover overcame the challenges of traversing complex terrain and covered a total travel distance of more than 300 m, achieving the "double three hundred" breakthrough (CLEP, 2019). With the support of the relay communication satellite "Queqiao," the lander and the Yutu-2 rover carried out in-situ explorations and exploratory surveys, respectively, and achieved a series of important scientific findings. Due to the complexity and unpredictability of the lunar surface environment, teleoperation technology has become the most important control method for rover operation, and computer vision is a core supporting technology in the teleoperation of the rover. During the powered descent stage and lunar surface roving, teleoperation based on computer vision overcame many technical challenges, which ensured the successful completion of the first inspection and exploration mission on the lunar farside (Wang et al., 2019; Liu et al., 2020; Di et al., 2020). This paper introduces a number of key technologies and methods used in CE-4 mission operations, including high-precision positioning of the landing point based on multi-source data, high-resolution seamless mapping of the landing site, navigation and localization in the complex lunar environment, terrain reconstruction, and path planning. Specifically, the improvements and optimizations to the techniques used in the teleoperation of the Chang'e-3 (CE-3) mission are described in detail. Finally, the results of the processing and application of the actual lunar image data from CE-4 are presented.

Teleoperation Mode
The teleoperation system of the Yutu-2 rover needs to consider various constraints, such as environmental temperature, illumination conditions, measurement control, energy, and relay communications. In the unstructured and complex environment on the lunar farside, it is necessary to select detection targets, plan the driving route, and perform scientific measurements in a timely, safe, and accurate manner, while ensuring that the rover can navigate autonomously and be controlled safely and efficiently to complete various scientific investigations (Jin et al., 2019). The Yutu-2 rover utilizes a combination of ground-based teleoperation and on-board autonomous control. The ground teleoperation center and the rover communicate with each other through downlink and uplink communication sequences. Based on the images and telemetry parameters transmitted by the rover, the teleoperation center reconstructs the geographical environment around the rover and builds a telepresence environment. The scientists choose the corresponding detection targets; the teleoperation center then carries out task planning to determine the desired work content and driving route of the rover, and finally directs the rover to complete the corresponding actions through a set of commands uplinked to the rover. Moreover, the Yutu-2 rover has a certain level of autonomy, which allows it to complete local planning and to perform emergency obstacle avoidance and emergency protection actions. The teleoperation diagram for the Yutu-2 rover is shown in Figure 1.

High Precision Localization of the Landing Point based on Multi-source Images
The high-precision positioning of the landing site is an important technical step for the safe landing of the detector. The position information of the landing site is an important basis for establishing a local coordinate system. It can provide position information and basic data for analyzing the landing area terrain, task planning of patrol exploration, and scientific investigations (Di et al., 2019a, 2019b; Liu et al., 2020).
Because the CE-4 landing zone is on the lunar farside, conventional radio measurement methods, such as ranging, Doppler, and VLBI, cannot be used during the powered descent stage of the detector (Wan et al., 2019). Therefore, vision-based positioning technology has become the preferred tool. On the other hand, limited by the bandwidth of the "Queqiao" relay transmission link, the CE-4 detector utilized a high compression ratio mode to compress and transmit the descent images during the powered descent stage. This resulted in obvious patch effects in the original images and increased the difficulty of image matching.
Considering the engineering requirements during the powered descent stage of CE-4, the particular imaging characteristics of the descent image sequence, and the need for both timeliness and accuracy, a coarse-to-fine positioning strategy was adopted. The high-compression-ratio descent images, transmitted in near real-time, were used to achieve rapid initial positioning, whereas the low-compression-ratio descent images were used to achieve high-precision positioning (Wang et al., 2019). The flowchart of the landing point positioning (i.e., lander localization) procedure is shown in Figure 2.

Imaging Strategy for Rover Localization in Long Distances
When facing an unfamiliar, uncertain, and unstructured terrain environment, knowing the position and attitude of the rover is critical for the ground teleoperators to complete the task planning and path planning. Only if accurate positioning data is available can they determine "where to explore" and "what to explore." In addition, merging the topographic maps of the lunar surface across waypoints and marking the detection targets in the images or mapping products both rely on the support of localization.
The precise localization for navigation of the rover was typically performed in two steps. First, a rough measurement of the relative position change of the rover was obtained by the inertial navigation sensor and stored as the initial value. Then, the precise value was calculated based on the multi-camera vision positioning method in photogrammetry. Due to the limited on-board processing capabilities, precise navigation localization was typically performed at the teleoperation center. In the long-distance travel mode, limited by the height of the camera (about 1.5 m), the shape, size, and position of the same target in the overlapping image area of two neighboring waypoints can change in complex and uncertain ways. Additionally, the images are greatly affected by changing illumination conditions, which sometimes makes it difficult to match the images automatically.
To improve the automation of image matching under large scale and rotation differences, an imaging strategy was adopted to alter the original visual positioning algorithm process in CE-4. First, according to the distance traversed, the imaging of the corresponding scene is carried out. Then, the optimal image is selected based on the imaging angle at the two waypoints. Finally, the observation equation is established using the feature points extracted by the ASIFT algorithm (Wang et al., 2014) to calculate the precise localization. Figure 3 shows the specific implementation process.

Terrain Reconstruction Based on Multi-source Image Fusion
Terrain reconstruction is not only an important tool to understand the lunar environment, but also an essential component in the autonomous navigation of the rover. This technology provides teleoperators the ability to select targets for scientific measurements, task planning, and path planning. Additionally, the realistic terrain reconstruction helps enhance the immersion for the teleoperators, effectively improving their efficiency.
Before the lunar rover moves, the teleoperation center adjusts the pitch and yaw drives of the mast, so as to enable the Navcam and Pancam to capture stereo images at different angles. The teleoperators then use dense matching technology to process the downlinked images. Terrain reconstruction is the process of virtually recreating the lunar surface using the 3D point cloud data obtained through dense matching. The CE-3 rover traveled 114.9 m in total, and DEMs and DOMs were generated at 17 waypoints using the Navcam images (Peng et al., 2014), which effectively supported the task planning for scientific measurements, navigation positioning, and path planning.
The complex geological environment on the lunar farside and the rugged terrain of the landing area, combined with the blind spot caused by only using the topographic data derived from Navcam images, present a great challenge to the rover. For the CE-4 rover, we utilized a multi-source image processing method, which uses the terrain data from the Hazcam to fill in the blind spot in the terrain data from the Navcam. The specific implementation of this process is shown in Figure 4. At the same time, to accurately verify the safety of the planned path, the panoramic mosaic technology, which matches images from the Navcam taken from adjacent yaw or pitch angles using the feature points and an affine transformation to create a seamless panoramic image, is used.
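The blind-spot filling described above can be illustrated with a minimal Python sketch. It is not the mission implementation: it assumes that both DEMs have already been resampled onto the same grid in the same local frame, that cells without data are marked NaN, and that Hazcam heights are only trusted within a limited range of the rover (the 3.5 m default follows the cut-off reported for the Hazcam products in the terrain reconstruction results). The function name `fuse_dems` and its parameters are hypothetical.

```python
import math

def fuse_dems(navcam_dem, hazcam_dem, cell_size, rover_rc, max_range_m=3.5):
    """Fill empty (NaN) Navcam DEM cells with Hazcam heights near the rover.

    navcam_dem, hazcam_dem: equal-sized grids (lists of rows) of heights in
    metres, assumed co-registered; NaN marks a cell with no data.
    cell_size: grid spacing in metres (e.g. 0.02).
    rover_rc: (row, col) of the rover in the grid.
    max_range_m: Hazcam data farther than this from the rover is ignored.
    """
    fused = [row[:] for row in navcam_dem]  # copy, keep inputs unchanged
    r0, c0 = rover_rc
    for r, row in enumerate(fused):
        for c, height in enumerate(row):
            if not math.isnan(height):
                continue  # Navcam already covers this cell
            dist = math.hypot(r - r0, c - c0) * cell_size
            hz = hazcam_dem[r][c]
            if dist <= max_range_m and not math.isnan(hz):
                fused[r][c] = hz
    return fused
```

Navcam heights are kept wherever they exist, so the Hazcam only contributes inside the blind spot; a production system would also need to reconcile height offsets between the two sources, which this sketch ignores.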

Modeling of Integrated Lunar Surface Environment Based on Time-varying and Non-time-varying Factors
Path planning is an essential step for the lunar rover to avoid obstacles, traverse the lunar surface, and reach the target position safely. Generally, path planning is implemented based on the terrain data reconstructed by the stereo vision system. Using the performance indicators of the rover, such as curvature constraints, ability to surmount an obstacle, and communication conditions, the rover searches for an optimized path from the starting point to the target point within the motion space, so that it can safely avoid all obstacles (Xu, 2009).
The path planning consists of two main sub-problems: modeling of the lunar surface and searching of the optimal path. The first problem is to recreate the physical environment of the lunar rover in an environmental model that can be understood and expressed by a computer, with the objective of establishing a cost map. The second problem is to find the optimal path from the starting position to the target position based on the environment model and cost map.
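To make the second sub-problem concrete, the sketch below searches a grid cost map for a minimum-cost path. The search algorithm used in the mission is not specified here, so Dijkstra's algorithm on an 8-connected grid is used purely as a stand-in; the function name `search_path` and the convention that `math.inf` marks an obstacle cell are assumptions made for illustration.

```python
import heapq
import math

def search_path(cost, start, goal):
    """Dijkstra search on an 8-connected grid cost map.

    cost[r][c] is the cost of entering a cell; math.inf marks an obstacle.
    Returns the list of (row, col) cells from start to goal, or None if the
    goal cannot be reached.
    """
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}
    parent = {}
    heap = [(0.0, start)]
    while heap:
        dist, cell = heapq.heappop(heap)
        if cell == goal:
            path = [cell]
            while cell in parent:  # walk back to the start
                cell = parent[cell]
                path.append(cell)
            return path[::-1]
        if dist > best.get(cell, math.inf):
            continue  # stale heap entry
        r, c = cell
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                step = cost[nr][nc]
                if math.isinf(step):
                    continue  # obstacle cell
                # Diagonal moves are longer, so scale the cell cost by length.
                new_dist = dist + step * math.hypot(dr, dc)
                if new_dist < best.get((nr, nc), math.inf):
                    best[(nr, nc)] = new_dist
                    parent[(nr, nc)] = cell
                    heapq.heappush(heap, (new_dist, (nr, nc)))
    return None
```

This sketch ignores the rover's curvature and heading constraints, which a real planner would have to enforce on top of the grid search.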
The landing area of CE-4 is on the lunar farside. Owing to relatively frequent impacts, the density of craters is very high; only about 2.5% of the surface area consists of relatively smooth, flat lunar plains between hazardous craters. These factors make proper path planning even more important to the success of the mission (Chen et al., 2010). Based on the above analysis, we extracted several features from the lunar surface and proposed a comprehensive method for lunar surface modeling based on both time-varying and non-time-varying factors, to support the path search. The details of this process are presented in Figure 5.
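As a rough illustration of how time-varying and non-time-varying factors might be combined, the following Python sketch derives slope and roughness from a DEM (non-time-varying) and merges them with a boolean mask standing in for illumination and relay visibility (time-varying). The thresholds, the weighting, and all function names are hypothetical and are not the values or formulas used for Yutu-2.

```python
import math

def slope_deg(dem, r, c, cell):
    """Slope at an interior cell from central differences, in degrees."""
    dzdx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell)
    dzdy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell)
    return math.degrees(math.atan(math.hypot(dzdx, dzdy)))

def roughness(dem, r, c):
    """Standard deviation of heights in the 3x3 window around a cell.

    Simplification: the local slope trend is not removed first.
    """
    win = [dem[i][j] for i in (r - 1, r, r + 1) for j in (c - 1, c, c + 1)]
    mean = sum(win) / 9
    return math.sqrt(sum((h - mean) ** 2 for h in win) / 9)

def cost_map(dem, cell, usable_mask, max_slope=20.0, max_rough=0.1):
    """Combine terrain factors and a time-varying mask into a cost map.

    usable_mask[r][c] is True where illumination and relay visibility are
    acceptable at the planned time. max_slope (degrees) and max_rough
    (metres) are assumed limits. Border cells and cells that violate any
    constraint get math.inf, i.e. they are treated as obstacles.
    """
    rows, cols = len(dem), len(dem[0])
    cost = [[math.inf] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            s = slope_deg(dem, r, c, cell)
            k = roughness(dem, r, c)
            if s > max_slope or k > max_rough or not usable_mask[r][c]:
                continue
            cost[r][c] = 1.0 + s / max_slope + k / max_rough
    return cost
```

Because the mask changes with the position of the sun and of the relay satellite, a map built this way would have to be regenerated for each planning epoch, whereas the slope and roughness layers can be reused.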

Skyline Extraction based on Edge Detection
As the rover traverses and explores the lunar surface, direct sunlight is required to maintain the energy supply from the solar panels. Additionally, to communicate with the relay, there must be a direct line of sight. Given the complex terrain on the lunar farside, the terrain occlusion must be analyzed, basic data for navigation must be provided, and sleep-reboot commands, as well as other controls to ensure the safety of the rover, must be carried out at appropriate times and locations.
For CE-4, we controlled the Navcam or Pancam to capture a photograph of the surrounding terrain; using this monocular image, we were able to determine the boundary between the background of space and the lunar surface, which is known as the "skyline" (Han et al., 2011). Because the Moon has no atmosphere, sunlight arrives directly, and phenomena that are common on the surfaces of Earth and Mars, such as diffuse reflection and scattering, do not occur. Skyline images of the Moon therefore often show a dark sky and a bright lunar surface, which makes the boundary between them more distinct. Based on these features, we utilized an edge detection method based on large-scale unilateral uniform constraints (Peng et al., 2019), as shown in Figure 6. After determining the spatial orientation of the target object (the sun or the relay communication satellite), it was compared with the calculated skyline. If the altitude angle of the target object is less than the altitude angle of the skyline at the same azimuth, the object is considered to be occluded.
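The occlusion test in the last step can be written down directly. The Python sketch below assumes the skyline has already been extracted and stored as (azimuth, altitude) samples in degrees, sorted by azimuth; the linear interpolation between samples and the function names are illustrative choices, not part of the published method.

```python
import bisect

def skyline_altitude(skyline, azimuth_deg):
    """Linearly interpolate the skyline altitude angle at a given azimuth.

    skyline: list of (azimuth_deg, altitude_deg) samples sorted by azimuth,
    with azimuths in [0, 360). The profile wraps around at 360 degrees.
    """
    az = azimuth_deg % 360.0
    azimuths = [a for a, _ in skyline]
    i = bisect.bisect_right(azimuths, az)
    a0, h0 = skyline[i - 1]             # i == 0 wraps to the last sample
    a1, h1 = skyline[i % len(skyline)]  # i == len wraps to the first sample
    if i == 0:
        a0 -= 360.0
    if i == len(skyline):
        a1 += 360.0
    if a1 == a0:
        return h0
    t = (az - a0) / (a1 - a0)
    return h0 + t * (h1 - h0)

def is_occluded(skyline, target_az_deg, target_alt_deg):
    """True if the target (sun or relay satellite) lies below the skyline."""
    return target_alt_deg < skyline_altitude(skyline, target_az_deg)
```

For example, with samples every 90 degrees, a target at azimuth 45 degrees is compared against the midpoint of the skyline altitudes at 0 and 90 degrees.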

ENGINEERING APPLICATIONS IN CE-4 MISSION
Teleoperation is crucial for the rover to traverse the lunar surface and carry out scientific exploration after soft landing. Computer vision techniques play an important role in the entire process of landing and exploration as the primary way to obtain information on the landing environment. After the images of the lunar surface are acquired, a series of processing and analysis technologies are used to provide a high level of safety in the implementation of the scientific exploration tasks, as well as in the driving of the rover. Currently, the Yutu-2 rover holds the record for the longest-working rover on the lunar farside.

Lander Localization and Analysis of the Terrain of the Landing Area
First, the high compression ratio descent images were used to determine the initial positioning of the landing site within 20 min of the probe landing on the lunar surface. Then, the position information was further refined based on the playback of the low compression ratio images. The coordinates of the final landing site were (177.5884°E, 45.4565°S). This was the first successful application of a computer-vision-based landing site positioning method in the Chinese lunar program (Wang et al., 2019; Di et al., 2019a). After landing, a 0.03 m/pixel DOM, which covers an area of 211 m × 187 m, was generated using the descent images, and it was used as one of the base maps for the overall rover path planning (Wan et al., 2019). Figure 7 shows the topographic map of the landing area, where the circles are craters labeled with their diameters. From this figure, it is clear that the lander is surrounded by several large impact craters, which are located to the north, east, and south of the landing site. The terrain to the west of the landing site is relatively flat, which was favorable to the subsequent inspection and exploration by the Yutu-2 rover. To avoid including the shadows of the rover and the lander in the images, a moving strategy of "getting through the close siege and approaching to the west" was ultimately adopted. Figure 7 also shows the periodic planning for the rover and the lander photographing each other, which was one of the milestones of the mission.

Navigation and Positioning
Using the stereo Navcam on the Yutu-2 rover, the teleoperation center has achieved high-precision positioning of the rover in the separation stage, mutual-photographing stage, and traversing investigation stage. By the end of the first twelve lunar days, the rover had traversed a total of 73 waypoints.
Two methods, namely dead reckoning and visual positioning, have been used to localize the rover for the purpose of navigation. Dead reckoning was executed onboard the rover to provide real-time positions and attitudes of the rover; visual positioning was performed at the teleoperation center in near real-time, aiming to provide localization results with higher accuracy. Within the traverse distance of up to 10 m between neighboring waypoints, the localization results from the two methods were generally consistent, with differences at the centimeter to decimeter level. Occasionally, when the rover experienced severe slippage, the localization differences were over 1 m; in these cases, the visual localization technique effectively corrected the large localization errors of dead reckoning. Overall, the on-board dead reckoning and visual localization methods performed well and provided timely and accurate localization information for the teleoperation of the Yutu-2 rover.
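The comparison between the two localization sources can be sketched as follows. This is only an illustration: the waypoint coordinates in the usage note are made up, the function name is hypothetical, and the 1 m threshold simply mirrors the magnitude of difference reported above for severe slippage.

```python
import math

def localization_differences(dead_reckoning, visual, slip_threshold_m=1.0):
    """Compare dead-reckoning and visual positions waypoint by waypoint.

    dead_reckoning, visual: sequences of (x, y) positions in metres in the
    same local frame, one entry per waypoint.
    Returns (differences in metres, indices where the difference exceeds
    slip_threshold_m, suggesting severe slippage).
    """
    diffs = [math.dist(p, q) for p, q in zip(dead_reckoning, visual)]
    flagged = [i for i, d in enumerate(diffs) if d > slip_threshold_m]
    return diffs, flagged
```

With illustrative inputs `[(0, 0), (5, 0), (10, 0)]` and `[(0.03, 0.04), (5, 0.1), (10, 1.5)]`, only the third waypoint would be flagged, and its visual position would be taken as the corrected estimate.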

Terrain Reconstruction
By the end of 13 lunar days, the Yutu-2 rover had traveled a total of 357.695 m (CLEP, 2020), and the Navcam and Hazcam acquired stereo images at every waypoint along the path. Local DEMs and DOMs with a resolution of 0.02 m were routinely produced at each waypoint, which effectively supported the subsequent detailed topographical analysis for path planning. Figure 8 shows the rover at the LE01104 waypoint on November 1, 2019; the next step was to select the path to the "dormant" point. Fig. 8(a) shows the DOM at this waypoint, which was automatically generated from 16 pairs of Navcam images captured at a fixed pitch angle. As previously described, the Navcam cannot image the terrain beneath the rover. In order to clearly see the terrain beneath and in front of the rover, we also turned on the Hazcam to acquire stereo images, one of which is shown in Fig. 8(b). The Hazcam image shows that there was a large pit in the area around the front left wheel of the rover. Fig. 8(c) is the DOM generated from the Hazcam images. Considering the low resolution of the Hazcam images in the far range, we only cut out and used the DOM within a range of 3.5 m. Fig. 8(d) shows the merged Navcam and Hazcam DOM, within which the Hazcam DOM is indicated by the red oval. The diameter of the crater was approximately 45 cm, and the depth was approximately 6 cm, which did not exceed the obstacle crossing threshold of the Yutu-2 rover.
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume V-3-2020, 2020 XXIV ISPRS Congress (2020 edition)

Path Planning
On the night of January 3, 2019, the CE-4 lander and lunar rover completed the separation process, and the Yutu-2 rover started its journey on the lunar farside. The path planning provided the lunar rover with the basis for each movement. Due to the rugged terrain and high density of craters of various sizes, the Yutu-2 rover typically utilized a slow and steady driving mode, and the targeted distance was 4-7 m with each movement. Figure 9 shows the comprehensive cost map with a transparency of 0.3. Through the modeling and analysis of the DEM, it is clear that there is a crater with a diameter of approximately 3 m to the west of the rover, which is identified as an obstacle, as shown by the red area. Additionally, the terrain to the southwest is relatively flat, and there are continuous and dense dormant areas, as indicated by the yellow area of the figure. Based on this information and the rover's performance limitations, the path from waypoint LE01104 to LE01105 was searched, as shown on the comprehensive cost map of Fig. 9(a). The action sequence of the rover was to move along a small curvature to reach the waypoint, as indicated by the green line, and then to turn left by a small angle at the waypoint to satisfy the dormant azimuth constraint, as indicated by the yellow section. In order to check the planned path from a more intuitive perspective, we projected the 3D path onto the Navcam image mosaic and the Hazcam image based on the Navcam and Hazcam imaging models, as shown in Fig. 9(b) and Fig. 9(c). From the Hazcam image, we can see that there is a pit of approximately 45 cm diameter and 6 cm depth in front of the front left wheel of the lunar rover; there is a small stone inside the pit as well (shown by arrow 3), which cannot be seen in the Navcam images. The path we planned just avoided the stone, making the left wheel drive along the side of the pit. Additionally, while moving, the left wheel will pass through a small pit, with a depth of about 5 cm, as shown by arrow 1, and a larger pit near the right wheel, with a depth of about 6 cm, as shown by arrow 2.
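The projection of a planned 3D path onto an image can be sketched with an ideal pinhole camera. This is a simplification: the actual Navcam and Hazcam imaging models are not reproduced here, lens distortion is ignored, and the rotation matrix, camera position, and intrinsic parameters (`fx`, `fy`, `cx`, `cy`) are hypothetical inputs.

```python
def world_to_camera(point, rotation, cam_pos):
    """Transform a 3D point from the local frame into the camera frame.

    rotation: 3x3 matrix (list of rows) rotating local-frame vectors into
    the camera frame (x right, y down, z forward).
    cam_pos: camera position in the local frame.
    """
    d = [point[i] - cam_pos[i] for i in range(3)]
    return tuple(sum(rotation[i][j] * d[j] for j in range(3)) for i in range(3))

def project_path(path_world, rotation, cam_pos, fx, fy, cx, cy):
    """Project 3D path points to pixel coordinates with a pinhole model.

    Points behind the camera (z <= 0) are skipped because they cannot
    appear in the image.
    """
    pixels = []
    for p in path_world:
        x, y, z = world_to_camera(p, rotation, cam_pos)
        if z <= 0:
            continue
        pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels
```

The resulting pixel polyline can then be drawn over the image mosaic so that the operator sees which pits and stones the wheels would pass.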
This verification method presents the planned path in a more intuitive way and can help adjust the path if necessary. Additionally, it can be used to compare the planned path with the traversed path of the rover to evaluate the accuracy of the moving process, which plays an important role in the completion of the overall mission. Figure 10 shows the rover looking back at the tracks it imprinted at waypoints A' and C' on the first lunar day. The planned paths are projected onto the images and compared with the actual tracks. From these images, it can be seen that the paths are consistent with each other, indicating that the rover was moving as planned during these periods.
Figure 10. Comparison of planned paths (red lines) and actual tracks at waypoints A' and C', respectively.

Skyline Calculation
Generally, once every two lunar days, at approximately lunar noon, the Yutu-2 rover captures an image of the skyline when the altitude angle of the sun is highest. To date, six such skyline images have been created, and the results of the skyline altitude computations are shown in Figure 11. Fig. 11 illustrates that the calculated results of the six skyline altitude angles are all less than 3°, and the average error between them is less than 0.02°. The Yutu-2 rover is approximately 0.77° away from the northern mountain and 0.17° away from the western mountain. There is no occlusion in the distance, and the environment is conducive to sleep-reboot.
Figure 11. Results of the skyline altitude angles in the first 12 lunar days (Azimuth 0° represents North).
Based on the visual localization results at all waypoints, the traversed path of the rover is generated, and the traverse map of Yutu-2 over the first 12 lunar days on the lunar farside is shown in Figure 12, and the green number represents the distance of a

CONCLUSIONS
As of December 03, 2019, the Yutu-2 rover had been operating on the lunar farside for 17 lunar days, with a cumulative traversed distance of more than 400 m. During this period, the rover traveled in accordance with the planned paths. The scientific payloads, such as the Pancam, Lunar Penetrating Radar, and Visible and Near-infrared Imaging Spectrometer (Jia et al., 2018), carried out scientific measurements at multiple waypoints. A large amount of scientific data was collected, and a preliminary analysis of the results has been completed. Going forward, the Yutu-2 rover will continue to move forward, strive to obtain more first-hand scientific data, and add luster to the "space dream" and to the "Chinese dream." During this period, computer vision technology played an integral role in supporting the Yutu-2 rover to travel safely on the lunar farside, work effectively, and achieve fruitful scientific results. In this paper, the computer vision technology utilized in the teleoperation of the Yutu-2 rover was described in detail, and the data processing results and their applications in the Yutu-2 mission were presented. With the advancement of the deep space exploration program in China, teleoperation technology will have a wide range of applications. Furthermore, computer vision, as one of the key technologies in teleoperation, will need further development. For example, more advanced technologies will be introduced, including machine learning, artificial intelligence, and virtual reality.