Volume II-3/W5
ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., II-3/W5, 483-490, 2015
https://doi.org/10.5194/isprsannals-II-3-W5-483-2015
© Author(s) 2015. This work is distributed under
the Creative Commons Attribution 3.0 License.

20 Aug 2015

A SENSOR AIDED H.264/AVC VIDEO ENCODER FOR AERIAL VIDEO SEQUENCES WITH IN-THE-LOOP METADATA CORRECTION

L. Cicala¹, C. V. Angelino¹, G. Ruatta², E. Baccaglini², and N. Raimondo²
  • ¹ CIRA, Italian Aerospace Research Centre, 81043 Capua, Italy
  • ² Istituto Superiore Mario Boella, Torino, Italy

Keywords: UAV, Sensor-aided video coding, Metadata correction, H.264, x264, Vision-aided navigation, Sensor fusion

Abstract. Unmanned Aerial Vehicles (UAVs) are often employed to collect high-resolution images for image mosaicking and/or 3D reconstruction. Images are usually stored on board and then processed with on-ground desktop software. In this way the computational load, and hence the power consumption, is moved to the ground, leaving on board only the task of storing data. Such an approach is important for small multi-rotorcraft UAVs, whose endurance is limited by short battery life. Images can be stored on board with either still-image or video compression. Still-image systems are preferred at low frame rates, because video coding systems rely on motion estimation and compensation algorithms, which fail when the motion vectors are very long and the overlap between subsequent frames is small. In this scenario, the UAV's attitude and position metadata from the Inertial Navigation System (INS) can be employed to estimate global motion parameters without any video analysis. A low-complexity image analysis can still be performed to refine the motion field estimated from the metadata alone. In this work, we propose to use this refinement step to improve the position and attitude estimates produced by the navigation system, thereby maximizing encoder performance. Experiments are performed on both simulated and real-world video sequences.
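The key idea of the abstract, predicting block motion from INS metadata instead of a pixel-domain search, can be sketched with standard planar-scene geometry: an inter-frame pose change (R, t) over ground at distance d with normal n induces the homography H = K (R − t nᵀ / d) K⁻¹, which maps each block centre to its predicted position in the next frame. This is a minimal illustrative sketch, not the paper's implementation; the function names, calibration matrix, and all numeric values below are assumptions chosen for the example.

```python
import numpy as np

def inter_frame_homography(K, R, t, n=np.array([0.0, 0.0, 1.0]), d=100.0):
    """Homography induced by a camera pose change over a planar scene:
    H = K (R - t n^T / d) K^{-1}, with plane normal n and distance d.
    (Illustrative sketch; defaults assume flat ground 100 m below.)"""
    return K @ (R - np.outer(t, n) / d) @ np.linalg.inv(K)

def predicted_motion_vector(H, x, y):
    """Project pixel (x, y) through H and return its displacement,
    i.e. the metadata-predicted motion vector for a block centred there."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w - x, v / w - y

# Hypothetical numbers: 1000 px focal length, 1 m lateral motion between
# frames, flat ground 100 m below the camera, no rotation.
K = np.array([[1000.0,    0.0, 640.0],
              [   0.0, 1000.0, 360.0],
              [   0.0,    0.0,   1.0]])
H = inter_frame_homography(K, np.eye(3), np.array([1.0, 0.0, 0.0]))
print(predicted_motion_vector(H, 640.0, 360.0))  # ≈ (-10.0, 0.0)
```

Such a predicted vector can seed the encoder's motion search, and conversely the residual between predicted and refined vectors carries information to correct the INS estimate, which is the in-the-loop correction the title refers to.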