Acquisition using lightweight platforms and data processing for underwater photogrammetry

Ph.D. Defense

Charles Villard

IGN / LaSTIG / ACTE

Director: Marc Pierrot-Deseilligny
Supervisor: Ewelina Rupnik

EPITA / LRE

Supervisor: Laurent Beaudoin
Supervisor: Loica Avanthey

05/07/2024

UGE / MSTIC

https://thesis-slides.villard.it

Plan

1. Acquisition

2. Processing

Conclusion & Perspectives

Contributions

Publications:

Code:

Presentations:

Courses taught at EPITA:

  • Introduction to Robotics (16h)
  • End-of-studies Projects (60h)
  • Linux Distribution for Embedded Systems (21h)
  • Data structures and Algorithms (18h)
  • C++ Workshops (12h)

Acquisition

  1. Context
  2. Acquisition sensors
  3. Vector platform
  4. First Conclusion

Underwater Environment

Context

Objectives:

  • Conduct on-site data collection
  • Utilize affordable platforms and sensors
  • Achieve accurate pose estimation

Environment:

  • 0 to 40 meters depth
  • Nearshore areas
  • Natural lighting conditions

Sensors

            Price   Control
Industrial  High    Excellent
Consumer    Low     Basic
  • Aim for affordable equipment.
  • Ensure modularity and control.

Vectors

  • The seabed can be mapped using different platforms:
    • Unmanned Aerial Vehicles (UAVs).
    • Unmanned Surface Vehicles (USVs).
    • Towed Underwater Vehicles (TUVs).
    • Autonomous Underwater Vehicles (AUVs).
  • These platforms share a common robotic architecture adaptable to various vehicle types.

Acquisition

Systems Used for Underwater Photogrammetry

Equipment:

  • Primarily GoPro cameras.
  • Off-the-shelf Remotely Operated Vehicles (ROVs).

Drawbacks:

  • Lack of synchronization between cameras.
  • Limited control over image acquisition parameters.
  • No integration with the vector platform.

BlueROV and GoPro camera.

Our multi-camera synchronized underwater system

Standalone for divers.

Connected to robotic platform.

Contributions:

  • Hardware and software architecture development.
  • Low bandwidth communication protocol for the camera.
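The wire format of the low-bandwidth protocol is not detailed on this slide; purely as an illustration of the idea, here is a minimal framed command sketch in Python (the magic byte, command ids, and additive checksum are hypothetical, not the thesis's actual protocol):

```python
import struct

# Hypothetical frame layout (illustration only, not the actual protocol):
# magic (1B) | command id (1B) | payload length (1B) | payload | checksum (1B)
MAGIC = 0xA5

def encode_frame(cmd: int, payload: bytes = b"") -> bytes:
    """Pack a command into a compact frame with an additive checksum."""
    body = struct.pack("BBB", MAGIC, cmd, len(payload)) + payload
    checksum = sum(body) & 0xFF
    return body + bytes([checksum])

def decode_frame(frame: bytes):
    """Validate and unpack a frame; raise on corruption."""
    magic, cmd, length = struct.unpack("BBB", frame[:3])
    payload = frame[3:3 + length]
    if magic != MAGIC or (sum(frame[:-1]) & 0xFF) != frame[-1]:
        raise ValueError("corrupted frame")
    return cmd, payload

# e.g. a hypothetical 'trigger capture' command with a 2-byte exposure value
frame = encode_frame(0x01, struct.pack("<H", 500))
print(decode_frame(frame))  # -> (1, b'\xf4\x01')
```

A one-byte checksum keeps the per-command overhead at four bytes, which suits a link shared by several cameras.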

Features:

  • Each camera can operate:
    • Independently.
    • In a synchronized array.
    • Connected to a robotic platform.
  • Modular design:
    • Interchangeable lenses.
    • Additional modules such as screens can be attached.
    • Supports connection of multiple cameras.

Sensor architecture - Hardware

Hardware architecture of one underwater module.

Diagram of Network connection.

Sensor architecture - Software

Software architecture of one underwater camera.

Examples of acquired images

1.5 mm focal length

2.6 mm focal length

8 mm focal length
Figure 1: Acquired images.

Characteristics:

  • Rigid base linking the cameras.
    • Minimizes pose estimation uncertainties.
  • Synchronized for capturing dynamic scenes.
  • Expands field of view.

Metrics:

  • 1 ms synchronization between cameras.
  • 1 image per second.

Modular Autonomous Unmanned Surface Vehicle

Acquisition platform at sea.

Contributions:

  • Hardware electronics architecture.
  • Software port of ArduPilot to ESP32.
  • Modular platform architecture:
    • Operable by one person.
    • Dismountable for train transport.

Features:

  • Manual long-distance control.
  • Autonomous trajectory planning.
  • Remote camera triggering.
  • Global position estimation for image acquisition.

Vector architecture - Hardware

Figure 2: Architecture of the robotic platform.

Vector autonomous capabilities

Figure 3: Autonomous missions performed for image acquisitions.

Operating modes:

  • Auto: Autonomous waypoint trajectory.
  • Hold: Safety state with motors off.
  • Manual: Operator-controlled.
  • Loiter: Maintain a fixed GPS position.

Capabilities:

  • Speed matches the acquisition rate.
  • Waypoint trajectory can be dynamically updated.
  • The platform maintains its trajectory even in strong winds and large waves.

Acquisition mission

Figure 4: GPS location of images taken during Submeeting 2022 event.

  • 3 field missions.
    • Nice and Saint-Raphael locations.
  • Submeeting 2022:
    • 13 images datasets acquired.
    • 2 diving sites.
    • Acquisition from surface and divers.

Acquired images processing

Figure 5: Underwater point cloud of one diving site acquired during Submeeting 2022 event.

Processing:

  • Initial pose estimation and calibration performed with COLMAP.
  • MicMac refinement using GPS image locations and rigid constraints between cameras.

Result:

  • Synchronization between cameras is insufficient for rigid constraints.

Synchronization variation

Variation of the rigid constraint vector between stereo cameras over time.

First Conclusion

  • Multi-camera synchronized underwater system.
  • Modular autonomous unmanned surface vehicle.
  • Conducted field experiments and data acquisition with low-cost components.

Processing

This section is dedicated solely to photogrammetry processing.

Since acquiring new mission data using the previously described platform was not feasible, public datasets were utilized.

Constraints imposed by the underwater environment are studied: false matches, acquisition in a dynamic environment, and the lack of prior information about the images.

Pose Estimation

  1. Context
  2. Initial pose estimation
  3. Hierarchical pose optimization
  4. Second Conclusion

Structure from Motion

Context

Figure 6: Execution steps for a general Structure From Motion pipeline.

Different strategies for reconstruction:

  • Incremental.
  • Global.
  • Hierarchical.

Pose Estimation

Preprocessing

  • Use MicMac for feature extraction and view-graph creation.
Figure 7: Detection and Matching of Feature Points pairwise.
Figure 8: View graph of relative orientations.

Initial Pose Estimation Approach

Pipeline

Figure 9: Pipeline of the initial pose estimation.

Contribution:

  • Score image triplets in a view graph.
  • Propose an initial pose estimation built from a triplet tree.

Requires:

  • Intrinsic calibration data.
  • A view graph representing relative orientations.
  • A list of image triplets along with their matched feature points.

Triplet Graph

Figure 10: Example of Hyper View Graph. Views \(V_{1..8}\) are linked by triplets \(T_{1..8}\).

In a practical scenario involving 100 views:

  • There are approximately 10,000 triplets in total.
  • A randomly selected triplet tree contains around 98 triplets.
  • Scoring is performed on approximately 9,902 triplets.
Figure 11: Example of a random initial orientation.
Figure 12: Triplets not used for the random initial orientation are scored.
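These counts follow directly from the construction: the seed triplet orients 3 views and every later triplet orients exactly 1 more, so a tree spanning n views contains n − 2 triplets. A minimal sketch of the arithmetic:

```python
def tree_size(n_views: int) -> int:
    """Triplets in a spanning triplet tree: the seed orients 3 views,
    each further triplet orients exactly 1 more view."""
    return 1 + (n_views - 3)

n_views, n_triplets = 100, 10_000       # figures from the slide
in_tree = tree_size(n_views)            # triplets used by one random tree
scored = n_triplets - in_tree           # triplets left to score
print(in_tree, scored)                  # -> 98 9902
```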

Random Initial Orientation

Figure 13: Example of a randomly generated triplet tree.
Figure 14: Example of a different randomly generated triplet tree.

The process for generating a random triplet orientation includes:

  • Uniformly selecting a seed triplet and orienting its views.
  • Randomly traversing adjacent triplets.
    • Each traversed triplet orients an additional view.
  • This continues until all views are oriented, forming the generated random tree.
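The steps above can be sketched as follows (a toy Python sketch; the real implementation operates on MicMac's triplet structures and relative orientations):

```python
import random

def random_triplet_tree(triplets, seed=None):
    """Grow a random spanning tree of triplets: pick a seed triplet,
    then repeatedly pick a random adjacent triplet that orients exactly
    one new view, until every view is covered.
    `triplets` is a list of frozensets of 3 view ids."""
    rng = random.Random(seed)
    tree = [rng.choice(triplets)]
    oriented = set(tree[0])
    all_views = set().union(*triplets)
    while oriented != all_views:
        # adjacent = shares 2 already-oriented views, brings 1 new one
        candidates = [t for t in triplets
                      if len(t & oriented) == 2 and len(t - oriented) == 1]
        if not candidates:
            break  # view graph not connected enough from this seed
        t = rng.choice(candidates)
        tree.append(t)
        oriented |= t
    return tree, oriented

triplets = [frozenset(t) for t in
            [(1, 2, 3), (2, 3, 4), (3, 4, 5), (1, 3, 5), (2, 4, 5)]]
tree, oriented = random_triplet_tree(triplets, seed=0)
print(len(tree), sorted(oriented))  # a 3-triplet tree covering views 1..5
```

Each run with a different seed yields a different tree, which is what makes repeated scoring of the unused triplets informative.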

Triplet Scoring

Figure 15: Calculate the simultaneous reprojection error using views \(V_{3,4,8}\) of triplet \(T_6\) based on the tie points.

Triplets not included in the randomly generated tree are used for scoring. The scoring process utilizes the feature points visible in all three views of each triplet.
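As a toy illustration of this scoring (assuming a simple pinhole model with a single shared calibration; all names here are hypothetical):

```python
import math

def project(pose, K, X):
    """Project 3D point X with pose (R, t) and intrinsics K = (f, cx, cy)."""
    R, t = pose
    x = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    f, cx, cy = K
    return (f * x[0] / x[2] + cx, f * x[1] / x[2] + cy)

def triplet_score(poses, K, points3d, observations):
    """Mean reprojection error of tie points seen in all three views.
    `observations[k][v]` is the observed pixel of point k in view v."""
    errs = []
    for X, obs in zip(points3d, observations):
        for v, (u_obs, v_obs) in enumerate(obs):
            u, w = project(poses[v], K, X)
            errs.append(math.hypot(u - u_obs, w - v_obs))
    return sum(errs) / len(errs)

I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
poses = [(I, [0, 0, 0]), (I, [0.5, 0, 0]), (I, [-0.5, 0, 0])]  # toy rig
K = (800.0, 320.0, 240.0)
X = [0.0, 0.0, 4.0]
obs = [project(p, K, X) for p in poses]  # noise-free observations
score = triplet_score(poses, K, [X], [obs])
print(score)  # -> 0.0 for noise-free observations
```

A low score indicates the random tree oriented the triplet's three views consistently; a high score penalizes that tree's edges in the hypergraph.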

5-Point Virtual Feature Points

 

Figure 16: A toy example of structure approximation in 2D and 3D, where (a) shows the initial 2D structure, (b) shows the fitted ellipse, (c) the fictitious 2D structure, and (d) is the equivalent in 3D. From (Rupnik and Pierrot Deseilligny 2020).

Since the method employs reprojection error at multiple stages, the scoring and bundle adjustment steps are optimized by using 5 virtual points instead of the original feature points.

Weighted HyperGraph

Figure 17: The hyper view graph with triplets weighted after multiple iterations of random tree generation and scoring.

Best Initial Orientation

Figure 18: Minimum spanning tree for the Hyper View Graph using random tree scores.

Using a Prim-like algorithm (Prim 1957) on the hypergraph, we obtain a sub-hypergraph that serves as a spanning tree to orient the views.
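A minimal sketch of such a Prim-like growth, assuming triplet weights where lower is better (toy data; the real implementation works on MicMac's weighted hypergraph):

```python
def prim_like_triplet_tree(weighted_triplets):
    """Greedy Prim-style growth over a triplet hypergraph.
    `weighted_triplets`: dict mapping frozenset({a, b, c}) -> weight
    (lower weight = better score).  Returns the selected triplets."""
    all_views = set().union(*weighted_triplets)
    # seed with the best-scored triplet
    seed = min(weighted_triplets, key=weighted_triplets.get)
    tree, oriented = [seed], set(seed)
    while oriented != all_views:
        # frontier: triplets sharing 2 oriented views, adding 1 new one
        frontier = [t for t in weighted_triplets
                    if len(t & oriented) == 2 and len(t - oriented) == 1]
        if not frontier:
            break  # disconnected hyper view graph
        best = min(frontier, key=weighted_triplets.get)
        tree.append(best)
        oriented |= best
    return tree

weights = {frozenset({1, 2, 3}): 0.2, frozenset({2, 3, 4}): 0.9,
           frozenset({1, 3, 4}): 0.4, frozenset({2, 4, 5}): 0.1,
           frozenset({3, 4, 5}): 0.6}
tree = prim_like_triplet_tree(weights)
print([sorted(t) for t in tree])  # -> [[2, 4, 5], [3, 4, 5], [1, 2, 3]]
```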

Results - Initial Pose Orientation

Figure 19: Best Initial Orientation of our method on the drone dataset.
  • Each view is oriented based on a triplet from the optimal tree.

  • Certain triplets may be selected that incorrectly orient the remaining branches.

  • Orientations are compared to a reference by transforming coordinates and comparing pose positions.

  • The reference is generated using all available information for each dataset.

Hierarchical pose optimization

Pipeline
Figure 20: Pipeline of Hierarchical Pose estimation from previous initial pose.

Contribution:

  • Split and Merge technique.
  • Hierarchical scene decomposition based on triplets.
  • Robust merging of image blocks using spanning triplets.
  • Local bundle adjustment during merge.

Orientation Decomposition

Traversal of the tree follows a depth-first order.

(a) Example of the Best Triplet Tree generated.
(b) Corresponding Tree decomposition (Split).
(c) Merging of the deepest leaf layers.
Figure 21: Example of tree from the Initial estimation problem.

Initial Orientation of Leaf Pose

Before starting the reconstruction, each leaf is assigned its pose from the best initial orientation step.

Leaf Merge

During the traversal, sibling leaves are merged using the triplet that spans them.
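As an illustration of the merge step, here is a deliberately simplified sketch that aligns two blocks through views shared via the spanning triplet; for brevity it estimates only scale and translation, whereas the actual merge solves a full similarity transform followed by local bundle adjustment:

```python
def merge_blocks(block_a, block_b, shared):
    """Align block_b onto block_a using two cameras present in both
    (the spanning triplet supplies such shared views).  Simplified:
    scale + translation only, no rotation.
    Blocks map view id -> 3D camera centre (x, y, z)."""
    v1, v2 = shared
    dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    # scale: ratio of the shared baseline lengths
    s = dist(block_a[v1], block_a[v2]) / dist(block_b[v1], block_b[v2])
    # translation: make the first shared centre coincide after scaling
    t = [a - s * b for a, b in zip(block_a[v1], block_b[v1])]
    merged = dict(block_a)
    for v, c in block_b.items():
        merged.setdefault(v, tuple(s * x + y for x, y in zip(c, t)))
    return merged

A = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0)}
B = {1: (0.0, 0.0, 0.0), 2: (2.0, 0.0, 0.0), 3: (4.0, 0.0, 0.0)}
merged = merge_blocks(A, B, shared=(1, 2))
print(merged[3])  # -> (2.0, 0.0, 0.0)
```

Block B's internal scale is halved so its baseline matches A's, then its extra view is carried into the merged block.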

Reference Datasets

Drone

Temple

Underwater

Results

Drone Histogram

Temple Histogram

Underwater Histogram
  • To evaluate the method’s quality, result orientations are compared to the reference orientation through coordinate transformation.

  • The initial pose orientation and its hierarchical optimization are analyzed.

  • For comparison, the classical Colmap automatic pipeline is also evaluated.

Results

Drone distance to reference orientation, in meters.

Results

Temple distance to reference orientation.

Results

Underwater distance to reference orientation.

Results - Underwater

Figure 22: Underwater point cloud result. Red: the original result; blue: oriented without the worst blocks.
  • Certain branches of the tree may experience incorrect initialization.
  • For method comparison, the coordinate system is aligned with that of the reference.
  • This coordinate system change introduces a bias in result estimation.
  • Consistent output orientation relative to the reference, excluding poorly initialized blocks.

Results

Underwater histogram after filtering wrongly oriented image blocks.

Results - Point Cloud

Figure 23: Drone point cloud.
Figure 24: Temple point cloud.
Figure 25: Underwater point cloud.

Second Conclusion

  • Initial Pose Estimation Approach:
    • Use randomly oriented triplet trees to weight the view graph.
  • Hierarchical Pose Optimization with Triplet Trees:
    • Apply Divide and Conquer strategy to optimize orientation using all available information in the scene.

Conclusion & Perspectives

Conclusion

  • From data collection to scene reconstruction.
  • The first part focuses on data collection.
    • Gather information to facilitate pose estimation.
  • The second part focuses on general pose estimation.
    • Initial pose estimation and subsequent optimization.

Perspectives

Acquisition

Short-term:

  • Improve the robustness of the camera system.

Processing

Short-term:

  • Study other scoring approaches for the triplets.

Long-term:

  • Fleet of acquisition platforms.

Long-term:

  • Study graph-cut approach to resolve initialization.

Thanks for your attention

References

Beaudoin, Laurent, Loïca Avanthey, Corentin Bunel, and Charles Villard. 2022. “Automatically Guided Selection of a Set of Underwater Calibration Images.” Journal of Marine Science and Engineering 10 (6). https://doi.org/10.3390/jmse10060741.
Beaudoin, L., L. Avanthey, and C. Villard. 2020. “Porting Ardupilot To Esp32: Towards a Universal Open-Source Architecture For Agile And Easily Replicable Multi-Domains Mapping Robots.” ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B2-2020 (August): 933–39. https://doi.org/10.5194/isprs-archives-XLIII-B2-2020-933-2020.
Prim, R. C. 1957. “Shortest Connection Networks And Some Generalizations.” Bell System Technical Journal 36 (6): 1389–1401. https://doi.org/10.1002/j.1538-7305.1957.tb01515.x.
Rupnik, E., and M. Pierrot Deseilligny. 2020. “Towards Structureless Bundle Adjustment With Two- And Three-View Structure Approximation.” In ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, V-2-2020:71–78. Copernicus GmbH. https://doi.org/10.5194/isprs-annals-V-2-2020-71-2020.
Villard, C., and David B. Bussenschutt. 2021. “ESP32: New HAL Layer for Esp32 Including Hal · Pull Request #18954 · ArduPilot/Ardupilot.” GitHub. 2021. https://github.com/ArduPilot/ardupilot/pull/18954.
Villard, Charles, Ewelina Rupnik, and Marc Pierrot-Deseilligny. 2023. “Estimation Initiale de La Pose Avec Un Échantillonnage Aléatoire Mais Pondéré de Paires de Triplets Redondants d’images.” In ORASIS 2023. Carqueiranne, France: Laboratoire LIS, UMR 7020. https://hal.science/hal-04219524.

Appendix

Additional uncounted slides
