Adaptive Monocular Depth Processor: Fusing Vision Transformer with Time-of-Flight

Product Introduction

Time-of-Flight (ToF) sensing and Monocular Depth Estimation (MDE) are both advancing rapidly, yet each faces inherent limitations: ToF depth maps suffer from holes and noise on difficult surfaces, while MDE predictions lack an absolute metric scale. Our system adaptively fuses ToF precision with the depth priors of a state-of-the-art monocular transformer, and because both the phase (depth) and intensity maps come from a single infrared input, it eliminates multi-sensor calibration entirely (a sketch of the flow follows the feature list below).

  • Foundation-Model Enhancement: 
    Integrates a state-of-the-art monocular transformer model trained on large-scale datasets for generalizable depth prediction.
  • Single-Sensor Simplicity: 
    Operates entirely on infrared input, minimizing hardware complexity and alignment errors.
  • High Adaptability and Efficiency: 
    Designed to benefit from advances in both MDE models and ToF hardware, enabling scalable deployment in embedded systems.
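
A minimal sketch of this single-sensor flow, assuming the ToF depth map (from phase) and the relative ViT prediction (from the intensity map) are already computed, that zeros mark ToF holes, and that a hypothetical align_scale helper metrizes the MDE output by least squares. Names and logic are illustrative, not the product's actual API:

import numpy as np

def align_scale(vit_rel: np.ndarray, tof: np.ndarray, valid: np.ndarray) -> np.ndarray:
    """Fit scale and shift so relative ViT depth matches metric ToF depth on
    valid pixels (least squares) -- one common way to metrize MDE output."""
    A = np.stack([vit_rel[valid], np.ones(int(valid.sum()))], axis=1)
    s, t = np.linalg.lstsq(A, tof[valid], rcond=None)[0]
    return s * vit_rel + t

def fuse_single_sensor(tof_depth: np.ndarray, vit_rel: np.ndarray) -> np.ndarray:
    """Both inputs derive from one IR frame: phase -> ToF depth, intensity -> ViT."""
    valid = tof_depth > 0                          # assumption: zeros mark ToF holes
    vit_metric = align_scale(vit_rel, tof_depth, valid)
    return np.where(valid, tof_depth, vit_metric)  # fill holes with ViT depth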

Technology Specifications

  • System Architecture:
    • Core Modules: Depth fusion unit, 8-bit CNN-Transformer core, Arm host processor
  • Dual Processing Paths:
    • ToF Depth: Physical depth recovered from phase information (see the phase-to-depth sketch below)
    • ViT Depth: Monocular depth prediction via transformer on the intensity map
  • Adaptive Fusion Engine:
    • Local Compensation: Fills missing ToF values using ViT residuals
    • Delta Fusion: First-order fusion with exponential decay weights (local compensation and delta fusion are sketched together below)
    • Robustness Control: Overflow prevention and confidence tuning based on ToF hole density
    • Non-Uniformity Detection: Sobel/Laplacian gradient analysis on ViT-ToF differences (see the gradient sketch below)
    • Range-Consistent Smoothing: Structural refinement to mitigate noise and reflection artifacts
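
The phase-to-depth relation behind the ToF path is the standard one for continuous-wave ToF; the sketch below uses an illustrative 100 MHz modulation frequency, since the actual sensor frequency is not given in the spec:

import numpy as np

C = 299_792_458.0   # speed of light, m/s
F_MOD = 100e6       # modulation frequency, Hz (illustrative, not from the spec)

def phase_to_depth(phase: np.ndarray) -> np.ndarray:
    """d = c * phi / (4 * pi * f): half the round-trip distance encoded in the
    phase shift of the modulated IR signal."""
    return C * phase / (4.0 * np.pi * F_MOD)

# Phase wraps at 2*pi, so the unambiguous range here is c / (2 * F_MOD) = 1.5 m;
# e.g. phase_to_depth(np.array([np.pi])) gives ~0.75 m.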
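
The spec does not define the delta-fusion math, so the following is one plausible reading rather than the shipped algorithm: treat the ViT-to-ToF residual as a first-order correction and blend with weights that decay exponentially with distance from the nearest valid ToF pixel, so measured depth dominates near real samples and the ViT prior takes over deep inside holes. ALPHA, the median-residual fill, and the zero-hole convention are all illustrative:

import numpy as np
from scipy.ndimage import distance_transform_edt

ALPHA = 0.5  # illustrative decay rate per pixel

def delta_fuse(tof: np.ndarray, vit_metric: np.ndarray) -> np.ndarray:
    valid = tof > 0
    # Local compensation: fill ToF holes with ViT depth shifted by the median
    # ViT-to-ToF residual observed on valid pixels (a crude residual proxy).
    residual = np.median(tof[valid] - vit_metric[valid])
    base = np.where(valid, tof, vit_metric + residual)
    # Exponential decay weights: w = 1 at measured pixels, falling off with
    # distance to the nearest valid ToF sample inside holes.
    dist = distance_transform_edt(~valid)
    w = np.exp(-ALPHA * dist)
    # First-order blend: trusted/compensated depth near measurements, raw ViT
    # metric prediction deep inside large holes.
    return w * base + (1.0 - w) * vit_metric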
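
For non-uniformity detection, one way to read the Sobel/Laplacian step is sketched below: sharp gradients in the ViT-ToF difference map suggest localized artifacts (reflections, multipath) rather than a smooth scale offset, and the flagged regions are natural candidates for the range-consistent smoothing stage. The threshold is arbitrary here and would need sensor-specific tuning:

import numpy as np
from scipy.ndimage import sobel

GRAD_THRESH = 0.15  # illustrative threshold; tune per sensor

def nonuniformity_mask(tof: np.ndarray, vit_metric: np.ndarray) -> np.ndarray:
    """Flag pixels where the ViT-ToF difference changes sharply; a Laplacian
    of the same map would serve as an alternative edge detector."""
    diff = vit_metric - tof
    gx = sobel(diff, axis=1)          # horizontal gradient of the difference
    gy = sobel(diff, axis=0)          # vertical gradient
    return np.hypot(gx, gy) > GRAD_THRESH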