This repository is the official implementation of the paper "Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion".
Photogrammetry and Remote Sensing team: Massimiliano Viola, Kevin Qu, Nando Metzger, Bingxin Ke, Alexander Becker, Konrad Schindler, Anton Obukhov.
📦 Clone the repository:
git clone https://github.com/prs-eth/Marigold-DC.git
cd Marigold-DC
🐍 Create python environment:
python -m venv venv/marigold_dc
⚡ Activate the environment:
source venv/marigold_dc/bin/activate
💻 Install the dependencies:
pip install -r requirements.txt
The script densifies the input sparse depth, provided as a sparse NumPy array, and saves the result as a dense NumPy array together with a visualization. The default settings are already optimal. By default, it processes the teaser image with 100-point guidance.
🏃 Simply run as follows:
python -m marigold_dc
🧩 Customize image and sparse depth inputs as follows:
python -m marigold_dc \
--in-image <PATH_RGB_IMAGE> \
--in-depth <PATH_SPARSE_DEPTH> \
--out-depth <PATH_DENSE_DEPTH>
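To try the script on your own data, a sparse depth file has to exist first. The following is a minimal sketch of one way to build one, not part of this repository. It assumes that the sparse depth is a 2D float array saved in `.npy` format with the same height and width as the image, and that pixels without a measurement are stored as 0. The helper name `make_sparse_depth` and the synthetic dense map are hypothetical and only serve the illustration.

```python
import tempfile
from pathlib import Path

import numpy as np


def make_sparse_depth(dense_depth, num_points=100, seed=0):
    """Keep `num_points` randomly chosen valid pixels and set all others to 0.

    Assumption: a value of 0 marks a pixel without a depth measurement.
    """
    rng = np.random.default_rng(seed)
    valid = np.flatnonzero(dense_depth > 0)  # flat indices of measured pixels
    if valid.size < num_points:
        raise ValueError("not enough valid depth pixels to sample from")
    keep = rng.choice(valid, size=num_points, replace=False)
    sparse = np.zeros_like(dense_depth, dtype=np.float32)
    sparse.flat[keep] = dense_depth.flat[keep]
    return sparse


# Synthetic stand-in for a real dense depth map (values in an arbitrary range).
dense = np.random.default_rng(1).uniform(0.5, 10.0, size=(480, 640)).astype(np.float32)
sparse = make_sparse_depth(dense, num_points=100)

# Save and reload to confirm the array survives the .npy round trip.
out_path = Path(tempfile.mkdtemp()) / "sparse_depth.npy"
np.save(out_path, sparse)
reloaded = np.load(out_path)
```

The saved file could then be passed via `--in-depth`, provided the assumptions above match what the script expects.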
🛠️ Customize other settings:
- --num_inference_steps <int> specifies the number of diffusion inference steps.
- --checkpoint <path> allows overriding the base monocular depth estimation model checkpoint; can be a local path or a Hugging Face repository.
No training is required: the method operates purely at test time. Please refer to the paper for more details.
By default, the checkpoint is stored in the Hugging Face cache, which defaults to the home directory on Linux and Mac. This is often problematic in cluster environments. The HF_HOME environment variable defines the cache location and can be overridden, e.g.:
export HF_HOME=/large_volume/cache
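When launching from Python rather than a shell, it can help to check where the cache will end up. The sketch below mirrors the documented default lookup order (`HF_HOME`, then `XDG_CACHE_HOME/huggingface`, then `~/.cache/huggingface`). It does not call any Hugging Face library, and the helper name `resolve_hf_home` is hypothetical.

```python
import os
from pathlib import Path


def resolve_hf_home(env=None):
    """Return the directory the Hugging Face cache is expected to live in.

    Assumption: this re-implements the documented lookup order by hand;
    it is not an official API.
    """
    env = os.environ if env is None else env
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser()
    xdg_cache = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(xdg_cache).expanduser() / "huggingface"


# With the override from the shell example above:
overridden = resolve_hf_home({"HF_HOME": "/large_volume/cache"})

# Without any override, the cache falls back to the home directory:
default = resolve_hf_home({})
```

To apply the override from within Python, set `os.environ["HF_HOME"]` before importing any Hugging Face library, since the cache location is read when the library is imported.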
Coming soon
Depth completion upgrades sparse depth measurements into dense depth maps, guided by a conventional image. Existing methods for this highly ill-posed task operate in tightly constrained settings, and tend to struggle when applied to images outside the training domain, as well as when the available depth measurements are sparse, irregularly distributed, or of varying density. Inspired by recent advances in monocular depth estimation, we reframe depth completion as image-conditional depth map generation, guided by a sparse set of measurements. Our method, Marigold-DC, builds on a pretrained latent diffusion model (LDM) for depth estimation and injects the depth observations as test-time guidance, via an optimization scheme that runs in tandem with the iterative inference of denoising diffusion. The method exhibits excellent zero-shot generalization across a diverse range of environments and handles even extremely sparse guidance effectively. Our results suggest that contemporary monodepth priors greatly robustify depth completion: it may be better to view the task as recovering dense depth from (dense) image pixels, guided by sparse depth; rather than as inpainting (sparse) depth, guided by an image.
- 2024-12-19: ArXiv paper and demo release.
- 2024-12-18: Code release (this repository).
@misc{viola2024marigolddc,
title={Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion},
author={Massimiliano Viola and Kevin Qu and Nando Metzger and Bingxin Ke and Alexander Becker and Konrad Schindler and Anton Obukhov},
year={2024},
eprint={2412.13389},
archivePrefix={arXiv},
primaryClass={cs.CV},
}
The code of this work is licensed under the Apache License, Version 2.0 (as defined in the LICENSE).