When an RGB image is input to the model, it produces a depth map showing the predicted depth of each pixel. This is similar to a person's ability to perceive perspective, distinguishing what is far away from what is nearby. In the output depth map, depth is encoded by pixel intensity: closer objects generally appear lighter, and objects farther away appear darker.
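As a rough illustration of this intensity encoding (not code from this repository), the Python sketch below turns a predicted depth array into such a grayscale map; the helper name `depth_to_grayscale` and the dummy prediction are placeholders.

```python
# Illustrative sketch: render a predicted depth array as a grayscale image
# where nearer pixels are lighter and farther pixels are darker.
import numpy as np
from PIL import Image

def depth_to_grayscale(depth: np.ndarray) -> Image.Image:
    """Convert an (H, W) depth array (e.g. in meters) to an 8-bit grayscale image."""
    d_min, d_max = depth.min(), depth.max()
    normalized = (depth - d_min) / (d_max - d_min + 1e-8)  # 0 = nearest, 1 = farthest
    intensity = (1.0 - normalized) * 255.0                 # invert so near = light, far = dark
    return Image.fromarray(intensity.astype(np.uint8), mode="L")

# Example with a dummy prediction; a real prediction would come from the model.
pred = np.random.uniform(1.0, 80.0, size=(128, 416)).astype(np.float32)
depth_to_grayscale(pred).save("depth_vis.png")
```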
For the model architecture, we chose to use a UNet model. It was first proposed for biomedical image segmentation, where convolutional networks were used to localize abnormalities in medical images. As Manikandan [2021] explained, it is capable of pixel-level localization and of distinguishing unique patterns. It has a 'U'-shaped structure, with the first half acting as an encoder and the second half as a decoder. Purkayastha [2020] also described that, "[t]his architecture consists of three sections: The contraction, The bottleneck, and the expansion section".
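To make the contraction-bottleneck-expansion layout concrete, here is a minimal, illustrative sketch of a U-shaped network with skip connections, written assuming PyTorch. It is not the actual model defined in src/UNet_model.py, and all layer sizes are placeholder choices.

```python
# Minimal illustrative UNet-style model: encoder (contraction), bottleneck,
# decoder (expansion) with skip connections, ending in a 1-channel depth map.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(3, 32)          # contraction
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)  # bottleneck
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)        # expansion
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)
        self.head = nn.Conv2d(32, 1, 1)        # single-channel depth output

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return self.head(d1)

# depth = TinyUNet()(torch.randn(1, 3, 128, 416))  # -> (1, 1, 128, 416)
```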
In this demo, the predicted result is a black image. This is due to the limited time we had to train the model; with more training time, the results would improve.
zoom_0.mp4
- .github/workflows/run_all_tests.yml: runs all test cases in the test directory
- dataset: directory containing the dataset
  - dataset/depth_maps: annotated depth maps corresponding to the images in the images directory (used during training)
  - dataset/images: raw images
- src: directory containing our code
  - src/UNet_model.py: defines and builds the UNet model
  - src/data_preprocessor.py: preprocesses data from the KITTI dataset (a rough sketch of the KITTI depth-map format follows this list)
  - src/environment.yml: a YAML file listing the packages needed to set up the environment
  - src/model_summary.py: generates a summary of the model
  - src/train.py: trains and tests the model
- test: directory containing test cases for the code in the src directory
- requirements.txt: lists all installed packages and their version numbers
- test-requirements.txt: lists the packages required to run the tests
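As background for the kind of work the preprocessing step covers, KITTI distributes its annotated depth maps as 16-bit PNGs in which depth in meters is the pixel value divided by 256 and a value of 0 marks pixels without ground truth. The sketch below is illustrative only and is not the repository's data_preprocessor.py.

```python
# Illustrative sketch of loading a KITTI annotated depth map.
# KITTI stores depth as 16-bit PNGs: depth_in_meters = pixel_value / 256,
# with 0 meaning "no ground truth at this pixel".
import numpy as np
from PIL import Image

def load_kitti_depth(path: str) -> np.ndarray:
    raw = np.array(Image.open(path), dtype=np.float32)
    depth = raw / 256.0        # convert raw 16-bit values to meters
    depth[raw == 0] = np.nan   # mark pixels with no annotation
    return depth

# depth = load_kitti_depth("dataset/depth_maps/<frame>.png")
```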
- Make sure you are in the 2021-Monocular-Depth-Estimation directory
cd 2021-Monocular-Depth-Estimation
- To install the environment, run this command:
conda env create -f src/environment.yml
- Activate the new environment with this command:
conda activate monocular-depth-estimation
- If you need to deactivate the environment, run this command:
conda deactivate
To download the raw data: Click here to download the raw data download script, then run the .sh file
To download the depth maps: Click here to download the annotated depth maps dataset
- Open train.py
- Go to main and set the mode variable to "train" (a rough sketch of this mode follows the list)
- Run train.py
- Trained weights will be saved in a folder named "weights"
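The sketch below is a hedged guess at what the "train" mode might look like, assuming a PyTorch-style training loop; the function and variable names (run_training, build of the loader, loss choice) are placeholders and do not reflect the actual contents of src/train.py.

```python
# Hedged sketch of a "train" mode: iterate over KITTI batches, optimize the
# model, and save trained weights into the "weights" folder described above.
import os
import torch

def run_training(model, loader, epochs=10, lr=1e-4, weights_dir="weights"):
    os.makedirs(weights_dir, exist_ok=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.L1Loss()
    for epoch in range(epochs):
        for images, depth_maps in loader:            # batches of (RGB, depth map)
            optimizer.zero_grad()
            loss = loss_fn(model(images), depth_maps)
            loss.backward()
            optimizer.step()
        torch.save(model.state_dict(),
                   os.path.join(weights_dir, f"epoch_{epoch}.pth"))
```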
- Open train.py
- Go to main and set the mode variable to "test" (a rough sketch of this mode follows the list)
- Run train.py
- Predicted results will be saved in a .pth file
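Similarly, here is a hedged sketch of what the "test" mode might look like: load trained weights, run inference, and save the predictions. The names and file paths are placeholders rather than the repository's actual API.

```python
# Hedged sketch of a "test" mode: restore saved weights, run the model on the
# test loader without gradients, and dump the predicted depth maps to one file.
import torch

def run_testing(model, loader, weights_path="weights/epoch_9.pth",
                out_path="predictions.pth"):
    model.load_state_dict(torch.load(weights_path))
    model.eval()
    predictions = []
    with torch.no_grad():
        for images, _ in loader:
            predictions.append(model(images).cpu())
    torch.save(torch.cat(predictions), out_path)  # saved predicted depth maps
```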
[1] Monimoy Purkayastha. 2020. Monocular Depth Estimation and Background/Foreground Extraction using UNet Deep Learning Architecture. (July 2020). Retrieved December 10, 2021 from https://medium.com/analytics-vidhya/monocular-depth-estimation-and-background-foreground-extraction-using-unet-deep-learning-bdfd19909aca
[2] Bala Manikandan. 2021. Monocular depth estimation using U-Net. (July 2021). Retrieved December 10, 2021 from https://medium.com/mlearning-ai/monocular-depth-estimation-using-u-net-6f149fc34077
Code References: