~Please note this is only a beta release at this stage~
Unsupervised CNN for Single View Depth Estimation is a neural network that can predict depth from a single RGB image. It achieves this by training the network, analogously to an autoencoder, on pairs of images (source and target) with a small, known camera motion between the two.
The convolutional encoder is trained to predict the depth map of the source image. To do this, an inverse warp of the target image is generated using the predicted depth and the known inter-view displacement, which reconstructs the source image; the photometric error of this reconstruction is the reconstruction loss for the encoder.
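The reconstruction loss described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the training code from the paper or this repository: it assumes grayscale images, a rectified image pair with purely horizontal motion, disparity computed as `focal_length * baseline / depth`, and a particular sign convention for the shift. The function names `warp_horizontal` and `photometric_loss` are hypothetical.

```python
import numpy as np


def warp_horizontal(target, disparity):
    """Inverse-warp `target` towards the source view along image rows.

    Each source pixel (y, x) samples the target at (y, x + disparity[y, x])
    with linear interpolation. Samples outside the image are clamped to the
    border. The sign of the shift is an assumption of this sketch.
    """
    height, width = target.shape
    xs = np.arange(width, dtype=np.float64)[None, :] + disparity
    xs = np.clip(xs, 0.0, width - 1.0)
    x0 = np.floor(xs).astype(int)          # left neighbour
    x1 = np.minimum(x0 + 1, width - 1)     # right neighbour
    frac = xs - x0                         # interpolation weight
    rows = np.arange(height)[:, None]
    return (1.0 - frac) * target[rows, x0] + frac * target[rows, x1]


def photometric_loss(source, target, depth, focal_length, baseline):
    """Mean squared error between the source and the warped target."""
    disparity = focal_length * baseline / depth
    reconstruction = warp_horizontal(target, disparity)
    return float(np.mean((reconstruction - source) ** 2))


# Toy example: the source is the target shifted by two pixels.
target = np.tile(np.arange(8, dtype=np.float64), (4, 1))
source = np.minimum(target + 2.0, 7.0)
good_depth = np.full((4, 8), 1.0)   # disparity 2 -> perfect reconstruction
bad_depth = np.full((4, 8), 2.0)    # disparity 1 -> non-zero error
loss_good = photometric_loss(source, target, good_depth, 2.0, 1.0)
loss_bad = photometric_loss(source, target, bad_depth, 2.0, 1.0)
```

In this toy case the correct depth gives a lower loss than the wrong one, which is the signal the encoder is trained on. A real training setup would need a differentiable version of the warp inside the deep learning framework.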
The paper further details the unsupervised deep learning framework developed to predict scene depth from a single image, which requires neither a pre-training stage nor annotated ground-truth depth.
This repository contains a PyTorch (and PyCaffe) open-source implementation of Unsupervised CNN for Single View Depth Estimation, with official weights converted from Caffe. The package currently provides an inference implementation that can be deployed. We are working on providing training and evaluation implementations in PyTorch as well. Dependencies of both the PyTorch and PyCaffe packages can be easily installed using conda or, if you prefer a more manual approach, via pip. The PyTorch version of the package can also be installed from pip.
Our code is free to use, and licensed under BSD-3. We simply ask that you cite both our work and Ravi's if you use Single View Depth Estimation in your own research.
This repository updates Ravi Garg's open-source unsupervised CNN for Single View Depth Estimation work by providing the network's implementation in PyTorch and PyCaffe.
- Original paper: Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue
- The original Caffe 1 implementation: Realtime Unsupervised Depth Estimation from an Image.
This repository is split into three sections:
- a PyTorch implementation to run inference (along with the converted architecture and weights from the original implementation),
- a PyCaffe implementation of the original Caffe network (to allow for non-Matlab inference), and
- the tool used for the Caffe to PyTorch conversion of the network architecture and weights.
The PyTorch version includes the converted Caffe to PyTorch network (architecture and weights) along with a sample script to run inference and some sample images. It also has a separate conda environment file that can be used to create a virtual environment to run the PyTorch version.
The PyCaffe version consists of a PyCaffe implementation of the original MatCaffe sample inference script along with the trained Caffe network. It also contains some sample images. Finally, it has a conda environment definition file that can be used to quickly create the virtual environment required to use it.
The conversion tool section is essentially the code we used to convert and validate Ravi's network from Caffe to PyTorch, along with the sample images. We used a modified version of an external tool called pytorch-caffe by marvis. The modified tool is also part of this sub-folder.
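As an illustration of what validating a converted network can look like, the sketch below compares the outputs of two implementations on the same input and reports the largest absolute difference. It is not the validation code shipped in this repository; the helper name `outputs_match` and the tolerance value are assumptions made for the example.

```python
import numpy as np


def outputs_match(reference, converted, atol=1e-4):
    """Return (ok, max_abs_diff) for two non-empty network outputs.

    `reference` would be the output of the original network and `converted`
    the output of the converted one for the same input. The tolerance is an
    arbitrary choice for this sketch.
    """
    reference = np.asarray(reference, dtype=np.float64)
    converted = np.asarray(converted, dtype=np.float64)
    if reference.shape != converted.shape:
        # Differently shaped outputs can never be considered equal.
        return False, float("inf")
    max_abs_diff = float(np.max(np.abs(reference - converted)))
    return max_abs_diff <= atol, max_abs_diff


# Toy outputs standing in for two real forward passes.
ok, diff = outputs_match([[1.0, 2.0]], [[1.0, 2.00001]])
mismatch, _ = outputs_match([1.0, 2.0], [1.0, 2.5])
wrong_shape, _ = outputs_match([1.0, 2.0], [[1.0, 2.0]])
```

Small differences are expected between frameworks because of floating-point ordering, so an exact equality check would usually be too strict.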
Note: we assume you are using an Ubuntu Linux system.
We offer three methods to install our packages:
- Through our Conda package: a single command installs everything, including system dependencies (recommended)
- Through our pip package: a single command installs the package and its Python dependencies; you take care of system dependencies
- Directly from source: allows easy editing and extension of our code, but you take care of building and all dependencies
The only requirement is that you have Conda installed on your system, and are inside a Conda environment. From there, simply run:
u@pc:~$ conda install single_view_depth
You can see a list of our Conda dependencies in ./pytorch-env.yml.
Before installing via pip, you must have the following system dependencies installed:
- TODO the rest of this list
Then Single View Depth Estimator and all its Python dependencies can be installed via:
u@pc:~$ pip install single_view_depth
Installing from source is very similar to the pip method above, because Single View Depth Estimator contains only Python code. Simply clone the repository, enter the directory, and install via pip:
u@pc:~$ pip install -e .
Note: the editable mode flag (-e) is optional, but allows you to immediately use any changes you make to the code in your local Python ecosystem.
Once installed, Single View Depth Estimator can be used directly from the command line using Python.
TODO: add details for quickstart scripts that run directly from the command line
Once installed, our PyTorch package can be used like any other Python package. It consists of a RunSingleViewDepthExample class that currently has one main function, for inference and deployment. We are working on adding training and evaluation. Below are some examples to help get you started.
If you chose to build from source, the example inference script run_depth_estimator_example.py can be run from the command line to produce sample output.
The code snippet below shows how to use Single View Depth Estimator directly in your own projects.
import cv2
from unsupervised_single_view_depth import UnsupervisedSingleViewDepth
# Initialise the network with the pre-trained KITTI model available in the repository
sv = UnsupervisedSingleViewDepth()
# Load a previous snapshot from a self-trained network
sv = UnsupervisedSingleViewDepth(load_snapshot='/path/to/snapshot/file.pth')
# By default, the inference code will run on GPU, device 0. To change the GPU device to use:
sv = UnsupervisedSingleViewDepth(gpu_id=1)
# If you don't want to use the GPU even if one is available, set the GPU flag to False:
sv = UnsupervisedSingleViewDepth(use_gpu_if_available=False)
# Get a predicted depth map as a NumPy image, given an input NumPy image
my_image = cv2.imread("</path/to/image>")
depth_image = sv.predict(image=my_image)
# If you want to provide a path to the image instead
depth_image = sv.predict(image_path="</path/to/input_image>")
# Save a depth image to file, given an image from another image file
# If you would like to see the output prediction and input images plotted
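A predicted depth map usually needs to be rescaled before it can be viewed or written as a standard 8-bit image. The sketch below shows one way to do this. It assumes the prediction is a 2D floating-point NumPy array, which is not confirmed by this README, and `depth_to_uint8` is a hypothetical helper that is not part of the package.

```python
import numpy as np


def depth_to_uint8(depth):
    """Linearly rescale a depth map to the 0-255 range for display.

    Non-finite values (NaN or inf) are mapped to 0. A constant or fully
    non-finite input returns an all-zero image.
    """
    depth = np.asarray(depth, dtype=np.float64)
    finite = np.isfinite(depth)
    if not finite.any():
        return np.zeros(depth.shape, dtype=np.uint8)
    lo = depth[finite].min()
    hi = depth[finite].max()
    if hi == lo:
        return np.zeros(depth.shape, dtype=np.uint8)
    # Replace non-finite entries first so the arithmetic stays well defined.
    cleaned = np.where(finite, depth, lo)
    scaled = (cleaned - lo) / (hi - lo)
    return np.round(scaled * 255.0).astype(np.uint8)


# Toy depth map standing in for a real prediction.
example_depth = np.array([[1.0, 2.0], [3.0, 5.0]])
example_image = depth_to_uint8(example_depth)
flat_image = depth_to_uint8(np.full((2, 3), 7.0))
```

The resulting array can then be written to disk with `cv2.imwrite`, or shown with any image viewer that accepts 8-bit grayscale data.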
If using Single View Depth Estimation in your work, please cite our original ECCV paper:
title={Unsupervised CNN for single view depth estimation: Geometry to the rescue},
author={Garg, Ravi and Kumar, BG Vijay and Carneiro, Gustavo and Reid, Ian},
booktitle={European Conference on Computer Vision},