AdaBins

Official implementation of AdaBins: Depth Estimation using Adaptive Bins

Download links

  • You can download the pretrained models "AdaBins_nyu.pt" and "AdaBins_kitti.pt" from here
  • You can download the predicted depths in 16-bit format for NYU-Depth-v2 official test set and KITTI Eigen split test set here
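The 16-bit predictions store depth as integers, so each value has to be divided by a scale factor to recover metres. Below is a minimal sketch of that round trip. It assumes a factor of 1000 (millimetres), a common convention for NYU-style depth images; the factor this repository actually uses for each dataset is not stated here, so check infer.py before relying on it. `encode_depth` and `decode_depth` are hypothetical helper names, not part of this codebase.

```python
UINT16_MAX = 65535

def encode_depth(depth_m, factor=1000.0):
    """Convert a depth in metres to a 16-bit integer (assumed scale factor)."""
    value = int(round(depth_m * factor))
    # Clamp to the range a uint16 can represent.
    return max(0, min(UINT16_MAX, value))

def decode_depth(value, factor=1000.0):
    """Convert a stored 16-bit value back to metres."""
    return value / factor

stored = encode_depth(2.5)
print(stored, decode_depth(stored))
```

With a factor of 1000 the largest representable depth is about 65.5 m, so a smaller factor would be needed to cover the 80 m range used for KITTI.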

Colab demo

Open In Colab

Inference

Move the downloaded weights to a directory of your choice (we will use "./pretrained/" here). You can then use the pretrained models like so:

from models import UnetAdaptiveBins
import model_io
from PIL import Image

MIN_DEPTH = 1e-3
MAX_DEPTH_NYU = 10
MAX_DEPTH_KITTI = 80

N_BINS = 256 

# NYU
model = UnetAdaptiveBins.build(n_bins=N_BINS, min_val=MIN_DEPTH, max_val=MAX_DEPTH_NYU)
pretrained_path = "./pretrained/AdaBins_nyu.pt"
model, _, _ = model_io.load_checkpoint(pretrained_path, model)

# example_rgb_batch is a batched RGB image tensor that you supply
bin_edges, predicted_depth = model(example_rgb_batch)

# KITTI
model = UnetAdaptiveBins.build(n_bins=N_BINS, min_val=MIN_DEPTH, max_val=MAX_DEPTH_KITTI)
pretrained_path = "./pretrained/AdaBins_kitti.pt"
model, _, _ = model_io.load_checkpoint(pretrained_path, model)

bin_edges, predicted_depth = model(example_rgb_batch)

Note that the model returns bin-edges (instead of bin-centers).
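If you need bin-centers from the raw model output, they are the midpoints of consecutive edges. A minimal sketch on a plain Python list; `bin_centers_from_edges` is a hypothetical helper name, and the toy edge values are made up for illustration:

```python
def bin_centers_from_edges(edges):
    """Return midpoints of consecutive bin edges (N + 1 edges -> N centers)."""
    return [0.5 * (lo + hi) for lo, hi in zip(edges[:-1], edges[1:])]

# Toy example with 4 edges, i.e. 3 bins.
edges = [0.001, 1.0, 4.0, 10.0]
centers = bin_centers_from_edges(edges)
print(centers)

# For a batched tensor of shape (batch, n_edges), the same idea is a slice:
# centers = 0.5 * (bin_edges[:, :-1] + bin_edges[:, 1:])
```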

Recommended way: the InferenceHelper class in infer.py provides an easy interface for inference and handles various types of inputs (with any preprocessing required). It uses test-time augmentation (horizontal flips) and also calculates bin-centers for you:

from infer import InferenceHelper
from PIL import Image

infer_helper = InferenceHelper(dataset='nyu')

# predict depth of a batched rgb tensor
example_rgb_batch = ...  
bin_centers, predicted_depth = infer_helper.predict(example_rgb_batch)

# predict depth of a single pillow image
img = Image.open("test_imgs/classroom__rgb_00283.jpg")  # any rgb pillow image
bin_centers, predicted_depth = infer_helper.predict_pil(img)

# predict depths of images stored in a directory and store the predictions in 16-bit format in a given separate dir
infer_helper.predict_dir("/path/to/input/dir/containing_only_images/", "path/to/output/dir/")
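For readers unfamiliar with horizontal-flip test-time augmentation, the general idea is to predict on the image and on its mirrored copy, mirror the second prediction back, and average the two. The sketch below shows that idea on nested Python lists; it is not the code from infer.py, whose details may differ. `predict_with_hflip` and `toy_model` are hypothetical names used only for illustration.

```python
def hflip(image):
    """Flip a 2D image (list of rows) left-to-right."""
    return [row[::-1] for row in image]

def predict_with_hflip(predict_fn, image):
    """Average predict_fn's output on the image and on its mirrored copy."""
    direct = predict_fn(image)
    # Predict on the mirrored input, then mirror the result back so pixels align.
    mirrored = hflip(predict_fn(hflip(image)))
    return [
        [0.5 * (a + b) for a, b in zip(row_d, row_m)]
        for row_d, row_m in zip(direct, mirrored)
    ]

# Hypothetical stand-in for a depth model: the output depends on the column
# index, so it is not flip-symmetric and the averaging changes the result.
def toy_model(image):
    return [[pixel + col for col, pixel in enumerate(row)] for row in image]

print(predict_with_hflip(toy_model, [[1.0, 2.0, 3.0]]))
```

A model that is already flip-symmetric gets back exactly its direct prediction; the averaging only matters where the two passes disagree.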

TODO:

  • Add instructions for Evaluation and Training.
  • Add UI demo
  • Remove unnecessary dependencies

Environment config:

  • cuda: 11.8
  • cudnn: 8.9.0
  • torch: 2.0.1+cu118
  • torchvision: 0.15.2+cu118
  • Find the CUDA version: nvcc -V
  • Find the cuDNN version: cat /usr/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
  • Find the torch and torchvision versions: python -c "import torch, torchvision; print(torch.__version__); print(torchvision.__version__)"
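The Python package versions can also be compared against the list above in one go. This is a minimal sketch using only the standard library; `installed_version` and `check_environment` are hypothetical helper names, and it assumes the installed package metadata reports the same version strings (including the `+cu118` suffix) as listed here.

```python
from importlib import metadata

# Versions listed in the environment config above.
EXPECTED = {"torch": "2.0.1+cu118", "torchvision": "0.15.2+cu118"}

def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def check_environment(expected=EXPECTED):
    """Map each package to (installed, expected, matches)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        report[package] = (found, wanted, found == wanted)
    return report

for package, (found, wanted, ok) in check_environment().items():
    status = "OK" if ok else "MISMATCH"
    print(f"{package}: installed={found} expected={wanted} [{status}]")
```

A mismatch is not necessarily a problem; it only flags that your setup differs from the configuration listed here.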