PixelGaussian: Generalizable 3D Gaussian Reconstruction From Arbitrary Views

Xin Fei*, Wenzhao Zheng$\dagger$, Yueqi Duan, Wei Zhan, Masayoshi Tomizuka, Kurt Keutzer, Jiwen Lu

Tsinghua University, UC Berkeley

*Work done during an internship at UC Berkeley. $\dagger$ Project leader.

Most existing generalizable 3D Gaussian splatting methods (e.g., pixelSplat, MVSplat) assign a fixed number of Gaussians to each pixel, which captures local geometry inefficiently and produces overlapping Gaussians across views. In contrast, PixelGaussian dynamically adjusts the Gaussian distribution based on geometric complexity within a feed-forward framework. With comparable efficiency, PixelGaussian (trained using 2 views) generalizes to varying numbers of input views with adaptive Gaussian densities.
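
To make the cost of a fixed per-pixel allocation concrete, the following minimal sketch counts the Gaussians a pixel-aligned method would produce as the number of input views grows. The function name, the 256×256 resolution, and the default of one Gaussian per pixel are illustrative assumptions, not values taken from this repository.

```python
def pixel_aligned_gaussian_count(
    num_views: int,
    height: int,
    width: int,
    gaussians_per_pixel: int = 1,  # assumption: a fixed per-pixel budget
) -> int:
    """Count Gaussians when every pixel of every view gets a fixed budget."""
    if min(num_views, height, width, gaussians_per_pixel) < 1:
        raise ValueError("all arguments must be positive integers")
    return num_views * height * width * gaussians_per_pixel


if __name__ == "__main__":
    # The count grows linearly with the number of views, regardless of
    # how simple or complex the underlying geometry is.
    for views in (2, 4, 8):
        print(views, pixel_aligned_gaussian_count(views, 256, 256))
```

Because the total scales linearly with the view count, regions seen by several views receive redundant Gaussians, which is the overlap that an adaptive allocation aims to avoid.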

[Figure: teaser]

News

  • [2024/10/25] Code release.
  • [2024/10/25] Paper released on arXiv.

Visualizations

[Figure: visualizations]

Overview

[Figure: pipeline]

Given multi-view input images, we initialize 3D Gaussians using a lightweight image encoder and a cost volume. The Cascade Gaussian Adapter (CGA) then dynamically adapts both the distribution and the quantity of Gaussians. Leveraging local image features, the Iterative Gaussian Refiner (IGR) further refines the Gaussian representations via deformable attention. Finally, novel views are rendered from the refined 3D Gaussians using rasterization-based rendering.
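
The idea of adapting both the distribution and the quantity of Gaussians can be illustrated with a toy sketch. It assumes each Gaussian comes with a scalar score in [0, 1]: low-scoring Gaussians are pruned and high-scoring ones are split in two. The `Gaussian` class, the `adapt_gaussians` function, the thresholds, and the splitting rule are all hypothetical and are not the repository's CGA implementation.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Gaussian:
    """Toy Gaussian: a 3D center, an isotropic scale, and an opacity."""
    center: Tuple[float, float, float]
    scale: float
    opacity: float


def adapt_gaussians(
    gaussians: Sequence[Gaussian],
    scores: Sequence[float],
    split_threshold: float = 0.7,  # assumption: illustrative value
    prune_threshold: float = 0.2,  # assumption: illustrative value
) -> List[Gaussian]:
    """Prune low-score Gaussians and split high-score ones into two."""
    if len(gaussians) != len(scores):
        raise ValueError("need exactly one score per Gaussian")
    adapted: List[Gaussian] = []
    for gaussian, score in zip(gaussians, scores):
        if score < prune_threshold:
            continue  # prune: drop the Gaussian entirely
        if score > split_threshold:
            # split: two half-size copies offset along x (a toy rule)
            half = gaussian.scale / 2
            x, y, z = gaussian.center
            adapted.append(replace(gaussian, center=(x - half, y, z), scale=half))
            adapted.append(replace(gaussian, center=(x + half, y, z), scale=half))
        else:
            adapted.append(gaussian)  # keep unchanged
    return adapted
```

With scores of 0.1, 0.5, and 0.9 for three Gaussians, the first is removed, the second is kept, and the third is replaced by two smaller ones, so the density follows the score rather than the pixel grid.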

Results

[Figure: results]

PixelGaussian achieves the best performance on two representative datasets, RealEstate10K and ACID. Although trained with only 2 reference views, PixelGaussian generalizes to larger numbers of input views.

Getting Started

Installation

  1. Clone this project, create a conda virtual environment, and install the requirements listed in requirement.txt.

  2. Download the RealEstate10K and ACID datasets and the corresponding assets, following the instructions of pixelSplat.

Folder Structure

├── datasets
│   ├── re10k
│   │   ├── train
│   │   │   ├── 000000.torch
│   │   │   ├── 000001.torch
│   │   │   ├── ...
│   │   ├── test
│   │   │   ├── 000000.torch
│   │   │   ├── ...
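
Before training, it may help to confirm that the downloaded data matches the layout above. The following is a minimal sketch using only the standard library; the function name `summarize_dataset` and its defaults are hypothetical and not part of this repository.

```python
from pathlib import Path
from typing import Dict, Sequence


def summarize_dataset(
    root: str,
    name: str = "re10k",
    stages: Sequence[str] = ("train", "test"),
) -> Dict[str, int]:
    """Count the .torch chunk files in each stage directory of a dataset.

    Raises FileNotFoundError if an expected stage directory is missing.
    """
    dataset_dir = Path(root) / name
    counts: Dict[str, int] = {}
    for stage in stages:
        stage_dir = dataset_dir / stage
        if not stage_dir.is_dir():
            raise FileNotFoundError(f"missing directory: {stage_dir}")
        counts[stage] = sum(1 for _ in stage_dir.glob("*.torch"))
    return counts


if __name__ == "__main__":
    # Assumes the script is run from the project root shown above.
    print(summarize_dataset("datasets", name="re10k"))
```

A stage with zero chunks usually indicates an incomplete download or a misplaced folder.
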
  3. Run the following command to start training, choosing a batch size that fits your device:
python -m src.main +experiment=[re10k/acid] data_loader.train.batch_size=[batch_size]
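
The bracketed placeholders in the command above are presumably meant to be replaced with one experiment name (`re10k` or `acid`) and an integer batch size. As a minimal sketch under that assumption, the hypothetical helper `build_train_command` below assembles the argument list and rejects invalid values; it is not part of this repository.

```python
import sys
from typing import List

# Assumption: the two experiment names offered in the command above.
VALID_EXPERIMENTS = ("re10k", "acid")


def build_train_command(
    experiment: str,
    batch_size: int,
    python: str = sys.executable,
) -> List[str]:
    """Build the training command as an argument list for subprocess."""
    if experiment not in VALID_EXPERIMENTS:
        raise ValueError(f"experiment must be one of {VALID_EXPERIMENTS}")
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return [
        python,
        "-m",
        "src.main",
        f"+experiment={experiment}",
        f"data_loader.train.batch_size={batch_size}",
    ]


if __name__ == "__main__":
    # Print the command; to launch it, pass the list to
    # subprocess.run(cmd, check=True) from the project root.
    print(" ".join(build_train_command("re10k", batch_size=4)))
```

Passing the command as a list avoids shell quoting issues with the `+experiment=` override.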

Related Projects

Our code is based on MVSplat and GaussianFormer, and is also inspired by pixelSplat and SelfOcc.

Citation

If you find this project helpful, please consider citing the following paper:

@article{fei2024pixel,
    title={PixelGaussian: Generalizable 3D Gaussian Reconstruction From Arbitrary Views},
    author={Fei, Xin and Zheng, Wenzhao and Duan, Yueqi and Zhan, Wei and Tomizuka, Masayoshi and Keutzer, Kurt and Lu, Jiwen},
    journal={arXiv preprint arXiv:2410.18979},
    year={2024}
}
