# PointNet with CLIP Features for Multi-View Object Classification
This repository contains code for training a multi-view object classification model using PointNet with CLIP (Contrastive Language-Image Pretraining) features. The model combines 2D and 3D features to perform classification on 3D point cloud data.
## Prerequisites
- Python 3.6 or later
- PyTorch 1.7 or later
- CUDA-capable GPU (optional but recommended for faster training)
- Other dependencies listed in `requirements.txt`
## Installation
1. Clone this repository:
```sh
git clone https://github.com/yourusername/pointnet-clip-object-classification.git
cd pointnet-clip-object-classification
-
Install the required packages using pip:
pip install -r requirements.txt
-
Train the PointNet-CLIP model:
python main.py --batch_size 32 --num_points 1024 --nepoch 250 --outf results
Adjust the batch size, number of points, number of epochs, and output folder as needed.
-
Evaluate the trained model:
python main.py --batch_size 32 --num_points 1024 --model results/3D_model_X.pth --outf results --feature_transform
Replace
X
with the epoch number of the trained model. -
View results:
Training and testing progress, as well as accuracy, will be displayed during training and saved to a
log.txt
file in the output folder.
- This code builds upon the PointNet architecture and CLIP model.
- Realistic_Projection module is used for projecting point clouds to 2D images.
- The dataset is assumed to be organized in the specified structure under
dataset_path
.
If you use this code in your research, please consider citing:
@article{YourArticleCitation,
title={Title of Your Article},
author={Author Names},
journal={Journal Name},
year={Year},
volume={Volume},
number={Number},
pages={Page Range},
doi={DOI},
}
This project is licensed under the MIT License - see the LICENSE file for details.
Please replace `yourusername` in the repository URL with your actual GitHub username and adjust any paths or details according to your use case.