This is the repository for the first module of the AIPI-540 course of the MEng in Artificial Intelligence at Duke University. The module is focused on applying Computer Vision techniques to solve a problem.
Prior to this project, there was no way to classify images of the snake species found in North Carolina. Previous efforts produced large datasets of snake images from all over the world, but they did not include some of the species found in North Carolina. This project contributes a new dataset of images of the 38 most common snake species in North Carolina and trains a model to classify them.
This project requires the dependencies listed in "requirements.txt". GPU support is recommended for faster training but is optional, since the saved weights in the repository can be used instead. Some of the dependencies are:
- Keras
- PyTorch
- scikit-learn
- Matplotlib
- Pillow
- Streamlit
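As a quick sanity check after installation, the sketch below reports which of the listed packages cannot be found and whether a CUDA GPU is visible to PyTorch. The import names in `IMPORT_NAMES` are the usual ones for these packages, and both helper functions are hypothetical; they are not part of this repository.

```python
from importlib.util import find_spec

# Usual import names for the packages listed above (assumption: the
# repository uses the standard PyPI distributions of each package).
IMPORT_NAMES = {
    "Keras": "keras",
    "PyTorch": "torch",
    "scikit-learn": "sklearn",
    "Matplotlib": "matplotlib",
    "Pillow": "PIL",
    "Streamlit": "streamlit",
}


def missing_dependencies():
    """Return the listed packages that cannot be found in this environment."""
    return [
        package
        for package, module in IMPORT_NAMES.items()
        if find_spec(module) is None
    ]


def gpu_available():
    """Return True if PyTorch is installed and can see a CUDA device."""
    if find_spec("torch") is None:
        return False
    import torch  # imported lazily so the check works without PyTorch

    return torch.cuda.is_available()
```

Calling `missing_dependencies()` should return an empty list once `requirements.txt` has been installed. `gpu_available()` returning `False` only means training will be slower; inference with the saved weights does not need a GPU.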
The inference code in this repository is ready to be run. There is no need to perform data sourcing or training. However, all the code that was used to produce the dataset and train the model is available in the repository.
Below are two instruction flows:
- Source and pre-process data and train model
- Run application in inference-only mode
To source and pre-process the data and train the model:
- Clone the repository.
- Install the dependencies (`pip install -r requirements.txt`).
- (Optional) Run the `scripts/data_sourcing/make_data.py` script to generate the images dataset.
- (Optional) Run the `scripts/segmentation/segment_preparation.py` script to install the Grounding DINO dependencies.
- (Optional) Run the `scripts/segmentation/segment_process.py` script to create the image dataset.
- Perform transfer learning to train and save the model weights by running the `scripts/training/resnet50_torch.py` script.
- Voilà! You have a trained model ready to be used!
Note: the optional steps above are not necessary if you directly download the image dataset from the links provided in the "Image Dataset Links" section below.
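For readers unfamiliar with transfer learning, the sketch below shows one common way to adapt a pretrained ResNet-50 to the 38 species using torchvision. It is an illustration only and not necessarily what `scripts/training/resnet50_torch.py` does: the function name, the choice to freeze the backbone, and the use of the default ImageNet weights are all assumptions. It requires torchvision 0.13 or later.

```python
NUM_SPECIES = 38  # number of snake species in the dataset, per this README


def build_classifier(num_classes=NUM_SPECIES, pretrained=True, freeze_backbone=True):
    """Build a ResNet-50 whose final layer predicts `num_classes` classes.

    Hypothetical helper: the repository's training script may differ.
    """
    # Imported here so the sketch can be loaded on machines without PyTorch.
    from torch import nn
    from torchvision import models

    # Start from ImageNet weights unless asked for a randomly initialised model.
    weights = models.ResNet50_Weights.DEFAULT if pretrained else None
    model = models.resnet50(weights=weights)

    if freeze_backbone:
        # Keep the pretrained feature extractor fixed during training.
        for param in model.parameters():
            param.requires_grad = False

    # Replace the 1000-class ImageNet head with a new, trainable layer.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model
```

With the backbone frozen, only the new `fc` layer is updated during training, which is typically much faster than fine-tuning the whole network and is a reasonable starting point for a small dataset.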
To run the application in inference-only mode:
- (Optional if you already ran the previous flow) Clone the repository.
- (Optional if you already ran the previous flow) Install the dependencies (`pip install -r requirements.txt`).
- (Optional if you already ran the previous flow) Run the `scripts/segmentation/segment_preparation.py` script to install the Grounding DINO dependencies.
- Run `streamlit run main.py` to start the Streamlit application.
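The two flows above can also be written down as data, which makes it explicit which steps may be skipped. The following is a minimal sketch that assumes every script is run from the repository root with no arguments; `select_commands` and `run_flow` are hypothetical helpers, not part of the repository.

```python
import subprocess
import sys

# Each step is (command, skippable). Commands and paths come from the
# steps above; running the scripts without arguments is an assumption.
TRAINING_FLOW = [
    (["pip", "install", "-r", "requirements.txt"], False),
    # Skippable if the image dataset is downloaded directly instead.
    ([sys.executable, "scripts/data_sourcing/make_data.py"], True),
    ([sys.executable, "scripts/segmentation/segment_preparation.py"], True),
    ([sys.executable, "scripts/segmentation/segment_process.py"], True),
    ([sys.executable, "scripts/training/resnet50_torch.py"], False),
]

INFERENCE_FLOW = [
    # Skippable if the training flow was already completed.
    (["pip", "install", "-r", "requirements.txt"], True),
    ([sys.executable, "scripts/segmentation/segment_preparation.py"], True),
    (["streamlit", "run", "main.py"], False),
]


def select_commands(flow, include_skippable):
    """Return the commands of a flow, with or without the skippable steps."""
    return [cmd for cmd, skippable in flow if include_skippable or not skippable]


def run_flow(flow, include_skippable=False):
    """Run each selected command in order, stopping at the first failure."""
    for cmd in select_commands(flow, include_skippable):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

On a fresh clone, `run_flow(INFERENCE_FLOW, include_skippable=True)` would install the dependencies before launching the app; after a completed training flow, `run_flow(INFERENCE_FLOW)` only starts Streamlit.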