This repository contains the code for fine-tuning nlpconnect/vit-gpt2-image-captioning on the FlowerEvolver-Dataset.
You can use either the Jupyter notebook or the FlowerCaptioner.py script.
git clone https://huggingface.co/datasets/cristianglezm/FlowerEvolver-Dataset "data"
import torch
from transformers import pipeline
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
FlowerCaptioner = pipeline("image-to-text", model="cristianglezm/ViT-GPT2-FlowerCaptioner", device=device)
FlowerCaptioner(["flower1.png"])
# A flower with 12 petals in a smooth gradient of green and blue.
# The center is green with black accents. The stem is long and green.
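When given a list of images, the image-to-text pipeline returns one list of `{"generated_text": ...}` dicts per input. A small helper (hypothetical, not part of this repo) can flatten that into plain caption strings:

```python
def extract_captions(outputs):
    """Flatten image-to-text pipeline output into a list of caption strings.

    For a list of inputs the pipeline returns a list of lists of dicts,
    each dict carrying the caption under "generated_text".
    """
    captions = []
    for result in outputs:
        # A single-image call may return a flat list of dicts instead.
        if isinstance(result, dict):
            captions.append(result["generated_text"])
        else:
            captions.extend(item["generated_text"] for item in result)
    return captions
```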
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python FlowerCaptioner.py -i <flower.png or a folder containing flower images>
python FlowerCaptioner.py -t -m <model_name>
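Since -i accepts either a single image or a folder, the script has to resolve its argument into a list of image paths. A minimal sketch of that resolution (the function name and the extension list are assumptions, not the script's actual code):

```python
from pathlib import Path

# Assumed set of supported image extensions.
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}

def resolve_inputs(path_str):
    """Return a sorted list of image paths for a file or folder argument."""
    path = Path(path_str)
    if path.is_dir():
        # Keep only files whose extension looks like an image.
        return sorted(p for p in path.iterdir() if p.suffix.lower() in IMAGE_EXTS)
    return [path]
```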
python convert.py --quantize --model_id "./models/FlowerCaptioner" --task "image-to-text-with-past" --opset 18
convert.py is distributed under the xenova/transformers.js license.