Add image2video microservice (Stable Video Diffusion) #465

Merged
merged 18 commits into from
Sep 23, 2024
Changes from 10 commits
3 changes: 3 additions & 0 deletions comps/__init__.py
@@ -20,6 +20,9 @@
    GraphDoc,
    LVMDoc,
    LVMVideoDoc,
    ImagePath,
    ImagesPath,
    VideoPath,
    ImageDoc,
    TextImageDoc,
    MultimodalDoc,
2 changes: 2 additions & 0 deletions comps/cores/mega/constants.py
@@ -29,6 +29,8 @@ class ServiceType(Enum):
    LVM = 12
    KNOWLEDGE_GRAPH = 13
    WEB_RETRIEVER = 14
    IMAGE2VIDEO = 15
    TEXT2IMAGE = 16


class MegaServiceEndpoint(Enum):
12 changes: 12 additions & 0 deletions comps/cores/proto/docarray.py
@@ -217,3 +217,15 @@ class LVMVideoDoc(BaseDoc):
    chunk_duration: float
    prompt: str
    max_new_tokens: conint(ge=0, le=1024) = 512


class ImagePath(BaseDoc):
    image_path: str


class ImagesPath(BaseDoc):
    images_path: DocList[ImagePath]


class VideoPath(BaseDoc):
    video_path: str
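To make the nesting of these three documents concrete, here is a minimal sketch that mirrors them with standard-library dataclasses. The dataclasses are stand-ins for illustration only (an assumption, not the docarray `BaseDoc`/`DocList` classes from the diff); only the field names are taken from the PR.

```python
from dataclasses import asdict, dataclass, field
from typing import List


# Stand-ins for the docarray documents above; only the field names match the PR.
@dataclass
class ImagePath:
    image_path: str


@dataclass
class ImagesPath:
    images_path: List[ImagePath] = field(default_factory=list)


@dataclass
class VideoPath:
    video_path: str


request = ImagesPath(images_path=[ImagePath("rocket.png"), ImagePath("balloon.png")])
print(asdict(request))
# {'images_path': [{'image_path': 'rocket.png'}, {'image_path': 'balloon.png'}]}
```

The printed dictionary has the same shape as the JSON body the README's test request sends: a list of objects, each with a single `image_path` key.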
18 changes: 18 additions & 0 deletions comps/image2video/Dockerfile
@@ -0,0 +1,18 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

FROM python:3.11-slim

# Set environment variables
ENV LANG=en_US.UTF-8

COPY comps /home/comps

RUN pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir -r /home/comps/image2video/requirements.txt

ENV PYTHONPATH=$PYTHONPATH:/home

WORKDIR /home/comps/image2video

ENTRYPOINT ["python", "image2video.py"]
66 changes: 66 additions & 0 deletions comps/image2video/README.md
@@ -0,0 +1,66 @@
# Image-to-Video Microservice

Image-to-Video is the task of generating a video conditioned on one or more provided images. This microservice supports the image-to-video task using the Stable Video Diffusion (SVD) model.

# 🚀1. Start Microservice with Python (Option 1)

## 1.1 Install Requirements

```bash
pip install -r requirements.txt
pip install -r svd/requirements.txt
```

## 1.2 Start SVD Service

```bash
# Start SVD service
cd svd/
python svd_server.py
```

## 1.3 Start Image-to-Video Microservice

```bash
cd ..
# Start the OPEA Microservice
python image2video.py
```

# 🚀2. Start Microservice with Docker (Option 2)

## 2.1 Build Images

### 2.1.1 SVD Server Image

```bash
# From comps/image2video, move to the repository root (the build context)
cd ../..
docker build -t opea/svd:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/image2video/svd/Dockerfile .
```

### 2.1.2 Image-to-Video Service Image

```bash
docker build -t opea/image2video:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/image2video/Dockerfile .
```

## 2.2 Start SVD and Image-to-Video Service

### 2.2.1 Start SVD server

```bash
docker run --ipc=host -p 9368:9368 -e http_proxy=$http_proxy -e https_proxy=$https_proxy opea/svd:latest
```

### 2.2.2 Start Image-to-Video service

```bash
ip_address=$(hostname -I | awk '{print $1}')
docker run -p 9369:9369 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e SVD_ENDPOINT=http://$ip_address:9368 opea/image2video:latest
```

### 2.2.3 Test

```bash
http_proxy="" curl http://localhost:9369/v1/image2video -XPOST -d '{"images_path":[{"image_path":"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png"}]}' -H 'Content-Type: application/json'
```
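The same test request can be sent from Python. The sketch below uses only the standard library; the helper names `build_image2video_request` and `post_image2video` are hypothetical, and the default base URL assumes the service is published on `localhost:9369` as in section 2.2.2.

```python
import json
import urllib.request


def build_image2video_request(image_urls, base_url="http://localhost:9369"):
    """Build the POST request that the curl command above sends (hypothetical helper)."""
    body = json.dumps({"images_path": [{"image_path": u} for u in image_urls]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/image2video",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_image2video(image_urls, base_url="http://localhost:9369"):
    """Send the request and return the decoded JSON response (hypothetical helper)."""
    # An empty ProxyHandler bypasses configured proxies, like http_proxy="" in the curl call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(build_image2video_request(image_urls, base_url)) as resp:
        return json.loads(resp.read())
```

Once both containers are running, calling `post_image2video([...])` with the rocket image URL from the curl example should return a JSON object containing a `video_path` key.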
2 changes: 2 additions & 0 deletions comps/image2video/__init__.py
@@ -0,0 +1,2 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
47 changes: 47 additions & 0 deletions comps/image2video/image2video.py
@@ -0,0 +1,47 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0


import json
import os
import time

import requests

from comps import (
    ImagesPath,
    ServiceType,
    VideoPath,
    opea_microservices,
    register_microservice,
    register_statistics,
    statistics_dict,
)


@register_microservice(
    name="opea_service@image2video",
    service_type=ServiceType.IMAGE2VIDEO,
    endpoint="/v1/image2video",
    host="0.0.0.0",
    port=9369,
    input_datatype=ImagesPath,
    output_datatype=VideoPath,
)
@register_statistics(names=["opea_service@image2video"])
async def image2video(input: ImagesPath):
    start = time.time()
    images_path = [img.image_path for img in input.images_path]
    inputs = {"images_path": images_path}
    video_path = requests.post(url=f"{svd_endpoint}/generate", data=json.dumps(inputs), proxies={"http": None}).json()[
        "video_path"
    ]

    statistics_dict["opea_service@image2video"].append_latency(time.time() - start, None)
    return VideoPath(video_path=video_path)


if __name__ == "__main__":
    svd_endpoint = os.getenv("SVD_ENDPOINT", "http://localhost:9368")
    print("Image2video server started.")
    opea_microservices["opea_service@image2video"].start()
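The wrapper above does two things before calling the SVD server: it flattens the nested request into a plain list of strings, and it resolves the target URL from `SVD_ENDPOINT`. A minimal standard-library sketch of those two steps follows; `to_svd_request` and `svd_generate_url` are hypothetical helper names, and plain dictionaries stand in for the `ImagesPath` document.

```python
import json
import os


def to_svd_request(body: dict) -> str:
    """Flatten {"images_path": [{"image_path": ...}, ...]} into the JSON the SVD server reads."""
    images_path = [img["image_path"] for img in body["images_path"]]
    return json.dumps({"images_path": images_path})


def svd_generate_url() -> str:
    """Resolve the /generate URL, using the same default endpoint as image2video.py."""
    endpoint = os.getenv("SVD_ENDPOINT", "http://localhost:9368")
    return f"{endpoint}/generate"


incoming = {"images_path": [{"image_path": "rocket.png"}]}
print(to_svd_request(incoming))
# {"images_path": ["rocket.png"]}
```

Note that in the PR itself `svd_endpoint` is assigned only inside the `if __name__ == "__main__":` block, so the handler relies on the file being run as a script.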
11 changes: 11 additions & 0 deletions comps/image2video/requirements.txt
@@ -0,0 +1,11 @@
datasets
docarray[full]
fastapi
opentelemetry-api
opentelemetry-exporter-otlp
opentelemetry-sdk
prometheus-fastapi-instrumentator
pydantic==2.7.2
pydub
shortuuid
uvicorn
22 changes: 22 additions & 0 deletions comps/image2video/svd/Dockerfile
@@ -0,0 +1,22 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

FROM python:3.11-slim

# Set environment variables
ENV LANG=en_US.UTF-8

ARG ARCH="cpu"

COPY comps /home/comps

RUN apt-get update && apt-get install python3-opencv -y && \
pip install --no-cache-dir --upgrade pip && \
if [ ${ARCH} = "cpu" ]; then pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu; fi && \
pip install --no-cache-dir -r /home/comps/image2video/svd/requirements.txt

ENV PYTHONPATH=$PYTHONPATH:/home

WORKDIR /home/comps/image2video/svd

ENTRYPOINT ["python", "svd_server.py"]
7 changes: 7 additions & 0 deletions comps/image2video/svd/requirements.txt
@@ -0,0 +1,7 @@
accelerate
diffusers
fastapi
opencv-python
torch
transformers
uvicorn
55 changes: 55 additions & 0 deletions comps/image2video/svd/svd_server.py
@@ -0,0 +1,55 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
"""Stand-alone Stable Video Diffusion FastAPI Server."""

import argparse
import os
import time

import torch
import uvicorn
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import export_to_video, load_image
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse, Response

app = FastAPI()


@app.post("/generate")
async def generate(request: Request) -> Response:
    print("SVD generation begin.")
    request_dict = await request.json()
    images_path = request_dict.pop("images_path")

    start = time.time()
    images = [load_image(img) for img in images_path]
    images = [image.resize((1024, 576)) for image in images]

    generator = torch.manual_seed(args.seed)
    frames = pipe(images, decode_chunk_size=8, generator=generator).frames[0]
    video_path = os.path.join(os.getcwd(), args.video_path)
    export_to_video(frames, video_path, fps=7)
    end = time.time()
    print(f"SVD video output in {video_path}, time = {end-start}s")
    return JSONResponse({"video_path": video_path})


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--host", type=str, default="0.0.0.0")
    parser.add_argument("--port", type=int, default=9368)
    parser.add_argument("--model_name_or_path", type=str, default="stabilityai/stable-video-diffusion-img2vid-xt")
    parser.add_argument("--video_path", type=str, default="generated.mp4")
    parser.add_argument("--seed", type=int, default=42)

    args = parser.parse_args()
    pipe = StableVideoDiffusionPipeline.from_pretrained(args.model_name_or_path)
    print("Stable Video Diffusion model initialized.")

    uvicorn.run(
        app,
        host=args.host,
        port=args.port,
        log_level="debug",
    )
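The returned `video_path` is built by joining the server's working directory with `--video_path`. The sketch below reproduces only that path logic; `resolve_video_path` is a hypothetical helper, the working directory is passed in explicitly instead of calling `os.getcwd()`, and `posixpath` is used so the result matches a Linux container regardless of where the sketch is run.

```python
import argparse
import posixpath


def resolve_video_path(cwd: str, argv=None) -> str:
    """Mirror svd_server.py: join the working directory with the --video_path argument."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--video_path", type=str, default="generated.mp4")
    args = parser.parse_args(argv if argv is not None else [])
    return posixpath.join(cwd, args.video_path)


# With the WORKDIR from comps/image2video/svd/Dockerfile and the default argument:
print(resolve_video_path("/home/comps/image2video/svd"))
# /home/comps/image2video/svd/generated.mp4

# An absolute --video_path replaces the working directory entirely:
print(resolve_video_path("/home/comps/image2video/svd", ["--video_path", "/tmp/out.mp4"]))
# /tmp/out.mp4
```

Two consequences follow from the server code as written: the path refers to the SVD server's own filesystem, and because the path is fixed at startup, each request writes to the same file.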
68 changes: 68 additions & 0 deletions tests/test_image2video.sh
@@ -0,0 +1,68 @@
#!/bin/bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

set -x

WORKPATH=$(dirname "$PWD")
ip_address=$(hostname -I | awk '{print $1}')

function build_docker_images() {
    cd $WORKPATH
    echo $(pwd)
    docker build --no-cache -t opea/svd:latest -f comps/image2video/svd/Dockerfile .
    if [ $? -ne 0 ]; then
        echo "opea/svd build failed"
        exit 1
    else
        echo "opea/svd built successfully"
    fi
    docker build --no-cache -t opea/image2video:latest -f comps/image2video/Dockerfile .
    if [ $? -ne 0 ]; then
        echo "opea/image2video build failed"
        exit 1
    else
        echo "opea/image2video built successfully"
    fi
}

function start_service() {
    unset http_proxy
    docker run -d --name="test-comps-image2video-svd" -e http_proxy=$http_proxy -e https_proxy=$https_proxy -p 9368:9368 --ipc=host opea/svd:latest
    docker run -d --name="test-comps-image2video" -e SVD_ENDPOINT=http://$ip_address:9368 -e http_proxy=$http_proxy -e https_proxy=$https_proxy -p 9369:9369 --ipc=host opea/image2video:latest
    sleep 3m
}

function validate_microservice() {
    result=$(http_proxy="" curl http://localhost:9369/v1/image2video -XPOST -d '{"images_path":[{"image_path":"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png"}]}' -H 'Content-Type: application/json')
    if [[ $result == *"generated.mp4"* ]]; then
        echo "Result correct."
    else
        echo "Result wrong."
        docker logs test-comps-image2video-svd
        docker logs test-comps-image2video
        exit 1
    fi

}

function stop_docker() {
    cid=$(docker ps -aq --filter "name=test-comps-image2video*")
    if [[ ! -z "$cid" ]]; then docker stop $cid && docker rm $cid && sleep 1s; fi
}

function main() {

    stop_docker

    build_docker_images
    start_service

    validate_microservice

    stop_docker
    echo y | docker system prune

}

main
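The script waits a fixed `sleep 3m` before validating. One alternative is to poll until the server answers over HTTP, sketched below with the standard library only. `wait_for_http` is a hypothetical helper, not part of the PR. It probes at the HTTP level instead of opening a bare TCP connection because a published Docker port may accept connections before the application inside the container is listening.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout_s: float = 180.0, interval_s: float = 2.0) -> bool:
    """Return True as soon as `url` yields any HTTP response, False after timeout_s seconds."""
    # An empty ProxyHandler bypasses configured proxies, as the test script does for curl.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=interval_s):
                return True
        except urllib.error.HTTPError:
            # The server answered (for example 404 or 405), so it is up.
            return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            # Not reachable yet: wait and retry.
            time.sleep(interval_s)
    return False
```

Since `svd_server.py` only calls `uvicorn.run` after the pipeline is loaded, an HTTP answer from `http://localhost:9368/` implies the model has been initialized; a call such as `wait_for_http("http://localhost:9368/")` could then stand in for the fixed delay.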