Merge pull request #184 from tryolabs/chore/colab-demo
Add official demo in Google Colab
Showing 13 changed files with 421 additions and 8,146 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,184 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"<a href=\"https://colab.research.google.com/github/tryolabs/norfair/blob/master/demos/colab/colab_demo.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Draw object paths and track camera movement with Norfair\n", | ||
"\n", | ||
"Run a demo similar to the [Hugging Face Spaces Norfair demo](https://huggingface.co/spaces/tryolabs/norfair-demo) in this notebook.\n", | ||
"\n", | ||
"This demo uses the YOLOv7 model and shows the Norfair features for drawing object paths and tracking camera movement.\n", | ||
"\n", | ||
"Tracking camera movement helps improve the tracker and keeps the drawn paths fixed in place even when the camera moves.\n", | ||
"\n", | ||
"The demo uses the following video by default, but you can switch videos by changing the URL in [this cell](#Download-Video-and-Preprocessing). We trim the video to only a few seconds due to limitations with video playback in Google Colab, but you can experiment with the trim length and see what you get.\n", | ||
"\n", | ||
"![](https://user-images.githubusercontent.com/67343574/191318965-b7c224d7-73b0-49f7-840a-b1c9d8534a06.png)\n", | ||
"\n", | ||
"**Note**\n", | ||
"\n", | ||
"- Set the Colaboratory hardware accelerator to **GPU** before running the notebook.\n", | ||
"(Runtime -> Change Runtime Type -> GPU)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Install dependencies" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"! wget \"https://raw.githubusercontent.com/tryolabs/norfair/master/demos/colab/requirements.txt\" -O requirements.txt\n", | ||
"! pip install -r requirements.txt" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Download [Video](https://www.youtube.com/watch?v=aio9g9_xVio) and Preprocessing\n", | ||
"We cut the video short because it's too long to play in Google Colaboratory." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"! wget \"https://drive.google.com/u/0/uc?id=1Jc5TAiwOZ-yUO6R_tG0zSW9Niv_HKTPV&export=download\" -O sample.mp4\n", | ||
"! ffmpeg -i sample.mp4 -ss 7 -t 10 sample_10s.mp4" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Download and run the demo" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"In this example we only track people. You can change this with the `classes` parameter, which takes COCO label ids." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"! wget \"https://raw.githubusercontent.com/tryolabs/norfair/master/demos/colab/demo.py\"\n", | ||
"! wget \"https://raw.githubusercontent.com/tryolabs/norfair/master/demos/colab/draw.py\"\n", | ||
"! wget \"https://raw.githubusercontent.com/tryolabs/norfair/master/demos/colab/yolo.py\"\n", | ||
"\n", | ||
"! python demo.py sample_10s.mp4 --classes 0" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Convert mp4 to webm\n", | ||
"\n", | ||
"\n", | ||
"Reference: [StackOverflow - python-opencv-video-format-play-in-browser](https://stackoverflow.com/questions/49530857/python-opencv-video-format-play-in-browser)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"! ffmpeg -i ./sample_10s_out.mp4 -vcodec vp9 ./sample.webm" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Displaying the Drawing Result" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from base64 import b64encode\n", | ||
"from IPython.display import HTML\n", | ||
"\n", | ||
"with open('sample.webm', 'rb') as f:\n", | ||
"    video_bytes = f.read()\n", | ||
"data_url = \"data:video/webm;base64,\" + b64encode(video_bytes).decode()\n", | ||
"HTML(\"\"\"\n", | ||
"<video width=800 controls>\n", | ||
" <source src=\"%s\" type=\"video/webm\">\n", | ||
"</video>\n", | ||
"\"\"\" % data_url)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Convert mp4 to gif" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"! ffmpeg -ss 5 -i ./sample_10s_out.mp4 -filter_complex \"[0:v] fps=10,scale=1280:-1,split [a][b];[a] palettegen [p];[b][p] paletteuse\" output.gif -y" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3.9.5 64-bit ('3.9.5')", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.9.5" | ||
}, | ||
"orig_nbformat": 4, | ||
"vscode": { | ||
"interpreter": { | ||
"hash": "d1d45f7b56f6e27d41b86676aa8ae2293c110fadaa7f6b0b931d437bdf9db7e9" | ||
} | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
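The display cell in the notebook above embeds the converted video directly in the page as a base64 data URL. The following is a minimal sketch of that idea, using only the standard library; the helper names `video_data_url` and `video_tag` are hypothetical and not part of the commit, and the bytes used here are a stand-in rather than a real video.

```python
from base64 import b64decode, b64encode


def video_data_url(video_bytes: bytes, mime_type: str = "video/webm") -> str:
    """Return a data URL that embeds the raw bytes as base64 text."""
    encoded = b64encode(video_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def video_tag(data_url: str, width: int = 800, mime_type: str = "video/webm") -> str:
    """Build an HTML <video> snippet like the one the notebook renders."""
    return (
        f"<video width={width} controls>\n"
        f'  <source src="{data_url}" type="{mime_type}">\n'
        "</video>"
    )


# Stand-in bytes; in the notebook they would come from
# open("sample.webm", "rb").read().
fake_video = b"\x1a\x45\xdf\xa3 not a real video"
url = video_data_url(fake_video)
html = video_tag(url)

# The payload after the comma decodes back to the original bytes.
round_trip = b64decode(url.split(",", 1)[1])
```

In a notebook the resulting string would be passed to `IPython.display.HTML`. Base64 inflates the payload by roughly a third, which is one reason to keep the embedded clip short.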
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
import argparse | ||
from typing import List | ||
|
||
import numpy as np | ||
from draw import center, draw | ||
from yolo import YOLO, yolo_detections_to_norfair_detections | ||
|
||
from norfair import AbsolutePaths, Paths, Tracker, Video | ||
from norfair.camera_motion import HomographyTransformationGetter, MotionEstimator | ||
from norfair.distances import create_normalized_mean_euclidean_distance | ||
|
||
DISTANCE_THRESHOLD_CENTROID: float = 0.08 | ||
|
||
|
||
def inference( | ||
input_video: str, model: str, track_points: str, model_threshold: float, classes: List[int] | ||
): | ||
coord_transformations = None | ||
paths_drawer = None | ||
fix_paths = True | ||
model = YOLO(model) | ||
video = Video(input_path=input_video) | ||
|
||
transformations_getter = HomographyTransformationGetter() | ||
|
||
motion_estimator = MotionEstimator( | ||
max_points=500, min_distance=7, transformations_getter=transformations_getter | ||
) | ||
|
||
distance_function = create_normalized_mean_euclidean_distance( | ||
video.input_height, video.input_width | ||
) | ||
distance_threshold = DISTANCE_THRESHOLD_CENTROID | ||
|
||
tracker = Tracker( | ||
distance_function=distance_function, | ||
distance_threshold=distance_threshold, | ||
) | ||
|
||
paths_drawer = Paths(center, attenuation=0.01) | ||
|
||
if fix_paths: | ||
paths_drawer = AbsolutePaths(max_history=40, thickness=2) | ||
|
||
for frame in video: | ||
yolo_detections = model( | ||
frame, | ||
conf_threshold=model_threshold, | ||
iou_threshold=0.45, | ||
image_size=720, | ||
classes=classes, | ||
) | ||
|
||
mask = np.ones(frame.shape[:2], frame.dtype) | ||
|
||
coord_transformations = motion_estimator.update(frame, mask) | ||
|
||
detections = yolo_detections_to_norfair_detections( | ||
yolo_detections, track_points=track_points | ||
) | ||
|
||
tracked_objects = tracker.update( | ||
detections=detections, coord_transformations=coord_transformations | ||
) | ||
|
||
frame = draw( | ||
paths_drawer, | ||
track_points, | ||
frame, | ||
detections, | ||
tracked_objects, | ||
coord_transformations, | ||
fix_paths, | ||
) | ||
video.write(frame) | ||
|
||
|
||
if __name__ == "__main__": | ||
parser = argparse.ArgumentParser(description="Track objects in a video.") | ||
parser.add_argument("files", type=str, help="Video files to process") | ||
parser.add_argument( | ||
"--detector-path", type=str, default="yolov7.pt", help="YOLOv7 model path" | ||
) | ||
parser.add_argument( | ||
"--img-size", type=int, default=720, help="YOLOv7 inference size (pixels)" | ||
) | ||
parser.add_argument( | ||
"--conf-threshold", | ||
type=float, | ||
default=0.25, | ||
help="YOLOv7 object confidence threshold", | ||
) | ||
parser.add_argument( | ||
"--classes", | ||
nargs="+", | ||
type=int, | ||
help="Filter by class: --classes 0, or --classes 0 2 3", | ||
) | ||
parser.add_argument( | ||
"--device", type=str, default=None, help="Inference device: 'cpu' or 'cuda'" | ||
) | ||
parser.add_argument( | ||
"--track-points", | ||
type=str, | ||
default="bbox", | ||
help="Track points: 'centroid' or 'bbox'", | ||
) | ||
args = parser.parse_args() | ||
|
||
inference( | ||
args.files, | ||
args.detector_path, | ||
args.track_points, | ||
args.conf_threshold, | ||
args.classes, | ||
) |
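The script above pairs a normalized mean Euclidean distance with `DISTANCE_THRESHOLD_CENTROID = 0.08`. The sketch below illustrates what such a distance plausibly computes: per-point offsets are divided by the frame width and height before averaging, so the threshold reads as a fraction of the frame size. This is an assumption-laden, dependency-free illustration, not Norfair's implementation; `make_normalized_mean_euclidean` is a hypothetical name and it operates on plain point tuples rather than Norfair objects.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


def make_normalized_mean_euclidean(
    height: int, width: int
) -> Callable[[Sequence[Point], Sequence[Point]], float]:
    """Build a distance that scales x by width and y by height (hypothetical helper)."""

    def distance(detected: Sequence[Point], estimated: Sequence[Point]) -> float:
        if len(detected) != len(estimated) or not detected:
            raise ValueError("both point lists must be non-empty and equally long")
        total = 0.0
        for (dx, dy), (ex, ey) in zip(detected, estimated):
            # Normalize each axis so the result is independent of resolution.
            total += math.hypot((dx - ex) / width, (dy - ey) / height)
        return total / len(detected)

    return distance


THRESHOLD = 0.08  # same value as DISTANCE_THRESHOLD_CENTROID in the script

distance = make_normalized_mean_euclidean(height=720, width=1280)
detected = [(100.0, 200.0), (300.0, 400.0)]
near = [(164.0, 200.0), (300.0, 436.0)]   # each point moved 5% of the frame
far = [(740.0, 200.0), (300.0, 436.0)]    # first point moved half the frame width

d_near = distance(detected, near)
d_far = distance(detected, far)
is_candidate_near = d_near < THRESHOLD
is_candidate_far = d_far < THRESHOLD
```

Under these assumptions, a detection that drifts by 5% of the frame stays under the threshold, while one that jumps half the frame width does not.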
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
import numpy as np | ||
|
||
import norfair | ||
|
||
|
||
def draw( | ||
paths_drawer, | ||
track_points, | ||
frame, | ||
detections, | ||
tracked_objects, | ||
coord_transformations, | ||
fix_paths, | ||
): | ||
if track_points == "centroid": | ||
norfair.draw_points(frame, detections) | ||
norfair.draw_tracked_objects(frame, tracked_objects) | ||
elif track_points == "bbox": | ||
norfair.draw_boxes(frame, detections) | ||
norfair.draw_tracked_boxes(frame, tracked_objects) | ||
|
||
if fix_paths: | ||
frame = paths_drawer.draw(frame, tracked_objects, coord_transformations) | ||
elif paths_drawer is not None: | ||
frame = paths_drawer.draw(frame, tracked_objects) | ||
|
||
return frame | ||
|
||
|
||
def center(points): | ||
return [np.mean(np.array(points), axis=0)] |
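The `center` helper above reduces an object's tracked points to a single point, wrapped in a list, which is what the path drawer receives. As a rough, dependency-free illustration of the same arithmetic (the original uses NumPy; `center_of` here is a hypothetical stand-in), the two corners of a bounding box collapse to its midpoint:

```python
from typing import List, Sequence


def center_of(points: Sequence[Sequence[float]]) -> List[List[float]]:
    """Average the points coordinate by coordinate and wrap the result in a list."""
    count = len(points)
    dims = len(points[0])
    mean = [sum(point[i] for point in points) / count for i in range(dims)]
    return [mean]


# Two corners of a bounding box: top-left and bottom-right.
bbox_corners = [(100.0, 50.0), (300.0, 250.0)]
bbox_center = center_of(bbox_corners)

# A single centroid point is its own center.
centroid_center = center_of([(42.0, 7.0)])
```

With `--track-points bbox` each object carries two points, so the drawn path follows the box midpoint; with `centroid` the single point passes through unchanged.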
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
torch==1.12.1 | ||
torchvision==0.13.1 | ||
numpy==1.21.6 | ||
rich==12.5.1 | ||
opencv-python==4.6.0.66 | ||
tqdm==4.64.1 | ||
git+https://github.com/tryolabs/norfair.git@master |