This project aims to place the lyrics for each frame of a video at an optimal location that avoids overlapping the main subjects of the video.
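As an illustration of the idea only, here is a minimal sketch that picks a lyric position by minimizing overlap with a single subject's bounding box. It does not reflect the repo's actual implementation: the names `Box`, `overlap_area`, and `place_lyrics`, the 3x3 grid of candidate positions, and the `margin` parameter are all hypothetical, and the subject box is assumed to come from some person detector.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper type)."""
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height


def overlap_area(a: Box, b: Box) -> int:
    """Return the area in pixels shared by two boxes (0 if they do not intersect)."""
    dx = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    dy = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return max(dx, 0) * max(dy, 0)


def place_lyrics(subject: Box, text_w: int, text_h: int,
                 frame_w: int = 1920, frame_h: int = 1080,
                 margin: int = 40) -> Box:
    """Pick the candidate text box that overlaps the subject the least.

    Candidates form a 3x3 grid. Ties are broken by list order, which
    prefers the bottom row first, then the center column.
    """
    xs = [(frame_w - text_w) // 2, margin, frame_w - text_w - margin]
    ys = [frame_h - text_h - margin, (frame_h - text_h) // 2, margin]
    candidates = [Box(x, y, text_w, text_h) for y in ys for x in xs]
    return min(candidates, key=lambda c: overlap_area(c, subject))


# A person standing in the middle of an FHD frame pushes the lyrics
# to the bottom-left corner, where there is no overlap.
person = Box(700, 200, 500, 880)
print(place_lyrics(person, text_w=600, text_h=80))
```

A fixed candidate grid keeps the sketch short. A real implementation would likely score many more positions and smooth the choice across frames so the text does not jump around.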
Assumptions:
- In this case "subjects" are "persons".
- Number of subjects = 1
- Maximum time-length = 5 minutes
- Maximum frames per second = 60
- Maximum resolution of each frame (or video) = 1920x1080 (FHD)
- No time-overlapping lyrics
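The limits above could be checked before processing a video. The following is a minimal sketch, not code from this repo: `check_video` and `lyrics_overlap` are hypothetical helpers, the video properties are assumed to have been read elsewhere, and each lyric is assumed to be a `(start_s, end_s, text)` tuple. At the stated maximums, a video has at most 5 × 60 × 60 = 18,000 frames.

```python
from typing import List, Tuple

# Limits taken from the assumptions listed above.
MAX_DURATION_S = 5 * 60          # 5 minutes
MAX_FPS = 60
MAX_WIDTH, MAX_HEIGHT = 1920, 1080  # FHD, landscape orientation assumed


def check_video(duration_s: float, fps: float,
                width: int, height: int) -> List[str]:
    """Return a list of human-readable problems (empty if the video fits the limits)."""
    problems = []
    if duration_s > MAX_DURATION_S:
        problems.append(f"duration {duration_s}s exceeds {MAX_DURATION_S}s")
    if fps > MAX_FPS:
        problems.append(f"frame rate {fps} exceeds {MAX_FPS} fps")
    if width > MAX_WIDTH or height > MAX_HEIGHT:
        problems.append(f"resolution {width}x{height} exceeds {MAX_WIDTH}x{MAX_HEIGHT}")
    return problems


def lyrics_overlap(lyrics: List[Tuple[float, float, str]]) -> bool:
    """Return True if any two (start_s, end_s, text) entries overlap in time."""
    ordered = sorted(lyrics, key=lambda entry: entry[0])
    # After sorting by start time, it is enough to compare each entry
    # with the one that immediately follows it.
    return any(prev[1] > cur[0] for prev, cur in zip(ordered, ordered[1:]))


print(check_video(duration_s=301, fps=60, width=1920, height=1080))
print(lyrics_overlap([(0.0, 2.0, "first line"), (2.0, 4.0, "second line")]))
```

Lyrics that merely touch (one ends exactly when the next starts) are treated here as non-overlapping. That is a design choice of this sketch, not something the assumptions specify.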
Setup:
- Since this is a Python project, you need Python v3.8+.
- Clone the repo.
- Make and activate a virtual environment either from your IDE or by following the instructions here.
- Install all the `pip` dependencies by running `pip3 install -r requirements.txt`.
`torch` and `torchvision` can be installed by running `pip3 install torch torchvision`.
To run the project, execute `python3 main.py` inside your virtual environment.
This repo is auto-linted using `pre-commit`. Before you push any commits, please install it by following the instructions on their official website here.