This project streamlines extracting information from YouTube videos. It downloads the video, transcribes it to text with Whisper, and then passes the transcript to a large language model, which converts it into a dataset of question-answer pairs (the output format can be changed).
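The stages above can be pictured as a small chain of functions. The sketch below is illustrative only and is not taken from the repository: `video_to_dataset`, `parse_qa_pairs`, `to_jsonl`, the injected `download` / `transcribe` / `generate` callables, and the `Q:` / `A:` line format are all hypothetical names and assumptions chosen for the example. In a real run, the three callables would wrap a video downloader, Whisper, and a language model client.

```python
import json
from typing import Callable, Dict, List


def parse_qa_pairs(llm_output: str) -> List[Dict[str, str]]:
    """Turn 'Q: ...' / 'A: ...' lines into question-answer records.

    Assumes (hypothetically) that the model was prompted to answer in
    alternating 'Q:' and 'A:' lines; anything else is ignored.
    """
    pairs: List[Dict[str, str]] = []
    question = None
    for line in llm_output.splitlines():
        line = line.strip()
        if line.startswith("Q:"):
            question = line[2:].strip()
        elif line.startswith("A:") and question is not None:
            pairs.append({"question": question, "answer": line[2:].strip()})
            question = None  # wait for the next question
    return pairs


def video_to_dataset(
    url: str,
    download: Callable[[str], str],    # url -> local media path
    transcribe: Callable[[str], str],  # media path -> transcript text
    generate: Callable[[str], str],    # transcript -> raw model output
) -> List[Dict[str, str]]:
    """Run the three stages in order and return the parsed pairs."""
    media_path = download(url)
    transcript = transcribe(media_path)
    return parse_qa_pairs(generate(transcript))


def to_jsonl(pairs: List[Dict[str, str]]) -> str:
    """Serialize the pairs as JSON Lines, one record per line."""
    return "\n".join(json.dumps(pair) for pair in pairs)
```

Passing the stages in as callables keeps the sketch independent of any particular downloader or model, and changing the output format only means replacing `parse_qa_pairs` and `to_jsonl`.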
Clone the repository
git clone https://github.com/Parkourer10/YT2DATA.git
Install all the dependencies
pip install -r requirements.txt
Run the project
python main.py