The Website-to-PDF project archives entire websites or wikis as a single PDF. It handles lazy-loaded images, uses multi-threading for efficient processing, and produces one final PDF for the whole site. Parameters such as the batch size and the number of CPU cores are configurable.
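The batch-size and CPU-core options above suggest a pattern like the following. This is a hedged sketch, not the project's actual code: `chunk` and `convert_all` are hypothetical helper names, and `convert` stands in for whatever per-page PDF conversion the project performs.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, size):
    # Split the URL list into fixed-size batches (the "batch size" option).
    return [items[i:i + size] for i in range(0, len(items), size)]


def convert_all(urls, convert, batch_size=4, workers=2):
    # Process each batch with a pool of worker threads, mirroring the
    # multi-threaded conversion step; `workers` plays the role of the
    # CPU-core setting. Order of results matches the input order.
    results = []
    for batch in chunk(urls, batch_size):
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results.extend(pool.map(convert, batch))
    return results
```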
It first gathers all the web pages of a website from its URL, stores the links in a file, and then converts each page into a PDF. Finally, all the PDFs are merged into a single PDF that contains the entire website.
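The link-gathering step described above can be sketched with the standard library alone. This is illustrative, not the project's implementation: `LinkCollector` and `collect_links` are hypothetical names, and real crawling would also fetch each page over HTTP.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects absolute, same-site link targets from <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                absolute = urljoin(self.base_url, value)
                # Keep only pages on the same site, mirroring the
                # "gather all the web pages of a website" step.
                if urlparse(absolute).netloc == urlparse(self.base_url).netloc:
                    self.links.add(absolute)


def collect_links(base_url, html):
    parser = LinkCollector(base_url)
    parser.feed(html)
    return sorted(parser.links)
```

The collected links would then be stored in a file and each one converted to a PDF before the final merge.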
Clone my GitHub repo:
git clone https://github.com/Parkourer10/Website-to-PDF.git
cd Website-to-PDF/
Install the dependencies:
pip install -r requirements.txt
- Download the ChromeDriver that matches your Chrome version from https://googlechromelabs.github.io/chrome-for-testing/ and put it in the project folder, or the project will not work.
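Since the driver must sit in the project folder, a lookup like the one below is implied. This helper is a hypothetical sketch (not part of the project) showing the check you can run before launching the browser.

```python
from pathlib import Path


def find_chromedriver(project_dir):
    """Return the ChromeDriver binary in project_dir, or raise."""
    # Cover both the Linux/macOS and Windows binary names.
    for name in ("chromedriver", "chromedriver.exe"):
        path = Path(project_dir) / name
        if path.exists():
            return path
    raise FileNotFoundError(
        "chromedriver not found in the project folder; download it from "
        "https://googlechromelabs.github.io/chrome-for-testing/"
    )
```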
Run the project:
python main.py