crawl_the_imperial_library.py scrapes all Elder Scrolls books from https://www.imperial-library.info/books/all/by-title
Books are scraped into JSON files for easier processing.
Helper scripts:
- extract_tags.py - Extracts book tags into a separate JSON file.
- check_duplicate_ids.py - Fixes duplicate ids in the files that were already created. Unfortunately, the original script has a bug that creates duplicate ids. At the time, it was easier and faster to fix the already created files with a script than to fix the original script and run it again.
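As a rough illustration of the kind of check check_duplicate_ids.py performs, here is a minimal sketch. It is not the actual script: the function names are hypothetical, and it assumes each book record is a dict with an integer `id` key.

```python
from collections import Counter


def find_duplicate_ids(records):
    """Return a sorted list of ids that appear more than once in records."""
    counts = Counter(record["id"] for record in records)
    return sorted(book_id for book_id, n in counts.items() if n > 1)


def reassign_duplicate_ids(records):
    """Return copies of records where each repeated id gets a fresh, unused id.

    Assumption: ids are integers, so a new id can be taken from max(id) + 1.
    The first record with a given id keeps it; later ones are renumbered.
    """
    seen = set()
    next_id = max((record["id"] for record in records), default=0) + 1
    fixed = []
    for record in records:
        record = dict(record)  # copy so the input is left untouched
        if record["id"] in seen:
            record["id"] = next_id
            next_id += 1
        seen.add(record["id"])
        fixed.append(record)
    return fixed
```

Calling `find_duplicate_ids` before and after `reassign_duplicate_ids` is a quick way to confirm that no duplicates remain.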
You can find all json files created by these scripts in the repository. Please feel free to use them and these scripts as you wish, but do keep in mind the contents generated by these scripts belong to Bethesda.
- book_texts - All Elder Scrolls books can be found here in JSON format
- books_metadata.json - Holds data about all files in book_texts: id, title, author, description, tags, category and file name
- books_categories.json - All individual categories as found on the imperial library website
- books_tags.json - All individual tags as found on the imperial library website
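If you want to work with the metadata from Python, something like the sketch below may be a starting point. The exact layout of books_metadata.json is an assumption here: the code treats it as a list of objects, each with a `tags` list, and the function names are hypothetical. Check the real file and adjust the keys if they differ.

```python
import json


def load_metadata(path="books_metadata.json"):
    """Load the metadata file (assumed to hold a JSON list of book objects)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def books_with_tag(metadata, tag):
    """Return every book entry whose (assumed) "tags" list contains tag."""
    return [book for book in metadata if tag in book.get("tags", [])]


def unique_tags(metadata):
    """Collect all distinct tags across the metadata, sorted alphabetically."""
    return sorted({tag for book in metadata for tag in book.get("tags", [])})
```

For example, `books_with_tag(load_metadata(), some_tag)` would list the entries carrying that tag, and `unique_tags(load_metadata())` should roughly match the contents of books_tags.json if the assumptions above hold.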
I do not own any rights for the contents generated by these scripts, as they are part of The Elder Scrolls series. The Elder Scrolls series are trademarks of Bethesda Softworks LLC, a ZeniMax Media company. All Rights Reserved.