Commit
Add 'download' command (functionality) with basic options (#11)
* refactor: ♻️ Change serialization of LepEpisode object to manual dictionary (replace __dict__ attribute)
* style: 🎨 Add license boilerplate, change typing imports, format docstring to line limit =80
* refactor: ♻️ Move DEFAULT_JSON_NAME to config.py, change type (to Set) for irrelevant links
* refactor(parser): 🚧 Add drafts of new classes: Lep, Archive, LepParser, SoupParser
* test(parser): ✅ Update tests checking index with new 'archive' object
* refactor: 🏗️ Add LepEpisodeList class, class variables for Archive class, rename attr to 'parsed_at'
* refactor: 🥅 Add module with custom exceptions
* refactor(parser): 🏗️ Add new classes ArchiveParser and EpisodeParser, re-write all functions using OOP approach
* refactor: ♻️ Change function return type to new LepEpisodeList class
* test(parser): ✅ Update tests with new classes
* test: ✅ Update renamed field 'parsed_at' in fixture JSON file and test_downloader module
* test(console): Skip one test until feature implementation
* test: ✅ Update fixtures using new classes
* chore: ⬆️ Update typeguard (2.13.0 -> 2.13.3)
  I was hoping this would solve the problem with reversed dict (but it's only for Python 3.7.0)
* ci: 👷 Exclude Python 3.7 from python_versions list
  Due to errors with Reversible type for Python 3.7.0
* refactor(parser): ♻️ Remove redundant IF block checking that only duplicated links on page
* test(parser): ✅ Add test to check that script does not fail when only duplicated links on archive page
* refactor(parser): ♻️ Move up 'pre-parsing' step and change its logic for ArchiveParser class
* test(parser): ✅ Update tests with 'pre-parsing' action
* test(parser): ✅ Add test to check raising NotImplementedError for methods in subclasses of LepParser class
* test(parser): ✅ Add tests to check invalid date in URL
* refactor(parser): 🏗️ Change 'date' attr type to datetime for LepEpisode class
  Support conversion date from string and from datetime offset-naive value
  Add __lt__ and __eq__ methods to implement native sort
* test(parser): ✅ Update tests and one fixture to use (check) episode date as datetime
* fix(downloader): 💩 Change retrieving of short date for audio filename
* refactor(parser): ♻️ Add computed property 'post_title' for LepEpisode class (to use as safe filename during downloading)
  Keep origin post title in '_origin_post_title' attr
* test(parser): ✅ Add test to check replacing of invalid path characters in post title attr
* fix(parser): 🐛 Fix sorting by two attributes (date, index) by updating condition in __lt__ method of LepEpisode class
* test(parser): ✅ Add test to check comparison operators (<, ==) and update tests checking sorting (based on these operators)
* test: ✅ Add test to check __repr__ method for LepEpisode object and change a little repr itself
* test: ✅ Add test to check passing episode date directly as 'datetime' type
* refactor: 🏗️ Update Lep class to have one place for storing global session object
  Move two methods 'get_web_document' and 'extract_only_valid_episodes' into Lep class (as class methods)
* test: ✅ Update tests with regard to Lep class modifications
* refactor(parser): ♻️ Remove redundant list comprehension
* refactor: ♻️ Create class method 'get_db_episodes' to share for other modules
  Add method 'filter_by_type' for LepEpisodeList class
  Add class variables 'json_body' and 'db_episodes' in Lep class
* refactor: 🥅 Add new custom exceptions DataBaseUnavailable and NoEpisodesInDataBase
* refactor(parser): ♻️ Use new exception NoEpisodesInDataBase and store db episodes in Lep's class variable
* refactor(downloader): 💩 Attempt to implement OOP for downloader module
  Add class Downloader with shared dictionaries
  Add function to construct bunch of links for downloading
* test(downloader): ✅ Update tests and fixtures due to modifications in downloader module
* refactor(parser): ♻️ Remove redundant IF block (because will be exception earlier in this case)
* refactor(console): 🥅 Add handling custom exceptions for 'parse' command
* test: ✅ Update tests where new custom exceptions are raised now
* refactor(downloader): ♻️ Add field '_short_date' for LepEpisode which is set during JSON decoding
* refactor: 🏗️ Replace 'audios' list with 'files' dictionary. Update JSON decoding hook.
* fix(parser): 🐛 Fix missed url assigning for 'successfully' parsed episode. Update assigning audios into files dict
* test: 🤡 Add updated JSON mock file (without duplicates and 'files' dict)
* test: ✅ Update all tests with improved JSON structure
* chore: 🔧 Bump version to 3.0.0a2
  No duplicates in archive links, updated JSON structure with 'files' dictionary, safe post names
* refactor(downloader): 🏗️ Add two dataclasses LepFile and Audio; Add functions for gathering audio files from episodes list
* test(downloader): ✅ Update tests with new dataclasses and gathering function for audio files
* test(downloader): ✅ Add test to check collecting auxiliary audio links
* fix(downloader): 🐛 Fix a bug with setting 2nd and 3rd links for audio files
  Move fields 'secondary_url' and 'tertiary_url' to base dataclass (to fix 'mypy' errors)
* refactor(downloader): ♻️ Move getting DB episodes into separate method, Update downloading function for LepFile list
* refactor: Add new EmptyDownloadsBunch exception
* test(downloader): ✅ Update tests with operating LepFile objects
  Add checks for two exceptions NoEpisodesInDataBase and EmptyDownloadsBunch
  Test of using already parsed DB episodes.
* style(console): 🎨 Fix length limit >80 chars
* style: 🚨 Fix BLK100 errors after 'pre-commit' scanning
* refactor(downloader): ♻️ Update function for detecting existing files in target folder.
  Change type for shared list 'existed'
* test(downloader): ✅ Update test with checking files separating out from existing files
* refactor(downloader): ♻️ Change algorithm for downloading auxiliary links to files
* refactor(downloader): ♻️ Change type of two shared class variables 'downloaded' and 'not_found' to list of LepFile objects
* test(downloader): ✅ Update tests to initialize empty lists instead of dictionaries
* feat(downloader): ✨ Add function to populate default secondary url (to reduce the size of JSON database file)
  Add unit test to check auto populating of secondary url
  Add config constant DOWNLOADS_BASE_URL
* feat(downloader): ✨ Add PagePDF dataclass and 'page_pdf' dictionary key to store links to page PDF
* test(downloader): ✅ Update tests with new 'page_pdf' file and update mocked JSON database file
* chore: Bump version to 3.0.0a3
  New 'page_pdf' file
* feat(parser): ✨ Add parsing HTML title (not storing in JSON)
* test(parser): ✅ Add test to check parsing of HTML page title
* refactor(downloader): ♻️ Add LepFileList class and change a little 'gather_all_files' function
* test(downloader): ✅ Update tests with new LepFileList class and its instance method 'filter_by_type'
* style: ⚰️ Clean code (remove old commented lines)
* feat: ✨ Add two LepEpisodeList filter methods (by episode number and by date)
* fix(downloader): 🐛 Fix appending 'not_found' file after trying all auxiliary links
  Remove print from OSError exception during downloading file
* feat(console): ✨ Implement 'download' command with basic options
* test: ✅ Add autouse fixture for clearing all shared lists
* test(console): ✅ Add tests for basic options
* chore: ➕ Add 'pytest-mock' (3.6.1) into dev dependencies
  Update 'noxfile.py' to install 'pytest-mock'
* test(console): ✅ Add two tests to check validations for '--dest' option (when PermissionError and OSError occur)
* refactor(console): ♻️ Extract MyCLI class and decorator with common options into separate module 'cli_shared.py'
* refactor(console): ♻️ Make running script without mentioning 'download' command explicitly (passing options into it)
  Add 'common_options' decorator to cli() command group and 'download' command
* test(console): ✅ Update imports of 'cli' module in tests (do inner import)
  Add one test to check passing of options to 'download' command
* chore: ⬆️ Upgrade version of 'typeguard' to '2.13.3'
* feat(console): ✨ Add function to pause script execution until 'Enter' key is pressed
* test(console): ✅ Add tests to check pressing 'Enter' at the end and interrupting execution in 'quiet' mode (for two negative scenarios)
* style(console): 🚨 Fix 'pre-commit' error with D202 No blank lines allowed after function docstring
* chore: Bump version to 3.0.0a4
  Add 'download' CLI command with basic options
* refactor: ♻️ Make 'page_pdf' key optional in 'files' dictionary of LepEpisode object
* chore: 🤡 Update mocked JSON database (without optional 'page_pdf' now)
* feat(downloader): ✨ Add gathering new type of files 'ATrack' (audio track for video episodes)
* test(downloader): ✅ Add tests to check gathering single-part audio track and multi-part audio track
* refactor(console): ♻️ Add ATrack type to default filter for files (now Audio + ATrack)
* refactor(parser): 🚧 Start to refactor 'parser' module
* refactor(parser): ♻️ Make Archive class responsible for parsing actions
* refactor: ♻️ Fix attribute 'session' for Lep and Archive classes / instances
* refactor: ♻️ Change 'downloader' structure, Move some functions to 'LepDL' class
  Remove class variables lists
* refactor(downloader): ♻️ Add method 'detach_existed_files' to simplify function invocation
* ci: 💚 Remove sessions in Python 3.7 (due to 'mypy' fails)
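
Several entries in this commit message concern ordering episodes natively: the 'date' attribute becomes a datetime (converted from a string or an offset-naive value), and `__lt__` / `__eq__` compare by two attributes (date, index). The sketch below illustrates that general idea only. The `Episode` class, its constructor, the ISO 8601 input format, and the choice to treat naive dates as UTC are all assumptions made for illustration; none of this is taken from the project's actual LepEpisode code.

```python
from datetime import datetime, timezone
from typing import Tuple, Union


class Episode:
    """Hypothetical stand-in for LepEpisode (not the project's real class)."""

    def __init__(self, index: int, date: Union[str, datetime]) -> None:
        self.index = index
        if isinstance(date, str):
            # Assumption: dates arrive as ISO 8601 strings.
            date = datetime.fromisoformat(date)
        if date.tzinfo is None:
            # Assumption: offset-naive values are treated as UTC, so that
            # aware and naive datetimes are never compared directly.
            date = date.replace(tzinfo=timezone.utc)
        self.date = date

    def _key(self) -> Tuple[datetime, int]:
        # Sort by date first; index breaks ties between same-date episodes.
        return (self.date, self.index)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Episode):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other: object) -> bool:
        if not isinstance(other, Episode):
            return NotImplemented
        return self._key() < other._key()

    def __repr__(self) -> str:
        return f"Episode(index={self.index}, date={self.date.isoformat()})"


episodes = [
    Episode(2, "2021-01-02T00:00:00+00:00"),
    Episode(1, "2021-01-02T00:00:00+00:00"),
    Episode(3, datetime(2020, 12, 31)),  # offset-naive input
]
ordered = sorted(episodes)  # sorted() relies on __lt__ only
print([e.index for e in ordered])  # [3, 1, 2]
```

Comparing a single (date, index) tuple keeps `__lt__` and `__eq__` consistent with each other, which is one way to avoid the kind of two-attribute sorting bug the commit message mentions fixing.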