Commit
Add writing and updating JSON database (#7)
New changes from #6:

* fix(parser): Make link texts safe for Windows paths
* test(parser): ✅ Update test with getting link text by href
* chore: ➕ Add 'rope' package in dev deps for refactoring in VS Code
* feat(parser): ✨ Parse episode page (date and episode number)
  Add LepEpisode class (with a couple of basic attrs)
  Add parsing functions for episode date and number
  Add replacing of invalid path characters for link title
  Add and update unit-tests to demonstrate that episode parsing (isolated) works
* refactor(parser): ♻️ Add returning of URL final location during getting response
  Return error text instead of raising exceptions
* test(parser): ✅ Add tests to check final location after redirects
  Update tests for different exceptions during getting response
* feat(parser): ✨ Add index generating for post URL (for quick search / reference in the future)
* test(parser): ✅ Add two tests to check index generation
* feat: Add 'admin_note' attribute to LepEpisode class
* feat(parser): Add logic for bad response of parsing page
* test(parser): ✅ Update tests taking into account response status for parsing page
  Add test to check return value (None) for non-episode link
  Black formatting changes
* style(parser): 🏷️ Fix 'mypy' and 'pre-commit' errors
* feat(parser): ✨ Add 'parsing_utc' attribute for LepEpisode class
  Add storing parsing datetime in each episode
* test(parser): ✅ Update test to check parsing all links from list
  Add non-episode HTML and link to check when parsed episode is empty
  Return 'fail_under' = 100 in pyproject.toml
* feat(parser): ✨ Add function for parsing links to episode audio (parts)
  Add attributes 'post_type' and 'audios' for LepEpisode class
* test(parser): ✅ Add minimum sufficient tests (to satisfy Coverage)
* refactor(parser): ♻️ Unify parsing part of archive page (tag <article> only)
  Rename soup objects
* style: 🎨 Fix 'pre-commit' errors for imports order
* refactor: ♻️ Change default value for 'audios' attribute to None because 'flake8-bugbear' error B006 was raised
* perf(parser): ⚡ Change algorithm to extract episode links and their texts
  Remove mapping dict, i.e. there are no duplicates now
* test(parser): ✅ Update tests according to new archive parsing algorithm
  Remove tests for two deleted functions
* test(parser): ✅ Add two tests to check parsing mp3 links for certain cases
  Exclude links to separate short audio
  No duplicates for 'audio' word in the URL
* style: ✏️ Fix wrong writing of 'non-episode' word
* feat(parser): ✨ Add function for descending sorting of parsed episodes
  Change returned type of episodes list to List[LepEpisode]
* test(parser): ✅ Add test to check episodes sorting
  Modify existing tests to follow changes for returned type of parsing function
* fix(parser): 🐛 Change secondary sorting key to 'index' because there could be episodes with the same date but without an episode number
  Update unit-test
* chore: 🔧 Add JSON_DB_URL configuration parameter
* feat: 🏷️ Add 'LepJsonEncoder' class for json dump operations
* feat(parser): ✨ Add rough implementation of 'main' method with parsing actions (including writing JSON file)
* test(parser): ✅ Add tests for writing and updating JSON database
  Add JSON (pretty) fixture with test database
* style: 🎨 Fix imports by pre-commit
  Add json files to exclude types in '.pre-commit-config.yaml'
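
For readers unfamiliar with three of the changes listed above (the `None` default for 'audios' that avoids flake8-bugbear B006, the descending sort with 'index' as the secondary key, and a custom encoder for JSON dumps), here is a minimal sketch. It is not the code from this commit: the class names `LepEpisode` and `LepJsonEncoder` come from the message, but the constructor signature, the reduced attribute set, the helper `sort_episodes_desc`, the use of the date as the primary sort key, and the choice to replace `None` with a fresh list are all assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from typing import Any, List, Optional


class LepEpisode:
    """Hypothetical, reduced stand-in for the LepEpisode class named in the commit."""

    def __init__(
        self,
        episode: int = 0,
        date: Optional[datetime] = None,
        index: int = 0,
        # None instead of [] so the default is not a shared mutable object (B006).
        audios: Optional[List[List[str]]] = None,
    ) -> None:
        self.episode = episode
        # Placeholder default date; an assumption, not taken from the commit.
        self.date = date or datetime(2000, 1, 1, tzinfo=timezone.utc)
        self.index = index
        # Assumption: give each instance its own list when none is passed.
        self.audios = audios if audios is not None else []


class LepJsonEncoder(json.JSONEncoder):
    """Sketch of an encoder that teaches json.dumps how to serialize LepEpisode."""

    def default(self, obj: Any) -> Any:
        if isinstance(obj, LepEpisode):
            return {
                "episode": obj.episode,
                "date": obj.date.isoformat(),
                "index": obj.index,
                "audios": obj.audios,
            }
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(obj)


def sort_episodes_desc(episodes: List[LepEpisode]) -> List[LepEpisode]:
    """Sort newest first; 'index' breaks ties between episodes with the same date."""
    return sorted(episodes, key=lambda ep: (ep.date, ep.index), reverse=True)
```

With these definitions, `json.dumps(sort_episodes_desc(episodes), cls=LepJsonEncoder, indent=4)` would produce a pretty-printed JSON array ordered from newest to oldest.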