Gain Improvements - Stable #50
Return the complete matched URL and prevent an incorrectly written regex from returning HTML within the URL string.
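To illustrate the kind of failure this commit guards against, here is a minimal sketch in plain Python. The sample markup and both patterns are made up for illustration and are not taken from Gain's code; the point is only that a greedy pattern can drag markup into the captured URL, while a character class that excludes quotes, angle brackets and whitespace cannot.

```python
import re

# Hypothetical sample page fragment with two links on one line.
html = '<a href="/post/1">First</a> <a href="/post/2">Second</a>'

# Greedy: ".*" runs to the last quote on the line, so markup ends up in the "URL".
greedy = re.compile(r'href="(.*)"')

# Stricter: the capture cannot cross a quote, angle bracket or whitespace.
strict = re.compile(r'href="([^"<>\s]+)"')

greedy_urls = greedy.findall(html)
strict_urls = strict.findall(html)

print(greedy_urls)
print(strict_urls)
```

The greedy pattern yields a single match that spans both anchors, while the stricter one yields one clean path per link.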
Users can cache their requests with any aiocache type except for the SimpleMemory cache. If you want each request to use a different proxy, you can now supply a generator, and Gain will take a new proxy from your custom generator for every fetch.
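A custom proxy generator could look roughly like the sketch below. The function name and the proxy addresses are placeholders, and how the generator is handed to Gain is not shown in this thread, so that part is deliberately left out; the sketch only shows a generator that produces one proxy per `next()` call.

```python
import itertools


def rotating_proxies(proxies):
    """Yield the given proxies in round-robin order, forever."""
    yield from itertools.cycle(proxies)


# Placeholder addresses, for illustration only.
proxy_gen = rotating_proxies(["http://127.0.0.1:8080", "http://127.0.0.1:8081"])

# A consumer would call next() once per fetch.
first = next(proxy_gen)
second = next(proxy_gen)
third = next(proxy_gen)  # wraps around to the first proxy again
```

Any generator works here, so the same shape could draw proxies at random or from an external pool instead of cycling a fixed list.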
Because the user, not the library, should dictate what happens when a logging event occurs, one admonition bears repeating: It is strongly advised that you do not add any handlers other than NullHandler to your library’s loggers.
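In practice, the advice quoted above amounts to something like the following sketch. The logger name and the `fetch` function are hypothetical stand-ins, not Gain's actual names; `logging.NullHandler` is the standard-library handler that discards records.

```python
import logging

# Library side: attach only a NullHandler to the library's logger.
logger = logging.getLogger("gain_example")  # hypothetical logger name
logger.addHandler(logging.NullHandler())


def fetch(url):
    """Stand-in for a library function that logs while it works."""
    logger.info("fetching %s", url)
    return url


result = fetch("http://example.com")
```

An application that wants to see these messages can configure logging itself, for example with `logging.basicConfig(level=logging.INFO)`, without the library having made that choice for it.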
…ache Cache is now being tested via a webserver with Redis. Users can submit URLs that should not be cached, for feed or blog post scraping where you only wish to catch new pages and ignore old ones. A new CSS class has been added with backwards compatibility in mind, except for jQuery-like CSS functions; for those we still have the old code, which can be imported as Pyq.
Users now have a lot more control over what they scrape, how they want the data, and any extra manipulation needed to get the clean data they need. This approach is very easy to extend and clean in its design. Every feature has been covered by tests.
Users can now set the spider flag Test to True to execute a test run for their spider on just a single parsed page. Together with Cache enabled, this makes for a very fast trial-and-error cycle. If needed, the number of test requests can be increased by setting the max_requests value. If you wish to limit actual external requests for some reason, set limit_requests to True and set the appropriate value for max_requests.
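The interaction between these settings can be pictured with the sketch below. It is not Gain's implementation: `RequestBudget` is a hypothetical helper, and the default of one request in test mode is an assumption based on the "single parsed page" wording above.

```python
class RequestBudget:
    """Hypothetical helper: counts requests and says when to stop."""

    def __init__(self, test=False, limit_requests=False, max_requests=1):
        # Limiting applies in test mode or when explicitly requested.
        self.enabled = test or limit_requests
        self.max_requests = max_requests  # assumed default: a single request
        self.used = 0

    def allow(self):
        """Return True if another request may be sent, and count it."""
        if self.enabled and self.used >= self.max_requests:
            return False
        self.used += 1
        return True


urls = ["/a", "/b", "/c"]

test_budget = RequestBudget(test=True)
test_run = [u for u in urls if test_budget.allow()]

limited_budget = RequestBudget(limit_requests=True, max_requests=2)
limited_run = [u for u in urls if limited_budget.allow()]
```

With neither flag set, the budget never refuses a request, which matches an unrestricted crawl.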
Removed the Python 3.7 test. It seems not to be supported by Travis.
Almost done with the Redis cache test. Please wait with the merge until I have completed it, so we have test coverage on a very important part of the new code.
pytest redis cache
Repo can be squashed. Redis cache tests have been implemented. Removed Python 3.5 support, since async coding on 3.6 and 3.7 is much cleaner and supporting Python 3.5 would be a pain and feels quite unnecessary.
@gaojiuli, are you still available for review and merge?
Would really like to know if you are still moving on with this library.
Least you could do is respond. Closing PR. Next time don't publish something if you do not intend to maintain or respond to it.
I want to resolve #42, #43, #46, #49
I disabled the cache test, since we'd need to update CI, start webserver.py somehow, and ensure Redis is available as well.
I'd gladly hear your thoughts on this so I can make a separate PR for improving tests.
Please review and update PyPI after acceptance.