Gain Improvements - Stable #50

Closed
wants to merge 28 commits into from
Collaborator

@kwuite kwuite commented Jun 1, 2019

I want to resolve #42, #43, #46, #49

I disabled the cache test, since we'd need to update CI, start webserver.py somehow, and ensure Redis is available as well.

I'd gladly hear your thoughts on this so I can make a separate PR for improving tests.

Please review and update PyPI after acceptance.

Return the complete matched URL and prevent an incorrectly written regex from returning HTML within the URL string.
Users can use any aiocache type to cache their requests except for the SimpleMemory cache. If you want each request to use a different proxy, you can now supply a generator, and Gain will take a new proxy from your custom generator for every fetch.
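As a rough illustration of the proxy generator described above, here is a minimal sketch in plain Python. It does not show Gain's actual interface: the proxy addresses and the name `rotating_proxies` are hypothetical, and how the generator is handed to a spider is defined by the code in this PR, not by this example.

```python
import itertools
from typing import Iterator, List

# Hypothetical proxy addresses, for illustration only.
PROXIES = [
    "http://127.0.0.1:8001",
    "http://127.0.0.1:8002",
    "http://127.0.0.1:8003",
]


def rotating_proxies(proxies: List[str]) -> Iterator[str]:
    """Yield proxies in round-robin order, without ever running out."""
    yield from itertools.cycle(proxies)


proxy_gen = rotating_proxies(PROXIES)

# Each fetch would pull the next proxy from the generator.
first_four = [next(proxy_gen) for _ in range(4)]
```

Any generator works here, so the rotation policy (round-robin, random, health-checked) stays in user code.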
Because the user, not the library, should dictate what happens when a logging event occurs, one admonition bears repeating: It is strongly advised that you do not add any handlers other than NullHandler to your library’s loggers.
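The logging guidance quoted above usually comes down to a few lines. This is a minimal sketch using only the standard `logging` module; the logger name `"gain"` and the function `configure_app_logging` are assumptions for illustration, not names confirmed by this PR.

```python
import logging

# Library side: attach only a NullHandler, so the library never emits
# output by itself. "gain" is an assumed logger name.
library_logger = logging.getLogger("gain")
library_logger.addHandler(logging.NullHandler())


# Application side: the user decides if and where log records go.
def configure_app_logging() -> None:
    logging.basicConfig(level=logging.INFO)
```

With this split, records logged by the library are discarded until the application installs its own handlers.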
…ache

The cache is now tested via a web server with Redis. Users can submit URLs that should not be cached, which helps with feed or blog post scraping where you only wish to catch new pages and ignore old ones. A new CSS class has been added with backwards compatibility in mind, except for jQuery-like CSS functions; for those, the old code is still available and can be imported as Pyq.
Users now have a lot more control over what they scrape, how they want the data, and whether any extra manipulation is needed to get the clean data they need. This approach is very easy to extend and clean in its design. Every feature has been covered by tests.
Users can now set the spider flag Test to True to execute a test run of their spider on just a single parsed page. Together with the cache enabled, this makes for a very fast trial-and-error cycle. If needed, the number of test requests can be increased by setting the max_requests value. If you wish to limit actual external requests for some reason, set limit_requests to true and set an appropriate value for max_requests.
@kwuite
Collaborator Author

kwuite commented Jun 1, 2019

Removed the Python 3.7 test. It seems not to be supported by Travis.

@kwuite
Collaborator Author

kwuite commented Jun 1, 2019

Almost done with the Redis cache test. Please wait with the merge until I have completed it, so we have test coverage on a very important part of the new code.

@kwuite
Collaborator Author

kwuite commented Jun 1, 2019

Repo can be squashed. Redis cache tests have been implemented. Removed Python 3.5 support, since async code on 3.6 and 3.7 is much cleaner, and supporting Python 3.5 would be a pain and feels quite unnecessary.

@kwuite
Collaborator Author

kwuite commented Jun 3, 2019

@gaojiuli, are you still available for review and merge?

@kwuite
Collaborator Author

kwuite commented Jun 7, 2019

@gaojiuli, are you still available for review and merge?

I would really like to know if you are still moving on with this library.

@kwuite
Collaborator Author

kwuite commented Jun 13, 2019

Least you could do is respond. Closing PR. Next time don't publish something if you do not intend to maintain or respond to it.

@kwuite kwuite closed this Jun 13, 2019
@kwuite kwuite deleted the development branch June 13, 2019 09:17
Development

Successfully merging this pull request may close these issues.

The sciencenet_spider.py example does not (seem to) work for python 3.6