Closes #15
In #15, a `TimeoutError` causes the following: the `except` block closes the page which handled the request, to avoid having unused pages consuming memory.

The first exception can be handled with a request errback or a spider middleware with a `process_spider_exception` method, so recovery measures can be taken. The only way I can think of to avoid the second one is not closing the page on failure. I don't think that's a good idea (closing the page avoids having unclosed pages floating around consuming memory), but I'm open to being proven wrong. As things stand, it's just confusing for users, so let's just catch it and log a message.

This patch is mostly to warn in the logs instead of failing loudly, as the error can be handled with Scrapy's existing API (errbacks, spider middleware's `process_spider_exception`).
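To illustrate the close-on-failure behaviour discussed above, here is a minimal sketch using only the standard library. It is not the code in this patch: `FakePage`, `download` and the logger name are hypothetical stand-ins, and `FakePage` only imitates a browser page rather than using the Playwright API. It shows the general idea of closing the page in the `except` block, logging a warning, and re-raising so the caller (an errback, in Scrapy terms) can still react.

```python
import asyncio
import logging

logger = logging.getLogger("close_on_failure_sketch")  # hypothetical logger name


class FakePage:
    """Hypothetical stand-in for a browser page (not the Playwright API)."""

    def __init__(self) -> None:
        self.closed = False

    async def goto(self, url: str) -> str:
        # Simulate a navigation that does not complete in time.
        raise asyncio.TimeoutError(f"Timed out loading {url}")

    async def close(self) -> None:
        self.closed = True


async def download(page: FakePage, url: str) -> str:
    try:
        return await page.goto(url)
    except Exception as exc:
        # Close the page so it does not linger and consume memory,
        # warn in the logs, then re-raise so the caller can recover.
        if not page.closed:
            logger.warning(
                "Closing page due to failed request: %s (%s)",
                url,
                type(exc).__name__,
            )
            await page.close()
        raise


async def main() -> bool:
    page = FakePage()
    try:
        await download(page, "https://example.org")
    except asyncio.TimeoutError:
        # Recovery would happen here; in Scrapy, an errback plays this role.
        pass
    return page.closed


page_was_closed = asyncio.run(main())
```

After running, `page_was_closed` is `True`: the original exception still reaches the caller, while the page is closed and the closure is reported as a warning rather than an additional loud failure.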
A sample spider: