This repository has been archived by the owner on Jul 19, 2018. It is now read-only.

changed: Updated HcfMiddleware. Fixes #56 #57

Closed
wants to merge 1 commit into from

Conversation

starrify
Member

No description provided.

def _save_new_links_count(self):
    """ Save the new extracted links into the HCF."""
    for slot, new_links in self.new_links.items():
        self._msg('Stored %d new links in slot(%s)' % (len(new_links), slot))
Contributor

Using len is the same as storing the counts yourself.

Member Author

The purpose of storing the counts myself (instead of using len) is to avoid keeping 100 million strings in new_links, which is what happens in my case. Now, with hcf_dont_filter set to True in request.meta, new_links can stay empty and that memory is saved. :)
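
A minimal sketch of the idea, not the actual middleware code: the `LinkTracker` class, its `record` method, and the `new_links_count` attribute are hypothetical names used only for illustration, and the exact handling of `hcf_dont_filter` here is an assumption. It shows a per-slot integer counter replacing `len(new_links)`, so nothing has to be held in memory when filtering is skipped.

```python
from collections import defaultdict


class LinkTracker:
    """Hypothetical stand-in for the middleware's bookkeeping (not scrapylib's API)."""

    def __init__(self):
        # slot -> set of link fingerprints, needed only for duplicate filtering
        self.new_links = defaultdict(set)
        # slot -> number of links stored, kept independently of new_links
        self.new_links_count = defaultdict(int)

    def record(self, slot, fingerprint, meta):
        """Return True if the link should be stored, False if it is a duplicate."""
        # Assumption: hcf_dont_filter=True means "skip the in-memory dedup set".
        if not meta.get('hcf_dont_filter', False):
            if fingerprint in self.new_links[slot]:
                return False
            self.new_links[slot].add(fingerprint)
        # The counter grows either way, so reporting never needs len(new_links).
        self.new_links_count[slot] += 1
        return True
```

With `hcf_dont_filter` set, the counter still reflects every stored link while `new_links` holds nothing for that slot; without it, the set is still used to drop duplicates.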

Contributor

But where are you cleaning new_links? I'm confused :P

Member Author

Previously, new_links was cleaned in close_spider. However, I do not think that is necessary: since the spider is closing, everything here will be cleaned up once the HcfMiddleware instance is garbage-collected.

@nyov
Contributor

nyov commented Jul 15, 2016

What's the verdict on this one?
Is the hcf middleware still in use? Should this be merged before the project gets broken up?

@redapple
Contributor

redapple commented Nov 7, 2016

@starrify, is the issue and fix still valid?
If so, can you re-open it against https://github.com/scrapy-plugins/scrapy-hcf ? Thanks.
We are deprecating scrapylib so I'm closing the issue.

redapple closed this Nov 7, 2016