Requests not being garbage collected #23
Comments
Hmm, thanks. The number of HtmlResponse objects in memory is as expected, so I can't quite pin down what is happening. I don't have the playwright_include_page meta key set, so that should take care of itself. I guess I'll have to start over from scratch with an example spider and see if the problem persists. It also looks like only one page is being created and deleted; the page count in the log never goes above 1.
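For reference, a minimal sketch of inspecting those live-object counts with Scrapy's trackref utilities (the same interface the telnet console exposes as prefs()); the wrapper function name is just illustrative:

```python
from scrapy.utils.trackref import get_oldest, iter_all, print_live_refs


def dump_live_objects():
    # Table of live object counts per tracked class (Request, Response, Item, ...).
    print_live_refs()

    # The oldest live Request is usually the best starting point for a leak hunt.
    oldest = get_oldest('Request')
    if oldest is not None:
        print('oldest live request:', oldest.url)

    # Every live Request, to see which URLs are being retained.
    for request in iter_all('Request'):
        print(request.url)
```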
Could you try the code from this commit (c4c0bd6) and see if the reference count drops?
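For context, a hypothetical illustration of the kind of lingering reference such a change typically removes (this is not the code from the commit): a callback registered on a long-lived object, such as a page or browser context, that closes over the Request keeps every Request alive until the callback is dropped.

```python
# Hypothetical illustration only, not scrapy-playwright code: a long-lived
# emitter (think: a Playwright page or context) holds callbacks, and any
# callback that closes over a scrapy Request keeps that Request reachable.
class Emitter:
    def __init__(self):
        self._handlers = []

    def on(self, handler):
        # The handler, and everything it captures, lives as long as the emitter.
        self._handlers.append(handler)

    def off(self, handler):
        # Dropping the handler releases whatever it captured.
        self._handlers.remove(handler)


def leaky(emitter, request):
    # The closure captures the whole Request object.
    def on_event(event):
        print("event for", request.url)
    emitter.on(on_event)


def fixed(emitter, request):
    # Capture only the data actually needed (or call emitter.off(...) once the
    # request is finished), so the Request itself can be collected.
    url = request.url
    def on_event(event):
        print("event for", url)
    emitter.on(on_event)
```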
That cleared it right up! Thank you! I was going mad trying to debug with the garbage collection interface and trackref but didn't get anywhere; at least I learned a bit more about the inner workings of Python.
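For anyone retracing that debugging path, a minimal sketch of combining the gc interface with trackref to see what is still holding on to a request; forcing a collection first means only genuinely reachable objects remain:

```python
import gc

from scrapy.utils.trackref import get_oldest

gc.collect()                      # clear any unreachable reference cycles first
request = get_oldest('Request')   # oldest scrapy.Request still alive, if any
if request is not None:
    for referrer in gc.get_referrers(request):
        # Each referrer is a container (dict, list, frame, ...) that keeps the
        # request reachable; its type and contents usually point at the culprit.
        print(type(referrer), repr(referrer)[:120])
```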
@elacuesta Thanks for the fix. I've had the same issue for several months now; I made a workaround by killing the whole Scrapy process after 1k pages to avoid OOM. Reading the code, it's not obvious to me why passing a Scrapy request object as an argument to …
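For comparison, a sketch of doing that periodic restart with built-in Scrapy settings instead of killing the process externally (the spider name, URL, and JOBDIR path are placeholders): CLOSESPIDER_PAGECOUNT closes the spider cleanly after a number of responses, and JOBDIR persists the scheduler state so the next run can resume.

```python
import scrapy


class RestartableSpider(scrapy.Spider):
    name = "restartable"                    # placeholder name
    start_urls = ["https://example.com"]    # placeholder URL

    custom_settings = {
        # Close the spider cleanly after ~1000 responses instead of letting
        # the process grow until it runs out of memory.
        "CLOSESPIDER_PAGECOUNT": 1000,
        # Persist pending requests so a fresh process resumes where the
        # previous one stopped.
        "JOBDIR": "crawls/restartable-job",
    }

    def parse(self, response):
        yield {"url": response.url}
```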
Glad to know it worked! I'll be merging this update into the main branch then.
Released as v0.0.5, thanks for the report!
I've got a problem where my Request objects are not being garbage collected; they pile up until memory runs out. I've checked this with trackref and objgraph, and it looks to me like something in ScrapyPlaywrightDownloadHandler is keeping a reference to all of the Requests. Attached is the objgraph output for the first Request after a few have been handled.
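For anyone reproducing this, a minimal sketch of how such a back-reference graph can be generated, assuming objgraph and Graphviz are installed; the output filename is arbitrary:

```python
import objgraph

from scrapy.utils.trackref import get_oldest

request = get_oldest('Request')   # oldest live scrapy.Request, per trackref
if request is not None:
    # Render the chain of objects keeping this Request alive; whatever holds
    # the reference (e.g. the download handler) shows up in the graph.
    objgraph.show_backrefs([request], max_depth=5, filename='request_backrefs.png')
```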