PyAthena - Memory Issue #417
Comments
@laughingman7743 Let me know if you have an idea. Here are some of the things we talked about in the last issue, #416. To reproduce the problem, run any query repeatedly in a Docker container: memory grows by about 0.1 MB per query and is never reclaimed. |
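A minimal sketch of that kind of reproduction loop, using only the standard library. `leaky_query` and `clean_query` are hypothetical stand-ins for a real query call, not PyAthena code, and `measure_growth` is an illustrative helper. Note that `tracemalloc` only sees allocations made through Python's allocator, so it can differ from the container-level memory figure reported above.

```python
import gc
import tracemalloc

_leak = []  # hypothetical: simulates internal state that is never released


def leaky_query():
    """Hypothetical stand-in for a query call that retains ~100 KB per call."""
    _leak.append(bytearray(100_000))


def clean_query():
    """Hypothetical stand-in for a query call that releases everything."""
    data = bytearray(100_000)
    return len(data)


def measure_growth(fn, iterations=20):
    """Call fn repeatedly and record the traced memory still allocated after each call."""
    tracemalloc.start()
    try:
        samples = []
        for _ in range(iterations):
            fn()
            gc.collect()  # rule out garbage that is merely waiting for collection
            current, _peak = tracemalloc.get_traced_memory()
            samples.append(current)
        return samples
    finally:
        tracemalloc.stop()


if __name__ == "__main__":
    for fn in (clean_query, leaky_query):
        samples = measure_growth(fn)
        print(fn.__name__, "growth in bytes:", samples[-1] - samples[0])
```

If the real query call behaves like `leaky_query`, the last sample keeps climbing with the iteration count even though `gc.collect()` runs after every call.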
Sorry, I don't know, but there must be a memory leak somewhere. |
Yeah, no problem. That's what I found as well.
I would be happy to resolve it, but I can't get the hang of the library
internals.
On Tue, 14 Mar 2023 at 11:36, laughingman7743 ***@***.***> wrote:
Sorry, I don't know, but there must be a memory leak somewhere.
|
When using the unload option, the read_csv method is not used. I am wondering if the same memory leakage occurs in that case. |
I'm actually using the unload option currently.
I tried it without unload before, but with no success.
On Tue, 14 Mar 2023 at 11:45, laughingman7743 ***@***.***> wrote:
When using the unload option, the read_csv method is not used. I am
wondering if the same memory leakage occurs in that case.
|
Hi, I think I'm experiencing the same or a similar issue: creating new PandasCursors and running into a memory leak. I've been using objgraph to diagnose it, and I think it has something to do with this loop and the S3FileSystem. The pandas cursor creates an AthenaPandasResultSet, which creates an S3FileSystem, and then there is something to do with the AbstractFileSystem in fsspec? I'm not an expert in this sort of thing, but hopefully it helps. |
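For anyone without objgraph installed, here is a rough standard-library approximation of the kind of per-type growth report it gives. The helper names (`type_counts`, `growth_during`) and the `Leaky` class are hypothetical and only illustrate the technique; this is a sketch, not the diagnosis used above.

```python
import gc
from collections import Counter


def type_counts():
    """Count objects tracked by the garbage collector, grouped by type name."""
    gc.collect()
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth_during(fn):
    """Run fn and return the per-type increase in tracked object counts."""
    before = type_counts()
    fn()
    after = type_counts()
    # Counter subtraction keeps only the positive differences.
    return after - before


class Leaky:
    """Hypothetical object that something keeps alive by accident."""


_retained = []  # hypothetical hidden reference holder


def create_and_retain():
    _retained.extend(Leaky() for _ in range(100))


if __name__ == "__main__":
    for name, count in growth_during(create_and_retain).most_common(5):
        print(name, count)
```

Wrapping the cursor creation and query call in `growth_during` would show which object types accumulate between calls, which is a starting point for tracing who holds the references.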
As you may remember from the last request, we execute queries continually, around 1000 per hour, and memory keeps growing with no way to stop it.
This happens with a PandasCursor. I thought the solution was to use chunksize, but that wasn't the issue. Memory still grows by 0.1 MB per query, and since we have a daemon thread that runs 24/7, it eventually exceeds the available memory and is never reclaimed.
I tried manually calling gc.collect() and deleting the DataFrame object, but after 16 hours of investigation it looks like something internal to the library is holding on to the memory, and I'm out of ideas for now. I'm looking for suggestions on how to resolve this issue. Thanks.