Memory leak when using fast_executemany #854

Closed
tjosgood opened this issue Jan 15, 2021 · 10 comments

@tjosgood

Environment

Python: Python 3.9.0 (tags/v3.9.0:9cf6752, Oct 5 2020, 15:34:40) [MSC v.1927 64 bit (AMD64)] on win32
pyodbc: 4.0.30
OS: Windows 10
DB: MS SQL 15.0.2
Driver: ODBC Driver 17 for SQL Server

Issue

The minimal example below inserts 500,000 rows of roughly 4 KB each (a 2000-character nvarchar value, i.e. 2 bytes per character).

With fast_executemany disabled, this example script maintains constant memory use, as I expect. With it enabled, memory use gradually grows to the size of the inserted data, even though no references to that data are kept.

I attempted to use tools like dozer to identify the leaking objects, but the leak seems to happen outside Python, as I cannot see a growing number of objects.

import pyodbc

conn = pyodbc.connect(driver='{ODBC Driver 17 for SQL Server}',
                      server=r'localhost\SQLEXPRESS',  # raw string so the backslash is not treated as an escape
                      database='fifi',
                      uid='local', pwd='123')

insert_to_tmp_tbl_stmt = "INSERT INTO simpledata VALUES (?)"
cursor = conn.cursor()
cursor.fast_executemany = True  # change to False to see the constant-memory behaviour

cursor.execute("CREATE TABLE [simpledata]([c1] [nvarchar](max) NULL)")
cursor.execute("TRUNCATE TABLE [simpledata]")
cursor.commit()

# this leaks memory like crazy!
for i in range(100):
    cursor.executemany(insert_to_tmp_tbl_stmt, [("0" * 2000,)] * 5000)

cursor.commit()
cursor.close()
conn.close()  # memory use remains high even after closing the connection

Expected Behaviour

Rows are inserted and memory use remains constant, as is seen when running the example with fast_executemany = False.
[Image 1: memory usage graph showing flat, constant memory use for the duration of the run]

Actual Behaviour

Rows are inserted and memory use climbs rapidly; even after the connection is closed, memory use remains high. This is seen only when fast_executemany = True, and this setting is the only change between the two examples.

You can see at the end that about 2 GB of memory is consumed, roughly the amount of data generated for the insert (500,000 rows × ~4 KB).

[Image 2: memory usage graph climbing to roughly 2 GB and remaining high after the connection is closed]

@v-makouz
Contributor

Looks like the parameter arrays aren't being deallocated. I'll see if I can reproduce this.

@v-chojas
Contributor

Possibly related: #802

@tjosgood
Author

Yes, I tried varchar(2000) instead of varchar(max) and it worked as expected, so this does seem to be related.
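
For reference, a minimal sketch of that non-leaking variant; the only change versus the repro script above is the column definition, everything else is assumed unchanged:

# bounded column instead of nvarchar(max); the rest of the repro script is the same
cursor.execute("CREATE TABLE [simpledata]([c1] [nvarchar](2000) NULL)")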

@tjosgood
Author

A few more insights:

  1. Tested my example code on Ubuntu 20.04, and the issue does not occur.
  2. Tested the fix branch mentioned in "Increasing memory consumption with fast_executemany and varchar(max)" #802, and it did not solve the problem for this example.

@tjosgood
Author

I also found that calling cursor.setinputsizes([(pyodbc.SQL_WVARCHAR, 2000, 0)]) before the insert makes the issue go away. However, this still doesn't let me use fast_executemany with strings longer than 4000 characters, which is what I would like to do. A sketch of the workaround follows.
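
A minimal sketch of that workaround applied to the repro script above; the size 2000 matches the test data and is an assumption to adjust for your own column (keeping it at or below 4000):

cursor = conn.cursor()
cursor.fast_executemany = True
# declare the parameter as a bounded SQL_WVARCHAR so the driver binds it
# as nvarchar(2000) rather than streaming it as nvarchar(max)
cursor.setinputsizes([(pyodbc.SQL_WVARCHAR, 2000, 0)])
cursor.executemany(insert_to_tmp_tbl_stmt, [("0" * 2000,)] * 5000)
cursor.commit()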

@gordthompson
Collaborator

Yes, I can confirm that the test code in the original post leaks memory on Windows 8.1 but not on Ubuntu 18.04.

However, on Windows, the patch for #832 fixed the leak for me:

pip install git+https://github.com/Mizaro/pyodbc@bugfix/SQLPutData_UnicodeToBytes_MemoryLeak

@tjosgood
Author

Thanks for confirming. I had another go with #832 and can confirm it fixes the issue for me; I must have used pip incorrectly the first time round :D

That is good news, then: there is a working fix, and the only question is when it will be merged! Thanks for the fix!

@gordthompson
Collaborator

> the only question is when it will be merged

I suppose that's up to @mkleehammer

@erykwho

erykwho commented May 2, 2021

I had the same problem here and can confirm that #832 solved it. But it is not released yet. Do we have any idea when this is going to be released?

@gordthompson gordthompson added this to the 4.0.31 milestone Jun 12, 2021
@gordthompson
Collaborator
