ERROR : "('HY000', 'The driver did not supply an error!')" or "The incoming request has too many parameters" when trying to push a df to a MS SQL database using the to_sql() method with argument method='multi' #38541

Closed
AntoineWeber opened this issue Dec 17, 2020 · 2 comments

AntoineWeber commented Dec 17, 2020

  • I have checked that this issue has not already been reported.

  • [Yes, it was already opened here; however, that issue was closed even though the problem persists.]

  • I have confirmed this bug exists on the latest version of pandas.

  • [Yes, currently version 1.1.5]

  • (optional) I have confirmed this bug exists on the master branch of pandas.

  • [No, did not check this.]


Problem description

I am currently trying to push a dataframe to a MS SQL database using pandas' to_sql() method (via sqlalchemy and pyodbc). My dataframe is rather large, with a 14k x 72 shape and various dtypes (strings, floats, integers).

When pushing the dataframe without any arguments to to_sql() (except the table name and DB connection), the upload is extremely slow (> 30 min for this ~5 MB dataframe).
To try to speed this up, I came across the method= argument, which I set to 'multi'. Without specifying a chunksize, I get the error
('HY000', 'The driver did not supply an error!')

So I then tried to set a chunksize, but whenever it is larger than ~20, I get the error The incoming request has too many parameters. The server supports a maximum of 2100 parameters. Reduce the number of parameters and resend the request.
Setting the chunksize to 10 works, but the upload takes ~3 min, which is still too slow.
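
(As a rough back-of-the-envelope sketch, assuming method='multi' emits one bind parameter per cell of each chunk, SQL Server's 2100-parameter cap would explain why only small chunk sizes work for a 72-column frame:)

# Rough sketch (assumption: one bind parameter per column per row in a
# multi-row INSERT; SQL Server caps a single statement at 2100 parameters).
n_columns = 72
max_chunksize = 2100 // n_columns
print(max_chunksize)  # -> 29, so chunksizes above ~29 exceed the limit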

From my understanding, such an upload should normally be doable in a few seconds.

Am I the only one having this issue?

Sample code

import pandas as pd
import sqlalchemy

# connection to the db ('&' separates the query parameters; spaces in the
# driver name are written as '+' in the URL)
engine = sqlalchemy.create_engine(
    "mssql+pyodbc://{},{}/{}?driver=SQL+Server&Trusted_Connection=yes"
    .format(server_ip, port_number, db_name))
connection = engine.connect()

# read the df
df_raw = pd.read_csv(files_path, index_col=0, low_memory=False)

# push to database
with connection.begin():
    df_raw.to_sql('RawTestData', con=connection, if_exists='replace',
                  index=False, chunksize=chunk_size, method='multi')

Expected Output

Using the method='multi' argument should speed up the upload of the data; I would guess it should complete in a few seconds at most.

Output of pd.show_versions()


INSTALLED VERSIONS

commit : b5958ee
python : 3.8.5.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : French_Switzerland.1252

pandas : 1.1.5
numpy : 1.19.2
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.4
setuptools : 50.3.1.post20201107
Cython : 0.29.21
pytest : 6.1.1
hypothesis : None
sphinx : 3.2.1
blosc : None
feather : None
xlsxwriter : 1.3.7
lxml.etree : 4.6.1
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.19.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : 1.3.2
fsspec : 0.8.3
fastparquet : None
gcsfs : None
matplotlib : 3.3.2
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.5
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.5.2
sqlalchemy : 1.3.20
tables : 3.6.1

AntoineWeber added the Bug and Needs Triage labels on Dec 17, 2020
Liam3851 (Contributor) commented Dec 21, 2020

If you want fast uploading to SQL Server, you need to use a more modern ODBC driver, like ODBC Driver 17 for SQL Server, rather than "SQL Server" which I see in your connection string (which was the driver for SQL Server 2008R2 and earlier). You will also want to set fast_executemany=True in your connection string. See https://docs.sqlalchemy.org/en/13/dialects/mssql.html#fast-executemany-mode.

(Edit: Do note these settings apply to any use of SQLAlchemy with SQL Server, and not pandas to_sql specifically).
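
(A minimal sketch of that suggestion, reusing the server_ip/port_number/db_name placeholders from the snippet above; fast_executemany is passed to create_engine as described in the linked SQLAlchemy docs:)

import sqlalchemy

# Sketch: modern ODBC driver name in the URL and fast_executemany enabled
# on the pyodbc dialect (SQLAlchemy 1.3+).
engine = sqlalchemy.create_engine(
    "mssql+pyodbc://{},{}/{}?driver=ODBC+Driver+17+for+SQL+Server"
    "&Trusted_Connection=yes".format(server_ip, port_number, db_name),
    fast_executemany=True,
)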

AntoineWeber (Author) commented

Hey, thanks for your answer. I actually installed exactly that driver this morning (ODBC Driver 17 for SQL Server) to enable easier dockerized use.
To make it faster, I indeed also added fast_executemany=True, which at first did not speed anything up. To end up with a substantial speed boost, I also had to remove method='multi' from the to_sql() call. Now I can set a much bigger chunksize (currently 1000), and a dataframe of 14k rows and 72 columns is pushed in under 9 seconds.
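
(A minimal sketch of that working setup, reusing connection and df_raw from the original snippet and assuming the fast_executemany engine shown above:)

# Sketch: default to_sql insert method (no method='multi') with a larger
# chunksize; the speed-up comes from fast_executemany on the engine.
with connection.begin():
    df_raw.to_sql('RawTestData', con=connection, if_exists='replace',
                  index=False, chunksize=1000)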

Thank you for the insight. I can close the issue.
