ERROR: "('HY000', 'The driver did not supply an error!')" or "The incoming request has too many parameters" when trying to push a df to a MS SQL database using the to_sql() method with argument method='multi'
#38541
Closed
AntoineWeber opened this issue
Dec 17, 2020
· 2 comments
I have checked that this issue has not already been reported.
[Yes, it was already opened here. However, the issue was closed even though the problem persists.]
I have confirmed this bug exists on the latest version of pandas.
[Yes, currently version 1.1.5]
(optional) I have confirmed this bug exists on the master branch of pandas.
[No, did not check this.]
Problem description
I am currently trying to push a dataframe to a MS SQL database with pandas' to_sql() method (via SQLAlchemy and pyodbc). My dataframe is rather large, with a 14k x 72 shape and various dtypes (strings, floats, integers).
When pushing the dataframe with no arguments to to_sql() other than the table name and DB connection, the upload is extremely slow (> 30 min for this ~5 MB dataframe).
To speed this up, I came across the method= argument, which I set to 'multi'. Without specifying a chunksize, I get the error ('HY000', 'The driver did not supply an error!').
So I then tried setting a chunksize, but whenever it is larger than ~20, I get the error The incoming request has too many parameters. The server supports a maximum of 2100 parameters. Reduce the number of parameters and resend the request.
Setting the chunksize to 10 works, but the upload takes ~3 min, which is still too slow.
From my understanding, such an upload should normally be doable in a few seconds.
Am I the only one having this issue?
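As a rough sanity check on those numbers: with method='multi', each row of the generated INSERT binds one parameter per column, so the 2100-parameter cap from the error message bounds the usable chunksize. A minimal back-of-the-envelope sketch (the 72-column count comes from the dataframe above, the 2100 limit from the error message):
# chunksize * n_columns must stay under SQL Server's 2100-parameter cap
n_columns = 72
param_limit = 2100
max_chunksize = param_limit // n_columns
print(max_chunksize)  # 29 -- in line with the ~20 threshold observed above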
Sample code
import pandas as pd
import sqlalchemy

# connection to the db (note: the second '?' in the original string should be
# '&', and the driver name should be URL-encoded)
engine = sqlalchemy.create_engine(
    "mssql+pyodbc://{},{}/{}?driver=SQL+Server&Trusted_Connection=yes".format(
        server_ip, port_number, db_name))
connection = engine.connect()

# read the df
df_raw = pd.read_csv(files_path, index_col=0, low_memory=False)

# push to database inside a transaction
with connection.begin():
    df_raw.to_sql('RawTestData', con=connection, if_exists='replace',
                  index=False, chunksize=chunk_size, method='multi')
Expected Output
Using the method='multi' argument should speed up the upload; I would expect it to take a few seconds at most.
Output of pd.show_versions()

INSTALLED VERSIONS
commit : b5958ee
python : 3.8.5.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : French_Switzerland.1252
pandas : 1.1.5
numpy : 1.19.2
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.4
setuptools : 50.3.1.post20201107
Cython : 0.29.21
pytest : 6.1.1
hypothesis : None
sphinx : 3.2.1
blosc : None
feather : None
xlsxwriter : 1.3.7
lxml.etree : 4.6.1
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.19.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : 1.3.2
fsspec : 0.8.3
fastparquet : None
gcsfs : None
matplotlib : 3.3.2
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.5
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.5.2
sqlalchemy : 1.3.20
tables : 3.6.1
If you want fast uploads to SQL Server, you need to use a more modern ODBC driver, such as ODBC Driver 17 for SQL Server, rather than the "SQL Server" driver I see in your connection string (which shipped for SQL Server 2008 R2 and earlier). You will also want to pass fast_executemany=True when creating the engine. See https://docs.sqlalchemy.org/en/13/dialects/mssql.html#fast-executemany-mode.
(Edit: note that these settings apply to any use of SQLAlchemy with SQL Server, not to pandas to_sql specifically.)
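A minimal sketch of that setup, with hypothetical server, port, and database names:
import sqlalchemy

# Placeholder connection details; ODBC Driver 17 replaces the legacy
# "SQL Server" driver, and fast_executemany=True turns on pyodbc's
# batched parameter binding for executemany calls.
engine = sqlalchemy.create_engine(
    "mssql+pyodbc://my_server,1433/my_db"
    "?driver=ODBC+Driver+17+for+SQL+Server&Trusted_Connection=yes",
    fast_executemany=True,
)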
Hey, thanks for your answer. I actually installed exactly that driver this morning (ODBC Driver 17 for SQL Server) to enable easier dockerized use.
To make it faster, I did add fast_executemany=True, which at first did not accelerate anything. To get a substantial speed boost, I also had to remove method='multi' from the to_sql() call. Now I can set a much bigger chunksize (currently 1000), and a dataframe of 14k rows and 72 columns is pushed in under 9 seconds.
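For reference, a sketch of the call that ended up working (same placeholder names as the sample code above; the 1000-row chunksize is the one mentioned here):
# With fast_executemany=True set on the engine, the default insert path
# batches rows client-side, so method='multi' is no longer needed.
df_raw.to_sql('RawTestData', con=connection, if_exists='replace',
              index=False, chunksize=1000)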
Thank you for the insight. I can close the issue.