Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

problem with Exasol driver's WebSocket connection cleanup process #390

Closed
wavewater opened this issue Oct 26, 2023 · 9 comments · Fixed by #391
Closed

problem with Exasol driver's WebSocket connection cleanup process #390

wavewater opened this issue Oct 26, 2023 · 9 comments · Fixed by #391
Assignees
Labels
bug Unwanted / harmful behavior

Comments

@wavewater
Copy link

wavewater commented Oct 26, 2023

Summary

When invoking the sql-alchemy engine, it returns the correct result (42) but throws an error during the the WebSocket Connection cleanup process.

Reproducing the Issue

Reproducibility: always

Steps to reproduce the behavior:

step 1: create file test_sql_alchemy_websocket.py

# test_sql_alchemy_websocket.py 
from sqlalchemy import create_engine
url = "exa+websocket://user:pw@host:8563/meta?SSLCertificate=SSL_VERIFY_NONE"
engine = create_engine(url)
with engine.connect() as con:
    result = con.execute("select 42").fetchall()
    print(f"{result}")

step 2: execute the script:
python ./resources/scripts/test_sql_alchemy_websocket.py

Expected Behaviour

output should be:

[(42,)]

without any further errors

Actual Behaviour

[(42,)]
Exception ignored in: <function Connection.__del__ at 0x7fe1304df670>
Traceback (most recent call last):
  File "/home/xxxxx/projects/airflow_sbk/.venv/lib/python3.9/site-packages/exasol/driver/websocket/_connection.py", line 151, in __del__
  File "/home/xxxxx/projects/airflow_sbk/.venv/lib/python3.9/site-packages/exasol/driver/websocket/_connection.py", line 127, in close
exasol.driver.websocket._errors.Error: 

Root Cause (optional)

Note that the 42 was returned but there seems a problem with Exasol driver's WebSocket connection cleanup process. Without more descriptive error messages, it's difficult to diagnose the exact issue. This is why I am reaching out to the community for support. Any ideas?

Context

I am simply following the tutorial on sqlalchemy-exasol and wanted to give the new websocket dialect a try because the odbc dialect is a burden. In particular, many open source tooling require a working sqlalchemy connection.

System

CentOS Linux release 7.9.2009 (Core)

python --version
Python 3.9.18
pip list 
Package           Version
----------------- -------
cffi              1.16.0
cryptography      41.0.5
greenlet          3.0.1
packaging         23.2
pip               23.0.1
pyasn1            0.5.0
pycparser         2.21
pyexasol          0.25.2
pyodbc            4.0.39
pyOpenSSL         23.3.0
rsa               4.9
setuptools        58.1.0
SQLAlchemy        1.4.44
sqlalchemy-exasol 4.6.0
websocket-client  1.6.4
@wavewater wavewater added the bug Unwanted / harmful behavior label Oct 26, 2023
@Nicoretti Nicoretti self-assigned this Oct 27, 2023
@Nicoretti
Copy link
Contributor

Hi @wavewater,

thanks for reporting this issue. I will pick up on it as my next task.

best
Nico

@wavewater
Copy link
Author

Hi @wavewater,

thanks for reporting this issue. I will pick up on it as my next task.

best Nico

Thanks! - Joachim

@Nicoretti
Copy link
Contributor

Nicoretti commented Oct 30, 2023

Hi @wavewater (Joachim),

I have analyzed the issue, this is a clear bug. I'll provide a fix during this week.

🗒️ Side Note:
The error is an "additional close" (double close) on the already closed connection when the GC is run. This is not ideal, but on the other hand, all executed and committed queries should not be affected. Additionally, the connections have been closed in an orderly fashion before this error occurs.

best
Nico

@Nicoretti
Copy link
Contributor

Nicoretti commented Oct 31, 2023

Hi @wavewater (Joachim),

I have been digging and it appears that the issue may not be isolated to the SQLAlchemy dialect.
My current investigations indicate that it is a problem existing within pyexasol and maybe even
within the websocket library pyexasol is using.

What do we know so far?

  1. The problem only occurs on process termination when the __del__ methods of pyexasol, websocket-client etc. are invoked.

  2. This is also a problem when a user expects pyexasol to clean up the resources on shutdown
    The pyexasol example below, has the exact same problem, except that the failure is hidden from the user.

    import pyexasol
    c = pyexasol.connect(
        dsn="127.0.0.1:8888", user="sys", password="exasol", socket_timeout=None,
        # don't fall back to no encryption for unknown certificates
        websocket_sslopt= {"cert_reqs": ssl.CERT_NONE}
    )
    result = c.execute("SELECT 1;")
    print(f"{result.fetchall()}")
  3. Checking the underlying resource(s) before calling close does not seem to be sufficient in case of program termination. This may is due to

    • Invalid usage of the underlying API
    • An bug in the underlying implementation(s)/layers
    • Destruction order and/or guarantees/assumptions regarding process cleanup

What are partial mitigation(s) from a SQLA Dialect users perspective?

Redirect the error message

python ./resources/scripts/test_sql_alchemy_websocket.py 2> /dev/null

Should yield:

[(42,)]

⚠️ Attention ⚠️ :
This only should be fine as long as the connection has been closed properly, during program execution. This can be achieved by explicitly calling .close() on the connection or by using it in the context of a with statement e.g. with engine.connect() as con:.

How we intend to follow up on this?

  1. We will improve the checks before closing the underlying connection in the websocket based SQLA Dialect
    🗒️ Note: This won't solve the issue at hand, but will handle other edge cases which could be a problem in the future
  2. We will create an issue in the pyexasol project for further investigations.
    🗒️ Note: I will link the associated issue here, once it has been created.

Related Issue

@tkilias
Copy link
Collaborator

tkilias commented Nov 1, 2023

@Nicoretti besides the fix in pyexasol, we should also assign None to self._connection after

@Nicoretti
Copy link
Contributor

@Nicoretti besides the fix in pyexasol, we should also assign None to self._connection after

Agreed, also I think we should add an additional check for self._connection.is_closed beforehand, to ensure the we only call close on open connections (How we intend to follow up on this [1.]).

@Nicoretti
Copy link
Contributor

Nicoretti commented Nov 3, 2023

@wavewater (Joachim)

We have conducted further investigations and have come to the conclusion that the problem lies within the libraries on which this project is based. In order to enhance the situation for the users of this library, we have modified the code to handle the problem more gracefully, although it still persists. Additionally, it remains unclear how long it will take to address this issue.

That being said, we have also assessed the impact of the issue and have reached the conclusion that its overall impact should be minor (for details, see the 'Context' section here). Shortly after this pull request is merged, we will create a new release of sqlalchemy-exasol. So you can expect a release either today or early next week.

Nicoretti added a commit that referenced this issue Nov 3, 2023
--------
Co-authored-by: Christoph Pirkl <[email protected]>
@Nicoretti
Copy link
Contributor

@wavewater we just released a new version of sqlalchemy-exasol

@wavewater
Copy link
Author

@Nicoretti Thanks. I upated via pip install --upgrade sqlalchemy-exasol, And now it works without the error. Perfect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Unwanted / harmful behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants