ODBC Refactor #3

Open
JohnOmernik opened this issue Mar 6, 2024 · 0 comments

Comments

@JohnOmernik
Collaborator

@nmani has pointed out severe limitations in how we handle ODBC connections. Currently we rely solely on ODBC connections defined in the ODBC UI (which we made useful in the bootstrap by templating registry files that can be loaded, taking the configuration work away from the users).

Naveen has suggested making the ODBC configuration both easy and portable (platform independent).

This will require a refactor that will "likely" live here, but regardless of what has to change in which repos, I will use this to walk through some of the challenges.

Here are some high-level items to consider in this project.

jupyter_integration_base


  • Instances
    • Instances are connections to various clusters.
      • Consider you may have an oracle integration that connects to multiple oracle clusters.
      • Currently, Instances for jupyter_pyodbc based integrations (tera, oracle, impala, hive etc) have an optional argument for dsn
        • For most windows environments, this allows us to use built in authentication, or pass passwords at connect
        • Password handling: this is an important part. Passwords, or Windows auth when available, need to be seamless for the users.
        • Password (and OTP) prompts are part of jupyter_integration_base. We need to be able to ask for passwords/OTP in integration_base and pass them to the underlying instance (or flag when built-in auth is in use).
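
To make the password/OTP flow above concrete, here is a minimal sketch of how integration_base might decide what to ask for before handing off to the underlying instance. Everything named here (`Credentials`, `collect_credentials`, and the `builtin_auth` / `needs_otp` instance keys) is hypothetical and does not exist in jupyter_integration_base today.

```python
from dataclasses import dataclass
from getpass import getpass
from typing import Callable, Optional


@dataclass
class Credentials:
    """What integration_base would hand to the underlying instance (hypothetical)."""
    builtin_auth: bool
    password: Optional[str] = None
    otp: Optional[str] = None


def collect_credentials(instance: dict, ask: Callable[[str], str] = getpass) -> Credentials:
    """Prompt only for what the instance needs.

    `instance` is an assumed dict with optional keys:
      name         - label shown in the prompt
      builtin_auth - True when the platform's built-in auth should be used
      needs_otp    - True when an OTP must be collected as well
    """
    name = instance.get("name", "instance")
    if instance.get("builtin_auth", False):
        # Nothing to ask: just flag that built-in auth is in use.
        return Credentials(builtin_auth=True)
    password = ask(f"Password for {name}: ")
    otp = ask(f"OTP for {name}: ") if instance.get("needs_otp", False) else None
    return Credentials(builtin_auth=False, password=password, otp=otp)
```

Passing the prompt function in as `ask` keeps the logic testable without a terminal, and would let the prompt be swapped for a notebook widget later.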

jupyter_pyodbc


  • This is the base class that multiple other ODBC-based integrations build on.
  • Most other ODBC classes are just wrappers for this, although some have some custom code. From here on, when I say jupyter_pyodbc I mean "jupyter_pyodbc and child integrations".
  • jupyter_pyodbc currently uses DSNs through the instance argument ?dsn=JUPINSTNACE
  • The reason I used this approach (a full DSN vs. options in pyodbc) is that it was not clear to me how to set connection options generically across the various ODBC drivers. So instead, I would create an ODBC DSN in the UI and then copy it out of my registry. This allowed me to set performance items that weren't (apparently) exposed in pyodbc, such as the Teradata settings for strings vs. integers. We need this ability in whatever refactor we do: the ability to set EVERY argument of an ODBC connection, set once (in a YAML file or something) in the bootstrap by the admin and applied to all users.
  • One issue with this was that some items depended on the user running it. The first and most obvious is the username. The DSN had my username in it, which is why I created a templated DSN that the bootstrap fills in with the current user. Another is the Teradata driver version, which we have to detect from the user's install.
  • Having YAML definitions that the Instances can reference is probably good.
  • Reg files are NOT portable: they are Windows only, and they assume people can run reg files directly.
  • I am hesitant to define an instance with EVERY option available in the reg file. We may need to put YAML files in the user's profile under .ipython/integrations. We could create a new folder for ODBC connections. This should be platform independent.
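
As a rough illustration of the YAML idea above, the sketch below turns an already-parsed definition (a plain dict, as a YAML loader would return) into an ODBC connection string, and resolves a per-user folder under .ipython/integrations. The function names, the `odbc` subfolder name, and the example keys are assumptions for illustration only. Real option names are driver specific.

```python
from pathlib import Path
from typing import Optional


def odbc_config_dir(home: Optional[Path] = None) -> Path:
    """Per-user folder for ODBC definitions (the 'odbc' folder name is an assumption)."""
    base = home if home is not None else Path.home()
    return base / ".ipython" / "integrations" / "odbc"


def build_connection_string(options: dict) -> str:
    """Join every key/value pair into 'KEY=value;KEY=value'.

    Values containing ';' are wrapped in braces (with any '}' doubled) unless
    they are already brace-wrapped. The result is the kind of string that
    could be passed to pyodbc.connect() instead of a DSN name.
    """
    parts = []
    for key, value in options.items():
        text = str(value)
        already_braced = text.startswith("{") and text.endswith("}")
        if ";" in text and not already_braced:
            text = "{" + text.replace("}", "}}") + "}"
        parts.append(f"{key}={text}")
    return ";".join(parts)


# Example definition, as it might look after loading an admin-provided YAML file.
example = {"DRIVER": "{Example Driver}", "SERVER": "host.example", "UID": "alice"}
conn_str = build_connection_string(example)
```

Because every key in the definition is passed through untouched, an admin could set any driver option once without a DSN or reg file being involved.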

jupyter_integrations_bootstrap


  • Currently this is where an org can define reg files.
  • We do some templating here for the reg files.
  • That could likely be moved over to the YAML or whatever is defined.
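
The templating could move over roughly like this: a minimal sketch that fills per-user placeholders into an org-wide definition using only the standard library. The `${username}` and `${driver_version}` placeholder names and the `render_definition` helper are hypothetical, and how the driver version gets detected is left out.

```python
import getpass
from string import Template
from typing import Optional


def render_definition(template: dict, context: Optional[dict] = None) -> dict:
    """Substitute ${placeholders} in every value of an org-wide definition.

    `context` carries per-user values (for example a detected driver version).
    If no username is supplied, the current OS user is used. Unknown
    placeholders are left untouched rather than raising.
    """
    values = dict(context or {})
    if "username" not in values:
        values["username"] = getpass.getuser()
    return {
        key: Template(str(value)).safe_substitute(values)
        for key, value in template.items()
    }


# Org-wide template as an admin might define it once in the bootstrap.
org_template = {"DRIVER": "{Example Driver ${driver_version}}", "UID": "${username}"}
rendered = render_definition(org_template, {"username": "alice", "driver_version": "1.0"})
```

The same rendering step would work on any platform, which is the point of moving away from templated reg files.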

I am trying to lay all of this out here so we can refactor all three repos (or more) in a way that makes sense and ends up platform independent.
