Skip to content
This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Add support for ODBC #829

Closed
jorgecarleitao opened this issue Feb 11, 2022 · 9 comments
Closed

Add support for ODBC #829

jorgecarleitao opened this issue Feb 11, 2022 · 9 comments
Labels
investigation Issues or PRs that are investigations. Prs may or may not be merged. no-changelog Issues whose changes are covered by a PR and thus should not be shown in the changelog

Comments

@jorgecarleitao
Copy link
Owner

@pacman82 wrote this amazing library https://github.com/pacman82/odbc-api that we should consider depending on for reading and writing from/to ODBC.

imo ODBC is quite central in data, which is why imo I think "this must live".

Question: should this live in arrow2 as optional (e.g. src/io/odbc/) or somewhere else?

Any thoughts @ritchie46 , @houqp , @pacman82, @sundy-li , @Dandandan?

@jorgecarleitao jorgecarleitao added the investigation Issues or PRs that are investigations. Prs may or may not be merged. label Feb 11, 2022
@pacman82
Copy link
Collaborator

pacman82 commented Feb 11, 2022

Hi thanks @jorgecarleitao for promoting my library. I think the maintainers of this library a probably qualified best to decide wether or not an optional dependency is suitable here. I think the most important impact of this decision is wether or not we want the ODBC glue code to to share a CI and release cycle with the rest of arrow2.

I am happy to help out either way.

@pacman82
Copy link
Collaborator

Well, at least my naming scheme would not do well for an individual crate. arrow22odbc is just silly 😅 .

@pacman82
Copy link
Collaborator

Just realized that I already renamed arrow2odbc into arrow-odbc. Seeing that now an arrow2 exists, I am really happy about doing that.

@ritchie46
Copy link
Collaborator

I do think there is some benefit in maintaining the dependency optionally in arrow2. I have heard several cases of users who use connector-x + polars rust don't realize that both should point to exactly the same arrow version to work. It is very easy to get out of sync unless we make sure that we distribute such a dependency through arrow.

@pacman82
Copy link
Collaborator

How to start?

If this were a personal project of mine, I would probably start with the test setup. I would suggest testing against a Microsoft SQL Server. They have IMHO the most solid implemenation of an ODBC driver, and we probably want to find errors in our bindings, before we would compensate for various misbehaving drivers.

I would also start testing the ODBC bindings in Ubuntu only. Testing these under e.g. Windows is they more involved and very expensive on the runtime, since the GitHub windows runner does not support booting arbitrary docker containers, yet.

With that setup, we should easily be able to write some driving tests and execute them on CI.

How do you envision the local development experience? The machine executing the tests would need an ODBC driver installed.

@jorgecarleitao
Copy link
Owner Author

I agree that sticking to one driver is a good idea.

Could we run them on docker, or you think it is an even worse requirement? Asking because it would also make it easier to boot the db

@pacman82
Copy link
Collaborator

pacman82 commented Feb 16, 2022

It is very easy to operate on CI. Here is how I spin up the DB to test the odbc2parquet tool.

https://github.com/pacman82/odbc2parquet/blob/eab372a9ad50ea065b461e41ee6e26d9fbc71097/.github/workflows/test.yml#L1

    services:
      # Microsoft SQL Server
      sqlserver:
        image: mcr.microsoft.com/mssql/server:2019-latest
        ports:
        - 1433:1433
        env:
          ACCEPT_EULA: Y
          SA_PASSWORD: My@Test@Password1

@jorgecarleitao
Copy link
Owner Author

This is being worked out in #849

@jorgecarleitao
Copy link
Owner Author

Closed by #849

@jorgecarleitao jorgecarleitao added the no-changelog Issues whose changes are covered by a PR and thus should not be shown in the changelog label Mar 5, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
investigation Issues or PRs that are investigations. Prs may or may not be merged. no-changelog Issues whose changes are covered by a PR and thus should not be shown in the changelog
Projects
None yet
Development

No branches or pull requests

3 participants