Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP - Add Trino support #15160

Closed
wants to merge 2 commits into from
Closed

WIP - Add Trino support #15160

wants to merge 2 commits into from

Conversation

mosabua
Copy link

@mosabua mosabua commented Mar 13, 2021

I am attempting to add support for Trino by taking the Presto support and creating a new module for the new project.

This is work in progress and it might take me a bit to get the hang of things. Any help appreciated.

Tests

  • Run the frontend and Cypress end-to-end tests with yarn lint && yarn test)

  • If there are changes to the backend codebase, run the backend tests with lein test && lein lint && ./bin/reflection-linter

  • Sign the Contributor License Agreement
    (unless it's a tiny documentation change).

@mosabua mosabua mentioned this pull request Mar 13, 2021
@codecov
Copy link

codecov bot commented Mar 13, 2021

Codecov Report

Merging #15160 (5bce21e) into master (aa0e321) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #15160   +/-   ##
=======================================
  Coverage   84.30%   84.30%           
=======================================
  Files         389      389           
  Lines       30660    30660           
  Branches     2197     2198    +1     
=======================================
  Hits        25849    25849           
+ Misses       2614     2613    -1     
- Partials     2197     2198    +1     
Impacted Files Coverage Δ
src/metabase/models/query.clj 85.41% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0ad49a9...b6f09bd. Read the comment docs.

@mosabua
Copy link
Author

mosabua commented Mar 13, 2021

If there is a better way to implement this by modelling of another driver that uses a JDBC driver .. I can see if I can adopt that approach in another PR as well. Trying to see how to build next..

@willtrnr
Copy link
Contributor

As far as I can tell, at the moment, the only difference on the Presto/Trino side is the custom headers used. At least for the very minimal API surface used.

I think we can probably get away with sending both X-Presto- and X-Trino- headers from the existing presto driver for now.

@amitds1997
Copy link

As far as I can tell, at the moment, the only difference on the Presto/Trino side is the custom headers used. At least for the very minimal API surface used.

I think we can probably get away with sending both X-Presto- and X-Trino- headers from the existing presto driver for now.

I think we can get away with passing both the client headers for now but we should consider splitting them into separate entities (if not now) since they are not the same projects anymore and would likely diverge in the future.

Trino already has better timezone support (pointed out by mosabua in comments) and other features which are/are not present in PrestoDB. Using the same driver would restrict us from providing the end-users the full advantage of using Trino.

@mosabua
Copy link
Author

mosabua commented Mar 14, 2021

We definitely want to have a separate Trino driver as soon as possible.. There are lots of improvements in Trino right now, hat are beyond what PrestoDB offers and some are incompatible (high precision timestamp, more functions, ...). Ideally I would like to create an alternate PR that actually uses the Trino JDBC driver, but I am not sure on what other driver I should model it off.

@amitds1997
Copy link

amitds1997 commented Mar 14, 2021

@mosabua I think we can model it of the sql-jdbc driver module already present in the codebase. There's a bit of wiki associated with it available here. The drivers for Postgres and MySQL are also modeled off sql-jdbc.

@willtrnr
Copy link
Contributor

willtrnr commented Mar 14, 2021

Just to put this in perspective, at the moment there's multiple Presto-like derivatives in the wild and potentially in use, some more updated than others. At least the ones I know about are (in chronological order of release):

  • PrestoDB: original Facebook version
  • Teradata Presto: fork of Facebook PrestoDB
  • Amazon Athena: essentially Facebook PrestoDB (out of scope since there's already a 3rd party driver available)
  • PrestoSQL: original comunity fork
  • Starburst Presto: fork of PrestoSQL
  • Trino: rebranded PrestoSQL

These all implement the exact same SQL ANSI standard and support at least the same initial set of functions from the original Facebook version. None of the Trino extras are usuable from the Metabase UI anyway and they are unlikely to ever be since they go far beyond what typical DBMS' can handle. Users are still able to use them in raw SQL given they known which Presto flavor is used.

So I believe we do not want a seperate driver for each, or at the very least we want to have a "Presto-like" common base because, aside from swapping a JDBC driver due to minor API differences, everything else is, for the most part, identical and should remain so as mandated by the SQL standard they are implementing.

@findepi
Copy link

findepi commented Mar 15, 2021

Just to put this in perspective, at the moment there's multiple Presto-like derivatives in the wild and potentially in use, some more updated than others. At least the ones I know about are (in chronological order of release):

PrestoDB: original Facebook version
Teradata Presto: fork of Facebook PrestoDB
Amazon Athena: essentially Facebook PrestoDB (out of scope since there's already a 3rd party driver available)
PrestoSQL: original comunity fork
Starburst Presto: fork of PrestoSQL
Trino: rebranded PrestoSQL

You already removed Athena from the list, since they have their custom SQL language features and separate driver.

PrestoSQL is renamed to Trino.
Starburst is based on Trino.
Teradata Presto is not unmaintained for several years now.

So the list is actually:

  • Trino
  • PrestoDB

hopefully this is less scary than originally

As far as I can tell, at the moment, the only difference on the Presto/Trino side is the custom headers used. At least for the very minimal API surface used.

I think we can probably get away with sending both X-Presto- and X-Trino- headers from the existing presto driver for now.

On the face of it, this is correct, however details probably matter.

Trino supports more language-level features, that could allow some queries to be simpler, but i agree this is more far-fetched.

Trino eg also supports parametric time/timestamp [with time zone] (or UUID type).
trinodb/trino#1284

This is different from Metabase user perspective as well.

@willtrnr
Copy link
Contributor

So the list is actually:

  • Trino
  • PrestoDB

PrestoSQL will still remain relevant for some time. As a matter of fact, we are still running a pre-Trino version where I am simply because updating custom plugins and everything else takes time and resources are allocated elsewhere.

As far as I can tell, at the moment, the only difference on the Presto/Trino side is the custom headers used. At least for the very minimal API surface used.
I think we can probably get away with sending both X-Presto- and X-Trino- headers from the existing presto driver for now.

On the face of it, this is correct, however details probably matter.

I only intended this to be a quick short term solution to buy some time for a cleaner JDBC driver implementation, not as an actual long term solution.

Trino supports more language-level features, that could allow some queries to be simpler, but i agree this is more far-fetched.

Trino eg also supports parametric time/timestamp [with time zone] (or UUID type).
trinodb/trino#1284

Maybe the most obvious thing from the Metabase query builder perspective is the OFFSET support which would make the paging handling so much better. Otherwise, I don't think the other differences actually matter to the Metabase query builder and definitely don't matter when using raw SQL.

I believe all of this can be handled in a common driver or, at the very least, a common base driver.

@findepi
Copy link

findepi commented Mar 15, 2021

PrestoSQL will still remain relevant for some time. As a matter of fact, we are still running a pre-Trino version where I am simply because updating custom plugins and everything else takes time and resources are allocated elsewhere.

Oh, sure, I guess i spoke too briefly and thus the message came across not what i intended.

I am sure there are many pre-Trino installations, because with such massive community it would be unreasonable to expect otherwise.
However, they are served somehow today, by Metabase -- be it by prestodb dialect -- and we probably don't need to improve this, assuming everyone will upgrade sooner or later.

Maybe the most obvious thing from the Metabase query builder perspective is the OFFSET support which would make the paging handling so much better.

Good point.

@buremba
Copy link

buremba commented Jun 10, 2021

The active PR seems to be #16194 but looks like Trino is not in the roadmap. Any plans so far?

@mosabua
Copy link
Author

mosabua commented Jun 11, 2021

I am closing this PR since it will be better to base a Trino support on the JDBC based client and this does not seem to have a chance to get merged. I am not able to work on this for now but might take it up again later when there is a better JDBC base to use.

@mosabua mosabua closed this Jun 11, 2021
@fazz21
Copy link

fazz21 commented Oct 18, 2021

@mosabua maybe you could look up again about the new presto-jdbc driver which already released in 11 October 2021. But regarding the new presto-driver, I already test with the new metabase version 0.41 with new presto-jdbc. It turns out we could use Trino (our version Trino is 361) with the new presto-jdbc.

But if you want to create trino driver using this jdbc driver, what will make it difference with the existing presto JDBC. cc: @flamber here also, in case you have the answer.

@flamber
Copy link
Contributor

flamber commented Oct 18, 2021

@fazz21 I don't think we need more implementations. There's already a community driver: #17532 (comment)

@mosabua
Copy link
Author

mosabua commented Jan 21, 2023

A fully supported driver for Trino, created by Metabase and Starburst is now available as open source project - https://github.com/starburstdata/metabase-driver

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants