Investigate support for Delta Tables as backing storage #175

Closed
rupurt opened this issue Oct 26, 2022 · 5 comments
Labels: enhancement (New feature or request)

Comments

@rupurt

rupurt commented Oct 26, 2022

Howdy,

Are there any plans to support Delta Tables? This could work really well with GraphQL subscriptions.

@mildbyte
Contributor

Hey! Do you mean being able to support Databricks' Delta Tables / Delta Lake (https://github.com/delta-io/delta/blob/master/PROTOCOL.md) as a storage backend / data source for CREATE EXTERNAL TABLE?

Design-wise, a GraphQL frontend is a sweet idea, though I'm not sure how to make it work well for analytical/aggregation queries (e.g. representing a GROUP BY or a window over arbitrary columns as a set of supported GraphQL fields). Same with subscriptions -- how would you quickly update a result for AVG(volume) GROUP BY country_id? IIRC ClickHouse/Materialize did some heavy research in that direction -- it would indeed be cool to have it also available to Web devs via GQL :)
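For the AVG(volume) GROUP BY country_id question, one common approach is incremental view maintenance: keep a running sum and count per group and adjust them as rows arrive or are retracted, instead of rescanning the table. Below is a minimal sketch of that idea in Python. The class and method names are hypothetical and made up for illustration; this is not how Seafowl, ClickHouse, or Materialize implement it.

```python
from collections import defaultdict


class IncrementalAvg:
    """Maintains AVG(value) per group key from a stream of row changes.

    Hypothetical helper for illustration only: it stores (sum, count) per
    group so each change is O(1) and no rescan of the base table is needed.
    """

    def __init__(self):
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def avg(self, key):
        """Return the current average for `key`, or None if the group is empty."""
        count = self._count.get(key, 0)
        if count == 0:
            return None
        return self._sum[key] / count

    def insert(self, key, value):
        """Apply an inserted row and return the group's updated average."""
        self._sum[key] += value
        self._count[key] += 1
        return self.avg(key)

    def delete(self, key, value):
        """Retract a previously inserted row and return the updated average."""
        if self._count.get(key, 0) == 0:
            raise KeyError(f"no rows recorded for group {key!r}")
        self._sum[key] -= value
        self._count[key] -= 1
        if self._count[key] == 0:
            # Drop empty groups so they vanish from the result set.
            del self._sum[key]
            del self._count[key]
        return self.avg(key)


# Each change yields the new value a subscriber for that group would receive.
view = IncrementalAvg()
view.insert("uk", 10.0)
view.insert("uk", 20.0)
print(view.avg("uk"))
```

AVG works here because it decomposes into a sum and a count, both of which can be updated from a single row change. Aggregates such as MEDIAN or COUNT(DISTINCT ...) need more per-group state, which is part of why the general problem is hard.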

@rupurt
Author

rupurt commented Oct 27, 2022

Yes, exactly: as a storage backend / data source for CREATE EXTERNAL TABLE.

Design-wise, I'm not exactly sure how to implement the subscription :) But I feel like there is so much work going into this problem that the solution is right on the cusp of being implemented (e.g. ClickHouse/Materialize/Delta Tables). FWIW, there is now a Delta Table implementation in Rust, and it can do streaming updates: https://github.com/delta-io/delta-rs/tree/main/rust

mildbyte added the enhancement (New feature or request) label Oct 31, 2022
mildbyte changed the title from "Support Delta Tables" to "Investigate support for Delta Tables as backing storage" Dec 20, 2022
@gruuya
Contributor

gruuya commented Dec 31, 2022

@rupurt thanks for the very cool ideas! :)

As for using Delta tables for our storage backend/layer (i.e. replacing our DIY lakehouse protocol with the Delta one using delta-rs), this is something that we'll likely converge towards at some point later on.

For now though, with the latest Seafowl version (0.2.10), you should be able to instantiate Delta tables stored in various cloud object stores as external tables (they will be placed in the staging schema) and query them.

@rupurt
Author

rupurt commented Dec 31, 2022

Amazing. Thank you @gruuya

@gruuya
Contributor

gruuya commented Mar 23, 2023

> As for using Delta tables for our storage backend/layer (i.e. replacing our DIY lakehouse protocol with the Delta one using delta-rs), this is something that we'll likely converge towards at some point later on.

I'm happy to say that we've completed this migration, so this issue can be closed now. Thanks for a great idea, @rupurt!

gruuya closed this as completed Mar 23, 2023