Enable external Delta tables in Seafowl #252

Merged
merged 2 commits into from
Dec 27, 2022


@gruuya gruuya commented Dec 27, 2022

Utilize the Delta table factory for DataFusion, and the new options mechanism for creating external Delta tables in Seafowl.

Example definition for an AWS S3-backed Delta table:

CREATE EXTERNAL TABLE my_delta
STORED AS DELTATABLE
OPTIONS ('AWS_ACCESS_KEY_ID' 'secret', 'AWS_SECRET_ACCESS_KEY' 'also_secret', 'AWS_REGION' 'eu-west-3') 
LOCATION 's3://my-bucket/my-delta-table/'

@gruuya gruuya requested a review from mildbyte December 27, 2022 11:39

@mildbyte mildbyte left a comment


Looks great! Do you know if the new deps make the final binary size much larger? Might be worth putting the support behind the Cargo features mechanism so that people can turn it off if necessary.
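For reference, Cargo feature gating in Rust is typically done with `cfg` attributes. The following is only a minimal sketch: the feature name `delta-tables` and the function `delta_support` are hypothetical, chosen for illustration, and this thread does not state the names actually used in Seafowl. The feature would also need to be declared under `[features]` in `Cargo.toml`.

```rust
// Compiled only when the hypothetical `delta-tables` feature is enabled,
// so the gated code (and any optional dependencies it pulls in) can be
// left out of the final binary.
#[cfg(feature = "delta-tables")]
pub fn delta_support() -> &'static str {
    "enabled"
}

// Fallback compiled when the hypothetical feature is turned off.
#[cfg(not(feature = "delta-tables"))]
pub fn delta_support() -> &'static str {
    "disabled"
}
```

With a feature listed as a default, users could opt out at build time via `cargo build --no-default-features` and re-enable only what they need with `--features`.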

@mildbyte

(tagging #175 since this partially solves it)


gruuya commented Dec 27, 2022

Good point, thanks!

I've now put the remote and Delta table querying mechanisms behind Cargo features (both enabled by default), and measured the binary sizes in a few scenarios:

  1. all features—65MB
  2. all features except delta tables—58MB
  3. all features except delta and remote tables—55MB

So Delta tables account for about 7MB (roughly 10%) of the full binary, while remote tables are much more lightweight at about 3MB.
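The arithmetic behind these figures can be sketched as follows. The helper `feature_cost` is a hypothetical name introduced here for illustration; the inputs are the three sizes reported above.

```rust
/// Returns (size added by a feature in MB, its share of the full build in percent).
/// `with_mb` and `without_mb` are the binary sizes with and without the feature;
/// `full_mb` is the size of the build with all features enabled.
pub fn feature_cost(with_mb: f64, without_mb: f64, full_mb: f64) -> (f64, f64) {
    let added_mb = with_mb - without_mb;
    (added_mb, 100.0 * added_mb / full_mb)
}
```

Calling `feature_cost(65.0, 58.0, 65.0)` gives 7 MB, about 10.8% of the full build, for Delta tables, and `feature_cost(58.0, 55.0, 65.0)` gives 3 MB, about 4.6%, for remote tables.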

@gruuya gruuya merged commit 455da47 into main Dec 27, 2022
@gruuya gruuya deleted the external-delta-tables branch December 27, 2022 13:55