Delta Lake MERGE/UPDATE/DELETE on Databricks should trigger optimized write and auto compaction #10417
Comments
Note that this should also remove the repartition by partition key for partitioned tables when writing a MERGE, because we're going to turn around and repartition for the optimized write anyway.
Note that for MERGE the user can specify
Hi @jlowe, Delta OSS has added support for optimized write: delta-io/delta#2145. I think we can always enable optimized write after porting this?
This is a Databricks-specific behavior per the doc linked above, not a behavior in OSS Delta Lake, at least for the versions of OSS Delta Lake that we support. There are already separate issues tracking the OSS versions of optimized write and auto compact, #10397 and #10398 respectively, but I do not see them as relevant to this issue. We already support optimized write and auto compact on Databricks.
I'll take this.
https://docs.databricks.com/en/delta/tune-file-size.html states that Delta Lake MERGE, UPDATE, and DELETE operations will always trigger optimized write and auto compaction behavior as of 10.4 LTS, and this cannot be disabled. The RAPIDS Accelerator forms of these operations should mimic this behavior.
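To make the requested behavior concrete, here is a minimal Python sketch of the decision rule described above. It is only an illustration: `WriteContext`, `effective_settings`, and the field names are hypothetical and are not part of the RAPIDS Accelerator or Databricks APIs. It also assumes that on runtimes older than 10.4 LTS, and for operations other than MERGE, UPDATE, and DELETE, the user's own settings apply; the linked doc does not spell that out.

```python
from dataclasses import dataclass

# Hypothetical names for illustration; not an actual RAPIDS Accelerator or Databricks API.
DML_OPS = {"MERGE", "UPDATE", "DELETE"}
FORCED_SINCE = (10, 4)  # Databricks Runtime 10.4 LTS, per the linked doc


@dataclass(frozen=True)
class WriteContext:
    operation: str             # e.g. "MERGE", "UPDATE", "DELETE", "WRITE"
    runtime_version: tuple     # (major, minor) of the Databricks runtime
    optimize_write_conf: bool  # user's optimized-write setting
    auto_compact_conf: bool    # user's auto-compaction setting


def effective_settings(ctx):
    """Return (optimized_write, auto_compact) that the write should use."""
    forced = (
        ctx.operation.upper() in DML_OPS
        and ctx.runtime_version >= FORCED_SINCE
    )
    if forced:
        # MERGE/UPDATE/DELETE always get both behaviors; user settings are ignored.
        return (True, True)
    # Assumption: everything else falls back to the user's configuration.
    return (ctx.optimize_write_conf, ctx.auto_compact_conf)
```

For example, `effective_settings(WriteContext("MERGE", (10, 4), False, False))` yields `(True, True)` even though both settings are off, while a plain write with the same settings keeps them off.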