fix: refresh snapshot after vacuuming logs #3252

Merged
merged 1 commit into from
Feb 25, 2025

Conversation

ion-elgreco
Collaborator

@ion-elgreco ion-elgreco commented Feb 22, 2025

Description

As per @roeap's suggestion, this simply reloads the table state after cleaning up logs so that we get a proper snapshot.

Introduces a write function on the DeltaTable which allows us to update the state after writing, which we couldn't do properly before.
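To illustrate the idea of reloading state after log cleanup, here is a minimal toy sketch. It is not the delta-rs implementation: `Log`, `Snapshot`, `LogTable`, and `cleanup_logs` are hypothetical names, and the log is modelled as an in-memory map of commit versions. The point it shows is that the cached snapshot is rebuilt from the log right after old commits are removed, so it never refers to versions that no longer exist.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a transaction log: commit version -> description.
#[derive(Debug, Default)]
struct Log {
    commits: BTreeMap<u64, String>,
}

/// Hypothetical cached view of the log held by a table handle.
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    oldest_version: u64,
    latest_version: u64,
}

/// Hypothetical table handle that owns the log and a cached snapshot of it.
#[derive(Debug)]
struct LogTable {
    log: Log,
    snapshot: Snapshot,
}

impl LogTable {
    /// Build a handle; returns None if the log has no commits.
    fn new(log: Log) -> Option<Self> {
        let snapshot = Self::load_snapshot(&log)?;
        Some(LogTable { log, snapshot })
    }

    /// Derive a snapshot from whatever commits are currently in the log.
    fn load_snapshot(log: &Log) -> Option<Snapshot> {
        let oldest = *log.commits.keys().next()?;
        let latest = *log.commits.keys().next_back()?;
        Some(Snapshot { oldest_version: oldest, latest_version: latest })
    }

    /// Drop commits older than `keep_from` (always keeping the latest one),
    /// then reload the snapshot so it matches the cleaned-up log.
    fn cleanup_logs(&mut self, keep_from: u64) {
        let latest = self.snapshot.latest_version;
        self.log
            .commits
            .retain(|version, _| *version >= keep_from || *version == latest);
        self.snapshot = Self::load_snapshot(&self.log)
            .expect("the latest commit is always retained");
    }
}
```

Without the reload at the end of `cleanup_logs`, the handle in this sketch would keep reporting an `oldest_version` that has already been deleted.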

Related Issue(s)

@github-actions github-actions bot added binding/python Issues for the Python package binding/rust Issues for the Rust crate labels Feb 22, 2025

codecov bot commented Feb 22, 2025

Codecov Report

Attention: Patch coverage is 2.32558% with 126 lines in your changes missing coverage. Please review.

Project coverage is 72.08%. Comparing base (666179e) to head (c2dd8e4).
Report is 1 commits behind head on main.

Files with missing lines | Patch % | Lines
python/src/lib.rs | 0.00% | 119 Missing ⚠️
crates/core/src/operations/transaction/mod.rs | 30.00% | 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3252      +/-   ##
==========================================
- Coverage   72.18%   72.08%   -0.10%     
==========================================
  Files         143      143              
  Lines       45629    45699      +70     
  Branches    45629    45699      +70     
==========================================
+ Hits        32936    32944       +8     
- Misses      10621    10682      +61     
- Partials     2072     2073       +1     


Comment on lines +1638 to +1653
fn write(
    &mut self,
    py: Python,
    data: PyArrowType<ArrowArrayStreamReader>,
    mode: String,
    schema_mode: Option<String>,
    partition_by: Option<Vec<String>>,
    predicate: Option<String>,
    target_file_size: Option<usize>,
    name: Option<String>,
    description: Option<String>,
    configuration: Option<HashMap<String, Option<String>>>,
    writer_properties: Option<PyWriterProperties>,
    commit_properties: Option<PyCommitProperties>,
    post_commithook_properties: Option<PyPostCommitHookProperties>,
) -> PyResult<()> {
Member

This function body looks really close to the write_to_deltalake function, can the code in write_to_deltalake simply defer to this function now?

Collaborator Author

Maybe, I'll investigate!

Collaborator

@roeap roeap left a comment

LGTM! Agreeing with @rtyler though that we should see if we can consolidate.

Reviewing our APIs is something to keep in mind for the 1.0 plan - no better time to break things :D.

@ion-elgreco
Collaborator Author

@roeap @rtyler I've updated the code, write_to_deltalake will go through RawDeltaTable.write now :)
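The consolidation described above can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: `ToyTable`, `Mode`, and `write_to_table` are hypothetical names, and the "state" is reduced to a version counter. It shows a free-function entry point that holds no write logic of its own and defers to the method on the table handle, which is also the single place where the handle's state is updated after a write.

```rust
/// Hypothetical table handle; `version` stands in for the cached table state.
#[derive(Debug, Default)]
struct ToyTable {
    version: u64,
    rows: Vec<i64>,
}

/// Hypothetical write modes for the sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Append,
    Overwrite,
}

impl ToyTable {
    /// Single write path: apply the data, then update the handle's own state
    /// so callers see the new version without reloading the table.
    fn write(&mut self, data: &[i64], mode: Mode) -> Result<(), String> {
        if data.is_empty() {
            return Err("nothing to write".to_string());
        }
        match mode {
            Mode::Append => self.rows.extend_from_slice(data),
            Mode::Overwrite => self.rows = data.to_vec(),
        }
        self.version += 1;
        Ok(())
    }
}

/// Free-function entry point: creates a handle and defers to `ToyTable::write`
/// instead of duplicating the write logic.
fn write_to_table(data: &[i64], mode: Mode) -> Result<ToyTable, String> {
    let mut table = ToyTable::default();
    table.write(data, mode)?;
    Ok(table)
}
```

With one write path, any fix to how state is refreshed after a commit only has to be made in the method, and both entry points pick it up.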

@ion-elgreco ion-elgreco added this pull request to the merge queue Feb 25, 2025
Merged via the queue into delta-io:main with commit cede5a8 Feb 25, 2025
28 checks passed
@ion-elgreco ion-elgreco deleted the fix/refresh-snapshot branch February 25, 2025 17:20
Labels
binding/python Issues for the Python package binding/rust Issues for the Rust crate
3 participants