-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add table spec changes for statistics information in table snapshot #4945
Conversation
Currently based on #4944 |
Supersedes apache/iceberg-docs#77 (just for reference, no useful content/comments there) |
6594d54
to
405fe41
Compare
405fe41
to
5599290
Compare
5599290
to
7a27f31
Compare
comments applied. @rdblue mind taking another look? |
cc66021
to
57f3f73
Compare
57f3f73
to
26da01e
Compare
format/spec.md
Outdated
| _required_ | **`statistics-path`** | `string` | Path of the statistics file. See [Puffin file format](../puffin-spec). | | ||
| _required_ | **`file-size-in-bytes`** | `long` | Size of the statistics file. | | ||
| _required_ | **`file-footer-size-in-bytes`** | `long` | Size of the statistics file's footer. See [Puffin file format](../puffin-spec) for footer definition. | | ||
| _required_ | **`source-sequence-number`** | `long` | Table sequence number at which the stats were calculated | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this also include a snapshot ID at which the stats were calculated?
Also, if this is to be used in v1, we need to rely on snapshot ID rather than sequence number because v1 tables use a null sequence number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if this is to be used in v1,
my intention was for this to be v2 feature.
Shouldn't this also include a snapshot ID at which the stats were calculated?
also, or instead of?
please advise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think this should have the snapshot for at which stats were calculated.
There's also no reason to not support this in v1. As long as the snapshot ID is here for tracking, it should work just fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to source-snapshot-id
and made the whole thing supported in v1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A side question -- now that we have v2 and we will have v3 of the spec, why do we want to add a new feature to v1?
8125ad7
to
5b2d4ee
Compare
5b2d4ee
to
528eb9c
Compare
Done (#4945 (comment)) |
528eb9c
to
01097d8
Compare
applied/answered the comments |
Thank you @rdblue for your detailed review. |
01097d8
to
9737dc3
Compare
Thank you @rdblue for your detailed review. Applied comments! |
Merged! Thanks for all your work to get this in, @findepi! |
No description provided.