Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(partitioning): Add partition specs/fields and basic transform interface #2

Merged
merged 9 commits into from
Sep 14, 2023

Conversation

zeroshade
Copy link
Member

Adds representations for Partition Specs, Partition Fields, and base transform interface + implementations. It does not add actual transformation functionality implementations yet. That will come later when implementing data reading. This adds only what is sufficient for reading a table definition / metadata.

@zeroshade
Copy link
Member Author

CC @coded9 @Fokko @nastra @bitsondatadev @rdblue

partitions.go Outdated Show resolved Hide resolved
partitions.go Outdated Show resolved Hide resolved
partitions_test.go Outdated Show resolved Hide resolved
})
}

errorTests := []struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻

@zeroshade
Copy link
Member Author

@nastra @coded9 @Fokko @rdblue @bitsondatadev Any further comments / reviews here?

@coded9
Copy link

coded9 commented Aug 21, 2023

@nastra @coded9 @Fokko @rdblue @bitsondatadev Any further comments / reviews here?

LGTM!

@zeroshade
Copy link
Member Author

@nastra @rdblue any chance for a review soon to get this in?

"fmt"
"strings"

"golang.org/x/exp/slices"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I am really against using the experimental packages for anything that will be used likely in production. There recently has been a lot of people ignoring this and creating havoc in the release cycles of Go as seen in the following issues:

golang/go#51317
golang/go#61374

I think it would be better at this time to avoid using anything in the exp branch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zeroshade thoughts on this one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I forgot to respond here on the PR because we ended up discussing this in the slack channel.

Tl;DR: the standard for Go in general is to target N-1 versions (ie. right now the latest is Go 1.21, so we want to target Go 1.20 + Go 1.21). This slices package was only added to the Go stdlib in Go 1.21, so we still need to point at golang.org/x/exp/slices until/unless we drop support for Go 1.20, at which point we can update this to just use the stdlib version of the package.

We're still about 6 months or so before Go 1.22 would get released, but this should be fine for now. Once Go 1.22 is released I'll replace these with the stdlib package, letting us maintain the N-1 compatibility.

transforms.go Outdated Show resolved Hide resolved
transforms.go Outdated Show resolved Hide resolved
transforms.go Outdated Show resolved Hide resolved
SourceID: 3, FieldID: 1001, Name: "id", Transform: bucket}
spec1 := iceberg.NewPartitionSpec(idField1)

assert.Zero(t, spec1.ID())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

independent from this PR, is there a good fluent assertion library available in the Golang ecosystem (similar to AssertJ)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with AssertJ, but the most common assertion library I've seen for testing in the Golang ecosystem is the testify library that I'm using here.

partitions.go Show resolved Hide resolved
partitions_test.go Outdated Show resolved Hide resolved
partitions.go Outdated Show resolved Hide resolved
partitions.go Outdated Show resolved Hide resolved
func (ps *PartitionSpec) NumFields() int { return len(ps.fields) }
func (ps *PartitionSpec) Field(i int) PartitionField { return ps.fields[i] }

func (ps *PartitionSpec) IsUnpartitioned() bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to make sure: this will return true only if 1) fields are empty or 2) all transforms are void

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, I've added tests accordingly.

assert.Equal(t, 1002, spec3.LastAssignedFieldID())
}

func TestUnpartitionedWithVoidField(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe also add a test where one field has a void transform and another field has some non-void transform, thus resulting spec.IsUnpartitioned() to be false

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the test

Copy link
Contributor

@nastra nastra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall LGTM, just left one more comment around testing

@nastra
Copy link
Contributor

nastra commented Sep 14, 2023

thanks for working on this @zeroshade! Should be good to go once the merge conflict is resolved

@zeroshade zeroshade force-pushed the partitioning-transforms branch from a7dee3e to beeaed2 Compare September 14, 2023 16:19
@zeroshade
Copy link
Member Author

@nastra resolved! 😄

@nastra nastra merged commit 39b0197 into apache:main Sep 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants