[BEAM-9331] Add better Row builders #10883
Added some minor comments.
// passed in, it could result in strange errors later in the pipeline. This method is largely
// used internal
// to Beam.
@Internal
public Builder attachValues(List<Object> values) {
Maybe this is an opportunity to change attachValues. No values should be set before or after the attach. I see two options to improve this:
- In attachValues, first check whether values are already set, and let attachValues return the new Row directly. This may be a bit strange, as it departs from the builder pattern.
- Have four built-in builders. The starting one (which includes attachValues, add and withFieldValue) returns a specific builder from each of these methods: the new ModifyingBuilder, a new AddValuesBuilder that only has the add methods, and an AttachBuilder that only has build. This also eliminates some elaborate if/then/else chains in the builder.
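To make the second option concrete, here is a rough sketch of what such staged builders could look like. It is not Beam code: a row is modeled as a plain List<Object>, the schema as a list of field names, and StartBuilder and StagedBuilderSketch are hypothetical names introduced only for illustration (ModifyingBuilder, AddValuesBuilder and AttachBuilder follow the names used in the comment above).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; these classes are not part of Beam.
final class StagedBuilderSketch {
  /** Starting builder: each method hands off to a narrower builder. */
  static final class StartBuilder {
    private final List<String> fieldNames;

    StartBuilder(List<String> fieldNames) {
      this.fieldNames = fieldNames;
    }

    AddValuesBuilder addValue(Object value) {
      return new AddValuesBuilder(fieldNames).addValue(value);
    }

    ModifyingBuilder withFieldValue(String name, Object value) {
      return new ModifyingBuilder(fieldNames).withFieldValue(name, value);
    }

    AttachBuilder attachValues(List<Object> values) {
      return new AttachBuilder(values);
    }
  }

  /** Positional builder: offers only addValue and build. */
  static final class AddValuesBuilder {
    private final List<String> fieldNames;
    private final List<Object> values = new ArrayList<>();

    AddValuesBuilder(List<String> fieldNames) {
      this.fieldNames = fieldNames;
    }

    AddValuesBuilder addValue(Object value) {
      values.add(value);
      return this;
    }

    List<Object> build() {
      if (values.size() != fieldNames.size()) {
        throw new IllegalStateException(
            "Expected " + fieldNames.size() + " values, got " + values.size());
      }
      return Collections.unmodifiableList(new ArrayList<>(values));
    }
  }

  /** By-name builder: offers only withFieldValue and build; unset fields stay null. */
  static final class ModifyingBuilder {
    private final List<String> fieldNames;
    private final Object[] values;

    ModifyingBuilder(List<String> fieldNames) {
      this.fieldNames = fieldNames;
      this.values = new Object[fieldNames.size()];
    }

    ModifyingBuilder withFieldValue(String name, Object value) {
      int index = fieldNames.indexOf(name);
      if (index < 0) {
        throw new IllegalArgumentException("Unknown field: " + name);
      }
      values[index] = value;
      return this;
    }

    List<Object> build() {
      return Collections.unmodifiableList(new ArrayList<>(Arrays.asList(values)));
    }
  }

  /** Attach builder: values are taken as-is without checks, so only build is offered. */
  static final class AttachBuilder {
    private final List<Object> values;

    AttachBuilder(List<Object> values) {
      this.values = values;
    }

    List<Object> build() {
      return values;
    }
  }
}
```

Because each starting method returns a different type, mixing attachValues with add or withFieldValue becomes a compile error rather than a runtime check.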
I like making attachValues return a Row. I think the same for withFieldValueBuilders. This simplifies the builder code.
}

/** Builder for {@link Row} that bases a row on another row. */
public static class ModifyingBuilder {
Can't this be tweaked a bit, so that this is the builder specifically for use with withFieldValue? Meaning that when it doesn't have a source row, it just assumes null values for the fields not set. See the remark on withFieldValue on the initial builder as well.
Yes, good suggestion. This should simplify the code
* Set a field value using the field name. Nested values can be set using the field selection
* syntax.
*/
public Builder withFieldValue(String fieldName, Object value) {
Maybe this one can return the ModifyingBuilder so that no other methods can be used (no attachValues, no adds).
done
return sourceRow.getSchema();
}

public ModifyingBuilder withFieldValue(String fieldName, Object value) {
Wouldn't it be useful to have a withFieldValue with an index?
added
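For illustration only, the name-based and index-based overloads could sit side by side roughly as follows. This is a simplified stand-in rather than the code in this PR: the row is modeled as a List<Object>, the schema as a list of field names, and IndexedModifyingBuilderSketch is a hypothetical class name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a modifying builder; not the Beam implementation.
final class IndexedModifyingBuilderSketch {
  private final List<String> fieldNames;
  private final List<Object> values;

  /** Starts from a copy of sourceValues so the source row is never mutated. */
  IndexedModifyingBuilderSketch(List<String> fieldNames, List<Object> sourceValues) {
    if (fieldNames.size() != sourceValues.size()) {
      throw new IllegalArgumentException("Field names and values differ in length");
    }
    this.fieldNames = fieldNames;
    this.values = new ArrayList<>(sourceValues);
  }

  /** Name-based overload: resolves the name to an index and delegates. */
  IndexedModifyingBuilderSketch withFieldValue(String fieldName, Object value) {
    int index = fieldNames.indexOf(fieldName);
    if (index < 0) {
      throw new IllegalArgumentException("Unknown field: " + fieldName);
    }
    return withFieldValue(index, value);
  }

  /** Index-based overload: replaces the value at the given position. */
  IndexedModifyingBuilderSketch withFieldValue(int index, Object value) {
    if (index < 0 || index >= values.size()) {
      throw new IndexOutOfBoundsException("No field at index " + index);
    }
    values.set(index, value);
    return this;
  }

  /** Returns a fresh list so later builder calls cannot affect the built row. */
  List<Object> build() {
    return new ArrayList<>(values);
  }
}
```

Having the name-based overload delegate to the index-based one keeps the bounds check and the write in a single place.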
LGTM
Maybe rebase, then we could get this in. This is a useful addition.
@alexvanboxel there are still some bugs in this PR related to logical types, which is why I haven't pushed it in yet.
Run Java PreCommit
@alexvanboxel rebased and fixed bugs. Previously I was blocked on getting logical types to work, but now that we natively store logical types in Row, it's become much easier.
run sql postcommit
Really like it. Would you consider making RowUtils public? Looks like a nice set of utilities I could use in my personal pipelines...
@alexvanboxel I would rather not in this PR, because the RowUtils API wasn't designed for public usage. If we were to make it public, I would prefer to spend a lot more time on the API design, and I would also want to understand the use cases a bit better.
Understood. LGTM
run sql postcommit
This PR adds two builders to the Row object. The first allows building a Row by specifying fields by name:
Row row = Row.withSchema(schema)
.withFieldValue("userId", "user1")
.withFieldValue("location.city", "seattle")
.withFieldValue("location.state", "wa")
.build();
The second allows building a Row based on a previous row by specifying only the fields to change:
Row modifiedRow =
Row.fromRow(row)
.withFieldValue("location.city", "tacoma")
.build();
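As a rough illustration of what a dotted name such as "location.city" means in the examples above, the following Beam-free sketch models a row as nested maps and applies a copy-on-write update along the path. FieldPathSketch is a hypothetical helper, and the map representation is an assumption made for the sketch; it does not reflect how Row stores values internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; models a row as nested maps purely for illustration.
final class FieldPathSketch {
  /** Returns a copy of row with the value at the dotted path replaced. */
  @SuppressWarnings("unchecked")
  static Map<String, Object> withFieldValue(Map<String, Object> row, String path, Object value) {
    // Copy the current level so the input row is left untouched.
    Map<String, Object> copy = new LinkedHashMap<>(row);
    int dot = path.indexOf('.');
    if (dot < 0) {
      // Last path segment: set the value directly.
      copy.put(path, value);
      return copy;
    }
    String head = path.substring(0, dot);
    String rest = path.substring(dot + 1);
    Object nested = row.get(head);
    // Descend into the nested row, or start an empty one if it is absent.
    Map<String, Object> child =
        nested instanceof Map
            ? (Map<String, Object>) nested
            : new LinkedHashMap<String, Object>();
    copy.put(head, withFieldValue(child, rest, value));
    return copy;
  }
}
```

Only the maps along the modified path are copied; sibling fields such as "location.state" are carried over unchanged, which mirrors the intent of specifying only the fields to change.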
R: @rezarokni