[PYTHON] Add new `--auto_unique_labels` option to StandardOptions #28984
reviewer: @robertwb
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control
Codecov Report
@@ Coverage Diff @@
## master #28984 +/- ##
===========================================
- Coverage 72.22% 38.37% -33.86%
===========================================
Files 684 686 +2
Lines 100856 101673 +817
===========================================
- Hits 72846 39013 -33833
- Misses 26434 61084 +34650
Partials 1576 1576
... and 314 files with indirect coverage changes
Thanks.
Motivating Issue
The Python SDK requires users to specify a unique label for every transform. Often, the first time a transform is applied there is a default label that will be unique, but any future applications of the transform require the user to specify a new, unique label. This can be quite tedious, since there are many scenarios where a user won't actually care about the labels. A couple of scenarios that came up in my experience:
- `assert_that()`s require a unique label for each assert
- `LogElements` or `Deduplicate`
Change Summary
This change adds a new `--auto_unique_labels` standard option. The option defaults to off, so there's no change in the default behavior. If the option is set, then whenever a transform is applied with a non-unique label, a new label is generated that includes an automatically incremented suffix. For example, if `Deduplicate` is applied twice, the second `Deduplicate` will have a label of `Deduplicate_1`. Additional `Deduplicate`s would have a label of `Deduplicate_n`.
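For illustration, the suffixing behavior described here can be sketched in plain Python. This is not the code from this PR: `make_unique_label` and `used_labels` are hypothetical names, and the sketch assumes the suffix is the first unused integer, counting up from 1.

```python
def make_unique_label(label, used_labels):
    """Return `label` if it is unused, else append the first free `_n` suffix.

    Hypothetical helper for illustration only; not part of the Beam API.
    """
    if label not in used_labels:
        return label
    n = 1
    # Count upward until the suffixed label is not already taken.
    while f"{label}_{n}" in used_labels:
        n += 1
    return f"{label}_{n}"


# Simulate applying a transform with the same default label three times.
used = set()
labels = []
for _ in range(3):
    unique = make_unique_label("Deduplicate", used)
    used.add(unique)
    labels.append(unique)

print(labels)  # ['Deduplicate', 'Deduplicate_1', 'Deduplicate_2']
```

The first application keeps its original label, so pipelines that already use unique labels would see no renaming under this scheme.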
Testing
Wrote a new unit test that tests the new behavior. The rest of the `pipeline_test.py` tests still pass except for one: `test_display_data`. It also fails on my fork's master though, so I'm assuming it's unrelated (EDIT: confirmed unrelated; it's just failing because it doesn't expect the URN to change when `pipeline_test.py` is invoked directly, in which case the URN references `__main__`).
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.