StructuredDatasetTransformerEngine should derive default protocol from raw output prefix #1107
Merged
Signed-off-by: Yee Hing Tong <[email protected]>
wild-endeavor
changed the title
wip
StructuredDatasetTransformerEngine should derive default protocol from raw output prefix
Jul 21, 2022
pingsutw
reviewed
Jul 21, 2022
plugins/flytekit-polars/flytekitplugins/polars/sd_transformers.py
Codecov Report
@@            Coverage Diff             @@
##           master    #1107      +/-   ##
==========================================
+ Coverage   86.93%   86.95%   +0.02%
==========================================
  Files         275      276       +1
  Lines       25448    25492      +44
  Branches     2862     2865       +3
==========================================
+ Hits        22123    22167      +44
  Misses       2847     2847
  Partials      478      478
pingsutw
approved these changes
Jul 21, 2022
wild-endeavor
added a commit
that referenced
this pull request
Aug 2, 2022
…m raw output prefix (#1107) Signed-off-by: Yee Hing Tong <[email protected]>
TL;DR
Instead of relying on the default protocol in most cases, let's infer the protocol from the raw output prefix.
One downside of this change is that we also no longer set a default storage format for the encoders/decoders that we provide. Fortunately, these all use the parquet format, which is already the fallback supplied in various places when the format is missing. We may have to add a function to the transformer engine API in the future just to set the default format.
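As a rough illustration of the idea above (not the code in this PR), inferring the protocol could be as simple as reading the URI scheme off the raw output prefix. The names `protocol_from_prefix`, `resolve_format`, and `FALLBACK_FORMAT` are hypothetical and are not part of the flytekit API; treating a prefix with no scheme as local `file` storage is also an assumption.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumption: parquet is the fallback when no format is given, as the
# description says is already the case "in various places".
FALLBACK_FORMAT = "parquet"


def protocol_from_prefix(raw_output_prefix: str) -> str:
    """Hypothetical helper: derive the storage protocol from a prefix.

    A prefix such as "s3://bucket/path" yields "s3". A prefix without a
    scheme (e.g. "/tmp/raw") is assumed here to mean local storage.
    """
    scheme = urlparse(raw_output_prefix).scheme
    return scheme if scheme else "file"


def resolve_format(requested_format: Optional[str]) -> str:
    """Hypothetical helper: fall back to parquet when no format is set."""
    return requested_format if requested_format else FALLBACK_FORMAT


if __name__ == "__main__":
    for prefix in ("s3://my-bucket/raw", "gs://my-bucket/raw", "/tmp/raw"):
        print(prefix, protocol_from_prefix(prefix), resolve_format(None))
```

The sketch only shows the lookup order: the protocol follows whatever the raw output prefix points at, and the format falls back to parquet when nothing else is specified.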
Type
Are all requirements met?
Complete description
Tracking Issue
flyteorg/flyte#2684