feat: Add a transformer that adds tags to all tables created in a job #287
Codecov Report
@@            Coverage Diff            @@
##           master    #287     +/-   ##
=========================================
+ Coverage   73.34%   73.45%   +0.10%
=========================================
  Files         102      103       +1
  Lines        4307     4328      +21
  Branches      401      403       +2
=========================================
+ Hits         3159     3179      +20
- Misses       1048     1049       +1
  Partials      100      100
Thanks, left some comments.
from databuilder.models.table_metadata import TableMetadata


class TableTagTransformer(Transformer):
It would be good to add documentation at https://github.com/lyft/amundsendatabuilder#list-of-transformers, as well as some unit tests.
Added some tests and the docs.
    def transform(self, record):
        if isinstance(record, TableMetadata):
            if record.tags:
nvm, saw you did the split above.
I moved this tag splitting logic into a staticmethod on the TableMetadata object, so we can guarantee the same logic is used in both places. I realized I was missing the part where it called lower on the tags in the copied version.
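For readers following along, here is a minimal sketch of what a shared tag-formatting staticmethod could look like. It is not the code from this PR: `TableMetadataSketch` is a hypothetical stand-in for `TableMetadata`, and the comma-separated string handling is an assumption. Only the lower-case-and-strip step comes from the diff shown in this thread.

```python
from typing import List, Union


class TableMetadataSketch:
    """Hypothetical stand-in for TableMetadata; only tag handling is sketched."""

    def __init__(self, name: str, tags: Union[List[str], str, None] = None) -> None:
        self.name = name
        # The constructor and any transformer can share the same helper.
        self.tags = TableMetadataSketch.format_tags(tags)

    @staticmethod
    def format_tags(tags: Union[List[str], str, None]) -> List[str]:
        if tags is None:
            return []
        if isinstance(tags, str):
            # Assumption: a string holds comma-separated tags; blank pieces are dropped.
            tags = [tag for tag in tags.split(',') if tag.strip()]
        # Normalization visible in the diff: lower-case and strip each tag.
        return [tag.lower().strip() for tag in tags]
```

Keeping the normalization in one place means a caller that forgets the lower-casing step, as happened in the copied version, can no longer drift from the model's behavior.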
sg
This should be ready for review again.
-        tags = [tag.lower().strip() for tag in tags]
-        self.tags = tags
+        self.tags = TableMetadata.format_tags(tags)
This small refactor was so the tag formatting logic could be reused, instead of replicated.
    def transform(self, record):
        if isinstance(record, TableMetadata):
            if record.tags:
                print(record.tags)
Stray print line that we should remove.
lgtm, we could commit once the print line is removed.
got rid of the print
the CI is failing due to flake8 on the test file.
I am very spoiled by all of our internal tooling that automatically alerts me to things like lint issues! I pushed an update to fix the lint errors, and another to switch from double quotes to single, since that's the norm across the Amundsen code base.
seems to fail on py27, I think we could just remove the py27 unit tests. And once Lyft (I assume only Lyft still uses py2 at this point) fully migrates off py2 (target: end of Q2), we could get rid of py2 entirely.
That looks like the tests are running in Python 2. I thought databuilder required Python 3, so I actually removed some Python 2 compatibility stuff I had locally. It would be easy to add back for the sake of simplifying testing for now.
@jdavidheiser that would be great to unlock it :)
…builder into tagtransformer
thanks @jdavidheiser
Summary of Changes
This was a use case that came up for us several times, where we want all tables loaded from a specific source to have the same tag. This transformer makes it easy to add a tag independent of what the extractor is doing.
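As a rough illustration of the behavior described above, here is a self-contained sketch. `TableRecord` and `TagAllTables` are hypothetical stand-ins, not the classes in this PR; the real transformer subclasses databuilder's `Transformer`, and its configuration wiring is omitted here. The sketch assumes job-level tags are appended to whatever tags the extractor already set.

```python
from typing import Any, List, Optional


class TableRecord:
    """Hypothetical stand-in for databuilder's TableMetadata."""

    def __init__(self, name: str, tags: Optional[List[str]] = None) -> None:
        self.name = name
        self.tags = list(tags) if tags else []


class TagAllTables:
    """Sketch of a transformer that adds fixed tags to every table record."""

    def __init__(self, tags: List[str]) -> None:
        # Same normalization as in the diff: lower-case and strip each tag.
        self.tags = [tag.lower().strip() for tag in tags]

    def transform(self, record: Any) -> Any:
        # Only table records are modified; other record types pass through.
        if isinstance(record, TableRecord):
            if record.tags:
                # Assumption: append to tags the extractor already provided.
                record.tags = record.tags + self.tags
            else:
                record.tags = list(self.tags)
        return record


# Usage: tag every table coming out of one job, regardless of the extractor.
transformer = TagAllTables([' Hive ', 'RAW'])
tagged = transformer.transform(TableRecord('db.users', ['pii']))
```

Because the transformer runs after extraction, the extractor itself needs no changes to support source-wide tags.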
Tests
It seems like none of the other transformers have tests, so I was not sure what pattern to follow.
Documentation
No new docs; hopefully the transformer is simple enough to be self-explanatory.
CheckList
Make sure you have checked all steps below to ensure a timely review.
make test