[AG-838] Support independent transforms #70
I have a few initial comments. Thanks Brad!
…o `transform/apply.py`
🔥 LGTM! Going to pre-approve. Will wait for @JessterB and @jaclynbeck-sage to comment. There are many things that can be improved so that it becomes easier to contribute.
We can iterate on adding a `CONTRIBUTING.md` section after this gets merged in.
I ran my current data validation process for the new version of the files generated by this PR to make sure nothing went sideways during the refactor, and everything is looking good!
Everything looks good! What a colossal amount of work, thank you Brad!!
This PR refactors the `transform` module by splitting up `transform.py`. From `transform.py`, the `apply_custom_transformations` function is moved to `process.py` since it is only used there, and it only applies the transformations already defined in `transform`. The functions `standardize_column_names`, `standardize_values`, `rename_columns`, and `nest_fields` are all moved to `etl/utils.py` to consolidate utility functions for all `etl` modules, and because they are all used in more than one other module. `etl/transform` becomes a submodule containing only custom transformation scripts, which makes it simpler to contribute new transformations and to import them where needed (in `process.py`). `process.py` is updated to account for all of these changes.

A `tests/transform` directory was created and the existing transform tests were split into `test_utils.py` and `transform/test_genes_biodomains.py`.
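
For anyone new to the layout, here is a rough sketch of the pattern described above: each custom transformation lives in its own script, and a single dispatcher in `process.py` applies whichever one matches the dataset. This is not the code in this PR. The signature of `apply_custom_transformations`, the `CUSTOM_TRANSFORMS` registry, the `transform_example` function, and the list-of-dicts data shape are all assumptions made up for illustration.

```python
from typing import Callable, Dict, List

Record = Dict[str, int]
Transform = Callable[[List[Record]], List[Record]]


def transform_example(records: List[Record]) -> List[Record]:
    """Hypothetical custom transform; stands in for a script under etl/transform."""
    # Return new records rather than mutating the input.
    return [{**r, "doubled": r["value"] * 2} for r in records]


# Hypothetical registry mapping a dataset name to its custom transform.
CUSTOM_TRANSFORMS: Dict[str, Transform] = {"example_dataset": transform_example}


def apply_custom_transformations(dataset_name: str, records: List[Record]) -> List[Record]:
    """Assumed signature: dispatch to the registered transform, if any."""
    transform = CUSTOM_TRANSFORMS.get(dataset_name)
    # Datasets without a custom transform pass through unchanged.
    return records if transform is None else transform(records)
```

With a layout like this, contributing a new transformation would mean adding one script and one registry entry, without touching the shared helpers in `etl/utils.py`.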