Merge duplicate funcs #977
Comments
I'm generally in support of this but I think we do want to be careful about making pipelines depend on each other.
Does that mean that any function common to multiple pipelines should be extracted out to the parent folder of all places it is used?
I don't have a rule in mind yet. We should just carefully think about how pipelines are supposed to interact and depend on each other.
I took a peek at every duplicate function name and found a lot of overlap. These are functions that can definitely refer to shared or upstream resources as-is or with minor tweaks
These are funcs that would be best resolved by introducing a new
These are funcs that would need significant refactoring to be resolved
These funcs are passed to
These are instances that would likely be resolved by #1049
Process
While writing pytests, I noticed quite a few funcs that were copies of methods shared across tables, both across different versions of a pipeline and across different pipelines. I put together the following script to map out where funcs are duplicated, ignoring funcs shared via inheritance and funcs with contextual definitions like insert_default.
Script
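The script itself did not survive in this copy of the issue. As a rough illustration of the approach described above, here is a minimal sketch that assumes a static scan with Python's ast module. The function name find_duplicate_funcs and the contents of IGNORED are hypothetical, not taken from the original script. Because only explicit definitions are collected, methods that a table merely inherits never show up as duplicates.

```python
import ast
from collections import defaultdict
from pathlib import Path

# Hypothetical ignore list: names expected to be redefined per table
# (the issue only names insert_default; the others are assumptions).
IGNORED = {"insert_default", "make", "__init__"}


def find_duplicate_funcs(root, ignored=IGNORED):
    """Return {func_name: [files]} for names defined in two or more files."""
    root = Path(root)
    locations = defaultdict(set)
    for path in sorted(root.rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be parsed
        # ast.walk visits module-level functions and class methods alike
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                if node.name not in ignored:
                    locations[node.name].add(path.relative_to(root).as_posix())
    # Keep only names that appear in more than one file
    return {
        name: sorted(paths)
        for name, paths in locations.items()
        if len(paths) > 1
    }
```

Calling find_duplicate_funcs on a package directory returns a dict mapping each repeated name to the files that define it. Matching is by name only, so unrelated functions that happen to share a name are reported as well.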
Results
There are some false positives (create_group).
But there are also many repetitions across versions of pipelines (fill_nan, make_video, generate_pos_components).
Proposed