Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(WIP) CI: Check that private functions are not used across modules #32942

Conversation

ShaharNaveh
Copy link
Member


cc @jbrockmendel

@ShaharNaveh ShaharNaveh force-pushed the CI-check-for-no-internal-functions-across-modules branch from 7ec9851 to d9f3928 Compare March 23, 2020 18:25
Comment on lines 126 to 130
if not (isinstance(node, ast.Import) or isinstance(node, ast.ImportFrom)):
continue

if any(mod.name.split(".")[-1].startswith("_") for mod in node.names):
yield (node.lineno, "Use of private function across modules found.")
Copy link
Member Author

@ShaharNaveh ShaharNaveh Mar 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic behind this is as follow:

  1. If the node is not import or from foo import bar we don't care here, so continue to loop.

  2. if the imported function name starts with an underscore(_), it will yield the line number of the import statement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add the function name to the message

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

Since 4786cf7 the private function imported name is included in the message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is great, thanks

@jbrockmendel
Copy link
Member

neat, thanks for taking the lead on this. Looks like we've got plenty of violations here. let's stick a pin in this and try to whittle those down over the coming weeks, then revisit

@ShaharNaveh
Copy link
Member Author

neat, thanks for taking the lead on this. Looks like we've got plenty of violations here. let's stick a pin in this and try to whittle those down over the coming weeks, then revisit

@jbrockmendel Should I open a master issue on this?

@jbrockmendel
Copy link
Member

Should I open a master issue on this?

You could open a STY issue, sure. The other (harder) thing that would go with this would be checking more generally for foo._bar (excepting self._bar which is OK)

Or could do a round-up for STY issues, since I know there are a bunch.

@jbrockmendel
Copy link
Member

@MomIsBestFriend how hard would it be to make a special case to allow _shared_docs and similar?

@ShaharNaveh
Copy link
Member Author

@MomIsBestFriend how hard would it be to make a special case to allow _shared_docs and similar?

@jbrockmendel Very easy, can you please provide me a list of names you'd like to have an exception for?

@jbrockmendel
Copy link
Member

can you please provide me a list of names you'd like to have an exception for?

Subject to change, let's start with ["_merge_doc", "_shared_docs", "_extension_array_shared_docs", "_index_shared_docs"]

@ShaharNaveh
Copy link
Member Author

Note: ATM I'm only checking pandas/core/ because .pyx gives me trouble.
I will open a separate PR giving the ability to choose the files extension to check (instead of hard-coding it)

@ShaharNaveh
Copy link
Member Author

After the changes in #33216 are accepted, the whole pandas directory will be scanned.

@jbrockmendel
Copy link
Member

@MomIsBestFriend can you re-push? re-running the failed GH Actions check isnt working for some reason

@ShaharNaveh
Copy link
Member Author

rebased @jbrockmendel

@ShaharNaveh
Copy link
Member Author

@jbrockmendel As I was going through the code I noticed that for example at

def _validate_limit(nobs: int, limit=None) -> int:

_validate_limit is considered private, but it gets used in:

limit = algos._validate_limit(nobs=None, limit=limit)

and in

limit = libalgos._validate_limit(None, limit=limit)


ATM the check is doing "lazy" check, as it only checking for imports of from foo import _bar.

It will not detect things like

import foo as foolib

value = foolib._private_function("baz")

Do you want the existing check to validate the case above as well?

@jbrockmendel
Copy link
Member

Do you want the existing check to validate the case above as well?

That's going to be the next step, but its going to have a lot of violations

@jbrockmendel
Copy link
Member

Let's tentatively add to the ignored list all of the remaining docstring template stuff and pickle-constructor stuff:

_new_Index,
_new_PeriodIndex
_doc_template
_interval_shared_docs
_apply_docs
_arith_doc_FRAME
_flex_comp_doc_FRAME
_make_flex_doc
_op_descriptions
_pipe_template

We should make the _np_under_17 public, will take another look at whats left.

…-check-for-no-internal-functions-across-modules
@jbrockmendel
Copy link
Member

just rebased and pushed

@jbrockmendel
Copy link
Member

@MomIsBestFriend brainstorming: how hard would it be to check for tests that just silently pass or return? e.g. we have a bunch of those in tests.indexing.test_coercion

@jreback jreback added the CI Continuous Integration label Jun 14, 2020
@jreback
Copy link
Contributor

jreback commented Jun 14, 2020

nice idea. pls comment / open a new PR if able to continue.

@jreback jreback closed this Jun 14, 2020
@ShaharNaveh ShaharNaveh deleted the CI-check-for-no-internal-functions-across-modules branch May 5, 2021 12:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI Continuous Integration
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants