Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pandas-vet] Constant column check with nunique #5588

Closed
sbrugman opened this issue Jul 7, 2023 · 1 comment
Closed

[pandas-vet] Constant column check with nunique #5588

sbrugman opened this issue Jul 7, 2023 · 1 comment
Labels
needs-decision Awaiting a decision from a maintainer rule Implementing or modifying a lint rule

Comments

@sbrugman
Copy link
Contributor

sbrugman commented Jul 7, 2023

Newly proposed upstream rule: constant column check with nunique

series.nunique() == 1

Rather use (in the absence of nans):

v = series.values
(v[0] == v).all()

See deppen8/pandas-vet#119

@charliermarsh charliermarsh added rule Implementing or modifying a lint rule needs-decision Awaiting a decision from a maintainer labels Jul 8, 2023
@sbrugman
Copy link
Contributor Author

sbrugman commented Jul 12, 2023

Note that pandas-dev/pandas#54064 was merged to pandas. This adds documentation to the cookbook on how users can check for constant columns. We can link to that page in the linting message.

There are quite some projects that could benefit from this rule.

(pandas-vet does not seem that active anymore)

charliermarsh pushed a commit that referenced this issue Jul 17, 2023
## Summary

Implementation for #5588

Q1: are there any additional semantic helpers that could be used to
guard this rule? Which existing rules should be similar in that respect?
Can we at least check if `pandas` is imported (any pointers welcome)?
Currently, the rule flags:
```python
data = {"a": "b"}
data.nunique() == 1
```

Q2: Any pointers on naming of the rule and selection of the code? It was
proposed, but not replied to/implemented in the upstream. `pandas` did
accept a PR to update their cookbook to reflect this rule though.

## Test Plan

TODO:
- [X] Checking for ecosystem CI results
- [x] Test on selected [real-world
cases](https://github.com/search?q=%22nunique%28%29+%3D%3D+1%22+language%3APython+&type=code)
  - [x] https://github.com/sdv-dev/SDMetrics
  - [x] https://github.com/google-research/robustness_metrics
  - [x] https://github.com/soft-matter/trackpy
  - [x] https://github.com/microsoft/FLAML/
- [ ] Add guarded test cases
evanrittenhouse pushed a commit to evanrittenhouse/ruff that referenced this issue Jul 19, 2023
## Summary

Implementation for astral-sh#5588

Q1: are there any additional semantic helpers that could be used to
guard this rule? Which existing rules should be similar in that respect?
Can we at least check if `pandas` is imported (any pointers welcome)?
Currently, the rule flags:
```python
data = {"a": "b"}
data.nunique() == 1
```

Q2: Any pointers on naming of the rule and selection of the code? It was
proposed, but not replied to/implemented in the upstream. `pandas` did
accept a PR to update their cookbook to reflect this rule though.

## Test Plan

TODO:
- [X] Checking for ecosystem CI results
- [x] Test on selected [real-world
cases](https://github.com/search?q=%22nunique%28%29+%3D%3D+1%22+language%3APython+&type=code)
  - [x] https://github.com/sdv-dev/SDMetrics
  - [x] https://github.com/google-research/robustness_metrics
  - [x] https://github.com/soft-matter/trackpy
  - [x] https://github.com/microsoft/FLAML/
- [ ] Add guarded test cases
konstin pushed a commit that referenced this issue Jul 19, 2023
## Summary

Implementation for #5588

Q1: are there any additional semantic helpers that could be used to
guard this rule? Which existing rules should be similar in that respect?
Can we at least check if `pandas` is imported (any pointers welcome)?
Currently, the rule flags:
```python
data = {"a": "b"}
data.nunique() == 1
```

Q2: Any pointers on naming of the rule and selection of the code? It was
proposed, but not replied to/implemented in the upstream. `pandas` did
accept a PR to update their cookbook to reflect this rule though.

## Test Plan

TODO:
- [X] Checking for ecosystem CI results
- [x] Test on selected [real-world
cases](https://github.com/search?q=%22nunique%28%29+%3D%3D+1%22+language%3APython+&type=code)
  - [x] https://github.com/sdv-dev/SDMetrics
  - [x] https://github.com/google-research/robustness_metrics
  - [x] https://github.com/soft-matter/trackpy
  - [x] https://github.com/microsoft/FLAML/
- [ ] Add guarded test cases
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-decision Awaiting a decision from a maintainer rule Implementing or modifying a lint rule
Projects
None yet
Development

No branches or pull requests

2 participants