Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REF: define arithmetic methods non-dynamically #51813

Merged
merged 12 commits into from
Mar 8, 2023

Conversation

jbrockmendel
Copy link
Member

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Avoid runtime imports, get stuff out of ops.__init__, which was the goal 4-5 years ago when ops was first madea directory.

Move relevant helper methods from frame/series files . Decided on a new mixin rather than dumping everything directly in DataFrame/Series bc those files are too bug as it is. If others feel strongly i'd be OK putting it all in DataFrame/Series.


# Set the result's name after __finalize__ is called because __finalize__
# would set it back to self.name
out.name = name # pyright: ignore[reportGeneralTypeIssues]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Dr-Irv how do i get pyright to stop complaining about this?

  /home/runner/work/pandas/pandas/pandas/core/ops/methods.py:785:20 - error: Cannot assign member "name" for type "tuple[Series, Series]"
    Member "name" is unknown (reportGeneralTypeIssues)

Copy link
Member

@twoertwein twoertwein Mar 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the ignore comment locally and run pre-commit run -av --hook-stage manual pyright_reportGeneralTypeIssues but did not get an error

(could also bump pyright to 1.1.296)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was thinking this could be a pyright bug, but might be difficult to create a reproducible example. At least when I've seen that kind of error, it's usually been a pyright bug

pandas/core/series.py Outdated Show resolved Hide resolved
right = right._values

# error: "FrameOps" has no attribute "_iter_column_arrays"
col_arrays = self._iter_column_arrays() # type: ignore[attr-defined]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume sub-classes implement this method? Would it be worth making this an abstractmethod to ensure that sub-classes definitely have it (and to make the type checkers happy)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yah this is only inherited by DataFrame.

might just put this all in Series/DataFrame instead of having a subclass to avoid this typing gymnastics

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to have these directly on DataFrame/Series, less indirection too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved + green

@mroeschke mroeschke added the Refactor Internal refactoring of code label Mar 8, 2023
@jbrockmendel
Copy link
Member Author

The failing doctest looks like the doctest is wrong. Must just not be getting in the status quo?

>>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> a.divmod(b, fill_value=0)
(a    1.0
- b    NaN
- c    NaN
+ b    inf
+ c    inf
 d    0.0
 e    NaN
 dtype: float64,
 a    0.0
 b    NaN
 c    NaN
 d    0.0
 e    NaN
 dtype: float64)

@mroeschke
Copy link
Member

The failing doctest looks like the doctest is wrong. Must just not be getting in the status quo?

Hmm is this the same for DataFrame too and/or a recent change?

For label b, I would expect this to follow:

In [10]: 1 / np.nan
Out[10]: nan

For label c, since there's no corresponding values in series b I would expect it to be filled with nan too?

@jbrockmendel
Copy link
Member Author

I would expect it to be filled with nan too?

At the reindex stage it is, but then fill_value=0 part is done after the reindex.

@mroeschke mroeschke added this to the 2.1 milestone Mar 8, 2023
@mroeschke mroeschke merged commit 7489224 into pandas-dev:main Mar 8, 2023
@mroeschke
Copy link
Member

Thanks @jbrockmendel

@jbrockmendel jbrockmendel deleted the ref-ops-2 branch March 8, 2023 21:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants