[BUG] iloc/loc keeps circular reference to original DataFrame/Series #15748

Closed
mroeschke opened this issue May 14, 2024 · 0 comments · Fixed by #15749
Labels: bug (Something isn't working), Python (Affects Python cuDF API)

Describe the bug
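Because loc and iloc are defined with cached_property, the indexer object is stored on the instance after first access, and the indexer itself holds a reference back to self. This creates a reference cycle, so the original object stays alive after its last external reference is deleted (until the cyclic garbage collector runs).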
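
The mechanism can be reproduced without cuDF. The following is a minimal sketch on CPython, assuming a hypothetical Frame class and Indexer class that only mimic the shape of the problem (they are not cuDF's actual classes): the cached indexer lives in the instance's __dict__ and points back at its owner, so reference counting alone cannot free the object.

```python
import gc
import weakref
from functools import cached_property


class Indexer:
    """Hypothetical stand-in for a loc/iloc indexer; it keeps its parent."""

    def __init__(self, frame):
        self.frame = frame  # strong reference back to the owner


class Frame:
    """Hypothetical stand-in for a DataFrame; not cuDF's real class."""

    @cached_property
    def iloc(self):
        # cached_property stores the result in self.__dict__["iloc"],
        # so Frame -> Indexer -> Frame forms a reference cycle.
        return Indexer(self)


def alive_after_del():
    gc.disable()  # keep the cycle collector from running mid-demo
    try:
        frame = Frame()
        ref = weakref.ref(frame)
        frame.iloc  # populate the cache, which creates the cycle
        del frame
        still_alive = ref() is not None  # refcounting alone cannot free it
        gc.collect()  # an explicit cycle collection can
        freed_by_gc = ref() is None
    finally:
        gc.enable()
    return still_alive, freed_by_gc


print(alive_after_del())
```

The object is eventually reclaimed by the cycle collector, so this is delayed cleanup rather than a permanent leak, but for GPU-backed data the delay can still matter.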

Steps/Code to reproduce bug

In [1]: import weakref, cudf, pandas as pd

In [2]: df = cudf.DataFrame(range(1))

In [3]: ref = weakref.ref(df)

In [4]: df.iloc[0]
Out[4]: 
0    0
Name: 0, dtype: int64

In [5]: del df

In [6]: ref() is None
Out[6]: False

In [7]: df1 = pd.DataFrame(range(1))

In [8]: ref1 = weakref.ref(df1)

In [9]: df1.iloc[0]
Out[9]: 
0    0
Name: 0, dtype: int64

In [10]: del df1

In [11]: ref1() is None
Out[11]: True

Expected behavior
Out[6] should be True, matching the pandas result in Out[11].
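
As a rough illustration of the expected behavior, here is a sketch using a plain property on hypothetical toy classes (Frame and Indexer are assumptions for illustration, not cuDF's implementation). Because the indexer is rebuilt on each access and never stored on the instance, no cycle forms and CPython frees the object as soon as the last reference is deleted.

```python
import weakref


class Indexer:
    """Hypothetical stand-in for a loc/iloc indexer."""

    def __init__(self, frame):
        self.frame = frame  # reference to the owner, held only while the indexer lives


class Frame:
    """Hypothetical stand-in for a DataFrame; not cuDF's real class."""

    @property
    def iloc(self):
        # A fresh Indexer is returned on every access and is never cached
        # on self, so the Frame does not end up referencing itself.
        return Indexer(self)


def freed_immediately():
    frame = Frame()
    ref = weakref.ref(frame)
    frame.iloc  # the temporary Indexer is discarded right after this line
    del frame
    return ref() is None


print(freed_immediately())
```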

Performance impact of switching to a plain @property (which I think is an acceptable cost for eliminating this reference cycle):

In [2]: df = cudf.DataFrame(range(1))

In [3]: %timeit df.loc  # property
272 ns ± 24.3 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [3]: %timeit df.loc  # cached_property
46.3 ns ± 1.63 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
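
For anyone who wants to see the same trade-off outside IPython, the comparison can be approximated with the standard timeit module. This is only a sketch on hypothetical toy classes (WithProperty, WithCachedProperty, and Indexer are assumptions, not cuDF code), so the absolute numbers will differ from the ones above.

```python
import timeit
from functools import cached_property


class Indexer:
    """Hypothetical stand-in for a loc/iloc indexer."""

    def __init__(self, frame):
        self.frame = frame


class WithProperty:
    @property
    def loc(self):
        return Indexer(self)  # rebuilt on every access


class WithCachedProperty:
    @cached_property
    def loc(self):
        return Indexer(self)  # built once, then read from the instance __dict__


def time_access(obj, number=100_000):
    """Return the mean time in seconds for one obj.loc attribute access."""
    return timeit.timeit(lambda: obj.loc, number=number) / number


if __name__ == "__main__":
    print(f"property:        {time_access(WithProperty()) * 1e9:.1f} ns")
    print(f"cached_property: {time_access(WithCachedProperty()) * 1e9:.1f} ns")
```

The lambda adds a constant call overhead to both measurements, so the difference between the two figures is more meaningful than either figure alone.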

Environment overview (please complete the following information)

  • Environment location: Bare-metal
  • Method of cuDF install: conda

mroeschke added the bug and Python labels on May 14, 2024
rapids-bot pushed a commit that referenced this issue on May 15, 2024:
closes #15748

The performance implication can be seen in the issue

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Lawrence Mitchell (https://github.com/wence-)

URL: #15749