Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand RUF015 rule for list(iterable).pop(0) idiom #9190

Closed
Skylion007 opened this issue Dec 18, 2023 · 14 comments · Fixed by #10148
Closed

Expand RUF015 rule for list(iterable).pop(0) idiom #9190

Skylion007 opened this issue Dec 18, 2023 · 14 comments · Fixed by #10148
Assignees
Labels
accepted Ready for implementation good first issue Good for newcomers rule Implementing or modifying a lint rule

Comments

@Skylion007
Copy link
Contributor

Skylion007 commented Dec 18, 2023

I just saw another method of accessing the first element of an iterable / dictionary key that is very inefficient. Currently, this is not detected by RUF015 which we have enabled. We should expand RUF015 to cover this idiom. A minimal problematic example is found below.

first_element = list(iterable).pop(0)

It should be

first_element = next(iter(iterable))

which would not require iterating through the entire iterable just to retrieve the first element. Expanding the autofix to cover this would be helpful as well.

  • RUFF 0.1.8 (and all earlier versions are affected).
@Skylion007 Skylion007 changed the title Expand RUF015 rule for list(iterable).pop() idiom Expand RUF015 rule for list(iterable).pop() idiom Dec 18, 2023
@charliermarsh charliermarsh added rule Implementing or modifying a lint rule accepted Ready for implementation labels Dec 18, 2023
@charliermarsh
Copy link
Member

👍 Makes sense!

@T-256
Copy link
Contributor

T-256 commented Dec 19, 2023

>>> iterable = [1,2]
>>> list(iterable).pop()
2
>>> next(iter(iterable))
1
>>> 

@charliermarsh
Copy link
Member

Oh, oops, the .pop() takes the last item, not the first. We could wrap in a reversed...

@Skylion007
Copy link
Contributor Author

Skylion007 commented Dec 19, 2023

Oh whoops. Good catch, I misremembered the default arg of pop. This still applies to pop(0) though. The reversed iter next isn't bad though. I wonder if a deque could be abused to grab the last element too.

@Skylion007 Skylion007 changed the title Expand RUF015 rule for list(iterable).pop() idiom Expand RUF015 rule for list(iterable).pop(0) idiom Dec 19, 2023
@UnknownPlatypus
Copy link
Contributor

Using reversed instead of iter in the .pop() case is great no? (no need to reversed(iter(...)) I think?)

-list(iterable).pop(0)
+next(iter(iterable))

-list(iterable).pop()
+next(reversed(iterable))

@Skylion007
Copy link
Contributor Author

Oh, great point!

@Skylion007
Copy link
Contributor Author

Skylion007 commented Dec 19, 2023

Actually, one concern if the generator is very, very long and if the iterable doesn't implement __reversed__, then the next(reversed(iterable)) could take a lot of memory, right? I think the most efficient solution is to use reversed if it does implement this method, otherwise use deque(iterable, max_len=1).pop(), right? reversed only works on sequences not iterators so I am not sure it's a drop in replacement sadly.

a = iter(range(10))
b = reversed(a)

will return an error.

@UnknownPlatypus
Copy link
Contributor

oh right, this is a tricky one. However, I'm not sure deque(iterable, max_len=1).pop() will actually be faster. On my machine it's actually slower. The iterable is probably evaluated entirely idk:

%timeit deque(range(10000), maxlen=1).pop()
88.7 µs ± 8.44 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%timeit list(range(10000)).pop()
73.8 µs ± 252 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

@Skylion007
Copy link
Contributor Author

Yeah, this is more about memory management than speed. I think we agree that thepop(0) change is definitely not controversial though.

@charliermarsh charliermarsh added the good first issue Good for newcomers label Feb 7, 2024
@hauntsaninja
Copy link
Contributor

(this should be an unsafe fix)

@rmad17
Copy link

rmad17 commented Feb 14, 2024

I am a beginner in Rust and would like to contribute. Can I take this up?
Going through the comments I got some idea but its not clear what is the solution.

@dhruvmanila
Copy link
Member

dhruvmanila commented Feb 15, 2024

I am a beginner in Rust and would like to contribute. Can I take this up?

Sure!

Going through the comments I got some idea but its not clear what is the solution.

So, currently the implementation takes in an ExprSubscript which matches the list(iterable)[0] code (AST Playground). What we want is to also look for code like list(iterable).pop(0) which is an ExprAttribute (AST Playground).

This will require updating the function signature to now take any Expr to allow for either Expr::Subscript or Expr::Attribute. You'll need to update the function body to check for the .pop(0) idiom along with the existing [0] (is_head_slice). Rest of the code should probably be the same.

The fix for this rule is already unsafe and it would help expand on the fix safety documentation of the rule with why this fix is unsafe w.r.t. the new code.

Hope this helps, feel free to ask any other questions that you might have. I'll assign this issue to you but don't feel any pressure to complete this :)

@robincaloudis
Copy link
Contributor

Since there was no activity on this issue yet (followed it for the last 2 weeks), I stepped ahead and provided a fix in #10148. I hope I interpreted the inactivity on this ticket correctly and didn't steal someones chance. @dhruvmanila, @charliermarsh, do you mind reviewing the PR? Thanks a lot.

charliermarsh pushed a commit that referenced this issue Feb 28, 2024
…0148)

## Summary

Currently, rule `RUF015` is not able to detect the usage of
`list(iterable).pop(0)` falling under the category of an _unnecessary
iterable allocation for accessing the first element_. This PR wants to
change that. See the underlying issue for more details.

* Provide extension to detect `list(iterable).pop(0)`, but not
`list(iterable).pop(i)` where i > 1
* Update corresponding doc

## Test Plan

* `RUF015.py` and the corresponding snap file were extended such that
their correspond to the new behaviour

Closes #9190

--- 

PS: I've only been working on this ticket as I haven't seen any activity
from issue assignee @rmad17, neither in this repo nor in a fork. I hope
I interpreted his inactivity correctly. Didn't mean to steal his chance.
Since I stumbled across the underlying problem myself, I wanted to offer
a solution as soon as possible.
@rmad17
Copy link

rmad17 commented Feb 29, 2024

@robincaloudis Thanks for taking it up, I was not able to look into it despite asking for the ticket to be assigned.

nkxxll pushed a commit to nkxxll/ruff that referenced this issue Mar 10, 2024
…tral-sh#10148)

## Summary

Currently, rule `RUF015` is not able to detect the usage of
`list(iterable).pop(0)` falling under the category of an _unnecessary
iterable allocation for accessing the first element_. This PR wants to
change that. See the underlying issue for more details.

* Provide extension to detect `list(iterable).pop(0)`, but not
`list(iterable).pop(i)` where i > 1
* Update corresponding doc

## Test Plan

* `RUF015.py` and the corresponding snap file were extended such that
their correspond to the new behaviour

Closes astral-sh#9190

--- 

PS: I've only been working on this ticket as I haven't seen any activity
from issue assignee @rmad17, neither in this repo nor in a fork. I hope
I interpreted his inactivity correctly. Didn't mean to steal his chance.
Since I stumbled across the underlying problem myself, I wanted to offer
a solution as soon as possible.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted Ready for implementation good first issue Good for newcomers rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants