Use dict keys for order-preserving dedupes instead of set + list #15105
mypyc/ir/rtypes.py
Outdated
# Use a dict for O(1) lookups that preserve order; values are ignored
seen: dict[RType, Any] = {}
for item in items:
if item not in seen:
new_items.append(item)
seen.add(item)
if len(new_items) > 1:
return RUnion(new_items)
seen[item] = None
This `for` loop could be written more simply as:
seen = dict.fromkeys(items)
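For anyone following along, here is a minimal standalone sketch of why `dict.fromkeys` works as an order-preserving dedupe. It uses plain strings as stand-ins for `RType` objects, and `dedupe` is a hypothetical helper name for illustration, not something in mypyc.

```python
def dedupe(items):
    # Dict keys are unique, and dicts keep insertion order (guaranteed
    # since Python 3.7), so the first occurrence of each item wins.
    # Items must be hashable, the same requirement a set would have.
    return list(dict.fromkeys(items))


items = ["int", "str", "int", "None", "str"]
unique = dedupe(items)
print(unique)
```

The values stored in the dict are all `None` and are never read; only the keys matter.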
Just pushed this, I swear you read my mind :)
mypyc/ir/rtypes.py
Outdated
return RUnion(new_items)
seen[item] = None
if len(seen) > 1:
return RUnion(list(seen.keys()))
No need to call `.keys()` here, since dictionaries are perfectly iterable :)
return RUnion(list(seen.keys()))
return RUnion(list(seen))
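A quick check of the point above, as a sketch with string keys standing in for `RType` values: iterating a dict yields its keys in insertion order, so `list(seen)` and `list(seen.keys())` produce the same list.

```python
seen = dict.fromkeys(["a", "b", "a"])

# Both forms iterate over the keys in insertion order.
keys_explicit = list(seen.keys())
keys_implicit = list(seen)
print(keys_explicit == keys_implicit)
```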
Ah yep I'll blame the time
mypyc/ir/rtypes.py
Outdated
unique_items = list(dict.fromkeys(items))
if len(unique_items) > 1:
return RUnion(unique_items)
else:
return new_items[0]
return unique_items[0]
No idea if it actually makes a difference in terms of performance, but I feel like I mildly preferred your earlier idea of only casting it to a `list` if it's actually necessary, i.e.:
unique_items = dict.fromkeys(items)
if len(unique_items) > 1:
    return RUnion(list(unique_items))
else:
    return next(iter(unique_items))
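To make the suggestion concrete, here is a self-contained sketch of that shape. `FakeUnion` and `simplify` are hypothetical stand-ins (not mypyc's `RUnion` or the function under review), and the sketch assumes `items` is non-empty, since `next(iter(...))` on an empty dict raises `StopIteration`.

```python
from dataclasses import dataclass


@dataclass
class FakeUnion:
    # Hypothetical stand-in for RUnion; it only records its members.
    items: list


def simplify(items):
    # Assumption: items is non-empty.
    unique_items = dict.fromkeys(items)
    if len(unique_items) > 1:
        # Only build a list when a union is actually needed.
        return FakeUnion(list(unique_items))
    # Exactly one distinct item: take the first key without copying.
    return next(iter(unique_items))


print(simplify(["int", "int"]))
print(simplify(["int", "str", "int"]))
```

The single-item path avoids allocating a list at all, which is the only difference from building `list(dict.fromkeys(items))` up front.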
So yeah reverted to that
I figured I might be in the wrong place altogether but this was low-hanging fruit so I decided to pick it. I'll try to take a look at |
Looks great to me!
I figured this might be faster and less code. Also just curious what the process is like to contribute to mypy.