Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement PartialEq<str> for Bound<'py, PyString> #4245

Merged
merged 4 commits into from
Jun 16, 2024

Conversation

davidhewitt
Copy link
Member

Idea spurred by #4020 and also what I've just been doing in #4196.

This PR implements comparison between Python str and Rust &str by using:

  • The new PyUnicode_EqualToUTF8AndSize on Python 3.13,
  • fallback to converting to a Rust &str and doing equality that way on older versions.

PyUnicode_EqualToUTF8AndSize can never fail, so similarly the fallback implementations just return false on error getting UTF8 data out.

@davidhewitt
Copy link
Member Author

davidhewitt commented Jun 14, 2024

Thanks for the approval @LilyFoote!

I just realised: there's a bit of an edge case here: what about when the Bound<'py, PyString> contains a subclass of str? That subclass can override inequality, and might not compare equal with strings (although that would be very weird).

What should we do then? One option is to make these implementations return false for subclasses (e.g. do an is_exact_instance_of::<PyString>() check).

In PyUnicode_EqualToUTF8 there is no subclass checking - https://github.com/python/cpython/blob/42351c3b9a357ec67135b30ed41f59e6f306ac52/Objects/unicodeobject.c#L10789

I'm inclined to proceed with this as-is, but I would also be glad to hear folks' opinions as there is a slight possibility this is a footgun which leads to surprises in the subclass edge case.

@LilyFoote
Copy link
Contributor

I think subclassing built-in types like str is fairly well known to have edge-cases and oddities like this. Maybe all we need here is a note in the docs.

@davidhewitt
Copy link
Member Author

I think subclassing built-in types like str is fairly well known to have edge-cases and oddities like this. Maybe all we need here is a note in the docs.

True! I think to that point, it's very likely that users would write bound_str.to_str()? == "rust string" which is just as wrong for subclasses as what we implement here. So documenting gotcha on the trait implementations as you suggest and moving on might be fine.

Any other opinions before we proceed?

Copy link
Contributor

@Icxolu Icxolu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree here. I think it is reasonable to expect PyString to do normal string comparison and having the PartialEq impls makes a lot of code much more ergonomic and readable. Documenting this behavior seems sufficient.

src/types/string.rs Show resolved Hide resolved
@davidhewitt davidhewitt mentioned this pull request Jun 15, 2024
5 tasks
@davidhewitt
Copy link
Member Author

Thanks both! I've added documentation and will proceed to merge 👍

@davidhewitt davidhewitt enabled auto-merge June 16, 2024 06:24
@davidhewitt davidhewitt added this pull request to the merge queue Jun 16, 2024
Merged via the queue into PyO3:main with commit 9648d59 Jun 16, 2024
41 checks passed
@davidhewitt davidhewitt deleted the pystring-eq branch June 16, 2024 09:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants