Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update identifier Unicode character validation to match Python spec #7169

Closed
zanieb opened this issue Sep 5, 2023 · 2 comments · Fixed by #7209
Closed

Update identifier Unicode character validation to match Python spec #7169

zanieb opened this issue Sep 5, 2023 · 2 comments · Fixed by #7209
Assignees
Labels
bug Something isn't working help wanted Contributions especially welcome

Comments

@zanieb
Copy link
Member

zanieb commented Sep 5, 2023

Python supports some, but not all Unicode characters in identifiers. See https://docs.python.org/3/reference/lexical_analysis.html#identifiers and https://peps.python.org/pep-3131/

However, Ruff is not correctly enforcing the subset of valid unicode characters e.g. in is_identifier

/// Returns `true` if a string is a valid Python identifier (e.g., variable
/// name).
pub fn is_identifier(name: &str) -> bool {
// Is the first character a letter or underscore?
let mut chars = name.chars();
if !chars.next().is_some_and(|c| c.is_alphabetic() || c == '_') {
return false;
}
// Are the rest of the characters letters, digits, or underscores?
if !chars.all(|c| c.is_alphanumeric() || c == '_') {
return false;
}
// Is the identifier a keyword?
if is_keyword(name) {
return false;
}
true
}

This causes bugs such as:

We need to add validation of the specific supported subset of characters.

@zanieb zanieb added bug Something isn't working help wanted Contributions especially welcome labels Sep 5, 2023
@MichaReiser
Copy link
Member

MichaReiser commented Sep 5, 2023

The relevant function in our lexer

fn is_identifier_continuation(c: char) -> bool {
match c {
'a'..='z' | 'A'..='Z' | '_' | '0'..='9' => true,
c => is_xid_continue(c),
}
}

@LaBatata101
Copy link
Contributor

I can work on this one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Contributions especially welcome
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants