-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Detect and ignore Jupyter automagics #8398
Conversation
crates/ruff_notebook/src/notebook.rs
Outdated
| "who_ls" | ||
| "whos" | ||
| "xdel" | ||
| "xmode" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Obviously some risk of this getting stale but I kind of doubt these change dramatically over time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. There are user defined magics as well but I think that's out of scope 😅
414bbde
to
87af9fa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, leaving it to @dhruvmanila, the IPython expert to approve.
crates/ruff_notebook/src/notebook.rs
Outdated
|
||
/// Returns `true` if a cell should be ignored due to the use of cell magics. | ||
fn is_magic_cell(line: &str) -> bool { | ||
let line = line.trim_start(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Trim python whitespace only?
crates/ruff_notebook/src/notebook.rs
Outdated
// ``` | ||
// | ||
// See: https://ipython.readthedocs.io/en/stable/interactive/magics.html | ||
if line.split_whitespace().next().is_some_and(|token| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Split on whitspace only
PR Check ResultsEcosystem✅ ecosystem check detected no linter changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a reasonable approach. In the future, we could alter the AST to include the magic command name separately from the rest of the command value. That could be beneficial to have dedicated logic for specific magic command.
crates/ruff_notebook/src/notebook.rs
Outdated
|
||
/// Returns `true` if a cell should be ignored due to the use of cell magics. | ||
fn is_magic_cell(line: &str) -> bool { | ||
let line = line.trim_start(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The trim_start
is required because the magics can be at any indentation level but it seems like the logic is different for auto-magics. For example, the following is invalid:
if True:
pwd
But, then the following is valid:
pwd
# ^^ unnecessary indentation
I think it's fine to go forward with this as it seems difficult to detect this without the surrounding context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we still trim, but only test the first line, rather than all subsequent lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, we actually run the risk of some false positives here...
For example, I guess we'd now flag alias + 1
as an automagic, incorrectly.
We may need to try parsing this cell...? And fall back to automagics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we still trim, but only test the first line, rather than all subsequent lines?
That might risk in not detecting cases where there's Python code before the magic commands.
For example, I guess we'd now flag
alias + 1
as an automagic, incorrectly.
Are you referring alias
as a variable name? Yes, that's basically a risk for all escape commands 😅
We may need to try parsing this cell...? And fall back to automagics?
How would this help?
crates/ruff_notebook/src/notebook.rs
Outdated
| "who_ls" | ||
| "whos" | ||
| "xdel" | ||
| "xmode" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. There are user defined magics as well but I think that's out of scope 😅
87af9fa
to
541598d
Compare
@dhruvmanila - Thought on this a bit more, and it seems like a reasonable approach for now. If a user does have a cell that consists of code, but starts with a standalone automagic, then the behavior of that cell will vary based on the order in which the cells are executed. So I think it's ok to assume it's an automagic. But I'm definitely open to refinement. |
|
541598d
to
12454b7
Compare
Summary
LangChain is attempting to use Ruff over their Jupyter notebooks (https://github.com/langchain-ai/langchain/pull/12677/files), but running into a bunch of syntax errors, the majority of which come from our inability to recognize automagic.
If you run this in a cell:
Jupyter will automatically treat that as:
We need to ignore cells that use these automagics, since the parser doesn't understand them. (I guess we could support it in the parser, but that seems much harder?). The good news is that AFAICT Jupyter doesn't let you mix automagics with code, so by skipping these cells, we don't miss out on analyzing any Python code.
Test Plan
cargo test
pip install
automagics.