Detect and ignore Jupyter automagics #8398

charliermarsh · 2023-10-31T23:25:08Z

Summary

LangChain is attempting to run Ruff over their Jupyter notebooks (https://github.com/langchain-ai/langchain/pull/12677/files), but is running into a bunch of syntax errors, the majority of which come from our inability to recognize automagics.

If you run this in a cell:

pip install requests

Jupyter will automatically treat that as:

%pip install requests

We need to ignore cells that use these automagics, since the parser doesn't understand them. (I guess we could support them in the parser, but that seems much harder?) The good news is that AFAICT Jupyter doesn't let you mix automagics with code, so by skipping these cells, we don't miss out on analyzing any Python code.
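A minimal sketch of the idea, not the code in this PR: the list of names is a hypothetical, abbreviated stand-in for the full set of IPython line magics, and `starts_with_automagic` and `should_skip_cell` are illustrative names. It assumes every line of a cell is checked against the list.

```rust
/// Hypothetical, abbreviated list; the real set of IPython line magics is much longer.
const KNOWN_AUTOMAGICS: &[&str] = &["cd", "pip", "pwd", "who_ls", "whos", "xdel", "xmode"];

/// Returns `true` if the first whitespace-delimited token of `line` is a known
/// automagic name. `split_whitespace` skips leading whitespace, so indented
/// lines are handled as well.
fn starts_with_automagic(line: &str) -> bool {
    line.split_whitespace()
        .next()
        .is_some_and(|token| KNOWN_AUTOMAGICS.contains(&token))
}

/// Returns `true` if any line in the cell looks like an automagic, in which
/// case the whole cell would be skipped rather than parsed as Python.
/// (Assumption: all lines are checked, not only the first one.)
fn should_skip_cell(source: &str) -> bool {
    source.lines().any(starts_with_automagic)
}
```

With this sketch, `should_skip_cell("pip install requests")` is `true`, while a cell containing only `import requests` is still analyzed.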

Test Plan

cargo test
Ran over LangChain and verified that there are no more errors relating to pip install automagics.

charliermarsh · 2023-10-31T23:25:28Z

crates/ruff_notebook/src/notebook.rs

+                    | "who_ls"
+                    | "whos"
+                    | "xdel"
+                    | "xmode"


Obviously some risk of this getting stale but I kind of doubt these change dramatically over time.

I agree. There are user defined magics as well but I think that's out of scope 😅

MichaReiser

LGTM, leaving it to @dhruvmanila, the IPython expert, to approve.

MichaReiser · 2023-10-31T23:31:13Z

crates/ruff_notebook/src/notebook.rs

+
+    /// Returns `true` if a cell should be ignored due to the use of cell magics.
+    fn is_magic_cell(line: &str) -> bool {
+        let line = line.trim_start();


Nit: Trim Python whitespace only?

MichaReiser · 2023-10-31T23:31:46Z

crates/ruff_notebook/src/notebook.rs

+        // ```
+        //
+        // See: https://ipython.readthedocs.io/en/stable/interactive/magics.html
+        if line.split_whitespace().next().is_some_and(|token| {


Nit: Split on whitespace only
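To illustrate the two nits above: Rust's `trim_start` and `split_whitespace` use the Unicode definition of whitespace, whereas Python's tokenizer only treats space, tab, and form feed as whitespace between tokens. A sketch of a Python-only variant follows; `is_python_whitespace` and `first_python_token` are hypothetical helpers written for this example, not necessarily what the crate uses.

```rust
/// Python's tokenizer treats only space, tab, and form feed as whitespace.
fn is_python_whitespace(c: char) -> bool {
    matches!(c, ' ' | '\t' | '\x0C')
}

/// Returns the first token of `line`, trimming and splitting on Python
/// whitespace only. Returns `None` for blank lines.
fn first_python_token(line: &str) -> Option<&str> {
    let trimmed = line.trim_start_matches(is_python_whitespace);
    trimmed
        .split(is_python_whitespace)
        .next()
        .filter(|token| !token.is_empty())
}
```

The difference shows up with characters such as a no-break space (U+00A0): `split_whitespace` would skip it and yield `pip`, while the Python-only version keeps it as part of the token, so the line would not match an automagic name.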

github-actions · 2023-10-31T23:47:56Z

PR Check Results

Ecosystem

✅ ecosystem check detected no linter changes.

dhruvmanila

I think this is a reasonable approach. In the future, we could alter the AST to include the magic command name separately from the rest of the command value. That would make it easier to add dedicated logic for specific magic commands.

dhruvmanila · 2023-11-01T03:23:08Z

crates/ruff_notebook/src/notebook.rs

+
+    /// Returns `true` if a cell should be ignored due to the use of cell magics.
+    fn is_magic_cell(line: &str) -> bool {
+        let line = line.trim_start();


The trim_start is required because the magics can be at any indentation level but it seems like the logic is different for auto-magics. For example, the following is invalid:

if True:
    pwd

But, then the following is valid:

  pwd
# ^^ unnecessary indentation

I think it's fine to go forward with this as it seems difficult to detect this without the surrounding context.

Can we still trim, but only test the first line, rather than all subsequent lines?

Hmm, we actually run the risk of some false positives here...

For example, I guess we'd now flag alias + 1 as an automagic, incorrectly.

We may need to try parsing this cell...? And fall back to automagics?

> Can we still trim, but only test the first line, rather than all subsequent lines?

That risks missing cases where there's Python code before the magic commands.

> For example, I guess we'd now flag alias + 1 as an automagic, incorrectly.

Are you referring to alias as a variable name? Yes, that's basically a risk for all escape commands 😅

> We may need to try parsing this cell...? And fall back to automagics?

How would this help?
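The trade-off discussed in this thread can be sketched as follows. This is an illustration only: the magic list is a hypothetical three-entry stand-in, and the function names are made up for the example. It compares testing only the first line of a cell with testing every line.

```rust
/// Hypothetical, abbreviated list of automagic names used only for this illustration.
const AUTOMAGICS: &[&str] = &["alias", "pip", "pwd"];

/// Returns `true` if the first token of the (trimmed) line is an automagic name.
fn is_automagic_line(line: &str) -> bool {
    line.trim_start()
        .split_whitespace()
        .next()
        .is_some_and(|token| AUTOMAGICS.contains(&token))
}

/// Strategy A: only inspect the first line of the cell.
fn first_line_is_automagic(cell: &str) -> bool {
    cell.lines().next().is_some_and(is_automagic_line)
}

/// Strategy B: inspect every line of the cell.
fn any_line_is_automagic(cell: &str) -> bool {
    cell.lines().any(is_automagic_line)
}
```

Strategy A misses a cell such as `import os` followed by `pwd`, while strategy B catches it. Neither can tell that `alias + 1` uses `alias` as a variable, and strategy B also flags the indented `pwd` under `if True:`, since a token-based check has no view of the surrounding context.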

dhruvmanila · 2023-11-01T03:24:58Z

crates/ruff_notebook/src/notebook.rs

+                    | "who_ls"
+                    | "whos"
+                    | "xdel"
+                    | "xmode"


I agree. There are user defined magics as well but I think that's out of scope 😅

konstin · 2023-11-02T09:14:43Z

😵‍💫

charliermarsh · 2023-11-03T00:42:03Z

@dhruvmanila - Thought on this a bit more, and it seems like a reasonable approach for now. If a user does have a cell that consists of code, but starts with a standalone automagic, then the behavior of that cell will vary based on the order in which the cells are executed. So I think it's ok to assume it's an automagic. But I'm definitely open to refinement.

github-actions · 2023-11-03T01:01:42Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

charliermarsh requested a review from dhruvmanila October 31, 2023 23:25

charliermarsh force-pushed the charlie/magics branch 2 times, most recently from 414bbde to 87af9fa Compare October 31, 2023 23:29

MichaReiser added cli Related to the command-line interface and removed cli Related to the command-line interface labels Oct 31, 2023

charliermarsh added the bug Something isn't working label Oct 31, 2023

MichaReiser reviewed Oct 31, 2023

dhruvmanila approved these changes Nov 1, 2023

dhruvmanila reviewed Nov 1, 2023

Detect and ignore Jupyter automagics

a67254e

charliermarsh force-pushed the charlie/magics branch from 87af9fa to 541598d Compare November 3, 2023 00:41

charliermarsh enabled auto-merge (squash) November 3, 2023 00:41

charliermarsh force-pushed the charlie/magics branch from 541598d to 12454b7 Compare November 3, 2023 01:07

Add more tests

12454b7

charliermarsh merged commit f64c389 into main Nov 3, 2023
16 checks passed

charliermarsh deleted the charlie/magics branch November 3, 2023 01:14

miccal mentioned this pull request Nov 3, 2023

ruff 0.1.4 Homebrew/homebrew-core#153286

Merged

charliermarsh mentioned this pull request Jan 26, 2024

F821 false positive in notebook when using magic command as variables #9648

Closed
