Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to handle magic commands in Jupyter notebook #5030

Closed
Tracked by #5188
dhruvmanila opened this issue Jun 12, 2023 · 4 comments · Fixed by #6358
Closed
Tracked by #5188

Add ability to handle magic commands in Jupyter notebook #5030

dhruvmanila opened this issue Jun 12, 2023 · 4 comments · Fixed by #6358
Assignees
Labels
core Related to core functionality

Comments

@dhruvmanila
Copy link
Member

Currently, if any cell contains any magic commands it'll be ignored even if the cell contains valid Python code. This might lead to false positive as some code is intentionally ignored for linting (undefined symbol).

At a high level overview what black does is the following:

  1. Using IPython's transformer, convert the magic commands into the respective function calls (% ... will be converted to get_ipython().run_cell_magic)
  2. Replace cell magics with tokens (string values which is a valid Python code) and keep track of such replacements
  3. Replace line magics in the same way as mentioned in (2)
  4. Returned the transformed code along with the replacements
  5. At the end, use the replacements to get back the original content

Reference implementation in black: https://github.com/psf/black/blob/main/src/black/handle_ipynb_magics.py

@dhruvmanila dhruvmanila added the core Related to core functionality label Jun 12, 2023
@dhruvmanila
Copy link
Member Author

dhruvmanila commented Jun 14, 2023

Jupytext has a simpler implementation although lots of regex: https://github.com/mwouts/jupytext/blob/main/jupytext/magics.py

@henryiii
Copy link
Contributor

There might be an easier but dumber way to do with without invoking IPython, which I would assume would be slow. You could add a configurable list of "ignored" magics. The default could contain the built-in magics that don't have much affect: %%time, %%timeit, etc. But a user could override the list via configuration. Detecting them I'd assume is already mostly done, since it can ignore them?

This wouldn't cover magics that modify stuff or change the language, but would cover a lot of common cases & is the situation that is easiest to report violations and autofix.

@dhruvmanila
Copy link
Member Author

Hey, thanks for the suggestions. Yes, invoking IPython would be slow and we're not going that route. Instead, we'll use a new parser mode in which the parser will parse the magic commands at possible locations. New tokens and AST nodes are introduced which will be available only in that mode. Although the node isn't finalized, you can refer to astral-sh/RustPython-Parser#31.

@dhruvmanila
Copy link
Member Author

The solution which we've implemented is to integrate it in the parser directly. The following changes were made to the parser to accommodate the requirement:

References

Footnotes

  1. The allowed magic token positions are determined directly from the IPython implementation of the same.

dhruvmanila added a commit that referenced this issue Aug 9, 2023
## Summary

This PR adds support for a stricter version of help end escape
commands[^1] in the parser. By stricter, I mean that the escape tokens
are only at the end of the command and there are no tokens at the start.
This makes it difficult to implement it in the lexer without having to
do a lot of look aheads or keeping track of previous tokens.

Now, as we're adding this in the parser, the lexer needs to recognize
and emit a new token for `?`. So, `Question` token is added which will
be recognized only in `Jupyter` mode.

The conditions applied are the same as the ones in the original
implementation in IPython codebase (which is a regex):
* There can only be either 1 or 2 question mark(s) at the end
* The node before the question mark can be a `Name`, `Attribute`,
`Subscript` (only with integer constants in slice position), or any
combination of the 3 nodes.

## Test Plan

Added test cases for various combination of the possible nodes in the
command value position and update the snapshots.

fixes: #6359
fixes: #5030 (This is the final piece)

[^1]: #6272 (comment)
durumu pushed a commit to durumu/ruff that referenced this issue Aug 12, 2023
## Summary

This PR adds support for a stricter version of help end escape
commands[^1] in the parser. By stricter, I mean that the escape tokens
are only at the end of the command and there are no tokens at the start.
This makes it difficult to implement it in the lexer without having to
do a lot of look aheads or keeping track of previous tokens.

Now, as we're adding this in the parser, the lexer needs to recognize
and emit a new token for `?`. So, `Question` token is added which will
be recognized only in `Jupyter` mode.

The conditions applied are the same as the ones in the original
implementation in IPython codebase (which is a regex):
* There can only be either 1 or 2 question mark(s) at the end
* The node before the question mark can be a `Name`, `Attribute`,
`Subscript` (only with integer constants in slice position), or any
combination of the 3 nodes.

## Test Plan

Added test cases for various combination of the possible nodes in the
command value position and update the snapshots.

fixes: astral-sh#6359
fixes: astral-sh#5030 (This is the final piece)

[^1]: astral-sh#6272 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Related to core functionality
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants