Ruff inserts "id": null to notebook cells and breaks GitHub notebook viewer
#6834
Can we skip serializing `id` when it is `None`?
diff --git a/crates/ruff/src/jupyter/schema.rs b/crates/ruff/src/jupyter/schema.rs
index b6f9ed3c4..17b03e474 100644
--- a/crates/ruff/src/jupyter/schema.rs
+++ b/crates/ruff/src/jupyter/schema.rs
@@ -120,6 +120,7 @@ pub struct RawCell {
/// Technically, id isn't required (it's not even present) in schema v4.0 through v4.4, but
/// it's required in v4.5. Main issue is that pycharm creates notebooks without an id
/// <https://youtrack.jetbrains.com/issue/PY-59438/Jupyter-notebooks-created-with-PyCharm-are-missing-the-id-field-in-cells-in-the-.ipynb-json>
+ #[serde(skip_serializing_if = "Option::is_none")]
pub id: Option<String>,
/// Cell-level metadata.
pub metadata: Value,
@@ -135,6 +136,7 @@ pub struct MarkdownCell {
/// Technically, id isn't required (it's not even present) in schema v4.0 through v4.4, but
/// it's required in v4.5. Main issue is that pycharm creates notebooks without an id
/// <https://youtrack.jetbrains.com/issue/PY-59438/Jupyter-notebooks-created-with-PyCharm-are-missing-the-id-field-in-cells-in-the-.ipynb-json>
+ #[serde(skip_serializing_if = "Option::is_none")]
pub id: Option<String>,
/// Cell-level metadata.
pub metadata: Value,
@@ -150,6 +152,7 @@ pub struct CodeCell {
/// Technically, id isn't required (it's not even present) in schema v4.0 through v4.4, but
/// it's required in v4.5. Main issue is that pycharm creates notebooks without an id
/// <https://youtrack.jetbrains.com/issue/PY-59438/Jupyter-notebooks-created-with-PyCharm-are-missing-the-id-field-in-cells-in-the-.ipynb-json>
+ #[serde(skip_serializing_if = "Option::is_none")]
pub id: Option<String>,
/// Cell-level metadata.
pub metadata: Value,
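The `#[serde(skip_serializing_if = "Option::is_none")]` attribute in the diff above tells serde to leave the field out of the output when the value is `None`, instead of writing `"id": null`. As a rough, std-only illustration of that behavior (the `Cell` struct and `cell_to_json` function are hypothetical stand-ins, not Ruff's types, and the string escaping is only approximate):

```rust
/// Hypothetical, simplified stand-in for a notebook cell (not Ruff's `RawCell`).
struct Cell {
    id: Option<String>,
    source: String,
}

/// Serialize a cell to a JSON-like string by hand, omitting the `id` key
/// entirely when it is `None` rather than emitting `"id": null`.
/// Assumption: `{:?}` is used as a rough substitute for JSON string escaping,
/// which is only adequate for simple ASCII content.
fn cell_to_json(cell: &Cell) -> String {
    let mut fields = Vec::new();
    if let Some(id) = &cell.id {
        fields.push(format!("\"id\": {:?}", id));
    }
    fields.push(format!("\"source\": {:?}", cell.source));
    format!("{{{}}}", fields.join(", "))
}
```

A cell without an id then round-trips without gaining a new key, which is what a pre-4.5 notebook needs.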
Found this issue while I was working on mlflow/mlflow#9445
@konstin - Do you know?
Can I fix this (if this is really a bug)?
Sure. I would be interested in @konstin's input because he has experience roundtripping JSON.
That's an interesting example because according to the spec (html, RFC, schema),
The notebook still works because tools want to be backwards compatible with the <4.5 format and generally don't validate that their input is valid with respect to the specified version. I believe we shouldn't emit invalid notebooks and should instead generate UUIDs (or apply one of the other suggested strategies) if we see a notebook version ≥4.5. We should still use
This case i think would be only applicable if we want to increase the nbformat version on write (i think we don't want to do that):
@konstin Thanks for the comment. This notebook was created 4 years ago. Maybe that's the reason why cell ids are missing.
Can ruff just fix code, and not touch cell ids? |
It's definitely really easy to miss as a tool author! (i mean, we also missed that)
Not really, i'm afraid. If ruff were to write a notebook with version 4.5 and no cell ids, ruff would produce an invalid notebook with respect to the schema, which we shouldn't do. Downgrading the version on writing is also not an option because the notebook might use other 4.5+ features. The JEP is pretty clear that you just create some ids (with a preference towards UUIDs), so i'd do that.
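The rule described in this comment (leave older notebooks alone, fill in missing ids for 4.5 and later) could be sketched roughly as follows. This is only an illustration under stated assumptions: `Notebook`, `Cell`, `requires_cell_ids`, and `fill_missing_ids` are hypothetical names rather than Ruff's actual types, and a counter-based placeholder id is used to stay within the standard library, where a real implementation would more likely generate UUIDs as the JEP prefers.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified notebook model (not Ruff's actual types).
struct Cell {
    id: Option<String>,
}

struct Notebook {
    nbformat: u32,
    nbformat_minor: u32,
    cells: Vec<Cell>,
}

/// Cell ids are required by the schema from nbformat 4.5 onwards.
fn requires_cell_ids(nb: &Notebook) -> bool {
    // Tuples compare lexicographically: major version first, then minor.
    (nb.nbformat, nb.nbformat_minor) >= (4, 5)
}

/// Give every id-less cell a unique id, but only for notebooks >= 4.5.
/// Older notebooks are left untouched so no `id` key is ever introduced.
/// Assumption: "cell-N" placeholders stand in for UUIDs.
fn fill_missing_ids(nb: &mut Notebook) {
    if !requires_cell_ids(nb) {
        return;
    }
    // Collect ids already in use so generated ones never collide with them.
    let mut taken: HashSet<String> =
        nb.cells.iter().filter_map(|c| c.id.clone()).collect();
    let mut counter = 0usize;
    for cell in nb.cells.iter_mut().filter(|c| c.id.is_none()) {
        let id = loop {
            let candidate = format!("cell-{}", counter);
            counter += 1;
            if !taken.contains(&candidate) {
                break candidate;
            }
        };
        taken.insert(id.clone());
        cell.id = Some(id);
    }
}
```

Under this sketch, a 4.2 notebook like the one in this issue keeps its cells without ids, so skipping the `None` value on serialization is enough for it.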
The notebook was created with nbformat 4.2 in Python 2:
Sorry, i forgot to check the actual notebook and only looked at the
Filed #6851
## Summary
Fix #6834
## Test Plan
Need tests?
Co-authored-by: Dhruv Manilawala
**Summary** See #6834 (comment) **Test Plan** Added a new notebook
Summary
https://github.com/harupy/mlflow/blob/3f1650db853d2c61ac0cdb0035be91b10f131859/examples/h2o/random_forest.ipynb
harupy/mlflow@3f1650d: a commit that replicates what ruff does:
Without "id": null, the notebook renders fine: https://github.com/harupy/mlflow/blob/dab36dc4fd6ac6e442bb78c016e7813ced8c0a21/examples/h2o/random_forest.ipynb
How to reproduce