Write notebook.json file atomically #3305
Merged
The motivation for this was a bug in the jupyter-notebook-gist extension whereby a `LazyConfigValue` was being added to the JSON config tree in some cases. That's definitely a bug on the part of the extension that needs to be fixed over there.

However, this caused the `notebook.json` writer to crash mid-write, leaving a corrupted `notebook.json` on disk. So on subsequent startups of the Jupyter notebook, parsing fails unless the user manually fixes the JSON file. Potentially there could have been data loss in the config file if there were keys after `oauth_client_id` that failed to write.

I would argue that notebook shouldn't write the corrupted JSON to disk, but should leave the file in its last state if this happens. Manual testing shows that if a rogue extension runs, subsequent extensions are still run and can successfully add their own values to the JSON. In other words, only the data in a single call to `set` or `update` is thrown out by this change.

I considered using the "write to a tempfile and replace" method for this. That's complicated to get correct in a cross-platform way (it requires `os.replace` on Windows, which is only in Python 3.3 and later, so would require a backfill for Python 2.7 etc.). If streaming directly to disk is important, I'm happy to redo this that way, but I thought for config files that are generally quite small, doing this in-memory should be fine.
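For illustration, the in-memory approach could look roughly like the sketch below. This is not the code from this PR: `write_json_in_memory`, `Unserializable`, and `demo` are hypothetical names, and the sketch assumes Python 3. The idea is to serialize the whole config to a string first, so that a value `json` cannot encode raises before the file is ever opened for writing.

```python
import json
import os
import tempfile


class Unserializable:
    """Hypothetical stand-in for a value json cannot encode."""


def write_json_in_memory(path, data):
    """Serialize fully in memory, then write (hypothetical helper).

    json.dumps raises TypeError on an unencodable value, and it does so
    before the file is opened, so the existing file is left as it was.
    """
    text = json.dumps(data, indent=2, sort_keys=True)
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def demo():
    """Return the file contents after a good write and after a failed one."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "notebook.json")
        write_json_in_memory(path, {"a": 1})
        with open(path, encoding="utf-8") as f:
            before = f.read()
        try:
            # The bad value makes serialization fail before any file I/O.
            write_json_in_memory(path, {"a": 1, "bad": Unserializable()})
        except TypeError:
            pass
        with open(path, encoding="utf-8") as f:
            after = f.read()
        return before, after
```

Note that this only protects against serialization failures. A crash during the final `f.write` could still leave a partial file, which is what the tempfile-and-replace method would address.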