Generate deterministic ids when formatting notebooks #9359
crates/ruff_notebook/src/notebook.rs
if raw_notebook.nbformat == 4 && raw_notebook.nbformat_minor >= 5 {
    // We use a mock random number generator to generate deterministic uuids
    let mut rng = rand::rngs::mock::StepRng::new(0, 1);
Could we use a seeded number generator rather than the StepRng one, so that the IDs are deterministic but look random rather than structured as they do now? Or was this the only option for deterministic UUIDs?
I'm sure there's another option for a seeded number generator; this one was just the most obvious way I saw.
Sure, we can use the StdRng seeded with 0:
7fb27b94-1602-401d-9154-2211134fc71a
acae54e3-7e7d-407b-bb7b-55eff062a284
9a63283c-baf0-4dbc-ab1f-6479b197f3a8
8dd0d809-2fe7-4a7c-9628-1538738b07e2
72eea511-9410-473a-a328-ad9291626812
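As a rough illustration of why seeding gives ids that are deterministic but still look random, here is a minimal std-only sketch. It is not the code in this PR: `SplitMix64` and `next_id` are hypothetical stand-ins for `rand`'s `StdRng` and the `uuid` crate, so the ids it prints will differ from the ones listed above.

```rust
/// Hypothetical stand-in for a seeded generator (SplitMix64).
/// The PR itself uses `StdRng` from the `rand` crate; this is only a sketch.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Draw 16 pseudo-random bytes and format them as a UUID-v4-shaped string.
fn next_id(rng: &mut SplitMix64) -> String {
    let mut bytes = [0u8; 16];
    bytes[..8].copy_from_slice(&rng.next_u64().to_be_bytes());
    bytes[8..].copy_from_slice(&rng.next_u64().to_be_bytes());
    bytes[6] = (bytes[6] & 0x0f) | 0x40; // set the version nibble to 4
    bytes[8] = (bytes[8] & 0x3f) | 0x80; // set the RFC 4122 variant bits
    let hex: String = bytes.iter().map(|b| format!("{b:02x}")).collect();
    format!(
        "{}-{}-{}-{}-{}",
        &hex[0..8],
        &hex[8..12],
        &hex[12..16],
        &hex[16..20],
        &hex[20..32]
    )
}

fn main() {
    // The same seed always yields the same sequence of ids.
    let mut rng = SplitMix64(0);
    for _ in 0..5 {
        println!("{}", next_id(&mut rng));
    }
}
```

Because the generator state is fully determined by the seed, formatting the same notebook twice produces the same ids, which is the property the formatter needs.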
I'd go with natural numbers for simplicity but it doesn't matter much since the users shouldn't see that id anyway.
When formatting notebooks, we populate the id field for cells that do not have one. Previously, we generated a UUID v4, which resulted in non-deterministic formatting. Here, we generate the UUID from a seeded random number generator instead of using true randomness. For example, here are the first five ids it would generate:

We also add a check that an id is not present in another cell to prevent accidental introduction of duplicate ids.
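The duplicate check could look roughly like the sketch below. This is an assumption about the shape of the logic, not the code in this PR: `Cell` and `assign_missing_ids` are hypothetical names, and the generator is passed in as a closure so that either a seeded UUID generator or a plain counter can be plugged in.

```rust
use std::collections::HashSet;

/// Hypothetical cell model; only the optional `id` matters for this sketch.
struct Cell {
    id: Option<String>,
}

/// Fill in missing ids, skipping any candidate already used by another cell.
fn assign_missing_ids(cells: &mut [Cell], mut generate: impl FnMut() -> String) {
    // Collect the ids that are already present in the notebook.
    let mut seen: HashSet<String> = cells.iter().filter_map(|c| c.id.clone()).collect();

    for cell in cells.iter_mut() {
        if cell.id.is_none() {
            // Keep drawing until the candidate is not taken; `insert` returns
            // false when the id is already in the set.
            let id = loop {
                let candidate = generate();
                if seen.insert(candidate.clone()) {
                    break candidate;
                }
            };
            cell.id = Some(id);
        }
    }
}

fn main() {
    let mut cells = vec![
        Cell { id: Some("1".to_string()) },
        Cell { id: None },
        Cell { id: None },
    ];

    // A plain counter as the generator, purely for illustration.
    let mut counter = 0u32;
    assign_missing_ids(&mut cells, || {
        let id = counter.to_string();
        counter += 1;
        id
    });

    for cell in &cells {
        println!("{:?}", cell.id);
    }
}
```

With the counter generator, the second cell receives "0" and the third receives "2", because "1" is already taken by the first cell.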
The specification is lax, and we could just use incrementing integers, e.g. 0, 1, ..., but I have a minor preference for retaining the UUID format. Some discussion here. I'm happy to go either way though.

Discovered via #9293