Add ability to capture validation errors #249
Merged
This pull request stems from an issue discovered during Jupyter Server performance analysis, which showed that the server validates notebooks twice, on both read and write operations. The crux of the issue is that neither method gives the caller any information about the validation status of its content, so the caller must explicitly validate a second time. This appears to have been the case for the last 7 to 8 years (since the code resided in notebook). I suspect this is based on decisions made at the time, but this pull request attempts to rectify it in a (hopefully) acceptable manner. Since eliminating the redundant validation can improve performance by nearly 50%, this seems like an opportunity we should not ignore.
Because so many years have passed, I believe there are at least two backward-compatibility concerns that led to this particular resolution.
validate() method, to capture the exception and produce an application-friendly error message when validation issues are encountered.
This change adds an optional dictionary-valued parameter (capture_validation_error) that applications can pass to capture the ValidationError instance for use by the calling application. If the parameter is non-None and a dictionary instance, a validation error will be added into the dictionary under the key 'ValidationError' while the corresponding value will contain the ValidationError instance. This would allow applications that make an additional call to validate() to remove the second call since they have both their content (on reads) and the ValidationError instance (when validation issues are present) they can use to make decisions.
Alternative approaches
ValidationError when it occurs, which also allows the application to produce a friendly message, but will prevent the return of content (on reads) and persistence of content (on writes) - that is assumed behavior today.
There may be other solutions, but I think we should do something as this is the kind of low-hanging fruit that performance-sensitive folks live for. 😄
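To make the proposed capture_validation_error behavior concrete, here is a minimal, self-contained sketch of the pattern. It is not the code from this pull request: the ValidationError class, the toy validate() rule, and the reads() function below are hypothetical stand-ins, and only the parameter name and the 'ValidationError' dictionary key are taken from the description above.

```python
import json


class ValidationError(Exception):
    """Hypothetical stand-in for the real validation exception."""


def validate(content):
    """Toy rule (an assumption): content needs an integer 'nbformat' field."""
    if not isinstance(content.get("nbformat"), int):
        raise ValidationError("missing or non-integer 'nbformat' field")


def reads(text, capture_validation_error=None):
    """Parse and validate once, always returning the parsed content.

    If the caller passes a dict, a validation error is stored in it under
    the key 'ValidationError'. What happens to the error when no dict is
    passed (for example, logging) is left out of this sketch.
    """
    content = json.loads(text)
    try:
        validate(content)
    except ValidationError as err:
        if isinstance(capture_validation_error, dict):
            capture_validation_error["ValidationError"] = err
    return content


# Caller side: one read, no second validate() call.
captured = {}
content = reads('{"cells": []}', capture_validation_error=captured)
if "ValidationError" in captured:
    message = f"Notebook is invalid: {captured['ValidationError']}"
else:
    message = "Notebook is valid"
print(message)
```

The caller still receives the content even when validation fails, which matches the behavior the description says applications assume today, and it can inspect the dictionary instead of validating a second time.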
cc: @goanpeca, @mlucool