[WIP / experiment] Specialization of `std.foldl(std.mergePatch, arr, target)` #254
Conversation
This saves some extra allocations.
…ergePatch; add new tests
I still need to add more tests for plus-field handling; I currently lack full mutation coverage for the added code there. I also need more cases covering order preservation during field merges.
On further reflection, I think this ends up changing lazy evaluation semantics for `std.foldl(std.mergePatch, arr, target)`. But I think there's still substantial room to speed this up via other potential optimizations that I spotted while digging into this, including optimizations to speed up field name checks. I also spotted a pre-existing bug in our `mergePatch` implementation. I'll fix both of those in separate PRs.
Closing this, as I managed to gain even larger speedups via the optimizations in #258, and those don't change semantics. I may end up repurposing parts of the specialization testing bits in a future PR.
Note
I have marked this PR as a draft because I want to do one more round of self-review on the added test cases and test coverage before merging, plus possibly clean up some of the comments. I'm opening the PR now for feedback.
Overview
This PR adds a specialization for `std.foldl(std.mergePatch, patches, target)`, improving performance when applying large numbers of patches to an object.

Motivation
This pattern is sometimes used for flattening or reshaping large lists of configurations.
As a simplified toy example, consider folding a handful of small patches onto a base object.
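A minimal sketch of the pattern, with made-up configuration fields standing in for the real ones:

```jsonnet
// Hypothetical base object and list of patches (illustrative names only).
local target = { name: 'service', replicas: 1 };
local patches = [
  { replicas: 3 },
  { labels: { team: 'infra' } },
  { labels: { tier: 'backend' } },
];

// Folding the patches onto the target merges them left to right, yielding
//   { labels: { team: 'infra', tier: 'backend' }, name: 'service', replicas: 3 }
std.foldl(std.mergePatch, patches, target)
```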
In some cases we use this pattern to flatten very large lists of objects. In profiling, I noticed significant time spent in `mergePatch` and began looking for optimizations.

This PR's optimized implementation
This PR adds a specialized `FoldlMergePatch` function to optimize this pattern. This function is not directly invokable by end users; instead, it's automatically injected using the specialization framework from #119 / 0bd255a.

In pseudocode, the optimized implementation does the following:
- Non-`Obj` patches overwrite the target, hence special handling here.
- If a patch sets a field to `Null` then it removes the field, so we can ignore the target and all earlier patch values when we see a null.
- `+:` field modifiers require special handling (this applies to `+:` fields present anywhere in the object or its children).
- Hidden field handling must match `mergePatch`; see the tests added in "Fix a bug in hidden field handling in std.mergePatch" (#250) for details.
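These merge-patch behaviors are observable from plain Jsonnet; a few self-contained examples of my own (not taken from the PR):

```jsonnet
{
  // A null patch value removes the field, regardless of earlier values:
  nullRemoves: std.mergePatch({ a: 1, b: 2 }, { b: null }),            // { a: 1 }

  // A non-object patch simply replaces the target:
  nonObjectWins: std.mergePatch({ a: 1 }, 'replacement'),              // "replacement"

  // Object patches merge recursively, field by field:
  recursiveMerge: std.mergePatch({ o: { x: 1 } }, { o: { y: 2 } }),    // { o: { x: 1, y: 2 } }
}
```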
This optimized implementation has the same O(n²) worst-case asymptotic complexity as the unoptimized `std.foldl(std.mergePatch, ..., ...)` approach, but can be significantly faster in practice because it produces much less garbage: it avoids building and discarding intermediate merge results and therefore avoids having to repeatedly recompute the intermediate merge target's visible fields.

I also play several tricks to reduce object allocations, including:
- Directly constructing `value0` in the resulting `Val.Obj`.
- Reusing a single `Array[Val]` for holding patch values when computing the set of values that will participate in a sub-merge at a given ancestor path: I pass around a `size` field to denote the "valid" size of this array, avoiding copies for trimming it to size.
- A `cleanObject` helper, which is a perhaps-overly-complex optimized equivalent of the regular mergePatch's `recSingle` helper function: this is used for recursively removing non-visible or explicitly-null keys from `patch` objects when they don't merge with the target, but it also has the side effect of dropping `+:` modifiers. My implementation optimizes for the common scenario where patches are purely additive by avoiding new object allocation when the cleaning would be a no-op.
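At the language level, this cleaning corresponds to what happens when a patch object does not merge with an object target; a small example of my own, assuming the upstream `std.mergePatch` semantics:

```jsonnet
// The target is not an object, so the cleaned patch becomes the result:
// explicit nulls are stripped recursively and the hidden field does not survive.
std.mergePatch(null, { a: 1, b: null, hidden:: 'x', nested: { keep: true, drop: null } })
// yields { a: 1, nested: { keep: true } }
```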
Correctness testing

I added a new dedicated test suite for this optimization.
I noticed that our existing unit tests don't meaningfully exercise specialization because the StaticOptimizer usually ends up constant/static-folding the test cases. To address this, I added a new internal `disableStaticApplyForBuiltinFunctions` setting which we can set in specific tests. I also added an internal `disableBuiltinSpecialization` setting for disabling specialization.

Combining these together, I added a test helper which compares specialized and non-specialized execution and asserts that the answers are equal with field ordering preserved.
Warning
As we saw in #250, merge patch's implicit behaviors can be subtle and I'm still not 100% sure that I've faithfully covered all cases, which is why I've kept this marked as a draft to give me a chance to revisit the tests with fresh eyes. In particular, I'm not 100% certain that I've handled standard vs. `unhide` visibility correctly. For some reason, not yet clear to me, sjsonnet's mergePatch creates fields with `Unhide` visibility. This might constrain my ability to do the `cleanObject` optimizations, but it might also be a behavior difference w.r.t. the official jsonnet implementation and thus perhaps something we could change to more faithfully match their behavior.
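For context, assuming the upstream `std.mergePatch` semantics that the #250 tests pin down, hidden fields do not participate in the merge at all; the open question above is only about which visibility the resulting fields carry internally. A small example of my own showing the externally visible behavior:

```jsonnet
// The hidden target field is ignored by the merge and absent from the output.
std.mergePatch({ visible: 1, hidden:: 2 }, { patched: 3 })
// yields { patched: 3, visible: 1 }
```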
Performance testing

🚧 I'll attach some of my microbenchmarks later.
In end-to-end tests on a very complex real-world input, this PR's patch cut overall wallclock time by ~20%.