
[WIP / experiment] Specialization of std.foldl(std.mergePatch, arr, target) #254

Closed

Conversation

JoshRosen
Contributor

Note

I have marked this PR as a draft because I want to do one more round of self-review on the added test cases and test coverage before merging, plus possibly clean up some of the comments. I'm opening the PR now for feedback.

Overview

This PR adds a specialization for std.foldl(std.mergePatch, patches, target), improving performance when applying large numbers of patches to an object.

Motivation

This pattern is sometimes used for flattening or reshaping large lists of configurations.

As a simplified toy example:

local inputData = [
  { cloud: 'aws', region: 'us-west-1', service: 'webapp', confs: { servers: 1 } },
  { cloud: 'aws', region: 'us-west-1', service: 'auth', confs: { servers: 2 } },
  { cloud: 'azure', region: 'us-west-2', service: 'auth', confs: { servers: 2 } },
];

std.foldl(
  std.mergePatch,
  [
    { [x.cloud]: { [x.region]: { [x.service]: x.confs } } }
    for x in inputData
  ],
  {}
)

yields

{
   "aws": {
      "us-west-1": {
         "auth": {
            "servers": 2
         },
         "webapp": {
            "servers": 1
         }
      }
   },
   "azure": {
      "us-west-2": {
         "auth": {
            "servers": 2
         }
      }
   }
}

In some cases we use this pattern to flatten very large lists of objects. In profiling, I noticed significant time spent in mergePatch and began looking for optimizations.
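For readers unfamiliar with the merge-patch semantics being folded here, the pattern can be sketched in Python over plain dicts (a simplified model of RFC 7386 JSON Merge Patch; it ignores Jsonnet-specific visibility and `+:` semantics, and `merge_patch` is an illustrative stand-in for `std.mergePatch`):

```python
import functools

def merge_patch(target, patch):
    # Non-object patches replace the target entirely (RFC 7386 semantics).
    if not isinstance(patch, dict):
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # an explicit null deletes the field
        else:
            result[key] = merge_patch(result.get(key, {}), value)
    return result

# The toy example from above, expressed as a left fold over patches:
patches = [
    {"aws": {"us-west-1": {"webapp": {"servers": 1}}}},
    {"aws": {"us-west-1": {"auth": {"servers": 2}}}},
    {"azure": {"us-west-2": {"auth": {"servers": 2}}}},
]
merged = functools.reduce(merge_patch, patches, {})
```

Each step of the fold allocates a fresh intermediate result, which is exactly the garbage the specialization below avoids.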

This PR's optimized implementation

This PR adds a specialized FoldlMergePatch function to optimize this pattern. This function is not directly invokable by end users; instead, it's automatically injected using the specialization framework from #119 / 0bd255a.

In pseudocode, the optimized implementation does the following:

  • Check whether there are patches to apply. If not, return the target.
  • Determine the set of patches that might affect the output.
    • Non-object patches overwrite the target, hence the special handling here.
  • If the object was overwritten by a non-object, return the non-object.
  • Otherwise, execute a recursive function over the object:
    • Determine an upper bound on the set of potential output fields by unioning the visible fields of the target and the patches.
    • For each possibly-output field:
      • Check whether the target has a value for the field.
      • Iterate over the patches, collecting the field values that might participate in the output:
        • An explicit null removes the field, so when we see one we can ignore the target and all earlier patch values.
      • At this point, either:
        • (a) no patches merged with the field, so return the target's value (if any);
        • (b) the last effective patch removed the field, so drop it from the output;
        • (c) the last effective patch set a non-object, so we can simply store that in the field;
        • (d) an earlier patch removed the field and only one patch adds to it, so we can simply use the patch's object as the field value (after first cleaning it to remove any hidden or +: fields present anywhere in the object or its children);
        • (e) at least two objects need to be merged, so recursively merge them.
          • This step continues to distinguish between target and patch objects in the sub-merges. This is necessary because hidden nested fields of the target that don't merge with patches are preserved in the output, while hidden fields in (or targeted by) patches are dropped. See the mergePatch tests added in Fix a bug in hidden field handling in std.mergePatch #250 for details.
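The steps above can be sketched in Python over plain dicts. This is a simplified model: the names (`foldl_merge_patch`, `_merge`, `_clean`) are illustrative, and Jsonnet field visibility and `+:` semantics are out of scope here.

```python
_MISSING = object()  # sentinel meaning "field absent"

def foldl_merge_patch(patches, target):
    # Only the suffix after the last non-object patch matters, because a
    # non-object patch overwrites everything merged so far.
    cut = next((i for i in range(len(patches) - 1, -1, -1)
                if not isinstance(patches[i], dict)), None)
    if cut is not None:
        target, patches = patches[cut], patches[cut + 1:]
    if not patches:
        return target
    return _merge(target if isinstance(target, dict) else {}, patches)

def _merge(target, patches):
    # Upper bound on output fields: union of target keys and patch keys.
    keys = dict.fromkeys(target)
    for p in patches:
        keys.update(dict.fromkeys(p))
    out = {}
    for k in keys:
        base = target.get(k, _MISSING)
        vals = []                          # patch values still "in play" for k
        for p in patches:
            if k not in p:
                continue
            v = p[k]
            if v is None:                  # explicit null deletes the field;
                base, vals = _MISSING, []  # earlier values become irrelevant
            elif not isinstance(v, dict):  # non-object overwrites the field
                base, vals = v, []
            else:
                vals.append(v)
        if not vals:
            if base is not _MISSING:       # (a)/(c): target value or last scalar
                out[k] = base
            # else (b): the field was removed; drop it from the output
        elif base is _MISSING and len(vals) == 1:
            out[k] = _clean(vals[0])       # (d): single surviving patch object
        else:
            sub = base if isinstance(base, dict) else {}
            out[k] = _merge(sub, vals)     # (e): recursive sub-merge
    return out

def _clean(value):
    # Sketch of cleanObject: recursively drop explicit nulls. (The real
    # implementation also strips hidden fields and +: modifiers, which
    # plain dicts cannot model.)
    if not isinstance(value, dict):
        return value
    return {k: _clean(v) for k, v in value.items() if v is not None}
```

Note how a single pass over each field's patch values replaces the chain of intermediate objects that the naive fold would allocate.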

This optimized implementation has the same O(n²) worst-case asymptotic complexity as the unoptimized std.foldl(std.mergePatch, ..., ...) approach, but can be significantly faster in practice because it produces much less garbage: it avoids building and discarding intermediate merge results and therefore avoids having to repeatedly recompute the intermediate merge target's visible fields.

I also use several tricks to reduce object allocations, including:

  • Reusing the same LinkedHashMap instance for both computing the set of visible keys and for the final value0 in the resulting Val.Obj.
  • Reusing the same Array[Val] for holding patch values when computing the set of values that will participate in a sub-merge at a given ancestor path: I pass around a size field to denote the "valid" size of this array, avoiding copies for trimming it to size.
  • An optimized cleanObject, which is a perhaps-overly-complex equivalent of the regular mergePatch's recSingle helper function: this is used for recursively removing non-visible or explicitly-null keys from patch objects when they don't merge with the target, but it also has the side effect of dropping +: modifiers. My implementation optimizes for the common scenario where patches are purely additive by avoiding new object allocation when the cleaning would be a no-op.
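The no-op-avoiding cleaning trick can be sketched in Python (a hypothetical `clean_object` over plain dicts, modeling only explicit-null removal; hidden fields and `+:` modifiers have no dict analogue):

```python
def clean_object(obj):
    # Return the input object itself when nothing needs removal, so that
    # purely additive patches allocate no new objects at all.
    if not isinstance(obj, dict):
        return obj
    cleaned = None                                 # copy, allocated lazily
    for i, (k, v) in enumerate(obj.items()):
        cv = clean_object(v)
        if cleaned is None:
            if v is None or cv is not v:
                # First field that changes: copy the untouched prefix.
                cleaned = dict(list(obj.items())[:i])
            else:
                continue                           # no change so far, no copy
        if v is not None:
            cleaned[k] = cv
    return obj if cleaned is None else cleaned
```

Identity (`is`) comparisons let callers detect the no-op case and reuse the original object, which is the allocation saving described above.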

Correctness testing

I added a new dedicated test suite for this optimization.

I noticed that our existing unit tests don't meaningfully exercise specialization because the StaticOptimizer usually ends up constant/static-folding the test cases. To address this, I added a new internal disableStaticApplyForBuiltinFunctions setting which we can set in specific tests.

I also added an internal disableBuiltinSpecialization setting for disabling specialization.

Combining these, I added a test helper which compares specialized and non-specialized execution and asserts that the answers are equal, with field ordering preserved.
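Such a differential helper might look like the following Python sketch (`check_specialization` and the inline `merge_patch` reference are hypothetical stand-ins for the actual Scala test code):

```python
import functools
import json

def merge_patch(target, patch):
    # Reference RFC 7386 semantics used as the unspecialized baseline.
    if not isinstance(patch, dict):
        return patch
    out = dict(target) if isinstance(target, dict) else {}
    for k, v in patch.items():
        if v is None:
            out.pop(k, None)
        else:
            out[k] = merge_patch(out.get(k, {}), v)
    return out

def check_specialization(specialized, patches, target):
    # Run both the specialized implementation and the plain left fold,
    # then require identical results. Serializing via json.dumps keeps
    # insertion order, so this also checks field ordering.
    expected = functools.reduce(merge_patch, patches, target)
    actual = specialized(patches, target)
    assert json.dumps(actual) == json.dumps(expected), (actual, expected)
    return actual
```

The same idea applies in Scala: run with and without the specialization flags and compare serialized outputs.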

Warning

As we saw in #250, merge patch's implicit behaviors can be subtle, and I'm still not 100% sure that I've faithfully covered all cases, which is why I've kept this marked as a draft to give myself a chance to revisit the tests with fresh eyes. In particular, I'm not 100% certain that I've handled standard vs. unhide visibility correctly. For some reason that is not yet clear to me, sjsonnet's mergePatch creates fields with Unhide visibility. This might constrain my ability to do the cleanObject optimizations, but it might also be a behavior difference w.r.t. the official jsonnet implementation, and thus perhaps something we could change to more faithfully match their behavior.

Performance testing

🚧 I'll attach some of my microbenchmarks later.

In end-to-end tests on a very complex real-world input, this PR's patch cut ~20% of overall wallclock time.

@JoshRosen
Contributor Author

On further reflection, I think this ends up changing lazy evaluation semantics for mergePatch, something that's not covered in the test cases.

But I think there's still substantial room to speed this up via other potential optimizations that I spotted while digging into this, including optimizations to speed up field name checks.

I also spotted a pre-existing bug in our std.mergePatch's default field visibility.

I'll fix both of those in separate PRs.

@JoshRosen JoshRosen changed the title Specialization of std.foldl(std.mergePatch, arr, target) [WIP / experiment] Specialization of std.foldl(std.mergePatch, arr, target) Jan 3, 2025
@JoshRosen
Contributor Author

Closing this, as I managed to gain even larger speedups via the optimizations in #258 and those don't change semantics.

I may end up repurposing parts of the specialization testing bits in a future PR.

@JoshRosen JoshRosen closed this Jan 4, 2025