-
Notifications
You must be signed in to change notification settings - Fork 202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Catalogue cache improvements #329
Conversation
In Puppet 3, the Puppet::Node::Facts class adds a _timestamp fact to the facts hash, causing constant catalogue cache invalidation between spec examples. This was removed in Puppet 4 via PUP-3130. By calling #dup on the facts hash, the facts in the cache key are no longer modified and the catalogue cache works as expected.
end | ||
end | ||
|
||
(16..20).each do |i| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't that be (5..20) to cover all bases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be (9..20) or even (9..20) + (1..4), as the 1..20 will have invalidated the first four, then 1..4 will have invalidated the next four. What's left should be the original 9..20 (since the first eight are now invalid), plus 1..4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right.
@@cache[args] ||= self.build_catalog_without_cache(*args) | ||
unless @@cache.has_key? args | ||
# Keep only the most recently added 16 entries to prevent high memory consumption | ||
expire_cache(@@cache, @@cache_lra, 15) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes me sad that the API here requires the caller to do math in their head.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the off-by-one error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
PS: "The two big problems in computer science: Cache Invalidation, Naming, and Off-by-one Errors."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved the line down so that it's below the new addition to the cache, meaning 16
can be passed into expire_cache
. Since it's the most recently added, there's no danger of it being immediately expired again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, two of three in one PR's not bad going!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could bikeshed the name :P
The catalogue cache, based on facts, code etc, previously kept all known catalogues in a class variable without expiration. Large specs with many permutations (e.g. multiple OSes, different class or Hiera parameters) caused many catalogues to remain in memory and the test process grew to a large size. This limits the cache to keep only the last 16 entries, which should allow for a bit of catalog reuse within a single spec file.
I like the code change, and I'll try it out tonight. Then merge it. |
Thanks for reviewing David. My own unscientific test on Travis CI for one of our modules just completed with interesting results - most runs are significantly quicker and a couple of runs that I think were being OOM killed are now passing. Before: https://travis-ci.org/theforeman/puppet-foreman/builds/85269582 |
@domcleal why on earth are you tests running for 5 hours?! |
@igalic I think that's totalling the runtime from all parts of the matrix. They're quite large test suites which use rspec-puppet-facts to run examples on each supported OS. The wall time was closer to 1-2 hours. |
This improved the local runtime of the puppetlabs-apache spec tests from 11 minutes 32 seconds to slightly over 8 minutes. Excellent catch, @domcleal , thanks! |
Fixes two issues in the catalogue cache:
Clones the facts hash to stop the
_timestamp
fact (dynamically generated by Puppet 3.x) from being part of the cache key. This was preventing the catalogue cache from having any effect on adjacent examples on 3.x. Puppet 4 no longer adds this fact.Limits the size of the cache, preventing memory exhaustion when many different sets of facts and parameters were tested.
Originally reported at #215 (comment).
I'm not sure about the choice of 16 catalogues. I was tempted to only cache a single catalogue, which would allow adjacent examples (with identical parameters) to get the benefit of caching, but I suppose this might allow for some caching in more complex specs with non-linear layouts of examples (e.g. testing parameter A, then parameter B, then parameter A again).