Prefer newer JSON cache files to older #524
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The 'run map' JSON cache is used to speed up opening a run when possible, by listing the sources & trains to be found in each file. There are a couple of possible locations for this: it can be inside the run folder, where only privileged processes can write, or in a hidden world-writable folder under proposal scratch. In master, a cache in the run folder is preferred both for reading & writing.
In some cases, the calibration pipeline creates a partial cache file in the run folder, by opening a subset of files with the
include=
parameter. When someone later opens the whole run folder, a complete cache file is written in scratch, but then this is never used, because the one in the run folder takes priority.This change looks for the newest available run map file when reading.
Testing: I was pointed to p5733 r120 as an example. Using the script below, I get a minimum of about 0.4 s to open the run on master, and around 0.1 s on this branch.
Using
strace
, I can also see that it opens 320.h5
files on master, and 0 on this branch, because the information is all cached.