-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Track Snapshot Version in RepositoryData #50930
Track Snapshot Version in RepositoryData #50930
Conversation
Add tracking of snapshot versions to `RepositoryData` to make BwC logic more efficient. Follow up to elastic#50853
Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore) |
Jenkins run elasticsearch-ci/1 |
// Step 2: Write new index-N blob to repository and update index.latest | ||
setPendingStep.whenComplete(newGen -> threadPool().executor(ThreadPool.Names.SNAPSHOT).execute(ActionRunnable.wrap(listener, l -> { | ||
// BwC logic: Load snapshot version information if any snapshot is missing a version in RepositoryData so that the new | ||
// RepositoryData contains a version for every snapshot | ||
final List<SnapshotId> snapshotIdsWithoutVersion = repositoryData.getSnapshotIds().stream().filter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is somewhat annoying that we have to load all SnapshotInfo
here one-off but I think it's still the best solution available.
Ideally, I was hoping to be able to just work from the assumption that if no version is set for a snapshot then it must be from before 8.0
/7.6
. But I gave up on that idea since it breaks as soon as some older version cluster (for whatever reason), takes a snapshot and removes all the version
fields when writing out a new index-N
(maybe that's the wrong trade off though, glad to hear opinions on this :) ... obviously you could argue that loading all the SnapshotInfo
is too high a price to pay just so the repo can move to the new metadata version earlier, but IMO even on S3 etc. loading 100 snapshots or so won't take all that long or cost any meaningful amount of $$$).
@@ -263,11 +271,24 @@ public void testHandlingMissingRootLevelSnapshotMetadata() throws Exception { | |||
logger.info("--> delete root level snapshot metadata blob for snapshot [{}]", snapshotToCorrupt); | |||
Files.delete(repo.resolve(String.format(Locale.ROOT, BlobStoreRepository.SNAPSHOT_NAME_FORMAT, snapshotToCorrupt.getUUID()))); | |||
|
|||
logger.info("--> strip version information from index-N blob"); | |||
final RepositoryData withoutVersions = new RepositoryData(repositoryData.getGenId(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Obviously now that we're writing the version field to RepositoryData
using shard generations to identify having any pre-7.6 snapshots or not goes away and we have to fake an old-style snapshot by stripping out the version
fields manually.
new ActionListener<>() { | ||
@Override | ||
public void onResponse(Collection<Void> voids) { | ||
logger.info("Successfully loaded all snapshot's version information for {} from snapshot metadata", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
info
IMO since it's a one time thing and it would be nice to have some marker of the "upgrade" in the logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that snapshotIdsWithoutVersion
can be very long. I was wondering if we should just display the size of snapshotIdsWithoutVersion
here in the info logging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice to have the concrete ids of the snapshots to help debug situations where users might be writing to the repo from various ES versions (that's my main motivation to have this). I figured the list isn't going to be so long that it would create real issues due to the log line length so worst case it's a bit of an annoyance right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice to have the concrete ids of the snapshots to help debug situations where users might be writing to the repo from various ES versions (that's my main motivation to have this)
Why do the exact names and versions matter in that case? Isn't it sufficient to know that some snapshots got version-tagged that previously were not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Versions I don't care about much and the exact names aren't important either, but it would be nice to see if the list of snapshots changed if this gets logged repeatedly.
Especially on Cloud just having just the size which might be effectively constant over time seems like it could create an annoying situation when debugging :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps use firstListElementsToCommaDelimitedString
then (see AllocationService)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah nice thanks => reused in 02e8c4e (can probably move that method to a more appropriate place in a follow up)
public void onFailure(Exception e) { | ||
logger.warn("Failure when trying to load missing version information from snapshot metadata", e); | ||
} | ||
}, () -> filterRepositoryDataStep.onResponse(repositoryData.withVersions(updatedVersionMap))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not great but just like in last week's discussion of SnapshotsService
not introducing any new breakage here so we take whatever version information we can get and run with it even on exceptions.
@ywelsch the test failure in a previous run here seems like it might be interesting to you:
even though a snapshot test failed here it looks like this is related to the new CS persistence layer instead of my changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left one comment, looking good o.w.
new ActionListener<>() { | ||
@Override | ||
public void onResponse(Collection<Void> voids) { | ||
logger.info("Successfully loaded all snapshot's version information for {} from snapshot metadata", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that snapshotIdsWithoutVersion
can be very long. I was wondering if we should just display the size of snapshotIdsWithoutVersion
here in the info logging.
Thanks Yannick! |
Add tracking of snapshot versions to RepositoryData to make BwC logic more efficient. Follow up to elastic#50853
Add tracking of snapshot versions to RepositoryData to make BwC logic more efficient. Follow up to elastic#50853
Add tracking of snapshot versions to
RepositoryData
to make BwC logic more efficient.Follow up to #50853