(Alternative to 1148): Don't blindly reuse state from a previous layer when re-creating it #1140
Absolutely untested so far, and I’d like to see this work in the simulated situation before it is merged.
Note that removing a layer like this is also potentially risky: we can’t necessarily trust the contents of the …
OK, I have tested this now and verified that it works as expected. OTOH #1139 might be a better fix; I’d very much love for someone who understands the design intent to weigh in.
Podman tests: containers/podman#13315
Marking as ready to review; please consider this only in tandem with #1148.
The in-driver lock appears to be a carryover from docker; everything at this level is supposed to be protected by a LayerStore lock that is taken before we call into any driver-specific logic. The premise looks sound.
This looks like a decent remediation for cases where a process was killed sometime earlier while it was attempting to create a layer from a template, something that is often done for users of fedora toolbox, so I think this is worth pursuing.
defer d.locker.Unlock(id)

if _, err := system.Lstat(dir); err == nil {
	logrus.Warnf("Trying to create a layer %#v while directory %q already exists; removing it first", id, dir)
We default to showing warnings. Is this something we want normal users to see by default? Or should we drop this down to info?
My thinking was that this indicates an unexpected abort of a c/storage operation, and it is definitely something we want captured in any logs at the default level for postmortems, in case this cleanup turns out to be insufficient, incorrect, or harmful.
I’m not so sure that it needs to be printed by default on a TTY during interactive use; at that point, there are unlikely to be any logs recorded, and the thing either works or it doesn’t. OTOH, it’s also a situation we shouldn’t be regularly getting into, so I don’t think that a warning should hurt.
Alternatively, #1148 should make this case unreachable assuming there are no bugs — and in that case we either don’t need this PR, or we can happily leave it on a warning level because it is clearly very unexpected. If I’m reading @nalind’s comments right, it seems preferable to have both this PR and #1148.
OK, leave it at Warning level.
LGTM
In all call paths, the layerStore owning the driver is expected to be locked, so this seems redundant. See also containers#1140 (comment). Signed-off-by: Miloslav Trmač <[email protected]>
OK, let’s discuss possibly removing that in #1214. (That would break this PR; I will rebase either one as necessary.)
We have reports in the wild of a layer store where two symbolic links in linkDir point to the same layer. That could only happen when calling Driver.create with a previously-used layer ID (which happens all the time because pulls use deterministic layer IDs), without fully deleting the previous version of the layer (so far, we don't know how that has happened). To avoid such situations, don't just leave whatever was in the layer directory lying around; try to remove any pre-existing contents, as well as the symbolic link in linkDir, if any. Signed-off-by: Miloslav Trmač <[email protected]>
Use the Driver.locker lock when creating a layer in the overlay driver. I don’t know why that lock is necessary, but if it is necessary at all, it seems that it should be held on that code path. Alternatively, maybe that lock should be removed entirely?
… linkDir, to ensure the layer is created in a known state.
Compare also #1133 (EDIT: #1139).