Do not merge - Copy detection #607
Conversation
Force-pushed from cb0b5b5 to 3016bd8
Very quick notes after an initial skim; I didn’t really read the copy/copy.go code in any kind of detail.
Force-pushed from 52db506 to 071c9f9
Reviewing myself in the GitHub UI ...
Force-pushed from 15a682b to 80bf576
LGTM
one question and a nit
Force-pushed from 80bf576 to 59f71ce
Fixed both. Thanks, @giuseppe!
Force-pushed from 59f71ce to 02a4d93
Force-pushed from 02a4d93 to 6384658
@mtrmac can you have a look at the latest commit I added? It's passing Buildah and Podman CI, and the code looks fairly straightforward. Sooner rather than later, I would love to allocate some time to reflect on how we can refactor the code.
In general, or specifically for the way c/storage does this? In general, all the
Oops, I have somehow submitted an incomplete review. I’m still reading the code, and I planned to revisit the FIXMEs afterwards.
Let me know when you're done. I will repush after that.
```go
locker, ok := s.lockedDigests[d]
if !ok {
	return errors.Errorf("trying to unlock non existent lock for digest %q", d)
}
```
Documentation for UnlockBlob says “note that it is safe to call UnlockBlob multiple times”. This error return makes that kind of true (the code does not panic), but unexpected.

One way to make this go away is to just not return any error value from UnlockBlob :)
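An error-free, idempotent variant could look roughly like this; a minimal sketch in which `blobLocks`, `digestLock`, and the string digests are hypothetical stand-ins for the storage image destination's map and storage.Locker:

```go
package main

import "fmt"

// digestLock is a hypothetical stand-in for storage.Locker.
type digestLock struct{ held bool }

func (l *digestLock) Unlock() { l.held = false }

// blobLocks sketches the bookkeeping around lockedDigests. UnlockBlob
// returns no error and is safe to call multiple times: a repeated call
// simply finds no locker and does nothing.
type blobLocks struct {
	lockedDigests map[string]*digestLock
}

func (b *blobLocks) UnlockBlob(d string) {
	locker, ok := b.lockedDigests[d]
	if !ok {
		return // already unlocked: a no-op, not an error
	}
	locker.Unlock()
	delete(b.lockedDigests, d)
}

func main() {
	b := &blobLocks{lockedDigests: map[string]*digestLock{"sha256:abc": {held: true}}}
	b.UnlockBlob("sha256:abc")
	b.UnlockBlob("sha256:abc") // second call is harmless
	fmt.Println(len(b.lockedDigests))
}
```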
The documentation is incorrect. I missed updating that. Sorry for that!
Done.
(All the other copies of the documentation need updating as well. — but let’s first see if we need to make this public at all.)
copy/copy.go (outdated)
```go
// another process.
//
// Note that we call SetTotal() after the new
// bar is created to avoid potential flickering
```
(This feels like something that could be encapsulated in a helper; ic.c.createReplacementProgressBar?)
Will do 👍
Done.
The “Note that we call SetTotal() after the new bar is created to avoid potential flickering” comment is hard-earned knowledge that seems worth preserving.
copy/copy.go (outdated)
```go
// if the bar removal and replacement happens
// across refresh rates.
toReplaceBar := bar
bar = ic.c.createProgressBar(pool, srcInfo, "blob", "paused: being copied by another process", mpb.BarRemoveOnComplete(), mpb.BarReplaceOnComplete(bar))
```
Is bar ever completed? If not, how does pool.Wait finish?
It will complete. createProgressBar() will set the size to 0 if a filler is specified.
If I am reading the code correctly, a zero-sized bar does not automatically complete; toComplete is set only by Bar.SetTotal or Bar.IncrBy.

Anyway, in the !reused case below, this progress bar should somehow be returned to the caller, so that it can be replaced before copyLayerBlobFromStream. (In the reused case, this progress bar is replaced, which completes it.)
```go
var bar *mpb.Bar
if filler != "" {
	barFiller := mpb.FillerFunc(
```
(BTW, if I understand the idea correctly, we don’t need to replace the bars only to change the message; the FillerFunc could just fmt.Fprint(*someVariable) and we could update the variable (+ locking in both the reader and writer). That does not look like an improvement right now, but if the logic moved into storageImageDestination.TryReuseBlob, it could make the API to update the progress bar with status messages notably easier.)
I think it's possible to do that but this would not work when replacing a "placeholder" with the real progress bar, which does not specify a filler. I prefer to keep it as is (now with a dedicated method to create a replacement bar).
Sure, the real progress bar can be a different animal.

The fmt.Fprintf(*someVariable) approach seems attractive in that it would give GetBlob (for status about mirrors) and TryReusingBlob (for all of this code, if we manage to move it into the transport) a pretty nice way to report progress: we could give them a

```go
type UserProgressNotifier interface {
	Update(message string)
}
```
Done now.
Extend the ImageDestination interface with methods for locking and unlocking a blob:
1) LockBlob() for locking a given blob.
2) UnlockBlob() for unlocking a blob.
3) SupportsBlobLocks() to indicate whether the image destination supports lock semantics or not.

Signed-off-by: Valentin Rothberg <[email protected]>
Force-pushed from cb307af to 5e5e204
@mtrmac I updated the PR according to your feedback. I vendored the latest state into the CI PRs:
At the moment, I cannot run it through CRI-O as a) the K8s e2e tests regress and b) cri-o/cri-o#2195 must be merged first.
CRI-O is now unblocked, the CI PR can be found here: containers/podman#2763
Detect copy operations of other processes when copying layers to the containers-storage. The blob locks used by the storage image destination are file locks from containers/storage. Those locks do not support try-lock semantics as they are constrained by the fcntl(2) syscall. A try-lock mechanism is needed to detect if another process is currently holding the lock and is hence copying the layer in question. Once we have figured that out, we can display the information on the progress bars to inform the user what is happening.

The workaround for the missing try-lock semantics is to acquire the lock in a separate goroutine and wait a certain amount of time. If we still don't own the lock after the timeout, we assume the layer is being copied by another process.

Unlocking a blob presents another problem, as we should only unlock when we are certain the operation has succeeded, which is *after* the blob has been committed to the containers storage. Hence, we unlock as soon as possible during Commit() of the storage image destination. To avoid potential deadlocks during errors or panics, we also unlock all not-yet-unlocked locks during (storageImageDestination).Close(), which is invoked in a deferred call in copy.Image.

Signed-off-by: Valentin Rothberg <[email protected]>
Force-pushed from 5e5e204 to c456947
The PR for c/storage has been merged and I just updated the
There is one scenario I did not see until now, which is a big blocker for merging the code. Assume we have an image with two layers (L1, L2) and two processes (P1, P2) pulling it. We can run into a situation where P1 is locking L1 and P2 is locking L2, both waiting for the other process to release the locks: deadlock. I do not know how I could not see this issue before...
Re: deadlocks, I think you need a timeout on waiting for locks, after which you pull in parallel anyway. See the "denial of service" paragraph here.
Thanks for the feedback! I think these are two orthogonal issues. If a user goes mad and (for some reason) can access our storage, we are doomed in any case. There are various other locks which could lead to a DoS. Hence, this scenario is not (yet) in scope for this PR. The problem we have in the PR now is that a DoS can currently happen by design, without any attack, which is more than unfortunate. I have some ideas that I want to evaluate with @mtrmac, who has a much better judgment on the code than I have. However, I think that a timeout would make sense for the blob locking, to be on the safe side. But I really prefer to first fix the underlying problem of the potential for deadlocks.
If you assume well-behaved actors, won't P1 soon finish pulling L1 and then release the lock, after which P2 can use the layer? That's why we have layer-level locking and not image-level locking. I'm not seeing a case where this deadlocks with no work being done unless a bad actor acquires a layer lock and then sits on its hands instead of working to fetch the layer (which is where lock-waiting timeouts would come in).
The devil lies in the details. Currently, the downloading of a layer is decoupled from committing the layer to the storage, and we commit the layers after all layers have been downloaded. Only when a layer is committed will the associated lock be released. Hence, in the example above, P1 and P2 will be deadlocked during downloading, as both are waiting for the other process to release the lock. Certainly, the deadlock can be avoided by combining the download and commit of a given layer, which is something I am doing at the moment. This will also speed up the entire process, as layers will be committed as soon as they are available.
I opened #611 for that. It's not yet in a state for merging but it passes local smoke tests.
```go
// data for {Lock,Unlock}Blob
lockedDigests   map[digest.Digest]storage.Locker // Bookkeeping of digests that have been locked
unlockedDigests map[digest.Digest]bool           // Bookkeeping of digests that have been unlocked. Required to unlock yet unlocked digests in Close()
```
(Absolutely non-blocking: The symmetry between lockedDigests and unlockedDigests, when they are so different, makes me nervous. Maybe name the first digestLockers?

Also, while the unlockedDigests idea does work correctly AFAICT, I can’t quite formulate a simple explanation why. If you have one, writing it down (here or somewhere around LockBlob) would be nice.)
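One way to write that explanation down is as a toy model of the two maps; a hypothetical sketch in which string digests and plain unlock functions stand in for digest.Digest and storage.Locker. Close can blindly iterate lockedDigests because unlockedDigests records which ones were already released, making the repeated unlock a no-op.

```go
package main

import "fmt"

// dest models the two bookkeeping maps: LockBlob records a locker in
// lockedDigests; UnlockBlob marks the digest in unlockedDigests; Close
// releases only the locks that were never unlocked, so an error or
// panic cannot leave a blob lock held.
type dest struct {
	lockedDigests   map[string]func() // digest -> unlock func
	unlockedDigests map[string]bool
}

func (d *dest) UnlockBlob(digest string) {
	if unlock, ok := d.lockedDigests[digest]; ok && !d.unlockedDigests[digest] {
		unlock()
		d.unlockedDigests[digest] = true
	}
}

func (d *dest) Close() {
	for digest := range d.lockedDigests {
		d.UnlockBlob(digest) // no-op for digests already unlocked
	}
}

func main() {
	n := 0 // counts actual unlock calls
	d := &dest{
		lockedDigests:   map[string]func(){"a": func() { n++ }, "b": func() { n++ }},
		unlockedDigests: map[string]bool{},
	}
	d.UnlockBlob("a") // unlocked early, e.g. during Commit()
	d.Close()         // releases only "b"
	fmt.Println(n)
}
```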
Closing in favour of #611.
containers/storage#307 must be merged before. Then we can update this PR and merge.
This PR is rather big and complex, but I tried to cut it into easier-to-digest commits where possible (the final commit being the big chunk I couldn't figure out how to split).
FYI @rhatdan @wking @mtrmac
I will open a PR for Skopeo that you can use to play around with.