
Do not merge - Copy detection #607

Closed
wants to merge 2 commits

Conversation

vrothberg (Member)

containers/storage#307 must be merged before. Then we can update this PR and merge.

This PR is rather big and complex, but I tried to cut it into easier-to-digest commits where possible (the final commit being the big chunk I couldn't figure out how to split).

FYI @rhatdan @wking @mtrmac

I will open a PR over at Skopeo that you can use to play around with.

mtrmac (Collaborator) left a comment

A very quick initial look after a skim; I didn't really read the copy/copy.go code in any kind of detail.

@vrothberg vrothberg requested a review from nalind March 21, 2019 07:55
@vrothberg vrothberg force-pushed the copy-detection branch 6 times, most recently from 52db506 to 071c9f9 Compare March 22, 2019 12:52
vrothberg (Member Author) left a comment

Reviewing myself in the GitHub UI ...

@vrothberg vrothberg force-pushed the copy-detection branch 2 times, most recently from 15a682b to 80bf576 Compare March 25, 2019 12:29
@vrothberg vrothberg changed the title WIP - Copy detection Copy detection Mar 25, 2019
@vrothberg (Member Author)

@rhatdan @nalind @mtrmac PTAL

@giuseppe (Member)

LGTM

giuseppe (Member) left a comment

one question and a nit

@vrothberg (Member Author)

one question and a nit

Fixed both. Thanks, @giuseppe !

@vrothberg (Member Author)

@mtrmac can you have a look at the latest commit I added? It's passing Buildah and Podman CI, and the code looks fairly straightforward.

Sooner rather than later, I would love to allocate some time and reflect on how we can refactor the code. PutBlob() and Commit() could be moved into a single operation. I have not gotten into the details of how this can be done, but seeing all the potential for deadlocks, I am worried (not yet scared).

@mtrmac (Collaborator)

mtrmac commented Apr 1, 2019

Sooner rather than later, I would love to allocate some time and reflect on how we can refactor the code. PutBlob() and Commit() could be moved into a single operation.

In general, or specifically for the way c/storage does this? In general, all the PutBlobs need to be finished before the manifest can be updated and written to the destination (incl. possibly attempting more than one manifest format if the destination protocol like d/distribution can support it in principle, but that particular server does not accept some formats), so I don’t know how that would work. OTOH the code can certainly be structured in some other way.

@mtrmac (Collaborator)

mtrmac commented Apr 1, 2019

Oops, I have somehow submitted an incomplete review. I’m still reading the code, and I planned to revisit the FIXMEs afterwards.

@vrothberg (Member Author)

Oops, I have somehow submitted an incomplete review. I’m still reading the code, and I planned to revisit the FIXMEs afterwards.

Let me know when you're done. I will repush after that.


locker, ok := s.lockedDigests[d]
if !ok {
	return errors.Errorf("trying to unlock non existent lock for digest %q", d)
mtrmac (Collaborator)

Documentation for UnlockBlob says “note that it is safe to call UnlockBlob multiple times”. This error return makes that kind of true (the code does not panic), but unexpected.

One way to make this go away is to just not return any error value from UnlockBlob :)

vrothberg (Member Author)

The documentation is incorrect; I missed updating it. Sorry about that!

vrothberg (Member Author)

Done.

mtrmac (Collaborator)

(All the other copies of the documentation need updating as well. — but let’s first see if we need to make this public at all.)

copy/copy.go
// another process.
//
// Note that we call SetTotal() after the new
// bar is created to avoid potential flickering
mtrmac (Collaborator)

(This feels like something that could be encapsulated in a helper; ic.c.createReplacementProgressBar?)

vrothberg (Member Author)

Will do 👍

vrothberg (Member Author)

Done.

mtrmac (Collaborator)

The

// Note that we call SetTotal() after the new
// bar is created to avoid potential flickering

comment is hard-earned knowledge that seems worth preserving.

copy/copy.go
// if the bar removal and replacement happens
// across refresh rates.
toReplaceBar := bar
bar = ic.c.createProgressBar(pool, srcInfo, "blob", "paused: being copied by another process", mpb.BarRemoveOnComplete(), mpb.BarReplaceOnComplete(bar))
mtrmac (Collaborator)

Is bar ever completed? If not, how does pool.Wait finish?

vrothberg (Member Author)

It will complete. createProgressBar() will set the size to 0 if a filler is specified.

mtrmac (Collaborator)

If I am reading the code correctly, a zero-sized bar does not automatically complete; toComplete is set only by Bar.SetTotal or Bar.IncrBy.

Anyway, in the !reused case below, this progress bar should somehow be returned to the caller, so that it can be replaced before copyLayerBlobFromStream.

(In the reused case, this progress bar is replaced, which completes it.)


var bar *mpb.Bar
if filler != "" {
	barFiller := mpb.FillerFunc(
mtrmac (Collaborator)

(BTW, if I understand the idea correctly, we don’t need to replace the bars only to change the message; the FillerFunc could just fmt.Fprint(*someVariable) and we could update the variable (+ locking in both the reader and writer).

That does not look like an improvement right now, but if the logic moved into storageImageDestination.TryReuseBlob, it could make the API to update the progress bar with status messages notably easier.)

vrothberg (Member Author)

I think it's possible to do that but this would not work when replacing a "placeholder" with the real progress bar, which does not specify a filler. I prefer to keep it as is (now with a dedicated method to create a replacement bar).

mtrmac (Collaborator) commented Apr 11, 2019

Sure, the real progress bar can be a different animal.

The fmt.Fprintf(*someVariable) approach seems attractive in that it would give GetBlob (for status about mirrors) and TryReusingBlob (for all of this code, if we manage to move it into the transport) a pretty nice way to report progress: we could give them a

type UserProgressNotifier interface {
	Update(message string)
}

@mtrmac (Collaborator)

mtrmac commented Apr 1, 2019

Let me know when you're done. I will repush after that.

Done now.

Extend the ImageDestination interface with methods for locking and
unlocking a blob:

1) LockBlob() for locking a given blob.

2) UnlockBlob() for unlocking a blob.

3) SupportsBlobLocks() to indicate whether the image destination
   supports lock semantics or not.

Signed-off-by: Valentin Rothberg <[email protected]>
@vrothberg vrothberg force-pushed the copy-detection branch 4 times, most recently from cb307af to 5e5e204 Compare April 2, 2019 10:58
@vrothberg (Member Author)

@mtrmac I updated the PR according to your feedback. I vendored the latest state into the CI PRs:

At the moment, I cannot run it through CRI-O as a) the K8s e2e tests regress and b) cri-o/cri-o#2195 must be merged before.

@vrothberg (Member Author)

CRI-O is now unblocked, the CI PR can be found here: containers/podman#2763

Detect copy operations of other processes when copying layers to the
containers-storage.  The blob locks used by the storage image
destination are file locks from containers/storage.  Those locks do not
support try-lock semantics as they are constrained by the fcntl(2)
syscall.  A try-lock mechanism is needed to detect whether another
process is currently holding the lock and is hence copying the layer in
question.  Once we figure that out, we can display the information on
the progress bars to inform the user what is happening.  The workaround
for the missing try-lock semantics is to acquire the lock in a separate
goroutine and wait a certain amount of time.  If we still don't own the
lock after the timeout, we assume the layer is being copied by another
process.

Unlocking a blob presents another problem as we should only unlock when
we are certain the operation has succeeded, which is *after* the blob
has been committed to the containers storage.  Hence, we unlock as
early as possible during Commit() of the storage image destination.  To
avoid potential deadlocks during errors or panics, we also release all
locks that have not yet been unlocked during
(storageImageDestination).Close(), which is invoked in a deferred call
in copy.Image.

Signed-off-by: Valentin Rothberg <[email protected]>
@vrothberg (Member Author)

The PR for c/storage has been merged and I just updated the vendor.conf here to depend on c/storage (not on my fork anymore).

@vrothberg vrothberg changed the title Copy detection Do not merge - Copy detection Apr 4, 2019
@vrothberg (Member Author)

There is one scenario I did not see until now, which is a big blocker to merge the code.

Assume we have an image with two layers (L1, L2) and two processes (P1, P2) pulling it. We can run into a situation where P1 is locking L1 and P2 is locking L2, both waiting for the other process to release the locks: deadlock. I do not know how I could not see this issue before...

@wking (Contributor)

wking commented Apr 4, 2019

Re: deadlocks, I think you need a timeout on waiting for locks, after which you pull in parallel anyway. See the "denial of service" paragraph here.

@vrothberg (Member Author)

Re: deadlocks, I think you need a timeout on waiting for locks, after which you pull in parallel anyway. See the "denial of service" paragraph here.

Thanks for the feedback! I think these are two orthogonal issues. If a user goes mad and (for some reason) can access our storage, we are doomed in any case. There are various other locks which could lead to a DoS. Hence, this scenario is not (yet) in scope for this PR.

The problem we have in the PR now is that a DoS can currently happen by design, without any attack, which is more than unfortunate. I have some ideas that I want to evaluate with @mtrmac, who has much better judgment on the code than I have.

However, I think that a timeout would make sense for the blob locking, to be on the safe side. But I really prefer to first fix the underlying problem of the potential for deadlocks.

@wking (Contributor)

wking commented Apr 4, 2019

We can run into a situation where P1 is locking L1 and P2 is locking L2, both waiting for the other process to release the locks: deadlock.

If you assume well-behaved actors, won't P1 soon finish pulling L1 and then release the lock, after which P2 can use the layer? That's why we have layer-level locking and not image-level locking. I'm not seeing a case where this deadlocks with no work being done unless a bad actor acquires a layer lock and then sits on its hands instead of working to fetch the layer (which is where lock-waiting timeouts would come in).

@vrothberg (Member Author)

vrothberg commented Apr 4, 2019

If you assume well-behaved actors, won't P1 soon finish pulling L1 and then release the lock, after which P2 can use the layer? That's why we have layer-level locking and not image-level locking. I'm not seeing a case where this deadlocks with no work being done unless a bad actor acquires a layer lock and then sits on its hands instead of working to fetch the layer (which is where lock-waiting timeouts would come in).

The devil lies in the detail. Currently, the downloading of a layer is decoupled from committing the layer to the storage, and we commit the layers only after all layers have been downloaded. Only when a layer is committed will the associated lock be released. Hence, in the example above, P1 and P2 will be deadlocked during downloading as both are waiting for the other process to release the lock.

Certainly, the deadlock can be avoided by combining the download and commit of a given layer, which is something I am doing at the moment. This will also speed up the entire process as layers will be committed as soon as they are available.

@vrothberg (Member Author)

Certainly, the deadlock can be avoided by combining the download and commit of a given layer, which is something I am doing at the moment. This will also speed up the entire process as layers will be committed as soon as they are available.

I opened #611 for that. It's not yet in a state for merging but it passes local smoke tests.


// data for {Lock,Unlock}Blob
lockedDigests map[digest.Digest]storage.Locker // Bookkeeping of digests that have been locked
unlockedDigests map[digest.Digest]bool // Bookkeeping of digests that have been unlocked. Required to unlock yet unlocked digests in Close()
mtrmac (Collaborator) commented Apr 11, 2019

(Absolutely non-blocking: The symmetry between lockedDigests and unlockedDigests, when they are so different, makes me nervous. Maybe name the first digestLockers?

Also, while the unlockedDigests idea does work correctly AFAICT, I can’t quite formulate a simple explanation why. If you have one, writing it down (here or somewhere around LockBlob) would be nice.)

@vrothberg (Member Author)

Closing in favour of #611.

@vrothberg vrothberg closed this Apr 18, 2019