Do not merge - Copy detection #607
Conversation
Force-pushed from cb0b5b5 to 3016bd8
Very quick notes after an initial skim; I didn’t really read the copy/copy.go code in any kind of detail.
Force-pushed from 52db506 to 071c9f9
Reviewing myself in the GitHub UI ...
Force-pushed from 15a682b to 80bf576
LGTM
one question and a nit
Force-pushed from 80bf576 to 59f71ce
Fixed both. Thanks, @giuseppe!
Force-pushed from 59f71ce to 02a4d93
Force-pushed from 02a4d93 to 6384658
@mtrmac can you have a look at the latest commit I added? It's passing Buildah and Podman CI, and the code looks fairly straightforward. Sooner rather than later, I would love to allocate some time to reflect on how we can refactor the code.
In general, or specifically for the way c/storage does this? In general, all the
Oops, I have somehow submitted an incomplete review. I’m still reading the code, and I planned to revisit the FIXMEs afterwards.
Let me know when you're done. I will repush after that.
```go
locker, ok := s.lockedDigests[d]
if !ok {
	return errors.Errorf("trying to unlock non existent lock for digest %q", d)
}
```
Documentation for UnlockBlob says “note that it is safe to call UnlockBlob multiple times”. This error return makes that kind of true (the code does not panic), but unexpected.

One way to make this go away is to just not return any error value from UnlockBlob :)
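An error-free, idempotent variant could look roughly like this; a minimal sketch in which `blobLocks`, `digestLock`, and the string digests are hypothetical stand-ins for the storage image destination's map and storage.Locker:

```go
package main

import "fmt"

// digestLock is a hypothetical stand-in for storage.Locker.
type digestLock struct{ held bool }

func (l *digestLock) Unlock() { l.held = false }

// blobLocks sketches the bookkeeping around lockedDigests. UnlockBlob
// returns no error and is safe to call multiple times: a repeated call
// simply finds no locker and does nothing.
type blobLocks struct {
	lockedDigests map[string]*digestLock
}

func (b *blobLocks) UnlockBlob(d string) {
	locker, ok := b.lockedDigests[d]
	if !ok {
		return // already unlocked: a no-op, not an error
	}
	locker.Unlock()
	delete(b.lockedDigests, d)
}

func main() {
	b := &blobLocks{lockedDigests: map[string]*digestLock{"sha256:abc": {held: true}}}
	b.UnlockBlob("sha256:abc")
	b.UnlockBlob("sha256:abc") // second call is harmless
	fmt.Println(len(b.lockedDigests))
}
```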
The documentation is incorrect. I missed updating that. Sorry for that!
Done.
(All the other copies of the documentation need updating as well. — but let’s first see if we need to make this public at all.)
copy/copy.go (outdated)
```go
// another process.
//
// Note that we call SetTotal() after the new
// bar is created to avoid potential flickering
```
(This feels like something that could be encapsulated in a helper; ic.c.createReplacementProgressBar?)
Will do 👍
Done.
The “Note that we call SetTotal() after the new bar is created to avoid potential flickering” comment is hard-earned knowledge that seems worth preserving.
copy/copy.go (outdated)
```go
// if the bar removal and replacement happens
// across refresh rates.
toReplaceBar := bar
bar = ic.c.createProgressBar(pool, srcInfo, "blob", "paused: being copied by another process", mpb.BarRemoveOnComplete(), mpb.BarReplaceOnComplete(bar))
```
Is bar ever completed? If not, how does pool.Wait finish?
It will complete. createProgressBar() will set the size to 0 if a filler is specified.
If I am reading the code correctly, a zero-sized bar does not automatically complete; toComplete is set only by Bar.SetTotal or Bar.IncrBy.

Anyway, in the !reused case below, this progress bar should somehow be returned to the caller, so that it can be replaced before copyLayerBlobFromStream. (In the reused case, this progress bar is replaced, which completes it.)
```go
var bar *mpb.Bar
if filler != "" {
	barFiller := mpb.FillerFunc(
```
(BTW, if I understand the idea correctly, we don’t need to replace the bars only to change the message; the FillerFunc could just fmt.Fprint(*someVariable) and we could update the variable (+ locking in both the reader and writer). That does not look like an improvement right now, but if the logic moved into storageImageDestination.TryReuseBlob, it could make the API to update the progress bar with status messages notably easier.)
I think it's possible to do that but this would not work when replacing a "placeholder" with the real progress bar, which does not specify a filler. I prefer to keep it as is (now with a dedicated method to create a replacement bar).
Sure, the real progress bar can be a different animal.

The fmt.Fprintf(*someVariable) approach seems attractive in that it would give GetBlob (for status about mirrors) and TryReusingBlob (for all of this code, if we manage to move it into the transport) a pretty nice way to report progress: we could give them a

```go
type UserProgressNotifier interface {
	Update(message string)
}
```
Done now.
Extend the ImageDestination interface with methods for locking and unlocking a blob:
1) LockBlob() for locking a given blob.
2) UnlockBlob() for unlocking a blob.
3) SupportsBlobLocks() to indicate whether the image destination supports lock semantics or not.

Signed-off-by: Valentin Rothberg <[email protected]>
Force-pushed from cb307af to 5e5e204
@mtrmac I updated the PR according to your feedback. I vendored the latest state into the CI PRs:
At the moment, I cannot run it through CRI-O as a) the K8s e2e tests regress and b) cri-o/cri-o#2195 must be merged first.
CRI-O is now unblocked, the CI PR can be found here: containers/podman#2763
Detect copy operations of other processes when copying layers to the containers-storage. The blob locks used by the storage image destination are file locks from containers/storage. Those locks do not support try-lock semantics as they are constrained by the fcntl(2) syscall. A try-lock mechanism is needed to detect if another process is currently holding the lock and is hence copying the layer in question. Once we have figured that out, we can display the information on the progress bars to inform the user what is happening.

The workaround for the missing try-lock semantics is to acquire the lock in a separate goroutine and wait a certain amount of time. If we still don't own the lock after the timeout, we assume the layer is being copied by another process.

Unlocking a blob presents another problem, as we should only unlock when we are certain the operation has succeeded, which is *after* the blob has been committed to the containers storage. Hence, we unlock as soon as possible during Commit() of the storage image destination. To avoid potential deadlocks during errors or panics, we also unlock all not-yet-unlocked locks during (storageImageDestination).Close(), which is invoked in a deferred call in copy.Image.

Signed-off-by: Valentin Rothberg <[email protected]>
Force-pushed from 5e5e204 to c456947
The PR for c/storage has been merged and I just updated the
There is one scenario I did not see until now, which is a big blocker for merging the code. Assume we have an image with two layers (L1, L2) and two processes (P1, P2) pulling it. We can run into a situation where P1 is locking L1 and P2 is locking L2, both waiting for the other process to release the locks: deadlock. I do not know how I could not see this issue before...
Re: deadlocks, I think you need a timeout on waiting for locks, after which you pull in parallel anyway. See the "denial of service" paragraph here.
Thanks for the feedback! I think these are two orthogonal issues. If a user goes mad and (for some reason) can access our storage, we are doomed in any case. There are various other locks which could lead to a DoS. Hence, this scenario is not (yet) in scope for this PR. The problem we have in the PR now is that a DoS can currently happen by design, without any attack, which is more than unfortunate. I have some ideas that I want to evaluate with @mtrmac, who has a much better judgment on the code than I have. However, I think that a timeout would make sense for the blob locking, to be on the safe side. But I really prefer to first fix the underlying problem of the potential for deadlocks.
If you assume well-behaved actors, won't P1 soon finish pulling L1 and then release the lock, after which P2 can use the layer? That's why we have layer-level locking and not image-level locking. I'm not seeing a case where this deadlocks with no work being done unless a bad actor acquires a layer lock and then sits on its hands instead of working to fetch the layer (which is where lock-waiting timeouts would come in).
The devil lies in the details. Currently, the downloading of a layer is decoupled from committing the layer to the storage, and we commit the layers after all layers have been downloaded. Only when a layer is committed will the associated lock be released. Hence, in the example above, P1 and P2 will be deadlocked during downloading, as both are waiting for the other process to release the lock. Certainly, the deadlock can be avoided by combining the download and commit of a given layer, which is something I am doing at the moment. This will also speed up the entire process, as layers will be committed as soon as they are available.
I opened #611 for that. It's not yet in a state for merging but it passes local smoke tests.
```go
// data for {Lock,Unlock}Blob
lockedDigests   map[digest.Digest]storage.Locker // Bookkeeping of digests that have been locked
unlockedDigests map[digest.Digest]bool           // Bookkeeping of digests that have been unlocked. Required to unlock yet unlocked digests in Close()
```
(Absolutely non-blocking: The symmetry between lockedDigests and unlockedDigests, when they are so different, makes me nervous. Maybe name the first digestLockers?

Also, while the unlockedDigests idea does work correctly AFAICT, I can’t quite formulate a simple explanation why. If you have one, writing it down (here or somewhere around LockBlob) would be nice.)
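One way to write that explanation down is as a toy model of the two maps; a hypothetical sketch in which string digests and plain unlock functions stand in for digest.Digest and storage.Locker. Close can blindly iterate lockedDigests because unlockedDigests records which ones were already released, making the repeated unlock a no-op.

```go
package main

import "fmt"

// dest models the two bookkeeping maps: LockBlob records a locker in
// lockedDigests; UnlockBlob marks the digest in unlockedDigests; Close
// releases only the locks that were never unlocked, so an error or
// panic cannot leave a blob lock held.
type dest struct {
	lockedDigests   map[string]func() // digest -> unlock func
	unlockedDigests map[string]bool
}

func (d *dest) UnlockBlob(digest string) {
	if unlock, ok := d.lockedDigests[digest]; ok && !d.unlockedDigests[digest] {
		unlock()
		d.unlockedDigests[digest] = true
	}
}

func (d *dest) Close() {
	for digest := range d.lockedDigests {
		d.UnlockBlob(digest) // no-op for digests already unlocked
	}
}

func main() {
	n := 0 // counts actual unlock calls
	d := &dest{
		lockedDigests:   map[string]func(){"a": func() { n++ }, "b": func() { n++ }},
		unlockedDigests: map[string]bool{},
	}
	d.UnlockBlob("a") // unlocked early, e.g. during Commit()
	d.Close()         // releases only "b"
	fmt.Println(n)
}
```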
Closing in favour of #611.
containers/storage#307 must be merged before. Then we can update this PR and merge.
This PR is rather big and complex, but I tried to cut it into easier-to-digest commits where possible (the final commit being the big chunk I couldn't figure out how to split).
FYI @rhatdan @wking @mtrmac
I will open a PR for Skopeo that you can use to play around with.