Please provide a basic progress / transfer speed output ("CI compatible") on copy #658

Open
frittentheke opened this issue May 20, 2019 · 33 comments
Labels
good first issue help wanted kind/feature A request for, or a PR adding, new functionality

Comments

@frittentheke

frittentheke commented May 20, 2019

First of all, thank you massively for Skopeo and its ability to efficiently copy images between registries. Especially when having to use proxies and other special setups Skopeo is gold!

Just like @baracoder in issue #597, we are using Skopeo within GitLab CI jobs.

Having Skopeo not print and continuously update the otherwise nice progress indicators during a CI run is an improvement. However, not having any indication of the current transfer speed is an issue for our use case: because copy stays silent until it either succeeds or fails, it is hard to spot a (speed) problem with the transfer.

In our case, having some feedback on the transfer speed / likely success is crucial. May I kindly suggest introducing a setting (or checking whether a tty is present) to simply report the current speed / progress as a single line, prefixed with a timestamp, about once a minute or every 30 seconds? This should not clutter one's output in CI, and with the --quiet option still available it can still be switched off. This would also be nice for cron jobs, to have some sort of "logging" of the progress.

@mtrmac
Contributor

mtrmac commented May 20, 2019

Thanks for your report.

Reporting the recent transfer speed should be reasonably possible using c/image/copy.Options.Progress; reporting overall progress is not currently possible through that interface (it does not report about all the involved blobs, and their sizes, in advance). Enhancing the progress reporting interfaces of c/image/copy to make this possible (or, ideally, to implement the current / WIP progress bar code in c/image/copy on top of that generic interface) would be nice, of course.

@frittentheke
Author

Thanks for the quick response. The issue I am having is to see the current progress (transfer speed per second) rather than the total process. In short: I simply want to see how quickly things are progressing currently.

@mtrmac
Contributor

mtrmac commented Jun 25, 2019

This got notably more complex in the meantime, because up to 6 blobs are now copied simultaneously; so, the concept of “current progress” does not make sense any more, without some sort of aggregated view.

@frittentheke
Author

@mtrmac urgh, too bad :-(
Thank you for keeping me / this issue updated.

As cool as it is when Skopeo syncs / copies multiple gigabytes of Docker layers in seconds on a fast local network, when you are dealing with potentially slow registries accessed via the internet across half the globe, any indication of whether things are still moving or have (almost) stalled would help.

@rhatdan
Member

rhatdan commented Oct 8, 2020

@vrothberg Can we use the progress bars that we use in Podman for this?

@vrothberg
Member

@vrothberg Can we use the progress bars that we use in Podman for this?

Yes, that's possible. There is an example on GitHub (https://github.com/vbauerster/mpb#bytes-counters) that indicates the download speed.

I am currently busy with other things, but the change should be straightforward. We need to change the decorators in createProgressBar -> https://github.com/containers/image/blob/master/copy/copy.go#L982.

@github-actions

github-actions bot commented Jun 6, 2021

A friendly reminder that this issue had no activity for 30 days.

@frittentheke
Author

unstale

@vrothberg
Member

podman (master) $ ./bin/podman rmi -af; ./bin/podman pull docker.io/gcc                           
Trying to pull docker.io/library/gcc:latest...                                                    
Getting image source signatures                                                                   
Copying blob ad4592a9cb6d 9.00 MiB/s [===========================>----------] 39.1MiB / 52.4MiB   
Copying blob 5a2f691668eb 1.12 MiB/s [--------------------------------------] 1.9MiB / 187.3MiB   
Copying blob 7faeec18cdc0 done                                                                    
Copying blob 0ddda9701dd9 1.20 MiB/s [=========>----------------------------] 2.6MiB / 10.4MiB    
Copying blob a6e37b3a94cd 1.72 MiB/s [=>------------------------------------] 3.2MiB / 52.0MiB    
Copying blob f0031fb5d71f 1.48 MiB/s [==========================>-----------] 3.4MiB / 4.9MiB     
Copying blob 8f2ca4ed8981 8.06 MiB/s [===>----------------------------------] 11.5MiB / 121.3MiB  

As already mentioned by @mtrmac, copying layers is happening in parallel which makes it challenging to have a single indicator of the download speed.

But I want to revive the conversation on how we could get closer. @frittentheke, would the example above be of any help? Each layer would have its IO rate printed before the progress bar.

@frittentheke
Author

@vrothberg this would absolutely help; it is even more detailed than a summed-up data rate.

Just please also consider the usage of Skopeo in CI pipelines, which do not like constant updates to the same few lines. Maybe an option to refresh the output only every so often would be sensible here?

@mtrmac
Contributor

mtrmac commented Jun 7, 2021

As already mentioned by @mtrmac, copying layers is happening in parallel which makes it challenging to have a single indicator of the download speed.

It seems possible in principle to sum the speeds of the individual items; of course actually doing that, in concurrent code, and figuring out the relevant heuristics (smoothing / moving averages, and making sure the data for all streams covers the same time range, so that if 6 items “take turns” on a 100 MB/s link, 5 report 0 speed and 1 reports 100 MB/s at the time it is receiving data, we don’t sum that up to 600 MB/s) might end up pretty complex.
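To make the "same time range" concern concrete, here is a minimal sketch in Python (not the Go code of c/image, and not taken from it). Instead of summing per-stream speeds, it pools the byte counts of all streams into one shared sliding window and divides by the window length. The class name `AggregateRate`, its methods, and the 30-second default are all hypothetical; the sketch assumes timestamps arrive in non-decreasing order, and real concurrent code would also need a lock around the shared state.

```python
from collections import deque


class AggregateRate:
    """Sum bytes from all streams over one shared sliding time window."""

    def __init__(self, window_seconds=30.0):
        self.window = window_seconds
        self.samples = deque()  # (timestamp, byte_count) from any stream
        self.bytes_in_window = 0

    def record(self, timestamp, byte_count):
        """Note that some stream received byte_count bytes at timestamp."""
        self.samples.append((timestamp, byte_count))
        self.bytes_in_window += byte_count

    def rate(self, now):
        """Return aggregate bytes per second over the last window."""
        cutoff = now - self.window
        # Drop samples that have fallen out of the shared window.
        while self.samples and self.samples[0][0] <= cutoff:
            _, old_bytes = self.samples.popleft()
            self.bytes_in_window -= old_bytes
        return self.bytes_in_window / self.window
```

With six streams "taking turns" on a 100 MB/s link, each one-second burst lands in the same window, so the aggregate stays at about 100 MB/s rather than 600 MB/s.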

But we are getting a bit into the weeds… do I understand correctly that the core need is to:

  • Not spam the log, but
  • Report download speed sometimes (per 30-60 seconds), to eventually detect very slow network transfers?

and it’s not very important what the actual data reported is, beyond the two concerns above?

@frittentheke
Author

But we are getting a bit into the weeds… do I understand correctly that the core need is to:

* Not spam the log, but
* Report download speed _sometimes_ (per 30-60 seconds), to _eventually_ detect very slow network transfers?

and it’s not very important what the actual data reported is, beyond the two concerns above?

Yes. Maybe continue to print the single line that was introduced with containers/image#558 and just add some progress info like x of y megabytes or z MB/s, whatever, and then repeat the line every 30 or 60 seconds.
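As an illustration of what such a repeated line could look like, here is a small Python sketch. It is not Skopeo's actual output or code: the line format, the function `format_progress_line`, and the class `PeriodicReporter` are hypothetical, and it assumes the total size is known, which (as noted earlier in the thread) the current progress interface does not report in advance.

```python
from datetime import datetime, timezone


def format_progress_line(copied_bytes, total_bytes, bytes_per_second, now=None):
    """Build one timestamped, log-friendly progress line (hypothetical format)."""
    now = now or datetime.now(timezone.utc)
    mib = 1024 * 1024
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        f"{stamp} Copying: {copied_bytes / mib:.1f} MiB of "
        f"{total_bytes / mib:.1f} MiB, {bytes_per_second / mib:.2f} MiB/s"
    )


class PeriodicReporter:
    """Emit at most one line per interval, however often update() is called."""

    def __init__(self, interval_seconds=30.0, emit=print):
        self.interval = interval_seconds
        self.emit = emit
        self.last_emit = None  # monotonic time of the last emitted line

    def update(self, monotonic_now, line):
        """Emit line if the interval has elapsed; return True if emitted."""
        if self.last_emit is None or monotonic_now - self.last_emit >= self.interval:
            self.emit(line)
            self.last_emit = monotonic_now
            return True
        return False
```

A caller would invoke `update()` with `time.monotonic()` on every progress event; the reporter drops everything except one line per interval, which keeps CI and cron logs short.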

@github-actions

github-actions bot commented Jul 8, 2021

A friendly reminder that this issue had no activity for 30 days.

@frittentheke
Author

frittentheke commented Jul 8, 2021

This is not stale - was just talking to @vrothberg and @mtrmac about this ;-)

@github-actions

github-actions bot commented Aug 8, 2021

A friendly reminder that this issue had no activity for 30 days.

@github-actions

github-actions bot commented Sep 8, 2021

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Sep 8, 2021

@mtrmac @vrothberg @frittentheke Any movement on this?

@vrothberg
Member

Not to my knowledge. To summarize the above conversation:

  • Skopeo (or c/image) should print at a given frequency the network IO
  • When there's no TTY available (e.g., in CI), the IO should be printed as a single line
  • When there's a TTY, we can add a new decorator to each progress bar

@mtrmac @frittentheke does that sound right to you?

@mtrmac
Contributor

mtrmac commented Sep 9, 2021

The progress bars already include (per-layer) speed, so that’s a no-op.

I understand this as an opt-in, periodic, report, only in the non-interactive case.
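The decision described here (progress bars on a terminal, an opt-in periodic report otherwise) could be sketched as follows. This is an illustration in Python, not Skopeo code; the function name, the `periodic_opt_in` parameter, and the mode strings are hypothetical.

```python
import sys


def choose_progress_mode(stream=sys.stderr, periodic_opt_in=False):
    """Pick how to report progress for the given output stream."""
    if stream.isatty():
        # Interactive terminal: the progress bars already show per-layer speed.
        return "bars"
    if periodic_opt_in:
        # Non-interactive (CI, cron) and the user asked for periodic lines.
        return "periodic-line"
    # Non-interactive default: stay quiet, as today.
    return "silent"
```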

@mtrmac
Contributor

mtrmac commented Oct 7, 2021

Compare #1477 ; it’s not quite the same thing but it might need computing similar data.

@github-actions

github-actions bot commented Nov 7, 2021

A friendly reminder that this issue had no activity for 30 days.

@frittentheke
Author

@vrothberg I hope you don't mind me keeping this from staling out ....

@rhatdan
Member

rhatdan commented Nov 8, 2021

We don't just close stale issues; we use them as an opportunity to take a fresh look.

@pombredanne

It would be super useful to have some simple progress reporting that could be captured from stdout or stderr to provide some progress indications.

We use skopeo in ScanCode.io to fetch images, and it would help a lot to report some progress when we have larger images. FWIW, our code is at: https://github.com/nexB/scancode.io/blob/00bf2545436ebcfc5e94f45f9a29a4b2abfe2131/scanpipe/pipes/fetch.py#L92 and is a CLI wrapper using "docker://" URLs as inputs, which are accepted in the UI of scancode.io to fetch and scan whole Docker images for origin, license and more.

See also aboutcode-org/scancode.io#372

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@pombredanne

Gentle ping... I am still looking forward to this!

@frittentheke
Author

frittentheke commented Jan 29, 2022

A friendly reminder that this issue had no activity for 30 days.

This should not stale away .... @rhatdan @vrothberg

@pombredanne

Note that I have made some crude tests using script to pretend we are running interactively:

$ script --return  --flush -c "./skopeo copy  --insecure-policy docker://debian docker-archive:foo6.tar" -a log.txt

This gets us some output from mpb with escape sequences:

Script started on Sat 22 Jan 2022 07:43:55 PM CET
Getting image source signatures
Copying blob 0e29546d541c [--------------------------------------] 367.2KiB / 52.4MiB
ESC[1AESC[JCopying blob 0e29546d541c [>-------------------------------------] 1.2MiB / 52.4MiB
ESC[1AESC[JCopying blob 0e29546d541c [>-------------------------------------] 1.9MiB / 52.4MiB
ESC[1AESC[JCopying blob 0e29546d541c [=>------------------------------------] 2.7MiB / 52.4MiB
ESC[1AESC[JCopying blob 0e29546d541c [==>-----------------------------------] 3.6MiB / 52.4MiB
ESC[1AESC[JCopying blob 0e29546d541c [==>-----------------------------------] 4.0MiB / 52.4MiB
ESC[1AESC[JCopying blob 0e29546d541c [==>-----------------------------------] 4.4MiB / 52.4MiB
ESC[1AESC[JCopying blob 0e29546d541c [===>----------------------------------] 5.6MiB / 52.4MiB
ESC[1AESC[JCopying blob 0e29546d541c [====>---------------------------------] 6.4MiB / 52.4MiB

Then using strings to filter out the escape sequences yields something of sorts. Getting the output this way is warty and brittle... but if we could get this out of the box, without relying on script and strings, that would be perfectly good enough for me:

Script started on Sat 22 Jan 2022 07:43:55 PM CET
Getting image source signatures
Copying blob 0e29546d541c [--------------------------------------] 367.2KiB / 52.4MiB
[JCopying blob 0e29546d541c [>-------------------------------------] 1.2MiB / 52.4MiB
[JCopying blob 0e29546d541c [>-------------------------------------] 1.9MiB / 52.4MiB
[JCopying blob 0e29546d541c [=>------------------------------------] 2.7MiB / 52.4MiB
[JCopying blob 0e29546d541c [==>-----------------------------------] 3.6MiB / 52.4MiB
[JCopying blob 0e29546d541c [==>-----------------------------------] 4.0MiB / 52.4MiB
[JCopying blob 0e29546d541c [==>-----------------------------------] 4.4MiB / 52.4MiB
[JCopying blob 0e29546d541c [===>----------------------------------] 5.6MiB / 52.4MiB
[JCopying blob 0e29546d541c [====>---------------------------------] 6.4MiB / 52.4MiB
[JCopying blob 0e29546d541c [====>---------------------------------] 7.2MiB / 52.4MiB
[JCopying blob 0e29546d541c [=====>--------------------------------] 7.9MiB / 52.4MiB
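For wrappers that capture this kind of output, a more targeted filter than strings is possible. The following Python sketch removes ANSI CSI escape sequences (such as the cursor-up and erase sequences shown as ESC[1A and ESC[J above, where ESC stands for the byte 0x1b) and extracts the progress fraction. The pattern `BLOB_LINE` is an assumption based only on the lines shown in this comment; the bar format is not a stable interface, so this stays brittle, and the function name `parse_progress` is hypothetical.

```python
import re

# CSI sequence: ESC [ then parameter bytes, intermediate bytes, one final byte.
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")

# Matches the "Copying blob" lines in the format shown above (an assumption).
BLOB_LINE = re.compile(
    r"Copying blob (?P<digest>[0-9a-f]+) \[[=>-]*\] "
    r"(?P<done>[\d.]+)(?P<done_unit>[KMG]iB) / "
    r"(?P<total>[\d.]+)(?P<total_unit>[KMG]iB)"
)

UNIT = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}


def parse_progress(raw_line):
    """Return (digest, fraction_done) for a progress line, else None."""
    clean = ANSI_CSI.sub("", raw_line)
    match = BLOB_LINE.search(clean)
    if match is None:
        return None
    done = float(match["done"]) * UNIT[match["done_unit"]]
    total = float(match["total"]) * UNIT[match["total_unit"]]
    return match["digest"], done / total
```

Lines that carry no bar, such as "Getting image source signatures", yield `None` and can be passed through or ignored by the caller.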

@vbauerster this project uses your excellent mpb https://github.com/vbauerster/mpb 🙇 .... would there be a way to get progress provided optionally in a non-interactive mode without terminal escape sequence decoration?

@cco3 FYI, this is the underlying issue making it hard(er?) to report progress when fetching images in aboutcode-org/scancode.io#372

@github-actions

A friendly reminder that this issue had no activity for 30 days.

mtrmac added the kind/feature A request for, or a PR adding, new functionality label Dec 7, 2022
cgwalters added a commit to cgwalters/rpm-ostree that referenced this issue Aug 21, 2024
It's useful to see in the progress output how many layers
there are to fetch.

This is similar to
containers/bootc@6eb5718
which ended up being totally reworked in a nicer way in
containers/bootc@d8b5df2

But doing the latter would require nontrivial changes to our
DBus API around status and progress reporting...and I'd
like to think about how we tackle that more generally
in e.g. containers/skopeo#658

Closes: coreos#5024
cgwalters added a commit to cgwalters/rpm-ostree that referenced this issue Sep 3, 2024
@SignFinder

SignFinder commented Sep 12, 2024

A progress bar is absolutely needed.
Manually running "skopeo copy" without any progress output is uninformative: you cannot tell whether images are actually being copied until the command exits.
