support --output - to pull to stdout #346

Open
ndeloof opened this issue Jan 7, 2022 · 20 comments
Labels: enhancement (New feature or request), question (Further information is requested)

Comments

@ndeloof

ndeloof commented Jan 7, 2022

My use case is to rely on the oras CLI to restore a data cache stored as a tar.gz.
On pull, I'd like to pipe the downloaded artifact directly to tar xz.
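
For illustration, the desired flow would look something like this (hypothetical invocation, since --output - is not supported yet; registry, repository, and target directory are placeholders):

oras pull registry.example.com/cache/deps:v1 --output - | tar -xzf - -C ./cache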

@SteveLasker
Contributor

Looks super interesting. Please open a PR for the proposal.

@sajayantony added the help wanted (Extra attention is needed) label Apr 19, 2022
@shizhMSFT
Contributor

This is an interesting one. How would you handle multiple files?

@FeynmanZhou
Member

restore a data cache

Hi @ndeloof

Could you please elaborate more on your use case?

Actually, ORAS CLI v0.15 will provide oras manifest fetch and oras blob fetch, which might meet your need. You can check out this doc for details.
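
For example, assuming an artifact with a single tar.gz layer (reference and digest are placeholders), the fetch could look like:

oras manifest fetch localhost:5000/cache:v1
oras blob fetch --output - localhost:5000/cache@sha256:<digest> | tar -xzf -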

@FeynmanZhou added this to the future milestone Sep 7, 2022
@TerryHowe
Member

Seems like the output of this should be a tgz so that multiple files can be supported.

@shizhMSFT
Contributor

shizhMSFT commented Mar 15, 2023

One possible UX: oras pull localhost:5000/json-artifact:v1 --output - | jq, where oras returns an error if there are multiple blobs associated with the target manifest.

@shizhMSFT modified the milestones: future, v1.1.0 Mar 22, 2023
@shizhMSFT modified the milestones: v1.1.0, v1.2.0 Mar 22, 2023
@shizhMSFT added the enhancement (New feature or request) label and removed the help wanted (Extra attention is needed) label Mar 22, 2023
@ProbstDJakob

Are there any plans to implement this for the input as well, so that something like the following would be possible:

command-a | command-b | oras push localhost:5000/json-artifact:v1 -

...

oras pull localhost:5000/json-artifact:v1 --output - | jq

@shizhMSFT modified the milestones: v1.2.0, future Sep 12, 2023
@shizhMSFT added the question (Further information is requested) label Sep 12, 2023
@qweeah
Contributor

qweeah commented Sep 12, 2023

@ProbstDJakob There is a plan to provide a piped-command user experience in v1.2.0 by standardizing output; see #638.

Still, I have questions about the commands below:

command-a | command-b | oras push localhost:5000/json-artifact:v1 - 

Since the layer content comes from stdin, not from a file:

  1. What is the file name of the generated layer?
  2. How should we name the layer if a user runs oras pull localhost:5000/json-artifact:v1?
oras pull localhost:5000/json-artifact:v1 --output - | jq

What if localhost:5000/json-artifact:v1 contains multiple layers?

@ProbstDJakob

oras push could receive an additional option as follows:

--from-stdin[=file-path[:type]]
    oras will read data from stdin and write it to `file-path` within the image. If `file-path` has not
    been supplied, it defaults to `./stdin.blob` with the type `application/octet-stream`. This option can be
    used in conjunction with other files supplied via `<file>[:type] [...]`, but does not have to be. The only
    restriction is that no additional `-` file may be supplied.

...

<file>[:type] [...]
    The files to include within the image. The special file `-[:type]` is equivalent to using the option
    `--from-stdin=./stdin.blob[:type]`, where if no type has been supplied the type
    `application/octet-stream` will be used.
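
For illustration, the proposed option could be used like this (hypothetical syntax, not implemented; the reference and media types are examples):

command-a | command-b | oras push localhost:5000/json-artifact:v1 --from-stdin=./data.json:application/json
command-a | oras push localhost:5000/json-artifact:v1 -:application/json other.txt:text/plain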

Regarding your second question, I am not that familiar with how OCI images work, so I am currently unable to answer it, but I am willing to study the docs and elaborate further if the answer above doesn't address it implicitly.

For oras pull there might be a similar option:

--to-stdout[:<single|tar>][=file-path,...]
    Instead of writing the content of the image to a directory, the content will be written to stdout.

    When supplying `--to-stdout:single[=file-path]`, the file found at `file-path` within the image will be
    written to stdout without converting it to an archive. If no `file-path` has been supplied and the image
    contains exactly one file, that file will be written out; otherwise the command will fail. The command
    will also fail if more than one `file-path` has been supplied.

    When supplying `--to-stdout:tar[=file-path,...]`, the files found at `file-path,...` will be written to
    stdout combined into an uncompressed tar archive. If no files have been supplied, all files within the
    image will be included in the archive.

    Aliases:
    `--to-stdout=<file-path>` => `--to-stdout:single=<file-path>`
    `--to-stdout=<file-path,...>` => `--to-stdout:tar=<file-path,...>`
    `--to-stdout` => `--to-stdout:single`
    
    This option is mutually exclusive with the `--output` option.

Regarding the penultimate line, I am not quite confident that defaulting to single is the right choice, but I think most users would pipe a single file rather than a whole archive.
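
For illustration, the proposed --to-stdout option and its aliases could be used like this (hypothetical syntax, not implemented):

oras pull localhost:5000/json-artifact:v1 --to-stdout | jq                    # image contains exactly one file
oras pull localhost:5000/json-artifact:v1 --to-stdout=config.json | jq        # a specific file
oras pull localhost:5000/json-artifact:v1 --to-stdout:tar | tar -tvf -        # all files as an uncompressed tar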

@guettli

guettli commented Nov 28, 2023

Just for the record, I found this solution to stream the content of an artifact to stdout:

oras blob fetch -o- ghcr.io/foo/test@$(oras manifest fetch ghcr.io/foo/test:0.0.1  | yq '.layers[0].digest')

I pushed the tgz like this:

oras push ghcr.io/foo/test:0.0.1 --artifact-type application/vnd.foo.machine-image.v1 image.tgz

This solves my use case, but it would be great to do that without yq (in a single oras call).
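
For reference, since the fetched manifest is JSON, the same workaround should also work with jq in place of yq (untested sketch):

oras blob fetch -o- ghcr.io/foo/test@$(oras manifest fetch ghcr.io/foo/test:0.0.1 | jq -r '.layers[0].digest')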

@qweeah
Contributor

qweeah commented Nov 29, 2023

I pushed the tgz like this:

oras push ghcr.io/foo/test:0.0.1 --artifact-type application/vnd.foo.machine-image.v1 image.tgz

@guettli This is very interesting. May I know what's stored inside the image.tgz and how it is generated?

If you provide a folder rather than a file, oras push can pack it and oras pull can unpack it automatically. If your end-to-end scenario fits this, you may try:

oras push ghcr.io/foo/test:0.0.1 --artifact-type application/vnd.foo.machine-image.v1 image # pack and push all files in folder image
oras pull ghcr.io/foo/test:0.0.1 -o pulled # pull and unpack files into folder pulled/image

@guettli

guettli commented Nov 29, 2023

@qweeah thank you for asking. The tgz contains a Linux root file system. We booted Ubuntu on a VM, installed some tools, applied some configuration, and then created a tgz, so that we have a constant custom image. The image is about 1.8 GB and contains 100k files.

I am happy to store the tgz as a blob in an artifact. Nice to know that you could use oras for tar/untar, too, but at the moment I don't see a big benefit.

One drawback of the current method: We can't create the artifact via streaming. AFAIK something like this is not supported yet:

tar -czf- .... | oras push ...

@qweeah what benefit would we have if we used oras instead of tar/untar?

@qweeah
Contributor

qweeah commented Nov 29, 2023

Before uploading any blob to a registry, its digest must be specified.

Unless you can get the digest before archiving is done, streamed uploading is not possible.
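
For context, this constraint comes from the registry push protocol: per the OCI distribution spec, a blob upload is only completed by a final request that carries the digest of the entire blob. A simplified sketch (registry and repository are placeholders, and it assumes the returned Location already contains a query string):

# start an upload session and capture the Location header
location=$(curl -siX POST "https://registry.example/v2/myrepo/blobs/uploads/" | grep -i '^location:' | cut -d' ' -f2 | tr -d '\r')
# the digest of the finished archive is required to complete the upload
digest="sha256:$(sha256sum image.tgz | cut -d' ' -f1)"
curl -X PUT "${location}&digest=${digest}" --data-binary @image.tgz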

@qweeah
Contributor

qweeah commented Nov 29, 2023

@qweeah what benefit would we have if we used oras instead of tar/untar?

Well, rather than using oras manifest fetch + oras blob fetch, you can use a single oras pull command to do the pulling.

@ProbstDJakob

Before uploading any blob to a registry, its digest must be specified.

Unless you can get the digest before archiving is done, streamed uploading is not possible.

To circumvent this, oras could buffer the input stream in memory up to, for example, 64 MiB. Once this threshold is reached, oras pauses reading new input, first writes the buffered 64 MiB into a temporary file with narrow access rights, and then resumes reading from the input, piping it directly into the file. After reaching EOF, oras could calculate the digest either from the in-memory buffer or, if the content was too large, from the file, then pack it and upload the image.

The in-memory buffering would only be for performance (and security) reasons and would mostly be a nice-to-have feature.
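
As a rough shell sketch of the same idea outside of oras (spool stdin to a temporary file, then let oras compute the digest and push the finished file; the in-memory buffer stage is omitted, and the reference and media type are placeholders):

#!/usr/bin/env sh
set -eu
tmp=$(mktemp)                    # mktemp creates the spool file with user-only permissions
trap 'rm -f "$tmp"' EXIT
cat > "$tmp"                     # stream stdin to disk until EOF
oras push registry.example:5000/cache:v1 "$tmp":application/gzip

Invoked as, for example: tar -czf- . | ./spool-and-push.sh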

@qweeah
Contributor

qweeah commented Nov 29, 2023

Before uploading any blob to a registry, its digest must be specified.
Unless you can get the digest before archiving is done, streamed uploading is not possible.

To circumvent this, oras could buffer the input stream in memory up to, for example, 64 MiB. Once this threshold is reached, oras pauses reading new input, first writes the buffered 64 MiB into a temporary file with narrow access rights, and then resumes reading from the input, piping it directly into the file. After reaching EOF, oras could calculate the digest either from the in-memory buffer or, if the content was too large, from the file, then pack it and upload the image.

The in-memory buffering would only be for performance (and security) reasons and would mostly be a nice-to-have feature.

It's not something oras can circumvent; you cannot get the checksum of the blob before tar finishes writing.

@ProbstDJakob

I know; that is why I proposed the solution with buffering/writing to a temporary file. That way the digest can be calculated after tar finishes, without having to create/delete a temporary file oneself, and streaming is therefore supported.

@qweeah
Contributor

qweeah commented Nov 29, 2023

Yes, the digest calculation can be done while packing, and this optimization has already been applied in oras-go.

The question is that, after getting the digest, the oras CLI still needs to go through the archive file again to upload it.
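
In shell terms, the digest can indeed be computed while the archive is being written, but the finished file still has to be read a second time for the actual upload (sketch; names are placeholders):

tar -cz . | tee image.tgz | sha256sum | cut -d' ' -f1                  # digest is known the moment tar finishes
oras push registry.example:5000/cache:v1 image.tgz:application/gzip    # but the archive is read again here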

@ProbstDJakob

Sorry for the late response. Maybe I do not know enough about how oras works, but wouldn't the proposed solution be equivalent to supplying files as arguments, except that instead of including the files from the arguments, the only file to include is the buffer/temporary file?

Maybe the following pseudo-script will help you understand my suggestion:

uint8[64MiB] buffer;

read(into=buffer, from=stdin);
Readable inputData;

if (peek(stdin) == EOF) {
  inputData = buffer;
} else {
  File tmpFile = tmpFileCreate();
  write(to=tmpFile, from=buffer);
  readAll(into=tmpFile, from=stdin);

  seek(origin=START, offset=0, file=tmpFile);
  inputData = tmpFile;
}

call oras push registry.example:5000 inputData # Yes the CLI is not able to accept buffers, but I hope you get what I intend to say

@qweeah
Contributor

qweeah commented Dec 11, 2023

@ProbstDJakob Apart from the seek operation, what you described is already implemented here.

P.S. I think this discussion has drifted too far from this issue, so I have created #1200 where we can continue.

@ProbstDJakob

ProbstDJakob commented Dec 11, 2023

The following script is a real world example where streaming could come in handy.

Background

We fully manage the life cycle of an OpenShift cluster via a GitLab pipeline. When creating a cluster with the openshift-install tool, some files such as the Terraform state and kubeconfigs are created. Those files are needed during the whole life cycle of the cluster (not only in the current pipeline), so they need to be stored persistently. In our case we use the existing GitLab registry and oras to create an image.

Current way to pull the artifacts from the registry

#!/usr/bin/env sh
set -eu

# [...] some preparations

tempDir="$(mktemp -d)"
oras pull --output "$tempDir" "$ENCRYPTED_OPENSHIFT_INSTALL_ARTIFACTS_IMAGE"
sops --decrypt --input-type binary --output-type binary "$tempDir/openshift-install-artifacts.tar.gz.enc" \
  | tar -xzC "$CI_PROJECT_DIR"
rm -rf "$tempDir"

Possible way to pull the artifacts from the registry with pipelining

#!/usr/bin/env sh
set -eu

# [...] some preparations

oras pull --output - "$ENCRYPTED_OPENSHIFT_INSTALL_ARTIFACTS_IMAGE" \
  | sops --decrypt --input-type binary --output-type binary /dev/stdin \
  | tar -xzC "$CI_PROJECT_DIR"

This way there is no need to create a temporary directory or to know what the file is called within the image (not a problem for us, since we named it within the same repo).

Counterpart

See #1200 (comment)
