support `--output -` to pull to stdout #346
Comments
Looks super interesting. Please open a PR for the proposal.
This is an interesting one. How would you handle multiple files?
Seems like the output of this should be a tgz so multiple files can be supported.
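A hedged sketch of that tgz behavior (`--output -` is the flag being requested here, not an existing option; the reference is a placeholder):

```sh
# hypothetical: stream the artifact as a tgz and list its contents
oras pull localhost:5000/multi-file:v1 --output - | tar -tz
```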
One UX could be: ...
Are there any plans to implement this also for the input, so that something like the following would be possible?

```sh
command-a | command-b | oras push localhost:5000/json-artifact:v1 -
...
oras pull localhost:5000/json-artifact:v1 --output - | jq
```
@ProbstDJakob There is a plan to provide a piped-command user experience in v1.2.0 by standardizing output; see #638. Still, I have questions about the commands below:
Since the layer content comes from stdin, not a file, ...
What if ...
--from-stdin[=file-path[:type]]
oras will read data from stdin and write it to `file-path` within the image. If `file-path` has not
been supplied, it defaults to `./stdin.blob` with the type `application/octet-stream`. This option can
be used in conjunction with other files supplied via `<file>[:type] [...]`, but does not need to be.
The only restriction is that no additional `-` file may be supplied.
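Under that proposal, a push from stdin might look like the following sketch (`--from-stdin` is the proposed flag, not an existing option; the reference and file name are illustrative):

```sh
# hypothetical: store the piped JSON as ./result.json inside the artifact
command-a | oras push localhost:5000/json-artifact:v1 --from-stdin=./result.json:application/json
```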
...
<file>[:type] [...]
The files to include within the image. The special file `-[:type]` is equivalent to using the option
`--from-stdin=./stdin.blob[:type]`, where if no type has been supplied the type
`application/octet-stream` will be used.

Regarding your second question, I am not that familiar with how OCI images work, so I am currently unable to answer it, but I am willing to study the docs and elaborate further if the answer above doesn't solve it implicitly.

For `--to-stdout[:<single|tar>][=file-path,...]`:
Instead of writing the content of the image to a directory, the content will be written to stdout.
When supplying `--to-stdout:single[=file-path]`, the file found at `file-path` within the image will be
written to stdout without converting it to an archive. If no `file-path` has been supplied and the image
contains exactly one file, that file will be written out; otherwise the command will fail. If more than
one `file-path` has been supplied, the command will also fail.
When supplying `--to-stdout:tar[=file-path,...]`, the files found at `file-path,...` will be written to
standard out, combined into an uncompressed tar archive. If no files have been supplied, all files
within the image will be included in the archive.
Aliases:
`--to-stdout=<file-path>` => `--to-stdout:single=<file-path>`
`--to-stdout=<file-path,...>` => `--to-stdout:tar=<file-path,...>`
`--to-stdout` => `--to-stdout:single`
This option is mutually exclusive with the `--output` option. Regarding the penultimate line, I am not quite confident that this is the right choice (defaulting to ...
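To make the aliases concrete, pulls under this sketch might look like the following (hypothetical flags; references and paths are illustrative):

```sh
# hypothetical: single file written as-is, or all files as an uncompressed tar
oras pull localhost:5000/json-artifact:v1 --to-stdout | jq
oras pull localhost:5000/files:v1 --to-stdout:tar | tar -x -C out
```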
Just for the record, I found this solution to stream the content of an artifact to stdout: ...
I pushed the tgz like this: ...
This solves my use case, but it would be great to do that without ...
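A plausible reconstruction of such a workaround, assuming the tgz is pushed as a single layer and its digest is looked up from the manifest afterwards (the reference and file name are placeholders; `oras manifest fetch` and `oras blob fetch --output -` are existing commands):

```sh
oras push ghcr.io/foo/test:0.0.1 machine-image.tgz

# find the digest of the tgz layer, then stream the blob straight to stdout
digest="$(oras manifest fetch ghcr.io/foo/test:0.0.1 | jq -r '.layers[0].digest')"
oras blob fetch --output - "ghcr.io/foo/test:0.0.1@$digest" | tar -tz
```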
@guettli This is very interesting. May I know what's stored inside the tgz? If you provided a folder but not a file:

```sh
oras push ghcr.io/foo/test:0.0.1 --artifact-type application/vnd.foo.machine-image.v1 image  # pack and push all files in folder image
oras pull ghcr.io/foo/test:0.0.1 -o pulled  # pull and unpack files into folder pulled/image
```
@qweeah Thank you for asking. The tgz contains a Linux root file system. We booted Ubuntu on a VM, installed some tools, applied some configuration, and then created a tgz, so that we have a consistent custom image. The image is about 1.8 GByte and contains 100k files. I am happy to store the tgz as a blob in an artifact. Nice to know that oras could be used for tar/untar too, but at the moment I don't see a big benefit. One drawback of the current method: we can't create the artifact via streaming. AFAIK something like this is not supported yet: ...
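Such a streamed push, in the spirit of the `-` syntax suggested earlier in this thread, might hypothetically look like this (not supported today; paths are illustrative):

```sh
# hypothetical: archive the root file system and push it without an intermediate file
tar -cz -C rootfs . | oras push ghcr.io/foo/test:0.0.1 -
```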
@qweeah What benefit would we have if we used oras instead of tar/untar?
Before uploading any blob to a registry, the digest must be specified. Unless you can get the digest before archiving is done, streamed uploading is not possible.
Well, rather than using ...
In order to circumvent this, oras could buffer the input stream in memory up to, for example, 64 MiB; once this threshold is reached, oras pauses reading new input, first writes the 64 MiB into a temporary file with narrow access rights, then resumes reading from the input and pipes it directly into the file. After reaching EOF, oras could calculate the digest, either from the in-memory buffer or, if the content was too large, from the file, pack it, and upload the image. The buffering in memory would only be for performance (and security) reasons and would mostly be a nice-to-have feature.
It's not something oras can circumvent; you cannot get the checksum of the blob before ...
I know, that is why I proposed the solution with buffering/writing to a temporary file. Thus the calculation of the digest can be done after ...
Yes, the digest calculation can be done while packing, and this optimization has already been applied in oras-go. The question is: after getting the digest, the oras CLI still needs to go through the archive file again to upload it.
Sorry for the late response. Maybe I do not know enough about how oras works, but wouldn't the proposed solution be equivalent to supplying files as arguments, except that the only file to include is the buffer/temporary file? Maybe the following pseudo script will help you understand my suggestion:

```
uint8[64MiB] buffer;
read(into=buffer, from=stdin);
Readable inputData;
if (peek(stdin) == EOF) {
    // everything fit into the buffer; use it directly
    inputData = buffer;
} else {
    // spill the buffer into a temporary file, then drain the rest of stdin into it
    File tmpFile = tmpFileCreate();
    write(to=tmpFile, from=buffer);
    readAll(into=tmpFile, from=stdin);
    seek(origin=START, offset=0, file=tmpFile);
    inputData = tmpFile;
}
call oras push registry.example:5000 inputData  # Yes, the CLI cannot accept buffers, but I hope you get what I intend to say
```
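A minimal shell analogue of the same spooling idea, assuming a temp file is acceptable (the reference is a placeholder; `oras push <ref> <file>` is the existing CLI form):

```sh
tmp="$(mktemp)"                    # temp file with narrow permissions
trap 'rm -f "$tmp"' EXIT
cat > "$tmp"                       # drain stdin into the temp file
sha256sum "$tmp"                   # the digest is now computable before upload
oras push registry.example:5000/repo:tag "$tmp"
```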
@ProbstDJakob Apart from the seek operation, what you described is already implemented here. P.S. I think this discussion has gone too far from this issue, so I have created #1200 and we can continue there.
The following script is a real-world example where streaming could come in handy.

Background

We fully manage the life cycle of an OpenShift cluster via a GitLab pipeline. When creating a cluster with the ...

Current way to pull the artifacts from the registry

```sh
#!/usr/bin/env sh
set -eu

# [...] some preparations

tempDir="$(mktemp -d)"
oras pull --output "$tempDir" "$ENCRYPTED_OPENSHIFT_INSTALL_ARTIFACTS_IMAGE"
sops --decrypt --input-type binary --output-type binary "$tempDir/openshift-install-artifacts.tar.gz.enc" \
  | tar -xzC "$CI_PROJECT_DIR"
rm -rf "$tempDir"
```

Possible way to pull the artifacts from the registry with pipelining

```sh
#!/usr/bin/env sh
set -eu

# [...] some preparations

oras pull --output - "$ENCRYPTED_OPENSHIFT_INSTALL_ARTIFACTS_IMAGE" \
  | sops --decrypt --input-type binary --output-type binary /dev/stdin \
  | tar -xzC "$CI_PROJECT_DIR"
```

This way there is no need to create a temporary directory or to know what the file is called within the image (not a problem for us, since we named it within the same repo).

Counterpart

See #1200 (comment)
My use case is to rely on the oras CLI to restore a data cache stored as a tar.gz. On pull, I'd like to pipe the downloaded artifact directly to `tar xz`.
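Under the requested flag, that restore step might look like the following sketch (`--output -` is the feature requested in this issue, not an existing option; the reference and target directory are placeholders):

```sh
# hypothetical: stream the cache artifact and unpack it in one pipeline
oras pull --output - registry.example/cache:build-deps | tar -xz -C "$CACHE_DIR"
```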