artifact metadata #156
Comments
@taskcluster/services-reviewers please share your feedback!
I realized this may be ambiguous. There's the gzipped content length, the multipart upload content lengths, and the filesize on disk. We care about the filesize on disk, so perhaps
Can someone confirm this assumption?
It's something we should flesh out a little here. Two parts:
Right, I remember the discussion point now. This isn't an immutable data structure so much as an append-only database enforced via Grants. 👍
I'm going to start working on restructuring this issue into an RFC.
Artifact Metadata
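The goal is to provide Artifact Integrity guarantees from the point the worker uploads an artifact to the point that someone downloads that artifact to use. We can do this by adding artifact metadata to the Queue, ensuring that metadata can't be modified once it's written, and providing a download tool that verifies downloads against that metadata.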
Adding artifact metadata to the Queue
First, we add a `metadata` dictionary to the `S3ArtifactRequest` type. This is a dictionary to allow for flexibility of usage. The initial known keys would include
The sha256 field is required for Artifact Integrity. Releng has use cases for all 3 fields, so I'm proposing all 3.
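To make the proposal concrete, here is a minimal sketch of how a worker might build such a request. The key names `ContentSha256`, `ContentSha512`, and `ContentFilesize` are assumptions for illustration (only the sha256 field and `ContentSha256WorkerSignature` are named in this issue), and the helper functions are hypothetical. The non-metadata fields reflect my understanding of the existing S3 artifact request and should be checked against the Queue schema.

```python
import hashlib


def build_artifact_metadata(path, chunk_size=1024 * 1024):
    """Hash the file on disk and return a metadata dictionary.

    The key names below are assumed for illustration; they are not
    confirmed by the Queue schema.
    """
    sha256 = hashlib.sha256()
    sha512 = hashlib.sha512()
    size = 0
    with open(path, "rb") as f:
        # Stream in chunks so large artifacts are not loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha256.update(chunk)
            sha512.update(chunk)
            size += len(chunk)
    return {
        "ContentSha256": sha256.hexdigest(),
        "ContentSha512": sha512.hexdigest(),
        "ContentFilesize": size,  # filesize on disk, in bytes
    }


def build_s3_artifact_request(path, content_type, expires):
    """Return a request body with the proposed `metadata` dictionary.

    Hypothetical helper: only the `metadata` entry is what this issue
    proposes; the other fields are assumed from the existing request.
    """
    return {
        "storageType": "s3",
        "contentType": content_type,
        "expires": expires,
        "metadata": build_artifact_metadata(path),
    }
```

Because `metadata` is a plain dictionary, a later key such as `ContentSha256WorkerSignature` could be added without changing the shape of the request.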
A future entry may be `ContentSha256WorkerSignature`, once we solve worker identity.
(Optionally we could also add a `metadata` dictionary to the `ErrorArtifactRequest` (error summary?) and `RedirectArtifactRequest` (live log socket info?) types, but it's not clear if we want or need those at this time.)
We could add a `Queue.getArtifactInfo` endpoint that returns the URL and metadata.
Ensuring that metadata can't be modified once it's written
I'm under the impression this will Just Work, given the nature of the Queue.
Providing a download tool
This is probably a thin wrapper around the taskcluster client library that gets the metadata of the artifact, downloads it, and verifies any shas. We should allow for optional and required metadata fields, and fail out if any required information is missing or if the sha doesn't match. We should be sure to measure the shas and filesizes on the right artifact state (e.g. after combining a multipart artifact, and on the uncompressed content unless the original artifact was itself compressed).
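The verification step could look roughly like the sketch below. It assumes the artifact has already been downloaded to a local path and that its metadata has been fetched as a dictionary. The key names `ContentSha256` and `ContentFilesize`, the `transfer_gzipped` flag, and every function name here are hypothetical, and worker signatures are not covered.

```python
import gzip
import hashlib


class ArtifactVerificationError(Exception):
    """Raised when required metadata is missing or a value does not match."""


def measure(path, transfer_gzipped=False, chunk_size=1024 * 1024):
    """Measure sha256 and size of the original artifact content.

    If the file on disk is still gzipped only because of transfer
    encoding, decompress while reading so we hash the original bytes.
    """
    opener = gzip.open if transfer_gzipped else open
    sha256 = hashlib.sha256()
    size = 0
    with opener(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha256.update(chunk)
            size += len(chunk)
    return {"ContentSha256": sha256.hexdigest(), "ContentFilesize": size}


def verify_artifact(path, metadata, required=("ContentSha256",),
                    optional=("ContentFilesize",), transfer_gzipped=False):
    """Check a downloaded file against its metadata (assumed key names)."""
    missing = [key for key in required if key not in metadata]
    if missing:
        raise ArtifactVerificationError(f"missing required metadata: {missing}")
    measured = measure(path, transfer_gzipped)
    for key in tuple(required) + tuple(optional):
        if key not in metadata:
            continue  # an optional field the uploader did not provide
        if key not in measured:
            if key in required:
                raise ArtifactVerificationError(f"cannot verify {key}")
            continue
        if metadata[key] != measured[key]:
            raise ArtifactVerificationError(
                f"{key} mismatch: expected {metadata[key]!r}, "
                f"got {measured[key]!r}"
            )
    return measured
```

A caller would treat any `ArtifactVerificationError` as a failed download and refuse to use the file. Making `required` and `optional` parameters lets each consumer decide how strict to be.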
This tool should be usable as a commandline tool, or as a library that the workers can use.
Once we implement worker signatures in artifact metadata, the download tool will verify those signatures as well.
Object Service
The future object service should be compatible with this proposal.
I can create an rfc once we come to an initial consensus here.