diff --git a/docs/proposals/Verify-Latest-N-Artifacts.md b/docs/proposals/Verify-Latest-N-Artifacts.md
new file mode 100644
index 000000000..efe7240eb
--- /dev/null
+++ b/docs/proposals/Verify-Latest-N-Artifacts.md
@@ -0,0 +1,103 @@
+# Verify only the latest N artifacts
+
+## Problem/Motivation
+
+When configuring a verifier in Ratify, we set the artifact type the verifier should work on. Ratify then uses that verifier to verify every referrer of a given subject that has a matching artifact type.
+In some cases this is the wrong behavior. For instance, a vulnerability artifact is outdated as soon as a newer one is written to the repository, so there is no use in verifying both the new one and the old one.
+
+Verifying all matching artifacts can also cause performance problems: each verification involves requests to pull the artifact manifest and the blobs containing the actual data.
+Previous studies by the Ratify team observed that overloading the registry with requests can lead to errors and throttling. (see: https://ratify.dev/docs/reference/performance)
+
+Given the performance study listed above, and in order to provide the best experience for its users, Ratify should reduce the load it generates on the registry, thus reducing the chance of throttling.
+
+# Proposed Solution
+
+Ratify uses the Referrers API (see: https://github.com/opencontainers/distribution-spec/blob/main/spec.md#listing-referrers) to obtain the list of artifacts attached to a given subject artifact.
+The response body for this request is a generated OCI image index that looks like this:
+
+```json
+{
+  "schemaVersion": 2,
+  "mediaType": "application/vnd.oci.image.index.v1+json",
+  "manifests": [
+    {
+      "mediaType": "application/vnd.oci.image.manifest.v1+json",
+      "size": 1234,
+      "digest": "sha256:a1a1a1...",
+      "artifactType": "application/vnd.example.sbom.v1",
+      "annotations": {
+        "org.opencontainers.image.created": "2022-01-01T14:42:55Z",
+        "org.example.sbom.format": "json"
+      }
+    },
+    {
+      "mediaType": "application/vnd.oci.image.manifest.v1+json",
+      "size": 1234,
+      "digest": "sha256:a2a2a2...",
+      "artifactType": "application/vnd.example.signature.v1",
+      "annotations": {
+        "org.opencontainers.image.created": "2022-01-01T07:21:33Z",
+        "org.example.signature.fingerprint": "abcd"
+      }
+    },
+    {
+      "mediaType": "application/vnd.oci.image.index.v1+json",
+      "size": 1234,
+      "digest": "sha256:a3a3a3...",
+      "annotations": {
+        "org.opencontainers.image.created": "2023-01-01T07:21:33Z"
+      }
+    }
+  ]
+}
+```
+
+The API requires the image index response to include each artifact's annotations, which gives Ratify the opportunity to perform some basic filtering before invoking the verifier on the listed artifacts. The exact filtering mechanism should be specific to each verifier, because the behavior and logic differ based on what is actually being attested; it will therefore be part of the verifier configuration.
+
+## Latest N artifacts only verification
+
+Assuming artifacts are generated by ORAS, they all have an `org.opencontainers.image.created` annotation that marks the creation date of the artifact. Based on it, Ratify could filter out stale artifacts and evaluate only the latest ones. To achieve this, Ratify would have to read all the referrers, order them by artifact age, and pass only the latest N to the corresponding verifier.
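
The selection step described above can be sketched in Go as follows. This is a minimal illustration, not Ratify's implementation: `Descriptor`, `createdAt`, and `LatestN` are hypothetical names, and the sketch assumes that a missing or unparsable creation annotation makes an artifact sort as the oldest, and that a non-positive `n` means no limit was configured.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// createdAnnotation is the OCI annotation that carries the creation time.
const createdAnnotation = "org.opencontainers.image.created"

// Descriptor is a hypothetical, trimmed-down referrer descriptor.
// It is not Ratify's actual type.
type Descriptor struct {
	Digest       string
	ArtifactType string
	Annotations  map[string]string
}

// createdAt returns the creation time of d. A missing or non-RFC 3339
// annotation yields the zero time, which sorts before any valid timestamp.
func createdAt(d Descriptor) time.Time {
	t, err := time.Parse(time.RFC3339, d.Annotations[createdAnnotation])
	if err != nil {
		return time.Time{}
	}
	return t
}

// LatestN returns the referrers of the given artifact type, newest first,
// truncated to n entries. Assumption: n <= 0 means "no limit".
func LatestN(referrers []Descriptor, artifactType string, n int) []Descriptor {
	var matched []Descriptor
	for _, d := range referrers {
		if d.ArtifactType == artifactType {
			matched = append(matched, d)
		}
	}
	// Stable sort keeps registry order among artifacts with equal timestamps.
	sort.SliceStable(matched, func(i, j int) bool {
		return createdAt(matched[i]).After(createdAt(matched[j]))
	})
	if n > 0 && len(matched) > n {
		matched = matched[:n]
	}
	return matched
}

func main() {
	const sbom = "application/vnd.example.sbom.v1"
	referrers := []Descriptor{
		{Digest: "sha256:old", ArtifactType: sbom,
			Annotations: map[string]string{createdAnnotation: "2022-01-01T07:21:33Z"}},
		{Digest: "sha256:unlabeled", ArtifactType: sbom},
		{Digest: "sha256:new", ArtifactType: sbom,
			Annotations: map[string]string{createdAnnotation: "2023-01-01T07:21:33Z"}},
	}
	// Print the digests of the two most recent SBOM referrers.
	for _, d := range LatestN(referrers, sbom, 2) {
		fmt.Println(d.Digest)
	}
}
```

Sorting the full matched list keeps the sketch short; it corresponds to fetching the whole referrer list before any verifier is invoked.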
+
+This kind of filtering strategy is best used on artifacts that change rapidly, for example Vulnerability Assessment artifacts, which are immediately outdated once a new artifact is pushed to the registry.
+
+* An artifact without the creation annotation is considered to be the oldest.
+* The annotation value should be a date-time string in RFC 3339 format; any other value is invalid, and the artifact should be treated as the oldest.
+
+## User experiences
+
+This section describes how users interact with Ratify under the proposed solution. In summary, the proposed solution allows filtering artifacts based on annotations, so the following section describes how the customer would configure the filtering.
+
+Because filtering is unique to each verifier, it should be configured on the verifier itself. To maintain backward compatibility, if no filtering is configured the default behavior is to evaluate all artifacts.
+
+### Defining artifact age based filtering
+
+To support artifact age based filtering, we would add an additional field to the verifier configuration:
+
+```yaml
+apiVersion: config.ratify.deislabs.io/v1beta1
+kind: Verifier # NamespacedVerifier has the same spec.
+metadata:
+  name: test-verifier
+spec:
+  name: # REQUIRED: [string], the unique type of the verifier (notation, cosign)
+  artifactType: # REQUIRED: [string], comma separated list, artifact type this verifier handles
+  verifyLastNArtifacts: # OPTIONAL: [int], the number of attached artifacts to verify. Only the latest N are verified; if not defined, all artifacts are verified.
+  address: # OPTIONAL: [string], Plugin path, defaults to value of env "RATIFY_CONFIG" or "~/.ratify/plugins"
+  version: # OPTIONAL: [string], Version of the external plugin, defaults to 1.0.0. On ratify initialization, the specified version will be validated against the supported plugin version.
+  source:
+    artifact: # OPTIONAL: [string], Source location to download the plugin binary, learn more at docs/reference/dynamic-plugins.md e.g. wabbitnetworks.azurecr.io/test sample-verifier-plugin:v1
+  parameters: # OPTIONAL: [object] Parameters specific to this verifier
+```
+
+### Implementation Considerations
+To implement "Last N" verification, Ratify has to be aware of all the attached artifacts of a given kind before handing them to a verifier that wishes to attest only the latest artifacts. Implementing this behavior requires some modification to the Ratify executor and to the verifier implementation.
+
+Below are two proposals which are currently being considered for implementation.
+| Approach | Pros | Cons | Notes |
+| -------- | ---- | ---- | ----- |
+| 1. Obtain and store the entire referrer list.<br>2. Sort it in descending order.<br>3. Use the `CanVerify` method to make sure a verifier that only wants the latest artifact is invoked once. | Simple (naive) implementation.<br><br>Requires no major change to the executor other than fetching the list beforehand.<br><br>Transparent to verifiers that do not use latest-only verification. | The referrer list can be of arbitrary size, so fetching the entire list may cause Ratify to hit a hard memory limit and crash.<br>To implement the feature this way, Ratify would have to limit the number of attached artifacts it supports to some constant, to be determined during implementation.<br><br>Additional latency for sorting the artifacts. | A test index with ~1000 artifacts, each carrying two annotations (the created timestamp and another text field), weighs roughly 400 KB. A default Ratify installation has 512 MB of RAM, so this is well within the limits of 'normal' use. |
+| 1. Split verifiers into two groups: those which require only the latest artifacts, and those which operate on all artifacts.<br>2. For verifiers that work on all artifacts, no change will be made.<br>3. For verifiers that require only the last N artifacts, the executor will manage a map between the verifier and an artifact descriptor list that holds the "current candidates" for being the latest.<br>4. As the executor iterates over the referrers, the candidate list is updated whenever a newer artifact is discovered.<br>Once the executor has finished iterating over the referrer list, it executes each verifier that requires the latest N artifacts against that verifier's "current candidates" list, which is guaranteed to hold the latest artifacts. | Does not modify Ratify's current scalability. | Requires the executor to be aware of the verifier type, possibly through an interface change in the verifier API.<br><br>Requires changes in multiple places in the executor, including running the verifier loop a second time for the verifiers that only require the latest artifacts. | The benefit of not pulling all the referrers, from the standpoint of keeping the same 'memory footprint', is not clear. |
+
+# References
+
+* [Ratify Performance at Scale Study](https://ratify.dev/docs/reference/performance)
+* [Referrers API in Distribution Spec](https://github.com/opencontainers/distribution-spec/blob/main/spec.md#listing-referrers)
\ No newline at end of file