schema: Use Validators map and prepare to extend beyond JSON Schema #403

Closed

wking
Contributor

@wking wking commented Oct 20, 2016

With image-tools split off into its own repository, the plan seems to be to keep all intra-blob JSON validation in this repository and to move all other validation (e.g. for layers or for walking Merkle trees) into image-tools. All the non-validation logic currently in image/ is moving into image-tools as well.

Some requirements (e.g. multi-parameter checks like allowed OS/arch pairs) are difficult to handle in JSON Schema but easy to handle in Go. And callers won't care if we're using JSON Schema or not; they just want to know if their blob is valid.
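To illustrate the kind of multi-parameter check meant here, the following is a minimal sketch of an OS/arch pair check in plain Go. The `platform` struct, the `validatePlatform` function, and the contents of `allowedPairs` are hypothetical and only for illustration; the table is not a list taken from the spec.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// platform holds the two fields that must be checked together.
// It is a hypothetical stand-in, not a type from this repository.
type platform struct {
	OS           string `json:"os"`
	Architecture string `json:"architecture"`
}

// allowedPairs is an illustrative subset, not a list taken from the spec.
var allowedPairs = map[string]map[string]bool{
	"linux":   {"amd64": true, "arm64": true, "386": true},
	"windows": {"amd64": true, "386": true},
}

// validatePlatform decodes the blob and rejects OS/arch combinations
// that are missing from allowedPairs.
func validatePlatform(blob []byte) error {
	var p platform
	if err := json.Unmarshal(blob, &p); err != nil {
		return fmt.Errorf("decode config: %v", err)
	}
	// Indexing a missing OS yields a nil map, which safely reports false.
	if !allowedPairs[p.OS][p.Architecture] {
		return fmt.Errorf("unsupported os/architecture pair %q/%q", p.OS, p.Architecture)
	}
	return nil
}

func main() {
	fmt.Println(validatePlatform([]byte(`{"os":"linux","architecture":"amd64"}`)))
	fmt.Println(validatePlatform([]byte(`{"os":"windows","architecture":"mips64"}`)))
}
```

Expressing the same constraint in JSON Schema would need one branch per allowed combination, whereas here it is a single table lookup.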

This commit restructures intra-blob validation to ease the path going forward (although it doesn't actually change the current validation significantly). The old method:

func (v Validator) Validate(src io.Reader) error

is now a new Validator type:

type Validator func(blob io.Reader, descriptor *v1.Descriptor, strict bool) (err error)

and instead of instantiating an old Validator instance:

schema.MediaTypeImageConfig.Validate(reader)

there's a Validators registry mapping from the media type strings to the appropriate Validator instance (which may or may not use JSON Schema under the hood). And there's a Validate function (with the same Validator interface) that looks up the appropriate entry in Validators for you so you have:

schema.Validate(reader, descriptor, true)

By using a Validators map, we make it easy for library consumers to register (or override) intra-blob validators for a particular type. Locations that call Validate(…) will automatically pick up the new validators without needing local changes.
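A rough sketch of how such a registry and lookup could fit together is shown here. The `Descriptor` struct is a simplified stand-in for `v1.Descriptor`, and `validateNonEmpty`, `validateString`, and the `application/vnd.example.*` media types are made up for the example; none of them are part of this PR.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// Descriptor is a simplified, hypothetical stand-in for v1.Descriptor.
type Descriptor struct {
	MediaType string
	Digest    string
	Size      int64
}

// Validator has the function signature described in this PR.
type Validator func(blob io.Reader, descriptor *Descriptor, strict bool) error

// Validators maps media type strings to validators. Library consumers
// can add or override entries.
var Validators = map[string]Validator{}

// Validate looks up the validator registered for the descriptor's
// media type and runs it.
func Validate(blob io.Reader, descriptor *Descriptor, strict bool) error {
	if descriptor == nil {
		return errors.New("nil descriptor")
	}
	validator, ok := Validators[descriptor.MediaType]
	if !ok {
		return fmt.Errorf("no validator registered for media type %q", descriptor.MediaType)
	}
	return validator(blob, descriptor, strict)
}

// validateNonEmpty is a toy validator used only for this illustration.
func validateNonEmpty(blob io.Reader, _ *Descriptor, _ bool) error {
	data, err := io.ReadAll(blob)
	if err != nil {
		return err
	}
	if len(data) == 0 {
		return errors.New("empty blob")
	}
	return nil
}

// validateString is a convenience wrapper for the calls in main.
func validateString(mediaType, body string) error {
	return Validate(strings.NewReader(body), &Descriptor{MediaType: mediaType}, true)
}

func main() {
	const exampleType = "application/vnd.example.blob+json" // made-up media type
	Validators[exampleType] = validateNonEmpty              // consumer registers a validator

	fmt.Println(validateString(exampleType, "{}"))
	fmt.Println(validateString("application/vnd.example.unknown", "{}"))
}
```

Because callers only go through `Validate`, registering a new entry changes behavior everywhere without touching the call sites.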

All of the old validation was based on JSON Schema, so currently all Validators values are ValidateJSONSchema. As the schema package grows non-JSON-Schema validation, entries will start to look like:

var Validators = map[string]Validator{
  v1.MediaTypeImageConfig: ValidateConfig,
  …
}

although ValidateConfig will probably use ValidateJSONSchema internally.

By passing through a descriptor, we get a chance to validate the digest and size (which we were not doing before). Digest and size validation for a byte array are also exposed directly (as ValidateByteDigest and ValidateByteSize) for use in validators that are not based on ValidateJSONSchema. Access to the digest also gives us a way to print specific error messages on failures. In situations where you don't know the blob digest, the new DigestByte will help you calculate it (for a byte array).
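As a sketch of what these byte-array helpers might look like, the following uses only the standard library and supports only sha256. The `DigestByte` signature matches the one in this PR's diff; the signatures of `ValidateByteDigest` and `ValidateByteSize` are assumptions, and an existing digest package may well be a better fit than hand-rolled hashing.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// DigestByte computes an "algorithm:hex" digest. This sketch only
// supports sha256.
func DigestByte(data []byte, algorithm string) (digest string, err error) {
	if algorithm != "sha256" {
		return "", fmt.Errorf("unsupported digest algorithm %q", algorithm)
	}
	sum := sha256.Sum256(data)
	return algorithm + ":" + hex.EncodeToString(sum[:]), nil
}

// ValidateByteDigest recomputes the digest with the algorithm named in
// expected and compares the result. The signature is an assumption.
func ValidateByteDigest(data []byte, expected string) error {
	i := strings.Index(expected, ":")
	if i < 0 {
		return fmt.Errorf("malformed digest %q", expected)
	}
	actual, err := DigestByte(data, expected[:i])
	if err != nil {
		return err
	}
	if actual != expected {
		return fmt.Errorf("digest mismatch: expected %s, got %s", expected, actual)
	}
	return nil
}

// ValidateByteSize compares the blob length with the descriptor size.
// The signature is an assumption.
func ValidateByteSize(data []byte, expected int64) error {
	if int64(len(data)) != expected {
		return fmt.Errorf("size mismatch: expected %d bytes, got %d", expected, len(data))
	}
	return nil
}

func main() {
	blob := []byte(`{"schemaVersion": 2}`)
	digest, err := DigestByte(blob, "sha256")
	if err != nil {
		panic(err)
	}
	fmt.Println(digest)
	fmt.Println(ValidateByteDigest(blob, digest))
	fmt.Println(ValidateByteSize(blob, int64(len(blob)+1)))
}
```

Including both the expected and the actual digest in the error is what makes the failure messages specific.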

There is also a new strict parameter to distinguish between compliant images (which should only pass when strict is false) and images that only use features which the spec requires implementations to support (which should pass regardless of strict). The current JSON Schemas are not strict, and I expect we'll soon gain Go code to handle the distinction (e.g. #341). So the presence of strict in the Validator type is future-proofing our API and not exposing a currently-implemented feature.

I've made the minimal sane changes to cmd/ and image/, because we're dropping them from this repository (and continuing them in image-tools).

@wking
Contributor Author

wking commented Oct 21, 2016

Rebased around #337 with bd8ec26 → e3017d7.

@wking
Contributor Author

wking commented Oct 21, 2016

I expect the Travis error is due to the Dyn DDoS issues. We should kick Travis after those have been resolved.

@wking
Contributor Author

wking commented Nov 1, 2016 via email


// DigestByte computes the digest of a blob using the requested
// algorithm.
func DigestByte(data []byte, algorithm string) (digest string, err error) {
Contributor

If only there were an existing, well-tested package that provides a mature implementation of this functionality...

Contributor

do you have a more specific suggestion here?

@stevvooe
Contributor

@wking Is this still relevant?

@wking
Contributor Author

wking commented Jan 19, 2017 via email

With image-tools split off into its own repository, the plan seems to
be to keep all intra-blob JSON validation in this repository and to
move all other validation (e.g. for layers or for walking Merkle
trees) into image-tools [1].  All the non-validation logic currently in
image/ is moving into image-tools as well [2].

Some requirements (e.g. multi-parameter checks like allowed OS/arch
pairs [3]) are difficult to handle in JSON Schema but easy to handle
in Go.  And callers won't care if we're using JSON Schema or not; they
just want to know if their blob is valid.

This commit restructures intra-blob validation to ease the path going
forward (although it doesn't actually change the current validation
significantly).  The old method:

  func (v Validator) Validate(src io.Reader) error

is now a new Validator type:

  type Validator func(blob io.Reader, descriptor *v1.Descriptor, strict bool) (err error)

and instead of instantiating an old Validator instance:

  schema.MediaTypeImageConfig.Validate(reader)

there's a Validators registry mapping from the media type strings to
the appropriate Validator instance (which may or may not use JSON
Schema under the hood).  And there's a Validate function (with the
same Validator interface) that looks up the appropriate entry in
Validators for you so you have:

  schema.Validate(reader, descriptor, true)

By using a Validators map, we make it easy for library consumers to
register (or override) intra-blob validators for a particular type.
Locations that call Validate(...) will automatically pick up the new
validators without needing local changes.

All of the old validation was based on JSON Schema, so currently all
Validators values are ValidateJSONSchema.  As the schema package grows
non-JSON-Schema validation, entries will start to look like:

  var Validators = map[string]Validator{
    v1.MediaTypeImageConfig: ValidateConfig,
    ...
  }

although ValidateConfig will probably use ValidateJSONSchema
internally.

By passing through a descriptor, we get a chance to validate the
digest and size (which we were not doing before).  Digest and size
validation for a byte array are also exposed directly (as
ValidateByteDigest and ValidateByteSize) for use in validators that
are not based on ValidateJSONSchema.  Access to the digest also gives
us a way to print specific error messages on failures.

There is also a new 'strict' parameter to distinguish between
compliant images (which should always pass when strict is false) and
images that only use features which the spec requires implementations
to support (the only images that should pass when strict is true).
The current JSON Schemas are not strict, but the config/layer media
type checks in ValidateManifest exercise this distinction.

Also use go-digest for local hashing now that we're vendoring it.

[1]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2016/opencontainers.2016-10-12-21.01.log.html#l-71
[2]: opencontainers#337
[3]: https://tools.ietf.org/html/draft-fge-json-schema-validation-00#section-5.5.5
[4]: opencontainers#341

Signed-off-by: W. Trevor King <[email protected]>
@wking
Contributor Author

wking commented Jan 22, 2017

Rebased onto master and fixed two lint errors with 3218e5b → f2b9500.

@stevvooe
Contributor

Closing as this is really the wrong direction. One should be able to do this:

if err := MediaTypeManifest.Validate(r); err != nil {
  // handle errors
}

@stevvooe stevvooe closed this Jan 25, 2017
@wking
Contributor Author

wking commented Jan 25, 2017 via email

@stevvooe
Contributor

@wking Why would validation take a descriptor? You already know the type. Just validate it.

And since when did we have a strict bool? That makes no sense. Hasn't that kind of thing been considered bad practice for at least a decade now?

If we want different levels of validation, create a type-switched validation train that can be pulled over the content:

type Validators map[MediaType]func(r io.Reader, validators ...Validators) error

var Strict = Validators{
  // .. define validators
}

MediaTypeManifest.Validate(r, Strict, Log)

Effectively, this allows you to traverse a tree of validators, dispatching various types for each validator, while also allowing extension.

This PR allows none of this and is pretty static.

@wking
Contributor Author

wking commented Jan 26, 2017 via email

@stevvooe
Contributor

@wking The descriptor should already have been verified against the content. Collapsing these two layers is a HUGE design mistake and I have no more patience in arguing this point for point. There may be a few adjustments, but the suggested approach meets all of the requirements. I apologize for putting it this way, but either take direction from the maintainers or have your PRs closed.

@wking
Contributor Author

wking commented Jan 26, 2017 via email
