Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add json schema support for validating the configuration #69

Merged

Conversation

phlogistonjohn
Copy link
Collaborator

@phlogistonjohn phlogistonjohn commented Mar 8, 2023

Fixes: #67

Add jsonschema support to sambacc. We add a new subdir sambacc/schema that holds the canonical YAML schema files and generated files derived from the YAML sources.
A new tool that can be invoked on the cli by python -m sambacc.schema.tool is added to help manage the schema files and ensure they are kept in sync.

The sambacc.config module is updated to take advantage to schema based validation when the python jsonschema library is available. On the CLI one can request automatic validation (validate if library is present) or require or disable validation. The auto option is the default.

Since this PR is already long-ish I'm avoiding two things:

  • Trying to cover all cases in the schema. I'm sure that there are gaps. This builds the infra and we can fill in some of the gaps over time.
  • Dealing with distro packaging. That'll be done in follow up PR(s) as I need to manully test various scenarios and I have other packaging related TODOs.

@phlogistonjohn
Copy link
Collaborator Author

@dmulder PTAL and provide any initial feedback you have WRT validating the config. I'm pretty happy with this PR now so I'll probably move it to reviewable very soon (after I look at it once more with fresh eyes).

@phlogistonjohn phlogistonjohn force-pushed the jjm-json-schema-1 branch 3 times, most recently from 171002a to 215aeb5 Compare March 13, 2023 15:43
@phlogistonjohn phlogistonjohn changed the title [WIP] Add json schema support for validationg the configuration Add json schema support for validating the configuration Mar 13, 2023
@phlogistonjohn phlogistonjohn marked this pull request as ready for review March 13, 2023 15:48
Copy link
Collaborator

@synarete synarete left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR itself looks fine, but when I try to run tox (tox 3.25.0 on fedora36) it fails (with jsonschema.exceptions.RefResolutionError: unknown url type: ''). Trying to dive into it and figure out the root cause.

@@ -0,0 +1,303 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "mailto:[email protected]",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this e-mail address correct?
IMHO it is perfectly fine to use it here, just wanted to make sure it is intended.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value of the $id field has to be a unique URI according to the spec. Because SINK doesn't have a webpage nor did I want to point it back at a "raw" link on github I just used a mailto uri that just happens to be a valid contact for me. :-)

@@ -0,0 +1,169 @@
#!/bin/python3
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer: #!/usr/bin/env python3

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used to use /iusr/bin/env in my python scripts but moved away from it as I heard that it is not the best approach. Of course, now I want to justify that I am not finding a ton of resources but theres this one from fedora - https://lists.fedoraproject.org/archives/list/[email protected]/message/2PD5RNJRKPN2DVTNGJSBHR5RUSVZSDZI/
(as well as https://blogs.gnome.org/mcatanzaro/2018/02/16/on-python-shebangs/ )

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol. I just noticed that it's still wrong because while I intended to not use '/usr/bin/env' I did not put the correct path in!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting -- thanks for this info.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@phlogistonjohn
Copy link
Collaborator Author

The PR itself looks fine, but when I try to run tox (tox 3.25.0 on fedora36) it fails (with jsonschema.exceptions.RefResolutionError: unknown url type: ''). Trying to dive into it and figure out the root cause.

Locally, I was running tox 3.27.1 also on fedora36. Can you tell me which test env failed? It kinda sounds like it might not like my mailto trick, but I don't see why it would fail for you and not for me or the CI. Maybe you somehow ended up with an older/newer version of the jsonschema library... so can you also see if you have jsonschema installed via rpm and what version?

@phlogistonjohn
Copy link
Collaborator Author

The PR itself looks fine, but when I try to run tox (tox 3.25.0 on fedora36) it fails (with jsonschema.exceptions.RefResolutionError: unknown url type: ''). Trying to dive into it and figure out the root cause.

Locally, I was running tox 3.27.1 also on fedora36. Can you tell me which test env failed? It kinda sounds like it might not like my mailto trick, but I don't see why it would fail for you and not for me or the CI. Maybe you somehow ended up with an older/newer version of the jsonschema library... so can you also see if you have jsonschema installed via rpm and what version?

Failed tests (also when using tox 3.27.1):

tests/test_permissions.py::test_init_handler ERROR 
tests/test_permissions.py::test_always_handler ERROR 

Which testenv(s) though? our tox.ini defines 6 now: formatting, py3-mypy, py39-mypy, py3, py39, schemacheck

I assume it is either py3, py39 or both. They handle dependencies slightly differently (so as to test samba libs when run in the container). Knowing which ones failed may help track down the reason.

@synarete
Copy link
Collaborator

The PR itself looks fine, but when I try to run tox (tox 3.25.0 on fedora36) it fails (with jsonschema.exceptions.RefResolutionError: unknown url type: ''). Trying to dive into it and figure out the root cause.

Locally, I was running tox 3.27.1 also on fedora36. Can you tell me which test env failed? It kinda sounds like it might not like my mailto trick, but I don't see why it would fail for you and not for me or the CI. Maybe you somehow ended up with an older/newer version of the jsonschema library... so can you also see if you have jsonschema installed via rpm and what version?

Failed tests (also when using tox 3.27.1):

tests/test_permissions.py::test_init_handler ERROR 
tests/test_permissions.py::test_always_handler ERROR 

Which testenv(s) though? our tox.ini defines 6 now: formatting, py3-mypy, py39-mypy, py3, py39, schemacheck

I assume it is either py3, py39 or both. They handle dependencies slightly differently (so as to test samba libs when run in the container). Knowing which ones failed may help track down the reason.

My bad -- I used the wrong tox. Now using tox 3.27.1 and it is fine. Strange.

@phlogistonjohn
Copy link
Collaborator Author

@Mergifyio rebase

The check_config_data function is not used outside of config.py. It
is safe to make private so this patch does so. This will make future
changes to the file clearer.

Signed-off-by: John Mulligan <[email protected]>
The word member is not spelled "memeber".

Signed-off-by: John Mulligan <[email protected]>
When working with arbitrary json data dicts it's easier to write
"JSONData".

Signed-off-by: John Mulligan <[email protected]>
Add a new keyword-only initial_data argument to GlobalConfig. This is
primarily for testing so that we can load data directly into the global
configuration object without going through all validations that real
json read from a file does.

Signed-off-by: John Mulligan <[email protected]>
Update a test to use the new initial_data argument of GlobalConfig.
Since this test is checking run time behavior of invalid data it is best
to bypass any other checks that GlobalConfig may apply in the future
when parsing JSON based config.

Signed-off-by: John Mulligan <[email protected]>
Add two schema files, one in yaml and one in json, to the tree.
The YAML file is the file to be edited, but in cases where yaml
is an extra dependency we also provide a conversion to raw json.
These yaml files help describe the configuration and may be later
used with validation tools or documentation generators.

Signed-off-by: John Mulligan <[email protected]>
Add a tool can be run by either a developer or a CI to ensure the schema
files are sync'ed and/or update the generated files.

Signed-off-by: John Mulligan <[email protected]>
Now schema is a real package (based on generated code).

Signed-off-by: John Mulligan <[email protected]>
This tool is only run at "build time" and is not a direct part of the
sambacc library code.

Signed-off-by: John Mulligan <[email protected]>
This uses the jsonschema library to check that the structure of the JSON
config matches the supplied schema.

Signed-off-by: John Mulligan <[email protected]>
Add a --validate-config cli option and SAMBACC_VALIDATE_CONFIG
environment variable to enable/disable (or leave automatic) the json
schmea based validation of the config. Because json schema is new and
there are cases (primarily as a dev hacking on stuff) we give the
ability to turn off the validation. Note - the validation is auto by
default - if the library is not present the validation will be
automatically skipped.

Signed-off-by: John Mulligan <[email protected]>
The require_validation property/method is used to determine if sambacc
commands should enable validation of the config or just try to use
whatever was provided.

Signed-off-by: John Mulligan <[email protected]>
The default behavior of tool.py in the schema dir is to check that the
data in the yaml based files match that in the JSON & py files.
Add a tox rule to invoke the tool to help ensure changes to yaml file
aren't being made without corresponding updates to the others.

Signed-off-by: John Mulligan <[email protected]>
@mergify
Copy link

mergify bot commented Apr 5, 2023

rebase

✅ Branch has been successfully rebased

@phlogistonjohn phlogistonjohn merged commit 56945d7 into samba-in-kubernetes:master Apr 6, 2023
@phlogistonjohn phlogistonjohn deleted the jjm-json-schema-1 branch April 6, 2023 17:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sambacc fails to do any data validation on json config
2 participants