Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add JSON Schema for 'canonical-data.json' and update the README.md #602

Merged
merged 2 commits into from
Mar 5, 2017
Merged

Add JSON Schema for 'canonical-data.json' and update the README.md #602

merged 2 commits into from
Mar 5, 2017

Conversation

rbasso
Copy link
Contributor

@rbasso rbasso commented Feb 18, 2017

In #336 (canonical-data.json standardisation discussion), it was recently discussed a JSON Schema to capture the fundamental structure needed for canonical-data.json files.

After a lot of discussions and a few iterations, the resulting schema proposed seems stable and mature enough to become at least a recommended standard, IMHO.

This proposal was designed with the following goals:

  • Keep/improve human-readability.
  • Keep it flexible enough to allow designing any reasonable test suite.
  • Make it more regular and machine-readable.

There where naturally some trade-offs, as it is impossible to satisfy these objectives completely, but I think we got a good balance here.

To get a first impression of what is this about, in this new schema, a simple test suite would look like this:

{ "exercise": "foobar"
, "version" : "1.0.0"
, "cases":
    [ { "description": "Foo'ing a word returns it reversed"
      , "property"   : "foo"
      , "input"      : "lion"
      , "expected"   : "noil"
      }
    , { "description": "Bar'ing a name returns its parts combined"
      , "property"   : "bar"
      , "firstName"  : "Alan"
      , "lastName"   : "Smithee"
      , "expected"   : "ASlmainthee"
      }
    ]
}

Of course, there are more features, like comments, test groups and error signaling!

We expect that this change will soon allow us to:

  • Automatically verify the canonical-data.json files.
  • Simplify test generation, allowing more code to be shared among all exercises.
  • Pave the way for future standardization. As our understanding of test suite design improves and new best practices emerge, it will be easy to patch this schema to capture even more structure.

We know that this proposal will not make it possible to generate a test suite in a fully automatic way. To allow that we would have to significantly change all the test suites, and there was no agreement about how to do it or even if that is desirable.

Considering that this is a very sensible and highly debated topic, we shouldn't merge this PR until we are sure that enough people have a clear understanding of it and think this is an improvement over what we currently have.

We would like to know what do you think about this proposal. So, after studying it for a while, please...

  • Say ❤️ if you think this is great!
  • Say 👍 if you think this would make x-common better.
  • Say 👎 if you think this would make x-common worse.
  • If you have any question about it, please ask!

Also, if you think this isn't good the way it is, please consider helping us improve it!

Closes #336.

README.md Outdated
[ " Test cases can be arbitrarily grouped with a description "
, " to make organization easier. "
]
, "description": "Abnormal input: empty strings and numbers"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it perhaps make sense to also have an example with multiple input values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right, @ErikSchierboom! I'll fix that soon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just changed the previous case to use two properties, firstName and lastName, as input. Was this what you had in mind, @ErikSchierboom?

Copy link
Member

@ErikSchierboom ErikSchierboom Feb 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, I just thought to pass an array to input, but I think this is more readable as you now have input value names. Furthermore, if one was to use an array to specify multiple input values, it would become odd to pass an array.

One other option might be to make the content of input be an object. So something like this:

  • test method with no parameters: "input": {}
  • test method with one parameter: "input": { "name": "john" }
  • test method with more than one parameter: "input": { "name": "john", "age": 18 }
  • test method with one parameter that is an array: "input": { "allergies": ["fish", "peanuts"] }

We can then make input be a required field, which helps ease parsing the canonical data by test generators. What do you think about this option?

Copy link
Contributor Author

@rbasso rbasso Feb 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It really makes sense but, considering that we are intentionally not standardizing the input data for now, I used distinct key names to showcase that this is possible. This way people will not be mislead thinking that they need to have an input key in every test. Only description and property are mandatory.

Copy link
Contributor Author

@rbasso rbasso Feb 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid that, if we try to standardize too much at once, more people will start to disagree about details and the chances of getting this approved will go down.

My target is to get a general structure approved, leaving all the details for other discussions.

We already added the restrictions to error, which are great, but they make this PR even harder for people to understand. I'm really not sure if trying to add more things now is good idea...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point taken. Let's leave it as it is then.

Copy link
Contributor Author

@rbasso rbasso Feb 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for the sake of discussion, I found a few possible objections to adding an input object:

  • We don't really need it. All additional keys in a test case are already implicitly inputs to the test.
  • It decreases human-readability, adding an unneeded input key and increasing one level of nesting.
  • Google JSON Style Guide recommends avoiding unneeded objects an favor flatter structures (edit)
  • It would not fit more general property tests well, and we are trying to be as general as possible here. (edit)

Following the same reasoning, it would also make more sense to have either a expected or an error. Maybe we shouldn't have put the error inside expected.

Edit: I think that error is nice the way it is now, and we are only enforcing it's structure if it is present in the expected, so we are not really losing anything with the current proposal, as it still allows other representation for errors. If in the future we decide that it is inconvenient, we can change the recommended encoding, but for now we already agreed on that.

@rbasso
Copy link
Contributor Author

rbasso commented Feb 21, 2017

I forgot to mention @exercism/track-maintainers in the first message.

Copy link
Member

@kotp kotp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a trivial change.

README.md Outdated
, "property" : "bar"
, "firstName" : "Alan"
, "lastName" : "Smithee"
, "expected" : "Aslmainthee"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example suggests more than the description states. Should be "ASlmainthee" I believe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?!

I just kept the first in upper case , because is looks more like a real name.

Anyway, I'll change it if you prefer it the other way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the description states that it returns the parts combined, not modified and combined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed! Thanks! 👍

@stkent
Copy link
Contributor

stkent commented Mar 4, 2017

I wonder if it would be useful to also include a schema version in canonical data? Just an idle thought. But anyway, LGTM!

@catb0t
Copy link
Contributor

catb0t commented Mar 5, 2017

I really love this. I was the lone collaborator on #336 arguing for more machine-readability (to strike a balance with human-readability) and this is absolutely beautiful IMO.

@rbasso
Copy link
Contributor Author

rbasso commented Mar 5, 2017

I wonder if it would be useful to also include a schema version in canonical data? Just an idle thought. But anyway, LGTM!

Do you know what is the recommended way to do it, @stkent? The best I could find was this link.

So, unless there is a more generally accepted versioning style, I'm considering following the directions on that link and add the following to schema:

   "self": { "vendor" : "io.exercism"
           , "name"   : "canonical-data"
           , "format" : "jsonschema"
           , "version": "1-0-0"
           },

What do you think about it?

rbasso added 2 commits March 5, 2017 12:25
In #336 ('canonical-data.json' standardisation
discussion), it was discussed a JSON Schema to capture the
fundamental structure needed for `canonical-data.json` files.

This resulting schema can be used to automatically verify these
files in `x-common`, after they are adapted to match the new
format.
@rbasso
Copy link
Contributor Author

rbasso commented Mar 5, 2017

@stkent, I just updated the PR to include the schema version according to this specification. Of course we can change the format later, but it seems a good idea to start versioning the schema right now.


@catb0t wrote:

I really love this. I was the lone collaborator on #336 arguing for more machine-readability (to strike a balance with human-readability) and this is absolutely beautiful IMO.

I'm glad you liked it!


Despite some minor modifications, in the last 15 days we had enough support for this schema and it seems there are no major changes needed here, so I'm planning merging this PR in a few hours, after taking a last look at it.

@rbasso rbasso merged commit dad6e03 into exercism:master Mar 5, 2017
@rbasso rbasso deleted the canonical-schema branch March 5, 2017 15:46
@rbasso
Copy link
Contributor Author

rbasso commented Mar 5, 2017

Here is a link which can be used to easily validate a canonical-data.json file against the schema.

@ErikSchierboom
Copy link
Member

🎉 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants