Proposal: Add canonical data "identifier" field to separate concern from test "description" #1473
Further to this, it seems that the one description field is being asked to serve two purposes (a test identifier and a comment explaining the case) and ends up serving neither well. It constrains the length of method comments and encourages too-long method names.
Reading more about the schema... https://github.com/exercism/problem-specifications/blob/master/canonical-schema.json and also that "Comments ... can be used almost anywhere" (https://github.com/exercism/problem-specifications/blob/master/README.md#test-data-format-canonical-datajson), to fit the current schema perhaps it would be reasonable for me to shorten the descriptions.
[EDIT:] In practice this is turning out to be a bad idea. I now expect there will be a lot of push-back trying to slim down the descriptions, and conversation will just bog down, going in circles trying to reconcile dual purposes (description and identifier) from the one field. This PR, for example... #1525
Taking your example, one might consider it easier to just adjust the ordering of the sentence, to simplify it and thus shorten it:
This new description retains the meaning of the original description. There is no guarantee that the comments will translate into a track's test suite. The comments are there for the maintainers' benefit in building the test suite for their track. Instead of reworking the schema for the canonical data, I think the first step ought to be seeing if it is possible to refactor the "offensive" descriptions to be less offensive, to be more concise but without making the language too sophisticated. For many people English is not their first language, so one has to be careful not to make the language so efficient that it becomes difficult for non-native English speakers to understand easily. Don't take what I have said as meaning I am against changing the schema, quite the opposite. I just don't think changing the schema is necessarily the solution we need to rush to.
This would also be my preference. The example you presented (bowling) is one of the test suites with the longest descriptions. So maybe if we can shorten those, the problem goes away?
We can certainly see if we can improve the descriptions so they generate better. Also worth noting that some test cases have nested descriptions, and often these repeat what the parent description says rather than refining it. Triangle is an example of this and it should be corrected (along with checking all the other sub-cased examples). (The geek in me did think it would be fun to take a munged description and hit some web service to summarise it - but that's where the angel on my other shoulder shouts... stoppppp.)
Convert long descriptions into comments per #1473
Such a long test is really a comment, not a description to be used as a test identifier. Make it so, per #1473
I feel like modifying the descriptions to serve the purpose of being an identifier is like forcing a square peg into a round hole by cutting off the square corners.
What is "too efficient" is highly judgmental with a high-bike-shedding-factor. I fear slimming down descriptions trying to make them more identifier-like will burn everyone's time better spent on other things. |
I think this is a fair assessment. Perhaps we can take this new field under consideration? cc @ErikSchierboom, #1496
Somehow I'm a bit hesitant to include this new field, even though I understand the reasoning behind it. My main argument against it is that the current setup actually works fine for virtually all exercises/tracks. Why then change it? As far as I understand, you want to include the description in the code, which I think is something that tracks normally don't do (it is not wrong of course, just unusual). Let's gather some more opinions to see if I'm the odd man out here. @iHiD @kytrinyx @rpottsoh @petertseng what do you feel about the proposal to add a new `identifier` field?
FWIW I'm inclined to agree with @ErikSchierboom: the description field works well enough. Adding an `identifier` field doesn't seem necessary to me. Context: the Rust generator does pretty well with a very simple function to convert descriptions to identifiers. The advantage of adding an `identifier` field is that we would sometimes get cleaner identifiers. The disadvantage is that if we add the new field, it imposes a maintenance burden. All tracks' test generators must handle the field, or be in (harmless-for-now) noncompliance with the canonical data spec. Everyone writing a PR updating the canonical data must decide whether or not to include the optional field. All the reviewers have to decide whether or not it's worth bikeshedding it in this case. In my opinion, the cost of this maintenance burden outweighs the benefit of sometimes having cleaner identifiers.
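To make the "simple function" approach concrete for readers who don't run a generator themselves, here is a minimal Ruby sketch of that kind of description-to-identifier conversion. It is an illustration only, not the Rust generator's actual code, and the exact rules (casing, separators) vary per track:

```ruby
# Illustrative only: derive a test identifier from a canonical-data
# "description" by lowercasing and collapsing every run of
# non-alphanumeric characters into a single underscore.
def identifier_for(description)
  description.downcase.gsub(/[^a-z0-9]+/, '_').gsub(/\A_+|_+\z/, '')
end

puts identifier_for(
  'Bonus rolls for a strike in the last frame must be rolled before score can be calculated'
)
# => bonus_rolls_for_a_strike_in_the_last_frame_must_be_rolled_before_score_can_be_calculated
```

Note that a conversion like this keeps every word of the description, so the length of the generated identifier is determined entirely by the length of the description, which is the complaint at the heart of this issue.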
@bencoman's main concern I think is that the cascading descriptions as is are huuuuuge. Their language (Pharo (?)) I think is a visual one, so these huuuuge names show up all over and are not actually very "workable". I think he's trying to minimise the bikeshedding by having both: a name and a description.
And my argument is that instead of doing all this bikeshedding, why not just use shorter descriptions?
Bingo. My #1473 (comment) from the beginning of March.
For those of us unfamiliar with Pharo (me), perhaps a screenshot or something that illustrates how/why long descriptions work in a negative way in that language / development environment. If that has been provided somewhere else then please supply a link. It is easier to say no to any changes than it is to say yes when it isn't easy to appreciate the issue being faced.
Here are a couple of generic Pharo workflow videos for context... |
The first sample I checked... Also that function doesn't help generate sufficiently short strings to make a good identifier. For example...
produces...
That is off point. The full identifier doesn't need to be typed in Pharo where individual tests are run with a single click. The issue is super long method names being awkward in the navigation pane.
Hence @rpottsoh's suggestion to just shorten the descriptions (which really are excessively long). Once again, why should we add a new field that would require maintenance work for many people, for a problem that appears to only really be problematic in one language track and for which there is a relatively easy fix (shorten the descriptions)? P.S. In the time we've spent discussing this new field, we could have done enough bikeshedding to come up with suitably short descriptions, I feel...
Including existing descriptions in generated code comments is a secondary concern.
At least @exercism/scala seems to concatenate nested descriptions into comments.
Yeah I'm also trying that path. Some support there would be nice.
Instead the discussion at PR #1525 will be repeated a dozen times - for each exercise needing shorter descriptions.
I think this comes down to: what is the true intent of having problem descriptions in a machine-readable format? Why not just have an ambiguous, descriptive text description and be done with it? I had understood it was to ease automatic generation of consistent descriptions across languages - at least the intent of tests could be consistent across languages regardless of implementation. You could read a test name and understand what you had to do. This is bollocks - it's badly conceived and should be sorted. The descriptions have grown organically without much thought to autogeneration and we are left to muddle through? This isn't a language problem or IDE problem at all! Can someone clearly, in pseudo code, tell me how to unambiguously generate clear and concise test names for all the exercises without having to adjust anything? Ideally so I can read the same test name in any language and compare the unique implementation in that language to any other? If so, we will change Pharo to that - but I suspect it's not the case; we've all worked around this nonsense and it's a damn shame. It takes the steam out of wanting to improve this, in my opinion; I've certainly backed away - I'll solve a few more interesting problems with the descriptions we've got and maybe create a few custom exercises, but it saps too much energy to correct things upstream in my opinion. The interesting thing was that problems had a definition, but this looks rather accidental from my experience so far.
The pseudo code should be as simple as:

- for every "case" (descending in pre-order), concatenate the *adjusted* description

*Adjusted description: the following additional rules are a bit empirical (and shouldn't be needed if readability was considered when nesting cases):

- if the description is all digits, prepend "at "

The issue is typically nested cases where the author hasn't thought about a generation algorithm (possibly the above) and needlessly repeats information, so concatenation is repetitive and overly verbose. Beyond a certain size, I don't think students actually read the test name, or they struggle to find it useful (regardless of the language or IDE - so you have to manually customise it). We've painfully corrected some specifications; there are still many that are sub-optimal (look at robot-simulator, which starts out ok and then gets a bit silly).
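For concreteness, here is a minimal Ruby sketch of the pseudo code above. The input hash mirrors the canonical-data.json nesting; the rule that skips a child description which merely repeats its parent is an assumption added only to illustrate the repetition problem described above, not a rule from the spec:

```ruby
# Sketch: walk the cases in pre-order and build each test name from the
# "adjusted" descriptions along the path to a leaf case.
def adjusted(description)
  # Empirical rule from the comment above: all-digit descriptions read
  # better with "at " prepended.
  description.match?(/\A\d+\z/) ? "at #{description}" : description
end

def test_names(node, path = [])
  # The top-level document has no description of its own.
  unless node['description']
    return (node['cases'] || []).flat_map { |c| test_names(c, path) }
  end
  # Assumption (not in the spec): skip a description that repeats its parent.
  path += [adjusted(node['description'])] unless path.last == node['description']
  return node['cases'].flat_map { |c| test_names(c, path) } if node['cases']
  [path.join(' - ')] # only leaves are individual test cases
end

example = {
  'cases' => [
    { 'description' => 'equilateral triangle',
      'cases' => [{ 'description' => 'all sides are equal' },
                  { 'description' => '2' }] }
  ]
}
puts test_names(example)
# equilateral triangle - all sides are equal
# equilateral triangle - at 2
```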
In this comment, the first section shares my view of our current purpose for descriptions (maintainer-centric, intended to help us evaluate coverage of test suites). The second section examines the effect that some of these choices (nesting, descriptions) have on various test suites (it appears the original design was only made to support test suites that have nesting; it should be decided what sort of test suites problem-specifications is meant to encourage and make easy). I am sorry to say that because of travel I currently do not have a personal recommendation to provide, but I hope some of this information helps in understanding what the elements of a desirable recommendation would be. First, regarding the intent of descriptions:
I would say I agree with that understanding. I'll give my own take on the previously-quoted README, the quote being:
... actually I'm almost obligated to give my own take on this, since any examination of the history would show that I wrote those sentences in add5995#diff-04c6e90faac2675aa89e2176d2eec7d8, following the discussion in #451 (comment). We have a desire, as collective maintainers of the various canonical-data.json, to retain information about why each test case exists, so that we as maintainers can evaluate what coverage we might lose if a test case is proposed to be removed. What mistaken solutions might then be allowed if we remove a test case? For the sake of coverage of the test suites, I suggest that it is useful to keep that information somewhere, but if the allegation, discussion, and subsequent consensus is that the description is not the right place for it, then perhaps it is time to find a better place for it.

So, given this information we have, with each test case given a description and possibly nested in other groups of cases each with their own description, can we predict how a track's maintainers might choose to render these cases in the track's choice of test suite? It is hard to say, but I have a suspicion the current structure was written with test suites in mind that support some form of nesting/grouping for their tests. As an example, I show the robot-simulator for Haskell (note that Haskell's tests were not generated as far as I can tell, but it is still a fine example to show the nesting): https://travis-ci.org/exercism/haskell/jobs/542405000#L3389-L3435. Only the leaves (items without further items grouped under them) are individual test cases. For convenience it'll also be reproduced at the bottom of this comment. I would say that since guidelines on nesting and descriptions implicitly only considered this style (I don't believe anything was discussed), other styles were left by the wayside.

So I would say something useful is to decide what sorts of test suites the canonical data is meant to encourage, and what sort of information should be shown to the student trying to solve the exercise. Based on that, it is then possible to come up with guidelines on what sorts of nesting and descriptions should be done. Without knowing what we are trying to support and encourage, it is difficult to make recommendations, because we don't understand what the goal is.

Nested descriptions, reproduced here for convenience (but unfortunately without the proper colour-coding shown in the Travis CI link):
@petertseng thanks for a thoughtful reply (and apologies for my frustration). Your explanation clarifies many things, and I think it does confirm that we should offer some pseudo code for track authors about how they can approach generation. A reference implementation that goes as far as spitting out test descriptions and pseudo code for a test would go a long way to helping everyone improve the current specifications, as well as think about what to write for future specs. The robot example you give is much better than the current specification (and more in line with what I would expect to generate). The current one has (and there are lots of other specs like this too): "Where R = Turn Right, L = Turn Left and A = Advance, the robot can follow a series of instructions and end up with the correct position and direction",
If my shortened
I'm not sure how it aligns with the growth of the long descriptions, but I noticed that...
was introduced four years ago and removed two years ago
So it seems the
I myself got confused about the purpose of the "comment" field. In a greenfields scenario a better choice might be "maintainer_notes" and "student_notes" - but it's easy to see that distinction in hindsight. Even though it duplicates/summarises the description.md, the following might fit well in "student_notes".
This is indeed a bit confusing.
Well spotted. I don't know why we changed this. Does anybody have any idea?
* Robot simulator descriptions too long: Convert long descriptions into comments per #1473
#336 seems to be the discussion around it (Ctrl-F'ing
Changing the field name would be a breaking change, but I think this could definitely be clarified in the README.
@bencoman wrote:
I didn't know this. This is a very reasonable concern. To build on your idea, perhaps rather than an
While this issue does not affect any tracks I am involved with personally, in the interest of moving forward with #1492, I would like to know from everyone who is still involved in the discussion of an "identifier" field how they feel about taking another iteration of this proposed schema change:
We may end up with this feature and we may not. But I'd like for it to 1) be designed well, if at all, and 2) not block #1492.
Even if it's useful to group the addition of the other keys together, it's also useful to concentrate on a single case for the first one.
*Raises hand* I still believe that we can shorten almost all descriptions enough to have reasonable-length identifiers. Note that if I am the only one objecting, I'd of course get in line and follow the majority. I do wonder what @kytrinyx's and @iHiD's opinions are on this matter.
I also believe that we can shorten almost all descriptions enough to have reasonable length identifiers. But descriptions will diverge and become long again unless addressed. Can the following statements highlighted by @bencoman both be true?
That is, a description that can be used to name each test, and that explains both what each case is and why it's there? Is there an ideal soft max length of N characters, e.g. 60?
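If a soft maximum were ever agreed on, it could be checked mechanically rather than by reviewers. A minimal Ruby sketch, assuming a hypothetical 60-character limit and that it is run from the problem-specifications repository root (both the limit and the layout are assumptions for illustration, not an agreed policy):

```ruby
require 'json'

LIMIT = 60 # hypothetical soft maximum, per the question above

# Collect every "description" value, however deeply nested, in one document.
def descriptions(node)
  return [] unless node.is_a?(Hash)
  own = node['description'] ? [node['description']] : []
  own + (node['cases'] || []).flat_map { |c| descriptions(c) }
end

Dir.glob('exercises/*/canonical-data.json') do |file|
  exercise = File.basename(File.dirname(file))
  descriptions(JSON.parse(File.read(file))).each do |d|
    puts format('%-25s %3d  %s', exercise, d.size, d) if d.size > LIMIT
  end
end
```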
Most tracks use the description to generate the test name, if I understand correctly. Also, it seems like there's no current recommendation for an alternate method for generating names. As such, it is my recommendation that we continue to generate names based on the description, and that we make those descriptions short, allowing for ambiguity. That said, shorter doesn't really mean more ambiguous most of the time, I think. In @bencoman's example earlier, I find "roll bonus balls before scoring" to be much more understandable than "Bonus balls for a strike in the last frame must be rolled before score can be calculated". Where possible, I think it makes sense to base descriptions on the why, since the how already exists in the implementation, but I don't think we should mandate that with any force, because there are just too many cases where the why is "duh". I suspect that any time we try to make the rules too rigid, we end up with silly edge cases. Since we have people writing these specs, not machines, it seems like a perfectly good time to allow for judgement calls. I'm not opposed to adding or changing the spec in any way, but at the moment I don't understand how the long descriptions and a new short identifier would be better than short descriptions. If we do find that we need maintainer notes and student notes, I think it's worth breaking the spec and spending some concerted effort helping those tracks whose generators break when new fields are added. (I would recommend, in that case, leaving "comment" as an empty field until generator maintainers can remove the reliance on it.)
I'm going to try and give some thought from a product POV rather than a track maintainer POV (because I don't maintain any tracks and don't know anything about how this part of Exercism works). The most important thing in this is that we consider what is clear for the user's learning. This should be the main guiding principle. The second most important thing is that maintainers can maintain the tracks easily without lots of extra work. Reducing unnecessary effort for maintainers keeps everyone happy and means maintainers can do the most important things. It's also important to raise that TDD is Exercism's preferred method of solving exercises, so the tests should be clear. I rarely even read the README when solving exercises and instead just attack the test suite. My instinct says that there are two clear things here - the name of the test, and some information that explains what's going on with the test. At the moment, these are combined into one field (`description`). Wearing my product hat, I would be fine with splitting these out explicitly into two fields, with the following conditions:
@iHiD's proposal would be an excellent outcome for me. I'd tend towards a more explicit name for the field. I am willing to keep on with shortening the existing descriptions...
...because that is the nature of a "description".
Well, @sshine also suggested a possible fix, by limiting the number of characters that can be used for the description. It looks like opinions vary a bit on this subject. Reading this thread, it looks like there are basically two options:
So how to continue? In another thread, it was implied that if no consensus was reached, the Exercism team would make a decision. In this case though, it looks like @kytrinyx and @iHiD have slightly different opinions :) Can we reach a consensus here, or can a decision otherwise be made for us? I feel passionate about this issue, but I'm even more passionate about being able to continue and start working on or closing this issue :)
…1533) The purpose of the "description" field is to derive identifiers for naming generated test methods, but some descriptions have tended toward verbosity, ending up with long identifiers (up to 150 characters). Such long method names are harder for students to work with and are also awkward for some IDE GUIs. Issue #1473 proposed adding a separate "identifier" field, but I was required to first try improving "descriptions" to better suit "identifier" generation. This PR aims to slim down the generated identifiers without losing substantive information about the test. To start with, "Check if the given string is a pangram" is a rather banal, redundant description unworthy of nesting.
The purpose of the "description" field is to derive identifiers for naming generated test methods, but some descriptions have tended toward verbosity, ending up with long identifiers (up to 150 characters). Such long method names are harder for students to work with and are also awkward for some IDE GUIs. Issue #1473 proposed adding a separate "identifier" field, but the consensus there was to first try improving "descriptions" to better suit "identifier" generation. This PR attempts to slim down the generated identifiers without losing substantive information about the test. For the longest descriptions here, I couldn't recognize the significance of...

```
dart whose coordinates sum to > 1 but whose radius to origin is <= 1
```

While considering a shorter alternative, it helped consistency to also rearrange the shorter descriptions to be more explicit about each test condition, and to add a couple of tests on each side of boundary conditions.

Co-Authored-By: Erik Schierboom <[email protected]>
Co-Authored-By: Derk-Jan Karrenbeld <[email protected]>
In three parts: my personal recommendations, an explanation of why I haven't been helping, and some statistics about the current description lengths that may help a decision.

Personal recommendations: I observe that there are two questions whose answers may be decided on completely separately.
They are separate because adding new fields clearly defines a translation path between the old schema and the new schema, so if a description is to be shortened, it can be shortened in the old schema or in the new schema and the result will be the same. (Mathematically, the two functions commute with one another.) If there should be a length limit, then, since we are only human, we will inevitably violate it unless a machine checks it for us. The statistics may help us understand whether there can/should be a length limit. If there should be one, code should be written to check it. If a new field is introduced that is suited to contain long descriptions, please do one of these two things:
Otherwise, I will not be able to understand the difference between the two. You may have seen that after I posted my view of why things are the way they are, I then didn't help. It seemed to me that I wouldn't be affected either way by any of the decisions, so I had nothing to add. I think that is the natural state of affairs, though, so it is not too surprising that those unaffected by the decision tend not to make the decision or take action. The incentives are also not aligned very well, because a decision in this issue would benefit the students or the mentors of a track, and there are maintainers who are currently neither students nor mentors. That includes myself. I have historically found it difficult to convince people to do something that doesn't benefit them in some way. But that is a discussion for another issue entirely, so I'll get back on topic. Here is some code that reads all our canonical-data.json files and outputs:
It is capable of concatenating the descriptions from all the nested levels above a case (if you pass the `-r` flag). The code:

```ruby
require 'json'
def cases(h)
# top-level won't have a description, but all lower levels will.
# therefore we have to unconditionally flat_map cases on the top level.
# on lower levels we either flat_map cases or just return individuals.
cases = ->(hh, path = [].freeze) {
hh['cases']&.flat_map { |c|
cases[c, (path + [hh['description'].freeze]).freeze]
} || [hh.merge(path: path)]
}
h['cases'].flat_map(&cases)
end
maxes = []
freq = Hash.new(0)
recurse = ARGV.delete('-r')
Dir.glob("#{__dir__}/exercises/*/canonical-data.json") { |f|
exercise = File.basename(File.dirname(f))
lens = cases(JSON.parse(File.read(f))).map { |c|
arbitrary = ?/
((recurse ? c[:path] : []) + [c['description']]).join(arbitrary).size
}
maxes << [lens.max, exercise]
lens.each { |l| freq[l] += 1 }
}
n = ARGV.empty? ? 10 : Integer(ARGV.first)
bucket_size = 5
(0..maxes.map(&:first).max).step(bucket_size) { |left|
right = left + bucket_size
v = freq.values_at(*(left...right).to_a).sum
puts '%3d - %3d: %d' % [left, right, v]
}
puts "total: #{freq.values.sum}"
puts "top #{n}:"
maxes.max_by(n, &:first).each { |x| puts '%3d %s' % x }
```

The results...
Each test case now has an
[Edit:] Some languages don't individually identify tests and use the canonical data "description" field to generate comments into their test code. Other languages individually identify tests, but all that is available for generating test-name identifiers is the "description" field. This results in some extremely long test-method names, e.g. for robot-simulator, Pharo has... `test18_WhereRTurnRightLTurnLeftAndAAdvanceTheRobotCanFollowASeriesOfInstructionsAndEndUpWithTheCorrectPositionAndDirectionInstructionsToMoveEastAndNorth`
Can an optional `identifier` field be introduced for cases where the description doesn't make a good identifier? For example, for the Bowling exercise...
generating tests for different tracks produces the following...
CSharp:
public void Bonus_rolls_for_a_strike_in_the_last_frame_must_be_rolled_before_score_can_be_calculated()
https://github.com/exercism/csharp/blob/master/exercises/bowling/BowlingTest.cs#L250
Delphi:
procedure BowlingTests.Bonus_rolls_for_a_strike_in_the_last_frame_must_be_rolled_before_score_can_be_calculated;
https://github.com/exercism/delphi/blob/master/exercises/bowling/uBowlingTests.pas#L324
FSharp:
let ``Both bonus rolls for a strike in the last frame must be rolled before score can be calculated`` () =
https://github.com/exercism/fsharp/blob/master/exercises/bowling/BowlingTest.fs#L177
Adding an optional `identifier` like below would be useful.