Process for fixing out-of-sync exercise READMEs #2

Closed
kytrinyx opened this issue Mar 22, 2016 · 18 comments

@kytrinyx

Occasionally we realize that a problem description is incorrect or outdated, and that different language tracks implement the problem differently.

A recent example is described in exercism/problem-specifications#190 and fixed in exercism/problem-specifications#204 (both the original issue and the PR have a lot of interesting discussion in them).

In that thread, @jtigger asked an important question:

Do you have a process by which you stage up updates to existing exercises such that the README and the various track implementations deploy together?

Unfortunately, no.

That said, I do have a tool (https://github.com/exercism/blazon) that lets me (or whoever) create identical issues for all tracks that have a given exercise, which means we could probably come up with a pretty good process.

What I've done so far is to write up the issue in a file, then submit it:

$ ./blazon -file SERIES -exercise=largest-series-product

See exercism/go#258 for one of the actual issues submitted by that command.

Blazon uses the x.exercism.io API to figure out which language tracks implement that exercise, takes the first line of the file as the subject and the rest of the file as the body, and submits the result as an issue to each of those tracks.
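For illustration, a hypothetical SERIES file might look like this (the first line becomes each issue's title, the remaining lines become the body; the wording here is invented, not the text of the real issues):

    largest-series-product: sync with updated README

    The canonical README for largest-series-product was recently clarified
    (see exercism/problem-specifications#204). Please check whether this
    track's implementation matches the updated description. If it already
    does, just close this issue.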

Notice that I wrote the issue so that, even if the track in question isn't actually affected, the instructions are simply to verify that everything is fine and then close the issue. This spreads the maintenance burden across the people who are familiar with each language track.

I guess the question is this:

When we fix a README and a bunch of tracks are going to have an incorrect implementation, should we wait to merge the README fix until all of the tracks have lined up a PR with a fix in it? Or should we just merge it and the various tracks can fix things whenever they're ready?

/cc @matthewmorgan @jtigger @petertseng since the three of you have recently been doing work to help fix inconsistencies.

/cc @roseaboveit since this is an interesting process question that will be related to a bunch of issues that come up.

@matthewmorgan

It seems like the PRs will potentially involve fixing example exercises, and maybe <exercise-name>.json files? Do many tracks have generated test suites that depend on the JSON files?

@kytrinyx

Only Ruby and Go have generated test suites, I think. They don't depend directly on the JSON file; we run a script that uses the JSON file to regenerate the test files, and then we commit those files.

Yeah, you're right. The PR to x-common would potentially include both the <slug>.md and the <slug>.json.
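To make that regeneration step concrete, here is a minimal sketch of such a generator in Go. It is not the actual script from either track; the file names and the JSON shape assumed here (a flat list of cases) are illustrations, not the real schema:

    // gen.go: a hypothetical sketch of a generator that turns
    // canonical-data.json into a committed Go test-cases file.
    package main

    import (
        "encoding/json"
        "os"
        "text/template"
    )

    type testCase struct {
        Description string `json:"description"`
        Input       string `json:"input"`
        Expected    int    `json:"expected"`
    }

    type canonicalData struct {
        Cases []testCase `json:"cases"`
    }

    // The generated file gets committed to the track repository, so the
    // track's tests never read canonical-data.json directly.
    var tmpl = template.Must(template.New("cases").Parse(`package exercise

    // Generated from canonical-data.json; do not edit by hand.
    var testCases = []struct {
        description string
        input       string
        expected    int
    }{
    {{range .Cases}}{{printf "\t{%q, %q, %d},\n" .Description .Input .Expected}}{{end}}}
    `))

    func main() {
        data, err := os.ReadFile("canonical-data.json")
        if err != nil {
            panic(err)
        }
        var c canonicalData
        if err := json.Unmarshal(data, &c); err != nil {
            panic(err)
        }
        out, err := os.Create("cases_test.go")
        if err != nil {
            panic(err)
        }
        defer out.Close()
        if err := tmpl.Execute(out, c); err != nil {
            panic(err)
        }
    }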

@petertseng

I may be back with more thoughts later. Here are a few in the meantime.

I can see there being three sub-parts to this problem:

1. Notifying various tracks about a change

I'm satisfied with Blazon for this subproblem.

2. Actually making the change happen

You could say this subproblem has a social aspect in addition to the technical aspect.

You may have observed that I have at times personally submitted PRs to all tracks where I wanted to make a change. This works well when I have context on the problem I'm trying to make consistent, and it clearly lowers the barrier to closing the issue: a maintainer just has to understand the submitted fix and merge it once convinced it's good, instead of spending time diagnosing the problem and then implementing a fix. Obviously I won't be able to sustain that indefinitely, so I can only apply this option sparingly.

As you noted, it can be helpful to load-balance the work across track maintainers. Also interesting to consider is that some tracks are more actively maintained than others. And I'm sure we track maintainers have various other obligations that we need to balance personally. Sometimes it can be hard to assess how highly to prioritize an issue.

Due to this last point, it's inevitable that no matter who does the work (one person, or load-balance across track maintainers), different tracks will complete the change at different points in time. Which brings us to an interesting next question.

3. Making the change live

While we are in any intermediate state (some tracks have completed a change, some haven't), there is a chance that some tracks will be inconsistent with the README. See the work we are doing in exercism/problem-specifications#198: currently xgo is the one track that doesn't use the defaulting behavior, so we resorted to a comment in the test file saying "yes, we're out of sync with the README, please bear with us". Whichever way the README goes (defaults or no defaults), some tracks will conflict with it until all tracks are in sync. Being in conflict with the README is undesirable insofar as it could confuse students.

So here's the question to ask ourselves: How important is it to us that all tracks (as perceived by a normal user of exercism fetch, etc.) are in sync with the README?

If it's so important that we never want tracks to be out of sync, we may have to imagine some ways to make a cross-track change roll out everywhere at the same time.

You could envision a world where you don't update the x-api submodules until all tracks have completed a change. But then if track A completes a change today but track B won't complete the same change until next week, this is now blocking track A from getting out other changes for a week.

Or maybe we envision a world where we don't merge PRs that require a README change until they're all ready, and then we all merge them in sync. This requires all track maintainers to be on board and aware of this (hold this PR until everyone is ready).

Our answer may just be "well, some tracks are going to have to be inconsistent with the README and we're going to have to live with it". I'm not sure yet, as I don't have an answer for this subproblem, but it's an interesting thing to think about.

@petertseng

My current suspicion is: Maybe we should decide in such a way that causes the least confusion. Taking the sum of multiples problem as an example: What would be less confusing, A) having the README say there are defaults, but the tests in fact say there are none, or B) the README doesn't mention defaults, but the tests have some?

We've seen two issues say that B is confusing in the Haskell track, and one issue say that A is confusing in the Go track. Personally I feel that state B is more confusing, simply because it's not even obvious from the tests that the defaults are 3 and 5. So in this case we may favor approach A, and wait to merge the README change.
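For readers without the sum-of-multiples context, "defaults" means that calling the solution without explicit factors falls back to 3 and 5. A minimal Go sketch of that behavior (the variadic signature is illustrative, not any track's actual API):

    // SumOfMultiples sums the distinct multiples of the given factors
    // that are below limit. With no factors supplied, it falls back to
    // the defaults 3 and 5: the defaulting behavior under discussion.
    func SumOfMultiples(limit int, factors ...int) int {
        if len(factors) == 0 {
            factors = []int{3, 5} // the defaults that are easy to miss in a test file
        }
        sum := 0
        for n := 1; n < limit; n++ {
            for _, f := range factors {
                if f != 0 && n%f == 0 {
                    sum += n
                    break // count each multiple only once
                }
            }
        }
        return sum
    }

With this behavior, SumOfMultiples(20) and SumOfMultiples(20, 3, 5) both return 78; in state B, the tests exercise the no-factors call without the README ever explaining where 3 and 5 come from.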

I can see this being a case-by-case decision. In other cases, we might decide to merge the README change immediately instead.

@kytrinyx

1. Agreed, I think blazon works for this.
2. I think in general I'd like to load-balance this across maintainers, in particular because people from the outside can come and submit a PR (it's often nice low-hanging fruit, especially if the original issue has a good, detailed explanation).
3. I think I'm leaning towards the third option of "oh, well, they'll just have to be inconsistent for a while", and then going with a case-by-case decision about what would be more vs. less confusing, as you suggest.

@petertseng

A few notes from doing exercism/problem-specifications#198:

In this case, our decision was "wait for tracks first, then update README" since that was least confusing. At some point, it becomes confusing again because people may decide "this track isn't testing the case asked for in the README, let's add the cases", reverting the work.

So, question: If we decide on the "wait for tracks first, then update README" course, should there be a deadline at which point we update the README without waiting for the remaining tracks?

For this particular case I'm going to submit PRs to the remaining tracks myself, but the question still needs an answer for future situations.

@kytrinyx

How about adding some wording to the README up front saying that this exercise is currently being changed, that some tracks are doing X whereas others are doing Y, and that if you want to help get this cleared up, we could use your help (see issue Z)?
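For example, the up-front notice might read something like this (the wording is illustrative; X, Y, and Z are the placeholders from above):

    Note: this exercise is currently being changed. Some tracks are doing
    X whereas others are doing Y. If you want to help get this cleared up,
    we could use your help: see issue Z.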

@petertseng commented Apr 25, 2016

That sounds great, no deadline needed in that case! We can update the README immediately to say it's a work in progress, and then when all the work is done we can remove the WIP stuff.

@roseaboveit

This makes a lot of sense to me. Sorry for the late response. My understanding of the new process for this situation is to:

  1. Use blazon to send out the issues to relevant languages
  2. Update README to explain change in progress
  3. Language groups update the change as they are able
  4. Update README to remove WIP notifications.

Are there any issues with this process? Can we close this discussion issue?

@kytrinyx commented Jun 21, 2016

Yes, this sounds right.

There's one more action item:

  • document this process somewhere!

I've been thinking about how to redo the documentation for exercism, and have been looking at Homebrew and Jekyll for inspiration. I also watched a phenomenal talk by @nodunayo called "Code Hospitality" that gave me some ideas. I'm still digesting, but I think it will look something like this: keep ALL major exercism-related documentation in exercism/exercism.io, perhaps in docs/. Then write friendly, welcoming summaries about various things in the relevant READMEs and contributor documents that can link to the more focused documentation.

@kytrinyx commented Jun 3, 2017

Since we've come to an actionable conclusion, I've opened a new issue that describes what needs to be done (it links back to this one: exercism/docs#49). I'm closing this, as there is nothing further to decide.

kytrinyx closed this as completed on Jun 3, 2017
@Insti commented Jun 4, 2017

I've only just seen this thread and don't agree with this conclusion.

should we just merge it and the various tracks can fix things whenever they're ready?

Yes. This.

I very strongly disagree with adding WIP markers to the description.md file.

The current (June 2017) process is to update the description.md and the canonical-data.json (if it exists) for an exercise, and then let the tracks work out the details.
For the well-maintained tracks, these modifications are noticed and the relevant issues/PRs are created.

The README (generated from description.md) and the canonical-data.json make up the canonical definition of a problem.

Tracks have the ability to track the version of the description.md and canonical-data.json their tests implement. (They don't currently have the ability to choose which version of the README they display, though this would be possible to add.)
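For concreteness, here is a hedged sketch in Go of how a generated test file could pin the version it implements; all names (targetTestVersion, testVersion) and the mechanism itself are hypothetical, not a description of any track's actual setup:

    // A sketch of version pinning: the generated test file records the
    // canonical-data version it implements, and one test fails loudly
    // if the solution under test declares a different version.
    package leap

    import "testing"

    // Bumped whenever the tests are regenerated against a newer
    // canonical-data.json / description.md.
    const targetTestVersion = 3

    // In practice this constant would live in the solution file;
    // it's declared here only so the sketch is self-contained.
    const testVersion = 3

    func TestTestVersion(t *testing.T) {
        if testVersion != targetTestVersion {
            t.Fatalf("found testVersion = %v, want %v", testVersion, targetTestVersion)
        }
    }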

There is currently no organisational policing of the content of the test files; this is a good thing. The tracks are free to implement their own selection of exercises and test suites, both adding to and subtracting from the tests in the canonical-data.json file. If the tests drift slightly out of sync with the README due to clarifications/changes, this is no different from the track maintainers choosing to implement a slightly different set of tests.

It should be the responsibility of the track maintainers to ensure that their tests match the specification, even when the specification changes; this is why it is called maintaining.

If a track is not updated in a timely manner, people doing the exercises will discover any inconsistencies and can report them to the relevant tracks. This reporting/patch-submission process acts as a good way to introduce new contributors to the project and groom potential new maintainers.

Thus the process becomes:

  1. Change description or canonical-data as required.
  2. Let the tracks pull the changes as they notice them.

Some people have indicated a preference for being notified about changes via issues on their repository (see x-common/issues/524), but this should be a separate, explicit opt-in system; otherwise we end up with unmaintained tracks filled with hundreds of unanswered issues to update exercises.

kytrinyx reopened this on Jun 4, 2017
@kytrinyx commented Jun 4, 2017

The current (June 2017) process is to update the description.md and the canonical-data.json (if it exists) for an exercise, and then let the tracks work out the details.

The current process was never defined, and was causing confusion. That's why I opened this discussion 14 months ago.

The README (generated from description.md) and the canonical-data.json make up the canonical definition of a problem.

It sounds like you're saying that adding a notation to the README means that the above is no longer true. I don't see how that makes sense.

There would be a notation in the README that says "the previous version of the README was ambiguous, and some tracks have not yet updated their exercise to match the updated specification". We could point to the issue where this is described, which would link to all of the open issues.

There is nothing about that which makes the description not the canonical definition.

Tracks have the ability to track the version of the description.md and canonical-data.json their tests implement.

Almost none of the tracks list the SHA1 of the description. The canonical data is irrelevant to this, as the canonical data itself is not delivered as part of the exercise. It's the README description that gets confusing.

They don't currently have the ability to choose which version of the README they display, though this would be possible to add.

Yes, it's possible, but it's not currently done.

It sounds like you're saying that we shouldn't add a notation that would help unconfuse people because it's less than 100% correct. This isn't about being 100% correct, it's about giving people the best possible experience.

There is currently no organisational policing of the content of the test files - this is a good thing.

Agreed.

This is not about policing the content of the test file. It's about making people less confused.

If the tests drift slightly out of sync with the README due to clarifications/changes, this is no different from the track maintainers choosing to implement a slightly different set of tests.

This does not sound right to me. If a README is ambiguous and then gets fixed, an implementation that previously matched (by virtue of the ambiguity) now explicitly doesn't match.

I'm talking about the description here, not the canonical data. The canonical data are not delivered as part of the exercise, the description is. If tracks want to differ significantly enough from the specification that it is noticeable and confusing, then they should add a HINTS.md file that helps clarify. Up until now the deviations that I've seen are not confusing in this way.

It should be the responsibility of the track maintainers to ensure that their tests match the specification, even when the specification changes; this is why it is called maintaining.

Except we don't have maintainers for all the tracks. Again: it sounds like you're operating from the perspective of a perfect world—we don't have that here. We have messy humans in an ambiguous world, and we're trying to do the best we can to balance people's happiness and well-being with our lack of perfection.

If a track is not updated in a timely manner, people doing the exercises will discover any inconsistencies and can report them to the relevant tracks. This reporting/patch-submission process acts as a good way to introduce new contributors to the project and groom potential new maintainers.

You're saying that by leaving things confusing we're encouraging people to contribute. I think that (1) this doesn't work as well as it sounds like you think, and (2) adding a notation to the README could be an even better way to encourage people to contribute without making a bunch of people confused in the process.

Some people have indicated a preference for being notified about changes via issues on their repository [...] but this should be a separate, explicit opt-in system; otherwise we end up with unmaintained tracks filled with hundreds of unanswered issues to update exercises.

Unmaintained tracks are their own problem. Not notifying them doesn't solve it. I'm working on that problem separately.

@petertseng commented Jun 4, 2017

Since coming up with the solution of "add WIP markers":

hello-world may not be the best example to generalize from, since it holds a special position. Nevertheless, I took the risk of not placing a WIP marker there, on the (completely unsubstantiated!) guesses that someone new to an Exercism track is less likely to be in a position to contribute, and that a WIP marker in the introductory exercise is likely to be confusing.

But if we at least try to extrapolate that, then what we might conclude is that:

  • Placing a WIP marker:
    • For students on tracks that have not yet updated to match the README: lessens confusion.
    • For students on tracks that have updated: Slightly increases cognitive burden, since they'll read the WIP marker, maybe wonder what it's for, maybe find it annoying, maybe expend time investigating only to find that it's inapplicable for the current track (but maybe the student can help out on a different track???)
  • Not placing a WIP marker:

I'd like to take a moment to call out a selection bias here. The maintainers in a position to comment on this issue and voice their opinion are likely the maintainers who keep their tracks up to date. It is slightly more likely that these maintainers will say "Why must I levy a tax on students of my track when I already keep it up to date?" The proponents of adding a WIP marker would be wise to address this question.

@kytrinyx commented Jun 4, 2017

@petertseng Thank you for elaborating.

I think that based on this, we should not mandate the use of a WIP marker, but rather ask people to consider, on a case-by-case basis, whether an exercise would be better with or without one.

The proponents of adding a WIP marker would be wise to address this question.

I think that reading a warning and realizing that it doesn't apply is much less taxing than being confused and having to troubleshoot that confusion (is the README wrong? Is the test suite wrong? Is it something else? Am I just misunderstanding? How do I find out? Where is this defined?).

I think in the case of Hello World, you are right that it is a special case.

@stkent commented Jun 4, 2017

They don't currently have the ability to choose which version of the README they display, though this would be possible to add.

This is an elegant solution to the problem, IMO. It means that less-maintained tracks continue to at least make sense when they inevitably (and hopefully temporarily) fall behind x-common.

@kytrinyx commented Jun 4, 2017

They don't currently have the ability to choose which version of the README they display, though this would be possible to add.

This is an elegant solution to the problem, IMO.

@Insti @stkent these are great points. I'm going to take some time to think about how to restructure/rethink versioning in a way that is transparent to the people doing exercises.

@kytrinyx

After a whole bunch of discussions and tinkering, we've resolved this by not generating READMEs on the fly.

See exercism/meta#15 for more details.
