From b2d0d610525731fca40879f0c4e7f3030e17c84a Mon Sep 17 00:00:00 2001 From: Katrina Owen Date: Tue, 16 May 2023 10:32:37 +0200 Subject: [PATCH] Sync word-count docs with problem-specifications (#544) The word-count exercise has been overhauled as part of a project to make practice exercises more consistent and friendly. For more context, please see the discussion in the forum, as well as the pull request that updated the exercise in the problem-specifications repository: - https://forum.exercism.org/t/new-project-making-practice-exercises-more-consistent-and-human-across-exercism/3943 - https://github.com/exercism/problem-specifications/pull/2247 --- .../practice/word-count/.docs/instructions.md | 52 ++++++++++++------- .../practice/word-count/.docs/introduction.md | 8 +++ 2 files changed, 42 insertions(+), 18 deletions(-) create mode 100644 exercises/practice/word-count/.docs/introduction.md diff --git a/exercises/practice/word-count/.docs/instructions.md b/exercises/practice/word-count/.docs/instructions.md index d3548e5d8..064393c8a 100644 --- a/exercises/practice/word-count/.docs/instructions.md +++ b/exercises/practice/word-count/.docs/instructions.md @@ -1,31 +1,47 @@ # Instructions -Given a phrase, count the occurrences of each _word_ in that phrase. +Your task is to count how many times each word occurs in a subtitle of a drama. -For the purposes of this exercise you can expect that a _word_ will always be one of: +The subtitles from these dramas use only ASCII characters. -1. A _number_ composed of one or more ASCII digits (ie "0" or "1234") OR -2. A _simple word_ composed of one or more ASCII letters (ie "a" or "they") OR -3. A _contraction_ of two _simple words_ joined by a single apostrophe (ie "it's" or "they're") +The characters often speak in casual English, using contractions like _they're_ or _it's_. +Though these contractions come from two words (e.g. _we are_), the contraction (_we're_) is considered a single word. -When counting words you can assume the following rules: +Words can be separated by any form of punctuation (e.g. ":", "!", or "?") or whitespace (e.g. "\t", "\n", or " "). +The only punctuation that does not separate words is the apostrophe in contractions. -1. The count is _case insensitive_ (ie "You", "you", and "YOU" are 3 uses of the same word) -2. The count is _unordered_; the tests will ignore how words and counts are ordered -3. Other than the apostrophe in a _contraction_ all forms of _punctuation_ are ignored -4. The words can be separated by _any_ form of whitespace (ie "\t", "\n", " ") +Numbers are considered words. +If the subtitles say _It costs 100 dollars._ then _100_ will be its own word. -For example, for the phrase `"That's the password: 'PASSWORD 123'!", cried the Special Agent.\nSo I fled.` the count would be: +Words are case insensitive. +For example, the word _you_ occurs three times in the following sentence: + +> You come back, you hear me? DO YOU HEAR ME? + +The ordering of the word counts in the results doesn't matter. + +Here's an example that incorporates several of the elements discussed above: + +- simple words +- contractions +- numbers +- case insensitive words +- punctuation (including apostrophes) to separate words +- different forms of whitespace to separate words + +`"That's the password: 'PASSWORD 123'!", cried the Special Agent.\nSo I fled.` + +The mapping for this subtitle would be: ```text -that's: 1 -the: 2 -password: 2 123: 1 -cried: 1 -special: 1 agent: 1 -so: 1 -i: 1 +cried: 1 fled: 1 +i: 1 +password: 2 +so: 1 +special: 1 +that's: 1 +the: 2 ``` diff --git a/exercises/practice/word-count/.docs/introduction.md b/exercises/practice/word-count/.docs/introduction.md new file mode 100644 index 000000000..1654508e7 --- /dev/null +++ b/exercises/practice/word-count/.docs/introduction.md @@ -0,0 +1,8 @@ +# Introduction + +You teach English as a foreign language to high school students. + +You've decided to base your entire curriculum on TV shows. +You need to analyze which words are used, and how often they're repeated. + +This will let you choose the simplest shows to start with, and to gradually increase the difficulty as time passes.