acronym: add leading / trailing and multiple separator case #1432

Open
yawpitch opened this issue Jan 5, 2019 · 14 comments

Comments

@yawpitch
Contributor

yawpitch commented Jan 5, 2019

Currently the acronym tests do not cover inputs with leading, trailing, or repeated separator characters, and many solutions presented will fail if these are encountered.

A student suggested this as a good test string: " - Annoying string ending - with - multiple separators - " should return "ASEWMS".
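To make the failure mode concrete, here is a minimal Python sketch. `naive_acronym` is a hypothetical stand-in for the kind of solution that breaks on the suggested string, and `acronym` is one possible way to tolerate leading, trailing, and repeated separators. It assumes ASCII letters only and treats an internal apostrophe as part of a word; neither choice is settled by the canonical data.

```python
import re


def naive_acronym(phrase: str) -> str:
    # Hypothetical "typical" solution: split on single spaces only.
    # Leading or repeated spaces produce empty strings, so word[0] fails.
    return "".join(word[0].upper() for word in phrase.split(" "))


def acronym(phrase: str) -> str:
    # Assumption: a word is a run of ASCII letters, optionally containing
    # internal apostrophes (e.g. "Halley's"). Everything else is a separator.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", phrase)
    return "".join(word[0].upper() for word in words)


tricky = " - Annoying string ending - with - multiple separators - "
print(acronym(tricky))
```

With the suggested test string, the naive version raises an `IndexError` on the empty first token, while the regex-based version only ever sees the six actual words.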

@rpottsoh
Member

rpottsoh commented Jan 5, 2019

I think a single PR that closes #1431 and #1432 would suffice.

@yawpitch
Contributor Author

yawpitch commented Jan 5, 2019 via email

@rpottsoh
Member

rpottsoh commented Jan 5, 2019

Also, I think we should consider what the acronym should be for inputs like "3 Men And An _nderscore".

Interesting. I was curious so I tried this with my solution; I get MAAN.

Basically, we should positively state what we consider an acronym or, if we don't want to worry about those sorts of inputs, affirmatively say they won't be provided.

The description is a little weak and could probably benefit from a definition of what counts as a reasonable phrase from which an acronym can be derived.

@yawpitch
Contributor Author

yawpitch commented Jan 5, 2019 via email

@rpottsoh
Member

rpottsoh commented Jan 5, 2019

Does that map to all the languages that have implemented acronym though?

Don't know, but likely.

I think your statement sums things up nicely though. 👍

@ErikSchierboom
Member

I think a more specific description of what defines an acronym would be much appreciated.

@sshine
Contributor

sshine commented Jan 10, 2019

It's the first letter of each word. For hyphenated words, include the letter after the hyphen(s).

Why does it need to be more specific than this?

@yawpitch
Contributor Author

yawpitch commented Jan 11, 2019 via email

@sshine
Contributor

sshine commented Jan 12, 2019

Ok, let's state it clearly then.

@rpottsoh
Member

I realize that #1436, which deals with underscores, was merged recently. I gather that this issue is perhaps more about advocating a change to description.md than to the canonical data.

@sshine
Contributor

sshine commented Jan 13, 2019

The assumption of ASCII is not unique to this exercise. So "The first letter of each word" should be sufficient here.

@sshine
Contributor

sshine commented Jan 13, 2019

this issue is maybe more for advocating a change to the description.md

At least a part of the discussion has focused on that.

@yawpitch
Contributor Author

The assumption of ASCII is not unique to this exercise. So "The first letter of each word" should be sufficient here.

I'd tend to argue that that assumption is a bug, not a feature of Exercism, and that where it's relevant to the solution the bias should be explicitly called out.

As acronym is an exercise that will very commonly be approached with regular expressions -- in Python it's a core exercise and tagged as the first to involve regex -- the ASCII limitation can be very important to the solution.

For instance, in Python 3, without compiling the regex with the re.ASCII flag, the \w special sequence will match all of E, È, É, and Ę. Should those all be included? Should they be excluded? I don't know or have a particular opinion, but our not expressing a preference for ASCII-only solutions leaves it as undefined behavior (UB), and UB is pretty confusing for a learner, especially one for whom English isn't a first language and who isn't necessarily typing in ASCII.
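A short sketch of that difference, using only the standard `re` module. The sample string is written with escapes so that each accented letter is a single precomposed code point; the variable names are illustrative.

```python
import re

# "E È É Ę" written with escapes to guarantee precomposed code points.
sample = "E \u00c8 \u00c9 \u0118"

# Default for str patterns in Python 3: \w is Unicode-aware.
unicode_matches = re.findall(r"\w", sample)

# With re.ASCII, \w only matches [a-zA-Z0-9_].
ascii_matches = re.findall(r"\w", sample, flags=re.ASCII)

print(unicode_matches)
print(ascii_matches)
```

Without the flag all four letters match; with `re.ASCII` only the plain `E` does, so the same pattern gives different acronyms depending on a detail the exercise description never mentions.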

If we explicitly limit the character set, we make students' lives easier, and we also get the opportunity to present a bonus exercise in which they extend their solution to handle something like "L'École Française du Bristol", which according to that school should abbreviate to EFB.
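As a rough idea of what such a bonus exercise might involve, here is a hedged Python sketch. Everything in it is an assumption rather than an agreed rule: the `MINOR_WORDS` set is hypothetical, as is the decision to drop accents so that "É" contributes "E".

```python
import re
import unicodedata

# Hypothetical list of words to skip; the thread does not define one.
MINOR_WORDS = {"l", "le", "la", "les", "de", "du", "des"}


def unicode_acronym(phrase: str) -> str:
    # Compose first so an accented letter is a single code point.
    phrase = unicodedata.normalize("NFC", phrase)
    # [^\W\d_]+ matches runs of Unicode letters (word characters
    # minus digits and underscore); the apostrophe acts as a separator.
    words = re.findall(r"[^\W\d_]+", phrase)
    letters = []
    for word in words:
        if word.lower() in MINOR_WORDS:
            continue
        # NFD splits "É" into "E" plus a combining accent; keep the base letter.
        base = unicodedata.normalize("NFD", word)[0]
        letters.append(base.upper())
    return "".join(letters)


print(unicode_acronym("L'\u00c9cole Fran\u00e7aise du Bristol"))
```

Under these assumptions the school's name reduces to EFB, but a different stop-word list or a decision to keep accents would give a different answer, which is why the rule would need to be stated explicitly.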

@emcoding
Contributor

Related: #1463


5 participants