Number::spell() - spell numbers as words #43
Hello @tommarshall! Thanks for your contribution. I was thinking about this feature :) Localisation is definitely required. We can use a mechanism similar to the one we have for datetime difference, I mean translations stored in YAML files. Maybe the name of this function could be "spellNumber"? We can then put it into
Hi @norzechowicz, thanks for the prompt response!
I'll add the localisation support. How do you want to name the localization files in Thanks, |
I've refactored Unfortunately I only know the one language, so the Anything you'd like me to change?
/**
 * @var array
 */
private $map = array(
I think this can be removed
Good point. Done.
Ah wait, I was talking about the $map field, but I didn't notice it's used to create the translation key :/ Sorry about that.
Oh right. No problem. I thought you were favouring implicit scoping for the properties. I'll revert that commit.
No need for that, just read my other comments; we're going to remove $map and the other fields as well.
@tommarshall thanks, looks better now! @orestes @lightglitch @dagaa @Forst @sarelvdwalt @ozmodiar @mattallty @cnkt @tbreuss @IgorDePaula @omissis - could you please take a look at this PR and tell us whether the current implementation will handle your native language?
@norzechowicz no problem. Thanks for your help, much cleaner now 👍 Would you like me to squash it down into a single commit? |
@tommarshall yes please :) |
- Convert '123' to 'one hundred and twenty-three'
- Includes support for decimals and negatives
- Includes base support for localisation
- Includes tests
- Credit to Karl Rixon for the original function (http://www.karlrixon.co.uk/writing/convert-numbers-to-words-with-php/)
- Thanks to @norzechowicz for help and advice
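As a rough illustration of the behaviour the commit message describes, here is a minimal sketch of integer spelling in English. It is not the code from this PR: `spellNumberSketch` is a hypothetical name, the words are hard-coded instead of being read from translation files, decimals are left out, and the glue rules (" and " before a remainder below 100, ", " otherwise) are an assumption chosen to reproduce the '123' example above.

```php
<?php
// Hypothetical helper for illustration only; integers only, no localisation.
function spellNumberSketch(int $n): string
{
    $small = ['zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven',
        'eight', 'nine', 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen',
        'fifteen', 'sixteen', 'seventeen', 'eighteen', 'nineteen'];
    $tens = [2 => 'twenty', 3 => 'thirty', 4 => 'forty', 5 => 'fifty',
        6 => 'sixty', 7 => 'seventy', 8 => 'eighty', 9 => 'ninety'];
    // Largest scale first, so the first match is the leading group.
    $scales = [1000000000 => 'billion', 1000000 => 'million',
        1000 => 'thousand', 100 => 'hundred'];

    if ($n < 0) {
        return 'negative ' . spellNumberSketch(-$n);
    }
    if ($n < 20) {
        return $small[$n];
    }
    if ($n < 100) {
        $units = $n % 10;
        $word = $tens[intdiv($n, 10)];
        return $units === 0 ? $word : $word . '-' . $small[$units];
    }
    foreach ($scales as $base => $name) {
        if ($n >= $base) {
            $head = spellNumberSketch(intdiv($n, $base)) . ' ' . $name;
            $rest = $n % $base;
            if ($rest === 0) {
                return $head;
            }
            // Assumed rule: conjunction before small remainders, separator otherwise.
            $glue = $rest < 100 ? ' and ' : ', ';
            return $head . $glue . spellNumberSketch($rest);
        }
    }
    throw new LogicException('Unreachable: every n >= 100 matches a scale.');
}

echo spellNumberSketch(123), PHP_EOL; // one hundred and twenty-three
```

The hyphen, conjunction, separator and negative strings are exactly the pieces that a translation file would need to override per language.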
I'm not sure if the current version will work, the number words in Russian have to be singular/plural, just as in #23. If the currently used syntax supports "form1|form2|form3", then it's fine. |
@Forst it's possible, but @tommarshall would need to pass the number as a variable into the translation key. For example
Right now I'm short on time, so didn't make a proper PR. If anyone wants to take over the Russian translation, please go ahead. @sam002 Here's what the translation file should look like with the current spelling code:

hyphen: " "
conjunction: " "
separator: ""
negative: "минус"
# decimal is a workaround for the current spelling code
decimal: "запятая"
number:
0: "ноль"
1: "один"
2: "два"
3: "три"
4: "четыре"
5: "пять"
6: "шесть"
7: "семь"
8: "восемь"
9: "девять"
10: "десять"
11: "одиннадцать"
12: "двенадцать"
13: "тринадцать"
14: "четырнадцать"
15: "пятнадцать"
16: "шестнадцать"
17: "семнадцать"
18: "восемнадцать"
19: "девятнадцать"
20: "двадцать"
30: "тридцать"
40: "сорок"
50: "пятьдесят"
60: "шестьдесят"
70: "семьдесят"
80: "восемьдесят"
90: "девяносто"
100: "сто"
200: "двести"
300: "триста"
400: "четыреста"
500: "пятьсот"
600: "шестьсот"
700: "семьсот"
800: "восемьсот"
900: "девятьсот"
1000: "тысяча|тысячи|тысяч"
1000000: "миллион|миллиона|миллионов"
1000000000: "миллиард|миллиарда|миллиардов"
1000000000000: "триллион|триллиона|триллионов"
1000000000000000: "квадриллион|квадриллиона|квадриллионов"
1000000000000000000: "квинтиллион|квинтиллиона|квинтиллионов"

With this, the number

Note I had to add 200, 300, …, 900, since those do not obey any particular rules. Also, the decimal spelling with the current code is not the way it is usually done in Russian. The normal way of pronouncing the number

For the proper support of the syntax above, the following should be added in the localization file:

number:
# The following MUST be used when the number is not an integer, for both whole and decimal parts:
1_decimal: "одна"
2_decimal: "две"
# the rest are the same as in 'number'
number_decimal:
# see below for how to use decimals below 0.001
1: "целая|целых|целых"
0.1: "десятая|десятых|десятых"
0.01: "сотая|сотых|сотых"
0.001: "тысячная|тысячных|тысячных"
0.000001: "миллионная|миллионных|миллионных"
0.000000001: "миллиардная|миллиардных|миллиардных"
0.000000000001: "триллионная|триллионных|триллионных"
0.000000000000001: "квадриллионная|квадриллионных|квадриллионных"
0.000000000000000001: "квинтиллионная|квинтиллионных|квинтиллионных"
decimal_ten: "десяти"
decimal_hundred: "сто"
# 10^-3 becomes одна тысячная (%count% + 0.001)
# 10^-4 becomes одна десятитысячная (%count% + decimal_ten + 0.001)
# 10^-5 becomes одна стотысячная (%count% + decimal_hundred + 0.001)
# 10^-6 becomes одна миллионная (%count% + 0.000001)
# 10^-7 becomes одна десятимиллионная (%count% + decimal_ten + 0.000001)
# 10^-8 becomes одна стомиллионная (%count% + decimal_hundred + 0.000001)

As an example,

All these rules are pretty complicated, hope I made them at least somewhat clear.
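To make the "form1|form2|form3" entries in the proposed file concrete, here is a minimal sketch of how one of the three forms could be chosen for a given count. `pickRussianPluralForm` is a hypothetical helper, not part of php-humanizer or of this PR; it applies the usual Russian rule based on the last one and two digits and assumes a non-negative count. It does not handle gender agreement (the 1_decimal / 2_decimal entries above).

```php
<?php
// Hypothetical helper: choose among "one|few|many" forms for a non-negative count.
function pickRussianPluralForm(int $count, string $forms): string
{
    [$one, $few, $many] = explode('|', $forms);
    $mod10 = $count % 10;
    $mod100 = $count % 100;

    // 1, 21, 31, ... but not 11
    if ($mod10 === 1 && $mod100 !== 11) {
        return $one;
    }
    // 2-4, 22-24, ... but not 12-14
    if ($mod10 >= 2 && $mod10 <= 4 && ($mod100 < 12 || $mod100 > 14)) {
        return $few;
    }
    // everything else: 0, 5-20, 25-30, ...
    return $many;
}

echo pickRussianPluralForm(234, 'тысяча|тысячи|тысяч'), PHP_EOL; // тысячи
```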
In Portuguese you are going to have the same issue as Russian with the 200, 300, …, 900, they also need to be added. About the other cases I need to do more testing to confirm things. |
All these variations seem to always affect the structure of the translation file. I think we could come up with a base translation strategy that works for Latin alphabets/rules/languages and uses that initial set of strings in the translation files. For other alphabets, I think implementing a specific class, using a common base class, would be a better solution. For example, having a HumanizerTranslator class, and extending it in a RussianHumanizerTranslator. This latter class could handle the corner cases for that language. Otherwise the translation file format is going to keep changing and affecting the rest of the languages.
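A minimal sketch of that layout, under stated assumptions: the class names come from the comment above, but the `unitWord` method and its signature are hypothetical, and the corner case shown (feminine "одна" / "две" in front of thousands) is only one illustration of what a language-specific subclass might absorb.

```php
<?php
// Hypothetical base class: plain lookup, enough for languages without special cases.
class HumanizerTranslator
{
    /** @var array<int, string> digit => word */
    protected $units;

    public function __construct(array $units)
    {
        $this->units = $units;
    }

    // Word for a digit that counts groups of the given scale (1, 1000, 1000000, ...).
    public function unitWord(int $digit, int $scale): string
    {
        return $this->units[$digit];
    }
}

// Hypothetical subclass: keeps the Russian corner case out of the shared file format.
class RussianHumanizerTranslator extends HumanizerTranslator
{
    /** @var array<int, string> feminine forms used before "тысяча" */
    private $feminine = [1 => 'одна', 2 => 'две'];

    public function unitWord(int $digit, int $scale): string
    {
        if ($scale === 1000 && isset($this->feminine[$digit])) {
            return $this->feminine[$digit];
        }
        return parent::unitWord($digit, $scale);
    }
}

$ru = new RussianHumanizerTranslator([1 => 'один', 2 => 'два', 3 => 'три']);
echo $ru->unitWord(2, 1000000), PHP_EOL; // два
echo $ru->unitWord(2, 1000), PHP_EOL;    // две
```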
Hi All

Sorry that I'm late to the party here. I have an idea that might be able to achieve @tommarshall's feature without too much complexity:

The ICU 56.1 standard (http://www.icu-project.org/apiref/icu4c/classRuleBasedNumberFormat.html#details) already specifies a method for spelling out numbers using rulesets. This is implemented, with localisation, in PHP in the NumberFormatter class (http://php.net/manual/en/class.numberformatter.php#intl.numberformatter-constants.unumberformatstyle). It can be used by specifying the NumberFormatter::SPELLOUT style constant.

English Example:

$formatter = new \NumberFormatter('en', \NumberFormatter::SPELLOUT);
echo $formatter->format(-1234567.89).PHP_EOL;
// minus one million two hundred thirty-four thousand five hundred sixty-seven point eight nine

Russian Example:

$formatter = new \NumberFormatter('ru', \NumberFormatter::SPELLOUT);
echo $formatter->format(-1234567.89).PHP_EOL;
// минус один миллион двести тридцать четыре тысячи пятьсот шестьдесят семь запятая восемь девять

This output differs from @Forst's example of:

For completeness @orestes here is a Portuguese example

$formatter = new \NumberFormatter('pt', \NumberFormatter::SPELLOUT);
echo $formatter->format(-1234567.89).PHP_EOL;
// menos um milhão e duzentos e trinta e quatro mil e quinhentos e sessenta e sete vírgula oito nove

Lastly, I stumbled on this PEAR package that might help as a good reference:

I hope this helps
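If the library went this way, the call could be wrapped so that it degrades gracefully when the intl extension is missing. The following is only a sketch: `spellOutWithIntl` is a hypothetical function name, and returning null when intl is unavailable or formatting fails is an assumption about how a caller might want to fall back to the YAML-based approach.

```php
<?php
// Hypothetical wrapper around PHP's NumberFormatter (intl extension).
// Returns null instead of failing when intl is not loaded or formatting fails.
function spellOutWithIntl(float $number, string $locale): ?string
{
    if (!class_exists('NumberFormatter')) {
        return null; // intl extension not available
    }

    $formatter = new \NumberFormatter($locale, \NumberFormatter::SPELLOUT);
    $result = $formatter->format($number);

    return $result === false ? null : $result;
}

$spelled = spellOutWithIntl(-1234567.89, 'en');
echo ($spelled === null ? '(intl not available)' : $spelled), PHP_EOL;
```

The exact wording of the result depends on the ICU data shipped with the PHP build, so it may differ between installations.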
@mosterb The Russian example you gave is the simplest of all to make in code, yet sounds least natural. |
JFYI
Firstly, thanks for the project. It's nice idea to package this functionality up together in a lib. Definitely much nicer than the ball of utility functions I normally use.
Would you be interested in adding this Number::toWords() function to the lib?

Summary:

If you're interested in adding this function to php-humanizer would you prefer to see toWords as a function of String like BinarySuffix and MetricSuffix?

Presumably this would also benefit from localisation, although I guess that could be problematic if other languages construct large numbers differently to English? I imagine the ordinalization could also face similar challenges. Again, if you're interested in adding this function I'm happy to add base support for the localisation, if you want it.