Number::spell() - spell numbers as words #43

tommarshall · 2015-10-30T10:53:35Z

Firstly, thanks for the project. It's a nice idea to package this functionality up together in a lib. Definitely much nicer than the ball of utility functions I normally use.

Would you be interested in adding this Number::toWords() function to the lib?

Summary:

- Convert '123' to 'one hundred and twenty-three'
- Includes support for decimals and negatives
- Includes tests
- Credit to Karl Rixon for the original function (http://www.karlrixon.co.uk/writing/convert-numbers-to-words-with-php/)

If you're interested in adding this function to php-humanizer, would you prefer to see toWords as a function of String, like BinarySuffix and MetricSuffix?

Presumably this would also benefit from localisation, although I guess that could be problematic if other languages construct large numbers differently to English? I imagine the ordinalization could also face similar challenges. Again, if you're interested in adding this function I'm happy to add base support for the localisation, if you want it.

norberttech · 2015-10-30T11:03:10Z

Hello @tommarshall! Thanks for your contribution. I was thinking about this feature :) Localisation is definitely required. We can use a mechanism similar to the one we have for datetime difference, I mean translations stored in YAML files. Maybe the name of this function could be "spellNumber"? We could then put it into String::spellNumber($number) or Number::spell($number). What do you think?

tommarshall · 2015-10-30T11:21:36Z

Hi @norzechowicz, thanks for the prompt response!

Number::spell($number) would make the most sense to me, as it feels like a similar function to ordinalize, but I'd equally be happy to have it under String::spellNumber($number) if you felt that was more in keeping with the project?

I'll add the localisation support. How do you want to name the localization files in Resources/translations? spell-number.en.yml?

Thanks,
Tom

norberttech · 2015-10-30T11:23:35Z

Number::spell($number) makes more sense to me too.
As for the resource file name, I think number.en.yml would be enough.

tommarshall · 2015-10-30T13:55:47Z

I've refactored Number::toWords() to Number::spell() and added base support for the localisation.

Unfortunately I only know the one language, so the convert() function may need some modification for the localisation to read correctly in languages that construct large numbers differently from English, but this should hopefully provide a useful base implementation.

Anything you'd like me to change?

norberttech · 2015-10-30T13:58:03Z

src/Coduo/PHPHumanizer/Number/Spell.php

+    /**
+     * @var array
+     */
+    private $map = array(


I think this can be removed

Good point. Done.

Ah wait, I was talking about the $map field, but I didn't notice it's used to create the translation key :/ Sorry for that.

Oh right. No problem. I thought you were favouring implicit scoping for the properties. I'll revert that commit.

No need for that, just read my other comments; we're going to remove $map and the other fields as well.

norberttech · 2015-10-30T15:23:56Z

@tommarshall thanks, looks better now!

@orestes @lightglitch @dagaa @Forst @sarelvdwalt @ozmodiar @mattallty @cnkt @tbreuss @IgorDePaula @omissis - could you please take a look at this PR and tell us whether the current implementation will handle your native language?

tommarshall · 2015-10-30T15:25:40Z

@norzechowicz no problem. Thanks for your help, much cleaner now 👍

Would you like me to squash it down into a single commit?

norberttech · 2015-10-30T15:26:38Z

@tommarshall yes please :)

@norzechowicz

- Convert '123' to 'one hundred and twenty-three'
- Includes support for decimals and negatives
- Includes base support for localisation
- Includes tests
- Credit to Karl Rixon for the original function (http://www.karlrixon.co.uk/writing/convert-numbers-to-words-with-php/)
- Thanks to @norzechowicz for help and advice

Forst · 2015-10-30T16:19:57Z

I'm not sure the current version will work: number words in Russian have to be singular/plural, just as in #23. If the currently used syntax supports "form1|form2|form3", then it's fine.

norberttech · 2015-10-31T10:06:22Z

@Forst it's possible, but @tommarshall would need to pass the number as a variable to the translation key.

For example: $this->translate('number.100', ["%number%" => 100]); Is this what you need? Maybe you could prepare a translation file for Russian, this would help us a lot.
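As a rough illustration of what resolving a "form1|form2|form3" string involves once the number is available, here is a minimal sketch. The helper name chooseRussianForm() is hypothetical and not part of php-humanizer, and a translation component with built-in pluralisation could do this step instead. The rule applied is the usual Russian one/few/many rule.

```php
<?php

/**
 * Pick one of three pipe-separated Russian plural forms for $number.
 * Hypothetical helper for illustration; not part of php-humanizer.
 */
function chooseRussianForm($number, $forms)
{
    list($one, $few, $many) = explode('|', $forms);

    $n = abs($number) % 100;   // the last two digits decide the teens
    $lastDigit = $n % 10;

    if ($n >= 11 && $n <= 14) {
        return $many;          // 11-14 always take the "many" form
    }
    if ($lastDigit === 1) {
        return $one;
    }
    if ($lastDigit >= 2 && $lastDigit <= 4) {
        return $few;
    }

    return $many;
}

echo chooseRussianForm(234, 'тысяча|тысячи|тысяч'), PHP_EOL; // тысячи
```

The count passed in is the number of thousands (or millions, and so on), not the whole number being spelled.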

Forst · 2015-10-31T11:37:25Z

Right now I'm short on time, so I haven't made a proper PR.

If anyone wants to take over the Russian translation, please go ahead. @sam002

Here's what the translation file should look like with the current spelling code:

hyphen: " "
conjunction: " "
separator: ""
negative: "минус"
# decimal is a workaround for the current spelling code
decimal: "запятая"

number:
  0:                    "ноль"
  1:                    "один"
  2:                    "два"
  3:                    "три"
  4:                    "четыре"
  5:                    "пять"
  6:                    "шесть"
  7:                    "семь"
  8:                    "восемь"
  9:                    "девять"
  10:                   "десять"
  11:                   "одиннадцать"
  12:                   "двенадцать"
  13:                   "тринадцать"
  14:                   "четырнадцать"
  15:                   "пятнадцать"
  16:                   "шестнадцать"
  17:                   "семнадцать"
  18:                   "восемнадцать"
  19:                   "девятнадцать"
  20:                   "двадцать"
  30:                   "тридцать"
  40:                   "сорок"
  50:                   "пятьдесят"
  60:                   "шестьдесят"
  70:                   "семьдесят"
  80:                   "восемьдесят"
  90:                   "девяносто"
  100:                  "сто"
  200:                  "двести"
  300:                  "триста"
  400:                  "четыреста"
  500:                  "пятьсот"
  600:                  "шестьсот"
  700:                  "семьсот"
  800:                  "восемьсот"
  900:                  "девятьсот"
  1000:                 "тысяча|тысячи|тысяч"
  1000000:              "миллион|миллиона|миллионов"
  1000000000:           "миллиард|миллиарда|миллиардов"
  1000000000000:        "триллион|триллиона|триллионов"
  1000000000000000:     "квадриллион|квадриллиона|квадриллионов"
  1000000000000000000:  "квинтиллион|квинтиллиона|квинтиллионов"

With this, the number -1234567.89 should turn into минус один миллион двести тридцать четыре тысячи пятьсот шестьдесят семь запятая восемьдесят девять.

Note I had to add 200, 300, …, 900, since those do not obey any particular rules.
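To show why the explicit 200 to 900 entries matter, here is a minimal sketch of spelling an integer below 1000 by direct lookup. The helper spellBelowThousand() is hypothetical (the convert() function in the PR may be structured differently), and $words holds only the map entries this example needs.

```php
<?php

// Excerpt of the Russian map above; only the entries this example needs.
$words = array(
    4 => 'четыре', 7 => 'семь',
    30 => 'тридцать', 60 => 'шестьдесят',
    200 => 'двести', 500 => 'пятьсот',
);

/**
 * Spell an integer from 1 to 999 by looking up hundreds, tens and units.
 * Hypothetical helper; assumes every needed key exists in $words.
 */
function spellBelowThousand($number, array $words)
{
    $parts = array();

    $hundreds = $number - $number % 100;       // 567 -> 500
    if ($hundreds > 0) {
        $parts[] = $words[$hundreds];          // looked up whole, not composed
    }

    $rest = $number % 100;
    if ($rest > 0 && isset($words[$rest])) {   // a word of its own, e.g. round tens
        $parts[] = $words[$rest];
    } elseif ($rest > 0) {
        $parts[] = $words[$rest - $rest % 10]; // tens
        $parts[] = $words[$rest % 10];         // units
    }

    return implode(' ', $parts);
}

echo spellBelowThousand(567, $words), PHP_EOL; // пятьсот шестьдесят семь
echo spellBelowThousand(234, $words), PHP_EOL; // двести тридцать четыре
```

An English-style composition of "five" plus "hundred" would not produce пятьсот, which is why the hundreds are looked up as whole words here.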

Also, the decimal spelling with the current code is not the way it is usually done in Russian. The normal way of pronouncing the number -1234567.89 would be минус один миллион двести тридцать четыре тысячи пятьсот шестьдесят семь целых восемьдесят девять сотых, where целых stands for whole and сотых for hundredth (89/100).

For the proper support of the syntax above, the following should be added in the localization file:

number:
  # The following MUST be used when the number is not an integer, for both whole and decimal parts:
  1_decimal:                    "одна"
  2_decimal:                    "две"
  # the rest are the same as in 'number'

number_decimal:
  # see below for how to use decimals below 0.001
  1:                    "целая|целых|целых"
  0.1:                  "десятая|десятых|десятых"
  0.01:                 "сотая|сотых|сотых"
  0.001:                "тысячная|тысячных|тысячных"
  0.000001:             "миллионная|миллионных|миллионных"
  0.000000001:          "миллиардная|миллиардных|миллиардных"
  0.000000000001:       "триллионная|триллионных|триллионных"
  0.000000000000001:    "квадриллионная|квадриллионных|квадриллионных"
  0.000000000000000001: "квинтиллионная|квинтиллионных|квинтиллионных"

  decimal_ten:          "десяти"
  decimal_hundred:      "сто"

  # 10^-3 becomes одна тысячная (%count% + 0.001)
  # 10^-4 becomes одна десятитысячная (%count% + decimal_ten + 0.001)
  # 10^-5 becomes одна стотысячная (%count% + decimal_hundred + 0.001)
  # 10^-6 becomes одна миллионная (%count% + 0.000001)
  # 10^-7 becomes одна десятимиллионная (%count% + decimal_ten + 0.000001)
  # 10^-8 becomes одна стомиллионная (%count% + decimal_hundred + 0.000001)

As an example, -1.23456789 becomes минус одна целая двадцать три миллиона четыреста пятьдесят шесть тысяч семьсот восемьдесят девять стомиллионных, but -1 is минус один, since it's an integer.
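The composition rule in the comments above (prefix plus base stem plus ending) can be sketched as follows. This is only an illustration under assumptions: the helper russianFractionUnit() is hypothetical, the stems are cut from the number_decimal entries by hand, and only up to 11 decimal places are covered.

```php
<?php

/**
 * Build the Russian name of the fractional unit for a decimal part.
 * $count is the decimal digits read as an integer, $digits how many
 * decimal places there are (1 to 11). Hypothetical helper.
 */
function russianFractionUnit($count, $digits)
{
    // One base stem per group of three decimal places.
    $stems = array(3 => 'тысячн', 6 => 'миллионн', 9 => 'миллиардн');
    // decimal_ten / decimal_hundred prefixes from the file above.
    $prefixes = array(0 => '', 1 => 'десяти', 2 => 'сто');

    if ($digits === 1) {
        $stem = 'десят';
    } elseif ($digits === 2) {
        $stem = 'сот';
    } else {
        $group = $digits - $digits % 3;               // 8 -> 6
        $stem = $prefixes[$digits % 3] . $stems[$group];
    }

    // Singular feminine only when the count ends in 1 but not in 11;
    // the second and third plural forms are identical here.
    $singular = ($count % 10 === 1) && ($count % 100 !== 11);

    return $stem . ($singular ? 'ая' : 'ых');
}

echo russianFractionUnit(89, 2), PHP_EOL;       // сотых
echo russianFractionUnit(23456789, 8), PHP_EOL; // стомиллионных
echo russianFractionUnit(1, 4), PHP_EOL;        // десятитысячная
```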

All these rules are pretty complicated, hope I made them at least somewhat clear.

lightglitch · 2015-10-31T13:07:06Z

In Portuguese you are going to have the same issue as Russian with the 200, 300, …, 900, they also need to be added. About the other cases I need to do more testing to confirm things.

orestes · 2015-10-31T13:17:13Z

All these variations seem to always affect the structure of the translation file. I think we could come up with a base translation strategy that works for Latin alphabets/rules/languages and uses that initial set of strings in the translation files. For other alphabets, I think implementing a specific class, using a common base class, would be a better solution. For example, having a HumanizerTranslator class, and extending it in a RussianHumanizerTranslator. This latter class could handle the corner cases for that language. Otherwise the translation file format is going to keep changing and affecting the rest of the languages.
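A minimal sketch of that idea, using the class names from this comment. Both classes are hypothetical, and the unit() method and its signature are assumptions made for illustration. The corner case shown is that тысяча is feminine, so 1 and 2 change form in front of it.

```php
<?php

// Hypothetical base class: one lookup that suits many languages.
class HumanizerTranslator
{
    protected $units = array(1 => 'one', 2 => 'two', 3 => 'three');

    /** Word for a unit digit that counts the given scale (1, 1000, ...). */
    public function unit($digit, $scale)
    {
        return $this->units[$digit];
    }
}

// Hypothetical subclass: only the language-specific exception lives here.
class RussianHumanizerTranslator extends HumanizerTranslator
{
    protected $units = array(1 => 'один', 2 => 'два', 3 => 'три');

    // Feminine forms used before "тысяча".
    private $feminine = array(1 => 'одна', 2 => 'две');

    public function unit($digit, $scale)
    {
        if ($scale === 1000 && isset($this->feminine[$digit])) {
            return $this->feminine[$digit];
        }

        return parent::unit($digit, $scale);
    }
}

$ru = new RussianHumanizerTranslator();
echo $ru->unit(2, 1000), PHP_EOL;    // две
echo $ru->unit(2, 1000000), PHP_EOL; // два
```

With this split the translation file format stays the same for every language, and the exceptions are kept in code next to the language they belong to.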

mostertb · 2015-11-08T09:03:55Z

Hi All

Sorry that I'm late to the party here. I have an idea that might be able to achieve @tommarshall 's feature without too much complexity:

The ICU library, version 56.1 (http://www.icu-project.org/apiref/icu4c/classRuleBasedNumberFormat.html#details), already specifies a method for spelling out numbers using rulesets.

This is implemented, with localisation, in PHP in the NumberFormatter class (http://php.net/manual/en/class.numberformatter.php#intl.numberformatter-constants.unumberformatstyle). It can be used by specifying the NumberFormatter::SPELLOUT style constant.

English Example:

$formatter = new \NumberFormatter('en', \NumberFormatter::SPELLOUT);
echo $formatter->format(-1234567.89).PHP_EOL;
// minus one million two hundred thirty-four thousand five hundred sixty-seven point eight nine

Russian Example:

$formatter = new \NumberFormatter('ru', \NumberFormatter::SPELLOUT);
echo $formatter->format(-1234567.89).PHP_EOL;
// минус один миллион двести тридцать четыре тысячи пятьсот шестьдесят семь запятая восемь девять

This output differs from @Forst 's example of:
минус один миллион двести тридцать четыре тысячи пятьсот шестьдесят семь запятая восемьдесят девять
I don't speak Russian, but this may be because a different DEFAULT_RULESET needs to be specified: http://stackoverflow.com/questions/24282324/numberformatterspellout-spellout-ordinal-in-russian-and-italian
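A minimal sketch of selecting a ruleset, assuming the intl extension is installed. The wrapper spellWithRuleset() is hypothetical, and the ruleset name used at the end is an assumption: available names depend on the locale and the installed ICU version, so the output is not shown here.

```php
<?php

/**
 * Spell $number with an optional named ICU ruleset.
 * Returns null when intl is missing or the call fails.
 * Hypothetical wrapper for illustration.
 */
function spellWithRuleset($locale, $number, $ruleset = null)
{
    if (!class_exists('NumberFormatter')) {
        return null;
    }

    $formatter = new \NumberFormatter($locale, \NumberFormatter::SPELLOUT);

    if ($ruleset !== null
        && !$formatter->setTextAttribute(\NumberFormatter::DEFAULT_RULESET, $ruleset)) {
        return null;
    }

    $result = $formatter->format($number);

    return $result === false ? null : $result;
}

// List the rulesets the installed ICU exposes for Russian before picking one.
if (class_exists('NumberFormatter')) {
    $probe = new \NumberFormatter('ru', \NumberFormatter::SPELLOUT);
    echo $probe->getTextAttribute(\NumberFormatter::PUBLIC_RULESETS), PHP_EOL;
}

// The ruleset name below is an assumption; check the list printed above.
var_dump(spellWithRuleset('ru', 21, '%spellout-cardinal-feminine'));
```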

For completeness @orestes here is a Portuguese example

$formatter = new \NumberFormatter('pt', \NumberFormatter::SPELLOUT);
echo $formatter->format(-1234567.89).PHP_EOL;
// menos um milhão e duzentos e trinta e quatro mil e quinhentos e sessenta e sete vírgula oito nove

Lastly, I stumbled on this PEAR package that might help as a good reference:
http://pear.php.net/package/Numbers_Words

I hope this helps

Forst · 2015-11-11T15:04:13Z

@mosterb The Russian example you gave is the simplest of all to make in code, yet sounds least natural.

tomasfejfar · 2016-10-21T11:44:51Z

JFYI \NumberFormatter::SPELLOUT does not work properly for Czech after the comma:

actual:   minus jeden milión dvě stě třicet čtyři tisíc pět set šedesát sedm čárka osm devět
expected: minus jeden milión dvě stě třicet čtyři tisíc pět set šedesát sedm celých osmdesát devět
                                                                             ^-------------------^

tommarshall changed the title ~~Number::toWords() - convert number to words~~ Number::spell() - spell numbers as words Oct 30, 2015

norberttech reviewed Oct 30, 2015

tommarshall force-pushed the master branch from 62a57d9 to f463083 Compare October 30, 2015 15:33

norberttech added the feature request label Aug 15, 2016

norberttech closed this Feb 25, 2019
