Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(dev/core#2638) Civi::format() - Add helper for formatting #21599

Merged
merged 2 commits into from
Oct 6, 2021

Conversation

eileenmcnaughton
Copy link
Contributor

@eileenmcnaughton eileenmcnaughton commented Sep 24, 2021

Overview

Adds a helper class for formatting. - although many sites have used the core utils class.

This attempts to provide a consistent formatting interface via \Civi::format() for numbers and money.

Dates were originally included too but I pulled them out because once we add in timezone to the dates it adds quite a burden in terms of writing more tests which I wanted to push out to a follow up

IMPORTANT - in the end I bypassed our existing money functions & wrote the functions I thought we should have. However, I did not that there is a change in formatting from $ 1.00 to $1.00 - both look equally 'correct' to me & I wasn't of the opinion we should bend over backwards to force the former if php is suggesting the latter is 'the standard'.

Before

Previously we have not had an official supported way to format values using CiviCRM functions. The existing functions have had gaps / confusion / bad parameter choices.

After

Function For
Civi::format()->number() formating to a localised number
Civi::format()->money() formating to a localised money output
Civi::format()->moneylong() formating to money based in currency-specific minimum rounding)
Civi::format()->moneyNumber() formating to a number based on money rounding(e.g for data input)
Civi::format()->moneyNumberLong() formating to a number based on minimum padding for the currency (e.g for data input)
Civi::format()->getMoneyFormatter() gets the money formatter object, allowing the calling code to manipulate further

Details

All money functions take the following parameters

Parameter details
$amount an amount. This should be already using . to denote decimal places
$currency currency - defaults to site default
$locale either NULL or the locale for format in eg fr_FR

Note that

  1. as with our existing basic money::format() functions the default behaviour for the basic Civi::format()->money() function is to round the number to the number of decimal places appropriate to the currency.
  2. Civi::format()->moneyLong() pads the money to the number of decimal places appropriate to the currency but does not round.
  3. Civi::format()->moneyNumber() as Civi::format()->moneyNumberLong() do the same as above but do not add currency symbols (equivalent to the '$onlyNumber parameters that litter our existing code.

Civi::format()->number() takes the following

Parameter details
$amount an amount. This should be already using . to denote decimal places
$locale either NULL or the locale for format in eg fr_FR
$attributes an array of values that are passed to the number formatter. By default this specifies max decimal places as 9, like our db fields

Civi::format()->getMoneyFormatter() provides flexibility to combine our defaults with more bespoke use-cases.

Parameter details
$currency defaults to site currency if null
$locale either NULL or the locale for format in eg fr_FR
$style see https://www.php.net/manual/en/class.numberformatter.php#intl.numberformatter-constants
$attributes an array of values that are passed to the number formatter. By default this specifies max decimal places as 9, like our db fields

Technical Details

numbers & money
I added support for a define IGNORE_SEPARATOR_CONFIG for early adopters to always ignore the thousand separators. We have discussed settings & such like to migrate it in but if we don't get to that this round it would be good to have 'something'

Rounding

From past efforts (regressions!) we have learnt that we generally want to ensure we have sensible defaults for both padding and rounding. The functions in this PR

12.1 => 12.10
12.10 => 12.10
12.89000 => 12.89

  • kinda mostly
    12.89340 => 12.89
  • but in 'long' situations
    12.89340 => 12.8934

The scope of this PR was originally more ambitious but the challenge of the above caused me to drop the rest out of scope for manageability reasons but the original plan (& following discussion was to also add)

Originally included but moved out of scope of this PR
|Function|For|
|Civi::format()->machineDate()|formating to a machine friendly date- no longer in this PR|
|Civi::format()->machineNumber()|formating to a machine friendly number- no longer in this PR|
|Civi::format()->machineMoney()|formating to a machine friendly number with the rounding based on the currency- no longer in this PR|

|Civi::format()->date()|formating to a localised date - no longer in this PR|

** date formatting**
dateFormat supports the same values as smarty crmDate (with the exception of the obsolete $onlyTime & hence that has been fixed to call this
timezone is new / not supported elsewhere
Civi::format()->date() takes the following

Parameter details
$date a date string, such as can be interpretted by strtotime, defaults to 'now', in the timezone of the php session
$dateFormat either a string reference to a date setting such as 'Year' or a strftime string, defaults to 'dateformat' setting see
`$timezone timezone to output in - eg. 'Asia/Paciic'

As per the intent was to add machineMoney etc functions that 'deformat' - the scope of these grew when we talked about these fully reversing the format functions - since our existing functions don't deal with that. My reading suggests this would be the way to go https://www.php.net/manual/en/numberformatter.parsecurrency.php

Comments

Most recent related discussion
#20296

Gap - tests on timezone. edge cases in tests

@mattwire @jaapjansma @seamuslee001 @totten - this builds on previous discussions - it switches us from having a 'money' class as per some discussions to a 'format' class. I found that reframing really helpful in terms of trying to get to the nub of how 'number' interacts with money - basically 'number' is money without the currency and with different rounding assumptions

@civibot
Copy link

civibot bot commented Sep 24, 2021

(Standard links)

@artfulrobot
Copy link
Contributor

How abouts the $date params can be DateTime (or DateTimeImmutable) objects as well as strtotime() strings?

machineDate, would be more correctly named machineDateTime or even machineDateTimeWithoutTimezone :-)

But a machineDate (Y-m-d) would be useful too.

@eileenmcnaughton
Copy link
Contributor Author

@artfulrobot yeah - I agree that makes sense to accept DateTime - - I'll have to google DateTimeImmutable :-)

Could we compromise on machineDateTimeWithoutTimezoneButMaybeSometimesItHas ?

I guess you are saying we should have both

machineDate AND machineDateTime ?

@artfulrobot
Copy link
Contributor

@eileenmcnaughton yes, I'm suggesting machineDate AND machineDateTime and that all these methods are like

public function someDateThing(string|DateTimeImmutable|DateTime $date, ...) {}

DateTimeImmutable would be the nearest thing to a string input, since it can't be changed.

@eileenmcnaughton
Copy link
Contributor Author

Yeah there is an example of great explanation "Representation of date and time."

@artfulrobot did you have any thoughts on the rounding - that broke my brain a bit - ie good defaults + flexibility

@totten
Copy link
Member

totten commented Sep 28, 2021

Ah, now that you mention it, the new bits that use DateTime objects for token-processing... should probably use DateTimeImmutable objects...

@eileenmcnaughton
Copy link
Contributor Author

@totten don't hold back if you want to push any updates to this - or to the php manual's description of the DateTimeImmutable object...

Note test cover is not yet adequate around timezone in particular

@artfulrobot
Copy link
Contributor

artfulrobot commented Sep 28, 2021

@eileenmcnaughton

DateTimeImmutable is just what it says: it's like DateTime but you can't change it. So all the methods work, but any that change the time will return a new DateTimeImmutable.

Re: Rounding

The challenge is coming up with logical defaults to achieve that... while allowing overrides

Why do we want 2dp for a general number? I understand for currency that's typical (but, I think, not world-wide). And we have special formatters for currency.

Also, do we deal with floating point numbers at all? e.g. 1.234e¹²

I'd imagine that a helper for numbers would:

  • take an optional locale arg, e.g. en_GB or server (PHP) or civicrm (site localization) or user (out of my depth, but I think some users can have their own locales, right?) - without it it's the server's locale? Or maybe CiviCRM's configured default. Probably the latter?
  • take an optional decimalPlaces argument - without this the dp should be as was.
  • take an optional thousandsSeparator argument - without it it's locale default
  • take an optional padDecimals number, e.g. 2 - without this we do not pad.
  • take an optonal padUnits number, e.g. if this was 3 and you passed in 2 you'd get 002
  • take an optional siUnits flag?

i.e.

number(1000.234567); // English "1,000.234567"
number(1000.234567, { locale: "de_DE" ); // German "1.000,234567"
number(1000.234567, { decimalPlaces: 2, padDecimals: 2 }); // English "1,000.23"
number(1000.235, { decimalPlaces: 2, padDecimals: 2 }); // English "1,000.24"
number(1000, { decimalPlaces: 2, padDecimals: 2 }); // English "1,000.00"
number(1.2e9, { decimalPlaces: 2, padDecimals: 2 }); // English "1,200,000,000.00"
number(1000, { decimalPlaces: 2, padDecimals: 2 }); // English "1,000.00"
number(1000, { siUnits: TRUE }); //  "1k"
number(10000, { siUnits: TRUE }); //  "10k"
number(1300000, { siUnits: TRUE }); //  "1.3M"
number(123456, { siUnits: TRUE, decimalPlaces: 2 }); //  "123.46k"

And, quite separately, I think these would be super helpful,

duration(10975); // "3 hours, 2 mins, 55s"
duration(10975, {orders: 2}); // "3 hours, 3 mins"
// Would need locale specifics, too.
bytes(1024); // "1kB"
bytes(1363148); // "1.3MB"

@eileenmcnaughton
Copy link
Contributor Author

@artfulrobot so so the scenario where we have had a lot of run around with padding has been setting defaults on forms for money fields. These are money values - but we don't want the currency format (the legacy functions pass various $onlyNumber variants for this but calling money($justKidding = TRUE) feels wrong to me so I figured we should call number().

One page where we had a lot of times around the loop was price fields - you can configure an option with up to 8 decimal places. However, the expecation people has was that it would 'look like a money field ' ( 2 decimal places) UNLESS there were more than 2 but it should not pad it with zeros at the end - ie if it was 3 decimal places then we should show 3, not 8.

This is the most common usage of number formatting in core & one we've had a lot of pain with - so I want the call to get that to be as default as possible - even if it means adding moneyNumber() as an extra option with the behaviour as described

@eileenmcnaughton
Copy link
Contributor Author

Also - your examples ^^ will be good for putting in a unit test

I agree about duration. Bytes won't be useful unitl it is & they it will be super useful :-) It hardly seems hard to support

@eileenmcnaughton
Copy link
Contributor Author

Also - I don't really want to get into passing arrays as params for the really common values - they feel much less discoverable to me - more edge case ones like siUnits make sense in an array

@mattwire
Copy link
Contributor

For currency the default decimal places should be the actual definition for that currency (for most it's two but there are a few that have none and a few that have 3 or more). We got as far as implementing a placeholder function in CRM_Utils_Money::getCurrencyPrecision() with the intention it would look up the correct precision but never implemented anything other than 2 (basically we really need to use a library/lookup for this info).

As you already said there are various configuration forms (such as price-sets) where money amounts are using for calculations (eg. user will select 2x t-shirt + 3x beer). That's where it's really important that those amounts can be define with (I think) 4 more decimal places than the decimal places of the final number (eg. 3.33 needs to be 3.333333). This is especially important when tax calculations happen on top of all that.. But I agree with @eileenmcnaughton that we should treat them as "numbers" not "currency" at this point - the form will either display a fixed currency symbol or a currency dropdown next to the number field.
If people really want those fields to "look" like money we could always invent a field that updates a javascript representation next to the number field that looks "pretty".

@eileenmcnaughton
Copy link
Contributor Author

eileenmcnaughton commented Sep 28, 2021

@mattwire so I think the signature for money() above is fine - it's the number() that is tricky because we kinda was some money specific smarts since we know it's used a lot & we don't want to rely on people negotiating the correct parameters & we know from experience that people expect

12.3000000 to lose the trailing zeros but 12.333333 should not be rounded.

Given this IS a bit of special sauce - maybe having the 3 makes sense ie

money()
moneyNumber() - same params as money - does a bit of special rounding like our existing function to handle removing some of those zeros but not significant numbers
number() - the above means we are free to come up with 'the best words' for number's params...

@eileenmcnaughton
Copy link
Contributor Author

Also - I think there is mention about of padding numbers - what we have is the opposite- we are passing in already padded numbers & we want to strip the padding rather than round them for UI - ie we don't actually round the saved value of civicrm_membership.minium_fee when we present it - we DO trim it though - but never below 2 because it is money at the end of the day

@artfulrobot
Copy link
Contributor

artfulrobot commented Sep 28, 2021

Quick replies that I don't need to continue discussing

  • agree to leave off things we don't need right now is fine.
  • agree positional params are nicer for IDEs when writing code, although the result is harder to read. e.g. api4's join([['X AS y', false, null, []) or the URL functions with a million params - I can never recall the order when scanning code. I don't feel strongly but it's a good time to consider Kernighan's Law. $numberTo2DP = numberHelper($value, NULL, 2); looks fine, NULL means default locale, I think... but this looks right, too, while reading code: $numberTo2DP = numberHelper($value, 2); but only one is correct.
  • agree that if we need to support a common but fairly weirdo case, centralising our quirks in a separate helper function is helpful.

On other issues, I feel this is getting murkier! And as we're trying to make something as a "helper" I think we need more clarity. Specifically

  • We're talking about functions that explicitly return string representations of numbers. This is important - none of these functions should be used for money calculations.
  • Removing trailing zeros after a decimal point is not rounding a number, and we should not talk of it as such.
  • We may need other helpers for rounding/performing arithmetic on numbers in a way that is suitable to the use case.

@eileenmcnaughton
Copy link
Contributor Author

We probably don't need a helper for maths - because brick money can be used directly. This is, as you say, a formatting class

@totten
Copy link
Member

totten commented Oct 1, 2021

There are 6 functions... as I reader of that function-list, I interpret them to be complements -- pairs of operations that do opposite things.

Machine/Canonical ==> Formatted/Localized Formatted/Localized ==> Machine/Canonical
$f->date($date,$dateFormat,$timezone) $f->machineDate($date)
$f->number($amount,$locale,$precision,$isPadToPrecision) $f->machineNumber($amount, $precision)
$f->money($amount, $currency, $locale, $precision) $f->machineMoney($amount, $currency)

I don't know if the intent really is for them to be complements. But it seems a likely interpretation - and there are times you need the complementary actions. For example, suppose you put up "Contribution Page" with a recommended (but editable) amount. Pseudocode:

Donation: <input name="my_amount" value="$default_donation">
<button>Go</button>

The page lifecycle requires these complementary operations:

  • When you first display the page, you read the $default_donation from MySQL and then show it to the user. (Machine=>localized)
  • When the user posts back, you accept the my_amount from the browser and then store that in MySQL. (Localized=>machine)
  • You'll see numbers doing the full round-trip -- e.g. "New Year Fundraiser" recommends donation of 20.22 Euro. For a user in Germany, that presents as "€20,22"; for a user in Ireland, that's "€20.22".

There are a couple asymmetries in the current contract that stick out to me - things which suggest that the functions are not quite complements:

  • If $timezone is needed for date(), then don't you need it for machineDate() too?
  • If $locale is needed to number(), then don't you need it for machineNumber() too?
  • If $locale is needed for money(), then don't you need it for machineMoney() too?

How does that matter? Return to the example above and consider 2 users:

  • Ingrid (de_DE) submits my_amount=€20,22, and that needs to convert to machine value 20.22.
  • Then Callum (en_IE) submits my_amount=€20.22, and that needs to convert to the same machine value 20.22.
  • But the source-material comes from different locales.

(Of course, all of that is pseudocode. With real-world code, there's a bit of wildcard depending on the specific presentation widgets - e.g. sending/receiving data for <input type=text> or CSV will require different handling than sending/receiving data for <input type=number>, and both differ from <select><option>...)

@eileenmcnaughton
Copy link
Contributor Author

@totten so just to orient you - this PR in it's current state is not 'how I think it should be' - but 'how far I thought through things before my brain broke'

I think you are probaby right in your articulation of the complementariness and how that impacts the requirements

I suspect the dreaded `CRM_Utils_Rule::cleanMoney' - is what we currency have for 'machineMoney' and it 'deformats' submitted money

@eileenmcnaughton
Copy link
Contributor Author

@totten also - I think the actual code currently has signatures that don't quite match the PR template - I started to move towards having the very common things as parameters & everything else as an array - so if you really want to pass the atrribute to have your number returned as 'one hundred and forty three' you could

@eileenmcnaughton
Copy link
Contributor Author

Just noting @totten & I discussed a bit more more on chat & also started trying to get our heads around it here

https://docs.google.com/spreadsheets/d/1WpfVdfLyYoTIMPXLCPxTqDeGmgAsd1k20j3oj4eOe_c/edit#gid=0

@artfulrobot
Copy link
Contributor

Just noting that:

  • the numerical formatters now all use NumberFormatter and do a lot of confuguration and config lookups.
  • we may need to format large numbers of numbers at once and that may become inefficient
  • suggest move the bulk of those functions into factory methods that take the signatures (the contract) you want to offer and return a NumberFormatter so that can bd reused.
  • provide either a second function or extra param to switch between returning a formatted value (convenience) and the formatter object (bulk efficiency)

@eileenmcnaughton
Copy link
Contributor Author

@artfulrobot am happy to see a proposal for how it would look - but note that I see the priority as being something that deals with the most common usage with minimal parameters passed in & I have doubts about how much more we should worry about a scenario where it might impact performance since that if so many numbers are being formatting that performance is impacted it might be time for functions just to call brick money or the underlying php function directly

@artfulrobot
Copy link
Contributor

artfulrobot commented Oct 4, 2021

@eileenmcnaughton I was just meaning like

Currently you have...

public function number($amount, ?string $locale = NULL, array $attributes = [...]):string {
   $formatter = ...
   // lots of code to configure formatter.
   return $formatter->format($amount);
}

Where you could have

public function number($amount, ?string $locale = NULL, array $attributes = [...]):string {
   return $this->getNumberFormatter($amount, $locale, $attributes)->format($amount);
}
public function getNumberFormatter($amount, ?string $locale = NULL, array $attributes = [...]):NumberFormatter {
   $formatter = ...
   // lots of code to configure formatter.
   return $formatter;
}

So your function still work the same way as they do now.

But if someone was - I dunno - formatting a thousand numbers in a report or such that used the same locale and attributes, they could grab the formatter and call ->format($amount) in the loop instead of needing to do all the work in every iteration.

Just a suggestion, but perhaps it's optimising too soon.

@eileenmcnaughton
Copy link
Contributor Author

@artfulrobot cool - I'll look again tomorrow but I see what you mean

@eileenmcnaughton eileenmcnaughton force-pushed the format branch 5 times, most recently from b2445a0 to 92e449a Compare October 4, 2021 20:27
@eileenmcnaughton
Copy link
Contributor Author

eileenmcnaughton commented Oct 4, 2021

I've cut this PR back to just 4 tested functions which I'm now happy with & have test cover for

    Civi::format()->money($amount, $currency, $locale); 
    Civi::format()->moneyNumber($amount, $currency, $locale); // no currency symbol but rounding is currency based 
    Civi::format()->moneyNumberLong($amount, $currency, $locale); // no rounding but minimum padding based on currency
    Civi::format()->moneyLong($amount, $currency, $locale)); // no currency symbol but rounding is currency based
    Civi::format()->number($amount, $locale, $attributes); // defaults to rounding per our db field length
    Civi::format()->getNumberFormatter($currency, $locale, $style, $attributes);

Note this basically means that if you want something more complicated you should use the getNumberFormatter function

I pulled out the following because the tests need more work

    Civi::format()->date();

And all the machineMoney() type functions because I realised that the only function we have to call is CRM_Utils_Rule::cleanMoney()

I'm gonna move those to a new PR in the hope of getting this resolved

@eileenmcnaughton eileenmcnaughton force-pushed the format branch 4 times, most recently from 9992a94 to e1ba74e Compare October 4, 2021 21:14
@eileenmcnaughton
Copy link
Contributor Author

@totten @seamuslee001 @artfulrobot @mattwire - I'm happy with this now - with it's reduced scope. I pushed out the 'I want to do something else' to the getMoneyFormatter function & focussed the helpers in on the stuff we are currently doing in core. At some point it makes more sense for people to just interact directly with the php functions - once the civi-specific stuff we are adding is no longer relevant

if (!$currency) {
$currency = Civi::settings()->get('defaultCurrency');
}
$formatter = new NumberFormatter($locale, $style);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A local cache ($this->formatters[$locale . $currency] or similar) is probably worthwhile here... But you can check the numbers.

I ran a basic benchmark - generate ~1000 example numbers and format them with slight variations in the formatting strategy. On a relatively fast laptop (i7-9750H; no other major tasks) , this pretty consistently returned runtimes like so:

## Baseline/Control: use sprintf() with fixed length
staticPrintf: 0.001189 for 1002 amounts

## Use a single, static NumberFormatter - disregarding the variations in locale and currency
staticFmt: 0.062295 for 1002 amounts

## Create new NumberFormatter each time you format number
newFmt: 0.124439 for 1002 amounts across 3 locale-currency context(s)

## Create a few NumberFormatters, cache+reuse them
cachedFmt: 0.062991 for 1002 amounts across 3 locale-currency context(s)

For ~1,000 numbers, the newFmt is +120ms and the cachedFmt is +60ms.

I don't think it's a stretch to see ~1,000 numbers in one request - e.g. a data-table or spreadsheet summarizing contributions (with columns for total-amount, tax-amount, fee-amount, etc) would only need a couple hundred records. Granted, the more realistic target for a paginated data-table is more like (50 rows x 5 columns) - so only a quarter of that (say, 30ms vs 15ms). OTOH, the benchmark environment was rather favorable (wrt load-avg and CPU-type).

Anyway, that's not a hard block, but reusing the NumberFormatters is a fairly simple thing and cuts the runtime-cost in half.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@totten before I try to read ^^ - can you confirm what 'fmt' means?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"format"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - I've pushed up a caching fix

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try it in the benchmark.

Copy link
Member

@totten totten Oct 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated benchmark to be more complete/realistic. It now runs literal variations of Civi\Core\Format. https://gist.github.com/totten/6c2759bfc661b99a6ec594d752f9e782

I ran it with $format->number($amount, $locale) and $format->money($amount, $currency, $locale). The metadata-cache doesn't work (and doesn't make sense to me), so I converted it to static-cache. The names now correspond to the file being benchmarked.

Operation dummy-sprintf singleton new-inst static-cache static-cache-less-md5
number() 0.001 0.020 0.068 0.054 0.051
money() 0.001 0.319 0.390 0.331 0.315

Comparing new-inst.php and static-cache-less-md5.php in this benchmark... the gain for number() isn't quite as much as I'd expected from the more basic benchmark, but it is there (68ms vs 50ms). The gain for money() is a bit more notable (390ms vs 315ms).

=== benchmark of `number($amount, $locale)` ===

## Format all numbers with a dummy sprintf() call
Load: /Users/totten/bknix/build/dmaster/web/sites/all/modules/civicrm/tmp/bench/dummy-sprintf.php
dummy-sprintf.php: Warmup: 0.000003
dummy-sprintf.php: Runtime: 0.001531 for 1002 amounts across 3 locale-currency context(s)

## Format all numbers with the same single instance of NumberFormatter
Load: /Users/totten/bknix/build/dmaster/web/sites/all/modules/civicrm/tmp/bench/singleton.php
singleton.php: Warmup: 0.005992
singleton.php: Runtime: 0.020986 for 1002 amounts across 3 locale-currency context(s)

## Make a new instance of NumberFormatter for each number
Load: /Users/totten/bknix/build/dmaster/web/sites/all/modules/civicrm/tmp/bench/new-inst.php
new-inst.php: Warmup: 0.005537
new-inst.php: Runtime: 0.068378 for 1002 amounts across 3 locale-currency context(s)

## Cache NumberFormater in Civi::cache('metadata')
Load: /Users/totten/bknix/build/dmaster/web/sites/all/modules/civicrm/tmp/bench/metadata-cache.php
  [Symfony\Component\Debug\Exception\FatalThrowableError]
  Type error: Return value of Brick\Money\Money::formatWith() must be of the type string, boolean returned

## Cache NumberFormater in Civi::$statics
Load: /Users/totten/bknix/build/dmaster/web/sites/all/modules/civicrm/tmp/bench/static-cache.php
static-cache.php: Warmup: 0.008279
static-cache.php: Runtime: 0.054834 for 1002 amounts across 3 locale-currency context(s)

## Cache NumberFormater in Civi::$statics, but use trick to reduce `md5(json_encode(...))` work.
Load: /Users/totten/bknix/build/dmaster/web/sites/all/modules/civicrm/tmp/bench/static-cache-less-md5.php
static-cache-less-md5.php: Warmup: 0.007369
static-cache-less-md5.php: Runtime: 0.051832 for 1002 amounts across 3 locale-currency context(s)

=== benchmark of `money($amount, $currency, $locale)`

## Format all numbers with a dummy sprintf() call
Load: /Users/totten/bknix/build/dmaster/web/sites/all/modules/civicrm/tmp/bench/dummy-sprintf.php
dummy-sprintf.php: Warmup: 0.000003
dummy-sprintf.php: Runtime: 0.001588 for 1002 amounts across 3 locale-currency context(s)

## Format all numbers with the same single instance of NumberFormatter
Load: /Users/totten/bknix/build/dmaster/web/sites/all/modules/civicrm/tmp/bench/singleton.php
singleton.php: Warmup: 0.013554
singleton.php: Runtime: 0.319169 for 1002 amounts across 3 locale-currency context(s)

## Make a new instance of NumberFormatter for each number
Load: /Users/totten/bknix/build/dmaster/web/sites/all/modules/civicrm/tmp/bench/new-inst.php
new-inst.php: Warmup: 0.005445
new-inst.php: Runtime: 0.390538 for 1002 amounts across 3 locale-currency context(s)

## Cache NumberFormater in Civi::cache('metadata')
Load: /Users/totten/bknix/build/dmaster/web/sites/all/modules/civicrm/tmp/bench/metadata-cache.php
  [Symfony\Component\Debug\Exception\FatalThrowableError]
  Type error: Return value of Brick\Money\Money::formatWith() must be of the type string, boolean returned

## Cache NumberFormater in Civi::$statics
Load: /Users/totten/bknix/build/dmaster/web/sites/all/modules/civicrm/tmp/bench/static-cache.php
static-cache.php: Warmup: 0.007789
static-cache.php: Runtime: 0.331434 for 1002 amounts across 3 locale-currency context(s)

## Cache NumberFormater in Civi::$statics, but use trick to reduce `md5(json_encode(...))` work.
Load: /Users/totten/bknix/build/dmaster/web/sites/all/modules/civicrm/tmp/bench/static-cache-less-md5.php
static-cache-less-md5.php: Warmup: 0.009057
static-cache-less-md5.php: Runtime: 0.315846 for 1002 amounts across 3 locale-currency context(s)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@totten ah - we can't cache an object. Ah well using array-backed Redis caching makes a difference on single page loads but not batch jobs anyway

I guess it would be interesting to compare against the existing Money::format function - but we probably have enough info

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think a static-cache is low-hanging-fruit.

I'm not certain what an apples-to-apples benchmark would be for CRM_Utils_Money::format(). I took a quick stab and ran a few times locally - for run_benchmark_money(), it seems to consistently come in between (static-cache.php beats crm-utils.php which beats new-inst.php). For run_benchmark_number(), it comes in as the slowest of the pack --but it may just be that I didn't find the right operation to compare against.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I flipped it to Civi::statics

@eileenmcnaughton eileenmcnaughton force-pushed the format branch 2 times, most recently from 147ad3b to b672ee9 Compare October 5, 2021 09:17
Civi::format()->money();
Civi::format()->number();
Civi::format()->moneyNumber();
Civi::format()->moneyLong();
'money_number_long' => '1 234,500',
],
];
return $cases;
Copy link
Member

@totten totten Oct 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured en_IN / INR would be interesting to try (eg the comma rules are different). This is what currently passes on my system:

Suggested change
return $cases;
$cases['en_IN_INR'] = [
[
'amount' => '123456789',
'locale' => 'en_IN',
'currency' => 'INR',
'money' => '₹ 12,34,56,789.00',
'money_number' => '12,34,56,789.00',
'number' => '12,34,56,789',
'money_long' => '₹ 12,34,56,789.00',
'money_number_long' => '12,34,56,789.00',
],
];
return $cases;

It handled the 22,22,333 commas (yay) - but the decimals seem a little suspicious to me. My recollection is that prices are usually whole rupees, and https://en.wikipedia.org/wiki/Indian_paisa says "As of 2019, all paisa coins have been demonetised." So I'm not sure if that's a subjective policy issue (eg "The currency database lists 2-digits because we still need to present older 50p transactions!") or if that's a technical issue ("the formatter doesn't handle 0-decimals").

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The decimal nuance comes from the brick money library - I think they talk about Japanese having no decimal places - but I do think it relies on a list within that library

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, upstream list describes INR as 2-decimal-digits. It only provides one length (eg getDefaultFractionDigits()) whereas NumberFormatter allows a distinction between min/max fraction digits. Maybe this is a use-case for min/max digits? (ie "Most data is .00 and that should be dropped, but if your data says .50 then show it".) But obviously, someone (somewhere) would need to hardcode an opinion of INR, 0-2 digits, which disagrees with our current data.

So there might be some room for improvement in that, but I think the new API (Civi::format()->money(...)) should still look the same. It shouldn't block refactoring/improvement of the interface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@totten yeah - I think the upstream library has enough use we can rely on someone who has first-hand knowledge of a currency patching it over time

@totten totten changed the title dev/core#2638 Add Helper class for formatting (dev/core#2638) Civi::format() - Add helper for formatting Oct 6, 2021
@totten
Copy link
Member

totten commented Oct 6, 2021

OK, this has been updated to include a few rounds of feedback from a few people, and it's been focused a bit more than the initial draft. I don't think there are any outstanding issues. It's got several tests for the helpers. Let's merge it.

@totten totten merged commit 79b1f8b into civicrm:master Oct 6, 2021
@eileenmcnaughton eileenmcnaughton deleted the format branch October 6, 2021 10:16
@eileenmcnaughton
Copy link
Contributor Author

Yay - the next step is to add to support this via token render.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants