Software that communicates with users often needs to insert dynamic data into strings for presentation. Cocoa Foundation’s solution for this is printf()-style formatting, which is fundamentally unsuitable for the task, for two reasons:
- There are many formatting options, none of which are suitable for producing well-formatted prose text.
- The interpretation of data on the stack, including its length, is specified in the formatting string itself. This means that format strings loaded from data can crash your application. This is problematic for integrated localization, and a deal-breaker for other use cases such as sandboxed plug-ins.
The C standard library has a third problem: the %n specifier can be used to write arbitrary data onto the stack, which is serious business. Foundation does not implement %n, but malicious format strings can still be used to read data you didn’t intend to expose, or simply crash your app.
In short, I feel that printf() and +[NSString stringWithFormat:] should be deprecated. For producing text in formal languages for computer consumption, I suggest a fully-fledged template system such as MGTemplateEngine. But for logging, presenting alerts, and hacking together command-line tools, printf()-style formatting wins on convenience. This is an attempt at beating printf() on its own ground.
JATemplate provides a family of macros and functions for inserting variables into strings. Convenience wrappers are provided for using it in conjunction with NSLog(), NSAssert() and -[NSMutableString appendString:].
JATemplate is currently experimental. The syntax and operators are in flux, and I’m not satisfied with the robustness of the parser. That said, fuzz testing has repeatedly found a crashing bug in CoreFoundation and/or ICU, but no crashes, assertions or unexpected warnings in JATemplate itself. It is certainly far safer than +[NSString stringWithFormat:].
To date, it has only been tested on Mac OS X 10.8 with ARC. Some formatting operators have known incompatibilities with Mac OS X 10.7 and iOS 5.
NSString *flavor = @"strawberry";
NSString *size = @"large";
unsigned scoopCount = 3;
NSString *message = JATExpand(@"My {size} ice cream tastes of {flavor} and has {scoopCount} scoops!", flavor, size, scoopCount);
Because easy internationalization is a central goal in the design, JATExpand() looks up the format string (template) in Localizable.strings by default. There are variants to control this behaviour.
Templates can directly refer to variables by name, but they only have access to variables specified at the call site. Parameters can also be referred to by position; in the example, {0} could be used instead of {flavor}. Name references are less error-prone and easier to localize, but positional references allow you to refer to an expression without creating a temporary variable. This is particularly useful in logging and assertions, which are less likely to be localized anyway.
The default behaviour for numerical parameters is to format them with NSNumberFormatter’s NSNumberFormatterDecimalStyle. If scoopCount is set to 1000 in the example above, it is printed as 1,000 in English locales.
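The decimal-style formatting described above can be reproduced with Foundation alone. The following is a minimal sketch, not JATemplate’s own code: the helper name FormatDecimalForDemo is hypothetical, and the locale is passed in explicitly (an assumption made so that the result does not depend on the machine’s settings).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, for illustration only: formats a number with
// NSNumberFormatterDecimalStyle in an explicitly chosen locale.
static NSString *FormatDecimalForDemo(NSNumber *number, NSString *localeIdentifier)
{
	NSNumberFormatter *formatter = [[NSNumberFormatter alloc] init];
	formatter.numberStyle = NSNumberFormatterDecimalStyle;
	formatter.locale = [NSLocale localeWithLocaleIdentifier:localeIdentifier];
	return [formatter stringFromNumber:number];
}
```

With the en_US locale, FormatDecimalForDemo(@1000, @"en_US") yields 1,000, matching the behaviour described for English locales.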
Parameters may be Objective-C objects, any C number type, C strings, C++ std::strings (in Objective-C++), NSPoints, NSSizes, NSRects, NSRanges, CFStrings, CFNumbers or CFBooleans. Support for other types can easily be added; see Customization below.
The most important feature of the design is that even though JATExpand() et al. are variadic, the number of arguments passed is fixed at compile time, and their types are all known. If a format string refers to a non-existent parameter, either by name or by index, that reference will simply not be expanded.
The formatting of expanded parameters can be modified by appending formatting operators, separated by a pipe character:
NSString *intensifier = @"really";
NSString *message = JATExpand(@"I {intensifier|uppercase} like ice cream!", intensifier);
// Produces “I REALLY like ice cream!”
Multiple operators can be chained together in the obvious fashion. Operators may optionally take an argument, separated by a colon. By convention, operators that need to split the argument into parts use semicolons as a separator.
NSString *message = JATExpand(@"Pi is about {0|round|num:spellout}.", M_PI);
// Produces “Pi is about three.”
// BUG 2013-02-01: Some people’s ice cream only has one scoop. :-(
// FIX: support pluralization.
NSString *message = JATExpand(@"My {size} ice cream tastes of {flavor} and has {scoopCount} {scoopCount|plural:scoop.;scoops!}", flavor, size, scoopCount);
For the full set of built-in operators, see Built-in operators below. The num: operator and the pluralization operators are particularly important.
The full list of string expanding functions and macros, and their notional signatures, is as follows. All variadic arguments (...) actually take a series of zero or more objects, and are type safe (as much as pointers in C are in general).
- NSString *JATExpand(NSString *template, ...) – Looks up template in Localizable.strings in the same manner as NSLocalizedString(), then expands substitution expressions in the resulting template using the provided parameters.
- NSString *JATExpandLiteral(NSString *template, ...) – Like JATExpand(), but skips the localization step.
- NSString *JATExpandFromTable(NSString *template, NSString *table, ...) – Like JATExpand(), but looks up the template in the specified .strings file (like NSLocalizedStringFromTable()). The table name should not include the .strings extension.
- NSString *JATExpandFromTableInBundle(NSString *template, NSString *table, NSBundle *bundle, ...) – Like JATExpand(), but looks up the template in the specified .strings file and bundle (like NSLocalizedStringFromTableInBundle()).
- NSString *JATExpandWithParameters(NSString *template, NSDictionary *parameters) – Like JATExpand(), but passes the parameters in a dictionary. “Positional” parameters in this case are looked up using NSNumbers as keys.
- NSString *JATExpandLiteralWithParameters(NSString *template, NSDictionary *parameters) – Like JATExpandWithParameters(), but without the localization step.
- NSString *JATExpandFromTableWithParameters(NSString *template, NSString *table, NSDictionary *parameters) and NSString *JATExpandFromTableInBundleWithParameters(NSString *template, NSString *table, NSBundle *bundle, NSDictionary *parameters) – The corresponding dictionary-based variants also exist.
- void JATAppend(NSMutableString *string, NSString *template, ...), void JATAppendLiteral(NSMutableString *string, NSString *template, ...), void JATAppendFromTable(NSMutableString *string, NSString *template, NSString *table, ...), void JATAppendFromTableInBundle(NSMutableString *string, NSString *template, NSString *table, NSBundle *bundle, ...) – Append an expanded template to a mutable string; equivalent to [string appendString:JATExpand*(template, ...)].
- void JATLog(NSString *template, ...) – Performs non-localized expansion and sends the result to NSLog().
- void JATPrint(NSString *template, ...) and void JATPrintLiteral(NSString *template, ...) – Write to stdout, like printf().
- void JATErrorPrint(NSString *template, ...) and void JATErrorPrintLiteral(NSString *template, ...) – Write to stderr, like fprintf(stderr, ...).
- JATAssert(condition, template, ...) and JATCAssert(condition, template, ...) – Wrappers for NSAssert() and NSCAssert() which perform template expansion on failure.
There are three major ways to customize JATemplate: custom coercion methods, custom operators, and custom casting handlers.
The three coercion methods in the protocol <JATCoercible> are used by operators and the template expansion system to interpret parameters as particular types. They are implemented on NSObject and a few other classes, but can be overridden to customize the treatment of your own classes.
- -jatemplateCoerceToString returns an NSString *. In addition to being used by operators, it is used by the template expander to produce the final string that will be inserted into the template. The default implementation calls -description. It is overridden for NSString to return self, for NSNumber to use NSNumberFormatterDecimalStyle, for NSNull to return @"(null)", and for NSArray to return a comma-separated list.
- -jatemplateCoerceToNumber returns an NSNumber *. The default implementation will look for methods -(double)doubleValue, -(float)floatValue, -(NSInteger)integerValue or -(int)intValue, in that order. If none of these is found, it returns nil, which causes expansion to fail. It is overridden by NSNumber to return self.
- -jatemplateCoerceToBoolean returns an NSNumber * which is treated as a boolean. The default implementation calls -(BOOL)boolValue if implemented, otherwise returns nil. Overridden by NSNull to return @NO.
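As an illustration of overriding the coercion methods, here is a minimal sketch for a model class. The class IceCreamOrder and its properties are hypothetical. The two coercion methods are declared directly in the interface so that the sketch compiles without the JATemplate headers; in a real project the declarations would presumably come from <JATCoercible>.

```objc
#import <Foundation/Foundation.h>

// Hypothetical model class, used only for illustration.
@interface IceCreamOrder : NSObject
@property (nonatomic, copy) NSString *flavor;
@property (nonatomic) NSUInteger scoopCount;
// Declared here so the sketch stands alone; method names and return types
// follow the descriptions of the coercion methods above.
- (NSString *)jatemplateCoerceToString;
- (NSNumber *)jatemplateCoerceToNumber;
@end

@implementation IceCreamOrder

// Replaces the default -description-based string with something presentable.
- (NSString *)jatemplateCoerceToString
{
	NSString *flavor = self.flavor ?: @"unknown";
	return [flavor stringByAppendingString:@" ice cream"];
}

// Lets numeric operators treat the order as its scoop count.
- (NSNumber *)jatemplateCoerceToNumber
{
	return @(self.scoopCount);
}

@end
```

With overrides like these, an IceCreamOrder passed as a template parameter should be inserted using its custom string, and number-based operators should see the scoop count.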
Operators are implemented as methods following this template:
-(id <JATCoercible>)jatemplatePerform_{operator}_withArgument:(NSString *)argument
variables:(NSDictionary *)variables
The receiver is the object being formatted – either one of the parameters to the template or the result of a previous operator in a chain. (For a nil parameter, the operator message is sent to [NSNull null].) The argument is the string following the colon in the operator invocation, or nil if there was no colon. The variables dictionary contains all the parameters to the expansion operation; named parameters are addressed with NSString keys, and positional parameters with NSNumber keys. For example, this is the implementation of the uppercase operator:
- (id <JATCoercible>) jatemplatePerform_uppercase_withArgument:(NSString *)argument
                                                     variables:(NSDictionary *)variables
{
	// Coerce the receiver to a string; a nil result makes the expansion fail.
	NSString *value = [self jatemplateCoerceToString];
	if (value == nil) return nil;
	return [value uppercaseStringWithLocale:[NSLocale currentLocale]];
}
In most cases, operators should be implemented in a category on NSObject, and coerce the receiver to whatever class is relevant for the operation. However, it may be reasonable to implement specialized class-specific operators as, say, a category on a model object class.
Casting handlers are used to convert parameters to Objective-C objects. They are defined as inline functions using the JATDefineCast(TYPE) macro, as follows:
JATDefineCast(const char *)
{
	return @(value);
}
The value to be converted is of the specified type and named value. The return type is id <JATCoercible>.
(The macro is used to allow the same definition to work in Objective-C, using a clang extension, and in Objective-C++. If you don’t need the cross-language compatibility, you can copy the appropriate prototype from the header instead. There are probably good use cases for templated casting handlers in Objective-C++.)
The “built-in” operators are actually implemented in a separate file, JATemplateDefaultOperators.m. If you don’t like them, you can just exclude this file and write your own. Selecting a good set of operators is perhaps the most difficult design aspect of the library. Some that are currently missing are date formatting and hexadecimal numbers.
These operators coerce the receiver to a number using -jatemplateCoerceToNumber.
- num: – Format a number using one of several predefined formats, or an ICU/NSNumberFormatter format string. The predefined formats are:
  - decimal or dec – Locale-sensitive decimal formatting using NSNumberFormatterDecimalStyle. This is the default for NSNumbers.
  - noloc – Non-locale-sensitive formatting using -[NSNumber description].
  - hex or HEX – Unsigned hexadecimal formatting, using lowercase or uppercase characters respectively. Takes an optional argument specifying the number of digits ("0x{foo|num:hex;8}"). Not localized.
  - currency or cur – Locale-sensitive currency formatting using NSNumberFormatterCurrencyStyle.
  - percent or pct – Locale-sensitive percentage notation using NSNumberFormatterPercentStyle.
  - scientific or sci – Locale-sensitive scientific notation using NSNumberFormatterScientificStyle.
  - spellout – Locale-sensitive text formatting using NSNumberFormatterSpellOutStyle.
  - filebytes, file or bytes – Byte count formatting using NSByteCountFormatterCountStyleFile.
  - memorybytes or memory – Byte count formatting using NSByteCountFormatterCountStyleMemory.
  - decimalbytes – Power-of-ten byte count formatting using NSByteCountFormatterCountStyleDecimal.
  - binarybytes – Power-of-two byte count formatting using NSByteCountFormatterCountStyleBinary.
- round – Round to an integer, rounding half-way cases away from zero (“school rounding”).
- plur: – A powerful pluralization operator with support for many languages. It takes three to seven arguments separated by semicolons. The first is a number specifying a pluralization rule, and the others are different word forms determined by the rule. The rules are the same as used by Mozilla’s PluralForm system (except that rule 0 is not supported or needed).
  For example, the template "{count} minute{count|plural:s}" might be translated to Polish as "{count} {count|plur:9;minuta;minuty;minut}".
  The selected string is also expanded as a template, so it’s possible to do things like "{count|plur:1;{singularString};{pluralString}}". This is generally a bad idea, because handling all the different language rules this way is likely to be impossible, but it’s there if you need it. (This also works with plural: and pluraz:.)
- plural: – A simplified plural operator for languages that use the same numeric inflection structure as English (plur: rule 1). If one argument is given, the empty string is used for a singular value and the argument is used for plural. If two arguments separated by a semicolon are given, the first is singular and the second is plural.
  Example: "I have {gooseCount} {gooseCount|plural:goose;geese}."
- pluraz: – Like plural:, except that a count of zero is treated as singular (plur: rule 2).
  Example: "J’ai {gooseCount} {gooseCount|pluraz:oie;oies}.", or equivalently "J’ai {gooseCount} oie{gooseCount|pluraz:s}."
- select: – Takes any number of arguments separated by semicolons. The receiver is coerced to a number and truncated to an integer. The corresponding argument is selected (and expanded). Arguments are numbered from zero; if the value is out of range, the last item is used.
  Example: "Today is {weekDay|select:Mon;Tues;Wednes;Thurs;Fri;Satur;Sun}day."
- padding – Truncates the value to an integer and produces the corresponding number of spaces. Negative values are treated as 0.
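To make the Polish plur: example above more concrete, the following sketch shows how a form could be selected under Mozilla PluralForm rule 9 (one form for exactly 1, one for counts ending in 2 to 4 except 12 to 14, one for everything else). Both helper functions are hypothetical and are not taken from JATemplate; the sketch only handles non-negative integers, and how JATemplate itself treats fractions or negative values is not covered here.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: index of the word form for PluralForm rule 9.
// 0 = exactly 1; 1 = ends in 2-4 but not 12-14; 2 = everything else.
static NSUInteger PluralRule9FormIndex(NSUInteger count)
{
	if (count == 1) return 0;
	NSUInteger lastDigit = count % 10;
	NSUInteger lastTwoDigits = count % 100;
	BOOL endsInTwoToFour = (lastDigit >= 2 && lastDigit <= 4);
	BOOL isTwelveToFourteen = (lastTwoDigits >= 12 && lastTwoDigits <= 14);
	return (endsInTwoToFour && !isTwelveToFourteen) ? 1 : 2;
}

// Hypothetical helper: picks a form from a semicolon-separated list such as
// @"minuta;minuty;minut", i.e. the plur: arguments after the rule number.
static NSString *SelectRule9Form(NSUInteger count, NSString *forms)
{
	NSArray *parts = [forms componentsSeparatedByString:@";"];
	NSUInteger index = PluralRule9FormIndex(count);
	// Fall back to the last form if fewer than three were supplied.
	if (index >= parts.count) index = parts.count - 1;
	return parts[index];
}
```

For the forms minuta;minuty;minut this selects minuta for 1, minuty for 2 and 22, and minut for 5 and 12.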
These operators coerce the receiver to a string using -jatemplateCoerceToString.
- uppercase, lowercase and capitalize – Locale-sensitive conversion to capitals/lower case/naïve title case using -[NSString uppercaseStringWithLocale:] etc.
- uppercase_noloc, lowercase_noloc and capitalize_noloc – Locale-insensitive conversion to capitals/lower case/naïve title case using -[NSString uppercaseString] etc.
- trim – Removes leading and trailing whitespace and newlines.
- length – Produces the length of the receiver (coerced to a string).
- fold: – Locale-sensitive removal of lesser character distinctions using -[NSString stringByFoldingWithOptions:locale:]. The argument is a comma-separated list of options. The currently supported options are case, width and diacritics.
- fit: – Truncates or pads the string as necessary to fit in a particular number of characters. It is intended for column formatting in command-line tools and logging, and is of little use with variable-width fonts. It takes one to four arguments separated by semicolons:
  - The first is the desired width, a positive integer.
  - The second is start, center, end or none, specifying where padding will be added if necessary. The default is end. center means padding will be added at both ends. none means no padding will be added.
  - The third is also start, center, end or none, specifying where truncation will occur if necessary. The default is end (irrespective of the second argument). none means no truncation will occur.
  - The fourth is a string to insert when truncating. The default is … (a single-character ellipsis). This may be any string, including the empty string or a string longer than the fit width; in the latter case, truncation will return the full replacement string and nothing else.
- trunc: – Truncates the string without adding a placeholder. Takes one or two arguments, the first being the desired length and the second being a mode string as for fit:. trunc:x;y is equivalent to fit:x;none;y;.
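The default behaviour of fit: (pad at the end, truncate at the end, insert a marker when truncating) can be sketched with plain NSString methods. This is not JATemplate’s implementation: the function FitToWidth is hypothetical, it assumes the marker counts towards the target width, and it measures length in UTF-16 units, which is a simplification.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper sketching end-padding and end-truncation to a fixed width.
static NSString *FitToWidth(NSString *string, NSUInteger width, NSString *marker)
{
	if (string.length == width) return string;

	if (string.length < width)
	{
		// Too short: pad with spaces at the end.
		return [string stringByPaddingToLength:width withString:@" " startingAtIndex:0];
	}

	// Too long: if the marker alone fills or exceeds the width, return only
	// the marker, as described for over-long replacement strings.
	if (marker.length >= width) return marker;

	// Otherwise keep as many leading characters as fit next to the marker.
	NSString *kept = [string substringToIndex:width - marker.length];
	return [kept stringByAppendingString:marker];
}
```

Under these assumptions, fitting strawberry to a width of 6 with the default marker gives straw…, and fitting abc to a width of 5 gives abc followed by two spaces.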
These operators coerce the receiver to a boolean using -jatemplateCoerceToBoolean.
- if: – Takes one or two arguments separated by a semicolon. If the receiver, as a boolean, is true, selects the first argument. Otherwise selects the second argument (or the empty string if none). The selected argument is then expanded as a template.
  Example: "The flag is {flag|if:set;not set}."
- or: – If the receiver is a string, it is treated as false only if it is empty. If the receiver is considered true, the expression is expanded to the receiver. Otherwise, the argument to or: is expanded.
  Example: {foo|or:{bar}} is equivalent to foo if it is truthy, and the expansion of {bar} otherwise.
- pointer – Produces the address of the receiver, formatted as with %p. (NSNull is treated as nil, since the distinction can’t be made in an operator.)
- basedesc – Produces the class name and address of the receiver, suitable for use in implementing -description.
- debugdesc – Calls -debugDescription on the receiver if implemented, otherwise -description.