Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept attributes design & remove spec note #845

Merged
merged 6 commits into from
Aug 20, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion exploration/expression-attributes.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Expression Attributes

Status: **Proposed**
Status: **Accepted**

<details>
<summary>Metadata</summary>
Expand All @@ -15,6 +15,8 @@ Status: **Proposed**
<dd><a href="https://github.com/unicode-org/message-format-wg/pull/772">#772</a></dd>
<dd><a href="https://github.com/unicode-org/message-format-wg/pull/780">#780</a></dd>
<dd><a href="https://github.com/unicode-org/message-format-wg/pull/792">#792</a></dd>
<dd><a href="https://github.com/unicode-org/message-format-wg/pull/845">#845</a></dd>
eemeli marked this conversation as resolved.
Show resolved Hide resolved
<dd><a href="https://github.com/unicode-org/message-format-wg/pull/846">#846</a></dd>
</dl>
</details>

Expand Down
13 changes: 11 additions & 2 deletions spec/data-model/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -203,8 +203,6 @@ interface UnsupportedExpression {
annotation: UnsupportedAnnotation;
attributes: Attributes;
}

type Attributes = Map<string, Literal | VariableRef | true>;
```

## Expressions
Expand Down Expand Up @@ -283,6 +281,17 @@ interface Markup {
}
```

## Attributes

`Attributes` is a key-value mapping
used to represent the _expression_ and _markup_ _attributes_.

_Attributes_ with no value are represented by `true` here.

```ts
type Attributes = Map<string, Literal | true>;
```

## Extensions

Implementations MAY extend this data model with additional interfaces,
Expand Down
2 changes: 1 addition & 1 deletion spec/data-model/message.dtd
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@

<!ELEMENT unsupportedAnnotation (#PCDATA)>

<!ELEMENT attribute (literal | variable)?>
<!ELEMENT attribute (literal)?>
<!ATTLIST attribute name NMTOKEN #REQUIRED>

<!ELEMENT markup (option*, attribute*)>
Expand Down
2 changes: 1 addition & 1 deletion spec/data-model/message.json
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,7 @@
"attributes": {
"type": "object",
"additionalProperties": {
"oneOf": [{ "$ref": "#/$defs/literal-or-variable" }, { "const": true }]
"oneOf": [{ "$ref": "#/$defs/literal" }, { "const": true }]
}
},

Expand Down
6 changes: 2 additions & 4 deletions spec/formatting.md
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,8 @@ or even use them in their internal processing,
as long as the final _formatting_ result is made available to users
and the observable behavior of the formatter matches that described here.

_Attributes_ MUST NOT affect the processing or output of a _message_.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be too broadly stated, since the purpose of attributes might be the "processing" of a message (as, for example, in tools).

If we wanted to prevent attributes from having an effect, we'd be better off to say that implementations must not/should not expose attribute values to functions (if functions can't see the attributes, they can't do anything with them)?

Perhaps:

Suggested change
_Attributes_ MUST NOT affect the processing or output of a _message_.
_Attributes_ have no effect on selection or formatting in a _message_.

... and perhaps adding:

Implementations MUST NOT pass attributes to functions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a slightly wider ban than attributes' visibility to functions, to prohibit e.g. the formatted parts representation of a message from reflecting attribute values.

Would this work?

Suggested change
_Attributes_ MUST NOT affect the processing or output of a _message_.
_Attributes_ MUST NOT have any effect on the formatted output of a _message_,
or be made available to function implementations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested a no-MUSTard version in part because I don't know all the use cases that an expression attribute might be put to and in part because I don't know how to test it. If an implementation decided to support some magical attributes, how would it be harmful if (for example) they affected the formatted parts? We don't define any expression attributes, so we could leave it up to implementations to decide what they mean. The statement I suggest should guide implementers away from using attributes as some sort of option factory.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an implementation decided to support some magical attributes, how would it be harmful if (for example) they affected the formatted parts?

The MUST NOT establishes a promise that when a message is being processed for formatting only, all of the attributes can be trimmed out. This allows for potentially very verbose content to be included in attributes, given that doing so has no impact on the runtime size or parsing requirements.

Without the mustard, it is much more difficult to make this guarantee, as tools can't be certain about what they can and cannot trim out. It would also significantly fuzzify the boundary between "options" and "attributes", if they can have the same effect.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So @Locale=fr cannot be an attribute, right? Because it would affect the results of formatting.

It sounds like 'attributes' are really something like 'metadata'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sounds like u: options now handle cases like locale settings?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So @Locale=fr cannot be an attribute, right? Because it would affect the results of formatting.

That's right.

It sounds like 'attributes' are really something like 'metadata'

Yes. I would be entirely happy with renaming them as metadata, in case there's a sense that that would make it clearer.

Ah, sounds like u: options now handle cases like locale settings?

Yes, PR #846 introduces a u:locale option for just that.


## Formatting Context

A message's **_<dfn>formatting context</dfn>_** represents the data and procedures that are required
Expand Down Expand Up @@ -762,10 +764,6 @@ MUST be an empty string.
Implementations MAY offer functionality for customizing this,
such as by emitting XML-ish tags for each _markup_.

_Attributes_ are reserved for future standardization.
Other than checking for valid syntax, they SHOULD NOT
affect the processing or output of a _message_.

### Examples

_This section is non-normative._
Expand Down
66 changes: 13 additions & 53 deletions spec/syntax.md
Original file line number Diff line number Diff line change
Expand Up @@ -90,7 +90,7 @@ Attempting to parse a _message_ that is not _well-formed_ will result in a _Synt
A _message_ is **_<dfn>valid</dfn>_** if it is _well-formed_ and
**also** meets the additional content restrictions
and semantic requirements about its structure defined below for
_declarations_, _matcher_ and _options_.
_declarations_, _matcher_, _options_, and _attributes_.
Attempting to parse a _message_ that is not _valid_ will result in a _Data Model Error_.

## The Message
Expand Down Expand Up @@ -154,7 +154,7 @@ message = simple-message / complex-message

A **_<dfn>simple message</dfn>_** contains a single _pattern_,
with restrictions on its first non-whitespace character.
An empty string is a valid _simple message_.
An empty string is a _valid_ _simple message_.

Whitespace at the start or end of a _simple message_ is significant,
and a part of the _text_ of the _message_.
Expand Down Expand Up @@ -445,7 +445,7 @@ There MAY be any number of additional _selectors_.

A **_<dfn>variant</dfn>_** is a _quoted pattern_ associated with a list of _keys_ in a _matcher_.
Each _variant_ MUST begin with a sequence of _keys_,
and terminate with a valid _quoted pattern_.
and terminate with a _valid_ _quoted pattern_.
The number of _keys_ in each _variant_ MUST match the number of _selectors_ in the _matcher_.

Each _key_ is separated from each other by whitespace.
Expand Down Expand Up @@ -555,7 +555,7 @@ A _function_'s entry in the _function registry_ will define
whether the _function_ is a _selector_ or formatter (or both),
whether an _operand_ is required,
what form the values of an _operand_ can take,
what _options_ and _option_ values are valid,
what _options_ and _option_ values are acceptable,
and what outputs might result.
See [function registry](./registry.md) for more information.

Expand Down Expand Up @@ -587,7 +587,7 @@ Multiple _options_ are permitted in an _annotation_.
_Options_ are separated from the preceding _function_ _identifier_
and from each other by whitespace.
Each _option_'s _identifier_ MUST be unique within the _annotation_:
an _annotation_ with duplicate _option_ _identifiers_ is not valid.
an _annotation_ with duplicate _option_ _identifiers_ is not _valid_.

The order of _options_ is not significant.

Expand Down Expand Up @@ -723,7 +723,8 @@ markup = "{" [s] "#" identifier *(s option) *(s attribute) [s] ["/"] "}" ; open
> {#button}Submit{/button} or {#img alt=|Cancel| /}.
> ```

> A _message_ with attributes in the closing tag:
> A _message_ containing _markup_ that uses _options_ to pair
> two closing markup _placeholders_ to the one open markup _placeholder_:
>
> ```
> {#ansi attr=|bold,italic|}Bold and italic{/ansi attr=|bold|} italic only {/ansi attr=|italic|} no formatting.}
Expand All @@ -737,66 +738,25 @@ on the pairing, ordering, or contents of _markup_ during _formatting_.

## Attributes

**_Attributes_ are reserved for standardization by future versions of this specification._**
Examples in this section are meant to be illustrative and
might not match future requirements or usage.

> [!NOTE]
> The Tech Preview does not provide a built-in mechanism for overriding
> values in the _formatting context_ (most notably the locale)
> Nor does it provide a mechanism for identifying specific expressions
> such as by assigning a name or id.
> The utility of these types of mechanisms has been debated.
> There are at least two proposed mechanisms for implementing support for
> these.
> Specifically, one mechanism would be to reserve specifically-named options,
> possibly using a Unicode namespace (i.e. `locale=xxx` or `u:locale=xxx`).
> Such options would be reserved for use in any and all functions or markup.
> The other mechanism would be to use the reserved "expression attribute" syntax
> for this purpose (i.e. `@locale=xxx` or `@id=foo`)
> Neither mechanism was included in this Tech Preview.
> Feedback on the preferred mechanism for managing these features
> is strongly desired.
>
> In the meantime, function authors and other implementers are cautioned to avoid creating
> function-specific or implementation-specific option values for this purpose.
> One workaround would be to use the implementation's namespace for these
> features to insure later interoperability when such a mechanism is finalized
> during the Tech Preview period.
> Specifically:
> - Avoid specifying an option for setting the locale of an expression as different from
> that of the overall _message_ locale, or use a namespace that later maps to the final
> mechanism.
> - Avoid specifying options for the purpose of linking placeholders
> (such as to pair opening markup to closing markup).
> If such an option is created, the implementer should use an
> implementation-specific namespace.
> Users and implementers are cautioned that such options might be
> replaced with a standard mechanism in a future version.
> - Avoid specifying generic options to communicate with translators and
> translation tooling (i.e. implementation-specific options that apply to all
> functions.
> The above are all desirable features.
> We welcome contributions to and proposals for such features during the
> Technical Preview.

An **_<dfn>attribute</dfn>_** is an _identifier_ with an optional value
that appears in an _expression_ or in _markup_.
During formatting, _attributes_ have no effect,
and they can be treated as code comments.

_Attributes_ are prefixed by a U+0040 COMMERCIAL AT `@` sign,
followed by an _identifier_.
An _attribute_ MAY have a _value_ which is separated from the _identifier_
An _attribute_ MAY have a _literal_ _value_ which is separated from the _identifier_
by an U+003D EQUALS SIGN `=` along with optional whitespace.
Comment on lines +748 to 749
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed the possibility that an attribute can have a variable as its value. Why? I agree that passing a variable might not be that useful (given the restrictions we are placing on attributes). But a use case might be found in tooling that uses the resolved value of some variable to assign an attribute value. For example, it might be used to set a debug level:

You have {$foo :function @debug=$debugLevel}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable values are removed because that's a part of the design we accepted: https://github.com/unicode-org/message-format-wg/blob/main/exploration/expression-attributes.md#attributes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I missed that part. The design doesn't say why they were dropped. I want to understand your reasoning.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. That comes from the first paragraph of the Proposed Design:

Provide separate solutions for attributes that impact formatting, and those which do not, so that their namespaces are not comingled.

From that starting point, the u: options have runtime impact, while @ attributes do not. Therefore, if it's not possible for @ attributes to do anything during formatting, it does not make sense to allow them to have variable values, reading which could be externally observable and could generate errors. Especially as the use cases @ attributes are serving are all non-runtime ones, which do not have access to variable values.

The _value_ of an _attribute_ can be either a _literal_ or a _variable_.

Multiple _attributes_ are permitted in an _expression_ or _markup_.
Each _attribute_ is separated by whitespace.
Each _attribute_'s _identifier_ MUST be unique within the _expression_ or _markup_:
an _expression_ or _markup_ with duplicate _attribute_ _identifiers_ is not _valid_.

The order of _attributes_ is not significant.


```abnf
attribute = "@" identifier [[s] "=" [s] (literal / variable)]
attribute = "@" identifier [[s] "=" [s] literal]
aphillips marked this conversation as resolved.
Show resolved Hide resolved
```

> Examples of _expressions_ and _markup_ with _attributes_:
Expand Down