Allow surrogates in content, issue #895 #906
This would allow surrogates in both text and literals. I don't have an opinion on whether they should be included in literals or not, just pointing out that this change makes that possible since |
Touching only
That would be a quoted literal. Allowing surrogates in the quoted literal might also allow it in a couple of other places:
and
That is again non-code, and potentially localizable. To help understand what is going on I created a graph (click it for full size). It represents the mf2 abnf as it is today, with colors added to the interesting nodes, and all links to |
I know. It is intentional, and I explained why. As you can imagine, that graph took a while :-) But you were too fast :-) |
I think this has to be discussed again if we propose to include Note that this PR is incomplete, since the syntax spec also has to be updated. |
My understanding of the consensus reached during the 2024-10-07 call was that we'd be allowing unpaired surrogates in text only. If allowing surrogates in text is not enough and we need to consider also allowing them in quoted literal content, I think we need to see some examples of such use to consider their utility. In text I can imagine them showing up from processes that are doing dumb concatenation or string slicing, but in literals we're talking about content that needs to be processed within a framework that has a pretty good understanding of MF2, and content that is further processed within MF2 rather than only in a consumer of the formatted output of MF2. I would also strongly prefer to see such examples or other rationalisation for unpaired surrogates in quoted literals presented here, before we spend more meeting time on this topic. My own base assumption is that any unpaired surrogate is an indicator of some broken localization process, and that silently ignoring them would be hiding an error. |
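To make the "string slicing" case concrete, here is a minimal Python sketch (not from this thread; all helper names are hypothetical). Python strings hold code points, so the sketch converts to UTF-16 code units by hand to mimic what a UTF-16 based runtime does when it truncates by code unit count:

```python
def to_utf16_units(text: str) -> list[int]:
    """Split each code point into its UTF-16 code unit(s)."""
    units = []
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            cp -= 0x10000
            units.append(0xD800 + (cp >> 10))    # high (lead) surrogate
            units.append(0xDC00 + (cp & 0x3FF))  # low (trail) surrogate
        else:
            units.append(cp)
    return units


def from_utf16_units(units: list[int]) -> str:
    """Rebuild a string, combining well-formed surrogate pairs only."""
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        has_next = i + 1 < len(units)
        if 0xD800 <= u <= 0xDBFF and has_next and 0xDC00 <= units[i + 1] <= 0xDFFF:
            out.append(chr(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)))
            i += 2
        else:
            out.append(chr(u))  # may be an unpaired surrogate
            i += 1
    return "".join(out)


def naive_truncate(text: str, max_units: int) -> str:
    """Cut after max_units UTF-16 code units, ignoring pair boundaries."""
    return from_utf16_units(to_utf16_units(text)[:max_units])


greeting = "Hi \U0001F600!"        # U+1F600 is D83D DE00 in UTF-16
cut = naive_truncate(greeting, 4)  # the cut lands between D83D and DE00
print([hex(u) for u in to_utf16_units(cut)])
```

The truncated string ends in the lone high surrogate `U+D83D`, which is the kind of content that would then show up in _text_.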
Literals are text. Sometimes localizable:
Agree. |
My thoughts are that we should allow unpaired surrogates in the text in *input parameters*, but I don't think we need to allow them in the message itself (literals, etc.).
That is, in parsing a message format, we could throw an exception if an unpaired surrogate were encountered, but the formatting software can accept unpaired surrogates in data passed in, and treat them as if they were reserved, or convert them to U+FFFD "replacement character".
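A minimal Python sketch of the split proposed in this comment, under the assumption that the message source and the input values are both already available as strings. The function names are hypothetical and do not correspond to any MF2 implementation:

```python
def is_surrogate(ch: str) -> bool:
    """True for any code point in U+D800..U+DFFF.

    A Python str holds code points, so a surrogate found here is unpaired.
    """
    return 0xD800 <= ord(ch) <= 0xDFFF


def check_message_source(source: str) -> str:
    """Strict side: reject a message that contains an unpaired surrogate."""
    for index, ch in enumerate(source):
        if is_surrogate(ch):
            raise ValueError(
                f"unpaired surrogate U+{ord(ch):04X} at index {index}"
            )
    return source


def sanitize_input_value(value: str) -> str:
    """Lenient side: accept the data, mapping unpaired surrogates to U+FFFD."""
    return "".join("\uFFFD" if is_surrogate(ch) else ch for ch in value)
```

With this split, `check_message_source` would run at parse time and `sanitize_input_value` would run on each string argument at format time.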
(chair hat) In the 2024-10-14 call we agreed to allow unpaired surrogates in text and literals, including changing mentions of characters to mentions of code points in the spec as needed and including a health warning. |
4369e84 to 461555f
Updated the non-abnf part of the spec. In most cases there was no need to change code point to code units. From the Unicode spec, Chapter 3, section "3.8 Surrogates":
Just because in localizable elements we allow code points in the surrogates range it does not mean that all of our processing and spec are all of a sudden written in terms of code units. I think that the changes I made are sufficient. |
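As an illustration of the code point view (a sketch only, with a hypothetical helper name), the Python snippet below walks a string one code point at a time. An unpaired surrogate shows up as a single code point in the range `U+D800` to `U+DFFF`, and a supplementary character stays one code point instead of being split into two code units:

```python
def describe_code_points(text: str) -> list[str]:
    """Label each code point, flagging those in the surrogate range."""
    labels = []
    for ch in text:
        cp = ord(ch)
        label = f"U+{cp:04X}"
        if 0xD800 <= cp <= 0xDFFF:
            label += " (surrogate)"
        labels.append(label)
    return labels


print(describe_code_points("a\ud800\U0001F600"))
```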
Thank you @mihnita for getting this done before our next call!!!!
I like the changes generally, but am making some suggestions below.
spec/syntax.md
Outdated
U+100000 through U+10FFFD), unassigned code points, unpaired surrogates in messages and literals | ||
only (U+D800 through U+DFFF), and other potentially confusing content. |
This is fine, but I think it would be okay to omit it too.
…ords Thanks Addison! Co-authored-by: Addison Phillips <[email protected]>
Co-authored-by: Addison Phillips <[email protected]>
spec/appendices.md
Outdated
> [!IMPORTANT] | ||
> _Text_ and _quoted literals_ allow unpaired surrogate code points | ||
> (`U+D800` to `U+DFFF`). |
How/why is this important to note as a security consideration?
I previously suggested the same thing:
I... don't think this is a security consideration? Maybe move this note to the section on message, just after the note that begins:
This syntax is designed to be embeddable into many different programming languages...
I think that location is a good one because we put a whole wodge of notes there that are "read once and forget" about the content of messages.
spec/syntax.md
Outdated
U+100000 through U+10FFFD), unassigned code points, unpaired surrogates in messages and | ||
quoted literals only (U+D800 through U+DFFF), and other potentially confusing content. |
The specifier about where surrogates are valid is unnecessary esp. in this context, where the other "potentially confusing content" is for the most part only allowed in the same.
U+100000 through U+10FFFD), unassigned code points, unpaired surrogates in messages and | |
quoted literals only (U+D800 through U+DFFF), and other potentially confusing content. | |
U+100000 through U+10FFFD), unassigned code points, | |
unpaired surrogates (U+D800 through U+DFFF), and other potentially confusing content. |
spec/syntax.md
Outdated
> [!NOTE] | ||
> Unpaired surrogate code points (`U+D800` through `U+DFFF` inclusive) | ||
> are allowed for compatibility with UTF-16 based implementations | ||
> that do not check for this encoding error. |
This note would be better placed a bit further down in this section, and it needs to be separated from the surrounding content by an empty line. This would also be a more appropriate place for the concerns that are currently hidden away in the Security Considerations appendix.
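One way to see why this is an encoding error worth a warning: a string containing an unpaired surrogate has no valid UTF-8 form. A minimal Python sketch (the helper name is hypothetical) that detects this by attempting the encoding:

```python
def can_encode_utf8(text: str) -> bool:
    """Return False if text contains anything UTF-8 cannot represent,
    which for a Python str means an unpaired surrogate."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


print(can_encode_utf8("caf\u00e9"))        # well-formed text
print(can_encode_utf8("broken \ud83d"))    # ends in a lone high surrogate
```

A UTF-16 based implementation that never performs this kind of check would pass such a string through unchanged, which is the compatibility case the note describes.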
@mihnita Per today's call, ready to merge? |
* Create notes-2024-08-19.md * Accept attributes design & remove spec note (#845) * Accept attributes design & remove spec note * Disallow duplicate attribute names (closes #756) * Add link to contextual options PR * Add more prose to tag example text Co-authored-by: Addison Phillips <[email protected]> * Mention attribute validity condition in the **_valid_** definition --------- Co-authored-by: Addison Phillips <[email protected]> * Update selection-declaration design doc based on mtg / issue discussion (#867) * Add tests for pattern selection (#863) * Add tests for pattern selection * Add missing errors * Apply suggestions from code review Co-authored-by: Addison Phillips <[email protected]> --------- Co-authored-by: Addison Phillips <[email protected]> * Add Duplicate Variant to table in test/README.md (#861) * Add new selection-declaration alternative: Require annotation of selector variables in placeholders (#860) * Add new selection-declaration alternative: Require annotation of selector variables in placeholders * Improve examples * Switch example order * Update the stability policy (#834) * Update the stability policy Based on discussion in the 2024-07-22 call and in PR #829, update the stability policy. * A deeper, more thorough rewrite - Standardizes the phrasing completely. 
- Moves all potential future changes (which are not, after all, stability policies) to an "important" block - Removes duplication - Separates functions, options, and option values into separate guarantees - Clarifies the note about formatting changing over time * Update spec/README.md Co-authored-by: Tim Chevalier <[email protected]> * Update spec/README.md Co-authored-by: Eemeli Aro <[email protected]> * remove well-formed * Update spec/README.md --------- Co-authored-by: Tim Chevalier <[email protected]> Co-authored-by: Eemeli Aro <[email protected]> * Refine error handling text (#816) * Refine error handling text * Apply suggestions from code review Co-authored-by: Addison Phillips <[email protected]> * Update fallback text * Turn bullet point list into paragraphs * Be more mighty Co-authored-by: Addison Phillips <[email protected]> --------- Co-authored-by: Addison Phillips <[email protected]> * Create notes-2024-08-26.md * Select "Match on variables instead of expressions" for selection-declarations (#824) * Select "Match on variables instead of expressions" for selection-declarations * Add hybrid option to selection-declaration.md (#870) * Add hybrid option to selection-declaration.md * Update selection-declaration.md fixed glitch in original edit * Update selection-declaration.md * Apply suggestions from code review Fixing typos Co-authored-by: Addison Phillips <[email protected]> * Update selection-declaration.md * Update exploration/selection-declaration.md Co-authored-by: Eemeli Aro <[email protected]> * Update exploration/selection-declaration.md Co-authored-by: Eemeli Aro <[email protected]> * Update exploration/selection-declaration.md Co-authored-by: Eemeli Aro <[email protected]> --------- Co-authored-by: Addison Phillips <[email protected]> Co-authored-by: Eemeli Aro <[email protected]> * Update selection-declaration.md --------- Co-authored-by: Mark Davis <[email protected]> Co-authored-by: Addison Phillips <[email protected]> * Fix "Allow 
immutable input declarative selectors" example (#874) * Update README.md (#875) * Update README.md * Update README.md * [DESIGN] Update bidi design document to show proposed design (#871) * [DESIGN] Update bidi design document to show proposed design The design I actually think we should adopt is the "hybrid approaches" one. This is a necessary first step on the highway to UAX31 compliance and I think is responsibly contained/managed. It is a hybrid approach, in that it permits testable strict implementations to be created (particularly for message serialization). This PR consists of moving text around. I added one "pro" to one option also. * Address comments * Miscellaneous test fixes (#862) * Add missing expected bad-selector errors * Fix expected parts for unsupported-statement test * Add a few new tests for leading-whitespace and duplicate-variant * Add tests for escaped-char changes made in #743 * Fix tests for attributes with variable values * Update contributing and joining info (#876) * Update contributing and joining info * Update README.md * Update CONTRIBUTING.md * Restore CLA copy * Clarify error & fallback handling (#879) * Clarify error & fallback handling * Apply suggestions from code review Co-authored-by: Addison Phillips <[email protected]> * Select last rather than first attribute * Drop mention of "starting with Pattern Selection" * Attributes can't change the formatted output * Use "nor" instead of "or" regarding attribute restrictions --------- Co-authored-by: Addison Phillips <[email protected]> * Clarify rule selection (#878) * Clarify rule selection Fixes #868 This adds normative SHOULD language to using CLDR plural and ordinal data, which was intended originally. 
- clarifies that keyword selection follows exact match - clarifies the purpose of rule-based selection - makes non-CLDR-based implementation permitted * Update spec/registry.md Co-authored-by: Eemeli Aro <[email protected]> * Update spec/registry.md Co-authored-by: Eemeli Aro <[email protected]> * Update spec/registry.md Co-authored-by: Eemeli Aro <[email protected]> --------- Co-authored-by: Eemeli Aro <[email protected]> * [DESIGN] Maintaining the Standard, Optional and Unicode Namespace Function Sets (#634) * Design doc to capture registry maintenance * Update maintaining-registry.md * Update exploration/maintaining-registry.md Co-authored-by: Tim Chevalier <[email protected]> * Update exploration/maintaining-registry.md Co-authored-by: Tim Chevalier <[email protected]> * Add user stories, small updates to RGI * Update exploration/maintaining-registry.md * Adding additional detail * Remove machine readable registry; update prose * Update maintaining-registry.md * Further development work * Update to change format and naming Per the 2024-08-19 call, we decided to switch towards a specification-per-function model, with statuses. This commit includes the initial set of changes to try and implement this. * Address some comments. 
--------- Co-authored-by: Tim Chevalier <[email protected]> * Create notes-2024-09-09.md * Fix a typo in an example (#880) The upcoming work to implement resolved value might make this patch unnecessary or obsolete, but fixing the typo (missing `{`/`}` around the variable in the pattern) just in case * Remove forward-compatibility promise and all reserved & private syntax (#883) * Remove forwards compatibility from stability guarantee * Drop reserved statements and expressions * Drop private-use annotations * Update tests * Clarify that deprecation is not removal * Match on variables instead of expressions (#877) * Match on variables instead of expressions * Apply suggestions from code review Co-authored-by: Addison Phillips <[email protected]> * Apply suggestions from code review * Add missing test changes noticed during implementation * Empty commit to re-trigger CLA check --------- Co-authored-by: Addison Phillips <[email protected]> * Create notes-2024-09-10.md * Add bidi support and address UAX31/UTS55 requirements (#884) * Add bidi support and address UAX31/UTS55 requirements Adds the bidi strong marks ALM, RLM, and LRM plus the bidi isolate controls LRI, RLI, FSI, and PDI to the syntax. Formally defines optional vs. non-optional whitespace. Non-optional whitespace must include at least one whitespace character. Optional whitespace may contain only bidi marks (which are invisible) * Update syntax.md including text from previous PR * Repair the guidance on strongly directional marks Include ALM and better specify how to use the marks. * Fix formatting of the "important" * Add bidi characters to description of whitespace. * Permit bidi in a few more places Add optional whitespace at the start of `variant` Add optional whitespace around `quoted-pattern` These changes result in allowing bidi around keys and quoted patterns as intended. * Update syntax.md ABNF * Update formatting.md - Add a note about the difference between formatting and message syntax. 
- Clarify the sentence about message directionality. * Address comment about name/identifier * Address comments related to bidi in `name` * Fix variable's location * Address comment about the list of LRI/PDI targets * One character typo :-P * Update spec/syntax.md Co-authored-by: Eemeli Aro <[email protected]> * Address comments about rule R3a-1 * Update spec/syntax.md Co-authored-by: Eemeli Aro <[email protected]> * Address comment about U+061C * Change [o]wsp => `o` or `s` * Match syntax spec to abnf * Remove * * Update syntax.md * Update spec/syntax.md Co-authored-by: Eemeli Aro <[email protected]> * Update spec/message.abnf Co-authored-by: Eemeli Aro <[email protected]> * Update spec/message.abnf Co-authored-by: Eemeli Aro <[email protected]> * Update syntax.md * Update spec/message.abnf Co-authored-by: Eemeli Aro <[email protected]> * Update spec/syntax.md Co-authored-by: Eemeli Aro <[email protected]> * Update spec/syntax.md Co-authored-by: Eemeli Aro <[email protected]> --------- Co-authored-by: Eemeli Aro <[email protected]> * Specify `bad-option` for bad digit size option values (#882) * Specify `bad-option` for bad digit size option values Fixes #739 * adopt 'non-negative integer' * Create notes-2024-09-16.md * Address name and literal equality (#885) * Address name and literal equality This change defines equality as discussed in the 2024-09-09 teleconference in the following ways: - It defines _name_ equality as being under NFC - It defines _literal_ equality as explicitly **not** under NFC - It moves _name_ before _identifier_ in that section of text to avoid a forward definition. Note that this deviates from discussion in 2024-09-09's call in that we didn't discuss literals at length. It also doesn't discuss non-name/non-literal values, which I'll point out are limited to ASCII sequences such as keywords. * Typo fix * Add a note about not requiring implementations to actually normalize * Implement changes dicussed in 2024-09-16 call. 
- Make _key_ require NFC for uniqueness/comparison - Add a note about NFC - Make _literal_ **_not_** define equality - Make text in _name_ identical to that in _key_ for consistency * Update formatting.md to include keys in NFC * Address comments * Update spec/syntax.md Co-authored-by: Eemeli Aro <[email protected]> * Update spec/syntax.md Co-authored-by: Eemeli Aro <[email protected]> --------- Co-authored-by: Eemeli Aro <[email protected]> * Update list of normative changes during the LDML45 period (#890) * Fix typos in data-model-errors tests (#892) Fix #886 * Update note on exact numeric match for v46 (#891) Addresses #887 Non-normative changes to the notes specifically part of LDML46 * Fix attribute value to be literal (#894) Fixes #893 * Create notes-2024-09-30.md * Add Resolved Values and Function Handler sections to formatting (#728) * Add Resolved Values section to formatting * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Tim Chevalier <[email protected]> * Linkify "resolved value" * Add some examples & explicitly allow wrapping input values * No throw, only emit Co-authored-by: Tim Chevalier <[email protected]> * Add section on Function Handlers, defining the term * Apply suggestions from code review * Rephrase initial resolved value definition * Update spec/formatting.md Co-authored-by: Eemeli Aro <[email protected]> * Update resolved value definition again Co-authored-by: Addison Phillips <[email protected]> --------- Co-authored-by: Tim Chevalier <[email protected]> Co-authored-by: Addison Phillips <[email protected]> * Define function composition for :number and :integer values (#823) * Define function composition for :number and :integer values * Apply suggestions from code review Co-authored-by: Addison Phillips <[email protected]> * Add operand option priority example * Add apostrophes' Co-authored-by: Tim Chevalier <[email protected]> * Update spec/registry.md 
Co-authored-by: Eemeli Aro <[email protected]> * Update spec/registry.md Co-authored-by: Eemeli Aro <[email protected]> --------- Co-authored-by: Addison Phillips <[email protected]> Co-authored-by: Tim Chevalier <[email protected]> * Create notes-2024-10-07.md * Apply NFC normalization during :string key comparison (#905) * Apply NFC normalization during :string key comparison * Add link to UAX#15 Co-authored-by: Addison Phillips <[email protected]> --------- Co-authored-by: Addison Phillips <[email protected]> * Add tests for changes due to bidi/whitespace (#902) * Add tests for changes due to bidi/whitespace * Correct output * Make erroneous test a syntax error * Define function composition for date/time values (#814) * Define function composition for date/time values * Apply suggestions from code review Co-authored-by: Stanisław Małolepszy <[email protected]> * Drop the "only" * Update spec/registry.md * Update spec/registry.md Co-authored-by: Eemeli Aro <[email protected]> * Update spec/registry.md Co-authored-by: Eemeli Aro <[email protected]> * Update spec/registry.md Co-authored-by: Eemeli Aro <[email protected]> * Make :date and :time composition implementation-defined --------- Co-authored-by: Stanisław Małolepszy <[email protected]> Co-authored-by: Addison Phillips <[email protected]> * DESIGN: Add alternative designs to the design doc on function composition (#806) * DESIGN: Add a sequel to the design doc on function composition This document sketches out some alternatives for the machinery provided to enable function composition. The goal is to provide an exhaustive list of alternatives. 
* Remove 'part 2' document and move contents to the end of part 1 * Revise introduction to reflect the changed goal * Edited for conciseness * Further edits for conciseness * Give a name to InputType and use it * Refer to motivating examples * Update function-composition-part-1.md status Per 2024-10-14 telecon * Create notes-2024-10-14.md * Add test for :integer and :number composition (#907) * Fix `:integer` option `useGrouping` values (#912) I noticed that `:integer` does not include the "never" value for the option `useGrouping`. This is a bug. * Drop syntax note on additional bidi changes (#910) Drop syntax note on addition bidi changes * Add tests for changes due to #885 (name/literal equality) (#904) * Add tests for changes due to #885 (name/literal equality) * Update test/tests/functions/string.json Co-authored-by: Eemeli Aro <[email protected]> * Update test/tests/syntax.json Co-authored-by: Eemeli Aro <[email protected]> * Update test/tests/functions/string.json Co-authored-by: Eemeli Aro <[email protected]> * Added tests for reordering and special case mapping * Add another selection test --------- Co-authored-by: Eemeli Aro <[email protected]> * Add u: options namespace (#846) * Move spec/registry.md -> spec/registry/default.md * Add Unicode Registry definition * Refer to BCP47, add note about only requiring normal tags * Call it a namespace * Apply suggestions from code review Co-authored-by: Addison Phillips <[email protected]> * Fix test file reference Co-authored-by: Tim Chevalier <[email protected]> * Apply suggestions from code review * Update spec/u-namespace.md Co-authored-by: Eemeli Aro <[email protected]> * Apply suggestions from code review Co-authored-by: Addison Phillips <[email protected]> * Apply suggestions from code review Co-authored-by: Addison Phillips <[email protected]> * Add mention of functions to namespace description --------- Co-authored-by: Addison Phillips <[email protected]> Co-authored-by: Tim Chevalier <[email protected]> 
* Define function composition for :string values (#798) * Define function composition for :string values * Update spec/registry.md as suggested by @stasm in #814 * Drop the "only" * Update text following code review comments --------- Co-authored-by: Addison Phillips <[email protected]> * Drop data model request for feedback on "name" (#909) * Allow surrogates in content, issue #895 (#906) * Allow surrogates in content, issue #895 * Grammar and typos, linkify terms, make into a note, and fix 2119 keywords Thanks Addison! Co-authored-by: Addison Phillips <[email protected]> * Not using "localizable elements" Co-authored-by: Addison Phillips <[email protected]> * Keep syntax.md in sync with message.abnf * Added note about surrogates to quoted literals * Moved the note about surrogates from Security Considerations to The Message * Update spec/syntax.md * Update spec/syntax.md * Italicize in a couple of places * Implemeted more (all?) feedback from review --------- Co-authored-by: Addison Phillips <[email protected]> --------- Co-authored-by: Eemeli Aro <[email protected]> Co-authored-by: Elango Cheran <[email protected]> Co-authored-by: Tim Chevalier <[email protected]> Co-authored-by: Mark Davis <[email protected]> Co-authored-by: Danny Gleckler <[email protected]> Co-authored-by: Steven R. Loomis <[email protected]> Co-authored-by: Stanisław Małolepszy <[email protected]> Co-authored-by: Eemeli Aro <[email protected]> Co-authored-by: Mihai Nita <[email protected]>
* [DESIGN] Number selection design refinements This is to build up and capture technical considerations for how to address the issues raised by @eemeli's PR #842. * Update examples to match changes to syntax Also responds to the long discussion with @eemeli about significant digits by removing from the example. * Address 2024-09-16 call comments This changes the status to "Re-Opened" and adds a link to the PR. Expect to merge this imminently, although discussion on number selection remains. * Update exploration/number-selection.md Co-authored-by: Eemeli Aro <[email protected]> * Update from main (#914) * Create notes-2024-08-19.md * Accept attributes design & remove spec note (#845) * Accept attributes design & remove spec note * Disallow duplicate attribute names (closes #756) * Add link to contextual options PR * Add more prose to tag example text Co-authored-by: Addison Phillips <[email protected]> * Mention attribute validity condition in the **_valid_** definition --------- Co-authored-by: Addison Phillips <[email protected]> * Update selection-declaration design doc based on mtg / issue discussion (#867) * Add tests for pattern selection (#863) * Add tests for pattern selection * Add missing errors * Apply suggestions from code review Co-authored-by: Addison Phillips <[email protected]> --------- Co-authored-by: Addison Phillips <[email protected]> * Add Duplicate Variant to table in test/README.md (#861) * Add new selection-declaration alternative: Require annotation of selector variables in placeholders (#860) * Add new selection-declaration alternative: Require annotation of selector variables in placeholders * Improve examples * Switch example order * Update the stability policy (#834) * Update the stability policy Based on discussion in the 2024-07-22 call and in PR #829, update the stability policy. * A deeper, more thorough rewrite - Standardizes the phrasing completely. 
- Moves all potential future changes (which are not, after all, stability policies) to an "important" block - Removes duplication - Separates functions, options, and option values into separate guarantees - Clarifies the note about formatting changing over time * Update spec/README.md Co-authored-by: Tim Chevalier <[email protected]> * Update spec/README.md Co-authored-by: Eemeli Aro <[email protected]> * remove well-formed * Update spec/README.md --------- Co-authored-by: Tim Chevalier <[email protected]> Co-authored-by: Eemeli Aro <[email protected]> * Refine error handling text (#816) * Refine error handling text * Apply suggestions from code review Co-authored-by: Addison Phillips <[email protected]> * Update fallback text * Turn bullet point list into paragraphs * Be more mighty Co-authored-by: Addison Phillips <[email protected]> --------- Co-authored-by: Addison Phillips <[email protected]> * Create notes-2024-08-26.md * Select "Match on variables instead of expressions" for selection-declarations (#824) * Select "Match on variables instead of expressions" for selection-declarations * Add hybrid option to selection-declaration.md (#870) * Add hybrid option to selection-declaration.md * Update selection-declaration.md fixed glitch in original edit * Update selection-declaration.md * Apply suggestions from code review Fixing typos Co-authored-by: Addison Phillips <[email protected]> * Update selection-declaration.md * Update exploration/selection-declaration.md Co-authored-by: Eemeli Aro <[email protected]> * Update exploration/selection-declaration.md Co-authored-by: Eemeli Aro <[email protected]> * Update exploration/selection-declaration.md Co-authored-by: Eemeli Aro <[email protected]> --------- Co-authored-by: Addison Phillips <[email protected]> Co-authored-by: Eemeli Aro <[email protected]> * Update selection-declaration.md --------- Co-authored-by: Mark Davis <[email protected]> Co-authored-by: Addison Phillips <[email protected]> * Fix "Allow 
immutable input declarative selectors" example (#874) * Update README.md (#875) * Update README.md * Update README.md * [DESIGN] Update bidi design document to show proposed design (#871) * [DESIGN] Update bidi design document to show proposed design The design I actually think we should adopt is the "hybrid approaches" one. This is a necessary first step on the highway to UAX31 compliance and I think is responsibly contained/managed. It is a hybrid approach, in that it permits testable strict implementations to be created (particularly for message serialization). This PR consists of moving text around. I added one "pro" to one option also. * Address comments * Miscellaneous test fixes (#862) * Add missing expected bad-selector errors * Fix expected parts for unsupported-statement test * Add a few new tests for leading-whitespace and duplicate-variant * Add tests for escaped-char changes made in #743 * Fix tests for attributes with variable values * Update contributing and joining info (#876) * Update contributing and joining info * Update README.md * Update CONTRIBUTING.md * Restore CLA copy * Clarify error & fallback handling (#879) * Clarify error & fallback handling * Apply suggestions from code review Co-authored-by: Addison Phillips <[email protected]> * Select last rather than first attribute * Drop mention of "starting with Pattern Selection" * Attributes can't change the formatted output * Use "nor" instead of "or" regarding attribute restrictions --------- Co-authored-by: Addison Phillips <[email protected]> * Clarify rule selection (#878) * Clarify rule selection Fixes #868 This adds normative SHOULD language to using CLDR plural and ordinal data, which was intended originally. 
- clarifies that keyword selection follows exact match - clarifies the purpose of rule-based selection - makes non-CLDR-based implementation permitted * Update spec/registry.md Co-authored-by: Eemeli Aro <[email protected]> * Update spec/registry.md Co-authored-by: Eemeli Aro <[email protected]> * Update spec/registry.md Co-authored-by: Eemeli Aro <[email protected]> --------- Co-authored-by: Eemeli Aro <[email protected]> * [DESIGN] Maintaining the Standard, Optional and Unicode Namespace Function Sets (#634) * Design doc to capture registry maintenance * Update maintaining-registry.md * Update exploration/maintaining-registry.md Co-authored-by: Tim Chevalier <[email protected]> * Update exploration/maintaining-registry.md Co-authored-by: Tim Chevalier <[email protected]> * Add user stories, small updates to RGI * Update exploration/maintaining-registry.md * Adding additional detail * Remove machine readable registry; update prose * Update maintaining-registry.md * Further development work * Update to change format and naming Per the 2024-08-19 call, we decided to switch towards a specification-per-function model, with statuses. This commit includes the initial set of changes to try and implement this. * Address some comments. 
---------
Co-authored-by: Tim Chevalier
* Create notes-2024-09-09.md
* Fix a typo in an example (#880)
  The upcoming work to implement resolved value might make this patch unnecessary or obsolete, but fixing the typo (missing `{`/`}` around the variable in the pattern) just in case
* Remove forward-compatibility promise and all reserved & private syntax (#883)
* Remove forwards compatibility from stability guarantee
* Drop reserved statements and expressions
* Drop private-use annotations
* Update tests
* Clarify that deprecation is not removal
* Match on variables instead of expressions (#877)
* Match on variables instead of expressions
* Apply suggestions from code review
  Co-authored-by: Addison Phillips
* Apply suggestions from code review
* Add missing test changes noticed during implementation
* Empty commit to re-trigger CLA check
---------
Co-authored-by: Addison Phillips
* Create notes-2024-09-10.md
* Add bidi support and address UAX31/UTS55 requirements (#884)
* Add bidi support and address UAX31/UTS55 requirements
  Adds the bidi strong marks ALM, RLM, and LRM plus the bidi isolate controls LRI, RLI, FSI, and PDI to the syntax. Formally defines optional vs. non-optional whitespace. Non-optional whitespace must include at least one whitespace character. Optional whitespace may contain only bidi marks (which are invisible)
* Update syntax.md including text from previous PR
* Repair the guidance on strongly directional marks
  Include ALM and better specify how to use the marks.
* Fix formatting of the "important"
* Add bidi characters to description of whitespace.
* Permit bidi in a few more places
  Add optional whitespace at the start of `variant`
  Add optional whitespace around `quoted-pattern`
  These changes result in allowing bidi around keys and quoted patterns as intended.
* Update syntax.md ABNF
* Update formatting.md
  - Add a note about the difference between formatting and message syntax.
  - Clarify the sentence about message directionality.
* Address comment about name/identifier
* Address comments related to bidi in `name`
* Fix variable's location
* Address comment about the list of LRI/PDI targets
* One character typo :-P
* Update spec/syntax.md
  Co-authored-by: Eemeli Aro
* Address comments about rule R3a-1
* Update spec/syntax.md
  Co-authored-by: Eemeli Aro
* Address comment about U+061C
* Change [o]wsp => `o` or `s`
* Match syntax spec to abnf
* Remove *
* Update syntax.md
* Update spec/syntax.md
  Co-authored-by: Eemeli Aro
* Update spec/message.abnf
  Co-authored-by: Eemeli Aro
* Update spec/message.abnf
  Co-authored-by: Eemeli Aro
* Update syntax.md
* Update spec/message.abnf
  Co-authored-by: Eemeli Aro
* Update spec/syntax.md
  Co-authored-by: Eemeli Aro
* Update spec/syntax.md
  Co-authored-by: Eemeli Aro
---------
Co-authored-by: Eemeli Aro
* Specify `bad-option` for bad digit size option values (#882)
* Specify `bad-option` for bad digit size option values
  Fixes #739
* adopt 'non-negative integer'
* Create notes-2024-09-16.md
* Address name and literal equality (#885)
* Address name and literal equality
  This change defines equality as discussed in the 2024-09-09 teleconference in the following ways:
  - It defines _name_ equality as being under NFC
  - It defines _literal_ equality as explicitly **not** under NFC
  - It moves _name_ before _identifier_ in that section of text to avoid a forward definition.
  Note that this deviates from discussion in 2024-09-09's call in that we didn't discuss literals at length. It also doesn't discuss non-name/non-literal values, which I'll point out are limited to ASCII sequences such as keywords.
* Typo fix
* Add a note about not requiring implementations to actually normalize
* Implement changes dicussed in 2024-09-16 call.
  - Make _key_ require NFC for uniqueness/comparison
  - Add a note about NFC
  - Make _literal_ **_not_** define equality
  - Make text in _name_ identical to that in _key_ for consistency
* Update formatting.md to include keys in NFC
* Address comments
* Update spec/syntax.md
  Co-authored-by: Eemeli Aro
* Update spec/syntax.md
  Co-authored-by: Eemeli Aro
---------
Co-authored-by: Eemeli Aro
* Update list of normative changes during the LDML45 period (#890)
* Fix typos in data-model-errors tests (#892)
  Fix #886
* Update note on exact numeric match for v46 (#891)
  Addresses #887
  Non-normative changes to the notes specifically part of LDML46
* Fix attribute value to be literal (#894)
  Fixes #893
* Create notes-2024-09-30.md
* Add Resolved Values and Function Handler sections to formatting (#728)
* Add Resolved Values section to formatting
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
  Co-authored-by: Tim Chevalier
* Linkify "resolved value"
* Add some examples & explicitly allow wrapping input values
* No throw, only emit
  Co-authored-by: Tim Chevalier
* Add section on Function Handlers, defining the term
* Apply suggestions from code review
* Rephrase initial resolved value definition
* Update spec/formatting.md
  Co-authored-by: Eemeli Aro
* Update resolved value definition again
  Co-authored-by: Addison Phillips
---------
Co-authored-by: Tim Chevalier
Co-authored-by: Addison Phillips
* Define function composition for :number and :integer values (#823)
* Define function composition for :number and :integer values
* Apply suggestions from code review
  Co-authored-by: Addison Phillips
* Add operand option priority example
* Add apostrophes'
  Co-authored-by: Tim Chevalier
* Update spec/registry.md
  Co-authored-by: Eemeli Aro
* Update spec/registry.md
  Co-authored-by: Eemeli Aro
---------
Co-authored-by: Addison Phillips
Co-authored-by: Tim Chevalier
* Create notes-2024-10-07.md
* Apply NFC normalization during :string key comparison (#905)
* Apply NFC normalization during :string key comparison
* Add link to UAX#15
  Co-authored-by: Addison Phillips
---------
Co-authored-by: Addison Phillips
* Add tests for changes due to bidi/whitespace (#902)
* Add tests for changes due to bidi/whitespace
* Correct output
* Make erroneous test a syntax error
* Define function composition for date/time values (#814)
* Define function composition for date/time values
* Apply suggestions from code review
  Co-authored-by: Stanisław Małolepszy
* Drop the "only"
* Update spec/registry.md
* Update spec/registry.md
  Co-authored-by: Eemeli Aro
* Update spec/registry.md
  Co-authored-by: Eemeli Aro
* Update spec/registry.md
  Co-authored-by: Eemeli Aro
* Make :date and :time composition implementation-defined
---------
Co-authored-by: Stanisław Małolepszy
Co-authored-by: Addison Phillips
* DESIGN: Add alternative designs to the design doc on function composition (#806)
* DESIGN: Add a sequel to the design doc on function composition
  This document sketches out some alternatives for the machinery provided to enable function composition. The goal is to provide an exhaustive list of alternatives.
* Remove 'part 2' document and move contents to the end of part 1
* Revise introduction to reflect the changed goal
* Edited for conciseness
* Further edits for conciseness
* Give a name to InputType and use it
* Refer to motivating examples
* Update function-composition-part-1.md status
  Per 2024-10-14 telecon
* Create notes-2024-10-14.md
* Add test for :integer and :number composition (#907)
* Fix `:integer` option `useGrouping` values (#912)
  I noticed that `:integer` does not include the "never" value for the option `useGrouping`. This is a bug.
* Drop syntax note on additional bidi changes (#910)
  Drop syntax note on addition bidi changes
* Add tests for changes due to #885 (name/literal equality) (#904)
* Add tests for changes due to #885 (name/literal equality)
* Update test/tests/functions/string.json
  Co-authored-by: Eemeli Aro
* Update test/tests/syntax.json
  Co-authored-by: Eemeli Aro
* Update test/tests/functions/string.json
  Co-authored-by: Eemeli Aro
* Added tests for reordering and special case mapping
* Add another selection test
---------
Co-authored-by: Eemeli Aro
* Add u: options namespace (#846)
* Move spec/registry.md -> spec/registry/default.md
* Add Unicode Registry definition
* Refer to BCP47, add note about only requiring normal tags
* Call it a namespace
* Apply suggestions from code review
  Co-authored-by: Addison Phillips
* Fix test file reference
  Co-authored-by: Tim Chevalier
* Apply suggestions from code review
* Update spec/u-namespace.md
  Co-authored-by: Eemeli Aro
* Apply suggestions from code review
  Co-authored-by: Addison Phillips
* Apply suggestions from code review
  Co-authored-by: Addison Phillips
* Add mention of functions to namespace description
---------
Co-authored-by: Addison Phillips
Co-authored-by: Tim Chevalier
* Define function composition for :string values (#798)
* Define function composition for :string values
* Update spec/registry.md as suggested by @stasm in #814
* Drop the "only"
* Update text following code review comments
---------
Co-authored-by: Addison Phillips
* Drop data model request for feedback on "name" (#909)
* Allow surrogates in content, issue #895 (#906)
* Allow surrogates in content, issue #895
* Grammar and typos, linkify terms, make into a note, and fix 2119 keywords
  Thanks Addison!
  Co-authored-by: Addison Phillips
* Not using "localizable elements"
  Co-authored-by: Addison Phillips
* Keep syntax.md in sync with message.abnf
* Added note about surrogates to quoted literals
* Moved the note about surrogates from Security Considerations to The Message
* Update spec/syntax.md
* Update spec/syntax.md
* Italicize in a couple of places
* Implemeted more (all?) feedback from review
---------
Co-authored-by: Addison Phillips
---------
Co-authored-by: Eemeli Aro
Co-authored-by: Elango Cheran
Co-authored-by: Tim Chevalier
Co-authored-by: Mark Davis
Co-authored-by: Danny Gleckler
Co-authored-by: Steven R. Loomis
Co-authored-by: Stanisław Małolepszy
Co-authored-by: Eemeli Aro
Co-authored-by: Mihai Nita
* Add serialization proposal
* Revert "Add serialization proposal"
  This reverts commit 17af553.
* Revert "Update from main (#914)"
  This reverts commit da9377b.
* Add serialization proposal
---------
Co-authored-by: Eemeli Aro
Co-authored-by: Elango Cheran
Co-authored-by: Tim Chevalier
Co-authored-by: Mark Davis
Co-authored-by: Danny Gleckler
Co-authored-by: Steven R. Loomis
Co-authored-by: Stanisław Małolepszy
Co-authored-by: Eemeli Aro
Co-authored-by: Mihai Nita
The goal was to allow surrogates in localizable content, but not in names or anything else that is "code".
This change reflects that agreement.
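
To make the "localizable content vs. code" split concrete, here is a rough Python sketch of the rule as I understand it from this thread. It is not the ABNF and not taken from any implementation: the function names and the construct labels (`"text"`, `"quoted-literal"`, `"name"`) are made up for illustration, and treating both text and quoted literals as localizable is my assumption based on the discussion above.

```python
# Hypothetical illustration only: none of these names come from the MF2 spec
# or from any MF2 implementation.

# Assumption: constructs treated as localizable content in this thread.
LOCALIZABLE_KINDS = {"text", "quoted-literal"}


def is_surrogate(ch: str) -> bool:
    """True if ch is a code point in the surrogate range U+D800..U+DFFF."""
    return 0xD800 <= ord(ch) <= 0xDFFF


def contains_surrogate(value: str) -> bool:
    """True if any code point in value is a surrogate."""
    return any(is_surrogate(ch) for ch in value)


def allows_value(kind: str, value: str) -> bool:
    """Surrogates pass in localizable content and are rejected in "code"."""
    if kind in LOCALIZABLE_KINDS:
        return True
    return not contains_surrogate(value)


if __name__ == "__main__":
    broken = "abc\ud800"  # a string ending in a lone high surrogate
    print(allows_value("text", broken))  # localizable content
    print(allows_value("name", broken))  # "code"
```

A Python `str` is a sequence of code points, so any surrogate found this way is by definition not part of a well-formed pair. In a UTF-16 based language the check would need to look at neighbouring code units instead.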