From 59e75bb5eace7d378c5b52e47790cd8553bb66e7 Mon Sep 17 00:00:00 2001
From: Anne van Kesteren
Date: Tue, 19 Sep 2023 14:08:28 +0200
Subject: [PATCH] Add endpoints for specifications needing to parse URLs

This makes the following changes:

* It renames parse a URL to encoding-parse a URL, signifying it takes
  into account the encoding of the document or environment.
* Similarly it renames parse and serialize a URL to
  encoding-parse-and-serialize a URL.
* It adds parse a URL that does not take into account the encoding of
  the document or environment.
* It exports both parse a URL and encoding-parse a URL and adds some
  guidance for specification authors.
---
 source | 274 ++++++++++++++++++++++++++++++---------------------------
 1 file changed, 144 insertions(+), 130 deletions(-)

diff --git a/source b/source
index e4aed77eb44..15a5d9ca7f5 100644
--- a/source
+++ b/source
@@ -7064,17 +7064,32 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute

Parsing URLs

Parsing a URL is the process of taking a string and obtaining the URL record that - it represents. While this process is defined in URL, the HTML standard defines a - wrapper for convenience. URL

+ it represents. While this process is defined in URL, the HTML standard defines + several wrappers to abstract base URLs and encodings. URL

-

This wrapper is only useful when the character encoding for the URL parser has to - match that of the document or environment settings object for legacy reasons. When that is not the - case the URL parser can be used directly.

+

Most new APIs are to use parse a URL. Older APIs and HTML elements + might have reason to use encoding-parse a URL. When a + custom base URL is needed or no base URL is desired, the URL parser can of course be + used directly as well.

-

To parse a URL url, relative to a +

To parse a URL, given a string url, relative to a Document object or environment settings object environment, run these steps. They return failure or a URL.

+
    +
  1. Let baseURL be environment's base + URL, if environment is a Document object; otherwise + environment's API base URL.

  2. + +
  3. Return the result of applying the URL parser to url, with + baseURL.

  4. +
+ +

To encoding-parse a URL, + given a string url, relative to a Document object or environment + settings object environment, run these steps. They return failure or a + URL.

+
  1. Let encoding be environment's character encoding, if environment is a Document object; @@ -7082,18 +7097,19 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute

  2. Let baseURL be environment's base URL, if environment is a Document object; otherwise - environment's API base URL otherwise.

  3. + environment's API base URL.

  4. Return the result of applying the URL parser to url, with baseURL and encoding.
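As a rough illustration of how the two wrappers above differ, here is a minimal JavaScript sketch. It is not part of the patch. The environment shape (`isDocument`, `baseURL`, `apiBaseURL`, `encoding`) and the function names are hypothetical, and `null` stands in for failure. The built-in `URL` constructor always uses UTF-8, so the sketch can only carry the encoding alongside the result; it cannot apply it the way the URL parser would.

```javascript
// Hypothetical environment model: a Document-like object exposes `baseURL`,
// an environment settings object exposes `apiBaseURL`. Both field names are
// assumptions made for this sketch only.
function baseURLFor(environment) {
  return environment.isDocument ? environment.baseURL : environment.apiBaseURL;
}

// Sketch of "parse a URL": pick the base URL, then run the URL parser.
// Returns a URL object, or null to represent failure.
function parseURL(url, environment) {
  try {
    return new URL(url, baseURLFor(environment));
  } catch {
    return null;
  }
}

// Sketch of "encoding-parse a URL": same base URL selection, plus an encoding.
// The encoding is only reported here, because the JavaScript URL API has no
// encoding parameter (assumption: `environment.encoding` holds its name).
function encodingParseURL(url, environment) {
  const record = parseURL(url, environment);
  return record === null ? null : { record, encoding: environment.encoding };
}
```

Under these assumptions, `parseURL("worker.js", { isDocument: false, apiBaseURL: "https://example.org/app/" })` resolves against the API base URL, while a relative string with no usable base yields `null`.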

-

To parse and serialize a URL url, - relative to a Document object or environment settings object - environment, run these steps. They return failure or a string.

+

To encoding-parse-and-serialize a + URL, given a string url, relative to a Document object or + environment settings object environment, run these steps. They return + failure or a string.

    -
  1. Let url be the result of parsing a URL +

  2. Let url be the result of encoding-parsing a URL given url, relative to environment.

  3. If url is failure, then return failure.
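The parse-then-serialize pattern in these steps can be sketched in JavaScript as follows. This is an illustration only, not part of the patch. The function name and the single `baseURL` field are hypothetical, `null` stands in for failure, and the final serialization step is inferred from the algorithm's name because it falls outside this hunk. Encoding handling is left out, since the built-in `URL` API always uses UTF-8.

```javascript
// Sketch of "encoding-parse-and-serialize a URL" (encoding handling omitted).
// `environment.baseURL` is a hypothetical field used only for this example.
function encodingParseAndSerializeURL(url, environment) {
  let record;
  try {
    // Parse step: resolve `url` against the environment's base URL.
    record = new URL(url, environment.baseURL);
  } catch {
    // Failure is returned to the caller instead of being thrown.
    return null;
  }
  // Serialize step: `href` is the serialized form of the parsed URL.
  return record.href;
}
```

A caller that reflects a content attribute could compare the result with `null` and fall back to the raw attribute value when parsing fails.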

  4. @@ -7123,9 +7139,9 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute

    If the URL identified by the hyperlink is being shown to the user, or if any data derived from that URL is affecting the display, then the href attribute's value should be reparsed relative to the element's node document and the UI updated - appropriately.

    + data-x="attr-hyperlink-href">href attribute's value should be reparsed, relative to the element's node + document and the UI updated appropriately.

    For example, the CSS :link/:visited pseudo-classes @@ -7133,9 +7149,9 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute

    If the hyperlink has a ping attribute and its URL(s) are being shown to the user, then the ping attribute's tokens should be reparsed relative to the element's node document and the UI updated - appropriately.

    + data-x="attr-hyperlink-ping">ping attribute's tokens should be reparsed, relative to the element's node + document and the UI updated appropriately.

    If the element is a q, blockquote, ins, or @@ -7144,8 +7160,8 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute

    If the URL identified by the cite attribute is being shown to the user, or if any data derived from that URL is affecting the display, - then the cite attribute's value should be reparsed relative to the element's node document and the UI updated + then the cite attribute's value should be reparsed, relative to the element's node document and the UI updated appropriately.

    @@ -8006,9 +8022,9 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute
    1. If contentAttributeValue is null, then return the empty string.

    2. -
    3. Let urlString be the result of parsing and serializing a URL - given contentAttributeValue, relative to element's node - document.

    4. +
    5. Let urlString be the result of encoding-parsing-and-serializing a + URL given contentAttributeValue, relative to element's + node document.

    6. If urlString is not failure, then return urlString.

    @@ -14723,7 +14739,7 @@ interface HTMLBaseElement : HTMLElement {

    The base element allows authors to specify the document base URL for - the purposes of parsing URLs, and the name of the default + the purposes of parsing URLs, and the name of the default navigable for the purposes of following hyperlinks. The element does not represent any content beyond this information.

    @@ -14820,8 +14836,8 @@ interface HTMLBaseElement : HTMLElement { document's fallback base URL, and document's character encoding. (Thus, the base element isn't affected by itself.)

    - +
  5. If any of the following are true:

    @@ -14859,8 +14875,8 @@ interface HTMLBaseElement : HTMLElement { document's character encoding. (Thus, the base element isn't affected by other base elements or itself.)

  6. - +
  7. If urlRecord is failure, return url.

  8. @@ -15524,7 +15540,7 @@ interface HTMLLinkElement : HTMLElement { a destination, then return null.

  9. -

    Let url be the result of parsing +

    Let url be the result of encoding-parsing a URL given options's href, relative to options's base URL.

    @@ -17020,8 +17036,8 @@ people expect to have work and what is necessary. code point, so that it and all subsequent code points are removed.

    -
  10. Parse: Set urlRecord to the result of parsing urlString relative to document.

  11. +
  12. Parse: Set urlRecord to the result of encoding-parsing a + URL given urlString, relative to document.

  13. If urlRecord is failure, then return.

@@ -17507,11 +17523,6 @@ console.log(style.disabled); // false href="https://github.com/whatwg/html/issues/2997">issue #2997.

- -
  • If element contributes a script-blocking style sheet, HTMLQuoteElement : HTMLElement {

    If the cite attribute is present, it must be a valid URL potentially surrounded by spaces. To obtain the - corresponding citation link, the value of the attribute must be parsed relative to the element's node document. User agents may allow users to follow such citation links, but they are primarily intended for private use (e.g., by server-side scripts collecting statistics about a site's use of quotations), not for @@ -22013,10 +22024,10 @@ gossip column, maybe!</q>.</p>

    If the cite attribute is present, it must be a valid URL potentially surrounded by spaces. To obtain the corresponding citation - link, the value of the attribute must be parsed relative to the - element's node document. User agents may allow users to follow such citation - links, but they are primarily intended for private use (e.g., by server-side scripts collecting - statistics about a site's use of quotations), not for readers.

    + link, the value of the attribute must be parsed + relative to the element's node document. User agents may allow users to follow + such citation links, but they are primarily intended for private use (e.g., by server-side scripts + collecting statistics about a site's use of quotations), not for readers.

    The q element must not be used in place of quotation marks that do not represent quotes; for example, it is inappropriate to use the q element for marking up @@ -24711,7 +24722,7 @@ document.body.appendChild(wbr);

  • If this element's href content attribute is absent, then return.

  • -
  • Let url be the result of parsing this +

  • Let url be the result of encoding-parsing a URL given this element's href content attribute's value, relative to this element's node document.

  • @@ -25184,9 +25195,9 @@ document.body.appendChild(wbr);
  • If targetNavigable is null, then return.

  • -
  • Let urlString be the result of parsing and serializing a URL given - subject's href attribute value, relative to - subject's node document.

  • +
  • Let urlString be the result of encoding-parsing-and-serializing a + URL given subject's href attribute + value, relative to subject's node document.

  • If urlString is failure, then return.

  • @@ -25256,9 +25267,9 @@ document.body.appendChild(wbr); sandboxing flag set has the sandboxed downloads browsing context flag set, then return.

    -
  • Let urlString be the result of parsing and serializing a URL given - subject's href attribute value, relative to - subject's node document.

  • +
  • Let urlString be the result of encoding-parsing-and-serializing a + URL given subject's href attribute + value, relative to subject's node document.

  • If urlString is failure, then return.

  • @@ -25476,12 +25487,13 @@ document.body.appendChild(wbr);

    If a hyperlink created by an a or area element has a ping attribute, and the user follows the hyperlink, and the value of the element's href attribute can be parsed, relative to the element's node document, without - failure, then the user agent must take the ping - attribute's value, split that string on ASCII - whitespace, parse each resulting token, relative to the - element's node document, and then run these steps for each resulting URL - ping URL, ignoring when parsing returns failure:

    + data-x="encoding-parsing a URL">parsed, relative to the element's node + document, without failure, then the user agent must take the ping attribute's value, split that string on ASCII whitespace, parse each resulting token, relative to the element's node document, and + then run these steps for each resulting URL ping URL, ignoring when + parsing returns failure:

    1. If ping URL's scheme is not an @@ -25507,9 +25519,9 @@ document.body.appendChild(wbr); data-x="">ping".

    2. -

      Let target URL be the result of parsing and serializing a URL given - the element's href attribute value, relative to the - element's node document, and then:

      +

      Let target URL be the result of encoding-parsing-and-serializing a + URL given the element's href attribute's value, + relative to the element's node document, and then:

      If the URL of the Document object @@ -26126,7 +26138,7 @@ document.body.appendChild(wbr); given a link element el, are:
        -
      1. Let url be the result of parsing +

      2. Let url be the result of encoding-parsing a URL given el's href attribute's value, relative to el's node document.

      3. @@ -26585,7 +26597,7 @@ document.body.appendChild(wbr); data-x="concept-event-fire">fire an event named error at el, and return.

        -
      4. Let url be the result of parsing +

      5. Let url be the result of encoding-parsing a URL given el's href attribute's value, relative to el's node document.

      6. @@ -26843,7 +26855,7 @@ document.body.appendChild(wbr); href is an empty string, returns.

      7. -

        Let url be the result of parsing +

        Let url be the result of encoding-parsing a URL given options's href, relative to options's base URL.

        @@ -28046,8 +28058,8 @@ document.body.appendChild(wbr);

        If the cite attribute is present, it must be a valid URL potentially surrounded by spaces that explains the change. To obtain - the corresponding citation link, the value of the attribute must be parsed relative to the element's node document. User agents may + the corresponding citation link, the value of the attribute must be parsed relative to the element's node document. User agents may allow users to follow such citation links, but they are primarily intended for private use (e.g., by server-side scripts collecting statistics about a site's edits), not for readers.

        @@ -30496,8 +30508,8 @@ was an English <a href="/wiki/Music_hall">music hall</a> singer, ...If selected source is not null, then:

          -
        1. Let urlString be the result of parsing and serializing a URL - given selected source, relative to the element's node +

        2. Let urlString be the result of encoding-parsing-and-serializing a + URL given selected source, relative to the element's node document.

        3. @@ -30610,8 +30622,9 @@ was an English <a href="/wiki/Music_hall">music hall</a> singer, ... -
        4. Let urlString be the result of parsing and serializing a URL given - selected source, relative to the element's node document.

        5. +
        6. Let urlString be the result of encoding-parsing-and-serializing a + URL given selected source, relative to the element's node + document.

        7. If urlString is failure, then:

          @@ -31565,8 +31578,8 @@ was an English <a href="/wiki/Music_hall">music hall</a> singer, ... -
        8. ⌛ Let urlString be the result of parsing and serializing a - URL given selected source, relative to the element's node +

        9. ⌛ Let urlString be the result of encoding-parsing-and-serializing + a URL given selected source, relative to the element's node document.

        10. ⌛ If urlString is failure, then return.

        11. @@ -32936,7 +32949,7 @@ interface HTMLIFrameElement : HTMLElement { and its value is not the empty string, then:

            -
          1. Let maybeURL be the result of parsing that +

          2. Let maybeURL be the result of encoding-parsing a URL given that attribute's value, relative to element's node document.

          3. If maybeURL is not failure, then set url to @@ -33087,9 +33100,9 @@ interface HTMLIFrameElement : HTMLElement {

            If, when the element is created, the srcdoc attribute is not set, and the src attribute is either also not set or set but its value cannot - be parsed, the element's content navigable will - remain at the initial about:blank - Document.

            + be parsed, the element's content + navigable will remain at the initial + about:blank Document.

            If the user navigates away from this page, the iframe's content navigable's active @@ -33618,8 +33631,8 @@ interface HTMLEmbedElement : HTMLElement {

            If element has a src attribute set, then:

              -
            1. Let url be the result of parsing the value - of element's src attribute, relative to +

            2. Let url be the result of encoding-parsing a URL given + element's src attribute's value, relative to element's node document.

            3. If url is failure, then return.

            4. @@ -33935,8 +33948,8 @@ interface HTMLObjectElement : HTMLElement { not a type that the user agent supports, then the user agent may jump to the step below labeled fallback without fetching the content to examine its real type.

              -
            5. Let url be the result of parsing the data attribute, relative to the element's node +

            6. Let url be the result of encoding-parsing a URL given the data attribute's value, relative to the element's node document.

            7. If url is failure, then fire an @@ -34399,7 +34412,7 @@ interface HTMLVideoElement : HTMLMediaElement

            8. If the poster attribute's value is the empty string or if the attribute is absent, then there is no poster frame; return.

            9. -
            10. Let url be the result of parsing the

              Let url be the result of encoding-parsing a URL given the poster attribute's value, relative to the element's node document.

            11. @@ -34964,8 +34977,8 @@ interface HTMLTrackElement : HTMLElement { value.

            12. If value is not the empty string, then set trackURL to the result of - parsing and serializing a URL given value, relative to the element's - node document.

            13. + encoding-parsing-and-serializing a URL given value, relative to the + element's node document.

            14. Set the element's track URL to trackURL if it is not failure; otherwise to the empty string.

            15. @@ -35854,9 +35867,9 @@ interface MediaError { attribute's value is the empty string, then end the synchronous section, and jump down to the failed with attribute step below.

              -
            16. ⌛ Let urlRecord be the result of parsing the src attribute's value, relative - to the media element's node document when the

              ⌛ Let urlRecord be the result of encoding-parsing a URL + given the src attribute's value, relative to the + media element's node document when the src attribute was last changed.

              @@ -35936,10 +35949,10 @@ interface MediaError { environment, then end the synchronous section, and jump down to the failed with elements step below.

            17. -
            18. ⌛ Let urlRecord be the result of parsing candidate's src - attribute's value, relative to candidate's node document when the - src attribute was last changed.

              +
            19. ⌛ Let urlRecord be the result of encoding-parsing a URL + given candidate's src attribute's value, + relative to candidate's node document when the src attribute was last changed.

              @@ -38900,7 +38913,7 @@ interface VideoTrack {

              Indicates that the text track was enabled, but when the user agent attempted to obtain it, - this failed in some way (e.g., URL could not be parsed, network error, unknown text track format). Some or all of the cues are likely missing and will not be obtained.

              @@ -50546,7 +50559,7 @@ ldh-str = < as defined in parsing the

              Let url be the result of encoding-parsing a URL given the src attribute's value, relative to the element's node document.

            20. @@ -59076,7 +59089,7 @@ fur --> -
            21. Let parsed action be the result of parsing +

            22. Let parsed action be the result of encoding-parsing a URL given action, relative to submitter's node document.

            23. If parsed action is failure, then return.

            24. @@ -61812,7 +61825,7 @@ o............A....e
            25. Set el's from an external file to true.

            26. -
            27. Let url be the result of parsing +

            28. Let url be the result of encoding-parsing a URL given src, relative to el's node document.

            29. If url is failure, then queue an element task on the DOM @@ -74004,9 +74017,9 @@ Demos:

              The global identifier of an item is the value of its element's itemid attribute, if it has one, parsed relative to the node document of the element on - which the attribute is specified. If the itemid attribute is - missing or if parsing it returns failure, it is said to have no global + data-x="encoding-parsing a URL">parsed relative to the node document of the + element on which the attribute is specified. If the itemid + attribute is missing or if parsing it returns failure, it is said to have no global identifier.

              The itemid attribute must not be specified on elements that do @@ -74264,8 +74277,8 @@ Demos: img, source, track, or video element

      -

      The value is the result of parsing and serializing a URL given the element's - src attribute value, relative to the element's node +

      The value is the result of encoding-parsing-and-serializing a URL given the + element's src attribute's value, relative to the element's node document, at the time the attribute is set, or the empty string if there is no such attribute or the result is failure.

      @@ -74274,8 +74287,8 @@ Demos:
      If the element is an a, area, or link element
      -

      The value is the result of parsing and serializing a URL given the element's - href attribute value, relative to the element's node +

      The value is the result of encoding-parsing-and-serializing a URL given the + element's href attribute's value, relative to the element's node document, at the time the attribute is set, or the empty string if there is no such attribute or the result is failure.

      @@ -74284,8 +74297,8 @@ Demos:
      If the element is an object element
      -

      The value is the result of parsing and serializing a URL given the element's - data attribute value, relative to the element's node +

      The value is the result of encoding-parsing-and-serializing a URL given the + element's data attribute's value, relative to the element's node document, at the time the attribute is set, or the empty string if there is no such attribute or the result is failure.

      @@ -82068,16 +82081,16 @@ dictionary DragEventInit : MouseEventInit {
      If the node is an a element with an href attribute
      -
      Add to urls the result of parsing and serializing a URL given - element's href content attribute's value, relative - to the element's node document.
      +
      Add to urls the result of encoding-parsing-and-serializing a URL + given element's href content attribute's value, + relative to the element's node document.
      If the node is an img element with a src attribute
      -
      Add to urls the result of parsing and serializing a URL given - element's src content attribute's value, relative to the - element's node document.
      +
      Add to urls the result of encoding-parsing-and-serializing a URL + given element's src content attribute's value, relative to + the element's node document.
    3. @@ -87112,7 +87125,7 @@ dictionary WindowPostMessageOptions : StructuredSeri about:blank.

    4. If url is not the empty string, then set urlRecord to the result - of parsing url, relative to the entry + of encoding-parsing a URL given url, relative to the entry settings object.

    5. If urlRecord is failure, then throw a "SyntaxError" @@ -87143,7 +87156,7 @@ dictionary WindowPostMessageOptions : StructuredSeri

      If url is not the empty string, then:

        -
      1. Let urlRecord be the result of parsing +

      2. Let urlRecord be the result of encoding-parsing a URL given url, relative to the entry settings object.

      3. If urlRecord is failure, then throw a @@ -88591,7 +88604,7 @@ interface Location { // but see also parsing the given +

      4. Let url be the result of encoding-parsing a URL given the given value, relative to the entry settings object.

      5. If url is failure, then throw a "SyntaxError" @@ -88996,7 +89009,7 @@ interface Location { // but see also origin, then throw a "SecurityError" DOMException.

      6. -
      7. Let urlRecord be the result of parsing +

      8. Let urlRecord be the result of encoding-parsing a URL given url, relative to the entry settings object.

      9. If urlRecord is failure, then throw a "SyntaxError" @@ -89013,7 +89026,7 @@ interface Location { // but see also parsing +

      10. Let urlRecord be the result of encoding-parsing a URL given url, relative to the entry settings object.

      11. If urlRecord is failure, then throw a "SyntaxError" @@ -89451,7 +89464,7 @@ interface History {

        If url is not null or the empty string, then:

          -
        1. Set newURL to the result of parsing +

        2. Set newURL to the result of encoding-parsing a URL given url, relative to the relevant settings object of history.

        3. @@ -90828,7 +90841,7 @@ interface NavigationHistoryEntry : EventTarget data-x="dom-Navigation-navigate">navigate(options)
          method steps are:

            -
          1. Let urlRecord be the result of parsing +

          2. Let urlRecord be the result of encoding-parsing a URL given url, relative to this's relevant settings object.

          3. If urlRecord is failure, then return an

            For each string of input:

              -
            1. Let parsed be the result of parsing +

            2. Let parsed be the result of encoding-parsing a URL given string, relative to the current settings object.

            3. If parsed is failure, then return a promise rejected with a @@ -110602,7 +110615,7 @@ interface Navigator {

            4. If url does not contain "%s", then throw a "SyntaxError" DOMException.

            5. -
            6. Let urlRecord be the result of parsing +

            7. Let urlRecord be the result of encoding-parsing a URL given url, relative to environment.

            8. @@ -115262,7 +115275,7 @@ enum WorkerType { "classic", "module" };
            9. Let outside settings be the current settings object.

            10. -

              Let worker URL be the result of parsing +

              Let worker URL be the result of encoding-parsing a URL given scriptURL, relative to outside settings.

              Any same-origin URL (including SharedWorker : EventTarget {

            11. Let outside settings be the current settings object.

            12. -

              Let urlRecord be the result of parsing +

              Let urlRecord be the result of encoding-parsing a URL given scriptURL, relative to outside settings.

              Any same-origin URL (including SharedWorker : EventTarget {

              For each url of urls:

                -
              1. Let urlRecord be the result of parsing +

              2. Let urlRecord be the result of encoding-parsing a URL given url, relative to settings object.

              3. If urlRecord is failure, then throw a "SyntaxError" @@ -116447,7 +116460,7 @@ dictionary WorkletOptions {

              4. Let outsideSettings be the relevant settings object of this.

              5. -
              6. Let moduleURLRecord be the result of parsing +

              7. Let moduleURLRecord be the result of encoding-parsing a URL given moduleURL, relative to outsideSettings.

              8. If moduleURLRecord is failure, then return a promise rejected with @@ -128792,11 +128805,12 @@ html, body { display: block; }


                When a body element has a background - attribute set to a non-empty value, the new value is expected to be parsed and serialized relative to the element's node - document, and if that does not return failure, the user agent is expected to treat the - attribute as a presentational hint setting the - element's 'background-image' property to the return value.

                + attribute set to a non-empty value, the new value is expected to be encoding-parsed-and-serialized relative to + the element's node document, and if that does not return failure, the user agent is + expected to treat the attribute as a presentational + hint setting the element's 'background-image' property to the return + value.

                When a body element has a bgcolor attribute set, the new value is expected to be parsed using the rules for parsing a legacy @@ -129638,11 +129652,11 @@ table {

                When a table, thead, tbody, tfoot, tr, td, or th element has a background attribute set to a non-empty value, the new value is - expected to be parsed and serialized relative - to the element's node document, and if that does not return failure, the user agent - is expected to treat the attribute as a presentational - hint setting the element's 'background-image' property to the return - value.

                + expected to be encoding-parsed-and-serialized relative to the element's node document, + and if that does not return failure, the user agent is expected to treat the attribute as a presentational hint setting the element's + 'background-image' property to the return value.

                When a table, thead, tbody, tfoot, tr, td, or th element has a bgcolor @@ -131425,9 +131439,9 @@ progress { appearance: auto; } as part of such auditing.

                User agents may allow users to navigate navigables to the URLs indicated by - the cite attributes on q, blockquote, - ins, and del elements.

                + data-x="navigable">navigables to the URLs indicated by the cite attributes on q, + blockquote, ins, and del elements.

                User agents may surface hyperlinks created by link elements in their user interface, as discussed