diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-cache-groups.html b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-cache-groups.html new file mode 100644 index 000000000..f38c98b9d --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-cache-groups.html @@ -0,0 +1,1441 @@ + + + + + + +HTTP Cache Groups + + + + + + + + + + + + + + + + + + + + + + + +
Internet-DraftHTTP Cache GroupsJanuary 2025
NottinghamExpires 11 July 2025[Page]
+
+
+
+
Workgroup:
+
Network Working Group
+
Internet-Draft:
+
draft-ietf-httpbis-cache-groups-latest
+
Published:
+
+ +
+
Intended Status:
+
Standards Track
+
Expires:
+
+
Author:
+
+
+
M. Nottingham
+
+
+
+
+

HTTP Cache Groups

+
+

Abstract

+

This specification introduces a means of describing the relationships between stored responses in HTTP caches, "grouping" them by associating a stored response with one or more opaque strings.

+
+
+

+About This Document +

+

This note is to be removed before publishing as an RFC.

+

+ Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ietf-httpbis-cache-groups/.

+

+ Discussion of this document takes place on the + HTTP Working Group mailing list (mailto:ietf-http-wg@w3.org), + which is archived at https://lists.w3.org/Archives/Public/ietf-http-wg/. + Working Group information can be found at https://httpwg.org/.

+

Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/cache-groups.

+
+
+
+

+Status of This Memo +

+

+ This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79.

+

+ Internet-Drafts are working documents of the Internet Engineering Task + Force (IETF). Note that other groups may also distribute working + documents as Internet-Drafts. The list of current Internet-Drafts is + at https://datatracker.ietf.org/drafts/current/.

+

+ Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress."

+

+ This Internet-Draft will expire on 11 July 2025.

+
+
+ +
+
+

+Table of Contents +

+ +
+
+
+
+

+1. Introduction +

+

HTTP caching [HTTP-CACHING] operates at the granularity of a single resource; the freshness of one stored response does not affect that of others. This granularity can make caching more efficient -- for example, when a page is composed of many assets that have different requirements for caching.

+

However, there are also cases where the relationship between stored responses could be used to improve cache efficiency.

+

For example, it is often necessary to invalidate a set of related resources. This might be because a state-changing request has side effects on other resources, or it might be purely for administrative convenience (e.g., "invalidate this part of the site"). Grouping responses together provides a dedicated way to express these relationships, instead of relying on things like URL structure.

+

In addition to sharing invalidation events, the relationships indicated by grouping can also be used by caches to optimise their operation; for example, they could be used to inform the operation of cache eviction algorithms.

+

Section 2 introduces a means of describing the relationships between a set of stored responses in HTTP caches by associating them with one or more opaque strings. It also describes how caches can use that information to apply invalidation events to members of a group.

+

Section 3 introduces one new source of such events: an HTTP response header that allows a state-changing response to trigger a group invalidation.

+

These mechanisms operate within a single cache, across the stored responses associated with a single origin server. They do not address the issues of synchronising state between multiple caches (e.g., in a hierarchy or mesh), nor do they facilitate association of stored responses from disparate origins.

+
+
+

+1.1. Notational Conventions +

+

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", +"MAY", and "OPTIONAL" in this document are to be interpreted as +described in BCP 14 [RFC2119] [RFC8174] when, and only when, they +appear in all capitals, as shown here.

+

This specification uses the following terminology from [STRUCTURED-FIELDS]: List, String, Parameter.

+
+
+
+
+
+
+

+2. The Cache-Groups Response Header Field +

+

The Cache-Groups HTTP Response Header is a List of Strings [STRUCTURED-FIELDS]. Each member of the list is an opaque value that identifies a group that the response belongs to.

+
+
+HTTP/1.1 200 OK
+Content-Type: application/javascript
+Cache-Control: max-age=3600
+Cache-Groups: "scripts"
+
+
+

The ordering of members is not significant. Unrecognised Parameters MUST be ignored.

+

Implementations MUST support at least 128 groups in a field value, with up to at least 128 characters in each member. Note that generic limitations on HTTP field lengths may constrain the size of this field value in practice.

+
+
+
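As a non-normative illustration (Python, not part of this specification), a minimal parser for this subset of Structured Fields, a List of Strings with any Parameters skipped, might look like the following sketch:

```python
def parse_cache_groups(value: str) -> list[str]:
    """Parse a Cache-Groups field value: a Structured Fields List of
    Strings.  Minimal subset sketch: Parameters are skipped (they are
    to be ignored), and parameter values containing commas are not
    handled."""
    groups = []
    i, n = 0, len(value)
    while i < n:
        while i < n and value[i] in " \t,":   # skip OWS and separators
            i += 1
        if i >= n:
            break
        if value[i] != '"':
            raise ValueError("List members must be Strings")
        i += 1
        out = []
        while i < n and value[i] != '"':
            if value[i] == "\\":              # backslash escape
                i += 1
                if i >= n:
                    raise ValueError("bad escape")
            out.append(value[i])
            i += 1
        if i >= n:
            raise ValueError("unterminated String")
        i += 1                                # closing quote
        while i < n and value[i] != ",":      # skip any Parameters
            i += 1
        groups.append("".join(out))
    return groups
```

For example, `parse_cache_groups('"scripts", "shared"')` yields the two group identifiers as opaque strings.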

+2.1. Identifying Grouped Responses +

+

Two responses stored in the same cache are considered to have the same group when all of the following conditions are met:

+
    +
  1. +

    They both contain a Cache-Groups response header field that contains the same String (in any position in the List), when compared character-by-character.

    +
  2. +
  3. +

    They both share the same URI origin (per Section 4.3.1 of [HTTP]).

    +
  4. +
+
+
+
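The two conditions above can be sketched as follows (a non-normative Python illustration; a real cache would operate on stored response metadata, and this sketch does not normalise default ports):

```python
from urllib.parse import urlsplit

def same_origin(url_a: str, url_b: str) -> bool:
    # Origin per Section 4.3.1 of [HTTP]: scheme, host, and port.
    # Default ports are not normalised in this sketch.
    a, b = urlsplit(url_a), urlsplit(url_b)
    return (a.scheme, a.hostname, a.port) == (b.scheme, b.hostname, b.port)

def share_group(url_a, groups_a, url_b, groups_b) -> bool:
    """True when two stored responses have the same group: same URI
    origin, and at least one group String in common (compared
    character-by-character; position in the List is irrelevant)."""
    return same_origin(url_a, url_b) and bool(set(groups_a) & set(groups_b))
```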
+
+

+2.2. Cache Behaviour +

+
+
+

+2.2.1. Invalidation +

+

A cache that invalidates a stored response MAY invalidate any stored responses that share groups (per Section 2.1) with that response.

+

Cache extensions can explicitly strengthen the requirement above. For example, a targeted cache control header field [TARGETED] might specify that caches processing it are required to invalidate such responses.

+
+
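The invalidation behaviour above might be sketched with a toy in-memory cache (non-normative Python; entries for a single origin are assumed, and "MAY invalidate" is implemented as "does invalidate"):

```python
class GroupAwareCache:
    """Toy cache: invalidating one stored response cascades to stored
    responses that share any of its cache groups (same origin assumed)."""

    def __init__(self):
        self.stored = {}  # url -> {"groups": set[str], "valid": bool}

    def store(self, url, groups):
        self.stored[url] = {"groups": set(groups), "valid": True}

    def invalidate(self, url):
        entry = self.stored.get(url)
        if entry is None:
            return
        entry["valid"] = False
        # Cascade to group members (the MAY-level behaviour).
        for other_url, other in self.stored.items():
            if other_url != url and entry["groups"] & other["groups"]:
                other["valid"] = False
```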
+
+
+
+
+
+
+

+3. The Cache-Group-Invalidation Response Header Field +

+

The Cache-Group-Invalidation response header field is a List of Strings [STRUCTURED-FIELDS]. Each member of the list is an opaque value that identifies a group that the response invalidates, per Section 2.2.1.

+

For example, a POST request that has side effects on two cache groups could indicate that stored responses associated with either or both of those groups should be invalidated with:

+
+
+HTTP/1.1 200 OK
+Content-Type: text/html
+Cache-Group-Invalidation: "eurovision-results", "kylie-minogue"
+
+
+

The Cache-Group-Invalidation header field MUST be ignored on responses to requests that have a safe method (e.g., GET; see Section 9.2.1 of [HTTP]).

+

A cache that receives a Cache-Group-Invalidation header field on a response to an unsafe request MAY invalidate any stored responses that share groups (per Section 2.1) with any of the listed groups.

+

Cache extensions can explicitly strengthen the requirement above. For example, a targeted cache control header field [TARGETED] might specify that caches processing it are required to respect the Cache-Group-Invalidation signal.

+

The ordering of members is not significant. Unrecognised Parameters MUST be ignored.

+

Implementations MUST support at least 128 groups in a field value, with up to at least 128 characters in each member. Note that generic limitations on HTTP field lengths may constrain the size of this field value in practice.

+
+
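Processing of the field can be sketched as follows (non-normative Python; the field value is assumed to be already parsed into a list of group Strings, stored entries are same-origin, and the MAY-level invalidation is applied unconditionally):

```python
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}

def apply_group_invalidation(stored, method, listed_groups):
    """Handle Cache-Group-Invalidation on a response.

    `stored` maps URL -> {"groups": set[str], "valid": bool}.
    The field MUST be ignored on responses to safe requests; on a
    response to an unsafe request, stored responses sharing any listed
    group are invalidated here (the spec says MAY)."""
    if method.upper() in SAFE_METHODS:
        return  # field is ignored for safe methods
    listed = set(listed_groups)
    for entry in stored.values():
        if entry["groups"] & listed:
            entry["valid"] = False
```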
+
+
+

+4. IANA Considerations +

+

IANA should perform the following tasks:

+
+
+

+4.1. HTTP Field Names +

+

Enter the following into the Hypertext Transfer Protocol (HTTP) Field Name Registry:

+
    +
  • +

    Field Name: Cache-Groups

    +
  • +
  • +

    Status: permanent

    +
  • +
  • +

    Reference: RFC nnnn

    +
  • +
  • +

    Comments:

    +
  • +
  • +

    Field Name: Cache-Group-Invalidation

    +
  • +
  • +

    Status: permanent

    +
  • +
  • +

    Reference: RFC nnnn

    +
  • +
  • +

    Comments:

    +
  • +
+
+
+
+
+
+
+

+5. Security Considerations +

+

This mechanism allows resources that share an origin to invalidate each other. Because of this, +origins that represent multiple parties (sometimes referred to as "shared hosting") might allow +one party to group its resources with those of others, or to send signals which have side effects upon them.

+

Shared hosts that wish to mitigate these risks can control access to the header fields defined in this specification.

+
+
+
+
+

+6. References +

+
+
+

+6.1. Normative References +

+
+
[HTTP]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, June 2022, <https://www.rfc-editor.org/rfc/rfc9110>.
+
+
[HTTP-CACHING]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Caching", STD 98, RFC 9111, DOI 10.17487/RFC9111, June 2022, <https://www.rfc-editor.org/rfc/rfc9111>.
+
+
[RFC2119]
+
+Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/rfc/rfc2119>.
+
+
[RFC8174]
+
+Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
+
+
[STRUCTURED-FIELDS]
+
+Nottingham, M. and P. Kamp, "Structured Field Values for HTTP", Work in Progress, Internet-Draft, draft-ietf-httpbis-sfbis-06, 21 April 2024, <https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-sfbis-06>.
+
+
+
+
+
+
+

+6.2. Informative References +

+
+
[TARGETED]
+
+Ludin, S., Nottingham, M., and Y. Wu, "Targeted HTTP Cache Control", RFC 9213, DOI 10.17487/RFC9213, June 2022, <https://www.rfc-editor.org/rfc/rfc9213>.
+
+
+
+
+
+
+
+
+

+Appendix A. Acknowledgements +

+

Thanks to Stephen Ludin for his review and suggestions.

+
+
+
+
+

+Author's Address +

+
+
Mark Nottingham
+
Prahran
+
Australia
+ + +
+
+
+ + + diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-cache-groups.txt b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-cache-groups.txt new file mode 100644 index 000000000..750e53951 --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-cache-groups.txt @@ -0,0 +1,294 @@ + + + + +Network Working Group M. Nottingham +Internet-Draft 7 January 2025 +Intended status: Standards Track +Expires: 11 July 2025 + + + HTTP Cache Groups + draft-ietf-httpbis-cache-groups-latest + +Abstract + + This specification introduces a means of describing the relationships + between stored responses in HTTP caches, "grouping" them by + associating a stored response with one or more opaque strings. + +About This Document + + This note is to be removed before publishing as an RFC. + + Status information for this document may be found at + https://datatracker.ietf.org/doc/draft-ietf-httpbis-cache-groups/. + + Discussion of this document takes place on the HTTP Working Group + mailing list (mailto:ietf-http-wg@w3.org), which is archived at + https://lists.w3.org/Archives/Public/ietf-http-wg/. Working Group + information can be found at https://httpwg.org/. + + Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/cache-groups. + +Status of This Memo + + This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF). Note that other groups may also distribute + working documents as Internet-Drafts. The list of current Internet- + Drafts is at https://datatracker.ietf.org/drafts/current/. + + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." 
+ + This Internet-Draft will expire on 11 July 2025. + +Copyright Notice + + Copyright (c) 2025 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents (https://trustee.ietf.org/ + license-info) in effect on the date of publication of this document. + Please review these documents carefully, as they describe your rights + and restrictions with respect to this document. Code Components + extracted from this document must include Revised BSD License text as + described in Section 4.e of the Trust Legal Provisions and are + provided without warranty as described in the Revised BSD License. + +Table of Contents + + 1. Introduction + 1.1. Notational Conventions + 2. The Cache-Groups Response Header Field + 2.1. Identifying Grouped Responses + 2.2. Cache Behaviour + 2.2.1. Invalidation + 3. The Cache-Group-Invalidation Response Header Field + 4. IANA Considerations + 4.1. HTTP Field Names + 5. Security Considerations + 6. References + 6.1. Normative References + 6.2. Informative References + Appendix A. Acknowledgements + Author's Address + +1. Introduction + + HTTP caching [HTTP-CACHING] operates at the granularity of a single + resource; the freshness of one stored response does not affect that + of others. This granularity can make caching more efficient -- for + example, when a page is composed of many assets that have different + requirements for caching. + + However, there are also cases where the relationship between stored + responses could be used to improve cache efficiency. + + For example, it is often necessary to invalidate a set of related + resources. This might be because a state-changing request has side + effects on other resources, or it might be purely for administrative + convenience (e.g., "invalidate this part of the site"). 
Grouping + responses together provides a dedicated way to express these + relationships, instead of relying on things like URL structure. + + In addition to sharing invalidation events, the relationships + indicated by grouping can also be used by caches to optimise their + operation; for example, it could be used to inform the operation of + cache eviction algorithms. + + Section 2 introduces a means of describing the relationships between + a set of stored responses in HTTP caches by associating them with one + or more opaque strings. It also describes how caches can use that + information to apply invalidation events to members of a group. + + Section 3 introduces one new source of such events: a HTTP response + header that allows a state-changing response to trigger a group + invalidation. + + These mechanisms operate within a single cache, across the stored + responses associated with a single origin server. They do not + address this issues of synchronising state between multiple caches + (e.g., in a hierarchy or mesh), nor do they facilitate association of + stored responses from disparate origins. + +1.1. Notational Conventions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + + This specification uses the following terminology from + [STRUCTURED-FIELDS]: List, String, Parameter. + +2. The Cache-Groups Response Header Field + + The Cache-Groups HTTP Response Header is a List of Strings + [STRUCTURED-FIELDS]. Each member of the list is an opaque value that + identifies a group that the response belongs to. + + HTTP/1.1 200 OK + Content-Type: application/javascript + Cache-Control: max-age=3600 + Cache-Groups: "scripts" + + The ordering of members is not significant. 
Unrecognised Parameters + MUST be ignored. + + Implementations MUST support at least 128 groups in a field value, + with up to at least 128 characters in each member. Note that generic + limitations on HTTP field lengths may constrain the size of this + field value in practice. + +2.1. Identifying Grouped Responses + + Two responses stored in the same cache are considered to have the + same group when all of the following conditions are met: + + 1. They both contain a Cache-Groups response header field that + contains the same String (in any position in the List), when + compared character-by-character. + + 2. The both share the same URI origin (per Section 4.3.1 of [HTTP]). + +2.2. Cache Behaviour + +2.2.1. Invalidation + + A cache that invalidates a stored response MAY invalidate any stored + responses that share groups (per Section 2.1) with that response. + + Cache extensions can explicitly strengthen the requirement above. + For example, a targeted cache control header field [TARGETED] might + specify that caches processing it are required to invalidate such + responses. + +3. The Cache-Group-Invalidation Response Header Field + + The Cache-Group-Invalidation response header field is a List of + Strings [STRUCTURED-FIELDS]. Each member of the list is an opaque + value that identifies a group that the response invalidates, per + Section 2.2.1. + + For example, a POST request that has side effects on two cache groups + could indicate that stored responses associated with either or both + of those groups should be invalidated with: + + HTTP/1.1 200 OK + Content-Type: text/html + Cache-Group-Invalidation: "eurovision-results", "kylie-minogue" + + The Cache-Group-Invalidation header field MUST be ignored on + responses to requests that have a safe method (e.g., GET; see + Section 9.2.1 of [HTTP]). 
+ + A cache that receives a Cache-Group-Invalidation header field on a + response to an unsafe request MAY invalidate any stored responses + that share groups (per Section 2.1) with any of the listed groups. + + Cache extensions can explicitly strengthen the requirement above. + For example, a targeted cache control header field [TARGETED] might + specify that caches processing it are required to respect the Cache- + Group-Invalidation signal. + + The ordering of members is not significant. Unrecognised Parameters + MUST be ignored. + + Implementations MUST support at least 128 groups in a field value, + with up to at least 128 characters in each member. Note that generic + limitations on HTTP field lengths may constrain the size of this + field value in practice. + +4. IANA Considerations + + IANA should perform the following tasks: + +4.1. HTTP Field Names + + Enter the following into the Hypertext Transfer Protocol (HTTP) Field + Name Registry: + + * Field Name: Cache-Groups + + * Status: permanent + + * Reference: RFC nnnn + + * Comments: + + * Field Name: Cache-Group-Invalidation + + * Status: permanent + + * Reference: RFC nnnn + + * Comments: + +5. Security Considerations + + This mechanism allows resources that share an origin to invalidate + each other. Because of this, origins that represent multiple parties + (sometimes referred to as "shared hosting") might allow one party to + group its resources with those of others, or to send signals which + have side effects upon them. + + Shared hosts that wish to mitigate these risks can control access to + the header fields defined in this specification. + +6. References + +6.1. Normative References + + [HTTP] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Semantics", STD 97, RFC 9110, + DOI 10.17487/RFC9110, June 2022, + . + + [HTTP-CACHING] + Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Caching", STD 98, RFC 9111, + DOI 10.17487/RFC9111, June 2022, + . 
+ + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + . + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, . + + [STRUCTURED-FIELDS] + Nottingham, M. and P. Kamp, "Structured Field Values for + HTTP", Work in Progress, Internet-Draft, draft-ietf- + httpbis-sfbis-06, 21 April 2024, + . + +6.2. Informative References + + [TARGETED] Ludin, S., Nottingham, M., and Y. Wu, "Targeted HTTP Cache + Control", RFC 9213, DOI 10.17487/RFC9213, June 2022, + . + +Appendix A. Acknowledgements + + Thanks to Stephen Ludin for his review and suggestions. + +Author's Address + + Mark Nottingham + Prahran + Australia + Email: mnot@mnot.net + URI: https://www.mnot.net/ diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-compression-dictionary.html b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-compression-dictionary.html new file mode 100644 index 000000000..1e6ea69bf --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-compression-dictionary.html @@ -0,0 +1,2377 @@ + + + + + + +Compression Dictionary Transport + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Internet-DraftCompression Dictionary TransportJanuary 2025
Meenan & WeissExpires 11 July 2025[Page]
+
+
+
+
Workgroup:
+
HTTP
+
Internet-Draft:
+
draft-ietf-httpbis-compression-dictionary-latest
+
Published:
+
+ +
+
Intended Status:
+
Standards Track
+
Expires:
+
+
Authors:
+
+
+
P. Meenan, Ed. +
+
Google LLC
+
+
+
Y. Weiss, Ed. +
+
Shopify Inc
+
+
+
+
+

Compression Dictionary Transport

+
+

Abstract

+

This document specifies a mechanism for dictionary-based compression in the +Hypertext Transfer Protocol (HTTP). By utilizing this technique, clients and +servers can reduce the size of transmitted data, leading to improved performance +and reduced bandwidth consumption. This document extends existing HTTP compression +methods and provides guidelines for the delivery and use of compression +dictionaries within the HTTP protocol.

+
+
+

+About This Document +

+

This note is to be removed before publishing as an RFC.

+

+ Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ietf-httpbis-compression-dictionary/.

+

+ Discussion of this document takes place on the + HTTP Working Group mailing list (mailto:ietf-http-wg@w3.org), + which is archived at https://lists.w3.org/Archives/Public/ietf-http-wg/. + Working Group information can be found at https://httpwg.org/.

+

Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/compression-dictionary.

+
+
+
+

+Status of This Memo +

+

+ This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79.

+

+ Internet-Drafts are working documents of the Internet Engineering Task + Force (IETF). Note that other groups may also distribute working + documents as Internet-Drafts. The list of current Internet-Drafts is + at https://datatracker.ietf.org/drafts/current/.

+

+ Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress."

+

+ This Internet-Draft will expire on 11 July 2025.

+
+
+ + +
+
+

+1. Introduction +

+

This specification defines a mechanism for using designated [HTTP] responses +as an external dictionary for future HTTP responses for compression schemes +that support using external dictionaries (e.g., Brotli [RFC7932] and +Zstandard [RFC8878]).

+

This document describes the HTTP headers used for negotiating dictionary usage +and registers content encoding values for compressing with Brotli and Zstandard +using a negotiated dictionary.

+

The negotiation of dictionary usage leverages HTTP's content negotiation +(see Section 12 of [HTTP]) and is usable with all versions of HTTP.

+
+
+

+1.1. Use Cases +

+

Any HTTP response can be designated for use as a compression dictionary for
+future HTTP requests, which provides a lot of flexibility. Two common
+use cases are frequently seen:

+
+
+

+1.1.1. Version Upgrade +

+

Using a previous version of a resource as a dictionary for a newer version +enables delivery of a delta-compressed version of the changes, usually +resulting in significantly smaller responses than can be achieved by +compression alone.

+

For example:

+
+
+
+Client                                              Server
+  |  GET /app.v1.js                                    |
+  |--------------------------------------------------->|
+  |  200 OK                                            |
+  |  Use-As-Dictionary: match="/app*js"                |
+  |  <full app.v1.js resource - 100KB compressed>      |
+  |<---------------------------------------------------|
+
+                  Some time later ...
+
+Client                                              Server
+  |  GET /app.v2.js                                    |
+  |  Available-Dictionary: :pZGm1A...2a2fFG4=:         |
+  |  Accept-Encoding: gzip, br, zstd, dcb, dcz         |
+  |--------------------------------------------------->|
+  |  200 OK                                            |
+  |  Content-Encoding: dcb                             |
+  |  <delta-compressed app.v2.js resource - 1KB>       |
+  |<---------------------------------------------------|
+
+
Figure 1: +Version Upgrade Example +
+
+
+
+
+

+1.1.2. Common Content +

+

If several resources share common patterns in their responses then it can be +useful to reference an external dictionary that contains those common patterns, +effectively compressing them out of the responses. Some examples of this are +common template HTML for similar pages across a site and common keys and values +in API calls.

+

For example:

+
+
+
+Client                                              Server
+  |  GET /index.html                                   |
+  |--------------------------------------------------->|
+  |  200 OK                                            |
+  |  Link: <.../dict>; rel="compression-dictionary"    |
+  |  <full index.html resource - 100KB compressed>     |
+  |<---------------------------------------------------|
+  |  GET /dict                                         |
+  |--------------------------------------------------->|
+  |  200 OK                                            |
+  |  Use-As-Dictionary: match="/*html"                 |
+  |<---------------------------------------------------|
+
+                  Some time later ...
+
+Client                                              Server
+  |  GET /page2.html                                   |
+  |  Available-Dictionary: :pZGm1A...2a2fFG4=:         |
+  |  Accept-Encoding: gzip, br, zstd, dcb, dcz         |
+  |--------------------------------------------------->|
+  |  200 OK                                            |
+  |  Content-Encoding: dcb                             |
+  |  <delta-compressed page2.html resource - 10KB>     |
+  |<---------------------------------------------------|
+
+
Figure 2: +Common Content Example +
+
+
+
+
+
+
+

+1.2. Notational Conventions +

+

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", +"MAY", and "OPTIONAL" in this document are to be interpreted as +described in BCP 14 [RFC2119] [RFC8174] when, and only when, they +appear in all capitals, as shown here.

+

This document uses the following terminology from Section 3 of +[STRUCTURED-FIELDS] to specify syntax and parsing: Dictionary, String, +Inner List, Token, and Byte Sequence.

+

This document uses the line folding strategies described in [FOLDING].

+

This document also uses terminology from [HTTP] and [HTTP-CACHING].

+
+
+
+
+
+
+

+2. Dictionary Negotiation +

+
+
+

+2.1. Use-As-Dictionary +

+

When responding to an HTTP request, a server can advertise that the response can
+be used as a dictionary for future requests for URLs that match the rules
+specified in the Use-As-Dictionary response header.

+

The Use-As-Dictionary response header is a Structured Field +[STRUCTURED-FIELDS] Dictionary with values for "match", "match-dest", "id", +and "type".

+
+
+

+2.1.1. match +

+

The "match" value of the Use-As-Dictionary header is a String value that +provides the URL Pattern to use for request matching (see [URLPATTERN]).

+

The URL Pattern used for matching does not support using regular expressions.

+

The following algorithm is used to validate that a given String value is a +valid URL Pattern that does not use regular expressions and is for the same +Origin (Section 4.3.1 of [HTTP]) as the dictionary. It will return TRUE +for a valid match pattern and FALSE for an invalid pattern that MUST NOT be +used:

+
    +
  1. +

    Let MATCH be the value of "match" for the given dictionary.

    +
  2. +
  3. +

    Let URL be the URL of the dictionary request.

    +
  4. +
  5. +

    Let PATTERN be a URL pattern created by running the steps to create a +URL pattern by setting input=MATCH, and baseURL=URL (see +Part "create" of [URLPATTERN]).

    +
  6. +
  7. +

    If the result of running the "has regexp groups" steps for PATTERN returns +TRUE then return FALSE (see Part "has regexp groups" of [URLPATTERN]).

    +
  8. +
  9. +

    Return TRUE.

    +
  10. +
+
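As a rough, non-normative approximation of the validation steps above (Python has no URL Pattern implementation in its standard library, so this sketch stands in for the "create" and "has regexp groups" operations by rejecting any "(" in the pattern and comparing origins after resolution):

```python
from urllib.parse import urljoin, urlsplit

def valid_match_pattern(match: str, dictionary_url: str) -> bool:
    """Approximate validation of a "match" value: no regexp groups, and
    the resolved pattern must be same-origin with the dictionary URL.
    A conforming implementation would use a real URL Pattern library."""
    if "(" in match:            # crude stand-in for "has regexp groups"
        return False
    resolved = urljoin(dictionary_url, match)
    d, r = urlsplit(dictionary_url), urlsplit(resolved)
    return (d.scheme, d.hostname, d.port) == (r.scheme, r.hostname, r.port)
```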

The "match" value is required and MUST be included in the +Use-As-Dictionary response header for the dictionary to be considered valid.

+

Operating at the HTTP-level, the specified match patterns will operate on the +percent-encoded version of the URL path (see Section 2 of [URL]).

+

For example, the URL "http://www.example.com/düsseldorf" would be encoded as
+"http://www.example.com/d%C3%BCsseldorf" and a relevant match pattern would be:

+
+
+Use-As-Dictionary: match="/d%C3%BCsseldorf"
+
+
+
+
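For illustration only (Python's standard library, not part of this specification), the percent-encoding above can be reproduced with UTF-8-based encoding of the path:

```python
from urllib.parse import quote

# "match" patterns are compared against the percent-encoded URL path,
# so non-ASCII paths must be encoded (UTF-8) when writing the pattern.
pattern_path = quote("/düsseldorf")   # '/d%C3%BCsseldorf'
```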
+
+
+

+2.1.2. match-dest +

+

The "match-dest" value of the Use-As-Dictionary header is an Inner List of +String values that provides a list of Fetch request destinations for the +dictionary to match (see Part "destination" of [FETCH]).

+

An empty list for "match-dest" MUST match all destinations.

+

For clients that do not support request destinations, the client MUST treat it +as an empty list and match all destinations.

+

The "match-dest" value is optional and defaults to an empty list.

+
+
+
+
+

+2.1.3. id +

+

The "id" value of the Use-As-Dictionary header is a String value that specifies +a server identifier for the dictionary. If an "id" value is present and has a +string length longer than zero then it MUST be sent to the server in a +"Dictionary-ID" request header when the client sends an "Available-Dictionary" +request header for the same dictionary (see Section 2.2).

+

The server identifier MUST be treated as an opaque string by the client.

+

The server identifier MUST NOT be relied upon by the server to guarantee the +contents of the dictionary. The dictionary hash MUST be validated before use.

+

The "id" value string length (after any decoding) supports up to 1024 +characters.

+

The "id" value is optional and defaults to the empty string.

+
+
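The interaction between "id" and the Dictionary-ID request header can be sketched as follows (non-normative Python; the helper name is hypothetical, and the Dictionary-ID value is serialized as a Structured Fields String, hence the surrounding quotes):

```python
def request_headers_for_dictionary(hash_b64: str, server_id: str) -> dict:
    """Build request headers for an available dictionary: the hash is
    always sent as a Byte Sequence; Dictionary-ID accompanies it only
    when the stored "id" is a non-empty string."""
    headers = {"Available-Dictionary": f":{hash_b64}:"}
    if server_id:  # empty string (the default) means: no Dictionary-ID
        headers["Dictionary-ID"] = f'"{server_id}"'
    return headers
```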
+
+
+

+2.1.4. type +

+

The "type" value of the Use-As-Dictionary header is a Token value that +describes the file format of the supplied dictionary.

+

"raw" is defined as a dictionary format which represents an unformatted blob of +bytes suitable for any compression scheme to use.

+

If a client receives a dictionary with a type that it does not understand, it +MUST NOT use the dictionary.

+

The "type" value is optional and defaults to "raw".

+
+
+
+
+

+2.1.5. Examples +

+
+
+
+2.1.5.1. Path Prefix +
+

A response that contained a response header:

+
+
+NOTE: '\' line wrapping per RFC 8792
+
+Use-As-Dictionary: \
+  match="/product/*", match-dest=("document")
+
+
+

Would specify matching any document request for a URL with a path prefix of +/product/ on the same Origin (Section 4.3.1 of [HTTP]) as the original +request.

+
+
+
+
+
+2.1.5.2. Versioned Directories +
+

A response that contained a response header:

+
+
+Use-As-Dictionary: match="/app/*/main.js"
+
+
+

Would match any path that starts with "/app/" and ends with "/main.js".

+
+
+
+
+
+
+
+
+

+2.2. Available-Dictionary +

+

When an HTTP client makes a request for a resource for which it has an
+appropriate dictionary, it can add an "Available-Dictionary" request header
+to the request to indicate to the server that it has a dictionary available to
+use for compression.

+

The "Available-Dictionary" request header is a Structured Field +[STRUCTURED-FIELDS] Byte Sequence containing the [SHA-256] hash of the +contents of a single available dictionary.

+

The client MUST only send a single "Available-Dictionary" request header +with a single hash value for the best available match that it has available.

+

For example:

+
+
+Available-Dictionary: :pZGm1Av0IEBKARczz7exkNYsZb8LzaMrV7J32a2fFG4=:
+
+
+
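A field value like the one above can be produced as follows (non-normative Python sketch): hash the dictionary bytes with SHA-256 and serialize the digest as a Structured Fields Byte Sequence, i.e. base64 between colons.

```python
import base64
import hashlib

def available_dictionary_value(dictionary_bytes: bytes) -> str:
    """Format an Available-Dictionary field value: a Structured Fields
    Byte Sequence (':' + base64 + ':') of the SHA-256 digest of the
    dictionary contents."""
    digest = hashlib.sha256(dictionary_bytes).digest()
    return ":" + base64.b64encode(digest).decode("ascii") + ":"
```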
+
+

+2.2.1. Dictionary freshness requirement +

+

To be considered as a match, the dictionary resource MUST be either fresh
+[HTTP-CACHING] or allowed to be served stale (see, e.g., [RFC5861]).

+
+
+
+
+

+2.2.2. Dictionary URL matching +

+

When a dictionary is stored as a result of a "Use-As-Dictionary" directive, it +includes a "match" string and optional "match-dest" string that are used to +match an outgoing request from a client to the available dictionaries.

+

To see if an outbound request matches a given dictionary, the following +algorithm will return TRUE for a successful match and FALSE for no-match:

+
1. If the current client supports request destinations and a "match-dest" string was provided with the dictionary:

   • Let DEST be the value of "match-dest" for the given dictionary.

   • Let REQUEST_DEST be the value of the destination for the current request.

   • If DEST is not an empty list and if REQUEST_DEST is not in the DEST list of destinations, return FALSE.

2. Let BASEURL be the URL of the dictionary request.

3. Let URL represent the URL of the outbound request being checked.

4. If the Origin of BASEURL and the Origin of URL are not the same, return FALSE (see Section 4.3.1 of [HTTP]).

5. Let MATCH be the value of "match" for the given dictionary.

6. Let PATTERN be a URL pattern created by running the steps to create a URL pattern by setting input=MATCH, and baseURL=URL (see Part "create" of [URLPATTERN]).

7. Return the result of running the "match" steps on PATTERN with input=URL which will check for a match between the request URL and the supplied "match" string (see Part "match" of [URLPATTERN]).
+
+
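As a concrete illustration, the matching steps above can be sketched in Python. This is a simplified sketch, not a conforming implementation: the WHATWG URL Pattern "create" and "match" steps are stood in for by a plain "*" wildcard (sufficient for the examples in this document), and the function and parameter names are illustrative.

```python
import re
from urllib.parse import urlsplit

def matches_dictionary(request_url, request_dest, dict_url, match, match_dest):
    """Simplified sketch of the Section 2.2.2 matching algorithm.

    match_dest is a (possibly empty) list of destination strings;
    an empty list matches all destinations.
    """
    # Step 1: destination filtering.
    if match_dest and request_dest not in match_dest:
        return False
    base, target = urlsplit(dict_url), urlsplit(request_url)
    # Steps 2-4: same-origin check (scheme + authority).
    if (base.scheme, base.netloc) != (target.scheme, target.netloc):
        return False
    # Steps 5-7: wildcard match against the request path (stand-in
    # for a real URL Pattern match).
    pattern = re.compile(re.escape(match).replace(r"\*", ".*") + r"\Z")
    return pattern.match(target.path) is not None
```

For example, a dictionary stored with match="/app/*/main.js" matches a later request for /app/v2/main.js on the same origin, but not on a different origin.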
+
+

+2.2.3. Multiple matching dictionaries +

+

When there are multiple dictionaries that match a given request URL, the client +MUST pick a single dictionary using the following rules:

+
1. For clients that support request destinations, a dictionary that specifies and matches a "match-dest" takes precedence over a match that does not use a destination.

2. Given equivalent destination precedence, the dictionary with the longest "match" takes precedence.

3. Given equivalent destination and match length precedence, the most recently fetched dictionary takes precedence.
+
+
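The precedence rules above amount to a lexicographic comparison, which can be sketched as follows. The candidate keys "has_dest_match" and "fetch_time" are hypothetical bookkeeping names for illustration, not fields defined by this specification.

```python
def pick_dictionary(candidates):
    """Sketch of the Section 2.2.3 tie-breaking rules.

    Each candidate is a dict with keys: "match" (the stored match
    string), "has_dest_match" (True when a "match-dest" was specified
    and matched), and "fetch_time" (larger means more recently fetched).
    """
    return max(
        candidates,
        key=lambda d: (
            d["has_dest_match"],  # rule 1: destination match wins
            len(d["match"]),      # rule 2: longest "match" wins
            d["fetch_time"],      # rule 3: most recent fetch wins
        ),
    )
```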
+
+
+
+

+2.3. Dictionary-ID +

+

When an HTTP client makes a request for a resource for which it has an +appropriate dictionary, and the dictionary was stored with a server-provided +"id" in the Use-As-Dictionary response, the client MUST echo the stored +"id" in a "Dictionary-ID" request header.

+

The "Dictionary-ID" request header is a Structured Field [STRUCTURED-FIELDS] +String of up to 1024 characters (after any decoding) and MUST be identical to +the server-provided "id".

+

For example, given an HTTP response that set a dictionary ID:

+
+
+Use-As-Dictionary: match="/app/*/main.js", id="dictionary-12345"
+
+
+

A future request that matches the given dictionary will include both the hash +and the ID:

+
+
+Available-Dictionary: :pZGm1Av0IEBKARczz7exkNYsZb8LzaMrV7J32a2fFG4=:
+Dictionary-ID: "dictionary-12345"
+
+
+
+
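A client-side sketch of assembling these two request headers, assuming the stored dictionary bytes and the server-provided id are at hand (the function name is illustrative). The hash is serialized as a Structured Field Byte Sequence, i.e. base64 between colons, and the id as a quoted Structured Field String.

```python
import base64
import hashlib

def dictionary_request_headers(dictionary_bytes, dictionary_id=""):
    """Build Available-Dictionary (and, when the server supplied an
    "id", Dictionary-ID) request headers for a stored dictionary."""
    digest = hashlib.sha256(dictionary_bytes).digest()
    headers = {
        "Available-Dictionary":
            ":" + base64.b64encode(digest).decode("ascii") + ":",
    }
    if dictionary_id:
        # Echo the stored server id verbatim as a quoted String.
        headers["Dictionary-ID"] = '"' + dictionary_id + '"'
    return headers
```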
+
+
+ +
+
+

+4. Dictionary-Compressed Brotli +

+

The "dcb" content encoding identifies a resource that is a +"Dictionary-Compressed Brotli" stream.

+

A "Dictionary-Compressed Brotli" stream has a fixed header that is followed by +a Shared Brotli [SHARED-BROTLI] stream. The header consists of a fixed 4-byte +sequence and a 32-byte hash of the external dictionary that was used. The +Shared Brotli stream is created using the referenced external dictionary and a +compression window that is at most 16 MB in size.

+

The dictionary used for the "dcb" content encoding is a "raw" dictionary type +as defined in Section 2.1.4 and is treated as a prefix dictionary as defined in +section 9.2 of the Shared Brotli Compressed Data Format draft. +[SHARED-BROTLI]

+

The 36-byte fixed header is as follows:

+
+
Magic_Number:
+
+

4 fixed bytes: 0xff, 0x44, 0x43, 0x42.

+
+
+
SHA_256_Hash:
+
+

32 bytes. SHA-256 hash digest of the dictionary [SHA-256].

+
+
+
+

Clients that announce support for dcb content encoding MUST be able to +decompress resources that were compressed with a window size of up to 16 MB.

+

With Brotli compression, the full dictionary is available during compression +and decompression independent of the compression window, allowing for +delta-compression of resources larger than the compression window.

+
+
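A minimal sketch of producing and parsing the fixed 36-byte header described above (function names are illustrative; the Shared Brotli payload itself is out of scope here):

```python
import hashlib

# 4 fixed bytes: 0xff, 0x44 ('D'), 0x43 ('C'), 0x42 ('B').
DCB_MAGIC = bytes([0xFF, 0x44, 0x43, 0x42])

def dcb_header(dictionary_bytes):
    """36-byte "dcb" header: magic number followed by the SHA-256
    digest of the external dictionary."""
    return DCB_MAGIC + hashlib.sha256(dictionary_bytes).digest()

def parse_dcb_header(stream):
    """Split a dcb stream into (dictionary_hash, brotli_payload);
    raises ValueError on a bad magic number."""
    if stream[:4] != DCB_MAGIC:
        raise ValueError("not a Dictionary-Compressed Brotli stream")
    return stream[4:36], stream[36:]
```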
+
+
+

+5. Dictionary-Compressed Zstandard +

+

The "dcz" content encoding identifies a resource that is a +"Dictionary-Compressed Zstandard" stream.

+

A "Dictionary-Compressed Zstandard" stream is a binary stream that starts with +a 40-byte fixed header and is followed by a Zstandard [RFC8878] stream of the +response that has been compressed with an external dictionary.

+

The dictionary used for the "dcz" content encoding is a "raw" dictionary type +as defined in Section 2.1.4 and is treated as a raw dictionary as per section 5 of +RFC 8878.

+

The 40-byte header consists of a fixed 8-byte sequence followed by the +32-byte SHA-256 hash of the external dictionary that was used to compress the +resource:

+
+
Magic_Number:
+
+

8 fixed bytes: 0x5e, 0x2a, 0x4d, 0x18, 0x20, 0x00, 0x00, 0x00.

+
+
+
SHA_256_Hash:
+
+

32 bytes. SHA-256 hash digest of the dictionary [SHA-256].

+
+
+
+

The 40-byte header is a Zstandard skippable frame (little-endian 0x184D2A5E) +with a 32-byte length (little-endian 0x00000020) that is compatible with +existing Zstandard decoders.

+

Clients that announce support for the dcz content encoding MUST be able to +decompress resources that were compressed with a window size of at least 8 MB +or 1.25 times the size of the dictionary, whichever is greater, up to a +maximum of 128 MB.

+

The window size used will be encoded in the content (currently, this can be +expressed in powers of two only) and it MUST be lower than this limit. An +implementation MAY treat a window size that exceeds the limit as a decoding +error.

+

With Zstandard compression, the full dictionary is available during compression +and decompression until the size of the input exceeds the compression window. +Beyond that point the dictionary becomes unavailable. Using a compression +window that is 1.25 times the size of the dictionary allows for full delta +compression of resources that have grown by 25% between releases while still +giving the client control over the memory it will need to allocate for a given +response.

+
+
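The header layout and the window-size requirement above can be sketched as follows (illustrative names; this builds only the 40-byte header, not the Zstandard frame that follows it):

```python
import hashlib
import struct

# Skippable-frame magic 0x184D2A5E and frame length 0x20,
# both little-endian: 0x5e, 0x2a, 0x4d, 0x18, 0x20, 0x00, 0x00, 0x00.
DCZ_MAGIC = struct.pack("<II", 0x184D2A5E, 0x20)

def dcz_header(dictionary_bytes):
    """40-byte "dcz" header: the fixed 8-byte sequence followed by
    the SHA-256 digest of the external dictionary."""
    return DCZ_MAGIC + hashlib.sha256(dictionary_bytes).digest()

def dcz_window_limit(dictionary_size):
    """Maximum window size a dcz client must support (Section 5):
    the greater of 8 MB and 1.25x the dictionary size, capped at 128 MB."""
    return min(max(8 * 2**20, int(dictionary_size * 1.25)), 128 * 2**20)
```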
+
+
+

+6. Negotiating the content encoding +

+

When a compression dictionary is available for use compressing the response to +a given request, the encoding to be used is negotiated through the regular +mechanism for negotiating content encoding in HTTP through the "Accept-Encoding" +request header and "Content-Encoding" response header.

+

The dictionary to use is negotiated separately and advertised in the +"Available-Dictionary" request header.

+
+
+

+6.1. Accept-Encoding +

+

When a dictionary is available for use on a given request, and the client +chooses to make dictionary-based content-encoding available, the client adds +the dictionary-aware content encodings that it supports to the +"Accept-Encoding" request header. e.g.:

+
+
+Accept-Encoding: gzip, deflate, br, zstd, dcb, dcz
+
+
+

When a client does not have a stored dictionary that matches the request, or +chooses not to use one for the request, the client MUST NOT send its +dictionary-aware content-encodings in the "Accept-Encoding" request header.

+
+
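The rule above is small enough to sketch directly: the dictionary-aware codings are appended only when a matching dictionary is stored and the client chooses to use it (the helper name and default coding list are illustrative).

```python
def accept_encoding(base=("gzip", "br", "zstd"), use_dictionary=False):
    """Sketch of Section 6.1: only advertise dcb/dcz when a matching
    dictionary is available and the client chooses to use it."""
    encodings = list(base)
    if use_dictionary:
        encodings += ["dcb", "dcz"]
    return ", ".join(encodings)
```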
+
+
+

+6.2. Content-Encoding +

+

If a server supports one of the dictionary encodings advertised by the client +and chooses to compress the content of the response using the dictionary that +the client has advertised then it sets the "Content-Encoding" response header +to the appropriate value for the algorithm selected. e.g.:

+
+
+Content-Encoding: dcb
+
+
+

If the response is cacheable, it MUST include a "Vary" header to prevent caches +from serving dictionary-compressed resources to clients that don't support them, or +from serving the response compressed with the wrong dictionary:

+
+
+Vary: accept-encoding, available-dictionary
+
+
+
+
+
+
+
+
+

+7. IANA Considerations +

+
+
+

+7.1. Content Encoding +

+

IANA is asked to enter the following into the "HTTP Content Coding Registry" +registry maintained at +<https://www.iana.org/assignments/http-parameters/http-parameters.xhtml>:

+
• Name: dcb

• Description: "Dictionary-Compressed Brotli" data format.

• Reference: This document

• Notes: Section 4

IANA is asked to enter the following into the "HTTP Content Coding Registry" +registry maintained at +<https://www.iana.org/assignments/http-parameters/http-parameters.xhtml>:

+
• Name: dcz

• Description: "Dictionary-Compressed Zstandard" data format.

• Reference: This document

• Notes: Section 5
+
+
+
+

+7.2. Header Field Registration +

+

IANA is asked to update the +"Hypertext Transfer Protocol (HTTP) Field Name Registry" registry maintained at +<https://www.iana.org/assignments/http-fields/http-fields.xhtml> according +to the table below:

+ + + + + + + + + + + + + + + + + + + + + + + + + + +
Table 1
Field NameStatusReference
Use-As-Dictionarypermanent + Section 2.1 of this document
Available-Dictionarypermanent + Section 2.2 of this document
Dictionary-IDpermanent + Section 2.3 of this document
+
+
+ +
+
+
+
+

+8. Compatibility Considerations +

+

It is not unusual for there to be devices on the network path that intercept, +inspect, and process HTTP requests (web proxies, firewalls, intrusion detection +systems, etc.). To minimize the risk of these devices incorrectly processing +dictionary-compressed responses, compression dictionary transport MUST only be +used in secure contexts (HTTPS).

+
+
+
+
+

+9. Security Considerations +

+

The security considerations for Brotli [RFC7932], Shared Brotli +[SHARED-BROTLI] and Zstandard [RFC8878] apply to the +dictionary-based versions of the respective algorithms.

+
+
+

+9.1. Changing content +

+

The dictionary must be treated with the same security precautions as +the content, because a change to the dictionary can result in a +change to the decompressed content.

+

The dictionary is validated using a SHA-256 hash of the content to make sure +that the client and server are both using the same dictionary. The strength +of the SHA-256 hash algorithm isn't explicitly needed to counter attacks +since the dictionary is being served from the same origin as the content. That +said, if a weakness is discovered in SHA-256 and it is determined that the +dictionary negotiation should use a different hash algorithm, the +"Use-As-Dictionary" response header can be extended to specify a different +algorithm and the server would just ignore any "Available-Dictionary" requests +that do not use the updated hash.

+
+
+
+
+

+9.2. Reading content +

+

The compression attacks in Section 2.6 of [RFC7457] show that it's a bad idea +to compress data from mixed (e.g. public and private) sources -- the data +sources include not only the compressed data but also the dictionaries. For +example, if you compress secret cookies using a public-data-only dictionary, +you still leak information about the cookies.

+

Not only can the dictionary reveal information about the compressed +data, but vice versa, data compressed with the dictionary can reveal +the contents of the dictionary when an adversary can control parts of +data to compress and see the compressed size. On the other hand, if +the adversary can control the dictionary, the adversary can learn +information about the compressed data.

+
+
+
+
+

+9.3. Security Mitigations +

+

If any of the mitigations do not pass, the client MUST drop the response and +return an error.

+
+
+

+9.3.1. Cross-origin protection +

+

To make sure that a dictionary can only impact content from the same origin +where the dictionary was served, the URL Pattern used for matching a dictionary +to requests (Section 2.1.1) is guaranteed to be for the same origin that the +dictionary is served from.

+
+
+
+
+

+9.3.2. Response readability +

+

For clients, like web browsers, that provide additional protection against the +readability of the payload of a response and against user tracking, additional +protections MUST be taken to make sure that the use of dictionary-based +compression does not reveal information that would not otherwise be available.

+

In these cases, dictionary compression MUST only be used when both the +dictionary and the compressed response are fully readable by the client.

+

In browser terms, that means that both are either same-origin to the context +they are being fetched from or that the response is cross-origin and passes +the CORS check (see Part "CORS check" of [FETCH]).

+
+
+
+
+

+9.3.3. Server Responsibility +

+

As with any usage of compressed content in a secure context, a potential +timing attack exists if the attacker can control any part of the dictionary, +or if it can read the dictionary and control any part of the content being +compressed, while performing multiple requests that vary the dictionary or +injected content. Under such an attack, the changing size or processing time +of the response reveals information about the content, which might be +sufficient to read the supposedly secure response.

+

In general, a server can mitigate such attacks by preventing variations per +request, as in preventing active use of multiple dictionaries for the same +content, disabling compression when any portion of the content comes from +uncontrolled sources, and securing access and control over the dictionary +content in the same way as the response content. In addition, the following +requirements on a server are intended to disable dictionary-aware compression +when the client provides CORS request header fields that indicate a +cross-origin request context.

+

The following algorithm will return FALSE for cross-origin requests where +precautions such as not using dictionary-based compression should be +considered:

+
1. If there is no "Sec-Fetch-Site" request header then return TRUE.

2. If the value of the "Sec-Fetch-Site" request header is "same-origin" then return TRUE.

3. If there is no "Sec-Fetch-Mode" request header then return TRUE.

4. If the value of the "Sec-Fetch-Mode" request header is "navigate" or "same-origin" then return TRUE.

5. If the value of the "Sec-Fetch-Mode" request header is "cors":

   • If the response does not include an "Access-Control-Allow-Origin" response header then return FALSE.

   • If the request does not include an "Origin" request header then return FALSE.

   • If the value of the "Access-Control-Allow-Origin" response header is "*" then return TRUE.

   • If the value of the "Access-Control-Allow-Origin" response header matches the value of the "Origin" request header then return TRUE.

6. Return FALSE.
+
+
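The server-side algorithm above maps directly onto a small function. This sketch assumes header names arrive as lower-cased keys in plain dicts; it is illustrative, not normative.

```python
def allow_dictionary_compression(request_headers, response_headers):
    """Sketch of the Section 9.3.3 check: returns False for
    cross-origin request contexts where a server should consider
    precautions such as not using dictionary-based compression."""
    site = request_headers.get("sec-fetch-site")
    if site is None or site == "same-origin":
        return True
    mode = request_headers.get("sec-fetch-mode")
    if mode is None or mode in ("navigate", "same-origin"):
        return True
    if mode == "cors":
        acao = response_headers.get("access-control-allow-origin")
        origin = request_headers.get("origin")
        if acao is None or origin is None:
            return False
        if acao == "*" or acao == origin:
            return True
    return False
```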
+
+
+
+
+
+

+10. Privacy Considerations +

+

Since dictionaries are advertised in future requests using the hash of the +content of the dictionary, it is possible to abuse the dictionary to turn it +into a tracking cookie.

+

To mitigate any additional tracking concerns, clients MUST treat dictionaries +in the same way that they treat cookies [RFC6265]. This includes partitioning +the storage as cookies are partitioned as well as clearing the dictionaries +whenever cookies are cleared.

+
+
+
+
+

+11. References +

+
+
+

+11.1. Normative References +

+
+
[FETCH]
+
+WHATWG, "Fetch - Living Standard", <https://fetch.spec.whatwg.org/>.
+
+
[FOLDING]
+
+Watsen, K., Auerswald, E., Farrel, A., and Q. Wu, "Handling Long Lines in Content of Internet-Drafts and RFCs", RFC 8792, DOI 10.17487/RFC8792, , <https://www.rfc-editor.org/rfc/rfc8792>.
+
+
[HTTP]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, , <https://www.rfc-editor.org/rfc/rfc9110>.
+
+
[HTTP-CACHING]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Caching", STD 98, RFC 9111, DOI 10.17487/RFC9111, , <https://www.rfc-editor.org/rfc/rfc9111>.
+
+
[RFC2119]
+
+Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/rfc/rfc2119>.
+
+
[RFC8174]
+
+Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/rfc/rfc8174>.
+
+
[RFC8878]
+
+Collet, Y. and M. Kucherawy, Ed., "Zstandard Compression and the 'application/zstd' Media Type", RFC 8878, DOI 10.17487/RFC8878, , <https://www.rfc-editor.org/rfc/rfc8878>.
+
+
[SHA-256]
+
+Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms (SHA and SHA-based HMAC and HKDF)", RFC 6234, DOI 10.17487/RFC6234, , <https://www.rfc-editor.org/rfc/rfc6234>.
+
+
[SHARED-BROTLI]
+
+"Shared Brotli Compressed Data Format", , <https://datatracker.ietf.org/doc/draft-vandevenne-shared-brotli-format/>.
+
+
[STRUCTURED-FIELDS]
+
+"Structured Field Values for HTTP", , <https://datatracker.ietf.org/doc/draft-ietf-httpbis-sfbis/>.
+
+
[URL]
+
+Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, DOI 10.17487/RFC3986, , <https://www.rfc-editor.org/rfc/rfc3986>.
+
+
[URLPATTERN]
+
+WHATWG, "URL Pattern - Living Standard", <https://urlpattern.spec.whatwg.org/>.
+
+
[WEB-LINKING]
+
+Nottingham, M., "Web Linking", RFC 8288, DOI 10.17487/RFC8288, , <https://www.rfc-editor.org/rfc/rfc8288>.
+
+
+
+
+
+
+

+11.2. Informative References +

+
+
[RFC5861]
+
+Nottingham, M., "HTTP Cache-Control Extensions for Stale Content", RFC 5861, DOI 10.17487/RFC5861, , <https://www.rfc-editor.org/rfc/rfc5861>.
+
+
[RFC6265]
+
+Barth, A., "HTTP State Management Mechanism", RFC 6265, DOI 10.17487/RFC6265, , <https://www.rfc-editor.org/rfc/rfc6265>.
+
+
[RFC7457]
+
+Sheffer, Y., Holz, R., and P. Saint-Andre, "Summarizing Known Attacks on Transport Layer Security (TLS) and Datagram TLS (DTLS)", RFC 7457, DOI 10.17487/RFC7457, , <https://www.rfc-editor.org/rfc/rfc7457>.
+
+
[RFC7932]
+
+Alakuijala, J. and Z. Szabadka, "Brotli Compressed Data Format", RFC 7932, DOI 10.17487/RFC7932, , <https://www.rfc-editor.org/rfc/rfc7932>.
+
+
+
+
+
+
+
+
+

+Authors' Addresses +

+
+
Patrick Meenan (editor)
+
Google LLC
+ +
+
+
Yoav Weiss (editor)
+
Shopify Inc
+ +
+
+
+ + + diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-compression-dictionary.txt b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-compression-dictionary.txt new file mode 100644 index 000000000..f003016f5 --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-compression-dictionary.txt @@ -0,0 +1,908 @@ + + + + +HTTP P. Meenan, Ed. +Internet-Draft Google LLC +Intended status: Standards Track Y. Weiss, Ed. +Expires: 11 July 2025 Shopify Inc + 7 January 2025 + + + Compression Dictionary Transport + draft-ietf-httpbis-compression-dictionary-latest + +Abstract + + This document specifies a mechanism for dictionary-based compression + in the Hypertext Transfer Protocol (HTTP). By utilizing this + technique, clients and servers can reduce the size of transmitted + data, leading to improved performance and reduced bandwidth + consumption. This document extends existing HTTP compression methods + and provides guidelines for the delivery and use of compression + dictionaries within the HTTP protocol. + +About This Document + + This note is to be removed before publishing as an RFC. + + Status information for this document may be found at + https://datatracker.ietf.org/doc/draft-ietf-httpbis-compression- + dictionary/. + + Discussion of this document takes place on the HTTP Working Group + mailing list (mailto:ietf-http-wg@w3.org), which is archived at + https://lists.w3.org/Archives/Public/ietf-http-wg/. Working Group + information can be found at https://httpwg.org/. + + Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/compression- + dictionary. + +Status of This Memo + + This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF). Note that other groups may also distribute + working documents as Internet-Drafts. 
The list of current Internet- + Drafts is at https://datatracker.ietf.org/drafts/current/. + + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." + + This Internet-Draft will expire on 11 July 2025. + +Copyright Notice + + Copyright (c) 2025 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents (https://trustee.ietf.org/ + license-info) in effect on the date of publication of this document. + Please review these documents carefully, as they describe your rights + and restrictions with respect to this document. Code Components + extracted from this document must include Revised BSD License text as + described in Section 4.e of the Trust Legal Provisions and are + provided without warranty as described in the Revised BSD License. + +Table of Contents + + 1. Introduction + 1.1. Use Cases + 1.1.1. Version Upgrade + 1.1.2. Common Content + 1.2. Notational Conventions + 2. Dictionary Negotiation + 2.1. Use-As-Dictionary + 2.1.1. match + 2.1.2. match-dest + 2.1.3. id + 2.1.4. type + 2.1.5. Examples + 2.2. Available-Dictionary + 2.2.1. Dictionary freshness requirement + 2.2.2. Dictionary URL matching + 2.2.3. Multiple matching dictionaries + 2.3. Dictionary-ID + 3. The 'compression-dictionary' Link Relation Type + 4. Dictionary-Compressed Brotli + 5. Dictionary-Compressed Zstandard + 6. Negotiating the content encoding + 6.1. Accept-Encoding + 6.2. Content-Encoding + 7. IANA Considerations + 7.1. Content Encoding + 7.2. Header Field Registration + 7.3. Link Relation Registration + 8. Compatibility Considerations + 9. Security Considerations + 9.1. Changing content + 9.2. Reading content + 9.3. Security Mitigations + 9.3.1. 
Cross-origin protection + 9.3.2. Response readability + 9.3.3. Server Responsibility + 10. Privacy Considerations + 11. References + 11.1. Normative References + 11.2. Informative References + Authors' Addresses + +1. Introduction + + This specification defines a mechanism for using designated [HTTP] + responses as an external dictionary for future HTTP responses for + compression schemes that support using external dictionaries (e.g., + Brotli [RFC7932] and Zstandard [RFC8878]). + + This document describes the HTTP headers used for negotiating + dictionary usage and registers content encoding values for + compressing with Brotli and Zstandard using a negotiated dictionary. + + The negotiation of dictionary usage leverages HTTP's content + negotiation (see Section 12 of [HTTP]) and is usable with all + versions of HTTP. + +1.1. Use Cases + + Any HTTP response can be specified to be used as a compression + dictionary for future HTTP requests which provides a lot of + flexibility. There are two common use cases that are seen + frequently: + +1.1.1. Version Upgrade + + Using a previous version of a resource as a dictionary for a newer + version enables delivery of a delta-compressed version of the + changes, usually resulting in significantly smaller responses than + can be achieved by compression alone. + + For example: + + Client Server + | | + | GET /app.v1.js | + |------------------------------------------------->| + | | + | 200 OK | + | Use-As-Dictionary: match="/app*js" | + | | + |<-------------------------------------------------| + | | + + Some time later ... + + Client Server + | | + | GET /app.v2.js | + | Available-Dictionary: :pZGm1A...2a2fFG4=: | + | Accept-Encoding: gzip, br, zstd, dcb, dcz | + |------------------------------------------------->| + | | + | 200 OK | + | Content-Encoding: dcb | + | | + |<-------------------------------------------------| + | | + + Figure 1: Version Upgrade Example + +1.1.2. 
Common Content + + If several resources share common patterns in their responses then it + can be useful to reference an external dictionary that contains those + common patterns, effectively compressing them out of the responses. + Some examples of this are common template HTML for similar pages + across a site and common keys and values in API calls. + + For example: + + Client Server + | | + | GET /index.html | + |--------------------------------------------------->| + | | + | 200 OK | + | Link: <.../dict>; rel="compression-dictionary" | + | | + |<---------------------------------------------------| + | | + | GET /dict | + |--------------------------------------------------->| + | | + | 200 OK | + | Use-As-Dictionary: match="/*html" | + |<---------------------------------------------------| + | | + + Some time later ... + + Client Server + | | + | GET /page2.html | + | Available-Dictionary: :pZGm1A...2a2fFG4=: | + | Accept-Encoding: gzip, br, zstd, dcb, dcz | + |--------------------------------------------------->| + | | + | 200 OK | + | Content-Encoding: dcb | + | | + |<---------------------------------------------------| + | | + + Figure 2: Common Content Example + +1.2. Notational Conventions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + + This document uses the following terminology from Section 3 of + [STRUCTURED-FIELDS] to specify syntax and parsing: Dictionary, + String, Inner List, Token, and Byte Sequence. + + This document uses the line folding strategies described in + [FOLDING]. + + This document also uses terminology from [HTTP] and [HTTP-CACHING]. + +2. Dictionary Negotiation + +2.1. 
Use-As-Dictionary + + When responding to a HTTP Request, a server can advertise that the + response can be used as a dictionary for future requests for URLs + that match the rules specified in the Use-As-Dictionary response + header. + + The Use-As-Dictionary response header is a Structured Field + [STRUCTURED-FIELDS] Dictionary with values for "match", "match-dest", + "id", and "type". + +2.1.1. match + + The "match" value of the Use-As-Dictionary header is a String value + that provides the URL Pattern to use for request matching (see + [URLPATTERN]). + + The URL Pattern used for matching does not support using regular + expressions. + + The following algorithm is used to validate that a given String value + is a valid URL Pattern that does not use regular expressions and is + for the same Origin (Section 4.3.1 of [HTTP]) as the dictionary. It + will return TRUE for a valid match pattern and FALSE for an invalid + pattern that MUST NOT be used: + + 1. Let MATCH be the value of "match" for the given dictionary. + + 2. Let URL be the URL of the dictionary request. + + 3. Let PATTERN be a URL pattern created by running the steps to + create a URL pattern by setting input=MATCH, and baseURL=URL (see + Part "create" of [URLPATTERN]). + + 4. If the result of running the "has regexp groups" steps for + PATTERN returns TRUE then return FALSE (see Part "has regexp + groups" of [URLPATTERN]). + + 5. Return TRUE. + + The "match" value is required and MUST be included in the Use-As- + Dictionary response header for the dictionary to be considered valid. + + Operating at the HTTP-level, the specified match patterns will + operate on the percent-encoded version of the URL path (see Section 2 + of [URL]). + + For example the URL "http://www.example.com/düsseldorf" would be + encoded as "http://www.example.com/d%C3%BCsseldorf" and a relevant + match pattern would be: + + Use-As-Dictionary: match="/d%C3%BCsseldorf" + +2.1.2. 
match-dest + + The "match-dest" value of the Use-As-Dictionary header is an Inner + List of String values that provides a list of Fetch request + destinations for the dictionary to match (see Part "destination" of + [FETCH]). + + An empty list for "match-dest" MUST match all destinations. + + For clients that do not support request destinations, the client MUST + treat it as an empty list and match all destinations. + + The "match-dest" value is optional and defaults to an empty list. + +2.1.3. id + + The "id" value of the Use-As-Dictionary header is a String value that + specifies a server identifier for the dictionary. If an "id" value + is present and has a string length longer than zero then it MUST be + sent to the server in a "Dictionary-ID" request header when the + client sends an "Available-Dictionary" request header for the same + dictionary (see Section 2.2). + + The server identifier MUST be treated as an opaque string by the + client. + + The server identifier MUST NOT be relied upon by the server to + guarantee the contents of the dictionary. The dictionary hash MUST + be validated before use. + + The "id" value string length (after any decoding) supports up to 1024 + characters. + + The "id" value is optional and defaults to the empty string. + +2.1.4. type + + The "type" value of the Use-As-Dictionary header is a Token value + that describes the file format of the supplied dictionary. + + "raw" is defined as a dictionary format which represents an + unformatted blob of bytes suitable for any compression scheme to use. + + If a client receives a dictionary with a type that it does not + understand, it MUST NOT use the dictionary. + + The "type" value is optional and defaults to "raw". + +2.1.5. Examples + +2.1.5.1. 
Path Prefix
+
+   A response that contained a response header:
+
+   NOTE: '\' line wrapping per RFC 8792
+
+   Use-As-Dictionary: \
+     match="/product/*", match-dest=("document")
+
+   Would specify matching any document request for a URL with a path
+   prefix of /product/ on the same Origin (Section 4.3.1 of [HTTP]) as
+   the original request.
+
+2.1.5.2.  Versioned Directories
+
+   A response that contained a response header:
+
+   Use-As-Dictionary: match="/app/*/main.js"
+
+   Would match any path that starts with "/app/" and ends with
+   "/main.js".
+
+2.2.  Available-Dictionary
+
+   When an HTTP client makes a request for a resource for which it has
+   an appropriate dictionary, it can add an "Available-Dictionary"
+   request header to the request to indicate to the server that it has
+   a dictionary available to use for compression.
+
+   The "Available-Dictionary" request header is a Structured Field
+   [STRUCTURED-FIELDS] Byte Sequence containing the [SHA-256] hash of
+   the contents of a single available dictionary.
+
+   The client MUST only send a single "Available-Dictionary" request
+   header with a single hash value for the best match that it has
+   available.
+
+   For example:
+
+   Available-Dictionary: :pZGm1Av0IEBKARczz7exkNYsZb8LzaMrV7J32a2fFG4=:
+
+2.2.1.  Dictionary freshness requirement
+
+   To be considered as a match, the dictionary resource MUST be either
+   fresh [HTTP-CACHING] or allowed to be served stale (see, e.g.,
+   [RFC5861]).
+
+2.2.2.  Dictionary URL matching
+
+   When a dictionary is stored as a result of a "Use-As-Dictionary"
+   directive, it includes a "match" string and an optional "match-dest"
+   list that are used to match an outgoing request from a client to the
+   available dictionaries.
+
+   To see if an outbound request matches a given dictionary, the
+   following algorithm will return TRUE for a successful match and FALSE
+   for no-match:
+
+   1.  If the current client supports request destinations and a
+       "match-dest" value was provided with the dictionary:
+
+       *  Let DEST be the value of "match-dest" for the given
+          dictionary.
+
+       *  Let REQUEST_DEST be the value of the destination for the
+          current request.
+
+       *  If DEST is not an empty list and REQUEST_DEST is not in the
+          DEST list of destinations, return FALSE.
+
+   2.  Let BASEURL be the URL of the dictionary request.
+
+   3.  Let URL represent the URL of the outbound request being checked.
+
+   4.  If the Origin of BASEURL and the Origin of URL are not the same,
+       return FALSE (see Section 4.3.1 of [HTTP]).
+
+   5.  Let MATCH be the value of "match" for the given dictionary.
+
+   6.  Let PATTERN be a URL pattern created by running the steps to
+       create a URL pattern by setting input=MATCH, and baseURL=BASEURL
+       (see Part "create" of [URLPATTERN]).
+
+   7.  Return the result of running the "match" steps on PATTERN with
+       input=URL, which will check for a match between the request URL
+       and the supplied "match" string (see Part "match" of
+       [URLPATTERN]).
+
+2.2.3.  Multiple matching dictionaries
+
+   When there are multiple dictionaries that match a given request URL,
+   the client MUST pick a single dictionary using the following rules:
+
+   1.  For clients that support request destinations, a dictionary that
+       specifies and matches a "match-dest" takes precedence over a
+       match that does not use a destination.
+
+   2.  Given equivalent destination precedence, the dictionary with the
+       longest "match" takes precedence.
+
+   3.  Given equivalent destination and match length precedence, the
+       most recently fetched dictionary takes precedence.
+
+2.3.  Dictionary-ID
+
+   When an HTTP client makes a request for a resource for which it has
+   an appropriate dictionary, and the dictionary was stored with a
+   server-provided "id" in the Use-As-Dictionary response, the client
+   MUST echo the stored "id" in a "Dictionary-ID" request header. 
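The matching precedence and the request headers described in this section can be sketched as a client-side selection routine. This is an illustrative Python sketch only, not part of the specification; names such as StoredDictionary, pick_dictionary, and request_headers_for are hypothetical, and the candidates are assumed to have already passed the URL-matching algorithm above.

```python
import base64
import hashlib
from dataclasses import dataclass

@dataclass
class StoredDictionary:
    # Hypothetical client-side record; field names are illustrative only.
    content: bytes       # the stored dictionary bytes
    match: str           # the "match" URL Pattern string
    match_dest: list     # "match-dest" destinations ([] matches all)
    server_id: str       # the "id" value ("" if none was given)
    fetch_order: int     # larger value = more recently fetched

def pick_dictionary(candidates):
    # Precedence among already-matching dictionaries (Section 2.2.3):
    # a matching "match-dest" first, then the longest "match", then
    # the most recently fetched dictionary.
    if not candidates:
        return None
    return max(candidates,
               key=lambda d: (bool(d.match_dest), len(d.match),
                              d.fetch_order))

def request_headers_for(d: StoredDictionary) -> dict:
    # "Available-Dictionary" carries the SHA-256 hash as a Structured
    # Field Byte Sequence (base64 between colons); "Dictionary-ID"
    # echoes a non-empty server-provided "id" (Sections 2.2 and 2.3).
    digest = hashlib.sha256(d.content).digest()
    headers = {"Available-Dictionary":
               ":" + base64.b64encode(digest).decode() + ":"}
    if d.server_id:
        headers["Dictionary-ID"] = '"' + d.server_id + '"'
    return headers
```

A real client would additionally apply the freshness and destination checks before a dictionary ever reaches this selection step.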
+
+   The "Dictionary-ID" request header is a Structured Field
+   [STRUCTURED-FIELDS] String of up to 1024 characters (after any
+   decoding) and MUST be identical to the server-provided "id".
+
+   For example, given an HTTP response that set a dictionary ID:
+
+   Use-As-Dictionary: match="/app/*/main.js", id="dictionary-12345"
+
+   A future request that matches the given dictionary will include both
+   the hash and the ID:
+
+   Available-Dictionary: :pZGm1Av0IEBKARczz7exkNYsZb8LzaMrV7J32a2fFG4=:
+   Dictionary-ID: "dictionary-12345"
+
+3.  The 'compression-dictionary' Link Relation Type
+
+   This specification defines the 'compression-dictionary' link relation
+   type [WEB-LINKING] that provides a mechanism for an HTTP response to
+   provide a URL for a compression dictionary that is related to, but
+   not directly used by, the current HTTP response.
+
+   The 'compression-dictionary' link relation type indicates that
+   fetching and caching the specified resource is likely to be
+   beneficial for future requests.  The response to some of those future
+   requests is likely to be able to use the indicated resource as a
+   compression dictionary.
+
+   Clients can fetch the provided resource at a time that they determine
+   would be appropriate.
+
+   The response to the fetch for the compression dictionary needs to
+   include a "Use-As-Dictionary" response header and caching response
+   headers for it to be usable as a compression dictionary.  The link
+   relation only provides the mechanism for triggering the fetch of the
+   dictionary.
+
+   The following example shows a link to a resource at
+   https://example.org/dict.dat that is expected to produce a
+   compression dictionary:
+
+   Link: <https://example.org/dict.dat>; rel="compression-dictionary"
+
+4.  Dictionary-Compressed Brotli
+
+   The "dcb" content encoding identifies a resource that is a
+   "Dictionary-Compressed Brotli" stream.
+
+   A "Dictionary-Compressed Brotli" stream has a fixed header that is
+   followed by a Shared Brotli [SHARED-BROTLI] stream. 
The header + consists of a fixed 4-byte sequence and a 32-byte hash of the + external dictionary that was used. The Shared Brotli stream is + created using the referenced external dictionary and a compression + window that is at most 16 MB in size. + + The dictionary used for the "dcb" content encoding is a "raw" + dictionary type as defined in Section 2.1.4 and is treated as a + prefix dictionary as defined in section 9.2 of the Shared Brotli + Compressed Data Format draft. [SHARED-BROTLI] + + The 36-byte fixed header is as follows: + + Magic_Number: 4 fixed bytes: 0xff, 0x44, 0x43, 0x42. + + SHA_256_Hash: 32 bytes. SHA-256 hash digest of the dictionary + [SHA-256]. + + Clients that announce support for dcb content encoding MUST be able + to decompress resources that were compressed with a window size of up + to 16 MB. + + With Brotli compression, the full dictionary is available during + compression and decompression independent of the compression window, + allowing for delta-compression of resources larger than the + compression window. + +5. Dictionary-Compressed Zstandard + + The "dcz" content encoding identifies a resource that is a + "Dictionary-Compressed Zstandard" stream. + + A "Dictionary-Compressed Zstandard" stream is a binary stream that + starts with a 40-byte fixed header and is followed by a Zstandard + [RFC8878] stream of the response that has been compressed with an + external dictionary. + + The dictionary used for the "dcz" content encoding is a "raw" + dictionary type as defined in Section 2.1.4 and is treated as a raw + dictionary as per section 5 of RFC 8878. + + The 40-byte header consists of a fixed 8-byte sequence followed by + the 32-byte SHA-256 hash of the external dictionary that was used to + compress the resource: + + Magic_Number: 8 fixed bytes: 0x5e, 0x2a, 0x4d, 0x18, 0x20, 0x00, + 0x00, 0x00. + + SHA_256_Hash: 32 bytes. SHA-256 hash digest of the dictionary + [SHA-256]. 
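The two fixed headers (the 36-byte "dcb" header above and this 40-byte "dcz" header) can be assembled as in the following illustrative Python sketch; the function names are hypothetical and the sketch is not part of the specification:

```python
import hashlib

# Fixed magic numbers from the "dcb" and "dcz" header definitions.
DCB_MAGIC = bytes([0xFF, 0x44, 0x43, 0x42])
DCZ_MAGIC = bytes([0x5E, 0x2A, 0x4D, 0x18, 0x20, 0x00, 0x00, 0x00])

def dcb_header(dictionary: bytes) -> bytes:
    """36-byte header: 4-byte magic + SHA-256 digest of the dictionary."""
    return DCB_MAGIC + hashlib.sha256(dictionary).digest()

def dcz_header(dictionary: bytes) -> bytes:
    """40-byte header: Zstandard skippable-frame magic 0x184D2A5E and
    frame length 0x20 (both little-endian), then the SHA-256 digest."""
    return DCZ_MAGIC + hashlib.sha256(dictionary).digest()
```

Prepending either header to the corresponding compressed stream yields the on-the-wire body for the "dcb" or "dcz" content encoding.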
+
+   The 40-byte header is a Zstandard skippable frame (little-endian
+   0x184D2A5E) with a 32-byte length (little-endian 0x00000020) that is
+   compatible with existing Zstandard decoders.
+
+   Clients that announce support for dcz content encoding MUST be able
+   to decompress resources that were compressed with a window size of at
+   least 8 MB or 1.25 times the size of the dictionary, whichever is
+   greater, up to a maximum of 128 MB.
+
+   The window size used will be encoded in the content (currently, this
+   can be expressed in powers of two only) and it MUST be lower than
+   this limit.  An implementation MAY treat a window size that exceeds
+   the limit as a decoding error.
+
+   With Zstandard compression, the full dictionary is available during
+   compression and decompression until the size of the input exceeds the
+   compression window.  Beyond that point the dictionary becomes
+   unavailable.  Using a compression window that is 1.25 times the size
+   of the dictionary allows for full delta compression of resources that
+   have grown by 25% between releases while still giving the client
+   control over the memory it will need to allocate for a given
+   response.
+
+6.  Negotiating the content encoding
+
+   When a compression dictionary is available for compressing the
+   response to a given request, the encoding to be used is negotiated
+   through the regular HTTP mechanism for negotiating content encoding:
+   the "Accept-Encoding" request header and the "Content-Encoding"
+   response header.
+
+   The dictionary to use is negotiated separately and advertised in the
+   "Available-Dictionary" request header.
+
+6.1.  Accept-Encoding
+
+   When a dictionary is available for use on a given request, and the
+   client chooses to make dictionary-based content-encoding available,
+   the client adds the dictionary-aware content encodings that it
+   supports to the "Accept-Encoding" request header. 
e.g.: + + Accept-Encoding: gzip, deflate, br, zstd, dcb, dcz + + When a client does not have a stored dictionary that matches the + request, or chooses not to use one for the request, the client MUST + NOT send its dictionary-aware content-encodings in the "Accept- + Encoding" request header. + +6.2. Content-Encoding + + If a server supports one of the dictionary encodings advertised by + the client and chooses to compress the content of the response using + the dictionary that the client has advertised then it sets the + "Content-Encoding" response header to the appropriate value for the + algorithm selected. e.g.: + + Content-Encoding: dcb + + If the response is cacheable, it MUST include a "Vary" header to + prevent caches serving dictionary-compressed resources to clients + that don't support them or serving the response compressed with the + wrong dictionary: + + Vary: accept-encoding, available-dictionary + +7. IANA Considerations + +7.1. Content Encoding + + IANA is asked to enter the following into the "HTTP Content Coding + Registry" registry maintained at : + + * Name: dcb + + * Description: "Dictionary-Compressed Brotli" data format. + + * Reference: This document + + * Notes: Section 4 + + IANA is asked to enter the following into the "HTTP Content Coding + Registry" registry maintained at : + + * Name: dcz + + * Description: "Dictionary-Compressed Zstandard" data format. + + * Reference: This document + + * Notes: Section 5 + +7.2. 
Header Field Registration + + IANA is asked to update the "Hypertext Transfer Protocol (HTTP) Field + Name Registry" registry maintained at + + according to the table below: + + +======================+===========+==============================+ + | Field Name | Status | Reference | + +======================+===========+==============================+ + | Use-As-Dictionary | permanent | Section 2.1 of this document | + +----------------------+-----------+------------------------------+ + | Available-Dictionary | permanent | Section 2.2 of this document | + +----------------------+-----------+------------------------------+ + | Dictionary-ID | permanent | Section 2.3 of this document | + +----------------------+-----------+------------------------------+ + + Table 1 + +7.3. Link Relation Registration + + IANA is asked to update the "Link Relation Types" registry maintained + at : + + * Relation Name: compression-dictionary + + * Description: Refers to a compression dictionary used for content + encoding. + + * Reference: This document, Section 3 + +8. Compatibility Considerations + + It is not unusual for there to be devices on the network path that + intercept, inspect and process HTTP requests (web proxies, firewalls, + intrusion detection systems, etc). To minimize the risk of these + devices incorrectly processing dictionary-compressed responses, + compression dictionary transport MUST only be used in secure contexts + (HTTPS). + +9. Security Considerations + + The security considerations for Brotli [RFC7932], Shared Brotli + [SHARED-BROTLI] and Zstandard [RFC8878] apply to the dictionary-based + versions of the respective algorithms. + +9.1. Changing content + + The dictionary must be treated with the same security precautions as + the content, because a change to the dictionary can result in a + change to the decompressed content. 
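As an illustration of this precaution, a client might verify that the hash embedded in a "dcb" or "dcz" response body (per the header layouts in Sections 4 and 5) names its stored dictionary before decompressing. This is a hypothetical sketch, not normative text:

```python
import hashlib

# (magic prefix, total header length) per dictionary content encoding.
FORMATS = {
    "dcb": (bytes([0xFF, 0x44, 0x43, 0x42]), 36),
    "dcz": (bytes([0x5E, 0x2A, 0x4D, 0x18, 0x20, 0x00, 0x00, 0x00]), 40),
}

def validate_dictionary(encoding: str, body: bytes,
                        dictionary: bytes) -> bool:
    """Return True only if the response header names the stored
    dictionary: the magic must match and the embedded SHA-256 digest
    must equal the digest of the locally stored dictionary."""
    magic, header_len = FORMATS[encoding]
    if len(body) < header_len or not body.startswith(magic):
        return False
    embedded_hash = body[len(magic):header_len]
    return embedded_hash == hashlib.sha256(dictionary).digest()
```

A client failing this check would discard the response rather than decompress it with a mismatched dictionary.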
+
+   The dictionary is validated using a SHA-256 hash of the content to
+   make sure that the client and server are both using the same
+   dictionary.  The strength of the SHA-256 hash algorithm isn't
+   explicitly needed to counter attacks since the dictionary is being
+   served from the same origin as the content.  That said, if a weakness
+   is discovered in SHA-256 and it is determined that the dictionary
+   negotiation should use a different hash algorithm, the "Use-As-
+   Dictionary" response header can be extended to specify a different
+   algorithm, and the server would simply ignore any "Available-
+   Dictionary" requests that do not use the updated hash.
+
+9.2.  Reading content
+
+   The compression attacks in Section 2.6 of [RFC7457] show that it's a
+   bad idea to compress data from mixed (e.g., public and private)
+   sources -- the data sources include not only the compressed data but
+   also the dictionaries.  For example, if you compress secret cookies
+   using a public-data-only dictionary, you still leak information about
+   the cookies.
+
+   Not only can the dictionary reveal information about the compressed
+   data; data compressed with the dictionary can also reveal the
+   contents of the dictionary when an adversary can control parts of the
+   data being compressed and observe the compressed size.  Conversely,
+   if the adversary can control the dictionary, the adversary can learn
+   information about the compressed data.
+
+9.3.  Security Mitigations
+
+   If any of these mitigations fail, the client MUST drop the response
+   and return an error.
+
+9.3.1.  Cross-origin protection
+
+   To make sure that a dictionary can only impact content from the same
+   origin where the dictionary was served, the URL Pattern used for
+   matching a dictionary to requests (Section 2.1.1) is guaranteed to be
+   for the same origin that the dictionary is served from.
+
+9.3.2. 
Response readability + + For clients, like web browsers, that provide additional protection + against the readability of the payload of a response and against user + tracking, additional protections MUST be taken to make sure that the + use of dictionary-based compression does not reveal information that + would not otherwise be available. + + In these cases, dictionary compression MUST only be used when both + the dictionary and the compressed response are fully readable by the + client. + + In browser terms, that means that both are either same-origin to the + context they are being fetched from or that the response is cross- + origin and passes the CORS check (see Part "CORS check" of [FETCH]). + +9.3.3. Server Responsibility + + As with any usage of compressed content in a secure context, a + potential timing attack exists if the attacker can control any part + of the dictionary, or if it can read the dictionary and control any + part of the content being compressed, while performing multiple + requests that vary the dictionary or injected content. Under such an + attack, the changing size or processing time of the response reveals + information about the content, which might be sufficient to read the + supposedly secure response. + + In general, a server can mitigate such attacks by preventing + variations per request, as in preventing active use of multiple + dictionaries for the same content, disabling compression when any + portion of the content comes from uncontrolled sources, and securing + access and control over the dictionary content in the same way as the + response content. In addition, the following requirements on a + server are intended to disable dictionary-aware compression when the + client provides CORS request header fields that indicate a cross- + origin request context. + + The following algorithm will return FALSE for cross-origin requests + where precautions such as not using dictionary-based compression + should be considered: + + 1. 
If there is no "Sec-Fetch-Site" request header then return TRUE. + + 2. if the value of the "Sec-Fetch-Site" request header is "same- + origin" then return TRUE. + + 3. If there is no "Sec-Fetch-Mode" request header then return TRUE. + + 4. If the value of the "Sec-Fetch-Mode" request header is "navigate" + or "same-origin" then return TRUE. + + 5. If the value of the "Sec-Fetch-Mode" request header is "cors": + + * If the response does not include an "Access-Control-Allow- + Origin" response header then return FALSE. + + * If the request does not include an "Origin" request header + then return FALSE. + + * If the value of the "Access-Control-Allow-Origin" response + header is "*" then return TRUE. + + * If the value of the "Access-Control-Allow-Origin" response + header matches the value of the "Origin" request header then + return TRUE. + + 6. return FALSE. + +10. Privacy Considerations + + Since dictionaries are advertised in future requests using the hash + of the content of the dictionary, it is possible to abuse the + dictionary to turn it into a tracking cookie. + + To mitigate any additional tracking concerns, clients MUST treat + dictionaries in the same way that they treat cookies [RFC6265]. This + includes partitioning the storage as cookies are partitioned as well + as clearing the dictionaries whenever cookies are cleared. + +11. References + +11.1. Normative References + + [FETCH] WHATWG, "Fetch - Living Standard", + . + + [FOLDING] Watsen, K., Auerswald, E., Farrel, A., and Q. Wu, + "Handling Long Lines in Content of Internet-Drafts and + RFCs", RFC 8792, DOI 10.17487/RFC8792, June 2020, + . + + [HTTP] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Semantics", STD 97, RFC 9110, + DOI 10.17487/RFC9110, June 2022, + . + + [HTTP-CACHING] + Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Caching", STD 98, RFC 9111, + DOI 10.17487/RFC9111, June 2022, + . 
+ + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + . + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, . + + [RFC8878] Collet, Y. and M. Kucherawy, Ed., "Zstandard Compression + and the 'application/zstd' Media Type", RFC 8878, + DOI 10.17487/RFC8878, February 2021, + . + + [SHA-256] Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms + (SHA and SHA-based HMAC and HKDF)", RFC 6234, + DOI 10.17487/RFC6234, May 2011, + . + + [SHARED-BROTLI] + "Shared Brotli Compressed Data Format", September 2022, + . + + [STRUCTURED-FIELDS] + "Structured Field Values for HTTP", May 2024, + . + + [URL] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform + Resource Identifier (URI): Generic Syntax", STD 66, + RFC 3986, DOI 10.17487/RFC3986, January 2005, + . + + [URLPATTERN] + WHATWG, "URL Pattern - Living Standard", + . + + [WEB-LINKING] + Nottingham, M., "Web Linking", RFC 8288, + DOI 10.17487/RFC8288, October 2017, + . + +11.2. Informative References + + [RFC5861] Nottingham, M., "HTTP Cache-Control Extensions for Stale + Content", RFC 5861, DOI 10.17487/RFC5861, May 2010, + . + + [RFC6265] Barth, A., "HTTP State Management Mechanism", RFC 6265, + DOI 10.17487/RFC6265, April 2011, + . + + [RFC7457] Sheffer, Y., Holz, R., and P. Saint-Andre, "Summarizing + Known Attacks on Transport Layer Security (TLS) and + Datagram TLS (DTLS)", RFC 7457, DOI 10.17487/RFC7457, + February 2015, . + + [RFC7932] Alakuijala, J. and Z. Szabadka, "Brotli Compressed Data + Format", RFC 7932, DOI 10.17487/RFC7932, July 2016, + . 
+ +Authors' Addresses + + Patrick Meenan (editor) + Google LLC + Email: pmeenan@google.com + + + Yoav Weiss (editor) + Shopify Inc + Email: yoav.weiss@shopify.com diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-connect-tcp.html b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-connect-tcp.html new file mode 100644 index 000000000..5ba15ff8a --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-connect-tcp.html @@ -0,0 +1,1828 @@ + + + + + + +Template-Driven HTTP CONNECT Proxying for TCP + + + + + + + + + + + + + + + + + + + + + + + +
Internet-DraftTemplated CONNECT-TCPJanuary 2025
SchwartzExpires 11 July 2025[Page]
+
+
+
+
Workgroup:
+
httpbis
+
Internet-Draft:
+
draft-ietf-httpbis-connect-tcp-latest
+
Published:
+
+ +
+
Intended Status:
+
Standards Track
+
Expires:
+
+
Author:
+
+
+
B. M. Schwartz
+
Meta Platforms, Inc.
+
+
+
+
+

Template-Driven HTTP CONNECT Proxying for TCP

+
+

Abstract

+

TCP proxying using HTTP CONNECT has long been part of the core HTTP specification. However, this proxying functionality has several important deficiencies in modern HTTP environments. This specification defines an alternative HTTP proxy service configuration for TCP connections. This configuration is described by a URI Template, similar to the CONNECT-UDP and CONNECT-IP protocols.

+
+
+
+

+Status of This Memo +

+

+ This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79.

+

+ Internet-Drafts are working documents of the Internet Engineering Task + Force (IETF). Note that other groups may also distribute working + documents as Internet-Drafts. The list of current Internet-Drafts is + at https://datatracker.ietf.org/drafts/current/.

+

+ Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress."

+

+ This Internet-Draft will expire on 11 July 2025.

+
+
+ + +
+
+

+1. Introduction +

+
+
+

+1.1. History +

+

HTTP has used the CONNECT method for proxying TCP connections since HTTP/1.1. When using CONNECT, the request target specifies a host and port number, and the proxy forwards TCP payloads between the client and this destination ([RFC9110], Section 9.3.6). To date, this is the only mechanism defined for proxying TCP over HTTP. In this specification, this is referred to as a "classic HTTP CONNECT proxy".

+

HTTP/3 uses a UDP transport, so it cannot be forwarded using the pre-existing CONNECT mechanism. To enable forward proxying of HTTP/3, the MASQUE effort has defined proxy mechanisms that are capable of proxying UDP datagrams [CONNECT-UDP], and more generally IP datagrams [CONNECT-IP]. The destination host and port number (if applicable) are encoded into the HTTP resource path, and end-to-end datagrams are wrapped into HTTP Datagrams [RFC9297] on the client-proxy path.

+
+
+
+
+

+1.2. Problems +

+

HTTP clients can be configured to use proxies by selecting a proxy hostname, a port, and whether to use a security protocol. However, Classic HTTP CONNECT requests using the proxy do not carry this configuration information. Instead, they only indicate the hostname and port of the target. This prevents any HTTP server from hosting multiple distinct proxy services, as the server cannot distinguish them by path (as with distinct resources) or by origin (as in "virtual hosting").

+

The absence of an explicit origin for the proxy also rules out the usual defenses against server port misdirection attacks (see Section 7.4 of [RFC9110]) and creates ambiguity about the use of origin-scoped response header fields (e.g., "Alt-Svc" [RFC7838], "Strict-Transport-Security" [RFC6797]).

+

Classic HTTP CONNECT requests cannot carry in-stream metadata. For example, the WRAP_UP capsule [I-D.schinazi-httpbis-wrap-up] cannot be used with Classic HTTP CONNECT.

+
+
+
+
+

+1.3. Overview +

+

This specification describes an alternative mechanism for proxying TCP in HTTP. Like [CONNECT-UDP] and [CONNECT-IP], the proxy service is identified by a URI Template. Proxy interactions reuse standard HTTP components and semantics, avoiding changes to the core HTTP protocol.

+
+
+
+
+
+
+

+2. Conventions and Definitions +

+

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", +"MAY", and "OPTIONAL" in this document are to be interpreted as +described in BCP 14 [RFC2119] [RFC8174] when, and only when, they +appear in all capitals, as shown here.

+
+
+
+
+

+3. Specification +

+

A template-driven TCP transport proxy for HTTP is identified by a URI Template [RFC6570] containing variables named "target_host" and "target_port". This URI Template and its variable values MUST meet all the same requirements as for UDP proxying ([RFC9298], Section 2), and are subject to the same validation rules. The client MUST substitute the destination host and port number into this template to produce the request URI. The derived URI serves as the destination of a Capsule Protocol connection using the Upgrade Token "connect-tcp" (see registration in Section 8.1).

+

When using "connect-tcp", TCP payload data is sent in the payload of a new Capsule Type named DATA (see registration in Section 8.3). The ordered concatenation of DATA capsule payloads represents the TCP payload data.

+

An intermediary MAY merge and split successive DATA capsules, subject to the following requirements:

+
    +
  • +

    There are no intervening capsules of other types.

    +
  • +
  • +

    The order of payload content is preserved.

    +
  • +
+
+
+

+3.1. In HTTP/1.1 +

+

In HTTP/1.1, the client uses the proxy by issuing a request as follows:

+
    +
  • +

    The method SHALL be "GET".

    +
  • +
  • +

    The request's target SHALL correspond to the URI derived from expansion of the proxy's URI Template.

    +
  • +
  • +

    The request SHALL include a single "Host" header field containing the origin of the proxy.

    +
  • +
  • +

    The request SHALL include a "Connection" header field with the value "Upgrade". (Note that this requirement is case-insensitive as per Section 7.6.1 of [RFC9110].)

    +
  • +
  • +

    The request SHALL include an "Upgrade" header field with the value "connect-tcp".

    +
  • +
  • +

    The request SHOULD include a "Capsule-Protocol: ?1" header.

    +
  • +
+

If the request is well-formed and permissible, the proxy MUST attempt to establish the TCP connection before sending any response status code other than "100 (Continue)" (see Section 4.2). If the TCP connection is successful, the response SHALL be as follows:

+
    +
  • +

    The HTTP status code SHALL be "101 (Switching Protocols)".

    +
  • +
  • +

    The response SHALL include a "Connection" header field with the value "Upgrade".

    +
  • +
  • +

    The response SHALL include a single "Upgrade" header field with the value "connect-tcp".

    +
  • +
  • +

    The response SHOULD include a "Capsule-Protocol: ?1" header.

    +
  • +
+

If the request is malformed or impermissible, the proxy MUST return a 4XX error code. If a TCP connection was not established, the proxy MUST NOT switch protocols to "connect-tcp", and the client MAY reuse this connection for additional HTTP requests.

+
+
+
+Client                                                 Proxy
+
+GET /proxy?target_host=192.0.2.1&target_port=443 HTTP/1.1
+Host: example.com
+Connection: Upgrade
+Upgrade: connect-tcp
+Capsule-Protocol: ?1
+
+** Proxy establishes a TCP connection to 192.0.2.1:443 **
+
+                            HTTP/1.1 101 Switching Protocols
+                            Connection: Upgrade
+                            Upgrade: connect-tcp
+                            Capsule-Protocol: ?1
+
+
+
Figure 1: +Templated TCP proxy example in HTTP/1.1 +
+
+
+
+
+

+3.2. In HTTP/2 and HTTP/3 +

+

In HTTP/2 and HTTP/3, the proxy MUST include SETTINGS_ENABLE_CONNECT_PROTOCOL in its SETTINGS frame [RFC8441][RFC9220]. The client uses the proxy by issuing an "extended CONNECT" request as follows:

+
    +
  • +

    The :method pseudo-header field SHALL be "CONNECT".

    +
  • +
  • +

    The :protocol pseudo-header field SHALL be "connect-tcp".

    +
  • +
  • +

    The :authority pseudo-header field SHALL contain the authority of the proxy.

    +
  • +
  • +

    The :path and :scheme pseudo-header fields SHALL contain the path and scheme of the request URI derived from the proxy's URI Template.

    +
  • +
+

A templated TCP proxying request that does not conform to all of these requirements represents a client error (see [RFC9110], Section 15.5) and may be malformed (see Section 8.1.1 of [RFC9113] and Section 4.1.2 of [RFC9114]).

+
+
+
+HEADERS
+:method = CONNECT
+:scheme = https
+:authority = request-proxy.example
+:path = /proxy?target_host=2001%3Adb8%3A%3A1&target_port=443
+:protocol = connect-tcp
+capsule-protocol = ?1
+...
+
+
+
Figure 2: +Templated TCP proxy example in HTTP/2 +
+
+
+
+
+

+3.3. Use of Other Relevant Headers +

+
+
+

+3.3.1. Origin-scoped Headers +

+

Ordinary HTTP headers apply only to the single resource identified in the request or response. An origin-scoped HTTP header is a special response header that is intended to change the client's behavior for subsequent requests to any resource on this origin.

+

Unlike classic HTTP CONNECT proxies, a templated TCP proxy has an unambiguous origin of its own. Origin-scoped headers apply to this origin when they are associated with a templated TCP proxy response. Here are some origin-scoped headers that could potentially be sent by a templated TCP proxy:

+ +
+
+
+
+

+3.3.2. Authentication Headers +

+

Authentication to a templated TCP proxy normally uses ordinary HTTP authentication via the "401 (Unauthorized)" response code, the "WWW-Authenticate" response header field, and the "Authorization" request header field ([RFC9110], Section 11.6). A templated TCP proxy does not use the "407 (Proxy Authentication Required)" response code and related header fields ([RFC9110], Section 11.7) because they do not traverse HTTP gateways (see Section 7).

+

Clients SHOULD assume that all proxy resources generated by a single template share a protection space (i.e., a realm) ([RFC9110], Section 11.5). For many authentication schemes, this will allow the client to avoid waiting for a "401 (Unauthorized)" response before each new connection through the proxy.

+
+
+
+
+
+
+

+3.4. Closing Connections +

+

In each HTTP version, any requirements related to closing connections in Classic HTTP CONNECT also apply to "connect-tcp", with the following modifications:

+
    +
  • +

    In HTTP/1.1, endpoints SHOULD close the connection in an error state to indicate receipt of a TCP connection error (e.g., a TCP RST or timeout). Acceptable error states include sending an incomplete DATA capsule (as defined in Section 3.3 of [RFC9297]), a TLS Error Alert ([RFC8446], Section 6.2), or a TCP RST (if TLS is not in use). When a connection is terminated in an error state, the receiving endpoint SHOULD send a TCP RST if the underlying TCP implementation permits it.

    +
  • +
  • +

    In HTTP/2 and HTTP/3, senders MAY use an incomplete DATA capsule to indicate a TCP connection error, instead of (or in addition to) the signals defined for TCP connection errors in Classic HTTP CONNECT. Recipients MUST recognize any incomplete capsule as a TCP connection error.

    +
  • +
  • +

    Intermediaries MUST propagate connection shutdown errors, including when translating between different HTTP versions.

    +
  • +
+
+
+
+
+
+
+

+4. Additional Connection Setup Behaviors +

+

This section discusses some behaviors that are permitted or recommended in order to enhance the performance or functionality of connection setup.

+
+
+

+4.1. Latency optimizations +

+

When using this specification in HTTP/2 or HTTP/3, clients MAY start sending TCP stream content optimistically, subject to flow control limits (Section 5.2 of [RFC9113] or Section 4.1 of [RFC9000]). Proxies MUST buffer this "optimistic" content until the TCP stream becomes writable, and discard it if the TCP connection fails. (Clients MUST NOT use "optimistic" behavior in HTTP/1.1, as this would interfere with reuse of the connection after an error response such as "401 (Unauthorized)".)

+

Servers that host a proxy under this specification MAY offer support for TLS early data in accordance with [RFC8470]. Clients MAY send "connect-tcp" requests in early data, and MAY include "optimistic" TCP content in early data (in HTTP/2 and HTTP/3). At the TLS layer, proxies MAY ignore, reject, or accept the early_data extension ([RFC8446], Section 4.2.10). At the HTTP layer, proxies MAY process the request immediately, return a "425 (Too Early)" response ([RFC8470], Section 5.2), or delay some or all processing of the request until the handshake completes. For example, a proxy with limited anti-replay defenses might choose to perform DNS resolution of the target_host when a request arrives in early data, but delay the TCP connection until the TLS handshake completes.

+
+
+
+
+

+4.2. Conveying metadata +

+

This specification supports the "Expect: 100-continue" request header ([RFC9110], Section 10.1.1) in any HTTP version. The "100 (Continue)" status code confirms receipt of a request at the proxy without waiting for the proxy-destination TCP handshake to succeed or fail. This might be particularly helpful when the destination host is not responding, as TCP handshakes can hang for several minutes before failing. Clients MAY send "Expect: 100-continue", and proxies MUST respect it by returning "100 (Continue)" if the request is not immediately rejected.

+

Proxies implementing this specification SHOULD include a "Proxy-Status" response header [RFC9209] in any success or failure response (i.e., status codes 101, 2XX, 4XX, or 5XX) to support advanced client behaviors and diagnostics. In HTTP/2 or HTTP/3, proxies MAY additionally send a "Proxy-Status" trailer in the event of an unclean shutdown.

+
+
+
+
+
+
+

+5. Applicability +

+
+
+

+5.1. Servers +

+

For server operators, template-driven TCP proxies are particularly valuable in situations where virtual-hosting is needed, or where multiple proxies must share an origin. For example, the proxy might benefit from sharing an HTTP gateway that provides DDoS defense, performs request sanitization, or enforces user authorization.

+

The URI template can also be structured to generate high-entropy Capability URLs [CAPABILITY], so that only authorized users can discover the proxy service.

+
+
+
+
+

+5.2. Clients +

+

Clients that support both classic HTTP CONNECT proxies and template-driven TCP proxies MAY accept both types via a single configuration string. If the configuration string can be parsed as a URI Template containing the required variables, it is a template-driven TCP proxy. Otherwise, it is presumed to represent a classic HTTP CONNECT proxy.
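This client-side dispatch can be sketched in Python as follows. The check below is a deliberate simplification: it looks for the literal "{target_host}" and "{target_port}" expressions rather than running a full RFC 6570 parser (which would also accept forms such as "{?target_host,target_port}"), and the function name is invented for illustration.

```python
REQUIRED_VARS = ("target_host", "target_port")

def classify_proxy_config(config: str) -> str:
    # Simplified stand-in for "can be parsed as a URI Template
    # containing the required variables".
    looks_like_uri = config.startswith(("https://", "http://"))
    has_vars = all("{%s}" % v in config for v in REQUIRED_VARS)
    return "template" if looks_like_uri and has_vars else "classic"

print(classify_proxy_config(
    "https://proxy.example/tcp?h={target_host}&p={target_port}"))
# → template
print(classify_proxy_config("proxy.example:3128"))
# → classic
```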

+

In some cases, it is valuable to allow "connect-tcp" clients to reach "connect-tcp"-only proxies when using a legacy configuration method that cannot convey a URI Template. To support this arrangement, clients SHOULD treat certain errors during classic HTTP CONNECT as indications that the proxy might only support "connect-tcp":

+
  • In HTTP/1.1: the response status code is "426 (Upgrade Required)", with an "Upgrade: connect-tcp" response header.

  • In any HTTP version: the response status code is "501 (Not Implemented)".

      - Requires SETTINGS_ENABLE_CONNECT_PROTOCOL to have been negotiated in HTTP/2 or HTTP/3.

If the client infers that classic HTTP CONNECT is not supported, it SHOULD retry the request using the registered default template for "connect-tcp":

https://$PROXY_HOST:$PROXY_PORT/.well-known/masque
                 /tcp/{target_host}/{target_port}/

Figure 3: Registered default template

If this request succeeds, the client SHOULD record a preference for "connect-tcp" to avoid further retry delays.
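Expansion of the registered default template can be sketched as follows (Python; the helper name is hypothetical). Note that simple-string expansion percent-encodes reserved characters in the substituted values, so the colons of an IPv6 target literal appear as "%3A" in the path.

```python
from urllib.parse import quote

def default_template_uri(proxy_host: str, proxy_port: int,
                         target_host: str, target_port: int) -> str:
    # Expands the registered default template:
    #   https://$PROXY_HOST:$PROXY_PORT/.well-known/masque
    #                    /tcp/{target_host}/{target_port}/
    # quote(..., safe="") percent-encodes reserved characters, so an
    # IPv6 literal like "2001:db8::1" becomes "2001%3Adb8%3A%3A1".
    host = quote(str(target_host), safe="")
    return (f"https://{proxy_host}:{proxy_port}"
            f"/.well-known/masque/tcp/{host}/{target_port}/")

print(default_template_uri("proxy.example", 443, "2001:db8::1", 443))
# → https://proxy.example:443/.well-known/masque/tcp/2001%3Adb8%3A%3A1/443/
```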

+
+
+
+
+
+
+

+6. Security Considerations +

+

Template-driven TCP proxying is largely subject to the same security risks as classic HTTP CONNECT. For example, any restrictions on authorized use of the proxy (see [RFC9110], Section 9.3.6) apply equally to both.

+

A small additional risk is posed by the use of a URI Template parser on the client side. The template input string could be crafted to exploit any vulnerabilities in the parser implementation. Client implementers should apply their usual precautions for code that processes untrusted inputs.

+
+
+
+
+

+7. Operational Considerations +

+

Templated TCP proxies can make use of standard HTTP gateways and path-routing to ease implementation and allow use of shared infrastructure. However, current gateways might need modifications to support TCP proxy services. To be compatible, a gateway must:

+
  • support Extended CONNECT (if acting as an HTTP/2 or HTTP/3 server).

  • support HTTP/1.1 Upgrade to "connect-tcp" (if acting as an HTTP/1.1 server)

      - only after forwarding the upgrade request to the origin and observing a success response.

  • forward the "connect-tcp" protocol to the origin.

  • convert "connect-tcp" requests between all supported HTTP server and client versions.

  • allow any "Proxy-Status" headers to traverse the gateway.
+
+
+
+

+8. IANA Considerations +

+
+
+

+8.1. New Upgrade Token +

+

IF APPROVED, IANA is requested to add the following entry to the HTTP Upgrade Token Registry:

+===============+==========================+=================+
| Value         | Description              | Reference       |
+===============+==========================+=================+
| "connect-tcp" | Proxying of TCP payloads | (This document) |
+---------------+--------------------------+-----------------+

                           Table 1

For interoperability testing of this draft version, implementations SHALL use the value "connect-tcp-07".

+
+
+
+
+

+8.2. New MASQUE Default Template +

+

IF APPROVED, IANA is requested to add the following entry to the "MASQUE URI Suffixes" registry:

+==============+==============+=================+
| Path Segment | Description  | Reference       |
+==============+==============+=================+
| tcp          | TCP Proxying | (This document) |
+--------------+--------------+-----------------+

                   Table 2
+
+
+
+

+8.3. New Capsule Type +

+

IF APPROVED, IANA is requested to add the following entry to the "HTTP Capsule Types" registry:

+=======+==============+===========+=============================+===================+=========+
| Value | Capsule Type | Status    | Reference                   | Change Controller | Contact |
+=======+==============+===========+=============================+===================+=========+
| (TBD) | DATA         | permanent | (This document), Section 3  | IETF              | HTTPBIS |
+-------+--------------+-----------+-----------------------------+-------------------+---------+

                                        Table 3

For this draft version of the protocol, the Capsule Type value 0x2028d7ee shall be used provisionally for testing, under the name "DATA-07".
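To make the wire format concrete, here is a non-normative Python sketch of encoding a DATA capsule with the provisional DATA-07 type. A capsule is a Type/Length/Value structure whose Type and Length are QUIC variable-length integers ([RFC9297], Section 3.2; [RFC9000], Section 16).

```python
def encode_varint(v: int) -> bytes:
    # QUIC variable-length integer ([RFC9000], Section 16): the two high
    # bits of the first byte select a 1-, 2-, 4-, or 8-byte encoding.
    if v < 1 << 6:
        return v.to_bytes(1, "big")
    if v < 1 << 14:
        return (v | 0x4000).to_bytes(2, "big")
    if v < 1 << 30:
        return (v | 0x80000000).to_bytes(4, "big")
    return (v | 0xC000000000000000).to_bytes(8, "big")

DATA_07 = 0x2028d7ee  # provisional Capsule Type for this draft version

def encode_data_capsule(payload: bytes) -> bytes:
    # Capsule = Type (varint), Length (varint), Value.
    return encode_varint(DATA_07) + encode_varint(len(payload)) + payload

print(encode_data_capsule(b"hi").hex())
# → a028d7ee026869
```

Since 0x2028d7ee fits in 30 bits, the Type is carried as the 4-byte varint a028d7ee on the wire.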

+
+
+
+
+
+
+

+9. References +

+
+
+

+9.1. Normative References +

+
+
[RFC2119]
+
+Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/rfc/rfc2119>.
+
+
[RFC6570]
+
+Gregorio, J., Fielding, R., Hadley, M., Nottingham, M., and D. Orchard, "URI Template", RFC 6570, DOI 10.17487/RFC6570, March 2012, <https://www.rfc-editor.org/rfc/rfc6570>.
+
+
[RFC8174]
+
+Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
+
+
[RFC8441]
+
+McManus, P., "Bootstrapping WebSockets with HTTP/2", RFC 8441, DOI 10.17487/RFC8441, September 2018, <https://www.rfc-editor.org/rfc/rfc8441>.
+
+
[RFC8446]
+
+Rescorla, E., "The Transport Layer Security (TLS) Protocol Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, <https://www.rfc-editor.org/rfc/rfc8446>.
+
+
[RFC8470]
+
+Thomson, M., Nottingham, M., and W. Tarreau, "Using Early Data in HTTP", RFC 8470, DOI 10.17487/RFC8470, September 2018, <https://www.rfc-editor.org/rfc/rfc8470>.
+
+
[RFC9000]
+
+Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based Multiplexed and Secure Transport", RFC 9000, DOI 10.17487/RFC9000, May 2021, <https://www.rfc-editor.org/rfc/rfc9000>.
+
+
[RFC9110]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, June 2022, <https://www.rfc-editor.org/rfc/rfc9110>.
+
+
[RFC9113]
+
+Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113, DOI 10.17487/RFC9113, June 2022, <https://www.rfc-editor.org/rfc/rfc9113>.
+
+
[RFC9114]
+
+Bishop, M., Ed., "HTTP/3", RFC 9114, DOI 10.17487/RFC9114, June 2022, <https://www.rfc-editor.org/rfc/rfc9114>.
+
+
[RFC9209]
+
+Nottingham, M. and P. Sikora, "The Proxy-Status HTTP Response Header Field", RFC 9209, DOI 10.17487/RFC9209, June 2022, <https://www.rfc-editor.org/rfc/rfc9209>.
+
+
[RFC9220]
+
+Hamilton, R., "Bootstrapping WebSockets with HTTP/3", RFC 9220, DOI 10.17487/RFC9220, June 2022, <https://www.rfc-editor.org/rfc/rfc9220>.
+
+
[RFC9297]
+
+Schinazi, D. and L. Pardue, "HTTP Datagrams and the Capsule Protocol", RFC 9297, DOI 10.17487/RFC9297, August 2022, <https://www.rfc-editor.org/rfc/rfc9297>.
+
+
[RFC9298]
+
+Schinazi, D., "Proxying UDP in HTTP", RFC 9298, DOI 10.17487/RFC9298, August 2022, <https://www.rfc-editor.org/rfc/rfc9298>.
+
+
+
+
+
+
+

+9.2. Informative References +

+
+
[CAPABILITY]
+
+"Good Practices for Capability URLs", February 2014, <https://www.w3.org/TR/capability-urls/>.
+
+
[CLEAR-SITE-DATA]
+
+"Clear Site Data", November 2017, <https://www.w3.org/TR/clear-site-data/>.
+
+
[CONNECT-IP]
+
+Pauly, T., Ed., Schinazi, D., Chernyakhovsky, A., Kühlewind, M., and M. Westerlund, "Proxying IP in HTTP", RFC 9484, DOI 10.17487/RFC9484, October 2023, <https://www.rfc-editor.org/rfc/rfc9484>.
+
+
[CONNECT-UDP]
+
+Schinazi, D., "Proxying UDP in HTTP", RFC 9298, DOI 10.17487/RFC9298, August 2022, <https://www.rfc-editor.org/rfc/rfc9298>.
+
+
[I-D.schinazi-httpbis-wrap-up]
+
+Schinazi, D. and L. Pardue, "The HTTP Wrap Up Capsule", Work in Progress, Internet-Draft, draft-schinazi-httpbis-wrap-up-01, 16 October 2024, <https://datatracker.ietf.org/doc/html/draft-schinazi-httpbis-wrap-up-01>.
+
+
[RFC6265]
+
+Barth, A., "HTTP State Management Mechanism", RFC 6265, DOI 10.17487/RFC6265, April 2011, <https://www.rfc-editor.org/rfc/rfc6265>.
+
+
[RFC6797]
+
+Hodges, J., Jackson, C., and A. Barth, "HTTP Strict Transport Security (HSTS)", RFC 6797, DOI 10.17487/RFC6797, November 2012, <https://www.rfc-editor.org/rfc/rfc6797>.
+
+
[RFC7469]
+
+Evans, C., Palmer, C., and R. Sleevi, "Public Key Pinning Extension for HTTP", RFC 7469, DOI 10.17487/RFC7469, April 2015, <https://www.rfc-editor.org/rfc/rfc7469>.
+
+
[RFC7838]
+
+Nottingham, M., McManus, P., and J. Reschke, "HTTP Alternative Services", RFC 7838, DOI 10.17487/RFC7838, April 2016, <https://www.rfc-editor.org/rfc/rfc7838>.
+
+
[RFC8942]
+
+Grigorik, I. and Y. Weiss, "HTTP Client Hints", RFC 8942, DOI 10.17487/RFC8942, February 2021, <https://www.rfc-editor.org/rfc/rfc8942>.
+
+
+
+
+
+
+
+
+

+Acknowledgments +

+

Thanks to Amos Jeffries, Tommy Pauly, Kyle Nekritz, David Schinazi, and Kazuho Oku for close review and suggested changes.

+
+
+
+
+

+Author's Address +

+
+
Benjamin M. Schwartz
+
Meta Platforms, Inc.
+Email: ietf@bemasc.net
+
+
+ + + diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-connect-tcp.txt b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-connect-tcp.txt new file mode 100644 index 000000000..e80e1e575 --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-connect-tcp.txt @@ -0,0 +1,633 @@ + + + + +httpbis B. M. Schwartz +Internet-Draft Meta Platforms, Inc. +Intended status: Standards Track 7 January 2025 +Expires: 11 July 2025 + + + Template-Driven HTTP CONNECT Proxying for TCP + draft-ietf-httpbis-connect-tcp-latest + +Abstract + + TCP proxying using HTTP CONNECT has long been part of the core HTTP + specification. However, this proxying functionality has several + important deficiencies in modern HTTP environments. This + specification defines an alternative HTTP proxy service configuration + for TCP connections. This configuration is described by a URI + Template, similar to the CONNECT-UDP and CONNECT-IP protocols. + +Status of This Memo + + This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF). Note that other groups may also distribute + working documents as Internet-Drafts. The list of current Internet- + Drafts is at https://datatracker.ietf.org/drafts/current/. + + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." + + This Internet-Draft will expire on 11 July 2025. + +Copyright Notice + + Copyright (c) 2025 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents (https://trustee.ietf.org/ + license-info) in effect on the date of publication of this document. 
+ Please review these documents carefully, as they describe your rights + and restrictions with respect to this document. Code Components + extracted from this document must include Revised BSD License text as + described in Section 4.e of the Trust Legal Provisions and are + provided without warranty as described in the Revised BSD License. + +Table of Contents + + 1. Introduction + 1.1. History + 1.2. Problems + 1.3. Overview + 2. Conventions and Definitions + 3. Specification + 3.1. In HTTP/1.1 + 3.2. In HTTP/2 and HTTP/3 + 3.3. Use of Other Relevant Headers + 3.3.1. Origin-scoped Headers + 3.3.2. Authentication Headers + 3.4. Closing Connections + 4. Additional Connection Setup Behaviors + 4.1. Latency optimizations + 4.2. Conveying metadata + 5. Applicability + 5.1. Servers + 5.2. Clients + 6. Security Considerations + 7. Operational Considerations + 8. IANA Considerations + 8.1. New Upgrade Token + 8.2. New MASQUE Default Template + 8.3. New Capsule Type + 9. References + 9.1. Normative References + 9.2. Informative References + Acknowledgments + Author's Address + +1. Introduction + +1.1. History + + HTTP has used the CONNECT method for proxying TCP connections since + HTTP/1.1. When using CONNECT, the request target specifies a host + and port number, and the proxy forwards TCP payloads between the + client and this destination ([RFC9110], Section 9.3.6). To date, + this is the only mechanism defined for proxying TCP over HTTP. In + this specification, this is referred to as a "classic HTTP CONNECT + proxy". + + HTTP/3 uses a UDP transport, so it cannot be forwarded using the pre- + existing CONNECT mechanism. To enable forward proxying of HTTP/3, + the MASQUE effort has defined proxy mechanisms that are capable of + proxying UDP datagrams [CONNECT-UDP], and more generally IP datagrams + [CONNECT-IP]. 
The destination host and port number (if applicable) + are encoded into the HTTP resource path, and end-to-end datagrams are + wrapped into HTTP Datagrams [RFC9297] on the client-proxy path. + +1.2. Problems + + HTTP clients can be configured to use proxies by selecting a proxy + hostname, a port, and whether to use a security protocol. However, + Classic HTTP CONNECT requests using the proxy do not carry this + configuration information. Instead, they only indicate the hostname + and port of the target. This prevents any HTTP server from hosting + multiple distinct proxy services, as the server cannot distinguish + them by path (as with distinct resources) or by origin (as in + "virtual hosting"). + + The absence of an explicit origin for the proxy also rules out the + usual defenses against server port misdirection attacks (see + Section 7.4 of [RFC9110]) and creates ambiguity about the use of + origin-scoped response header fields (e.g., "Alt-Svc" [RFC7838], + "Strict-Transport-Security" [RFC6797]). + + Classic HTTP CONNECT requests cannot carry in-stream metadata. For + example, the WRAP_UP capsule [I-D.schinazi-httpbis-wrap-up] cannot be + used with Classic HTTP CONNECT. + +1.3. Overview + + This specification describes an alternative mechanism for proxying + TCP in HTTP. Like [CONNECT-UDP] and [CONNECT-IP], the proxy service + is identified by a URI Template. Proxy interactions reuse standard + HTTP components and semantics, avoiding changes to the core HTTP + protocol. + +2. Conventions and Definitions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + +3. 
Specification + + A template-driven TCP transport proxy for HTTP is identified by a URI + Template [RFC6570] containing variables named "target_host" and + "target_port". This URI Template and its variable values MUST meet + all the same requirements as for UDP proxying ([RFC9298], Section 2), + and are subject to the same validation rules. The client MUST + substitute the destination host and port number into this template to + produce the request URI. The derived URI serves as the destination + of a Capsule Protocol connection using the Upgrade Token "connect- + tcp" (see registration in Section 8.1). + + When using "connect-tcp", TCP payload data is sent in the payload of + a new Capsule Type named DATA (see registration in Section 8.3). The + ordered concatenation of DATA capsule payloads represents the TCP + payload data. + + An intermediary MAY merge and split successive DATA capsules, subject + to the following requirements: + + * There are no intervening capsules of other types. + + * The order of payload content is preserved. + +3.1. In HTTP/1.1 + + In HTTP/1.1, the client uses the proxy by issuing a request as + follows: + + * The method SHALL be "GET". + + * The request's target SHALL correspond to the URI derived from + expansion of the proxy's URI Template. + + * The request SHALL include a single "Host" header field containing + the origin of the proxy. + + * The request SHALL include a "Connection" header field with the + value "Upgrade". (Note that this requirement is case-insensitive + as per Section 7.6.1 of [RFC9110].) + + * The request SHALL include an "Upgrade" header field with the value + "connect-tcp". + + * The request SHOULD include a "Capsule-Protocol: ?1" header. + + If the request is well-formed and permissible, the proxy MUST attempt + to establish the TCP connection before sending any response status + code other than "100 (Continue)" (see Section 4.2). 
If the TCP + connection is successful, the response SHALL be as follows: + + * The HTTP status code SHALL be "101 (Switching Protocols)". + + * The response SHALL include a "Connection" header field with the + value "Upgrade". + + * The response SHALL include a single "Upgrade" header field with + the value "connect-tcp". + + * The response SHOULD include a "Capsule-Protocol: ?1" header. + + If the request is malformed or impermissible, the proxy MUST return a + 4XX error code. If a TCP connection was not established, the proxy + MUST NOT switch protocols to "connect-tcp", and the client MAY reuse + this connection for additional HTTP requests. + + Client Proxy + + GET /proxy?target_host=192.0.2.1&target_port=443 HTTP/1.1 + Host: example.com + Connection: Upgrade + Upgrade: connect-tcp + Capsule-Protocol: ?1 + + ** Proxy establishes a TCP connection to 192.0.2.1:443 ** + + HTTP/1.1 101 Switching Protocols + Connection: Upgrade + Upgrade: connect-tcp + Capsule-Protocol: ?1 + + Figure 1: Templated TCP proxy example in HTTP/1.1 + +3.2. In HTTP/2 and HTTP/3 + + In HTTP/2 and HTTP/3, the proxy MUST include + SETTINGS_ENABLE_CONNECT_PROTOCOL in its SETTINGS frame + [RFC8441][RFC9220]. The client uses the proxy by issuing an + "extended CONNECT" request as follows: + + * The :method pseudo-header field SHALL be "CONNECT". + + * The :protocol pseudo-header field SHALL be "connect-tcp". + + * The :authority pseudo-header field SHALL contain the authority of + the proxy. + + * The :path and :scheme pseudo-header fields SHALL contain the path + and scheme of the request URI derived from the proxy's URI + Template. + + A templated TCP proxying request that does not conform to all of + these requirements represents a client error (see [RFC9110], + Section 15.5) and may be malformed (see Section 8.1.1 of [RFC9113] + and Section 4.1.2 of [RFC9114]). 
+ + HEADERS + :method = CONNECT + :scheme = https + :authority = request-proxy.example + :path = /proxy?target_host=2001%3Adb8%3A%3A1&target_port=443 + :protocol = connect-tcp + capsule-protocol = ?1 + ... + + Figure 2: Templated TCP proxy example in HTTP/2 + +3.3. Use of Other Relevant Headers + +3.3.1. Origin-scoped Headers + + Ordinary HTTP headers apply only to the single resource identified in + the request or response. An origin-scoped HTTP header is a special + response header that is intended to change the client's behavior for + subsequent requests to any resource on this origin. + + Unlike classic HTTP CONNECT proxies, a templated TCP proxy has an + unambiguous origin of its own. Origin-scoped headers apply to this + origin when they are associated with a templated TCP proxy response. + Here are some origin-scoped headers that could potentially be sent by + a templated TCP proxy: + + * "Alt-Svc" [RFC7838] + + * "Strict-Transport-Security" [RFC6797] + + * "Public-Key-Pins" [RFC7469] + + * "Accept-CH" [RFC8942] + + * "Set-Cookie" [RFC6265], which has configurable scope. + + * "Clear-Site-Data" [CLEAR-SITE-DATA] + +3.3.2. Authentication Headers + + Authentication to a templated TCP proxy normally uses ordinary HTTP + authentication via the "401 (Unauthorized)" response code, the "WWW- + Authenticate" response header field, and the "Authorization" request + header field ([RFC9110], Section 11.6). A templated TCP proxy does + not use the "407 (Proxy Authentication Required)" response code and + related header fields ([RFC9110], Section 11.7) because they do not + traverse HTTP gateways (see Section 7). + + Clients SHOULD assume that all proxy resources generated by a single + template share a protection space (i.e., a realm) ([RFC9110], + Section 11.5). For many authentication schemes, this will allow the + client to avoid waiting for a "401 (Unauthorized)" response before + each new connection through the proxy. + +3.4. 
Closing Connections + + In each HTTP version, any requirements related to closing connections + in Classic HTTP CONNECT also apply to "connect-tcp", with the + following modifications: + + * In HTTP/1.1, endpoints SHOULD close the connection in an error + state to indicate receipt of a TCP connection error (e.g., a TCP + RST or timeout). Acceptable error states include sending an + incomplete DATA capsule (as defined in Section 3.3 of [RFC9297]), + a TLS Error Alert ([RFC8446], Section 6.2), or a TCP RST (if TLS + is not in use). When a connection is terminated in an error + state, the receiving endpoint SHOULD send a TCP RST if the + underlying TCP implementation permits it. + + * In HTTP/2 and HTTP/3, senders MAY use an incomplete DATA capsule + to indicate a TCP connection error, instead of (or in addition to) + the signals defined for TCP connection errors in Classic HTTP + CONNECT. Recipients MUST recognize any incomplete capsule as a + TCP connection error. + + * Intermediaries MUST propagate connection shutdown errors, + including when translating between different HTTP versions. + +4. Additional Connection Setup Behaviors + + This section discusses some behaviors that are permitted or + recommended in order to enhance the performance or functionality of + connection setup. + +4.1. Latency optimizations + + When using this specification in HTTP/2 or HTTP/3, clients MAY start + sending TCP stream content optimistically, subject to flow control + limits (Section 5.2 of [RFC9113] or Section 4.1 of [RFC9000]). + Proxies MUST buffer this "optimistic" content until the TCP stream + becomes writable, and discard it if the TCP connection fails. + (Clients MUST NOT use "optimistic" behavior in HTTP/1.1, as this + would interfere with reuse of the connection after an error response + such as "401 (Unauthorized)".) + + Servers that host a proxy under this specification MAY offer support + for TLS early data in accordance with [RFC8470]. 
Clients MAY send + "connect-tcp" requests in early data, and MAY include "optimistic" + TCP content in early data (in HTTP/2 and HTTP/3). At the TLS layer, + proxies MAY ignore, reject, or accept the early_data extension + ([RFC8446], Section 4.2.10). At the HTTP layer, proxies MAY process + the request immediately, return a "425 (Too Early)" response + ([RFC8470], Section 5.2), or delay some or all processing of the + request until the handshake completes. For example, a proxy with + limited anti-replay defenses might choose to perform DNS resolution + of the target_host when a request arrives in early data, but delay + the TCP connection until the TLS handshake completes. + +4.2. Conveying metadata + + This specification supports the "Expect: 100-continue" request header + ([RFC9110], Section 10.1.1) in any HTTP version. The "100 + (Continue)" status code confirms receipt of a request at the proxy + without waiting for the proxy-destination TCP handshake to succeed or + fail. This might be particularly helpful when the destination host + is not responding, as TCP handshakes can hang for several minutes + before failing. Clients MAY send "Expect: 100-continue", and proxies + MUST respect it by returning "100 (Continue)" if the request is not + immediately rejected. + + Proxies implementing this specification SHOULD include a "Proxy- + Status" response header [RFC9209] in any success or failure response + (i.e., status codes 101, 2XX, 4XX, or 5XX) to support advanced client + behaviors and diagnostics. In HTTP/2 or HTTP/3, proxies MAY + additionally send a "Proxy-Status" trailer in the event of an unclean + shutdown. + +5. Applicability + +5.1. Servers + + For server operators, template-driven TCP proxies are particularly + valuable in situations where virtual-hosting is needed, or where + multiple proxies must share an origin. 
For example, the proxy might + benefit from sharing an HTTP gateway that provides DDoS defense, + performs request sanitization, or enforces user authorization. + + The URI template can also be structured to generate high-entropy + Capability URLs [CAPABILITY], so that only authorized users can + discover the proxy service. + +5.2. Clients + + Clients that support both classic HTTP CONNECT proxies and template- + driven TCP proxies MAY accept both types via a single configuration + string. If the configuration string can be parsed as a URI Template + containing the required variables, it is a template-driven TCP proxy. + Otherwise, it is presumed to represent a classic HTTP CONNECT proxy. + + In some cases, it is valuable to allow "connect-tcp" clients to reach + "connect-tcp"-only proxies when using a legacy configuration method + that cannot convey a URI Template. To support this arrangement, + clients SHOULD treat certain errors during classic HTTP CONNECT as + indications that the proxy might only support "connect-tcp": + + * In HTTP/1.1: the response status code is "426 (Upgrade Required)", + with an "Upgrade: connect-tcp" response header. + + * In any HTTP version: the response status code is "501 (Not + Implemented)". + + - Requires SETTINGS_ENABLE_CONNECT_PROTOCOL to have been + negotiated in HTTP/2 or HTTP/3. + + If the client infers that classic HTTP CONNECT is not supported, it + SHOULD retry the request using the registered default template for + "connect-tcp": + + https://$PROXY_HOST:$PROXY_PORT/.well-known/masque + /tcp/{target_host}/{target_port}/ + + Figure 3: Registered default template + + If this request succeeds, the client SHOULD record a preference for + "connect-tcp" to avoid further retry delays. + +6. Security Considerations + + Template-driven TCP proxying is largely subject to the same security + risks as classic HTTP CONNECT. 
For example, any restrictions on + authorized use of the proxy (see [RFC9110], Section 9.3.6) apply + equally to both. + + A small additional risk is posed by the use of a URI Template parser + on the client side. The template input string could be crafted to + exploit any vulnerabilities in the parser implementation. Client + implementers should apply their usual precautions for code that + processes untrusted inputs. + +7. Operational Considerations + + Templated TCP proxies can make use of standard HTTP gateways and + path-routing to ease implementation and allow use of shared + infrastructure. However, current gateways might need modifications + to support TCP proxy services. To be compatible, a gateway must: + + * support Extended CONNECT (if acting as an HTTP/2 or HTTP/3 + server). + + * support HTTP/1.1 Upgrade to "connect-tcp" (if acting as an + HTTP/1.1 server) + + - only after forwarding the upgrade request to the origin and + observing a success response. + + * forward the "connect-tcp" protocol to the origin. + + * convert "connect-tcp" requests between all supported HTTP server + and client versions. + + * allow any "Proxy-Status" headers to traverse the gateway. + +8. IANA Considerations + +8.1. New Upgrade Token + + IF APPROVED, IANA is requested to add the following entry to the HTTP + Upgrade Token Registry: + + +===============+==========================+=================+ + | Value | Description | Reference | + +===============+==========================+=================+ + | "connect-tcp" | Proxying of TCP payloads | (This document) | + +---------------+--------------------------+-----------------+ + + Table 1 + + For interoperability testing of this draft version, implementations + SHALL use the value "connect-tcp-07". + +8.2. 
New MASQUE Default Template + + IF APPROVED, IANA is requested to add the following entry to the + "MASQUE URI Suffixes" registry: + + +==============+==============+=================+ + | Path Segment | Description | Reference | + +==============+==============+=================+ + | tcp | TCP Proxying | (This document) | + +--------------+--------------+-----------------+ + + Table 2 + +8.3. New Capsule Type + + IF APPROVED, IANA is requested to add the following entry to the + "HTTP Capsule Types" registry: + + +=======+=========+===========+============+============+=========+ + | Value | Capsule | Status | Reference | Change | Contact | + | | Type | | | Controller | | + +=======+=========+===========+============+============+=========+ + | (TBD) | DATA | permanent | (This | IETF | HTTPBIS | + | | | | document), | | | + | | | | Section 3 | | | + +-------+---------+-----------+------------+------------+---------+ + + Table 3 + + For this draft version of the protocol, the Capsule Type value + 0x2028d7ee shall be used provisionally for testing, under the name + "DATA-07". + +9. References + +9.1. Normative References + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + . + + [RFC6570] Gregorio, J., Fielding, R., Hadley, M., Nottingham, M., + and D. Orchard, "URI Template", RFC 6570, + DOI 10.17487/RFC6570, March 2012, + . + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, . + + [RFC8441] McManus, P., "Bootstrapping WebSockets with HTTP/2", + RFC 8441, DOI 10.17487/RFC8441, September 2018, + . + + [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol + Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, + . + + [RFC8470] Thomson, M., Nottingham, M., and W. Tarreau, "Using Early + Data in HTTP", RFC 8470, DOI 10.17487/RFC8470, September + 2018, . 
+ + [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based + Multiplexed and Secure Transport", RFC 9000, + DOI 10.17487/RFC9000, May 2021, + . + + [RFC9110] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Semantics", STD 97, RFC 9110, + DOI 10.17487/RFC9110, June 2022, + . + + [RFC9113] Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113, + DOI 10.17487/RFC9113, June 2022, + . + + [RFC9114] Bishop, M., Ed., "HTTP/3", RFC 9114, DOI 10.17487/RFC9114, + June 2022, . + + [RFC9209] Nottingham, M. and P. Sikora, "The Proxy-Status HTTP + Response Header Field", RFC 9209, DOI 10.17487/RFC9209, + June 2022, . + + [RFC9220] Hamilton, R., "Bootstrapping WebSockets with HTTP/3", + RFC 9220, DOI 10.17487/RFC9220, June 2022, + . + + [RFC9297] Schinazi, D. and L. Pardue, "HTTP Datagrams and the + Capsule Protocol", RFC 9297, DOI 10.17487/RFC9297, August + 2022, . + + [RFC9298] Schinazi, D., "Proxying UDP in HTTP", RFC 9298, + DOI 10.17487/RFC9298, August 2022, + . + +9.2. Informative References + + [CAPABILITY] + "Good Practices for Capability URLs", February 2014, + . + + [CLEAR-SITE-DATA] + "Clear Site Data", November 2017, + . + + [CONNECT-IP] + Pauly, T., Ed., Schinazi, D., Chernyakhovsky, A., + Kühlewind, M., and M. Westerlund, "Proxying IP in HTTP", + RFC 9484, DOI 10.17487/RFC9484, October 2023, + . + + [CONNECT-UDP] + Schinazi, D., "Proxying UDP in HTTP", RFC 9298, + DOI 10.17487/RFC9298, August 2022, + . + + [I-D.schinazi-httpbis-wrap-up] + Schinazi, D. and L. Pardue, "The HTTP Wrap Up Capsule", + Work in Progress, Internet-Draft, draft-schinazi-httpbis- + wrap-up-01, 16 October 2024, + . + + [RFC6265] Barth, A., "HTTP State Management Mechanism", RFC 6265, + DOI 10.17487/RFC6265, April 2011, + . + + [RFC6797] Hodges, J., Jackson, C., and A. Barth, "HTTP Strict + Transport Security (HSTS)", RFC 6797, + DOI 10.17487/RFC6797, November 2012, + . + + [RFC7469] Evans, C., Palmer, C., and R. 
Sleevi, "Public Key Pinning + Extension for HTTP", RFC 7469, DOI 10.17487/RFC7469, April + 2015, . + + [RFC7838] Nottingham, M., McManus, P., and J. Reschke, "HTTP + Alternative Services", RFC 7838, DOI 10.17487/RFC7838, + April 2016, . + + [RFC8942] Grigorik, I. and Y. Weiss, "HTTP Client Hints", RFC 8942, + DOI 10.17487/RFC8942, February 2021, + . + +Acknowledgments + + Thanks to Amos Jeffries, Tommy Pauly, Kyle Nekritz, David Schinazi, + and Kazuho Oku for close review and suggested changes. + +Author's Address + + Benjamin M. Schwartz + Meta Platforms, Inc. + Email: ietf@bemasc.net diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-no-vary-search.html b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-no-vary-search.html new file mode 100644 index 000000000..9ccf7f186 --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-no-vary-search.html @@ -0,0 +1,2143 @@ + + + + + + +No-Vary-Search + + + + + + + + + + + + + + + + + + + + + + + + + +
Internet-DraftNo-Vary-SearchJanuary 2025
Denicola & RomanExpires 11 July 2025[Page]
+
+
+
+
Workgroup:
+
HyperText Transfer Protocol
+
Internet-Draft:
+
draft-ietf-httpbis-no-vary-search-latest
+
Published:
+
+ +
+
Intended Status:
+
Standards Track
+
Expires:
+
+
Authors:
+
+
+
D. Denicola
+
Google LLC
+
+
+
J. Roman
+
Google LLC
+
+
+
+
+

No-Vary-Search

+
+

Abstract

+

A proposed HTTP header field for changing how URL search parameters impact caching.

+
+
+

+About This Document +

+

This note is to be removed before publishing as an RFC.

+

+ The latest revision of this draft can be found at https://httpwg.org/http-extensions/draft-ietf-httpbis-no-vary-search.html. + Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ietf-httpbis-no-vary-search/.

+

+ Discussion of this document takes place on the + HTTP Working Group mailing list (mailto:ietf-http-wg@w3.org), + which is archived at https://lists.w3.org/Archives/Public/ietf-http-wg/. + Working Group information can be found at https://httpwg.org/.

+

Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/no-vary-search.

+
+
+
+

+Status of This Memo +

+

+ This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79.

+

+ Internet-Drafts are working documents of the Internet Engineering Task + Force (IETF). Note that other groups may also distribute working + documents as Internet-Drafts. The list of current Internet-Drafts is + at https://datatracker.ietf.org/drafts/current/.

+

+ Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress."

+

+ This Internet-Draft will expire on 11 July 2025.

+
+
+ +
+
+

+Table of Contents +

+ +
+
+
+
+

+1. Conventions and Definitions +

+

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", +"MAY", and "OPTIONAL" in this document are to be interpreted as +described in BCP 14 [RFC2119] [RFC8174] when, and only when, they +appear in all capitals, as shown here.

+

This document also adopts some conventions and notation typical in WHATWG and W3C usage, especially as it relates to algorithms. See [WHATWG-INFRA].

+
+
+
+
+

+2. HTTP header field definition +

+

The No-Vary-Search HTTP header field is a structured field [STRUCTURED-FIELDS] whose value must be a dictionary (Section 3.2 of [STRUCTURED-FIELDS]).

+

It has the following authoring conformance requirements:

+ +

The dictionary may contain entries whose keys are not one of key-order, params, and except, but their meaning is not defined by this specification. Implementations of this specification will ignore such entries (but future documents may assign meaning to such entries).

+ +
+
+
+
+

+3. Data model +

+

A URL search variance consists of the following:

+
+
no-vary params
+
+

either the special value wildcard or a list of strings

+
+
+
vary params
+
+

either the special value wildcard or a list of strings

+
+
+
vary on key order
+
+

a boolean

+
+
+
+

+The default URL search variance is a URL search variance whose no-vary params is an empty list, vary params is wildcard, and vary on key order is true.

+

The obtain a URL search variance algorithm (Section 4.2) ensures that all URL search variances obey the following constraints:

+
    +
  • +

    vary params is a list if and only if the no-vary params is wildcard; and

    +
  • +
  • +

    no-vary params is a list if and only if the vary params is wildcard.

    +
  • +
+
+
+
+
+
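For illustration only, the data model above can be sketched as a small record type. This is a non-normative sketch; the names URLSearchVariance and WILDCARD are assumptions of the example, not part of the specification:

```python
from dataclasses import dataclass, field
from typing import List, Union

# Sentinel standing in for the spec's special *wildcard* value (assumed name).
WILDCARD = "wildcard"

@dataclass
class URLSearchVariance:
    # Per the constraints guaranteed by the obtain algorithm (Section 4.2),
    # exactly one of no_vary_params / vary_params is WILDCARD at a time.
    no_vary_params: Union[str, List[str]] = field(default_factory=list)
    vary_params: Union[str, List[str]] = WILDCARD
    vary_on_key_order: bool = True

# The default URL search variance: no-vary params is an empty list,
# vary params is wildcard, and vary on key order is true.
DEFAULT_VARIANCE = URLSearchVariance()
```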

+4. Parsing +

+
+
+

+4.1. Parse a URL search variance +

+

+To parse a URL search variance given value:

+
    +
  1. +

    If value is null, then return the default URL search variance.

    +
  2. +
  3. +

    Let result be a new URL search variance.

    +
  4. +
  5. +

    Set result's vary on key order to true.

    +
  6. +
  7. +

    If value["key-order"] exists:

    +
      +
    1. +

      If value["key-order"] is not a boolean, then return the default URL search variance.

      +
    2. +
    3. +

      Set result's vary on key order to the boolean negation of value["key-order"].

      +
    4. +
    +
  8. +
  9. +

    If value["params"] exists:

    +
      +
    1. +

      If value["params"] is a boolean:

      +
        +
      1. +

        If value["params"] is true, then:

        +
          +
        1. +

          Set result's no-vary params to wildcard.

          +
        2. +
        3. +

          Set result's vary params to the empty list.

          +
        4. +
        +
      2. +
      3. +

        Otherwise:

        +
          +
        1. +

          Set result's no-vary params to the empty list.

          +
        2. +
        3. +

          Set result's vary params to wildcard.

          +
        4. +
        +
      4. +
      +
    2. +
    3. +

      Otherwise, if value["params"] is an array:

      +
        +
      1. +

        If any item in value["params"] is not a string, then return the default URL search variance.

        +
      2. +
      3. +

        Set result's no-vary params to the result of applying parse a key (Section 4.3) to each item in value["params"].

        +
      4. +
      5. +

        Set result's vary params to wildcard.

        +
      6. +
      +
    4. +
    5. +

      Otherwise, return the default URL search variance.

      +
    6. +
    +
  10. +
  11. +

    If value["except"] exists:

    +
      +
    1. +

      If value["params"] is not true, then return the default URL search variance.

      +
    2. +
    3. +

      If value["except"] is not an array, then return the default URL search variance.

      +
    4. +
    5. +

      If any item in value["except"] is not a string, then return the default URL search variance.

      +
    6. +
    7. +

      Set result's vary params to the result of applying parse a key (Section 4.3) to each item in value["except"].

      +
    8. +
    +
  12. +
  13. +

    Return result.

    +
  14. +
+ + +
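The algorithm above can be sketched non-normatively in Python, assuming the field value has already been parsed as a Structured Fields dictionary into plain Python values (booleans as bool, inner lists as lists of str). The helper names and the tuple representation of a URL search variance are assumptions of this sketch:

```python
from urllib.parse import unquote_plus

WILDCARD = "wildcard"  # stands in for the spec's special *wildcard* value
DEFAULT = ([], WILDCARD, True)  # (no-vary params, vary params, vary on key order)

def parse_a_key(key_string):
    # Section 4.3: replace "+" with SP, percent-decode, then UTF-8 decode;
    # unquote_plus performs these same steps (errors="replace" mirrors the
    # replacement behavior of UTF-8 decode without BOM).
    return unquote_plus(key_string, errors="replace")

def parse_url_search_variance(value):
    if value is None:
        return DEFAULT
    no_vary, vary, key_order = [], WILDCARD, True
    if "key-order" in value:
        if not isinstance(value["key-order"], bool):
            return DEFAULT
        key_order = not value["key-order"]
    if "params" in value:
        params = value["params"]
        if isinstance(params, bool):
            no_vary, vary = (WILDCARD, []) if params else ([], WILDCARD)
        elif isinstance(params, list):
            if not all(isinstance(item, str) for item in params):
                return DEFAULT
            no_vary, vary = [parse_a_key(item) for item in params], WILDCARD
        else:
            return DEFAULT
    if "except" in value:
        if value.get("params") is not True:
            return DEFAULT
        excepted = value["except"]
        if not (isinstance(excepted, list)
                and all(isinstance(item, str) for item in excepted)):
            return DEFAULT
        vary = [parse_a_key(item) for item in excepted]
    return (no_vary, vary, key_order)
```

Note how every unrecognized or ill-typed value falls back to the default URL search variance, matching the algorithm's strictness.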
+
+
+
+

+4.2. Obtain a URL search variance +

+

+To obtain a URL search variance given a response response:

+
    +
  1. +

    Let fieldValue be the result of getting a structured field value [FETCH] given `No-Vary-Search` and "dictionary" from response's header list.

    +
  2. +
  3. +

    Return the result of parsing a URL search variance (Section 4.1) given fieldValue.

    +
  4. +
+
+
+

+4.2.1. Examples +

+

The following illustrates how various inputs are parsed, in terms of their impact on the resulting no-vary params and vary params:

+ + + + + + + + + + + + + + + + + + + + + + +
Table 1
InputResult
+ No-Vary-Search: params +no-vary params: wildcard
vary params: (empty list)
+ No-Vary-Search: params=("a") +no-vary params: « "a" »
vary params: wildcard +
+ No-Vary-Search: params, except=("x") +no-vary params: wildcard
vary params: « "x" »
+

The following inputs are all invalid and will cause the default URL search variance to be returned:

+
    +
  • +

    No-Vary-Search: unknown-key

    +
  • +
  • +

    No-Vary-Search: key-order="not a boolean"

    +
  • +
  • +

    No-Vary-Search: params="not a boolean or inner list"

    +
  • +
  • +

    No-Vary-Search: params=(not-a-string)

    +
  • +
  • +

    No-Vary-Search: params=("a"), except=("x")

    +
  • +
  • +

    No-Vary-Search: params=(), except=()

    +
  • +
  • +

    No-Vary-Search: params=?0, except=("x")

    +
  • +
  • +

    No-Vary-Search: params, except=(not-a-string)

    +
  • +
  • +

    No-Vary-Search: params, except="not an inner list"

    +
  • +
  • +

    No-Vary-Search: params, except=?1

    +
  • +
  • +

    No-Vary-Search: except=("x")

    +
  • +
  • +

    No-Vary-Search: except=()

    +
  • +
+

The following inputs are valid, but somewhat unconventional. They are shown alongside their more conventional form.

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Table 2
InputConventional form
+ No-Vary-Search: params=?1 + + No-Vary-Search: params +
+ No-Vary-Search: key-order=?1 + + No-Vary-Search: key-order +
+ No-Vary-Search: params, key-order, except=("x") + + No-Vary-Search: key-order, params, except=("x") +
+ No-Vary-Search: params=?0 +(omit the header)
+ No-Vary-Search: params=() +(omit the header)
+ No-Vary-Search: key-order=?0 +(omit the header)
+
+
+
+
+
+
+

+4.3. Parse a key +

+

+To parse a key given an ASCII string keyString:

+
    +
  1. +

    Let keyBytes be the isomorphic encoding [WHATWG-INFRA] of keyString.

    +
  2. +
  3. +

    Replace any 0x2B (+) in keyBytes with 0x20 (SP).

    +
  4. +
  5. +

    Let keyBytesDecoded be the percent-decoding [WHATWG-URL] of keyBytes.

    +
  6. +
  7. +

    Let keyStringDecoded be the UTF-8 decoding without BOM [WHATWG-ENCODING] of keyBytesDecoded.

    +
  8. +
  9. +

    Return keyStringDecoded.

    +
  10. +
+
+
+

+4.3.1. Examples +

+

The parse a key algorithm allows encoding non-ASCII key strings in the ASCII structured header format, similar to how the application/x-www-form-urlencoded format [WHATWG-URL] allows encoding an entire entry list of keys and values in ASCII URL format. For example,

+
+
+No-Vary-Search: params=("%C3%A9+%E6%B0%97")
+
+
+

will result in a URL search variance whose no-vary params are « "é 気" ». As explained in a later example, the canonicalization process during equivalence testing means this will treat as equivalent URL strings such as:

+
    +
  • +

    https://example.com/?é 気=1

    +
  • +
  • +

    https://example.com/?é+気=2

    +
  • +
  • +

    https://example.com/?%C3%A9%20気=3

    +
  • +
  • +

    https://example.com/?%C3%A9+%E6%B0%97=4

    +
  • +
+

and so on, since they all are parsed [WHATWG-URL] to having the same key "é 気".
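This canonicalization can be checked with Python's urllib, whose unquote_plus happens to perform the same plus-to-space, percent-decode, and UTF-8-decode steps as parse a key, and whose parse_qsl approximates the application/x-www-form-urlencoded parser (a convenient approximation for illustration, not a normative mapping):

```python
from urllib.parse import parse_qsl, unquote_plus

# parse a key applied to the structured-field string above.
key = unquote_plus("%C3%A9+%E6%B0%97", errors="replace")
assert key == "é 気"

# Each of the query strings above canonicalizes to the same key.
queries = ["é 気=1", "é+気=2", "%C3%A9%20気=3", "%C3%A9+%E6%B0%97=4"]
assert all(parse_qsl(q, keep_blank_values=True)[0][0] == key for q in queries)
```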

+
+
+
+
+
+
+
+
+

+5. Comparing +

+

+Two URLs [WHATWG-URL] urlA and urlB are equivalent modulo search variance given a URL search variance searchVariance if the following algorithm returns true:

+
    +
  1. +

    If the scheme, username, password, host, port, or path of urlA and urlB differ, then return false.

    +
  2. +
  3. +

    If searchVariance is equivalent to the default URL search variance, then:

    +
      +
    1. +

      If urlA's query equals urlB's query, then return true.

      +
    2. +
    3. +

      Return false.

      +
    4. +
    +

    +In this case, even URL pairs that might appear the same after running the application/x-www-form-urlencoded parser [WHATWG-URL] on their queries, such as https://example.com/a and https://example.com/a?, or https://example.com/foo?a=b&&&c and https://example.com/foo?a=b&c=, will be treated as inequivalent.

    +
  4. +
  5. +

    Let searchParamsA and searchParamsB be empty lists.

    +
  6. +
  7. +

    If urlA's query is not null, then set searchParamsA to the result of running the application/x-www-form-urlencoded parser [WHATWG-URL] given the isomorphic encoding [WHATWG-INFRA] of urlA's query.

    +
  8. +
  9. +

    If urlB's query is not null, then set searchParamsB to the result of running the application/x-www-form-urlencoded parser [WHATWG-URL] given the isomorphic encoding [WHATWG-INFRA] of urlB's query.

    +
  10. +
  11. +

    If searchVariance's no-vary params is a list, then:

    +
      +
    1. +

      Set searchParamsA to a list containing those items pair in searchParamsA where searchVariance's no-vary params does not contain pair[0].

      +
    2. +
    3. +

      Set searchParamsB to a list containing those items pair in searchParamsB where searchVariance's no-vary params does not contain pair[0].

      +
    4. +
    +
  12. +
  13. +

    Otherwise, if searchVariance's vary params is a list, then:

    +
      +
    1. +

      Set searchParamsA to a list containing those items pair in searchParamsA where searchVariance's vary params contains pair[0].

      +
    2. +
    3. +

      Set searchParamsB to a list containing those items pair in searchParamsB where searchVariance's vary params contains pair[0].

      +
    4. +
    +
  14. +
  15. +

    If searchVariance's vary on key order is false, then:

    +
      +
    1. +

      Let keyLessThan be an algorithm taking as inputs two pairs (keyA, valueA) and (keyB, valueB), which returns whether keyA is code unit less than [WHATWG-INFRA] keyB.

      +
    2. +
    3. +

      Set searchParamsA to the result of sorting searchParamsA in ascending order with keyLessThan.

      +
    4. +
    5. +

      Set searchParamsB to the result of sorting searchParamsB in ascending order with keyLessThan.

      +
    6. +
    +
  16. +
  17. +

    If searchParamsA's size is not equal to searchParamsB's size, then return false.

    +
  18. +
  19. +

    Let i be 0.

    +
  20. +
  21. +

    While i < searchParamsA's size:

    +
      +
    1. +

      If searchParamsA[i][0] does not equal searchParamsB[i][0], then return false.

      +
    2. +
    3. +

      If searchParamsA[i][1] does not equal searchParamsB[i][1], then return false.

      +
    4. +
    5. +

      Set i to i + 1.

      +
    6. +
    +
  22. +
  23. +

    Return true.

    +
  24. +
+
+
+
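A non-normative Python sketch of this comparison follows. It uses urlsplit and parse_qsl as approximations of WHATWG URL parsing (note that Python cannot distinguish a null query from an empty one, and sorts by code point rather than code unit), and represents a URL search variance as a (no-vary params, vary params, vary-on-key-order) tuple:

```python
from urllib.parse import parse_qsl, urlsplit

WILDCARD = "wildcard"  # stands in for the spec's special *wildcard* value

def equivalent_modulo_search_variance(url_a, url_b, variance):
    no_vary, vary, vary_on_key_order = variance
    a, b = urlsplit(url_a), urlsplit(url_b)
    # Step 1: scheme, authority (username/password/host/port), and path.
    if (a.scheme, a.netloc, a.path) != (b.scheme, b.netloc, b.path):
        return False
    # Step 2: the default variance compares raw query strings directly.
    if variance == ([], WILDCARD, True):
        return a.query == b.query
    # parse_qsl approximates the application/x-www-form-urlencoded parser:
    # split on "&", drop empty segments, plus-to-space, percent-decode.
    pairs_a = parse_qsl(a.query, keep_blank_values=True)
    pairs_b = parse_qsl(b.query, keep_blank_values=True)
    if isinstance(no_vary, list):        # drop ignored params
        pairs_a = [p for p in pairs_a if p[0] not in no_vary]
        pairs_b = [p for p in pairs_b if p[0] not in no_vary]
    elif isinstance(vary, list):         # keep only significant params
        pairs_a = [p for p in pairs_a if p[0] in vary]
        pairs_b = [p for p in pairs_b if p[0] in vary]
    if not vary_on_key_order:
        # Stable sort by key (code points here; the spec sorts by code units).
        pairs_a.sort(key=lambda p: p[0])
        pairs_b.sort(key=lambda p: p[0])
    return pairs_a == pairs_b
```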

+5.1. Examples +

+

Due to how the application/x-www-form-urlencoded parser canonicalizes query strings, there are some cases where query strings that do not appear obviously equivalent will end up being treated as equivalent after parsing.

+

So, for example, given any non-default value for No-Vary-Search, such as No-Vary-Search: key-order, we will have the following equivalences:

+
+
+ https://example.com +
+ https://example.com/? +
+
A null query is parsed the same as an empty string +
+
+
+ https://example.com/?a=x +
+ https://example.com/?%61=%78 +
+
Parsing performs percent-decoding +
+
+
+ https://example.com/?a=é +
+ https://example.com/?a=%C3%A9 +
+
Parsing performs percent-decoding +
+
+
+ https://example.com/?a=%f6 +
+ https://example.com/?a=%ef%bf%bd +
+
Both values are parsed as U+FFFD (�) +
+
+
+ https://example.com/?a=x&&&& +
+ https://example.com/?a=x +
+
Parsing splits on & + and discards empty strings +
+
+
+ https://example.com/?a= +
+ https://example.com/?a +
+
Both parse as having an empty string value for a +
+
+
+ https://example.com/?a=%20 +
+ https://example.com/?a=+ +
+ https://example.com/?a= & +
+
+ + + and %20 + are both parsed as U+0020 SPACE +
+
+
+
+
+
+
+
+
+
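Several of the equivalences above can be observed directly with Python's parse_qsl, which approximates the application/x-www-form-urlencoded parser (an illustrative approximation, not the normative algorithm):

```python
from urllib.parse import parse_qsl

def canonical(query):
    # Approximates the [WHATWG-URL] application/x-www-form-urlencoded parser.
    return parse_qsl(query, keep_blank_values=True, errors="replace")

assert canonical("a=x") == canonical("%61=%78")        # percent-decoding
assert canonical("a=é") == canonical("a=%C3%A9")       # percent-decoding
assert canonical("a=%f6") == canonical("a=%ef%bf%bd")  # both decode to U+FFFD
assert canonical("a=x&&&&") == canonical("a=x")        # empty segments dropped
assert canonical("a=") == canonical("a")               # empty value either way
assert canonical("a=%20") == canonical("a=+")          # "+" and %20 are U+0020
```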

+6. Caching +

+

If a cache [HTTP-CACHING] implements this specification, the presented target URI requirement in Section 4 of [HTTP-CACHING] is replaced with:

+
    +
  • +

    one of the following:

    +
      +
    • +

      the presented target URI (Section 7.1 of [HTTP]) and that of the stored response match, or

      +
    • +
    • +

      the presented target URI and that of the stored response are equivalent modulo search variance (Section 5), given the variance obtained (Section 4.2) from the stored response.

      +
    • +
    +
  • +
+

Servers SHOULD send no more than one distinct non-empty value for the No-Vary-Search field in response to requests for a given pathname.

+

Cache implementations MAY fail to reuse a stored response whose target URI matches only modulo URL search variance, if the cache has more recently stored a response which:

+
    +
  • +

    has a target URI which is equal to the presented target URI, excluding the query, and

    +
  • +
  • +

    has a non-empty value for the No-Vary-Search field, and

    +
  • +
  • +

    has a No-Vary-Search field value different from the stored response being considered for reuse.

    +
  • +
+ +
+
+
+
+

+7. Security Considerations +

+

The main risk to be aware of is the impact of mismatched URLs. In particular, this could cause the user to see a response that was originally fetched from a URL different from the one displayed when they hovered a link, or the URL displayed in the URL bar.

+

However, since the impact is limited to query parameters, this does not cross the relevant security boundary, which is the origin [HTML]. (Or perhaps just the host, from the perspective of web browser security UI. [WHATWG-URL]) Indeed, we have already given origins complete control over how they present the (URL, response body) pair, including on the client side via technology such as history.replaceState() or service workers.

+
+
+
+
+

+8. Privacy Considerations +

+

This proposal is adjacent to the highly-privacy-relevant space of navigational tracking, which often uses query parameters to pass along user identifiers. However, we believe this proposal itself does not have privacy impacts. It does not interfere with existing navigational tracking mitigations, or any known future ones being contemplated. Indeed, if a page were to encode user identifiers in its URL, the only ability this proposal gives is to reduce such user tracking by preventing server processing of such user IDs (since the server is bypassed in favor of the cache). [NAV-TRACKING-MITIGATIONS]

+
+
+
+
+

+9. IANA Considerations +

+

IANA should do the following:

+
+
+

+9.1. HTTP Field Names +

+

Enter the following into the Hypertext Transfer Protocol (HTTP) Field Name Registry:

+
+
Field Name
+
+

No-Vary-Search

+
+
+
Status
+
+

permanent

+
+
+
Structured Type
+
+

Dictionary

+
+
+
Reference
+
+

this document

+
+
+
Comments
+
+

(none)

+
+
+
+
+
+
+
+
+
+

+10. References +

+
+
+

+10.1. Normative References +

+
+
[FETCH]
+
+van Kesteren, A., "Fetch Living Standard", n.d., <https://fetch.spec.whatwg.org/>. WHATWG +
+
+
[HTTP]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, , <https://www.rfc-editor.org/rfc/rfc9110>.
+
+
[HTTP-CACHING]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Caching", STD 98, RFC 9111, DOI 10.17487/RFC9111, , <https://www.rfc-editor.org/rfc/rfc9111>.
+
+
[RFC2119]
+
+Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/rfc/rfc2119>.
+
+
[RFC8174]
+
+Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/rfc/rfc8174>.
+
+
[STRUCTURED-FIELDS]
+
+Nottingham, M. and P. Kamp, "Structured Field Values for HTTP", RFC 8941, DOI 10.17487/RFC8941, , <https://www.rfc-editor.org/rfc/rfc8941>.
+
+
[WHATWG-ENCODING]
+
+van Kesteren, A., "Encoding Living Standard", n.d., <https://encoding.spec.whatwg.org/>. WHATWG +
+
+
[WHATWG-INFRA]
+
+van Kesteren, A. and D. Denicola, "Infra Living Standard", n.d., <https://infra.spec.whatwg.org/>. WHATWG +
+
+
[WHATWG-URL]
+
+van Kesteren, A., "URL Living Standard", n.d., <https://url.spec.whatwg.org/>. WHATWG +
+
+
+
+
+
+
+

+10.2. Informative References +

+
+
[HTML]
+
+van Kesteren, A., "HTML Living Standard", n.d., <https://html.spec.whatwg.org/>. WHATWG +
+
+ +
+Snyder, P. and J. Yasskin, "Navigational-Tracking Mitigations", n.d., <https://privacycg.github.io/nav-tracking-mitigations/>. W3C Privacy CG +
+
+
+
+
+
+
+
+
+

+Acknowledgments +

+

TODO acknowledge.

+
+
+
+

+Index +

+
+

+ D + E + O + P

+
+ +
+
+
+

+Authors' Addresses +

+
+
Domenic Denicola
+
Google LLC
+ +
+
+
Jeremy Roman
+
Google LLC
+ +
+
+
+ + + diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-no-vary-search.txt b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-no-vary-search.txt new file mode 100644 index 000000000..3c01ca753 --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-no-vary-search.txt @@ -0,0 +1,716 @@ + + + + +HyperText Transfer Protocol D. Denicola +Internet-Draft J. Roman +Intended status: Standards Track Google LLC +Expires: 11 July 2025 7 January 2025 + + + No-Vary-Search + draft-ietf-httpbis-no-vary-search-latest + +Abstract + + A proposed HTTP header field for changing how URL search parameters + impact caching. + +About This Document + + This note is to be removed before publishing as an RFC. + + The latest revision of this draft can be found at https://httpwg.org/ + http-extensions/draft-ietf-httpbis-no-vary-search.html. Status + information for this document may be found at + https://datatracker.ietf.org/doc/draft-ietf-httpbis-no-vary-search/. + + Discussion of this document takes place on the HTTP Working Group + mailing list (mailto:ietf-http-wg@w3.org), which is archived at + https://lists.w3.org/Archives/Public/ietf-http-wg/. Working Group + information can be found at https://httpwg.org/. + + Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/no-vary-search. + +Status of This Memo + + This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF). Note that other groups may also distribute + working documents as Internet-Drafts. The list of current Internet- + Drafts is at https://datatracker.ietf.org/drafts/current/. + + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. 
It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." + + This Internet-Draft will expire on 11 July 2025. + +Copyright Notice + + Copyright (c) 2025 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents (https://trustee.ietf.org/ + license-info) in effect on the date of publication of this document. + Please review these documents carefully, as they describe your rights + and restrictions with respect to this document. Code Components + extracted from this document must include Revised BSD License text as + described in Section 4.e of the Trust Legal Provisions and are + provided without warranty as described in the Revised BSD License. + +Table of Contents + + 1. Conventions and Definitions + 2. HTTP header field definition + 3. Data model + 4. Parsing + 4.1. Parse a URL search variance + 4.2. Obtain a URL search variance + 4.2.1. Examples + 4.3. Parse a key + 4.3.1. Examples + 5. Comparing + 5.1. Examples + 6. Caching + 7. Security Considerations + 8. Privacy Considerations + 9. IANA Considerations + 9.1. HTTP Field Names + 10. References + 10.1. Normative References + 10.2. Informative References + Acknowledgments + Index + Authors' Addresses + +1. Conventions and Definitions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + + This document also adopts some conventions and notation typical in + WHATWG and W3C usage, especially as it relates to algorithms. See + [WHATWG-INFRA]. + +2. 
HTTP header field definition + + The No-Vary-Search HTTP header field is a structured field + [STRUCTURED-FIELDS] whose value must be a dictionary (Section 3.2 of + [STRUCTURED-FIELDS]). + + It has the following authoring conformance requirements: + + * If present, the key-order entry's value must be a boolean + (Section 3.3.6 of [STRUCTURED-FIELDS]). + + * If present, the params entry's value must be either a boolean + (Section 3.3.6 of [STRUCTURED-FIELDS]) or an inner list + (Section 3.1.1 of [STRUCTURED-FIELDS]). + + * If present, the except entry's value must be an inner list + (Section 3.1.1 of [STRUCTURED-FIELDS]). + + * The except entry must only be present if the params entry is also + present, and the params entry's value is the boolean value true. + + The dictionary may contain entries whose keys are not one of key- + order, params, and except, but their meaning is not defined by this + specification. Implementations of this specification will ignore + such entries (but future documents may assign meaning to such + entries). + + | As always, the authoring conformance requirements are not + | binding on implementations. Implementations instead need to + | implement the processing model given by the obtain a URL search + | variance algorithm (Section 4.2). + +3. Data model + + A _URL search variance_ consists of the following: + + no-vary params + either the special value *wildcard* or a list of strings + + vary params + either the special value *wildcard* or a list of strings + + vary on key order + a boolean + + The _default URL search variance_ is a URL search variance whose no- + vary params is an empty list, vary params is *wildcard*, and vary on + key order is true. 
+ + The obtain a URL search variance algorithm (Section 4.2) ensures that + all URL search variances obey the following constraints: + + * vary params is a list if and only if the no-vary params is + *wildcard*; and + + * no-vary params is a list if and only if the vary params is + *wildcard*. + +4. Parsing + +4.1. Parse a URL search variance + + To _parse a URL search variance_ given _value_: + + 1. If _value_ is null, then return the default URL search variance. + + 2. Let _result_ be a new URL search variance. + + 3. Set _result_'s vary on key order to true. + + 4. If _value_["key-order"] exists: + + 1. If _value_["key-order"] is not a boolean, then return the + default URL search variance. + + 2. Set _result_'s vary on key order to the boolean negation of + _value_["key-order"]. + + 5. If _value_["params"] exists: + + 1. If _value_["params"] is a boolean: + + 1. If _value_["params"] is true, then: + + 1. Set _result_'s no-vary params to *wildcard*. + + 2. Set _result_'s vary params to the empty list. + + 2. Otherwise: + + 1. Set _result_'s no-vary params to the empty list. + + 2. Set _result_'s vary params to *wildcard*. + + 2. Otherwise, if _value_["params"] is an array: + + 1. If any item in _value_["params"] is not a string, then + return the default URL search variance. + + 2. Set _result_'s no-vary params to the result of applying + parse a key (Section 4.3) to each item in + _value_["params"]. + + 3. Set _result_'s vary params to *wildcard*. + + 3. Otherwise, return the default URL search variance. + + 6. If _value_["except"] exists: + + 1. If _value_["params"] is not true, then return the default URL + search variance. + + 2. If _value_["except"] is not an array, then return the default + URL search variance. + + 3. If any item in _value_["except"] is not a string, then return + the default URL search variance. + + 4. Set _result_'s vary params to the result of applying parse a + key (Section 4.3) to each item in _value_["except"]. + + 7. 
Return _result_. + + | In general, this algorithm is strict and tends to return the + | default URL search variance whenever it sees something it + | doesn't recognize. This is because the default URL search + | variance behavior will just cause fewer cache hits, which is an + | acceptable fallback behavior. + | + | However, unrecognized keys at the top level are ignored, to + | make it easier to extend this specification in the future. To + | avoid misbehavior with existing client software, such + | extensions will likely expand, rather than reduce, the set of + | requests that a cached response can match. + + | The input to this algorithm is generally obtained by parsing a + | structured field (Section 4.2 of [STRUCTURED-FIELDS]) using + | field_type "dictionary". + +4.2. Obtain a URL search variance + + To _obtain a URL search variance_ given a response + (https://fetch.spec.whatwg.org/#concept-response) _response_: + + 1. Let _fieldValue_ be the result of getting a structured field + value (https://fetch.spec.whatwg.org/#concept-header-list-get- + structured-header) [FETCH] given `No-Vary-Search` and + "dictionary" from _response_'s header list. + + 2. Return the result of parsing a URL search variance (Section 4.1) + given _fieldValue_. + +4.2.1. 
Examples + + The following illustrates how various inputs are parsed, in terms of + their impact on the resulting no-vary params and vary params: + + +========================+============================+ + | Input | Result | + +========================+============================+ + | No-Vary-Search: params | no-vary params: *wildcard* | + | | vary params: (empty list) | + +------------------------+----------------------------+ + | No-Vary-Search: | no-vary params: « "a" » | + | params=("a") | vary params: *wildcard* | + +------------------------+----------------------------+ + | No-Vary-Search: | no-vary params: *wildcard* | + | params, except=("x") | vary params: « "x" » | + +------------------------+----------------------------+ + + Table 1 + + The following inputs are all invalid and will cause the default URL + search variance to be returned: + + * No-Vary-Search: unknown-key + * No-Vary-Search: key-order="not a boolean" + * No-Vary-Search: params="not a boolean or inner list" + * No-Vary-Search: params=(not-a-string) + * No-Vary-Search: params=("a"), except=("x") + * No-Vary-Search: params=(), except=() + * No-Vary-Search: params=?0, except=("x") + * No-Vary-Search: params, except=(not-a-string) + * No-Vary-Search: params, except="not an inner list" + * No-Vary-Search: params, except=?1 + * No-Vary-Search: except=("x") + * No-Vary-Search: except=() + + The following inputs are valid, but somewhat unconventional. They + are shown alongside their more conventional form. 
+ + +==============================+============================+ + | Input | Conventional form | + +==============================+============================+ + | No-Vary-Search: params=?1 | No-Vary-Search: params | + +------------------------------+----------------------------+ + | No-Vary-Search: key-order=?1 | No-Vary-Search: key-order | + +------------------------------+----------------------------+ + | No-Vary-Search: params, key- | No-Vary-Search: key-order, | + | order, except=("x") | params, except=("x") | + +------------------------------+----------------------------+ + | No-Vary-Search: params=?0 | (omit the header) | + +------------------------------+----------------------------+ + | No-Vary-Search: params=() | (omit the header) | + +------------------------------+----------------------------+ + | No-Vary-Search: key-order=?0 | (omit the header) | + +------------------------------+----------------------------+ + + Table 2 + +4.3. Parse a key + + To _parse a key_ given an ASCII string _keyString_: + + 1. Let _keyBytes_ be the isomorphic encoding + (https://infra.spec.whatwg.org/#isomorphic-encode) [WHATWG-INFRA] + of _keyString_. + + 2. Replace any 0x2B (+) in _keyBytes_ with 0x20 (SP). + + 3. Let _keyBytesDecoded_ be the percent-decoding + (https://url.spec.whatwg.org/#percent-decode) [WHATWG-URL] of + _keyBytes_. + + 4. Let _keyStringDecoded_ be the UTF-8 decoding without BOM + (https://encoding.spec.whatwg.org/#utf-8-decode-without-bom) + [WHATWG-ENCODING] of _keyBytesDecoded_. + + 5. Return _keyStringDecoded_. + +4.3.1. Examples + + The parse a key algorithm allows encoding non-ASCII key strings in + the ASCII structured header format, similar to how the application/x- + www-form-urlencoded (https://url.spec.whatwg.org/#concept-urlencoded) + format [WHATWG-URL] allows encoding an entire entry list of keys and + values in ASCII URL format. 
For example,
+
+   No-Vary-Search: params=("%C3%A9+%E6%B0%97")
+
+   will result in a URL search variance whose vary params are « "é 気" ».
+   As explained in a later example, the canonicalization process during
+   equivalence testing means this will treat as equivalent URL strings
+   such as:
+
+   *  https://example.com/?é 気=1
+
+   *  https://example.com/?é+気=2
+
+   *  https://example.com/?%C3%A9%20気=3
+
+   *  https://example.com/?%C3%A9+%E6%B0%97=4
+
+   and so on, since they all are parsed
+   (https://url.spec.whatwg.org/#concept-urlencoded-parser) [WHATWG-URL]
+   as having the same key "é 気".
+
+5.  Comparing
+
+   Two URLs (https://url.spec.whatwg.org/#concept-url) [WHATWG-URL]
+   _urlA_ and _urlB_ are _equivalent modulo search variance_ given a URL
+   search variance _searchVariance_ if the following algorithm returns
+   true:
+
+   1.  If the scheme, username, password, host, port, or path of _urlA_
+       and _urlB_ differ, then return false.
+
+   2.  If _searchVariance_ is equivalent to the default URL search
+       variance, then:
+
+       1.  If _urlA_'s query equals _urlB_'s query, then return true.
+
+       2.  Return false.
+
+       In this case, even URL pairs that might appear the same after
+       running the application/x-www-form-urlencoded parser
+       (https://url.spec.whatwg.org/#concept-urlencoded-parser)
+       [WHATWG-URL] on their queries, such as https://example.com/a and
+       https://example.com/a?, or https://example.com/foo?a=b&&&c and
+       https://example.com/foo?a=b&c=, will be treated as inequivalent.
+
+   3.  Let _searchParamsA_ and _searchParamsB_ be empty lists.
+
+   4.  If _urlA_'s query is not null, then set _searchParamsA_ to the
+       result of running the application/x-www-form-urlencoded parser
+       (https://url.spec.whatwg.org/#concept-urlencoded-parser)
+       [WHATWG-URL] given the isomorphic encoding
+       (https://infra.spec.whatwg.org/#isomorphic-encode)
+       [WHATWG-INFRA] of _urlA_'s query.
+
+   5.  
If _urlB_'s query is not null, then set _searchParamsB_ to the
+       result of running the application/x-www-form-urlencoded parser
+       (https://url.spec.whatwg.org/#concept-urlencoded-parser)
+       [WHATWG-URL] given the isomorphic encoding
+       (https://infra.spec.whatwg.org/#isomorphic-encode)
+       [WHATWG-INFRA] of _urlB_'s query.
+
+   6.  If _searchVariance_'s no-vary params is a list, then:
+
+       1.  Set _searchParamsA_ to a list containing those items _pair_
+           in _searchParamsA_ where _searchVariance_'s no-vary params
+           does not contain _pair_[0].
+
+       2.  Set _searchParamsB_ to a list containing those items _pair_
+           in _searchParamsB_ where _searchVariance_'s no-vary params
+           does not contain _pair_[0].
+
+   7.  Otherwise, if _searchVariance_'s vary params is a list, then:
+
+       1.  Set _searchParamsA_ to a list containing those items _pair_
+           in _searchParamsA_ where _searchVariance_'s vary params
+           contains _pair_[0].
+
+       2.  Set _searchParamsB_ to a list containing those items _pair_
+           in _searchParamsB_ where _searchVariance_'s vary params
+           contains _pair_[0].
+
+   8.  If _searchVariance_'s vary on key order is false, then:
+
+       1.  Let _keyLessThan_ be an algorithm taking as inputs two pairs
+           (_keyA_, _valueA_) and (_keyB_, _valueB_), which returns
+           whether _keyA_ is code unit less than
+           (https://infra.spec.whatwg.org/#code-unit-less-than)
+           [WHATWG-INFRA] _keyB_.
+
+       2.  Set _searchParamsA_ to the result of sorting _searchParamsA_
+           in ascending order with _keyLessThan_.
+
+       3.  Set _searchParamsB_ to the result of sorting _searchParamsB_
+           in ascending order with _keyLessThan_.
+
+   9.  If _searchParamsA_'s size is not equal to _searchParamsB_'s
+       size, then return false.
+
+   10. Let _i_ be 0.
+
+   11. While _i_ < _searchParamsA_'s size:
+
+       1.  If _searchParamsA_[_i_][0] does not equal
+           _searchParamsB_[_i_][0], then return false.
+
+       2.  If _searchParamsA_[_i_][1] does not equal
+           _searchParamsB_[_i_][1], then return false.
+
+       3.  Set _i_ to _i_ + 1.
+
+   12. Return true.
+
+5.1.  
Examples
+
+   Due to how the application/x-www-form-urlencoded parser canonicalizes
+   query strings, there are some cases where query strings that do not
+   appear obviously equivalent will end up being treated as equivalent
+   after parsing.
+
+   So, for example, given any non-default value for No-Vary-Search, such
+   as No-Vary-Search: key-order, we will have the following
+   equivalences:
+
+   https://example.com
+   https://example.com/?
+      A null query is parsed the same as an empty string
+
+   https://example.com/?a=x
+   https://example.com/?%61=%78
+      Parsing performs percent-decoding
+
+   https://example.com/?a=é
+   https://example.com/?a=%C3%A9
+      Parsing performs percent-decoding
+
+   https://example.com/?a=%f6
+   https://example.com/?a=%ef%bf%bd
+      Both values are parsed as U+FFFD (�)
+
+   https://example.com/?a=x&&&&
+   https://example.com/?a=x
+      Parsing splits on & and discards empty strings
+
+   https://example.com/?a=
+   https://example.com/?a
+      Both parse as having an empty string value for a
+
+   https://example.com/?a=%20
+   https://example.com/?a=+
+   https://example.com/?a= &
+      + and %20 are both parsed as U+0020 SPACE
+
+6.  Caching
+
+   If a cache [HTTP-CACHING] implements this specification, the
+   presented target URI requirement in Section 4 of [HTTP-CACHING] is
+   replaced with:
+
+   *  one of the following:
+
+      -  the presented target URI (Section 7.1 of [HTTP]) and that of
+         the stored response match, or
+
+      -  the presented target URI and that of the stored response are
+         equivalent modulo search variance (Section 5), given the
+         variance obtained (Section 4.2) from the stored response.
+
+   Servers SHOULD send no more than one distinct non-empty value for the
+   No-Vary-Search field in response to requests for a given pathname.
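The comparison in Section 5 can be sketched in ordinary code. The following is a non-normative illustration in Python: it uses the standard library's urllib.parse.parse_qsl as a stand-in for the WHATWG application/x-www-form-urlencoded parser (their behavior differs in corner cases), folds username/password/host/port into the netloc check, and omits the default-variance raw-query comparison and isomorphic-encoding steps. The parameter names no_vary_params, vary_params, and key_order are assumptions standing in for the spec's URL search variance fields.

```python
# Illustrative sketch only; not a conforming implementation.
from urllib.parse import urlsplit, parse_qsl

def equivalent_modulo_search_variance(url_a, url_b,
                                      no_vary_params=None,
                                      vary_params=None,
                                      key_order=True):
    a, b = urlsplit(url_a), urlsplit(url_b)
    # Step 1: everything other than the query must match exactly.
    if (a.scheme, a.netloc, a.path) != (b.scheme, b.netloc, b.path):
        return False
    # Steps 3-5: parse each query with form-urlencoded semantics.
    pa = parse_qsl(a.query, keep_blank_values=True)
    pb = parse_qsl(b.query, keep_blank_values=True)
    if no_vary_params is not None:      # step 6: "params=(...)"
        pa = [p for p in pa if p[0] not in no_vary_params]
        pb = [p for p in pb if p[0] not in no_vary_params]
    elif vary_params is not None:       # step 7: "params, except=(...)"
        pa = [p for p in pa if p[0] in vary_params]
        pb = [p for p in pb if p[0] in vary_params]
    if not key_order:                   # step 8: "key-order"
        pa.sort(key=lambda p: p[0])
        pb.sort(key=lambda p: p[0])
    # Steps 9-12: the filtered (and possibly sorted) lists must match.
    return pa == pb
```

For instance, under a variance that ignores key order, the sketch treats ?a=1&b=2 and ?b=2&a=1 as equivalent, matching the examples above.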
+ + Cache implementations MAY fail to reuse a stored response whose + target URI matches _only_ modulo URL search variance, if the cache + has more recently stored a response which: + + * has a target URI which is equal to the presented target URI, + excluding the query, and + + * has a non-empty value for the No-Vary-Search field, and + + * has a No-Vary-Search field value different from the stored + response being considered for reuse. + + | Caches aren't required to reuse stored responses, generally. + | However, the above expressly empowers caches to, if it is + | advantageous for performance or other reasons, search a smaller + | number of stored responses. Such a cache might take steps like + | the following to identify a stored response (before checking + | the other conditions in Section 4 of [HTTP-CACHING]): + | + | 1. Let exactMatch be cache[presentedTargetURI]. If it is a + | stored response that can be reused, return it. + | + | 2. Let targetPath be presentedTargetURI, with query + | parameters removed. + | + | 3. Let lastNVS be mostRecentNVS[targetPath]. If it does + | not exist, return null. + | + | 4. Let simplifiedURL be the result of simplifying + | presentedTargetURI according to lastNVS (by removing + | query parameters which are not significant, and stable + | sorting parameters by key, if key order should be + | ignored). + | + | 5. Let nvsMatch be cache[simplifiedURL]. If it does not + | exist, return null. (It is assumed that this was + | written when storing in the cache, in addition to the + | exact URL.) + | + | 6. Let searchVariance be obtained (Section 4.2) from + | nvsMatch. + | + | 7. If nvsMatch's target URI and presentedTargetURI are not + | equivalent modulo search variance (Section 5) given + | searchVariance, then return null. + | + | 8. If nvsMatch is a stored response that can be reused, + | return it. Otherwise, return null. 
+
+   |
+   |  Such implementations might "miss" some stored responses that
+   |  could otherwise have been reused.  It is therefore useful for
+   |  servers to avoid sending different values for the No-Vary-
+   |  Search field when possible.
+
+7.  Security Considerations
+
+   The main risk to be aware of is the impact of mismatched URLs.  In
+   particular, this could cause the user to see a response that was
+   originally fetched from a URL different from the one displayed when
+   they hovered a link, or the URL displayed in the URL bar.
+
+   However, since the impact is limited to query parameters, this does
+   not cross the relevant security boundary, which is the origin
+   (https://html.spec.whatwg.org/multipage/browsers.html#concept-origin)
+   [HTML].  (Or perhaps just the host
+   (https://url.spec.whatwg.org/#concept-url-host), from the perspective
+   of web browser security UI (https://url.spec.whatwg.org/#url-
+   rendering-simplification).  [WHATWG-URL])  Indeed, we have already
+   given origins complete control over how they present the (URL,
+   response body) pair, including on the client side via technology such
+   as history.replaceState() (https://html.spec.whatwg.org/multipage/
+   nav-history-apis.html#dom-history-replacestate) or service workers.
+
+8.  Privacy Considerations
+
+   This proposal is adjacent to the highly-privacy-relevant space of
+   navigational tracking (https://privacycg.github.io/nav-tracking-
+   mitigations/#terminology), which often uses query parameters to pass
+   along user identifiers.  However, we believe this proposal itself
+   does not have privacy impacts.  It does not interfere with existing
+   navigational tracking mitigations (https://privacycg.github.io/nav-
+   tracking-mitigations/#deployed-mitigations), or any known future ones
+   being contemplated.  
Indeed, if a page were to encode user + identifiers in its URL, the only ability this proposal gives is to + _reduce_ such user tracking by preventing server processing of such + user IDs (since the server is bypassed in favor of the cache). + [NAV-TRACKING-MITIGATIONS] + +9. IANA Considerations + + IANA should do the following: + +9.1. HTTP Field Names + + Enter the following into the Hypertext Transfer Protocol (HTTP) Field + Name Registry: + + Field Name No-Vary-Search + + Status permanent + + Structured Type Dictionary + + Reference this document + + Comments (none) + +10. References + +10.1. Normative References + + [FETCH] van Kesteren, A., "Fetch Living Standard", n.d., + . WHATWG + + [HTTP] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Semantics", STD 97, RFC 9110, + DOI 10.17487/RFC9110, June 2022, + . + + [HTTP-CACHING] + Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Caching", STD 98, RFC 9111, + DOI 10.17487/RFC9111, June 2022, + . + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + . + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, . + + [STRUCTURED-FIELDS] + Nottingham, M. and P. Kamp, "Structured Field Values for + HTTP", RFC 8941, DOI 10.17487/RFC8941, February 2021, + . + + [WHATWG-ENCODING] + van Kesteren, A., "Encoding Living Standard", n.d., + . WHATWG + + [WHATWG-INFRA] + van Kesteren, A. and D. Denicola, "Infra Living Standard", + n.d., . WHATWG + + [WHATWG-URL] + van Kesteren, A., "URL Living Standard", n.d., + . WHATWG + +10.2. Informative References + + [HTML] van Kesteren, A., "HTML Living Standard", n.d., + . WHATWG + + [NAV-TRACKING-MITIGATIONS] + Snyder, P. and J. Yasskin, "Navigational-Tracking + Mitigations", n.d., + . + W3C Privacy CG + +Acknowledgments + + TODO acknowledge. 
+ +Index + + D E O P + + D + + default URL search variance *_Section 3, Paragraph 3_*; + Section 4.1, Paragraph 2.1.1; Section 4.1, Paragraph + 2.4.2.1.1; Section 4.1, Paragraph 2.5.2.2.2.1.1; + Section 4.1, Paragraph 2.5.2.3.1; Section 4.1, Paragraph + 2.6.2.1.1; Section 4.1, Paragraph 2.6.2.2.1; Section 4.1, + Paragraph 2.6.2.3.1; Section 4.1, Paragraph 3.1; + Section 4.2.1, Paragraph 3; Section 5, Paragraph 2.2.1 + + E + + equivalent modulo search variance *_Section 5, Paragraph 1_* + + O + + obtain a URL search variance Section 2, Paragraph 5.1; + Section 3, Paragraph 4; *_Section 4.2, Paragraph 1_* + + P + + parse a key Section 4.1, Paragraph 2.5.2.2.2.2.1; Section 4.1, + Paragraph 2.6.2.4.1; *_Section 4.3, Paragraph 1_*; + Section 4.3.1, Paragraph 1 + parse a URL search variance *_Section 4.1, Paragraph 1_*; + Section 4.2, Paragraph 2.2.1 + +Authors' Addresses + + Domenic Denicola + Google LLC + Email: d@domenic.me + + + Jeremy Roman + Google LLC + Email: jbroman@chromium.org diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-optimistic-upgrade.html b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-optimistic-upgrade.html new file mode 100644 index 000000000..2da3ea461 --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-optimistic-upgrade.html @@ -0,0 +1,1532 @@ + + + + + + +Security Considerations for Optimistic Protocol Transitions in HTTP/1.1 + + + + + + + + + + + + + + + + + + + + + + +
Internet-DraftOptimistic HTTP Upgrade SecurityJanuary 2025
SchwartzExpires 11 July 2025[Page]
+
+
+
+
Workgroup:
+
HTTPBIS
+
Internet-Draft:
+
draft-ietf-httpbis-optimistic-upgrade-latest
+
Updates:
+
+9298 (if approved)
+
Published:
+
+ +
+
Intended Status:
+
Standards Track
+
Expires:
+
+
Author:
+
+
+
B. M. Schwartz
+
Meta Platforms, Inc.
+
+
+
+
+

Security Considerations for Optimistic Protocol Transitions in HTTP/1.1

+
+

Abstract

+

In HTTP/1.1, the client can request a change to a new protocol on the existing connection. This document discusses the security considerations that apply to data sent by the client before this request is confirmed, and updates RFC 9298 to avoid related security issues.

+
+
+

+About This Document +

+

This note is to be removed before publishing as an RFC.

+

+ Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ietf-httpbis-optimistic-upgrade/.

+

Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions.

+
+
+
+

+Status of This Memo +

+

+ This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79.

+

+ Internet-Drafts are working documents of the Internet Engineering Task + Force (IETF). Note that other groups may also distribute working + documents as Internet-Drafts. The list of current Internet-Drafts is + at https://datatracker.ietf.org/drafts/current/.

+

+ Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress."

+

+ This Internet-Draft will expire on 11 July 2025.

+
+
+ + +
+
+

+1. Conventions and Definitions +

+

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", +"MAY", and "OPTIONAL" in this document are to be interpreted as +described in BCP 14 [RFC2119] [RFC8174] when, and only when, they +appear in all capitals, as shown here.

+
+
+
+
+

+2. Background +

+

In HTTP/1.1, a single connection is often used for many requests, one after another. After each request, the connection is returned to its initial state, ready to send more HTTP requests. However, HTTP/1.1 also contains two mechanisms that allow the client to change the protocol used for the remainder of the connection.

+

One such mechanism is the "Upgrade" request header field ([RFC9110], Section 7.8), which indicates that the client would like to use this connection for a protocol other than HTTP/1.1. The server replies with a "101 (Switching Protocols)" status code if it accepts the protocol change.

+

The other mechanism is the HTTP "CONNECT" method. This method indicates that the client wishes to establish a TCP connection to the specified host and port. The server replies with a 2xx (Successful) response to indicate that the request was accepted and a TCP connection was established. After this point, the TCP connection is acting as a TCP tunnel, not an HTTP/1.1 connection.

+

Both of these mechanisms also permit the server to reject the request. For example, [RFC9110] says:

+
    +
  • +

    A server MAY ignore a received Upgrade header field if it wishes to continue using the current protocol on that connection.

    +
  • +
+

and

+
    +
  • +

    A server MUST reject a CONNECT request that targets an empty or invalid port number, typically by responding with a 400 (Bad Request) status code.

    +
  • +
+

Rejections are common, and can happen for a variety of reasons. An "upgrade" request might be rejected if:

+
    +
  • +

    The server does not support any of the client's indicated Upgrade Tokens (i.e., the client's proposed new protocols), so it continues to use HTTP/1.1.

    +
  • +
  • +

    The server knows that an upgrade to the offered protocol will not provide any improvement over HTTP/1.1 for this request to this resource, so it chooses to respond in HTTP/1.1.

    +
  • +
  • +

    The server requires the client to authenticate before upgrading the protocol, so it replies with the status code "401 (Unauthorized)" and provides a challenge in a "WWW-Authenticate" response header ([RFC9110], Section 11.6.1).

    +
  • +
  • +

    The resource has moved, so the server replies with a 3xx (Redirection) status code ([RFC9110], Section 15.4).

    +
  • +
+

Similarly, a CONNECT request might be rejected if:

+
    +
  • +

    The server does not support HTTP CONNECT.

    +
  • +
  • +

    The specified destination is not allowed under server policy.

    +
  • +
  • +

    The destination cannot be resolved, is unreachable, or does not accept the connection.

    +
  • +
  • +

    The proxy requires the client to authenticate before proceeding.

    +
  • +
+

After rejecting a request, the server will continue to interpret subsequent bytes on that connection in accordance with HTTP/1.1.

+

[RFC9110] also states:

+
    +
  • +

    A client cannot begin using an upgraded protocol on the connection until it has completely sent the request message (i.e., the client can't change the protocol it is sending in the middle of a message).

    +
  • +
+

However, because of the possibility of rejection, the converse is not true: a client cannot necessarily begin using a new protocol merely because it has finished sending the corresponding request message.

+

In some cases, the client might expect that the protocol transition will succeed. If this expectation is correct, the client might be able to reduce delay by immediately sending the first bytes of the new protocol "optimistically", without waiting for the server's response. This document explores the security implications of this "optimistic" behavior.
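As a concrete (hypothetical) illustration of optimistic sending, a client might write the Upgrade request and the first bytes of the new protocol in a single flight, before learning whether the server answers 101 or rejects the upgrade. The upgrade token "example-proto" and the payload bytes below are invented for this sketch:

```python
# Hypothetical "optimistic" client behavior: the Upgrade request and
# the first bytes of the proposed new protocol are sent together,
# before the server's 101 (or rejection) has been received.
upgrade_request = (
    b"GET /resource HTTP/1.1\r\n"
    b"Host: server.example\r\n"        # invented host
    b"Connection: upgrade\r\n"
    b"Upgrade: example-proto\r\n"      # invented upgrade token
    b"\r\n"
)
optimistic_first_bytes = b"\x01hello-in-example-proto"  # invented payload
wire_data = upgrade_request + optimistic_first_bytes    # one write
```

If the server rejects the upgrade, it will attempt to parse optimistic_first_bytes as HTTP/1.1, which is the root of the issues discussed next.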

+
+
+
+
+

+3. Possible Security Issues +

+

When there are only two distinct parties involved in an HTTP/1.1 connection (i.e., the client and the server), protocol transitions introduce no new security issues: each party must already be prepared for the other to send arbitrary data on the connection at any time. However, HTTP connections often involve more than two parties, if the requests or responses include third-party data. For example, a browser (party 1) might send an HTTP request to an origin (party 2) with path, headers, or body controlled by a website from a different origin (party 3). Post-transition protocols such as WebSocket similarly are often used to convey data chosen by a third party.

+

If the third-party data source is untrusted, we call the data it provides "attacker-controlled". The combination of attacker-controlled data and optimistic protocol transitions results in two significant security issues.

+
+
+

+3.1. Request Smuggling +

+

In a Request Smuggling attack ([RFC9112], Section 11.2) the attacker-controlled data is chosen in such a way that it is interpreted by the server as an additional HTTP request. These attacks allow the attacker to speak on behalf of the client while bypassing the client's own rules about what requests it will issue. Request Smuggling can occur if the client and server have distinct interpretations of the data that flows between them.

+

If the server accepts a protocol transition request, it interprets the subsequent bytes in accordance with the new protocol. If it rejects the request, it interprets those bytes as HTTP/1.1. However, the client doesn't know which interpretation the server will take until it receives the server's response status code. If it uses the new protocol optimistically, this creates a risk that the server will interpret attacker-controlled data in the new protocol as an additional HTTP request issued by the client.

+

As a trivial example, consider an HTTP CONNECT client providing connectivity to an untrusted application. If the client is authenticated to the proxy server using a connection-level authentication method such as TLS Client Certificates, the attacker could send an HTTP/1.1 POST request for the proxy server at the beginning of its TCP connection. If the client delivers this data optimistically, and the CONNECT request fails, the server would misinterpret the application's data as a subsequent authenticated request issued by the client.
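To make that scenario concrete, here is an illustrative attacker-chosen TCP payload; the path and host are invented for the example. The bytes are deliberately a complete HTTP/1.1 request, so if the client forwards them before the CONNECT response arrives and the proxy rejects the tunnel while keeping the connection open, the proxy parses them as a second, authenticated request from the client:

```python
# Hypothetical attacker-controlled bytes handed to a CONNECT client as
# "TCP payload".  If delivered optimistically and the CONNECT fails,
# a proxy that keeps the connection open reads them as a new HTTP/1.1
# request on the still-authenticated connection.
smuggled_payload = (
    b"POST /admin/delete-account HTTP/1.1\r\n"  # invented path
    b"Host: proxy.example\r\n"                  # invented proxy host
    b"Content-Length: 0\r\n"
    b"\r\n"
)
```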

+
+
+
+
+

+3.2. Parser Exploits +

+

A related category of attacks use protocol disagreement to exploit vulnerabilities in the server's request parsing logic. These attacks apply when the HTTP client is trusted by the server, but the post-transition data source is not. If the server software was developed under the assumption that some or all of the HTTP request data is not attacker-controlled, optimistic transmission can cause this assumption to be violated, exposing vulnerabilities in the server's HTTP request parser.

+
+
+
+
+
+
+

+4. Operational Issues +

+

If the server rejects the transition request, the connection can continue to be used for HTTP/1.1. There is no requirement to close the connection in response to a rejected transition, and keeping the connection open has performance advantages if additional HTTP requests to this server are likely. Thus, it is normally inappropriate to close the connection in response to a rejected transition.

+
+
+
+
+

+5. Impact on HTTP Upgrade with Existing Upgrade Tokens +

+

This section describes the impact of this document's considerations on some registered Upgrade Tokens that are believed to be in use at the time of writing.

+
+
+

+5.1. "TLS" +

+

The "TLS" family of Upgrade Tokens was defined in [RFC2817], which correctly highlights the possibility of the server rejecting the upgrade. If a client ignores this possibility and sends TLS data optimistically, the result cannot be valid HTTP/1.1: the first octet of a TLS connection must be 22 (ContentType.handshake), but this is not an allowed character in an HTTP/1.1 method. A compliant HTTP/1.1 server will treat this as a parsing error and close the connection without processing further requests.

+
+
+
+
+

+5.2. "WebSocket"/"websocket" +

+

Section 4.1 of [RFC6455] says:

+
    +
  • +

    Once the client's opening handshake has been sent, the client MUST wait for a response from the server before sending any further data.

    +
  • +
+

Thus, optimistic use of HTTP Upgrade is already forbidden in the WebSocket protocol. Additionally, the WebSocket protocol requires high-entropy masking of client-to-server frames (Section 5.1 of [RFC6455]).

+
+
+
+
+

+5.3. "connect-udp" +

+

Section 5 of [RFC9298] says:

+
    +
  • +

    A client MAY optimistically start sending UDP packets in HTTP Datagrams before receiving the response to its UDP proxying request.

    +
  • +
+

However, in HTTP/1.1, this "proxying request" is an HTTP Upgrade request. This upgrade is likely to be rejected in certain circumstances, such as when the UDP destination address (which is attacker-controlled) is invalid. Additionally, the contents of the "connect-udp" protocol stream can include untrusted material (i.e., the UDP packets, which might come from other applications on the client device). This creates the possibility of Request Smuggling attacks. To avoid these concerns, this text is updated as follows:

+
    +
  • +

    When using HTTP/2 or later, a client MAY optimistically ...

    +
  • +
+

Section 3.3 of [RFC9298] describes the requirement for a successful proxy setup response, including upgrading to the "connect-udp" protocol, and says:

+
    +
  • +

    If any of these requirements are not met, the client MUST treat this proxying attempt as failed and abort the connection.

    +
  • +
+

However, this could be interpreted as an instruction to abort the underlying TLS and TCP connections in the event of an unsuccessful response such as "407 (Proxy Authentication Required)". To avoid an unnecessary delay in this case, this text is hereby updated as follows:

+
    +
  • +

    If any of these requirements are not met, the client MUST treat this proxying attempt as failed. If the "Upgrade" response header field is absent, the client MAY reuse the connection for further HTTP/1.1 requests; otherwise it MUST abort the underlying connection.

    +
  • +
+
+
+
+
+

+5.4. "connect-ip" +

+

The "connect-ip" Upgrade Token is defined in [RFC9484]. Section 11 of [RFC9484] forbids clients from using optimistic upgrade, avoiding this issue.

+
+
+
+
+
+
+

+6. Guidance for Future Upgrade Tokens +

+

There are now several good examples of designs that reduce or eliminate the security concerns discussed in this document and may be applicable in future specifications:

+ +

Future specifications for Upgrade Tokens should account for the security issues discussed here and provide clear guidance on how implementations can avoid them.

+
+
+

+6.1. Selection of Request Methods +

+

Some Upgrade Tokens, such as "TLS", are defined for use with any ordinary HTTP Method. The upgraded protocol continues to provide HTTP semantics, and will convey the response to this HTTP request.

+

The other Upgrade Tokens mentioned in Section 5 do not preserve HTTP semantics, so the method is not relevant. All of these Upgrade Tokens are specified only for use with the "GET" method.

+

Future specifications for Upgrade Tokens should restrict their use to "GET" requests if the HTTP method is otherwise irrelevant and a request body is not required. This improves consistency with other Upgrade Tokens and reduces the likelihood that a faulty server implementation might process the request body as the new protocol.

+
+
+
+
+
+
+

+7. Guidance for HTTP CONNECT +

+

In HTTP/1.1, clients that send CONNECT requests on behalf of untrusted TCP clients MUST wait for a 2xx (Successful) response before sending any TCP payload data.
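A minimal sketch of a conforming client follows. It assumes a blocking socket API and performs only rudimentary status-line parsing; a real client would also validate the response headers and handle any bytes following them. The function and host names are illustrative:

```python
# Sketch: send CONNECT, then wait for a 2xx response before relaying
# any untrusted TCP payload.  Not production-quality parsing.
import socket

def connect_through_proxy(proxy_host, proxy_port, dest_host, dest_port):
    sock = socket.create_connection((proxy_host, proxy_port))
    req = (f"CONNECT {dest_host}:{dest_port} HTTP/1.1\r\n"
           f"Host: {dest_host}:{dest_port}\r\n\r\n")
    sock.sendall(req.encode("ascii"))
    # Read until the end of the response header section.
    buf = b""
    while b"\r\n\r\n" not in buf:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("proxy closed before responding")
        buf += chunk
    status = buf.split(b" ", 2)[1]
    if not status.startswith(b"2"):
        sock.close()
        raise ConnectionError(f"CONNECT rejected: {status.decode()}")
    return sock  # only now may untrusted client bytes be relayed
```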

+

To mitigate vulnerabilities from any clients that do not conform to this requirement, proxy servers MAY close the underlying connection when rejecting an HTTP/1.1 CONNECT request, without processing any further data sent to the proxy server on that connection. Note that this behavior may impair performance, especially when returning a "407 (Proxy Authentication Required)" response.

+
+
+
+
+

+8. IANA Considerations +

+

This document has no IANA actions.

+
+
+
+
+

+9. References +

+
+
+

+9.1. Normative References +

+
+
[RFC2119]
+
+Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/rfc/rfc2119>.
+
+
[RFC8174]
+
+Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/rfc/rfc8174>.
+
+
[RFC9110]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, , <https://www.rfc-editor.org/rfc/rfc9110>.
+
+
[RFC9298]
+
+Schinazi, D., "Proxying UDP in HTTP", RFC 9298, DOI 10.17487/RFC9298, , <https://www.rfc-editor.org/rfc/rfc9298>.
+
+
[RFC9484]
+
+Pauly, T., Ed., Schinazi, D., Chernyakhovsky, A., Kühlewind, M., and M. Westerlund, "Proxying IP in HTTP", RFC 9484, DOI 10.17487/RFC9484, , <https://www.rfc-editor.org/rfc/rfc9484>.
+
+
+
+
+
+
+

+9.2. Informative References +

+
+
[RFC2817]
+
+Khare, R. and S. Lawrence, "Upgrading to TLS Within HTTP/1.1", RFC 2817, DOI 10.17487/RFC2817, , <https://www.rfc-editor.org/rfc/rfc2817>.
+
+
[RFC6455]
+
+Fette, I. and A. Melnikov, "The WebSocket Protocol", RFC 6455, DOI 10.17487/RFC6455, , <https://www.rfc-editor.org/rfc/rfc6455>.
+
+
[RFC9112]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP/1.1", STD 99, RFC 9112, DOI 10.17487/RFC9112, , <https://www.rfc-editor.org/rfc/rfc9112>.
+
+
[RFC9113]
+
+Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113, DOI 10.17487/RFC9113, , <https://www.rfc-editor.org/rfc/rfc9113>.
+
+
+
+
+
+
+
+
+

+Acknowledgments +

+

Thanks to Mark Nottingham and Lucas Pardue for early reviews of this document.

+
+
+
+
+

+Author's Address +

+
+
Benjamin M. Schwartz
+
Meta Platforms, Inc.
+ +
+
+
+ + + diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-optimistic-upgrade.txt b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-optimistic-upgrade.txt new file mode 100644 index 000000000..4774bf125 --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-optimistic-upgrade.txt @@ -0,0 +1,428 @@ + + + + +HTTPBIS B. M. Schwartz +Internet-Draft Meta Platforms, Inc. +Updates: 9298 (if approved) 7 January 2025 +Intended status: Standards Track +Expires: 11 July 2025 + + +Security Considerations for Optimistic Protocol Transitions in HTTP/1.1 + draft-ietf-httpbis-optimistic-upgrade-latest + +Abstract + + In HTTP/1.1, the client can request a change to a new protocol on the + existing connection. This document discusses the security + considerations that apply to data sent by the client before this + request is confirmed, and updates RFC 9298 to avoid related security + issues. + +About This Document + + This note is to be removed before publishing as an RFC. + + Status information for this document may be found at + https://datatracker.ietf.org/doc/draft-ietf-httpbis-optimistic- + upgrade/. + + Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions. + +Status of This Memo + + This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF). Note that other groups may also distribute + working documents as Internet-Drafts. The list of current Internet- + Drafts is at https://datatracker.ietf.org/drafts/current/. + + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." + + This Internet-Draft will expire on 11 July 2025. 
+ +Copyright Notice + + Copyright (c) 2025 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents (https://trustee.ietf.org/ + license-info) in effect on the date of publication of this document. + Please review these documents carefully, as they describe your rights + and restrictions with respect to this document. Code Components + extracted from this document must include Revised BSD License text as + described in Section 4.e of the Trust Legal Provisions and are + provided without warranty as described in the Revised BSD License. + +Table of Contents + + 1. Conventions and Definitions + 2. Background + 3. Possible Security Issues + 3.1. Request Smuggling + 3.2. Parser Exploits + 4. Operational Issues + 5. Impact on HTTP Upgrade with Existing Upgrade Tokens + 5.1. "TLS" + 5.2. "WebSocket"/"websocket" + 5.3. "connect-udp" + 5.4. "connect-ip" + 6. Guidance for Future Upgrade Tokens + 6.1. Selection of Request Methods + 7. Guidance for HTTP CONNECT + 8. IANA Considerations + 9. References + 9.1. Normative References + 9.2. Informative References + Acknowledgments + Author's Address + +1. Conventions and Definitions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + +2. Background + + In HTTP/1.1, a single connection is often used for many requests, one + after another. After each request, the connection is returned to its + initial state, ready to send more HTTP requests. However, HTTP/1.1 + also contains two mechanisms that allow the client to change the + protocol used for the remainder of the connection. 
+
+   One such mechanism is the "Upgrade" request header field ([RFC9110],
+   Section 7.8), which indicates that the client would like to use this
+   connection for a protocol other than HTTP/1.1.  The server replies
+   with a "101 (Switching Protocols)" status code if it accepts the
+   protocol change.
+
+   The other mechanism is the HTTP "CONNECT" method.  This method
+   indicates that the client wishes to establish a TCP connection to
+   the specified host and port.  The server replies with a 2xx
+   (Successful) response to indicate that the request was accepted and
+   a TCP connection was established.  After this point, the TCP
+   connection is acting as a TCP tunnel, not an HTTP/1.1 connection.
+
+   Both of these mechanisms also permit the server to reject the
+   request.  For example, [RFC9110] says:
+
+      A server MAY ignore a received Upgrade header field if it wishes
+      to continue using the current protocol on that connection.
+
+   and
+
+      A server MUST reject a CONNECT request that targets an empty or
+      invalid port number, typically by responding with a 400 (Bad
+      Request) status code.
+
+   Rejections are common, and can happen for a variety of reasons.  An
+   "upgrade" request might be rejected if:
+
+   *  The server does not support any of the client's indicated Upgrade
+      Tokens (i.e., the client's proposed new protocols), so it
+      continues to use HTTP/1.1.
+
+   *  The server knows that an upgrade to the offered protocol will not
+      provide any improvement over HTTP/1.1 for this request to this
+      resource, so it chooses to respond in HTTP/1.1.
+
+   *  The server requires the client to authenticate before upgrading
+      the protocol, so it replies with the status code "401
+      (Unauthorized)" and provides a challenge in a "WWW-Authenticate"
+      response header ([RFC9110], Section 11.6.1).
+
+   *  The resource has moved, so the server replies with a 3xx
+      (Redirection) status code ([RFC9110], Section 15.4).
+ + Similarly, a CONNECT request might be rejected if: + + * The server does not support HTTP CONNECT. + + * The specified destination is not allowed under server policy. + + * The destination cannot be resolved, is unreachable, or does not + accept the connection. + + * The proxy requires the client to authenticate before proceeding. + + After rejecting a request, the server will continue to interpret + subsequent bytes on that connection in accordance with HTTP/1.1. + + [RFC9110] also states: + + A client cannot begin using an upgraded protocol on the connection + until it has completely sent the request message (i.e., the client + can't change the protocol it is sending in the middle of a + message). + + However, because of the possibility of rejection, the converse is not + true: a client cannot necessarily begin using a new protocol merely + because it has finished sending the corresponding request message. + + In some cases, the client might expect that the protocol transition + will succeed. If this expectation is correct, the client might be + able to reduce delay by immediately sending the first bytes of the + new protocol "optimistically", without waiting for the server's + response. This document explores the security implications of this + "optimistic" behavior. + +3. Possible Security Issues + + When there are only two distinct parties involved in an HTTP/1.1 + connection (i.e., the client and the server), protocol transitions + introduce no new security issues: each party must already be prepared + for the other to send arbitrary data on the connection at any time. + However, HTTP connections often involve more than two parties, if the + requests or responses include third-party data. For example, a + browser (party 1) might send an HTTP request to an origin (party 2) + with path, headers, or body controlled by a website from a different + origin (party 3). 
Post-transition protocols such as WebSocket are
+   similarly often used to convey data chosen by a third party.
+
+   If the third-party data source is untrusted, we call the data it
+   provides "attacker-controlled".  The combination of attacker-
+   controlled data and optimistic protocol transitions results in two
+   significant security issues.
+
+3.1.  Request Smuggling
+
+   In a Request Smuggling attack ([RFC9112], Section 11.2), the
+   attacker-controlled data is chosen in such a way that it is
+   interpreted by the server as an additional HTTP request.  These
+   attacks allow the attacker to speak on behalf of the client while
+   bypassing the client's own rules about what requests it will issue.
+   Request Smuggling can occur if the client and server have distinct
+   interpretations of the data that flows between them.
+
+   If the server accepts a protocol transition request, it interprets
+   the subsequent bytes in accordance with the new protocol.  If it
+   rejects the request, it interprets those bytes as HTTP/1.1.
+   However, the client doesn't know which interpretation the server
+   will take until it receives the server's response status code.  If
+   it uses the new protocol optimistically, this creates a risk that
+   the server will interpret attacker-controlled data in the new
+   protocol as an additional HTTP request issued by the client.
+
+   As a trivial example, consider an HTTP CONNECT client providing
+   connectivity to an untrusted application.  If the client is
+   authenticated to the proxy server using a connection-level
+   authentication method such as TLS Client Certificates, the attacker
+   could send an HTTP/1.1 POST request for the proxy server at the
+   beginning of its TCP connection.  If the client delivers this data
+   optimistically, and the CONNECT request fails, the server would
+   misinterpret the application's data as a subsequent authenticated
+   request issued by the client.
+
+3.2.  Parser Exploits
+
+   A related category of attacks uses protocol disagreement to exploit
+   vulnerabilities in the server's request parsing logic.  These
+   attacks apply when the HTTP client is trusted by the server, but the
+   post-transition data source is not.  If the server software was
+   developed under the assumption that some or all of the HTTP request
+   data is not attacker-controlled, optimistic transmission can cause
+   this assumption to be violated, exposing vulnerabilities in the
+   server's HTTP request parser.
+
+4.  Operational Issues
+
+   If the server rejects the transition request, the connection can
+   continue to be used for HTTP/1.1.  There is no requirement to close
+   the connection in response to a rejected transition, and keeping the
+   connection open has performance advantages if additional HTTP
+   requests to this server are likely.  Thus, it is normally
+   inappropriate to close the connection in response to a rejected
+   transition.
+
+5.  Impact on HTTP Upgrade with Existing Upgrade Tokens
+
+   This section describes the impact of this document's considerations
+   on some registered Upgrade Tokens that are believed to be in use at
+   the time of writing.
+
+5.1.  "TLS"
+
+   The "TLS" family of Upgrade Tokens was defined in [RFC2817], which
+   correctly highlights the possibility of the server rejecting the
+   upgrade.  If a client ignores this possibility and sends TLS data
+   optimistically, the result cannot be valid HTTP/1.1: the first octet
+   of a TLS connection must be 22 (ContentType.handshake), but this is
+   not an allowed character in an HTTP/1.1 method.  A compliant
+   HTTP/1.1 server will treat this as a parsing error and close the
+   connection without processing further requests.
+
+5.2.  "WebSocket"/"websocket"
+
+   Section 4.1 of [RFC6455] says:
+
+      Once the client's opening handshake has been sent, the client
+      MUST wait for a response from the server before sending any
+      further data.
+
+   Thus, optimistic use of HTTP Upgrade is already forbidden in the
+   WebSocket protocol.  Additionally, the WebSocket protocol requires
+   high-entropy masking of client-to-server frames (Section 5.1 of
+   [RFC6455]).
+
+5.3.  "connect-udp"
+
+   Section 5 of [RFC9298] says:
+
+      A client MAY optimistically start sending UDP packets in HTTP
+      Datagrams before receiving the response to its UDP proxying
+      request.
+
+   However, in HTTP/1.1, this "proxying request" is an HTTP Upgrade
+   request.  This upgrade is likely to be rejected in certain
+   circumstances, such as when the UDP destination address (which is
+   attacker-controlled) is invalid.  Additionally, the contents of the
+   "connect-udp" protocol stream can include untrusted material (i.e.,
+   the UDP packets, which might come from other applications on the
+   client device).  This creates the possibility of Request Smuggling
+   attacks.  To avoid these concerns, this text is updated as follows:
+
+      When using HTTP/2 or later, a client MAY optimistically ...
+
+   Section 3.3 of [RFC9298] describes the requirement for a successful
+   proxy setup response, including upgrading to the "connect-udp"
+   protocol, and says:
+
+      If any of these requirements are not met, the client MUST treat
+      this proxying attempt as failed and abort the connection.
+
+   However, this could be interpreted as an instruction to abort the
+   underlying TLS and TCP connections in the event of an unsuccessful
+   response such as "407 (Proxy Authentication Required)".  To avoid an
+   unnecessary delay in this case, this text is hereby updated as
+   follows:
+
+      If any of these requirements are not met, the client MUST treat
+      this proxying attempt as failed.  If the "Upgrade" response
+      header field is absent, the client MAY reuse the connection for
+      further HTTP/1.1 requests; otherwise it MUST abort the underlying
+      connection.
+
+5.4.  "connect-ip"
+
+   The "connect-ip" Upgrade Token is defined in [RFC9484].
Section 11 + of [RFC9484] forbids clients from using optimistic upgrade, avoiding + this issue. + +6. Guidance for Future Upgrade Tokens + + There are now several good examples of designs that reduce or + eliminate the security concerns discussed in this document and may be + applicable in future specifications: + + * Forbid optimistic use of HTTP Upgrade (WebSocket, Section 4.1 of + [RFC6455]). + + * Embed a fixed preamble that terminates HTTP/1.1 processing + (HTTP/2, Section 3.4 of [RFC9113]). + + * Apply high-entropy masking of client-to-server data (WebSocket, + Section 5.1 of [RFC6455]). + + Future specifications for Upgrade Tokens should account for the + security issues discussed here and provide clear guidance on how + implementations can avoid them. + +6.1. Selection of Request Methods + + Some Upgrade Tokens, such as "TLS", are defined for use with any + ordinary HTTP Method. The upgraded protocol continues to provide + HTTP semantics, and will convey the response to this HTTP request. + + The other Upgrade Tokens mentioned in Section 5 do not preserve HTTP + semantics, so the method is not relevant. All of these Upgrade + Tokens are specified only for use with the "GET" method. + + Future specifications for Upgrade Tokens should restrict their use to + "GET" requests if the HTTP method is otherwise irrelevant and a + request body is not required. This improves consistency with other + Upgrade Tokens and reduces the likelihood that a faulty server + implementation might process the request body as the new protocol. + +7. Guidance for HTTP CONNECT + + In HTTP/1.1, clients that send CONNECT requests on behalf of + untrusted TCP clients MUST wait for a 2xx (Successful) response + before sending any TCP payload data. 
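The rule above can be illustrated with a short, non-normative sketch: a client relaying an untrusted TCP stream parses the proxy's response head and only begins forwarding payload once a 2xx (Successful) status is confirmed. Function and parameter names here are illustrative, not part of this document.

```python
# Illustrative sketch (not normative): gate TCP payload relaying on the
# proxy's response to the CONNECT request, rather than sending optimistically.

def connect_succeeded(response_head: bytes) -> bool:
    """Return True only if the status line of the proxy's response to a
    CONNECT request reports a 2xx (Successful) status code."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ")
    return (
        len(parts) >= 2
        and parts[0].startswith(b"HTTP/1.")
        and len(parts[1]) == 3
        and parts[1].isdigit()
        and parts[1].startswith(b"2")
    )
```

A client following this guidance would buffer (or simply not read) the untrusted client's bytes until `connect_succeeded` returns true for the response head it received from the proxy.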
+ + To mitigate vulnerabilities from any clients that do not conform to + this requirement, proxy servers MAY close the underlying connection + when rejecting an HTTP/1.1 CONNECT request, without processing any + further data sent to the proxy server on that connection. Note that + this behavior may impair performance, especially when returning a + "407 (Proxy Authentication Required)" response. + +8. IANA Considerations + + This document has no IANA actions. + +9. References + +9.1. Normative References + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + . + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, . + + [RFC9110] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Semantics", STD 97, RFC 9110, + DOI 10.17487/RFC9110, June 2022, + . + + [RFC9298] Schinazi, D., "Proxying UDP in HTTP", RFC 9298, + DOI 10.17487/RFC9298, August 2022, + . + + [RFC9484] Pauly, T., Ed., Schinazi, D., Chernyakhovsky, A., + Kühlewind, M., and M. Westerlund, "Proxying IP in HTTP", + RFC 9484, DOI 10.17487/RFC9484, October 2023, + . + +9.2. Informative References + + [RFC2817] Khare, R. and S. Lawrence, "Upgrading to TLS Within + HTTP/1.1", RFC 2817, DOI 10.17487/RFC2817, May 2000, + . + + [RFC6455] Fette, I. and A. Melnikov, "The WebSocket Protocol", + RFC 6455, DOI 10.17487/RFC6455, December 2011, + . + + [RFC9112] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP/1.1", STD 99, RFC 9112, DOI 10.17487/RFC9112, + June 2022, . + + [RFC9113] Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113, + DOI 10.17487/RFC9113, June 2022, + . + +Acknowledgments + + Thanks to Mark Nottingham and Lucas Pardue for early reviews of this + document. + +Author's Address + + Benjamin M. Schwartz + Meta Platforms, Inc. 
+ Email: ietf@bemasc.net diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-resumable-upload.html b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-resumable-upload.html new file mode 100644 index 000000000..18c2afefb --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-resumable-upload.html @@ -0,0 +1,2629 @@ + + + + + + +Resumable Uploads for HTTP + + + + + + + + + + + + + + + + + + + + + + + + + +
Internet-DraftResumable UploadsJanuary 2025
Kleidl, et al.Expires 11 July 2025[Page]
+
+
+
+
Workgroup:
+
HTTP
+
Internet-Draft:
+
draft-ietf-httpbis-resumable-upload-latest
+
Published:
+
+ +
+
Intended Status:
+
Standards Track
+
Expires:
+
+
Authors:
+
+
+
M. Kleidl, Ed. +
+
Transloadit
+
+
+
G. Zhang, Ed. +
+
Apple Inc.
+
+
+
L. Pardue, Ed. +
+
Cloudflare
+
+
+
+
+

Resumable Uploads for HTTP

+
+

Abstract

+

HTTP clients often encounter interrupted data transfers as a result of canceled requests or dropped connections. Prior to interruption, part of a representation may have been exchanged. To complete the data transfer of the entire representation, it is often desirable to issue subsequent requests that transfer only the remainder of the representation. HTTP range requests support this concept of resumable downloads from server to client. This document describes a mechanism that supports resumable uploads from client to server using HTTP.

+
+
+

+About This Document +

+

This note is to be removed before publishing as an RFC.

+

+ Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ietf-httpbis-resumable-upload/.

+

+ Discussion of this document takes place on the + HTTP Working Group mailing list (mailto:ietf-http-wg@w3.org), + which is archived at https://lists.w3.org/Archives/Public/ietf-http-wg/. + Working Group information can be found at https://httpwg.org/.

+

Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/resumable-upload.

+
+
+
+

+Status of This Memo +

+

+ This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79.

+

+ Internet-Drafts are working documents of the Internet Engineering Task + Force (IETF). Note that other groups may also distribute working + documents as Internet-Drafts. The list of current Internet-Drafts is + at https://datatracker.ietf.org/drafts/current/.

+

+ Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress."

+

+ This Internet-Draft will expire on 11 July 2025.

+
+
+ +
+
+

+Table of Contents +

+ +
+
+
+
+

+1. Introduction +

+

HTTP clients often encounter interrupted data transfers as a result of canceled requests or dropped connections. Prior to interruption, part of a representation (see Section 3.2 of [HTTP]) might have been exchanged. To complete the data transfer of the entire representation, it is often desirable to issue subsequent requests that transfer only the remainder of the representation. HTTP range requests (see Section 14 of [HTTP]) support this concept of resumable downloads from server to client.

+

HTTP methods such as POST or PUT can be used by clients to request processing of representation data enclosed in the request message. The transfer of representation data from client to server is often referred to as an upload. Uploads are just as likely as downloads to suffer from the effects of data transfer interruption. Humans can play a role in upload interruptions through manual actions such as pausing an upload. Regardless of the cause of an interruption, servers may have received part of the representation before its occurrence and it is desirable if clients can complete the data transfer by sending only the remainder of the representation. The process of sending additional parts of a representation using subsequent HTTP requests from client to server is herein referred to as a resumable upload.

+

Connection interruptions are common and the absence of a standard mechanism for resumable uploads has led to a proliferation of custom solutions. Some of those use HTTP, while others rely on other transfer mechanisms entirely. An HTTP-based standard solution is desirable for such a common class of problem.

+

This document defines an optional mechanism for HTTP that enables resumable uploads in a way that is backwards-compatible with conventional HTTP uploads. When an upload is interrupted, clients can send subsequent requests to query the server state and use this information to send the remaining data. Alternatively, they can cancel the upload entirely. Different from ranged downloads, this protocol does not support transferring different parts of the same representation in parallel.

+
+
+
+
+

+2. Conventions and Definitions +

+

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", +"MAY", and "OPTIONAL" in this document are to be interpreted as +described in BCP 14 [RFC2119] [RFC8174] when, and only when, they +appear in all capitals, as shown here.

+

The terms Byte Sequence, Item, String, Token, Integer, and Boolean are imported from +[STRUCTURED-FIELDS].

+

The terms "representation", "representation data", "representation metadata", "content", "client" and "server" are from [HTTP].

+
+
+
+
+

+3. Overview +

+

Resumable uploads are supported in HTTP through use of a temporary resource, an upload resource, that is separate from the resource being uploaded to (hereafter, the target resource) and specific to that upload. By interacting with the upload resource, a client can retrieve the current offset of the upload (Section 5), append to the upload (Section 6), and cancel the upload (Section 7).

+

The remainder of this section uses an example of a file upload to illustrate different interactions with the upload resource. Note, however, that HTTP message exchanges use representation data (see Section 8.1 of [HTTP]), which means that resumable uploads can be used with many forms of content -- not just static files.

+
+
+

+3.1. Example 1: Complete upload of file with known size +

+

In this example, the client first attempts to upload a file with a known size in a single HTTP request to the target resource. An interruption occurs and the client then attempts to resume the upload using subsequent HTTP requests to the upload resource.

+

1) The client notifies the server that it wants to begin an upload (Section 4). The server reserves the required resources to accept the upload from the client, and the client begins transferring the entire file in the request content.

+

An informational response can be sent to the client, which signals the server's support of resumable upload as well as the upload resource URL via the Location header field (Section 10.2.2 of [HTTP]).

+
+
+
+
+ + + + + + + + + + + + + + + Client + Server + POST + Reserve + resources + for + upload + 104 + Upload + Resumption + Supported + with + upload + resource + URL + Flow + Interrupted + + +
+
+
Figure 1: +Upload Creation +
+
+

2) If the connection to the server is interrupted, the client might want to resume the upload. However, before this is possible the client needs to know the amount of data that the server received before the interruption. It does so by retrieving the offset (Section 5) from the upload resource.

+
+
+
+
+ + + + + + + + + Client + Server + HEAD + to + upload + resource + URL + 204 + No + Content + with + Upload-Offset + + +
+
+
Figure 2: +Offset Retrieval +
+
+

3) The client can resume the upload by sending the remaining file content to the upload resource (Section 6), appending to the already stored data in the upload. The Upload-Offset value is included to ensure that the client and server agree on the offset that the upload resumes from.

+
+
+
+
+ + + + + + + + + Client + Server + PATCH + to + upload + resource + URL + with + Upload-Offset + 201 + Created + on + completion + + +
+
+
Figure 3: +Upload Append +
+
+

4) If the client is not interested in completing the upload, it can instruct the upload resource to delete the upload and free all related resources (Section 7).

+
+
+
+
+ + + + + + + + + Client + Server + DELETE + to + upload + resource + URL + 204 + No + Content + on + completion + + +
+
+
Figure 4: +Upload Cancellation +
+
+
+
+
+
+

+3.2. Example 2: Upload as a series of parts +

+

In some cases, clients might prefer to upload a file as a series of parts sent serially across multiple HTTP messages. One use case is to overcome server limits on HTTP message content size. Another use case is where the client does not know the final size, such as when file data originates from a streaming source.

+

This example shows how the client, with prior knowledge about the server's resumable upload support, can upload parts of a file incrementally.

+

1) If the client is aware that the server supports resumable upload, it can start an upload with the Upload-Complete field value set to false and the first part of the file.

+
+
+
+
+ + + + + + + + + Client + Server + POST + with + Upload-Complete: + ?0 + 201 + Created + with + Upload-Complete: + ?0 + and + Location + on + completion + + +
+
+
Figure 5: +Incomplete Upload Creation +
+
+

2) Subsequently, parts are appended (Section 6). The last part of the upload has a Upload-Complete field value set to true to indicate the complete transfer.

+
+
+
+
+ + + + + + + + + Client + Server + PATCH + to + upload + resource + URL + with + Upload-Offset + and + Upload-Complete: + ?1 + 201 + Created + on + completion + + +
+
+
Figure 6: +Upload Append Last Chunk +
+
+
+
+
+
+
+
+

+4. Upload Creation +

+

When a resource supports resumable uploads, the first step is creating the upload resource. To be compatible with the widest range of resources, this is accomplished by including the Upload-Complete header field in the request that initiates the upload.

+

As a consequence, resumable uploads support all HTTP request methods that can carry content, such as POST, PUT, and PATCH. Similarly, the response to the upload request can have any status code. Both the method(s) and status code(s) supported are determined by the resource.

+

Upload-Complete MUST be set to false if the end of the request content is not the end of the upload. Otherwise, it MUST be set to true. This header field can be used for request identification by a server. The request MUST NOT include the Upload-Offset header field.

+

If the request is valid, the server SHOULD create an upload resource. Then, the server MUST include the Location header field in the response and set its value to the URL of the upload resource. The client MAY use this URL for offset retrieval (Section 5), upload append (Section 6), and upload cancellation (Section 7).

+

Once the upload resource is available and while the request content is being uploaded, the target resource MAY send one or more informational responses with a 104 (Upload Resumption Supported) status code to the client. In the first informational response, the Location header field MUST be set to the URL pointing to the upload resource. In subsequent informational responses, the Location header field MUST NOT be set. An informational response MAY contain the Upload-Offset header field with the current upload offset as the value to inform the client about the upload progress.

+

The server MUST send the Upload-Offset header field in the response if it considers the upload active, either when the response is a success (e.g. 201 (Created)), or when the response is a failure (e.g. 409 (Conflict)). The Upload-Offset field value MUST be equal to the end offset of the entire upload, or the begin offset of the next chunk if the upload is still incomplete. The client SHOULD consider the upload failed if the response has a status code that indicates a success but the offset indicated in the Upload-Offset field value does not equal the total of begin offset plus the number of bytes uploaded in the request.

+
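As a non-normative sketch, the client-side consistency check described above might look like this (function and parameter names are illustrative):

```python
# Illustrative sketch: after a request that appended `bytes_sent` bytes
# starting at `begin_offset`, verify the server-reported Upload-Offset.

def check_upload_offset(headers: dict[str, str], begin_offset: int,
                        bytes_sent: int) -> bool:
    """On a success response, return True if the reported Upload-Offset
    equals begin_offset plus the bytes uploaded in the request; False
    means the client SHOULD consider the upload failed."""
    reported = int(headers["upload-offset"])
    return reported == begin_offset + bytes_sent
```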

If the request completes successfully and the entire upload is complete, the server MUST acknowledge it by responding with a 2xx (Successful) status code. Servers are RECOMMENDED to use 201 (Created) unless otherwise specified. The response MUST NOT include the Upload-Complete header field with the value of false.

+

If the request completes successfully but the entire upload is not yet complete, as indicated by an Upload-Complete field value of false in the request, the server MUST acknowledge it by responding with the 201 (Created) status code and an Upload-Complete header value set to false.

+

The request can indicate the upload's final size in two different ways. Both indicators may be present in the same request as long as they convey the same size. If the sizes are inconsistent, the server MUST reject the request by responding with a 400 (Bad Request) status code.

+
    +
  • +

    If the request includes an Upload-Complete field value set to true and a valid Content-Length header field, the client attempts to upload a fixed-length resource in one request. In this case, the upload's final size is the Content-Length field value and the server MUST record it to ensure its consistency. The value can therefore not be used if the upload is split across multiple requests.

    +
  • +
  • +

    If the request includes the Upload-Length header field, the server MUST record its value as the upload's final size. A client SHOULD provide this header field if the upload length is known at the time of upload creation.

    +
  • +
+

The upload is not automatically completed if the offset reaches the upload's final size. Instead, a client MUST indicate the completion of an upload through the Upload-Complete header field. Indicating an upload's final size can help the server allocate necessary resources for the upload and provide early feedback if the size does not match the server's limits (Section 8.2).

+

The server MAY enforce a maximum size of an upload resource. This limit MAY be equal to the upload's final size, if available, or an arbitrary value. The limit's value or its existence MUST NOT change throughout the lifetime of the upload resource. The server MAY indicate such a limit to the client by including the Upload-Limit header field in the informational or final response to upload creation. If the client receives an Upload-Limit header field indicating that the maximum size is less than the amount of bytes it intends to upload to a resource, it SHOULD stop the current upload transfer immediately and cancel the upload (Section 7).

+

The request content MAY be empty. If the Upload-Complete header field is then set to true, the client intends to upload an empty resource representation. An Upload-Complete header field set to false is also valid. This can be used to create an upload resource URL before transferring data, which can save client or server resources. Since informational responses are optional, this technique provides another mechanism to learn the URL, at the cost of an additional round-trip before data upload can commence.

+

If the server does not receive the entire request content, for example because of canceled requests or dropped connections, it SHOULD append as much of the request content starting at its beginning and without discontinuities as possible. If the server did not append the entire request content, the upload MUST NOT be considered complete.

+
+
+POST /upload HTTP/1.1
+Host: example.com
+Upload-Draft-Interop-Version: 6
+Upload-Complete: ?1
+Content-Length: 100
+Upload-Length: 100
+
+[content (100 bytes)]
+
+
+
+
+HTTP/1.1 104 Upload Resumption Supported
+Upload-Draft-Interop-Version: 6
+Location: https://example.com/upload/b530ce8ff
+
+HTTP/1.1 104 Upload Resumption Supported
+Upload-Draft-Interop-Version: 6
+Upload-Offset: 50
+Upload-Limit: max-size=1000000000
+
+HTTP/1.1 201 Created
+Location: https://example.com/upload/b530ce8ff
+Upload-Offset: 100
+Upload-Limit: max-size=1000000000
+
+
+

The next example shows an upload creation where only the first 25 bytes of a 100-byte upload are transferred. The server acknowledges the received data and that the upload is not complete yet:

+
+
+POST /upload HTTP/1.1
+Host: example.com
+Upload-Draft-Interop-Version: 6
+Upload-Complete: ?0
+Content-Length: 25
+Upload-Length: 100
+
+[partial content (25 bytes)]
+
+
+
+
+HTTP/1.1 201 Created
+Location: https://example.com/upload/b530ce8ff
+Upload-Complete: ?0
+Upload-Offset: 25
+Upload-Limit: max-size=1000000000
+
+
+

If the client received an informational response with the upload URL in the Location field value, it MAY automatically attempt upload resumption when the connection is terminated unexpectedly, or if a 5xx status is received. The client SHOULD NOT automatically retry if it receives a 4xx status code.

+

File metadata can affect how servers might act on the uploaded file. Clients can send representation metadata (see Section 8.3 of [HTTP]) in the request that starts an upload. Servers MAY interpret this metadata or MAY ignore it. The Content-Type header field (Section 8.3 of [HTTP]) can be used to indicate the media type of the file. The applied content codings are specified using the Content-Encoding header field and are retained throughout the entire upload. When resuming an interrupted upload, the same content codings are used for appending to the upload, producing a representation of the upload resource. The Content-Disposition header field ([RFC6266]) can be used to transmit a filename; if included, the parameters SHOULD be either filename, filename* or boundary.

+
+
+

+4.1. Feature Detection +

+

If the client has no knowledge of whether the resource supports resumable uploads, a resumable request can be used with some additional constraints. In particular, the Upload-Complete field value (Section 8.3) MUST NOT be false if the server support is unclear. This allows the upload to function as if it is a regular upload.

+

Servers SHOULD use the 104 (Upload Resumption Supported) informational response to indicate their support for a resumable upload request.

+

Clients MUST NOT attempt to resume an upload unless they receive a 104 (Upload Resumption Supported) informational response, or have other out-of-band methods to determine server support for resumable uploads.

+
+
+
+
+

+4.2. Draft Version Identification +

+
    +
  • +

    RFC Editor's Note: Please remove this section and Upload-Draft-Interop-Version from all examples prior to publication of a final version of this document.

    +
  • +
+

The current interop version is 6.

+

Client implementations of draft versions of the protocol MUST send an Upload-Draft-Interop-Version header field with the interop version as its value in their requests. The Upload-Draft-Interop-Version field value is an Integer.

+

Server implementations of draft versions of the protocol MUST NOT send a 104 (Upload Resumption Supported) informational response when the interop version indicated by the Upload-Draft-Interop-Version header field in the request is missing or mismatching.

+

Server implementations of draft versions of the protocol MUST also send an Upload-Draft-Interop-Version header field with the interop version as its value in the 104 (Upload Resumption Supported) informational response.

+

Client implementations of draft versions of the protocol MUST ignore a 104 (Upload Resumption Supported) informational response whose interop version, as indicated by the Upload-Draft-Interop-Version header field, is missing or mismatching.

+

Both the client and the server send and check the draft version to ensure that implementations of the final RFC do not accidentally interoperate with draft implementations, as final implementations will not check for the presence of the Upload-Draft-Interop-Version header field.
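The version gating described above can be sketched as a pair of checks. This is a non-normative illustration; the header dictionaries and helper names are assumptions for the sketch:

```python
INTEROP_VERSION = 6  # current interop version from this draft


def server_may_send_104(request_headers):
    """Server side: only send a 104 (Upload Resumption Supported) response
    when the request carries a matching Upload-Draft-Interop-Version."""
    value = request_headers.get("Upload-Draft-Interop-Version")
    return value is not None and value == str(INTEROP_VERSION)


def client_accepts_104(informational_headers):
    """Client side: ignore a 104 response whose interop version is
    missing or mismatching."""
    value = informational_headers.get("Upload-Draft-Interop-Version")
    return value is not None and value == str(INTEROP_VERSION)
```

Both sides perform the check so that a final-RFC implementation, which omits the field entirely, never accidentally interoperates with a draft implementation.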

+
+
+
+
+
+
+

+5. Offset Retrieval +

+

If an upload is interrupted, the client MAY attempt to fetch the offset of the incomplete upload by sending a HEAD request to the upload resource.

+

The request MUST NOT include an Upload-Offset, Upload-Complete, or Upload-Length header field. The server MUST reject requests containing any of these fields by responding with a 400 (Bad Request) status code.

+

If the server considers the upload resource to be active, it MUST respond with a 204 (No Content) or 200 (OK) status code. The response MUST include the Upload-Offset header field, with the value set to the current resumption offset for the target resource. The response MUST include the Upload-Complete header field; the value is set to true only if the upload is complete. The response MUST include the Upload-Length header field set to the upload's final size if one was recorded during the upload creation (Section 4). The response MAY include the Upload-Limit header field if corresponding limits on the upload resource exist.

+

An upload is considered complete only if the server completely and successfully received a corresponding creation request (Section 4) or append request (Section 6) with the Upload-Complete header value set to true.

+

The client MUST NOT perform offset retrieval while creation (Section 4) or append (Section 6) is in progress.

+

The offset MUST be accepted by a subsequent append (Section 6). Due to network delay and reordering, the server might still be receiving data from an ongoing transfer for the same upload resource, which from the client's perspective has failed. The server MAY terminate any transfers for the same upload resource before sending the response by abruptly terminating the HTTP connection or stream. Alternatively, the server MAY keep the ongoing transfer alive but ignore further bytes received past the offset.

+

The client MUST NOT start more than one append (Section 6) based on the resumption offset from a single offset retrieving request.

+

In order to prevent HTTP caching ([CACHING]), the response SHOULD include a Cache-Control header field with the value no-store.

+

If the server does not consider the upload resource to be active, it MUST respond with a 404 (Not Found) status code.

+

The resumption offset can be less than or equal to the number of bytes the client has already sent. The client MAY reject an offset which is greater than the number of bytes it has already sent during this upload. The client is expected to handle backtracking of a reasonable length. If the offset is invalid for this upload, or if the client cannot backtrack to the offset and reproduce the same content it has already sent, the upload MUST be considered a failure. The client MAY cancel the upload (Section 7) after rejecting the offset.
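The client-side decision described above can be sketched as a small check. This is a non-normative illustration; the function name and parameters are assumptions, where `can_backtrack_to` is the smallest offset from which the client can still reproduce the exact content it has already sent:

```python
def evaluate_resumption_offset(offset, bytes_sent, can_backtrack_to):
    """Decide whether a resumption offset from offset retrieval is usable.

    offset           -- Upload-Offset value returned by the server
    bytes_sent       -- number of bytes the client has already sent
    can_backtrack_to -- smallest offset the client can reproduce from
    Returns "resume" when an append can start at the offset, or "fail"
    when the upload must be considered a failure (and MAY be canceled).
    """
    if offset > bytes_sent:
        return "fail"   # client MAY reject an offset beyond what it sent
    if offset < can_backtrack_to:
        return "fail"   # cannot backtrack and reproduce the same content
    return "resume"     # start a single append from this offset
```

A client that buffers all unacknowledged data can always backtrack (`can_backtrack_to` equals the last acknowledged offset); a streaming client may have a narrower window.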

+

The following example shows an offset retrieval request. The server indicates the new offset and that the upload is not complete yet:

+
+
+HEAD /upload/b530ce8ff HTTP/1.1
+Host: example.com
+Upload-Draft-Interop-Version: 6
+
+
+
+
+HTTP/1.1 204 No Content
+Upload-Offset: 100
+Upload-Complete: ?0
+Cache-Control: no-store
+
+
+

The client SHOULD NOT automatically retry if a 4xx (Client Error) status code is received.

+
+
+
+
+

+6. Upload Append +

+

Upload appending is used for resuming an existing upload.

+

The request MUST use the PATCH method with the application/partial-upload media type and MUST be sent to the upload resource. The Upload-Offset field value (Section 8.1) MUST be set to the resumption offset.

+

If the end of the request content is not the end of the upload, the Upload-Complete field value (Section 8.3) MUST be set to false.

+

The server SHOULD respect representation metadata received during creation (Section 4). An upload append request continues uploading the same representation as used in the upload creation (Section 4) and thus uses the same content codings, if they were applied. For example, if the initial upload creation included the Content-Encoding: gzip header field, the upload append request resumes the transfer of the gzipped data without indicating again that the gzip coding is applied.

+

If the server does not consider the upload associated with the upload resource active, it MUST respond with a 404 (Not Found) status code.

+

The client MUST NOT perform multiple upload transfers for the same upload resource in parallel. This helps avoid race conditions and data loss or corruption. The server is RECOMMENDED to take measures to avoid parallel upload transfers: The server MAY terminate any creation (Section 4) or append for the same upload URL. Since the client is not allowed to perform multiple transfers in parallel, the server can assume that the previous attempt has already failed. Therefore, the server MAY abruptly terminate the previous HTTP connection or stream.

+

If the offset indicated by the Upload-Offset field value does not match the offset provided by the immediate previous offset retrieval (Section 5), or the end offset of the immediate previous incomplete successful transfer, the server MUST respond with a 409 (Conflict) status code. The server MAY use the problem type [PROBLEM] of "https://iana.org/assignments/http-problem-types#mismatching-upload-offset" in the response; see Section 10.1.

+

The server applies the patch document of the application/partial-upload media type by appending the request content to the targeted upload resource. If the server does not receive the entire patch document, for example because of canceled requests or dropped connections, it SHOULD append as much of the patch document starting at its beginning and without discontinuities as possible. Appending a continuous section starting at the patch document's beginning constitutes a successful PATCH as defined in Section 2 of [RFC5789]. If the server did not receive and apply the entire patch document, the upload MUST NOT be considered complete.

+

While the request content is being uploaded, the target resource MAY send one or more informational responses with a 104 (Upload Resumption Supported) status code to the client. These informational responses MUST NOT contain the Location header field. They MAY include the Upload-Offset header field with the current upload offset as the value to inform the client about the upload progress.

+

The server MUST send the Upload-Offset header field in the response if it considers the upload active, either when the response is a success (e.g. 201 (Created)), or when the response is a failure (e.g. 409 (Conflict)). The value MUST be equal to the end offset of the entire upload, or the begin offset of the next chunk if the upload is still incomplete. The client SHOULD consider the upload failed if the status code indicates a success but the offset indicated by the Upload-Offset field value does not equal the total of begin offset plus the number of bytes uploaded in the request.

+

If the upload is already complete, the server MUST NOT modify the upload resource and MUST respond with a 400 (Bad Request) status code. The server MAY use the problem type [PROBLEM] of "https://iana.org/assignments/http-problem-types#completed-upload" in the response; see Section 10.2.
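The offset-mismatch and completed-upload rules above can be sketched as a server-side handler. This is a non-normative illustration; the in-memory `upload` dict and the returned problem-type suffixes are assumptions for the sketch:

```python
def handle_append(upload, request_offset, body, upload_complete):
    """Sketch of server-side append handling for one request.

    upload -- assumed in-memory state: {"data": bytes, "complete": bool}
    Returns (status_code, problem_type_suffix_or_None).
    """
    if upload["complete"]:
        # A completed upload must not be modified.
        return 400, "completed-upload"
    if request_offset != len(upload["data"]):
        # Offset does not match the resource's current offset.
        return 409, "mismatching-upload-offset"
    upload["data"] += body          # append the received patch document
    if upload_complete:
        upload["complete"] = True   # Upload-Complete: ?1 ends the upload
    return 201, None
```

A real server would also stream the body, enforce Upload-Limit values, and emit the Upload-Offset header field in the response.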

+

If the request completes successfully and the entire upload is complete, the server MUST acknowledge it by responding with a 2xx (Successful) status code. Servers are RECOMMENDED to use a 201 (Created) response if not otherwise specified. The response MUST NOT include the Upload-Complete header field with the value set to false.

+

If the request completes successfully but the entire upload is not yet complete, as indicated by the Upload-Complete field value set to false, the server MUST acknowledge it by responding with a 201 (Created) status code and the Upload-Complete field value set to false.

+

If the request includes the Upload-Complete field value set to true and a valid Content-Length header field, the client attempts to upload the remaining resource in one request. In this case, the upload's final size is the sum of the upload's offset and the Content-Length header field. If the server does not have a record of the upload's final size from creation or the previous append, the server MUST record the upload's final size to ensure its consistency. If the server does have a previous record, that value MUST match the upload's final size. If they do not match, the server MUST reject the request with a 400 (Bad Request) status code.

+

The server MUST prevent the offset from exceeding the upload's final size when appending. If a final size has been recorded and an upload append request exceeds this value, the server MUST stop appending bytes to the upload once the offset reaches the final size and MUST reject the request with a 400 (Bad Request) status code. It is not sufficient to rely on the Content-Length header field for enforcement because the header field might not be present.
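The final-size bookkeeping described in the two preceding paragraphs can be sketched as follows. This is a non-normative illustration; the `upload` dict with its "length" key is an assumption for the sketch:

```python
def check_final_size(upload, offset, content_length, upload_complete):
    """Record and enforce the upload's final size for an append request.

    upload -- assumed in-memory state; a recorded final size, if any,
              lives under upload["length"]
    Returns 400 when the request must be rejected, else None.
    """
    if upload_complete and content_length is not None:
        final_size = offset + content_length
        if "length" not in upload:
            upload["length"] = final_size   # record for consistency checks
        elif upload["length"] != final_size:
            return 400                      # conflicting final sizes
    if ("length" in upload and content_length is not None
            and offset + content_length > upload["length"]):
        return 400                          # append would exceed final size
    return None
```

Note the last check only covers requests that carry a Content-Length; as the text above says, a server must also enforce the limit while consuming a body of unknown length.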

+

The request content MAY be empty. If the Upload-Complete field is then set to true, the client intends to complete the upload without appending additional data.

+

The following example shows an upload append. The client transfers the next 100 bytes at an offset of 100 and does not indicate that the upload is then completed. The server acknowledges the new offset:

+
+
+PATCH /upload/b530ce8ff HTTP/1.1
+Host: example.com
+Upload-Offset: 100
+Upload-Draft-Interop-Version: 6
+Content-Length: 100
+Content-Type: application/partial-upload
+
+[content (100 bytes)]
+
+
+
+
+HTTP/1.1 201 Created
+Upload-Offset: 200
+
+
+

The client MAY automatically attempt upload resumption when the connection is terminated unexpectedly, or if a 5xx (Server Error) status code is received. The client SHOULD NOT automatically retry if a 4xx (Client Error) status code is received.

+
+
+
+
+

+7. Upload Cancellation +

+

If the client wants to terminate the transfer without the ability to resume, it can send a DELETE request to the upload resource. Doing so is an indication that the client is no longer interested in continuing the upload, and that the server can release any resources associated with it.

+

The client MUST NOT initiate cancellation without the knowledge of server support.

+

The request MUST use the DELETE method and MUST NOT include an Upload-Offset or Upload-Complete header field. The server MUST reject a request containing either of these fields by responding with a 400 (Bad Request) status code.

+

If the server successfully deactivates the upload resource, it MUST respond with a 204 (No Content) status code.

+

The server MAY terminate any in-flight requests to the upload resource before sending the response by abruptly terminating their HTTP connection(s) or stream(s).

+

If the server does not consider the upload resource to be active, it MUST respond with a 404 (Not Found) status code.

+

If the server does not support cancellation, it MUST respond with a 405 (Method Not Allowed) status code.

+

The following example shows an upload cancellation:

+
+
+DELETE /upload/b530ce8ff HTTP/1.1
+Host: example.com
+Upload-Draft-Interop-Version: 6
+
+
+
+
+HTTP/1.1 204 No Content
+
+
+
+
+
+
+

+8. Header Fields +

+
+
+

+8.1. Upload-Offset +

+

The Upload-Offset request and response header field indicates the resumption offset of the corresponding upload, counted in bytes. The Upload-Offset field value is an Integer.

+
+
+
+
+

+8.2. Upload-Limit +

+

The Upload-Limit response header field indicates limits applying to the upload resource. The Upload-Limit field value is a Dictionary. The following limits are defined:

+
    +
  • +

    The max-size key specifies a maximum size that an upload resource is allowed to reach, counted in bytes. The value is an Integer.

    +
  • +
  • +

    The min-size key specifies a minimum size for a resumable upload, counted in bytes. The server MAY decline to create an upload resource if the client indicates, by including the Content-Length and Upload-Complete: ?1 fields, that the uploaded data is smaller than the minimum size, but the server MAY still accept the uploaded data. The value is an Integer.

    +
  • +
  • +

    The max-append-size key specifies a maximum size counted in bytes for the request content in a single upload append request (Section 6). The server MAY reject requests exceeding this limit and a client SHOULD NOT send larger upload append requests. The value is an Integer.

    +
  • +
  • +

    The min-append-size key specifies a minimum size counted in bytes for the request content in a single upload append request (Section 6) that does not complete the upload by setting the Upload-Complete: ?1 field. The server MAY reject non-completing requests below this limit and a client SHOULD NOT send smaller non-completing upload append requests. A server MUST NOT reject an upload append request due to smaller size if the request includes the Upload-Complete: ?1 field. The value is an Integer.

    +
  • +
  • +

    The expires key specifies the remaining lifetime of the upload resource in seconds counted from the generation of the response by the server. After the resource's lifetime is reached, the server MAY make the upload resource inaccessible and a client SHOULD NOT attempt to access the upload resource. The lifetime MAY be extended but SHOULD NOT be reduced once the upload resource is created. The value is an Integer.

    +
  • +
+

When parsing this header field, unrecognized keys MUST be ignored and MUST NOT cause parsing to fail, in order to facilitate the addition of new limits in the future.
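The tolerant parsing required above can be sketched as follows. This is a simplified, non-normative illustration that handles only bare integer Dictionary members; a real implementation would use a Structured Fields parser:

```python
def parse_upload_limit(field_value):
    """Parse an Upload-Limit Dictionary value (integer members only),
    ignoring unrecognized keys instead of failing, as required above."""
    known = {"max-size", "min-size", "max-append-size",
             "min-append-size", "expires"}
    limits = {}
    for member in field_value.split(","):
        key, _, value = member.strip().partition("=")
        if key in known:
            try:
                limits[key] = int(value)
            except ValueError:
                pass  # malformed member: skipped in this sketch
    return limits
```

For example, a server announcing `Upload-Limit: max-size=1000000000, future-limit=5` yields only the `max-size` entry on a client that predates `future-limit`.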

+

A server that supports the creation of a resumable upload resource (Section 4) under a target URI MUST include the Upload-Limit header field with the corresponding limits in a response to an OPTIONS request sent to this target URI. If a server supports the creation of upload resources for any target URI, it MUST include the Upload-Limit header field with the corresponding limits in a response to an OPTIONS request with the * target. The limits announced in an OPTIONS response SHOULD NOT be less restrictive than the limits applied to an upload once the upload resource has been created. If the server does not apply any limits, it MUST use min-size=0 instead of an empty header value. A client can use an OPTIONS request to discover support for resumable uploads and potential limits before creating an upload resource.

+
+
+
+
+

+8.3. Upload-Complete +

+

The Upload-Complete request and response header field indicates whether the corresponding upload is considered complete. The Upload-Complete field value is a Boolean.

+

The Upload-Complete header field MUST only be used if support by the resource is known to the client (Section 4.1).

+
+
+
+
+

+8.4. Upload-Length +

+

The Upload-Length request and response header field indicates the number of bytes to be uploaded for the corresponding upload resource. The Upload-Length field value is an Integer.

+
+
+
+
+
+
+

+9. Media Type application/partial-upload +

+

The application/partial-upload media type describes a contiguous block of data that should be uploaded to a resource. There is no minimum block size and the block might be empty. The start and end of the block might align with the start and end of the file that should be uploaded, but they are not required to be aligned.

+
+
+
+
+

+10. Problem Types +

+
+
+

+10.1. Mismatching Offset +

+

This section defines the "https://iana.org/assignments/http-problem-types#mismatching-upload-offset" problem type [PROBLEM]. A server MAY use this problem type when responding to an upload append request (Section 6) to indicate that the Upload-Offset header field in the request does not match the upload resource's offset.

+

Two problem type extension members are defined: the expected-offset and provided-offset members. A response using this problem type SHOULD populate both members, with the value of expected-offset taken from the upload resource and the value of provided-offset taken from the upload append request.

+

The following example shows a response where the resource's offset was 100, but the client attempted to append at offset 200:

+
+
+HTTP/1.1 409 Conflict
+Content-Type: application/problem+json
+
+{
+  "type":"https://iana.org/assignments/http-problem-types#mismatching-upload-offset",
+  "title": "offset from request does not match offset of resource",
+  "expected-offset": 100,
+  "provided-offset": 200
+}
+
+
+
+
+
+
+

+10.2. Completed Upload +

+

This section defines the "https://iana.org/assignments/http-problem-types#completed-upload" problem type [PROBLEM]. A server MAY use this problem type when responding to an upload append request (Section 6) to indicate that the upload has already been completed and cannot be modified.

+

The following shows an example response:

+
+
+HTTP/1.1 400 Bad Request
+Content-Type: application/problem+json
+
+{
+  "type":"https://iana.org/assignments/http-problem-types#completed-upload",
+  "title": "upload is already completed"
+}
+
+
+
+
+
+
+
+
+
+

+11. Offset values +

+

The offset of an upload resource is the number of bytes that have been appended to the upload resource. Appended data cannot be removed from an upload and, therefore, the upload offset MUST NOT decrease. A server MUST NOT generate responses containing an Upload-Offset header field with a value that is smaller than was included in previous responses for the same upload resource. This includes informational and final responses for upload creation (Section 4), upload appending (Section 6), and offset retrieval (Section 5).

+

If a server loses data that has been appended to an upload, it MUST consider the upload resource invalid and reject further use of it. The Upload-Offset header field in responses serves as an acknowledgement of the append operation and as a guarantee that no retransmission of the data will be necessary. Clients can use this guarantee to free resources associated with already uploaded data while the upload is still ongoing.
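A client can exploit this guarantee with a buffer that discards data up to the latest acknowledged offset and rejects any decreasing Upload-Offset value. This is a non-normative sketch; the class and method names are assumptions:

```python
class UploadBuffer:
    """Client-side buffer of unacknowledged upload data.

    Bytes before 'base' have been acknowledged via Upload-Offset and
    will never need retransmission, so they are freed eagerly.
    """

    def __init__(self):
        self.base = 0            # offset of the first retained byte
        self.data = bytearray()  # bytes sent but not yet acknowledged

    def append(self, chunk):
        self.data.extend(chunk)

    def acknowledge(self, upload_offset):
        if upload_offset < self.base:
            # The offset MUST NOT decrease across responses.
            raise ValueError("Upload-Offset must never decrease")
        del self.data[: upload_offset - self.base]
        self.base = upload_offset
```

After an interruption, everything from `base` onward is still available for the next append, so the client never has to reproduce acknowledged data.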

+
+
+
+
+

+12. Redirection +

+

The 301 (Moved Permanently) and 302 (Found) status codes MUST NOT be used in offset retrieval (Section 5) and upload cancellation (Section 7) responses. For other responses, the upload resource MAY return a 308 (Permanent Redirect) status code and clients SHOULD use the new permanent URI for subsequent requests. If the client receives a 307 (Temporary Redirect) response to an offset retrieval (Section 5) request, it MAY apply the redirection directly in an immediate subsequent upload append (Section 6).

+
+
+
+
+

+13. Content Codings +

+

Since the codings listed in Content-Encoding are a characteristic of the representation (see Section 8.4 of [HTTP]), both the client and the server always compute the values for Upload-Offset and optionally Upload-Length on the content coded data (that is, the representation data). Moreover, the content codings are retained throughout the entire upload, meaning that the server is not required to decode the representation data to support resumable uploads. See Appendix A of [DIGEST-FIELDS] for more information.
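As a non-normative illustration of this rule, the snippet below shows that with Content-Encoding: gzip, Upload-Offset and Upload-Length count bytes of the gzip-coded representation, not of the original file:

```python
import gzip

# Hypothetical file to upload with Content-Encoding: gzip applied.
raw = b"hello world" * 1000
coded = gzip.compress(raw)

upload_length = len(coded)               # value to send in Upload-Length
first_chunk = coded[: upload_length // 2]
resume_offset = len(first_chunk)         # Upload-Offset for the next append

# Offsets are computed on the content-coded data, so they bear no direct
# relation to len(raw); the server never needs to decode the gzip stream.
assert resume_offset == upload_length // 2
assert upload_length != len(raw)
```

An append resuming at `resume_offset` simply sends `coded[resume_offset:]`, without re-declaring the gzip coding.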

+
+
+
+
+

+14. Transfer Codings +

+

Unlike Content-Encoding (see Section 8.4.1 of [HTTP]), Transfer-Encoding (see Section 6.1 of [HTTP/1.1]) is a property of the message, not of the representation. Moreover, transfer codings can be applied in transit (e.g., by proxies). This means that a client does not have to consider the transfer codings to compute the upload offset, while a server is responsible for transfer decoding the message before computing the upload offset. The same applies to the value of Upload-Length. Please note that the Content-Length header field cannot be used in conjunction with the Transfer-Encoding header field.

+
+
+
+
+

+15. Integrity Digests +

+

The integrity of an entire upload or of individual upload requests can be verified using digests from [DIGEST-FIELDS].

+
+
+

+15.1. Representation Digests +

+

Representation digests help verify the integrity of the entire data that has been uploaded so far, which might stretch across multiple requests.

+

If the client knows the integrity digest of the entire data before creating an upload resource, it MAY include the Repr-Digest header field when creating an upload (Section 4). Once the upload is completed, the server can compute the integrity digest of the received upload representation and compare it to the provided digest. If the digests don't match the server SHOULD consider the transfer failed and not process the uploaded data further. This way, the integrity of the entire uploaded data can be protected.

+

Alternatively, when creating an upload (Section 4), the client MAY ask the server to compute and return the integrity digests using a Want-Repr-Digest field conveying the preferred algorithms. The response SHOULD include at least one of the requested digests, but the server MAY omit them. The server SHOULD compute the representation digests using the preferred algorithms once the upload is complete and include the corresponding Repr-Digest header field in the response. Alternatively, the server MAY compute the digest continuously during the upload and include the Repr-Digest header field in responses to upload creation (Section 4) and upload appending requests (Section 6) even when the upload is not yet completed. This allows the client to simultaneously compute the digest of the transmitted upload data, compare it to the server's digest, and spot data integrity issues. If an upload is spread across multiple requests, data integrity issues can be found even before the upload is fully completed.
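A non-normative sketch of the incremental approach, assuming sha-256 as the negotiated algorithm: the server feeds each append's content into a running hash and compares the result with the digest the client supplied at creation.

```python
import base64
import hashlib

# Digest the client computed over the whole file and sent at creation
# (sha-256, base64-encoded, as used by Repr-Digest in RFC 9530).
client_digest = base64.b64encode(
    hashlib.sha256(b"first part, second part").digest()).decode()

# The server hashes incrementally as append requests arrive, without
# buffering the entire upload.
hasher = hashlib.sha256()
for chunk in (b"first part,", b" second part"):  # two append requests
    hasher.update(chunk)
server_digest = base64.b64encode(hasher.digest()).decode()

# Matching digests confirm the upload survived the interruptions intact.
assert server_digest == client_digest
```

The same incremental hash lets the server include an up-to-date Repr-Digest in interim responses, so the client can detect corruption before the upload finishes.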

+
+
+
+
+

+15.2. Content Digests +

+

Content digests help verify the integrity of the content in an individual request.

+

If the client knows the integrity digest of the content from an upload creation (Section 4) or upload appending (Section 6) request, it MAY include the Content-Digest header field in the request. Once the content has been received, the server can compute the integrity digest of the received content and compare it to the provided digest. If the digests don't match the server SHOULD consider the transfer failed and not append the content to the upload resource. This way, the integrity of an individual request can be protected.

+
+
+
+
+
+
+

+16. Subsequent Resources +

+

The server might process the uploaded data and make its results available in another resource during or after the upload. This subsequent resource is different from the upload resource created by the upload creation request (Section 4). The subsequent resource does not handle the upload process itself, but instead facilitates further interaction with the uploaded data. The server MAY indicate the location of this subsequent resource by including the Content-Location header field in the informational or final responses generated while creating (Section 4), appending to (Section 6), or retrieving the offset (Section 5) of an upload. For example, a subsequent resource could allow the client to fetch information extracted from the uploaded data.

+
+
+
+
+

+17. Upload Strategies +

+

The definition of the upload creation request (Section 4) provides the client with flexibility to choose whether the file is fully or partially transferred in the first request, or if no file data is included at all. Which behavior is best largely depends on the client's capabilities, its intention to avoid data re-transmission, and its knowledge about the server's support for resumable uploads.

+

The following subsections describe two typical upload strategies that are suited for common environments. Note that these modes are never explicitly communicated to the server and clients are not required to stick to one strategy, but can mix and adapt them to their needs.

+
+
+

+17.1. Optimistic Upload Creation +

+

An "optimistic upload creation" can be used independent of the client's knowledge about the server's support for resumable uploads. However, the client must be capable of handling and processing interim responses. An upload creation request then includes the full file because the client anticipates that the file will be transferred without interruptions or resumed if an interruption occurs.

+

The benefit of this method is that if the upload creation request succeeds, the file was transferred in a single request without additional round trips.

+

A possible drawback is that the client might be unable to resume an upload. If an upload is interrupted before the client received a 104 (Upload Resumption Supported) intermediate response with the upload URL, the client cannot resume that upload due to the missing upload URL. The intermediate response might not be received if the interruption happens too early in the message exchange, the server does not support resumable uploads at all, the server does not support sending the 104 (Upload Resumption Supported) intermediate response, or an intermediary dropped the intermediate response. Without a 104 response, the client needs to either treat the upload as failed or retry the entire upload creation request if this is allowed by the application.

+
+
+

+17.1.1. Upgrading To Resumable Uploads +

+

Optimistic upload creation allows clients and servers to automatically upgrade non-resumable uploads to resumable ones. In a non-resumable upload, the file is transferred in a single request, usually POST or PUT, without any ability to resume from interruptions. The client can offer the server to upgrade such a request to a resumable upload (see Section 4.1) by adding the Upload-Complete: ?1 header field to the original request. The Upload-Length header field SHOULD be added if the upload's final size is known upfront. The request is not changed otherwise.

+

A server that supports resumable uploads at the target URI can create a resumable upload resource and send its upload URL in a 104 (Upload Resumption Supported) intermediate response for the client to resume the upload after interruptions. A server that does not support resumable uploads or does not want to upgrade to a resumable upload for this request ignores the Upload-Complete: ?1 header. The transfer then falls back to a non-resumable upload without additional cost.

+

This upgrade can also be performed transparently by the client without the user taking an active role. When a user asks the client to send a non-resumable request, the client can perform the upgrade and handle potential interruptions and resumptions under the hood without involving the user. The last response received by the client is considered the response for the entire file upload and should be presented to the user.

+
+
+
+
+
+
+

+17.2. Careful Upload Creation +

+

For a "careful upload creation" the client knows that the server supports resumable uploads and sends an empty upload creation request without including any file data. Upon successful response reception, the client can use the included upload URL to transmit the file data (Section 6) and resume the upload at any stage if an interruption occurs. The client should inspect the response for the Upload-Limit header field, which would indicate limits applying to the remaining upload procedure.

+

The retransmission of file data or the ultimate upload failure that can happen with an "optimistic upload creation" is therefore avoided at the expense of an additional request that does not carry file data.

+

This approach is best suited if the client cannot receive intermediate responses, e.g. due to a limitation in the provided HTTP interface, or if large files are transferred where the cost of the additional request is minuscule compared to the effort of transferring the large file itself.

+
+
+
+
+
+
+

+18. Security Considerations +

+

The upload resource URL is the identifier used for modifying the upload. Without further protection of this URL, an attacker may obtain information about an upload, append data to it, or cancel it. To prevent this, the server SHOULD ensure that only authorized clients can access the upload resource. In addition, the upload resource URL SHOULD be generated in such a way that makes it hard to be guessed by unauthorized clients.

+

Some servers or intermediaries provide scanning of content uploaded by clients. Any scanning mechanism that relies on receiving a complete file in a single request message can be defeated by resumable uploads because content can be split across multiple messages. Servers or intermediaries wishing to perform content scanning SHOULD consider how resumable uploads can circumvent scanning and take appropriate measures. Possible strategies include waiting for the upload to complete before scanning a full file, or disabling resumable uploads.

+

Resumable uploads are vulnerable to Slowloris-style attacks [SLOWLORIS]. A malicious client may create upload resources and keep them alive by regularly sending PATCH requests with no or small content to the upload resources. This could be abused to exhaust server resources by creating and holding open uploads indefinitely with minimal work.

+

Servers SHOULD provide mitigations for Slowloris attacks, such as increasing the maximum number of clients the server will allow, limiting the number of uploads a single client is allowed to make, imposing restrictions on the minimum transfer speed an upload is allowed to have, and restricting the length of time an upload resource can exist.

+
+
+
+
+

+19. IANA Considerations +

+

IANA is asked to register the following entries in the "Hypertext Transfer Protocol (HTTP) Field Name Registry":

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Table 1
Field NameStatusReference
Upload-Completepermanent + Section 8.3 of this document
Upload-Offsetpermanent + Section 8.1 of this document
Upload-Limitpermanent + Section 8.2 of this document
Upload-Lengthpermanent + Section 8.4 of this document
+

IANA is asked to register the following entry in the "HTTP Status Codes" registry:

+
+
Value:
+
+

104 (suggested value)

+
+
+
Description:
+
+

Upload Resumption Supported

+
+
+
Specification:
+
+

This document

+
+
+
+

IANA is asked to register the following entry in the "Media Types" registry:

+
+
Type name:
+
+

application

+
+
+
Subtype name:
+
+

partial-upload

+
+
+
Required parameters:
+
+

N/A

+
+
+
Optional parameters:
+
+

N/A

+
+
+
Encoding considerations:
+
+

binary

+
+
+
Security considerations:
+
+

see Section 18 of this document

+
+
+
Interoperability considerations:
+
+

N/A

+
+
+
Published specification:
+
+

This document

+
+
+
Applications that use this media type:
+
+

Applications that transfer files over unreliable networks or want pausable and resumable uploads.

+
+
+
Fragment identifier considerations:
+
+

N/A

+
+
+
+

Additional information:

+
    +
  • +

    Deprecated alias names for this type: N/A

    +
  • +
  • +

    Magic number(s): N/A

    +
  • +
  • +

    File extension(s): N/A

    +
  • +
  • +

    Macintosh file type code(s): N/A

    +
  • +
  • +

    Windows Clipboard Name: N/A

    +
  • +
+
+
Person and email address to contact for further information:
+
+

See the Authors' Addresses section of this document.

+
+
+
Intended usage:
+
+

COMMON

+
+
+
Restrictions on usage:
+
+

N/A

+
+
+
Author:
+
+

See the Authors' Addresses section of this document.

+
+
+
Change controller:
+
+

IETF

+
+
+
+

IANA is asked to register the following entry in the "HTTP Problem Types" registry:

+
+
Type URI:
+
+

https://iana.org/assignments/http-problem-types#mismatching-upload-offset +Title:

+
+
+
+
+

Mismatching Upload Offset +Recommended HTTP status code:

+
+
+
+
+

409 +Reference:

+
+
+
+
+

This document

+
+
+
+

IANA is asked to register the following entry in the "HTTP Problem Types" registry:

+
+
Type URI:
+
+

https://iana.org/assignments/http-problem-types#completed-upload +Title:

+
+
+
+
+

Upload Is Completed +Recommended HTTP status code:

+
+
+
+
+

400 +Reference:

+
+
+
+
+

This document

+
+
+
+
+
+
+
+

+20. References +

+
+
+

+20.1. Normative References +

+
+
[CACHING]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Caching", STD 98, RFC 9111, DOI 10.17487/RFC9111, , <https://www.rfc-editor.org/rfc/rfc9111>.
+
+
[DIGEST-FIELDS]
+
+Polli, R. and L. Pardue, "Digest Fields", RFC 9530, DOI 10.17487/RFC9530, , <https://www.rfc-editor.org/rfc/rfc9530>.
+
+
[HTTP]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, , <https://www.rfc-editor.org/rfc/rfc9110>.
+
+
[HTTP/1.1]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP/1.1", STD 99, RFC 9112, DOI 10.17487/RFC9112, , <https://www.rfc-editor.org/rfc/rfc9112>.
+
+
[PROBLEM]
+
+Nottingham, M., Wilde, E., and S. Dalal, "Problem Details for HTTP APIs", RFC 9457, DOI 10.17487/RFC9457, , <https://www.rfc-editor.org/rfc/rfc9457>.
+
+
[RFC2119]
+
+Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/rfc/rfc2119>.
+
+
[RFC5789]
+
+Dusseault, L. and J. Snell, "PATCH Method for HTTP", RFC 5789, DOI 10.17487/RFC5789, , <https://www.rfc-editor.org/rfc/rfc5789>.
+
+
[RFC6266]
+
+Reschke, J., "Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP)", RFC 6266, DOI 10.17487/RFC6266, , <https://www.rfc-editor.org/rfc/rfc6266>.
+
+
[RFC8174]
+
+Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/rfc/rfc8174>.
+
+
[STRUCTURED-FIELDS]
+
+Nottingham, M. and P. Kamp, "Structured Field Values for HTTP", RFC 8941, DOI 10.17487/RFC8941, , <https://www.rfc-editor.org/rfc/rfc8941>.
+
+
+
+
+
+
+

+20.2. Informative References +

+
+
[SLOWLORIS]
+
+"RSnake" Hansen, R., "Welcome to Slowloris - the low bandwidth, yet greedy and poisonous HTTP client!", , <https://web.archive.org/web/20150315054838/http://ha.ckers.org/slowloris/>.
+
+
+
+
+
+
+
+
+

+Appendix A. Informational Response +

+

The server is allowed to respond to upload creation (Section 4) requests with a 104 (Upload Resumption Supported) intermediate response as soon as it has validated the request. This way, the client knows that the server supports resumable uploads before the complete response is received. The benefit is that clients can defer starting the actual data transfer until the server indicates full support (i.e. resumable uploads are supported, the provided upload URL is active, etc.).
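To illustrate, a client reading the raw response stream could collect interim responses and pick up the upload resource URL from the first 104's Location field before the final response arrives. This is a hedged sketch, not a normative client; the helper names are illustrative and the parser ignores response bodies:

```python
# Minimal sketch (an assumption, not a normative client): scan an HTTP/1.1
# response stream for interim responses, capturing the upload resource URL
# from a 104's Location field before the final status arrives.
def read_responses(raw: bytes):
    """Split a response byte stream into (status, headers) pairs,
    one per interim response plus the final response head."""
    responses = []
    rest = raw
    while rest:
        head, _, rest = rest.partition(b"\r\n\r\n")
        lines = head.decode("ascii").split("\r\n")
        status = int(lines[0].split(" ")[1])
        headers = {}
        for line in lines[1:]:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        responses.append((status, headers))
        if status >= 200:          # final response reached; stop (body ignored here)
            break
    return responses

def upload_url(responses):
    """Return the upload resource URL announced by a 104 interim response, if any."""
    for status, headers in responses:
        if status == 104 and "location" in headers:
            return headers["location"]
    return None
```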

+

However, support for intermediate responses (the 1XX range) in existing software is limited or absent entirely. Such software includes proxies, firewalls, browsers, and HTTP libraries for clients and servers. Therefore, the 104 (Upload Resumption Supported) status code is optional and not mandatory for the successful completion of an upload. Otherwise, it might be impossible in some cases to implement resumable upload servers using existing software packages. Furthermore, as parts of the current internet infrastructure have limited support for intermediate responses, successful delivery of a 104 (Upload Resumption Supported) from the server to the client should not be assumed.

+

We hope that support for intermediate responses increases in the near future, to allow a wider usage of 104 (Upload Resumption Supported).

+
+
+
+
+

+Appendix B. Feature Detection +

+

This specification includes a section about feature detection (it was called service discovery in earlier discussions, but this name is probably ill-suited). The idea is to allow resumable uploads to be transparently implemented by HTTP clients. This means that application developers just keep using the same API of their HTTP library as they have done in the past with traditional, non-resumable uploads. Once the HTTP library gets updated (e.g. because mobile OS or browsers start implementing resumable uploads), the HTTP library can transparently decide to use resumable uploads without explicit configuration by the application developer. Of course, in order to use resumable uploads, the HTTP library needs to know whether the server supports resumable uploads. If no support is detected, the HTTP library should use the traditional, non-resumable upload technique. We call this process feature detection.
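The transparent behavior described above might look like the following sketch. It is an assumption-heavy illustration: the use of an OPTIONS probe, the presence of an Upload-Limit field as the support signal, and all function names are invented for this example and not prescribed by this document:

```python
# Illustrative sketch: per-origin caching of resumable-upload support so a
# library can transparently fall back to a plain upload. Using an OPTIONS
# probe and the Upload-Limit field as the support signal is an assumption.
_support_cache = {}  # origin -> bool

def supports_resumable(origin, probe):
    """probe(origin) returns the response header dict of an OPTIONS request."""
    if origin not in _support_cache:
        headers = probe(origin)
        _support_cache[origin] = "upload-limit" in {k.lower() for k in headers}
    return _support_cache[origin]

def upload(origin, data, probe, resumable_upload, plain_upload):
    # Choose the strategy without any explicit configuration by the caller.
    if supports_resumable(origin, probe):
        return resumable_upload(origin, data)
    return plain_upload(origin, data)
```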

+

Ideally, the technique used for feature detection meets the following criteria (there might not be one approach which fits all requirements, so we have to prioritize them):

+
    +
  1. +

    Avoid additional roundtrips by the client, if possible (i.e. an additional HTTP request by the client should be avoided).

    +
  2. +
  3. +

    Be backwards compatible with HTTP/1.1 and existing network infrastructure: this means avoiding new features of HTTP/2, or features which might require changes to existing network infrastructure (e.g. nginx or HTTP libraries).

    +
  4. +
  5. +

    Conserve the user's privacy (i.e. the feature detection should not leak information to third parties about which URLs have been connected to).

    +
  6. +
+

The following approaches have been considered in the past. All but the last were deemed unacceptable and are therefore not included in the specification. This list serves as a reference for the advantages and disadvantages of each approach:

+

Include a support statement in the SETTINGS frame. The SETTINGS frame is a HTTP/2 feature and is sent by the server to the client to exchange information about the current connection. The idea was to include an additional statement in this frame, so the client can detect support for resumable uploads without an additional roundtrip. The problem is that this is not compatible with HTTP/1.1. Furthermore, the SETTINGS frame is intended for information about the current connection (not bound to a request/response) and might not be persisted when transmitted through a proxy.

+

Include a support statement in the DNS record. The client can detect support when resolving a domain name. Of course, DNS is not semantically the correct layer. Also, DNS might not be involved if the record is cached or retrieved from a hosts file.

+

Send a HTTP request to ask for support. This is the easiest approach where the client sends an OPTIONS request and uses the response to determine if the server indicates support for resumable uploads. An alternative is that the client sends the request to a well-known URL to obtain this response, e.g. /.well-known/resumable-uploads. Of course, while being fully backwards-compatible, it requires an additional roundtrip.

+

Include a support statement in previous responses. In many cases, the file upload is not the first time that the client connects to the server. Often additional requests are sent beforehand for authentication, data retrieval etc. The responses for those requests can also include a header field which indicates support for resumable uploads. There are two options: +- Use the standardized Alt-Svc response header field. However, it has been indicated to us that this header field might be reworked in the future and could also be semantically different from our intended usage. +- Use a new response header field Resumable-Uploads: https://example.org/files/* to indicate under which endpoints support for resumable uploads is available.

+

Send a 104 intermediate response to indicate support. The client normally starts a traditional upload and includes a header field indicating that it supports resumable uploads (e.g. Upload-Offset: 0). If the server also supports resumable uploads, it immediately responds with a 104 intermediate response to indicate its support, before further processing the request. This way the client is informed during the upload whether it can resume after possible connection errors or not. While an additional roundtrip is avoided, the problem with this solution is that many HTTP server libraries do not support sending custom 1XX responses and that some proxies may not be able to handle new 1XX status codes correctly.

+

Send a 103 Early Hint response to indicate support. This approach is similar to the previous one, with one exception: instead of a new 104 (Upload Resumption Supported) status code, the existing 103 (Early Hint) status code is used in the intermediate response. The 103 code would then be accompanied by a header field indicating support for resumable uploads (e.g. Resumable-Uploads: 1). It is unclear whether the Early Hints code is appropriate for that, as it is currently only used to indicate resources for prefetching.

+
+
+
+
+

+Appendix C. Upload Metadata +

+

When an upload is created (Section 4), the Content-Type and Content-Disposition header fields are allowed to be included. They are intended to be a standardized way of communicating the file name and file type, if available. However, this is not without controversy. Some argue that since these header fields are already defined in other specifications, it is not necessary to include them here again. Furthermore, the format of the Content-Disposition header field is not defined clearly enough. For example, it is left open which disposition value should be used in the header field. More discussion is needed on whether this approach is suitable.
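A sketch of how a client might attach this metadata to the creation request follows. The helper name is invented for this example; the filename* encoding follows the ext-value syntax of RFC 8187 as used by RFC 6266, and the choice of the "attachment" disposition value is an assumption (as the text notes, the appropriate value is left open):

```python
from urllib.parse import quote

# Sketch of building file-metadata headers for the upload creation request.
# The filename* form uses RFC 8187 ext-value encoding for non-ASCII names;
# this helper is illustrative, not part of the protocol.
def metadata_headers(media_type, filename):
    if filename.isascii():
        disposition = 'attachment; filename="%s"' % filename
    else:
        # ext-value: UTF-8, percent-encoded per RFC 8187
        disposition = "attachment; filename*=UTF-8''" + quote(filename, safe="")
    return {
        "Content-Type": media_type,
        "Content-Disposition": disposition,
    }
```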

+

However, from experience with the tus project, users often ask for a way to communicate the file name and file type. Therefore, we believe it is helpful to explicitly include an approach for doing so.

+
+
+
+
+

+Appendix D. FAQ +

+
    +
  • +

    Are multipart requests supported? Yes, requests whose content is encoded using the multipart/form-data format are implicitly supported. The entire encoded content can be considered as a single file, which is then uploaded using the resumable protocol. The server, of course, must store the delimiter ("boundary") separating each part and must be able to parse the multipart format once the upload is completed.

    +
  • +
+
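The FAQ answer above can be sketched as follows: the multipart body is encoded once, then treated as a single opaque byte sequence that may be split across the creation request and subsequent appends. The boundary value and helper names are illustrative assumptions:

```python
# Sketch: encode a multipart/form-data body once, then treat the encoded
# bytes as one "file" split across a creation request and appends.
BOUNDARY = "d1b1f2a0"  # an illustrative boundary value

def encode_multipart(fields):
    parts = []
    for name, value in fields.items():
        parts.append(
            "--%s\r\nContent-Disposition: form-data; name=\"%s\"\r\n\r\n%s\r\n"
            % (BOUNDARY, name, value)
        )
    parts.append("--%s--\r\n" % BOUNDARY)
    return "".join(parts).encode("utf-8")

def split_for_upload(body, chunk_size):
    """Yield (offset, chunk) pairs for the creation request and appends."""
    for offset in range(0, len(body), chunk_size):
        yield offset, body[offset : offset + chunk_size]
```

Because the server sees the same byte sequence regardless of how it was chunked, it can parse the multipart format only after the upload completes, as the FAQ notes.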
+
+
+
+

+Acknowledgments +

+

This document is based on an Internet-Draft specification written by Jiten Mehta, Stefan Matsson, and the authors of this document.

+

The tus v1 protocol is a specification for a resumable file upload protocol over HTTP. It inspired the early design of this protocol. Members of the tus community helped significantly in the process of bringing this work to the IETF.

+

The authors would like to thank Mark Nottingham for substantive contributions to the text.

+
+
+
+
+

+Changes +

+

This section is to be removed before publishing as an RFC.

+ +
+
+

+Since draft-ietf-httpbis-resumable-upload-04 +

+
    +
  • +

    Clarify implications of Upload-Limit header.

    +
  • +
  • +

    Allow client to fetch upload limits upfront via OPTIONS.

    +
  • +
  • +

    Add guidance on upload creation strategy.

    +
  • +
  • +

    Add Upload-Length header to indicate length during creation.

    +
  • +
  • +

    Describe possible usage of Want-Repr-Digest.

    +
  • +
+
+
+
+
+

+Since draft-ietf-httpbis-resumable-upload-03 +

+
    +
  • +

    Add note about Content-Location for referring to subsequent resources.

    +
  • +
  • +

    Require application/partial-upload for appending to uploads.

    +
  • +
  • +

    Explain handling of content and transfer codings.

    +
  • +
  • +

    Add problem types for mismatching offsets and completed uploads.

    +
  • +
  • +

    Clarify that completed uploads must not be appended to.

    +
  • +
  • +

    Describe interaction with Digest Fields from RFC9530.

    +
  • +
  • +

    Require that upload offset does not decrease over time.

    +
  • +
  • +

    Add Upload-Limit header field.

    +
  • +
  • +

    Increase the draft interop version.

    +
  • +
+
+
+
+
+

+Since draft-ietf-httpbis-resumable-upload-02 +

+
    +
  • +

    Add upload progress notifications via informational responses.

    +
  • +
  • +

    Add security consideration regarding request filtering.

    +
  • +
  • +

    Explain the use of empty requests for creation uploads and appending.

    +
  • +
  • +

    Extend security consideration to include resource exhaustion attacks.

    +
  • +
  • +

    Allow 200 status codes for offset retrieval.

    +
  • +
  • +

    Increase the draft interop version.

    +
  • +
+
+
+
+
+

+Since draft-ietf-httpbis-resumable-upload-01 +

+
    +
  • +

    Replace Upload-Incomplete header with Upload-Complete.

    +
  • +
  • +

    Replace terminology about procedures with HTTP resources.

    +
  • +
  • +

    Increase the draft interop version.

    +
  • +
+
+
+
+
+

+Since draft-ietf-httpbis-resumable-upload-00 +

+
    +
  • +

    Remove Upload-Token and instead use Server-generated upload URL for upload identification.

    +
  • +
  • +

    Require the Upload-Incomplete header field in Upload Creation Procedure.

    +
  • +
  • +

    Increase the draft interop version.

    +
  • +
+
+
+ +
+
+

+Since draft-tus-httpbis-resumable-uploads-protocol-01 +

+
    +
  • +

    Clarifying backtracking and preventing skipping ahead during the Offset Receiving Procedure.

    +
  • +
  • +

    Clients auto-retry 404 is no longer allowed.

    +
  • +
+
+
+
+
+

+Since draft-tus-httpbis-resumable-uploads-protocol-00 +

+
    +
  • +

    Split the Upload Transfer Procedure into the Upload Creation Procedure and the Upload Appending Procedure.

    +
  • +
+
+
+
+
+
+
+

+Authors' Addresses +

+
+
Marius Kleidl (editor)
+
Transloadit
+ +
+
+
Guoye Zhang (editor)
+
Apple Inc.
+ +
+
+
Lucas Pardue (editor)
+
Cloudflare
+ +
+
+
+ + + diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-resumable-upload.txt b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-resumable-upload.txt new file mode 100644 index 000000000..eb23fb8f3 --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-resumable-upload.txt @@ -0,0 +1,1601 @@ + + + + +HTTP M. Kleidl, Ed. +Internet-Draft Transloadit +Intended status: Standards Track G. Zhang, Ed. +Expires: 11 July 2025 Apple Inc. + L. Pardue, Ed. + Cloudflare + 7 January 2025 + + + Resumable Uploads for HTTP + draft-ietf-httpbis-resumable-upload-latest + +Abstract + + HTTP clients often encounter interrupted data transfers as a result + of canceled requests or dropped connections. Prior to interruption, + part of a representation may have been exchanged. To complete the + data transfer of the entire representation, it is often desirable to + issue subsequent requests that transfer only the remainder of the + representation. HTTP range requests support this concept of + resumable downloads from server to client. This document describes a + mechanism that supports resumable uploads from client to server using + HTTP. + +About This Document + + This note is to be removed before publishing as an RFC. + + Status information for this document may be found at + https://datatracker.ietf.org/doc/draft-ietf-httpbis-resumable- + upload/. + + Discussion of this document takes place on the HTTP Working Group + mailing list (mailto:ietf-http-wg@w3.org), which is archived at + https://lists.w3.org/Archives/Public/ietf-http-wg/. Working Group + information can be found at https://httpwg.org/. + + Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/resumable-upload. + +Status of This Memo + + This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF). 
Note that other groups may also distribute + working documents as Internet-Drafts. The list of current Internet- + Drafts is at https://datatracker.ietf.org/drafts/current/. + + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." + + This Internet-Draft will expire on 11 July 2025. + +Copyright Notice + + Copyright (c) 2025 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents (https://trustee.ietf.org/ + license-info) in effect on the date of publication of this document. + Please review these documents carefully, as they describe your rights + and restrictions with respect to this document. Code Components + extracted from this document must include Revised BSD License text as + described in Section 4.e of the Trust Legal Provisions and are + provided without warranty as described in the Revised BSD License. + +Table of Contents + + 1. Introduction + 2. Conventions and Definitions + 3. Overview + 3.1. Example 1: Complete upload of file with known size + 3.2. Example 2: Upload as a series of parts + 4. Upload Creation + 4.1. Feature Detection + 4.2. Draft Version Identification + 5. Offset Retrieval + 6. Upload Append + 7. Upload Cancellation + 8. Header Fields + 8.1. Upload-Offset + 8.2. Upload-Limit + 8.3. Upload-Complete + 8.4. Upload-Length + 9. Media Type application/partial-upload + 10. Problem Types + 10.1. Mismatching Offset + 10.2. Completed Upload + 11. Offset values + 12. Redirection + 13. Content Codings + 14. Transfer Codings + 15. Integrity Digests + 15.1. Representation Digests + 15.2. Content Digests + 16. Subsequent Resources + 17. Upload Strategies + 17.1. Optimistic Upload Creation + 17.1.1. 
Upgrading To Resumable Uploads + 17.2. Careful Upload Creation + 18. Security Considerations + 19. IANA Considerations + 20. References + 20.1. Normative References + 20.2. Informative References + Appendix A. Informational Response + Appendix B. Feature Detection + Appendix C. Upload Metadata + Appendix D. FAQ + Acknowledgments + Changes + Since draft-ietf-httpbis-resumable-upload-05 + Since draft-ietf-httpbis-resumable-upload-04 + Since draft-ietf-httpbis-resumable-upload-03 + Since draft-ietf-httpbis-resumable-upload-02 + Since draft-ietf-httpbis-resumable-upload-01 + Since draft-ietf-httpbis-resumable-upload-00 + Since draft-tus-httpbis-resumable-uploads-protocol-02 + Since draft-tus-httpbis-resumable-uploads-protocol-01 + Since draft-tus-httpbis-resumable-uploads-protocol-00 + Authors' Addresses + +1. Introduction + + HTTP clients often encounter interrupted data transfers as a result + of canceled requests or dropped connections. Prior to interruption, + part of a representation (see Section 3.2 of [HTTP]) might have been + exchanged. To complete the data transfer of the entire + representation, it is often desirable to issue subsequent requests + that transfer only the remainder of the representation. HTTP range + requests (see Section 14 of [HTTP]) support this concept of resumable + downloads from server to client. + + HTTP methods such as POST or PUT can be used by clients to request + processing of representation data enclosed in the request message. + The transfer of representation data from client to server is often + referred to as an upload. Uploads are just as likely as downloads to + suffer from the effects of data transfer interruption. Humans can + play a role in upload interruptions through manual actions such as + pausing an upload. 
Regardless of the cause of an interruption, + servers may have received part of the representation before its + occurrence and it is desirable if clients can complete the data + transfer by sending only the remainder of the representation. The + process of sending additional parts of a representation using + subsequent HTTP requests from client to server is herein referred to + as a resumable upload. + + Connection interruptions are common and the absence of a standard + mechanism for resumable uploads has lead to a proliferation of custom + solutions. Some of those use HTTP, while others rely on other + transfer mechanisms entirely. An HTTP-based standard solution is + desirable for such a common class of problem. + + This document defines an optional mechanism for HTTP that enables + resumable uploads in a way that is backwards-compatible with + conventional HTTP uploads. When an upload is interrupted, clients + can send subsequent requests to query the server state and use this + information to send the remaining data. Alternatively, they can + cancel the upload entirely. Different from ranged downloads, this + protocol does not support transferring different parts of the same + representation in parallel. + +2. Conventions and Definitions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + + The terms Byte Sequence, Item, String, Token, Integer, and Boolean + are imported from [STRUCTURED-FIELDS]. + + The terms "representation", "representation data", "representation + metadata", "content", "client" and "server" are from [HTTP]. + +3. 
Overview + + Resumable uploads are supported in HTTP through use of a temporary + resource, an _upload resource_, that is separate from the resource + being uploaded to (hereafter, the _target resource_) and specific to + that upload. By interacting with the upload resource, a client can + retrieve the current offset of the upload (Section 5), append to the + upload (Section 6), and cancel the upload (Section 7). + + The remainder of this section uses an example of a file upload to + illustrate different interactions with the upload resource. Note, + however, that HTTP message exchanges use representation data (see + Section 8.1 of [HTTP]), which means that resumable uploads can be + used with many forms of content -- not just static files. + +3.1. Example 1: Complete upload of file with known size + + In this example, the client first attempts to upload a file with a + known size in a single HTTP request to the target resource. An + interruption occurs and the client then attempts to resume the upload + using subsequent HTTP requests to the upload resource. + + 1) The client notifies the server that it wants to begin an upload + (Section 4). The server reserves the required resources to accept + the upload from the client, and the client begins transferring the + entire file in the request content. + + An informational response can be sent to the client, which signals + the server's support of resumable upload as well as the upload + resource URL via the Location header field (Section 10.2.2 of + [HTTP]). + + Client Server + | | + | POST | + |------------------------------------------->| + | | + | | Reserve resources + | | for upload + | |-----------------. 
+ | | | + | |<----------------' + | | + | 104 Upload Resumption Supported | + | with upload resource URL | + |<-------------------------------------------| + | | + | Flow Interrupted | + |------------------------------------------->| + | | + + Figure 1: Upload Creation + + 2) If the connection to the server is interrupted, the client might + want to resume the upload. However, before this is possible the + client needs to know the amount of data that the server received + before the interruption. It does so by retrieving the offset + (Section 5) from the upload resource. + + Client Server + | | + | HEAD to upload resource URL | + |------------------------------------------------>| + | | + | 204 No Content with Upload-Offset | + |<------------------------------------------------| + | | + + Figure 2: Offset Retrieval + + 3) The client can resume the upload by sending the remaining file + content to the upload resource (Section 6), appending to the already + stored data in the upload. The Upload-Offset value is included to + ensure that the client and server agree on the offset that the upload + resumes from. + + Client Server + | | + | PATCH to upload resource URL with Upload-Offset | + |------------------------------------------------>| + | | + | 201 Created on completion | + |<------------------------------------------------| + | | + + Figure 3: Upload Append + + 4) If the client is not interested in completing the upload, it can + instruct the upload resource to delete the upload and free all + related resources (Section 7). + + Client Server + | | + | DELETE to upload resource URL | + |------------------------------------------------>| + | | + | 204 No Content on completion | + |<------------------------------------------------| + | | + + Figure 4: Upload Cancellation + +3.2. Example 2: Upload as a series of parts + + In some cases, clients might prefer to upload a file as a series of + parts sent serially across multiple HTTP messages. 
One use case is + to overcome server limits on HTTP message content size. Another use + case is where the client does not know the final size, such as when + file data originates from a streaming source. + + This example shows how the client, with prior knowledge about the + server's resumable upload support, can upload parts of a file + incrementally. + + 1) If the client is aware that the server supports resumable upload, + it can start an upload with the Upload-Complete field value set to + false and the first part of the file. + + Client Server + | | + | POST with Upload-Complete: ?0 | + |------------------------------------------------>| + | | + | 201 Created with Upload-Complete: ?0 | + | and Location on completion | + |<------------------------------------------------| + | | + + Figure 5: Incomplete Upload Creation + + 2) Subsequently, parts are appended (Section 6). The last part of + the upload has a Upload-Complete field value set to true to indicate + the complete transfer. + + Client Server + | | + | PATCH to upload resource URL with | + | Upload-Offset and Upload-Complete: ?1 | + |------------------------------------------------>| + | | + | 201 Created on completion | + |<------------------------------------------------| + | | + + Figure 6: Upload Append Last Chunk + +4. Upload Creation + + When a resource supports resumable uploads, the first step is + creating the upload resource. To be compatible with the widest range + of resources, this is accomplished by including the Upload-Complete + header field in the request that initiates the upload. + + As a consequence, resumable uploads support all HTTP request methods + that can carry content, such as POST, PUT, and PATCH. Similarly, the + response to the upload request can have any status code. Both the + method(s) and status code(s) supported are determined by the + resource. + + Upload-Complete MUST be set to false if the end of the request + content is not the end of the upload. 
Otherwise, it MUST be set to + true. This header field can be used for request identification by a + server. The request MUST NOT include the Upload-Offset header field. + + If the request is valid, the server SHOULD create an upload resource. + Then, the server MUST include the Location header field in the + response and set its value to the URL of the upload resource. The + client MAY use this URL for offset retrieval (Section 5), upload + append (Section 6), and upload cancellation (Section 7). + + Once the upload resource is available and while the request content + is being uploaded, the target resource MAY send one or more + informational responses with a 104 (Upload Resumption Supported) + status code to the client. In the first informational response, the + Location header field MUST be set to the URL pointing to the upload + resource. In subsequent informational responses, the Location header + field MUST NOT be set. An informational response MAY contain the + Upload-Offset header field with the current upload offset as the + value to inform the client about the upload progress. + + The server MUST send the Upload-Offset header field in the response + if it considers the upload active, either when the response is a + success (e.g. 201 (Created)), or when the response is a failure (e.g. + 409 (Conflict)). The Upload-Offset field value MUST be equal to the + end offset of the entire upload, or the begin offset of the next + chunk if the upload is still incomplete. The client SHOULD consider + the upload failed if the response has a status code that indicates a + success but the offset indicated in the Upload-Offset field value + does not equal the total of begin offset plus the number of bytes + uploaded in the request. + + If the request completes successfully and the entire upload is + complete, the server MUST acknowledge it by responding with a 2xx + (Successful) status code. Servers are RECOMMENDED to use 201 + (Created) unless otherwise specified. 
The response MUST NOT include + the Upload-Complete header field with the value of false. + + If the request completes successfully but the entire upload is not + yet complete, as indicated by an Upload-Complete field value of false + in the request, the server MUST acknowledge it by responding with the + 201 (Created) status code and an Upload-Complete header value set to + false. + + The request can indicate the upload's final size in two different + ways. Both indicators may be present in the same request as long as + they convey the same size. If the sizes are inconsistent, the server + MUST reject the request by responding with a 400 (Bad Request) status + code. + + * If the request includes an Upload-Complete field value set to true + and a valid Content-Length header field, the client attempts to + upload a fixed-length resource in one request. In this case, the + upload's final size is the Content-Length field value and the + server MUST record it to ensure its consistency. The value can + therefore not be used if the upload is split across multiple + requests. + + * If the request includes the Upload-Length header field, the server + MUST record its value as the upload's final size. A client SHOULD + provide this header field if the upload length is known at the + time of upload creation. + + The upload is not automatically completed if the offset reaches the + upload's final size. Instead, a client MUST indicate the completion + of an upload through the Upload-Complete header field. Indicating an + upload's final size can help the server allocate necessary resources + for the upload and provide early feedback if the size does not match + the server's limits (Section 8.2). + + The server MAY enforce a maximum size of an upload resource. This + limit MAY be equal to the upload's final size, if available, or an + arbitrary value. The limit's value or its existence MUST NOT change + throughout the lifetime of the upload resource. 
The server MAY
 + indicate such a limit to the client by including the Upload-Limit
 + header field in the informational or final response to upload
 + creation. If the client receives an Upload-Limit header field
 + indicating that the maximum size is less than the number of bytes it
 + intends to upload to a resource, it SHOULD stop the current upload
 + transfer immediately and cancel the upload (Section 7).
 +
 + The request content MAY be empty. If the Upload-Complete header
 + field is then set to true, the client intends to upload an empty
 + resource representation. An Upload-Complete header field set to
 + false is also valid. This can be used to create an upload resource
 + URL before transferring data, which can save client or server
 + resources. Since informational responses are optional, this
 + technique provides another mechanism to learn the URL, at the cost of
 + an additional round-trip before data upload can commence.
 +
 + If the server does not receive the entire request content, for
 + example because of canceled requests or dropped connections, it
 + SHOULD append as much of the request content starting at its
 + beginning and without discontinuities as possible. If the server did
 + not append the entire request content, the upload MUST NOT be
 + considered complete.
 +
 + POST /upload HTTP/1.1
 + Host: example.com
 + Upload-Draft-Interop-Version: 6
 + Upload-Complete: ?1
 + Content-Length: 100
 + Upload-Length: 100
 +
 + [content (100 bytes)]
 +
 + HTTP/1.1 104 Upload Resumption Supported
 + Upload-Draft-Interop-Version: 6
 + Location: https://example.com/upload/b530ce8ff
 +
 + HTTP/1.1 104 Upload Resumption Supported
 + Upload-Draft-Interop-Version: 6
 + Upload-Offset: 50
 + Upload-Limit: max-size=1000000000
 +
 + HTTP/1.1 201 Created
 + Location: https://example.com/upload/b530ce8ff
 + Upload-Offset: 100
 + Upload-Limit: max-size=1000000000
 +
 + The next example shows an upload creation, where only the first 25
 + bytes of a 100-byte upload are transferred. 
The server acknowledges
 + the received data and that the upload is not complete yet:
 +
 + POST /upload HTTP/1.1
 + Host: example.com
 + Upload-Draft-Interop-Version: 6
 + Upload-Complete: ?0
 + Content-Length: 25
 + Upload-Length: 100
 +
 + [partial content (25 bytes)]
 +
 + HTTP/1.1 201 Created
 + Location: https://example.com/upload/b530ce8ff
 + Upload-Complete: ?0
 + Upload-Offset: 25
 + Upload-Limit: max-size=1000000000
 +
 + If the client received an informational response with the upload URL
 + in the Location field value, it MAY automatically attempt upload
 + resumption when the connection is terminated unexpectedly, or if a
 + 5xx status is received. The client SHOULD NOT automatically retry if
 + it receives a 4xx status code.
 +
 + File metadata can affect how servers might act on the uploaded file.
 + Clients can send representation metadata (see Section 8.3 of [HTTP])
 + in the request that starts an upload. Servers MAY interpret this
 + metadata or MAY ignore it. The Content-Type header field
 + (Section 8.3 of [HTTP]) can be used to indicate the media type of the
 + file. The applied content codings are specified using the Content-
 + Encoding header field and are retained throughout the entire upload.
 + When resuming an interrupted upload, the same content codings are
 + used for appending to the upload, producing a representation of the
 + upload resource. The Content-Disposition header field ([RFC6266])
 + can be used to transmit a filename; if included, the parameters
 + SHOULD be limited to filename, filename*, or boundary.
 +
4.1. Feature Detection
 +
 + If the client has no knowledge of whether the resource supports
 + resumable uploads, a resumable request can be used with some
 + additional constraints. In particular, the Upload-Complete field
 + value (Section 8.3) MUST NOT be false if the server support is
 + unclear. This allows the upload to function as if it is a regular
 + upload. 
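The feature-detection constraint above can be expressed as a small client-side helper. The following Python sketch is illustrative only; the function name and its shape are assumptions, not part of this specification. It builds the header fields for an upload creation request and refuses to mark the upload as incomplete when server support is unclear:

```python
from typing import Optional

def creation_headers(server_support_known: bool, upload_complete: bool,
                     upload_length: Optional[int] = None) -> dict:
    # When server support is unclear, Upload-Complete must not be false,
    # so a server without resumable-upload support simply treats the
    # request as a regular, non-resumable upload.
    if not server_support_known and not upload_complete:
        raise ValueError("Upload-Complete must not be false "
                         "unless server support is known")
    headers = {"Upload-Complete": "?1" if upload_complete else "?0"}
    if upload_length is not None:
        # Announce the final size if it is known at creation time.
        headers["Upload-Length"] = str(upload_length)
    return headers
```

A client that has seen a 104 (Upload Resumption Supported) response, or knows of support out-of-band, may pass server_support_known=True and split the upload across requests.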
+
 +
 + Servers SHOULD use the 104 (Upload Resumption Supported)
 + informational response to indicate their support for a resumable
 + upload request.
 +
 + Clients MUST NOT attempt to resume an upload unless they receive a
 + 104 (Upload Resumption Supported) informational response, or have
 + other out-of-band methods to determine server support for resumable
 + uploads.
 +
4.2. Draft Version Identification
 +
 + *RFC Editor's Note:* Please remove this section and Upload-Draft-
 + Interop-Version from all examples prior to publication of a final
 + version of this document.
 +
 + The current interop version is 6.
 +
 + Client implementations of draft versions of the protocol MUST send a
 + header field Upload-Draft-Interop-Version with the interop version as
 + its value in their requests. The Upload-Draft-Interop-Version field
 + value is an Integer.
 +
 + Server implementations of draft versions of the protocol MUST NOT
 + send a 104 (Upload Resumption Supported) informational response when
 + the interop version indicated by the Upload-Draft-Interop-Version
 + header field in the request is missing or mismatching.
 +
 + Server implementations of draft versions of the protocol MUST also
 + send a header field Upload-Draft-Interop-Version with the interop
 + version as its value in the 104 (Upload Resumption Supported)
 + informational response.
 +
 + Client implementations of draft versions of the protocol MUST ignore
 + a 104 (Upload Resumption Supported) informational response with a
 + missing or mismatching interop version indicated by the Upload-Draft-
 + Interop-Version header field.
 +
 + The reason both the client and the server are sending and checking
 + the draft version is to ensure that implementations of the final RFC
 + will not accidentally interop with draft implementations, as they
 + will not check the existence of the Upload-Draft-Interop-Version
 + header field.
 +
5. 
Offset Retrieval
 +
 + If an upload is interrupted, the client MAY attempt to fetch the
 + offset of the incomplete upload by sending a HEAD request to the
 + upload resource.
 +
 + The request MUST NOT include an Upload-Offset, Upload-Complete, or
 + Upload-Length header field. The server MUST reject requests with
 + any of these fields by responding with a 400 (Bad Request) status
 + code.
 +
 + If the server considers the upload resource to be active, it MUST
 + respond with a 204 (No Content) or 200 (OK) status code. The
 + response MUST include the Upload-Offset header field, with the value
 + set to the current resumption offset for the target resource. The
 + response MUST include the Upload-Complete header field; the value is
 + set to true only if the upload is complete. The response MUST
 + include the Upload-Length header field set to the upload's final size
 + if one was recorded during the upload creation (Section 4). The
 + response MAY include the Upload-Limit header field if corresponding
 + limits on the upload resource exist.
 +
 + An upload is considered complete only if the server completely and
 + successfully received a corresponding creation request (Section 4) or
 + append request (Section 6) with the Upload-Complete header value set
 + to true.
 +
 + The client MUST NOT perform offset retrieval while creation
 + (Section 4) or append (Section 6) is in progress.
 +
 + The offset MUST be accepted by a subsequent append (Section 6). Due
 + to network delay and reordering, the server might still be receiving
 + data from an ongoing transfer for the same upload resource, which
 + from the client's perspective has failed. The server MAY terminate
 + any transfers for the same upload resource before sending the
 + response by abruptly terminating the HTTP connection or stream.
 + Alternatively, the server MAY keep the ongoing transfer alive but
 + ignore further bytes received past the offset. 
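The response requirements above can be condensed into a server-side sketch. This Python fragment is illustrative; the function name, its return shape, and the choice of 204 over 200 are assumptions, not requirements of this specification:

```python
from typing import Optional, Tuple

def offset_response(active: bool, offset: int, complete: bool,
                    final_size: Optional[int] = None) -> Tuple[int, dict]:
    # An inactive upload resource yields 404 (Not Found).
    if not active:
        return 404, {}
    headers = {
        # Current resumption offset for the target resource.
        "Upload-Offset": str(offset),
        # True only if the upload is complete.
        "Upload-Complete": "?1" if complete else "?0",
        # Keep HTTP caches from storing the offset response.
        "Cache-Control": "no-store",
    }
    if final_size is not None:
        # Only present if a final size was recorded at creation.
        headers["Upload-Length"] = str(final_size)
    return 204, headers
```

A real server would additionally terminate or fence off any transfer still in flight for the same upload resource before answering, as described above.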
+
 +
 + The client MUST NOT start more than one append (Section 6) based on
 + the resumption offset from a single offset retrieval request.
 +
 + In order to prevent HTTP caching ([CACHING]), the response SHOULD
 + include a Cache-Control header field with the value no-store.
 +
 + If the server does not consider the upload resource to be active, it
 + MUST respond with a 404 (Not Found) status code.
 +
 + The resumption offset can be less than or equal to the number of
 + bytes the client has already sent. The client MAY reject an offset
 + which is greater than the number of bytes it has already sent during
 + this upload. The client is expected to handle backtracking of a
 + reasonable length. If the offset is invalid for this upload, or if
 + the client cannot backtrack to the offset and reproduce the same
 + content it has already sent, the upload MUST be considered a failure.
 + The client MAY cancel the upload (Section 7) after rejecting the
 + offset.
 +
 + The following example shows an offset retrieval request. The server
 + indicates the new offset and that the upload is not complete yet:
 +
 + HEAD /upload/b530ce8ff HTTP/1.1
 + Host: example.com
 + Upload-Draft-Interop-Version: 6
 +
 + HTTP/1.1 204 No Content
 + Upload-Offset: 100
 + Upload-Complete: ?0
 + Cache-Control: no-store
 +
 + The client SHOULD NOT automatically retry if a 4xx (Client Error)
 + status code is received.
 +
6. Upload Append
 +
 + Upload appending is used for resuming an existing upload.
 +
 + The request MUST use the PATCH method with the application/partial-
 + upload media type and MUST be sent to the upload resource. The
 + Upload-Offset field value (Section 8.1) MUST be set to the resumption
 + offset.
 +
 + If the end of the request content is not the end of the upload, the
 + Upload-Complete field value (Section 8.3) MUST be set to false.
 +
 + The server SHOULD respect representation metadata received during
 + creation (Section 4). 
An upload append request continues uploading + the same representation as used in the upload creation (Section 4) + and thus uses the same content codings, if they were applied. For + example, if the initial upload creation included the Content- + Encoding: gzip header field, the upload append request resumes the + transfer of the gzipped data without indicating again that the gzip + coding is applied. + + If the server does not consider the upload associated with the upload + resource active, it MUST respond with a 404 (Not Found) status code. + + The client MUST NOT perform multiple upload transfers for the same + upload resource in parallel. This helps avoid race conditions, and + data loss or corruption. The server is RECOMMENDED to take measures + to avoid parallel upload transfers: The server MAY terminate any + creation (Section 4) or append for the same upload URL. Since the + client is not allowed to perform multiple transfers in parallel, the + server can assume that the previous attempt has already failed. + Therefore, the server MAY abruptly terminate the previous HTTP + connection or stream. + + If the offset indicated by the Upload-Offset field value does not + match the offset provided by the immediate previous offset retrieval + (Section 5), or the end offset of the immediate previous incomplete + successful transfer, the server MUST respond with a 409 (Conflict) + status code. The server MAY use the problem type [PROBLEM] of + "https://iana.org/assignments/http-problem-types#mismatching-upload- + offset" in the response; see Section 10.1. + + The server applies the patch document of the application/partial- + upload media type by appending the request content to the targeted + upload resource. If the server does not receive the entire patch + document, for example because of canceled requests or dropped + connections, it SHOULD append as much of the patch document starting + at its beginning and without discontinuities as possible. 
Appending + a continuous section starting at the patch document's beginning + constitutes a successful PATCH as defined in Section 2 of [RFC5789]. + If the server did not receive and apply the entire patch document, + the upload MUST NOT be considered complete. + + While the request content is being uploaded, the target resource MAY + send one or more informational responses with a 104 (Upload + Resumption Supported) status code to the client. These informational + responses MUST NOT contain the Location header field. They MAY + include the Upload-Offset header field with the current upload offset + as the value to inform the client about the upload progress. + + The server MUST send the Upload-Offset header field in the response + if it considers the upload active, either when the response is a + success (e.g. 201 (Created)), or when the response is a failure (e.g. + 409 (Conflict)). The value MUST be equal to the end offset of the + entire upload, or the begin offset of the next chunk if the upload is + still incomplete. The client SHOULD consider the upload failed if + the status code indicates a success but the offset indicated by the + Upload-Offset field value does not equal the total of begin offset + plus the number of bytes uploaded in the request. + + If the upload is already complete, the server MUST NOT modify the + upload resource and MUST respond with a 400 (Bad Request) status + code. The server MAY use the problem type [PROBLEM] of + "https://iana.org/assignments/http-problem-types#completed-upload" in + the response; see Section 10.2. + + If the request completes successfully and the entire upload is + complete, the server MUST acknowledge it by responding with a 2xx + (Successful) status code. Servers are RECOMMENDED to use a 201 + (Created) response if not otherwise specified. The response MUST NOT + include the Upload-Complete header field with the value set to false. 
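The client-side offset check described above can be sketched as a small predicate. The function name and parameter names below are illustrative, not part of this specification:

```python
def append_succeeded(status: int, begin_offset: int,
                     bytes_uploaded: int, upload_offset: int) -> bool:
    # A failure status (e.g. 409 Conflict) never counts as success.
    if not 200 <= status < 300:
        return False
    # A success status is only trusted if the Upload-Offset in the
    # response equals the begin offset plus the bytes uploaded in
    # this request; otherwise the client should consider it failed.
    return upload_offset == begin_offset + bytes_uploaded
```

For example, an append of 100 bytes at offset 100 succeeds only if the response carries Upload-Offset: 200.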
+
 +
 + If the request completes successfully but the entire upload is not
 + yet complete, as indicated by the Upload-Complete field value set to
 + false, the server MUST acknowledge it by responding with a 201
 + (Created) status code and the Upload-Complete field value set to
 + false.
 +
 + If the request includes the Upload-Complete field value set to true
 + and a valid Content-Length header field, the client attempts to
 + upload the remaining resource in one request. In this case, the
 + upload's final size is the sum of the upload's offset and the
 + Content-Length field value. If the server does not have a record of
 + the upload's final size from creation or the previous append, the
 + server MUST record the upload's final size to ensure its consistency.
 + If the server does have a previous record, that value MUST match the
 + upload's final size. If they do not match, the server MUST reject
 + the request with a 400 (Bad Request) status code.
 +
 + The server MUST prevent the offset from exceeding the upload's final
 + size when appending. If a final size has been recorded and the
 + upload append request exceeds this value, the server MUST stop
 + appending bytes to the upload once the offset reaches the final size
 + and reject the request with a 400 (Bad Request) status code. It is
 + not sufficient to rely on the Content-Length header field for
 + enforcement because the header field might not be present.
 +
 + The request content MAY be empty. If the Upload-Complete field is
 + then set to true, the client wants to complete the upload without
 + appending additional data.
 +
 + The following example shows an upload append. The client transfers
 + the next 100 bytes at an offset of 100 and does not indicate that the
 + upload is then completed. 
The server acknowledges the new offset:
 +
 + PATCH /upload/b530ce8ff HTTP/1.1
 + Host: example.com
 + Upload-Offset: 100
 + Upload-Draft-Interop-Version: 6
 + Content-Length: 100
 + Content-Type: application/partial-upload
 +
 + [content (100 bytes)]
 +
 + HTTP/1.1 201 Created
 + Upload-Offset: 200
 +
 + The client MAY automatically attempt upload resumption when the
 + connection is terminated unexpectedly, or if a 5xx (Server Error)
 + status code is received. The client SHOULD NOT automatically retry
 + if a 4xx (Client Error) status code is received.
 +
7. Upload Cancellation
 +
 + If the client wants to terminate the transfer without the ability to
 + resume, it can send a DELETE request to the upload resource. Doing
 + so is an indication that the client is no longer interested in
 + continuing the upload, and that the server can release any resources
 + associated with it.
 +
 + The client MUST NOT initiate cancellation without the knowledge of
 + server support.
 +
 + The request MUST use the DELETE method. The request MUST NOT include
 + an Upload-Offset or Upload-Complete header field. The server MUST
 + reject a request that includes an Upload-Offset or Upload-Complete
 + header field by responding with a 400 (Bad Request) status code.
 +
 + If the server successfully deactivates the upload resource, it MUST
 + respond with a 204 (No Content) status code.
 +
 + The server MAY terminate any in-flight requests to the upload
 + resource before sending the response by abruptly terminating their
 + HTTP connection(s) or stream(s).
 +
 + If the server does not consider the upload resource to be active, it
 + MUST respond with a 404 (Not Found) status code.
 +
 + If the server does not support cancellation, it MUST respond with a
 + 405 (Method Not Allowed) status code.
 +
 + The following example shows an upload cancellation:
 +
 + DELETE /upload/b530ce8ff HTTP/1.1
 + Host: example.com
 + Upload-Draft-Interop-Version: 6
 +
 + HTTP/1.1 204 No Content
 +
8. Header Fields
 +
8.1. 
Upload-Offset
 +
 + The Upload-Offset request and response header field indicates the
 + resumption offset of the corresponding upload, counted in bytes. The
 + Upload-Offset field value is an Integer.
 +
8.2. Upload-Limit
 +
 + The Upload-Limit response header field indicates limits applying to
 + the upload resource. The Upload-Limit field value is a Dictionary.
 + The following limits are defined:
 +
 + * The max-size key specifies a maximum size that an upload resource
 + is allowed to reach, counted in bytes. The value is an Integer.
 +
 + * The min-size key specifies a minimum size for a resumable upload,
 + counted in bytes. The server MAY decline to create an upload
 + resource if the client indicates that the uploaded data is smaller
 + than the minimum size by including the Content-Length and Upload-
 + Complete: ?1 fields, but the server MAY still accept the uploaded
 + data. The value is an Integer.
 +
 + * The max-append-size key specifies a maximum size counted in bytes
 + for the request content in a single upload append request
 + (Section 6). The server MAY reject requests exceeding this limit
 + and a client SHOULD NOT send larger upload append requests. The
 + value is an Integer.
 +
 + * The min-append-size key specifies a minimum size counted in bytes
 + for the request content in a single upload append request
 + (Section 6) that does not complete the upload by setting the
 + Upload-Complete: ?1 field. The server MAY reject non-completing
 + requests below this limit and a client SHOULD NOT send smaller
 + non-completing upload append requests. A server MUST NOT reject
 + an upload append request due to smaller size if the request
 + includes the Upload-Complete: ?1 field. The value is an Integer.
 +
 + * The expires key specifies the remaining lifetime of the upload
 + resource in seconds counted from the generation of the response by
 + the server. 
After the resource's lifetime is reached, the server
 + MAY make the upload resource inaccessible and a client SHOULD NOT
 + attempt to access the upload resource. The lifetime MAY be
 + extended but SHOULD NOT be reduced once the upload resource is
 + created. The value is an Integer.
 +
 + When parsing this header field, unrecognized keys MUST be ignored and
 + MUST NOT cause parsing to fail, in order to facilitate the addition
 + of new limits in the future.
 +
 + A server that supports the creation of a resumable upload resource
 + (Section 4) under a target URI MUST include the Upload-Limit header
 + field with the corresponding limits in a response to an OPTIONS
 + request sent to this target URI. If a server supports the creation
 + of upload resources for any target URI, it MUST include the Upload-
 + Limit header field with the corresponding limits in a response to an
 + OPTIONS request with the * target. The limits announced in an
 + OPTIONS response SHOULD NOT be less restrictive than the limits
 + applied to an upload once the upload resource has been created. If
 + the server does not apply any limits, it MUST use min-size=0 instead
 + of an empty header value. A client can use an OPTIONS request to
 + discover support for resumable uploads and potential limits before
 + creating an upload resource.
 +
8.3. Upload-Complete
 +
 + The Upload-Complete request and response header field indicates
 + whether the corresponding upload is considered complete. The Upload-
 + Complete field value is a Boolean.
 +
 + The Upload-Complete header field MUST only be used if support by the
 + resource is known to the client (Section 4.1).
 +
8.4. Upload-Length
 +
 + The Upload-Length request and response header field indicates the
 + number of bytes to be uploaded for the corresponding upload resource,
 + counted in bytes. The Upload-Length field value is an Integer.
 +
9. 
Media Type application/partial-upload
 +
 + The application/partial-upload media type describes a contiguous
 + block of data that should be uploaded to a resource. There is no
 + minimum block size and the block might be empty. The start and end
 + of the block might align with the start and end of the file that
 + should be uploaded, but they are not required to be aligned.
 +
10. Problem Types
 +
10.1. Mismatching Offset
 +
 + This section defines the "https://iana.org/assignments/http-problem-
 + types#mismatching-upload-offset" problem type [PROBLEM]. A server
 + MAY use this problem type when responding to an upload append request
 + (Section 6) to indicate that the Upload-Offset header field in the
 + request does not match the upload resource's offset.
 +
 + Two problem type extension members are defined: the expected-offset
 + and provided-offset members. A response using this problem type
 + SHOULD populate both members, with the value of expected-offset taken
 + from the upload resource and the value of provided-offset taken from
 + the upload append request.
 +
 + The following shows an example response, where the resource's
 + offset was 100, but the client attempted to append at offset 200:
 +
 + HTTP/1.1 409 Conflict
 + Content-Type: application/problem+json
 +
 + {
 + "type":"https://iana.org/assignments/http-problem-types#mismatching-upload-offset",
 + "title": "offset from request does not match offset of resource",
 + "expected-offset": 100,
 + "provided-offset": 200
 + }
 +
10.2. Completed Upload
 +
 + This section defines the "https://iana.org/assignments/http-problem-
 + types#completed-upload" problem type [PROBLEM]. A server MAY use
 + this problem type when responding to an upload append request
 + (Section 6) to indicate that the upload has already been completed
 + and cannot be modified. 
+
 +
 + The following shows an example response:
 +
 + HTTP/1.1 400 Bad Request
 + Content-Type: application/problem+json
 +
 + {
 + "type":"https://iana.org/assignments/http-problem-types#completed-upload",
 + "title": "upload is already completed"
 + }
 +
11. Offset values
 +
 + The offset of an upload resource is the number of bytes that have
 + been appended to the upload resource. Appended data cannot be
 + removed from an upload and, therefore, the upload offset MUST NOT
 + decrease. A server MUST NOT generate responses containing an Upload-
 + Offset header field with a value that is smaller than was included in
 + previous responses for the same upload resource. This includes
 + informational and final responses for upload creation (Section 4),
 + upload appending (Section 6), and offset retrieval (Section 5).
 +
 + If a server loses data that has been appended to an upload, it MUST
 + consider the upload resource invalid and reject further use of the
 + upload resource. The Upload-Offset header field in responses serves
 + as an acknowledgement of the append operation and as a guarantee that
 + no retransmission of the data will be necessary. Clients can use
 + this guarantee to free resources associated with already uploaded
 + data while the upload is still ongoing.
 +
12. Redirection
 +
 + The 301 (Moved Permanently) and 302 (Found) status codes MUST NOT be
 + used in offset retrieval (Section 5) and upload cancellation
 + (Section 7) responses. For other responses, the upload resource MAY
 + return a 308 (Permanent Redirect) status code and clients SHOULD use
 + the new permanent URI for subsequent requests. If the client
 + receives a 307 (Temporary Redirect) response to an offset retrieval
 + (Section 5) request, it MAY apply the redirection directly in an
 + immediate subsequent upload append (Section 6).
 +
13. 
Content Codings
 +
 + Since the codings listed in Content-Encoding are a characteristic of
 + the representation (see Section 8.4 of [HTTP]), both the client and
 + the server always compute the values for Upload-Offset and optionally
 + Upload-Length on the content coded data (that is, the representation
 + data). Moreover, the content codings are retained throughout the
 + entire upload, meaning that the server is not required to decode the
 + representation data to support resumable uploads. See Appendix A of
 + [DIGEST-FIELDS] for more information.
 +
14. Transfer Codings
 +
 + Unlike Content-Encoding (see Section 8.4.1 of [HTTP]), Transfer-
 + Encoding (see Section 6.1 of [HTTP/1.1]) is a property of the
 + message, not of the representation. Moreover, transfer codings can
 + be applied in transit (e.g., by proxies). This means that a client
 + does not have to consider the transfer codings to compute the upload
 + offset, while a server is responsible for transfer decoding the
 + message before computing the upload offset. The same applies to the
 + value of Upload-Length. Please note that the Content-Length header
 + field cannot be used in conjunction with the Transfer-Encoding header
 + field.
 +
15. Integrity Digests
 +
 + The integrity of an entire upload or individual upload requests can
 + be verified using digests from [DIGEST-FIELDS].
 +
15.1. Representation Digests
 +
 + Representation digests help verify the integrity of the entire data
 + that has been uploaded so far, which might stretch across multiple
 + requests.
 +
 + If the client knows the integrity digest of the entire data before
 + creating an upload resource, it MAY include the Repr-Digest header
 + field when creating an upload (Section 4). Once the upload is
 + completed, the server can compute the integrity digest of the
 + received upload representation and compare it to the provided digest.
 + If the digests don't match, the server SHOULD consider the transfer
 + failed and not process the uploaded data further. 
This way, the
 + integrity of the entire uploaded data can be protected.
 +
 + Alternatively, when creating an upload (Section 4), the client MAY
 + ask the server to compute and return the integrity digests using a
 + Want-Repr-Digest field conveying the preferred algorithms. The
 + response SHOULD include at least one of the requested digests, but
 + MAY include none of them. The server SHOULD compute the
 + representation digests using the preferred algorithms once the upload
 + is complete and include the corresponding Repr-Digest header field in
 + the response. Alternatively, the server MAY compute the digest
 + continuously during the upload and include the Repr-Digest header
 + field in responses to upload creation (Section 4) and upload
 + appending requests (Section 6) even when the upload is not completed
 + yet. This allows the client to simultaneously compute the digest of
 + the transmitted upload data, compare its digest to the server's
 + digest, and spot data integrity issues. If an upload is spread
 + across multiple requests, data integrity issues can be found even
 + before the upload is fully completed.
 +
15.2. Content Digests
 +
 + Content digests help verify the integrity of the content in an
 + individual request.
 +
 + If the client knows the integrity digest of the content from an
 + upload creation (Section 4) or upload appending (Section 6) request,
 + it MAY include the Content-Digest header field in the request. Once
 + the content has been received, the server can compute the integrity
 + digest of the received content and compare it to the provided digest.
 + If the digests don't match, the server SHOULD consider the transfer
 + failed and not append the content to the upload resource. This way,
 + the integrity of an individual request can be protected.
 +
16. Subsequent Resources
 +
 + The server might process the uploaded data and make its results
 + available in another resource during or after the upload. 
This + subsequent resource is different from the upload resource created by + the upload creation request (Section 4). The subsequent resource + does not handle the upload process itself, but instead facilitates + further interaction with the uploaded data. The server MAY indicate + the location of this subsequent resource by including the Content- + Location header field in the informational or final responses + generated while creating (Section 4), appending to (Section 6), or + retrieving the offset (Section 5) of an upload. For example, a + subsequent resource could allow the client to fetch information + extracted from the uploaded data. + +17. Upload Strategies + + The definition of the upload creation request (Section 4) provides + the client with flexibility to choose whether the file is fully or + partially transferred in the first request, or if no file data is + included at all. Which behavior is best largely depends on the + client's capabilities, its intention to avoid data re-transmission, + and its knowledge about the server's support for resumable uploads. + + The following subsections describe two typical upload strategies that + are suited for common environments. Note that these modes are never + explicitly communicated to the server and clients are not required to + stick to one strategy, but can mix and adapt them to their needs. + +17.1. Optimistic Upload Creation + + An "optimistic upload creation" can be used independent of the + client's knowledge about the server's support for resumable uploads. + However, the client must be capable of handling and processing + interim responses. An upload creation request then includes the full + file because the client anticipates that the file will be transferred + without interruptions or resumed if an interruption occurs. + + The benefit of this method is that if the upload creation request + succeeds, the file was transferred in a single request without + additional round trips. 
+
 +
 + A possible drawback is that the client might be unable to resume an
 + upload. If an upload is interrupted before the client has received a
 + 104 (Upload Resumption Supported) intermediate response with the
 + upload URL, the client cannot resume that upload due to the missing
 + upload URL. The intermediate response might not be received if the
 + interruption happens too early in the message exchange, the server
 + does not support resumable uploads at all, the server does not
 + support sending the 104 (Upload Resumption Supported) intermediate
 + response, or an intermediary dropped the intermediate response.
 + Without a 104 response, the client needs to either treat the upload
 + as failed or retry the entire upload creation request if this is
 + allowed by the application.
 +
17.1.1. Upgrading To Resumable Uploads
 +
 + Optimistic upload creation allows clients and servers to
 + automatically upgrade non-resumable uploads to resumable ones. In a
 + non-resumable upload, the file is transferred in a single request,
 + usually POST or PUT, without any ability to resume from
 + interruptions. The client can offer the server the option of
 + upgrading such a request to a resumable upload (see Section 4.1) by
 + adding the Upload-Complete: ?1 header field to the original request.
 + The Upload-Length header field SHOULD be added if the upload's final
 + size is known upfront. The request is not changed otherwise.
 +
 + A server that supports resumable uploads at the target URI can create
 + a resumable upload resource and send its upload URL in a 104 (Upload
 + Resumption Supported) intermediate response for the client to resume
 + the upload after interruptions. A server that does not support
 + resumable uploads or does not want to upgrade to a resumable upload
 + for this request ignores the Upload-Complete: ?1 header. The
 + transfer then falls back to a non-resumable upload without additional
 + cost. 
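The upgrade described above only adds header fields to an otherwise unchanged request. A minimal Python sketch (the function name is an illustration, not part of this specification):

```python
from typing import Optional

def upgrade_to_resumable(headers: dict,
                         final_size: Optional[int] = None) -> dict:
    upgraded = dict(headers)  # leave the caller's headers untouched
    # Upload-Complete: ?1 states that the whole file is contained in
    # this request; a server without resumable-upload support simply
    # ignores the field and handles a regular upload.
    upgraded["Upload-Complete"] = "?1"
    if final_size is not None:
        # Announce the final size if it is known upfront.
        upgraded["Upload-Length"] = str(final_size)
    return upgraded
```

For example, upgrading a plain PUT of a 1000-byte file only adds Upload-Complete: ?1 and Upload-Length: 1000 to its header section.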
+
+   This upgrade can also be performed transparently by the client
+   without the user taking an active role.  When a user asks the client
+   to send a non-resumable request, the client can perform the upgrade
+   and handle potential interruptions and resumptions under the hood
+   without involving the user.  The last response received by the
+   client is considered the response for the entire file upload and
+   should be presented to the user.
+
+17.2.  Careful Upload Creation
+
+   For a "careful upload creation" the client knows that the server
+   supports resumable uploads and sends an empty upload creation
+   request without including any file data.  Upon successful response
+   reception, the client can use the included upload URL to transmit
+   the file data (Section 6) and resume the upload at any stage if an
+   interruption occurs.  The client should inspect the response for the
+   Upload-Limit header field, which would indicate limits applying to
+   the remaining upload procedure.
+
+   The retransmission of file data or the ultimate upload failure that
+   can happen with an "optimistic upload creation" is therefore avoided
+   at the expense of an additional request that does not carry file
+   data.
+
+   This approach is best suited if the client cannot receive
+   intermediate responses, e.g. due to a limitation in the provided
+   HTTP interface, or if large files are transferred where the cost of
+   the additional request is minuscule compared to the effort of
+   transferring the large file itself.
+
+18.  Security Considerations
+
+   The upload resource URL is the identifier used for modifying the
+   upload.  Without further protection of this URL, an attacker may
+   obtain information about an upload, append data to it, or cancel it.
+   To prevent this, the server SHOULD ensure that only authorized
+   clients can access the upload resource.  In addition, the upload
+   resource URL SHOULD be generated in such a way that it is hard for
+   unauthorized clients to guess.
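For instance, a server could derive upload resource URLs from a cryptographically strong random token (a non-normative sketch; the path layout is illustrative):

```python
import secrets

def new_upload_resource_url(base: str) -> str:
    """Sketch: generate an upload resource URL that is infeasible for
    unauthorized clients to guess (256 bits of randomness)."""
    return f"{base}/uploads/{secrets.token_urlsafe(32)}"
```

Note that a hard-to-guess URL complements, but does not replace, access control on the upload resource.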
+ + Some servers or intermediaries provide scanning of content uploaded + by clients. Any scanning mechanism that relies on receiving a + complete file in a single request message can be defeated by + resumable uploads because content can be split across multiple + messages. Servers or intermediaries wishing to perform content + scanning SHOULD consider how resumable uploads can circumvent + scanning and take appropriate measures. Possible strategies include + waiting for the upload to complete before scanning a full file, or + disabling resumable uploads. + + Resumable uploads are vulnerable to Slowloris-style attacks + [SLOWLORIS]. A malicious client may create upload resources and keep + them alive by regularly sending PATCH requests with no or small + content to the upload resources. This could be abused to exhaust + server resources by creating and holding open uploads indefinitely + with minimal work. + + Servers SHOULD provide mitigations for Slowloris attacks, such as + increasing the maximum number of clients the server will allow, + limiting the number of uploads a single client is allowed to make, + imposing restrictions on the minimum transfer speed an upload is + allowed to have, and restricting the length of time an upload + resource can exist. + +19. 
IANA Considerations + + IANA is asked to register the following entries in the "Hypertext + Transfer Protocol (HTTP) Field Name Registry": + + +=================+===========+==============================+ + | Field Name | Status | Reference | + +=================+===========+==============================+ + | Upload-Complete | permanent | Section 8.3 of this document | + +-----------------+-----------+------------------------------+ + | Upload-Offset | permanent | Section 8.1 of this document | + +-----------------+-----------+------------------------------+ + | Upload-Limit | permanent | Section 8.2 of this document | + +-----------------+-----------+------------------------------+ + | Upload-Length | permanent | Section 8.4 of this document | + +-----------------+-----------+------------------------------+ + + Table 1 + + IANA is asked to register the following entry in the "HTTP Status + Codes" registry: + + Value: 104 (suggested value) + + Description: Upload Resumption Supported + + Specification: This document + + IANA is asked to register the following entry in the "Media Types" + registry: + + Type name: application + + Subtype name: partial-upload + + Required parameters: N/A + + Optional parameters: N/A + + Encoding considerations: binary + + Security considerations: see Section 18 of this document + + Interoperability considerations: N/A + + Published specification: This document + + Applications that use this media type: Applications that transfer + files over unreliable networks or want pause- and resumable + uploads. + + Fragment identifier considerations: N/A + + Additional information: + + * Deprecated alias names for this type: N/A + + * Magic number(s): N/A + + * File extension(s): N/A + + * Macintosh file type code(s): N/A + + * Windows Clipboard Name: N/A + + Person and email address to contact for further information: See the + Authors' Addresses section of this document. 
+
+   Intended usage:  COMMON
+
+   Restrictions on usage:  N/A
+
+   Author:  See the Authors' Addresses section of this document.
+
+   Change controller:  IETF
+
+   IANA is asked to register the following entry in the "HTTP Problem
+   Types" registry:
+
+   Type URI:  https://iana.org/assignments/http-problem-
+      types#mismatching-upload-offset
+
+   Title:  Mismatching Upload Offset
+
+   Recommended HTTP status code:  409
+
+   Reference:  This document
+
+   IANA is asked to register the following entry in the "HTTP Problem
+   Types" registry:
+
+   Type URI:  https://iana.org/assignments/http-problem-
+      types#completed-upload
+
+   Title:  Upload Is Completed
+
+   Recommended HTTP status code:  400
+
+   Reference:  This document
+
+20.  References
+
+20.1.  Normative References
+
+   [CACHING]  Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
+              Ed., "HTTP Caching", STD 98, RFC 9111,
+              DOI 10.17487/RFC9111, June 2022,
+              <https://www.rfc-editor.org/info/rfc9111>.
+
+   [DIGEST-FIELDS]
+              Polli, R. and L. Pardue, "Digest Fields", RFC 9530,
+              DOI 10.17487/RFC9530, February 2024,
+              <https://www.rfc-editor.org/info/rfc9530>.
+
+   [HTTP]     Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
+              Ed., "HTTP Semantics", STD 97, RFC 9110,
+              DOI 10.17487/RFC9110, June 2022,
+              <https://www.rfc-editor.org/info/rfc9110>.
+
+   [HTTP/1.1] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
+              Ed., "HTTP/1.1", STD 99, RFC 9112, DOI 10.17487/RFC9112,
+              June 2022, <https://www.rfc-editor.org/info/rfc9112>.
+
+   [PROBLEM]  Nottingham, M., Wilde, E., and S. Dalal, "Problem Details
+              for HTTP APIs", RFC 9457, DOI 10.17487/RFC9457, July
+              2023, <https://www.rfc-editor.org/info/rfc9457>.
+
+   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
+              Requirement Levels", BCP 14, RFC 2119,
+              DOI 10.17487/RFC2119, March 1997,
+              <https://www.rfc-editor.org/info/rfc2119>.
+
+   [RFC5789]  Dusseault, L. and J. Snell, "PATCH Method for HTTP",
+              RFC 5789, DOI 10.17487/RFC5789, March 2010,
+              <https://www.rfc-editor.org/info/rfc5789>.
+
+   [RFC6266]  Reschke, J., "Use of the Content-Disposition Header Field
+              in the Hypertext Transfer Protocol (HTTP)", RFC 6266,
+              DOI 10.17487/RFC6266, June 2011,
+              <https://www.rfc-editor.org/info/rfc6266>.
+
+   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
+              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
+              May 2017, <https://www.rfc-editor.org/info/rfc8174>.
+
+   [STRUCTURED-FIELDS]
+              Nottingham, M. and P. Kamp, "Structured Field Values for
+              HTTP", RFC 8941, DOI 10.17487/RFC8941, February 2021,
+              <https://www.rfc-editor.org/info/rfc8941>.
+
+20.2.  Informative References
+
+   [SLOWLORIS]
+              "RSnake" Hansen, R., "Welcome to Slowloris - the low
+              bandwidth, yet greedy and poisonous HTTP client!", June
+              2009.
+
+Appendix A.  Informational Response
+
+   The server is allowed to respond to upload creation (Section 4)
+   requests with a 104 (Upload Resumption Supported) intermediate
+   response as soon as the server has validated the request.  This way,
+   the client knows that the server supports resumable uploads before
+   the complete response is received.  The benefit is that the client
+   can defer starting the actual data transfer until the server
+   indicates full support (i.e. resumable uploads are supported, the
+   provided upload URL is active, etc.).
+
+   However, support for intermediate responses (the 1XX range) in
+   existing software is limited or entirely absent.  Such software
+   includes proxies, firewalls, browsers, and HTTP libraries for
+   clients and servers.  Therefore, the 104 (Upload Resumption
+   Supported) status code is optional and not mandatory for the
+   successful completion of an upload.  Otherwise, it might be
+   impossible in some cases to implement resumable upload servers using
+   existing software packages.  Furthermore, as parts of the current
+   internet infrastructure have limited support for intermediate
+   responses, a successful delivery of a 104 (Upload Resumption
+   Supported) from the server to the client should not be assumed.
+
+   We hope that support for intermediate responses increases in the
+   near future, to allow a wider usage of 104 (Upload Resumption
+   Supported).
+
+Appendix B. 
Feature Detection
+
+   This specification includes a section about feature detection (it
+   was called service discovery in earlier discussions, but this name
+   is probably ill-suited).  The idea is to allow resumable uploads to
+   be transparently implemented by HTTP clients.  This means that
+   application developers just keep using the same API of their HTTP
+   library as they have done in the past with traditional,
+   non-resumable uploads.  Once the HTTP library gets updated (e.g.
+   because mobile OS or browsers start implementing resumable uploads),
+   the HTTP library can transparently decide to use resumable uploads
+   without explicit configuration by the application developer.  Of
+   course, in order to use resumable uploads, the HTTP library needs to
+   know whether the server supports resumable uploads.  If no support
+   is detected, the HTTP library should use the traditional,
+   non-resumable upload technique.  We call this process feature
+   detection.
+
+   Ideally, the technique used for feature detection meets the
+   following *criteria* (there might not be one approach which fits all
+   requirements, so we have to prioritize them):
+
+   1.  Avoid additional roundtrips by the client, if possible (i.e. an
+       additional HTTP request by the client should be avoided).
+
+   2.  Be backwards compatible with HTTP/1.1 and existing network
+       infrastructure: this means avoiding new features in HTTP/2, or
+       features which might require changes to existing network
+       infrastructure (e.g. nginx or HTTP libraries).
+
+   3.  Preserve the user's privacy (i.e. the feature detection should
+       not leak information to third parties about which URLs have been
+       connected to).
+
+   The following *approaches* have been considered in the past.  All
+   approaches except the last have been deemed unacceptable and are
+   therefore not included in the specification. 
The following list is a
+   reference for the advantages and disadvantages of some approaches:
+
+   *Include a support statement in the SETTINGS frame.*  The SETTINGS
+   frame is an HTTP/2 feature and is sent by the server to the client
+   to exchange information about the current connection.  The idea was
+   to include an additional statement in this frame, so the client can
+   detect support for resumable uploads without an additional
+   roundtrip.  The problem is that this is not compatible with
+   HTTP/1.1.  Furthermore, the SETTINGS frame is intended for
+   information about the current connection (not bound to a
+   request/response) and might not be persisted when transmitted
+   through a proxy.
+
+   *Include a support statement in the DNS record.*  The client can
+   detect support when resolving a domain name.  Of course, DNS is not
+   semantically the correct layer.  Also, DNS might not be involved if
+   the record is cached or retrieved from a hosts file.
+
+   *Send an HTTP request to ask for support.*  This is the easiest
+   approach, where the client sends an OPTIONS request and uses the
+   response to determine if the server indicates support for resumable
+   uploads.  An alternative is that the client sends the request to a
+   well-known URL to obtain this response, e.g. /.well-known/resumable-
+   uploads.  While being fully backwards-compatible, it requires an
+   additional roundtrip.
+
+   *Include a support statement in previous responses.*  In many cases,
+   the file upload is not the first time that the client connects to
+   the server.  Often additional requests are sent beforehand for
+   authentication, data retrieval etc.  The responses for those
+   requests can also include a header field which indicates support for
+   resumable uploads.  There are two options:
+
+   -  Use the standardized Alt-Svc response header field.  However, it
+      has been indicated to us that this header field might be reworked
+      in the future and could also be semantically different from our
+      intended usage.
-  Use a new response header field Resumable-Uploads:
+      https://example.org/files/* to indicate under which endpoints
+      support for resumable uploads is available.
+
+   *Send a 104 intermediate response to indicate support.*  The client
+   normally starts a traditional upload and includes a header field
+   indicating that it supports resumable uploads (e.g. Upload-Offset:
+   0).  If the server also supports resumable uploads, it will
+   immediately respond with a 104 intermediate response to indicate its
+   support, before further processing the request.  This way the client
+   is informed during the upload whether it can resume from possible
+   connection errors or not.  While an additional roundtrip is avoided,
+   the problem with this solution is that many HTTP server libraries do
+   not support sending custom 1XX responses and that some proxies may
+   not be able to handle new 1XX status codes correctly.
+
+   *Send a 103 Early Hints response to indicate support.*  This
+   approach is similar to the one above, with one exception: instead of
+   a new 104 (Upload Resumption Supported) status code, the existing
+   103 (Early Hints) status code is used in the intermediate response.
+   The 103 code would then be accompanied by a header field indicating
+   support for resumable uploads (e.g. Resumable-Uploads: 1).  It is
+   unclear whether the Early Hints code is appropriate for that, as it
+   is currently only used to indicate resources for prefetching.
+
+Appendix C.  Upload Metadata
+
+   When an upload is created (Section 4), the Content-Type and Content-
+   Disposition header fields are allowed to be included.  They are
+   intended to be a standardized way of communicating the file name and
+   file type, if available.  However, this is not without controversy.
+   Some argue that since these header fields are already defined in
+   other specifications, it is not necessary to include them here
+   again.  Furthermore, the Content-Disposition header field's format
+   is not defined clearly enough. 
For example, it is left open which
+   disposition value should be used in the header field.  More
+   discussion is needed on whether this approach is suitable.
+
+   However, from experience with the tus project, users often ask for a
+   way to communicate the file name and file type.  Therefore, we
+   believe it is helpful to explicitly include an approach for doing
+   so.
+
+Appendix D.  FAQ
+
+   *  *Are multipart requests supported?*  Yes, requests whose content
+      is encoded using the multipart/form-data format are implicitly
+      supported.  The entire encoded content can be considered as a
+      single file, which is then uploaded using the resumable protocol.
+      The server, of course, must store the delimiter ("boundary")
+      separating each part and must be able to parse the multipart
+      format once the upload is completed.
+
+Acknowledgments
+
+   This document is based on an Internet-Draft specification written by
+   Jiten Mehta, Stefan Matsson, and the authors of this document.
+
+   The tus v1 protocol (https://tus.io/) is a specification for a
+   resumable file upload protocol over HTTP.  It inspired the early
+   design of this protocol.  Members of the tus community helped
+   significantly in the process of bringing this work to the IETF.
+
+   The authors would like to thank Mark Nottingham for substantive
+   contributions to the text.
+
+Changes
+
+   This section is to be removed before publishing as an RFC.
+
+Since draft-ietf-httpbis-resumable-upload-05
+
+   None yet
+
+Since draft-ietf-httpbis-resumable-upload-04
+
+   *  Clarify implications of Upload-Limit header.
+
+   *  Allow client to fetch upload limits upfront via OPTIONS.
+
+   *  Add guidance on upload creation strategy.
+
+   *  Add Upload-Length header to indicate length during creation.
+
+   *  Describe possible usage of Want-Repr-Digest.
+
+Since draft-ietf-httpbis-resumable-upload-03
+
+   *  Add note about Content-Location for referring to subsequent
+      resources.
+
+   *  Require application/partial-upload for appending to uploads. 
+ + * Explain handling of content and transfer codings. + + * Add problem types for mismatching offsets and completed uploads. + + * Clarify that completed uploads must not be appended to. + + * Describe interaction with Digest Fields from RFC9530. + + * Require that upload offset does not decrease over time. + + * Add Upload-Limit header field. + + * Increase the draft interop version. + +Since draft-ietf-httpbis-resumable-upload-02 + + * Add upload progress notifications via informational responses. + + * Add security consideration regarding request filtering. + + * Explain the use of empty requests for creation uploads and + appending. + + * Extend security consideration to include resource exhaustion + attacks. + + * Allow 200 status codes for offset retrieval. + + * Increase the draft interop version. + +Since draft-ietf-httpbis-resumable-upload-01 + + * Replace Upload-Incomplete header with Upload-Complete. + + * Replace terminology about procedures with HTTP resources. + + * Increase the draft interop version. + +Since draft-ietf-httpbis-resumable-upload-00 + + * Remove Upload-Token and instead use Server-generated upload URL + for upload identification. + + * Require the Upload-Incomplete header field in Upload Creation + Procedure. + + * Increase the draft interop version. + +Since draft-tus-httpbis-resumable-uploads-protocol-02 + + None + +Since draft-tus-httpbis-resumable-uploads-protocol-01 + + * Clarifying backtracking and preventing skipping ahead during the + Offset Receiving Procedure. + + * Clients auto-retry 404 is no longer allowed. + +Since draft-tus-httpbis-resumable-uploads-protocol-00 + + * Split the Upload Transfer Procedure into the Upload Creation + Procedure and the Upload Appending Procedure. + +Authors' Addresses + + Marius Kleidl (editor) + Transloadit + Email: marius@transloadit.com + + + Guoye Zhang (editor) + Apple Inc. 
+ Email: guoye_zhang@apple.com + + + Lucas Pardue (editor) + Cloudflare + Email: lucas@lucaspardue.com diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-retrofit.html b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-retrofit.html new file mode 100644 index 000000000..783846d49 --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-retrofit.html @@ -0,0 +1,1856 @@ + + + + + + +Retrofit Structured Fields for HTTP + + + + + + + + + + + + + + + + + + + + + + + + +
Internet-DraftRetrofit Structured FieldsJanuary 2025
NottinghamExpires 11 July 2025[Page]
+
+
+
+
Workgroup:
+
HTTP
+
Internet-Draft:
+
draft-ietf-httpbis-retrofit-latest
+
Published:
+
+ +
+
Intended Status:
+
Standards Track
+
Expires:
+
+
Author:
+
+
+
M. Nottingham
+
+
+
+
+

Retrofit Structured Fields for HTTP

+
+

Abstract

+

This specification nominates a selection of existing HTTP fields whose values are compatible with Structured Fields syntax, so that they can be handled as such (subject to certain caveats).

+

To accommodate some additional fields whose syntax is not compatible, it also defines mappings of their semantics into Structured Fields. It does not specify how to convey them in HTTP messages.

+
+
+

+About This Document +

+

This note is to be removed before publishing as an RFC.

+

+ Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ietf-httpbis-retrofit/.

+

+ Discussion of this document takes place on the + HTTP Working Group mailing list (mailto:ietf-http-wg@w3.org), + which is archived at https://lists.w3.org/Archives/Public/ietf-http-wg/. + Working Group information can be found at https://httpwg.org/.

+

Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/retrofit.

+
+
+
+

+Status of This Memo +

+

+ This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79.

+

+ Internet-Drafts are working documents of the Internet Engineering Task + Force (IETF). Note that other groups may also distribute working + documents as Internet-Drafts. The list of current Internet-Drafts is + at https://datatracker.ietf.org/drafts/current/.

+

+ Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress."

+

+ This Internet-Draft will expire on 11 July 2025.

+
+
+ +
+
+

+Table of Contents +

+ +
+
+
+
+

+1. Introduction +

+

Structured Field Values for HTTP [STRUCTURED-FIELDS] introduced a data model with associated parsing and serialization algorithms for use by new HTTP field values. Fields that are defined as Structured Fields can bring advantages that include:

+
    +
  • +

    Improved interoperability and security: precisely defined parsing and serialisation algorithms are typically not available for fields defined with just ABNF and/or prose.

    +
  • +
  • +

    Reuse of common implementations: many parsers for other fields are specific to a single field or a small family of fields.

    +
  • +
  • +

    Canonical form: because a deterministic serialisation algorithm is defined for each type, Structured Fields have a canonical representation.

    +
  • +
  • +

    Enhanced API support: a regular data model makes it easier to expose field values as a native data structure in implementations.

    +
  • +
  • +

    Alternative serialisations: While [STRUCTURED-FIELDS] defines a textual serialisation of that data model, other, more efficient serialisations of the underlying data model are also possible.

    +
  • +
+

However, a field needs to be defined as a Structured Field for these benefits to be realised. Many existing fields are not, making up the bulk of header and trailer fields seen in HTTP traffic on the internet.

+

This specification defines how a selection of existing HTTP fields can be handled as Structured Fields, so that these benefits can be realised -- thereby making them Retrofit Structured Fields.

+

It does so using two techniques. Section 2 lists compatible fields -- those that can be handled as if they were Structured Fields due to the similarity of their defined syntax to that in Structured Fields. Section 3 lists mapped fields -- those whose syntax needs to be transformed into an underlying data model which is then mapped into that defined by Structured Fields.

+
+
+

+1.1. Using Retrofit Structured Fields +

+

Retrofitting data structures onto existing and widely-deployed HTTP fields requires careful handling to assure interoperability and security. This section highlights considerations for applications that use Retrofit Structured Fields.

+

While the majority of field values seen in HTTP traffic should be able to be parsed or mapped successfully, some will not. An application using Retrofit Structured Fields will need to define how unsuccessful values will be handled.

+

For example, an API that exposes field values using Structured Fields data types might make the field value available as a string in cases where the field did not successfully parse or map.

+

The mapped field values described in Section 3 are not compatible with the original syntax of their fields, and so cannot be used unless parties processing them have explicitly indicated their support for that form of the field value. An application using Retrofit Structured Fields will need to define how to negotiate support for them.

+

For example, an alternative serialization of fields that takes advantage of Structured Fields would need to establish an explicit negotiation mechanism to assure that both peers would handle that serialization appropriately before using it.

+

See also the security considerations in Section 5.

+
+
+
+
+

+1.2. Notational Conventions +

+

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", +"MAY", and "OPTIONAL" in this document are to be interpreted as +described in BCP 14 [RFC2119] [RFC8174] when, and only when, they +appear in all capitals, as shown here.

+
+
+
+
+
+
+

+2. Compatible Fields +

+

The HTTP fields listed in Table 1 have values that can be handled as Structured Field Values according to the parsing and serialisation algorithms in [STRUCTURED-FIELDS] corresponding to the listed top-level type, subject to the caveats in Section 2.1.

+

The top-level types are chosen for compatibility with the defined syntax of the field as well as with actual internet traffic. However, not all instances of these fields will successfully parse as a Structured Field Value. This might be because the field value is clearly invalid, or it might be because it is valid but not parseable as a Structured Field.

+

An application using this specification will need to consider how to handle such field values. Depending on its requirements, it might be advisable to reject such values, treat them as opaque strings, or attempt to recover a Structured Field Value from them in an ad hoc fashion.

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+Table 1: +Compatible Fields +
Field NameStructured Type
AcceptList
Accept-EncodingList
Accept-LanguageList
Accept-PatchList
Accept-PostList
Accept-RangesList
Access-Control-Allow-CredentialsItem
Access-Control-Allow-HeadersList
Access-Control-Allow-MethodsList
Access-Control-Allow-OriginItem
Access-Control-Expose-HeadersList
Access-Control-Max-AgeItem
Access-Control-Request-HeadersList
Access-Control-Request-MethodItem
AgeItem
AllowList
ALPNList
Alt-SvcDictionary
Alt-UsedItem
Cache-ControlDictionary
CDN-LoopList
Clear-Site-DataList
ConnectionList
Content-EncodingList
Content-LanguageList
Content-LengthList
Content-TypeItem
Cross-Origin-Resource-PolicyItem
DNTItem
ExpectDictionary
Expect-CTDictionary
HostItem
Keep-AliveDictionary
Max-ForwardsItem
OriginItem
PragmaDictionary
PreferDictionary
Preference-AppliedDictionary
Retry-AfterItem
Sec-WebSocket-ExtensionsList
Sec-WebSocket-ProtocolList
Sec-WebSocket-VersionItem
Server-TimingList
Surrogate-ControlDictionary
TEList
Timing-Allow-OriginList
TrailerList
Transfer-EncodingList
Upgrade-Insecure-RequestsItem
VaryList
X-Content-Type-OptionsItem
X-Frame-OptionsItem
X-XSS-ProtectionList
+
+
+
+

+2.1. Caveats +

+

Note the following caveats regarding compatibility:

+
+
Parsing differences:
+
+

Some values may fail to parse as Structured Fields, even though they are valid according to their originally specified syntax. For example, HTTP parameter names are case-insensitive (per Section 5.6.6 of [HTTP]), but Structured Fields require them to be all-lowercase. +Likewise, many Dictionary-based fields (e.g., Cache-Control, Expect-CT, Pragma, Prefer, Preference-Applied, Surrogate-Control) have case-insensitive keys. +Similarly, the parameters rule in HTTP (see Section 5.6.6 of [HTTP]) allows whitespace before the ";" delimiter, but Structured Fields does not. +And, Section 5.6.4 of [HTTP] allows backslash-escaping most characters in quoted strings, whereas Structured Field Strings only escape "\" and DQUOTE. The vast majority of fields seen in typical traffic do not exhibit these behaviors.

+
+
+
Error handling:
+
+

Parsing algorithms specified (or just widely implemented) for current HTTP headers may differ from those in Structured Fields in details such as error handling. For example, HTTP specifies that repeated directives in the Cache-Control header field have a different precedence than that assigned by a Dictionary structured field (which Cache-Control is mapped to).

+
+
+
Token limitations:
+
+

In Structured Fields, tokens are required to begin with an alphabetic character or "*", whereas HTTP tokens allow a wider range of characters. This prevents use of mapped values that begin with one of these characters. For example, media types, field names, methods, range-units, character and transfer codings that begin with a number or special character other than "*" might be valid HTTP protocol elements, but will not be able to be represented as Structured Field Tokens.

+
+
+
Integer limitations:
+
+

Structured Fields Integers can have at most 15 digits; larger values will not be able to be represented in them.

+
+
+
IPv6 Literals:
+
+

Fields whose values contain IPv6 literal addresses (such as CDN-Loop, Host, and Origin) are not able to be represented as Structured Fields Tokens, because the brackets used to delimit them are not allowed in Tokens.

+
+
+
Empty Field Values:
+
+

Empty and whitespace-only field values are considered errors in Structured Fields. For compatible fields, an empty field indicates that the field should be silently ignored.

+
+
+
Alt-Svc:
+
+

Some ALPN tokens (e.g., h3-Q43) do not conform to key's syntax, and therefore cannot be represented as a Token. Since the final version of HTTP/3 uses the h3 token, this shouldn't be a long-term issue, although future tokens may again violate this assumption.

+
+
+
Content-Length:
+
+

Note that Content-Length is defined as a List because it is not uncommon for implementations to mistakenly send multiple values. See Section 8.6 of [HTTP] for handling requirements.

+
+
+
Retry-After:
+
+

Only the delta-seconds form of Retry-After can be represented; a Retry-After value containing a http-date will need to be converted into delta-seconds to be conveyed as a Structured Field Value.

+
+
+
+
+
+
+
+
+
+

+3. Mapped Fields +

+

Some HTTP field values have syntax that cannot be successfully parsed as Structured Field values. Instead, it is necessary to map them into a Structured Field value.

+

For example, the Date HTTP header field carries a date:

+
+
+Date: Sun, 06 Nov 1994 08:49:37 GMT
+
+
+

Its value would be mapped to:

+
+
+@784111777
+
+
+

Unlike those listed in Section 2, these representations are not compatible with the original fields' syntax, and MUST NOT be used unless they are explicitly and unambiguously supported. For example, this means that sending them to a next-hop recipient in HTTP requires prior negotiation. This specification does not define how to do so.

+
+
+

+3.1. URLs +

+

The field names in Table 2 have values that can be mapped into Structured Field values by treating the original field's value as a String.

+
+ + + + + + + + + + + + + + + + + + +
+Table 2: +URL Fields +
Field Name
Content-Location
Location
Referer
+
+

For example, this Location field:

+
+
+Location: https://example.com/foo
+
+
+

would have a mapped value of:

+
+
+"https://example.com/foo"
+
+
+
+
+
+
+
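This mapping amounts to serialising the URL as a Structured Fields String. A non-normative sketch (per [STRUCTURED-FIELDS], only "\" and DQUOTE need escaping in Strings):

```python
def url_field_to_sf_string(value: str) -> str:
    """Sketch: map a URL-valued field (Content-Location, Location,
    Referer) to a Structured Fields String."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'
```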

+3.2. Dates +

+

The field names in Table 3 have values that can be mapped into Structured Field values by parsing their payload according to Section 5.6.7 of [HTTP] and representing the result as a Date.

+
+ + + + + + + + + + + + + + + + + + + + + + + + +
+Table 3: +Date Fields +
Field Name
Date
Expires
If-Modified-Since
If-Unmodified-Since
Last-Modified
+
+

For example, an Expires field's value could be mapped as:

+
+
+@1659578233
+
+
+
+
+
+
+

+3.3. ETags +

+

The field value of the ETag header field can be mapped into a Structured Field value by representing the entity-tag as a String, and the weakness flag as a Boolean "w" parameter on it, where true indicates that the entity-tag is weak; if 0 or unset, the entity-tag is strong.

+

For example, this ETag header field:

+
+
+ETag: W/"abcdef"
+
+
+

would have a mapped value of:

+
+
+"abcdef"; w
+
+
+

If-None-Match's field value can be mapped into a Structured Field value which is a List of the structure described above. When a field value contains "*", it is represented as a Token.

+

Likewise, If-Match's field value can be mapped into a Structured Field value in the same manner.
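A minimal sketch of the single entity-tag mapping above (an illustrative aside, not part of this specification; `etag_to_sf_item` is a hypothetical helper that assumes a well-formed entity-tag and does not handle escaping inside the opaque-tag):

```python
def etag_to_sf_item(value: str) -> str:
    # Map an entity-tag to a Structured Fields String, carrying the
    # weakness flag as a Boolean "w" parameter (present when weak).
    weak = value.startswith("W/")
    opaque = (value[2:] if weak else value).strip('"')
    item = f'"{opaque}"'
    return f"{item}; w" if weak else item

print(etag_to_sf_item('W/"abcdef"'))  # "abcdef"; w
```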

+

For example, this If-None-Match field:

+
+
+If-None-Match: W/"abcdef", "ghijkl", *
+
+
+

would have a mapped value of:

+
+
+"abcdef"; w, "ghijkl", *
+
+
+
+
+
+
+

+3.4. Cookies +

+

The field values of the Cookie and Set-Cookie fields [COOKIES] can be mapped into Structured Fields Lists.

+

In each case, a cookie is represented as an Inner List containing two Items; the cookie name and value. The cookie name is always a String; the cookie value is a String, unless it can be successfully parsed as the textual representation of another, bare Item structured type (e.g., Byte Sequence, Decimal, Integer, Token, or Boolean).

+

Cookie attributes map to Parameters on the Inner List, with the parameter name being forced to lowercase. Cookie attribute values are Strings unless a specific type is defined for them. This specification defines types for existing cookie attributes in Table 4.

+ +

The Expires attribute is mapped to a Date representation of parsed-cookie-date (see Section 5.1.1 of [COOKIES]).
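The name/value and attribute mapping described above can be sketched as follows (an illustrative aside, not part of this specification; `set_cookie_to_sf` is a hypothetical helper that leaves every attribute value as a String, omitting the typed attributes of Table 4 and any value-type detection):

```python
def set_cookie_to_sf(value: str) -> str:
    # The cookie name and value become a two-Item Inner List of
    # Strings; attributes become Parameters with lowercased names.
    # Assumes no ";" occurs inside the cookie value.
    parts = [p.strip() for p in value.split(";")]
    name, _, val = parts[0].partition("=")
    out = f'("{name}" "{val}")'
    for attr in parts[1:]:
        key, sep, attr_val = attr.partition("=")
        out += f'; {key.lower()}="{attr_val}"' if sep else f"; {key.lower()}"
    return out

print(set_cookie_to_sf("SID=31d4d96e407aad42; Path=/; Secure"))
# ("SID" "31d4d96e407aad42"); path="/"; secure
```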

+

For example, this Set-Cookie field:

+
+
+Set-Cookie: Lang=en-US; Expires=Wed, 09 Jun 2021 10:18:14 GMT;
+               samesite=Strict; secure
+
+
+

would have a mapped value of:

+
+
+("Lang" "en-US"); expires=@1623233894;
+               samesite=Strict; secure
+
+
+

And this Cookie field:

+
+
+Cookie: SID=31d4d96e407aad42; lang=en-US
+
+
+

would have a mapped value of:

+
+
+("SID" "31d4d96e407aad42"), ("lang" "en-US")
+
+
+
+
+
+
+
+
+

+4. IANA Considerations +

+

Please add the following note to the "Hypertext Transfer Protocol (HTTP) Field Name Registry":

+
    +
  • +

    A prefix of "*" in the Structured Type column indicates that it is a retrofit type (i.e., not natively Structured); see RFC nnnn.

    +
  • +
+

Then, add a new column, "Structured Type", with the values from Section 2 assigned to the nominated registrations, prefixing each with "*" to indicate that it is a retrofit type.

+

Finally, add a new column to the "Cookie Attribute Registry" established by [COOKIES] with the title "Structured Type", using information from Table 4.

+
+
+
+
+

+5. Security Considerations +

+

Section 2 identifies existing HTTP fields that can be parsed and serialised with the algorithms defined in [STRUCTURED-FIELDS]. Variances from existing parser behavior might be exploitable, particularly if they allow an attacker to target one implementation in a chain (e.g., an intermediary). However, given the considerable variance in parsers already deployed, convergence towards a single parsing algorithm is likely to have a net security benefit in the longer term.

+

Section 3 defines alternative representations of existing fields. Because downstream consumers might interpret the message differently based upon whether they recognise the alternative representation, implementations are prohibited from generating such values unless they have negotiated support for them with their peer. This specification does not define such a mechanism, but any such definition needs to consider the implications of doing so carefully.

+
+
+
+
+

+6. Normative References +

+
+
[COOKIES]
+
+Bingler, S., West, M., and J. Wilander, "Cookies: HTTP State Management Mechanism", Work in Progress, Internet-Draft, draft-ietf-httpbis-rfc6265bis-18, 20 December 2024, <https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-rfc6265bis-18>.
+
+
[HTTP]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, June 2022, <https://www.rfc-editor.org/rfc/rfc9110>.
+
+
[RFC2119]
+
+Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/rfc/rfc2119>.
+
+
[RFC8174]
+
+Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
+
+
[STRUCTURED-FIELDS]
+
+Nottingham, M. and P. Kamp, "Structured Field Values for HTTP", Work in Progress, Internet-Draft, draft-ietf-httpbis-sfbis-06, 21 April 2024, <https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-sfbis-06>.
+
+
+
+
+
+
+

+Author's Address +

+
+
Mark Nottingham
+
Prahran
+
Australia
+ + +
+
+
+ + + diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-retrofit.txt b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-retrofit.txt new file mode 100644 index 000000000..a039bb42a --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-retrofit.txt @@ -0,0 +1,591 @@ + + + + +HTTP M. Nottingham +Internet-Draft 7 January 2025 +Intended status: Standards Track +Expires: 11 July 2025 + + + Retrofit Structured Fields for HTTP + draft-ietf-httpbis-retrofit-latest + +Abstract + + This specification nominates a selection of existing HTTP fields + whose values are compatible with Structured Fields syntax, so that + they can be handled as such (subject to certain caveats). + + To accommodate some additional fields whose syntax is not compatible, + it also defines mappings of their semantics into Structured Fields. + It does not specify how to convey them in HTTP messages. + +About This Document + + This note is to be removed before publishing as an RFC. + + Status information for this document may be found at + https://datatracker.ietf.org/doc/draft-ietf-httpbis-retrofit/. + + Discussion of this document takes place on the HTTP Working Group + mailing list (mailto:ietf-http-wg@w3.org), which is archived at + https://lists.w3.org/Archives/Public/ietf-http-wg/. Working Group + information can be found at https://httpwg.org/. + + Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/retrofit. + +Status of This Memo + + This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF). Note that other groups may also distribute + working documents as Internet-Drafts. The list of current Internet- + Drafts is at https://datatracker.ietf.org/drafts/current/. 
+ + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." + + This Internet-Draft will expire on 11 July 2025. + +Copyright Notice + + Copyright (c) 2025 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents (https://trustee.ietf.org/ + license-info) in effect on the date of publication of this document. + Please review these documents carefully, as they describe your rights + and restrictions with respect to this document. Code Components + extracted from this document must include Revised BSD License text as + described in Section 4.e of the Trust Legal Provisions and are + provided without warranty as described in the Revised BSD License. + +Table of Contents + + 1. Introduction + 1.1. Using Retrofit Structured Fields + 1.2. Notational Conventions + 2. Compatible Fields + 2.1. Caveats + 3. Mapped Fields + 3.1. URLs + 3.2. Dates + 3.3. ETags + 3.4. Cookies + 4. IANA Considerations + 5. Security Considerations + 6. Normative References + Author's Address + +1. Introduction + + Structured Field Values for HTTP [STRUCTURED-FIELDS] introduced a + data model with associated parsing and serialization algorithms for + use by new HTTP field values. Fields that are defined as Structured + Fields can bring advantages that include: + + * Improved interoperability and security: precisely defined parsing + and serialisation algorithms are typically not available for + fields defined with just ABNF and/or prose. + + * Reuse of common implementations: many parsers for other fields are + specific to a single field or a small family of fields. 
+ + * Canonical form: because a deterministic serialisation algorithm is + defined for each type, Structure Fields have a canonical + representation. + + * Enhanced API support: a regular data model makes it easier to + expose field values as a native data structure in implementations. + + * Alternative serialisations: While [STRUCTURED-FIELDS] defines a + textual serialisation of that data model, other, more efficient + serialisations of the underlying data model are also possible. + + However, a field needs to be defined as a Structured Field for these + benefits to be realised. Many existing fields are not, making up the + bulk of header and trailer fields seen in HTTP traffic on the + internet. + + This specification defines how a selection of existing HTTP fields + can be handled as Structured Fields, so that these benefits can be + realised -- thereby making them Retrofit Structured Fields. + + It does so using two techniques. Section 2 lists compatible fields + -- those that can be handled as if they were Structured Fields due to + the similarity of their defined syntax to that in Structured Fields. + Section 3 lists mapped fields -- those whose syntax needs to be + transformed into an underlying data model which is then mapped into + that defined by Structured Fields. + +1.1. Using Retrofit Structured Fields + + Retrofitting data structures onto existing and widely-deployed HTTP + fields requires careful handling to assure interoperability and + security. This section highlights considerations for applications + that use Retrofit Structured Fields. + + While the majority of field values seen in HTTP traffic should be + able to be parsed or mapped successfully, some will not. An + application using Retrofit Structured Fields will need to define how + unsuccessful values will be handled. 
+ + For example, an API that exposes field values using Structured Fields + data types might make the field value available as a string in cases + where the field did not successfully parse or map. + + The mapped field values described in Section 3 are not compatible + with the original syntax of their fields, and so cannot be used + unless parties processing them have explicitly indicated their + support for that form of the field value. An application using + Retrofit Structured Fields will need to define how to negotiate + support for them. + + For example, an alternative serialization of fields that takes + advantage of Structured Fields would need to establish an explicit + negotiation mechanism to assure that both peers would handle that + serialization appropriately before using it. + + See also the security considerations in Section 5. + +1.2. Notational Conventions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + +2. Compatible Fields + + The HTTP fields listed in Table 1 have values that can be handled as + Structured Field Values according to the parsing and serialisation + algorithms in [STRUCTURED-FIELDS] corresponding to the listed top- + level type, subject to the caveats in Section 2.1. + + The top-level types are chosen for compatibility with the defined + syntax of the field as well as with actual internet traffic. + However, not all instances of these fields will successfully parse as + a Structured Field Value. This might be because the field value is + clearly invalid, or it might be because it is valid but not parseable + as a Structured Field. + + An application using this specification will need to consider how to + handle such field values. 
Depending on its requirements, it might be + advisable to reject such values, treat them as opaque strings, or + attempt to recover a Structured Field Value from them in an ad hoc + fashion. + + +==================================+=================+ + | Field Name | Structured Type | + +==================================+=================+ + | Accept | List | + +----------------------------------+-----------------+ + | Accept-Encoding | List | + +----------------------------------+-----------------+ + | Accept-Language | List | + +----------------------------------+-----------------+ + | Accept-Patch | List | + +----------------------------------+-----------------+ + | Accept-Post | List | + +----------------------------------+-----------------+ + | Accept-Ranges | List | + +----------------------------------+-----------------+ + | Access-Control-Allow-Credentials | Item | + +----------------------------------+-----------------+ + | Access-Control-Allow-Headers | List | + +----------------------------------+-----------------+ + | Access-Control-Allow-Methods | List | + +----------------------------------+-----------------+ + | Access-Control-Allow-Origin | Item | + +----------------------------------+-----------------+ + | Access-Control-Expose-Headers | List | + +----------------------------------+-----------------+ + | Access-Control-Max-Age | Item | + +----------------------------------+-----------------+ + | Access-Control-Request-Headers | List | + +----------------------------------+-----------------+ + | Access-Control-Request-Method | Item | + +----------------------------------+-----------------+ + | Age | Item | + +----------------------------------+-----------------+ + | Allow | List | + +----------------------------------+-----------------+ + | ALPN | List | + +----------------------------------+-----------------+ + | Alt-Svc | Dictionary | + +----------------------------------+-----------------+ + | Alt-Used | Item | + 
+----------------------------------+-----------------+ + | Cache-Control | Dictionary | + +----------------------------------+-----------------+ + | CDN-Loop | List | + +----------------------------------+-----------------+ + | Clear-Site-Data | List | + +----------------------------------+-----------------+ + | Connection | List | + +----------------------------------+-----------------+ + | Content-Encoding | List | + +----------------------------------+-----------------+ + | Content-Language | List | + +----------------------------------+-----------------+ + | Content-Length | List | + +----------------------------------+-----------------+ + | Content-Type | Item | + +----------------------------------+-----------------+ + | Cross-Origin-Resource-Policy | Item | + +----------------------------------+-----------------+ + | DNT | Item | + +----------------------------------+-----------------+ + | Expect | Dictionary | + +----------------------------------+-----------------+ + | Expect-CT | Dictionary | + +----------------------------------+-----------------+ + | Host | Item | + +----------------------------------+-----------------+ + | Keep-Alive | Dictionary | + +----------------------------------+-----------------+ + | Max-Forwards | Item | + +----------------------------------+-----------------+ + | Origin | Item | + +----------------------------------+-----------------+ + | Pragma | Dictionary | + +----------------------------------+-----------------+ + | Prefer | Dictionary | + +----------------------------------+-----------------+ + | Preference-Applied | Dictionary | + +----------------------------------+-----------------+ + | Retry-After | Item | + +----------------------------------+-----------------+ + | Sec-WebSocket-Extensions | List | + +----------------------------------+-----------------+ + | Sec-WebSocket-Protocol | List | + +----------------------------------+-----------------+ + | Sec-WebSocket-Version | Item | + 
+----------------------------------+-----------------+ + | Server-Timing | List | + +----------------------------------+-----------------+ + | Surrogate-Control | Dictionary | + +----------------------------------+-----------------+ + | TE | List | + +----------------------------------+-----------------+ + | Timing-Allow-Origin | List | + +----------------------------------+-----------------+ + | Trailer | List | + +----------------------------------+-----------------+ + | Transfer-Encoding | List | + +----------------------------------+-----------------+ + | Upgrade-Insecure-Requests | Item | + +----------------------------------+-----------------+ + | Vary | List | + +----------------------------------+-----------------+ + | X-Content-Type-Options | Item | + +----------------------------------+-----------------+ + | X-Frame-Options | Item | + +----------------------------------+-----------------+ + | X-XSS-Protection | List | + +----------------------------------+-----------------+ + + Table 1: Compatible Fields + +2.1. Caveats + + Note the following caveats regarding compatibility: + + Parsing differences: Some values may fail to parse as Structured + Fields, even though they are valid according to their originally + specified syntax. For example, HTTP parameter names are case- + insensitive (per Section 5.6.6 of [HTTP]), but Structured Fields + require them to be all-lowercase. Likewise, many Dictionary-based + fields (e.g., Cache-Control, Expect-CT, Pragma, Prefer, + Preference-Applied, Surrogate-Control) have case-insensitive keys. + Similarly, the parameters rule in HTTP (see Section 5.6.6 of + [HTTP]) allows whitespace before the ";" delimiter, but Structured + Fields does not. And, Section 5.6.4 of [HTTP] allows backslash- + escaping most characters in quoted strings, whereas Structured + Field Strings only escape "\" and DQUOTE. The vast majority of + fields seen in typical traffic do not exhibit these behaviors. 
+ + Error handling: Parsing algorithms specified (or just widely + implemented) for current HTTP headers may differ from those in + Structured Fields in details such as error handling. For example, + HTTP specifies that repeated directives in the Cache-Control + header field have a different precedence than that assigned by a + Dictionary structured field (which Cache-Control is mapped to). + + Token limitations: In Structured Fields, tokens are required to + begin with an alphabetic character or "*", whereas HTTP tokens + allow a wider range of characters. This prevents use of mapped + values that begin with one of these characters. For example, + media types, field names, methods, range-units, character and + transfer codings that begin with a number or special character + other than "*" might be valid HTTP protocol elements, but will not + be able to be represented as Structured Field Tokens. + + Integer limitations: Structured Fields Integers can have at most 15 + digits; larger values will not be able to be represented in them. + + IPv6 Literals: Fields whose values contain IPv6 literal addresses + (such as CDN-Loop, Host, and Origin) are not able to be + represented as Structured Fields Tokens, because the brackets used + to delimit them are not allowed in Tokens. + + Empty Field Values: Empty and whitespace-only field values are + considered errors in Structured Fields. For compatible fields, an + empty field indicates that the field should be silently ignored. + + Alt-Svc: Some ALPN tokens (e.g., h3-Q43) do not conform to key's + syntax, and therefore cannot be represented as a Token. Since the + final version of HTTP/3 uses the h3 token, this shouldn't be a + long-term issue, although future tokens may again violate this + assumption. + + Content-Length: Note that Content-Length is defined as a List + because it is not uncommon for implementations to mistakenly send + multiple values. See Section 8.6 of [HTTP] for handling + requirements. 
+ + Retry-After: Only the delta-seconds form of Retry-After can be + represented; a Retry-After value containing a http-date will need + to be converted into delta-seconds to be conveyed as a Structured + Field Value. + +3. Mapped Fields + + Some HTTP field values have syntax that cannot be successfully parsed + as Structured Field values. Instead, it is necessary to map them + into a Structured Field value. + + For example, the Date HTTP header field carries a date: + + Date: Sun, 06 Nov 1994 08:49:37 GMT + + Its value would be mapped to: + + @784111777 + + Unlike those listed in Section 2, these representations are not + compatible with the original fields' syntax, and MUST NOT be used + unless they are explicitly and unambiguously supported. For example, + this means that sending them to a next-hop recipient in HTTP requires + prior negotiation. This specification does not define how to do so. + +3.1. URLs + + The field names in Table 2 have values that can be mapped into + Structured Field values by treating the original field's value as a + String. + + +==================+ + | Field Name | + +==================+ + | Content-Location | + +------------------+ + | Location | + +------------------+ + | Referer | + +------------------+ + + Table 2: URL Fields + + For example, this Location field: + + Location: https://example.com/foo + + would have a mapped value of: + + "https://example.com/foo" + +3.2. Dates + + The field names in Table 3 have values that can be mapped into + Structured Field values by parsing their payload according to + Section 5.6.7 of [HTTP] and representing the result as a Date. 
+ + +=====================+ + | Field Name | + +=====================+ + | Date | + +---------------------+ + | Expires | + +---------------------+ + | If-Modified-Since | + +---------------------+ + | If-Unmodified-Since | + +---------------------+ + | Last-Modified | + +---------------------+ + + Table 3: Date Fields + + For example, an Expires field's value could be mapped as: + + @1659578233 + +3.3. ETags + + The field value of the ETag header field can be mapped into a + Structured Field value by representing the entity-tag as a String, + and the weakness flag as a Boolean "w" parameter on it, where true + indicates that the entity-tag is weak; if 0 or unset, the entity-tag + is strong. + + For example, this ETag header field: + + ETag: W/"abcdef" + + would have a mapped value of: + + "abcdef"; w + + If-None-Match's field value can be mapped into a Structured Field + value which is a List of the structure described above. When a field + value contains "*", it is represented as a Token. + + Likewise, If-Match's field value can be mapped into a Structured + Field value in the same manner. + + For example, this If-None-Match field: + + If-None-Match: W/"abcdef", "ghijkl", * + + would have a mapped value of: + + "abcdef"; w, "ghijkl", * + +3.4. Cookies + + The field values of the Cookie and Set-Cookie fields [COOKIES] can be + mapped into Structured Fields Lists. + + In each case, a cookie is represented as an Inner List containing two + Items; the cookie name and value. The cookie name is always a + String; the cookie value is a String, unless it can be successfully + parsed as the textual representation of another, bare Item structured + type (e.g., Byte Sequence, Decimal, Integer, Token, or Boolean). + + Cookie attributes map to Parameters on the Inner List, with the + parameter name being forced to lowercase. Cookie attribute values + are Strings unless a specific type is defined for them. 
This + specification defines types for existing cookie attributes in + Table 4. + + +================+=================+ + | Parameter Name | Structured Type | + +================+=================+ + | Domain | String | + +----------------+-----------------+ + | HttpOnly | Boolean | + +----------------+-----------------+ + | Expires | Date | + +----------------+-----------------+ + | Max-Age | Integer | + +----------------+-----------------+ + | Path | String | + +----------------+-----------------+ + | Secure | Boolean | + +----------------+-----------------+ + | SameSite | Token | + +----------------+-----------------+ + + Table 4: Set-Cookie Parameter Types + + The Expires attribute is mapped to a Date representation of parsed- + cookie-date (see Section 5.1.1 of [COOKIES]). + + For example, this Set-Cookie field: + + Set-Cookie: Lang=en-US; Expires=Wed, 09 Jun 2021 10:18:14 GMT; + samesite=Strict; secure + + would have a mapped value of: + + ("Lang" "en-US"); expires=@1623233894; + samesite=Strict; secure + + And this Cookie field: + + Cookie: SID=31d4d96e407aad42; lang=en-US + + would have a mapped value of: + + ("SID" "31d4d96e407aad42"), ("lang" "en-US") + +4. IANA Considerations + + Please add the following note to the "Hypertext Transfer Protocol + (HTTP) Field Name Registry": + + A prefix of "*" in the Structured Type column indicates that it is + a retrofit type (i.e., not natively Structured); see RFC nnnn. + + Then, add a new column, "Structured Type", with the values from + Section 2 assigned to the nominated registrations, prefixing each + with "*" to indicate that it is a retrofit type. + + Finally, add a new column to the "Cookie Attribute Registry" + established by [COOKIES] with the title "Structured Type", using + information from Table 4. + +5. Security Considerations + + Section 2 identifies existing HTTP fields that can be parsed and + serialised with the algorithms defined in [STRUCTURED-FIELDS]. 
+ Variances from existing parser behavior might be exploitable, + particularly if they allow an attacker to target one implementation + in a chain (e.g., an intermediary). However, given the considerable + variance in parsers already deployed, convergence towards a single + parsing algorithm is likely to have a net security benefit in the + longer term. + + Section 3 defines alternative representations of existing fields. + Because downstream consumers might interpret the message differently + based upon whether they recognise the alternative representation, + implementations are prohibited from generating such values unless + they have negotiated support for them with their peer. This + specification does not define such a mechanism, but any such + definition needs to consider the implications of doing so carefully. + +6. Normative References + + [COOKIES] Bingler, S., West, M., and J. Wilander, "Cookies: HTTP + State Management Mechanism", Work in Progress, Internet- + Draft, draft-ietf-httpbis-rfc6265bis-18, 20 December 2024, + . + + [HTTP] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Semantics", STD 97, RFC 9110, + DOI 10.17487/RFC9110, June 2022, + . + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + . + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, . + + [STRUCTURED-FIELDS] + Nottingham, M. and P. Kamp, "Structured Field Values for + HTTP", Work in Progress, Internet-Draft, draft-ietf- + httpbis-sfbis-06, 21 April 2024, + . 
+ +Author's Address + + Mark Nottingham + Prahran + Australia + Email: mnot@mnot.net + URI: https://www.mnot.net/ diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-rfc6265bis.html b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-rfc6265bis.html new file mode 100644 index 000000000..948355197 --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-rfc6265bis.html @@ -0,0 +1,4598 @@ + + + + + + +Cookies: HTTP State Management Mechanism + + + + + + + + + + + + + + + + + + + + + + + + +
Internet-DraftCookies: HTTP State Management MechanismJanuary 2025
Bingler, et al.Expires 11 July 2025[Page]
+
+
+
+
Workgroup:
+
HTTP
+
Internet-Draft:
+
draft-ietf-httpbis-rfc6265bis-latest
+
Obsoletes:
+
+6265 (if approved)
+
Published:
+
+ +
+
Intended Status:
+
Standards Track
+
Expires:
+
+
Authors:
+
+
+
S. Bingler, Ed. +
+
Google LLC
+
+
+
M. West, Ed. +
+
Google LLC
+
+
+
J. Wilander, Ed. +
+
Apple, Inc
+
+
+
+
+

Cookies: HTTP State Management Mechanism

+
+

Abstract

+

This document defines the HTTP Cookie and Set-Cookie header fields. These header fields can be used by HTTP servers to store state (called cookies) at HTTP user agents, letting the servers maintain a stateful session over the mostly stateless HTTP protocol. Although cookies have many historical infelicities that degrade their security and privacy, the Cookie and Set-Cookie header fields are widely used on the Internet. This document obsoletes RFC 6265.

+
+
+

+About This Document +

+

This note is to be removed before publishing as an RFC.

+

+ Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ietf-httpbis-rfc6265bis/.

+

+ Discussion of this document takes place on the HTTP Working Group mailing list (mailto:ietf-http-wg@w3.org), which is archived at https://lists.w3.org/Archives/Public/ietf-http-wg/. Working Group information can be found at https://httpwg.org/.

+

Source for this draft and an issue tracker can be found at https://github.com/httpwg/http-extensions/labels/6265bis.

+
+
+
+

+Status of This Memo +

+

+ This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

+

+ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

+

+ Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

+

+ This Internet-Draft will expire on 11 July 2025.

+
+
+ +
+
+

+Table of Contents +

+ +
+
+
+
+

+1. Introduction +

+

This document defines the HTTP Cookie and Set-Cookie header fields. Using the Set-Cookie header field, an HTTP server can pass name/value pairs and associated metadata (called cookies) to a user agent. When the user agent makes subsequent requests to the server, the user agent uses the metadata and other information to determine whether to return the name/value pairs in the Cookie header field.

+

Although simple on their surface, cookies have a number of complexities. For example, the server indicates a scope for each cookie when sending it to the user agent. The scope indicates the maximum amount of time in which the user agent should return the cookie, the servers to which the user agent should return the cookie, and the connection types for which the cookie is applicable.

+

For historical reasons, cookies contain a number of security and privacy infelicities. For example, a server can indicate that a given cookie is intended for "secure" connections, but the Secure attribute does not provide integrity in the presence of an active network attacker. Similarly, cookies for a given host are shared across all the ports on that host, even though the usual "same-origin policy" used by web browsers isolates content retrieved via different ports.

+

This specification applies to developers of both cookie-producing servers and cookie-consuming user agents. Section 3.2 helps to clarify the intended target audience for each implementation type.

+

To maximize interoperability with user agents, servers SHOULD limit themselves +to the well-behaved profile defined in Section 4 when generating cookies.

+

User agents MUST implement the more liberal processing rules defined in Section 5, +in order to maximize interoperability with existing servers that do not +conform to the well-behaved profile defined in Section 4.

+

This document specifies the syntax and semantics of these header fields as they are +actually used on the Internet. In particular, this document does not create +new syntax or semantics beyond those in use today. The recommendations for +cookie generation provided in Section 4 represent a preferred subset of current +server behavior, and even the more liberal cookie processing algorithm provided +in Section 5 does not recommend all of the syntactic and semantic variations in +use today. Where some existing software differs from the recommended protocol +in significant ways, the document contains a note explaining the difference.

+

This document obsoletes [RFC6265].

+
+
+
+
+

+2. Conventions +

+
+
+

+2.1. Conformance Criteria +

+

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", +"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be +interpreted as described in [RFC2119].

+

Requirements phrased in the imperative as part of algorithms (such as "strip any +leading space characters" or "return false and abort these steps") are to be +interpreted with the meaning of the key word ("MUST", "SHOULD", "MAY", etc.) +used in introducing the algorithm.

+

Conformance requirements phrased as algorithms or specific steps can be +implemented in any manner, so long as the end result is equivalent. In +particular, the algorithms defined in this specification are intended to be +easy to understand and are not intended to be performant.

+
+
+
+
+

+2.2. Syntax Notation +

+

This specification uses the Augmented Backus-Naur Form (ABNF) notation of +[RFC5234].

+

The following core rules are included by reference, as defined in [RFC5234], +Appendix B.1: ALPHA (letters), CR (carriage return), CRLF (CR LF), CTLs +(controls), DIGIT (decimal 0-9), DQUOTE (double quote), HEXDIG +(hexadecimal 0-9/A-F/a-f), LF (line feed), NUL (null octet), OCTET (any +8-bit sequence of data except NUL), SP (space), HTAB (horizontal tab), +CHAR (any [USASCII] character), VCHAR (any visible [USASCII] character), +and WSP (whitespace).

+

The OWS (optional whitespace) and BWS (bad whitespace) rules are defined in +Section 5.6.3 of [HTTP].

+
+
+
+
+

+2.3. Terminology +

+

The terms "user agent", "client", "server", "proxy", and "origin server" have +the same meaning as in the HTTP/1.1 specification ([HTTP], Section 3).

+

The request-host is the name of the host, as known by the user agent, to which +the user agent is sending an HTTP request or from which it is receiving an HTTP +response (i.e., the name of the host to which it sent the corresponding HTTP +request).

+

The term request-uri refers to "target URI" as defined in Section 7.1 of [HTTP].

+

Two sequences of octets are said to case-insensitively match each other if and +only if they are equivalent under the i;ascii-casemap collation defined in +[RFC4790].
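As a non-normative illustration, the i;ascii-casemap collation folds only the 26 ASCII letters; in Python, `bytes.lower()` happens to perform exactly this ASCII-only fold, so the comparison can be sketched as:

```python
def ascii_casemap_equal(a: bytes, b: bytes) -> bool:
    # i;ascii-casemap folds only the ASCII letters A-Z/a-z; all other
    # octets must match exactly. bytes.lower() performs precisely this
    # ASCII-only case fold (non-ASCII octets are left untouched).
    return a.lower() == b.lower()
```

Note that a Unicode-aware case fold would be wrong here: non-ASCII octets must compare exactly.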

+

The term string means a sequence of non-NUL octets.

+

The terms "active browsing context", "active document", "ancestor navigables", +"container document", "content navigable", "dedicated worker", "Document", +"inclusive ancestor navigables", "navigable", "opaque origin", "sandboxed +origin browsing context flag", "shared worker", "the worker's Documents", +"top-level traversable", and "WorkerGlobalScope" are defined in [HTML].

+

"Service Workers" are defined in the Service Workers specification +[SERVICE-WORKERS].

+

The term "origin", the mechanism of deriving an origin from a URI, and the "the +same" matching algorithm for origins are defined in [RFC6454].

+

"Safe" HTTP methods include GET, HEAD, OPTIONS, and TRACE, as defined +in Section 9.2.1 of [HTTP].

+

A domain's "public suffix" is the portion of a domain that is controlled by a +public registry, such as "com", "co.uk", and "pvt.k12.wy.us". A domain's +"registrable domain" is the domain's public suffix plus the label to its left. +That is, for https://www.site.example, the public suffix is example, and the +registrable domain is site.example. Whenever possible, user agents SHOULD +use an up-to-date public suffix list, such as the one maintained by the Mozilla +project at [PSL].
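As a rough, non-normative illustration of these definitions, the registrable domain can be derived from a suffix list; the TOY_SUFFIXES set below is a hypothetical stand-in for the full Mozilla Public Suffix List, which real user agents should consult:

```python
# Hypothetical stand-in for the full, regularly updated Public Suffix
# List maintained by the Mozilla project.
TOY_SUFFIXES = {"com", "co.uk", "pvt.k12.wy.us", "example"}

def registrable_domain(host):
    """Return the public suffix plus the label to its left, or None if
    the host is itself a public suffix (or has no matching suffix)."""
    labels = host.lower().split(".")
    # Scanning from the full name downward finds the longest suffix first.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in TOY_SUFFIXES:
            if i == 0:
                return None  # host is itself a public suffix
            return ".".join(labels[i - 1:])
    return None
```

For https://www.site.example this yields "site.example", matching the text above.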

+

The term "request", as well as a request's "client", "current url", "method", +"target browsing context", and "url list", are defined in [FETCH].

+

The term "non-HTTP APIs" refers to non-HTTP mechanisms used to set and retrieve +cookies, such as a web browser API that exposes cookies to scripts.

+

The term "top-level navigation" refers to a navigation of a top-level +traversable.

+
+
+
+
+
+
+

+3. Overview +

+

This section outlines a way for an origin server to send state information to a +user agent and for the user agent to return the state information to the origin +server.

+

To store state, the origin server includes a Set-Cookie header field in an HTTP +response. In subsequent requests, the user agent returns a Cookie request +header field to the origin server. The Cookie header field contains cookies the user agent +received in previous Set-Cookie header fields. The origin server is free to ignore +the Cookie header field or use its contents for an application-defined purpose.

+

Origin servers MAY send a Set-Cookie response header field with any response. An +origin server can include multiple Set-Cookie header fields in a single response. +The presence of a Cookie or a Set-Cookie header field does not preclude HTTP +caches from storing and reusing a response.

+

Origin servers SHOULD NOT fold multiple Set-Cookie header fields into a single +header field. The usual mechanism for folding HTTP headers fields (i.e., as +defined in Section 5.3 of [HTTP]) might change the semantics of the Set-Cookie header +field because the %x2C (",") character is used by Set-Cookie in a way that +conflicts with such folding.
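The hazard can be seen concretely: an Expires attribute's date itself contains %x2C, so splitting a folded field value on commas misparses it. A small illustrative sketch (the header values are examples, not taken from a real exchange):

```python
# Two Set-Cookie field values folded into one, as generic header
# combining would produce them.
folded = ("lang=en-US; Expires=Sun, 06 Nov 1994 08:49:37 GMT, "
          "SID=31d4d96e407aad42")

# Naive comma-splitting cuts the Expires date in half and invents a
# bogus third "cookie".
naive = [part.strip() for part in folded.split(",")]
assert len(naive) == 3                      # expected 2 cookies, got 3 pieces
assert naive[1].startswith("06 Nov 1994")   # a date fragment, not a cookie
```

This is why Set-Cookie must be handled as repeated, unfolded header fields.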

+

User agents MAY ignore Set-Cookie header fields based on response status codes or +the user agent's cookie policy (see Section 5.3).

+
+
+

+3.1. Examples +

+

Using the Set-Cookie header field, a server can send the user agent a short string +in an HTTP response that the user agent will return in future HTTP requests that +are within the scope of the cookie. For example, the server can send the user +agent a "session identifier" named SID with the value 31d4d96e407aad42. The +user agent then returns the session identifier in subsequent requests.

+
+
+== Server -> User Agent ==
+
+Set-Cookie: SID=31d4d96e407aad42
+
+== User Agent -> Server ==
+
+Cookie: SID=31d4d96e407aad42
+
+
+

The server can alter the default scope of the cookie using the Path and +Domain attributes. For example, the server can instruct the user agent to +return the cookie to every path and every subdomain of site.example.

+
+
+== Server -> User Agent ==
+
+Set-Cookie: SID=31d4d96e407aad42; Path=/; Domain=site.example
+
+== User Agent -> Server ==
+
+Cookie: SID=31d4d96e407aad42
+
+
+

As shown in the next example, the server can store multiple cookies at the user +agent. For example, the server can store a session identifier as well as the +user's preferred language by returning two Set-Cookie header fields. Notice +that the server uses the Secure and HttpOnly attributes to provide +additional security protections for the more sensitive session identifier (see +Section 4.1.2).

+
+
+== Server -> User Agent ==
+
+Set-Cookie: SID=31d4d96e407aad42; Path=/; Secure; HttpOnly
+Set-Cookie: lang=en-US; Path=/; Domain=site.example
+
+== User Agent -> Server ==
+
+Cookie: SID=31d4d96e407aad42; lang=en-US
+
+
+

Notice that the Cookie header field above contains two cookies, one named SID and +one named lang.

+

Cookie names are case-sensitive, meaning that if a server sends the user agent +two Set-Cookie header fields that differ only in their name's case, the user +agent will store and return both of those cookies in subsequent requests.

+
+
+== Server -> User Agent ==
+
+Set-Cookie: SID=31d4d96e407aad42
+Set-Cookie: sid=31d4d96e407aad42
+
+== User Agent -> Server ==
+
+Cookie: SID=31d4d96e407aad42; sid=31d4d96e407aad42
+
+
+

If the server wishes the user agent to persist the cookie over +multiple "sessions" (e.g., user agent restarts), the server can specify an +expiration date in the Expires attribute. Note that the user agent might +delete the cookie before the expiration date if the user agent's cookie store +exceeds its quota or if the user manually deletes the server's cookie.

+
+
+== Server -> User Agent ==
+
+Set-Cookie: lang=en-US; Expires=Wed, 09 Jun 2021 10:18:14 GMT
+
+== User Agent -> Server ==
+
+Cookie: SID=31d4d96e407aad42; lang=en-US
+
+
+

Finally, to remove a cookie, the server returns a Set-Cookie header field with an +expiration date in the past. The server will be successful in removing the +cookie only if the Path and the Domain attribute in the Set-Cookie header field +match the values used when the cookie was created.

+
+
+== Server -> User Agent ==
+
+Set-Cookie: lang=; Expires=Sun, 06 Nov 1994 08:49:37 GMT
+
+== User Agent -> Server ==
+
+Cookie: SID=31d4d96e407aad42
+
+
+
+
+
+
+

+3.2. Which Requirements to Implement +

+

The upcoming two sections, Section 4 and Section 5, discuss +the set of requirements for two distinct types of implementations. This section +is meant to help guide implementers in determining which set of requirements +best fits their goals. Choosing the wrong set of requirements could result in a +lack of compatibility with other cookie implementations.

+

It's important to note that being compatible means different things +depending on the implementer's goals. These differences have built up over time +due to both intentional and unintentional spec changes, spec interpretations, +and historical implementation differences.

+

This section roughly divides implementers of the cookie spec into two types, +producers and consumers. These are not official terms and are only used here to +help readers develop an intuitive understanding of the use cases.

+ + +
+
+
+
+
+
+

+4. Server Requirements +

+

This section describes the syntax and semantics of a well-behaved profile of the +Cookie and Set-Cookie header fields.

+ + +
+
+
+
+

+5. User Agent Requirements +

+

This section specifies the Cookie and Set-Cookie header fields in sufficient +detail that a user agent implementing these requirements precisely can +interoperate with existing servers (even those that do not conform to the +well-behaved profile described in Section 4).

+

A user agent could enforce more restrictions than those specified herein (e.g., +restrictions specified by its cookie policy, described in Section 7.2). +However, such additional restrictions may reduce the likelihood that a user +agent will be able to interoperate with existing servers.

+
+
+

+5.1. Subcomponent Algorithms +

+

This section defines some algorithms used by user agents to process specific +subcomponents of the Cookie and Set-Cookie header fields.

+ +
+
+

+5.1.2. Canonicalized Host Names +

+

A canonicalized host name is the string generated by the following algorithm:

+
    +
  1. +

    Convert the host name to a sequence of individual domain name labels.

    +
  2. +
  3. +

    Convert each label that is not a Non-Reserved LDH (NR-LDH) label, to an +A-label (see Section 2.3.2.1 of [RFC5890] for the former and latter).

    +
  4. +
  5. +

    Concatenate the resulting labels, separated by a %x2E (".") character.

    +
  6. +
+
+
+
+
+
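The canonicalization steps above can be sketched as follows. Note this is a simplification: Python's built-in "idna" codec implements IDNA 2003 ToASCII, whereas browsers typically apply UTS #46 / IDNA 2008 rules, which differ in edge cases:

```python
def canonicalize_host(host):
    """Sketch of host canonicalization: split into labels, lower-case
    ASCII labels, convert non-ASCII labels to A-labels, and rejoin
    with "." separators."""
    out = []
    for label in host.split("."):
        if label.isascii():
            out.append(label.lower())
        else:
            # Python's "idna" codec performs ToASCII (IDNA 2003);
            # a production UA would use UTS #46 processing instead.
            out.append(label.encode("idna").decode("ascii"))
    return ".".join(out)
```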

+5.1.3. Domain Matching +

+

A string domain-matches a given domain string if at least one of the following +conditions hold:

+
    +
  • +

    The domain string and the string are identical. (Note that both the domain +string and the string will have been canonicalized to lower case at this +point.)

    +
  • +
  • +

    All of the following conditions hold:

    +
      +
    • +

      The domain string is a suffix of the string.

      +
    • +
    • +

      The last character of the string that is not included in the domain +string is a %x2E (".") character.

      +
    • +
    • +

      The string is a host name (i.e., not an IP address).

      +
    • +
    +
  • +
+
+
+ +
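A non-normative sketch of these conditions, assuming both inputs have already been canonicalized to lower case:

```python
import ipaddress

def is_ip_address(s):
    try:
        ipaddress.ip_address(s)
        return True
    except ValueError:
        return False

def domain_matches(string, domain_string):
    """True if string domain-matches domain_string: either they are
    identical, or domain_string is a suffix of string, the character
    immediately before that suffix is ".", and string is a host name
    (not an IP address)."""
    if string == domain_string:
        return True
    return (string.endswith(domain_string)
            and string[-len(domain_string) - 1] == "."
            and not is_ip_address(string))
```

So "www.site.example" domain-matches "site.example", but "mysite.example" does not, and an IP address never domain-matches a shorter suffix of itself.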
+
+
+
+

+5.2. "Same-site" and "cross-site" Requests +

+

Two origins are same-site if they satisfy the "same site" criteria defined in +[SAMESITE]. A request is "same-site" if the following criteria are true:

+
    +
  1. +

    The request is not the result of a reload navigation triggered through a +user interface element (as defined by the user agent; e.g., a request +triggered by the user clicking a refresh button on a toolbar).

    +
  2. +
  3. +

    The request's current url's origin is same-site with the request's +client's "site for cookies" (which is an origin), or the request has no +client or the request's client is null.
    The request's current url's origin is same-site with the request's
client's "site for cookies" (which is an origin), or the request has no
client or the request's client is null.

    +
  4. +
+

Requests which are the result of a reload navigation triggered through a user +interface element are same-site if the reloaded document was originally +navigated to via a same-site request. A request that is not "same-site" is +instead "cross-site".

+

The request's client's "site for cookies" is calculated depending upon its +client's type, as described in the following subsections:

+
+
+

+5.2.1. Document-based requests +

+

The URI displayed in a user agent's address bar is the only security context +directly exposed to users, and therefore the only signal users can reasonably +rely upon to determine whether or not they trust a particular website. The +origin of that URI represents the context in which a user most likely believes +themselves to be interacting. We'll define this origin, the top-level +traversable's active document's origin, as the "top-level origin".

+

For a document displayed in a top-level traversable, we can stop here: the +document's "site for cookies" is the top-level origin.

+

For container documents, we need to audit the origins of each of a document's +ancestor navigables' active documents in order to account for the +"multiple-nested scenarios" described in Section 4 of [RFC7034]. A document's +"site for cookies" is the top-level origin if and only if the top-level origin +is same-site with the document's origin, and with each of the document's +ancestor documents' origins. Otherwise its "site for cookies" is an origin set +to an opaque origin.

+

Given a Document (document), the following algorithm returns its "site for +cookies":

+
    +
  1. +

    Let top-document be the active document in document's navigable's +top-level traversable.

    +
  2. +
  3. +

    Let top-origin be the origin of top-document's URI if top-document's +sandboxed origin browsing context flag is set, and top-document's origin +otherwise.

    +
  4. +
  5. +

    Let documents be a list consisting of the active documents of +document's inclusive ancestor navigables.

    +
  6. +
  7. +

    For each item in documents:

    +
      +
    1. +

      Let origin be the origin of item's URI if item's sandboxed origin +browsing context flag is set, and item's origin otherwise.

      +
    2. +
    3. +

      If origin is not same-site with top-origin, return an origin set to +an opaque origin.

      +
    4. +
    +
  8. +
  9. +

    Return top-origin.

    +
  10. +
+

Note: This algorithm only applies when the entire chain of documents from +top-document to document are all active.
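The algorithm can be sketched by reducing each document to its origin. For illustration, the chain of inclusive ancestor documents is passed in explicitly (top-level first) and the same-site comparison is simplified to string equality; both are assumptions, not how a real UA would represent this:

```python
OPAQUE = "opaque-origin"  # stand-in for an origin set to an opaque origin

def site_for_cookies(origin_chain):
    """origin_chain: origins of the active documents of the document's
    inclusive ancestor navigables, top-level traversable first. Returns
    the top-level origin only if every ancestor is same-site with it."""
    top_origin = origin_chain[0]
    for origin in origin_chain:
        if origin != top_origin:  # same-site check, simplified
            return OPAQUE
    return top_origin
```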

+
+
+
+
+

+5.2.2. Worker-based requests +

+

Worker-driven requests aren't as clear-cut as document-driven requests, as +there isn't a clear link between a top-level traversable and a worker. +This is especially true for Service Workers [SERVICE-WORKERS], which may +execute code in the background, without any document visible at all.

+

Note: The descriptions below assume that workers must be same-origin with +the documents that instantiate them. If this invariant changes, we'll need to +take the worker's script's URI into account when determining their status.

+
+
+
+5.2.2.1. Dedicated and Shared Workers +
+

Dedicated workers are simple, as each dedicated worker is bound to one and only +one document. Requests generated from a dedicated worker (via importScripts, +XMLHttpRequest, fetch(), etc.) define their "site for cookies" as that +document's "site for cookies".

+

Shared workers may be bound to multiple documents at once. As it is quite +possible for those documents to have distinct "site for cookies" values, the +worker's "site for cookies" will be an origin set to an opaque origin in cases +where the values are not all same-site with the worker's origin, and the +worker's origin in cases where the values agree.

+

Given a WorkerGlobalScope (worker), the following algorithm returns its "site +for cookies":

+
    +
  1. +

    Let site be worker's origin.

    +
  2. +
  3. +

    For each document in worker's Documents:

    +
      +
    1. +

      Let document-site be document's "site for cookies" (as defined +in Section 5.2.1).

      +
    2. +
    3. +

      If document-site is not same-site with site, return an origin +set to an opaque origin.

      +
    4. +
    +
  4. +
  5. +

    Return site.

    +
  6. +
+
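With the same simplifications as before (origins as strings, same-site reduced to equality), the shared-worker algorithm can be sketched as:

```python
OPAQUE = "opaque-origin"  # stand-in for an origin set to an opaque origin

def worker_site_for_cookies(worker_origin, document_sites):
    """document_sites: the "site for cookies" of each of the worker's
    Documents. The worker's "site for cookies" is its own origin only
    if every bound document's value is same-site with it."""
    for site in document_sites:
        if site != worker_origin:  # same-site check, simplified
            return OPAQUE
    return worker_origin
```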
+
+
+
+
+5.2.2.2. Service Workers +
+

Service Workers are more complicated, as they act as a completely separate +execution context with only tangential relationship to the Document which +registered them.

+

How user agents handle Service Workers may differ, but user agents SHOULD +match the [SERVICE-WORKERS] specification.

+
+
+
+
+
+
+
+
+5.3. Ignoring Set-Cookie Header Fields +

User agents MAY ignore Set-Cookie header fields contained in responses with 100-level +status codes or based on the user agent's cookie policy (see Section 7.2).

+

All other Set-Cookie header fields SHOULD be processed according to Section 5.6. +That is, Set-Cookie header fields contained in responses with non-100-level status +codes (including those in responses with 400- and 500-level status codes) +SHOULD be processed unless ignored according to the user agent's cookie policy.

+
+
+
+
+5.4. Cookie Name Prefixes +

User agents' requirements for cookie name prefixes differ slightly from +servers' (Section 4.1.3) in that UAs MUST match the prefix string +case-insensitively.

+

The normative requirements for the prefixes are detailed in the storage model +algorithm defined in Section 5.7.

+

This is because some servers process cookies case-insensitively, and as a +result may unintentionally generate or accept miscapitalized prefixes.

+

For example, if a server sends the following Set-Cookie header field

+
+
+Set-Cookie: __SECURE-SID=12345
+
+
+

to a UA which checks prefixes case-sensitively, it will accept this cookie and +the server would incorrectly believe the cookie is subject to the same +guarantees as one spelled __Secure-.

+

Additionally, the server is vulnerable to an attacker that purposefully +miscapitalizes a cookie in order to impersonate a prefixed cookie. For example, +suppose a site already has a cookie __Secure-SID=12345, and by some means an +attacker sends the following Set-Cookie header field for the site to a UA which +checks prefixes case-sensitively.

+
+
+Set-Cookie: __SeCuRe-SID=evil
+
+
+

The next time a user visits the site the UA will send both cookies:

+
+
+Cookie: __Secure-SID=12345; __SeCuRe-SID=evil
+
+
+

The server, being case-insensitive, won't be able to tell the difference +between the two cookies, allowing the attacker to compromise the site.

+

To prevent these issues, UAs MUST match cookie name prefixes case-insensitively.

+

Note: Cookies with different names are still considered separate by UAs. So +both __Secure-foo=bar and __secure-foo=baz can exist as distinct cookies +simultaneously and both would have the requirements of the __Secure- prefix +applied.

+

The following are examples of Set-Cookie header fields that would be rejected +by a conformant user agent.

+
+
+Set-Cookie: __Secure-SID=12345; Domain=site.example
+Set-Cookie: __secure-SID=12345; Domain=site.example
+Set-Cookie: __SECURE-SID=12345; Domain=site.example
+Set-Cookie: __Host-SID=12345
+Set-Cookie: __host-SID=12345; Secure
+Set-Cookie: __host-SID=12345; Domain=site.example
+Set-Cookie: __HOST-SID=12345; Domain=site.example; Path=/
+Set-Cookie: __Host-SID=12345; Secure; Domain=site.example; Path=/
+Set-Cookie: __host-SID=12345; Secure; Domain=site.example; Path=/
+Set-Cookie: __HOST-SID=12345; Secure; Domain=site.example; Path=/
+
+
+

Whereas the following Set-Cookie header fields would be accepted if set from +a secure origin.

+
+
+Set-Cookie: __Secure-SID=12345; Domain=site.example; Secure
+Set-Cookie: __secure-SID=12345; Domain=site.example; Secure
+Set-Cookie: __SECURE-SID=12345; Domain=site.example; Secure
+Set-Cookie: __Host-SID=12345; Secure; Path=/
+Set-Cookie: __host-SID=12345; Secure; Path=/
+Set-Cookie: __HOST-SID=12345; Secure; Path=/
+
+
+
+
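The prefix checks those examples exercise can be sketched as a single non-normative predicate. The parameter names are illustrative, and the storage model (Section 5.7) applies these rules among many others:

```python
def prefix_ok(name, secure, host_only, has_path_attr, path):
    """Case-insensitive cookie-name-prefix checks:
    __Secure- requires the secure-only-flag;
    __Host- additionally requires the host-only-flag (no Domain
    attribute) and an explicit Path attribute with a path of "/"."""
    lower = name.lower()
    if lower.startswith("__secure-"):
        return secure
    if lower.startswith("__host-"):
        return secure and host_only and has_path_attr and path == "/"
    return True  # no prefix: no extra requirements from this check
```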
+ + +
+
+

+5.7. Storage Model +

+

The user agent stores the following fields about each cookie: name, value, +expiry-time, domain, path, creation-time, last-access-time, +persistent-flag, host-only-flag, secure-only-flag, http-only-flag, +and same-site-flag.

+

When the user agent "receives a cookie" from a request-uri with name +cookie-name, value cookie-value, and attributes cookie-attribute-list, the +user agent MUST process the cookie as follows:

+
    +
  1. +

    A user agent MAY ignore a received cookie in its entirety. See +Section 5.3.

    +
  2. +
  3. +

    If cookie-name is empty and cookie-value is empty, abort these steps and +ignore the cookie entirely.

    +
  4. +
  5. +

    If the cookie-name or the cookie-value contains a +%x00-08 / %x0A-1F / %x7F character (CTL characters excluding HTAB), +abort these steps and ignore the cookie entirely.

    +
  6. +
  7. +

    If the sum of the lengths of cookie-name and cookie-value is more than +4096 octets, abort these steps and ignore the cookie entirely.

    +
  8. +
  9. +

    Create a new cookie with name cookie-name, value cookie-value. Set the +creation-time and the last-access-time to the current date and time.

    +
  10. +
  11. +

    If the cookie-attribute-list contains an attribute with an attribute-name +of "Max-Age":

    +
      +
    1. +

      Set the cookie's persistent-flag to true.

      +
    2. +
    3. +

      Set the cookie's expiry-time to attribute-value of the last +attribute in the cookie-attribute-list with an attribute-name of +"Max-Age".

      +
    4. +
    +

    +Otherwise, if the cookie-attribute-list contains an attribute with an +attribute-name of "Expires" (and does not contain an attribute with an +attribute-name of "Max-Age"):

    +
      +
    1. +

      Set the cookie's persistent-flag to true.

      +
    2. +
    3. +

      Set the cookie's expiry-time to attribute-value of the last +attribute in the cookie-attribute-list with an attribute-name of +"Expires".

      +
    4. +
    +

    +Otherwise:

    +
      +
    1. +

      Set the cookie's persistent-flag to false.

      +
    2. +
    3. +

      Set the cookie's expiry-time to the latest representable date.

      +
    4. +
    +
  12. +
  13. +

    If the cookie-attribute-list contains an attribute with an +attribute-name of "Domain":

    +
      +
    1. +

      Let the domain-attribute be the attribute-value of the last +attribute in the cookie-attribute-list with both an +attribute-name of "Domain" and an attribute-value whose +length is no more than 1024 octets. (Note that a leading %x2E +("."), if present, is ignored even though that character is not +permitted.)

      +
    2. +
    +

    +Otherwise:

    +
      +
    1. +

      Let the domain-attribute be the empty string.

      +
    2. +
    +
  14. +
  15. +

    If the domain-attribute contains a character that is not in the range of [USASCII] +characters, abort these steps and ignore the cookie entirely.

    +
  16. +
  17. +

    If the user agent is configured to reject "public suffixes" and the +domain-attribute is a public suffix:

    +
      +
    1. +

      If the domain-attribute is identical to the canonicalized +request-host:

      +
        +
      1. +

        Let the domain-attribute be the empty string.

        +
      2. +
      +

      +Otherwise:

      +
        +
      1. +

        Abort these steps and ignore the cookie entirely.

        +
      2. +
      +
    2. +
    +

    +NOTE: This step prevents attacker.example from disrupting the integrity of +site.example by setting a cookie with a Domain attribute of "example".

    +
  18. +
  19. +

    If the domain-attribute is non-empty:

    +
      +
    1. +

      If the canonicalized request-host does not domain-match the +domain-attribute:

      +
        +
      1. +

        Abort these steps and ignore the cookie entirely.

        +
      2. +
      +

      +Otherwise:

      +
        +
      1. +

        Set the cookie's host-only-flag to false.

        +
      2. +
      3. +

        Set the cookie's domain to the domain-attribute.

        +
      4. +
      +
    2. +
    +

    +Otherwise:

    +
      +
    1. +

      Set the cookie's host-only-flag to true.

      +
    2. +
    3. +

      Set the cookie's domain to the canonicalized request-host.

      +
    4. +
    +
  20. +
  21. +

    If the cookie-attribute-list contains an attribute with an +attribute-name of "Path", set the cookie's path to attribute-value of +the last attribute in the cookie-attribute-list with both an +attribute-name of "Path" and an attribute-value whose length is no +more than 1024 octets. Otherwise, set the cookie's path to the +default-path of the request-uri.

    +
  22. +
  23. +

    If the cookie-attribute-list contains an attribute with an +attribute-name of "Secure", set the cookie's secure-only-flag to true. +Otherwise, set the cookie's secure-only-flag to false.

    +
  24. +
  25. +

    If the request-uri does not denote a "secure" connection (as defined by the +user agent), and the cookie's secure-only-flag is true, then abort these +steps and ignore the cookie entirely.

    +
  26. +
  27. +

    If the cookie-attribute-list contains an attribute with an +attribute-name of "HttpOnly", set the cookie's http-only-flag to true. +Otherwise, set the cookie's http-only-flag to false.

    +
  28. +
  29. +

    If the cookie was received from a "non-HTTP" API and the cookie's +http-only-flag is true, abort these steps and ignore the cookie entirely.

    +
  30. +
  31. +

    If the cookie's secure-only-flag is false, and the request-uri does not +denote a "secure" connection, then abort these steps and ignore the cookie +entirely if the cookie store contains one or more cookies that meet all of +the following criteria:

    +
      +
    1. +

      Their name matches the name of the newly-created cookie.

      +
    2. +
    3. +

      Their secure-only-flag is true.

      +
    4. +
    5. +

      Their domain domain-matches the domain of the newly-created cookie, or +vice-versa.

      +
    6. +
    7. +

      The path of the newly-created cookie path-matches the path of the +existing cookie.

      +
    8. +
    +

    +Note: The path comparison is not symmetric, ensuring only that a +newly-created, non-secure cookie does not overlay an existing secure +cookie, providing some mitigation against cookie-fixing attacks. That is, +given an existing secure cookie named 'a' with a path of '/login', a +non-secure cookie named 'a' could be set for a path of '/' or '/foo', but +not for a path of '/login' or '/login/en'.

    +
  32. +
  33. +

    If the cookie-attribute-list contains an attribute with an +attribute-name of "SameSite", and an attribute-value of "Strict", "Lax", or +"None", set the cookie's same-site-flag to the attribute-value of the last +attribute in the cookie-attribute-list with an attribute-name of "SameSite". +Otherwise, set the cookie's same-site-flag to "Default".

    +
  34. +
  35. +

    If the cookie's same-site-flag is not "None":

    +
      +
    1. +

      If the cookie was received from a "non-HTTP" API, and the API was called +from a navigable's active document whose "site for cookies" is +not same-site with the top-level origin, then abort these steps and +ignore the newly created cookie entirely.

      +
    2. +
    3. +

      If the cookie was received from a "same-site" request (as defined in +Section 5.2), skip the remaining substeps and continue +processing the cookie.

      +
    4. +
    5. +

      If the cookie was received from a request which is navigating a +top-level traversable [HTML] (e.g. if the request's "reserved +client" is either null or an environment whose "target browsing +context"'s navigable is a top-level traversable), skip the remaining +substeps and continue processing the cookie.

      +

      +Note: Top-level navigations can create a cookie with any SameSite +value, even if the new cookie wouldn't have been sent along with +the request had it already existed prior to the navigation.

      +
    6. +
    7. +

      Abort these steps and ignore the newly created cookie entirely.

      +
    8. +
    +
  36. +
  37. +

    If the cookie's "same-site-flag" is "None", abort these steps and ignore the +cookie entirely unless the cookie's secure-only-flag is true.

    +
  38. +
  39. +

    If the cookie-name begins with a case-insensitive match for the string +"__Secure-", abort these steps and ignore the cookie entirely unless the +cookie's secure-only-flag is true.

    +
  40. +
  41. +

    If the cookie-name begins with a case-insensitive match for the string +"__Host-", abort these steps and ignore the cookie entirely unless the +cookie meets all the following criteria:

    +
      +
    1. +

      The cookie's secure-only-flag is true.

      +
    2. +
    3. +

      The cookie's host-only-flag is true.

      +
    4. +
    5. +

      The cookie-attribute-list contains an attribute with an attribute-name +of "Path", and the cookie's path is /.

      +
    6. +
    +
  42. +
  43. +

    If the cookie-name is empty and either of the following conditions are +true, abort these steps and ignore the cookie entirely:

    +
      +
    • +

      the cookie-value begins with a case-insensitive match for the string +"__Secure-"

      +
    • +
    • +

      the cookie-value begins with a case-insensitive match for the string +"__Host-"

      +
    • +
    +
  44. +
  45. +

    If the cookie store contains a cookie with the same name, domain, +host-only-flag, and path as the newly-created cookie:

    +
      +
    1. +

      Let old-cookie be the existing cookie with the same name, domain, +host-only-flag, and path as the newly-created cookie. (Notice that this +algorithm maintains the invariant that there is at most one such +cookie.)

      +
    2. +
    3. +

      If the newly-created cookie was received from a "non-HTTP" API and the +old-cookie's http-only-flag is true, abort these steps and ignore the +newly created cookie entirely.

      +
    4. +
    5. +

      Update the creation-time of the newly-created cookie to match the +creation-time of the old-cookie.

      +
    6. +
    7. +

      Remove the old-cookie from the cookie store.

      +
    8. +
    +
  46. +
  47. +

    Insert the newly-created cookie into the cookie store.

    +
  48. +
+
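One step of the algorithm above worth sketching separately is the protection against a non-secure cookie overlaying an existing secure cookie. In this illustrative sketch, cookies are (name, domain, path, secure-only) tuples, and domain- and path-matching are simplified to suffix/prefix tests:

```python
def blocked_by_secure_cookie(new, store):
    """True if a non-secure cookie received over a non-secure
    connection must be ignored because it would overlay an existing
    secure cookie: same name, domains that domain-match either way,
    and a new path that path-matches the existing cookie's path."""
    name, domain, path, secure_only = new
    if secure_only:
        return False  # the check only applies to non-secure cookies
    for (n, d, p, s) in store:
        if (s and n == name
                and (d.endswith(domain) or domain.endswith(d))
                and path.startswith(p)):  # path-match, simplified
            return True
    return False
```

Matching the note above: with a secure cookie 'a' at '/login' stored, a non-secure 'a' at '/' is allowed, but one at '/login' or '/login/en' is blocked.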

A cookie is "expired" if the cookie has an expiry date in the past.

The user agent MUST evict all expired cookies from the cookie store if, at any time, an expired cookie exists in the cookie store.

At any time, the user agent MAY "remove excess cookies" from the cookie store if the number of cookies sharing a domain field exceeds some implementation-defined upper bound (such as 50 cookies).

At any time, the user agent MAY "remove excess cookies" from the cookie store if the cookie store exceeds some predetermined upper bound (such as 3000 cookies).

When the user agent removes excess cookies from the cookie store, the user agent MUST evict cookies in the following priority order:

  1. Expired cookies.

  2. Cookies whose secure-only-flag is false, and which share a domain field with more than a predetermined number of other cookies.

  3. Cookies that share a domain field with more than a predetermined number of other cookies.

  4. All cookies.

If two cookies have the same removal priority, the user agent MUST evict the +cookie with the earliest last-access-time first.


When "the current session is over" (as defined by the user agent), the user +agent MUST remove from the cookie store all cookies with the persistent-flag +set to false.

5.8. Retrieval Model

This section defines how cookies are retrieved from a cookie store in the form +of a cookie-string. A "retrieval" is any event which requires generating a +cookie-string. For example, a retrieval may occur in order to build a Cookie +header field for an HTTP request, or may be required in order to return a +cookie-string from a call to a "non-HTTP" API that provides access to cookies. A +retrieval has an associated URI, same-site status, and type, which +are defined below depending on the type of retrieval.

5.8.2. Non-HTTP APIs

The user agent MAY implement "non-HTTP" APIs that can be used to access +stored cookies.


A user agent MAY return an empty cookie-string in certain contexts, such as +when a retrieval occurs within a third-party context (see +Section 7.1).


If a user agent does return cookies for a given call to a "non-HTTP" API with +an associated Document, then the user agent MUST compute the cookie-string +following the algorithm defined in Section 5.8.3, where the +retrieval's URI is defined by the caller (see [DOM-DOCUMENT-COOKIE]), the +retrieval's same-site status is "same-site" if the Document's "site for +cookies" is same-site with the top-level origin as defined in +Section 5.2.1 (otherwise it is "cross-site"), and the retrieval's type +is "non-HTTP".

5.8.3. Retrieval Algorithm

Given a cookie store and a retrieval, the following algorithm returns a +cookie-string from a given cookie store.

  1. Let cookie-list be the set of cookies from the cookie store that meets all of the following requirements:

     •  Either:

        -  The cookie's host-only-flag is true and the canonicalized host of the retrieval's URI is identical to the cookie's domain.

        Or:

        -  The cookie's host-only-flag is false and the canonicalized host of the retrieval's URI domain-matches the cookie's domain.

        NOTE: (For user agents configured to reject "public suffixes") It's possible that the public suffix list was changed since a cookie was created. If this change results in a cookie's domain becoming a public suffix then that cookie is considered invalid as it would have been rejected during creation (See Section 5.7 step 9). User agents should be careful to avoid retrieving these invalid cookies even if they domain-match the host of the retrieval's URI.

     •  The retrieval's URI's path path-matches the cookie's path.

     •  If the cookie's secure-only-flag is true, then the retrieval's URI must denote a "secure" connection (as defined by the user agent).

        NOTE: The notion of a "secure" connection is not defined by this document. Typically, user agents consider a connection secure if the connection makes use of transport-layer security, such as SSL or TLS, or if the host is trusted. For example, most user agents consider "https" to be a scheme that denotes a secure protocol and "localhost" to be a trusted host.

     •  If the cookie's http-only-flag is true, then exclude the cookie if the retrieval's type is "non-HTTP".

     •  If the cookie's same-site-flag is not "None" and the retrieval's same-site status is "cross-site", then exclude the cookie unless all of the following conditions are met:

        -  The retrieval's type is "HTTP".

        -  The same-site-flag is "Lax" or "Default".

        -  The HTTP request associated with the retrieval uses a "safe" method.

        -  The target browsing context of the HTTP request associated with the retrieval is the active browsing context or a top-level traversable.

  2. The user agent SHOULD sort the cookie-list in the following order:

     •  Cookies with longer paths are listed before cookies with shorter paths.

     •  Among cookies that have equal-length path fields, cookies with earlier creation-times are listed before cookies with later creation-times.

     NOTE: Not all user agents sort the cookie-list in this order, but this order reflects common practice when this document was written, and, historically, there have been servers that (erroneously) depended on this order.

  3. Update the last-access-time of each cookie in the cookie-list to the current date and time.

  4. Serialize the cookie-list into a cookie-string by processing each cookie in the cookie-list in order:

     1. If the cookie's name is not empty, output the cookie's name followed by the %x3D ("=") character.

     2. If the cookie's value is not empty, output the cookie's value.

     3. If there is an unprocessed cookie in the cookie-list, output the characters %x3B and %x20 ("; ").
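A simplified sketch of this retrieval algorithm, omitting same-site, HttpOnly, and public-suffix handling and using simplified versions of the domain- and path-matching algorithms; the Cookie class and names are illustrative only:

```python
from dataclasses import dataclass

@dataclass
class Cookie:
    name: str
    value: str
    domain: str
    path: str
    host_only: bool = True
    secure_only: bool = False
    creation_time: float = 0.0
    last_access_time: float = 0.0

def domain_match(host, domain):
    # Simplified version of domain-matching (Section 5.1.3).
    return host == domain or host.endswith("." + domain)

def path_match(request_path, cookie_path):
    # Simplified version of path-matching (Section 5.1.4).
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False

def cookie_string(store, host, path, secure, now):
    matched = [
        c for c in store
        if (host == c.domain if c.host_only else domain_match(host, c.domain))
        and path_match(path, c.path)
        and (secure or not c.secure_only)
    ]
    # Longer paths first; ties broken by earlier creation-time.
    matched.sort(key=lambda c: (-len(c.path), c.creation_time))
    for c in matched:
        c.last_access_time = now
    return "; ".join(
        (c.name + "=" if c.name else "") + c.value for c in matched)
```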
6. Implementation Considerations

6.1. Limits

Practical user agent implementations have limits on the number and size of +cookies that they can store. General-use user agents SHOULD provide each of the +following minimum capabilities:

  •  At least 50 cookies per domain.

  •  At least 3000 cookies total.

User agents MAY limit the maximum number of cookies they store, and may evict +any cookie at any time (whether at the request of the user or due to +implementation limitations).


Note that a limit on the maximum number of cookies also limits the total size of +the stored cookies, due to the length limits which MUST be enforced in +Section 5.6.

Servers SHOULD use as few and as small cookies as possible to avoid reaching these implementation limits, to minimize the network bandwidth consumed by the Cookie header field being included in every request, and to avoid reaching server header field limits (see Section 4.2.1).


Servers SHOULD gracefully degrade if the user agent fails to return one or more +cookies in the Cookie header field because the user agent might evict any cookie +at any time.

6.2. Application Programming Interfaces

One reason the Cookie and Set-Cookie header fields use such esoteric syntax is +that many platforms (both in servers and user agents) provide a string-based +application programming interface (API) to cookies, requiring +application-layer programmers to generate and parse the syntax used by the +Cookie and Set-Cookie header fields, which many programmers have done incorrectly, +resulting in interoperability problems.


Instead of providing string-based APIs to cookies, platforms would be +well-served by providing more semantic APIs. It is beyond the scope of this +document to recommend specific API designs, but there are clear benefits to +accepting an abstract "Date" object instead of a serialized date string.
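As one illustration of such a semantic API (a sketch, not a recommended design; all names are invented for this example), a server-side helper might accept an abstract date object and perform the serialization itself, so callers never hand-build date strings:

```python
from datetime import datetime, timezone

def format_expires(dt):
    """Serialize an abstract date into the Expires attribute format."""
    return dt.astimezone(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S GMT")

def set_cookie_header(name, value, *, expires=None, path=None,
                      secure=False, http_only=False):
    """Build a Set-Cookie header value from structured arguments."""
    parts = [f"{name}={value}"]
    if expires is not None:
        parts.append("Expires=" + format_expires(expires))
    if path is not None:
        parts.append("Path=" + path)
    if secure:
        parts.append("Secure")
    if http_only:
        parts.append("HttpOnly")
    return "; ".join(parts)
```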

7. Privacy Considerations

Cookies' primary privacy risk is their ability to correlate user activity. This +can happen on a single site, but is most problematic when activity is tracked across different, +seemingly unconnected Web sites to build a user profile.


Over time, this capability (warned against explicitly in [RFC2109] and all of its successors) +has become widely used for varied reasons including:

  •  authenticating users across sites,

  •  assembling information on users,

  •  protecting against fraud and other forms of undesirable traffic,

  •  targeting advertisements at specific users or at users with specified attributes,

  •  measuring how often ads are shown to users, and

  •  recognizing when an ad resulted in a change in user behavior.

While not every use of cookies is necessarily problematic for privacy, their potential for abuse +has become a widespread concern in the Internet community and broader society. In response to these concerns, user agents +have actively constrained cookie functionality in various ways (as allowed and encouraged by +previous specifications), while avoiding disruption to features they judge desirable for the health +of the Web.

It is too early to declare consensus on which specific mechanism(s) should be used to mitigate cookies' privacy impact; user agents' ongoing changes to how cookies are handled are best characterised as experiments that can provide input into that eventual consensus.

Instead, this document describes limited, general mitigations against the privacy risks associated +with cookies that enjoy wide deployment at the time of writing. It is expected that implementations +will continue to experiment and impose stricter, more well-defined limitations on cookies over +time. Future versions of this document might codify those mechanisms based upon deployment +experience. If functions that currently rely on cookies can be supported by separate, targeted +mechanisms, they might be documented in separate specifications and stricter limitations on cookies +might become feasible.


Note that cookies are not the only mechanism that can be used to track users across sites, so while +these mitigations are necessary to improve Web privacy, they are not sufficient on their own.

7.1. Third-Party Cookies

A "third-party" or cross-site cookie is one that is associated with embedded content (such as +scripts, images, stylesheets, frames) that is obtained from a different server than the one that +hosts the primary resource (usually, the Web page that the user is viewing). Third-party cookies +are often used to correlate users' activity on different sites.


Because of their inherent privacy issues, most user agents now limit third-party cookies in a +variety of ways. Some completely block third-party cookies by refusing to process third-party +Set-Cookie header fields and refusing to send third-party Cookie header fields. Some partition +cookies based upon the first-party context, so that different cookies are sent depending on the +site being browsed. Some block cookies based upon user agent cookie policy and/or user controls.


While this document does not endorse or require a specific approach, it is RECOMMENDED that user +agents adopt a policy for third-party cookies that is as restrictive as compatibility constraints +permit. Consequently, resources cannot rely upon third-party cookies being treated consistently by +user agents for the foreseeable future.

7.3. User Controls

User agents SHOULD provide users with a mechanism for managing the cookies +stored in the cookie store. For example, a user agent might let users delete +all cookies received during a specified time period or all the cookies related +to a particular domain. In addition, many user agents include a user interface +element that lets users examine the cookies stored in their cookie store.


User agents SHOULD provide users with a mechanism for disabling cookies. When +cookies are disabled, the user agent MUST NOT include a Cookie header field in +outbound HTTP requests and the user agent MUST NOT process Set-Cookie header fields +in inbound HTTP responses.


User agents MAY offer a way to change the cookie policy (see +Section 7.2).


User agents MAY provide users the option of preventing persistent storage of +cookies across sessions. When configured thusly, user agents MUST treat all +received cookies as if the persistent-flag were set to false. Some popular +user agents expose this functionality via "private browsing" mode +[Aggarwal2010].

7.4. Expiration Dates

Although servers can set the expiration date for cookies to the distant future, +most user agents do not actually retain cookies for multiple decades. Rather +than choosing gratuitously long expiration periods, servers SHOULD promote user +privacy by selecting reasonable cookie expiration periods based on the purpose +of the cookie. For example, a typical session identifier might reasonably be +set to expire in two weeks.

8. Security Considerations

8.1. Overview

Cookies have a number of security pitfalls. This section overviews a few of the +more salient issues.


In particular, cookies encourage developers to rely on ambient authority for +authentication, often becoming vulnerable to attacks such as cross-site request +forgery [CSRF]. Also, when storing session identifiers in cookies, developers +often create session fixation vulnerabilities.


Transport-layer encryption, such as that employed in HTTPS, is insufficient to +prevent a network attacker from obtaining or altering a victim's cookies because +the cookie protocol itself has various vulnerabilities (see "Weak Confidentiality" +and "Weak Integrity", below). In addition, by default, cookies do not provide +confidentiality or integrity from network attackers, even when used in conjunction +with HTTPS.

8.2. Ambient Authority

A server that uses cookies to authenticate users can suffer security +vulnerabilities because some user agents let remote parties issue HTTP requests +from the user agent (e.g., via HTTP redirects or HTML forms). When issuing +those requests, user agents attach cookies even if the remote party does not +know the contents of the cookies, potentially letting the remote party exercise +authority at an unwary server.


Although this security concern goes by a number of names (e.g., cross-site +request forgery, confused deputy), the issue stems from cookies being a form of +ambient authority. Cookies encourage server operators to separate designation +(in the form of URLs) from authorization (in the form of cookies). +Consequently, the user agent might supply the authorization for a resource +designated by the attacker, possibly causing the server or its clients to +undertake actions designated by the attacker as though they were authorized by +the user.


Instead of using cookies for authorization, server operators might wish to +consider entangling designation and authorization by treating URLs as +capabilities. Instead of storing secrets in cookies, this approach stores +secrets in URLs, requiring the remote entity to supply the secret itself. +Although this approach is not a panacea, judicious application of these +principles can lead to more robust security.

8.3. Clear Text

Unless sent over a secure channel (such as TLS), the information in the Cookie +and Set-Cookie header fields is transmitted in the clear.

  1. All sensitive information conveyed in these header fields is exposed to an eavesdropper.

  2. A malicious intermediary could alter the header fields as they travel in either direction, with unpredictable results.

  3. A malicious client could alter the Cookie header fields before transmission, with unpredictable results.

Servers SHOULD encrypt and sign the contents of cookies (using whatever format +the server desires) when transmitting them to the user agent (even when sending +the cookies over a secure channel). However, encrypting and signing cookie +contents does not prevent an attacker from transplanting a cookie from one user +agent to another or from replaying the cookie at a later time.
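As an illustration of signing (though not encrypting) cookie contents, a server might attach an HMAC tag to each value. This is a sketch under assumed names, and, as noted above, it does not prevent an attacker from transplanting or replaying a valid cookie:

```python
import base64
import hashlib
import hmac

def sign_cookie_value(key: bytes, value: bytes) -> str:
    """Attach an HMAC-SHA-256 tag so the server can detect tampering."""
    tag = hmac.new(key, value, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(value).decode() + "." +
            base64.urlsafe_b64encode(tag).decode())

def verify_cookie_value(key: bytes, signed: str):
    """Return the original value if the tag verifies, else None."""
    try:
        v_b64, tag_b64 = signed.split(".", 1)
        value = base64.urlsafe_b64decode(v_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except Exception:
        return None
    expected = hmac.new(key, value, hashlib.sha256).digest()
    # Constant-time comparison avoids timing side channels.
    return value if hmac.compare_digest(tag, expected) else None
```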


In addition to encrypting and signing the contents of every cookie, servers that +require a higher level of security SHOULD use the Cookie and Set-Cookie +header fields only over a secure channel. When using cookies over a secure channel, +servers SHOULD set the Secure attribute (see Section 4.1.2.5) for every +cookie. If a server does not set the Secure attribute, the protection +provided by the secure channel will be largely moot.


For example, consider a webmail server that stores a session identifier in a +cookie and is typically accessed over HTTPS. If the server does not set the +Secure attribute on its cookies, an active network attacker can intercept any +outbound HTTP request from the user agent and redirect that request to the +webmail server over HTTP. Even if the webmail server is not listening for HTTP +connections, the user agent will still include cookies in the request. The +active network attacker can intercept these cookies, replay them against the +server, and learn the contents of the user's email. If, instead, the server had +set the Secure attribute on its cookies, the user agent would not have +included the cookies in the clear-text request.

8.4. Session Identifiers

Instead of storing session information directly in a cookie (where it might be +exposed to or replayed by an attacker), servers commonly store a nonce (or +"session identifier") in a cookie. When the server receives an HTTP request +with a nonce, the server can look up state information associated with the +cookie using the nonce as a key.
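The nonce pattern described above can be sketched as follows (illustrative only; a real deployment would also need expiry and persistence):

```python
import secrets

SESSIONS = {}   # nonce -> server-side session state

def create_session(user_state):
    """Store state server-side; hand the client only a random nonce."""
    nonce = secrets.token_urlsafe(32)
    SESSIONS[nonce] = user_state
    return nonce

def lookup_session(nonce):
    """Use the nonce from the Cookie header as a lookup key."""
    return SESSIONS.get(nonce)
```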


Using session identifier cookies limits the damage an attacker can cause if the +attacker learns the contents of a cookie because the nonce is useful only for +interacting with the server (unlike non-nonce cookie content, which might itself +be sensitive). Furthermore, using a single nonce prevents an attacker from +"splicing" together cookie content from two interactions with the server, which +could cause the server to behave unexpectedly.

Using session identifiers is not without risk. For example, the server SHOULD take care to avoid "session fixation" vulnerabilities. A session fixation attack proceeds in three steps. First, the attacker transplants a session identifier from his or her user agent to the victim's user agent. Second, the victim uses that session identifier to interact with the server, possibly imbuing the session identifier with the user's credentials or confidential information. Third, the attacker uses the session identifier to interact with the server directly, possibly obtaining the user's authority or confidential information.

8.5. Weak Confidentiality

Cookies do not provide isolation by port. If a cookie is readable by a service +running on one port, the cookie is also readable by a service running on another +port of the same server. If a cookie is writable by a service on one port, the +cookie is also writable by a service running on another port of the same server. +For this reason, servers SHOULD NOT both run mutually distrusting services on +different ports of the same host and use cookies to store security-sensitive +information.


Cookies do not provide isolation by scheme. Although most commonly used with +the http and https schemes, the cookies for a given host might also be +available to other schemes, such as ftp and gopher. Although this lack of +isolation by scheme is most apparent in non-HTTP APIs that permit access to +cookies (e.g., HTML's document.cookie API), the lack of isolation by scheme +is actually present in requirements for processing cookies themselves (e.g., +consider retrieving a URI with the gopher scheme via HTTP).


Cookies do not always provide isolation by path. Although the network-level +protocol does not send cookies stored for one path to another, some user +agents expose cookies via non-HTTP APIs, such as HTML's document.cookie API. +Because some of these user agents (e.g., web browsers) do not isolate resources +received from different paths, a resource retrieved from one path might be able +to access cookies stored for another path.

8.6. Weak Integrity

Cookies do not provide integrity guarantees for sibling domains (and their +subdomains). For example, consider foo.site.example and bar.site.example. The +foo.site.example server can set a cookie with a Domain attribute of +"site.example" (possibly overwriting an existing "site.example" cookie set by +bar.site.example), and the user agent will include that cookie in HTTP requests +to bar.site.example. In the worst case, bar.site.example will be unable to +distinguish this cookie from a cookie it set itself. The foo.site.example +server might be able to leverage this ability to mount an attack against +bar.site.example.


Even though the Set-Cookie header field supports the Path attribute, the Path +attribute does not provide any integrity protection because the user agent +will accept an arbitrary Path attribute in a Set-Cookie header field. For +example, an HTTP response to a request for http://site.example/foo/bar can set +a cookie with a Path attribute of "/qux". Consequently, servers SHOULD NOT +both run mutually distrusting services on different paths of the same host and +use cookies to store security-sensitive information.


An active network attacker can also inject cookies into the Cookie header field +sent to https://site.example/ by impersonating a response from +http://site.example/ and injecting a Set-Cookie header field. The HTTPS server +at site.example will be unable to distinguish these cookies from cookies that +it set itself in an HTTPS response. An active network attacker might be able +to leverage this ability to mount an attack against site.example even if +site.example uses HTTPS exclusively.


Servers can partially mitigate these attacks by encrypting and signing the +contents of their cookies, or by naming the cookie with the __Secure- prefix. +However, using cryptography does not mitigate the issue completely because an +attacker can replay a cookie he or she received from the authentic site.example +server in the user's session, with unpredictable results.


Finally, an attacker might be able to force the user agent to delete cookies by +storing a large number of cookies. Once the user agent reaches its storage +limit, the user agent will be forced to evict some cookies. Servers SHOULD NOT +rely upon user agents retaining cookies.

8.7. Reliance on DNS

Cookies rely upon the Domain Name System (DNS) for security. If the DNS is +partially or fully compromised, the cookie protocol might fail to provide the +security properties required by applications.

8.8. SameSite Cookies

8.8.1. Defense in depth

"SameSite" cookies offer a robust defense against CSRF attack when deployed in +strict mode, and when supported by the client. It is, however, prudent to ensure +that this designation is not the extent of a site's defense against CSRF, as +same-site navigations and submissions can certainly be executed in conjunction +with other attack vectors such as cross-site scripting or abuse of page +redirections.

Understanding how and when a request is considered same-site is also important in order to properly design a site for SameSite cookies. For example, if a cross-site top-level request is made to a sensitive page, that request will be considered cross-site and SameSite=Strict cookies won't be sent; that page's sub-resource requests, however, are same-site and would receive SameSite=Strict cookies. Sites can avoid inadvertently allowing access to these sub-resources by returning an error for the initial page request if it doesn't include the appropriate cookies.

Developers are strongly encouraged to deploy the usual server-side defenses +(CSRF tokens, ensuring that "safe" HTTP methods are idempotent, etc) to mitigate +the risk more fully.


Additionally, client-side techniques such as those described in +[app-isolation] may also prove effective against CSRF, and are certainly worth +exploring in combination with "SameSite" cookies.

8.8.2. Top-level Navigations

Setting the SameSite attribute in "strict" mode provides robust defense in +depth against CSRF attacks, but has the potential to confuse users unless sites' +developers carefully ensure that their cookie-based session management systems +deal reasonably well with top-level navigations.


Consider the scenario in which a user reads their email at MegaCorp Inc's +webmail provider https://site.example/. They might expect that clicking on an +emailed link to https://projects.example/secret/project would show them the +secret project that they're authorized to see, but if https://projects.example +has marked their session cookies as SameSite=Strict, then this cross-site +navigation won't send them along with the request. https://projects.example +will render a 404 error to avoid leaking secret information, and the user will +be quite confused.


Developers can avoid this confusion by adopting a session management system that +relies on not one, but two cookies: one conceptually granting "read" access, +another granting "write" access. The latter could be marked as SameSite=Strict, +and its absence would prompt a reauthentication step before executing any +non-idempotent action. The former could be marked as SameSite=Lax, in +order to allow users access to data via top-level navigation, or +SameSite=None, to permit access in all contexts (including cross-site +embedded contexts).
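A possible shape for such a two-cookie scheme (the cookie names here are purely illustrative, and the exact attributes would depend on the site's needs):

```http
Set-Cookie: read_session=<token>; Path=/; Secure; HttpOnly; SameSite=Lax
Set-Cookie: write_session=<token>; Path=/; Secure; HttpOnly; SameSite=Strict
```

A cross-site top-level navigation would then carry only the "read" cookie, and the server would demand reauthentication (re-establishing the "write" cookie) before performing any non-idempotent action.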

8.8.3. Mashups and Widgets

The Lax and Strict values for the SameSite attribute are inappropriate +for some important use-cases. In particular, note that content intended for +embedding in cross-site contexts (social networking widgets or commenting +services, for instance) will not have access to same-site cookies. Cookies +which are required in these situations should be marked with SameSite=None +to allow access in cross-site contexts.


Likewise, some forms of Single-Sign-On might require cookie-based authentication +in a cross-site context; these mechanisms will not function as intended with +same-site cookies and will also require SameSite=None.

8.8.4. Server-controlled

SameSite cookies in and of themselves don't do anything to address the +general privacy concerns outlined in Section 7.1 of [RFC6265]. The "SameSite" +attribute is set by the server, and serves to mitigate the risk of certain kinds +of attacks that the server is worried about. The user is not involved in this +decision. Moreover, a number of side-channels exist which could allow a server +to link distinct requests even in the absence of cookies (for example, connection +and/or socket pooling between same-site and cross-site requests).

8.8.5. Reload navigations

Requests issued for reloads triggered through user interface elements (such as a +refresh button on a toolbar) are same-site only if the reloaded document was +originally navigated to via a same-site request. This differs from the handling +of other reload navigations, which are always same-site if top-level, since the +source navigable's active document is precisely the document being +reloaded.


This special handling of reloads triggered through a user interface element +avoids sending SameSite cookies on user-initiated reloads if they were +withheld on the original navigation (i.e., if the initial navigation were +cross-site). If the reload navigation were instead considered same-site, and +sent all the initially withheld SameSite cookies, the security benefits of +withholding the cookies in the first place would be nullified. This is +especially important given that the absence of SameSite cookies withheld on a +cross-site navigation request may lead to visible site breakage, prompting the +user to trigger a reload.


For example, suppose the user clicks on a link from https://attacker.example/ +to https://victim.example/. This is a cross-site request, so SameSite=Strict +cookies are withheld. Suppose this causes https://victim.example/ to appear +broken, because the site only displays its sensitive content if a particular +SameSite cookie is present in the request. The user, frustrated by the +unexpectedly broken site, presses refresh on their browser's toolbar. To now +consider the reload request same-site and send the initially withheld SameSite +cookie would defeat the purpose of withholding it in the first place, as the +reload navigation triggered through the user interface may replay the original +(potentially malicious) request. Thus, the reload request should be considered +cross-site, like the request that initially navigated to the page.

Because requests issued for non-user-initiated reloads attach all SameSite cookies, developers should be careful and thoughtful about when to initiate a reload in order to avoid a CSRF attack. For example, the page could initiate a reload only if a CSRF token is present on the initial request.

8.8.6. Top-level requests with "unsafe" methods

The "Lax" enforcement mode described in Section 5.6.7.1 allows a cookie to be +sent with a cross-site HTTP request if and only if it is a top-level navigation +with a "safe" HTTP method. Implementation experience shows that this is +difficult to apply as the default behavior, as some sites may rely on cookies +not explicitly specifying a SameSite attribute being included on top-level +cross-site requests with "unsafe" HTTP methods (as was the case prior to the +introduction of the SameSite attribute).


For example, the concluding step of a login flow may involve a cross-site top-level +POST request to an endpoint; this endpoint expects a recently created cookie +containing transactional state information, necessary to securely complete the +login. For such a cookie, "Lax" enforcement is not appropriate, as it would +cause the cookie to be excluded due to the unsafe HTTP request method, +resulting in an unrecoverable failure of the whole login flow.


The "Lax-allowing-unsafe" enforcement mode described in Section 5.6.7.2 +retains some of the protections of "Lax" enforcement (as compared to "None") +while still allowing recently created cookies to be sent cross-site with unsafe +top-level requests.


As a more permissive variant of "Lax" mode, "Lax-allowing-unsafe" mode +necessarily provides fewer protections against CSRF. Ultimately, the provision +of such an enforcement mode should be seen as a temporary, transitional measure +to ease adoption of "Lax" enforcement by default.

9. IANA Considerations

10. References

10.1. Normative References
+ +
+WHATWG, "HTML - Living Standard", , <https://html.spec.whatwg.org/#dom-document-cookie>.
+
+
[FETCH]
+
+van Kesteren, A., "Fetch", n.d., <https://fetch.spec.whatwg.org/>.
+
+
[HTML]
+
+Hickson, I., Pieters, S., van Kesteren, A., Jägenstedt, P., and D. Denicola, "HTML", n.d., <https://html.spec.whatwg.org/>.
+
+
[HTTP]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, , <https://www.rfc-editor.org/rfc/rfc9110>.
+
+
[RFC1034]
+
+Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, DOI 10.17487/RFC1034, , <https://www.rfc-editor.org/rfc/rfc1034>.
+
+
[RFC1123]
+
+Braden, R., Ed., "Requirements for Internet Hosts - Application and Support", STD 3, RFC 1123, DOI 10.17487/RFC1123, , <https://www.rfc-editor.org/rfc/rfc1123>.
+
+
[RFC2119]
+
+Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/rfc/rfc2119>.
+
+
[RFC4790]
+
+Newman, C., Duerst, M., and A. Gulbrandsen, "Internet Application Protocol Collation Registry", RFC 4790, DOI 10.17487/RFC4790, , <https://www.rfc-editor.org/rfc/rfc4790>.
+
+
[RFC5234]
+
+Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, DOI 10.17487/RFC5234, , <https://www.rfc-editor.org/rfc/rfc5234>.
+
+
[RFC5890]
+
+Klensin, J., "Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework", RFC 5890, DOI 10.17487/RFC5890, , <https://www.rfc-editor.org/rfc/rfc5890>.
+
+
[RFC6454]
+
+Barth, A., "The Web Origin Concept", RFC 6454, DOI 10.17487/RFC6454, , <https://www.rfc-editor.org/rfc/rfc6454>.
+
+
[RFC8126]
+
+Cotton, M., Leiba, B., and T. Narten, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 8126, DOI 10.17487/RFC8126, , <https://www.rfc-editor.org/rfc/rfc8126>.
+
+
[SAMESITE]
+
+WHATWG, "HTML - Living Standard", , <https://html.spec.whatwg.org/#same-site>.
+
+
[USASCII]
+
+American National Standards Institute, "Coded Character Set -- 7-bit American Standard Code for Information Interchange", ANSI X3.4, .

10.2. Informative References

[Aggarwal2010]
Aggarwal, G., Bursztein, E., Jackson, C., and D. Boneh, "An Analysis of Private Browsing Modes in Modern Browsers", 2010, <http://www.usenix.org/events/sec10/tech/full_papers/Aggarwal.pdf>.

[app-isolation]
Chen, E., Bau, J., Reis, C., Barth, A., and C. Jackson, "App Isolation - Get the Security of Multiple Browsers with Just One", 2011, <http://www.collinjackson.com/research/papers/appisolation.pdf>.

[CSRF]
Barth, A., Jackson, C., and J. Mitchell, "Robust Defenses for Cross-Site Request Forgery", DOI 10.1145/1455770.1455782, ISBN 978-1-59593-810-7, ACM CCS '08: Proceedings of the 15th ACM conference on Computer and communications security (pages 75-88), October 2008, <http://portal.acm.org/citation.cfm?id=1455770.1455782>.

[HttpFieldNameRegistry]
"Hypertext Transfer Protocol (HTTP) Field Name Registry", n.d., <https://www.iana.org/assignments/http-fields/>.

[prerendering]
Bentzel, C., "Chrome Prerendering", n.d., <https://www.chromium.org/developers/design-documents/prerender>.

[PSL]
"Public Suffix List", n.d., <https://publicsuffix.org/list/>.

[RFC2109]
Kristol, D. and L. Montulli, "HTTP State Management Mechanism", RFC 2109, DOI 10.17487/RFC2109, February 1997, <https://www.rfc-editor.org/rfc/rfc2109>.

[RFC3986]
Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, DOI 10.17487/RFC3986, January 2005, <https://www.rfc-editor.org/rfc/rfc3986>.

[RFC4648]
Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006, <https://www.rfc-editor.org/rfc/rfc4648>.

[RFC6265]
Barth, A., "HTTP State Management Mechanism", RFC 6265, DOI 10.17487/RFC6265, April 2011, <https://www.rfc-editor.org/rfc/rfc6265>.

[RFC7034]
Ross, D. and T. Gondrom, "HTTP Header Field X-Frame-Options", RFC 7034, DOI 10.17487/RFC7034, October 2013, <https://www.rfc-editor.org/rfc/rfc7034>.

[RFC9113]
Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113, DOI 10.17487/RFC9113, June 2022, <https://www.rfc-editor.org/rfc/rfc9113>.

[RFC9114]
Bishop, M., Ed., "HTTP/3", RFC 9114, DOI 10.17487/RFC9114, June 2022, <https://www.rfc-editor.org/rfc/rfc9114>.

[SERVICE-WORKERS]
Archibald, J. and M. Kruisselbrink, "Service Workers", n.d., <https://www.w3.org/TR/service-workers/>.

Appendix A. Changes from RFC 6265

•  Adds the same-site concept and the SameSite attribute. (Section 5.2 and Section 4.1.2.7)

•  Introduces cookie prefixes and prohibits nameless cookies from setting a value that would mimic a cookie prefix. (Section 4.1.3 and Section 5.7)

•  Prohibits non-secure origins from setting cookies with a Secure flag or overwriting cookies with this flag. (Section 5.7)

•  Limits maximum cookie size. (Section 5.7)

•  Limits maximum values for max-age and expires. (Section 5.6.1 and Section 5.6.2)

•  Includes the host-only-flag as part of a cookie's uniqueness computation. (Section 5.7)

•  Considers potentially trustworthy origins as "secure". (Section 5.7)

•  Improves cookie syntax:

   •  Treats Set-Cookie: token as creating the cookie ("", "token"). (Section 5.6)

   •  Rejects cookies with neither a name nor a value. (Section 5.7)

   •  Specifies how to serialize a nameless/valueless cookie. (Section 5.8.3)

   •  Adjusts ABNF for cookie-pair and the Cookie header production to allow for spaces. (Section 4.1.1)

   •  Explicitly handles control characters. (Section 5.6 and Section 5.7)

   •  Specifies how to handle empty domain attributes. (Section 5.7)

   •  Requires ASCII characters for the domain attribute. (Section 5.7)

•  Refactors the cookie retrieval algorithm to support non-HTTP APIs. (Section 5.8.2)

•  Specifies that the Set-Cookie line should not be decoded. (Section 5.6)

•  Adds an advisory section to assist implementers in deciding which requirements to implement. (Section 3.2)

•  Advises against sending invalid cookies due to public suffix list changes. (Section 5.8.3)

•  Removes the single cookie header requirement. (Section 5.8.1)

•  Addresses errata 3444 by updating the path-value and extension-av grammar, errata 4148 by updating the day-of-month, year, and time grammar, and errata 3663 by adding the requested note. (Section 4.1 and Section 5.1.4)

Acknowledgements

+

RFC 6265 was written by Adam Barth. This document is an update of RFC 6265, adding features and aligning the specification with the reality of today's deployments. Here, we're standing upon the shoulders of a giant since the majority of the text is still Adam's.

+

Thank you to both Lily Chen and Steven Englehardt, editors emeritus, for their significant contributions improving this draft.


Authors' Addresses

Steven Bingler (editor)
Google LLC

Mike West (editor)
Google LLC

John Wilander (editor)
Apple, Inc
+ + + diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-rfc6265bis.txt b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-rfc6265bis.txt new file mode 100644 index 000000000..83df8209b --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-rfc6265bis.txt @@ -0,0 +1,2816 @@ + + + + +HTTP S. Bingler, Ed. +Internet-Draft M. West, Ed. +Obsoletes: 6265 (if approved) Google LLC +Intended status: Standards Track J. Wilander, Ed. +Expires: 11 July 2025 Apple, Inc + 7 January 2025 + + + Cookies: HTTP State Management Mechanism + draft-ietf-httpbis-rfc6265bis-latest + +Abstract + + This document defines the HTTP Cookie and Set-Cookie header fields. + These header fields can be used by HTTP servers to store state + (called cookies) at HTTP user agents, letting the servers maintain a + stateful session over the mostly stateless HTTP protocol. Although + cookies have many historical infelicities that degrade their security + and privacy, the Cookie and Set-Cookie header fields are widely used + on the Internet. This document obsoletes RFC 6265. + +About This Document + + This note is to be removed before publishing as an RFC. + + Status information for this document may be found at + https://datatracker.ietf.org/doc/draft-ietf-httpbis-rfc6265bis/. + + Discussion of this document takes place on the HTTP Working Group + mailing list (mailto:ietf-http-wg@w3.org), which is archived at + https://lists.w3.org/Archives/Public/ietf-http-wg/. Working Group + information can be found at https://httpwg.org/. + + Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/6265bis. + +Status of This Memo + + This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF). Note that other groups may also distribute + working documents as Internet-Drafts. 
The list of current Internet- + Drafts is at https://datatracker.ietf.org/drafts/current/. + + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." + + This Internet-Draft will expire on 11 July 2025. + +Copyright Notice + + Copyright (c) 2025 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents (https://trustee.ietf.org/ + license-info) in effect on the date of publication of this document. + Please review these documents carefully, as they describe your rights + and restrictions with respect to this document. Code Components + extracted from this document must include Revised BSD License text as + described in Section 4.e of the Trust Legal Provisions and are + provided without warranty as described in the Revised BSD License. + + This document may contain material from IETF Documents or IETF + Contributions published or made publicly available before November + 10, 2008. The person(s) controlling the copyright in some of this + material may not have granted the IETF Trust the right to allow + modifications of such material outside the IETF Standards Process. + Without obtaining an adequate license from the person(s) controlling + the copyright in such materials, this document may not be modified + outside the IETF Standards Process, and derivative works of it may + not be created outside the IETF Standards Process, except to format + it for publication as an RFC or to translate it into languages other + than English. + +Table of Contents + + 1. Introduction + 2. Conventions + 2.1. Conformance Criteria + 2.2. Syntax Notation + 2.3. Terminology + 3. Overview + 3.1. Examples + 3.2. 
Which Requirements to Implement + 3.2.1. Cookie Producing Implementations + 3.2.2. Cookie Consuming Implementations + 4. Server Requirements + 4.1. Set-Cookie + 4.1.1. Syntax + 4.1.2. Semantics (Non-Normative) + 4.1.3. Cookie Name Prefixes + 4.2. Cookie + 4.2.1. Syntax + 4.2.2. Semantics + 5. User Agent Requirements + 5.1. Subcomponent Algorithms + 5.1.1. Dates + 5.1.2. Canonicalized Host Names + 5.1.3. Domain Matching + 5.1.4. Paths and Path-Match + 5.2. "Same-site" and "cross-site" Requests + 5.2.1. Document-based requests + 5.2.2. Worker-based requests + 5.3. Ignoring Set-Cookie Header Fields + 5.4. Cookie Name Prefixes + 5.5. Cookie Lifetime Limits + 5.6. The Set-Cookie Header Field + 5.6.1. The Expires Attribute + 5.6.2. The Max-Age Attribute + 5.6.3. The Domain Attribute + 5.6.4. The Path Attribute + 5.6.5. The Secure Attribute + 5.6.6. The HttpOnly Attribute + 5.6.7. The SameSite Attribute + 5.7. Storage Model + 5.8. Retrieval Model + 5.8.1. The Cookie Header Field + 5.8.2. Non-HTTP APIs + 5.8.3. Retrieval Algorithm + 6. Implementation Considerations + 6.1. Limits + 6.2. Application Programming Interfaces + 7. Privacy Considerations + 7.1. Third-Party Cookies + 7.2. Cookie Policy + 7.3. User Controls + 7.4. Expiration Dates + 8. Security Considerations + 8.1. Overview + 8.2. Ambient Authority + 8.3. Clear Text + 8.4. Session Identifiers + 8.5. Weak Confidentiality + 8.6. Weak Integrity + 8.7. Reliance on DNS + 8.8. SameSite Cookies + 8.8.1. Defense in depth + 8.8.2. Top-level Navigations + 8.8.3. Mashups and Widgets + 8.8.4. Server-controlled + 8.8.5. Reload navigations + 8.8.6. Top-level requests with "unsafe" methods + 9. IANA Considerations + 9.1. Cookie + 9.2. Set-Cookie + 9.3. Cookie Attribute Registry + 9.3.1. Procedure + 9.3.2. Registration + 10. References + 10.1. Normative References + 10.2. Informative References + Appendix A. Changes from RFC 6265 + Acknowledgements + Authors' Addresses + +1. 
Introduction + + This document defines the HTTP Cookie and Set-Cookie header fields. + Using the Set-Cookie header field, an HTTP server can pass name/value + pairs and associated metadata (called cookies) to a user agent. When + the user agent makes subsequent requests to the server, the user + agent uses the metadata and other information to determine whether to + return the name/value pairs in the Cookie header field. + + Although simple on their surface, cookies have a number of + complexities. For example, the server indicates a scope for each + cookie when sending it to the user agent. The scope indicates the + maximum amount of time in which the user agent should return the + cookie, the servers to which the user agent should return the cookie, + and the connection types for which the cookie is applicable. + + For historical reasons, cookies contain a number of security and + privacy infelicities. For example, a server can indicate that a + given cookie is intended for "secure" connections, but the Secure + attribute does not provide integrity in the presence of an active + network attacker. Similarly, cookies for a given host are shared + across all the ports on that host, even though the usual "same-origin + policy" used by web browsers isolates content retrieved via different + ports. + + This specification applies to developers of both cookie-producing + servers and cookie-consuming user agents. Section 3.2 helps to + clarify the intended target audience for each implementation type. + + To maximize interoperability with user agents, servers SHOULD limit + themselves to the well-behaved profile defined in Section 4 when + generating cookies. + + User agents MUST implement the more liberal processing rules defined + in Section 5, in order to maximize interoperability with existing + servers that do not conform to the well-behaved profile defined in + Section 4. 
+ + This document specifies the syntax and semantics of these header + fields as they are actually used on the Internet. In particular, + this document does not create new syntax or semantics beyond those in + use today. The recommendations for cookie generation provided in + Section 4 represent a preferred subset of current server behavior, + and even the more liberal cookie processing algorithm provided in + Section 5 does not recommend all of the syntactic and semantic + variations in use today. Where some existing software differs from + the recommended protocol in significant ways, the document contains a + note explaining the difference. + + This document obsoletes [RFC6265]. + +2. Conventions + +2.1. Conformance Criteria + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in [RFC2119]. + + Requirements phrased in the imperative as part of algorithms (such as + "strip any leading space characters" or "return false and abort these + steps") are to be interpreted with the meaning of the key word + ("MUST", "SHOULD", "MAY", etc.) used in introducing the algorithm. + + Conformance requirements phrased as algorithms or specific steps can + be implemented in any manner, so long as the end result is + equivalent. In particular, the algorithms defined in this + specification are intended to be easy to understand and are not + intended to be performant. + +2.2. Syntax Notation + + This specification uses the Augmented Backus-Naur Form (ABNF) + notation of [RFC5234]. 
+ + The following core rules are included by reference, as defined in + [RFC5234], Appendix B.1: ALPHA (letters), CR (carriage return), CRLF + (CR LF), CTLs (controls), DIGIT (decimal 0-9), DQUOTE (double quote), + HEXDIG (hexadecimal 0-9/A-F/a-f), LF (line feed), NUL (null octet), + OCTET (any 8-bit sequence of data except NUL), SP (space), HTAB + (horizontal tab), CHAR (any [USASCII] character), VCHAR (any visible + [USASCII] character), and WSP (whitespace). + + The OWS (optional whitespace) and BWS (bad whitespace) rules are + defined in Section 5.6.3 of [HTTP]. + +2.3. Terminology + + The terms "user agent", "client", "server", "proxy", and "origin + server" have the same meaning as in the HTTP/1.1 specification + ([HTTP], Section 3). + + The request-host is the name of the host, as known by the user agent, + to which the user agent is sending an HTTP request or from which it + is receiving an HTTP response (i.e., the name of the host to which it + sent the corresponding HTTP request). + + The term request-uri refers to "target URI" as defined in Section 7.1 + of [HTTP]. + + Two sequences of octets are said to case-insensitively match each + other if and only if they are equivalent under the i;ascii-casemap + collation defined in [RFC4790]. + + The term string means a sequence of non-NUL octets. + + The terms "active browsing context", "active document", "ancestor + navigables", "container document", "content navigable", "dedicated + worker", "Document", "inclusive ancestor navigables", "navigable", + "opaque origin", "sandboxed origin browsing context flag", "shared + worker", "the worker's Documents", "top-level traversable", and + "WorkerGlobalScope" are defined in [HTML]. + + "Service Workers" are defined in the Service Workers specification + [SERVICE-WORKERS]. + + The term "origin", the mechanism of deriving an origin from a URI, + and the "the same" matching algorithm for origins are defined in + [RFC6454]. 
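   As an illustration of the case-insensitive matching rule above (a
   sketch, not normative text; the function name is an assumption of
   this sketch): under the i;ascii-casemap collation, only the octets
   for the US-ASCII letters are folded, so all other octets must match
   exactly.

```python
# Illustrative sketch of "case-insensitively match" under the
# i;ascii-casemap collation of [RFC4790]: only octets 0x41-0x5A (A-Z)
# are folded to 0x61-0x7A (a-z); every other octet must match exactly.
def ascii_casemap_equal(a: bytes, b: bytes) -> bool:
    def fold(octet: int) -> int:
        return octet + 0x20 if 0x41 <= octet <= 0x5A else octet
    return len(a) == len(b) and all(fold(x) == fold(y) for x, y in zip(a, b))
```

   Note in particular that octets outside the US-ASCII letter range,
   such as accented characters, never match case-insensitively.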
+ + "Safe" HTTP methods include GET, HEAD, OPTIONS, and TRACE, as defined + in Section 9.2.1 of [HTTP]. + + A domain's "public suffix" is the portion of a domain that is + controlled by a public registry, such as "com", "co.uk", and + "pvt.k12.wy.us". A domain's "registrable domain" is the domain's + public suffix plus the label to its left. That is, for + https://www.site.example, the public suffix is example, and the + registrable domain is site.example. Whenever possible, user agents + SHOULD use an up-to-date public suffix list, such as the one + maintained by the Mozilla project at [PSL]. + + The term "request", as well as a request's "client", "current url", + "method", "target browsing context", and "url list", are defined in + [FETCH]. + + The term "non-HTTP APIs" refers to non-HTTP mechanisms used to set + and retrieve cookies, such as a web browser API that exposes cookies + to scripts. + + The term "top-level navigation" refers to a navigation of a top-level + traversable. + +3. Overview + + This section outlines a way for an origin server to send state + information to a user agent and for the user agent to return the + state information to the origin server. + + To store state, the origin server includes a Set-Cookie header field + in an HTTP response. In subsequent requests, the user agent returns + a Cookie request header field to the origin server. The Cookie + header field contains cookies the user agent received in previous + Set-Cookie header fields. The origin server is free to ignore the + Cookie header field or use its contents for an application-defined + purpose. + + Origin servers MAY send a Set-Cookie response header field with any + response. An origin server can include multiple Set-Cookie header + fields in a single response. The presence of a Cookie or a Set- + Cookie header field does not preclude HTTP caches from storing and + reusing a response. 
+ + Origin servers SHOULD NOT fold multiple Set-Cookie header fields into + a single header field. The usual mechanism for folding HTTP headers + fields (i.e., as defined in Section 5.3 of [HTTP]) might change the + semantics of the Set-Cookie header field because the %x2C (",") + character is used by Set-Cookie in a way that conflicts with such + folding. + + User agents MAY ignore Set-Cookie header fields based on response + status codes or the user agent's cookie policy (see Section 5.3). + +3.1. Examples + + Using the Set-Cookie header field, a server can send the user agent a + short string in an HTTP response that the user agent will return in + future HTTP requests that are within the scope of the cookie. For + example, the server can send the user agent a "session identifier" + named SID with the value 31d4d96e407aad42. The user agent then + returns the session identifier in subsequent requests. + + == Server -> User Agent == + + Set-Cookie: SID=31d4d96e407aad42 + + == User Agent -> Server == + + Cookie: SID=31d4d96e407aad42 + + The server can alter the default scope of the cookie using the Path + and Domain attributes. For example, the server can instruct the user + agent to return the cookie to every path and every subdomain of + site.example. + + == Server -> User Agent == + + Set-Cookie: SID=31d4d96e407aad42; Path=/; Domain=site.example + + == User Agent -> Server == + + Cookie: SID=31d4d96e407aad42 + + As shown in the next example, the server can store multiple cookies + at the user agent. For example, the server can store a session + identifier as well as the user's preferred language by returning two + Set-Cookie header fields. Notice that the server uses the Secure and + HttpOnly attributes to provide additional security protections for + the more sensitive session identifier (see Section 4.1.2). 
+ + == Server -> User Agent == + + Set-Cookie: SID=31d4d96e407aad42; Path=/; Secure; HttpOnly + Set-Cookie: lang=en-US; Path=/; Domain=site.example + + == User Agent -> Server == + + Cookie: SID=31d4d96e407aad42; lang=en-US + + Notice that the Cookie header field above contains two cookies, one + named SID and one named lang. + + Cookie names are case-sensitive, meaning that if a server sends the + user agent two Set-Cookie header fields that differ only in their + name's case the user agent will store and return both of those + cookies in subsequent requests. + + == Server -> User Agent == + + Set-Cookie: SID=31d4d96e407aad42 + Set-Cookie: sid=31d4d96e407aad42 + + == User Agent -> Server == + + Cookie: SID=31d4d96e407aad42; sid=31d4d96e407aad42 + + If the server wishes the user agent to persist the cookie over + multiple "sessions" (e.g., user agent restarts), the server can + specify an expiration date in the Expires attribute. Note that the + user agent might delete the cookie before the expiration date if the + user agent's cookie store exceeds its quota or if the user manually + deletes the server's cookie. + + == Server -> User Agent == + + Set-Cookie: lang=en-US; Expires=Wed, 09 Jun 2021 10:18:14 GMT + + == User Agent -> Server == + + Cookie: SID=31d4d96e407aad42; lang=en-US + + Finally, to remove a cookie, the server returns a Set-Cookie header + field with an expiration date in the past. The server will be + successful in removing the cookie only if the Path and the Domain + attribute in the Set-Cookie header field match the values used when + the cookie was created. + + == Server -> User Agent == + + Set-Cookie: lang=; Expires=Sun, 06 Nov 1994 08:49:37 GMT + + == User Agent -> Server == + + Cookie: SID=31d4d96e407aad42 + +3.2. Which Requirements to Implement + + The upcoming two sections, Section 4 and Section 5, discuss the set + of requirements for two distinct types of implementations. 
This + section is meant to help guide implementers in determining which set + of requirements best fits their goals. Choosing the wrong set of + requirements could result in a lack of compatibility with other + cookie implementations. + + It's important to note that being compatible means different things + depending on the implementer's goals. These differences have built + up over time due to both intentional and unintentional spec changes, + spec interpretations, and historical implementation differences. + + This section roughly divides implementers of the cookie spec into two + types, producers and consumers. These are not official terms and are + only used here to help readers develop an intuitive understanding of + the use cases. + +3.2.1. Cookie Producing Implementations + + An implementer should choose Section 4 whenever cookies are created + and will be sent to a user agent, such as a web browser. These + implementations are frequently referred to as servers by the spec but + that term includes anything which primarily produces cookies. Some + potential examples: + + * Server applications hosting a website or API + + * Programming languages or software frameworks that support cookies + + * Integrated third-party web applications, such as a business + management suite + + All these benefit from not only supporting as many user agents as + possible but also supporting other servers. This is useful if a + cookie is produced by a software framework and is later sent back to + a server application which needs to read it. Section 4 advises best + practices that help maximize this sense of compatibility. + + See Section 3.2.2.1 for more details on programming languages and + software frameworks. + +3.2.2. Cookie Consuming Implementations + + An implementer should choose Section 5 whenever cookies are primarily + received from another source. These implementations are referred to + as user agents. 
Some examples: + + * Web browsers + + * Tools that support stateful HTTP + + * Programming languages or software frameworks that support cookies + + Because user agents don't know which servers a user will access, and + whether or not that server is following best practices, users agents + are advised to implement a more lenient set of requirements and to + accept some things that servers are warned against producing. + Section 5 advises best practices that help maximize this sense of + compatibility. + + See Section 3.2.2.1 for more details on programming languages and + software frameworks. + +3.2.2.1. Programming Languages & Software Frameworks + + A programming language or software framework with support for cookies + could reasonably be used to create an application that acts as a + cookie producer, cookie consumer, or both. Because a developer may + want to maximize their compatibility as either a producer or + consumer, these languages or frameworks should strongly consider + supporting both sets of requirements, Section 4 and Section 5, behind + a compatibility mode toggle. This toggle should default to + Section 4's requirements. + + Doing so will reduce the chances that a developer's application can + inadvertently create cookies that cannot be read by other servers. + +4. Server Requirements + + This section describes the syntax and semantics of a well-behaved + profile of the Cookie and Set-Cookie header fields. + +4.1. Set-Cookie + + The Set-Cookie HTTP response header field is used to send cookies + from the server to the user agent. + +4.1.1. Syntax + + Informally, the Set-Cookie response header field contains a cookie, + which begins with a name-value-pair, followed by zero or more + attribute-value pairs. 
Servers SHOULD NOT send Set-Cookie header + fields that fail to conform to the following grammar: + + set-cookie = set-cookie-string + set-cookie-string = BWS cookie-pair *( BWS ";" OWS cookie-av ) + cookie-pair = cookie-name BWS "=" BWS cookie-value + cookie-name = token + cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE ) + cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E + ; US-ASCII characters excluding CTLs, + ; whitespace, DQUOTE, comma, semicolon, + ; and backslash + token = + + cookie-av = expires-av / max-age-av / domain-av / + path-av / secure-av / httponly-av / + samesite-av / extension-av + expires-av = "Expires" BWS "=" BWS sane-cookie-date + sane-cookie-date = + + max-age-av = "Max-Age" BWS "=" BWS non-zero-digit *DIGIT + non-zero-digit = %x31-39 + ; digits 1 through 9 + domain-av = "Domain" BWS "=" BWS domain-value + domain-value = + ; see details below + path-av = "Path" BWS "=" BWS path-value + path-value = *av-octet + secure-av = "Secure" + httponly-av = "HttpOnly" + samesite-av = "SameSite" BWS "=" BWS samesite-value + samesite-value = "Strict" / "Lax" / "None" + extension-av = *av-octet + av-octet = %x20-3A / %x3C-7E + ; any CHAR except CTLs or ";" + + Note that some of the grammatical terms above reference documents + that use different grammatical notations than this document (which + uses ABNF from [RFC5234]). + + Per the grammar above, servers SHOULD NOT produce nameless cookies + (i.e.: an empty cookie-name) as such cookies may be unpredictably + serialized by UAs when sent back to the server. + + The semantics of the cookie-value are not defined by this document. + + To maximize compatibility with user agents, servers that wish to + store arbitrary data in a cookie-value SHOULD encode that data, for + example, using Base64 [RFC4648]. + + Per the grammar above, the cookie-value MAY be wrapped in DQUOTE + characters. Note that in this case, the initial and trailing DQUOTE + characters are not stripped. 
They are part of the cookie-value, and + will be included in Cookie header fields sent to the server. + + The domain-value is a subdomain as defined by [RFC1034], Section 3.5, + and as enhanced by [RFC1123], Section 2.1. Thus, domain-value is a + string of [USASCII] characters, such as an "A-label" as defined in + Section 2.3.2.1 of [RFC5890]. + + The portions of the set-cookie-string produced by the cookie-av term + are known as attributes. To maximize compatibility with user agents, + servers SHOULD NOT produce two attributes with the same name in the + same set-cookie-string. (See Section 5.7 for how user agents handle + this case.) + + NOTE: The name of an attribute-value pair is not case-sensitive. So + while they are presented here in CamelCase, such as "HttpOnly" or + "SameSite", any case is accepted. E.x.: "httponly", "Httponly", + "hTTPoNLY", etc. + + Servers SHOULD NOT include more than one Set-Cookie header field in + the same response with the same cookie-name. (See Section 5.6 for + how user agents handle this case.) + + If a server sends multiple responses containing Set-Cookie header + fields concurrently to the user agent (e.g., when communicating with + the user agent over multiple sockets), these responses create a "race + condition" that can lead to unpredictable behavior. + + NOTE: Some existing user agents differ in their interpretation of + two-digit years. To avoid compatibility issues, servers SHOULD use + the rfc1123-date format, which requires a four-digit year. + + NOTE: Some user agents store and process dates in cookies as 32-bit + UNIX time_t values. Implementation bugs in the libraries supporting + time_t processing on some systems might cause such user agents to + process dates after the year 2038 incorrectly. + +4.1.2. Semantics (Non-Normative) + + This section describes simplified semantics of the Set-Cookie header + field. 
These semantics are detailed enough to be useful for + understanding the most common uses of cookies by servers. The full + semantics are described in Section 5. + + When the user agent receives a Set-Cookie header field, the user + agent stores the cookie together with its attributes. Subsequently, + when the user agent makes an HTTP request, the user agent includes + the applicable, non-expired cookies in the Cookie header field. + + If the user agent receives a new cookie with the same cookie-name, + domain-value, and path-value as a cookie that it has already stored, + the existing cookie is evicted and replaced with the new cookie. + Notice that servers can delete cookies by sending the user agent a + new cookie with an Expires attribute with a value in the past. + + Unless the cookie's attributes indicate otherwise, the cookie is + returned only to the origin server (and not, for example, to any + subdomains), and it expires at the end of the current session (as + defined by the user agent). User agents ignore unrecognized cookie + attributes (but not the entire cookie). + +4.1.2.1. The Expires Attribute + + The Expires attribute indicates the maximum lifetime of the cookie, + represented as the date and time at which the cookie expires. The + user agent may adjust the specified date and is not required to + retain the cookie until that date has passed. In fact, user agents + often evict cookies due to memory pressure or privacy concerns. + +4.1.2.2. The Max-Age Attribute + + The Max-Age attribute indicates the maximum lifetime of the cookie, + represented as the number of seconds until the cookie expires. The + user agent may adjust the specified duration and is not required to + retain the cookie for that duration. In fact, user agents often + evict cookies due to memory pressure or privacy concerns. + + NOTE: Some existing user agents do not support the Max-Age attribute. + User agents that do not support the Max-Age attribute ignore the + attribute. 
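   The lifetime semantics of the Expires and Max-Age attributes can be
   sketched as follows. This is illustrative only (the function name
   and parameters are assumptions of this sketch): Max-Age has
   precedence when both attributes are present, and a cookie carrying
   neither attribute lasts until the current session is over.

```python
# Illustrative sketch of cookie lifetime: Max-Age has precedence over
# Expires; with neither attribute the cookie is a session cookie.
def expiry_time(now, max_age=None, expires=None):
    """max_age: seconds (already parsed); expires: absolute timestamp."""
    if max_age is not None:
        return now + max_age   # Max-Age wins even if Expires is present
    if expires is not None:
        return expires
    return None                # session cookie: lasts until the
                               # "current session" is over
```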
+ + If a cookie has both the Max-Age and the Expires attribute, the Max- + Age attribute has precedence and controls the expiration date of the + cookie. If a cookie has neither the Max-Age nor the Expires + attribute, the user agent will retain the cookie until "the current + session is over" (as defined by the user agent). + +4.1.2.3. The Domain Attribute + + The Domain attribute specifies those hosts to which the cookie will + be sent. For example, if the value of the Domain attribute is + "site.example", the user agent will include the cookie in the Cookie + header field when making HTTP requests to site.example, + www.site.example, and www.corp.site.example. (Note that a leading + %x2E ("."), if present, is ignored even though that character is not + permitted.) If the server omits the Domain attribute, the user agent + will return the cookie only to the origin server. + + WARNING: Some existing user agents treat an absent Domain attribute + as if the Domain attribute were present and contained the current + host name. For example, if site.example returns a Set-Cookie header + field without a Domain attribute, these user agents will erroneously + send the cookie to www.site.example as well. + + The user agent will reject cookies unless the Domain attribute + specifies a scope for the cookie that would include the origin + server. For example, the user agent will accept a cookie with a + Domain attribute of "site.example" or of "foo.site.example" from + foo.site.example, but the user agent will not accept a cookie with a + Domain attribute of "bar.site.example" or of "baz.foo.site.example". + + NOTE: For security reasons, many user agents are configured to reject + Domain attributes that correspond to "public suffixes". For example, + some user agents will reject Domain attributes of "com" or "co.uk". + (See Section 5.7 for more information.) + +4.1.2.4. 
The Path Attribute + + The scope of each cookie is limited to a set of paths, controlled by + the Path attribute. If the server omits the Path attribute, the user + agent will use the "directory" of the request-uri's path component as + the default value. (See Section 5.1.4 for more details.) + + The user agent will include the cookie in an HTTP request only if the + path portion of the request-uri matches (or is a subdirectory of) the + cookie's Path attribute, where the %x2F ("/") character is + interpreted as a directory separator. + + Although seemingly useful for isolating cookies between different + paths within a given host, the Path attribute cannot be relied upon + for security (see Section 8). + +4.1.2.5. The Secure Attribute + + The Secure attribute limits the scope of the cookie to "secure" + channels (where "secure" is defined by the user agent). When a + cookie has the Secure attribute, the user agent will include the + cookie in an HTTP request only if the request is transmitted over a + secure channel (typically HTTP over Transport Layer Security (TLS) + [HTTP]). + +4.1.2.6. The HttpOnly Attribute + + The HttpOnly attribute limits the scope of the cookie to HTTP + requests. In particular, the attribute instructs the user agent to + omit the cookie when providing access to cookies via non-HTTP APIs. + + Note that the HttpOnly attribute is independent of the Secure + attribute: a cookie can have both the HttpOnly and the Secure + attribute. + +4.1.2.7. The SameSite Attribute + + The "SameSite" attribute limits the scope of the cookie such that it + will only be attached to requests if those requests are same-site, as + defined by the algorithm in Section 5.2. For example, requests for + https://site.example/sekrit-image will attach same-site cookies if + and only if initiated from a context whose "site for cookies" is an + origin with a scheme and registered domain of "https" and + "site.example" respectively. 
+
+   If the "SameSite" attribute's value is "Strict", the cookie will only
+   be sent along with "same-site" requests. If the value is "Lax", the
+   cookie will be sent with same-site requests, and with "cross-site"
+   top-level navigations, as described in Section 5.6.7.1. If the value
+   is "None", the cookie will be sent with same-site and cross-site
+   requests. If the "SameSite" attribute's value is something other
+   than these three known keywords, the attribute's value will be
+   subject to a default enforcement mode that is equivalent to "Lax".
+   If a user agent uses "Lax-allowing-unsafe" enforcement (see
+   Section 5.6.7.2), then this default enforcement mode will instead be
+   equivalent to "Lax-allowing-unsafe".
+
+   The "SameSite" attribute affects cookie creation as well as delivery.
+   Cookies which assert "SameSite=Lax" or "SameSite=Strict" cannot be
+   set in responses to cross-site subresource requests, or cross-site
+   nested navigations. They can be set along with any top-level
+   navigation, cross-site or otherwise.
+
+4.1.3. Cookie Name Prefixes
+
+   Section 8.5 and Section 8.6 of this document spell out some of the
+   drawbacks of cookies' historical implementation. In particular, it
+   is impossible for a server to have confidence that a given cookie was
+   set with a particular set of attributes. In order to provide such
+   confidence in a backwards-compatible way, two common sets of
+   requirements can be inferred from the first few characters of the
+   cookie's name.
+
+   The user agent requirements for the prefixes described below are
+   detailed in Section 5.4.
+
+   To maximize compatibility with user agents, servers SHOULD use
+   prefixes as described below.
+
+4.1.3.1. The "__Secure-" Prefix
+
+   If a cookie's name begins with a case-sensitive match for the string
+   __Secure-, then the cookie will have been set with a Secure
+   attribute.
+ + For example, the following Set-Cookie header field would be rejected + by a conformant user agent, as it does not have a Secure attribute. + + Set-Cookie: __Secure-SID=12345; Domain=site.example + + Whereas the following Set-Cookie header field would be accepted if + set from a secure origin (e.g. "https://site.example/"), and rejected + otherwise: + + Set-Cookie: __Secure-SID=12345; Domain=site.example; Secure + +4.1.3.2. The "__Host-" Prefix + + If a cookie's name begins with a case-sensitive match for the string + __Host-, then the cookie will have been set with a Secure attribute, + a Path attribute with a value of /, and no Domain attribute. + + This combination yields a cookie that hews as closely as a cookie can + to treating the origin as a security boundary. The lack of a Domain + attribute ensures that the cookie's host-only-flag is true, locking + the cookie to a particular host, rather than allowing it to span + subdomains. Setting the Path to / means that the cookie is effective + for the entire host, and won't be overridden for specific paths. The + Secure attribute ensures that the cookie is unaltered by non-secure + origins, and won't span protocols. + + Ports are the only piece of the origin model that __Host- cookies + continue to ignore. + + For example, the following cookies would always be rejected: + + Set-Cookie: __Host-SID=12345 + Set-Cookie: __Host-SID=12345; Secure + Set-Cookie: __Host-SID=12345; Domain=site.example + Set-Cookie: __Host-SID=12345; Domain=site.example; Path=/ + Set-Cookie: __Host-SID=12345; Secure; Domain=site.example; Path=/ + + While the following would be accepted if set from a secure origin + (e.g. "https://site.example/"), and rejected otherwise: + + Set-Cookie: __Host-SID=12345; Secure; Path=/ + +4.2. Cookie + +4.2.1. Syntax + + The user agent sends stored cookies to the origin server in the + Cookie header field. 
If the server conforms to the requirements in
+   Section 4.1 (and the user agent conforms to the requirements in
+   Section 5), the user agent will send a Cookie header field that
+   conforms to the following grammar:
+
+   cookie        = cookie-string
+   cookie-string = cookie-pair *( ";" SP cookie-pair )
+
+   While Section 5.4 of [HTTP] does not define a length limit for header
+   fields, it is likely that the web server's implementation does impose
+   a limit; many popular implementations have default limits of 8K.
+   Servers SHOULD avoid setting a large number of large cookies such
+   that the final cookie-string would exceed their header field limit.
+   Exceeding that limit could result in requests to the server failing.
+
+   Servers MUST be tolerant of multiple cookie headers. For example, an
+   HTTP/2 [RFC9113] or HTTP/3 [RFC9114] client or intermediary might
+   split a cookie header to improve compression. Servers are free to
+   determine what form this tolerance takes. For example, the server
+   could process each cookie header individually, or the server could
+   concatenate all the cookie headers into one and then process that
+   final, single header. The server should be mindful of any header
+   field limits when deciding which approach to take.
+
+   Note: Since intermediaries can modify cookie headers, they should
+   also be mindful of common server header field limits in order to
+   avoid sending servers headers that they cannot process. For example,
+   concatenating multiple cookie headers into a single header might
+   exceed a server's size limit.
+
+4.2.2. Semantics
+
+   Each cookie-pair represents a cookie stored by the user agent. The
+   cookie-pair contains the cookie-name and cookie-value the user agent
+   received in the Set-Cookie header field.
+
+   Notice that the cookie attributes are not returned.
In particular, + the server cannot determine from the Cookie field alone when a cookie + will expire, for which hosts the cookie is valid, for which paths the + cookie is valid, or whether the cookie was set with the Secure or + HttpOnly attributes. + + The semantics of individual cookies in the Cookie header field are + not defined by this document. Servers are expected to imbue these + cookies with application-specific semantics. + + Although cookies are serialized linearly in the Cookie header field, + servers SHOULD NOT rely upon the serialization order. In particular, + if the Cookie header field contains two cookies with the same name + (e.g., that were set with different Path or Domain attributes), + servers SHOULD NOT rely upon the order in which these cookies appear + in the header field. + +5. User Agent Requirements + + This section specifies the Cookie and Set-Cookie header fields in + sufficient detail that a user agent implementing these requirements + precisely can interoperate with existing servers (even those that do + not conform to the well-behaved profile described in Section 4). + + A user agent could enforce more restrictions than those specified + herein (e.g., restrictions specified by its cookie policy, described + in Section 7.2). However, such additional restrictions may reduce + the likelihood that a user agent will be able to interoperate with + existing servers. + +5.1. Subcomponent Algorithms + + This section defines some algorithms used by user agents to process + specific subcomponents of the Cookie and Set-Cookie header fields. + +5.1.1. Dates + + The user agent MUST use an algorithm equivalent to the following + algorithm to parse a cookie-date. Note that the various boolean + flags defined as a part of the algorithm (i.e., found-time, found- + day-of-month, found-month, found-year) are initially "not set". + + 1. Using the grammar below, divide the cookie-date into date-tokens. 
+ + cookie-date = *delimiter date-token-list *delimiter + date-token-list = date-token *( 1*delimiter date-token ) + date-token = 1*non-delimiter + + delimiter = %x09 / %x20-2F / %x3B-40 / %x5B-60 / %x7B-7E + non-delimiter = %x00-08 / %x0A-1F / DIGIT / ":" / ALPHA + / %x7F-FF + non-digit = %x00-2F / %x3A-FF + + day-of-month = 1*2DIGIT [ non-digit *OCTET ] + month = ( "jan" / "feb" / "mar" / "apr" / + "may" / "jun" / "jul" / "aug" / + "sep" / "oct" / "nov" / "dec" ) *OCTET + year = 2*4DIGIT [ non-digit *OCTET ] + time = hms-time [ non-digit *OCTET ] + hms-time = time-field ":" time-field ":" time-field + time-field = 1*2DIGIT + + 2. Process each date-token sequentially in the order the date-tokens + appear in the cookie-date: + + 1. If the found-time flag is not set and the token matches the + time production, set the found-time flag and set the hour- + value, minute-value, and second-value to the numbers denoted + by the digits in the date-token, respectively. Skip the + remaining sub-steps and continue to the next date-token. + + 2. If the found-day-of-month flag is not set and the date-token + matches the day-of-month production, set the found-day-of- + month flag and set the day-of-month-value to the number + denoted by the date-token. Skip the remaining sub-steps and + continue to the next date-token. + + 3. If the found-month flag is not set and the date-token matches + the month production, set the found-month flag and set the + month-value to the month denoted by the date-token. Skip the + remaining sub-steps and continue to the next date-token. + + 4. If the found-year flag is not set and the date-token matches + the year production, set the found-year flag and set the + year-value to the number denoted by the date-token. Skip the + remaining sub-steps and continue to the next date-token. + + 3. If the year-value is greater than or equal to 70 and less than or + equal to 99, increment the year-value by 1900. + + 4. 
If the year-value is greater than or equal to 0 and less than or + equal to 69, increment the year-value by 2000. + + 1. NOTE: Some existing user agents interpret two-digit years + differently. + + 5. Abort these steps and fail to parse the cookie-date if: + + * at least one of the found-day-of-month, found-month, found- + year, or found-time flags is not set, + + * the day-of-month-value is less than 1 or greater than 31, + + * the year-value is less than 1601, + + * the hour-value is greater than 23, + + * the minute-value is greater than 59, or + + * the second-value is greater than 59. + + (Note that leap seconds cannot be represented in this syntax.) + + 6. Let the parsed-cookie-date be the date whose day-of-month, month, + year, hour, minute, and second (in UTC) are the day-of-month- + value, the month-value, the year-value, the hour-value, the + minute-value, and the second-value, respectively. If no such + date exists, abort these steps and fail to parse the cookie-date. + + 7. Return the parsed-cookie-date as the result of this algorithm. + +5.1.2. Canonicalized Host Names + + A canonicalized host name is the string generated by the following + algorithm: + + 1. Convert the host name to a sequence of individual domain name + labels. + + 2. Convert each label that is not a Non-Reserved LDH (NR-LDH) label, + to an A-label (see Section 2.3.2.1 of [RFC5890] for the former + and latter). + + 3. Concatenate the resulting labels, separated by a %x2E (".") + character. + +5.1.3. Domain Matching + + A string domain-matches a given domain string if at least one of the + following conditions hold: + + * The domain string and the string are identical. (Note that both + the domain string and the string will have been canonicalized to + lower case at this point.) + + * All of the following conditions hold: + + - The domain string is a suffix of the string. + + - The last character of the string that is not included in the + domain string is a %x2E (".") character. 
+ + - The string is a host name (i.e., not an IP address). + +5.1.4. Paths and Path-Match + + The user agent MUST use an algorithm equivalent to the following + algorithm to compute the default-path of a cookie: + + 1. Let uri-path be the path portion of the request-uri if such a + portion exists (and empty otherwise). + + 2. If the uri-path is empty or if the first character of the uri- + path is not a %x2F ("/") character, output %x2F ("/") and skip + the remaining steps. + + 3. If the uri-path contains no more than one %x2F ("/") character, + output %x2F ("/") and skip the remaining step. + + 4. Output the characters of the uri-path from the first character up + to, but not including, the right-most %x2F ("/"). + + A request-path path-matches a given cookie-path if at least one of + the following conditions holds: + + * The cookie-path and the request-path are identical. + + Note that this differs from the rules in [RFC3986] for equivalence + of the path component, and hence two equivalent paths can have + different cookies. + + * The cookie-path is a prefix of the request-path, and the last + character of the cookie-path is %x2F ("/"). + + * The cookie-path is a prefix of the request-path, and the first + character of the request-path that is not included in the cookie- + path is a %x2F ("/") character. + +5.2. "Same-site" and "cross-site" Requests + + Two origins are same-site if they satisfy the "same site" criteria + defined in [SAMESITE]. A request is "same-site" if the following + criteria are true: + + 1. The request is not the result of a reload navigation triggered + through a user interface element (as defined by the user agent; + e.g., a request triggered by the user clicking a refresh button + on a toolbar). + + 2. The request's current url's origin is same-site with the + request's client's "site for cookies" (which is an origin), or if + the request has no client or the request's client is null. 
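The domain-matching conditions of Section 5.1.3 can be sketched as follows. This is a non-normative illustration that assumes both inputs are already canonicalized, lowercase host names; the `_is_ip_address` helper is a simplified, illustrative stand-in.

```python
# Non-normative sketch of the domain-matching conditions in
# Section 5.1.3. Inputs are assumed to be canonicalized, lowercase
# host names; _is_ip_address is a simplified, illustrative helper.
def domain_matches(string, domain_string):
    if string == domain_string:
        return True
    return (string.endswith(domain_string)
            # The character just before the suffix must be ".".
            and string[-len(domain_string) - 1] == "."
            # The string must be a host name, not an IP address.
            and not _is_ip_address(string))

def _is_ip_address(host):
    # Illustrative only: a real implementation must also handle IPv6.
    return host.replace(".", "").isdigit()

assert domain_matches("www.site.example", "site.example")
assert not domain_matches("mysite.example", "site.example")
```

The second assertion shows why the "." check matters: "site.example" is a suffix of "mysite.example", but the two are unrelated hosts.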
+ + Requests which are the result of a reload navigation triggered + through a user interface element are same-site if the reloaded + document was originally navigated to via a same-site request. A + request that is not "same-site" is instead "cross-site". + + The request's client's "site for cookies" is calculated depending + upon its client's type, as described in the following subsections: + +5.2.1. Document-based requests + + The URI displayed in a user agent's address bar is the only security + context directly exposed to users, and therefore the only signal + users can reasonably rely upon to determine whether or not they trust + a particular website. The origin of that URI represents the context + in which a user most likely believes themselves to be interacting. + We'll define this origin, the top-level traversable's active + document's origin, as the "top-level origin". + + For a document displayed in a top-level traversable, we can stop + here: the document's "site for cookies" is the top-level origin. + + For container documents, we need to audit the origins of each of a + document's ancestor navigables' active documents in order to account + for the "multiple-nested scenarios" described in Section 4 of + [RFC7034]. A document's "site for cookies" is the top-level origin + if and only if the top-level origin is same-site with the document's + origin, and with each of the document's ancestor documents' origins. + Otherwise its "site for cookies" is an origin set to an opaque + origin. + + Given a Document (document), the following algorithm returns its + "site for cookies": + + 1. Let top-document be the active document in document's navigable's + top-level traversable. + + 2. Let top-origin be the origin of top-document's URI if top- + document's sandboxed origin browsing context flag is set, and + top-document's origin otherwise. + + 3. Let documents be a list consisting of the active documents of + document's inclusive ancestor navigables. + + 4. 
For each item in documents: + + 1. Let origin be the origin of item's URI if item's sandboxed + origin browsing context flag is set, and item's origin + otherwise. + + 2. If origin is not same-site with top-origin, return an origin + set to an opaque origin. + + 5. Return top-origin. + + Note: This algorithm only applies when the entire chain of documents + from top-document to document are all active. + +5.2.2. Worker-based requests + + Worker-driven requests aren't as clear-cut as document-driven + requests, as there isn't a clear link between a top-level traversable + and a worker. This is especially true for Service Workers + [SERVICE-WORKERS], which may execute code in the background, without + any document visible at all. + + Note: The descriptions below assume that workers must be same-origin + with the documents that instantiate them. If this invariant changes, + we'll need to take the worker's script's URI into account when + determining their status. + +5.2.2.1. Dedicated and Shared Workers + + Dedicated workers are simple, as each dedicated worker is bound to + one and only one document. Requests generated from a dedicated + worker (via importScripts, XMLHttpRequest, fetch(), etc) define their + "site for cookies" as that document's "site for cookies". + + Shared workers may be bound to multiple documents at once. As it is + quite possible for those documents to have distinct "site for + cookies" values, the worker's "site for cookies" will be an origin + set to an opaque origin in cases where the values are not all same- + site with the worker's origin, and the worker's origin in cases where + the values agree. + + Given a WorkerGlobalScope (worker), the following algorithm returns + its "site for cookies": + + 1. Let site be worker's origin. + + 2. For each document in worker's Documents: + + 1. Let document-site be document's "site for cookies" (as + defined in Section 5.2.1). + + 2. 
If document-site is not same-site with site, return an origin
+              set to an opaque origin.
+
+   3. Return site.
+
+5.2.2.2. Service Workers
+
+   Service Workers are more complicated, as they act as a completely
+   separate execution context with only tangential relationship to the
+   Document which registered them.
+
+   How user agents handle Service Workers may differ, but user agents
+   SHOULD match the [SERVICE-WORKERS] specification.
+
+5.3. Ignoring Set-Cookie Header Fields
+
+   User agents MAY ignore Set-Cookie header fields contained in
+   responses with 100-level status codes or based on their cookie policy
+   (see Section 7.2).
+
+   All other Set-Cookie header fields SHOULD be processed according to
+   Section 5.6. That is, Set-Cookie header fields contained in
+   responses with non-100-level status codes (including those in
+   responses with 400- and 500-level status codes) SHOULD be processed
+   unless ignored according to the user agent's cookie policy.
+
+5.4. Cookie Name Prefixes
+
+   User agents' requirements for cookie name prefixes differ slightly
+   from servers' (Section 4.1.3) in that UAs MUST match the prefix
+   string case-insensitively.
+
+   The normative requirements for the prefixes are detailed in the
+   storage model algorithm defined in Section 5.7.
+
+   This is because some servers will process cookies case-insensitively,
+   resulting in them unintentionally miscapitalizing and accepting
+   miscapitalized prefixes.
+
+   For example, if a server sends the following Set-Cookie header field
+
+   Set-Cookie: __SECURE-SID=12345
+
+   to a UA which checks prefixes case-sensitively, the UA will accept
+   this cookie, and the server would incorrectly believe the cookie is
+   subject to the same guarantees as one spelled __Secure-.
+
+   Additionally, the server is vulnerable to an attacker that
+   purposefully miscapitalizes a cookie in order to impersonate a
+   prefixed cookie.
For example, a site already has a cookie
+   __Secure-SID=12345 and by some means an attacker sends the following
+   Set-Cookie header field for the site to a UA which checks prefixes
+   case-sensitively.
+
+   Set-Cookie: __SeCuRe-SID=evil
+
+   The next time a user visits the site, the UA will send both cookies:
+
+   Cookie: __Secure-SID=12345; __SeCuRe-SID=evil
+
+   The server, being case-insensitive, won't be able to tell the
+   difference between the two cookies, allowing the attacker to
+   compromise the site.
+
+   To prevent these issues, UAs MUST match cookie name prefixes case-
+   insensitively.
+
+   Note: Cookies with different names are still considered separate by
+   UAs. So both __Secure-foo=bar and __secure-foo=baz can exist as
+   distinct cookies simultaneously and both would have the requirements
+   of the __Secure- prefix applied.
+
+   The following are examples of Set-Cookie header fields that would be
+   rejected by a conformant user agent.
+
+   Set-Cookie: __Secure-SID=12345; Domain=site.example
+   Set-Cookie: __secure-SID=12345; Domain=site.example
+   Set-Cookie: __SECURE-SID=12345; Domain=site.example
+   Set-Cookie: __Host-SID=12345
+   Set-Cookie: __host-SID=12345; Secure
+   Set-Cookie: __host-SID=12345; Domain=site.example
+   Set-Cookie: __HOST-SID=12345; Domain=site.example; Path=/
+   Set-Cookie: __Host-SID=12345; Secure; Domain=site.example; Path=/
+   Set-Cookie: __host-SID=12345; Secure; Domain=site.example; Path=/
+   Set-Cookie: __HOST-SID=12345; Secure; Domain=site.example; Path=/
+
+   Whereas the following Set-Cookie header fields would be accepted if
+   set from a secure origin.
+
+   Set-Cookie: __Secure-SID=12345; Domain=site.example; Secure
+   Set-Cookie: __secure-SID=12345; Domain=site.example; Secure
+   Set-Cookie: __SECURE-SID=12345; Domain=site.example; Secure
+   Set-Cookie: __Host-SID=12345; Secure; Path=/
+   Set-Cookie: __host-SID=12345; Secure; Path=/
+   Set-Cookie: __HOST-SID=12345; Secure; Path=/
+
+5.5.
Cookie Lifetime Limits + + When processing cookies with a specified lifetime, either with the + Expires or with the Max-Age attribute, the user agent MUST limit the + maximum age of the cookie. The limit SHOULD NOT be greater than 400 + days (34560000 seconds) in the future. The RECOMMENDED limit is 400 + days in the future, but the user agent MAY adjust the limit (see + Section 7.2). Expires or Max-Age attributes that specify a lifetime + longer than the limit MUST be reduced to the limit. + +5.6. The Set-Cookie Header Field + + When a user agent receives a Set-Cookie header field in an HTTP + response, the user agent MAY ignore the Set-Cookie header field in + its entirety (see Section 5.3). + + If the user agent does not ignore the Set-Cookie header field in its + entirety, the user agent MUST parse the field-value of the Set-Cookie + header field as a set-cookie-string (defined below). + + NOTE: The algorithm below is more permissive than the grammar in + Section 4.1. For example, the algorithm allows cookie-name to be + comprised of cookie-octets instead of being a token as specified in + Section 4.1 and the algorithm accommodates some characters that are + not cookie-octets according to the grammar in Section 4.1. In + addition, the algorithm below also strips leading and trailing + whitespace from the cookie name and value (but maintains internal + whitespace), whereas the grammar in Section 4.1 forbids whitespace in + these positions. User agents use this algorithm so as to + interoperate with servers that do not follow the recommendations in + Section 4. + + NOTE: As set-cookie-string may originate from a non-HTTP API, it is + not guaranteed to be free of CTL characters, so this algorithm + handles them explicitly. Horizontal tab (%x09) is excluded from the + CTL characters that lead to set-cookie-string rejection, as it is + considered whitespace, which is handled separately. 
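The permissive name/value handling described in the notes above (take everything before the first ";", split name from value at the first "=", and trim only leading and trailing whitespace) can be sketched as follows. The function name is illustrative; attribute parsing, CTL-character rejection, and the 4096-octet size limit are omitted.

```python
# Non-normative sketch of the permissive name/value split described
# above: take everything before the first ";", split name from value
# at the first "=", and trim only leading/trailing whitespace.
# The function name is illustrative; attribute parsing and the
# 4096-octet size limit are omitted.
def parse_name_value(set_cookie_string):
    pair = set_cookie_string.split(";", 1)[0]
    if "=" in pair:
        # Internal whitespace in the name and value is preserved.
        name, _, value = pair.partition("=")
    else:
        # No "=": the name is empty and the value is the whole string.
        name, value = "", pair
    return name.strip(" \t"), value.strip(" \t")

assert parse_name_value("SID = 31d4d96e ; Path=/") == ("SID", "31d4d96e")
assert parse_name_value("flag; Secure") == ("", "flag")
```

The second assertion illustrates step 3 of the algorithm below: when the name-value-pair lacks "=", the name string is empty and the value string is the whole pair.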
+ + NOTE: The set-cookie-string may contain octet sequences that appear + percent-encoded as per Section 2.1 of [RFC3986]. However, a user + agent MUST NOT decode these sequences and instead parse the + individual octets as specified in this algorithm. + + A user agent MUST use an algorithm equivalent to the following + algorithm to parse a set-cookie-string: + + 1. If the set-cookie-string contains a %x00-08 / %x0A-1F / %x7F + character (CTL characters excluding HTAB): Abort these steps and + ignore the set-cookie-string entirely. + + 2. If the set-cookie-string contains a %x3B (";") character: + + 1. The name-value-pair string consists of the characters up to, + but not including, the first %x3B (";"), and the unparsed- + attributes consist of the remainder of the set-cookie-string + (including the %x3B (";") in question). + + Otherwise: + + 1. The name-value-pair string consists of all the characters + contained in the set-cookie-string, and the unparsed- + attributes is the empty string. + + 3. If the name-value-pair string lacks a %x3D ("=") character, then + the name string is empty, and the value string is the value of + name-value-pair. + + Otherwise, the (possibly empty) name string consists of the + characters up to, but not including, the first %x3D ("=") + character, and the (possibly empty) value string consists of the + characters after the first %x3D ("=") character. + + 4. Remove any leading or trailing WSP characters from the name + string and the value string. + + 5. If the sum of the lengths of the name string and the value string + is more than 4096 octets, abort these steps and ignore the set- + cookie-string entirely. + + 6. The cookie-name is the name string, and the cookie-value is the + value string. + + The user agent MUST use an algorithm equivalent to the following + algorithm to parse the unparsed-attributes: + + 1. If the unparsed-attributes string is empty, skip the rest of + these steps. + + 2. 
Discard the first character of the unparsed-attributes (which + will be a %x3B (";") character). + + 3. If the remaining unparsed-attributes contains a %x3B (";") + character: + + 1. Consume the characters of the unparsed-attributes up to, but + not including, the first %x3B (";") character. + + Otherwise: + + 1. Consume the remainder of the unparsed-attributes. + + Let the cookie-av string be the characters consumed in this step. + + 4. If the cookie-av string contains a %x3D ("=") character: + + 1. The (possibly empty) attribute-name string consists of the + characters up to, but not including, the first %x3D ("=") + character, and the (possibly empty) attribute-value string + consists of the characters after the first %x3D ("=") + character. + + Otherwise: + + 1. The attribute-name string consists of the entire cookie-av + string, and the attribute-value string is empty. + + 5. Remove any leading or trailing WSP characters from the attribute- + name string and the attribute-value string. + + 6. If the attribute-value is longer than 1024 octets, ignore the + cookie-av string and return to Step 1 of this algorithm. + + 7. Process the attribute-name and attribute-value according to the + requirements in the following subsections. (Notice that + attributes with unrecognized attribute-names are ignored.) + + 8. Return to Step 1 of this algorithm. + + When the user agent finishes parsing the set-cookie-string, the user + agent is said to "receive a cookie" from the request-uri with name + cookie-name, value cookie-value, and attributes cookie-attribute- + list. (See Section 5.7 for additional requirements triggered by + receiving a cookie.) + +5.6.1. The Expires Attribute + + If the attribute-name case-insensitively matches the string + "Expires", the user agent MUST process the cookie-av as follows. + + 1. Let the expiry-time be the result of parsing the attribute-value + as cookie-date (see Section 5.1.1). + + 2. 
If the attribute-value failed to parse as a cookie date, ignore + the cookie-av. + + 3. Let cookie-age-limit be the maximum age of the cookie (which + SHOULD be 400 days in the future or sooner, see Section 5.5). + + 4. If the expiry-time is more than cookie-age-limit, the user agent + MUST set the expiry time to cookie-age-limit in seconds. + + 5. If the expiry-time is earlier than the earliest date the user + agent can represent, the user agent MAY replace the expiry-time + with the earliest representable date. + + 6. Append an attribute to the cookie-attribute-list with an + attribute-name of Expires and an attribute-value of expiry-time. + +5.6.2. The Max-Age Attribute + + If the attribute-name case-insensitively matches the string "Max- + Age", the user agent MUST process the cookie-av as follows. + + 1. If the attribute-value is empty, ignore the cookie-av. + + 2. If the first character of the attribute-value is neither a DIGIT, + nor a "-" character followed by a DIGIT, ignore the cookie-av. + + 3. If the remainder of attribute-value contains a non-DIGIT + character, ignore the cookie-av. + + 4. Let delta-seconds be the attribute-value converted to a base 10 + integer. + + 5. Let cookie-age-limit be the maximum age of the cookie (which + SHOULD be 400 days or less, see Section 5.5). + + 6. Set delta-seconds to the smaller of its present value and cookie- + age-limit. + + 7. If delta-seconds is less than or equal to zero (0), let expiry- + time be the earliest representable date and time. Otherwise, let + the expiry-time be the current date and time plus delta-seconds + seconds. + + 8. Append an attribute to the cookie-attribute-list with an + attribute-name of Max-Age and an attribute-value of expiry-time. + +5.6.3. The Domain Attribute + + If the attribute-name case-insensitively matches the string "Domain", + the user agent MUST process the cookie-av as follows. + + 1. Let cookie-domain be the attribute-value. + + 2. 
If cookie-domain starts with %x2E ("."), let cookie-domain be + cookie-domain without its leading %x2E ("."). + + 3. Convert the cookie-domain to lower case. + + 4. Append an attribute to the cookie-attribute-list with an + attribute-name of Domain and an attribute-value of cookie-domain. + +5.6.4. The Path Attribute + + If the attribute-name case-insensitively matches the string "Path", + the user agent MUST process the cookie-av as follows. + + 1. If the attribute-value is empty or if the first character of the + attribute-value is not %x2F ("/"): + + 1. Let cookie-path be the default-path. + + Otherwise: + + 1. Let cookie-path be the attribute-value. + + 2. Append an attribute to the cookie-attribute-list with an + attribute-name of Path and an attribute-value of cookie-path. + +5.6.5. The Secure Attribute + + If the attribute-name case-insensitively matches the string "Secure", + the user agent MUST append an attribute to the cookie-attribute-list + with an attribute-name of Secure and an empty attribute-value. + +5.6.6. The HttpOnly Attribute + + If the attribute-name case-insensitively matches the string + "HttpOnly", the user agent MUST append an attribute to the cookie- + attribute-list with an attribute-name of HttpOnly and an empty + attribute-value. + +5.6.7. The SameSite Attribute + + If the attribute-name case-insensitively matches the string + "SameSite", the user agent MUST process the cookie-av as follows: + + 1. Let enforcement be "Default". + + 2. If cookie-av's attribute-value is a case-insensitive match for + "None", set enforcement to "None". + + 3. If cookie-av's attribute-value is a case-insensitive match for + "Strict", set enforcement to "Strict". + + 4. If cookie-av's attribute-value is a case-insensitive match for + "Lax", set enforcement to "Lax". + + 5. Append an attribute to the cookie-attribute-list with an + attribute-name of "SameSite" and an attribute-value of + enforcement. + +5.6.7.1. 
"Strict" and "Lax" enforcement + + Same-site cookies in "Strict" enforcement mode will not be sent along + with top-level navigations which are triggered from a cross-site + document context. As discussed in Section 8.8.2, this might or might + not be compatible with existing session management systems. In the + interests of providing a drop-in mechanism that mitigates the risk of + CSRF attacks, developers may set the SameSite attribute in a "Lax" + enforcement mode that carves out an exception which sends same-site + cookies along with cross-site requests if and only if they are top- + level navigations which use a "safe" (in the [HTTP] sense) HTTP + method. (Note that a request's method may be changed from POST to + GET for some redirects (see Sections 15.4.2 and 15.4.3 of [HTTP]); in + these cases, a request's "safe"ness is determined based on the method + of the current redirect hop.) + + Lax enforcement provides reasonable defense in depth against CSRF + attacks that rely on unsafe HTTP methods (like POST), but does not + offer a robust defense against CSRF as a general category of attack: + + 1. Attackers can still pop up new windows or trigger top-level + navigations in order to create a "same-site" request (as + described in Section 5.2.1), which is only a speedbump along the + road to exploitation. + + 2. Features like [prerendering] can be + exploited to create "same-site" requests without the risk of user + detection. + + When possible, developers should use a session management mechanism + such as that described in Section 8.8.2 to mitigate the risk of CSRF + more completely. + +5.6.7.2. "Lax-Allowing-Unsafe" enforcement + + As discussed in Section 8.8.6, compatibility concerns may necessitate + the use of a "Lax-allowing-unsafe" enforcement mode that allows + cookies to be sent with a cross-site HTTP request if and only if it + is a top-level request, regardless of request method. 
That is, the + "Lax-allowing-unsafe" enforcement mode waives the requirement for the + HTTP request's method to be "safe" in the SameSite enforcement step + of the retrieval algorithm in Section 5.8.3. (All cookies, + regardless of SameSite enforcement mode, may be set for top-level + navigations, regardless of HTTP request method, as specified in + Section 5.7.) + + "Lax-allowing-unsafe" is not a distinct value of the SameSite + attribute. Rather, user agents MAY apply "Lax-allowing-unsafe" + enforcement only to cookies that did not explicitly specify a + SameSite attribute (i.e., those whose same-site-flag was set to + "Default" by default). To limit the scope of this compatibility + mode, user agents which apply "Lax-allowing-unsafe" enforcement + SHOULD restrict the enforcement to cookies which were created + recently. Deployment experience has shown a cookie age of 2 minutes + or less to be a reasonable limit. + + If the user agent uses "Lax-allowing-unsafe" enforcement, it MUST + apply the following modification to the retrieval algorithm defined + in Section 5.8.3: + + Replace the condition in the penultimate bullet point of step 1 of + the retrieval algorithm reading + + * The HTTP request associated with the retrieval uses a "safe" + method. + + with + + * At least one of the following is true: + + 1. The HTTP request associated with the retrieval uses a "safe" + method. + + 2. The cookie's same-site-flag is "Default" and the amount of + time elapsed since the cookie's creation-time is at most a + duration of the user agent's choosing. + +5.7. Storage Model + + The user agent stores the following fields about each cookie: name, + value, expiry-time, domain, path, creation-time, last-access-time, + persistent-flag, host-only-flag, secure-only-flag, http-only-flag, + and same-site-flag. 
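The storage model fields above, together with the "Lax-allowing-unsafe" age condition from Section 5.6.7.2, can be sketched as follows. This is an illustrative, non-normative Python sketch: the field names follow the specification's terminology, but the concrete representation (floats for times, a string for same-site-flag) and the 120-second window (the "2 minutes or less" figure cited above) are implementation choices, not requirements.

```python
from dataclasses import dataclass, field
import time

@dataclass
class StoredCookie:
    # The per-cookie fields named in the storage model (Section 5.7).
    name: str
    value: str
    expiry_time: float
    domain: str
    path: str
    creation_time: float = field(default_factory=time.time)
    last_access_time: float = field(default_factory=time.time)
    persistent_flag: bool = False
    host_only_flag: bool = False
    secure_only_flag: bool = False
    http_only_flag: bool = False
    same_site_flag: str = "Default"  # "Default", "None", "Strict", or "Lax"

# Assumed window for "Lax-allowing-unsafe"; the spec leaves the exact
# duration to the user agent and only cites 2 minutes as reasonable.
LAX_ALLOWING_UNSAFE_MAX_AGE = 120.0  # seconds

def lax_allowing_unsafe_applies(cookie: StoredCookie, now: float) -> bool:
    # Applies only to cookies whose same-site-flag defaulted to
    # "Default" and whose age is within the user agent's chosen limit.
    return (cookie.same_site_flag == "Default"
            and now - cookie.creation_time <= LAX_ALLOWING_UNSAFE_MAX_AGE)
```

A user agent applying this relaxation would substitute this check for the "safe method" condition in the retrieval algorithm, as described above.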
+ + When the user agent "receives a cookie" from a request-uri with name + cookie-name, value cookie-value, and attributes cookie-attribute- + list, the user agent MUST process the cookie as follows: + + 1. A user agent MAY ignore a received cookie in its entirety. See + Section 5.3. + + 2. If cookie-name is empty and cookie-value is empty, abort these + steps and ignore the cookie entirely. + + 3. If the cookie-name or the cookie-value contains a %x00-08 / + %x0A-1F / %x7F character (CTL characters excluding HTAB), abort + these steps and ignore the cookie entirely. + + 4. If the sum of the lengths of cookie-name and cookie-value is + more than 4096 octets, abort these steps and ignore the cookie + entirely. + + 5. Create a new cookie with name cookie-name, value cookie-value. + Set the creation-time and the last-access-time to the current + date and time. + + 6. If the cookie-attribute-list contains an attribute with an + attribute-name of "Max-Age": + + 1. Set the cookie's persistent-flag to true. + + 2. Set the cookie's expiry-time to attribute-value of the last + attribute in the cookie-attribute-list with an attribute- + name of "Max-Age". + + Otherwise, if the cookie-attribute-list contains an attribute + with an attribute-name of "Expires" (and does not contain an + attribute with an attribute-name of "Max-Age"): + + 1. Set the cookie's persistent-flag to true. + + 2. Set the cookie's expiry-time to attribute-value of the last + attribute in the cookie-attribute-list with an attribute- + name of "Expires". + + Otherwise: + + 1. Set the cookie's persistent-flag to false. + + 2. Set the cookie's expiry-time to the latest representable + date. + + 7. If the cookie-attribute-list contains an attribute with an + attribute-name of "Domain": + + 1. Let the domain-attribute be the attribute-value of the last + attribute in the cookie-attribute-list with both an + attribute-name of "Domain" and an attribute-value whose + length is no more than 1024 octets. 
(Note that a leading + %x2E ("."), if present, is ignored even though that + character is not permitted.) + + Otherwise: + + 1. Let the domain-attribute be the empty string. + + 8. If the domain-attribute contains a character that is not in the + range of [USASCII] characters, abort these steps and ignore the + cookie entirely. + + 9. If the user agent is configured to reject "public suffixes" and + the domain-attribute is a public suffix: + + 1. If the domain-attribute is identical to the canonicalized + request-host: + + 1. Let the domain-attribute be the empty string. + + Otherwise: + + 1. Abort these steps and ignore the cookie entirely. + + NOTE: This step prevents attacker.example from disrupting the + integrity of site.example by setting a cookie with a Domain + attribute of "example". + + 10. If the domain-attribute is non-empty: + + 1. If the canonicalized request-host does not domain-match the + domain-attribute: + + 1. Abort these steps and ignore the cookie entirely. + + Otherwise: + + 1. Set the cookie's host-only-flag to false. + + 2. Set the cookie's domain to the domain-attribute. + + Otherwise: + + 1. Set the cookie's host-only-flag to true. + + 2. Set the cookie's domain to the canonicalized request-host. + + 11. If the cookie-attribute-list contains an attribute with an + attribute-name of "Path", set the cookie's path to attribute- + value of the last attribute in the cookie-attribute-list with + both an attribute-name of "Path" and an attribute-value whose + length is no more than 1024 octets. Otherwise, set the cookie's + path to the default-path of the request-uri. + + 12. If the cookie-attribute-list contains an attribute with an + attribute-name of "Secure", set the cookie's secure-only-flag to + true. Otherwise, set the cookie's secure-only-flag to false. + + 13. 
If the request-uri does not denote a "secure" connection (as + defined by the user agent), and the cookie's secure-only-flag is + true, then abort these steps and ignore the cookie entirely. + + 14. If the cookie-attribute-list contains an attribute with an + attribute-name of "HttpOnly", set the cookie's http-only-flag to + true. Otherwise, set the cookie's http-only-flag to false. + + 15. If the cookie was received from a "non-HTTP" API and the + cookie's http-only-flag is true, abort these steps and ignore + the cookie entirely. + + 16. If the cookie's secure-only-flag is false, and the request-uri + does not denote a "secure" connection, then abort these steps + and ignore the cookie entirely if the cookie store contains one + or more cookies that meet all of the following criteria: + + 1. Their name matches the name of the newly-created cookie. + + 2. Their secure-only-flag is true. + + 3. Their domain domain-matches the domain of the newly-created + cookie, or vice-versa. + + 4. The path of the newly-created cookie path-matches the path + of the existing cookie. + + Note: The path comparison is not symmetric, ensuring only that a + newly-created, non-secure cookie does not overlay an existing + secure cookie, providing some mitigation against cookie-fixing + attacks. That is, given an existing secure cookie named 'a' + with a path of '/login', a non-secure cookie named 'a' could be + set for a path of '/' or '/foo', but not for a path of '/login' + or '/login/en'. + + 17. If the cookie-attribute-list contains an attribute with an + attribute-name of "SameSite", and an attribute-value of + "Strict", "Lax", or "None", set the cookie's same-site-flag to + the attribute-value of the last attribute in the cookie- + attribute-list with an attribute-name of "SameSite". Otherwise, + set the cookie's same-site-flag to "Default". + + 18. If the cookie's same-site-flag is not "None": + + 1. 
If the cookie was received from a "non-HTTP" API, and the + API was called from a navigable's active document whose + "site for cookies" is not same-site with the top-level + origin, then abort these steps and ignore the newly created + cookie entirely. + + 2. If the cookie was received from a "same-site" request (as + defined in Section 5.2), skip the remaining substeps and + continue processing the cookie. + + 3. If the cookie was received from a request which is + navigating a top-level traversable [HTML] (e.g. if the + request's "reserved client" is either null or an environment + whose "target browsing context"'s navigable is a top-level + traversable), skip the remaining substeps and continue + processing the cookie. + + Note: Top-level navigations can create a cookie with any + SameSite value, even if the new cookie wouldn't have been + sent along with the request had it already existed prior to + the navigation. + + 4. Abort these steps and ignore the newly created cookie + entirely. + + 19. If the cookie's "same-site-flag" is "None", abort these steps + and ignore the cookie entirely unless the cookie's secure-only- + flag is true. + + 20. If the cookie-name begins with a case-insensitive match for the + string "__Secure-", abort these steps and ignore the cookie + entirely unless the cookie's secure-only-flag is true. + + 21. If the cookie-name begins with a case-insensitive match for the + string "__Host-", abort these steps and ignore the cookie + entirely unless the cookie meets all the following criteria: + + 1. The cookie's secure-only-flag is true. + + 2. The cookie's host-only-flag is true. + + 3. The cookie-attribute-list contains an attribute with an + attribute-name of "Path", and the cookie's path is /. + + 22. 
If the cookie-name is empty and either of the following + conditions are true, abort these steps and ignore the cookie + entirely: + + * the cookie-value begins with a case-insensitive match for the + string "__Secure-" + + * the cookie-value begins with a case-insensitive match for the + string "__Host-" + + 23. If the cookie store contains a cookie with the same name, + domain, host-only-flag, and path as the newly-created cookie: + + 1. Let old-cookie be the existing cookie with the same name, + domain, host-only-flag, and path as the newly-created + cookie. (Notice that this algorithm maintains the invariant + that there is at most one such cookie.) + + 2. If the newly-created cookie was received from a "non-HTTP" + API and the old-cookie's http-only-flag is true, abort these + steps and ignore the newly created cookie entirely. + + 3. Update the creation-time of the newly-created cookie to + match the creation-time of the old-cookie. + + 4. Remove the old-cookie from the cookie store. + + 24. Insert the newly-created cookie into the cookie store. + + A cookie is "expired" if the cookie has an expiry date in the past. + + The user agent MUST evict all expired cookies from the cookie store + if, at any time, an expired cookie exists in the cookie store. + + At any time, the user agent MAY "remove excess cookies" from the + cookie store if the number of cookies sharing a domain field exceeds + some implementation-defined upper bound (such as 50 cookies). + + At any time, the user agent MAY "remove excess cookies" from the + cookie store if the cookie store exceeds some predetermined upper + bound (such as 3000 cookies). + + When the user agent removes excess cookies from the cookie store, the + user agent MUST evict cookies in the following priority order: + + 1. Expired cookies. + + 2. Cookies whose secure-only-flag is false, and which share a domain + field with more than a predetermined number of other cookies. + + 3. 
Cookies that share a domain field with more than a predetermined
      number of other cookies.

   4. All cookies.

   If two cookies have the same removal priority, the user agent MUST
   evict the cookie with the earliest last-access-time first.

   When "the current session is over" (as defined by the user agent),
   the user agent MUST remove from the cookie store all cookies with the
   persistent-flag set to false.

5.8. Retrieval Model

   This section defines how cookies are retrieved from a cookie store in
   the form of a cookie-string.  A "retrieval" is any event which
   requires generating a cookie-string.  For example, a retrieval may
   occur in order to build a Cookie header field for an HTTP request, or
   may be required in order to return a cookie-string from a call to a
   "non-HTTP" API that provides access to cookies.  A retrieval has an
   associated URI, same-site status, and type, which are defined below
   depending on the type of retrieval.

5.8.1. The Cookie Header Field

   The user agent includes stored cookies in the Cookie HTTP request
   header field.

   A user agent MAY omit the Cookie header field in its entirety.  For
   example, the user agent might wish to block sending cookies during
   "third-party" requests (see Section 7.1).

   If the user agent does attach a Cookie header field to an HTTP
   request, the user agent MUST generate a single cookie-string and the
   user agent MUST compute the cookie-string following the algorithm
   defined in Section 5.8.3, where the retrieval's URI is the request-
   uri, the retrieval's same-site status is computed for the HTTP
   request as defined in Section 5.2, and the retrieval's type is
   "HTTP".

   Note: Previous versions of this specification required that only one
   Cookie header field be sent in requests.  This is no longer a
   requirement. 
While this specification requires that a single cookie- + string be produced, some user agents may split that string across + multiple cookie header fields. For examples, see Section 8.2.3 of + [RFC9113] and Section 4.2.1 of [RFC9114]. + +5.8.2. Non-HTTP APIs + + The user agent MAY implement "non-HTTP" APIs that can be used to + access stored cookies. + + A user agent MAY return an empty cookie-string in certain contexts, + such as when a retrieval occurs within a third-party context (see + Section 7.1). + + If a user agent does return cookies for a given call to a "non-HTTP" + API with an associated Document, then the user agent MUST compute the + cookie-string following the algorithm defined in Section 5.8.3, where + the retrieval's URI is defined by the caller (see + [DOM-DOCUMENT-COOKIE]), the retrieval's same-site status is "same- + site" if the Document's "site for cookies" is same-site with the top- + level origin as defined in Section 5.2.1 (otherwise it is "cross- + site"), and the retrieval's type is "non-HTTP". + +5.8.3. Retrieval Algorithm + + Given a cookie store and a retrieval, the following algorithm returns + a cookie-string from a given cookie store. + + 1. Let cookie-list be the set of cookies from the cookie store that + meets all of the following requirements: + + * Either: + + - The cookie's host-only-flag is true and the canonicalized + host of the retrieval's URI is identical to the cookie's + domain. + + Or: + + - The cookie's host-only-flag is false and the canonicalized + host of the retrieval's URI domain-matches the cookie's + domain. + + NOTE: (For user agents configured to reject "public suffixes") + It's possible that the public suffix list was changed since a + cookie was created. If this change results in a cookie's + domain becoming a public suffix then that cookie is considered + invalid as it would have been rejected during creation (See + Section 5.7 step 9). 
User agents should be careful to avoid
      retrieving these invalid cookies even if they domain-match the
      host of the retrieval's URI.

   *  The retrieval's URI's path path-matches the cookie's path.

   *  If the cookie's secure-only-flag is true, then the retrieval's
      URI must denote a "secure" connection (as defined by the user
      agent).

      NOTE: The notion of a "secure" connection is not defined by
      this document.  Typically, user agents consider a connection
      secure if the connection makes use of transport-layer
      security, such as SSL or TLS, or if the host is trusted.  For
      example, most user agents consider "https" to be a scheme that
      denotes a secure protocol and "localhost" to be a trusted host.

   *  If the cookie's http-only-flag is true, then exclude the
      cookie if the retrieval's type is "non-HTTP".

   *  If the cookie's same-site-flag is not "None" and the
      retrieval's same-site status is "cross-site", then exclude the
      cookie unless all of the following conditions are met:

      -  The retrieval's type is "HTTP".

      -  The same-site-flag is "Lax" or "Default".

      -  The HTTP request associated with the retrieval uses a
         "safe" method.

      -  The target browsing context of the HTTP request associated
         with the retrieval is the active browsing context or a top-
         level traversable.

   2. The user agent SHOULD sort the cookie-list in the following
      order:

      *  Cookies with longer paths are listed before cookies with
         shorter paths.

      *  Among cookies that have equal-length path fields, cookies with
         earlier creation-times are listed before cookies with later
         creation-times.

      NOTE: Not all user agents sort the cookie-list in this order, but
      this order reflects common practice when this document was
      written, and, historically, there have been servers that
      (erroneously) depended on this order.

   3. Update the last-access-time of each cookie in the cookie-list to
      the current date and time.

   4. Serialize the cookie-list into a cookie-string by processing each
      cookie in the cookie-list in order:

      1. If the cookie's name is not empty, output the cookie's name
         followed by the %x3D ("=") character.

      2. If the cookie's value is not empty, output the cookie's
         value.

      3. If there is an unprocessed cookie in the cookie-list, output
         the characters %x3B and %x20 ("; ").

6. Implementation Considerations

6.1. Limits

   Practical user agent implementations have limits on the number and
   size of cookies that they can store.  General-use user agents SHOULD
   provide each of the following minimum capabilities:

   *  At least 50 cookies per domain.

   *  At least 3000 cookies total.

   User agents MAY limit the maximum number of cookies they store, and
   may evict any cookie at any time (whether at the request of the user
   or due to implementation limitations).

   Note that a limit on the maximum number of cookies also limits the
   total size of the stored cookies, due to the length limits which MUST
   be enforced in Section 5.6.

   Servers SHOULD use as few and as small cookies as possible to avoid
   reaching these implementation limits, to minimize network bandwidth
   due to the Cookie header field being included in every request, and
   to avoid reaching server header field limits (see Section 4.2.1).

   Servers SHOULD gracefully degrade if the user agent fails to return
   one or more cookies in the Cookie header field, because the user
   agent might evict any cookie at any time.

6.2. Application Programming Interfaces

   One reason the Cookie and Set-Cookie header fields use such esoteric
   syntax is that many platforms (both in servers and user agents)
   provide a string-based application programming interface (API) to
   cookies, requiring application-layer programmers to generate and
   parse the syntax used by the Cookie and Set-Cookie header fields,
   which many programmers have done incorrectly, resulting in
   interoperability problems. 
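One of the pieces of cookie syntax that string-based code frequently gets wrong is the serialization of a cookie-list into a cookie-string (step 4 of the retrieval algorithm in Section 5.8.3). The following non-normative Python sketch assumes the cookies are supplied as (name, value) pairs that have already been filtered and sorted per steps 1 and 2:

```python
def serialize_cookie_list(cookies):
    """Serialize a cookie-list into a cookie-string (Section 5.8.3,
    step 4): output the name followed by "=" when the name is
    non-empty, then the value when it is non-empty, and join the
    cookies with "; "."""
    parts = []
    for name, value in cookies:
        out = ""
        if name:                  # step 4.1: name "=" only if non-empty
            out += name + "="
        if value:                 # step 4.2: value only if non-empty
            out += value
        parts.append(out)
    return "; ".join(parts)       # step 4.3: "; " between cookies
```

For example, `serialize_cookie_list([("SID", "31d4"), ("lang", "en-US")])` produces `"SID=31d4; lang=en-US"`, and a nameless cookie serializes as its bare value.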
+ + Instead of providing string-based APIs to cookies, platforms would be + well-served by providing more semantic APIs. It is beyond the scope + of this document to recommend specific API designs, but there are + clear benefits to accepting an abstract "Date" object instead of a + serialized date string. + +7. Privacy Considerations + + Cookies' primary privacy risk is their ability to correlate user + activity. This can happen on a single site, but is most problematic + when activity is tracked across different, seemingly unconnected Web + sites to build a user profile. + + Over time, this capability (warned against explicitly in [RFC2109] + and all of its successors) has become widely used for varied reasons + including: + + * authenticating users across sites, + + * assembling information on users, + + * protecting against fraud and other forms of undesirable traffic, + + * targeting advertisements at specific users or at users with + specified attributes, + + * measuring how often ads are shown to users, and + + * recognizing when an ad resulted in a change in user behavior. + + While not every use of cookies is necessarily problematic for + privacy, their potential for abuse has become a widespread concern in + the Internet community and broader society. In response to these + concerns, user agents have actively constrained cookie functionality + in various ways (as allowed and encouraged by previous + specifications), while avoiding disruption to features they judge + desirable for the health of the Web. + + It is too early to declare consensus on which specific mechanism(s) + should be used to mitigate cookies' privacy impact; user agents' + ongoing changes to how they are handled are best characterised as + experiments that can provide input into that eventual consensus. + + Instead, this document describes limited, general mitigations against + the privacy risks associated with cookies that enjoy wide deployment + at the time of writing. 
It is expected that implementations will + continue to experiment and impose stricter, more well-defined + limitations on cookies over time. Future versions of this document + might codify those mechanisms based upon deployment experience. If + functions that currently rely on cookies can be supported by + separate, targeted mechanisms, they might be documented in separate + specifications and stricter limitations on cookies might become + feasible. + + Note that cookies are not the only mechanism that can be used to + track users across sites, so while these mitigations are necessary to + improve Web privacy, they are not sufficient on their own. + +7.1. Third-Party Cookies + + A "third-party" or cross-site cookie is one that is associated with + embedded content (such as scripts, images, stylesheets, frames) that + is obtained from a different server than the one that hosts the + primary resource (usually, the Web page that the user is viewing). + Third-party cookies are often used to correlate users' activity on + different sites. + + Because of their inherent privacy issues, most user agents now limit + third-party cookies in a variety of ways. Some completely block + third-party cookies by refusing to process third-party Set-Cookie + header fields and refusing to send third-party Cookie header fields. + Some partition cookies based upon the first-party context, so that + different cookies are sent depending on the site being browsed. Some + block cookies based upon user agent cookie policy and/or user + controls. + + While this document does not endorse or require a specific approach, + it is RECOMMENDED that user agents adopt a policy for third-party + cookies that is as restrictive as compatibility constraints permit. + Consequently, resources cannot rely upon third-party cookies being + treated consistently by user agents for the foreseeable future. + +7.2. 
Cookie Policy

   User agents MAY enforce a cookie policy consisting of restrictions on
   how cookies may be used or ignored (see Section 5.3).

   A cookie policy may govern the domains or parties (in the sense of
   first and third parties; see Section 7.1) for which the user agent
   will allow cookie access.  The policy can also define limits on
   cookie size, cookie expiry (see Section 5.5), and the number of
   cookies per domain or in total.

   The recommended cookie expiry upper limit is 400 days.  User agents
   may set a lower limit to enforce shorter data retention timelines, or
   set the limit higher to support longer retention when appropriate
   (e.g., server-to-server communication over HTTPS).

   The goal of a restrictive cookie policy is often to improve security
   or privacy.  User agents often allow users to change the cookie
   policy (see Section 7.3).

7.3. User Controls

   User agents SHOULD provide users with a mechanism for managing the
   cookies stored in the cookie store.  For example, a user agent might
   let users delete all cookies received during a specified time period
   or all the cookies related to a particular domain.  In addition, many
   user agents include a user interface element that lets users examine
   the cookies stored in their cookie store.

   User agents SHOULD provide users with a mechanism for disabling
   cookies.  When cookies are disabled, the user agent MUST NOT include
   a Cookie header field in outbound HTTP requests and the user agent
   MUST NOT process Set-Cookie header fields in inbound HTTP responses.

   User agents MAY offer a way to change the cookie policy (see
   Section 7.2).

   User agents MAY provide users the option of preventing persistent
   storage of cookies across sessions.  When configured in this way,
   user agents MUST treat all received cookies as if the persistent-flag
   were set to false.  Some popular user agents expose this
   functionality via "private browsing" mode [Aggarwal2010].

7.4. 
Expiration Dates + + Although servers can set the expiration date for cookies to the + distant future, most user agents do not actually retain cookies for + multiple decades. Rather than choosing gratuitously long expiration + periods, servers SHOULD promote user privacy by selecting reasonable + cookie expiration periods based on the purpose of the cookie. For + example, a typical session identifier might reasonably be set to + expire in two weeks. + +8. Security Considerations + +8.1. Overview + + Cookies have a number of security pitfalls. This section overviews a + few of the more salient issues. + + In particular, cookies encourage developers to rely on ambient + authority for authentication, often becoming vulnerable to attacks + such as cross-site request forgery [CSRF]. Also, when storing + session identifiers in cookies, developers often create session + fixation vulnerabilities. + + Transport-layer encryption, such as that employed in HTTPS, is + insufficient to prevent a network attacker from obtaining or altering + a victim's cookies because the cookie protocol itself has various + vulnerabilities (see "Weak Confidentiality" and "Weak Integrity", + below). In addition, by default, cookies do not provide + confidentiality or integrity from network attackers, even when used + in conjunction with HTTPS. + +8.2. Ambient Authority + + A server that uses cookies to authenticate users can suffer security + vulnerabilities because some user agents let remote parties issue + HTTP requests from the user agent (e.g., via HTTP redirects or HTML + forms). When issuing those requests, user agents attach cookies even + if the remote party does not know the contents of the cookies, + potentially letting the remote party exercise authority at an unwary + server. + + Although this security concern goes by a number of names (e.g., + cross-site request forgery, confused deputy), the issue stems from + cookies being a form of ambient authority. 
Cookies encourage server + operators to separate designation (in the form of URLs) from + authorization (in the form of cookies). Consequently, the user agent + might supply the authorization for a resource designated by the + attacker, possibly causing the server or its clients to undertake + actions designated by the attacker as though they were authorized by + the user. + + Instead of using cookies for authorization, server operators might + wish to consider entangling designation and authorization by treating + URLs as capabilities. Instead of storing secrets in cookies, this + approach stores secrets in URLs, requiring the remote entity to + supply the secret itself. Although this approach is not a panacea, + judicious application of these principles can lead to more robust + security. + +8.3. Clear Text + + Unless sent over a secure channel (such as TLS), the information in + the Cookie and Set-Cookie header fields is transmitted in the clear. + + 1. All sensitive information conveyed in these header fields is + exposed to an eavesdropper. + + 2. A malicious intermediary could alter the header fields as they + travel in either direction, with unpredictable results. + + 3. A malicious client could alter the Cookie header fields before + transmission, with unpredictable results. + + Servers SHOULD encrypt and sign the contents of cookies (using + whatever format the server desires) when transmitting them to the + user agent (even when sending the cookies over a secure channel). + However, encrypting and signing cookie contents does not prevent an + attacker from transplanting a cookie from one user agent to another + or from replaying the cookie at a later time. + + In addition to encrypting and signing the contents of every cookie, + servers that require a higher level of security SHOULD use the Cookie + and Set-Cookie header fields only over a secure channel. 
When using + cookies over a secure channel, servers SHOULD set the Secure + attribute (see Section 4.1.2.5) for every cookie. If a server does + not set the Secure attribute, the protection provided by the secure + channel will be largely moot. + + For example, consider a webmail server that stores a session + identifier in a cookie and is typically accessed over HTTPS. If the + server does not set the Secure attribute on its cookies, an active + network attacker can intercept any outbound HTTP request from the + user agent and redirect that request to the webmail server over HTTP. + Even if the webmail server is not listening for HTTP connections, the + user agent will still include cookies in the request. The active + network attacker can intercept these cookies, replay them against the + server, and learn the contents of the user's email. If, instead, the + server had set the Secure attribute on its cookies, the user agent + would not have included the cookies in the clear-text request. + +8.4. Session Identifiers + + Instead of storing session information directly in a cookie (where it + might be exposed to or replayed by an attacker), servers commonly + store a nonce (or "session identifier") in a cookie. When the server + receives an HTTP request with a nonce, the server can look up state + information associated with the cookie using the nonce as a key. + + Using session identifier cookies limits the damage an attacker can + cause if the attacker learns the contents of a cookie because the + nonce is useful only for interacting with the server (unlike non- + nonce cookie content, which might itself be sensitive). Furthermore, + using a single nonce prevents an attacker from "splicing" together + cookie content from two interactions with the server, which could + cause the server to behave unexpectedly. + + Using session identifiers is not without risk. For example, the + server SHOULD take care to avoid "session fixation" vulnerabilities. 
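As a non-normative illustration of the nonce pattern just described, the sketch below (in Python; the in-memory store and function names are assumptions, not part of this specification) generates an unguessable nonce, uses it only as a lookup key for server-side state, and reissues the identifier at login, a common safeguard against the session fixation attacks discussed next.

```python
# Illustrative sketch only: a nonce-based session store. The in-memory
# dict and function names are assumptions; a real server would use a
# persistent, shared store.
import secrets

SESSIONS = {}  # nonce ("session identifier") -> server-side state

def create_session(state):
    # An unguessable nonce is the only thing stored in the cookie.
    nonce = secrets.token_urlsafe(32)
    SESSIONS[nonce] = state
    return nonce  # sent to the user agent via Set-Cookie

def lookup_session(nonce):
    # The nonce from the Cookie header is just a lookup key.
    return SESSIONS.get(nonce)

def login(old_nonce, user):
    # Reissue the identifier at login so that a pre-login nonce,
    # possibly planted by an attacker, never names an authenticated
    # session (a defense against session fixation).
    SESSIONS.pop(old_nonce, None)
    return create_session({"user": user})
```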
+ A session fixation attack proceeds in three steps. First, the
+ attacker transplants a session identifier from his or her user agent
+ to the victim's user agent. Second, the victim uses that session
+ identifier to interact with the server, possibly imbuing the session
+ identifier with the user's credentials or confidential information.
+ Third, the attacker uses the session identifier to interact with the
+ server directly, possibly obtaining the user's authority or
+ confidential information.
+
+8.5. Weak Confidentiality
+
+ Cookies do not provide isolation by port. If a cookie is readable by
+ a service running on one port, the cookie is also readable by a
+ service running on another port of the same server. If a cookie is
+ writable by a service on one port, the cookie is also writable by a
+ service running on another port of the same server. For this reason,
+ servers SHOULD NOT both run mutually distrusting services on
+ different ports of the same host and use cookies to store security-
+ sensitive information.
+
+ Cookies do not provide isolation by scheme. Although most commonly
+ used with the http and https schemes, the cookies for a given host
+ might also be available to other schemes, such as ftp and gopher.
+ Although this lack of isolation by scheme is most apparent in non-
+ HTTP APIs that permit access to cookies (e.g., HTML's document.cookie
+ API), the lack of isolation by scheme is actually present in
+ requirements for processing cookies themselves (e.g., consider
+ retrieving a URI with the gopher scheme via HTTP).
+
+ Cookies do not always provide isolation by path. Although the
+ network-level protocol does not send cookies stored for one path to
+ another, some user agents expose cookies via non-HTTP APIs, such as
+ HTML's document.cookie API.
Because some of these user agents (e.g., + web browsers) do not isolate resources received from different paths, + a resource retrieved from one path might be able to access cookies + stored for another path. + +8.6. Weak Integrity + + Cookies do not provide integrity guarantees for sibling domains (and + their subdomains). For example, consider foo.site.example and + bar.site.example. The foo.site.example server can set a cookie with + a Domain attribute of "site.example" (possibly overwriting an + existing "site.example" cookie set by bar.site.example), and the user + agent will include that cookie in HTTP requests to bar.site.example. + In the worst case, bar.site.example will be unable to distinguish + this cookie from a cookie it set itself. The foo.site.example server + might be able to leverage this ability to mount an attack against + bar.site.example. + + Even though the Set-Cookie header field supports the Path attribute, + the Path attribute does not provide any integrity protection because + the user agent will accept an arbitrary Path attribute in a Set- + Cookie header field. For example, an HTTP response to a request for + http://site.example/foo/bar can set a cookie with a Path attribute of + "/qux". Consequently, servers SHOULD NOT both run mutually + distrusting services on different paths of the same host and use + cookies to store security-sensitive information. + + An active network attacker can also inject cookies into the Cookie + header field sent to https://site.example/ by impersonating a + response from http://site.example/ and injecting a Set-Cookie header + field. The HTTPS server at site.example will be unable to + distinguish these cookies from cookies that it set itself in an HTTPS + response. An active network attacker might be able to leverage this + ability to mount an attack against site.example even if site.example + uses HTTPS exclusively. 
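One widely used way to give cookie contents integrity protection against injection of this kind is to accompany each value with a message authentication code. The sketch below is illustrative only (the key, encoding, and helper names are assumptions); note that a MAC by itself does not stop an attacker from replaying a value they legitimately received.

```python
# Illustrative sketch only: authenticating a cookie value with an HMAC
# so a forged or altered value is detected. SECRET_KEY is an assumed
# server-side secret; a MAC does not prevent replay of a captured value.
import base64
import hashlib
import hmac

SECRET_KEY = b"example key; real deployments manage keys out-of-band"

def sign_cookie_value(value):
    mac = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).digest()
    return value + "." + base64.urlsafe_b64encode(mac).decode().rstrip("=")

def verify_cookie_value(signed):
    value, _, _ = signed.rpartition(".")
    if value and hmac.compare_digest(sign_cookie_value(value), signed):
        return value
    return None  # missing, malformed, or tampered with
```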
+ + Servers can partially mitigate these attacks by encrypting and + signing the contents of their cookies, or by naming the cookie with + the __Secure- prefix. However, using cryptography does not mitigate + the issue completely because an attacker can replay a cookie he or + she received from the authentic site.example server in the user's + session, with unpredictable results. + + Finally, an attacker might be able to force the user agent to delete + cookies by storing a large number of cookies. Once the user agent + reaches its storage limit, the user agent will be forced to evict + some cookies. Servers SHOULD NOT rely upon user agents retaining + cookies. + +8.7. Reliance on DNS + + Cookies rely upon the Domain Name System (DNS) for security. If the + DNS is partially or fully compromised, the cookie protocol might fail + to provide the security properties required by applications. + +8.8. SameSite Cookies + +8.8.1. Defense in depth + + "SameSite" cookies offer a robust defense against CSRF attack when + deployed in strict mode, and when supported by the client. It is, + however, prudent to ensure that this designation is not the extent of + a site's defense against CSRF, as same-site navigations and + submissions can certainly be executed in conjunction with other + attack vectors such as cross-site scripting or abuse of page + redirections. + + Understanding how and when a request is considered same-site is also + important in order to properly design a site for SameSite cookies. + For example, if a cross-site top-level request is made to a sensitive + page that request will be considered cross-site and SameSite=Strict + cookies won't be sent; that page's sub-resources requests, however, + are same-site and would receive SameSite=Strict cookies. Sites can + avoid inadvertently allowing access to these sub-resources by + returning an error for the initial page request if it doesn't include + the appropriate cookies. 
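The gating technique described above, in which the initial page request fails when the expected strict cookie is absent, can be sketched as follows (the cookie name and handler shapes are hypothetical assumptions, not part of any specification):

```python
# Illustrative sketch only: fail the initial page request when the
# strict session cookie is absent, so the page's same-site sub-resource
# requests are never reachable without it. The cookie name and the
# handler shape are assumptions.
STRICT_COOKIE = "__Host-session"

def handle_page_request(cookies):
    # A cross-site top-level navigation arrives without the
    # SameSite=Strict cookie; return an error instead of serving a page
    # whose sub-resource requests would then be treated as same-site.
    if STRICT_COOKIE not in cookies:
        return 403, "authentication required"
    return 200, "sensitive page"

def handle_subresource_request(cookies):
    if STRICT_COOKIE not in cookies:
        return 403, "authentication required"
    return 200, "sensitive data"
```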
+ + Developers are strongly encouraged to deploy the usual server-side + defenses (CSRF tokens, ensuring that "safe" HTTP methods are + idempotent, etc) to mitigate the risk more fully. + + Additionally, client-side techniques such as those described in + [app-isolation] may also prove effective against CSRF, and are + certainly worth exploring in combination with "SameSite" cookies. + +8.8.2. Top-level Navigations + + Setting the SameSite attribute in "strict" mode provides robust + defense in depth against CSRF attacks, but has the potential to + confuse users unless sites' developers carefully ensure that their + cookie-based session management systems deal reasonably well with + top-level navigations. + + Consider the scenario in which a user reads their email at MegaCorp + Inc's webmail provider https://site.example/. They might expect that + clicking on an emailed link to https://projects.example/secret/ + project would show them the secret project that they're authorized to + see, but if https://projects.example has marked their session cookies + as SameSite=Strict, then this cross-site navigation won't send them + along with the request. https://projects.example will render a 404 + error to avoid leaking secret information, and the user will be quite + confused. + + Developers can avoid this confusion by adopting a session management + system that relies on not one, but two cookies: one conceptually + granting "read" access, another granting "write" access. The latter + could be marked as SameSite=Strict, and its absence would prompt a + reauthentication step before executing any non-idempotent action. + The former could be marked as SameSite=Lax, in order to allow users + access to data via top-level navigation, or SameSite=None, to permit + access in all contexts (including cross-site embedded contexts). + +8.8.3. Mashups and Widgets + + The Lax and Strict values for the SameSite attribute are + inappropriate for some important use-cases. 
In particular, note that + content intended for embedding in cross-site contexts (social + networking widgets or commenting services, for instance) will not + have access to same-site cookies. Cookies which are required in + these situations should be marked with SameSite=None to allow access + in cross-site contexts. + + Likewise, some forms of Single-Sign-On might require cookie-based + authentication in a cross-site context; these mechanisms will not + function as intended with same-site cookies and will also require + SameSite=None. + +8.8.4. Server-controlled + + SameSite cookies in and of themselves don't do anything to address + the general privacy concerns outlined in Section 7.1 of [RFC6265]. + The "SameSite" attribute is set by the server, and serves to mitigate + the risk of certain kinds of attacks that the server is worried + about. The user is not involved in this decision. Moreover, a + number of side-channels exist which could allow a server to link + distinct requests even in the absence of cookies (for example, + connection and/or socket pooling between same-site and cross-site + requests). + +8.8.5. Reload navigations + + Requests issued for reloads triggered through user interface elements + (such as a refresh button on a toolbar) are same-site only if the + reloaded document was originally navigated to via a same-site + request. This differs from the handling of other reload navigations, + which are always same-site if top-level, since the source navigable's + active document is precisely the document being reloaded. + + This special handling of reloads triggered through a user interface + element avoids sending SameSite cookies on user-initiated reloads if + they were withheld on the original navigation (i.e., if the initial + navigation were cross-site). 
If the reload navigation were instead + considered same-site, and sent all the initially withheld SameSite + cookies, the security benefits of withholding the cookies in the + first place would be nullified. This is especially important given + that the absence of SameSite cookies withheld on a cross-site + navigation request may lead to visible site breakage, prompting the + user to trigger a reload. + + For example, suppose the user clicks on a link from + https://attacker.example/ to https://victim.example/. This is a + cross-site request, so SameSite=Strict cookies are withheld. Suppose + this causes https://victim.example/ to appear broken, because the + site only displays its sensitive content if a particular SameSite + cookie is present in the request. The user, frustrated by the + unexpectedly broken site, presses refresh on their browser's toolbar. + To now consider the reload request same-site and send the initially + withheld SameSite cookie would defeat the purpose of withholding it + in the first place, as the reload navigation triggered through the + user interface may replay the original (potentially malicious) + request. Thus, the reload request should be considered cross-site, + like the request that initially navigated to the page. + + Because requests issued for, non-user initiated, reloads attach all + SameSite cookies, developers should be careful and thoughtful about + when to initiate a reload in order to avoid a CSRF attack. For + example, the page could only initiate a reload if a CSRF token is + present on the initial request. + +8.8.6. Top-level requests with "unsafe" methods + + The "Lax" enforcement mode described in Section 5.6.7.1 allows a + cookie to be sent with a cross-site HTTP request if and only if it is + a top-level navigation with a "safe" HTTP method. 
Implementation + experience shows that this is difficult to apply as the default + behavior, as some sites may rely on cookies not explicitly specifying + a SameSite attribute being included on top-level cross-site requests + with "unsafe" HTTP methods (as was the case prior to the introduction + of the SameSite attribute). + + For example, the concluding step of a login flow may involve a cross- + site top-level POST request to an endpoint; this endpoint expects a + recently created cookie containing transactional state information, + necessary to securely complete the login. For such a cookie, "Lax" + enforcement is not appropriate, as it would cause the cookie to be + excluded due to the unsafe HTTP request method, resulting in an + unrecoverable failure of the whole login flow. + + The "Lax-allowing-unsafe" enforcement mode described in + Section 5.6.7.2 retains some of the protections of "Lax" enforcement + (as compared to "None") while still allowing recently created cookies + to be sent cross-site with unsafe top-level requests. + + As a more permissive variant of "Lax" mode, "Lax-allowing-unsafe" + mode necessarily provides fewer protections against CSRF. + Ultimately, the provision of such an enforcement mode should be seen + as a temporary, transitional measure to ease adoption of "Lax" + enforcement by default. + +9. IANA Considerations + +9.1. Cookie + + The HTTP Field Name Registry (see [HttpFieldNameRegistry]) needs to + be updated with the following registration: + + Header field name: Cookie + + Applicable protocol: http + + Status: standard + + Author/Change controller: IETF + + Specification document: this specification (Section 5.8.1) + +9.2. 
Set-Cookie
+
+ The permanent message header field registry (see
+ [HttpFieldNameRegistry]) needs to be updated with the following
+ registration:
+
+ Header field name: Set-Cookie
+
+ Applicable protocol: http
+
+ Status: standard
+
+ Author/Change controller: IETF
+
+ Specification document: this specification (Section 5.6)
+
+9.3. Cookie Attribute Registry
+
+ IANA is requested to create the "Cookie Attribute" registry, defining
+ the name space of attributes used to control cookies' behavior. The
+ registry should be maintained at https://www.iana.org/assignments/
+ cookie-attribute-names (https://www.iana.org/assignments/cookie-
+ attribute-names).
+
+9.3.1. Procedure
+
+ Each registered attribute name is associated with a description, and
+ a reference detailing how the attribute is to be processed and
+ stored.
+
+ New registrations happen on an "RFC Required" basis (see Section 4.7
+ of [RFC8126]). The attribute to be registered MUST match the
+ extension-av syntax defined in Section 4.1.1. Note that attribute
+ names are generally defined in CamelCase, but technically accepted
+ case-insensitively.
+
+9.3.2. 
Registration + + The "Cookie Attribute Registry" should be created with the + registrations below: + + +==========+==================================+ + | Name | Reference | + +==========+==================================+ + | Domain | Section 4.1.2.3 of this document | + +----------+----------------------------------+ + | Expires | Section 4.1.2.1 of this document | + +----------+----------------------------------+ + | HttpOnly | Section 4.1.2.6 of this document | + +----------+----------------------------------+ + | Max-Age | Section 4.1.2.2 of this document | + +----------+----------------------------------+ + | Path | Section 4.1.2.4 of this document | + +----------+----------------------------------+ + | SameSite | Section 4.1.2.7 of this document | + +----------+----------------------------------+ + | Secure | Section 4.1.2.5 of this document | + +----------+----------------------------------+ + + Table 1 + +10. References + +10.1. Normative References + + [DOM-DOCUMENT-COOKIE] + WHATWG, "HTML - Living Standard", 18 May 2021, + . + + [FETCH] van Kesteren, A., "Fetch", n.d., + . + + [HTML] Hickson, I., Pieters, S., van Kesteren, A., Jägenstedt, + P., and D. Denicola, "HTML", n.d., + . + + [HTTP] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Semantics", STD 97, RFC 9110, + DOI 10.17487/RFC9110, June 2022, + . + + [RFC1034] Mockapetris, P., "Domain names - concepts and facilities", + STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987, + . + + [RFC1123] Braden, R., Ed., "Requirements for Internet Hosts - + Application and Support", STD 3, RFC 1123, + DOI 10.17487/RFC1123, October 1989, + . + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + . + + [RFC4790] Newman, C., Duerst, M., and A. Gulbrandsen, "Internet + Application Protocol Collation Registry", RFC 4790, + DOI 10.17487/RFC4790, March 2007, + . + + [RFC5234] Crocker, D., Ed. and P. 
Overell, "Augmented BNF for Syntax + Specifications: ABNF", STD 68, RFC 5234, + DOI 10.17487/RFC5234, January 2008, + . + + [RFC5890] Klensin, J., "Internationalized Domain Names for + Applications (IDNA): Definitions and Document Framework", + RFC 5890, DOI 10.17487/RFC5890, August 2010, + . + + [RFC6454] Barth, A., "The Web Origin Concept", RFC 6454, + DOI 10.17487/RFC6454, December 2011, + . + + [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for + Writing an IANA Considerations Section in RFCs", BCP 26, + RFC 8126, DOI 10.17487/RFC8126, June 2017, + . + + [SAMESITE] WHATWG, "HTML - Living Standard", 26 January 2021, + . + + [USASCII] American National Standards Institute, "Coded Character + Set -- 7-bit American Standard Code for Information + Interchange", ANSI X3.4, 1986. + +10.2. Informative References + + [Aggarwal2010] + Aggarwal, G., Burzstein, E., Jackson, C., and D. Boneh, + "An Analysis of Private Browsing Modes in Modern + Browsers", 2010, + . + + [app-isolation] + Chen, E., Bau, J., Reis, C., Barth, A., and C. Jackson, + "App Isolation - Get the Security of Multiple Browsers + with Just One", 2011, + . + + [CSRF] Barth, A., Jackson, C., and J. Mitchell, "Robust Defenses + for Cross-Site Request Forgery", + DOI 10.1145/1455770.1455782, ISBN 978-1-59593-810-7, + ACM CCS '08: Proceedings of the 15th ACM conference on + Computer and communications security (pages 75-88), + October 2008, + . + + [HttpFieldNameRegistry] + "Hypertext Transfer Protocol (HTTP) Field Name Registry", + n.d., . + + [prerendering] + Bentzel, C., "Chrome Prerendering", n.d., + . + + [PSL] "Public Suffix List", n.d., + . + + [RFC2109] Kristol, D. and L. Montulli, "HTTP State Management + Mechanism", RFC 2109, DOI 10.17487/RFC2109, February 1997, + . + + [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform + Resource Identifier (URI): Generic Syntax", STD 66, + RFC 3986, DOI 10.17487/RFC3986, January 2005, + . 
+ + [RFC4648] Josefsson, S., "The Base16, Base32, and Base64 Data + Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006, + . + + [RFC6265] Barth, A., "HTTP State Management Mechanism", RFC 6265, + DOI 10.17487/RFC6265, April 2011, + . + + [RFC7034] Ross, D. and T. Gondrom, "HTTP Header Field X-Frame- + Options", RFC 7034, DOI 10.17487/RFC7034, October 2013, + . + + [RFC9113] Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113, + DOI 10.17487/RFC9113, June 2022, + . + + [RFC9114] Bishop, M., Ed., "HTTP/3", RFC 9114, DOI 10.17487/RFC9114, + June 2022, . + + [SERVICE-WORKERS] + Archibald, J. and M. Kruisselbrink, "Service Workers", + n.d., . + +Appendix A. Changes from RFC 6265 + + * Adds the same-site concept and the SameSite attribute. + (Section 5.2 and Section 4.1.2.7) + + * Introduces cookie prefixes and prohibits nameless cookies from + setting a value that would mimic a cookie prefix. (Section 4.1.3 + and Section 5.7) + + * Prohibits non-secure origins from setting cookies with a Secure + flag or overwriting cookies with this flag. (Section 5.7) + + * Limits maximum cookie size. (Section 5.7) + + * Limits maximum values for max-age and expire. (Section 5.6.1 and + Section 5.6.2) + + * Includes the host-only-flag as part of a cookie's uniqueness + computation. (Section 5.7) + + * Considers potentially trustworthy origins as "secure". + (Section 5.7) + + * Improves cookie syntax + + - Treats Set-Cookie: token as creating the cookie ("", "token"). + (Section 5.6) + + - Rejects cookies without a name nor value. (Section 5.7) + + - Specifies how to serialize a nameless/valueless cookie. + (Section 5.8.3) + + - Adjusts ABNF for cookie-pair and the Cookie header production + to allow for spaces. (Section 4.1.1) + + - Explicitly handle control characters. (Section 5.6 and + Section 5.7) + + - Specifies how to handle empty domain attributes. (Section 5.7) + + - Requires ASCII characters for the domain attribute. 
+ (Section 5.7) + + * Refactors cookie retrieval algorithm to support non-HTTP APIs. + (Section 5.8.2) + + * Specifies that the Set-Cookie line should not be decoded. + (Section 5.6) + + * Adds an advisory section to assist implementers in deciding which + requirements to implement. (Section 3.2) + + * Advises against sending invalid cookies due to public suffix list + changes. (Section 5.8.3) + + * Removes the single cookie header requirement. (Section 5.8.1) + + * Address errata 3444 by updating the path-value andextension-av + grammar, errata 4148 by updating the day-of-month, year, and time + grammar, and errata 3663 by adding the requested note. + (Section 4.1 and Section 5.1.4) + +Acknowledgements + + RFC 6265 was written by Adam Barth. This document is an update of + RFC 6265, adding features and aligning the specification with the + reality of today's deployments. Here, we're standing upon the + shoulders of a giant since the majority of the text is still Adam's. + + Thank you to both Lily Chen and Steven Englehardt, editors emeritus, + for their significant contributions improving this draft. + +Authors' Addresses + + Steven Bingler (editor) + Google LLC + Email: bingler@google.com + + + Mike West (editor) + Google LLC + Email: mkwst@google.com + URI: https://mikewest.org/ + + + John Wilander (editor) + Apple, Inc + Email: wilander@apple.com diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-safe-method-w-body.html b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-safe-method-w-body.html new file mode 100644 index 000000000..c96936d7f --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-safe-method-w-body.html @@ -0,0 +1,2087 @@ + + + + + + +The HTTP QUERY Method + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Internet-Draft                The HTTP QUERY Method            January 2025
Reschke, et al.               Expires 11 July 2025                    [Page]
+
+
+
+
Workgroup:
+
HTTP
+
Published:
+
+ +
+
Intended Status:
+
Standards Track
+
Expires:
+
+
Authors:
+
+
+
J. Reschke
+
greenbytes
+
+
+
A. Malhotra
+
+
+
J.M. Snell
+
+
+
M. Bishop
+
Akamai
+
+
+
+
+

The HTTP QUERY Method

+
+

Abstract

+

+ This specification defines a new HTTP method, QUERY, as a safe, idempotent + request method that can carry request content.

+
+
+

+Editorial Note +

+

This note is to be removed before publishing as an RFC.

+

+ Discussion of this draft takes place on the HTTP working group + mailing list (ietf-http-wg@w3.org), which is archived at + https://lists.w3.org/Archives/Public/ietf-http-wg/.

+

+ Working Group information can be found at https://httpwg.org/; + source code and issues list for this draft can be found at + https://github.com/httpwg/http-extensions/labels/query-method.

+

+ The changes in this draft are summarized in Appendix B.8.

+
+
+
+

+Status of This Memo +

+

+ This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79.

+

+ Internet-Drafts are working documents of the Internet Engineering Task + Force (IETF). Note that other groups may also distribute working + documents as Internet-Drafts. The list of current Internet-Drafts is + at https://datatracker.ietf.org/drafts/current/.

+

+ Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress."

+

+ This Internet-Draft will expire on 11 July 2025.

+
+
+ + +
+
+

+1. Introduction +

+

+ This specification defines the HTTP QUERY request method as a means of + making a safe, idempotent request that contains content.

+

+ Most often, this is desirable when the data conveyed in a request is + too voluminous to be encoded into the request's URI. For example, + this is a common query pattern:

+
+
+GET /feed?q=foo&limit=10&sort=-published HTTP/1.1
+Host: example.org
+
+
+

+ However, for a query with parameters that are complex or large in size, + encoding it in the request URI may not be the best option because

+
    +
  • often size limits are not known ahead of time because a request can pass through many uncoordinated
+ systems (but note that Section 4.1 of [HTTP] recommends that senders and recipients support at least 8000 octets),
+
  • +
  • expressing certain kinds of data in the target URI is inefficient because of the overhead of encoding that data into a valid URI, and +
  • +
  • encoding query parameters directly into the request URI effectively casts every possible combination of query inputs as distinct + resources. +
  • +
+

+ As an alternative to using GET, many implementations make use of the + HTTP POST method to perform queries, as illustrated in the example + below. In this case, the input parameters to the query operation are + passed along within the request content as opposed to using the + request URI.

+

+ A typical use of HTTP POST for requesting a query:

+
+
+POST /feed HTTP/1.1
+Host: example.org
+Content-Type: application/x-www-form-urlencoded
+
+q=foo&limit=10&sort=-published
+
+
+

+ This variation, however, suffers from the same basic limitation as GET + in that it is not readily apparent -- absent specific knowledge of the + resource and server to which the request is being sent -- that a safe, + idempotent query is being performed.

+

+ The QUERY method provides a solution that spans the gap between the use of GET and POST, with + the example above being expressed as:

+
+
+QUERY /feed HTTP/1.1
+Host: example.org
+Content-Type: application/x-www-form-urlencoded
+
+q=foo&limit=10&sort=-published
+
+
+

+ As with POST, the input to the query operation is passed along within the content of the request + rather than as part of the request URI. Unlike POST, however, the method is explicitly safe + and idempotent, allowing functions like caching and automatic retries to operate.

+

Summarizing:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Table 1
             | GET                    | QUERY                  | POST
Safe         | yes                    | yes                    | potentially no
Idempotent   | yes                    | yes                    | potentially no
Cacheable    | yes                    | yes                    | no
Content      | "no defined semantics" | expected (semantics    | expected (semantics
(body)       |                        | per target resource)   | per target resource)
+
+
+

+1.1. Terminology +

+

+ This document uses terminology defined in Section 3 of [HTTP].

+
+
+
+
+

+1.2. Notational Conventions +

+

+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL + NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", + "MAY", and "OPTIONAL" in this document are to be interpreted as + described in BCP 14 [RFC2119] [RFC8174] when, and only when, they + appear in all capitals, as shown here.

+
+
+
+
+
+
+

+2. QUERY +

+

+ The QUERY method is used to initiate a server-side query. Unlike + the HTTP GET method, which requests that a server return a + representation of the resource identified by the target URI + (as defined by Section 7.1 of [HTTP]), the QUERY + method is used to ask the server to perform a query operation + (described by the request content) over some set of data scoped to the + target URI. The content returned in response to a QUERY + cannot be assumed to be a representation of the resource identified by + the target URI.

+

+ The content of the request defines the query. Implementations MAY use + a request content of any media type with the QUERY method, provided that it + has appropriate query semantics.

+

+ QUERY requests are both safe and idempotent with regard to the
+ resource identified by the request URI. That is, QUERY requests do not
+ alter the state of the targeted resource. However, while processing a
+ QUERY request, a server can be expected to allocate computing and memory
+ resources or even create additional HTTP resources through which the
+ response can be retrieved.

+

+ A successful response to a QUERY request is expected to provide some + indication as to the final disposition of the operation. For + instance, a successful query that yields no results can be represented + by a 204 No Content response. If the response includes content, + it is expected to describe the results of the operation.
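For illustration only, a server-side QUERY handler consistent with the description above might look like the following sketch (the data set, the form-encoded query format, and the function shape are assumptions, not requirements of this specification):

```python
# Non-normative sketch of a QUERY handler: the query operation is scoped
# to the target URI and described entirely by the request content.
# DATA and the form-encoded query format are illustrative assumptions.
from urllib.parse import parse_qs

DATA = {"/feed": [{"title": "foo one"}, {"title": "bar"}]}

def handle_query(target, content):
    params = parse_qs(content.decode())
    q = params.get("q", [""])[0]
    limit = int(params.get("limit", ["10"])[0])
    results = [e for e in DATA.get(target, []) if q in e["title"]][:limit]
    if not results:
        return 204, None   # successful query that yields no results
    return 200, results    # response content describes the results
```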

+
+
+

+2.1. Content-Location and Location Fields +

+

+ Furthermore, a successful response can include a Content-Location + header field (see Section 8.7 of [HTTP]) containing an + identifier for a resource corresponding to the results of the + operation. This represents a claim from the server that a client can send + a GET request for the indicated URI to retrieve the results of the query + operation just performed. The indicated resource might be temporary.

+

+ A server MAY create or locate a resource that identifies the query + operation for future use. If the server does so, the URI of the resource + can be included in the Location header field of the response (see Section 10.2.2 of [HTTP]). This represents a claim that a client can + send a GET request to the indicated URI to repeat the query operation just + performed without resending the query parameters. This resource might be + temporary; if a future request fails, the client can retry using the + original QUERY resource and the previously submitted parameters again.
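The retry behavior described above can be sketched from the client's side as follows (illustrative only; `send` stands in for an assumed HTTP transport and is not a defined API):

```python
# Non-normative client sketch: try a stored query resource first, then
# fall back to re-sending the original QUERY with the previously
# submitted parameters. `send(method, uri, content)` is an assumed
# transport returning a (status, body) pair.
def fetch_results(send, query_uri, content, stored_location=None):
    if stored_location:
        status, body = send("GET", stored_location, None)
        if 200 <= status < 300:
            return body
    # The stored resource may have been temporary; redo the QUERY.
    status, body = send("QUERY", query_uri, content)
    return body
```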

+
+
+
+
+

+2.2. Redirection +

+

+ In some cases, the server may choose to respond indirectly to the QUERY + request by redirecting the user agent to a different URI (see + Section 15.4 of [HTTP]). The semantics of the redirect + response do not differ from other methods. For instance, a 303 (See Other) + response would indicate that the Location field identifies an + alternate URI from which the results can be retrieved + using a GET request (this use case is also covered by the + use of the Location response field in a 2xx response). + On the other hand, response codes 307 (Temporary Redirect) and 308 + (Permanent Redirect) can be used to request the user agent to redo + the QUERY request on the URI specified by the Location field. + Various non-normative examples of successful + QUERY responses are illustrated in Appendix A.

+
+
+
+
+

+2.3. Conditional Requests +

+

+ A conditional QUERY requests that the selected representation + (i.e., the query results, after any content negotiation) be + returned in the response only under the circumstances described by the + conditional header field(s), as defined in + Section 13 of [HTTP].

+
+
+
+
+

+2.4. Caching +

+

+ The response to a QUERY method is cacheable; a cache MAY use it to satisfy subsequent + QUERY requests as per Section 4 of [HTTP-CACHING].

+

+ The cache key for a query (see Section 2 of [HTTP-CACHING]) MUST + incorporate the request content. When doing so, caches SHOULD first normalize request + content to remove semantically insignificant differences, thereby improving cache + efficiency, by:

+
    +
  • Removing content encoding(s) +
  • +
  • Normalizing based upon knowledge of format conventions, as indicated by any media type suffix in the request's Content-Type field (e.g., "+json") +
  • +
  • Normalizing based upon knowledge of the semantics of the content itself, as indicated by the request's Content-Type field. +
  • +
+

+ Note that any such normalization is performed solely for the purpose of generating a cache key; it + does not change the request itself.
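As an illustration of the key-generation step above, the following minimal Python sketch derives a cache key from a QUERY request. The helper name normalized_cache_key and the specific normalizations shown (undoing a gzip content coding, re-serializing JSON for "+json"-suffixed or JSON media types) are assumptions for illustration only, not requirements of this specification.

```python
import gzip
import hashlib
import json


def normalized_cache_key(method, target_uri, content,
                         content_encoding=None, content_type=""):
    """Illustrative cache key for a QUERY request (hypothetical helper).

    Normalization is applied only to derive the key; the stored request
    itself is left unchanged, as required above.
    """
    body = content
    # Step 1: remove the content coding before comparing payloads.
    if content_encoding == "gzip":
        body = gzip.decompress(body)
    # Step 2: normalize by format convention -- here, re-serialize JSON
    # with sorted keys and no insignificant whitespace.
    if content_type.endswith(("+json", "/json")):
        body = json.dumps(json.loads(body), sort_keys=True,
                          separators=(",", ":")).encode()
    digest = hashlib.sha256(body).hexdigest()
    return (method, target_uri, digest)
```

Two requests whose contents differ only in whitespace, key order, or content coding would then map to the same stored response.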

+
+
+
+
+
+
+

+3. The "Accept-Query" Header Field +

+

+ The "Accept-Query" response header field can be used by a resource to + directly signal support for the QUERY method while identifying + the specific query format media type(s) that may be used.

+

+ "Accept-Query" contains a list of media ranges (Section 12.5.1 of [HTTP]) + using "Structured Fields" syntax ([STRUCTURED-FIELDS]). + Media ranges are represented by a List Structured Header Field of either Tokens or + Strings, containing the media range value without parameters. Parameters, + if any, are mapped to Parameters of type String.

+

+ The choice of Token vs. String is semantically insignificant. That is, + recipients MAY convert Tokens to Strings, but MUST NOT process them + differently based on the received type.

+

+ Media types do not exactly map to Tokens; for instance, they + allow a leading digit. In cases like these, the String format needs to + be used.
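To make the Token-versus-String choice concrete, here is a small, non-normative Python sketch that serializes a single media range as a Structured Fields Token where the Token grammar allows it, falling back to the String form otherwise. The function name and fallback behavior are assumptions for illustration; a real implementation should use a complete Structured Fields serializer.

```python
import string

# Characters permitted after the first character of an sf-token:
# tchar (letters, digits, and selected symbols) plus ":" and "/".
_TOKEN_CHARS = set(string.ascii_letters + string.digits
                   + "!#$%&'*+-.^_`|~:/")


def serialize_media_range(media_range):
    """Illustrative helper: emit one Accept-Query list member.

    A Structured Fields Token must begin with a letter or "*"; media
    ranges that do not fit (e.g. a type with a leading digit) have to
    be sent in String form instead.
    """
    if media_range and (media_range[0].isalpha() or media_range[0] == "*") \
            and all(c in _TOKEN_CHARS for c in media_range):
        return media_range            # Token form
    return '"' + media_range + '"'    # String form (no escaping needed here)
```

Recall that the choice is semantically insignificant: recipients treat both forms identically.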

+

+ The only supported uses of wildcards are "*/*", which matches any type, + and "xxxx/*", which matches any subtype of the indicated type.

+

+ The order of types listed in the field value is not significant.

+

+ The only allowed format for parameters is String.

+

+ Accept-Query's value applies to every URI on the server that shares the same path; in + other words, the query component is ignored. If requests to the same resource return + different Accept-Query values, the most recently received fresh value (per + Section 4.2 of [HTTP-CACHING]) is used.
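The scoping rule above can be sketched as a small key function: two URIs share an Accept-Query value exactly when this hypothetical helper (built on Python's standard urllib) returns the same tuple for both.

```python
from urllib.parse import urlsplit


def accept_query_scope(uri):
    """Illustrative key: Accept-Query applies per scheme/authority/path.

    The query component is ignored, so all URIs on a server that differ
    only in their query component share one Accept-Query value.
    """
    parts = urlsplit(uri)
    return (parts.scheme, parts.netloc, parts.path)
```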

+

+ Example:

+
+
+Accept-Query: "application/jsonpath", application/sql;charset="UTF-8"
+
+

+ Although the syntax for this field appears to be similar to other + fields, such as "Accept" (Section 12.5.1 of [HTTP]), + it is a Structured Field and thus MUST be processed as specified in + Section 4 of [STRUCTURED-FIELDS].

+
+
+
+

+4. Security Considerations +

+

+ The QUERY method is subject to the same general security + considerations as all HTTP methods as described in + [HTTP].

+

+ It can be used as an alternative to passing request + information in the URI (e.g., in the query section). This is preferred in some + cases, as the URI is more likely to be logged or otherwise processed + by intermediaries than the request content. + If a server creates a temporary resource to represent the results of a QUERY + request (e.g., for use in the Location or Content-Location field) and the request + contains sensitive information that cannot be logged, then the URI of this + resource SHOULD be chosen such that it does not include any sensitive + portions of the original request content.

+

+ Caches that normalize QUERY content incorrectly, or in ways that + differ significantly from how the resource processes the content, + can return an incorrect response if normalization results in a false positive.

+

+ A QUERY request from user agents implementing CORS (Cross-Origin Resource Sharing) + will require a "preflight" request, + as QUERY does not belong to the set of CORS-safelisted methods + (see "Methods" in + [FETCH]).

+
+
+
+

+5. IANA Considerations +

+
+
+

+5.1. Registration of QUERY method +

+

+ IANA is requested to add the QUERY method to the HTTP + Method Registry at <https://www.iana.org/assignments/http-methods> + (see Section 16.3.1 of [HTTP]).

+ + + + + + + + + + + + + + + + + + +
Table 2
Method NameSafeIdempotentSpecification
QUERYYesYes + Section 2 +
+
+
+
+
+

+5.2. Registration of Accept-Query field +

+

+ IANA is requested to add the Accept-Query field to the HTTP Field Name + Registry at <https://www.iana.org/assignments/http-fields> + (see Section 16.1.1 of [HTTP]).

+ + + + + + + + + + + + + + + + + + + + +
Table 3
Field NameStatusStructured TypeReferenceComments
Accept-QuerypermanentList + Section 3 of this document.
+
+
+
+
+
+

+6. Normative References +

+
+
[RFC2119]
+
+Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/info/rfc2119>.
+
+
[RFC8174]
+
+Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/info/rfc8174>.
+
+
[HTTP]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, , <https://www.rfc-editor.org/rfc/rfc9110>.
+
+
[HTTP-CACHING]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Caching", STD 98, RFC 9111, , <https://www.rfc-editor.org/rfc/rfc9111>.
+
+
[STRUCTURED-FIELDS]
+
+Nottingham, M. and P-H. Kamp, "Structured Field Values for HTTP", RFC 9651, , <https://www.rfc-editor.org/rfc/rfc9651>.
+
+
+
+
+

+7. Informative References +

+
+
[FETCH]
+
+WHATWG, "FETCH", <https://fetch.spec.whatwg.org>.
+
+
+
+
+
+

+Appendix A. Examples +

+

+ The non-normative examples in this section make use of a simple, + hypothetical plain-text based query syntax based on SQL with results + returned as comma-separated values. This is done for illustration + purposes only. Implementations are free to use any format they wish on + both the request and response.

+
+

+A.1. Simple QUERY with a Direct Response +

+

A simple query with a direct response:

+
+
+QUERY /contacts HTTP/1.1
+Host: example.org
+Content-Type: application/sql
+Accept: text/csv
+
+select surname, givenname, email limit 10
+
+
+

Response:

+
+
+HTTP/1.1 200 OK
+Content-Type: text/csv
+
+surname, givenname, email
+Smith, John, john.smith@example.org
+Jones, Sally, sally.jones@example.com
+Dubois, Camille, camille.dubois@example.net
+
+
+
+
+

+A.2. Simple QUERY with a Direct Response and Location Fields +

+

A simple query with a direct response and Location fields:

+
+
+QUERY /contacts HTTP/1.1
+Host: example.org
+Content-Type: application/sql
+Accept: text/csv
+
+select surname, givenname, email limit 10
+
+
+

Response:

+
+
+HTTP/1.1 200 OK
+Content-Type: text/csv
+Content-Location: /contacts/responses/42
+Location: /contacts/queries/17
+
+surname, givenname, email
+Smith, John, john.smith@example.org
+Jones, Sally, sally.jones@example.com
+Dubois, Camille, camille.dubois@example.net
+
+
+

+ A subsequent GET request on /contacts/responses/42 would return + the same response, until the server decides to remove that + resource.

+

+ A GET request on /contacts/queries/17, however, would execute the same + query again, and return a fresh result for that query:

+
+
+GET /contacts/queries/17 HTTP/1.1
+Host: example.org
+Accept: text/csv
+
+
+

Response:

+
+
+HTTP/1.1 200 OK
+Content-Type: text/csv
+Content-Location: /contacts/responses/43
+
+surname, givenname, email
+Jones, Sally, sally.jones@example.com
+Dubois, Camille, camille.dubois@example.net
+
+
+
+
+

+A.3. Simple QUERY with Indirect Response (303 See Other) +

+

A simple query with an Indirect Response (303 See Other):

+
+
+QUERY /contacts HTTP/1.1
+Host: example.org
+Content-Type: application/sql
+Accept: text/csv
+
+select surname, givenname, email limit 10
+
+
+

Response:

+
+
+HTTP/1.1 303 See Other
+Location: /contacts/query123
+
+
+

Retrieval of the Query Response:

+
+
+GET /contacts/query123 HTTP/1.1
+Host: example.org
+
+
+

Response:

+
+
+HTTP/1.1 200 OK
+Content-Type: text/csv
+
+surname, givenname, email
+Smith, John, john.smith@example.org
+Jones, Sally, sally.jones@example.com
+Dubois, Camille, camille.dubois@example.net
+
+
+
+
+

+A.4. Simple QUERY with Redirect Response (308 Permanent Redirect) +

+

A simple query being redirected:

+
+
+QUERY /contacts HTTP/1.1
+Host: example.org
+Content-Type: application/sql
+Accept: text/csv
+
+select surname, givenname, email limit 10
+
+
+

Response:

+
+
+HTTP/1.1 308 Permanent Redirect
+Location: /morecontacts
+
+
+

Redirected request:

+
+
+QUERY /morecontacts HTTP/1.1
+Host: example.org
+Content-Type: application/sql
+Accept: text/csv
+
+select surname, givenname, email limit 10
+
+
+

Response:

+
+
+HTTP/1.1 200 OK
+Content-Type: text/csv
+
+surname, givenname, email
+Smith, John, john.smith@example.org
+Jones, Sally, sally.jones@example.com
+Dubois, Camille, camille.dubois@example.net
+
+
+
+
+
+
+
+

+Appendix B. Change Log +

+

This section is to be removed before publishing as an RFC.

+
+
+

+B.1. Since draft-ietf-httpbis-safe-method-w-body-00 +

+ +
+
+ +
+
+

+B.3. Since draft-ietf-httpbis-safe-method-w-body-02 +

+ +
+
+ +
+
+

+B.5. Since draft-ietf-httpbis-safe-method-w-body-04 +

+ +
+
+
+
+

+B.6. Since draft-ietf-httpbis-safe-method-w-body-05 +

+ +
+
+
+
+

+B.7. Since draft-ietf-httpbis-safe-method-w-body-06 +

+ +
+
+ +
+
+
+
+

+Authors' Addresses +

+
+
Julian Reschke
+
greenbytes GmbH
+
Hafenweg 16
+
+48155 Münster +
+
Germany
+ + +
+
+
Ashok Malhotra
+ +
+
+
James M Snell
+ +
+
+
Mike Bishop
+
Akamai
+ +
+
+
+ + + diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-safe-method-w-body.txt b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-safe-method-w-body.txt new file mode 100644 index 000000000..516a3f9cf --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-safe-method-w-body.txt @@ -0,0 +1,694 @@ + + + + +HTTP J. Reschke +Internet-Draft greenbytes +Intended status: Standards Track A. Malhotra +Expires: 11 July 2025 + J.M. Snell + + M. Bishop + Akamai + 7 January 2025 + + + The HTTP QUERY Method + draft-ietf-httpbis-safe-method-w-body-latest + +Abstract + + This specification defines a new HTTP method, QUERY, as a safe, + idempotent request method that can carry request content. + +Editorial Note + + This note is to be removed before publishing as an RFC. + + Discussion of this draft takes place on the HTTP working group + mailing list (ietf-http-wg@w3.org), which is archived at + https://lists.w3.org/Archives/Public/ietf-http-wg/. + + Working Group information can be found at https://httpwg.org/; source + code and issues list for this draft can be found at + https://github.com/httpwg/http-extensions/labels/query-method. + + The changes in this draft are summarized in Appendix B.8. + +Status of This Memo + + This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF). Note that other groups may also distribute + working documents as Internet-Drafts. The list of current Internet- + Drafts is at https://datatracker.ietf.org/drafts/current/. + + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." + + This Internet-Draft will expire on 11 July 2025. 
+ +Copyright Notice + + Copyright (c) 2025 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents (https://trustee.ietf.org/ + license-info) in effect on the date of publication of this document. + Please review these documents carefully, as they describe your rights + and restrictions with respect to this document. Code Components + extracted from this document must include Revised BSD License text as + described in Section 4.e of the Trust Legal Provisions and are + provided without warranty as described in the Revised BSD License. + +Table of Contents + + 1. Introduction + 1.1. Terminology + 1.2. Notational Conventions + 2. QUERY + 2.1. Content-Location and Location Fields + 2.2. Redirection + 2.3. Conditional Requests + 2.4. Caching + 3. The "Accept-Query" Header Field + 4. Security Considerations + 5. IANA Considerations + 5.1. Registration of QUERY method + 5.2. Registration of Accept-Query field + 6. Normative References + 7. Informative References + Appendix A. Examples + A.1. Simple QUERY with a Direct Response + A.2. Simple QUERY with a Direct Response and Location Fields + A.3. Simple QUERY with Indirect Response (303 See Other) + A.4. Simple QUERY with Redirect Response (308 Moved Permanently) + Appendix B. Change Log + B.1. Since draft-ietf-httpbis-safe-method-w-body-00 + B.2. Since draft-ietf-httpbis-safe-method-w-body-01 + B.3. Since draft-ietf-httpbis-safe-method-w-body-02 + B.4. Since draft-ietf-httpbis-safe-method-w-body-03 + B.5. Since draft-ietf-httpbis-safe-method-w-body-04 + B.6. Since draft-ietf-httpbis-safe-method-w-body-05 + B.7. Since draft-ietf-httpbis-safe-method-w-body-06 + B.8. Since draft-ietf-httpbis-safe-method-w-body-07 + Authors' Addresses + +1. Introduction + + This specification defines the HTTP QUERY request method as a means + of making a safe, idempotent request that contains content. 
+ + Most often, this is desirable when the data conveyed in a request is + too voluminous to be encoded into the request's URI. For example, + this is a common query pattern: + + GET /feed?q=foo&limit=10&sort=-published HTTP/1.1 + Host: example.org + + However, for a query with parameters that are complex or large in + size, encoding it in the request URI may not be the best option + because + + * often size limits are not known ahead of time because a request + can pass through many uncoordinated system (but note that + Section 4.1 of [HTTP] recommends senders and recipients to support + at least 8000 octets), + + * expressing certain kinds of data in the target URI is inefficient + because of the overhead of encoding that data into a valid URI, + and + + * encoding query parameters directly into the request URI + effectively casts every possible combination of query inputs as + distinct resources. + + As an alternative to using GET, many implementations make use of the + HTTP POST method to perform queries, as illustrated in the example + below. In this case, the input parameters to the query operation are + passed along within the request content as opposed to using the + request URI. + + A typical use of HTTP POST for requesting a query: + + POST /feed HTTP/1.1 + Host: example.org + Content-Type: application/x-www-form-urlencoded + + q=foo&limit=10&sort=-published + + This variation, however, suffers from the same basic limitation as + GET in that it is not readily apparent -- absent specific knowledge + of the resource and server to which the request is being sent -- that + a safe, idempotent query is being performed. 
+ + The QUERY method provides a solution that spans the gap between the + use of GET and POST, with the example above being expressed as: + + QUERY /feed HTTP/1.1 + Host: example.org + Content-Type: application/x-www-form-urlencoded + + q=foo&limit=10&sort=-published + + As with POST, the input to the query operation is passed along within + the content of the request rather than as part of the request URI. + Unlike POST, however, the method is explicitly safe and idempotent, + allowing functions like caching and automatic retries to operate. + + Summarizing: + + +============+=============+==================+==================+ + | | GET | QUERY | POST | + +============+=============+==================+==================+ + | Safe | yes | yes | potentially no | + +------------+-------------+------------------+------------------+ + | Idempotent | yes | yes | potentially no | + +------------+-------------+------------------+------------------+ + | Cacheable | yes | yes | no | + +------------+-------------+------------------+------------------+ + | Content | "no defined | expected | expected | + | (body) | semantics" | (semantics per | (semantics per | + | | | target resource) | target resource) | + +------------+-------------+------------------+------------------+ + + Table 1 + +1.1. Terminology + + This document uses terminology defined in Section 3 of [HTTP]. + +1.2. Notational Conventions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in BCP + 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + +2. QUERY + + The QUERY method is used to initiate a server-side query. 
Unlike the + HTTP GET method, which requests that a server return a representation + of the resource identified by the target URI (as defined by + Section 7.1 of [HTTP]), the QUERY method is used to ask the server to + perform a query operation (described by the request content) over + some set of data scoped to the target URI. The content returned in + response to a QUERY cannot be assumed to be a representation of the + resource identified by the target URI. + + The content of the request defines the query. Implementations MAY + use a request content of any media type with the QUERY method, + provided that it has appropriate query semantics. + + QUERY requests are both safe and idempotent with regards to the + resource identified by the request URI. That is, QUERY requests do + not alter the state of the targeted resource. However, while + processing a QUERY request, a server can be expected to allocate + computing and memory resources or even create additional HTTP + resources through which the response can be retrieved. + + A successful response to a QUERY request is expected to provide some + indication as to the final disposition of the operation. For + instance, a successful query that yields no results can be + represented by a 204 No Content response. If the response includes + content, it is expected to describe the results of the operation. + +2.1. Content-Location and Location Fields + + Furthermore, a successful response can include a Content-Location + header field (see Section 8.7 of [HTTP]) containing an identifier for + a resource corresponding to the results of the operation. This + represents a claim from the server that a client can send a GET + request for the indicated URI to retrieve the results of the query + operation just performed. The indicated resource might be temporary. + + A server MAY create or locate a resource that identifies the query + operation for future use. 
If the server does so, the URI of the + resource can be included in the Location header field of the response + (see Section 10.2.2 of [HTTP]). This represents a claim that a + client can send a GET request to the indicated URI to repeat the + query operation just performed without resending the query + parameters. This resource might be temporary; if a future request + fails, the client can retry using the original QUERY resource and the + previously submitted parameters again. + +2.2. Redirection + + In some cases, the server may choose to respond indirectly to the + QUERY request by redirecting the user agent to a different URI (see + Section 15.4 of [HTTP]). The semantics of the redirect response do + not differ from other methods. For instance, a 303 (See Other) + response would indicate that the Location field identifies an + alternate URI from which the results can be retrieved using a GET + request (this use case is also covered by the use of the Location + response field in a 2xx response). On the other hand, response codes + 307 (Temporary Redirect) and 308 (Permanent Redirect) can be used to + request the user agent to redo the QUERY request on the URI specified + by the Location field. Various non-normative examples of successful + QUERY responses are illustrated in Appendix A. + +2.3. Conditional Requests + + A conditional QUERY requests that the selected representation (i.e., + the query results, after any content negotiation) be returned in the + response only under the circumstances described by the conditional + header field(s), as defined in Section 13 of [HTTP]. + +2.4. Caching + + The response to a QUERY method is cacheable; a cache MAY use it to + satisfy subsequent QUERY requests as per Section 4 of + [HTTP-CACHING]). + + The cache key for a query (see Section 2 of [HTTP-CACHING]) MUST + incorporate the request content. 
When doing so, caches SHOULD first + normalize request content to remove semantically insignificant + differences, thereby improving cache efficiency, by: + + * Removing content encoding(s) + + * Normalizing based upon knowledge of format conventions, as + indicated by the any media type suffix in the request's Content- + Type field (e.g., "+json") + + * Normalizing based upon knowledge of the semantics of the content + itself, as indicated by the request's Content-Type field. + + Note that any such normalization is performed solely for the purpose + of generating a cache key; it does not change the request itself. + +3. The "Accept-Query" Header Field + + The "Accept-Query" response header field can be used by a resource to + directly signal support for the QUERY method while identifying the + specific query format media type(s) that may be used. + + "Accept-Query" contains a list of media ranges (Section 12.5.1 of + [HTTP]) using "Structured Fields" syntax ([STRUCTURED-FIELDS]). + Media ranges are represented by a List Structured Header Field of + either Tokens or Strings, containing the media range value without + parameters. Parameters, if any, are mapped to Parameters of type + String. + + The choice of Token vs. String is semantically insignificant. That + is, recipients MAY convert Tokens to Strings, but MUST NOT process + them differently based on the received type. + + Media types do not exactly map to Tokens, for instance they allow a + leading digit. In cases like these, the String format needs to be + used. + + The only supported uses of wildcards are "*/*", which matches any + type, or "xxxx/*", which matches any subtype of the indicated type. + + The order of types listed in the field value is not significant. + + The only allowed format for parameters is String. + + Accept-Query's value applies to every URI on the server that shares + the same path; in other words, the query component is ignored. 
If + requests to the same resource return different Accept-Query values, + the most recently received fresh value (per Section 4.2 of + [HTTP-CACHING]) is used. + + Example: + + Accept-Query: "application/jsonpath", application/sql;charset="UTF-8" + + Although the syntax for this field appears to be similar to other + fields, such as "Accept" (Section 12.5.1 of [HTTP]), it is a + Structured Field and thus MUST be processed as specified in Section 4 + of [STRUCTURED-FIELDS]. + +4. Security Considerations + + The QUERY method is subject to the same general security + considerations as all HTTP methods as described in [HTTP]. + + It can be used as an alternative to passing request information in + the URI (e.g., in the query section). This is preferred in some + cases, as the URI is more likely to be logged or otherwise processed + by intermediaries than the request content. If a server creates a + temporary resource to represent the results of a QUERY request (e.g., + for use in the Location or Content-Location field) and the request + contains sensitive information that cannot be logged, then the URI of + this resource SHOULD be chosen such that it does not include any + sensitive portions of the original request content. + + Caches that normalize QUERY content incorrectly or in ways that are + significantly different than how the resource processes the content + can return the incorrect response if normalization results in a false + positive. + + A QUERY request from user agents implementing CORS (Cross-Origin + Resource Sharing) will require a "preflight" request, as QUERY does + not belong to the set of CORS-safelisted methods (see "Methods + (https://fetch.spec.whatwg.org/#methods)" in [FETCH]). + +5. IANA Considerations + +5.1. Registration of QUERY method + + IANA is requested to add the QUERY method to the HTTP Method Registry + at (see Section 16.3.1 + of [HTTP]). 
+ + +=============+======+============+===============+ + | Method Name | Safe | Idempotent | Specification | + +=============+======+============+===============+ + | QUERY | Yes | Yes | Section 2 | + +-------------+------+------------+---------------+ + + Table 2 + +5.2. Registration of Accept-Query field + + IANA is requested to add the Accept-Query field to the HTTP Field + Name Registry at (see + Section 16.1.1 of [HTTP]). + + +==============+===========+============+================+==========+ + | Field Name | Status | Structured | Reference | Comments | + | | | Type | | | + +==============+===========+============+================+==========+ + | Accept-Query | permanent | List | Section 3 | | + | | | | of this | | + | | | | document. | | + +--------------+-----------+------------+----------------+----------+ + + Table 3 + +6. Normative References + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + . + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, . + + [HTTP] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Semantics", STD 97, RFC 9110, June 2022, + . + + [HTTP-CACHING] + Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Caching", STD 98, RFC 9111, June 2022, + . + + [STRUCTURED-FIELDS] + Nottingham, M. and P-H. Kamp, "Structured Field Values for + HTTP", RFC 9651, September 2024, + . + +7. Informative References + + [FETCH] WHATWG, "FETCH", . + +Appendix A. Examples + + The non-normative examples in this section make use of a simple, + hypothetical plain-text based query syntax based on SQL with results + returned as comma-separated values. This is done for illustration + purposes only. Implementations are free to use any format they wish + on both the request and response. + +A.1. 
Simple QUERY with a Direct Response + + A simple query with a direct response: + + QUERY /contacts HTTP/1.1 + Host: example.org + Content-Type: application/sql + Accept: text/csv + + select surname, givenname, email limit 10 + + Response: + + HTTP/1.1 200 OK + Content-Type: text/csv + + surname, givenname, email + Smith, John, john.smith@example.org + Jones, Sally, sally.jones@example.com + Dubois, Camille, camille.dubois@example.net + +A.2. Simple QUERY with a Direct Response and Location Fields + + A simple query with a direct response: + + QUERY /contacts HTTP/1.1 + Host: example.org + Content-Type: application/sql + Accept: text/csv + + select surname, givenname, email limit 10 + + Response: + + HTTP/1.1 200 OK + Content-Type: text/csv + Content-Location: /contacts/responses/42 + Location: /contacts/queries/17 + + surname, givenname, email + Smith, John, john.smith@example.org + Jones, Sally, sally.jones@example.com + Dubois, Camille, camille.dubois@example.net + + A subsequent GET request on /contacts/responses/42 would return the + same response, until the server decides to remove that resource. + + A GET request on /contacts/queries/17 however would execute the same + query again, and return a fresh result for that query: + + GET /contacts/queries/17 HTTP/1.1 + Host: example.org + Accept: text/csv + + Response: + + HTTP/1.1 200 OK + Content-Type: text/csv + Content-Location: /contacts/responses/43 + + surname, givenname, email + Jones, Sally, sally.jones@example.com + Dubois, Camille, camille.dubois@example.net + +A.3. 
Simple QUERY with Indirect Response (303 See Other) + + A simple query with an Indirect Response (303 See Other): + + QUERY /contacts HTTP/1.1 + Host: example.org + Content-Type: application/sql + Accept: text/csv + + select surname, givenname, email limit 10 + + Response: + + HTTP/1.1 303 See Other + Location: /contacts/query123 + + Retrieval of the Query Response: + + GET /contacts/query123 HTTP/1.1 + Host: example.org + + Response: + + HTTP/1.1 200 OK + Content-Type: text/csv + + surname, givenname, email + Smith, John, john.smith@example.org + Jones, Sally, sally.jones@example.com + Dubois, Camille, camille.dubois@example.net + +A.4. Simple QUERY with Redirect Response (308 Moved Permanently) + + A simple query being redirected: + + QUERY /contacts HTTP/1.1 + Host: example.org + Content-Type: application/sql + Accept: text/csv + + select surname, givenname, email limit 10 + + Response: + + HTTP/1.1 308 Moved Permanently + Location: /morecontacts + + Redirected request: + + QUERY /morecontacts HTTP/1.1 + Host: example.org + Content-Type: application/sql + Accept: text/csv + + select surname, givenname, email limit 10 + + Response: + + HTTP/1.1 200 OK + Content-Type: text/csv + + surname, givenname, email + Smith, John, john.smith@example.org + Jones, Sally, sally.jones@example.com + Dubois, Camille, camille.dubois@example.net + +Appendix B. Change Log + + This section is to be removed before publishing as an RFC. + +B.1. 
Since draft-ietf-httpbis-safe-method-w-body-00 + + * Use "example/query" media type instead of undefined "text/query" + (https://github.com/httpwg/http-extensions/issues/1450) + + * In Section 3, adjust the grammar to just define the field value + (https://github.com/httpwg/http-extensions/issues/1470) + + * Update to latest HTTP core spec, and adjust terminology + accordingly (https://github.com/httpwg/http-extensions/ + issues/1473) + + * Reference RFC 8174 and markup bcp14 terms + (https://github.com/httpwg/http-extensions/issues/1497) + + * Update HTTP reference (https://github.com/httpwg/http-extensions/ + issues/1524) + + * Relax restriction of generic XML media type in request content + (https://github.com/httpwg/http-extensions/issues/1535) + +B.2. Since draft-ietf-httpbis-safe-method-w-body-01 + + * Add minimal description of cacheability + (https://github.com/httpwg/http-extensions/issues/1552) + + * Use "QUERY" as method name (https://github.com/httpwg/http- + extensions/issues/1614) + + * Update HTTP reference (https://github.com/httpwg/http-extensions/ + issues/1669) + +B.3. Since draft-ietf-httpbis-safe-method-w-body-02 + + * In Section 3, slightly rephrase statement about significance of + ordering (https://github.com/httpwg/http-extensions/issues/1896) + + * Throughout: use "content" instead of "payload" or "body" + (https://github.com/httpwg/http-extensions/issues/1915) + + * Updated references (https://github.com/httpwg/http-extensions/ + issues/2157) + +B.4. Since draft-ietf-httpbis-safe-method-w-body-03 + + * In Section 3, clarify scope (https://github.com/httpwg/http- + extensions/issues/1913) + +B.5. 
Since draft-ietf-httpbis-safe-method-w-body-04 + + * Describe role of Content-Location and Location fields + (https://github.com/httpwg/http-extensions/issues/1745) + + * Added Mike Bishop as author (https://github.com/httpwg/http- + extensions/issues/2837) + + * Use "target URI" instead of "effective request URI" + (https://github.com/httpwg/http-extensions/issues/2883) + +B.6. Since draft-ietf-httpbis-safe-method-w-body-05 + + * Updated language and examples about redirects and method rewriting + (https://github.com/httpwg/http-extensions/issues/1917) + + * Add QUERY example to introduction (https://github.com/httpwg/http- + extensions/issues/2171) + + * Update "Sensitive information in QUERY URLs" + (https://github.com/httpwg/http-extensions/issues/2853) + + * Field registration for "Accept-Query" (https://github.com/httpwg/ + http-extensions/issues/2903) + +B.7. Since draft-ietf-httpbis-safe-method-w-body-06 + + * Improve language about sensitive information in URIs + (https://github.com/httpwg/http-extensions/issues/1895) + + * Guidance about what's possible with GET wrt URI length + (https://github.com/httpwg/http-extensions/issues/1914) + + * Clarified description of conditional queries + (https://github.com/httpwg/http-extensions/issues/1917) + + * Editorial changes to Introduction (ack Will Hawkins, + https://github.com/httpwg/http-extensions/pull/2859) + + * Added Security Consideration with respect to Normalization + (https://github.com/httpwg/http-extensions/issues/2896) + + * Added CORS considerations (https://github.com/httpwg/http- + extensions/issues/2898) + + * Make Accept-Query a Structured Field (https://github.com/httpwg/ + http-extensions/issues/2934) + + * SQL media type is application/sql (RFC6922) + (https://github.com/httpwg/http-extensions/issues/2936) + + * Added overview table to introduction 
(https://github.com/httpwg/ + http-extensions/issues/2951) + + * Reference HTTP spec for terminology (https://github.com/httpwg/ + http-extensions/issues/2953) + + * Moved BCP14 related text into subsection + (https://github.com/httpwg/http-extensions/issues/2954) + + * Move examples into index (https://github.com/httpwg/http- + extensions/issues/2957) + +B.8. Since draft-ietf-httpbis-safe-method-w-body-07 + + None yet. + +Authors' Addresses + + Julian Reschke + greenbytes GmbH + Hafenweg 16 + 48155 Münster + Germany + Email: julian.reschke@greenbytes.de + URI: https://greenbytes.de/tech/webdav/ + + + Ashok Malhotra + Email: malhotrasahib@gmail.com + + + James M Snell + Email: jasnell@gmail.com + + + Mike Bishop + Akamai + Email: mbishop@evequefou.be diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-secondary-server-certs.html b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-secondary-server-certs.html new file mode 100644 index 000000000..ef92b7fdf --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-secondary-server-certs.html @@ -0,0 +1,1907 @@ + + + + + + +Secondary Certificate Authentication of HTTP Servers + + + + + + + + + + + + + + + + + + + + + + + + + +
Internet-DraftHTTP Server Secondary Cert AuthJanuary 2025
Gorbaty & BishopExpires 11 July 2025[Page]
+
+
+
+
Workgroup:
+
HTTP
+
Internet-Draft:
+
draft-ietf-httpbis-secondary-server-certs-latest
+
Published:
+
+ +
+
Intended Status:
+
Standards Track
+
Expires:
+
+
Authors:
+
+
+
E. Gorbaty, Ed. +
+
Apple
+
+
+
M. Bishop, Ed. +
+
Akamai
+
+
+
+
+

Secondary Certificate Authentication of HTTP Servers

+
+

Abstract

+

This document defines a way for HTTP/2 and HTTP/3 servers to send additional +certificate-based credentials after a TLS connection is established, based +on TLS Exported Authenticators.

+
+
+

+About This Document +

+

This note is to be removed before publishing as an RFC.

+

+ The latest revision of this draft can be found at https://httpwg.org/http-extensions/draft-ietf-httpbis-secondary-server-certs.html. + Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ietf-httpbis-secondary-server-certs/.

+

+ Discussion of this document takes place on the + HTTP Working Group mailing list (mailto:ietf-http-wg@w3.org), + which is archived at https://lists.w3.org/Archives/Public/ietf-http-wg/.

+

Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/secondary-server-certs.

+
+
+
+

+Status of This Memo +

+

+ This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79.

+

+ Internet-Drafts are working documents of the Internet Engineering Task + Force (IETF). Note that other groups may also distribute working + documents as Internet-Drafts. The list of current Internet-Drafts is + at https://datatracker.ietf.org/drafts/current/.

+

+ Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress."

+

+ This Internet-Draft will expire on 11 July 2025.

+
+
+ + +
+
+

+1. Introduction +

+

HTTP [HTTP] clients need to know that the content they receive on a +connection comes from the origin from which they intended to retrieve it. The +traditional form of server authentication in HTTP has been in the form of a +single X.509 certificate provided during the TLS [TLS13] handshake.

+

TLS supports one server and one client certificate on a connection. These +certificates may contain multiple identities, but only one certificate may be +provided.

+

Many HTTP servers host content from several origins. HTTP/2 [H2] and +HTTP/3 [H3] permit clients to reuse an existing HTTP connection to a +server provided that the secondary origin is also in the certificate provided +during the TLS handshake. In many cases, servers choose to maintain separate +certificates for different origins but still desire the benefits of a shared +HTTP connection. This document defines a capability for servers to use and +to authenticate with those separate certificates over a shared connection.

+

The ability to maintain separate certificates for different origins can also +allow proxies that cache content from secondary origins to communicate to +clients that they can service some of those origins directly, allowing the +proxy to behave as a TLS-terminating reverse proxy for those origins instead of +establishing a TLS-encrypted tunnel through the proxy.

+
+
+

+1.1. Server Certificate Authentication +

+

Section 9.1.1 of [H2] and Section 3.3 of [H3] describe how connections may +be used to make requests from multiple origins as long as the server is +authoritative for both. A server is considered authoritative for an origin if +DNS resolves the origin to the IP address of the server and (for TLS) if the +certificate presented by the server contains the origin in the Subject +Alternative Names field.

+

[ALTSVC] enables a step of abstraction from the DNS resolution. If +both hosts have provided an Alternative Service at hostnames which resolve to +the IP address of the server, they are considered authoritative just as if DNS +resolved the origin itself to that address. However, the server's one TLS +certificate is still required to contain the name of each origin in question.

+

[ORIGIN] relaxes the requirement to perform the DNS lookup if already +connected to a server with an appropriate certificate which claims support for +a particular origin.

+

Servers which host many origins often would prefer to have separate certificates +for some sets of origins. This may be for ease of certificate management +(the ability to separately revoke or renew them), due to different sources of +certificates (a CDN acting on behalf of multiple origins), or other factors +which might drive this administrative decision. Clients connecting to such +origins cannot currently reuse connections, even if both client and server +would prefer to do so.

+

Because the TLS SNI extension is exchanged in the clear, clients might also +prefer to retrieve certificates inside the encrypted context. When this +information is sensitive, it might be advantageous to request a general-purpose +certificate or anonymous ciphersuite at the TLS layer, while acquiring the +"real" certificate in HTTP after the connection is established.

+
+
+
+
+

+1.2. TLS Exported Authenticators +

+

TLS Exported Authenticators [EXPORTED-AUTH] are structured messages +that can be exported by either party of a TLS connection and validated by the +other party. Given an established TLS connection, an authenticator message can +be constructed proving possession of a certificate and a corresponding private +key. The mechanisms that this document defines are primarily focused on the +server's ability to generate TLS Exported Authenticators.

+

Each Authenticator is computed using a Handshake Context and Finished MAC Key +derived from the TLS session. The Handshake Context is identical for both +parties of the TLS connection, while the Finished MAC Key is dependent on +whether the Authenticator is created by the client or the server.

+

Successfully verified Authenticators result in certificate chains, with verified +possession of the corresponding private key, which can be supplied into a +collection of available certificates. Likewise, descriptions of desired +certificates can also be supplied into these collections.

+
+
+
+
+

+1.3. HTTP-Layer Certificate Authentication +

+

This document defines HTTP/2 and HTTP/3 CERTIFICATE frames (Section 5) to +carry the relevant certificate messages, enabling certificate-based +authentication of servers independent of TLS version. This mechanism can be +implemented at the HTTP layer without breaking the existing interface between +HTTP and applications above it.

+

TLS Exported Authenticators [EXPORTED-AUTH] make it possible for +HTTP/2 and HTTP/3 servers to send certificate frames which can be used to prove +the server's authenticity for multiple origins.

+

This document additionally defines SETTINGS parameters for HTTP/2 and HTTP/3 +(Section 4) that allow the client and server to indicate support for +HTTP-Layer certificate authentication.

+
+
+
+
+
+
+

+2. Conventions and Definitions +

+

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", +"MAY", and "OPTIONAL" in this document are to be interpreted as +described in BCP 14 [RFC2119] [RFC8174] when, and only when, they +appear in all capitals, as shown here.

+
+
+
+
+

+3. Discovering Additional Certificates at the HTTP Layer +

+

A certificate chain with proof of possession of the private key corresponding to +the end-entity certificate is sent as a sequence of CERTIFICATE frames (see +Section 5.1, Section 5.2) to the client. Once the holder of a certificate +has sent the chain and proof, this certificate chain is cached by the recipient +and available for future use.
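The caching behavior described above can be sketched as a small lookup structure on the receiving side. This is an illustrative sketch only, not part of the specification; the class and method names are invented for the example, and the hostnames a chain "covers" would in practice come from its Subject Alternative Names.

```python
class SecondaryCertCache:
    """Holds certificate chains whose proof of possession has already
    been verified on this connection, for future use."""

    def __init__(self):
        self._chains = []

    def add_verified_chain(self, chain, hostnames):
        # `hostnames` are the names the verified chain covers
        # (e.g. its Subject Alternative Name entries).
        self._chains.append((frozenset(hostnames), chain))

    def chain_for(self, hostname):
        # Return a previously supplied chain covering `hostname`, if any.
        for names, chain in self._chains:
            if hostname in names:
                return chain
        return None
```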

+
+
+

+3.1. Indicating Support for HTTP-Layer Certificate Authentication +

+

The SETTINGS_HTTP_SERVER_CERT_AUTH parameters for HTTP/2 and HTTP/3 are +defined in Section 4 so that clients and servers can indicate support for +secondary certificate authentication of servers.

+

HTTP/2 and HTTP/3 endpoints who wish to indicate support for HTTP-Layer +certificate authentication MUST send a SETTINGS_HTTP_SERVER_CERT_AUTH +parameter set to "1" in their SETTINGS frame. Endpoints MUST NOT use any of the +authentication functionality described in this document unless the parameter has +been negotiated by both sides.

+

Endpoints MUST NOT send a SETTINGS_HTTP_SERVER_CERT_AUTH parameter with a +value of 0 after previously sending a value of 1.

+

SETTINGS_HTTP_SERVER_CERT_AUTH indicates that servers are able to offer +additional certificates to demonstrate control over other origin hostnames, and +that clients are able to make requests for hostnames received in a TLS Exported +Authenticator that the server sends.
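The negotiation rules in this section can be summarized in a short sketch. This is illustrative only, not from any real HTTP library; the class simply encodes the MUST-level rules above: the setting value is limited to 0 or 1, an endpoint cannot send 0 after previously sending 1, and the feature may be used only once both sides have advertised support.

```python
class CertAuthNegotiation:
    """Tracks SETTINGS_HTTP_SERVER_CERT_AUTH state for one endpoint."""

    def __init__(self):
        self.sent = 0       # value this endpoint advertised
        self.received = 0   # value the peer advertised

    def on_local_settings(self, value):
        if value not in (0, 1):
            raise ValueError("SETTINGS_HTTP_SERVER_CERT_AUTH MUST be 0 or 1")
        if self.sent == 1 and value == 0:
            # MUST NOT send 0 after previously sending 1.
            raise ValueError("cannot downgrade setting from 1 to 0")
        self.sent = value

    def on_peer_settings(self, value):
        if value not in (0, 1):
            raise ValueError("SETTINGS_HTTP_SERVER_CERT_AUTH MUST be 0 or 1")
        if self.received == 1 and value == 0:
            raise ValueError("peer downgraded setting from 1 to 0")
        self.received = value

    @property
    def negotiated(self):
        # Authentication functionality may be used only when the
        # parameter has been negotiated by both sides.
        return self.sent == 1 and self.received == 1
```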

+
+
+
+
+

+3.2. Making Certificates Available +

+

When both peers have advertised support for HTTP-layer certificates in a given +direction as in Section 3.1, the indicated endpoint can supply +additional certificates into the connection at any time. That is, if both +endpoints have sent SETTINGS_HTTP_SERVER_CERT_AUTH and validated the value +received from the peer, the server may send certificates spontaneously, at any +time, as described by the Spontaneous Server Authentication message sequence +in Section 3 of [EXPORTED-AUTH].

+

This means that a server which supports secondary certificate +authentication can, upon receiving SETTINGS_HTTP_SERVER_CERT_AUTH from the +client, enqueue certificates immediately following the received +SETTINGS frame.

+

Certificates supplied by servers can be considered by clients without further +action by the server. A server SHOULD NOT send certificates which do not cover +origins which it is prepared to service on the current connection, and SHOULD NOT send them if the client has not indicated support with +SETTINGS_HTTP_SERVER_CERT_AUTH.

+

A client MUST NOT send certificates to the server. The server SHOULD close the +connection upon receipt of a CERTIFICATE frame from a client.

+
+
+
+
+Client                                        Server
+   <-- (stream 0 / control stream) CERTIFICATE --
+   ...
+   -- (stream N) GET /from-new-origin ---------->
+   <----------------------- (stream N) 200 OK ---
+
+
+
Figure 1: +Simple unprompted server authentication +
+
+

A server MAY send a CERTIFICATE immediately after sending its SETTINGS. +However, it MAY also send certificates at any time later. For example, a proxy +might discover that a client is interested in an origin that it can reverse +proxy at the time that a client sends a CONNECT request. It can then send +certificates for those origins to allow for TLS-terminated reverse proxying to +those origins for the remainder of the connection lifetime. +Figure 2 illustrates this behavior.

+
+
+
+
+Client                                        Server
+   -- (stream N) CONNECT /to-new-origin -------->
+   <-- (stream 0 / control stream) CERTIFICATE --
+   <-- (stream 0 / control stream) 200 OK -------
+   ...
+   -- (stream M) GET /to-new-origin ------------>
+   <--- (stream M, direct from server) 200 OK ---
+
+
+
Figure 2: +Reverse proxy server authentication +
+
+
+
+
+
+
+
+

+4. SETTINGS_HTTP_SERVER_CERT_AUTH +

+

SETTINGS parameters are defined separately for HTTP/2 and HTTP/3 below.

+
+
+

+4.1. The SETTINGS_HTTP_SERVER_CERT_AUTH HTTP/2 SETTINGS Parameter +

+

This document adds a new HTTP/2 SETTINGS(0xTBD) parameter to those defined by +Section 6.5.2 of [H2].

+

The new parameter name is SETTINGS_HTTP_SERVER_CERT_AUTH. The value of the +parameter MUST be 0 or 1.

+

The usage of this parameter is described in Section 3.1.

+
+
+
+
+

+4.2. The SETTINGS_HTTP_SERVER_CERT_AUTH HTTP/3 SETTINGS Parameter +

+

This document adds a new HTTP/3 SETTINGS(0xTBD) parameter to those defined by +Section 7.2.4.1 of [H3].

+

The new parameter name is SETTINGS_HTTP_SERVER_CERT_AUTH. The value of the +parameter MUST be 0 or 1.

+

The usage of this parameter is described in Section 3.1.

+
+
+
+
+
+
+

+5. CERTIFICATE frame +

+

The CERTIFICATE frame contains an exported authenticator message from the TLS +layer that provides a chain of certificates and associated extensions, proving +possession of the private key corresponding to the end-entity certificate.

+

A server sends a CERTIFICATE frame on stream 0 for HTTP/2 and on the control +stream for HTTP/3. The client is permitted to make subsequent requests for +resources upon receipt of a CERTIFICATE frame without further action from the +server.

+

Upon receiving a complete series of CERTIFICATE frames, the receiver may +validate the Exported Authenticator value by using the exported authenticator +API. This returns either an error indicating that the message was invalid or +the certificate chain and extensions used to create the message.

+
+
+

+5.1. HTTP/2 CERTIFICATE frame +

+

A CERTIFICATE frame in HTTP/2 (type=0xTBD) carries a TLS Exported Authenticator +that clients can use to authenticate secondary origins from a sending server.

+

The CERTIFICATE frame MUST be sent on stream 0. A CERTIFICATE frame received on +any other stream MUST NOT be used for server authentication.

+
+
+
+CERTIFICATE Frame {
+  Length (24),
+  Type (8) = 0xTBD,
+
+  Unused Flags (8),
+
+  Reserved (1),
+  Stream Identifier (31) = 0,
+
+  Authenticator (..),
+}
+
+
+
Figure 3: +HTTP/2 CERTIFICATE Frame +
+

The Length, Type, Unused Flag(s), Reserved, and Stream Identifier fields are +described in Section 4 of [H2].

+

The CERTIFICATE frame does not define any flags.

+

The authenticator field is a portion of the opaque data returned from the TLS +connection exported authenticator authenticate API. See Section 5.3 for more +details on the input to this API.

+

The CERTIFICATE frame applies to the connection, not a specific stream. An +endpoint MUST treat a CERTIFICATE frame with a stream identifier other than +0x00 as a connection error.
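The layout in Figure 3 can be exercised with a short serialization sketch. This is not a normative encoding, and the frame-type code point below is a placeholder (the real value is TBD); it simply packs the 24-bit length, 8-bit type, 8-bit unused flags, and the reserved bit plus 31-bit stream identifier (always 0) described above.

```python
import struct

FRAME_TYPE_CERTIFICATE = 0xB8  # placeholder: actual code point is TBD


def encode_h2_certificate_frame(authenticator: bytes) -> bytes:
    """Serialize an HTTP/2 CERTIFICATE frame per Figure 3."""
    if len(authenticator) >= 2 ** 24:
        raise ValueError("authenticator too large for a 24-bit length")
    header = struct.pack(">I", len(authenticator))[1:]  # 24-bit length
    header += struct.pack(">BBI",
                          FRAME_TYPE_CERTIFICATE,  # Type (8)
                          0,                       # Unused Flags (8)
                          0)                       # Reserved (1) + Stream ID (31) = 0
    return header + authenticator


def decode_h2_certificate_frame(data: bytes):
    """Parse the 9-byte frame header and return (type, authenticator)."""
    length = int.from_bytes(data[:3], "big")
    frame_type, _flags, stream = struct.unpack(">BBI", data[3:9])
    stream &= 0x7FFFFFFF  # drop the reserved bit
    if frame_type == FRAME_TYPE_CERTIFICATE and stream != 0:
        # A CERTIFICATE frame on a non-zero stream is a connection error.
        raise ConnectionError("CERTIFICATE frame on non-zero stream")
    return frame_type, data[9:9 + length]
```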

+
+
+
+
+

+5.2. HTTP/3 CERTIFICATE frame +

+

A CERTIFICATE frame in HTTP/3 (type=0xTBD) carries a TLS Exported Authenticator +that clients can use to authenticate secondary origins from a sending server.

+

The CERTIFICATE frame MUST be sent on the control stream. A CERTIFICATE frame +received on any other stream MUST NOT be used for server authentication.

+
+
+
+CERTIFICATE Frame {
+  Type (i) = 0xTBD,
+  Length (i),
+  Authenticator (...),
+}
+
+
+
Figure 4: +HTTP/3 CERTIFICATE Frame +
+

The Type and Length fields are described in Section 7.1 of [H3].

+

The authenticator field is a portion of the opaque data returned from the TLS +connection exported authenticator authenticate API. See Section 5.3 for more +details on the input to this API.

+

The CERTIFICATE frame applies to the connection, not a specific stream. An +endpoint MUST treat a CERTIFICATE frame received on any stream other than the +control stream as a connection error.
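The HTTP/3 layout in Figure 4 uses QUIC variable-length integers for the Type and Length fields. The sketch below encodes such a frame; the frame-type code point is a placeholder (the real value is TBD), and the varint encoder follows the scheme in RFC 9000, Section 16.

```python
def encode_varint(v: int) -> bytes:
    """QUIC variable-length integer encoding (RFC 9000, Section 16)."""
    if v < 2 ** 6:
        return v.to_bytes(1, "big")
    if v < 2 ** 14:
        return (v | 0x4000).to_bytes(2, "big")
    if v < 2 ** 30:
        return (v | 0x80000000).to_bytes(4, "big")
    if v < 2 ** 62:
        return (v | 0xC000000000000000).to_bytes(8, "big")
    raise ValueError("value too large for a varint")


H3_FRAME_TYPE_CERTIFICATE = 0x2F5C  # placeholder: actual code point is TBD


def encode_h3_certificate_frame(authenticator: bytes) -> bytes:
    # Type (i), Length (i), Authenticator (...) per Figure 4.
    return (encode_varint(H3_FRAME_TYPE_CERTIFICATE)
            + encode_varint(len(authenticator))
            + authenticator)
```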

+
+
+
+
+

+5.3. Exported Authenticator Characteristics +

+

The Exported Authenticator API defined in [EXPORTED-AUTH] takes as input a +request, a set of certificates, and supporting information about the +certificate (OCSP, SCT, etc.). The result is an opaque token which is used +when generating the CERTIFICATE frame.

+

Upon receipt of a CERTIFICATE frame, an endpoint which has negotiated support +for secondary certificates MUST perform the following steps to validate the +token it contains:

+
    +
  • +

    Using the get context API, retrieve the certificate_request_context used +to generate the authenticator, if any. Because the certificate_request_context +for spontaneous server certificates is chosen by the server, the usage of +the certificate_request_context is implementation-dependent. For details, +see Section 5 of [EXPORTED-AUTH].

    +
  • +
  • +

    Use the validate API to confirm the validity of the authenticator with +regard to the generated request, if any.

    +
  • +
+

If the authenticator cannot be validated, this SHOULD be treated as a connection +error.

+

Once the authenticator is accepted, the endpoint can perform any other checks +for the acceptability of the certificate itself.
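The validation steps above can be sketched as follows. The get_context and validate callables stand in for a TLS stack's Exported Authenticator API [EXPORTED-AUTH]; they are assumptions for illustration, not a real library interface.

```python
class CertificateUnreadable(Exception):
    """Maps to the CERTIFICATE_UNREADABLE connection error."""


def process_certificate_frame(authenticator, get_context, validate):
    """Receiver-side validation of a CERTIFICATE frame's authenticator."""
    # Step 1: recover the certificate_request_context. For spontaneous
    # server certificates it is server-chosen, so its usage is
    # implementation-dependent.
    context = get_context(authenticator)

    # Step 2: confirm validity of the authenticator. A validation
    # failure SHOULD be treated as a connection error.
    result = validate(authenticator, context)
    if result is None:
        raise CertificateUnreadable()

    chain, extensions = result
    # Any further acceptability checks on the certificate itself
    # (expiry, revocation, name matching) happen after this point.
    return chain, extensions
```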

+
+
+
+
+
+
+

+6. Indicating Failures During HTTP-Layer Certificate Authentication +

+

Because this document permits certificates to be exchanged at the HTTP framing +layer instead of the TLS layer, several certificate-related errors which are +defined at the TLS layer might now occur at the HTTP framing layer.

+

There are two classes of errors which might be encountered, and they are handled +differently.

+
+
+

+6.1. Misbehavior +

+

This category of errors could indicate a peer failing to follow requirements in +this document or might indicate that the connection is not fully secure. These +errors are fatal to the stream or connection, as appropriate.

+
+
CERTIFICATE_UNREADABLE (0xERROR-TBD):
+
+

An exported authenticator could not be validated.

+
+
+
+
+
+
+
+

+6.2. Invalid Certificates +

+

Unacceptable certificates (expired, revoked, or insufficient to satisfy the +request) are not treated as stream or connection errors. This is typically not +an indication of a protocol failure. Clients SHOULD establish a new connection +in an attempt to reach an authoritative server if they deem a +certificate from the server unacceptable.

+
+
+
+
+
+
+

+7. Security Considerations +

+

This mechanism defines an alternate way to obtain server certificates +other than in the initial TLS handshake. While the signature of exported +authenticator values is expected to be equally secure, it is important to +recognize that a vulnerability in this code path is at least equal to a +vulnerability in the TLS handshake.

+
+
+

+7.1. Impersonation +

+

This mechanism could increase the impact of a key compromise. Rather than +needing to subvert DNS or IP routing in order to use a compromised certificate, +a malicious server now only needs a client to connect to some HTTPS site +under its control in order to present the compromised certificate. Clients +SHOULD consult DNS for hostnames presented in secondary certificates if they +would have done so for the same hostname if it were present in the primary +certificate.

+

As recommended in [ORIGIN], clients opting not to consult DNS ought to employ +some alternative means to increase confidence that the certificate is +legitimate, such as an ORIGIN frame.

+

As noted in the Security Considerations of [EXPORTED-AUTH], it is difficult to +formally prove that an endpoint is jointly authoritative over multiple +certificates, rather than individually authoritative on each certificate. As a +result, clients MUST NOT assume that because one origin was previously +colocated with another, those origins will be reachable via the same endpoints +in the future. Clients MUST NOT consider previous secondary certificates to be +validated after TLS session resumption. Servers MAY re-present certificates +if a TLS Session is resumed.

+
+
+
+
+

+7.2. Fingerprinting +

+

This document defines a mechanism which could be used to probe servers for origins +they support, but it opens no new attack that was not already possible by +making repeated TLS connections with different SNI values.

+
+
+
+
+

+7.3. Persistence of Service +

+

CNAME records in the DNS are frequently used to delegate authority for an origin +to a third-party provider. This delegation can be changed without notice, even +to the third-party provider, simply by modifying the CNAME record in question.

+

After the owner of the domain has redirected traffic elsewhere by changing the +CNAME, new connections will not arrive for that origin, but connections which +are properly directed to this provider for other origins would continue to +claim control of this origin (via Secondary Certificates). This is proper +behavior based on the third-party provider's configuration, but would likely +not be what is intended by the owner of the origin.

+

This is not an issue which can be mitigated by the protocol, but something about +which third-party providers SHOULD educate their customers before using the +features described in this document.

+
+
+
+
+

+7.4. Confusion About State +

+

Implementations need to be aware of the potential for confusion about the state +of a connection. The presence or absence of a validated certificate can change +during the processing of a request, potentially multiple times, as +CERTIFICATE frames are received. A client that uses certificate +authentication needs to be prepared to reevaluate the authorization state of a +request as the set of certificates changes.

+

Behavior for TLS-terminated reverse proxies is also worth considering. If a +server which situationally reverse-proxies wishes the client to treat a +request made prior to the receipt of certificates as TLS-terminated, or instead +wishes the client to start a new tunnel, this document does not currently +define a formal mechanism to express that intention.

+
+
+
+
+
+
+

+8. IANA Considerations +

+

This document registers the CERTIFICATE frame type and +SETTINGS_HTTP_SERVER_CERT_AUTH setting for both [H2] and [H3].

+
+
+

+8.1. Frame Types +

+

This specification registers the following entry in the "HTTP/2 Frame Type" +registry defined in [H2]:

+

Code: TBD

+

Frame Type: CERTIFICATE

+

Reference: This document

+

This specification registers the following entry in the "HTTP/3 Frame Types" +registry established by [H3]:

+

Value: TBD

+

Frame Type: CERTIFICATE

+

Status: permanent

+

Reference: This document

+

Change Controller: IETF

+

Contact: ietf-http-wg@w3.org

+
+
+
+
+

+8.2. Settings Parameters +

+

This specification registers the following entry in the "HTTP/2 Settings" +registry defined in [H2]:

+

Code: TBD

+

Name: SETTINGS_HTTP_SERVER_CERT_AUTH

+

Initial Value: 0

+

Reference: This document

+

This specification registers the following entry in the "HTTP/3 Settings" +registry defined in [H3]:

+

Code: TBD

+

Name: SETTINGS_HTTP_SERVER_CERT_AUTH

+

Default: 0

+

Reference: This document

+

Change Controller: IETF

+

Contact: ietf-http-wg@w3.org

+
+
+
+
+
+
+

+9. References +

+
+
+

+9.1. Normative References +

+
+
[EXPORTED-AUTH]
+
+Sullivan, N., "Exported Authenticators in TLS", RFC 9261, DOI 10.17487/RFC9261, , <https://www.rfc-editor.org/rfc/rfc9261>.
+
+
[H2]
+
+Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113, DOI 10.17487/RFC9113, , <https://www.rfc-editor.org/rfc/rfc9113>.
+
+
[H3]
+
+Bishop, M., Ed., "HTTP/3", RFC 9114, DOI 10.17487/RFC9114, , <https://www.rfc-editor.org/rfc/rfc9114>.
+
+
[HTTP]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, , <https://www.rfc-editor.org/rfc/rfc9110>.
+
+
[RFC2119]
+
+Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/rfc/rfc2119>.
+
+
[RFC8174]
+
+Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/rfc/rfc8174>.
+
+
[TLS13]
+
+Rescorla, E., "The Transport Layer Security (TLS) Protocol Version 1.3", RFC 8446, DOI 10.17487/RFC8446, , <https://www.rfc-editor.org/rfc/rfc8446>.
+
+
+
+
+
+
+

+9.2. Informative References +

+
+
[ALTSVC]
+
+Nottingham, M., McManus, P., and J. Reschke, "HTTP Alternative Services", RFC 7838, DOI 10.17487/RFC7838, , <https://www.rfc-editor.org/rfc/rfc7838>.
+
+
[ORIGIN]
+
+Nottingham, M. and E. Nygren, "The ORIGIN HTTP/2 Frame", RFC 8336, DOI 10.17487/RFC8336, , <https://www.rfc-editor.org/rfc/rfc8336>.
+
+
+
+
+
+
+
+
+

+Acknowledgments +

+

Thanks to Mike Bishop, Nick Sullivan, Martin Thomson and other +contributors for their work on the document that this is based on.

+

And thanks to Eric Kinnear, Tommy Pauly, and Lucas Pardue for their guidance and +editorial contributions to this document.

+
+
+
+
+

+Authors' Addresses +

+
+
Eric Gorbaty (editor)
+
Apple
+ +
+
+
Mike Bishop (editor)
+
Akamai
+ +
+
+
+ + + diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-secondary-server-certs.txt b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-secondary-server-certs.txt new file mode 100644 index 000000000..39ca6fcb4 --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-secondary-server-certs.txt @@ -0,0 +1,664 @@ + + + + +HTTP E. Gorbaty, Ed. +Internet-Draft Apple +Intended status: Standards Track M. Bishop, Ed. +Expires: 11 July 2025 Akamai + 7 January 2025 + + + Secondary Certificate Authentication of HTTP Servers + draft-ietf-httpbis-secondary-server-certs-latest + +Abstract + + This document defines a way for HTTP/2 and HTTP/3 servers to send + additional certificate-based credentials after a TLS connection is + established, based on TLS Exported Authenticators. + +About This Document + + This note is to be removed before publishing as an RFC. + + The latest revision of this draft can be found at https://httpwg.org/ + http-extensions/draft-ietf-httpbis-secondary-server-certs.html. + Status information for this document may be found at + https://datatracker.ietf.org/doc/draft-ietf-httpbis-secondary-server- + certs/. + + Discussion of this document takes place on the HTTP Working Group + mailing list (mailto:ietf-http-wg@w3.org), which is archived at + https://lists.w3.org/Archives/Public/ietf-http-wg/. + + Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/secondary-server- + certs. + +Status of This Memo + + This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF). Note that other groups may also distribute + working documents as Internet-Drafts. The list of current Internet- + Drafts is at https://datatracker.ietf.org/drafts/current/. 
+ + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." + + This Internet-Draft will expire on 11 July 2025. + +Copyright Notice + + Copyright (c) 2025 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents (https://trustee.ietf.org/ + license-info) in effect on the date of publication of this document. + Please review these documents carefully, as they describe your rights + and restrictions with respect to this document. Code Components + extracted from this document must include Revised BSD License text as + described in Section 4.e of the Trust Legal Provisions and are + provided without warranty as described in the Revised BSD License. + +Table of Contents + + 1. Introduction + 1.1. Server Certificate Authentication + 1.2. TLS Exported Authenticators + 1.3. HTTP-Layer Certificate Authentication + 2. Conventions and Definitions + 3. Discovering Additional Certificates at the HTTP Layer + 3.1. Indicating Support for HTTP-Layer Certificate + Authentication + 3.2. Making Certificates Available + 4. SETTINGS_HTTP_SERVER_CERT_AUTH + 4.1. The SETTINGS_HTTP_SERVER_CERT_AUTH HTTP/2 SETTINGS + Parameter + 4.2. The SETTINGS_HTTP_SERVER_CERT_AUTH HTTP/3 SETTINGS + Parameter + 5. CERTIFICATE frame + 5.1. HTTP/2 CERTIFICATE frame + 5.2. HTTP/3 CERTIFICATE frame + 5.3. Exported Authenticator Characteristics + 6. Indicating Failures During HTTP-Layer Certificate + Authentication + 6.1. Misbehavior + 6.2. Invalid Certificates + 7. Security Considerations + 7.1. Impersonation + 7.2. Fingerprinting + 7.3. Persistence of Service + 7.4. Confusion About State + 8. IANA Considerations + 8.1. Frame Types + 8.2. Settings Parameters + 9. 
References + 9.1. Normative References + 9.2. Informative References + Acknowledgments + Authors' Addresses + +1. Introduction + + HTTP [HTTP] clients need to know that the content they receive on a + connection comes from the origin from which they intended to retrieve + it. The traditional form of server authentication in HTTP has been + in the form of a single X.509 certificate provided during the TLS + [TLS13] handshake. + + TLS supports one server and one client certificate on a connection. + These certificates may contain multiple identities, but only one + certificate may be provided. + + Many HTTP servers host content from several origins. HTTP/2 [H2] and + HTTP/3 [H3] permit clients to reuse an existing HTTP connection to a + server provided that the secondary origin is also in the certificate + provided during the TLS handshake. In many cases, servers choose to + maintain separate certificates for different origins but still desire + the benefits of a shared HTTP connection. This document defines a + capability for servers to use and to authenticate with those separate + certificates over a shared connection. + + The ability to maintain separate certificates for different origins + can also allow proxies that cache content from secondary origins to + communicate to clients that they can service some of those origins + directly, allowing the proxy to behave as a TLS-terminating reverse + proxy for those origins instead of establishing a TLS encrypted + tunnel through the proxy. + +1.1. Server Certificate Authentication + + Section 9.1.1 of [H2] and Section 3.3 of [H3] describe how + connections may be used to make requests from multiple origins as + long as the server is authoritative for both. A server is considered + authoritative for an origin if DNS resolves the origin to the IP + address of the server and (for TLS) if the certificate presented by + the server contains the origin in the Subject Alternative Names + field. 
+ + [ALTSVC] enables a step of abstraction from the DNS resolution. If + both hosts have provided an Alternative Service at hostnames which + resolve to the IP address of the server, they are considered + authoritative just as if DNS resolved the origin itself to that + address. However, the server's one TLS certificate is still required + to contain the name of each origin in question. + + [ORIGIN] relaxes the requirement to perform the DNS lookup if already + connected to a server with an appropriate certificate which claims + support for a particular origin. + + Servers which host many origins often would prefer to have separate + certificates for some sets of origins. This may be for ease of + certificate management (the ability to separately revoke or renew + them), due to different sources of certificates (a CDN acting on + behalf of multiple origins), or other factors which might drive this + administrative decision. Clients connecting to such origins cannot + currently reuse connections, even if both client and server would + prefer to do so. + + Because the TLS SNI extension is exchanged in the clear, clients + might also prefer to retrieve certificates inside the encrypted + context. When this information is sensitive, it might be + advantageous to request a general-purpose certificate or anonymous + ciphersuite at the TLS layer, while acquiring the "real" certificate + in HTTP after the connection is established. + +1.2. TLS Exported Authenticators + + TLS Exported Authenticators [EXPORTED-AUTH] are structured messages + that can be exported by either party of a TLS connection and + validated by the other party. Given an established TLS connection, + an authenticator message can be constructed proving possession of a + certificate and a corresponding private key. The mechanisms that + this document defines are primarily focused on the server's ability + to generate TLS Exported Authenticators. 
+
+   Each Authenticator is computed using a Handshake Context and Finished
+   MAC Key derived from the TLS session.  The Handshake Context is
+   identical for both parties of the TLS connection, while the Finished
+   MAC Key is dependent on whether the Authenticator is created by the
+   client or the server.
+
+   Successfully verified Authenticators result in certificate chains,
+   with verified possession of the corresponding private key, which can
+   be supplied into a collection of available certificates.  Likewise,
+   descriptions of desired certificates can also be supplied into these
+   collections.
+
+1.3.  HTTP-Layer Certificate Authentication
+
+   This document defines HTTP/2 and HTTP/3 CERTIFICATE frames
+   (Section 5) to carry the relevant certificate messages, enabling
+   certificate-based authentication of servers independent of TLS
+   version.  This mechanism can be implemented at the HTTP layer without
+   breaking the existing interface between HTTP and applications above
+   it.
+
+   TLS Exported Authenticators [EXPORTED-AUTH] give HTTP/2 and HTTP/3
+   servers the opportunity to send certificate frames that can be used
+   to prove the server's authenticity for multiple origins.
+
+   This document additionally defines SETTINGS parameters for HTTP/2 and
+   HTTP/3 (Section 4) that allow the client and server to indicate
+   support for HTTP-layer certificate authentication.
+
+2.  Conventions and Definitions
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
+   "OPTIONAL" in this document are to be interpreted as described in
+   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
+   capitals, as shown here.
+
+3.  Discovering Additional Certificates at the HTTP Layer
+
+   A certificate chain with proof of possession of the private key
+   corresponding to the end-entity certificate is sent as a sequence of
+   CERTIFICATE frames (see Section 5.1, Section 5.2) to the client.
Once the holder of a certificate has sent the chain and proof, this
+   certificate chain is cached by the recipient and available for future
+   use.
+
+3.1.  Indicating Support for HTTP-Layer Certificate Authentication
+
+   The SETTINGS_HTTP_SERVER_CERT_AUTH parameters for HTTP/2 and HTTP/3
+   are defined in Section 4 so that clients and servers can indicate
+   support for secondary certificate authentication of servers.
+
+   HTTP/2 and HTTP/3 endpoints that wish to indicate support for HTTP-
+   layer certificate authentication MUST send a
+   SETTINGS_HTTP_SERVER_CERT_AUTH parameter set to "1" in their SETTINGS
+   frame.  Endpoints MUST NOT use any of the authentication
+   functionality described in this document unless the parameter has
+   been negotiated by both sides.
+
+   Endpoints MUST NOT send a SETTINGS_HTTP_SERVER_CERT_AUTH parameter
+   with a value of 0 after previously sending a value of 1.
+
+   SETTINGS_HTTP_SERVER_CERT_AUTH indicates that servers are able to
+   offer additional certificates to demonstrate control over other
+   origin hostnames, and that clients are able to make requests for
+   hostnames received in a TLS Exported Authenticator that the server
+   sends.
+
+3.2.  Making Certificates Available
+
+   When both peers have advertised support for HTTP-layer certificates
+   in a given direction as in Section 3.1, the indicated endpoint can
+   supply additional certificates into the connection at any time.  That
+   is, if both endpoints have sent SETTINGS_HTTP_SERVER_CERT_AUTH and
+   validated the value received from the peer, the server may send
+   certificates spontaneously, at any time, as described by the
+   Spontaneous Server Authentication message sequence in Section 3 of
+   [EXPORTED-AUTH].
+
+   This means that if a server knows it supports secondary certificate
+   authentication, and it receives SETTINGS_HTTP_SERVER_CERT_AUTH from
+   the client, it can enqueue certificates immediately following the
+   received SETTINGS frame.
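The negotiation rules in Section 3.1 amount to a small amount of per-connection state. The following non-normative sketch (class and method names are ours) tracks them: the feature is usable only once both peers have sent the setting with value 1, and a peer may never move from 1 back to 0:

```python
# Sketch of the SETTINGS_HTTP_SERVER_CERT_AUTH negotiation state for
# one connection. "local" is what we sent; "remote" is what the peer sent.

class CertAuthSetting:
    def __init__(self):
        self.local = 0
        self.remote = 0

    def send(self, value: int) -> None:
        # An endpoint MUST NOT send 0 after previously sending 1.
        if self.local == 1 and value == 0:
            raise ValueError("MUST NOT send 0 after previously sending 1")
        self.local = value

    def receive(self, value: int) -> None:
        # A peer moving from 1 back to 0 violates the same requirement.
        if self.remote == 1 and value == 0:
            raise ValueError("peer changed setting from 1 to 0")
        self.remote = value

    @property
    def negotiated(self) -> bool:
        # Authentication functionality is usable only after both sides
        # have advertised support.
        return self.local == 1 and self.remote == 1
```

Only once `negotiated` is true may an endpoint act on, or send, CERTIFICATE frames.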
+
+   Certificates supplied by servers can be considered by clients without
+   further action by the server.  A server SHOULD NOT send certificates
+   which do not cover origins which it is prepared to service on the
+   current connection, and SHOULD NOT send them if the client has not
+   indicated support with SETTINGS_HTTP_SERVER_CERT_AUTH.
+
+   A client MUST NOT send certificates to the server.  The server SHOULD
+   close the connection upon receipt of a CERTIFICATE frame from a
+   client.
+
+   Client                                      Server
+      <-- (stream 0 / control stream) CERTIFICATE --
+      ...
+      -- (stream N) GET /from-new-origin ---------->
+      <----------------------- (stream N) 200 OK ---
+
+         Figure 1: Simple unprompted server authentication
+
+   A server MAY send a CERTIFICATE frame immediately after sending its
+   SETTINGS.  However, it MAY also send certificates at any time later.
+   For example, a proxy might discover that a client is interested in an
+   origin that it can reverse proxy at the time that a client sends a
+   CONNECT request.  It can then send certificates for those origins to
+   allow for TLS-terminated reverse proxying to those origins for the
+   remainder of the connection lifetime.  Figure 2 illustrates this
+   behavior.
+
+   Client                                      Server
+      -- (stream N) CONNECT /to-new-origin -------->
+      <-- (stream 0 / control stream) CERTIFICATE --
+      <-- (stream 0 / control stream) 200 OK -------
+      ...
+      -- (stream M) GET /to-new-origin ------------>
+      <--- (stream M, direct from server) 200 OK ---
+
+          Figure 2: Reverse proxy server authentication
+
+4.  SETTINGS_HTTP_SERVER_CERT_AUTH
+
+   SETTINGS parameters are defined separately for HTTP/2 and HTTP/3
+   below.
+
+4.1.  The SETTINGS_HTTP_SERVER_CERT_AUTH HTTP/2 SETTINGS Parameter
+
+   This document adds a new HTTP/2 SETTINGS(0xTBD) parameter to those
+   defined by Section 6.5.2 of [H2].
+
+   The new parameter name is SETTINGS_HTTP_SERVER_CERT_AUTH.  The value
+   of the parameter MUST be 0 or 1.
+
+   The usage of this parameter is described in Section 3.1.
+
+4.2.
The SETTINGS_HTTP_SERVER_CERT_AUTH HTTP/3 SETTINGS Parameter
+
+   This document adds a new HTTP/3 SETTINGS(0xTBD) parameter to those
+   defined by Section 7.2.4.1 of [H3].
+
+   The new parameter name is SETTINGS_HTTP_SERVER_CERT_AUTH.  The value
+   of the parameter MUST be 0 or 1.
+
+   The usage of this parameter is described in Section 3.1.
+
+5.  CERTIFICATE frame
+
+   The CERTIFICATE frame contains an exported authenticator message from
+   the TLS layer that provides a chain of certificates and associated
+   extensions, proving possession of the private key corresponding to
+   the end-entity certificate.
+
+   A server sends a CERTIFICATE frame on stream 0 for HTTP/2 and on the
+   control stream for HTTP/3.  The client is permitted to make
+   subsequent requests for resources upon receipt of a CERTIFICATE frame
+   without further action from the server.
+
+   Upon receiving a complete series of CERTIFICATE frames, the receiver
+   may validate the Exported Authenticator value by using the exported
+   authenticator API.  This returns either an error indicating that the
+   message was invalid or the certificate chain and extensions used to
+   create the message.
+
+5.1.  HTTP/2 CERTIFICATE frame
+
+   A CERTIFICATE frame in HTTP/2 (type=0xTBD) carries a TLS Exported
+   Authenticator that clients can use to authenticate secondary origins
+   from a sending server.
+
+   The CERTIFICATE frame MUST be sent on stream 0.  A CERTIFICATE frame
+   received on any other stream MUST NOT be used for server
+   authentication.
+
+   CERTIFICATE Frame {
+     Length (24),
+     Type (8) = 0xTBD,
+
+     Unused Flags (8),
+
+     Reserved (1),
+     Stream Identifier (31) = 0,
+
+     Authenticator (..),
+   }
+
+                Figure 3: HTTP/2 CERTIFICATE Frame
+
+   The Length, Type, Unused Flag(s), Reserved, and Stream Identifier
+   fields are described in Section 4 of [H2].
+
+   The CERTIFICATE frame does not define any flags.
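The layout in Figure 3 follows the standard 9-octet HTTP/2 frame header. As a non-normative sketch, the frame could be serialized as follows; note that the frame type code is not yet assigned (0xTBD in the figure), so the value below is a placeholder chosen purely for illustration:

```python
# Sketch of serializing the HTTP/2 CERTIFICATE frame from Figure 3.
FRAME_TYPE_CERTIFICATE = 0x7A  # placeholder; the IANA code point is TBD

def h2_certificate_frame(authenticator: bytes) -> bytes:
    """Build frame header + payload per the Figure 3 layout."""
    length = len(authenticator)
    assert length < 2 ** 24, "payload exceeds the 24-bit Length field"
    header = (
        length.to_bytes(3, "big")           # Length (24)
        + bytes([FRAME_TYPE_CERTIFICATE])   # Type (8)
        + bytes([0])                        # Unused Flags (8): none defined
        + (0).to_bytes(4, "big")            # Reserved (1) + Stream Id (31) = 0
    )
    return header + authenticator
```

Setting the stream identifier to zero reflects the requirement that the frame be sent on stream 0; the flags octet is zero because the frame defines no flags.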
+
+   The authenticator field is a portion of the opaque data returned from
+   the TLS connection exported authenticator authenticate API.  See
+   Section 5.3 for more details on the input to this API.
+
+   The CERTIFICATE frame applies to the connection, not a specific
+   stream.  An endpoint MUST treat a CERTIFICATE frame with a stream
+   identifier other than 0x00 as a connection error.
+
+5.2.  HTTP/3 CERTIFICATE frame
+
+   A CERTIFICATE frame in HTTP/3 (type=0xTBD) carries a TLS Exported
+   Authenticator that clients can use to authenticate secondary origins
+   from a sending server.
+
+   The CERTIFICATE frame MUST be sent on the control stream.  A
+   CERTIFICATE frame received on any other stream MUST NOT be used for
+   server authentication.
+
+   CERTIFICATE Frame {
+     Type (i) = 0xTBD,
+     Length (i),
+     Authenticator (...),
+   }
+
+                Figure 4: HTTP/3 CERTIFICATE Frame
+
+   The Type and Length fields are described in Section 7.1 of [H3].
+
+   The authenticator field is a portion of the opaque data returned from
+   the TLS connection exported authenticator authenticate API.  See
+   Section 5.3 for more details on the input to this API.
+
+   The CERTIFICATE frame applies to the connection, not a specific
+   stream.  An endpoint MUST treat a CERTIFICATE frame received on any
+   stream other than the control stream as a connection error.
+
+5.3.  Exported Authenticator Characteristics
+
+   The Exported Authenticator API defined in [EXPORTED-AUTH] takes as
+   input a request, a set of certificates, and supporting information
+   about the certificate (OCSP, SCT, etc.).  The result is an opaque
+   token which is used when generating the CERTIFICATE frame.
+
+   Upon receipt of a CERTIFICATE frame, an endpoint which has negotiated
+   support for secondary certificates MUST perform the following steps
+   to validate the token it contains:
+
+   *  Using the get context API, retrieve the
+      certificate_request_context used to generate the authenticator, if
+      any.
Because the certificate_request_context for spontaneous + server certificates is chosen by the server, the usage of the + certificate_request_context is implementation-dependent. For + details, see Section 5 of [EXPORTED-AUTH]. + + * Use the validate API to confirm the validity of the authenticator + with regard to the generated request, if any. + + If the authenticator cannot be validated, this SHOULD be treated as a + connection error. + + Once the authenticator is accepted, the endpoint can perform any + other checks for the acceptability of the certificate itself. + +6. Indicating Failures During HTTP-Layer Certificate Authentication + + Because this document permits certificates to be exchanged at the + HTTP framing layer instead of the TLS layer, several certificate- + related errors which are defined at the TLS layer might now occur at + the HTTP framing layer. + + There are two classes of errors which might be encountered, and they + are handled differently. + +6.1. Misbehavior + + This category of errors could indicate a peer failing to follow + requirements in this document or might indicate that the connection + is not fully secure. These errors are fatal to stream or connection, + as appropriate. + + CERTIFICATE_UNREADABLE (0xERROR-TBD): An exported authenticator + could not be validated. + +6.2. Invalid Certificates + + Unacceptable certificates (expired, revoked, or insufficient to + satisfy the request) are not treated as stream or connection errors. + This is typically not an indication of a protocol failure. Clients + SHOULD establish a new connection in an attempt to reach an + authoritative server if they deem a certificate from the server + unacceptable. + +7. Security Considerations + + This mechanism defines an alternate way to obtain server and client + certificates other than in the initial TLS handshake. 
While the + signature of exported authenticator values is expected to be equally + secure, it is important to recognize that a vulnerability in this + code path is at least equal to a vulnerability in the TLS handshake. + +7.1. Impersonation + + This mechanism could increase the impact of a key compromise. Rather + than needing to subvert DNS or IP routing in order to use a + compromised certificate, a malicious server now only needs a client + to connect to _some_ HTTPS site under its control in order to present + the compromised certificate. Clients SHOULD consult DNS for + hostnames presented in secondary certificates if they would have done + so for the same hostname if it were present in the primary + certificate. + + As recommended in [ORIGIN], clients opting not to consult DNS ought + to employ some alternative means to increase confidence that the + certificate is legitimate, such as an ORIGIN frame. + + As noted in the Security Considerations of [EXPORTED-AUTH], it is + difficult to formally prove that an endpoint is jointly authoritative + over multiple certificates, rather than individually authoritative on + each certificate. As a result, clients MUST NOT assume that because + one origin was previously colocated with another, those origins will + be reachable via the same endpoints in the future. Clients MUST NOT + consider previous secondary certificates to be validated after TLS + session resumption. Servers MAY re-present certificates if a TLS + Session is resumed. + +7.2. Fingerprinting + + This document defines a mechanism which could be used to probe + servers for origins they support, but it opens no new attack that was + not already possible by making repeat TLS connections with different + SNI values. + +7.3. Persistence of Service + + CNAME records in the DNS are frequently used to delegate authority + for an origin to a third-party provider. 
This delegation can be
+   changed without notice, even to the third-party provider, simply by
+   modifying the CNAME record in question.
+
+   After the owner of the domain has redirected traffic elsewhere by
+   changing the CNAME, new connections will not arrive for that origin,
+   but connections which are properly directed to this provider for
+   other origins would continue to claim control of this origin (via
+   Secondary Certificates).  This is proper behavior based on the third-
+   party provider's configuration, but would likely not be what is
+   intended by the owner of the origin.
+
+   This is not an issue which can be mitigated by the protocol, but
+   something about which third-party providers SHOULD educate their
+   customers before using the features described in this document.
+
+7.4.  Confusion About State
+
+   Implementations need to be aware of the potential for confusion about
+   the state of a connection.  The presence or absence of a validated
+   certificate can change during the processing of a request,
+   potentially multiple times, as CERTIFICATE frames are received.  A
+   client that uses certificate authentication needs to be prepared to
+   reevaluate the authorization state of a request as the set of
+   certificates changes.
+
+   Behavior for TLS-terminated reverse proxies is also worth
+   considering.  If a server that situationally reverse-proxies wishes
+   the client to treat a request made prior to receipt of certificates
+   as TLS-terminated, or alternatively wishes the client to start a new
+   tunnel, this document does not currently define a formal mechanism to
+   express that intention.
+
+8.  IANA Considerations
+
+   This document registers the CERTIFICATE frame type and
+   SETTINGS_HTTP_SERVER_CERT_AUTH setting for both [H2] and [H3].
+
+8.1.
Frame Types + + This specification registers the following entry in the "HTTP/2 Frame + Type" registry defined in [H2]: + + Code: : TBD + + Frame Type: : CERTIFICATE + + Reference: : This document + + This specification registers the following entry in the "HTTP/3 Frame + Types" registry established by [H3]: + + Value: : TBD + + Frame Type: : CERTIFICATE + + Status: : permanent + + Reference: : This document + + Change Controller: : IETF + + Contact: : ietf-http-wg@w3.org + +8.2. Settings Parameters + + This specification registers the following entry in the "HTTP/2 + Settings" registry defined in [H2]: + + Code: : TBD + + Name: : SETTINGS_HTTP_SERVER_CERT_AUTH + + Initial Value: : 0 + + Reference: : This document + + This specification registers the following entry in the "HTTP/3 + Settings" registry defined in [H3]: + + Code: : TBD + + Name: : SETTINGS_HTTP_SERVER_CERT_AUTH + + Default: : 0 + + Reference: : This document + + Change Controller: : IETF + + Contact: : ietf-http-wg@w3.org + +9. References + +9.1. Normative References + + [EXPORTED-AUTH] + Sullivan, N., "Exported Authenticators in TLS", RFC 9261, + DOI 10.17487/RFC9261, July 2022, + . + + [H2] Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113, + DOI 10.17487/RFC9113, June 2022, + . + + [H3] Bishop, M., Ed., "HTTP/3", RFC 9114, DOI 10.17487/RFC9114, + June 2022, . + + [HTTP] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Semantics", STD 97, RFC 9110, + DOI 10.17487/RFC9110, June 2022, + . + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + . + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, . + + [TLS13] Rescorla, E., "The Transport Layer Security (TLS) Protocol + Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, + . + +9.2. 
Informative References + + [ALTSVC] Nottingham, M., McManus, P., and J. Reschke, "HTTP + Alternative Services", RFC 7838, DOI 10.17487/RFC7838, + April 2016, . + + [ORIGIN] Nottingham, M. and E. Nygren, "The ORIGIN HTTP/2 Frame", + RFC 8336, DOI 10.17487/RFC8336, March 2018, + . + +Acknowledgments + + Thanks to Mike Bishop, Nick Sullivan, Martin Thomson and other + contributors for their work on the document that this is based on. + + And thanks to Eric Kinnear, Tommy Pauly, and Lucas Pardue for their + guidance and editorial contributions to this document. + +Authors' Addresses + + Eric Gorbaty (editor) + Apple + Email: e_gorbaty@apple.com + + + Mike Bishop (editor) + Akamai + Email: mbishop@evequefou.be diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-sfbis.html b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-sfbis.html new file mode 100644 index 000000000..e219be6d3 --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-sfbis.html @@ -0,0 +1,3368 @@ + + + + + + +Structured Field Values for HTTP + + + + + + + + + + + + + + + + + + + + + + + + +
Internet-DraftStructured Field Values for HTTPJanuary 2025
Nottingham & KampExpires 11 July 2025[Page]
+
+
+
+
Workgroup:
+
HTTP
+
Internet-Draft:
+
draft-ietf-httpbis-sfbis-latest
+
Obsoletes:
+
+8941 (if approved)
+
Published:
+
+ +
+
Intended Status:
+
Standards Track
+
Expires:
+
+
Authors:
+
+
+
M. Nottingham
+
Cloudflare
+
+
+
P.-H. Kamp
+
The Varnish Cache Project
+
+
+
+
+

Structured Field Values for HTTP

+
+

Abstract

+

This document describes a set of data types and associated algorithms that are intended to make it easier and safer to define and handle HTTP header and trailer fields, known as "Structured Fields", "Structured Headers", or "Structured Trailers". It is intended for use by specifications of new HTTP fields that wish to use a common syntax that is more restrictive than traditional HTTP field values.

+

This document obsoletes RFC 8941.

+
+
+

+About This Document +

+

This note is to be removed before publishing as an RFC.

+

+ Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ietf-httpbis-sfbis/.

+

+ Discussion of this document takes place on the + HTTP Working Group mailing list (mailto:ietf-http-wg@w3.org), + which is archived at https://lists.w3.org/Archives/Public/ietf-http-wg/. + Working Group information can be found at https://httpwg.org/.

+

Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/header-structure.

+
+
+
+

+Status of This Memo +

+

+ This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79.

+

+ Internet-Drafts are working documents of the Internet Engineering Task + Force (IETF). Note that other groups may also distribute working + documents as Internet-Drafts. The list of current Internet-Drafts is + at https://datatracker.ietf.org/drafts/current/.

+

+ Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress."

+

+ This Internet-Draft will expire on 11 July 2025.

+
+
+ +
+
+

+Table of Contents +

+ +
+
+
+
+

+1. Introduction +

+

Specifying the syntax of new HTTP header (and trailer) fields is an onerous task; even with the guidance in Section 16.3.2 of [HTTP], there are many decisions -- and pitfalls -- for a prospective HTTP field author.

+

Once a field is defined, bespoke parsers and serializers often need to be written, because each field value has a slightly different handling of what looks like common syntax.

+

This document introduces a set of common data structures for use in definitions of new HTTP field values to address these problems. In particular, it defines a generic, abstract model for them, along with a concrete serialization for expressing that model in HTTP [HTTP] header and trailer fields.

+

An HTTP field that is defined as a "Structured Header" or "Structured Trailer" (if the field can be either, it is a "Structured Field") uses the types defined in this specification to define its syntax and basic handling rules, thereby simplifying both its definition by specification writers and handling by implementations.

+

Additionally, future versions of HTTP can define alternative serializations of the abstract model of these structures, allowing fields that use that model to be transmitted more efficiently without being redefined.

+

Note that it is not a goal of this document to redefine the syntax of existing HTTP fields; the mechanisms described herein are only intended to be used with fields that explicitly opt into them.

+

Section 2 describes how to specify a Structured Field.

+

Section 3 defines a number of abstract data types that can be used in Structured Fields.

+

Those abstract types can be serialized into and parsed from HTTP field values using the algorithms described in Section 4.

+
+
+

+1.1. Intentionally Strict Processing +

+

This specification intentionally defines strict parsing and serialization behaviors using step-by-step algorithms; the only error handling defined is to fail the entire operation altogether.

+

It is designed to encourage faithful implementation and good interoperability. Therefore, an implementation that tried to be helpful by being more tolerant of input would make interoperability worse, since that would create pressure on other implementations to implement similar (but likely subtly different) workarounds.

+

In other words, strict processing is an intentional feature of this specification; it allows non-conformant input to be discovered and corrected by the producer early and avoids both interoperability and security issues that might otherwise result.

+

Note that as a result of this strictness, if a field is appended to by multiple parties (e.g., intermediaries or different components in the sender), an error in one party's value is likely to cause the entire field value to fail parsing.

+
+
+
+
+

+1.2. Notational Conventions +

+

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

+

This document uses the VCHAR, SP, DIGIT, ALPHA, and DQUOTE rules from [RFC5234] to specify characters and/or their corresponding ASCII bytes, depending on context. It uses the tchar and OWS rules from [HTTP] for the same purpose.

+

This document uses algorithms to specify parsing and serialization behaviors. When parsing from HTTP fields, implementations MUST have behavior that is indistinguishable from following the algorithms.

+

For serialization to HTTP fields, the algorithms define the recommended way to produce them. Implementations MAY vary from the specified behavior so long as the output is still correctly handled by the parsing algorithm described in Section 4.2.

+
+
+
+
+
+
+

+2. Defining New Structured Fields +

+

To specify an HTTP field as a Structured Field, its authors need to:

+
    +
  • +

    Normatively reference this specification. Recipients and generators of the field need to know that the requirements of this document are in effect.

    +
  • +
  • +

    Identify whether the field is a Structured Header (i.e., it can only be used in the header section -- the common case), a Structured Trailer (only in the trailer section), or a Structured Field (both).

    +
  • +
  • +

    Specify the type of the field value; either List (Section 3.1), Dictionary (Section 3.2), or Item (Section 3.3).

    +
  • +
  • +

    Define the semantics of the field value.

    +
  • +
  • +

    Specify any additional constraints upon the field value, as well as the consequences when those constraints are violated.

    +
  • +
+

Typically, this means that a field definition will specify the top-level type -- List, Dictionary, or Item -- and then define its allowable types and constraints upon them. For example, a header defined as a List might have all Integer members, or a mix of types; a header defined as an Item might allow only Strings, and additionally only strings beginning with the letter "Q", or strings in lowercase. Likewise, Inner Lists (Section 3.1.1) are only valid when a field definition explicitly allows them.

+

Fields that use the Display String type are advised to carefully specify their allowable unicode code points; for example, specifying the use of a profile from [PRECIS].

+

Field definitions can only use this specification for the entire field value, not a portion thereof.

+

Specifications can refer to a field name as a "structured header name", "structured trailer name", or "structured field name" as appropriate. Likewise, they can refer to its field value as a "structured header value", "structured trailer value", or "structured field value" as necessary.

+

This specification defines minimums for the length or number of various structures supported by implementations. It does not specify maximum sizes in most cases, but authors should be aware that HTTP implementations do impose various limits on the size of individual fields, the total number of fields, and/or the size of the entire header or trailer section.

+
+
+

+2.1. Example +

+

A fictitious Foo-Example header field might be specified as:

+
+

42. Foo-Example Header Field

+

The Foo-Example HTTP header field conveys information about how +much Foo the message has.

+

Foo-Example is an Item Structured Header Field [RFCnnnn]. Its value +MUST be an Integer (Section 3.3.1 of [RFCnnnn]).

+

Its value indicates the amount of Foo in the message, and it MUST +be between 0 and 10, inclusive; other values MUST cause +the entire header field to be ignored.

+

The following parameter is defined:

+
    +
  • A parameter whose key is "foourl", and whose value is a String + (Section 3.3.3 of [RFCnnnn]), conveying the Foo URL + for the message. See below for processing requirements. +
  • +
+

"foourl" contains a URI-reference (Section 4.1 of [RFC3986]). If +its value is not a valid URI-reference, the entire header field +MUST be ignored. If its value is a relative reference (Section 4.2 +of [RFC3986]), it MUST be resolved (Section 5 of [RFC3986]) before +being used.

+

For example:

+
+
+  Foo-Example: 2; foourl="https://foo.example.com/"
+
+
+
+
+
+
+
+

+2.2. Error Handling +

+

When parsing fails, the entire field is ignored (see Section 4.2). Field definitions cannot override this, because doing so would preclude handling by generic software; they can only add additional constraints (for example, on the numeric range of Integers and Decimals, the format of Strings and Tokens, the types allowed in a Dictionary's values, or the number of Items in a List).

+

When field-specific constraints are violated, the entire field is also ignored, unless the field definition defines other handling requirements. For example, if a header field is defined as an Item and required to be an Integer, but a String is received, it should be ignored unless that field's definition explicitly specifies otherwise.

+
+
+
+
+

+2.3. Preserving Extensibility +

+

Structured Fields are designed to be extensible, because experience has shown that even when it is not foreseen, it is often necessary to modify and add to the allowable syntax and semantics of a field in a controlled fashion.

+

Both Items and Inner Lists allow Parameters as an extensibility mechanism; this means that their values can later be extended to accommodate more information, if need be. To preserve forward compatibility, field specifications are discouraged from defining the presence of an unrecognized parameter as an error condition.

+

Field specifications are required to be either an Item, List, or Dictionary to preserve extensibility. Fields that are erroneously defined as another type (e.g., Integer) are assumed to be Items (i.e., they allow Parameters).

+

To further assure that this extensibility is available in the future, and to encourage consumers to use a complete parser implementation, a field definition can specify that "grease" parameters be added by senders. A specification could stipulate that all parameters that fit a defined pattern are reserved for this use and then encourage them to be sent on some portion of requests. This helps to discourage recipients from writing a parser that does not account for Parameters.

+

Specifications that use Dictionaries can also allow for forward compatibility by requiring that the presence of -- as well as value and type associated with -- unknown keys be ignored. Subsequent specifications can then add additional keys, specifying constraints on them as appropriate.

+

An extension to a Structured Field can then require that an entire field value be ignored by a recipient that understands the extension if constraints on the value it defines are not met.

+
+
+
+
+

+2.4. Using New Structured Types in Extensions +

+

Because a field definition needs to reference a specific RFC for Structured Fields, the types available for use in its value are limited to those defined in that RFC. For example, a field whose definition references this document can have a value that uses the Date type (Section 3.3.7), whereas a field whose definition references RFC 8941 cannot, because it will be treated as invalid (and therefore discarded) by implementations of that specification.

+

This limitation also applies to future extensions to a field; for example, a field that is defined with reference to RFC 8941 cannot use the Date type, because some recipients might still be using an RFC 8941 parser to process it.

+

However, this document is designed to be backwards-compatible with RFC 8941; a parser that implements the requirements here can also parse valid Structured Fields whose definitions reference RFC 8941.

+

Upgrading a Structured Fields implementation to support a newer revision of the specification (such as this document) brings the possibility that some field values that were invalid according to the earlier RFC might become valid when processed.

+

For example, a field instance might contain a syntactically valid Date (Section 3.3.7), even though that field's definition does not accommodate Dates. An RFC 8941 implementation would fail parsing such a field instance, because Dates are not defined in that specification. If that implementation were upgraded to this specification, parsing would now succeed. In some cases, the resulting Date value will be rejected by field-specific logic, but values in fields that are otherwise ignored (such as extension parameters) might not be detected and the field might subsequently be accepted and processed.

+
+
+
+
+
+
+

+3. Structured Data Types +

+

This section provides an overview of the abstract types that Structured Fields use, and gives a brief description and examples of how each of those types are serialized into textual HTTP fields. Section 4 specifies the details of how they are parsed from and serialized into textual HTTP fields.

+

In summary:

+
    +
  • +

    There are three top-level types that an HTTP field can be defined as: Lists, Dictionaries, and Items.

    +
  • +
  • +

    Lists and Dictionaries are containers; their members can be Items or Inner Lists (which are themselves arrays of Items).

    +
  • +
  • +

    Both Items and Inner Lists can be Parameterized with key/value pairs.

    +
  • +
+
+
+

+3.1. Lists +

+

Lists are arrays of zero or more members, each of which can be an Item (Section 3.3) or an Inner List (Section 3.1.1), both of which can be Parameterized (Section 3.1.2).

+

An empty List is denoted by not serializing the field at all. This implies that fields defined as Lists have a default empty value.

+

When serialized as a textual HTTP field, each member is separated by a comma and optional whitespace. For example, a field whose value is defined as a List of Tokens could look like:

+
+
+Example-List: sugar, tea, rum
+
+
+

Note that Lists can have their members split across multiple lines of the same header or trailer section, as per Section 5.3 of [HTTP]; for example, the following are equivalent:

+
+
+Example-List: sugar, tea, rum
+
+
+

and

+
+
+Example-List: sugar, tea
+Example-List: rum
+
+
+

However, individual members of a List cannot be safely split between lines; see Section 4.2 for details.

+

Parsers MUST support Lists containing at least 1024 members. Field specifications can constrain the types and cardinality of individual List values as they require.

+
+
+

+3.1.1. Inner Lists +

+

An Inner List is an array of zero or more Items (Section 3.3). Both the individual Items and the Inner List itself can be Parameterized (Section 3.1.2).

+

When serialized in a textual HTTP field, Inner Lists are denoted by surrounding parenthesis, and their values are delimited by one or more spaces. A field whose value is defined as a List of Inner Lists of Strings could look like:

+
+
+Example-List: ("foo" "bar"), ("baz"), ("bat" "one"), ()
+
+
+

Note that the last member in this example is an empty Inner List.

+

A header field whose value is defined as a List of Inner Lists with Parameters at both levels could look like:

+
+
+Example-List: ("foo"; a=1;b=2);lvl=5, ("bar" "baz");lvl=1
+
+
+

Parsers MUST support Inner Lists containing at least 256 members. Field specifications can constrain the types and cardinality of individual Inner List members as they require.

+
+
+
+
+

+3.1.2. Parameters +

+

Parameters are an ordered map of key-value pairs that are associated with an Item (Section 3.3) or Inner List (Section 3.1.1). The keys are unique within the scope of the Parameters they occur within, and the values are bare items (i.e., they themselves cannot be parameterized; see Section 3.3).

+

Implementations MUST provide access to Parameters both by index and by key. Specifications MAY use either means of accessing them.

+

Note that parameters are ordered, and parameter keys cannot contain uppercase letters.

+

When serialized in a textual HTTP field, a Parameter is separated from its Item or Inner List and other Parameters by a semicolon. For example:

+
+
+Example-List: abc;a=1;b=2; cde_456, (ghi;jk=4 l);q="9";r=w
+
+
+

Parameters whose value is Boolean (see Section 3.3.6) true MUST omit that value when serialized. For example, the "a" parameter here is true, while the "b" parameter is false:

+
+
+Example-Integer: 1; a; b=?0
+
+
+

Note that this requirement is only on serialization; parsers are still required to correctly handle the true value when it appears in a parameter.

+

Parsers MUST support at least 256 parameters on an Item or Inner List, and support parameter keys with at least 64 characters. Field specifications can constrain the order of individual parameters, as well as their values' types as required.

+
+
+
+
+
+
+

+3.2. Dictionaries +

+

Dictionaries are ordered maps of key-value pairs, where the keys are short textual strings and the values are Items (Section 3.3) or arrays of Items, both of which can be Parameterized (Section 3.1.2). There can be zero or more members, and their keys are unique in the scope of the Dictionary they occur within.

+

Implementations MUST provide access to Dictionaries both by index and by key. Specifications MAY use either means of accessing the members.

+

As with Lists, an empty Dictionary is represented by omitting the entire field. This implies that fields defined as Dictionaries have a default empty value.

+

Typically, a field specification will define the semantics of Dictionaries by specifying the allowed type(s) for individual members by their keys, as well as whether their presence is required or optional. Recipients MUST ignore members whose keys are undefined or unknown, unless the field's specification specifically disallows them.

+

When serialized as a textual HTTP field, Members are ordered as serialized and separated by a comma with optional whitespace. Member keys cannot contain uppercase characters. Keys and values are separated by "=" (without whitespace). For example:

+
+
+Example-Dict: en="Applepie", da=:w4ZibGV0w6ZydGU=:
+
+
+

Note that in this example, the final "=" is due to the inclusion of a Byte Sequence; see Section 3.3.5.

+

Members whose value is Boolean (see Section 3.3.6) true MUST omit that value when serialized. For example, here both "b" and "c" are true:

+
+
+Example-Dict: a=?0, b, c; foo=bar
+
+
+

Note that this requirement is only on serialization; parsers are still required to correctly handle the true Boolean value when it appears in Dictionary values.

+

A Dictionary with a member whose value is an Inner List of Tokens:

+
+
+Example-Dict: rating=1.5, feelings=(joy sadness)
+
+
+

A Dictionary with a mix of Items and Inner Lists, some with parameters:

+
+
+Example-Dict: a=(1 2), b=3, c=4;aa=bb, d=(5 6);valid
+
+
+

Note that Dictionaries can have their members split across multiple lines of the same header or trailer section; for example, the following are equivalent:

+
+
+Example-Dict: foo=1, bar=2
+
+
+

and

+
+
+Example-Dict: foo=1
+Example-Dict: bar=2
+
+
+

However, individual members of a Dictionary cannot be safely split between lines; see Section 4.2 for details.

+

Parsers MUST support Dictionaries containing at least 1024 key/value pairs and keys with at least 64 characters. Field specifications can constrain the order of individual Dictionary members, as well as their values' types as required.

+
+
+
+
+

+3.3. Items +

+

An Item can be an Integer (Section 3.3.1), a Decimal (Section 3.3.2), a String (Section 3.3.3), a Token (Section 3.3.4), a Byte Sequence (Section 3.3.5), a Boolean (Section 3.3.6), or a Date (Section 3.3.7). It can have associated parameters (Section 3.1.2).

+

For example, a header field that is defined to be an Item that is an Integer might look like:

+
+
+Example-Integer: 5
+
+
+

or with parameters:

+
+
+Example-Integer: 5; foo=bar
+
+
+
+
+

+3.3.1. Integers +

+

Integers have a range of -999,999,999,999,999 to 999,999,999,999,999 inclusive (i.e., up to fifteen digits, signed), for IEEE 754 compatibility [IEEE754].

+

For example:

+
+
+Example-Integer: 42
+
+
+

Integers larger than 15 digits can be supported in a variety of ways; for example, by using a String (Section 3.3.3), a Byte Sequence (Section 3.3.5), or a parameter on an Integer that acts as a scaling factor.

+

While it is possible to serialize Integers with leading zeros (e.g., "0002", "-01") and signed zero ("-0"), these distinctions may not be preserved by implementations.

+

Note that commas in Integers are used in this section's prose only for readability; they are not valid in the wire format.

+
+
+
+
+

+3.3.2. Decimals +

+

Decimals are numbers with an integer and a fractional component. The integer component has at most 12 digits; the fractional component has at most three digits.

+

For example, a header whose value is defined as a Decimal could look like:

+
+
+Example-Decimal: 4.5
+
+
+

While it is possible to serialize Decimals with leading zeros (e.g., "0002.5", "-01.334"), trailing zeros (e.g., "5.230", "-0.40"), and signed zero (e.g., "-0.0"), these distinctions may not be preserved by implementations.

+

Note that the serialization algorithm (Section 4.1.5) rounds input with more than three digits of precision in the fractional component. If an alternative rounding strategy is desired, this should be specified by the field definition to occur before serialization.

+
+
+
+
+

+3.3.3. Strings +

+

Strings are zero or more printable ASCII [RFC0020] characters (i.e., the range %x20 to %x7E). Note that this excludes tabs, newlines, carriage returns, etc.

+

Non-ASCII characters are not directly supported in Strings, because they cause a number of interoperability issues, and -- with few exceptions -- field values do not require them.

+

When it is necessary for a field value to convey non-ASCII content, a Display String (Section 3.3.8) can be specified.

+

When serialized in a textual HTTP field, Strings are delimited with double quotes, using a backslash ("\") to escape double quotes and backslashes. For example:

+
+
+Example-String: "hello world"
+
+
+

Note that Strings only use DQUOTE as a delimiter; single quotes do not delimit Strings. Furthermore, only DQUOTE and "\" can be escaped; other characters after "\" MUST cause parsing to fail.

+

Parsers MUST support Strings (after any decoding) with at least 1024 characters.

+
+
+
+
+

+3.3.4. Tokens +

+

Tokens are short textual words that begin with an alphabetic character or "*", followed by zero to many token characters, which are the same as those allowed by the "token" ABNF rule defined in [HTTP], plus the ":" and "/" characters.

+

For example:

+
+
+Example-Token: foo123/456
+
+
+

Parsers MUST support Tokens with at least 512 characters.

+

Note that Tokens are defined largely for compatibility with the data model of existing HTTP fields, and may require additional steps to use in some implementations. As a result, new fields are encouraged to use Strings.

+
+
+
+
+

+3.3.5. Byte Sequences +

+

Byte Sequences can be conveyed in Structured Fields.

+

When serialized in a textual HTTP field, a Byte Sequence is delimited with colons and encoded using base64 ([RFC4648], Section 4). For example:

+
+
+Example-ByteSequence: :cHJldGVuZCB0aGlzIGlzIGJpbmFyeSBjb250ZW50Lg==:
+
+
+

Parsers MUST support Byte Sequences with at least 16384 octets after decoding.

+
+
+
+
+

+3.3.6. Booleans +

+

Boolean values can be conveyed in Structured Fields.

+

When serialized in a textual HTTP field, a Boolean is indicated with a leading "?" character followed by a "1" for a true value or "0" for false. For example:

+
+
+Example-Boolean: ?1
+
+
+

Note that in Dictionary (Section 3.2) and Parameter (Section 3.1.2) values, Boolean true is indicated by omitting the value.

+
+
+
+
+

+3.3.7. Dates +

+

Date values can be conveyed in Structured Fields.

+

Dates have a data model that is similar to Integers, representing a (possibly negative) delta in seconds from 1970-01-01T00:00:00Z, excluding leap seconds. Accordingly, their serialization in textual HTTP fields is similar to that of Integers, distinguished from them with a leading "@".

+

For example:

+
+
+Example-Date: @1659578233
+
+
+

Parsers MUST support Dates whose values include all days in years 1 to 9999 (i.e., -62,135,596,800 to 253,402,214,400 delta seconds from 1970-01-01T00:00:00Z).
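As an illustration (not part of the specification), the example Date value above can be cross-checked against a calendar date using Python's standard library:

```python
# Illustration only: confirm that the example Date value @1659578233 maps to a
# calendar date. The "@" prefix is stripped before the integer is interpreted
# as delta seconds from 1970-01-01T00:00:00Z.
from datetime import datetime, timezone

delta_seconds = int("@1659578233".lstrip("@"))
dt = datetime.fromtimestamp(delta_seconds, tz=timezone.utc)
print(dt.isoformat())  # 2022-08-04T01:57:13+00:00
```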

+
+
+
+
+

+3.3.8. Display Strings +

+

Display Strings are similar to Strings, in that they consist of zero or more characters, but they allow Unicode scalar values (i.e., all Unicode code points except for surrogates), unlike Strings.

+

Display Strings are intended for use in cases where a value is displayed to end users, and therefore may need to carry non-ASCII content. It is NOT RECOMMENDED that they be used in situations where a String (Section 3.3.3) or Token (Section 3.3.4) would be adequate, because Unicode has processing considerations (e.g., normalization) and security considerations (e.g., homograph attacks) that make it more difficult to handle correctly.

+

Note that Display Strings do not indicate the language used in the value; that can be done separately if necessary (e.g., with a parameter).

+

In textual HTTP fields, Display Strings are represented in a manner similar to Strings, except that non-ASCII characters are percent-encoded; there is a leading "%" to distinguish them from Strings.

+

For example:

+
+
+Example-DisplayString: %"This is intended for display to %c3%bcsers."
+
+
+

See Section 6 for additional security considerations when handling Display Strings.

+
+
+
+
+
+
+
+
+

+4. Working with Structured Fields in HTTP +

+

This section defines how to serialize and parse the abstract types defined by Section 3 into textual HTTP field values and other encodings compatible with them (e.g., in HTTP/2 [HTTP/2] before compression with HPACK [HPACK]).

+
+
+

+4.1. Serializing Structured Fields +

+

Given a structure defined in this specification, return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    If the structure is a Dictionary or List and its value is empty (i.e., it has no members), do not serialize the field at all (i.e., omit both the field-name and field-value).

    +
  2. +
  3. +

    If the structure is a List, let output_string be the result of running Serializing a List (Section 4.1.1) with the structure.

    +
  4. +
  5. +

    Else, if the structure is a Dictionary, let output_string be the result of running Serializing a Dictionary (Section 4.1.2) with the structure.

    +
  6. +
  7. +

    Else, if the structure is an Item, let output_string be the result of running Serializing an Item (Section 4.1.3) with the structure.

    +
  8. +
  9. +

    Else, fail serialization.

    +
  10. +
  11. +

    Return output_string converted into an array of bytes, using ASCII encoding [RFC0020].

    +
  12. +
+
+
+

+4.1.1. Serializing a List +

+

Given an array of (member_value, parameters) tuples as input_list, return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    Let output be an empty string.

    +
  2. +
  3. +

    For each (member_value, parameters) of input_list:

    +
      +
    1. +

      If member_value is an array, append the result of running Serializing an Inner List (Section 4.1.1.1) with (member_value, parameters) to output.

      +
    2. +
    3. +

      Otherwise, append the result of running Serializing an Item (Section 4.1.3) with (member_value, parameters) to output.

      +
    4. +
    5. +

      If more member_values remain in input_list:

      +
        +
      1. +

        Append "," to output.

        +
      2. +
      3. +

        Append a single SP to output.

        +
      4. +
      +
    6. +
    +
  4. +
  5. +

    Return output.

    +
  6. +
+
+
+
+4.1.1.1. Serializing an Inner List +
+

Given an array of (member_value, parameters) tuples as inner_list, and parameters as list_parameters, return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    Let output be the string "(".

    +
  2. +
  3. +

    For each (member_value, parameters) of inner_list:

    +
      +
    1. +

      Append the result of running Serializing an Item (Section 4.1.3) with (member_value, parameters) to output.

      +
    2. +
    3. +

      If more values remain in inner_list, append a single SP to output.

      +
    4. +
    +
  4. +
  5. +

    Append ")" to output.

    +
  6. +
  7. +

    Append the result of running Serializing Parameters (Section 4.1.1.2) with list_parameters to output.

    +
  8. +
  9. +

    Return output.

    +
  10. +
+
+
+
+
+
+4.1.1.2. Serializing Parameters +
+

Given an ordered Dictionary as input_parameters (each member having a param_key and a param_value), return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    Let output be an empty string.

    +
  2. +
  3. +

    For each param_key with a value of param_value in input_parameters:

    +
      +
    1. +

      Append ";" to output.

      +
    2. +
    3. +

      Append the result of running Serializing a Key (Section 4.1.1.3) with param_key to output.

      +
    4. +
    5. +

      If param_value is not Boolean true:

      +
        +
      1. +

        Append "=" to output.

        +
      2. +
      3. +

        Append the result of running Serializing a bare Item (Section 4.1.3.1) with param_value to output.

        +
      4. +
      +
    6. +
    +
  4. +
  5. +

    Return output.

    +
  6. +
+
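The steps above can be sketched in Python, restricted here to Integer and Boolean parameter values for brevity; the helper names are illustrative, not from the specification:

```python
# A minimal sketch of "Serializing Parameters" (Section 4.1.1.2). A Boolean
# true value is omitted after its key, per the specification.
import re

def serialize_key(key: str) -> str:
    # First character lcalpha or "*"; remainder lcalpha / DIGIT / "_" / "-" / "." / "*".
    if not re.fullmatch(r"[a-z*][a-z0-9_.*-]*", key):
        raise ValueError("invalid key")
    return key

def serialize_bare(value) -> str:
    # Restricted to Booleans and Integers for this sketch.
    if value is True:
        return "?1"
    if value is False:
        return "?0"
    if isinstance(value, int):
        return str(value)
    raise ValueError("unsupported bare item in this sketch")

def serialize_parameters(params: dict) -> str:
    out = ""
    for key, value in params.items():
        out += ";" + serialize_key(key)
        if value is not True:  # Boolean true omits its value
            out += "=" + serialize_bare(value)
    return out

print(serialize_parameters({"a": True, "b": False, "q": 9}))  # ;a;b=?0;q=9
```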
+
+
+
+
+4.1.1.3. Serializing a Key +
+

Given a key as input_key, return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    Convert input_key into a sequence of ASCII characters; if conversion fails, fail serialization.

    +
  2. +
  3. +

    If input_key contains characters not in lcalpha, DIGIT, "_", "-", ".", or "*", fail serialization.

    +
  4. +
  5. +

    If the first character of input_key is not lcalpha or "*", fail serialization.

    +
  6. +
  7. +

    Let output be an empty string.

    +
  8. +
  9. +

    Append input_key to output.

    +
  10. +
  11. +

    Return output.

    +
  12. +
+
+
+
+
+
+
+

+4.1.2. Serializing a Dictionary +

+

Given an ordered Dictionary as input_dictionary (each member having a member_key and a tuple value of (member_value, parameters)), return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    Let output be an empty string.

    +
  2. +
  3. +

    For each member_key with a value of (member_value, parameters) in input_dictionary:

    +
      +
    1. +

      Append the result of running Serializing a Key (Section 4.1.1.3) with member's member_key to output.

      +
    2. +
    3. +

      If member_value is Boolean true:

      +
        +
      1. +

        Append the result of running Serializing Parameters (Section 4.1.1.2) with parameters to output.

        +
      2. +
      +
    4. +
    5. +

      Otherwise:

      +
        +
      1. +

        Append "=" to output.

        +
      2. +
      3. +

        If member_value is an array, append the result of running Serializing an Inner List (Section 4.1.1.1) with (member_value, parameters) to output.

        +
      4. +
      5. +

        Otherwise, append the result of running Serializing an Item (Section 4.1.3) with (member_value, parameters) to output.

        +
      6. +
      +
    6. +
    7. +

      If more members remain in input_dictionary:

      +
        +
      1. +

        Append "," to output.

        +
      2. +
      3. +

        Append a single SP to output.

        +
      4. +
      +
    8. +
    +
  4. +
  5. +

    Return output.

    +
  6. +
+
+
+
+
+

+4.1.3. Serializing an Item +

+

Given an Item as bare_item and Parameters as item_parameters, return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    Let output be an empty string.

    +
  2. +
  3. +

    Append the result of running Serializing a Bare Item (Section 4.1.3.1) with bare_item to output.

    +
  4. +
  5. +

    Append the result of running Serializing Parameters (Section 4.1.1.2) with item_parameters to output.

    +
  6. +
  7. +

    Return output.

    +
  8. +
+
+
+
+4.1.3.1. Serializing a Bare Item +
+

Given an Item as input_item, return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    If input_item is an Integer, return the result of running Serializing an Integer (Section 4.1.4) with input_item.

    +
  2. +
  3. +

    If input_item is a Decimal, return the result of running Serializing a Decimal (Section 4.1.5) with input_item.

    +
  4. +
  5. +

    If input_item is a String, return the result of running Serializing a String (Section 4.1.6) with input_item.

    +
  6. +
  7. +

    If input_item is a Token, return the result of running Serializing a Token (Section 4.1.7) with input_item.

    +
  8. +
  9. +

    If input_item is a Byte Sequence, return the result of running Serializing a Byte Sequence (Section 4.1.8) with input_item.

    +
  10. +
  11. +

    If input_item is a Boolean, return the result of running Serializing a Boolean (Section 4.1.9) with input_item.

    +
  12. +
  13. +

    If input_item is a Date, return the result of running Serializing a Date (Section 4.1.10) with input_item.

    +
  14. +
  15. +

    If input_item is a Display String, return the result of running Serializing a Display String (Section 4.1.11) with input_item.

    +
  16. +
  17. +

    Otherwise, fail serialization.

    +
  18. +
+
+
+
+
+
+
+

+4.1.4. Serializing an Integer +

+

Given an Integer as input_integer, return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    If input_integer is not an integer in the range of -999,999,999,999,999 to 999,999,999,999,999 inclusive, fail serialization.

    +
  2. +
  3. +

    Let output be an empty string.

    +
  4. +
  5. +

    If input_integer is less than (but not equal to) 0, append "-" to output.

    +
  6. +
  7. +

    Append input_integer's numeric value represented in base 10 using only decimal digits to output.

    +
  8. +
  9. +

    Return output.

    +
  10. +
+
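The algorithm above reduces to very little code in most languages; a sketch (assuming Python, where str() already yields base-10 digits with a leading "-" for negatives):

```python
# A minimal sketch of "Serializing an Integer" (Section 4.1.4): only the range
# check remains. Booleans are excluded explicitly because bool subclasses int
# in Python.
def serialize_integer(i: int) -> str:
    if not isinstance(i, int) or isinstance(i, bool):
        raise ValueError("not an Integer")
    if not -999_999_999_999_999 <= i <= 999_999_999_999_999:
        raise ValueError("out of range")
    return str(i)

print(serialize_integer(-42))  # -42
```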
+
+
+
+

+4.1.5. Serializing a Decimal +

+

Given a decimal number as input_decimal, return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    If input_decimal is not a decimal number, fail serialization.

    +
  2. +
  3. +

    If input_decimal has more than three significant digits to the right of the decimal point, round it to three decimal places, rounding the final digit to the nearest value, or to the even value if it is equidistant.

    +
  4. +
  5. +

    If input_decimal has more than 12 significant digits to the left of the decimal point after rounding, fail serialization.

    +
  6. +
  7. +

    Let output be an empty string.

    +
  8. +
  9. +

    If input_decimal is less than (but not equal to) 0, append "-" to output.

    +
  10. +
  11. +

    Append input_decimal's integer component represented in base 10 (using only decimal digits) to output; if it is zero, append "0".

    +
  12. +
  13. +

    Append "." to output.

    +
  14. +
  15. +

    If input_decimal's fractional component is zero, append "0" to output.

    +
  16. +
  17. +

    Otherwise, append the significant digits of input_decimal's fractional component represented in base 10 (using only decimal digits) to output.

    +
  18. +
  19. +

    Return output.

    +
  20. +
+
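A sketch of the algorithm above using Python's decimal module, whose ROUND_HALF_EVEN mode matches the round-to-even tie-breaking the specification requires:

```python
# A sketch of "Serializing a Decimal" (Section 4.1.5): quantize to three
# fractional digits with round-half-even, then trim trailing zeros while
# keeping at least one fractional digit.
from decimal import Decimal, ROUND_HALF_EVEN

def serialize_decimal(d) -> str:
    d = Decimal(d).quantize(Decimal("0.001"), rounding=ROUND_HALF_EVEN)
    if d.copy_abs() >= Decimal(10) ** 12:
        raise ValueError("integer component has more than 12 digits")
    s = format(d, "f").rstrip("0")      # e.g. "4.500" -> "4.5"
    return s + "0" if s.endswith(".") else s

print(serialize_decimal("2.0015"))  # 2.002 (tie rounds to the even digit)
```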
+
+
+
+

+4.1.6. Serializing a String +

+

Given a String as input_string, return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    Convert input_string into a sequence of ASCII characters; if conversion fails, fail serialization.

    +
  2. +
  3. +

    If input_string contains characters in the range %x00-1f or %x7f-ff (i.e., not in VCHAR or SP), fail serialization.

    +
  4. +
  5. +

    Let output be the string DQUOTE.

    +
  6. +
  7. +

    For each character char in input_string:

    +
      +
    1. +

      If char is "\" or DQUOTE:

      +
        +
      1. +

        Append "\" to output.

        +
      2. +
      +
    2. +
    3. +

      Append char to output.

      +
    4. +
    +
  8. +
  9. +

    Append DQUOTE to output.

    +
  10. +
  11. +

    Return output.

    +
  12. +
+
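The steps above can be sketched as follows: only printable ASCII (%x20-7E) is accepted, and only DQUOTE and backslash are escaped.

```python
# A sketch of "Serializing a String" (Section 4.1.6).
def serialize_string(s: str) -> str:
    out = '"'
    for ch in s:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError("character not in VCHAR or SP")
        if ch in ('"', "\\"):
            out += "\\"      # escape DQUOTE and backslash only
        out += ch
    return out + '"'

print(serialize_string('say "hi"'))  # "say \"hi\""
```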
+
+
+
+

+4.1.7. Serializing a Token +

+

Given a Token as input_token, return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    Convert input_token into a sequence of ASCII characters; if conversion fails, fail serialization.

    +
  2. +
  3. +

    If the first character of input_token is not ALPHA or "*", or the remaining portion contains a character not in tchar, ":", or "/", fail serialization.

    +
  4. +
  5. +

    Let output be an empty string.

    +
  6. +
  7. +

    Append input_token to output.

    +
  8. +
  9. +

    Return output.

    +
  10. +
+
+
+
+
+

+4.1.8. Serializing a Byte Sequence +

+

Given a Byte Sequence as input_bytes, return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    If input_bytes is not a sequence of bytes, fail serialization.

    +
  2. +
  3. +

    Let output be an empty string.

    +
  4. +
  5. +

    Append ":" to output.

    +
  6. +
  7. +

    Append the result of base64-encoding input_bytes as per [RFC4648], Section 4, taking account of the requirements below, to output.

    +
  8. +
  9. +

    Append ":" to output.

    +
  10. +
  11. +

    Return output.

    +
  12. +
+

The encoded data is required to be padded with "=", as per [RFC4648], Section 3.2.

+

Likewise, encoded data SHOULD have pad bits set to zero, as per [RFC4648], Section 3.5, unless it is not possible to do so due to implementation constraints.
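A sketch of the algorithm using Python's standard library, whose base64.b64encode emits padded standard base64 as required above; the output reproduces the earlier Example-ByteSequence:

```python
# A sketch of "Serializing a Byte Sequence" (Section 4.1.8): colon delimiters
# around padded standard base64.
import base64

def serialize_byte_sequence(data: bytes) -> str:
    return ":" + base64.b64encode(data).decode("ascii") + ":"

print(serialize_byte_sequence(b"pretend this is binary content."))
# :cHJldGVuZCB0aGlzIGlzIGJpbmFyeSBjb250ZW50Lg==:
```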

+
+
+
+
+

+4.1.9. Serializing a Boolean +

+

Given a Boolean as input_boolean, return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    If input_boolean is not a boolean, fail serialization.

    +
  2. +
  3. +

    Let output be an empty string.

    +
  4. +
  5. +

    Append "?" to output.

    +
  6. +
  7. +

    If input_boolean is true, append "1" to output.

    +
  8. +
  9. +

    If input_boolean is false, append "0" to output.

    +
  10. +
  11. +

    Return output.

    +
  12. +
+
+
+
+
+

+4.1.10. Serializing a Date +

+

Given a Date as input_date, return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    Let output be "@".

    +
  2. +
  3. +

    Append to output the result of running Serializing an Integer with input_date (Section 4.1.4).

    +
  4. +
  5. +

    Return output.

    +
  6. +
+
+
+
+
+

+4.1.11. Serializing a Display String +

+

Given a sequence of Unicode codepoints as input_sequence, return an ASCII string suitable for use in an HTTP field value.

+
    +
  1. +

    If input_sequence is not a sequence of Unicode codepoints, fail serialization.

    +
  2. +
  3. +

    Let byte_array be the result of applying UTF-8 encoding (Section 3 of [UTF8]) to input_sequence. If encoding fails, fail serialization.

    +
  4. +
  5. +

    Let encoded_string be a string containing "%" followed by DQUOTE.

    +
  6. +
  7. +

    For each byte in byte_array:

    +
      +
    1. +

      If byte is %x25 ("%"), %x22 (DQUOTE), or in the ranges %x00-1f or %x7f-ff:

      +
        +
      1. +

        Append "%" to encoded_string.

        +
      2. +
      3. +

        Let encoded_byte be the result of applying base16 encoding (Section 8 of [RFC4648]) to byte, with any alphabetic characters converted to lowercase.

        +
      4. +
      5. +

        Append encoded_byte to encoded_string.

        +
      6. +
      +
    2. +
    3. +

      Otherwise, decode byte as an ASCII character and append the result to encoded_string.

      +
    4. +
    +
  8. +
  9. +

    Append DQUOTE to encoded_string.

    +
  10. +
  11. +

    Return encoded_string.

    +
  12. +
+

Note that [UTF8] prohibits the encoding of codepoints between U+D800 and U+DFFF (surrogates); if they occur in input_sequence, serialization will fail.
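The steps above can be sketched as follows; Python's UTF-8 encoder rejects surrogates, matching the note above.

```python
# A sketch of "Serializing a Display String" (Section 4.1.11): UTF-8 encode,
# then percent-encode "%", DQUOTE, and any byte outside printable ASCII,
# using lowercase hex digits.
def serialize_display_string(s: str) -> str:
    out = '%"'
    for byte in s.encode("utf-8"):  # fails on surrogates, as required
        if byte in (0x25, 0x22) or byte <= 0x1F or byte >= 0x7F:
            out += "%{:02x}".format(byte)
        else:
            out += chr(byte)
    return out + '"'

print(serialize_display_string("This is intended for display to üsers."))
# %"This is intended for display to %c3%bcsers."
```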

+
+
+
+
+
+
+

+4.2. Parsing Structured Fields +

+

When a receiving implementation parses HTTP fields that are known to be Structured Fields, it is important that care be taken, as there are a number of edge cases that can cause interoperability or even security problems. This section specifies the algorithm for doing so.

+

Given an array of bytes as input_bytes that represent the chosen field's field-value (which is empty if that field is not present) and field_type (one of "dictionary", "list", or "item"), return the parsed field value.

+
    +
  1. +

    Convert input_bytes into an ASCII string input_string; if conversion fails, fail parsing.

    +
  2. +
  3. +

    Discard any leading SP characters from input_string.

    +
  4. +
  5. +

    If field_type is "list", let output be the result of running Parsing a List (Section 4.2.1) with input_string.

    +
  6. +
  7. +

    If field_type is "dictionary", let output be the result of running Parsing a Dictionary (Section 4.2.2) with input_string.

    +
  8. +
  9. +

    If field_type is "item", let output be the result of running Parsing an Item (Section 4.2.3) with input_string.

    +
  10. +
  11. +

    Discard any leading SP characters from input_string.

    +
  12. +
  13. +

    If input_string is not empty, fail parsing.

    +
  14. +
  15. +

    Otherwise, return output.

    +
  16. +
+

When generating input_bytes, parsers MUST combine all field lines in the same section (header or trailer) that case-insensitively match the field name into one comma-separated field-value, as per Section 5.2 of [HTTP]; this assures that the entire field value is processed correctly.

+

For Lists and Dictionaries, this has the effect of correctly concatenating all of the field's lines, as long as individual members of the top-level data structure are not split across multiple field instances. The parsing algorithms for both types allow tab characters, since these might be used to combine field lines by some implementations.

+

Strings split across multiple field lines will have unpredictable results, because one or more commas (with optional whitespace) will become part of the string output by the parser. Since concatenation might be done by an upstream intermediary, the results are not under the control of the serializer or the parser, even when they are both under the control of the same party.

+

Tokens, Integers, Decimals, and Byte Sequences cannot be split across multiple field lines because the inserted commas will cause parsing to fail.

+

Parsers MAY fail when processing a field value spread across multiple field lines, when one of those lines does not parse as that field. For example, a parser handling an Example-String field that is defined as an sf-string is allowed to fail when processing this field section:

+
+
+Example-String: "foo
+Example-String: bar"
+
+
+
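The hazard can be demonstrated directly: the mandatory comma-join of field lines turns the two lines above into a single value in which the quotes silently capture the separator.

```python
# Illustration: joining the two Example-String field lines above with ", "
# (as Section 5.2 of [HTTP] requires) produces a single quoted String that
# contains the separator as data.
lines = ['"foo', 'bar"']
combined = ", ".join(lines)
print(combined)  # "foo, bar"
```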

If parsing fails, either the entire field value MUST be ignored (i.e., treated as if the field were not present in the section), or alternatively the complete HTTP message MUST be treated as malformed. This is intentionally strict to improve interoperability and safety, and field specifications that use Structured Fields are not allowed to loosen this requirement.

+

Note that this requirement does not apply to an implementation that is not parsing the field; for example, an intermediary is not required to strip a failing field from a message before forwarding it.

+
+
+

4.2.1. Parsing a List

Given an ASCII string as input_string, return an array of (item_or_inner_list, parameters) tuples. input_string is modified to remove the parsed value.

1. Let members be an empty array.
2. While input_string is not empty:
   1. Append the result of running Parsing an Item or Inner List (Section 4.2.1.1) with input_string to members.
   2. Discard any leading OWS characters from input_string.
   3. If input_string is empty, return members.
   4. Consume the first character of input_string; if it is not ",", fail parsing.
   5. Discard any leading OWS characters from input_string.
   6. If input_string is empty, there is a trailing comma; fail parsing.
3. No structured data has been found; return members (which is empty).
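The loop above can be sketched in Python. This is an illustrative, non-normative sketch: instead of modifying input_string in place, the functions return the parsed value plus the remaining input, and, to keep the example self-contained, list members are limited to bare Tokens without parameters, where a real parser would run the full "Parsing an Item or Inner List" algorithm. All names are illustrative and ASCII input is assumed.

```python
import string

def parse_token_stub(s: str) -> tuple[str, str]:
    # Stand-in for "Parsing an Item or Inner List": a bare Token,
    # no parameters. Returns (token, remaining input).
    if not s or not (s[0] in string.ascii_letters or s[0] == "*"):
        raise ValueError("parse failure")
    tchar = string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~" + ":/"
    i = 0
    while i < len(s) and s[i] in tchar:
        i += 1
    return s[:i], s[i:]

def parse_list(s: str) -> list[str]:
    members = []                      # 1. empty array
    while s:                          # 2. while input_string is not empty
        member, s = parse_token_stub(s)
        members.append(member)        # 2.1 append parsed member
        s = s.lstrip(" \t")           # 2.2 discard leading OWS
        if not s:                     # 2.3 end of input: done
            return members
        if s[0] != ",":               # 2.4 members are comma-separated
            raise ValueError("parse failure")
        s = s[1:].lstrip(" \t")       # 2.4/2.5 consume "," and OWS
        if not s:                     # 2.6 trailing comma
            raise ValueError("trailing comma")
    return members                    # 3. empty input -> empty List
```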
4.2.1.1. Parsing an Item or Inner List

Given an ASCII string as input_string, return the tuple (item_or_inner_list, parameters), where item_or_inner_list can be either a single bare item or an array of (bare_item, parameters) tuples. input_string is modified to remove the parsed value.

1. If the first character of input_string is "(", return the result of running Parsing an Inner List (Section 4.2.1.2) with input_string.
2. Return the result of running Parsing an Item (Section 4.2.3) with input_string.
4.2.1.2. Parsing an Inner List

Given an ASCII string as input_string, return the tuple (inner_list, parameters), where inner_list is an array of (bare_item, parameters) tuples. input_string is modified to remove the parsed value.

1. Consume the first character of input_string; if it is not "(", fail parsing.
2. Let inner_list be an empty array.
3. While input_string is not empty:
   1. Discard any leading SP characters from input_string.
   2. If the first character of input_string is ")":
      1. Consume the first character of input_string.
      2. Let parameters be the result of running Parsing Parameters (Section 4.2.3.2) with input_string.
      3. Return the tuple (inner_list, parameters).
   3. Let item be the result of running Parsing an Item (Section 4.2.3) with input_string.
   4. Append item to inner_list.
   5. If the first character of input_string is not SP or ")", fail parsing.
4. The end of the Inner List was not found; fail parsing.

4.2.2. Parsing a Dictionary

Given an ASCII string as input_string, return an ordered map whose values are (item_or_inner_list, parameters) tuples. input_string is modified to remove the parsed value.

1. Let dictionary be an empty, ordered map.
2. While input_string is not empty:
   1. Let this_key be the result of running Parsing a Key (Section 4.2.3.3) with input_string.
   2. If the first character of input_string is "=":
      1. Consume the first character of input_string.
      2. Let member be the result of running Parsing an Item or Inner List (Section 4.2.1.1) with input_string.
   3. Otherwise:
      1. Let value be Boolean true.
      2. Let parameters be the result of running Parsing Parameters (Section 4.2.3.2) with input_string.
      3. Let member be the tuple (value, parameters).
   4. If dictionary already contains a key this_key (comparing character for character), overwrite its value with member.
   5. Otherwise, append key this_key with value member to dictionary.
   6. Discard any leading OWS characters from input_string.
   7. If input_string is empty, return dictionary.
   8. Consume the first character of input_string; if it is not ",", fail parsing.
   9. Discard any leading OWS characters from input_string.
   10. If input_string is empty, there is a trailing comma; fail parsing.
3. No structured data has been found; return dictionary (which is empty).

Note that when duplicate Dictionary keys are encountered, all but the last instance are ignored.
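As a rough, non-normative Python sketch of the loop above: member values here are limited to bare Tokens (no parameters, no Inner Lists), where a real parser would dispatch to "Parsing an Item or Inner List". Note that a Python dict preserves insertion order and that overwriting a duplicate key keeps its original position, matching the spec's ordered-map semantics. ASCII input is assumed.

```python
import string

def parse_dictionary(s: str) -> dict:
    d: dict = {}
    key_chars = string.ascii_lowercase + string.digits + "_-.*"
    tchars = string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~:/"
    while s:
        # Parsing a Key (inlined, simplified)
        if s[0] not in string.ascii_lowercase + "*":
            raise ValueError("parse failure")
        key = ""
        while s and s[0] in key_chars:
            key, s = key + s[0], s[1:]
        if s[:1] == "=":
            s = s[1:]
            # Simplified member value: a bare Token only.
            if not s or not (s[0] in string.ascii_letters or s[0] == "*"):
                raise ValueError("unsupported member value in this sketch")
            member = ""
            while s and s[0] in tchars:
                member, s = member + s[0], s[1:]
        else:
            member = True            # bare key -> Boolean true
        d[key] = member              # duplicate keys: last instance wins
        s = s.lstrip(" \t")          # discard leading OWS
        if not s:
            return d
        if s[0] != ",":
            raise ValueError("parse failure")
        s = s[1:].lstrip(" \t")      # consume "," and following OWS
        if not s:
            raise ValueError("trailing comma")
    return d                         # empty input -> empty Dictionary
```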

4.2.3. Parsing an Item

Given an ASCII string as input_string, return a (bare_item, parameters) tuple. input_string is modified to remove the parsed value.

1. Let bare_item be the result of running Parsing a Bare Item (Section 4.2.3.1) with input_string.
2. Let parameters be the result of running Parsing Parameters (Section 4.2.3.2) with input_string.
3. Return the tuple (bare_item, parameters).
4.2.3.1. Parsing a Bare Item

Given an ASCII string as input_string, return a bare Item. input_string is modified to remove the parsed value.

1. If the first character of input_string is a "-" or a DIGIT, return the result of running Parsing an Integer or Decimal (Section 4.2.4) with input_string.
2. If the first character of input_string is a DQUOTE, return the result of running Parsing a String (Section 4.2.5) with input_string.
3. If the first character of input_string is an ALPHA or "*", return the result of running Parsing a Token (Section 4.2.6) with input_string.
4. If the first character of input_string is ":", return the result of running Parsing a Byte Sequence (Section 4.2.7) with input_string.
5. If the first character of input_string is "?", return the result of running Parsing a Boolean (Section 4.2.8) with input_string.
6. If the first character of input_string is "@", return the result of running Parsing a Date (Section 4.2.9) with input_string.
7. If the first character of input_string is "%", return the result of running Parsing a Display String (Section 4.2.10) with input_string.
8. Otherwise, the item type is unrecognized; fail parsing.
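The dispatch above keys entirely on the first character. A minimal illustrative helper (not part of the specification) that only names the branch taken, rather than invoking the sub-algorithms, might look like this; ASCII input is assumed:

```python
def bare_item_kind(s: str) -> str:
    # Mirror the first-character dispatch of "Parsing a Bare Item".
    c = s[:1]
    if c == "-" or c.isdigit():
        return "integer or decimal"
    if c == '"':
        return "string"
    if c.isalpha() or c == "*":
        return "token"
    if c == ":":
        return "byte sequence"
    if c == "?":
        return "boolean"
    if c == "@":
        return "date"
    if c == "%":
        return "display string"
    raise ValueError("unrecognized item type")
```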
4.2.3.2. Parsing Parameters

Given an ASCII string as input_string, return an ordered map whose values are bare Items. input_string is modified to remove the parsed value.

1. Let parameters be an empty, ordered map.
2. While input_string is not empty:
   1. If the first character of input_string is not ";", exit the loop.
   2. Consume the ";" character from the beginning of input_string.
   3. Discard any leading SP characters from input_string.
   4. Let param_key be the result of running Parsing a Key (Section 4.2.3.3) with input_string.
   5. Let param_value be Boolean true.
   6. If the first character of input_string is "=":
      1. Consume the "=" character at the beginning of input_string.
      2. Let param_value be the result of running Parsing a Bare Item (Section 4.2.3.1) with input_string.
   7. If parameters already contains a key param_key (comparing character for character), overwrite its value with param_value.
   8. Otherwise, append key param_key with value param_value to parameters.
3. Return parameters.

Note that when duplicate parameter keys are encountered, all but the last instance are ignored.
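A non-normative Python sketch of the loop above; to stay self-contained, bare-item values are limited to non-negative integers (a full implementation dispatches to "Parsing a Bare Item"), and the ordered-map behavior is modeled with a plain dict. ASCII input and illustrative names assumed.

```python
import string

def parse_parameters(s: str) -> tuple[dict, str]:
    params: dict = {}
    while s and s[0] == ";":              # 2.1/2.2 consume ";"
        s = s[1:].lstrip(" ")             # 2.3 discard leading SP
        # 2.4 Parsing a Key (inlined, simplified)
        if not s or s[0] not in string.ascii_lowercase + "*":
            raise ValueError("parse failure")
        key = ""
        while s and s[0] in string.ascii_lowercase + string.digits + "_-.*":
            key, s = key + s[0], s[1:]
        value = True                      # 2.5 default is Boolean true
        if s[:1] == "=":                  # 2.6 explicit value follows
            s = s[1:]
            digits = ""
            while s and s[0] in "0123456789":
                digits, s = digits + s[0], s[1:]
            if not digits:
                raise ValueError("unsupported bare item in this sketch")
            value = int(digits)
        params[key] = value               # 2.7/2.8 last instance wins
    return params, s                      # 3. return the ordered map
```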
4.2.3.3. Parsing a Key

Given an ASCII string as input_string, return a key. input_string is modified to remove the parsed value.

1. If the first character of input_string is not lcalpha or "*", fail parsing.
2. Let output_string be an empty string.
3. While input_string is not empty:
   1. If the first character of input_string is not one of lcalpha, DIGIT, "_", "-", ".", or "*", return output_string.
   2. Let char be the result of consuming the first character of input_string.
   3. Append char to output_string.
4. Return output_string.
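This algorithm translates almost directly to Python. A non-normative sketch, returning the key together with the remaining input rather than modifying it in place:

```python
import string

def parse_key(s: str) -> tuple[str, str]:
    # Step 1: first character must be lcalpha or "*".
    if not s or s[0] not in string.ascii_lowercase + "*":
        raise ValueError("parse failure")
    allowed = string.ascii_lowercase + string.digits + "_-.*"
    out = []
    # Steps 3.1-3.3: consume while the character is a valid key character.
    while s and s[0] in allowed:
        out.append(s[0])
        s = s[1:]
    return "".join(out), s
```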

4.2.4. Parsing an Integer or Decimal

Given an ASCII string as input_string, return an Integer or Decimal. input_string is modified to remove the parsed value.

NOTE: This algorithm parses both Integers (Section 3.3.1) and Decimals (Section 3.3.2), and returns the corresponding structure.

1. Let type be "integer".
2. Let sign be 1.
3. Let input_number be an empty string.
4. If the first character of input_string is "-", consume it and set sign to -1.
5. If input_string is empty, there is an empty integer; fail parsing.
6. If the first character of input_string is not a DIGIT, fail parsing.
7. While input_string is not empty:
   1. Let char be the result of consuming the first character of input_string.
   2. If char is a DIGIT, append it to input_number.
   3. Else, if type is "integer" and char is ".":
      1. If input_number contains more than 12 characters, fail parsing.
      2. Otherwise, append char to input_number and set type to "decimal".
   4. Otherwise, prepend char to input_string, and exit the loop.
   5. If type is "integer" and input_number contains more than 15 characters, fail parsing.
   6. If type is "decimal" and input_number contains more than 16 characters, fail parsing.
8. If type is "integer":
   1. Let output_number be an Integer that is the result of parsing input_number as an integer.
9. Otherwise:
   1. If the final character of input_number is ".", fail parsing.
   2. If the number of characters after "." in input_number is greater than three, fail parsing.
   3. Let output_number be a Decimal that is the result of parsing input_number as a decimal number.
10. Let output_number be the product of output_number and sign.
11. Return output_number.
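A non-normative Python sketch of the steps above, returning the number and the remaining input. Decimals are represented as Python floats here for brevity; a careful implementation might prefer a fixed-point representation. ASCII input is assumed.

```python
def parse_number(s: str):
    typ = "integer"                       # 1.
    sign = 1                              # 2.
    num = ""                              # 3.
    if s[:1] == "-":                      # 4. optional leading minus
        sign, s = -1, s[1:]
    if not s:                             # 5. empty integer
        raise ValueError("empty integer")
    if s[0] not in "0123456789":          # 6. must start with a DIGIT
        raise ValueError("not a number")
    while s:                              # 7.
        char, s = s[0], s[1:]
        if char in "0123456789":          # 7.2
            num += char
        elif typ == "integer" and char == ".":   # 7.3
            if len(num) > 12:
                raise ValueError("integer component too long")
            num += char
            typ = "decimal"
        else:                             # 7.4 put char back; stop
            s = char + s
            break
        if typ == "integer" and len(num) > 15:   # 7.5
            raise ValueError("integer too long")
        if typ == "decimal" and len(num) > 16:   # 7.6
            raise ValueError("decimal too long")
    if typ == "integer":
        value = int(num)                  # 8.
    else:
        if num.endswith("."):             # 9.1 trailing dot
            raise ValueError("trailing dot")
        if len(num.split(".")[1]) > 3:    # 9.2 at most three fractional digits
            raise ValueError("too many fractional digits")
        value = float(num)                # 9.3
    return value * sign, s                # 10./11.
```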

4.2.5. Parsing a String

Given an ASCII string as input_string, return an unquoted String. input_string is modified to remove the parsed value.

1. Let output_string be an empty string.
2. If the first character of input_string is not DQUOTE, fail parsing.
3. Discard the first character of input_string.
4. While input_string is not empty:
   1. Let char be the result of consuming the first character of input_string.
   2. If char is a backslash ("\"):
      1. If input_string is now empty, fail parsing.
      2. Let next_char be the result of consuming the first character of input_string.
      3. If next_char is not DQUOTE or "\", fail parsing.
      4. Append next_char to output_string.
   3. Else, if char is DQUOTE, return output_string.
   4. Else, if char is in the range %x00-1f or %x7f-ff (i.e., it is not in VCHAR or SP), fail parsing.
   5. Else, append char to output_string.
5. Reached the end of input_string without finding a closing DQUOTE; fail parsing.
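A non-normative Python sketch of this algorithm, with the (value, rest) convention standing in for in-place modification of input_string; ASCII input is assumed:

```python
def parse_string(s: str) -> tuple[str, str]:
    if s[:1] != '"':                      # 2. must open with DQUOTE
        raise ValueError("parse failure")
    s = s[1:]                             # 3. discard opening DQUOTE
    out = []
    while s:                              # 4.
        char, s = s[0], s[1:]
        if char == "\\":                  # 4.2 escape sequence
            if not s:
                raise ValueError("truncated escape")
            nxt, s = s[0], s[1:]
            if nxt not in ('"', "\\"):    # only \" and \\ are allowed
                raise ValueError("invalid escape")
            out.append(nxt)
        elif char == '"':                 # 4.3 closing DQUOTE
            return "".join(out), s
        elif not (" " <= char <= "~"):    # 4.4 outside VCHAR / SP
            raise ValueError("invalid character")
        else:
            out.append(char)              # 4.5
    raise ValueError("no closing DQUOTE")  # 5.
```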

4.2.6. Parsing a Token

Given an ASCII string as input_string, return a Token. input_string is modified to remove the parsed value.

1. If the first character of input_string is not ALPHA or "*", fail parsing.
2. Let output_string be an empty string.
3. While input_string is not empty:
   1. If the first character of input_string is not in tchar, ":", or "/", return output_string.
   2. Let char be the result of consuming the first character of input_string.
   3. Append char to output_string.
4. Return output_string.
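A non-normative Python sketch; the tchar set is the HTTP token character set from [HTTP], and ASCII input is assumed:

```python
import string

# tchar per HTTP, plus the ":" and "/" extensions allowed in sf-token.
TCHAR = "!#$%&'*+-.^_`|~" + string.digits + string.ascii_letters

def parse_token(s: str) -> tuple[str, str]:
    # Step 1: a Token must begin with ALPHA or "*".
    if not s or not (s[0] in string.ascii_letters or s[0] == "*"):
        raise ValueError("parse failure")
    out = []
    # Steps 3.1-3.3: consume tchar / ":" / "/" characters.
    while s and s[0] in TCHAR + ":/":
        out.append(s[0])
        s = s[1:]
    return "".join(out), s
```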

4.2.7. Parsing a Byte Sequence

Given an ASCII string as input_string, return a Byte Sequence. input_string is modified to remove the parsed value.

1. If the first character of input_string is not ":", fail parsing.
2. Discard the first character of input_string.
3. If there is not a ":" character before the end of input_string, fail parsing.
4. Let b64_content be the result of consuming content of input_string up to but not including the first instance of the character ":".
5. Consume the ":" character at the beginning of input_string.
6. If b64_content contains a character not included in ALPHA, DIGIT, "+", "/", and "=", fail parsing.
7. Let binary_content be the result of base64-decoding [RFC4648] b64_content, synthesizing padding if necessary (note the requirements about recipient behavior below). If base64 decoding fails, parsing fails.
8. Return binary_content.

Because some implementations of base64 do not allow rejection of encoded data that is not properly "=" padded (see [RFC4648], Section 3.2), parsers SHOULD NOT fail when "=" padding is not present, unless they cannot be configured to do so.

Because some implementations of base64 do not allow rejection of encoded data that has non-zero pad bits (see [RFC4648], Section 3.5), parsers SHOULD NOT fail when non-zero pad bits are present, unless they cannot be configured to do so.

This specification does not relax the requirements in Sections 3.1 and 3.3 of [RFC4648]; therefore, parsers MUST fail on characters outside the base64 alphabet and on line feeds in encoded data.
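A non-normative Python sketch using the standard library's base64 module. The alphabet is checked explicitly before decoding (step 6), and missing "=" padding is synthesized per step 7; Python's default decoder also accepts non-zero pad bits, matching the note above.

```python
import base64
import string

B64_ALPHABET = string.ascii_letters + string.digits + "+/="

def parse_byte_sequence(s: str) -> tuple[bytes, str]:
    if s[:1] != ":":                      # 1. must open with ":"
        raise ValueError("parse failure")
    s = s[1:]                             # 2.
    end = s.find(":")                     # 3./4. locate closing ":"
    if end == -1:
        raise ValueError("no closing colon")
    b64, s = s[:end], s[end + 1:]         # 5. consume content and ":"
    if any(c not in B64_ALPHABET for c in b64):   # 6. alphabet check
        raise ValueError("invalid base64 character")
    padded = b64 + "=" * (-len(b64) % 4)  # 7. synthesize "=" padding
    return base64.b64decode(padded), s    # 8.
```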

4.2.8. Parsing a Boolean

Given an ASCII string as input_string, return a Boolean. input_string is modified to remove the parsed value.

1. If the first character of input_string is not "?", fail parsing.
2. Discard the first character of input_string.
3. If the first character of input_string matches "1", discard the first character, and return true.
4. If the first character of input_string matches "0", discard the first character, and return false.
5. No value has matched; fail parsing.
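This one is trivial; a non-normative Python sketch:

```python
def parse_boolean(s: str) -> tuple[bool, str]:
    if s[:1] != "?":                  # 1. must open with "?"
        raise ValueError("parse failure")
    if s[1:2] == "1":                 # 3. "?1" is true
        return True, s[2:]
    if s[1:2] == "0":                 # 4. "?0" is false
        return False, s[2:]
    raise ValueError("parse failure")  # 5. anything else fails
```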

4.2.9. Parsing a Date

Given an ASCII string as input_string, return a Date. input_string is modified to remove the parsed value.

1. If the first character of input_string is not "@", fail parsing.
2. Discard the first character of input_string.
3. Let output_date be the result of running Parsing an Integer or Decimal (Section 4.2.4) with input_string.
4. If output_date is a Decimal, fail parsing.
5. Return output_date.
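A non-normative Python sketch; for self-containment it inlines a simplified integer parse and approximates "fail if the result is a Decimal" by rejecting a "." after the digits, rather than invoking the full Section 4.2.4 algorithm:

```python
def parse_date(s: str) -> tuple[int, str]:
    if s[:1] != "@":                  # 1. Dates are prefixed with "@"
        raise ValueError("parse failure")
    s = s[1:]                         # 2.
    sign = 1
    if s[:1] == "-":
        sign, s = -1, s[1:]
    digits = ""
    while s and s[0] in "0123456789":
        digits, s = digits + s[0], s[1:]
    if not digits or len(digits) > 15:
        raise ValueError("parse failure")
    if s[:1] == ".":                  # 4. a Decimal payload fails
        raise ValueError("Dates must be Integers")
    return sign * int(digits), s      # 5.
```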

4.2.10. Parsing a Display String

Given an ASCII string as input_string, return a sequence of Unicode codepoints. input_string is modified to remove the parsed value.

1. If the first two characters of input_string are not "%" followed by DQUOTE, fail parsing.
2. Discard the first two characters of input_string.
3. Let byte_array be an empty byte array.
4. While input_string is not empty:
   1. Let char be the result of consuming the first character of input_string.
   2. If char is in the range %x00-1f or %x7f-ff (i.e., it is not in VCHAR or SP), fail parsing.
   3. If char is "%":
      1. Let octet_hex be the result of consuming two characters from input_string. If there are not two characters, fail parsing.
      2. If octet_hex contains characters outside the range %x30-39 or %x61-66 (i.e., it is not in 0-9 or lowercase a-f), fail parsing.
      3. Let octet be the result of hex decoding octet_hex (Section 8 of [RFC4648]).
      4. Append octet to byte_array.
   4. If char is DQUOTE:
      1. Let unicode_sequence be the result of decoding byte_array as a UTF-8 string (Section 3 of [UTF8]). Fail parsing if decoding fails.
      2. Return unicode_sequence.
   5. Otherwise, if char is not "%" or DQUOTE:
      1. Let byte be the result of applying ASCII encoding to char.
      2. Append byte to byte_array.
5. Reached the end of input_string without finding a closing DQUOTE; fail parsing.
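A non-normative Python sketch of the algorithm above. Percent escapes are accumulated into a byte array that is UTF-8-decoded only at the closing DQUOTE, exactly as the steps describe; Python's strict "utf-8" codec provides the required decode failure.

```python
def parse_display_string(s: str) -> tuple[str, str]:
    if s[:2] != '%"':                     # 1. must open with %"
        raise ValueError("parse failure")
    s = s[2:]                             # 2.
    ba = bytearray()                      # 3.
    while s:                              # 4.
        char, s = s[0], s[1:]
        if not (" " <= char <= "~"):      # 4.2 outside VCHAR / SP
            raise ValueError("invalid character")
        if char == "%":                   # 4.3 percent escape
            octet_hex, s = s[:2], s[2:]
            if len(octet_hex) != 2 or any(
                    c not in "0123456789abcdef" for c in octet_hex):
                raise ValueError("bad percent escape")
            ba.append(int(octet_hex, 16))
        elif char == '"':                 # 4.4 closing DQUOTE: decode
            try:
                return ba.decode("utf-8"), s
            except UnicodeDecodeError:
                raise ValueError("invalid UTF-8") from None
        else:                             # 4.5 literal ASCII byte
            ba.append(char.encode("ascii")[0])
    raise ValueError("no closing DQUOTE")  # 5.
```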

5. IANA Considerations

Please add the following note to the "Hypertext Transfer Protocol (HTTP) Field Name Registry":

*  The "Structured Type" column indicates the type of the field (per RFC nnnn), if any, and may be "Dictionary", "List", or "Item".

   Note that field names beginning with characters other than ALPHA or "*" will not be able to be represented as a Structured Fields Token, and therefore may be incompatible with being mapped into field values that refer to it.

Then, add a new column, "Structured Type".

Then, add the indicated Structured Type for each existing registry entry listed in Table 1.

   Field Name                                Structured Type
   ----------------------------------------  ---------------
   Accept-CH                                 List
   Cache-Status                              List
   CDN-Cache-Control                         Dictionary
   Cross-Origin-Embedder-Policy              Item
   Cross-Origin-Embedder-Policy-Report-Only  Item
   Cross-Origin-Opener-Policy                Item
   Cross-Origin-Opener-Policy-Report-Only    Item
   Origin-Agent-Cluster                      Item
   Priority                                  Dictionary
   Proxy-Status                              List

                       Table 1: Existing Fields

6. Security Considerations

The size of most types defined by Structured Fields is not limited; as a result, extremely large fields could be an attack vector (e.g., for resource consumption). Most HTTP implementations limit the sizes of individual fields as well as the overall header or trailer section size to mitigate such attacks.

It is possible for parties with the ability to inject new HTTP fields to change the meaning of a Structured Field. In some circumstances, this will cause parsing to fail, but it is not possible to reliably fail in all such circumstances.

The Display String type can convey any possible Unicode code point without sanitization; for example, a Display String might contain unassigned code points, control characters (including NUL), or noncharacters. Therefore, applications consuming Display Strings need to consider strategies such as filtering or escaping untrusted content before displaying it. See [PRECIS] and [UNICODE-SECURITY].

7. References

7.1. Normative References

[HTTP]      Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, <https://www.rfc-editor.org/rfc/rfc9110>.

[RFC0020]   Cerf, V., "ASCII format for network interchange", STD 80, RFC 20, DOI 10.17487/RFC0020, <https://www.rfc-editor.org/rfc/rfc20>.

[RFC2119]   Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/rfc/rfc2119>.

[RFC4648]   Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, DOI 10.17487/RFC4648, <https://www.rfc-editor.org/rfc/rfc4648>.

[RFC8174]   Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/rfc/rfc8174>.

[UTF8]      Yergeau, F., "UTF-8, a transformation format of ISO 10646", STD 63, RFC 3629, <http://www.rfc-editor.org/info/std63>.

7.2. Informative References

[HPACK]     Peon, R. and H. Ruellan, "HPACK: Header Compression for HTTP/2", RFC 7541, DOI 10.17487/RFC7541, <https://www.rfc-editor.org/rfc/rfc7541>.

[HTTP/2]    Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113, DOI 10.17487/RFC9113, <https://www.rfc-editor.org/rfc/rfc9113>.

[IEEE754]   IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE 754-2019, DOI 10.1109/IEEESTD.2019.8766229, ISBN 978-1-5044-5924-2, <https://ieeexplore.ieee.org/document/8766229>.

[PRECIS]    Saint-Andre, P. and M. Blanchet, "PRECIS Framework: Preparation, Enforcement, and Comparison of Internationalized Strings in Application Protocols", RFC 8264, DOI 10.17487/RFC8264, <https://www.rfc-editor.org/rfc/rfc8264>.

[RFC5234]   Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, DOI 10.17487/RFC5234, <https://www.rfc-editor.org/rfc/rfc5234>.

[RFC7493]   Bray, T., Ed., "The I-JSON Message Format", RFC 7493, DOI 10.17487/RFC7493, <https://www.rfc-editor.org/rfc/rfc7493>.

[RFC8259]   Bray, T., Ed., "The JavaScript Object Notation (JSON) Data Interchange Format", STD 90, RFC 8259, DOI 10.17487/RFC8259, <https://www.rfc-editor.org/rfc/rfc8259>.

[UNICODE-SECURITY]  Davis, M. and M. Suignard, "Unicode Security Considerations", Unicode Technical Report #16, <http://www.unicode.org/reports/tr36/>.

Appendix A. Frequently Asked Questions

A.1. Why Not JSON?

Earlier proposals for Structured Fields were based upon JSON [RFC8259]. However, constraining its use to make it suitable for HTTP fields required senders and recipients to implement specific additional handling.

For example, JSON has specification issues around large numbers and objects with duplicate members. Although advice for avoiding these issues is available (e.g., [RFC7493]), it cannot be relied upon.

Likewise, JSON strings are by default Unicode strings, which have a number of potential interoperability issues (e.g., in comparison). Although implementers can be advised to avoid non-ASCII content where unnecessary, this is difficult to enforce.

Another example is JSON's ability to nest content to arbitrary depths. Since the resulting memory commitment might be unsuitable (e.g., in embedded and other limited server deployments), it's necessary to limit it in some fashion; however, existing JSON implementations have no such limits, and even if a limit is specified, it's likely that some field definition will find a need to violate it.

Because of JSON's broad adoption and implementation, it is difficult to impose such additional constraints across all implementations; some deployments would fail to enforce them, thereby harming interoperability. In short, if it looks like JSON, people will be tempted to use a JSON parser/serializer on field values.

Since a major goal for Structured Fields is to improve interoperability and simplify implementation, these concerns led to a format that requires a dedicated parser and serializer.

Additionally, there were widely shared feelings that JSON doesn't "look right" in HTTP fields.

Appendix B. Implementation Notes

A generic implementation of this specification should expose the top-level serialize (Section 4.1) and parse (Section 4.2) functions. They need not be functions; for example, it could be implemented as an object, with methods for each of the different top-level types.

For interoperability, it's important that generic implementations be complete and follow the algorithms closely; see Section 1.1. To aid this, a common test suite is being maintained by the community at <https://github.com/httpwg/structured-field-tests>.

Implementers should note that Dictionaries and Parameters are order-preserving maps. Some fields may not convey meaning in the ordering of these data types, but it should still be exposed so that it will be available to applications that need to use it.

Likewise, implementations should note that it's important to preserve the distinction between Tokens and Strings. While most programming languages have native types that map to the other types well, it may be necessary to create a wrapper "token" object or use a parameter on functions to assure that these types remain separate.

The serialization algorithm is defined in a way that it is not strictly limited to the data types defined in Section 3 in every case. For example, Decimals are designed to take broader input and round to allowed values.

Implementations are allowed to limit the size of different structures, subject to the minimums defined for each type. When a structure exceeds an implementation limit, that structure fails parsing or serialization.

Appendix C. ABNF

This section uses the Augmented Backus-Naur Form (ABNF) notation [RFC5234] to illustrate the expected syntax of Structured Fields. However, it cannot be used to validate their syntax, because it does not capture all requirements.

This section is non-normative. If there is disagreement between the parsing algorithms and ABNF, the specified algorithms take precedence.

sf-list       = list-member *( OWS "," OWS list-member )
list-member   = sf-item / inner-list

inner-list    = "(" *SP [ sf-item *( 1*SP sf-item ) *SP ] ")"
                parameters

parameters    = *( ";" *SP parameter )
parameter     = param-key [ "=" param-value ]
param-key     = key
key           = ( lcalpha / "*" )
                *( lcalpha / DIGIT / "_" / "-" / "." / "*" )
lcalpha       = %x61-7A ; a-z
param-value   = bare-item

sf-dictionary = dict-member *( OWS "," OWS dict-member )
dict-member   = member-key ( parameters / ( "=" member-value ))
member-key    = key
member-value  = sf-item / inner-list

sf-item   = bare-item parameters
bare-item = sf-integer / sf-decimal / sf-string / sf-token
            / sf-binary / sf-boolean / sf-date / sf-displaystring

sf-integer       = ["-"] 1*15DIGIT
sf-decimal       = ["-"] 1*12DIGIT "." 1*3DIGIT
sf-string        = DQUOTE *( unescaped / "%" / bs-escaped ) DQUOTE
sf-token         = ( ALPHA / "*" ) *( tchar / ":" / "/" )
sf-binary        = ":" base64 ":"
sf-boolean       = "?" ( "0" / "1" )
sf-date          = "@" sf-integer
sf-displaystring = "%" DQUOTE *( unescaped / "\" / pct-encoded )
                   DQUOTE

base64       = *( ALPHA / DIGIT / "+" / "/" ) *"="

unescaped    = %x20-21 / %x23-24 / %x26-5B / %x5D-7E
bs-escaped   = "\" ( DQUOTE / "\" )

pct-encoded  = "%" lc-hexdig lc-hexdig
lc-hexdig    = DIGIT / %x61-66 ; 0-9, a-f

Appendix D. Changes from RFC 8941

This revision of the Structured Field Values for HTTP specification has made the following changes:

*  Added the Date structured type. (Section 3.3.7)

*  Stopped encouraging use of ABNF in definitions of new structured fields. (Section 2)

*  Moved ABNF to an informative appendix. (Appendix C)

*  Added a "Structured Type" column to the HTTP Field Name Registry. (Section 5)

*  Refined parse failure handling. (Section 4.2)

*  Added the Display String structured type. (Section 3.3.8)

Acknowledgements

Many thanks to Matthew Kerwin for his detailed feedback and careful consideration during the development of this specification.

Thanks also to Ian Clelland, Roy Fielding, Anne van Kesteren, Kazuho Oku, Evert Pot, Julian Reschke, Martin Thomson, Mike West, and Jeffrey Yasskin for their contributions.

Authors' Addresses

Mark Nottingham
Cloudflare
Prahran VIC
Australia

Poul-Henning Kamp
The Varnish Cache Project
diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-sfbis.txt b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-sfbis.txt
new file mode 100644
index 000000000..995e80be4
--- /dev/null
+++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-sfbis.txt
@@ -0,0 +1,2011 @@

HTTP                                                       M. Nottingham
Internet-Draft                                                Cloudflare
Obsoletes: 8941 (if approved)                                 P.-H. Kamp
Intended status: Standards Track               The Varnish Cache Project
Expires: 11 July 2025                                     7 January 2025


                    Structured Field Values for HTTP
                      draft-ietf-httpbis-sfbis-latest

Abstract

   This document describes a set of data types and associated
   algorithms that are intended to make it easier and safer to define
   and handle HTTP header and trailer fields, known as "Structured
   Fields", "Structured Headers", or "Structured Trailers".  It is
   intended for use by specifications of new HTTP fields that wish to
   use a common syntax that is more restrictive than traditional HTTP
   field values.

   This document obsoletes RFC 8941.

About This Document

   This note is to be removed before publishing as an RFC.

   Status information for this document may be found at
   https://datatracker.ietf.org/doc/draft-ietf-httpbis-sfbis/.

   Discussion of this document takes place on the HTTP Working Group
   mailing list (mailto:ietf-http-wg@w3.org), which is archived at
   https://lists.w3.org/Archives/Public/ietf-http-wg/.  Working Group
   information can be found at https://httpwg.org/.

   Source for this draft and an issue tracker can be found at
   https://github.com/httpwg/http-extensions/labels/header-structure.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 11 July 2025.

Copyright Notice

   Copyright (c) 2025 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Revised BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
     1.1.  Intentionally Strict Processing
     1.2.  Notational Conventions
   2.  Defining New Structured Fields
     2.1.  Example
     2.2.  Error Handling
     2.3.  Preserving Extensibility
     2.4.  Using New Structured Types in Extensions
   3.  Structured Data Types
     3.1.  Lists
       3.1.1.  Inner Lists
       3.1.2.  Parameters
     3.2.  Dictionaries
     3.3.  Items
       3.3.1.  Integers
       3.3.2.  Decimals
       3.3.3.  Strings
       3.3.4.  Tokens
       3.3.5.  Byte Sequences
       3.3.6.  Booleans
       3.3.7.  Dates
       3.3.8.  Display Strings
   4.  Working with Structured Fields in HTTP
     4.1.  Serializing Structured Fields
       4.1.1.  Serializing a List
       4.1.2.  Serializing a Dictionary
       4.1.3.  Serializing an Item
       4.1.4.  Serializing an Integer
       4.1.5.  Serializing a Decimal
       4.1.6.  Serializing a String
       4.1.7.  Serializing a Token
       4.1.8.  Serializing a Byte Sequence
       4.1.9.  Serializing a Boolean
       4.1.10. Serializing a Date
       4.1.11. Serializing a Display String
     4.2.  Parsing Structured Fields
       4.2.1.  Parsing a List
       4.2.2.  Parsing a Dictionary
       4.2.3.  Parsing an Item
       4.2.4.  Parsing an Integer or Decimal
       4.2.5.  Parsing a String
       4.2.6.  Parsing a Token
       4.2.7.  Parsing a Byte Sequence
       4.2.8.  Parsing a Boolean
       4.2.9.  Parsing a Date
       4.2.10. Parsing a Display String
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Frequently Asked Questions
     A.1.  Why Not JSON?
   Appendix B.  Implementation Notes
   Appendix C.  ABNF
   Appendix D.  Changes from RFC 8941
   Acknowledgements
   Authors' Addresses

1.  Introduction

   Specifying the syntax of new HTTP header (and trailer) fields is an
   onerous task; even with the guidance in Section 16.3.2 of [HTTP],
   there are many decisions -- and pitfalls -- for a prospective HTTP
   field author.

   Once a field is defined, bespoke parsers and serializers often need
   to be written, because each field value has a slightly different
   handling of what looks like common syntax.

   This document introduces a set of common data structures for use in
   definitions of new HTTP field values to address these problems.  In
   particular, it defines a generic, abstract model for them, along
   with a concrete serialization for expressing that model in HTTP
   [HTTP] header and trailer fields.

   An HTTP field that is defined as a "Structured Header" or
   "Structured Trailer" (if the field can be either, it is a
   "Structured Field") uses the types defined in this specification to
   define its syntax and basic handling rules, thereby simplifying both
   its definition by specification writers and handling by
   implementations.

   Additionally, future versions of HTTP can define alternative
   serializations of the abstract model of these structures, allowing
   fields that use that model to be transmitted more efficiently
   without being redefined.
+ + Note that it is not a goal of this document to redefine the syntax of + existing HTTP fields; the mechanisms described herein are only + intended to be used with fields that explicitly opt into them. + + Section 2 describes how to specify a Structured Field. + + Section 3 defines a number of abstract data types that can be used in + Structured Fields. + + Those abstract types can be serialized into and parsed from HTTP + field values using the algorithms described in Section 4. + +1.1. Intentionally Strict Processing + + This specification intentionally defines strict parsing and + serialization behaviors using step-by-step algorithms; the only error + handling defined is to fail the entire operation altogether. + + It is designed to encourage faithful implementation and good + interoperability. Therefore, an implementation that tried to be + helpful by being more tolerant of input would make interoperability + worse, since that would create pressure on other implementations to + implement similar (but likely subtly different) workarounds. + + In other words, strict processing is an intentional feature of this + specification; it allows non-conformant input to be discovered and + corrected by the producer early and avoids both interoperability and + security issues that might otherwise result. + + Note that as a result of this strictness, if a field is appended to + by multiple parties (e.g., intermediaries or different components in + the sender), an error in one party's value is likely to cause the + entire field value to fail parsing. + +1.2. Notational Conventions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in BCP + 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. 
+ + This document uses the VCHAR, SP, DIGIT, ALPHA, and DQUOTE rules from + [RFC5234] to specify characters and/or their corresponding ASCII + bytes, depending on context. It uses the tchar and OWS rules from + [HTTP] for the same purpose. + + This document uses algorithms to specify parsing and serialization + behaviors. When parsing from HTTP fields, implementations MUST have + behavior that is indistinguishable from following the algorithms. + + For serialization to HTTP fields, the algorithms define the + recommended way to produce them. Implementations MAY vary from the + specified behavior so long as the output is still correctly handled + by the parsing algorithm described in Section 4.2. + +2. Defining New Structured Fields + + To specify an HTTP field as a Structured Field, its authors need to: + + * Normatively reference this specification. Recipients and + generators of the field need to know that the requirements of this + document are in effect. + + * Identify whether the field is a Structured Header (i.e., it can + only be used in the header section -- the common case), a + Structured Trailer (only in the trailer section), or a Structured + Field (both). + + * Specify the type of the field value; either List (Section 3.1), + Dictionary (Section 3.2), or Item (Section 3.3). + + * Define the semantics of the field value. + + * Specify any additional constraints upon the field value, as well + as the consequences when those constraints are violated. + + Typically, this means that a field definition will specify the top- + level type -- List, Dictionary, or Item -- and then define its + allowable types and constraints upon them. For example, a header + defined as a List might have all Integer members, or a mix of types; + a header defined as an Item might allow only Strings, and + additionally only strings beginning with the letter "Q", or strings + in lowercase. 
Likewise, Inner Lists (Section 3.1.1) are only valid
+   when a field definition explicitly allows them.
+
+   Fields that use the Display String type are advised to carefully
+   specify their allowable Unicode code points; for example, specifying
+   the use of a profile from [PRECIS].
+
+   Field definitions can only use this specification for the entire
+   field value, not a portion thereof.
+
+   Specifications can refer to a field name as a "structured header
+   name", "structured trailer name", or "structured field name" as
+   appropriate.  Likewise, they can refer to its field value as a
+   "structured header value", "structured trailer value", or
+   "structured field value" as necessary.
+
+   This specification defines minimums for the length or number of
+   various structures supported by implementations.  It does not
+   specify maximum sizes in most cases, but authors should be aware
+   that HTTP implementations do impose various limits on the size of
+   individual fields, the total number of fields, and/or the size of
+   the entire header or trailer section.
+
+2.1.  Example
+
+   A fictitious Foo-Example header field might be specified as:
+
+   |  42. Foo-Example Header Field
+   |
+   |  The Foo-Example HTTP header field conveys information about how
+   |  much Foo the message has.
+   |
+   |  Foo-Example is an Item Structured Header Field [RFCnnnn].  Its
+   |  value MUST be an Integer (Section 3.3.1 of [RFCnnnn]).
+   |
+   |  Its value indicates the amount of Foo in the message, and it MUST
+   |  be between 0 and 10, inclusive; other values MUST cause the entire
+   |  header field to be ignored.
+   |
+   |  The following parameter is defined:
+   |
+   |  *  A parameter whose key is "foourl", and whose value is a String
+   |     (Section 3.3.3 of [RFCnnnn]), conveying the Foo URL for the
+   |     message.  See below for processing requirements.
+   |
+   |  "foourl" contains a URI-reference (Section 4.1 of [RFC3986]).  If
+   |  its value is not a valid URI-reference, the entire header field
+   |  MUST be ignored.
If its value is a relative reference
+   |  (Section 4.2 of [RFC3986]), it MUST be resolved (Section 5 of
+   |  [RFC3986]) before being used.
+   |
+   |  For example:
+   |
+   |  Foo-Example: 2; foourl="https://foo.example.com/"
+
+2.2.  Error Handling
+
+   When parsing fails, the entire field is ignored (see Section 4.2).
+   Field definitions cannot override this, because doing so would
+   preclude handling by generic software; they can only add additional
+   constraints (for example, on the numeric range of Integers and
+   Decimals, the format of Strings and Tokens, the types allowed in a
+   Dictionary's values, or the number of Items in a List).
+
+   When field-specific constraints are violated, the entire field is
+   also ignored, unless the field definition defines other handling
+   requirements.  For example, if a header field is defined as an Item
+   and required to be an Integer, but a String is received, it should
+   be ignored unless that field's definition explicitly specifies
+   otherwise.
+
+2.3.  Preserving Extensibility
+
+   Structured Fields are designed to be extensible, because experience
+   has shown that even when it is not foreseen, it is often necessary
+   to modify and add to the allowable syntax and semantics of a field
+   in a controlled fashion.
+
+   Both Items and Inner Lists allow Parameters as an extensibility
+   mechanism; this means that their values can later be extended to
+   accommodate more information, if need be.  To preserve forward
+   compatibility, field specifications are discouraged from defining
+   the presence of an unrecognized parameter as an error condition.
+
+   Field specifications are required to be either an Item, List, or
+   Dictionary to preserve extensibility.  Fields that are erroneously
+   defined as another type (e.g., Integer) are assumed to be Items
+   (i.e., they allow Parameters).
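The advice above against treating unrecognized parameters as errors can be sketched from the consumer side. This is a non-normative illustration; the helper name and the dict-based parsed representation are assumptions of the sketch, not part of this specification:

```python
def known_params(parameters: dict, recognized: set) -> dict:
    # Keep only the parameters this consumer understands; unrecognized
    # keys are skipped rather than treated as an error condition.
    return {k: v for k, v in parameters.items() if k in recognized}

# An already-parsed Item such as "Example-Integer: 5; q=3; newext=w"
# may carry a parameter ("newext") defined after this consumer was
# written; it is simply ignored.
params = {"q": 3, "newext": "w"}
known_params(params, {"q"})   # → {"q": 3}
```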
+ + To further assure that this extensibility is available in the future, + and to encourage consumers to use a complete parser implementation, a + field definition can specify that "grease" parameters be added by + senders. A specification could stipulate that all parameters that + fit a defined pattern are reserved for this use and then encourage + them to be sent on some portion of requests. This helps to + discourage recipients from writing a parser that does not account for + Parameters. + + Specifications that use Dictionaries can also allow for forward + compatibility by requiring that the presence of -- as well as value + and type associated with -- unknown keys be ignored. Subsequent + specifications can then add additional keys, specifying constraints + on them as appropriate. + + An extension to a Structured Field can then require that an entire + field value be ignored by a recipient that understands the extension + if constraints on the value it defines are not met. + +2.4. Using New Structured Types in Extensions + + Because a field definition needs to reference a specific RFC for + Structured Fields, the types available for use in its value are + limited to those defined in that RFC. For example, a field whose + definition references this document can have a value that uses the + Date type (Section 3.3.7), whereas a field whose definition + references RFC 8941 cannot, because it will be treated as invalid + (and therefore discarded) by implementations of that specification. + + This limitation also applies to future extensions to a field; for + example, a field that is defined with reference to RFC 8941 cannot + use the Date type, because some recipients might still be using an + RFC 8941 parser to process it. + + However, this document is designed to be backwards-compatible with + RFC 8941; a parser that implements the requirements here can also + parse valid Structured Fields whose definitions reference RFC 8941. 
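The backward-compatibility point can be illustrated with a toy parser handling only two bare item types (a real parser implements all of Section 4.2); the rfc8941_only flag is an assumption of this non-normative sketch:

```python
def parse_bare_item(text: str, rfc8941_only: bool = False):
    # Dates ("@" prefix, Section 3.3.7) exist in this document but not
    # in RFC 8941, so an RFC 8941 parser fails where this one succeeds.
    if text.startswith("@"):
        if rfc8941_only:
            raise ValueError("parse failure: Dates are not in RFC 8941")
        return ("date", int(text[1:]))
    return ("integer", int(text))

parse_bare_item("@1659578233")   # → ("date", 1659578233)
```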
+
+   Upgrading a Structured Fields implementation to support a newer
+   revision of the specification (such as this document) brings the
+   possibility that some field values that were invalid according to
+   the earlier RFC might become valid when processed.
+
+   For example, a field instance might contain a syntactically valid
+   Date (Section 3.3.7), even though that field's definition does not
+   accommodate Dates.  An RFC 8941 implementation would fail parsing
+   such a field instance, because Dates are not defined in that
+   specification.  If that implementation were upgraded to this
+   specification, parsing would now succeed.  In some cases, the
+   resulting Date value will be rejected by field-specific logic, but
+   values in fields that are otherwise ignored (such as extension
+   parameters) might not be detected, and the field might subsequently
+   be accepted and processed.
+
+3.  Structured Data Types
+
+   This section provides an overview of the abstract types that
+   Structured Fields use, and gives a brief description and examples of
+   how each of those types are serialized into textual HTTP fields.
+   Section 4 specifies the details of how they are parsed from and
+   serialized into textual HTTP fields.
+
+   In summary:
+
+   *  There are three top-level types that an HTTP field can be defined
+      as: Lists, Dictionaries, and Items.
+
+   *  Lists and Dictionaries are containers; their members can be Items
+      or Inner Lists (which are themselves arrays of Items).
+
+   *  Both Items and Inner Lists can be Parameterized with key/value
+      pairs.
+
+3.1.  Lists
+
+   Lists are arrays of zero or more members, each of which can be an
+   Item (Section 3.3) or an Inner List (Section 3.1.1), both of which
+   can be Parameterized (Section 3.1.2).
+
+   An empty List is denoted by not serializing the field at all.  This
+   implies that fields defined as Lists have a default empty value.
+
+   When serialized as a textual HTTP field, each member is separated by
+   a comma and optional whitespace.
For example, a field whose value is
+   defined as a List of Tokens could look like:
+
+   Example-List: sugar, tea, rum
+
+   Note that Lists can have their members split across multiple lines
+   of the same header or trailer section, as per Section 5.3 of [HTTP];
+   for example, the following are equivalent:
+
+   Example-List: sugar, tea, rum
+
+   and
+
+   Example-List: sugar, tea
+   Example-List: rum
+
+   However, individual members of a List cannot be safely split between
+   lines; see Section 4.2 for details.
+
+   Parsers MUST support Lists containing at least 1024 members.  Field
+   specifications can constrain the types and cardinality of individual
+   List values as they require.
+
+3.1.1.  Inner Lists
+
+   An Inner List is an array of zero or more Items (Section 3.3).  Both
+   the individual Items and the Inner List itself can be Parameterized
+   (Section 3.1.2).
+
+   When serialized in a textual HTTP field, Inner Lists are denoted by
+   surrounding parentheses, and their values are delimited by one or
+   more spaces.  A field whose value is defined as a List of Inner
+   Lists of Strings could look like:
+
+   Example-List: ("foo" "bar"), ("baz"), ("bat" "one"), ()
+
+   Note that the last member in this example is an empty Inner List.
+
+   A header field whose value is defined as a List of Inner Lists with
+   Parameters at both levels could look like:
+
+   Example-List: ("foo"; a=1;b=2);lvl=5, ("bar" "baz");lvl=1
+
+   Parsers MUST support Inner Lists containing at least 256 members.
+   Field specifications can constrain the types and cardinality of
+   individual Inner List members as they require.
+
+3.1.2.  Parameters
+
+   Parameters are an ordered map of key-value pairs that are associated
+   with an Item (Section 3.3) or Inner List (Section 3.1.1).  The keys
+   are unique within the scope of the Parameters they occur within, and
+   the values are bare items (i.e., they themselves cannot be
+   parameterized; see Section 3.3).
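As a non-normative illustration of this data model, the parameterized Inner List example above, Example-List: ("foo"; a=1;b=2);lvl=5, ("bar" "baz");lvl=1, could be held in memory as follows (one possible representation, not mandated by this specification):

```python
# Each List member is a (member_value, parameters) tuple; an Inner List
# member_value is itself an array of (item, parameters) tuples.  Python
# dicts preserve insertion order, matching ordered Parameters.
example_list = [
    ([("foo", {"a": 1, "b": 2})], {"lvl": 5}),  # ("foo"; a=1;b=2);lvl=5
    ([("bar", {}), ("baz", {})], {"lvl": 1}),   # ("bar" "baz");lvl=1
]
```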
+ + Implementations MUST provide access to Parameters both by index and + by key. Specifications MAY use either means of accessing them. + + Note that parameters are ordered, and parameter keys cannot contain + uppercase letters. + + When serialized in a textual HTTP field, a Parameter is separated + from its Item or Inner List and other Parameters by a semicolon. For + example: + + Example-List: abc;a=1;b=2; cde_456, (ghi;jk=4 l);q="9";r=w + + Parameters whose value is Boolean (see Section 3.3.6) true MUST omit + that value when serialized. For example, the "a" parameter here is + true, while the "b" parameter is false: + + Example-Integer: 1; a; b=?0 + + Note that this requirement is only on serialization; parsers are + still required to correctly handle the true value when it appears in + a parameter. + + Parsers MUST support at least 256 parameters on an Item or Inner + List, and support parameter keys with at least 64 characters. Field + specifications can constrain the order of individual parameters, as + well as their values' types as required. + +3.2. Dictionaries + + Dictionaries are ordered maps of key-value pairs, where the keys are + short textual strings and the values are Items (Section 3.3) or + arrays of Items, both of which can be Parameterized (Section 3.1.2). + There can be zero or more members, and their keys are unique in the + scope of the Dictionary they occur within. + + Implementations MUST provide access to Dictionaries both by index and + by key. Specifications MAY use either means of accessing the + members. + + As with Lists, an empty Dictionary is represented by omitting the + entire field. This implies that fields defined as Dictionaries have + a default empty value. + + Typically, a field specification will define the semantics of + Dictionaries by specifying the allowed type(s) for individual members + by their keys, as well as whether their presence is required or + optional. 
Recipients MUST ignore members whose keys are undefined or + unknown, unless the field's specification specifically disallows + them. + + When serialized as a textual HTTP field, Members are ordered as + serialized and separated by a comma with optional whitespace. Member + keys cannot contain uppercase characters. Keys and values are + separated by "=" (without whitespace). For example: + + Example-Dict: en="Applepie", da=:w4ZibGV0w6ZydGU=: + + Note that in this example, the final "=" is due to the inclusion of a + Byte Sequence; see Section 3.3.5. + + Members whose value is Boolean (see Section 3.3.6) true MUST omit + that value when serialized. For example, here both "b" and "c" are + true: + + Example-Dict: a=?0, b, c; foo=bar + + Note that this requirement is only on serialization; parsers are + still required to correctly handle the true Boolean value when it + appears in Dictionary values. + + A Dictionary with a member whose value is an Inner List of Tokens: + + Example-Dict: rating=1.5, feelings=(joy sadness) + + A Dictionary with a mix of Items and Inner Lists, some with + parameters: + + Example-Dict: a=(1 2), b=3, c=4;aa=bb, d=(5 6);valid + + Note that Dictionaries can have their members split across multiple + lines of the same header or trailer section; for example, the + following are equivalent: + + Example-Dict: foo=1, bar=2 + + and + + Example-Dict: foo=1 + Example-Dict: bar=2 + + However, individual members of a Dictionary cannot be safely split + between lines; see Section 4.2 for details. + + Parsers MUST support Dictionaries containing at least 1024 key/value + pairs and keys with at least 64 characters. Field specifications can + constrain the order of individual Dictionary members, as well as + their values' types as required. + +3.3. 
Items + + An Item can be an Integer (Section 3.3.1), a Decimal (Section 3.3.2), + a String (Section 3.3.3), a Token (Section 3.3.4), a Byte Sequence + (Section 3.3.5), a Boolean (Section 3.3.6), or a Date + (Section 3.3.7). It can have associated parameters (Section 3.1.2). + + For example, a header field that is defined to be an Item that is an + Integer might look like: + + Example-Integer: 5 + + or with parameters: + + Example-Integer: 5; foo=bar + +3.3.1. Integers + + Integers have a range of -999,999,999,999,999 to 999,999,999,999,999 + inclusive (i.e., up to fifteen digits, signed), for IEEE 754 + compatibility [IEEE754]. + + For example: + + Example-Integer: 42 + + Integers larger than 15 digits can be supported in a variety of ways; + for example, by using a String (Section 3.3.3), a Byte Sequence + (Section 3.3.5), or a parameter on an Integer that acts as a scaling + factor. + + While it is possible to serialize Integers with leading zeros (e.g., + "0002", "-01") and signed zero ("-0"), these distinctions may not be + preserved by implementations. + + Note that commas in Integers are used in this section's prose only + for readability; they are not valid in the wire format. + +3.3.2. Decimals + + Decimals are numbers with an integer and a fractional component. The + integer component has at most 12 digits; the fractional component has + at most three digits. + + For example, a header whose value is defined as a Decimal could look + like: + + Example-Decimal: 4.5 + + While it is possible to serialize Decimals with leading zeros (e.g., + "0002.5", "-01.334"), trailing zeros (e.g., "5.230", "-0.40"), and + signed zero (e.g., "-0.0"), these distinctions may not be preserved + by implementations. + + Note that the serialization algorithm (Section 4.1.5) rounds input + with more than three digits of precision in the fractional component. 
+ If an alternative rounding strategy is desired, this should be + specified by the field definition to occur before serialization. + +3.3.3. Strings + + Strings are zero or more printable ASCII [RFC0020] characters (i.e., + the range %x20 to %x7E). Note that this excludes tabs, newlines, + carriage returns, etc. + + Non-ASCII characters are not directly supported in Strings, because + they cause a number of interoperability issues, and -- with few + exceptions -- field values do not require them. + + When it is necessary for a field value to convey non-ASCII content, a + Display String (Section 3.3.8) can be specified. + + When serialized in a textual HTTP field, Strings are delimited with + double quotes, using a backslash ("\") to escape double quotes and + backslashes. For example: + + Example-String: "hello world" + + Note that Strings only use DQUOTE as a delimiter; single quotes do + not delimit Strings. Furthermore, only DQUOTE and "\" can be + escaped; other characters after "\" MUST cause parsing to fail. + + Parsers MUST support Strings (after any decoding) with at least 1024 + characters. + +3.3.4. Tokens + + Tokens are short textual words that begin with an alphabetic + character or "*", followed by zero to many token characters, which + are the same as those allowed by the "token" ABNF rule defined in + [HTTP], plus the ":" and "/" characters. + + For example: + + Example-Token: foo123/456 + + Parsers MUST support Tokens with at least 512 characters. + + Note that Tokens are defined largely for compatibility with the data + model of existing HTTP fields, and may require additional steps to + use in some implementations. As a result, new fields are encouraged + to use Strings. + +3.3.5. Byte Sequences + + Byte Sequences can be conveyed in Structured Fields. + + When serialized in a textual HTTP field, a Byte Sequence is delimited + with colons and encoded using base64 ([RFC4648], Section 4). 
For + example: + + Example-ByteSequence: :cHJldGVuZCB0aGlzIGlzIGJpbmFyeSBjb250ZW50Lg==: + + Parsers MUST support Byte Sequences with at least 16384 octets after + decoding. + +3.3.6. Booleans + + Boolean values can be conveyed in Structured Fields. + + When serialized in a textual HTTP field, a Boolean is indicated with + a leading "?" character followed by a "1" for a true value or "0" for + false. For example: + + Example-Boolean: ?1 + + Note that in Dictionary (Section 3.2) and Parameter (Section 3.1.2) + values, Boolean true is indicated by omitting the value. + +3.3.7. Dates + + Date values can be conveyed in Structured Fields. + + Dates have a data model that is similar to Integers, representing a + (possibly negative) delta in seconds from 1970-01-01T00:00:00Z, + excluding leap seconds. Accordingly, their serialization in textual + HTTP fields is similar to that of Integers, distinguished from them + with a leading "@". + + For example: + + Example-Date: @1659578233 + + Parsers MUST support Dates whose values include all days in years 1 + to 9999 (i.e., -62,135,596,800 to 253,402,214,400 delta seconds from + 1970-01-01T00:00:00Z). + +3.3.8. Display Strings + + Display Strings are similar to Strings, in that they consist of zero + or more characters, but they allow Unicode scalar values (i.e., all + Unicode code points except for surrogates), unlike Strings. + + Display Strings are intended for use in cases where a value is + displayed to end users, and therefore may need to carry non-ASCII + content. It is NOT RECOMMENDED that they be used in situations where + a String (Section 3.3.3) or Token (Section 3.3.4) would be adequate, + because Unicode has processing considerations (e.g., normalization) + and security considerations (e.g., homograph attacks) that make it + more difficult to handle correctly. + + Note that Display Strings do not indicate the language used in the + value; that can be done separately if necessary (e.g., with a + parameter). 
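The textual serialization of a Display String (Section 4.1.11) can be sketched as follows: the value is UTF-8 encoded, and "%" (%x25), DQUOTE (%x22), and bytes in the ranges %x00-1f and %x7f-ff are percent-encoded using lowercase hexadecimal. This is a non-normative sketch, not a replacement for the algorithm:

```python
def serialize_display_string(value: str) -> str:
    # UTF-8 encode, then percent-encode "%", DQUOTE, and bytes outside
    # the printable ASCII range, per Section 4.1.11.
    out = '%"'
    for byte in value.encode("utf-8"):
        if byte in (0x25, 0x22) or byte < 0x20 or byte > 0x7E:
            out += "%%%02x" % byte
        else:
            out += chr(byte)
    return out + '"'

serialize_display_string("This is intended for display to üsers.")
# → %"This is intended for display to %c3%bcsers."
```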
+ + In textual HTTP fields, Display Strings are represented in a manner + similar to Strings, except that non-ASCII characters are percent- + encoded; there is a leading "%" to distinguish them from Strings. + + For example: + + Example-DisplayString: %"This is intended for display to %c3%bcsers." + + See Section 6 for additional security considerations when handling + Display Strings. + +4. Working with Structured Fields in HTTP + + This section defines how to serialize and parse the abstract types + defined by Section 3 into textual HTTP field values and other + encodings compatible with them (e.g., in HTTP/2 [HTTP/2] before + compression with HPACK [HPACK]). + +4.1. Serializing Structured Fields + + Given a structure defined in this specification, return an ASCII + string suitable for use in an HTTP field value. + + 1. If the structure is a Dictionary or List and its value is empty + (i.e., it has no members), do not serialize the field at all + (i.e., omit both the field-name and field-value). + + 2. If the structure is a List, let output_string be the result of + running Serializing a List (Section 4.1.1) with the structure. + + 3. Else, if the structure is a Dictionary, let output_string be the + result of running Serializing a Dictionary (Section 4.1.2) with + the structure. + + 4. Else, if the structure is an Item, let output_string be the + result of running Serializing an Item (Section 4.1.3) with the + structure. + + 5. Else, fail serialization. + + 6. Return output_string converted into an array of bytes, using + ASCII encoding [RFC0020]. + +4.1.1. Serializing a List + + Given an array of (member_value, parameters) tuples as input_list, + return an ASCII string suitable for use in an HTTP field value. + + 1. Let output be an empty string. + + 2. For each (member_value, parameters) of input_list: + + 1. If member_value is an array, append the result of running + Serializing an Inner List (Section 4.1.1.1) with + (member_value, parameters) to output. 
+ + 2. Otherwise, append the result of running Serializing an Item + (Section 4.1.3) with (member_value, parameters) to output. + + 3. If more member_values remain in input_list: + + 1. Append "," to output. + + 2. Append a single SP to output. + + 3. Return output. + +4.1.1.1. Serializing an Inner List + + Given an array of (member_value, parameters) tuples as inner_list, + and parameters as list_parameters, return an ASCII string suitable + for use in an HTTP field value. + + 1. Let output be the string "(". + + 2. For each (member_value, parameters) of inner_list: + + 1. Append the result of running Serializing an Item + (Section 4.1.3) with (member_value, parameters) to output. + + 2. If more values remain in inner_list, append a single SP to + output. + + 3. Append ")" to output. + + 4. Append the result of running Serializing Parameters + (Section 4.1.1.2) with list_parameters to output. + + 5. Return output. + +4.1.1.2. Serializing Parameters + + Given an ordered Dictionary as input_parameters (each member having a + param_key and a param_value), return an ASCII string suitable for use + in an HTTP field value. + + 1. Let output be an empty string. + + 2. For each param_key with a value of param_value in + input_parameters: + + 1. Append ";" to output. + + 2. Append the result of running Serializing a Key + (Section 4.1.1.3) with param_key to output. + + 3. If param_value is not Boolean true: + + 1. Append "=" to output. + + 2. Append the result of running Serializing a bare Item + (Section 4.1.3.1) with param_value to output. + + 3. Return output. + +4.1.1.3. Serializing a Key + + Given a key as input_key, return an ASCII string suitable for use in + an HTTP field value. + + 1. Convert input_key into a sequence of ASCII characters; if + conversion fails, fail serialization. + + 2. If input_key contains characters not in lcalpha, DIGIT, "_", "-", + ".", or "*", fail serialization. + + 3. 
If the first character of input_key is not lcalpha or "*", fail + serialization. + + 4. Let output be an empty string. + + 5. Append input_key to output. + + 6. Return output. + +4.1.2. Serializing a Dictionary + + Given an ordered Dictionary as input_dictionary (each member having a + member_key and a tuple value of (member_value, parameters)), return + an ASCII string suitable for use in an HTTP field value. + + 1. Let output be an empty string. + + 2. For each member_key with a value of (member_value, parameters) in + input_dictionary: + + 1. Append the result of running Serializing a Key + (Section 4.1.1.3) with member's member_key to output. + + 2. If member_value is Boolean true: + + 1. Append the result of running Serializing Parameters + (Section 4.1.1.2) with parameters to output. + + 3. Otherwise: + + 1. Append "=" to output. + + 2. If member_value is an array, append the result of running + Serializing an Inner List (Section 4.1.1.1) with + (member_value, parameters) to output. + + 3. Otherwise, append the result of running Serializing an + Item (Section 4.1.3) with (member_value, parameters) to + output. + + 4. If more members remain in input_dictionary: + + 1. Append "," to output. + + 2. Append a single SP to output. + + 3. Return output. + +4.1.3. Serializing an Item + + Given an Item as bare_item and Parameters as item_parameters, return + an ASCII string suitable for use in an HTTP field value. + + 1. Let output be an empty string. + + 2. Append the result of running Serializing a Bare Item + (Section 4.1.3.1) with bare_item to output. + + 3. Append the result of running Serializing Parameters + (Section 4.1.1.2) with item_parameters to output. + + 4. Return output. + +4.1.3.1. Serializing a Bare Item + + Given an Item as input_item, return an ASCII string suitable for use + in an HTTP field value. + + 1. If input_item is an Integer, return the result of running + Serializing an Integer (Section 4.1.4) with input_item. + + 2. 
If input_item is a Decimal, return the result of running + Serializing a Decimal (Section 4.1.5) with input_item. + + 3. If input_item is a String, return the result of running + Serializing a String (Section 4.1.6) with input_item. + + 4. If input_item is a Token, return the result of running + Serializing a Token (Section 4.1.7) with input_item. + + 5. If input_item is a Byte Sequence, return the result of running + Serializing a Byte Sequence (Section 4.1.8) with input_item. + + 6. If input_item is a Boolean, return the result of running + Serializing a Boolean (Section 4.1.9) with input_item. + + 7. If input_item is a Date, return the result of running Serializing + a Date (Section 4.1.10) with input_item. + + 8. If input_item is a Display String, return the result of running + Serializing a Display String (Section 4.1.11) with input_item. + + 9. Otherwise, fail serialization. + +4.1.4. Serializing an Integer + + Given an Integer as input_integer, return an ASCII string suitable + for use in an HTTP field value. + + 1. If input_integer is not an integer in the range of + -999,999,999,999,999 to 999,999,999,999,999 inclusive, fail + serialization. + + 2. Let output be an empty string. + + 3. If input_integer is less than (but not equal to) 0, append "-" to + output. + + 4. Append input_integer's numeric value represented in base 10 using + only decimal digits to output. + + 5. Return output. + +4.1.5. Serializing a Decimal + + Given a decimal number as input_decimal, return an ASCII string + suitable for use in an HTTP field value. + + 1. If input_decimal is not a decimal number, fail serialization. + + 2. If input_decimal has more than three significant digits to the + right of the decimal point, round it to three decimal places, + rounding the final digit to the nearest value, or to the even + value if it is equidistant. + + 3. If input_decimal has more than 12 significant digits to the left + of the decimal point after rounding, fail serialization. + + 4. 
Let output be an empty string. + + 5. If input_decimal is less than (but not equal to) 0, append "-" + to output. + + 6. Append input_decimal's integer component represented in base 10 + (using only decimal digits) to output; if it is zero, append + "0". + + 7. Append "." to output. + + 8. If input_decimal's fractional component is zero, append "0" to + output. + + 9. Otherwise, append the significant digits of input_decimal's + fractional component represented in base 10 (using only decimal + digits) to output. + + 10. Return output. + +4.1.6. Serializing a String + + Given a String as input_string, return an ASCII string suitable for + use in an HTTP field value. + + 1. Convert input_string into a sequence of ASCII characters; if + conversion fails, fail serialization. + + 2. If input_string contains characters in the range %x00-1f or %x7f- + ff (i.e., not in VCHAR or SP), fail serialization. + + 3. Let output be the string DQUOTE. + + 4. For each character char in input_string: + + 1. If char is "\" or DQUOTE: + + 1. Append "\" to output. + + 2. Append char to output. + + 5. Append DQUOTE to output. + + 6. Return output. + +4.1.7. Serializing a Token + + Given a Token as input_token, return an ASCII string suitable for use + in an HTTP field value. + + 1. Convert input_token into a sequence of ASCII characters; if + conversion fails, fail serialization. + + 2. If the first character of input_token is not ALPHA or "*", or the + remaining portion contains a character not in tchar, ":", or "/", + fail serialization. + + 3. Let output be an empty string. + + 4. Append input_token to output. + + 5. Return output. + +4.1.8. Serializing a Byte Sequence + + Given a Byte Sequence as input_bytes, return an ASCII string suitable + for use in an HTTP field value. + + 1. If input_bytes is not a sequence of bytes, fail serialization. + + 2. Let output be an empty string. + + 3. Append ":" to output. + + 4. 
Append the result of base64-encoding input_bytes as per + [RFC4648], Section 4, taking account of the requirements below. + + 5. Append ":" to output. + + 6. Return output. + + The encoded data is required to be padded with "=", as per [RFC4648], + Section 3.2. + + Likewise, encoded data SHOULD have pad bits set to zero, as per + [RFC4648], Section 3.5, unless it is not possible to do so due to + implementation constraints. + +4.1.9. Serializing a Boolean + + Given a Boolean as input_boolean, return an ASCII string suitable for + use in an HTTP field value. + + 1. If input_boolean is not a boolean, fail serialization. + + 2. Let output be an empty string. + + 3. Append "?" to output. + + 4. If input_boolean is true, append "1" to output. + + 5. If input_boolean is false, append "0" to output. + + 6. Return output. + +4.1.10. Serializing a Date + + Given a Date as input_date, return an ASCII string suitable for use + in an HTTP field value. + + 1. Let output be "@". + + 2. Append to output the result of running Serializing an Integer + with input_date (Section 4.1.4). + + 3. Return output. + +4.1.11. Serializing a Display String + + Given a sequence of Unicode codepoints as input_sequence, return an + ASCII string suitable for use in an HTTP field value. + + 1. If input_sequence is not a sequence of Unicode codepoints, fail + serialization. + + 2. Let byte_array be the result of applying UTF-8 encoding + (Section 3 of [UTF8]) to input_sequence. If encoding fails, fail + serialization. + + 3. Let encoded_string be a string containing "%" followed by DQUOTE. + + 4. For each byte in byte_array: + + 1. If byte is %x25 ("%"), %x22 (DQUOTE), or in the ranges + %x00-1f or %x7f-ff: + + 1. Append "%" to encoded_string. + + 2. Let encoded_byte be the result of applying base16 + encoding (Section 8 of [RFC4648]) to byte, with any + alphabetic characters converted to lowercase. + + 3. Append encoded_byte to encoded_string. + + 2. 
Otherwise, decode byte as an ASCII character and append the + result to encoded_string. + + 5. Append DQUOTE to encoded_string. + + 6. Return encoded_string. + + Note that [UTF8] prohibits the encoding of codepoints between U+D800 + and U+DFFF (surrogates); if they occur in input_sequence, + serialization will fail. + +4.2. Parsing Structured Fields + + When a receiving implementation parses HTTP fields that are known to + be Structured Fields, it is important that care be taken, as there + are a number of edge cases that can cause interoperability or even + security problems. This section specifies the algorithm for doing + so. + + Given an array of bytes as input_bytes that represent the chosen + field's field-value (which is empty if that field is not present) and + field_type (one of "dictionary", "list", or "item"), return the + parsed field value. + + 1. Convert input_bytes into an ASCII string input_string; if + conversion fails, fail parsing. + + 2. Discard any leading SP characters from input_string. + + 3. If field_type is "list", let output be the result of running + Parsing a List (Section 4.2.1) with input_string. + + 4. If field_type is "dictionary", let output be the result of + running Parsing a Dictionary (Section 4.2.2) with input_string. + + 5. If field_type is "item", let output be the result of running + Parsing an Item (Section 4.2.3) with input_string. + + 6. Discard any leading SP characters from input_string. + + 7. If input_string is not empty, fail parsing. + + 8. Otherwise, return output. + + When generating input_bytes, parsers MUST combine all field lines in + the same section (header or trailer) that case-insensitively match + the field name into one comma-separated field-value, as per + Section 5.2 of [HTTP]; this assures that the entire field value is + processed correctly. 
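The field-line combination step described above can be sketched as follows (a non-normative Python illustration; the function name and the list-of-tuples representation of field lines are assumptions, not part of this specification):

```python
def combine_field_lines(field_lines, name):
    """Combine all field lines whose name case-insensitively matches
    `name` into one comma-separated field value, as per Section 5.2 of
    RFC 9110, ready to hand to a Structured Fields parser."""
    values = [v for (n, v) in field_lines if n.lower() == name.lower()]
    return ", ".join(values)

# Two Example-Hdr field lines become a single List field value:
lines = [("Example-Hdr", "a, b"), ("Other", "1"), ("example-hdr", "c")]
assert combine_field_lines(lines, "Example-Hdr") == "a, b, c"
```

Note that the result is a single string; whether the individual List members survive combination intact depends on the serializer not splitting a member across field lines, as discussed below.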
+
+ For Lists and Dictionaries, this has the effect of correctly
+ concatenating all of the field's lines, as long as individual members
+ of the top-level data structure are not split across multiple field
+ instances. The parsing algorithms for both types allow tab
+ characters, since these might be used to combine field lines by some
+ implementations.
+
+ Strings split across multiple field lines will have unpredictable
+ results, because one or more commas (with optional whitespace) will
+ become part of the string output by the parser. Since concatenation
+ might be done by an upstream intermediary, the results are not under
+ the control of the serializer or the parser, even when they are both
+ under the control of the same party.
+
+ Tokens, Integers, Decimals, and Byte Sequences cannot be split across
+ multiple field lines because the inserted commas will cause parsing
+ to fail.
+
+ Parsers MAY fail when processing a field value spread across multiple
+ field lines, when one of those lines does not parse as that field.
+ For example, a parser handling an Example-String field that's
+ defined as an sf-string is allowed to fail when processing this field
+ section:
+
+ Example-String: "foo
+ Example-String: bar"
+
+ If parsing fails, either the entire field value MUST be ignored
+ (i.e., treated as if the field were not present in the section), or
+ alternatively the complete HTTP message MUST be treated as malformed.
+ This is intentionally strict to improve interoperability and safety,
+ and field specifications that use Structured Fields are not allowed
+ to loosen this requirement.
+
+ Note that this requirement does not apply to an implementation that
+ is not parsing the field; for example, an intermediary is not
+ required to strip a failing field from a message before forwarding
+ it.
+
+4.2.1. Parsing a List
+
+ Given an ASCII string as input_string, return an array of
+ (item_or_inner_list, parameters) tuples.
input_string is modified to + remove the parsed value. + + 1. Let members be an empty array. + + 2. While input_string is not empty: + + 1. Append the result of running Parsing an Item or Inner List + (Section 4.2.1.1) with input_string to members. + + 2. Discard any leading OWS characters from input_string. + + 3. If input_string is empty, return members. + + 4. Consume the first character of input_string; if it is not + ",", fail parsing. + + 5. Discard any leading OWS characters from input_string. + + 6. If input_string is empty, there is a trailing comma; fail + parsing. + + 3. No structured data has been found; return members (which is + empty). + +4.2.1.1. Parsing an Item or Inner List + + Given an ASCII string as input_string, return the tuple + (item_or_inner_list, parameters), where item_or_inner_list can be + either a single bare item or an array of (bare_item, parameters) + tuples. input_string is modified to remove the parsed value. + + 1. If the first character of input_string is "(", return the result + of running Parsing an Inner List (Section 4.2.1.2) with + input_string. + + 2. Return the result of running Parsing an Item (Section 4.2.3) with + input_string. + +4.2.1.2. Parsing an Inner List + + Given an ASCII string as input_string, return the tuple (inner_list, + parameters), where inner_list is an array of (bare_item, parameters) + tuples. input_string is modified to remove the parsed value. + + 1. Consume the first character of input_string; if it is not "(", + fail parsing. + + 2. Let inner_list be an empty array. + + 3. While input_string is not empty: + + 1. Discard any leading SP characters from input_string. + + 2. If the first character of input_string is ")": + + 1. Consume the first character of input_string. + + 2. Let parameters be the result of running Parsing + Parameters (Section 4.2.3.2) with input_string. + + 3. Return the tuple (inner_list, parameters). + + 3. 
Let item be the result of running Parsing an Item + (Section 4.2.3) with input_string. + + 4. Append item to inner_list. + + 5. If the first character of input_string is not SP or ")", fail + parsing. + + 4. The end of the Inner List was not found; fail parsing. + +4.2.2. Parsing a Dictionary + + Given an ASCII string as input_string, return an ordered map whose + values are (item_or_inner_list, parameters) tuples. input_string is + modified to remove the parsed value. + + 1. Let dictionary be an empty, ordered map. + + 2. While input_string is not empty: + + 1. Let this_key be the result of running Parsing a Key + (Section 4.2.3.3) with input_string. + + 2. If the first character of input_string is "=": + + 1. Consume the first character of input_string. + + 2. Let member be the result of running Parsing an Item or + Inner List (Section 4.2.1.1) with input_string. + + 3. Otherwise: + + 1. Let value be Boolean true. + + 2. Let parameters be the result of running Parsing + Parameters (Section 4.2.3.2) with input_string. + + 3. Let member be the tuple (value, parameters). + + 4. If dictionary already contains a key this_key (comparing + character for character), overwrite its value with member. + + 5. Otherwise, append key this_key with value member to + dictionary. + + 6. Discard any leading OWS characters from input_string. + + 7. If input_string is empty, return dictionary. + + 8. Consume the first character of input_string; if it is not + ",", fail parsing. + + 9. Discard any leading OWS characters from input_string. + + 10. If input_string is empty, there is a trailing comma; fail + parsing. + + 3. No structured data has been found; return dictionary (which is + empty). + + Note that when duplicate Dictionary keys are encountered, all but the + last instance are ignored. + +4.2.3. Parsing an Item + + Given an ASCII string as input_string, return a (bare_item, + parameters) tuple. input_string is modified to remove the parsed + value. + + 1. 
Let bare_item be the result of running Parsing a Bare Item + (Section 4.2.3.1) with input_string. + + 2. Let parameters be the result of running Parsing Parameters + (Section 4.2.3.2) with input_string. + + 3. Return the tuple (bare_item, parameters). + +4.2.3.1. Parsing a Bare Item + + Given an ASCII string as input_string, return a bare Item. + input_string is modified to remove the parsed value. + + 1. If the first character of input_string is a "-" or a DIGIT, + return the result of running Parsing an Integer or Decimal + (Section 4.2.4) with input_string. + + 2. If the first character of input_string is a DQUOTE, return the + result of running Parsing a String (Section 4.2.5) with + input_string. + + 3. If the first character of input_string is an ALPHA or "*", return + the result of running Parsing a Token (Section 4.2.6) with + input_string. + + 4. If the first character of input_string is ":", return the result + of running Parsing a Byte Sequence (Section 4.2.7) with + input_string. + + 5. If the first character of input_string is "?", return the result + of running Parsing a Boolean (Section 4.2.8) with input_string. + + 6. If the first character of input_string is "@", return the result + of running Parsing a Date (Section 4.2.9) with input_string. + + 7. If the first character of input_string is "%", return the result + of running Parsing a Display String (Section 4.2.10) with + input_string. + + 8. Otherwise, the item type is unrecognized; fail parsing. + +4.2.3.2. Parsing Parameters + + Given an ASCII string as input_string, return an ordered map whose + values are bare Items. input_string is modified to remove the parsed + value. + + 1. Let parameters be an empty, ordered map. + + 2. While input_string is not empty: + + 1. If the first character of input_string is not ";", exit the + loop. + + 2. Consume the ";" character from the beginning of input_string. + + 3. Discard any leading SP characters from input_string. + + 4. 
Let param_key be the result of running Parsing a Key + (Section 4.2.3.3) with input_string. + + 5. Let param_value be Boolean true. + + 6. If the first character of input_string is "=": + + 1. Consume the "=" character at the beginning of + input_string. + + 2. Let param_value be the result of running Parsing a Bare + Item (Section 4.2.3.1) with input_string. + + 7. If parameters already contains a key param_key (comparing + character for character), overwrite its value with + param_value. + + 8. Otherwise, append key param_key with value param_value to + parameters. + + 3. Return parameters. + + Note that when duplicate parameter keys are encountered, all but the + last instance are ignored. + +4.2.3.3. Parsing a Key + + Given an ASCII string as input_string, return a key. input_string is + modified to remove the parsed value. + + 1. If the first character of input_string is not lcalpha or "*", + fail parsing. + + 2. Let output_string be an empty string. + + 3. While input_string is not empty: + + 1. If the first character of input_string is not one of lcalpha, + DIGIT, "_", "-", ".", or "*", return output_string. + + 2. Let char be the result of consuming the first character of + input_string. + + 3. Append char to output_string. + + 4. Return output_string. + +4.2.4. Parsing an Integer or Decimal + + Given an ASCII string as input_string, return an Integer or Decimal. + input_string is modified to remove the parsed value. + + NOTE: This algorithm parses both Integers (Section 3.3.1) and + Decimals (Section 3.3.2), and returns the corresponding structure. + + 1. Let type be "integer". + + 2. Let sign be 1. + + 3. Let input_number be an empty string. + + 4. If the first character of input_string is "-", consume it and + set sign to -1. + + 5. If input_string is empty, there is an empty integer; fail + parsing. + + 6. If the first character of input_string is not a DIGIT, fail + parsing. + + 7. While input_string is not empty: + + 1. 
Let char be the result of consuming the first character of + input_string. + + 2. If char is a DIGIT, append it to input_number. + + 3. Else, if type is "integer" and char is ".": + + 1. If input_number contains more than 12 characters, fail + parsing. + + 2. Otherwise, append char to input_number and set type to + "decimal". + + 4. Otherwise, prepend char to input_string, and exit the loop. + + 5. If type is "integer" and input_number contains more than 15 + characters, fail parsing. + + 6. If type is "decimal" and input_number contains more than 16 + characters, fail parsing. + + 8. If type is "integer": + + 1. Let output_number be an Integer that is the result of + parsing input_number as an integer. + + 9. Otherwise: + + 1. If the final character of input_number is ".", fail parsing. + + 2. If the number of characters after "." in input_number is + greater than three, fail parsing. + + 3. Let output_number be a Decimal that is the result of parsing + input_number as a decimal number. + + 10. Let output_number be the product of output_number and sign. + + 11. Return output_number. + +4.2.5. Parsing a String + + Given an ASCII string as input_string, return an unquoted String. + input_string is modified to remove the parsed value. + + 1. Let output_string be an empty string. + + 2. If the first character of input_string is not DQUOTE, fail + parsing. + + 3. Discard the first character of input_string. + + 4. While input_string is not empty: + + 1. Let char be the result of consuming the first character of + input_string. + + 2. If char is a backslash ("\"): + + 1. If input_string is now empty, fail parsing. + + 2. Let next_char be the result of consuming the first + character of input_string. + + 3. If next_char is not DQUOTE or "\", fail parsing. + + 4. Append next_char to output_string. + + 3. Else, if char is DQUOTE, return output_string. + + 4. Else, if char is in the range %x00-1f or %x7f-ff (i.e., it is + not in VCHAR or SP), fail parsing. + + 5. 
Else, append char to output_string. + + 5. Reached the end of input_string without finding a closing DQUOTE; + fail parsing. + +4.2.6. Parsing a Token + + Given an ASCII string as input_string, return a Token. input_string + is modified to remove the parsed value. + + 1. If the first character of input_string is not ALPHA or "*", fail + parsing. + + 2. Let output_string be an empty string. + + 3. While input_string is not empty: + + 1. If the first character of input_string is not in tchar, ":", + or "/", return output_string. + + 2. Let char be the result of consuming the first character of + input_string. + + 3. Append char to output_string. + + 4. Return output_string. + +4.2.7. Parsing a Byte Sequence + + Given an ASCII string as input_string, return a Byte Sequence. + input_string is modified to remove the parsed value. + + 1. If the first character of input_string is not ":", fail parsing. + + 2. Discard the first character of input_string. + + 3. If there is not a ":" character before the end of input_string, + fail parsing. + + 4. Let b64_content be the result of consuming content of + input_string up to but not including the first instance of the + character ":". + + 5. Consume the ":" character at the beginning of input_string. + + 6. If b64_content contains a character not included in ALPHA, DIGIT, + "+", "/", and "=", fail parsing. + + 7. Let binary_content be the result of base64-decoding [RFC4648] + b64_content, synthesizing padding if necessary (note the + requirements about recipient behavior below). If base64 decoding + fails, parsing fails. + + 8. Return binary_content. + + Because some implementations of base64 do not allow rejection of + encoded data that is not properly "=" padded (see [RFC4648], + Section 3.2), parsers SHOULD NOT fail when "=" padding is not + present, unless they cannot be configured to do so. 
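A lenient Byte Sequence parser along these lines can synthesize the missing "=" padding before decoding, rather than failing. The following is a non-normative Python sketch; the function name and error handling are illustrative only:

```python
import base64
import re

def parse_byte_sequence(input_string):
    """Parse an sf-binary value such as ':aGVsbG8=:' into bytes.

    Lenient about absent '=' padding, as the SHOULD NOT above permits:
    padding is synthesized before decoding."""
    if not input_string.startswith(":"):
        raise ValueError("fail parsing: missing leading ':'")
    end = input_string.find(":", 1)
    if end == -1:
        raise ValueError("fail parsing: no closing ':'")
    b64_content = input_string[1:end]
    # Characters outside the base64 alphabet still fail parsing.
    if not re.fullmatch(r"[A-Za-z0-9+/=]*", b64_content):
        raise ValueError("fail parsing: invalid base64 character")
    padded = b64_content + "=" * (-len(b64_content) % 4)
    return base64.b64decode(padded)

# Both the padded and unpadded encodings of "hello" decode successfully:
assert parse_byte_sequence(":aGVsbG8=:") == b"hello"
assert parse_byte_sequence(":aGVsbG8:") == b"hello"
```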
+ + Because some implementations of base64 do not allow rejection of + encoded data that has non-zero pad bits (see [RFC4648], Section 3.5), + parsers SHOULD NOT fail when non-zero pad bits are present, unless + they cannot be configured to do so. + + This specification does not relax the requirements in Sections 3.1 + and 3.3 of [RFC4648]; therefore, parsers MUST fail on characters + outside the base64 alphabet and on line feeds in encoded data. + +4.2.8. Parsing a Boolean + + Given an ASCII string as input_string, return a Boolean. input_string + is modified to remove the parsed value. + + 1. If the first character of input_string is not "?", fail parsing. + + 2. Discard the first character of input_string. + + 3. If the first character of input_string matches "1", discard the + first character, and return true. + + 4. If the first character of input_string matches "0", discard the + first character, and return false. + + 5. No value has matched; fail parsing. + +4.2.9. Parsing a Date + + Given an ASCII string as input_string, return a Date. input_string is + modified to remove the parsed value. + + 1. If the first character of input_string is not "@", fail parsing. + + 2. Discard the first character of input_string. + + 3. Let output_date be the result of running Parsing an Integer or + Decimal (Section 4.2.4) with input_string. + + 4. If output_date is a Decimal, fail parsing. + + 5. Return output_date. + +4.2.10. Parsing a Display String + + Given an ASCII string as input_string, return a sequence of Unicode + codepoints. input_string is modified to remove the parsed value. + + 1. If the first two characters of input_string are not "%" followed + by DQUOTE, fail parsing. + + 2. Discard the first two characters of input_string. + + 3. Let byte_array be an empty byte array. + + 4. While input_string is not empty: + + 1. Let char be the result of consuming the first character of + input_string. + + 2. 
If char is in the range %x00-1f or %x7f-ff (i.e., it is not + in VCHAR or SP), fail parsing. + + 3. If char is "%": + + 1. Let octet_hex be the result of consuming two characters + from input_string. If there are not two characters, fail + parsing. + + 2. If octet_hex contains characters outside the range + %x30-39 or %x61-66 (i.e., it is not in 0-9 or lowercase + a-f), fail parsing. + + 3. Let octet be the result of hex decoding octet_hex + (Section 8 of [RFC4648]). + + 4. Append octet to byte_array. + + 4. If char is DQUOTE: + + 1. Let unicode_sequence be the result of decoding byte_array + as a UTF-8 string (Section 3 of [UTF8]). Fail parsing if + decoding fails. + + 2. Return unicode_sequence. + + 5. Otherwise, if char is not "%" or DQUOTE: + + 1. Let byte be the result of applying ASCII encoding to + char. + + 2. Append byte to byte_array. + + 5. Reached the end of input_string without finding a closing DQUOTE; + fail parsing. + +5. IANA Considerations + + Please add the following note to the "Hypertext Transfer Protocol + (HTTP) Field Name Registry": + + The "Structured Type" column indicates the type of the field (per + RFC nnnn), if any, and may be "Dictionary", "List" or "Item". + + Note that field names beginning with characters other than ALPHA + or "*" will not be able to be represented as a Structured Fields + Token, and therefore may be incompatible with being mapped into + field values that refer to it. + + Then, add a new column, "Structured Type". + + Then, add the indicated Structured Type for each existing registry + entry listed in Table 1. 
+ + +==========================================+=================+ + | Field Name | Structured Type | + +==========================================+=================+ + | Accept-CH | List | + +------------------------------------------+-----------------+ + | Cache-Status | List | + +------------------------------------------+-----------------+ + | CDN-Cache-Control | Dictionary | + +------------------------------------------+-----------------+ + | Cross-Origin-Embedder-Policy | Item | + +------------------------------------------+-----------------+ + | Cross-Origin-Embedder-Policy-Report-Only | Item | + +------------------------------------------+-----------------+ + | Cross-Origin-Opener-Policy | Item | + +------------------------------------------+-----------------+ + | Cross-Origin-Opener-Policy-Report-Only | Item | + +------------------------------------------+-----------------+ + | Origin-Agent-Cluster | Item | + +------------------------------------------+-----------------+ + | Priority | Dictionary | + +------------------------------------------+-----------------+ + | Proxy-Status | List | + +------------------------------------------+-----------------+ + + Table 1: Existing Fields + +6. Security Considerations + + The size of most types defined by Structured Fields is not limited; + as a result, extremely large fields could be an attack vector (e.g., + for resource consumption). Most HTTP implementations limit the sizes + of individual fields as well as the overall header or trailer section + size to mitigate such attacks. + + It is possible for parties with the ability to inject new HTTP fields + to change the meaning of a Structured Field. In some circumstances, + this will cause parsing to fail, but it is not possible to reliably + fail in all such circumstances. 
+ + The Display String type can convey any possible Unicode code point + without sanitization; for example, they might contain unassigned code + points, control points (including NUL), or noncharacters. Therefore, + applications consuming Display Strings need to consider strategies + such as filtering or escaping untrusted content before displaying it. + See [PRECIS] and [UNICODE-SECURITY]. + +7. References + +7.1. Normative References + + [HTTP] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Semantics", STD 97, RFC 9110, + DOI 10.17487/RFC9110, June 2022, + . + + [RFC0020] Cerf, V., "ASCII format for network interchange", STD 80, + RFC 20, DOI 10.17487/RFC0020, October 1969, + . + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + . + + [RFC4648] Josefsson, S., "The Base16, Base32, and Base64 Data + Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006, + . + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, . + + [UTF8] Yergeau, F., "UTF-8, a transformation format of ISO + 10646", STD 63, RFC 3629, November 2003, + . + +7.2. Informative References + + [HPACK] Peon, R. and H. Ruellan, "HPACK: Header Compression for + HTTP/2", RFC 7541, DOI 10.17487/RFC7541, May 2015, + . + + [HTTP/2] Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113, + DOI 10.17487/RFC9113, June 2022, + . + + [IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", + IEEE 754-2019, DOI 10.1109/IEEESTD.2019.8766229, + ISBN 978-1-5044-5924-2, July 2019, + . + + [PRECIS] Saint-Andre, P. and M. Blanchet, "PRECIS Framework: + Preparation, Enforcement, and Comparison of + Internationalized Strings in Application Protocols", + RFC 8264, DOI 10.17487/RFC8264, October 2017, + . + + [RFC5234] Crocker, D., Ed. and P. 
Overell, "Augmented BNF for Syntax + Specifications: ABNF", STD 68, RFC 5234, + DOI 10.17487/RFC5234, January 2008, + . + + [RFC7493] Bray, T., Ed., "The I-JSON Message Format", RFC 7493, + DOI 10.17487/RFC7493, March 2015, + . + + [RFC8259] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data + Interchange Format", STD 90, RFC 8259, + DOI 10.17487/RFC8259, December 2017, + . + + [UNICODE-SECURITY] + Davis, M. and M. Suignard, "Unicode Security + Considerations", Unicode Technical Report #16, 19 + September 2014, . + +Appendix A. Frequently Asked Questions + +A.1. Why Not JSON? + + Earlier proposals for Structured Fields were based upon JSON + [RFC8259]. However, constraining its use to make it suitable for + HTTP fields required senders and recipients to implement specific + additional handling. + + For example, JSON has specification issues around large numbers and + objects with duplicate members. Although advice for avoiding these + issues is available (e.g., [RFC7493]), it cannot be relied upon. + + Likewise, JSON strings are by default Unicode strings, which have a + number of potential interoperability issues (e.g., in comparison). + Although implementers can be advised to avoid non-ASCII content where + unnecessary, this is difficult to enforce. + + Another example is JSON's ability to nest content to arbitrary + depths. Since the resulting memory commitment might be unsuitable + (e.g., in embedded and other limited server deployments), it's + necessary to limit it in some fashion; however, existing JSON + implementations have no such limits, and even if a limit is + specified, it's likely that some field definition will find a need to + violate it. + + Because of JSON's broad adoption and implementation, it is difficult + to impose such additional constraints across all implementations; + some deployments would fail to enforce them, thereby harming + interoperability. 
In short, if it looks like JSON, people will be + tempted to use a JSON parser/serializer on field values. + + Since a major goal for Structured Fields is to improve + interoperability and simplify implementation, these concerns led to a + format that requires a dedicated parser and serializer. + + Additionally, there were widely shared feelings that JSON doesn't + "look right" in HTTP fields. + +Appendix B. Implementation Notes + + A generic implementation of this specification should expose the top- + level serialize (Section 4.1) and parse (Section 4.2) functions. + They need not be functions; for example, it could be implemented as + an object, with methods for each of the different top-level types. + + For interoperability, it's important that generic implementations be + complete and follow the algorithms closely; see Section 1.1. To aid + this, a common test suite is being maintained by the community at + . + + Implementers should note that Dictionaries and Parameters are order- + preserving maps. Some fields may not convey meaning in the ordering + of these data types, but it should still be exposed so that it will + be available to applications that need to use it. + + Likewise, implementations should note that it's important to preserve + the distinction between Tokens and Strings. While most programming + languages have native types that map to the other types well, it may + be necessary to create a wrapper "token" object or use a parameter on + functions to assure that these types remain separate. + + The serialization algorithm is defined in a way that it is not + strictly limited to the data types defined in Section 3 in every + case. For example, Decimals are designed to take broader input and + round to allowed values. + + Implementations are allowed to limit the size of different + structures, subject to the minimums defined for each type. When a + structure exceeds an implementation limit, that structure fails + parsing or serialization. 
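One way to preserve the Token/String distinction noted above is a thin wrapper type. The following non-normative Python sketch (names are illustrative, and character validation is omitted) shows why the dispatch order matters when the wrapper subclasses the native string type:

```python
class Token(str):
    """Distinguishes SF Tokens from SF Strings while remaining usable
    anywhere an ordinary string is."""
    __slots__ = ()

def serialize_bare_item(item):
    # Test Token before str, since Token is a subclass of str.
    if isinstance(item, Token):
        return str(item)  # Tokens are emitted unquoted
    if isinstance(item, str):
        escaped = item.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    raise NotImplementedError("other bare item types omitted from sketch")

assert serialize_bare_item(Token("sugar")) == "sugar"
assert serialize_bare_item("sugar") == '"sugar"'
```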
+ +Appendix C. ABNF + + This section uses the Augmented Backus-Naur Form (ABNF) notation + [RFC5234] to illustrate expected syntax of Structured Fields. + However, it cannot be used to validate their syntax, because it does + not capture all requirements. + + This section is non-normative. If there is disagreement between the + parsing algorithms and ABNF, the specified algorithms take + precedence. + + sf-list = list-member *( OWS "," OWS list-member ) + list-member = sf-item / inner-list + + inner-list = "(" *SP [ sf-item *( 1*SP sf-item ) *SP ] ")" + parameters + + parameters = *( ";" *SP parameter ) + parameter = param-key [ "=" param-value ] + param-key = key + key = ( lcalpha / "*" ) + *( lcalpha / DIGIT / "_" / "-" / "." / "*" ) + lcalpha = %x61-7A ; a-z + param-value = bare-item + + sf-dictionary = dict-member *( OWS "," OWS dict-member ) + dict-member = member-key ( parameters / ( "=" member-value )) + member-key = key + member-value = sf-item / inner-list + + sf-item = bare-item parameters + bare-item = sf-integer / sf-decimal / sf-string / sf-token + / sf-binary / sf-boolean / sf-date / sf-displaystring + + sf-integer = ["-"] 1*15DIGIT + sf-decimal = ["-"] 1*12DIGIT "." 1*3DIGIT + sf-string = DQUOTE *( unescaped / "%" / bs-escaped ) DQUOTE + sf-token = ( ALPHA / "*" ) *( tchar / ":" / "/" ) + sf-binary = ":" base64 ":" + sf-boolean = "?" ( "0" / "1" ) + sf-date = "@" sf-integer + sf-displaystring = "%" DQUOTE *( unescaped / "\" / pct-encoded ) + DQUOTE + + base64 = *( ALPHA / DIGIT / "+" / "/" ) *"=" + + unescaped = %x20-21 / %x23-24 / %x26-5B / %x5D-7E + bs-escaped = "\" ( DQUOTE / "\" ) + + pct-encoded = "%" lc-hexdig lc-hexdig + lc-hexdig = DIGIT / %x61-66 ; 0-9, a-f + +Appendix D. Changes from RFC 8941 + + This revision of the Structured Field Values for HTTP specification + has made the following changes: + + * Added the Date structured type. (Section 3.3.7) + + * Stopped encouraging use of ABNF in definitions of new structured + fields. 
(Section 2) + + * Moved ABNF to an informative appendix. (Appendix C) + + * Added a "Structured Type" column to the HTTP Field Name Registry. + (Section 5) + + * Refined parse failure handling. (Section 4.2) + + * Added the Display String structured type. (Section 3.3.8) + +Acknowledgements + + Many thanks to Matthew Kerwin for his detailed feedback and careful + consideration during the development of this specification. + + Thanks also to Ian Clelland, Roy Fielding, Anne van Kesteren, Kazuho + Oku, Evert Pot, Julian Reschke, Martin Thomson, Mike West, and + Jeffrey Yasskin for their contributions. + +Authors' Addresses + + Mark Nottingham + Cloudflare + Prahran VIC + Australia + Email: mnot@mnot.net + URI: https://www.mnot.net/ + + + Poul-Henning Kamp + The Varnish Cache Project + Email: phk@varnish-cache.org diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-unprompted-auth.html b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-unprompted-auth.html new file mode 100644 index 000000000..a64c884cc --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-unprompted-auth.html @@ -0,0 +1,2174 @@ + + + + + + +The Concealed HTTP Authentication Scheme + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Internet-DraftThe Concealed HTTP Authentication SchemeJanuary 2025
Schinazi, et al.Expires 11 July 2025[Page]
+
+
+
+
Workgroup:
+
HTTPBIS
+
Internet-Draft:
+
draft-ietf-httpbis-unprompted-auth-latest
+
Published:
+
+ +
+
Intended Status:
+
Standards Track
+
Expires:
+
+
Authors:
+
+
+
D. Schinazi
+
Google LLC
+
+
+
D. Oliver
+
Guardian Project
+
+
+
J. Hoyland
+
Cloudflare Inc.
+
+
+
+
+

The Concealed HTTP Authentication Scheme

+
+

Abstract

+

Most HTTP authentication schemes are probeable in the sense that it is possible +for an unauthenticated client to probe whether an origin serves resources that +require authentication. It is possible for an origin to hide the fact that it +requires authentication by not generating Unauthorized status codes; however, +that only works with non-cryptographic authentication schemes: cryptographic +signatures require a fresh nonce to be signed. Prior to this document, there +was no existing way for the origin to share such a nonce without exposing the +fact that it serves resources that require authentication. This document +defines a new non-probeable cryptographic authentication scheme.

+
+
+

+About This Document +

+

This note is to be removed before publishing as an RFC.

+

+ The latest revision of this draft can be found at https://httpwg.org/http-extensions/draft-ietf-httpbis-unprompted-auth.html. + Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ietf-httpbis-unprompted-auth/.

+

+ Discussion of this document takes place on the + HTTP Working Group mailing list (mailto:ietf-http-wg@w3.org), + which is archived at https://lists.w3.org/Archives/Public/ietf-http-wg/. + Working Group information can be found at https://httpwg.org/.

+

Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/unprompted-auth.

+
+
+
+

+Status of This Memo +

+

+ This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79.

+

+ Internet-Drafts are working documents of the Internet Engineering Task + Force (IETF). Note that other groups may also distribute working + documents as Internet-Drafts. The list of current Internet-Drafts is + at https://datatracker.ietf.org/drafts/current/.

+

+ Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress."

+

+ This Internet-Draft will expire on 11 July 2025.

+
+
+ + +
+
+

+1. Introduction +

+

HTTP authentication schemes (see Section 11 of [HTTP]) allow origins +to restrict access for some resources to only authenticated requests. While +these schemes commonly involve a challenge where the origin asks the client to +provide authentication information, it is possible for clients to send such +information unprompted. This is particularly useful in cases where an origin +wants to offer a service or capability only to "those who know" while all +others are given no indication the service or capability exists. Such designs +rely on an externally-defined mechanism by which keys are distributed. For +example, a company might offer remote employee access to company services +directly via its website using their employee credentials, or offer access to +limited special capabilities for specific employees, while making discovering +(or probing for) such capabilities difficult. As another example, members of +less well-defined communities might use more ephemeral keys to acquire access +to geography- or capability-specific resources, as issued by an entity whose +user base is larger than the available resources can support (by having that +entity metering the availability of keys temporally or geographically).

+

While digital-signature-based HTTP authentication schemes already exist (e.g., +[HOBA]), they rely on the origin explicitly sending a fresh +challenge to the client, to ensure that the signature input is fresh. That +makes the origin probeable as it sends the challenge to unauthenticated +clients. This document defines a new signature-based authentication scheme that +is not probeable.

+
+
+

+1.1. Conventions and Definitions +

+

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", +"MAY", and "OPTIONAL" in this document are to be interpreted as +described in BCP 14 [RFC2119] [RFC8174] when, and only when, they +appear in all capitals, as shown here.

+

This document uses the notation from Section 1.3 of [QUIC].

+
+
+
+
+
+
+

+2. The Concealed Authentication Scheme +

+

This document defines the "Concealed" HTTP authentication scheme. It uses +asymmetric cryptography. Clients possess a key ID and a public/private key +pair, and origin servers maintain a mapping of authorized key IDs to associated +public keys.

+

The client uses a TLS keying material exporter to generate data to be signed +(see Section 3) then sends the signature using the Authorization (or +Proxy-Authorization) header field (see Section 11 of [HTTP]). The signature +and additional information are exchanged using authentication parameters (see +Section 4). Once the server receives these, it can check whether the +signature validates against an entry in its database of known keys. The server +can then use the validation result to influence its response to the client, for +example by restricting access to certain resources.

+
+
+
+
+

+3. Client Handling +

+

When a client wishes to use the Concealed HTTP authentication scheme with a +request, it SHALL compute the authentication proof using a TLS keying material +exporter with the following parameters:

+
    +
  • +

    the label is set to "EXPORTER-HTTP-Concealed-Authentication"

    +
  • +
  • +

    the context is set to the structure described in Section 3.1

    +
  • +
  • +

    the exporter output length is set to 48 bytes (see Section 3.2)

    +
  • +
+

Note that TLS 1.3 keying material exporters are defined in Section 7.5 of [TLS], while TLS 1.2 keying material exporters are defined in +[KEY-EXPORT].

+
+
+

+3.1. Key Exporter Context +

+

The TLS key exporter context is described in Figure 1, using the +notation from Section 1.3 of [QUIC]:

+
+
+
+
+  Signature Algorithm (16),
+  Key ID Length (i),
+  Key ID (..),
+  Public Key Length (i),
+  Public Key (..),
+  Scheme Length (i),
+  Scheme (..),
+  Host Length (i),
+  Host (..),
+  Port (16),
+  Realm Length (i),
+  Realm (..),
+
+
+
Figure 1: +Key Exporter Context Format +
+
+

The key exporter context contains the following fields:

+
+
Signature Algorithm:
+
+

The signature scheme sent in the s Parameter (see Section 4.4).

+
+
+
Key ID:
+
+

The key ID sent in the k Parameter (see Section 4.1).

+
+
+
Public Key:
+
+

The public key used by the server to validate the signature provided by the +client. Its encoding is described in Section 3.1.1.

+
+
+
Scheme:
+
+

The scheme for this request, encoded using the format of the scheme portion +of a URI as defined in Section 3.1 of [URI].

+
+
+
Host:
+
+

The host for this request, encoded using the format of the host portion of a +URI as defined in Section 3.2.2 of [URI].

+
+
+
Port:
+
+

The port for this request, encoded in network byte order. Note that the port +is either included in the URI, or is the default port for the scheme in use; +see Section 3.2.3 of [URI].

+
+
+
Realm:
+
+

The realm of authentication that is sent in the realm authentication +parameter (Section 11.5 of [HTTP]). If the realm authentication parameter is +not present, this SHALL be empty. This document does not define a means for the +origin to communicate a realm to the client. If a client is not configured to +use a specific realm, it SHALL use an empty realm and SHALL NOT send the realm +authentication parameter.

+
+
+
+

The Signature Algorithm and Port fields are encoded as unsigned 16-bit integers +in network byte order. The Key ID, Public Key, Scheme, Host, and Realm fields +are length prefixed strings; they are preceded by a Length field that +represents their length in bytes. These length fields are encoded using the +variable-length integer encoding from Section 16 of [QUIC] and MUST be +encoded in the minimum number of bytes necessary.

+
+
+

+3.1.1. Public Key Encoding +

+

Both the "Public Key" field of the TLS key exporter context (see above) and the +a Parameter (see Section 4.2) carry the same public key. The encoding of +the public key is determined by the Signature Algorithm in use as follows:

+
+
RSASSA-PSS algorithms:
+
+

The public key is an RSAPublicKey structure [PKCS1] encoded in DER +[X.690]. BER encodings which are not DER MUST be rejected.

+
+
+
ECDSA algorithms:
+
+

The public key is a UncompressedPointRepresentation structure defined in +Section 4.2.8.2 of [TLS], using the curve specified by the SignatureScheme.

+
+
+
EdDSA algorithms:
+
+

The public key is the byte string encoding defined in [EdDSA].

+
+
+
+

This document does not define the public key encodings for other algorithms. In +order for a SignatureScheme to be usable with the Concealed HTTP authentication +scheme, its public key encoding needs to be defined in a corresponding document.

+
+
+
+
+
+
+

+3.2. Key Exporter Output +

+

The key exporter output is 48 bytes long. Of those, the first 32 bytes are part +of the input to the signature and the next 16 bytes are sent alongside the +signature. This allows the recipient to confirm that the exporter produces the +right values. This is described in Figure 2, using the notation from +Section 1.3 of [QUIC]:

+
+
+
+
+  Signature Input (256),
+  Verification (128),
+
+
+
Figure 2: +Key Exporter Output Format +
+
+

The key exporter output contains the following fields:

+
+
Signature Input:
+
+

This is part of the data signed using the client's chosen asymmetric private +key (see Section 3.3).

+
+
+
Verification:
+
+

The verification is transmitted to the server using the v Parameter (see +Section 4.5).

+
+
+
+
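In Python terms (a non-normative sketch), splitting the 48-byte key exporter output into the two fields of Figure 2 is simply:

```python
# Stand-in for a real 48-byte TLS key exporter output.
exporter_output = bytes(range(48))

signature_input = exporter_output[:32]  # part of the data to be signed
verification = exporter_output[32:]     # transmitted in the v Parameter
```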
+
+
+
+

+3.3. Signature Computation +

+

Once the Signature Input has been extracted from the key exporter output (see +Section 3.2), it is prefixed with static data before being signed. The signature +is computed over the concatenation of:

+
    +
  • +

    A string that consists of octet 32 (0x20) repeated 64 times

    +
  • +
  • +

    The context string "HTTP Concealed Authentication"

    +
  • +
  • +

    A single 0 byte which serves as a separator

    +
  • +
  • +

    The Signature Input extracted from the key exporter output (see Section 3.2)

    +
  • +
+

For example, if the Signature Input has all its 32 bytes set to 01, the content +covered by the signature (in hexadecimal format) would be:

+
+
+
+
+2020202020202020202020202020202020202020202020202020202020202020
+2020202020202020202020202020202020202020202020202020202020202020
+4854545020436F6E6365616C65642041757468656E7469636174696F6E
+00
+0101010101010101010101010101010101010101010101010101010101010101
+
+
+
Figure 3: +Example Content Covered by Signature +
+
+
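The four concatenation steps above can be reproduced directly. The following non-normative Python sketch builds the content covered by the signature for the example where every byte of the Signature Input is set to 01:

```python
signature_input = bytes([0x01] * 32)  # example Signature Input

content = (
    b"\x20" * 64                        # octet 32 repeated 64 times
    + b"HTTP Concealed Authentication"  # the context string
    + b"\x00"                           # the separator byte
    + signature_input                   # from the key exporter output
)
print(content.hex())
```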

The purpose of this static prefix is to mitigate issues that could arise if +authentication asymmetric keys were accidentally reused across protocols (even +though this is forbidden, see Section 8). This construction mirrors that of +the TLS 1.3 CertificateVerify message defined in Section 4.4.3 of [TLS].

+

The resulting signature is then transmitted to the server using the p +Parameter (see Section 4.3).

+
+
+
+
+
+
+

+4. Authentication Parameters +

+

This specification defines the following authentication parameters.

+

All of the byte sequences below are encoded using base64url (see Section 5 of [BASE64]) without quotes and without padding. In other words, the +values of these byte-sequence authentication parameters MUST NOT include any +characters other than ASCII letters, digits, dash and underscore.

+

The integer below is encoded without a minus sign and without leading zeroes. In +other words, the value of this integer authentication parameter MUST NOT +include any characters other than digits, and MUST NOT start with a zero unless +the full value is "0".

+

Using the syntax from [ABNF]:

+
+
+
+
+concealed-byte-sequence-param-value = *( ALPHA / DIGIT / "-" / "_" )
+concealed-integer-param-value = %x31-39 *4( DIGIT ) / "0"
+
+
+
Figure 4: +Authentication Parameter Value ABNF +
+
+
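As a non-normative illustration (in Python; the helper names are invented here), byte-sequence parameter values can be produced with the standard library's base64url support, stripping the padding as required above. The value `b"basement"` is the key ID from the example in Section 5:

```python
import base64

def encode_param(raw: bytes) -> str:
    # base64url (Section 5 of RFC 4648) without padding.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def decode_param(text: str) -> bytes:
    # Restore the padding before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

print(encode_param(b"basement"))  # YmFzZW1lbnQ
```

Note that the integer parameter in the example of Section 5, s=2055, is the decimal form of the TLS SignatureScheme codepoint 0x0807 (ed25519).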
+
+

+4.1. The k Parameter +

+

The REQUIRED "k" (key ID) Parameter is a byte sequence that identifies which +key the client wishes to use to authenticate. This is used by the backend to +point to an entry in a server-side database of known keys, see Section 6.3.

+
+
+
+
+

+4.2. The a Parameter +

+

The REQUIRED "a" (public key) Parameter is a byte sequence that specifies the +public key used by the server to validate the signature provided by the client. +This avoids key confusion issues (see [SEEMS-LEGIT]). The encoding of the +public key is described in Section 3.1.1.

+
+
+
+
+

+4.3. The p Parameter +

+

The REQUIRED "p" (proof) Parameter is a byte sequence that specifies the proof +that the client provides to attest to possessing the credential that matches +its key ID.

+
+
+
+
+

+4.4. The s Parameter +

+

The REQUIRED "s" (signature scheme) Parameter is an integer that specifies the +signature scheme used to compute the proof transmitted in the p Parameter. +Its value is an integer between 0 and 65535 inclusive from the IANA "TLS +SignatureScheme" registry maintained at +<https://www.iana.org/assignments/tls-parameters/tls-parameters.xhtml#tls-signaturescheme>.

+
+
+
+
+

+4.5. The v Parameter +

+

The REQUIRED "v" (verification) Parameter is a byte sequence that specifies the +verification that the client provides to attest to possessing the key exporter +output (see Section 3.2 for details). This avoids issues with signature schemes +where certain keys can generate signatures that are valid for multiple inputs +(see [SEEMS-LEGIT]).

+
+
+
+
+
+
+

+5. Example +

+

For example, the key ID "basement" authenticating using Ed25519 +[ED25519] could produce the following header field:

+
+
+
+
+NOTE: '\' line wrapping per RFC 8792
+
+Authorization: Concealed \
+  k=YmFzZW1lbnQ, \
+  a=VGhpcyBpcyBh-HB1YmxpYyBrZXkgaW4gdXNl_GhlcmU, \
+  s=2055, \
+  v=dmVyaWZpY2F0aW9u_zE2Qg, \
+  p=QzpcV2luZG93c_xTeXN0ZW0zMlxkcml2ZXJz-ENyb3dkU\
+    3RyaWtlXEMtMDAwMDAwMDAyOTEtMD-wMC0w_DAwLnN5cw
+
+
+
Figure 5: +Example Header Field +
+
+
+
+
+
+

+6. Server Handling +

+

In this section, we subdivide the server role in two:

+
    +
  • +

    the "frontend" runs in the HTTP server that terminates the TLS or QUIC +connection created by the client.

    +
  • +
  • +

    the "backend" runs in the HTTP server that has access to the database of +accepted key identifiers and public keys.

    +
  • +
+

In most deployments, we expect the frontend and backend roles to both be +implemented in a single HTTP origin server (as defined in Section 3.6 of [HTTP]). However, these roles can be split such that the frontend is an HTTP +gateway (as defined in Section 3.7 of [HTTP]) and the backend is an HTTP +origin server.

+
+
+

+6.1. Frontend Handling +

+

If a frontend is configured to check the Concealed authentication scheme, it +will parse the Authorization (or Proxy-Authorization) header field. If the +authentication scheme is set to "Concealed", the frontend MUST validate that +all the required authentication parameters are present and can be parsed +correctly as defined in Section 4. If any parameter is missing or fails +to parse, the frontend MUST ignore the entire Authorization (or +Proxy-Authorization) header field.

+

The frontend then uses the data from these authentication parameters to compute +the key exporter output, as defined in Section 3.2. The frontend then shares the +header field and the key exporter output with the backend.

+
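A frontend's parameter validation can be sketched as follows. This is a non-normative Python outline with invented names; a production parser would follow the generic authentication parameter grammar of Section 11 of [HTTP] rather than this simplified splitting.

```python
import re

# Value syntax from Figure 4: base64url without padding, and a
# decimal integer without leading zeroes.
BYTE_SEQ = re.compile(r"^[A-Za-z0-9_-]*$")
INTEGER = re.compile(r"^(0|[1-9][0-9]{0,4})$")

def parse_concealed(header_value: str):
    scheme, _, rest = header_value.partition(" ")
    if scheme.lower() != "concealed":  # scheme names are case-insensitive
        return None
    params = {}
    for part in rest.split(","):
        key, _, value = part.strip().partition("=")
        params[key] = value
    if not {"k", "a", "s", "p", "v"} <= params.keys():
        return None  # a required parameter is missing
    if not all(BYTE_SEQ.match(params[p]) for p in ("k", "a", "p", "v")):
        return None  # not valid base64url without padding
    if not INTEGER.match(params["s"]):
        return None  # not a valid integer parameter
    return params
```

Per the requirement above, a `None` result means the frontend ignores the entire Authorization (or Proxy-Authorization) header field.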
+
+
+
+

+6.2. Communication between Frontend and Backend +

+

If the frontend and backend roles are implemented in the same machine, this can +be handled by a simple function call.

+

If the roles are split between two separate HTTP servers, then the backend +won't be able to directly access the TLS keying material exporter from the TLS +connection between the client and frontend, so the frontend needs to explicitly +send it. This document defines the "Concealed-Auth-Export" request header field +for this purpose. The Concealed-Auth-Export header field's value is a +Structured Field Byte Sequence (see Section 3.3.5 of [STRUCTURED-FIELDS]) that contains the 48-byte key exporter output +(see Section 3.2), without any parameters. Note that Structured Field Byte +Sequences are encoded using the non-URL-safe variant of base64. For example:

+
+
+
+
+NOTE: '\' line wrapping per RFC 8792
+
+Concealed-Auth-Export: :VGhpc+BleGFtcGxlIFRMU/BleHBvcn\
+  Rlc+BvdXRwdXQ/aXMgNDggYnl0ZXMgI/+h:
+
+
+
Figure 6: +Example Concealed-Auth-Export Header Field +
+
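As a non-normative check (in Python), the example value above decodes to exactly 48 bytes once the RFC 8792 line wrapping and the Byte Sequence delimiters are removed; note the standard, non-URL-safe base64 alphabet:

```python
import base64

# The example Concealed-Auth-Export value from Figure 6, unwrapped.
header_value = ("VGhpc+BleGFtcGxlIFRMU/BleHBvcn"
                "Rlc+BvdXRwdXQ/aXMgNDggYnl0ZXMgI/+h")
exporter_output = base64.b64decode(header_value)
print(len(exporter_output))  # 48
```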
+

The frontend SHALL forward the HTTP request to the backend, including the +original unmodified Authorization (or Proxy-Authorization) header field and the +newly added Concealed-Auth-Export header field.

+

Note that, since the security of this mechanism requires the key exporter +output to be correct, backends need to trust frontends to send it truthfully. +This trust relationship is common because the frontend already needs access to +the TLS certificate private key in order to respond to requests. HTTP servers +that parse the Concealed-Auth-Export header field MUST ignore it unless they +have already established that they trust the sender. Similarly, frontends that +send the Concealed-Auth-Export header field MUST ensure that they do not +forward any Concealed-Auth-Export header field received from the client.

+
+
+
+
+

+6.3. Backend Handling +

+

Once the backend receives the Authorization (or Proxy-Authorization) header +field and the key exporter output, it looks up the key ID in its database of +public keys. The backend SHALL then perform the following checks:

+
    +
  • +

    validate that all the required authentication parameters are present and can +be parsed correctly as defined in Section 4

    +
  • +
  • +

    ensure the key ID is present in the backend's database and maps to a +corresponding public key

    +
  • +
  • +

    validate that the public key from the database is equal to the one in the +Authorization (or Proxy-Authorization) header field

    +
  • +
  • +

    validate that the verification field from the Authorization (or +Proxy-Authorization) header field matches the one extracted from the key +exporter output

    +
  • +
  • +

    verify the cryptographic signature as defined in Section 3.3

    +
  • +
+

If all of these checks succeed, the backend can consider the request to be +properly authenticated, and can reply accordingly (the backend can also forward +the request to another HTTP server).

+

If any of the above checks fail, the backend MUST treat it as if the +Authorization (or Proxy-Authorization) header field was missing.

+
+
+
+
+

+6.4. Non-Probeable Server Handling +

+

Servers that wish to introduce resources whose existence cannot be probed need +to ensure that they do not reveal any information about those resources to +unauthenticated clients. In particular, such servers MUST respond to +authentication failures with the exact same response that they would have used +for non-existent resources. For example, this can mean using HTTP status code +404 (Not Found) instead of 401 (Unauthorized).

+

The authentication checks described above can take time to compute, and an +attacker could detect use of this mechanism if that time is observable by +comparing the timing of a request for a known non-existent resource to the +timing of a request for a potentially authenticated resource. Servers can +mitigate this observability by slightly delaying responses to some non-existent +resources such that the timing of the authentication verification is not +observable. This delay needs to be carefully considered to avoid having the +delay itself leak the fact that this origin uses this mechanism at all.

+

Non-probeable resources also need to be non-discoverable for unauthenticated +users. For example, if a server operator wishes to hide an authenticated +resource by pretending it does not exist to unauthenticated users, then the +server operator needs to ensure there are no unauthenticated pages with links +to that resource, and no other out-of-band ways for unauthenticated users to +discover this resource.

+
+
+
+
+
+
+

+7. Requirements on TLS Usage +

+

This authentication scheme is only defined for uses of HTTP with TLS +[TLS]. This includes any use of HTTP over TLS as typically used for +HTTP/2 [HTTP/2], or HTTP/3 [HTTP/3] where the transport protocol uses TLS as its +authentication and key exchange mechanism [QUIC-TLS].

+

Because the TLS keying material exporter is only secure for authentication when +it is uniquely bound to the TLS session [RFC7627], the Concealed +authentication scheme requires either one of the following properties:

+
    +
  • +

    The TLS version in use is greater or equal to 1.3 [TLS].

    +
  • +
  • +

    The TLS version in use is 1.2 and the Extended Master Secret extension +[RFC7627] has been negotiated.

    +
  • +
+

Clients MUST NOT use the Concealed authentication scheme on connections that do +not meet one of the two properties above. If a server receives a request that +uses this authentication scheme on a connection that meets neither of the above +properties, the server MUST treat the request as if the authentication were not +present.

+
+
+
+
+

+8. Security Considerations +

+

The Concealed HTTP authentication scheme allows a client to authenticate to an +origin server while guaranteeing freshness and without the need for the server +to transmit a nonce to the client. This allows the server to accept +authenticated clients without revealing that it supports or expects +authentication for some resources. It also allows authentication without the +client leaking the presence of authentication to observers due to clear-text +TLS Client Hello extensions.

+

Since the freshness described above is provided by a TLS key exporter, it can +be as old as the underlying TLS connection. Servers can require better +freshness by forcing clients to create new connections using mechanisms such as +the GOAWAY frame (see Section 5.2 of [HTTP/3]).

+

The authentication proofs described in this document are not bound to +individual HTTP requests; if the key is used for authentication proofs on +multiple requests on the same connection, they will all be identical. This +allows for better compression when sending over the wire, but implies that +client implementations that multiplex different security contexts over a single +HTTP connection need to ensure that those contexts cannot read each other's +header fields. Otherwise, one context would be able to replay the Authorization +header field of another. This constraint is met by modern Web browsers. If an +attacker were to compromise the browser such that it could access another +context's memory, the attacker might also be able to access the corresponding +key, so binding authentication to requests would not provide much benefit in +practice.

+

Authentication asymmetric keys used for the Concealed HTTP authentication +scheme MUST NOT be reused in other protocols. Even though we attempt to +mitigate these issues by adding a static prefix to the signed data (see +Section 3.3), reusing keys could undermine the security guarantees of the +authentication.

+

Origins offering this scheme can link requests that use the same key. +However, requests are not linkable across origins if the keys used are specific +to the individual origins using this scheme.

+
+
+
+
+

+9. IANA Considerations +

+
+
+

+9.1. HTTP Authentication Schemes Registry +

+

This document, if approved, requests IANA to register the following entry in +the "HTTP Authentication Schemes" Registry maintained at +<https://www.iana.org/assignments/http-authschemes>:

+
+
Authentication Scheme Name:
+
+

Concealed

+
+
+
Reference:
+
+

This document

+
+
+
Notes:
+
+

None

+
+
+
+
+
+
+
+

+9.2. TLS Keying Material Exporter Labels +

+

This document, if approved, requests IANA to register the following entry in +the "TLS Exporter Labels" registry maintained at +<https://www.iana.org/assignments/tls-parameters#exporter-labels>:

+
+
Value:
+
+

EXPORTER-HTTP-Concealed-Authentication

+
+
+
DTLS-OK:
+
+

N

+
+
+
Recommended:
+
+

Y

+
+
+
Reference:
+
+

This document

+
+
+
+
+
+
+
+

+9.3. HTTP Field Name +

+

This document, if approved, requests IANA to register the following entry in +the "Hypertext Transfer Protocol (HTTP) Field Name" registry maintained at +<https://www.iana.org/assignments/http-fields/http-fields.xhtml>:

+
+
Field Name:
+
+

Concealed-Auth-Export

+
+
+
Status:
+
+

permanent

+
+
+
Structured Type:
+
+

Item

+
+
+
Reference:
+
+

This document

+
+
+
Comments:
+
+

None

+
+
+
+
+
+
+
+
+
+

+10. References +

+
+
+

+10.1. Normative References +

+
+
[ABNF]
+
+Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, DOI 10.17487/RFC5234, , <https://www.rfc-editor.org/rfc/rfc5234>.
+
+
[BASE64]
+
+Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, DOI 10.17487/RFC4648, , <https://www.rfc-editor.org/rfc/rfc4648>.
+
+
[EdDSA]
+
+Josefsson, S. and I. Liusvaara, "Edwards-Curve Digital Signature Algorithm (EdDSA)", RFC 8032, DOI 10.17487/RFC8032, , <https://www.rfc-editor.org/rfc/rfc8032>.
+
+
[HTTP]
+
+Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, , <https://www.rfc-editor.org/rfc/rfc9110>.
+
+
[KEY-EXPORT]
+
+Rescorla, E., "Keying Material Exporters for Transport Layer Security (TLS)", RFC 5705, DOI 10.17487/RFC5705, , <https://www.rfc-editor.org/rfc/rfc5705>.
+
+
[PKCS1]
+
+Moriarty, K., Ed., Kaliski, B., Jonsson, J., and A. Rusch, "PKCS #1: RSA Cryptography Specifications Version 2.2", RFC 8017, DOI 10.17487/RFC8017, , <https://www.rfc-editor.org/rfc/rfc8017>.
+
+
[QUIC]
+
+Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based Multiplexed and Secure Transport", RFC 9000, DOI 10.17487/RFC9000, , <https://www.rfc-editor.org/rfc/rfc9000>.
+
+
[RFC2119]
+
+Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/rfc/rfc2119>.
+
+
[RFC7627]
+
+Bhargavan, K., Ed., Delignat-Lavaud, A., Pironti, A., Langley, A., and M. Ray, "Transport Layer Security (TLS) Session Hash and Extended Master Secret Extension", RFC 7627, DOI 10.17487/RFC7627, , <https://www.rfc-editor.org/rfc/rfc7627>.
+
+
[RFC8174]
+
+Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/rfc/rfc8174>.
+
+
[RFC8792]
+
+Watsen, K., Auerswald, E., Farrel, A., and Q. Wu, "Handling Long Lines in Content of Internet-Drafts and RFCs", RFC 8792, DOI 10.17487/RFC8792, , <https://www.rfc-editor.org/rfc/rfc8792>.
+
+
[STRUCTURED-FIELDS]
+
+Nottingham, M. and P. Kamp, "Structured Field Values for HTTP", RFC 8941, DOI 10.17487/RFC8941, , <https://www.rfc-editor.org/rfc/rfc8941>.
+
+
[TLS]
+
+Rescorla, E., "The Transport Layer Security (TLS) Protocol Version 1.3", RFC 8446, DOI 10.17487/RFC8446, , <https://www.rfc-editor.org/rfc/rfc8446>.
+
+
[URI]
+
+Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, DOI 10.17487/RFC3986, , <https://www.rfc-editor.org/rfc/rfc3986>.
+
+
[X.690]
+
+ITU-T, "Information technology - ASN.1 encoding Rules: Specification of Basic Encoding Rules (BER), Canonical Encoding Rules (CER) and Distinguished Encoding Rules (DER)", ISO/IEC 8824-1:2021 , .
+
+
+
+
+
+
+

+10.2. Informative References +

+
+
[ED25519]
+
+Josefsson, S. and J. Schaad, "Algorithm Identifiers for Ed25519, Ed448, X25519, and X448 for Use in the Internet X.509 Public Key Infrastructure", RFC 8410, DOI 10.17487/RFC8410, , <https://www.rfc-editor.org/rfc/rfc8410>.
+
+
[HOBA]
+
+Farrell, S., Hoffman, P., and M. Thomas, "HTTP Origin-Bound Authentication (HOBA)", RFC 7486, DOI 10.17487/RFC7486, , <https://www.rfc-editor.org/rfc/rfc7486>.
+
+
[HTTP/2]
+
+Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113, DOI 10.17487/RFC9113, , <https://www.rfc-editor.org/rfc/rfc9113>.
+
+
[HTTP/3]
+
+Bishop, M., Ed., "HTTP/3", RFC 9114, DOI 10.17487/RFC9114, , <https://www.rfc-editor.org/rfc/rfc9114>.
+
+
[MASQUE-ORIGINAL]
+
+Schinazi, D., "The MASQUE Protocol", Work in Progress, Internet-Draft, draft-schinazi-masque-00, , <https://datatracker.ietf.org/doc/html/draft-schinazi-masque-00>.
+
+
[QUIC-TLS]
+
+Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure QUIC", RFC 9001, DOI 10.17487/RFC9001, , <https://www.rfc-editor.org/rfc/rfc9001>.
+
+
[SEEMS-LEGIT]
+
+Jackson, D., Cremers, C., Cohn-Gordon, K., and R. Sasse, "Seems Legit: Automated Analysis of Subtle Attacks on Protocols That Use Signatures", CCS '19: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 2165–2180, DOI 10.1145/3319535.3339813, , <https://doi.org/10.1145/3319535.3339813>.
+
+
+
+
+
+
+
+
+

+Acknowledgments +

+

The authors would like to thank many members of the IETF community, as this +document is the fruit of many hallway conversations. In particular, the authors +would like to thank David Benjamin, Reese Enghardt, Nick Harper, Dennis Jackson, Ilari Liusvaara, François Michel, +Lucas Pardue, Justin Richer, Ben Schwartz, Martin Thomson, and Chris A. Wood for their reviews and contributions. The +mechanism described in this document was originally part of the first iteration +of MASQUE [MASQUE-ORIGINAL].

+
+
+
+
+

+Authors' Addresses +

+
+
David Schinazi
+
Google LLC
+
1600 Amphitheatre Parkway
+
+Mountain View, CA 94043 +
+
United States of America
+ +
+
+
David M. Oliver
+
Guardian Project
+ + +
+
+
Jonathan Hoyland
+
Cloudflare Inc.
+ +
+
+
+ + + diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-unprompted-auth.txt b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-unprompted-auth.txt new file mode 100644 index 000000000..17b4f9bc2 --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-unprompted-auth.txt @@ -0,0 +1,773 @@ + + + + +HTTPBIS D. Schinazi +Internet-Draft Google LLC +Intended status: Standards Track D. Oliver +Expires: 11 July 2025 Guardian Project + J. Hoyland + Cloudflare Inc. + 7 January 2025 + + + The Concealed HTTP Authentication Scheme + draft-ietf-httpbis-unprompted-auth-latest + +Abstract + + Most HTTP authentication schemes are probeable in the sense that it + is possible for an unauthenticated client to probe whether an origin + serves resources that require authentication. It is possible for an + origin to hide the fact that it requires authentication by not + generating Unauthorized status codes, however that only works with + non-cryptographic authentication schemes: cryptographic signatures + require a fresh nonce to be signed. Prior to this document, there + was no existing way for the origin to share such a nonce without + exposing the fact that it serves resources that require + authentication. This document defines a new non-probeable + cryptographic authentication scheme. + +About This Document + + This note is to be removed before publishing as an RFC. + + The latest revision of this draft can be found at https://httpwg.org/ + http-extensions/draft-ietf-httpbis-unprompted-auth.html. Status + information for this document may be found at + https://datatracker.ietf.org/doc/draft-ietf-httpbis-unprompted-auth/. + + Discussion of this document takes place on the HTTP Working Group + mailing list (mailto:ietf-http-wg@w3.org), which is archived at + https://lists.w3.org/Archives/Public/ietf-http-wg/. Working Group + information can be found at https://httpwg.org/. 
+ + Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/unprompted-auth. + +Status of This Memo + + This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF). Note that other groups may also distribute + working documents as Internet-Drafts. The list of current Internet- + Drafts is at https://datatracker.ietf.org/drafts/current/. + + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." + + This Internet-Draft will expire on 11 July 2025. + +Copyright Notice + + Copyright (c) 2025 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents (https://trustee.ietf.org/ + license-info) in effect on the date of publication of this document. + Please review these documents carefully, as they describe your rights + and restrictions with respect to this document. Code Components + extracted from this document must include Revised BSD License text as + described in Section 4.e of the Trust Legal Provisions and are + provided without warranty as described in the Revised BSD License. + +Table of Contents + + 1. Introduction + 1.1. Conventions and Definitions + 2. The Concealed Authentication Scheme + 3. Client Handling + 3.1. Key Exporter Context + 3.1.1. Public Key Encoding + 3.2. Key Exporter Output + 3.3. Signature Computation + 4. Authentication Parameters + 4.1. The k Parameter + 4.2. The a Parameter + 4.3. The p Parameter + 4.4. The s Parameter + 4.5. The v Parameter + 5. Example + 6. Server Handling + 6.1. Frontend Handling + 6.2. 
Communication between Frontend and Backend + 6.3. Backend Handling + 6.4. Non-Probeable Server Handling + 7. Requirements on TLS Usage + 8. Security Considerations + 9. IANA Considerations + 9.1. HTTP Authentication Schemes Registry + 9.2. TLS Keying Material Exporter Labels + 9.3. HTTP Field Name + 10. References + 10.1. Normative References + 10.2. Informative References + Acknowledgments + Authors' Addresses + +1. Introduction + + HTTP authentication schemes (see Section 11 of [HTTP]) allow origins + to restrict access for some resources to only authenticated requests. + While these schemes commonly involve a challenge where the origin + asks the client to provide authentication information, it is possible + for clients to send such information unprompted. This is + particularly useful in cases where an origin wants to offer a service + or capability only to "those who know" while all others are given no + indication the service or capability exists. Such designs rely on an + externally-defined mechanism by which keys are distributed. For + example, a company might offer remote employee access to company + services directly via its website using their employee credentials, + or offer access to limited special capabilities for specific + employees, while making discovering (or probing for) such + capabilities difficult. As another example, members of less well- + defined communities might use more ephemeral keys to acquire access + to geography- or capability-specific resources, as issued by an + entity whose user base is larger than the available resources can + support (by having that entity metering the availability of keys + temporally or geographically). + + While digital-signature-based HTTP authentication schemes already + exist (e.g., [HOBA]), they rely on the origin explicitly sending a + fresh challenge to the client, to ensure that the signature input is + fresh. That makes the origin probeable as it sends the challenge to + unauthenticated clients. 
This document defines a new signature-based + authentication scheme that is not probeable. + +1.1. Conventions and Definitions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + + This document uses the notation from Section 1.3 of [QUIC]. + +2. The Concealed Authentication Scheme + + This document defines the "Concealed" HTTP authentication scheme. It + uses asymmetric cryptography. Clients possess a key ID and a public/ + private key pair, and origin servers maintain a mapping of authorized + key IDs to associated public keys. + + The client uses a TLS keying material exporter to generate data to be + signed (see Section 3) then sends the signature using the + Authorization (or Proxy-Authorization) header field (see Section 11 + of [HTTP]). The signature and additional information are exchanged + using authentication parameters (see Section 4). Once the server + receives these, it can check whether the signature validates against + an entry in its database of known keys. The server can then use the + validation result to influence its response to the client, for + example by restricting access to certain resources. + +3. Client Handling + + When a client wishes to use the Concealed HTTP authentication scheme + with a request, it SHALL compute the authentication proof using a TLS + keying material exporter with the following parameters: + + * the label is set to "EXPORTER-HTTP-Concealed-Authentication" + + * the context is set to the structure described in Section 3.1 + + * the exporter output length is set to 48 bytes (see Section 3.2) + + Note that TLS 1.3 keying material exporters are defined in + Section 7.5 of [TLS], while TLS 1.2 keying material exporters are + defined in [KEY-EXPORT]. + +3.1. 
Key Exporter Context + + The TLS key exporter context is described in Figure 1, using the + notation from Section 1.3 of [QUIC]: + + Signature Algorithm (16), + Key ID Length (i), + Key ID (..), + Public Key Length (i), + Public Key (..), + Scheme Length (i), + Scheme (..), + Host Length (i), + Host (..), + Port (16), + Realm Length (i), + Realm (..), + + Figure 1: Key Exporter Context Format + + The key exporter context contains the following fields: + + Signature Algorithm: The signature scheme sent in the s Parameter + (see Section 4.4). + + Key ID: The key ID sent in the k Parameter (see Section 4.1). + + Public Key: The public key used by the server to validate the + signature provided by the client. Its encoding is described in + Section 3.1.1. + + Scheme: The scheme for this request, encoded using the format of the + scheme portion of a URI as defined in Section 3.1 of [URI]. + + Host: The host for this request, encoded using the format of the + host portion of a URI as defined in Section 3.2.2 of [URI]. + + Port: The port for this request, encoded in network byte order. + Note that the port is either included in the URI, or is the + default port for the scheme in use; see Section 3.2.3 of [URI]. + + Realm: The realm of authentication that is sent in the realm + authentication parameter (Section 11.5 of [HTTP]). If the realm + authentication parameter is not present, this SHALL be empty. + This document does not define a means for the origin to + communicate a realm to the client. If a client is not configured + to use a specific realm, it SHALL use an empty realm and SHALL NOT + send the realm authentication parameter. + + The Signature Algorithm and Port fields are encoded as unsigned + 16-bit integers in network byte order. The Key ID, Public Key, + Scheme, Host, and Realm fields are length prefixed strings; they are + preceded by a Length field that represents their length in bytes. 
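The field layout above can be sketched in Python (a non-normative illustration; the function names are invented for this sketch). The 16-bit fields are big-endian, and each length prefix uses the QUIC variable-length integer encoding from Section 16 of [QUIC]:

```python
import struct

def quic_varint(n: int) -> bytes:
    # QUIC variable-length integer (RFC 9000, Section 16),
    # encoded in the minimum number of bytes.
    if n < 1 << 6:
        return n.to_bytes(1, "big")
    if n < 1 << 14:
        return (n | 0x4000).to_bytes(2, "big")
    if n < 1 << 30:
        return (n | 0x80000000).to_bytes(4, "big")
    return (n | 0xC000000000000000).to_bytes(8, "big")

def key_exporter_context(sig_alg: int, key_id: bytes, public_key: bytes,
                         scheme: str, host: str, port: int,
                         realm: bytes = b"") -> bytes:
    # Signature Algorithm and Port are unsigned 16-bit big-endian integers;
    # the remaining fields are varint-length-prefixed byte strings.
    ctx = struct.pack("!H", sig_alg)
    for field in (key_id, public_key, scheme.encode("ascii"), host.encode("ascii")):
        ctx += quic_varint(len(field)) + field
    ctx += struct.pack("!H", port)
    ctx += quic_varint(len(realm)) + realm
    return ctx
```

For instance, signature scheme 0x0807 (2055) with key ID "basement", scheme "https", host "example.com", port 443, and an empty realm would produce the byte string used as the TLS exporter context.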
+ These length fields are encoded using the variable-length integer + encoding from Section 16 of [QUIC] and MUST be encoded in the minimum + number of bytes necessary. + +3.1.1. Public Key Encoding + + Both the "Public Key" field of the TLS key exporter context (see + above) and the a Parameter (see Section 4.2) carry the same public + key. The encoding of the public key is determined by the Signature + Algorithm in use as follows: + + RSASSA-PSS algorithms: The public key is an RSAPublicKey structure + [PKCS1] encoded in DER [X.690]. BER encodings which are not DER + MUST be rejected. + + ECDSA algorithms: The public key is a + UncompressedPointRepresentation structure defined in + Section 4.2.8.2 of [TLS], using the curve specified by the + SignatureScheme. + + EdDSA algorithms: The public key is the byte string encoding defined + in [EdDSA]. + + This document does not define the public key encodings for other + algorithms. In order for a SignatureScheme to be usable with the + Concealed HTTP authentication scheme, its public key encoding needs + to be defined in a corresponding document. + +3.2. Key Exporter Output + + The key exporter output is 48 bytes long. Of those, the first 32 + bytes are part of the input to the signature and the next 16 bytes + are sent alongside the signature. This allows the recipient to + confirm that the exporter produces the right values. This is + described in Figure 2, using the notation from Section 1.3 of [QUIC]: + + Signature Input (256), + Verification (128), + + Figure 2: Key Exporter Output Format + + The key exporter output contains the following fields: + + Signature Input: This is part of the data signed using the client's + chosen asymmetric private key (see Section 3.3). + + Verification: The verification is transmitted to the server using + the v Parameter (see Section 4.5). + +3.3. 
Signature Computation + + Once the Signature Input has been extracted from the key exporter + output (see Section 3.2), it is prefixed with static data before + being signed. The signature is computed over the concatenation of: + + * A string that consists of octet 32 (0x20) repeated 64 times + + * The context string "HTTP Concealed Authentication" + + * A single 0 byte which serves as a separator + + * The Signature Input extracted from the key exporter output (see + Section 3.2) + + For example, if the Signature Input has all its 32 bytes set to 01, + the content covered by the signature (in hexadecimal format) would + be: + + 2020202020202020202020202020202020202020202020202020202020202020 + 2020202020202020202020202020202020202020202020202020202020202020 + 4854545020436F6E6365616C65642041757468656E7469636174696F6E + 00 + 0101010101010101010101010101010101010101010101010101010101010101 + + Figure 3: Example Content Covered by Signature + + The purpose of this static prefix is to mitigate issues that could + arise if authentication asymmetric keys were accidentally reused + across protocols (even though this is forbidden, see Section 8). + This construction mirrors that of the TLS 1.3 CertificateVerify + message defined in Section 4.4.3 of [TLS]. + + The resulting signature is then transmitted to the server using the p + Parameter (see Section 4.3). + +4. Authentication Parameters + + This specification defines the following authentication parameters. + + All of the byte sequences below are encoded using base64url (see + Section 5 of [BASE64]) without quotes and without padding. In other + words, the values of these byte-sequence authentication parameters + MUST NOT include any characters other than ASCII letters, digits, + dash and underscore. + + The integer below is encoded without a minus and without leading + zeroes. 
In other words, the value of this integer authentication + parameter MUST NOT include any characters other than digits, and MUST + NOT start with a zero unless the full value is "0". + + Using the syntax from [ABNF]: + + concealed-byte-sequence-param-value = *( ALPHA / DIGIT / "-" / "_" ) + concealed-integer-param-value = %x31-39 *4( DIGIT ) / "0" + + Figure 4: Authentication Parameter Value ABNF + +4.1. The k Parameter + + The REQUIRED "k" (key ID) Parameter is a byte sequence that + identifies which key the client wishes to use to authenticate. This + is used by the backend to point to an entry in a server-side database + of known keys, see Section 6.3. + +4.2. The a Parameter + + The REQUIRED "a" (public key) Parameter is a byte sequence that + specifies the public key used by the server to validate the signature + provided by the client. This avoids key confusion issues (see + [SEEMS-LEGIT]). The encoding of the public key is described in + Section 3.1.1. + +4.3. The p Parameter + + The REQUIRED "p" (proof) Parameter is a byte sequence that specifies + the proof that the client provides to attest to possessing the + credential that matches its key ID. + +4.4. The s Parameter + + The REQUIRED "s" (signature) Parameter is an integer that specifies + the signature scheme used to compute the proof transmitted in the p + Parameter. Its value is an integer between 0 and 65535 inclusive + from the IANA "TLS SignatureScheme" registry maintained at + <https://www.iana.org/assignments/tls-parameters/tls-parameters.xhtml#tls-signaturescheme>. + +4.5. The v Parameter + + The REQUIRED "v" (verification) Parameter is a byte sequence that + specifies the verification that the client provides to attest to + possessing the key exporter output (see Section 3.2 for details). + This avoids issues with signature schemes where certain keys can + generate signatures that are valid for multiple inputs (see + [SEEMS-LEGIT]). + +5. 
Example + + For example, the key ID "basement" authenticating using Ed25519 + [ED25519] could produce the following header field: + + NOTE: '\' line wrapping per RFC 8792 + + Authorization: Concealed \ + k=YmFzZW1lbnQ, \ + a=VGhpcyBpcyBh-HB1YmxpYyBrZXkgaW4gdXNl_GhlcmU, \ + s=2055, \ + v=dmVyaWZpY2F0aW9u_zE2Qg, \ + p=QzpcV2luZG93c_xTeXN0ZW0zMlxkcml2ZXJz-ENyb3dkU\ + 3RyaWtlXEMtMDAwMDAwMDAyOTEtMD-wMC0w_DAwLnN5cw + + Figure 5: Example Header Field + +6. Server Handling + + In this section, we subdivide the server role in two: + + * the "frontend" runs in the HTTP server that terminates the TLS or + QUIC connection created by the client. + + * the "backend" runs in the HTTP server that has access to the + database of accepted key identifiers and public keys. + + In most deployments, we expect the frontend and backend roles to both + be implemented in a single HTTP origin server (as defined in + Section 3.6 of [HTTP]). However, these roles can be split such that + the frontend is an HTTP gateway (as defined in Section 3.7 of [HTTP]) + and the backend is an HTTP origin server. + +6.1. Frontend Handling + + If a frontend is configured to check the Concealed authentication + scheme, it will parse the Authorization (or Proxy-Authorization) + header field. If the authentication scheme is set to "Concealed", + the frontend MUST validate that all the required authentication + parameters are present and can be parsed correctly as defined in + Section 4. If any parameter is missing or fails to parse, the + frontend MUST ignore the entire Authorization (or Proxy- + Authorization) header field. + + The frontend then uses the data from these authentication parameters + to compute the key exporter output, as defined in Section 3.2. The + frontend then shares the header field and the key exporter output + with the backend. + +6.2. 
Communication between Frontend and Backend + + If the frontend and backend roles are implemented on the same + machine, this can be handled by a simple function call. + + If the roles are split between two separate HTTP servers, then the + backend won't be able to directly access the TLS keying material + exporter from the TLS connection between the client and frontend, so + the frontend needs to explicitly send it. This document defines the + "Concealed-Auth-Export" request header field for this purpose. The + Concealed-Auth-Export header field's value is a Structured Field Byte + Sequence (see Section 3.3.5 of [STRUCTURED-FIELDS]) that contains the + 48-byte key exporter output (see Section 3.2), without any + parameters. Note that Structured Field Byte Sequences are encoded + using the non-URL-safe variant of base64. For example: + + NOTE: '\' line wrapping per RFC 8792 + + Concealed-Auth-Export: :VGhpc+BleGFtcGxlIFRMU/BleHBvcn\ + Rlc+BvdXRwdXQ/aXMgNDggYnl0ZXMgI/+h: + + Figure 6: Example Concealed-Auth-Export Header Field + + The frontend SHALL forward the HTTP request to the backend, including + the original unmodified Authorization (or Proxy-Authorization) header + field and the newly added Concealed-Auth-Export header field. + + Note that, since the security of this mechanism requires the key + exporter output to be correct, backends need to trust frontends to + send it truthfully. This trust relationship is common because the + frontend already needs access to the TLS certificate private key in + order to respond to requests. HTTP servers that parse the Concealed- + Auth-Export header field MUST ignore it unless they have already + established that they trust the sender. Similarly, frontends that + send the Concealed-Auth-Export header field MUST ensure that they do + not forward any Concealed-Auth-Export header field received from the + client. + +6.3. 
Backend Handling + + Once the backend receives the Authorization (or Proxy-Authorization) + header field and the key exporter output, it looks up the key ID in + its database of public keys. The backend SHALL then perform the + following checks: + + * validate that all the required authentication parameters are + present and can be parsed correctly as defined in Section 4 + + * ensure the key ID is present in the backend's database and maps to + a corresponding public key + + * validate that the public key from the database is equal to the one + in the Authorization (or Proxy-Authorization) header field + + * validate that the verification field from the Authorization (or + Proxy-Authorization) header field matches the one extracted from + the key exporter output + + * verify the cryptographic signature as defined in Section 3.3 + + If all of these checks succeed, the backend can consider the request + to be properly authenticated, and can reply accordingly (the backend + can also forward the request to another HTTP server). + + If any of the above checks fail, the backend MUST treat it as if the + Authorization (or Proxy-Authorization) header field was missing. + +6.4. Non-Probeable Server Handling + + Servers that wish to introduce resources whose existence cannot be + probed need to ensure that they do not reveal any information about + those resources to unauthenticated clients. In particular, such + servers MUST respond to authentication failures with the exact same + response that they would have used for non-existent resources. For + example, this can mean using HTTP status code 404 (Not Found) instead + of 401 (Unauthorized). + + The authentication checks described above can take time to compute, + and an attacker could detect use of this mechanism if that time is + observable by comparing the timing of a request for a known non- + existent resource to the timing of a request for a potentially + authenticated resource. 
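As a purely hypothetical sketch of one such mitigation (the fixed 50 ms floor below is invented for illustration and not taken from this document), a server might pad the observable latency of responses so that requests hitting the signature-verification path are indistinguishable from requests for genuinely non-existent resources:

```python
import time

FLOOR_SECONDS = 0.050  # hypothetical fixed response-time floor

def respond(handler):
    # Run the (possibly authentication-checking) handler, then pad the
    # observable latency up to a fixed floor so that whether the
    # authentication checks ran is not visible in response timing.
    start = time.monotonic()
    result = handler()
    remaining = FLOOR_SECONDS - (time.monotonic() - start)
    if remaining > 0:
        time.sleep(remaining)
    return result
```

Choosing the floor itself requires care, as the document notes: a conspicuous delay on some resources can leak that the mechanism is in use.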
Servers can mitigate this observability by + slightly delaying responses to some non-existent resources such that + the timing of the authentication verification is not observable. + This delay needs to be carefully considered to avoid having the delay + itself leak the fact that this origin uses this mechanism at all. + + Non-probeable resources also need to be non-discoverable for + unauthenticated users. For example, if a server operator wishes to + hide an authenticated resource by pretending it does not exist to + unauthenticated users, then the server operator needs to ensure there + are no unauthenticated pages with links to that resource, and no + other out-of-band ways for unauthenticated users to discover this + resource. + +7. Requirements on TLS Usage + + This authentication scheme is only defined for uses of HTTP with TLS + [TLS]. This includes any use of HTTP over TLS as typically used for + HTTP/2 [HTTP/2], or HTTP/3 [HTTP/3] where the transport protocol uses + TLS as its authentication and key exchange mechanism [QUIC-TLS]. + + Because the TLS keying material exporter is only secure for + authentication when it is uniquely bound to the TLS session + [RFC7627], the Concealed authentication scheme requires either one of + the following properties: + + * The TLS version in use is greater or equal to 1.3 [TLS]. + + * The TLS version in use is 1.2 and the Extended Master Secret + extension [RFC7627] has been negotiated. + + Clients MUST NOT use the Concealed authentication scheme on + connections that do not meet one of the two properties above. If a + server receives a request that uses this authentication scheme on a + connection that meets neither of the above properties, the server + MUST treat the request as if the authentication were not present. + +8. 
Security Considerations + + The Concealed HTTP authentication scheme allows a client to + authenticate to an origin server while guaranteeing freshness and + without the need for the server to transmit a nonce to the client. + This allows the server to accept authenticated clients without + revealing that it supports or expects authentication for some + resources. It also allows authentication without the client leaking + the presence of authentication to observers due to clear-text TLS + Client Hello extensions. + + Since the freshness described above is provided by a TLS key + exporter, it can be as old as the underlying TLS connection. Servers + can require better freshness by forcing clients to create new + connections using mechanisms such as the GOAWAY frame (see + Section 5.2 of [HTTP/3]). + + The authentication proofs described in this document are not bound to + individual HTTP requests; if the key is used for authentication + proofs on multiple requests on the same connection, they will all be + identical. This allows for better compression when sending over the + wire, but implies that client implementations that multiplex + different security contexts over a single HTTP connection need to + ensure that those contexts cannot read each other's header fields. + Otherwise, one context would be able to replay the Authorization + header field of another. This constraint is met by modern Web + browsers. If an attacker were to compromise the browser such that it + could access another context's memory, the attacker might also be + able to access the corresponding key, so binding authentication to + requests would not provide much benefit in practice. + + Authentication asymmetric keys used for the Concealed HTTP + authentication scheme MUST NOT be reused in other protocols. 
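The static prefix that Section 3.3 prepends to the signed data, as a mitigation against accidental cross-protocol key reuse, can be sketched as follows (a non-normative illustration of that construction):

```python
def concealed_signature_payload(signature_input: bytes) -> bytes:
    # Section 3.3 construction: 64 bytes of 0x20, the context string,
    # a single zero-byte separator, then the 32-byte Signature Input
    # taken from the TLS key exporter output.
    if len(signature_input) != 32:
        raise ValueError("Signature Input must be exactly 32 bytes")
    return (b"\x20" * 64
            + b"HTTP Concealed Authentication"
            + b"\x00"
            + signature_input)
```

The resulting byte string is what the client's private key actually signs, mirroring the TLS 1.3 CertificateVerify construction.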
Even + though we attempt to mitigate these issues by adding a static prefix + to the signed data (see Section 3.3), reusing keys could undermine + the security guarantees of the authentication. + + Origins offering this scheme can link requests that use the same key. + However, requests are not linkable across origins if the keys used + are specific to the individual origins using this scheme. + +9. IANA Considerations + +9.1. HTTP Authentication Schemes Registry + + This document, if approved, requests IANA to register the following + entry in the "HTTP Authentication Schemes" Registry maintained at + : + + Authentication Scheme Name: Concealed + Reference: This document + Notes: None + +9.2. TLS Keying Material Exporter Labels + + This document, if approved, requests IANA to register the following + entry in the "TLS Exporter Labels" registry maintained at + : + + Value: EXPORTER-HTTP-Concealed-Authentication + DTLS-OK: N + Recommended: Y + Reference: This document + +9.3. HTTP Field Name + + This document, if approved, requests IANA to register the following + entry in the "Hypertext Transfer Protocol (HTTP) Field Name" registry + maintained at : + + Field Name: Concealed-Auth-Export + Status: permanent + Structured Type: Item + Reference: This document + Comments: None + +10. References + +10.1. Normative References + + [ABNF] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax + Specifications: ABNF", STD 68, RFC 5234, + DOI 10.17487/RFC5234, January 2008, + . + + [BASE64] Josefsson, S., "The Base16, Base32, and Base64 Data + Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006, + . + + [EdDSA] Josefsson, S. and I. Liusvaara, "Edwards-Curve Digital + Signature Algorithm (EdDSA)", RFC 8032, + DOI 10.17487/RFC8032, January 2017, + . + + [HTTP] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, + Ed., "HTTP Semantics", STD 97, RFC 9110, + DOI 10.17487/RFC9110, June 2022, + . 
+ + [KEY-EXPORT] + Rescorla, E., "Keying Material Exporters for Transport + Layer Security (TLS)", RFC 5705, DOI 10.17487/RFC5705, + March 2010, . + + [PKCS1] Moriarty, K., Ed., Kaliski, B., Jonsson, J., and A. Rusch, + "PKCS #1: RSA Cryptography Specifications Version 2.2", + RFC 8017, DOI 10.17487/RFC8017, November 2016, + . + + [QUIC] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based + Multiplexed and Secure Transport", RFC 9000, + DOI 10.17487/RFC9000, May 2021, + . + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + . + + [RFC7627] Bhargavan, K., Ed., Delignat-Lavaud, A., Pironti, A., + Langley, A., and M. Ray, "Transport Layer Security (TLS) + Session Hash and Extended Master Secret Extension", + RFC 7627, DOI 10.17487/RFC7627, September 2015, + . + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, . + + [RFC8792] Watsen, K., Auerswald, E., Farrel, A., and Q. Wu, + "Handling Long Lines in Content of Internet-Drafts and + RFCs", RFC 8792, DOI 10.17487/RFC8792, June 2020, + . + + [STRUCTURED-FIELDS] + Nottingham, M. and P. Kamp, "Structured Field Values for + HTTP", RFC 8941, DOI 10.17487/RFC8941, February 2021, + . + + [TLS] Rescorla, E., "The Transport Layer Security (TLS) Protocol + Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, + . + + [URI] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform + Resource Identifier (URI): Generic Syntax", STD 66, + RFC 3986, DOI 10.17487/RFC3986, January 2005, + . + + [X.690] ITU-T, "Information technology - ASN.1 encoding Rules: + Specification of Basic Encoding Rules (BER), Canonical + Encoding Rules (CER) and Distinguished Encoding Rules + (DER)", ISO/IEC 8824-1:2021 , February 2021. + +10.2. Informative References + + [ED25519] Josefsson, S. and J. 
Schaad, "Algorithm Identifiers for + Ed25519, Ed448, X25519, and X448 for Use in the Internet + X.509 Public Key Infrastructure", RFC 8410, + DOI 10.17487/RFC8410, August 2018, + . + + [HOBA] Farrell, S., Hoffman, P., and M. Thomas, "HTTP Origin- + Bound Authentication (HOBA)", RFC 7486, + DOI 10.17487/RFC7486, March 2015, + . + + [HTTP/2] Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113, + DOI 10.17487/RFC9113, June 2022, + . + + [HTTP/3] Bishop, M., Ed., "HTTP/3", RFC 9114, DOI 10.17487/RFC9114, + June 2022, . + + [MASQUE-ORIGINAL] + Schinazi, D., "The MASQUE Protocol", Work in Progress, + Internet-Draft, draft-schinazi-masque-00, 28 February + 2019, . + + [QUIC-TLS] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure + QUIC", RFC 9001, DOI 10.17487/RFC9001, May 2021, + . + + [SEEMS-LEGIT] + Jackson, D., Cremers, C., Cohn-Gordon, K., and R. Sasse, + "Seems Legit: Automated Analysis of Subtle Attacks on + Protocols That Use Signatures", CCS '19: Proceedings of + the 2019 ACM SIGSAC Conference on Computer and + Communications Security, pp. 2165–2180, + DOI 10.1145/3319535.3339813, 2019, + . + +Acknowledgments + + The authors would like to thank many members of the IETF community, + as this document is the fruit of many hallway conversations. In + particular, the authors would like to thank David Benjamin, Reese + Enghardt, Nick Harper, Dennis Jackson, Ilari Liusvaara, François + Michel, Lucas Pardue, Justin Richer, Ben Schwartz, Martin Thomson, + and Chris A. Wood for their reviews and contributions. The mechanism + described in this document was originally part of the first iteration + of MASQUE [MASQUE-ORIGINAL]. + +Authors' Addresses + + David Schinazi + Google LLC + 1600 Amphitheatre Parkway + Mountain View, CA 94043 + United States of America + Email: dschinazi.ietf@gmail.com + + + David M. Oliver + Guardian Project + Email: david@guardianproject.info + URI: https://guardianproject.info + + + Jonathan Hoyland + Cloudflare Inc. 
+ Email: jonathan.hoyland@gmail.com diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-zstd-window-size.html b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-zstd-window-size.html new file mode 100644 index 000000000..9e653648b --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-zstd-window-size.html @@ -0,0 +1,1375 @@ + + + + + + +Window Sizing for Zstandard Content Encoding + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Internet-DraftZstd Window SizeJanuary 2025
Jaju & HandteExpires 11 July 2025[Page]
+
+
+
+
Workgroup:
+
HTTPBIS
+
Internet-Draft:
+
draft-ietf-httpbis-zstd-window-size-latest
+
Updates:
+
+8878 (if approved)
+
Published:
+
+ +
+
Intended Status:
+
Informational
+
Expires:
+
+
Authors:
+
+
+
N. Jaju, Ed. +
+
Google
+
+
+
F. Handte, Ed. +
+
Meta Platforms, Inc.
+
+
+
+
+

Window Sizing for Zstandard Content Encoding

+
+

Abstract

+

Deployments of Zstandard, or "zstd", can use different window sizes to limit +memory usage during compression and decompression. Some browsers and user +agents limit window sizes to mitigate memory usage concerns, causing +interoperability issues. This document updates the window size limit in RFC8878 +from a recommendation to a requirement in HTTP contexts.

+
+
+

+About This Document +

+

This note is to be removed before publishing as an RFC.

+

+ The latest revision of this draft can be found at https://httpwg.org/http-extensions/draft-ietf-httpbis-zstd-window-size.html. + Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ietf-httpbis-zstd-window-size/.

+

+ Discussion of this document takes place on the + HTTP Working Group mailing list (mailto:ietf-http-wg@w3.org), + which is archived at https://lists.w3.org/Archives/Public/ietf-http-wg/.

+

Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/zstd-window-size.

+
+
+
+

+Status of This Memo +

+

+ This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79.

+

+ Internet-Drafts are working documents of the Internet Engineering Task + Force (IETF). Note that other groups may also distribute working + documents as Internet-Drafts. The list of current Internet-Drafts is + at https://datatracker.ietf.org/drafts/current/.

+

+ Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress."

+

+ This Internet-Draft will expire on 11 July 2025.

+
+
+ + +
+
+

+1. Introduction +

+

Zstandard, or "zstd", specified in [RFC8878], is a lossless data compression +mechanism similar to gzip. When used with HTTP, the "zstd" content coding +token signals to the decoder that the content is Zstandard-compressed.

+

An important property of Zstandard-compressed content is its Window_Size +([RFC8878], Section 3.1.1.1.2), which describes the maximum distance for +back-references and therefore how much of the content must be kept in memory +during decompression.

+

The minimum Window_Size is 1 KB. The maximum Window_Size is +(1<<41) + 7*(1<<38) bytes, which is 3.75 TB. Larger Window_Size values tend +to improve the compression ratio, but at the cost of increased memory usage.

+

To protect against unreasonable memory usage, some browsers and user agents +limit the maximum Window_Size they will handle. This causes failures to decode +responses when the content is compressed with a larger Window_Size than the +recipient allows, leading to decreased interoperability.

+

[RFC8878], Section 3.1.1.1.2 recommends that decoders support a Window_Size +of up to 8 MB, and that encoders not generate frames using a Window_Size larger +than 8 MB. However, it imposes no requirements.

+

This document updates [RFC8878] to enforce Window_Size limits on the encoder +and decoder for the "zstd" HTTP content coding.

+
+
+
+
+

+2. Conventions and Definitions +

+

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", +"MAY", and "OPTIONAL" in this document are to be interpreted as +described in BCP 14 [RFC2119] [RFC8174] when, and only when, they +appear in all capitals, as shown here.

+
+
+
+
+

+3. Window Size +

+

To ensure interoperability, when using the "zstd" content coding, decoders MUST +support a Window_Size of up to and including 8 MB, and encoders MUST NOT +generate frames requiring a Window_Size larger than 8 MB (see +Section 5.1).
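A decoder can enforce this cap before allocating its window by computing Window_Size from the one-byte Window_Descriptor defined in [RFC8878], Section 3.1.1.1.2. The sketch below is illustrative, not part of the draft: the function names are hypothetical, and frames that set Single_Segment_Flag (which carry no Window_Descriptor) are not handled here.

```python
MAX_WINDOW_SIZE = 8 * (1 << 20)  # the 8 MB limit this document requires

def window_size_from_descriptor(wd: int) -> int:
    """Window_Size = windowBase + windowAdd (RFC 8878, Sec. 3.1.1.1.2)."""
    exponent = 10 + (wd >> 3)    # Exponent field: upper 5 bits
    mantissa = wd & 0x07         # Mantissa field: lower 3 bits
    window_base = 1 << exponent
    window_add = (window_base // 8) * mantissa
    return window_base + window_add

def frame_acceptable(wd: int) -> bool:
    """True if a conforming decoder must accept this frame's window."""
    return window_size_from_descriptor(wd) <= MAX_WINDOW_SIZE
```

With this encoding, descriptor 0 yields the 1 KB minimum and descriptor 0xFF yields the (1&lt;&lt;41) + 7*(1&lt;&lt;38) maximum, matching the bounds given earlier; the largest acceptable descriptor under this document's rule is 104 (exponent 23, mantissa 0, i.e. exactly 8 MB).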

+
+
+
+
+

+4. Security Considerations +

+

This document introduces no new security considerations beyond those discussed +in [RFC8878].

+

Note that decoders still need to account for the possibility of receiving +oversized frames that do not follow the window size limit specified in this +document, and to fail decoding when such invalid frames are received.

+
+
+
+
+

+5. IANA Considerations +

+
+
+

+5.1. Content Encoding +

+

This document updates the entry added in [RFC8878] to the "HTTP Content +Coding Registry" +within the "Hypertext Transfer Protocol (HTTP) +Parameters" +registry:

+
+
Name:
+
+

zstd

+
+
+
Description:
+
+

A stream of bytes compressed using the Zstandard protocol with a Window_Size +of not more than 8 MB.

+
+
+
Reference:
+
+

This document and [RFC8878]

+
+
+
+
+
+
+
+
+
+

+6. Normative References +

+
+
[RFC2119]
+
+Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/rfc/rfc2119>.
+
+
[RFC8174]
+
+Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/rfc/rfc8174>.
+
+
[RFC8878]
+
+Collet, Y. and M. Kucherawy, Ed., "Zstandard Compression and the 'application/zstd' Media Type", RFC 8878, DOI 10.17487/RFC8878, , <https://www.rfc-editor.org/rfc/rfc8878>.
+
+
+
+
+
+
+

+Acknowledgments +

+

Zstandard was developed by Yann Collet.

+

The authors would like to thank Yann Collet, Klaus Post, Adam Rice, and members +of the Web Performance Working Group in the W3C for collaborating on the window +size issue and helping to formulate a solution. Also, thank you to Nick Terrell +for providing feedback that went into RFC 8478 and RFC 8878.

+
+
+
+
+

+Authors' Addresses +

+
+
Nidhi Jaju (editor)
+
Google
+
+Shibuya Stream, 3 Chome-21-3 Shibuya, Shibuya City, Tokyo +
+
150-0002
+
Japan
+ +
+
+
W. Felix P. Handte (editor)
+
Meta Platforms, Inc.
+
380 W 33rd St
+
+New York, NY 10001 +
+
United States of America
+ +
+
+
+ + + diff --git a/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-zstd-window-size.txt b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-zstd-window-size.txt new file mode 100644 index 000000000..8a3b3b1b2 --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-zstd-window-size.txt @@ -0,0 +1,199 @@ + + + + +HTTPBIS N. Jaju, Ed. +Internet-Draft Google +Updates: 8878 (if approved) F. Handte, Ed. +Intended status: Informational Meta Platforms, Inc. +Expires: 11 July 2025 7 January 2025 + + + Window Sizing for Zstandard Content Encoding + draft-ietf-httpbis-zstd-window-size-latest + +Abstract + + Deployments of Zstandard, or "zstd", can use different window sizes + to limit memory usage during compression and decompression. Some + browsers and user agents limit window sizes to mitigate memory usage + concerns, causing interoperability issues. This document updates the + window size limit in RFC8878 from a recommendation to a requirement + in HTTP contexts. + +About This Document + + This note is to be removed before publishing as an RFC. + + The latest revision of this draft can be found at https://httpwg.org/ + http-extensions/draft-ietf-httpbis-zstd-window-size.html. Status + information for this document may be found at + https://datatracker.ietf.org/doc/draft-ietf-httpbis-zstd-window- + size/. + + Discussion of this document takes place on the HTTP Working Group + mailing list (mailto:ietf-http-wg@w3.org), which is archived at + https://lists.w3.org/Archives/Public/ietf-http-wg/. + + Source for this draft and an issue tracker can be found at + https://github.com/httpwg/http-extensions/labels/zstd-window-size. + +Status of This Memo + + This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF). Note that other groups may also distribute + working documents as Internet-Drafts. 
The list of current Internet- + Drafts is at https://datatracker.ietf.org/drafts/current/. + + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." + + This Internet-Draft will expire on 11 July 2025. + +Copyright Notice + + Copyright (c) 2025 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents (https://trustee.ietf.org/ + license-info) in effect on the date of publication of this document. + Please review these documents carefully, as they describe your rights + and restrictions with respect to this document. Code Components + extracted from this document must include Revised BSD License text as + described in Section 4.e of the Trust Legal Provisions and are + provided without warranty as described in the Revised BSD License. + +Table of Contents + + 1. Introduction + 2. Conventions and Definitions + 3. Window Size + 4. Security Considerations + 5. IANA Considerations + 5.1. Content Encoding + 6. Normative References + Acknowledgments + Authors' Addresses + +1. Introduction + + Zstandard, or "zstd", specified in [RFC8878], is a lossless data + compression mechanism similar to gzip. When used with HTTP, the + "zstd" content coding token signals to the decoder that the content + is Zstandard-compressed. + + An important property of Zstandard-compressed content is its + Window_Size ([RFC8878], Section 3.1.1.1.2), which describes the + maximum distance for back-references and therefore how much of the + content must be kept in memory during decompression. + + The minimum Window_Size is 1 KB. The maximum Window_Size is (1<<41) + + 7*(1<<38) bytes, which is 3.75 TB. 
Larger Window_Size values tend + to improve the compression ratio, but at the cost of increased memory + usage. + + To protect against unreasonable memory usage, some browsers and user + agents limit the maximum Window_Size they will handle. This causes + failures to decode responses when the content is compressed with a + larger Window_Size than the recipient allows, leading to decreased + interoperability. + + [RFC8878], Section 3.1.1.1.2 recommends that decoders support a + Window_Size of up to 8 MB, and that encoders not generate frames + using a Window_Size larger than 8 MB. However, it imposes no + requirements. + + This document updates [RFC8878] to enforce Window_Size limits on the + encoder and decoder for the "zstd" HTTP content coding. + +2. Conventions and Definitions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + +3. Window Size + + To ensure interoperability, when using the "zstd" content coding, + decoders MUST support a Window_Size of up to and including 8 MB, and + encoders MUST NOT generate frames requiring a Window_Size larger than + 8 MB (see Section 5.1). + +4. Security Considerations + + This document introduces no new security considerations beyond those + discussed in [RFC8878]. + + Note that decoders still need to take into account that they can + receive oversized frames that do not follow the window size limit + specified in this document and fail decoding when such invalid frames + are received. + +5. IANA Considerations + +5.1. 
Content Encoding + + This document updates the entry added in [RFC8878] to the "HTTP + Content Coding Registry" (https://www.iana.org/assignments/http- + parameters/http-parameters.xhtml#content-coding) within the + "Hypertext Transfer Protocol (HTTP) Parameters" + (https://www.iana.org/assignments/http-parameters/http- + parameters.xhtml) registry: + + Name: zstd + + Description: A stream of bytes compressed using the Zstandard + protocol with a Window_Size of not more than 8 MB. + + Reference: This document and [RFC8878] + +6. Normative References + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + . + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, . + + [RFC8878] Collet, Y. and M. Kucherawy, Ed., "Zstandard Compression + and the 'application/zstd' Media Type", RFC 8878, + DOI 10.17487/RFC8878, February 2021, + . + +Acknowledgments + + Zstandard was developed by Yann Collet. + + The authors would like to thank Yann Collet, Klaus Post, Adam Rice, + and members of the Web Performance Working Group in the W3C for + collaborating on the window size issue and helping to formulate a + solution. Also, thank you to Nick Terrell for providing feedback + that went into RFC 8478 and RFC 8878. + +Authors' Addresses + + Nidhi Jaju (editor) + Google + Shibuya Stream, 3 Chome-21-3 Shibuya, Shibuya City, Tokyo + 150-0002 + Japan + Email: nidhijaju@google.com + + + W. Felix P. Handte (editor) + Meta Platforms, Inc. 
+ 380 W 33rd St + New York, NY 10001 + United States of America + Email: felixh@meta.com diff --git a/draft-ietf-httpbis-rfc6265bis-19/index.md b/draft-ietf-httpbis-rfc6265bis-19/index.md new file mode 100644 index 000000000..6c67e020b --- /dev/null +++ b/draft-ietf-httpbis-rfc6265bis-19/index.md @@ -0,0 +1,18 @@ +# Editor's drafts for draft-ietf-httpbis-rfc6265bis-19 branch of [httpwg/http-extensions](https://github.com/httpwg/http-extensions/tree/draft-ietf-httpbis-rfc6265bis-19) + +| Draft | | | | +| ----- | --- | --- | --- | +| [The HTTP QUERY Method](./draft-ietf-httpbis-safe-method-w-body.html "The HTTP QUERY Method (HTML)") | [plain text](./draft-ietf-httpbis-safe-method-w-body.txt "The HTTP QUERY Method (Text)") | same as main | +| [The Concealed HTTP Authentication Scheme](./draft-ietf-httpbis-unprompted-auth.html "The Concealed HTTP Authentication Scheme (HTML)") | [plain text](./draft-ietf-httpbis-unprompted-auth.txt "The Concealed HTTP Authentication Scheme (Text)") | same as main | +| [Structured Field Values for HTTP](./draft-ietf-httpbis-sfbis.html "Structured Field Values for HTTP (HTML)") | [plain text](./draft-ietf-httpbis-sfbis.txt "Structured Field Values for HTTP (Text)") | same as main | +| [No-Vary-Search](./draft-ietf-httpbis-no-vary-search.html "No-Vary-Search (HTML)") | [plain text](./draft-ietf-httpbis-no-vary-search.txt "No-Vary-Search (Text)") | same as main | +| [HTTP Cache Groups](./draft-ietf-httpbis-cache-groups.html "HTTP Cache Groups (HTML)") | [plain text](./draft-ietf-httpbis-cache-groups.txt "HTTP Cache Groups (Text)") | same as main | +| [HTTP Server Secondary Cert Auth](./draft-ietf-httpbis-secondary-server-certs.html "Secondary Certificate Authentication of HTTP Servers (HTML)") | [plain text](./draft-ietf-httpbis-secondary-server-certs.txt "Secondary Certificate Authentication of HTTP Servers (Text)") | same as main | +| [Templated CONNECT-TCP](./draft-ietf-httpbis-connect-tcp.html "Template-Driven HTTP CONNECT 
Proxying for TCP (HTML)") | [plain text](./draft-ietf-httpbis-connect-tcp.txt "Template-Driven HTTP CONNECT Proxying for TCP (Text)") | same as main | +| [Zstd Window Size](./draft-ietf-httpbis-zstd-window-size.html "Window Sizing for Zstandard Content Encoding (HTML)") | [plain text](./draft-ietf-httpbis-zstd-window-size.txt "Window Sizing for Zstandard Content Encoding (Text)") | same as main | +| [Compression Dictionary Transport](./draft-ietf-httpbis-compression-dictionary.html "Compression Dictionary Transport (HTML)") | [plain text](./draft-ietf-httpbis-compression-dictionary.txt "Compression Dictionary Transport (Text)") | same as main | +| [Optimistic HTTP Upgrade Security](./draft-ietf-httpbis-optimistic-upgrade.html "Security Considerations for Optimistic Protocol Transitions in HTTP/1.1 (HTML)") | [plain text](./draft-ietf-httpbis-optimistic-upgrade.txt "Security Considerations for Optimistic Protocol Transitions in HTTP/1.1 (Text)") | same as main | +| [Retrofit Structured Fields](./draft-ietf-httpbis-retrofit.html "Retrofit Structured Fields for HTTP (HTML)") | [plain text](./draft-ietf-httpbis-retrofit.txt "Retrofit Structured Fields for HTTP (Text)") | same as main | +| [Cookies: HTTP State Management Mechanism](./draft-ietf-httpbis-rfc6265bis.html "Cookies: HTTP State Management Mechanism (HTML)") | [plain text](./draft-ietf-httpbis-rfc6265bis.txt "Cookies: HTTP State Management Mechanism (Text)") | same as main | +| [Resumable Uploads](./draft-ietf-httpbis-resumable-upload.html "Resumable Uploads for HTTP (HTML)") | [plain text](./draft-ietf-httpbis-resumable-upload.txt "Resumable Uploads for HTTP (Text)") | same as main | + diff --git a/index.md b/index.md index d38fbb3be..83500e712 100644 --- a/index.md +++ b/index.md @@ -18,3 +18,21 @@ View [saved issues](issues.html), or the latest GitHub [issues](https://github.c | [Cookies: HTTP State Management Mechanism](./draft-ietf-httpbis-rfc6265bis.html "Cookies: HTTP State Management Mechanism (HTML)") 
| [plain text](./draft-ietf-httpbis-rfc6265bis.txt "Cookies: HTTP State Management Mechanism (Text)") | [datatracker](https://datatracker.ietf.org/doc/draft-ietf-httpbis-rfc6265bis "Datatracker for draft-ietf-httpbis-rfc6265bis") | [diff with last submission](https://author-tools.ietf.org/api/iddiff?doc_1=draft-ietf-httpbis-rfc6265bis&url_2=https://httpwg.github.io/http-extensions/draft-ietf-httpbis-rfc6265bis.txt) | [issues](https://github.com/httpwg/http-extensions/labels/6265bis) | | [Resumable Uploads](./draft-ietf-httpbis-resumable-upload.html "Resumable Uploads for HTTP (HTML)") | [plain text](./draft-ietf-httpbis-resumable-upload.txt "Resumable Uploads for HTTP (Text)") | [datatracker](https://datatracker.ietf.org/doc/draft-ietf-httpbis-resumable-upload "Datatracker for draft-ietf-httpbis-resumable-upload") | [diff with last submission](https://author-tools.ietf.org/api/iddiff?doc_1=draft-ietf-httpbis-resumable-upload&url_2=https://httpwg.github.io/http-extensions/draft-ietf-httpbis-resumable-upload.txt) | [issues](https://github.com/httpwg/http-extensions/labels/resumable-upload) | +## Preview for branch [draft-ietf-httpbis-rfc6265bis-19](draft-ietf-httpbis-rfc6265bis-19) + +| Draft | | | | +| ----- | --- | --- | --- | +| [The HTTP QUERY Method](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-safe-method-w-body.html "The HTTP QUERY Method (HTML)") | [plain text](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-safe-method-w-body.txt "The HTTP QUERY Method (Text)") | same as main | +| [The Concealed HTTP Authentication Scheme](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-unprompted-auth.html "The Concealed HTTP Authentication Scheme (HTML)") | [plain text](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-unprompted-auth.txt "The Concealed HTTP Authentication Scheme (Text)") | same as main | +| [Structured Field Values for HTTP](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-sfbis.html "Structured Field Values for HTTP 
(HTML)") | [plain text](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-sfbis.txt "Structured Field Values for HTTP (Text)") | [diff with main](https://author-tools.ietf.org/api/iddiff?url_1=https://httpwg.github.io/http-extensions/draft-ietf-httpbis-sfbis.txt&url_2=https://httpwg.github.io/http-extensions/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-sfbis.txt) | +| [No-Vary-Search](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-no-vary-search.html "No-Vary-Search (HTML)") | [plain text](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-no-vary-search.txt "No-Vary-Search (Text)") | [diff with main](https://author-tools.ietf.org/api/iddiff?url_1=https://httpwg.github.io/http-extensions/draft-ietf-httpbis-no-vary-search.txt&url_2=https://httpwg.github.io/http-extensions/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-no-vary-search.txt) | +| [HTTP Cache Groups](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-cache-groups.html "HTTP Cache Groups (HTML)") | [plain text](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-cache-groups.txt "HTTP Cache Groups (Text)") | same as main | +| [HTTP Server Secondary Cert Auth](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-secondary-server-certs.html "Secondary Certificate Authentication of HTTP Servers (HTML)") | [plain text](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-secondary-server-certs.txt "Secondary Certificate Authentication of HTTP Servers (Text)") | same as main | +| [Templated CONNECT-TCP](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-connect-tcp.html "Template-Driven HTTP CONNECT Proxying for TCP (HTML)") | [plain text](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-connect-tcp.txt "Template-Driven HTTP CONNECT Proxying for TCP (Text)") | same as main | +| [Zstd Window Size](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-zstd-window-size.html "Window Sizing for Zstandard Content Encoding (HTML)") | [plain 
text](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-zstd-window-size.txt "Window Sizing for Zstandard Content Encoding (Text)") | [diff with main](https://author-tools.ietf.org/api/iddiff?url_1=https://httpwg.github.io/http-extensions/draft-ietf-httpbis-zstd-window-size.txt&url_2=https://httpwg.github.io/http-extensions/draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-zstd-window-size.txt) | +| [Compression Dictionary Transport](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-compression-dictionary.html "Compression Dictionary Transport (HTML)") | [plain text](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-compression-dictionary.txt "Compression Dictionary Transport (Text)") | same as main | +| [Optimistic HTTP Upgrade Security](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-optimistic-upgrade.html "Security Considerations for Optimistic Protocol Transitions in HTTP/1.1 (HTML)") | [plain text](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-optimistic-upgrade.txt "Security Considerations for Optimistic Protocol Transitions in HTTP/1.1 (Text)") | same as main | +| [Retrofit Structured Fields](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-retrofit.html "Retrofit Structured Fields for HTTP (HTML)") | [plain text](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-retrofit.txt "Retrofit Structured Fields for HTTP (Text)") | same as main | +| [Cookies: HTTP State Management Mechanism](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-rfc6265bis.html "Cookies: HTTP State Management Mechanism (HTML)") | [plain text](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-rfc6265bis.txt "Cookies: HTTP State Management Mechanism (Text)") | same as main | +| [Resumable Uploads](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-resumable-upload.html "Resumable Uploads for HTTP (HTML)") | [plain text](draft-ietf-httpbis-rfc6265bis-19/draft-ietf-httpbis-resumable-upload.txt "Resumable Uploads for HTTP (Text)") | same as main | +