Fine tune wording of the requirement for including Content-Type in the response #565

Open
elf-pavlik opened this issue Aug 26, 2023 · 8 comments

Comments

@elf-pavlik
Member

elf-pavlik commented Aug 26, 2023

Extracting from: #561 (comment)

In short, we want to make sure that the wording makes it clear that the requirement also applies when responding to HEAD requests.

@woutermont proposed

Server MUST generate a Content-Type header field in a message that contains a representation, even if that representation only consists of representation metadata, as in a HEAD request.

I think this is clear and concise. Maybe adjust the last clause to ", as in a response to a HEAD request".

Since the wording in the spec includes nuances like

In a response to a HEAD request, the representation header fields describe the representation data that would have been enclosed in the content if the same request had been a GET

we should triple-check if we use all the terms strictly as intended by the RFC.

@csarven
Member

csarven commented Aug 26, 2023

https://github.com/solid/specification/blob/main/meetings/2023-05-17.md#add-server-content-type-payload

RFC says HEAD is same as GET without the payload. When we add [this] new requirement, it ensures we get a content type in the HEAD.

@elf-pavlik
Member Author

I think the current wording in #561 can be slightly adjusted

Server MUST generate a Content-Type header field in a message that contains content.

Looking at https://www.rfc-editor.org/rfc/rfc9110#section-8.3

A sender that generates a message containing content SHOULD generate a Content-Type header field in that message unless the intended media type of the enclosed representation is unknown to the sender.

If we want to raise the requirement level to MUST we could just adjust the original text to:

Server that generates a message containing content MUST generate a Content-Type header field in that message unless the intended media type of the enclosed representation is unknown to the Server.


If we don't want to have HEAD called out in the requirement itself, it could be mentioned in a non-normative note under the requirement. In the minutes linked above there were at least 2 other people requesting explicit mention of HEAD.

" NOTE: Solid Protocol raises the level of this requirement from original SHOULD [RFC9110] to MUST
This requirement also applies when responding to HEAD requests, since the response still contains content even if the content is not sent [1] [2] "

@csarven
Member

csarven commented Aug 27, 2023

Good suggestions. I think we should do both.

  1. Reuse the RFC statement and change requirement level to MUST. The "unless" in the requirement helps to retain this from the RFC:

    However, a server MAY omit header fields for which a value is determined only while generating the content. [..] Such a response to GET might contain Content-Length and Vary fields, for example, that are not generated within a HEAD response.

  2. A note is useful to raise awareness of the requirement level change, and it clarifies potential ambiguity with HEAD. It can be prefixed with some whys, one of which can be along the lines of:

    "Servers are expected to be able to determine the media type of the representation they are providing in order to avoid cases where a resource may not have been created by a Client."

@elf-pavlik
Member Author

"Servers are expected to be able to determine the media type of the representation they are providing in order to avoid cases where a resource may not have been created by a Client."

Could you please clarify this point? Somehow reading it doesn't make my 🧠 click.

@csarven
Member

csarven commented Aug 27, 2023

I probably should've held off responding at 0230. I'll give it another shot.

What I had in mind is a take on some of the considerations in #119. The Protocol only defines what happens at the observable interface (implementation details are hidden). Even so, requiring Servers to both determine the media type and include the Content-Type in the response (with the exception of the "unless" case in the RFC) forces certain kinds of implementations, such as the classic filesystem, to be more aware of what they are serving and of whether to serve it in the first place. That helps avoid potential security or privacy implications and content handling errors, even when such files physically occupy space alongside resources that are known to the storage (those that went through the HTTP interface by a Client).

While most of that is implementation detail that need not be discussed in the Protocol, the requirement preemptively prevents storages from making available content that they may not be equipped to manage properly. The alternative, of course, is to allow that possibility, which permits more variability at the cost of more potential issues.

@woutermont
Contributor

" NOTE: Solid Protocol raises the level of this requirement from original SHOULD [RFC9110] to MUST This requirement also applies when responding to HEAD requests, since the response still contains content even if the content is not sent [1] [2] "

This is incorrect, or at least very misleading. The first sentence of the section on HEAD requests specifically reads "The HEAD method is identical to GET except that the server MUST NOT send content in the response."

If we do want to change the wording of the requirement more drastically to stick closer to RFC9110, I would suggest we use the term 'associated representation', as used in the section on the Content-Type header (highlight).

A Server that generates a message with an associated representation (either the representation enclosed in the message content or the selected representation, as determined by the message semantics) MUST generate a Content-Type header field in that message unless the intended media type of the associated representation is unknown to the Server.

(Note that the current wording actually suffices from a conformance perspective: messages with content must include the header, and a HEAD request is identical to a GET request without the content. So we might just clarify that in a non-normative note instead of rephrasing the entire thing.)
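
To make the 'associated representation' wording concrete, here is a minimal sketch of a conformance check in Python. It is only an illustration of the proposed text, not part of any Solid test suite: the function name and both boolean parameters are hypothetical, and deciding whether a message has an associated representation, or whether the media type is known, is left to the caller.

```python
from typing import Mapping


def violates_content_type_requirement(
    headers: Mapping[str, str],
    has_associated_representation: bool,  # hypothetical flag: enclosed or selected representation exists
    media_type_known: bool,               # hypothetical flag: the "unless" clause of the proposed text
) -> bool:
    """Return True if the message breaks the proposed MUST."""
    # No associated representation, or unknown media type: the requirement does not apply.
    if not has_associated_representation or not media_type_known:
        return False
    # Header field names are case-insensitive, so compare in lowercase.
    return not any(name.lower() == "content-type" for name in headers)
```

Under this reading, a response to HEAD is checked the same way as a response to GET, because both have an associated representation even though only the GET response encloses it.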

@elf-pavlik
Member Author

@woutermont which part do you consider incorrect? Based on the two links at the end of the note, it seems to me that:

In response to HEAD request, the server generates a response containing content but does not send that content.

I do find it overly nuanced, but that's what those two quotes from the RFC seem to say

A sender that generates a message containing content SHOULD generate a Content-Type header field in that message unless the intended media type of the enclosed representation is unknown to the sender

The HEAD method is identical to GET except that the server MUST NOT send content in the response. HEAD is used to obtain metadata about the selected representation without transferring its representation data, often for the sake of testing hypertext links or finding recent modifications.

Now, looking at

The "Content-Type" header field indicates the media type of the associated representation: either the representation enclosed in the message content or the selected representation, as determined by the message semantics.

I like how you incorporated part of it in your suggestion. I think this is the cleanest wording we managed to get so far out of the original terminology in RFC 9110!

@woutermont
Contributor

@woutermont which part do you consider incorrect? Based on the two links at the end of the note, it seems to me that: In response to HEAD request, the server generates a response containing content but does not send that content.

That is exactly the part I consider incorrect. HEAD requests offer an important performance benefit, which is achieved precisely because no content is being generated by the server. (It is for this reason that the spec states: a server MAY omit header fields for which a value is determined only while generating the content.) The server knows what it would generate and how it would do that, but does not necessarily perform that generation. A response to a HEAD request thus has a selected representation (the one that would be sent as content in response to GET), but does not contain (= enclose) that representation as content.
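
As a rough sketch of that distinction in Python: the `Representation` class and `respond` function below are hypothetical names made up for illustration, and the sketch assumes the media type of the selected representation is known without generating its bytes. Any method other than HEAD is treated like GET here for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Representation:
    media_type: str                # assumed to be known without generating the bytes
    generate: Callable[[], bytes]  # potentially expensive; only needed when content is sent


def respond(method: str, selected: Representation) -> Tuple[Dict[str, str], bytes]:
    # Content-Type describes the selected representation, so it is set for both methods.
    headers = {"Content-Type": selected.media_type}
    if method == "HEAD":
        # No content is generated or enclosed. Fields whose value is only
        # determined while generating content (e.g. Content-Length) are omitted.
        return headers, b""
    # Anything else is handled like GET in this sketch.
    body = selected.generate()
    headers["Content-Length"] = str(len(body))
    return headers, body
```

The HEAD branch never calls `generate`, which is where the performance benefit comes from, yet the response still carries Content-Type for the selected representation.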
