Clarify the heuristics to determine the interaction model if none is specified #128
Since we don't have the actual clarification here, I'll return this to TODO.
I think there are two parts to this: one is the specific heuristics, which is simple; the other is consistency, which is probably harder. The latter was more the intention of #121, but it really is connected. My proposal for the heuristics:
I'm thinking it is good to define it in terms of media type in the second point, so that the interaction model can always be inferred from headers (since #70); the alternative is to inspect the body to ensure that. We could probably enumerate the media types that are LDP-RS at this point. I'm still of the opinion that HTML+RDFa falls between LDP-NR and LDP-RS (#69), but I'm willing to go with that if it makes stuff simpler.
There is no between. "Fully represented in RDF" (LDP-RS) and "not fully represented in RDF" (LDP-NR) add up to 100% of all possibilities (LDPR). HTML+RDFa docs are not fully represented in RDF, so they are LDP-NR. Otherwise, I think I agree with your last.
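Not from the thread itself, but a minimal sketch (in TypeScript, an assumed implementation language) of what header-only inference could look like, treating HTML+RDFa as LDP-NR per the comment above; the media-type list and function name are hypothetical illustrations, not an agreed enumeration:

```typescript
// Hypothetical sketch: infer the interaction model from request headers only.
// The media-type list is illustrative, not an agreed enumeration.
const RDF_MEDIA_TYPES = new Set([
  'text/turtle',
  'application/ld+json',
  'application/n-triples',
  'application/rdf+xml',
]);

type InteractionModel = 'ldp:RDFSource' | 'ldp:NonRDFSource' | 'ldp:BasicContainer';

function inferInteractionModel(
  linkHeader: string | undefined,
  contentType: string | undefined,
): InteractionModel {
  // An explicit client-supplied Link rel="type" wins.
  if (linkHeader?.includes('http://www.w3.org/ns/ldp#BasicContainer')) {
    return 'ldp:BasicContainer';
  }
  if (linkHeader?.includes('http://www.w3.org/ns/ldp#NonRDFSource')) {
    return 'ldp:NonRDFSource';
  }
  // Otherwise fall back to the media type: documents fully representable
  // in RDF are LDP-RS; everything else (including HTML+RDFa) is LDP-NR.
  const mediaType = contentType?.split(';')[0].trim().toLowerCase() ?? '';
  return RDF_MEDIA_TYPES.has(mediaType) ? 'ldp:RDFSource' : 'ldp:NonRDFSource';
}
```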
Would it make sense to require the response headers to include an indication of how the server understood the request? That is, if the server understands the new resource to be an LDP-BC, then the server would indicate that in the response headers.
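As a rough illustration (not something the thread specifies), a server could advertise that understanding the way LDP servers already do, via Link rel="type" response headers; the helper and the minimal response interface below are hypothetical:

```typescript
// Hypothetical sketch: echo back how the server understood the request,
// using the LDP convention of Link rel="type" response headers.
interface MinimalResponse {
  setHeader(name: string, value: string[]): void;
}

function advertiseAsBasicContainer(res: MinimalResponse): void {
  res.setHeader('Link', [
    '<http://www.w3.org/ns/ldp#BasicContainer>; rel="type"',
    '<http://www.w3.org/ns/ldp#Resource>; rel="type"',
  ]);
}
```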
But that's not what it says, it says:
I think I can go with that, though, since I can see that it may just be an inaccuracy, which happens.
Categorically pinning text/html, application/xhtml+xml, image/svg+xml, etc. to LDP-NR conflicts with the possibility of having an HTML(+RDFa) representation available from […]
That's a practical question, I think. As a matter of definition, it is what I was referring to when I said that LDP-C does not imply LDP-RS. That may or may not take care of it, depending on the practical implications of the assumptions made by pure LDP implementations. We could also say that HTML+RDFa etc. are just LDPR...
Or... We could say that "these media types must be inspected to determine if the document is fully represented in RDF and thus LDP-RS", and then leave the rest to be LDP-NR. But it won't solve the conflict of LDP-C not necessarily being LDP-RS.
If / is supposed to be handled as an LDP-BC, we shouldn't venture from LDP-RS. I don't particularly like the idea of Solid having a different take on LDP, e.g. / being LDP-BC but not necessarily LDP-RS, or even suggesting that a resource has a (default?) interaction model that will differ depending on the representation. That would be internally inconsistent and not a simple design. Just because LDP works with vague notions attempting to cover a spectrum of things with e.g. "fully", "partial", or "not have useful", doesn't mean that Solid has to go down that path. If it is really about that, I feel much more comfortable defaulting to RDF 1.1's notion of what RDF Source entails and not getting sucked into pedantic over-engineering. / is intended to be an LDP-BC and LDP-RS and so must be handled as an RDF graph. Everything else is secondary or a non-issue... out of scope. To answer @acoburn 's question:
Yes, I agree that makes sense (as I've already proposed elsewhere). We note that LDP doesn't specify how requests without an interaction model should be handled. So, specifying it for Solid is not about compatibility with LDP but just us striving for normalisation and uniform handling.
We'd have to take that up with @timbl; basically, it is his design that LDP-C is an LDP-RS. My opinion is, and has always been, that this notion (LDP-C isa LDP-RS) is a Really Bad Idea[tm], so I don't feel the tension personally, but indeed, it is important for Solid to have a clear relationship to LDP.
We need to define heuristics with the use of a Slug header as well.
Good catch. Forgot that one, since it wasn't on the project board... I'm starting to think that we may need to elevate that from implementation-detail status. I envision that the presence of a Slug header […]
Actually, rereading, I found the description in #96 to be quite different from what needs to be resolved here. Also, I think it should be orthogonal to #107, since I think the use of a Slug header […]. So, I guess I'll go with my proposal right here. :-) It seems there are six possibilities, combinations between whether the […]. Here's my suggestion:
[Removed updated suggestion]
If Slug is provided, the slash semantics are used for the intended URI. If a Link header accompanies the Slug, the Link has higher specificity than the Slug for the algorithm generating the URI.
Right! You can put it that way, but it has to say something more concrete about what to do with the slash if it conflicts with the expectation set by the Link header. Can we formulate a consensus around this? I think we first need to construct the Resource URL. With some methods (i.e. […]), […]. Then, the Resource URL is checked for consistency, e.g. a Resource URL not ending with a slash while the Link header indicates a container […]. Then, the interaction model is set.
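A rough sketch of that ordering, with hypothetical names; how a conflict is resolved (reject vs. adjust) is exactly the open question, so the 409 below is only one possible choice:

```typescript
import { randomUUID } from 'node:crypto';

// Hypothetical sketch of the ordering described above:
// 1. construct the target (effective request) URI,
// 2. check it for consistency against the client's Link header,
// 3. only then fix the interaction model.
interface CreateRequest {
  method: 'POST' | 'PUT';
  targetUri: string;           // request URI; for POST, the container (assumed to end with '/')
  slug?: string;               // client suggestion; the server may ignore or alter it
  linkSaysContainer?: boolean; // Link: <http://www.w3.org/ns/ldp#BasicContainer>; rel="type"
}

function resolveTargetUri(req: CreateRequest): string {
  if (req.method !== 'POST') {
    return req.targetUri; // PUT names the resource directly
  }
  // With POST the server mints the name; a (sanitised) Slug may feed into it,
  // and slash semantics apply to the resulting path segment(s).
  const segment = req.slug ?? randomUUID();
  return new URL(segment, req.targetUri).toString();
}

function checkConsistency(uri: string, req: CreateRequest): void {
  const looksLikeContainer = uri.endsWith('/');
  if (req.linkSaysContainer && !looksLikeContainer) {
    // One possible resolution; appending a '/' instead would be the other.
    throw new Error('409 Conflict: Link asks for a container but the URI lacks a trailing slash');
  }
}
```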
I agree with the steps involving the construction of the effective request URI and setting the interaction model.
Just to note, NSS says: […] in the case where the […]. I don't think […].
I've encountered that error on NSS as well when I tested Slug. Noted it as a bug because, as far as I can tell from https://tools.ietf.org/html/rfc5023#section-9.7.1, […]
But then slugs can be used to create folders recursively, given that slashes do have a meaning in Solid? And this would create complex locking problems? Not to mention the attack vector for security problems. I'm against.
Recursive container creation is an atomic operation and not unique to Slug use. Applying the slash semantics to the Slug value keeps things consistent (#96 (comment)). Can you elaborate on how the attack vector for security problems that you are thinking of is unique to Slug use?
Understood; could you add a reference for that (might provide context for the next question)?
But then I think we need a MUST there (will comment).
Depending on how recursive resource creation is defined, the problems would be the same. I.e., we should in both cases warn against things such as:
However, the real danger is that implementers overlook the fact that it is recursive container creation, as they might expect the slug to be just a filename. As such, I would ask the question differently: what is the use case for recursive container creation via slug?
I'm thinking that we do need to restrict the acceptable characters or even what the server may use to construct a resource identifier (with relevance to #142), but that the server should otherwise have the liberty to construct identifiers based on the Slug. That said, I say we defer recursive container creation with Slug […]
#68 (comment), #118 (comment), #126 (comment), #137 (comment)... there is more out there, I'm sure. Resolving security follow-up #129 can help further. Requirements like #107 (comment) can help towards halting if a particular segment (container) can't be created.
IIRC, the rationale for SHOULD was what Slug entails: it would still be within the confines of a client suggestion, with the server choosing to ignore it or use it as it prefers. I also understand the rationale for MUST, and that would be equally valid, but it may be overstepping Slug.
These would be fine as non-normative information in general, but I don't see them being specific to Slug.
Servers that use the Slug (with a particular algorithm, e.g. slash semantics) will have to generate an effective request URI and perform the request based on that, as usual. Again, I'm not sure there is anything unique here for Slug, but I'd be content to clarify these further as non-normative information, because we are really not defining anything new other than suggesting to apply the slash-semantics algorithm for the sake of consistent path-segment handling.
The same use cases that would require PUT or PATCH to create recursive containers. Slug just happens to be a hint or a suggestion to the server on the naming.
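To make that concrete, here is a sketch (with a hypothetical storage interface) of how missing intermediate containers could be created for any deep target, whether the path came from the request URI of a PUT/PATCH or from a Slug with slash semantics; a real implementation would additionally need to make this atomic:

```typescript
// Hypothetical sketch: create missing intermediate containers for a target,
// regardless of whether the path came from the request URI or a Slug.
// The whole operation would need to be all-or-nothing on the server.
interface Storage {
  containerExists(url: string): Promise<boolean>;
  createContainer(url: string): Promise<void>;
}

async function ensureAncestors(targetUrl: string, storage: Storage): Promise<void> {
  const url = new URL(targetUrl);
  const segments = url.pathname.split('/').filter(Boolean);
  let path = '/';
  // Walk every ancestor segment, creating containers that do not exist yet.
  for (const segment of segments.slice(0, -1)) {
    path += `${segment}/`;
    const containerUrl = `${url.origin}${path}`;
    if (!(await storage.containerExists(containerUrl))) {
      await storage.createContainer(containerUrl);
    }
  }
}
```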
And you might even want to limit the length, as per whatwg/fetch#862 (comment), to avoid people asking for 10 MB slugs. (Also a note for PUT.)
Security seems key here. If Slug has it as a SHOULD, but you create it, and the slash interpretation is a SHOULD or MUST on […]
The danger is huge. Why allow […]
Then use those, I'd say. This is the kind of security risk where it's hard to imagine all implementations getting it right. I'd strongly ask the security experts to weigh in.
By the mere definition of Slug, the server can ignore it or change it to what it deems to be useful or safe. ".acl" has no special meaning or expectation as far as the spec is concerned. If an implementation considers that to be a pattern it best avoids, it will strip it off or use something else. As for "..", these fall under how the effective URI is handled. No different than the request-target being "..". The server still has to construct the intended target. In some cases it will be justifiably useful (relative paths) and in other cases it may be a malicious or nonsensical attempt that the server can ignore. The server choosing what to do also goes for a client asking to create many nested containers and not getting it. You asked for use cases that would entail recursive container creation, which happens to route through Slug (because of slash semantics, for consistency).
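A sketch of that kind of server discretion, assuming the server opts to normalise rather than reject; the specific choices below (dropping dot-segments, percent-encoding, a length cap echoing the earlier 10 MB remark) are illustrative, not spec requirements:

```typescript
// Hypothetical sketch: a server exercising its freedom to alter or ignore a Slug.
// Every choice here is an implementation decision, not something the spec mandates.
function sanitizeSlug(slug: string, maxLength = 256): string | undefined {
  const segments = slug
    .split('/')
    // Drop empty and dot-segments outright so "../" cannot climb out of the container.
    .filter((s) => s !== '' && s !== '.' && s !== '..')
    // Percent-encode anything unsafe within each path segment.
    .map((s) => encodeURIComponent(s));
  const candidate = segments.join('/');
  // Ignore the suggestion entirely if nothing usable remains or it is too long.
  if (candidate === '' || candidate.length > maxLength) {
    return undefined; // fall back to a server-generated name
  }
  return candidate;
}
```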
That, or we can avoid the whole attack vector by not allowing these things in Slug, such that there is no expectation of them working at all. The risk is that Slug could follow different code paths from regular request URI handling, and be way less tested because it is rarer and has more edge cases due to being a relative URI. Anyway, not my call; just bringing in arguments for why a security expert should really look at this.
All features in all specs will be reviewed by security experts. There are indeed many attack vectors, but the ones that you've raised can either be ignored or have no more specialty than relative request-targets on their way to the effective request URI. Nevertheless, I do agree that potential risks due to different code paths would be useful to note in the spec. Thanks for raising that. Edit: "attack vectors", not "vendors" =)
We could surely define this in terms of request URI handling and detail how to construct an effective request URI that would need to be validated. We could put some common attacks in the test suite, but let's just consider it later :-) For now, let's just redefine […] Edit: Oh, I didn't read what it said about Unicode; that last part would exclude that. So, what I wrote was correct, just much more restrictive than I intended :-)
Another note on NSS, as a response to […]: in the case of […], NSS5 responds with […]
rather than just […].
It does not seem correct.
That would be a valid response that would pass the tests. I think the primary concern was about classifying the request as a container or a non-container resource. I don't think NSS's response is good, because it should probably stick to /foo and do /foo$.ttl on disk. It is not entirely clear what requesting JSON-LD on /foo.ttl is supposed to do, not to mention the slight confusion. Anyway, this bit is an implementation detail.
Noting here that the rough consensus in this issue (#128 (comment)) was taken up in PR #160, reviewed, and approved. However, the issue was revisited in #160 (comment), highlighting the functional requirements, variability, and complexity of implementing the heuristics. #160 (comment) proposed an alternative approach that was deemed to simplify implementation and remove variability. It was reviewed and approved. c1aac42 includes the criteria based on the proposal.