-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PATCH vs PUT #14
Comments
Also mentioned here http://www.tus.io/protocols/resumable-upload.html#comment-866226341 |
Yep, I was thinking about Then again, while The difference I see between those two is that (in the version 0.1 of the procotol) |
PUT already has a specification for resuming uploads, and if that was all we wanted to do, it'd be fine. PATCH seems appropriate because we might want to replace corrupted segments in the middle of already-uploaded data. Although, it'd also make sense to have a three-step protocol - POST to create the file, PUT to upload/resume, and PATCH to fix corrupted uploads, with the last being wholly optional. |
There's no real specification I'm aware of for ranged-PUT requests. As far as I can tell, it seems to be actively discouraged. PUT is primarily intended as a 'full replacement', whereas with PATCH the intention is a partial replacement. The 'append' use-case is specifically mentioned in the RFC: http://tools.ietf.org/html/rfc5789 Bonus: it also provides a discovery mechanism. |
I spend a ton of time just reading rfc's and implementing them. Not saying you should use this for a template, but I felt that considering all the rules and regulations, this was the most appropriate for myself: http://code.google.com/p/sabredav/wiki/PartialUpdate It covers all the use-cases afaik. |
@evert: I think you are referring to the example of adding lines to a textual log. To me this is a different use case, because a log consisting of a few or even no lines can still be considered as a complete document, while a partially uploaded image is just garbage or at least a corrupted document. |
Just to be clear, this isn't a ranged-PUT request in the sense of a From section 9.6 PUT:
|
From HTTPbis:
|
Ok, I think in order to make progress on this, we need to clarify the goals. I think we should set the following priorities, in order:
Ideally we can find something that does not violate existing or upcoming specifications and goes well with the spirit of HTTP/REST. But I'd rather sacrifice idealism for a protocol that works today and solves the end user problem. I'll work on a few ideas now and post them in a bit, but please share your thoughts on these priorities. |
PATCH imho is the right thing to do here (or POST if you dislike "extension" methods). Note that in either case you'll need to define a payload format that contains the "patch" instructions (in this case likely minimally: the offset plus an octet (byte) sequence). |
@reschke thank you so much for your feedback!
Makes sense. How would a client find ask the server about the received data after a network interruption? What response headers should be used by the server? One problem I see with PATCH is this requirement from http://tools.ietf.org/html/rfc5789#section-2 :
If I interpret this and the following explanations correctly, it means that we'd have to use many requests with small payloads to avoid loosing a lot of data due to a network error. |
a) using a HEAD request? Or do you need more than the current length? b) the requirement is not really different from other methods; the problem here is on the transport level (incomplete HTTP message), not the semantical level... |
It would be nice if the server could communicate:
Ok, does this mean that the spec is unclear about processing partial requests that experienced transport level issues? |
The client should know the expected final length, right?
Why do we need parallel chunk uploading?
Well, incomplete messages are just that: incomplete. What a recipient can do with them depends a lot on context. For instance, can you assume that if you do a GET and get a truncated result, the contents received so far is ok? No, you can't. You might after getting the remainder and checking checksums. The same is true for PATCH: a server could do something with truncated messages when it has some confidence that it's doing the right thing. What the cited text from PATCH wants to prevent is that a PATCH requests two changes, and the server applies only one of them. |
Yes. The use case would be giving the URL of the partial resource another client that wouldn't have this information, but might be interested about the state of the resource during the transfer. Consider it a nice to have.
To speed up uploads, especially for mobile clients with high latency / data loss. I realize your skeptical about this, so I'm willing to perform whatever measurements you'd like to see to convince you (or be convinced that it's not necessary).
I think that depends on the requirements. If the guarantees provided by TCP and the data link layer are strong enough for the application, the partial content should be considered ok I think. We've been discussing this over here: #7 (comment)
That makes sense. |
Hm, no. For instance, what you might see is the start of the data stream, followed by an error message written to the stream (think delayed error page with stacktrace). The problem is that once a server has said "200" and started to write data, there is no reliable way of indicating an error except closing the connection. |
You're right. That's an excellent point, thanks for bringing it up. |
For PATCH, you may want to consider a payload format where you have many "chunks" with checksums in a single payload; so the server could apply all those for which it does know that they are intact. |
Doesn't this violate the aforementioned constraint that all parts of a PATCH request have to be applied?
So when a PATCH request is incomplete, is there even a way to comply to the RFC except for not applying any changes on the server? Wouldn't this require multiple, smaller PATCH requests instead until all of them were successful? |
I'd say that there is some wiggle room if you define the PATCH payload to explicitly address this case. With pipelining that should work as well, but you'd have to find a way to make things still work if you loose one of the requests in between. |
I don't think we should worry too much about following specs. Do so when it's convenient, yes, and to the degree we can interoperate with Anything beyond that is a bonus.
|
@Baughn I think that's a great idea as it will allow fast iteration from an implementation standpoint, but there needs to be a commitment to keep iterating. Otherwise, there will be 8 versions of 0.x spec implementations with potentially no compatibility. It's important to still have these conversations upfront about standards and do the best we can to keep the two in line with each other. The downside of not following them is that tus will not work with existing Web infrastructure. That's a pretty big penalty from my perspective. I'm looking at tus with an eye toward Web APIs. It enables some really awesome scenarios. Not having a single, defined way to do it that Just Works would be a real bummer for implementors of the future. |
@Baughn I fully agree that we need something we can use today. However, if we can manage to fit it into the existing or upcoming specs, that would provide incredible benefits for adoption and long term stability. At this point there is a very good discussion at the httpbis mailing list, and with a little luck we may see something as simple as So for now I'm very hopeful that we can do this in way that truly improves the web, instead of fragmenting it for our benefit. Let's see how the discussion works out. |
Wow, there was great input from all of you. |
Is this interesting/relevant for a protocol? |
@evert That's a great resource. I like their use of the |
Well, it's good to define specific mimetypes for things. This keeps the door open to alternative patch formats and gives you a sane way to figure out which type of One issue with the format used in this project, is that it doesn't at all attempt to prevent problems related to multiple clients writing to a resource. It's a sure way for corrupted files to sneak in. |
You right but if the body doesn't have a specific format, as it's currently the case, using
I agree, too. The current spec doesn't handle this case and our reference implementation (tusd) disallows writes starting at a different offset then the number of bytes written already. |
I disagree that using a generic mime-type is a good idea. Even though it just looks like a stream of bytes, it is a 'format' in some way. Furthermore aside from the format of the actual body, you can also use the mime-type as an indicator of how the server should interprete the request, and it's the only way to advertise support using Using a generic mimetype is definitely bad design in this case. |
Ok, I understand your concerns but an example would help. |
Say if both tus and sabre.io's patch format both used application/octet-stream, how can a client figure out which one is supported? The standard way to do this, is by defining a custom mime-type and broadcasting support in the |
First of all, I like the idea of multiple different formats. Each of them could be implemented as a separate extension.
The client knows which protocol to speak and so it's knows what |
If client and server need out-of-band knowledge, this is simply a private protocol. Not necessarily bad, but also something I wouldn't promote here. |
Yes, the message should be self-descriptive, and application/octet-stream essentially means hand-waving (yes, for binary formats as well) |
Every version should use a distinct mime-type. That's the best way to handle that. |
You should pay attention to what mister @reschke is saying, he's actually one of the authors of HTTP/1.1 ;) |
If so then this is quite a honour. |
@reschke The content is nearly the same but thanks for pointing out. |
Even though I'm not doing this either.. i think the recommended way to do private mimetypes now is by using the So what about for example:
|
@evert I'm not sure if that's the correct document but https://tools.ietf.org/html/rfc6838#section-3 says that the vendor tree should be the best case for use, you're right. |
@Acconut, do you mean like uploading some data, and then applying a diff patch on top of that? That's what I understood, but I must be wrong, because it would be tricky to do. To apply a diff, the state of the resource must be known. When an upload is interrupted, client doesn't know it. The only continuation specified at the moment, is to check how many bytes server got and assume these are the same bytes client has. And then continue the upload. Once the upload is finished, there should be no way to modify the resource, because server can process and remove it. So there's no opportunity to apply diffs.
@reschke, @evert, What if there was only an upload, not a resumable upload? Would we then use When saying So using a combination of content types makes sense to me. Now, does it make any difference on the server? I think not. And I'm inclined to say that this gets more complicated than it needs to be. The way the protocol is specified right now, server doesn't need to look at content types at all. So perhaps this doesn't need to be specified in 1.0? Another point. Do we want clients to be able to say that, after all the pieces are combined, you'll get BTW: Perhaps this discussion deserves a separate issue? |
I agree.
This was just a bad example since I could not think of any other format which should be used inside the body when uploading a file. Maybe @evert can tell us a good one?
The client should not have to send the content type of the final resource in every request since it mustn't change. It may send the type in the file creation although the server can decide what to do with this information.
Yeah, probably. :) |
What I meant to say with different This is not unlike a REST-ish API supporting more than 1 format at any given endpoint. To do content-negotation for Similarly, a server may support your protocol as well as this one http://sabre.io/dav/http-patch/ How can a server and/or client support both? Using the If you think this is too complex, well, I don't have much to add then :) I should say I'm just here as an interested bystander... I don't really have an intention to implement tus, as it doesn't make a lot of sense for me... Mainly just interested in the discussion. |
Let's continue in #43. This issue should be closed. PATCH vs PUT is settled already. |
As pointed out by @Baughn in IRC, PATCH (partial entity update) seems more appropriate than PUT (replacing an entire entity) for resuming uploads. But it might cause issues with mobile http proxies as PATCH is somewhat exotic. Needs investigation.
The text was updated successfully, but these errors were encountered: