Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GET media/config #1189

Merged
merged 18 commits into from
Jul 6, 2018
Merged

GET media/config #1189

merged 18 commits into from
Jul 6, 2018

Conversation

Half-Shot
Copy link
Contributor

@Half-Shot Half-Shot commented May 3, 2018

Rationale

Often I'll upload things that are too large to the homeserver and then find that it only gets rejected once it has uploaded a sizable portion. This is intensely frustrating when I am on a mobile network or a slow connection.

This PR adds the endpoint for the homeserver to specify limits to uploads. Currently, this is only maximum size, but I would eventually expect this to be adapted to content types, per-user limits or something else.

Clients can then decide whether to attempt an upload.

GDoc https://docs.google.com/document/d/1fI4ZqQcyAyBEPMtS1MCZWpN84kEPdm9SDw6SVZsJvYY/edit?usp=sharing

Collection of PRs

@turt2live
Copy link
Member

I'm completely failing to find the issue, but somewhere there was a suggestion to have a generic "server info" endpoint which would include this kind of information. The closest I found is https://github.com/matrix-org/matrix-doc/issues/500 however that doesn't describe the media repo portion.

For the synapse side: it really should be failing fast well before the upload finishes. I know riot-web does some weird things with media (at it's own detriment sometimes), but if the request has a Content-Length, surely synapse can use that value to try and fail fast. Clients should also be recommended to use chunked uploads or similar so the media repo can complain about file sizes.

@Half-Shot
Copy link
Contributor Author

Half-Shot commented May 3, 2018

@turt2live That fails part of the issue, but I still think it's crap that then you're playing guessing games with what the maximum file size should be. Why not just declare it to the client.

I'm completely failing to find the issue, but somewhere there was a suggestion to have a generic "server info" endpoint which would include this kind of information. The closest I found is matrix-org/matrix-spec#68 however that doesn't describe the media repo portion.

I thought of it, but I felt like that would bloat quickly and since this is technically a module that you could actually replace with a seperate service, I wouldn't want this to get muddled into a generic server info endpoint. You could easily write your own media server and have it give it's own suggestions about file sizes.
The other intention was to make this endpoint eventually give other kinds of limits which you might want to be visible in the client before the user attempts an upload, for example, if I was writing a document sharing platform I might want to warn the user that we don't accept video/* files. That's kind of irrelevant right now, but I think that's my point made.

What I'm trying to go for is something like:

image

@Half-Shot
Copy link
Contributor Author

More discussions in https://matrix.to/#/!PxMWbvomtRqGOTsaDK:half-shot.uk/$152536223672YbGdb:half-shot.uk.

Travis suggested that I should use different names for different size restrictions depending on the operation, so adjusting size to be upload_size

@maxidorius
Copy link
Contributor

I'm not sure this is a smart idea since the recommended setup is to have a reverse proxy sitting in front of synapse, which also has its own value.

Also, AFAIK, you can fail-fast at the HTTP level to prevent large upload and the current issue is because of Riot, not synapse/reverse proxy?

@Half-Shot
Copy link
Contributor Author

Half-Shot commented May 3, 2018

which also has its own value

I can't think of many reasons why you would do that, unless you wanted to give local services more/less limits in which case...you don't care about this API because it's a suggestion rather than enforcement.

Also, AFAIK, you can fail-fast at the HTTP level to prevent large upload and the current issue is because of Riot, not synapse/reverse proxy?

Yep, but that's only fixing one issue. It's still bad UX to fail at all if you can give the client advance warning and more importantly what those limits are. Reason: If my server has a limit of 1MB, and I want to upload an image of 10MB then I have to keep retrying at different sizes until it accepts it. By giving the client a set of limits, you can show in the UI what those limits are.

It also could very easily tie into per-user limits in the future which Travis and I were talking about.

Oh, and I don't remember it ever being the reccomended setup to have different max upload sizes on your reverse proxy and synapse.

@maxidorius
Copy link
Contributor

200% agree on the UX and is why I brought up the reverse proxy. If the recommended setup tells you to put one, you can't ignore it as it can to equally bad UX like "you can upload up to 10MB. users attempts to upload 9MB, upload fails as file is too large". But obviously, it can be dealt with later, as this is only about the endpoint.

@maxidorius
Copy link
Contributor

actually nevermind, I just realised this is for the spec, not for synapse. forget all my previous comments.

@Half-Shot
Copy link
Contributor Author

:)

@turt2live
Copy link
Member

to summarize some deliberation in other places (on my part):

  • This is fine (imo)
  • There's some desire to have per-user limits (ie: make authentication optional but recommended on this endpoint)
  • Per-room limits are a later thing and should be easily supported at that point

@Half-Shot
Copy link
Contributor Author

Half-Shot commented May 3, 2018

There's some desire to have per-user limits

I'm feeling like for now we can leave this out until we've figured out exactly how you'd implement it. It's not going to be a major change to implement that at a later date

On the other hand I could specify it now, and synapse will only partially support the spec.

Per-room limits are a later thing and should be easily supported at that point

Oh and this will be under it's own /limits/room(s)/@abcdef:example.com later so it can be ignored for now.

@turt2live
Copy link
Member

I'd certainly appreciate the suggestion for authentication be included in the spec, even if the synapse implementation doesn't care.

properties:
upload_size:
type: number
description: "The maximum size a upload can be in bytes."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an upload

@dbkr
Copy link
Member

dbkr commented May 4, 2018

I'd strongly encourage going through the full spec-proposal process (ie. make a google doc), even for things like this that feels trivial, as they generally turn out to be more complex than you think. Eg. I have questions about the requirement for servers to allow anyone to query their upload limits without auth.

@Half-Shot
Copy link
Contributor Author

Sure thing. Happy to do so, least I have a PoC to point to as well. I'll write one up.

@Half-Shot
Copy link
Contributor Author

Half-Shot commented May 4, 2018

turt2live added a commit to t2bot/matrix-media-repo that referenced this pull request May 9, 2018
This is currently an unstable endpoint, but for the purpose of client validation it's been implemented here.

Reference issue: matrix-org/matrix-spec-proposals#1189

This is based upon the behaviour described in the Google Doc proposal (see issue) as of writing.
@benparsons benparsons mentioned this pull request May 14, 2018
schema:
type: object
properties:
upload_size:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

proposal calls for m.upload.size btw (same with the example later on)

@Half-Shot
Copy link
Contributor Author

@turt2live @anoadragon453 @uhoreg @ara4n and anyone else, could you give feedback on the PR so we can get it merged :)

@anoadragon453
Copy link
Member

Looks fine to me. Presumably #1232 can be closed when this is as well.

Copy link
Member

@dbkr dbkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also can you rename this PR if it's no longer an initial draft please :)



If an accessToken is supplied, the configuration applied to the authenticated user is returned.
Otherwise it should give the configuration applied globally to the server.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docs says, "Clients MUST request this endpoint with authorisation" so I don't think this matches the doc.

"/config":
get:
summary: Get the configuration for the media repository.
Clients SHOULD use this as a guide when uploading content.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably needs to clarify what it's talking about now, since this is now a generic /config API.

Edit: in fact this probably just needs to move to m.upload.size below

All values are intentionally left optional. Clients SHOULD follow
the advice given in the field description when the field is not available.

**NOTE:** Reverse proxies may apply their own configuration.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the intent here but probably only because I have the context of the normal nginx 2MB limit failure. I think this paragraph might be a bit confusing and left-field to others without this context. It could be nice to have a hint here, but I think it would have to be a little more specific and detailed. (As-is it's sort of a non-statement: the Matrix spec doesn't care what your front end proxy may or may not be doing, so long as what comes out of the sausage factory is compliant with the spec).

Also, strictly speaking, this probably ought to have been in the proposal doc if it's part of the proposal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not part of the proposal because it's clearly out of scope, but leaving it out is going to also leave a lot of people confused. I think leaving a note to explain why it might fail on you is reasonable, though it could probably be worded better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My 2cents: when I read that sentence, I understood that reverse proxy might alter the response to reflect their own limits, if needed. But it seems it's not the case, but rather "the info may not be authoritative"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was actually the original intention, but since it wasn't in the spec I guess it was a bit of a sneaky thing to ask for. "the info may not be authoritative" is probably a better thing to say. In most cases, you'd hope people configuring their nginx backends would set their limits up appropriately.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate you're trying to be very general here but I think you're losing the meaning, ie. it's not clear to me what "applying a configuration" is exactly. My suggestion would be something like:

Both clients and server administrators should be aware that proxies between the client and the server may affect the apparent behaviour of content repository APIs, for example, proxies may enforce a lower upload size limit than is advertised by the server on this endpoint. Clients should handle such situations gracefully.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's fine by me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. :) Others: does this look OK?

@Half-Shot Half-Shot changed the title Initial draft of GET media/limits GET media/config Jun 28, 2018
"/config":
get:
summary: Get the configuration for the media repository.
Clients SHOULD use this as a guide when using media endpoints.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/media/content repository/ please.

I only keep harassing you about this because the spec doesn't reference media in the text, and instead references "content".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought about this when I wrote it but content repository seems wrong when all our endpoints start with /media. But heyho...

@Half-Shot
Copy link
Contributor Author

@KitsuneRal @mujx want a look?

@KitsuneRal
Copy link
Member

I'm generally happy with the PR (barring that "gracefully" sentence also discussed in #matrix-spec - I think it confuses more than clarifies so I'd rather just delete it and consider discrepancy between revproxies and homeservers as plain misconfiguration).
I have another concern though: now that we have /config (and anticipating more configuration points inside of it - both for content-repo/media server and for "genuine" homeserver, such TURN server config), I wonder how this will work given that media servers can be separate from homeservers. I can imagine that /config will simply be a union of media- and HS-configuration points if a homeserver also acts as a media server (and as long as we identify m.* configuration points as pertaining to either media servers or HSes but not both, I'm fine). @turt2live, you might want to chime in on that.

@turt2live
Copy link
Member

The route is under /_matrix/media, so anything that applies to media would show up in this endpoint. More general configuration, such as MOTD and TURN would appear under a different not-yet-introduced endpoint.

@KitsuneRal
Copy link
Member

Oh, right, I missed that the whole thing is still under media. Then I'd only remove the "gracefully" sentence but even that is not a showstopper for me (given that it's in a note and is on the verge of being non-normative anyway).

@Half-Shot
Copy link
Contributor Author

I'm quite happy to remove that sentence, thanks for the input @KitsuneRal

@dbkr
Copy link
Member

dbkr commented Jun 29, 2018

actually my approval comes with the caveat that you seem to have broken one of the scripts

@Half-Shot
Copy link
Contributor Author

Half-Shot commented Jun 29, 2018

@dbkr Not me 👼 . One of the proposals is breaking the proposal py script.

@turt2live
Copy link
Member

Looks like somehow the script is getting "documentation_url" instead of an array of issues from github?

@turt2live
Copy link
Member

@richvdh can I get your seal of approval on this? (you were one of the reviewers for the proposal, and it'd be good to get your feedback)

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm otherwise

@@ -269,3 +269,41 @@ paths:
"$ref": "definitions/error.yaml"
tags:
- Media
"/config":
get:
summary: Get the configuration for the content repository.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh dear, yes. good catch.

All values are intentionally left optional. Clients SHOULD follow
the advice given in the field description when the field is not available.
description: |-
Clients SHOULD use this as a guide when using content repository endpoints.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally we start with a wordy description of the endpoint here. "This endpoint allows clients to retrieve information about the media repository."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[which makes it read better in the spec doc]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants