Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC2703: Media ID grammar #2703

Closed
wants to merge 1 commit into from
Closed

Conversation

turt2live
Copy link
Member

@turt2live turt2live added proposal-in-review proposal A matrix spec change proposal kind:maintenance MSC which clarifies/updates existing spec labels Jul 28, 2020
## Proposal

Media IDs must be treated and represented as opaque identifiers. Only characters in the
[RFC 2986 Unreserved Characters](https://tools.ietf.org/html/rfc3986#section-2.3) set can be used.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if media ids are not allowed to be . or ... Those can lead to issues with local client media caches. While a client should never rely on a path being safe, I think those 2 should be explicitly disallowed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how would this be a problem for client caches?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you cache the files on a traditional file system, a media id like . and .. would be a reserved filename on many systems mapping to current directory or parent directory respectively. Since in your proposal the media id can't contain /, the use of that is limited, if you want to escape a directory and overwrite other files, but you still wouldn't be able to create files like that. So you would need to work around those being valid ids and need to find alternative mappings. I think all other character combinations should be fine though. That way you can just cache files like <server part>/<media id> in the filesystem, making the client side caches much simpler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a super bad idea to cache the media IDs as-is on a traditional file system regardless of this MSC - don't do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it's effectively untrusted user input and file storage 101: don't use the name given to you when storing files. Collisions, directory issues, etc are all possible which can cause problems for people.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you validate, that the id only contains ALPHA / DIGIT / "-" / "." / "_" / "~" and is not one of the special filenames, you only have collision issues now. If 2 media ids collide, you already have bigger issues, so I'm not sure, if that applies here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd hopefully not be storing just the bare media ID and would instead be accompanying it with a domain, which can theoretically have the same problem: we don't actually specify that the origin needs to be a server, just heavily imply it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a super bad idea to cache the media IDs as-is on a traditional file system regardless of this MSC - don't do that.

Yeah. Synapse definitely wouldn't do that 🤦

## Proposal

Media IDs must be treated and represented as opaque identifiers. Only characters in the
[RFC 2986 Unreserved Characters](https://tools.ietf.org/html/rfc3986#section-2.3) set can be used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[RFC 2986 Unreserved Characters](https://tools.ietf.org/html/rfc3986#section-2.3) set can be used.
[RFC 3986 Unreserved Characters](https://tools.ietf.org/html/rfc3986#section-2.3) set can be used.

Also, maybe relax the list to pchar instead?

@turt2live turt2live added the needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. label Jun 8, 2021
@richvdh
Copy link
Member

richvdh commented Oct 12, 2022

this looks like an attempt at fixing matrix-org/matrix-spec#503.

An alternative proposal was made in MSC1597 (https://github.com/matrix-org/matrix-spec-proposals/blob/rav/proposals/id_grammar/proposals/1597-id-grammar.md#opaque-ids), though I think it says exactly the same thing.

@turt2live
Copy link
Member Author

I haven't checked, but I can believe that. From memory, the plan here was to try and get something through a bit faster than the general case, but considering that clearly hasn't happened, let's just close this one.

@turt2live turt2live closed this Oct 12, 2022
@turt2live turt2live added the obsolete A proposal which has been overtaken by other proposals label Oct 12, 2022
@richvdh
Copy link
Member

richvdh commented Oct 12, 2022

personally I'd say the reason that MSC1597 hasn't gone anywhere is because it is trying to fix everything at once. Splitting it up seems like a good idea to me...

@turt2live
Copy link
Member Author

fair :D

We can always ressurect this MSC or make a new one - I don't really have bandwidth in any case to properly give attention to this one (and honestly forgot it existed...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:maintenance MSC which clarifies/updates existing spec needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. obsolete A proposal which has been overtaken by other proposals proposal A matrix spec change proposal
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants