-
Notifications
You must be signed in to change notification settings - Fork 383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSC1700: Improving .well-known discovery #1700
Changes from all commits
1ee26a4
1a1d1ed
44bc48e
cdbf871
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,84 @@ | ||
# Improving .well-known discovery of homeservers | ||
|
||
Having [done an implementation](https://github.com/matrix-org/matrix-react-sdk/pull/2227) in the react-sdk | ||
for .well-known autodiscovery there's a few tweaks that could be made to have the experience be easier for | ||
developers, without sacrificing critical functionality. | ||
|
||
In general, the process should be shortened to two steps: get the JSON and use it if possible. Currently the | ||
process employs a lot of verification methods that don't seem to prevent anything. | ||
|
||
|
||
### Refining the validation process | ||
|
||
For reference, here are the current UX states: | ||
|
||
> `PROMPT` | ||
> Retrieve the specific piece of information from the user in a way which fits within the existing client | ||
> user experience, if the client is inclined to do so. Failure can take place instead if no good user | ||
> experience for this is possible at this point. | ||
> | ||
> `IGNORE` | ||
> Stop the current auto-discovery mechanism. If no more auto-discovery mechanisms are available, then the | ||
> client may use other methods of determining the required parameters, such as prompting the user, or using | ||
> default values. | ||
> | ||
> `FAIL_PROMPT` | ||
> Inform the user that auto-discovery failed due to invalid/empty data and PROMPT for the parameter. | ||
> | ||
> `FAIL_ERROR` | ||
> Inform the user that auto-discovery did not return any usable URLs. Do not continue further with the | ||
> current login process. At this point, valid data was obtained, but no homeserver is available to serve | ||
> the client. No further guess should be attempted and the user should make a conscientious decision what | ||
> to do next. | ||
|
||
Also for reference, here is the last step of the discovery process (the validation): | ||
|
||
> 3. Make a `GET` request to `https://hostname/.well-known/matrix/client`. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is illegible in the rendered version. |
||
> a. If the returned status code is 404, then `IGNORE`. | ||
> b. If the returned status code is not 200, or the response body is empty, then `FAIL_PROMPT`. | ||
> c. Parse the response body as a JSON object | ||
> i. If the content cannot be parsed, then `FAIL_PROMPT`. | ||
> d. Extract the `base_url` value from the `m.homeserver` property. This value is to be used as the base URL of | ||
> the homeserver. | ||
> i. If this value is not provided, then `FAIL_PROMPT`. | ||
> e. Validate the homeserver base URL: | ||
> i. Parse it as a URL. If it is not a URL, then `FAIL_ERROR`. | ||
> ii. Clients SHOULD validate that the URL points to a valid homeserver before accepting it by connecting | ||
> to the `/_matrix/client/versions` endpoint, ensuring that it does not return an error, and parsing and | ||
> validating that the data conforms with the expected response format. If any step in the validation | ||
> fails, then `FAIL_ERROR`. Validation is done as a simple check against configuration errors, in order | ||
> to ensure that the discovered address points to a valid homeserver. | ||
> f. If the `m.identity_server` property is present, extract the base_url value for use as the base URL of the | ||
> identity server. Validation for this URL is done as in the step above, but using `/_matrix/identity/api/v1` | ||
> as the endpoint to connect to. If the `m.identity_server` property is present, but does not have a `base_url` | ||
> value, then `FAIL_ERROR`. | ||
|
||
A lot of the validation here doesn't need to be done for the following reasons. In general, the validation steps make | ||
it difficult for libraries to provide the functionality to support their UI counterparts. By reducing the amount | ||
of validation required, an expressive UI is still possible while also making it easier for automated clients and | ||
libraries to use the functionality. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this should be demonstrated with code, showing both implementations (the current spec one, and this MSC one). At this point, this is mere speculation. If code was produced following the current spec, with notes/comments on how the various elements are not working, it would give the occasion for the community to actually review the code to ensure the problem is with the specification itself instead of:
On the other hand, the algorithm is implemented in the Matrix Java SDK and used by at least two clients, Palaver and SimpleMatrix - This means that so far we have conflicting evidences, not allowing us to know if a problem exists and if yes, where. |
||
|
||
Step 3 in the reference should remain the same, however all of the sub points should be replaced with just two: | ||
* If the request was successful, use the `m.homeserver` configuration where possible. The lack of a homeserver | ||
configuration should be considered a failure. | ||
* If the request was successful, use the `m.identity_server` configuration where possible. This assumes that | ||
there is an appropriate `m.homeserver` configuration in the response. | ||
|
||
The first check removed is checking whether the `base_url` is actually a URL. There is no definition as to what | ||
makes a valid URL, which is completely unhelpful for the people having to implement this. Instead, the unwritten | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
rule where the client should be verifying the information is appropriate for its own purpose should come into | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think letting anyone decide what qualifies as a "valid URL" instead of simply clarifying what "valid" means is very extreme in the approach. Clarifying the term There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have a hard time defining what would constitute a 'valid URL' in this case. Do you have a suggestion of what a clarified There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @mvgorcum See the comment just below. |
||
effect. This also allows application-specific clients to verify the URL independently of the specification, or | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If clients can choose to verify a URL independently of the specification, is there a point to put any kind of verification system in the specification? |
||
not at all if it chooses to. The risk of an invalid URL here is relatively minimal, as a responsible client would | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would leeway given in "application-specific" behaviour allowing bypass of the check and requirement for HTTPS? |
||
be communicating to the user what the URL it is about to use is for the user to verify. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since there is no concept of showing the URL about to be used in the current spec, is this effectively a strong recommendation, or a requirement? |
||
|
||
The other check removed is verifying a Matrix server (homeserver or identity) is actually present on the given | ||
URL. The URL was given to the client for the sole purpose of assisting the user into the system, and is unlikely | ||
to be wrong. If it were wrong, the administrator would be more than likely interested in fixing it for their users. | ||
In the event of a security breach where the .well-known configuration is changed to point at a malicious party, | ||
the client's responsibility to show which URL it is about to connect to comes into play. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No such thing as "responsibility" or "responsible client" is ever defined in the specification. What is the expectation here? |
||
|
||
Further to the point of not having to verify if a Matrix server exists on the endpoint, the risk of a conflicting | ||
service is minimal. There are not many services out there that use both "matrix" and "m.homeserver" to do .well-known | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This was never given as a reason for validating the .well-known data. |
||
discovery. Not to mention the possibility of a Matrix server not running on the endpoint is also minimal as a Matrix | ||
user would have invoked the query in the first place, indicating that there is an extremely high chance that the | ||
server is capable of speaking Matrix. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many of those verification steps/methods are actually part of most SDK or libraries, making them available and used by default.
The current process is also having steps in an order which allow for their UX states to be "grouped", ensuring you actually don't need to discern them if your coding language/tools allow you to bundle them up.
Not all languages/tools/SDKs/libraries give you that high-level access and therefore the exact steps need to be written down to ensure it is not language specific. This is the same principle which is used in the S2S spec for state resolution - if you were to actually implement it, many steps would actually be grouped together. But that doesn't mean the algorithm shouldn't be documented or guidance not given for implementations.
I think this strips away the needed algorithm which is agnostic of language, SDK or tools and promotes high-level languages like Javascript at the expense of low(er)-level languages like Java, C(++), Go, etc.
I would suggest to give advises for languages on what can and cannot be "bunched up" instead.